SAP-samples/hana-import-csv-from-oracle

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

REUSE status

This proof of concept uses features of both SAP HANA and Oracle databases to move tables from Oracle to SAP HANA as flat files, i.e., CSV. To achieve the best possible performance, the tool create many threads to export tables to CSV, move them to the SAP HANA platform, and load them into an SAP HANA schema.

To better visualize the entires process, this flowchart hightlights some of the major processing steps:

Process Flow

After a job is created (see sample-job.sql), a job is started using the process in section 1. This routine starts as many threads in the Oracle database as defined in the job - these are processed in step 2. Each thread is 100% asynchronous. Once started, each thread automatically re-submits itself until all tables have been processed.

The process in step 2 is the job orchestrator that determines if all tables have been exported and if the job has been terminated. When tables remain, a new job is started to obtain and process the next available table - step 3. All threads can be stopped using the appropriate API call (see stop-job.sql). Stopping may take a some time because each thread will finish the table it was exporting before detecting the stop operation.

Step 3 describes the main processing life cycle of a single table. The most important step from a job control standpoint is the locking of the job table. This ensures this thread/job has exclusive control of the job metadata, all other threads wait until the lock is released. Once a table is identified it is exported to a CSV file to a directory specified when the job was created (sufficient disk space is required).

To keep everything asynchronous, after the CSV file is complete, a new DBMS_SCHEDULER job is created to run the compress/transfer script at the operating system. The last step in the job executes a remote SSH session to run the import scripts into the SAP HANA instance.

Steps 4 and 5 are clean-up jobs that constantly scan for and load log files generated by the operating system scripts.

  • This project was developed and tested using Oracle RDBMS 12c and SAP HANA 2 SPS03.
  • Proper schema privileges must be granted to database users on both platforms.
  • To successfully run this project, the Oracle instance must be configured to allow access to the operating system to both create files using UTL_FILE and execute scripts.
  • The SSH between the Oracle host and the SAP HANA host must be configured for certificate authentication - passwords must NOT be required.

Setup and configuration details may be found in the create-users.txt file.

The files listed below are used mostly in the order they appear. Please note that you will need to make adjustments to directories and databases schemas to match your environment.

Use these scripts to setup the necessary database users and make adjustments to the Oracle environment.

FileDescription
images/flow.pngThis is a detailed view of the major control elements of the process.
create-user.txtGrants needed in the Oracle and HANA databases. Also includes links to useful links.
create-tables.sqlCreate the tables supporting the process.

Install the main Oracle PL/SQL package (.pls/.plb) in the Oracle database and position the Linux scripts as appropriate for the target environment

FileDescription
CSV_EXPORTER.plsOracle package specification. Note: there are some global variables in the package specification that must be adjusted.
CSV_EXPORTER.plbOracle package body.
csv_exporter.shLinux script placed on the Oracle server to zip-n-ship the CSV output for a table.
csv_launch.shLinux script on the SAP HANA server that is call by the csv_exporter.sh to start a HANA import.
csv_loader.shLinux script on the SAP HANA server to perform the HANA import.

These samples demonstrate creating and running a job.

FileDescription
tests/sample-job.sqlExample of building and running an export of a schema.
tests/stop-job.sqlExample of stopping a running job.

There are no known issues.

This code is provided "as-is" with no expected changes or support. Questions or comments should be directed to Mark Greynolds.

This project is only updated by SAP employees and accepts no other contributions.

Copyright (c) 2024 SAP SE or an SAP affiliate company. All rights reserved. This project is licensed under the Apache Software License, version 2.0 except as noted otherwise in the LICENSE file.

About

This tool quickly moves data from an Oracle database to the SAP HANA platform via CSV files. The Oracle scheduler is used for high parallelism of transfer in network constrained environments.

Topics

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •