Find Bulk Loader jobs in SAP BODS

Is there a straightforward way to find out through the metadata whether any BODS job is using the 'Bulk Loader Option' in its target table objects?
I want to find all the jobs in my repository that use the Teradata MultiLoad option as the bulk-loading utility.
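One hedged way to approach this is to search the object definitions stored in the local repository itself. The sketch below assumes a SQL Server based repository and the commonly seen repository tables AL_LANG (object headers) and AL_LANGXMLTEXT (object definitions stored as XML); the table and column names, the OBJECT_TYPE codes, and the exact text of the MultiLoad option all vary by BODS version, so treat every identifier here as an assumption and first inspect the XML of one dataflow you already know uses MultiLoad to confirm the search token.

    -- Sketch only: repository table/column names and the search token are assumptions;
    -- verify them against your repository version before relying on the result.
    SELECT DISTINCT l.NAME        AS object_name,
                    l.OBJECT_TYPE AS object_type   -- trace dataflows back to jobs via ALVW_PARENT_CHILD if needed
    FROM   AL_LANG        l
    JOIN   AL_LANGXMLTEXT x ON x.OBJECT_KEY = l.OBJECT_KEY
    WHERE  LOWER(CAST(x.XML_TEXT AS NVARCHAR(MAX))) LIKE '%multiload%'
        OR LOWER(CAST(x.XML_TEXT AS NVARCHAR(MAX))) LIKE '%bulk%load%';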

Related

Error in SSIS packages loading data into Azure SQL Data Warehouse

We have some SSIS packages loading data into Azure SQL Data Warehouse from CSV files. All the data flow tasks inside the packages are configured for parallel processing.
Recently the packages started failing with the following error:
Failed to copy to SQL Data Warehouse from blob storage. 110802;An internal DMS error occurred that caused this operation to fail. Details: Exception: System.NullReferenceException, Message: Object reference not set to an instance of an object.
When we run the packages manually (running each DFT individually), they run fine. When we run them as-is (with parallel processing), the same error occurs.
Can anyone help find the root cause of this issue?
I believe this problem may occur when multiple jobs try to access the same file at exactly the same time.
Check whether one CSV file is the source for multiple SSIS packages; if so, you may need to change your approach.
When one package is reading a CSV file, it locks the file so that no other job can modify it.
To avoid this, run the DFTs that use the same CSV as their source sequentially, and keep the other DFTs in parallel as they are.
IMHO it's a mistake to use the SSIS Data Flow to insert data into Azure SQL Data Warehouse. There were problems with the drivers early on which made performance horrendously slow, and even though these may now have been fixed, the optimal method for importing data into Azure SQL Data Warehouse is PolyBase. Place your CSV files into blob storage or Data Lake, then reference those files using PolyBase and external tables. Optionally, then import the data into internal tables using CTAS, e.g. (pseudocode):
csv -> blob store -> polybase -> external table -> CTAS to internal table
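In T-SQL that flow looks roughly like the sketch below; the storage account, container, credential secret, column list and distribution key are all placeholders, and the database needs a master key before the scoped credential can be created.

    -- Sketch only: names, secret and columns are placeholders.
    -- CREATE MASTER KEY;  -- once per database, if not already present
    CREATE DATABASE SCOPED CREDENTIAL BlobCred
    WITH IDENTITY = 'blobuser', SECRET = '<storage-account-key>';

    CREATE EXTERNAL DATA SOURCE BlobStore
    WITH (TYPE = HADOOP,
          LOCATION = 'wasbs://mycontainer@mystorageacct.blob.core.windows.net',
          CREDENTIAL = BlobCred);

    CREATE EXTERNAL FILE FORMAT CsvFormat
    WITH (FORMAT_TYPE = DELIMITEDTEXT,
          FORMAT_OPTIONS (FIELD_TERMINATOR = ',', STRING_DELIMITER = '"'));

    CREATE EXTERNAL TABLE dbo.Sales_ext
    (   SaleId   INT,
        SaleDate DATE,
        Amount   DECIMAL(18,2) )
    WITH (LOCATION = '/sales/', DATA_SOURCE = BlobStore, FILE_FORMAT = CsvFormat);

    -- CTAS into an internal, distributed columnstore table
    CREATE TABLE dbo.Sales
    WITH (DISTRIBUTION = HASH(SaleId), CLUSTERED COLUMNSTORE INDEX)
    AS SELECT * FROM dbo.Sales_ext;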
If you must use SSIS, consider using only the Execute SQL Task in more of an ELT-type approach, or use the Azure SQL DW Upload Task, which is part of the Azure Feature Pack for SSIS.
Work through this tutorial for a closer look at this approach:
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/design-elt-data-loading

Transfer data from Oracle Database 11g to MongoDB

I want to have an automatic, timed transfer from an Oracle database to MongoDB. In a typical RDBMS scenario, I would have established a connection between the two databases by creating a database link and transferred the data using PL/SQL procedures.
But I don't know what to do in the MongoDB case: what and how should I implement so that I can have an automatic transfer from the Oracle database to MongoDB?
I would look at using Oracle GoldenGate. It has a MongoDB Handler.
https://docs.oracle.com/goldengate/bd123110/gg-bd/GADBD/using-mongodb-handler.htm#GADBD-GUID-084CCCD6-8D13-43C0-A6C4-4D2AC8B8FA86
https://oracledb101.wordpress.com/2016/07/29/using-goldengate-to-replicate-to-mongodb/
What type of data do you want to transfer from the Oracle database to MongoDB? If you just want to export/import a small number of tables on a set schedule, you could use something like UTL_FILE on the Oracle side to create a .csv export of the table(s) and use DBMS_SCHEDULER to run the export automatically on your desired time frame (a rough sketch follows at the end of this answer).
You could also use an application like SQL Developer to export tables as .csv files by browsing to the table in the schema list, then Right Click -> Export and choosing the .csv format. You may also find it a little easier to work with UTL_FILE and DBMS_SCHEDULER through SQL Developer instead of relying on SQL*Plus.
Once you have your .csv file(s), you can use mongoimport to import the data, though I'm not sure whether MongoDB supports scheduled jobs the way Oracle does (I work primarily with the latter). If you are using Linux, you could use cron to schedule a script that imports the .csv file at a set interval.
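As a rough sketch of the UTL_FILE + DBMS_SCHEDULER route: the directory object EXPORT_DIR, the customers table and its columns are placeholders, and you would still run mongoimport (e.g. from cron) against the file it produces.

    -- Sketch only: EXPORT_DIR must be an existing Oracle directory object,
    -- and the table/column names are placeholders.
    CREATE OR REPLACE PROCEDURE export_customers_csv IS
      v_file UTL_FILE.FILE_TYPE;
    BEGIN
      v_file := UTL_FILE.FOPEN('EXPORT_DIR', 'customers.csv', 'w');
      FOR rec IN (SELECT customer_id, name, email FROM customers) LOOP
        UTL_FILE.PUT_LINE(v_file, rec.customer_id || ',' || rec.name || ',' || rec.email);
      END LOOP;
      UTL_FILE.FCLOSE(v_file);
    END;
    /

    -- Schedule the export to run nightly at 02:00
    BEGIN
      DBMS_SCHEDULER.CREATE_JOB(
        job_name        => 'EXPORT_CUSTOMERS_JOB',
        job_type        => 'STORED_PROCEDURE',
        job_action      => 'EXPORT_CUSTOMERS_CSV',
        start_date      => SYSTIMESTAMP,
        repeat_interval => 'FREQ=DAILY; BYHOUR=2',
        enabled         => TRUE);
    END;
    /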

Push data to Azure SQL database

I am pretty new to Azure SQL Database. I have been given the task of pushing a 100-million-record text file to an Azure SQL database, and I'm looking for suggestions on how to do it efficiently.
You have several options for uploading on-premises data to your Azure SQL database:
SSIS - As Randy mentioned, you can create an SSIS package (using SSMS) and schedule a SQL Agent job to run this package periodically.
Azure Data Factory - You can define an ADF pipeline that periodically uploads data from your on-premises file to your Azure SQL database. Depending on your requirements, you might need just the initial 'connect and collect' part of the pipeline, or you might want to add further processing to it.
bcp - The 'bulk copy program' utility can be used to copy data between SQL Server and a data file. Similar to the SSIS package, you can use a SQL Agent job to schedule periodic uploads with bcp.
SqlBulkCopy - I doubt you would need this, but if you need to integrate the load into your application programmatically, this class helps you achieve the same as the bcp utility (bcp is faster) via .NET code.
I would do this via SSIS using SQL Server Management Studio (if it's a one-time operation). If you plan to do this repeatedly, you could schedule the SSIS job to execute on a schedule. SSIS will do bulk inserts using small batches, so you shouldn't have transaction log issues and it should be efficient (because of the bulk inserting). Before you do this insert, though, you will probably want to consider your performance tier so you don't get major throttling by Azure and possible timeouts.

Migration of ETL jobs to Hadoop

I have a set of ETL jobs (created in Informatica) which I want to migrate to Hadoop. I have already created the source and target tables in the Hadoop environment. I can now write a Hive query that implements the ETL logic, pulling data from the source and writing to the target table. But this is a lengthy process: since my ETL jobs are complex (with complex business logic), development and testing of these queries take a long time. I would like to know if there is a better way to migrate my ETL code to Hadoop. I heard we can use pandas DataFrames instead of Hive. Any suggestions?
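For a sense of what the manual route looks like, each Informatica mapping typically ends up as a Hive INSERT ... SELECT along the lines of the sketch below; the tables and the business rule are made up, and every such rule has to be rewritten and re-tested by hand, which is exactly why the migration is slow.

    -- Sketch only: orders_src / orders_tgt and the business rules are hypothetical.
    INSERT OVERWRITE TABLE orders_tgt
    SELECT o.order_id,
           c.customer_name,
           CASE WHEN o.status = 'C' THEN 'CLOSED' ELSE 'OPEN' END AS order_status,
           o.amount * COALESCE(fx.rate, 1.0)                      AS amount_usd
    FROM   orders_src o
    JOIN   customers_src c ON c.customer_id = o.customer_id
    LEFT JOIN fx_rates fx  ON fx.currency = o.currency;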

Is there a way to import the output of a MapReduce job into an SQL table?

I want to know if we could automatically import the output of a MapReduce job (with the MapReduce job responsible for the export) into an SQL table (MySQL, Oracle, etc.).
I know Sqoop could be used as a tool, but could it be used within the MR job?
Unless you write code in the reducer that, instead of writing the output to the context, connects to the SQL table via JDBC and inserts the rows (which would be a really bad idea), the only thing you can do is use Oozie to automate the execution of the MapReduce job and then perform the insertion using Sqoop. Oozie is a workflow scheduler for automating all these kinds of operations; you can find more information about it in the Oozie documentation.