SSIS: importing two Excel files with different numbers of columns into a database - SQL

I want to build an SSIS package that imports a flat file with a variable number of columns and stores it in a database table. I will then left join that table with a few other tables to find whether a particular column value matches the ones I already have in the database, and output all non-matching rows as a CSV file in another folder.
So how do I write a Foreach Loop to import a flat file with a variable number of columns and store it in a table? Can a package create a table after it imports a file, or can the imported file be stored in a dynamically created temp table? It doesn't matter if the table is deleted after the package stops.
Thanks

Related

SQL script to search for values in a database column by taking values from a file

I have a CSV file with two columns. The file has over 200,000 rows. In the database I have a table with the same columns and values.
How can I write a script to find the values that are present in the file but not in the database?
I am using SQL Developer for this.
Creating an external table is the best option when you want to read the contents of a flat file with a select query. See the Oracle documentation for how to create an external table.
After creating the external table, you can use a query similar to the one below to identify the records that exist only in the external table (i.e. the flat file).
select *
from new_external_table et
where not exists (select 1 from source_table st where et.column_name = st.column_name);
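For reference, here is a minimal sketch of the external table itself, assuming an Oracle directory object DATA_DIR that points at the folder holding the file and a simple two-column layout (all names here are illustrative, not from your schema):
-- Sketch: external table over a two-column CSV with a header row.
CREATE TABLE new_external_table (
    column_name  VARCHAR2(100),
    other_column VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY data_dir
    ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        SKIP 1                           -- skip the CSV header row
        FIELDS TERMINATED BY ','
        MISSING FIELD VALUES ARE NULL
    )
    LOCATION ('values.csv')
)
REJECT LIMIT UNLIMITED;
Once this exists, the not-exists query above can be run directly against the file's contents.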

CSV file metadata validation (comparing with an existing SQL table)

I have a requirement to validate a CSV file before loading it into a staging folder, and later load it into a SQL table.
I need to validate the metadata (the structure of the file must be the same as the target SQL table):
The number of columns should be equal to the target SQL table
The order of columns should be the same as the target SQL table
The data types of columns (no text values should exist in a numeric field of the CSV file)
I am looking for an easy and efficient way to achieve this.
Thanks for the help
A Python program and module that does most of what you're looking for is chkcsv.py: https://pypi.org/project/chkcsv/. It can be used to verify that a CSV file contains a specified set of columns and that the data type of each column conforms to the specification. It does not, however, verify that the order of columns in the CSV file is the same as the order in the database table. Instead of loading the CSV file directly into the target table, you can load it into a staging table and then move it from there into the target table; this two-step process eliminates the column-order dependence.
Disclaimer: I wrote chkcsv.py
Edit 2020-01-26: I just added an option that allows you to specify that the column order should be checked also.
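If you take the staging-table route, a minimal T-SQL sketch of the two-step load might look like this. The table, column, and file names are assumptions for illustration, and FORMAT = 'CSV' for BULK INSERT requires SQL Server 2017 or later:
-- 1. Stage the file into a table whose columns follow the file's layout.
CREATE TABLE #staging (id INT, amount DECIMAL(10, 2));

BULK INSERT #staging
FROM 'C:\data\input.csv'
WITH (FORMAT = 'CSV', FIRSTROW = 2);   -- FIRSTROW = 2 skips the header row

-- 2. Move rows into the target by column name, so the target table's
--    column order no longer has to match the file's.
INSERT INTO dbo.target_table (id, amount)
SELECT id, amount
FROM #staging;

DROP TABLE #staging;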

Import CSV file to SQLite database (without headers)

How do I load a CSV file into a table using the console? The problem is that I have to somehow omit the headers from the CSV file (I cannot delete them manually).
From the sqlite3 doc on CSV Import:
There are two cases to consider: (1) table "tab1" does not previously exist and (2) table "tab1" does already exist.
In the first case, when the table does not previously exist, the table is automatically created and the content of the first row of the input CSV file is used to determine the name of all the columns in the table. In other words, if the table does not previously exist, the first row of the CSV file is interpreted to be column names and the actual data starts on the second row of the CSV file.
For the second case, when the table already exists, every row of the CSV file, including the first row, is assumed to be actual content. If the CSV file contains an initial row of column labels, that row will be read as data and inserted into the table. To avoid this, make sure that table does not previously exist.
It is either/or, so you will have to outsmart it.
Assuming "I can not delete them manually" means from the CSV, not from the table, you could delete the header line with SQL after the import, as sketched below.
Or: Import into a temp table in the target database, insert into target table from the temp table, drop the temp table.
Or:
- connect to an in-memory database
- import the CSV into a table
- attach the target database
- insert into the target table from the imported in-memory table
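In the sqlite3 shell, that sequence might look like this (a sketch; target.db, tab1, and data.csv are illustrative names). Because the staging table does not exist yet, the import creates it and consumes the header row as column names, so only data rows reach the target:
.open :memory:
.mode csv
.import data.csv staging
ATTACH 'target.db' AS t;
INSERT INTO t.tab1 SELECT * FROM staging;  -- only data rows arrive in the target table
DETACH t;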
Just add the option --skip 1; see https://www.sqlite.org/cli.html#importing_csv_files
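For example (the --csv and --skip flags need a reasonably recent sqlite3 shell, 3.32 or later; the names are illustrative):
CREATE TABLE tab1 (col1 TEXT, col2 TEXT);  -- the table exists, so no row is treated as a header
.import --csv --skip 1 data.csv tab1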

Multiple CSV files to multiple tables not yet created

Database platform: SQL Server 2012
I have a folder with a lot of CSVs. I need a table created for each CSV. Each CSV has the column names in the first row and data in the subsequent rows.
I have a handy SSIS package to iterate through a folder and import into existing tables in a database, but in this case it is our first load, and we would also like to create the tables as part of the process.
I know how to do it one at a time through the import wizard or an SSIS OLE DB destination's New table button. I was wondering if there is a more automated way using SSIS.
After further review of the 313 CSVs, I determined that 75% of them are lookup tables and the other 25% are relevant data. I will simply go through each one, build out a staging table for each, and then properly build out the structure. It will only take about a day to build one SSIS package to churn through all the CSVs I want to use, and then I'm all set!

Best Way to ETL Multiple Different Excel Sheets Into SQL Server 2008 Using SSIS

I've seen plenty of examples of how to enumerate through a collection of Excel workbooks or sheets using the Foreach Loop Container, with the assumption that the data structure of all the source files is identical and the data is going to a single destination table.
What would be the best way to handle the following scenario:
- A single Excel workbook with 10-20 sheets, OR 10-20 Excel workbooks with 1 sheet each.
- Each workbook/sheet has a different schema.
- There is a 1:1 matching destination table for each source sheet.
- Standard cleanup: workbooks would be created and placed in a "loading" folder; an SSIS package runs on a job that reads the files in the loading folder and moves them to an archive folder upon successful completion.
I know that I can create a separate SSIS package for each workbook, but that seems really painful to maintain. Any help is greatly appreciated!
We faced the same issue a while back; I will summarize what we did.
We wrote an SSIS package programmatically using C#. A MetaTable is maintained which holds information about the flat files (table name, columns, and the positions of those columns in the flat file). We extract the flat file's name and then query the meta-table for the table this flat file belongs to, the columns it has, and the column positions in the flat file.
We execute the package on SQL Server by passing each flat file as a command-line argument to the package executable, so it reads and processes each flat file.
Example: suppose we have a flat file FF. We first extract the name of the flat file and then get the table name by querying the DB; let's say it is TT, which contains columns COL-1 and COL-2 at positions 1 to 10 and 11 to 20 respectively. By reading this information from the MetaTable, we then create a derived-column transformation (package).
Our application has a set of flat files in a folder, and using a Foreach Loop Container in SSIS we get one flat file at a time and perform the above process.
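For illustration, the meta-table described above could look something like this sketch (all names and types are assumptions, shown with the FF/TT example from the answer):
-- One row per destination column of each flat file.
CREATE TABLE MetaTable (
    FileName      VARCHAR(255) NOT NULL,  -- flat file the row describes
    TableName     VARCHAR(128) NOT NULL,  -- destination table
    ColumnName    VARCHAR(128) NOT NULL,  -- destination column
    StartPosition INT          NOT NULL,  -- first character of the field in the record
    EndPosition   INT          NOT NULL   -- last character of the field in the record
);

-- The FF/TT example: COL-1 occupies positions 1-10, COL-2 positions 11-20.
INSERT INTO MetaTable VALUES ('FF', 'TT', 'COL-1', 1, 10);
INSERT INTO MetaTable VALUES ('FF', 'TT', 'COL-2', 11, 20);
The package generator then reads these rows for a given file name and builds the source columns and derived-column transformation from them.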