Bulk insert from a txt file which has its own rules - bulkinsert

I got a txt file which includes 350.000 lines and I have to download and insert it to my sql server database. I write the part that connects to FTP gets the related file and download it. What I want is insert it to my table.
Here's a line:
9996281000L0000000000000000
As you can see also I need to seperate the specific parts like
999 628 1000 L 0000000000000000
I need an effective solution which cuts the lines and inserts the data to related columns.
Anyone any ideas how I can achieve this?

Look into the BCP utility and its format files. It's a detailed and somewhat complex process, but it will do the job quickly and efficiently once set up.
You can get similar functionality (with a much better GUI) with SQL Server Integration Services (SSIS). While completely different, it does much the same thing as bcp.

Related

SQL Server - Copying data between tables where the Servers cannot be connected

We want some of our customers to be able to export some data into a file and then we have a job that imports that into a blank copy of a database at our location. Note: a DBA would not be involved. This would be a function within our application.
We can ignore table schema differences - they will match. We have different tables to deal with.
So on the customer side the function would ran somethiug like:
insert into myspecialstoragetable select * from source_table
insert into myspecialstoragetable select * from source_table_2
insert into myspecialstoragetable select * from source_table_3
I then run a select * from myspecialstoragetable and get a .sql file they can then ship to me which we can then use some job/sql script to import into our copy of the db.
I'm thinking we can use XML somehow, but I'm a little lost.
Thanks
Have you looked at the bulk copy utility bcp? You can wrap it with your own program to make it easier for less sophisticated users.
Since it is a function within your application, in what language is the application front-end written ? If it is .NET, you can use Data Transformation Services in SQL Server to do a sample export. In the last step, you could save the steps into a VB/.NET module. If necessary, modify this file to change table names etc. Integrate this DTS module into your application. While doing the sample export, export it to a suitable format such as .CSV, .Excel etc, whichever format from which you will be able to import into a blank database.
Every time the user wants do an export, he will have to click on a button that would invoke the DTS module integrated into your application, that will dump the data to the desired format. He can mail such file to you.
If your application is not written in .NET, in whichever language it is written, it will have options to read data from SQL Server and dump them to a .CSV or text file with delimiters. If it is a primitive language, you may have to do it by concatenating the fields of every record, by looping through the records and writing to a file.
XML would be too far-fetched for this, though it's not impossible. At your end, you should have the ability to parse the XML file and import it into your location. Also, XML is not really suited if the no. of records are too large.
You probably think of a .sql file, as in MySql. In SQL Server, .sql files, that are generated by the 'Generate Scripts' function of SQL Server's interface, are used for table structures/DDL rather than the generation of the insert statements for each of the record's hard values.

Do while loop with GPDB using talend

I have a very large data set in GPDB from which I need to extract close to 3.5 million records. I use this for a flatfile which is then used to load to different tables. I use Talend, and do a select * from table using the tgreenpluminput component and feed that to a tfileoutputdelimited. However due to the very large volume of the file, I run out of memory while executing it on the Talend server.
I lack the permissions of a super user and unable to do a \copy to output it to a csv file. I think something like a do while or a tloop with more limited number of rows might work for me. But my table doesnt have any row_id or uid to distinguish the rows.
Please help me with suggestions how to solve this. Appreciate any ideas. Thanks!
If your requirement is to load data into different tables from one table, then you do not need to go for load into file and then from file to table.
There is a component named tGreenplumRow which allows you to write direct sql queries (DDL and DML queries) in it.
Below is a sample job,
If you notice, there are three insert statements inside this component. It will be executed one by one separated by semicolon.

What is the best way to import data using insert statements into a table in MS SQL?

I have exported a table from another db into an .sql file as insert statements.
The exported file has around 350k lines in it.
When i try to simply run them, I get a "not enough memory" error before the execution even starts.
How can import this file easily?
Thanks in advance,
Orkun
You have to manually split sql file into smaller pieces. Use Notepad++ or some other editor capable to handle huge files.
Also, since you wrote that you have ONE table, you could try with utility or editor which can automatically split file into pieces of predefined size.
Use SQLCMD utility.. search MICROSOFT documentation.. with that you just need to gives some parameters. One of them is file path.. no need to go through the pain of splitting and other jugglery..

What is the easiest way to add a bunch of content to a SQL database?

Nothing technical here. Suppose I have a lot of different categorized data, and I would like to create a database out of it. Would someone literally hand plug in all that info with SQL code itself? Or do some people make a mock website just to input data? What are some of your strategies?
If there would be no way to do it automatically, then a mock website would be the way to go: you can even use it with more people at once, actually multiplying the input speed (as long as you don't mess up assigning each of them a different part of the data).
What format is your data in? And how much of it is there? If its Excel then SQL Server has tools to import it in. I'm not sure if MySQL has anything similar. Even if it doesn't one other technique I have used with Excel data is to use a formula to concatenate as required to generate the INSERT statements. Then just paste those into a query window and run that.
I wouldn't do a website for it unless I was building an admin site for it already and wanted to test that with the initial load.
Most databases have a way to do bulk inserts or have tools for data import.
My strategies normally involve such tools.
Here is an example of importing a CSV file to SQL Server.
Most database servers provide a way to import data from a variety of formats, you could look into that first.
If not, you could write a simple script or console application to parse your input data, and write out a SQL script to insert the data into appropriate tables.
For example, if you data was in a CSV file, you would parse each line in the file, and generate an insert statement to write out to a .sql file.
MyData.csv
1,2,3,'Test',4
2,3,4,'Test2,6
GeneratedInsert.sql
insert into table (col1,col2,col3,col4,col5) values (1,2,3,'Test',4)
insert into table (cal1,col2,col3,col4,col5) values (2,3,4,'Test2',6)
MySQL has a statement LOAD DATA INFILE that is intended for loading bulk data from flat files. It's easy to use and much faster than alternative methods.
But first you do have to use SQL to design tables with fields that match the field of your import data. That is, if you have some file with comma-separated data:
Titanic;1997;4 stars
Batman Begins;2005;5 stars
"Harry Potter and the Sorcerer's Stone";2001;3 stars
...
You would create a table:
CREATE TABLE Movies (
title VARCHAR(100) NOT NULL,
year YEAR NOT NULL
rating VARCHAR(10)
);
Then load data:
LOAD DATA INFILE 'movies.txt' INTO TABLE Movies
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"';
Most web languages have some sort of auto-scaffolding that you can quickly set up. Useful for admin work as well, if your site is hosted without direct access to DB.
Otherwise, yeah - write the SQL statements. Useful to bring a database up as part of your build process.

How do I handle large SQL SERVER batch inserts?

I'm looking to execute a series of queries as part of a migration project. The scripts to be generated are produced from a tool which analyses the legacy database then produces a script to map each of the old entities to an appropriate new record. THe scripts run well for small entities but some have records in the hundreds of thousands which produce script files of around 80 MB.
What is the best way to run these scripts?
Is there some SQLCMD from the prompt which deals with larger scripts?
I could also break the scripts down into further smaller scripts but I don't want to have to execute hundreds of scripts to perform the migration.
If possible have the export tool modified to export a BULK INSERT compatible file.
Barring that, you can write a program that will parse the insert statements into something that BULK INSERT will accept.
BULK INSERT uses BCP format files which come in traditional (non-XML) or XML. Does it have to get a new identity and use it in a child and you can't get away with using SET IDENTITY INSERT ON because the database design has changed so much? If so, I think you might be better off using SSIS or similar and doing a Merge Join once the identities are assigned. You could also load the data into staging tables in SQL using SSIS or BCP and then use regular SQL (potentially within SSIS in a SQL task) with the OUTPUT INTO feature to capture the identities and use them in the children.
Just execute the script. We regularly run backup / restore scripts that are 100's Mb in size. It only takes 30 seconds or so.
If it is critical not to block your server for this amount to time, you'll have to really split it up a bit.
Also look into the -tab option of mysqldump with outputs the data using TO OUTFILE, which is more efficient and faster to load.
It sounds like this is generating a single INSERT for each row, which is really going to be pretty slow. If they are all wrapped in a transaction, too, that can be kind of slow (although the number of rows doesn't sound that big that it would cause a transaction to be nearly impossible - like if you were holding a multi-million row insert in a transaction).
You might be better off looking at ETL (DTS, SSIS, BCP or BULK INSERT FROM, or some other tool) to migrate the data instead of scripting each insert.
You could break up the script and execute it in parts (especially if currently it makes it all one big transaction), just automate the execution of the individual scripts using PowerShell or similar.
I've been looking into the "BULK INSERT" from file option but cannot see any examples of the file format. Can the file mix the row formats or does it have to always be consistent in a CSV fashion? The reason I ask is that I've got identities involved across various parent / child tables which is why inserts per row are currently being used.