Import large delimited .txt file in SQL Server 2008

Every morning one of my clients sends me a .txt file with ';' as the separator, and this is what is currently being imported into a temp table using SSIS:
mario.mascarenhas;MARIO LUIZ MASCARENHAS;2017-03-21 13:18:22;PDV;94d33a66dbaaff15a01d8139c7acd7c6;;;1;0;0;0;0;0;0;0;0;0;0;\N
evilanio.asevedo;EVILANIO ASEVEDO;2017-03-21 13:26:10;PDV;30a1bd072ac5f158f99445bb0975e423;;;1;1;0;0;0;0;0;0;0;0;0;\N
marcelo.tarso;MARCELO TARSO;2017-03-21 13:47:09;PDV;ef6b5e971242ec345552cdb724968f8a;;;1;0;0;0;0;0;0;0;0;0;0;\N
tiago.rodrigues;TIAGO ALVES RODRIGUES;2017-03-21 13:49:04;PDV;d782d4b30c0d302fe815b2cb48de4d03;;;1;1;0;0;0;0;0;0;0;0;0;\N
roberto.freire;ROBERTO CUSTODIO;2017-03-21 13:54:53;PDV;78794a18187f068d612e6b6370a60781;;;1;0;0;0;1;0;0;0;0;0;0;\N
eduardo.lima;EDUARDO MORO LIMA;2017-03-21 13:55:24;PDV;83e1c2696faa83d54881b13c70a07924;;;1;0;0;0;0;0;0;0;0;0;0;\N
Each file contains at least 23,000 rows just like that.
I already made a table with the correct number of columns to receive this data. So what I want is to "explode" (just like in PHP) each row using ';' as the column separator and loop the insert into my table named dbo.showHistoricalLogging.
I've been searching for a solution here on Stack Overflow, but found nothing specific that takes this volume of data into consideration while looping an insert.
Any idea? I'm running SQL Server 2008.

My suggestion:
Convert the text file into a CSV file, then refer to this post from Stack Overflow on using BULK INSERT. I used this before at the University of Arizona for one of my programming assignments in my Database Designs class. If you have any clarifications and/or questions, leave them in a comment and I will do my best to help.

Something like this should work:
BULK INSERT [TableName] FROM 'C:\MyFile.txt' WITH (FIELDTERMINATOR = ';', ROWTERMINATOR = '\n');
Consult the Microsoft BULK INSERT documentation if you need other parameters. Alternatively, SSIS makes this super easy as well; honestly, there are many ways you could do this.
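For the file shown in the question, a fuller sketch might look like this (the staging column names and types are my assumptions from the sample rows). With '\n' as the row terminator, the trailing \N marker arrives as literal text in the last column and can be nulled out after the load:

-- staging table for the 19 semicolon-separated fields shown above
CREATE TABLE dbo.stagingHistoricalLogging (
    loginName VARCHAR(100),
    fullName  VARCHAR(200),
    logDate   DATETIME,
    source    VARCHAR(20),
    hashValue VARCHAR(50),
    extra1 VARCHAR(50), extra2 VARCHAR(50),
    flag01 VARCHAR(5), flag02 VARCHAR(5), flag03 VARCHAR(5),
    flag04 VARCHAR(5), flag05 VARCHAR(5), flag06 VARCHAR(5),
    flag07 VARCHAR(5), flag08 VARCHAR(5), flag09 VARCHAR(5),
    flag10 VARCHAR(5), flag11 VARCHAR(5),
    trailer VARCHAR(5)   -- receives the literal \N that ends each row
);

BULK INSERT dbo.stagingHistoricalLogging
FROM 'C:\MyFile.txt'
WITH (FIELDTERMINATOR = ';', ROWTERMINATOR = '\n');

-- \N is text, not a SQL NULL, so clean it up before moving the data on
UPDATE dbo.stagingHistoricalLogging SET trailer = NULL WHERE trailer = '\N';

INSERT INTO dbo.showHistoricalLogging
SELECT * FROM dbo.stagingHistoricalLogging;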

Related

How to export VARCHAR2 data that includes commas(!) to Excel from MSSQLMgmtStudio(2012)

First, a grumble: MS builds SQL Server Studio AND Excel, but can't make one save in the standard format of the other?
OK, I'm a data analyst, but not allowed to change/mod either the data or structures directly. So full READ, but no WRITE.
I'm trying to do a dump so I can do some of this analysis offline, as I have no remote access either.
So one VARCHAR2 column in this table is for comments on the purchase of the asset being described/tracked. Of course, there are commas. The only export types built into SQL Server Studio are .csv and .txt, and .csv just turns into a mess when 'comma' is included as a delimiter.
So after an hour or so of screwing around with this, I am throwing the problem on the pile at Stack Overflow. (That hour included reading a thread on methods for excluding the one column from a SELECT while still exporting the other 221 columns in the table, without having to write them all out manually. Fun reading, and impressive, but it means I'd have to figure out which of those methods actually works, and then still export the one column separately and insert it into the Excel file separately.)
Someone else must have worked around this frustration of the .csv format as export VS the commas embedded in 'comment' text.
Any help would be appreciated.
Why don't you simply select all the data in the SSMS results window, then copy and paste it into a blank Excel file?
It should paste all the data in the correct format, keeping comma-valued fields in a single column.
Try that.
So if you replace the ' with some special character, you can export it:
SELECT REPLACE(columnName, '''', '`')
FROM [Table];
Another solution, if you use Management Studio, is the SQL Server Import and Export Wizard:
https://learn.microsoft.com/en-us/sql/integration-services/import-export-data/start-the-sql-server-import-and-export-wizard

Full Text Search For Excel Files

I have several hundred Excel files that are not normalized enough for me to efficiently import into the tables of my database. Information is difficult to find, but from what I have seen it is possible to index .xlsx files with FTS. I am not really looking to implement an alternate database for this, as it is a one-time thing that will not receive new data.
Would it be possible to do this with FTS? If so, could someone point me in the right direction, as the info I've found on MSDN is quite vague.
Thanks.
I have done something similar using BULK INSERT. I would suggest taking a look at it:
http://www.sqlteam.com/article/using-bulk-insert-to-load-a-text-file
How it works: Excel data can be saved as a text file, where each column is separated by a ";" and each row by "\n". You can then use BULK INSERT to crawl through your Excel sheet and insert it into a table.
Note that all the values coming from BULK INSERT are text values. So if your table contains int values, for example, you will need a temporary table.
CREATE TABLE #TEMPORARYTABLE (
    col1 NVARCHAR(255),   -- one NVARCHAR column per column in the sheet
    col2 NVARCHAR(255)
);
The # creates a table that only exists until you disconnect from SQL Server.
All values in that table should be NVARCHAR.
You can then insert the contents of #TEMPORARYTABLE into your real table, CASTing the NVARCHAR values to int or whatever else you need.
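A minimal sketch of the whole flow, assuming a two-column sheet saved as semicolon-delimited text (the file path, target table, and column names are hypothetical):

BULK INSERT #TEMPORARYTABLE
FROM 'C:\sheet1.txt'
WITH (FIELDTERMINATOR = ';', ROWTERMINATOR = '\n');

-- cast the text values into the typed target table
INSERT INTO dbo.RealTable (id, label)
SELECT CAST(col1 AS INT), col2
FROM #TEMPORARYTABLE;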
FTS is a feature of SQL Server; the data you want to create FTS for needs to be in a SQL Server database.
With the data sitting in Excel and not in SQL Server, you will not be able to create FTS for those Excel sheets.
But if you import that data into SQL Server, then you will be able to make use of the FTS features; until then, unfortunately, FTS is not an option for you.
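Once the data is in a table, a minimal full-text setup looks like this (the catalog, table, column, and key-index names are made up for illustration):

CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;

-- requires an existing unique, single-column, non-nullable index on the table
CREATE FULLTEXT INDEX ON dbo.ImportedSheets (sheetText)
    KEY INDEX PK_ImportedSheets;

-- then query with CONTAINS or FREETEXT
SELECT * FROM dbo.ImportedSheets WHERE CONTAINS(sheetText, 'budget');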

sql server Bulk insert csv with data having comma

Below is a sample line of the csv:
012,12/11/2013,"<555523051548>KRISHNA KUMAR ASHOKU,AR",<10-12-2013>,555523051548,12/11/2013,"13,012.55",
You can see that KRISHNA KUMAR ASHOKU,AR is a single field, but it is treated as two different fields, KRISHNA KUMAR ASHOKU and AR, because of the comma. They are enclosed in " but still no luck.
I tried
BULK
INSERT tbl
FROM 'd:\1.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW=2
)
GO
Is there any solution for it?
The answer is: you can't do that. See http://technet.microsoft.com/en-us/library/ms188365.aspx.
"Importing Data from a CSV file
Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. For information about the requirements for importing data from a CSV data file, see Prepare Data for Bulk Export or Import (SQL Server)."
The general solution is that you must convert your CSV file into one that can be successfully imported. You can do that in many ways, such as by creating the file with a different delimiter (such as TAB), or by importing your table using a tool that understands CSV files (such as Excel or many scripting languages) and exporting it with a unique delimiter (such as TAB), from which you can then BULK INSERT.
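For example, after re-exporting the file tab-delimited, the same load becomes (a sketch reusing the table and path from the question, with the new extension assumed):

BULK INSERT tbl
FROM 'd:\1.txt'
WITH
(
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n',
FIRSTROW=2
)
GO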
They added support for this in SQL Server 2017 (14.x) CTP 1.1. You need to use the FORMAT = 'CSV' input file option with the BULK INSERT command.
To be clear, here is what the csv looks like that was giving me problems; the first line is easy to parse, but the second line contains the curve ball, since there is a comma inside the quoted field:
jenkins-2019-09-25_cve-2019-10401,CVE-2019-10401,4,Jenkins Advisory 2019-09-25: CVE-2019-10401:
jenkins-2019-09-25_cve-2019-10403_cve-2019-10404,"CVE-2019-10404,CVE-2019-10403",4,Jenkins Advisory 2019-09-25: CVE-2019-10403: CVE-2019-10404:
Broken Code
BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0a',
FIRSTROW= 2
);
Working Code
BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0a',
FORMAT = 'CSV',
FIRSTROW= 2
);
Unfortunately, the SQL Server import methods (BCP and BULK INSERT) do not understand quoting with " ".
Source: http://msdn.microsoft.com/en-us/library/ms191485%28v=sql.100%29.aspx
I have encountered this problem recently and had to switch to tab-delimited format. If you do that and use the SQL Server Management Studio to do the import (Right-click on database, then select Tasks, then Import) tab-delimited works just fine. The bulk insert option with tab-delimited should also work.
I must admit to being very surprised when finding out that Microsoft SQL Server had this comma-delimited issue. The CSV file format is a very old one, so finding out that this was an issue with a modern database was very disappointing.
MS have now addressed this issue, and you can use FIELDQUOTE in your WITH clause to add quoted-string support:
FIELDQUOTE = '"',
anywhere in your WITH clause should do the trick, if you have SQL Server 2017 or above.
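Putting it together, something like this sketch should work (FIELDQUOTE defaults to the double quote, but spelling it out makes the intent clear):

BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
FORMAT = 'CSV',
FIELDQUOTE = '"',
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0a',
FIRSTROW= 2
);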
Well, BULK INSERT is very fast but not very flexible. Can you load the data into a staging table and then push everything into a production table? Once in SQL Server, you will have a lot more control over how you move data from one table to another. So, basically:
1) Load data into staging
2) Clean/Convert by copying to a second staging table defined using the desired datatypes. Good data copied over, bad data left behind
3) Copy data from the "clean" table to the "live" table
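A minimal sketch of those three steps, with hypothetical table and column names:

-- 1) load the raw text into staging
BULK INSERT dbo.StagingRaw
FROM 'c:\data.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- 2) clean/convert: rows that parse cleanly move on, bad rows stay behind
INSERT INTO dbo.StagingClean (id, name)
SELECT CAST(col1 AS INT), col2
FROM dbo.StagingRaw
WHERE col1 <> '' AND col1 NOT LIKE '%[^0-9]%';

-- 3) copy from the clean table to the live table
INSERT INTO dbo.LiveTable (id, name)
SELECT id, name FROM dbo.StagingClean;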

TSQL Bulk Insert

I have a csv file whose field delimiter is ,. My csv files are very big, and I need to import them into a SQL Server table. The process must be automated, and it is not a one-time job.
So I use BULK INSERT to insert such csv files. But today I received a csv file that has this row:
1,12312312,HOME ,"House, Gregory",P,NULL,NULL,NULL,NULL
The problem is that BULK INSERT splits this row, specifically the field "House, Gregory", into two fields: '"House' and ' Gregory"'.
Is there some way to make BULK INSERT understand that double quotes override the behaviour of the comma?
When I open this csv with Excel it sees this field normally as 'House, Gregory'
You need to preprocess your file; take a look at this answer:
SQL Server Bulk insert of CSV file with inconsistent quotes
If every row in the table has double quotes, you can specify ," and ", as column separators for that column using format files.
If not, get the file changed or you'll have to write some clever pre-processing routines somewhere.
The file format needs to be consistent for any of the SQL Server tools to work.
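As a sketch for the sample row above (nine varchar columns assumed, table and file names made up, and untested against this exact file): a non-XML format file can consume the quotes by making the third field's terminator ," and the fourth field's terminator ", (the \" inside the double-quoted terminators stands for the quote character):

12.0
9
1   SQLCHAR   0   50    ","      1   col1   ""
2   SQLCHAR   0   50    ","      2   col2   ""
3   SQLCHAR   0   50    ",\""    3   col3   ""
4   SQLCHAR   0   100   "\","    4   col4   ""
5   SQLCHAR   0   50    ","      5   col5   ""
6   SQLCHAR   0   50    ","      6   col6   ""
7   SQLCHAR   0   50    ","      7   col7   ""
8   SQLCHAR   0   50    ","      8   col8   ""
9   SQLCHAR   0   50    "\r\n"   9   col9   ""

Then point BULK INSERT at it:

BULK INSERT tbl
FROM 'd:\1.csv'
WITH (FORMATFILE = 'd:\1.fmt');

Note this only works while exactly one field is quoted in every row; if the quoting varies row by row, you are back to pre-processing.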
Since you are referring to SQL Server, I assume you have Access available as well (a Microsoft-friendly environment). If you do have Access, I recommend you use its Import Wizard. It is much smarter than the import wizard of SQL Server (even in version 2014), and smarter than the BULK INSERT SQL command as well.
It has a widget where you can define the text separator to be "; it also has no problems with string length because it uses the Access data type Text.
If you are satisfied with the results in Access, you can import them into SQL Server later seamlessly.
The best way to move the data from Access to SQL Server is using SQL Server Migration Assistant, available here

What is the easiest way to add a bunch of content to a SQL database?

Nothing technical here. Suppose I have a lot of different categorized data, and I would like to create a database out of it. Would someone literally hand plug in all that info with SQL code itself? Or do some people make a mock website just to input data? What are some of your strategies?
If there were no way to do it automatically, then a mock website would be the way to go: you could even use it with several people at once, actually multiplying the input speed (as long as you don't mess up assigning each of them a different part of the data).
What format is your data in, and how much of it is there? If it's Excel, then SQL Server has tools to import it; I'm not sure if MySQL has anything similar. Even if it doesn't, one other technique I have used with Excel data is to use a formula to concatenate as required to generate the INSERT statements, as in the sketch below. Then just paste those into a query window and run them.
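For example (a sketch; the sheet layout and table name are made up): with an id in column A and a name in column B, a formula like

="INSERT INTO MyTable (id, name) VALUES (" & A1 & ", '" & B1 & "');"

filled down the sheet produces one statement per row, ready to paste into a query window:

INSERT INTO MyTable (id, name) VALUES (1, 'Test');
INSERT INTO MyTable (id, name) VALUES (2, 'Test2');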
I wouldn't do a website for it unless I was building an admin site for it already and wanted to test that with the initial load.
Most databases have a way to do bulk inserts or have tools for data import.
My strategies normally involve such tools.
Here is an example of importing a CSV file to SQL Server.
Most database servers provide a way to import data from a variety of formats; you could look into that first.
If not, you could write a simple script or console application to parse your input data, and write out a SQL script to insert the data into appropriate tables.
For example, if you data was in a CSV file, you would parse each line in the file, and generate an insert statement to write out to a .sql file.
MyData.csv
1,2,3,'Test',4
2,3,4,'Test2',6
GeneratedInsert.sql
insert into table (col1,col2,col3,col4,col5) values (1,2,3,'Test',4)
insert into table (col1,col2,col3,col4,col5) values (2,3,4,'Test2',6)
MySQL has a statement LOAD DATA INFILE that is intended for loading bulk data from flat files. It's easy to use and much faster than alternative methods.
But first you do have to use SQL to design a table with fields that match the fields of your import data. That is, if you have a file with semicolon-separated data:
Titanic;1997;4 stars
Batman Begins;2005;5 stars
"Harry Potter and the Sorcerer's Stone";2001;3 stars
...
You would create a table:
CREATE TABLE Movies (
    title VARCHAR(100) NOT NULL,
    year YEAR NOT NULL,
    rating VARCHAR(10)
);
Then load data:
LOAD DATA INFILE 'movies.txt' INTO TABLE Movies
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"';
Most web languages have some sort of auto-scaffolding that you can quickly set up. Useful for admin work as well, if your site is hosted without direct access to the DB.
Otherwise, yeah: write the SQL statements. That's also useful for bringing a database up as part of your build process.