Bulk Import of CSV into SQL Server

I have a .CSV file that contains more than 100,000 rows.
I have tried the following method to import the CSV into the table "Root":
BULK INSERT [dbo].[Root]
FROM 'C:\Original.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
But I get many errors, such as one telling me to check my terminators.
I opened the CSV with Notepad. I can't see a ',' or '\n' terminator, but there is a square box character at the end of each row.
Please help me import this CSV into the table.

http://msdn.microsoft.com/en-us/library/ms188609.aspx
Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. Note that the field terminator of a CSV file does not have to be a comma. To be usable as a data file for bulk import, a CSV file must comply with the following restrictions:
Data fields never contain the field terminator.
Either none or all of the values in a data field are enclosed in quotation marks ("").
Note: There may be other unseen characters that need to be stripped from the source file. VIM (command ":set list") or Notepad++(View > Show Symbol > Show All Characters) are two methods to check.
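In the original question's case, the square box at the end of each row is most likely a carriage return or some other non-printing character. A minimal sketch worth trying, assuming the rows actually end in CR+LF (the table and path are the ones from the question; the terminator is only a guess until you confirm it in VIM or Notepad++ as described above), is to spell the row terminator out in hex:
BULK INSERT [dbo].[Root]
FROM 'C:\Original.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0d0a' -- CR+LF in hex; try '0x0a' instead if the file uses bare LF endings
)
If a field itself contains commas or quotes, this still won't be enough, and you'll need one of the approaches discussed further down.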

If you are comfortable with Java, I have written a set of tools for CSV manipulation, including an importer and exporter. The project is up on Github.com:
https://github.com/carlspring/csv-db-tools
The importer is here:
https://github.com/carlspring/csv-db-tools/tree/master/csv-db-importer
For instructions on how to use the importer, check:
https://github.com/carlspring/csv-db-tools/blob/master/csv-db-importer/USAGE
You will need to make a simple mapping file. An example can be seen here:
https://github.com/carlspring/csv-db-tools/blob/master/csv-db-importer/src/test/resources/configuration-large.xml

Related

SQL How can I copy a csv file into a table with this delimiter problem?

I'm trying to copy a csv file into a table. The delimiter is ',', but the csv file also has a field named 'Description' that contains ',' as part of the text, not as a delimiter.
How can I copy the csv file into the import table?
If the comma is always within the double quotes then it shouldn't be a problem.
If not, you have a corrupt CSV file. The simplest way is probably to parse your file prior to importing to fix the corruption.
The details of how exactly to parse will depend on the dataset. Which fields are optional? Which fields are compulsory? How many commas can occur at most? That kind of information is crucial for writing a parsing script.
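Assuming SQL Server (which the rest of this thread is about) and version 2017 or later, one quick sketch worth trying when the extra commas are always inside double quotes is to let BULK INSERT parse the file as a real CSV. The table name and path below are placeholders, not from the question:
BULK INSERT ImportTable
FROM 'C:\data\source.csv'
WITH
(
FORMAT = 'CSV', -- SQL Server 2017+: honours quoted fields, so "a, b" stays one value
FIELDQUOTE = '"',
FIRSTROW = 2
)
On older versions you are back to parsing or re-exporting the file first, as described above.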

importing excel table into database

I have the following table in xlsx format which I would like to import into my MySQL database:
The table is pretty complicated, and I only want the records after '1)HEADING'.
I have been looking at PHP libraries for importing into SQL, but they only seem to handle simple Excel files.
You have two ways to do this:
First method:
1) Export it to some text format. The easiest will probably be a tab-delimited version, but CSV can work as well.
2) Use the load data capability. See http://dev.mysql.com/doc/refman/5.1/en/load-data.html
3) Look halfway down the page, where it gives a good example for tab-separated data (a sketch of the full statement follows this list):
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
4) Check your data. Sometimes quoting or escaping causes problems, and you need to adjust your source or your import command, or it may just be easier to post-process via SQL.
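A minimal sketch of steps 2 and 3, assuming the spreadsheet was saved as a tab-delimited file and is being loaded into a hypothetical table called my_table with made-up column names:
LOAD DATA LOCAL INFILE '/tmp/export.txt'
INTO TABLE my_table
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
IGNORE 1 LINES -- skip the header row
(date_col, item_col, amount_col); -- list only the columns present in the export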
Second method:
There's a simple online tool that can do this called sqlizer.io.
You upload an XLSX file, enter a sheet name and cell range, and it generates a CREATE TABLE statement and a set of INSERT statements that import all your data into a MySQL database.
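The generated script is plain SQL along these lines (an illustrative sketch, not the tool's exact output; the table and column names are made up):
CREATE TABLE my_sheet (
id INT,
heading VARCHAR(255),
amount DECIMAL(10,2)
);
INSERT INTO my_sheet (id, heading, amount) VALUES (1, 'First row', 13012.55);
INSERT INTO my_sheet (id, heading, amount) VALUES (2, 'Second row', 42.00);
You then just run that script against your MySQL database.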

sql server Bulk insert csv with data having comma

Below is a sample line of the CSV:
012,12/11/2013,"<555523051548>KRISHNA KUMAR ASHOKU,AR",<10-12-2013>,555523051548,12/11/2013,"13,012.55",
You can see that KRISHNA KUMAR ASHOKU,AR should be a single field, but it is treated as two different fields, KRISHNA KUMAR ASHOKU and AR, because of the comma, even though the value is enclosed in double quotes.
I tried
BULK
INSERT tbl
FROM 'd:\1.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW=2
)
GO
Is there any solution for this?
The answer is: you can't do that. See http://technet.microsoft.com/en-us/library/ms188365.aspx.
"Importing Data from a CSV file
Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. For information about the requirements for importing data from a CSV data file, see Prepare Data for Bulk Export or Import (SQL Server)."
The general solution is that you must convert your CSV file into one that can be successfully imported. You can do that in many ways, such as by creating the file with a different delimiter (such as TAB), or by importing your table using a tool that understands CSV files (such as Excel or many scripting languages) and exporting it with a unique delimiter (such as TAB), from which you can then BULK INSERT.
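For example (a sketch with made-up file names): once the data has been re-exported with TAB as the delimiter, the load is straightforward because tabs never occur inside the values:
BULK INSERT tbl
FROM 'd:\1.tab' -- assumed name of the re-exported, tab-delimited copy of d:\1.csv
WITH
(
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n',
FIRSTROW = 2
)
GO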
They added support for this in SQL Server 2017 (14.x) CTP 1.1. You need to use the FORMAT = 'CSV' input file option with the BULK INSERT command.
To be clear, here is what the CSV that was giving me problems looks like. The first line is easy to parse; the second line contains the curveball, since there is a comma inside the quoted field:
jenkins-2019-09-25_cve-2019-10401,CVE-2019-10401,4,Jenkins Advisory 2019-09-25: CVE-2019-10401:
jenkins-2019-09-25_cve-2019-10403_cve-2019-10404,"CVE-2019-10404,CVE-2019-10403",4,Jenkins Advisory 2019-09-25: CVE-2019-10403: CVE-2019-10404:
Broken Code
BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0a',
FIRSTROW= 2
);
Working Code
BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0a',
FORMAT = 'CSV',
FIRSTROW= 2
);
Unfortunately, the SQL Server import methods (BCP and BULK INSERT) do not understand quoting with " ".
Source : http://msdn.microsoft.com/en-us/library/ms191485%28v=sql.100%29.aspx
I encountered this problem recently and had to switch to tab-delimited format. If you do that and use SQL Server Management Studio to do the import (right-click on the database, then select Tasks, then Import), tab-delimited works just fine. The bulk insert option with tab-delimited should also work.
I must admit to being very surprised when finding out that Microsoft SQL Server had this comma-delimited issue. The CSV file format is a very old one, so finding out that this was an issue with a modern database was very disappointing.
MS have now addressed this issue: you can use FIELDQUOTE in your WITH clause to add quoted-string support. Adding
FIELDQUOTE = '"',
anywhere in your WITH clause should do the trick, if you have SQL Server 2017 or above.
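Put together with the file from the question, that looks roughly like this (a sketch; the table name and path are taken from the earlier attempt, and FORMAT = 'CSV' is included because FIELDQUOTE belongs to the CSV support added in 2017):
BULK INSERT tbl
FROM 'd:\1.csv'
WITH
(
FORMAT = 'CSV',
FIELDQUOTE = '"', -- commas inside "..." are no longer treated as field breaks
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);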
Well, BULK INSERT is very fast but not very flexible. Can you load the data into a staging table and then push everything into a production table? Once the data is in SQL Server, you will have a lot more control over how you move it from one table to another. So, basically (a sketch follows this list):
1) Load the data into staging.
2) Clean/convert by copying it to a second staging table defined with the desired datatypes. Good data is copied over, bad data is left behind.
3) Copy the data from the "clean" table to the "live" table.
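A rough sketch of that flow, with made-up table and column names, using TRY_CONVERT (which needs SQL Server 2012 or later):
-- 1) Staging table: everything lands as text so the bulk load cannot fail on datatypes
CREATE TABLE StagingRaw (Amount VARCHAR(50), OrderDate VARCHAR(50));
BULK INSERT StagingRaw FROM 'C:\data\source.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);
-- 2) Clean/convert: only rows that convert successfully move on
CREATE TABLE StagingClean (Amount DECIMAL(18,2), OrderDate DATE);
INSERT INTO StagingClean (Amount, OrderDate)
SELECT TRY_CONVERT(DECIMAL(18,2), REPLACE(Amount, ',', '')), TRY_CONVERT(DATE, OrderDate)
FROM StagingRaw
WHERE TRY_CONVERT(DECIMAL(18,2), REPLACE(Amount, ',', '')) IS NOT NULL
AND TRY_CONVERT(DATE, OrderDate) IS NOT NULL;
-- 3) Copy the clean rows into the live table
INSERT INTO LiveTable (Amount, OrderDate)
SELECT Amount, OrderDate FROM StagingClean;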

Only import specific data columns - Comma Delimited List

I used the following command to import data from a text file; however, I need to find a way of selecting specific columns within the text file. The following links have been suggested to me, but I'm struggling to understand whether I need to replace my current SQL with the examples on MSDN:
BULK INSERT T2 FROM 'c:\Temp\Data.txt' WITH (FIELDTERMINATOR = ',')
http://msdn.microsoft.com/en-us/library/ms179250.aspx
http://msdn.microsoft.com/en-us/library/ms187908.aspx
I have the following fields held within a text file, separated by commas. The data is also comma-separated, which lets me use the above code to import it all.
Date,Time,Order,Item,Delivery Slot,Delivery Time
Is there a way to only import Date, Time, Item and Delivery Time into an SQL database table?
Use a Format File for your BULK INSERT. You can specify which fields are imported through this file definition.
EDIT: example from MSDN.
BULK INSERT bulktest..t_float
FROM 'C:\t_float-c.dat' WITH (FORMATFILE='C:\t_floatformat-c-xml.xml');
GO
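Applied to the file from the question, the idea looks roughly like this. It is a sketch only: the non-XML format file below is hand-written, the version number on its first line should match your SQL Server release, and it assumes the target table T2 has exactly the columns Date, Time, Item and DeliveryTime, in that order. A field is skipped by giving it server column order 0, which is how Order and Delivery Slot are read but not loaded:
/* Columns.fmt (hypothetical non-XML format file)
12.0
6
1  SQLCHAR  0  50  ","     1  Date          ""
2  SQLCHAR  0  50  ","     2  Time          ""
3  SQLCHAR  0  50  ","     0  OrderIgnored  ""
4  SQLCHAR  0  50  ","     3  Item          ""
5  SQLCHAR  0  50  ","     0  SlotIgnored   ""
6  SQLCHAR  0  50  "\r\n"  4  DeliveryTime  ""
*/
BULK INSERT T2
FROM 'c:\Temp\Data.txt'
WITH (FORMATFILE = 'c:\Temp\Columns.fmt', FIRSTROW = 2); -- FIRSTROW = 2 assumes the header row shown above is the first line of the file
GO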

How to import a very large csv file into an existing SQL Server table?

I have a very large csv file with ~500 columns, ~350k rows, which I am trying to import into an existing SQL Server table.
I have tried BULK INSERT; I get "Query executed successfully, 0 rows affected." Interestingly, BULK INSERT worked, in a matter of seconds, for a similar operation but with a much smaller csv file: fewer than 50 columns and ~77k rows.
I have also tried bcp; I get "Unexpected EOF encountered in BCP data-file. BCP copy in failed."
The task is simple, and it shouldn't be this frustrating. Any ideas or suggestions? Any other tools or utilities that you have successfully used to accomplish a bulk import operation or something similar? Thanks.
-- BULK INSERT
USE myDb
BULK INSERT myTable
FROM 'C:\Users\myFile.csv'
WITH
(
FIRSTROW = 2,
-- DATAFILETYPE = 'char',
-- MAXERRORS = 100,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
);
-- bcp
bcp myDb.dbo.myTable in 'C:\Users\myFile.csv' -T -t, -c
UPDATE
I have now changed course. I've decided to join the csv files, which was my goal to begin with, outside of SQL Server so that I don't have to upload the data to a table for now. However, it'll be interesting to try to upload (BULK INSERT or 'bcp') only 1 record (~490 cols.) from the csv file, which otherwise failed, and see if it works.
Check your file for an EOF character where it shouldn't be - BCP is telling you there is a problem with the file.
Notepad ++ may be able to load the file for you to view and search.
Most likely the last line lacks a \n. Also, there is a limitation on row size (8060 bytes) in SQL Server, although T-SQL should have mentioned this. However, check this link:
My advice: Start with one row and get it to work. Then the rest.
How are you mapping the fields in the file with the columns in the table? Are the number of columns in the table the same as the number of fields in the file? Or are you using a format file to specify the column mapping? If so, is the format file formatted correctly?
If you are using the format file and if you have the "Number of columns" parameter wrong, it will cause the error "Unexpected end of file". See this for some other errors/issues with bulk uploading.
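If you want BULK INSERT itself to show you which rows it chokes on, one option (a sketch; the error-file path is made up) is to let it log rejected rows instead of stopping at the first problem:
BULK INSERT myTable
FROM 'C:\Users\myFile.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
MAXERRORS = 100, -- keep going past individual bad rows
ERRORFILE = 'C:\Users\myFile_errors.log' -- rejected rows are written here for inspection
);
The error file and its companion .Error.Txt file usually make it obvious whether the problem is a missing terminator, a row-size limit, or a column-count mismatch.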
It is probably not the solution you're expecting, but with Python you can create a table out of the csv very easily (I just uploaded a 1 GB CSV file this way):
import pandas as pd
import psycopg2
from sqlalchemy import create_engine
# Read the csv to a dataframe
df = pd.read_csv('path_to_csv_file', index_col='name_of_index_column', sep=",")
# Connect and upload
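# Note: this connection string targets PostgreSQL; for the SQL Server table in the question
# you would presumably swap it for an mssql+pyodbc URL instead (an assumption, not tested here)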
engine = create_engine('postgresql+psycopg2://db_user_name:db_password@localhost:5432/' + 'db_name', client_encoding='utf8')
df.to_sql('table_name', engine, if_exists='replace', index=True, index_label='name_of_index_column')