How many ways to skip first record from a file using bcp - batch-processing

My file format is .xx1. When I try to exclude the first record using -F 2, it picks up from the 3rd line instead of the 2nd line.

You may try this:
bulk insert <table> from '<path of data file>'
with
(
firstrow=2,
formatfile='<path of formatfile>'
)

Related

How to clean bad data from huge csv file

So I have a huge CSV file (assume 5 GB) and I want to insert the data into a table, but it returns an error saying the length of the data is not the same.
I found that some rows have more columns than I want.
For example, correct rows have 8 columns, but some rows have 9 (it may be a human or system error).
I want to keep only the 8-column rows, but because the data is so huge, I cannot do it manually or by parsing in Python.
Can anyone recommend a way to do it?
I am using Linux, so any Linux command is also welcome.
In SQL I am using the COPY ... FROM ... CSV HEADER; command to import the CSV into the table.
You can use awk for this purpose. Assuming your field delimiter is a comma (,), this code can do the work:
awk -F\, 'NF==8 {print}' input_file >output_file
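A quick and dirty PHP solution as a single command line (note that it truncates every row to its first 8 columns rather than dropping the 9-column rows):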
php -r '$f=fopen("a.csv","rb"); $g=fopen("b.csv","wb"); while ( $r=fgetcsv($f) ) { $r = array_slice($r,0,8); fputcsv($g,$r); }'
It reads file a.csv and writes b.csv.
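The awk one-liner above splits on every comma, so a quoted field that itself contains a comma would be miscounted. If that matters for your data, a minimal Python sketch along these lines may help. It streams the file row by row, so the whole 5 GB is never held in memory. The function name and the UTF-8 encoding are assumptions for illustration, not something from the question.

```python
import csv


def keep_rows_with_n_fields(src_path, dst_path, n_fields=8):
    """Copy only the rows that have exactly n_fields columns.

    Returns a (kept, dropped) tuple. The header row is treated like any
    other row, so it is kept as long as it also has n_fields columns.
    """
    kept = dropped = 0
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, lineterminator="\n")
        for row in csv.reader(src):  # one row at a time, not the whole file
            if len(row) == n_fields:
                writer.writerow(row)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Calling keep_rows_with_n_fields("a.csv", "b.csv") writes the cleaned file and reports how many rows were discarded, which is worth checking before running COPY.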

What does 'insert overwrite local directory' mean in Hive?

I'm having some trouble understanding what the following type of query does:
insert overwrite local directory $directory_name$
select $some_query$
What does this mean, and what are the side effects of this?
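It exports the query results into files in a directory on the local file system. As a side effect, OVERWRITE replaces whatever the target directory already contains, so point it at a directory you can afford to lose.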
insert overwrite local directory '/tmp/hello'
row format delimited
fields terminated by '|'
select 1,2,3,'Hello','world'
;
! ls /tmp/hello;
000000_0
! cat /tmp/hello/000000_0;
1|2|3|Hello|world

Running Query from text file

I'm trying to run a BigQuery query from the command line, but because my query is very long I've written it in a text file. The query works from the GUI, and I'm overwriting a table that already exists.
bq query --allow_large_results --replace --destination_table=me.Tbl_MyTable '`cat query.txt`'
However, I'm getting error results:
Error in query string: Error processing job
'dev:bqjob_r_00000123456789456123_1': Encountered "
"\'cat query.txt\' "" at line 1, column 1.
Was expecting: EOF
Do I need to put the entire file path in the .txt filename? (this doesn't seem to make a difference)
Are there any characters I need to be careful with in the text file (e.g. "\" or quotation marks) ?
I'm using where clauses and group by clauses - is that an issue?
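The single quotes stop the shell from expanding the backticks, so bq receives the literal text `cat query.txt` as the query. Instead of cat, just redirect the input from the file. The command would be: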
bq query --allow_large_results --replace --destination_table=me.Tbl_MyTable < query.txt
This will send the contents of query.txt to the bq tool.
Elliot is right. If you want to use cat, sed, or anything else, pipe it:
cat query.txt | bq query

What to use instead of bulk load? Got error 'do not have permission to use the bulk load statement'

I am trying to load a text file into a SQL database table using BULK INSERT.
BULK INSERT My_Tablename
FROM 'C:\testing\temptest.txt'
WITH
(
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n'
)
GO
But I got the error 'do not have permission to use the bulk load statement'.
Is there any alternative way to do it?
I don't want to set TRUSTWORTHY ON or create a certificate for the bulk admin permission.
Try using the SQL Server Import and Export Wizard.
Right-click the database in Object Explorer within SSMS.
Go to Tasks > Import Data
Select "Flat File Source" for your data source and follow the wizard to specify delimiters, etc.
Although @SQLChao definitely has the answer, I could not remember where the Import Data option was. Instead, I opened the delimited file in my favorite text editor, Notepad++, and ran the following find-and-replace operations with Search Mode set to Extended:
Find: ' Replace: ''
Find: | Replace: ','
Find: \r\n Replace: ')\r\n
Find: \r\n Replace: \r\nINSERT INTO [DB_Name].[Schema_Name].[Table_Name] VALUES(\r\n'
The only issues should be in your first and last INSERT statements, which can be edited manually as needed.
I then copied the text straight into SQL Server and executed it.
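The same find-and-replace idea can be scripted. The sketch below is only an illustration of that approach: the function names and the table name passed in are hypothetical, every field is emitted as a quoted string (relying on SQL Server to convert types implicitly, as the editor-based steps do), and it should only be used on data you trust.

```python
def line_to_insert(line, table, delimiter="|"):
    """Turn one delimited line into an INSERT statement.

    Every field is written as a quoted string literal; single quotes
    inside a field are doubled, mirroring the first find/replace step.
    """
    fields = line.rstrip("\r\n").split(delimiter)
    values = ", ".join("'" + field.replace("'", "''") + "'" for field in fields)
    return f"INSERT INTO {table} VALUES ({values});"


def file_to_inserts(src_path, dst_path, table, delimiter="|"):
    """Write one INSERT per non-blank line of src_path; return the count."""
    count = 0
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                dst.write(line_to_insert(line, table, delimiter) + "\n")
                count += 1
    return count
```

Because each line becomes a complete statement, the first and last rows need no manual repair. The generated script can then be pasted into SSMS and executed as before.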

Issue with bulk insert

I am trying to insert the data from this link to my SQL server
https://www.ian.com/affiliatecenter/include/V2/CityCoordinatesList.zip
I created the table
CREATE TABLE [dbo].[tblCityCoordinatesList](
[RegionID] [int] NOT NULL,
[RegionName] [nvarchar](255) NULL,
[Coordinates] [nvarchar](4000) NULL
) ON [PRIMARY]
And I am running the following script to do the bulk insert
BULK INSERT tblCityCoordinatesList
FROM 'C:\data\CityCoordinatesList.txt'
WITH
(
FIRSTROW = 2,
MAXERRORS = 0,
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n'
)
But the bulk insert fails with following error
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".
When I googled the error, I found several articles that say the issue may be with ROWTERMINATOR, but I tried everything, such as \n\r and \n, and nothing works.
Could anyone please help me to insert this data into my database?
Try ROWTERMINATOR = '0x0a'. It should work when the file uses bare line feeds (LF) as line endings.
This can also happen if the number of columns in the imported file does not match the number of columns in the table.
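One way to check for a column mismatch before running BULK INSERT is to count the fields on each line. This is a rough sketch: the helper name is made up for illustration, and it assumes a plain delimiter with no quoting, as in the pipe-delimited file from the question.

```python
from collections import Counter


def field_count_histogram(lines, delimiter="|"):
    """Map 'number of fields on a line' to 'how many lines have that many'."""
    return Counter(
        len(line.rstrip("\r\n").split(delimiter))
        for line in lines
    )
```

Passing an open file object as lines works, since files iterate line by line. For a three-column table, anything other than a single key of 3 in the result points at rows that will not load cleanly.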
I got the same error message, and as you mentioned, it was related to an unexpected line ending.
In my case the line ending was specified in a .fmt file as a Windows line ending (CRLF), written as \r\n, while the data file to process had a classic Mac one (CR).
I solved it with an editor that can show the current line ending and change it. I used EditPad Lite, which shows the line ending of the opened file in the bottom bar; clicking it lets you replace it with the expected one.
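If no such editor is at hand, the line endings can also be inspected and converted with a few lines of Python. This is a minimal sketch with hypothetical function names. It works on raw bytes and assumes an ASCII-compatible encoding such as UTF-8, so it is not suitable for UTF-16 files.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone CR, and lone LF line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF
    return {"CRLF": crlf, "CR": cr, "LF": lf}


def normalize_line_endings(data: bytes, target: bytes = b"\r\n") -> bytes:
    """Rewrite every line ending (CRLF, CR, or LF) as target."""
    # Collapse everything to LF first so CRLF is not converted twice.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", target)
```

Read the file with open(path, "rb") so Python does not translate the line endings itself, check the counts, and write the normalized bytes back with "wb" using whichever terminator the format file or ROWTERMINATOR expects.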
I had this on SQL Server 2019 when the FORMAT='CSV' option was used and there was a comma at the end of each line in the source file. The table you're bulk inserting into then needs an extra dummy column, because each record effectively ends with a blank field in the source file.
I got the same error, probably because of a file encoding problem. I fixed it by opening the problem CSV file in Notepad++, selecting everything, and copying it to the clipboard. Next, I created a new text file (making sure it had the .csv extension), opened it in Notepad++, and pasted the text into it. After saving and closing all files, you should be able to load the new CSV file into SQL Server.
You need to run the BULK INSERT command from a Windows login (not from a SQL login). I don't have any examples at the moment.