Postgres extra data after last expected column CSV

Postgres extra data after last expected column CSV - sql

I have a CSV like this
something,other,5,"The problem, is "right" here"
to import I write
COPY table_name FROM '/tmp/data.csv' CSV HEADER;
now, I think that the problem is the double quote inside the doble quote, am I right?
So, how can I fix it?
Thanks.

If you ever have doubt, you can just put the data directly into a table and then export the contents using the same copy command (only do "copy to"). Or, you can put it in the spreadsheet of your choice and export to CSV.
something,other,5,"The problem is ""right"" here"

Related

psycopg2: export csv to database, dealing with e+ expression

I have a csv file containing
numbers like "1.456e+07"
and I am using function "copy_expert" to export the file to database
but I am getting error
psycopg2.DataError: invalid input syntax for integer: "1.5637e+07"
I notice that I can insert "100" as an integer, but when I do "1.5637e+07" with qoute, it doesn't work.
I am using pandas dataframe's to_csv to generate the csv files. not sure how to get rid of qoute for integer like "1.5637e+07" only (I have string column), or whether there is other solution.

I find out the solution
Normally, pandas doesn't put quotes around number. However, I set float_format parameter which causes this. I reset
quoting=csv.QUOTE_MINIMAL
in the function call and the quotes go away.

Loading huge csv file using COPY

I am loading CSV file using COPY.
COPY cts FROM 'C:\...\cts.csv' using DELIMITERS',';
However, error comes out
ERROR: invalid input syntax for type double precision: ""
CONTEXT: COPY testdata, line 7, column latitude: ""
How to fix it please?

Looks like your CSV isn't quite formatted correctly. "" isn't a number, and numbers don't need to be be quoted in CSV.
I find it's usually easier in PostgreSQL to create a staging import table with all text columns, and import CSVs to there first. Then do a cleanup query to put the CSV data into the real table.

Access VBA, importing csv file via TransferText with commata as decimal separator and semicolon as delimiter

I'm having some problems importing double numbers from csv files. The files have a semicolon delimiter and comma as decimal separator.
I can't set up import specs since the order of the fields in the csv often changes and it would be a desaster if the data goes into the wrong field.
Also the csv files will have to written to a temporary table first. Don't hate me for it, but since I have to process data and set some information fields for later data processing this is by far the easiest, fastest and safest way to achieve it.
Here is the problem itself:
When using TransferText it will import, but of course interpret the comma as delimiter. Not good ...
When replacing comma by full stop and semicolon by comma it works. But it will ignore full stops, so 1.2 becomes 12, 1.333 becomes 1333. The field will be of type double.
I've tests numerous things. Besides TransferText I've tried:
DoCmd.RunSQL ("INSERT INTO Tabelle1 SELECT cdbl(a1) as aa FROM[TEXT;FMT=Delimited;HDR=YES;CharacterSet=437;DATABASE=C:\SPOT].[test.csv]")
But nothing seems to work, even when I create a new table with field type DOUBLE before using TransferText ... decimals are still ignored.
So, I would be happy if you could tell me either how to use TransferText with or without replacing semicolon and comma in a first step or how to use the INSERT INTO stuff.
Thank you very much!

Ok, I think I got it!
The problem where the regional settings and that my Access uses comma as decimal separator. I was also not able to create a Import Spec via manual import, since it needs to have defined which fields will have to be imported.
What I did now was this:
Open the table MSysIMEXSpecsthat contains the import specs via query:
select * from MSysIMEXSpecs
Then add a new row and set SpecName = "Whatever", DecimalPoint= "," and 'FieldSeparator` = ";" and whatever other settings have to be made.
Since there is this workaround, isn't there a way to do this easier?

Cannot upload CSV that starts with an integer

I'm stuck with what seems like a weird BigQuery bug : I cannot upload a CSV file that starts (first line, first column) by an integer.
Here's my schema : COL1:INTEGER,COL2:INTEGER,COL3:STRING
Here's my csv file content :
100,4,XXX
100,4,XXX
If I put the STRING column as first column, the upload is OK.
If I add a header and tell BigQuery to skip it during the import, the upload is ok too.
But with the CSV and schema above, BigQuery always complains : Line:1 / Field:1, Value cannot be converted to expected type.
Anyone knows what the problem is ?
Thank you in advance,
David

I could not reproduce this problem--I copied and pasted the content into a file and uploaded it with no problems.
Perhaps the uploaded file format is corrupted somehow? If there are extra bytes at the beginning of the file, those would be ignored in a header row but might result in this error is the first value of the first field is expected to be an integer. I'd recommend examining the actual binary data in the file to make sure there's nothing funny going on.
Also, are you doing this import via web UI, command-line tool, or API? Have you tried one of the other methods?

How to make Postgres Copy ignore first line of large txt file

I have a fairly large .txt file ~9gb and I will like to load this txt file into postgres. The first row is the header, followed by all the data. If I postgres COPY the data directly, the header will cause an error that data type do not match with my postgres table, so I will need to remove it somehow.
Sample data:
ProjectId,MailId,MailCodeId,prospectid,listid,datemailed,amount,donated,zip,zip4,VectorMajor,VectorMinor,packageid,phase,databaseid,amount2
15,53568419,89734,219906,15,2011-05-11 00:00:00,0,0,90720,2915,NonProfit,POLICY,230,3,1,0
16,84141863,87936,164657,243,2011-03-10 00:00:00,0,0,48362,2523,NonProfit,POLICY,1507,5,1,0
16,81442028,86632,15181625,243,2011-01-19 00:00:00,0,0,11501,2115,NonProfit,POLICY,1508,2,1,0
While the COPY function for postgres has the "header" setting that can ignore the first row, it only works for csv files:
copy training from 'C:/testCSV.csv' DELIMITER ',' csv header;
when I try to run the code above on my txt file, it gets an error:
copy training from 'C:/testTXTFile.txt' DELIMITER ',' csv header
ERROR: unquoted newline found in data
HINT: Use quoted CSV field to represent newline.
I have tried adding "quote" and "escape" attributes but the command just won't seem to work for txt file:
copy training from 'C:/testTXTFile.txt' DELIMITER ',' csv header quote as E'"' escape as E'\\N';
ERROR: COPY escape must be a single one-byte character
Alternatively, I thought about running java or create a seperate stagging table to remove the first row...but these solutions are expansive and time consuming. I will need to load 9gb of data just to remove the first row of headers... are there other solutions out there to remove the first row of a txt file easily so that I can load the data into my postgres database?

Use HEADER option with CSV option:
\copy <table_name> from '/source_file.csv' delimiter ',' CSV HEADER ;
HEADER
Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV format.

I've looked up docs at https://www.postgresql.org/docs/10/sql-copy.html
written about HEADER is not only true for CSV, but TSV also!
My solution was this in psql
\COPY mytable FROM 'mydata.tsv' DELIMITER E'\t' CSV HEADER;
(in addition mydata.tsv contaned header row which I excluded from copying to database table)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Postgres extra data after last expected column CSV - sql

I have a CSV like this something,other,5,"The problem, is "right" here" to import I write COPY table_name FROM '/tmp/data.csv' CSV HEADER; now, I think that the problem is the double quote inside the doble quote, am I right? So, how can I fix it? Thanks.

If you ever have doubt, you can just put the data directly into a table and then export the contents using the same copy command (only do "copy to"). Or, you can put it in the spreadsheet of your choice and export to CSV. something,other,5,"The problem is ""right"" here"

Related

psycopg2: export csv to database, dealing with e+ expression

Loading huge csv file using COPY

Access VBA, importing csv file via TransferText with commata as decimal separator and semicolon as delimiter

Cannot upload CSV that starts with an integer

How to make Postgres Copy ignore first line of large txt file

Categories

Resources