Error code: DelimitedTextMoreColumnsThanDefined Azure Data Factory - sql

I am trying to copy data from a CSV file to a SQL table in Azure Data Factory.
These are the typeProperties of the CSV dataset:
"typeProperties": {
"location": {
"type": "AzureBlobStorageLocation",
"fileName": "2020-09-16-stations.csv",
"container": "container"
},
"columnDelimiter": ",",
"escapeChar": "\\",
"firstRowAsHeader": true,
"quoteChar": "\""
}
I receive the following error:
ErrorCode=DelimitedTextMoreColumnsThanDefined,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error found when processing 'Csv/Tsv Format Text' source '2020-09-16-stations.csv' with row number 2: found more columns than expected column count 11.,Source=Microsoft.DataTransfer.Common,'
This is row #2
0e18d0d3-ed38-4e7f,Station2,Mainstreet33,,12207,Berlin,48.1807,11.4609,1970-01-01 01:00:00+01,"{""openingTimes"":[{""applicable_days"":96,""periods"":[{""startp"":""08:00"",""endp"":""20:00""}]},{""applicable_days"":31,""periods"":[{""startp"":""06:00"",""endp"":""20:00""}]}]}"
I think the last column, which contains JSON, is causing the problem in this case. When I preview the data it looks fine:
I thought the "quoteChar": "\"" setting would prevent the last column from causing problems. I have no idea why I am getting this error when I run debug.

Try setting the escape character to " (a double quote). Each pair of double quotes is then treated as a single literal quote rather than as a "Quote Char" within the string, so you end up with a string that looks like this (and which the system knows is a single string and not something it has to split):
{"openingTimes":[{"applicable_days":96,"periods":[{"startp":"08:00","endp":"20:00"}]},
{"applicable_days":31,"periods":[{"startp":"06:00","endp":"20:00"}]}]}

This is because the value "{""openingTimes"":[{""applicable_days"":96,""periods"":[{""startp"":""08:00"",""endp"":""20:00""}]},{""applicable_days"":31,""periods"":[{""startp"":""06:00"",""endp"":""20:00""}]}]}" contains several commas and your columnDelimiter is ",", which causes the value to be split into several columns. So you need to change your columnDelimiter.
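The two readings of row 2 described in these answers can be compared outside Data Factory. This is only a minimal sketch using Python's standard csv module, not ADF's own parser. It assumes that doublequote=True plays the role of setting the escape character to a double quote, and that doublequote=False with a backslash escape resembles the original dataset settings. The variable names are illustrative.

```python
import csv
import io
import json

# Row 2 from the question, verbatim.
row = (
    '0e18d0d3-ed38-4e7f,Station2,Mainstreet33,,12207,Berlin,48.1807,11.4609,'
    '1970-01-01 01:00:00+01,'
    '"{""openingTimes"":[{""applicable_days"":96,""periods"":'
    '[{""startp"":""08:00"",""endp"":""20:00""}]},'
    '{""applicable_days"":31,""periods"":'
    '[{""startp"":""06:00"",""endp"":""20:00""}]}]}"'
)

# A doubled quote inside a quoted field is read as one literal quote.
escaped = next(csv.reader(io.StringIO(row), doublequote=True))

# With backslash as the only escape character, "" closes the quoted field,
# so the commas inside the JSON are treated as column delimiters.
naive = next(csv.reader(io.StringIO(row), doublequote=False, escapechar="\\"))

# The last column is valid JSON once the doubled quotes are unescaped.
opening = json.loads(escaped[-1])

print(len(escaped), len(naive))
print(opening["openingTimes"][0]["applicable_days"])
```

With the doubled quotes treated as escapes, the JSON stays in one column; without that, the same row yields more columns than the file defines, which matches the error in the question.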

Related

Data Factory Copy Activity: Error found when processing 'Csv/Tsv Format Text' source 'xxx.csv' with row number 6696: found more columns than expected

I am trying to perform a simple copy activity in Azure Data Factory from CSV to a SQL table, but I'm getting the following error:
{
"errorCode": "2200",
"message": "ErrorCode=DelimitedTextMoreColumnsThanDefined,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error found when processing 'Csv/Tsv Format Text' source 'organizations.csv' with row number 6696: found more columns than expected column count 41.,Source=Microsoft.DataTransfer.Common,'",
"failureType": "UserError",
"target": "Copy data1",
"details": []
}
The copy activity is as follows
Source
My Sink is as follows:
A preview of the data in the source is as follows:
This seems like a very straightforward copy activity. Any thoughts on what might be causing the error?
My row 6696 looks like the following:
3b1a2e5f-d08b-166b-4b91-eb53009b2377 Compassites Software Solutions organization compassites-software https://www.crunchbase.com/organization/compassites-software 318375 17/07/2008 10:46 05/12/2022 12:17 company compassitesinc.com http://www.compassitesinc.com IND Karnataka Bangalore "Pradeep Court", #163/B, 6th Main 3rd Cross, JP Nagar 3rd phase 560078 operating Custom software solution experts Big Data,Cloud Computing,Information Technology,Mobile,Software Data and Analytics,Information Technology,Internet Services,Mobile,Software 01/11/2005 51-100 info#compassitesinc.com 080-42032572 http://www.facebook.com/compassites http://www.linkedin.com/company/compassites-software-solutions http://twitter.com/compassites https://res.cloudinary.com/crunchbase-production/image/upload/v1397190270/c3e5acbde40f36eaf4f8c6f6eda3f803.png company
No commas
As the error message indicates, the record at row number 6696 has a value that contains a , character.
Look at the following demonstration where I have taken a similar case. I have 3 columns in my source. The data looks as shown below:
When I use similar dataset settings and read these values, the same error is thrown.
So the value T1,OG is treated as if it belonged to 2 different columns, because it contains the dataset delimiter.
Such values throw an error because they are ambiguous to read. One way to avoid this is to enclose such values in the quote character (a double quote in this case).
Now when I run the copy activity, it gives the desired output.
The table data would look like this:
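The effect of quoting can also be reproduced with a small script. This is a hedged sketch using Python's csv module rather than Data Factory itself. The three-column record is hypothetical; only the value T1,OG comes from the demonstration above.

```python
import csv
import io

# Hypothetical three-column record; only "T1,OG" is taken from the answer.
record = ["1", "T1,OG", "Berlin"]

# Unquoted: joining by hand leaves the embedded comma ambiguous.
unquoted_line = ",".join(record)
unquoted_fields = next(csv.reader(io.StringIO(unquoted_line)))

# Quoted: csv.writer wraps any value containing the delimiter in quotes.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_MINIMAL).writerow(record)
quoted_line = buffer.getvalue().strip()
quoted_fields = next(csv.reader(io.StringIO(quoted_line)))

print(unquoted_line, len(unquoted_fields))
print(quoted_line, len(quoted_fields))
```

The unquoted line is read as four columns, while the quoted line is read back as the original three.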

Need Pentaho JSON without array

I want to output JSON data as a plain object rather than an array. I made the changes described in the Pentaho documentation, but the output is always an array, even for a single set of values. I am using PDI 9.1 and tested with the ktr from the link below:
https://wiki.pentaho.com/download/attachments/25043814/json_output.ktr?version=1&modificationDate=1389259055000&api=v2
The statement below is from https://wiki.pentaho.com/display/EAI/JSON+output
Another special case is when 'Nr. rows in a block' = 1.
If used with empty json block name output will looks like:
{
"name" : "item",
"value" : 25
}
My output looks like this:
{ "": [ {"name":"item","value":25} ] }
I resolved this myself. I added another JSON Input step and defined the path as below, to get the array element as a string object:
$.wellDesign[0]
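The same unwrapping idea can be illustrated outside PDI. This is just a sketch in Python, not a Pentaho step: it takes the wrapped output shown in the question and picks the first array element, which is roughly what a [0] index in the JSONPath does. The variable names are made up for the example.

```python
import json

# Output produced by the JSON output step, as shown in the question.
wrapped = '{ "": [ {"name":"item","value":25} ] }'

# The document has a single key whose value is a one-element array;
# take that array and keep its first element.
inner = next(iter(json.loads(wrapped).values()))[0]

# Serialize the element again as a standalone JSON object.
unwrapped = json.dumps(inner)
print(unwrapped)
```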

Snowflake Searching string in semi structured data

I have a table. There are many columns and rows. One column that I am trying to query in Snowflake has semi structured data. For example, when I query
select response
from table
limit 5
This is what is returned
[body={\n "id": "xxxxx",\n "object": "charge",\n "amount": 500,\n "amount_refunded": 0,\n "application": null,\n "application_fee": null,\n "application_fee_amount": null,\n "balance_transaction": null,\n "billing_details": {\n "address": {\n "city": null,\n "zip": "xxxxx",]
I want to select only the zip in this data. When I run code:
select response:zip
from table
limit 5
I get an error.
SQL compilation error: error line 1 at position 21 Invalid argument types for function 'GET': (VARCHAR(16777216), VARCHAR(11))
Is there a reason why this is happening? I am new to snowflake so trying to parse out this data but stuck. Thanks!
Snowflake has very good documentation on the subject.
For your specific case, have you attempted to use dot notation? It's the appropriate method for accessing JSON. So:
select response:body.billing_details.address.zip
from table
Remember that you have your 'body' element. You need to access that one first with a colon, because it's a level 1 element. Zip is located within body, so it's at a deeper level; in the sample shown it sits under billing_details and address. Level 1 elements are accessed with a colon, and deeper elements are accessed with dot notation.
I think you have multiple issues with this.
First, I think your response column is not a VARIANT column (the error shows GET being called with a VARCHAR argument). Please run the query below to confirm:
SHOW COLUMNS IN TABLE table;
Even if the column is variant, the way the data is stored is not in a valid JSON format. You will need to strip the JSON part and then store that in the variant column.
Please do the first part and share the information, and I will then suggest next steps. I wanted to put this in a comment, but comments do not allow this many sentences.
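To show what "strip the JSON part" could mean in practice, here is a rough sketch in Python. It is not Snowflake code. The value in the question is truncated, so the raw string below is a hypothetical completed version with only a few of the fields, and it assumes the wrapper always has the form [body={...}].

```python
import json

# Hypothetical, completed version of the truncated value in the question.
raw = (
    '[body={\n'
    '  "id": "xxxxx",\n'
    '  "billing_details": {\n'
    '    "address": {\n'
    '      "city": null,\n'
    '      "zip": "xxxxx"\n'
    '    }\n'
    '  }\n'
    '}]'
)

# Keep only the text between the first "{" and the last "}".
start, end = raw.index("{"), raw.rindex("}")
body = json.loads(raw[start:end + 1])

# Once the wrapper is gone, the nested value can be reached by key.
zip_code = body["billing_details"]["address"]["zip"]
print(zip_code)
```

The cleaned string is what would need to be stored in a VARIANT column for path expressions to work on it.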

invalid input syntax for type double precision: " chargebackvalue"

I'm trying to upload a .csv file to Postgres and I'm getting this error:
invalid input syntax for type double precision: " chargebackvalue"
Here is the structure of the table:
table structure here
The contents of the .csv file:
stoneid; mundipaggid; cardnumber; emblem; chargebackvalue; cardmask; chargebackdate; emitter; description; purchasedate; clientName; tacomorderid; useremail
0155477; 'or_3E2W0X5s5jtPjWYO';0670000546857; 'Visa'; 60.6; '498453******3271'; '2019-10-17'; 'Banco do Brasil S.A.'; 'Teste'; '2019-10-10'; 'Silvana Teixeira Da Silva';99854; 'teste#teste.com'
This is too long for a comment.
I would recommend loading the data into a staging table, where all the columns are strings.
Then, select from that table to load the final table. This makes it easier to track down problems in the data that might occur during the load.
Clearly the row you have shown is not the cause of the error. Or, if this is the entire file, then you simply have not skipped the first line because it has header names rather than values.
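The header explanation is easy to check before loading anything. The following is a minimal sketch in Python, not a Postgres load: it reads the two lines from the question, treats the first line as column names, and shows that the data row converts while the header text does not. The parser settings (semicolon delimiter, single-quote quoting, skipping the space after each delimiter) are assumptions based on how the sample looks.

```python
import csv
import io

# The two lines from the question, verbatim.
data = (
    "stoneid; mundipaggid; cardnumber; emblem; chargebackvalue; cardmask; "
    "chargebackdate; emitter; description; purchasedate; clientName; "
    "tacomorderid; useremail\n"
    "0155477; 'or_3E2W0X5s5jtPjWYO';0670000546857; 'Visa'; 60.6; "
    "'498453******3271'; '2019-10-17'; 'Banco do Brasil S.A.'; 'Teste'; "
    "'2019-10-10'; 'Silvana Teixeira Da Silva';99854; 'teste#teste.com'\n"
)

# DictReader consumes the first line as column names instead of data.
reader = csv.DictReader(
    io.StringIO(data), delimiter=";", quotechar="'", skipinitialspace=True
)
rows = list(reader)

# The data row converts cleanly once the header is out of the way.
chargeback = float(rows[0]["chargebackvalue"])

# The header text itself is what cannot be converted to a number.
try:
    float(" chargebackvalue")
    header_is_numeric = True
except ValueError:
    header_is_numeric = False

print(chargeback, header_is_numeric)
```

Reading every column as text first, as here, is the same idea as the all-strings staging table: conversion problems show up per column instead of aborting the whole load.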

MongoDB query returns 3 dots instead of answer

I am following an online course on a website and when I am trying to submit a query on my local MongoDB, it returns ... instead of the answer.
The query I submit is:
db.scores.find( { "type" : "essay", "score" : 50 }, { student : true, _id : false ).pretty()
The "..." that I get as an "answer" indicates that the shell is expecting me to provide more input.
I clearly have a syntax error in my query: I forgot to close a curly bracket.
The correct query, db.scores.find( { "type" : "essay", "score" : 50 }, { student : true, _id : false } ).pretty(), does not return "...".
HINT: If the missing input is not at the end of the query but somewhere in the middle (as happened in this query), you can escape the "..." mode by pressing Enter twice and then typing the new query again.
When I had this same error, it was the result of a string value being terminated prematurely by a ' or " in the string. Look for any extraneous quotation marks or apostrophes in the values you're adding that may interfere with the declaration.
Just as an addition: watch out for the password string. Make sure it does not contain a quotation mark that interferes with the quotation marks delimiting it, as in my case. I got ... too.
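Both causes mentioned in these answers (an unclosed bracket and an unterminated quote) can be caught before pasting a query into the shell. The helper below is a hypothetical sketch in Python; is_balanced is not part of MongoDB or its shell, and it deliberately ignores escaped quotes inside strings.

```python
# Map each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}


def is_balanced(query: str) -> bool:
    """Return True if brackets and quotes in the query are all closed."""
    stack = []
    quote = None  # the quote character currently open, if any
    for ch in query:
        if quote:
            # Inside a string: only the matching quote ends it.
            if ch == quote:
                quote = None
            continue
        if ch in "'\"":
            quote = ch
        elif ch in "([{":
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack and quote is None


# The two queries from the question and answer, verbatim.
broken = 'db.scores.find( { "type" : "essay", "score" : 50 }, { student : true, _id : false ).pretty()'
fixed = 'db.scores.find( { "type" : "essay", "score" : 50 }, { student : true, _id : false } ).pretty()'

print(is_balanced(broken), is_balanced(fixed))
```

A query that fails this check is one for which the shell is likely to keep waiting for more input.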