How to bulk load JSON with values into Synapse SQL dedicated pool - azure-synapse

I'm attempting to bulk load JSON files, along with their filenames and paths, into a Synapse Analytics dedicated SQL pool table, but I'm stumped on how to accomplish it. I can load the JSON files on their own without a problem, but I really need the additional values as well.
This is what I'm trying, but it doesn't work:
COPY INTO dbo.PolicyStagingJsonOnly
SELECT jsonContent,
[result].filename() AS fn,
[result].filepath() AS fp
FROM
OPENROWSET(
BULK 'https://datalakexxxx.blob.core.windows.net/staging/policy/*.json',
FORMAT = 'CSV',
FIELDQUOTE = '0x0b',
FIELDTERMINATOR = '0x0b',
ROWTERMINATOR = '0x0c'
)
WITH (
jsonContent varchar(MAX)
) AS [result]
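For contrast, the JSON-only load that already works would presumably look something like the sketch below. The single jsonContent varchar(MAX) column and the 0x0b/0x0c terminators (chosen so no real character in the files ever acts as a delimiter) are assumptions carried over from the attempt above:
COPY INTO dbo.PolicyStagingJsonOnly (jsonContent)
FROM 'https://datalakexxxx.blob.core.windows.net/staging/policy/*.json'
WITH (
FILE_TYPE = 'CSV', -- each whole file/row lands in the one varchar(MAX) column
FIELDQUOTE = '0x0b',
FIELDTERMINATOR = '0x0b',
ROWTERMINATOR = '0x0c'
);
Note that COPY INTO only takes a storage location as its source (it does not accept a SELECT), and filename()/filepath() are OPENROWSET helpers from the serverless SQL pool, which is likely why the statement above is rejected.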

Related

SQL; csv import with semicolons in data and double quotes

I want to import a CSV file that has some values like this:
123;456;"78;9";1011
Simply put, one of the values contains the field delimiter (a semicolon), but that value is wrapped in double quotes. When I use a bulk import, the value '"78' is put into one column and '9"' into the next. How can I prevent this?
I am using the query below:
BULK INSERT CSVTest
FROM 'c:\csvtest.csv'
WITH
(
FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n'
)
GO
I'm using SQL Server!
In a test environment I've set up the new SQL Server, but FIELDQUOTE seems to be ignored in the statement and the fields are still split up. What am I doing wrong? I'm doing:
BULK INSERT CSVTest
FROM 'c:\csvtest.csv'
WITH
(
FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n',
FIELDQUOTE='"'
)
GO
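Not part of the original exchange, but worth checking: as far as I can tell from the BULK INSERT documentation, FIELDQUOTE is only honored when FORMAT = 'CSV' is also specified (SQL Server 2017 and later), so a variant like this may behave differently:
BULK INSERT CSVTest
FROM 'c:\csvtest.csv'
WITH
(
FORMAT = 'CSV', -- enables RFC 4180-style handling of quoted fields
FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n',
FIELDQUOTE = '"'
)
GO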

Import CSV into SQL (CODE)

I want to import several CSV files automatically using SQL code (i.e. without using the GUI). Normally, I know the dimensions of my CSV file, so in many cases I create an empty table with, let's say, x columns with the corresponding data types. Then, I import the CSV file into this table using BULK INSERT. However, in this case I don't know much about my files, i.e. information about data types and dimensions is not given.
To summarize the problem:
I receive a file path, e.g. C:...\DATA.csv. Then, I want to use this path in SQL code to import the file to a table without knowing anything about it.
Any ideas on how to solve this problem?
Use something like this:
BULK INSERT tbl
FROM 'csv_full_path'
WITH
(
FIRSTROW = 2, --Second row if header row in file
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
ERRORFILE = 'error_file_path',
TABLOCK
)
If columns are not known, you could try with:
select * from OpenRowset
Or, do a bulk insert with only the first row as one big column, then parse it to create the dynamic main insert. Or bulk insert the whole file into a table with just one column, then parse that...
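A rough sketch of that last single-column staging idea (the path and the comma delimiter are placeholders, and STRING_SPLIT needs SQL Server 2016 or later):
CREATE TABLE #raw (line nvarchar(max));

-- load every line of the file into the single column
BULK INSERT #raw
FROM 'C:\data\DATA.csv'
WITH (ROWTERMINATOR = '\n');

-- split each line on the delimiter to inspect the values and build the real insert
SELECT r.line, s.value
FROM #raw AS r
CROSS APPLY STRING_SPLIT(r.line, ',') AS s;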
You can use OPENROWSET (documentation).
SELECT *
INTO dbo.MyTable
FROM
OPENROWSET(
BULK 'C:\...\mycsvfile.csv',
SINGLE_CLOB) AS DATA;
In addition, you can use dynamic SQL to parameterize the table name and the location of the CSV file.
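A minimal sketch of that dynamic-SQL idea, assuming the target table already exists (the table name and path below are placeholders):
DECLARE @path nvarchar(260) = N'C:\data\DATA.csv'; -- placeholder file path
DECLARE @table sysname = N'MyStagingTable'; -- placeholder target table
DECLARE @sql nvarchar(max) =
N'BULK INSERT ' + QUOTENAME(@table) +
N' FROM ''' + REPLACE(@path, N'''', N'''''') + N''' ' +
N'WITH (FIRSTROW = 2, FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'');';

EXEC sys.sp_executesql @sql;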

Bulk insert from txt in SQL table

I need to do some bulk inserts in SQL Table from a txt file.
bulk insert [dbo].[TempSample]
from 'D:\sqls\sample.txt'
with (fieldterminator = ',', rowterminator = '\n')
go
In the txt file I have descriptions like 'Hörsching'. After the insert is made, I find descriptions in my table like 'H÷rsching'. How can I deal with that? The collation of the table is set to Latin1_General_CI_AS.
How is the file encoded?
Have you tried using the CODEPAGE parameter to specify the file's encoding?
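Not from the original thread, but a hedged guess: 'H÷rsching' is the typical result of Windows ANSI (code page 1252) text being read through the default OEM code page, so pointing BULK INSERT at the right code page may fix it:
bulk insert [dbo].[TempSample]
from 'D:\sqls\sample.txt'
with (fieldterminator = ',', rowterminator = '\n', codepage = 'ACP') -- 'ACP' if the file is Windows ANSI; '65001' if it is UTF-8
go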

Special characters displaying incorrectly after BULK INSERT

I'm using BULK INSERT to import a CSV file. One of the columns in the CSV file contains some values that contain fractions (e.g. 1m½f).
I don't need to do any mathematical operations on the fractions, as the values will just be used for display purposes, so I have set the column as nvarchar. The BULK INSERT works, but when I view the records within SQL the fraction has been replaced with a cent symbol (¢), so the displayed text is 1m¢f.
I'm interested to understand why this is happening and any thoughts on how to resolve the issue. The BULK INSERT command is:
BULK INSERT dbo.temp FROM 'C:\Temp\file.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n' );
You need to BULK INSERT using CODEPAGE = 'ACP', which converts string data from Windows code page 1252 to the SQL Server code page.
BULK INSERT dbo.temp FROM 'C:\Temp\file.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', CODEPAGE = 'ACP');
If you are bringing in UTF-8 data on a new enough version of SQL Server, use code page 65001 instead:
BULK INSERT dbo.temp FROM 'C:\Temp\file.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', CODEPAGE = '65001');
You may also need to specify DATAFILETYPE = 'char|native|widechar|widenative'.
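For example, if the file were saved as UTF-16 ("Unicode" in Notepad terms), a hedged sketch would be:
BULK INSERT dbo.temp FROM 'C:\Temp\file.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', DATAFILETYPE = 'widechar');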

Bulk insert with a different schema

I am trying to get data from a CSV file with the following contents:
Station code;DateBegin;DateEnd
01;20100214;20100214
02;20100214;20100214
03;20100214;20100214
I am trying a bulk insert like this:
BULK INSERT dbo.#tmp_station_details
FROM 'C:\station.csv'
WITH (
FIELDTERMINATOR = ';',
FIRSTROW = 2,
ROWTERMINATOR = '\n'
)
But the table tmp_station_details has one extra column, Priority.
Its schema is like
[Station code] [Priority] [DateBegin] [DateEnd]
Now, is it possible to bulk insert without altering the schema of the table?
Add FORMATFILE = 'format_file_path' to your WITH block. Refer to BOL, "Use a Format File to Skip a Table Column", for an example.
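A hedged sketch of what that might look like for this table. The non-XML format file below maps the three data-file fields to table columns 1, 3, and 4 so that column 2 (Priority) is skipped; the version number, field lengths, column names, and file paths are illustrative guesses:
station.fmt:
12.0
3
1 SQLCHAR 0 100 ";" 1 StationCode ""
2 SQLCHAR 0 100 ";" 3 DateBegin ""
3 SQLCHAR 0 100 "\r\n" 4 DateEnd ""
and the load itself:
BULK INSERT dbo.#tmp_station_details
FROM 'C:\station.csv'
WITH (FORMATFILE = 'C:\station.fmt', FIRSTROW = 2)
With a format file in play, the field and row terminators come from the format file, so they no longer need to appear in the WITH block.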