Getting an error while copying data from one folder to another in Azure Data Factory

This query used to work in an Azure Data Factory pipeline but stopped working a few days ago. Nothing changed with the file names or formats in Azure Blob storage. The error is raised on this line:
SELECT * FROM OPENROWSET (
    BULK '/filepath.csv#snapshot=*',
    FORMAT = 'CSV'
)
The error says that .csv#snapshot=* has a URL suffix, which is not allowed.
Full code:
-- CREATE OR REPLACE VIEW clean.barriers AS
IF EXISTS (SELECT * FROM sys.tables t
           JOIN sys.schemas s ON (t.schema_id = s.schema_id)
           WHERE s.name = 'clean' AND t.name = 'barriers')
    EXEC('DROP EXTERNAL TABLE [clean].[barriers]')

CREATE EXTERNAL TABLE [clean].[barriers]
WITH
(
    LOCATION = 'clean/synapse/barriers',
    DATA_SOURCE = "",
    FILE_FORMAT = [SynapseParquetFormat]
)
AS
SELECT * FROM OPENROWSET (
    BULK '/filepath.csv#snapshot=*',
    FORMAT = 'CSV'
)
WITH (
    -- Schema adjusted to what we have in clean/barriers in Bigquery
    mshp_id INT,
    prog_name NVARCHAR(256),
    barrier_name NVARCHAR(256),
    days INT
) AS load_clean_data

As per the official documentation, you also need a data source for the source file you are copying from.
So try creating an external data source for the source CSV file and referencing it in the OPENROWSET call; that may resolve the error.
Also, since you are executing this script through ADF, first try running it directly without ADF. If the error still occurs, the problem is in the script rather than in ADF; if it doesn't, try changing the ADF activity and check again.
You can also troubleshoot the BULK path itself. Since you want the data from that folder of CSV files, give the path like below and check:
/folder/*.csv
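A minimal sketch of the suggested fix follows; the data source name, storage account URL, and folder path are assumptions for illustration, not values from the question:

```sql
-- Hypothetical external data source pointing at the source container
CREATE EXTERNAL DATA SOURCE [SourceBlobStorage]
WITH (
    LOCATION = 'https://yourstorageaccount.blob.core.windows.net/yourcontainer'
);
GO

-- Reference the data source in OPENROWSET instead of embedding the full URL;
-- note there is no #snapshot=* suffix on the file path
SELECT *
FROM OPENROWSET (
    BULK 'folder/*.csv',
    DATA_SOURCE = 'SourceBlobStorage',
    FORMAT = 'CSV'
) AS rows;
```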

Operation CREATE EXTERNAL FILE FORMAT is not allowed for a replicated database on Azure Synapse SQL Built-in Serverless Pool

I am trying to Create and query external tables from a file in Azure Data Lake from Azure Synapse Serverless SQL Pool using the following guide:
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-external-tables?tabs=hadoop
Everything appears to be going fine.
I have created the following script to create an external table:
IF NOT EXISTS (SELECT * FROM sys.external_file_formats WHERE name = 'SynapseDelimitedTextFormat')
CREATE EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat]
WITH ( FORMAT_TYPE = DELIMITEDTEXT ,
FORMAT_OPTIONS (
FIELD_TERMINATOR = ',',
USE_TYPE_DEFAULT = FALSE
))
GO
IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name = 'synapsefilename_synapselakev2_dfs_core_windows_net')
CREATE EXTERNAL DATA SOURCE [synapsefilename_synapselakev2_dfs_core_windows_net]
WITH (
LOCATION = 'abfss://synapsefilename@synapselakev2.dfs.core.windows.net'
)
GO
CREATE EXTERNAL TABLE GlobalOptionsetMetadata (
[C1] nvarchar(4000),
[C2] nvarchar(4000),
[C3] nvarchar(4000),
[C4] nvarchar(4000),
[C5] nvarchar(4000),
[C6] nvarchar(4000),
[C7] nvarchar(4000)
)
WITH (
LOCATION = 'GlobalOptionsetMetadata.csv',
DATA_SOURCE = [synapsefilename_synapselakev2_dfs_core_windows_net],
FILE_FORMAT = [SynapseDelimitedTextFormat]
)
GO
SELECT TOP 100 * FROM dbo.GlobalOptionsetMetadata
GO
However, when I click run I get the following error:
Operation CREATE EXTERNAL FILE FORMAT is not allowed for a replicated database.
Any thoughts?
Most likely Database1 is a "replicated" lake database, in which you can't create external file formats. You have to select a serverless or dedicated SQL database when creating the external table. Try changing the database in the "Use database" combo box next to "Connect to" at the top of the query window and select a serverless/dedicated SQL database instead.
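A minimal sketch of the idea, assuming a serverless SQL database named MySqlDb already exists (the database name is an assumption):

```sql
-- Switch from the replicated lake database to a (serverless) SQL database
USE MySqlDb;
GO

-- CREATE EXTERNAL FILE FORMAT is allowed here
IF NOT EXISTS (SELECT * FROM sys.external_file_formats WHERE name = 'SynapseDelimitedTextFormat')
    CREATE EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat]
    WITH (
        FORMAT_TYPE = DELIMITEDTEXT,
        FORMAT_OPTIONS (FIELD_TERMINATOR = ',', USE_TYPE_DEFAULT = FALSE)
    );
GO
```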

Reading JSON file into a table

I am trying to read JSON files from a location and write them to a SQL Server table. The files in the location change every day, so I may need a dynamic way to select the file name.
I tried to use OPENROWSET, but I read that the location cannot be parameterized with it. I then tried OPENROWSET with a dynamic query, but I get an error that the file location cannot be found. From what I have read, it could be a folder permission issue.
What I am now trying, and hoping to get help with, is how to read the JSON text file and write the data into a table, from which point I can use the OPENJSON function.
Can anyone help me with how to load SQL Server with the JSON data as some sort of blob, using T-SQL or SSIS?
--Bulk import data from file
SELECT BulkColumn FROM OPENROWSET(BULK 'D:\home\HS\HS-Web\wwwroot\Json files\test.json', SINGLE_BLOB) JSON;

--View the imported data from the bulk import as a single column
DECLARE @TestDetails VARCHAR(MAX)
SELECT @TestDetails = BulkColumn FROM
OPENROWSET(BULK 'D:\Omkar\Projects\HS\Documents\test.json', SINGLE_BLOB) JSON;
SELECT @TestDetails AS SingleRow_Column

--Check whether the imported data is valid JSON; ISJSON = 1 means valid
IF (ISJSON(@TestDetails) = 1)
BEGIN
    PRINT 'Valid Data Imported'
END
ELSE
BEGIN
    PRINT 'Invalid Data Imported'
END
-- (no GO here: a GO batch separator would end the scope of @TestDetails)

--Now select the data to be added to the table; $.Tests is the array object name
SELECT testCode, Test, Method FROM OPENJSON(@TestDetails, '$.Tests')
WITH (
    testCode NVARCHAR(50) '$.testCode',
    Test NVARCHAR(50) '$.Test',
    Method NVARCHAR(50) '$.Method'
)

--Now insert the data into the table; if default values are needed, select them as constants
INSERT INTO TestDetails(Active, CreatedDate, testCode, Test, Method)
SELECT '1', '2019-10-23 06:01:10.7927233', testCode, Test, Method FROM
OPENJSON(@TestDetails, '$.Tests')
WITH (
    testCode NVARCHAR(50) '$.testCode',
    Test NVARCHAR(50) '$.Test',
    Method NVARCHAR(50) '$.Method'
)
If I understand correctly, your issue is not how to read the JSON, but rather how to get the file.
As you found out, any interaction with the file system from within SQL Server (T-SQL) can get very tricky. SQL Server is restricted to its own user and sees its own machine, so a path on C:\ might not be the one you expected.
However, before fiddling around with permissions, Kerberos act-as authentication, and shared paths, I'd suggest creating a staging table like:
CREATE TABLE dbo.JSONImport_staging
(ID INT IDENTITY CONSTRAINT PK_JSONImport_staging PRIMARY KEY
,ImportDate DATETIME2 NOT NULL CONSTRAINT DF_JSONImport_staging_ImportDate DEFAULT(SYSUTCDATETIME())
,FileLocation NVARCHAR(1000) NULL
,Content NVARCHAR(MAX) NULL
,ProcessedOn DATETIME2 NULL
,Success BIT NULL);
And use one of the many approaches you'll find on the net to store data in such a table:
PowerShell (something along those lines)
any programming language of your choice
SSIS
and many more
You can easily use an external scheduled job to check for files and shift them into the staging table, and then an internal job (within SQL Server) to check for unprocessed rows and read them into the target tables.
As always in such cases:
Keep the staging table as open, generic, and error-tolerant as possible.
Do any integrity checks, conversions, and processing transaction-safe on the way from your staging table to the target tables.
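The internal processing job could then look something like this minimal sketch; the target table TestDetails and the JSON shape are assumptions taken from the example in the question:

```sql
-- Pick the oldest unprocessed document from the staging table
DECLARE @ID INT, @Content NVARCHAR(MAX);

SELECT TOP (1) @ID = ID, @Content = Content
FROM dbo.JSONImport_staging
WHERE ProcessedOn IS NULL
ORDER BY ImportDate;

IF @ID IS NOT NULL AND ISJSON(@Content) = 1
BEGIN
    -- Shred the JSON into the target table
    INSERT INTO dbo.TestDetails (testCode, Test, Method)
    SELECT testCode, Test, Method
    FROM OPENJSON(@Content, '$.Tests')
    WITH (
        testCode NVARCHAR(50) '$.testCode',
        Test NVARCHAR(50) '$.Test',
        Method NVARCHAR(50) '$.Method'
    );

    -- Mark the staging row as done
    UPDATE dbo.JSONImport_staging
    SET ProcessedOn = SYSUTCDATETIME(), Success = 1
    WHERE ID = @ID;
END
```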
Make sure the file can be accessed by SQL Server:
IF OBJECT_ID('tempdb..#JsonFile') IS NOT NULL
DROP TABLE #JsonFile;
CREATE TABLE #JsonFile
(
[JsonLine] NVARCHAR(MAX)
);
BULK INSERT #JsonFile
FROM '\\UNC_path\file.json'
WITH ( ROWTERMINATOR = '0x0b' ); -- a terminator that does not occur in the file, so the whole file lands in one row
SELECT *
FROM #JsonFile;

How to import data from .txt file to populate a table in SQL Server

Every day a PPE.txt file with client data, separated by semicolons and always with the same layout, is stored in a specific file directory.
Every day someone has to update a specific table in our database based on this PPE.txt.
I want to automate this process via a SQL script.
What I thought would be a solution is to import the data from this .txt file into a created table via a script, then execute the update.
What I have so far is
IF EXISTS (SELECT 1 FROM Sysobjects WHERE name LIKE 'CX_PPEList_TMP%')
DROP TABLE CX_PPEList_TMP
GO
CREATE TABLE CX_PPEList_TMP
(
Type_Registy CHAR(1),
Number_Person INTEGER,
CPF_CNPJ VARCHAR(14),
Type_Person CHAR(1),
Name_Person VARCHAR(80),
Name_Agency VARCHAR(40),
Name_Office VARCHAR(40),
Number_Title_Related INTEGER,
Name_Title_Related VARCHAR(80)
)
UPDATE t
SET SN_Policaly_Exposed = 'Y'
FROM Table1 t
JOIN CX_PPEList_TMP tmp ON t.CD_Personal_Number = tmp.CPF_CNPJ
WHERE t.SN_Policaly_Exposed = 'N'

UPDATE Table1
SET SN_Policaly_Exposed = 'N'
WHERE CD_Personal_Number NOT IN (SELECT CPF_CNPJ FROM CX_PPEList_TMP)
AND SN_Policaly_Exposed = 'Y'
I know I haven't given much, but that is because I don't have much yet.
I want to populate the CX_PPEList_TMP temp table with the data from the PPE.txt file via a script, so I could just execute this script to update my database. But I don't know what kind of command to use, and I haven't found one in my research.
Thanks in advance!
Using OPENROWSET
You can read text files using the OPENROWSET option (first you have to enable ad hoc queries).
Using the Microsoft Text Driver:
SELECT *
FROM OPENROWSET('MSDASQL',
    'Driver={Microsoft Text Driver (*.txt; *.csv)};
     DefaultDir=C:\Docs\csv\;',
    'SELECT * FROM PPE.txt')
Using the OLE DB provider:
SELECT *
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Text;Database=C:\Docs\csv\;IMEX=1;',
    'SELECT * FROM PPE.txt') t
Using BULK INSERT
You can import the text file data into a staging table and update your data from it:
BULK INSERT dbo.StagingTable
FROM 'C:\PPE.txt'
WITH
(
FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n'
)
In your case, I recommend using an ETL tool like SSIS; it's much better and easier to work with, and you can also schedule the package to execute at a specific time.
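Putting the BULK INSERT option together with the temp table from the question, a minimal sketch might look like this (the file path is an assumption, and the column order in CX_PPEList_TMP must match the file layout):

```sql
-- Load the semicolon-delimited file into the staging table
BULK INSERT CX_PPEList_TMP
FROM 'C:\Files\PPE.txt'
WITH
(
    FIELDTERMINATOR = ';',
    ROWTERMINATOR = '\n'
);

-- Then run the update against the freshly loaded data
UPDATE t
SET SN_Policaly_Exposed = 'Y'
FROM Table1 t
JOIN CX_PPEList_TMP tmp ON t.CD_Personal_Number = tmp.CPF_CNPJ
WHERE t.SN_Policaly_Exposed = 'N';
```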

Getting error on build: E_CSC_USER_NOTAUTHORIZED: This statement requires USE permissions for database 'master'

I've been trying to build my U-SQL script and I've even used the example one below:
CREATE ASSEMBLY IF NOT EXISTS [Newtonsoft.Json] FROM "assemblies/Newtonsoft.Json.dll";
CREATE ASSEMBLY IF NOT EXISTS [Microsoft.Analytics.Samples.Formats] FROM "assemblies/Microsoft.Analytics.Samples.Formats.dll";
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
DECLARE @INPUT_FILE string = @"/Samples/Data/json/donut.json";
//Extract the sps property from the Json file as a string.
@json =
    EXTRACT sps string
    FROM @INPUT_FILE
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();
@json =
    SELECT sps.Replace("\r\n", "") AS sps
    FROM @json;
/*
Parse the sps property to extract the id and name values as a SQL.MAP
*/
@sps_json =
    SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(sps, "$..id") AS sp_id_map,
           Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(sps, "$..type") AS sp_type_map
    FROM @json;
/*
Explode the id and type maps to get the values of the Id and type as individual rowsets
*/
@sps_id_property =
    SELECT id_name.Split('.')[0] AS id_name,
           id_value
    FROM @sps_json
    CROSS APPLY
        EXPLODE(sp_id_map) AS T(id_name, id_value);
@sps_type_property =
    SELECT type_name.Split('.')[0] AS type_name,
           type_value
    FROM @sps_json
    CROSS APPLY
        EXPLODE(sp_type_map) AS T(type_name, type_value);
/*
JOIN the Id and Value maps to return the properties as a rowset.
Output of the following JOIN statement:
1001,Regular
1002,Chocolate
1003,Blueberry
1004,Devil's Food
*/
@sps = SELECT [id].id_value AS id, [type].type_value AS type
       FROM @sps_id_property AS [id]
       INNER JOIN @sps_type_property AS [type]
           ON id.id_name == type.type_name;
/*
Output the file.
*/
OUTPUT @sps
TO "/rukmanig/output/sps.csv"
USING Outputters.Csv(quoting : false);
However, when I build this, I get the following error:
E_CSC_USER_NOTAUTHORIZED: This statement requires USE permissions for database 'master'
I have no idea why I get this. My colleague has no problems building the same project, and I was able to build before, but for some reason I can't anymore.
Anybody know why?
Thanks.
I assume you have been able to fix it by now. But after we introduced file-, folder-, and DB-level ACLs, existing users on existing accounts needed to be given explicit permissions on the master database.
Newly added users and newly created accounts should get this automatically.

Is it possible to run a SQL SELECT statement on a CSV file?

Is it possible to execute a SQL SELECT statement against a CSV file on a Sybase database?
Update DBA.user_data
set user_data.date_Sent = '12/16/2015'
where user_data.caseid in (select caseid
from DBA.cases
where cases.caseid=user_data.caseid
And cases.caseid in (select * FROM 'C:\\example\\test.csv' WITH (FIELDTERMINATOR = ',',ROWTERMINATOR = '\n')));
Assuming you are using Sybase ASE, you can access flat files using the Component Integration Services (CIS).
I suggest you check out the Component Integration Services User Guide, which is part of the SAP/Sybase documentation.
Check out the section on File system access: File Access
You will create a proxy (or existing) table, using the file information in the definition.
create proxy_table <table_name>
    external file at "pathname" [column delimiter "<string>"]

OR

create existing table fname (
    column1 int null,
    column2 datetime null,
    column3 varchar(1024) null,
    etc. etc.
) external file at "pathname" [column delimiter "<string>"]
Only the select, insert, and truncate table statements are supported for file access. Update and delete result in errors if the file proxy is the target of these commands.
Documentation: create proxy_table
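For the concrete case in the question, a minimal sketch might look like the following; the proxy table name, its single-column definition, and the assumption that the CSV holds one case ID per line are all hypothetical:

```sql
-- Map the CSV file onto a proxy table (Sybase ASE with CIS enabled)
create existing table csv_caseids (
    caseid varchar(32) null
) external file at "C:\example\test.csv" column delimiter ","
go

-- The proxy table can then be queried like any other table
update DBA.user_data
set date_Sent = '12/16/2015'
where caseid in (select caseid
                 from DBA.cases
                 where cases.caseid = user_data.caseid
                 and cases.caseid in (select caseid from csv_caseids))
go
```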