Is it possible to run a SQL Select statement on a CSV file? - sql

Is it possible to execute a SQL SELECT statement on a CSV file, against a Sybase database? This is what I have so far:
Update DBA.user_data
set user_data.date_Sent = '12/16/2015'
where user_data.caseid in (select caseid
                           from DBA.cases
                           where cases.caseid = user_data.caseid
                           And cases.caseid in (select * FROM 'C:\\example\\test.csv'
                                                WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')));

Assuming you are using Sybase ASE, you can access flat files using the Component Integration Services (CIS).
I suggest you check out the Component Integration Services User Guide, which is part of the SAP/Sybase documentation, in particular the section on file system access: File Access.
You create a proxy table (or map an existing table), using the file information in the definition:
create proxy_table <table_name>
external file at "pathname" [column delimiter "<string>"]
OR
create existing table fname (
    column1 int null,
    column2 datetime null,
    column3 varchar(1024) null
    ...
) external file at "pathname" [column delimiter "<string>"]
Only the select, insert, and truncate table statements are supported
for file access. update and delete result in errors if the file proxy
is the target of these commands.
Documentation: create proxy_table
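As a rough sketch of how the original UPDATE could then be written (untested; the proxy table name, its single column, and the delimiter are assumptions, following the create existing table form above):
create existing table csv_caseids (
    caseid varchar(32) null
) external file at "C:\example\test.csv" column delimiter ","

update DBA.user_data
set date_Sent = '12/16/2015'
where user_data.caseid in (select caseid
                           from DBA.cases
                           where cases.caseid = user_data.caseid)
and user_data.caseid in (select caseid from csv_caseids)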

Related

Getting an error while copying data from one folder to another in Azure Data Factory

This query used to work in an Azure Data Factory pipeline but stopped working a few days ago. Nothing changed about the file names or formats in Azure Blob Storage. I am getting an error on this line:
SELECT * FROM OPENROWSET (
BULK
'/filepath.csv#snapshot=*', FORMAT = 'CSV'
)
The error says that .csv#snapshot=* has a URL suffix which is not allowed.
Full code:
-- CREATE OR REPLACE VIEW clean.barriers AS
IF EXISTS (SELECT * FROM sys.tables t
           JOIN sys.schemas s ON (t.schema_id = s.schema_id)
           WHERE s.name = 'clean' AND t.name = 'barriers')
    EXEC('DROP EXTERNAL TABLE [clean].[barriers]')

CREATE EXTERNAL TABLE [clean].[barriers]
WITH
(
    LOCATION = 'clean/synapse/barriers',
    DATA_SOURCE = "",
    FILE_FORMAT = [SynapseParquetFormat]
)
AS
SELECT * FROM OPENROWSET (
    BULK '/filepath.csv#snapshot=*', FORMAT = 'CSV'
)
WITH (
    -- Schema adjusted to what we have in clean/barriers in BigQuery
    mshp_id INT,
    prog_name NVARCHAR(256),
    barrier_name NVARCHAR(256),
    days INT
) AS load_clean_data
As per the official documentation, you should also have a data source for the source file from which you are trying to copy the data.
So, try creating a data source for the source CSV file and check; that may work.
Also, as you are executing the above script using ADF, first try executing it without ADF: if the error still occurs, the problem is likely with the script and not ADF. If not, try changing the ADF activity and check.
You can also try this troubleshooting step with your BULK path. As you want the data from that folder of CSV files, give the path like below and check:
/folder/*.csv
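If the error persists, the same read can be expressed against an explicit data source; a minimal sketch, assuming a serverless SQL pool (the data source name, account, and paths here are placeholders, not from the original post):
-- Hypothetical data source pointing at the storage account/container
CREATE EXTERNAL DATA SOURCE src_blob
WITH ( LOCATION = 'https://<accountname>.blob.core.windows.net/<container>' );

SELECT *
FROM OPENROWSET(
    BULK 'folder/*.csv',          -- relative to the data source location
    DATA_SOURCE = 'src_blob',     -- assumed name
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS rows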

SQL Synapse, use dynamic/parameterized Azure Container in CREATE EXTERNAL TABLE

We have a scenario where the source CSV files are isolated by customer, i.e., each customer has a container in Azure Storage.
When creating an External Table in SQL Synapse, is it possible to pass the container name as a parameter, so that there aren't multiple external tables in the SQL Synapse DB?
CREATE EXTERNAL DATA SOURCE AzureBlobStorage WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://<container100>@<accountname>.blob.core.windows.net',
    CREDENTIAL = AzureStorageCredential
);

CREATE EXTERNAL TABLE [dbo].[res1_Data] (
    [ID] INT,
    [UniqueId] VARCHAR(50),
    [Status] VARCHAR(50) NULL,
    [JoinedDate] DATE
)
WITH (LOCATION = '<container2>/<folder>/<file>.csv',
      DATA_SOURCE = AzureBlobStorage,
      FILE_FORMAT = CEFormat
);
Unfortunately, you can't use variables within DDL commands. However, you can build dynamic statements and then execute them with sp_executesql, as sketched below.
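A minimal sketch of that approach, assuming the per-customer container name arrives in a variable (all names and placeholders are illustrative, mirroring the question):
DECLARE @container SYSNAME = 'container100';  -- assumed per-customer value
DECLARE @sql NVARCHAR(MAX) = N'
CREATE EXTERNAL DATA SOURCE AzureBlobStorage_' + @container + N' WITH (
    TYPE = HADOOP,
    LOCATION = ''wasbs://' + @container + N'@<accountname>.blob.core.windows.net'',
    CREDENTIAL = AzureStorageCredential
);';
EXEC sp_executesql @sql;
The same pattern works for the CREATE EXTERNAL TABLE statement, so one script can generate the per-container objects instead of hand-maintaining multiple external tables.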

Reading JSON file into a table

I am trying to read JSON files from a location and write them to a SQL Server table. The files in the location change every day, so I may need a dynamic way to select the file name.
I have tried to use OPENROWSET; however, I read that I cannot parameterize the location name with it. I tried OPENROWSET with a dynamic query, but I get an error that the file location cannot be found. After reading about it, it seems it could be a folder permission issue.
What I am now trying, and hoping to get help with, is how I can read the JSON text file and write the data into a table; from there I can use the OPENJSON function.
Can anyone help me with how to load SQL Server with the JSON data as some sort of blob, using T-SQL or SSIS?
--Bulk import data from file
SELECT BulkColumn FROM OPENROWSET(BULK 'D:\home\HS\HS-Web\wwwroot\Json files\test.json', SINGLE_BLOB) AS j;

--View the imported data from the bulk import as a single column
DECLARE @TestDetails VARCHAR(MAX)
SELECT @TestDetails = BulkColumn
FROM OPENROWSET(BULK 'D:\Omkar\Projects\HS\Documents\test.json', SINGLE_BLOB) AS j;
SELECT @TestDetails AS SingleRow_Column

--Check if imported data is valid JSON (1 = valid)
IF (ISJSON(@TestDetails) = 1)
BEGIN
    PRINT 'Valid Data Imported'
END
ELSE
BEGIN
    PRINT 'Invalid Data Imported'
END
--(no GO here: a batch separator would take @TestDetails out of scope)

--Now select the data to be added to the table; here $.Tests is the array object name
SELECT testCode, Test, Method FROM OPENJSON(@TestDetails, '$.Tests')
WITH (
    testCode NVARCHAR(50) '$.testCode',
    Test NVARCHAR(50) '$.Test',
    Method NVARCHAR(50) '$.Method'
)

--Now insert the data into the table; if default values need to be inserted, select them as literals
INSERT INTO TestDetails (Active, CreatedDate, testCode, Test, Method)
SELECT '1', '2019-10-23 06:01:10.7927233', testCode, Test, Method
FROM OPENJSON(@TestDetails, '$.Tests')
WITH (
    testCode NVARCHAR(50) '$.testCode',
    Test NVARCHAR(50) '$.Test',
    Method NVARCHAR(50) '$.Method'
)
So if I get this correctly, your issue is not about how to read the JSON, but rather how to get the file?
As you found out, any interaction with the file system from within SQL Server (T-SQL) can get very tricky. SQL Server is restricted to its own user and sees its own machine, so a path on C:\ might not be the one you expected.
However, before fiddling around with permissions, Kerberos act-as authentication, and shared paths, I'd suggest creating a staging table like:
CREATE TABLE dbo.JSONImport_staging
(ID INT IDENTITY CONSTRAINT PK_JSONImport_staging PRIMARY KEY
,ImportDate DATETIME2 NOT NULL CONSTRAINT DF_JSONImport_staging_ImportDate DEFAULT(SYSUTCDATETIME())
,FileLocation NVARCHAR(1000) NULL
,Content NVARCHAR(MAX) NULL
,ProcessedOn DATETIME2 NULL
,Success BIT NULL);
And use one of the many approaches you'll find on the net to store data in such a table:
PowerShell (something along these lines)
Any programming language of your choice
SSIS
and many more
You can easily use an external scheduled job to check for files and move them into the staging table, and then an internal job (within SQL Server) to check for unprocessed files and read them into the target tables.
As always in such cases:
Keep the staging table as open, generic and error tolerant as possible.
Do any integrity checks, conversions, and processing transaction-safe on the way between your staging table and the target tables; a sketch of this internal step follows.
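A minimal sketch of that internal step, assuming the staging table above and the $.Tests shape from the question (the target table name and JSON paths are taken from the question's code, not verified):
INSERT INTO dbo.TestDetails (testCode, Test, Method)
SELECT j.testCode, j.Test, j.Method
FROM dbo.JSONImport_staging AS s
CROSS APPLY OPENJSON(s.Content, '$.Tests')
     WITH (
         testCode NVARCHAR(50) '$.testCode',
         Test NVARCHAR(50) '$.Test',
         Method NVARCHAR(50) '$.Method'
     ) AS j
WHERE s.ProcessedOn IS NULL
  AND ISJSON(s.Content) = 1;

-- mark the rows as processed (transaction-safe in real code)
UPDATE dbo.JSONImport_staging
SET ProcessedOn = SYSUTCDATETIME(), Success = 1
WHERE ProcessedOn IS NULL
  AND ISJSON(Content) = 1;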
Make sure the file can be accessed by SQL Server:
IF OBJECT_ID('tempdb..#JsonFile') IS NOT NULL
    DROP TABLE #JsonFile;
CREATE TABLE #JsonFile
(
    [JsonLine] NVARCHAR(MAX)
);

BULK INSERT #JsonFile
FROM '\\UNC_path\file.json'
WITH ( ROWTERMINATOR = '\0' ); -- null terminator, so the whole file lands in one row
SELECT *
FROM #JsonFile;
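From there, the loaded line can be handed straight to OPENJSON; a short sketch reusing the column names and JSON paths from earlier in this thread (assumptions, not part of this answer):
SELECT t.testCode, t.Test, t.Method
FROM #JsonFile AS f
CROSS APPLY OPENJSON(f.JsonLine, '$.Tests')
     WITH (
         testCode NVARCHAR(50) '$.testCode',
         Test NVARCHAR(50) '$.Test',
         Method NVARCHAR(50) '$.Method'
     ) AS t;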

How to import data from .txt file to populate a table in SQL Server

Every day a PPE.txt file with client data, separated by semicolons and always with the same layout, is stored in a specific file directory.
Every day someone has to update a specific table in our database based on this PPE.txt.
I want to automate this process via a SQL script.
What I thought would be a solution is to import the data from this .txt file into a created table via a script, then execute the update.
What I have so far is:
IF EXISTS (SELECT 1 FROM Sysobjects WHERE name LIKE 'CX_PPEList_TMP%')
DROP TABLE CX_PPEList_TMP
GO
CREATE TABLE CX_PPEList_TMP
(
Type_Registy CHAR(1),
Number_Person INTEGER,
CPF_CNPJ VARCHAR(14),
Type_Person CHAR(1),
Name_Person VARCHAR(80),
Name_Agency VARCHAR(40),
Name_Office VARCHAR(40),
Number_Title_Related INTEGER,
Name_Title_Related VARCHAR(80)
)
UPDATE Table1
SET SN_Policaly_Exposed = 'Y'
FROM Table1
JOIN CX_PPEList_TMP
    ON Table1.CD_Personal_Number = CX_PPEList_TMP.CPF_CNPJ
WHERE Table1.SN_Policaly_Exposed = 'N'

UPDATE Table1
SET SN_Policaly_Exposed = 'N'
WHERE Table1.CD_Personal_Number NOT IN (SELECT CX_PPEList_TMP.CPF_CNPJ
                                        FROM CX_PPEList_TMP)
    AND Table1.SN_Policaly_Exposed = 'Y'
I know I haven't given much, but that's because I don't have much yet.
I want to populate the CX_PPEList_TMP temp table with the data from the PPE.txt file via a script, so that I could just execute the script to update my database. But I don't know what kind of command I can use for this, and I haven't found one in my research.
Thanks in advance!
Using OPENROWSET
You can read text files using the OPENROWSET option (first you have to enable ad hoc queries, as shown below).
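For reference, ad hoc distributed queries are enabled with sp_configure (requires server-level permissions):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;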
Using Microsoft Text Driver
SELECT * FROM OPENROWSET('MSDASQL',
'Driver={Microsoft Text Driver (*.txt; *.csv)};
DefaultDir=C:\Docs\csv\;',
'SELECT * FROM PPE.txt')
Using OLEDB provider
SELECT *
FROM OPENROWSET(
    'Microsoft.ACE.OLEDB.12.0',
    'Text;Database=C:\Docs\csv\;IMEX=1;',
    'SELECT * FROM PPE.txt') t
Using BULK INSERT
You can import text file data to a staging table and update data from it:
BULK INSERT dbo.StagingTable
FROM 'C:\PPE.txt'
WITH
(
FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n'
)
In your case, I recommend using an ETL tool like SSIS; it's much better and easier to work with, and you can also schedule the package to execute at a specific time.

Insert Large Objects into Azure SQL Data warehouse

I have created a table in Azure SQL Data Warehouse as below:
CREATE TABLE dbo.test_lob_type
(
id VARCHAR(80) NOT NULL,
mime_type VARCHAR(80) NOT NULL,
binary_lob VARBINARY(MAX) NULL
)
WITH
(
DISTRIBUTION = HASH ( id ),
CLUSTERED INDEX ( id ASC )
);
I want to insert a BLOB into this table. I tried to achieve this using the OPENROWSET command, as pointed out in the link How to insert a blob into a database using sql server management studio.
But unfortunately that command does not work with Azure SQL DW. Can anyone provide any input on how to insert a BLOB into a SQL DW table from the command line?
bcp is supported for this scenario. Here is a simple example using SQL Authentication and char format:
REM Example using SQL Authentication and character file
bcp dbo.test_lob_type in test_lob_type.bcp -S yourDWServer.database.windows.net -d yourDWName -U yourLogin -P yourPassword -c
If your file only contains the blob, consider loading it into a staging table before inserting into the main table, for example:
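A sketch of that staging pattern for a dedicated pool (the staging table name and the literal id/mime_type values are assumptions):
-- simple round-robin heap as a landing zone for bcp
CREATE TABLE dbo.test_lob_staging
(
    binary_lob VARBINARY(MAX) NULL
)
WITH ( DISTRIBUTION = ROUND_ROBIN, HEAP );

-- after bcp has loaded dbo.test_lob_staging:
INSERT INTO dbo.test_lob_type (id, mime_type, binary_lob)
SELECT 'example-id', 'application/octet-stream', binary_lob  -- placeholder values
FROM dbo.test_lob_staging;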