Check whether an EXTERNAL DATA SOURCE exists in Azure SQL - azure-sql-database

I am new to Azure SQL Database. I have an EXTERNAL DATA SOURCE, created as documented in CREATE EXTERNAL DATA SOURCE:
CREATE EXTERNAL DATA SOURCE [My_data_src]
WITH (
    TYPE = RDBMS,
    LOCATION = N'myserver',
    CREDENTIAL = [my_cred],
    DATABASE_NAME = N'MyDB'
)
GO
Before creating a new EXTERNAL DATA SOURCE, I need to find out whether it already exists. Is there a query or DMV to find this?

The following command gives you a list of all existing external data sources in the database:
select * from sys.external_data_sources;
To check whether a particular external data source exists, use the following:
IF EXISTS (
    SELECT name
    FROM sys.external_data_sources
    WHERE [name] = 'Name of Datasource'
)
BEGIN
    PRINT 'Yes'
END
It will print Yes if the data source exists.
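The same check can make the creation idempotent. A minimal sketch, reusing the data source definition from the question (wrapping the DDL in EXEC keeps it in its own batch scope):
IF NOT EXISTS (
    SELECT 1
    FROM sys.external_data_sources
    WHERE [name] = 'My_data_src'
)
BEGIN
    -- Only create the data source when it is not already there
    EXEC('CREATE EXTERNAL DATA SOURCE [My_data_src] WITH (TYPE = RDBMS, LOCATION = N''myserver'', CREDENTIAL = [my_cred], DATABASE_NAME = N''MyDB'')');
END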
Refer - sys.external_data_sources (Transact-SQL)

How to Parameterize a Copy Activity to SQL DB with Azure Data Factory

I'm trying to automatically update tables in Azure SQL Database from another SQL DB with Azure Data Factory. At the moment, the only way to update a table in Azure SQL Database is to manually select the table you want to update in the dataset.
My configuration to automatically select a table from the source SQL DB that I want to copy to Azure SQL Database is as follows:
The parameters are as follows:
@concat('SELECT * FROM ', pipeline().parameters.Domain, '.', pipeline().parameters.TableName)
Can someone let me know how to configure my SINK and/or connection to automatically insert the table selected from SOURCE?
You can use the Edit option in the SQL sink dataset.
Create a dataset parameter for the sink table name. In the SQL sink dataset, check the Edit checkbox and use the dataset parameter for the table name. If you want, you can use a dataset parameter for the schema name as well; here I have given it directly (dbo).
Now, in the copy activity sink, you can give the table name dynamically from any pipeline parameter (your parameter, in this case) or from any variable, using dynamic content.
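For example, assuming the sink dataset exposes a table-name parameter (the parameter name is yours to choose), the dynamic content in the copy activity sink could simply pass through the pipeline parameter already used on the source side:
@pipeline().parameters.TableName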
Also, enable Auto create table, which creates a new table if no table with the given name exists; if it does exist, creation is skipped and the data is copied into it.

Azure Synapse SQL Serverless: how to create an external table from a CSV with fields longer than 8 KB?

I have a CSV with more than 500 fields, hosted on an Azure storage account; however, I just need a couple of columns, which may contain values longer than 8 KB. For this reason, I started by writing a simple query in Azure Synapse SQL Serverless like this:
SELECT TOP 100 C1, C2
FROM OPENROWSET(
    BULK 'https://mysa.blob.core.windows.net/my_file.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0'
) AS [result]
It fails with the error "String or binary data would be truncated while reading column of type 'VARCHAR'". And it does not JUST report this warning: it does not return ANY rows because of it.
So, a simple solution is to disable warnings; the value is of course truncated to 8 KB, but the query no longer fails this way:
SET ANSI_WARNINGS OFF
SELECT TOP 100 * FROM OPENROWSET(
    BULK 'https://mysa.blob.core.windows.net/my_file.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0'
) AS [result]
SET ANSI_WARNINGS ON
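As an aside, OPENROWSET also lets you project only the columns you need with an explicit WITH clause that pins each column's type and its 1-based ordinal in the file; a minimal sketch, assuming the two wanted fields are the first two columns (the ordinals here are hypothetical):
SELECT TOP 100 C1, C2
FROM OPENROWSET(
    BULK 'https://mysa.blob.core.windows.net/my_file.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0'
)
WITH (
    C1 VARCHAR(8000) 1,  -- type and 1-based ordinal of the column in the file
    C2 VARCHAR(8000) 2
) AS [result]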
Now I need some help to get the final target, which is to build an EXTERNAL TABLE, rather than just a SELECT, leaving the CSV where it is (in other words: I don't want to create a materialized view or a CETAS or a SELECT INTO which would duplicate data).
If I run it this way:
CREATE EXTERNAL TABLE my_CET (
    C1 NVARCHAR(8000),
    C2 NVARCHAR(8000)
)
WITH (
    LOCATION = 'my_file.csv',
    DATA_SOURCE = [my_data_source],
    FILE_FORMAT = [SynapseDelimitedTextFormat]
)
This seems to work, in that it successfully creates an external table; however, if I try to read it, I get the error "External table my_CET is not accessible because location does not exist or it is used by another process.".
If I try setting ANSI_WARNINGS OFF, it tells me "The option 'ANSI_WARNINGS' must be turned ON to execute requests referencing external tables.".
As said, I don't need all 500 fields hosted in the CSV, just a couple of them, including the one whose data I should truncate to 8 KB as I did in the above example.
If I use a CSV file where no field is larger than 8 KB, the external table creation works correctly, but I couldn't manage to make it work when some values are longer than 8 KB.
I think when creating an external table from a CSV you have to bring in all the columns; I am sure someone can correct me if I am wrong.
Depending on what you want to do, you could create a view over the external table using a select query, e.g.
CREATE VIEW my_CET_Vw
AS
SELECT C1, C2
FROM my_CET
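Reads then go through the view, which exposes only the two columns of interest:
SELECT TOP 100 * FROM my_CET_Vw;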

How to correct 'Operating system error code 12007' when accessing Azure blob storage in a SQL stored procedure

I'm trying to create a stored procedure that will access a file in an Azure blob storage container, store the first line of the file in a temporary table, use this data to create a table (effectively using the header fields in the file as the column titles), and then populate that table with the rest of the data.
I've tried the basic process in a local SQL database, using a local source file on my machine, and the procedure itself works as I want it to, creating a new table from the supplied file.
However, when I set it up within an Azure SQL database and amend the procedure to use a 'datasource' rather than pointing it at a local file, it produces the following error:
Cannot bulk load because the file "my_example_file" could not be opened. Operating system error code 12007(failed to retrieve text for this error. Reason: 317).
My stored procedure contains the following:
CREATE TABLE [TempColumnTitleTable] ([ColumnTitles] [nvarchar](max) NULL);

DECLARE @Sql NVARCHAR(MAX) = 'BULK INSERT [dbo].[TempColumnTitleTable]
    FROM ''' + @fileName + '''
    WITH (DATA_SOURCE = ''Source_File_Blob'',
          DATAFILETYPE = ''char'',
          FIRSTROW = 1, LASTROW = 1,
          ROWTERMINATOR = ''0x0a'')';
EXEC(@Sql);
The above should create a single-column table containing all the text of the header row, which I can then interrogate and use for the column titles in my permanent table.
I've set up the DataSource as follows:
CREATE EXTERNAL DATA SOURCE Source_File_Blob
WITH (
TYPE = BLOB_STORAGE,
LOCATION = 'location_url',
CREDENTIAL = AzureBlobCredential
);
with an appropriate credential in place!
I'm expecting it to populate my temporary column title table (and then go on and do the other populating that I haven't shown code for above), but it just returns the mentioned error code.
I've had a Google, but the error code seems to be related to other 'path' type issues that I don't think apply here.
We've got similar processes that use blob storage with the same credentials, and they all seem to work ok, but the problem is that the person who wrote them is no longer at our company, so I can't actually consult them!
So basically, what would be causing that error? I don't think it's access, since I am able to run similar processes on other blobs, and as far as I can tell access levels are the same on these.
Yep - I had used the wrong URL as the prefix. It was only when I finally got access to the blob storage that I realised.
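For reference, with TYPE = BLOB_STORAGE the LOCATION should be the https URL of the storage account and container (no trailing slash), and the BULK INSERT file name is then a path relative to that container. A sketch with hypothetical account and container names:
CREATE EXTERNAL DATA SOURCE Source_File_Blob
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://myaccount.blob.core.windows.net/mycontainer',
    CREDENTIAL = AzureBlobCredential
);
-- BULK INSERT then references the file relative to the container, e.g.
-- FROM 'my_example_file.csv' WITH (DATA_SOURCE = 'Source_File_Blob', ...)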

How to resolve special character issue in SQL Server data warehouse

I have to load data from a data lake into a SQL Server data warehouse using PolyBase tables. I have created the setup for the creation of external tables, and I have created the external tables, but when I run select * from the ext_t1 table I'm getting ???? for a column in the external table.
Below is my external table script. I have found that the issue is with special characters in the data. How can we escape the special characters, given that I need to use only the varchar datatype, not nvarchar? Can someone help me with this issue?
CREATE EXTERNAL FILE FORMAT [CSVFileFormat_Test]
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = N',',
        STRING_DELIMITER = N'"',
        DATE_FORMAT = 'yyyy-MM-dd',
        FIRST_ROW = 2,
        USE_TYPE_DEFAULT = True,
        Encoding = 'UTF8'
    )
)

CREATE EXTERNAL TABLE [dbo].[EXT_TEST1]
(
    A VARCHAR(10),
    B VARCHAR(20)
)
WITH (
    DATA_SOURCE = [Azure_Datalake],
    LOCATION = N'/A/Test_CSV/',
    FILE_FORMAT = [CSVFileFormat_Test],
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 1
)
Data (this is how the special characters in the CSV's column A render):
ÐК Ð’ÐЗМ Завод
ÐК Ð’ÐЗМ ЗаÑтройщик
This is a data type mismatch issue, and this reading may help you.
External Table Considerations
Creating an external table is easy, but there are some nuances that need to be discussed.
External Tables are strongly typed. This means that each row of the data being ingested must satisfy the table schema definition. If a row does not match the schema definition, the row is rejected from the load.
The REJECT_TYPE and REJECT_VALUE options allow you to define how many rows or what percentage of the data must be present in the final table. During load, if the reject value is reached, the load fails. The most common cause of rejected rows is a schema definition mismatch. For example, if a column is incorrectly given the schema of int when the data in the file is a string, every row will fail to load.
Data Lake Storage Gen1 uses Role Based Access Control (RBAC) to control access to the data. This means that the Service Principal must have read permissions to the directories defined in the location parameter and to the children of the final directory and files. This enables PolyBase to authenticate and load that data.
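One approach worth trying, assuming your service supports UTF-8 collations (available in SQL Server 2019+ and Azure Synapse; check your tier before relying on it): keep the columns as varchar but give them a UTF-8 collation, so UTF-8 encoded text is not flattened to ????. A sketch based on the table in the question:
CREATE EXTERNAL TABLE [dbo].[EXT_TEST1]
(
    -- the UTF-8 collation is the assumption here; varchar sizes are byte counts,
    -- so allow room for multi-byte characters
    A VARCHAR(10) COLLATE Latin1_General_100_CI_AS_SC_UTF8,
    B VARCHAR(20) COLLATE Latin1_General_100_CI_AS_SC_UTF8
)
WITH (
    DATA_SOURCE = [Azure_Datalake],
    LOCATION = N'/A/Test_CSV/',
    FILE_FORMAT = [CSVFileFormat_Test],
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 1
)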

File does not exist in database - SQL Server Snapshot

I am attempting to create a database snapshot with SQL Server 2008 R2 using the following T-SQL code.
CREATE DATABASE SNAP_myDB_0900
ON
(NAME = myDB, FILENAME = 'C:\myDB_0900.SNAP')
AS SNAPSHOT OF myDB
I get the following error:
The file 'myDB' does not exist in database 'myDB'
This code works with other databases in the same instance but not this one. I have double checked the file name and it is correct.
Why am I getting this error?
Verify the logical file name of the database you are trying to create the snapshot from:
select name, physical_name
from myDB.sys.database_files;
The NAME you give your snapshot file(s) needs to match the source database's logical file name.
In other words, if myDB's data file has the logical name datafile1, then you will have to use ... NAME = 'datafile1' ... when creating your snapshot.
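Putting it together, a minimal sketch (datafile1 is just the hypothetical logical name from the example above; substitute whatever the sys.database_files query returns):
CREATE DATABASE SNAP_myDB_0900
ON
(NAME = datafile1, FILENAME = 'C:\myDB_0900.SNAP')
AS SNAPSHOT OF myDB;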