I am attempting to pull in data from a CSV file stored in an Azure Blob container. When I try to query the file, I get the following error:
File 'https://<storageaccount>.blob.core.windows.net/<container>/Sales/2020-10-01/Iris.csv' cannot be opened because it does not exist or it is used by another process.
The file does exist and, as far as I know, it is not being used by anything else.
I am using SSMS and also a SQL On-Demand endpoint from Azure Synapse.
What I did in SSMS was run the following commands after connecting to the endpoint:
CREATE DATABASE [Demo2];
CREATE EXTERNAL DATA SOURCE AzureBlob WITH ( LOCATION = 'wasbs://<container>@<storageaccount>.blob.core.windows.net/' );
SELECT * FROM OPENROWSET (
    BULK 'Sales/2020-10-01/Iris.csv',
    DATA_SOURCE = 'AzureBlob',
    FORMAT = 'CSV'
) AS tv1;
I am not sure where my issue is or where to go next. Did I mess up anything when creating the external data source? Do I need to use a SAS token there, and if so, what is the syntax for that?
@Ubiquitinoob44, you need to create a database credential:
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-storage-access-control?tabs=shared-access-signature
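For example, a database-scoped credential using a SAS token looks roughly like this (a sketch with placeholder names and secret, not the exact statements from the linked doc):

-- Run in the user database; the SAS token is pasted without the leading '?'.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL SasCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS token>';

CREATE EXTERNAL DATA SOURCE AzureBlobSas
WITH ( LOCATION = 'https://<storageaccount>.blob.core.windows.net/<container>',
       CREDENTIAL = SasCredential );

Your OPENROWSET query can then point DATA_SOURCE at AzureBlobSas instead of AzureBlob.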
I figured out what the issue was. I haven't tried Armando's suggestion yet.
First I had to go to the container and edit its IAM role assignments to give my Azure Active Directory login the Storage Blob Data Contributor role. The user to grant access to is the email address you use to sign in to the Azure portal.
https://learn.microsoft.com/en-us/azure/storage/common/storage-auth-aad-rbac-portal?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json
After that I had to reconnect to the On-Demand endpoint in SSMS. Make sure you log in through the Azure AD - MFA option. Originally I was connecting with the On-Demand endpoint's SQL username and password, which had not been granted the Storage Blob Data Contributor role on the container.
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/resources-self-help-sql-on-demand
I'm just finding my way around Azure, trying to build a modern data warehouse. One thing I haven't been able to figure out is how to query my data lake from an Azure SQL database.
Something similar to the following works in Azure Synapse (note the long-term plan is to remove Synapse due to cost reasons):
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'https://storageaccount.blob.core.windows.net/container/folder/2022/09/03/filename.parquet',
    SINGLE_BLOB
) AS [result]
But I get the following error running this from an Azure SQL database (in the Azure portal, using the query editor):
Failed to execute the query. Error: Cannot bulk load because the file "https://storageaccount.blob.core.windows.net/container/folder/2022/09/03/filename.parquet" could not be opened. Operating system error code 6(The handle is invalid.).
I also tried the code below after searching on the Internet:
CREATE EXTERNAL DATA SOURCE pocBlobStorage
WITH ( TYPE = BLOB_STORAGE,
       LOCATION = 'https://storageaccount.blob.core.windows.net/container/folder/2022/09/03',
       CREDENTIAL = sqlblob );
-- Query remote file
SELECT *
FROM OPENROWSET(BULK 'filename.parquet',
DATA_SOURCE = 'pocBlobStorage',
SINGLE_CLOB
--FORMATFILE='currency.fmt',
--FIRSTROW=2
--, FORMATFILE_DATA_SOURCE = 'pocBlobStorage'
) as D
I tried various combinations of the formatting options, but couldn't get anything to work.
The current error I'm getting is: Failed to execute query. Error: Referenced external data source "pocBlobStorage" not found.
I'm wondering if I need to do something to enable the Azure SQL database to access my data lake. For example, I haven't configured any credential called 'sqlblob' as per my last code segment, and I'm not sure where to do this (something similar to creating a linked service in Azure Data Factory).
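From what I've read, I'd expect the missing piece to look something like the sketch below (the names and the SAS secret are placeholders; I'm not sure this is the right place to do it):

-- Hypothetical setup for the 'sqlblob' credential referenced above.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL sqlblob
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS token without the leading ?>';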
So how do I query my data lake directly from my Azure SQL database? Is the issue in my query, or do I need to configure access first, and if so, how?
I'm trying to mask sensitive data in an Azure SQL database.
The data is saved as normal text, with one column saved as XML and another as JSON.
I've tried adding masking rules to the database, but when I open SSMS and run a SELECT statement, the masking is not applied to any of the data in the columns (normal text, XML, or JSON).
No user has been excluded from masking (i.e., allowed to see unmasked data).
I just want to understand why the data is not masked when I run a SELECT in SSMS.
My rules look like the below:
XML rule:
JSON rule:
Text rule:
My SQL statement:
SELECT TOP (1000) * FROM database_Name
As mentioned in the Microsoft documentation:
The identities in Azure Active Directory (Azure AD) or SQL that are excluded from masking have access to the unmasked sensitive data.
You may be accessing the data as a SQL admin or a privileged Azure AD user; administrators are always excluded from masking, which is why you can see the sensitive data.
Dynamic data masking hides sensitive information from non-privileged users at the database layer, so you can control who sees the real values by granting or revoking the UNMASK permission for a user.
The following code is taken from the Microsoft documentation:
Give UNMASK permission to the user:
GRANT UNMASK ON Data.Membership TO [USER];
Query the data under the context of the user:
EXECUTE AS USER = 'USER';
SELECT * FROM Data.Membership;
REVERT;
Revoke UNMASK permission:
REVOKE UNMASK ON Data.Membership FROM [USER];
Data after granting permission to user
Data after removing permission from user
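If no non-admin user exists yet, a quick way to see masking in action is to create a login-less test user and query under its context (a sketch; the user name is just an example, and Data.Membership is the table from the documentation sample above):

-- Admins always see unmasked data, so test with an ordinary user.
CREATE USER MaskingTestUser WITHOUT LOGIN;
GRANT SELECT ON Data.Membership TO MaskingTestUser;

EXECUTE AS USER = 'MaskingTestUser';
SELECT TOP (10) * FROM Data.Membership; -- values should come back masked
REVERT;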
References:
SQL Database dynamic data masking with the Azure portal
Granting and Revoking the Permission
I have set up a serverless SQL pool in Azure Synapse with a view that queries a linked Azure Data Lake.
CREATE VIEW DeviceTelemetryView
AS
SELECT corporationid, deviceid, version,
       CONVERT(datetime, dateTimestamp, 126) AS dateTimeStamp,
       deviceData
FROM OPENROWSET(
    BULK 'https://test123.dfs.core.windows.net/devicetelemetry/*/*/*/*/*/',
    FORMAT = 'PARQUET'
) AS [result]
GO
Using my Azure AD credentials from within Synapse Studio or SSMS, I have no issues querying this view. When I try to query using my SQL admin account, I get the following error:
Cannot find the CREDENTIAL 'https://test123.dfs.core.windows.net/devicetelemetry/*/*/*/*/*/', because it does not exist or you do not have permission.
It is important that I am able to query using the SQL admin credentials, as we want to query this view from our application for various reports and thus don't want to use AAD credentials.
I have tried the SO solution provided here: GRANT Database Scoped Credential syntax gives mismatched input error
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[WorkspaceSystemIdentity] TO [sqlAdmin];
This seems to be the default credential that was created when linking my data lake to Synapse; however, running it against the database where my view exists gives me the following error:
Cannot find the database scoped credential 'WorkspaceSystemIdentity', because it does not exist or you do not have permission.
You would need to create a server-scoped credential to allow access to the storage files.
Server-scoped credential
These are used when a SQL login calls the OPENROWSET function without DATA_SOURCE to read files on some storage account. The name of the server-scoped credential must match the base URL of the Azure storage (optionally followed by a container name).
Note that SQL users can't use Azure AD authentication to access storage, and serverless SQL pool doesn't return subfolders unless you specify /** at the end of the path.
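As a sketch, the server-scoped credential is created in the master database of the serverless SQL pool, and its name must match the storage URL used in the view (the SAS secret is a placeholder):

-- Run in master; the credential name matches the base URL of the BULK path.
CREATE CREDENTIAL [https://test123.dfs.core.windows.net/devicetelemetry]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS token without the leading ?>';

After that, a SQL login such as sqlAdmin should be able to query the view without an Azure AD identity.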
I'm having trouble with an Azure SQL Database where I'm trying to read the DB audit logs.
Both functions, sys.fn_get_audit_file and sys.fn_xe_file_target_read_file, should be able to read a file.
But whatever I do, I'm getting blank tables. Even if I specify a non-existent file, I receive a table with zero records instead of an error.
So I'm afraid it's something else.
My login is in the db_owner role.
Any suggestions?
I found that I could only read XEL files by using the same server and same database context that they were created for. So for example, consider the following scenario:
ServerA is the Azure Synapse instance I was creating the audit XEL files from, all related to DatabaseA
ServerB is a normal SQL instance that I want to read the XEL files on
Test 1:
Using ServerB, try to read file directly from blob storage
Result: 0 rows returned, no error message
Test 2:
Using ServerB, download the XEL files locally, and try to read from the local copy
Result: 0 rows returned, no error message
Test 3:
Using ServerA, with the current DB = 'master', try to read file directly from blob storage
Result: 0 rows returned, no error message
Test 4:
Using ServerA, with the current DB = 'DatabaseA', try to read file directly from blob storage
Result: works perfectly
Because I really wanted to read the files from ServerB, I also tried running CREATE CREDENTIAL there with a credential that could read and write to my blob storage account. Unfortunately, that didn't make any difference: a repeat of Test 1 gave the same result as before.
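For reference, the call I was repeating in each test looked roughly like this (the storage path is a placeholder):

-- Only returned rows when run on ServerA with DatabaseA as the current database.
SELECT *
FROM sys.fn_get_audit_file(
    'https://<storageaccount>.blob.core.windows.net/sqldbauditlogs/<ServerA>/<DatabaseA>/*.xel',
    DEFAULT, DEFAULT);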
I'm trying to create an external data source to access Azure Blob Storage. However, I'm having issues with creating the actual data source.
I've followed the instructions located here:
Examples of bulk access to data in Azure Blob Storage and
Create external data source - Transact-SQL. I'm using SQL Server 2016 on a VM, accessed via SSMS on a client machine using Windows Authentication, with no issues. The instructions say creating this external data source works for SQL Server 2016 and Azure Blob Storage.
I have created the Master Key:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
and the database-scoped credential:
CREATE DATABASE SCOPED CREDENTIAL UploadCountries
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<key>';
I have verified both of these exist in the database by querying sys.symmetric_keys and sys.database_scoped_credentials.
However, when I try executing the following code, it says 'Incorrect syntax near 'EXTERNAL'':
CREATE EXTERNAL DATA SOURCE BlobCountries
WITH (
TYPE = BLOB_STORAGE,
LOCATION = 'https://<somewhere>.table.core.windows.net/<somewhere>',
CREDENTIAL = UploadCountries
);
Your thoughts and help are appreciated!
Steve.
In “Examples of Bulk Access to Data in Azure Blob Storage”, we can find:
Bulk access to Azure blob storage from SQL Server, requires at least SQL Server 2017 CTP 1.1.
And in Arguments section of “CREATE EXTERNAL DATA SOURCE (Transact-SQL)”, we can find similar information:
Use BLOB_STORAGE when performing bulk operations using BULK INSERT or OPENROWSET with SQL Server 2017
You are using SQL Server 2016, which is why you get the 'Incorrect syntax near 'EXTERNAL'' error when you create an external data source for Azure Blob Storage.
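For completeness, on SQL Server 2017 or later the same pattern should parse; a minimal sketch with placeholder names and secret (note that BLOB_STORAGE expects the blob endpoint, blob.core.windows.net, rather than the table endpoint):

-- Requires SQL Server 2017+ (or Azure SQL Database).
CREATE DATABASE SCOPED CREDENTIAL UploadCountries
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS token without the leading ?>';

CREATE EXTERNAL DATA SOURCE BlobCountries
WITH ( TYPE = BLOB_STORAGE,
       LOCATION = 'https://<somewhere>.blob.core.windows.net/<container>',
       CREDENTIAL = UploadCountries );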