I need to connect to a Synapse Analytics serverless SQL pool database using SQL authentication.
I created a serverless SQL pool database, created a SQL user, and granted it db_owner access.
Then I created the external table below:
IF NOT EXISTS (SELECT * FROM sys.external_file_formats
WHERE name = 'SynapseDeltaFormat')
CREATE EXTERNAL FILE FORMAT [SynapseDeltaFormat]
WITH ( FORMAT_TYPE = PARQUET)
GO
IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name =
'test_dfs_core_windows_net')
CREATE EXTERNAL DATA SOURCE [test_dfs_core_windows_net]
WITH (
LOCATION = 'abfss://test.dfs.core.windows.net'
)
GO
CREATE EXTERNAL TABLE table_staging (
<columns here>
)
WITH (
LOCATION = 'bronze/table_staging/',
DATA_SOURCE = [test_dfs_core_windows_net],
FILE_FORMAT = [SynapseDeltaFormat]
)
GO
SELECT TOP 100 * FROM dbo.table_staging
GO
I get the error below when I try to read the table as the SQL user:
External table 'dbo.table_staging' is not accessible because location does not exist or it is used by another process.
The table data is accessible with an AD user. The data source was created with the SQL user.
It seems that the SQL user does not have access to the data lake storage. How do I grant that access?
I just started with Delta Lake myself :)
By default, serverless SQL authenticates against storage using the caller's own identity. When you query from within Synapse Studio you are using your AD user's context, which is why you can reach the external storage.
SQL users have no Azure AD identity to pass through, and some AD users lack permissions on the storage, so for them you need to attach a credential to the data source that the query uses.
You can find detailed instructions here:
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-storage-access-control?tabs=user-identity
You need some way of accessing the storage: a service principal, a SAS token, or a managed identity.
I just set up credentials using a service principal (an "App registration"):
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<secret>';
go
CREATE DATABASE SCOPED CREDENTIAL [credentialname] WITH
IDENTITY = '<Client-ID>#https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token'
, SECRET = '<token>'
GO
CREATE EXTERNAL DATA SOURCE mydatasource WITH (
LOCATION = 'https://<storageaccount>.dfs.core.windows.net/onedatahubtest',
CREDENTIAL = credentialname
);
CREATE VIEW MYVIEW AS
SELECT *
FROM OPENROWSET(
BULK 'Example/table',
DATA_SOURCE='mydatasource',
FORMAT = 'delta') as rows;
And now my SQL user can access the view.
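Applied to the external table from the question, the same credential-backed data source might be used roughly as in the sketch below. It assumes the folder really holds Delta data. The format name DeltaFormat, the table name table_staging_cred, the two columns, and the user reporting_user are placeholders that do not come from the thread.

```sql
-- Assumption: the folder contains a Delta table; DeltaFormat is an illustrative name.
CREATE EXTERNAL FILE FORMAT [DeltaFormat]
WITH (FORMAT_TYPE = DELTA);
GO

-- The column list is hypothetical; use the real schema of your Delta table.
CREATE EXTERNAL TABLE dbo.table_staging_cred (
    id      INT,
    payload VARCHAR(200)
)
WITH (
    LOCATION = 'Example/table',      -- path relative to the data source
    DATA_SOURCE = [mydatasource],    -- this data source carries the credential
    FILE_FORMAT = [DeltaFormat]
);
GO

-- A SQL user that is not db_owner needs access to the credential and to the table.
-- [reporting_user] is a hypothetical user name.
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[credentialname] TO [reporting_user];
GRANT SELECT ON OBJECT::dbo.table_staging_cred TO [reporting_user];
GO
```

Because the data source references the credential, the storage call is made with the service principal rather than with the identity of whoever runs the query.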
Related
I have a Serverless SQL pool set up in Azure Synapse Analytics, and I am trying to run this query:
CREATE DATABASE SCOPED CREDENTIAL myCredential
WITH IDENTITY = 'test',
SECRET = 'test2';
When I run the query I get this error:
Incorrect syntax near 'IDENTITY'.
How can I correct this issue?
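Please use the format below. The script first creates a user database, switches to it, and creates a master key before creating the credential; running the statement in master is a likely cause of the error: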
USE [master]
GO
-- Create the lake house logic database
IF db_id('nyctaxidwdelta') IS NULL
EXEC('CREATE DATABASE nyctaxidwdelta COLLATE Latin1_General_100_BIN2_UTF8')
GO
USE [nyctaxidwdelta]
GO
-- Create a master key
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'blabla!'
GO
-- Create database scoped credential that use Synapse Managed Identity
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity
WITH IDENTITY = 'Managed Identity'
GO
-- Create external data source
IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name = 'eds_nyctaxi')
CREATE EXTERNAL DATA SOURCE [eds_nyctaxi]
WITH (
LOCATION = 'https://mystorage.dfs.core.windows.net/lakedata/',
CREDENTIAL = WorkspaceIdentity
)
GO
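To confirm that the prerequisites are in place before or after running the script above, a few catalog queries can help. This is only a sketch, and it assumes these catalog views are exposed in your serverless pool.

```sql
-- Should return the user database (for example nyctaxidwdelta), not master.
SELECT DB_NAME() AS current_database;

-- Returns one row if the database master key has been created.
SELECT name
FROM sys.symmetric_keys
WHERE name = '##MS_DatabaseMasterKey##';

-- Lists the database scoped credentials defined in the current database.
SELECT name, credential_identity
FROM sys.database_scoped_credentials;
```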
Can I create a custom role, or edit an existing role, in Azure Synapse so that
it allows only SELECT queries through the built-in serverless pool, with
access to pipelines restricted?
Ideally I'm looking for a role that can only read SQL and lake data, query it using different technologies (SQL, Spark), and has no access to anything else.
You can create the external table on the required location using a database scoped credential, then GRANT REFERENCES on the credential and SELECT on the external table to the SQL user. Follow the steps below:
CREATE DATABASE SCOPED CREDENTIAL SampleIdentity
WITH IDENTITY = 'Managed Identity'
GO
CREATE EXTERNAL DATA SOURCE [DataLakeStorage] WITH (LOCATION = N'https://theorders.dfs.core.windows.net/', CREDENTIAL = SampleIdentity)
GO
The caller needs the following permissions to execute the OPENROWSET function:
One of the permissions to execute OPENROWSET:
  ADMINISTER BULK OPERATIONS enables a login to execute the OPENROWSET function.
  ADMINISTER DATABASE BULK OPERATIONS enables a database scoped user to execute the OPENROWSET function.
REFERENCES DATABASE SCOPED CREDENTIAL on the credential that is referenced in the EXTERNAL DATA SOURCE.
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[SampleIdentity] TO [SQLUser];
GO
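-- nvarchar with no length means nvarchar(1); set lengths that match your data.
CREATE EXTERNAL TABLE [dbo].[DimProductexternal]
( ProductKey int, ProductLabel nvarchar, ProductName nvarchar )
WITH
(
LOCATION='/DimProduct/year=*/month=*' ,
DATA_SOURCE = [DataLakeStorage] , -- the data source created above
FILE_FORMAT = TextFileFormat -- assumes this external file format already exists
) ;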
You can now grant SELECT permission on the external table to the user:
GRANT SELECT ON [dbo].[DimProductexternal] TO [SQLUser]
GO
To restrict access to resources in Synapse, you can assign role-based access control (RBAC) roles.
To restrict access to running and cancelling pipelines in the Synapse workspace, you can assign the Synapse Monitoring Operator role using RBAC in Synapse. Refer to "Synapse RBAC roles and the actions they permit" for more details.
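On the SQL side, the grants above could be collected into a database role so that read-only users are managed in one place. The sketch below assumes custom database roles are available in your pool; lake_readers is a hypothetical role name, while SampleIdentity, DimProductexternal, and SQLUser are the names used in the steps above.

```sql
-- lake_readers is a hypothetical role name.
CREATE ROLE [lake_readers];
GO

-- Members may use the credential and read the external table, nothing more.
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[SampleIdentity] TO [lake_readers];
GRANT SELECT ON OBJECT::[dbo].[DimProductexternal] TO [lake_readers];
GO

-- Add each read-only user to the role.
ALTER ROLE [lake_readers] ADD MEMBER [SQLUser];
GO
```

This covers only SQL permissions inside the database; pipeline and Spark access is still governed by the Synapse RBAC roles mentioned above.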
I am trying to create row-level security in an Azure Synapse on-demand (serverless) database. The data is stored in Azure Data Lake Storage Gen2. The script works fine, but members of the restricted user group can still run the OPENROWSET command manually and see all the data. Does anybody know what I am missing?
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity
WITH IDENTITY = 'Managed Identity'
GO
CREATE EXTERNAL DATA SOURCE [DataLakeStorage] WITH (LOCATION = N'https://theorders.dfs.core.windows.net/', CREDENTIAL = WorkspaceIdentity )
GO
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[WorkspaceIdentity] TO [MyTestGroup];
GO
CREATE VIEW [model].[my_orders] as
SELECT * FROM
OPENROWSET(BULK 'dimorders/*.parquet',
DATA_SOURCE = 'DataLakeStorage', FORMAT = 'parquet') as rows
WHERE [UserName] = suser_name()
GO
GRANT SELECT ON [model].[my_orders] TO [MyTestGroup]
GO
Example script that returns all the data, without restriction:
SELECT * FROM
OPENROWSET(BULK 'dimorders/*.parquet',
DATA_SOURCE = 'DataLakeStorage', FORMAT = 'parquet') as rows
I suggest following the steps below, which show how to give a user permission to access a particular database.
Note: the steps below need to be run for each SQL pool to grant a user access to all SQL databases, except in the section "Workspace-scoped permission", where you can assign a user the sysadmin role at the workspace level.
Set up security groups
Prepare your ADLS Gen2 storage account
Create and configure your Azure Synapse Workspace
Grant the workspace MSI access to the default storage container
Grant Synapse administrators the Azure Contributor role on the workspace
Assign SQL Active Directory Admin role
Grant access to SQL pools
Add users to security groups
Network security
Refer - https://learn.microsoft.com/en-us/azure/synapse-analytics/security/how-to-set-up-access-control#supporting-more-advanced-scenarios
Use case:
After learning that AD passthrough does not work as expected on a Synapse serverless pool with ADLS Gen2, I am trying the traditional method: creating external tables on the serverless pool, granting users READ ONLY access to a set of tables, and enabling write-back to another ADLS Gen2 container using CETAS.
It looks like I am stuck there as well.
I have tried to explain my scenario in the image below.
I have 5 external tables in a database where I have READ ONLY access to the schemas in which those tables exist.
I want to create a few more tables, which would JOIN those 5 tables, aggregate the data, and write it back to ADLS Gen2 for reporting and data science purposes.
What access should I grant for write-back?
I tried creating a new schema and granting ALTER, CONTROL, and SELECT on that schema, along with CREATE TABLE at the database level. I don't want to grant more access at the database level, because the database has a database scoped credential that references the managed identity, which would grant full access to the objects in the ROC container.
GRANT SELECT ON SCHEMA::[sandbox] TO [sls_svc];
GRANT ALTER ON SCHEMA::[sandbox] TO [sls_svc];
GRANT CONTROL ON SCHEMA::[sandbox] TO [sls_svc];
GRANT CREATE TABLE TO [sls_svc];
-- Names containing hyphens must be bracketed.
CREATE EXTERNAL TABLE [sandbox].[revenue-by-month]
WITH (
LOCATION = '/ROW/revenue-by-month/',
DATA_SOURCE = [ADLS-ROW],
FILE_FORMAT = EF_PARQUET
)
AS
SELECT * FROM table1;
All users in the sls_svc role have STORAGE DATA CONTRIBUTOR access on the READ-WRITE container (ROW).
Below are the error messages I am getting
I also tried creating a new database, hoping that I could grant full access on that database so that a cross-database query would work, but I am out of luck there as well.
Any thoughts?
It seems that you have set the permissions correctly: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-overview?tabs=impersonation#permissions
Are you sure that you can successfully execute the SELECT statement on its own, and that the issue is not in the SELECT part?
Granting CONNECT on the database that was created, plus DDL_ADMIN (db_ddladmin) access, resolved the issue.
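The fix reported above could be expressed roughly as follows. This is a sketch that reads "DDL_ADMIN" as the fixed database role db_ddladmin and assumes sls_svc is the role from the question; run it in the database that was created for the write-back tables.

```sql
-- Allow the role's members to connect to this database.
GRANT CONNECT TO [sls_svc];
GO

-- Allow them to run DDL such as CREATE EXTERNAL TABLE ... AS SELECT.
ALTER ROLE db_ddladmin ADD MEMBER [sls_svc];
GO
```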
I have two databases on the same Azure SQL server, and the same table (TB1) exists in both. I want to read data from TB1 in DB2 and insert it into TB1 in DB1.
I am using the query below but getting an error.
insert into TB1 select 1,* from [DB2].dbo.TB1
Error Message
Msg 40515, Level 15, State 1, Line 16
Reference to database and/or server name in 'DB2.dbo.TB1' is not supported in this version of SQL Server.
Yes, you can use the elastic query feature of Azure SQL Database. It's the only way to perform cross-database queries.
Here are the detailed queries to follow:
Run the query below in DB1 (since you are reading TB1 from DB2 and inserting that data into TB1 in DB1):
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'STro*ngPaSSe0rD';
CREATE DATABASE SCOPED CREDENTIAL Login
WITH IDENTITY = 'Login',
SECRET = 'STro*ngPaSSe0rD';
CREATE EXTERNAL DATA SOURCE RemoteReferenceData
WITH
(
TYPE=RDBMS,
LOCATION='myserver.database.windows.net',
DATABASE_NAME='DB2',
CREDENTIAL= Login
);
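-- DB1 already has a local dbo.TB1, so the external table gets a different
-- local name (illustrative) and is mapped to the remote table explicitly.
CREATE EXTERNAL TABLE [dbo].[TB1_DB2]
(
[Columns] [DataTypes]
)
WITH (DATA_SOURCE = [RemoteReferenceData], SCHEMA_NAME = N'dbo', OBJECT_NAME = N'TB1')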
After these steps, you can query the external table like a normal table. There are some limitations, though: for example, you cannot insert data into an external table (reference table).
Azure has supported cross-database queries since 2015, but they need some extra setup using Elastic Query.
The first step is to create a security credential:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
CREATE DATABASE SCOPED CREDENTIAL DB2Security
WITH IDENTITY = '<username>',
SECRET = '<password>';
The "username" and "password" should be the username and password used to log in to the DB2 database.
Now you can use the credential to define an external data source, so that DB1 can connect to DB2:
CREATE EXTERNAL DATA SOURCE DB2Access
WITH (
TYPE=RDBMS,
LOCATION='myservernotyours.database.secure.windows.net',
DATABASE_NAME='DB2',
CREDENTIAL= DB2Security);
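Finally, you map TB1 from the DB2 database as an external table, using the external data source created above:
CREATE EXTERNAL TABLE dbo.TB1FromDB2(
ID int,
Val varchar(50))
WITH
(
DATA_SOURCE = DB2Access,
-- The local name differs from the remote table, so map it explicitly.
SCHEMA_NAME = N'dbo',
OBJECT_NAME = N'TB1');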
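With the external table in place, the insert from the question can read from it instead of using a three-part name. A minimal sketch, assuming the local dbo.TB1 has one leading column followed by ID and Val (mirroring the "select 1, *" in the question):

```sql
-- Run in DB1. dbo.TB1FromDB2 is the external table that points at TB1 in DB2.
INSERT INTO dbo.TB1
SELECT 1, ID, Val
FROM dbo.TB1FromDB2;
```

Listing the columns instead of using * keeps the statement stable if the remote table later gains columns.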
You can also accomplish this using Azure SQL Data Sync, but the data is replicated into a single database, the feature was still in preview as of May 2018, and you always see slightly stale data (the minimum configurable synchronization interval is 5 minutes).
You can perform cross-database queries using the elastic query feature of Azure SQL Database.
You will have to create an external data source and an external table to be able to query tables in other Azure SQL databases. This article shows how to do it.
Hope this helps.