Restricting direct access to Azure SQL external data source

I am trying to create row-level security in an Azure Synapse on-demand (serverless) database. The data is stored in Azure Data Lake Storage Gen2. The script works fine, but members of the restricted user group can still run the OPENROWSET command manually and see all the data. Does anybody know what I am missing?
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity
WITH IDENTITY = 'Managed Identity'
GO
CREATE EXTERNAL DATA SOURCE [DataLakeStorage] WITH (LOCATION = N'https://theorders.dfs.core.windows.net/', CREDENTIAL = WorkspaceIdentity )
GO
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[WorkspaceIdentity] TO [MyTestGroup];
GO
CREATE VIEW [model].[my_orders] as
SELECT * FROM
OPENROWSET(BULK 'dimorders/*.parquet',
DATA_SOURCE = 'DataLakeStorage', FORMAT = 'parquet') as rows
WHERE [UserName] = suser_name()
GO
GRANT SELECT ON [model].[my_orders] TO [MyTestGroup]
GO
Example script that returns all the data, without restriction:
SELECT * FROM
OPENROWSET(BULK 'dimorders/*.parquet',
DATA_SOURCE = 'DataLakeStorage', FORMAT = 'parquet') as rows

I would suggest following the steps below, which show how to give a user permission to access a particular database.
Note: the steps below need to be run for each SQL pool to grant the user access to all SQL databases, except in the section "Workspace-scoped permission", where you can assign a user the sysadmin role at the workspace level.
Set up security groups
Prepare your ADLS Gen2 storage account
Create and configure your Azure Synapse Workspace
Grant the workspace MSI access to the default storage container
Grant Synapse administrators the Azure Contributor role on the workspace
Assign SQL Active Directory Admin role
Grant access to SQL pools
Add users to security groups
Network security
Refer - https://learn.microsoft.com/en-us/azure/synapse-analytics/security/how-to-set-up-access-control#supporting-more-advanced-scenarios
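Before changing anything, it may help to list the permissions the restricted group currently holds in the database. The query below is only a sketch built on the standard catalog views sys.database_permissions and sys.database_principals; it assumes the group exists as a database principal named MyTestGroup, as in the question. Note that the script in the question grants REFERENCES on the credential behind the data source, and that permission appears in the list of permissions related to OPENROWSET quoted in a related thread on this page.

```sql
-- List every explicit permission held by the restricted group.
-- Assumption: the group is a database principal named MyTestGroup.
SELECT
    pr.name            AS principal_name,
    pe.class_desc      AS securable_class,
    pe.permission_name AS permission_name,
    pe.state_desc      AS permission_state
FROM sys.database_permissions AS pe
JOIN sys.database_principals AS pr
    ON pr.principal_id = pe.grantee_principal_id
WHERE pr.name = 'MyTestGroup';
```

Any row with class DATABASE_SCOPED_CREDENTIAL, or a bulk-operations permission, is a candidate explanation for why direct OPENROWSET calls succeed.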

Related

Azure Elastic Job Agent - Credentials

I want to refresh tables in a single database with my queries, using Elastic Job Agent. I first need to create a credential to connect to the database. How should I set up the credentials in this case? Every piece of documentation I find uses either 2 databases or 2 servers.
I found 2 sources, but I couldn't understand the concept. Can you please explain it clearly?
Source 1 : Microsoft document
--Connect to the new job database specified when creating the Elastic Job agent
-- Create a database master key if one does not already exist, using your own password.
CREATE MASTER KEY ENCRYPTION BY PASSWORD='<EnterStrongPasswordHere>';
-- Create two database scoped credentials.
-- The credential to connect to the Azure SQL logical server, to execute jobs
CREATE DATABASE SCOPED CREDENTIAL job_credential WITH IDENTITY = 'job_credential',
SECRET = '<EnterStrongPasswordHere>';
GO
-- The credential to connect to the Azure SQL logical server, to refresh the database metadata in server
CREATE DATABASE SCOPED CREDENTIAL refresh_credential WITH IDENTITY = 'refresh_credential',
SECRET = '<EnterStrongPasswordHere>';
GO
Source 2: link
--In the master database
CREATE LOGIN mastercredential WITH PASSWORD='YourPassword1';
CREATE LOGIN jobcredential WITH PASSWORD='YourPassword2';
CREATE USER mastercredential FROM LOGIN mastercredential;
--In the job database
CREATE USER mastercredential FROM LOGIN mastercredential;
--In the target database
CREATE USER jobcredential FROM LOGIN jobcredential;
-- In the job database
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'YourPassword3';
CREATE DATABASE SCOPED CREDENTIAL mastercredential
WITH IDENTITY = 'mastercredential',
SECRET = 'YourPassword1';
CREATE DATABASE SCOPED CREDENTIAL jobcredential
WITH IDENTITY = 'jobcredential',
SECRET = 'YourPassword2';

Custom Role in Azure Synapse

Can I create a custom role, or edit an existing role, in Azure Synapse that provides only SELECT query access through the built-in serverless pool, with access to pipelines restricted?
Ideally I'm looking for a role that can only read SQL and lake data, can query it using different technologies (SQL, Spark), and has no access to anything else.
You can create an external table over the required data using a database scoped credential, then first GRANT REFERENCES on the credential and then GRANT SELECT on the external table to the SQL user. Follow the steps below:
CREATE DATABASE SCOPED CREDENTIAL SampleIdentity
WITH IDENTITY = 'Managed Identity'
GO
CREATE EXTERNAL DATA SOURCE [DataLakeStorage] WITH (LOCATION = N'https://theorders.dfs.core.windows.net/', CREDENTIAL = SampleIdentity)
GO
The caller must have one of the following permissions to execute the OPENROWSET function:
ADMINISTER BULK OPERATIONS enables a login to execute the OPENROWSET function.
ADMINISTER DATABASE BULK OPERATIONS enables a database scoped user to execute the OPENROWSET function.
REFERENCES DATABASE SCOPED CREDENTIAL on the credential that is referenced in the EXTERNAL DATA SOURCE.
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[SampleIdentity] TO [SQLUser];
GO
CREATE EXTERNAL TABLE [dbo].[DimProductexternal]
( ProductKey int, ProductLabel nvarchar, ProductName nvarchar )
WITH
(
LOCATION='/DimProduct/year=*/month=*' ,
-- Must match the data source created above ([DataLakeStorage])
DATA_SOURCE = [DataLakeStorage] ,
FILE_FORMAT = TextFileFormat
) ;
You can now grant SELECT permission on the external table to the user.
GRANT SELECT ON [dbo].[DimProductexternal] TO [SQLUser]
GO
To restrict access to resources in Synapse, you can assign roles through role-based access control (RBAC).
To restrict run/cancel access to pipelines in the Synapse workspace, you can assign the Synapse Monitoring Operator role using Synapse RBAC. Refer to "Synapse RBAC roles and the actions they permit" for more details.
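The CREATE EXTERNAL TABLE statement above references a file format named TextFileFormat that the script never creates. A minimal sketch of one possible definition follows; it has to run before the external table is created. The delimiter and header-row settings are assumptions for illustration and should be matched to the actual files.

```sql
-- Hypothetical definition of the TextFileFormat used by the external table.
-- FIELD_TERMINATOR and FIRST_ROW are assumptions; adjust them to your files.
IF NOT EXISTS (SELECT * FROM sys.external_file_formats WHERE name = 'TextFileFormat')
    CREATE EXTERNAL FILE FORMAT [TextFileFormat]
    WITH (
        FORMAT_TYPE = DELIMITEDTEXT,
        FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2)
    );
GO
```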

Azure Synapse Analytics

I need to connect to a Synapse Analytics serverless SQL pool database using SQL authentication.
I created a serverless SQL pool database, created a SQL user, and gave it db_owner access.
Then I created the external table below:
IF NOT EXISTS (SELECT * FROM sys.external_file_formats
WHERE name = 'SynapseDeltaFormat')
CREATE EXTERNAL FILE FORMAT [SynapseDeltaFormat]
WITH ( FORMAT_TYPE = PARQUET)
GO
IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name =
'test_dfs_core_windows_net')
CREATE EXTERNAL DATA SOURCE [test_dfs_core_windows_net]
WITH (
LOCATION = 'abfss://test.dfs.core.windows.net'
)
GO
CREATE EXTERNAL TABLE table_staging (
<columns here>
)
WITH (
LOCATION = 'bronze/table_staging/',
DATA_SOURCE = [test_dfs_core_windows_net],
FILE_FORMAT = [SynapseDeltaFormat]
)
GO
SELECT TOP 100 * FROM dbo.table_staging
GO
I get the error below when trying to read the table as the SQL user:
External table 'dbo.table_staging' is not accessible because location does not exist or it is used by another process.
The table data is accessible with an AD user. The data source was created with the SQL user.
It seems that the SQL user does not have access to the data lake storage. How do I grant that access?
Just started delta lakes myself :)
By default, the serverless SQL pool authenticates using the caller's user context. When you query from within Synapse Studio, you are using your AD user's context, which is why you can connect to the external storage.
However, SQL users, and AD users that do not have access to the storage, need a credential in order to query.
You can find detailed instructions here:
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-storage-access-control?tabs=user-identity
You need some way of accessing the storage: a service principal, a SAS token, or a managed identity.
I just set up credentials using a service principal (app registration):
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<secret>';
go
CREATE DATABASE SCOPED CREDENTIAL [credentialname] WITH
IDENTITY = '<Client-ID>#https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token'
, SECRET = '<token>'
GO
CREATE EXTERNAL DATA SOURCE mydatasource WITH (
LOCATION = 'https://<storageaccount>.dfs.core.windows.net/onedatahubtest',
CREDENTIAL = credentialname
);
CREATE VIEW MYVIEW AS
SELECT *
FROM OPENROWSET(
BULK 'Example/table',
DATA_SOURCE='mydatasource',
FORMAT = 'delta') as rows;
And now my sql user can access the view
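If the SQL user is not db_owner, it will probably need explicit grants as well. The following is only a sketch, assuming a hypothetical user named ReportingUser and assuming the view landed in the dbo schema; the REFERENCES grant mirrors the pattern shown in the related thread above and may or may not be sufficient on its own in your setup.

```sql
-- [ReportingUser] is a hypothetical SQL user; MYVIEW and credentialname
-- are the objects created in the script above (dbo schema is assumed).
GRANT SELECT ON OBJECT::[dbo].[MYVIEW] TO [ReportingUser];
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[credentialname] TO [ReportingUser];
GO
-- Run as that user to confirm the view is readable.
SELECT TOP 10 * FROM [dbo].[MYVIEW];
GO
```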

Synapse Serverless Pool writing data back to ADLS Gen2 using CETAS >> Permissions issue

Use case:
After learning that AD passthrough does not work as expected on a Synapse serverless pool with ADLS Gen2, I am trying the traditional approach: create external tables on the serverless pool, grant users READ ONLY access to a set of tables, and enable write-back to another ADLS Gen2 container using CETAS.
It looks like I am stuck there as well.
I have tried to explain my scenario in the image below.
I have 5 external tables in a database where I have READ ONLY access to the schemas in which those tables exist.
I want to create a few more tables that JOIN those 5 tables, aggregate the data, and write it back to ADLS Gen2 for reporting and data science purposes.
What access should I grant for the write-back?
I tried creating a new schema and granting ALTER, CONTROL, and SELECT on that schema, along with CREATE TABLE at the database level. I don't want to grant more access at the database level, because the database has a database scoped credential that references the managed identity, which would grant full access to the ROC container objects.
GRANT SELECT ON SCHEMA::[sandbox] TO [sls_svc];
GRANT ALTER ON SCHEMA::[sandbox] TO [sls_svc];
GRANT CONTROL ON SCHEMA::[sandbox] TO [sls_svc];
GRANT CREATE TABLE TO [sls_svc];
-- Identifiers that contain hyphens must be wrapped in square brackets
CREATE EXTERNAL TABLE [sandbox].[revenue-by-month]
WITH (
LOCATION = '/ROW/revenue-by-month/',
DATA_SOURCE = [ADLS-ROW],
FILE_FORMAT = EF_PARQUET
)
AS
SELECT * FROM table1;
All users in the sls_svc role have STORAGE DATA CONTRIBUTOR access on the READ-WRITE container (ROW).
Below are the error messages I am getting
I also tried creating a new database, hoping that I could grant full access on that database so that a cross-database query would work, but I had no luck there either.
Any thoughts?
It seems that you have set the permissions correctly: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-overview?tabs=impersonation#permissions
Are you sure that you can successfully execute the SELECT statement on its own, and that the issue is not in the SELECT part?
Granting CONNECT on the database that was created, plus DDL_ADMIN access, resolved the issue.
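For readers landing here, the fix described above might look roughly like the statements below. This is a sketch, not the poster's exact commands: it assumes "DDL_ADMIN" refers to the db_ddladmin fixed database role and that sls_svc is the principal from the question.

```sql
-- Run in the database that was created for the write-back tables.
GRANT CONNECT TO [sls_svc];
-- Assumption: "DDL_ADMIN" means the db_ddladmin fixed database role.
ALTER ROLE db_ddladmin ADD MEMBER [sls_svc];
GO
```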

azure sql user permission

I am trying to run the following script to create a credential for Azure Blob Storage in the Azure master database with the database admin account, but it gives me the error 'User does not have permission to perform this action'.
Script:
USE master
CREATE DATABASE SCOPED CREDENTIAL 'XYZ'
WITH IDENTITY='SHARED ACCESS SIGNATURE',
SECRET = 'shared access signature of blob storage'
GO
Any idea?
Thanks in advance
You should create the database scoped credential in your user database.
There are restrictions on 'master' in Azure SQL Database, even for the DBA.
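A minimal sketch of what that could look like, assuming you open the connection directly against the user database (Azure SQL Database does not let you switch databases with USE). The password and SAS token are placeholders, and the CREATE MASTER KEY step should be skipped if the database already has a master key.

```sql
-- Connect directly to the user database (not master) before running this.
-- A database master key must exist before a database scoped credential
-- can be created; skip this statement if one already exists.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<EnterStrongPasswordHere>';
GO
-- The credential name is an identifier, so it is not wrapped in single quotes.
CREATE DATABASE SCOPED CREDENTIAL [XYZ]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<SAS token without the leading ?>';
GO
```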