Attempting to read parquet files on linked storage in Azure Synapse - azure-data-lake

I am attempting to give access to parquet files on a Gen2 Data Lake container. I have owner RBAC on the container but would prefer to limit access in the container for other users.
My Query is very simple:
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'https://aztsworddataaipocacldl.dfs.core.windows.net/pocacl/Top/Sub/part-00006-c62926ba-c530-4ad8-87d1-cf38c67a2da3-c000.snappy.parquet',
    FORMAT = 'PARQUET'
) AS [result]
When I run this I have no problems connecting. I have attempted to add ACL rights onto the files (and of course the containing folders 'Top' and 'Sub').
I've given RWX on the 'Top' folder using Storage Explorer, and set the Default ACL so that it cascades to the 'Sub' folder and the parquet files as I add them.
When my colleague attempts to run the SQL script, they get the error message: Failed to execute query. Error: File 'https://aztsworddataaipocacldl.dfs.core.windows.net/pocacl/Top/Sub/part-00006-c62926ba-c530-4ad8-87d1-cf38c67a2da3-c000.snappy.parquet' cannot be opened because it does not exist or it is used by another process.
NB: similar results are also experienced in Spark, but with a 403 instead.
SQL on-demand provides a link to the following help file after the error; it suggests:
If your query fails with the error saying 'File cannot be opened because it does not exist or it is used by another process' and you're sure both files exist and are not used by another process, it means SQL on-demand can't access the file. This problem usually happens because your Azure Active Directory identity doesn't have rights to access the file. By default, SQL on-demand is trying to access the file using your Azure Active Directory identity. To resolve this issue, you need to have proper rights to access the file. The easiest way is to grant yourself the 'Storage Blob Data Contributor' role on the storage account you're trying to query.
I don't wish to grant Storage Blob Data Contributor or Storage Blob Data Reader as this gives access to every file on the container and not just those I want end users to be able to query. We have found the same experience occurs for SSMS connecting to parquet external tables.
So then in parts:
Is this the correct pattern using ACL to grant access, or should I use another method?
Are there settings on the Storage Account or within my query/notebook that I should be enabling to support ACL?*
Has ACL been implemented on Synapse Workspace to date given that we're still in preview?
*I have resisted pasting my entire settings as I really have no idea what is relevant and what is entirely irrelevant to this issue, but of course I can supply them.

It would appear that the ACL feature was not working correctly in Preview for Azure Synapse Analytics.
I have now managed to get it to work. At present I see that once Read|Execute is provided on a folder, it allows access to the files contained within that folder and its sub-folders. Access is available even when no specific ACL access is provided on a file in a sub-folder. This is not quite what I expected; however, it provides enough for me to proceed: only giving access to the Gold folder allows for separating the files I want to let users query from the working files that I want to keep hidden.
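For completeness, since the question also asks whether another method should be used: a commonly documented alternative in the serverless SQL pool is to scope access with a SAS-based database scoped credential and an external data source, instead of relying on each caller's Azure Active Directory identity and ACLs. The following is only an illustrative sketch; the credential name, data source name, master key password and SAS token are placeholders, and the SAS itself would be issued with the narrowest scope you can manage.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';
GO
CREATE DATABASE SCOPED CREDENTIAL PocAclSas
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<SAS token scoped to the data you want to expose>';
GO
CREATE EXTERNAL DATA SOURCE PocAcl
WITH (
    LOCATION = 'https://aztsworddataaipocacldl.dfs.core.windows.net/pocacl',
    CREDENTIAL = PocAclSas
);
GO
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'Top/Sub/part-00006-c62926ba-c530-4ad8-87d1-cf38c67a2da3-c000.snappy.parquet',
    DATA_SOURCE = 'PocAcl',
    FORMAT = 'PARQUET'
) AS [result];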

When you assign an ACL to a folder, it is not propagated recursively to the files already inside the folder. Only new files inherit from the folder.

Go to Azure Storage Explorer, change the ACL permissions on the root folder, then right-click on your storage and click on "Propagate Access Control Lists".

Related

Copy files to local drive that requires different credentials

I've seen a lot of answers on copying files that use code to set up a network share, with credentials, in order to copy to somewhere else. However, I need a solution that will allow a user to copy from a network share they already have access to, onto a local drive they don't have access to.
We run RDS servers and have locked down direct access to the local C:/ drive on the servers. We have been given a 3rd party program that needs to read data files that must be stored in a fixed path on the C:/ drive. These data files are updated once a month. Our users have read access but we do not want to give them direct write access to the root C:/ drive.
I need to write a piece of VB.NET, or command-line code in a .bat file, that will copy files to the local C:/ whilst providing the details of a service account to provide the access.
As mentioned, I've seen a lot about setting up a mapping to a shared folder and passing credentials; however, we don't want to set up the C:/ as a mapped shared drive in this instance.
You don't want the user having access to the C:/ drive in general, but is there any particular reason the permissions on the particular subfolder the files are going to can't be overridden to allow writing to just that folder?
If that will not work, the first thought that comes to mind is a helper program that can be run under a different user that does have that access. Set up an intermediate folder the user can write to; the user (or the program they launch) drops the files into that folder, and the helper program watches for files in the intermediate folder and moves them to where they need to be.
Setup would need to include adding a user that does have access to both locations, and then adding a scheduled task to launch the helper program under that other user at login.

Check folder and files permission via T-SQL

I want to check all the file and folder permissions in T-SQL.
For example:
Folder name: Root
Items inside the root are File1, file2, folder1
I want the list of users who have permission on these files and folders, in T-SQL.
To answer your question: yes, it can be done; however, that will require you to open up permissions that are so awful I'll not tell you how.
If you absolutely must do this, then creating an External Access assembly using .NET and calling that is your answer. If you traverse this road, do NOT go the 'Trustworthy' route and bypass security. Create an asymmetric key and a user, and sign your code accordingly.
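A minimal sketch of the key-signing route, assuming a hypothetical strong-name-signed assembly called FolderPermissions.dll (all names and paths below are placeholders, not a tested implementation):
USE master;
GO
-- Create an asymmetric key from the signed assembly's public key
CREATE ASYMMETRIC KEY FolderPermsKey
    FROM EXECUTABLE FILE = 'C:\assemblies\FolderPermissions.dll';
GO
-- Create a login from that key and grant it the external-access right,
-- so TRUSTWORTHY can stay OFF on the user database
CREATE LOGIN FolderPermsLogin FROM ASYMMETRIC KEY FolderPermsKey;
GRANT EXTERNAL ACCESS ASSEMBLY TO FolderPermsLogin;
GO
-- Register the assembly with the EXTERNAL_ACCESS permission set
USE YourDatabase;
GO
CREATE ASSEMBLY FolderPermissions
    FROM 'C:\assemblies\FolderPermissions.dll'
    WITH PERMISSION_SET = EXTERNAL_ACCESS;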
Although NOT recommended, you can use xp_cmdshell to query the underlying OS/file-system from within SSMS (SQL Server Management Studio).
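For example, a quick sketch (xp_cmdshell is disabled by default and has to be switched on first; the UNC path below is just a placeholder):
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1; RECONFIGURE;
-- List the folder contents as the SQL Server service account sees them
EXEC xp_cmdshell 'dir "\\server\folder_to_check"';
-- Or dump the folder's permissions
EXEC xp_cmdshell 'icacls "\\server\folder_to_check"';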
If you need to check whether a folder/UNC path is accessible from within SSMS, place a small database-backup file (.bak) there, then use a FILELISTONLY restore to simply read it, e.g.:
RESTORE FILELISTONLY FROM DISK = '\\folder_to_check\db.bak' --this will only read the file (without performing the restore operation)
If the above succeeds in reading the .bak file from your <folder_to_check> folder, it means the folder in question is accessible (via T-SQL / from within SSMS).
If not, grant access (such as READ/WRITE access on that folder) to the service account that runs your SQL instance, which is normally a local system account or an AD service account.
To obtain this service account's name, view the Properties of the SQL Server service in "Windows Services" (services.msc) or "SQL Server Configuration Manager" (SQLServerManager<your_SQLServer_Version_number>.msc); alternatively, you can run the following query:
select * from sys.dm_server_services --This will list the account names that run the SQL Server engine service, SQL Agent service, Full-Text Search service, etc.
HTH.

Save and access file from shared drive

I am running a file upload process to upload files to a db. The web server and the SQL server are different machines. I am attempting to use an SQL OPENROWSET to upload an excel file, but I cannot determine how to get the file onto the other machine. Is there a way to set up a shared drive that the web server can save a file to and the SQL server can access? We have a local network set up with Active Directory.
For Example:
WebServer - Shared drive on web server under C:/inetpub/webpage/fileImport
SQLServer - Will log in with sql auth using USERID and PASSWD. Needs to access webserver shared drive.
What user do I share the drive on web server with so that the sql auth user will be able to access it when I run the OPENROWSET?
Any help will be much appreciated.
I am also trying the same thing, by uploading the file to FTP and then trying to access it, but I haven't made any progress in the last two weeks.
I have found other alternatives, like copying the files to another server and sharing the folder without a username and password; then it can be accessed by giving the
\\folder\filename
path. If you find any other alternative, please share...
You should set up a new user that belongs to the iis_users group, and then give them security access to the file drive itself.
The same should be done on the DB server, and in the shared folder's security settings that user will need Read/Write/Modify permissions.
So it will look like:
(WebHost) ---- (Shared) ---- (DBHost)
Well, you would set up the folder on the SQL Server:
Create a secure user on the SQL machine.
Make the folder shared (with Modify rights for the secure user).
Map the network drive on the Windows machine (the web server), using the secure user to access it.
Your main user on the SQL Server should then be able to OPENROWSET from the local folder, whilst the IIS server accesses it remotely.
Using OPENROWSET means that SQL Server will access files using the service account. This account must be the one granted access to the shared drive, as stated in "Using SQL Credential to Open a file with OpenRowSet".
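As a rough sketch of what that OPENROWSET call could look like against the web server's share (the provider, share path, file name and sheet name are all placeholders, and 'Ad Hoc Distributed Queries' must be enabled on the instance):
SELECT *
FROM OPENROWSET(
    'Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=\\WebServer\fileImport\upload.xlsx;HDR=YES',
    'SELECT * FROM [Sheet1$]'
);
-- This runs as the SQL Server service account, so that account (not the
-- SQL-auth login) is what needs read access to \\WebServer\fileImport.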

Check who has logged in using SQL Server 2000 trc files

I'm trying to go through multiple .trc files to find out who has been logging into SQL Server over the last few months. I didn't set up the trace, but what I've got are a bunch of .trc files,
ex:
C:\SQLAuditFile2012322132923.trc,
C:\SQLAuditFile201232131931.trc
etc.
I can load these files into SQL Profiler and look at them individually, but I was hoping for a way to load them all up so that I can quickly scan them for logins, either by using a filter or, better yet, by loading them into a SQL Server table and querying them.
I tried loading the files into a table using:
use <databasename>
GO
SELECT * INTO trc_table
FROM ::fn_trace_gettable('C:\SQLAuditFile2012322132923.trc', 10);
GO
But when I do this, I get the error message:
File 'C:\SQLAuditFile2012322132923.trc' either does not exist or is not a recognizable trace file. Or there was an error opening the file.
However, I know the file exists, and I have the correct name. Also they appear to be recognizable because I can load them up into SQL Profiler and view them fine.
Anybody have an idea why I'm getting this error message, and if this won't work, perhaps another way of analyzing these multiple .trc files more easily?
Thanks!
You may be having permissions issues on the root of C:. Try placing the file into a subfolder, e.g. c:\tracefiles\, and ensuring that the SQL Server account has at least explicit read permissions on that folder.
Also try starting simpler, e.g.
SELECT * FROM ::fn_trace_gettable('C:\SQLAuditFile2012322132923.trc', default);
Anyway, unless you were explicitly capturing successful login events, I don't know that these trace files are going to contain the information you're looking for; this isn't something SQL Server tracks by default.
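If the trace does include login audit events, a query along these lines (a sketch; the path is a placeholder, and EventClass 14 is the Audit Login event) will pull them out, following the rollover files automatically when DEFAULT is passed for the file count:
SELECT t.StartTime, t.LoginName, t.HostName, t.ApplicationName
FROM ::fn_trace_gettable('C:\tracefiles\SQLAuditFile2012322132923.trc', DEFAULT) AS t
WHERE t.EventClass = 14   -- Audit Login
ORDER BY t.StartTime;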
I had pretty much the same issue and thought I'd copy my solution from
Database Administrators.
I ran an SQL trace on a remote server and transferred the trace files to a
local directory on my workstation so that I could load the data into a table on my
local SQL Server instance to run queries against.
At first I thought the error might be permission-related, but I ruled this
out since I had no problem loading the .trc files directly into SQL Profiler
or as a file into SSMS.
After trying a few other ideas, I thought about it a bit more and realised
that it was due to permissions after all: the query was being run by the SQL
Server process (sqlservr.exe) as the user NT AUTHORITY\NETWORK SERVICE,
not my own Windows account.
The solution was to grant Read and Execute permissions to NETWORK
SERVICE on the directory that the trace files were stored in and the trace
files themselves.
You can do this by right-clicking on the directory, go to the Security
tab, add NETWORK SERVICE as a user and then select Read & Execute for
its Permissions (this should automatically also select Read and
List folder contents). These file permissions (ACLs) should automatically
propagate to the directory contents.
If you prefer to use the command line, you can grant the necessary permissions to
the directory – and its contents – by running the following:
icacls C:\Users\anthony\Documents\SQL_traces /t /grant "Network Service:(RX)"

Why does SQL Server restrict the locations from which you can attach or restore a database?

I'm assuming it's some sort of security constraint, but if I have access to all folders on a PC, why allow some folders and not others?
What is the criteria for a folder being a valid backup / restore / attach folder?
Any advice appreciated!
It is not you who must have access, but the SQL Server service account. The engine must be able to attach the file after a restart when you are not logged in, so it cannot use your credentials; it must use its own.
A valid backup/attach folder is one on which the SQLServerMSSQLUser$ComputerName$InstanceName user has full control. Setup creates a set of folders that are correctly configured; see Setting Up Windows Service Accounts:
Instid\MSSQL\backup: Full control
Instid\MSSQL\binn: Read, Execute
Instid\MSSQL\data: Full control
Instid\MSSQL\FTData: Full control
Instid\MSSQL\Install: Read, Execute
Instid\MSSQL\Log: Full control
Instid\MSSQL\Repldata: Full control
100\shared: Read, Execute
Instid\MSSQL\Template Data (SQL Server Express only): Read
It matters less what folders you have access to than what folders SQL Server has (or should have) access to. Folders in private locations on the drive (like in a user's home directory) aren't necessarily accessible by the user that SQL Server runs as.
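A quick way to check which folders the engine can definitely use, and which account is actually doing the file access, is a sketch like the following (the InstanceDefault*Path properties require SQL Server 2012 or later):
-- The instance's configured default data and log locations
SELECT SERVERPROPERTY('InstanceDefaultDataPath') AS DefaultDataPath,
       SERVERPROPERTY('InstanceDefaultLogPath')  AS DefaultLogPath;
-- The account that actually touches the files during attach/restore
SELECT servicename, service_account
FROM sys.dm_server_services;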