Copy files from Azure Blob storage to SharePoint Document Library

I cannot find a way to copy files/folders from Blob storage to a SharePoint document library. So far, I've tried AZCopy and PowerShell:
*AZCopy cannot connect to SharePoint as the destination
*PowerShell works for local files, but the script cannot connect to Blob storage (Blob storage cannot be mapped as a network drive)

For anyone else who needs to do this, AZCopy worked; I just had to use a different destination. When you map a SharePoint document library as a network drive, Windows assigns a drive letter but also shows the UNC path. That's what you have to use:
/Dest:"\\Tenant.sharepoint.com#SSL\DavWWWRoot\Sites\sitename\library"
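If you prefer to script the copy instead of using AZCopy, something along these lines should also work: a rough Python sketch using the azure-storage-blob package, where the connection string and container name are placeholders and the destination is the same mapped library path as above.

from pathlib import Path
from azure.storage.blob import ContainerClient

# Placeholders: supply your own connection string and container name.
container = ContainerClient.from_connection_string(
    conn_str="<storage-connection-string>",
    container_name="mycontainer",
)
# Either the mapped drive letter (e.g. Path("S:/")) or the library's UNC path shown above.
dest_root = Path(r"\\Tenant.sharepoint.com#SSL\DavWWWRoot\Sites\sitename\library")

for blob in container.list_blobs():
    target = dest_root / blob.name                    # keep the blob's folder structure
    target.parent.mkdir(parents=True, exist_ok=True)  # create subfolders as needed
    target.write_bytes(container.download_blob(blob.name).readall())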

Related

How to copy a file into a SharePoint document library folder using VBA?

I'm working on an Access DB that stores files in a network location rather than as attachments in Access. The code is supposed to copy a file from a local drive to a network location using
FileCopy strSource, strDest
I would like to use a SharePoint document library as the target storage location.
How can I provide the path to a SharePoint document library folder?
I tried:
Providing the URL of the folder
"https://name.sharepoint.com/sites/SiteName/Shared Documents/"
with different variations of forward and back slashes and "%20" for spaces.
Mapping the SP folder as a network drive is unsuccessful; I get an error message. The regular (non-SharePoint) network drive address works without problems.
Try using the Sync feature in SharePoint to sync the library to a local folder on your PC. Copy files to that local location and the sync client will automatically upload them to your Teams channel.

Create Blob storage UNC path in cloud

I use Blob storage as the file storage account in a .NET Core web application hosted on Azure App Service (PaaS).
My requirement is to create .zip files and then attach them to emails, which requires a UNC path for the attachment.
One option I have is to use App Service local storage for creating the temporary file and attaching it from there.
I am looking for another option: can I map Blob storage to a virtual drive in the cloud and get its UNC path, or is there some other approach?
Also, can you please suggest the possible options for mapping an Azure Blob storage drive on a network? I know of the following: App Service local storage, a VM, and a local machine network drive.
First of all, keep the concept of a UNC path in mind: an Azure web app is essentially a virtual machine, and Azure Blob storage is a separate service running elsewhere, so it is not feasible to attach mail directly from Azure Blob.
Suggestions:
1. From what I have found, you can try Azure Files to store the files and use them; an Azure file share can be mounted over SMB and exposes a UNC path.
2. Download the file to the project directory. I think this is the fastest way, since it does not use any other Azure products. Create a temporary folder, such as MailTempFolder, download the file from the blob into it, and you then have a local path to use for the mail attachment.
After the send succeeds, just delete the file; it will not take up much space on the Azure web app, and even if the send fails you still have the zip file without having to download it again.
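A minimal sketch of suggestion 2, shown in Python for illustration (the same flow applies with the .NET Azure.Storage.Blobs SDK); the connection string, container, blob, SMTP server, and addresses are all placeholders.

import os
import smtplib
from email.message import EmailMessage
from azure.storage.blob import BlobClient

# Placeholders: connection string, container, and blob name are assumptions.
blob = BlobClient.from_connection_string(
    conn_str=os.environ["STORAGE_CONNECTION_STRING"],
    container_name="reports",
    blob_name="report.zip",
)

# 1. Download the blob into a temporary folder (e.g. MailTempFolder).
os.makedirs("MailTempFolder", exist_ok=True)
local_path = os.path.join("MailTempFolder", "report.zip")
with open(local_path, "wb") as f:
    f.write(blob.download_blob().readall())

# 2. Attach the local file to the mail.
msg = EmailMessage()
msg["Subject"] = "Report"
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
with open(local_path, "rb") as f:
    msg.add_attachment(f.read(), maintype="application", subtype="zip",
                       filename="report.zip")

with smtplib.SMTP("smtp.example.com") as smtp:
    smtp.send_message(msg)

# 3. Delete the temporary file once the send succeeds.
os.remove(local_path)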

Attempting to read parquet files on linked storage in Azure Synapse

I am attempting to give access to parquet files on a Gen2 Data Lake container. I have owner RBAC on the container but would prefer to limit access in the container for other users.
My query is very simple:
SELECT TOP 100 *
FROM OPENROWSET(
        BULK 'https://aztsworddataaipocacldl.dfs.core.windows.net/pocacl/Top/Sub/part-00006-c62926ba-c530-4ad8-87d1-cf38c67a2da3-c000.snappy.parquet',
        FORMAT = 'PARQUET'
    ) AS [result]
When I run this I have no problems connecting. I have attempted to add ACL rights onto the files (and of course the containing folders 'Top' and 'Sub').
I've given RWX on the 'Top' folder using Storage Explorer, set as default so that it cascades to the 'Sub' folder and the parquet files as I add them.
When my colleague attempts to run the SQL script, they get the error: Failed to execute query. Error: File 'https://aztsworddataaipocacldl.dfs.core.windows.net/pocacl/Top/Sub/part-00006-c62926ba-c530-4ad8-87d1-cf38c67a2da3-c000.snappy.parquet' cannot be opened because it does not exist or it is used by another process.
NB: similar results are also experienced in Spark, but with a 403 instead.
SQL on-demand provides a link to the following help file after the error; it suggests:
If your query fails with the error saying 'File cannot be opened because it does not exist or it is used by another process' and you're sure both files exist and they're not used by another process, it means SQL on-demand can't access the file. This problem usually happens because your Azure Active Directory identity doesn't have rights to access the file. By default, SQL on-demand tries to access the file using your Azure Active Directory identity. To resolve this issue, you need to have proper rights to access the file. The easiest way is to grant yourself the 'Storage Blob Data Contributor' role on the storage account you're trying to query.
I don't wish to grant Storage Blob Data Contributor or Storage Blob Data Reader as this gives access to every file on the container and not just those I want end users to be able to query. We have found the same experience occurs for SSMS connecting to parquet external tables.
So then in parts:
Is this the correct pattern using ACL to grant access, or should I use another method?
Are there settings on the Storage Account or within my query/notebook that I should be enabling to support ACL?*
Has ACL been implemented in the Synapse workspace to date, given that we're still in preview?
*I have resisted pasting my entire settings as I really have no idea what is relevant and what is entirely irrelevant to this issue, but of course I can supply them.
It would appear that the ACL feature was not working correctly in Preview for Azure Synapse Analytics.
I have now managed to get it to work. At present I see that once Read|Execute is provided on a folder, it allows access to the files contained within that folder and its subfolders; access is available even when no specific ACL is set on a file in a subfolder. This is not quite what I expected, but it provides enough for me to proceed: giving access only to the Gold folder separates the files I want to let users query from the working files that I want to keep hidden.
When you assign an ACL to a folder, it is not propagated recursively to the files already inside it; only new files inherit from the folder's default ACL.
To apply it to existing content, go to Azure Storage Explorer, change the ACL permissions on the root folder, then right-click your storage and click "Propagate access control lists".
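If you prefer to script the propagation instead of using Storage Explorer, here is a rough sketch with the azure-storage-file-datalake package; the account URL and container come from the question, while the colleague's AAD object ID is a placeholder. It adds both an access ACL for existing items and a default ACL so new files inherit it.

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://aztsworddataaipocacldl.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
top = service.get_file_system_client("pocacl").get_directory_client("Top")

colleague_oid = "<colleague-AAD-object-id>"  # placeholder
acl = (
    f"user:{colleague_oid}:r-x,"           # access ACL for items that already exist
    f"default:user:{colleague_oid}:r-x"    # default ACL so new files/folders inherit it
)

# Applies the entries recursively to 'Top', 'Sub', and every file beneath them.
# Note: the colleague also needs execute (--x) on the parent folders up to the container root.
top.update_access_control_recursive(acl=acl)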

Is there a way to identify where the file gets downloaded in Azure Databricks when I do web automation using Selenium Python?

I am using Selenium for web automation with Python, running it against a Chrome browser.
I have this set up in Azure Databricks. I want to download an Excel file from a website, which I do by clicking its "Export to Excel" button. If I do the same on my local system the file lands in my machine's Downloads folder, but can anybody help me find where it will be downloaded when it is run from an Azure Databricks notebook?
Is there a way I can download that file directly to Blob storage or any other specific storage? Thanks in advance.
Export to Excel button
exportToExcel = driver.find_element_by_xpath('//*[@id="excelReport"]')
exportToExcel.click()
time.sleep(10)
These are the options available for uploading files to the Databricks File System (DBFS).
Option 1: Use Databricks CLI to upload files from local machine to DBFS.
Steps for installing and configuring Databricks CLI
Once the Databricks CLI is installed, you can use the commands below to copy a file to DBFS:
dbfs cp test.txt dbfs:/test.txt
# Or recursively
dbfs cp -r test-dir dbfs:/test-dir
Option 2: DBFS Explorer for Databricks
DBFS Explorer was created as a quick way to upload and download files to the Databricks filesystem (DBFS). This will work with both AWS and Azure instances of Databricks. You will need to create a bearer token in the web interface in order to connect.
The tool is quite basic; today you can upload, download, create folders, and delete files, and you can drag and drop files from Windows Explorer/Finder.
Option 3: You can upload data to any Azure storage account (Azure Blob Storage, ADLS Gen1/Gen2) and mount a Blob storage container, or a folder inside a container, to the Databricks File System (DBFS). The mount is a pointer to the container, so the data is never synced locally. A minimal mount sketch is shown below.
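This notebook sketch of option 3 assumes placeholder storage account, container, secret scope, and file names (dbutils is only available inside a Databricks notebook). It mounts the container and copies a file that Selenium saved on the driver's local disk into the mount.

storage_account = "mystorageaccount"   # placeholder
container = "mycontainer"              # placeholder

# Mount the Blob container to DBFS (the mount is just a pointer, nothing is synced).
dbutils.fs.mount(
    source=f"wasbs://{container}@{storage_account}.blob.core.windows.net",
    mount_point="/mnt/selenium-downloads",
    extra_configs={
        f"fs.azure.account.key.{storage_account}.blob.core.windows.net":
            dbutils.secrets.get(scope="my-scope", key="storage-key")
    },
)

# Chrome runs on the driver node and saves files to the driver's local filesystem.
# Copy the downloaded export from there into the mounted container:
dbutils.fs.cp("file:/tmp/selenium-downloads/report.xlsx",
              "/mnt/selenium-downloads/report.xlsx")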
Reference: Databricks - Azure Blob storage
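As for where the export lands in the first place: Chrome runs on the cluster's driver node, so the file is saved to a local directory on that node. A hedged Selenium sketch that pins the download directory (the path is an assumption), so you know where to pick the file up from:

from selenium import webdriver

download_dir = "/tmp/selenium-downloads"   # hypothetical fixed folder on the driver node

options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", {
    "download.default_directory": download_dir,  # save exports here
    "download.prompt_for_download": False,       # don't prompt for a location
})
driver = webdriver.Chrome(options=options)
# ...click the "Export to Excel" button as before; the file appears in download_dir,
# from where it can be copied to DBFS or a mounted container (see option 3 above).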

How to create an interaction between Google Drive and AWS S3?

I'm trying to set up a connection between a Google Drive folder and an S3 bucket, but I'm not sure where to start.
I've already created a sort of "Frankenstein process", but it's only easy for me to use, and sharing it with my co-workers is a pain.
I have a script that generates a plain text file and saves it into a Drive folder. To upload, I've installed Drive File Stream to sync it to my Mac, and then I created a Python 3 script that uses the boto3 library to upload the text file to different S3 buckets depending on the file name.
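Roughly, that boto3 routing looks like this; the bucket names, prefix rule, and local path are simplified placeholders, not the real ones.

import boto3
from pathlib import Path

s3 = boto3.client("s3")

# Hypothetical routing rule: pick the bucket from the file-name prefix.
BUCKET_BY_PREFIX = {
    "sales_": "my-sales-bucket",
    "ops_": "my-ops-bucket",
}

def upload(path_str: str) -> None:
    path = Path(path_str)
    bucket = next(
        (b for prefix, b in BUCKET_BY_PREFIX.items() if path.name.startswith(prefix)),
        "my-default-bucket",
    )
    s3.upload_file(str(path), bucket, path.name)

# The file sits in the locally synced Drive File Stream folder.
upload("/Volumes/GoogleDrive/My Drive/exports/sales_2020-01.txt")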
I was thinking that I could create a Lambda to process the file into the S3 buckets, but I can't work out how to create the connection between Drive and S3. I would appreciate it if someone could give me some advice on how to start with this.
Thanks
If you simply want to connect Google Drive and AWS S3, there is a service named Zapier which provides different types of integrations without writing a line of code:
https://zapier.com/apps/amazon-s3/integrations/google-drive
For more details you can check out this link.