Azure Synapse Analytics: error when using saveAsTable on a DataFrame loaded from a SQL source - azure-synapse

I'm following the guide (https://learn.microsoft.com/en-us/azure/synapse-analytics/get-started) for loading data from a SQL Pool and writing the DataFrame to a table in the metastore. However, I'm getting an error:
Error : org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsRestOperationException: Operation failed: "This request is not authorized to perform this operation using this permission.", 403, PUT, https://xxx.dfs.core.windows.net/tempdata/synapse/workspaces/xxx/sparkpools/SparkPool/sparkpoolinstances/8f3ec14a-1e59-4597-8fd9-42da0db65331?action=setAccessControl&timeout=90, AuthorizationPermissionMismatch, "This request is not authorized to perform this operation using this permission. RequestId:fe61799c-e01f-0003-119e-37fdb1000000 Time:2020-05-31T22:57:55.8271281Z"
I've replaced my resource names with xxx.
Other DataFrame saveAsTable operations work fine. From what I can see, the data is being read from the SQL Pool successfully and staged, because when I browse the data lake location referenced in the error I can see the data:
/tempdata/synapse/workspaces/xxx/sparkpools/SparkPool/sparkpoolinstances/8f3ec14a-1e59-4597-8fd9-42da0db65331
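For reference, what I'm running is essentially the pattern from the guide; a rough sketch is below (resource and table names are placeholders, and the connector method name depends on the Synapse Spark runtime version).

# Rough sketch of the pattern from the guide (names are placeholders).
# In a Synapse notebook, `spark` is the pre-defined SparkSession.
# Depending on the Spark runtime, the dedicated SQL pool connector is exposed
# as spark.read.synapsesql(...) (newer runtimes) or the Scala-only
# sqlanalytics(...) method (older runtimes).

# Read a table from the dedicated SQL pool into a DataFrame.
df = spark.read.synapsesql("SQLPOOL1.dbo.Trip")

# Write the DataFrame to a table in the Spark metastore. This is the step that
# fails with 403 AuthorizationPermissionMismatch against the staging path in
# the workspace's ADLS Gen2 account.
df.write.mode("overwrite").saveAsTable("default.trip")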
The Synapse workspace managed identity has Storage Blob Data Contributor permissions, and my own domain account has owner access.
Has anyone else had issues?
Thanks
Andy

Please assign yourself (the account you're using to run the script) the Storage Blob Data Contributor role.
This information now shows up during the creation of an Azure Synapse workspace.
It was a big struggle to figure this out during its private preview.
More information about securing a Synapse workspace can be found here.
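If you prefer to script the role assignment instead of using the portal, something along these lines should work. This is only a sketch: the subscription, resource group, storage account and object ID are placeholders, and the azure-mgmt-authorization model shapes have changed between package versions, so check against the version you have installed.

# Sketch: grant "Storage Blob Data Contributor" on the workspace's ADLS Gen2
# storage account to the account running the script. All IDs are placeholders.
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

subscription_id = "<subscription-id>"
scope = (
    f"/subscriptions/{subscription_id}"
    "/resourceGroups/<resource-group>"
    "/providers/Microsoft.Storage/storageAccounts/<storage-account>"
)
# Built-in role definition ID for Storage Blob Data Contributor
# (verify against the Azure built-in roles documentation).
role_definition_id = (
    f"/subscriptions/{subscription_id}/providers/Microsoft.Authorization/"
    "roleDefinitions/ba92f5b4-2d11-453d-a403-e96b0029c9fe"
)

client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)
client.role_assignments.create(
    scope=scope,
    role_assignment_name=str(uuid.uuid4()),
    parameters=RoleAssignmentCreateParameters(
        role_definition_id=role_definition_id,
        principal_id="<your-account-object-id>",
        principal_type="User",
    ),
)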
Let me know if this worked.
Thank you.

Related

Migrating data from CSV file on Azure Blob Storage to Synapse Analytics (serverless pool)

I have a problem running a pipeline in Azure Data Factory (migrating a CSV file stored on Azure Blob Storage to Synapse Analytics).
It worked fine with a dedicated pool, but I can't get it to work with the built-in serverless pool.
I created a run-once pipeline on adf.azure.com with the UI creator.
On the "Source Data Store" tab I chose Source type: Azure Blob Storage, then I chose the appropriate connection, pressed Browse to pick the desired file, left the "Recursively" option on, and pressed Next.
The next tab is "File format settings"; here I chose Advanced options and changed the escape character from backslash to double quote.
I pressed Next to the "Destination data store" tab, chose target type: Azure Synapse Analytics, chose the connection, and specified the target table name.
The next tab has the column mapping, where I unchecked type conversion; on the following tab (Copy Data tool settings) I selected PolyBase as the copy method.
I enabled the staging blob, selected the linked Azure Storage service and blob container, then pressed Next and finished creating the pipeline.
The error message that I received:
Operation on target Copy_ky9 failed:
ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed with the following error: 'Incorrect syntax near 'HEAP'.',Source=,''Type=System.Data.SqlClient.SqlException,Message=Incorrect syntax near 'HEAP'.,Source=.Net SqlClient Data Provider,SqlErrorNumber=102,Class=15,ErrorCode=-2146232060,State=1,Errors=[{Class=15,Number=102,State=1,Message=Incorrect syntax near 'HEAP'.,},],'
Since I used the UI to create the pipeline, I don't know how to check the syntax; I guess it internally generates some command, but I couldn't find an option to preview it and fix its syntax.
A dedicated SQL pool in Azure Synapse Analytics has built-in storage, so you can load data into a table in a dedicated SQL pool. A serverless SQL pool has no storage of its own; it is just a metadata layer for views over files in storage, and it can read and write files in Azure storage.
I would stop having ADF load it and just build a view in Synapse Serverless SQL.
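For example, a view over the CSV file can be created directly in the serverless pool with OPENROWSET. A rough sketch via pyodbc follows; the server, database, file URL and authentication details are placeholders, the CSV options depend on your file, and the serverless pool still needs permission to read the file (for example via its managed identity or a database-scoped credential).

# Sketch: create a serverless SQL view over a CSV file in the data lake
# instead of copying the data into a pool.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=<workspace>-ondemand.sql.azuresynapse.net;"
    "Database=<serverless-db>;"
    "Authentication=ActiveDirectoryInteractive;"
    "UID=<user@domain.com>;",
    autocommit=True,
)

create_view = """
CREATE VIEW dbo.csv_data AS
SELECT *
FROM OPENROWSET(
    BULK 'https://<account>.dfs.core.windows.net/<container>/<path>/file.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS rows;
"""

conn.cursor().execute(create_view)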

How to get Azure SQL transactional log

How do I get the transaction logs for an Azure SQL database? I'm trying to find the log in the Azure portal but am not having any luck.
If there is no way to get the log, where is that stated in the Microsoft docs? Any help is appreciated.
You don't, as it is not exposed in the service. Please step back and describe what problem you'd like to solve. If you want a DR solution, for example, then active geo-replication can solve this for you as part of the service offering.
The log format in Azure SQL DB is constantly changing and is "ahead" of the most recent version of SQL Server. So, it is probably not useful to expose the log (the format is not documented). Your use case will likely determine the alternative question you can ask instead.
Azure SQL Database auditing tracks database events and writes them to an audit log in your Azure storage account, or sends them to Event Hub or Log Analytics for downstream processing and analysis.
Blob audit
Audit logs stored in Azure Blob storage are kept in a container named sqldbauditlogs in the Azure storage account. The directory hierarchy within the container is of the form <ServerName>/<DatabaseName>/<AuditName>/<Date>/. The blob file name format is <CreationTime>_<FileNumberInSession>.xel, where CreationTime is in UTC hh_mm_ss_ms format and FileNumberInSession is a running index in case session logs span multiple blob files.
For example, for database Database1 on Server1 the following is a possible valid path:
Server1/Database1/SqlDbAuditing_ServerAudit_NoRetention/2019-02-03/12_23_30_794_0.xel
Audit logs of read-only replicas are stored in the same container. The directory hierarchy within the container is of the form <ServerName>/<DatabaseName>/<AuditName>/<Date>/RO/, and the blob file name shares the same format.
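If what you actually need is to query those audit records, the .xel files can be read back from the database with sys.fn_get_audit_file. A rough sketch is below; the connection string and the storage path are placeholders, and the exact path should follow the hierarchy described above.

# Sketch: read Azure SQL audit records from the sqldbauditlogs container
# using sys.fn_get_audit_file. Connection details and URL are placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=<server>.database.windows.net;"
    "Database=<database>;"
    "UID=<user>;PWD=<password>;"
)

query = """
SELECT event_time, action_id, succeeded, statement, server_principal_name
FROM sys.fn_get_audit_file(
    'https://<storageaccount>.blob.core.windows.net/sqldbauditlogs/<server>/<database>/',
    DEFAULT, DEFAULT);
"""

for row in conn.cursor().execute(query):
    print(row.event_time, row.action_id, row.statement)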

Data Factory New Linked Service connection failure ACL and firewall rule

I'm trying to move data from a data lake stored in Azure Data Lake Storage Gen1 to a table in an Azure SQL database. In Data Factory, when I test the connection for a new linked service, I get a "connection failed" error message: "Access denied...make sure ACL and firewall rule is correctly configured in the Azure Data Lake Store account." I have tried numerous times to correct this using related Stack Overflow comments and a plethora of fragmented Azure documentation, to no avail. Am I using the correct approach, and if so, how do I fix the issue?
Please follow these steps:
First: go to ADF and create the new linked service, then copy the Managed identity object ID.
Second: go to Azure Data Lake Storage Gen1 and navigate to Data Explorer -> Access -> click Select in the 'Select User or group' field.
Finally: paste the Managed identity object ID, grant the permissions the pipeline needs (Read and Execute, plus Write if it writes back), and then test your connection in ADF.
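The same grant can also be scripted with the azure-datalake-store package instead of Data Explorer; a rough sketch is below, assuming service principal credentials with rights to manage ACLs. The tenant/app credentials, store name, path and the managed identity object ID are placeholders.

# Sketch: grant the Data Factory managed identity access to ADLS Gen1 via ACLs.
from azure.datalake.store import core, lib

token = lib.auth(
    tenant_id="<tenant-id>",
    client_id="<app-id-with-rights-to-manage-acls>",
    client_secret="<app-secret>",
)
adl = core.AzureDLFileSystem(token, store_name="<datalake-store-name>")

# Give the ADF managed identity read+execute on the folder it reads from
# (add write if the pipeline writes back). Items created before this grant
# may need the ACL applied to them as well.
adl.modify_acl_entries(
    "/path/to/data",
    acl_spec="user:<adf-managed-identity-object-id>:r-x",
)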

GBQexception: How to read data with big query that is stored on google drive spreadsheet

I uploaded a dataset to BigQuery via the Google Drive option, linking the Google spreadsheet to a dataset which I call 'dim_table'.
I then created a query, which I run daily, to pull data from that dim_table dataset.
I am trying to create an automated script that will run the same query against the dim_table dataset and create a new dataset called chart_A.
When I run this simple code:
import pandas_gbq as gbq
gbq.read_gbq("Select * from data.dim_stats",'ProjectID')
I get an error:
GenericGBQException: Reason: 403 Access Denied: BigQuery BigQuery: No OAuth token with Google Drive scope was found.
I have been trying to read the pandas-gbq documentation but could not find anything that shows how to authenticate Google Drive access with pandas-gbq or use OAuth. Any help is appreciated! :)
Let me know if you need me to come up with a sample table online for testing.
best
I haven't used pandas-gbq, but authentication methods for BigQuery are mentioned here [1].
Create a service account with a BigQuery role that can access your datasets [2].
Create and download the service account's JSON key [3].
Set the private_key parameter to the file path of the JSON key, or to a string containing the JSON contents.
A related guide on querying Google Drive data without pandas-gbq is here [4].
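In newer pandas-gbq versions you can also pass a credentials object instead of private_key; the key point for the error above is to include the Google Drive scope. A rough sketch, with the key file path, project ID and table name as placeholders:

# Sketch: query a BigQuery table backed by a Google Sheet with pandas-gbq,
# using service account credentials that include the Drive scope.
# The spreadsheet must also be shared with the service account's email.
import pandas_gbq as gbq
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "service-account-key.json",
    scopes=[
        "https://www.googleapis.com/auth/bigquery",
        "https://www.googleapis.com/auth/drive",  # needed for Drive-backed tables
    ],
)

df = gbq.read_gbq(
    "SELECT * FROM data.dim_stats",
    project_id="ProjectID",
    credentials=credentials,
)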

Error trying to move data from Azure table to DataLake store with DataFactory

I've been building a Data Factory pipeline to move data from my Azure Table storage to a Data Lake store, but the tasks fail with an exception that I can't find any information on. The error is:
Copy activity encountered a user error: ErrorCode=UserErrorTabularCopyBehaviorNotSupported,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=CopyBehavior property is not supported if the source is tabular data source.,Source=Microsoft.DataTransfer.ClientLibrary,'.
I don't know whether the problem lies in the datasets, the linked services or the pipeline, and I can't seem to find any information at all on the error I'm seeing in the console.
Since copying from Azure Table Storage directly to Azure Data Lake Store is not currently supported, as a temporary workaround you could go from Azure Table Storage to Azure Blob Storage, and then from Azure Blob Storage to Azure Data Lake Store:
Azure Table Storage to Azure Blob Storage
Azure Blob Storage to Azure Data Lake Store
I know this is not an ideal solution, but if you are under time constraints it is just an intermediate step to get the data into the data lake.
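Purely as an illustration (outside ADF), the first hop could also be scripted with the Python storage SDKs; a rough sketch using the azure-data-tables and azure-storage-blob packages, with connection strings, table and container names as placeholders:

# Sketch: dump Azure Table Storage entities to a CSV blob as the intermediate
# step before loading into Data Lake Store.
import csv
import io

from azure.data.tables import TableServiceClient
from azure.storage.blob import BlobServiceClient

tables = TableServiceClient.from_connection_string("<table-storage-connection-string>")
entities = list(tables.get_table_client("MyTable").list_entities())

# Flatten the entities into CSV in memory (column set is the union of keys).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=sorted({k for e in entities for k in e}))
writer.writeheader()
writer.writerows(entities)

blobs = BlobServiceClient.from_connection_string("<blob-storage-connection-string>")
blobs.get_blob_client("staging", "mytable.csv").upload_blob(
    buffer.getvalue(), overwrite=True
)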
HTH
The 'CopyBehavior' property is not supported for Table storage (which is not a file-based store) when you use it as a source in an ADF copy activity. That is why you are seeing this error message.