Azure Synapse Copy Data from BigQuery: Source ERROR [HY000] [Microsoft][BigQuery] (131) Unable to authenticate with Google BigQuery Storage API

I get this error on the Source tab of a Copy data activity in an Azure Synapse pipeline when Use query is set to Query (the options are Table and Query):
Unable to authenticate with Google BigQuery Storage API.
The strange thing is that I can preview data in the source dataset, and I can also preview data when I select the Table option for Use query.
I can even run a query that lists the tables in the schema:
SELECT
*
FROM
`3082`.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'BASE TABLE'
but I get the authentication error when I select the rows of the table itself:
SELECT
*
FROM
`3082.gcp_billing_export_v1_019F74_6EA5E8_C96548`;

ERROR [HY000] [Microsoft][BigQuery] (131) Unable to authenticate with Google BigQuery Storage API. Check your account permissions
This error comes from a failure to authenticate against the BigQuery Storage API. The permissions required to read data through that API are:
bigquery.readsessions.create
bigquery.readsessions.getData
bigquery.readsessions.update
The BigQuery User role includes these permissions.
Reference:
Google cloud doc on Access Control - BigQuery User.
MS doc on Google BigQuery connector issue

Related

Query data lake from Azure SQL database

I'm just finding my way around Azure, trying to build a modern data warehouse. One thing I haven't been able to figure out is how to query my data lake from an Azure SQL database.
Something similar to the following works in Azure Synapse (note the long-term plan is to remove Synapse due to cost reasons):
SELECT top 100 *
FROM
OPENROWSET( BULK
'https://storageaccount.blob.core.windows.net/container/folder/2022/09/03/filename.parquet'
,SINGLE_BLOB
) AS [result]
But I get the following error running this from an Azure SQL database (in the Azure portal, using the query editor):
Failed to execute the query. Error: Cannot bulk load because the file "https://storageaccount.blob.core.windows.net/container/folder/2022/09/03/filename.parquet" could not be opened. Operating system error code 6(The handle is invalid.).
I also tried the code below after searching on the Internet:
CREATE EXTERNAL DATA SOURCE pocBlobStorage
WITH ( TYPE = BLOB_STORAGE,
LOCATION = 'https://storageaccount.blob.core.windows.net/container/folder/2022/09/03',
CREDENTIAL= sqlblob);
-- Query remote file
SELECT *
FROM OPENROWSET(BULK 'filename.parquet',
DATA_SOURCE = 'pocBlobStorage',
SINGLE_CLOB
--FORMATFILE='currency.fmt',
--FIRSTROW=2
--, FORMATFILE_DATA_SOURCE = 'pocBlobStorage'
) as D
I tried various combinations of the formatting options, but couldn't get anything to work.
The current error I'm getting is: Failed to execute query. Error: Referenced external data source "pocBlobStorage" not found.
I'm wondering whether I need to do something to let the Azure SQL database access my data lake. For example, I haven't configured any credential called 'sqlblob' as used in my last code segment, and I'm not sure where to do this (is it something similar to creating a linked service in Azure Data Factory?).
So how do I query my data lake directly from my Azure SQL database? Is the issue in my query, or do I need to configure access first, and if so, how?

GBQ Data load SSIS ERROR [HY000] [Simba][BigQuery] (131) Unable to authenticate with Google BigQuery Storage API. Check your account permissions

I'm creating an SSIS package that uses an ADO NET Source reading from Google BigQuery and an OLE DB Destination table. When the query result is under 100,000 rows there is no issue and the package executes successfully. The error occurs when the result is above 100,000 rows. Is there any way to fix this so that there is no limit on the number of rows?

Unable to Query Serverless Pool View in Azure Synapse using SQL Admin Credentials

I have set up a Serverless SQL pool in Azure Synapse that is querying a view I had set up of a linked Azure Data Lake.
CREATE VIEW DeviceTelemetryView
AS SELECT corporationid, deviceid, version, Convert(datetime, dateTimestamp, 126) AS dateTimeStamp, deviceData FROM
OPENROWSET(
BULK 'https://test123.dfs.core.windows.net/devicetelemetry/*/*/*/*/*/',
FORMAT = 'PARQUET'
) AS [result]
GO
Using my Azure AD credentials from within Synapse Studio or SSMS, I have no issues querying this view. When I try to query it using my SQL Admin account, I get the following error:
Cannot find the CREDENTIAL 'https://test123.dfs.core.windows.net/devicetelemetry/*/*/*/*/*/', because it does not exist or you do not have permission.
It is important that I can query using the SQL Admin credentials, because we want to query this view from our application for various reports and therefore don't want to use AAD credentials.
I have tried the SO solution provided here: GRANT Database Scoped Credential syntax gives mismatched input error
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[WorkspaceSystemIdentity] TO [sqlAdmin];
I used this credential because it seems to be the default one created when I linked my data lake to Synapse. However, running the statement against the database where my view exists gives the following error:
Cannot find the database scoped credential 'WorkspaceSystemIdentity', because it does not exist or you do not have permission.
You need to create a server-scoped credential to allow access to the storage files.
Server-scoped credential
These are used when a SQL login calls the OPENROWSET function without DATA_SOURCE to read files on a storage account. The name of a server-scoped credential must match the base URL of the Azure storage account (optionally followed by a container name).
Note, however, that SQL users can't use Azure AD authentication to access storage, and serverless SQL pool doesn't return subfolders unless you specify /** at the end of the path.
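As a rough sketch of what that could look like for the storage account in the question, assuming you authenticate with a SAS token: the token value is a placeholder you would generate yourself, and the GRANT may be unnecessary if the login is already a server admin. Server-level statements like these are normally run while connected to the master database.

```sql
-- Sketch only: run against the serverless SQL pool, connected to master.
-- The credential name must match the storage URL (account, optionally container).
CREATE CREDENTIAL [https://test123.dfs.core.windows.net/devicetelemetry]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<sas-token>';  -- placeholder: your own SAS token
GO

-- Assumption: sqlAdmin is the SQL login from the question.
-- Lets that login reference the credential when it calls OPENROWSET.
GRANT REFERENCES
    ON CREDENTIAL::[https://test123.dfs.core.windows.net/devicetelemetry]
    TO [sqlAdmin];
GO
```

With the credential in place, a SQL login that calls OPENROWSET on a URL under that container without DATA_SOURCE should pick it up by name.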

Error in SSMS when running query from SQL On-Demand endpoint

I am trying to pull in data from a CSV file stored in an Azure Blob container, but when I query the file I get this error:
File 'https://<storageaccount>.blob.core.windows.net/<container>/Sales/2020-10-01/Iris.csv' cannot be opened because it does not exist or it is used by another process.
The file does exist and as far as I know of it is not being used by anything else.
I am using SSMS and also a SQL On-Demand endpoint from Azure Synapse.
What I did in SSMS was run the following commands after connecting to the endpoint:
CREATE DATABASE [Demo2];
CREATE EXTERNAL DATA SOURCE AzureBlob WITH ( LOCATION 'wasbs://<container>@<storageaccount>.blob.core.windows.net/' )
SELECT * FROM OPENROWSET (
BULK 'Sales/2020-10-01/Iris.csv',
DATA_SOURCE = 'AzureBlob',
FORMAT = '*'
) AS tv1;
I am not sure where my issue is or where to go next. Did I get something wrong when creating the external data source? Do I need to use a SAS token there, and if so, what is the syntax for that?
@Ubiquitinoob44, you need to create a database credential:
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-storage-access-control?tabs=shared-access-signature
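For the SAS-token route, a minimal sketch could look like the following. The names SasCredential and AzureBlobSas are made up for illustration, the password and token are placeholders, and it assumes the statements run in the user database (Demo2 in the question) rather than master.

```sql
-- Sketch only: run in the user database on the serverless (On-Demand) endpoint.
-- A master key is needed before a database scoped credential can be created.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong-password>';  -- placeholder
GO

-- Hypothetical credential name; the secret is your own SAS token.
CREATE DATABASE SCOPED CREDENTIAL SasCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<sas-token>';  -- placeholder
GO

-- Hypothetical data source name; it points at the container and uses the credential.
CREATE EXTERNAL DATA SOURCE AzureBlobSas
WITH (
    LOCATION = 'https://<storageaccount>.blob.core.windows.net/<container>',
    CREDENTIAL = SasCredential
);
GO

-- The file path is relative to the data source location.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'Sales/2020-10-01/Iris.csv',
    DATA_SOURCE = 'AzureBlobSas',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0'
) AS tv1;
```

Because the credential is attached to the data source, this should work for SQL logins as well as Azure AD users, as long as the SAS token allows read and list on the container.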
I figured out what the issue was. I haven't tried Armando's suggestion yet.
First I had to go to the container and edit its IAM (access control) settings to give my Active Directory login the Storage Blob Data Contributor role. The user to grant access to is the email address you use to log in to the portal.
https://learn.microsoft.com/en-us/azure/storage/common/storage-auth-aad-rbac-portal?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json
After that I had to reconnect to the On-Demand endpoint in SSMS. Make sure you log in through the Azure AD - MFA option. Originally I was using the On-Demand endpoint username and password, and that login had not been given the Storage Blob Data Contributor role on the container.
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/resources-self-help-sql-on-demand

Exporting data from Google Bigquery table to Google Cloud Storage

When exporting data from a Google BigQuery table to Google Cloud Storage in Python, I get the error:
Access Denied: BigQuery BigQuery: Permission denied while writing data.
I checked the JSON key file and it links to the owner of the storage. What can I do?
There are several possible reasons for this type of error:
1. Check that GOOGLE_APPLICATION_CREDENTIALS points to the exact path of the key file.
2. Check that you have write permission in your project.
3. If you are writing a table, check that the schema and its values are correct; this type of error is often caused by an incorrect schema value.