I am using the Azure SDK for PHP. I have, for example, 200 files to put in Azure Storage. File sizes are between 100 KB and 1 MB. My insertion algorithm is as follows:
Create Azure Blob Service.
Create new container (unique name).
Insert 10 files (createBlockBlob).
Create new container.
Insert 10 blobs.
... and so on.
Let's say after 30-40 insertions I get this error:
Exception 'HTTP_Request2_MessageException' with message 'Malformed response: '
Any suggestions as to what could be the reason for this error?
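For reference, the flow described above is roughly equivalent to the sketch below, written with the Azure SDK for Python (azure-storage-blob) rather than the PHP SDK; the folder, container names, and connection string are placeholders:

import os
from azure.storage.blob import BlobServiceClient

# Placeholder connection string and local folder; adjust to your environment.
service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])

files = sorted(os.listdir("files"))   # ~200 files, 100 KB - 1 MB each
batch_size = 10

for start in range(0, len(files), batch_size):
    # One new, uniquely named container per batch of 10 files.
    container = service.create_container(f"upload-batch-{start // batch_size:04d}")
    for name in files[start:start + batch_size]:
        with open(os.path.join("files", name), "rb") as data:
            # Equivalent of createBlockBlob in the PHP SDK.
            container.upload_blob(name=name, data=data)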
I have a blob container, say "demo". It has many files; some are in the Hot tier and some are in the Archive tier. I want to process only the files that are in the Hot tier, ignoring the Archive tier files.
Using the Get Metadata activity lists all files, including Archive tier files.
Using "az storage blob list" throws an error: 'This operation is not permitted on an archived blob.'
Please point me in the right direction.
You can use the List Blobs operation, which returns a list of the blobs under the specified container.
Method: GET
Request URI: https://myaccount.blob.core.windows.net/mycontainer?restype=container&comp=list
HTTP Version: HTTP/1.1
This returns the response body in XML format, which you can then filter on the basis of the AccessTier element.
For the request above, you may also need to send a recent service version in the x-ms-version request header (for example, x-ms-version: 2019-12-12) if your client doesn't use the newest version automatically.
For version 2017-04-17 and above, List Blobs returns the AccessTier
element if an access tier has been explicitly set. For Blob Storage or
General Purpose v2 accounts, valid values are Hot/Cool/Archive. If the
blob is in rehydrate pending state then ArchiveStatus element is
returned with one of the valid values
rehydrate-pending-to-hot/rehydrate-pending-to-cool.
Refer List Blobs for more details.
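If you would rather not parse the XML yourself, the same filtering can be sketched with the Azure SDK for Python, which surfaces the access tier on each listed blob (the container name and connection string below are placeholders):

import os
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
container = service.get_container_client("demo")

# list_blobs() yields BlobProperties; blob_tier is populated for
# Blob Storage / General Purpose v2 accounts (Hot/Cool/Archive).
hot_blobs = [b.name for b in container.list_blobs() if b.blob_tier == "Hot"]

for name in hot_blobs:
    print(name)   # process only Hot-tier blobs; Archive blobs are skipped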
I'm trying to create a new external table using a CETAS statement (CREATE EXTERNAL TABLE AS SELECT * FROM <table>) from an already existing external table in an Azure Synapse serverless SQL pool. The table I'm selecting from is a very large external table built on around 30 GB of data in Parquet format stored in ADLS Gen2 storage, but the query always times out after about 30 minutes. I've tried using premium storage and also tried most, if not all, of the suggestions made here, but it didn't help and the query still times out.
The error I get in Synapse Studio is:
Statement ID: {550AF4B4-0F2F-474C-A502-6D29BAC1C558} | Query hash: 0x2FA8C2EFADC713D | Distributed request ID: {CC78C7FD-ED10-4CEF-ABB6-56A3D4212A5E}. Total size of data scanned is 0 megabytes, total size of data moved is 0 megabytes, total size of data written is 0 megabytes. Query timeout expired.
The core use case is that assuming I only have the external table name, I want to create a copy of the data over which that external table is created in Azure storage itself.
Is there a way to resolve this timeout issue or a better way to solve the problem?
This is a limitation of Serverless.
Query timeout expired
The error Query timeout expired is returned if the query executed more
than 30 minutes on serverless SQL pool. This is a limit of serverless
SQL pool that cannot be changed. Try to optimize your query by
applying best practices, or try to materialize parts of your queries
using CETAS. Check whether there is a concurrent workload running on the
serverless pool, because other queries might take the resources. In
that case you might split the workload across multiple workspaces.
Self-help for serverless SQL pool - Query Timeout Expired
The core use case is that assuming I only have the external table name, I want to create a copy of the data over which that external table is created in Azure storage itself.
It's simple to do in a Data Factory copy job, a Spark job, or AzCopy.
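If AzCopy or Data Factory aren't convenient, here is a rough sketch of the same copy using the Azure SDK for Python; the account, container, and folder names below are assumptions, and both containers are assumed to already exist:

import os
from azure.storage.blob import BlobServiceClient

# Source and destination are assumed to live in the same storage account
# and be reachable with one connection string; adjust for your setup.
service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])

src_container = service.get_container_client("datalake")
dst_container = service.get_container_client("datalake-copy")

# Copy every Parquet file under the external table's folder, server side.
for blob in src_container.list_blobs(name_starts_with="external-table-folder/"):
    src_url = f"{src_container.url}/{blob.name}"   # append a SAS token here if the container is private
    dst_container.get_blob_client(blob.name).start_copy_from_url(src_url)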
I'm having trouble on an Azure SQL Database where I'm trying to read the DB audit logs.
Both functions, sys.fn_get_audit_file and sys.fn_xe_file_target_read_file, should be able to read a file.
But whatever I do, I'm getting blank tables. Even if I specify a non-existing file, I receive a table with zero records instead of an error.
So I'm afraid it's something else.
My login is in the db_owner role.
Any suggestions?
I found that I could only read XEL files by using the same server and same database context that they were created for. So for example, consider the following scenario:
ServerA is the Azure Synapse instance I was creating the audit XEL files from, all related to DatabaseA
ServerB is a normal SQL instance that I want to read the XEL files on
Test 1:
Using ServerB, try to read file directly from blob storage
Result: 0 rows returned, no error message
Test 2:
Using ServerB, download the XEL files locally, and try to read from the local copy
Result: 0 rows returned, no error message
Test 3:
Using ServerA, with the current DB = 'master', try to read file directly from blob storage
Result: 0 rows returned, no error message
Test 4:
Using ServerA, with the current DB = 'DatabaseA', try to read file directly from blob storage
Result: works perfectly
Because I really wanted to read the files from ServerB, I also tried doing a CREATE CREDENTIAL there that was able to read & write to my blob storage account. That didn't make any difference unfortunately - a repeat of Test 1 got the same result as before.
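For completeness, here is a hedged sketch of how the working case (Test 4) can be reproduced from a client, assuming pyodbc and an explicit connection to DatabaseA on ServerA; the server name, credentials, and storage path are placeholders:

import pyodbc

# Connect to the same server and database the audit files were created for
# (ServerA / DatabaseA in the tests above); other contexts returned 0 rows.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=tcp:servera.database.windows.net,1433;"
    "DATABASE=DatabaseA;"
    "UID=my_login;PWD=my_password;Encrypt=yes;"
)

# Placeholder container/path; point it at the folder that holds the .xel files.
sql = """
SELECT event_time, action_id, statement
FROM sys.fn_get_audit_file(
    'https://myaccount.blob.core.windows.net/sqldbauditlogs/ServerA/DatabaseA/*.xel',
    DEFAULT, DEFAULT);
"""

for row in conn.cursor().execute(sql):
    print(row.event_time, row.action_id)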
As the title suggests, I'm trying to take an image from Azure Blob Storage and put it into a VARBINARY(MAX) column in my Azure SQL Database.
I'm building an application in Unity where each user has a logo. This logo is specific to each user. I send a web request to PHP code, which then queries the server for the information I need from the specific database. So I'm trying to find a way to ensure each user (row in the table) has a logo attached to it. If it's not right to store images in the database itself, would it be possible to instead store a URL in the logo column, send a web request to that URL, and use the downloaded image in the application? If so, does anyone know how I would do this?
I know Blob Storage provides a URL to the image. Additionally, if possible I want to add it into currently created rows. Thanks!
Assumptions:
You have user icons stored as blobs in your Azure Storage account with anonymous read access at the blob level.
The blob URL is https://my-az.blob.core.windows.net/usericon/staff icon.png
(Replace 'my-az' with your storage account name.)
You have two separate tables, UserDets and UserImg. It is always better to keep the image data in a separate table.
In SSMS, execute this command:
CREATE EXTERNAL DATA SOURCE MyUserAzureBlobStorage
WITH ( TYPE = BLOB_STORAGE, LOCATION = 'https://my-az.blob.core.windows.net/usericon');
Then insert the image into the VARBINARY column:
INSERT INTO UserImg (UserId, UserImg)
SELECT 1, BulkColumn
FROM OPENROWSET(
    BULK 'staff icon.png',
    DATA_SOURCE = 'MyUserAzureBlobStorage', SINGLE_BLOB) AS ImageFile;
If you are using credentials for blob access, there are some additional steps (a database scoped credential referenced by the external data source), but that is not required here because the blobs allow anonymous read access.
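For completeness, a small sketch of how the icon blob in the assumptions above could be uploaded in the first place, using the Azure SDK for Python (the names reuse the placeholders from this answer; the connection string is an assumption):

import os
from azure.storage.blob import BlobServiceClient

# Placeholder connection string for the 'my-az' account used above.
service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
container = service.get_container_client("usericon")

# Upload the logo so the OPENROWSET(BULK ...) statement above can read it.
with open("staff icon.png", "rb") as image:
    container.upload_blob(name="staff icon.png", data=image, overwrite=True)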
I have a big table (about 10 million rows) that I'm trying to pull into BigQuery. I had to upload the CSV into a bucket due to the size constraints when creating the table. When I try to create the table using the Datastore Backup source format, the job fails with this error:
Error Reason:invalid. Get more information about this error at Troubleshooting Errors: invalid.
Errors:
gs://es_main/provider.csv does not contain valid backup metadata.
Job ID: liquid-cumulus:job_KXxmLZI0Ulch5WmkIthqZ4boGgM
Start Time: Dec 16, 2015, 3:00:51 PM
End Time: Dec 16, 2015, 3:00:51 PM
Destination Table: liquid-cumulus:ES_Main.providercloudtest
Source URI: gs://es_main/provider.csv
Source Format: Datastore Backup
To troubleshoot, I used a small sample file of rows from the same table and uploaded it with the CSV option during table creation without any errors, and I can view the data just fine.
I'm just wondering what the metadata should be set to with the "Edit metadata" option within the bucket, or if there is some other workaround I'm missing. Thanks!
The error message for the job that you posted is telling you that the file you're providing is not a Datastore Backup file. Note that "Datastore" here means Google Cloud Datastore, which is another storage solution that it sounds like you aren't using. A Cloud Datastore Backup is a specific file type from that storage product which is different from CSV or JSON.
Setting the file metadata within the Google Cloud Storage browser, which is where the "Edit metadata" option you're talking about lives, should have no impact on how BigQuery imports your file. It might be important if you were doing something more involved with your file from Cloud Storage, but it isn't important to BigQuery as far as I know.
To upload a CSV file from Google Cloud Storage to BigQuery, make sure to select CSV as the source format and Google Cloud Storage as the load source when creating the table.
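If you prefer to script the load instead of using the web UI, here is a minimal sketch with the current google-cloud-bigquery Python client, assuming the dataset already exists and the CSV has a header row (the project, dataset, table, and URI are taken from the job details above):

from google.cloud import bigquery

client = bigquery.Client(project="liquid-cumulus")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,  # not DATASTORE_BACKUP
    skip_leading_rows=1,                      # assumes a header row
    autodetect=True,                          # or pass an explicit schema
)

load_job = client.load_table_from_uri(
    "gs://es_main/provider.csv",
    "liquid-cumulus.ES_Main.providercloudtest",
    job_config=job_config,
)
load_job.result()  # waits for the load to finish and raises on error
print(f"Loaded {load_job.output_rows} rows.")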