BigQuery import from Cloud issue - google-bigquery

I have uploaded my data sets into Google Cloud Storage. I am trying to import them into BigQuery tables. I get an error saying that the location of my data is not a valid "path", even though I am using the path shown in the Google Cloud browser: "55555/M04Q1%20Query.txt".
That's my bucket and my file... so something is missing.
Ideas?

I'm a little bit confused by what you're asking. If you're importing from a Google Cloud Storage path, the import path should look like "gs://bucket/object". Can you give more information about the request you are sending, how you are sending it, and the error you are getting?
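For what it's worth, here is a minimal sketch of what a load from a correctly formed gs:// URI looks like, using the Python google-cloud-bigquery client (the bucket, object, dataset, and table names below are placeholders, not your actual values):

from google.cloud import bigquery

client = bigquery.Client()

# Schema autodetection keeps the example short; a real load can pass an explicit schema.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    autodetect=True,
)

# Note the gs://bucket/object form -- not a bare "bucket/object" path.
load_job = client.load_table_from_uri(
    "gs://your-bucket/your-file.csv",   # placeholder URI
    "your_dataset.your_table",          # placeholder destination table
    job_config=job_config,
)
load_job.result()  # waits for the job and raises if it failed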

Related

Azure Blob Storage - How to read source string: wasbs://training@dbtrainsouthcentralus.blob.core.windows.net

I am doing a lab for an Azure Data course and there was some code to run from within Azure Databricks.
I noticed that it seemed to mount something from the following location:
wasbs://training@dbtrainsouthcentralus.blob.core.windows.net
So I am trying to figure out how to deconstruct the above string
wasbs looks to mean "windows azure storage blob"
The string training@dbtrainsouthcentralus.blob.core.windows.net looks like it means "container name"@"account name" - which I would think should be something in my Azure Data Lake.
I dug around in my ADLS and was not able to find anything related to "training@dbtrainsouthcentralus.blob.core.windows.net".
So I was wondering, where on earth did this come from? How can I trace back to where this path came from?
The URL is indeed constructed as follows:
wasbs://[container-name]@[storage-account-name].blob.core.windows.net/[directory-name] (source)
I dug around in my ADLS ...
You won't find it in ADLS; it is a separate resource in your subscription. There should be a storage account named dbtrainsouthcentralus.
Note: it could also be a publicly accessible storage account in some training subscription you do not have access to, provided by Microsoft for training purposes.
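To make the decomposition concrete, here is a small sketch in plain Python (the helper name is made up) that splits such a URL into container, storage account, and path:

from urllib.parse import urlparse

def deconstruct_wasbs(url):
    parsed = urlparse(url)  # netloc is "container@account.blob.core.windows.net"
    container, _, host = parsed.netloc.partition("@")
    account = host.split(".")[0]
    return container, account, parsed.path.lstrip("/")

print(deconstruct_wasbs("wasbs://training@dbtrainsouthcentralus.blob.core.windows.net/"))
# ('training', 'dbtrainsouthcentralus', '')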

Google Cloud Logging export to Big Query does not seem to work

I am using the Google Cloud Logging web UI to export Google Compute Engine logs to a BigQuery dataset. According to the docs, you can even create the BigQuery dataset from this web UI (it simply asks you to give the dataset a name). It also automatically sets up the correct permissions on the dataset.
It seems to save the export configuration without errors, but a couple of hours have passed and I don't see any tables created in the dataset. According to the docs, exporting the logs will stream the logs to BigQuery and will create tables with the following name template:
my_bq_dataset.compute_googleapis_com_activity_log_YYYYMMDD
https://cloud.google.com/logging/docs/export/using_exported_logs#log_entries_in_google_bigquery
I can't think of anything else that might be wrong. I am the owner of the project and the dataset is created in the correct project (I only have one project).
I also tried exporting the logs to a Google Cloud Storage bucket and still had no luck there. I set the permissions correctly using gsutil according to this:
https://cloud.google.com/logging/docs/export/configure_export#setting_product_name_short_permissions_for_writing_exported_logs
And finally I made sure that the 'source' I am trying to export actually has some log entries.
Thanks for the help!
Have you ingested any log entries since configuring the export? Cloud Logging only exports entries to BigQuery or Cloud Storage that arrive after the export configuration is set up. See https://cloud.google.com/logging/docs/export/using_exported_logs#exported_logs_availability.
You might not have given edit permission to 'cloud-logs@google.com' in the BigQuery console. Refer to this.
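If you want to set up (or inspect) the export outside the web UI, a hedged sketch with the Python google-cloud-logging client looks roughly like this; the project, dataset, sink name, and filter below are assumptions, not values from the question:

from google.cloud import logging

client = logging.Client()

sink = client.sink(
    "compute-logs-to-bq",                    # placeholder sink name
    filter_='resource.type="gce_instance"',  # assumed filter: Compute Engine entries only
    destination="bigquery.googleapis.com/projects/my-project/datasets/my_bq_dataset",
)

if not sink.exists():
    sink.create()

# Only entries received after the sink exists get exported, and the logging
# service account still needs edit access on the destination dataset.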

Can Someone Help Me Troubleshoot Error In BQ "does not contain valid backup metadata."

I keep trying to upload a new table onto my company's BQ, but I keep getting the error you see in the title ("does not contain valid backup metadata.").
For reference, I'm uploading a .csv file that has been saved to our Google Cloud Storage. It's being uploaded as a native table.
Can anyone help me troubleshoot this?
It sounds like you are specifying the file type DATASTORE_BACKUP. When you specify that file type, BigQuery will take whatever URI you provide (even if it has a .CSV suffix) and search for Cloud Datastore backup files relative to that URI.
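In other words, the fix is usually just to declare the source format as CSV explicitly. A minimal sketch with the Python google-cloud-bigquery client (bucket, dataset, and table names are placeholders):

from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,  # not SourceFormat.DATASTORE_BACKUP
    skip_leading_rows=1,                      # assuming the CSV has a header row
    autodetect=True,
)

job = client.load_table_from_uri(
    "gs://my-bucket/my-table.csv",
    "my_dataset.my_table",
    job_config=job_config,
)
job.result()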

BigQuery InternalError loading from Cloud Storage (works with direct file upload)

Whenever I try to load a CSV file stored in Cloud Storage into BigQuery, I get an InternalError (both using the web interface and the command line). The CSV is (an abbreviated) part of the Google Ngram dataset.
A command like:
bq load 1grams.ngrams gs://otichybucket/import_test.csv word:STRING,year:INTEGER,freq:INTEGER,volume:INTEGER
gives me:
BigQuery error in load operation: Error processing job 'otichyproject1:bqjob_r28187461b449065a_000001504e747a35_1': An internal error occurred and the request could not be completed.
However, when I load this file directly using the web interface and the File upload as a source (loading from my local drive), it works.
I need to load from Cloud Storage, since I need to load much larger files (original ngrams datasets).
I tried different files, always the same.
I'm an engineer on the BigQuery team. I was able to look up your job, and it looks like there was a problem reading the Google Cloud Storage object.
Unfortunately, we didn't log much of the context, but looking at the code, the things that could cause this are:
The URI you specified for the job is somehow malformed. It doesn't look malformed, but maybe there is some odd UTF-8 non-printing character that I didn't notice.
The 'region' for your bucket is somehow unexpected. Is there any chance you've set the data location on your GCS bucket to something other than {US, EU, or ASIA}? See here for more info on bucket locations. If so, and you've set the location to a region rather than a continent, that could cause this error.
There could have been some internal error in GCS that caused this. However, I didn't see this in any of the logs, and it should be fairly rare.
We're putting in some more logging to detect this in the future and to fix the issue with regional buckets (regional buckets may still fail, because BigQuery doesn't support cross-region data movement, but at least they will fail with an intelligible error).
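If you want to rule out the regional-bucket possibility yourself, a small sketch with the Python google-cloud-storage client can show the bucket's location (using the bucket name from the question):

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("otichybucket")
print(bucket.location)  # e.g. "US"/"EU"/"ASIA" (multi-region) vs. "US-CENTRAL1" (regional)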

Google Big Query cloud storage path error

I am brand new to Google BigQuery, so apologies if this is obvious.
I am simply trying to test the product out right now. I am able to upload a 5 MB file without any issues.
When I move to a 10 MB+ file using Google Cloud Storage, I am having no luck.
I have a bucket named teststir and a file named verify_sift.csv.
When I try to create a new dataset I select Google Cloud Storage and put:
gs://teststir/verify_sift.csv as the path.
Unfortunately the job keeps failing:
Not found: URI gs://teststir/verify_sift.csv
(I have triple-checked the names and tried multiple files, but no luck.) Am I missing something obvious? Thank you for your help!
I've encountered the same problem.
I just created a new bucket in another location and it helped :)
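Before recreating the bucket, it can also be worth confirming that the object really exists under the exact name BigQuery is being given. A small sketch with the Python google-cloud-storage client, using the names from the question:

from google.cloud import storage

client = storage.Client()

# List what's actually in the bucket and compare against the URI character by character.
for blob in client.list_blobs("teststir"):
    print(blob.name)

print(client.bucket("teststir").blob("verify_sift.csv").exists())  # False => really not found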