Unexpected error while copying query results to a table using the Google Java BigQuery API for GAE

I'm trying to copy a query result to a new table, but I'm getting an error:
Copy query results to 49077933619:TelcoNG.table (11:13am)
Errors:
Unexpected. Please try again.
Job ID: job_090d08f69c8e4199afeca131b5279393
Start Time: 11:13am, 12 Aug 2013
End Time: 11:13am, 12 Aug 2013
Copy Source: 49077933619:_8dc46c0daeb9142a91aa374aa59d615c3703e024.anon17d88e0e_0960_4510_9740_b753109050f4
Destination Table: 49077933619:TelcoNG.table
I have been getting this error since last Thursday (8 Aug 2013).
This functionality had worked perfectly for over a year.
Have there been any changes to the API?

It looks like there is a bug in detecting which datacenters a table created as the result of a query has been replicated to. You're doing the copy operation very soon after the query finished, and before the results have finished replicating. As I mentioned, this is a bug and we should fix it very soon.
As a short-term workaround, you can either wait a few minutes between the query and the copy operation, or you can set a destination table on your query, so you don't need to do the copy operation at all.
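For what it's worth, here is a rough command-line sketch of that second workaround (the question uses the Java API, so this is only illustrative, and the destination table and query below are placeholders): setting a destination table on the query job writes the results straight into a permanent table, so no separate copy job is needed.
# Illustrative only: destination table and query are placeholders.
# Writing the query results directly to a permanent table avoids the copy job.
bq query \
  --use_legacy_sql=false \
  --destination_table=TelcoNG.query_results \
  --replace \
  'SELECT customer_id, COUNT(*) AS calls FROM TelcoNG.source GROUP BY customer_id'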

Related

BigQuery Scheduled Query won't run

I've got data buckets set up in GCS and I'm using BigQuery to load all the .csv files from a bucket into a table. That works flawlessly. I made a simple deduplication query that, when run manually, selects only distinct rows and creates a new table with "DeDupe" appended (code below). That runs flawlessly.
CREATE OR REPLACE TABLE
`project-name-123456.dataset_2022.dataset 2022 DeDuped` AS
SELECT
DISTINCT *
FROM
`project-name-123456.dataset_2022.dataset 2022`
The issue I am having is with scheduling that query. Every time it tries to run I get the error "Error status: Not found: Dataset project-name-123456:dataset_2022 was not found in location US; JobID: project-name-123456:628d7766-0000-2d36-a82f-94eb2c0a664a"
The only thing I can figure is that I have the data location for the dataset set to "us-central1", since it has a free tier. When I go to my scheduled query, whether I select the same data location or "Default", it always changes to "US Multiple".
Is there a way to fix this?
Or do I need to create my dataset in "US Multiple"?
I'm trying to cut down on costs as much as possible by keeping it in us-central1.
EDIT: Seems like I just needed to delete and recreate the scheduled query. Chatted with Google Support and they sorted it. Sorry all!
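For anyone hitting the same location mismatch, a quick sanity check before re-creating the scheduled query is to confirm where the dataset actually lives (the project and dataset names below are the placeholders from the question):
# Prints the dataset metadata, including its "location" (e.g. "us-central1").
# The scheduled query's location must match this value.
bq show --format=prettyjson project-name-123456:dataset_2022 | grep '"location"'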

All our scheduled queries have been failing with `Error code 3 : Incompatible table partitioning specification.` since 2019-04-23 at 8PM UTC

Our scheduled queries had been running for months without any hiccups, but starting at 8pm UTC on 2019-04-23 they began failing with the following error, and they are still failing very often 36 hours later.
11:00:01 PM Error code 3 : Incompatible table partitioning specification. Destination table exists with partitioning specification interval(type:DAY,field:), but transfer target partitioning specification is interval(type:DAY,field:). Please retry after updating either the destination table or the transfer partitioning specification.
11:00:00 PM Starting to process the query job with parameter #run_date=2019-04-23.
11:00:00 PM Dispatched run to data source with id 538824528883320
A screenshot of the run history shows that some runs are OK (but none of our queries had a successful run today).
We tried redeploying the queries, but they still fail on the first run. Hitting Retry generates the same error too.
Update 1
So while we wait for Google folks to fix the bug, we found a workaround, as detailed in https://issuetracker.google.com/issues/131266091.
The solution was to re-create all the destination tables of our scheduled queries without --require_partition_filter and --time_partitioning_expiration.
I really mean re-creating the tables. Updating the table configurations with bq update --norequire_partition_filter --time_partitioning_expiration 0 does not fix the problem.
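For reference, a minimal sketch of the re-creation steps, assuming a placeholder table name and a schema file; adapt the partitioning and schema to your own tables:
# Placeholder names throughout. Back up the existing destination table,
# drop it, then re-create it day-partitioned but without
# require_partition_filter or a partition expiration.
bq cp mydataset.daily_report mydataset.daily_report_backup
bq rm -f -t mydataset.daily_report
bq mk --table --time_partitioning_type=DAY mydataset.daily_report ./schema.json
# Restore the data afterwards as needed (e.g. with bq cp or a query job).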
This is a known issue and should be fixed soon.

Will BigQuery finish long running jobs with a destination table if my browser crashes / computer turns off?

I frequently run BigQuery jobs in the web gui that take 30 minutes or more, saving the results into another table to view later.
Since I'm not waiting for the results right away, and not storing them in my computer's memory, it would be great if I could start a query and then turn off my computer, coming back the next day to look at the results in the destination table.
Will this work?
The same applies if my computer crashes, my browser runs out of memory, or anything else causes me to lose my connection to BigQuery while the job is running.
The simple answer is yes: the processing takes place in the cloud, not in your browser. As long as you set a destination table, the results will be saved there; if not, you can check the query history to see whether any issue caused them not to be produced.
If you don't set a destination table, the results are saved to a temporary table, which may no longer be available if you don't return in time.
I'm sure someone can give you a much more detailed answer.
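As a rough illustration (the job ID below is a placeholder), you can also confirm from the command line that the job kept running server-side and see which table holds the results:
# List the project's most recent jobs, then inspect one; the prettyjson
# output includes the job's status and its destinationTable.
bq ls -j -n 10
bq show --format=prettyjson -j bqjob_r1234abcd_0001   # placeholder job ID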
Even if you have not defined a destination table, you can still access the result of the query by checking the Query History. Locate your query in the list, expand the respective item, and find the value of Destination Table.
Note: this is not a regular table but rather a so-called anonymous table that stays available for about 24 hours after the query was executed.
Knowing that table, you can use it however you want; for example, simply query it as below:
SELECT *
FROM `yourproject._1e65a8880ba6772f612fbe6ff0eee22c939f1a47.anon9139110fa21b95d8c8729cf0bb6e4bb6452946d4`
Note: the anonymous table is "saved" in a "system" dataset whose name starts with an underscore, so you will not be able to see it in the UI. The table name also starts with 'anon', which I believe stands for 'anonymous'.

Google BigQuery Create/append to table from Avro internalError

I am fairly new to BigQuery; however, until 1-2 days ago I had been able to create and append to existing BigQuery tables from Avro files (both in the EU region). I am only using the web UI so far.
I just attempted to create a new table from a newly generated Avro file and got the same error, details below:
Job ID bquijob_670fd977_15655fb3da1
Start Time Aug 4, 2016, 3:35:45 PM
End Time Aug 4, 2016, 3:35:53 PM
Write Preference Write if empty
Errors:
An internal error occurred and the request could not be completed.
(error code: internalError)
I am unable to debug because there is not really anything to go by.
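One hedged suggestion for getting a little more to go on: the full job metadata, including any error details the backend attaches beyond the web UI's message, can be pulled with the command-line tool using the job ID from above:
# Fetch the full job metadata; status.errors sometimes carries more detail
# than the "internalError" shown in the web UI.
bq show --format=prettyjson -j bquijob_670fd977_15655fb3da1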
We've just released a new feature that no longer creates the root field: https://cloud.google.com/bigquery/release-notes.
Since you had imported Avro before, we excluded your project from this new feature. But unfortunately we had a bug in the exclusion that causes reading Avro to fail. I think you most likely ran into this problem.
The fix will be released next week. If you don't need the root field and want to enable your project for the new feature, please send the project id to me, huazhang at google.com. Sorry for the trouble this has caused.

data load job failing with "Unexpected" error

Errors: Unexpected. Please try again.
Job ID: aaaaaaaaa.com:bbbbbbbbb:job_F1GjqdmJj3JZWDToxh_xav9hwsg
Start Time: 11:10am, 14 Apr 2014
End Time: 1:10pm, 14 Apr 2014
Destination Table: aaaaaaaaaaa.com:bbbbbbbbb:monte.ledger2
No idea why this is failing - some succeed, some do not...
Your import job is taking too long to process, so it is getting killed. We're currently reporting this as an internal error; we should fix it to report a more user-friendly error instead.
We will also look into bumping the timeout (currently 2 hours per worker).
Workarounds for the issue include splitting your data into smaller files or using uncompressed input files. The latter may sound surprising, but uncompressed files can be split into chunks and processed in parallel, whereas gzipped files cannot.
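As a hedged sketch of those workarounds (bucket, file, and schema names are placeholders; the destination table is the one from the question): decompress the input, split it into chunks, then load all of them in a single job with a wildcard URI.
# Placeholder file and bucket names. Decompress, split into ~5M-line chunks,
# upload, then load every chunk in one job via a wildcard URI.
gzip -d ledger.csv.gz
split -l 5000000 ledger.csv ledger_part_
gsutil cp ledger_part_* gs://my-bucket/ledger/
bq load --source_format=CSV monte.ledger2 "gs://my-bucket/ledger/ledger_part_*" ./schema.json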