Stale BigQuery table after load job

I've run into a situation where a BigQuery table has become stale. I can't even run a count query on it. This occurred right after I ran the first load job.
For each query I run I get an error:
Error: Unexpected. Please try again.
See for example Job IDs: job_OnkmhMzDeGpAQvG4VLEmCO-IzoY, job_y0tHM-Zjy1QSZ84Ek_3BxJ7Zg7U

The error is "illegal field name". It looks like the field 69860107_VID is causing it. BigQuery doesn't support column rename, so if you want to change the schema you'll need to recreate the table.
I've filed a bug to fix the internal error -- this should have been blocked when the table was created.
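A quick pre-flight check on column names before creating the table can catch this kind of problem early. The sketch below assumes the classic BigQuery naming rule (letters, digits and underscores only, starting with a letter or an underscore), under which 69860107_VID fails because it starts with a digit. The function names are made up for illustration, and the rule in force for your project may be more permissive.

```python
import re

# Assumption: classic BigQuery column-name rule. Only letters, digits and
# underscores are allowed, and the first character must not be a digit.
_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_legal_field_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed naming rule."""
    return _FIELD_NAME.fullmatch(name) is not None


def illegal_field_names(names):
    """Return the names that violate the assumed rule, in input order."""
    return [n for n in names if not is_legal_field_name(n)]
```

Running illegal_field_names over the schema's column names before the load job should flag 69860107_VID, while a renamed column such as VID_69860107 would pass.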

Related

Why am I getting an error when scheduling a query on Google BigQuery?

When trying to schedule a query in BQ, I am getting the following error:
Error code 3 : Query error: Not found: Dataset was not found in location EU at [2:1]
Is this a permissions issue?
This sounds like a case of the scheduled query being configured to run in a different region than either the referenced tables, or the destination table of the query.
Put another way, BigQuery requires a consistent location for reading and writing, and does not allow a query in location A to write results in location B.
https://cloud.google.com/bigquery/docs/scheduling-queries has some additional information about this.
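To make the location rule concrete, here is a minimal sketch that compares the location a query is set to run in against the locations of the datasets it touches. It works on plain strings that you would look up yourself (for example in the dataset details in the console); the function name and the dataset names in the usage note are hypothetical, and treating locations as case-insensitive is an assumption.

```python
def location_mismatches(job_location: str, dataset_locations: dict) -> dict:
    """Return the datasets whose location differs from the job's location.

    dataset_locations maps a dataset name to its location string,
    for example {"source_ds": "US", "dest_ds": "EU"}.
    """
    # Assumption: location names are compared case-insensitively.
    wanted = job_location.strip().upper()
    return {
        name: loc
        for name, loc in dataset_locations.items()
        if loc.strip().upper() != wanted
    }
```

For a scheduled query configured for EU, location_mismatches("EU", {"source_ds": "US", "dest_ds": "EU"}) would point at source_ds as the dataset that cannot be found in that location.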

Error loading table on BigQuery dashboard, but queries work fine

I clicked a table on the BigQuery dashboard and got this error:
However, I can get data when I do a select on this table. (That means the table does exist)
I already have the highest admin privilege so it shouldn't be a permission issue.
I created this table with a Python script that collects data, writes it to a CSV file, and uploads the CSV file to BigQuery every day. After creating the table, I changed the schema once, both in the script and on the dashboard. I'm not sure whether that's the cause, but the table loading error first occurred several days after I changed the schema.
If you have an ad-blocking browser extension (such as AdBlock) installed, it might be the root cause of this issue. Try disabling it, then open the table again.
Hope it helps.

Redshift drop/create/select query failing in Data Pipeline

I'm trying to run a daily migration script in Redshift using Data Pipeline.
The script works as expected when I run it directly using SQL Workbench/J, but fails when triggered through Data Pipeline.
I have reproduced the problem with this simple code:
drop table if exists image_stg;
create table image_stg (like image_full);
select * from image_stg;
When I run it in Data Pipeline, I get this error:
[Amazon](500310) Invalid operation: relation "image_stg" does not exist;
I also got this error once, for the exact same code, without changing anything:
[Amazon](500310) Invalid operation: Relation with OID 108425 does not exist.;
Here's a screenshot of the two error messages:
I've found this thread on the AWS forums, but it didn't help: Pipeline started failing on simple Redshift SqlActivity and temp table
What is causing this error? Is there a workaround?
I've contacted Amazon, and it looks like a problem in Data Pipeline.
They did suggest a workaround that seems to work in my case: Change the JDBC connection string from jdbc:redshift://… to jdbc:postgresql://… .
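If the connection string is built in code, the swap can be done with a small helper. This is only a sketch of the workaround described above: the function name is made up, and the host, port and database in the usage note are placeholders, not values from the original pipeline.

```python
REDSHIFT_PREFIX = "jdbc:redshift://"
POSTGRESQL_PREFIX = "jdbc:postgresql://"


def use_postgresql_scheme(jdbc_url: str) -> str:
    """Replace a leading jdbc:redshift:// with jdbc:postgresql://.

    URLs that do not start with the Redshift prefix are returned unchanged.
    """
    if jdbc_url.startswith(REDSHIFT_PREFIX):
        return POSTGRESQL_PREFIX + jdbc_url[len(REDSHIFT_PREFIX):]
    return jdbc_url
```

For example, use_postgresql_scheme("jdbc:redshift://example-host:5439/dev") gives "jdbc:postgresql://example-host:5439/dev"; everything after the scheme is left untouched.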
I had the same problem when creating a temporary table in Redshift via Data Pipeline, but the workaround of changing the connection string from jdbc:redshift://… to jdbc:postgresql://… didn't work for me. My last resort is to create the table as a physical table and drop it after use, all through the pipeline.

How do I get the list of bad records that didn't load in BigQuery?

Is there a way to get all the bad records that are skipped when running a BigQuery load job with --max_bad_records set?
I believe the status.errors field will have a list of errors that occurred during job processing, including non-fatal errors like bad rows that were skipped.
https://cloud.google.com/bigquery/docs/reference/v2/jobs
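As a rough illustration, the sketch below pulls the entries out of status.errors from a job resource that has already been fetched and parsed into a Python dict (for example the JSON returned by a jobs.get call). The function name is hypothetical. Each entry is expected to carry fields such as reason, location and message, but whether every skipped row gets its own entry is something to verify against your own jobs.

```python
def load_job_errors(job: dict) -> list:
    """Return 'location: message' strings for each entry in status.errors.

    `job` is a job resource parsed from JSON. Missing keys are tolerated,
    so a job without recorded errors yields an empty list.
    """
    errors = job.get("status", {}).get("errors", [])
    return [
        f'{err.get("location", "?")}: {err.get("message", "")}'
        for err in errors
    ]
```

Printing the returned list after a load job finishes gives a quick view of which input locations were reported and why.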

BigQuery: Unable to delete table

We have a fairly large table (just under 15 million rows) that we have been filling during stress and stability testing. We are trying to delete the table, but it is resisting.
Here's what we have tried:
Deleting the table from the web console. No errors are reported, but the table is not deleted.
Deleting the table from the command-line interface. We get an error message: "BigQuery error in rm operation: Backend Error"
Deleting the whole dataset from the web console. That fails as well, with no errors reported.
Deleting the whole dataset from the command line. We get the same error message: "BigQuery error in rm operation: Backend Error"
Other tables with the same schema can be deleted without error. Our schema does use 9999 columns (the max) which would be the only odd thing we may be doing.
You've hit a bug with tables that have a large number of updates and a wide schema. We're working on a fix.