BigQuery Table Not Found using the browser tool - google-bigquery

I am using the Browser Tool to create a simple dataset with just 1 table with the following schema:
data:integer,count:integer
I am uploading the data using a comma separated csv file.
When I proceed to create the table, I can see the new dataset and table in the left-hand column, and next to Job History I see "1 running".
Nothing happens for a long time, even with a small CSV file. When I click on the newly created table I get the error "Table Not Found".
When I refresh the page everything is gone, both the dataset and the table.
This looks like some kind of bug, but as I am new with BigQuery I want to make sure I am not doing anything wrong.
If this is a bug, how can I skip it in order to be able to actually create a dataset with a table?
Any tip in the right direction will be much appreciated

If you look at the job history (in the top left corner), you should be able to see the load job that you ran. If it failed, it will show an error.
My assumption is that you ended up running this yesterday when our load jobs were temporarily backed up. When you run a load job, the UI shows a table placeholder, but the table won't actually exist until the load completes. That is why when you clicked on the table it showed as 'not found' since it hadn't really been created yet. That is also why it didn't show up when you reloaded.
We're in the process of increasing capacity by an order of magnitude, so that should be less likely to happen again.
If you do have jobs that failed that you think should have succeeded, please send the job ID and we can investigate.

Related

BigQuery Scheduled Query won't run

I've got data buckets set up in GCS and I'm using BigQuery to load all the .csv files from that bucket into a table. That works flawlessly. I made a simple deduplication query that, when run manually, selects only distinct rows and creates a new table with "DeDuped" appended (code below). That also runs flawlessly.
CREATE OR REPLACE TABLE
`project-name-123456.dataset_2022.dataset 2022 DeDuped` AS
SELECT
DISTINCT *
FROM
`project-name-123456.dataset_2022.dataset 2022`
The issue I am having is with scheduling that query. Every time it tries to run I get the error "Error status: Not found: Dataset project-name-123456:dataset_2022 was not found in location US; JobID: project-name-123456:628d7766-0000-2d36-a82f-94eb2c0a664a"
The only thing I can figure is that I have my data location for the dataset as "us-central1" as it has a free tier. And when I go to my scheduled query, whether I select the same data location, or "Default" it always changes to "US Multiple".
Is there a way to fix this?
Or do I need to create my dataset in "US Multiple"?
Trying to cut down on costs as much as possible by keeping it in the us-central1
EDIT: Seems like I just needed to delete and recreate the scheduled query again. Chatted with Google Support and they sorted it. Sorry all!

BigQuery scheduled query: Cannot create a transfer in JURISDICTION_US when destination dataset is located in REGION_ASIA_SOUTHEAST_1

I am getting this error quite frequently while trying to create a scheduled query
Error creating scheduled query: Cannot create a transfer in JURISDICTION_US when destination dataset is located in REGION_ASIA_SOUTHEAST_1
I just need a scheduled query to overwrite data in a table.
I had the same problem while trying to create a scheduled query with python:
400 Cannot create a transfer in REGION_EUROPE_WEST_1 when destination dataset is located in JURISDICTION_EU
I figured out that although my project is located in europe-west1, my destination dataset was in the multi-region location EU. I had to change the parent path from parent=project_path to parent='{project_path}/locations/eu' to make it work.
I hope that it helps someone.
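A minimal sketch of the parent-path fix from the answer above, in Python. The helper name `transfer_parent` and the project ID `my-project` are hypothetical; the only thing taken from the answer is that the parent needs a `/locations/<location>` suffix matching the destination dataset's location (here the `eu` multi-region).

```python
def transfer_parent(project_id: str, dataset_location: str) -> str:
    """Build the parent path for a scheduled query (transfer config).

    The location segment should match the destination dataset's location,
    e.g. "eu" for the EU multi-region or "europe-west1" for a single region.
    """
    location = dataset_location.strip().lower()
    if not location:
        raise ValueError("dataset_location must not be empty")
    return f"projects/{project_id}/locations/{location}"


# Hypothetical project ID, with a destination dataset in the EU multi-region.
parent = transfer_parent("my-project", "EU")
print(parent)  # projects/my-project/locations/eu
```

The resulting string would then be passed as the `parent` when creating the scheduled query, in place of the bare project path.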
This looks like a bug in BQ.
I had the same problem, with the source and destination datasets both located in the EU.
Just for testing, I changed the destination to another EU dataset, and it worked.
I then updated the scheduled query to use my original destination, and now it works. I can't explain why, but it seems to be a workaround.
You could try starting from the Scheduled Queries page in the BigQuery UI and clicking the "+ create schedule query" button; that way I don't get the error. If I start directly in the BigQuery UI, I get the same error.
From what I tried, it may happen because a table with the same ID as the destination table already exists. This happens even if that table was created by running the same query manually and saving the result.
I faced the same issue recently. I tried two things, and each of them worked:
Try setting the query location to the location of the destination dataset/table, then schedule the query.
If that does not work, first run the query and save the results to the intended table in BigQuery, so that the destination table already exists. Then schedule the query.
Each approach worked for me in a different situation.
I had this error and tried many of the solutions in this thread. I tried a new session in an incognito window and it worked so I believe this is a transient issue as suggested.
I just scheduled the query select 1 and then edited it into the one I needed, and it worked.
I think the problem is the schedule's start time. If it is in the past relative to local time, BQ tries to run the request on another server.
I had the same issue. The way I solved it was to disable the editor tabs (there is a button at the top). Then opened the query settings and set the processing location to EU manually.
I was using the bq command when I came across this issue and was able to resolve it by adding the parameter --location='europe-west1'
So my final query looked like this
bq query \
--use_legacy_sql=false \
--display_name='my_table' \
--location='europe-west1' \
"create or replace table my_dataset.my_table as (select * from external_query('projects/my_mysql_connection/locations/europe-west1/connections/bi', '(select * from my_table)'))"

error loading table on bigquery dashboard but queries work fine

I clicked a table on bigquery dashboard, got this error:
However, I can get data when I do a select on this table. (That means the table does exist)
I already have the highest admin privilege so it shouldn't be a permission issue.
I created this table with a Python script, which collects data, writes it into a CSV file, and uploads the CSV file to BigQuery every day. After I created the table I changed the schema once, both in the script and on the dashboard. I'm not sure if that's the cause, but the table loading error appeared several days after I changed the schema.
If you have an ad-blocking extension (such as AdBlock) installed, it might be the root cause of this issue. Try disabling it, then open the table again.
Hope it helps.

Will BigQuery finish long running jobs with a destination table if my browser crashes / computer turns off?

I frequently run BigQuery jobs in the web gui that take 30 minutes or more, saving the results into another table to view later.
Since I'm not waiting for the result to come soon, and not storing them in my computer's memory, it would be great if I could start a query and then turn off my computer, to come back the next day and look at the results in the destination table.
Will this work?
The same applies if my computer crashes, or browser runs out of memory, or anything else that causes me to lose my connection to Bigquery while the job is running.
The simple answer is yes: the processing takes place in the cloud, not in your browser. As long as you set a destination table, the results will be saved there; if the table doesn't appear, you can check the query history to see whether an error prevented it from being produced.
If you don't set a destination table it will save to a temporary table which may not be available if you don't return in time.
I'm sure someone can give you a much more detailed answer.
Even if you have not defined a destination table, you can still access the result of the query through Query History. Locate your query in the list, expand the item, and find the value of Destination Table.
Note: this is not a regular table but a so-called anonymous table, which is available for about 24 hours after the query was executed.
Knowing that table, you can use it however you want, for example by simply querying it as below:
SELECT *
FROM `yourproject._1e65a8880ba6772f612fbe6ff0eee22c939f1a47.anon9139110fa21b95d8c8729cf0bb6e4bb6452946d4`
Note: the anonymous table is "saved" in a "system" dataset whose name starts with an underscore, so you will not be able to see it in the UI. The table name also starts with 'anon', which I believe stands for 'anonymous'.
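To illustrate the naming pattern described above, here is a small Python sketch that checks whether a `project.dataset.table` reference looks like an anonymous results table. The function name `is_anonymous_table` is hypothetical, and the check relies only on the two conventions mentioned in this answer (dataset starting with an underscore, table starting with 'anon'), so treat it as a heuristic rather than an official rule.

```python
def is_anonymous_table(table_ref: str) -> bool:
    """Heuristically detect a query's anonymous results table.

    Expects a `project.dataset.table` reference, with or without backticks.
    """
    # Split from the right so the last two segments are dataset and table.
    parts = table_ref.strip("`").rsplit(".", 2)
    if len(parts) != 3:
        return False
    _project, dataset, table = parts
    return dataset.startswith("_") and table.startswith("anon")


ref = "yourproject._1e65a8880ba6772f612fbe6ff0eee22c939f1a47.anon9139110fa21b95d8c8729cf0bb6e4bb6452946d4"
print(is_anonymous_table(ref))  # True
print(is_anonymous_table("yourproject.my_dataset.my_table"))  # False
```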

Controlling the updates in my Database

I came here today to see if someone could give me a suggestion to improve the way I update my database.
Here is the problem: I have one file in which I store a new script every time I need to change something. For instance, let's say I need to add a new column to a table. I would add the following lines to my file, called script1.sql:
alter table CLIENTS
add AGE integer
After doing that, I am going to send it to a client with an updated application, and ask him to run script1.sql on his database. That works just fine for me.
The problem shows up when this file starts to get bigger, and the client needs to receive the new updates.
The client would run the script1.sql file again, but now with more updates. He will get errors indicating that a column named AGE already exists in the database.
The biggest problem is when I change the version of my application. If I update my application from Application1 to Application2, I also change the script from script1.sql to script2.sql.
Now, my client will need to run both to get to the correct version without conflicts. He will also get lots of errors, since almost everything from script1.sql was already processed in his database.
What I want is to eliminate the chance to face conflicts. This process has been working for me, but always causing some sort of trouble. Therefore, if anyone has any idea about how I could make it work better, please help me out.
Usually SQL provides something called IF EXISTS (and also IF NOT EXISTS), so for example you can write a statement such as:
CREATE TABLE IF NOT EXISTS users ...
Which will only create the users table if it hasn't already been created.
There is usually a variant of this that can be added to all your statements (including updates such as renaming columns etc).
Then if the table has already been added (or column updated etc) then it won't try to run that SQL command again - which means you can run the same file over and over as many times as you like.
(Note: this is called idempotency)
You will need to google for the details on how to use EXISTS for sql-server
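The idea of a re-runnable update script can be sketched as follows. This example uses Python's built-in sqlite3 module as a stand-in, not SQL Server, so the exact syntax is an assumption for illustration only: SQLite has no "add column if not exists", so the column check goes through its `PRAGMA table_info`. The helper names `column_exists` and `apply_updates` are hypothetical.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Check the table's metadata for a column (SQLite-specific PRAGMA)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def apply_updates(conn: sqlite3.Connection) -> None:
    """Apply every schema change, skipping the ones already in place."""
    conn.execute("CREATE TABLE IF NOT EXISTS CLIENTS (ID INTEGER PRIMARY KEY)")
    if not column_exists(conn, "CLIENTS", "AGE"):
        conn.execute("ALTER TABLE CLIENTS ADD COLUMN AGE INTEGER")
    conn.commit()


conn = sqlite3.connect(":memory:")
apply_updates(conn)
apply_updates(conn)  # second run changes nothing instead of raising an error
columns = [row[1] for row in conn.execute("PRAGMA table_info(CLIENTS)")]
print(columns)  # ['ID', 'AGE']
```

Because every change is guarded by a check, the client can run the whole script again after each release without hitting "column already exists" errors.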