I have a BigQuery scheduled query that is failing with the following error:
Not found: Dataset bunny25256:dataset1 was not found in location US at [5:15]; JobID: 431285762868:scheduled_query_635d3a29-0000-22f2-888e-14223bc47b46
I scheduled the query via the SQL Workspace. When I run the query in the workspace, it works fine. The dataset and everything else that I have created is in the same region: us-central1.
Any ideas on what the problem could be, and how I could fix it or work around it?
There's nothing special about the query: it computes some statistics on a table in dataset1 and writes them to dataset2.
When you submit a query, you submit it to BigQuery at a given location. The dataset you created lives in us-central1, but your query was submitted to US. The locations US (a multi-region) and us-central1 (a single region) are not the same. Change your scheduled query to run in us-central1. See the docs on locations for more info.
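As a rough illustration of pinning a job to the dataset's own location, here is a minimal sketch. It assumes the google-cloud-bigquery Python client (the question itself uses the console), whose `Client.get_dataset` returns an object with a `location` attribute and whose `Client.query` accepts a `location` argument. The helper name `run_in_dataset_location` is hypothetical.

```python
def run_in_dataset_location(client, sql, dataset_id):
    """Submit `sql` in the same location as the dataset `dataset_id`.

    `client` is assumed to behave like google.cloud.bigquery.Client.
    `dataset_id` is a "project.dataset" string.
    """
    # Look up where the dataset actually lives (e.g. "us-central1").
    dataset = client.get_dataset(dataset_id)
    # Submit the query to that location instead of the default one.
    return client.query(sql, location=dataset.location)


# Possible usage (requires credentials, so it is left commented out):
# from google.cloud import bigquery
# client = bigquery.Client()
# job = run_in_dataset_location(client, "SELECT 1", "bunny25256.dataset1")
# rows = list(job.result())
```

This only covers ad hoc jobs; a scheduled query still needs its own location set when it is created.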
The dataset is not provided correctly; it should be in the format project.dataset.table.
Try running the query below in BigQuery:
select * from bunny25256:dataset1
You should provide bunny25256:dataset1.table instead.
I've got data buckets set up in GCS, and I'm using BigQuery to load all the .csv files from that bucket into a table. That works flawlessly. I wrote a simple deduplication query that, when run manually, selects only distinct rows and creates a new table with "DeDuped" appended to the name (code below). That also runs flawlessly.
CREATE OR REPLACE TABLE
`project-name-123456.dataset_2022.dataset 2022 DeDuped` AS
SELECT
DISTINCT *
FROM
`project-name-123456.dataset_2022.dataset 2022`
The issue I am having is with scheduling that query. Every time it tries to run I get the error "Error status: Not found: Dataset project-name-123456:dataset_2022 was not found in location US; JobID: project-name-123456:628d7766-0000-2d36-a82f-94eb2c0a664a"
The only thing I can figure is that my dataset's data location is "us-central1", since it has a free tier. When I go to my scheduled query, whether I select the same data location or "Default", it always changes to "US Multiple".
Is there a way to fix this?
Or do I need to create my dataset in "US Multiple"?
I'm trying to cut costs as much as possible by keeping it in us-central1.
EDIT: Seems like I just needed to delete and recreate the scheduled query again. Chatted with Google Support and they sorted it. Sorry all!
When trying to schedule a query in BQ, I am getting the following error:
Error code 3 : Query error: Not found: Dataset was not found in location EU at [2:1]
Is this a permissions issue?
This sounds like a case of the scheduled query being configured to run in a different region than either the referenced tables, or the destination table of the query.
Put another way, BigQuery requires a consistent location for reading and writing, and does not allow a query in location A to write results in location B.
https://cloud.google.com/bigquery/docs/scheduling-queries has some additional information about this.
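To make the consistency rule concrete, the following sketch compares a job's location with the locations of the datasets it touches. It is plain Python with no BigQuery calls. The function name and the dataset names (`source_ds`, `dest_ds`) are hypothetical, and the location strings would have to be looked up separately, for example in each dataset's details page.

```python
def find_location_mismatches(job_location, dataset_locations):
    """Return the datasets whose location differs from the job's location.

    `dataset_locations` maps a dataset name to its location string.
    The comparison is case-insensitive, so "US" and "us" match,
    but "US" and "us-central1" do not.
    """
    wanted = job_location.strip().lower()
    return {
        name: loc
        for name, loc in dataset_locations.items()
        if loc.strip().lower() != wanted
    }


mismatches = find_location_mismatches(
    "EU", {"source_ds": "EU", "dest_ds": "europe-west1"}
)
print(mismatches)  # {'dest_ds': 'europe-west1'}
```

An empty result means every dataset shares the job's location. Any entry returned is a candidate cause of the "Not found ... in location" error.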
I am getting this error quite frequently while trying to create a scheduled query
Error creating scheduled query: Cannot create a transfer in
JURISDICTION_US when destination dataset is located in
REGION_ASIA_SOUTHEAST_1
I just need a scheduled query to overwrite data in a table.
I had the same problem while trying to create a scheduled query with python:
400 Cannot create a transfer in REGION_EUROPE_WEST_1 when destination dataset is located in JURISDICTION_EU
I figured out that even though my project is located in europe-west1, my destination dataset was in the multi-region location EU. I had to change my parent path from parent=project_path to '{project_path}/locations/eu' to make it work.
I hope that it helps someone.
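The parent path fix described above could be sketched like this. The helper name `transfer_parent` and the project ID `my-project` are hypothetical. The assumption is that the resulting string is what gets passed as `parent` when creating the transfer config, with the location set to that of the destination dataset.

```python
def transfer_parent(project_id, location):
    """Build a location-qualified parent path for a scheduled query.

    `location` should match the destination dataset's location,
    e.g. "eu" for the EU multi-region or "europe-west1" for a region.
    """
    return f"projects/{project_id}/locations/{location}"


parent = transfer_parent("my-project", "eu")
print(parent)  # projects/my-project/locations/eu
```

Without the `/locations/...` suffix, the location seems to default to something else, which appears to be how the mismatch in the error message arose.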
It looks like a bug in BQ.
I had the same problem, with the source and destination datasets both located in the EU.
Just for testing, I changed the destination to another EU dataset, and it worked.
Finally, I updated the scheduled query to use my original destination, and now it works. I can't explain why, but it seems to be a workaround.
You can also try starting from the Scheduled Queries page in the BigQuery UI and clicking the "+ create schedule query" button; that way I don't get the error. If I start directly in the BigQuery UI, I get the same error.
From what I tried, it may happen because a table already exists with the same ID as the destination table. This happens even if that table was created by manually running the same query and saving the results.
I faced the same issue recently. I tried two things, and they worked:
1. Set the query location to the location of the destination dataset/table, then schedule the query.
2. If that does not work, run the query first and save the results to the intended table in BigQuery, so that the destination table already exists. Then schedule the query.
Each approach worked for me in a different case.
I had this error and tried many of the solutions in this thread. I tried a new session in an incognito window and it worked so I believe this is a transient issue as suggested.
I just scheduled a trivial query (select 1) and then edited it to the one I needed, and it worked.
I think the trouble is with the schedule's start time. If it is in the past relative to local time, then BQ tries to run the request on another server.
I had the same issue. The way I solved it was to disable the editor tabs (there is a button at the top). Then I opened the query settings and set the processing location to EU manually.
I was using the bq command when I came across this issue and was able to resolve it by adding the parameter --location='europe-west1'.
So my final query looked like this:
bq query \
--use_legacy_sql=false \
--display_name='my_table' \
--location='europe-west1' \
"create or replace table my_dataset.my_table as (select * from external_query('projects/my_mysql_connection/locations/europe-west1/connections/bi', '(select * from my_table)'))"
I have this simple query, which runs fine in Hive 0.8 in IBM BigInsights 2.0:
SELECT * FROM patient WHERE hr > 50 LIMIT 5
However, when I run this query using Hive 0.12 in BigInsights 3.0, it runs forever and returns no results.
The scenario is the same for the following query and many others:
INSERT OVERWRITE DIRECTORY '/Hospitals/dir' SELECT p.patient_id FROM
patient1 p WHERE p.readingdate='2014-07-17'
If I exclude the WHERE clause, the query works fine in both versions.
Any idea what might be wrong with Hive 0.12 or BigInsights 3.0 when the query includes a WHERE clause?
When you use a WHERE clause in the Hive query, Hive will run a map-reduce job to return the results. That's why the query usually takes longer to run: without the WHERE clause, Hive can simply return the content of the file that represents the table in HDFS.
You should check the status of the map-reduce job that is triggered by your query to find out if an error happened. You can do that by going to the Application Status tab in the BigInsights web console and clicking on Jobs, or by going to the job tracker web interface. If you see any failed tasks for that job, check the logs of the particular task to find out what error occurred. After fixing the problem, run the query again.
I keep getting an error using the bq command-line tool. For example, I can easily run this command, and it returns the table that I want:
head -n 10 xxxx-bq:name_name.Report2
Note that xxxx-bq is the project ID, and name_name is the dataset ID. When I try to run a query against this table, say the following:
query "SELECT count(*) FROM xxxx-bq:name_name.Report2"
I get an error that says that I cannot start a job without a project id. What am I doing wrong here? How can I specify in the query the project ID? I know people have asked some similar questions. That said, I have been following along and my approach is not working.
Do you have a project id? If not, this page can help you set one up: https://developers.google.com/bigquery/bq-command-line-tool-quickstart
All BigQuery jobs (which include queries) require a project id, which is the project that gets billed for any damage done by the job. (by damage, I mean work)
You should either set your default project id (you can do this by running bq init)
or set the project id that you're running the job under via --project_id=
So if you're running bq shell, you would use bq shell --project_id=myprojectid instead.
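For scripted use, a small sketch like the one below builds the bq invocation with an explicit project ID. The helper name `bq_query_argv` is hypothetical. It assumes the flag ordering documented in the bq reference, where global flags such as --project_id come before the subcommand. The list could then be handed to `subprocess.run`.

```python
import shlex


def bq_query_argv(project_id, sql):
    """Build an argv list that runs `sql` under `project_id` with bq."""
    # Global flags such as --project_id are placed before the subcommand.
    return ["bq", f"--project_id={project_id}", "query", sql]


argv = bq_query_argv("myprojectid", "SELECT 1")
print(shlex.join(argv))  # bq --project_id=myprojectid query 'SELECT 1'
```

Passing the arguments as a list avoids shell quoting problems when the SQL itself contains quotes.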
Strange... I just started working with bq and got the same error, but it didn't like me passing --project_id=[myprojectid]. Although I was already authenticated with gcloud auth login, I had to run bq init (which seemingly didn't do anything). After that, my queries worked just fine.