BigQuery "backend error" on loading - google-bigquery

We are seeing a number of backend errors on BigQuery's side when loading data files. The occasional backend error seems to be normal, occurring once or twice daily in our load jobs, which run every hour. In the last three days, however, we've had around 200 backend errors, and this is causing cascading problems in our system.
Until three days ago the system had been stable. The error is a simple "Backend error, try again." Usually the load job works when it's run again, but in the last three days the problem has become much worse. Please let me know if you need any other information from me.
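For reference, the retry pattern described looks roughly like the sketch below, assuming the google-cloud-bigquery Python client; the URI, table ID, and backoff values are placeholders.
import time
from google.cloud import bigquery

client = bigquery.Client()

def load_with_retry(uri, table_id, max_attempts=3):
    # Submit a CSV load job; on a transient failure, back off and resubmit.
    job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.CSV)
    for attempt in range(1, max_attempts + 1):
        job = client.load_table_from_uri(uri, table_id, job_config=job_config)
        try:
            job.result()  # blocks until the job finishes; raises on error
            return job
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff

load_with_retry("gs://your-bucket/export-*.csv", "your_dataset.your_table")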

We did have an issue on Friday where all load jobs were returning backend errors for a period of a few hours, but that should have been resolved. If you have seen backend errors since then, please let us know and send a job ID of a failing job.
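If it helps, a job's stored error details can be pulled back by ID. A minimal sketch with the google-cloud-bigquery Python client ("your_job_id" is a placeholder):
from google.cloud import bigquery

client = bigquery.Client()
job = client.get_job("your_job_id")  # e.g. "bquijob_..."
print(job.job_type, job.state)
print(job.error_result)              # the terminal error, if the job failed
for err in job.errors or []:         # any additional error records
    print(err.get("reason"), err.get("message"))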

Related

Current session is no longer available due to structural changes in the database - Tabular

We are using a SQL Server Tabular model for self-service BI purposes. On a monthly basis we have some 90 distinct people using the model. Recently we encountered some issues/errors in the client tools (Excel and Power BI) that connect to the Tabular model. See screenshots. We did not make any significant changes to the model in the past period.
We noticed that the errors keep showing up after our incremental load, i.e. a full process of a number of partitions; we process these partitions every 15 minutes. The process is kicked off by an SSIS job which is scheduled every 15 minutes and processes 5 partitions in 3 tables.
Edit: After some research I figured out that the problem lies in the perspectives. Every time I do a full process on any object, the error appears. This does not happen on the default model view. I still haven't found a solution though.
The error occurs when you make a change to the Power BI report or the Excel file, for example when you do a refresh or when you click a filter. If you press refresh multiple times, the connection comes back and everything works as it is supposed to. It seems like the clients lose their connection to the model. After 15 minutes the problem occurs again.
This is very aggravating for the users, especially when they are in the middle of a presentation.
This is what we tried:
We tried searching Google for a solution
Checked that we have the latest SQL Server 2016 update (13.0.5149.0)
SSAS builds from Visual Studio (2015 and 2017)
No full process on tables, only on partitions
Upgrading the server from 4 to 8 CPU cores
I hope somebody can help us.
You shouldn't see the error you are describing from just a full process of a partition, or even of the full table. We do this every hour for a number of core tables and we do not see any issues like this (and we would).
I am starting from the hypothesis that:
Your 15 minute process is doing more than just processing the partitions with a refresh command
Something else is happening on the environment (either scheduled or not). Who has permissions to change the schema? Could it be users or developers, deliberately or not, making changes?
The only things that should cause that kind of error would be Alter, Delete or CreateOrReplace TMSL commands
So unless that triggers your own ideas on a diagnostic process, I would do the following steps.
Note: I presume that your users also see this issue on your test environment when you run your 15 minute processing routine there. You should do the following on that test environment, where nothing else is running, to eliminate the possibility of someone else interfering with the experiment. If you don't have a representative test environment then you will have to do this on live, but I would do it out of hours, or under some kind of change-control process, with your 15 minute refresh turned off and admin permissions to the cube heavily locked down to ensure that nothing can interfere with your experiment.
First prove that you can reproduce this issue with the 15 minute routine
Get your sample Power BI report that is known to present the error (I'd prefer Power BI for a repro, as it is slightly simpler than Excel)
Refresh your Power BI report and explore the data to prove that the error doesn't occur
Run your 15 minute process
You should now see the problem reported. If you do, great, you have a reproducible issue! If you don't, then it is not quite as you thought, and you need to find a way of reliably reproducing these errors (perhaps something else is happening that isn't the 15 minute process)
Now that you are sure how to reproduce the issue, you need to isolate whether it is really the processing that is causing the problem
Refresh your Power BI report and explore the data to prove that the error doesn't occur
Execute (via SSMS) the XMLA/TMSL that processes the entire database
it should look something like this:
{
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "yourdbname"
      }
    ]
  }
}
Do the thing that your users do when they see the issue.
If you too see the issue, then I would raise it with Microsoft Support, as this shouldn't happen
If you don't see the issue, then you can refine the processing to just the partition for a single table (see the sketch just after these steps). But as we have done a process for the entire database above, it shouldn't change the result
If you still don't see the issue, then it isn't the processing that is causing this (which is what I suspect) and it is something else in the 15 minute routine that is causing it. Look deeper into that process and understand what else it is doing.
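As a sketch of that refinement, the partition-level refresh would look something like this (database, table, and partition names are placeholders):
{
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "yourdbname",
        "table": "yourtablename",
        "partition": "yourpartitionname"
      }
    ]
  }
}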
Alongside this, checking the logs should show whether there are any other processing tasks or other types of XMLA commands happening.
I hope these ideas get you closer to finding the actual activity that is causing this experience for your users. It would be great if you could post back with how you got on and what you found.
I have the same problem here if I install the latest CU on my SQL Server 2017. My production environment is still running CU3 (Jan/2018) because of this problem.
Knowing that, I would suggest reverting your installation to a previous release, maybe 13.0.5026.0 (SP2) or even 13.0.4466.4 (Jan/2018).
I am facing the same issue with SQL Server 2017 CU11 installed.
The issue indeed occurs in the case of a full refresh in combination with the use of a perspective in an existing connection. The workaround of using the default 'Model' view in the connection does indeed 'solve' the issue.
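For anyone looking for that workaround in practice: in the client connection, the perspective is selected through the Cube property, so the difference is roughly as follows. Server, database, and perspective names are placeholders, and the exact provider string may differ in your setup.
Connection using a perspective (hits the error after a full process):
Provider=MSOLAP;Data Source=yourserver;Initial Catalog=YourTabularDb;Cube=YourPerspective
Connection using the default model view (the workaround):
Provider=MSOLAP;Data Source=yourserver;Initial Catalog=YourTabularDb;Cube=Model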

BigQuery load failing with backend error on nearly every load attempt

I have been trying in vain for nearly two days to load in two large datasets, each of which is ~30 GB and split into 50 uncompressed ~600 MB files, all coming from a bucket. Almost every time, the jobs fail with an "internal" or "backend" error.
I have tried submitting with a wildcard (as in *.csv) and I have also tried the individual files.
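For context, the wildcard submission looks roughly like this sketch, assuming the google-cloud-bigquery Python client; bucket, prefix, and table names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # assumes each file has a header row
)
job = client.load_table_from_uri(
    "gs://your-bucket/part-*.csv",            # wildcard over the 50 files
    "your_project.your_dataset.your_table",
    job_config=job_config,
)
job.result()  # raises with backendError / internalError details on failure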
On the rare occasions when the load job does not fail within a few minutes, it eventually dies after 6 or 7 hours.
I split the files and left them uncompressed to help with load times; could this be causing an issue? I did have a compressed version load successfully after about 7 hours yesterday, but so far I have only been able to load a single 350 MB uncompressed CSV from the bucket.
Here is an example:
Errors:
Error encountered during execution. Retrying may solve the problem. (error code: backendError)
Job ID bvedemo:bquijob_64ebebf1_1532f1b3c4f
A backend error would imply something is happening at Google, but I must be doing something wrong for it to fail this often!
Lesson of the day: do not try to load data from a nearline bucket into BigQuery.
I moved the data into a standard bucket, reloaded from there, and 65 GB of data loaded in less than 1 minute.
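If moving the data is the fix, the storage class can also be rewritten in place rather than copying to a new bucket. A minimal sketch with the google-cloud-storage Python client (bucket name and prefix are placeholders):
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("your-bucket")
for blob in bucket.list_blobs(prefix="exports/"):
    if blob.storage_class != "STANDARD":
        blob.update_storage_class("STANDARD")  # server-side rewrite of the object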

BigQuery giving success message, but data not saved and not shown when querying using the Google web client

Project number: 149410454586
We are facing the same issue mentioned below:
BigQuery streaming insert data availability delay
We are getting a success response from BigQuery, but the data is not visible when querying it. It is strange; it seems to have happened overnight, and until then it was working fine.
We are supposed to go to deployment for UAT tomorrow and never expected this to happen; we are now helpless. Kindly help us out on this.
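One thing worth ruling out on the client side: a success response from a streaming insert can still carry per-row errors, so "success" should be checked against the returned error list. A minimal sketch assuming the google-cloud-bigquery Python client (table and row contents are placeholders):
from google.cloud import bigquery

client = bigquery.Client()
rows = [{"id": 1, "name": "example"}]  # placeholder rows matching your schema

# An empty list means the rows were accepted into the streaming buffer;
# it does not guarantee they are immediately visible to every query.
errors = client.insert_rows_json("your_project.your_dataset.your_table", rows)
if errors:
    print("per-row failures:", errors)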
I faced the same issue an hour ago. Streamed-in data is not available for queries, and it is not a warm-up issue.
Project Number: 272433143339
It seems to be related to https://status.cloud.google.com/incident/bigquery/18010

BigQuery backend error when exporting results to a table

For some time the query has run perfectly, but lately this error has been appearing: "Backend Error".
I know that my query is huge and takes about 300 seconds to execute, but I imagine this is some BigQuery bug, so I wonder why this error is happening.
This error started appearing when I was executing some other queries, when I just wanted the results without exporting them.
So I started creating a table with the results, hoping that BigQuery would be able to perform the query.
Here is an image that shows the error:
I looked up your job in the BigQuery job database, and it completed successfully after 160 seconds.
BigQuery queries are fundamentally asynchronous. That is, when you run a query, it runs as a named job in the BigQuery service. Since the original call may time out, the usual best practice is to poll for completion using the jobs.getQueryResults() API. My guess is that this is the API call that actually failed.
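A sketch of that polling pattern with the google-cloud-bigquery Python client (the query text is a placeholder); the point is that status and result calls can be retried without re-running the query itself:
import time
from google.cloud import bigquery

client = bigquery.Client()
job = client.query("SELECT COUNT(*) FROM `your_dataset.your_table`")

while True:
    job.reload()              # jobs.get: refreshes the job's server-side state
    if job.state == "DONE":
        break
    time.sleep(5)

if job.error_result:
    raise RuntimeError(job.error_result)
rows = list(job.result())     # jobs.getQueryResults: safe to retry on error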
We had reports of an elevated number of backend errors yesterday and we're still investigating. However, these don't appear to be actual failed queries; instead they are failures in getting the status of queries or getting the results, which should go away on retrying.
How did you run the query? Did you use the BigQuery web UI? If you are using the API, did you call the bigquery.jobs.insert() API or the bigquery.jobs.query() API?

Abort a table import stuck in 'pending'

Similar questions have been asked, but they are not exactly what I am looking for.
The problem: on some occasions, importing a table from Google Cloud to BigQuery gets stuck in a 'pending' state for hours if not days. Tables that get stuck in this state never seem to come out of it, or at least we didn't bother waiting that long. I know it's not a queue issue, since in the meantime we can import other tables just fine. No errors are returned by BigQuery.
My question: in this situation, and in general, how can we safely abort/cancel an import to BigQuery without having the table quietly import on us later without us knowing? This would apply to any table regardless of its state, as long as it hasn't finished importing.
Thanks.
You may be hitting load job rate limits. For example, if you try to start more than two load jobs per minute for the same table, the load jobs against that table will be deferred, while load jobs against other tables may continue at normal speed.
There are per-project limits on the rate at which load jobs can be started, and limits on the number of load jobs that can be running per project at any one time. If you send jobs faster than this, we'll queue them, but as you've noticed, our queueing is not a fair queue and can start newer jobs before older ones.
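To see whether queueing is what's happening, the project's queued load jobs can be listed. A minimal sketch with the google-cloud-bigquery Python client:
from google.cloud import bigquery

client = bigquery.Client()

# "pending" jobs are queued, not lost; a long-lived backlog here suggests
# the per-table or per-project load rate limits are being hit.
pending_loads = [j for j in client.list_jobs(state_filter="pending")
                 if j.job_type == "load"]
print(len(pending_loads), "load jobs still queued")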
Aborting pending jobs is a commonly requested feature. If you file a feature request here, that will help us prioritize it.
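For readers finding this later: the API has since gained a jobs.cancel call. If your client library exposes it, a best-effort cancel looks like the sketch below (the job ID is a placeholder); note that cancellation is advisory and the job may still complete.
from google.cloud import bigquery

client = bigquery.Client()

# Best-effort: asks the service to cancel; the job may still run to completion.
job = client.cancel_job("your_job_id")
print(job.state)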