Random simple queries are failing on BigQuery - google-bigquery

For the BQ team: queries that usually work are sometimes failing.
Could you please look into what the issue could be? The only message is this:
Error: Unexpected. Please try again.
Job ID: aerobic-forge-504:job_DTFuQEeVGwZbt-PRFbMVE6TCz0U

Sorry for the slow response.
BigQuery replicates data to multiple datacenters and may run queries from any of them. Usually this is transparent ... if your data hasn't replicated everywhere, we will try to find a datacenter that has all of the data necessary to run a query and execute the query from there.
The error you hit was due to BigQuery not being able to find a single datacenter that had all of the data for one of your tables. We try very hard to make sure this doesn't happen. In principle, it should be very rare (we've got a solution designed to make sure that it never happens, but haven't finished the implementation yet). We saw an uptick in this issue this morning, have filed a bug, and are currently investigating.
Was this a transient error? If you retry the operation, does it work now? Are you still getting errors on other queries?
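If the error does turn out to be transient, a client-side retry with backoff is a common way to ride it out. Below is a minimal sketch of that idea, not BigQuery client code: `run_query` and `TransientQueryError` are hypothetical names standing in for whatever call and retryable exception your client library actually exposes.

```python
import random
import time


class TransientQueryError(Exception):
    """Hypothetical stand-in for a retryable error such as 'Unexpected. Please try again.'"""


def run_with_retry(run_query, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call run_query() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except TransientQueryError:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Exponential backoff (base_delay, 2x, 4x, ...) plus random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is only there so the waiting can be swapped out in tests. Errors that are not transient (a syntax error, a missing table) should not be retried this way.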

GCP Bigquery run issue

I am running a query in GCP console. The script is correct but it shows the below message while I run the query.
"Results are not displayed because the table has a large amount of cells and may cause the BigQuery console to become unresponsive. Consider modifying your query to improve browser performance."
What is the solution?
This is just BigQuery's mechanism to avoid overwhelming your browser.
You can still see the results by clicking on View results.
If you think this message is confusing or something should be changed, I encourage you to create a Public Issue sharing your thoughts. You can do that through this link
You may have accidentally produced a huge CROSS JOIN in your query. Take a look at your query first. BigQuery will compute the results if you force it, but doing so will use a lot of slots and can run up a high bill.

Current session is no longer available due to structural changes in the database - Tabular

We have a SQL Server Tabular model that we use for self-service BI. On a monthly basis, about 90 distinct people use the model. Recently we started seeing errors in the client tools (Excel and Power BI) that connect to the Tabular model. See screenshots. We have not made any significant changes to the model recently.
We noticed that the errors keep showing up after our incremental load, i.e. a full process of a number of partitions. The process is kicked off by an SSIS job that is scheduled every 15 minutes and processes 5 partitions in 3 tables.
Edit: After some research I figured out that the problem lies in the perspectives. Every time I do a full process on any object, the error appears. This does not happen on the default model view. I still have not found a solution, though.
The error occurs when you make a change to the Power BI report or the Excel file, for example when you do a refresh or click a filter. If you press refresh multiple times, the connection comes back and everything works as it is supposed to. It seems like the clients lose their connection to the model. After 15 minutes the problem occurs again.
This is very aggravating for the users, especially when they are in the middle of a presentation.
This is what we tried:
Searched Google for a solution
Checked that we have the latest SQL Server 2016 update (13.0.5149.0)
Tried SSAS builds from Visual Studio (2015 and 2017)
Did no full process on tables, only on partitions
Upgraded the server from 4 to 8 CPU cores
I hope somebody can help us.
You shouldn't get the error you are seeing from just a full process of a partition, or even of the full table. We do this every hour for a number of core tables and we do not see any issues like this (and we would notice if we did).
I am starting from the hypothesis that either:
Your 15 minute process is doing more than just processing the partitions with a refresh command, or
Something else is happening on the environment (either scheduled or not). Who has permissions to change the schema? Could users or developers be making changes, deliberately or not?
The only things that should cause that kind of error are Alter, Delete or CreateOrReplace TMSL commands.
So unless that triggers your own ideas for a diagnostic process, I would take the following steps.
Note: I presume that your users also see this issue on your test environment when you run your 15 minute processing routine there. You should do the following on that test environment, where nothing else is running, to eliminate the possibility of someone else interfering with the experiment. If you don't have a representative test environment then you will have to do it on live, but I would do this out of hours or under some kind of change control process, with your 15 minute refresh turned off and admin permissions to the cube heavily locked down, to ensure that nothing can interfere with your experiment.
First prove that you can reproduce this issue with the 15 minute routine
Get your sample Power BI report that is known to present the error (I'd prefer Power BI for a repro as it is slightly simpler than Excel)
Refresh your Power BI report and explore the data to prove that the error doesn't occur
Run your 15 minute process
You should now see the problem reported. If you do, great, you have a reproducible issue! If you don't, then it is not quite as you thought it was and you need to find a way of reliably reproducing these errors (perhaps something else is happening that isn't the 15 minute process).
So now you are sure how you can reproduce the issue, you need to isolate whether it is really the processing that is causing the problem
Refresh your Power BI report and explore the data to prove that the error doesn't occur
Execute (via SSMS) your XMLA that processes the entire database
It should look something like this:
{
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "yourdbname"
      }
    ]
  }
}
Do the thing that your users do when they see the issue.
If you see the issue too, then I would raise it with Microsoft Support, as this shouldn't happen
If you don't see the issue then you can refine this processing to just be the partition for a single table. But as we have done a process for the entire db above, it shouldn't change the result
If you still don't see the issue then it isn't the processing that is causing this issue (which I suspect) and it is something else in the 15 minute routine that is causing it. Look deeper into that process and understand what else it is doing.
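The narrowing-down described in these steps (whole database, then one table, then one partition) can be scripted so that each run uses exactly the same command shape. A minimal sketch, using the `table` and `partition` keys that a TMSL refresh object accepts alongside `database`. The helper `refresh_command` and the names `yourtable` and `yourpartition` are made up for this example; substitute your own.

```python
import json


def refresh_command(database, table=None, partition=None, refresh_type="full"):
    """Build a TMSL refresh command scoped to a database, table, or partition."""
    if partition is not None and table is None:
        raise ValueError("a partition must be scoped to a table")
    target = {"database": database}
    if table is not None:
        target["table"] = table
    if partition is not None:
        target["partition"] = partition
    return {"refresh": {"type": refresh_type, "objects": [target]}}


# Print a partition-level command that can be pasted into an XMLA query window in SSMS.
print(json.dumps(refresh_command("yourdbname", "yourtable", "yourpartition"), indent=2))
```

Calling `refresh_command("yourdbname")` with no table or partition gives the same database-level command shown earlier in this answer.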
Alongside this, checking the logs should show whether any other processing tasks or other types of XMLA are being run.
I hope these ideas get you closer to finding the actual activity that is causing this experience for your users. It would be great if you could post with how you got on and what you found.
I have the same problem here if I install the latest CU on my SQL Server 2017. My production environment is still running with CU3 (Jan/2018) due to this problem.
Knowing that, I would suggest reverting your installation to a previous release, maybe 13.0.5026.0 (SP2) or even 13.0.4466.4 (Jan/2018).
I am facing the same issue with SQL Server 2017 CU 11 installed.
The issue indeed occurs in case of a 'full refresh' in combination with the use of a 'perspective' in an existing connection. The workaround to use the default 'Model' in the connection does indeed 'solve' the issue.

Bigquery Cache Not Working

I noticed that BigQuery no longer caches the same query, even though I have chosen to use the cache in the GUI (both Alpha and Classic). I didn't edit the query at all; I just kept clicking the run query button, and every time the GUI executed the query without using cached results.
It happens in my PHP script as well. Before, it was able to use the cache and came back with results very quickly. Now it executes the query every time, even when the same query was executed minutes ago. I can confirm the behaviour in the logs.
I am wondering whether anything has changed in the last few weeks. Or is there some kind of account-level setting that controls this? It was working fine for me before.
As per the official docs here, the cache is disabled when:
...any of the tables referenced by the query have recently received streaming inserts...
Even if you are streaming to one partition and querying another, this invalidates caching for the whole table. There is an open feature request asking for cache hits to be possible when streaming inserts go to one partition and the query reads a different one.
EDIT:
After some investigation I found out that some months ago there was an issue that allowed queries to hit the cache even while streaming inserts were being made. This was not expected behavior, and it was fixed in May. I guess this is the change you have experienced and are asking about.
Docs have not changed related to this, and they aren't/weren't incorrect. Just the previous behavior was the incorrect one.
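To confirm from job metadata (rather than from timing) whether a run was served from cache, the job resource returned by the BigQuery REST API reports a `cacheHit` flag under `statistics.query`. A minimal sketch that reads that flag from already-fetched job resources; the helper `cache_hits` and the sample job IDs are made up for illustration, and fetching the jobs is left to whatever client you use.

```python
def cache_hits(jobs):
    """Map each job ID to whether its query result came from cache."""
    result = {}
    for job in jobs:
        job_id = job["jobReference"]["jobId"]
        query_stats = job.get("statistics", {}).get("query", {})
        # A missing flag is treated as a cache miss.
        result[job_id] = bool(query_stats.get("cacheHit", False))
    return result


# Made-up job resources, trimmed to the fields used here.
sample_jobs = [
    {"jobReference": {"jobId": "job_a"}, "statistics": {"query": {"cacheHit": True}}},
    {"jobReference": {"jobId": "job_b"}, "statistics": {"query": {"cacheHit": False}}},
    {"jobReference": {"jobId": "job_c"}, "statistics": {}},
]
print(cache_hits(sample_jobs))
```

If every run against a table that receives streaming inserts reports a miss while the same query against a static table reports hits, that matches the behaviour described above.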

BigQuery UDF works in one project but not another

I have been using UDFs for a few months now with a lot of success. Recently, I set up separate projects for development, and I stream a sample of 1/10 of our web tracking data into these projects.
What I'm finding is that the UDFs I use in production, which operate on the full dataset, are working, while the exact same query in our development project consistently fails, despite querying 1/10 of the data. The error message is:
Query Failed
Error: Resources exceeded during query execution: UDF out of memory.
Error Location: User-defined function
I've looked through our Quotas and haven't found anything that would be limiting the development project.
Does anybody have any ideas?
If anybody can look into it, here are the job IDs:
Successful query in production: bquijob_4af38ac9_155dc1160d9
Failed query in development: bquijob_536a2d2e_155dc153ed6
Jan-Karl, apologies for the late response; I've been out of the country to speak at some events in Japan and then have been dealing with oncall issues with production.
I finally got a chance to look into this for you. The two job ids you sent me are running very different queries. The queries look the same, but they're actually running over views, which have different definitions. The query that succeeded is a straight select * from table whereas the one that has the JS OOM is using a UDF.
We're in the midst of rolling out a fix for the JS OOM issue, by allowing the JavaScript engine to use more RAM.
...
...and now for some information that's not really relevant to this case, but that might be of future value...
...
In theory, it could be possible for a query to succeed in one project and fail in another, even if they're running over exactly the same dataset. This would be unusual, but possible.
Background: BigQuery operates and maintains copies of customer data in multiple datacentres for redundancy. Different projects are biased to run in different datacentres to help with load spreading and utilisation.
A query will run in the default datacentre for its project if the data is fresh enough. We have a process that replicates the data between datacentres, and we avoid running in a datacentre that has a stale copy of the data. However, we run maintenance jobs to ensure that the files that comprise your data are of "optimal" size. These jobs are scheduled separately per datacentre, so it's possible that your underlying data files for the same exact table would have a different physical structure in cell A and cell B. It would be possible for this to affect aspects of a query's performance, and in extreme cases, a query may succeed in cell A but not B.

SQL Query hangs and makes version store grow

First off, I apologize if this is the wrong place for this question, but I haven't found any other place that might help me out.
I have a query on a SQL Server instance that keeps running indefinitely, and as a result the version store grows and tempdb grows as well. Currently I don't have the source code.
I would like to get a few pointers for where to search for the cause of this problem.
In activity monitor all I see is a process with a taskstate of SUSPENDED, and Wait Type of ASYNC_NETWORK_IO_WRITELOG. I'm running this on SQL Server 2008.
Again sorry if this is the wrong place for asking this.
/Andy.l
First, I have no experience with SQL Server 2008, but in my experience SQL Server 2000 sometimes got itself into nasty locking situations when running in a multicore environment. I would try rerunning the query with OPTION (MAXDOP 1). It costs you almost nothing to check.
At least with older SQL Server versions, a SELECT could be blocked by other sessions, so I would suspect something like that in your case (even though you mention the "version store", which seems to indicate that you have enabled the new snapshot isolation mode).
Running sp_who2 will give you more details about whether it is a blocking problem or not.
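When there are many sessions, the blocking chain can be hard to read by eye. A minimal sketch that finds the sessions at the head of a chain, assuming you have already fetched `session_id` and `blocking_session_id` for each running request (both are columns of `sys.dm_exec_requests`; the BlkBy column of sp_who2 carries the same information). The helper `head_blockers` and the sample rows are made up for illustration.

```python
def head_blockers(requests):
    """Return session IDs that block others but are not blocked themselves."""
    blocked_by = {r["session_id"]: r["blocking_session_id"] for r in requests}
    # Zero or missing means "not blocked"; negative values are special markers, not sessions.
    blockers = {b for b in blocked_by.values() if b and b > 0}
    # A blocker with no running request of its own (e.g. an idle session with an
    # open transaction) is absent from blocked_by and counts as a head blocker.
    return sorted(s for s in blockers if not blocked_by.get(s))


# Made-up rows: 51 waits on 60, 60 waits on 72, and 72 has no active request.
sample_requests = [
    {"session_id": 51, "blocking_session_id": 60},
    {"session_id": 60, "blocking_session_id": 72},
    {"session_id": 55, "blocking_session_id": 0},
]
print(head_blockers(sample_requests))
```

An empty result means no request is currently blocked by another session, which would point away from blocking as the cause.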