How do I increase the SQL Lab timeout in Apache Superset? I have tried increasing SUPERSET_TIMEOUT, but it does not help: queries still time out after 30 seconds, so I am not able to run a complex query.
See https://superset.incubator.apache.org/faq.html#why-are-my-queries-timing-out for possible reasons why queries time out, and for solutions.
You can modify the timeout parameter in config.py (file path: lib/python3.7/site-packages/superset-0.30.1-py3.7.egg/superset):
# Timeout duration for SQL Lab synchronous queries
SQLLAB_TIMEOUT = 30
In my case (Superset 0.23.3 on HDP 3.1.0) adding this configuration option to superset_config.py increased the timeout:
SQLLAB_TIMEOUT=120
Also, keep in mind that SUPERSET_TIMEOUT should be equal to or greater than the number above; otherwise Gunicorn will drop the request before it finishes.
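A minimal sketch of what that could look like in superset_config.py (the values are only illustrative, and exact key names can vary between Superset versions):
# superset_config.py -- illustrative values only
SQLLAB_TIMEOUT = 120     # SQL Lab synchronous query timeout, in seconds
SUPERSET_TIMEOUT = 120   # keep this >= SQLLAB_TIMEOUT so the web server
                         # does not drop the request first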
I have a dataset with 1 billion rows stored in Hive, with Impala as a layer between Hive and Superset. Queries run in Superset are capped at a row limit of 100,000, and I need to remove that limit. Furthermore, I need to build a visualization from what the queries return in SQL Lab, but that cannot be done because there is also a timeout/cache limit. So if I can increase the row limit in SQL Lab and the timeout/cache limit for visualizations, I think there will be no problem.
I am trying my best to answer below. Please back up all config files before changing them.
For the SQL row limit issue (a combined sketch follows after the timeout list below):
modify the config.py file inside 'anaconda3/lib/python3.7/site-packages' and set
DEFAULT_SQLLAB_LIMIT to 1000000000
QUERY_SEARCH_LIMIT to 1000000000
modify viz.py and set
filter_row_limit to 1000000000
For the timeout issue, increase the values of the parameters below (see the sketch after this list):
For synchronous queries, change in superset_config.py:
SUPERSET_WEBSERVER_TIMEOUT
SQLLAB_TIMEOUT
SUPERSET_TIMEOUT (this value should be >= SQLLAB_TIMEOUT)
For async queries:
SQLLAB_ASYNC_TIME_LIMIT_SEC
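A rough sketch of how the settings above might look in superset_config.py (the values are purely illustrative; which keys exist depends on your Superset version, so verify them against your install):
# superset_config.py -- illustrative values, adjust to your needs
# Row limits
DEFAULT_SQLLAB_LIMIT = 1000000000
QUERY_SEARCH_LIMIT = 1000000000

# Synchronous query timeouts (seconds)
SUPERSET_WEBSERVER_TIMEOUT = 300
SQLLAB_TIMEOUT = 300
SUPERSET_TIMEOUT = 300              # should be >= SQLLAB_TIMEOUT

# Asynchronous query timeout (seconds)
SQLLAB_ASYNC_TIME_LIMIT_SEC = 3600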
There must be config parameters to change the max row limit in site-packages/superset: DEFAULT_SQLLAB_LIMIT to set the default, and SQL_MAX_ROW to set the maximum in SQL Lab.
I guess we have to run superset_init again for the change to take effect in Superset.
I've been able to solve the problem as follows:
modify config.py in site-packages/superset:
increase SQL_MAX_ROW from 100,000.
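If you would rather not edit the installed config.py directly, the same key can usually be overridden from superset_config.py instead; a one-line sketch with an illustrative value:
# superset_config.py -- illustrative value
SQL_MAX_ROW = 1000000   # raise the SQL Lab hard cap above the default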
I currently connect JetBrains' DataGrip IDE to Google BigQuery to run my queries. However, I get the following error: [Simba][BigQueryJDBCDriver](100034) The job has timed out on the server. Try increasing the timeout value. This happens, of course, when I run a query that may take some time to execute.
I can execute queries that take a short amount of time to complete so the connection does work.
I looked at this question (SQL Workbench/J and BigQuery), but I still did not fully understand how to change the timeout value.
Please open up the data source properties and add this to the very end of the connection URL: ;Timeout=3600; (note that it is case sensitive). Try increasing the value until the error is gone.
This also works:
Data Source Properties | Advanced | Timeout: 3600
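For example, the end of a Simba BigQuery JDBC URL with the parameter appended might look roughly like this (the project ID is a made-up placeholder):
jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=my-example-project;OAuthType=1;Timeout=3600;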
I have a SQL file that runs many queries. I want to see the accumulated sum of the time taken by all of them. I know that if I turn on timing, i.e. call
\timing
query 1;
query 2;
query 3;
...
query n;
at the beginning of the script, it will show the time each query takes to run. However, I need the accumulated total across all queries, without having to add the numbers up manually.
Is there a systematic way to do this? If not, how can I fetch the individual times so I can put them in a variable?
pg_stat_statements is a module that provides a good means of tracking execution statistics.
First, add pg_stat_statements to shared_preload_libraries in the postgresql.conf file. To find where this .conf file is on your filesystem, run show config_file;
shared_preload_libraries = 'pg_stat_statements'
Restart Postgres database
Create the extension
CREATE EXTENSION pg_stat_statements;
The module now provides a view, also named pg_stat_statements, which helps you analyze various query execution metrics.
Reset the statistics collected so far before running your queries:
SELECT pg_stat_statements_reset();
Now, execute your script file containing queries.
\i script_file.sql
You now have timing statistics for every query executed. To get the total time taken, simply run
select sum(total_time) from pg_stat_statements
where query !~* 'pg_stat_statements';
The time you get is in milliseconds, which can be converted to the desired format using various interval- and timestamp-related Postgres functions.
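For instance, a small sketch of turning the total into an interval (note, as an aside, that on PostgreSQL 13 and later the column is named total_exec_time rather than total_time):
-- multiply the millisecond total by interval '1 millisecond' to get an interval
select sum(total_time) * interval '1 millisecond' as total_runtime
from pg_stat_statements
where query !~* 'pg_stat_statements';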
If you want to time the whole script, on Linux or macOS you can use the time utility to launch the script.
The measurement in this case is a bit more than the sum of the raw query times, because it includes some overhead of starting and running the psql command. On my system this overhead is around 20ms.
$ time psql < script.sql
…
real 0m0.117s
user 0m0.008s
sys 0m0.007s
The real value is the time it took to execute the whole script, including the aforementioned overhead.
The approach in this answer is a crude, simple client-side way to measure the runtime of the overall script. It is not suitable for measuring server-side execution times with millisecond precision, but it may still be sufficient for many use cases.
Kaushik Nayak's solution is a far more precise method for timing executions directly on the server. It also provides much more insight into the execution (e.g. per-query times).
I am running a Virtuoso Open Source Server version 07.20.3217.
I am storing triples in there. However, when doing certain SPARQL queries, I get the following error message:
Virtuoso 42000 Error The estimated execution time 0 (sec) exceeds the limit of 3000 (sec).
This is not the first time this has happened to me. Normally I rewrite the query and the problem is solved. This time, however, the estimated time is 0, and I am not able to rewrite the query to avoid the error.
The server is only accessed locally by me, so I would not have any problem enabling queries that take a lot of time.
What I am asking is: is there a configuration file or similar in Virtuoso where I can set a limit higher than 3000 seconds per query?
There is a server-side timeout, MaxQueryExecutionTime, set in the [SPARQL] section of the Virtuoso INI file, as discussed in the product documentation.
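For example, the relevant stanza in virtuoso.ini might look roughly like this (the value is only illustrative, and the server typically needs a restart for INI changes to take effect):
[SPARQL]
MaxQueryExecutionTime = 36000   ; seconds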
ObDisclaimer: OpenLink Software produces Virtuoso, and employs me.
I have an ASP Classic script that has to do a lot of things (admin only), including running a stored proc that takes 4 minutes to execute.
However, even though I have
Server.ScriptTimeout = 0
right at the top of the page, it seems to be ignored on both our 2003 and 2012 servers.
I have tried adding indexes for the proc as much as possible, but it is for a drop-down of commonly searched-for terms, so I have to clean out all the hack attempts, spam and so on.
What I don't understand is why Server.ScriptTimeout = 0 is being ignored.
The actual error is this, and it comes just after control returns from the proc:
Active Server Pages error 'ASP 0113'
Script timed out
/admin/restricted/fix.asp
The maximum amount of time for a script to execute was exceeded. You can change this limit by specifying a new value for the property Server.ScriptTimeout or by changing the value in the IIS administration tools.
It never used to do this, and setting it in the script should override the default of 30 seconds set in IIS.
I have ensured the command timeout for the proc call is 0, i.e. unlimited, and the error occurs AFTER control comes back from the proc.
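For reference, this is the kind of command timeout I mean (an illustrative ADO snippet; the connection object and proc name are placeholders, not my real code):
' illustrative only - conn is an already-open ADODB.Connection
Set cmd = Server.CreateObject("ADODB.Command")
Set cmd.ActiveConnection = conn
cmd.CommandType = 4              ' adCmdStoredProc
cmd.CommandText = "dbo.MyProc"   ' placeholder proc name
cmd.CommandTimeout = 0           ' 0 = wait indefinitely for the database
Set rs = cmd.Execute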
Anybody got any ideas?
0 is an invalid value for Server.ScriptTimeout, for two reasons:
ASP pages have to have a timeout. They cannot run indefinitely.
ScriptTimeout can never be lower than the AspScriptTimeout set in IIS according to this statement from the MSDN docs:
For example, if NumSeconds is set to 10, and the metabase setting contains the default value of 90 seconds, scripts will time out after 90 seconds.
The best way to work around this is to set Server.ScriptTimeout to a really high value.
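For example, something along these lines at the top of the page (600 seconds is just an illustrative value; pick anything comfortably above the proc's 4-minute run time):
' illustrative value - allow this admin page up to 10 minutes
Server.ScriptTimeout = 600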