How to calculate accumulated sum of query timings? - sql

I have a SQL file that runs many queries, and I want to see the accumulated total time of all of them. I know that if I turn on timing by calling
\timing
query 1;
query 2;
query 3;
...
query n;
at the beginning of the script, it shows the time each query takes to run. However, I need the accumulated total for all queries, without having to add the times up manually.
Is there a systematic way to do this? If not, how can I fetch the interim times and store them in a variable?

pg_stat_statements is a module that provides a means for tracking execution statistics.
First, add pg_stat_statements to shared_preload_libraries in the postgresql.conf file. To find where this file lives on your filesystem, run show config_file;
shared_preload_libraries = 'pg_stat_statements'
Restart the PostgreSQL server.
Create the extension:
CREATE EXTENSION pg_stat_statements;
Now the module provides a view, pg_stat_statements, which lets you analyze various query execution metrics.
Reset the statistics collected so far before running your queries:
SELECT pg_stat_statements_reset();
Now, execute your script file containing queries.
\i script_file.sql
The view now holds timing statistics for every statement the script executed. To get the total time taken, run
select sum(total_time) from pg_stat_statements
where query !~* 'pg_stat_statements';
On PostgreSQL 13 and later the column is named total_exec_time rather than total_time. The result is in milliseconds; it can be converted to the desired format, for example by multiplying it by interval '1 millisecond' or by using the Postgres date/time functions.
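If the total is post-processed on the client side rather than in SQL, the conversion from milliseconds is a one-liner. This is only a minimal sketch in Python; format_ms is a hypothetical helper name, not something pg_stat_statements provides.

```python
from datetime import timedelta


def format_ms(total_ms: float) -> str:
    """Render a millisecond total as H:MM:SS[.ffffff] via timedelta."""
    return str(timedelta(milliseconds=total_ms))


# Hypothetical totals, as they might come back from sum(total_time).
print(format_ms(1500))
print(format_ms(90000))
```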

If you want to time the whole script, on Linux or macOS you can use the time utility to launch the script.
The measurement in this case is a bit more than the sum of the raw query times, because it includes some overhead of starting and running the psql command. On my system this overhead is around 20ms.
$ time psql < script.sql
…
real 0m0.117s
user 0m0.008s
sys 0m0.007s
The real value is the time it took to execute the whole script, including the aforementioned overhead.
The approach in this answer is a crude, simple, client-side way to measure the runtime of the whole script. It is not suitable for measuring server-side execution times with millisecond precision, but it may still be sufficient for many use cases.
Kaushik Nayak's solution is a much more precise way to time executions directly on the server. It also provides much more insight into the execution (e.g. query-level times).
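A third, purely client-side option is to keep \timing on, capture what psql prints, and add up the reported times afterwards. The sketch below assumes the output contains lines of the form "Time: 12.345 ms" (with a '.' decimal separator, optionally followed by a human-readable duration in parentheses); sum_psql_timings and the sample text are made up for illustration.

```python
import re

# Matches lines such as "Time: 12.345 ms" or "Time: 1234.500 ms (00:01.235)".
TIME_LINE = re.compile(r"^Time:\s+([0-9]+(?:\.[0-9]+)?)\s+ms", re.MULTILINE)


def sum_psql_timings(output: str) -> float:
    """Return the sum, in milliseconds, of every 'Time: ... ms' line."""
    return sum(float(value) for value in TIME_LINE.findall(output))


# Made-up sample of captured psql output with \timing enabled.
sample = """\
 count
-------
    42
(1 row)

Time: 1.250 ms
INSERT 0 1
Time: 0.750 ms
Time: 1234.500 ms (00:01.235)
"""

print(sum_psql_timings(sample))
```

The captured text could come from something like psql -f script.sql > out.txt, with \timing as the first line of the script. Like the time utility, this measures what the client observed, not pure server-side execution time.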

Related

Measure execution time in Altibase DBMS

I need to measure the execution time of a query in the ALTIBASE DBMS.
For example, I need to know in ms how long a SELECT * FROM example; took.
I tried the methods that work in traditional DBMSs like MySQL or Oracle, and they don't work.
There are a couple of methods. The first is to analyze the Altibase log. The other is to write a program that records the time from issuing the SELECT to receiving the result. I am also still learning Altibase, so we can figure it out together.
I found the solution.
After logging in to the client with the command isql -u sys -p manager -sysdba, execute the following statement:
SET TIMING ON;
After that, the queries that follow will report their execution time.

BQ PY Client Libraries :: client.run_async_query() vs client.run_sync_query()

I'm looking at the BigQuery Python client libraries.
There used to be two different operations to query a table:
client.run_async_query()
client.run_sync_query()
But in the latest version (v1.3) there seems to be only one operation to execute a query, Client.query(). Did I understand correctly?
Looking at the code on GitHub, Client.query() seems to return just the query job, not the actual query results. That leads me to conclude that it works like client.run_async_query(). Is there no longer any replacement for client.run_sync_query(), which returned the query results synchronously?
Thanks for the clarification!
Cheers!
Although .run_sync_query() has been removed, the Query reference says that a query may return its results right away if it finishes quickly:
query POST /projects/projectId/queries
Runs a BigQuery SQL query and returns results if the query completes within a specified timeout.
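On the Python side, synchronous behaviour can be approximated by waiting on the job that Client.query() returns. The sketch below assumes the returned job object exposes a blocking result() method, as the query job in google-cloud-bigquery does to my knowledge; run_query_sync is a hypothetical helper name, not part of the library.

```python
def run_query_sync(client, sql):
    """Submit sql via client.query() and block until its rows are available."""
    job = client.query(sql)   # returns a job handle without waiting
    rows = job.result()       # assumed to block until the job has finished
    return list(rows)


# With the real library this would be called roughly as:
#   from google.cloud import bigquery
#   rows = run_query_sync(bigquery.Client(), "SELECT 1 AS x")
```

The helper only relies on the query()/result() pair, so it can be exercised with a stub client when no BigQuery credentials are at hand.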

What problems may occur while querying SQL databases with big amount of data over internet

I have a large database on an MSSQL server that contains data indexed by a web crawler.
Every day I want to update a Solr search index using the DataImportHandler, which runs on another server in another network.
The Solr DataImportHandler uses queries to get data from SQL, for example:
SELECT * FROM DB.Table WHERE DateModified > Config.LastUpdateDate
The DataImportHandler runs 8 selects of this type. Each select fetches around 1000 rows from the database.
To connect to SQL Server I am using com.microsoft.sqlserver.jdbc.SQLServerDriver.
The parameters I can add for connection are:
responseBuffering="adaptive/all"
batchSize="integer"
So my question is:
What can go wrong when running these queries every day (apart from network errors)?
How does SQL Server behave in this context?
Furthermore, I have to decide how to implement this import and how to handle errors, but first I need to know what errors can arise.
Thanks!
Later edit
My problem is that I don't know how these SQL queries can fail. When I call the importer every day, it runs 10 queries against the database. If the 5th query fails, I have two options:
roll back the entire transaction and do it again, or commit the data I got from the first 4 queries and somehow redo queries 5 to 10. But if these queries always fail because of some other problem, I need to think of another way to import the data.
Can these SQL queries over the internet fail because of timeouts or something similar?
The only problem I identified after working with this type of import is:
Network problem - if the network connection fails, Solr rolls back any changes and the commit doesn't take place. In my program I treat this as an error and don't log the changes in the database.
Thanks #GuidEmpty for the comment that clarified this for me.
There could be issues with permissions (not sure if you control these).
It might be a good idea to catch the exceptions you can anticipate and also include a catch-all (Exception exp).
Then treat the catch-all as the worst case: roll back (where you can) and log the exception so you can handle it explicitly later on.
You don't say what types you are selecting either. Keep in mind that text/blob columns can take a lot more space and could cause issues internally if you buffer any data.
Though, on a quick re-read, you don't need to roll back if you are only selecting.
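To make the catch-and-log idea concrete, here is a rough, language-neutral sketch written in Python rather than in the Java that Solr itself uses. run_import, fetch and max_attempts are hypothetical names; fetch stands for whatever actually sends one SELECT to SQL Server. Each query is retried a few times, and the results are handed back only if every query eventually succeeded.

```python
import logging


def run_import(queries, fetch, max_attempts=3):
    """Run every query through fetch(); return all results, or None if any query keeps failing."""
    results = []
    for sql in queries:
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(fetch(sql))
                break  # this query succeeded, move on to the next one
            except Exception as exc:  # catch-all, as suggested above
                logging.warning("attempt %d failed for %r: %s", attempt, sql, exc)
        else:
            # No attempt succeeded. These are SELECTs, so there is nothing
            # to roll back on the server; just drop the partial results.
            return None
    return results
```

Whether to keep the first few result sets and redo only the failed queries is a policy decision; this sketch takes the all-or-nothing route because it is the simplest one to reason about.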
I think you would be better off thinking about what you are hoping to achieve, and whether knowing all possible problems will actually help.
HTH

How do I display the query time when a query completes in Vertica?

When using vsql, I would like to see how long a query took to run once it completes. For example, when I run:
select count(distinct key) from schema.table;
I would like to see an output like:
5678
(1 row)
total query time: 55 seconds.
If this is not possible, is there another way to measure query time?
In vsql type:
\timing
and then hit Enter. You'll like what you'll see :-)
Repeating that will turn it off.
Regarding the other part of your question:
is there another way to measure query time?
Vertica can log a history of all queries executed on the cluster, which is another source of query times. Before 6.0 the relevant system table was QUERY_REPO; starting with 6.0 it is QUERY_REQUESTS.
Assuming you're on 6.0 or higher, QUERY_REQUESTS.REQUEST_DURATION_MS will give you the query duration in milliseconds.
Example of how you might use QUERY_REQUESTS:
select *
from query_requests
where request_type = 'QUERY'
and user_name = 'dbadmin'
and start_timestamp >= CURRENT_DATE
and request ilike 'select%from%schema.table%'
order by start_timestamp;
The QUERY_PROFILES.QUERY_DURATION_US and RESOURCE_ACQUISITIONS.DURATION_MS columns may also be of interest to you. Here are the short descriptions of those tables in case you're not already familiar:
RESOURCE_ACQUISITIONS - Retains information about resources (memory, open file handles, threads) acquired by each running request for each resource pool in the system.
QUERY_PROFILES - Provides information about queries that have run.
I'm not sure how to enable that in vsql or if that's possible. But you could get that information from a script.
Here's the pseudocode (I used to use Perl):
print time
system("vsql -c 'select * from table'");
print time
Or put time into a variable and do some subtraction.
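A runnable version of that idea might look like the following Python sketch. time_command is a hypothetical helper, and the command that is actually timed here is a harmless stand-in so the example runs without a Vertica installation.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a child process and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv)
    return completed.returncode, time.perf_counter() - start


# Stand-in command so the sketch runs anywhere; for Vertica it would be
# something like ["vsql", "-c", "select count(*) from schema.table"].
code, elapsed = time_command([sys.executable, "-c", "pass"])
print(f"exit code {code}, elapsed {elapsed:.3f} s")
```

The measured time includes process start-up and connection overhead, so it is an upper bound on the query time rather than the server-side duration.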
The other option is to use some tool like Toad to connect to Vertica instead of using vsql.

Mysql query to return server load average

Does anyone know of a MySQL query that returns the server's current load average?
Do you mean the actual system load average? This has nothing to do with MySQL. For example on Linux, you can get it from /proc/loadavg.
Correct me if I'm wrong, but the load average variable is a property of the machine, not the MySQL server.
So to retrieve the avg. load you should be looking for a system call, not a SQL-query.
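For illustration, reading the value outside of MySQL takes only a few lines. This sketch assumes a Linux-style /proc/loadavg whose first three fields are the 1, 5 and 15 minute averages; parse_loadavg is a hypothetical helper and the sample line is made up.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1 min, 5 min, 15 min) load averages."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


# Made-up sample line; on Linux the real text would come from
# open("/proc/loadavg").read().
print(parse_loadavg("0.20 0.18 0.12 1/80 11206"))
```

On Unix-like systems Python's os.getloadavg() returns the same three numbers without any parsing.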
You might want to look into this statement:
http://dev.mysql.com/doc/refman/5.1/en/show-status.html
SHOW [GLOBAL | SESSION] STATUS
[LIKE 'pattern' | WHERE expr]
SHOW STATUS provides server status information. This information also can be obtained using the mysqladmin extended-status command. The LIKE clause, if present, indicates which variable names to match. The WHERE clause can be given to select rows using more general conditions, as discussed in Section 20.28, “Extensions to SHOW Statements”. This statement does not require any privilege. It requires only the ability to connect to the server.
Do you have mytop installed?
mytop is a console-based (non-gui) tool for monitoring the threads and overall performance of a MySQL 3.22.x, 3.23.x, and 4.x server
Mytop allows you to monitor what is happening in real time, everything from number of queries per second to key efficiency of the queries.
See Using Mytop: A MySQL Monitor