High CPU Usage By Postgres Process - sql

I have an application running on a Postgres database. Sometimes, when I have about 8-10 people working in the application, CPU usage soars to 99-100%. The application was built on the CodeIgniter framework, which I believe closes connections to the database whenever they are no longer needed. What could be the solution to this problem? I would appreciate any suggestions. Thank you.
Basically, what the people do in the application is run insert queries, but at a very fast rate; a person could run between 70 and 90 insert queries in a minute.

I came across a similar kind of issue. The reason was that some transactions were getting stuck and had been running for a long time, so CPU utilization climbed to 100%. The following command helped to find the connections that had been running the longest:
SELECT max(now() - xact_start) FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'active');
This query shows how long the oldest open transaction has been running. That time should not be greater than an hour, so killing the connections that had been running for a very long time, or were stuck at some point, worked for me. I followed this post for monitoring and solving my issue; it includes lots of useful commands for monitoring this situation.

You need to find out what PostgreSQL is doing. Relevant resources:
Monitoring in general
Monitoring queries
Finding slow queries
Once you know what the slow or most common queries are, use EXPLAIN to make sure they are being executed efficiently.
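If the pg_stat_statements extension is available, a sketch like the following surfaces the heaviest statements; note that the cumulative-time column is named total_time in PostgreSQL 12 and earlier and total_exec_time from 13 onward, so adjust to your version:
-- requires pg_stat_statements in shared_preload_libraries, then:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- ten most expensive statements by cumulative execution time
SELECT query, calls, total_time, rows
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;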

Here are some cases we have encountered that cause high CPU usage in Postgres.
Incorrect indexes are used in the query
Check the query plan - with EXPLAIN we can inspect the query plan; if an index is used by the query, an Index Scan appears in the plan result.
Solution: add the corresponding index for the query to reduce CPU usage
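As a hypothetical illustration (table, column and index names are made up), the before/after looks roughly like this:
-- without an index the filter forces a sequential scan
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
--   Seq Scan on orders  (cost=0.00..35811.00 rows=120 width=97)

-- add the matching index and the planner can switch to an Index Scan
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
--   Index Scan using idx_orders_customer_id on orders (...)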
Query with sort operation
Check EXPLAIN (analyze, buffers) - if memory is insufficient for the sort operation, temporary files are used to do the sorting, and high CPU usage follows.
Note: DO NOT run "EXPLAIN (analyze)" on a busy production system, as it actually executes the query behind the scenes to provide more accurate planner information, and its impact is significant.
Solution: tune up work_mem and the sorting operations
Sample: Tune sorting operations in PostgreSQL with work_mem
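A minimal sketch of that tuning loop, with an assumed table and a session-level setting (the right work_mem value depends on your workload and concurrency), keeping in mind the caution above about running ANALYZE on a busy system:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders ORDER BY created_at;
--   Sort Method: external merge  Disk: 102400kB      <- sort is spilling to temp files

SET work_mem = '256MB';   -- session-level only; pick a value your memory budget allows
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders ORDER BY created_at;
--   Sort Method: quicksort  Memory: 180000kB         <- sort now fits in memory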
Long-running transactions
Find long-running transactions through
SELECT pid
, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '2 minutes';
Solution:
Kill the long-running transaction through select pg_terminate_backend(pid)
Optimize the transaction or query SQL through corresponding indexes.
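For example (the pid is a placeholder; review the list before terminating anything, since pg_terminate_backend kills the whole session):
-- review the candidates first
SELECT pid, now() - query_start AS duration, state, query
FROM pg_stat_activity
WHERE (now() - query_start) > interval '2 minutes';

-- then terminate a specific offender by pid
SELECT pg_terminate_backend(12345);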

Related

Azure Database high DTU - High IO Avg

I'm trying to work out the cause of the high DTU on a database (its tier is S2, and it is also geo-replicated), on a server where I'm unsure whether it's V12 or the older version (a different problem).
Friday last week and this Friday we have a spike that looks like this:
Looking at the resource stats:
SELECT TOP 1000 *
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC
avg CPU hovers around 3-5% during the peak,
but most significantly avg_data_io_percentage sits at about 72% - 90%
How can I track down the IO further?
Query Performance Insight is quite useful, but execution count and CPU could be misleading in this case?
TOP 5 queries per CPU consumption
top 5 during that odd period:
Are the likely offenders the queries that appear differently in those top five?
Is there a better way to see the IO graph or data? Am I looking at the wrong thing? :D
Thanks in advance.
You can use SSMS and the built-in reports for Query Performance Insight / Query Store to look at IO-intensive queries. I suggest connecting to the database using SSMS and looking at the most resource-intensive queries using the logical reads, logical writes, and physical reads metrics. You should find your offender in one of these.
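If Query Store is enabled on the database, a query along these lines (standard Query Store catalog views; which metric you sort on is a judgment call) lists the top IO consumers directly:
SELECT TOP 10
       qt.query_sql_text,
       SUM(rs.count_executions)                            AS executions,
       SUM(rs.avg_logical_io_reads  * rs.count_executions) AS total_logical_reads,
       SUM(rs.avg_physical_io_reads * rs.count_executions) AS total_physical_reads
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
GROUP BY qt.query_sql_text
ORDER BY total_logical_reads DESC;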
Thanks,
Torsten

libpq very slow for large (20 million record) database

I am new to SQL/RDBMS.
I have an application which adds rows with 10 columns to a PostgreSQL server using the libpq library. Right now, my server is running on the same machine as my Visual C++ application.
I have added around 15-20 million records. A simple query to get the total count, select count(*) from <tableName>;, takes 4-5 minutes.
I have indexed my table on the time at which I enter the data (timecode). Most of the time I need counts with different WHERE / AND clauses added.
Is there any way to make things faster? I need to make it as fast as possible, because once the server moves onto the network things will become much slower.
Thanks
I don't think network latency will be a large factor in how long your query takes. All the processing is being done on the PostgreSQL server.
The PostgreSQL MVCC design means each row in the table - not just the index(es) - must be walked to calculate the count(*) which is an expensive operation. In your case there are a lot of rows involved.
There is a good wiki page on this topic here http://wiki.postgresql.org/wiki/Slow_Counting with suggestions.
Two suggestions from this link; one is to use an indexed column:
select count(index-col) from ...;
... though this only works under some circumstances.
If you have more than one index see which one has the least cost by using:
EXPLAIN ANALYZE select count(index-col) from ...;
If you can live with an approximate value, the other is to read the planner's row estimate from the catalog, like:
select reltuples from pg_class where relname='mytable';
How good this approximation is depends on how often autovacuum is set to run and many other factors; see the comments.
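A slightly more robust variant of that catalog lookup, qualifying the table through regclass so a similarly named table in another schema isn't picked up by accident (schema name assumed):
SELECT reltuples::bigint AS approx_rows
FROM pg_class
WHERE oid = 'public.mytable'::regclass;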
Consider pg_relation_size('tablename') and divide it by the seconds spent in
select count(*) from tablename
That will give the throughput of your disk(s) when doing a full scan of this table. If it's too low, you want to focus on improving that in the first place.
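One way to take that measurement from psql (\timing prints the elapsed time of each statement):
\timing on
SELECT pg_relation_size('tablename');   -- table size in bytes
SELECT count(*) FROM tablename;         -- note the elapsed time psql reports
-- bytes divided by seconds gives a rough sequential-scan throughput for the disk(s)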
Having a good I/O subsystem and well performing operating system disk cache is crucial for databases.
The default Postgres configuration is meant not to consume too many resources, so that it plays nice with other applications. Depending on your hardware and the overall utilization of the machine, you may want to adjust several performance parameters way up, like shared_buffers, effective_cache_size or work_mem. See the docs for your specific version and the wiki's performance optimization page.
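As a sketch only (the numbers below assume a machine with roughly 8 GB of RAM dedicated to PostgreSQL and must be adapted to your hardware and version):
-- check the current values
SHOW shared_buffers;
SHOW effective_cache_size;
SHOW work_mem;
-- then raise them in postgresql.conf and reload/restart, for example:
--   shared_buffers = 2GB
--   effective_cache_size = 6GB
--   work_mem = 64MB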
Also note that the speed of select count(*)-style queries has nothing to do with libpq or the network, since only one resulting row is retrieved. It happens entirely server-side.
You don't state what your data is, but normally the way to handle tables with a very large amount of data is to partition the table. http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
This will not speed up your select count(*) from <tableName>; query, and might even slow it down, but if you are normally only interested in a portion of the data in the table this can be helpful.
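A minimal sketch of the inheritance-based scheme described in those 9.1 docs, assuming monthly partitions on the timecode column (routing inserts to the right child still needs a trigger or application logic):
CREATE TABLE mytable_2012_01 (
    CHECK (timecode >= DATE '2012-01-01' AND timecode < DATE '2012-02-01')
) INHERITS (mytable);

-- with constraint_exclusion = partition (the default), a filtered count only
-- touches the matching child table(s)
SELECT count(*) FROM mytable
WHERE timecode >= DATE '2012-01-01' AND timecode < DATE '2012-02-01';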

How long should a query that returns 5 million records take?

I realise the answer should probably be 'as little time as possible' but I'm trying to learn how to optimise databases and I have no idea what an acceptable time is for my hardware.
For a start I'm using my local machine with a copy of SQL Server 2008 Express. I have a dual-core processor, 2 GB of RAM and a 64-bit OS (if that makes a difference). I'm only using a simple table with about 6 varchar fields.
At first I queried the data without any indexing. This took a ridiculously long amount of time so I cancelled and added a clustered index (using the PK) to the table. This cut the time down to 1 minute 14 sec. I have no idea if this is the best I can get or whether I'm still able to cut this down even further?
Am I limited by my hardware or is there anything else I can do to my table/database/queries to get results faster?
FYI I'm only using a standard SELECT * FROM <Table> to retrieve my results.
EDIT: Just to clarify, I'm only doing this for testing purposes. I don't NEED to pull out all the data, I'm just using that as a consistent test to see if I can cut down the query times.
I suppose what I'm asking is: Is there anything I can do to speed up the performance of my queries other than a) upgrading hardware and b) adding indexes (assuming the schema is already good)?
I think you are asking the wrong question.
First of all - why do you need so many articles at one time on the local machine? What do you want to do with them? I'm asking because I think you want to transfer this data somewhere, so you should be measuring how long it takes to transfer the data.
Some advice:
Your applications should not select 5 million records at a time. Try to split your query and get the data in smaller sets.
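A sketch of what "smaller sets" could look like on SQL Server 2008 (which has no OFFSET/FETCH), using keyset paging on an assumed clustered primary key called Id:
DECLARE @lastId INT = 0;   -- the last Id of the previous batch

SELECT TOP (10000) *
FROM dbo.MyTable
WHERE Id > @lastId
ORDER BY Id;
-- feed the highest Id returned back in as @lastId for the next batch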
UPDATE:
Because you are doing this for testing, I suggest that you:
Remove * from your query - it takes SQL Server some time to resolve this.
Put your data in temporary storage; try using a VIEW or a temporary table for this.
Use plan caching on your server to improve performance.
But even if you're just testing, I still don't understand why you would need such tests if your application will never use such a query. Testing just for the sake of testing is a bad use of time.
Look at the query execution plan. If your query is doing a table scan, it will obviously take a long time. The query execution plan can help you decide what kind of indexing you would need on the table. Also, creating table partitions can sometimes help in cases where the data is partitioned by a condition (usually date and time).
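A quick, hedged way to see that for yourself (table and column names assumed): turn on the IO/time statistics and the actual execution plan, then check whether the plan shows a scan or a seek:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT Col1, Col2
FROM dbo.MyTable
WHERE Col1 = 'some value';
-- a Table Scan / Clustered Index Scan here means every row is read;
-- an Index Seek means the indexing is doing its job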
I did 5.5 million in 20 seconds. That's taking over 100k schedules with different frequencies and forecasting them for the next 25 years. Just max scenario testing, but proves the speed you can achieve in a scheduling system as an example.
The best optimization depends on the indexing strategy you choose. Like many of the answers above, I too would say that partitioning the table can sometimes help. It is also not best practice to query a billion records in a single time frame; you will get much better results if you query partially, in iterations. You may check this link to clear up doubts about the minimum requirements for SQL Server 2008: Minimum H/W and S/W Requirements for Sql server 2008
When fetching 5 million rows you are almost certainly going to spool to tempdb. You should try to optimize your tempdb by adding additional files; if you have multiple drives on separate disks, you should split the table data into different .ndf files located on separate disks. Partitioning won't help when querying all the data on the disk.
You can also use a query hint to force parallelism (MAXDOP); this will increase CPU utilization. Ensure that the columns contain as few NULLs as possible, and rebuild your indexes and statistics.
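The hint itself looks like this (the degree of parallelism and the names are placeholders):
SELECT Col1, Col2
FROM dbo.MyTable
OPTION (MAXDOP 4);   -- allow up to 4 CPUs for this statement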

Performance of queries using count(*) on tables with many rows (300 million+)

I understand there are limitations to using sqlite, but I'd like to know if it should be able to handle this scenario.
My table has over 300 million records and the db is about 12 gigs. The data import util with sqlite is nice and fast. But then I added an index to a string column in this table, and it ran all night to complete this operation. I haven't compared this to other db's, but seemed quite slow to me.
Now that my index is added, I want to look for duplicates in the data. So I'm trying to run a "having count > 1" query and it seems to be taking hours as well. My query looks like:
select col1, count(*)
from table1
group by col1
having count(*) > 1
I would assume this query would use my index on col1, but the slow query execution makes me wonder if it is not?
Would perhaps sql server handle this kind of thing better?
SQLite's count() isn't optimized - it does a full table scan even on an indexed column. Here is the recommended approach to speed things up. Run EXPLAIN QUERY PLAN to verify, and you'll see:
EXPLAIN QUERY PLAN SELECT COUNT(FIELD_NAME) FROM TABLE_NAME;
I get something like this:
0|0|0|SCAN TABLE TABLE_NAME (~1000000 rows)
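For the duplicate-finding query itself, an index on col1 should let SQLite walk the (smaller, already ordered) index instead of the table and skip the sort for the GROUP BY; a sketch, with the index name made up:
CREATE INDEX idx_table1_col1 ON table1(col1);

EXPLAIN QUERY PLAN
SELECT col1, count(*) FROM table1 GROUP BY col1 HAVING count(*) > 1;
-- expect something like: SCAN TABLE table1 USING COVERING INDEX idx_table1_col1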
But then I added an index to a string column in this table, and it ran all night to complete this operation. I haven't compared this to other db's, but seemed quite slow to me.
I hate to tell you, but what does your server look like? Not arguing, but that is a potentially very resource-intensive operation that may require a lot of IO, and normal computers or cheap web servers with a slow hard disc are not suited for significant database work. I run hundreds-of-gigabytes db project work, and my smallest "large data" server has 2 SSDs and 8 Velociraptors for data and log. The largest one has 3 storage nodes with a total of 1000 GB of SSD discs - simply because IO is what a db server lives and breathes on.
So I'm trying to run a "having count > 1" query and it seems to be taking hours as well
How much RAM? Enough to fit it all in memory, or a low-memory virtual server where the missing memory blows up into bad IO? How much memory can / does SQLite use? How is the temp area set up? In memory? SQL Server would possibly use a lot of memory / tempdb space for this type of check.
Increase the SQLite cache via PRAGMA cache_size=<number of pages>. The memory used is <number of pages> times <size of page> (which can be set via PRAGMA page_size=<size of page>).
By setting those values to 16000 and 32768 respectively (or about 512 MB), I was able to get one program's bulk load down from 20 minutes to 2 minutes. (Although I think that if the disk on that system wasn't so slow, this might not have had as much effect.)
You might not have this extra memory available on lesser embedded platforms, so I don't recommend increasing it as much as I did on those, but for desktop or laptop level systems it can help greatly.
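Concretely, the two statements were along these lines (page_size only takes effect on a freshly created database, or after a VACUUM; cache_size can be set per connection):
PRAGMA page_size = 32768;
PRAGMA cache_size = 16000;   -- 16000 pages x 32 KB/page, roughly 512 MB of cache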

SQL Plus vs Toad IDE - Running insert in SQL Plus takes significantly longer

I'm running a query like this:
INSERT INTO TableA (colA, colB)
Select ColA, ColB
from TableB
This is a huge insert, as it is querying over 2 million rows and then inserting them into the table. My question is in regard to performance. When I run the query in Toad it takes around 4-5 minutes to run.
When I run the query through SQL*Plus it takes far longer. It has already been running for 40+ minutes and it is not finished. I've even done some minor tuning by setting server output off in case that affected performance.
Is there any tuning I should be aware of in regard to running the query via sqlplus? Is there any way to find out the difference in how the query is being executed/handled by the different clients?
Note: This is the only way I can transfer my data from table A to table B. I've looked into imp/exp and impdp/expdp and it is not possible in my situation.
Toad - v. 9.6.1.1
SqlPlus - 9.2.0.1.0
Oracle DB - 10g
This sounds like there is something else involved. My wild guess would be that your SQL*Plus session is getting blocked. Can you check v$lock to see if that is the case? There are a lot of scripts / tools to check what your session is currently spending its time on. Figure that out and then go from there. I personally like Tanel Poder's Snapper script (http://tech.e2sn.com/oracle-scripts-and-tools/session-snapper).
It could be a thousand things. (@John Gardner: This is one reason why I'm not a huge fan of dba.stackexchange.com - you won't know if it's a programming issue or a DBA issue until you know the answer. I think it's better if we all work together on one site.)
Here are some ideas:
Different session settings - parallel dml and parallel query may be enabled, forced, or disabled. Look at your login scripts, or look at the session info with select pdml_stats, pq_status, v$session.* from v$session;
A lock, as @Craig suggested. Although I think it's easier to look at select v$session.blocking_session, v$session.* from v$session; to identify locks.
Delayed block cleanout will make the second query slower. Run with set autotrace on. The db block gets and redo size are probably larger the second time (the second statement has some extra work to do, although this probably isn't nearly enough to explain the time difference).
Buffer cache may make the second query faster. Run with set autotrace on, there may be a large difference in physical reads. Although with that much data the chances are probably small that a huge chunk of it is cached.
Other sessions may be taking up a lot of resources. Look at select * from v$sessmetric order by physical_reads desc,logical_reads desc, cpu desc; Or maybe look at v$sysmetric_history.
You may want to consider parallel and append hints. You can probably make that query run 10 times faster (although there are some downsides to that approach, such as the data being unrecoverable initially).
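A sketch of what those hints look like for this statement (degree 4 is arbitrary, parallel DML has to be enabled in the session first, and a direct-path insert requires a commit before the session can read the table again):
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(TableA, 4) */ INTO TableA (colA, colB)
SELECT /*+ PARALLEL(TableB, 4) */ ColA, ColB
FROM TableB;

COMMIT;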
Also, for testing, you may want to use smaller sizes. Run the insert with an added predicate like and rownum <= 10000. Performance tuning is very hard; it helps a lot if you can run the statements frequently. There are always some flukes and you want to ignore the outliers, but you can't do that with only two samples.
You can look at some detailed stats for each run, but you may need to run the query with INSERT /*+ GATHER_PLAN_STATISTICS */.... Then run this to find the sql_id: select * from v$sql where sql_text like '%INSERT%GATHER_PLAN_STATISTICS%';
Then run this to look at the details of each step: select * from v$sql_plan_statistics_all where sql_id = '<sql_id from above>';
(In 11g, you can use v$sql_monitor, or even better, dbms_sqltune.report_sql_monitor.)
A really obvious point, but it's been known to trip people up... are there any indexes on TableA; if so, are any of them unique; and if so, did you commit or roll back the Toad session before running it again in SQL*Plus? Not doing so is an easy way of getting a block, as @Craig suggests. In this scenario it won't ever finish - your 40+ minute wait is while it's blocking on the first row insert.
If there are any indexes you're likely to be better off dropping them while you do the insert and recreating them afterwards as that's usually significantly faster.
As other people have already suggested, there are a lot of things that could cause a statement that selects/inserts that much data to perform badly (and inconsistently). While I have seen Toad do things to improve performance sometimes, I've never seen it do anything so much faster, so I'm inclined to think it's more to do with the database rather than the tool.
I would ask the DBAs to check your session and the database while the slow statement is running. They should be able to give you some indication of what's happening - they'll be able to check for any problems such as locking or excessive log file switching. They'll also be able to trace both sessions (Toad and SQL*Plus) to see how Oracle is executing those statements and whether there are any differences, etc.
Depending what it is you're doing, they might even be able to help you run the insert faster. For example, it can be faster to disable an index, do the insert, then rebuild it; or it might be possible to disable logging temporarily. This would obviously depend on your exact scenario.
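A hedged sketch of the disable/insert/rebuild pattern (index name invented; this does not apply to unique indexes backing constraints, and NOLOGGING has recoverability implications you would want to clear with the DBAs first):
ALTER INDEX tablea_cola_ix UNUSABLE;
ALTER SESSION SET skip_unusable_indexes = TRUE;

INSERT /*+ APPEND */ INTO TableA (colA, colB)
SELECT ColA, ColB FROM TableB;
COMMIT;

ALTER INDEX tablea_cola_ix REBUILD NOLOGGING;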