SQL Profiler: query performance

I would like to understand the performance difference between two queries. A page shows 18 records per page. It is implemented in two ways in two different applications, but both use the same tables and joins.
Comparing the two queries in SQL Server Profiler:
The first query fetches all 18 records (1 to 18) at once and displays them. It reads 1.2 million records.
The second query fetches 6 records at a time; the browser still shows 18 records. It reads fewer than 250,000 records.
Both queries use almost the same joins.
Is it possible to paginate 18 records per page by querying in batches like this? If yes, how is it implemented?
Is there another way to find out how many records are read during an SQL execution, other than Profiler?

You can use OFFSET and FETCH to paginate. See
https://www.sqlshack.com/pagination-in-sql-server/
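For example, a minimal sketch of OFFSET/FETCH paging in T-SQL, assuming a hypothetical Orders table and an 18-row page size (the table and columns are placeholders, not from the question):

-- Page number is 1-based; each page returns 18 rows.
DECLARE @PageNumber INT = 1;
DECLARE @PageSize   INT = 18;

SELECT o.OrderId, o.CustomerName, o.OrderDate   -- hypothetical columns
FROM dbo.Orders AS o                            -- hypothetical table
ORDER BY o.OrderDate DESC                       -- ORDER BY is required for OFFSET/FETCH
OFFSET (@PageNumber - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;

Because only one page of rows is returned from the sorted result, the amount of data sent to the client stays small regardless of the table size.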
If you have the queries, you can run them in SSMS to get the execution plans. These will show estimated and actual row count information.
https://www.sqlshack.com/execution-plans-in-sql-server/
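As a sketch of one script-based way to capture the actual plan (and its actual row counts) without using the SSMS GUI, assuming a hypothetical table name:

SET STATISTICS XML ON;               -- returns the actual execution plan, including actual row counts
SELECT COUNT(*) FROM dbo.Orders;     -- hypothetical query under investigation
SET STATISTICS XML OFF;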

Related

Despite the existence of relevant indices, PostgreSQL query is slow

I have a table with 3 columns:
time (timestamptz)
price (numeric(8,2))
set_id (int)
The table contains 7.4M records.
I've created a simple index on time and an index on set_id.
I want to run the following query:
select * from test_prices where time BETWEEN '2015-06-05 00:00:00+00' and '2020-06-05 00:00:00+00';
Despite my indices, the query takes 2 minutes and 30 seconds.
See the explain analyze stats: https://explain.depesz.com/s/ZwSH
The GCP Postgres instance stats were attached as a screenshot (not reproduced here).
What am I missing here? Why is this query so slow, and how can I improve it?
According to your explain plan, the query is returning 1.6 million rows out of 4.5 million. That means a significant portion of the rows are being returned.
Postgres wisely decides that a full table scan is more efficient than using an index, because there is a good chance that all of the data pages would need to be read anyway.
It is surprising that you are reporting 00:02:30 for the query. The explain says that the query completes in about 1.4 seconds, which seems reasonable.
I suspect that the elapsed time is caused by the volume of data being returned (perhaps the rows are very wide), a slow network connection to the database, or contention on the database/server.
Your query selects two thirds of the table. A sequential scan is the most efficient way to process such a query.
Your query executes in under 2 seconds. It must be your client that takes a long time to render the result set (pgAdmin is infamous for that). Use a different client.
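To separate server execution time from client transfer and rendering time, a minimal sketch (using the table and time range from the question) is to run the query under EXPLAIN (ANALYZE, BUFFERS), which executes it on the server but discards the result set:

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM test_prices
WHERE time BETWEEN '2015-06-05 00:00:00+00' AND '2020-06-05 00:00:00+00';
-- If the reported "Execution Time" is around 1-2 seconds, the remaining wall-clock
-- time is spent transferring and rendering 1.6M rows in the client.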

Pagination very heavy SQL query

I am trying to improve the performance of a report in my system. The application is written in .NET Core 3.0, I am using EF Core as the ORM framework, and PostgreSQL as the database. This report returns thousands of records, which are presented to the user in a view. The results are paginated and ordered by a user-selected column (like start_time or agent_name, etc.).
The result comes from a heavy query (one execution takes about 10 seconds). To implement pagination, we need to calculate both the results for one page and the total count. I can see two approaches to this problem, and both of them have disadvantages.
In the current solution I download the full report, then sort it and slice one page in memory. The advantage of this solution is that the data is fetched from the database only once. The disadvantage is that we load thousands of records when in fact we need only one page (50 records).
The other approach I can see is to slice the records in the database (using the LIMIT and OFFSET operators). I would fetch only one page of data, but I wouldn't have the total record count, so I would need a second query with the same parameters that returns the count of all records, as sketched below.
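A minimal sketch of this second approach, assuming a hypothetical report_rows view, a user-selected sort column of start_time, and a 50-row page (all names are placeholders, not from the question):

-- Query 1: one page of results, ordered by the user-selected column.
SELECT *
FROM report_rows              -- hypothetical heavy view/query
ORDER BY start_time
LIMIT 50 OFFSET 100;          -- page 3 when the page size is 50

-- Query 2: total count for the pager, with the same filters as query 1.
SELECT count(*)
FROM report_rows;

Note that both statements re-execute the heavy query, which is the disadvantage described above.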
What do you think about this problem? Which approach is better, in your opinion? Maybe some other technique is best for this issue?

Performance issue: SQL views vs LINQ for decreasing query execution time

I have a system built in ASP.NET WebForms that includes an accounts record generation form. In some specific situations I need to fetch all records, which is close to 1 million rows.
One solution could be to reduce the number of records fetched, but when we need to fetch records for more than a year, or for five years, that is still half a million or a million records. How can I decrease the time this takes?
What points can I use to reduce the time? I can't show the full query here; it's a big view that calls some other views inside it.
Would it take less time if I wrote it as a LINQ query? That's why I asked about LINQ vs. views.
I executed a "SELECT * FROM TableName" query and after 40 minutes it is still executing; the table has 117,000 records. Can we decrease this time?
I started this as a comment but ran out of room.
Use the server to do as much filtering for you as possible and return as few rows as possible. Client-side filtering is always going to be much slower than server-side filtering; for example, the client does not have access to the indexes and optimisation techniques that exist on the server.
LINQ uses "lazy evaluation", which means that it builds up a method for filtering but does not execute it until it is forced to. I've used it and was initially impressed with the speed... until I started to access the data it returned. Using the data you requested through LINQ triggers the actual selection process, which you'll find is slow.
Use the server to return a series of small result sets and then process those. If you need to join these result sets on a key, save them into dictionaries keyed on that value so you can join them quickly.
Another approach is to look at Entity Framework to create a mirror of the server's database structure, along with indexes, so that the subset of data you retrieve can be joined quickly.
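As an illustration of pushing the filtering and aggregation to the server (a sketch with hypothetical table and column names, not the poster's actual view):

-- Instead of SELECT * over a million rows and summing in the client,
-- let the server filter by date range and return only the aggregates.
SELECT account_id,
       SUM(amount)  AS total_amount,
       COUNT(*)     AS transaction_count
FROM   account_transactions            -- hypothetical table
WHERE  transaction_date >= '2015-01-01'
  AND  transaction_date <  '2020-01-01'
GROUP BY account_id;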

What is the expected query response performance for BigQuery when querying over 400 million rows from multiple sharded tables?

I have noticed consistently slow BigQuery performance (between 30 seconds and 1 minute response time) when querying more than 400 million rows from multiple sharded tables.
I have run the queries 3 times at different times of day (afternoon, late evening, and morning), and the response time has been consistently slow. The query groups by a string field that may have a lot of unique values, then sorts by the sum of another integer column in descending order, and finally returns only the top 10.
I have done performance timing tests on the same schema and the same query, but with all of the data stored in one to five tables, and the performance was always under 10 seconds.
What is the expected response time for querying a dataset with 400 million to 2 billion rows sharded across 7 to 90 tables? Can sharding data into more tables cause slower query performance? FYI, each of the sharded tables has at least 24 million to 144 million rows; they are not very small tables.
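For reference, a minimal sketch of the query shape described above, using standard-SQL wildcard tables over hypothetical sharded tables (all project, dataset, table, and column names are placeholders):

SELECT category,                           -- the high-cardinality string field
       SUM(value) AS total_value
FROM   `my_project.my_dataset.events_*`    -- hypothetical sharded tables
GROUP BY category
ORDER BY total_value DESC
LIMIT 10;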
The expected query performance depends highly on your query. Are you using GROUP EACH BY in your query?
The number of tables your data is sharded into shouldn't have much effect on query performance unless the number of tables is very large (in the hundreds or thousands). If you are seeing performance differences, then something may be wrong. Would you mind sharing either the queries you are running or the project and job ids of the queries that were fast vs the queries that were slow?

Windows 2003 server becomes very slow while executing a query that retrieves hundreds of thousands of records

I executed a query that retrieves more than 100,000 records and uses joins to do so.
While this query was running, the whole server became very slow, and this affected other sites that were trying to run normal queries to get records.
In this case, the query that retrieves that many records and the other queries running simultaneously are against different databases.
Your query retrieving hundreds of thousands of records is probably causing significant IO and trashing the buffer pool. You need to address this from a few directions:
Review your requirements. Why are you retrieving hundreds of thousands of records? No human can look at that many. Any analysis should be pushed down to the server so that only aggregate results are retrieved.
Why do you need to analyze hundreds of thousands of records frequently? Perhaps you need an ETL pipeline that extracts the required aggregates/summations on a daily basis.
Maybe the query really does need to analyse hundreds of thousands of records; perhaps you're missing an index (see the sketch below).
If none of the above apply, it simply means you need a bigger boat: your current hardware cannot handle the requirements.
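On the "missing an index" point, a minimal sketch (the table, columns, and join are hypothetical, since the real query and schema are not shown in the question):

-- If the join filters on customer_id and a date range, an index covering
-- the join key plus the filter column lets the server avoid a full scan.
CREATE INDEX IX_Orders_CustomerId_OrderDate
    ON dbo.Orders (customer_id, order_date);   -- hypothetical table and columns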