total number of rows of a query - sql

I have a very large query that is supposed to return only the top 10 results:
select top 10 ProductId from .....
The problem is that I also want the total number of results that match the criteria without that 'top 10', but in the same time it's considered unaceptable to return all rows (we are talking of roughly 100 thousand results.
Is there a way to get the total number of rows affected by the previous query, either in it or afterwords without running it again?
PS: please no temp tables of 100 000 rows :))

dump the count in a variable and return that
declare #count int
select #count = count(*) from ..... --same where clause as your query
--now you add that to your query..of course it will be the same for every row..
select top 10 ProductId, #count as TotalCount from .....

Assuming that you're using an ORDER BY clause already (to properly define which the "TOP 10" results are), then you could add a call of ROW_NUMBER also, with the opposite sort order, and pick the highest value returned.
E.g., the following:
select top 10 *,ROW_NUMBER() OVER (order by id desc) from sysobjects order by ID
Has a final column with values 2001, 2000, 1999, etc, descending. And the following:
select COUNT(*) from sysobjects
Confirms that there are 2001 rows in sysobjects.

I suppose you could hack it with a union select
select top 10 ... from ... where ...
union
select count(*) from ... where ...
For you to get away with this type of hack you will need to add fake columns to the count query so it returns the same amount of columns as the main query. For example:
select top 10 id, first_name from people
union
select count(*), '' as first_name from people
I don't recommend using this solution. Using two separate queries is how it should be done

Generally speaking no - reasoning is as follows:
If(!) the query planner can make use of TOP 10 to return only 10 rows then RDBMS will not even know the exact number of rows that satisfy the full criteria, it just gets the TOP 10.
Therefore, when you want to find out count of all rows satisfying the criteria you are not running it the second time, but the first time.
Having said that proper indexes might make both queries execute pretty fast.
Edit
MySQL has SQL_CALC_FOUND_ROWS which returns the number of rows that query would return if there was no LIMIT applied - googling for an equivalent in MS SQL points to analytical SQL and CTE variant, see this forum (even though not sure that either would qualify as running it only once, but feel free to check - and let us know).

Related

Get 5 most recent records for each Number

Data I have a table in Access that has a Part Number and PriceYr and Price associated to each Part Number.There are over 10,000 records and the PartNumber are repeated and has different PriceYr and Price associated to it. However, I need a query to just find the 5 most recent price and date associated with it.
I tried using MAX(PriceYr) however, it only returns 1 most recent record for each PartNumber.
I also tried the following query but it doesn't seem to work.
SELECT Catalogs.PartNumber,Catalogs.PriceYr, Catalogs.Price FROM Catalogs
WHERE Catalogs.PriceYr in
(SELECT TOP 5 Catalogs.PriceYr
FROM Catalogs as Temp
WHERE Temp.PartNumber = Catalogs.PartNumber
ORDER By Catalogs.PriceYr DESC)
Any help will be greatly appreciated. Thank you.
Desired Result that i am trying to get.
Consider a correlated count subquery to filter by a rank variable. Right now, you pull top 5 overall on matching PartNumber not per PartNumber.
SELECT main.*
FROM
(SELECT c.PartNumber, c.PriceYr, c.Price,
(SELECT Count(*)
FROM Catalogs AS Temp
WHERE Temp.PartNumber = c.PartNumber
AND Temp.PriceYr >= c.PriceYr) As rank
FROM Catalogs c
) As main
WHERE main.rank <= 5
MAX() is an aggregating function, meaning that it groups all the data and takes the maximal value in the specified column. You need to use a GROUP BY statement to prevent the query from grouping the whole dataset in a single row.
On the other hand, your query seems to needlessly use a subquery. The following query should work quite fine :
SELECT TOP 5 c.PartNumber, c.PriceYr, c.Price
FROM Catalogs c
ORDER BY c.PriceYr DESC
WHERE c.PartNumber = #partNumber -- if you want the query to
-- work on a specific part number
(please post a table creation query to make sure this example works)

SQL for getting each category data in maria db

I need to fetch 4 random values from each category. What should be the correct sql syntax for maria db. I have attached one image of table structure.
Please click here to check the structure
Should i write some procedure or i can do it with basic sql syntax?
You can do that with a SQL statement if you only have a few rows:
SELECT id, question, ... FROM x1 ORDER BY rand() LIMIT 1
This works fine if you have only a few rows - as soon as you have thousands of rows the overhead for sorting the rows becomes important, you have to sort all rows for getting only one row.
A trickier but better solution would be:
SELECT id, question from x1 JOIN (SELECT CEIL(RAND() * (SELECT(MAX(id)) FROM x1)) AS id) as id using(id);
Running EXPLAIN on both SELECTS will show you the difference...
If you need random value for different categories combine the selects via union and add a where clause
http://mysql.rjweb.org/doc.php/groupwise_max#top_n_in_each_group
But then ORDER BY category, RAND(). (Your category is the blog's province.)
Notice how it uses #variables to do the counting.
If you have MariaDB 10.2, then use one of its Windowing functions.
SELECT column FROM table WHERE category_id = XXX
ORDER BY RAND()
LIMIT 4
do it for all categories

Is there any way to calculate total number of rows that return from dynamic query in Common Table Expression(CTE) or Subquery

We are in the process of optimizing our database.We have most of store procedure that uses CTE because it gives us high performance according to our table strucure.We have almost dynamic query that have different result according to different condition.We hold all data in CTE, and check condition, that was the not problem but we need total number of rows that return by each query ,in calculating this it takes lots of time.Temporary table or table variable not suitable in our case as it takes lots of time to insert data in it.We have structure as following
With t(fields) as
(select field1,field2.......
ROW_NUMBER() OVER (order by some column) as row...
from some table and lots of
inner n left joins
where some condition ),
rowTotal(RowTotal) as
(select max(row) from t)
select * from t,RowTotal
where condition for paging
But max(row) took lots of times if i remove this it return data within 100ms. I tried Coun(*),Count(SomeField) and many other it works but took lots of time.How can i achieve total number of rows from cte within some ms any aggregate function will not work for me.Is there any other way to calculate rowtotal like ##rowcount.Thanks in advance for any help.
If you are after the total number of rows from the inner query you can add this as a column to your select using COUNT() and PARTITION BY().
With t(fields) as
(select COUNT(*) OVER (PARTITION BY 1) AS TotalRows,
field1,field2.......
ROW_NUMBER() OVER (order by some column) as row...
from some table ...
This should give you a count of the total rows in 't' as the first column of t
I don't know that this is the fastest way to get the result you want but it works for me on 000's of returned records and prevents extra select queries to find the count separately.

How do you Select TOP x but still get a COUNT of the whole query?

I'm writing a webpage to interactively filter results based on filter criteria as it is specified by the user. I only want to return from SQL the TOP 20 rows but I want to know how many rows met the criteria (Count). I want to be able to tell the user: "here are the top 20 rows matching your criteria, and BTW, there were 2,000 additional rows I'm not showing here".
I know I could simply run the query twice but EWWWW that's expensive and wasteful. How can I achieve what I want without over taxing the database?
You can use COUNT(*) OVER()
SELECT TOP 20 *,
COUNT(*) OVER() AS TotalMatchingRows
FROM master..spt_values
WHERE type='P'
ORDER BY number
Doing two queries may work out more efficient however especially if you have narrower indexes that can be used in determining the matching row count but don't cover the entire SELECT list.
select COUNT(*) from Process_Master where dt_time=(select top 1 (dt_time) from Process_Master order by DT_TIME desc)

Assistance with SQL statement

I'm using sql-server 2005 and ASP.NET with C#.
I have Users table with
userId(int),
userGender(tinyint),
userAge(tinyint),
userCity(tinyint)
(simplified version of course)
I need to select always two fit to userID I pass to query users of opposite gender, in age range of -5 to +10 years and from the same city.
Important fact is it always must be two, so I created condition if ##rowcount<2 re-select without age and city filters.
Now the problem is that I sometimes have two returned result sets because I use first ##rowcount on a table. If I run the query.
Will it be a problem to use the DataReader object to read from always second result set? Is there any other way to check how many results were selected without performing select with results?
Can you simplify it by using SELECT TOP 2 ?
Update: I would perform both selects all the time, union the results, and then select from them based on an order (using SELECT TOP 2) as the union may have added more than two. Its important that this next select selects the rows in order of importance, ie it prefers rows from your first select.
Alternatively, have the reader logic read the next result-set if there is one and leave the SQL alone.
To avoid getting two separate result sets you can do your first SELECT into a table variable and then do your ##ROWCOUNT check. If >= 2 then just select from the table variable on its own otherwise select the results of the table variable UNION ALLed with the results of the second query.
Edit: There is a slight overhead to using table variables so you'd need to balance whether this was cheaper than Adam's suggestion just to perform the 'UNION' as a matter of routine by looking at the execution stats for both approaches
SET STATISTICS IO ON
Would something along the following lines be of use...
SELECT *
FROM (SELECT 1 AS prio, *
FROM my_table M1 JOIN my_table M2
WHERE M1.userID = supplied_user_id AND
M1.userGender <> M2.userGender AND
M1.userAge - 5 >= M2.userAge AND
M1.userAge + 15 <= M2.userAge AND
M1.userCity = M2.userCity
LIMIT TO 2 ROWS
UNION
SELECT 2 AS prio, *
FROM my_table M1 JOIN my_table M2
WHERE M1.userID = supplied_user_id AND
M1.userGender <> M2.userGender
LIMIT TO 2 ROWS)
ORDER BY prio
LIMIT TO 2 ROWS;
I haven't tried it as I have no SQL Server and there may be dialect issues.