Complex SQL pagination Query - sql

I am doing pagination for my data using the solution to this question.
I now need to use this solution for a more complex query, i.e. the SELECT inside the parentheses has joins and aggregate functions.
This is the solution I'm using as a reference:
;WITH Results_CTE AS
(
SELECT
Col1, Col2, ...,
ROW_NUMBER() OVER (ORDER BY SortCol1, SortCol2, ...) AS RowNum
FROM Table
WHERE <whatever>
)
SELECT *
FROM Results_CTE
WHERE RowNum >= @Offset
AND RowNum < @Offset + @Limit
The query that I need to incorporate into the above solution:
SELECT users.indicator, COUNT(*) as 'queries' FROM queries
INNER JOIN calls ON queries.call_id = calls.id
INNER JOIN users ON calls.user_id = users.id
WHERE queries.isresolved=0 AND users.indicator='ind1'
GROUP BY users.indicator ORDER BY queries DESC
How can I achieve this? So far I've made it work by removing the ORDER BY queries DESC part and putting that in the line ROW_NUMBER() OVER (ORDER BY ...) AS RowNum, but when I do this it doesn't allow me to order by that column ("Invalid column name 'queries'.").
What do I need to do to get it to order by this column?
edit: using SQL Server 2008

Try ORDER BY COUNT(*) DESC. It works on MySQL; I'm not sure about SQL Server 2008.

I think queries is your alias for the COUNT(*) column, so order by the aggregate itself, like this:
SELECT users.indicator, COUNT(*) as 'queries' FROM queries
INNER JOIN calls ON queries.call_id = calls.id
INNER JOIN users ON calls.user_id = users.id
WHERE queries.isresolved=0 AND users.indicator='ind1'
GROUP BY users.indicator ORDER BY COUNT(*) DESC
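To tie this back to the pagination CTE from the question, here is a minimal sketch that puts the aggregate directly inside ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC). It runs against an in-memory SQLite database (3.25 or later) as a stand-in for SQL Server 2008. The sample rows, the fetch_page helper and the :first_row / :page_size parameter names are made up for illustration, and the users.indicator = 'ind1' filter is dropped so that there is more than one group to page through.

```python
import sqlite3

# Hypothetical miniature of the schema in the question (SQLite stand-in for SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, indicator TEXT);
CREATE TABLE calls   (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE queries (id INTEGER PRIMARY KEY, call_id INTEGER, isresolved INTEGER);
INSERT INTO users   VALUES (1, 'ind1'), (2, 'ind2'), (3, 'ind3');
INSERT INTO calls   VALUES (10, 1), (20, 2), (30, 3);
INSERT INTO queries VALUES (1, 10, 0), (2, 20, 0), (3, 20, 0),
                           (4, 30, 0), (5, 30, 0), (6, 30, 0), (7, 30, 1);
""")

PAGE_SQL = """
WITH Results_CTE AS (
    SELECT users.indicator,
           COUNT(*) AS queries,
           -- Order by the aggregate expression, not by its alias.
           ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) AS RowNum
    FROM queries
    INNER JOIN calls ON queries.call_id = calls.id
    INNER JOIN users ON calls.user_id = users.id
    WHERE queries.isresolved = 0
    GROUP BY users.indicator
)
SELECT indicator, queries
FROM Results_CTE
WHERE RowNum >= :first_row
  AND RowNum < :first_row + :page_size
ORDER BY RowNum
"""

def fetch_page(first_row, page_size):
    """Return one page of (indicator, unresolved query count) rows."""
    params = {"first_row": first_row, "page_size": page_size}
    return conn.execute(PAGE_SQL, params).fetchall()

first_page = fetch_page(1, 2)
second_page = fetch_page(3, 2)
print(first_page, second_page)
```

The window function is evaluated after GROUP BY, which is why COUNT(*) is legal inside the OVER clause while the alias queries is not. If two groups can have the same count, adding a second sort key (for example users.indicator) keeps the page boundaries stable.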

Related

Optimize Query (remove subquery)

Can you help me optimize this query? I need to remove the subquery because the performance is awful.
select LICENSE,
(select top 1 SERVICE_KEY
from SERVICES
where SERVICES.LICENSE = VEHICLE.LICENSE
order by DATE desc, HOUR desc)
from VEHICLE
The problem is that I can have two SERVICES on the same DATE and HOUR, so I haven't been able to code an equivalent SQL avoiding the subquery.
The query runs on a Legacy database where I can't modify its metadata, and it doesn't have any index at all. That's the reason to look for a solution that can avoid a correlated query.
Thank you.
You can express your query using ROW_NUMBER() without the need for a correlated subquery. Try the following query and see how the performance is:
SELECT t.LICENSE, t.SERVICE_KEY
FROM
(
SELECT t1.LICENSE, t2.SERVICE_KEY,
ROW_NUMBER() OVER (PARTITION BY t1.LICENSE
ORDER BY t2.DATE DESC, t2.HOUR DESC) rn
FROM VEHICLE t1
INNER JOIN SERVICES t2
ON t1.LICENSE = t2.LICENSE
) t
WHERE t.rn = 1
The performance of this query would depend, among other things, on having indices on the join columns of your two tables.
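Two details of the original correlated query are easy to lose in the ROW_NUMBER() rewrite: vehicles with no services at all (the subquery returns NULL for them, an INNER JOIN drops them) and ties on DATE and HOUR. The sketch below covers both on an in-memory SQLite database (3.25 or later). The sample rows are invented, and using SERVICE_KEY as the final tie-breaker is my assumption, not something stated in the question.

```python
import sqlite3

# Invented sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VEHICLE  (LICENSE TEXT);
CREATE TABLE SERVICES (LICENSE TEXT, SERVICE_KEY INTEGER, DATE TEXT, HOUR TEXT);
INSERT INTO VEHICLE  VALUES ('AAA'), ('BBB'), ('CCC');
INSERT INTO SERVICES VALUES
    ('AAA', 1, '2020-01-01', '09:00'),
    ('AAA', 2, '2020-01-02', '10:00'),
    ('AAA', 3, '2020-01-02', '10:00'),
    ('BBB', 4, '2020-01-01', '08:00');
""")

LATEST_SERVICE_SQL = """
SELECT t.LICENSE, t.SERVICE_KEY
FROM (
    SELECT v.LICENSE, s.SERVICE_KEY,
           ROW_NUMBER() OVER (
               PARTITION BY v.LICENSE
               -- SERVICE_KEY breaks ties on DATE and HOUR (assumed tie-breaker).
               ORDER BY s.DATE DESC, s.HOUR DESC, s.SERVICE_KEY DESC
           ) AS rn
    FROM VEHICLE v
    -- LEFT JOIN keeps vehicles that have no services at all.
    LEFT JOIN SERVICES s ON s.LICENSE = v.LICENSE
) t
WHERE t.rn = 1
ORDER BY t.LICENSE
"""

latest_services = conn.execute(LATEST_SERVICE_SQL).fetchall()
print(latest_services)
```

Vehicle 'AAA' has two services at the same DATE and HOUR, so without the extra sort key either of them could be numbered 1. Vehicle 'CCC' has no services and comes back with a NULL SERVICE_KEY, as it would from the original subquery.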

T-SQL, how to do this group by query?

I have a view with this information:
TableA (IDTableA, IDTableB, IDTableC, Active, date, ...)
For each record in TableA and each record in TableC, I want the record of TableB that has the max date and is active.
select IDTableA, IDtableC, IDTableB, Date, Active
from myView
where Active = 1
group by IDTableA, IDTableC
Having Max(Date)
order by IDTableA;
This query works with SQLite, but if I try it in SQL Server I get an error saying that IDTableB in the select list is not contained in the GROUP BY clause.
I know that in theory the first query shouldn't work in SQLite either, but it does.
How can I do this query in SQL Server?
Thanks.
According to SQL 92, if you use GROUP BY clause, then in SELECT output expression list you can only use columns mentioned in GROUP BY list, or any other columns but they must be wrapped in aggregate functions like count(), sum(), avg(), max(), min() and so on.
Some servers like MSSQL, Postgres are strict about this rule, but for example MySQL and SQLite are very relaxed and forgiving - and this is why it surprised you.
Long story short - if you want this to work in MSSQL, adhere to SQL92 requirement.
This query works in SQL Server:
select v1.IDTableA, v1.IDTableC, v1.IDTableB, v1.[Date], v1.Active
from myView v1
where v1.Active = 1
AND EXISTS (
SELECT 1
FROM myView v2
WHERE v2.Active = 1
AND v2.IDTableA = v1.IDTableA
AND v2.IDTableC = v1.IDTableC
group by v2.IDTableA, v2.IDTableC
Having Max(v2.[Date]) = v1.[Date]
)
order by v1.IDTableA;
OR
Also in SQLServer2005+ you can use CTE with ROW_NUMBER
;WITH cte AS
(
select IDTableA, IDtableC, IDTableB, [Date], Active,
ROW_NUMBER() OVER(PARTITION BY IDTableA, IDTableC ORDER BY [Date] DESC) AS rn
from myView v1
where Active = 1
)
SELECT *
FROM cte
WHERE rn = 1
ORDER BY IDTableA
Try this,
select * from table1 b
where active = 1
and date = (select max(date) from table1
where idtablea = b.idtablea
and idtablec = b.idtablec
and active = 1);
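The ROW_NUMBER() and correlated MAX(date) approaches in the answers above are not quite interchangeable: they differ when two active rows in the same group share the latest date. A small sketch of the difference follows, run on an in-memory SQLite database (3.25 or later) with myView modelled as a plain table. The rows are invented, and IDTableB as a tie-breaker is an assumption.

```python
import sqlite3

# myView is modelled as a table here; all rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myView (IDTableA INTEGER, IDTableC INTEGER, IDTableB INTEGER,
                     Date TEXT, Active INTEGER);
INSERT INTO myView VALUES
    (1, 1, 100, '2020-01-01', 1),
    (1, 1, 101, '2020-02-01', 1),
    (1, 1, 102, '2020-02-01', 1),
    (1, 1, 103, '2020-03-01', 0),
    (2, 1, 200, '2020-01-15', 1);
""")

# Exactly one row per (IDTableA, IDTableC) group, ties broken by IDTableB.
ROW_NUMBER_SQL = """
WITH cte AS (
    SELECT IDTableA, IDTableC, IDTableB,
           ROW_NUMBER() OVER (PARTITION BY IDTableA, IDTableC
                              ORDER BY Date DESC, IDTableB DESC) AS rn
    FROM myView
    WHERE Active = 1
)
SELECT IDTableA, IDTableC, IDTableB FROM cte WHERE rn = 1 ORDER BY IDTableA
"""

# Every active row whose date equals the group's latest active date.
MAX_DATE_SQL = """
SELECT b.IDTableA, b.IDTableC, b.IDTableB
FROM myView b
WHERE b.Active = 1
  AND b.Date = (SELECT MAX(Date) FROM myView
                WHERE IDTableA = b.IDTableA
                  AND IDTableC = b.IDTableC
                  AND Active = 1)
ORDER BY b.IDTableA, b.IDTableB
"""

row_number_rows = conn.execute(ROW_NUMBER_SQL).fetchall()
max_date_rows = conn.execute(MAX_DATE_SQL).fetchall()
print(row_number_rows)
print(max_date_rows)
```

Rows 101 and 102 tie on the latest active date, so the MAX(date) version returns both while the ROW_NUMBER() version returns only one. Row 103 is newer but inactive and is ignored by both.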

Compare SQL groups against each other

How can one filter a grouped resultset for only those groups that meet some criterion compared against the other groups? For example, only those groups that have the maximum number of constituent records?
I had thought that a subquery as follows should do the trick:
SELECT * FROM (
SELECT *, COUNT(*) AS Records
FROM T
GROUP BY X
) t HAVING Records = MAX(Records);
However the addition of the final HAVING clause results in an empty recordset... what's going on?
In MySQL (which I assume you are using, since you posted SELECT *, COUNT(*) FROM T GROUP BY X, which would fail in every other RDBMS that I know of), you can use:
SELECT T.*
FROM T
INNER JOIN
( SELECT X, COUNT(*) AS Records
FROM T
GROUP BY X
ORDER BY Records DESC
LIMIT 1
) T2
ON T2.X = T.X
This has been tested in MySQL and removes the implicit grouping/aggregation.
If you can use windowed functions and one of TOP/LIMIT with Ties or Common Table expressions it becomes even shorter:
Windowed function + CTE: (MS SQL-Server & PostgreSQL Tested)
WITH CTE AS
( SELECT *, COUNT(*) OVER(PARTITION BY X) AS Records
FROM T
)
SELECT *
FROM CTE
WHERE Records = (SELECT MAX(Records) FROM CTE)
Windowed Function with TOP (MS SQL-Server Tested)
SELECT TOP 1 WITH TIES *
FROM ( SELECT *, COUNT(*) OVER(PARTITION BY X) [Records]
FROM T
) AS T2
ORDER BY Records DESC
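As a quick check of the windowed function + CTE variant, here is a sketch on an in-memory SQLite database (3.25 or later), which is not one of the engines the answer says it was tested on. Table T gets an invented second column Y, and groups 'a' and 'b' deliberately tie for the largest size.

```python
import sqlite3

# Invented sample data: groups 'a' and 'b' both have two rows, 'c' has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T (X TEXT, Y INTEGER);
INSERT INTO T VALUES ('a', 1), ('a', 2), ('b', 3), ('b', 4), ('c', 5);
""")

LARGEST_GROUPS_SQL = """
WITH CTE AS (
    -- Attach the size of each row's group to the row itself.
    SELECT X, Y, COUNT(*) OVER (PARTITION BY X) AS Records
    FROM T
)
SELECT X, Y, Records
FROM CTE
WHERE Records = (SELECT MAX(Records) FROM CTE)
ORDER BY X, Y
"""

largest_groups = conn.execute(LARGEST_GROUPS_SQL).fetchall()
print(largest_groups)
```

Both tied groups come back in full, which is the behaviour the LIMIT 1 join above cannot give.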
Lastly, I have never used Oracle, so apologies for not adding a solution that works on Oracle...
EDIT
My solution for MySQL did not take ties into account, and my suggested fix rather steps on the toes of what you said you want to avoid (duplicate subqueries), so I am not sure I can help after all. However, in case it is preferable, here is a version that will work as required on your fiddle:
SELECT T.*
FROM T
INNER JOIN
( SELECT X
FROM T
GROUP BY X
HAVING COUNT(*) =
( SELECT COUNT(*) AS Records
FROM T
GROUP BY X
ORDER BY Records DESC
LIMIT 1
)
) T2
ON T2.X = T.X
For the exact question you give, one way to look at it is that you want the group of records where there is no other group that has more records. So if you say
SELECT taxid, COUNT(*) as howMany
FROM flats
GROUP by taxid
You get all counties and their counts.
Then you can treat that expression as a table by making it a subquery and giving it an alias. Below I assign two "copies" of the query the names X and Y and ask for the taxids for which no other group has a higher count. If two groups tie for the highest count, I get both of them. Different databases have proprietary syntax, notably TOP and LIMIT, that makes this kind of query simpler and easier to understand.
SELECT taxid FROM
(select taxid, count(*) as HowMany from flats
GROUP by taxid) as X
WHERE NOT EXISTS
(
SELECT * from
(
SELECT taxid, count(*) as HowMany FROM
flats
GROUP by taxid
) AS Y
WHERE Y.howmany > X.howmany
)
Try this:
SELECT * FROM (
SELECT *, MAX(Records) as max_records FROM (
SELECT *, COUNT(*) AS Records
FROM T
GROUP BY X
) t
) WHERE Records = max_records
I'm sorry that I can't test the validity of this query right now.

Oracle SQL order by in subquery problems!

I am trying to run a subquery in Oracle SQL and it will not let me order the subquery columns. Ordering the subquery is important as Oracle seems to choose at will which of the returned rows to return to the main query.
select ps.id, ps.created_date, pst.last_updated, pst.from_state, pst.to_state,
(select last_updated from mwcrm.process_state_transition subpst
where subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id
and rownum = 1) as next_response
from mwcrm.process_state ps, mwcrm.process_state_transition pst
where ps.created_date > sysdate - 1/24
and ps.id=pst.process_state
order by ps.id asc
Really should be:
select ps.id, ps.created_date, pst.last_updated, pst.from_state, pst.to_state,
(select last_updated from mwcrm.process_state_transition subpst
where subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id
and rownum = 1
order by subpst.last_updated asc) as next_response
from mwcrm.process_state ps, mwcrm.process_state_transition pst
where ps.created_date > sysdate - 1/24
and ps.id=pst.process_state
order by ps.id asc
Both dcw and Dems have provided appropriate alternative queries. I just wanted to toss in an explanation of why your query isn't behaving the way you expected it to.
If you have a query that includes a ROWNUM and an ORDER BY, Oracle applies the ROWNUM first and then the ORDER BY. So the query
SELECT *
FROM emp
WHERE rownum <= 5
ORDER BY empno
gets an arbitrary 5 rows from the EMP table and sorts them-- almost certainly not what was intended. If you want to get the "first N" rows using ROWNUM, you would need to nest the query. This query
SELECT *
FROM (SELECT *
FROM emp
ORDER BY empno)
WHERE rownum <= 5
sorts the rows in the EMP table and returns the first 5.
Actually "ordering" only makes sense on the outermost query -- if you order in a subquery, the outer query is permitted to scramble the results at will, so the subquery ordering does essentially nothing.
It looks like you just want to get the minimum last_updated that is greater than pst.last_updated -- it's easier when you look at it as the minimum (an aggregate), rather than a first row (which brings about other problems, like what if there are two rows tied for next_response?)
Give this a shot. Fair warning, been a few years since I've had Oracle in front of me, and I'm not used to the subquery-as-a-column syntax; if this blows up I'll make a version with it in the from clause.
select
ps.id, ps.created_date, pst.last_updated, pst.from_state, pst.to_state,
( select min(last_updated)
from mwcrm.process_state_transition subpst
where subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id) as next_response
from <the rest>
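To see the MIN() version behave as intended, here is a sketch on an in-memory SQLite database rather than Oracle. The mwcrm schema prefix and the sysdate filter are left out, the table contents are invented, and timestamps are stored as ISO-8601 text so that string comparison matches time order.

```python
import sqlite3

# Simplified stand-ins for mwcrm.process_state and mwcrm.process_state_transition.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE process_state (id INTEGER PRIMARY KEY);
CREATE TABLE process_state_transition (process_state INTEGER, last_updated TEXT);
INSERT INTO process_state VALUES (1), (2);
INSERT INTO process_state_transition VALUES
    (1, '2020-01-01 10:00'),
    (1, '2020-01-01 12:00'),
    (1, '2020-01-01 11:00'),
    (2, '2020-01-01 09:00');
""")

NEXT_RESPONSE_SQL = """
SELECT ps.id, pst.last_updated,
       -- Earliest later transition for the same process state, or NULL if none.
       (SELECT MIN(subpst.last_updated)
        FROM process_state_transition subpst
        WHERE subpst.last_updated > pst.last_updated
          AND subpst.process_state = ps.id) AS next_response
FROM process_state ps
JOIN process_state_transition pst ON ps.id = pst.process_state
ORDER BY ps.id, pst.last_updated
"""

next_responses = conn.execute(NEXT_RESPONSE_SQL).fetchall()
for row in next_responses:
    print(row)
```

Because MIN() is an aggregate, the result does not depend on the order in which the subquery happens to read its rows, and the last transition of each process state simply gets NULL.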
I've experienced this myself and you have to use ROW_NUMBER(), and an extra level of subquery, instead of rownum...
Just showing the new subquery, something like...
(
SELECT
last_updated
FROM
(
select
last_updated,
ROW_NUMBER() OVER (ORDER BY last_updated ASC) row_id
from
mwcrm.process_state_transition subpst
where
subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id
)
ordered_results
WHERE
row_id = 1
)
as next_response
An alternative would be to use MIN instead...
(
select
MIN(last_updated)
from
mwcrm.process_state_transition subpst
where
subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id
)
as next_response
The confirmed answer is plain wrong.
Consider a subquery that generates a unique row index number.
For example ROWNUM in Oracle.
You need the subquery to create the unique record number for paging purposes (see below).
Consider the following example query:
SELECT T0.*, T1.* FROM T0 LEFT JOIN T1 ON T0.Id = T1.Id
JOIN
(
SELECT DISTINCT T0.*, ROWNUM FROM T0 LEFT JOIN T1 ON T0.Id = T1.Id
WHERE (filter...)
)
WHERE (filter...) AND (ROWNUM > 10 AND ROWNUM < 20)
ORDER BY T1.Name DESC
The inner query is the exact same query but DISTINCT on T0.
You can't put the ROWNUM on the outer query since the LEFT JOIN(s) could generate many more results.
If you could order the inner query (T1.Name DESC) the generated ROWNUM in the inner query would match.
Since you cannot use an ORDER BY in the subquery, the numbers won't match and will be useless.
Thank god for ROW_NUMBER() OVER (ORDER BY ...), which fixes this issue,
although it is not supported by all DB engines.
Between them, LIMIT (which does not require an ORDER BY) and ROW_NUMBER() OVER will cover most DB engines.
But if you have neither of these options, for example when ROWNUM is your only option, then an ORDER BY on the subquery is a must!

Optimize sql query with the rank function

This query gets the top item in each group using the ranking function.
I want to reduce the number of inner selects down to two instead of three. I tried using the rank() function in the innermost query, but couldn't get it working along with an aggregate function. Then I couldn't use a where clause on 'itemrank' without wrapping it in yet another select statement.
Any ideas?
select *
from (
select
tmp.*,
rank() over (partition by tmp.slot order by slot, itemcount desc) as itemrank
from (
select
i.name,
i.icon,
ci.slot,
count(i.itemid) as itemcount
from items i
inner join citems ci on ci.itemid = i.itemid
group by i.name, i.icon, ci.slot
) as tmp
) as popularitems
where itemrank = 1
EDIT: using sql server 2008
In Oracle and Teradata (and perhaps others too), you can use QUALIFY itemrank = 1 to get rid of the outer select. This is not part of the ANSI standard.
You can use Common Table Expressions in Oracle or in SQL Server.
Here is the syntax:
WITH expression_name [ ( column_name [,...n] ) ]
AS
( CTE_query_definition )
The list of column names is optional only if distinct names for all resulting columns are supplied in the query definition.
The statement to run the CTE is:
SELECT <column_list>
FROM expression_name;
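Applied to the query in the question, a CTE can also carry the rank itself: a window function is evaluated after GROUP BY, so the aggregate can go straight into the OVER clause and one level of nesting disappears. This is a sketch on an in-memory SQLite database (3.25 or later), not SQL Server 2008, and the item rows are invented.

```python
import sqlite3

# Invented sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items  (itemid INTEGER PRIMARY KEY, name TEXT, icon TEXT);
CREATE TABLE citems (itemid INTEGER, slot TEXT);
INSERT INTO items  VALUES (1, 'sword', 's.png'), (2, 'axe', 'a.png'), (3, 'helm', 'h.png');
INSERT INTO citems VALUES (1, 'hand'), (1, 'hand'), (2, 'hand'), (3, 'head');
""")

TOP_ITEMS_SQL = """
WITH popularitems AS (
    SELECT i.name, i.icon, ci.slot,
           COUNT(i.itemid) AS itemcount,
           -- Rank on the aggregate directly instead of wrapping it in another SELECT.
           RANK() OVER (PARTITION BY ci.slot
                        ORDER BY COUNT(i.itemid) DESC) AS itemrank
    FROM items i
    INNER JOIN citems ci ON ci.itemid = i.itemid
    GROUP BY i.name, i.icon, ci.slot
)
SELECT name, slot, itemcount
FROM popularitems
WHERE itemrank = 1
ORDER BY slot
"""

top_items = conn.execute(TOP_ITEMS_SQL).fetchall()
print(top_items)
```

The outer SELECT is still needed because itemrank cannot be referenced in the WHERE clause of the query that computes it. RANK() keeps ties, so a slot with two equally popular items returns both.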