Oracle SQL order by in subquery problems!

I am trying to run a subquery in Oracle SQL and it will not let me order the subquery rows. Ordering the subquery is important, as Oracle seems to choose at will which of the matching rows to return to the main query.
select ps.id, ps.created_date, pst.last_updated, pst.from_state, pst.to_state,
(select last_updated from mwcrm.process_state_transition subpst
where subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id
and rownum = 1) as next_response
from mwcrm.process_state ps, mwcrm.process_state_transition pst
where ps.created_date > sysdate - 1/24
and ps.id=pst.process_state
order by ps.id asc
Really should be:
select ps.id, ps.created_date, pst.last_updated, pst.from_state, pst.to_state,
(select last_updated from mwcrm.process_state_transition subpst
where subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id
and rownum = 1
order by subpst.last_updated asc) as next_response
from mwcrm.process_state ps, mwcrm.process_state_transition pst
where ps.created_date > sysdate - 1/24
and ps.id=pst.process_state
order by ps.id asc

Both dcw and Dems have provided appropriate alternative queries. I just wanted to toss in an explanation of why your query isn't behaving the way you expected it to.
If you have a query that includes a ROWNUM and an ORDER BY, Oracle applies the ROWNUM first and then the ORDER BY. So the query
SELECT *
FROM emp
WHERE rownum <= 5
ORDER BY empno
gets an arbitrary 5 rows from the EMP table and sorts them, which is almost certainly not what was intended. If you want to get the "first N" rows using ROWNUM, you would need to nest the query. This query
SELECT *
FROM (SELECT *
FROM emp
ORDER BY empno)
WHERE rownum <= 5
sorts the rows in the EMP table and returns the first 5.
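If you happen to be on Oracle 12c or later, the row-limiting clause is another way to say "sort first, then take N" without the extra nesting. A minimal sketch, assuming 12c+; emp_demo and its three rows are made up for illustration and are not from the question:

```sql
-- Assumption: Oracle 12c or later (FETCH FIRST is not available before that).
-- emp_demo is an invented inline data set so the query runs on its own.
WITH emp_demo (empno, ename) AS (
  SELECT 7839, 'KING'  FROM dual UNION ALL
  SELECT 7369, 'SMITH' FROM dual UNION ALL
  SELECT 7566, 'JONES' FROM dual
)
SELECT empno, ename
FROM emp_demo
ORDER BY empno              -- the sort is applied first...
FETCH FIRST 2 ROWS ONLY;    -- ...then the limit: empno 7369 and 7566
```

On older versions the nested ROWNUM form above remains the way to go.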

Actually "ordering" only makes sense on the outermost query -- if you order in a subquery, the outer query is permitted to scramble the results at will, so the subquery ordering does essentially nothing.
It looks like you just want the minimum last_updated that is greater than pst.last_updated. It's easier when you treat it as a minimum (an aggregate) rather than a first row, which brings other problems: what if two rows are tied for next_response?
Give this a shot. Fair warning, been a few years since I've had Oracle in front of me, and I'm not used to the subquery-as-a-column syntax; if this blows up I'll make a version with it in the from clause.
select
ps.id, ps.created_date, pst.last_updated, pst.from_state, pst.to_state,
( select min(last_updated)
from mwcrm.process_state_transition subpst
where subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id) as next_response
from <the rest>
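Another option, which swaps the scalar subquery for an analytic function, is LEAD. This is only a sketch, reusing the table and column names from the question. One assumption to be aware of: if two transitions of the same process_state share the same last_updated, LEAD returns that equal timestamp, whereas the MIN version only returns strictly later ones.

```sql
-- Sketch: "next last_updated for the same process_state" via LEAD.
-- Assumes the join returns every transition of each selected process_state,
-- which holds here because the WHERE clause filters on ps only.
SELECT ps.id,
       ps.created_date,
       pst.last_updated,
       pst.from_state,
       pst.to_state,
       LEAD(pst.last_updated) OVER (PARTITION BY pst.process_state
                                    ORDER BY pst.last_updated) AS next_response
FROM mwcrm.process_state ps
JOIN mwcrm.process_state_transition pst
  ON ps.id = pst.process_state
WHERE ps.created_date > SYSDATE - 1/24
ORDER BY ps.id, pst.last_updated;
```

The last transition of each process_state gets NULL for next_response, the same as MIN over an empty set.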

I've experienced this myself and you have to use ROW_NUMBER(), and an extra level of subquery, instead of rownum...
Just showing the new subquery, something like...
(
SELECT
last_updated
FROM
(
select
last_updated,
ROW_NUMBER() OVER (ORDER BY last_updated ASC) row_id
from
mwcrm.process_state_transition subpst
where
subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id
)
ordered_results -- Oracle does not accept AS before a table alias
WHERE
row_id = 1
)
as next_response
An alternative would be to use MIN instead...
(
select
MIN(last_updated)
from
mwcrm.process_state_transition subpst
where
subpst.last_updated > pst.last_updated
and subpst.process_state = ps.id
)
as next_response

The confirmed answer is plain wrong.
Consider a subquery that generates a unique row index number.
For example ROWNUM in Oracle.
You need the subquery to create the unique record number for paging purposes (see below).
Consider the following example query:
SELECT T0.*, T1.* FROM T0 LEFT JOIN T1 ON T0.Id = T1.Id
JOIN
(
SELECT DISTINCT T0.*, ROWNUM FROM T0 LEFT JOIN T1 ON T0.Id = T1.Id
WHERE (filter...)
)
WHERE (filter...) AND (ROWNUM > 10 AND ROWNUM < 20)
ORDER BY T1.Name DESC
The inner query is the exact same query but DISTINCT on T0.
You can't put the ROWNUM on the outer query since the LEFT JOIN(s) could generate many more results.
If you could order the inner query (T1.Name DESC) the generated ROWNUM in the inner query would match.
Since you cannot use an ORDER BY in the subquery, the numbers won't match and will be useless.
Thank god for ROW_NUMBER() OVER (ORDER BY ...), which fixes this issue, although it is not supported by all DB engines.
Between them, LIMIT (which does not require an ORDER BY) and ROW_NUMBER() OVER will cover most DB engines.
But if you don't have either of these options, for example when ROWNUM is your only option, then an ORDER BY in the subquery is a must!
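For the paging case, a ROW_NUMBER() version might look roughly like the following. It is a sketch only: t0(id, name) is a hypothetical table, and the page bounds mirror the "> 10 and < 20" filter used above.

```sql
-- Hypothetical table t0(id, name).
-- Numbers the rows in the desired order inside the inline view, then filters
-- on that number one level up, so numbering and ordering always agree.
SELECT id, name
FROM (
  SELECT t0.id,
         t0.name,
         ROW_NUMBER() OVER (ORDER BY t0.name DESC, t0.id) AS rn  -- id breaks ties
  FROM t0
) numbered
WHERE rn > 10 AND rn < 20
ORDER BY rn;
```

Adding id as a tiebreaker keeps the numbering stable between page requests when names repeat.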

Related

Snowflake subquery

I have two tables: Transaction(ID, TERMINALID) and Terminal(ID, TERMINALID, EXPORT_DATE). The goal is to obtain, for each row of the Transaction table, the newest record from the Terminal table. Snowflake is used as the backend.
I have this SQL query:
SELECT tr.ID,
(SELECT te.ID
FROM "Terminal" te
WHERE te.TERMINALID = tr.TERMINALID
ORDER BY te.EXPORT_DATE DESC
LIMIT 1)
FROM "Transaction" tr;
But I get this error:
SQL compilation error: Unsupported subquery type cannot be evaluated
The error disappears if I replace tr.TERMINALID with a specific value, so it seems I can't reference the parent table from the nested SELECT. Why is this not possible? The query works in MySQL.
I'm afraid Snowflake doesn't support correlated subqueries of this kind.
You can achieve what you want by using FIRST_VALUE to compute the best id per terminalid:
-- First compute per-terminalid best id
with sub1 as (
select
terminalid,
first_value(id) over (partition by terminalid order by export_date desc) id
from terminal
),
-- Now, make sure there's only one per terminalid id
sub2 as (
select
terminalid,
any_value(id) id
from sub1
group by terminalid
)
-- Now use that result
select tr.ID, sub2.id
FROM "Transaction" tr
JOIN sub2 ON tr.terminalid = sub2.terminalid
You can run subqueries first to see what they do.
We're working on making our support for subqueries better, and possibly there's a simpler rewrite, but I hope it helps.
SELECT
tr.ID
, (SELECT te.ID
FROM "Terminal" te
WHERE te.TERMINALID = tr.TERMINALID
ORDER BY te.EXPORT_DATE DESC
LIMIT 1
) AS the_id -- <<-- add an alias for the column
FROM "Transaction" tr
;
UPDATE:
length for type varchar cannot exceed 10485760
just use type varchar (or text) instead
Works here (with quoted identifiers):
CREATE TABLE "Transaction" ("ID" VARCHAR(123), "TERMINALID" VARCHAR(123)) ;
CREATE TABLE "Terminal" ( "ID" VARCHAR(123), "TERMINALID" VARCHAR(123), "EXPORT_DATE" DATE);
SELECT tr."ID"
, (SELECT te."ID"
FROM "Terminal" te
WHERE te."TERMINALID" = tr."TERMINALID"
ORDER BY te."EXPORT_DATE" DESC
LIMIT 1) AS meuk
FROM "Transaction" tr
;
BONUS UPDATE: avoid the scalar subquery and use plain old NOT EXISTS(...) to obtain the record with the most recent date:
SELECT tr."ID"
, te."ID" AS meuk
FROM "Transaction" tr
JOIN "Terminal" te ON te."TERMINALID" = tr."TERMINALID"
AND NOT EXISTS ( SELECT *
FROM "Terminal" nx
WHERE nx."TERMINALID" = te."TERMINALID"
AND nx."EXPORT_DATE" > te."EXPORT_DATE"
)
;
A few years later (2022), some correlated subqueries are supported, but not this one.
Using this data:
WITH transaction(id, terminalid) AS (
SELECT * FROM VALUES
(1,10),
(2,11),
(3,12)
), terminal(id, terminalid, export_date) AS (
SELECT * FROM VALUES
(100, 10, '2022-03-18'::date),
(101, 10, '2022-03-19'::date),
(102, 11, '2022-03-20'::date),
(103, 11, '2022-03-21'::date),
(104, 11, '2022-03-22'::date),
(105, 12, '2022-03-23'::date)
)
So compared to Marcin's we can now use a QUALIFY to select only one value per terminalid in a single step:
WITH last_terminal as (
SELECT id,
terminalid
FROM terminal
QUALIFY row_number() over(PARTITION BY terminalid ORDER BY export_date desc) = 1
)
SELECT tr.ID,
te.id
FROM transaction AS tr
JOIN last_terminal AS te
ON te.TERMINALID = tr.TERMINALID
ORDER BY 1;
giving:
ID | ID
1 | 101
2 | 104
3 | 105
And if you have multiple terminal rows per day, and terminal.id is an incrementing number, you could use:
QUALIFY row_number() over(PARTITION BY terminalid ORDER BY export_date desc, id desc) = 1
Now, if your table is not that large, you can do the JOIN and then prune via QUALIFY, avoiding the CTE. On large tables this is much less performant, so I would only use this form for ad-hoc queries, where swapping forms is viable if performance problems occur.
SELECT tr.ID,
te.id
FROM transaction AS tr
JOIN terminal AS te
ON te.TERMINALID = tr.TERMINALID
QUALIFY row_number() over(PARTITION BY tr.id ORDER BY te.export_date desc, te.id desc) = 1
ORDER BY 1;
This kind of subquery is currently not supported.
Working with subqueries - Limitations:
The only type of subquery that allows a LIMIT / FETCH clause is an uncorrelated scalar subquery. Also, because an uncorrelated scalar subquery returns only 1 row, the LIMIT clause has little or no practical value inside a subquery
The query in question is a correlated subquery, hence the error.
SELECT tr.ID,
(SELECT te.ID
FROM "Terminal" te
WHERE te.TERMINALID = tr.TERMINALID --correlation
ORDER BY te.EXPORT_DATE DESC
LIMIT 1)
FROM "Transaction" tr;

Order by not working in Oracle subquery

I'm trying to return 7 events from a table, starting from today's date, and have them in date order:
SELECT ID
FROM table
where ID in (select ID from table
where DATEFIELD >= trunc(sysdate)
order by DATEFIELD ASC)
and rownum <= 7
If I remove the 'order by' it returns the IDs just fine and the query works, but it's not in the right order. Would appreciate any help with this since I can't seem to figure out what I'm doing wrong!
(edit) for clarification, I was using this before, and the order returned was really out:
select ID
from TABLE
where DATEFIELD >= trunc(sysdate)
and rownum <= 7
order by DATEFIELD
Thanks
The values for the ROWNUM "function" are applied before the ORDER BY is processed. That is why it doesn't work the way you used it (see the manual for a similar explanation).
When limiting a query using ROWNUM and an ORDER BY is involved, the ordering must be done in an inner select and the limit must be applied in the outer select:
select *
from (
select *
from table
where datefield >= trunc(sysdate)
order by datefield ASC
)
where rownum <= 7
You cannot use order by in where id in (select id from ...) kind of subquery. It wouldn't make sense anyway. This condition only checks if id is in subquery. If it affects the order of output, it's only incidental. With different data query execution plan might be different and output order would be different as well. Use explicit order by at the end of the main query.
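Putting the two points together (limit on an ordered inline view, plus an explicit ORDER BY on the main query), a sketch could look like this. The table name events is hypothetical, standing in for the placeholder "table" in the question; ID and DATEFIELD are the question's columns.

```sql
-- Hypothetical table events(id, datefield).
SELECT id, datefield
FROM (
  SELECT id, datefield
  FROM events
  WHERE datefield >= TRUNC(SYSDATE)
  ORDER BY datefield            -- sort inside the inline view
)
WHERE ROWNUM <= 7               -- ROWNUM now numbers the already-sorted rows
ORDER BY datefield;             -- explicit order for the final output
```

The outer ORDER BY only re-sorts the 7 rows that survived the ROWNUM filter, so it is cheap and makes the output order explicit rather than incidental.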
It is a well-known 'feature' of Oracle that rownum doesn't play nicely with order by. See http://www.adp-gmbh.ch/ora/sql/examples/first_rows.html for more information. In your case you should use something like:
SELECT ID
FROM (select ID, row_number() over (order by DATEFIELD ) r
from table
where DATEFIELD >= trunc(sysdate))
WHERE r <= 7
See also:
http://www.orafaq.com/faq/how_does_one_select_the_top_n_rows_from_a_table
http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html
http://asktom.oracle.com/pls/asktom/f?p=100:11:507524690399301::::P11_QUESTION_ID:127412348064
See also other similar questions on SO, eg.:
Oracle SELECT TOP 10 records
Oracle/SQL - Select specified range of sequential records
Your outer query can't "see" the ORDER BY in the inner query, and in this case ordering the inner query doesn't make sense anyway: the inner query is only used to build a subset of data for the outer query's WHERE, so the order of that subset doesn't matter.
Maybe if you explain better what you want to do, we can help you.
ORDER BY clause in subqueries:
The ORDER BY clause is not allowed inside a subquery, with the exception of inline views. If you attempt to include an ORDER BY clause, you receive an error message.
An inline view is a query in the FROM clause.
SELECT t.*
FROM (SELECT id, name FROM student) t

Complex SQL pagination Query

I am doing pagination for my data using the solution to this question.
I need to be using this solution for a more complex query now. Ie. the SELECT inside the bracket has joins and aggregate functions.
This is that solution I'm using as a reference:
;WITH Results_CTE AS
(
SELECT
Col1, Col2, ...,
ROW_NUMBER() OVER (ORDER BY SortCol1, SortCol2, ...) AS RowNum
FROM Table
WHERE <whatever>
)
SELECT *
FROM Results_CTE
WHERE RowNum >= @Offset
AND RowNum < @Offset + @Limit
The query that I need to incorporate into the above solution:
SELECT users.indicator, COUNT(*) as 'queries' FROM queries
INNER JOIN calls ON queries.call_id = calls.id
INNER JOIN users ON calls.user_id = users.id
WHERE queries.isresolved=0 AND users.indicator='ind1'
GROUP BY users.indicator ORDER BY queries DESC
How can I achieve this? So far I've made it work by removing the ORDER BY queries DESC part and putting that in the line ROW_NUMBER() OVER (ORDER BY ...) AS RowNum, but when I do this it doesn't allow me to order by that column ("Invalid column name 'queries'.").
What do I need to do to get it to order by this column?
edit: using SQL Server 2008
Try ORDER BY COUNT(*) DESC . It works on MySQL ... not sure about SQL Server 2008
I think queries is your alias for the COUNT(*) column, so use it like this:
SELECT users.indicator, COUNT(*) as 'queries' FROM queries
INNER JOIN calls ON queries.call_id = calls.id
INNER JOIN users ON calls.user_id = users.id
WHERE queries.isresolved=0 AND users.indicator='ind1'
GROUP BY users.indicator ORDER BY COUNT(*) DESC
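Combined with the paging CTE from the question, this might look roughly as follows. It is a sketch under a few assumptions: the page variables and their values are made up, and the users.indicator='ind1' filter is left out so that more than one group comes back to page through.

```sql
-- Assumed page bounds; the values are illustrative only.
DECLARE @Offset INT = 1, @Limit INT = 10;

WITH Results_CTE AS
(
    SELECT users.indicator,
           COUNT(*) AS queries,
           -- Order by the aggregate itself; the alias is not visible here.
           ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, users.indicator) AS RowNum
    FROM queries
    INNER JOIN calls ON queries.call_id = calls.id
    INNER JOIN users ON calls.user_id = users.id
    WHERE queries.isresolved = 0
    GROUP BY users.indicator
)
SELECT indicator, queries
FROM Results_CTE
WHERE RowNum >= @Offset
  AND RowNum < @Offset + @Limit
ORDER BY RowNum;
```

The "Invalid column name" error goes away because the window function orders by COUNT(*) directly instead of the select-list alias.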

ORDER BY in GROUP BY clause

I have a query
Select
(SELECT id FROM xyz M WHERE M.ID=G.ID AND ROWNUM=1 ) TOTAL_X,
count(*) from mno G where col1='M' group by col2
Now, from the subquery I have to fetch a random id. For this I am doing:
Select
(SELECT id FROM xyz M WHERE M.ID=G.ID AND ROWNUM=1 order by dbms_random.value ) TOTAL_X,
count(*) from mno G where col1='M' group by col2
But Oracle shows an error:
"Missing right parenthesis".
What is wrong with the query, and how can I write it to get a random id?
Please help.
Even if what you did was legal, it would not give you the result you want. The ROWNUM filter would be applied before the ORDER BY, so you would just be sorting one row.
You need something like this. I am not sure if this exact code will work given the correlated subquery, but the basic point is that you need to have a subquery that contains the ORDER BY without the ROWNUM filter, then apply the ROWNUM filter one level up.
WITH subq AS (
SELECT id FROM xyz M WHERE M.ID=G.ID order by dbms_random.value
)
SELECT (SELECT id FROM subq WHERE rownum = 1) total_x,
count(*)
from mno g where col1='M' group by col2
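One way to sidestep the scoping question is to number the rows randomly with an analytic function and keep the first row per group. This is only a sketch: it assumes xyz has some grouping column, called grp here, which is hypothetical and not in the question.

```sql
-- Hypothetical columns: xyz(id, grp). Picks one random id per grp.
SELECT grp, id AS random_id
FROM (
  SELECT m.grp,
         m.id,
         -- Random order within each group; rn = 1 is then a random row.
         ROW_NUMBER() OVER (PARTITION BY m.grp
                            ORDER BY DBMS_RANDOM.VALUE) AS rn
  FROM xyz m
)
WHERE rn = 1;
```

The result can then be joined to the grouped counts from mno on whatever column relates the two tables.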
You can't use order by in a subselect. It wouldn't matter either, because the row numbering is applied first, so you cannot influence it by using order by.
[edit]
I tried a solution. I don't have Oracle here, so you'll have to read between the typos.
In this case, I generate a single random value, get the count of records in xyz per mno.id, and generate a sequence for those records per mno.id.
Then, a level higher, I filter only those records whose index match with the random value.
This should give you a random id from xyz that matches the id in mno.
select
x.mnoId,
x.TOTAL_X
from
(SELECT
g.id as mnoId,
m.id as TOTAL_X,
count(*) over (partition by g.id) as MCOUNT,
row_number() over (partition by g.id order by m.id) as MINDEX,
r.RandomValue
from
mno g
inner join xyz m on m.id = g.id
cross join (select dbms_random.value as RandomValue from dual) r
where
g.col1 = 'M'
) x
where
x.MINDEX = 1 + trunc(x.MCOUNT * x.RandomValue)
The only difference between your two queries is the ORDER BY in the one that fails, right?
It so happens that ORDER BY doesn't fit inside a nested select.
You could do an ORDER BY inside a where clause that contains a select, though.
Edit: @StevenV is right.
If you're trying to do what I suspect, this should work
Select A.Id, Count(*)
From MNO A
Join (Select ID From XYZ M Where M.ID=G.ID And Rownum=1 Order By Dbms_Random.Value ) B On (B.ID = A.ID)
GROUP BY A.ID

Date of max id: sql/oracle optimization

What is a more elegant way of doing this:
select date from table where id in (
select max(id) from table);
Surely there is a better way...
You can use the ROWNUM pseudocolumn. The subquery is necessary to order the result before finding the first row:
SELECT date
FROM (SELECT * FROM table ORDER BY id DESC)
WHERE ROWNUM = 1;
You can use subquery factoring in Oracle 9i and later in the following way:
WITH ranked_table AS (
SELECT ROW_NUMBER() OVER (ORDER BY id DESC) AS rn, date -- ROWNUM would be assigned before the sort
FROM table
)
SELECT date FROM ranked_table WHERE rn = 1;
You can use a self-join, and find where no row exists with a greater id:
SELECT date
FROM table t1
LEFT OUTER JOIN table t2
ON t1.id < t2.id
WHERE t2.id IS NULL;
Which solution is best depends on the indexes in your table, and the volume and distribution of your data. You should test each solution to determine what works best, is fastest, is most flexible for your needs, etc.
select date from (select date from table order by id desc)
where rownum < 2
assuming your ids are unique.
EDIT: using subquery + rownum
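A different technique from the ones above, specific to Oracle, is the KEEP (DENSE_RANK LAST ...) form of an aggregate, which needs neither a subquery nor ROWNUM. A minimal sketch; demo and its rows are invented so the query runs on its own, and created stands in for the question's date column:

```sql
-- demo is an invented inline data set; created plays the role of "date".
WITH demo (id, created) AS (
  SELECT 1, DATE '2020-01-01' FROM dual UNION ALL
  SELECT 2, DATE '2020-03-15' FROM dual UNION ALL
  SELECT 3, DATE '2020-02-10' FROM dual
)
-- Returns the created value of the row with the highest id (2020-02-10 here).
SELECT MAX(created) KEEP (DENSE_RANK LAST ORDER BY id) AS created_of_max_id
FROM demo;
```

The outer MAX only matters if several rows tie on the highest id; with unique ids it simply returns that one row's value.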