rownum vs max in oracle - sql

I have a couple of queries to get latest modified parent_id
with max keyword:
select max(parent_id)
from sample_table
where modified_date=(select max(modified_date)
FROM sample_table
where id = 'test') and id = 'test';
with rownum keyword:
select *
from (
select parent_id,modified_date
from sample_table
where id = 'test'
order by modified_date desc)
WHERE rownum <= 1 ;
Both queries return the same, correct result.
Which one is the better and faster query?

The result is somewhat unpredictable because two records can have the same modified_date, so you have to apply a trick to return only a single row.
The first query is deterministic: it takes the latest modified_date; if several rows share that date, it takes the one with the highest parent_id. The second query is unpredictable: which of the tied rows it returns depends on how Oracle executes the query.
I would use the second query and modify it slightly to move the two order criteria close to each other:
select *
from (
select parent_id,modified_date
from sample_table
where id = 'test'
order by modified_date desc, parent_id desc)
WHERE rownum <= 1;
This type of query is also easier to extend to return more columns: just add them to the inner SELECT clause. With the other query, that is trickier.
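To see the tie-breaking point concretely, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. SQLite has no ROWNUM, so LIMIT 1 stands in for the `rownum <= 1` filter, and the sample rows are made up for illustration. Two rows share the latest modified_date; with the second sort key both approaches pick the same one.

```python
import sqlite3

# In-memory database with made-up rows; two 'test' rows tie on the latest date.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_table (id TEXT, parent_id INTEGER, modified_date TEXT)"
)
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?, ?)",
    [
        ("test", 10, "2024-01-01"),
        ("test", 20, "2024-03-01"),
        ("test", 30, "2024-03-01"),
        ("other", 99, "2024-05-01"),
    ],
)

# Approach 1: MAX(parent_id) among the rows with the latest modified_date.
max_query = """
    SELECT MAX(parent_id)
    FROM sample_table
    WHERE id = 'test'
      AND modified_date = (SELECT MAX(modified_date)
                           FROM sample_table
                           WHERE id = 'test')
"""

# Approach 2: sort by both criteria and keep the first row.
# LIMIT 1 is SQLite's stand-in for Oracle's "WHERE rownum <= 1".
top_query = """
    SELECT parent_id, modified_date
    FROM sample_table
    WHERE id = 'test'
    ORDER BY modified_date DESC, parent_id DESC
    LIMIT 1
"""

max_parent = conn.execute(max_query).fetchone()[0]
top_row = conn.execute(top_query).fetchone()
print(max_parent, top_row)
```

Without `parent_id DESC` in the ORDER BY, the second query would be free to return either of the two tied rows.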

I would say the best way is this one:
SELECT
MAX(parent_id) KEEP (DENSE_RANK FIRST ORDER BY modified_date desc, parent_id desc),
MAX(modified_date)
FROM sample_table
WHERE ID = 'test';

Related

How to select rows corresponding to a randomly selected column value in SQL

My query returns a result like the one shown in the table. I would like to randomly pick an ID from the ID column and get all the rows having that ID. How can I do that in Snowflake or SQL:
ID    Postalcode  Value  ...
1e3d  NK25F4      3214   ...
1e3d  NK25F4      3258   ...
1e3d  NK25F4      3354   ...
1f74  NG2LK8      5524
1f74  NG2LK8      5548
3e9a  N6B7H4      3694
3e9a  N6B7H4      3325
38e4  N6C7H2      3654   ...
Snowflake has a SAMPLE clause that returns a fixed number of "random" rows, so using it reduces the need to read all rows.
SELECT t.*
FROM your_table as t
JOIN (SELECT ID FROM your_table SAMPLE (1 ROWS)) as r
ON t.id = r.id
thus using your data above:
with your_table(id, postalcode, value) as (
select * from values
('1e3d', 'NK25F4', 3214),
('1e3d', 'NK25F4', 3258),
('1e3d', 'NK25F4', 3354),
('1f74', 'NG2LK8', 5524),
('1f74', 'NG2LK8', 5548),
('3e9a', 'N6B7H4', 3694),
('3e9a', 'N6B7H4', 3325),
('38e4', 'N6C7H2', 3654)
)
I get (random set) but one looks like:
ID    POSTALCODE  VALUE
1f74  NG2LK8      5,524
1f74  NG2LK8      5,548
You could also use a NATURAL JOIN like:
SELECT *
FROM your_table
NATURAL JOIN (SELECT ID FROM your_table SAMPLE (1 ROWS))
You could put your existing query in a common table expression, then pick a random ID from it, and use it to filter the dataset:
with
dat as ( ... your query ...),
tid as (select id from dat order by random() fetch first 1 row)
select d.*
from dat d
inner join tid t on t.id = d.id
The second CTE, tid, picks the random id: it orders the dataset randomly, then takes the id of the top row.
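A minimal sketch of the same two-CTE pattern, run against SQLite through Python's built-in sqlite3 module instead of Snowflake. The table name your_table and the use of `LIMIT 1` in place of `fetch first 1 row` are assumptions made for this illustration; the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id TEXT, postalcode TEXT, value INTEGER)")
rows = [
    ("1e3d", "NK25F4", 3214),
    ("1e3d", "NK25F4", 3258),
    ("1e3d", "NK25F4", 3354),
    ("1f74", "NG2LK8", 5524),
    ("1f74", "NG2LK8", 5548),
    ("3e9a", "N6B7H4", 3694),
    ("3e9a", "N6B7H4", 3325),
    ("38e4", "N6C7H2", 3654),
]
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

# dat stands in for "your query"; tid picks one id at random.
query = """
    WITH dat AS (SELECT * FROM your_table),
         tid AS (SELECT id FROM dat ORDER BY random() LIMIT 1)
    SELECT d.id, d.postalcode, d.value
    FROM dat d
    INNER JOIN tid t ON t.id = d.id
"""
picked = conn.execute(query).fetchall()

# Every returned row should carry the same id, and no row of that id is missing.
picked_ids = {row[0] for row in picked}
expected_count = sum(1 for r in rows if r[0] in picked_ids)
print(picked)
```

The output differs from run to run, but it always contains all rows of exactly one id.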
Something like
SELECT *
FROM Table_NAME
WHERE ID = (SELECT ID FROM Table_Name ORDER BY RAND() LIMIT 1);
Should work. Though it's not particularly efficient and in many application scenarios it would arguably be more reasonable overall to compute the random ID in your application (e.g. keeping the set of all ids cached, periodically pulling it separately if need be etc).
(Note: the query assumes MySQL; other variants may have slightly different keywords or structure, e.g. for the random function.)
WITH DATA AS (
select '1e3d' id,'NK25F4' postalcode,3214 some_value union all
select '1e3d' id,'NK25F4' postalcode,3258 some_value union all
select '1e3d' id,'NK25F4' postalcode,3354 some_value union all
select '1f74' id,'NG2LK8' postalcode,5524 some_value union all
select '1f74' id,'NG2LK8' postalcode,5548 some_value union all
select '3e9a' id,'N6B7H4' postalcode,3694 some_value union all
select '3e9a' id,'N6B7H4' postalcode,3325 some_value union all
select '38e4' id,'N6C7H2' postalcode,3654 some_value )
SELECT * FROM DATA ,LATERAL (SELECT ID FROM DATA SAMPLE(2 ROWS)) I WHERE I.ID = DATA.ID
You can also play with the window frame a little and let qualify do the work
select *
from your_table
qualify id=first_value(id) over (order by random() rows between unbounded preceding and unbounded following)
Snowflake deviates from the ANSI standard on the default window frame for rank-related functions (first_value, last_value, nth_value), which makes the above equivalent to:
select *
from your_table
qualify id=first_value(id) over (order by random())

Listing multiple columns in a single row in SQL

(select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,ROW_NUMBER() OVER(PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID ) AS SEQNUM
from AC_POS_TRANSACTION_TRK aptt WHERE [RESULT] ='Success'
GROUP BY ID, EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE )
Hello,
In the query above, I want to get the rows with seqnum=1 and seqnum=2 for each transaction id.
But if a transaction id has no second row (seqnum=2), I don't want to get any rows for that transaction id.
Thanks!!
Something like this
Not 100% sure if this is correct without your table definition, but my understanding is that you want to EXCLUDE records if that record has an entry with seqnum=2. You can't use a WHERE clause alone because that would still return seqnum = 1.
You can use an exists /not exists or in/not in clause like this
(select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,ROW_NUMBER() OVER(PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID ) AS SEQNUM
from AC_POS_TRANSACTION_TRK aptt WHERE [RESULT] ='Success'
and not exists ( select 1 from AC_POS_TRANSACTION_TRK a where a.id = aptt.id
and a.seqnum = 2)
GROUP BY ID, EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE )
Basically, this excludes records if a record exists as specified in the NOT EXISTS subquery.
One option you can try is to add a count of rows per group using the same partitioning criteria and then filter accordingly. I'm not entirely sure about your query without seeing it in context and with sample data: there's no aggregation, so why use GROUP BY?
However can you try something along these lines
select * from (
select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,
Row_Number() over(partition by EXTERNAL_TRANSACTION_ID order by ID) as SEQNUM,
Count(*) over(partition by EXTERNAL_TRANSACTION_ID) Qty
from AC_POS_TRANSACTION_TRK
where [RESULT] ='Success'
)x
where SEQNUM in (1,2) and Qty>1
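To check the count-per-partition filter on a small case, here is a hedged sketch run on SQLite (3.25 or later, for window functions) via Python's sqlite3 module. The lower-case table and column names and the sample rows are made up for illustration; the original table definition is not known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ac_pos_transaction_trk ("
    "id INTEGER, external_transaction_id TEXT, result TEXT)"
)
conn.executemany(
    "INSERT INTO ac_pos_transaction_trk VALUES (?, ?, ?)",
    [
        (1, "A", "Success"),
        (2, "A", "Success"),
        (3, "A", "Success"),  # third row of A: seqnum 3, filtered out
        (4, "B", "Success"),  # B has a single row: excluded entirely
        (5, "C", "Success"),
        (6, "C", "Failed"),   # only one successful row for C: excluded
        (7, "D", "Success"),
        (8, "D", "Success"),
    ],
)

query = """
    SELECT id, external_transaction_id, seqnum
    FROM (
        SELECT id, external_transaction_id,
               ROW_NUMBER() OVER (PARTITION BY external_transaction_id
                                  ORDER BY id) AS seqnum,
               COUNT(*) OVER (PARTITION BY external_transaction_id) AS qty
        FROM ac_pos_transaction_trk
        WHERE result = 'Success'
    ) x
    WHERE seqnum IN (1, 2) AND qty > 1
    ORDER BY external_transaction_id, seqnum
"""
kept = conn.execute(query).fetchall()
print(kept)
```

Only transactions A and D have at least two successful rows, so only their first two rows come back.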
This should do the job.
With Qry As (
-- Your original query goes here
)
Select Qry.*
From Qry
Where Exists (
Select *
From Qry Qry1
Where Qry1.EXTERNAL_TRANSACTION_ID = Qry.EXTERNAL_TRANSACTION_ID
And Qry1.SEQNUM = 1
)
And Exists (
Select *
From Qry Qry2
Where Qry2.EXTERNAL_TRANSACTION_ID = Qry.EXTERNAL_TRANSACTION_ID
And Qry2.SEQNUM = 2
)
BTW, your original query looks problematic to me. Specifically, I think that instead of being listed in a GROUP BY, those columns should be in the PARTITION BY of the OVER clause. But without knowing more about the table structures and what you're trying to achieve, I could not say for sure.

Query historized data

To describe my query problem, the following data is helpful:
A single table contains the columns ID (int), VAL (varchar) and ORD (int)
The values of VAL may change over time. When that happens, older items identified by ID are not updated; a new row is appended instead. The last valid item for an ID is identified by the highest ORD value (it increases over time).
T0, T1 and T2 are points in time where data got entered.
How do I get to the result set in an efficient manner?
A solution must not involve materialized views etc. but should be expressible in a single SQL query. Using PostgreSQL 9.3.
The correct way to select a groupwise maximum in PostgreSQL is using DISTINCT ON:
SELECT DISTINCT ON (id) sysid, id, val, ord
FROM my_table
ORDER BY id,ord DESC;
You want all records for which no newer record exists:
select *
from mytable
where not exists
(
select *
from mytable newer
where newer.id = mytable.id
and newer.ord > mytable.ord
)
order by id;
You can do the same with row numbers. Give the latest entry per ID the number 1 and keep these:
select sysid, id, val, ord
from
(
select
sysid, id, val, ord,
row_number() over (partition by id order by ord desc) as rn
from mytable
) as ranked
where rn = 1
order by id;
Left join the table (A) against itself (B) on the condition that B is more recent than A. Pick only the rows where B does not exist (i.e. A is the most recent row).
SELECT last_value.*
FROM my_table AS last_value
LEFT JOIN my_table
ON my_table.id = last_value.id
AND my_table.ord > last_value.ord
WHERE my_table.id IS NULL;
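As a rough cross-check, the three portable variants above (NOT EXISTS, row numbers, and the self left join) can be run side by side. This sketch uses SQLite 3.25+ through Python's sqlite3 module rather than PostgreSQL 9.3, so DISTINCT ON is left out; the sysid column and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (sysid INTEGER, id INTEGER, val TEXT, ord INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, "a", 1),
        (2, 2, "b", 1),
        (3, 1, "a2", 2),  # newer version of id 1
        (4, 3, "c", 2),
        (5, 2, "b2", 3),  # newer version of id 2
    ],
)

queries = {
    # Keep rows for which no newer row with the same id exists.
    "not_exists": """
        SELECT cur.sysid, cur.id, cur.val, cur.ord
        FROM my_table cur
        WHERE NOT EXISTS (SELECT 1 FROM my_table newer
                          WHERE newer.id = cur.id AND newer.ord > cur.ord)
        ORDER BY cur.id
    """,
    # Number rows per id from newest to oldest and keep number 1.
    "row_number": """
        SELECT sysid, id, val, ord
        FROM (SELECT sysid, id, val, ord,
                     ROW_NUMBER() OVER (PARTITION BY id ORDER BY ord DESC) AS rn
              FROM my_table) AS ranked
        WHERE rn = 1
        ORDER BY id
    """,
    # Anti-join: keep rows that find no newer partner.
    "left_join": """
        SELECT cur.sysid, cur.id, cur.val, cur.ord
        FROM my_table cur
        LEFT JOIN my_table newer
               ON newer.id = cur.id AND newer.ord > cur.ord
        WHERE newer.id IS NULL
        ORDER BY cur.id
    """,
}
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
print(results["not_exists"])
```

All three return the same rows here; which one is fastest on real data depends on the indexes, for example one on (id, ord).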

Filter SQL data by repetition on a column

Very simple basic SQL question here.
I have this table:
Row  Id          Hour  Minute  City_Search
1    1409346767  23    24      Balears (Illes)
2    1409346767  23    13      Albacete
3    1409345729  23    7       Balears (Illes)
4    1409345729  23    3       Balears (Illes)
5    1409345729  22    56      Balears (Illes)
What I want to get is only one distinct row by ID and select the last City_Search made by the same Id.
So, in this case, the result would be:
Row  Id          Hour  Minute  City_Search
1    1409346767  23    24      Balears (Illes)
3    1409345729  23    7       Balears (Illes)
What's the easiest way to do it?
Obviously I don't want to delete any data, just query it.
Thanks for your time.
SELECT T.Row,
T.Id,
T.Hour,
T.Minute,
T.City_Search
FROM Table T
JOIN
(
SELECT MIN(Row) AS Row,
ID
FROM Table
GROUP BY ID
) AS M
ON M.Row = T.Row
AND M.ID = T.ID
Can you change hour/minute to a timestamp?
What you want in this case is to first select what uniquely identifies your row:
Select id, max(time) from [table] group by id
Then use that query to add the data to it.
SELECT tdata.id, tdata.city_search, tdata.time
FROM (SELECT id, max(time) as lasttime FROM [table] GROUP BY id) as Tkey
INNER JOIN [table] as tdata
ON tkey.id = tdata.id AND tkey.lasttime = tdata.time
That should do it.
Two options to do it without a join:
Use the Row_Number function to find the last one:
Select * FROM
(Select *,
row_number() over(Partition BY ID Order BY Hour desc, Minute Desc) as RNB
from table) t
Where RNB=1
Or manipulate the string and use a simple Max function:
Select ID,Right(MAX(Concat(LPAD(Hour,2,'0'),LPAD(Minute,2,'0'),RPAD(City_Search,20,' '))),20)
From Table
Group by ID
Avoiding joins is often faster because the table is read only once, but check the execution plan on your own data.
Hope this helps

Date of max id: sql/oracle optimization

What is a more elegant way of doing this:
select date from table where id in (
select max(id) from table);
Surely there is a better way...
You can use the ROWNUM pseudocolumn. The subquery is necessary to order the result before finding the first row:
SELECT date
FROM (SELECT * FROM table ORDER BY id DESC)
WHERE ROWNUM = 1;
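You can use subquery factoring in Oracle 9i and later in the following way. ROWNUM has to be applied outside the ordered subquery, because within a single query block it is assigned before ORDER BY is evaluated:
WITH ordered_table AS (
SELECT date
FROM table
ORDER BY id DESC
)
SELECT date FROM ordered_table WHERE ROWNUM = 1;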
You can use a self-join, and find where no row exists with a greater id:
SELECT t1.date
FROM table t1
LEFT OUTER JOIN table t2
ON t1.id < t2.id
WHERE t2.id IS NULL;
Which solution is best depends on the indexes in your table, and the volume and distribution of your data. You should test each solution to determine what works best, is fastest, is most flexible for your needs, etc.
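Before comparing speed, it is worth confirming that the candidates agree. The sketch below does that on SQLite through Python's sqlite3 module, not on Oracle: `LIMIT 1` stands in for the ROWNUM filter, and the table name events and column event_date are hypothetical replacements for the reserved words table and date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, event_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2024-01-05"), (2, "2024-02-10"), (3, "2024-01-20")],
)

queries = {
    # The original form: look up the row whose id is the maximum.
    "max_subquery": """
        SELECT event_date FROM events
        WHERE id = (SELECT MAX(id) FROM events)
    """,
    # Sort by id descending and keep the first row.
    "order_limit": """
        SELECT event_date FROM events ORDER BY id DESC LIMIT 1
    """,
    # Self-join: keep the row for which no row with a greater id exists.
    "anti_join": """
        SELECT t1.event_date
        FROM events t1
        LEFT OUTER JOIN events t2 ON t1.id < t2.id
        WHERE t2.id IS NULL
    """,
}
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
print(results)
```

Note that the row with the highest id (3) does not carry the latest date, so all three return "2024-01-20", which is what the question asks for.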
select date from (select date from table order by id desc)
where rownum < 2
assuming your ids are unique.
EDIT: using subquery + rownum