I have a query wherein I have to sort a table and retrieve the first matching value for every ID.
What I would like to achieve is to match each ID of Table A with the first ID_2 from the sorted Table B.
I have a rough idea of the code:
select A.ID, A.COL1, B.COL1, B.COL2
from A, B
where A.ID = B.ID
and B.ID_2 = (select ID_2
from (select ID_2
from B B2
where B2.ID = A.ID
order by (case when B2.PRIO ...))
where rownum = 1)
The problem here is that A.ID is not accessible within the nested select in the where clause.
Another way that I found was using an analytic function:
select ID, COL1, COL2
from (select A.ID, A.COL1, B.COL2,
row_number() over (partition by A.ID order by (case when B.PRIO ...)) row_num
from A, B
where A.ID = B.ID)
where row_num = 1
The problem with this code is that I suspect it will not perform well.
Can anyone help me? =)
row_number() is not a statistic function. It is an analytic or window function. It is probably your best bet. I would do:
select a.*
from A join
(select b.*,
row_number() over (partition by b.ID order by (case when b.PRIO ...)) as seqnum
from b
) b
on A.ID = B.ID and b.seqnum = 1;
If you really only want A.ID, then you don't need A at all . . . the information is in B.ID (assuming it is not being used for filtering). The above then simplifies to:
select b.id
from (select b.*,
row_number() over (partition by b.ID order by (case when b.PRIO ...)) as seqnum
from b
) b
where b.seqnum = 1;
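To see the row_number() pattern run end to end, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Oracle (this assumes the bundled SQLite is 3.25 or later, which is when window functions arrived). The table contents and the priority order inside the CASE expression are made up for illustration, since the real ordering is elided in the question.

```python
import sqlite3

# In-memory stand-in for table B; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER, id_2 INTEGER, prio TEXT);
    INSERT INTO b VALUES
        (1, 10, 'LOW'), (1, 11, 'HIGH'), (1, 12, 'MEDIUM'),
        (2, 20, 'MEDIUM'), (2, 21, 'LOW');
""")

# Hypothetical priority order: HIGH first, then MEDIUM, then everything else.
first_per_id = conn.execute("""
    SELECT id, id_2
    FROM (SELECT b.*,
                 ROW_NUMBER() OVER (
                     PARTITION BY b.id
                     ORDER BY CASE b.prio WHEN 'HIGH' THEN 1
                                          WHEN 'MEDIUM' THEN 2
                                          ELSE 3 END
                 ) AS seqnum
          FROM b) AS ranked
    WHERE seqnum = 1
    ORDER BY id
""").fetchall()

print(first_per_id)
```

Each id keeps exactly one row: the one whose priority sorts first within its partition.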
You don't need a correlated sub-subquery (which is invalid in Oracle), and you don't need an analytic function either. You need the aggregate first/last function.
... and b.id_2 = (select max(id_2) keep (dense_rank first order by case.....)
from b b2
where b2.id = a.id
) .....
Even this is probably too complicated. If you would describe your requirement (instead of just posting some incomplete code), the community may be able to help you simplify the query even further.
I'm trying to write a CASE expression in T-SQL, in a query with a GROUP BY clause, that basically asks: if there's an a.id, then provide the b.name associated with it (where a.id = b.id).
What I have so far is: (Updated Query)
SELECT b.name, ...
MAX(CASE
WHEN a.id IS NOT NULL
THEN b.name END)
FROM ...
LEFT JOIN table_b AS b
ON b.id = a.id
GROUP BY ...
Because it's T-SQL, the CASE has to be in an aggregate function or the GROUP BY clause, which is why I included the MAX. However, without the GROUP BY clause, there would be 4 values. I need the most recent value as defined by a.datetime. How can I put that condition in the CASE expression?
Your last edit changes quite a lot. You should use ROW_NUMBER for this:
WITH CTE AS
(
SELECT b.name, ...
RN = ROW_NUMBER() OVER( PARTITION BY a.id
ORDER BY b.DateColumn DESC)
FROM ...
LEFT JOIN table_b AS b
ON b.id = a.id
)
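SELECT *,
CASE
WHEN id IS NOT NULL -- a.id, which the CTE must expose in its select list
THEN name -- b.name; the aliases a and b are out of scope outside the CTE
END
FROM CTE
WHERE RN = 1;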
If I understood you correctly (from now on, try including the full query, table data/structure, input, and desired output), you can join them from the beginning and then just pick MAX():
SELECT ....
MAX(b.name)
FROM YourTable a
LEFT JOIN YourTable b
ON(a.id = b.id)
...
GROUP BY ....
If there's a match, the name will be fetched. Otherwise - it will be NULL .
I am trying to write a query that returns only the latest row from a table.
Initially I used max(id) in the query.
But as I use a sequence and my environment is clustered, I cannot rely on the sequence, as its values are generated out of order.
So I decided to order by creation time and pick the top row using rownum.
I used something like:
SELECT A.id
FROM Table_A A, Table_B B
WHERE A.status = 'COMPLETED'
AND B.name = 'some_name'
AND A.id = B.id
AND rownum = 1
order by A.Creation_Time;
This somehow returns a wrong result, say 42145.
If I remove the rownum condition, the top record is different, say 45343.
When using rownum with order by, you need to use a subquery. This has to do with the order of evaluation of the where and order by. So, try this:
SELECT ab.*
FROM (SELECT A.id
FROM Table_A A JOIN
Table_B B
ON A.id = B.id
WHERE A.status = 'COMPLETED' AND B.name = 'some_name'
ORDER BY A.Creation_Time
) ab
WHERE rownum = 1;
I should add: Oracle 12c and later support FETCH FIRST 1 ROW ONLY, which is more convenient:
SELECT A.id
FROM Table_A A JOIN
Table_B B
ON A.id = B.id
WHERE A.status = 'COMPLETED' AND B.name = 'some_name'
ORDER BY A.Creation_Time
FETCH FIRST 1 ROW ONLY;
Any chance you meant to do this? (Specifying to order DESC)
SELECT A.id
FROM Table_A A, Table_B B
WHERE A.status = 'COMPLETED'
AND B.name = 'some_name'
AND A.id = B.id
AND rownum = 1
order by A.Creation_Time DESC;
EDIT: Obviously specifying order DESC was critical to your query. I had to upvote Gordon also, since he's completely correct that you need to limit the rows returned after sorting, so the methods he suggests are perfect for that. So, for anyone reading this far, I wanted to leave no doubt that rownum is assigned after the where clause is processed but before sorting or aggregation (source).
NB: As good as Gordon's answer is, there is one more consideration that may or may not be important here. It seems that it might be possible to have duplicate values of A.Creation_Time, which could cause you to see non-deterministic behavior as far as which of the duplicate rows might be returned when executing that query. If that problem might arise, Tom Kyte (same link as earlier) suggests adding some unique column value to the order by clause as an easy dodge. For example:
SELECT t.id
FROM (SELECT *
FROM (SELECT A.id, A.Creation_Time, A.rowid AS rid
FROM Table_A A
JOIN Table_B B
ON A.id = B.id
WHERE A.status = 'COMPLETED' AND B.name = 'some_name'
)
ORDER BY Creation_Time DESC, rid DESC
) t
WHERE rownum = 1;
Here ROWID is a unique pseudocolumn that can serve the purpose.
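A minimal sketch of why the tie-breaker matters, using Python's sqlite3 module as a stand-in for Oracle. SQLite has no rownum, so LIMIT 1 plays the role of the outer rownum filter (or FETCH FIRST 1 ROW ONLY), and the primary key id is used as the unique tie-breaker instead of rowid. All rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, status TEXT, creation_time TEXT);
    CREATE TABLE table_b (id INTEGER, name TEXT);
    -- Two COMPLETED rows share the latest creation_time on purpose.
    INSERT INTO table_a VALUES
        (42145, 'COMPLETED', '2015-01-01 10:00:00'),
        (45343, 'COMPLETED', '2015-03-01 09:30:00'),
        (45344, 'COMPLETED', '2015-03-01 09:30:00'),
        (45345, 'PENDING',   '2015-04-01 08:00:00');
    INSERT INTO table_b VALUES
        (42145, 'some_name'), (45343, 'some_name'),
        (45344, 'some_name'), (45345, 'some_name');
""")

# Sort first, then limit. The second sort key makes the winner deterministic
# when creation_time values collide.
latest = conn.execute("""
    SELECT a.id
    FROM table_a AS a
    JOIN table_b AS b ON a.id = b.id
    WHERE a.status = 'COMPLETED' AND b.name = 'some_name'
    ORDER BY a.creation_time DESC, a.id DESC
    LIMIT 1
""").fetchone()

print(latest)
```

Without the second sort key, either of the two tied rows could legitimately come back.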
Apparently outer joins to a subquery are not allowed by Oracle. For each row in table A, I'm trying to find the row in table B with the same ID and the latest date.
Something like this:
SELECT a.*, b.date, b.val1, b.val2
FROM a, b
WHERE b.id (+) = a.id
AND b.date (+) = (SELECT MAX(b.date) FROM a, b WHERE a.id = b.id);
Removing the outer join (+) on b.date allows it to be parsed, but no rows are returned when there are no rows on table B. I need the query to just return NULL in this case. Is there a way around this?
Thanks
I think what you want is this:
SELECT a.*, b.date, b.val1, b.val2
FROM a
LEFT JOIN b ON b.id = a.id
WHERE (b.date is null
or b.date = (SELECT MAX(b2.date) FROM b b2 WHERE a.id = b2.id));
This way, the outer join is just performed on id. Then we're filtering out all of the rows where b.date is not the max for the corresponding row in a.
As an aside, you'll note that I removed a from the sub-query. As originally written, the sub-query returned the largest date in b that had a corresponding row in a. The same value would be used for every row of the outer query. The revised version makes the sub-query correlate to the outer query (i.e. it will get the corresponding max(date) for each row returned).
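Here is a small sketch of that query shape, run through Python's sqlite3 module as a stand-in for Oracle. The column is called dt rather than date to avoid the reserved word, and the rows are invented: id 1 has two rows in b, id 2 has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, dt TEXT, val1 TEXT);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, '2020-01-01', 'old'), (1, '2020-02-01', 'new');
""")

# Outer join on id only, then keep either the unmatched row (dt IS NULL)
# or the row whose dt equals the per-id maximum from the correlated subquery.
latest_or_null = conn.execute("""
    SELECT a.id, b.dt, b.val1
    FROM a
    LEFT JOIN b ON b.id = a.id
    WHERE b.dt IS NULL
       OR b.dt = (SELECT MAX(b2.dt) FROM b AS b2 WHERE b2.id = a.id)
    ORDER BY a.id
""").fetchall()

print(latest_or_null)
```

The row for id 2 survives with NULLs, which is the behaviour the question asked for.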
I already voted for Allan's answer, but just to demonstrate an alternative approach, here's how it can be done with an analytic function:
SELECT * FROM (
SELECT a.*, b.date, b.val1, b.val2,
ROW_NUMBER() over (PARTITION BY a.id ORDER BY b.date DESC) r
FROM a LEFT JOIN b ON a.id=b.id
)
WHERE r=1
This will include only one row for each a.id, even if there are multiple b rows with the maximum date. To include all of them, change ROW_NUMBER to RANK.
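The difference between ROW_NUMBER and RANK on ties can be sketched with Python's sqlite3 module standing in for Oracle (this assumes SQLite 3.25 or later for window functions). The column is named dt instead of date, and the data is invented so that id 2 has two rows sharing the maximum date while id 3 has no rows in b at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, dt TEXT, val1 TEXT);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES
        (1, '2020-01-01', 'old'),
        (1, '2020-02-01', 'new'),
        (2, '2020-03-01', 'tie-x'),
        (2, '2020-03-01', 'tie-y');
""")

# Same query twice; only the window function name changes.
QUERY = """
    SELECT id, dt, val1
    FROM (SELECT a.id, b.dt, b.val1,
                 {fn}() OVER (PARTITION BY a.id ORDER BY b.dt DESC) AS r
          FROM a LEFT JOIN b ON a.id = b.id)
    WHERE r = 1
    ORDER BY id, val1
"""

with_row_number = conn.execute(QUERY.format(fn="ROW_NUMBER")).fetchall()
with_rank = conn.execute(QUERY.format(fn="RANK")).fetchall()

print(len(with_row_number), len(with_rank))
```

ROW_NUMBER returns one row per a.id (which of the tied rows wins for id 2 is arbitrary), while RANK returns both tied rows.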
How about a scalar subquery?
select a.*, (select max(b.date) from b where b.id = a.id) as b_date
from a;
Edit: You can save the max date to a variable
DECLARE @maxDate as datetime
SET @maxDate = (SELECT MAX(date) FROM b)
SELECT a.*, b.date, b.val1, b.val2
FROM a
LEFT OUTER JOIN b ON a.id = b.id
AND b.date = @maxDate
This may be more or less efficient than Allan's answer, depending on if A has many more rows than B (or vice-versa). If B has a ton of rows, then querying it twice (which my answer does) is probably not the best solution.
The following query gives me a single row because b.id is pinned. I would like a query which I can give a group of ids and get the minimum valued row for each of them.
The effect I want is as if I wrapped this query in a loop over a collection of ids and executed the query with each id as b.id = value but that will be (tens of?) thousands of queries.
select top 1 a.id, b.id, a.time_point, b.time_point
from orientation_momentum a, orientation_momentum b
where a.id = '00820001001' and b.id = '00825001001'
order by calculatedValue() asc
This is on sql-server but I would prefer a portable solution if it's possible.
SQL Server ranking function should do the trick.
select * from (
select a.id as a_id, b.id as b_id,
a.time_point as a_time_point, b.time_point as b_time_point,
rank() over (partition by a.id, b.id
order by calculatedValue() asc) ranker
from orientation_momentum a, orientation_momentum b
where a.id = '00820001001' and b.id between 10 and 20
) Z where ranker = 1
I want to select distinct or unique records from a database I am querying. How can I do this but at the same time select the entire record instead of just the column that I am distinguishing as unique? Do I have to do unruly joins?
Depending on the database that you are using, you can use window functions. If you want only rows that never repeat:
select t.*
from (select t.*,
count(*) over (partition by <id>) as numdups
from t
) t
where numdups = 1
If you want one example of each row:
select t.*
from (select t.*,
row_number(*) over (partition by <id> order by <id>) as seqnum
from t
) t
where seqnum = 1
If you don't have window functions, you can get the same thing done with "unruly joins".
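Both window-function variants can be tried out with a small sketch in Python's sqlite3 module (assuming SQLite 3.25 or later; the table t and its rows are invented for illustration). Here id 2 appears twice, so the first query drops it entirely and the second keeps a single representative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, payload TEXT);
    INSERT INTO t VALUES (1, 'a'), (2, 'b'), (2, 'c'), (3, 'd');
""")

# Rows whose id never repeats.
never_repeated = conn.execute("""
    SELECT id, payload
    FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY id) AS numdups FROM t) AS x
    WHERE numdups = 1
    ORDER BY id
""").fetchall()

# One row per id; which duplicate is kept is arbitrary with this ORDER BY.
one_per_id = conn.execute("""
    SELECT id
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS seqnum
          FROM t) AS x
    WHERE seqnum = 1
    ORDER BY id
""").fetchall()

print(never_repeated, one_per_id)
```

To control which duplicate survives in the second query, order the window by a column that distinguishes the rows rather than by the partition key itself.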
If you want only one column out of several to be unique and you have joins that might include multiple records, then you have to determine which of the two or more values you want the query to provide. This can be done with aggregate functions, with correlated sub-queries, or with derived tables or CTEs (available in SQL Server; I'm not sure whether Oracle has them).
But you have to determine which value you want before you write the query. Once you know that then you probably know how to get it.
Here are some quick examples (I used SQL Server coding conventions but most of this should make sense in Oracle as it is all basic SQL, Oracle may have a different way of declaring a parameter):
select a.a_id, max (b.test) , min (c.test2)
from tablea a
join tableb b on a.a_id = b.a_id
join tablec c on a.a_id = c.a_id
group by a.a_id
order by max(b.test), min(c.test2)
Select a.a_id, (select top 1 b.test from tableb b where a.a_id = b.a_id order by test2),
(select top 1 b.test2 from tableb b where a.a_id = b.a_id order by test2),
(select top 1 c.test3 from tablec c where a.a_id = c.a_id order by test4)
from tablea a
declare @a_id int
set @a_id = 189
select a.a_id, b.test, b.test4
from tablea a
join tableb b on a.a_id = b.a_id
join (select min(b.b_id) as b_id from tableb b where b.a_id = @a_id) c on c.b_id = b.b_id
where a.a_id = @a_id
In the second example
select t.*
from (select t.*,
row_number() over (partition by id order by id ) as seqnum
from t
) t
where seqnum = 1
row_number() must be written without the star inside the parentheses.