SQL: select multiple columns based on multiple groups of minimum values?

The following query gives me a single row because b.id is pinned. I would like a query which I can give a group of ids and get the minimum valued row for each of them.
The effect I want is as if I wrapped this query in a loop over a collection of ids and executed the query with each id as b.id = value but that will be (tens of?) thousands of queries.
select top 1 a.id, b.id, a.time_point, b.time_point
from orientation_momentum a, orientation_momentum b
where a.id = '00820001001' and b.id = '00825001001'
order by calculatedValue() asc
This is on sql-server but I would prefer a portable solution if it's possible.

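SQL Server's ranking functions should do the trick. The columns are aliased because a derived table cannot expose two columns with the same name. Note that rank() returns every row tied for the minimum; use row_number() if you need exactly one row per pair. The between range is only a placeholder for whichever b ids you want.
select * from (
select a.id as a_id, b.id as b_id, a.time_point as a_time_point, b.time_point as b_time_point,
rank() over (partition by a.id, b.id
order by calculatedValue() asc) ranker
from orientation_momentum a, orientation_momentum b
where a.id = '00820001001' and b.id between 10 and 20
) Z where ranker = 1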
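The same "best row per group in one query" idea can be tried locally. The following is only a sketch under stated assumptions: it uses Python's sqlite3 module (SQLite 3.25 or later for window functions) instead of SQL Server, a made-up pair_score table whose score column stands in for calculatedValue(), and ROW_NUMBER() swapped in for rank() so that ties yield exactly one row.

```python
import sqlite3

# Hypothetical stand-in data: pair_score holds one precomputed score per
# pair of time points; the score column replaces calculatedValue().
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pair_score "
    "(a_id TEXT, b_id TEXT, a_time INT, b_time INT, score REAL)"
)
conn.executemany(
    "INSERT INTO pair_score VALUES (?, ?, ?, ?, ?)",
    [
        ("A1", "B1", 1, 1, 0.9),
        ("A1", "B1", 1, 2, 0.2),  # minimum for B1
        ("A1", "B2", 1, 1, 0.5),  # minimum for B2
        ("A1", "B2", 2, 1, 0.7),
        ("A1", "B3", 1, 1, 0.1),  # B3 is not in the requested group
    ],
)


def best_rows(conn, a_id, b_ids):
    """Return the lowest-score row for each requested b_id in one query."""
    placeholders = ", ".join("?" for _ in b_ids)
    sql = f"""
        SELECT a_id, b_id, a_time, b_time
        FROM (
            SELECT p.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY a_id, b_id ORDER BY score
                   ) AS rn
            FROM pair_score AS p
            WHERE a_id = ? AND b_id IN ({placeholders})
        )
        WHERE rn = 1
        ORDER BY b_id
    """
    return conn.execute(sql, [a_id, *b_ids]).fetchall()


rows = best_rows(conn, "A1", ["B1", "B2"])
print(rows)
```

Passing the ids as bound parameters in an IN list replaces the loop over thousands of single-id queries with one round trip.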

Related

Postgres query to combine multiple rows with same value of a column into a single row

I have this table:
and would like to convert it to the following:
Please help me; I've been stuck on this for way too long. Using group by isn't working for me.
WITH A as (SELECT id, a FROM XXX WHERE a is not null),
B as (SELECT id, b FROM XXX WHERE b is not null)
SELECT A.a, B.b, A.id FROM A
INNER JOIN B on A.id = B.id;
For this dataset, simple aggregation would do what you want:
select min(a) a, min(b) b, id
from mytable
group by id
This takes advantage of the fact that aggregate functions ignore null values; we could get the very same result with max() as we did with min().
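To see the null-skipping behaviour in isolation, here is a small sketch using Python's sqlite3 module rather than Postgres. The sample rows are invented (the original table was not shown), with each id split across two rows that each fill only one of a and b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, a TEXT, b TEXT)")
# Invented sample data: every id has one row carrying a and one carrying b.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "x", None), (1, None, "y"), (2, "p", None), (2, None, "q")],
)

# Aggregates skip NULLs, so each group collapses to its single non-null value.
merged_min = conn.execute(
    "SELECT MIN(a) AS a, MIN(b) AS b, id FROM mytable GROUP BY id ORDER BY id"
).fetchall()
merged_max = conn.execute(
    "SELECT MAX(a) AS a, MAX(b) AS b, id FROM mytable GROUP BY id ORDER BY id"
).fetchall()

print(merged_min)
print(merged_max)
```

With exactly one non-null value per column and id, min() and max() agree; if a group held several non-null values they would pick different ones.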

Subqueries vs Multi Table Join

I've 3 tables A, B, C. I want to list the intersection count.
Way 1:-
select count(id) from A a join B b on a.id = b.id join C c on B.id = C.id;
Result Count - X
Way 2:-
SELECT count(id) FROM A WHERE id IN (SELECT id FROM B WHERE id IN (SELECT id FROM C));
Result Count - Y
The two queries return different counts. What exactly is wrong?
A JOIN can multiply the number of rows as well as filtering out rows.
In this case, the second count should be the correct one because nothing is double counted -- assuming id is unique in a. If not, it needs count(distinct a.id).
The equivalent using JOIN would use COUNT(DISTINCT):
select count(distinct a.id)
from A a join
B b
on a.id = b.id join
C c
on B.id = C.id;
I mention this for completeness but do not recommend this approach. Multiplying the number of rows just to remove them using distinct is inefficient.
In many databases, the most efficient method might be:
select count(*)
from a
where exists (select 1 from b where b.id = a.id) and
exists (select 1 from c where c.id = a.id);
Note: This assumes there are indexes on the id columns and that id is unique in a.
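The row multiplication is easy to reproduce. The sketch below uses Python's sqlite3 module with invented data in which id is unique in a but repeated in b and c, and runs the four query shapes discussed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);       -- id is unique in a
    INSERT INTO b VALUES (1), (1), (2);       -- id 1 appears twice
    INSERT INTO c VALUES (1), (2), (2), (4);  -- id 2 appears twice
    """
)

queries = {
    # Every matching combination of rows is counted, so duplicates multiply.
    "join": "SELECT COUNT(a.id) FROM a JOIN b ON a.id = b.id "
            "JOIN c ON b.id = c.id",
    "in": "SELECT COUNT(id) FROM a WHERE id IN "
          "(SELECT id FROM b WHERE id IN (SELECT id FROM c))",
    "join_distinct": "SELECT COUNT(DISTINCT a.id) FROM a "
                     "JOIN b ON a.id = b.id JOIN c ON b.id = c.id",
    "exists": "SELECT COUNT(*) FROM a "
              "WHERE EXISTS (SELECT 1 FROM b WHERE b.id = a.id) "
              "AND EXISTS (SELECT 1 FROM c WHERE c.id = a.id)",
}
counts = {name: conn.execute(sql).fetchone()[0] for name, sql in queries.items()}
print(counts)
```

Only the plain join is inflated: id 1 matches two rows in b and id 2 matches two rows in c, so the two ids in the intersection are counted four times.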

Oracle rownum returning wrong result

I am trying to write a query that returns only the latest row from a table.
Initially I used max(id) in the query.
But because I use a sequence and my environment is clustered, I cannot rely on the sequence, as its values are out of order.
So I decided to order by creation time and pick the top row using rownum.
I used something like
SELECT A.id
FROM Table_A A, Table_B B
WHERE A.status = 'COMPLETED'
AND B.name = 'some_name'
AND A.id = B.id
AND rownum = 1
order by A.Creation_Time;
This somehow returns a wrong result, say 42145.
If I remove the rownum condition, the top record is different, say 45343.
When using rownum with order by, you need to use a subquery. This has to do with the order of evaluation of the where and order by. So, try this:
SELECT ab.*
FROM (SELECT A.id
FROM Table_A A JOIN
Table_B B
ON A.id = B.id
WHERE A.status = 'COMPLETED' AND B.name = 'some_name'
ORDER BY A.Creation_Time
) ab
WHERE rownum = 1;
I should add: Oracle 12c and later support FETCH FIRST 1 ROW ONLY, which is more convenient:
SELECT A.id
FROM Table_A A JOIN
Table_B B
ON A.id = B.id
WHERE A.status = 'COMPLETED' AND B.name = 'some_name'
ORDER BY A.Creation_Time
FETCH FIRST 1 ROW ONLY;
Any chance you meant to do this? (Specifying to order DESC)
SELECT A.id
FROM Table_A A, Table_B B
WHERE A.status = 'COMPLETED'
AND B.name = 'some_name'
AND A.id = B.id
AND rownum = 1
order by A.Creation_Time DESC;
EDIT: Obviously specifying order DESC was critical to your query. I had to upvote Gordon also, since he's completely correct that you need to limit the rows returned after sorting, so the methods he suggests are perfect for that. So, for anyone reading this far, I wanted to leave no doubt that rownum is assigned after the where clause is processed but before sorting or aggregation (source).
NB: As good as Gordon's answer is, there is one more consideration that may or may not be important here. It seems that it might be possible to have duplicate values of A.Creation_Time, which could cause you to see non-deterministic behavior as far as which of the duplicate rows might be returned when executing that query. If that problem might arise, Tom Kyte (same link as earlier) suggests adding some unique column value to the order by clause as an easy dodge. For example:
SELECT t.id
FROM (SELECT *
FROM (SELECT A.id, A.Creation_Time, A.rowid AS rid
FROM Table_A A
JOIN Table_B B
ON A.id = B.id
WHERE A.status = 'COMPLETED' AND B.name = 'some_name'
)
ORDER BY Creation_Time, rid DESC
) t
WHERE rownum = 1;
Here ROWID is a unique pseudocolumn of Table_A that can serve the purpose; it is exposed as rid so that the outer query can sort on it.

SQL Server ROW_NUMBER Left Join + when you don't know column names

I'm writing a page that builds a query (for non-database users), runs it, and returns the results to them.
I am using row_number to handle custom pagination.
How do I do a left join and a row_number in a subquery when I don't know the specific columns I need to return? I tried to use * but I get this error:
The column '' was specified multiple times
Here is the query I tried:
SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY Test) AS ROW_NUMBER, *
FROM table1 a
LEFT JOIN table2 b
ON a.ID = b.ID) x
WHERE ROW_NUMBER BETWEEN 1 AND 50
Your query is going to fail in SQL Server regardless of the row_number() call. The * returns all columns, including a.id and b.id. These both have the same name. This is fine for a query, but for a subquery, all columns need distinct names.
You can use row_number() for an arbitrary ordering by using a "subquery with constant" in the order by clause:
SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY (select NULL)) AS ROW_NUMBER, *
FROM table1 a
LEFT JOIN table2 b
ON a.ID = b.ID) x
WHERE ROW_NUMBER BETWEEN 1 AND 50 ;
This removes the dependency on the underlying column name (assuming none are named ROW_NUMBER).
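Try this SQL, which qualifies the order by column with its table alias. Note that a.* and b.* still both expose ID, so the duplicate-column error remains unless the columns are listed and aliased explicitly.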
SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY a.Test) AS ROW_NUMBER, a.*,b.*
FROM table1 a
LEFT JOIN table2 b
ON a.ID = b.ID) x
WHERE ROW_NUMBER BETWEEN 1 AND 50
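Since the page generates the query anyway, one option is to read the column names from the catalog and alias each one, so the subquery never exposes two columns with the same name. The sketch below is an assumption-laden illustration using Python's sqlite3 module: PRAGMA table_info is SQLite-specific (in SQL Server the names would come from a catalog view such as INFORMATION_SCHEMA.COLUMNS), the helper names aliased_columns and page are hypothetical, and the Note column and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (ID INTEGER, Test TEXT);
    CREATE TABLE table2 (ID INTEGER, Note TEXT);
    INSERT INTO table1 VALUES (1, 'c'), (2, 'a'), (3, 'b');
    INSERT INTO table2 VALUES (1, 'one'), (3, 'three');
    """
)


def aliased_columns(conn, table, alias):
    """Build 'alias.col AS alias_col' for every column of a trusted table name."""
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return [f'{alias}."{col}" AS "{alias}_{col}"' for col in cols]


def page(conn, first, last):
    """Return (column names, rows) for row numbers first..last."""
    select_list = ", ".join(
        aliased_columns(conn, "table1", "a") + aliased_columns(conn, "table2", "b")
    )
    sql = f"""
        SELECT * FROM (
            SELECT ROW_NUMBER() OVER (ORDER BY a.Test) AS rn, {select_list}
            FROM table1 a
            LEFT JOIN table2 b ON a.ID = b.ID
        ) x
        WHERE rn BETWEEN ? AND ?
        ORDER BY rn
    """
    cur = conn.execute(sql, (first, last))
    return [d[0] for d in cur.description], cur.fetchall()


names, rows = page(conn, 1, 2)
print(names)
print(rows)
```

The table names are interpolated into the SQL text, so they must come from a trusted list rather than from user input; only the page bounds are bound parameters.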

Select DISTINCT or UNIQUE records or rows in Oracle

I want to select distinct or unique records from a database I am querying. How can I do this but at the same time select the entire record instead of just the column that I am distinguishing as unique? Do I have to do unruly joins?
Depending on the database that you are using, you can use window functions. If you want only rows that never repeat:
select t.*
from (select t.*,
count(*) over (partition by <id>) as numdups
from t
) t
where numdups = 1
If you want one example of each row:
select t.*
from (select t.*,
row_number(*) over (partition by <id> order by <id>) as seqnum
from t
) t
where seqnum = 1
If you don't have window functions, you can get the same thing done with "unruly joins".
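Both window-function variants can be checked on a toy table. This is a sketch using Python's sqlite3 module (SQLite 3.25 or later) rather than Oracle; the table t, its payload column, and the rows are invented, and the second query orders by payload inside each partition so that the surviving row is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (2, "b"), (2, "c"), (3, "d")],  # id 2 is duplicated
)

# Rows whose id never repeats.
never_repeated = conn.execute(
    """
    SELECT id, payload
    FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY id) AS numdups FROM t)
    WHERE numdups = 1
    ORDER BY id
    """
).fetchall()

# One full row per id; ROW_NUMBER() takes no arguments.
one_per_id = conn.execute(
    """
    SELECT id, payload
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY payload) AS seqnum
          FROM t)
    WHERE seqnum = 1
    ORDER BY id
    """
).fetchall()

print(never_repeated)
print(one_per_id)
```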
If you want only one column out of several to be unique and you have joins that might include multiple records, then you have to determine which of the two or more values you want the query to provide. This can be done with aggregate functions, correlated subqueries, derived tables, or CTEs (SQL Server has them, and Oracle supports them too through the WITH clause).
But you have to determine which value you want before you write the query. Once you know that then you probably know how to get it.
Here are some quick examples (I used SQL Server coding conventions but most of this should make sense in Oracle as it is all basic SQL, Oracle may have a different way of declaring a parameter):
-- 1. aggregate functions
select a.a_id, max(b.test) as max_test, min(c.test2) as min_test2
from tablea a
join tableb b on a.a_id = b.a_id
join tablec c on a.a_id = c.a_id
group by a.a_id
order by max_test, min_test2
-- 2. correlated subqueries
select a.a_id, (select top 1 b.test from tableb b where a.a_id = b.a_id order by test2) as test,
(select top 1 b.test2 from tableb b where a.a_id = b.a_id order by test2) as test2,
(select top 1 c.test3 from tablec c where a.a_id = c.a_id order by test4) as test3
from tablea a
-- 3. derived table
declare @a_id int
set @a_id = 189
select a.a_id, b.test, b.test4
from tablea a
join tableb b on a.a_id = b.a_id
join (select min(b2.b_id) as b_id from tableb b2 where b2.a_id = @a_id) c on c.b_id = b.b_id
where a.a_id = @a_id
In the second window-function example, row_number(*) should be row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by id ) as seqnum
from t
) t
where seqnum = 1
row_number() takes no arguments, so there must be no star inside the parentheses.