Access Top N Query where N is given in another table - sql

I have two tables in MS SQL Server. Table2 has the following:
TaskId TopN
1 2
2 3
3 1
Table1 has the following:
TaskId TopN Value
1 2 12
1 2 12
1 2 12
2 3 1
2 3 1
2 3 5
2 3 12
2 3 8
2 3 5
I want to be able to select the top N records based on the TopN field in table2 (which is the same TopN value found in table1, so maybe I don't even need to bother using two tables). The desired output should be as follows:
TaskId TopN Value
1 2 12
1 2 12
2 3 12
2 3 8
2 3 5
I have tried the below SQL statement, but it skips TaskId=1. Any idea of what I am doing wrong?
SELECT DISTINCT T1.TaskId,
T1.TopN,
T1.Value
FROM Table1 T1 INNER JOIN Table1 T2 ON
T1.TaskId = T2.TaskId AND
T1.TopN = T2.TopN AND
T1.Value <= T2.Value
GROUP BY T1.TaskId,
T1.TopN,
T1.Value
HAVING COUNT(*) <= (
SELECT TopN
FROM table2
WHERE table2.TaskID = T1.TaskId
)

Please note that the question names Table2 as the table with the fields TaskId, TopN and Values, whereas your query uses the table names the other way round. Assuming Table2 is the one that holds the details, you can use the query below to get the desired result. You do not need the other table (Table1, as per the question), which holds only TaskId and TopN, since all the information is already present in Table2.
-- Values is a reserved word in SQL Server, so it is bracket-quoted here
Select Taskid, TopN, [Values]
from
(Select T1.*, row_number() over(partition by Taskid order by [Values] desc) As rnk
from Table2 T1) Tb
where Tb.TopN >= Tb.rnk;

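The problem is that TaskId 1 has three rows with the same value. Because the join condition T1.Value <= T2.Value matches every tied row, COUNT(*) for that group is larger than the TopN of 2, so the HAVING clause filters the whole group out. In SQL Server, you would do this much more simply using row_number().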
If you are using MS Access, you need a column that distinguishes the rows.
EDIT:
In SQL Server, you would use:
select t1.*
from (select t1.*,
row_number() over (partition by taskid order by value desc) as seqnum
from table1 t1
) t1
where t1.seqnum <= t1.topn;
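The remark about needing a column that distinguishes the rows can be illustrated without row_number(). The following is only a sketch, run against SQLite through Python's sqlite3 module rather than SQL Server or Access; the rid column is a hypothetical surrogate key that is not part of the original table. It counts, for each row, how many rows of the same task rank at or above it, using rid to break ties between equal values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# rid is a hypothetical auto-assigned key, added only to tell duplicate rows apart.
conn.execute(
    "CREATE TABLE Table1 (rid INTEGER PRIMARY KEY, TaskId INT, TopN INT, Value INT)"
)
conn.executemany(
    "INSERT INTO Table1 (TaskId, TopN, Value) VALUES (?, ?, ?)",
    [(1, 2, 12), (1, 2, 12), (1, 2, 12),
     (2, 3, 1), (2, 3, 1), (2, 3, 5), (2, 3, 12), (2, 3, 8), (2, 3, 5)],
)

top_n_rows = conn.execute("""
    SELECT t1.TaskId, t1.TopN, t1.Value
    FROM Table1 t1
    WHERE (
        -- Position of this row within its task: rows with a larger value,
        -- plus rows with the same value and an rid no greater than this one.
        SELECT COUNT(*)
        FROM Table1 t2
        WHERE t2.TaskId = t1.TaskId
          AND (t2.Value > t1.Value
               OR (t2.Value = t1.Value AND t2.rid <= t1.rid))
    ) <= t1.TopN
    ORDER BY t1.TaskId, t1.Value DESC, t1.rid
""").fetchall()

print(top_n_rows)
```

With the tiebreaker in place, the three identical rows of TaskId 1 get positions 1, 2 and 3 instead of all sharing one count, so the first two survive the filter.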

Related

Find rows that contains same value on different columns

I need to find which rows of a table share a value across two different columns, that is, a value that appears in the left column of one row and in the right column of another row. Here is a small sample of the 2k+ rows.
id left right
1 3 4
2 4 1
3 1 9
4 2 6
5 2 5
6 9 8
7 0 7
In the above case, I need to get rows 1, 2, 3 and 6: the value 4 appears in two rows in two different columns (id=1&2), the value 1 appears in two rows in two different columns (id=2&3), and the value 9 appears in two rows in two different columns (id=3&6).
My thoughts:
I have thought of many things, for example a cross join on the left and right columns, GROUP BY with COUNT, etc.
with Final as (With OuterTable as (WITH Alias AS (SELECT id as left_id , left FROM Test)
SELECT DISTINCT id, left_id FROM Alias
INNER JOIN Test ON Alias.left = Test.right)
SELECT id from OuterTable
UNION ALL
SELECT left_id from OuterTable)
SELECT DISTINCT * from Final;
It's messy, but it works.
You can do it with EXISTS:
SELECT t1.*
FROM tablename t1
WHERE EXISTS (
SELECT 1 FROM tablename t2
WHERE t1.id <> t2.id AND (t2.left = t1.right OR t1.left = t2.right)
)
See the demo.
Results:
id left right
1 3 4
2 4 1
3 1 9
6 9 8
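As a quick check of the EXISTS approach, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3 module. The table name Test is taken from the question; the column names are double-quoted because left and right are keywords in many SQL dialects, and whether quoting is required on your platform is an assumption to verify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "left" and "right" are quoted because they are SQL keywords.
conn.execute('CREATE TABLE Test (id INT, "left" INT, "right" INT)')
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [(1, 3, 4), (2, 4, 1), (3, 1, 9), (4, 2, 6), (5, 2, 5), (6, 9, 8), (7, 0, 7)],
)

matching_ids = [
    row[0]
    for row in conn.execute("""
        SELECT t1.id
        FROM Test t1
        WHERE EXISTS (
            -- Another row whose left equals this row's right, or vice versa.
            SELECT 1
            FROM Test t2
            WHERE t1.id <> t2.id
              AND (t2."left" = t1."right" OR t1."left" = t2."right")
        )
        ORDER BY t1.id
    """)
]

print(matching_ids)
```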

Oracle SQL - display values up to current record

Can I use LISTAGG or a similar analytical function in Oracle SQL to display all values in group up to current record?
This is my table:
id group_id value
-- -------- -----
1 1 A
2 1 B
3 1 C
4 2 X
5 2 Y
6 2 Z
I would like the following result:
id group_id values
-- -------- ------
1 1 A
2 1 AB
3 1 ABC
4 2 X
5 2 XY
6 2 XYZ
Here is one option, using a correlated subquery to handle the rollup of the value column:
SELECT
t1.id,
t1.group_id,
(SELECT LISTAGG(t2.val, '') WITHIN GROUP (ORDER BY t2.id)
FROM yourTable t2
WHERE t1.group_id = t2.group_id AND t2.id <= t1.id) AS vals
FROM yourTable t1
ORDER BY
t1.id;
Demo
The logic here is that, for each row, we roll up a concatenation of all values in the same group whose id is at or before the id of the current row.
Another approach to this, one which might perform and scale better, would be to use a recursive CTE. But, that would take more code, and might be harder to digest than what I wrote above.
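For readers curious what the recursive CTE variant might look like, here is a rough sketch. It runs on SQLite through Python's sqlite3 module, not on Oracle, so the syntax (WITH RECURSIVE, the || operator) and the CTE name acc are assumptions of this example and would need adapting. The column is named value as in the question's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INT, group_id INT, value TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, 1, "A"), (2, 1, "B"), (3, 1, "C"), (4, 2, "X"), (5, 2, "Y"), (6, 2, "Z")],
)

running_concat = conn.execute("""
    WITH RECURSIVE acc(id, group_id, vals) AS (
        -- Anchor: the first row of each group (no smaller id in the same group).
        SELECT t.id, t.group_id, t.value
        FROM yourTable t
        WHERE NOT EXISTS (
            SELECT 1 FROM yourTable p
            WHERE p.group_id = t.group_id AND p.id < t.id
        )
        UNION ALL
        -- Step: append the value of the next row in the same group.
        SELECT n.id, n.group_id, acc.vals || n.value
        FROM acc
        JOIN yourTable n
          ON n.group_id = acc.group_id AND n.id > acc.id
        WHERE NOT EXISTS (
            SELECT 1 FROM yourTable x
            WHERE x.group_id = acc.group_id AND x.id > acc.id AND x.id < n.id
        )
    )
    SELECT id, group_id, vals FROM acc ORDER BY id
""").fetchall()

print(running_concat)
```

Each step extends the previous row's string by one value, so every row is built once instead of re-aggregating all earlier rows of its group.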

TSQL - update based on value between two ints in 2nd table.

Having an issue with doing a mass update with millions of rows. Example of what I'm attempting to do below. Trying to avoid case statements if possible as there are over 1000 ranks.
Table 1:
id, score, rank
1 4090 null
2 6400 null
3 8905 null
4 2551 null
Table 2:
Rank, Score
1 0
2 1000
3 3500
4 5000
5 8000
6 10000
I'm attempting to update table 1 to display the correct rank.
For example, ID 2 has a score of 6400, which is above 5000 but below 8000, so it should be rank 4. Is this possible without a case statement?
You can use cross apply:
update t1
set rank = t2.rank
from table1 t1 cross apply
(select top 1 t2.*
from table2 t2
where t2.score <= t1.score
order by t2.score desc
) t2;
For millions of rows I would suggest one of the following:
Do the update in batches.
Use a case statement.
Put the output in a new table, truncate the original table, and reload
"Millions" of updates is often a very expensive operation.
Another Option is with a simple JOIN in concert with Lead()
Example
Update Table1 Set Rank=B.Rank
From Table1 A
Join (
Select Rank
,R1=Score
,R2=Lead(Score,1,999999) over (Order By Score)
From Table2
) B on A.score >= B.R1 and A.Score < B.R2
Returns
id score rank
1 4090 3
2 6400 4
3 8905 5
4 2551 2
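Both answers boil down to the same lookup: for each score, take the rank whose threshold is the greatest one not exceeding that score. A portable way to sketch this is a correlated subquery in the UPDATE. The example below runs on SQLite through Python's sqlite3 module, so LIMIT 1 stands in for TOP 1 and the quoting of rank is a precaution rather than a requirement; treat it as an illustration of the logic, not as tuned T-SQL for millions of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 (id INT, score INT, "rank" INT)')
conn.execute('CREATE TABLE table2 ("rank" INT, score INT)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, NULL)",
    [(1, 4090), (2, 6400), (3, 8905), (4, 2551)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, 0), (2, 1000), (3, 3500), (4, 5000), (5, 8000), (6, 10000)],
)

# For every row, pick the rank with the highest threshold <= the row's score.
conn.execute("""
    UPDATE table1
    SET "rank" = (
        SELECT t2."rank"
        FROM table2 AS t2
        WHERE t2.score <= table1.score
        ORDER BY t2.score DESC
        LIMIT 1
    )
""")

ranked = conn.execute('SELECT id, score, "rank" FROM table1 ORDER BY id').fetchall()
print(ranked)
```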

Running Totals again. No over clause, no cursor, but increasing order

I am still having trouble creating a running total based on the increasing order of the value. Row id has no real meaning, it is just the PK. My server doesn't support OVER.
Row Value
1 3
2 7
3 1
4 2
Result:
Row Value
3 1
4 3
1 6
2 13
I have tried self and cross joins where I specify that the value of the second amount (the one being summed up) is less than the current value of the first. I have also tried doing this with the HAVING clause, but that always threw an error when I tried it that way. Can someone explain why it would be wrong to use it in that manner and how I should be doing it?
Here is one way to do a running total:
select row, value,
(select sum(value) from t t2 where t2.value <= t.value) as runningTotal
from t
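The same running total can also be written as the self-join the question was attempting. The sketch below uses SQLite through Python's sqlite3 module purely for illustration; the table name t comes from the answer above, and row is quoted as a precaution because it is a keyword in some dialects. The restriction belongs in the join condition rather than in HAVING, because HAVING filters whole groups after aggregation and cannot limit which rows are summed. It assumes the values are distinct; tied values would all receive the same total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("row" INT, value INT)')
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 3), (2, 7), (3, 1), (4, 2)])

running_totals = conn.execute("""
    SELECT a."row", a.value, SUM(b.value) AS running_total
    FROM t a
    JOIN t b ON b.value <= a.value   -- every row at or below the current value
    GROUP BY a."row", a.value
    ORDER BY a.value
""").fetchall()

print(running_totals)
```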
You can use the WITH ROLLUP option if you have SQL Server 2008.
select sum(value) from t t2 where t2.value <= t.value with rollup
If your platform supports recursive queries, you can use a recursive CTE (IIRC you should omit the RECURSIVE keyword on Microsoft platforms). Because the CTE needs to determine the beginning and end of a "chain", the tuples unfortunately need to be ordered in some way (I use the "row" field; an internal tuple-id would be perfect for this purpose):
WITH RECURSIVE sums AS (
-- Terminal part
SELECT d0.row
, d0.value AS value
, d0.value AS runsum
FROM data d0
WHERE NOT EXISTS (
SELECT * FROM data nx
WHERE nx.row < d0.row
)
UNION
-- Recursive part
SELECT t1.row AS row
, t1.value AS value
, t0.runsum + t1.value AS runsum
FROM data t1
, sums t0
WHERE t1.row > t0.row
AND NOT EXISTS (
SELECT * FROM data nx
WHERE nx.row > t0.row
AND nx.row < t1.row
)
)
SELECT * FROM sums
;
RESULT:
row | value | runsum
-----+-------+--------
1 | 3 | 3
2 | 7 | 10
3 | 1 | 11
4 | 2 | 13
(4 rows)

SQL Server - How to display most recent records based on dates in two tables

I have 2 tables. I want to list the records based on the most recent date. For example, from the following tables I want to display ID 2 and ID 4 using a select statement. ID 2 and ID 4 are the most recent for their PID, based on the dates from the second table. Please help me with the query. Thank you.
ID EXID PID REASON
1 1 1 XYZ
2 2 1 ABX
3 3 2 NNN
4 4 2 AAA
EXID EXDATE
1 1/1/2011
2 4/1/2011
3 3/1/2011
4 5/1/2011
Here you go, this ought to do it. Let me know if you have any questions.
SELECT
TBL.ID,
TBL.EXDATE
FROM
(
SELECT
T1.ID,
T2.EXDATE,
ROW_NUMBER() OVER(PARTITION BY T1.PID ORDER BY T2.EXDATE DESC) AS 'RN'
FROM
Table1 T1
INNER JOIN Table2 T2
ON T1.EXID = T2.EXID
) TBL
WHERE
TBL.RN = 1
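If ROW_NUMBER() is not available, the same "latest row per PID" result can be sketched with NOT EXISTS, which is a different technique from the query above. The example runs on SQLite through Python's sqlite3 module and assumes the dates are stored as ISO strings (the sample dates are read here as month/day/year), so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INT, EXID INT, PID INT, REASON TEXT)")
conn.execute("CREATE TABLE Table2 (EXID INT, EXDATE TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "XYZ"), (2, 2, 1, "ABX"), (3, 3, 2, "NNN"), (4, 4, 2, "AAA")],
)
# Assumption: dates stored as ISO text (YYYY-MM-DD) so they compare correctly.
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?)",
    [(1, "2011-01-01"), (2, "2011-04-01"), (3, "2011-03-01"), (4, "2011-05-01")],
)

latest = conn.execute("""
    SELECT t1.ID, t2.EXDATE
    FROM Table1 t1
    JOIN Table2 t2 ON t2.EXID = t1.EXID
    WHERE NOT EXISTS (
        -- No other record of the same PID has a later date.
        SELECT 1
        FROM Table1 o1
        JOIN Table2 o2 ON o2.EXID = o1.EXID
        WHERE o1.PID = t1.PID AND o2.EXDATE > t2.EXDATE
    )
    ORDER BY t1.ID
""").fetchall()

print(latest)
```

Unlike the ROW_NUMBER() version, this form returns every tied row when two records of one PID share the latest date.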