I'm having trouble doing a mass update on a table with millions of rows. An example of what I'm attempting is below. I'd like to avoid CASE statements if possible, as there are over 1000 ranks.
Table 1:
id, score, rank
1 4090 null
2 6400 null
3 8905 null
4 2551 null
Table 2:
Rank, Score
1 0
2 1000
3 3500
4 5000
5 8000
6 10000
I'm attempting to update table 1 to display the correct rank.
Example: ID 2 has a score of 6400, which is above 5000 but below 8000, so it should be rank 4. Is this possible without a CASE statement?
You can use cross apply:
update t1
    set rank = t2.rank
    from table1 t1 cross apply
         (select top 1 t2.*
          from table2 t2
          where t2.score <= t1.score
          order by t2.score desc
         ) t2;
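For anyone who wants to try the lookup logic without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no CROSS APPLY, so this assumes a correlated subquery with ORDER BY ... LIMIT 1 as a stand-in for the "top 1" lookup; the table and column names simply mirror the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, score INTEGER, "rank" INTEGER);
    CREATE TABLE table2 ("rank" INTEGER, score INTEGER);
    INSERT INTO table1 (id, score) VALUES (1, 4090), (2, 6400), (3, 8905), (4, 2551);
    INSERT INTO table2 VALUES (1, 0), (2, 1000), (3, 3500), (4, 5000), (5, 8000), (6, 10000);
""")

# For each row, pick the rank whose threshold is the highest one
# that does not exceed the row's score.
conn.execute("""
    UPDATE table1
    SET "rank" = (
        SELECT t2."rank"
        FROM table2 AS t2
        WHERE t2.score <= table1.score
        ORDER BY t2.score DESC
        LIMIT 1
    )
""")

ranks = dict(conn.execute('SELECT id, "rank" FROM table1'))
print(ranks)
```

An index on table2(score) would probably help this kind of lookup, though that is an assumption worth measuring on real data.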
For millions of rows I would suggest one of the following:
Do the update in batches.
Use a case statement.
Put the output in a new table, truncate the original table, and reload.
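The batching suggestion could look roughly like the sketch below. It uses Python's sqlite3 with made-up data; the batch size of 250, the 1000 generated rows, and the id-range strategy are all assumptions for illustration, not a tuned recipe for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 (id INTEGER PRIMARY KEY, score INTEGER, "rank" INTEGER)')
conn.execute('CREATE TABLE table2 ("rank" INTEGER, score INTEGER)')
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, 0), (2, 1000), (3, 3500), (4, 5000), (5, 8000), (6, 10000)],
)
# Hypothetical data: 1000 rows with scores between 0 and 9999.
conn.executemany(
    "INSERT INTO table1 (id, score) VALUES (?, ?)",
    [(i, (i * 37) % 10000) for i in range(1, 1001)],
)
conn.commit()

BATCH_SIZE = 250  # assumed value; pick one that keeps each transaction short
max_id = conn.execute("SELECT MAX(id) FROM table1").fetchone()[0]

batches = 0
for start in range(1, max_id + 1, BATCH_SIZE):
    # Update one contiguous id range, then commit so locks are released.
    conn.execute(
        """
        UPDATE table1
        SET "rank" = (
            SELECT t2."rank" FROM table2 AS t2
            WHERE t2.score <= table1.score
            ORDER BY t2.score DESC LIMIT 1
        )
        WHERE id BETWEEN ? AND ?
        """,
        (start, start + BATCH_SIZE - 1),
    )
    conn.commit()
    batches += 1

remaining = conn.execute('SELECT COUNT(*) FROM table1 WHERE "rank" IS NULL').fetchone()[0]
print(batches, remaining)
```

Committing per batch keeps each transaction small, at the cost of the table being partially updated while the loop runs.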
"Millions" of updates is often a very expensive operation.
Another option is a simple JOIN in concert with Lead().
Example
Update A Set Rank = B.Rank
From Table1 A
Join (
    Select Rank
          ,R1 = Score
          -- 999999 is an upper sentinel for the last rank; it must exceed any real score
          ,R2 = Lead(Score, 1, 999999) over (Order By Score)
    From Table2
) B on A.Score >= B.R1 and A.Score < B.R2
Returns
id score rank
1 4090 3
2 6400 4
3 8905 5
4 2551 2
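The range matching above (score >= R1 and score < R2) is the same thing as a binary search over the sorted thresholds. As a quick way to sanity-check expected ranks outside the database, here is a small sketch with Python's bisect module; `rank_for` is a hypothetical helper name, not something from the answers.

```python
from bisect import bisect_right

# Thresholds and ranks from Table 2, sorted by score.
thresholds = [0, 1000, 3500, 5000, 8000, 10000]
ranks = [1, 2, 3, 4, 5, 6]

def rank_for(score):
    """Return the rank whose threshold is the largest one <= score."""
    idx = bisect_right(thresholds, score) - 1
    if idx < 0:
        raise ValueError("score is below the lowest threshold")
    return ranks[idx]

scores = {1: 4090, 2: 6400, 3: 8905, 4: 2551}
result = {row_id: rank_for(score) for row_id, score in scores.items()}
print(result)
```

Unlike the 999999 sentinel, this version has no upper cap: any score at or above 10000 maps to rank 6.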
Related
I have two tables in MS SQL Server. Table2 has the following:
TaskId TopN
1 2
2 3
3 1
Table1 has the following:
TaskId TopN Value
1 2 12
1 2 12
1 2 12
2 3 1
2 3 1
2 3 5
2 3 12
2 3 8
2 3 5
I want to be able to select the top N records based on the TopN field in table2 (which is the same TopN value found in table1, so maybe I don't even need to bother using two tables). The desired output should be as follows:
TaskId TopN Value
1 2 12
1 2 12
2 3 12
2 3 8
2 3 5
I have tried the below SQL statement, but it skips TaskId=1. Any idea of what I am doing wrong?
SELECT DISTINCT T1.TaskId,
       T1.TopN,
       T1.Value
FROM Table1 T1 INNER JOIN Table1 T2 ON
     T1.TaskId = T2.TaskId AND
     T1.TopN = T2.TopN AND
     T1.Value <= T2.Value
GROUP BY T1.TaskId,
         T1.TopN,
         T1.Value
HAVING COUNT(*) <= (
    SELECT TopN
    FROM table2
    WHERE table2.TaskID = T1.TaskId
)
Note that the question names Table2 as the table with the fields TaskId, TopN, and Values, but your query uses the tables the other way around. Assuming Table2 is the one with the details, the query below gives the desired result. You don't need the other table (Table1, as per the question), which has just TaskId and TopN, since all the information is already present in Table2.
-- Values is a reserved word in SQL Server, so it is bracketed here
Select Taskid, TopN, [Values]
from
    (Select T1.*, row_number() over(partition by Taskid order by [Values] desc) As rnk
     from Table2 T1) Tb
where Tb.TopN >= Tb.rnk;
** Fixed the typo in the code (changed to >= instead of <=), it should work fine now.
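The problem is that TaskId 1 has three rows with the same value, so every one of them matches all three in the self-join. COUNT(*) for that group is therefore at least 3, which can never be less than or equal to the TopN of 2, and the HAVING clause filters the whole group out. In SQL Server, you would do this much more simply using row_number().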
If you are using MS Access, you need a column that distinguishes the rows.
EDIT:
In SQL Server, you would use:
select t1.*
from (select t1.*,
             row_number() over (partition by taskid order by value desc) as seqnum
      from table1 t1
     ) t1
where t1.seqnum <= t1.topn;
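To check the row_number() approach against the sample data from the question, here is a small sketch using Python's sqlite3. It assumes an SQLite build with window function support (3.25 or later); the alias `ranked` is just an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (taskid INTEGER, topn INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 2, 12), (1, 2, 12), (1, 2, 12),
     (2, 3, 1), (2, 3, 1), (2, 3, 5), (2, 3, 12), (2, 3, 8), (2, 3, 5)],
)

# Number the rows within each task from highest value down, then keep
# only the first topn of them. Ties get distinct numbers, so duplicates
# do not knock a whole group out.
rows = conn.execute("""
    SELECT taskid, topn, value
    FROM (
        SELECT t1.*,
               ROW_NUMBER() OVER (PARTITION BY taskid ORDER BY value DESC) AS seqnum
        FROM table1 AS t1
    ) AS ranked
    WHERE seqnum <= topn
    ORDER BY taskid, value DESC
""").fetchall()

for row in rows:
    print(row)
```

This gives two rows for TaskId 1 and the values 12, 8, 5 for TaskId 2, matching the desired output in the question.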
Can I use LISTAGG or a similar analytical function in Oracle SQL to display all values in group up to current record?
This is my table:
id group_id value
-- -------- -----
1 1 A
2 1 B
3 1 C
4 2 X
5 2 Y
6 2 Z
I would like the following result:
id group_id values
-- -------- ------
1 1 A
2 1 AB
3 1 ABC
4 2 X
5 2 XY
6 2 XYZ
Here is one option, using a correlated subquery to handle the rollup of the value column:
SELECT
    t1.id,
    t1.group_id,
    (SELECT LISTAGG(t2.val, '') WITHIN GROUP (ORDER BY t2.id)
     FROM yourTable t2
     WHERE t1.group_id = t2.group_id AND t2.id <= t1.id) AS vals
FROM yourTable t1
ORDER BY
    t1.id;
The logic here is that, for each row, we roll up a concatenation of all values in the same group coming at or before the current id value.
Another approach to this, one which might perform and scale better, would be to use a recursive CTE. But, that would take more code, and might be harder to digest than what I wrote above.
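To make the intended result concrete, here is a small Python sketch of the same "everything in the group up to the current id" rollup. It is not the recursive CTE mentioned above, just a single pass that keeps one running prefix per group; `running_concat` is a hypothetical helper name.

```python
rows = [
    (1, 1, "A"), (2, 1, "B"), (3, 1, "C"),
    (4, 2, "X"), (5, 2, "Y"), (6, 2, "Z"),
]

def running_concat(rows):
    """Return (id, group_id, values-so-far) for each row, in id order."""
    prefix = {}  # group_id -> concatenation of the values seen so far
    out = []
    for row_id, group_id, value in sorted(rows):
        prefix[group_id] = prefix.get(group_id, "") + value
        out.append((row_id, group_id, prefix[group_id]))
    return out

result = running_concat(rows)
for row in result:
    print(row)
```

The correlated subquery re-reads the group for every row, while this pass touches each row once, which is the kind of difference that tends to matter on large groups.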
I'm trying to design an SQL statement to select values only in certain cases. Here is an example to illustrate my 2 input tables and expected output.
Table1:
TransactionID TotalItemsType1 TotalItemsType2
0001 1 8
0002 7 6
1234 5 6
Table2:
TransactionID Cost
0001 5.99
1234 2.25
1234 0.15
0002 9.99
Expected Result:
TransactionID Cost TotalItemsType1 TotalItemsType2
0001 5.99 1 8
0002 9.99 7 6
1234 2.25 5 6
1234 0.15 0 0
When there are multiple rows in Table2 for a single TransactionID (1234), the TotalItemsType1/2 columns in the output should be populated only for the row with the highest Cost value (2.25 compared to 0.15), and should otherwise be 0. In the case of multiple equal highest values (2.25 compared to 2.25, for example), just choose one of the rows based on database row order (first).
I've attempted to do this using various joins and case expressions but I haven't yet found a workable solution.
You can apply a partitioned row number to Table2 via a CTE, and then conditionally use values from the JOIN only if the row number is 1 (and 0 otherwise):
WITH rn AS (
    -- DESC so that row number 1 is the row with the highest Cost
    SELECT *, ROW_NUMBER() OVER (PARTITION BY TransactionID ORDER BY Cost DESC) rn
    FROM Table2
)
SELECT rn.TransactionID, rn.Cost,
       COALESCE(t1.TotalItemsType1, 0) TotalItemsType1,
       COALESCE(t1.TotalItemsType2, 0) TotalItemsType2
FROM rn LEFT JOIN Table1 t1
    ON rn.TransactionID = t1.TransactionID AND rn.rn = 1
The LEFT JOIN uses a condition of the row number being 1 to generate "empty columns" for Table1 where it's not 1. Then the COALESCE uses these NULL values to generate the required 0.
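Here is a runnable sketch of that idea against the sample data, using Python's sqlite3 (assuming an SQLite build with window functions, 3.25 or later). Cost is sorted descending so that row number 1 is the highest cost per transaction; the CTE name `ranked` and the final ORDER BY are choices made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (TransactionID TEXT, TotalItemsType1 INTEGER, TotalItemsType2 INTEGER);
    CREATE TABLE Table2 (TransactionID TEXT, Cost REAL);
    INSERT INTO Table1 VALUES ('0001', 1, 8), ('0002', 7, 6), ('1234', 5, 6);
    INSERT INTO Table2 VALUES ('0001', 5.99), ('1234', 2.25), ('1234', 0.15), ('0002', 9.99);
""")

rows = conn.execute("""
    WITH ranked AS (
        SELECT TransactionID, Cost,
               ROW_NUMBER() OVER (PARTITION BY TransactionID ORDER BY Cost DESC) AS rn
        FROM Table2
    )
    SELECT r.TransactionID, r.Cost,
           COALESCE(t1.TotalItemsType1, 0) AS TotalItemsType1,
           COALESCE(t1.TotalItemsType2, 0) AS TotalItemsType2
    FROM ranked AS r
    LEFT JOIN Table1 AS t1
        ON r.TransactionID = t1.TransactionID AND r.rn = 1  -- totals only on the top row
    ORDER BY r.TransactionID, r.Cost DESC
""").fetchall()

for row in rows:
    print(row)
```

Because ROW_NUMBER() gives tied rows distinct numbers, only one of several equal highest costs receives the totals, which is what the question asks for.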
with max_cost as (
    select transactionid tranid, max(cost) max_cost1
    from tran_2
    group by transactionid
)
select a.transactionid,
       b.cost,
       decode(b.cost, max_cost1, a.totaltype1, 0) type_1,
       decode(b.cost, max_cost1, a.totaltype2, 0) type_2
from tran_1 a, tran_2 b, max_cost c
where a.transactionid = b.transactionid
  and c.tranid = a.transactionid
  and c.tranid = b.transactionid
order by a.transactionid
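This is a solution without analytic functions.
I used a tran_1 table containing transactionid and the two item-total columns (totaltype1 and totaltype2 in the query),
and a tran_2 table containing transactionid and cost. Note that if two rows tie for the highest cost, both of them get the totals.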
Here's the case: I have one table myTable which contains 3 columns:
ID int, identity
Group varchar(2), not null
value decimal(18,0), not null
Table looks like this:
ID GROUP VALUE Prev_Value Result
------------------------------------------
1 A 20 0 20
2 A 30 20 10
3 A 35 30 5
4 B 100 0 100
5 B 150 100 50
6 B 300 200 100
7 C 40 0 40
8 C 60 40 20
9 A 50 35 15
10 A 70 50 20
The Prev_Value and Result columns should be computed columns. I need to do this in a view. Can anyone help? Thank you so much.
The gist of what you need to do here is to join the table to itself, where part of the join condition is that the value column of the joined copy of the table is less than value column of the original. Then you can group by the columns from the original table and select the max value from the joined table to get your results:
SELECT t1.id, t1.[Group], t1.Value
, coalesce(MAX(t2.Value),0) As Prev_Value
, t1.Value - coalesce(MAX(t2.Value),0) As Result
FROM MyTable t1
LEFT JOIN MyTable t2 ON t2.[Group] = t1.[Group] and t2.Value < t1.Value
GROUP BY t1.id, t1.[Group], t1.Value
Once you can upgrade to SQL Server 2012, you'll also be able to take advantage of the new LAG function.
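Here is a runnable sketch of the self-join approach on the sample data, using Python's sqlite3 (the column is quoted as "Group" because GROUP is a keyword). It relies on the same assumption as the query above: within a group, values only increase as ID increases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, "Group" TEXT NOT NULL, Value INTEGER NOT NULL)'
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "A", 20), (2, "A", 30), (3, "A", 35), (4, "B", 100), (5, "B", 150),
     (6, "B", 300), (7, "C", 40), (8, "C", 60), (9, "A", 50), (10, "A", 70)],
)

# The previous value is the largest value in the same group that is
# smaller than the current one; 0 when there is none.
rows = conn.execute("""
    SELECT t1.ID, t1."Group", t1.Value,
           COALESCE(MAX(t2.Value), 0) AS Prev_Value,
           t1.Value - COALESCE(MAX(t2.Value), 0) AS Result
    FROM MyTable AS t1
    LEFT JOIN MyTable AS t2
        ON t2."Group" = t1."Group" AND t2.Value < t1.Value
    GROUP BY t1.ID, t1."Group", t1.Value
    ORDER BY t1.ID
""").fetchall()

for row in rows:
    print(row)
```

For ID 6 this yields a Prev_Value of 150 and a Result of 150, whereas the table in the question lists 200 and 100 for that row.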
I am still having trouble creating a running total based on the increasing order of the value. Row id has no real meaning, it is just the PK. My server doesn't support OVER.
Row Value
1 3
2 7
3 1
4 2
Result:
Row Value
3 1
4 3
1 6
2 13
I have tried self and cross joins where I specify that the value of the second amount(the one being summed up) is less than the current value of the first. I have also tried doing this with the having clause but that always threw an error when I tried it that way. Can someone explain why it would be wrong to use it in that manner and how I should be doing it?
Here is one way to do a running total:
select row, value,
(select sum(value) from t t2 where t2.value <= t.value) as runningTotal
from t
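A quick way to check this against the sample data is the sqlite3 sketch below. The column is called `row_id` here instead of `row`, purely as a precaution for this example, and the ORDER BY is added so the output matches the order in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (row_id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 3), (2, 7), (3, 1), (4, 2)])

# For each row, sum every value that is less than or equal to its own value.
rows = conn.execute("""
    SELECT t.row_id, t.value,
           (SELECT SUM(t2.value) FROM t AS t2 WHERE t2.value <= t.value) AS running_total
    FROM t
    ORDER BY t.value
""").fetchall()

for row in rows:
    print(row)
```

One thing to keep in mind: rows with equal values all get the same running total, since each of them includes the others in its sum. A tie-breaker on the key would be needed if that matters.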
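You can also use WITH ROLLUP if you have SQL Server 2008, but note that it is a GROUP BY modifier that appends a grand-total row; it does not by itself produce a per-row running total:
select row, sum(value) as total from t group by row with rollup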
If your platform supports recursive queries, you can use a recursive CTE (IIRC you should omit the RECURSIVE keyword on Microsoft platforms). Because the CTE needs to find the beginning and end of a "chain", the tuples unfortunately need to be ordered in some way (I use the "row" field; an internal tuple id would be perfect for this purpose):
WITH RECURSIVE sums AS (
    -- Terminal part
    SELECT d0.row
         , d0.value AS value
         , d0.value AS runsum
    FROM data d0
    WHERE NOT EXISTS (
        SELECT * FROM data nx
        WHERE nx.row < d0.row
    )
    UNION
    -- Recursive part
    SELECT t1.row AS row
         , t1.value AS value
         , t0.runsum + t1.value AS runsum
    FROM data t1
       , sums t0
    WHERE t1.row > t0.row
      AND NOT EXISTS (
        SELECT * FROM data nx
        WHERE nx.row > t0.row
          AND nx.row < t1.row
    )
)
SELECT * FROM sums
;
RESULT:
row | value | runsum
-----+-------+--------
1 | 3 | 3
2 | 7 | 10
3 | 1 | 11
4 | 2 | 13
(4 rows)
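The same chain-walking idea can be tried in SQLite through Python's sqlite3, as sketched below. This is an adaptation rather than the exact query above: the column is renamed to `row_id`, UNION ALL is used instead of UNION, and the join is written with explicit JOIN syntax. Like the original, it accumulates in row order, not in value order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (row_id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?)", [(1, 3), (2, 7), (3, 1), (4, 2)])

rows = conn.execute("""
    WITH RECURSIVE sums(row_id, value, runsum) AS (
        -- Anchor: the row with no predecessor starts the chain.
        SELECT d0.row_id, d0.value, d0.value
        FROM data AS d0
        WHERE NOT EXISTS (SELECT 1 FROM data AS nx WHERE nx.row_id < d0.row_id)
        UNION ALL
        -- Step: move to the immediate successor and add its value.
        SELECT t1.row_id, t1.value, t0.runsum + t1.value
        FROM data AS t1
        JOIN sums AS t0 ON t1.row_id > t0.row_id
        WHERE NOT EXISTS (
            SELECT 1 FROM data AS nx
            WHERE nx.row_id > t0.row_id AND nx.row_id < t1.row_id
        )
    )
    SELECT row_id, value, runsum FROM sums ORDER BY row_id
""").fetchall()

for row in rows:
    print(row)
```

To get the value-ordered running total the question asks for, the "successor" conditions would have to compare on value instead of row_id, which in turn needs a tie-breaker when values repeat.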