Count number of unique occurrences of a key value corresponding to each ID column - sql

I have a table in DB2 as below:
Key     ID   SubID
------  ---  -----
Abc123  576  10
Abc123  576  12
Abc124  576  13
Abc125  577  14
Abc126  578  15
Abc127  578  16
Abc128  578  17
I want to create an additional count column that counts the number of unique occurrences of the key value for each ID. The output should be as below:
Key     ID   SubID  Count
------  ---  -----  -----
Abc123  576  10     2
Abc123  576  12     2
Abc124  576  13     2
Abc125  577  14     1
Abc126  578  15     3
Abc127  578  16     3
Abc128  578  17     3
I tried the following:
select Key, ID, SubId,
       count(Key) over (partition by Key) as count
from table
Appreciate any help!

In Db2 you cannot use the DISTINCT qualifier in a window function, so COUNT(DISTINCT ...) OVER (...) is not an option. You can use a scalar subquery to count the rows you want instead.
For example:
select *,
       (select count(distinct key) from t x where x.id = t.id) as cnt
from t
Result:
KEY ID SUBID CNT
------- ---- ------ ---
Abc123 576 10 2
Abc123 576 12 2
Abc124 576 13 2
Abc125 577 14 1
Abc126 578 15 3
Abc127 578 16 3
Abc128 578 17 3
See running example at db<>fiddle.
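If a pure window-function form is preferred, a common workaround for the missing COUNT(DISTINCT ...) OVER is to layer DENSE_RANK and take its maximum per partition. A sketch against the same table t as above:
-- Emulate count(distinct key) over (partition by id):
-- dense_rank numbers the distinct keys within each id,
-- so its maximum per id equals the distinct-key count.
select key, id, subid,
       max(dr) over (partition by id) as cnt
from (
    select t.*,
           dense_rank() over (partition by id order by key) as dr
    from t
) s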

Related

Getting the last 50 rows for each group in group by

I have this query, but LIMIT only caps the total number of rows returned instead of limiting how many rows each group aggregates.
I only want each person's last 50 rows to be summed in the group.
SELECT playerid, SUM(gamesplayed) AS totalgames, SUM(playtimes) AS playtimeTotal, SUM(Kills) AS totalkills
FROM plugin_game
WHERE gamesplayed=1
GROUP BY playerid
ORDER BY totalkills DESC
LIMIT 50
playerid totalgames playtimeTotal totalkills
797749 8 3076 678
53854 8 5982 635
24398 8 3277 575
464657 4 1325 387
65748 4 3390 368
651532 4 3219 354
287378 6 3893 350
753808 4 2565 323
731631 4 1733 256
665338 4 1971 255
569648 2 2041 244
56488 4 2636 157
006985 3 785 93
58640 1 432 72
If I change the LIMIT to 5, it only shows:
playerid totalgames playtimeTotal totalkills
797749 8 3076 678
53854 8 5982 635
24398 8 3277 575
464657 4 1325 387
65748 4 3390 368
So if we use 5 games as an example, I only want to get the SUM over each player's last 5 games for the group.
This should work in PostgreSQL. Note that the WINDOW clause belongs after WHERE, and the row-number alias can only be filtered from an outer query, so the window query is wrapped in a subquery:
SELECT playerid, totalgames, playtimetotal, totalkills
FROM (
    SELECT playerid,
           SUM(gamesplayed) OVER w AS totalgames,
           SUM(playtimes) OVER w AS playtimetotal,
           SUM(kills) OVER w AS totalkills,
           ROW_NUMBER() OVER w AS rn
    FROM plugin_game
    WHERE gamesplayed = 1
    WINDOW w AS (PARTITION BY playerid ORDER BY kills DESC)
) t
WHERE rn <= 50
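With ORDER BY inside the window, the SUMs above are running totals, so each player still yields up to 50 rows. If the goal is a single total row per player over those 50 games, a subquery-plus-GROUP-BY sketch (assuming kills is the intended ordering, since no date column is shown in the question):
SELECT playerid,
       SUM(gamesplayed) AS totalgames,
       SUM(playtimes) AS playtimeTotal,
       SUM(kills) AS totalkills
FROM (
    -- number each player's games; rn <= 50 keeps the top 50 per player
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY playerid ORDER BY kills DESC) AS rn
    FROM plugin_game
    WHERE gamesplayed = 1
) t
WHERE rn <= 50
GROUP BY playerid
ORDER BY totalkills DESC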

Group repeating pattern in pandas Dataframe

So I have a DataFrame with a repeating number series that I want to group like this:
Number Pattern  Value  Desired Group  Value.1
1               723    1              Max of Group
2               400    1              Max of Group
8               235    1              Max of Group
5               387    2              Max of Group
7               911    2              Max of Group
3               365    3              Max of Group
4               270    3              Max of Group
5               194    3              Max of Group
7               452    3              Max of Group
100             716    4              Max of Group
104             69     4              Max of Group
2               846    5              Max of Group
3               474    5              Max of Group
4               524    5              Max of Group
So essentially the number pattern within each group is always monotonically increasing.
Any ideas?
You can compare Number Pattern to 1, take the cumulative sum with Series.cumsum to form group ids, and then use GroupBy.transform with 'max':
df['Desired Group'] = df['Number Pattern'].eq(1).cumsum()
df['Value.1'] = df.groupby('Desired Group')['Value'].transform('max')
print (df)
Number Pattern Value Desired Group Value.1
0 1 723 1 723
1 2 400 1 723
2 3 235 1 723
3 1 387 2 911
4 2 911 2 911
5 1 365 3 452
6 2 270 3 452
7 3 194 3 452
8 4 452 3 452
9 1 716 4 716
10 2 69 4 716
11 1 846 5 846
12 2 474 5 846
13 3 524 5 846
For a pattern that is merely monotonically increasing (not necessarily restarting at 1), use:
df['Desired Group'] = (~df['Number Pattern'].diff().gt(0)).cumsum()
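For reference, a minimal self-contained version of the accepted approach. The input frame here is an assumption reconstructed from the printed output above:
import pandas as pd

# Reconstructed sample input (assumed): the pattern restarts at 1 for each group
df = pd.DataFrame({
    'Number Pattern': [1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 1, 2, 3],
    'Value': [723, 400, 235, 387, 911, 365, 270, 194, 452, 716, 69, 846, 474, 524],
})

# A new group starts at every 1; cumsum of the boolean mask yields group ids
df['Desired Group'] = df['Number Pattern'].eq(1).cumsum()
# Broadcast each group's maximum Value back onto its rows
df['Value.1'] = df.groupby('Desired Group')['Value'].transform('max')
print(df)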

Redshift SQL query help needed

Table order_details:
order_id dish_id category_id
----------------------------------
601 22 123
601 23 234
603 32 456
603 54 456
603 11 543
603 19 456
From the sample table provided above: how can I group order_id, dish_id and category_id and compute a distinct-based count with respect to each order_id?
The result should look like
order_id dish_id category_id count
---------------------------------------------
601 22 123 1
601 23 234 1
603 32 456 3
603 54 456 3
603 11 543 3
603 19 456 3
Note:
For example, dish_id 22 in order_id 601 went along with 1 other distinct category_id, i.e. 234; similarly, in order_id 603, dish_id 32 went along with 2 other distinct category_ids, i.e. 456 and 543.
If I assume that the triplets are unique, you seem to want 1 less than the count of the group. That would be:
select t.*,
(count(*) over (partition by order_id) - 1) as cnt
from t;
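If instead the Note's reading is what's wanted (the number of distinct category_ids among the other dishes in the same order) a correlated-subquery sketch, assuming the table is named order_details, would be:
select t.*,
       (select count(distinct x.category_id)
        from order_details x
        where x.order_id = t.order_id
          and x.dish_id <> t.dish_id) as cnt
from order_details t;
Under this reading the order 603 rows get 2, which matches the Note but not the expected output column, so the ambiguity flagged above remains.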

How to update duplicate rows in a column to new values

I will explain my problem briefly.
I have duplicate rino values like below (actually rino is the serial number in the front end):
chqid rino branchid
----- ---- --------
876 6 2
14 6 2
18 10 2
828 10 2
829 11 2
19 11 2
830 12 2
20 12 2
78 40 2
1092 40 2
1094 41 2
79 41 2
413 43 2
1103 43 2
82 44 2
1104 44 2
1105 45 2
83 45 2
91 46 2
1106 46 2
Here in my case I don't want to delete these duplicate rino rows. Instead, I planned to update the rino having the max date (that column is not shown in the sample above, but a date column does exist) to the next rino number.
What exactly I mean is:
if I sort the above result according to max(date), I get
chqid rino branchid
----- ---- --------
876 6 2
828 10 2
19 11 2
830 12 2
1092 40 2
79 41 2
413 43 2
82 44 2
83 45 2
1106 46 2
(NOTE: the total number of duplicate rows in branchid = 2 is 10.)
The last entered rino in the table for branchid = 2 is 245.
So I just want to update the 10 rows (of column rino) with numbers starting from 246 to 255 (just 245 + 10, along the lines of: select lastno + generate_series(1,10) nos from tab where cola=4 and branchid = 2 and vrid=20;).
Expected Output:
chqid rino branchid
----- ---- --------
876 246 2
828 247 2
19 248 2
830 249 2
1092 250 2
79 251 2
413 252 2
82 253 2
83 254 2
1106 255 2
Using PostgreSQL.
Finally I found a solution; I am using dynamic SQL to solve my issue:
do
$$
declare
    arow record;
begin
    -- loop over the later duplicate of each rino value (rn > 1);
    -- add an ORDER BY on the date column inside the window to pick the newest row deterministically
    for arow in
        select chqid, rino, branchid
        from (
            select chqid, rino::int, vrid, branchid,
                   row_number() over (partition by rino::int) as rn
            from tab
            where vrid = 20
              and branchid = 2
        ) t
        where rn > 1
    loop
        -- assign the next free rino; max(rino) is recomputed on every iteration,
        -- so the updates produce 246, 247, ... in sequence
        execute format('
            update tab
            set rino = (select max(rino::int) + 1 from tab where acyrid = 4 and branchid = 2 and vrid = 20)
            where chqid = %s
        ', arow.chqid);
    end loop;
end;
$$;
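For what it's worth, the same renumbering can be done set-based, without dynamic SQL. A sketch, assuming the date column is named vrdate (it is not shown in the sample) and that rino is stored as text (the ::int casts above suggest so):
with victims as (
    select chqid,
           row_number() over (order by chqid) as seq
    from (
        select chqid,
               count(*) over (partition by rino::int) as cnt,
               row_number() over (partition by rino::int order by vrdate desc) as rn
        from tab
        where vrid = 20
          and branchid = 2
    ) d
    where cnt > 1 and rn = 1  -- the newest row of each duplicate set
)
update tab
set rino = (245 + victims.seq)::text  -- 245 = current max rino for branchid 2
from victims
where tab.chqid = victims.chqid;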

HSQLDB query to replace a null value with a value derived from another record

This is a small excerpt from a much larger table, call it LOG:
RN EID FID FRID TID TFAID
1 364 509 7045 null 7452
2 364 509 7045 7452 null
3 364 509 7045 7457 null
4 375 512 4525 5442 5241
5 375 513 4525 5863 5241
6 375 515 4525 2542 5241
7 576 621 5632 null 5452
8 576 621 5632 2595 null
9 672 622 5632 null 5966
10 672 622 5632 2635 null
I would like a query that will replace the nulls in the TFAID column with the non-null TFAID value from another row that has the same FID.
Desired output would therefore be:
RN EID FID FRID TID TFAID
1 364 509 7045 null 7452
2 364 509 7045 7452 7452
3 364 509 7045 7457 7452
4 375 512 4525 5442 5241
5 375 513 4525 5863 5241
6 375 515 4525 2542 5241
7 576 621 5632 null 5452
8 576 621 5632 2595 5452
9 672 622 5632 null 5966
10 672 622 5632 2635 5966
I know that something like
SELECT RN,
EID,
FID,
FRID,
TID,
COALESCE(TFAID, {insert clever code here}) AS TFAID
FROM LOG
is what I need, but I can't for the life of me come up with the clever bit of SQL that will fill in the proper TFAID.
HSQLDB supports SQL features that can be used as alternatives. These features are not supported by some other databases.
CREATE TABLE LOG (RN INT, EID INT, FID INT, FRID INT, TID INT, TFAID INT);
-- using LATERAL
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, f.TFAID) AS TFAID
FROM LOG l, LATERAL (SELECT MAX(TFAID) AS TFAID FROM LOG f WHERE f.FID = l.FID) f
-- using scalar subquery
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, (SELECT MAX(TFAID) AS TFAID FROM LOG f WHERE f.FID = l.FID)) AS TFAID
FROM LOG l
Here is one approach. This aggregates the log to get the value and then joins the result in:
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, f.TFAID) AS TFAID
FROM LOG l join
(select fid, max(tfaid) as tfaid
from log
group by fid
) f
on l.fid = f.fid;
There may be other approaches that are more efficient. However, HSQLDB doesn't implement all SQL features.
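That said, recent HSQLDB versions (2.4 and later, an assumption about the version in use) do support aggregate window functions, which allow the same result in a single pass:
-- using a window function (requires HSQLDB 2.4+)
SELECT RN, EID, FID, FRID, TID,
       COALESCE(TFAID, MAX(TFAID) OVER (PARTITION BY FID)) AS TFAID
FROM LOG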