Running ID based on 2 columns - sql

Can someone help? I've been trawling through Google and loads of forums but can't seem to find what I'm looking for. I need some kind of running ID added to my data. See the example below.
This is my data:
ID  A   B       C
1   22  WP1234  C
2   22  WP1235  C
3   22  WP1236  O
4   24  WP1237  C
5   24  WP1238  C
6   24  WP1239  O
7   26  WP1240  C
8   26  WP1241  C
9   28  WP1242  C
I need to get some kind of running ID based on columns A and C.
The desired outcome would be:
ID  A   B       C  RunningID
1   22  WP1234  C  1
2   22  WP1235  C  2
3   22  WP1236  O  1
4   24  WP1237  C  1
5   24  WP1238  C  2
6   24  WP1239  O  1
7   26  WP1240  C  1
8   26  WP1241  C  2
9   28  WP1242  C  1

I think you want row_number():
select t.*,
       row_number() over (partition by a, c order by id) as running_id
from t;
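To check the row_number() suggestion against the sample data, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. The in-memory table and the `running_ids` helper are assumptions made for illustration, and window functions need SQLite 3.25 or newer; the original question does not say which database is in use.

```python
import sqlite3

# Sample rows copied from the question: (id, a, b, c).
ROWS = [
    (1, 22, "WP1234", "C"),
    (2, 22, "WP1235", "C"),
    (3, 22, "WP1236", "O"),
    (4, 24, "WP1237", "C"),
    (5, 24, "WP1238", "C"),
    (6, 24, "WP1239", "O"),
    (7, 26, "WP1240", "C"),
    (8, 26, "WP1241", "C"),
    (9, 28, "WP1242", "C"),
]

# Same query as in the answer, with an ORDER BY added so the output is stable.
QUERY = """
select t.*,
       row_number() over (partition by a, c order by id) as running_id
from t
order by id
"""


def running_ids():
    """Return (id, running_id) pairs computed by the row_number() query."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (id integer, a integer, b text, c text)")
    con.executemany("insert into t values (?, ?, ?, ?)", ROWS)
    result = [(row[0], row[4]) for row in con.execute(QUERY)]
    con.close()
    return result


for row_id, running_id in running_ids():
    print(row_id, running_id)
```

The numbering restarts whenever the (a, c) pair changes, which is what the desired RunningID column shows.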

Related

Create multiple rows based on a column containing a list of numbers

I currently have a table which looks like this.
A Category Code
1 A 10,30
2 B 30
3 C 20,30,40
Is there any way to write a SQL statement that would get me:
ID Category Code
1 A 10
1 A 30
2 B 30
3 C 20
3 C 30
3 C 40
Thanks
You can use UNNEST with the SPLIT function (this syntax is supported by BigQuery, for example):
select a, category, s_code
from my_data, unnest(split(code, ',')) as s_code
a  category  s_code
1  A         10
1  A         30
2  B         30
3  C         20
3  C         30
3  C         40
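For anyone who wants to see the split-then-flatten idea outside the database, here is a plain-Python sketch of the same transformation. The `split_codes` helper is hypothetical and is not part of the answer above; it only mirrors what SPLIT followed by UNNEST does to each row.

```python
# Rows copied from the question: (a, category, code).
my_data = [
    (1, "A", "10,30"),
    (2, "B", "30"),
    (3, "C", "20,30,40"),
]


def split_codes(rows):
    """Emit one (a, category, s_code) row per comma-separated code."""
    return [
        (a, category, s_code.strip())
        for a, category, code in rows
        for s_code in code.split(",")  # SPLIT: one string becomes a list
    ]  # the nested loop plays the role of UNNEST: one output row per element


for row in split_codes(my_data):
    print(*row)
```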

How to make one column match duplicates in another column

This problem is out of my ability range and I can’t get anywhere with it beyond knowing I can probably use LEAD, LAG or maybe a cursor?
Here is a breakdown of the table and question:
row_id is always an IDENTITY(1, 1) column.
The set_id column always starts out in groups of 3s (two 0s for the first set_id, don't worry about why).
The letter column is alphabetic. There are varying counts of duplicates.
Here's the original table:
row_id  set_id  letter
1       0       A
2       0       A
3       1       A
4       1       B
5       1       B
6       2       B
7       2       B
8       2       C
9       3       C
10      3       C
11      3       D
12      4       D
13      4       D
14      4       D
What I need is code that does this: if the next row has the same letter as the current row, the next row should get the same set_id as the current row, stored in a new column (alt_set_id).
If that doesn't make sense, here is the result I want:
row_id  set_id  letter  alt_set_id
1       0       A       0
2       0       A       0
3       1       A       0
4       1       B       1
5       1       B       1
6       2       B       1
7       2       B       1
8       2       C       2
9       3       C       2
10      3       C       2
11      3       D       3
12      4       D       3
13      4       D       3
14      4       D       3
Here's where I am with the code so far. I'm not really close, but I think I am on the right path:
SELECT
    *,
    CASE
        WHEN letter = [letter in next row]
        THEN 'yes'
        ELSE 'no'
    END AS 'next row a duplicate?',
    'tbd' AS alt_set_id
FROM
    (SELECT
        *,
        LEAD(letter) OVER (ORDER BY row_id) AS 'letter in next row'
    FROM
        sort_test) AS dt
WHERE
    row_id = row_id
That query has the below result set, which is something I think I can work with, but it doesn't feel very efficient and I'm not yet getting the result needed in the alt_set_id column:
row_id  set_id  letter  letter in next row  next row a duplicate?  alt_set_id
1       0       A       A                   yes                    tbd
2       0       A       A                   yes                    tbd
3       1       A       B                   no                     tbd
4       1       B       B                   yes                    tbd
5       1       B       B                   yes                    tbd
6       2       B       B                   yes                    tbd
7       2       B       C                   no                     tbd
8       2       C       C                   yes                    tbd
9       3       C       C                   yes                    tbd
10      3       C       D                   no                     tbd
11      3       D       D                   yes                    tbd
12      4       D       D                   yes                    tbd
13      4       D       D                   yes                    tbd
14      4       D       NULL                no                     tbd
Thanks for any help!
Based on your example data, you want the minimum set_id for each letter. If so, use a window function:
select t.*, min(set_id) over (partition by letter) as alt_set_id
from sort_test t;
If I understand correctly, a simple correlated subquery will give you the desired result:
select *, (select min(t2.set_id) from sort_test t2 where t2.letter = t.letter) as alt_set_id
from sort_test t
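As a quick check that both suggestions agree on the sample data, here is a sketch that runs them against an in-memory SQLite table from Python. The `alt_set_ids` helper and the use of SQLite are assumptions for illustration (the question's IDENTITY column points to SQL Server), and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# (row_id, set_id, letter) rows copied from the question.
ROWS = [
    (1, 0, "A"), (2, 0, "A"), (3, 1, "A"),
    (4, 1, "B"), (5, 1, "B"), (6, 2, "B"), (7, 2, "B"),
    (8, 2, "C"), (9, 3, "C"), (10, 3, "C"),
    (11, 3, "D"), (12, 4, "D"), (13, 4, "D"), (14, 4, "D"),
]

# Window-function version: minimum set_id within each letter.
WINDOW_QUERY = """
select t.row_id, min(t.set_id) over (partition by t.letter) as alt_set_id
from sort_test t
order by t.row_id
"""

# Correlated-subquery version of the same idea.
SUBQUERY_QUERY = """
select t.row_id,
       (select min(t2.set_id) from sort_test t2 where t2.letter = t.letter) as alt_set_id
from sort_test t
order by t.row_id
"""


def alt_set_ids(query):
    """Run one of the queries and return alt_set_id values ordered by row_id."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table sort_test (row_id integer primary key, set_id integer, letter text)"
    )
    con.executemany("insert into sort_test values (?, ?, ?)", ROWS)
    result = [alt for _, alt in con.execute(query)]
    con.close()
    return result


print(alt_set_ids(WINDOW_QUERY))
print(alt_set_ids(SUBQUERY_QUERY))
```

Both approaches rely on each letter appearing in a single contiguous run, as it does in the sample. If a letter could reappear later in the table, taking the minimum per letter would merge the separate runs, and a gaps-and-islands style query would be needed instead.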

To prepare a dataframe with elements being repeated from a list in python

I have a list as primary = ['A' , 'B' , 'C' , 'D']
and a DataFrame as
df2 = pd.DataFrame(data=dateRange, columns = ['Date'])
which contains a single date column running from 01-July-2020 to 31-Dec-2020.
I created another column, 'DayNum', which holds the day number of each date; for example, 01-July-2020 is a Wednesday, so 'DayNum' is 2.
Now, using the list, I want to create another column, 'Primary', in which the elements of the list repeat. You can think of it as a roster showing the name of the person on duty for each week, where Monday is the start (day 0) and Sunday is the end (day 6).
The output should look like this:
Date DayNum Primary
0 01-Jul-20 2 A
1 02-Jul-20 3 A
2 03-Jul-20 4 A
3 04-Jul-20 5 A
4 05-Jul-20 6 A
5 06-Jul-20 0 B
6 07-Jul-20 1 B
7 08-Jul-20 2 B
8 09-Jul-20 3 B
9 10-Jul-20 4 B
10 11-Jul-20 5 B
11 12-Jul-20 6 B
12 13-Jul-20 0 C
13 14-Jul-20 1 C
14 15-Jul-20 2 C
15 16-Jul-20 3 C
16 17-Jul-20 4 C
17 18-Jul-20 5 C
18 19-Jul-20 6 C
19 20-Jul-20 0 D
20 21-Jul-20 1 D
21 22-Jul-20 2 D
22 23-Jul-20 3 D
23 24-Jul-20 4 D
24 25-Jul-20 5 D
25 26-Jul-20 6 D
26 27-Jul-20 0 A
27 28-Jul-20 1 A
28 29-Jul-20 2 A
29 30-Jul-20 3 A
30 31-Jul-20 4 A
First, compare the DayNum column with 0 using Series.eq and take the cumulative sum with Series.cumsum to get a group number for each week. Then take that number modulo the length of the list with Series.mod, and finally map it to a name with Series.map, using a dictionary created from the list with enumerate:
primary = ['A','B','C','D']
d = dict(enumerate(primary))
df['Primary'] = df['DayNum'].eq(0).cumsum().mod(len(primary)).map(d)
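To make the week-counting logic easier to follow, here is the same idea as a plain-Python sketch that does not use pandas. The `build_roster` helper is hypothetical; it counts Mondays the way `eq(0).cumsum()` does and then picks a name with the modulo.

```python
from datetime import date, timedelta

primary = ["A", "B", "C", "D"]


def build_roster(start, end, names):
    """Return (date, day_num, name) tuples; the name advances every Monday."""
    roster = []
    week = 0  # number of Mondays seen so far, like DayNum.eq(0).cumsum()
    day = start
    while day <= end:
        day_num = day.weekday()  # Monday is 0, Sunday is 6
        if day_num == 0:
            week += 1
        # Modulo wraps back to the first name after the last one.
        roster.append((day, day_num, names[week % len(names)]))
        day += timedelta(days=1)
    return roster


roster = build_roster(date(2020, 7, 1), date(2020, 12, 31), primary)
for day, day_num, name in roster[:7]:
    print(day.isoformat(), day_num, name)
```

Because the first week starts mid-week, the counter is still 0 for the first five days, so they all get the first name in the list.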

How to write sql query to generate a group no for each grouped record

The following is my scenario.
I have data in the following format:
entryid , ac_no, db/cr, amt
-----------------------------------------------
1 10 D 5
1 11 C 5
2 01 D 8
2 11 C 8
3 12 D 10
3 13 C 10
4 14 D 5
4 16 C 5
5 14 D 2
5 17 C 2
6 14 D 3
6 18 C 3
So far I have achieved the first four columns with this query:
select wm_concat(entryid), ac_no, db_cr, sum(amt) from t1 group by ac_no, db_cr
I want the data in the following format:
wm_concat(entryid), ac_no, db/cr, sum(amt), set_id
------------------------------------------------
1 10 D 5 S1
2 01 D 8 S1
1,2 11 C 13 S1
3 12 D 10 S2
3 13 C 10 S2
4,5,6 14 D 10 S3
4 16 C 5 S3
5 17 C 2 S3
6 18 C 3 S3
I want an additional column `set_id` that shows either S1, S2, ... or a plain number 1, 2, ... so that each set of debit and credit entries can be identified.
I am building the sets of debit and credit entries based on their ac_no values.
Any little help will be highly appreciated. Thanks
Create a new column, say set, and give each set a unique identifier. For example, the first three records will have set ID S1, the next two will have S2, and so on.
To distinguish a transaction from a set, you can use the db/cr column along with the newly added set column. You can identify that the 3rd row is a set since its transaction type is 'C', whereas the individual transactions are of type 'D'.
Here I have assumed that your transactions are debit only; if not, please provide more details in the question. Hope this helps.
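One way to read the desired output is that two accounts belong to the same set whenever some entry links them, directly or through other entries. That reading is an assumption and is not stated anywhere in the thread, but it does reproduce S1, S2 and S3 for the sample data. Under that assumption, the grouping can be sketched in Python with a small union-find; `assign_set_ids` is a hypothetical helper, not an Oracle feature.

```python
# (entryid, ac_no, db_cr, amt) rows copied from the question.
entries = [
    (1, "10", "D", 5), (1, "11", "C", 5),
    (2, "01", "D", 8), (2, "11", "C", 8),
    (3, "12", "D", 10), (3, "13", "C", 10),
    (4, "14", "D", 5), (4, "16", "C", 5),
    (5, "14", "D", 2), (5, "17", "C", 2),
    (6, "14", "D", 3), (6, "18", "C", 3),
]


def assign_set_ids(rows):
    """Map each ac_no to a set number; accounts linked by an entry share a set."""
    parent = {}

    def find(x):
        # Return the representative account of x's group.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_account = {}  # entryid -> first ac_no seen for that entry
    for entryid, ac_no, _, _ in rows:
        anchor = first_account.setdefault(entryid, ac_no)
        parent[find(ac_no)] = find(anchor)  # merge the two groups

    set_ids = {}    # ac_no -> set number
    numbering = {}  # group representative -> set number, in order of appearance
    for _, ac_no, _, _ in rows:
        root = find(ac_no)
        set_ids[ac_no] = numbering.setdefault(root, len(numbering) + 1)
    return set_ids


for ac_no, set_id in assign_set_ids(entries).items():
    print(ac_no, f"S{set_id}")
```

The resulting mapping could then be joined back to the grouped query on ac_no to fill the set_id column.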

How to create a flag of exclusion for duplicate rows in Oracle

Given below is the snapshot of my data
Name  Age  Income Group
Asd   20   A
Asd   20   A
b     19   E
c     21   B
c     21   B
c     21   B
df    21   C
rd    24   D
I want to include a flag variable that is 1 for all but one row in each group of duplicates and 0 for the remaining row. Rows that have no duplicates should also get 0. Given below is a snapshot of the final desired output:
Name  Age  Income Group  Flag
Asd   20   A             1
Asd   20   A             0
b     19   E             0
c     21   B             1
c     21   B             1
c     21   B             0
df    21   C             0
rd    24   D             0
Can anyone help me create this flag variable in an Oracle database?
You can do this using analytic functions and case:
select t.*,
       (case when row_number() over (partition by name, age, income order by name) = 1
             then 0 else 1
        end) as GroupFlag
from your_table t;  -- your_table is a placeholder; TABLE itself is a reserved word
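Here is a small sketch that tries the row_number() flag on the sample rows using Python's built-in sqlite3 module rather than Oracle. The table name `people`, the column name `income_group` and the `flags_by_name` helper are assumptions for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# (name, age, income_group) rows copied from the question.
ROWS = [
    ("Asd", 20, "A"), ("Asd", 20, "A"),
    ("b", 19, "E"),
    ("c", 21, "B"), ("c", 21, "B"), ("c", 21, "B"),
    ("df", 21, "C"),
    ("rd", 24, "D"),
]

# The first row of each duplicate group gets 0, every further copy gets 1.
QUERY = """
select t.name,
       case when row_number() over (
                     partition by name, age, income_group order by name) = 1
            then 0 else 1
       end as group_flag
from people t
"""


def flags_by_name():
    """Return {name: sorted list of flags} for the sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute("create table people (name text, age integer, income_group text)")
    con.executemany("insert into people values (?, ?, ?)", ROWS)
    flags = {}
    for name, flag in con.execute(QUERY):
        flags.setdefault(name, []).append(flag)
    con.close()
    return {name: sorted(values) for name, values in flags.items()}


for name, values in flags_by_name().items():
    print(name, values)
```

Since the duplicate rows are identical, which copy ends up with the 0 is arbitrary; only the count of 0s and 1s per group is fixed.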