How to eliminate duplicates in a SQL table with the following values?


S_No. Name HRA
1 SS 123
2 SS 123
3 SS 123
4 SS 124
5 SA 222
6 SA 222
7 SA 221
8 SE 222
9 SE 123
10 SE 123
Desired Result
S_No. Name HRA
1 SS 123
4 SS 124
5 SA 222
7 SA 221
8 SE 222
9 SE 123

select min(s_no), name, hra
from table_name
group by name, hra
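The query above only selects the de-duplicated rows; it does not remove anything from the table. Below is a minimal sketch of both steps, using Python's built-in sqlite3 module as a stand-in database (an assumption, since the question does not name the DBMS). The table name table_name comes from the answer; the DELETE statement is an illustrative addition, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (s_no INTEGER PRIMARY KEY, name TEXT, hra INTEGER)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(1, "SS", 123), (2, "SS", 123), (3, "SS", 123), (4, "SS", 124),
     (5, "SA", 222), (6, "SA", 222), (7, "SA", 221), (8, "SE", 222),
     (9, "SE", 123), (10, "SE", 123)],
)

# Step 1: the answer's query, keeping the lowest s_no per (name, hra) pair.
selected = conn.execute(
    "SELECT MIN(s_no) AS first_s_no, name, hra "
    "FROM table_name GROUP BY name, hra ORDER BY first_s_no"
).fetchall()

# Step 2 (illustrative): physically delete every row that is not the
# first occurrence of its (name, hra) pair.
conn.execute(
    "DELETE FROM table_name WHERE s_no NOT IN "
    "(SELECT MIN(s_no) FROM table_name GROUP BY name, hra)"
)
remaining = conn.execute(
    "SELECT s_no, name, hra FROM table_name ORDER BY s_no"
).fetchall()
print(remaining)
```

After the delete, the table holds the same six rows that the SELECT returned, which matches the desired result in the question.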

Related

Update query for two tables in oracle based on multiple conditions

Consider that I have two tables, train_reserve and reserve:
train_reserve:
ChangeId C_Id Process Download trainId Status SDate EDate Book_date L_date BookId
1 1001 1 A 1995 B 05-APR-22 06-APR-22 10-MAR-22 11-MAR-22 111
2 1001 2 B 1995 M 05-APR-22 08-APR-22 10-MAR-22 11-MAR-22 111
3 1002 1 B 1995 B 12-APR-22 14-APR-22 10-MAR-22 11-MAR-22 222
4 1002 2 C 1995 M 12-APR-22 13-APR-22 10-MAR-22 11-MAR-22 222
5 1003 1 A 1995 B 25-MAY-22 25-MAY-22 10-MAR-22 11-MAR-22 333
6 1004 1 A 1995 B 19-MAR-22 20-MAR-22 10-MAR-22 11-MAR-22 444
7 1004 1 B 1995 B 19-MAR-22 20-MAR-22 10-MAR-22 11-MAR-22 555
reserve:
C_Id trainId SDate EDate L_date BookId
1001 1995 05-APR-22 08-APR-22 11-MAR-22 111
1002 1995 12-APR-22 13-APR-22 11-MAR-22 222
1003 1995 25-MAY-22 25-MAY-22 11-MAR-22 333
1004 1995 19-MAR-22 20-MAR-22 11-MAR-22 444
1005 1995 19-MAR-22 20-MAR-22 11-MAR-22 555
Below is the input from the user:
C_id=1, Process=(1,2), Download=(A,B,C), trainId=1995, Status=(B), Sdate=null, Edate=null, Book_date>='10-MAR-22', L_date=null.
The user wants to set BookId to null in both tables where C_Id >= 1001 and the Status for that C_Id is B only (no row with Status M), i.e. I want the output below:
train_reserve:
ChangeId C_Id Process Download trainId Status SDate EDate Book_date L_date BookId
1 1001 1 A 1995 B 05-APR-22 06-APR-22 10-MAR-22 11-MAR-22 111
2 1001 2 B 1995 M 05-APR-22 08-APR-22 10-MAR-22 11-MAR-22 111
3 1002 1 B 1995 B 12-APR-22 14-APR-22 10-MAR-22 11-MAR-22 222
4 1002 2 C 1995 M 12-APR-22 13-APR-22 10-MAR-22 11-MAR-22 222
5 1003 1 A 1995 B 25-MAY-22 25-MAY-22 10-MAR-22 11-MAR-22 null
6 1004 1 A 1995 B 19-MAR-22 20-MAR-22 10-MAR-22 11-MAR-22 null
7 1004 1 B 1995 B 19-MAR-22 20-MAR-22 10-MAR-22 11-MAR-22 null
reserve:
C_Id trainId SDate EDate L_date BookId
1001 1995 05-APR-22 08-APR-22 11-MAR-22 111
1002 1995 12-APR-22 13-APR-22 11-MAR-22 222
1003 1995 25-MAY-22 25-MAY-22 11-MAR-22 null
1004 1995 19-MAR-22 20-MAR-22 11-MAR-22 null
1005 1995 19-MAR-22 20-MAR-22 11-MAR-22 null
I am currently using two update statements, as below:
update train_reserve a
set a.BookId = null
where a.C_Id >= 1001
and a.trainId = 1995
and a.Process in (1,2)
and a.Download in ('A','B','C')
and a.Status = 'B'
and a.Book_date >= '10-MAR-22'
and not exists (select 1
                from train_reserve b
                where a.C_Id = b.C_Id
                and b.Status = 'M');
update reserve r
set r.BookId = null
where r.C_Id in (select a.C_Id
                 from train_reserve a
                 where a.C_Id >= 1001
                 and a.trainId = 1995
                 and a.Process in (1,2)
                 and a.Download in ('A','B','C')
                 and a.Status = 'B'
                 and a.Book_date >= '10-MAR-22'
                 and not exists (select 1
                                 from train_reserve b
                                 where a.C_Id = b.C_Id
                                 and b.Status = 'M'));
But the second query takes a long time to run, since I am fetching data from the first table and then updating the reserve table.
Is there a more efficient way to achieve the above result?
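One possible direction, sketched here under several assumptions, is to evaluate the filter once, store the matching keys in a temporary table, and drive both updates from it. The sketch uses Python's built-in sqlite3 module rather than Oracle, stores dates as ISO strings, drops the columns the filter does not use, and assumes ChangeId is unique in train_reserve. The temporary table name targets is hypothetical. Whether this is actually faster on Oracle depends on indexes and data volume and has not been measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE train_reserve (ChangeId INTEGER, C_Id INTEGER, Process INTEGER,
    Download TEXT, trainId INTEGER, Status TEXT, Book_date TEXT, BookId INTEGER);
CREATE TABLE reserve (C_Id INTEGER, trainId INTEGER, BookId INTEGER);
""")
conn.executemany(
    "INSERT INTO train_reserve VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [(1, 1001, 1, "A", 1995, "B", "2022-03-10", 111),
     (2, 1001, 2, "B", 1995, "M", "2022-03-10", 111),
     (3, 1002, 1, "B", 1995, "B", "2022-03-10", 222),
     (4, 1002, 2, "C", 1995, "M", "2022-03-10", 222),
     (5, 1003, 1, "A", 1995, "B", "2022-03-10", 333),
     (6, 1004, 1, "A", 1995, "B", "2022-03-10", 444),
     (7, 1004, 1, "B", 1995, "B", "2022-03-10", 555)],
)
conn.executemany(
    "INSERT INTO reserve VALUES (?, ?, ?)",
    [(1001, 1995, 111), (1002, 1995, 222), (1003, 1995, 333),
     (1004, 1995, 444), (1005, 1995, 555)],
)

# Evaluate the expensive filter exactly once and remember which rows matched.
conn.execute("""
CREATE TEMP TABLE targets AS
SELECT a.ChangeId, a.C_Id
FROM train_reserve a
WHERE a.C_Id >= 1001
  AND a.trainId = 1995
  AND a.Process IN (1, 2)
  AND a.Download IN ('A', 'B', 'C')
  AND a.Status = 'B'
  AND a.Book_date >= '2022-03-10'
  AND NOT EXISTS (SELECT 1 FROM train_reserve b
                  WHERE b.C_Id = a.C_Id AND b.Status = 'M')
""")

# Both updates now read the small targets table instead of repeating the filter.
conn.execute(
    "UPDATE train_reserve SET BookId = NULL "
    "WHERE ChangeId IN (SELECT ChangeId FROM targets)"
)
conn.execute(
    "UPDATE reserve SET BookId = NULL "
    "WHERE C_Id IN (SELECT C_Id FROM targets)"
)

train_rows = conn.execute(
    "SELECT ChangeId, BookId FROM train_reserve ORDER BY ChangeId"
).fetchall()
reserve_rows = conn.execute(
    "SELECT C_Id, BookId FROM reserve ORDER BY C_Id"
).fetchall()
print(train_rows)
print(reserve_rows)
```

The sketch mirrors the two posted statements, so the reserve row with C_Id 1005 keeps its BookId here: no train_reserve row refers to that C_Id.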

Getting an element and the next from a table

I have a table with ids, cities and some sequence number, say:
ID CITY SEQ_NO
1 Milan 123
2 Paris 124
1 Rome 125
1 Naples 126
1 Strasbourg 130
3 London 129
3 Manchester 132
2 Strasbourg 128
3 Rome 131
2 Rome 127
4 Moscow 135
5 New York 136
4 Helsinki 137
I want to get the city that comes after Rome for the same id, in this case, I can order them by doing something like:
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SEQ_NO) as rownum,
id,
city,
seq_no
FROM mytable
I get:
rownum ID CITY SEQ_NO
1 1 Milan 123
2 1 Rome 125
3 1 Naples 126
4 1 Strasbourg 130
1 2 Paris 124
2 2 Rome 127
3 2 Strasbourg 128
1 3 London 129
2 3 Rome 131
3 3 Manchester 132
1 4 Moscow 135
2 4 Helsinki 137
1 5 New York 136
and, I want to get
ID CITY SEQ_NO
1 Rome 125
1 Naples 126
2 Rome 127
2 Strasbourg 128
3 Rome 131
3 Manchester 132
How do I proceed?

Hmmm . . . I might suggest window functions:
select t.*
from (select t.*,
lag(city) over (partition by id order by seq_no) as prev_city
from mytable t
) t
where 'Rome' in (city, prev_city)
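To check the query against the sample data, here is a small sketch that runs it through Python's built-in sqlite3 module. This assumes the bundled SQLite is version 3.25 or later, which is needed for window functions; the question does not say which DBMS is in use. An ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, city TEXT, seq_no INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "Milan", 123), (2, "Paris", 124), (1, "Rome", 125),
     (1, "Naples", 126), (1, "Strasbourg", 130), (3, "London", 129),
     (3, "Manchester", 132), (2, "Strasbourg", 128), (3, "Rome", 131),
     (2, "Rome", 127), (4, "Moscow", 135), (5, "New York", 136),
     (4, "Helsinki", 137)],
)

# LAG() exposes the previous city within each id, ordered by seq_no.
# A row is kept if it is Rome itself or if the row before it was Rome.
rome_and_next = conn.execute("""
SELECT id, city, seq_no
FROM (SELECT t.*,
             LAG(city) OVER (PARTITION BY id ORDER BY seq_no) AS prev_city
      FROM mytable t) AS t
WHERE 'Rome' IN (city, prev_city)
ORDER BY id, seq_no
""").fetchall()

for row in rome_and_next:
    print(row)
```

Ids 4 and 5 drop out because they never pass through Rome, and when Rome is the last city for an id only the Rome row would come back.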

how to add incremental number to specific column in pandas

I have following dataframe in pandas
code tank length dia diff
123 3 625 210 -0.38
123 5 635 210 1.2
I want to repeat each row and add 1 to length five times if diff is positive, or subtract 1 five times if diff is negative. My desired dataframe looks like
code tank length diameter
123 3 625 210
123 3 624 210
123 3 623 210
123 3 622 210
123 3 621 210
123 3 620 210
123 5 635 210
123 5 636 210
123 5 637 210
123 5 638 210
123 5 639 210
123 5 640 210
I am doing following in pandas.
df.add(1)
But it's adding 1 to all the columns.

Use Index.repeat to repeat each row 6 times, then add counter values with GroupBy.cumcount, and finally create a default RangeIndex with DataFrame.reset_index:
df1 = df.loc[df.index.repeat(6)].copy()
df1['length'] += df1.groupby(level=0).cumcount()
df1 = df1.reset_index(drop=True)
Or:
df1 = (df.loc[df.index.repeat(6)]
.assign(length = lambda x: x.groupby(level=0).cumcount() + x['length'])
.reset_index(drop=True))
print (df1)
code tank length dia
0 123 3 625 210
1 123 3 626 210
2 123 3 627 210
3 123 3 628 210
4 123 3 629 210
5 123 3 630 210
6 123 5 635 210
7 123 5 636 210
8 123 5 637 210
9 123 5 638 210
10 123 5 639 210
11 123 5 640 210
EDIT: To subtract when diff is negative, choose the sign with np.where:
import numpy as np
df1 = df.loc[df.index.repeat(6)].copy()
add = df1.groupby(level=0).cumcount()
mask = df1['diff'] < 0
df1['length'] = np.where(mask, df1['length'] - add, df1['length'] + add)
df1 = df1.reset_index(drop=True)
print (df1)
code tank length dia diff
0 123 3 625 210 -0.38
1 123 3 624 210 -0.38
2 123 3 623 210 -0.38
3 123 3 622 210 -0.38
4 123 3 621 210 -0.38
5 123 3 620 210 -0.38
6 123 5 635 210 1.20
7 123 5 636 210 1.20
8 123 5 637 210 1.20
9 123 5 638 210 1.20
10 123 5 639 210 1.20
11 123 5 640 210 1.20

We can use pd.concat, np.cumsum and groupby + .add.
If you want to subtract, simply multiply addition by -1, for example: (np.cumsum(np.ones(n))-1) * -1
import numpy as np
import pandas as pd
n = 6
new = pd.concat([df]*n).sort_values(['code', 'length']).reset_index(drop=True)
addition = np.cumsum(np.ones(n))-1
new['length'] = new.groupby(['code', 'tank'])['length'].transform(lambda x: x.add(addition))
Output
code tank length dia
0 123 3 625.0 210
1 123 3 626.0 210
2 123 3 627.0 210
3 123 3 628.0 210
4 123 3 629.0 210
5 123 3 630.0 210
6 123 5 635.0 210
7 123 5 636.0 210
8 123 5 637.0 210
9 123 5 638.0 210
10 123 5 639.0 210
11 123 5 640.0 210

Python Pandas Combining 2 df by keys

I'm trying to combine these two dataframes:
df1 =
ID1 ID2
111 1001
112 1002
113 1003
114 1004
115 1005
df2 =
ID1 Name Age
111 ABC 20
111 ABC 21
1001 ABC 22
1002 QAZ 18
1002 QAZ 19
1002 QAZ 20
113 XYZ 25
113 XYZ 25
to get an output like this:
ID Name Age ID1 ID2
111 ABC 20 111 1001
111 ABC 21 111 1001
1001 ABC 22 111 1001
1002 QAZ 18 112 1002
1002 QAZ 19 112 1002
1002 QAZ 20 112 1002
113 XYZ 25 113 1003
113 XYZ 25 113 1003
Is this possible?
Thanks in advance!

Use merge + combine_first. (PS: I think the ID1 column in df2 should be named ID.)
df2.merge(df1,left_on='ID',right_on='ID1',how='left').\
combine_first(df2.merge(df1,left_on='ID',right_on='ID2',how='left'))
Out[912]:
ID Name Age ID1 ID2
0 111 ABC 20 111.0 1001.0
1 111 ABC 21 111.0 1001.0
2 1001 ABC 22 111.0 1001.0
3 1002 QAZ 18 112.0 1002.0
4 1002 QAZ 19 112.0 1002.0
5 1002 QAZ 20 112.0 1002.0
6 113 XYZ 25 113.0 1003.0
7 113 XYZ 25 113.0 1003.0
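The combine_first approach turns ID1 and ID2 into floats because each intermediate merge contains NaN. A different technique, sketched below, stacks df1 into a lookup table so that a single left merge suffices. It assumes, like the answer, that the key column of df2 is named ID, and additionally that no value appears in both ID1 and ID2 (otherwise rows would be duplicated). The name lookup is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({"ID1": [111, 112, 113, 114, 115],
                    "ID2": [1001, 1002, 1003, 1004, 1005]})
df2 = pd.DataFrame({"ID": [111, 111, 1001, 1002, 1002, 1002, 113, 113],
                    "Name": ["ABC", "ABC", "ABC", "QAZ", "QAZ", "QAZ", "XYZ", "XYZ"],
                    "Age": [20, 21, 22, 18, 19, 20, 25, 25]})

# Build one lookup where every key, whether it comes from ID1 or ID2,
# points at its full (ID1, ID2) pair.
lookup = pd.concat(
    [df1.assign(ID=df1["ID1"]), df1.assign(ID=df1["ID2"])],
    ignore_index=True,
)

# A single left merge on ID attaches both identifiers to every df2 row.
out = df2.merge(lookup, on="ID", how="left")
print(out)
```

Because every ID in the sample finds a match, no NaN is introduced here; unmatched IDs would still produce NaN in ID1 and ID2.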

How to extract info based on the latest row

I have two tables:-
TABLE A :-
ORNO DEL PONO QTY
801 123 1 80
801 123 2 60
801 123 3 70
801 151 1 95
801 151 3 75
802 130 1 50
802 130 2 40
802 130 3 30
802 181 2 55
TABLE B:-
ORNO PONO STATUS ITEM
801 1 12 APPLE
801 2 12 ORANGE
801 3 12 MANGO
802 1 22 PEAR
802 2 22 KIWI
802 3 22 MELON
I wish to extract the info based on the latest DEL (in Table A) using SQL. The final output should look like this:-
OUTPUT:-
ORNO PONO STATUS ITEM QTY
801 1 12 APPLE 95
801 2 12 ORANGE 60
801 3 12 MANGO 75
802 1 22 PEAR 50
802 2 22 KIWI 55
802 3 22 MELON 30
Thanks.

select b.*, y.QTY
from
(
select a.ORNO, a.PONO, MAX(a.DEL) [max]
from #tA a
group by a.ORNO, a.PONO
)x
join #tA y on y.ORNO = x.ORNO and y.PONO = x.PONO and y.DEL = x.max
join #tB b on b.ORNO = y.ORNO and b.PONO = y.PONO
Output:
ORNO PONO STATUS ITEM QTY
----------- ----------- ----------- ---------- -----------
801 1 12 APPLE 95
801 2 12 ORANGE 60
801 3 12 MANGO 75
802 1 22 PEAR 50
802 2 22 KIWI 55
802 3 22 MELON 30
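The answer above uses SQL Server temp tables (#tA, #tB) and a bracketed alias. As an alternative technique, the same result can be sketched with ROW_NUMBER(), which ranks the rows of each (ORNO, PONO) pair by DEL descending and keeps the first. The sketch runs on Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions); the table names table_a and table_b are hypothetical stand-ins for Table A and Table B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (ORNO INTEGER, DEL INTEGER, PONO INTEGER, QTY INTEGER);
CREATE TABLE table_b (ORNO INTEGER, PONO INTEGER, STATUS INTEGER, ITEM TEXT);
""")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?, ?)",
    [(801, 123, 1, 80), (801, 123, 2, 60), (801, 123, 3, 70),
     (801, 151, 1, 95), (801, 151, 3, 75), (802, 130, 1, 50),
     (802, 130, 2, 40), (802, 130, 3, 30), (802, 181, 2, 55)],
)
conn.executemany(
    "INSERT INTO table_b VALUES (?, ?, ?, ?)",
    [(801, 1, 12, "APPLE"), (801, 2, 12, "ORANGE"), (801, 3, 12, "MANGO"),
     (802, 1, 22, "PEAR"), (802, 2, 22, "KIWI"), (802, 3, 22, "MELON")],
)

# rn = 1 marks the row with the highest DEL for each (ORNO, PONO) pair.
latest = conn.execute("""
SELECT b.ORNO, b.PONO, b.STATUS, b.ITEM, a.QTY
FROM table_b b
JOIN (SELECT ORNO, PONO, QTY,
             ROW_NUMBER() OVER (PARTITION BY ORNO, PONO
                                ORDER BY DEL DESC) AS rn
      FROM table_a) a
  ON a.ORNO = b.ORNO AND a.PONO = b.PONO
WHERE a.rn = 1
ORDER BY b.ORNO, b.PONO
""").fetchall()

for row in latest:
    print(row)
```

This variant scans Table A once instead of joining it back to its own aggregate; which form performs better depends on the database and its indexes.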
