Adding column in table with the values in another if matched

I have two tables:
table1
A B C
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
A5 B5 C5
A6 B6 C6
table2
A D
A1 D1
A3 D3
A5 D5
A6 D6
I would like to add a column D to table1 that holds the value of D from table2, matched on column A. Is altering table1 to add column D, then merging the two tables and updating when matched, the way to go, or is there a better approach?

You can just join the value in when you need it:
select t1.*, t2.d
from table1 t1 left join
table2 t2
on t1.a = t2.a;
If that is not sufficient, you can add the column:
alter table table1 add d <type>;
Then you can update it:
update table1 t1
set d = (select t2.d from table2 t2 where t2.a = t1.a)
where exists (select t2.d from table2 t2 where t2.a = t1.a);
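A minimal sketch of both options, run from Python with the standard-library sqlite3 module. SQLite, the in-memory database, and the TEXT column type are assumptions made for illustration (the question does not name a database); the rows are the ones from the question.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the asker's (unnamed) database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a TEXT, b TEXT, c TEXT);
    CREATE TABLE table2 (a TEXT, d TEXT);
""")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(f"A{i}", f"B{i}", f"C{i}") for i in range(1, 7)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("A1", "D1"), ("A3", "D3"), ("A5", "D5"), ("A6", "D6")],
)

# Option 1: join the value in at query time; unmatched rows get NULL.
joined = conn.execute("""
    SELECT t1.a, t1.b, t1.c, t2.d
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.a = t2.a
    ORDER BY t1.a
""").fetchall()

# Option 2: add the column, then fill it only where a match exists.
conn.execute("ALTER TABLE table1 ADD COLUMN d TEXT")
conn.execute("""
    UPDATE table1
    SET d = (SELECT t2.d FROM table2 t2 WHERE t2.a = table1.a)
    WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.a = table1.a)
""")
updated = conn.execute("SELECT a, b, c, d FROM table1 ORDER BY a").fetchall()

for row in updated:
    print(row)
```

With this data both options yield the same rows; the stored column only pays off if it has to persist independently of table2.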

Related

All rows of First N items of a group of data in dataset based on another column in pandas

Let's consider I have this dataset:
name comp item type
A c1 item21 t1
A c1 item231 t1
A c1 item3 t1
B c3 item23 t1
B c3 item1 t1
B c3 p3251 t1
C c4 item1 t1
C c4 p32sd t1
C c4 item512 t1
D c5 item242 t2
D c5 item1 t2
F c6 item4 t2
F c6 item24 t2
H c7 item4125 t2
H c7 item3 t2
H c7 item14 t2
K c8 item1 t2
K c8 p3223 t2
I want to select all items belonging to the first n [name, comp] pairs of each type.
For example, with n = 2 (all items of the first 2 name-comp pairs of each type), the expected df would be:
name comp item type
A c1 item21 t1
A c1 item231 t1
A c1 item3 t1
B c3 item23 t1
B c3 item1 t1
B c3 p3251 t1
D c5 item242 t2
D c5 item1 t2
F c6 item4 t2
F c6 item24 t2
Does anybody have any idea how to do this?

Try this:
cols = ['type', 'name', 'comp']
# The first 2 name-comp of each type
tmp = df[cols].drop_duplicates().groupby('type').head(2)
# All rows that match the criteria
result = tmp.merge(df, on=cols)
If you want no intermediary data frame:
df[cols].drop_duplicates().groupby('type').head(2).merge(df, on=cols)
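As a worked illustration, here is the same approach on a reduced copy of the question's data. The shortened row list and the variable names `n` and `first_pairs` are assumptions made for this sketch; it requires pandas to be installed.

```python
import pandas as pd

# A reduced copy of the question's data: three name-comp pairs per type.
df = pd.DataFrame(
    [
        ("A", "c1", "item21", "t1"),
        ("A", "c1", "item231", "t1"),
        ("B", "c3", "item23", "t1"),
        ("C", "c4", "item1", "t1"),
        ("D", "c5", "item242", "t2"),
        ("F", "c6", "item4", "t2"),
        ("F", "c6", "item24", "t2"),
        ("H", "c7", "item3", "t2"),
    ],
    columns=["name", "comp", "item", "type"],
)

cols = ["type", "name", "comp"]
n = 2  # hypothetical parameter: how many name-comp pairs to keep per type

# First n distinct (name, comp) pairs of each type, in order of appearance.
first_pairs = df[cols].drop_duplicates().groupby("type").head(n)

# Inner merge keeps every row of df whose (type, name, comp) is among those pairs.
result = first_pairs.merge(df, on=cols)[["name", "comp", "item", "type"]]
print(result)
```

The rows for C and H are dropped because they are the third pair of their type.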

SQL HIVE | Duplicate lines in Table

I have a table like this, where the key columns are [C1, C2, C3]. I want to remove the duplicates so that each key appears only once.
Input :
C1 C2 C3 C4 C5
A1 D1 V1 X1 F3
A2 D1 V1 X2 F2
A1 D1 V1 X1 F3
A2 D1 V1 X2 F2
A4 D1 V2 X1 F3
A2 D1 V1 X1 F3
Output :
C1 C2 C3 C4 C5
A1 D1 V1 X1 F3
A2 D1 V1 X2 F2
A4 D1 V2 X1 F3
Regards,

Try the following:
insert overwrite table yourtable select distinct * from yourtable;

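You can select the non-duplicated rows with:
SELECT DISTINCT * FROM Table
Then you can truncate the table and insert the above result into it. Note that DISTINCT compares whole rows, so rows that share a key but differ in C4 or C5 are all kept.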

You can use ROW_NUMBER() window function:
select t.c1, t.c2, t.c3, t.c4, t.c5
from (
select *, row_number() over (partition by c1, c2, c3 order by c4, c5) rn
from tablename
) t
where t.rn = 1
You can remove order by c4, c5 if you do not care which row of each group is kept.
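To see how this behaves on the question's input, here is a small sketch run from Python's standard-library sqlite3 module. SQLite (version 3.25 or later, for window functions) is an assumption standing in for Hive, and the TEXT column types are illustrative.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for Hive; column types are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT, c5 TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?, ?)",
    [
        ("A1", "D1", "V1", "X1", "F3"),
        ("A2", "D1", "V1", "X2", "F2"),
        ("A1", "D1", "V1", "X1", "F3"),
        ("A2", "D1", "V1", "X2", "F2"),
        ("A4", "D1", "V2", "X1", "F3"),
        ("A2", "D1", "V1", "X1", "F3"),
    ],
)

# DISTINCT * only collapses fully identical rows, so key A2 still appears twice.
distinct_rows = conn.execute("SELECT DISTINCT * FROM tablename").fetchall()

# row_number() keeps exactly one row per (c1, c2, c3) key.
deduped = conn.execute("""
    SELECT t.c1, t.c2, t.c3, t.c4, t.c5
    FROM (
        SELECT *, row_number() OVER (PARTITION BY c1, c2, c3 ORDER BY c4, c5) AS rn
        FROM tablename
    ) t
    WHERE t.rn = 1
    ORDER BY t.c1
""").fetchall()

print(len(distinct_rows), deduped)
```

With order by c4, c5 the surviving row for key A2 is the X1/F3 one, not the X2/F2 row shown in the question's expected output; the ordering decides which duplicate wins.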

Does aggregation do what you want?
select c1, c2, c3, max(c4), max(c5)
from t
group by c1, c2, c3;
This does not guarantee that c4 and c5 come from the same row, but it does guarantee that the triple c1/c2/c3 appears only once.
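The caveat can be checked on the question's own rows. The sketch below again assumes SQLite via Python's sqlite3 module in place of Hive; the list `mixed` is a hypothetical helper that collects aggregated rows which never existed in the input.

```python
import sqlite3

rows = [
    ("A1", "D1", "V1", "X1", "F3"),
    ("A2", "D1", "V1", "X2", "F2"),
    ("A1", "D1", "V1", "X1", "F3"),
    ("A2", "D1", "V1", "X2", "F2"),
    ("A4", "D1", "V2", "X1", "F3"),
    ("A2", "D1", "V1", "X1", "F3"),
]

# Assumption: SQLite in memory stands in for Hive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT, c5 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)

agg = conn.execute("""
    SELECT c1, c2, c3, max(c4), max(c5)
    FROM t
    GROUP BY c1, c2, c3
    ORDER BY c1
""").fetchall()

# Aggregated rows that do not occur anywhere in the original data.
mixed = [r for r in agg if r not in rows]
print(mixed)
```

For key A2 the query combines X2 from one row with F3 from another, which is exactly the mixing described above.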

Compare two tables with the same columns and report the difference keeping one column as the reference column

I have two tables. Table T1 with the following columns and rows:
#A B C D
-----------------
P1 01 C1 1
P1 02 C2 2
P2 01 C3 1
P2 02 C4 3
Table T2 with the same columns as T1 but with some differences in the data
#A B C D
---------------
P1 01 C1 1
P1 02 C9 8
P1 03 C5 1
P2 01 C6 2
P2 05 C8 4
Columns A & B together form the primary key.
I want to compare the two tables by keeping the column A as the reference column between the two tables. In my output I want to see the difference between the two tables.
#A B C D B C D T1-vs-T2
---------------------------------------------
P1 01 C1 1 01 C1 1 Match
P1 02 C2 2 02 C9 8 No Match
P1 -- -- - 03 C5 1 Not in T1
P2 01 C3 1 01 C6 2 No Match
P2 02 C4 3 -- -- - Not in T2
P2 -- -- - 05 C8 4 Not in T1

You are looking for a full outer join. Access does not directly support a full outer join operator, but we can simulate it using a union query.
SELECT
t1.A AS A,
t1.B AS B,
t1.C AS t1_C,
t1.D AS t1_D,
t2.C AS t2_C,
t2.D AS t2_D,
IIF(t1.C = t2.C AND t1.D = t2.D, 'Match',
IIF(t1.A IS NOT NULL AND t2.A IS NOT NULL, 'No Match',
'Not in T2')) AS T1_vs_T2
FROM Table1 t1
LEFT JOIN Table2 t2
ON t1.A = t2.A AND t1.B = t2.B
UNION ALL
SELECT
t2.A,
t2.B,
t1.C,
t1.D,
t2.C,
t2.D,
'Not in T1'
FROM Table1 t1
RIGHT JOIN Table2 t2
ON t1.A = t2.A AND t1.B = t2.B
WHERE
t1.A IS NULL;

I think a better way to implement the logic is to start with all ids and just use left join:
select ab.a, ab.b, t1.c, t1.d, t2.c, t2.d,
switch(t1.a is null, 'Not in t1',
t2.a is null, 'Not in t2',
t1.c = t2.c and t1.d = t2.d, 'Match',
1=1, 'No Match'
) as t1_vs_t2
from ((select a, b from t1
union -- on purpose to remove dups
select a, b from t2
) ab left join
t1
on t1.a = ab.a and t1.b = ab.b
) left join
t2
on t2.a = ab.a and t2.b = ab.b;
I prefer this because the logic for the comparison is all in one place.
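A runnable sketch of this key-union approach, using the question's data. It runs on SQLite through Python's sqlite3 module rather than Access, so `switch(...)` is replaced by a standard CASE expression; both substitutions are assumptions made for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Access; CASE replaces Access's switch().
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a TEXT, b TEXT, c TEXT, d INTEGER, PRIMARY KEY (a, b));
    CREATE TABLE t2 (a TEXT, b TEXT, c TEXT, d INTEGER, PRIMARY KEY (a, b));
    INSERT INTO t1 VALUES ('P1','01','C1',1), ('P1','02','C2',2),
                          ('P2','01','C3',1), ('P2','02','C4',3);
    INSERT INTO t2 VALUES ('P1','01','C1',1), ('P1','02','C9',8), ('P1','03','C5',1),
                          ('P2','01','C6',2), ('P2','05','C8',4);
""")

rows = conn.execute("""
    SELECT ab.a, ab.b,
           CASE
               WHEN t1.a IS NULL THEN 'Not in T1'
               WHEN t2.a IS NULL THEN 'Not in T2'
               WHEN t1.c = t2.c AND t1.d = t2.d THEN 'Match'
               ELSE 'No Match'
           END AS t1_vs_t2
    FROM (SELECT a, b FROM t1
          UNION              -- UNION (not UNION ALL) removes duplicate keys
          SELECT a, b FROM t2) ab
    LEFT JOIN t1 ON t1.a = ab.a AND t1.b = ab.b
    LEFT JOIN t2 ON t2.a = ab.a AND t2.b = ab.b
    ORDER BY ab.a, ab.b
""").fetchall()

for row in rows:
    print(row)
```

Each key pair appears once, and the status comes from checking which side of the two left joins came back NULL.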

Filter data based on result set of group and count

I have the following table
Col1 Col2 Col3
A1 B1 C1
A1 B1 C2
A1 B2 C1
A1 B2 C2
A1 B2 C3
A2 B1 C1
A2 B1 C2
A2 B2 C1
A2 B2 C2
From this table I want the distinct values of Col1 for which the (Col1, Col2) groups do not all have the same row count. The only possible answer in the table above is A1.
The following query gives me the count of each col1 and col2.
select col1, col2, count(*) from table
group by col1, col2;
Col1 Col2 Count
A1 B1 2
A1 B2 3
A2 B1 2
A2 B2 2
From the above query I can see that A1 has two groups with different counts. How do I return A1 in a single query?

You can use another level of aggregation:
select col1
from (select col1, col2, count(*) as cnt
from table
group by col1, col2
) t
group by col1
having min(cnt) <> max(cnt);
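A quick check of the two-level aggregation on the question's rows, run through Python's sqlite3 module. SQLite and the table name `items` are assumptions (the placeholder name `table` is a reserved word, so it is swapped for this sketch).

```python
import sqlite3

# Assumption: SQLite in memory; "items" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        ("A1", "B1", "C1"), ("A1", "B1", "C2"),
        ("A1", "B2", "C1"), ("A1", "B2", "C2"), ("A1", "B2", "C3"),
        ("A2", "B1", "C1"), ("A2", "B1", "C2"),
        ("A2", "B2", "C1"), ("A2", "B2", "C2"),
    ],
)

# Inner query: row count per (col1, col2). Outer query: keep col1 values
# whose smallest and largest group counts differ.
result = [
    row[0]
    for row in conn.execute("""
        SELECT col1
        FROM (SELECT col1, col2, count(*) AS cnt
              FROM items
              GROUP BY col1, col2) t
        GROUP BY col1
        HAVING min(cnt) <> max(cnt)
    """)
]
print(result)
```

Comparing min(cnt) with max(cnt) works because the two are equal exactly when every group under a col1 value has the same count.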

Update one table from another table with duplicate keys

I am trying to merge data from one table into another.
Table 1 (Tab1)
ID col2 col3 col_to_update
1 s1 a1 null
2 s1 a2 null
3 s1 a2 null
4 s2 a1 null
5 s3 a1 null
6 s4 a1 null
Table 2 (Tab2)
ID col2 col3 col4
10 s1 a1 v1
11 s1 a1 v2
12 s1 a2 v3
13 s2 a1 v4
14 s3 a1 v5
15 s4 a1 v6
16 s4 a1 v7
I am trying to map column col4 from table Tab2 into column col_to_update in table Tab1, matching on Tab1.col2 = Tab2.col2 and Tab1.col3 = Tab2.col3, to get the expected output below:
Expected Output
ID col2 col3 col4
1 s1 a1 v1
2 s1 a2 v3
3 s1 a2 v3
4 s2 a1 v4
5 s3 a1 v5
6 s4 a1 v6
I tried the query below without success:
MERGE INTO Tab1 x1
USING
(
SELECT t1.id as t1id, t2.id as t2id, t2.col2 t2col2, t2.col3 t2col3, t2.col4 from Tab2 t2
JOIN Tab1 t1 ON t2.col2 = t1.col2 AND t2.col3 = t1.col3
) x2
ON (x1.id = x2.t1id)
WHEN MATCHED THEN UPDATE SET
x1.col_to_update = x2.col4;
Is there a way to get the expected output?

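You simply want to update tab1. Some (col2, col3) pairs match more than one row in tab2, so pick a single value per row with an aggregate: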
update tab1
set col_to_update =
(
select min(tab2.col4) -- or whichever value you want to use
from tab2
where tab2.col2 = tab1.col2
and tab2.col3 = tab1.col3
);
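The update can be tried against the question's data with the sketch below. It assumes SQLite via Python's sqlite3 module (the question does not name the database, although MERGE suggests another engine), and the column types are illustrative.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (id INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT, col_to_update TEXT);
    CREATE TABLE tab2 (id INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT, col4 TEXT);
""")
conn.executemany(
    "INSERT INTO tab1 (id, col2, col3) VALUES (?, ?, ?)",
    [(1, "s1", "a1"), (2, "s1", "a2"), (3, "s1", "a2"),
     (4, "s2", "a1"), (5, "s3", "a1"), (6, "s4", "a1")],
)
conn.executemany(
    "INSERT INTO tab2 VALUES (?, ?, ?, ?)",
    [(10, "s1", "a1", "v1"), (11, "s1", "a1", "v2"), (12, "s1", "a2", "v3"),
     (13, "s2", "a1", "v4"), (14, "s3", "a1", "v5"), (15, "s4", "a1", "v6"),
     (16, "s4", "a1", "v7")],
)

# min() collapses duplicate matches (v1/v2 and v6/v7) to one value per row.
conn.execute("""
    UPDATE tab1
    SET col_to_update = (
        SELECT min(tab2.col4)
        FROM tab2
        WHERE tab2.col2 = tab1.col2
          AND tab2.col3 = tab1.col3
    )
""")

result = dict(conn.execute("SELECT id, col_to_update FROM tab1 ORDER BY id"))
print(result)
```

Because the update has no WHERE clause, a tab1 row without any match in tab2 would be set to NULL; every row has a match in this data set.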
