Presto - Concat multiple tables using unique identifier - sql

I have multiple tables in the following format:
table users -
ID lang
1 EN
2 EN
3 DE
table A -
ID event1 event2
1 5 1
2 null 1
3 11 null
table B -
ID event1 event10
1 2 1
3 2 null
After joining the tables on the ID column, my final table would look like this:
final_table -
ID lang A_event1 A_event2 B_event1 B_event10
1 EN 5 1 2 1
2 EN null 1 null null
3 DE 11 null 2 null
I have several issues here. First, how do I write the joins so that the column aliases carry the table names and the final column names are unique, even though the event columns have the same names in different tables? Second, I would like missing values to come out as null (for example, table B has no row for user ID = 2).
My attempts so far were not successful: the column names were duplicated instead of unique, and missing values were not filled with nulls properly.
Example of what I already tried:
select t1.*, t2.*, t3.*
from users t1
left join
A t2
using (ID)
left join
B t3
using (ID)
I can construct the query programmatically to provide flexibility, but I would like to know the proper syntax for such a case.
Thanks.

Your attempt with two left joins looks quite good. I would, however, suggest not using the using (ID) syntax to join the tables: with three tables involved, it is not obvious which id column is being referred to, which could lead to unexpected results. Explicit on conditions and explicit column aliases make the intent clear:
select
u.id,
u.lang,
ta.event1 A_event1,
ta.event2 A_event2,
tb.event1 B_event1,
tb.event10 B_event10
from users u
left join A ta on ta.id = u.id
left join B tb on tb.id = u.id
I don't see how this query would generate duplicate ids in the resultset (as long as the ids are unique in each table, as shown in your sample data).
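The explicit-alias join can be checked against the sample data from the question. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Presto (an assumption made so the example is runnable anywhere), and the table definitions are inferred from the sample rows.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Presto (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, lang TEXT);
    CREATE TABLE A (id INTEGER, event1 INTEGER, event2 INTEGER);
    CREATE TABLE B (id INTEGER, event1 INTEGER, event10 INTEGER);
    INSERT INTO users VALUES (1, 'EN'), (2, 'EN'), (3, 'DE');
    INSERT INTO A VALUES (1, 5, 1), (2, NULL, 1), (3, 11, NULL);
    INSERT INTO B VALUES (1, 2, 1), (3, 2, NULL);
""")

# Every event column gets an alias prefixed with its table name,
# so the output column names are unique.
cursor = conn.execute("""
    SELECT u.id       AS id,
           u.lang     AS lang,
           ta.event1  AS A_event1,
           ta.event2  AS A_event2,
           tb.event1  AS B_event1,
           tb.event10 AS B_event10
    FROM users u
    LEFT JOIN A ta ON ta.id = u.id
    LEFT JOIN B tb ON tb.id = u.id
    ORDER BY u.id
""")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)  # SQL NULL is shown as None
```

User 2 has no row in B, so both B columns come back as NULL for that user, which is the behaviour the question asks for.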

If the non-id columns in the tables were unique, then you could express this as:
select *
from users u left join
A
using (ID) left join
B
using (ID);
The id means the same thing in the three tables, so it is appropriate to use using. In fact, using is very handy when working with outer joins (although more so with full join).
I'm not a big fan of using select *. And it is not appropriate in this case because the columns are not unique. So a fine way to write the query is:
select u.*,
a.event1 as a_event1, a.event2 as a_event2,
b.event1 as b_event1, b.event10 as b_event10
from users u left join
A
using (ID) left join
B
using (ID);

Related

Exclude one item with different corelated value in the next column SQL

I have two tables:
acc_num ser_code
1 A
1 B
1 C
2 C
2 D
and the second one is:
ser_code value
A 5
B 8
C 10
D 15
I want to exclude every account that has at least one service code with a value of 10 or 15.
Because my data set is huge, I want to use NOT EXISTS, but it only excludes individual combinations of acc_num and ser_code.
I want to exclude the acc_num together with all of its ser_code rows, because one of its ser_code values meets my criteria.
I used:
select acc_num, ser_code
from table1 t1
where NOT EXISTS (select 1
FROM table2 t2 where t2.ser_code = t1.ser_code and t2.value in (10, 15))
The output of the above code is:
acc_num ser_code
1 A
1 B
The desired output would be empty.
here you are
select t1.acc_num,t1.ser_code from table1 t1, table2 t2
where (t1.ser_code=t2.ser_code and t2.value not in (10,15))
and t1.acc_num not in
(
select t3.acc_num from table1 t3,table2 t4
where t1.acc_num=t3.acc_num and t3.ser_code=t4.ser_code
and t4.value in (10,15)
) ;
This can be achieved in many ways, but I consider NOT EXISTS the best option. The problem with your query is that account 1 also has ser_code values whose value is not 10 or 15, so you still get A and B in the result.
To overcome that, you must correlate on acc_num inside the subquery.
Query 1 (using NOT EXISTS):
As you can see in the query below, I have included acc_num inside the subquery so that the filter works properly:
SELECT DISTINCT a.acc_num, a.ser_code
FROM one as a
WHERE NOT EXISTS
(
SELECT DISTINCT one.acc_num
FROM two
INNER JOIN one
ON one.ser_code=two.ser_code
WHERE value IN (10,15) AND a.acc_num=one.acc_num
)
Query 2 (using LEFT JOIN):
NOT EXISTS is often confusing due to its nature (it is usually fast, though). A LEFT JOIN can be used instead, although it can be more expensive than NOT EXISTS:
SELECT DISTINCT a.acc_num, a.ser_code
FROM one as a
LEFT JOIN
(
SELECT DISTINCT one.acc_num
FROM two
INNER JOIN one
ON one.ser_code=two.ser_code
WHERE value IN (10,15)
) b
ON a.acc_num=b.acc_num
WHERE b.acc_num IS NULL
Query 3 (using NOT IN):
NOT IN also achieves this with a readable query, but it can be more expensive than both of the methods above:
SELECT DISTINCT a.acc_num, a.ser_code
FROM one as a
WHERE a.acc_num NOT IN
(
SELECT DISTINCT one.acc_num
FROM two
INNER JOIN one
ON one.ser_code=two.ser_code
WHERE value IN (10,15)
)
All three yield the same result. I would prefer to go with NOT EXISTS.
See demo with time consumption in db<>fiddle
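To make the account-level exclusion concrete, here is a minimal sketch that runs the NOT EXISTS variant with Python's built-in sqlite3 module (a stand-in engine, chosen as an assumption because the question does not name a database). The table names one and two follow the queries above. Account 3 is a hypothetical extra row that is not in the question; it is added only so that the result is not empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (acc_num INTEGER, ser_code TEXT);
    CREATE TABLE two (ser_code TEXT, value INTEGER);
    INSERT INTO one VALUES (1, 'A'), (1, 'B'), (1, 'C'), (2, 'C'), (2, 'D');
    -- Hypothetical account (not in the question) with no excluded code.
    INSERT INTO one VALUES (3, 'A');
    INSERT INTO two VALUES ('A', 5), ('B', 8), ('C', 10), ('D', 15);
""")

# The subquery is correlated on acc_num, not on ser_code, so a single
# excluded service code removes every row of that account.
QUERY = """
    SELECT a.acc_num, a.ser_code
    FROM one AS a
    WHERE NOT EXISTS (
        SELECT 1
        FROM one AS o
        JOIN two AS t ON t.ser_code = o.ser_code
        WHERE o.acc_num = a.acc_num
          AND t.value IN (10, 15)
    )
    ORDER BY a.acc_num, a.ser_code
"""
rows_with_extra_account = conn.execute(QUERY).fetchall()

# Remove the hypothetical account to get back to the question's data.
conn.execute("DELETE FROM one WHERE acc_num = 3")
rows_question_data = conn.execute(QUERY).fetchall()

print(rows_with_extra_account)
print(rows_question_data)
```

With only the question's rows, both accounts own a service code valued 10 or 15, so the result is empty, as the question expects.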

Duplicate rows in left join

I have 2 tables. One column holds about 200,000 values in total, of which about 100,000 are null and the rest are integers. The other table has only integer values. When I use a left join on this column, I get a lot of duplicate rows. Is it OK to use a left join here?
Table 1:
Column 1
2
3
5
null
null
Table 2:
Column 1
1
2
3
so on
Your example is really odd. Why would anyone have null values in an ID field? But anyway.
If you need fields from table 2 in the resultset as you say above then you must use an INNER JOIN not a LEFT JOIN
Something like:
SELECT DISTINCT a.id, a.name, b.someOtherField
FROM Table1 a
INNER JOIN Table2 b ON a.id = b.id
Please note: Since only the ID field of table 1 has null values there will be no records selected from table 1 with id IS NULL because they have no equivalent in table 2. Adding the DISTINCT keyword helps in case this query would still produce duplicates.
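The effect of the null ids can be seen on the small sample from the question. This is a sketch only: it uses Python's built-in sqlite3 module as a stand-in engine, and the single column name id is an assumption, since the question does not show the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER);
    CREATE TABLE Table2 (id INTEGER);
    INSERT INTO Table1 VALUES (2), (3), (5), (NULL), (NULL);
    INSERT INTO Table2 VALUES (1), (2), (3);
""")

# LEFT JOIN keeps every Table1 row; NULL never equals anything,
# so the NULL rows come back with no partner from Table2.
left_rows = conn.execute("""
    SELECT a.id, b.id
    FROM Table1 a
    LEFT JOIN Table2 b ON a.id = b.id
""").fetchall()

# INNER JOIN keeps only rows that have a match in Table2.
inner_rows = conn.execute("""
    SELECT a.id, b.id
    FROM Table1 a
    INNER JOIN Table2 b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

print(len(left_rows), left_rows.count((None, None)))
print(inner_rows)
```

The two (NULL, NULL) rows in the left join result are the unmatched null ids, which can look like duplicates; the inner join drops them.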

add new column with matching id in both Table 1 and Table 2

I have two tables,
in table1 I have 5 rows and
in table2 3 rows
table1:
#no---Name---value
1-----John---100
2-----Cooper-200
3-----Mil----300
4-----Key----200
5-----Van----300
Table 2:
#MemID-#no---FavID
19-----1-----2
21-----1-----3
22-----2-----5
Now expected result:
#no---name---value---MyFav
1-----John---100-----NULL
2-----Cooper-200-----1
3-----Mil----300-----1
4-----Key----200-----NULL
5-----Van----300-----NULL
1 indicates - My favorites
MyFav - new column ( alias)
This is the expected result, please suggest how to get it.
I think I understand the logic. You want MyFav to be marked as a 1 if that row is a favorite of John. You can do this with a left join and some more filtering:
select t1.*,
(case when t2.#no is not null then 1 end) as MyFav
from table1 t1 left join
table2 t2
on t1.#no = t2.FavId and
t2.#no = (select tt1.#no from table1 tt1 where tt1.Name = 'John');
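The left join with the extra filter can be tried on the sample rows. This is a hedged sketch using Python's built-in sqlite3 module as a stand-in engine. The column names num, mem_id and fav_id are assumptions that replace #no, #MemID and FavID, because a name starting with # needs engine-specific quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- num stands in for the #no column of the question (assumption).
    CREATE TABLE table1 (num INTEGER, name TEXT, value INTEGER);
    CREATE TABLE table2 (mem_id INTEGER, num INTEGER, fav_id INTEGER);
    INSERT INTO table1 VALUES
        (1, 'John', 100), (2, 'Cooper', 200), (3, 'Mil', 300),
        (4, 'Key', 200), (5, 'Van', 300);
    INSERT INTO table2 VALUES (19, 1, 2), (21, 1, 3), (22, 2, 5);
""")

# A row is flagged when some table2 row owned by John points at it
# through fav_id; all other rows keep NULL in MyFav.
rows = conn.execute("""
    SELECT t1.num, t1.name, t1.value,
           CASE WHEN t2.num IS NOT NULL THEN 1 END AS MyFav
    FROM table1 t1
    LEFT JOIN table2 t2
           ON t1.num = t2.fav_id
          AND t2.num = (SELECT tt1.num FROM table1 tt1
                        WHERE tt1.name = 'John')
    ORDER BY t1.num
""").fetchall()

for row in rows:
    print(row)
```

Van stays NULL even though row 22 of table2 points at number 5, because that favorite belongs to Cooper rather than John.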
You could also try a natural join. Note that a natural join matches on all columns that have the same name in both tables, not on the primary key as such. In your case the shared column is #no, which I think is the primary key of table1.
For more information on natural join please visit SQL Joins

SQL Inner Join and further requirements

I would like to return table entries from an inner join which do not have any matching entries in the second column.
Let's consider the following two tables:
Table one:
Name Number
A 1
A 2
A 4
Table two:
Name ID
A 3
The query should return Name=A ID=3. If ID would be 4, the query should not return anything. Is this even possible in SQL? Thanks for any hints!
Edit:
the joined table would look like this:
Name Number ID
A 1 3
A 2 3
A 4 3
So if I do this query I get no entries in the result set:
SELECT * FROM TABLE_ONE INNER JOIN TABLE_TWO ON TABLE_ONE.NAME=TABLE_TWO.NAME WHERE NUMBER=ID
Exactly in this situation I would like to get the Name returned!
Yes, instead of using an INNER join, use a LEFT or a FULL OUTER join. This will allow null values from the other table to appear when you have a value in one of your tables.
The FULL OUTER JOIN keyword returns all rows from the left table (table1) and from the right table (table2).
The LEFT JOIN keyword returns all rows from the left table (table1), with the matching rows in the right table (table2). The result is NULL in the right side when there is no match. (There is also a RIGHT join, but it does the same thing as the left join, just returning all rows from the RIGHT table instead of the left).
SELECT *
FROM Table2
WHERE NOT EXISTS (
SELECT *
FROM Table1
WHERE Table1.Name = Table2.Name AND Table1.Number = Table2.ID
)
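The NOT EXISTS query can be checked against both cases described in the question (ID = 3 and ID = 4). This is a sketch under assumptions: Python's built-in sqlite3 module is used as a stand-in engine, and the lowercase table and column names are chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (name TEXT, number INTEGER);
    CREATE TABLE table_two (name TEXT, id INTEGER);
    INSERT INTO table_one VALUES ('A', 1), ('A', 2), ('A', 4);
    INSERT INTO table_two VALUES ('A', 3);
""")

# Return table_two rows for which no table_one row has the same name
# and a number equal to the id.
QUERY = """
    SELECT t2.name, t2.id
    FROM table_two t2
    WHERE NOT EXISTS (
        SELECT 1
        FROM table_one t1
        WHERE t1.name = t2.name
          AND t1.number = t2.id
    )
"""
rows_when_id_is_3 = conn.execute(QUERY).fetchall()

# Second case from the question: ID is 4, which matches Number = 4.
conn.execute("UPDATE table_two SET id = 4")
rows_when_id_is_4 = conn.execute(QUERY).fetchall()

print(rows_when_id_is_3)
print(rows_when_id_is_4)
```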
As @rhealitycheck has said, a full outer join would work. I found this blog post helpful in explaining joins. P.S. I can't leave comments (otherwise I would have).

left join on MS SQL 2008 R2

I'm trying to left join two tables. Table A contains 100 unique records with field_a_1, field_a_2, field_a_3. The combination of field_a_1 and field_a_2 is unique.
Table B has multi-million records with multiple fields. field_b_1 is same as field_a_1 and field_b_2 is same as field_a_2.
I join the two tables together like this:
select a.*, b.*
from a
left join b
on field_a_1 = field_b_1
and field_a_2 = field_b_2
Instead of getting 100 records, I get multi-million records. Why is this?
Because table B has multiple rows for each table A entry.
For example:
TableA (ID)
1
2
3
TableB (ID, data)
1 hello
1 world
1 foo
1 bar
2 data
2 words
2 more
3 words
3 boring
If you left join from TableA to TableB, you will get a row for every TableB record that matches a TableA record - ie. all of them.
Can you explain what results you are looking for?
Because a left join returns all of the rows from the first table + all of the matching rows from the second table. Which of the millions of matching rows did you expect to get?
Left join or inner join don't really make a difference. A JOIN will return all rows that match the join condition. So if table b has millions of rows that match the JOIN criteria, then all the rows will be returned.
Depending on what you wish to accomplish you should consider using the DISTINCT keyword or GROUP BY to perform aggregate functions.
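The row multiplication, and the GROUP BY alternative, can be reproduced with the small TableA and TableB example given above. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for SQL Server; the join and aggregate used here are standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER);
    CREATE TABLE TableB (id INTEGER, data TEXT);
    INSERT INTO TableA VALUES (1), (2), (3);
    INSERT INTO TableB VALUES
        (1, 'hello'), (1, 'world'), (1, 'foo'), (1, 'bar'),
        (2, 'data'), (2, 'words'), (2, 'more'),
        (3, 'words'), (3, 'boring');
""")

# One output row per matching TableB row, not one per TableA row.
joined = conn.execute("""
    SELECT a.id, b.data
    FROM TableA a
    LEFT JOIN TableB b ON b.id = a.id
""").fetchall()

# Aggregating with GROUP BY collapses the result back to one row
# per TableA entry.
per_id = conn.execute("""
    SELECT a.id, COUNT(b.id) AS matches
    FROM TableA a
    LEFT JOIN TableB b ON b.id = a.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

print(len(joined))
print(per_id)
```

Three rows in TableA become nine rows after the join because every TableB row matches; the grouped query returns three rows again, each with its match count.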