How to join two total tables using sql?

For a university assignment we have two tables in SQL:
table1:
column_name1 number_P1
PARIS 10
LISBOA 20
RIO 30
table2:
column_name2 number_P2
PARIS 100
NEW YORK 300
I need to join the two tables by adding the total number of people in each city. So I tried to do:
SELECT table1.column_name1,
number_P2 + number_P1 AS TOTAL
FROM table1
LEFT JOIN table2 ON table1.column_name1 = table2.column_name2;
However, if a city A appears in table 1 and does not appear in table 2 this would not work. The same would happen if a City B appears in table 2 and does not appear in table 1. How can I generalize these situations?
Desired output:
column_name number_P
PARIS 110
LISBOA 20
RIO 30
NEW YORK 300

You can use UNION ALL with SUM instead of a JOIN:
SELECT column_name,
SUM(number_P) number_P
FROM (
SELECT column_name1 as column_name,number_P1 as number_P
FROM table1
UNION ALL
SELECT column_name2,number_P2
FROM table2
) t1
GROUP BY column_name
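To check the UNION ALL approach against the sample data from the question, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. The in-memory database and the helper name total_per_city are assumptions made for illustration only; the SQL itself is the query shown above.

```python
import sqlite3

def total_per_city():
    """Return {city: total} by stacking both tables and summing per city."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.executescript("""
        CREATE TABLE table1 (column_name1 TEXT, number_P1 INTEGER);
        CREATE TABLE table2 (column_name2 TEXT, number_P2 INTEGER);
        INSERT INTO table1 VALUES ('PARIS', 10), ('LISBOA', 20), ('RIO', 30);
        INSERT INTO table2 VALUES ('PARIS', 100), ('NEW YORK', 300);
    """)
    rows = conn.execute("""
        SELECT column_name, SUM(number_P) AS number_P
        FROM (
            SELECT column_name1 AS column_name, number_P1 AS number_P FROM table1
            UNION ALL
            SELECT column_name2, number_P2 FROM table2
        ) t1
        GROUP BY column_name
    """).fetchall()
    conn.close()
    return dict(rows)

totals = total_per_city()
print(totals)
```

Because every row from both tables is stacked before grouping, a city that exists in only one table still gets a row, which is exactly the case the LEFT JOIN attempt missed.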

Another way to achieve this without a subquery is a FULL JOIN (not every engine supports it; MySQL, for example, does not):
SELECT IFNULL(table1.column_name1,table2.column_name2) AS ColumnName,
(IFNULL(number_P2,0)+ IFNULL(number_P1,0)) AS TOTAL
FROM table1
FULL JOIN table2 ON table1.column_name1 = table2.column_name2;
Output:
ColumnName TOTAL
PARIS 110
LISBOA 20
RIO 30
NEW YORK 300
To replace 'RIO' with 'RIO DE JANEIRO':
SELECT CASE IFNULL(table1.column_name1,table2.column_name2)
WHEN 'RIO' THEN 'RIO DE JANEIRO'
ELSE IFNULL(table1.column_name1,table2.column_name2) END AS ColumnName,
(IFNULL(number_P2,0)+ IFNULL(number_P1,0)) AS TOTAL
FROM table1
FULL JOIN table2 ON table1.column_name1 = table2.column_name2;
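Where FULL JOIN is not available, the same result can usually be obtained by combining a LEFT JOIN with the rows that exist only in the second table. The sketch below is one possible emulation, not the answerer's query; it runs on SQLite through Python's sqlite3 module, and the helper name full_join_totals is hypothetical.

```python
import sqlite3

# Emulated FULL JOIN: all table1 rows (matched or not) plus table2-only rows.
FULL_JOIN_EMULATION = """
    SELECT t1.column_name1 AS ColumnName,
           COALESCE(t1.number_P1, 0) + COALESCE(t2.number_P2, 0) AS TOTAL
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.column_name1 = t2.column_name2
    UNION ALL
    SELECT t2.column_name2, COALESCE(t2.number_P2, 0)
    FROM table2 t2
    LEFT JOIN table1 t1 ON t1.column_name1 = t2.column_name2
    WHERE t1.column_name1 IS NULL  -- keep only cities missing from table1
"""

def full_join_totals():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (column_name1 TEXT, number_P1 INTEGER);
        CREATE TABLE table2 (column_name2 TEXT, number_P2 INTEGER);
        INSERT INTO table1 VALUES ('PARIS', 10), ('LISBOA', 20), ('RIO', 30);
        INSERT INTO table2 VALUES ('PARIS', 100), ('NEW YORK', 300);
    """)
    rows = conn.execute(FULL_JOIN_EMULATION).fetchall()
    conn.close()
    return dict(rows)

print(full_join_totals())
```

This assumes each city appears at most once per table, as in the sample data; with repeated cities the UNION ALL plus SUM form is the safer choice.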

Related

Getting records from 2 tables with common and uncommon columns

Below is a similar example of the issue I have.
Table 1:
Patient ID | Name | Check in Date | order name | preformed by
1 | Jack | 12/sep/2002 | xray | Dr.Amal
2 | Nora | 15/oct/2002 | ultrasound | Dr.Goerge
1 | Jack | 13/nov/2003 | Medicine | Dr.Fred
Table 2:
Patient ID | Name | Check in Date | order name
1 | Jack | 14/Jun/2002 | xray 2
2 | Nora | 15/oct/2002 | ultrasound
1 | Jack | 13/nov/2003 | Medicine
3 | Rafael | 13/nov/2003 | Vaccine
The result I need is as the following:
Name | Check in Date | order name | preformed by
Jack | 12/sep/2002 | xray | Dr.Amal
Nora | 15/oct/2002 | ultrasound | Dr.Goerge
Jack | 13/nov/2003 | Medicine | Dr.Fred
Jack | 14/Jun/2002 | xray 2 | Null
Rafael | 13/nov/2003 | Vaccine | Null
As you can see, the result I need is all records from table 1 and all records from table 2 with no duplicates, matched on the common fields, plus the 'preformed by' column from table 1. I tried using UNION as follows:
SELECT Name, Check_in_Date, order_name,preformed_by
FROM table1
UNION
SELECT Name, Check_in_Date, order_name,''
FROM table2
The result I get is two records for each patient with the same date, one with 'preformed by' filled in and one with null, as follows:
Name | Check in Date | order name | preformed by
Jack | 12/sep/2002 | xray | Dr.Amal
Nora | 15/oct/2002 | ultrasound | Dr.Goerge
Nora | 15/oct/2002 | ultrasound | Null
Jack | 13/nov/2003 | Medicine | Dr.Fred
Jack | 13/nov/2003 | Medicine | Null
Jack | 14/Jun/2002 | xray 2 | Null
Rafael | 13/nov/2003 | Vaccine | Null
If the same ID has the same check-in date in both tables, the query must return the 'preformed by' value from table 1, not null. How can I do this?
Thank you.
What you need is a FULL JOIN matching on those three columns, along with the NVL() function to bring in the values from table2 wherever table1 returns null, such as:
SELECT NVL(t1.name,t2.name) AS name,
NVL(t1.check_in_date,t2.check_in_date) AS check_in_date,
NVL(t1.order_name,t2.order_name) AS order_name,
t1.preformed_by
FROM table1 t1
FULL JOIN table2 t2
ON t1.name = t2.name
AND t1.check_in_date = t2.check_in_date
AND t1.order_name = t2.order_name
Another method uses UNION to filter out duplicates and then applies an OUTER JOIN, such as:
SELECT tt.name, tt.check_in_date, tt.order_name, t1.preformed_by
FROM (
SELECT name, check_in_date, order_name FROM table1 UNION
SELECT name, check_in_date, order_name FROM table2
) tt
LEFT JOIN table1 t1
ON t1.name = tt.name
AND t1.check_in_date = tt.check_in_date
AND t1.order_name = tt.order_name
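As a quick way to verify the UNION plus LEFT JOIN variant against the sample rows from the question, here is a hedged sketch using Python's sqlite3 module. The column names patient_id, check_in_date and order_name follow the answer's spelling, and the helper name combined_orders is hypothetical.

```python
import sqlite3

def combined_orders():
    """Distinct orders from both tables, with preformed_by taken from table1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (patient_id INTEGER, name TEXT, check_in_date TEXT,
                             order_name TEXT, preformed_by TEXT);
        CREATE TABLE table2 (patient_id INTEGER, name TEXT, check_in_date TEXT,
                             order_name TEXT);
        INSERT INTO table1 VALUES
            (1, 'Jack', '12/sep/2002', 'xray', 'Dr.Amal'),
            (2, 'Nora', '15/oct/2002', 'ultrasound', 'Dr.Goerge'),
            (1, 'Jack', '13/nov/2003', 'Medicine', 'Dr.Fred');
        INSERT INTO table2 VALUES
            (1, 'Jack', '14/Jun/2002', 'xray 2'),
            (2, 'Nora', '15/oct/2002', 'ultrasound'),
            (1, 'Jack', '13/nov/2003', 'Medicine'),
            (3, 'Rafael', '13/nov/2003', 'Vaccine');
    """)
    rows = conn.execute("""
        SELECT tt.name, tt.check_in_date, tt.order_name, t1.preformed_by
        FROM (
            SELECT name, check_in_date, order_name FROM table1
            UNION                      -- UNION removes the duplicate keys
            SELECT name, check_in_date, order_name FROM table2
        ) tt
        LEFT JOIN table1 t1
          ON t1.name = tt.name
         AND t1.check_in_date = tt.check_in_date
         AND t1.order_name = tt.order_name
    """).fetchall()
    conn.close()
    return rows

for row in combined_orders():
    print(row)
```

The duplicates disappear because the UNION runs only over the three shared columns; preformed_by is attached afterwards, so a row present in both tables cannot split into a filled and a null version.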

Changing record values based on whether there are duplicates when two tables are combined

I know I can join Table #1 and Table #2 with a UNION and then filter out duplicate Id's using DISTINCT. However, for the duplicate contacts I'd like to change DrinkPreference to Coke/Pepsi.
Is this possible?
Starting Table #1
Id | FirstName | LastName | DrinkPreference
123 | Tom | Bannon | Pepsi
124 | Sarah | Smith | Pepsi
Starting Table #2
Id | FirstName | LastName | DrinkPreference
125 | Jim | Henry | Coke
123 | Tom | Bannon | Coke
Table? #3 - combined with DrinkPreference set to Coke/Pepsi where contact exists in both tables?
Id | FirstName | LastName | DrinkPreference
125 | Jim | Henry | Coke
123 | Tom | Bannon | Coke/Pepsi
124 | Sarah | Smith | Pepsi
You can try this one:
SELECT coalesce(t1.id, t2.id) AS id,
coalesce(t1.firstname, t2.firstname) AS firstname,
coalesce(t1.lastname, t2.lastname) AS lastname,
CASE WHEN t1.drinkpreference IS NULL THEN t2.drinkpreference
WHEN t2.drinkpreference IS NULL THEN t1.drinkpreference
-- table2 value first, so the sample data yields 'Coke/Pepsi'
ELSE t2.drinkpreference || '/' || t1.drinkpreference END AS drinkpreference
FROM table1 t1
FULL JOIN table2 t2 ON t1.id = t2.id
This is also achievable using unions and a join:
select distinct FirstName, LastName,
case when ct = 2 then 'Coke/Pepsi' else DrinkPreference end as DrinkPreference
from (
select FirstName, LastName, DrinkPreference, Id from table1
union all
select FirstName, LastName, DrinkPreference, Id from table2) a
left join
(
-- number of tables in which each Id appears
select count(1) as ct, Id from
(select Id from table1
union all
select Id from table2) t1
group by Id
) b on b.Id = a.Id
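A shorter variant of the same idea is to stack both tables and aggregate once per Id. The sketch below is an alternative formulation rather than either answer's query; it assumes each contact appears at most once per table and hard-codes 'Coke/Pepsi' as the second answer does. It runs on SQLite via Python's sqlite3 module, and the helper name merged_preferences is hypothetical.

```python
import sqlite3

def merged_preferences():
    """One row per Id; contacts found in both tables get 'Coke/Pepsi'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (Id INTEGER, FirstName TEXT, LastName TEXT,
                             DrinkPreference TEXT);
        CREATE TABLE table2 (Id INTEGER, FirstName TEXT, LastName TEXT,
                             DrinkPreference TEXT);
        INSERT INTO table1 VALUES (123, 'Tom', 'Bannon', 'Pepsi'),
                                  (124, 'Sarah', 'Smith', 'Pepsi');
        INSERT INTO table2 VALUES (125, 'Jim', 'Henry', 'Coke'),
                                  (123, 'Tom', 'Bannon', 'Coke');
    """)
    rows = conn.execute("""
        SELECT Id,
               MIN(FirstName) AS FirstName,
               MIN(LastName)  AS LastName,
               CASE WHEN COUNT(*) > 1 THEN 'Coke/Pepsi'   -- present in both tables
                    ELSE MIN(DrinkPreference) END AS DrinkPreference
        FROM (
            SELECT Id, FirstName, LastName, DrinkPreference FROM table1
            UNION ALL
            SELECT Id, FirstName, LastName, DrinkPreference FROM table2
        ) a
        GROUP BY Id
        ORDER BY Id
    """).fetchall()
    conn.close()
    return rows

for row in merged_preferences():
    print(row)
```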

BigQuery join a nested table onto another table

I am trying to join a table of some project data (table1), which has a nested array of project IDs, onto another table with project data (table2). Order is important here.
table1
proj_date num_proj_per_day proj_size proj_id(nested)
1/1/2020 4 150 a123
b456
c789
table2
proj_id(not nested) proj_loc lots_of_other_proj_fields....
a123 Los Angeles
b456 New York
c789 Los Angeles
d012 Denver
.... ....
desired outcome
proj_date num_proj_per_day proj_size proj_id(unnested) proj_loc
1/1/2020 4 150 a123 Los Angeles
1/1/2020 4 150 b456 New York
1/1/2020 4 150 c789 Los Angeles
I have been able to achieve this outcome if I write the SQL with table1 in the FROM clause, then CROSS JOIN UNNEST(proj_id), and then LEFT JOIN table2. The problem is that I need to have table2 in the FROM clause and then join table1 on the unnested proj_id. Order unfortunately matters because I have to merge this new dataset (table1) into an existing dataset/framework (table2) within Looker.
Example of what works to get the correct outcome but does not work for my application
SELECT
table1.*,
table2.proj_loc
FROM table1
CROSS JOIN UNNEST(table1.proj_id) as unnested
LEFT JOIN table2
ON table2.proj_id = unnested.proj_id
I am looking for something like the query below, but you cannot put UNNEST into the ON clause like this; BigQuery raises the error "Unexpected keyword UNNEST":
SELECT
table1.*,
table2.proj_loc
FROM table2
LEFT JOIN table1
ON UNNEST(table1.proj_id)=table2.proj_id
Thank you in advance and let me know if you need anymore clarifying information
Below is for BigQuery Standard SQL:
#standardSQL
SELECT proj_date, num_proj_per_day, proj_size, t2.*
FROM `project.dataset.table2` t2
JOIN `project.dataset.table1` t1
ON t2.proj_id IN UNNEST(t1.proj_id)
You can test and play with the above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table1` AS (
SELECT DATE '2020-01-01' proj_date, 4 num_proj_per_day, 150 proj_size, ['a123','b456','c789'] proj_id
),`project.dataset.table2` AS (
SELECT 'a123' proj_id, 'Los Angeles' proj_loc, 1 proj_field1, 2 proj_field2, 3 proj_field3 UNION ALL
SELECT 'b456', 'New York', 21, 22, 23 UNION ALL
SELECT 'c789', 'Los Angeles', 31, 32, 33 UNION ALL
SELECT 'd012', 'Denver', 41, 42, 43
)
SELECT proj_date, num_proj_per_day, proj_size, t2.*
FROM `project.dataset.table2` t2
JOIN `project.dataset.table1` t1
ON t2.proj_id IN UNNEST(t1.proj_id)
with output
Row proj_date num_proj_per_day proj_size proj_id proj_loc proj_field1 proj_field2 proj_field3
1 2020-01-01 4 150 a123 Los Angeles 1 2 3
2 2020-01-01 4 150 b456 New York 21 22 23
3 2020-01-01 4 150 c789 Los Angeles 31 32 33
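To make the semantics of the ON ... IN UNNEST(...) condition concrete, here is a plain-Python sketch of what the join computes. It does not call BigQuery; the lists of dicts stand in for the two tables, and the function name join_on_array_membership is hypothetical.

```python
from datetime import date

# table1: one row whose proj_id column is an array (nested) value.
table1 = [
    {"proj_date": date(2020, 1, 1), "num_proj_per_day": 4, "proj_size": 150,
     "proj_id": ["a123", "b456", "c789"]},
]

# table2: one row per project, proj_id is a plain string.
table2 = [
    {"proj_id": "a123", "proj_loc": "Los Angeles"},
    {"proj_id": "b456", "proj_loc": "New York"},
    {"proj_id": "c789", "proj_loc": "Los Angeles"},
    {"proj_id": "d012", "proj_loc": "Denver"},
]

def join_on_array_membership(t2_rows, t1_rows):
    """Inner join: keep (t2, t1) pairs where t2.proj_id is in t1's array."""
    joined = []
    for t2 in t2_rows:              # table2 drives the join, as in the answer
        for t1 in t1_rows:
            if t2["proj_id"] in t1["proj_id"]:
                joined.append({
                    "proj_date": t1["proj_date"],
                    "num_proj_per_day": t1["num_proj_per_day"],
                    "proj_size": t1["proj_size"],
                    **t2,           # all table2 columns, like t2.*
                })
    return joined

rows = join_on_array_membership(table2, table1)
for row in rows:
    print(row)
```

Since this is an inner join, the d012 row from table2 has no matching array and is dropped; a LEFT JOIN would be needed to keep it.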

How to fetch the unmatched records from two tables in Hive?

I have the two tables below, and I need to get only the unmatched records using Hive only.
Table1:
hive> select * from dept;
OK
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Table2:
hive> select * from dept_text;
OK
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
50 Software Bangalore
60 Housewife yellandu
Output: I need to get output like the below. Can someone help me with this?
50 Software Bangalore
60 Housewife yellandu
Use a left join from the dept_text table to dept, then keep only the rows where the dept id column is null:
select dt.* from dept_text dt
left join
dept d
on d.id=dt.id
where d.id is null;
Example:
desc dept;
--id int
--desc string
--city string
select * from dept;
--OK
--dept.id dept.desc dept.city
--10 ACCOUNTING NEW YORK
--20 RESEARCH DALLAS
--30 SALES CHICAGO
--40 OPERATIONS BOSTON
--if you want to join on desc column
select dt.* from dept_text dt
left join
dept d
on d.desc=dt.desc
where d.id is null;
--or if you want to join on id column
select dt.* from dept_text dt
left join
dept d
on d.id=dt.id
where d.id is null;
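The LEFT JOIN ... IS NULL pattern (an anti-join) is not specific to Hive. As a rough check of the logic, the sketch below runs the id-based variant on SQLite through Python's sqlite3 module. The column name dname is a stand-in chosen here to avoid the keyword desc, and the helper name unmatched_departments is hypothetical.

```python
import sqlite3

def unmatched_departments():
    """Rows of dept_text that have no row with the same id in dept."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dept      (id INTEGER, dname TEXT, city TEXT);
        CREATE TABLE dept_text (id INTEGER, dname TEXT, city TEXT);
        INSERT INTO dept VALUES
            (10, 'ACCOUNTING', 'NEW YORK'), (20, 'RESEARCH', 'DALLAS'),
            (30, 'SALES', 'CHICAGO'), (40, 'OPERATIONS', 'BOSTON');
        INSERT INTO dept_text VALUES
            (10, 'ACCOUNTING', 'NEW YORK'), (20, 'RESEARCH', 'DALLAS'),
            (30, 'SALES', 'CHICAGO'), (40, 'OPERATIONS', 'BOSTON'),
            (50, 'Software', 'Bangalore'), (60, 'Housewife', 'yellandu');
    """)
    rows = conn.execute("""
        SELECT dt.*
        FROM dept_text dt
        LEFT JOIN dept d ON d.id = dt.id
        WHERE d.id IS NULL        -- no partner row found in dept
        ORDER BY dt.id
    """).fetchall()
    conn.close()
    return rows

print(unmatched_departments())
```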

Reconciliation Automation Query

I have one database, and from time to time I change some part of a query as requirements change.
I want to keep a record of the results of these queries both before and after the change in one table, and show the queries whose results differ.
For Example,
Consider following table
emp_id country salary
---------------------
1 usa 1000
2 uk 2500
3 uk 1200
4 usa 3500
5 usa 4000
6 uk 1100
Before Query:
select count(emp_id) as count,country from table where salary>2000 group by country;
Before Result:
count country
2 usa
1 uk
After Query:
select count(emp_id) as count,country from table where salary<2000 group by country;
After Query Result:
count country
2 uk
1 usa
My Final Result or Table I want is:
column 1 | column 2 | column 3 | column 4
2 | usa | 2 | uk
1 | uk | 1 | usa
...... but if the query results are the same, then they shouldn't be shown in this table.
Thanks in advance.
I believe that you can use the same approach as here.
select t1.*, t2.* -- if you need specific columns without rn, then you have to list them here
from
(
select t.*, row_number() over (order by count) rn
from
(
-- query #1 (no trailing semicolon inside a subquery)
select count(emp_id) as count,country from table where salary>2000 group by country
) t
) t1
full join
(
select t.*, row_number() over (order by count) rn
from
(
-- query #2 (no trailing semicolon inside a subquery)
select count(emp_id) as count,country from table where salary<2000 group by country
) t
) t2 on t1.rn = t2.rn
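If the comparison can be done outside the database, the pairing by row number can also be sketched in a few lines of Python. This is an alternative to the SQL above, not a translation of it: it reads "results are the same" as "both result sets are identical", uses a hypothetical table name emp (since table is a reserved word), and orders each result by count and country so the pairing is deterministic.

```python
import sqlite3
from itertools import zip_longest

BEFORE = """SELECT COUNT(emp_id) AS cnt, country FROM emp
            WHERE salary > 2000 GROUP BY country ORDER BY cnt DESC, country"""
AFTER = """SELECT COUNT(emp_id) AS cnt, country FROM emp
           WHERE salary < 2000 GROUP BY country ORDER BY cnt DESC, country"""

def reconcile(conn, before_sql, after_sql):
    """Pair result rows side by side; return [] when both results are equal."""
    before = conn.execute(before_sql).fetchall()
    after = conn.execute(after_sql).fetchall()
    if before == after:
        return []                      # nothing to report
    empty = (None, None)               # padding when one result is shorter
    return [(b or empty) + (a or empty)
            for b, a in zip_longest(before, after)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_id INTEGER, country TEXT, salary INTEGER);
    INSERT INTO emp VALUES (1, 'usa', 1000), (2, 'uk', 2500), (3, 'uk', 1200),
                           (4, 'usa', 3500), (5, 'usa', 4000), (6, 'uk', 1100);
""")
diff = reconcile(conn, BEFORE, AFTER)
for row in diff:
    print(row)
```

zip_longest plays the role of the FULL JOIN on rn: when one query returns more rows than the other, the missing side is padded with None values instead of being dropped.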