Selecting Records Matching Two or More Related Tables - sql

I have a 'persons' table:
person_id name
100 jack
125 jill
201 jane
And many sub-tables that the person_id could appear in:
'rowing'
id person_id
1 100
2 201
'swimming'
id person_id
1 125
2 201
'running'
id person_id
1 201
'throwing'
id person_id
1 125
2 201
I would like to be able to select all people who are involved in two activities, regardless of which two.

As the great @TimSchmelter (great first name) mentioned, you should really have a single PersonActivities table with an id identifying the particular activity.
That being said, if you must work with your current schema, one option would be to UNION together the activity tables, and then count which persons have two or more records, meaning that they participated in two or more activities.
SELECT t1.person_id, t1.name
FROM persons t1
INNER JOIN
(
    SELECT t.person_id, COUNT(t.person_id) AS activityCount
    FROM
    (
        SELECT person_id FROM rowing
        UNION ALL
        SELECT person_id FROM swimming
        UNION ALL
        SELECT person_id FROM running
        UNION ALL
        SELECT person_id FROM throwing
    ) AS t
    GROUP BY t.person_id
    HAVING COUNT(t.person_id) > 1
) t2
    ON t1.person_id = t2.person_id
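To try the UNION ALL idea against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The function names (build_sample_db, people_in_multiple_activities) and the minimum parameter are made up for this illustration; only the table and column names come from the question.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE persons  (person_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE rowing   (id INTEGER, person_id INTEGER);
        CREATE TABLE swimming (id INTEGER, person_id INTEGER);
        CREATE TABLE running  (id INTEGER, person_id INTEGER);
        CREATE TABLE throwing (id INTEGER, person_id INTEGER);
        INSERT INTO persons  VALUES (100, 'jack'), (125, 'jill'), (201, 'jane');
        INSERT INTO rowing   VALUES (1, 100), (2, 201);
        INSERT INTO swimming VALUES (1, 125), (2, 201);
        INSERT INTO running  VALUES (1, 201);
        INSERT INTO throwing VALUES (1, 125), (2, 201);
    """)
    return conn


def people_in_multiple_activities(conn, minimum=2):
    """Return (person_id, name) for people with at least `minimum` activity rows."""
    sql = """
        SELECT p.person_id, p.name
        FROM persons AS p
        INNER JOIN (
            SELECT person_id
            FROM (
                SELECT person_id FROM rowing
                UNION ALL
                SELECT person_id FROM swimming
                UNION ALL
                SELECT person_id FROM running
                UNION ALL
                SELECT person_id FROM throwing
            ) AS t
            GROUP BY person_id
            HAVING COUNT(*) >= ?
        ) AS c ON c.person_id = p.person_id
        ORDER BY p.person_id
    """
    return conn.execute(sql, (minimum,)).fetchall()


if __name__ == "__main__":
    print(people_in_multiple_activities(build_sample_db()))
```

Note that this counts rows, not distinct activities, so it assumes a person appears at most once per activity table. If that does not hold, one option would be to tag each branch of the union with an activity label and count the distinct labels instead.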

Related

Club two table based on certain condition in postgresql

Here are my sample input tables.
Table1 describes an employee who works on different projects, starting on different dates, in an organization:
employee_id | project | effective_date
--------------------------------------
1           | A       | 2014-08-13
1           | B       | 2016-12-21
1           | C       | 2018-02-21
Table2 describes the same employee from Table1, who holds different designations in the same organisation:
employee_id | designation | effective_date
------------------------------------------
1           | trainee     | 2014-08-05
1           | senior      | 2016-08-17
1           | team leader | 2018-02-05
Now I want an Expected output table like this:
employee_id | project | designation | effective_date
----------------------------------------------------
1           | A       | trainee     | 2014-08-13
1           | A       | senior      | 2016-08-17
1           | B       | Senior      | 2016-12-21
1           | B       | team leader | 2018-02-05
1           | C       | team leader | 2018-02-21
The fact is that whenever:
his project changes, I need to display project effective_date.
his designation changes, I need to display designation effective_date but with the project he worked on during this designation change
This problem is a variant of the gaps-and-islands family. This specific variant can be solved in three steps:
1. apply a UNION ALL of the two tables, putting "tab1.project" and "tab2.designation" (aliased as "role") in two separate fields within the same schema
2. compute the partitions, each running from a non-null value through the null values that follow it, with two running counts (one for "role" and one for "project")
3. apply one aggregation per partition on each of the two fields, to fill in the null values.
WITH cte AS (
SELECT employee_id, effective_date,
project AS project,
NULL AS role FROM tab1
UNION ALL
SELECT employee_id, effective_date,
NULL AS project,
designation AS role FROM tab2
), cte2 AS (
SELECT *,
COUNT(CASE WHEN project IS NOT NULL THEN 1 END) OVER(
PARTITION BY employee_id
ORDER BY effective_date
) AS project_partition,
COUNT(CASE WHEN role IS NOT NULL THEN 1 END) OVER(
PARTITION BY employee_id
ORDER BY effective_date
) AS role_partition
FROM cte
)
SELECT employee_id, effective_date,
MAX(project) OVER(PARTITION BY employee_id, project_partition) AS project,
MAX(role) OVER(PARTITION BY employee_id, role_partition) AS role
FROM cte2
ORDER BY employee_id, effective_date
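The three steps can be tried end to end with the sample rows from the question. This is only a sketch under a few assumptions: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than on PostgreSQL, the names QUERY, build_sample_db and project_role_timeline are invented here, and the final WHERE project IS NOT NULL filter is an addition that drops the designation row dated before the first project, since the expected output does not list it.

```python
import sqlite3

QUERY = """
WITH events AS (
    -- step 1: stack both tables, one column per attribute
    SELECT employee_id, effective_date, project, NULL AS role FROM tab1
    UNION ALL
    SELECT employee_id, effective_date, NULL AS project, designation AS role FROM tab2
), grouped AS (
    -- step 2: running counts of non-null values mark the partitions
    SELECT *,
           COUNT(project) OVER (PARTITION BY employee_id ORDER BY effective_date) AS project_grp,
           COUNT(role)    OVER (PARTITION BY employee_id ORDER BY effective_date) AS role_grp
    FROM events
), filled AS (
    -- step 3: each partition holds one non-null value, MAX spreads it over the nulls
    SELECT employee_id, effective_date,
           MAX(project) OVER (PARTITION BY employee_id, project_grp) AS project,
           MAX(role)    OVER (PARTITION BY employee_id, role_grp)    AS role
    FROM grouped
)
SELECT employee_id, project, role, effective_date
FROM filled
WHERE project IS NOT NULL  -- assumption: skip rows dated before the first project
ORDER BY employee_id, effective_date
"""


def build_sample_db():
    """Create an in-memory database with the two sample tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tab1 (employee_id INTEGER, project TEXT, effective_date TEXT);
        CREATE TABLE tab2 (employee_id INTEGER, designation TEXT, effective_date TEXT);
        INSERT INTO tab1 VALUES (1, 'A', '2014-08-13'), (1, 'B', '2016-12-21'),
                                (1, 'C', '2018-02-21');
        INSERT INTO tab2 VALUES (1, 'trainee', '2014-08-05'), (1, 'senior', '2016-08-17'),
                                (1, 'team leader', '2018-02-05');
    """)
    return conn


def project_role_timeline(conn):
    """Return (employee_id, project, role, effective_date) rows in date order."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in project_role_timeline(build_sample_db()):
        print(row)
```

COUNT(project) counts only non-null values, so it behaves like the COUNT(CASE WHEN ... IS NOT NULL ...) form. Dates are stored as ISO-8601 text here, which sorts correctly as plain strings.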

Getting records from 2 tables with common and uncommon columns

Below is a similar example of the issue I have.
If I have this table 1:
Patient ID | Name | Check in Date | order name | preformed by
-------------------------------------------------------------
1          | Jack | 12/sep/2002   | xray       | Dr.Amal
2          | Nora | 15/oct/2002   | ultrasound | Dr.Goerge
1          | Jack | 13/nov/2003   | Medicine   | Dr.Fred
table 2:
Patient ID | Name   | Check in Date | order name
------------------------------------------------
1          | Jack   | 14/Jun/2002   | xray 2
2          | Nora   | 15/oct/2002   | ultrasound
1          | Jack   | 13/nov/2003   | Medicine
3          | Rafael | 13/nov/2003   | Vaccine
The result I need is as the following:
Name   | Check in Date | order name | preformed by
--------------------------------------------------
Jack   | 12/sep/2002   | xray       | Dr.Amal
Nora   | 15/oct/2002   | ultrasound | Dr.Goerge
Jack   | 13/nov/2003   | Medicine   | Dr.Fred
Jack   | 14/Jun/2002   | xray 2     | Null
Rafael | 13/nov/2003   | Vaccine    | Null
As you may notice, the result I need is all records of table 1 and all records of table 2 with no duplication, matching on the common fields and adding the 'Preformed by' column from Table 1. I tried using 'UNION' as follows:
SELECT Name, Check_in_Date, order_name,preformed_by
FROM table1
UNION
SELECT Name, Check_in_Date, order_name,''
FROM table2
The result I get has 2 records for each patient with the same date, one with preformed by and one with null, as follows:
Name   | Check in Date | order name | preformed by
--------------------------------------------------
Jack   | 12/sep/2002   | xray       | Dr.Amal
Nora   | 15/oct/2002   | ultrasound | Dr.Goerge
Nora   | 15/oct/2002   | ultrasound | Null
Jack   | 13/nov/2003   | Medicine   | Dr.Fred
Jack   | 13/nov/2003   | Medicine   | null
Jack   | 14/Jun/2002   | xray 2     | Null
Rafael | 13/nov/2003   | Vaccine    | Null
If the same ID has the same check-in date in both tables, it must return the preformed by of table 1, not null. How can I do this?
Thank you.
What you need is a FULL JOIN matching on those three columns, along with the NVL() function to bring in the values
from table2 wherever table1 returns null, such as
SELECT NVL(t1.name,t2.name) AS name,
NVL(t1.check_in_date,t2.check_in_date) AS check_in_date,
NVL(t1.order_name,t2.order_name) AS order_name,
t1.preformed_by
FROM table1 t1
FULL JOIN table2 t2
ON t1.name = t2.name
AND t1.check_in_date = t2.check_in_date
AND t1.order_name = t2.order_name
Another method uses UNION to filter out duplicates and then applies a LEFT (outer) JOIN, such as
SELECT tt.name, tt.check_in_date, tt.order_name, t1.preformed_by
FROM (
SELECT name, check_in_date, order_name FROM table1 UNION
SELECT name, check_in_date, order_name FROM table2
) tt
LEFT JOIN table1 t1
ON t1.name = tt.name
AND t1.check_in_date = tt.check_in_date
AND t1.order_name = tt.order_name
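The UNION plus LEFT JOIN variant is the more portable of the two (NVL() is Oracle-specific, and FULL JOIN is missing from some engines), so it is easy to try out. Below is a rough sketch on an in-memory SQLite database loaded with the question's rows. The function names build_sample_db and merged_orders are hypothetical, and the dates are kept as plain text exactly as written in the question.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with the two sample tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (patient_id INTEGER, name TEXT, check_in_date TEXT,
                             order_name TEXT, preformed_by TEXT);
        CREATE TABLE table2 (patient_id INTEGER, name TEXT, check_in_date TEXT,
                             order_name TEXT);
        INSERT INTO table1 VALUES
            (1, 'Jack', '12/sep/2002', 'xray',       'Dr.Amal'),
            (2, 'Nora', '15/oct/2002', 'ultrasound', 'Dr.Goerge'),
            (1, 'Jack', '13/nov/2003', 'Medicine',   'Dr.Fred');
        INSERT INTO table2 VALUES
            (1, 'Jack',   '14/Jun/2002', 'xray 2'),
            (2, 'Nora',   '15/oct/2002', 'ultrasound'),
            (1, 'Jack',   '13/nov/2003', 'Medicine'),
            (3, 'Rafael', '13/nov/2003', 'Vaccine');
    """)
    return conn


def merged_orders(conn):
    """De-duplicate the shared columns with UNION, then look up preformed_by in table1."""
    sql = """
        SELECT tt.name, tt.check_in_date, tt.order_name, t1.preformed_by
        FROM (
            SELECT name, check_in_date, order_name FROM table1
            UNION
            SELECT name, check_in_date, order_name FROM table2
        ) AS tt
        LEFT JOIN table1 AS t1
               ON t1.name = tt.name
              AND t1.check_in_date = tt.check_in_date
              AND t1.order_name = tt.order_name
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in merged_orders(build_sample_db()):
        print(row)
```

Rows that exist only in table2 come back with None (SQL NULL) in the last column, while rows present in both tables keep the value from table1.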

find duplicate values in between two tables

I have two tables: Customer1 and Customer2. Both tables' fields are the same, but employee names are different. Some of the employees are repeated in both tables.
I want to find the employees that are duplicated across both tables.
Sample data:
Customer 1
ID Name Designation salary
1 User1 Developer 5000
1 User2 Developer 5000
1 User5 Developer 5000
1 User1 Developer 5000
Customer 2
ID Name Designation salary
1 User1 Developer 5000
1 User2 Developer 5000
1 User3 Developer 5000
1 User1 Developer 5000
Result
ID Name Designation salary
1 User1 Developer 5000
1 User2 Developer 5000
User1 and User2 are in both tables many times, but I want to count them only once. I really appreciate any help on this.
You can join the two tables and then print out the distinct records from Customer1 that are left:
SELECT distinct customer1.*
FROM customer1
INNER JOIN customer2 ON
customer1.id = customer2.id
AND customer1.name = customer2.name
You can use UNION ALL and do aggregation (note that a row repeated within only one of the tables would also pass the count(*) > 1 filter):
select id, name, Designation, salary
from (select c1.id, c1.name, c1.Designation, c1.salary
from Customer1 c1
union all
select c2.id, c2.name, c2.Designation, c2.salary
from Customer2 c2
) c
group by id, name, Designation, salary
having count(*) > 1;
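For comparison, here is a small sketch that runs the DISTINCT join from above next to INTERSECT, which is a different technique from either answer: it returns the distinct rows present in both tables and compares all four columns. It uses an in-memory SQLite database with the sample data; the function names are made up for the example.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer1 (id INTEGER, name TEXT, designation TEXT, salary INTEGER);
        CREATE TABLE customer2 (id INTEGER, name TEXT, designation TEXT, salary INTEGER);
        INSERT INTO customer1 VALUES
            (1, 'User1', 'Developer', 5000), (1, 'User2', 'Developer', 5000),
            (1, 'User5', 'Developer', 5000), (1, 'User1', 'Developer', 5000);
        INSERT INTO customer2 VALUES
            (1, 'User1', 'Developer', 5000), (1, 'User2', 'Developer', 5000),
            (1, 'User3', 'Developer', 5000), (1, 'User1', 'Developer', 5000);
    """)
    return conn


def common_rows_join(conn):
    """Distinct customer1 rows that have a match in customer2 on id and name."""
    sql = """
        SELECT DISTINCT c1.id, c1.name, c1.designation, c1.salary
        FROM customer1 AS c1
        INNER JOIN customer2 AS c2
                ON c1.id = c2.id AND c1.name = c2.name
        ORDER BY c1.name
    """
    return conn.execute(sql).fetchall()


def common_rows_intersect(conn):
    """Rows that appear in both tables, compared on every column."""
    sql = """
        SELECT id, name, designation, salary FROM customer1
        INTERSECT
        SELECT id, name, designation, salary FROM customer2
        ORDER BY name
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    db = build_sample_db()
    print(common_rows_join(db))
    print(common_rows_intersect(db))
```

On the sample data both functions return the same two rows. They would differ if two employees shared an id and name but not a designation or salary, because the join only compares id and name.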

Reconciliation Automation Query

I have one database, and from time to time I change some part of a query as per requirements.
I want to keep a record of the results of these queries, both before and after the change, in one table, and show the queries whose results differ.
For Example,
Consider following table
emp_id country salary
---------------------
1 usa 1000
2 uk 2500
3 uk 1200
4 usa 3500
5 usa 4000
6 uk 1100
Now, my before query is :
Before Query:
select count(emp_id) as count,country from table where salary>2000 group by country;
Before Result:
count country
2 usa
1 uk
After Query:
select count(emp_id) as count,country from table where salary<2000 group by country;
After Query Result:
count country
2 uk
1 usa
My Final Result or Table I want is:
column 1 | column 2 | column 3 | column 4 |
2 usa 2 uk
1 uk 1 usa
...... but if the query results are the same, then they shouldn't show in this table.
Thanks in advance.
I believe that you can use the same approach as here.
select t1.*, t2.* -- if you need specific columns without rn then you have to list them here
from
(
select t.*, row_number() over (order by count) rn
from
(
-- query #1
select count(emp_id) as count,country from table where salary>2000 group by country
) t
) t1
full join
(
select t.*, row_number() over (order by count) rn
from
(
-- query #2
select count(emp_id) as count,country from table where salary<2000 group by country
) t
) t2 on t1.rn = t2.rn
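If the comparison can be done on the client side, the same pairing by row position can be sketched in Python. This is not the row_number() and FULL JOIN query itself but a stand-in for it: both queries are run, the results are zipped row by row, and nothing is returned when they are identical. The table name emp, the alias cnt, the ORDER BY clauses (added so the row order is deterministic) and the function names are all assumptions made for this example.

```python
import sqlite3
from itertools import zip_longest

BEFORE_SQL = """
    SELECT COUNT(emp_id) AS cnt, country FROM emp
    WHERE salary > 2000 GROUP BY country ORDER BY cnt DESC, country
"""
AFTER_SQL = """
    SELECT COUNT(emp_id) AS cnt, country FROM emp
    WHERE salary < 2000 GROUP BY country ORDER BY cnt DESC, country
"""


def build_sample_db():
    """Create an in-memory database with the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emp (emp_id INTEGER, country TEXT, salary INTEGER);
        INSERT INTO emp VALUES (1, 'usa', 1000), (2, 'uk', 2500), (3, 'uk', 1200),
                               (4, 'usa', 3500), (5, 'usa', 4000), (6, 'uk', 1100);
    """)
    return conn


def reconcile(conn, before_sql, after_sql):
    """Pair the two result sets row by row; return [] when they are identical."""
    cur = conn.execute(before_sql)
    before = cur.fetchall()
    pad_before = (None,) * len(cur.description)  # filler when "before" is shorter
    cur = conn.execute(after_sql)
    after = cur.fetchall()
    pad_after = (None,) * len(cur.description)   # filler when "after" is shorter
    if before == after:
        return []
    return [(b or pad_before) + (a or pad_after)
            for b, a in zip_longest(before, after)]


if __name__ == "__main__":
    for row in reconcile(build_sample_db(), BEFORE_SQL, AFTER_SQL):
        print(row)
```

The padding plays the role of the FULL JOIN: when one result set has more rows than the other, the missing side is filled with None values.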

How to find distinct users in multiple tables

I have a table called users that holds user ids, as well as a few tables like cloud_storage_a, cloud_storage_b and cloud_storage_c. If a user exists in cloud_storage_a, that means they are connected to cloud storage a. A user can exist in many cloud storages too. Here's an example:
users table:
user_id | address | name
-------------------------------
123 | 23 Oak Ave | Melissa
333 | 18 Robson Rd | Steve
421 | 95 Ottawa St | Helen
555 | 12 Highland | Amit
192 | 39 Anchor Rd | Oliver
cloud_storage_a:
user_id
-------
421
333
cloud_storage_b:
user_id
-------
555
cloud_storage_c:
user_id
-------
192
555
Etc.
I want to create a query that grabs all users connected on any cloud storage. So for this example, users 421, 333, 555, 192 should be returned. I'm guessing this is some sort of join but I'm not sure which one.
You are close. Instead of a JOIN that merges tables next to each other based on a key, you want to use a UNION which stacks recordsets/tables on top of each other.
SELECT user_id FROM cloud_storage_a
UNION
SELECT user_id FROM cloud_storage_b
UNION
SELECT user_id FROM cloud_storage_c
Using the keyword UNION here will give you distinct user_ids across all three tables. If you switched that to UNION ALL you would no longer get distinct values, which has its advantages in other situations (not here, obviously).
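The difference is easy to see on the sample data, where user 555 sits in both cloud_storage_b and cloud_storage_c. Here is a quick sketch on an in-memory SQLite database; the function names and the ORDER BY (added only to make the output order predictable) are not part of the answer.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with the three storage tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cloud_storage_a (user_id INTEGER);
        CREATE TABLE cloud_storage_b (user_id INTEGER);
        CREATE TABLE cloud_storage_c (user_id INTEGER);
        INSERT INTO cloud_storage_a VALUES (421), (333);
        INSERT INTO cloud_storage_b VALUES (555);
        INSERT INTO cloud_storage_c VALUES (192), (555);
    """)
    return conn


def stacked_user_ids(conn, keep_duplicates=False):
    """Stack the three tables with UNION (distinct) or UNION ALL (keeps repeats)."""
    op = "UNION ALL" if keep_duplicates else "UNION"
    sql = f"""
        SELECT user_id FROM cloud_storage_a
        {op}
        SELECT user_id FROM cloud_storage_b
        {op}
        SELECT user_id FROM cloud_storage_c
        ORDER BY user_id
    """
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    db = build_sample_db()
    print(stacked_user_ids(db))                        # each user once
    print(stacked_user_ids(db, keep_duplicates=True))  # 555 listed twice
```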
Edited to add:
If you wanted to bring in the user address you could use this as a subquery and join it to your users table:
SELECT
    subunion.user_id,
    users.address
FROM
    users
    INNER JOIN
    (
        SELECT user_id FROM cloud_storage_a
        UNION
        SELECT user_id FROM cloud_storage_b
        UNION
        SELECT user_id FROM cloud_storage_c
    ) subunion ON
        users.user_id = subunion.user_id
That union will need to grow as you add more cloud_storage_N tables. All in all, it's not a great database design. You would be much better off creating a single cloud_storage table with a field that identifies which storage it is: a, b, c, ..., N.
Then your UNION query would just be SELECT DISTINCT user_id FROM cloud_storage; and you would never need to edit it again.
You need to join an unknown(?) number of cloud_storage_X tables this way.
You'd better change your schema to the following:
storage:
user_id cloud
------- -----
421 a
333 a
555 b
192 c
555 c
Then the query is as simple as this:
select distinct user_id
from storage;
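A rough sketch of that single-table layout on an in-memory SQLite database follows. The composite primary key and the function names are assumptions added for the example; the storage rows are the ones listed above and the users rows come from the question.

```python
import sqlite3


def build_sample_db():
    """Create users plus one storage table with a column naming the cloud."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, address TEXT, name TEXT);
        CREATE TABLE storage (
            user_id INTEGER REFERENCES users (user_id),
            cloud   TEXT,
            PRIMARY KEY (user_id, cloud)  -- assumption: one row per user per cloud
        );
        INSERT INTO users VALUES
            (123, '23 Oak Ave', 'Melissa'), (333, '18 Robson Rd', 'Steve'),
            (421, '95 Ottawa St', 'Helen'), (555, '12 Highland', 'Amit'),
            (192, '39 Anchor Rd', 'Oliver');
        INSERT INTO storage VALUES
            (421, 'a'), (333, 'a'), (555, 'b'), (192, 'c'), (555, 'c');
    """)
    return conn


def connected_user_ids(conn):
    """Ids of users connected to at least one cloud, each listed once."""
    sql = "SELECT DISTINCT user_id FROM storage ORDER BY user_id"
    return [row[0] for row in conn.execute(sql)]


def connected_users(conn):
    """The same users with their names; adding a new cloud needs no query change."""
    sql = """
        SELECT u.user_id, u.name
        FROM users AS u
        WHERE u.user_id IN (SELECT user_id FROM storage)
        ORDER BY u.user_id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    db = build_sample_db()
    print(connected_user_ids(db))
    print(connected_users(db))
```

Connecting a user to a new cloud becomes an INSERT into storage instead of a new table, so neither query has to change.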
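select distinct u.*
from users u
left join cloud_storage_a csa on u.user_id = csa.user_id
left join cloud_storage_b csb on u.user_id = csb.user_id
left join cloud_storage_c csc on u.user_id = csc.user_id
where csa.user_id is not null or csb.user_id is not null or csc.user_id is not null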
You should simplify your schema to handle this type of query.
To get columns from your users table for all (distinct) qualifying users:
SELECT * -- or whatever you need
FROM users u
WHERE EXISTS (SELECT 1 FROM cloud_storage_a WHERE user_id = u.user_id) OR
EXISTS (SELECT 1 FROM cloud_storage_b WHERE user_id = u.user_id) OR
EXISTS (SELECT 1 FROM cloud_storage_c WHERE user_id = u.user_id);
To just get all user_ids and nothing else, @JNevill's UNION query looks good. You could join the result of this to users to the same effect:
SELECT u.* -- or whatever you need
FROM users u
JOIN (
SELECT user_id FROM cloud_storage_a
UNION
SELECT user_id FROM cloud_storage_b
UNION
SELECT user_id FROM cloud_storage_c
) c USING (user_id);
But that's probably slower.
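Both forms can be checked against each other on the sample data. This is a minimal sketch with an in-memory SQLite database; it only shows that the two queries return the same users and makes no claim about which is faster, since that depends on the engine and the indexes. The function names and the ORDER BY clauses are additions for the example.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with the sample tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, address TEXT, name TEXT);
        CREATE TABLE cloud_storage_a (user_id INTEGER);
        CREATE TABLE cloud_storage_b (user_id INTEGER);
        CREATE TABLE cloud_storage_c (user_id INTEGER);
        INSERT INTO users VALUES
            (123, '23 Oak Ave', 'Melissa'), (333, '18 Robson Rd', 'Steve'),
            (421, '95 Ottawa St', 'Helen'), (555, '12 Highland', 'Amit'),
            (192, '39 Anchor Rd', 'Oliver');
        INSERT INTO cloud_storage_a VALUES (421), (333);
        INSERT INTO cloud_storage_b VALUES (555);
        INSERT INTO cloud_storage_c VALUES (192), (555);
    """)
    return conn


def users_via_exists(conn):
    """Each user is tested once per table, so nobody can be returned twice."""
    sql = """
        SELECT u.user_id, u.name
        FROM users AS u
        WHERE EXISTS (SELECT 1 FROM cloud_storage_a WHERE user_id = u.user_id)
           OR EXISTS (SELECT 1 FROM cloud_storage_b WHERE user_id = u.user_id)
           OR EXISTS (SELECT 1 FROM cloud_storage_c WHERE user_id = u.user_id)
        ORDER BY u.user_id
    """
    return conn.execute(sql).fetchall()


def users_via_union_join(conn):
    """UNION removes the repeated 555 before the join, so the result matches."""
    sql = """
        SELECT u.user_id, u.name
        FROM users AS u
        JOIN (
            SELECT user_id FROM cloud_storage_a
            UNION
            SELECT user_id FROM cloud_storage_b
            UNION
            SELECT user_id FROM cloud_storage_c
        ) AS c USING (user_id)
        ORDER BY u.user_id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    db = build_sample_db()
    print(users_via_exists(db))
    print(users_via_union_join(db))
```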