Joining two or more tables - sql

I wrote a query to join two tables, but every column from the second table comes back as NULL.
+----------+--------------+-----------------------------------------------+-------------+----------+----------+------+----------+
| Movie_ID | Release_year | Movie_Title | Duration | Genre_ID | Actor_ID | Role | Movie_ID |
+----------+--------------+-----------------------------------------------+-------------+----------+----------+------+----------+
| 10001 | 1997 | Titantic | 190 minutes | 40001 | NULL | NULL | NULL |
| 10002 | 1998 | Shakesphere in Love | 123 minutes | 40002 | NULL | NULL | NULL |
| 10003 | 1999 | American Beauty | 122 minutes | 40003 | NULL | NULL | NULL |
| 10004 | 2000 | Gladiator | 155 minutes | 40004 | NULL | NULL | NULL |
| 10005 | 2001 | A beautiful Mind | 135 minutes | 40004 | NULL | NULL | NULL |
| 10006 | 2002 | Chicago | 113 minutes | 40005 | NULL | NULL | NULL |
| 10007 | 2003 | The Lord of the Rings: The return of the King | 201 minutes | 40006 | NULL | NULL | NULL |
| 10008 | 2004 | Million Dollar Baby | 132 minutes | 40007 | NULL | NULL | NULL |
| 10009 | 2005 | Crash | 112 minutes | 40008 | NULL | NULL | NULL |
| 10010 | 2006 | The Departed | 151 minutes | 40009 | NULL | NULL | NULL |
| 10011 | 2007 | No Country for Old Men | 122 minutes | 40009 | NULL | NULL | NULL |
| 10012 | 2008 | Slumdog Millionaire | 120 minutes | 40008 | NULL | NULL | NULL |
| 10013 | 2009 | The Hurt Locker | 131 minutes | 40009 | NULL | NULL | NULL |
| 10014 | 2010 | The King\s speech | 118 minutes | 40010 | NULL | NULL | NULL |
| 10015 | 2011 | The Artist | 100 minutes | 40011 | NULL | NULL | NULL |
| 10016 | 2012 | Argo | 120 minutes | 40012 | NULL | NULL | NULL |
| 10017 | 2013 | 12 Years a Slave | 134 minutes | 40004 | NULL | NULL | NULL |
| 10018 | 2014 | Birdman or The Unexpected Virtue of Ignorance | 119 minutes | 40003 | NULL | NULL | NULL |
| 10019 | 2015 | Spotlight | 129 minutes | 40008 | NULL | NULL | NULL |
| 10020 | 2016 | Moonlight | 111 minutes | 40013 | NULL | NULL | NULL |
| 10021 | 2017 | The Shape of Water | 123 minutes | 40012 | NULL | NULL | NULL |
| 10022 | 2018 | Green Book | 130 minutes | 40011 | NULL | NULL | NULL |
+----------+--------------+-----------------------------------------------+-------------+----------+----------+------+----------+
SELECT *
FROM databaseoscars.movie a
LEFT JOIN databaseoscars.`movie cast` b ON a.Movie_ID = b.Actor_ID;
I expected all of the data from both tables to be displayed in one result.

Your problem is here: a.Movie_ID = b.Actor_ID. A movie will never be an actor. Use your movie cast table's movie ID instead:
SELECT *
FROM databaseoscars.movie m
LEFT JOIN databaseoscars.`movie cast` mc ON mc.movie_id = m.movie_id;
In MySQL and MariaDB, you could also use the USING clause:
SELECT *
FROM databaseoscars.movie m
LEFT JOIN databaseoscars.`movie cast` mc USING (movie_id);
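To see why the original condition produced only NULLs, here is a minimal sketch that uses Python's built-in sqlite3 module instead of MySQL. The table name movie_cast (without the space) and the two cast rows are invented for illustration; only the two join conditions mirror the queries above.

```python
import sqlite3

def count_matched_cast_rows():
    """Return (movies matched with the wrong join key, with the right one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, movie_title TEXT);
        CREATE TABLE movie_cast (actor_id INTEGER, role TEXT, movie_id INTEGER);
        INSERT INTO movie VALUES (10001, 'Titanic'), (10004, 'Gladiator');
        -- Hypothetical cast rows; the real table contents are not in the question.
        INSERT INTO movie_cast VALUES (20001, 'Lead', 10001), (20002, 'Lead', 10004);
    """)
    # COUNT(column) skips NULLs, so it counts only movies that found a cast row.
    wrong = conn.execute(
        "SELECT COUNT(mc.actor_id) FROM movie m "
        "LEFT JOIN movie_cast mc ON m.movie_id = mc.actor_id"
    ).fetchone()[0]
    right = conn.execute(
        "SELECT COUNT(mc.actor_id) FROM movie m "
        "LEFT JOIN movie_cast mc ON mc.movie_id = m.movie_id"
    ).fetchone()[0]
    conn.close()
    return wrong, right

print(count_matched_cast_rows())
```

With this made-up data the function returns (0, 2): no movie ID ever equals an actor ID, so the first join keeps every movie but fills the cast columns with NULL.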

Related

Distinct ids for grouped values

I want to count the distinct ids within each NUMB and show that count in a column.
I tried this:
WITH T AS (
    SELECT
        MAX(CASE WHEN LOGS LIKE 'CAR%' THEN REPLACE(LOGS, 'CAR-', '') END) AS CAR,
        MAX(CASE WHEN LOGS LIKE 'MOT%' THEN REPLACE(LOGS, 'MOT-', '') END) AS MOTO,
        MAX(CASE WHEN LOGS LIKE 'BICYCLE%' THEN REPLACE(LOGS, 'BICYCLE-', '') END) AS BICYCLE,
        MAX(CASE WHEN LOGS LIKE 'SHIP%' THEN REPLACE(LOGS, 'SHIP-', '') END) AS SHIP,
        ID,
        ORIG,
        DATE_ID,
        NUMB,
        STEPS
    FROM dbo.test
    GROUP BY ORIG, DATE_ID, ID, NUMB, STEPS
)
SELECT ID, ORIG, NUMB, STEPS, DATE_ID, CAR, MOTO, BICYCLE, SHIP,
    (SELECT COUNT(DISTINCT ID) FROM dbo.test tp WHERE ORIG = '4567') AS COUNTER
FROM T
WHERE ORIG = '4567'
    AND NUMB IN ('1515', '1921', '2121')
GROUP BY ID, ORIG, NUMB, STEPS, DATE_ID, CAR, MOTO, BICYCLE, SHIP
I get this output:
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| ID | ORIG | NUMB | STEPS | DATE_ID | CAR | MOTO | BICYCLE | SHIP | COUNTER |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| 1 | 4567 | 1515 | 1 | 20201010 | HONDA | NULL | NULL | NULL | 3 |
| 1 | 4567 | 1515 | 2 | 20201010 | HONDA | NULL | NULL | NULL | 3 |
| 1 | 4567 | 1515 | 3 | 20201010 | HONDA | NULL | NULL | NULL | 3 |
| 2 | 4567 | 1921 | 1 | 20201111 | NULL | KTM | NULL | NULL | 3 |
| 3 | 4567 | 2121 | 1 | 20201231 | NULL | NULL | NULL | BOAT | 3 |
| 3 | 4567 | 2121 | 2 | 20201231 | NULL | NULL | NULL | BOAT | 3 |
| 3 | 4567 | 2121 | 3 | 20201231 | NULL | NULL | NULL | BOAT | 3 |
| 3 | 4567 | 2121 | 4 | 20201231 | NULL | NULL | NULL | BOAT | 3 |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
As you can see, the COUNTER column holds the count of distinct ids across all NUMB values, not per NUMB.
I want this output instead:
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| ID | ORIG | NUMB | STEPS | DATE_ID | CAR | MOTO | BICYCLE | SHIP | COUNTER |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| 1 | 4567 | 1515 | 1 | 20201010 | HONDA | NULL | NULL | NULL | 2 |
| 1 | 4567 | 1515 | 2 | 20201010 | HONDA | NULL | NULL | NULL | 2 |
| 2 | 4567 | 1515 | 1 | 20201010 | HONDA | NULL | NULL | NULL | 2 |
| 2 | 4567 | 1921 | 1 | 20201111 | NULL | KTM | NULL | NULL | 1 |
| 3 | 4567 | 2121 | 1 | 20201231 | NULL | NULL | NULL | BOAT | 2 |
| 3 | 4567 | 2121 | 2 | 20201231 | NULL | NULL | NULL | BOAT | 2 |
| 3 | 4567 | 2121 | 3 | 20201231 | NULL | NULL | NULL | BOAT | 2 |
| 1 | 4567 | 2121 | 1 | 20201231 | NULL | NULL | NULL | BOAT | 2 |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
1515 has 2 ids
1921 has 1 id
2121 has 2 ids
I also tried adding a GROUP BY NUMB inside (SELECT COUNT(DISTINCT ID) FROM dbo.test tp WHERE ORIG= '4567'), but that didn't work.
Since you want the number of distinct ids per NUMB, what you seem to want is:
count(distinct id) over (partition by orig, numb)
Alas, SQL Server doesn't support count(distinct) with window functions.
Happily, there is an easy workaround (which raises the question of why the above syntax is not supported), provided id is never NULL:
(dense_rank() over (partition by orig, numb order by id asc) +
dense_rank() over (partition by orig, numb order by id desc) - 1
) as counter
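The two-dense_rank() trick can be checked on the rows from the desired output. This is only a sketch: it runs against SQLite through Python's sqlite3 module rather than SQL Server, it assumes a SQLite build with window functions (3.25 or later), and it ranks by id because the goal is the number of distinct ids per NUMB.

```python
import sqlite3

# Rows (id, orig, numb, steps) copied from the desired output in the question.
ROWS = [
    (1, "4567", "1515", 1), (1, "4567", "1515", 2), (2, "4567", "1515", 1),
    (2, "4567", "1921", 1),
    (3, "4567", "2121", 1), (3, "4567", "2121", 2), (3, "4567", "2121", 3),
    (1, "4567", "2121", 1),
]

def distinct_ids_per_numb(rows):
    """Return [(numb, distinct id count)] using two DENSE_RANK() windows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test (id INTEGER, orig TEXT, numb TEXT, steps INTEGER)"
    )
    conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", rows)
    # Ascending rank + descending rank - 1 equals the number of distinct
    # values in the partition, as long as the ranked column has no NULLs.
    result = conn.execute("""
        SELECT DISTINCT numb,
               DENSE_RANK() OVER (PARTITION BY orig, numb ORDER BY id ASC)
             + DENSE_RANK() OVER (PARTITION BY orig, numb ORDER BY id DESC)
             - 1 AS counter
        FROM test
        ORDER BY numb
    """).fetchall()
    conn.close()
    return result

print(distinct_ids_per_numb(ROWS))
```

For these rows the result is [('1515', 2), ('1921', 1), ('2121', 2)], which matches the counts listed in the question.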

PostgreSQL: Count number of rows in table 1 for distinct rows in table 2

I am working with a very large dataset and have gotten confused; I feel like I keep repeating the same thing.
I want to count the number of trips per user from two tables, trips and session.
psql=> SELECT * FROM trips limit 10;
trip_id | session_ids | daily_user_id | seconds_start | seconds_end
---------+-----------------+---------------+---------------+-------------
400543 | {172079} | 17118 | 1575550944 | 1575551181
400542 | {172078} | 17118 | 1575541533 | 1575542171
400540 | {172077} | 17118 | 1575539001 | 1575539340
400538 | {172076} | 17117 | 1575540499 | 1575541999
400534 | {172074,172075} | 17117 | 1575537161 | 1575539711
400530 | {172073} | 17116 | 1575447043 | 1575447682
400529 | {172071} | 17115 | 1575496394 | 1575497803
400527 | {172070} | 17113 | 1575495241 | 1575496034
400525 | {172068} | 17115 | 1575485658 | 1575489378
400524 | {172067} | 17113 | 1575488721 | 1575490491
(10 rows)
psql=> SELECT * FROM session limit 10;
session_id | user_id | key | start_time | daily_user_id
------------+---------+--------------------------+------------+---------------
172079 | 43 | hLB8S7aSfp4gAFp7TykwYQ==+| 1575550921 | 17118
| | | |
172078 | 43 | YATMrL/AQ7Nu5q2dQTMT1A==+| 1575541530 | 17118
| | | |
172077 | 43 | fOLX4tqvsyFOP3DCyBZf1A==+| 1575538997 | 17118
| | | |
172076 | 7 | 88hwGj4Mqa58juy0PG/R4A==+| 1575540515 | 17117
| | | |
172075 | 7 | 1O+8X49+YbtmoEa9BlY5OQ==+| 1575538384 | 17117
| | | |
172074 | 7 | XOR7hsFCNk+soM75ZhDJyA==+| 1575537405 | 17117
| | | |
172073 | 42 | rAQWwYgqg3UMTpsBYSpIpA==+| 1575447109 | 17116
| | | |
172072 | 276 | 0xOsxRRN3Sq20VsXWjlrzQ==+| 1575511120 | 17114
| | | |
172071 | 7 | P4beN3W/ZrD+TCpZGYh23g==+| 1575496642 | 17115
| | | |
172070 | 43 | OFi30Zv9e5gmLZS5Vb+I7Q==+| 1575495238 | 17113
| | | |
(10 rows)
Goal: get the distribution of trips per user
Attempt:
psql=> SELECT COUNT(distinct trip_id) as trips
, count(distinct user_id) as users
, extract(year from to_timestamp(seconds_start)) as year_date
, extract(month from to_timestamp(seconds_start)) as month_date
FROM trips
INNER JOIN session
ON session_id = ANY(session_ids)
GROUP BY year_date, month_date
ORDER BY year_date, month_date;
+-------+-------+-----------+------------+
| trips | users | year_date | month_date |
+-------+-------+-----------+------------+
| 371 | 44 | 2016 | 3 |
| 12207 | 185 | 2016 | 4 |
| 3859 | 88 | 2016 | 5 |
| 1547 | 28 | 2016 | 6 |
| 831 | 17 | 2016 | 7 |
| 427 | 4 | 2016 | 8 |
| 512 | 13 | 2016 | 9 |
| 431 | 11 | 2016 | 10 |
| 1011 | 26 | 2016 | 11 |
| 791 | 15 | 2016 | 12 |
| 217 | 8 | 2017 | 1 |
| 490 | 17 | 2017 | 2 |
| 851 | 18 | 2017 | 3 |
| 1890 | 66 | 2017 | 4 |
| 2143 | 43 | 2017 | 5 |
| . | | | |
| . | | | |
| . | | | |
+-------+-------+-----------+------------+
This result set counts the users and trips per month, but what I actually want is the number of trips per user, like so:
+------+-------------+
| user | no_of_trips |
+------+-------------+
| 1 | 489 |
| 2 | 400 |
| 3 | 12 |
| 4 | 102 |
| . | |
| . | |
| . | |
+------+-------------+
How do I do this, please?
You seem to just want aggregation by user_id:
SELECT s.user_id, COUNT(distinct t.trip_id) as trips
FROM trips t INNER JOIN
session s
ON s.session_id = ANY(t.session_ids)
GROUP BY s.user_id ;
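COUNT(DISTINCT) only matters when one trip can join to several sessions of the same user, as trip 400534 does in your sample (sessions 172074 and 172075 both belong to user 7). If every trip had exactly one session, you could simplify to COUNT(*):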
SELECT s.user_id, COUNT(*) as trips
FROM trips t INNER JOIN
session s
ON s.session_id = ANY(t.session_ids)
GROUP BY s.user_id ;
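The difference between the two counts can be seen on the first few sample rows. The sketch below is not PostgreSQL: it uses Python's sqlite3 module, and because SQLite has no array type, a hypothetical trip_session table with one row per (trip, session) pair stands in for the session_ids array.

```python
import sqlite3

def trips_per_user():
    """Return [(user_id, distinct trips, joined rows)] for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- trip_session replaces the session_ids array (an assumption for SQLite).
        CREATE TABLE trip_session (trip_id INTEGER, session_id INTEGER);
        CREATE TABLE session (session_id INTEGER PRIMARY KEY, user_id INTEGER);
        INSERT INTO trip_session VALUES
            (400543, 172079), (400542, 172078), (400540, 172077),
            (400538, 172076), (400534, 172074), (400534, 172075);
        INSERT INTO session VALUES
            (172079, 43), (172078, 43), (172077, 43),
            (172076, 7), (172075, 7), (172074, 7);
    """)
    rows = conn.execute("""
        SELECT s.user_id,
               COUNT(DISTINCT ts.trip_id) AS trips,
               COUNT(*) AS joined_rows
        FROM trip_session ts
        JOIN session s ON s.session_id = ts.session_id
        GROUP BY s.user_id
        ORDER BY s.user_id
    """).fetchall()
    conn.close()
    return rows

print(trips_per_user())
```

This returns [(7, 2, 3), (43, 3, 3)]: user 7 made two trips, but the join yields three rows because trip 400534 has two sessions.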

SQL Server: I need to create copies of records from 2 tables and ensure FK reflects these copies

In SQL Server 2016, I need to create nearly exact copies of records from 2 tables. The only difference will be their primary keys, one other column that I'm resetting to zero, and a foreign key in my 2nd table (which is the PK of my 1st table). I can create copies of both tables fine, however I don't know how to assign new FK values to the 2nd table, to correctly reflect the new primary keys from the 1st table.
Here are the current records in both tables:
Table 1: Batches
+-----------------------+-----------+----------------+------------+
| BatchID (pk identity) | StartDate | ProcessingStep | BatchCount |
+-----------------------+-----------+----------------+------------+
| 1 | 5/10/2019 | 2 | 8203 |
| 2 | 5/11/2019 | 2 | 345 |
| 3 | 5/12/2019 | 2 | 5014 |
+-----------------------+-----------+----------------+------------+
Table 2: ItemList
+--------------------------+---------+--------+-----------+-------------+
| ItemListID (pk identity) | BatchID | ItemID | Processed | ProcessDate |
+--------------------------+---------+--------+-----------+-------------+
| 1000 | 1 | 201 | 1 | 5/10/2019 |
| 1001 | 1 | 689 | 1 | 5/10/2019 |
| 1002 | 2 | 548 | 1 | 5/11/2019 |
| 1003 | 2 | 693 | 1 | 5/11/2019 |
| 1004 | 3 | 123 | 1 | 5/12/2019 |
| 1005 | 3 | 999 | 1 | 5/12/2019 |
+--------------------------+---------+--------+-----------+-------------+
I now want to create copies of these records with the following exceptions:
Batches.ProcessingStep for all records will now be set to zero
ItemList's Processed & ProcessDate are reset to zero & null respectively
Update ItemList.BatchID to reflect the new PK of the copied Batches records (this is where I'm having trouble)
Currently, my script for updating my tables is as follows:
INSERT INTO Batches(StartDate, ProcessingStep, BatchCount)
SELECT StartDate, 0, BatchCount
FROM Batches
WHERE BatchID IN (1,2,3)
INSERT INTO ItemList(BatchID, ItemID, Processed, ProcessDate)
SELECT <<?? not sure ??>>, ItemID, 0, NULL
FROM ItemList
WHERE ItemListID BETWEEN 1000 AND 1005
And here would be my final results:
Table 1: Batches
+---------+-----------+----------------+------------+
| BatchID | StartDate | ProcessingStep | BatchCount |
+---------+-----------+----------------+------------+
| 1 | 5/10/2019 | 2 | 8203 |
| 2 | 5/11/2019 | 2 | 345 |
| 3 | 5/12/2019 | 2 | 5014 |
| 4 | 5/10/2019 | 0 | 8203 |
| 5 | 5/11/2019 | 0 | 345 |
| 6 | 5/12/2019 | 0 | 5014 |
+---------+-----------+----------------+------------+
Table 2: ItemList
+------------+---------+--------+-----------+-------------+
| ItemListID | BatchID | ItemID | Processed | ProcessDate |
+------------+---------+--------+-----------+-------------+
| 1000 | 1 | 201 | 1 | 5/10/2019 |
| 1001 | 1 | 689 | 1 | 5/10/2019 |
| 1002 | 2 | 548 | 1 | 5/11/2019 |
| 1003 | 2 | 693 | 1 | 5/11/2019 |
| 1004 | 3 | 123 | 1 | 5/12/2019 |
| 1005 | 3 | 999 | 1 | 5/12/2019 |
| 1006 | 4 | 201 | 0 | NULL |
| 1007 | 4 | 689 | 0 | NULL |
| 1008 | 5 | 548 | 0 | NULL |
| 1009 | 5 | 693 | 0 | NULL |
| 1010 | 6 | 123 | 0 | NULL |
| 1011 | 6 | 999 | 0 | NULL |
+------------+---------+--------+-----------+-------------+
How would I go about populating that ItemList.BatchID foreign key correctly?
Thanks.
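One general way to approach this is to record, for every copied batch, which new key it received, and then use that old-to-new mapping when copying the child rows. The sketch below illustrates the idea with Python's sqlite3 module, not SQL Server; the helper names make_db and copy_batches are made up for this example. In SQL Server itself the same mapping is often captured with MERGE ... OUTPUT into a table variable, since the OUTPUT clause of MERGE can reference source columns; that variant is not shown here.

```python
import sqlite3

def make_db():
    """Build an in-memory copy of the two tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Batches (BatchID INTEGER PRIMARY KEY, StartDate TEXT,
                              ProcessingStep INTEGER, BatchCount INTEGER);
        CREATE TABLE ItemList (ItemListID INTEGER PRIMARY KEY, BatchID INTEGER,
                               ItemID INTEGER, Processed INTEGER, ProcessDate TEXT);
        INSERT INTO Batches VALUES (1, '5/10/2019', 2, 8203),
                                   (2, '5/11/2019', 2, 345),
                                   (3, '5/12/2019', 2, 5014);
        INSERT INTO ItemList VALUES (1000, 1, 201, 1, '5/10/2019'),
                                    (1001, 1, 689, 1, '5/10/2019'),
                                    (1002, 2, 548, 1, '5/11/2019'),
                                    (1003, 2, 693, 1, '5/11/2019'),
                                    (1004, 3, 123, 1, '5/12/2019'),
                                    (1005, 3, 999, 1, '5/12/2019');
    """)
    return conn

def copy_batches(conn, batch_ids):
    """Copy the given batches and their items; return {old BatchID: new BatchID}."""
    id_map = {}
    for old_id in batch_ids:
        start_date, batch_count = conn.execute(
            "SELECT StartDate, BatchCount FROM Batches WHERE BatchID = ?",
            (old_id,),
        ).fetchone()
        cur = conn.execute(
            "INSERT INTO Batches (StartDate, ProcessingStep, BatchCount) "
            "VALUES (?, 0, ?)",
            (start_date, batch_count),
        )
        id_map[old_id] = cur.lastrowid  # key generated for the copied batch
    for old_id, new_id in id_map.items():
        # Copy the child rows, pointing them at the new parent key.
        conn.execute(
            "INSERT INTO ItemList (BatchID, ItemID, Processed, ProcessDate) "
            "SELECT ?, ItemID, 0, NULL FROM ItemList "
            "WHERE BatchID = ? ORDER BY ItemListID",
            (new_id, old_id),
        )
    return id_map

conn = make_db()
print(copy_batches(conn, [1, 2, 3]))
```

On the sample data the mapping comes out as {1: 4, 2: 5, 3: 6}, and the six copied ItemList rows carry BatchID 4, 5 and 6 with Processed reset to 0 and ProcessDate set to NULL.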

How to check dates condition from one table to another in SQL

How can we check dates in one table against a date range in another table?
Table : inc
+--------+---------+-----------+-----------+-------------+
| inc_id | cust_id | item_id | serv_time | inc_date |
+--------+---------+-----------+-----------+-------------+
| 1 | john | HP | 40 | 17-Apr-2015 |
| 2 | John | HP | 60 | 10-Jan-2016 |
| 3 | Nick | Cisco | 120 | 11-Jan-2016 |
| 4 | samanta | EMC | 180 | 12-Jan-2016 |
| 5 | Kerlee | Oracle | 40 | 13-Jan-2016 |
| 6 | Amir | Microsoft | 300 | 14-Jan-2016 |
| 7 | John | HP | 120 | 15-Jan-2016 |
| 8 | samanta | EMC | 20 | 16-Jan-2016 |
| 9 | Kerlee | Oracle | 10 | 2-Feb-2017 |
+--------+---------+-----------+-----------+-------------+
Table: Contract:
+-----------+---------+----------+------------+
| item_id | con_id | Start | End |
+-----------+---------+----------+------------+
| Dell | DE2015 | 1/1/2015 | 12/31/2015 |
| HP | HP2015 | 1/1/2015 | 12/31/2015 |
| Cisco | CIS2016 | 1/1/2016 | 12/31/2016 |
| EMC | EMC2016 | 1/1/2016 | 12/31/2016 |
| HP | HP2016 | 1/1/2016 | 12/31/2016 |
| Oracle | OR2016 | 1/1/2016 | 12/31/2016 |
| Microsoft | MS2016 | 1/1/2016 | 12/31/2016 |
| Microsoft | MS2017 | 1/1/2017 | 12/31/2017 |
+-----------+---------+----------+------------+
Result:
+-------+---------+---------+--------------+
| Calls | Cust_id | Con_id | Tot_Ser_Time |
+-------+---------+---------+--------------+
| 2 | John | HP2016 | 180 |
| 2 | samanta | EMC2016 | 200 |
| 1 | Nick | CIS2016 | 120 |
| 1 | Amir | MS2016 | 300 |
| 1 | Kerlee | OR2016 | 40 |
+-------+---------+---------+--------------+
My query:
select count(inc_id) as Calls, inc.cust_id, contract.con_id,
sum(inc.serv_time) as tot_serv_time
from inc inner join contract on inc.item_id = contract.item_id
where inc.inc_date between '2016-01-01' and '2016-12-31'
group by inc.cust_id, contract.con_id
The result should come from the inc table, filtered to 1-Jan-2016 through 31-Dec-2016, with a count of inc_id per customer and contract,
where each incident is matched to the contract for its item whose start and end dates cover the incident date.
If I understand your problem correctly, this query will return the desired result:
select
count(*) as Calls,
inc.cust_id,
contract.con_id,
sum(inc.serv_time) as tot_serv_time
from
inc inner join contract
on inc.item_id = contract.item_id
and inc.inc_date between contract.start and contract.end
where
inc.inc_date between '2016-01-01' and '2016-12-31'
group by
inc.cust_id,
contract.con_id
The question is a little vague, so you might need to adjust this query:
select
Calls = count(*)
, Cust = i.Cust_id
, Contract = c.con_id
, Serv_Time = sum(Serv_Time)
from inc as i
inner join contract as c
on i.item_id = c.item_id
and i.inc_date >= c.[start]
and i.inc_date <= c.[end]
where c.[start]>='20160101'
group by i.Cust_id, c.con_id
order by i.Cust_Id, c.con_id
returns:
+-------+---------+----------+-----------+
| Calls | Cust | Contract | Serv_Time |
+-------+---------+----------+-----------+
| 1 | Amir | MS2016 | 300 |
| 2 | John | HP2016 | 180 |
| 1 | Kerlee | OR2016 | 40 |
| 1 | Nick | CIS2016 | 120 |
| 2 | samanta | EMC2016 | 200 |
+-------+---------+----------+-----------+
test setup: http://rextester.com/WSYDL43321
create table inc(
inc_id int
, cust_id varchar(16)
, item_id varchar(16)
, serv_time int
, inc_date date
);
insert into inc values
(1,'john','HP', 40 ,'17-Apr-2015')
,(2,'John','HP', 60 ,'10-Jan-2016')
,(3,'Nick','Cisco', 120 ,'11-Jan-2016')
,(4,'samanta','EMC', 180 ,'12-Jan-2016')
,(5,'Kerlee','Oracle', 40 ,'13-Jan-2016')
,(6,'Amir','Microsoft', 300 ,'14-Jan-2016')
,(7,'John','HP', 120 ,'15-Jan-2016')
,(8,'samanta','EMC', 20 ,'16-Jan-2016')
,(9,'Kerlee','Oracle', 10 ,'02-Feb-2017');
create table contract (
item_id varchar(16)
, con_id varchar(16)
, [Start] date
, [End] date
);
insert into contract values
('Dell','DE2015','20150101','20151231')
,('HP','HP2015','20150101','20151231')
,('Cisco','CIS2016','20160101','20161231')
,('EMC','EMC2016','20160101','20161231')
,('HP','HP2016','20160101','20161231')
,('Oracle','OR2016','20160101','20161231')
,('Microsoft','MS2016','20160101','20161231')
,('Microsoft','MS2017','20170101','20171231');
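The effect of the extra date condition in the join can be reproduced outside SQL Server. The sketch below uses Python's sqlite3 module with the data from the question; the dates are stored as ISO text and the contract columns are renamed start_date and end_date, both of which are assumptions made to keep the example portable. The function name calls_per_contract is likewise made up.

```python
import sqlite3

def calls_per_contract(match_dates=True):
    """Aggregate 2016 incidents per customer and contract."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE inc (inc_id INTEGER, cust_id TEXT, item_id TEXT,
                          serv_time INTEGER, inc_date TEXT);
        CREATE TABLE contract (item_id TEXT, con_id TEXT,
                               start_date TEXT, end_date TEXT);
        INSERT INTO inc VALUES
            (1, 'john', 'HP', 40, '2015-04-17'),
            (2, 'John', 'HP', 60, '2016-01-10'),
            (3, 'Nick', 'Cisco', 120, '2016-01-11'),
            (4, 'samanta', 'EMC', 180, '2016-01-12'),
            (5, 'Kerlee', 'Oracle', 40, '2016-01-13'),
            (6, 'Amir', 'Microsoft', 300, '2016-01-14'),
            (7, 'John', 'HP', 120, '2016-01-15'),
            (8, 'samanta', 'EMC', 20, '2016-01-16'),
            (9, 'Kerlee', 'Oracle', 10, '2017-02-02');
        INSERT INTO contract VALUES
            ('Dell', 'DE2015', '2015-01-01', '2015-12-31'),
            ('HP', 'HP2015', '2015-01-01', '2015-12-31'),
            ('Cisco', 'CIS2016', '2016-01-01', '2016-12-31'),
            ('EMC', 'EMC2016', '2016-01-01', '2016-12-31'),
            ('HP', 'HP2016', '2016-01-01', '2016-12-31'),
            ('Oracle', 'OR2016', '2016-01-01', '2016-12-31'),
            ('Microsoft', 'MS2016', '2016-01-01', '2016-12-31'),
            ('Microsoft', 'MS2017', '2017-01-01', '2017-12-31');
    """)
    # ISO-formatted text dates compare correctly as plain strings.
    date_cond = (
        "AND i.inc_date BETWEEN c.start_date AND c.end_date" if match_dates else ""
    )
    rows = conn.execute(
        "SELECT COUNT(*), i.cust_id, c.con_id, SUM(i.serv_time) "
        "FROM inc i JOIN contract c ON i.item_id = c.item_id " + date_cond + " "
        "WHERE i.inc_date BETWEEN '2016-01-01' AND '2016-12-31' "
        "GROUP BY i.cust_id, c.con_id ORDER BY i.cust_id, c.con_id"
    ).fetchall()
    conn.close()
    return rows

for row in calls_per_contract():
    print(row)
```

With the date condition the query returns the five rows shown above. Calling calls_per_contract(match_dates=False) returns seven rows, because John's incidents also pair with HP2015 and Amir's with MS2017.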

SQL Combining two tables with one same column without losing values of a different one

I've been struggling with this for a while. I have two different tables that share a column, but they have different numbers of rows.
One of the tables is for money requests (table 1) and the other one is for proving the expenses (table 2)
Table 1
+-----------+-----------+
|expenseid | requestid |
+-----------+-----------+
| 16333 | 7454 |
| NULL | 7455 |
| 16336 | 7456 |
| 16338 | 7457 |
| NULL | 7458 |
| 16341 | 7459 |
| 16345 | 7460 |
| NULL | 7461 |
| NULL | 7462 |
+-----------+-----------+
Table 2
+-----------+-----------+
|expenseid | amount |
+-----------+-----------+
| 16333 | 200 |
| 16334 | 150 |
| 16335 | 300 |
| 16336 | 900 |
| 16337 | 100 |
| 16338 | 120 |
| 16339 | 700 |
| 16340 | 431 |
| 16341 | 420 |
| 16342 | 150 |
| 16343 | 240 |
| 16344 | 465 |
| 16345 | 200 |
| 16346 | 120 |
| 16347 | 90 |
| 16348 | 50 |
| 16349 | 245 |
+-----------+-----------+
As you can see, the tables share the column 'expenseid', but they have different numbers of rows, and each has a column with no counterpart in the other. I would like to get a table like this:
Combined table
+-----------+-----------+-----------+
|expenseid | amount | requestid |
+-----------+-----------+-----------+
| 16333 | 200 | 7454 |
| NULL | NULL | 7455 |
| 16334 | 150 | NULL |
| 16335 | 300 | NULL |
| 16336 | 900 | 7456 |
| 16337 | 100 | NULL |
| 16338 | 120 | 7457 |
| NULL | NULL | 7458 |
| 16339 | 700 | NULL |
| 16340 | 431 | NULL |
| 16341 | 420 | 7459 |
| 16342 | 150 | NULL |
| 16343 | 240 | NULL |
| 16344 | 465 | NULL |
| 16345 | 200 | 7460 |
| NULL | NULL | 7461 |
| NULL | NULL | 7462 |
| 16346 | 120 | NULL |
| 16347 | 90 | NULL |
| 16348 | 50 | NULL |
| 16349 | 245 | NULL |
+-----------+-----------+-----------+
I've managed to merge the two tables with a LEFT OUTER JOIN so that the rows from table 1 with a NULL expenseid appear, but the rows where requestid should be NULL are missing. Any ideas on how to do this?
You need a FULL OUTER JOIN instead of a LEFT OUTER JOIN.
SELECT
COALESCE(Table1.expenseid, Table2.expenseid) AS expenseid,
amount,
requestid
FROM Table1
FULL OUTER JOIN Table2
ON Table1.expenseid = Table2.expenseid
Results:
EXPENSEID AMOUNT REQUESTID
16333 200 7454
(null) (null) 7455
16336 900 7456
16338 120 7457
...etc...
See it working online: sqlfiddle
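Not every engine has FULL OUTER JOIN (MySQL, for example, does not). A common substitute is a LEFT JOIN followed by UNION ALL with the rows of the second table that found no match. The sketch below shows that substitute, not the FULL OUTER JOIN itself, using Python's sqlite3 module and the first few rows of each table; the lowercase table names table1 and table2 are placeholders.

```python
import sqlite3

def combined_rows():
    """Emulate a full outer join with LEFT JOIN + UNION ALL of unmatched rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (expenseid INTEGER, requestid INTEGER);
        CREATE TABLE table2 (expenseid INTEGER, amount INTEGER);
        INSERT INTO table1 VALUES (16333, 7454), (NULL, 7455), (16336, 7456);
        INSERT INTO table2 VALUES (16333, 200), (16334, 150),
                                  (16335, 300), (16336, 900);
    """)
    rows = conn.execute("""
        -- Every request, with its expense amount when one exists.
        SELECT t1.expenseid, t2.amount, t1.requestid
        FROM table1 t1
        LEFT JOIN table2 t2 ON t1.expenseid = t2.expenseid
        UNION ALL
        -- Expenses that no request refers to.
        SELECT t2.expenseid, t2.amount, NULL
        FROM table2 t2
        WHERE NOT EXISTS (
            SELECT 1 FROM table1 t1 WHERE t1.expenseid = t2.expenseid
        )
    """).fetchall()
    conn.close()
    return rows

for row in combined_rows():
    print(row)
```

The result has five rows: the three requests (one of them with NULL expenseid and amount) plus expenses 16334 and 16335 with a NULL requestid. Row order is not guaranteed without an ORDER BY.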