How do I do multiple joins on two database tables

The goal is to match up the rows that share the same values (the duplicates) in three columns: email, timestamp and daystamp.
I have created one join statement:
SELECT history.email, history.timestamp, payment.timestamp,
history.daystamp, payment.daystamp
FROM history
FULL OUTER JOIN payment ON history.email = payment.email
ORDER BY history.email;
I have all the unique email addresses. How do I do the same for the timestamp and daystamp?
Can I do three outer joins in one statement?

Here are two methods you may be able to adapt to your specific problem, although it's a little unclear without sample data.
Method 1:
SELECT A.column2
, B.column2
, C.column2
FROM
(SELECT month, column2 FROM table1) A
FULL OUTER JOIN
(SELECT month, column2 FROM table2) B ON A.month = B.month
FULL OUTER JOIN
-- COALESCE lets C also match months that exist in B but not in A
(SELECT month, column2 FROM table3) C ON COALESCE(A.month, B.month) = C.month
Method 2:
select
A.column2,
B.column2,
C.column2
from (
select distinct month from table1
union
select distinct month from table2
union
select distinct month from table3
) as X
left outer join table1 as A on A.month = X.month
left outer join table2 as B on B.month = X.month
left outer join table3 as C on C.month = X.month
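
Method 2 can be adapted to the history and payment tables from the question by treating the combination of email, timestamp and daystamp as the key. The following is only a sketch: it runs the SQL through Python's built-in sqlite3 module so that it is self-contained, the sample rows are invented for illustration, and the in_history / in_payment flag columns are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (email TEXT, timestamp TEXT, daystamp INTEGER);
    CREATE TABLE payment (email TEXT, timestamp TEXT, daystamp INTEGER);
    -- Hypothetical sample rows, not taken from the question
    INSERT INTO history VALUES
        ('a@mail.not', '2022-02-22 22:22:21', 20220222),
        ('b@mail.not', '2022-02-22 22:22:22', 20220222);
    INSERT INTO payment VALUES
        ('a@mail.not', '2022-02-22 22:22:21', 20220222),
        ('b@mail.not', '2022-02-22 22:28:22', 20220222);
""")

query = """
    SELECT k.email, k.timestamp, k.daystamp,
           h.email IS NOT NULL AS in_history,
           p.email IS NOT NULL AS in_payment
    FROM (
        -- every distinct (email, timestamp, daystamp) found in either table
        SELECT email, timestamp, daystamp FROM history
        UNION
        SELECT email, timestamp, daystamp FROM payment
    ) AS k
    LEFT JOIN history AS h
           ON h.email = k.email AND h.timestamp = k.timestamp AND h.daystamp = k.daystamp
    LEFT JOIN payment AS p
           ON p.email = k.email AND p.timestamp = k.timestamp AND p.daystamp = k.daystamp
    ORDER BY k.email, k.timestamp
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Rows where both flags are 1 exist in both tables. Note that `=` never matches NULL, so nullable key columns would need extra handling.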

Something like this?
SELECT
case when p.payment_id is not null then 'p' else 'h' end as tbl
, coalesce(h.email, p.email) as email
, coalesce(h.timestamp, p.timestamp) as timestamp
, coalesce(h.daystamp, p.daystamp) as daystamp
FROM history h
FULL JOIN payment p
ON h.email = p.email
AND h.timestamp = p.timestamp
AND h.daystamp is not distinct from p.daystamp
WHERE (h.history_id is null or p.payment_id is null)
ORDER BY coalesce(h.email, p.email);
tbl | email | timestamp | daystamp
:-- | :------------- | :------------------ | -------:
h | test2#mail.not | 2022-02-22 22:22:22 | 20220222
p | test2#mail.not | 2022-02-22 22:28:22 | 20220222
h | test3#mail.not | 2022-02-22 22:22:23 | 20220222
p | test4#mail.not | 2022-02-22 22:22:24 | 20220222
Test on db<>fiddle here

Related

find the max for each value in SQL

I have tables like this:
table 1
|cl.1|
| -- |
| a |
| b |
| c |
table 2
|cl.1|cl.2|para|
|----|---| --- |
| a | 3 | t |
| a | 3 | f |
| b | 2 | t |
| a | 1 | b |
| c | 4 | t |
| b | 7 | d |
I want to get the max value for each element of table 1 from table 2,
along with each distinct parameter that has that value,
so the expected table should look like this:
|cl.1|max|para|
|----|---| --- |
| a | 3 | t |
| a | 3 | f |
| c | 4 | t |
| b | 7 | d |

You can try to compute all the maximums:
with Maxes as (
select cl1,
max(cl2) as cl2
from Table2
group by cl1)
and then join them with the original Table2, e.g.
with Maxes as (
select cl1,
max(cl2) as cl2
from Table2
group by cl1)
select t.*
from Table2 t join
Maxes m on (t.cl1 = m.cl1 and t.cl2 = m.cl2)
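
As a quick check, the CTE-plus-join query can be run against the sample data from the question. This is a minimal sketch using Python's built-in sqlite3 module; the column names cl1, cl2 and para follow the answer above, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table2 (cl1 TEXT, cl2 INTEGER, para TEXT);
    INSERT INTO Table2 VALUES
        ('a', 3, 't'), ('a', 3, 'f'), ('b', 2, 't'),
        ('a', 1, 'b'), ('c', 4, 't'), ('b', 7, 'd');
""")

rows = conn.execute("""
    WITH Maxes AS (
        SELECT cl1, MAX(cl2) AS cl2
        FROM Table2
        GROUP BY cl1
    )
    SELECT t.cl1, t.cl2, t.para
    FROM Table2 t
    JOIN Maxes m ON t.cl1 = m.cl1 AND t.cl2 = m.cl2
    ORDER BY t.cl1, t.para
""").fetchall()

for row in rows:
    print(row)
```

Both rows for 'a' come back because they tie on the maximum value 3.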

Depends on what features your RDBMS supports.
With Oracle you could do a CROSS APPLY to order table2 by descending cl2 and keep the top values (with ties):
select T1.c1, TM.maximum, TM.para
from Table1 T1
cross apply (
select *
from Table2 T2
where T2.c1 = T1.c1
order by T2.maximum desc
fetch first 1 row with ties
) TM
You can do the same in SQL Server with syntax select top 1 with ties instead of fetch first 1 row with ties.
Another option is to use analytic functions to rank the rows per c1 and then keep only the top-ranked ones.
select T.c1, T.maximum, T.para
from (
select
T1.c1, T2.maximum, T2.para,
rank() over (partition by T1.c1 order by T2.maximum desc) r
from T1
join T2 on T1.c1 = T2.c1
) T
where T.r = 1
Less stylish and probably(?) less performant would be computing the maximum for each c1 and then doing an equality:
select T1.c1, T2.maximum, T2.para
from T1
join T2 on T1.c1 = T2.c1
where T2.maximum = (select max(maximum) from T2 where c1 = T1.c1)

If you want the rows that hold the maximum cl_2 for each cl_1, ties included, you could try:
SELECT *
FROM table2
WHERE (cl_1, cl_2) in ( SELECT cl_1, MAX(cl_2)
FROM table2
group by cl_1
);
Result:
cl_1 cl_2 para
a 3 t
a 3 f
c 4 t
b 7 d
Tested on MySQL : https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=42a6bc20622a210b18101588540995ec
You could also join table1, but it makes no difference to the result:
SELECT t1.cl_1,t2.cl_2,t2.para
FROM table2 t2
INNER JOIN table1 t1 on t2.cl_1=t1.cl_1
WHERE (t2.cl_1, t2.cl_2) in (SELECT cl_1, MAX(cl_2) FROM table2 group by cl_1 );
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=4b2eed9bcee3532cc7c4e7b3862bc3ef
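
One caveat with filtering through `IN (SELECT MAX(...) ... GROUP BY ...)`: if the subquery returns only the maxima and not the group key, a row can match the maximum of a different group. The sketch below, run through Python's built-in sqlite3 module, compares a value-only filter with a key-and-value filter. The extra row ('b', 3, 'x') is hypothetical and was added only to trigger the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (cl_1 TEXT, cl_2 INTEGER, para TEXT);
    -- The last row is a hypothetical addition: 3 is the max of 'a', not of 'b'
    INSERT INTO table2 VALUES
        ('a', 3, 't'), ('a', 3, 'f'), ('b', 2, 't'),
        ('a', 1, 'b'), ('c', 4, 't'), ('b', 7, 'd'),
        ('b', 3, 'x');
""")

# Compares cl_2 against the set of all group maxima, regardless of group
value_only = conn.execute("""
    SELECT cl_1, cl_2, para FROM table2
    WHERE cl_2 IN (SELECT MAX(cl_2) FROM table2 GROUP BY cl_1)
""").fetchall()

# Compares the (group key, value) pair, so each row is checked against its own group
key_and_value = conn.execute("""
    SELECT cl_1, cl_2, para FROM table2
    WHERE (cl_1, cl_2) IN (SELECT cl_1, MAX(cl_2) FROM table2 GROUP BY cl_1)
""").fetchall()

print(len(value_only), len(key_and_value))
```

The value-only filter lets ('b', 3, 'x') through even though the maximum for 'b' is 7, while the pairwise filter rejects it.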

DENSE_RANK can be used to get whole rows that hold the maximum of something within a partition,
because when sorted descending the top value gets rank 1, ties included.
select cl_1, cl_2, para
from
(
select cl_1, cl_2, para
, dense_rank() over (partition by cl_1 order by cl_2 desc) as rnk
from table1 t1
join table2 t2 using (cl_1)
) q
where rnk = 1

Use a CTE to get the max values, then select the rows with those values:
with maxes as
(
select t1.[cl.1]
, max(t2.[cl.2]) max_val
from table1 t1
inner join table2 t2
on t1.[cl.1] = t2.[cl.1]
group by t1.[cl.1]
)
select t1.[cl.1]
, t2.[cl.2]
, t2.para
from table1 t1
inner join table2 t2
on t1.[cl.1] = t2.[cl.1]
where t2.[cl.2] = (select m.max_val from maxes m where m.[cl.1] = t1.[cl.1])
This can also be achieved by joining the CTE:
with maxes as
(
select t1.[cl.1]
, max(t2.[cl.2]) max_val
from table1 t1
inner join table2 t2
on t1.[cl.1] = t2.[cl.1]
group by t1.[cl.1]
)
select t1.[cl.1]
, t2.[cl.2]
, t2.para
from table1 t1
inner join table2 t2
on t1.[cl.1] = t2.[cl.1]
inner join maxes m
on m.[cl.1] = t1.[cl.1]
and t2.[cl.2] = m.max_val

SQL DATEDIFF Returns

I don't want to show my whole query as it is very specific but I will try to explain briefly.
The following query works perfectly and I got 6000 records as a result.
SELECT
DISTINCT ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P
--AND SO ON......
FROM A
INNER JOIN A ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
However, when I add a DATEDIFF calculation I get only 100 rows in the result:
SELECT DISTINCT
ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P,
DATEDIFF(dd,A.ADATE,A.BDATE)
FROM A
INNER JOIN A ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
I am expecting 6000 rows from a correct query using DATEDIFF, in line with what the following query returns:
SELECT
DISTINCT *,
DATEDIFF(dd,A.ADATE,A.BDATE)
FROM A
INNER JOIN A ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
But I do not need all of the columns, just the selected ones plus the DATEDIFF, and combining the queries above did not work for a reason I cannot see. Can anyone see why I am not getting the expected row count in my second query?

You can wrap your query in a CTE and perform the DATEDIFF on the result set it returns (the CTE's select list must also include ADATE and BDATE for the outer query to see them):
WITH DISTINCT_CTE AS (
SELECT
DISTINCT ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P
--AND SO ON......
FROM A
INNER JOIN A ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
)
SELECT *, DATEDIFF(dd, ADATE, BDATE)
FROM DISTINCT_CTE

You could try a subquery... first subquery the entire thing.
In the receiving query do another subquery for the DATEDIFF. For this subquery you need the primary key to get back to the correct row for the dates, which if I'm interpreting correctly is A.ID.
SELECT dT.ID, dT.Name, dT.A_Whatever
,(SELECT DATEDIFF(dd, A2.ADATE, A2.BDATE)
FROM A AS A2
WHERE dT.ID = A2.ID --the primary key
) AS [DateDiff]
--AND SO ON........
FROM (
SELECT DISTINCT ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P
--AND SO ON......
FROM A
INNER JOIN A ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
) AS dT

You can try the following query:
SELECT T1.*, T2.DateDiff
FROM (
SELECT DISTINCT ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P
--AND SO ON......
FROM A
INNER JOIN B ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
) AS T1
JOIN (SELECT ID, DATEDIFF(dd, ADATE, BDATE) as DateDiff from A as A2) AS T2
ON T1.ID= T2.ID

As the following demonstrates, if you are using select distinct and you remove the date columns from the select list when introducing datediff(), that could be the cause of the change in the number of rows returned. Note in query 1 that as long as adate or bdate is displayed, 5 rows are returned, but without them (query 2) you get just one row. Alternatively, if you removed distinct from query 2 you would get all 5 rows, each showing only the repeated values (this isn't shown below).
try it out at SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Sample
([ADATE] datetime, [BDATE] datetime)
;
INSERT INTO Sample
([ADATE], [BDATE])
VALUES
('2017-10-01 00:00:00', '2017-10-06 00:00:00'),
('2017-10-02 00:00:00', '2017-10-07 00:00:00'),
('2017-10-03 00:00:00', '2017-10-08 00:00:00'),
('2017-10-04 00:00:00', '2017-10-09 00:00:00'),
('2017-10-05 00:00:00', '2017-10-10 00:00:00')
;
Query 1:
select distinct 'q1' qry, adate, bdate, datediff(day,adate,bdate) days_diff
from sample
order by adate
Results:
| qry | adate | bdate | days_diff |
|-----|----------------------|----------------------|-----------|
| q1 | 2017-10-01T00:00:00Z | 2017-10-06T00:00:00Z | 5 |
| q1 | 2017-10-02T00:00:00Z | 2017-10-07T00:00:00Z | 5 |
| q1 | 2017-10-03T00:00:00Z | 2017-10-08T00:00:00Z | 5 |
| q1 | 2017-10-04T00:00:00Z | 2017-10-09T00:00:00Z | 5 |
| q1 | 2017-10-05T00:00:00Z | 2017-10-10T00:00:00Z | 5 |
Query 2:
select distinct 'q2' qry, datediff(day,adate,bdate) days_diff
from sample
Results:
| qry | days_diff |
|-----|-----------|
| q2 | 5 |
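
The same effect can be reproduced without SQL Server. The sketch below uses Python's built-in sqlite3 module; SQLite has no DATEDIFF, so the difference of two julianday() values stands in for it here (an assumption of this sketch, not part of the original answer). The third query covers the case that the answer mentions but does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sample (ADATE TEXT, BDATE TEXT);
    INSERT INTO Sample VALUES
        ('2017-10-01', '2017-10-06'),
        ('2017-10-02', '2017-10-07'),
        ('2017-10-03', '2017-10-08'),
        ('2017-10-04', '2017-10-09'),
        ('2017-10-05', '2017-10-10');
""")

# Stand-in for DATEDIFF(day, ADATE, BDATE)
days = "CAST(julianday(BDATE) - julianday(ADATE) AS INTEGER)"

# DISTINCT over the dates plus the difference: every row is still unique
q1 = conn.execute(
    f"SELECT DISTINCT ADATE, BDATE, {days} AS days_diff FROM Sample"
).fetchall()

# DISTINCT over the difference alone: identical values collapse into one row
q2 = conn.execute(f"SELECT DISTINCT {days} AS days_diff FROM Sample").fetchall()

# No DISTINCT: all rows come back, each with the same difference
q3 = conn.execute(f"SELECT {days} AS days_diff FROM Sample").fetchall()

print(len(q1), len(q2), len(q3))
```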

Select Missing date only if it is Maximum from a table in Oracle

I have 2 tables. If table 1 has dates greater than those in table 2, only those records should be populated in the output.
Table 1:
ID Category Date
1 A 3/2/1990
1 A 3/5/2013
1 C 4/3/1979
2 D 4/3/1970
2 D 5/6/2016
3 E 8/8/2016
Table 2:
ID Category Date
1 A 3/2/1990
1 C 4/3/1979
1 C 4/3/1982
1 D 4/3/1982
2 D 5/6/2016
The expected Output is
ID Category Date
1 A 3/5/2013
3 E 8/8/2016
I tried the query below and it's giving me incorrect results.
select a.id,a.category,a.Date from table1 a where
a.Date > (select Max(b.Date) from table2 b where a.id=b.id and a.category =b.category group by b.id,b.category)

SQL Fiddle Demo
WITH cte AS (
SELECT ID, Category, MAX(Date) as mdate
FROM Table2
GROUP BY ID, Category
)
SELECT T1.* --, T2.*
FROM Table1 as T1
LEFT JOIN cte as T2
ON T1.ID = T2.ID
AND T1.Category = T2.Category
WHERE T1.Date > T2.mdate
OR T2.mdate is NULL

SELECT T1.*
FROM Table1 AS T1 INNER JOIN Table2 AS T2
ON T1.ID = T2.ID
WHERE T1.Date > T2.mdate;

As per the required output, you need to use left outer join
SELECT T1.*
FROM table1 T1
LEFT OUTER JOIN (
SELECT ID
,category
,MAX(Date) mdate
FROM Table2
GROUP BY ID
,category
) T2 ON (
T1.ID = T2.ID
AND T1.category = T2.category
)
WHERE T1.date > nvl(T2.mdate, DATE '1900-01-01');
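
To check the left-join approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. It assumes the dates in the question are month/day/year and stores them as ISO strings so that they compare correctly as text. An explicit IS NULL test takes the place of nvl().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, Category TEXT, Date TEXT);
    CREATE TABLE Table2 (ID INTEGER, Category TEXT, Date TEXT);
    INSERT INTO Table1 VALUES
        (1, 'A', '1990-03-02'), (1, 'A', '2013-03-05'), (1, 'C', '1979-04-03'),
        (2, 'D', '1970-04-03'), (2, 'D', '2016-05-06'), (3, 'E', '2016-08-08');
    INSERT INTO Table2 VALUES
        (1, 'A', '1990-03-02'), (1, 'C', '1979-04-03'), (1, 'C', '1982-04-03'),
        (1, 'D', '1982-04-03'), (2, 'D', '2016-05-06');
""")

rows = conn.execute("""
    SELECT T1.ID, T1.Category, T1.Date
    FROM Table1 T1
    LEFT JOIN (
        SELECT ID, Category, MAX(Date) AS mdate
        FROM Table2
        GROUP BY ID, Category
    ) T2 ON T1.ID = T2.ID AND T1.Category = T2.Category
    -- keep rows with no counterpart in Table2, or with a strictly later date
    WHERE T2.mdate IS NULL OR T1.Date > T2.mdate
    ORDER BY T1.ID
""").fetchall()

for row in rows:
    print(row)
```

The row (3, 'E') survives only because of the IS NULL branch; without it, the comparison against a NULL mdate would filter it out.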

Filtering Table2:
SELECT ID, Category,MAX(Date) as Date
FROM Table2
GROUP BY ID,Category;
| ID | Category | Date |
|----|----------|-------------------------|
| 1 | A | March, 02 1990 00:00:00 |
| 1 | C | April, 03 1982 00:00:00 |
| 1 | D | April, 03 1982 00:00:00 |
| 2 | D | May, 06 2016 00:00:00 |
Now using this to create a left join with Table1:
SELECT t1.*
FROM Table1 t1 LEFT JOIN
(SELECT ID, Category,MAX(Date) as Date
FROM Table2
GROUP BY ID,Category) AS t2part
ON t1.ID = t2part.ID
AND t1.Category = t2part.Category
WHERE t1.Date > t2part.Date;
| ID | Category | Date |
|----|----------|-------------------------|
| 1 | A | March, 05 2013 00:00:00 |
Please note that the row with ID=3, Category=E wasn't returned: it matches neither an ID nor a Category in the join, so t2part.Date is NULL for it and the comparison in the WHERE clause filters it out.
As good practice, if the entities are meant to interact, some normalization should be applied so that joins can make the best use of indexes.
fiddle with your provided data and queries.

Why my query with union operator returns only fields from first select?

I need to get all values from the table and, alongside them, the values restricted by a 'not null' condition.
So I wrote two SELECT statements:
select oa.dept_id, COUNT(oa.id) quantity, sum(oa.premium) 'sum'
from Table1 oa
Left Join Table2 od On od.id = oa.dept_id
group by oa.dept_id
Union all
select oa1.dept_id, COUNT(oa1.id) quantity1, sum(oa1.premium) 'sum1'
from Table1 oa1
Left Join Table2 od1 On od1.id = oa1.dept_id
where oa1.action is not null
group by oa1.dept_id
I expect a result like this, with 70 rows:
-----------------------------------------------
| dept.id | quantity | sum | quantity1 | sum1 |
-----------------------------------------------
I got a result like this, with 130 rows:
----------------------------
| dept.id | quantity | sum |
----------------------------

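UNION ALL stacks the rows of the second query under those of the first, so it can add rows but never columns. To get the values side by side, join the two aggregates on dept_id: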
select a.dept_id, quantity, sum,quantity1,sum1
from
(
select oa.dept_id, COUNT(oa.id) quantity, sum(oa.premium) 'sum'
from Table1 oa
Left Join Table2 od On od.id = oa.dept_id
group by oa.dept_id
) a
join
(
select oa1.dept_id, COUNT(oa1.id) quantity1, sum(oa1.premium) 'sum1'
from Table1 oa1
Left Join Table2 od1 On od1.id = oa1.dept_id
where oa1.action is not null
group by oa1.dept_id
) b
on a.dept_id=b.dept_id
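
The difference between stacking and joining is easy to see on a small data set. The sketch below uses Python's built-in sqlite3 module; the rows are invented, the join to Table2 is left out because it does not affect the aggregates here, and the aliases total and total1 are hypothetical replacements for 'sum' and 'sum1'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, dept_id INTEGER, premium INTEGER, action TEXT);
    -- Hypothetical sample rows
    INSERT INTO Table1 VALUES
        (1, 10, 100, 'x'),
        (2, 10,  50, NULL),
        (3, 20,  70, 'y'),
        (4, 30,  20, NULL);
""")

all_rows = ("SELECT dept_id, COUNT(id) AS quantity, SUM(premium) AS total "
            "FROM Table1 GROUP BY dept_id")
with_action = ("SELECT dept_id, COUNT(id) AS quantity1, SUM(premium) AS total1 "
               "FROM Table1 WHERE action IS NOT NULL GROUP BY dept_id")

# UNION ALL: same three columns, rows of the second query appended
stacked = conn.execute(f"{all_rows} UNION ALL {with_action}")
stacked_columns = [c[0] for c in stacked.description]
stacked_rows = stacked.fetchall()

# JOIN: one row per department, five columns side by side
joined = conn.execute(f"""
    SELECT a.dept_id, a.quantity, a.total, b.quantity1, b.total1
    FROM ({all_rows}) a
    JOIN ({with_action}) b ON a.dept_id = b.dept_id
    ORDER BY a.dept_id
""")
joined_columns = [c[0] for c in joined.description]
joined_rows = joined.fetchall()

print(stacked_columns, len(stacked_rows))
print(joined_rows)
```

Department 30 has no row with a non-null action, so the inner join drops it. Switching to a LEFT JOIN would keep it, with NULL in the last two columns.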

SQL aggregation query, grouping by entries in junction table

I have TableA in a many-to-many relationship with TableC via TableB. That is,
TableA TableB TableC
id | val fkeyA | fkeyC id | data
I wish to do select sum(val) on TableA, grouping by the relationship(s) to TableC. Every entry in TableA has at least one relationship with TableC. For example,
TableA
1 | 25
2 | 30
3 | 50
TableB
1 | 1
1 | 2
2 | 1
2 | 2
2 | 3
3 | 1
3 | 2
should output
75
30
since rows 1 and 3 in TableA have the same relationships to TableC, but row 2 in TableA has a different relationship to TableC.
How can I write a SQL query for this?

SELECT
sum(tableA.val) as sumVal,
tableC.data
FROM
tableA
inner join tableB ON tableA.id = tableB.fkeyA
INNER JOIN tableC ON tableB.fkeyC = tableC.id
GROUP by tableC.data
edit
Ah ha - I now see what you're getting at. Let me try again:
SELECT
sum(val) as sumVal,
tableCGroup
FROM
(
SELECT
tableA.val,
(
SELECT cast(tableB.fkeyC as varchar) + ','
FROM tableB WHERE tableB.fKeyA = tableA.id
ORDER BY tableB.fkeyC
FOR XML PATH('')
) as tableCGroup
FROM
tableA
) tmp
GROUP BY
tableCGroup

Hm, in MySQL it could be written like this:
SELECT
SUM(val) AS sumVal
FROM
( SELECT
fkeyA
, GROUP_CONCAT(fkeyC ORDER BY fkeyC) AS grpC
FROM
TableB
GROUP BY
fkeyA
) AS g
JOIN
TableA a
ON a.id = g.fkeyA
GROUP BY
grpC
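
The same idea can be tried on the sample data with Python's built-in sqlite3 module, which also provides group_concat. This is a sketch under one assumption: older SQLite versions have no ORDER BY inside the aggregate, so the rows are pre-sorted in a subquery, which in practice yields sorted key lists but is not formally guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER, val INTEGER);
    CREATE TABLE TableB (fkeyA INTEGER, fkeyC INTEGER);
    INSERT INTO TableA VALUES (1, 25), (2, 30), (3, 50);
    INSERT INTO TableB VALUES
        (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2);
""")

rows = conn.execute("""
    SELECT g.grpC, SUM(a.val) AS sumVal
    FROM (
        -- one key list per TableA row, e.g. '1,2' or '1,2,3'
        SELECT fkeyA, group_concat(fkeyC) AS grpC
        FROM (SELECT fkeyA, fkeyC FROM TableB ORDER BY fkeyA, fkeyC)
        GROUP BY fkeyA
    ) AS g
    JOIN TableA a ON a.id = g.fkeyA
    GROUP BY g.grpC
    ORDER BY sumVal DESC
""").fetchall()

for row in rows:
    print(row)
```

Rows 1 and 3 of TableA share the same key list, so their values are summed together, while row 2 forms its own group.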

SELECT sum(a.val)
FROM tablea a
INNER JOIN tableb b ON (b.fKeyA = a.id)
GROUP BY b.fKeyC

It seems that a key_list needs to be created in order to allow the group by:
75 -> key list = "1 2"
30 -> key list = "1 2 3"
Because GROUP_CONCAT doesn't exist in T-SQL:
WITH CTE ( Id, key_list )
AS ( SELECT TableA.id, CAST( '' AS VARCHAR(8000) )
FROM TableA
GROUP BY TableA.id
UNION ALL
SELECT TableA.id, CAST( key_list + ' ' + str(TableB.id) AS VARCHAR(8000) )
FROM CTE c
INNER JOIN TableA A
ON c.Id = A.id
INNER join TableB B
ON B.Id = A.id
WHERE A.id > c.id --avoid infinite loop
)
Select
sum( val )
from
TableA inner join
CTE on (tableA.id = CTE.id)
group by
CTE.key_list
