SQL DATEDIFF Returns - sql

I don't want to show my whole query as it is very specific but I will try to explain briefly.
The following query works perfectly and I got 6000 records as a result.
SELECT
DISTINCT ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P
--AND SO ON......
FROM A
INNER JOIN B ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
However, when I add a DATEDIFF calculation, I get only 100 rows in the result:
SELECT DISTINCT
ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P,
DATEDIFF(dd,A.ADATE,A.BDATE)
FROM A
INNER JOIN B ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
I am expecting 6000 rows from a correct query using DATEDIFF, in line with what the following query returns:
SELECT
DISTINCT *,
DATEDIFF(dd,A.ADATE,A.BDATE)
FROM A
INNER JOIN B ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
But I do not need all of the columns; I need just the selected ones plus the DATEDIFF. Combining the queries above did not work, for a reason I do not know. Can anyone see why I am not getting the expected row count in my second query?

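You can wrap your query in a CTE and perform the DATEDIFF on the result set returned by the CTE. Note that the CTE's select list must also include ADATE and BDATE for the outer query to be able to reference them: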
WITH DISTINCT_CTE AS (
SELECT
DISTINCT ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P
--AND SO ON......
FROM A
INNER JOIN B ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
)
SELECT *, DATEDIFF(dd, ADATE, BDATE)
FROM DISTINCT_CTE

You could try a subquery: first wrap the entire query in a derived table.
Then, in the outer query, use a second (correlated) subquery for the DATEDIFF. That subquery needs the primary key to get back to the correct row for the dates, which, if I'm interpreting correctly, is A.ID.
SELECT dT.ID, dT.Name, dT.A_Whatever
,(SELECT DATEDIFF(dd, A2.ADATE, A2.BDATE)
FROM A AS A2
WHERE dT.ID = A2.ID --the primary key
) AS [DateDiff]
--AND SO ON........
FROM (
SELECT DISTINCT ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P
--AND SO ON......
FROM A
INNER JOIN B ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
) AS dT

You can try the following query:
SELECT T1.*, T2.DateDiff
FROM (
SELECT DISTINCT ID,
NAME,
CASE WHEN A.ID IS NULL THEN 'NOT EX.'
ELSE A.Whatever
END AS A_Whatever,
D.Z1 AS A.P
--AND SO ON......
FROM A
INNER JOIN B ON A.ID= B.ID AND A.Nb= B.Nb
LEFT JOIN T AS T2_ID ON T2_D.Z= A.Z
LEFT JOIN L1 ON A.NR = L1.NR AND A.S = L1.S
LEFT JOIN LF ON LF.NR = L1.LNR
--AND SO ON.......
) AS T1
JOIN (SELECT ID, DATEDIFF(dd, ADATE, BDATE) as DateDiff from A as A2) AS T2
ON T1.ID= T2.ID

As the example below shows, if you use SELECT DISTINCT and remove the date columns from the select list when you introduce DATEDIFF(), that could be what changes the number of rows returned. Note that query 1 returns 5 rows as long as adate or bdate is selected, but without them (query 2) you get just one row. Alternatively, if you removed DISTINCT from query 2, you would get all 5 rows but just the one column (this isn't shown below).
try it out at SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Sample
([ADATE] datetime, [BDATE] datetime)
;
INSERT INTO Sample
([ADATE], [BDATE])
VALUES
('2017-10-01 00:00:00', '2017-10-06 00:00:00'),
('2017-10-02 00:00:00', '2017-10-07 00:00:00'),
('2017-10-03 00:00:00', '2017-10-08 00:00:00'),
('2017-10-04 00:00:00', '2017-10-09 00:00:00'),
('2017-10-05 00:00:00', '2017-10-10 00:00:00')
;
Query 1:
select distinct 'q1' qry, adate, bdate, datediff(day,adate,bdate) days_diff
from sample
order by adate
Results:
| qry | adate | bdate | days_diff |
|-----|----------------------|----------------------|-----------|
| q1 | 2017-10-01T00:00:00Z | 2017-10-06T00:00:00Z | 5 |
| q1 | 2017-10-02T00:00:00Z | 2017-10-07T00:00:00Z | 5 |
| q1 | 2017-10-03T00:00:00Z | 2017-10-08T00:00:00Z | 5 |
| q1 | 2017-10-04T00:00:00Z | 2017-10-09T00:00:00Z | 5 |
| q1 | 2017-10-05T00:00:00Z | 2017-10-10T00:00:00Z | 5 |
Query 2:
select distinct 'q2' qry, datediff(day,adate,bdate) days_diff
from sample
Results:
| qry | days_diff |
|-----|-----------|
| q2 | 5 |
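
The same effect can be reproduced outside SQL Server. The following is a minimal sketch using Python's built-in sqlite3 module; SQLite has no DATEDIFF, so the day difference is computed with julianday() as a stand-in, and the table mirrors the Sample table above. It only illustrates how DISTINCT collapses rows once the date columns are dropped from the select list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (adate TEXT, bdate TEXT)")
# Five rows, each with bdate exactly 5 days after adate (same data as above).
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [(f"2017-10-0{d}", f"2017-10-{d + 5:02d}") for d in range(1, 6)],
)

# Stand-in for DATEDIFF(day, adate, bdate); SQLite has no DATEDIFF function.
DIFF = "CAST(julianday(bdate) - julianday(adate) AS INTEGER)"

# DISTINCT over (adate, bdate, diff): every row is unique, so all 5 survive.
with_dates = conn.execute(
    f"SELECT DISTINCT adate, bdate, {DIFF} FROM sample"
).fetchall()

# DISTINCT over the diff alone: all rows share the value 5 and collapse to one.
diff_only = conn.execute(f"SELECT DISTINCT {DIFF} FROM sample").fetchall()

# Without DISTINCT the diff column still comes back once per source row.
no_distinct = conn.execute(f"SELECT {DIFF} FROM sample").fetchall()

print(len(with_dates), len(diff_only), len(no_distinct))
```

So the row count depends on which columns DISTINCT has to compare, not on DATEDIFF itself.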

Related

How do I do multiple joins on two database tables

The goal is to join all the same values (the duplicates) together. Email, timestamp and daystamp.
I have created one join statement
SELECT history.email, history.timestamp, payment.timestamp,
history.daystamp, payment.daystamp
FROM history
FULL OUTER JOIN payment ON history.email = payment.email
ORDER BY history.email;
I have all the unique email addresses. How do I do the same for the timestamp and daystamp?
Can I do three outer joins in one statement?
Here are two methods that you may be able to adapt to your specific problem, although it's a little unclear without sample data.
Method 1:
SELECT A.column2
, B.column2
, C.column2
FROM
(
(SELECT month, column2 FROM table1) A
FULL OUTER JOIN
(SELECT month, column2 FROM table2) B on A.month= B.month
FULL OUTER JOIN
(SELECT month, column2 FROM table3) C on A.month= C.month
)
Method 2:
select
A.column2,
B.column2,
C.column2
from (
select distinct month from table1
union
select distinct month from table2
union
select distinct month from table3
) as X
left outer join table1 as A on A.month = X.month
left outer join table2 as B on B.month = X.month
left outer join table3 as C on C.month = X.month
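
Method 2 can be tried end to end with a small sketch like the one below, run through Python's sqlite3 module. The history and payment tables, their ts column, and the sample rows are made up for illustration; only the pattern (UNION the keys, then LEFT JOIN each table back to that key list) comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE history (email TEXT, ts TEXT);
CREATE TABLE payment (email TEXT, ts TEXT);
INSERT INTO history VALUES ('a@x', '2022-01-01'), ('b@x', '2022-01-02');
INSERT INTO payment VALUES ('b@x', '2022-01-05'), ('c@x', '2022-01-03');
""")

# Build the full key list first, then attach each table with a LEFT JOIN.
# This gives the same shape as a FULL OUTER JOIN on email.
rows = conn.execute("""
    SELECT k.email, h.ts AS hist_ts, p.ts AS pay_ts
    FROM (SELECT email FROM history
          UNION
          SELECT email FROM payment) AS k
    LEFT JOIN history h ON h.email = k.email
    LEFT JOIN payment p ON p.email = k.email
    ORDER BY k.email
""").fetchall()

for row in rows:
    print(row)
```

Emails that exist in only one table still appear, with NULL (None in Python) in the columns of the other table.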
Something like this?
SELECT
case when p.payment_id is not null then 'p' else 'h' end as tbl
, coalesce(h.email, p.email) as email
, coalesce(h.timestamp, p.timestamp) as timestamp
, coalesce(h.daystamp, p.daystamp) as daystamp
FROM history h
FULL JOIN payment p
ON h.email = p.email
AND h.timestamp = p.timestamp
AND h.daystamp is not distinct from p.daystamp
WHERE (h.history_id is null or p.payment_id is null)
ORDER BY coalesce(h.email, p.email);
tbl | email | timestamp | daystamp
:-- | :------------- | :------------------ | -------:
h | test2#mail.not | 2022-02-22 22:22:22 | 20220222
p | test2#mail.not | 2022-02-22 22:28:22 | 20220222
h | test3#mail.not | 2022-02-22 22:22:23 | 20220222
p | test4#mail.not | 2022-02-22 22:22:24 | 20220222
Test on db<>fiddle here

Optimizing SQL query having DISTINCT keyword and functions

I have this query that generates about 40,000 records and the execution time of this query is about 1 minute 30 seconds.
SELECT DISTINCT
a.ID,
a.NAME,
a.DIV,
a.UID,
(select NAME from EMPLOYEE where UID= a.UID and UID<>'') as boss_id,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 1 and id = a.ID) as TERM1,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 2 and id = a.ID) as TERM2,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 3 and id = a.ID) as TERM3,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 4 and id = a.ID) as TERM4,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 5 and id = a.ID) as TERM5,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 6 and id = a.ID) as TERM6,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 7 and id = a.ID) as TERM7,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 8 and id = a.ID) as TERM8
FROM EMPLOYEE a
WHERE ID LIKE 'D%'
I tried using GROUP BY and different kinds of joins to improve the execution time, but without success. Both tables, ABC and XYZ, are indexed.
Also, I think that the root cause of this problem is either the DISTINCT keyword or the MAX function.
How can I optimize the above query to bring down the execution time to at least less than a minute?
Any help is appreciated.
The queries below are not tested; they are just an idea of how you could get this done in two different ways.
(SQL Server solutions here)
Using LEFT JOIN for each ID should look something like this:
SELECT a.ID,
a.NAME,
a.DIV,
a.UID,
b.Name as boss_id,
MAX(xyz1.create_time) as TERM1,
MAX(xyz2.create_time) as TERM2,
MAX(xyz3.create_time) as TERM3,
MAX(xyz4.create_time) as TERM4,
MAX(xyz5.create_time) as TERM5,
MAX(xyz6.create_time) as TERM6,
MAX(xyz7.create_time) as TERM7,
MAX(xyz8.create_time) as TERM8
FROM EMPLOYEE a
JOIN EMPLOYEE b on a.UID = b.UID and b.UID <> ''
LEFT JOIN XYZ xyz1 on a.ID = xyz1.ID and xyz1.XYZ_ID = 1
LEFT JOIN XYZ xyz2 on a.ID = xyz2.ID and xyz2.XYZ_ID = 2
LEFT JOIN XYZ xyz3 on a.ID = xyz3.ID and xyz3.XYZ_ID = 3
LEFT JOIN XYZ xyz4 on a.ID = xyz4.ID and xyz4.XYZ_ID = 4
LEFT JOIN XYZ xyz5 on a.ID = xyz5.ID and xyz5.XYZ_ID = 5
LEFT JOIN XYZ xyz6 on a.ID = xyz6.ID and xyz6.XYZ_ID = 6
LEFT JOIN XYZ xyz7 on a.ID = xyz7.ID and xyz7.XYZ_ID = 7
LEFT JOIN XYZ xyz8 on a.ID = xyz8.ID and xyz8.XYZ_ID = 8
WHERE a.ID LIKE 'D%'
GROUP BY a.ID, a.NAME, a.DIV, a.UID, b.Name
Using PIVOT would look something like this:
select * from (
SELECT DISTINCT
a.ID,
a.NAME,
a.DIV,
a.UID,
b.NAME as boss_id,
xyz.xyz_id,
xyz.create_time
FROM EMPLOYEE a
JOIN EMPLOYEE b on a.UID = b.UID and b.UID <> ''
LEFT JOIN (SELECT DATE(MAX(create_time)) create_time, XYZ_ID, ID
from XYZ
where XYZ_ID between 1 and 8
group by XYZ_ID, ID) xyz on a.ID = xyz.ID
WHERE a.ID LIKE 'D%') src
PIVOT (
max(create_time) for xyz_id IN ([1], [2], [3], [4],
[5], [6], [7], [8])
) PIV
Give it a shot
I would recommend group by and conditional aggregation:
SELECT e.ID, e.NAME, e.DIV, e.UID,
DATE(MAX(CASE WHEN XYZ_ID = 1 THEN create_time END)) as term1,
DATE(MAX(CASE WHEN XYZ_ID = 2 THEN create_time END)) as term2,
DATE(MAX(CASE WHEN XYZ_ID = 3 THEN create_time END)) as term3,
DATE(MAX(CASE WHEN XYZ_ID = 4 THEN create_time END)) as term4,
DATE(MAX(CASE WHEN XYZ_ID = 5 THEN create_time END)) as term5,
DATE(MAX(CASE WHEN XYZ_ID = 6 THEN create_time END)) as term6,
DATE(MAX(CASE WHEN XYZ_ID = 7 THEN create_time END)) as term7,
DATE(MAX(CASE WHEN XYZ_ID = 8 THEN create_time END)) as term8
FROM EMPLOYEE e LEFT JOIN
XYZ
ON xyz.ID = e.id
WHERE e.ID LIKE 'D%'
GROUP BY e.ID, e.NAME, e.DIV, e.UID;
I don't understand the logic for boss_id, so I left that out. This should improve the performance significantly.
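
To see the conditional aggregation pattern in action, here is a small sketch run through Python's sqlite3 module. The table contents are invented, only two of the eight TERM columns are shown, and SQLite's DATE() stands in for the DATE() call in the question; treat it as an illustration of the technique rather than a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id TEXT, name TEXT);
CREATE TABLE xyz (id TEXT, xyz_id INTEGER, create_time TEXT);
INSERT INTO employee VALUES ('D1', 'Ann'), ('D2', 'Bob'), ('E1', 'Cy');
INSERT INTO xyz VALUES
    ('D1', 1, '2020-01-01 10:00:00'),
    ('D1', 1, '2020-03-01 09:00:00'),
    ('D1', 2, '2020-02-01 08:00:00'),
    ('D2', 2, '2021-05-05 12:00:00');
""")

# One pass over xyz: each CASE keeps create_time only for its own xyz_id,
# so MAX() per column replaces one correlated subquery per TERM.
rows = conn.execute("""
    SELECT e.id, e.name,
           DATE(MAX(CASE WHEN x.xyz_id = 1 THEN x.create_time END)) AS term1,
           DATE(MAX(CASE WHEN x.xyz_id = 2 THEN x.create_time END)) AS term2
    FROM employee e
    LEFT JOIN xyz x ON x.id = e.id
    WHERE e.id LIKE 'D%'
    GROUP BY e.id, e.name
    ORDER BY e.id
""").fetchall()

for row in rows:
    print(row)
```

An employee with no matching XYZ row for a given XYZ_ID simply gets NULL in that column, which matches what the original correlated subqueries would return.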

Conditional Left Join SQL

table A
----------------------------
NAME | CODE | BRANCH
----------------------------
bob | PL | B
david | AA | B
susan | PL | C
joe | AB | C
alfred | PL | B
table B
----------------------------
CODE | DESCRIPTION
----------------------------
PL | code 1
PB | code 2
PC | code 3
table C
----------------------------
CODE | DESCRIPTION
----------------------------
AA | code 4
AB | code 5
AC | code 6
Is there any way to join tables A, B and C without joining all of the tables?
select A.*, COALESCE(B.DESCRIPTION, C.DESCRIPTION) AS DESCRIPTION from A
left join B on A.CODE = B.CODE
left join C on A.CODE = C.CODE
In my real case there will be more than 10 tables to join on the same column.
So I need a conditional left join, something like this:
SELECT A* , DESCRIPTION
FROM A LEFT JOIN (
CASE
WHEN A.CODE = 'B' THEN SELECT * FROM B
WHEN A.CODE = 'C' THEN SELECT * FROM C
END
) BC ON A.CODE = BC.CODE
You cannot use CASE to implement flow control. In SQL, CASE is an expression that returns a single value.
You can instead use the following query:
select A.*,
CASE A.BRANCH
WHEN 'B' THEN B.DESCRIPTION
WHEN 'C' THEN C.DESCRIPTION
END AS DESCRIPTION
from A
left join B on A.CODE = B.CODE AND A.BRANCH = 'B'
left join C on A.CODE = C.CODE AND A.BRANCH = 'C'
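
The query above can be checked against the sample tables from the question with a short sketch like this one, using Python's sqlite3 module. It also runs the COALESCE version from the question for comparison; the lowercase table names are just a convenience here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (name TEXT, code TEXT, branch TEXT);
CREATE TABLE b (code TEXT, description TEXT);
CREATE TABLE c (code TEXT, description TEXT);
INSERT INTO a VALUES ('bob','PL','B'), ('david','AA','B'), ('susan','PL','C'),
                     ('joe','AB','C'), ('alfred','PL','B');
INSERT INTO b VALUES ('PL','code 1'), ('PB','code 2'), ('PC','code 3');
INSERT INTO c VALUES ('AA','code 4'), ('AB','code 5'), ('AC','code 6');
""")

# Branch-conditional joins: each lookup table is consulted only for its branch.
conditional = conn.execute("""
    SELECT a.name,
           CASE a.branch WHEN 'B' THEN b.description
                         WHEN 'C' THEN c.description END AS description
    FROM a
    LEFT JOIN b ON a.code = b.code AND a.branch = 'B'
    LEFT JOIN c ON a.code = c.code AND a.branch = 'C'
    ORDER BY a.name
""").fetchall()

# The question's original approach: try every lookup table, take the first hit.
coalesced = conn.execute("""
    SELECT a.name, COALESCE(b.description, c.description) AS description
    FROM a
    LEFT JOIN b ON a.code = b.code
    LEFT JOIN c ON a.code = c.code
    ORDER BY a.name
""").fetchall()

print(conditional)
print(coalesced)
```

With this particular sample data the two queries disagree for david and susan: their CODE lives in the lookup table of the other branch, so the branch-conditional version returns NULL for them. Whether BRANCH really decides which table to use is something only the real data can confirm.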
You could use this to generate queries. Then you write a PL/SQL block to loop through all these queries and execute dynamically to give you separate results.
SELECT 'SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN '
|| CASE WHEN A.BRANCH = 'B' THEN 'TABLEB B' END
|| CASE WHEN A.BRANCH = 'C' THEN 'TABLEC C' END
|| ' ON '
|| 'A.CODE = '
|| CASE WHEN A.BRANCH = 'B' THEN 'B.CODE' END
|| CASE WHEN A.BRANCH = 'C' THEN 'C.CODE' END
v_query
FROM TableA A;
Output
V_QUERY
--------------------------------------------------------------------------------
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEC C ON A.CODE = C.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEC C ON A.CODE = C.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE

Why does my query with the UNION operator return only the fields from the first SELECT?

I need to get all values from a table, plus the values that satisfy a 'not null' condition.
So I wrote two SELECT statements.
select oa.dept_id, COUNT(oa.id) quantity, sum(oa.premium) 'sum'
from Table1 oa
Left Join Table2 od On od.id = oa.dept_id
group by oa.dept_id
Union all
select oa1.dept_id, COUNT(oa1.id) quantity1, sum(oa1.premium) 'sum1'
from Table1 oa1
Left Join Table2 od1 On od1.id = oa1.dept_id
where oa1.action is not null
group by oa1.dept_id
I expect a result like this, with 70 rows:
-----------------------------------------------
| dept.id | quantity | sum | quantity1 | sum1 |
-----------------------------------------------
I got a result like this, with 130 rows:
----------------------------
| dept.id | quantity | sum |
----------------------------
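UNION ALL appends the rows of the second SELECT below those of the first and takes its column names from the first SELECT, so it cannot add columns. To get both sets of aggregates side by side, you need to join the two queries on dept_id: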
select a.dept_id, quantity, sum,quantity1,sum1
from
(
select oa.dept_id, COUNT(oa.id) quantity, sum(oa.premium) 'sum'
from Table1 oa
Left Join Table2 od On od.id = oa.dept_id
group by oa.dept_id
) a
join
(
select oa1.dept_id, COUNT(oa1.id) quantity1, sum(oa1.premium) 'sum1'
from Table1 oa1
Left Join Table2 od1 On od1.id = oa1.dept_id
where oa1.action is not null
group by oa1.dept_id
) b
on a.dept_id=b.dept_id
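
As an alternative sketch (not part of the answer above), the same five columns can be produced in a single pass with conditional aggregation. The example below runs in Python's sqlite3 module with made-up rows; it assumes Table2 contributes no columns and so leaves the LEFT JOIN to it out, and it renames the 'sum' aliases to sum_all and sum1 to stay clear of the function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, dept_id INTEGER, premium REAL, action TEXT);
INSERT INTO table1 VALUES
    (1, 10, 100.0, 'x'),
    (2, 10,  50.0, NULL),
    (3, 20,  70.0, NULL);
""")

# COUNT and SUM skip NULLs, so a CASE without ELSE restricts each aggregate
# to the rows where action IS NOT NULL.
rows = conn.execute("""
    SELECT dept_id,
           COUNT(id)                                          AS quantity,
           SUM(premium)                                       AS sum_all,
           COUNT(CASE WHEN action IS NOT NULL THEN id END)    AS quantity1,
           SUM(CASE WHEN action IS NOT NULL THEN premium END) AS sum1
    FROM table1
    GROUP BY dept_id
    ORDER BY dept_id
""").fetchall()

for row in rows:
    print(row)
```

One difference worth noting: a department with no non-null action (dept 20 here) still appears, with quantity1 = 0 and sum1 = NULL, whereas an inner join of the two subqueries would drop it.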

SQL aggregation query, grouping by entries in junction table

I have TableA in a many-to-many relationship with TableC via TableB. That is,
TableA TableB TableC
id | val fkeyA | fkeyC id | data
I wish to select sum(val) from TableA, grouping by the relationship(s) to TableC. Every entry in TableA has at least one relationship with TableC. For example,
TableA
1 | 25
2 | 30
3 | 50
TableB
1 | 1
1 | 2
2 | 1
2 | 2
2 | 3
3 | 1
3 | 2
should output
75
30
since rows 1 and 3 in TableA have the same relationships to TableC, but row 2 in TableA has a different relationship to TableC.
How can I write a SQL query for this?
SELECT
sum(tableA.val) as sumVal,
tableC.data
FROM
tableA
inner join tableB ON tableA.id = tableB.fkeyA
INNER JOIN tableC ON tableB.fkeyC = tableC.id
GROUP by tableC.data
edit
Ah ha - I now see what you're getting at. Let me try again:
SELECT
sum(val) as sumVal,
tableCGroup
FROM
(
SELECT
tableA.val,
(
SELECT cast(tableB.fkeyC as varchar) + ','
FROM tableB WHERE tableB.fKeyA = tableA.id
ORDER BY tableB.fkeyC
FOR XML PATH('')
) as tableCGroup
FROM
tableA
) tmp
GROUP BY
tableCGroup
Hm, in MySQL it could be written like this:
SELECT
SUM(val) AS sumVal
FROM
( SELECT
fkeyA
, GROUP_CONCAT(fkeyC ORDER BY fkeyC) AS grpC
FROM
TableB
GROUP BY
fkeyA
) AS g
JOIN
TableA a
ON a.id = g.fkeyA
GROUP BY
grpC
SELECT sum(a.val)
FROM tablea a
INNER JOIN tableb b ON (b.fKeyA = a.id)
GROUP BY b.fKeyC
It seems that a key_list needs to be created in order to group by it:
75 -> key list = "1 2"
30 -> key list = "1 2 3"
Because GROUP_CONCAT doesn't exist in T-SQL:
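WITH Ordered AS (
    -- number each TableA row's TableC keys so they can be appended in order
    SELECT fkeyA, fkeyC,
           ROW_NUMBER() OVER (PARTITION BY fkeyA ORDER BY fkeyC) AS rn,
           COUNT(*) OVER (PARTITION BY fkeyA) AS cnt
    FROM TableB
),
CTE ( Id, rn, cnt, key_list )
AS ( SELECT fkeyA, rn, cnt, CAST( fkeyC AS VARCHAR(8000) )
     FROM Ordered
     WHERE rn = 1
     UNION ALL
     SELECT o.fkeyA, o.rn, o.cnt,
            CAST( c.key_list + ' ' + CAST( o.fkeyC AS VARCHAR(20) ) AS VARCHAR(8000) )
     FROM CTE c
     INNER JOIN Ordered o
        ON o.fkeyA = c.Id AND o.rn = c.rn + 1 --each step appends the next key, so the recursion ends
)
SELECT sum( val )
FROM TableA
INNER JOIN CTE ON (TableA.id = CTE.Id)
WHERE CTE.rn = CTE.cnt --keep only the completed key list per row
GROUP BY CTE.key_list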
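
The key-list idea shared by these answers can be sanity-checked against the question's sample data with a small sketch. The version below uses Python's sqlite3 module and, as a simplification, builds each row's key set on the client side instead of concatenating strings in SQL; it is meant only to confirm the expected totals of 75 and 30.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (id INTEGER PRIMARY KEY, val INTEGER);
CREATE TABLE TableB (fkeyA INTEGER, fkeyC INTEGER);
INSERT INTO TableA VALUES (1, 25), (2, 30), (3, 50);
INSERT INTO TableB VALUES (1,1), (1,2), (2,1), (2,2), (2,3), (3,1), (3,2);
""")

rows = conn.execute("""
    SELECT a.id, a.val, b.fkeyC
    FROM TableA a
    JOIN TableB b ON b.fkeyA = a.id
""").fetchall()

# Collect the set of TableC keys for each TableA row (its "signature").
keys = defaultdict(set)
vals = {}
for a_id, val, fkey_c in rows:
    keys[a_id].add(fkey_c)
    vals[a_id] = val

# Sum val over TableA rows that share exactly the same set of TableC keys.
totals = defaultdict(int)
for a_id, key_set in keys.items():
    totals[frozenset(key_set)] += vals[a_id]

for key_set, total in totals.items():
    print(sorted(key_set), total)
```

Rows 1 and 3 share the key set {1, 2} and sum to 75, while row 2 has {1, 2, 3} and stays at 30, which matches the output the question asks for.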