left join not giving correct output - sql

I have two tables:
tableA
Accountid
-----------
10
11
12
tableB
Accountid | Date |
--------------------------------------------
10 | 2016-02-02 |
11 | 2016-02-02 |
11 | 2016-02-02 |
15 | 2016-02-03 |
I am expecting output like this:
Accountid | ID |
------------------------------------
10 | 10 |
11 | 11 |
12 | NULL |
I am running this query
select A.accountid,b.accountid as ID
from tableA as A
left join tableB as B on a.accountid=b.accountid
where b.date between '2016-02-02' and '2016-02-02'
but it gives me the output below, and I am not sure where I am going wrong:
Accountid | ID |
-----------------------------------
10 | 10 |
11 | 11 |
I am using MSSQL database.

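When a column from the right table of a LEFT JOIN is filtered in the WHERE clause, the NULL-extended rows produced for unmatched left rows fail that filter, so the join effectively behaves like an INNER JOIN.
To get the expected result, move the date condition into the ON clause: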
select A.accountid,b.accountid as ID
from tableA as A
left join tableB as B on a.accountid=b.accountid and
b.date between '2016-02-02' and '2016-02-02'
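To see the difference side by side, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database (an assumption made for easy testing; the question uses MSSQL, but LEFT JOIN semantics are the same here). DISTINCT is added only to collapse the duplicate tableB row for account 11.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (accountid INTEGER);
    CREATE TABLE tableB (accountid INTEGER, date TEXT);
    INSERT INTO tableA VALUES (10), (11), (12);
    INSERT INTO tableB VALUES
        (10, '2016-02-02'), (11, '2016-02-02'),
        (11, '2016-02-02'), (15, '2016-02-03');
""")

# Date filter in WHERE: the NULL-extended row for account 12 is discarded.
where_rows = conn.execute("""
    SELECT DISTINCT a.accountid, b.accountid AS id
    FROM tableA AS a
    LEFT JOIN tableB AS b ON a.accountid = b.accountid
    WHERE b.date BETWEEN '2016-02-02' AND '2016-02-02'
    ORDER BY a.accountid
""").fetchall()

# Date filter in ON: unmatched accounts survive with NULL on the right side.
on_rows = conn.execute("""
    SELECT DISTINCT a.accountid, b.accountid AS id
    FROM tableA AS a
    LEFT JOIN tableB AS b
        ON a.accountid = b.accountid
       AND b.date BETWEEN '2016-02-02' AND '2016-02-02'
    ORDER BY a.accountid
""").fetchall()

print(where_rows)  # [(10, 10), (11, 11)]
print(on_rows)     # [(10, 10), (11, 11), (12, None)]
```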

Try this:
select A.accountid,b.accountid as ID
from tableA as A left join tableB as B on a.accountid=b.accountid
where b.date is null or b.date between '2016-02-02' and '2016-02-02'
The reason is, for AccountID 12 b.date is effectively null (because there's no row in tableB). Therefore you'll only get a result for that row if you allow date to be null in the query.
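One caveat worth checking: the IS NULL variant is not always equivalent to putting the filter in the ON clause. The sketch below uses an in-memory SQLite database and adds a hypothetical account 13 (not in the question) whose only tableB row falls outside the date range. That row has a non-NULL date, so the WHERE version drops account 13, while the ON version keeps it with a NULL ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (accountid INTEGER);
    CREATE TABLE tableB (accountid INTEGER, date TEXT);
    -- Account 13 is a hypothetical addition for illustration.
    INSERT INTO tableA VALUES (10), (11), (12), (13);
    INSERT INTO tableB VALUES
        (10, '2016-02-02'), (11, '2016-02-02'), (13, '2016-02-05');
""")

# WHERE ... IS NULL OR BETWEEN: account 13 joins to an out-of-range row,
# which is neither NULL nor in range, so it is filtered out.
is_null_rows = conn.execute("""
    SELECT a.accountid, b.accountid AS id
    FROM tableA AS a
    LEFT JOIN tableB AS b ON a.accountid = b.accountid
    WHERE b.date IS NULL
       OR b.date BETWEEN '2016-02-02' AND '2016-02-02'
    ORDER BY a.accountid
""").fetchall()

# Filter in ON: account 13 has no in-range match, so it appears with NULL.
on_rows = conn.execute("""
    SELECT a.accountid, b.accountid AS id
    FROM tableA AS a
    LEFT JOIN tableB AS b
        ON a.accountid = b.accountid
       AND b.date BETWEEN '2016-02-02' AND '2016-02-02'
    ORDER BY a.accountid
""").fetchall()

print(is_null_rows)  # [(10, 10), (11, 11), (12, None)]
print(on_rows)       # [(10, 10), (11, 11), (12, None), (13, None)]
```

Which behaviour is right depends on whether accounts with only out-of-range rows should still be listed.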

If no matching row exists in B (accountid = 12), b.date is NULL, so the BETWEEN test in your WHERE clause does not evaluate to true and the row is dropped.
If you want to see the row with accountid = 12, you must move the test (b.date between '2016-02-02' and '2016-02-02') into your ON clause:
select A.accountid,b.accountid as ID from tableA as A left join tableB as B
on a.accountid=b.accountid and b.date between '2016-02-02' and '2016-02-02'

The reason is that the WHERE clause of your SELECT is applied after the LEFT JOIN.
So SQL Server first produces the rows you expect, including the row 12 | NULL,
and then your WHERE clause filters that row out of the output.
You can move the date filter into the JOIN condition, as suggested by @JaydipJ and @RémyBaron,
or filter tableB before the JOIN, like this:
select A.accountid,b.accountid as ID
from tableA as A
left join (
select *
from tableB
where date between '2016-02-02' and '2016-02-02'
) as B on a.accountid=b.accountid

Related

SQL Performance Inner Join

Let me ask you something I've been thinking about for a while. Imagine that you have two tables with data:
MAIN TABLE (A)
| ID | Date |
|:-----------|------------:|
| 1 | 01-01-1990|
| 2 | 01-01-1991|
| 3 | 01-01-1992|
| 4 | 01-01-2000|
| 5 | 01-01-2001|
| 6 | 01-01-2003|
SECONDARY TABLE (B)
| ID | Date | TOTAL |
|:-----------|------------:|--------:|
| 1 | 01-01-1990| 1 |
| 2 | 01-01-1991| 2 |
| 3 | 01-01-1992| 1 |
| 4 | 01-01-2000| 5 |
| 5 | 01-01-2001| 8 |
| 6 | 01-01-2003| 7 |
and you want to select only ID with date greater than 31-12-1999 and get the following columns: ID, Date and Total. For that we have many options but my question would be which of the following would be better in terms of performance:
OPTION 1
With main as(
select id,
date
from A
where date > '31-12-1999'
)
select main.id,
main.date,
B.total
from main inner join B on main.id = b.id
OPTION 2
With main as(
select id,
date
from A
where date > '31-12-1999'
),
secondary as (
select id,
total
from B
where date > '31-12-1999'
)
select main.id,
main.date,
secondary.total
from main inner join secondary on main.id = secondary.id
Which of both queries would be better in terms of performance? Thanks in advance!
DATE FOR BOTH TABLES MEANS THE SAME
You don't need a CTE; you can join the two tables directly:
select A.id,
A.date,
B.total
from A inner join B on A.id = b.id
where A.date > '31-12-1999'
You would need to test on your data. But there is really no need for CTEs:
select a.id, a.date, b.total
from a inner join
b
on a.id = b.id
where a.date > '1999-12-31' and b.date > '1999-12-31';
As for your specific question, the two queries are not the same, because the first is filtering on only one date and the second is filtering on two dates. You should run the query that implements the logic that you intend.

SQL Exclude Row if Specific Value Exists Within Joined Field

I am trying to write an SQL query that will allow me to exclude a record from TableA if it has at least one match against TableB.
I have written some code, as below, that almost gets me what I need -
SELECT a.ID,
a.OPEN_DT,
b.LINKCREATED,
b.RULE__ID
FROM TableA a
LEFT JOIN TableB b
ON a.ROW_WID = b.A_ROW_WID
WHERE EXTRACT(YEAR FROM a.OPEN_DT) >= '2013'
AND NOT EXISTS (SELECT *
FROM TableB
WHERE A_ROW_WID = a.ROW_WID
AND EXTRACT(YEAR FROM b.CREATED) >= '2017')
;
Table A
ROW_WID | ID | OPEN_DT
---------------------------------
1 | A | 2013-01-01
2 | B | 2014-01-01
3 | C | 2017-01-01
Table B
RULE_ID | A_ROW_WID | LINKCREATED
---------------------------------
1 | A | 2014-01-01
2 | A | 2017-01-01
3 | B | 2017-01-01
The query above would return 1 row for ROW_WID = 1, 1 row for ROW_WID = 2 and nothing for ROW_WID = 3.
I would like my query to exclude ROW_WID=1 altogether because there is one row in TableB that has the year 2017.
I hope this question is clear, but let me know if not.
-EDIT-
Expected result would look like this -
ID | OPEN_DT | LINKCREATED | RULE_ID
C | 2017-01-01 | NULL | NULL
As ID 'C' from TableA has no link in TableB.
If there were an entry in A that had any links in B prior to 2017, they would be returned. Just not any with a TableB entry >= 2017.
Your issue is that you aren't checking for the max created date in the NOT EXISTS:
SELECT a.ID,
a.OPEN_DT,
b.LINKCREATED,
b.RULE__ID
FROM TableA a
LEFT JOIN TableB b
ON a.ROW_WID = b.A_ROW_WID
WHERE EXTRACT(YEAR FROM a.OPEN_DT) >= '2013'
AND NOT EXISTS (SELECT 'NE'
FROM TableB B2
WHERE A_ROW_WID = a.ROW_WID
AND B2.LINKCREATED= (SELECT MAX(BE.LINKCREATED) FROM TableB BE WHERE B2.A_ROW_WID=BE.A_ROW_WID)
AND EXTRACT(YEAR FROM b2.CREATED) >= '2017')
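For comparison, here is a minimal sketch of the NOT EXISTS shape in which the subquery tests its own alias (b2) rather than the outer b. It runs on an in-memory SQLite database, so the year test is written as a string comparison instead of EXTRACT. The numeric A_ROW_WID values and the extra row D (with a single pre-2017 link) are assumptions added for illustration, and the column is taken to be LINKCREATED as in the sample table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ROW_WID INTEGER, ID TEXT, OPEN_DT TEXT);
    CREATE TABLE TableB (RULE_ID INTEGER, A_ROW_WID INTEGER, LINKCREATED TEXT);
    INSERT INTO TableA VALUES
        (1, 'A', '2013-01-01'),
        (2, 'B', '2014-01-01'),
        (3, 'C', '2017-01-01'),
        (4, 'D', '2015-01-01');   -- hypothetical row, not in the question
    INSERT INTO TableB VALUES
        (1, 1, '2014-01-01'),
        (2, 1, '2017-01-01'),
        (3, 2, '2017-01-01'),
        (4, 4, '2016-01-01');     -- hypothetical pre-2017 link for D
""")

# Exclude every TableA row that has at least one TableB link from 2017 on.
# The subquery filters on b2, its own alias, not on the outer joined row b.
rows = conn.execute("""
    SELECT a.ID, a.OPEN_DT, b.LINKCREATED, b.RULE_ID
    FROM TableA a
    LEFT JOIN TableB b ON a.ROW_WID = b.A_ROW_WID
    WHERE a.OPEN_DT >= '2013-01-01'
      AND NOT EXISTS (SELECT 1
                      FROM TableB b2
                      WHERE b2.A_ROW_WID = a.ROW_WID
                        AND b2.LINKCREATED >= '2017-01-01')
    ORDER BY a.ID
""").fetchall()

print(rows)
# [('C', '2017-01-01', None, None), ('D', '2015-01-01', '2016-01-01', 4)]
```

A and B are dropped because each has a 2017 link; C is kept with NULLs because it has no links at all.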
Try using not in:
SELECT a.ID,
a.OPEN_DT,
b.LINKCREATED,
b.RULE__ID
FROM TableA a
LEFT JOIN TableB b
ON a.ROW_WID = b.A_ROW_WID
WHERE EXTRACT(YEAR FROM a.OPEN_DT) >= '2013'
AND b.rule_id not in (select b2.rule_id from TableB b2 where b2.A_ROW_WID in (SELECT
b3.A_ROW_WID
FROM TableB b3
WHERE EXTRACT(YEAR FROM b3.CREATED) >= '2017'))

SQL filter rows based without using Group by

I have a query that joins 6 tables and fetches various columns based on a condition. I want to add an extra filter so that I get only those members who have count(distinct dateCaptured) > 30. I can get the list of members who satisfy this condition using GROUP BY and HAVING, but I don't want to group by all the other columns just because of this one condition. Do I need to use PARTITION BY in this case?
Sample TABLE a
+-----+------------+--------------+
| Id | Identifier | DateCaptured |
+-----+------------+--------------+
| 1 | 05548 | 2017-09-01 |
| 2 | 05548 | 2017-09-01 |
| 3 | 05548 | 2017-09-01 |
| 4 | 05548 | 2017-09-02 |
| 5 | 05548 | 2017-09-03 |
| 6 | 05548 | 2017-09-04 |
| 7 | 37348 | 2017-08-15 |
| 8 | 37348 | 2017-08-15 |
| . | | |
| . | | |
| . | | |
| 54 | 37348 | 2017-10-15 |
+-----+------------+--------------+
Query
SELECT a.value,
b.value, c.value,
d.value
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
INNER JOIN Table d on a.Id=d.Id
Assume Table a has more than 30 records for Identifier 37348. How can I get only this Identifier for the above query.
These are the patients I'm interested in for the above SELECT:
SELECT a.Identifier,count(DISTINCT DateCaptured)
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
INNER JOIN Table d on a.Id=d.Id
GROUP BY Identifier
HAVING count(DISTINCT DateCaptured)>30
WITH cte as (
SELECT a.Identifier
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
INNER JOIN Table d on a.Id=d.Id
GROUP BY Identifier
HAVING count(DISTINCT DateCaptured) > 30
)
SELECT a.value,
b.value, c.value,
d.value
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
INNER JOIN Table d on a.Id=d.Id
INNER JOIN cte on cte.Identifier = a.Identifier
SELECT a.value,
b.value, c.value,
d.value
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and s.Invalid=0
INNER JOIN Table d on a.Id=d.Id
WHERE a.Identifier IN (SELECT a1.Identifier
FROM Table a1
GROUP BY a1.Identifier HAVING count(DISTINCT a1.DateCaptured)>30)
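The IN-subquery approach can be checked on a small scale. This sketch uses an in-memory SQLite database, a single table named a, and a threshold of 2 instead of 30; the rows and the threshold are assumptions chosen so the result is easy to verify by hand. It also shows why COUNT(DISTINCT ...) matters: identifier 37348 has three rows but only one distinct date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (Id INTEGER, Identifier TEXT, DateCaptured TEXT);
    INSERT INTO a VALUES
        (1, '05548', '2017-09-01'), (2, '05548', '2017-09-01'),
        (3, '05548', '2017-09-01'), (4, '05548', '2017-09-02'),
        (5, '05548', '2017-09-03'), (6, '05548', '2017-09-04'),
        (7, '37348', '2017-08-15'), (8, '37348', '2017-08-15'),
        (9, '37348', '2017-08-15');
""")

# Keep detail rows only for identifiers with more than 2 distinct dates.
# The grouping happens inside the subquery, so the outer query needs no GROUP BY.
distinct_ids = [r[0] for r in conn.execute("""
    SELECT a.Id
    FROM a
    WHERE a.Identifier IN (SELECT a1.Identifier
                           FROM a AS a1
                           GROUP BY a1.Identifier
                           HAVING COUNT(DISTINCT a1.DateCaptured) > 2)
    ORDER BY a.Id
""")]

# Same shape with COUNT(*): 37348 now qualifies because it has 3 rows.
plain_ids = [r[0] for r in conn.execute("""
    SELECT a.Id
    FROM a
    WHERE a.Identifier IN (SELECT a1.Identifier
                           FROM a AS a1
                           GROUP BY a1.Identifier
                           HAVING COUNT(*) > 2)
    ORDER BY a.Id
""")]

print(distinct_ids)  # [1, 2, 3, 4, 5, 6]
print(plain_ids)     # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```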
If the multiple rows really are in tableA, then you can do:
SELECT a.value, b.value, c.value, d.value
FROM (SELECT a.*, COUNT(*) OVER (PARTITION BY Identifier) as cnt
FROM a
) a INNER JOIN
b
ON a.Id = b.id INNER JOIN
c
ON a.Id = c.Id AND s.Invalid = 0 INNER JOIN
d
ON a.Id = d.Id
WHERE a.cnt > 30;
Note: If you still need count(distinct) you can do:
SELECT a.value, b.value, c.value, d.value
FROM (SELECT a.*, SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY Identifier) as cnt
FROM (SELECT a.*, ROW_NUMBER() OVER (PARTITION BY Identifier, DateCaptured ORDER BY Id) as seqnum
FROM a
) a
) a INNER JOIN
b
ON a.Id = b.id INNER JOIN
c
ON a.Id = c.Id AND s.Invalid = 0 INNER JOIN
d
ON a.Id = d.Id
WHERE a.cnt > 30;

Joining Table A and B to get elements of both

I have two tables:
Table 'bookings':
id | date | hours
--------------------------
1 | 06/01/2016 | 2
1 | 06/02/2016 | 1
2 | 06/03/2016 | 2
3 | 06/03/2016 | 4
Table 'lookupCalendar':
date
-----
06/01/2016
06/02/2016
06/03/2016
I want to join them together so that I have a date for each booking so that the results look like this:
Table 'results':
id | date | hours
--------------------------
1 | 06/01/2016 | 2
1 | 06/02/2016 | 1
1 | 06/03/2016 | 0 <-- Added by query
2 | 06/01/2016 | 0 <-- Added by query
2 | 06/02/2016 | 0 <-- Added by query
2 | 06/03/2016 | 2
3 | 06/01/2016 | 0 <-- Added by query
3 | 06/02/2016 | 0 <-- Added by query
3 | 06/03/2016 | 4
I have tried doing a cross-apply, but that doesn't get me there, neither does a full join. The FULL JOIN just gives me nulls in the id column and the cross-apply gives me too much data.
Is there a query that can give me the results table above?
More Information
It might be beneficial to note that I am doing this so that I can calculate an average hours booked over a period of time, not just the number of records in the table.
Ideally, I'd be able to do
SELECT AVG(hours) AS my_average, id
FROM bookings
GROUP BY id
But AVG would divide by the number of booking records rather than the number of days, so I want to cross apply the bookings with the dates first. Then I think I can just run the query above against the results table.
select i.id, c.date, coalesce(b.hours, 0) as hours
from lookupCalendar c
cross join (select distinct id from bookings) i
left join bookings b
on b.id = i.id
and b.date = c.date
order by i.id, c.date
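As a quick check, the query above can be run against the sample data in an in-memory SQLite database (an assumption for testing; the syntax used here is also valid in SQL Server). The second query, which averages over the filled-in rows, is an illustrative addition that addresses the stated goal of a per-day average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookings (id INTEGER, date TEXT, hours INTEGER);
    CREATE TABLE lookupCalendar (date TEXT);
    INSERT INTO bookings VALUES
        (1, '06/01/2016', 2), (1, '06/02/2016', 1),
        (2, '06/03/2016', 2), (3, '06/03/2016', 4);
    INSERT INTO lookupCalendar VALUES
        ('06/01/2016'), ('06/02/2016'), ('06/03/2016');
""")

# Every (id, date) combination, with 0 hours where no booking exists.
filled_sql = """
    SELECT i.id, c.date, COALESCE(b.hours, 0) AS hours
    FROM lookupCalendar c
    CROSS JOIN (SELECT DISTINCT id FROM bookings) i
    LEFT JOIN bookings b
        ON b.id = i.id
       AND b.date = c.date
"""

filled = conn.execute(filled_sql + " ORDER BY i.id, c.date").fetchall()

# Average hours per calendar day for each id, counting empty days as 0.
averages = conn.execute(
    f"SELECT id, AVG(hours) FROM ({filled_sql}) GROUP BY id ORDER BY id"
).fetchall()

for row in filled:
    print(row)
print(averages)
```

With three calendar days, id 1 averages 1.0 hours per day, id 2 averages 2/3, and id 3 averages 4/3.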
Try this:
select c.date, b.id, isnull(b.hours, 0)
from lookupCalendar c
left join bookings b on b.date = c.date
LookupCalendar is your main table because you want the bookings against each date, irrespective of whether there was a booking on that date or not, so a left join is required.
I am not sure whether you need to include b.id to solve your actual problem, though. Wouldn't you just want the total number of hours booked against each date, like this, to then calculate the average?
select c.date, sum(isnull(b.hours, 0))
from lookupCalendar c
left join bookings b on b.date = c.date
group by c.date
You can try building all the combinations of IDs and dates and then left joining the data:
WITH Booking AS (SELECT *
FROM (VALUES
( 1 , '06/01/2016', 2 )
, ( 1 , '06/02/2016', 1 )
, ( 2 , '06/03/2016', 2 )
, ( 3 , '06/03/2016', 4 )
) x (id, date, hours)
)
, lookupid AS (
SELECT DISTINCT id FROM Booking
)
, lookupCalender AS (
SELECT DISTINCT date FROM Booking
)
SELECT ID.id, Cal.Date, ISNULL(B.Hours,0) AS hours
FROM lookupid id
INNER JOIN lookupCalender Cal
ON 1 = 1
LEFT JOIN Booking B
ON id.id = B.id
AND Cal.date = B.Date
ORDER BY ID.id, Cal.Date

SQL Join to Get Row with Maximum Value from Right table

I am having a problem with a SQL join (Oracle / MS SQL).
I have two tables
A
ID | B_ID
---|------
1 | 1
1 | 4
2 | 3
2 | 2
----------
B
B_ID | B_VA| B_VB
-------|--------|-------
1 | 1 | a
2 | 2 | b
3 | 5 | c
4 | 2 | d
-----------------------
From these two tables I need A.ID, B.B_ID, B.B_VA (MAX), B.B_VB (with max B.B_VA)
So result table would be like
ID | B_ID | B_VA| B_VB
-------|--------|--------|-------
1 | 4 | 2 | d
2 | 3 | 5 | c
I tried some joins without success. Can anyone help me with a query to get the result I want?
Thank you
Your logic as described doesn't quite correspond to the data. For instance, b_va is numeric, but the column in the output is a string.
Perhaps you want this: aggregate the data in a to get the maximum b_id value per id, then join to b to get the corresponding b_vb column. That, at least, conforms to your desired output:
select a.id, a.b_id, b1.b_vb as b_va, b2.b_vb
from (select id, max(b_id) as b_id
from a
group by id
) a join
b b1
on a.id = b1.b_id join
b b2
on a.b_id = b2.b_id;
EDIT:
For the corrected data, I think this is what you want:
select a.id, a.b_id, max(b1.b_va) as b_va, b2.b_vb
from (select id, max(b_id) as b_id
from a
group by id
) a join
b b1
on a.id = b1.b_id join
b b2
on a.b_id = b2.b_id
group by a.id, a.b_id, b2.b_vb;
Try this
SELECT X.ID, Y.B_ID, X.B_VA, Y.B_VB
FROM (SELECT A.ID, MAX(B_VA) AS B_VA
FROM A INNER JOIN B ON A.B_ID = B.B_ID
GROUP BY A.ID) AS X INNER JOIN
A AS Z ON X.ID = Z.ID INNER JOIN
B AS Y ON Z.B_ID=Y.B_ID AND X.B_VA=Y.B_VA
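The query above can be checked against the sample data with an in-memory SQLite database (an assumption for testing; the question targets Oracle / MS SQL). The ORDER BY is added here only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, B_ID INTEGER);
    CREATE TABLE B (B_ID INTEGER, B_VA INTEGER, B_VB TEXT);
    INSERT INTO A VALUES (1, 1), (1, 4), (2, 3), (2, 2);
    INSERT INTO B VALUES (1, 1, 'a'), (2, 2, 'b'), (3, 5, 'c'), (4, 2, 'd');
""")

# X holds the maximum B_VA per A.ID; joining back through A and B
# recovers the B row that carries that maximum.
rows = conn.execute("""
    SELECT X.ID, Y.B_ID, X.B_VA, Y.B_VB
    FROM (SELECT A.ID, MAX(B_VA) AS B_VA
          FROM A INNER JOIN B ON A.B_ID = B.B_ID
          GROUP BY A.ID) AS X
    INNER JOIN A AS Z ON X.ID = Z.ID
    INNER JOIN B AS Y ON Z.B_ID = Y.B_ID AND X.B_VA = Y.B_VA
    ORDER BY X.ID
""").fetchall()

print(rows)  # [(1, 4, 2, 'd'), (2, 3, 5, 'c')]
```

Note that if two B rows tie for the maximum B_VA within one ID, this shape returns both of them.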