I have two tables containing employee data, TableA and TableB. I'm joining them on two IDs (userID and mID, a month-based ID) using a LEFT OUTER JOIN, which returns NULL in about 20% of the results because TableB is incomplete. If possible, I want a query that detects when the join finds no match and subtracts one month from mID, so the JOIN can fill at least some of the missing data with older data.
I don't know if it's a way too complex query but I had in mind something like:
SELECT T1.*, T2.*
FROM TABLEA
LEFT OUTER JOIN TABLEB
ON T2.USERID = T1.USERID AND (CASE WHEN (T2.HID = T1.HID) = NULL THEN (T2.HID = T1.HID-1))
Appreciate any help.
I think this is what you really want to do. The downside is that there is a bit more logic and it has to be duplicated per column, but it gives you fine-grained control.
DECLARE @TableA TABLE (USERID INT, HID INT, SomeData VARCHAR(20))
DECLARE @TableB TABLE (USERID INT, HID INT, SomeData VARCHAR(20))
INSERT INTO @TableA(USERID,HID,SomeData) SELECT 1,5,'Now'
INSERT INTO @TableA(USERID,HID,SomeData) SELECT 2,5,NULL
INSERT INTO @TableA(USERID,HID,SomeData) SELECT 3,5,NULL
INSERT INTO @TableB(USERID,HID,SomeData) SELECT 2,4,'Now-1'
INSERT INTO @TableB(USERID,HID,SomeData) SELECT 2,3,'Now-2'
INSERT INTO @TableB(USERID,HID,SomeData) SELECT 3,4,'Now-1'
SELECT
    T1.USERID,
    T1.HID AS [Current HID],
    CASE
        WHEN T1.SomeData IS NOT NULL THEN T1.SomeData
        WHEN T2.USERID IS NOT NULL THEN T2.SomeData
        WHEN T3.USERID IS NOT NULL THEN T3.SomeData
        ELSE T1.SomeData
    END AS [Most Recent SomeData]
FROM @TableA T1
LEFT JOIN @TableB T2 ON T2.USERID = T1.USERID AND T2.HID = T1.HID
LEFT JOIN @TableB T3 ON T3.USERID = T1.USERID AND T3.HID = T1.HID-1
You were on the right track with the CASE expression, but it was a little out of order. Warning: I haven't tested this, but I believe I ran into something similar in the past.
SELECT T1.*
,T2.*
FROM TABLEA t1
LEFT JOIN TABLEB t2 ON T2.USERID = T1.USERID
AND T2.HID = CASE
WHEN T2.HID = T1.HID then t1.hid
else T1.HID - 1
end
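To see how the two approaches above behave, here is a minimal sketch that runs them in SQLite through Python's sqlite3 module (an assumption made only so the example is self-contained; the thread targets SQL Server). The sample rows are made up. It suggests that the CASE-in-ON join returns both the current and the previous month when both exist, while two LEFT JOINs combined with COALESCE return one row per user.

```python
import sqlite3

# In-memory SQLite database with made-up sample rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (USERID INT, HID INT);
    CREATE TABLE TableB (USERID INT, HID INT, SomeData TEXT);
    INSERT INTO TableA VALUES (1, 5), (2, 5), (3, 5);
    INSERT INTO TableB VALUES (1, 5, 'Now'), (1, 4, 'Now-1'), (2, 4, 'Now-1');
""")

# CASE inside the ON clause: a TableB row matches if its HID is the
# current month OR the previous month, so user 1 gets two rows.
case_rows = conn.execute("""
    SELECT t1.USERID, t2.HID, t2.SomeData
    FROM TableA t1
    LEFT JOIN TableB t2
      ON t2.USERID = t1.USERID
     AND t2.HID = CASE WHEN t2.HID = t1.HID THEN t1.HID ELSE t1.HID - 1 END
    ORDER BY t1.USERID, t2.HID DESC
""").fetchall()

# Two LEFT JOINs plus COALESCE: prefer the current month, fall back to
# the previous month, and always return one row per TableA row.
fallback_rows = conn.execute("""
    SELECT t1.USERID, COALESCE(cur.SomeData, prev.SomeData)
    FROM TableA t1
    LEFT JOIN TableB cur  ON cur.USERID  = t1.USERID AND cur.HID  = t1.HID
    LEFT JOIN TableB prev ON prev.USERID = t1.USERID AND prev.HID = t1.HID - 1
    ORDER BY t1.USERID
""").fetchall()

print(case_rows)
print(fallback_rows)
```

If one row per user is required, the second form is probably the safer starting point.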
Initially, I have a query like below, doing a join on 1=1. (It's simply doing a cross join, which selects all rows from the first table and all rows from the second table and shows as a cartesian product, i.e. with all possibilities.)
SELECT * FROM Table1 t1
JOIN Table2 t2 ON 1=1
Problem: optimize this query so that it shows only the records for a particular ID, and if we don't have an ID (or the ID is NULL) it shows the same result as before (1=1). So I wrote the script below.
Declare @T2id as int;
Set @T2id = 123;
SELECT * FROM Table1 t1
JOIN Table2 t2 ON
    -- left side of join on statement
    CASE
        WHEN @T2id Is NULL
        THEN 1
        ELSE
        t2.Id
    END
    =
    -- right side of join on statement
    CASE
        WHEN @T2id Is NULL
        THEN 1
        ELSE
        @T2id
    END
Can anyone confirm whether this is good, or is there a better approach?
I think your way of presenting a cross-join is something I haven't seen before.
My view is it's simpler to read and understand if you just:
SELECT *
FROM Table1 t1, Table2 t2
As for the question, assuming SQL Server (you didn't tag the RDBMS, but I guess from your variable declaration) you might consider:
IF @T2id IS NULL
SELECT *
FROM Table1 t1, Table2 t2;
ELSE
SELECT *
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.id = t2.id
WHERE t2.id = @T2id;
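Another pattern sometimes used for an optional filter is a single query with "parameter IS NULL OR column = parameter" in the WHERE clause. The sketch below tries it in SQLite via Python's sqlite3 module (an assumption for the sake of a runnable example, not SQL Server). The helper name pairs and the sample IDs are hypothetical. Unlike the IF/ELSE above, it keeps the cross join from the question and only filters Table2.

```python
import sqlite3

# In-memory SQLite database with made-up rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INT);
    CREATE TABLE Table2 (Id INT);
    INSERT INTO Table1 VALUES (1), (2);
    INSERT INTO Table2 VALUES (123), (456);
""")

def pairs(t2id):
    """Cross join Table1 and Table2; filter on Table2.Id only when t2id is given."""
    sql = """
        SELECT t1.Id, t2.Id
        FROM Table1 t1
        CROSS JOIN Table2 t2
        WHERE :t2id IS NULL OR t2.Id = :t2id
        ORDER BY t1.Id, t2.Id
    """
    return conn.execute(sql, {"t2id": t2id}).fetchall()

print(pairs(None))  # no filter: full cartesian product
print(pairs(123))   # only combinations with Table2.Id = 123
```

Whether this performs well on SQL Server depends on the plan the optimizer picks, so it is worth comparing against the IF/ELSE version on real data.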
I have two tables like the following:
Table1
Id Table1_Col
1 A
2 B
3 C
4 D
5 E
Table2
Id Table1_Col Table2_Col
1 A Test
I want the count of (Table1_Col) in Table2 and I need query for the following output:
Expected Output
Table1_Col Count_Table2_Col
A 1
B 0
C 0
D 0
E 0
What I have tried so far:
select Table1_Col,Count(Table2_Col) from table1 t1
Left outer join table2 t2 on t1.Table1_Col = t2.Table1_Col
Please provide me a proper solution for this.
You need a GROUP BY when using aggregate functions. Also, Table1_Col exists in both tables, so qualify the columns with the proper table alias.
The query below will return your expected result. Please find the demo too.
select T1.Table1_Col, Count(T2.Table2_Col) AS Table2_Col
from table1 t1
Left outer join table2 t2 on t1.Table1_Col = t2.Table1_Col
GROUP BY T1.Table1_Col
Demo on db<>fiddle
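As a quick check of the query above, this sketch loads the sample data from the question into SQLite through Python's sqlite3 module (an assumption, chosen so the example runs anywhere). It also compares COUNT(t2.Table2_Col) with COUNT(*), since the latter counts the NULL-extended rows a left join produces.

```python
import sqlite3

# Sample data from the question, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INT, Table1_Col TEXT);
    CREATE TABLE Table2 (Id INT, Table1_Col TEXT, Table2_Col TEXT);
    INSERT INTO Table1 VALUES (1,'A'), (2,'B'), (3,'C'), (4,'D'), (5,'E');
    INSERT INTO Table2 VALUES (1,'A','Test');
""")

# COUNT(column) ignores NULLs, so unmatched Table1 rows count as 0.
counts = conn.execute("""
    SELECT t1.Table1_Col, COUNT(t2.Table2_Col)
    FROM Table1 t1
    LEFT OUTER JOIN Table2 t2 ON t1.Table1_Col = t2.Table1_Col
    GROUP BY t1.Table1_Col
    ORDER BY t1.Table1_Col
""").fetchall()

# COUNT(*) counts rows, including the NULL-extended ones, so every group is 1.
star_counts = conn.execute("""
    SELECT t1.Table1_Col, COUNT(*)
    FROM Table1 t1
    LEFT OUTER JOIN Table2 t2 ON t1.Table1_Col = t2.Table1_Col
    GROUP BY t1.Table1_Col
    ORDER BY t1.Table1_Col
""").fetchall()

print(counts)
print(star_counts)
```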
UPDATE: As per the comment in the post, based on your fiddle, the condition t3.visitno=1 should be in the LEFT OUTER JOIN and not in the WHERE clause, so the following query will work:
select t3.pvisitno, t1.DocName, count(t2.vdocid) as [count]
from Document_type t1
left outer join visitdocs t2 on t2.DocId = t1.DocId
left outer join visittbl t3 on t3.visitno = t2.visitno and t3.visitno=1
group by t3.pvisitno,t1.DocName
order by count(t2.vdocid) desc
db<>fiddle demo for the revised fiddle
Try this query:
select t1.Table1_Col,
sum(case when t2.Table2_Col is null then 0 else 1 end) Count_Table2_Col
from Table1 t1
left join Table2 t2 on t1.Table1_Col = t2.Table1_Col
group by t1.Table1_Col
You can try this:
Declare @t table ( id int ,col varchar(50))
insert into @t values (1,'A')
insert into @t values (2,'B')
insert into @t values (3,'C')
Declare @t1 table ( id int ,col varchar(50),col2 varchar(50))
insert into @t1 values (1,'A','TEST')
select t.col,count(t1.id) countT2 from @t t left join @t1 t1
on t.id=t1.id
group by t.col
Here's another option:
select t1.Table1_Col, coalesce(x.cnt, 0) cnt
from table1 t1
left outer join (select Table1_Col, count(*) cnt from table2 group by Table1_Col) x
on x.Table1_Col = t1.Table1_Col;
The idea here is to create an inline view of table2 with its counts and then left join that with the original table.
The "coalesce" is necessary because the inline view will only have records for the rows in table2, so any gaps would be "null" in the query, while you specified you want "0".
I have a small query below. #t1 and #t2 are two small tables. I am trying to do a simple left join between these tables, and the output is not what I expect.
query:
create table #t1 (cid int, program varchar(20), PP varchar(20), Startdate date, enddate date,codeset varchar(20),visitID int)
insert into #t1
values
(1001,'P1','ORD','2018-09-27','2018-09-28','OL',150),
(1001,'P2','ORD','2018-09-29',NULL,'IR',151)
create table #t2 (cid int,visitID int, answer varchar(20))
insert into #t2
values
(1001,150,'Credited')
select t1.cid, t1.Startdate, t1.Enddate,t2.answer
from #t1 t1
left join #t2 t2 on t1.cid = t2.cid
drop table #t1, #t2
The output is:
By the logic of a left join, all the records from the left table and only the matching records from the right table should show up. Why do I see 'Credited' in the second row when no such record exists in #t1?
desired output:
I'm missing something silly and unable to figure out. Any help?!
You are seeing the expected behavior. You are joining on CID. The single record in #t2 has a CID value of 1001. That matches both records in #t1 since both records in #t1 have value of 1001; thus, you have two rows in your results with a value of Credited for the column answer.
You apparently want to join on both cid and visitid.
SELECT t1.cid,
t1.startdate,
t1.enddate,
t2.answer
FROM #t1 t1
LEFT JOIN #t2 t2
ON t1.cid = t2.cid
AND t1.visitid = t2.visitid;
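The difference between the two join conditions can be reproduced with a small sketch. It uses SQLite via Python's sqlite3 module and ordinary tables named t1 and t2 in place of the temp tables (both are assumptions made for portability), with the rows from the question.

```python
import sqlite3

# Rows from the question, loaded into ordinary tables in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (cid INT, program TEXT, Startdate TEXT, enddate TEXT, visitID INT);
    CREATE TABLE t2 (cid INT, visitID INT, answer TEXT);
    INSERT INTO t1 VALUES
        (1001, 'P1', '2018-09-27', '2018-09-28', 150),
        (1001, 'P2', '2018-09-29', NULL, 151);
    INSERT INTO t2 VALUES (1001, 150, 'Credited');
""")

# Joining on cid alone: the single t2 row matches both t1 rows.
by_cid = [row[0] for row in conn.execute("""
    SELECT t2.answer
    FROM t1
    LEFT JOIN t2 ON t1.cid = t2.cid
    ORDER BY t1.visitID
""")]

# Joining on cid and visitID: only visit 150 finds a match.
by_cid_and_visit = [row[0] for row in conn.execute("""
    SELECT t2.answer
    FROM t1
    LEFT JOIN t2 ON t1.cid = t2.cid AND t1.visitID = t2.visitID
    ORDER BY t1.visitID
""")]

print(by_cid)
print(by_cid_and_visit)
```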
I don't think you can get that output with the data and join condition you have. You are joining #t1 to #t2 by cid, and both records in #t1 have the same cid, which means the one record in #t2 with cid 1001 will be joined to both records in #t1.
You can get that output if you change the cid of the second row in #t1 to 1002, but I don't know whether that is correct for your task.
If you want to only show Credited where there is an EndDate specified, you need to add to your join condition.
select t1.cid, t1.Startdate, t1.Enddate,t2.answer
from #t1 t1
left join #t2 t2 on t1.cid = t2.cid AND t1.EndDate IS NOT NULL
If it's the visitID and not the EndDate that causes this, then use this query:
select t1.cid, t1.Startdate, t1.Enddate,t2.answer
from #t1 t1
left join #t2 t2 on t1.cid = t2.cid AND t1.visitId = 150
We don't have enough info right now to really know what your logic requires but it will likely look something like the above.
I have a table t1. It has columns [id] and [id2].
Select count(*) from t1 where id=1;
returns 31,189 records
Select count(*) from t1 where id=2;
returns 31,173 records
I want to know the records where id2 is in id=1 but not in id=2.
So, I use the following:
Select * from t1 a left join t1 b on a.id2=b.id2
Where a.id=2 And b.id=1
And b.id2 Is Null;
It returns zero records.
Using an inner join to see how many records have id2 in common, I do...
Select * from t1 a inner join t1 b on a.id2=b.id2
Where a.id=2 And b.id=1;
And that returns 31,060. So where are the extra records in my first query that don't match?
I am sure I must be missing something obvious.
Sample Data
id id2
1 101
1 102
1 103
2 101
2 102
My expected result is the record with '103' in it, the 'id2' that is not shared.
Thanks for any help.
Jeff
You are attempting to do what is generally called an exclude join. This involves doing a LEFT JOIN between two tables, then using a WHERE clause to only select rows where the right table is null, i.e. there was no record to join. In this way, you select everything from the left table except what exists in the right table.
With this data, it would look something like this:
SELECT
t1.id,
t1.id2
FROM test_table t1
LEFT JOIN
(SELECT
id,
id2
FROM test_table
WHERE id = 2) t2
ON t2.id2 = t1.id2
WHERE t1.id = 1
AND t2.id IS NULL --This is what makes the exclude join happen
And here is a SQLFiddle demonstrating this in MySQL 5.7 with the sample data you provided.
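For a version that runs without a fiddle, here is a sketch in SQLite via Python's sqlite3 module (an assumption; the question appears to use another engine) with the sample data from the question. It contrasts filtering the right-hand side in the WHERE clause, which returns nothing, with filtering it in the ON clause, which returns the unshared id2.

```python
import sqlite3

# Sample data from the question in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INT, id2 INT);
    INSERT INTO t1 VALUES (1, 101), (1, 102), (1, 103), (2, 101), (2, 102);
""")

# Right-side filter in WHERE: b.id = 2 can never be true on a row where
# b.id2 IS NULL, so the two conditions cancel out and nothing is returned.
where_filter = conn.execute("""
    SELECT a.id2
    FROM t1 a
    LEFT JOIN t1 b ON a.id2 = b.id2
    WHERE a.id = 1 AND b.id = 2 AND b.id2 IS NULL
""").fetchall()

# Right-side filter in ON: unmatched rows survive with NULLs in b,
# and the IS NULL test then picks exactly those rows.
on_filter = conn.execute("""
    SELECT a.id2
    FROM t1 a
    LEFT JOIN t1 b ON a.id2 = b.id2 AND b.id = 2
    WHERE a.id = 1 AND b.id2 IS NULL
""").fetchall()

print(where_filter)
print(on_filter)
```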
I think Access may effectively turn the left join into an inner join when you add a WHERE clause that filters on the right-hand table (I know SQL Server does this), because the unmatched rows have NULL in b.id and fail the b.id=1 test. If you do the filtering in derived tables it should work:
select
a.*
from
(select * from t1 where id = 1) a
left join
(select * from t1 where id = 2) b
on a.id2 = b.id2
where b.id2 is null
I have a stored procedure that joins numerous tables and selects fields from them. One of the tables is a temporary table.
SELECT
a.Field1,
a.Field2,
b.Field3,
b.Field4,
c.Field5
FROM table1 a
LEFT JOIN #table2 b ON a.Field1 = b.Field1
INNER JOIN table3 c ON a.Field1 = c.Field1
The above takes 10+ minutes, however if I comment out the two b fields from the select while leaving the join in place it runs in just seconds.
I have pulled this out of the procedure to simplify it and I see the same behavior. Also, the execution plans are almost identical.
Any help is appreciated.
How many rows are in the temp table, and is "Field1" in the temp table a primary key?
If the right table of a left join is joined on its primary key (or possibly a unique key) and you reference no columns from it, SQL Server can avoid accessing the temp table at all, since the presence or absence of a joining row has no impact on the final result:
Example. Table setup:
create table T1 (
ID int not null primary key,
Col1 varchar(10) not null
)
go
insert into T1 (ID,Col1)
select 1,'a' union all
select 2,'b' union all
select 3,'c'
go
create table #t2 (
ID int not null primary key,
Col2 varchar(10) not null
)
go
insert into #t2 (ID,Col2)
select 1,'d' union all
select 2,'e' union all
select 4,'f'
go
create table #t3 (
ID int not null,
Col3 varchar(10) not null
)
go
insert into #t3 (ID,Col3)
select 1,'d' union all
select 2,'e' union all
select 1,'f'
And the queries:
select T1.ID,T1.Col1 from T1 left join #t2 t2 on T1.ID = t2.ID
select T1.ID,T1.Col1,t2.Col2 from T1 left join #t2 t2 on T1.ID = t2.ID
select T1.ID,T1.Col1 from T1 left join #t3 t3 on T1.ID = t3.ID
select T1.ID,T1.Col1,t3.Col3 from T1 left join #t3 t3 on T1.ID = t3.ID
In all but the first query, the join happens as expected. But because the presence or absence of rows in #t2 can't affect the final result for the first query, it avoids performing the join entirely.
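The reasoning about unique keys can be checked at the result level with a small sketch. It uses SQLite via Python's sqlite3 module and ordinary tables t2 and t3 instead of temp tables (both assumptions), and it only compares the rows returned; it does not inspect SQL Server's query plan. A left join to a unique key leaves the left table's rows unchanged, while a left join to a non-unique column can multiply them, which is why the join can only be skipped in the first case.

```python
import sqlite3

# Same setup as above, with plain tables t2 and t3 standing in for #t2 and #t3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INT NOT NULL PRIMARY KEY, Col1 TEXT NOT NULL);
    CREATE TABLE t2 (ID INT NOT NULL PRIMARY KEY, Col2 TEXT NOT NULL);
    CREATE TABLE t3 (ID INT NOT NULL, Col3 TEXT NOT NULL);
    INSERT INTO T1 VALUES (1,'a'), (2,'b'), (3,'c');
    INSERT INTO t2 VALUES (1,'d'), (2,'e'), (4,'f');
    INSERT INTO t3 VALUES (1,'d'), (2,'e'), (1,'f');
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

plain = ids("SELECT ID FROM T1 ORDER BY ID")

# Unique right side: each T1 row matches at most one t2 row.
unique_join = ids(
    "SELECT T1.ID FROM T1 LEFT JOIN t2 ON T1.ID = t2.ID ORDER BY T1.ID")

# Non-unique right side: T1.ID = 1 matches two t3 rows and is duplicated.
non_unique_join = ids(
    "SELECT T1.ID FROM T1 LEFT JOIN t3 ON T1.ID = t3.ID ORDER BY T1.ID")

print(plain, unique_join, non_unique_join)
```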
But if it's not something like that (and I'd expect it to be an obvious difference in the query plans), I'm a bit stumped.
Have you tried inverting the joins, so that the inner join comes first?
SELECT
a.Field1,
a.Field2,
b.Field3,
b.Field4,
c.Field5
FROM table1 a
INNER JOIN table3 c ON a.Field1 = c.Field1
LEFT JOIN #table2 b ON a.Field1 = b.Field1
I would try adding an index with included columns to #table2 and see if it helps:
CREATE NONCLUSTERED INDEX IX_table2
ON #table2 (Field1)
INCLUDE (Field3, Field4);
How about running the query in two parts. Make the first part as restrictive as possible and then only outer join on the filtered set.
SELECT a.Field1,
a.Field2,
c.Field5
INTO #t
FROM table1 a
INNER JOIN table3 c ON a.Field1 = c.Field1
SELECT t.Field1,
t.field2,
b.field3,
b.field4,
t.field5
FROM #t t
LEFT OUTER JOIN #table2 b ON t.Field1 = b.Field1
select * into #temp from table1
select * into #temp1 from table2
select * into #temp2 from table3
SELECT
a.Field1,
a.Field2,
b.Field3,
b.Field4,
c.Field5
FROM #temp a
LEFT JOIN #temp1 b ON a.Field1 = b.Field1
INNER JOIN #temp2 c ON a.Field1 = c.Field1
if(Object_Id('TempDB..#temp') Is Not Null)
Begin
Drop table #temp
End
if(Object_Id('TempDB..#temp1') Is Not Null)
Begin
Drop table #temp1
End
if(Object_Id('TempDB..#temp2') Is Not Null)
Begin
Drop table #temp2
End