Waterfall join conditions - sql

I have two tables similar to:
Table 1 --unique ID's
ID Date
1 3/8/2017
2 3/8/2017
3 3/8/2017
Table 2
ID Date SourceID
1 3/8/2017 1
1 3/8/2017 2
1 3/8/2017 3
2 3/8/2017 2
3 3/8/2017 1
3 3/8/2017 3
And I want to write a query that has a result like:
Result
ID SourceID
1 2
2 2
3 1
Where the source ID ordering should be 2, 1, 3
I have:
select Table1.ID
, COALESCE(Join1.SourceID, Join2.SourceID, Join3.SourceID) as SourceID
from Table1
left outer join Table2 Join1
on Table1.date = Join1.date
and Table1.ID = Join1.ID
and Join1.SourceID = 2
left outer join Table2 Join2
on Table1.date = Join2.date
and Table1.ID = Join2.ID
and Join2.SourceID = 1
and Join1.SourceID is null
left outer join Table2 Join3
on Table1.date = Join3.date
and Table1.ID = Join3.ID
and Join3.SourceID = 3
and Join1.SourceID is null
and Join2.SourceID is null
But this currently just keeps the records where sourceid = 2 and does not add in the other sourceid's.
Thanks in advance for any help. Let me know if you need any clarification. I am using SQL Server. I only need a small, fixed number of sources, so I am avoiding a cursor.

This is a prioritization query. I would do it using outer apply:
select t1.*, t2.sourceId
from table1 t1 outer apply
(select top 1 t2.*
from table2 t2
where t2.id = t1.id and t2.date = t1.date
order by (case t2.sourceid when 2 then 1 when 1 then 2 when 3 then 3 end)
) t2;
Note: For readability, you can simplify the order by to:
order by charindex(cast(t2.sourceId as varchar(255)), '2,1,3')
If you are uncomfortable with outer apply, you can do the same thing with a single join:
select t1.*, t2.sourceId
from table1 t1 left join
(select t2.*,
row_number() over (partition by id, date
order by (case t2.sourceid when 2 then 1 when 1 then 2 when 3 then 3 end)
) as seqnum
from table2 t2
) t2
on t2.id = t1.id and t2.date = t1.date and t2.seqnum = 1;
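As a quick check of the prioritization idea, here is a minimal sketch that runs the question's sample data through SQLite from Python. SQLite has no OUTER APPLY, so a correlated subquery with LIMIT 1 stands in for the TOP 1 lookup; the ISO date strings and lower-case column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, date TEXT);
CREATE TABLE table2 (id INTEGER, date TEXT, sourceid INTEGER);
INSERT INTO table1 VALUES (1, '2017-03-08'), (2, '2017-03-08'), (3, '2017-03-08');
INSERT INTO table2 VALUES
  (1, '2017-03-08', 1), (1, '2017-03-08', 2), (1, '2017-03-08', 3),
  (2, '2017-03-08', 2),
  (3, '2017-03-08', 1), (3, '2017-03-08', 3);
""")

# For each table1 row, keep only the best-ranked source: 2 first, then 1, then 3.
rows = conn.execute("""
SELECT t1.id,
       (SELECT t2.sourceid
        FROM table2 t2
        WHERE t2.id = t1.id AND t2.date = t1.date
        ORDER BY CASE t2.sourceid WHEN 2 THEN 1 WHEN 1 THEN 2 WHEN 3 THEN 3 END
        LIMIT 1) AS sourceid
FROM table1 t1
ORDER BY t1.id
""").fetchall()

print(rows)  # [(1, 2), (2, 2), (3, 1)]
```

The CASE expression is the only place the 2, 1, 3 ordering lives, so changing the priority means editing one line.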


How to randomly update a column if the number of records between the 2 join tables are not equal in Postgres sql

Table 1 (5 records):
id name date units
1 abc 3/16/2021
1 abc 3/17/2021
1 abc 3/18/2021
1 abc 3/19/2021
1 abc 3/20/2021
Table 2 (3 records):
id name startdate enddate units
1 abc 3/16/2021 03/23/2021 2
1 abc 3/16/2021 03/23/2021 2
1 abc 3/16/2021 03/23/2021 2
Below is the join condition:
select * from Table1 a right join Table2 b on
(a.id = b.id) and (a.name = b.name) and (a.date between b.startdate and b.enddate)
I am trying to update the units column in Table 1 from Table 2. Since there are 3 records in Table 2, only 3 records in Table 1 should be updated based on the above join condition. Which ones are updated can be random, but the number of records updated should not exceed 3.
I tried doing this.
with e as
(select *,
row_number() over(partition by a.id
order by id) as rn
from Table1 a right join Table 2 b on (a.id = b.id) and (a.name = b.name) and (a.date between b.startdate and b.enddate)
)
update table1
set units = e.units
from e
where e.rn = 1
However, in this case all 5 records get updated. How do I resolve this? Any help is appreciated. Thank you.
Join the tables together. Then choose one row from table2 for each row in table1 and do the update:
update table1 t1
set units = tt1.units -- Postgres does not allow the target alias in SET
from (select distinct on (t1.id, t1.name, t1.date)
t1.id, t1.name, t1.date, t2.units
from table1 t1 join
table2 t2
on t2.id = t1.id and t2.name = t1.name and
t1.date between t2.startdate and t2.enddate
order by t1.id, t1.name, t1.date, random()
) tt1
where tt1.name = t1.name and
tt1.id = t1.id and
tt1.date = t1.date;
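Picking one table2 row per table1 row still lets every matching table1 row be updated. The following is a hedged sketch of a different technique that caps the count: number the matching table1 rows in random order, number the table2 rows, and pair them by row number. It is run through SQLite from Python rather than Postgres; the temp table name `pairs`, the ISO date strings, and the assumption that all table2 rows for one id and name share a date range are mine, not the original poster's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, name TEXT, date TEXT, units INTEGER);
CREATE TABLE table2 (id INTEGER, name TEXT, startdate TEXT, enddate TEXT, units INTEGER);
INSERT INTO table1 (id, name, date) VALUES
  (1, 'abc', '2021-03-16'), (1, 'abc', '2021-03-17'), (1, 'abc', '2021-03-18'),
  (1, 'abc', '2021-03-19'), (1, 'abc', '2021-03-20');
INSERT INTO table2 VALUES
  (1, 'abc', '2021-03-16', '2021-03-23', 2),
  (1, 'abc', '2021-03-16', '2021-03-23', 2),
  (1, 'abc', '2021-03-16', '2021-03-23', 2);

-- Materialize the random pairing once so the UPDATE sees a stable result.
CREATE TEMP TABLE pairs AS
WITH c AS (   -- table1 rows with at least one matching table2 row, random order
  SELECT a.rowid AS rid, a.id, a.name,
         row_number() OVER (PARTITION BY a.id, a.name ORDER BY random()) AS rn
  FROM table1 a
  WHERE EXISTS (SELECT 1 FROM table2 b
                WHERE b.id = a.id AND b.name = a.name
                  AND a.date BETWEEN b.startdate AND b.enddate)
),
s AS (        -- table2 rows, numbered so each one is used at most once
  SELECT id, name, units,
         row_number() OVER (PARTITION BY id, name ORDER BY startdate) AS rn
  FROM table2
)
SELECT c.rid, s.units
FROM c JOIN s ON s.id = c.id AND s.name = c.name AND s.rn = c.rn;

UPDATE table1
SET units = (SELECT p.units FROM pairs p WHERE p.rid = table1.rowid)
WHERE rowid IN (SELECT rid FROM pairs);
""")

updated = conn.execute(
    "SELECT count(*) FROM table1 WHERE units IS NOT NULL"
).fetchone()[0]
print(updated)  # 3
```

Which three rows are updated changes from run to run, but the count cannot exceed the number of table2 rows for that id and name.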

MSSQL get rows which only differ at 2 columns

I have a task and no idea how to approach it.
I have to find records, which have a time difference of X and where a boolean is ON/OFF. I tried to use a LEFT OUTER JOIN and used the conditions in the ON clause, but it gave me the wrong result.
So my question is, how can I select rows, which have the same value in 2 columns, but different values in other 2 columns?
Edit:
My problem is that, for some reason, my actual query returns the same entry multiple times. I checked whether the entry exists multiple times in the table, but it doesn't.
Data for reference:
ID1 ID2 Boolean Time
1 1 0 2018-03-06 11:31:39
1 1 1 2018-03-06 11:33:39
2 1 0 2018-03-06 11:31:39
2 2 1 2018-03-06 11:40:39
The desired output from the query would be
ID1 ID2 Boolean Time
1 1 0 2018-03-06 11:31:39
1 1 1 2018-03-06 11:33:39
because ID1 and ID2 are the same, the Boolean is different and the time difference is in the specified range (let's say 5 minutes). The other 2 entries are not valid, because ID2 differs and the time difference is too big.
My current query:
select
t1.id1,
t1.id2,
t1.boolean,
t1.time
from t1 t1
left outer join t1 t2
on t1.boolean != t2.boolean and datediff(minute, t1.time, t2.time)<=5
where t1.id1 = t2.id1
and t1.id2 = t2.id2
Your query looks fine, but I found a few small issues:
1- The table alias used is wrong; instead of t it should be t1
2- The order of the data is wrong
3- Changed left join to inner join
4- Modified the ON and WHERE conditions for better readability and performance
Check the following corrected query.
WITH t1 AS
(
SELECT * FROM (VALUES
(1 , 1 , 0 , '2018-03-06 11:31:39'),
(1 , 1 , 1 , '2018-03-06 11:33:39'),
(2 , 1 , 0 , '2018-03-06 11:31:39'),
(2 , 2 , 1 , '2018-03-06 11:40:39')
) T( ID1, ID2 , Boolean, Time)
)
select
t1.id1,
t1.id2,
t1.boolean,
t1.time
from t1 t1
inner join t1 t2
on t1.id1 = t2.id1 and t1.id2 = t2.id2
where
t1.boolean != t2.boolean and abs(datediff(minute, t1.time, t2.time)) <= 5
ORDER BY [TIME]
Output
+-----+-----+---------+---------------------+
| id1 | id2 | boolean | time |
+-----+-----+---------+---------------------+
| 1 | 1 | 0 | 2018-03-06 11:31:39 |
+-----+-----+---------+---------------------+
| 1 | 1 | 1 | 2018-03-06 11:33:39 |
+-----+-----+---------+---------------------+
To avoid duplicate values, use GROUP BY:
SELECT t1.id1
,t1.id2
,t1.boolean
,t1.TIME
FROM t1 t1
INNER JOIN t1 t2 ON t1.boolean != t2.boolean
AND abs(datediff(minute, t1.TIME, t2.TIME)) <= 5
WHERE t1.id1 = t2.id1
AND t1.id2 = t2.id2
GROUP BY t1.id1
,t1.id2
,t1.boolean
,t1.TIME
SELECT
D1.*
FROM
Data AS D1
WHERE
EXISTS (
SELECT
1
FROM
Data AS D2
WHERE
D1.ID1 = D2.ID1 AND
D1.ID2 = D2.ID2 AND
~D1.Boolean = D2.Boolean AND
ABS(DATEDIFF(MINUTE, D1.Time, D2.Time)) <= 5)
ORDER BY
D1.ID1,
D1.Boolean,
D1.Time
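To see the EXISTS approach on the question's four sample rows, here is a minimal sketch using SQLite from Python. The table name `data` and the column names `flag` and `ts` are placeholders chosen for the example, and the time window is compared in seconds with strftime, which is not identical to SQL Server's DATEDIFF(MINUTE, ...) boundary counting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (id1 INTEGER, id2 INTEGER, flag INTEGER, ts TEXT);
INSERT INTO data VALUES
  (1, 1, 0, '2018-03-06 11:31:39'),
  (1, 1, 1, '2018-03-06 11:33:39'),
  (2, 1, 0, '2018-03-06 11:31:39'),
  (2, 2, 1, '2018-03-06 11:40:39');
""")

# Keep a row only if a partner row exists with the same id1 and id2,
# the opposite flag, and a timestamp at most 5 minutes away.
rows = conn.execute("""
SELECT d1.id1, d1.id2, d1.flag, d1.ts
FROM data d1
WHERE EXISTS (
  SELECT 1
  FROM data d2
  WHERE d2.id1 = d1.id1 AND d2.id2 = d1.id2
    AND d2.flag <> d1.flag
    AND abs(strftime('%s', d1.ts) - strftime('%s', d2.ts)) <= 5 * 60
)
ORDER BY d1.ts
""").fetchall()

for row in rows:
    print(row)
# (1, 1, 0, '2018-03-06 11:31:39')
# (1, 1, 1, '2018-03-06 11:33:39')
```

Because EXISTS only tests for a partner and never multiplies rows, each qualifying row comes back once, with no GROUP BY needed.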

Sql query to get join of two tables and both table has where query

Stuck at join, tried left, right, left outer, right outer joins
table 1
selectionID name type
1 abc 1
2 def 1
3 ghi 2
4 dhi 2
5 gki 2
6 ppp 2
Table 2
TID UserID selectionID isOK
1 10 3 0
2 19 3 0
3 10 8 0
6 10 5 1
Desired result is
join of
select from table 1 where type =2
select from table 2 where UserID = 10
selectionID name type TID userID
3 ghi 2 1 10
4 dhi 2 undefined undefined/null
5 gki 2 undefined undefined/null
6 ppp 2 6 10
So basically I want all data from table 1 that fits its where clause, plus the respective data in table 2 filtered by another where clause.
From what I have researched, I need to use an inner query on the second table... am I going the right way?
Try the following query:
SELECT t1.selectionID, t1.name, t1.type, t2.tid, t2.userID
FROM table1 t1 LEFT JOIN table2 t2 ON t1.selectionID = t2.selectionID AND t2.userID = 10
WHERE t1.type = 2;
"Stuck at join, tried left, right, left outer, right outer joins" ... well, LEFT JOIN is the same as LEFT OUTER JOIN. BTW, you are probably looking for a LEFT JOIN like:
select t1.selectionID,
t1.name,
t1.type,
t2.TID,
t2.UserId
from table1 t1
left join table2 t2 on t1.selectionID = t2.selectionID
and t2.UserId = 10
where t1.type = 2;
You were probably failing because you placed the conditions in the where clause. If a row doesn't join, you will have nulls in its columns, so a where condition on those columns will discard those rows. Put the condition in the on clause instead:
select *
from table1 t1
left join
table2 t2
on t1.selectionID = t2.selectionID and
t2.userID = 10
where t1.type = 2
Another way is to force the NULL rows to be accepted, using coalesce:
select *
from table1 t1
left join
table2 t2
on t1.selectionID = t2.selectionID
where t1.type = 2 and
coalesce(t2.userID, 10) = 10
select * from table1 t1
left join table2 t2 ON t1.SelectionID = t2.SelectionID AND t2.UserID = 10
where t1.type = 2
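The difference between filtering the joined table in ON and filtering it in WHERE can be checked directly. This is a small sketch on the question's sample rows, run through SQLite from Python; the `base` query template and the variable names are illustrative. With the rows as listed, TID 6 belongs to selectionID 5, so that is where it attaches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (selectionID INTEGER, name TEXT, type INTEGER);
CREATE TABLE table2 (TID INTEGER, UserID INTEGER, selectionID INTEGER, isOK INTEGER);
INSERT INTO table1 VALUES
  (1, 'abc', 1), (2, 'def', 1), (3, 'ghi', 2),
  (4, 'dhi', 2), (5, 'gki', 2), (6, 'ppp', 2);
INSERT INTO table2 VALUES (1, 10, 3, 0), (2, 19, 3, 0), (3, 10, 8, 0), (6, 10, 5, 1);
""")

base = """
SELECT t1.selectionID, t1.name, t2.TID, t2.UserID
FROM table1 t1
LEFT JOIN table2 t2 ON t1.selectionID = t2.selectionID {on_extra}
WHERE t1.type = 2 {where_extra}
ORDER BY t1.selectionID
"""

# UserID filter inside ON: unmatched table1 rows survive with NULLs.
in_on = conn.execute(
    base.format(on_extra="AND t2.UserID = 10", where_extra="")
).fetchall()

# UserID filter inside WHERE: NULL rows fail the test and are discarded.
in_where = conn.execute(
    base.format(on_extra="", where_extra="AND t2.UserID = 10")
).fetchall()

print(in_on)
# [(3, 'ghi', 1, 10), (4, 'dhi', None, None), (5, 'gki', 6, 10), (6, 'ppp', None, None)]
print(in_where)
# [(3, 'ghi', 1, 10), (5, 'gki', 6, 10)]
```

The second result is what an inner join would return, which is why the left join appeared to have no effect.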

SQL Query - ID Join, Just Duplicates

I am working with Oracle SQL. I have two tables. One has ItemID and DatePurchased and the other has ItemID, CustomerID. I'm trying to join the tables so that I can see only those customers with multiple items.
In other words, if I had:
TABLE 1
ItemID---DatePurchased
1 MAR15
2 JUN10
3 APR02
and
TABLE 2
ItemID---CustomerID
1 1
2 1
3 2
I would want this returned:
TABLE 3
ItemID--DatePurchased--CustomerID
1 MAR15 1
2 JUN10 1
(Customer 2 is left out because he only has one item (ItemID=3)).
Any ideas on how to do this in SQL?
select ItemID, DatePurchased, CustomerID
from
(
select
T1.ItemID, T1.DatePurchased, T2.CustomerID,
count(*) over (partition by T2.CustomerId) as ItemCnt
from TABLE2 T2
join TABLE1 T1 on T1.ItemID = T2.ItemID
) dt
where ItemCnt > 1
select
T2.ItemID, T2.CustomerID, T1.DatePurchased
from TABLE2 T2
inner join TABLE1 T1 on T1.ItemID = T2.ItemID
where
T2.CustomerID in
(
select TT.CustomerID
from TABLE2 TT
group by TT.CustomerID
having count(*) > 1
)
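A minimal sketch of the GROUP BY / HAVING variant on the question's sample data, run through SQLite from Python rather than Oracle (the table and column names follow the question; everything else is an assumption for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ItemID INTEGER, DatePurchased TEXT);
CREATE TABLE table2 (ItemID INTEGER, CustomerID INTEGER);
INSERT INTO table1 VALUES (1, 'MAR15'), (2, 'JUN10'), (3, 'APR02');
INSERT INTO table2 VALUES (1, 1), (2, 1), (3, 2);
""")

# Keep only items whose customer owns more than one item.
rows = conn.execute("""
SELECT t1.ItemID, t1.DatePurchased, t2.CustomerID
FROM table2 t2
JOIN table1 t1 ON t1.ItemID = t2.ItemID
WHERE t2.CustomerID IN (
  SELECT CustomerID
  FROM table2
  GROUP BY CustomerID
  HAVING count(*) > 1
)
ORDER BY t1.ItemID
""").fetchall()

print(rows)  # [(1, 'MAR15', 1), (2, 'JUN10', 1)]
```

Customer 2 drops out because the HAVING clause removes any customer with a single row in table2.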

SQL: 3 self-joins and then join them together

I have 2 tables to join in a specific way. I think my query is right, but not sure.
select t1.userID, t3.Answer Unit, t5.Answer Demo
FROM
table1 t1
inner join (select * from table2) t3 ON t1.userID = t3.userID
inner join (select * from table2) t5 ON t1.userID = t5.userID
where
NOT EXISTS (SELECT * FROM table1 t2 WHERE t2.userID = t1.userID AND t2.date > t1.date)
and NOT EXISTS (SELECT * FROM table2 t4 WHERE t4.userID = t3.userID and t4.counter > t3.counter)
and NOT EXISTS (SELECT * FROM table2 t6 WHERE t6.userID = t5.userID and t6.counter > t5.counter)
and t1.date_submitted >'1/1/2009'
and t3.question = Unit
and t5.question = Demo
order by
t1.userID
From table1 I want distinct userID where date > 1/1/2009
table1
userID Date
1 1/2/2009
1 1/2/2009
2 1/2/2009
3 1/2/2009
4 1/1/2008
So the result I want from table1 should be this:
userID
1
2
3
I then want to join this on userID with table2, which looks like this:
table2
userID question answer counter
1 Unit A 1
1 Demo x 1
1 Prod 100 1
2 Unit B 1
2 Demo Y 1
3 Prod 100 1
4 Unit A 1
1 Unit B 2
1 Demo x 2
1 Prod 100 2
2 Unit B 2
2 Demo Z 2
3 Prod 100 2
4 Unit A 2
I want to join table1 with table2 with this result:
userID Unit Demo
1 B X
2 B Z
In other words,
select distinct userID from table2 where question = Unit for the highest counter
and then
select distinct userID from table2 where question = Demo for the highest counter.
I think what I've done is 3 self-joins then joined those 3 together.
Do you think it's right?
SELECT du.userID, unit.answer, demo.answer
FROM (
SELECT DISTINCT userID
FROM table1
WHERE date > '1/1/2009'
) du
LEFT JOIN
table2 unit
ON (unit.userID, unit.question, unit.counter) IN
(
SELECT du.userID, 'Unit', MAX(counter)
FROM table2 td
WHERE userID = du.userID
AND question = 'Unit'
)
LEFT JOIN
table2 demo
ON (demo.userID, demo.question, demo.counter) IN
(
SELECT du.userID, 'Demo', MAX(counter)
FROM table2 td
WHERE userID = du.userID
AND question = 'Demo'
)
Having an index on table2 (userID, question, counter) will greatly improve this query's performance.
Since you mentioned SQL Server 2005, the following will be easier and more efficient:
SELECT du.userID,
(
SELECT TOP 1 answer
FROM table2 ti
WHERE ti.userID = du.userID
AND ti.question = 'Unit'
ORDER BY
counter DESC
) AS unit_answer,
(
SELECT TOP 1 answer
FROM table2 ti
WHERE ti.userID = du.userID
AND ti.question = 'Demo'
ORDER BY
counter DESC
) AS demo_answer
FROM (
SELECT DISTINCT userID
FROM table1
WHERE date > '1/1/2009'
) du
To aggregate:
SELECT answer, COUNT(*)
FROM (
SELECT DISTINCT userID
FROM table1
WHERE date > '1/1/2009'
) du
JOIN table2 t2
ON t2.userID = du.userID
AND t2.question = 'Unit'
GROUP BY
answer
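The "latest answer per user and question" lookup can be tried on the question's sample data. This is a sketch of the scalar-subquery approach using SQLite from Python, with LIMIT 1 standing in for TOP 1; the ISO date strings are an assumption so that the text comparison orders dates correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (userID INTEGER, date TEXT);
CREATE TABLE table2 (userID INTEGER, question TEXT, answer TEXT, counter INTEGER);
INSERT INTO table1 VALUES
  (1, '2009-01-02'), (1, '2009-01-02'), (2, '2009-01-02'),
  (3, '2009-01-02'), (4, '2008-01-01');
INSERT INTO table2 VALUES
  (1, 'Unit', 'A', 1), (1, 'Demo', 'x', 1), (1, 'Prod', '100', 1),
  (2, 'Unit', 'B', 1), (2, 'Demo', 'Y', 1), (3, 'Prod', '100', 1),
  (4, 'Unit', 'A', 1),
  (1, 'Unit', 'B', 2), (1, 'Demo', 'x', 2), (1, 'Prod', '100', 2),
  (2, 'Unit', 'B', 2), (2, 'Demo', 'Z', 2), (3, 'Prod', '100', 2),
  (4, 'Unit', 'A', 2);
""")

# One row per distinct user after the cutoff date, with the answer that has
# the highest counter for each of the two questions.
rows = conn.execute("""
SELECT du.userID,
       (SELECT ti.answer FROM table2 ti
        WHERE ti.userID = du.userID AND ti.question = 'Unit'
        ORDER BY ti.counter DESC LIMIT 1) AS unit_answer,
       (SELECT ti.answer FROM table2 ti
        WHERE ti.userID = du.userID AND ti.question = 'Demo'
        ORDER BY ti.counter DESC LIMIT 1) AS demo_answer
FROM (SELECT DISTINCT userID FROM table1 WHERE date > '2009-01-01') du
ORDER BY du.userID
""").fetchall()

print(rows)  # [(1, 'B', 'x'), (2, 'B', 'Z'), (3, None, None)]
```

User 3 comes back with NULLs because it has neither a Unit nor a Demo answer; wrapping the query and filtering on `unit_answer IS NOT NULL` would drop it to match the desired two-row result.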