I have one SQL query that I need help with.
As you can see below, there are 4 emails. 3 of the 4 messages are related (15482, 15483 and 15484). I would like to return only these rows.
You can relate them to each other through the columns MessageId = ReplyTo and ReplyTo = MessageId, so it ends up being recursive.
The picture below shows what I have right now, but as I said, I only want IDs 15482, 15483 and 15484 shown, not 15485. As you can see, the first 3 rows are an email thread that belongs together, while the last is a new mail.
How can I do that when I only know ID 15482?
This is my SQL statement
select
t1.id, t1.From_, t1.MessageId, t2.ReplyTo
from
HelpDesk_Z01_Emails as t1
left join
HelpDesk_Z01_EmailsReplyTo as t2 on t1.MessageId = t2.MessageId
This is my output:
You correctly said that it is a recursion.
Usually this is done using a recursive common table expression.
I'm not sure everything in the query below is correct. If you had provided a simplified example with sample data, it would have been possible to try the suggested solution. Without it, the query is written but not tested. Give it a try.
WITH
CTE
AS
(
SELECT
t1.id, t1.From_, t1.MessageId, t2.ReplyTo
FROM
HelpDesk_Z01_Emails as t1
left join HelpDesk_Z01_EmailsReplyTo as t2 on t1.MessageId = t2.MessageId
WHERE t1.id = 15482
UNION ALL
SELECT
t1.id, t1.From_, t1.MessageId, t2.ReplyTo
FROM
HelpDesk_Z01_Emails as t1
left join HelpDesk_Z01_EmailsReplyTo as t2 on t1.MessageId = t2.MessageId
INNER JOIN CTE ON CTE.MessageId = t2.ReplyTo
)
SELECT *
FROM CTE
;
This is the script to make a simplified table with sample data that you should have included in your question:
DECLARE @T TABLE(id int, From_ varchar(255), MessageId varchar(255), ReplyTo varchar(255));
INSERT INTO @T (id, From_, MessageId, ReplyTo) VALUES (15482, 'test#com', 'CAF', NULL);
INSERT INTO @T (id, From_, MessageId, ReplyTo) VALUES (15483, 'test#com', '54c', 'CAF');
INSERT INTO @T (id, From_, MessageId, ReplyTo) VALUES (15484, 'test#com', 'Fk', '54c');
INSERT INTO @T (id, From_, MessageId, ReplyTo) VALUES (15485, 'test#com', 'FkMh', NULL);
With such a starting point, it is easy to write and verify the following query:
WITH
CTE
AS
(
-- anchor: the message we know
SELECT
TT.id, TT.From_, TT.MessageId, TT.ReplyTo
FROM
@T AS TT
WHERE TT.id = 15482
UNION ALL
-- recursive step: every message that replies to a message already found
SELECT
TT.id, TT.From_, TT.MessageId, TT.ReplyTo
FROM
@T AS TT
INNER JOIN CTE ON CTE.MessageId = TT.ReplyTo
)
SELECT *
FROM CTE
;
The result set:
id     From_     MessageId  ReplyTo
15482  test#com  CAF        NULL
15483  test#com  54c        CAF
15484  test#com  Fk         54c
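The query above starts from the first message of the thread. If the known ID could be any message in the thread, one option is to walk up to the root first and then walk back down. The following is only a sketch, run against SQLite through Python's sqlite3 module as a stand-in for SQL Server (SQL Server omits the RECURSIVE keyword). The table name emails, the snake_case column names, and the helper thread_ids are hypothetical, and it assumes the first message of a thread has ReplyTo NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (id INTEGER, message_id TEXT, reply_to TEXT)")
conn.executemany(
    "INSERT INTO emails VALUES (?, ?, ?)",
    [
        (15482, "CAF", None),
        (15483, "54c", "CAF"),
        (15484, "Fk", "54c"),
        (15485, "FkMh", None),
    ],
)

THREAD_SQL = """
WITH RECURSIVE
up(id, message_id, reply_to) AS (
    -- anchor: the message we know
    SELECT id, message_id, reply_to FROM emails WHERE id = :start_id
    UNION ALL
    -- step: the message that the current one replies to
    SELECT e.id, e.message_id, e.reply_to
    FROM emails AS e
    JOIN up ON e.message_id = up.reply_to
),
down(id, message_id, reply_to) AS (
    -- anchor: the root found above (assumed to have reply_to NULL)
    SELECT id, message_id, reply_to FROM up WHERE reply_to IS NULL
    UNION ALL
    -- step: every message replying to one already collected
    SELECT e.id, e.message_id, e.reply_to
    FROM emails AS e
    JOIN down ON e.reply_to = down.message_id
)
SELECT id FROM down ORDER BY id
"""


def thread_ids(start_id):
    """Return the ids of all messages in the thread containing start_id."""
    return [row[0] for row in conn.execute(THREAD_SQL, {"start_id": start_id})]
```

Called with 15483 or 15484, this returns the same three ids as starting from 15482, while 15485 comes back on its own.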
I am matching two datasets that I imported into a Redshift DB: both are at rep id level.
This is my initial query to match the two datasets:
select *
from #t t
join #t2 t2
on lower(trim(t.unique_id))=lower(trim(t2.unique_id))
or lower(trim(t.email))=lower(trim(t2.email))
or lower(trim(split_part(t.first_name,',',1))||trim(split_part(t.last_name,',',1)))=lower(trim(split_part(t2.first_name,',',1))||trim(split_part(t2.last_name,',',1)))
#t is the source of truth I am matching to, and unique_id is supposedly the universal identifier for rep id (the internal identifier), though it only matches about 60% of the time. However, in some cases the #t2 table has (incorrectly) multiple unique_ids per rep, and (incorrectly) multiple emails.
How can I make it more restrictive, i.e. when a rep matches by unique_id, don't match further records for that rep; otherwise, when it matches by email, don't match further records for that rep; and only as a last resort join by first name/last name?
Thank you!
I think there are a few ways to skin this cat. As one option, you could add a rank for each join condition with a CASE expression, and then pick the rows that have the minimum rank:
SELECT *
FROM
(
SELECT sub1.*,
min(ranktest) OVER (PARTITION BY sub1.unique_id) as minrank
FROM
(
-- t2 columns are aliased so that sub1.unique_id is unambiguous
select t.*,
t2.unique_id as t2_unique_id, t2.email as t2_email,
t2.first_name as t2_first_name, t2.last_name as t2_last_name,
CASE WHEN lower(trim(t.unique_id))=lower(trim(t2.unique_id)) THEN 1
WHEN lower(trim(t.email))=lower(trim(t2.email)) THEN 2
WHEN lower(trim(split_part(t.first_name,',',1))||trim(split_part(t.last_name,',',1)))=lower(trim(split_part(t2.first_name,',',1))||trim(split_part(t2.last_name,',',1))) THEN 3
END as ranktest
from #t t
join #t2 t2
on lower(trim(t.unique_id))=lower(trim(t2.unique_id))
or lower(trim(t.email))=lower(trim(t2.email))
or lower(trim(split_part(t.first_name,',',1))||trim(split_part(t.last_name,',',1)))=lower(trim(split_part(t2.first_name,',',1))||trim(split_part(t2.last_name,',',1)))
) sub1
) sub2
WHERE ranktest = minrank;
You could also do this by querying twice, once to get your data and once to get the min(ranktest). It will almost certainly be slower, but it's a little prettier:
WITH subquery AS
(
-- t2 columns are aliased so that subquery.unique_id is unambiguous
select t.*,
t2.unique_id as t2_unique_id, t2.email as t2_email,
t2.first_name as t2_first_name, t2.last_name as t2_last_name,
CASE WHEN lower(trim(t.unique_id))=lower(trim(t2.unique_id)) THEN 1
WHEN lower(trim(t.email))=lower(trim(t2.email)) THEN 2
WHEN lower(trim(split_part(t.first_name,',',1))||trim(split_part(t.last_name,',',1)))=lower(trim(split_part(t2.first_name,',',1))||trim(split_part(t2.last_name,',',1))) THEN 3
END as ranktest
from #t t
join #t2 t2
on lower(trim(t.unique_id))=lower(trim(t2.unique_id))
or lower(trim(t.email))=lower(trim(t2.email))
or lower(trim(split_part(t.first_name,',',1))||trim(split_part(t.last_name,',',1)))=lower(trim(split_part(t2.first_name,',',1))||trim(split_part(t2.last_name,',',1)))
)
SELECT *
FROM subquery t1
WHERE t1.ranktest = (SELECT min(ranktest) FROM subquery WHERE subquery.unique_id = t1.unique_id)
Alternatively, you could run this as a UNION ALL, testing for the join differently each time to avoid repeats and only allowing the top most ranked join through:
select *
from #t t
join #t2 t2
on lower(trim(t.unique_id))=lower(trim(t2.unique_id))
UNION ALL
select *
from #t t
join #t2 t2
on lower(trim(t.unique_id))<>lower(trim(t2.unique_id))
AND lower(trim(t.email))=lower(trim(t2.email))
UNION ALL
select *
FROM #t t
join #t2 t2
ON lower(trim(t.unique_id))<>lower(trim(t2.unique_id))
AND lower(trim(t.email))<>lower(trim(t2.email))
AND lower(trim(split_part(t.first_name,',',1))||trim(split_part(t.last_name,',',1)))=lower(trim(split_part(t2.first_name,',',1))||trim(split_part(t2.last_name,',',1)))
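To see the minimum-rank idea on concrete rows, here is a small sketch run against SQLite through Python's sqlite3 module instead of Redshift. It assumes SQLite 3.25 or later for window functions. The sample rows, the table names t and t2 without the # prefix, and the column aliases are all hypothetical, and the split_part calls are left out because SQLite has no such function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t  (unique_id TEXT, email TEXT, first_name TEXT, last_name TEXT);
CREATE TABLE t2 (unique_id TEXT, email TEXT, first_name TEXT, last_name TEXT);
-- hypothetical sample data
INSERT INTO t VALUES ('U1', 'ann@x.com', 'Ann', 'Lee'),
                     ('U2', 'bob@x.com', 'Bob', 'Ray');
INSERT INTO t2 VALUES ('U1', 'other@x.com', 'A', 'L'),     -- matches Ann by unique_id
                      ('U9', 'ann@x.com', 'Ann', 'Lee'),   -- matches Ann by email and name
                      ('U8', 'zzz@x.com', 'Bob', 'Ray');   -- matches Bob by name only
""")

BEST_MATCH_SQL = """
WITH matches AS (
    SELECT t.unique_id AS t_id,
           t2.unique_id AS t2_id,
           CASE WHEN lower(trim(t.unique_id)) = lower(trim(t2.unique_id)) THEN 1
                WHEN lower(trim(t.email)) = lower(trim(t2.email)) THEN 2
                ELSE 3
           END AS match_rank
    FROM t
    JOIN t2
      ON lower(trim(t.unique_id)) = lower(trim(t2.unique_id))
      OR lower(trim(t.email)) = lower(trim(t2.email))
      OR lower(trim(t.first_name) || trim(t.last_name))
         = lower(trim(t2.first_name) || trim(t2.last_name))
),
ranked AS (
    -- best (lowest) rank found for each row of the source-of-truth table
    SELECT *, MIN(match_rank) OVER (PARTITION BY t_id) AS best_rank
    FROM matches
)
SELECT t_id, t2_id, match_rank
FROM ranked
WHERE match_rank = best_rank
ORDER BY t_id, t2_id
"""

best_matches = conn.execute(BEST_MATCH_SQL).fetchall()
```

With this data, Ann keeps only her unique_id match and the email match is dropped, while Bob falls through to the name match.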
I want to create a temp table and insert values based on a select. The query doesn't execute. What am I missing? I eventually want to loop through the temp table.
Create Table #temp (ID varchar(25),Source_Id varchar(25),Processed varchar(25), Status varchar(25),Time_Interval_Min varchar(25))
Insert into #temp
Select t.*
From
(SELECT DISTINCT source_id
FROM Activity_WorkLoad) t1
CROSS APPLY
(
SELECT TOP 1
aw.ID,
Source_Id
,Processed
,Status
,Time_Interval_Min
FROM [dbSDS].[dbo].[Activity_WorkLoad] aw
JOIN [dbSDS].[dbo].[SDA_Schedule_Time] st ON aw.SDA_Resource_ID = st.ID
WHERE aw.Source_Id = t1.Source_Id AND aw.Status = 'Queued'
ORDER BY Processed DESC
)t
When you cross apply, you still need an alias:
Insert into #temp (id, source_id, processed, status, time_interval_min)
Select tt.*
From (SELECT DISTINCT source_id
FROM Activity_WorkLoad
) t CROSS APPLY
(SELECT TOP 1 aw.ID, Source_Id, Processed, Status, Time_Interval_Min
FROM [dbSDS].[dbo].[Activity_WorkLoad] aw JOIN
[dbSDS].[dbo].[SDA_Schedule_Time] st
ON aw.SDA_Resource_ID = st.ID
WHERE aw.Source_Id = t.Source_Id AND aw.Status = 'Queued'
ORDER BY Processed DESC
) tt;
I also assume that you want results from the second subquery, not the first, because the first does not have enough columns.
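The same "latest queued row per Source_Id" result can also be written with ROW_NUMBER instead of CROSS APPLY. The sketch below is an alternative technique, not the query above: it runs against SQLite (3.25 or later, which has window functions but no CROSS APPLY) through Python's sqlite3 module, uses a hypothetical single table with made-up rows, and leaves out the join to SDA_Schedule_Time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity_workload (id INT, source_id TEXT, processed TEXT, status TEXT);
-- hypothetical sample data
INSERT INTO activity_workload VALUES
    (1, 'A', '2020-01-01', 'Queued'),
    (2, 'A', '2020-01-03', 'Queued'),
    (3, 'A', '2020-01-05', 'Done'),
    (4, 'B', '2020-01-02', 'Queued'),
    (5, 'C', '2020-01-04', 'Done');
""")

LATEST_QUEUED_SQL = """
SELECT id, source_id, processed, status
FROM (
    -- number the queued rows of each source, newest first
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY source_id ORDER BY processed DESC) AS rn
    FROM activity_workload
    WHERE status = 'Queued'
) AS numbered
WHERE rn = 1
ORDER BY source_id
"""

latest_queued = conn.execute(LATEST_QUEUED_SQL).fetchall()
```

Source C has no queued rows and so does not appear, which matches what CROSS APPLY does when the applied subquery returns nothing.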
I have a table
declare @table table(t varchar(50), d varchar(50), activ varchar(10), groupid int, rownum int)
insert into @table values('ALK','ceri', '0.2',1,1)
insert into @table values('ALK','criz', '24',1,2)
insert into @table values('EGFR','erlo', '2',2,3)
insert into @table values('EGFR','gefi', '57',2,4)
insert into @table values('EGFR','ibru', '5.6',2,5)
insert into @table values('EGFR','ceri', '900',2,6)
insert into @table values('EGFR','cetu', 'NULL',2,7)
insert into @table values('EGFR','afat', '10',2,8)
insert into @table values('EGFR','lapa', '10.8',2,9)
insert into @table values('EGFR','pani', 'NULL',2,10)
insert into @table values('ERBB2','pert', 'NULL',3,11)
insert into @table values('ERBB2','tras', 'NULL',3,12)
insert into @table values('ERBB2','lapa', '9.2',3,13)
insert into @table values('ERBB2','ado-', 'NULL',3,14)
insert into @table values('ERBB2','afat', '14',3,15)
insert into @table values('ERBB2','ibru', '9.4',3,16)
In the output I need all combinations by groupid or t, in the format
t,d,t,d,t,d,activ and so on. Then I will pick out the best combinations.
Any help will be appreciated. This will show doctors optimum combination of drugs for cancer patients. The table is dynamic and different for every patient.
Thank you
For all possible combinations, you would use CROSS JOIN, which takes no ON clause:
SELECT * FROM table1 AS t1
CROSS JOIN table2 AS t2
Keep in mind that this returns n × m rows (n² for a self join), which is likely to be huge for large sets of data.
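A quick way to see how fast the result grows is to count the rows of a tiny cross join. This is only a sketch, run against SQLite through Python's sqlite3 module, with hypothetical tables of 3 and 2 rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INT);
CREATE TABLE table2 (id INT);
INSERT INTO table1 VALUES (1), (2), (3);
INSERT INTO table2 VALUES (10), (20);
""")

# Every row of table1 is paired with every row of table2.
pairs = conn.execute(
    "SELECT t1.id, t2.id "
    "FROM table1 AS t1 CROSS JOIN table2 AS t2 "
    "ORDER BY t1.id, t2.id"
).fetchall()
```

Here pairs holds 3 × 2 = 6 rows. Cross joining the 16-row table from the question with itself gives 16 × 16 = 256 rows.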
I will use @TT to represent the table variable, since calling it @table may be a bit confusing.
I also changed the datatype of activ to float.
There are really 3 possible cross joins:
-- #1 -- produces 256 rows
select * from @TT as T1
cross join @TT as T2
-- #2 -- produces 104 rows
select * from @TT as T1
cross join @TT as T2
where T1.GroupID = T2.GroupID
-- #3 -- produces 104 rows
select * from @TT as T1
cross join @TT as T2
where T1.t = T2.t
The 1st is a true cross join on the whole table.
The 2nd and 3rd are cross joins restricted to matching GroupID and t respectively, but they are identical since Group 1 represents t = 'ALK', etc. This is easily confirmed, since a union of 2 and 3 also produces 104 rows.
However, select * on a self join is silly, as is obvious if you change select * to
select T1.*, '===', T2.*
You can see that the columns on the left of '===' are the same as the columns on the right of '==='.
Since GroupID is an integer, I would write the cross join as
select T1.* from @TT as T1
cross join @TT as T2
where T1.GroupID = T2.GroupID
Now, since the poster wanted grouping based on the smallest total activ, I think it makes sense to group the response by GroupID, t and d, report the sum of activ, and order by GroupID and sum(activ).
-- #4 adding group by and sum -- 16 rows generated
select T1.groupid, T1.t, T1.d, sum(T1.activ) as SumActiv
from @TT as T1
cross join @TT as T2
where T1.groupid = T2.groupid
group by T1.t, T1.groupid, T1.d
order by groupid, sum(T1.Activ)
Now you are getting close, except for the fact that no CROSS JOIN is needed at all.
-- #5 remove the cross join
select T1.groupid, T1.t, T1.d, sum(T1.activ) as SumActiv
from @TT as T1
group by T1.t, T1.groupid, T1.d
When I remove the cross join portion of the query I get the exact same result. I think we finally have what is wanted, with the possible exception of removing all but the first row for each combination of GroupID and d.
If I have two tables such as this:
CREATE TABLE #table1 (id INT, name VARCHAR(10))
INSERT INTO #table1 VALUES (1,'John')
INSERT INTO #table1 VALUES (2,'Alan')
INSERT INTO #table1 VALUES (3,'Dave')
INSERT INTO #table1 VALUES (4,'Fred')
CREATE TABLE #table2 (id INT, name VARCHAR(10))
INSERT INTO #table2 VALUES (1,'John')
INSERT INTO #table2 VALUES (3,'Dave')
INSERT INTO #table2 VALUES (5,'Steve')
And I want to see all rows which only appear in one of the tables, what would be the best way to go about this?
All I can think of is to either do:
SELECT * from #table1 except SELECT * FROM #table2
UNION
SELECT * from #table2 except SELECT * FROM #table1
Or something along the lines of:
SELECT id,MAX(name) as name FROM
(
SELECT *,1 as count from #table1 UNION ALL
SELECT *,1 as count from #table2
) data
group by id
HAVING SUM(count) =1
Which would return Alan,Fred and Steve in this case.
But these feel really clunky - is there a more efficient way of approaching this?
select coalesce(t1.id, t2.id) id,
coalesce(t1.name, t2.name) name
from #table1 t1
full outer join #table2 t2
on t1.id = t2.id
where t1.id is null
or t2.id is null
The full outer join guarantees records from both sides of the join. Any record that is not present on both sides (the ones you are looking for) will have NULL on one side or the other. That's why we filter for NULL.
The COALESCE is there to guarantee that the non-NULL value will be displayed.
Finally, it's worth highlighting that duplicates are detected by ID. If you also want to compare by name, add name to the JOIN condition. If you only want to compare by name, join by name only. This solution (using JOIN) gives you that flexibility.
BTW, since you provided the CREATE and INSERT code, I actually ran it, and the code above is fully working.
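The effect of joining by ID only versus by ID and name can be checked with a small sketch. It runs against SQLite through Python's sqlite3 module, and because older SQLite versions lack FULL OUTER JOIN, the join is emulated here with two LEFT JOINs combined by UNION ALL. The extra row (4, 'Freddy') in table2 and the helper only_in_one are hypothetical additions to the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INT, name TEXT);
CREATE TABLE table2 (id INT, name TEXT);
INSERT INTO table1 VALUES (1, 'John'), (2, 'Alan'), (3, 'Dave'), (4, 'Fred');
-- (4, 'Freddy') is a hypothetical row: same id as Fred, different name
INSERT INTO table2 VALUES (1, 'John'), (3, 'Dave'), (5, 'Steve'), (4, 'Freddy');
""")


def only_in_one(join_condition):
    """Rows with no partner in the other table under join_condition.

    join_condition is a trusted SQL fragment over the aliases t1 and t2.
    """
    sql = f"""
    -- rows of table1 with no partner in table2
    SELECT t1.id AS id, t1.name AS name
    FROM table1 AS t1
    LEFT JOIN table2 AS t2 ON {join_condition}
    WHERE t2.id IS NULL
    UNION ALL
    -- rows of table2 with no partner in table1
    SELECT t2.id AS id, t2.name AS name
    FROM table2 AS t2
    LEFT JOIN table1 AS t1 ON {join_condition}
    WHERE t1.id IS NULL
    ORDER BY id, name
    """
    return conn.execute(sql).fetchall()


by_id = only_in_one("t1.id = t2.id")
by_id_and_name = only_in_one("t1.id = t2.id AND t1.name = t2.name")
```

Joined by ID only, Fred and Freddy count as the same record and only Alan and Steve are reported. With name added to the condition, both Fred and Freddy show up as unmatched.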
You can use EXCEPT and INTERSECT:
-- All rows
SELECT * FROM #table1
UNION
SELECT * FROM #table2
EXCEPT -- except
(
-- those in both tables
SELECT * FROM #table1
INTERSECT
SELECT * FROM #table2
)
Not sure if this is any better than your EXCEPT and UNION example...
select id, name
from
(select *, count(*) over(partition by checksum(*)) as cc
from (select *
from #table1
union all
select *
from #table2
) as T
) as T
where cc = 1
This seems so simple, but I just can't figure it out. I want to simply join 2 tables together. I don't care which values are paired with which. Using TSQL, here is an example:
declare @tbl1 table(id int)
declare @tbl2 table(id int)
insert @tbl1 values(1)
insert @tbl1 values(2)
insert @tbl2 values(3)
insert @tbl2 values(4)
insert @tbl2 values(5)
select * from @tbl1, @tbl2
This returns 6 rows, but what kind of query will generate this (just slap the tables side-by-side):
1 3
2 4
null 5
You can give each table row numbers and then join on the row numbers:
WITH
Table1WithRowNumber as (
select row_number() over (order by id) as RowNumber, id from Table1
),
Table2WithRowNumber as (
select row_number() over (order by id) as RowNumber, id from Table2
)
SELECT Table1WithRowNumber.Id, Table2WithRowNumber.Id as Id2
FROM Table1WithRowNumber
FULL OUTER JOIN Table2WithRowNumber ON Table1WithRowNumber.RowNumber = Table2WithRowNumber.RowNumber
Edit: Modified to use FULL OUTER JOIN, so you get all rows (with nulls).
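Here is the row-number pairing run on the values from the question, as a sketch against SQLite (3.25 or later for ROW_NUMBER) through Python's sqlite3 module. The table names tbl1 and tbl2 and the CTE names a and b are stand-ins, and FULL OUTER JOIN is emulated with LEFT JOIN plus UNION ALL because older SQLite versions do not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl1 (id INT);
CREATE TABLE tbl2 (id INT);
INSERT INTO tbl1 VALUES (1), (2);
INSERT INTO tbl2 VALUES (3), (4), (5);
""")

SIDE_BY_SIDE_SQL = """
WITH
a AS (SELECT ROW_NUMBER() OVER (ORDER BY id) AS rn, id FROM tbl1),
b AS (SELECT ROW_NUMBER() OVER (ORDER BY id) AS rn, id FROM tbl2)
-- every row of a, with its partner from b if there is one
SELECT a.id AS id1, b.id AS id2, a.rn AS rn
FROM a LEFT JOIN b ON a.rn = b.rn
UNION ALL
-- the rows of b that have no partner in a
SELECT a.id, b.id, b.rn
FROM b LEFT JOIN a ON a.rn = b.rn
WHERE a.rn IS NULL
ORDER BY rn
"""

# Drop the helper row number and keep just the two id columns.
side_by_side = [(id1, id2) for id1, id2, _ in conn.execute(SIDE_BY_SIDE_SQL)]
```

The shorter table is padded with NULL (None in Python), which gives the three rows asked for in the question.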
Use Cross Join
Select * From tableA Cross Join TableB
But understand that you will get a row in the output for every combination of a row in TableA with a row in TableB.
So if TableA has 8 rows and TableB has 4 rows, you will get 32 rows of data.
If you want fewer than that, you have to specify some join criteria that filter out the extra rows from the output.
Well, this will work:
Select A.ID, B.ID From
(SELECT ROW_NUMBER () OVER (ORDER BY ID) AS RowNumber, ID FROM Tbl2 ) A
full outer join
(SELECT ROW_NUMBER () OVER (ORDER BY ID) AS RowNumber, ID FROM Tbl1 ) B
on (A.RowNumber=B.RowNumber)
The SQL1 cross join applies here also.
Select *
From tableA, TableB