SQL Case Condition On Inner Join - sql

I am trying to join a table to itself on its email column to check whether two or more Ids exist for one email.
I then want to query the table with a case condition: if the count of the email in the nested query is greater than 1, select the latest modified record in the outer table.
SELECT *
FROM table1 -- outer table
WHERE email IN
(SELECT email, COUNT(*)
FROM table1 as src
INNER JOIN table1 ON src.Email = table1.Email AND src.Id = table1.id
GROUP BY src.Email)
How can I write a query to say if the count for the given email is greater than 1 then select the latest record from the outer table?

Why would you go through all that trouble? How about just selecting the last modified record:
select t1.*
from table1 t1
where t1.modified_dt = (select max(tt1.modified_dt)
from table1 tt1
where tt1.email = t1.email
);

Another way to do it using window functions:
DECLARE @Tab TABLE (ID INT, Email VARCHAR(100), LastModified DATE)
INSERT @Tab
VALUES (1,'testemail@none.com','2019-12-01'),
(2,'testemail@none.com','2019-11-19'),
(3,'otheremail@none.com','2019-12-15')
SELECT *
FROM(
SELECT ROW_NUMBER() OVER(PARTITION BY t.Email ORDER BY t.LastModified DESC) rn, t.*
FROM @Tab t
) t2
WHERE t2.rn = 1

If by latest you mean the latest id number (the maximum number) then this should help you
With cte AS
(
SELECT email,
COUNT(id) OVER (PARTITION BY email) AS CountOfIDs,
ROW_NUMBER() OVER (PARTITION BY email ORDER BY ID DESC) AS IdIndex
FROM table1
)
SELECT *
FROM cte
WHERE CountOfIDs > 1 AND IdIndex = 1
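To check the logic of the window-function answers, here is a minimal sketch that runs the same idea against SQLite through Python's sqlite3 module (an assumption made only so the example is self-contained; it needs SQLite 3.25 or later for window functions, and the original answers target SQL Server). The table layout and the column name last_modified are illustrative; the rows are the sample rows from the answer above.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, email TEXT, last_modified TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "testemail@none.com", "2019-12-01"),
        (2, "testemail@none.com", "2019-11-19"),
        (3, "otheremail@none.com", "2019-12-15"),
    ],
)

# Count the rows per email and rank them by modification date,
# then keep the newest row of every email that occurs more than once.
query = """
WITH ranked AS (
    SELECT id, email, last_modified,
           COUNT(*) OVER (PARTITION BY email) AS email_count,
           ROW_NUMBER() OVER (PARTITION BY email
                              ORDER BY last_modified DESC) AS rn
    FROM table1
)
SELECT id, email, last_modified
FROM ranked
WHERE email_count > 1 AND rn = 1
"""
latest_duplicates = conn.execute(query).fetchall()
print(latest_duplicates)
```

Only the newest row for testemail@none.com comes back; otheremail@none.com is dropped because it has a single Id. Removing the email_count condition returns the latest row for every email instead.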

Related

Query to pull the data from 3 tables based on latest load date and HashKey

I am trying to write a SQL query to pull data from 3 tables using JOINs on a common HashKey, and I want to take all the updated records from the 3rd table based on the load date (the most recent records).
I have tried the SQL query below, but I am not able to get the most recent record from the third table.
SELECT
tab1.TennisID
, tab1.TennisHashKey
, tab3.LoadDate
, tab2.TennisType
, tab3.Clicks
, tab3.Hit
, tab3.Likes
FROM table1 tab1
LEFT JOIN table2 tab2
ON tab1.TennisHashKey = tab2.TennisHashKey
LEFT JOIN (SELECT * FROM Table3 WHERE LoadDate = (SELECT TOP 1 LoadDate FROM Table3 ORDER BY LoadDate DESC)) tab3
ON tab2.TennisHashKey = tab3.TennisHashKey
Table1 and Table2 have a matching number of records, but Table3 has multiple rows for the same hash key, based on LoadDate.
Please provide your suggestion on this.
Thanks
Use ROW_NUMBER() to join only the most recent row from Table3.
SELECT
tab1.TennisID
, tab1.TennisHashKey
, tab3.LoadDate
, tab2.TennisType
, tab3.Clicks
, tab3.Hit
, tab3.Likes
FROM table1 tab1
LEFT JOIN table2 tab2
ON tab1.TennisHashKey = tab2.TennisHashKey
LEFT JOIN (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY TennisHashKey ORDER BY LoadDate DESC) rn
FROM Table3
) tab3
ON tab2.TennisHashKey = tab3.TennisHashKey
AND tab3.rn = 1;
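The difference between filtering on the single newest LoadDate in the whole table and ranking rows per hash key can be seen in a small runnable sketch. It uses SQLite through Python's sqlite3 module (an assumption for portability; SQLite 3.25+ is needed for ROW_NUMBER), and the sample rows are hypothetical: key B001 was last loaded earlier than key A001, so a global "latest LoadDate" filter would lose it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (TennisID TEXT, TennisHashKey TEXT);
CREATE TABLE table2 (TennisHashKey TEXT, TennisType TEXT);
CREATE TABLE table3 (TennisHashKey TEXT, LoadDate TEXT,
                     Clicks INTEGER, Hit INTEGER, Likes INTEGER);
INSERT INTO table1 VALUES ('A', 'A001'), ('B', 'B001');
INSERT INTO table2 VALUES ('A001', 'grass'), ('B001', 'clay');
-- Hypothetical loads: A001 was loaded twice, B001 once and earlier.
INSERT INTO table3 VALUES
    ('A001', '2020-01-01', 0, 0, 0),
    ('A001', '2020-01-02', 1, 1, 1),
    ('B001', '2019-12-31', 5, 5, 5);
""")

# Rank Table3 rows per hash key and join only the newest one (rn = 1).
query = """
SELECT tab1.TennisID, tab1.TennisHashKey, tab3.LoadDate, tab2.TennisType,
       tab3.Clicks, tab3.Hit, tab3.Likes
FROM table1 tab1
LEFT JOIN table2 tab2
       ON tab1.TennisHashKey = tab2.TennisHashKey
LEFT JOIN (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY TennisHashKey
                              ORDER BY LoadDate DESC) AS rn
    FROM table3
) tab3
       ON tab2.TennisHashKey = tab3.TennisHashKey
      AND tab3.rn = 1
ORDER BY tab1.TennisID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each key keeps its own most recent row, so B001 still returns its 2019-12-31 load rather than NULLs.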
Another approach: use OUTER APPLY to fetch only the latest row for each hash key.
declare @table1 table(tennisid char(1), tennishashkey char(4),loaddate date)
declare @table2 table(tennishashkey char(4),tennistype char(10), loaddate date)
declare @table3 table(tennishashkey char(4),loaddate date,clicks int, hit int, likes int)
insert into @table1 values('A','A001','2020-01-01')
insert into @table2 values('A001','grass','2020-01-01')
insert into @table3 values('A001','2020-01-01',0,0,0),('A001','2020-01-01',1,1,1);
SELECT
tab1.TennisID
, tab1.TennisHashKey
, tab3.LoadDate
, tab2.TennisType
, tab3.Clicks
, tab3.Hit
, tab3.Likes
FROM @table1 tab1
LEFT JOIN @table2 tab2
ON tab1.TennisHashKey = tab2.TennisHashKey
OUTER APPLY (
SELECT TOP 1 *
FROM @table3 as tab3
where tab3.tennishashkey = tab1.tennishashkey
order by loaddate desc
) tab3
TennisID | TennisHashKey | LoadDate   | TennisType | Clicks | Hit | Likes
A        | A001          | 2020-01-01 | grass      | 1      | 1   | 1

Select rows with a duplicate ID but different value in another column

I have a table like this
I would like to select the Itemid that occurs more than once with a different Rate with group by Masterid
The output should be something like:
You might try the following:
SELECT masterid, detailid, itemid, rate FROM mytable
WHERE (masterid, detailid, rate) IN
(
SELECT t.masterid, t.detailid, t.rate FROM mytable t
JOIN mytable o ON o.masterid = t.masterid
AND o.detailid = t.detailid AND o.rate <> t.rate
GROUP BY t.masterid, t.detailid, t.rate
HAVING COUNT(*) >= 2
)
The inner join within the sub-query ensures that only rows with an unequal counterpart appear. Alternatively, you might add another sub-query condition to the outer query:
AND EXISTS
(
SELECT * FROM mytable o
WHERE o.masterid = t.masterid AND o.detailid = t.detailid AND o.rate <> t.rate
)
I believe you are looking for a query like below
select t1.* from t t1
join
(
select masterid,itemid
from t
group by masterid,itemid
having count(distinct rate )>1
)t2
on t1.masterid=t2.masterid and t1.itemid=t2.itemid
order by masterid,detailid
and here's a working db fiddle
Try following code:
Select masterid, detailid, rate, count(*) as count from Mytable
group by masterid, detailid, rate
having count(*) > 1
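The COUNT(DISTINCT rate) approach can be tried out with a short sketch. The question's sample table did not survive, so the rows below are hypothetical, as is the choice of SQLite via Python's sqlite3 module; the column names follow the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (masterid INTEGER, detailid INTEGER, "
    "itemid INTEGER, rate INTEGER)"
)
# Hypothetical rows, not taken from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, 1, 100, 10),  # item 100 appears twice in master 1 ...
        (1, 2, 100, 12),  # ... with two different rates
        (1, 3, 200, 5),   # item 200 appears twice in master 1 ...
        (1, 4, 200, 5),   # ... but always with the same rate
        (2, 5, 100, 10),  # item 100 appears only once in master 2
    ],
)

# Find (masterid, itemid) pairs with more than one distinct rate,
# then join back to list the detail rows for those pairs.
query = """
SELECT t1.masterid, t1.detailid, t1.itemid, t1.rate
FROM mytable t1
JOIN (
    SELECT masterid, itemid
    FROM mytable
    GROUP BY masterid, itemid
    HAVING COUNT(DISTINCT rate) > 1
) t2
  ON t1.masterid = t2.masterid AND t1.itemid = t2.itemid
ORDER BY t1.masterid, t1.detailid
"""
conflicting = conn.execute(query).fetchall()
print(conflicting)
```

Only the two rows for item 100 in master 1 are returned. Item 200 is excluded because its repeated rows share one rate, which is the case a plain COUNT(*) > 1 would not distinguish.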

insert into temp table using Cross Apply?

I want to create a temp table and insert values based on a select. The query doesn't execute. What am I missing? I eventually want to loop through the temp table.
Create Table #temp (ID varchar(25),Source_Id varchar(25),Processed varchar(25), Status varchar(25),Time_Interval_Min varchar(25))
Insert into #temp
Select t.*
From
(SELECT DISTINCT source_id
FROM Activity_WorkLoad) t1
CROSS APPLY
(
SELECT TOP 1
aw.ID,
Source_Id
,Processed
,Status
,Time_Interval_Min
FROM [dbSDS].[dbo].[Activity_WorkLoad] aw
JOIN [dbSDS].[dbo].[SDA_Schedule_Time] st ON aw.SDA_Resource_ID = st.ID
WHERE aw.Source_Id = t1.Source_Id AND aw.Status = 'Queued'
ORDER BY Processed DESC
)t
When you cross apply, you still need an alias:
Insert into #temp (id, source_id, processed, status, time_interval_min)
Select tt.*
From (SELECT DISTINCT source_id
FROM Activity_WorkLoad
) t CROSS APPLY
(SELECT TOP 1 aw.ID, Source_Id, Processed, Status, Time_Interval_Min
FROM [dbSDS].[dbo].[Activity_WorkLoad] aw JOIN
[dbSDS].[dbo].[SDA_Schedule_Time] st
ON aw.SDA_Resource_ID = st.ID
WHERE aw.Source_Id = t.Source_Id AND aw.Status = 'Queued'
ORDER BY Processed DESC
) tt;
I also assume that you want results from the second subquery, not the first, because the first does not have enough columns.

SQL query to merge 2 tables with additional conditions?

I have 2 identical tables: user_id, name, age, date_added.
USER_ID column may contain multiple duplicate IDs.
Need to merge those 2 tables into 1 with the following condition.
If there are multiple records with identical 'name' for the same user then need to keep only the LATEST (by date_added) record.
This script will be used with MSSQL 2005, but would also appreciate if somebody comes up with version that does not use ROW_NUMBER(). Need this script to reload a broken table once, performance is not critical.
example:
table1:
1,'john',21,01/01/2010
1,'john',15,01/01/2005
1,'john',71,01/01/2001
table2:
1,'john',81,01/01/2007
1,'john',15,01/01/2005
1,'john',11,01/01/2008
result:
1,'john',21,01/01/2010
UPDATE:
I think that I've found my own solution. It is based on an answer for my previous question given by Larry Lustig and Joe Stefanelli.
with tmp2 as
(
SELECT * FROM table1
UNION
SELECT * FROM table2
)
SELECT * FROM tmp2 c1
WHERE (SELECT COUNT(*) FROM tmp2 c2
WHERE c2.user_id = c1.user_id AND
c2.name = c1.name AND
c2.date_added >= c1.date_added) <= 1
Could you please help me to convert this query to the one without 'WITH' clause?
Here's a variant of @Andomar's answer:
; with all_users as
(
select *
from table1 u1
union all
select *
from table2 u2
)
, ranker as (
select *,
rank() over (partition by user_id, name order by date_added desc) as [r]
from all_users
)
select * from ranker where [r] = 1
Just in the interests of giving a different approach...
WITH distinctlist
As (SELECT user_id,
name
FROM table1
UNION
SELECT user_id,
name
FROM table2)
SELECT C.*
FROM distinctlist d
CROSS APPLY (SELECT TOP 1 *
FROM (SELECT *
FROM (SELECT TOP 1 *
FROM table1
WHERE user_id = d.user_id
AND name = d.name
ORDER BY date_added DESC) T1
UNION ALL
SELECT *
FROM (SELECT TOP 1 *
FROM table2
WHERE user_id = d.user_id
AND name = d.name
ORDER BY date_added DESC) T2) T
ORDER BY date_added DESC) C
You could use not exists, like:
; with all_users as
(
select *
from table1 u1
union all
select *
from table2 u2
)
select *
from all_users u1
where not exists
(
select *
from all_users u2
where u1.user_id = u2.user_id
and u1.name = u2.name
and u1.date_added < u2.date_added
)
If the database doesn't support CTEs, expand all_users in the two places it is used.
P.S. If there are only three columns, and no more, you could use an even simpler solution:
select name
, MAX(date_added)
from (
select *
from table1 u1
union all
select *
from table2 u2
) sub
group by
name
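For the request to avoid both ROW_NUMBER() and the WITH clause, a minimal sketch of the NOT EXISTS idea with the union expanded inline looks like this. It runs on SQLite through Python's sqlite3 module purely so it can be executed anywhere (an assumption; the question targets MSSQL 2005), and the dates from the question's example are rewritten as ISO strings so that text comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (user_id INTEGER, name TEXT, age INTEGER, date_added TEXT);
CREATE TABLE table2 (user_id INTEGER, name TEXT, age INTEGER, date_added TEXT);
INSERT INTO table1 VALUES
    (1, 'john', 21, '2010-01-01'),
    (1, 'john', 15, '2005-01-01'),
    (1, 'john', 71, '2001-01-01');
INSERT INTO table2 VALUES
    (1, 'john', 81, '2007-01-01'),
    (1, 'john', 15, '2005-01-01'),
    (1, 'john', 11, '2008-01-01');
""")

# The merged set is written out twice instead of being named in a CTE.
merged = "(SELECT * FROM table1 UNION SELECT * FROM table2)"

# Keep a row only if no row with the same user_id and name is newer.
query = f"""
SELECT u1.user_id, u1.name, u1.age, u1.date_added
FROM {merged} u1
WHERE NOT EXISTS (
    SELECT 1
    FROM {merged} u2
    WHERE u2.user_id = u1.user_id
      AND u2.name = u1.name
      AND u2.date_added > u1.date_added
)
"""
latest = conn.execute(query).fetchall()
print(latest)
```

With the question's data this leaves the single 2010 record. If two rows tie on the newest date_added for the same user and name, both are kept, so a tie-breaker would be needed in that case.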

SQL Select Distinct with Conditional

Table1 has columns (id, a, b, c, group). There are several rows that have the same group, but id is always unique. I would like to SELECT group,a,b FROM Table1 WHERE the group is distinct. However, I would like the returned data to be from the row with the greatest id for that group.
Thus, if we have the rows
(id=10, a=6, b=40, c=3, group=14)
(id=5, a=21, b=45, c=31, group=230)
(id=4, a=42, b=65, c=2, group=230)
I would like to return these 2 rows:
[group=14, a=6,b=40] and
[group=230, a=21,b=45] (because id=5 > id=4)
Is there a simple SELECT statement to do this?
Try:
select grp, a, b
from table1 where id in
(select max(id) from table1 group by grp)
You can do it using a self join or an inner-select. Here's inner select:
select `group`, a, b from Table1 AS T1
where id=(select max(id) from Table1 AS T2 where T1.`group` = T2.`group`)
And self-join method:
select T1.`group`, T2.a, T2.b from
(select max(id) as id,`group` from Table1 group by `group`) T1
join Table1 as T2 on T1.id=T2.id
2 selects, your inner select gets:
SELECT MAX(id) FROM YourTable GROUP BY [GROUP]
Your outer select joins to this table.
Think about it logically: the inner select gets a subset of the data you need.
The outer select inner joins to this subset and can get further data.
SELECT [group], a, b FROM YourTable INNER JOIN
(SELECT MAX(id) AS id FROM YourTable GROUP BY [group]) t
ON t.id = YourTable.id
SELECT mi.*
FROM (
SELECT DISTINCT grouper
FROM mytable
) md
JOIN mytable mi
ON mi.id =
(
SELECT id
FROM mytable mo
WHERE mo.grouper = md.grouper
ORDER BY
id DESC
LIMIT 1
)
If your table is MyISAM or id is not a PRIMARY KEY, then make sure you have a composite index on (grouper, id).
If your table is InnoDB and id is a PRIMARY KEY, then a simple index on grouper will suffice (id, being a PRIMARY KEY, will be implicitly included).
This will use an INDEX FOR GROUP-BY to build the list of distinct groupers, and for each grouper it will use the index access to find the maximal id.
Don't know how to do it in mysql. But the following code will work for MsSQL...
SELECT Y.* FROM
(
SELECT DISTINCT [group], MAX(id) ID
FROM Table1
GROUP BY [group]
) X
INNER JOIN Table1 Y ON X.ID = Y.ID
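The MAX(id)-per-group idea shared by these answers can be checked against the three rows from the question. This is a minimal sketch on SQLite via Python's sqlite3 module (an assumption for portability); the column is named grp, as in the first answer, because group is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, "
    "c INTEGER, grp INTEGER)"
)
# The three rows given in the question.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?)",
    [
        (10, 6, 40, 3, 14),
        (5, 21, 45, 31, 230),
        (4, 42, 65, 2, 230),
    ],
)

# The subquery yields the greatest id of every group; because id is
# unique, matching on id alone picks exactly one row per group.
query = """
SELECT grp, a, b
FROM table1
WHERE id IN (SELECT MAX(id) FROM table1 GROUP BY grp)
ORDER BY grp
"""
result = conn.execute(query).fetchall()
print(result)
```

Group 230 returns a=21, b=45 because id 5 is greater than id 4, which matches the output the question asks for.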