Using Top in SQL Server 2012 - sql

I have a table Contacts, parent to table Activity. I would like to select the latest activity for each contact, but I am getting more than one row per contact.
This is my query:
select top 30
*
from
Contacts o, Activity d
where
o.ID = d.contact
and d.ID > 401061
and Last_Action is null
order by
d.activity_date desc
I think I need TOP, but I'm not sure how to implement it here. Any help would be appreciated.

You can use row_number() to number each contact's activities. In an outer query, you can filter down to only the latest activity per contact:
select top 30 *
from (
select row_number() over (
partition by o.ID
order by d.activity_date desc) as rn
, *
from Contacts o
join Activity d
on o.ID = d.contact
where d.ID > 401061
and Last_Action is null
) as SubQueryAlias
where rn = 1 -- Only last activity per contact
order by
activity_date desc

Here's a way using not exists that will work on most databases (apart from the top syntax). You're basically selecting each activity per contact where a newer activity does not exist (therefore it's the latest activity).
select top 30 * from activity a
join Contacts c on c.id = a.contact
where not exists (
select 1 from activity b
where b.contact = a.contact
and b.activity_date > a.activity_date
) and last_action is null and a.id > 401061
order by a.activity_date desc
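The not exists pattern above can be tried out with a minimal sketch like the following. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so LIMIT replaces TOP, and the column list and sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (hypothetical sample data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contacts (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Activity (ID INTEGER PRIMARY KEY, contact INTEGER, activity_date TEXT);
    INSERT INTO Contacts VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Activity VALUES
        (10, 1, '2013-01-01'),
        (11, 1, '2013-03-01'),
        (12, 2, '2013-02-01');
""")

# Keep an activity only if no newer activity exists for the same contact.
latest = conn.execute("""
    SELECT c.name, a.ID, a.activity_date
    FROM Activity a
    JOIN Contacts c ON c.ID = a.contact
    WHERE NOT EXISTS (
        SELECT 1 FROM Activity b
        WHERE b.contact = a.contact
          AND b.activity_date > a.activity_date
    )
    ORDER BY a.activity_date DESC
    LIMIT 30
""").fetchall()

print(latest)  # [('Ann', 11, '2013-03-01'), ('Bob', 12, '2013-02-01')]
```

One thing to watch: filters placed only in the outer WHERE do not restrict the subquery. If a contact's newest activity fails the outer filter, that contact drops out entirely rather than falling back to an older activity. If that is not what you want, repeat the filters inside the not exists subquery.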

This is slight guesswork as I can't see what your tables look like in terms of columns, but perhaps this (or something like it) would work?
select top 30 * from Contacts o,
(SELECT contact, max(activity_date) AS activity_date FROM Activity WHERE ID > 401061 GROUP BY contact) d
where o.ID = d.contact
and Last_Action is null order by d.activity_date desc

I think you want to use a subquery:
SELECT TOP 30 *
FROM
Contacts AS o,
( SELECT
contact,
MAX( activity_date ) AS activity_date
FROM
Activity
WHERE
ID > 401061 AND
Last_Action IS NULL
GROUP BY
contact
) AS d
WHERE
o.ID = d.contact
ORDER BY
d.activity_date DESC

Assuming you want the top 30 actions per contact, this is the exact kind of thing CROSS APPLY was invented for.
Something like the following - uncertainties are because I can't see an example of your data.
select
*
from
contacts
cross apply (
select top 30
*
from
activity
where
contacts.id = activity.contact
and 401061 < activity.id
) as _ca
where
last_action is null -- Perhaps you could move this into the CA - but we don't know which table it's from
order by
_ca.activity_date desc;
Edit: if you want only the latest activity per contact instead:
select top 30
*
from
contacts
cross apply (
select top 1
*
from
activity
where
contacts.id = activity.contact
and 401061 < activity.id
order by
activity.activity_date desc
) as _ca
where
last_action is null -- assuming this is in table contacts
order by
_ca.activity_date desc;

Using a CTE:
WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY contact ORDER BY activity_date DESC) rn
FROM Activity
WHERE ID > 401061 AND Last_Action IS NULL
)
SELECT TOP 30 *
FROM Contacts o
JOIN cte d ON o.Id = d.contact
WHERE d.rn = 1
ORDER BY d.activity_date DESC

Related

How optimize select with max subquery on the same table?

We have many old selects like this:
SELECT
tm."ID",tm."R_PERSONES",tm."R_DATASOURCE",tm."MATCHCODE",
d.NAME AS DATASOURCE,
p.PDID
FROM TABLE_MAPPINGS tm,
PERSONES p,
DATASOURCES d,
(select ID
from TABLE_MAPPINGS
where (R_PERSONES, MATCHCODE)
in (select
R_PERSONES, MATCHCODE
from TABLE_MAPPINGS
where
id in (select max(id)
from TABLE_MAPPINGS
group by MATCHCODE)
)
) tm2
WHERE tm.R_PERSONES = p.ID
AND tm.R_DATASOURCE=d.ID
and tm2.id = tm.id;
These are large tables, and queries take a long time.
How can we rewrite them to run faster?
Thank you
You can query the table only once using something like (untested as you have not provided a minimal example of your create table statements or sample data):
SELECT *
FROM (
SELECT m.*,
COUNT(CASE WHEN rnk = 1 THEN 1 END)
OVER (PARTITION BY r_persones, matchcode) AS has_max_id
FROM (
SELECT tm.ID,
tm.R_PERSONES,
tm.R_DATASOURCE,
tm.MATCHCODE,
d.NAME AS DATASOURCE,
p.PDID,
RANK() OVER (PARTITION BY tm.matchcode ORDER BY tm.id DESC) As rnk
FROM TABLE_MAPPINGS tm
INNER JOIN PERSONES p ON tm.R_PERSONES = p.ID
INNER JOIN DATASOURCES d ON tm.R_DATASOURCE = d.ID
) m
)
WHERE has_max_id > 0;
This first finds the maximum ID per matchcode using the RANK analytic function, and then finds all rows sharing an r_persones, matchcode pair with one of those maximums, using conditional aggregation in a COUNT analytic function.
Note: use the RANK or DENSE_RANK analytic functions to match the maximums, as they can place multiple rows per partition first, whereas ROW_NUMBER will only ever put a single row per partition first.
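The difference between RANK and ROW_NUMBER on tied values can be seen in a small sketch. It runs on Python's built-in sqlite3 module (window functions need SQLite 3.25 or later) rather than Oracle, and the scores table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (grp TEXT, val INTEGER);
    -- Group 'a' has two rows tied for the maximum value.
    INSERT INTO scores VALUES ('a', 5), ('a', 5), ('a', 3), ('b', 7);
""")

rows = conn.execute("""
    SELECT grp, val,
           RANK()       OVER (PARTITION BY grp ORDER BY val DESC) AS rnk,
           ROW_NUMBER() OVER (PARTITION BY grp ORDER BY val DESC) AS rn
    FROM scores
""").fetchall()

# RANK gives both tied rows in group 'a' the value 1;
# ROW_NUMBER numbers them 1 and 2, so only one of them survives a "= 1" filter.
top_by_rank = [r for r in rows if r[2] == 1]
top_by_row_number = [r for r in rows if r[3] == 1]

print(len(top_by_rank), len(top_by_row_number))  # 3 2
```

Which of the tied rows ROW_NUMBER picks is arbitrary unless the ORDER BY includes a tie-breaker.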
You're querying table_mappings 3 times; how about doing it only once?
WITH
tab_map
AS
(SELECT a.id,
a.r_persones,
a.matchcode,
a.r_datasource,
ROW_NUMBER ()
OVER (PARTITION BY a.matchcode ORDER BY a.id DESC) rn
FROM table_mappings a)
SELECT tm.id,
tm.r_persones,
tm.matchcode,
d.name AS datasource,
p.pdid
FROM tab_map tm
JOIN persones p ON p.id = tm.r_persones
JOIN datasources d ON d.id = tm.r_datasource
WHERE tm.rn = 1

Using SQL how can I write a query to find the top 5 per category per month?

I am trying to get the Top 5 rows with the highest number for each category for a specific time interval such as a month. What I currently have returns 5 of the exact same descriptions for a category. I am trying to get the top five. This only happens when I try to sort it based on a time period.
WITH CustomerRank
AS
(SELECT
Count(*) AS "Count",
d.Item,
d.Description,
Name,
i.Type,
d.CreatedOn
FROM [dbo].i,
d,
dbo.b,
as,
a,
c
WHERE d.Inspection_Id = i.Id AND d.Inspection_Id = i.Id AND
b.Id = i.BuildingPart_Id AND b.as= Assessments.Id
AND as.Application_Id = a.Id AND a.Customer_Id = Customers.Id
group by d.Item, d.Description, Name, i.Type, d.CreatedOn
)
select * from (
SELECT "Count",Item,Description,Type,ROW_NUMBER() Over (PARTITION BY Name order by "Count" desc) AS RowNum, Name, CreatedOn
FROM CustomerRank
where CreatedOn > '2017-1-1 00:00:00'
) s where RowNum <6
Cheers
Try something like this:
WITH CustomerRank
AS
(SELECT
Count(*) AS "Count",
d.Item, d.Description, Name, i.Type
FROM dbo.Inspection i
INNER JOIN dbo.Details d ON d.Inspection_Id = i.Id
INNER JOIN dbo.BuildingParts b ON b.Id = i.BuildingPart_Id
INNER JOIN dbo.Assessments a ON a.Id = b.AssessmentId
INNER JOIN dbo.Applications ap ON ap.Id = a.Application_Id
INNER JOIN dbo.Customers c ON c.Id = ap.Customer_Id
where d.CreatedOn > '2017-1-1 00:00:00'
group by d.Item, d.Description, Name, i.Type
)
select * from (
SELECT "Count",Item,Description,Type,ROW_NUMBER() Over (PARTITION BY Name order by "Count" desc) AS RowNum, Name
FROM CustomerRank
) s where RowNum <6
The idea is that the CreatedOn column must be removed from the GROUP BY clause; if you keep it there, you get a separate row for each distinct CreatedOn value, so the counts are never combined. The date filter moves into the WHERE clause of the CTE instead.
Also, it's better to use explicit JOINs and an alias for each table.
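The effect of grouping by a timestamp column can be reproduced with a small sketch. It uses Python's built-in sqlite3 module instead of SQL Server, and the details table with its rows is a hypothetical, simplified stand-in for the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE details (item TEXT, created_on TEXT);
    INSERT INTO details VALUES
        ('door', '2017-02-01 09:00:00'),
        ('door', '2017-02-01 09:05:00'),
        ('door', '2017-02-03 14:00:00'),
        ('roof', '2017-02-02 10:00:00');
""")

# Grouping by the timestamp as well: every distinct timestamp is its own group.
with_timestamp = conn.execute("""
    SELECT item, COUNT(*) FROM details
    WHERE created_on > '2017-01-01 00:00:00'
    GROUP BY item, created_on
    ORDER BY item, created_on
""").fetchall()

# Filtering on the timestamp but grouping only by item: counts are combined.
without_timestamp = conn.execute("""
    SELECT item, COUNT(*) FROM details
    WHERE created_on > '2017-01-01 00:00:00'
    GROUP BY item
    ORDER BY item
""").fetchall()

print(with_timestamp)     # [('door', 1), ('door', 1), ('door', 1), ('roof', 1)]
print(without_timestamp)  # [('door', 3), ('roof', 1)]
```

With the timestamp in the GROUP BY, a later "top 5 by count" step just sees many rows with a count of 1, which matches the repeated descriptions reported in the question.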

How to get latest DETAIL entry against the MASTER entry?

I have 2 tables
1. User Master
user_id, user_full_name, user_dob...so on
2. Login Details
login_id, login_user_id, login_time, login_date, logout_time
Problem
2nd table has n number of rows against User Master table id
I need to make a join but the condition is that it should show only last login data of the user
example
user_full_name, user_login, user_logout so on...
If you want the result for a single user, you could use a simple INNER JOIN combined with an ORDER BY and TOP 1:
SELECT TOP 1 user_full_name, login_time, login_date, logout_time
FROM Users INNER JOIN Logins ON
Users.user_id = Logins.user_id
WHERE
Users.user_id = @user_id
ORDER BY login_date DESC, login_time DESC
(See SQLFiddle)
If you want the result for all users, you could use CROSS APPLY:
SELECT user_full_name, l.*
FROM Users u CROSS APPLY (
SELECT TOP 1 login_time, login_date, logout_time
FROM Logins
WHERE
u.user_id = Logins.user_id
ORDER BY login_date DESC, login_time DESC
) l
(See SQLFiddle)
A common solution for this problem is to use the row_number window function and filter for rows with row number 1 in each partition (by user, ordered by date/time):
WITH UserDetails AS (
SELECT
*
, ROW_NUMBER() OVER (PARTITION BY login_user_id
ORDER BY login_date DESC, login_time DESC) AS RN
FROM LoginDetails
)
SELECT *
FROM UserMaster M
JOIN UserDetails D ON M.user_id = D.login_user_id
WHERE D.RN = 1;
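As a rough check of the row_number approach above, here is a sketch that runs the same query shape on Python's built-in sqlite3 module (SQLite 3.25 or later). The table and column names follow the question, but the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserMaster (user_id INTEGER PRIMARY KEY, user_full_name TEXT);
    CREATE TABLE LoginDetails (
        login_id INTEGER PRIMARY KEY, login_user_id INTEGER,
        login_date TEXT, login_time TEXT, logout_time TEXT);
    INSERT INTO UserMaster VALUES (1, 'Ann Lee'), (2, 'Bob Ray');
    INSERT INTO LoginDetails VALUES
        (1, 1, '2014-05-01', '08:00', '09:00'),
        (2, 1, '2014-05-02', '08:30', '10:00'),
        (3, 2, '2014-05-01', '07:00', '07:45'),
        (4, 2, '2014-05-01', '12:00', '13:00');
""")

# Number each user's logins from newest to oldest, then keep number 1.
last_logins = conn.execute("""
    WITH UserDetails AS (
        SELECT *, ROW_NUMBER() OVER (
                   PARTITION BY login_user_id
                   ORDER BY login_date DESC, login_time DESC) AS RN
        FROM LoginDetails
    )
    SELECT M.user_full_name, D.login_date, D.login_time, D.logout_time
    FROM UserMaster M
    JOIN UserDetails D ON M.user_id = D.login_user_id
    WHERE D.RN = 1
    ORDER BY M.user_id
""").fetchall()

for row in last_logins:
    print(row)
```

Bob has two logins on the same date, so ordering by login_time as well as login_date is what selects the 12:00 session rather than an arbitrary one.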
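You could try using a TOP 1 subquery inside the JOIN condition. Note that the join must also match on the user, otherwise a login by any user on that date would qualify:
SELECT a.user_id, a.user_full_name, b.login_id...
FROM UserMaster a INNER JOIN Logins b ON b.login_user_id = a.user_id
AND b.login_date =
(
SELECT TOP 1 login_date
FROM Logins
WHERE login_user_id = a.user_id
ORDER BY login_date DESC
)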

Highest Count with a group

I'm having an absolute brain fade
SELECT p.ProductCategory, f.ProductSubCategory, COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory, f.ProductSubCategory
ORDER BY 1,3 DESC
This shows me the count for each ProductSubCategory, I would like to see only the highest ProductSubCategory per ProductCategory.
I wish to see (I don't care about the Count value)
There are a couple of different ways to do this. One involves joining the results back to themselves and using the max aggregate. But since you are using SQL Server, you can use ROW_NUMBER to achieve the same result:
with cte as (
select p.productcategory, p.ProductSubCategory, COUNT(*) cnt,
ROW_NUMBER() over (partition by p.productcategory order by count(*) desc) rn
from products p
join sales s on p.ProductSubCategory = s.ProductSubCategory
group by p.productcategory, p.ProductSubCategory
)
select *
from cte
where rn = 1
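A sketch of the same idea, runnable on Python's built-in sqlite3 module (SQLite 3.25 or later), is shown below. It is a variation rather than the exact query above: the aggregation and the ranking are split into two CTEs, which gives the same result and avoids relying on a window function over COUNT(*) in a single SELECT. The sample products and sales are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductCategory TEXT, ProductSubCategory TEXT);
    CREATE TABLE Sales (ProductSubCategory TEXT);
    INSERT INTO Products VALUES
        ('Bikes', 'Road'), ('Bikes', 'Mountain'), ('Clothing', 'Gloves');
    INSERT INTO Sales VALUES
        ('Road'), ('Mountain'), ('Mountain'), ('Gloves');
""")

top_subcategories = conn.execute("""
    WITH counts AS (
        -- Step 1: sales count per category / subcategory.
        SELECT p.ProductCategory, p.ProductSubCategory, COUNT(*) AS cnt
        FROM Products p
        JOIN Sales s ON p.ProductSubCategory = s.ProductSubCategory
        GROUP BY p.ProductCategory, p.ProductSubCategory
    ), ranked AS (
        -- Step 2: number the subcategories within each category by count.
        SELECT *, ROW_NUMBER() OVER (
                   PARTITION BY ProductCategory ORDER BY cnt DESC) AS rn
        FROM counts
    )
    SELECT ProductCategory, ProductSubCategory, cnt
    FROM ranked
    WHERE rn = 1
    ORDER BY ProductCategory
""").fetchall()

print(top_subcategories)  # [('Bikes', 'Mountain', 2), ('Clothing', 'Gloves', 1)]
```

The partition is by ProductCategory only. Partitioning by the subcategory as well would put every row in its own partition and make every row number 1.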
You already have an answer, but please see the following code too. It may help you.
SELECT p.ProductCategory,
f.ProductSubCategory,
COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
JOIN (
SELECT p.ProductCategory,
f.ProductSubCategory,
ROW_NUMBER() OVER ( PARTITION BY p.ProductCategory
ORDER BY COUNT(*) DESC) [Row]
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory,
f.ProductSubCategory) Lu
ON p.ProductCategory = Lu.ProductCategory
AND f.ProductSubCategory = Lu.ProductSubCategory
WHERE Lu.[Row] = 1
GROUP BY p.ProductCategory,
f.ProductSubCategory

mysql query with double join

I have 3 tables, but I can only manage to join the count from one of them.
The query below works like a charm, but I need to add another "count" from another table.
There is a third table called "ci_nomatch", which contains a reference to ci_address_book.reference.
It can have multiple entries per reference (many-to-many), but I only need the count from that table.
So if ci_address_book has entries called "item1", "item2", "item3"
and ci_nomatch has "1,item1,user1", "2,item1,user4",
I would like the query to return "2" for item1.
Any ideas? I tried another join, but it tells me that the reference does not exist, while it does!
SELECT c.*, IFNULL(p.total, 0) AS matchcount
FROM ci_address_book c
LEFT JOIN (
SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id
) AS p
ON c.id=p.addressbook_id
ORDER BY matchcount DESC
LIMIT 0,15
You could subquery it directly in the select
SELECT c.*, IFNULL(p.total, 0) AS matchcount,
(SELECT COUNT(*) FROM ci_nomatch n WHERE n.reference = c.reference) AS othercount
FROM ci_address_book c
LEFT JOIN (
SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id
) AS p
ON c.id=p.addressbook_id
ORDER BY matchcount DESC
LIMIT 0,15
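To see how the two counts stay independent, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The table names come from the question; the column layout of ci_nomatch (other than reference) and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ci_address_book (id INTEGER PRIMARY KEY, reference TEXT);
    CREATE TABLE ci_matched_sanctions (match_id INTEGER PRIMARY KEY, addressbook_id INTEGER);
    -- checked_by is a hypothetical column name for the user value.
    CREATE TABLE ci_nomatch (id INTEGER PRIMARY KEY, reference TEXT, checked_by TEXT);
    INSERT INTO ci_address_book VALUES (1, 'item1'), (2, 'item2'), (3, 'item3');
    INSERT INTO ci_matched_sanctions VALUES (1, 1), (2, 1), (3, 1), (4, 2);
    INSERT INTO ci_nomatch VALUES (1, 'item1', 'user1'), (2, 'item1', 'user4');
""")

rows = conn.execute("""
    SELECT c.reference,
           IFNULL(p.total, 0) AS matchcount,
           -- Correlated subquery: counted per address book row, no join needed.
           (SELECT COUNT(*) FROM ci_nomatch n
             WHERE n.reference = c.reference) AS othercount
    FROM ci_address_book c
    LEFT JOIN (
        SELECT addressbook_id, COUNT(match_id) AS total
        FROM ci_matched_sanctions
        GROUP BY addressbook_id
    ) AS p ON c.id = p.addressbook_id
    ORDER BY matchcount DESC, c.id
    LIMIT 15
""").fetchall()

print(rows)  # [('item1', 3, 2), ('item2', 1, 0), ('item3', 0, 0)]
```

Because each count is computed separately (one in a pre-aggregated derived table, one in a correlated subquery), the three matches and two no-match rows for item1 do not multiply into six, which is what a plain double join followed by COUNT would produce.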
Updated for the comment: including an extra column "(matchcount - othercount) AS deducted" is best done by wrapping the query in a subquery.
SELECT *, matchcount - othercount AS deducted
FROM
(
SELECT c.* , IFNULL( p.total, 0 ) AS matchcount, (
SELECT COUNT( * ) FROM ci_falsepositives n
WHERE n.addressbook_id = c.reference ) AS othercount
FROM ci_address_book c
LEFT JOIN (
SELECT addressbook_id, COUNT( match_id ) AS total
FROM ci_matched_sanctions GROUP BY addressbook_id ) AS p
ON c.id = p.addressbook_id ORDER BY matchcount DESC LIMIT 0 , 15
) S