Select value from table based on conditions in other rows - sql

I have multiple rows with a parent ID that associates related rows. For each parent ID, I want to select the email address where Status = 'active', and if multiple rows meet that condition, I want to pick the most recent one (by createDate).
Basically I have two or more records with parent ID 111. The first record has m#x.com with a status of 'active', and the second record has m#y.com with a status of 'unsubscribed'. How do I select just ID 111 with m#x.com?
How would I go about this?
Table Data:
ID       ParentID  Email          Status   CreateDate
1000919  1000919   xxx#gmail.com  bounced  2/5/18
1017005  1000919   yyy#gmail.com  active   1/6/18
1002868  1002868   sss#gmail.com  active   12/31/17
1002868  1002868   www#gmail.com  active   12/31/17
1002982  1002982   uuu#gmail.com  held     2/7/18
1002982  1002982   iii#gmail.com  held     2/7/18
1002990  1002990   ooo#gmail.com  active   10/26/18
1003255  1003255   ppp#gmail.com  active   2/7/18
Expected Result:
ParentID  Email          Status  CreateDate
1000919   yyy#gmail.com  active  1/6/18
1002868   sss#gmail.com  active  12/31/17

I hope this is what you need:
SELECT * FROM table
WHERE parent_id IN
(SELECT id FROM users WHERE status = 'active')
ORDER BY createdate DESC LIMIT 1;
Ordering by createdate in descending order puts the most recent rows first, and LIMIT n keeps only the first n of them.

This is really hard with no primary key and duplicate rows. You have not defined the answer when a parentid has 2 rows on the same date with different emails. CreateDate could be a datetime field, in which case it would likely be unique. Without that we must use >=. This will do it though:
> SELECT distinct a.* FROM [Temp] a join [Temp] b
> on a.parentid=b.parentid
> and a.createdate >= b.createdate
> and a.status='active' and b.status='active'
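
Another way to express "latest active row per parent" is to rank the active rows inside each parent and keep the first one. The sketch below is not the query above but a swapped-in ROW_NUMBER approach, run against SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The table name contacts, the snake_case column names, the ISO date strings, and the tie-break on email are assumptions made for illustration; only a subset of the question's rows is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (id INTEGER, parent_id INTEGER, email TEXT, status TEXT, create_date TEXT);
INSERT INTO contacts VALUES
    (1000919, 1000919, 'xxx#gmail.com', 'bounced', '2018-02-05'),
    (1017005, 1000919, 'yyy#gmail.com', 'active',  '2018-01-06'),
    (1002868, 1002868, 'sss#gmail.com', 'active',  '2017-12-31'),
    (1002868, 1002868, 'www#gmail.com', 'active',  '2017-12-31'),
    (1002982, 1002982, 'uuu#gmail.com', 'held',    '2018-02-07');
""")

# Rank active rows per parent: newest date first, email as an assumed tie-break.
latest_active = conn.execute("""
SELECT parent_id, email, status, create_date
FROM (
    SELECT c.*,
           ROW_NUMBER() OVER (
               PARTITION BY parent_id
               ORDER BY create_date DESC, email ASC
           ) AS rn
    FROM contacts c
    WHERE status = 'active'
) ranked
WHERE rn = 1
ORDER BY parent_id
""").fetchall()

for row in latest_active:
    print(row)
```

Parents with no active row (1002982 here) simply drop out, and the explicit tie-break is what makes the result deterministic when two active rows share a date.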

OK, so based on the information in the question:
A self-referencing field exists on the table.
The status has to be active.
If 2 or more records exist in the table for the same parent, take the latest.
I added a 4th rule: if several rows match all conditions, take the highest email.
(The code is not formatting properly; I'm new to Stack Overflow.)
Table to create:
create table #tmp
(id int identity,
name varchar(50),
email varchar(50),
status varchar(20),
add_date datetime,
mod_date datetime,
account_id int)
Populating the table:
insert into #tmp
(name,email,status, add_date, mod_date, account_id)
values ('Cesar', 'Cesar#hotmail.com', 'Active', '20180101', '20180103', 1),
('manuel', 'manuel#hotmail.com', 'Active', '20180103', '20180103', 1),
('feliz', 'feliz#hotmail.com', 'Inactive', '20180103', '20180105', 1),
('lucien', 'lucien#hotmail.com', 'Active', '20180105', '20180105', 2),
('norman', 'norman#hotmail.com', 'Active', '20180110', '20180110', 2),
('tom', 'tom#hotmail.com', 'Active', '20180110', '20180115', 3),
('peter', 'peter#hotmail.com', 'inactive', '20180101', '20180110', 3),
('john', 'john#hotmail.com', 'Active', '20180101', '20180105', 3)
Query:
select *
from #tmp as a
where a.status = 'Active' and
    exists (select b.account_id
            from #tmp as b
            where b.status = a.status and
                  b.account_id = a.account_id -- correlate on the parent; otherwise any account's maximum could match
            group by b.account_id
            having MAX(b.mod_date) = a.mod_date and
                   a.email = MAX(b.email))
The EXISTS form should be faster than filtering against a separate subquery, since that subquery would pull the table back in full.

Related

avoiding group by for column used in datediff?

As the database is currently constructed, I can only use a date field of a certain table in a DATEDIFF function that is also part of a count aggregation (not the date field itself, but the entity where that date field is not null). The GROUP BY at the end messes up the counting, since that one entry is counted on its own, as its own group.
In some detail:
Our lead recruiter wants a report that shows the number of applications and conducted interviews per opening. So far no problem. Additionally, he would like to see the total duration per opening, from making it public to signing a new employee, and of course only if the opening has already been filled.
I have 4 tables to join:
table 1 holds the data of the opening
table 2 has the single applications
table 3 has the interview data of the applications
table 4 has the data regarding the publication of the openings (with the date when a certain opening was made public)
The problem is the duration requirement. table 4 holds the starting point, and in table 2 one (or no) applicant per opening has a date field filled with the time he returned a signed contract, at which point the opening counts as filled. When I use that field in a DATEDIFF, I'm forced to also put that column in the GROUP BY clause, and that results in 2 rows per opening: 1 row has all the numbers as wanted, and the second row always contains that one person who has an entry in that date field...
So far I haven't gotten far in thinking of a way to avoid that problem, except for explaining to the colleague that he gets his time-to-fill number in another report.
SELECT
table1.col1 as NameOfProject,
table1.col2 as Company,
table1.col3 as OpeningType,
table1.col4 as ReasonForOpening,
count (table2.col2) as NumberOfApplications,
sum (case when table2.colSTATUS = 'withdrawn' then 1 else 0 end) as NumberOfApplicantsWhoWithdraw,
sum (case when table3.colTypeInterview = 'PhoneInterview' then 1 else 0 end) as NumberOfPhoneInterview,
...more sum columns...,
table1.finished, -- shows 1 if the opening is filled
DATEDIFF(day, table4.colValidFrom, table2.colContractReceived) as DaysToCompletion -- the problematic column
FROM
table2 left join table3 on table2.REF_NR = table3.REF_NR
join table1 on table2.PROJEKT = table1.KBEZ
left join table4 on table1.REFNR = table4.PRJ_REFNR
GROUP BY
table2.colContractReceived, -- forced into the GROUP BY by the DATEDIFF above
... all other columns except the ones in aggregate (sum and count) functions ...
ORDER BY NameOfProject
Here is a short rebuild of what it looks like. First a row where the opening is not filled and all aggregations come out in one row as wanted. The next project/opening shows up double, because the field used in the datediff is grouped independently...
project   company  no_of_applications  no_of_phoneinterview  no_of_personalinterview  ...  time_to_fill_in_days  filled?
2018_312  comp a   27                  4                     2                             null                  0
2018_313  comp b   54                  7                     4                             null                  0
2018_313  comp b   1                   1                     1                             42                    1
I'd be glad to get any idea of how to solve this. Thanks for considering my request!
(During the 'translation' of all the specific column and table names I might have built in a syntax error here and there, but the query worked well except for that unwanted extra aggregation per filled opening.)
If I've understood your requirement properly, I believe the issue you are having is that you need to show the number of days between the starting point and the time at which an applicant responded to an opening, but only as a single row per opening: the filled row if the position was filled, otherwise the unfilled row.
I've achieved this result by assuming that you count a position as filled using the "ContractsReceived" column. This may be wrong; however, the principle should still provide what you are looking for.
I've essentially wrapped your query in a subquery, ranked the rows by the ContractsReceived column in descending order, partitioned by project, and then filtered in the outer query for the first row of this ranking.
Even if my assumption about the column structure and data types is wrong, this should provide you with a model to work with.
The only issue you might have with this ranking solution is if you want to aggregate over both rows in one (that is, include the summed columns for both the filled and the unfilled row per project). If this is the case, let me know and we can work around that.
Please let me know if you have any questions.
declare @table1 table (
    REFNR int,
    NameOfProject nvarchar(20),
    Company nvarchar(20),
    OpeningType nvarchar(20),
    ReasonForOpening nvarchar(20),
    KBEZ int
);
declare @table2 table (
    NumberOfApplications int,
    Status nvarchar(15),
    REF_NR int,
    ReturnedApplicationDate datetime,
    ContractsReceived bit,
    PROJEKT int
);
declare @table3 table (
    TypeInterview nvarchar(25),
    REF_NR int
);
declare @table4 table (
    PRJ_REFNR int,
    StartingPoint datetime
);
insert into @table1 (REFNR, NameOfProject, Company, OpeningType, ReasonForOpening, KBEZ)
values (1, '2018_312', 'comp a', 'Permanent', 'Business growth', 1),
       (2, '2018_313', 'comp a', 'Permanent', 'Business growth', 2),
       (3, '2018_313', 'comp a', 'Permanent', 'Business growth', 3);
insert into @table2 (NumberOfApplications, Status, REF_NR, ReturnedApplicationDate, ContractsReceived, PROJEKT)
values (27, 'Processed', 4, '2018-04-01 08:00', 0, 1),
       (54, 'Withdrawn', 5, '2018-04-02 10:12', 0, 2),
       (1, 'Processed', 6, '2018-04-15 15:00', 1, 3);
insert into @table3 (TypeInterview, REF_NR)
values ('Phone', 4),
       ('Phone', 5),
       ('Personal', 6);
insert into @table4 (PRJ_REFNR, StartingPoint)
values (1, '2018-02-25 08:00'),
       (2, '2018-03-04 15:00'),
       (3, '2018-03-04 15:00');
select * from
(
    SELECT
        -- rank the filled row (ContractsReceived = 1) first within each project
        RANK() OVER (partition by NameOfProject, Company order by ContractsReceived desc) as rowno,
        table1.NameOfProject,
        table1.Company,
        table1.OpeningType,
        table1.ReasonForOpening,
        case when ContractsReceived > 0 then datediff(DAY, StartingPoint, ReturnedApplicationDate) else null end as TimeToFillInDays,
        ContractsReceived Filled
    FROM
        @table2 table2 left join @table3 table3 on table2.REF_NR = table3.REF_NR
        join @table1 table1 on table2.PROJEKT = table1.KBEZ
        left join @table4 table4 on table1.REFNR = table4.PRJ_REFNR
    group by NameOfProject, Company, OpeningType, ReasonForOpening, ContractsReceived,
        StartingPoint, ReturnedApplicationDate
) x where rowno=1
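
A different way to avoid the extra group, separate from the ranking approach above, is to wrap the date difference in an aggregate so that the contract date never has to appear in the GROUP BY. The sketch below shows the idea on SQLite through Python's sqlite3 module; julianday stands in for DATEDIFF, and the two-table schema (openings, applications) with its column names and rows is a simplified assumption, not the asker's real structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE openings (id INTEGER PRIMARY KEY, name TEXT, published TEXT);
CREATE TABLE applications (
    id INTEGER PRIMARY KEY,
    opening_id INTEGER,
    status TEXT,
    contract_received TEXT  -- NULL unless this applicant returned a signed contract
);
INSERT INTO openings VALUES (1, '2018_312', '2018-02-25'), (2, '2018_313', '2018-03-04');
INSERT INTO applications (opening_id, status, contract_received) VALUES
    (1, 'processed', NULL),
    (1, 'withdrawn', NULL),
    (1, 'processed', NULL),
    (2, 'processed', NULL),
    (2, 'processed', '2018-04-15');
""")

# MAX() ignores NULLs, so the single non-NULL date difference per opening
# survives while every application is still counted in the same group.
report = conn.execute("""
SELECT o.name,
       COUNT(a.id) AS applications,
       SUM(CASE WHEN a.status = 'withdrawn' THEN 1 ELSE 0 END) AS withdrawn,
       MAX(CAST(julianday(a.contract_received) - julianday(o.published) AS INTEGER)) AS days_to_fill
FROM openings o
LEFT JOIN applications a ON a.opening_id = o.id
GROUP BY o.id, o.name
ORDER BY o.name
""").fetchall()

for row in report:
    print(row)
```

Each opening comes back as one row, with days_to_fill left NULL (None in Python) while the opening is unfilled. In T-SQL the same shape would presumably be MAX(DATEDIFF(day, ..., ...)), but that is untested here.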

SQL query - find where a value (where there could be multiple) does not exist

I need some help in identifying records which do not have a specific value associated with it.
Need:
Each distinct customer record can have multiple methods of contact, for example:
Cheryl Hubert has the following contact records:
Code value: 1
Description: home phone
CustomerData: 123-456-7890
Code value: 2
Description: work phone
CustomerData: 000-123-4567
Code value: 3
Description: email
CustomerData: chubert#xxx.xxx
Customers may have none of these, or some of these.
I need to write a query to find all those customer records which DO NOT have an email address (code value 3). I've seen queries with 'not exists' but not sure that would be the right way. Keep in mind that the same field name is used for all contact data (CustomerData).
The code value/description provides what is within the CustomerData field.
Any help appreciated.
Let's say the contact info is in a table contactRecords, which looks something like this:
customerId int,
codeValue int,
description varchar,
customerData varchar
To get all of the customers who do not have an email record (where codeValue = 3), try something like this:
select distinct customerId
from contactRecords
where customerId not in (
select distinct customerId
from contactRecords
where codeValue = 3)
The inner query finds all customers who have an email record. The outer query finds all but those customers.
As you posted almost no data, I will try guessing your structure. Assuming you have clients in one table and contacts in another one that carries the client id: when you want to find rows in one table that have no match in another, you usually select from your client table, left join to your contact table, and add a WHERE clause requiring that one of the contact columns is null. If you want specifically the value 3, put that condition directly in the join clause.
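
The left-join approach described above can be sketched as follows, using SQLite through Python's sqlite3 module. The table names customers and contact_method, their columns, and the two extra customers are assumptions for illustration, since the question does not show the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE contact_method (customer_id INTEGER, code_value INTEGER, description TEXT, customer_data TEXT);
INSERT INTO customers VALUES (1, 'Cheryl Hubert'), (2, 'Customer Two'), (3, 'Customer Three');
INSERT INTO contact_method VALUES
    (1, 1, 'home phone', '123-456-7890'),
    (1, 2, 'work phone', '000-123-4567'),
    (1, 3, 'email',      'chubert#xxx.xxx'),
    (2, 1, 'home phone', '012-345-6789');
""")

# The code_value = 3 test lives in the ON clause, so customers without an
# email row still come through the LEFT JOIN with NULL contact columns.
without_email = conn.execute("""
SELECT c.id, c.name
FROM customers c
LEFT JOIN contact_method m
       ON m.customer_id = c.id
      AND m.code_value = 3
WHERE m.customer_id IS NULL
ORDER BY c.id
""").fetchall()

print(without_email)
```

Customer 2 (phone only) and customer 3 (no contact rows at all) are both returned. Moving the code_value = 3 condition into the WHERE clause would turn the join back into an inner join and return nothing.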
Try this query:
select *
from customers c
where not exists(select 1 from contact_method
where customer_id = c.id
and description = 'email');
I assumed such schema:
create table customers(id int, name varchar(20));
insert into customers values (1, 'Cheryl Hubert');
create table contact_method (id int, customer_id int, code_value int, description varchar(20), customer_data varchar(20));
insert into contact_method values (1, 1, 1, 'home phone', '123-456-7890');
insert into contact_method values (2, 1, 2, 'work phone', '000-123-4567');
insert into contact_method values (3, 1, 3, 'email', 'chubert#xxx.xxx');
You can use the GROUP BY and HAVING clauses to check:
Oracle Setup:
CREATE TABLE contact_details ( code_value, customerid, description, customerdata ) AS
SELECT 1, 1, 'home phone', '123-456-7890' FROM DUAL UNION ALL
SELECT 2, 1, 'work phone', '000-123-4567' FROM DUAL UNION ALL
SELECT 3, 1, 'email', 'chubert#xxx.xxx' FROM DUAL UNION ALL
SELECT 4, 2, 'home phone', '012-345-6789' FROM DUAL;
Query:
SELECT customerid
FROM contact_details
GROUP BY customerid
HAVING COUNT( CASE description WHEN 'email' THEN 1 END ) = 0
Output:
| CUSTOMERID |
|------------|
| 2 |

sql join using recursive cte

Edit: Added another case scenario in the notes and updated the sample attachment.
I am trying to write a SQL query to get the output attached to this question, along with sample data.
There are two tables: one with distinct IDs (pk) and their current flag,
and another with an Active ID (fk to the pk of the first table) and an Inactive ID (fk to the pk of the first table).
The final output should return two columns: the first column consists of all distinct IDs from the first table, and the second column should contain the Active ID from the 2nd table.
Below is the sql:
IF OBJECT_ID('tempdb..#main') IS NOT NULL DROP TABLE #main;
IF OBJECT_ID('tempdb..#merges') IS NOT NULL DROP TABLE #merges
IF OBJECT_ID('tempdb..#final') IS NOT NULL DROP TABLE #final
SELECT DISTINCT id,
current
INTO #main
FROM tb_ID t1
--get list of all active_id and inactive_id
SELECT DISTINCT active_id,
inactive_id,
Update_dt
INTO #merges
FROM tb_merges
-- Combine where the id from the main table matched to the inactive_id (should return all the rows from #main)
SELECT id,
active_id AS merged_to_id
INTO #final
FROM (SELECT t1.*,
t2.active_id,
Update_dt ,
Row_number()
OVER (
partition BY id, active_id
ORDER BY Update_dt DESC) AS rn
FROM #main t1
LEFT JOIN #merges t2
ON t1.id = t2.inactive_id) t3
WHERE rn = 1
SELECT *
FROM #final
This SQL partially works. It doesn't work where the ID was once active and then became inactive.
Please note:
the active ID should be the most recent active ID
an ID which doesn't have any active ID should return either null or the ID itself
for IDs where current = 0, the active ID should be the ID that is current in tb_ID
IDs may get interchanged. For example, there are two IDs, 6 and 7: when 6 is active 7 is inactive, and vice versa. The only way to know the most current active state is by the update date
The attached sample might make this easier to understand.
It looks like I might have to use a recursive CTE to achieve the results. Can someone please help?
Thank you for your time!
I think you're correct that a recursive CTE looks like a good solution for this. I'm not entirely certain that I've understood exactly what you're asking for, particularly with regard to the update_dt column, just because the data is a little abstract as-is, but I've taken a stab at it, and it does seem to work with your sample data. The comments explain what's going on.
declare @tb_id table (id bigint, [current] bit);
declare @tb_merges table (active_id bigint, inactive_id bigint, update_dt datetime2);
insert @tb_id values
    -- Sample data from the question.
    (1, 1),
    (2, 1),
    (3, 1),
    (4, 1),
    (5, 0),
    -- A few additional data to illustrate a deeper search.
    (6, 1),
    (7, 1),
    (8, 1),
    (9, 1),
    (10, 1);
insert @tb_merges values
    -- Sample data from the question.
    (3, 1, '2017-01-11T13:09:00'),
    (1, 2, '2017-01-11T13:07:00'),
    (5, 4, '2013-12-31T14:37:00'),
    (4, 5, '2013-01-18T15:43:00'),
    -- A few additional data to illustrate a deeper search.
    (6, 7, getdate()),
    (7, 8, getdate()),
    (8, 9, getdate()),
    (9, 10, getdate());
if object_id('tempdb..#ValidMerge') is not null
    drop table #ValidMerge;
-- Get the subset of merge records whose active_id identifies a "current" id and
-- rank by date so we can consider only the latest merge record for each active_id.
with ValidMergeCTE as
(
    select
        M.active_id,
        M.inactive_id,
        [Priority] = row_number() over (partition by M.active_id order by M.update_dt desc)
    from
        @tb_merges M
        inner join @tb_id I on M.active_id = I.id
    where
        I.[current] = 1
)
select
    active_id,
    inactive_id
into
    #ValidMerge
from
    ValidMergeCTE
where
    [Priority] = 1;
-- Here's the recursive CTE, which draws on the subset of merges identified above.
with SearchCTE as
(
    -- Base case: any record whose active_id is not used as an inactive_id is an endpoint.
    select
        M.active_id,
        M.inactive_id,
        Depth = 0
    from
        #ValidMerge M
    where
        not exists (select 1 from #ValidMerge M2 where M.active_id = M2.inactive_id)
    -- Recursive case: look for records whose active_id matches the inactive_id of a previously
    -- identified record.
    union all
    select
        S.active_id,
        M.inactive_id,
        Depth = S.Depth + 1
    from
        #ValidMerge M
        inner join SearchCTE S on M.active_id = S.inactive_id
)
select
    I.id,
    S.active_id
from
    @tb_id I
    left join SearchCTE S on I.id = S.inactive_id;
Results:
id active_id
------------------
1 3
2 3
3 NULL
4 NULL
5 4
6 NULL
7 6
8 6
9 6
10 6
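
For readers without SQL Server at hand, the same search can be tried on SQLite through Python's sqlite3 module. This is a port made under assumptions, not the answer's own code: the flag column is renamed is_current to sidestep keyword issues, the temp table is folded into a plain CTE, and only the question's original five ids are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb_id (id INTEGER PRIMARY KEY, is_current INTEGER);
CREATE TABLE tb_merges (active_id INTEGER, inactive_id INTEGER, update_dt TEXT);
INSERT INTO tb_id VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 0);
INSERT INTO tb_merges VALUES
    (3, 1, '2017-01-11T13:09:00'),
    (1, 2, '2017-01-11T13:07:00'),
    (5, 4, '2013-12-31T14:37:00'),
    (4, 5, '2013-01-18T15:43:00');
""")

resolved = conn.execute("""
WITH RECURSIVE
valid_merge AS (
    -- Latest merge per active_id, restricted to ids that are still current.
    SELECT active_id, inactive_id
    FROM (
        SELECT m.active_id, m.inactive_id,
               ROW_NUMBER() OVER (PARTITION BY m.active_id ORDER BY m.update_dt DESC) AS priority
        FROM tb_merges m
        JOIN tb_id i ON i.id = m.active_id
        WHERE i.is_current = 1
    ) ranked
    WHERE priority = 1
),
search(active_id, inactive_id, depth) AS (
    -- Base case: an active_id that is never itself merged away is an endpoint.
    SELECT v.active_id, v.inactive_id, 0
    FROM valid_merge v
    WHERE NOT EXISTS (SELECT 1 FROM valid_merge v2 WHERE v2.inactive_id = v.active_id)
    UNION ALL
    -- Recursive case: follow the chain down, carrying the endpoint along.
    SELECT s.active_id, v.inactive_id, s.depth + 1
    FROM valid_merge v
    JOIN search s ON v.active_id = s.inactive_id
)
SELECT i.id, s.active_id
FROM tb_id i
LEFT JOIN search s ON s.inactive_id = i.id
ORDER BY i.id
""").fetchall()

print(resolved)
```

For ids 1 through 5 this gives the same pairs as the first five rows of the results above.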

Calculating a fields value according to the values of the previous and next fields

For clarity assume that I have a table with a carID, a mileage and a date. The dates are always months (eg 01/02/2015, 01/03/2015, ...). Each carID has a row for each month, but not each row has values for the mileage field, some are NULL.
Example table:
carID mileage date
-----------------------------------------
1 400 01/01/2015
2 NULL 01/02/2015
3 NULL 01/03/2015
4 1050 01/04/2015
If such a field is NULL, I need to calculate what value it should have by looking at the previous and next values (these aren't necessarily in the next or previous month; they can be months apart).
I want to do this by taking the difference between the previous and next values, then calculating the time between them, and setting the value in proportion to the elapsed time. However, I have no idea how to do this.
I have already used a bit of code to look at the next value before, it looks like this:
, carKMcombiDiffList as (
select ml.*,
(ml.KM - mlprev.KM) as diff
from carKMcombilist ml outer apply
(select top 1 ml2.*
from carKMcombilist ml2
where ml2.FK_CarID = ml.FK_CarID and
ml2.beginmonth < ml.beginmonth
order by ml2.beginmonth desc
) mlprev
)
What this does is check whether the current value is larger than the previous value. I assume I can use this as well to find the previous one in my current problem; I just don't know how to add the next one AND all the logic that I need to make the calculations.
Assumption: CarID and date are always a unique combination
This is what I came up with:
select with_dates.*,
prev_mileage.mileage as prev_mileage,
next_mileage.mileage as next_mileage,
next_mileage.mileage - prev_mileage.mileage as mileage_delta,
datediff(month,prev_d,next_d) as month_delta,
(next_mileage.mileage - prev_mileage.mileage) * 1.0 / datediff(month,prev_d,next_d) * datediff(month,prev_d,with_dates.d) + prev_mileage.mileage as estimated_mileage -- * 1.0 avoids integer division if mileage is an int
from (select *,
(select top 1 d
from mileage as prev
where carid = c.carid
and prev.d < c.d
and prev.mileage is not null
order by d desc ) as prev_d,
(select top 1 d
from mileage as next_rec
where carid = c.carid
and next_rec.d > c.d
and next_rec.mileage is not null
order by d asc) as next_d
from mileage as c
where mileage is null) as with_dates
join mileage as prev_mileage
on prev_mileage.carid = with_dates.carid
and prev_mileage.d = with_dates.prev_d
join mileage as next_mileage
on next_mileage.carid = with_dates.carid
and next_mileage.d = with_dates.next_d
Logic:
First, for every record where mileage is null, I select the previous and next dates where mileage is not null. After this I just join the rows based on carid and date and do some simple math to approximate.
Hope this helps, it was quite fun.
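
The "simple math" is linear interpolation over elapsed months. As a small worked check in Python, using the first car from the question (400 on 01/01/2015, 1050 on 01/04/2015, read as day/month/year), the helper names below are hypothetical:

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (both on the 1st of a month)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def interpolate_mileage(prev_date, prev_km, next_date, next_km, target_date):
    """Estimate mileage at target_date on a straight line between two known readings."""
    span = months_between(prev_date, next_date)
    elapsed = months_between(prev_date, target_date)
    return prev_km + (next_km - prev_km) * elapsed / span


known_prev = (date(2015, 1, 1), 400)
known_next = (date(2015, 4, 1), 1050)

feb = interpolate_mileage(*known_prev, *known_next, date(2015, 2, 1))
mar = interpolate_mileage(*known_prev, *known_next, date(2015, 3, 1))
print(round(feb, 2), round(mar, 2))
```

The gap of 650 is spread over 3 months, so February is estimated at about 616.67 and March at about 833.33.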
The following query obtains the previous and next available mileages for a record.
with data as --test data
(
select * from (VALUES
(0, null, getdate()),
(1, 400, '20150101'),
(1, null, '20150201'),
(1, null, '20150301'),
(1, 1050, '20150401'),
(2, 300, '20150101'),
(2, null, '20150201'),
(2, null, '20150301'),
(2, 1235, '20150401'),
(2, null, '20150501'),
(2, 1450, '20150601'),
(3, 200, '20150101'),
(3, null, '20150201')
) as v(carId, mileage, [date])
where v.carId != 0
)
-- replace 'data' with your table name
select d.*,
(select top 1 mileage from data dprev where dprev.mileage is not null and dprev.carId = d.carId and dprev.[date] <= d.date order by dprev.[date] desc) as 'Prev available mileage',
(select top 1 mileage from data dnext where dnext.mileage is not null and dnext.carId = d.carId and dnext.[date] >= d.date order by dnext.[date] asc) as 'Next available mileage'
from data d
Note that these columns can still be null if there is no data available before/after a specific date.
From here it's up to you on how you use these values. Probably you want to interpolate values for records where mileage is missing.
Edit
In order to interpolate the values for missing mileages I had to compute three auxiliary columns:
ri - index of record in a continuous group where mileage is missing
gi - index of a continuous group where mileage is missing per car
gc - count of records per continuous group where mileage is missing
The limit columns from the query above were renamed to
pa (Previous Available) and
na (Next Available).
The query is not compact and I am sure it can be improved but the good part of the cascading CTEs is that you can easily check intermediary results and understand each step.
SQL Fiddle: SO 29363187
with data as --test data
(
select * from (VALUES
(0, null, getdate()),
(1, 400, '20150101'),
(1, null, '20150201'),
(1, null, '20150301'),
(1, 1050, '20150401'),
(2, 300, '20150101'),
(2, null, '20150201'),
(2, null, '20150301'),
(2, 1235, '20150401'),
(2, null, '20150501'),
(2, 1450, '20150601'),
(3, 200, '20150101'),
(3, null, '20150201')
) as v(carId, mileage, [date])
where v.carId != 0
),
-- replace 'data' with your table name
limits AS
(
select d.*,
(select top 1 mileage from data dprev where dprev.mileage is not null and dprev.carId = d.carId and dprev.[date] <= d.date order by dprev.[date] desc) as pa,
(select top 1 mileage from data dnext where dnext.mileage is not null and dnext.carId = d.carId and dnext.[date] >= d.date order by dnext.[date] asc) as na
from data d
),
t1 as
(
SELECT l.*,
case when mileage is not null
then null
else row_number() over (partition by l.carId, l.pa, l.na order by l.carId, l.[date])
end as ri, -- index of record in a continuous group where mileage is missing
case when mileage is not null
then null
else dense_rank() over (partition by carId order by l.carId, l.pa, l.na)
end as gi -- index of a continuous group where mileage is missing per car
from limits l
),
t2 as
(
select *,
(select count(*) from t1 tm where tm.carId = t.carId and tm.gi = t.gi) gc --count of records per continuous group where mileage is missing
FROM t1 t
)
select *,
case when mileage is NULL
then pa + (na - pa) / (gc + 1.0) * ri -- also converts from integer to decimal
else NULL
end as 'Interpolated value'
from t2
order by carId, [date]

SQL Group By Problem

I have a table that has 3 columns, namely points, project_id and creation_date. Every time points are assigned, a new record is made, for example:
points = 20 project_id = 441 creation_date = 04/02/2011 -> Is one record
points = 10 project_id = 600 creation_date = 04/02/2011 -> Is another record
points = 5 project_id = 441 creation_date = 06/02/2011 -> Is final record
(creation_date is the date on which the record is entered, and it is set via a default value of GETDATE())
Now the problem is that I want to get the MAX points grouped by project_id, but I also want creation_date to appear with it so I can use it for another purpose. It's OK if the creation date repeats. I cannot group by creation_date, because doing so would skip the points of the project with id 600, which is wrong: id 600 is a different project whose only (and therefore max) points are 10, so it should be listed. That is only possible if I group by project_id, but then how should I also list creation_date?
So far I am using this query to get MAX points of each project
SELECT MAX(points) AS points, project_id
FROM LogiCpsLogs AS LCL
WHERE (writer_id = @writer_id) AND (DATENAME(mm, GETDATE()) = DATENAME(mm, creation_date)) AND (points <> 0)
GROUP BY project_id
writer_id is the ID of writer whose points I want to see, like writer_id = 1, 2 or 3.
This query brings the result of current month only but I would like to list creation_date as well. Please help.
The subquery way
SELECT P.Project_ID, P.Creation_Date, T.Max_Points
FROM Projects P INNER JOIN
(
SELECT Project_ID, MAX(Points) AS Max_Points
FROM Projects
GROUP BY Project_ID
) T
ON P.Project_ID = T.Project_ID
AND P.Points = T.Max_Points
Please see comment: this will give you ALL days where max-points was achieved. If you only just want one, the query will be more complex.
Edits:
Misread requirements. Added additional constraint.
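
If only one row per project is wanted even when the maximum was reached on several days, one option is to rank rows per project and keep the top one. The sketch below uses ROW_NUMBER on SQLite through Python's sqlite3 module (SQLite 3.25 or later). The table name logs, the ISO dates, the added tie row, and the choice of the earliest date as tie-break are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs (points INTEGER, project_id INTEGER, creation_date TEXT);
-- The first three rows follow the question; the last one is a hypothetical tie.
INSERT INTO logs VALUES
    (20, 441, '2011-02-04'),
    (10, 600, '2011-02-04'),
    (5,  441, '2011-02-06'),
    (20, 441, '2011-02-08');
""")

# Highest points first; among equal points, the earliest date wins.
best_per_project = conn.execute("""
SELECT project_id, points, creation_date
FROM (
    SELECT l.*,
           ROW_NUMBER() OVER (
               PARTITION BY project_id
               ORDER BY points DESC, creation_date ASC
           ) AS rn
    FROM logs l
) ranked
WHERE rn = 1
ORDER BY project_id
""").fetchall()

print(best_per_project)
```

Project 441 appears once with its first 20-point day, and project 600 keeps its single 10-point row.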
I'll give you sample..
SELECT MAX(POINTS),
PROJECT_ID,
CREATION_DATE
FROM yourtable
GROUP by CREATION_DATE,PROJECT_ID;
This should be what you want; you don't even need a GROUP BY or aggregate functions:
SELECT points, project_id, created_date
FROM @T AS LCL
WHERE writer_id = @writer_id AND points <> 0
AND NOT EXISTS (
    SELECT TOP 1 1
    FROM @T AS T2
    WHERE T2.writer_id = @writer_id
    AND T2.project_id = LCL.project_id
    AND T2.points > LCL.points)
Here @T is your table. If you want to show only the records that hold the overall maximum, rather than the maximum for just the given @writer_id, remove the restriction T2.writer_id = @writer_id from the inner query.
And the code that I used to test:
DECLARE @T TABLE
(
    writer_id int,
    points int,
    project_id int,
    created_date datetime
)
INSERT INTO @T VALUES(1, 20, 441, CAST('20110204' AS DATETIME))
INSERT INTO @T VALUES(1, 10, 600, CAST('20110204' AS DATETIME))
INSERT INTO @T VALUES(1, 5, 441, CAST('20110202' AS DATETIME))
INSERT INTO @T VALUES(1, 15, 241, GETDATE())
INSERT INTO @T VALUES(1, 12, 241, GETDATE())
INSERT INTO @T VALUES(2, 12, 241, GETDATE())
SELECT * FROM @T
DECLARE @writer_id int = 1
My results:
Result Set (3 items)
points | project_id | created_date
20 | 441 | 04/02/2011 00:00:00
10 | 600 | 04/02/2011 00:00:00
15 | 241 | 21/09/2011 18:59:31
My solution uses CROSS APPLY sub-queries.
For optimal performance I have created an index on the project_id (ASC) and points (DESC) fields.
If you want to see all creation_date values that have maximum points then you can use WITH TIES:
CREATE TABLE dbo.Project
(
project_id INT PRIMARY KEY
,name NVARCHAR(100) NOT NULL
);
CREATE TABLE dbo.ProjectActivity
(
project_activity INT IDENTITY(1,1) PRIMARY KEY
,project_id INT NOT NULL REFERENCES dbo.Project(project_id)
,points INT NOT NULL
,creation_date DATE NOT NULL
);
CREATE INDEX IX_ProjectActivity_project_id_points_creation_date
ON dbo.ProjectActivity(project_id ASC, points DESC)
INCLUDE (creation_date);
GO
INSERT dbo.Project
VALUES (1, 'A'), (2, 'BB'), (3, 'CCC');
INSERT dbo.ProjectActivity (project_id, points, creation_date)
VALUES (1,100,'2011-01-01'), (1,110,'2011-02-02'), (1, 111, '2011-03-03'), (1, 111, '2011-04-04')
,(2, 20, '2011-02-02'), (2, 22, '2011-03-03')
,(3, 2, '2011-03-03');
SELECT p.*, ca.*
FROM dbo.Project p
CROSS APPLY
(
SELECT TOP(1) WITH TIES
pa.points, pa.creation_date
FROM dbo.ProjectActivity pa
WHERE pa.project_id = p.project_id
ORDER BY pa.points DESC
) ca;
DROP TABLE dbo.ProjectActivity;
DROP TABLE dbo.Project;