Get another value from row with max value with group by clause - sql

Sorry for the title. I have this table:
CREATE TABLE people
(
Id int IDENTITY(1,1) PRIMARY KEY,
name varchar(50) NOT NULL,
FatherId int REFERENCES people(Id),
MotherId int REFERENCES people(Id),
age int NOT NULL
);
Which you can populate with these commands:
insert into people (name, age) VALUES ('Jonny', 50 )
insert into people (name, age) VALUES ('Angela', 48 )
insert into people (name, age) VALUES ('Donny', 55 )
insert into people (name, age) VALUES ('Amy', 55 )
insert into people (name, FatherId, MotherId, age) VALUES ('Marcus', 1, 2, 10)
insert into people (name, FatherId, MotherId, age) VALUES ('Beevis', 1, 2, 5)
insert into people (name, FatherId, MotherId, age) VALUES ('Stew', 3, 4, 24)
insert into people (name, FatherId, MotherId, age) VALUES ('Emily', 3, 4, 25)
My Goal
I want to get the age and name of the oldest child of each set of parents.
Getting just the age was pretty simple:
SELECT MAX(age) FROM people WHERE FatherId IS NOT NULL GROUP BY FatherId
But what if I want to get the age and their corresponding name?
I have tried
select p1.name, p1.age
FROM people p1
INNER JOIN
(
SELECT FatherId, MAX(age) age
FROM people
GROUP BY FatherId
) p2
ON p1.FatherId = p2.FatherId
but this just gives all the children because of the FatherId matching.
I can't seem to get the primary key (Id) because of the GROUP BY clause.
I suppose if this is not possible then some table restructuring may be required to make it possible?
EDIT
Here is a solution I found using CROSS APPLY
select child.name, child.age
FROM people parents
CROSS APPLY
(
select top 1 age, name
from people child
where child.FatherId = parents.Id
ORDER BY age DESC
) child

Here's a simple tweak of your own attempt. One possible advantage of doing it this way is that it allows for ties.
select p1.name, p1.age
from people p1 inner join
(
select FatherId, max(age) max_age
from people
group by FatherId
) p2
on p2.FatherId = p1.FatherId and p2.max_age = p1.age;
Also, you did refer to a "set of parents" in the question. To handle that you'd need to group by and join on MotherId as well, assuming the data mirrors the real world, where children often have only one parent in common.
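As a rough sketch of the "group by and join on MotherId as well" idea, here is the same join run against an in-memory SQLite copy of the sample data. SQLite and the Python wrapper are assumptions made only so the query can be executed end to end; the SQL itself should carry over to SQL Server largely unchanged.

```python
import sqlite3

# In-memory SQLite stand-in for the people table (the original is SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (
    Id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    FatherId INTEGER REFERENCES people(Id),
    MotherId INTEGER REFERENCES people(Id),
    age INTEGER NOT NULL
);
INSERT INTO people (name, age) VALUES
    ('Jonny', 50), ('Angela', 48), ('Donny', 55), ('Amy', 55);
INSERT INTO people (name, FatherId, MotherId, age) VALUES
    ('Marcus', 1, 2, 10), ('Beevis', 1, 2, 5),
    ('Stew', 3, 4, 24), ('Emily', 3, 4, 25);
""")

# Group by both parents, then join back on both parent ids and the max age.
# Children tied for the max age within a couple would all be returned.
oldest_children = conn.execute("""
SELECT p1.name, p1.age
FROM people p1
INNER JOIN (
    SELECT FatherId, MotherId, MAX(age) AS max_age
    FROM people
    WHERE FatherId IS NOT NULL AND MotherId IS NOT NULL
    GROUP BY FatherId, MotherId
) p2
  ON p2.FatherId = p1.FatherId
 AND p2.MotherId = p1.MotherId
 AND p2.max_age = p1.age
ORDER BY p1.age
""").fetchall()

print(oldest_children)  # [('Marcus', 10), ('Emily', 25)]
```

Children with only one known parent are filtered out here; whether that is the right call depends on how you want to treat them.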

You can try this query:
select
name,age
from
(select p.*, rn=row_number() over(partition by p.fatherid,p.motherid order by age desc) from
people p join
(select fatherid,motherid from people
where coalesce(fatherid,motherid) is not null
group by fatherid,motherid)t
on p.fatherid=t.fatherid and p.motherid=t.motherid
)t where rn=1

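Two nested sub-select statements can be combined with an inner join. Join on FatherId as well as age, otherwise a child from another family who happens to have the same age would also match:
SELECT p2.age, p2.name
FROM
(
SELECT FatherId, max(age) age
FROM people
WHERE FatherId is not null
GROUP BY FatherId ) p1
INNER JOIN
(
SELECT FatherId, age, name
FROM people
WHERE FatherId is not null ) p2
ON ( p1.FatherId = p2.FatherId AND p1.age = p2.age );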
age name
--- ------
10 Marcus
25 Emily

This will pick up both parents
create table #t
(
Id int IDENTITY(1,1) PRIMARY KEY,
name varchar(50) NOT NULL,
FatherId int,
MotherId int,
age int NOT NULL
);
insert into #t (name, age) VALUES
('Jonny', 50 ),
('Angela', 48 ),
('Donny', 55 ),
('Amy', 55 );
insert into #t (name, FatherId, MotherId, age) VALUES
('Marcus', 1, 2, 10),
('Beevis', 1, 2, 5),
('Stew', 3, 4, 24),
('Emily', 3, 4, 25);
select tt.name, tt.age
, tt.fatherName, tt.fatherAge
, tt.motherName, tt.motherAge
from (
select ta.*
, tf.name as fatherName, tf.age as fatherAge
, tm.name as motherName, tm.age as motherAge
, row_number() over (partition by ta.FatherID, ta.MotherID order by ta.age desc) as rn
from #t ta
left join #t tf
on tf.id = ta.fatherID
left join #t tm
on tm.id = ta.motherID
) as tt
where FatherID is not null
and rn = 1

Related

Union two queries ordered by newid

I have a table that stores employees (id, name, and gender). I need to randomly get two men and two women.
CREATE TABLE employees
(
id INT,
name VARCHAR (10),
gender VARCHAR (1)
);
INSERT INTO employees VALUES (1, 'Mary', 'F');
INSERT INTO employees VALUES (2, 'Jake', 'M');
INSERT INTO employees VALUES (3, 'Ryan', 'M');
INSERT INTO employees VALUES (4, 'Lola', 'F');
INSERT INTO employees VALUES (5, 'Dina', 'F');
INSERT INTO employees VALUES (6, 'Paul', 'M');
INSERT INTO employees VALUES (7, 'Tina', 'F');
INSERT INTO employees VALUES (8, 'John', 'M');
My attempt is the following:
SELECT TOP 2 *
FROM employees
WHERE gender = 'F'
ORDER BY NEWID()
UNION
SELECT TOP 2 *
FROM employees
WHERE gender = 'M'
ORDER BY NEWID()
But it doesn't work since I can't put two order by in the same query.
Why not just use row_number()? One method without a subquery is:
SELECT TOP (4) WITH TIES e.*
FROM employees e
WHERE gender IN ('M', 'F')
ORDER BY ROW_NUMBER() OVER (PARTITION BY gender ORDER BY newid());
This is slightly less performant than using ROW_NUMBER() in a subquery.
Or, a fun method would use APPLY:
select e.*
from (values ('M'), ('F')) v(gender) cross apply
(select top (2) e.*
from employees e
where e.gender = v.gender
order by newid()
) e;
You cannot put an ORDER BY directly on an individual query inside a UNION; a trailing ORDER BY applies to the combined result. However, you can use ORDER BY if you convert each query into a table expression.
For example:
select *
from (
SELECT TOP 2 *
FROM employees
WHERE gender = 'F'
ORDER BY newid()
) x
UNION ALL
select *
from (
SELECT TOP 2 *
FROM employees
WHERE gender = 'M'
ORDER BY newid()
) y
Result:
id name gender
--- ----- ------
5 Dina F
4 Lola F
2 Jake M
3 Ryan M
See running example at SQL Fiddle.
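The ROW_NUMBER()-in-a-subquery variant mentioned above is not spelled out, so here is a minimal sketch of it. It runs on an in-memory SQLite database purely for illustration: SQLite's RANDOM() stands in for NEWID(), and window functions need SQLite 3.25 or newer. On SQL Server you would put NEWID() back.

```python
import sqlite3

# In-memory SQLite stand-in for the employees table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INT, name VARCHAR(10), gender VARCHAR(1));
INSERT INTO employees VALUES
    (1, 'Mary', 'F'), (2, 'Jake', 'M'), (3, 'Ryan', 'M'), (4, 'Lola', 'F'),
    (5, 'Dina', 'F'), (6, 'Paul', 'M'), (7, 'Tina', 'F'), (8, 'John', 'M');
""")

# Number the rows randomly within each gender, then keep the first two of each.
picked = conn.execute("""
SELECT id, name, gender
FROM (
    SELECT e.*,
           ROW_NUMBER() OVER (PARTITION BY gender ORDER BY RANDOM()) AS rn
    FROM employees e
) ranked
WHERE rn <= 2
""").fetchall()

for row in picked:
    print(row)
```

Each run should return two 'F' rows and two 'M' rows, but which ones will vary.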

How to get unique records from 3 tables

I have 3 tables and I am trying to get unique results from all 3 tables (including other columns from each table).
I have tried union approach but that approach only works when I have single column selected from each table.
As soon as I want another corresponding column value from each table, I don't get unique values for the field I am trying to get.
Sample Database and query available here as well: http://www.sqlfiddle.com/#!18/1b9a6/10
Here are the example tables I have created.
CREATE TABLE TABLEA
(
id int,
city varchar(6)
);
INSERT INTO TABLEA ([id], [city])
VALUES
(1, 'A'),
(2, 'B'),
(3, 'C');
CREATE TABLE TABLEB
(
id int,
city varchar(6)
);
INSERT INTO TABLEB ([id], [city])
VALUES
(1, 'B'),
(2, 'C'),
(3, 'D');
CREATE TABLE TABLEC
(
id int,
city varchar(6)
);
INSERT INTO TABLEC ([id], [city])
VALUES
(1, 'C'),
(2, 'D'),
(2, 'E');
Desired result:
A,B,C,D,E
Unique city from all 3 table combined. By unique, I am referring to DISTINCT city from the combination of all 3 tables. Yes, the id is different for common values between tables but it doesn't matter in my use-case if id is coming from table A, B OR C, as long as I am getting DISTINCT (aka UNIQUE) city across all 3 tables.
I tried this query but no luck (city B is missing in the output):
SELECT city, id
FROM
(SELECT city, id
FROM TABLEA
WHERE city NOT IN (SELECT city FROM TABLEB
UNION
SELECT city FROM TABLEC)
UNION
SELECT city, id
FROM TABLEB
WHERE city NOT IN (SELECT city FROM TABLEA
UNION
SELECT city FROM TABLEC)
UNION
SELECT city, id
FROM TABLEC) AS mytable
Try this. It should give you each distinct city together with the lowest id it appears under:
select distinct min(id) over(partition by city) id, city from (
select * from TABLEA
union all
select * from TABLEB
union all
select * from TABLEC ) uni
You had the right idea. Just wrap the UNION results in a subquery or CTE and then apply the DISTINCT:
WITH TABLEE AS (
SELECT city, id FROM TABLEA
UNION
SELECT city, id FROM TABLEB
UNION
SELECT city, id FROM TABLEC
)
SELECT DISTINCT city
FROM TABLEE
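If you want one row per city and still want an id next to it, a plain GROUP BY over the combined tables is another way to write it. This is only a sketch, run on an in-memory SQLite copy of the sample tables (SQLite is an assumption here, the query is ordinary SQL), and "lowest id wins" is an arbitrary choice on my part.

```python
import sqlite3

# In-memory SQLite stand-in for the three sample tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLEA (id INT, city VARCHAR(6));
CREATE TABLE TABLEB (id INT, city VARCHAR(6));
CREATE TABLE TABLEC (id INT, city VARCHAR(6));
INSERT INTO TABLEA VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO TABLEB VALUES (1, 'B'), (2, 'C'), (3, 'D');
INSERT INTO TABLEC VALUES (1, 'C'), (2, 'D'), (2, 'E');
""")

# Stack all rows with UNION ALL, then collapse to one row per city,
# keeping the smallest id seen for that city in any of the tables.
cities = conn.execute("""
SELECT city, MIN(id) AS id
FROM (
    SELECT id, city FROM TABLEA
    UNION ALL
    SELECT id, city FROM TABLEB
    UNION ALL
    SELECT id, city FROM TABLEC
) combined
GROUP BY city
ORDER BY city
""").fetchall()

print(cities)  # [('A', 1), ('B', 1), ('C', 1), ('D', 2), ('E', 2)]
```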

How do I count a Date Column based on another Date Column?

What I'm trying to achieve is to count the login attempts of a user based on the LoginAttempts value and on the LastLoginDate. For example, I need to query the LastLoginDate within 30 days with 2 Loginattempts.
result should be:
What I have is this. I created a temp table to pull the information, and it doesn't seem to be counting correctly. Here's where I'm stuck. Any help would be appreciated!
Your GROUP BY is incorrect. It includes LastLoginDateUTC. Consequently, you're counting logins per date, not logins per 30 days.
Drop LastLoginDateUTC from your GROUP BY, and change the SELECT clause to use max(LastLoginDateUTC) as LastLoginDateUTC. That should give you what you want.
Not sure if I understand your request completely, but here's something outside the box. Good luck!:
select FirstName, StudentID, UserName, LastLoginDate, LoginAttempts,
RowNumber = row_number() over(partition by StudentID order by StudentID, LoginAttempts)
into #temp1
from [user]
where cast(LastLoginDate as date) >= getdate()- 30
--should return all rows of data for past 30 days.
--should also rank each loginAttempt by student
select * from #temp1
where RowNumber >= 3
This is not an answer, more of a suggestion.
As I see it, you have students and you need to audit login attempts.
To me this is a one-to-many relation.
So I would keep the Student table, stripped of any login-related data.
This would simplify your Student table, keep fewer rows in it, and save storage space by not repeating the same data (name, username, etc.) over and over again.
The login data would go in a StudentLoginAttempts table, something like:
Create Table StudentLoginAttempts (
Id int not null identity(1,1),
StudentId int not null,
LoginDate datetime not null,
Successful bit not null,
Constraint PK_StudentLoginAttempts Primary Key Clustered (Id),
Constraint FK_StudentLoginAttempts_Student Foreign Key (StudentId) References Student(StudentId)
)
go
Create Index IX_StudentLoginAttempts_StudentId On StudentLoginAttempts(StudentId)
go
Create Index IX_StudentLoginAttempts_LoginDate On StudentLoginAttempts(LoginDate)
go
That way things are clearer and you can get more information out of the data.
Consider the example below:
Create Table #Student (
StudentId int not null identity(1,1),
Username varchar(50) not null,
FirstName varchar(50) not null
)
Create Table #StudentLoginAttempts (
Id int not null identity(1,1),
StudentId int not null,
LoginDate datetime not null,
Successful bit not null
)
insert into #Student values
( 'Student001', 'JON' ),
( 'Student002', 'STEVE' )
insert into #StudentLoginAttempts values
( 1, '2016-01-01 09:12', 0 ),
( 1, '2016-02-01 09:12', 0 ),
( 1, '2016-03-01 09:12', 1 ),
( 2, '2016-03-02 10:12', 0 ),
( 2, '2016-04-02 10:12', 1 ),
( 2, '2016-05-02 10:12', 0 )
;with TotalAttemptsCte as (
select StudentId, TotalLoginAttempts = count(*) from #StudentLoginAttempts group by StudentId
),
FailedCte as (
select StudentId, FailedLogins = count(*) from #StudentLoginAttempts where ( Successful = 0 ) group by StudentId
),
SuccessfulCte as (
select StudentId, SuccessfulLogins = count(*) from #StudentLoginAttempts where ( Successful = 1 ) group by StudentId
),
LastSuccessFulDateCte as (
select StudentId, max(LoginDate) as LastSuccessfulLoginDate
from
#StudentLoginAttempts
where
( Successful = 1 )
group by StudentId
)
select
a.*, b.TotalLoginAttempts, c.FailedLogins, d.SuccessfulLogins, e.LastSuccessfulLoginDate
from
#Student a
left join TotalAttemptsCte b on ( a.StudentId = b.StudentId )
left join FailedCte c on ( a.StudentId = c.StudentId )
left join SuccessfulCte d on ( a.StudentId = d.StudentId )
left join LastSuccessFulDateCte e on ( a.StudentId = e.StudentId )
Drop Table #StudentLoginAttempts
Drop Table #Student
You could also create a view based on the query for easier access.
Again, this is just a suggestion of how I would do it.
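For what it's worth, the four CTEs can probably be folded into a single pass with conditional aggregation. The sketch below is an assumption-laden illustration: it uses an in-memory SQLite database instead of SQL Server temp tables, stores LoginDate as ISO-formatted text, and drops the # prefixes from the table names.

```python
import sqlite3

# In-memory SQLite stand-in for the #Student / #StudentLoginAttempts temp tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (
    StudentId INTEGER PRIMARY KEY,
    Username TEXT NOT NULL,
    FirstName TEXT NOT NULL
);
CREATE TABLE StudentLoginAttempts (
    Id INTEGER PRIMARY KEY,
    StudentId INT NOT NULL REFERENCES Student(StudentId),
    LoginDate TEXT NOT NULL,      -- ISO text, so MAX() sorts chronologically
    Successful INT NOT NULL
);
INSERT INTO Student (Username, FirstName) VALUES
    ('Student001', 'JON'), ('Student002', 'STEVE');
INSERT INTO StudentLoginAttempts (StudentId, LoginDate, Successful) VALUES
    (1, '2016-01-01 09:12', 0), (1, '2016-02-01 09:12', 0), (1, '2016-03-01 09:12', 1),
    (2, '2016-03-02 10:12', 0), (2, '2016-04-02 10:12', 1), (2, '2016-05-02 10:12', 0);
""")

# One scan of the attempts table: each CASE counts only the rows it cares about.
summary = conn.execute("""
SELECT s.StudentId, s.Username,
       COUNT(a.Id) AS TotalLoginAttempts,
       SUM(CASE WHEN a.Successful = 0 THEN 1 ELSE 0 END) AS FailedLogins,
       SUM(CASE WHEN a.Successful = 1 THEN 1 ELSE 0 END) AS SuccessfulLogins,
       MAX(CASE WHEN a.Successful = 1 THEN a.LoginDate END) AS LastSuccessfulLoginDate
FROM Student s
LEFT JOIN StudentLoginAttempts a ON a.StudentId = s.StudentId
GROUP BY s.StudentId, s.Username
ORDER BY s.StudentId
""").fetchall()

for row in summary:
    print(row)
```

One difference from the CTE version: a student with no attempts at all gets 0 in the count columns here rather than NULL.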

using exists and having count to find rows with highest values by id

I want to select the top 3 rows from a songwriter without using top or max and filtering on name.
the table:
CREATE TABLE Person (Name VARCHAR(10), song VARCHAR(10), length INT )
INSERT INTO Person
values
('Jim', 'songA', 8),
('Jim', 'songB', 5),
('Jim', 'songC', 7),
('Jim', 'songD', 4),
('Jimsky', 'songE', 8),
('Jim', 'songF', 6);
the query:
SELECT
p1.Name,
p1.song,
p1.length
FROM Person p1
WHERE EXISTS
(
SELECT *
FROM Person p2
WHERE p2.length < p1.length
AND p1.Name = 'Jim'
)
How can I select the top 3 or top 2 rows for the songwriter Jim without top/max, using having count?
Thanks
I would suggest you:
SELECT
p1.Name,
p1.song,
p1.length
FROM Person p1
WHERE EXISTS
(
SELECT *
FROM Person p2
WHERE p2.length < p1.length
AND p1.Name = 'Jim'
)
ORDER BY <SomeRow> DESC LIMIT 3
but I don't understand your context, so can you explain a bit more what you are going to achieve?
You would do this using row_number(), assuming you want the three longest songs:
select Name, song, length
from (select p.Name, p.song, p.length,
row_number() over (partition by p.name order by p.length desc) as seqnum
from person p
) p
where seqnum <= 3;
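Since the question asked for something with HAVING COUNT and no TOP or MAX, here is one way that might look: self-join each song to the longer songs by the same writer and keep the ones with fewer than three longer songs. It is a sketch run on an in-memory SQLite copy of the table (SQLite is my assumption, the SQL is generic).

```python
import sqlite3

# In-memory SQLite stand-in for the Person table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (Name VARCHAR(10), song VARCHAR(10), length INT);
INSERT INTO Person VALUES
    ('Jim', 'songA', 8), ('Jim', 'songB', 5), ('Jim', 'songC', 7),
    ('Jim', 'songD', 4), ('Jimsky', 'songE', 8), ('Jim', 'songF', 6);
""")

# p2 holds the songs by the same writer that are longer than p1's song.
# A song is in the top 3 when fewer than 3 such longer songs exist.
longest = conn.execute("""
SELECT p1.Name, p1.song, p1.length
FROM Person p1
LEFT JOIN Person p2
       ON p2.Name = p1.Name
      AND p2.length > p1.length
WHERE p1.Name = 'Jim'
GROUP BY p1.Name, p1.song, p1.length
HAVING COUNT(p2.song) < 3
ORDER BY p1.length DESC
""").fetchall()

print(longest)  # [('Jim', 'songA', 8), ('Jim', 'songC', 7), ('Jim', 'songF', 6)]
```

Change the 3 to 2 for the top two. Songs tied on length are all kept, so this can return more than three rows.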

how to insert many records excluding some

I want to create a table with a subset of records from a master table.
For example, I have:
id name code
1 peter 73
2 carl 84
3 jack 73
I want to store peter and carl but not jack, because jack has the same code as peter.
I need high performance because I have 20M records.
I tried this:
SELECT id, name, DISTINCT(code) INTO new_tab
FROM old_tab
WHERE (conditions)
but it doesn't work.
Assuming you want to pick the row with the maximum id per code, then this should do it:
insert into new_tab (id, name, code)
SELECT id, name, code
FROM
(
SELECT id, name, code, rank() OVER (PARTITION BY code ORDER BY id DESC) as rnk
FROM old_tab
) ranked
WHERE rnk = 1
and for the minimum id per code, just change the sort order in the rank from DESC to ASC:
insert into new_tab (id, name, code)
SELECT id, name, code
FROM
(
SELECT id, name, code, rank() OVER (PARTITION BY code ORDER BY id ASC) as rnk
FROM old_tab
) ranked
WHERE rnk = 1
Using a derived table, you can find the minimum id for each code, then join back to that in the outer query to get the rest of the columns for that id from old_tab.
insert into new_tab (id, name, code)
select t.id, t.name, t.code
from old_tab t inner join
(SELECT min(id) as minId, code
from old_tab group by code) x
on t.id = x.minId
WHERE (conditions)
Try this:
CREATE TABLE #Temp
(
ID INT,
Name VARCHAR(50),
Code INT
)
INSERT #Temp VALUES (1, 'Peter', 73)
INSERT #Temp VALUES (2, 'Carl', 84)
INSERT #Temp VALUES (3, 'Jack', 73)
SELECT t2.ID, t2.Name, t2.Code
FROM #Temp t2
JOIN (
SELECT t.Code, MIN(t.ID) ID
FROM #temp t
JOIN (
SELECT DISTINCT Code
FROM #Temp
) d
ON t.Code = d.Code
GROUP BY t.Code
) b
ON t2.ID = b.ID
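To tie the pieces together, here is a small end-to-end sketch that actually fills new_tab with the lowest id per code. It runs on an in-memory SQLite database for illustration only. The index on (code, id) and its name are my own assumptions about what might help at 20M rows; nothing here was measured.

```python
import sqlite3

# In-memory SQLite stand-in for old_tab and new_tab.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE old_tab (id INT PRIMARY KEY, name TEXT, code INT);
CREATE TABLE new_tab (id INT PRIMARY KEY, name TEXT, code INT);
INSERT INTO old_tab VALUES (1, 'peter', 73), (2, 'carl', 84), (3, 'jack', 73);

-- Hypothetical index: lets MIN(id) per code be read from the index.
CREATE INDEX ix_old_tab_code_id ON old_tab (code, id);

-- Keep only the row with the smallest id for each code.
INSERT INTO new_tab (id, name, code)
SELECT t.id, t.name, t.code
FROM old_tab t
JOIN (
    SELECT MIN(id) AS min_id
    FROM old_tab
    GROUP BY code
) x ON t.id = x.min_id;
""")

kept = conn.execute("SELECT id, name, code FROM new_tab ORDER BY id").fetchall()
print(kept)  # [(1, 'peter', 73), (2, 'carl', 84)]
```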