How to group by one column, aggregate by another column, and get a third column as the result in PostgreSQL?

This seems simple, but I couldn't find an answer after searching for the last few hours.
I have a table request_state where id is the primary key, and multiple rows can share the same state_id. For each state_id, I want the id of the row with the latest date_time.
I tried this, but it fails with the error "state_id" must appear in the GROUP BY clause or be used in an aggregate function:
select id, state_id, max(date_time)
from request_state
group by id
But when I use the following query, I get multiple rows with the same state_id:
select id, state_id, max(date_time)
from request_state
group by id, state_id
My table:
id state_id date_time
cef 1 Jan 1
ter 1 Jan 2
ijk 1 Jan 3
uuu 2 Feb 1
rrr 2 Feb 2
This is the result I want:
id state_id date_time
__ ________ _________
ijk 1 Jan 3
rrr 2 Feb 2

You seem to want:
select max(id) as id, state_id, max(date_time)
from request_state
group by state_id;
If you want the row where date_time is maximum for each state, then use distinct on:
select distinct on (state_id) rs.*
from request_state rs
order by state_id, date_time desc;
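As a quick check of the distinct on approach, here is a minimal PostgreSQL sketch that rebuilds the question's sample table and runs the query. The column types and the year 2024 are assumptions, since the question only shows values like "Jan 1".

```sql
-- Assumed schema: the question gives column names but not types.
CREATE TEMP TABLE request_state (
    id        text PRIMARY KEY,
    state_id  integer NOT NULL,
    date_time date    NOT NULL
);

-- Sample rows from the question; the year is an assumption.
INSERT INTO request_state (id, state_id, date_time) VALUES
    ('cef', 1, DATE '2024-01-01'),
    ('ter', 1, DATE '2024-01-02'),
    ('ijk', 1, DATE '2024-01-03'),
    ('uuu', 2, DATE '2024-02-01'),
    ('rrr', 2, DATE '2024-02-02');

-- DISTINCT ON keeps the first row of each state_id group, and the
-- ORDER BY puts the latest date_time first within each group.
SELECT DISTINCT ON (state_id) id, state_id, date_time
FROM request_state
ORDER BY state_id, date_time DESC;
```

With this data the query returns ijk for state 1 and rrr for state 2. If two rows of a state can share the same date_time, adding id as a final ORDER BY key makes the choice deterministic.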

Try this query:
select id, state_id, date_time from (
select id, state_id, date_time,
row_number() over (partition by state_id order by date_time desc) rn
from request_state
) a where rn = 1

You can use a correlated subquery:
select t.*
from request_state t
where date_time = (select max(date_time) from request_state t1 where t1.state_id = t.state_id);

Related

How to select unique records by ORACLE

When I run "SELECT * FROM table", I get results like the following:
ID Date Time Type
----------------------------------
60 03/03/2013 8:55:00 AM 1
60 03/03/2013 2:10:00 PM 2
110 17/03/2013 9:15:00 AM 1
67 24/03/2013 9:00:00 AM 1
67 24/03/2013 3:05:00 PM 2
As you can see, each ID has a transaction of Type 1 and Type 2 on the same Date,
except ID 110, which has only Type 1.
So how can I get just a result like this:
ID Date Time Type
----------------------------------
110 17/03/2013 9:15:00 AM 1
so that only one record is returned from the first result set?
Change the partition definition (partition by id,date) according to your needs:
select *
from (select t.*
,count(*) over (partition by id,date) as cnt
from mytable t
) t
where t.cnt = 1
;
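To see the COUNT(*) OVER approach on the question's data, here is a minimal Oracle sketch. The table layout and the names txn_time and txn_type are hypothetical: the question shows separate Date, Time and Type columns, and this sketch assumes a single Oracle DATE column holding both date and time.

```sql
-- Hypothetical column names; DATE is avoided as an identifier
-- because it is a reserved word in Oracle.
CREATE TABLE mytable (
    id        NUMBER,
    txn_time  DATE,      -- an Oracle DATE carries both date and time
    txn_type  NUMBER
);

INSERT INTO mytable VALUES (60,  TO_DATE('03/03/2013 08:55', 'DD/MM/YYYY HH24:MI'), 1);
INSERT INTO mytable VALUES (60,  TO_DATE('03/03/2013 14:10', 'DD/MM/YYYY HH24:MI'), 2);
INSERT INTO mytable VALUES (110, TO_DATE('17/03/2013 09:15', 'DD/MM/YYYY HH24:MI'), 1);
INSERT INTO mytable VALUES (67,  TO_DATE('24/03/2013 09:00', 'DD/MM/YYYY HH24:MI'), 1);
INSERT INTO mytable VALUES (67,  TO_DATE('24/03/2013 15:05', 'DD/MM/YYYY HH24:MI'), 2);

-- Count the rows per (id, calendar day) and keep only the rows
-- whose group contains exactly one row.
SELECT id, txn_time, txn_type
FROM (SELECT t.*,
             COUNT(*) OVER (PARTITION BY id, TRUNC(txn_time)) AS cnt
      FROM mytable t)
WHERE cnt = 1;
```

Only the row for ID 110 comes back, because IDs 60 and 67 each have two rows on the same day. TRUNC strips the time part so that the partition is per day rather than per timestamp.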
You can use this:
select * from my_table t
where exists (
select 1 from my_table
where id = t.id
group by id
having count(*) = 1
)
If you want only type 1, then compare the minimum and maximum values. I prefer window functions:
select t.*
from (select t.*, min(type) over (partition by id) as mintype,
max(type) over (partition by id) as maxtype
from t
) t
where mintype = maxtype and mintype = 1;
If you want only records of the same type (and not specifically type = 1), then remove that condition.
If you want only records on the same day, then include the date in the partition by.
Under some circumstances, not exists can be faster:
select t.*
from t
where not exists (select 1 from t t2 where t2.id = t.id and t2.type <> 1);

Group BY Having COUNT, but Order on a column not contained in group

I have a table from which I need the ID of the group (grouped by ID and Name) that has COUNT(*) = 3 and the latest set of timestamps.
In the example below, I want to retrieve ID 2, since it has 3 rows and the latest timestamps (ID 3 has the latest timestamps overall, but it doesn't have a count of 3).
But I don't understand how to order by Date, since I can't include it in the GROUP BY clause because its values differ within a group:
SELECT TOP 1 ID
FROM TABLE
GROUP BY ID,Name
HAVING COUNT(ID) > 2
AND Name = 'ABC'
--ORDER BY Date DESC
Sample Data
ID Name Date
1 ABC 2015-05-27 08:00
1 ABC 2015-05-27 09:00
1 ABC 2015-05-27 10:00
2 ABC 2015-05-27 11:00
2 ABC 2015-05-27 12:00
2 ABC 2015-05-27 13:00
3 ABC 2015-05-27 14:00
3 ABC 2015-05-27 15:00
In SQL Server, you need to aggregate the columns that are not in the GROUP BY list:
SELECT TOP 1 ID
FROM TABLE
WHERE Name = 'ABC'
GROUP BY ID,Name
HAVING COUNT(ID) > 2
ORDER BY MAX(Date) DESC
The name filter should be put before the group by for better performance, if you really need it.
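Here is a minimal T-SQL sketch of that query run against the question's sample data. The temp table name #events is hypothetical, standing in for the question's placeholder TABLE, and the column types are assumptions.

```sql
-- Hypothetical temp table with assumed column types.
CREATE TABLE #events (ID int, Name varchar(10), [Date] datetime);

INSERT INTO #events (ID, Name, [Date]) VALUES
    (1, 'ABC', '2015-05-27T08:00:00'),
    (1, 'ABC', '2015-05-27T09:00:00'),
    (1, 'ABC', '2015-05-27T10:00:00'),
    (2, 'ABC', '2015-05-27T11:00:00'),
    (2, 'ABC', '2015-05-27T12:00:00'),
    (2, 'ABC', '2015-05-27T13:00:00'),
    (3, 'ABC', '2015-05-27T14:00:00'),
    (3, 'ABC', '2015-05-27T15:00:00');

-- Groups with fewer than 3 rows are dropped by HAVING, and the
-- remaining groups are ordered by their latest timestamp.
SELECT TOP 1 ID
FROM #events
WHERE Name = 'ABC'
GROUP BY ID, Name
HAVING COUNT(*) > 2
ORDER BY MAX([Date]) DESC;
```

ID 3 is filtered out because it has only two rows, so the query returns ID 2, whose latest timestamp (13:00) beats ID 1's (10:00).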
You could do it in a nested query.
Subquery:
SELECT ID
from TABLE
GROUP BY ID
HAVING Count(ID) > 2
That gives you the IDs you want. Put that in another query:
SELECT ID, Date
FROM Table
Where ID in (Subquery)
Order by Date DESC;
First get all desired IDs, that is, all IDs having a count > 2. Get the maximum date for each such ID. Then rank these records with ROW_NUMBER, giving the latest ID rank #1. Finally, remove all IDs that are not ranked #1.
select name, id
from
(
select
name, id, row_number() over (partition by name order by max_date desc) as rn
from
(
select name, id, max(date) as max_date
from mytable
--where name = 'ABC'
group by name, id
having count(*) > 2
) wanted_ids
) ranked_ids
where rn = 1;

Add a column with the max value of the group

I want to add an extra column showing the max value of each group (ID).
Here is what the table looks like:
select ID, VALUE from mytable
ID VALUE
1 4
1 1
1 7
2 2
2 5
3 7
3 3
Here is the result I want to get:
ID VALUE max_values
1 4 7
1 1 7
1 7 7
2 2 5
2 5 5
3 7 7
3 3 7
Thank you for your help in advance!
Your previous questions indicate that you are using SQL Server, in which case you can use window functions:
SELECT ID,
Value,
MaxValue = MAX(Value) OVER(PARTITION BY ID)
FROM mytable;
Based on your comment on another answer about first summing value, you may need to use a subquery to actually get this:
SELECT ID,
Date,
Value,
MaxValue = MAX(Value) OVER(PARTITION BY ID)
FROM ( SELECT ID, Date, Value = SUM(Value)
FROM mytable
GROUP BY ID, Date
) AS t;
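The simpler window-function form can be checked against the question's sample data. This is a minimal T-SQL sketch; the temp table #mytable and its column types are assumptions.

```sql
-- Hypothetical temp table holding the question's sample rows.
CREATE TABLE #mytable (ID int, Value int);

INSERT INTO #mytable (ID, Value) VALUES
    (1, 4), (1, 1), (1, 7),
    (2, 2), (2, 5),
    (3, 7), (3, 3);

-- MAX(...) OVER (PARTITION BY ID) computes the maximum per ID
-- without collapsing rows, so every input row is kept.
SELECT ID,
       Value,
       MAX(Value) OVER (PARTITION BY ID) AS max_values
FROM #mytable
ORDER BY ID;
```

Every row of ID 1 gets 7, every row of ID 2 gets 5, and every row of ID 3 gets 7, matching the desired output in the question.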
There is no need to use GROUP BY in subselect.
select ID, VALUE,
(select MAX(VALUE) from mytable where ID = t.ID) as MaxValue
from mytable t
Use this query.
SELECT ID
,value
,(
SELECT MAX(VALUE)
FROM GetMaxValue gmv
WHERE gmv.ID = gmv1.ID
GROUP BY ID
) as max_value
FROM GetMaxValue gmv1
ORDER BY ID
Try it with a sub select and group by, then grab the MAX of this group:
select
ID,
VALUE,
(select MAX(VALUE)
from mytable
group by ID
having ID = t.ID
) as max_values
from mytable t
Edit:
I built a SQL fiddle, which shows that my solution works, but VDohnal is also correct that the group by isn't needed, so I'll upvote his answer.

Select top row based on grouping

I think this is a common situation, but I can't work out the logic.
I have a table as follows.
PersonID SchoolID EndDate
-------- -------- -------
1 ABC 2013
1 DEF 2014
1 GHI 2010
2 XYZ 2013
2 UVW 2011
I want the following output
PersonID SchoolID EndDate
-------- -------- -------
1 DEF 2014
2 XYZ 2013
Basically, I want the latest school for each person. So I tried something like:
SELECT SchoolID, PersonID,EndDate FROM tbl
GROUP BY PersonID
HAVING EndDate = MAX(ENDDATE)
ORDER BY EndDate DESC
But I got an error saying EndDate is invalid in the HAVING clause because it is not contained in an aggregate function or the GROUP BY clause.
I also tried this:
SELECT SchoolID, PersonID,MAX(EndDate) FROM tbl
GROUP BY PersonID
ORDER BY EndDate DESC
This time I get an error saying SchoolID is invalid in the select list, for the same reason.
What am I missing here?
with cte as (SELECT *,
ROW_NUMBER() OVER(PARTITION BY PersonID
ORDER BY EndDate DESC) AS RN
FROM Table1)
select PersonId, SchoolId, EndDate from cte
where RN = 1
You have to wrap MAX(EndDate) in a subquery.
SELECT SchoolID, PersonID, EndDate
FROM table1 t
WHERE EndDate =
(SELECT MAX(EndDate) FROM table1
WHERE PersonID = t.PersonID);
Note: this will give multiple rows for one PersonID if there are multiple dates tied for the max.
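If exactly one row per person is needed even when dates tie, one option is ROW_NUMBER with an explicit tiebreaker. This is a minimal T-SQL sketch, not part of the original answer: the temp table #tbl, the column types, the extra RST row (added only to create a tie) and the choice of SchoolID as tiebreaker are all assumptions.

```sql
-- Hypothetical temp table with the question's rows plus one
-- invented row (2, 'RST', 2013) that ties with (2, 'XYZ', 2013).
CREATE TABLE #tbl (PersonID int, SchoolID varchar(10), EndDate int);

INSERT INTO #tbl (PersonID, SchoolID, EndDate) VALUES
    (1, 'ABC', 2013),
    (1, 'DEF', 2014),
    (1, 'GHI', 2010),
    (2, 'XYZ', 2013),
    (2, 'UVW', 2011),
    (2, 'RST', 2013);

-- ROW_NUMBER assigns exactly one rn = 1 per PersonID; SchoolID is
-- an arbitrary secondary sort key that breaks ties on EndDate.
SELECT PersonID, SchoolID, EndDate
FROM (SELECT PersonID, SchoolID, EndDate,
             ROW_NUMBER() OVER (PARTITION BY PersonID
                                ORDER BY EndDate DESC, SchoolID) AS rn
      FROM #tbl) ranked
WHERE rn = 1;
```

Person 1 yields DEF (2014). Person 2 has two rows tied at 2013, and the ascending SchoolID tiebreaker picks RST over XYZ. Replacing ROW_NUMBER with RANK would return both tied rows, like the MAX subquery does.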
with temp as
(
SELECT PersonID, MAX(EndDate) as enddate FROM tbl
GROUP BY PersonID
)
select tbl.* from tbl inner join temp on tbl.personid=temp.personid
and tbl.enddate=temp.enddate;

selecting a column on different order from sub query

I have the following table structure. I want to select each distinct user_id per office_id with the latest login_datetime.
tbl_id user_id office_id login_datetime
----------------------------------------
1 2 28 12/28/2012 5:35:50 AM
2 2 15 12/28/2012 5:35:50 AM
3 3 20 12/28/2012 5:35:50 AM
4 4 28 12/28/2012 5:35:50 AM
5 2 28 12/28/2012 5:35:50 AM
6 4 15 12/28/2012 5:35:50 AM
7 3 20 12/28/2012 5:35:50 AM
I tried this:
SELECT user_id as u_id,office_id,
(select login_datetime from tbl t2 where t2.user_id=u_id AND ROWNUM=1 ORDER BY t2.tbl_id DESC ) as LAST_LOGIN
FROM tbl
GROUP BY user_id,office_id
But it's not working for me. Any help?
Use a window function:
SELECT tbl_id, user_id, office_id,login_datetime
FROM
(
SELECT tbl_id, user_id, office_id,login_datetime,
ROW_NUMBER() OVER (PARTITION BY user_id, office_id
ORDER BY login_datetime DESC) rn
FROM tableName
) a
WHERE a.rn = 1
Another solution is a direct group by with keep dense_rank:
select user_id, office_id,
max(login_datetime) keep (dense_rank first order by login_datetime desc) as latest_login_datetime
from tbl
group by user_id, office_id
or if you want unique user_id:
select user_id,
max(office_id) keep (dense_rank first order by login_datetime desc) as latest_office_id,
max(login_datetime) keep (dense_rank first order by login_datetime desc) as latest_login_datetime
from tbl
group by user_id
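To illustrate what keep (dense_rank first ...) does, here is a minimal Oracle sketch. The question's sample rows all share one timestamp, so the three rows and their login times below are invented for illustration, and the column types are assumptions.

```sql
-- Hypothetical data: user 2 logs in at office 28 first and at
-- office 15 a day later; user 3 has a single login.
CREATE TABLE tbl (
    tbl_id         NUMBER,
    user_id        NUMBER,
    office_id      NUMBER,
    login_datetime DATE
);

INSERT INTO tbl VALUES (1, 2, 28, TO_DATE('2012-12-28 05:35', 'YYYY-MM-DD HH24:MI'));
INSERT INTO tbl VALUES (2, 2, 15, TO_DATE('2012-12-29 06:00', 'YYYY-MM-DD HH24:MI'));
INSERT INTO tbl VALUES (3, 3, 20, TO_DATE('2012-12-28 05:35', 'YYYY-MM-DD HH24:MI'));

-- KEEP (DENSE_RANK FIRST ORDER BY login_datetime DESC) restricts
-- MAX(office_id) to the rows with the latest login_datetime in
-- each group, so MAX only matters when several rows tie there.
SELECT user_id,
       MAX(office_id) KEEP (DENSE_RANK FIRST ORDER BY login_datetime DESC) AS latest_office_id,
       MAX(login_datetime) AS latest_login_datetime
FROM tbl
GROUP BY user_id;
```

For user 2 this returns office 15, the office of the latest login, even though 28 is the larger office_id. A plain MAX(office_id) would have returned 28.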
OK, I have changed the example to Oracle.
Here is an explanation of the query.
First I select each distinct user_id and office_id pair (so a user who belongs to two offices is returned twice),
along with MAX login_datetime to get the latest datetime.
Then, in the WHERE clause, I filter by office_id against the distinct list of offices (basically a distinct table of office_id).
Finally, I group by user_id and office_id because of the MAX function.
SELECT
DISTINCT "user_id" ,
"office_id",
MAX("login_datetime")
FROM TableName
WHERE "office_id" IN (SELECT DISTINCT "office_id" FROM TableName)
GROUP BY
"user_id",
"office_id"