How to group by one column, aggregate by another column and get another column as result in postgresql?


This seems like something simple, but I couldn't find an answer to this question over the last few hours.
I have a table request_state, where "id" is the primary key; it can have multiple entries with the same state_id. I want to get the id of the row with the latest datetime within each state_id group.
So I tried this, but it gives the error "state_id" must appear in the GROUP BY clause or be used in an aggregate function:
select id, state_id, max(datetime)
from request_state
group by id
But when I use the following query, I get multiple entries with the same state_id:
select id, state_id, max(datetime)
from request_state
group by id, state_id
My table:
id state_id date_time
cef 1 Jan 1
ter 1 Jan 2
ijk 1 Jan 3
uuu 2 Feb 1
rrr 2 Feb 2
This is the result I want:
id state_id date_time
__ ________ _________
ijk 1 Jan 3
rrr 2 Feb 2

You seem to want:
select max(id) as id, state_id, max(datetime)
from request_state
group by state_id;
If you want the row where datetime is maximum for each state, then use distinct on:
select distinct on (state_id) rs.*
from request_state rs
order by state_id, datetime desc;

Try this query:
select id, state_id, date_time from (
select id, state_id, date_time,
row_number() over (partition by state_id order by date_time desc) rn
from request_state
) a where rn = 1

You can use a correlated subquery:
select t.*
from request_state t
where date_time = (select max(date_time) from request_state t1 where t1.state_id = t.state_id);
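
A minimal sketch of the correlated-subquery approach, run against the sample rows from the question. It uses Python's sqlite3 module only as a convenient stand-in for PostgreSQL; the ISO date strings are an assumption, since the question shows only "Jan 1" and similar.

```python
import sqlite3

# In-memory stand-in for the request_state table from the question.
# ISO date strings are an assumption; the question only shows "Jan 1" etc.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE request_state (id TEXT PRIMARY KEY, state_id INTEGER, date_time TEXT)"
)
conn.executemany(
    "INSERT INTO request_state VALUES (?, ?, ?)",
    [
        ("cef", 1, "2024-01-01"),
        ("ter", 1, "2024-01-02"),
        ("ijk", 1, "2024-01-03"),
        ("uuu", 2, "2024-02-01"),
        ("rrr", 2, "2024-02-02"),
    ],
)

# Keep each row whose date_time equals the maximum for its state_id.
latest_rows = conn.execute(
    """
    SELECT t.id, t.state_id, t.date_time
    FROM request_state t
    WHERE t.date_time = (SELECT MAX(t1.date_time)
                         FROM request_state t1
                         WHERE t1.state_id = t.state_id)
    ORDER BY t.state_id
    """
).fetchall()

for row in latest_rows:
    print(row)
```

With this data the query keeps the rows for ijk and rrr, which matches the result the question asks for.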

Related

How to select unique records by ORACLE

When I run "SELECT * FROM table" I get results like the ones below:
ID Date Time Type
----------------------------------
60 03/03/2013 8:55:00 AM 1
60 03/03/2013 2:10:00 PM 2
110 17/03/2013 9:15:00 AM 1
67 24/03/2013 9:00:00 AM 1
67 24/03/2013 3:05:00 PM 2
As you can see, each ID has a transaction of Type 1 and Type 2 on the same Date,
except ID 110, which has only Type 1.
So how can I get just a result like this:
ID Date Time Type
----------------------------------
110 17/03/2013 9:15:00 AM 1
so that only this one record is returned from the first result?

Change the partition definition (partition by id,date) according to your needs:
select *
from (select t.*
,count(*) over (partition by id,date) as cnt
from mytable t
) t
where t.cnt = 1
;
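
As a rough illustration of the count(*) over (partition by ...) idea, here is a sketch using Python's sqlite3 module in place of Oracle. The column names txn_date and txn_time are hypothetical stand-ins for the question's Date and Time columns, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# txn_date / txn_time are hypothetical names for the question's Date / Time columns.
conn.execute(
    "CREATE TABLE mytable (id INTEGER, txn_date TEXT, txn_time TEXT, type INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (60, "2013-03-03", "08:55", 1),
        (60, "2013-03-03", "14:10", 2),
        (110, "2013-03-17", "09:15", 1),
        (67, "2013-03-24", "09:00", 1),
        (67, "2013-03-24", "15:05", 2),
    ],
)

# Attach the size of each (id, txn_date) group to every row,
# then keep only the rows whose group has exactly one member.
single_rows = conn.execute(
    """
    SELECT id, txn_date, txn_time, type
    FROM (SELECT t.*,
                 COUNT(*) OVER (PARTITION BY id, txn_date) AS cnt
          FROM mytable t) counted
    WHERE cnt = 1
    """
).fetchall()

print(single_rows)
```

Only the row for ID 110 survives, because every other (id, date) pair appears twice.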

You can use this:
select * from my_table t
where exists (
select 1 from my_table
where id = t.id
group by id
having count(*) = 1
)

If you want only type 1, then compare the minimum and maximum values. I prefer window functions:
select t.*
from (select t.*, min(type) over (partition by id) as mintype,
max(type) over (partition by id) as maxtype
from t
) t
where mintype = maxtype and mintype = 1;
If you want only records of the same type (and not specifically type = 1), then remove that condition.
If you want only records on the same day, then include the date in the partition by.
Under some circumstances, not exists can be faster:
select t.*
from t
where not exists (select 1 from t t2 where t2.id = t.id and t2.type <> 1);

Group BY Having COUNT, but Order on a column not contained in group

I have a table where I need to get the ID, for a group(based on ID and Name) with a COUNT(*) = 3, for the latest set of timestamps.
So for example below, I want to retrieve ID 2. As it has 3 rows, and the latest timestamps (even though ID 3 has latest timestamps overall, it doesn't have a count of 3).
But I don't understand how to order by Date, as I cannot contain it in the Group By clause, as it is not the same:
SELECT TOP 1 ID
FROM TABLE
GROUP BY ID,Name
HAVING COUNT(ID) > 2
AND Name = 'ABC'
--ORDER BY Date DESC
Sample Data
ID Name Date
1 ABC 2015-05-27 08:00
1 ABC 2015-05-27 09:00
1 ABC 2015-05-27 10:00
2 ABC 2015-05-27 11:00
2 ABC 2015-05-27 12:00
2 ABC 2015-05-27 13:00
3 ABC 2015-05-27 14:00
3 ABC 2015-05-27 15:00

In SQL Server, you need to aggregate the columns that are not in the GROUP BY list:
SELECT TOP 1 ID
FROM TABLE
WHERE Name = 'ABC'
GROUP BY ID,Name
HAVING COUNT(ID) > 2
ORDER BY MAX(Date) DESC
The name filter should go in the WHERE clause, before the GROUP BY, for better performance, if you really need it.
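
To check the idea of ordering by an aggregate, here is a small sketch against the question's sample data. It runs on SQLite through Python's sqlite3 module, so LIMIT 1 takes the place of SQL Server's TOP 1; the table name samples and the column name event_date are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "samples" and "event_date" are hypothetical names for the question's table and Date column.
conn.execute("CREATE TABLE samples (id INTEGER, name TEXT, event_date TEXT)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [
        (1, "ABC", "2015-05-27 08:00"),
        (1, "ABC", "2015-05-27 09:00"),
        (1, "ABC", "2015-05-27 10:00"),
        (2, "ABC", "2015-05-27 11:00"),
        (2, "ABC", "2015-05-27 12:00"),
        (2, "ABC", "2015-05-27 13:00"),
        (3, "ABC", "2015-05-27 14:00"),
        (3, "ABC", "2015-05-27 15:00"),
    ],
)

# Filter first, group, keep groups with more than two rows,
# then order the surviving groups by their latest date.
latest_id = conn.execute(
    """
    SELECT id
    FROM samples
    WHERE name = 'ABC'
    GROUP BY id, name
    HAVING COUNT(id) > 2
    ORDER BY MAX(event_date) DESC
    LIMIT 1
    """
).fetchone()[0]

print(latest_id)
```

ID 3 has the newest timestamps but only two rows, so it is filtered out by HAVING and ID 2 is returned.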

You could do it in a nested query.
Subquery:
SELECT ID
from TABLE
GROUP BY ID
HAVING Count(ID) > 2
That gives you the IDs you want. Put that in another query:
SELECT ID, Date
FROM Table
Where ID in (Subquery)
Order by Date DESC;

First get all desired IDs, that is, all IDs having a count > 2. Get the maximum date for each such ID. Then rank these records with ROW_NUMBER, giving the latest ID #1. Finally, remove all IDs that are not ranked #1.
select name, id
from
(
select
name, id, row_number() over (partition by name order by max_date desc) as rn
from
(
select name, id, max(date) as max_date
from mytable
--where name = 'ABC'
group by name, id
having count(*) > 2
) wanted_ids
) ranked_ids
where rn = 1;

Add a column with the max value of the group

I want to add an extra column, where the max values of each group (ID) will appear.
Here is what the table looks like:
select ID, VALUE from mytable
ID VALUE
1 4
1 1
1 7
2 2
2 5
3 7
3 3
Here is the result I want to get:
ID VALUE max_values
1 4 7
1 1 7
1 7 7
2 2 5
2 5 5
3 7 7
3 3 7
Thank you for your help in advance!

Your previous questions indicate that you are using SQL Server, in which case you can use window functions:
SELECT ID,
Value,
MaxValue = MAX(Value) OVER(PARTITION BY ID)
FROM mytable;
Based on your comment on another answer about first summing value, you may need to use a subquery to actually get this:
SELECT ID,
Date,
Value,
MaxValue = MAX(Value) OVER(PARTITION BY ID)
FROM ( SELECT ID, Date, Value = SUM(Value)
FROM mytable
GROUP BY ID, Date
) AS t;
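
A quick sketch of the plain window-function variant (without the summing step), checked against the question's sample data. It uses Python's sqlite3 module rather than SQL Server, so the column alias is written with AS instead of the MaxValue = ... form; window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 4), (1, 1), (1, 7), (2, 2), (2, 5), (3, 7), (3, 3)],
)

# MAX(...) OVER (PARTITION BY id) repeats the group maximum on every row
# instead of collapsing the group the way GROUP BY would.
rows_with_max = conn.execute(
    """
    SELECT id,
           value,
           MAX(value) OVER (PARTITION BY id) AS max_values
    FROM mytable
    ORDER BY id, value
    """
).fetchall()

for row in rows_with_max:
    print(row)
```

Every input row is kept, and each one carries the maximum of its ID group (7, 5 and 7 for IDs 1, 2 and 3).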

There is no need to use GROUP BY in subselect.
select ID, VALUE,
(select MAX(VALUE) from mytable where ID = t.ID) as MaxValue
from mytable t

Use this query.
SELECT ID
,value
,(
SELECT MAX(VALUE)
FROM GetMaxValue gmv
WHERE gmv.ID = gmv1.ID
GROUP BY ID
) as max_value
FROM GetMaxValue gmv1
ORDER BY ID

Try it with a sub select and group by, then grab the MAX of this group:
select
ID,
VALUE,
(select MAX(VALUE)
from mytable
group by ID
having ID = t.ID
) as max_values
from mytable t
Edit:
I built a SQL fiddle, which shows that my solution works, but also VDohnal is correct and doesn't need the group by, so I'll upvote his answer.

Select top row based on grouping

I think it is a common situation, but I am not able to get the logic.
I have a table as follows.
PersonID SchoolID EndDate
-------- -------- -------
1 ABC 2013
1 DEF 2014
1 GHI 2010
2 XYZ 2013
2 UVW 2011
I want the following output
PersonID SchoolID EndDate
-------- -------- -------
1 DEF 2014
2 XYZ 2013
Basically, I want the latest school for each person. Hence, I try to do something like
SELECT SchoolID, PersonID,EndDate FROM tbl
GROUP BY PersonID
HAVING EndDate = MAX(ENDDATE)
ORDER BY EndDate DESC
But I got an error saying EndDate is invalid in a HAVING clause because it is not contained in an aggregate function or group by clause.
I tried doing this
SELECT SchoolID, PersonID,MAX(EndDate) FROM tbl
GROUP BY PersonID
ORDER BY EndDate DESC
I get an error saying SchoolID is invalid in the select list because of the same reason.
What am I missing here?

with cte as (SELECT *,
ROW_NUMBER() OVER(PARTITION BY PersonID
ORDER BY EndDate DESC) AS RN
FROM Table1)
select PersonId, SchoolId, EndDate from cte
where RN = 1

You have to wrap MAX(EndDate) in a subquery.
SELECT SchoolID, PersonID, EndDate
FROM table1 t
WHERE EndDate =
(SELECT MAX(EndDate) FROM table1
WHERE PersonID = t.PersonID);
Note: this will give multiple rows for one PersonID if there are multiple dates tied for the max.
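
The tie behaviour in the note can be seen in a small sketch. It uses the question's data plus one hypothetical extra row (person 2, school RST, 2013) to create a tie, and runs on SQLite through Python's sqlite3 module as a stand-in for the real database. For comparison it also runs a ROW_NUMBER query with SchoolID as an assumed tie-breaker, which needs SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (PersonID INTEGER, SchoolID TEXT, EndDate INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "ABC", 2013),
        (1, "DEF", 2014),
        (1, "GHI", 2010),
        (2, "XYZ", 2013),
        (2, "UVW", 2011),
        (2, "RST", 2013),  # hypothetical row, added only to create a tie
    ],
)

# Correlated subquery: every row that matches the maximum is returned,
# so person 2 appears twice.
tied_rows = conn.execute(
    """
    SELECT PersonID, SchoolID, EndDate
    FROM table1 t
    WHERE EndDate = (SELECT MAX(EndDate) FROM table1
                     WHERE PersonID = t.PersonID)
    ORDER BY PersonID, SchoolID
    """
).fetchall()

# ROW_NUMBER with a tie-breaker (SchoolID, an arbitrary choice here)
# returns exactly one row per person.
one_per_person = conn.execute(
    """
    SELECT PersonID, SchoolID, EndDate
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY PersonID
                                    ORDER BY EndDate DESC, SchoolID) AS rn
          FROM table1 t) ranked
    WHERE rn = 1
    ORDER BY PersonID
    """
).fetchall()

print(tied_rows)
print(one_per_person)
```

Which behaviour is right depends on whether tied rows should all be reported or collapsed to one; the tie-breaker column is a design choice, not something the question specifies.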

with temp as
(
SELECT PersonID, MAX(EndDate) as enddate FROM tbl
GROUP BY PersonID
)
select tbl.* from tbl inner join temp on tbl.personid=temp.personid
and tbl.enddate=temp.enddate;

selecting a column on different order from sub query

I have the following table structure. I want to select each distinct user_id per office_id with the latest login_datetime.
tbl_id user_id office_id login_datetime
----------------------------------------
1 2 28 12/28/2012 5:35:50 AM
2 2 15 12/28/2012 5:35:50 AM
3 3 20 12/28/2012 5:35:50 AM
4 4 28 12/28/2012 5:35:50 AM
5 2 28 12/28/2012 5:35:50 AM
6 4 15 12/28/2012 5:35:50 AM
7 3 20 12/28/2012 5:35:50 AM
I tried this:
SELECT user_id as u_id,office_id,
(select login_datetime from tbl t2 where t2.user_id=u_id AND ROWNUM=1 ORDER BY t2.tbl_id DESC ) as LAST_LOGIN
FROM tbl
GROUP BY user_id,office_id
But it's not working for me. Any help?

Use a window function:
SELECT tbl_id, user_id, office_id,login_datetime
FROM
(
SELECT tbl_id, user_id, office_id,login_datetime,
ROW_NUMBER() OVER (PARTITION BY user_id, office_id
ORDER BY login_datetime DESC) rn
FROM tableName
) a
WHERE a.rn = 1

Another solution is a direct group by with keep dense_rank:
select user_id, office_id,
max(login_datetime) keep (dense_rank first order by login_datetime desc) as latest_login_datetime
from tbl
group by user_id, office_id
or if you want unique user_id:
select user_id,
max(office_id) keep (dense_rank first order by login_datetime desc) as latest_office_id,
max(login_datetime) keep (dense_rank first order by login_datetime desc) as latest_login_datetime
from tbl
group by user_id

OK, I have changed the example to Oracle.
I will explain the query.
First I select the distinct user_id and office_id (so a user who belongs to two offices is returned twice),
and then the MAX login_datetime to get the latest datetime.
Then in the WHERE clause I filter the query by office_id, keeping the rows where it equals any of the distinct offices (basically I am matching against a distinct table of office_id).
Finally, I group by user_id and office_id because of the MAX function.
SELECT
DISTINCT "user_id" ,
"office_id",
MAX("login_datetime")
FROM TableName
WHERE "office_id" IN (SELECT DISTINCT "office_id" FROM TableName)
GROUP BY
"user_id",
"office_id"
