Group by column and get max and min id on sql

I have a table with these columns:
ID_REAL, DATE_REAL, NAME_REAL
I want a single query, grouped by name, that returns a result like this:
NAME | MAX(DATE_REAL) | ID_REAL of the MAX(DATE_REAL) | MIN(DATE_REAL) | ID_REAL of the MIN(DATE_REAL)
I don't know how to write it. For the moment I have:
select NAME_REAL,max(DATE_REAL),ID_REAL from MYREALTABLE group by NAME_REAL,ID_REAL
select NAME_REAL,min(DATE_REAL),ID_REAL from MYREALTABLE group by NAME_REAL,ID_REAL
But this is not what I need, and I also need it in only one query.
Thank you.

I think the following should work: find the records that have the minimum and maximum dates per name, then join those two queries.
select
mn.NAME_REAL,
MIN_DATE_REAL,
ID_REAL_OF_MIN_DATE_REAL,
MAX_DATE_REAL,
ID_REAL_OF_MAX_DATE_REAL
from
(
select NAME_REAL,
DATE_REAL as MIN_DATE_REAL,
ID_REAL as ID_REAL_OF_MIN_DATE_REAL
from (
select
NAME_REAL,
ID_REAL,
DATE_REAL,
row_number() over (partition by NAME_REAL order by DATE_REAL asc) as date_order_asc
from MYREALTABLE
) ranked_asc
where date_order_asc = 1
) mn
inner join
(
select NAME_REAL,
DATE_REAL as MAX_DATE_REAL,
ID_REAL as ID_REAL_OF_MAX_DATE_REAL
from (
select
NAME_REAL,
ID_REAL,
DATE_REAL,
row_number() over (partition by NAME_REAL order by DATE_REAL desc) as date_order_desc
from MYREALTABLE
) ranked_desc
where date_order_desc = 1
) mx
on mn.NAME_REAL = mx.NAME_REAL
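
As a rough, runnable illustration of the same idea, here is a minimal sketch using Python's sqlite3 module (this assumes an SQLite build with window functions, 3.25 or later). It is a variant rather than the query above: it ranks each name's rows once in each direction and collapses them with conditional aggregation instead of joining two derived tables. The sample rows and the aliases rn_asc, rn_desc and ranked are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MYREALTABLE (ID_REAL INTEGER, DATE_REAL TEXT, NAME_REAL TEXT)"
)
# Hypothetical sample data; ISO dates sort correctly as text.
conn.executemany(
    "INSERT INTO MYREALTABLE VALUES (?, ?, ?)",
    [
        (1, "2020-01-05", "alpha"),
        (2, "2020-03-01", "alpha"),
        (3, "2020-02-10", "alpha"),
        (4, "2021-07-01", "beta"),
        (5, "2021-06-15", "beta"),
    ],
)

query = """
SELECT NAME_REAL,
       MAX(DATE_REAL)                              AS MAX_DATE_REAL,
       MAX(CASE WHEN rn_desc = 1 THEN ID_REAL END) AS ID_REAL_OF_MAX_DATE_REAL,
       MIN(DATE_REAL)                              AS MIN_DATE_REAL,
       MAX(CASE WHEN rn_asc = 1 THEN ID_REAL END)  AS ID_REAL_OF_MIN_DATE_REAL
FROM (
    SELECT NAME_REAL, ID_REAL, DATE_REAL,
           ROW_NUMBER() OVER (PARTITION BY NAME_REAL ORDER BY DATE_REAL ASC)  AS rn_asc,
           ROW_NUMBER() OVER (PARTITION BY NAME_REAL ORDER BY DATE_REAL DESC) AS rn_desc
    FROM MYREALTABLE
) AS ranked
GROUP BY NAME_REAL
ORDER BY NAME_REAL
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two rows of the same name share the earliest or latest date, row_number() picks one of them arbitrarily; adding ID_REAL to the ORDER BY inside the window would make the choice deterministic.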

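You can join the two results into a single query result as follows. Note that because both subqueries also group by ID_REAL, each returns one row per NAME_REAL/ID_REAL pair rather than one row per name.
select o.NAME_REAL,o.max_date,o.ID_REAL as max_id,t.min_date,t.ID_REAL as min_id from (
select NAME_REAL,max(DATE_REAL) as max_date,ID_REAL from MYREALTABLE group by NAME_REAL,ID_REAL)
as o inner join
(select NAME_REAL,min(DATE_REAL) as min_date,ID_REAL from MYREALTABLE group by NAME_REAL,ID_REAL
) as t on o.NAME_REAL=t.NAME_REAL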

Try the following:
select NAME_REAL,ID_REAL,max(DATE_REAL) as max_date, min(DATE_REAL) as min_date
from MYREALTABLE
group by NAME_REAL,ID_REAL

Related

Finding the highest COUNT of a group per individual GROUP BY query in Hive

I have a table of customer transactions in which an individual_id appears once for every transaction.
There is a category column called Name_desc. I would like to group by individual and find the most common Name_desc category per individual.
Suppose the data looks like this:
Id Name_desc
---- ------
1 a
2 c
1 b
2 c
1 b
I want the output below:
Id Name_desc (most frequently occurring category)
------ ------
1 b
2 c
I tried the query below and got this error:
Error while compiling statement: FAILED: ParseException line 4:19 cannot recognize input near 'select' 'max' '(' in expression specification
select name_desc, count(*) as count_e
from db.cust_scan
group by id, name_desc
having count(*)= ( select max(count_e),id
from
(
select id, name_desc, count(*) as count_e
from
db.cust_scan
where
base_div_nbr =1
and
country_code ='US'
and
retail_channel_code=1
and visit_date between '2019-01-01' and '2019-12-31'
GROUP by
individual_id, tt_id_desc
order by individual_id, count_e desc
) as t
group by individual_id )
I would appreciate any suggestions or help with this query. If there is an efficient way of getting this done, let me know.

The following script was written and tested on MSSQL, but since Hive also supports Row_Number() and subqueries, it should help you get the required output:
SELECT A.Id, A.Name_desc
FROM
(
SELECT Id,Name_desc,
row_number() over (partition by id order by COUNT(*) desc) AS RN
FROM your_table
GROUP BY Id,Name_desc
) A
WHERE RN = 1

You need a subquery in Hive:
SELECT s.Id, s.Name_desc
FROM
(
select s.*, row_number() over (partition by s.id order by s.cnt desc) rn
from
(
SELECT Id, Name_desc, COUNT(*) cnt
FROM your_table
GROUP BY Id, Name_desc
) s
) s
WHERE rn= 1;
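
To see the count-then-rank pattern run end to end, here is a minimal sketch using Python's sqlite3 module as a stand-in for Hive (an assumption made only so the example runs locally; it needs SQLite 3.25 or later for window functions). It loads the five sample rows from the question. The aliases c and ranked are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Id INTEGER, Name_desc TEXT)")
# The sample rows from the question.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "a"), (2, "c"), (1, "b"), (2, "c"), (1, "b")],
)

query = """
SELECT Id, Name_desc
FROM (
    -- Rank each category within its Id, most frequent first.
    SELECT c.*, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY cnt DESC) AS rn
    FROM (
        -- One row per (Id, Name_desc) with its occurrence count.
        SELECT Id, Name_desc, COUNT(*) AS cnt
        FROM your_table
        GROUP BY Id, Name_desc
    ) AS c
) AS ranked
WHERE rn = 1
ORDER BY Id
"""
most_common = conn.execute(query).fetchall()
print(most_common)
```

When two categories tie for the highest count within an Id, row_number() keeps only one of them; rank() would keep both.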

Get all items with min values SQL Server

Here's what the table is like:
----------------------------------
EmployeeId Tasks_Count
1 1
2 1
3 2
4 1
5 3
I need a query to get all employees with the minimum task count. The result should look like this:
---------------
EmployeeId
1
2
4
The problem is that I am using a subquery to count tasks. Here's my code:
SELECT *
FROM (SELECT EmployeeId,
COUNT(*) AS Tasks_count
FROM Tasks
INNER JOIN Status ON Tasks.StatusId=Status.Id
WHERE Status.Name != 'Closed'
GROUP BY EmployeeId
ORDER BY Tasks_count DESC) AS Employee_not_closed
WHERE Tasks_count IN (SELECT MIN(Tasks_count)
FROM Employee_not_closed)

Use FETCH FIRST WITH TIES:
select EmployeeId
from tablename
order by Tasks_Count
fetch first 1 row with ties

You can try the following:
select * from tablename
where Tasks_Count in (select min(Tasks_Count) from tablename)

It can also be done using the RANK() function, as follows:
;with cte as
(
select Employeeid, rank() over( order by Tasks_Count) rn
from #table
)
select * from cte where rn=1

You can use the code below. I have tested it and it works fine.
select EmployeeId from StackOverFlow_3 where Tasks_Count in(select min(Tasks_Count) from StackOverFlow_3)

You can use a join on a subquery:
select m.EmployeeId
from my_table m
inner join
(
select min(Tasks_Count) min_task
from my_table
) t on t.min_task = m.Tasks_Count
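
The answers above start from a table that already holds the counts, while the question computes them in a derived table that cannot be referenced a second time. One way around that is a common table expression. Here is a minimal sketch using Python's sqlite3 module. The Tasks and Status column layouts and the sample rows are assumptions inferred from the question's query, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, inferred from the join in the question.
conn.execute("CREATE TABLE Status (Id INTEGER, Name TEXT)")
conn.execute("CREATE TABLE Tasks (EmployeeId INTEGER, StatusId INTEGER)")
conn.executemany("INSERT INTO Status VALUES (?, ?)", [(1, "Open"), (2, "Closed")])
conn.executemany(
    "INSERT INTO Tasks VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2), (3, 1), (3, 1), (4, 1), (5, 1), (5, 1), (5, 1)],
)

query = """
WITH Employee_not_closed AS (
    -- Count the non-closed tasks per employee once.
    SELECT Tasks.EmployeeId, COUNT(*) AS Tasks_count
    FROM Tasks
    INNER JOIN Status ON Tasks.StatusId = Status.Id
    WHERE Status.Name != 'Closed'
    GROUP BY Tasks.EmployeeId
)
-- The CTE can be referenced twice: once for the rows, once for the minimum.
SELECT EmployeeId
FROM Employee_not_closed
WHERE Tasks_count = (SELECT MIN(Tasks_count) FROM Employee_not_closed)
ORDER BY EmployeeId
"""
employee_ids = [row[0] for row in conn.execute(query)]
print(employee_ids)
```

The same WITH clause is also valid in SQL Server, which is what the question targets.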

Select most recent status for each ID and department code

I have the following table:
I want to get the most recent status for each dept_code that a CL_ID has. So the desired output would be this:
I have tried the following, but this gives me just the most recent status for each client and not for each of their dept_codes.
SELECT *
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT] C
INNER JOIN
(SELECT CLIENT_NUMBER, MAX(STATUS_DATE) AS SDATE
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
GROUP BY CLIENT_NUMBER) X
ON X.CLIENT_NUMBER = C.CLIENT_NUMBER
AND X.SDATE = C.STATUS_DATE
ORDER BY C.CLIENT_NUMBER
Any help would be much appreciated. Thanks.

A convenient method that works in SQL Server is:
select top (1) with ties cl.*
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl
order by row_number() over (partition by cl_id, dept_code order by status_date desc);
A method that is efficient with the right indexes in almost any database is:
select cl.*
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl
where cl.status_date = (select max(cl2.status_date)
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl2
where cl2.cl_id = cl.cl_id and cl2.dept_code = cl.dept_code
);
The right index is on (cl_id, dept_code, status_date).

I would also use ROW_NUMBER, but with a subquery:
SELECT CL_ID, Status_date, Status, Dept_code
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY CL_ID, Dept_code ORDER BY Status_date DESC) rn
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
) t
WHERE rn = 1;

1) First, partition the rows by Dept_Code and CL_ID and rank each row within its group by Status_Date in descending order.
2) Select all rows with rnk = 1, which gives the desired result.
SELECT Z.CL_ID,
Z.Status_Date,
Z.Status,
Z.Dept_Code
FROM
(
SELECT *,
RANK() OVER( PARTITION BY Dept_Code,CL_ID ORDER BY Status_Date DESC ) AS rnk
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
) Z
WHERE Z.rnk = 1;

This would work for almost all databases
select * from c3clstat c
where exists
(select 1 from c3clstat c1
where c1.cl_id=c.cl_id
and c1.dept_code=c.dept_code
group by cl_id,dept_code
having c.status_date=max(c1.status_date)
)
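
For a quick check of the "latest row per (cl_id, dept_code)" logic, here is a minimal sketch using Python's sqlite3 module with the correlated-subquery form. The column names follow the answers above. The sample rows are made up, since the original table was posted as an image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE c3clstat (cl_id INTEGER, status_date TEXT, status TEXT, dept_code TEXT)"
)
# Hypothetical sample data; ISO dates sort correctly as text.
conn.executemany(
    "INSERT INTO c3clstat VALUES (?, ?, ?, ?)",
    [
        (1, "2019-01-01", "Open", "A"),
        (1, "2019-03-01", "Closed", "A"),
        (1, "2019-02-01", "Open", "B"),
        (2, "2019-01-15", "Open", "A"),
        (2, "2019-01-10", "Pending", "A"),
    ],
)

query = """
SELECT c.cl_id, c.dept_code, c.status, c.status_date
FROM c3clstat c
WHERE c.status_date = (
    -- Latest date within the same client and department.
    SELECT MAX(c2.status_date)
    FROM c3clstat c2
    WHERE c2.cl_id = c.cl_id AND c2.dept_code = c.dept_code
)
ORDER BY c.cl_id, c.dept_code
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```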

Select the Max date time for single User

I have a table like this,
Date User
15-06-2018 A
16-06-2018 A
15-06-2018 B
14-06-2018 C
16-06-2018 C
I want to get the output like this,
Date User
16-06-2018 A
15-06-2018 B
16-06-2018 C
I tried Select Max(date),User from Table group by User

Based on your comment, I assume you have duplicated results in those 80 columns when you group by them. Assuming so, here's one option using row_number to always return 1 row per user:
select *
from (
select *, row_number() over (partition by user order by date desc) rn
from yourtable
) t
where rn = 1

You can use a correlated subquery:
select t.*
from table t
where date = (select max(t1.date)
from table t1
where t1.user = t.user
);
However, I would also recommend row_number():
select top (1) with ties *
from table t
order by row_number() over (partition by user order by date desc);

You can also use a ranking function
SELECT User, Date
FROM
(
SELECT User, Date
, Row_id = Row_Number() OVER (Partition by User ORDER BY User, Date desc)
FROM table
)q
WHERE Row_Id = 1

I would suggest this:
Select * from table t where exists
(Select 1 from
(Select user, max(date) as date from table group by user) A
Where A.user = t.user and A.date = t.date )
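
To illustrate why row_number() helps once the table has more columns than the two being grouped, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25 or later). The table name yourtable and the extra note column are hypothetical. The dates from the question are rewritten in ISO format so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "user" and "date" are quoted because they are keywords in some databases.
conn.execute('CREATE TABLE yourtable ("date" TEXT, "user" TEXT, note TEXT)')
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        ("2018-06-15", "A", "first"),
        ("2018-06-16", "A", "second"),
        ("2018-06-15", "B", "only"),
        ("2018-06-14", "C", "old"),
        ("2018-06-16", "C", "new"),
    ],
)

query = """
SELECT "user", "date", note
FROM (
    -- Number each user's rows from newest to oldest.
    SELECT *, ROW_NUMBER() OVER (PARTITION BY "user" ORDER BY "date" DESC) AS rn
    FROM yourtable
) AS t
WHERE rn = 1
ORDER BY "user"
"""
latest_per_user = conn.execute(query).fetchall()
for row in latest_per_user:
    print(row)
```

Every other column of the newest row comes along unchanged, which a plain GROUP BY on all columns would not guarantee.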

Group by clause with min

I have the following table.
I used the following query and got an error message. I can see why the error occurs, but how can I solve it?
select min(id),customer_id,created_at from thunderbolt_orders
group by customer_id
I need the customer_id and created_at that go with the minimum id. How can I achieve this?

select distinct on (customer_id)
customer_id, id, created_at
from thunderbolt_orders
order by customer_id, id

with cte as (
select
*, row_number() over(partition by customer_id order by id) as row_num
from thunderbolt_orders
)
select *
from cte
where row_num = 1

SELECT id,customer_id,created_at
FROM thunderbolt_orders
WHERE id IN
(SELECT MIN(id) FROM thunderbolt_orders GROUP BY customer_id);

Depending on whether you want just the overall minimum ID or the minimum ID for each customer, these are the solutions.
Minimum ID:
select top 1 id,customer_id,created_at from thunderbolt_orders order by id asc
Minimum ID for each customer:
with cte as (
select min(id) as id
from thunderbolt_orders
group by customer_id
)
select *
from cte c
inner join thunderbolt_orders t on t.id = c.id
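
The per-customer variant can be tried out with a minimal sketch using Python's sqlite3 module. The sample rows are made up, since the original table was posted as an image. The column names come from the question's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE thunderbolt_orders (id INTEGER, customer_id INTEGER, created_at TEXT)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO thunderbolt_orders VALUES (?, ?, ?)",
    [
        (1, 10, "2020-01-01"),
        (2, 10, "2020-01-05"),
        (3, 20, "2020-02-01"),
        (4, 30, "2020-02-03"),
        (5, 20, "2020-03-01"),
    ],
)

query = """
WITH cte AS (
    -- Smallest id for each customer.
    SELECT MIN(id) AS id
    FROM thunderbolt_orders
    GROUP BY customer_id
)
-- Join back to fetch the remaining columns of those rows.
SELECT t.id, t.customer_id, t.created_at
FROM cte c
INNER JOIN thunderbolt_orders t ON t.id = c.id
ORDER BY t.id
"""
first_orders = conn.execute(query).fetchall()
for row in first_orders:
    print(row)
```

Joining on id alone is enough here only if id is unique across the table, which this sketch assumes.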
