Display result as group by count with max date? - sql

I have the sample data below, and I need to display the results as a count per group based on the max date.
REQUEST_NUMBER ASSIGNED_GROUP LAST_MODIFIED_DATE
001 GROUP A
001 GROUP B 2/2/2018
002 GROUP A
002 GROUP B 2/2/2018
002 GROUP C 2/3/2018
003 GROUP B
My expected result is a count per group, using only the row with the max last_modified_date for each request, like:
ASSIGNED_GROUP TOTAL_COUNT
GROUP B 2
GROUP C 1
In my above example 001 was last assigned to GROUP B, 002 last assigned to GROUP C, and 003 is only 1 record with NULL last_modified_date, so remains with GROUP B.
So far I'm trying with just one request, but I'm not getting proper results:
SELECT request_number, ASSIGNED_GROUP_NAME
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY request_number ORDER BY request_number) RNUM,
request_number, ASSIGNED_GROUP_NAME
FROM WORK_DETAIL
WHERE request_number = '3458112'
)
WHERE MAX(last_modified_date)
ORDER BY ASSIGNED_GROUP_NAME

Something like this could work
SELECT ASSIGNED_GROUP, COUNT(ASSIGNED_GROUP), MAX(LAST_MODIFIED_DATE) FROM YourTable
GROUP BY ASSIGNED_GROUP

You could use group by:
select t.assigned_group,t.last_modified_date,count(*) from table t inner join
(
select assigned_group,max(last_modified_date) as maxDate from table
where last_modified_date is not null
group by assigned_group
) t2
ON t.last_modified_date = t2.maxDate and t.assigned_group = t2.assigned_group
group by t.assigned_group,t.last_modified_date

You could use a join with a subquery that gets the max date per assigned_group:
select a.ASSIGNED_GROUP, count(*)
from my_table a
inner join(
select ASSIGNED_GROUP, max(LAST_MODIFIED_DATE) as max_date
from my_table
where LAST_MODIFIED_DATE is not null
group by ASSIGNED_GROUP
) t on t.max_date = a.LAST_MODIFIED_DATE and t.ASSIGNED_GROUP = a.ASSIGNED_GROUP
group by a.assigned_group
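The expected output in the question depends on the latest row per request_number rather than per assigned_group, so one possible variation is to rank the rows within each request and count only the top one. Below is a minimal sketch of that idea, assuming SQLite 3.25+ (for window functions) driven from Python's sqlite3 module, ISO-formatted date strings, and the column name assigned_group; none of these come from the asker's actual database.

```python
import sqlite3

# In-memory database with the sample rows from the question (dates as ISO strings).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_detail ("
    "request_number TEXT, assigned_group TEXT, last_modified_date TEXT)"
)
conn.executemany(
    "INSERT INTO work_detail VALUES (?, ?, ?)",
    [
        ("001", "GROUP A", None),
        ("001", "GROUP B", "2018-02-02"),
        ("002", "GROUP A", None),
        ("002", "GROUP B", "2018-02-02"),
        ("002", "GROUP C", "2018-02-03"),
        ("003", "GROUP B", None),
    ],
)

query = """
SELECT assigned_group, COUNT(*) AS total_count
FROM (
    SELECT request_number, assigned_group,
           ROW_NUMBER() OVER (
               PARTITION BY request_number
               -- non-NULL dates first, then newest first
               ORDER BY last_modified_date IS NULL, last_modified_date DESC
           ) AS rn
    FROM work_detail
) AS ranked
WHERE rn = 1              -- keep only the latest row of each request
GROUP BY assigned_group
ORDER BY assigned_group
"""
latest_group_counts = conn.execute(query).fetchall()
print(latest_group_counts)  # [('GROUP B', 2), ('GROUP C', 1)]
```

Sorting on "last_modified_date IS NULL" first is just an explicit way to push NULL dates to the end; other databases have their own syntax for that.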

Related

Get most recent measurement

I have a table that has some measurements, an ID and a date.
The table is built like so:
ID DATE M1 M2
1 2020 1 NULL
1 2020 NULL 15
1 2018 2 NULL
2 2019 1 NULL
2 2019 NULL 1
I would like to end up with a table that has one row per ID with the most recent measurements:
ID M1 M2
1 1 15
2 1 1
Any ideas?
You can use a correlated subquery with aggregation:
select id, max(m1), max(m2)
from t
where t.date = (select max(t1.date) from t t1 where t1.id = t.id)
group by id;
Use RANK combined with an aggregation:
WITH cte AS (
SELECT *, RANK() OVER (PARTITION BY ID ORDER BY DATE DESC) rn
FROM yourTable
)
SELECT ID, MAX(M1) AS M1, MAX(M2) AS M2
FROM cte
WHERE rn = 1
GROUP BY ID;
The rank lets us restrict to only the records for each ID having the most recent date. RANK is needed here rather than ROW_NUMBER because rows that share the latest date must all get the value 1; ROW_NUMBER would keep only one of them. Then we aggregate to find the max values for M1 and M2.
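Because the sample data has two rows per ID for the latest date, the choice of ranking function matters. The following is a small check of that tie case, assuming SQLite 3.25+ through Python's sqlite3 module and a table named t; the helper latest_measurements is hypothetical and exists only for this comparison.

```python
import sqlite3

# Sample data from the question; NULL measurements become None.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date INTEGER, m1 INTEGER, m2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 2020, 1, None),
        (1, 2020, None, 15),
        (1, 2018, 2, None),
        (2, 2019, 1, None),
        (2, 2019, None, 1),
    ],
)

def latest_measurements(ranking_function):
    """Run the 'rank, filter, aggregate' query with the given window function."""
    query = f"""
    WITH cte AS (
        SELECT *, {ranking_function}() OVER (
                      PARTITION BY id ORDER BY date DESC) AS rn
        FROM t
    )
    SELECT id, MAX(m1) AS m1, MAX(m2) AS m2
    FROM cte
    WHERE rn = 1
    GROUP BY id
    ORDER BY id
    """
    return conn.execute(query).fetchall()

# RANK gives every row on the latest date the value 1, so both survive.
with_rank = latest_measurements("RANK")
# ROW_NUMBER numbers tied rows 1, 2, ... so only one row per id survives.
with_row_number = latest_measurements("ROW_NUMBER")

print(with_rank)  # [(1, 1, 15), (2, 1, 1)]
print(with_row_number)
```

With ROW_NUMBER, each ID ends up with one of its two measurements missing, and which one is lost depends on how the database breaks the tie.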
In standard SQL, you can use lag(ignore nulls):
select id, coalesce(m1, prev_m1), coalesce(m2, prev_m2)
from (select t.*,
lag(m1 ignore nulls) over (partition by id order by date) as prev_m1,
lag(m2 ignore nulls) over (partition by id order by date) as prev_m2,
row_number() over (partition by id order by date desc) as seqnum
from t
) t
where seqnum = 1;

Delete duplicated record

I have a table which contains a lot of duplicated rows like this:
id_emp id date ch_in ch_out
1 34103 2019-09-01
1 34193 2019-09-01 17:00
1 34194 2019-09-02 07:03:21 16:59:26
1 34104 2019-09-02 07:03:21 16:59:26
1 33361 2019-09-02 NULL NULL
I want just one row for each date, and the others must be deleted. The output should be:
id_emp id date ch_in ch_out
1 34193 2019-09-01 17:00
1 34104 2019-09-02 07:03:21 16:59:26
I tried to use distinct, but it doesn't work:
select distinct id_emp, id, date_1, ch_in,ch_out
from ch_inout
where id_emp=1 order by date_1 asc
I also tried using this query to find the rows to delete:
select *
from (
select *, rn=row_number() over (partition by date_1 order by id)
from ch_inout
) x
where rn > 1;
But nothing works; the result is empty.
You can use aggregation:
select id_emp, max(id) as id, date, min(ch_in), max(ch_out)
from ch_inout
group by id_emp, date;
This returns the maximum id for each group of rows. That is not exactly what is returned in the question, but you don't specify the logic.
EDIT:
If you want to delete all but the largest id for each id_emp/date combination, you can use:
delete c from ch_inout c
where id < (select max(c2.id)
from ch_inout c2
where c2.id_emp = c.id_emp and c2.date = c.date
);
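As a rough check of the "keep only the largest id per id_emp/date" rule, here is a minimal sketch using SQLite through Python's sqlite3 module. It uses an uncorrelated NOT IN subquery instead of the correlated one above, which is a variation rather than the same statement, and it assumes that id is unique across the table and that the date column is called date_1 as in the question's own query. The time columns are left out because they play no part in the rule.

```python
import sqlite3

# Only the columns that decide which rows are duplicates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ch_inout (id_emp INTEGER, id INTEGER, date_1 TEXT)")
conn.executemany(
    "INSERT INTO ch_inout VALUES (?, ?, ?)",
    [
        (1, 34103, "2019-09-01"),
        (1, 34193, "2019-09-01"),
        (1, 34194, "2019-09-02"),
        (1, 34104, "2019-09-02"),
        (1, 33361, "2019-09-02"),
    ],
)

# Delete every row whose id is not the largest id of its (id_emp, date_1) group.
conn.execute(
    """
    DELETE FROM ch_inout
    WHERE id NOT IN (
        SELECT MAX(id)
        FROM ch_inout
        GROUP BY id_emp, date_1
    )
    """
)

remaining = conn.execute(
    "SELECT id_emp, id, date_1 FROM ch_inout ORDER BY date_1"
).fetchall()
print(remaining)  # [(1, 34193, '2019-09-01'), (1, 34194, '2019-09-02')]
```

As with the aggregation query, this keeps id 34194 for 2019-09-02 rather than the 34104 shown in the question, since the question does not state which duplicate should win.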
You can use ROW_NUMBER() to identify the records you want to delete. Assuming that you want to keep the record with the lowest id on each date:
SELECT *
FROM (
SELECT
t.*,
ROW_NUMBER() OVER(PARTITION BY date ORDER BY id) rn
FROM ch_inout t
) x
WHERE rn > 1
You can easily turn this into a DELETE statement:
WITH cte AS (
SELECT
t.*,
ROW_NUMBER() OVER(PARTITION BY date ORDER BY id) rn
FROM ch_inout t
)
DELETE FROM cte WHERE rn > 1

How to create GROUP BY on min and max date

I have a database table like this:
emp_id start_date end_date title location
111 1-JAN-2000 31-DEC-2003 MANAGER NYO
111 1-JAN-2003 31-DEC-2005 MANAGER BOM
111 1-JAN-2006 31-DEC-2007 CFO NYO
111 1-JAN-2008 31-DEC-2015 MANAGER NYO
I have already written a query with GROUP BY and the min and max functions:
select emp_id,min(start_date),max(end_date),title
from table1
group by emp_id,title
What I expect is this:
111 1-JAN-2000 31-DEC-2005 MANAGER
111 1-JAN-2006 31-DEC-2007 CFO
111 1-JAN-2008 31-DEC-2015 MANAGER
What I am getting is:
111 1-JAN-2000 31-DEC-2015 MANAGER
111 1-JAN-2006 31-DEC-2007 CFO
This is a type of gaps-and-islands problem with date-chains. I would suggest using a left join to find where the islands start. Then a cumulative sum and aggregation:
select emp_id, title, min(start_date), max(end_date)
from (select t.*,
sum(case when tprev.emp_id is null then 1 else 0 end) over
(partition by t.emp_id, t.title order by t.start_date) as grouping
from t left join
t tprev
on t.emp_id = tprev.emp_id and
t.title = tprev.title and
t.start_date = tprev.end_date + 1
) t
group by grouping, emp_id, title;
Try the query below. It uses window functions to find the gaps and turns each island into a group:
with cte1 as
(
select a.*,
row_number() over (partition by emp_id, title order by start_date) rn,
row_number() over (partition by emp_id order by start_date) rn1
from table_name a
) select emp_id,
min(start_date),
max(end_date),
title
from cte1 group by emp_id, title, rn1 - rn
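The row-number-difference idea can be tried on the sample rows from the question. This is a minimal sketch assuming SQLite 3.25+ via Python's sqlite3 module, ISO-formatted date strings, and the table name table1 from the asker's query; the aliases rn_all and rn_title are made up for illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 ("
    "emp_id INTEGER, start_date TEXT, end_date TEXT, title TEXT, location TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?)",
    [
        (111, "2000-01-01", "2003-12-31", "MANAGER", "NYO"),
        (111, "2003-01-01", "2005-12-31", "MANAGER", "BOM"),
        (111, "2006-01-01", "2007-12-31", "CFO", "NYO"),
        (111, "2008-01-01", "2015-12-31", "MANAGER", "NYO"),
    ],
)

query = """
WITH numbered AS (
    SELECT t.*,
           -- position among all rows of the employee
           ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY start_date) AS rn_all,
           -- position among rows of the employee with the same title
           ROW_NUMBER() OVER (PARTITION BY emp_id, title ORDER BY start_date) AS rn_title
    FROM table1 t
)
SELECT emp_id,
       MIN(start_date) AS first_start,
       MAX(end_date)   AS last_end,
       title
FROM numbered
-- the difference stays constant while the title repeats on consecutive rows
GROUP BY emp_id, title, rn_all - rn_title
ORDER BY emp_id, first_start
"""
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

For the sample data the differences are 0, 0, 2 and 1, so the two early MANAGER rows fall into one group and the later MANAGER row into its own.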

First value in DATE minus 30 days SQL

I have a bunch of data, out of which I'm showing the ID, the max date and its corresponding values (user id, type, ...). Then I need to take the MAX date for each ID, subtract 30 days, and show the first date within this period along with its corresponding values.
Example:
ID Date Name
1 01.05.2018 AAA
1 21.04.2018 CCC
1 05.04.2018 BBB
1 28.03.2018 AAA
expected:
ID max_date max_name previous_date previous_name
1 01.05.2018 AAA 05.04.2018 BBB
I have a working solution using subselects, but as I have quite a huge WHERE part, the refresh takes ages.
The subselect looks like this:
(SELECT MIN(N.name)
FROM t1 N
WHERE N.ID = T.ID
AND (N.date < MAX(T.date) AND N.date >= (MAX(T.date)-30))
AND (...)) AS PreviousName
How'd you write the select?
I'm using TSQL
Thanks
I can do this with 2 CTEs to build up the dates and names.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 (ID int, theDate date, theName varchar(10)) ;
INSERT INTO t1 (ID, theDate, theName)
VALUES
( 1,'2018-05-01','AAA' )
, ( 1,'2018-04-21','CCC' )
, ( 1,'2018-04-05','BBB' )
, ( 1,'2018-03-27','AAA' )
, ( 2,'2018-05-02','AAA' )
, ( 2,'2018-05-21','CCC' )
, ( 2,'2018-03-03','BBB' )
, ( 2,'2018-01-20','AAA' )
;
Main Query:
;WITH cte1 AS (
SELECT t1.ID, t1.theDate, t1.theName
, DATEADD(day,-30,t1.theDate) AS dMinus30
, ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rn
FROM t1
)
, cte2 AS (
SELECT c2.ID, c2.theDate, c2.theName
, ROW_NUMBER() OVER (PARTITION BY c2.ID ORDER BY c2.theDate) AS rn
, COUNT(*) OVER (PARTITION BY c2.ID) AS theCount
FROM cte1
INNER JOIN cte1 c2 ON cte1.ID = c2.ID
AND c2.theDate >= cte1.dMinus30
WHERE cte1.rn = 1
GROUP BY c2.ID, c2.theDate, c2.theName
)
SELECT cte1.ID, cte1.theDate AS max_date, cte1.theName AS max_name
, cte2.theDate AS previous_date, cte2.theName AS previous_name
, cte2.theCount
FROM cte1
INNER JOIN cte2 ON cte1.ID = cte2.ID
AND cte2.rn=1
WHERE cte1.rn = 1
Results:
| ID | max_date | max_name | previous_date | previous_name |
|----|------------|----------|---------------|---------------|
| 1 | 2018-05-01 | AAA | 2018-04-05 | BBB |
| 2 | 2018-05-21 | CCC | 2018-05-02 | AAA |
cte1 numbers the rows for each ID with a ROW_NUMBER() window function ordered by date descending, so rn = 1 marks the most recent date and name; it also computes that date minus 30 days. cte2 joins back to this list to get all rows dated within the 30 days before cte1's max date, then numbers them in ascending date order so that rn = 1 marks the earliest date in that window. The outer query joins those two results together to get the columns needed, selecting only the most recent row from cte1 and the earliest row from cte2.
I'm not sure how well it will scale with your data, but using the CTEs should optimize pretty well.
EDIT: For the additional requirement, I just added in another COUNT() window function to cte2.
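The two-step idea (find the latest row per ID, then the earliest row in the 30 days before it) can be tried on the question's own sample. This is a minimal sketch assuming SQLite 3.25+ via Python's sqlite3 module rather than T-SQL, so date(x, '-30 days') stands in for DATEADD; the CTE names latest and in_window and the column names the_date and the_name are hypothetical.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, the_date TEXT, the_name TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [
        (1, "2018-05-01", "AAA"),
        (1, "2018-04-21", "CCC"),
        (1, "2018-04-05", "BBB"),
        (1, "2018-03-28", "AAA"),
    ],
)

query = """
WITH latest AS (
    -- rn = 1 marks the most recent row of each id
    SELECT id, the_date, the_name,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY the_date DESC) AS rn
    FROM t1
),
in_window AS (
    -- rows strictly before the max date but no more than 30 days earlier;
    -- rn = 1 marks the earliest of them
    SELECT p.id, p.the_date, p.the_name,
           ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY p.the_date) AS rn
    FROM t1 p
    JOIN latest l ON l.id = p.id AND l.rn = 1
    WHERE p.the_date < l.the_date
      AND p.the_date >= date(l.the_date, '-30 days')
)
SELECT l.id, l.the_date AS max_date, l.the_name AS max_name,
       w.the_date AS previous_date, w.the_name AS previous_name
FROM latest l
LEFT JOIN in_window w ON w.id = l.id AND w.rn = 1
WHERE l.rn = 1
ORDER BY l.id
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, '2018-05-01', 'AAA', '2018-04-05', 'BBB')]
```

The LEFT JOIN keeps an ID in the output even when no earlier row falls inside its 30-day window; the previous_* columns are then NULL.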
I would do:
select id,
max(case when seqnum = 1 then date end) as max_date,
max(case when seqnum = 1 then name end) as max_name,
max(case when seqnum = 2 then date end) as prev_date,
max(case when seqnum = 2 then name end) as prev_name
from (select e.*, row_number() over (partition by id order by date desc) as seqnum
from example e
) e
group by id;

SQL - Finding Customer's largest Location by Order $

I have a table with customer IDs, location IDs, and their order values. I need to select the location ID for each customer with the largest spend
Customer | Location | Order $
1 | 1A | 100
1 | 1A | 20
1 | 1B | 100
2 | 2A | 50
2 | 2B | 20
2 | 2B | 50
So I would get
Customer | Location | Order $
1 | 1A | 120
2 | 2B | 70
I tried something like this:
SELECT
a.CUST
,a.LOC
,c.BOOKINGS
FROM (SELECT DISTINCT TOP 1 b.CUST, b.LOC, sum(b.ORDER_VAL) as BOOKINGS
FROM ORDER_TABLE b
GROUP BY b.CUST, b.LOC
ORDER BY BOOKINGS DESC) as c
INNER JOIN ORDER_TABLE a
ON a.CUST = c.CUST
But that just returns the top order.
Just use variables to emulate ROW_NUMBER():
SELECT *
FROM ( SELECT `Customer`, `Location`, SUM(`Order`) as `Order`,
@rn := IF(@customer = `Customer`,
@rn + 1,
IF(@customer := `Customer`, 1, 1)
) as rn
FROM Table1
CROSS JOIN (SELECT @rn := 0, @customer := '') as par
GROUP BY `Customer`, `Location`
ORDER BY `Customer`, SUM(`Order`) DESC
) t
WHERE t.rn = 1
First you have to sum the values for each location:
select Customer, Location, Sum(`Order`) as tot_order
from order_table
group by Customer, Location
Then you can get the maximum order with MAX, and the top location with a combination of group_concat, which returns all locations ordered by total descending, and substring_index, which keeps only the first one:
select
Customer,
substring_index(
group_concat(Location order by tot_order desc),
',', 1
) as location,
Max(tot_order) as max_order
from (
select Customer, Location, Sum(`Order`) as tot_order
from order_table
group by Customer, Location
) s
group by Customer
(if there's a tie, two locations with the same top order, this query will return just one)
This seems like an "order by an aggregate function" problem. Here is my stab at it:
SELECT
c.customer,
c.location,
SUM(`order`) as `order_total`,
(
SELECT
SUM(`order`) as `order_total`
FROM customer cm
WHERE cm.customer = c.customer
GROUP BY location
ORDER BY `order_total` DESC LIMIT 1
) as max_order_amount
FROM customer c
GROUP BY location
HAVING max_order_amount = order_total
Here is the SQL fiddle. http://sqlfiddle.com/#!9/2ac0d1/1
This is how I'd handle it (maybe not the best method?) - I wrote it using a CTE first, only to see that MySQL doesn't support CTEs, then switched to writing the same subquery twice:
SELECT B.Customer, C.Location, B.MaxOrderTotal
FROM
(
SELECT A.Customer, MAX(A.OrderTotal) AS MaxOrderTotal
FROM
(
SELECT Customer, Location, SUM(`Order`) AS OrderTotal
FROM Table1
GROUP BY Customer, Location
) AS A
GROUP BY A.Customer
) AS B INNER JOIN
(
SELECT Customer, Location, SUM(`Order`) AS OrderTotal
FROM Table1
GROUP BY Customer, Location
) AS C ON B.Customer = C.Customer AND B.MaxOrderTotal = C.OrderTotal;
Edit: used the table structure provided
This solution will provide multiple rows in the event of a tie.
How about:
select a.*
from (
select customer, location, SUM(val) as s
from orders
group by customer, location
) as a
left join
(
select customer, MAX(b.tot) as t
from (
select customer, location, SUM(val) as tot
from orders
group by customer, location
) as b
group by customer
) as c
on a.customer = c.customer where a.s = c.t;
with
Q_1 as
(
select customer,location, sum(order_$) as order_sum
from cust_order
group by customer,location
order by customer, order_sum desc
),
Q_2 as
(
select customer,max(order_sum) as order_max
from Q_1
group by customer
),
Q_3 as
(
select Q_1.customer,Q_1.location,Q_1.order_sum
from Q_1 inner join Q_2 on Q_1.customer = Q_2.customer and Q_1.order_sum = Q_2.order_max
)
select * from Q_3
Q_1 computes the total per customer and location, Q_2 selects the max of those totals per customer from Q_1, and Q_3 selects the customer, location and sum(order) rows from Q_1 that match Q_2.
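Where window functions are available, the "total per location, then pick the largest per customer" steps can also be written with RANK. Here is a minimal sketch on the question's sample data, assuming SQLite 3.25+ via Python's sqlite3 module; the column names cust, loc and order_val follow the asker's attempt, and the CTE names totals and ranked are made up for illustration.

```python
import sqlite3

# Sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_table (cust INTEGER, loc TEXT, order_val INTEGER)")
conn.executemany(
    "INSERT INTO order_table VALUES (?, ?, ?)",
    [
        (1, "1A", 100),
        (1, "1A", 20),
        (1, "1B", 100),
        (2, "2A", 50),
        (2, "2B", 20),
        (2, "2B", 50),
    ],
)

query = """
WITH totals AS (
    -- total spend per customer and location
    SELECT cust, loc, SUM(order_val) AS bookings
    FROM order_table
    GROUP BY cust, loc
),
ranked AS (
    -- rank the locations of each customer by total spend, largest first
    SELECT cust, loc, bookings,
           RANK() OVER (PARTITION BY cust ORDER BY bookings DESC) AS rnk
    FROM totals
)
SELECT cust, loc, bookings
FROM ranked
WHERE rnk = 1
ORDER BY cust
"""
top_locations = conn.execute(query).fetchall()
print(top_locations)  # [(1, '1A', 120), (2, '2B', 70)]
```

RANK returns every tied location for a customer; swapping in ROW_NUMBER would keep exactly one row per customer, with the tie broken arbitrarily.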