How to get the last date for the max count of transaction_id by branch in SQL

I would like to get the last date for the maximum count of txn_id, based on branch_name.
This is the data
I want the result like this
Here is my script, but I get only one row.
select max(date),
account,
branch_name,
province,
district
from
(select date,
account,
branch_name,
province,
district,
RANK() OVER (ORDER BY txn_no desc) rnk
from
(select count(tr.txn_id) txn_no,
tr.date,
u.account,
b.branch_name,
b.province,
b.district
from transaction tr
inner join users u
on u.user_id = tr.user_id
inner join branch b
on b.user_id = u.user_id
where 1=1
and tr.date >= to_date('01/04/2021','dd/mm/yyyy') and tr.date < to_date('30/04/2021','dd/mm/yyyy')
group by tr.date,
u.account,
b.branch_name,
b.province,
b.district
))
where rnk = 1
group by tr.date,
u.account,
b.branch_name,
b.province,
b.district

WITH
cte1 AS ( SELECT *, COUNT(*) OVER (PARTITION BY user, branch) cnt
FROM source_table ),
cte2 AS ( SELECT *, RANK() OVER (PARTITION BY user ORDER BY cnt DESC) rnk
FROM cte1 )
SELECT *
FROM cte2
WHERE rnk = 1
If more than one branch has the same number of rows, all of them will be returned. If only one of them must be returned in this case, an additional criterion must be chosen and the ORDER BY clause in cte2 must be extended with it. If any one of them (at random) may be returned, replace RANK() in cte2 with ROW_NUMBER().
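The difference between RANK() and ROW_NUMBER() on ties can be checked with a small sketch. It runs the SQL against an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). The column names `account` and `branch`, the sample rows, and the helper `top_branches` are made up for illustration, and the first CTE aggregates with GROUP BY so that each branch is one row, which is a simplification of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_table (account TEXT, branch TEXT);
    -- b1 and b2 are tied with two rows each, b3 has one row
    INSERT INTO source_table VALUES
        ('A', 'b1'), ('A', 'b1'),
        ('A', 'b2'), ('A', 'b2'),
        ('A', 'b3');
""")

def top_branches(ranking_function):
    """Return the branches ranked first per account (hypothetical helper)."""
    sql = f"""
        WITH cte1 AS (SELECT account, branch, COUNT(*) AS cnt
                      FROM source_table
                      GROUP BY account, branch),
             cte2 AS (SELECT cte1.*,
                             {ranking_function}() OVER (PARTITION BY account
                                                        ORDER BY cnt DESC) AS rnk
                      FROM cte1)
        SELECT branch FROM cte2 WHERE rnk = 1 ORDER BY branch
    """
    return [row[0] for row in conn.execute(sql)]

with_rank = top_branches("RANK")              # both tied branches
with_row_number = top_branches("ROW_NUMBER")  # exactly one of the tied branches
print(with_rank, with_row_number)
```

With RANK() both tied branches come back; with ROW_NUMBER() only one does, and which one is not defined unless the ORDER BY is extended.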

I would do it with SELECT TOP 1 ... instead of RANK(), something like:
SELECT date,
txn_id,
account,
branch_name,
province,
district
FROM transaction t
WHERE branch_name = (
-- branch with the most transactions for this account
SELECT TOP 1 branch_name
FROM transaction
WHERE account = t.account
GROUP BY branch_name
ORDER BY COUNT(*) DESC
)
AND date = (
-- last transaction date for this account in that branch
SELECT TOP 1 date
FROM transaction
WHERE account = t.account
AND branch_name = t.branch_name
ORDER BY date DESC
)
If there are multiple transactions on the last date, they are all going to be returned, though.

It sounds like you want the most recent transaction from the account/branch with the most transactions. If so, you can do this as:
select t.*
from (select t.*,
max(cnt) over (partition by account) as max_cnt
from (select t.*,
count(*) over (partition by account, branch_name) as cnt,
row_number() over (partition by account, branch_name order by date desc) as seqnum
from t
) t
) t
where max_cnt = cnt and seqnum = 1;
Note: If there are multiple branches that have the same count, then they are all returned. Your question does not specify how to deal with such duplicates.
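As a rough check of the nested window-function query above, here is a sketch that runs it against an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). The table name `txn`, the column `txn_date`, and the sample rows are assumptions made for illustration, since the question does not show the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txn (txn_id INTEGER, account TEXT, branch_name TEXT, txn_date TEXT);
    INSERT INTO txn VALUES
        (1, 'A', 'North', '2021-04-01'),
        (2, 'A', 'North', '2021-04-05'),
        (3, 'A', 'South', '2021-04-09'),
        (4, 'B', 'East',  '2021-04-02'),
        (5, 'B', 'West',  '2021-04-03'),
        (6, 'B', 'West',  '2021-04-07');
""")

query = """
    SELECT account, branch_name, txn_date, cnt
    FROM (SELECT t.*,
                 MAX(cnt) OVER (PARTITION BY account) AS max_cnt
          FROM (SELECT txn.*,
                       -- number of transactions per account and branch
                       COUNT(*) OVER (PARTITION BY account, branch_name) AS cnt,
                       -- 1 marks the most recent transaction in that branch
                       ROW_NUMBER() OVER (PARTITION BY account, branch_name
                                          ORDER BY txn_date DESC) AS seqnum
                FROM txn) AS t) AS t2
    WHERE cnt = max_cnt AND seqnum = 1
    ORDER BY account
"""
rows = conn.execute(query).fetchall()
print(rows)
# [('A', 'North', '2021-04-05', 2), ('B', 'West', '2021-04-07', 2)]
```

Account A's latest transaction overall is in South, but North has more transactions, so the latest North row is the one returned.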

Related

Cannot set Percent_rank() - defaults to 0.0

I'm trying to SET a column, but the result will end up as 0.0 for each row.
If I use the same syntax (the select part of it) in SELECT, the results display correctly.
UPDATE table1
SET ranking = (SELECT
PERCENT_RANK() OVER(PARTITION BY city ORDER BY sales DESC)
from table1
group by store_id)
Is it possible to make this work?
The subquery:
SELECT PERCENT_RANK() OVER(PARTITION BY city ORDER BY sales DESC)
from table1
group by store_id
returns one row for each store_id. Because the subquery is not correlated with the row being updated, SQLite takes just one of those rows and writes its PERCENT_RANK() value to every row of the table.
You must correlate the subquery with table1:
UPDATE table1
SET ranking = (
SELECT pr
FROM (
SELECT store_id,
PERCENT_RANK() OVER(PARTITION BY city ORDER BY sales DESC) pr
FROM table1
GROUP BY store_id
) t
WHERE table1.store_id = t.store_id
);
Or, if your version of SQLite is 3.33.0 or later, use the UPDATE...FROM... syntax:
UPDATE table1 AS t1
SET ranking = t.pr
FROM (
SELECT store_id,
PERCENT_RANK() OVER(PARTITION BY city ORDER BY sales DESC) pr
FROM table1
GROUP BY store_id
) t
WHERE t1.store_id = t.store_id;
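The correlated form can be tried out with Python's built-in sqlite3 module. This is only a sketch: the sample stores and sales figures are invented, and it assumes one row per store_id, so the GROUP BY from the queries above is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (store_id INTEGER PRIMARY KEY, city TEXT, sales REAL, ranking REAL);
    INSERT INTO table1 (store_id, city, sales) VALUES
        (1, 'Oslo', 300), (2, 'Oslo', 200), (3, 'Oslo', 100),
        (4, 'Rome', 50),  (5, 'Rome', 80);
""")

# The inner query computes PERCENT_RANK() for every store; the outer WHERE
# picks the value that belongs to the row currently being updated.
conn.execute("""
    UPDATE table1
    SET ranking = (
        SELECT pr
        FROM (SELECT store_id,
                     PERCENT_RANK() OVER (PARTITION BY city ORDER BY sales DESC) AS pr
              FROM table1) AS t
        WHERE t.store_id = table1.store_id
    )
""")

result = conn.execute(
    "SELECT store_id, ranking FROM table1 ORDER BY store_id"
).fetchall()
print(result)
# [(1, 0.0), (2, 0.5), (3, 1.0), (4, 1.0), (5, 0.0)]
```

Each store now carries its own rank within its city instead of the single value that the uncorrelated subquery produced.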

Select most recent status for each ID and department code

I have the following table:
I want to get the most recent status for each dept_code that a CL_ID has. So the desired output would be this:
I have tried the following, but this gives me just the most recent status for each client and not for each of their dept_codes.
SELECT *
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT] C
INNER JOIN
(SELECT CLIENT_NUMBER, MAX(STATUS_DATE) AS SDATE
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
GROUP BY CLIENT_NUMBER) X
ON X.CLIENT_NUMBER = C.CLIENT_NUMBER
AND X.SDATE = C.STATUS_DATE
ORDER BY C.CLIENT_NUMBER
Any help would be much appreciated. Thanks.
A convenient method that works in SQL Server is:
select top (1) with ties cl.*
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl
order by row_number() over (partition by cl_id, dept_code order by status_date desc);
A method that is efficient with the right indexes in almost any database is:
select cl.*
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl
where cl.status_date = (select max(cl2.status_date)
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl2
where cl2.cl_id = cl.cl_id and cl2.dept_code = cl.dept_code
);
The right index is on (cl_id, dept_code, status_date).
I would also use ROW_NUMBER, but with a subquery:
SELECT CL_ID, Status_date, Status, Dept_code
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY CL_ID, Dept_code ORDER BY Status_date DESC) rn
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
) t
WHERE rn = 1;
1) First, partition the rows by Dept_Code and CL_ID and rank the rows within each partition by Status_Date in descending order.
2) Then select the rows with rnk = 1, which gives your desired result.
SELECT Z.CL_ID,
Z.Status_Date,
Z.Status,
Z.Dept_Code
FROM
(
SELECT *,
RANK() OVER( PARTITION BY Dept_Code, CL_ID ORDER BY Status_Date DESC ) AS rnk
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
) Z
WHERE Z.rnk = 1;
This would work in almost all databases:
select * from c3clstat c
where exists
(select 1 from c3clstat c1
where c1.cl_id=c.cl_id
and c1.dept_code=c.dept_code
group by cl_id,dept_code
having c.status_date=max(c1.status_date)
)
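To compare two of the approaches above side by side, here is a sketch that runs the ROW_NUMBER() version and the correlated MAX() version against an in-memory SQLite database from Python. The lower-case table name and the sample rows are assumptions for illustration; the real table lives in SQL Server and its data is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c3clstat (cl_id INTEGER, status_date TEXT, status TEXT, dept_code TEXT);
    INSERT INTO c3clstat VALUES
        (1, '2020-01-01', 'Open',   'D1'),
        (1, '2020-03-01', 'Closed', 'D1'),
        (1, '2020-02-01', 'Open',   'D2'),
        (2, '2020-01-15', 'Open',   'D1'),
        (2, '2020-01-20', 'Hold',   'D1');
""")

# Number the rows per (cl_id, dept_code), newest first, and keep row 1.
row_number_sql = """
    SELECT cl_id, dept_code, status_date, status
    FROM (SELECT c.*,
                 ROW_NUMBER() OVER (PARTITION BY cl_id, dept_code
                                    ORDER BY status_date DESC) AS rn
          FROM c3clstat AS c) AS t
    WHERE rn = 1
    ORDER BY cl_id, dept_code
"""

# Keep the rows whose date equals the newest date of their (cl_id, dept_code).
correlated_sql = """
    SELECT cl_id, dept_code, status_date, status
    FROM c3clstat AS c
    WHERE status_date = (SELECT MAX(c2.status_date)
                         FROM c3clstat AS c2
                         WHERE c2.cl_id = c.cl_id AND c2.dept_code = c.dept_code)
    ORDER BY cl_id, dept_code
"""

latest_rn = conn.execute(row_number_sql).fetchall()
latest_corr = conn.execute(correlated_sql).fetchall()
print(latest_rn)
```

On this data both queries return the same three rows. They differ only when two rows share the newest date within a group: the correlated version returns both, while ROW_NUMBER() keeps one.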

Selecting City from Customer ID in SQL

Customers have ordered from different cities, so we have multiple cities against the same customer_id. For each customer_id I want to display the city that occurs the maximum number of times. If a customer has placed the same number of orders from multiple cities, the city of the most recent order should be selected. I have tried something like:
SELECT customer_id,delivery_city,COUNT(DISTINCT delivery_city)
FROM analytics.f_order
GROUP BY customer_id,delivery_city
HAVING COUNT(DISTINCT delivery_city) > 1
WITH cte as (
SELECT customer_id,
delivery_city,
COUNT(delivery_city) as city_count,
MAX(order_date) as last_order
FROM analytics.f_order
GROUP BY customer_id, delivery_city
), ranking as (
SELECT *, row_number() over (partition by customer_id
order by city_count DESC, last_order DESC) as rn
FROM cte
)
SELECT *
FROM ranking
WHERE rn = 1
select customer_id,
delivery_city,
amount
from
(
select t.*,
rank() over (partition by customer_id order by amount desc) as rnk
from(
SELECT customer_id,
delivery_city,
COUNT(*) as amount -- number of orders per customer and city
FROM analytics.f_order
GROUP BY customer_id,delivery_city
) t
) r
where rnk = 1
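The tie-break rule from the question (most frequent city, then the city of the latest order) can be checked with a small sketch in Python and an in-memory SQLite database. The sample orders are invented, and the `analytics.` schema prefix is dropped because SQLite has no such schema here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f_order (customer_id INTEGER, delivery_city TEXT, order_date TEXT);
    INSERT INTO f_order VALUES
        (1, 'Pune',   '2021-01-01'),
        (1, 'Pune',   '2021-01-05'),
        (1, 'Delhi',  '2021-01-09'),
        (2, 'Mumbai', '2021-02-01'),
        (2, 'Delhi',  '2021-02-03');
""")

query = """
    WITH cte AS (
        SELECT customer_id,
               delivery_city,
               COUNT(*)        AS city_count,
               MAX(order_date) AS last_order
        FROM f_order
        GROUP BY customer_id, delivery_city
    ), ranking AS (
        SELECT cte.*,
               -- most orders first; on a tie, the most recent order first
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY city_count DESC, last_order DESC) AS rn
        FROM cte
    )
    SELECT customer_id, delivery_city, city_count
    FROM ranking
    WHERE rn = 1
    ORDER BY customer_id
"""
cities = conn.execute(query).fetchall()
print(cities)
# [(1, 'Pune', 2), (2, 'Delhi', 1)]
```

Customer 1 gets Pune on count alone, even though the latest order went to Delhi. Customer 2 has one order per city, so the later order decides and Delhi wins.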

SQL Return MAX Values from Multiple Rows

I have tried many solutions and nothing seems to work. I am trying to return the MAX status date for a project. If that project has multiple items on the same date, then I need to return the MAX ID. So far I have tried this:
SELECT PRJSTAT_ID, PRJSTAT_PRJA_ID, PRJSTAT_STATUS, PRJSTAT_DATE
From Project_Status
JOIN
(SELECT MAX(PRJSTAT_PRJA_ID) as MaxID, MAX(PRJSTAT_DATE) as MaxDate
FROM Project_Status
Group by PRJSTAT_PRJA_ID)
On
PRJSTAT_PRJA_ID = MaxID and PRJSTAT_DATE = MaxDate
Order by PRJSTAT_PRJA_ID
It returns the following:
I am getting multiple records for PRJSTAT_PRJA_ID, but I only want to return the row with the MAX PRJSTAT_ID. Any thoughts?
Take out the MAX on the ID in the subquery:
SELECT PRJSTAT_ID, PRJSTAT_PRJA_ID, PRJSTAT_STATUS, PRJSTAT_DATE
From Project_Status
JOIN
(SELECT PRJSTAT_PRJA_ID as ID, MAX(PRJSTAT_DATE) as MaxDate
FROM Project_Status
Group by PRJSTAT_PRJA_ID)
On
PRJSTAT_PRJA_ID = ID and PRJSTAT_DATE = MaxDate
Order by PRJSTAT_PRJA_ID
Or remove the need to join:
SELECT * FROM
(SELECT PRJSTAT_ID, PRJSTAT_PRJA_ID, PRJSTAT_STATUS, PRJSTAT_DATE,
-- latest date first; on a tie, the highest PRJSTAT_ID first
ROW_NUMBER() OVER (PARTITION BY PRJSTAT_PRJA_ID
ORDER BY PRJSTAT_DATE DESC, PRJSTAT_ID DESC) AS SEQ
From Project_Status
)PR
WHERE SEQ = 1
Your problem is ties. You want the record with the maximum date per PRJSTAT_PRJA_ID and in case of a tie the record with the highest ID. The easiest way to rank records per group and only keep the best record is ROW_NUMBER:
select prjstat_id, prjstat_prja_id, prjstat_status, prjstat_date
from
(
select
project_status.*,
row_number() over (partition by prjstat_prja_id
order by prjstat_date desc, prjstat_id desc) as rn
from project_status
)
where rn = 1
order by prjstat_prja_id;
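A quick way to confirm the tie handling is to run the ROW_NUMBER() query on a tiny data set where one project has two status rows on the same date. The sketch below uses Python's sqlite3 module with an in-memory database; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project_status (
        prjstat_id INTEGER, prjstat_prja_id INTEGER,
        prjstat_status TEXT, prjstat_date TEXT);
    -- project 10 has two rows on its latest date (ids 2 and 3)
    INSERT INTO project_status VALUES
        (1, 10, 'A', '2021-05-01'),
        (2, 10, 'B', '2021-06-01'),
        (3, 10, 'C', '2021-06-01'),
        (4, 20, 'A', '2021-04-01');
""")

query = """
    SELECT prjstat_id, prjstat_prja_id, prjstat_status, prjstat_date
    FROM (SELECT ps.*,
                 ROW_NUMBER() OVER (PARTITION BY prjstat_prja_id
                                    ORDER BY prjstat_date DESC, prjstat_id DESC) AS rn
          FROM project_status AS ps) AS ranked
    WHERE rn = 1
    ORDER BY prjstat_prja_id
"""
latest = conn.execute(query).fetchall()
print(latest)
# [(3, 10, 'C', '2021-06-01'), (4, 20, 'A', '2021-04-01')]
```

Project 10 returns only the row with id 3, the higher of the two ids that share the latest date.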

Query to pull second smallest id

CUSTOMER(ID,CASE_ID,NAME,STATE)
1,100,Alex,NY
2,100,Alex,VA
3,100,Alex,CT
4,100,Tom,PA
5,102,Peter,MO
6,103,Dave,TN
.
.
.
How do I write a query to pull the 2nd smallest (min) id (if present) for every group of case_id?
Please try:
SELECT
ID,
CASE_ID
FROM
(
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY CASE_ID ORDER BY ID) Rn
FROM CUSTOMER
)x
WHERE Rn=2
You can use a windowing function:
with cte as (
select ID, CASE_ID, ROW_NUMBER() over (partition by CASE_ID order by ID) rn
from CUSTOMER
)
select ID, CASE_ID
from cte
where rn = 2
Or you can use an exists clause to remove the first row (i.e. get the minimum value where there is a row with a lower value):
select MIN(ID) ID, CASE_ID
from CUSTOMER c
where exists (select 1 from CUSTOMER c2 where c2.ID < c.ID and c2.CASE_ID = c.CASE_ID)
group by CASE_ID
Or, written another way:
select MIN(ID) ID, CASE_ID
from CUSTOMER c
where c.ID > (select MIN(ID) from CUSTOMER c2 where c2.CASE_ID = c.CASE_ID)
group by CASE_ID
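The approaches above can be compared on the six sample rows from the question. This sketch loads them into an in-memory SQLite database from Python and runs the ROW_NUMBER() query and the last MIN() query; the rows hidden behind the dots in the question are not included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER, case_id INTEGER, name TEXT, state TEXT);
    INSERT INTO customer VALUES
        (1, 100, 'Alex',  'NY'),
        (2, 100, 'Alex',  'VA'),
        (3, 100, 'Alex',  'CT'),
        (4, 100, 'Tom',   'PA'),
        (5, 102, 'Peter', 'MO'),
        (6, 103, 'Dave',  'TN');
""")

# Number the ids within each case_id and keep the second one.
row_number_sql = """
    SELECT id, case_id
    FROM (SELECT c.*, ROW_NUMBER() OVER (PARTITION BY case_id ORDER BY id) AS rn
          FROM customer AS c) AS x
    WHERE rn = 2
"""

# Drop the smallest id of each case_id, then take the minimum of what is left.
min_sql = """
    SELECT MIN(id) AS id, case_id
    FROM customer AS c
    WHERE c.id > (SELECT MIN(id) FROM customer AS c2 WHERE c2.case_id = c.case_id)
    GROUP BY case_id
"""

second_by_rn = conn.execute(row_number_sql).fetchall()
second_by_min = conn.execute(min_sql).fetchall()
print(second_by_rn, second_by_min)
# [(2, 100)] [(2, 100)]
```

Cases 102 and 103 have a single row each, so neither query returns anything for them.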