How to get rid of multiple branch_id? - sql

I have a SQL Query of this:
SELECT
COUNT(PERMISSION_ID) AS USER_TOTAL_PERMISSION_PER_BRANCH,
USER_ID,
BRANCH_ID
FROM BRANCH_PERMISSION_USER
GROUP BY USER_ID, BRANCH_ID
ORDER BY USER_ID, USER_TOTAL_PERMISSION_PER_BRANCH DESC
But I have a problem because I only want the first row per user_id. The main goal is to get the list of users together with each user's branch, keeping only the top 1 row per user by USER_TOTAL_PERMISSION_PER_BRANCH.
Here is the sample output:
Expected output should be:
[USER_TOTAL_PERMISSION_PER_BRANCH][USER_ID][BRANCH_ID]
135 1 1
135 2 1
134 3 1
1 4 1
1 5 1
1 6 1

You can use window functions:
SELECT USER_TOTAL_PERMISSION_PER_BRANCH, USER_ID, BRANCH_ID
FROM (SELECT COUNT(*) AS USER_TOTAL_PERMISSION_PER_BRANCH,
USER_ID, BRANCH_ID,
ROW_NUMBER() OVER (PARTITION BY USER_ID ORDER BY COUNT(*) DESC) as seqnum
FROM BRANCH_PERMISSION_USER
GROUP BY USER_ID, BRANCH_ID
) ub
WHERE seqnum = 1
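
To see the row-numbering idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database (an assumption; it needs SQLite 3.25 or newer for window functions). The sample rows are hypothetical, and the counts are computed in a CTE first so the window function only has to order by a plain column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE branch_permission_user (user_id INT, branch_id INT, permission_id INT)"
)
# Hypothetical data: user 1 has 2 permissions in branch 1 and 1 in branch 2;
# user 2 has 1 permission in branch 3 and 3 in branch 4.
conn.executemany(
    "INSERT INTO branch_permission_user VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 1, 11), (1, 2, 10),
     (2, 3, 10), (2, 4, 10), (2, 4, 11), (2, 4, 12)],
)

top_branch_per_user = conn.execute("""
    WITH counts AS (
        SELECT COUNT(*) AS user_total_permission_per_branch, user_id, branch_id
        FROM branch_permission_user
        GROUP BY user_id, branch_id
    )
    SELECT user_total_permission_per_branch, user_id, branch_id
    FROM (SELECT counts.*,
                 ROW_NUMBER() OVER (PARTITION BY user_id
                                    ORDER BY user_total_permission_per_branch DESC) AS seqnum
          FROM counts) ub
    WHERE seqnum = 1          -- keep only the busiest branch per user
    ORDER BY user_id
""").fetchall()

print(top_branch_per_user)
```

With this data each user appears exactly once, paired with the branch where they hold the most permissions.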

You can turn your query into a CTE and do the filtering using a correlated subquery:
with cte as (
select
count(permission_id) as user_total_permission_per_branch,
user_id,
branch_id
from branch_permission_user
group by user_id, branch_id
)
select c.*
from cte c
where c.user_total_permission_per_branch = (
select max(c1.user_total_permission_per_branch)
from cte c1
where c1.user_id = c.user_id
)
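
One thing worth knowing about the correlated-maximum approach is how it treats ties. The sketch below (Python's sqlite3 as an assumed stand-in database, hypothetical sample rows) correlates on user_id only, so the maximum is taken per user; when two branches tie for a user's highest count, both rows come back, unlike a ROW_NUMBER() filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE branch_permission_user (user_id INT, branch_id INT, permission_id INT)"
)
# Hypothetical data: user 1 has 2 permissions in each of branches 1 and 2 (a tie);
# user 2 has a single permission in branch 3.
conn.executemany(
    "INSERT INTO branch_permission_user VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 1, 11), (1, 2, 10), (1, 2, 11), (2, 3, 10)],
)

max_rows = conn.execute("""
    WITH cte AS (
        SELECT COUNT(permission_id) AS total, user_id, branch_id
        FROM branch_permission_user
        GROUP BY user_id, branch_id
    )
    SELECT c.total, c.user_id, c.branch_id
    FROM cte c
    WHERE c.total = (SELECT MAX(c1.total)
                     FROM cte c1
                     WHERE c1.user_id = c.user_id)   -- per-user maximum
    ORDER BY c.user_id, c.branch_id
""").fetchall()

print(max_rows)
```

If exactly one row per user is required, a tie-breaker (for example the lowest branch_id) has to be added on top of this filter.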

Thanks to Sir #Gordon.
I used his logic. Here it is:
SELECT USER_TOTAL_PERMISSION_PER_BRANCH, USER_ID, BRANCH_ID
FROM (SELECT COUNT(*) AS USER_TOTAL_PERMISSION_PER_BRANCH,
USER_ID, BRANCH_ID,
ROW_NUMBER() OVER (PARTITION BY USER_ID ORDER BY COUNT(*) DESC) as seqnum
FROM BRANCH_PERMISSION_USER
GROUP BY USER_ID, BRANCH_ID
) ub
WHERE seqnum = 1

Related

How do I find the Sum and Max value per Unique ID in HIVE?

basically how do I turn
id name quantity
1 Jerry 1
1 Jerry 2
1 Nana 1
2 Max 4
2 Lenny 3
into
id name quantity
1 Jerry 3
2 Max 4
in HIVE?
I want to sum up and find the highest quantity for each unique ID
You can use window functions with aggregation:
select id, name, quantity
from (select id, name, sum(quantity) as quantity,
row_number() over (partition by id order by sum(quantity) desc) as seqnum
from t
group by id, name
) t
where seqnum = 1;
You can first calculate the sum of quantity per group, then rank them according to descending quantity, and finally filter the rows with rank = 1.
select
id, name, quantity
from (
select
*,
row_number() over (partition by id order by quantity desc) as rn
from (
select id, name, sum(quantity) as quantity
from mytable
group by id, name
) s
) r where rn = 1;
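
The three steps just described (sum per group, rank by descending sum, keep rank 1) can be checked against the question's own sample rows. This is a minimal sketch that runs the SQL through Python's sqlite3 rather than Hive (an assumption made only so the example is runnable; it needs SQLite 3.25 or newer), with the table name `t` chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, name TEXT, quantity INT)")
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "Jerry", 1), (1, "Jerry", 2), (1, "Nana", 1), (2, "Max", 4), (2, "Lenny", 3)],
)

best_per_id = conn.execute("""
    WITH totals AS (                       -- step 1: sum per (id, name)
        SELECT id, name, SUM(quantity) AS quantity
        FROM t
        GROUP BY id, name
    ),
    ranked AS (                            -- step 2: rank within each id
        SELECT totals.*,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY quantity DESC) AS rn
        FROM totals
    )
    SELECT id, name, quantity              -- step 3: keep the top row per id
    FROM ranked
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(best_per_id)
```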
Try something like the below:
with cte as
(
select id,name,sum(quantity) as q
from table_name group by id,name
) select id,name,q from cte t1
where t1.q=( select max(q) from cte t2 where t1.id=t2.id)

Selecting rows that have row_number more than 1

I have a table as following (using bigquery):
id year month sales row_number
111 2020 11 1000 1
111 2020 12 2000 2
112 2020 11 3000 1
113 2020 11 1000 1
Is there a way in which I can select rows that have row numbers more than one?
For example, my desired output is:
id year month sales row_number
111 2020 11 1000 1
111 2020 12 2000 2
I don't want to just exclusively select rows with row_number = 2 but also row_number = 1 as well.
The original code block I used for the first table result is:
SELECT
id,
year,
month,
SUM(sales) AS sales,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY id ASC) AS row_number
FROM
table
GROUP BY
id, year, month
You can use window functions:
select t.* except (cnt)
from (select t.*,
count(*) over (partition by id) as cnt
from t
) t
where cnt > 1;
As applied to your aggregation query:
SELECT iym.* EXCEPT (cnt)
FROM (SELECT id, year, month,
SUM(sales) as sales,
ROW_NUMBER() OVER (Partition by id ORDER BY id ASC) AS row_number,
COUNT(*) OVER(Partition by id ORDER BY id ASC) AS cnt
FROM table
GROUP BY id, year, month
) iym
WHERE cnt > 1;
You can wrap your query as in the example below
select * except(flag) from (
select *, countif(row_number > 1) over(partition by id) > 0 flag
from (YOUR_ORIGINAL_QUERY)
)
where flag
so it can look like this
select * except(flag) from (
select *, countif(row_number > 1) over(partition by id) > 0 flag
from (
SELECT id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
FROM table
GROUP BY id, year, month
)
)
where flag
so when applied to sample data in your question - it will produce below output
Try this:
with tmp as (SELECT id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
FROM table
GROUP BY id, year, month)
select * from tmp a where exists ( select 1 from tmp b where a.id = b.id and b.row_number =2)
This is a straightforward use of an EXISTS subquery.
This is what I use. It is similar to #ElapsedSoul's answer, but my understanding is that for a static list IN is better than EXISTS. I'm not sure whether the performance difference, if any, is significant:
Difference between EXISTS and IN in SQL?
WITH T1 AS
(
SELECT
id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY id ASC) AS ROW_NUM
FROM table
GROUP BY id, year, month
)
SELECT *
FROM T1
WHERE id IN (SELECT id FROM T1 WHERE ROW_NUM > 1);
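
All of the answers above boil down to the same test: keep an id only if it has more than one row after aggregation. Here is a small sketch of the COUNT(*) OVER (PARTITION BY id) variant, run through Python's sqlite3 instead of BigQuery (an assumption for runnability; SQLite has no `EXCEPT (cnt)`, so the columns are listed explicitly). The table name `sales_by_month` is hypothetical and the rows are the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_by_month (id INT, year INT, month INT, sales INT)")
conn.executemany(
    "INSERT INTO sales_by_month VALUES (?, ?, ?, ?)",
    [(111, 2020, 11, 1000), (111, 2020, 12, 2000),
     (112, 2020, 11, 3000), (113, 2020, 11, 1000)],
)

repeated_ids = conn.execute("""
    WITH monthly AS (
        SELECT id, year, month, SUM(sales) AS sales
        FROM sales_by_month
        GROUP BY id, year, month
    )
    SELECT id, year, month, sales
    FROM (SELECT monthly.*,
                 COUNT(*) OVER (PARTITION BY id) AS cnt   -- rows per id
          FROM monthly) counted
    WHERE cnt > 1                                         -- ids seen more than once
    ORDER BY id, year, month
""").fetchall()

print(repeated_ids)
```

Only the two rows for id 111 survive, which matches the desired output in the question.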

List the most up-to-date product of each category, PostgreSQL queries

user_id product_id category_id date_added date_update
1 2 1 2.3.2021 null
1 3 1 2.3.2020 2.4.2023
1 4 2 2.3.2020 null
1 5 2 2.3.2020 2.4.2023
2 5 2 2.3.2020 2.4.2023
2 4 1 2.3.2020 null
List the most up-to-date product of each category
You can use row_number()
select * from
(
select *,row_number() over(partition by user_id,category_id order by date_update desc nulls last) as rn
from tablename
)A where rn=1
OR you can also use distinct on
select distinct on (user_id,category_id) *
FROM tablename
ORDER BY user_id,category_id, date_update desc nulls last
List the most up-to-date product of each category
You can use distinct on. Let me assume that if the update date is null, then you want the creation date:
select distinct on (category_id) t.*
from t
order by category_id, coalesce(date_update, date_added) desc;
If you wanted this per user/category combination, the logic would be:
select distinct on (user_id, category_id) t.*
from t
order by user_id, category_id, coalesce(date_update, date_added) desc;
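
For engines without DISTINCT ON, the same "latest of update date or creation date" rule can be expressed with ROW_NUMBER(). The sketch below uses Python's sqlite3 purely as a runnable stand-in (an assumption; SQLite does not support DISTINCT ON). The table name `products` is hypothetical, and the question's dates are rewritten as ISO strings on the assumption that `2.3.2021` means day.month.year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (user_id INT, product_id INT, category_id INT,"
    " date_added TEXT, date_update TEXT)"
)
# Rows from the question, with dates converted to ISO format (assumed d.m.yyyy).
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?)",
    [(1, 2, 1, "2021-03-02", None),
     (1, 3, 1, "2020-03-02", "2023-04-02"),
     (1, 4, 2, "2020-03-02", None),
     (1, 5, 2, "2020-03-02", "2023-04-02"),
     (2, 5, 2, "2020-03-02", "2023-04-02"),
     (2, 4, 1, "2020-03-02", None)],
)

latest_per_category = conn.execute("""
    SELECT category_id, product_id, last_change
    FROM (SELECT category_id, product_id,
                 COALESCE(date_update, date_added) AS last_change,
                 ROW_NUMBER() OVER (
                     PARTITION BY category_id
                     ORDER BY COALESCE(date_update, date_added) DESC, product_id
                 ) AS rn
          FROM products) ranked
    WHERE rn = 1
    ORDER BY category_id
""").fetchall()

print(latest_per_category)
```

Adding user_id to the PARTITION BY clause would give the per user/category variant.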
Using Window function
select * from (
select u_id,c_id, p_id, coalesce (date_update, date_added) as date ,
rank () over (partition by u_id, c_id order by coalesce (date_update, date_added) desc) as r
from inventory
) t where r = 1

Avoid Unions to get TOP count

Here are two tables:
LocationId Address City State Zip
1 2100, 1st St Austin TX 76819
2 2200, 2nd St Austin TX 76829
3 2300, 3rd St Austin TX 76839
4 2400, 4th St Austin TX 76849
5 2500, 5th St Austin TX 76859
6 2600, 6th St Austin TX 76869
TripId PassengerId FromLocationId ToLocationId
1 746896 1 2
2 746896 2 1
3 234456 1 3
4 234456 3 1
5 234456 1 4
6 234456 4 1
7 234456 1 6
8 234456 6 1
9 746896 1 2
10 746896 2 1
11 746896 1 2
12 746896 2 1
I want TOP 5 locations which each passenger has traveled to (does not matter if its from or to location). I can get it using a UNION, but was wondering if there was a better way to do this.
My Solution:
select top 5 *
from
(select count(l.LocationId) as cnt, l.LocationId, l.Address1, l.Address2, l.City, l.State , l.Zip
from
Trip t
join LOCATION l on t.FromLocationId = l.LocationId
where t.PassengerId = 746896
group by l.LocationId, l.Address1, l.Address2, l.City, l.State , l.Zip
UNION
select count(l.LocationId) as cnt, l.LocationId, l.Address1, l.Address2, l.City, l.State , l.Zip
from
Trip t
join LOCATION l on t.ToLocationId = l.LocationId
where t.PassengerId = 746896
group by l.LocationId, l.Address1, l.Address2, l.City, l.State , l.Zip
) as tbl
order by cnt desc
This will give you top 5 location.
SELECT TOP 5 tmp.fromlocationid AS locationid,
Count(tmp.fromlocationid) AS Times
FROM (SELECT fromlocationid
FROM trip
UNION ALL
SELECT tolocationid
FROM trip) tmp
GROUP BY tmp.fromlocationid
ORDER BY Times DESC
Method 1: This will give you top 5 location of each passenger.
WITH cte AS
( SELECT passengerid,
locationid,
Count(locationid) AS Times,
Row_number() OVER(partition BY passengerid ORDER BY Count(locationid) DESC) AS RowNum
FROM (SELECT tripid, passengerid, fromlocationid AS locationid
FROM trip
UNION ALL
SELECT tripid, passengerid, tolocationid AS locationid
FROM trip) tmp
GROUP BY passengerid, locationid )
SELECT *
FROM cte
WHERE rownum <= 5
ORDER BY passengerid, Times DESC
Method 2: Same result without Union Operator (Top 5 location of each passenger)
WITH cte AS
( SELECT passengerid,
locationid,
Count(locationid) AS Times,
Row_number() OVER(partition BY passengerid ORDER BY Count(locationid) DESC) AS RowNum
FROM trip
UNPIVOT ( locationid
FOR subject IN (fromlocationid, tolocationid) ) u
GROUP BY passengerid, locationid )
SELECT *
FROM cte
WHERE rownum <= 5
ORDER BY passengerid, times DESC
If you also want to get the location details, you can simply join the location table.
SELECT cte.* , location.*
FROM cte
INNER JOIN location ON location.locationid = cte.locationid
WHERE rownum <= 5
ORDER BY passengerid, times DESC
Reference
- https://stackoverflow.com/a/19056083/6327676
You'll need to replace the SELECT *'s with the columns you need, however, something like this should work:
WITH Visits AS (
SELECT *,
COUNT(*) OVER (PARTITION BY t.PassengerID, L.LocationID) AS Visits
FROM Trip T
JOIN [Location] L ON T.FromLocationId = L.LocationId),
Rankings AS (
SELECT *,
DENSE_RANK() OVER (PARTITION BY V.PassengerID ORDER BY Visits DESC) AS Ranking
FROM Visits V)
SELECT *
FROM Rankings
WHERE Ranking <= 5;
Further simplified solution
select top 3 * from
(
Select distinct count(locationId) as cnt, locationId from trip
unpivot
(
locationId
for direction in (fromLocationId, toLocationId)
)u
where passengerId IN (746896, 234456)
group by direction, locationId
)as tbl2
order by cnt desc;
Solution combining columns
The main issue for me is avoiding union to combine the two columns.
The UNPIVOT command can do this.
select top 3 * from (
select count(locationId) cnt, locationId
from
(
Select valu as locationId, passengerId from trip
unpivot
(
valu
for loc in (fromLocationId, toLocationId)
)u
)united
where passengerId IN (746896, 234456)
group by locationId
) as tbl
order by cnt desc;
http://sqlfiddle.com/#!18/cec8b/136
If you want to get the counts by direction:
select top 3 * from (
select count(locationId) cnt, locationId, direction
from
(
Select valu as locationId, direction, passengerId from trip
unpivot
(
valu
for direction in (fromLocationId, toLocationId)
)u
)united
where passengerId IN (746896, 234456)
group by locationId, direction
) as tbl
order by cnt desc;
http://sqlfiddle.com/#!18/cec8b/139
Same Results as you ( minus some minor descriptions )
select top 3 * from
(
select distinct * from (
select count(locationId) cnt, locationId
from
(
Select valu as locationId, direction, passengerId from trip
unpivot
(
valu
for direction in (fromLocationId, toLocationId)
)u
)united
where passengerId IN (746896, 234456)
group by locationId, direction
) as tbl
)as tbl2
order by cnt desc;
You can do this without union all:
select top (5) t.passengerid, v.locationid, count(*)
from trip t cross apply
(values (fromlocationid), (tolocationid)) v(locationid) join
location l
on v.locationid = l.locationid
where t.PassengerId = 746896
group by t.passengerid, v.locationid
order by count(*) desc;
If you want an answer for all passengers, it would be a similar idea, using row_number(), but your query suggests you want the answer only for one customer at a time.
You can include additional fields from location as well.
Here is a SQL Fiddle.
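
The all-passengers variant mentioned above can be sketched as follows. This is not the SQL Server query itself: it runs on Python's sqlite3 (an assumption for runnability; SQLite has no CROSS APPLY), so the from/to columns are unpivoted by cross joining each trip with a hypothetical two-row helper CTE named `side`. Counts are then ranked per passenger with row_number(). The trip rows are the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trip (tripid INT, passengerid INT, fromlocationid INT, tolocationid INT)"
)
conn.executemany(
    "INSERT INTO trip VALUES (?, ?, ?, ?)",
    [(1, 746896, 1, 2), (2, 746896, 2, 1), (3, 234456, 1, 3), (4, 234456, 3, 1),
     (5, 234456, 1, 4), (6, 234456, 4, 1), (7, 234456, 1, 6), (8, 234456, 6, 1),
     (9, 746896, 1, 2), (10, 746896, 2, 1), (11, 746896, 1, 2), (12, 746896, 2, 1)],
)

top_locations = conn.execute("""
    WITH side(n) AS (VALUES (1), (2)),     -- 1 = "from" end, 2 = "to" end
    visits AS (                            -- one row per trip endpoint
        SELECT t.passengerid,
               CASE s.n WHEN 1 THEN t.fromlocationid ELSE t.tolocationid END AS locationid
        FROM trip t CROSS JOIN side s
    ),
    counts AS (
        SELECT passengerid, locationid, COUNT(*) AS times
        FROM visits
        GROUP BY passengerid, locationid
    ),
    ranked AS (
        SELECT counts.*,
               ROW_NUMBER() OVER (PARTITION BY passengerid
                                  ORDER BY times DESC, locationid) AS rn
        FROM counts
    )
    SELECT passengerid, locationid, times
    FROM ranked
    WHERE rn <= 5                          -- top 5 locations per passenger
    ORDER BY passengerid, rn
""").fetchall()

print(top_locations)
```

The extra `locationid` in the window ORDER BY is only there to make ties deterministic.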

Select Max two rows of each account SQL Server

I have this table
ID AGE ACCNUM NAME
--------------------------------
1 10 55409 Intro
2 6 55409 Chapter1
3 4 55409 Chapter2
4 3 69591 Intro
5 6 69591 Outro
6 0 40322 Intro
And I need a query that returns the two max age from each ACCNUM
in this case, records:
1, 2, 4, 5, 6
I have tried too many queries but nothing works for me.
I tried this query
Select
T1.accnum, T1.age
from
table1 as T1
inner join
(select
accnum, max(age) as max
from table1
group by accnum) as T2 on T1.accnum = T2.accnum
and (T1.age = T2.max or T1.age = T2.max -1)
TSQL Ranking Functions: Row_Number() https://msdn.microsoft.com/en-us/library/ms186734.aspx
select id, age, accnum, name
from
(
select id, age, accnum, name, ROW_NUMBER() Over (Partition By accnum order by age desc) as rn
from yourtable
) a
where a.rn <= 2
You can use row_number():
select accnum
, age
from ( select accnum
, age
, row_number() over(partition by accnum order by age desc) as r
from table1 as T1) t where r < 3
CODE:
WITH CTE AS (SELECT ID, AGE, ACCNUM, NAME,
ROW_NUMBER() OVER(PARTITION BY ACCNUM ORDER BY AGE DESC) AS ROW_NUM
FROM T1)
SELECT ID, AGE, ACCNUM, NAME
FROM CTE
WHERE ROW_NUM <= 2
Uses a common table expression to achieve the desired result.
SQL Fiddle
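
As a quick check of the ROW_NUMBER() answers against the question's sample table, here is a minimal sketch that runs the query through Python's sqlite3 (an assumption standing in for SQL Server; it needs SQLite 3.25 or newer for window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INT, age INT, accnum INT, name TEXT)")
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, 10, 55409, "Intro"), (2, 6, 55409, "Chapter1"), (3, 4, 55409, "Chapter2"),
     (4, 3, 69591, "Intro"), (5, 6, 69591, "Outro"), (6, 0, 40322, "Intro")],
)

top_two_ids = [
    row[0]
    for row in conn.execute("""
        SELECT id
        FROM (SELECT id,
                     ROW_NUMBER() OVER (PARTITION BY accnum ORDER BY age DESC) AS rn
              FROM table1) ranked
        WHERE rn <= 2          -- two highest ages per account
        ORDER BY id
    """)
]

print(top_two_ids)
```

The result matches the records listed in the question: account 55409 drops its lowest-age row (id 3), while the accounts with one or two rows keep everything.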