row_number based on case when with text content - sql

I need to produce a result with three columns (user_id, order_time, ordered_subject) that meets these conditions:
1. one row per unique user_id
2. the earliest order_time for that user
3. ordered_subject is prioritized in the order app -> acc -> ayy
4. if several rows share the same earliest order_time, keep only one subject, chosen according to the 3rd requirement
original table: user_order
user_id order_time ordered_subject
1 2001-02-09 app
2 2001-02-09 app
3 2001-02-10 ayy
1 2001-02-09 acc
1 2001-02-10 app
4 2001-02-08 ayy
5 2001-02-09 acc
5 2001-02-09 ayy
expected table:
user_id order_time ordered_subject
1 2001-02-09 app
2 2001-02-09 app
3 2001-02-10 ayy
4 2001-02-08 ayy
5 2001-02-09 acc
I came up with the idea of using CASE WHEN and ROW_NUMBER() OVER, but it doesn't work.
The code I tried:
select
    a.uid,
    a.subject,
    b.min_time,
    (case when "app" then 1
          when "acc" then 2
          when "ayy" then 3
          else 4 end) as rn,
    row_number() over(partition by
        concat(uid,order_id)
        order by
        rn)
from (
    select uid, min(order_time) as min_time
    from user_order
    group by
    uid
) as b
-- join
user_order as a
-- on
where
    a.uid = b.uid
    and
    b.min_time = a.order_time
How should I fix this?

You want one result row per user. For each user you want the earliest order, and if there is more than one order on the earliest date, you prefer the subject app over acc and acc over ayy.
You want to use ROW_NUMBER, so partition by user ID and order by date and the order subject in the desired order.
select user_id, order_time, ordered_subject
from
(
    select
        user_id, order_time, ordered_subject,
        row_number() over
            (partition by user_id
             order by order_time,
                      case ordered_subject
                          when 'app' then 1
                          when 'acc' then 2
                          when 'ayy' then 3
                          else 4
                      end) as rn
    from user_order
) numbered
where rn = 1
order by user_id;
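To check the approach against the sample data, here is a minimal sketch that runs the same query in an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite (3.25 or newer, for window functions) and the variable names are assumptions made for illustration; the question does not say which database is in use.

```python
import sqlite3

# In-memory SQLite database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_order (user_id INTEGER, order_time TEXT, ordered_subject TEXT)"
)
conn.executemany(
    "INSERT INTO user_order VALUES (?, ?, ?)",
    [
        (1, "2001-02-09", "app"),
        (2, "2001-02-09", "app"),
        (3, "2001-02-10", "ayy"),
        (1, "2001-02-09", "acc"),
        (1, "2001-02-10", "app"),
        (4, "2001-02-08", "ayy"),
        (5, "2001-02-09", "acc"),
        (5, "2001-02-09", "ayy"),
    ],
)

# Number each user's rows by date, then by subject priority, and keep row 1.
query = """
SELECT user_id, order_time, ordered_subject
FROM (
    SELECT user_id, order_time, ordered_subject,
           ROW_NUMBER() OVER (
               PARTITION BY user_id
               ORDER BY order_time,
                        CASE ordered_subject
                            WHEN 'app' THEN 1
                            WHEN 'acc' THEN 2
                            WHEN 'ayy' THEN 3
                            ELSE 4
                        END
           ) AS rn
    FROM user_order
) numbered
WHERE rn = 1
ORDER BY user_id
"""
first_orders = conn.execute(query).fetchall()
for row in first_orders:
    print(row)
```

Because the CASE expression is part of the window's ORDER BY, the tie on 2001-02-09 for user 5 is broken in favor of acc over ayy.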

Related

SQL: Running total count of distinct values

I'm trying to obtain rolling number of unique values in a window.
Here's what my table looks like:
SELECT
    user_id
    , order_date
    , product
FROM example_table
WHERE user_id = 1
ORDER BY order_date ASC
user_id order_date product
1 2021-01-01 A
1 2021-01-02 B
1 2021-01-04 A
1 2021-01-07 C
1 2021-01-09 C
1 2021-01-20 A
Here's what I'm trying to achieve:
user_id order_date product cum_dist_count
1 2021-01-01 A 1
1 2021-01-02 B 2
1 2021-01-04 A 2
1 2021-01-07 C 3
1 2021-01-09 C 3
1 2021-01-20 A 3
In other words, I want to see how many unique items a customer has bought so far, and to see that number for any particular date (for the example above: on 2021-01-04 they had bought 2 unique items, and on 2021-01-07 that number was 3).
I've tried selecting user_id, product, and min(order_date) grouped by user_id and product in a CTE, then applying ROW_NUMBER per user_id over that CTE. That worked partially: I can see the dates on which the count of unique products changed (for this example: 2021-01-01, 2021-01-02 and 2021-01-07), but I lose the rows "between", which I still want to be able to access.
with cte as (
    SELECT
        user_id
        , product
        , min(order_date) as first_order
    FROM example_table
    GROUP BY 1,2
)
SELECT
    user_id
    , first_order
    , product
    , ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY first_order) AS number_of_unique_products
FROM cte
WHERE user_id = 1
With the above, I would get:
user_id order_date product cum_dist_count
1 2021-01-01 A 1
1 2021-01-02 B 2
1 2021-01-07 C 3
The DB is in BigQuery StandardSQL.
Any help is much appreciated!
For each item, you can record the earliest date it appears. Then add those up:
select et.* except (seqnum),
       countif(seqnum = 1) over (partition by user_id order by order_date) as running_distinct_count
from (select et.*,
             row_number() over (partition by user_id, product order by order_date) as seqnum
      from example_table et
     ) et
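The same idea can be tried outside BigQuery. The sketch below is an adaptation rather than the answer's exact query: it assumes SQLite (3.25 or newer) through Python's sqlite3 module, and since SQLite has neither COUNTIF nor SELECT * EXCEPT, it uses SUM over a CASE expression and lists the columns explicitly. The rows are the ones from the question's expected table.

```python
import sqlite3

# In-memory SQLite database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example_table (user_id INTEGER, order_date TEXT, product TEXT)"
)
conn.executemany(
    "INSERT INTO example_table VALUES (?, ?, ?)",
    [
        (1, "2021-01-01", "A"),
        (1, "2021-01-02", "B"),
        (1, "2021-01-04", "A"),
        (1, "2021-01-07", "C"),
        (1, "2021-01-09", "C"),
        (1, "2021-01-20", "A"),
    ],
)

# seqnum = 1 marks the first time a user buys a product; a running sum of
# those markers is the number of distinct products bought so far.
query = """
SELECT user_id, order_date, product,
       SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END)
           OVER (PARTITION BY user_id ORDER BY order_date) AS running_distinct_count
FROM (
    SELECT et.*,
           ROW_NUMBER() OVER (PARTITION BY user_id, product ORDER BY order_date) AS seqnum
    FROM example_table et
) et
ORDER BY user_id, order_date
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

Every original row is kept, so the rows "between" the dates on which the count changes remain available.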
Below is for BigQuery
select * except(cum_products),
    (select count(distinct product) from t.cum_products product) as cum_dist_count
from (
    select *,
        array_agg(product) over prev_rows as cum_products
    from example_table
    window prev_rows as (partition by user_id order by order_date)
) t
If applied to the sample data in your question
with example_table as (
    select 1 user_id, '2021-01-01' order_date, 'A' product union all
    select 1, '2021-01-02', 'B' union all
    select 1, '2021-01-04', 'A' union all
    select 1, '2021-01-07', 'C' union all
    select 1, '2021-01-09', 'C' union all
    select 1, '2021-01-20', 'A'
)
the output matches the expected result in the question (cum_dist_count of 1, 2, 2, 3, 3, 3 in date order).

SQL : Return joint most frequent values from a column

I have the following table named customerOrders.
ID user order
1 1 2
2 1 3
3 1 1
4 2 1
5 1 5
6 2 4
7 3 1
8 6 2
9 2 2
10 2 3
I want to return the users with the most orders. Currently, I have the following query:
SELECT user, COUNT(user) AS UsersWithMostOrders
FROM customerOrders
GROUP BY user
ORDER BY UsersWithMostOrders DESC;
This returns every user with their total number of orders, like this:
user UsersWithMostOrders
1 4
2 4
3 1
6 1
I only want to return the users with most orders. In my case that would be user 1 and 2 since both of them have 4 orders. If I use TOP 1 or LIMIT, it will only return the first user. If I use TOP 2, it will only work in this scenario, it will return invalid data when top two users have different count of orders.
Required Result
user UsersWithMostOrders
1 4
2 4
You can use TOP 1 WITH TIES:
SELECT TOP 1 WITH TIES
[user], COUNT(*) AS UsersWithMostOrders
FROM customerOrders
GROUP BY [user]
ORDER BY UsersWithMostOrders DESC;
Results:
> user | UsersWithMostOrders
> ---: | ------------------:
> 1 | 4
> 2 | 4
Option 1
Should work with most versions of SQL.
select *
from (
    select *,
        rank() over(order by numOrders desc) as rrank
    from (
        select `user`, count(*) as numOrders
        from customerOrders
        group by `user`
    ) summed
) ranked
where rrank = 1
Option 2
If your version of SQL supports common table expressions (WITH), here is a much more readable solution that does the same thing:
with summed as (
    select `user`, count(*) as numOrders
    from customerOrders
    group by `user`
),
ranked as (
    select *,
        rank() over(order by numOrders desc) as rrank
    from summed
)
select *
from ranked
where rrank = 1
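As a quick check of the rank-based approach on the question's data, here is a minimal sketch using an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or newer) is an assumption made for illustration; it uses double quotes instead of backticks around the column names, which is needed here because order is a reserved word.

```python
import sqlite3

# In-memory SQLite database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE customerOrders (ID INTEGER, "user" INTEGER, "order" INTEGER)')
conn.executemany(
    "INSERT INTO customerOrders VALUES (?, ?, ?)",
    [
        (1, 1, 2), (2, 1, 3), (3, 1, 1), (4, 2, 1), (5, 1, 5),
        (6, 2, 4), (7, 3, 1), (8, 6, 2), (9, 2, 2), (10, 2, 3),
    ],
)

# Count orders per user, rank the counts, and keep every user tied for rank 1.
query = """
WITH summed AS (
    SELECT "user", COUNT(*) AS numOrders
    FROM customerOrders
    GROUP BY "user"
),
ranked AS (
    SELECT *, RANK() OVER (ORDER BY numOrders DESC) AS rrank
    FROM summed
)
SELECT "user", numOrders
FROM ranked
WHERE rrank = 1
ORDER BY "user"
"""
top_users = conn.execute(query).fetchall()
for row in top_users:
    print(row)
```

RANK gives tied rows the same rank, which is why both users with 4 orders come back without hard-coding how many rows to return.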
You can use a CTE to achieve this:
;WITH CTE AS (
    SELECT [user], COUNT([user]) AS UsersWithMostOrders
    FROM customerOrders
    GROUP BY [user]
)
SELECT M.*
FROM CTE M
INNER JOIN (
    SELECT MAX(UsersWithMostOrders) AS MaximumOrders
    FROM CTE
) S ON M.UsersWithMostOrders = S.MaximumOrders
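The following Oracle query can help:
-- "user" and "order" are reserved words in Oracle, so the column names are quoted
-- (assuming the columns were created with these exact lowercase names)
WITH test_table AS
(
    SELECT "user",
           COUNT("order") AS total_order,
           -- an alias cannot be referenced in the same SELECT list, so rank by the aggregate itself
           DENSE_RANK() OVER (ORDER BY COUNT("order") DESC) AS rank_orders
    FROM customerOrders
    GROUP BY "user"
)
select * from test_table where rank_orders = 1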

Return last amount for each element with same ref_id

I have 2 tables, one is credit and other one is creditdetails.
Creditdetails creates new row every day for each of credit.
ID Amount ref_id date
1 2 1 16.03
2 3 1 17.03
3 4 1 18.03
4 1 2 16.03
5 2 2 17.03
6 0 2 18.03
I want to sum the amount from the row with the latest date for each ref_id, so the output should be 4 + 0 = 4.
You can use ROW_NUMBER to filter on the latest amount per ref_id.
Then SUM it.
SELECT SUM(q.Amount) AS TotalLatestAmount
FROM
(
    SELECT
        cd.ref_id,
        cd.Amount,
        ROW_NUMBER() OVER (PARTITION BY cd.ref_id ORDER BY cd.date DESC) AS rn
    FROM Creditdetails cd
) q
WHERE q.rn = 1;
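Here is a minimal sketch of that query run against the sample rows, assuming an in-memory SQLite database (3.25 or newer) through Python's sqlite3 module. The dates are stored as text exactly as shown in the question; sorting them as text only works here because they all fall in the same month, so a real date type would be safer.

```python
import sqlite3

# In-memory SQLite database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Creditdetails (ID INTEGER, Amount INTEGER, ref_id INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO Creditdetails VALUES (?, ?, ?, ?)",
    [
        (1, 2, 1, "16.03"),
        (2, 3, 1, "17.03"),
        (3, 4, 1, "18.03"),
        (4, 1, 2, "16.03"),
        (5, 2, 2, "17.03"),
        (6, 0, 2, "18.03"),
    ],
)

# rn = 1 is the latest row per ref_id; summing those rows gives the total.
query = """
SELECT SUM(q.Amount) AS TotalLatestAmount
FROM (
    SELECT cd.ref_id,
           cd.Amount,
           ROW_NUMBER() OVER (PARTITION BY cd.ref_id ORDER BY cd.date DESC) AS rn
    FROM Creditdetails cd
) q
WHERE q.rn = 1
"""
total_latest_amount = conn.execute(query).fetchone()[0]
print(total_latest_amount)
```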
With this query:
select ref_id, max(date) maxdate
from creditdetails
group by ref_id
you get all the last dates for each ref_id, so you can join it to the table creditdetails and sum over amount:
select sum(amount) total
from creditdetails c inner join (
    select ref_id, max(date) maxdate
    from creditdetails
    group by ref_id
) g
on g.ref_id = c.ref_id and g.maxdate = c.date
I think you want something like this, assuming every ref_id has a row on the overall latest date:
select sum(amount)
from creditdetails
where date = (select max(date) from creditdetails);
with the understanding that your date column doesn't appear to be in a standard format so I can't tell if it needs to be formatted in the query to work properly.

How to calculate the number of a day in series of consecutive dates?

I have a table
id name created_at
1 name 1 08/01/2017
2 name 2 08/02/2017
3 name 3 08/03/2017
4 name 4 08/05/2017
5 name 5 08/06/2017
6 name 6 08/07/2017
7 name 7 08/10/2017
8 name 8 08/12/2017
I need to add a column that ranks each row within its run of consecutive days (rows created day after day). Rows that don't belong to such a run should get null.
The result should be like below
id name created_at days_on
1 name 1 08/01/2017 1
2 name 2 08/02/2017 2
3 name 3 08/03/2017 3
4 name 4 08/05/2017 1
5 name 5 08/06/2017 2
6 name 6 08/07/2017 3
7 name 7 08/10/2017 null
8 name 8 08/12/2017 null
There are many answers describing typical approaches to similar problems, where you can also find an explanation of the techniques used below.
select
    id, name, created_at,
    case when count(*) over wa > 1 then row_number() over wo end as rank
from (
    select
        id, name, created_at,
        sum(first) over w as part
    from (
        select *, (lag(created_at) over w + 1 is distinct from created_at)::int as first
        from my_table
        window w as (order by id)
    ) s
    window w as (order by id)
) s
window
    wa as (partition by part),
    wo as (partition by part order by id);
This is a variation of the gaps-and-islands problem. Let me show a solution using lag() to define the groups:
1. lag() to get the previous day
2. cumulative sum to get the groups
3. row_number() to assign the final values
This works as:
select id, name, created_at,
       (case when count(*) over (partition by grp) > 1
             then row_number() over (partition by grp order by id)
        end) as days_on
from (select t.*,
             -- "is distinct from" also starts a group on the first row, where prev_ca is null
             sum((prev_ca is distinct from created_at - interval '1 day')::int)
                 over (order by id) as grp
      from (select t.*,
                   lag(created_at) over (order by id) as prev_ca
            from t
           ) t
     ) t;
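The three steps (lag, cumulative sum, row_number) can be tried end to end with the sketch below. It is an adaptation, not the PostgreSQL query above: it assumes SQLite (3.25 or newer) through Python's sqlite3 module, stores the dates as ISO text, uses date(created_at, '-1 day') for the date arithmetic, and uses SQLite's null-safe IS NOT in place of IS DISTINCT FROM.

```python
import sqlite3

# In-memory SQLite database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "name 1", "2017-08-01"),
        (2, "name 2", "2017-08-02"),
        (3, "name 3", "2017-08-03"),
        (4, "name 4", "2017-08-05"),
        (5, "name 5", "2017-08-06"),
        (6, "name 6", "2017-08-07"),
        (7, "name 7", "2017-08-10"),
        (8, "name 8", "2017-08-12"),
    ],
)

query = """
WITH lagged AS (
    -- step 1: previous row's date
    SELECT id, name, created_at,
           LAG(created_at) OVER (ORDER BY id) AS prev_ca
    FROM my_table
),
grouped AS (
    -- step 2: a row starts a new group when the previous date is not the day before
    SELECT id, name, created_at,
           SUM(prev_ca IS NOT date(created_at, '-1 day')) OVER (ORDER BY id) AS grp
    FROM lagged
)
-- step 3: number rows inside groups that have more than one row
SELECT id, name, created_at,
       CASE WHEN COUNT(*) OVER (PARTITION BY grp) > 1
            THEN ROW_NUMBER() OVER (PARTITION BY grp ORDER BY id)
       END AS days_on
FROM grouped
ORDER BY id
"""
days_on_rows = conn.execute(query).fetchall()
for row in days_on_rows:
    print(row)
```

Single-row groups fall through the CASE expression and get NULL, which matches the null values requested for ids 7 and 8.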

SQL incremental id for every user_id

I have data:
user_id user_login_date
1 2013.07.05
1 2013.07.15
1 2013.07.16
1 2013.07.17
2 2013.07.05
2 2013.07.05
2 2013.07.15
And I want to make a virtual table that would look like this:
user_id user_login_date date_id
1 2013.07.05 1
1 2013.07.15 2
1 2013.07.16 3
1 2013.07.17 4
2 2013.07.05 1
2 2013.07.05 2
2 2013.07.15 3
How do I do that?
I tried:
WITH user_count
AS (
SELECT user_id, user_login_date
FROM users
)
SELECT user_count.user_id, user_count.user_login_date, COUNT(user_count.user_id)
FROM users, user_count
WHERE users.user_login_date >= user_count.user_login_date
AND users.user_id = user_count.user_id
GROUP BY user_count.user_id, user_count.user_login_date
ORDER BY user_count.user_id, user_count.user_login_date;
But the result isn't what I want.
select
    user_id, user_login_date,
    row_number() over(
        partition by user_id
        order by user_login_date
    ) as date_id
from users
order by user_id, date_id
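For a quick check on the sample data, here is a minimal sketch that runs the same query in an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or newer) is an assumption made for illustration; the dates are kept as text in the question's yyyy.mm.dd format, which sorts correctly as text.

```python
import sqlite3

# In-memory SQLite database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, user_login_date TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        (1, "2013.07.05"),
        (1, "2013.07.15"),
        (1, "2013.07.16"),
        (1, "2013.07.17"),
        (2, "2013.07.05"),
        (2, "2013.07.05"),
        (2, "2013.07.15"),
    ],
)

# The counter restarts at 1 for each user_id and follows the login date order.
query = """
SELECT user_id, user_login_date,
       ROW_NUMBER() OVER (
           PARTITION BY user_id
           ORDER BY user_login_date
       ) AS date_id
FROM users
ORDER BY user_id, date_id
"""
numbered = conn.execute(query).fetchall()
for row in numbered:
    print(row)
```

Unlike RANK, ROW_NUMBER never repeats a value, so the two logins of user 2 on 2013.07.05 get date_id 1 and 2.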
select row_number() over (partition by user_id order by user_login_date) as date_id
     , yt.*
from YourTable yt