SQL Getting row number only when the value is different from all previous values - sql

I want the count to increase by one only when the value has not been shown before. The base table is:
rownum product
1 coke
2 coke
3 burger
4 burger
5 chocolate
6 apple
7 coke
8 burger
The goal is:
rownum product
1 coke
1 coke
2 burger
2 burger
3 chocolate
4 apple
4 coke
4 burger
I am thinking of comparing the current row with all previous rows, but I am having difficulty referencing all previous rows. Thank you!

This is a gaps-and-islands problem. Here is one approach using window functions: the idea is to use a window sum that increments every time the first occurrence of a product is seen:
select t.*,
    sum(case when rn = 1 then 1 else 0 end) over(order by rownum) new_rownum
from (
    select t.*, row_number() over(partition by product order by rownum) rn
    from mytable t
) t
order by rownum
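As a quick sanity check, the query above can be run against the sample data. The sketch below is only an illustration: it assumes an in-memory SQLite database (3.25 or later, for window function support) driven from Python, and the variable name first_seen_counts is made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for whatever database the question uses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (rownum INTEGER, product TEXT)")
products = ["coke", "coke", "burger", "burger", "chocolate", "apple", "coke", "burger"]
conn.executemany("INSERT INTO mytable VALUES (?, ?)", list(enumerate(products, start=1)))

query = """
SELECT t.rownum, t.product,
       -- running count of rows that are the first occurrence of their product
       SUM(CASE WHEN rn = 1 THEN 1 ELSE 0 END) OVER (ORDER BY rownum) AS new_rownum
FROM (
    SELECT m.*, ROW_NUMBER() OVER (PARTITION BY product ORDER BY rownum) AS rn
    FROM mytable m
) t
ORDER BY rownum
"""
first_seen_counts = [row[2] for row in conn.execute(query)]
print(first_seen_counts)  # [1, 1, 2, 2, 3, 4, 4, 4]
```

The printed list matches the goal table in the question row for row.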

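There are several different ways to accomplish this; pick the one you like best. This one finds the first row number per product, then collapses the holes by applying dense_rank() to that initial grouping. Note that it numbers each product by its first appearance, so a product that reappears later (rows 7 and 8 in the sample) gets its original number again rather than the running count of 4 shown in the question.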
with data as (
    select *, min(rownum) over (partition by product) as minrow
    from T
)
select dense_rank() over (order by minrow) as rownum, product
from data
order by rownum, data.rownum;
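To see which label each original row receives under the dense_rank() approach, here is a small sketch on the sample data. It assumes an in-memory SQLite database (3.25+) run from Python; the aliases orig_rownum and new_rownum are introduced only for this illustration so the result can be listed in the original row order.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (rownum INTEGER, product TEXT)")
products = ["coke", "coke", "burger", "burger", "chocolate", "apple", "coke", "burger"]
conn.executemany("INSERT INTO T VALUES (?, ?)", list(enumerate(products, start=1)))

query = """
WITH data AS (
    SELECT rownum AS orig_rownum, product,
           MIN(rownum) OVER (PARTITION BY product) AS minrow  -- first row per product
    FROM T
)
SELECT orig_rownum, product,
       DENSE_RANK() OVER (ORDER BY minrow) AS new_rownum      -- collapse the holes
FROM data
ORDER BY orig_rownum
"""
dense_labels = [row[2] for row in conn.execute(query)]
print(dense_labels)  # [1, 1, 2, 2, 3, 4, 1, 2]
```

Rows 7 and 8 come out as 1 and 2 here, so this variant labels products by order of first appearance rather than producing a running count.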

Related

How to get the values for every group of the top 3 types

I've got this table ratings:
id user_id type value
0 0 Rest 4
1 0 Bar 3
2 0 Cine 2
3 0 Cafe 1
4 1 Rest 4
5 1 Bar 3
6 1 Cine 2
7 1 Cafe 5
8 2 Rest 4
9 2 Bar 3
10 3 Cine 2
11 3 Cafe 5
I want a table with one row for every (user_id, type) pair, restricted to the top 3 rated types across all users (ranked by sum(value) over the whole table).
Desired result:
user_id type value
0 Rest 4
0 Cafe 1
0 Bar 3
1 Rest 4
1 Cafe 5
1 Bar 3
2 Rest 4
3 Cafe 5
2 Bar 3
I was able to do this with two queries, one to get the top 3 and then another to get the rows where the type matches the top 3 types.
Does someone know how to fit this into a single query?
Get rows per user for the 3 highest ranking types, where types are ranked by the total sum of their value across the whole table.
So it's not exactly about the top 3 types per user, but about the top 3 types overall. Not all users will have rows for the top 3 types, even if there would be 3 or more types for the user.
Strategy:
1. Aggregate to get summed values per type (type_rnk).
2. Take only the top 3. (Break ties ...)
3. Join back to main table, eliminating any other types.
4. Order result by user_id, type_rnk DESC
SELECT r.user_id, r.type, r.value
FROM   ratings r
JOIN  (
   SELECT type, sum(value) AS type_rnk
   FROM   ratings
   GROUP  BY 1
   ORDER  BY type_rnk DESC, type  -- tiebreaker
   LIMIT  3                       -- strictly the top 3
   ) v USING (type)
ORDER  BY user_id, type_rnk DESC;
Since multiple types can have the same ranking, I added type to the sort order to break ties alphabetically by their name (as you did not specify otherwise).
It turns out we don't need window functions (the ones with OVER and, optionally, PARTITION BY) for this, since you asked in a comment.
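The aggregate-then-join strategy can be checked against the sample ratings table. This is a minimal sketch, assuming an in-memory SQLite database run from Python rather than the asker's actual database; it spells out GROUP BY type instead of GROUP BY 1 purely for readability, and the variable name top3_rows is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (id INTEGER, user_id INTEGER, type TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?, ?)",
    [
        (0, 0, "Rest", 4), (1, 0, "Bar", 3), (2, 0, "Cine", 2), (3, 0, "Cafe", 1),
        (4, 1, "Rest", 4), (5, 1, "Bar", 3), (6, 1, "Cine", 2), (7, 1, "Cafe", 5),
        (8, 2, "Rest", 4), (9, 2, "Bar", 3), (10, 3, "Cine", 2), (11, 3, "Cafe", 5),
    ],
)

query = """
SELECT r.user_id, r.type, r.value
FROM ratings r
JOIN (
    SELECT type, SUM(value) AS type_rnk   -- total per type across all users
    FROM ratings
    GROUP BY type
    ORDER BY type_rnk DESC, type          -- alphabetical tiebreaker
    LIMIT 3                               -- strictly the top 3 types
) v USING (type)
ORDER BY r.user_id, v.type_rnk DESC
"""
top3_rows = conn.execute(query).fetchall()
for row in top3_rows:
    print(row)
```

On this data the totals are Rest 12, Cafe 11, Bar 9 and Cine 6, so the Cine rows drop out and nine rows remain: three each for users 0 and 1, two for user 2 and one for user 3.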
I think you just want row_number(). Based on your results, you seem to want three rows per type, with the highest value:
select t.*
from (select t.*,
             row_number() over (partition by type order by value desc) as seqnum
      from t
     ) t
where seqnum <= 3;
Your description suggests that you might just want this per user, which is a slight tweak:
select t.*
from (select t.*,
             row_number() over (partition by user_id order by value desc) as seqnum
      from t
     ) t
where seqnum <= 3;

How to COUNT in a specific column after GROUP BY

I'm stuck with how to write SQL statements, so I would appreciate it if you could teach me.
Current status
items table
id session_id item_id competition_id
1 1 2 1
2 1 3 1
2 1 2 1
2 1 2 1
2 1 5 2
3 1 7 2
4 1 4 2
5 1 5 2
What I want to do
Group by competition_id, count how many times each item_id occurs, and extract the most common item_id together with its count.
For example:
If competition_id is 1, item_id → 2, and the count is 3
If competition_id is 2, item_id → 5, and the count is 2
If competition_id is 3, ・・・
If competition_id is 4, ・・・
environment
macOS BigSur
ruby 2.7.0
Rails 6.1.1
sqlite
In statistics, what you are asking for is the mode, the most common value.
You can use aggregation and row_number():
select ci.*
from (select competition_id, item_id, count(*) as cnt,
             row_number() over (partition by competition_id order by count(*) desc) as seqnum
      from t
      group by competition_id, item_id
     ) ci
where seqnum = 1;
In the event that there are ties, this returns only one of the values, arbitrarily. If you want all modes when there are ties use rank() instead of row_number().
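The rank() variant mentioned above can be sketched as follows. This is an illustration only: it assumes SQLite 3.25+ (the asker's stated database) driven from Python, loads just the two columns the query needs, and computes the counts in a CTE first, which is a restructuring of the answer's single-level query rather than the original form. The names counts, ranked and modes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item_id INTEGER, competition_id INTEGER)")
# (item_id, competition_id) pairs taken from the question's items table
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(2, 1), (3, 1), (2, 1), (2, 1), (5, 2), (7, 2), (4, 2), (5, 2)],
)

query = """
WITH counts AS (
    SELECT competition_id, item_id, COUNT(*) AS cnt
    FROM items
    GROUP BY competition_id, item_id
)
SELECT competition_id, item_id, cnt
FROM (
    SELECT c.*,
           -- RANK() keeps every item tied for the highest count
           RANK() OVER (PARTITION BY competition_id ORDER BY cnt DESC) AS seqnum
    FROM counts c
) ranked
WHERE seqnum = 1
ORDER BY competition_id, item_id
"""
modes = conn.execute(query).fetchall()
print(modes)  # [(1, 2, 3), (2, 5, 2)]
```

The sample data has no ties, so rank() and row_number() give the same rows here; they differ only when two items share the top count within a competition.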

SQL Getting row number when the value is different from previous one, no matter whether the value shows before

Hi, I am having difficulty creating a count. The base table is:
rownum product
1 coke
2 coke
3 burger
4 burger
5 chocolate
6 apple
7 coke
8 burger
I want a result like the one below: whenever the product is different from the previous one, the count increases by one. I tried the dense_rank() and rank() functions, but they do not give what I want. Thank you.
rownum product
1 coke
1 coke
2 burger
2 burger
3 chocolate
4 apple
5 coke
6 burger
Use lag() to see when the value changes and then a cumulative sum:
select t.*,
       sum(case when prev_product = product then 0 else 1 end) over (order by rownum) as new_rownum
from (select t.*, lag(product) over (order by rownum) as prev_product
      from base t
     ) t
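Here is a minimal sketch of the lag() plus cumulative sum idea on the sample data, assuming an in-memory SQLite database (3.25+) run from Python. On the first row prev_product is NULL, so the comparison is not true and the CASE falls through to 1, which starts the count. The variable name change_counts is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base (rownum INTEGER, product TEXT)")
products = ["coke", "coke", "burger", "burger", "chocolate", "apple", "coke", "burger"]
conn.executemany("INSERT INTO base VALUES (?, ?)", list(enumerate(products, start=1)))

query = """
SELECT t.rownum, t.product,
       -- add 1 whenever the product differs from the previous row's product
       SUM(CASE WHEN prev_product = product THEN 0 ELSE 1 END)
           OVER (ORDER BY rownum) AS new_rownum
FROM (
    SELECT b.*, LAG(product) OVER (ORDER BY rownum) AS prev_product
    FROM base b
) t
ORDER BY rownum
"""
change_counts = [row[2] for row in conn.execute(query)]
print(change_counts)  # [1, 1, 2, 2, 3, 4, 5, 6]
```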

Calculate "position in run" in SQL

I have a table of consecutive ids (integers, 1 ... n), and values (integers), like this:
Input Table:
id value
-- -----
1 1
2 1
3 2
4 3
5 1
6 1
7 1
Going down the table i.e. in order of increasing id, I want to count how many times in a row the same value has been seen consecutively, i.e. the position in a run:
Output Table:
id value position in run
-- ----- ---------------
1 1 1
2 1 2
3 2 1
4 3 1
5 1 1
6 1 2
7 1 3
Any ideas? I've searched for a combination of windowing functions including lead and lag, but can't come up with it. Note that the same value can appear in the value column as part of different runs, so partitioning by value may not help solve this. I'm on Hive 1.2.
One way is to use a difference of row numbers approach to classify consecutive same values into one group. Then a row number function to get the desired positions in each group.
Query to assign groups (Running this will help you understand how the groups are assigned.)
select t.*
      ,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t
Final Query using row_number to get positions in each group assigned with the above query.
select id, value, row_number() over(partition by value, rnum_diff order by id) as pos_in_grp
from (select t.*
            ,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
      from tbl t
     ) t
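The difference-of-row-numbers approach can be tried on the sample input. The sketch below assumes SQLite 3.25+ run from Python as a stand-in for Hive 1.2, so it only checks the logic, not Hive-specific behavior. The variable name positions is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    list(enumerate([1, 1, 2, 3, 1, 1, 1], start=1)),
)

query = """
SELECT id, value,
       ROW_NUMBER() OVER (PARTITION BY value, rnum_diff ORDER BY id) AS pos_in_grp
FROM (
    SELECT t.*,
           -- constant within each run of consecutive equal values
           ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY value ORDER BY id) AS rnum_diff
    FROM tbl t
) t
ORDER BY id
"""
positions = [row[2] for row in conn.execute(query)]
print(positions)  # [1, 2, 1, 1, 1, 2, 3]
```

For this input rnum_diff is 0, 0, 2, 3, 2, 2, 2 for ids 1 to 7. Ids 3 and 5 to 7 share the difference 2 but have different values, which is why the final partition uses both value and rnum_diff.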

Keep minimum value of field A and corresponding value of field B in SQL server

Say I have the following table:
Category rank score
a 3 100
a 1 105
a 2 110
b 2 102
b 7 107
b 3 95
I would like to know both the most efficient and the most visually elegant way of getting the lines having the minimum rank for each category.
In my example the result would be
Category rank score
a 1 105
b 2 102
The solutions I came up with seem inefficient and ugly for something that seems quite straightforward.
A typical solution is to use row_number():
select category, rank, score
from (select t.*,
             row_number() over (partition by category order by rank) as seqnum
      from t
     ) t
where seqnum = 1;
Whether or not you think this is elegant is a matter of opinion.
The solution below uses a CTE:
with cte as
(
    select category, rank, score,
           ROW_NUMBER() OVER(PARTITION BY category ORDER BY rank) AS row_num
    from t
)
select category, rank, score
from cte
where row_num = 1
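Both answers rely on the same row_number() filter, which can be checked on the sample table. This sketch assumes an in-memory SQLite database (3.25+) run from Python instead of SQL Server, and it quotes the column name "rank" as a precaution since it is also a function name. The variable name min_rank_rows is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (category TEXT, "rank" INTEGER, score INTEGER)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", 3, 100), ("a", 1, 105), ("a", 2, 110),
     ("b", 2, 102), ("b", 7, 107), ("b", 3, 95)],
)

query = """
WITH cte AS (
    SELECT category, "rank", score,
           -- 1 for the row with the smallest rank in each category
           ROW_NUMBER() OVER (PARTITION BY category ORDER BY "rank") AS row_num
    FROM t
)
SELECT category, "rank", score
FROM cte
WHERE row_num = 1
ORDER BY category
"""
min_rank_rows = conn.execute(query).fetchall()
print(min_rank_rows)  # [('a', 1, 105), ('b', 2, 102)]
```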