I am trying to rank my sales data using the RANK() OVER function. Here is my code:
Select
Category as CAT
,units*cost as COST_SALES
,units*retail as RETAIL_COST
,units as UNITS_SOLD
,RANK() OVER (PARTITION BY 1 ORDER BY 3 DESC ) AS RANKING
from Table
Where date between current_date-7 and current_date
group by 1
When I get my result it is unordered and shows rank 1 for all the categories.
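You can't use positional column references (such as 1 or 3) inside a window function; there they are read as constant values rather than column positions, so every row ties. You need to name the columns or expressions explicitly: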
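Select Category as CAT, sum(units*cost) as COST_SALES, sum(units*retail) as RETAIL_COST,
sum(units) as UNITS_SOLD,
-- No PARTITION BY here: after GROUP BY each category is a single row,
-- so partitioning by Category would again rank every row 1.
RANK() OVER (ORDER BY sum(units*retail) DESC) AS RANKING
from Table
Where date between current_date-7 and current_date
group by Category;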
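To see the difference between a positional reference and a named expression inside OVER(), here is a minimal sketch using Python's built-in sqlite3 module. This is an assumption for illustration only: the question does not say which database it runs on, window functions need SQLite 3.25 or later, and the table name sales and its rows are made up.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, units INTEGER, retail REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("toys", 2, 10.0), ("toys", 1, 10.0), ("books", 5, 4.0), ("games", 1, 60.0)],
)

# Aggregate per category first, then rank the aggregated rows.
totals = "SELECT category, SUM(units * retail) AS retail_sales FROM sales GROUP BY category"

# Inside OVER(), "ORDER BY 3" is the constant 3, not the third column,
# so all rows are peers of each other.
broken = conn.execute(
    "SELECT category, RANK() OVER (PARTITION BY 1 ORDER BY 3 DESC) AS ranking "
    f"FROM ({totals}) AS t ORDER BY category"
).fetchall()

# Naming the expression gives the window a real ordering.
fixed = conn.execute(
    "SELECT category, retail_sales, RANK() OVER (ORDER BY retail_sales DESC) AS ranking "
    f"FROM ({totals}) AS t ORDER BY ranking"
).fetchall()

print(broken)
print(fixed)
```

In the first result every category gets the same rank; in the second the categories are ranked by their summed retail value.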
The output should be the item (or items) with the highest number of sales on each date.
This is bigquery table:
item,date
apple,1-1-2020
apple,1-1-2020
pear,1-1-2020
pear,1-1-2020
pear,1-2-2020
pear,1-2-2020
pear,1-2-2020
orange,1-2-2020
Expected output:
item,date
apple,1-1-2020
pear,1-1-2020
pear,1-2-2020
Consider the approach below:
select item, date, count(1) sales
from `project.dataset.table`
group by item, date
qualify rank() over(partition by date order by sales desc) = 1
Applied to the sample data in your question, this returns apple and pear (2 sales each) for 1-1-2020 and pear (3 sales) for 1-2-2020.
If for some reason you don't want the sales column in your output, use the query below:
select item, date
from `project.dataset.table`
group by item, date
qualify rank() over(partition by date order by count(1) desc) = 1
Applied to the sample data in your question, the output matches your expected output.
The following query should do it:
SELECT
item,
sale_date,
FROM (
SELECT
sample.*,
COUNT(item) AS item_count
FROM
sample
GROUP BY
sample.sale_date,
item )
# Here you need to use a WHERE (or HAVING, or GROUP BY) in order to be able to use QUALIFY
WHERE sale_date IS NOT NULL
QUALIFY RANK() OVER(PARTITION BY sale_date ORDER BY item_count DESC) = 1
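The ranking logic can be checked on the question's sample rows outside BigQuery. The sketch below uses Python's built-in sqlite3 module (an assumption; SQLite 3.25 or later is needed for window functions). SQLite has no QUALIFY clause, so the rank is computed in a subquery and filtered in the outer query instead; the table name sample and the column name sale_date follow the last answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (item TEXT, sale_date TEXT)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [
        ("apple", "1-1-2020"), ("apple", "1-1-2020"),
        ("pear", "1-1-2020"), ("pear", "1-1-2020"),
        ("pear", "1-2-2020"), ("pear", "1-2-2020"), ("pear", "1-2-2020"),
        ("orange", "1-2-2020"),
    ],
)

# Count sales per item and date, rank the counts within each date,
# then keep only the top-ranked rows (ties included).
top_items = conn.execute("""
    SELECT item, sale_date
    FROM (
        SELECT item, sale_date,
               RANK() OVER (PARTITION BY sale_date ORDER BY item_count DESC) AS rnk
        FROM (
            SELECT item, sale_date, COUNT(*) AS item_count
            FROM sample
            GROUP BY item, sale_date
        ) AS counts
    ) AS ranked
    WHERE rnk = 1
    ORDER BY sale_date, item
""").fetchall()

print(top_items)
```

Because RANK() gives tied rows the same rank, both apple and pear are kept for the first date.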
I have a table Tabl1 : id, name, country, year, medal.
How can I find the top 10 countries by number of medals for each year in one query?
thanks:)
You haven't told us anything about your table schema or the data, so this is a guess!
Going to assume your medal column contains the qty of medals for each Id/name, so you just need to rank by the sum of medals. Something along the lines of:
select [year], country, [Rank] from (
select [year], country, Rank() over(partition by [year] order by Sum(medal) desc ) [Rank]
from Tabl1
group by [year],country
)x
where [Rank]<=10
order by [year], [Rank]
Here is how you can get the top 10 countries in each year:
select * from
(
select country, year, count(*) as medals,
row_number() over (partition by year order by count(*) desc) as rn
from Tabl1
group by country, year
) tt
where tt.rn < 11
The subquery groups the data by country and year and gives you the count(*) of each group. At the same time, it sorts the groups within each year by count(*) desc and assigns each one a row number (this is done by the row_number() window function), so the country with the most medals in each year comes first and gets row number 1. You need the top 10, so you filter on tt.rn < 11 in the main query.
If you want 10 countries per year:
with data as (
select country, "year" as yr,
rank() over (partition by "year" order by count(*) desc) as rnk
from T
group by country, "year"
)
select yr as "year", country from data
where rnk <= 10
order by yr, rnk;
Note that if ties are possible this could return more than ten rows for any given year.
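The effect of ties can be seen in a small sketch using Python's built-in sqlite3 module (an assumption for illustration; SQLite 3.25 or later is required). The table medals, the column yr, the countries A to D, and the cutoff of 2 instead of 10 are all made up to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (country TEXT, yr INTEGER)")
# One row per medal: A has 3, B and C have 2 each, D has 1.
conn.executemany(
    "INSERT INTO medals VALUES (?, ?)",
    [("A", 2020)] * 3 + [("B", 2020)] * 2 + [("C", 2020)] * 2 + [("D", 2020)],
)

counts = "SELECT country, yr, COUNT(*) AS medal_count FROM medals GROUP BY country, yr"

# rank(): tied countries share a rank, so the cutoff can let extra rows through.
with_rank = [row[0] for row in conn.execute(f"""
    SELECT country FROM (
        SELECT country,
               RANK() OVER (PARTITION BY yr ORDER BY medal_count DESC) AS rnk
        FROM ({counts}) AS c
    ) AS ranked
    WHERE rnk <= 2
    ORDER BY rnk, country
""")]

# row_number(): exactly two rows, with country as an explicit tie-breaker.
with_row_number = [row[0] for row in conn.execute(f"""
    SELECT country FROM (
        SELECT country,
               ROW_NUMBER() OVER (PARTITION BY yr ORDER BY medal_count DESC, country) AS rn
        FROM ({counts}) AS c
    ) AS numbered
    WHERE rn <= 2
    ORDER BY rn
""")]

print(with_rank)
print(with_row_number)
```

Which behaviour is right depends on whether tied countries should all be reported or the list must be capped at a fixed size.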
I've a table that has this information:
And need to get the following information:
If the country of the same person name (in this case Artur) is different, then I need to sum the two values of quantity from the max date (in this case 04/10) and return both person (Artur) and the qty (15k)
If the country of the same person name (in this case Joseph) is the same, then I need only the first row of the max date available.
I'm really struggling, as I'm not sure how to implement this logic in my code:
Select
table.person,
table.quantity
From
(
Select
table.date,
table.person,
table.country,
table.quantity,
ROW_NUMBER () over (
PARTITION by table.code, table.person
ORDER by table.date DESC
) AS rn
FROM
table
WHERE table.date >= DATE '{2020-04-10}' -5
) a
WHERE a.RN IN (1,2)
Is it possible to create a rule to sum rows 1 and 2 when country is different (Artur case) and only return row number 1 when the country is the same for a name (Joseph case)?
Use dense_rank() or max() as a window function:
select person, sum(quantity)
from (select t.*,
max(date) over (partition by person) as max_date
from t
) t
where date = max_date
group by person;
EDIT:
Hmmm . . . I think you might want one row per country per person on the max date. If so:
select person, sum(quantity)
from (select t.*,
row_number() over (partition by person, country order by date desc) as seqnum_pc,
rank() over (partition by person order by date desc) as seqnum_p
from t
) t
where seqnum_p = 1 and seqnum_pc = 1
group by person;
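As a rough check of the second query, here is a sketch using Python's built-in sqlite3 module (an assumption; SQLite 3.25 or later is required). The question's table was not included in the text, so the rows below are invented to match its description: Artur appears in two countries on the latest date, Joseph in one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, person TEXT, country TEXT, quantity INTEGER)")
# Hypothetical rows; country codes and quantities are made up.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("2020-04-10", "Artur", "PT", 10000),
        ("2020-04-10", "Artur", "ES", 5000),
        ("2020-04-09", "Artur", "PT", 1000),
        ("2020-04-10", "Joseph", "UK", 7000),
        ("2020-04-09", "Joseph", "UK", 3000),
    ],
)

# seqnum_pc = 1: latest row per person and country.
# seqnum_p  = 1: rows that fall on the person's latest date.
totals = conn.execute("""
    SELECT person, SUM(quantity)
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY person, country ORDER BY date DESC) AS seqnum_pc,
                 RANK() OVER (PARTITION BY person ORDER BY date DESC) AS seqnum_p
          FROM t
         ) AS t
    WHERE seqnum_p = 1 AND seqnum_pc = 1
    GROUP BY person
    ORDER BY person
""").fetchall()

print(totals)
```

With these rows Artur's two countries on the latest date are summed, while Joseph contributes only his latest row.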
I am using SQL Server and I have a table "a"
month segment_id price
-----------------------------
1 1 100
1 2 200
2 3 50
2 4 80
3 5 10
I want to make a query which presents the original columns where the price will be the max per month
The result should be:
month segment_id price
----------------------------
1 2 200
2 4 80
3 5 10
I tried to write SQL code:
Select
month, segment_id, max(price) as MaxPrice
from
a
but I got an error:
Column segment_id is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
I tried to fix it in many ways but didn't find how to fix it
That is because you need a GROUP BY clause, and one that does not include segment_id:
Select month, max(price) as MaxPrice
from a
Group By month
since you want one result per month, and segment_id is not aggregated in your original select statement.
If you want segment_id on every row together with the maximum price of its month, use max() as a window (analytic) function, without a GROUP BY clause:
Select month, segment_id,
max(price) over ( partition by month ) as MaxPrice
from a
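One detail worth checking with a windowed max(): if the OVER clause contains an ORDER BY, the default window frame ends at the current row, so the result is a running maximum rather than the maximum of the whole partition. A minimal sketch using Python's built-in sqlite3 module (an assumption, since the question uses SQL Server; SQLite 3.25 or later is required) with the question's sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (month INTEGER, segment_id INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 2, 200), (2, 3, 50), (2, 4, 80), (3, 5, 10)],
)

# With ORDER BY in the window: maximum of the rows up to the current one.
running_max = conn.execute("""
    SELECT month, segment_id,
           MAX(price) OVER (PARTITION BY month ORDER BY segment_id) AS max_price
    FROM a
    ORDER BY month, segment_id
""").fetchall()

# Without ORDER BY: maximum over the whole month on every row.
partition_max = conn.execute("""
    SELECT month, segment_id,
           MAX(price) OVER (PARTITION BY month) AS max_price
    FROM a
    ORDER BY month, segment_id
""").fetchall()

print(running_max)
print(partition_max)
```

Only the second form repeats the monthly maximum on every row of the month.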
Edit (following your latest edit to the desired results): you also need the row_number() window function, as @Gordon already mentioned:
Select month, segment_id, price From
(
Select a.*,
row_number() over ( partition by month order by price desc ) as Rn
from a
) q
Where rn = 1
I would recommend a correlated subquery:
select t.*
from t
where t.price = (select max(t2.price) from t t2 where t2.month = t.month);
The "canonical" solution is to use row_number():
select t.*
from (select t.*,
row_number() over (partition by month order by price desc) as seqnum
from t
) t
where seqnum = 1;
With the right indexes, the correlated subquery often performs better.
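The two queries are not quite equivalent when a month has tied maximum prices: the correlated subquery returns every tied row, while row_number() keeps exactly one. A small sketch using Python's built-in sqlite3 module (an assumption, since the question uses SQL Server; SQLite 3.25 or later is required), with made-up rows that contain a tie and segment_id added as a tie-breaker so the result is deterministic:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (month INTEGER, segment_id INTEGER, price INTEGER)")
# Hypothetical rows: month 1 has two segments tied at the maximum price.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 200), (1, 2, 200), (2, 3, 50)],
)

# Correlated subquery: keeps every row whose price equals the monthly maximum.
via_subquery = conn.execute("""
    SELECT t.month, t.segment_id, t.price
    FROM t
    WHERE t.price = (SELECT MAX(t2.price) FROM t AS t2 WHERE t2.month = t.month)
    ORDER BY t.month, t.segment_id
""").fetchall()

# row_number(): exactly one row per month, ties broken by segment_id.
via_row_number = conn.execute("""
    SELECT month, segment_id, price
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY month ORDER BY price DESC, segment_id) AS seqnum
          FROM t
         ) AS ranked
    WHERE seqnum = 1
    ORDER BY month
""").fetchall()

print(via_subquery)
print(via_row_number)
```

If ties should all be returned from the window-function version, rank() can be used in place of row_number().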
Only because it was not mentioned.
Yet another option is the WITH TIES clause.
To be clear, the approach by Gordon and Barbaros would be a nudge more performant, but this technique does not require or generate an extra column.
Select Top 1 with ties *
From YourTable
Order By row_number() over (partition by month order by price desc)
With not exists:
select t.*
from tablename t
where not exists (
select 1 from tablename
where month = t.month and price > t.price
)
or:
select t.*
from tablename t inner join (
select month, max(price) as price
from tablename
group By month
) g on g.month = t.month and g.price = t.price
I am using #standardSQL in BigQuery and trying to code the maximum ranking of each customer_id as 1 and all the other rows as 0.
This is the query result so far
The query for ranking is this
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY booking_date Asc) as ranking
What I need is another column that codes the maximum ranking of each customer_id as 1 and every ranking below it as 0, just like the table below.
Thanks
Based on your sample data, your ranking is unstable, because you have multiple rows with the same key values. In any case, you can still do what you want without subqueries, just using case:
select t.*,
row_number() over (partition by customer_id order by booking_date asc) as ranking,
(case when row_number() over (partition by customer_id order by booking_date asc) =
count(*) over (partition by customer_id)
then 1 else 0
end) as custom_coded
from t;
A more traditional way of doing essentially the same thing would be to use a descending sort:
select t.*,
row_number() over (partition by customer_id order by booking_date asc) as ranking,
(case when row_number() over (partition by customer_id order by booking_date desc) = 1
then 1 else 0
end) as custom_coded
from t;
We can wrap your current query, and then use MAX as an analytic function with a partition by customer to compare each ranking value against the max ranking for each customer. When the ranking value equals the maximum value for a customer, then we assign 1 for the custom_coded, otherwise we assign 0.
SELECT
customer_id, item_bought, booking_date, ranking,
CASE WHEN ranking = MAX(ranking) OVER (PARTITION BY customer_id)
THEN 1 ELSE 0 END AS custom_coded
FROM
(
SELECT customer_id, item_bought, booking_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY booking_date) ranking
FROM yourTable
) t;
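The descending-sort variant can be tried out with Python's built-in sqlite3 module (an assumption, since the question uses BigQuery; SQLite 3.25 or later is required). The table name bookings and its rows are made up, and the booking dates are distinct per customer so the ranking is stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (customer_id INTEGER, booking_date TEXT)")
# Hypothetical rows with distinct dates per customer.
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?)",
    [(1, "2021-01-01"), (1, "2021-01-05"), (1, "2021-02-01"), (2, "2021-03-01")],
)

# ranking counts up by date; the row that would be first in descending
# order is the one with the maximum ranking, so it is flagged with 1.
flagged = conn.execute("""
    SELECT customer_id, booking_date,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY booking_date ASC) AS ranking,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY booking_date DESC) = 1
                THEN 1 ELSE 0
           END AS custom_coded
    FROM bookings
    ORDER BY customer_id, ranking
""").fetchall()

for row in flagged:
    print(row)
```

If several rows of one customer share the same booking_date, the two row_number() calls may disagree on which row is last, so a unique tie-breaker column in both ORDER BY clauses would be needed.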