How to write a BigQuery query for the table below - google-bigquery

The output should be the item(s) with the highest number of sales on each date.
This is the BigQuery table:
item,date
apple,1-1-2020
apple,1-1-2020
pear,1-1-2020
pear,1-1-2020
pear,1-2-2020
pear,1-2-2020
pear,1-2-2020
orange,1-2-2020
Expected output:
item,date
apple,1-1-2020
pear,1-1-2020
pear,1-2-2020

Consider the approach below:
select item, date, count(1) sales
from `project.dataset.table`
group by item, date
qualify rank() over(partition by date order by sales desc) = 1
When applied to the sample data in your question, this returns apple and pear (2 sales each) for 1-1-2020 and pear (3 sales) for 1-2-2020.
If for some reason you don't want the sales column in your output, use the query below:
select item, date
from `project.dataset.table`
group by item, date
qualify rank() over(partition by date order by count(1) desc) = 1
Applied to the sample data in your question, the output matches the expected output above.
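For readers without a BigQuery project at hand, the same count-then-rank idea can be tried locally. This is only a sketch under assumptions: it uses Python's sqlite3 module (SQLite 3.25 or newer for window functions), SQLite has no QUALIFY clause, so the rank is filtered in an outer query instead, and the table name sales_log, the column name sale_date, and the helper top_items_per_date are made up for the illustration.

```python
import sqlite3


def top_items_per_date(rows):
    """Return (item, sale_date, sales) for the best-selling item(s) per date."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales_log (item TEXT, sale_date TEXT)")
    con.executemany("INSERT INTO sales_log VALUES (?, ?)", rows)
    query = """
        WITH counts AS (
            -- one row per item and date with its number of sales
            SELECT item, sale_date, COUNT(*) AS sales
            FROM sales_log
            GROUP BY item, sale_date
        ),
        ranked AS (
            -- RANK keeps ties, so two items can both be rank 1 on a date
            SELECT item, sale_date, sales,
                   RANK() OVER (PARTITION BY sale_date ORDER BY sales DESC) AS rnk
            FROM counts
        )
        SELECT item, sale_date, sales
        FROM ranked
        WHERE rnk = 1          -- stands in for BigQuery's QUALIFY ... = 1
        ORDER BY sale_date, item
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Sample data copied from the question.
sample = [
    ("apple", "1-1-2020"), ("apple", "1-1-2020"),
    ("pear", "1-1-2020"), ("pear", "1-1-2020"),
    ("pear", "1-2-2020"), ("pear", "1-2-2020"), ("pear", "1-2-2020"),
    ("orange", "1-2-2020"),
]

for row in top_items_per_date(sample):
    print(row)
```

Because RANK assigns the same rank to ties, apple and pear are both returned for 1-1-2020; ROW_NUMBER would keep only one of them.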

The following query should do it:
SELECT
item,
sale_date,
FROM (
SELECT
sample.*,
COUNT(item) AS item_count
FROM
sample
GROUP BY
sample.sale_date,
item )
# Here you need to use a WHERE (or HAVING, or GROUP BY) in order to be able to use QUALIFY
WHERE sale_date IS NOT NULL
QUALIFY RANK() OVER(PARTITION BY sale_date ORDER BY item_count DESC) = 1

Related

Get last record by month/year and id

I need to get the last record of each month/year for each id.
My table captures, daily and for each id, an order value that is cumulative. So in the end I need only the last record of each month for each id.
I believe this is something simple, but I could not adapt the examples I found to my case.
Here is an example of my input data and the expected result: db_fiddle.
My attempt doesn't include grouping by month and year:
select ar.id, ar.value, ar.aquisition_date
from table_views ar
inner join (
select id, max(aquisition_date) as last_aquisition_date_month
from table_views
group by id
)ld
on ar.id = ld.id and ar.aquisition_date = ld.last_aquisition_date_month
You could do this:
with tn as (
select
*,
row_number() over (partition by id, date_trunc('month', aquisition_date) order by aquisition_date desc) as rn
from table_views
)
select * from tn where rn = 1
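The tn CTE adds a row number that counts up in descending date order within each id and calendar month (date_trunc('month', ...) keeps the year, so the same month in different years forms separate groups). Keeping only the rows with rn = 1 then leaves the row with the latest aquisition_date of each month for each id.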
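The row-number approach can be checked on a small data set. The sketch below is an assumption-laden illustration, not the asker's setup: it runs on SQLite through Python's sqlite3 module (SQLite 3.25 or newer), replaces date_trunc('month', ...) with strftime('%Y-%m', ...) because SQLite has no date_trunc, and the sample rows and the helper last_record_per_month are invented.

```python
import sqlite3


def last_record_per_month(rows):
    """Return the latest (id, value, aquisition_date) per id and month/year."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE table_views (id INTEGER, value INTEGER, aquisition_date TEXT)"
    )
    con.executemany("INSERT INTO table_views VALUES (?, ?, ?)", rows)
    query = """
        WITH tn AS (
            SELECT id, value, aquisition_date,
                   ROW_NUMBER() OVER (
                       -- strftime('%Y-%m', ...) plays the role of date_trunc('month', ...)
                       PARTITION BY id, strftime('%Y-%m', aquisition_date)
                       ORDER BY aquisition_date DESC
                   ) AS rn
            FROM table_views
        )
        SELECT id, value, aquisition_date
        FROM tn
        WHERE rn = 1
        ORDER BY id, aquisition_date
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Invented cumulative values, dates in ISO format so they sort correctly as text.
rows = [
    (1, 10, "2021-01-05"), (1, 25, "2021-01-31"), (1, 40, "2021-02-14"),
    (2, 7, "2021-01-20"), (2, 9, "2021-02-02"), (2, 30, "2021-02-27"),
]

for row in last_record_per_month(rows):
    print(row)
```

With these rows, id 1 keeps its 2021-01-31 and 2021-02-14 records and id 2 keeps its 2021-01-20 and 2021-02-27 records.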

SUM most recent ID/Product combinations for the latest date

select * from
(select Id, Prodcut, Billing_date
, row_number() over (partition by Id, product order by Billing_date desc) as RowNumber
,sum(Revenue)
from Table1
group by 1,2,3,4,1) a
where a.rowNumber = 1
Some Id + product combinations appear more than once on the latest billing date, which causes some data to be dropped. I am trying to combine sum with row_number so that all rows of each Id/product combination on the latest date are summed, but I cannot get it to work.
Can anyone please help me out here!
Database: Athena, Dbeaver
I would expect this to do what you want:
select *
from (select Id, Product, Billing_date,
row_number() over (partition by Id, product order by Billing_date desc) as seqnum,
sum(Revenue)
from Table1
group by Id, Product, Billing_date
) t1
where seqnum = 1;
Your group by columns do not seem correct. I'm surprised your query runs in any database.
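To see why grouping by Id, Product and Billing_date first solves the duplicate-row problem, here is a small sketch. It assumes SQLite through Python's sqlite3 module rather than Athena, splits the aggregation and the row numbering into two CTEs for readability, and uses invented sample rows and a hypothetical helper name.

```python
import sqlite3


def latest_revenue_per_product(rows):
    """Sum Revenue per (Id, Product) over the latest Billing_date only."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Table1 (Id INTEGER, Product TEXT, Billing_date TEXT, Revenue INTEGER)"
    )
    con.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH grouped AS (
            -- collapse repeated rows of the same Id/Product/date into one total
            SELECT Id, Product, Billing_date, SUM(Revenue) AS total_revenue
            FROM Table1
            GROUP BY Id, Product, Billing_date
        ),
        numbered AS (
            -- number the dates of each Id/Product, newest first
            SELECT Id, Product, Billing_date, total_revenue,
                   ROW_NUMBER() OVER (
                       PARTITION BY Id, Product ORDER BY Billing_date DESC
                   ) AS seqnum
            FROM grouped
        )
        SELECT Id, Product, Billing_date, total_revenue
        FROM numbered
        WHERE seqnum = 1
        ORDER BY Id, Product
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Invented rows: Id 1 / product A appears twice on its latest billing date.
rows = [
    (1, "A", "2021-03-01", 10),
    (1, "A", "2021-04-01", 5),
    (1, "A", "2021-04-01", 7),
    (2, "B", "2021-02-01", 3),
]

for row in latest_revenue_per_product(rows):
    print(row)
```

The two rows for Id 1 on 2021-04-01 are summed to 12 before the row number is assigned, so neither is lost.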

Sum having a condition

I've a table that has this information:
And need to get the following information:
If the country of the same person name (in this case Artur) is different, then I need to sum the two values of quantity from the max date (in this case 04/10) and return both person (Artur) and the qty (15k)
If the country of the same person name (in this case Joseph) is the same, then I need only the first row of the max date available.
I'm really struggling, as I'm not sure how to implement this logic in my code:
Select
table.person,
table.quantity
From
(
Select
table.date,
table.person,
table.country,
table.quantity,
ROW_NUMBER () over (
PARTITION by table.code, table.person
ORDER by table.date DESC
) AS rn
FROM
table
WHERE table.date >= DATE '{2020-04-10}' -5
) a
WHERE a.RN IN (1,2)
Is it possible to create a rule to sum rows 1 and 2 when country is different (Artur case) and only return row number 1 when the country is the same for a name (Joseph case)?
Use dense_rank() or max() as a window function:
select person, sum(quantity)
from (select t.*,
max(date) over (partition by person) as max_date
from t
) t
where date = max_date
group by person;
EDIT:
Hmmm . . . I think you might want one row per country per person on the max date. If so:
select person, sum(quantity)
from (select t.*,
row_number() over (partition by person, country order by date desc) as seqnum_pc,
rank() over (partition by person order by date desc) as seqnum_p
from t
) t
where seqnum_p = 1 and seqnum_pc = 1
group by person;
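The second query can be tried on made-up data. Since the original table image is not available, the rows below are an assumption chosen to match the description (Artur in two countries on the max date totalling 15k, Joseph in one country); the sketch runs on SQLite through Python's sqlite3 module, renames the date column to obs_date, and wraps the query in a hypothetical helper.

```python
import sqlite3


def quantity_on_max_date(rows):
    """Sum quantity per person, one row per country, on the person's max date."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (obs_date TEXT, person TEXT, country TEXT, quantity INTEGER)"
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT person, SUM(quantity) AS total_quantity
        FROM (
            SELECT t.*,
                   -- newest row of each person/country pair
                   ROW_NUMBER() OVER (
                       PARTITION BY person, country ORDER BY obs_date DESC
                   ) AS seqnum_pc,
                   -- rank 1 marks every row on the person's latest date
                   RANK() OVER (
                       PARTITION BY person ORDER BY obs_date DESC
                   ) AS seqnum_p
            FROM t
        ) AS ranked
        WHERE seqnum_p = 1 AND seqnum_pc = 1
        GROUP BY person
        ORDER BY person
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Invented rows matching the description in the question.
rows = [
    ("2020-04-10", "Artur", "PT", 10000),
    ("2020-04-10", "Artur", "ES", 5000),
    ("2020-04-09", "Artur", "PT", 4000),
    ("2020-04-10", "Joseph", "US", 8000),
    ("2020-04-09", "Joseph", "US", 2000),
]

for row in quantity_on_max_date(rows):
    print(row)
```

One caveat of this design: if a person has several rows for the same country on the max date, ROW_NUMBER picks one of them arbitrarily unless the ORDER BY gets a tie-breaker.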

prepare the monthly sales report for the customer who has the maximum sales, for the below table

This is an example of how you can retrieve, in a single query:
the id of the customer who has the maximum sales
the total sales of this customer
First, the query that returns each customer_id with its total sales looks like this:
select customer_id, sum(sales) as sumsales from mytable group by customer_id;
Next, a rank over the sales totals in descending order has to be added, so that a single record can be selected by its rank later. This requires a wrapping query:
select customer_id, sumsales, rank() over (order by sumsales desc) as rnk from
(select customer_id, sum(sales) as sumsales from mytable group by customer_id);
Now that the ranked entries are available, the first ranked record has to be selected:
select customer_id, sumsales from
(select customer_id, sumsales, rank() over (order by sumsales desc) as rnk from
(select customer_id, sum(sales) as sumsales from mytable group by customer_id)
)
where rnk=1;
However, this may not be the most efficient way to achieve this.
EDIT:
To avoid a wrapping layer whose only purpose is to add the rank, the rank field can be added to the first inner query:
select customer_id, sum(sales) as sumsales, rank() over (order by sum(sales) desc) as rnk
from mytable group by customer_id;
And then, a single wrapping query is needed to select the first ranked record as:
select customer_id, sumsales from (
select customer_id, sum(sales) as sumsales, rank() over (order by sum(sales) desc) as rnk
from mytable group by customer_id
)
where rnk=1;
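The aggregate, rank, then filter steps can be exercised on a toy table. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module, writes the layers as CTEs instead of nested subqueries, and the sample rows and helper name are invented.

```python
import sqlite3


def top_customers(rows):
    """Return (customer_id, sumsales) for the customer(s) with the highest total."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (customer_id INTEGER, sales INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    query = """
        WITH totals AS (
            -- step 1: total sales per customer
            SELECT customer_id, SUM(sales) AS sumsales
            FROM mytable
            GROUP BY customer_id
        ),
        ranked AS (
            -- step 2: rank the totals, highest first
            SELECT customer_id, sumsales,
                   RANK() OVER (ORDER BY sumsales DESC) AS rnk
            FROM totals
        )
        -- step 3: keep the first-ranked record(s)
        SELECT customer_id, sumsales
        FROM ranked
        WHERE rnk = 1
        ORDER BY customer_id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Invented rows: customers 1 and 3 both total 150, customer 2 totals 130.
rows = [(1, 100), (1, 50), (2, 120), (2, 10), (3, 150)]

print(top_customers(rows))
```

Note that rank() returns every customer tied for first place, so this call yields both customer 1 and customer 3; use row_number() instead if exactly one row is required.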

Ranking in Teradata- SQL

I am trying to rank my sales data using the rank() over function. Here is my code:
Select
Category as CAT
,units*cost as COST_SALES
,units*retail as RETAIL_COST
,units as UNITS_SOLD
,RANK() OVER (PARTITION BY 1 ORDER BY 3 DESC ) AS RANKING
from Table
Where date between current_date-7 and current_date
group by 1
When I get my result it is unordered and shows rank 1 for all the categories.
You can't use positional column references in window functions. Inside OVER, the 1 and 3 are treated as constants, which is why every row gets rank 1. You need to name the columns explicitly:
Select Category as CAT, units*cost as COST_SALES, units*retail as RETAIL_COST,
units as UNITS_SOLD,
RANK() OVER (PARTITION BY Category ORDER BY units*retail DESC ) AS RANKING
from Table
Where date between current_date-7 and current_date
group by Category;
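As a runnable illustration of naming the expressions explicitly, the sketch below ranks categories by their retail sales. It rests on assumptions the thread does not confirm: that the goal is to rank categories against each other (so there is no PARTITION BY, since partitioning by Category after grouping by Category leaves one row per partition), that the per-row values should be summed per category, and that SQLite through Python's sqlite3 module is an acceptable stand-in for Teradata. The table name sales, the sample rows, and the helper name are invented, and the date filter is left out.

```python
import sqlite3


def rank_categories(rows):
    """Return (cat, cost_sales, retail_sales, units_sold, ranking) per category."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sales (category TEXT, units INTEGER, cost INTEGER, retail INTEGER)"
    )
    con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH per_category AS (
            SELECT category            AS cat,
                   SUM(units * cost)   AS cost_sales,
                   SUM(units * retail) AS retail_sales,
                   SUM(units)          AS units_sold
            FROM sales
            GROUP BY category
        )
        SELECT cat, cost_sales, retail_sales, units_sold,
               -- the ORDER BY names the column, not its position
               RANK() OVER (ORDER BY retail_sales DESC) AS ranking
        FROM per_category
        ORDER BY ranking
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Invented rows: (category, units, cost per unit, retail price per unit).
rows = [
    ("toys", 2, 3, 5),
    ("toys", 1, 3, 5),
    ("books", 10, 1, 2),
    ("games", 1, 20, 30),
]

for row in rank_categories(rows):
    print(row)
```

With these rows, games ranks first (retail sales 30), books second (20) and toys third (15).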