Hi, I am trying to calculate a sales commission that is capped at a maximum of $25,000 for the year. For example, if an employee earns $3,000 a month, the total for the year would be $36,000, but I can pay at most $25,000.
I tried window functions such as SUM(mtd_commission) OVER (ORDER BY month) and compared the running total against 25000, but that stops paying after the 8th month, at $24,000. How can I pay only the remaining $1,000 (25000 - 24000) for the 9th month?
[expected results]
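You might consider the query below.
WITH sample_table AS (
  SELECT month, 3000 mtd_comm FROM UNNEST(GENERATE_ARRAY(1, 12)) month
)
SELECT *,
       SUM(mtd_comm) OVER w0 ytd_comm,
       CASE
         WHEN SUM(mtd_comm) OVER w0 <= 25000 THEN mtd_comm
         -- past the cap: pay only what is left after the previous months
         ELSE GREATEST(0, 25000 - SUM(mtd_comm) OVER w1)
       END AS paid_comm,
       LEAST(25000, SUM(mtd_comm) OVER w0) ytd_paid_comm
  FROM sample_table
WINDOW w0 AS (ORDER BY month),
       -- w1: every month before the current one (assumed definition)
       w1 AS (ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING);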
Query results
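The capping logic above can also be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function name `capped_commissions` and the flat 3000-per-month input are illustrative assumptions.

```python
from itertools import accumulate

def capped_commissions(monthly, cap=25000):
    """Return the commission actually paid each month under a yearly cap."""
    paid = []
    ytd_before = 0  # commission earned in the months before the current one
    for amount in monthly:
        remaining = max(0, cap - ytd_before)  # room left under the cap
        paid.append(min(amount, remaining))   # pay in full, or only what is left
        ytd_before += amount
    return paid

monthly = [3000] * 12                 # hypothetical data: 3000 earned every month
paid = capped_commissions(monthly)
ytd_paid = list(accumulate(paid))     # running total of what was paid

for month, (p, y) in enumerate(zip(paid, ytd_paid), start=1):
    print(month, p, y)
```

With this input the employee is paid in full for months 1 to 8, receives 1000 in month 9, and nothing afterwards, so the paid total ends at the cap.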
I'm using a public dataset to run some modeling while learning BigQuery SQL. I have a date column, but I'm trying to group by day of the year rather than by the full date. A date is stored as 2018-2-12, but I'd like it as just 2-12 or 02-12. I have the code to extract the day and month from the date, but I can't find a way to concatenate the two so that I can group by the result.
SELECT
  EXTRACT(MONTH FROM sales.date) AS month,
  EXTRACT(DAY FROM sales.date) AS day,
  ROUND(AVG(sales.bottles_sold / sales.pack), 2) AS pack_qty, -- average case or pack
  ROUND(AVG(sales.bottles_sold), 2) AS qty_bottles, -- average total number of bottles
  ROUND(AVG(sales.sale_dollars), 2) AS sales_rev, -- average sales rev
  ROUND(AVG((sales.state_bottle_retail - sales.state_bottle_cost) * sales.bottles_sold), 2) AS profit, -- avg profit on that day
  ROUND(AVG(sales.volume_sold_liters), 2) AS volumeLit, -- average volume in liters
  ROUND(AVG(sales.volume_sold_gallons), 2) AS volumeGal -- average volume in gal
FROM `bigquery-public-data.iowa_liquor_sales.sales` AS sales
GROUP BY
  EXTRACT(MONTH FROM sales.date),
  EXTRACT(DAY FROM sales.date)
ORDER BY
  volumeGal DESC;
You can use the CONCAT function inside the SELECT statement. The following query returns a single day column in "day-month" format; swap the two EXTRACT calls if you want "month-day" instead.
SELECT
  CONCAT(EXTRACT(DAY FROM sales.date), '-', EXTRACT(MONTH FROM sales.date)) AS day,
  ROUND(AVG(sales.bottles_sold / sales.pack), 2) AS pack_qty, -- average case or pack
  ROUND(AVG(sales.bottles_sold), 2) AS qty_bottles, -- average total number of bottles
  ROUND(AVG(sales.sale_dollars), 2) AS sales_rev, -- average sales rev
  ROUND(AVG((sales.state_bottle_retail - sales.state_bottle_cost) * sales.bottles_sold), 2) AS profit, -- avg profit on that day
  ROUND(AVG(sales.volume_sold_liters), 2) AS volumeLit, -- average volume in liters
  ROUND(AVG(sales.volume_sold_gallons), 2) AS volumeGal -- average volume in gal
FROM `bigquery-public-data.iowa_liquor_sales.sales` AS sales
GROUP BY
  day
ORDER BY
  volumeGal DESC;
You can use the || operator for concatenating two strings (the + operator with CONVERT(nvarchar, ...) is the SQL Server equivalent and does not work in BigQuery).
SELECT
  CAST(EXTRACT(MONTH FROM sales.date) AS STRING) || '-' || CAST(EXTRACT(DAY FROM sales.date) AS STRING) AS date_month,
  ROUND(AVG(sales.bottles_sold / sales.pack), 2) AS pack_qty, -- average case or pack
  ROUND(AVG(sales.bottles_sold), 2) AS qty_bottles, -- average total number of bottles
  ROUND(AVG(sales.sale_dollars), 2) AS sales_rev, -- average sales rev
  ROUND(AVG((sales.state_bottle_retail - sales.state_bottle_cost) * sales.bottles_sold), 2) AS profit, -- avg profit on that day
  ROUND(AVG(sales.volume_sold_liters), 2) AS volumeLit, -- average volume in liters
  ROUND(AVG(sales.volume_sold_gallons), 2) AS volumeGal -- average volume in gal
FROM `bigquery-public-data.iowa_liquor_sales.sales` AS sales
GROUP BY
  date_month
ORDER BY
  volumeGal DESC;
I have a simple table that contains a record of products and their total sales per day over a year (just 3 columns - Product, Date, Sales). So, for example, if product A is sold every single day, it'll have 365 records. Similarly, if product B is sold for only 50 days, the table will have just 50 rows for that product - one for each day of sale.
I need to calculate the daily average sales and standard deviation for the entire year, which means that, for product B, I need to have additional 365-50=315 entries with zero sales to be able to calculate the daily average and standard deviation for the year correctly.
Is there a way to do this efficiently and dynamically in SQL?
We can generate 366 rows and join the sales data to it:
WITH rg(rn) AS (
  SELECT 1 AS rn
  UNION ALL
  SELECT a.rn + 1 AS rn
  FROM rg a
  WHERE a.rn < 366
)
SELECT *
FROM rg
LEFT JOIN (
  SELECT YEAR(saledate) as yr, DATEPART(dayofyear, saledate) as doy, count(*) as numsales
  FROM sales
  GROUP BY YEAR(saledate), DATEPART(dayofyear, saledate)
) s ON rg.rn = s.doy
OPTION (MAXRECURSION 366) -- the default recursion limit of 100 is too low for 366 rows
You can replace the NULLs (days with no sale data) with zero, e.g. AVG(COALESCE(numsales, 0)). You'll probably also need a WHERE clause to eliminate the 366th day in non-leap years (for example, keep row 366 only when the year is a leap year; yr % 4 = 0 is a sufficient test for any year from 1901 to 2099).
If you're only doing a single year, you can use a WHERE clause in the sales subquery to return only the relevant records. The most efficient form is a range such as WHERE saledate >= DATEFROMPARTS(YEAR(GetDate()), 1, 1) AND saledate < DATEFROMPARTS(YEAR(GetDate()) + 1, 1, 1), rather than calling a function on every sale date to extract the year and compare it to a constant. You can also drop YEAR(saledate) from the SELECT and GROUP BY if there is only a single year.
If you're doing multiple years, you could make rg generate more rows, or (perhaps simpler) cross join it to a list of years so you get 366 rows multiplied by e.g. VALUES (2015),(2016),(2017),(2018),(2019),(2020), and make the year from the sales subquery part of the join condition too.
Find the first and last day of the year, then use DATEDIFF() to get the number of days in that year.
After that, don't use AVG on sales; use SUM(Sales) / days_in_year instead.
select *,
       days_in_year = datediff(day, first_of_year, last_of_year) + 1
from (values (2019), (2020)) v(year)
cross apply
(
    select first_of_year = dateadd(year, year - 1900, 0),
           last_of_year  = dateadd(year, year - 1900 + 1, -1)
) d
There's a different way to look at it: don't try to add additional empty rows, just divide by the number of days in the year. While the number of days in a year isn't constant (a leap year has 366), it can be calculated easily, since the first day of the year is always January 1st and the last is always December 31st:
SELECT YEAR(date), product,
       -- if sales is an integer column, multiply by 1.0 to avoid integer division
       SUM(sales) / DATEPART(dy, DATEFROMPARTS(YEAR(date), 12, 31))
FROM sales_table
GROUP BY YEAR(date), product
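Dividing the sum by the number of days gives the daily mean, but the standard deviation also has to account for the missing zero-sales days. The sketch below shows one way to do that arithmetic without materialising the extra rows. It is only an illustration: the function name `yearly_daily_stats`, the input shape (one value per day that had sales) and the sample figures are assumptions, and it computes the population standard deviation (the counterpart of STDEVP rather than STDEV in SQL Server).

```python
import calendar
import math
import statistics

def yearly_daily_stats(sales, year):
    """Mean and population std dev of daily sales, treating days without a row as 0.

    `sales` holds one value per day that had sales (hypothetical input shape).
    """
    days = 366 if calendar.isleap(year) else 365
    missing = days - len(sales)  # days with no sales row at all
    mean = sum(sales) / days
    # Every missing day is a zero, so it deviates from the mean by exactly `mean`.
    squared = sum((s - mean) ** 2 for s in sales) + missing * mean ** 2
    return mean, math.sqrt(squared / days)

# Made-up example: a product sold on 50 days of a non-leap year, 10 units each day.
mean, std = yearly_daily_stats([10.0] * 50, 2019)

# Cross-check against explicitly padding the data with 315 zeros.
padded = [10.0] * 50 + [0.0] * 315
check = statistics.pstdev(padded)
print(mean, std, check)
```

The same idea carries over to SQL: the sum of squared deviations over the existing rows plus (missing days) * mean^2, divided by the number of days in the year.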
I'm thinking the only way to do it is to sum the values between (today - 365) and (today - 365 + 90), then move the window forward by 1 day each time, but that would be impractical. Is there a way around it?
If you have one row on each day:
select top (1) t.*
from (select t.*,
             sum(x) over (order by date rows between 89 preceding and current row) as sum_90
      from t
     ) t
order by sum_90 desc;
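The windowed sum above does not recompute 90 values for every day; conceptually it slides a window along the rows. A rough Python sketch of that idea follows, assuming one value per day in date order. The name `best_window` and the sample data are made up for illustration. Unlike the SQL, which also produces partial sums for the first 89 rows, this version only considers full windows.

```python
def best_window(values, size=90):
    """Return (end_index, total) of the consecutive window with the largest sum."""
    if len(values) < size:
        raise ValueError("need at least `size` values")
    window = sum(values[:size])            # sum of the first full window
    best_end, best_total = size - 1, window
    for i in range(size, len(values)):
        # Slide by one day: add the new value, drop the one that fell out.
        window += values[i] - values[i - size]
        if window > best_total:
            best_end, best_total = i, window
    return best_end, best_total

# Made-up year of data: a 90-day stretch of higher values starting at index 100.
values = [1] * 100 + [2] * 90 + [1] * 175
end, total = best_window(values)
print(end, total)
```

Each step costs one addition and one subtraction, so moving the window one day at a time over a year is cheap.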
I have 4 dimensions, one of which is the date. For each date, I need to calculate the average over the last 30 days, per dimension value.
I tried running an average over a partition by the 4 dimensions, in the form of:
Date, Produce,Company, Song, Revenues,
Average(case when Date between Date -Interval '31' day and Date - Interval '1' Day then Revenues else null End) over (partition by Date,Company,Song,Revenues order by Date) as "Running Average"
I get only nulls with every aggregation I tried.
Help is appreciated. Thanks
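You can try the query below. It assumes one row per date within each Company/Song partition, so the frame covers the current row and the 30 rows before it.
Date, Produce, Company, Song, Revenues,
AVG(Revenues) over (partition by Company, Song order by Date rows between 30 preceding and current row) as "Running Average"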
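To make the frame semantics concrete, here is a small Python sketch of a per-partition trailing average. It implements the variant the question describes, where the current row is excluded (the SQL equivalent would be a frame ending at 1 PRECEDING instead of CURRENT ROW). The function name `trailing_averages`, the tuple layout and the sample rows are assumptions for illustration, and it relies on one row per date per partition.

```python
from collections import defaultdict, deque

def trailing_averages(rows, window=30):
    """rows: (date, company, song, revenue) tuples with ISO date strings.

    Returns each row extended with the average revenue of up to `window`
    previous rows in the same (company, song) partition, or None if there
    are no previous rows.
    """
    history = defaultdict(lambda: deque(maxlen=window))  # last `window` revenues per partition
    out = []
    for date, company, song, revenue in sorted(rows):    # ISO dates sort chronologically
        prev = history[(company, song)]
        avg = sum(prev) / len(prev) if prev else None
        out.append((date, company, song, revenue, avg))
        prev.append(revenue)                             # current row joins the history afterwards
    return out

rows = [
    ("2024-01-03", "A", "s", 30.0),
    ("2024-01-01", "A", "s", 10.0),
    ("2024-01-01", "B", "t", 5.0),
    ("2024-01-02", "A", "s", 20.0),
]
result = trailing_averages(rows)
for r in result:
    print(r)
```

Partitioning by date as well, as in the original attempt, would leave each partition with a single row and nothing to average over.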
How to calculate a dynamic average value between rows?
For the first 12 months, status_flag is always N. From the 13th month onward, we need to take the average of sales over the first 13 rows and compare it with the min and max values: if it lies between min and max, set status_flag to Y, otherwise set it to N.
The same applies to the 14th row: take the average of the first 14 rows and compare it with min and max, and so on.
How can I do this?
I think the challenging part is to get the average sales. You can use the Analytic Functions:
select Storeid, Months, Min, Max, sales,
avg(sales) over (order by Months RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as avg_sales
from your_table;
The rest should be easier. Note, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is the default, so you can just skip it.
with a as
 (select Storeid, Months, Min, Max, sales,
         avg(sales) over (order by Months) as avg_sales
    from your_table)
select Storeid, Months, Min, Max, sales, avg_sales,
       case
         when Months <= 12 then 'N'
         when avg_sales between Min and Max then 'Y'
         else 'N'
       end as Status_flag
  from a;
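The flag logic can be traced by hand with a short Python sketch. It is an illustration only: the name `status_flags`, the (sales, min, max) tuple layout and the sample values are assumptions, and it counts rows by position rather than reading a Months column, so it relies on the rows being in month order with no gaps.

```python
def status_flags(rows, warmup=12):
    """rows: (sales, lo, hi) tuples in month order.

    Returns 'Y' when the average of all sales so far lies within [lo, hi]
    and more than `warmup` rows have been seen, otherwise 'N'.
    """
    flags = []
    total = 0.0
    for n, (sales, lo, hi) in enumerate(rows, start=1):
        total += sales
        avg = total / n                      # average of rows 1..n
        in_range = lo <= avg <= hi           # same inclusive test as BETWEEN
        flags.append('Y' if n > warmup and in_range else 'N')
    return flags

# Made-up data: 14 months of sales of 100; the 13th row has a range that excludes 100.
rows = [(100.0, 90.0, 110.0)] * 12 + [(100.0, 200.0, 300.0), (100.0, 90.0, 110.0)]
flags = status_flags(rows)
print(flags)
```

The first 12 rows are always 'N', the 13th is 'N' here because the running average falls outside its range, and the 14th is 'Y'.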
Update your_table t set status_flag =
  case when
    (Select count(*)
       From your_table
      where Month <= t.Month) > 12
    and
    (select avg(sales)
       from your_table
      where Month <= t.Month)
    Between Min and Max
  then 'Y' else 'N' end