I'm using a public data set to run some modeling while trying to learn BigQuery SQL. I have a date column, but I want to group by day of the year (month and day) rather than by full date. A date is stored as 2018-2-12, but I'd like it as just 2-12 or 02-12. I have code to extract the day and month from the date, but I can't find a way to concatenate the two so that I can group by the result.
SELECT
EXTRACT (MONTH FROM sales.date) AS month,
EXTRACT(DAY FROM sales.date) AS day ,
ROUND(AVG(sales.bottles_sold/sales.pack), 2) as pack_qty, -- average case or pack
ROUND(AVG(sales.bottles_sold), 2) AS qty_bottles, -- average total number of bottles
ROUND(AVG(sales.sale_dollars), 2) as sales_rev, -- average sales rev
ROUND(AVG((sales.state_bottle_retail - sales.state_bottle_cost) * sales.bottles_sold), 2) AS profit, -- avg profit on that day
ROUND (AVG(sales.volume_sold_liters), 2) as volumeLit, -- average volume in liters
ROUND (AVG(sales.volume_sold_gallons), 2) as volumeGal -- average volume in gal
FROM `bigquery-public-data.iowa_liquor_sales.sales` AS sales
GROUP BY
month,
day
ORDER BY
volumeGal DESC;
SELECT
EXTRACT (MONTH FROM sales.date) AS month,
EXTRACT(DAY FROM sales.date) AS day ,
ROUND(AVG(sales.bottles_sold/sales.pack), 2) as pack_qty, -- average case or pack
ROUND(AVG(sales.bottles_sold), 2) AS qty_bottles, -- average total number of bottles
ROUND(AVG(sales.sale_dollars), 2) as sales_rev, -- average sales rev
ROUND(AVG((sales.state_bottle_retail - sales.state_bottle_cost) * sales.bottles_sold), 2) AS profit, -- avg profit on that day
ROUND (AVG(sales.volume_sold_liters), 2) as volumeLit, -- average volume in liters
ROUND (AVG(sales.volume_sold_gallons), 2) as volumeGal -- average volume in gal
FROM `bigquery-public-data.iowa_liquor_sales.sales` AS sales
GROUP BY
EXTRACT (MONTH FROM sales.date),
EXTRACT(DAY FROM sales.date)
ORDER BY
volumeGal DESC;
You can use the CONCAT function in the SELECT list. The following query returns a single day column in "day-month" format; swap the two EXTRACT calls to get "month-day" as asked in the question.
SELECT
CONCAT(EXTRACT (DAY FROM sales.date) ,'-', EXTRACT(MONTH FROM sales.date)) AS day ,
ROUND(AVG(sales.bottles_sold/sales.pack), 2) as pack_qty, -- average case or pack
ROUND(AVG(sales.bottles_sold), 2) AS qty_bottles, -- average total number of bottles
ROUND(AVG(sales.sale_dollars), 2) as sales_rev, -- average sales rev
ROUND(AVG((sales.state_bottle_retail - sales.state_bottle_cost) * sales.bottles_sold), 2) AS profit, -- avg profit on that day
ROUND (AVG(sales.volume_sold_liters), 2) as volumeLit, -- average volume in liters
ROUND (AVG(sales.volume_sold_gallons), 2) as volumeGal -- average volume in gal
FROM `bigquery-public-data.iowa_liquor_sales.sales` AS sales
GROUP BY
day
ORDER BY
volumeGal DESC;
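If the zero-padded 02-12 form is wanted, formatting the date directly avoids concatenation altogether. In BigQuery, FORMAT_DATE('%m-%d', sales.date) should produce that string. The sketch below shows the same grouping idea against an in-memory SQLite table through Python's sqlite3 module, using SQLite's strftime in place of FORMAT_DATE; the table contents are made up for illustration and are not from the Iowa data set.

```python
import sqlite3

# Hypothetical miniature of the sales table (made-up rows, SQLite dialect).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, volume_sold_gallons REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2017-02-12", 10.0),
        ("2018-02-12", 20.0),
        ("2018-02-13", 6.0),
        ("2018-12-02", 4.0),
    ],
)

# Group by a zero-padded month-day string so the same calendar day
# from different years lands in one group.
rows = conn.execute(
    """
    SELECT strftime('%m-%d', date) AS month_day,
           ROUND(AVG(volume_sold_gallons), 2) AS volumeGal
    FROM sales
    GROUP BY month_day
    ORDER BY volumeGal DESC
    """
).fetchall()

for month_day, volume in rows:
    print(month_day, volume)
```

Both 12 February rows fall into the single 02-12 group, and the padded form also sorts correctly as text.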
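You can use the + operator to concatenate two strings. Note that this is SQL Server syntax: BigQuery does not support + on strings or convert(nvarchar, ...), so use CONCAT or the || operator with CAST(... AS STRING) there.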
Code:
SELECT
convert(nvarchar,month(sales.date))+'-'+ convert(nvarchar,day(sales.date) ) as date_month ,
ROUND(AVG(sales.bottles_sold/sales.pack), 2) as pack_qty, -- average case or pack
ROUND(AVG(sales.bottles_sold), 2) AS qty_bottles, -- average total number of bottles
ROUND(AVG(sales.sale_dollars), 2) as sales_rev, -- average sales rev
ROUND(AVG((sales.state_bottle_retail - sales.state_bottle_cost) * sales.bottles_sold), 2) AS profit, -- avg profit on that day
ROUND (AVG(sales.volume_sold_liters), 2) as volumeLit, -- average volume in liters
ROUND (AVG(sales.volume_sold_gallons), 2) as volumeGal -- average volume in gal
FROM bigquery-public-data.iowa_liquor_sales.sales AS sales
GROUP BY
date_month
ORDER BY
volumeGal DESC;
Hi, I am trying to calculate a sales commission that is capped at 25000 for the year. For example, if an employee earns $3000 a month, the total for the year is $36000, but I may pay at most $25000.
I tried a window function, sum(mtd_commission) over the months, and compared it against 25000, but it stops after the 8th month at 24000. How can I pay only the remaining $1000 (25000 - 24000) in the 9th month?
Thanks
[expected results]
You might consider the query below.
WITH sample_table AS (
SELECT month, 3000 mtd_comm FROM UNNEST(GENERATE_ARRAY(1, 12)) month
)
SELECT *,
SUM(mtd_comm) OVER w0 ytd_comm,
CASE
WHEN SUM(mtd_comm) OVER w0 <= 25000 THEN mtd_comm
ELSE GREATEST(0, 25000 - SUM(mtd_comm) OVER w1)
END AS paid_comm,
LEAST(25000, SUM(mtd_comm) OVER w0) ytd_paid_comm,
FROM sample_table
WINDOW w0 AS (ORDER BY month),
w1 AS (ORDER BY month RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
;
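The capping arithmetic in that query can be checked outside SQL. The following is a minimal Python sketch, assuming the same sample data (3000 per month for 12 months) and non-negative monthly commissions; the names CAP, monthly and paid are illustrative only.

```python
from itertools import accumulate

CAP = 25000                 # yearly maximum that may be paid out
monthly = [3000] * 12       # month-to-date commission for months 1..12

# Running year-to-date total, the counterpart of SUM(mtd_comm) OVER w0.
ytd = list(accumulate(monthly))

paid = []
for mtd, running in zip(monthly, ytd):
    prior = running - mtd   # total earned before this month (window w1)
    # Pay the full month while under the cap, only the remainder in the
    # month that crosses it, and nothing afterwards.
    paid.append(max(0, min(mtd, CAP - prior)))

for month, amount in enumerate(paid, start=1):
    print(month, amount)
```

Months 1 to 8 pay 3000 each, month 9 pays the remaining 1000, and months 10 to 12 pay 0, so the payments add up to exactly the cap.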
I'd like to find the average monthly revenue for each sales owner--however my current query is taking the monthly total and just dividing it by the number of entries. Ultimately, I'd like to get the average by finding the total revenue for each month and then dividing it by the number of months and then eventually just finding the avg. of the past 6 months. Code as well as sample output below:
select activitydate, console_org_name, partneragency, partneradvertiser, org_sales_owner
,round(sum(gross_revenue_allocation)::numeric,2) as gross_revenue
,round(avg(sum(gross_revenue_allocation)) over (partition by org_sales_owner order by activitydate RANGE INTERVAL '5' MONTH PRECEDING)::numeric,2) as salesowner6monthavg
from data_provider_payout dpp
where activitydate >= '01/01/2019'
group by activitydate, console_org_name, partneragency, partneradvertiser, org_sales_owner
If I understand correctly, you need to aggregate by the month and the owner. That would be something like this:
select date_trunc('month', activitydate), org_sales_owner,
round(sum(gross_revenue_allocation)::numeric, 2) as gross_revenue,
round(avg(sum(gross_revenue_allocation)) over (
partition by org_sales_owner
order by date_trunc('month', activitydate)
range between interval '5 month' preceding and current row
)::numeric, 2) as salesowner6monthavg
from data_provider_payout dpp
where activitydate >= '2019-01-01'
group by date_trunc('month', activitydate), org_sales_owner
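The difference between averaging rows and averaging monthly totals is easy to see on a few made-up rows. This is a rough Python sketch of the two-step idea (sum per owner and month first, then average the monthly totals over a trailing six-month window); the sample data and the helper names month_index and trailing_avg are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Made-up rows: (activitydate, org_sales_owner, gross_revenue_allocation)
rows = [
    (date(2019, 1, 5), "ann", 100.0),
    (date(2019, 1, 20), "ann", 50.0),
    (date(2019, 2, 3), "ann", 30.0),
    (date(2019, 3, 9), "ann", 60.0),
]

def month_index(d):
    """Map a date to a running month number so windows can span years."""
    return d.year * 12 + d.month - 1

# Step 1: total revenue per owner and month (the GROUP BY).
monthly = defaultdict(float)
for day, owner, revenue in rows:
    monthly[(owner, month_index(day))] += revenue

def trailing_avg(owner, idx, window=6):
    """Step 2: average the monthly totals over the last `window` months.

    Months with no rows are skipped rather than counted as zero, which
    matches what the window function does with missing groups.
    """
    totals = [
        monthly[(owner, m)]
        for m in range(idx - window + 1, idx + 1)
        if (owner, m) in monthly
    ]
    return round(sum(totals) / len(totals), 2)

march_avg = trailing_avg("ann", month_index(date(2019, 3, 1)))
row_avg = round(sum(r[2] for r in rows) / len(rows), 2)
print(march_avg, row_avg)
```

Here the monthly totals are 150, 30 and 60, so the average of monthly totals is 80.0, while the plain row average is 60.0.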
I have a simple table that contains a record of products and their total sales per day over a year (just 3 columns - Product, Date, Sales). So, for example, if product A is sold every single day, it'll have 365 records. Similarly, if product B is sold for only 50 days, the table will have just 50 rows for that product - one for each day of sale.
I need to calculate the daily average sales and standard deviation for the entire year, which means that, for product B, I need to have additional 365-50=315 entries with zero sales to be able to calculate the daily average and standard deviation for the year correctly.
Is there a way to do this efficiently and dynamically in SQL?
Thanks
We can generate 366 rows and join the sales data to it:
WITH rg(rn) AS (
SELECT 1 AS rn
UNION ALL
SELECT a.rn + 1 AS rn
FROM rg a
WHERE a.rn < 366
)
SELECT
*
FROM
rg
LEFT JOIN (
SELECT YEAR(saledate) as yr, DATEPART(dayofyear, saledate) as doy, count(*) as numsales
FROM sales
GROUP BY YEAR(saledate), DATEPART(dayofyear, saledate)
) s ON rg.rn = s.doy
OPTION (MAXRECURSION 370);
You can replace the nulls (where there is no sale data for that day) with zeros, e.g. AVG(COALESCE(numsales, 0)). You'll probably also need a WHERE clause to eliminate the 366th day in non-leap years (for years 1901 to 2099, take the year modulo 4 and keep row 366 only when the result is 0).
If you're only doing a single year, you can use a WHERE clause in the sales subquery to return only the relevant records. The most efficient approach is a range such as WHERE saledate >= DATEFROMPARTS(YEAR(GetDate()), 1, 1) AND saledate < DATEFROMPARTS(YEAR(GetDate()) + 1, 1, 1), rather than calling a function on every sale date to extract the year and compare it to a constant. You can also drop YEAR(saledate) from the SELECT and GROUP BY if there is only a single year.
If you're doing multiple years, you could make rg generate more rows, or (perhaps simpler) cross join it to a list of years so you get 366 rows multiplied by e.g. VALUES (2015),(2016),(2017),(2018),(2019),(2020), and make the year from the sales subquery part of the join condition too.
Find the first and last day of the year, then use datediff() to get the number of days in that year.
After that, don't use AVG on sales; use SUM(Sales) / days_in_year instead:
select *,
days_in_year = datediff(day, first_of_year, last_of_year) + 1
from (values (2019), (2020)) v(year)
cross apply
(
select first_of_year = dateadd(year, year - 1900, 0),
last_of_year = dateadd(year, year - 1900 + 1, -1)
) d
There's a different way to look at it: don't try to add additional empty rows, just divide by the number of days in the year. While the number of days in a year isn't constant (a leap year has 366 days), it can be calculated easily, since the first day of the year is always January 1st and the last is always December 31st:
SELECT YEAR(date),
product,
SUM(sales) / DATEPART(dy, DATEFROMPARTS(YEAR(date), 12, 31))
FROM sales_table
GROUP BY YEAR(date), product
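The same "divide by days in the year" idea extends to the standard deviation asked about in the question: with the sum and the sum of squares of the recorded sales, the missing zero-sales days never have to be materialised. Below is a small Python sketch, assuming the population standard deviation is wanted (the sample version divides by n - 1 instead); the function name yearly_stats and the sales figures are made up for illustration.

```python
import calendar
import math
import statistics

def yearly_stats(sales, year):
    """Mean and population std dev over every day of `year`,
    treating days without a record as zero sales."""
    n_days = 366 if calendar.isleap(year) else 365
    mean = sum(sales) / n_days
    # Zero-sales days add nothing to the sum of squares, so only the
    # divisor changes: Var = E[x^2] - (E[x])^2.
    variance = sum(s * s for s in sales) / n_days - mean ** 2
    return mean, math.sqrt(max(variance, 0.0))

# Product B: sold on only 50 days of 2019 (made-up quantities).
sales_b = [float(i % 7 + 1) for i in range(50)]
mean_b, std_b = yearly_stats(sales_b, 2019)

# Cross-check against explicitly padding the year with zeros.
padded = sales_b + [0.0] * (365 - len(sales_b))
print(math.isclose(mean_b, statistics.mean(padded)))
print(math.isclose(std_b, statistics.pstdev(padded)))
```

In SQL this corresponds to computing SUM(sales) and SUM(sales * sales) per product and year and dividing both by the number of days in that year.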
I have a table that has an entry date and a completion date on it (records go back a couple of years), and I need to write a query that gives the average time completion takes for each month. I can get the overall average,
SELECT AVG ( t.completion_dt - t.entry_dt ) * 24
FROM table t;
or the average for a specific month,
SELECT AVG ( t.completion_dt - t.entry_dt ) * 24
FROM table t
WHERE t.entry_dt BETWEEN to_date('2018/12/01','yyyy/mm/dd') AND
to_date('2019/1/01', 'yyyy/mm/dd');
Is there a way to get the query to return the average for each month?
If you want to see each month/year independently (Jan 2020 on a different row from Jan 2019), you can group on your entry_dt field truncated to the month.
SELECT trunc(t.entry_dt,'MM'),
AVG ( t.completion_dt - t.entry_dt ) * 24
FROM table t
group by trunc(t.entry_dt,'MM');
If you want to average ALL January months together across multiple years, group on something like the to_char of your entry_dt field. Note that 'Month' returns the month name, which sorts alphabetically; use 'MM' instead if you need calendar order.
SELECT to_char(t.entry_dt, 'Month'),
AVG ( t.completion_dt - t.entry_dt ) * 24
FROM table t
group by to_char(t.entry_dt, 'Month');
I have a table like this:
select * from promet_3b;
Column DATUM is the date, ORGJED is the organisational unit, RGRUPA is our product type and KOLICINA is the quantity.
I would like to have a table like this:
1) Column MONTH (jan-feb-...)
2) COLUMN ORGJED
3) COLUMN RGRUPA
4) COLUMN KOL1 - sum of quantity up to the start of the month
5) COLUMN KOL2 - sum of quantity up to the end of the month
For example, if KOL2 is 300 in APRIL, that is also KOL1 in MAY.
This query should give you the desired result; please check it. (The table is called mlp1 here; substitute your own table name, e.g. promet_3b.)
SELECT ORGJED,RGRUPA, EXTRACT (MONTH FROM S.DATUM) CUR_MONTH,
EXTRACT (YEAR FROM S.DATUM) CUR_YEAR,
SUM (S.KOLICINA) KOLICINA,
NVL (
LAG (
SUM (S.KOLICINA))
OVER (PARTITION by ORGJED,RGRUPA
ORDER BY ORGJED,RGRUPA,
EXTRACT (YEAR FROM S.DATUM),
EXTRACT (MONTH FROM S.DATUM)),
0)
PREV_MONTH
FROM mlp1 s
GROUP BY ORGJED,RGRUPA,
EXTRACT (MONTH FROM S.DATUM),
EXTRACT (YEAR FROM S.DATUM)
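Note that LAG over the monthly sums returns the previous month's own total, not a cumulative one. If KOL1 and KOL2 are meant as running totals (so that KOL2 in April equals KOL1 in May), the relationship can be sketched as below. In Oracle this would typically be written with a windowed SUM(SUM(KOLICINA)) OVER (PARTITION BY ... ORDER BY ...); the Python version here only illustrates the arithmetic for a single ORGJED/RGRUPA pair, with made-up monthly quantities.

```python
from itertools import accumulate

# Made-up monthly quantities for one ORGJED / RGRUPA combination.
monthly = [("JAN", 100), ("FEB", 50), ("MAR", 70), ("APR", 80), ("MAY", 40)]
quantities = [qty for _, qty in monthly]

# KOL2: cumulative quantity up to the end of each month.
kol2 = list(accumulate(quantities))
# KOL1: cumulative quantity up to the start of each month, i.e. the
# previous month's KOL2 (zero for the first month).
kol1 = [0] + kol2[:-1]

for (month, _), start, end in zip(monthly, kol1, kol2):
    print(month, start, end)
```

With these numbers April ends at 300, and May starts at 300 and ends at 340.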