How to aggregate YTD measure dynamically - sql

I have a table with two fields, timestamp and count. The table has data since November 2016.
I have to set up a query that aggregates the YTD sum(count) daily for all years. I am not using the calendar year but a November-October year (named after the year it ends in), which ideally shouldn't change the logic:
2017: 11/01/2016-10/31/2017;
2018: 11/01/2017-10/31/2018;
2019: 11/01/2018-10/31/2019;
2020: 11/01/2019-10/31/2020
I want a query that calculates, on any given day, the aggregate YTD with November 1st as the start date. I tried this query:
select ytd_bucket
,sum(count_field) sum
from
(
select
timestamp_field,
count_field,
CASE
WHEN DATE(timestamp_field,"America/Los_Angeles") >= '2019-11-01' THEN '2020'
WHEN DATE(timestamp_field,"America/Los_Angeles") BETWEEN '2018-11-01' AND CAST(CONCAT('2019-',FORMAT_DATE('%m-%d', DATE(CURRENT_TIMESTAMP(),"America/Los_Angeles"))) AS DATE) THEN '2019'
WHEN DATE(timestamp_field,"America/Los_Angeles") BETWEEN '2017-11-01' AND CAST(CONCAT('2018-',FORMAT_DATE('%m-%d', DATE(CURRENT_TIMESTAMP(),"America/Los_Angeles"))) AS DATE) THEN '2018'
WHEN DATE(timestamp_field,"America/Los_Angeles") BETWEEN '2016-11-01' AND CAST(CONCAT('2017-',FORMAT_DATE('%m-%d', DATE(CURRENT_TIMESTAMP(),"America/Los_Angeles"))) AS DATE) THEN '2017'
ELSE NULL END as YTD_bucket
from table
)
group by 1
The above query does not aggregate the numbers at a YTD level. For the years prior to 2020 (ytd_bucket), the query aggregates the entire year's count.

Start by aggregating per day:
select date(timestamp_field, 'America/Los_Angeles') as dte,
sum(count_field) as daily_cnt
from t
group by dte;
Then, for the YTD, shift each date forward by two months so that November 1st lands on January 1st, and partition by the year of the shifted date:
select dte,
sum(count_field) as daily_cnt,
sum(sum(count_field)) over (partition by extract(year from date_add(dte, interval 2 month))
order by dte
) as running_cnt
from (select t.*,
date(timestamp_field, 'America/Los_Angeles') as dte
from t
) t
group by dte;
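The fiscal-year bucketing can be checked outside the database. This is a minimal Python sketch, assuming the November-October year from the question; the function names and the sample counts are made up for illustration:

```python
from collections import defaultdict
from datetime import date


def fiscal_year(d):
    """Return the November-October fiscal year that contains d."""
    # November and December belong to the fiscal year named after the
    # following calendar year (same effect as shifting the date by two months).
    return d.year + 1 if d.month >= 11 else d.year


def running_ytd(daily_counts):
    """Running total per day, restarting at each 1 November."""
    totals = defaultdict(int)
    result = {}
    for day in sorted(daily_counts):
        fy = fiscal_year(day)
        totals[fy] += daily_counts[day]
        result[day] = totals[fy]
    return result


# Hypothetical daily counts spanning a fiscal-year boundary.
sample = {
    date(2016, 11, 1): 5,
    date(2016, 12, 15): 3,
    date(2017, 10, 31): 2,
    date(2017, 11, 1): 7,
}
ytd = running_ytd(sample)
```

The running total carries across the calendar-year change in December and January, and restarts on 1 November 2017.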

Related

Remove Duplicates and show Total sales by year and month

I am trying to write a query that lists all 11 years, with the 12 months within each year, and the sales data for each month. Any suggestions? This is my query so far:
SELECT
distinct(extract(year from date)) as year
, sum(sale_dollars) as year_sales
from `project-1-349215.Dataset.sales`
group by date
It just creates a long list of over 2000 results, when I am expecting 132 at most, one for each month in the years.
If you get more results than you expected, you should change your GROUP BY clause.
You can try:
group by YEAR(date), MONTH(date)
or
group by EXTRACT(YEAR_MONTH FROM date)
GROUP BY takes a part of the date, in your case the year and month, collects all rows that match, and sums them up.
So GROUP BY date makes no sense whatsoever, as you don't want the sum for every day.
So use this:
SELECT
extract(year from date) as year
,extract(MONTH from date) as month
, sum(sale_dollars) as year_sales
from `project-1-349215.Dataset.sales`
group by 1,2
Or, where the dialect supports the YEAR_MONTH unit (MySQL does), you can combine year and month:
SELECT
extract(YEAR_MONTH from date) as year_month
, sum(sale_dollars) as year_sales
from `project-1-349215.Dataset.sales`
group by 1
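To see the effect of grouping by year and month instead of by the full date, here is a small sketch using Python's built-in sqlite3 module. The table, the column name sale_date, and the rows are invented for illustration, and SQLite's strftime stands in for EXTRACT, which SQLite does not have:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, sale_dollars REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2020-01-05", 10.0),
        ("2020-01-20", 5.0),
        ("2020-02-01", 7.0),
        ("2021-01-03", 4.0),
    ],
)

# One output row per (year, month) pair, not one per distinct date.
rows = conn.execute(
    """
    SELECT strftime('%Y', sale_date) AS year,
           strftime('%m', sale_date) AS month,
           SUM(sale_dollars)         AS month_sales
    FROM sales
    GROUP BY year, month
    ORDER BY year, month
    """
).fetchall()
```

Four input rows on four different dates collapse into three rows, because the two January 2020 sales share a group.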

How to GROUP BY Month-Year using BigQuery

I am trying to count the number of bus trips (with start and destinations) on a monthly basis (for several years) using a TIMESTAMP column/field. I can do this on a MONTH basis (TIMESTAMP_TRUNC(start_date, MONTH)), but I would like to do this on a MONTH-YEAR basis. Any help is appreciated. Thanks
You can use Standard SQL:
SELECT
FORMAT_DATE('%b-%Y', created_date) mon_year,
COUNT(1) AS `count`
FROM `project.dataset.table`
GROUP BY mon_year
ORDER BY PARSE_DATE('%b-%Y', mon_year)
If you are using a timestamp, you have to cast it to a date:
SELECT
FORMAT_DATE('%b-%Y', DATE(CURRENT_TIMESTAMP())) mon_year
will produce:
Sep-2020
As per your example from the comments: you can't use COUNT in a WHERE clause. If you want to filter on an aggregate, you have to use HAVING (see the docs).
SELECT TIMESTAMP_TRUNC(start_date, MONTH) AS year_month,
start_station_name,
end_station_name,
count(start_station_name) AS count_start
FROM `bigquery-public-data.san_francisco.bikeshare_trips`
WHERE start_station_name <> end_station_name
GROUP BY year_month,
start_station_name,
end_station_name
HAVING count(start_station_name) > 10
LIMIT 50
Your code should do what you want:
select timestamp_trunc(start_date, month) as yyyymm, count(*)
from t
group by yyyymm;
This includes both the year and month, so Jan 2020 is different from Jan 2019.
If you wanted just by month of the year, then use extract():
select extract(month from start_date) as mon, count(*)
from t
group by mon;
This would treat Jan 2020 as the same as Jan 2019.
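The difference between the two groupings can be seen on a handful of values. A rough Python sketch, with made-up trip start times:

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip start timestamps.
starts = [
    datetime(2019, 1, 5, 8, 0),
    datetime(2020, 1, 7, 9, 30),
    datetime(2020, 1, 9, 17, 0),
    datetime(2020, 2, 1, 12, 0),
]

# Like truncating to the month: the year is part of the key.
by_year_month = Counter((t.year, t.month) for t in starts)

# Like extracting only the month: all Januaries are merged.
by_month = Counter(t.month for t in starts)
```

Here by_year_month keeps January 2019 and January 2020 apart, while by_month counts all three January trips together.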

Count records for first day of every month in a year

I have a table with 4 columns and a huge number of records. It has the following structure:
DATE_ENTERED EMP_NAME DATA ORIGINATED
01-JAN-20 A 545454 APPLE
I want to count the number of records for the first day of every month in a year.
Is there any way to fetch the data for the first day of every month?
In Oracle, you can use the TRUNC function on the date as follows:
SELECT TRUNC(DATE_ENTERED), COUNT(1) AS CNT
FROM YOUR_TABLE
WHERE TRUNC(DATE_ENTERED) = TRUNC(DATE_ENTERED, 'MON')
GROUP BY TRUNC(DATE_ENTERED)
Please note that the TRUNC(DATE_ENTERED, 'MON') returns the first day of the month for DATE_ENTERED.
Cheers!!
SELECT Year, Month, COUNT(*)
FROM
(
SELECT
YEAR(DATE_ENTERED) Year,
MONTH(DATE_ENTERED) Month,
DAY(DATE_ENTERED) Day
FROM your_table
WHERE DAY(DATE_ENTERED) = 1
) A
GROUP BY Year, Month
WHERE DAY(DATE_ENTERED) = 1 keeps only the records dated on the first day of a month. You can then group by the YEAR and MONTH values to get a count for each month of each year.
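The filter-then-group idea can be tried on a toy table. This sketch uses Python's sqlite3 module with invented rows; strftime('%d', ...) plays the role of DAY() here, which is a SQLite substitution and not Oracle or SQL Server syntax:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (date_entered TEXT, emp_name TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        ("2020-01-01", "A"),
        ("2020-01-01", "B"),
        ("2020-01-15", "C"),
        ("2020-02-01", "D"),
        ("2020-03-02", "E"),
    ],
)

# Keep only rows dated on day 1, then count them per year and month.
first_day_counts = conn.execute(
    """
    SELECT strftime('%Y', date_entered) AS year,
           strftime('%m', date_entered) AS month,
           COUNT(*)                     AS cnt
    FROM your_table
    WHERE strftime('%d', date_entered) = '01'
    GROUP BY year, month
    ORDER BY year, month
    """
).fetchall()
```

March does not appear in the result, since its only row is dated the 2nd.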
You mean something like
SELECT COUNT(*)
FROM Table
WHERE DAY(DATE_ENTERED) = 1 AND
YEAR(DATE_ENTERED) = Some_Year
GROUP BY DATE_ENTERED
You can also use DATE_ENTERED BETWEEN 'YYYY0101' and 'YYYY1231' (replace the YYYY with the year you want to retrieve data for) instead of YEAR(DATE_ENTERED) = Some_Year, if performance is an issue.
You can use something like this:
select * from your_table
where DAY(DATE_ENTERED) = 1
and DATE_ENTERED between '2020-01-01' and '2020-12-31'
To get the count, use this:
select count(*) from your_table
where DAY(DATE_ENTERED)= 1
and DATE_ENTERED between '2020-01-01' and '2020-12-31'
UPDATE
select * from your_table where Extract(day FROM DATE_ENTERED) = 1 and DATE_ENTERED between '01-JAN-20 ' and '01-DEC-20 ';
this is how the data looks like:
For the list of records
select count(*) from your_table where Extract(day FROM DATE_ENTERED) = 1 and DATE_ENTERED between '01-JAN-20 ' and '01-DEC-20 ';
UPDATE-2
select EXTRACT(month from DATE_ENTERED) as mon,
to_char(DATE_ENTERED, 'Month') as month_name,
count(*) as cnt
from your_table
where Extract(day FROM DATE_ENTERED) = 1 and DATE_ENTERED between '01-JAN-20' and '01-DEC-20'
group by EXTRACT(month from DATE_ENTERED),
to_char(DATE_ENTERED, 'Month');

SQL Server / SSRS: Calculating monthly average based on grouping and historical values

I need to calculate an average based on historical data for a graph in SSRS:
Current Month
Previous Month
2 Months ago
6 Months ago
This query returns the average for each month:
SELECT
avg_val1, month, year
FROM
(SELECT
(sum_val1 / count) as avg_val1, month, year
FROM
(SELECT
SUM(val1) AS sum_val1, SUM(count) AS count, month, year
FROM
(SELECT
COUNT(val1) AS count, SUM(val1) AS val1,
MONTH([SnapshotDate]) AS month,
YEAR([SnapshotDate]) AS year
FROM
[DC].[dbo].[KPI_Values]
WHERE
[SnapshotKey] = 'Some text here'
AND No = '001'
AND Channel = '999'
GROUP BY
[SnapshotDate]) AS sub3
GROUP BY
month, year, count) AS sub2
GROUP BY sum_val1, count, month, year) AS sub1
ORDER BY
year, month ASC
When I add the following WHERE clause I get the average for March (2 months ago):
WHERE month = MONTH(GETDATE())-2
AND year = YEAR(GETDATE())
Now the problem is when I want to retrieve data from 6 months ago; MONTH(GETDATE()) - 6 will output -1 instead of 11. I also have an issue with the fact that the year changes to 2016, and I am a bit unsure of how to implement the logic in my query.
I think I might be going about this wrong... Any suggestions?
Subtract the months from the date using the DATEADD function before you do your comparison. For example:
WHERE SnapshotDate BETWEEN DATEADD(month, -6, GETDATE()) AND GETDATE()
MONTH(GETDATE()) returns an int, so subtracting from it can give 0 or a negative value. You need a user-defined scalar function to handle this, adding 12 (and subtracting 1 from the year) when the result is <= 0.
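The wrap-around arithmetic such a function has to do can be sketched in Python. The name months_ago is hypothetical, and the example dates assume the question was asked in May 2017, which is what "March (2 months ago)" and "the year changes to 2016" imply:

```python
def months_ago(year, month, n):
    """Return (year, month) for n months before the given year and month."""
    # Count months from year 0 so the subtraction can cross year boundaries.
    index = year * 12 + (month - 1) - n
    return index // 12, index % 12 + 1


two_back = months_ago(2017, 5, 2)
six_back = months_ago(2017, 5, 6)
```

Working on a single running month index avoids special-casing zero and negative month numbers, and the year adjusts itself.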

How to group quantities in a period of time

I'm trying to group quantities over a period of time. I have the following table:
SALES
SALES_DATE
SALES_ITEM
SALES_QUANTITY
The query I'm using is:
SELECT DATE,ITEM,SUM(QUANTITY)
FROM SALES
WHERE DATE BETWEEN "DATE1" AND "DATE2";
The problem is that I don't need the DATE to appear. If I look for the sales of October, the result should show the sum for October without showing the date. Thank you very much for your help.
Example:
What i get...
DATE ITEM SALES
2012-06-12 14152 7
2012-06-14 14152 15
2012-06-16 14157 25
What i need: query between 06-12 and 06-16
ITEM SALES
14152 22
14157 25
Thank you very much
If you want the sum by month, you can include that in the group by expression. Here is one way:
SELECT extract(year from DATE) as yr, extract(month from date) as mon, ITEM, SUM(QUANTITY)
FROM SALES
WHERE DATE BETWEEN "DATE1" AND "DATE2"
group by extract(year from DATE), extract(month from date), ITEM
order by 1, 2
Although extract is standard SQL, not all databases support it. For instance, you might use to_char(date, 'YYYY-MM') in Oracle or datepart(month, date) in SQL Server.
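For the output shown in the question (one total per item over the whole period, with no date column), grouping by the item alone is enough. A minimal sketch with Python's sqlite3 module, using the three sample rows from the question; the lowercase column names follow the table description, and the ISO date strings are an assumption about how dates are stored:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sales_date TEXT, sales_item INTEGER, sales_quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2012-06-12", 14152, 7),
        ("2012-06-14", 14152, 15),
        ("2012-06-16", 14157, 25),
    ],
)

# Filter on the date range, but group (and select) only the item.
totals = conn.execute(
    """
    SELECT sales_item, SUM(sales_quantity) AS total
    FROM sales
    WHERE sales_date BETWEEN ? AND ?
    GROUP BY sales_item
    ORDER BY sales_item
    """,
    ("2012-06-12", "2012-06-16"),
).fetchall()
```

The date is used only in the WHERE clause, so it filters the rows without appearing in the result.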