Use different fields based on condition - sql

I log the daily produced energy of my solar panels. Now I want to create a SQL statement to get the sum of produced energy for each month but separate columns for each year.
I came up with the following SQL statement:
SELECT LPAD(extract (month from inverterlogs_summary_daily.bucket)::text, 2, '0') as month,
sum(inverterlogs_summary_daily."EnergyProduced") as a2022
from inverterlogs_summary_daily
WHERE
inverterlogs_summary_daily.bucket >= '01.01.2022' and inverterlogs_summary_daily.bucket < '01.01.2023'
group by month
order by 1;
This results in only getting the values from 2022:
month a2022
1 100
2 358
3 495
How could I change the SQL statement to get new columns for each year? Is this even possible?
Result should look like this (with a new column for each year, wouldn't mind if I had to update the SQL statement every year):
month a2022 a2023
1 100 92
2 358 497
3 495 508

You can use conditional aggregation:
select extract(month from bucket) bucket_month,
sum("EnergyProduced") filter(where extract(year from bucket) = 2022) a_2022,
sum("EnergyProduced") filter(where extract(year from bucket) = 2021) a_2021
from inverterlogs_summary_daily
where bucket >= date '2021-01-01' and bucket < date '2023-01-01'
group by extract(month from bucket)
order by bucket_month
I assumed that bucket is of a timestamp-like datatype, and adapted the date arithmetic accordingly.
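Side note: the conditions in the filter clause can probably be optimized by comparing bucket against a half-open date range directly instead of calling extract(), at the cost of a lengthier expression: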
sum("EnergyProduced") filter(
where bucket >= date '2022-01-01' and bucket < date '2023-01-01'
) a_2022,

You can add a condition to the SUM
SELECT to_char(bucket, 'mm') as month,
sum(CASE WHEN extract (YEAR from inverterlogs_summary_daily.bucket) = 2022 then inverterlogs_summary_daily."EnergyProduced" END) as a2022,
sum(CASE WHEN extract (YEAR from inverterlogs_summary_daily.bucket) = 2023 then inverterlogs_summary_daily."EnergyProduced" END) as a2023
from inverterlogs_summary_daily
WHERE
inverterlogs_summary_daily.bucket >= '01.01.2022' and inverterlogs_summary_daily.bucket < '01.01.2024'
group by month
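The CASE-inside-SUM pattern above can be tried out without a PostgreSQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module; the table name daily, the column energy, and the sample rows are made up for illustration, and strftime stands in for extract/to_char because SQLite stores the dates as text here.

```python
import sqlite3

# Hypothetical sample rows: (day, energy produced). Values are illustrative only.
rows = [
    ("2022-01-10", 60), ("2022-01-20", 40),
    ("2022-02-05", 358),
    ("2023-01-15", 92),
    ("2023-02-01", 497),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (bucket TEXT, energy INTEGER)")
conn.executemany("INSERT INTO daily VALUES (?, ?)", rows)

# One SUM per year: rows from other years evaluate to NULL inside the CASE
# and are therefore ignored by SUM.
query = """
SELECT strftime('%m', bucket) AS month,
       SUM(CASE WHEN strftime('%Y', bucket) = '2022' THEN energy END) AS a2022,
       SUM(CASE WHEN strftime('%Y', bucket) = '2023' THEN energy END) AS a2023
FROM daily
WHERE bucket >= '2022-01-01' AND bucket < '2024-01-01'
GROUP BY month
ORDER BY month
"""
pivot = conn.execute(query).fetchall()
print(pivot)
```

Adding another year means adding one more SUM(CASE ...) column and widening the date range in the WHERE clause.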

Related

How to aggregate YTD measure dynamically

I have a table which has 2 fields timestamp and count. Table has data since 2016 November.
I have to set up a query which will daily aggregate the YTD sum(count) for all the years. I am not using calendar year definition but rather November-October (Next year). This shouldn't ideally change the logic
2017: 11/01/2016-10/31/2017;
2018: 11/01/2017-10/31/2018;
2019: 11/01/2018-10/31/2019;
2020: 11/01/2019-10/31/2020
I want a query that will calculate on any given day aggregate YTD with November 1st as the start date. I tried this query
select ytd_bucket
,sum(count_field) sum
from
(
select
timestamp_field,
count_field,
CASE
WHEN DATE(timestamp_field,"America/Los_Angeles") >= '2019-11-01' THEN '2020'
WHEN DATE(timestamp_field,"America/Los_Angeles") BETWEEN '2018-11-01' AND CAST(CONCAT('2019-',FORMAT_DATE('%m-%d', DATE(CURRENT_TIMESTAMP(),"America/Los_Angeles"))) AS DATE) THEN '2019'
WHEN DATE(timestamp_field,"America/Los_Angeles") BETWEEN '2017-11-01' AND CAST(CONCAT('2018-',FORMAT_DATE('%m-%d', DATE(CURRENT_TIMESTAMP(),"America/Los_Angeles"))) AS DATE) THEN '2018'
WHEN DATE(timestamp_field,"America/Los_Angeles") BETWEEN '2016-11-01' AND CAST(CONCAT('2017-',FORMAT_DATE('%m-%d', DATE(CURRENT_TIMESTAMP(),"America/Los_Angeles"))) AS DATE) THEN '2017'
ELSE NULL END as YTD_bucket
from table
)
group by 1
The above query does not aggregate the numbers at a YTD level. For the years prior to 2020 (ytd_bucket) the query is aggregating the entire year's count.
Start by aggregating per day:
select date(timestamp_field, 'America/Los_Angeles') as dte,
count(*)
from table
group by dte;
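Then, for the YTD, shift each date forward by two months so that November 1st lands on January 1st, and partition the running sum by the year of the shifted date:
select dte,
count(*),
sum(count(*)) over (partition by extract(year from date_add(dte, interval 2 month))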
order by min(timestamp_field)
) as running_cnt
from (select t.*,
date(timestamp_field, 'America/Los_Angeles') as dte
from t
) t
group by dte;
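The bucketing rule can be checked outside the database. Below is a minimal Python sketch, assuming the November-October year from the question: a date in November or December belongs to the next calendar year's bucket, which is the same as shifting the date two months forward and taking its year. The function names and the sample counts are hypothetical.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

def fiscal_year(d):
    """Label for a November-October year, e.g. 2016-11-01 belongs to 2017."""
    return d.year + 1 if d.month >= 11 else d.year

def running_ytd(daily_counts):
    """Map each day to the running total of counts within its fiscal year."""
    days_by_year = defaultdict(list)
    for day in sorted(daily_counts):
        days_by_year[fiscal_year(day)].append(day)
    result = {}
    for days in days_by_year.values():
        totals = accumulate(daily_counts[d] for d in days)
        result.update(zip(days, totals))
    return result

# Hypothetical daily counts, already aggregated per day.
daily = {
    date(2016, 10, 31): 5,
    date(2016, 11, 1): 3,
    date(2016, 12, 15): 4,
    date(2017, 10, 31): 2,
    date(2017, 11, 1): 7,
}
ytd = running_ytd(daily)
```

The running total restarts on every November 1st, so looking up the same month and day in each year gives comparable YTD figures.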

Using Date to find the inequality for sales than 500

I want to find the daily average sales for the month of December 1998, keeping the result only if it is not greater than 100, by means of a WHERE clause. The table consists of the date of each sale (something like 1 December 1998, with varying days, months and years), the amount due, and so on. First I'm going to define a particular month.
DEFINE a = TO_DATE('1-Dec-1998', 'DD-Month-YYYY')
SELECT SUBSTR(Sales_Date, 4,6), (SUM(Amount_Due)/EXTRACT(DAY FROM LAST_DAY(Sales_Date))
FROM ......
WHERE SUM(AMOUNT_DUE)/EXTRACT(DAY FROM LAST_DAY(&a)) < 100
I'm stuck on how to get the sum of the amount due in the month of December 1998 for the WHERE clause.
How can I achieve the objective?
To me, it looks like this:
select to_char(sales_date, 'mm.yyyy') month,
avg(amount_due) avg_value
from your_table
where sales_date >= trunc(date '1998-12-01', 'mm')
and sales_date < add_months(trunc(date '1998-12-01', 'mm'), 1)
group by to_char(sales_date, 'mm.yyyy')
having avg(amount_due) < 100;
The WHERE clause can be simplified; it is written this way to show how to fetch a certain period:
trunc to 'mm' returns the first day of that month
add_months applied to the above value (the first day of that month) returns the first day of the next month
the bottom line: give me all rows whose sales_date is >= the first day of this month and < the first day of the next month; in other words, the whole month
Finally, the where clause you used should actually be the having clause.
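The same half-open range idea (from the first day of the month, inclusive, up to the first day of the next month, exclusive) can be sketched in Python. The helper names below are hypothetical and only mirror what trunc(..., 'mm') and add_months(..., 1) do in the query above.

```python
from datetime import date

def month_bounds(d):
    """Return (first day of d's month, first day of the following month)."""
    first = d.replace(day=1)
    if first.month == 12:
        next_first = date(first.year + 1, 1, 1)
    else:
        next_first = date(first.year, first.month + 1, 1)
    return first, next_first

def in_month(sales_date, any_day_in_month):
    """True when sales_date falls in the same month as any_day_in_month."""
    start, end = month_bounds(any_day_in_month)
    return start <= sales_date < end

print(month_bounds(date(1998, 12, 17)))
```

Using an exclusive upper bound means the last day of the month never has to be computed, and values with a time-of-day part on the last day are still included.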
As long as the amount_due column only contains numbers, you can use the sum function.
The SQL queries below should satisfy your requirement.
Select SUM(Amount_Due) from Sales where Sales_Date >= DATE '1998-12-01' and Sales_Date < DATE '1999-01-01'
OR
Select SUM(Amount_Due) from Sales where TO_CHAR(Sales_Date, 'MM-YYYY') = '12-1998'

How to separate data columns by year using basic SQL (bigquery)

I am trying to create a visualization using bigquery and chartio. I want to display traffic volumes by day for each year to compare on one viz, to help identify seasonality.
I can break down the traffic by having a single column for traffic, another column for month and one for year, but this data structure doesn't work when I try to build the viz in chartio.
So what I am trying to do is to set a column for each year, where I have the traffic numbers set out by month. I am not sure of the way to do this, I know I probably need a union or a join here.
The code below combines the values, but doesn't get what I want.
Thanks in advance for the help!
SELECT
EXTRACT(MONTH FROM date) AS month,
EXTRACT(YEAR FROM date) AS year,
SUM(CAST(traffic AS INT64)) AS traffic
FROM
data.source
GROUP BY month, year
This is the output I get:
month year traffic
1 2017 11991865
3 2019 3482067
8 2017 21345567
6 2016 85207567
3 2018 22010756
What I want is:
month traffic_2016 traffic_2017
1 233391865 11991865
2 1123465 3482067
3 11996545 21345567
4 119916655 85207567
5 34571865 22010756
By using IF-ELSE / CASE WHEN statement with GROUP BY
SELECT
EXTRACT(MONTH FROM date) AS month,
SUM(IF(EXTRACT(YEAR FROM date) = 2016, CAST(traffic AS INT64), 0)) AS traffic_2016,
SUM(IF(EXTRACT(YEAR FROM date) = 2017, CAST(traffic AS INT64), 0)) AS traffic_2017
FROM
data.source
GROUP BY month
Simply with Join
SELECT
*
FROM
(SELECT
EXTRACT(MONTH FROM date) AS month,
SUM(CAST(traffic AS INT64)) AS traffic_2016
FROM
data.source
WHERE
EXTRACT(YEAR FROM date) = 2016
GROUP BY month)
JOIN
(SELECT
EXTRACT(MONTH FROM date) AS month,
SUM(CAST(traffic AS INT64)) AS traffic_2017
FROM
data.source
WHERE
EXTRACT(YEAR FROM date) = 2017
GROUP BY month)
USING(month)
Below is a version for BigQuery Standard SQL that is less verbose and easier to read, maintain, and extend with more columns:
#standardSQL
SELECT month,
SUM(IF(year = 2016, value, 0)) traffic_2016,
SUM(IF(year = 2017, value, 0)) traffic_2017,
SUM(IF(year = 2018, value, 0)) traffic_2018,
SUM(IF(year = 2019, value, 0)) traffic_2019
FROM `project.data.source`,
UNNEST([STRUCT(
EXTRACT(MONTH FROM `date`) AS month,
EXTRACT(YEAR FROM `date`) AS year,
CAST(traffic AS INT64) AS value
)])
GROUP BY month

SQL Server / SSRS: Calculating monthly average based on grouping and historical values

I need to calculate an average based on historical data for a graph in SSRS:
Current Month
Previous Month
2 Months ago
6 Months ago
This query returns the average for each month:
SELECT
avg_val1, month, year
FROM
(SELECT
(sum_val1 / count) as avg_val1, month, year
FROM
(SELECT
SUM(val1) AS sum_val1, SUM(count) AS count, month, year
FROM
(SELECT
COUNT(val1) AS count, SUM(val1) AS val1,
MONTH([SnapshotDate]) AS month,
YEAR([SnapshotDate]) AS year
FROM
[DC].[dbo].[KPI_Values]
WHERE
[SnapshotKey] = 'Some text here'
AND No = '001'
AND Channel = '999'
GROUP BY
[SnapshotDate]) AS sub3
GROUP BY
month, year, count) AS sub2
GROUP BY sum_val1, count, month, year) AS sub1
ORDER BY
year, month ASC
When I add the following WHERE clause I get the average for March (2 months ago):
WHERE month = MONTH(GETDATE())-2
AND year = YEAR(GETDATE())
Now the problem is when I want to retrieve data from 6 months ago; MONTH(GETDATE()) - 6 will output -1 instead of 11. I also have an issue with the fact that the year changes to 2016 and I am a bit unsure of how to implement the logic in my query.
I think I might be going about this wrong... Any suggestions?
Subtract the months from the date using the DATEADD function before you do your comparison. Ex:
WHERE SnapshotDate BETWEEN DATEADD(month, -6, GETDATE()) AND GETDATE()
MONTH(GETDATE()) returns an int, so subtracting from it can give 0 or a negative value. You need a scalar user-defined function to manage this, adding 12 when the result is <= 0 (and moving the year back by one in that case).
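The wrap-around logic such a function would need can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not SQL Server code; months_ago is a hypothetical name. It converts the year and month into a single running month index, subtracts, and converts back, so both the month and the year come out right.

```python
def months_ago(year, month, n):
    """Return the (year, month) pair that lies n months before year/month."""
    # Count months from year 0 with January as 0, so integer division and
    # modulo by 12 recover the year and the zero-based month.
    index = year * 12 + (month - 1) - n
    return index // 12, index % 12 + 1

print(months_ago(2017, 5, 2))  # (2017, 3)
print(months_ago(2017, 5, 6))  # (2016, 11)
```

In the query from the question, the resulting pair would replace the separate month = MONTH(GETDATE()) - n and year = YEAR(GETDATE()) comparisons.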

Retrieve records in a range using year and month fields in mysql

I have a mysql table with year (YEAR(4)) and month (TINYINT) columns. What is the best way to select all records in a range, e.g. 2009-10 .. 2010-02 ?
Here is a generalized solution:
Select...
From ...
Where ( Year = <StartYear> And Month >= <StartMonth> )
Or ( Year > <StartYear> And Year < <EndYear> )
Or ( Year = <EndYear> And Month <= <EndMonth> )
Use:
WHERE (year = 2009 AND month >= 10)
OR (year = 2010 AND month <= 2)
...or, using UNION ALL:
SELECT t.*
FROM TABLE t
WHERE year = 2009 AND month >= 10
UNION ALL
SELECT t.*
FROM TABLE t
WHERE year = 2010 AND month <= 2
UNION ALL is faster than UNION, but won't remove duplicates.
You may want to consider creating another column called year_month which stores the date in YYYYMM format, such as 201002 for February 2010. Make sure to create an index on it, and you may also create two triggers to automatically update the column ON INSERT and ON UPDATE.
Then you would be able to use:
WHERE year_month BETWEEN 200910 AND 201002
This would use the index on year_month.
The accepted answer does not work for the case when StartYear and EndYear are the same, e.g. 2017-01 to 2017-08. A revised pseudocode that covers this case would be
((year = StartYear and month >= StartMonth) OR (year > StartYear)) AND ((year = EndYear and month <= EndMonth) OR (year < EndYear))
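To check that the revised condition behaves as intended, here is a small Python sketch that writes it out as a function and compares it with the YYYYMM key approach from the earlier answer. Both function names are hypothetical; start and end are (year, month) pairs.

```python
def in_range_boolean(year, month, start, end):
    """The revised pseudocode above, written out literally."""
    (start_year, start_month), (end_year, end_month) = start, end
    after_start = (year == start_year and month >= start_month) or year > start_year
    before_end = (year == end_year and month <= end_month) or year < end_year
    return after_start and before_end

def in_range_key(year, month, start, end):
    """Same test using a single YYYYMM integer, like the year_month column."""
    key = year * 100 + month
    return start[0] * 100 + start[1] <= key <= end[0] * 100 + end[1]

# A range that crosses a year boundary and one that stays within one year.
print(in_range_boolean(2010, 1, (2009, 10), (2010, 2)))  # True
print(in_range_boolean(2017, 9, (2017, 1), (2017, 8)))   # False
```

The two functions agree for every year and month, which is why storing a combined year_month value lets a plain BETWEEN replace the nested conditions.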