A query for date within a year - sql

My table on Postgres looks like this. Note that every date falls on the 1st of the month, and there is only one entry per month and year.
SELECT * FROM "fis_historico_receita"
+----+------------+---------------+
| id | data | receita_bruta |
+----+------------+---------------+
| 1 | 2010-02-01 | 100000.0 |
| 2 | 2010-01-01 | 100000.0 |
| 3 | 2009-12-01 | 100000.0 |
| 4 | 2009-11-01 | 100000.0 |
| 5 | 2009-10-01 | 100000.0 |
| 6 | 2009-09-01 | 100000.0 |
| 7 | 2009-08-01 | 100000.0 |
| 8 | 2009-07-01 | 100000.0 |
| 9 | 2009-06-01 | 100000.0 |
| 10 | 2009-05-01 | 100000.0 |
| 11 | 2009-04-01 | 100000.0 |
| 12 | 2009-03-01 | 100000.0 |
| 13 | 2009-02-01 | 100000.0 |
| 14 | 2009-01-01 | 100000.0 |
| 15 | 2008-12-01 | 100000.0 |
+----+------------+---------------+
What I want is to find the 12 months immediately before the current one.
I tried this:
select *
from fis_historico_receita
where data in interval '1 year'
I would really like an answer that uses interval; +1 for every answer that runs on Postgres.

Try this:
select *
from fis_historico_receita
where data BETWEEN NOW() - interval '1 year' AND NOW()
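As a rough illustration of the window the question describes (the 12 whole months before the current one), here is a small Python sketch. The helper name previous_12_months and the fixed "today" of 2010-02-15 are assumptions chosen to match the sample table; in Postgres the same boundaries could probably be expressed with date_trunc('month', now()) and interval '1 year'.

```python
from datetime import date

def previous_12_months(today):
    """Return (start, end) for the 12 whole months before today's month.

    start is inclusive; end is exclusive (the first day of the current month).
    """
    end = today.replace(day=1)
    start = end.replace(year=end.year - 1)
    return start, end

# Dates copied from the question's table (always the 1st of a month).
sample = (
    [date(2010, 2, 1), date(2010, 1, 1)]
    + [date(2009, m, 1) for m in range(12, 0, -1)]
    + [date(2008, 12, 1)]
)

start, end = previous_12_months(date(2010, 2, 15))  # assumed "today"
selected = [d for d in sample if start <= d < end]
print(start, end, len(selected))
```

By contrast, BETWEEN NOW() - interval '1 year' AND NOW() is anchored to the current instant, so with one row per month it would include the current month's row instead of the row from 12 months back.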

Related

Solar-Heating: Data analytics for Grafana, advanced query

I would need some help with a very specific use case I have for my homelab.
I have some solar panels on my roof, and I extract a lot of data points to my server. I am using a specific app for that (iobroker), which makes it easy to consume the data and automate things with it. I save the data into a Postgres database. (No questions please about why not Influx or TimescaleDB; Postgres is what I need to live with...)
I run everything on Docker right now, and it works perfectly. While I was able to create numerous dashboards in Grafana and display everything I like there, there is one specific "thing" I have been unable to do, and after months of trying I am finally asking for help. I have a device that supports my heating by using generated power to warm up the water. It uses energy that we would normally feed back to the grid. The device updates the power it pushes to the heating device pretty much every second, and I pull the data from the device every second as well. However, I have the logging configured so that it only logs a data point when the value differs from the previous one.
One example:
Time | consumption in W
---------------------+------------------
2018-02-21 12:00:00 | 3500
2018-02-21 12:00:01 | 1470
2018-02-21 12:00:02 | 1470
2018-02-21 12:00:03 | 1470
2018-02-21 12:00:04 | 1600
The second and third entry with the value of "1470" would not exist!
So the first issue I have is missing data points. What I would like to achieve is a calculation showing the consumption by individual day, month, year and all-time.
This does not need to happen inside Grafana, and I don't think Grafana can do this at all. There are options to do similar things in Grafana, but they do not provide an accurate result ($__unixEpochGroupAlias(ts,1s,previous)). I have every option needed to create the data and store it back in the DB, so nothing should constrain your ideas.
The data is polled/stored every 1000ms, so every second. Idea is to use Ws (Watt-seconds) to easily calculate with accurate numbers, as well as to display them better in Wh or kWh.
The DB can only be queried with SQL, but as mentioned, if the calculations need to be done in a different language, that is also fine.
I have tried everything I could think of: SQL queries, searching numerous posts, all available SQL-based Grafana options. I guess I need custom code, but that is above my skillset.
Anything more you'd need to know? Let me know. Thanks in advance!
The data structure looks as follows:
id=entry for the application to identify the datapoint
ts=timestamp
val=value in Ws
The other values are not important, but I wanted to show them for completeness.
id | ts | val | ack | _from | q
----+---------------+------+-----+-------+---
23 | 1661439981910 | 1826 | t | 3 | 0
23 | 1661439982967 | 1830 | t | 3 | 0
23 | 1661439984027 | 1830 | t | 3 | 0
23 | 1661439988263 | 1828 | t | 3 | 0
23 | 1661439985088 | 1829 | t | 3 | 0
23 | 1661439987203 | 1829 | t | 3 | 0
23 | 1661439989322 | 1831 | t | 3 | 0
23 | 1661439990380 | 1830 | t | 3 | 0
23 | 1661439991439 | 1827 | t | 3 | 0
23 | 1661439992498 | 1829 | t | 3 | 0
23 | 1661440021097 | 1911 | t | 3 | 0
23 | 1661439993558 | 1830 | t | 3 | 0
23 | 1661440022156 | 1924 | t | 3 | 0
23 | 1661439994624 | 1830 | t | 3 | 0
23 | 1661440023214 | 1925 | t | 3 | 0
23 | 1661439995683 | 1828 | t | 3 | 0
23 | 1661440024273 | 1924 | t | 3 | 0
23 | 1661439996739 | 1830 | t | 3 | 0
23 | 1661440025332 | 1925 | t | 3 | 0
23 | 1661440052900 | 1694 | t | 3 | 0
23 | 1661439997797 | 1831 | t | 3 | 0
23 | 1661440026391 | 1927 | t | 3 | 0
23 | 1661439998855 | 1831 | t | 3 | 0
23 | 1661440027450 | 1925 | t | 3 | 0
23 | 1661439999913 | 1828 | t | 3 | 0
23 | 1661440028509 | 1925 | t | 3 | 0
23 | 1661440029569 | 1927 | t | 3 | 0
23 | 1661440000971 | 1830 | t | 3 | 0
23 | 1661440030634 | 1926 | t | 3 | 0
23 | 1661440002030 | 1838 | t | 3 | 0
23 | 1661440031694 | 1925 | t | 3 | 0
23 | 1661440053955 | 1692 | t | 3 | 0
23 | 1659399542399 | 0 | t | 3 | 0
23 | 1659399543455 | 1 | t | 3 | 0
23 | 1659399544511 | 0 | t | 3 | 0
23 | 1663581880895 | 2813 | t | 3 | 0
23 | 1663581883017 | 2286 | t | 3 | 0
23 | 1663581881952 | 2646 | t | 3 | 0
23 | 1663581884074 | 1905 | t | 3 | 0
23 | 1661440004144 | 1838 | t | 3 | 0
23 | 1661440032752 | 1926 | t | 3 | 0
23 | 1661440005202 | 1839 | t | 3 | 0
23 | 1661440034870 | 1924 | t | 3 | 0
23 | 1661440006260 | 1840 | t | 3 | 0
23 | 1661440035929 | 1922 | t | 3 | 0
23 | 1661440007318 | 1840 | t | 3 | 0
23 | 1661440036987 | 1918 | t | 3 | 0
23 | 1661440008377 | 1838 | t | 3 | 0
23 | 1661440038045 | 1919 | t | 3 | 0
23 | 1661440009437 | 1839 | t | 3 | 0
23 | 1661440039104 | 1900 | t | 3 | 0
23 | 1661440010495 | 1839 | t | 3 | 0
23 | 1661440040162 | 1877 | t | 3 | 0
23 | 1661440011556 | 1838 | t | 3 | 0
23 | 1661440041220 | 1862 | t | 3 | 0
23 | 1661440012629 | 1840 | t | 3 | 0
23 | 1661440042279 | 1847 | t | 3 | 0
23 | 1661440013687 | 1840 | t | 3 | 0
23 | 1661440043340 | 1829 | t | 3 | 0
23 | 1661440014746 | 1833 | t | 3 | 0
23 | 1661440044435 | 1817 | t | 3 | 0
23 | 1661440015804 | 1833 | t | 3 | 0
23 | 1661440045493 | 1789 | t | 3 | 0
23 | 1661440046551 | 1766 | t | 3 | 0
23 | 1661440016862 | 1846 | t | 3 | 0
23 | 1661440047610 | 1736 | t | 3 | 0
23 | 1661440048670 | 1705 | t | 3 | 0
23 | 1661440017920 | 1863 | t | 3 | 0
23 | 1661440049726 | 1694 | t | 3 | 0
23 | 1661440050783 | 1694 | t | 3 | 0
23 | 1661440018981 | 1876 | t | 3 | 0
23 | 1661440051840 | 1696 | t | 3 | 0
23 | 1661440055015 | 1692 | t | 3 | 0
23 | 1661440056071 | 1693 | t | 3 | 0
23 | 1661440322966 | 1916 | t | 3 | 0
23 | 1661440325082 | 1916 | t | 3 | 0
23 | 1661440326142 | 1926 | t | 3 | 0
23 | 1661440057131 | 1693 | t | 3 | 0
23 | 1661440327199 | 1913 | t | 3 | 0
23 | 1661440058189 | 1692 | t | 3 | 0
23 | 1661440328256 | 1915 | t | 3 | 0
23 | 1661440059247 | 1691 | t | 3 | 0
23 | 1661440329315 | 1923 | t | 3 | 0
23 | 1661440060306 | 1692 | t | 3 | 0
23 | 1661440330376 | 1912 | t | 3 | 0
23 | 1661440061363 | 1676 | t | 3 | 0
23 | 1661440331470 | 1913 | t | 3 | 0
23 | 1661440062437 | 1664 | t | 3 | 0
23 | 1663581885133 | 1678 | t | 3 | 0
23 | 1661440332530 | 1923 | t | 3 | 0
23 | 1661440064552 | 1667 | t | 3 | 0
23 | 1661440334647 | 1915 | t | 3 | 0
23 | 1661440335708 | 1913 | t | 3 | 0
23 | 1661440065608 | 1665 | t | 3 | 0
23 | 1661440066665 | 1668 | t | 3 | 0
23 | 1661440336763 | 1912 | t | 3 | 0
23 | 1661440337822 | 1913 | t | 3 | 0
23 | 1661440338879 | 1911 | t | 3 | 0
23 | 1661440068780 | 1664 | t | 3 | 0
23 | 1661440339939 | 1912 | t | 3 | 0
(100 rows)
iobroker=# \d ts_number
Table "public.ts_number"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id | integer | | not null |
ts | bigint | | not null |
val | real | | |
ack | boolean | | |
_from | integer | | |
q | integer | | |
Indexes:
"ts_number_pkey" PRIMARY KEY, btree (id, ts)
You can do this with a mix of generate_series() and some window functions.
First we use generate_series() to get all the second timestamps in a desired range. Then we join to our readings to find what consumption values we have. Group nulls with their most recent non-null reading. Then set the consumption the same for the whole group.
So: if we have readings like this:
richardh=> SELECT * FROM readings;
id | ts | consumption
----+------------------------+-------------
1 | 2023-02-16 20:29:13+00 | 900
2 | 2023-02-16 20:29:16+00 | 1000
3 | 2023-02-16 20:29:20+00 | 925
(3 rows)
We can get all of the seconds we might want like this:
richardh=> SELECT generate_series(timestamptz '2023-02-16 20:29:13+00', timestamptz '2023-02-16 20:29:30+00', interval '1 second');
generate_series
------------------------
2023-02-16 20:29:13+00
2023-02-16 20:29:14+00
...etc...
2023-02-16 20:29:29+00
2023-02-16 20:29:30+00
(18 rows)
Then we join our complete set of timestamps to our readings:
WITH wanted_timestamps (ts) AS (
SELECT generate_series(timestamptz '2023-02-16 20:29:13+00', timestamptz '2023-02-16 20:29:30+00', interval '1 second')
)
SELECT
wt.ts
, r.consumption
, sum(CASE WHEN r.consumption IS NOT NULL THEN 1 ELSE 0 END)
OVER (ORDER BY ts) AS group_num
FROM
wanted_timestamps wt
LEFT JOIN readings r USING (ts)
ORDER BY wt.ts;
ts | consumption | group_num
------------------------+-------------+-----------
2023-02-16 20:29:13+00 | 900 | 1
2023-02-16 20:29:14+00 | | 1
2023-02-16 20:29:15+00 | | 1
2023-02-16 20:29:16+00 | 1000 | 2
2023-02-16 20:29:17+00 | | 2
2023-02-16 20:29:18+00 | | 2
2023-02-16 20:29:19+00 | | 2
2023-02-16 20:29:20+00 | 925 | 3
2023-02-16 20:29:21+00 | | 3
2023-02-16 20:29:22+00 | | 3
2023-02-16 20:29:23+00 | | 3
2023-02-16 20:29:24+00 | | 3
2023-02-16 20:29:25+00 | | 3
2023-02-16 20:29:26+00 | | 3
2023-02-16 20:29:27+00 | | 3
2023-02-16 20:29:28+00 | | 3
2023-02-16 20:29:29+00 | | 3
2023-02-16 20:29:30+00 | | 3
(18 rows)
Finally, fill in the missing consumption values:
WITH wanted_timestamps (ts) AS (
SELECT generate_series(timestamptz '2023-02-16 20:29:13+00', timestamptz '2023-02-16 20:29:30+00', interval '1 second')
), grouped_values AS (
SELECT
wt.ts
, r.consumption
, sum(CASE WHEN r.consumption IS NOT NULL THEN 1 ELSE 0 END)
OVER (ORDER BY ts) AS group_num
FROM wanted_timestamps wt
LEFT JOIN readings r USING (ts)
)
SELECT
gv.ts
, first_value(gv.consumption) OVER (PARTITION BY group_num ORDER BY gv.ts)
AS consumption
FROM
grouped_values gv
ORDER BY ts;
ts | consumption
------------------------+-------------
2023-02-16 20:29:13+00 | 900
2023-02-16 20:29:14+00 | 900
2023-02-16 20:29:15+00 | 900
2023-02-16 20:29:16+00 | 1000
2023-02-16 20:29:17+00 | 1000
2023-02-16 20:29:18+00 | 1000
2023-02-16 20:29:19+00 | 1000
2023-02-16 20:29:20+00 | 925
2023-02-16 20:29:21+00 | 925
2023-02-16 20:29:22+00 | 925
2023-02-16 20:29:23+00 | 925
2023-02-16 20:29:24+00 | 925
2023-02-16 20:29:25+00 | 925
2023-02-16 20:29:26+00 | 925
2023-02-16 20:29:27+00 | 925
2023-02-16 20:29:28+00 | 925
2023-02-16 20:29:29+00 | 925
2023-02-16 20:29:30+00 | 925
(18 rows)
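To connect the filled series back to the question's goal (consumption per day), here is a minimal Python sketch of the same carry-forward idea, followed by a per-day sum in watt-seconds. It reuses the three sample readings above; the helper names fill_forward and utc are hypothetical, and it assumes each filled row stands for exactly one second of consumption.

```python
from datetime import datetime, timedelta, timezone

def fill_forward(readings, start, end):
    """Yield (ts, value) for every second in [start, end],
    carrying the most recent reading forward over the gaps."""
    by_ts = dict(readings)
    current = None
    ts = start
    while ts <= end:
        current = by_ts.get(ts, current)
        yield ts, current
        ts += timedelta(seconds=1)

def utc(hour, minute, second):
    # Hypothetical helper: a timestamp on the sample day, in UTC.
    return datetime(2023, 2, 16, hour, minute, second, tzinfo=timezone.utc)

# The three change-only readings from the example table.
readings = [(utc(20, 29, 13), 900), (utc(20, 29, 16), 1000), (utc(20, 29, 20), 925)]
filled = list(fill_forward(readings, utc(20, 29, 13), utc(20, 29, 30)))

# Each filled row counts as one second, so watts * 1 s = watt-seconds.
energy_ws_by_day = {}
for ts, watts in filled:
    if watts is not None:
        day = ts.date()
        energy_ws_by_day[day] = energy_ws_by_day.get(day, 0) + watts

for day, ws in energy_ws_by_day.items():
    print(day, ws, "Ws =", ws / 3600, "Wh")
```

The same per-day sum could presumably be done in SQL by wrapping the final query in another CTE and grouping on the date part of ts; the Python version is only meant to make the arithmetic easy to check by hand.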

Redshift SQL - Count Sequences of Repeating Values Within Groups

I have a table that looks like this:
| id | date_start | gap_7_days |
| -- | ------------------- | --------------- |
| 1 | 2021-06-10 00:00:00 | 0 |
| 1 | 2021-06-13 00:00:00 | 0 |
| 1 | 2021-06-19 00:00:00 | 0 |
| 1 | 2021-06-27 00:00:00 | 0 |
| 2 | 2021-07-04 00:00:00 | 1 |
| 2 | 2021-07-11 00:00:00 | 1 |
| 2 | 2021-07-18 00:00:00 | 1 |
| 2 | 2021-07-25 00:00:00 | 1 |
| 2 | 2021-08-01 00:00:00 | 1 |
| 2 | 2021-08-08 00:00:00 | 1 |
| 2 | 2021-08-09 00:00:00 | 0 |
| 2 | 2021-08-16 00:00:00 | 1 |
| 2 | 2021-08-23 00:00:00 | 1 |
| 2 | 2021-08-30 00:00:00 | 1 |
| 2 | 2021-08-31 00:00:00 | 0 |
| 2 | 2021-09-01 00:00:00 | 0 |
| 2 | 2021-08-08 00:00:00 | 1 |
| 2 | 2021-08-15 00:00:00 | 1 |
| 2 | 2021-08-22 00:00:00 | 1 |
| 2 | 2021-08-23 00:00:00 | 1 |
For each ID, I check whether consecutive date_start values are 7 days apart, and put a 1 or 0 in gap_7_days accordingly.
I want to do the following (using Redshift SQL only):
Get the length of each sequence of consecutive 1s in gap_7_days for each ID
Expected output:
| id | date_start | gap_7_days | sequence_length |
| -- | ------------------- | --------------- | --------------- |
| 1 | 2021-06-10 00:00:00 | 0 | |
| 1 | 2021-06-13 00:00:00 | 0 | |
| 1 | 2021-06-19 00:00:00 | 0 | |
| 1 | 2021-06-27 00:00:00 | 0 | |
| 2 | 2021-07-04 00:00:00 | 1 | 6 |
| 2 | 2021-07-11 00:00:00 | 1 | 6 |
| 2 | 2021-07-18 00:00:00 | 1 | 6 |
| 2 | 2021-07-25 00:00:00 | 1 | 6 |
| 2 | 2021-08-01 00:00:00 | 1 | 6 |
| 2 | 2021-08-08 00:00:00 | 1 | 6 |
| 2 | 2021-08-09 00:00:00 | 0 | |
| 2 | 2021-08-16 00:00:00 | 1 | 3 |
| 2 | 2021-08-23 00:00:00 | 1 | 3 |
| 2 | 2021-08-30 00:00:00 | 1 | 3 |
| 2 | 2021-08-31 00:00:00 | 0 | |
| 2 | 2021-09-01 00:00:00 | 0 | |
| 2 | 2021-08-08 00:00:00 | 1 | 4 |
| 2 | 2021-08-15 00:00:00 | 1 | 4 |
| 2 | 2021-08-22 00:00:00 | 1 | 4 |
| 2 | 2021-08-23 00:00:00 | 1 | 4 |
Get the number of sequences for each ID
Expected output:
| id | num_sequences |
| -- | ------------------- |
| 1 | 0 |
| 2 | 3 |
How can I achieve this?
If you want the number of sequences, just look at the previous value. When the current value is "1" and the previous is NULL or 0, then you have a new sequence.
So:
select id,
sum( (gap_7_days = 1 and coalesce(prev_gap_7_days, 0) = 0)::int ) as num_sequences
from (select t.*,
lag(gap_7_days) over (partition by id order by date_start) as prev_gap_7_days
from t
) t
group by id;
If you actually want the lengths of the sequences, as in the intermediate results, then ask a new question. That information is not needed for this question.
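For readers who want to check the logic outside the database, here is a small Python sketch of both ideas: counting sequence starts the way the lag() query does, and measuring run lengths. The function names are hypothetical, and the flags are taken from the question's table in the order the rows are listed there (an assumption, since the sample dates for id 2 are not strictly increasing).

```python
from itertools import groupby

def count_sequences(flags):
    """Count positions where a 1 follows a 0 or the start of the list.
    This mirrors the lag()-based test: current = 1 and previous in (NULL, 0)."""
    previous = [0] + list(flags[:-1])
    return sum(1 for cur, prev in zip(flags, previous) if cur == 1 and prev == 0)

def sequence_lengths(flags):
    """Return the length of every run of consecutive 1s, in order."""
    return [len(list(run)) for value, run in groupby(flags) if value == 1]

# gap_7_days values per id, in the row order shown in the question.
flags_by_id = {
    1: [0, 0, 0, 0],
    2: [1] * 6 + [0] + [1] * 3 + [0, 0] + [1] * 4,
}

for key, flags in flags_by_id.items():
    print(key, count_sequences(flags), sequence_lengths(flags))
```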

SQL window excluding current group?

I'm trying to provide rolled up summaries of the following data including only the group in question as well as excluding the group. I think this can be done with a window function, but I'm having problems with getting the syntax down (in my case Hive SQL).
I want the following data to be aggregated
+------------+---------+--------+
| date | product | rating |
+------------+---------+--------+
| 2018-01-01 | A | 1 |
| 2018-01-02 | A | 3 |
| 2018-01-20 | A | 4 |
| 2018-01-27 | A | 5 |
| 2018-01-29 | A | 4 |
| 2018-02-01 | A | 5 |
| 2017-01-09 | B | NULL |
| 2017-01-12 | B | 3 |
| 2017-01-15 | B | 4 |
| 2017-01-28 | B | 4 |
| 2017-07-21 | B | 2 |
| 2017-09-21 | B | 5 |
| 2017-09-13 | C | 3 |
| 2017-09-14 | C | 4 |
| 2017-09-15 | C | 5 |
| 2017-09-16 | C | 5 |
| 2018-04-01 | C | 2 |
| 2018-01-13 | D | 1 |
| 2018-01-14 | D | 2 |
| 2018-01-24 | D | 3 |
| 2018-01-31 | D | 4 |
+------------+---------+--------+
Aggregated results:
+------+-------+---------+----+------------+------------------+----------+
| year | month | product | ct | avg_rating | avg_rating_other | other_ct |
+------+-------+---------+----+------------+------------------+----------+
| 2018 | 1 | A | 5 | 3.4 | 2.5 | 4 |
| 2018 | 2 | A | 1 | 5 | NULL | 0 |
| 2017 | 1 | B | 4 | 3.6666667 | NULL | 0 |
| 2017 | 7 | B | 1 | 2 | NULL | 0 |
| 2017 | 9 | B | 1 | 5 | 4.25 | 4 |
| 2017 | 9 | C | 4 | 4.25 | 5 | 1 |
| 2018 | 4 | C | 1 | 2 | NULL | 0 |
| 2018 | 1 | D | 4 | 2.5 | 3.4 | 5 |
+------+-------+---------+----+------------+------------------+----------+
I've also considered producing two aggregates, one with the product in question and one without, but I'm having trouble creating the appropriate joining key.
You can do:
select year(date), month(date), product,
count(*) as ct, avg(rating) as avg_rating,
sum(count(*)) over (partition by year(date), month(date)) - count(*) as other_ct,
((sum(sum(rating)) over (partition by year(date), month(date)) - sum(rating)) /
(sum(count(rating)) over (partition by year(date), month(date)) - count(rating))
) as avg_rating_other
from t
group by year(date), month(date), product;
The rating for the "other" products is a bit tricky. You need to add everything up for the month and subtract out the current group, then calculate the average as that sum divided by the matching count. The denominator uses count(rating) rather than count(*) so that rows with a NULL rating do not dilute the average.
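The "month total minus current group" arithmetic can be checked with a short Python sketch. This is not the Hive query itself, only the same calculation on the question's sample rows; the variable names are made up for illustration, and None stands in for NULL.

```python
from collections import defaultdict

# (date, product, rating) rows copied from the question; None stands for NULL.
rows = [
    ("2018-01-01", "A", 1), ("2018-01-02", "A", 3), ("2018-01-20", "A", 4),
    ("2018-01-27", "A", 5), ("2018-01-29", "A", 4), ("2018-02-01", "A", 5),
    ("2017-01-09", "B", None), ("2017-01-12", "B", 3), ("2017-01-15", "B", 4),
    ("2017-01-28", "B", 4), ("2017-07-21", "B", 2), ("2017-09-21", "B", 5),
    ("2017-09-13", "C", 3), ("2017-09-14", "C", 4), ("2017-09-15", "C", 5),
    ("2017-09-16", "C", 5), ("2018-04-01", "C", 2),
    ("2018-01-13", "D", 1), ("2018-01-14", "D", 2), ("2018-01-24", "D", 3),
    ("2018-01-31", "D", 4),
]

# Accumulators hold [row count, rating sum, count of non-NULL ratings].
per_group = defaultdict(lambda: [0, 0, 0])   # keyed by (year, month, product)
per_month = defaultdict(lambda: [0, 0, 0])   # keyed by (year, month)

for day, product, rating in rows:
    ym = (int(day[:4]), int(day[5:7]))
    for acc in (per_group[ym + (product,)], per_month[ym]):
        acc[0] += 1
        if rating is not None:
            acc[1] += rating
            acc[2] += 1

summary = {}
for (year, month, product), (ct, total, rated) in per_group.items():
    m_ct, m_total, m_rated = per_month[(year, month)]
    other_rated = m_rated - rated
    # Month total minus this group's total, divided by the other rated rows.
    avg_other = (m_total - total) / other_rated if other_rated else None
    avg_rating = total / rated if rated else None
    summary[(year, month, product)] = (ct, avg_rating, avg_other, m_ct - ct)

for key in sorted(summary):
    print(key, summary[key])
```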

How to Group by 6 days in Postgresql

I want to group this kind of data into 6-day buckets.
+-----+--------------+------------+
| gid | cnt | date |
+-----+--------------+------------+
| 1 | 1 | 2012-02-05 |
| 2 | 2 | 2012-02-06 |
| 3 | 1 | 2012-02-07 |
| 4 | 1 | 2012-02-08 |
| 5 | 1 | 2012-02-09 |
| 6 | 2 | 2012-02-10 |
| 7 | 3 | 2012-02-11 |
| 8 | 1 | 2012-02-12 |
| 9 | 1 | 2012-02-13 |
| 10 | 2 | 2012-02-14 |
| 11 | 3 | 2012-02-15 |
| 12 | 4 | 2012-02-16 |
| 13 | 1 | 2012-02-17 |
| 14 | 1 | 2012-02-18 |
| 15 | 1 | 2012-02-19 |
| 16 | NULL | 2012-02-20 |
| 17 | 6 | 2012-02-21 |
| 18 | NULL | 2012-02-22 |
+-----+--------------+------------+
↓↓↓↓↓↓↓↓↓↓↓↓↓↓
The dates are continuous, with one row per day.
If I understand correctly you need something like this:
WITH x AS (
SELECT date::date, (random() * 3)::int AS cnt
FROM generate_series('2012-02-05'::date, '2012-02-22'::date, '1 day'::interval) AS date
)
SELECT start::date,
(start + '5 day'::interval)::date AS end,
sum(cnt)
FROM generate_series(
(SELECT min(date) FROM x),
(SELECT max(date) FROM x),
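'6 day'::interval
) AS start
LEFT JOIN x ON (x.date >= start AND x.date < start + '6 day'::interval)
GROUP BY 1, 2
ORDER BY 1
In x I emulate your table. Each bucket runs from start through start + 5 days, which is 6 days in total, so the series has to step by 6 days; stepping by 5 would count each boundary day in two buckets.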
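As a cross-check on the bucketing, here is a minimal Python sketch that assigns each day to a non-overlapping 6-day bucket counted from the earliest date. It uses the cnt values from the question's table rather than random ones; the function name six_day_buckets is hypothetical, and NULL counts are treated as 0 (so an all-NULL bucket would show 0 here, where SQL's sum would return NULL).

```python
from datetime import date, timedelta

# cnt values copied from the question, one per day from 2012-02-05; None is NULL.
counts = [1, 2, 1, 1, 1, 2, 3, 1, 1, 2, 3, 4, 1, 1, 1, None, 6, None]
first_day = date(2012, 2, 5)
rows = [(first_day + timedelta(days=i), c) for i, c in enumerate(counts)]

def six_day_buckets(rows, size=6):
    """Return (bucket_start, bucket_end, total) tuples for consecutive
    non-overlapping buckets of `size` days, starting at the earliest date."""
    origin = min(d for d, _ in rows)
    totals = {}
    for d, cnt in rows:
        # Integer division maps each day onto the start of its bucket.
        start = origin + timedelta(days=(d - origin).days // size * size)
        totals[start] = totals.get(start, 0) + (cnt or 0)
    return [(s, s + timedelta(days=size - 1), t) for s, t in sorted(totals.items())]

buckets = six_day_buckets(rows)
for start, end, total in buckets:
    print(start, end, total)
```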

Count rows each month of a year - SQL Server

I have a table "Product" like this:
| ProductId | ProductCatId | Price | Date | Deadline |
--------------------------------------------------------------------
| 1 | 1 | 10.00 | 2016-01-01 | 2016-01-27 |
| 2 | 2 | 10.00 | 2016-02-01 | 2016-02-27 |
| 3 | 3 | 10.00 | 2016-03-01 | 2016-03-27 |
| 4 | 1 | 10.00 | 2016-04-01 | 2016-04-27 |
| 5 | 3 | 10.00 | 2016-05-01 | 2016-05-27 |
| 6 | 3 | 10.00 | 2016-06-01 | 2016-06-27 |
| 7 | 1 | 20.00 | 2016-01-01 | 2016-01-27 |
| 8 | 2 | 30.00 | 2016-02-01 | 2016-02-27 |
| 9 | 1 | 40.00 | 2016-03-01 | 2016-03-27 |
| 10 | 4 | 15.00 | 2016-04-01 | 2016-04-27 |
| 11 | 1 | 25.00 | 2016-05-01 | 2016-05-27 |
| 12 | 5 | 55.00 | 2016-06-01 | 2016-06-27 |
| 13 | 5 | 55.00 | 2016-06-01 | 2016-01-27 |
| 14 | 5 | 55.00 | 2016-06-01 | 2016-02-27 |
| 15 | 5 | 55.00 | 2016-06-01 | 2016-03-27 |
I want to create a stored procedure that counts the Product rows for each month, restricted to the current year, like this:
| Month| SumProducts | SumExpiredProducts |
-------------------------------------------
| 1 | 3 | 3 |
| 2 | 3 | 3 |
| 3 | 3 | 3 |
| 4 | 2 | 2 |
| 5 | 2 | 2 |
| 6 | 2 | 2 |
What should I do?
You can use a query like the following:
SELECT MONTH([Date]) AS [Month],
COUNT(*) AS SumProducts ,
COUNT(CASE WHEN [Date] > Deadline THEN 1 END) AS SumExpiredProducts
FROM mytable
WHERE YEAR([Date]) = YEAR(GETDATE())
GROUP BY MONTH([Date])