Remove the part of a time series before the last sequence of zeros using SQL

I have a time series as follows:
Day        Data
1/1/2020   0
2/1/2020   .2
3/1/2020   0
...        ...
1/2/2020   0
2/2/2020   0
3/2/2020   .2
4/2/2020   .3
5/2/2020   0
6/2/2020   0
7/2/2020   0
8/2/2020   2
9/2/2020   2.4
10/2/2020  3
I want to filter the data so that it only shows the rows after the final sequence of zeros in the time series. In this case I want to get only the data from 8/2/2020 onward.
I have tried this
SELECT * FROM table where Data> 0
here is the result :
Day. Data
2/1/2020 .2
...... ...
3/2/2020. .2
4/2/2020. .3
8/2/2020 2
9/2/2020 2.4
10/2/2020 3
However, this does not find the latest 0 and remove everything before it.
I also want to show the result starting 2 days after the final zero of the sequence in the table.
Day Data
10/2/2020 3
11/2/2020. 3.5
..... ....

One method is:
select t.*
from t
where t.day > (select max(t2.day) from t t2 where t2.value = 0);
You can offset this:
where t.day > (select max(t2.day) + interval '2' day from t t2 where t2.value = 0);
The above assumes that at least one row has zeros. Here are two easy fixes:
where t.day > all (select t2.day from t t2 where t2.value = 0);
or:
where t.day > (select coalesce(max(t2.day), '2000-01-01') from t t2 where t2.value = 0);
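As a rough, runnable sketch of the subquery approach above, here is the same idea against an in-memory SQLite database driven from Python. The table name t, the ISO date strings and the helper rows_after_last_zero are assumptions made for illustration (the question's dates are read as day/month/year), and SQLite's date() modifier stands in for the interval arithmetic.

```python
import sqlite3

# Hypothetical table t(day, value); days are stored as ISO text (YYYY-MM-DD),
# assuming the question's dates are day/month/year.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (day TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2020-02-01", 0), ("2020-02-02", 0), ("2020-02-03", 0.2),
        ("2020-02-04", 0.3), ("2020-02-05", 0), ("2020-02-06", 0),
        ("2020-02-07", 0), ("2020-02-08", 2), ("2020-02-09", 2.4),
        ("2020-02-10", 3),
    ],
)


def rows_after_last_zero(conn, offset_days=0):
    """Return rows later than (day of the last zero + offset_days)."""
    sql = """
        SELECT day, value
        FROM t
        WHERE day > COALESCE(
            (SELECT date(MAX(day), ?) FROM t WHERE value = 0),
            ''  -- no zero rows at all: keep every row
        )
        ORDER BY day
    """
    return conn.execute(sql, (f"+{offset_days} days",)).fetchall()


after_zero = rows_after_last_zero(conn)
two_days_later = rows_after_last_zero(conn, offset_days=2)
print(after_zero)
print(two_days_later)
```

With this sample data the last zero falls on 2020-02-07, so the offset of 2 days leaves only the row for 2020-02-10.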

You can use window functions:
select day, data
from (
select t.*, max(case when data = 0 then day end) over() day0
from mytable t
) t
where day > day0 or day0 is null
order by day
This is easily adapted if you want to start two days after the last 0:
select day, data
from (
select t.*, max(case when data = 0 then day end) over() day0
from mytable t
) t
where day > day0 + interval '2 day' or day0 is null
order by day

Count average with multiple conditions

I'm trying to create a query which allows to categorize the average percentage for specific data per month.
Here's how my dataset presents itself:
Date        Name   Group  Percent
2022-01-21  name1  gr1    5.2
2022-01-22  name1  gr1    6.1
2022-01-26  name1  gr1    4.9
2022-02-01  name1  gr1    3.2
2022-02-03  name1  gr1    8.1
2022-01-22  name2  gr1    36.1
2022-01-25  name2  gr1    32.1
2022-02-10  name2  gr1    35.8
...         ...    ...    ...
And here's what I want to obtain with my query (based on what I showed of the table):
Month  <=25%  25<_<=50%  50<_<=75%  75<_<=100%
01     1      1          0          0
02     1      1          0          0
...    ...    ...        ...        ...
The result needs to:
Be ordered by month
Have the average use for each name counted and categorized
So far I know how to get the average of the Percent value per Name:
SELECT Name,
AVG(Percent)
from `table`
where Group = 'gr1'
group by Name
and how to count iterations of Percent in the categories created for the query:
SELECT EXTRACT(MONTH FROM Date) as Month,
COUNT(CASE WHEN Percent <= 25 AND Group = 'gr1' THEN Name END) `_25`,
COUNT(CASE WHEN Percent > 25 AND Percent <= 50 AND Group = 'gr1' THEN Name END) `_50`,
COUNT(CASE WHEN Percent > 50 AND Percent <= 75 AND Group = 'gr1' THEN Name END) `_75`,
COUNT(CASE WHEN Percent > 75 AND Percent <= 100 AND Group = 'gr1' THEN Name END) `_100`,
FROM `table`
GROUP BY Month
ORDER BY Month
but this counts all iterations of every name where I want the average of those values.
I've been struggling to figure out how to combine the two queries or to create a new one that answers my need.
I'm working with the BigQuery service from Google Cloud
This query produces the needed result, based on your example. It combines your two queries using a subquery: the subquery calculates the AVG grouped by Name, Month and Group, and the outer query does the COUNT and the "categorization".
SELECT
Month,
COUNT(CASE
WHEN avg <= 25 THEN Name
END) AS _25,
COUNT(CASE
WHEN avg > 25
AND avg <= 50 THEN Name
END) AS _50,
COUNT(CASE
WHEN avg > 50
AND avg <= 75 THEN Name
END) AS _75,
COUNT(CASE
WHEN avg > 75
AND avg <= 100 THEN Name
END) AS _100
FROM
(
SELECT
EXTRACT(MONTH from Date) AS Month,
Name,
AVG(Percent) AS avg
FROM
table1
GROUP BY Month, Name, Group
HAVING Group = 'gr1'
) AS namegr
GROUP BY Month
ORDER BY Month
This is the result:
Month  _25  _50  _75  _100
1      1    1    0    0
2      1    1    0    0
See also Fiddle (BUT on MySql) - http://sqlfiddle.com/#!9/16c5882/9
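For anyone who wants to try the "average first, then count per bucket" idea without a BigQuery project, here is a small sketch using Python's built-in sqlite3 module. The table name usage_log and the column name grp (used instead of the reserved word Group) are assumptions, and strftime replaces EXTRACT because SQLite has no EXTRACT function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the question's sample data.
conn.execute(
    "CREATE TABLE usage_log (date TEXT, name TEXT, grp TEXT, percent REAL)"
)
conn.executemany(
    "INSERT INTO usage_log VALUES (?, ?, ?, ?)",
    [
        ("2022-01-21", "name1", "gr1", 5.2),
        ("2022-01-22", "name1", "gr1", 6.1),
        ("2022-01-26", "name1", "gr1", 4.9),
        ("2022-02-01", "name1", "gr1", 3.2),
        ("2022-02-03", "name1", "gr1", 8.1),
        ("2022-01-22", "name2", "gr1", 36.1),
        ("2022-01-25", "name2", "gr1", 32.1),
        ("2022-02-10", "name2", "gr1", 35.8),
    ],
)

sql = """
    SELECT month,
           COUNT(CASE WHEN avg_pct <= 25 THEN name END)                  AS _25,
           COUNT(CASE WHEN avg_pct > 25 AND avg_pct <= 50 THEN name END)  AS _50,
           COUNT(CASE WHEN avg_pct > 50 AND avg_pct <= 75 THEN name END)  AS _75,
           COUNT(CASE WHEN avg_pct > 75 AND avg_pct <= 100 THEN name END) AS _100
    FROM (
        -- Inner query: one average per (month, name) within the group.
        SELECT CAST(strftime('%m', date) AS INTEGER) AS month,
               name,
               AVG(percent) AS avg_pct
        FROM usage_log
        WHERE grp = 'gr1'
        GROUP BY month, name
    )
    GROUP BY month
    ORDER BY month
"""
buckets = conn.execute(sql).fetchall()
print(buckets)
```

Each name contributes exactly one average per month, so the outer COUNT counts names rather than individual rows.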
You can use this query to Group By Month and each Name
SELECT CONCAT(EXTRACT(MONTH FROM Date), ', ', Name) AS DateAndName,
CASE
WHEN AVG(Percent) <= 25 THEN '1'
ELSE '0'
END AS '<=25%',
CASE
WHEN AVG(Percent) > 25 AND AVG(Percent) <= 50 THEN '1'
ELSE '0'
END AS '25<_<=50%',
CASE
WHEN AVG(Percent) > 50 AND AVG(Percent) <= 75 THEN '1'
ELSE '0'
END AS '50<_<=75%',
CASE
WHEN AVG(Percent) > 75 AND AVG(Percent) <= 100 THEN '1'
ELSE '0'
END AS '75<_<=100%'
from DataTable /*change to your table name*/
group by EXTRACT(MONTH FROM Date), Name
order by DateAndName
It gives the following result:
DateAndName  <=25%  25<_<=50%  50<_<=75%  75<_<=100%
1, name1     1      0          0          0
1, name2     0      1          0          0
2, name1     1      0          0          0
2, name2     0      1          0          0

Oracle SQL Show all month of a year, with or without value ORA-01841

I have a problem that is driving me to despair. My data is distributed over days, and I would like to display it for an entire year, once by month and once by week.
With the months, my select only returns the months that have data (January and September), but I want every month of the selected year to be displayed, even if it is empty. For this I added a "WITH" clause (copied from elsewhere) and am now trying to join it, but I get an ORA-01841 error.
Also, how do I adapt the whole construct to display weeks instead?
WITH MONAT_ZAEHLER (MZ) AS
(
SELECT
TO_CHAR(ADD_MONTHS(TO_DATE('01.2022','MM.YYYY'),LEVEL -1),'Month', 'NLS_DATE_LANGUAGE = GERMAN') AS GRD_ROW_ID
FROM
DUAL
CONNECT BY LEVEL <= 12
)
SELECT
TO_CHAR(GEN_DATUM,'Month', 'NLS_DATE_LANGUAGE = GERMAN') AS GRD_ROW_ID
, COUNT( DISTINCT CASE
WHEN LP_BELEGUNG.ART = 1 THEN LP_BELEGUNG.LP_BELEGUNG_ID
ELSE NULL
END ) AS "1"
, COUNT( DISTINCT CASE
WHEN LP_BELEGUNG.ART = 2 THEN LP_BELEGUNG.LP_BELEGUNG_ID
ELSE NULL
END ) AS "2"
, COUNT( DISTINCT CASE
WHEN LP_BELEGUNG.ART = 3 THEN LP_BELEGUNG.LP_BELEGUNG_ID
ELSE NULL
END ) AS "3"
, COUNT( DISTINCT CASE
WHEN LP_BELEGUNG.ART = 99 THEN LP_BELEGUNG.LP_BELEGUNG_ID
ELSE NULL
END ) AS "99"
FROM
LP_BELEGUNG
FULL OUTER JOIN MONAT_ZAEHLER ON TRUNC(LP_BELEGUNG.GEN_DATUM, 'Month') = MONAT_ZAEHLER.MZ
WHERE
TO_CHAR(GEN_DATUM, 'YYYY') = '2022'
GROUP BY
TO_CHAR(GEN_DATUM,'Month', 'NLS_DATE_LANGUAGE = GERMAN')
The error is because you're converting the month to a name string in the CTE, then trying to convert it again for the GRD_ROW_ID alias.
The solution is basically the same as your previous question, but now you want the CTE to have one row per month - which you are doing, but you should leave it as a date type in the CTE, not convert it to a string there:
with cte (dt) as (
select add_months(date '2022-01-01', level - 1)
from dual
connect by level <= 12
)
... then convert that actual date value to a string:
SELECT
TO_CHAR(cte.dt, 'Month', 'NLS_DATE_LANGUAGE = GERMAN') AS GRD_ROW_ID
...
... and outer join to your actual table as before, using a date range:
FROM
cte
LEFT JOIN
LP_BELEGUNG
ON
LP_BELEGUNG.GEN_DATUM >= cte.dt AND LP_BELEGUNG.GEN_DATUM < add_months(cte.dt, 1)
GROUP BY
cte.dt
ORDER BY
cte.dt
... this time looking for values where GEN_DATUM is greater than or equal to the cte.dt value (again, as before), which is midnight on the first day of the month; and less than add_months(cte.dt, 1), which is midnight on the first day of the following month. So for January, that will be >= 2022-01-01 00:00:00 and < 2022-02-01 00:00:00, which covers all possible dates and times during that month.
GRD_ROW_ID  ANZAHL_ART_1  ANZAHL_ART_2  ANZAHL_ART_3  ANZAHL_ART_4
Januar      0             0             0             0
Februar     0             0             0             0
März        0             0             0             0
April       0             0             0             0
Mai         0             0             0             0
Juni        0             0             0             0
Juli        0             0             0             0
August      0             0             0             0
September   1             1             1             7
Oktober     0             0             0             0
November    0             0             0             0
Dezember    0             0             0             0
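The calendar-CTE pattern (generate one row per month, then left join on a half-open date range) is not specific to Oracle. Below is a minimal sketch of the same shape in SQLite, run from Python; the bookings table, its columns and the sample rows are made up for illustration, and a recursive CTE takes the place of CONNECT BY LEVEL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for LP_BELEGUNG: id, kind (like ART), created timestamp.
conn.execute("CREATE TABLE bookings (id INTEGER, kind INTEGER, created TEXT)")
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?)",
    [
        (1, 1, "2022-01-15 10:00:00"),
        (2, 2, "2022-01-31 23:59:59"),
        (3, 1, "2022-09-01 00:00:00"),
        (4, 1, "2022-09-30 12:00:00"),
        (5, 2, "2023-01-01 00:00:00"),  # next year: must not be counted
    ],
)

sql = """
    WITH RECURSIVE months(dt) AS (
        SELECT '2022-01-01'
        UNION ALL
        SELECT date(dt, '+1 month') FROM months WHERE dt < '2022-12-01'
    )
    SELECT strftime('%m', m.dt) AS month,
           COUNT(DISTINCT CASE WHEN b.kind = 1 THEN b.id END) AS kind_1,
           COUNT(DISTINCT CASE WHEN b.kind = 2 THEN b.id END) AS kind_2
    FROM months m
    LEFT JOIN bookings b
           -- Half-open range: first of the month up to (not including) the next.
           ON b.created >= m.dt
          AND b.created <  date(m.dt, '+1 month')
    GROUP BY m.dt
    ORDER BY m.dt
"""
monthly = conn.execute(sql).fetchall()
for row in monthly:
    print(row)
```

Because the join starts from the generated months, every month appears in the output even when no booking falls inside it.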
To get a row for every week of the year you would do something similar again, but in blocks of 7 days:
with cte (dt) as (
select date '2022-01-01' + 7 * (level - 1)
from dual
connect by level <= 53
)
SELECT
TO_CHAR(cte.dt, 'YYYY-WW') AS GRD_ROW_ID
...
FROM
cte
LEFT JOIN
LP_BELEGUNG
ON
LP_BELEGUNG.GEN_DATUM >= cte.dt AND LP_BELEGUNG.GEN_DATUM < cte.dt + 7
AND LP_BELEGUNG.GEN_DATUM < add_months(trunc(cte.dt, 'YYYY'), 12)
GROUP BY
cte.dt
ORDER BY
cte.dt
which has an extra check in the join to stop it including data from week 53 that actually falls in the following year, which I'm guessing you would want.

Get running time from table with start / stop event datetime only

I need your help to get the total running duration per day from a table when I record only start and stop events:
id  ts                          event
1   2020-12-26 09:00:00.589016  0
2   2020-12-26 10:25:00.589016  1
3   2020-12-26 19:30:45.644092  0
4   2020-12-26 22:30:00.554092  1
0 = stop event
1 = start event
The difficulty here is to compute the duration between start and stop events but also:
if the start event was on the day before, include the duration between midnight and the first stop event (in this example 9h)
Any idea how to achieve this?
Assuming your times are already in datetime64 format, as shown below:
id  ts                          event
1   2020-12-25 23:55:09.589016  1
2   2020-12-26 00:05:18.589016  0
3   2020-12-26 09:00:00.589016  1
4   2020-12-26 10:25:00.589016  0
5   2020-12-26 19:30:45.644092  1
6   2020-12-26 22:30:00.554092  0
You can do the following:
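# Start events (event == 1), renumbered from 0 so they line up with the stops.
dfs = df.loc[df.event == 1]
dfs = dfs.rename(columns={"ts": "Start"})
dfs.reset_index(drop=True, inplace=True)
# Stop events (event == 0), renumbered the same way.
dfnd = df.loc[df.event == 0]
dfnd = dfnd.rename(columns={"ts": "Stop"})
dfnd.reset_index(drop=True, inplace=True)
# Row-wise difference: the duration of each start/stop pair.
dfdur = dfnd.Stop - dfs.Start
This yields the following: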
0 0 days 00:10:09
1 0 days 01:25:00
2 0 days 02:59:14.910000
For each row with event = 0 that has no previous row with event = 1 on the same day, create another row with ts at midnight of that day.
Similarly, for each row with event = 1 that has no next row with event = 0 on the same day, create another row with ts at 23:59:59.999999 of that day.
This can be done in a CTE.
Then use the window function LAG() on each row with event = 0 to get the starting time, calculate the difference with strftime(), and finally aggregate the differences for each day:
WITH cte AS (
SELECT ts, event FROM tablename
UNION ALL
SELECT datetime(date(t.ts)), 1
FROM tablename t
WHERE event = 0 AND NOT EXISTS (SELECT 1 FROM tablename WHERE event = 1 AND date(ts) = date(t.ts) AND ts < t.ts)
UNION ALL
SELECT date(t.ts) || ' 23:59:59.999999', 0
FROM tablename t
WHERE event = 1 AND NOT EXISTS (SELECT 1 FROM tablename WHERE event = 0 AND date(ts) = date(t.ts) AND ts > t.ts)
)
SELECT date(ts) date,
SUM(strftime('%s', ts) - strftime('%s', prev_ts)) total
FROM (
SELECT *, LAG(ts) OVER (ORDER BY ts) prev_ts
FROM cte
)
WHERE event = 0
GROUP BY date
You will get the total per day in seconds.
If you want better accuracy you can use the function julianday() instead of strftime():
..............................
SELECT date(ts) date,
SUM(julianday(ts) - julianday(prev_ts)) * 24 * 3600 total
..............................
Or, a more efficient way:
WITH cte AS (
SELECT *,
LAG(ts, 1, date(ts)) OVER (PARTITION BY date(ts) ORDER BY ts) start_ts,
event = 1 AND LEAD(ts) OVER (PARTITION BY date(ts) ORDER BY ts) IS NULL flag
FROM tablename
)
SELECT date(ts) date,
SUM(
CASE flag
WHEN 0 THEN strftime('%s', ts) - strftime('%s', start_ts)
WHEN 1 THEN strftime('%s', date(ts) || ' 23:59:59.999999') - strftime('%s', ts)
END
) total
FROM cte
WHERE event = 0 OR flag = 1
GROUP BY date
Note that this code works only if all datetimes are in the format YYYY-MM-DD hh:mm:ss.ssssss (I noticed that in your sample data there is a value that is not of that format: '2020-12-26 9:00:00.589016').
See the demo.
Results:
> date | total
> :--------- | ----:
> 2020-12-26 | 70544
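The same padding idea (a stop with no earlier start counts from midnight, a start with no later stop counts to the end of the day) can also be sketched outside the database. The helper below is a hypothetical plain-Python version, not part of the SQL answer; it assumes events alternate between start (1) and stop (0), and it additionally splits any interval that crosses midnight.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta

START, STOP = 1, 0  # event codes as defined in the question


def running_seconds_per_day(events):
    """events: iterable of (datetime, event_code). Returns {date: seconds}."""
    intervals = []
    started = None
    for ts, event in sorted(events):
        if event == START:
            if started is None:
                started = ts
        elif started is not None:
            intervals.append((started, ts))
            started = None
        else:
            # Stop without a recorded start: assume running since midnight.
            intervals.append((datetime.combine(ts.date(), time.min), ts))
    if started is not None:
        # Start without a stop: assume running until the end of that day.
        end = datetime.combine(started.date() + timedelta(days=1), time.min)
        intervals.append((started, end))

    totals = defaultdict(float)
    for begin, end in intervals:
        # Split intervals that cross midnight so each day gets its own share.
        while begin < end:
            next_midnight = datetime.combine(
                begin.date() + timedelta(days=1), time.min
            )
            piece_end = min(end, next_midnight)
            totals[begin.date()] += (piece_end - begin).total_seconds()
            begin = piece_end
    return dict(totals)


sample = [
    (datetime(2020, 12, 26, 9, 0, 0, 589016), STOP),
    (datetime(2020, 12, 26, 10, 25, 0, 589016), START),
    (datetime(2020, 12, 26, 19, 30, 45, 644092), STOP),
    (datetime(2020, 12, 26, 22, 30, 0, 554092), START),
]
per_day = running_seconds_per_day(sample)
print(per_day)
```

On the question's four rows this gives roughly 70545.09 seconds for 2020-12-26, about one second more than the 70544 above, because it keeps fractional seconds and counts up to midnight rather than 23:59:59.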
You can find the difference between the start and stop times for each interval, and then sum the latter result, grouped by the day:
with _events as (select row_number() over (order by t1.id) r, substr(t1.ts, 0, instr(t1.ts, " ")) day, t1.* from test t1),
events as (select (select sum(e2.event = 1 and e2.r < e1.r) from _events e2) c, e1.* from _events e1)
select day_r.day, sum(diff) from (
select e3.day, (julianday(max(e3.ts)) - julianday(min(e3.ts)))*24*60*60 diff
from events e3
group by e3.c
)
day_r group by day_r.day;

Postgres query to conditionally return a value from a subset of records if it meets a criteria

Given a dataset with a timestamp and a value, I would like to run a query that, for a given day, returns 0 if a record with value 0 exists for that day, or 1 if only non-zero values exist for that day. If there are no records for that day, it should also return 0.
As an example, with the given data set.
2019-06-20 23.1
2019-06-20 22.4
2019-06-20 23.1
2019-06-18 23.2
2019-06-18 0
2019-06-18 22.8
I would like to have this returned:
2019-06-20 1 -- only non-zero values for 6/20
2019-06-19 0 -- no data for 06/19
2019-06-18 0 -- 06/18 has a zero value
I know I can write a stored procedure to do this, but was hoping I could do it with a query (possibly CTE)
You can use aggregation with generate_series():
select d.day,
(case when min(t.value) > 0 then 1 else 0 end) as flag
from (select generate_series(min(day), max(day), interval '1 day') as day
from t
) d left join
t
on t.day = d.day
group by d.day
order by d.day;
This assumes that the values are non-negative (as in your example). If you can have negative values:
select d.day,
(case when count(*) filter (where t.value = 0) = 0 and
count(t.value) > 0
then 1 else 0
end) as flag
from (select generate_series(min(day), max(day), interval '1 day') as day
from t
) d left join
t
on t.day = d.day
group by d.day
order by d.day;
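To experiment with this logic locally, the sketch below reproduces the first query in SQLite from Python. SQLite has no generate_series() built in by default, so a recursive CTE generates the days instead; the table name t and the ISO date strings are assumptions, and the values are assumed to be non-negative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (day TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2019-06-20", 23.1), ("2019-06-20", 22.4), ("2019-06-20", 23.1),
        ("2019-06-18", 23.2), ("2019-06-18", 0), ("2019-06-18", 22.8),
    ],
)

sql = """
    WITH RECURSIVE days(day) AS (
        SELECT MIN(day) FROM t
        UNION ALL
        SELECT date(day, '+1 day') FROM days
        WHERE day < (SELECT MAX(day) FROM t)
    )
    SELECT d.day,
           -- MIN is NULL for days without rows, so those fall into ELSE 0.
           CASE WHEN MIN(t.value) > 0 THEN 1 ELSE 0 END AS flag
    FROM days d
    LEFT JOIN t ON t.day = d.day
    GROUP BY d.day
    ORDER BY d.day
"""
flags = conn.execute(sql).fetchall()
print(flags)
```

The day with no data (2019-06-19) and the day containing a zero (2019-06-18) both get 0, and only 2019-06-20 gets 1.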

How to show different dates data (from the same table) as columns in Oracle

I'm sorry if the title wasn't too clear, but the following explanation will be more accurate.
I have the following view:
DATE USER CONDITION
20140101 1 A
20140101 2 B
20140101 3 C
20140108 1 C
20140108 3 B
20140108 2 C
What I need to do is show how many users were in each condition today and 7 days before today.
Output should be like this:
Condition Today Last_Week (Today-7)
A 0 1
B 1 1
C 2 1
How can I do this in Oracle? I will need to do this for 4 weeks, so it'll be Today-7, -14 and -21.
I've tried this with group by, but I get the "week2" as rows. Then I tried something like Select conditions, (select count(users) from MyView where DATE='Today') FROM MyView (modelled on something that actually works), but it doesn't work for me.
Achieved this with a little modification of the accepted answer:
select condition,
count(case when to_date(xdate) = to_date(sysdate) then 1 end) to_day,
count(case when to_date(xdate) = to_date(sysdate-7) then 1 end) last_7_days
from my_table
group by condition
select condition, count(case when to_date(xdate) = to_date(sysdate) then 1 end) to_day,
count(case when to_date(xdate) < to_date(sysdate) then 1 end) last_7_days
from my_table
where to_date(xdate) >= to_date(sysdate) - 7
group by condition
select condition
, sum
( case
when date between trunc(sysdate) - 7 and trunc(sysdate) - 1
then 1
else 0
end
)
last_week
, sum
( case
when date between trunc(sysdate) and trunc(sysdate + 1)
then 1
else 0
end
)
this_week
from table
group
by condition
By using a conditional count (as a sum) and grouping on condition, you can pick out all the desired dates. Note that trunc truncates a date to the start of the day (midnight).
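The conditional-aggregation pivot can be tried out with the question's sample rows in SQLite, as in the sketch below. The table name my_view, the column names xdate and user_id, and passing "today" as a bind parameter instead of using sysdate are all assumptions made so the example gives a fixed result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_view (xdate TEXT, user_id INTEGER, condition TEXT)")
conn.executemany(
    "INSERT INTO my_view VALUES (?, ?, ?)",
    [
        ("2014-01-01", 1, "A"), ("2014-01-01", 2, "B"), ("2014-01-01", 3, "C"),
        ("2014-01-08", 1, "C"), ("2014-01-08", 3, "B"), ("2014-01-08", 2, "C"),
    ],
)

sql = """
    SELECT condition,
           -- COUNT ignores NULLs, so each CASE only counts matching rows.
           COUNT(CASE WHEN xdate = :today THEN 1 END)                  AS today,
           COUNT(CASE WHEN xdate = date(:today, '-7 days') THEN 1 END) AS last_week
    FROM my_view
    GROUP BY condition
    ORDER BY condition
"""
pivot = conn.execute(sql, {"today": "2014-01-08"}).fetchall()
print(pivot)
```

Further weeks (Today-14, Today-21) would follow the same pattern with one more COUNT(CASE ...) column each.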