Number of days in a dataset SQL Oracle without GROUP BY

I need to calculate the Moving Range for a set of data without using a GROUP BY clause. As I am calculating the average value and the previous average value, I need to take into account only the days that actually have values. I can't use DIFFDATE(start-end).
Another constraint is that I need to do it at row level, because I need it as a pre-calculated value (the denominator) for calculating the AVG Moving Range.
At the moment I am using window functions to calculate the average and previous averages.
ROUND(AVG(SUMCOUNTSFT3) OVER (PARTITION BY to_date(to_char(DATETIMEOFREADING, 'DD/MM/RR'))), 2) AS AVG_SUMCOUNTSFT3,
ROUND(AVG(SUMCOUNTSFT3) OVER (ORDER BY to_date(to_char(DATETIMEOFREADING, 'DD/MM/RR'))
      RANGE BETWEEN INTERVAL '1' DAY PRECEDING AND INTERVAL '1' DAY PRECEDING), 2) AS LAG_VAL
Here is some sample data; as you can see, I have multiple readings from a sensor. I have calculated the average for that day and for the previous day. Then I will take the difference between data points as |Xi - Xi-1|; the denominator is the column that I am trying to calculate. In some cases we will have no readings for a day if the sensor is failing, and I need to discard those days.
I believe a ROW_NUMBER() or DENSE_RANK() will do the job with a partition clause.
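For example, a sketch of that idea (untested; READINGS, DATETIMEOFREADING and SUMCOUNTSFT3 stand in for my real table and columns, and TRUNC(DATETIMEOFREADING) is equivalent to the to_date/to_char round trip above):
SELECT r.*,
       -- distinct-day count at row level: DENSE_RANK counted from both
       -- directions overlaps by exactly one on every row
       DENSE_RANK() OVER (ORDER BY TRUNC(DATETIMEOFREADING))
         + DENSE_RANK() OVER (ORDER BY TRUNC(DATETIMEOFREADING) DESC)
         - 1 AS DAYS_WITH_DATA
FROM READINGS r;
Days with no readings simply never appear in the table, so they are excluded from the count automatically.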

Related

SQL average of previous range of columns into current column

I am trying to get the following calculations but at row level. In the image below I calculated the avg of the values for each day (a day can have n rows), then I used the LAG function to insert the avg of the previous row into the next row's LAG_VAL column.
Now I am doing the calculations at row level. I have been able to get the average for that range of data using windowed (analytic) functions:
ROUND(AVG(SUMCOUNTSFT3) OVER (partition by to_date(to_char(DATETIMEOFREADING, 'DD/MM/RRRR'))),2) as AVG_SUMCOUNTSFT3
but I have not been able to calculate the avg value of the previous day and insert it into the range of the next day, as illustrated in the previous image.
I am not sure if there is a way to implement this with a RANGE windowing clause or if I need to use PL/SQL.
Off the top of my head (without a matching schema to test with) this windowing clause should work:
(order by to_date(to_char(DATETIMEOFREADING, 'DD/MM/RRRR')) range between interval '1' day preceding and interval '1' day preceding)
This is plain SQL, so it works inside as well as outside of PL/SQL.
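Spelled out as a complete query, a sketch under the same caveat (READINGS is a stand-in table name; TRUNC(DATETIMEOFREADING) is equivalent to the to_date/to_char round trip):
SELECT DATETIMEOFREADING,
       SUMCOUNTSFT3,
       -- average of all readings on the same calendar day
       ROUND(AVG(SUMCOUNTSFT3) OVER (PARTITION BY TRUNC(DATETIMEOFREADING)), 2) AS AVG_SUMCOUNTSFT3,
       -- average of all readings from exactly one day earlier (NULL if that day has no rows)
       ROUND(AVG(SUMCOUNTSFT3) OVER (ORDER BY TRUNC(DATETIMEOFREADING)
             RANGE BETWEEN INTERVAL '1' DAY PRECEDING AND INTERVAL '1' DAY PRECEDING), 2) AS LAG_VAL
FROM READINGS;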

Redshift - How to SUM number over last 4 weeks as a window function per row?

Is it possible to SUM a number over a specific time period in Amazon Redshift with a window function?
As an example, I'm counting login numbers for different companies per day.
What I want per row is the sum of logins over the last 4 weeks, referenced by the date of the row. The field I'm searching for is marked yellow in the screenshot.
Thanks in advance for your help.
If you have data for each day, then you can use rows:
select t.*,
       sum(logs) over (partition by company
                       order by date
                       rows between 27 preceding and current row
                      ) as logins_4_weeks
from t;
Redshift does not yet support range for the window frame, so this is your best bet.
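If some days can be missing for a company, the 28-row frame no longer spans exactly 4 weeks. One workaround (a sketch, assuming your table is called t with columns company, date and logs, as in the snippet above) is a self-join on a date window instead of a row frame:
select t1.company, t1.date,
       sum(t2.logs) as logins_4_weeks
from t t1
join t t2
  on t2.company = t1.company
 -- 28 calendar days ending at the row's date, regardless of gaps
 and t2.date between t1.date - 27 and t1.date
group by t1.company, t1.date;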

How can I optimally modify this BigQuery query to retrieve the latest available data

I have the following query. It initially performs a sub-select against a table that is partitioned (sharded) by sample_date_time, filtering on a date range in the WHERE clause that is passed in via parameters. Then the final SELECT picks out the data to be returned.
The query currently returns data for the latest complete hour (from the previous hour's starting boundary to the end of that hour). I want to adapt it to instead get the latest hour of data that contains any sample, up to a maximum of approximately 5 hrs ago. The query can't use anything that invalidates the BigQuery cache within any given hour (e.g. I can't use a date function that gets the current date). The table data only updates every hour.
I'm thinking maybe I need to select the max sample_date_time in the initial sub-select, over a range of the last 5 hours. I could pass the hourly end boundary of the current time as a parameter, but I'm not seeing how I can limit the date range for which to retrieve the MAX, then use that max to get the start and end dates of the most recent hour that has any data.
WITH data AS (
  SELECT
    created_date_time,
    sample_date_time,
    station,
    channel,
    value
  FROM my.mart
  WHERE sample_date_time BETWEEN '2019-07-23 04:00:00.000000+00:00' AND '2019-07-23 04:59:59.999000+00:00'
    AND station = '[my_guid]'
)
SELECT sample_date_time, station, channel, value
FROM data
ORDER BY value DESC, channel ASC, sample_date_time DESC
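Something like this sketch is the direction I'm considering (untested; @end_boundary is a parameter I would pass in with the cache-safe hourly boundary): find the newest sample in the trailing 5-hour window, truncate it to its hour, and filter to that hour:
WITH latest AS (
  -- start of the most recent hour that contains any sample, within the last 5 hours
  SELECT TIMESTAMP_TRUNC(MAX(sample_date_time), HOUR) AS hour_start
  FROM my.mart
  WHERE sample_date_time BETWEEN TIMESTAMP_SUB(@end_boundary, INTERVAL 5 HOUR) AND @end_boundary
    AND station = '[my_guid]'
)
SELECT m.sample_date_time, m.station, m.channel, m.value
FROM my.mart AS m
CROSS JOIN latest
WHERE m.sample_date_time >= latest.hour_start
  AND m.sample_date_time < TIMESTAMP_ADD(latest.hour_start, INTERVAL 1 HOUR)
  AND m.station = '[my_guid]'
ORDER BY m.value DESC, m.channel ASC, m.sample_date_time DESC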

Creating a DAX pattern that counts days between a date field and a month value on a chart's x-axis

I am struggling with a DAX pattern to allow me to plot an average duration value on a chart.
Here is the problem: My dataset has a field called dtOpened which is a date value describing when something started, and I want to be able to calculate the duration in days since that date.
I then want to be able to create an average duration since that date over a time period.
It is very easy to do when thinking about the value as it is now, but I want to be able to show a chart that describes what that average value would have been over various time periods on the x-axis (month/quarter/year).
The problem that I am facing is that if I create a calculated column to find the current age (NOW() - [dtOpened]), then it always uses the NOW() function, which is of no use for historic time spans. Maybe I need a measure for this rather than a calculated column, but I cannot work out how to do it.
I have thought about using LASTDATE (rather than NOW) to work out what the last date would be in the filter context of any single month/quarter/year, but if the current month is only halfway through, then it would probably need to use today's date as the value from which to subtract the dtOpened value.
I would appreciate any help or pointers that you can give me!
It looks like you have a table (let's call it Cases) storing your cases with one record per case with fields like the following:
casename, dtOpened, OpenClosedFlag
You should create a date table with one record per day spanning your date range. The date table should have a month-ending date field identifying the last day of each month (and the same for quarter and year). But this will be a disconnected date table: don't create a relationship between the date on the date table and your case open date.
Then use the iterator function AVERAGEX to average the date differences.
Average Duration (days) :=
CALCULATE (
    AVERAGEX ( Cases, MAX ( DateTable[Month Ending] ) - Cases[dtOpened] ),
    FILTER ( Cases, Cases[OpenClosedFlag] = "Open" ),
    FILTER ( Cases, Cases[dtOpened] <= MAX ( DateTable[Month Ending] ) )
)
Once you plot the measure against your Month you should see the average values represented correctly. You can do something similar for quarter & year.
You're a genius, Rory; Thanks.
In my example, I had a dtClosed field rather than an Opened/Closed flag, so there was one extra piece of filtering to test whether the case was closed at that point in time. My measure ended up looking like this:
Average Duration :=
CALCULATE (
    AVERAGEX ( CasesOnly, MAX ( DT[LastDateM] ) - CasesOnly[Owner Opened dtOnly] ),
    FILTER (
        CasesOnly,
        OR (
            ISBLANK ( CasesOnly[Owner Resolution dtOnly] ),
            CasesOnly[Owner Resolution dtOnly] > MAX ( DT[LastDateM] )
        )
    ),
    FILTER ( CasesOnly, CasesOnly[Owner Opened dtOnly] <= MAX ( DT[LastDateM] ) )
)
And to get the chart, I plotted the DT[Date] field on the x-axis.
Thanks very much again.

Moving trailing week average in PostgreSQL

My source data includes Transaction ID, Date, Amount. I need a one-week trailing average that moves on a daily basis, averaging the amount per transaction. The problem is that sometimes there are no transactions on a particular date; I need the avg per transaction, not per day; and the trailing average moves by day, not by week. In this particular case I can't use OVER with rows preceding. I'm stuck with it :(
Data looks like this:
https://gist.github.com/avitominoz/a252e9f1ab3b1d02aa700252839428dd
There are two methods for doing this. One uses generate_series() to get a row for every date. The second uses a lateral join.
with minmax as (
      select min(trade_date) as mintd, max(trade_date) as maxtd
      from sales
     )
select days.dte, s.amount,
       avg(s.amount) over (order by days.dte
                           rows between 6 preceding and current row
                          ) as avg_7day
from minmax cross join
     generate_series(minmax.mintd, minmax.maxtd, interval '1 day') days(dte) left join
     sales s
     on s.trade_date = days.dte;
Note: this ignores the values on missing days rather than treating them as 0. If you want 0, then use avg(coalesce(s.amount, 0)).
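The lateral join version isn't shown above; here is a minimal sketch of it (same assumed sales table with trade_date and amount columns). Because it averages over a calendar-day window rather than a row frame, missing days can't distort the 7-day span, and the average really is per transaction:
select d.dte,
       w.avg_7day
from generate_series(
       (select min(trade_date) from sales),
       (select max(trade_date) from sales),
       interval '1 day'
     ) d(dte)
cross join lateral (
     -- one average per day over the trailing 7 calendar days of transactions
     select avg(s.amount) as avg_7day
     from sales s
     where s.trade_date between d.dte - interval '6 days' and d.dte
) w;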