get average elapsed time for each month

get average elapsed time for each month - sql

I have a table that has entry date and completion date on it (records go back a couple years) and i need to write a query that gives the average amount of time completion takes by each month. I can get the overall average,
SELECT AVG ( t.completion_dt - t.entry_dt ) * 24
FROM table t;
or the average for a specific month,
SELECT AVG ( t.completion_dt - t.entry_dt ) * 24
FROM table t
WHERE t.entry_dt BETWEEN to_date('2018/12/01','yyyy/mm/dd') AND
to_date('2019/1/01', 'yyyy/mm/dd');
Is there a way to get the query to return the average for each month?

If you are wanting to see each month/year independently (Jan 2020 on a different row from Jan 2019), then you can group on your entry_dt field truncated at the month.
SELECT trunc(t.entry_dt,'MM'),
AVG ( t.completion_dt - t.entry_dt ) * 24
FROM table t
group by trunc(t.entry_dt,'MM');
If you are wanting to average ALL January months together across multiple years, you will want to group on something like the to_char of your entry_dt field.
SELECT to_char(t.entry_dt, 'Month'),
AVG ( t.completion_dt - t.entry_dt ) * 24
FROM table t
group by to_char(t.entry_dt, 'Month');

Related

Extract previous row calculated value for use in current row calculations - Postgres

Have a requirement where I would need to rope the calculated value of the previous row for calculation in the current row.
The following is a sample of how the data currently looks :-
ID
Date
Days
1
2022-01-15
30
2
2022-02-18
30
3
2022-03-15
90
4
2022-05-15
30
The following is the output What I am expecting :-
ID
Date
Days
CalVal
1
2022-01-15
30
2022-02-14
2
2022-02-18
30
2022-03-16
3
2022-03-15
90
2022-06-14
4
2022-05-15
30
2022-07-14
The value of CalVal for the first row is Date + Days
From the second row onwards it should take the CalVal value of the previous row and add it with the current row Days
Essentially, what I am looking for is means to access the previous rows calculated value for use in the current row.
Is there anyway we can achieve the above via Postgres SQL? I have been tinkering with window functions and even recursive CTEs but have had no luck :(
Would appreciate any direction!
Thanks in advance!

select
id,
date,
coalesce(
days - (lag(days, 1) over (order by date, days))
, days) as days,
first_date + cast(days as integer) as newdate
from
(
select
-- get a running sum of days
id,
first_date,
date,
sum(days) over (order by date, days) as days
from
(
select
-- get the first date
id,
(select min(date) from table1) as first_date,
date,
days
from
table1
) A
) B
This query get the exact output you described. I'm not at all ready to say it is the best solution but the strategy employed is to essential create a running total of the "days" ... this means that we can just add this running total to the first date and that will always be the next date in the desired sequence. One finesse: to put the "days" back into the result, we calculated the current running total less the previous running total to arrive at the original amount.

assuming that table name is table1
select
id,
date,
days,
first_value(date) over (order by id) +
(sum(days) over (order by id rows between unbounded preceding and current row))
*interval '1 day' calval
from table1;
We just add cumulative sum of days to first date in table. It's not really what you want to do (we don't need date from previous row, just cumulative days sum)
Solution with recursion
with recursive prev_row as (
select id, date, days, date+ days*interval '1 day' calval
from table1
where id = 1
union all
select t.id, t.date, t.days, p.calval + t.days*interval '1 day' calval
from prev_row p
join table1 t on t.id = p.id+ 1
)
select *
from prev_row

Teradata loop for dates, column adding within loop

I have a table where every row is transaction and there are few columns: clients IDs and dates for every transaction.
I am trying to write a query which will give a table where column N shows number of clients whose first transaction happened in month N made transactions in months: N, N+1, N+2, ...
For example (desired table for 3 months data):
1 2 3
100 90 78
80 80
60
First row of the column 1 shows number of clients whose first transaction happened in month 1, second row shows how many of this clients stayed after 1 month, third row - after two month etc
My current query (Year is a column wit year for the date, like 2017, month is a number of month like 1 for January):
WITH not_in AS(
SELECT ID, Year, month
FROM table
WHERE trans_date<date "2017-01-01"),
ID_in AS(
SELECT ID, Year, month
FROM table
WHERE trans_date BETWEEN date "2017-01-01" AND date "2017-01-31"
),
from_this AS(
SELECT ID, Year, month
FROM table
)
SELECT Year, Month, count(distinct ID)
FROM from_this
WHERE ID IN (select ID from ID_in)
AND
ID NOT IN (select ID from not_in)
GROUP BY 1,2
ORDER BY 1,2
But this gives only one column (for January 2017) of the desired table. I need to change dates for other months in 2017, 2018 and so on manually.
How to avoid this?
I guess, it should be looped somehow. And I think, I should create volatile table and add columns to it within loop, then select * from it.
Also I can not find an instruction for variables declaration and while loops in Teradata, any clearifications are appreciated.

Count Records Prior to Date for Whole Year

I have a historical database with about 9000 records with unique UserID and date they created an account CreatedDate that looks like this:
UserID CreatedDate
1 5/12/2019
2 1/1/2018
3 4/2/2015
4 8/9/2016
. ..
I would like to know how many accounts were created UP TO a certain date, but for multiple months.
For example, how many accounts were there in Jan 2020, Feb 2020, Mar 2020, so on and so forth.
The manual way would be to do this for each month but it would be tedious:
select count(*)
from SCHEMA
--KEEP REPLACING THE MONTH TO GET COUNTS
where CreatedDate <= '2020-01-31'
Just wondering if there is a more efficient way? A group by wouldn't work because it just totals for each month, but I'm trying to get a historical count. Thanks!

You seem to need running total for each month. If so, you need group by to compute total counts per month and then you have to sum them using analytical sum function.
This is how you would do it in Postgres (db fiddle). Other vendors may differ in the way how month is extracted but the principle is same.
with schema(UserID, CreatedDate) as (values
(1, date '2019-12-05'),
(2, date '2018-01-01'),
(3, date '2015-01-04'),
(4, date '2016-09-08')
)
select month, sum(cnt) over (order by month) from (
select date_trunc('month', CreatedDate)::date as month, count(*) as cnt
from schema
group by date_trunc('month', CreatedDate)::date
) x
Note if data has gaps in month sequence and you want continuous sequence (for example all months between 2015-01 and 2019-12), you have to pregenerate calendar (relation with all months) and left join table schema to it. (It is not in my example yet because of YAGNI.)

Appending the result query in bigquery

I am doing a query where the query will append the data from previous date as the outcome in BigQuery.
So, the result data for today will be higher than yesterdays as the data is appending by days.
So far, what I only managed to get the outcome is the data by days (where you can see the number of ID declining and is not appending from previous day) as this result:
What should I do to add appending function in the query so each day will get the result of data from the previous day in bigquery?
code:
WITH
table1 AS (
SELECT
ID,
...
FROM t
WHERE DATE_SUB('2020-01-31', INTERVAL 31 DAY) and '2020-01-31'
),
table2 AS (
SELECT
ID,
COUNTIF((rating < 7) as bad,
COUNTIF((rating >= 7 AND SAFE_CAST(NPS_Rating as INT64) < 9) as intermediate,
COUNTIF((rating as good
FROM
t
WHERE DATE_SUB('2020-01-31', INTERVAL 31 DAY) and '2020-01-31'
)
SELECT
DATE_SUB('2020-01-31', INTERVAL 31 DAY) as date,
*
FROM table1
FULL OUTER JOIN table2 USING (ID)

If you have counts that you want to accumulate, then you want a cumulative sum. The query would look something like this:
select datecol, count(*), sum(count(*)) over (order by datecol)
from t
group by datecol
order by datecol;

Using Date to find the inequality for sales than 500

I'm curious as to find the daily average sales for the month of December 1998 not greater than 100 as a where clause. So what I imagine is that since the table consists of the date of sales (sth like 1 december 1998, consisting of different date, months and year), amount due....First I'm going to define a particular month.
DEFINE a = TO_DATE('1-Dec-1998', 'DD-Month-YYYY')
SELECT SUBSTR(Sales_Date, 4,6), (SUM(Amount_Due)/EXTRACT(DAY FROM LAST_DAY(Sales_Date))
FROM ......
WHERE SUM(AMOUNT_DUE)/EXTRACT(DAY FROM LAST_DAY(&a)) < 100
I'm stuck as to extract the sum of amount due in the month of december 1998 for the where clause....
How can I achieve the objective?

To me, it looks like this:
select to_char(sales_date, 'mm.yyyy') month,
avg(amount_due) avg_value
from your_table
where sales_date >= trunc(date '1998-12-01', 'mm')
and sales_date < add_months(trunc(date '1998-12-01', 'mm'), 1)
group by to_char(sales_date, 'mm.yyyy')
having avg(amount_due) < 100;
WHERE clause can be simplified; it shows how to fetch certain period:
trunc to mm returns first day in that month
add_months to the above value (first day in that month) will return first day of the next month
the bottom line: give me all rows whose sales_date is >= first day of this month and < first day of the next month; basically, the whole this month
Finally, the where clause you used should actually be the having clause.

As long as the amount_due column only contains numbers, you can use the sum function.
Below SQL query should be able to satisfy your requirement.
Select SUM(Amount_Due) from table Sales where Sales_Date between '1-12-1998' and '31-12-1998'
OR
Select SUM(Amount_Due) from table Sales where Sales_Date like '%-12-1998'

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

get average elapsed time for each month - sql

Related

Extract previous row calculated value for use in current row calculations - Postgres

Teradata loop for dates, column adding within loop

Count Records Prior to Date for Whole Year

Appending the result query in bigquery

Using Date to find the inequality for sales than 500

Categories

Resources