Calculation on previous dates in sql - sql

I need to find the age for each day, but I need it for all previous dates in one query.
For example:
-- For SYSDATE:
SELECT SYSDATE AS date_in,
(SYSDATE - create_time) AS age
FROM items
-- For (SYSDATE - 1):
SELECT (SYSDATE - 1) AS date_in,
((SYSDATE - 1) - create_time) AS age
FROM items
-- For (SYSDATE - 2), and so on:
SELECT (SYSDATE - 2) AS date_in,
((SYSDATE - 2) - create_time) AS age
FROM items
Is there a way to calculate this automatically for all previous dates and return it as one result set?
Final output should display like this:
Date_in Age
24/JUN/15 20
23/JUN/15 19
22/JUN/15 18

Basically the idea is to use an inline view that lists all the dates in the items table. I assume there is a create_time value for every day.
Then use a cartesian product join with the items table.
You might need to filter more, not sure which result exactly you expect to have.
On SQLFiddle I used this schema
CREATE TABLE items
(
item_id number,
create_time date
);
insert into items values (1, sysdate-3);
insert into items values (2, sysdate-2);
insert into items values (3, sysdate-1);
insert into items values (4, sysdate);
to test this query, which might be what you were asking for
select
b.system_date as date_in, (b.system_date-a.create_time) as age, a.item_id, a.create_time
from
(select distinct create_time as system_date from items) b,
items a
order by date_in desc, age desc;
The result is available at the link above.

Here is the answer that worked for me:
select trunc(sysdate) - level + 1 dt
, trunc(sysdate) - level + 1 - create_time age from items
connect by trunc(sysdate) - level + 1 - create_time > 0;
or
SELECT (SYSDATE - rownum) AS date_in,
((SYSDATE - rownum) - create_time) AS age
FROM items
CONNECT BY level < 3;
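The row-generation idea in the answers above can be checked outside Oracle. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) instead of Oracle: a recursive CTE plays the role of CONNECT BY LEVEL, and the single sample row, the fixed reference date and the function name are made up for illustration.

```python
import sqlite3

def ages_for_previous_days(today: str, days_back: int):
    """Return (date, age_in_days) rows for `today` and the `days_back` days before it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (item_id INTEGER, create_time TEXT)")
    # Hypothetical sample row: an item created 20 days before 2015-06-24.
    conn.execute("INSERT INTO items VALUES (1, '2015-06-04')")
    sql = """
        WITH RECURSIVE days(d) AS (
            SELECT date(:today)                    -- start at the reference date
            UNION ALL
            SELECT date(d, '-1 day') FROM days     -- step back one day per level
            WHERE d > date(:today, :offset)
        )
        SELECT days.d,
               CAST(julianday(days.d) - julianday(items.create_time) AS INTEGER) AS age
        FROM days CROSS JOIN items
        WHERE days.d >= items.create_time          -- skip dates before the item existed
        ORDER BY days.d DESC
    """
    params = {"today": today, "offset": f"-{days_back} days"}
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows

rows = ages_for_previous_days("2015-06-24", 2)
```

With the sample row, this produces the three date/age pairs shown in the question (ages 20, 19 and 18). Because the date list is built separately and then cross joined, the same query also works when items holds several rows.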

Related

Irregular grouping of timestamp variable

I have a table organized as follows:
id lateAt
1231235 2019/09/14
1242123 2019/09/13
3465345 NULL
5676548 2019/09/28
8986475 2019/09/23
Where lateAt is a timestamp of when a certain loan's payment became late. So, for each current date - I need to look at these numbers daily - there's a certain amount of entries which are late for 0-15, 15-30, 30-45, 45-60, 60-90 and 90+ days.
This is my desired output:
lateGroup Count
0-15 20
15-30 22
30-45 25
45-60 32
60-90 47
90+ 57
This is something I can easily calculate in R, but to get the results back to my BI dashboard I'd have to create a new table in my database, which I don't think is a good practice. What is the SQL-native approach to this problem?
I would define the "late groups" using a range, then join against the number of days:
with groups (grp) as (
values
(int4range(0,15, '[)')),
(int4range(15,30, '[)')),
(int4range(30,45, '[)')),
(int4range(45,60, '[)')),
(int4range(60,90, '[)')),
(int4range(90,null, '[)'))
)
select grp, count(t.user_id)
from groups g
left join the_table t on g.grp @> current_date - t.late_at
group by grp
order by grp;
int4range(0,15, '[)') creates a range from 0 (inclusive) to 15 (exclusive).
Online example: https://rextester.com/QJSN89445
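To see how half-open bins and a left join behave on the sample data, here is a minimal sketch, assuming SQLite (through Python's sqlite3 module) instead of PostgreSQL. The table name loans, the function name and the reference date 2019-10-15 are assumptions for illustration; plain lower and upper bounds stand in for int4range.

```python
import sqlite3

def late_group_counts(today: str):
    """Count loans per lateness bin; bins are [lo, hi) and the last one is open-ended."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE loans (id INTEGER, lateAt TEXT)")
    conn.executemany(
        "INSERT INTO loans VALUES (?, ?)",
        [
            (1231235, "2019-09-14"),
            (1242123, "2019-09-13"),
            (3465345, None),          # not late: must not be counted anywhere
            (5676548, "2019-09-28"),
            (8986475, "2019-09-23"),
        ],
    )
    sql = """
        WITH bins(label, lo, hi) AS (
            VALUES ('0-15', 0, 15), ('15-30', 15, 30), ('30-45', 30, 45),
                   ('45-60', 45, 60), ('60-90', 60, 90), ('90+', 90, NULL)
        )
        SELECT b.label, COUNT(t.id)
        FROM bins b
        LEFT JOIN loans t
          ON julianday(:today) - julianday(t.lateAt) >= b.lo
         AND (b.hi IS NULL OR julianday(:today) - julianday(t.lateAt) < b.hi)
        GROUP BY b.label, b.lo
        ORDER BY b.lo
    """
    rows = conn.execute(sql, {"today": today}).fetchall()
    conn.close()
    return rows

counts = late_group_counts("2019-10-15")
```

Starting from the bins and left joining the loans keeps empty groups in the result with a count of 0, and COUNT(t.id) ignores the row whose lateAt is NULL.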
The quick and dirty way to do this in SQL is:
SELECT '0-15' AS lateGroup,
COUNT(*) AS lateGroupCount
FROM my_table t
WHERE (CURRENT_DATE - t.lateAt) >= 0
AND (CURRENT_DATE - t.lateAt) < 15
UNION
SELECT '15-30' AS lateGroup,
COUNT(*) AS lateGroupCount
FROM my_table t
WHERE (CURRENT_DATE - t.lateAt) >= 15
AND (CURRENT_DATE - t.lateAt) < 30
UNION
SELECT '30-45' AS lateGroup,
COUNT(*) AS lateGroupCount
FROM my_table t
WHERE (CURRENT_DATE - t.lateAt) >= 30
AND (CURRENT_DATE - t.lateAt) < 45
-- Etc...
For production code, you would want to do something more like Ross' answer.
You didn't mention which DBMS you're using, but nearly all of them will have a construct known as a "value constructor" like this:
select bins.lateGroup, bins.minVal, bins.maxVal FROM
(VALUES
('0-15',0,15),
('15-30',15.0001,30), -- increase by a small fraction so bins don't overlap
('30-45',30.0001,45),
('45-60',45.0001,60),
('60-90',60.0001,90),
('90-99999',90.0001,99999)
) AS bins(lateGroup,minVal,maxVal)
If your DBMS doesn't have it, then you can probably use UNION ALL:
SELECT '0-15' as lateGroup, 0 as minVal, 15 as maxVal
union all SELECT '15-30',15,30
union all SELECT '30-45',30,45
Then your complete query, with the sample data you provided, would look like this:
--- example from SQL Server 2012 SP1
--- first let's set up some sample data
create table #temp (id int, lateAt datetime);
INSERT #temp (id, lateAt) values
(1231235,'2019-09-14'),
(1242123,'2019-09-13'),
(3465345,NULL),
(5676548,'2019-09-28'),
(8986475,'2019-09-23');
--- here's the actual query
select lateGroup, count(*) as Count
from #temp as T,
(VALUES
('0-15',0,15),
('15-30',15.0001,30), -- increase by a small fraction so bins don't overlap
('30-45',30.0001,45),
('45-60',45.0001,60),
('60-90',60.0001,90),
('90-99999',90.0001,99999)
) AS bins(lateGroup,minVal,maxVal)
where datediff(day,lateAt,getdate()) between minVal and maxVal
group by lateGroup
order by lateGroup
--- remove our sample data
drop table #temp;
Here's the output:
lateGroup Count
15-30 2
30-45 2
Note: rows with null lateAt are not counted.
I think you can do it all in one clear query:
with cte_lategroup as
(
select *
from (values(0,15,'0-15'),(15,30,'15-30'),(30,45,'30-45')) as t (mini, maxi, designation)
)
select
t2.designation
, count(*)
from test t
left outer join cte_lategroup t2
on current_date - t.lateat >= t2.mini
and current_date - lateat < t2.maxi
group by t2.designation;
With test data like yours:
create table test
(
id int
, lateAt date
);
insert into test
values (1231235, to_date('2019/09/14', 'yyyy/mm/dd'))
,(1242123, to_date('2019/09/13', 'yyyy/mm/dd'))
,(3465345, null)
,(5676548, to_date('2019/09/28', 'yyyy/mm/dd'))
,(8986475, to_date('2019/09/23', 'yyyy/mm/dd'));

Using SQL Server 2012 how to iterate through an unknown number of rows and calculate date differences

I need to calculate the average number of days if there are two or more dates for each ID: the days between date1 and date2, date2 and date3, etc. The output needs to be the average number of days between each interval per ID. I am looking for a solution that iterates through each date for each ID and then averages the number of days.
I could create a row number and partition by the id but in the actual data there can be up to 20 rows for each ID.
CREATE TABLE #ATABLE(
ID INTEGER NOT NULL
,DATE DATE NOT NULL
);
INSERT INTO #ATABLE(ID,DATE) VALUES (1,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/10/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/20/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/30/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (3,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (3,'1/10/2019');
--get avg days between orders
DROP TABLE #ATABLE
The output for the above would be:
ID AvgDatediff
1 Null
2 10
3 9
You can use lag to get the previous row (per row), and then find the diff between it and the current row. Then, you can average them out:
SELECT id, AVG(diff)
FROM (SELECT id,
DATEDIFF(DAY, date, LAG(date) OVER (PARTITION BY id
ORDER BY date DESC)) AS diff
FROM #atable) t
GROUP BY id;
The simplest way to get the average difference is:
SELECT id, DATEDIFF(DAY, MIN(date), MAX(date)) / NULLIF(COUNT(*) - 1, 0)
FROM #atable
GROUP BY id;
Note: You may want a * 1.0 if you don't want an integer average.
In other words, the average difference is the latest date minus the earliest date divided by one less than the count. Try it. It works.
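The shortcut works because the consecutive differences telescope: (d2 - d1) + (d3 - d2) + ... collapses to the last date minus the first. A minimal check in Python on the sample data from the question (the function names are made up for illustration):

```python
from datetime import date

def average_gap_days(dates):
    """Mean number of days between consecutive dates, or None with fewer than two dates."""
    if len(dates) < 2:
        return None
    ordered = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)

def shortcut_gap_days(dates):
    """Same value via (latest - earliest) / (count - 1)."""
    if len(dates) < 2:
        return None
    return (max(dates) - min(dates)).days / (len(dates) - 1)

id1 = [date(2019, 1, 1)]
id2 = [date(2019, 1, 1), date(2019, 1, 10), date(2019, 1, 20), date(2019, 1, 30)]
id3 = [date(2019, 1, 1), date(2019, 1, 10)]

for sample in (id1, id2, id3):
    assert average_gap_days(sample) == shortcut_gap_days(sample)
```

For ID 2 the gaps are 9, 10 and 10 days, so the exact average is 29/3 (about 9.67). With integer division, as in the T-SQL above without the * 1.0, the result would be 9 rather than the 10 shown in the question's expected output.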
SELECT id, AVG(DayDiff)
FROM (
SELECT id,
DATEDIFF(dd, date, LEAD(date) OVER (PARTITION BY id ORDER BY date)) AS DayDiff
FROM #atable
) as AA
GROUP BY id;
LEAD(source_column) returns the value from the next row, where "next" is defined by the ORDER BY clause of the window (here, date).

How can I select stats on lateness of a record expected x days after previous record?

I have entities with config info in one table. If the 'vendor' doesn't do something within 'reminder_days' of the last time of doing it, then it becomes overdue.
CREATE TABLE t_vendors
(
vendor_id NUMBER,
vendor_name VARCHAR2 (250),
reminder_days NUMBER
);
Insert into T_VENDORS (vendor_id, vendor_name, reminder_days)
Values (12, 'sanity-test', 7);
and an app records what they do whenever they do it into this table with this sort of data:
CREATE TABLE t_vendor_events
(
vendor_event_id NUMBER (19,0),
vendor_id NUMBER (19,0),
description VARCHAR2 (250),
event_date DATE
);
Insert into t_vendor_events (vendor_event_id, vendor_id, description, event_date)
Values (10015, 12, 'one', TO_DATE('11/9/2015 21:22:55', 'MM/DD/YYYY HH24:MI:SS'));
Insert into t_vendor_events (vendor_event_id, vendor_id, description, event_date)
Values (10016, 12, 'two', TO_DATE('11/16/2015 21:23:55', 'MM/DD/YYYY HH24:MI:SS'));
Insert into t_vendor_events (vendor_event_id, vendor_id, description, event_date)
Values (10017, 12, 'three', TO_DATE('11/30/2015 21:24:55', 'MM/DD/YYYY HH24:MI:SS'));
Insert into t_vendor_events (vendor_event_id, vendor_id, description, event_date)
Values (10018, 12, 'four', TO_DATE('12/01/2015 21:25:55', 'MM/DD/YYYY HH24:MI:SS'));
Once I've got the comparative values, I need to aggregate the data to quantify the lateness:
how many events occurred
how often they were overdue
what was expected (the reminder days value)
how much they were late on average
how much they were late at worst (max)
I need to see all the vendors in the result, including those that failed to produce an event at all.
All the solutions that I can think of involve creating extra columns and storing some kind of 'lateness' data on every event. This though strikes me as a redundancy, since I know the required interval (reminder_days) but I don't know what kind of nested selects would produce what I need.
I would prefer to stick to standard SQL and I'm not using PL-SQL, but am able to use Oracle-specific syntax in selects where necessary.
The result would look something like this (Expected Days is the 'reminder days' column):
Vendor Event Overdue Expected Avg Max
Count Count Days Elapsed Elapsed
Mega1 5 2 10 12 20
Ole! 6 0 10 9 10
GoPunk 0 0 0 0 0
X-Dan 0 0 0 0 0
RetroB 1 1 30 60 60
You can use lag to get the previous event_date and calculate the difference with the current event_date. Then select rows where the difference is > reminder_days by vendor. Just aggregate the final result, to know how often a vendor was late.
with prev as
(select lag(event_date) over(partition by vendor_id order by event_date) prevdt
, t.* from t_vendor_events t)
select v.vendor_id, v.vendor_name, event_date - nvl(prevdt, event_date) diff
from prev p
join t_vendors v on p.vendor_id = v.vendor_id
where event_date - nvl(prevdt, event_date) > v.reminder_days
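The aggregation step described above could look roughly like this. It is a minimal sketch, assuming SQLite 3.25 or later (through Python's sqlite3 module) instead of Oracle; the second vendor 'no-events' is a made-up row to show a vendor without events, and gaps are measured in whole calendar days between consecutive events only (time elapsed since the last event is not considered).

```python
import sqlite3

def vendor_lateness():
    """Per vendor: event count, overdue count, reminder_days, average and maximum gap."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t_vendors (vendor_id INTEGER, vendor_name TEXT, reminder_days INTEGER);
        CREATE TABLE t_vendor_events (vendor_event_id INTEGER, vendor_id INTEGER,
                                      description TEXT, event_date TEXT);
        INSERT INTO t_vendors VALUES (12, 'sanity-test', 7);
        INSERT INTO t_vendors VALUES (13, 'no-events', 10);  -- hypothetical vendor
        INSERT INTO t_vendor_events VALUES (10015, 12, 'one',   '2015-11-09 21:22:55');
        INSERT INTO t_vendor_events VALUES (10016, 12, 'two',   '2015-11-16 21:23:55');
        INSERT INTO t_vendor_events VALUES (10017, 12, 'three', '2015-11-30 21:24:55');
        INSERT INTO t_vendor_events VALUES (10018, 12, 'four',  '2015-12-01 21:25:55');
    """)
    sql = """
        WITH gaps AS (
            SELECT vendor_id,
                   julianday(date(event_date))
                   - julianday(date(LAG(event_date) OVER (
                         PARTITION BY vendor_id ORDER BY event_date))) AS gap_days
            FROM t_vendor_events
        )
        SELECT v.vendor_name,
               COUNT(g.vendor_id) AS event_count,
               SUM(CASE WHEN g.gap_days > v.reminder_days THEN 1 ELSE 0 END) AS overdue_count,
               v.reminder_days,
               AVG(g.gap_days) AS avg_gap,
               MAX(g.gap_days) AS max_gap
        FROM t_vendors v
        LEFT JOIN gaps g ON g.vendor_id = v.vendor_id   -- keeps vendors without events
        GROUP BY v.vendor_id, v.vendor_name, v.reminder_days
        ORDER BY v.vendor_id
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows

report = vendor_lateness()
```

The first event of each vendor has no predecessor, so its gap is NULL and is ignored by AVG and MAX. A vendor without events gets counts of 0 and NULL averages, which can be wrapped in COALESCE if zeros are preferred as in the question's sample output.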

SQL query for all the days of a month

I have the following table: RENTAL(book_date, copy_id, member_id, title_id, act_ret_date, exp_ret_date), where book_date is the day the book was booked. I need to write a query that shows the number of books booked for every day of the month (1-30, 1-29 or 1-31, depending on the month).
I currently know how to show the number of books rented on the days that are in the table:
select count(book_date), to_char(book_date,'DD')
from rental
group by to_char(book_date,'DD');
My questions are:
How do I show the rest of the days (say I have no books rented on the 19th or 20th, or on several days) with the number 0?
How do I show only the days of the current month (28, 29, 30 or 31 days, depending on the month and year)? I am lost. This must be done using only a SQL query, no PL/SQL or anything else.
The following query returns all days in the current month. In your case, you can replace SYSDATE with your date column and join against this query to get the counts for a given month:
SELECT DT
FROM(
SELECT TRUNC (last_day(SYSDATE) - ROWNUM) dt
FROM DUAL CONNECT BY ROWNUM < 32
)
where DT >= trunc(sysdate,'mm')
The answer is to create a table like this:
create table yearsmonthsdays (year varchar(4), month varchar(2), day varchar(2));
Use any language you wish, e.g. iterate in Java with Calendar.getInstance().getActualMaximum(Calendar.DAY_OF_MONTH) to get the last day of the month for as many years and months as you like, and fill that table with the year, month and days from 1 to the last day of each month.
You'd get something like:
insert into yearsmonthsdays values ('1995','02','01');
insert into yearsmonthsdays values ('1995','02','02');
...
insert into yearsmonthsdays values ('1995','02','28'); /* non-leap year */
...
insert into yearsmonthsdays values ('1996','02','01');
insert into yearsmonthsdays values ('1996','02','02');
...
insert into yearsmonthsdays values ('1996','02','28');
insert into yearsmonthsdays values ('1996','02','29'); /* leap year */
...
and so on.
Once you have this table, your work is almost finished. Do a left outer join from this table to your table, matching on year, month and day; for days with no matching rows the count will be zero, as you wish. Without using programming, this is your best bet.
In Oracle, you can query from dual and use the CONNECT BY LEVEL syntax to generate a series of rows (in your case, dates). From there on, it's just a matter of deciding which dates you want to display (in my example I used all the dates from 2014) and joining on your table:
SELECT all_date, COALESCE (cnt, 0)
FROM (SELECT to_date('01/01/2014', 'dd/mm/yyyy') + rownum - 1 AS all_date
FROM dual
CONNECT BY LEVEL <= 365) d
LEFT JOIN (SELECT TRUNC(book_date) AS book_day, COUNT(book_date) AS cnt
FROM rental
GROUP BY TRUNC(book_date)) r ON d.all_date = r.book_day
There's no need to get ROWNUM involved ... you can just use LEVEL in the CONNECT BY:
WITH d1 AS (
SELECT TRUNC(SYSDATE, 'MONTH') - 1 + LEVEL AS book_date
FROM dual
CONNECT BY TRUNC(SYSDATE, 'MONTH') - 1 + LEVEL <= LAST_DAY(SYSDATE)
)
SELECT TRUNC(d1.book_date), COUNT(r.book_date)
FROM d1 LEFT JOIN rental r
ON TRUNC(d1.book_date) = TRUNC(r.book_date)
GROUP BY TRUNC(d1.book_date);
Simply replace SYSDATE with a date in the month you're targeting for results.
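The same "generate the calendar, then left join" pattern can be tried without an Oracle instance. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module), where a recursive CTE replaces CONNECT BY LEVEL; the rental rows and the function name are made up for illustration.

```python
import sqlite3

def bookings_per_day(any_day_in_month: str):
    """Return (day, bookings) for every day of the month containing `any_day_in_month`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rental (book_date TEXT, copy_id INTEGER)")
    # Hypothetical sample data: two bookings on 3 Feb 2016 and one on 29 Feb 2016.
    conn.executemany(
        "INSERT INTO rental VALUES (?, ?)",
        [("2016-02-03", 1), ("2016-02-03", 2), ("2016-02-29", 3)],
    )
    sql = """
        WITH RECURSIVE days(d) AS (
            SELECT date(:day, 'start of month')
            UNION ALL
            SELECT date(d, '+1 day') FROM days
            WHERE d < date(:day, 'start of month', '+1 month', '-1 day')  -- last day of month
        )
        SELECT days.d, COUNT(r.book_date)      -- COUNT of a column yields 0 for unmatched days
        FROM days
        LEFT JOIN rental r ON date(r.book_date) = days.d
        GROUP BY days.d
        ORDER BY days.d
    """
    rows = conn.execute(sql, {"day": any_day_in_month}).fetchall()
    conn.close()
    return rows

february = bookings_per_day("2016-02-10")
```

Since the upper bound is computed from the month itself, the number of rows follows the calendar (29 here, because 2016 is a leap year) without hard-coding 28 to 31.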
All days of the month based on current date
select trunc(sysdate) - (to_number(to_char(sysdate,'DD')) - 1)+level-1 x from dual connect by level <= TO_CHAR(LAST_DAY(sysdate),'DD')
This worked for me:
SELECT DT
FROM (SELECT TRUNC(LAST_DAY(SYSDATE) - (CASE WHEN ROWNUM=1 THEN 0 ELSE ROWNUM-1 END)) DT
FROM DUAL
CONNECT BY ROWNUM <= 32)
WHERE DT >= TRUNC(SYSDATE, 'MM')
In Oracle SQL the query must look like this to not miss the last day of month:
SELECT DT
FROM(
SELECT trunc(add_months(sysdate, 1),'MM')- ROWNUM dt
FROM DUAL CONNECT BY ROWNUM < 32
)
where DT >= trunc(sysdate,'mm')

How to write a database view that expands data into multiple rows?

I have a database table that contains collection data for product collected from a supplier and I need to produce an estimate of month-to-date production figures for that supplier using an Oracle SQL query. Each day can have multiple collections, and each collection can contain product produced across multiple days.
Here's an example of the raw collection data:
Date Volume ColectionNumber ProductionDays
2011-08-22 500 1 2
2011-08-22 200 2 2
2011-08-20 600 1 2
Creating a month-to-date estimate is tricky because the first day of the month may have a collection for two days worth of production. Only a portion of that collected volume is actually attributable to the current month.
How can I write a query to produce this estimate?
My gut feeling is that I should be able to create a database view that transforms the raw data into estimated daily production figures by summing collections on the same day and distributing collection volumes across the number of days they were produced on. This would allow me to write a simple query to find the month-to-date production figure.
Here's what the above collection data would look like after being transformed into estimated daily production figures:
Date VolumeEstimate
2011-08-22 350
2011-08-21 350
2011-08-20 300
2011-08-19 300
Am I on the right track? If so, how can this be implemented? I have absolutely no idea how to do this type of transformation in SQL. If not, what is a better approach?
Note: I cannot do this calculation in application code since that would require a significant code change which we can't afford.
Try this:
CREATE TABLE TableA (ProdDate DATE, Volume NUMBER, CollectionNumber NUMBER, ProductionDays NUMBER);
INSERT INTO TableA VALUES (TO_DATE ('20110822', 'YYYYMMDD'), 500, 1, 2);
INSERT INTO TableA VALUES (TO_DATE ('20110822', 'YYYYMMDD'), 200, 2, 2);
INSERT INTO TableA VALUES (TO_DATE ('20110820', 'YYYYMMDD'), 600, 1, 2);
COMMIT;
CREATE VIEW DailyProdVolEst AS
SELECT DateList.TheDate, SUM (DateRangeSums.DailySum) VolumeEstimate FROM
(
SELECT ProdStart, ProdEnd, SUM (DailyProduction) DailySum
FROM
(
SELECT (ProdDate - ProductionDays + 1) ProdStart, ProdDate ProdEnd, CollectionNumber, VolumeSum/ProductionDays DailyProduction
FROM
(
Select ProdDate, CollectionNumber, ProductionDays, Sum (Volume) VolumeSum FROM TableA
GROUP BY ProdDate, CollectionNumber, ProductionDays
)
)
GROUP BY ProdStart, ProdEnd
) DateRangeSums,
(
SELECT A.MinD + MyList.L TheDate FROM
(SELECT MIN (ProdDate - ProductionDays + 1) MinD FROM TableA) A,
(SELECT LEVEL - 1 L FROM DUAL CONNECT BY LEVEL <= (SELECT Max (ProdDate) - MIN (ProdDate - ProductionDays + 1) + 1 FROM TableA)) MyList
) DateList
WHERE DateList.TheDate BETWEEN DateRangeSums.ProdStart AND DateRangeSums.ProdEnd
GROUP BY DateList.TheDate;
The view DailyProdVolEst gives you dynamically the result you described... though some "constraints" apply:
the combination of ProdDate and CollectionNumber should be unique.
the ProductionDays need to be > 0 for all rows
EDIT - as per comment requested:
How this query works:
It finds out what the smallest + biggest date in the table are, then builds rows with each row being a date in that range (DateList)... this is matched up against a list of rows containing the daily sum for unique combinations of ProdDate Start/End (DateRangeSums) and sums it up on the date level.
What do SUM (DateRangeSums.DailySum) and SUM (DailyProduction) do ?
Both sum things up: SUM (DateRangeSums.DailySum) adds up the daily values where date ranges partially overlap, and SUM (DailyProduction) adds up the values within one date range when it has more than one CollectionNumber. Without the SUMs the GROUP BYs wouldn't be needed.
I think a UNION query will do the trick for you. You aren't using the CollectionNumber field in your example, so I excluded it from the sample below.
Something similar to the below query should work (Disclaimer: My oracle db isn't accessible to me at the moment):
SELECT Date, SUM(Volume) VolumeEstimate
FROM
(SELECT Date, SUM(Volume / ProductionDays) Volume
FROM [Table]
GROUP BY Date
UNION ALL
SELECT (Date - 1) Date, SUM(Volume / 2)
FROM [Table]
WHERE ProductionDays = 2
GROUP BY Date - 1)
GROUP BY Date
It sounds like what you want to do is sum up by day and then use a tally table to divide out the results.
Here's a runnable example with your data in T-SQL dialect:
DECLARE @tbl AS TABLE (
[Date] DATE
, Volume INT
, ColectionNumber INT
, ProductionDays INT);
INSERT INTO @tbl
VALUES ('2011-08-22', 500, 1, 2)
, ('2011-08-22', 200, 2, 2)
, ('2011-08-20', 600, 1, 2);
WITH Numbers AS (SELECT 1 AS N UNION ALL SELECT 2 AS N)
,AssignedVolumes AS (
SELECT t.*
, t.Volume / t.ProductionDays AS PerDay
, DATEADD(d, 1 - n.N, t.[Date]) AS AssignedDate
FROM @tbl AS t
INNER JOIN Numbers AS n
ON n.N <= t.ProductionDays
)
SELECT AssignedDate
, SUM(PerDay)
FROM AssignedVolumes
GROUP BY AssignedDate;
I dummied up a simple numbers table with only 1 and 2 in it to perform the pivot. Typically you'll have a table with a million numbers in sequence.
For Oracle, the only thing you should need to change would be the DATEADD.
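For readers who want to try the tally-table approach without SQL Server or Oracle, here is a minimal sketch, assuming SQLite (through Python's sqlite3 module). The table name collections, the column names and the function name are made up for illustration, and the numbers list is generated by a recursive CTE sized to the largest ProductionDays value rather than hard-coded.

```python
import sqlite3

def daily_volume_estimates():
    """Spread each collection evenly over its production days and sum per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE collections (prod_date TEXT, volume INTEGER, "
        "collection_number INTEGER, production_days INTEGER)"
    )
    conn.executemany(
        "INSERT INTO collections VALUES (?, ?, ?, ?)",
        [("2011-08-22", 500, 1, 2), ("2011-08-22", 200, 2, 2), ("2011-08-20", 600, 1, 2)],
    )
    sql = """
        WITH RECURSIVE numbers(n) AS (
            SELECT 0
            UNION ALL
            SELECT n + 1 FROM numbers
            WHERE n + 1 < (SELECT MAX(production_days) FROM collections)
        )
        SELECT date(julianday(c.prod_date) - n.n) AS assigned_date,   -- n days before collection
               SUM(c.volume * 1.0 / c.production_days) AS volume_estimate
        FROM collections c
        JOIN numbers n ON n.n < c.production_days   -- one row per production day
        GROUP BY assigned_date
        ORDER BY assigned_date DESC
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows

estimates = daily_volume_estimates()
```

On the sample data this reproduces the estimated daily figures from the question (350, 350, 300, 300). A month-to-date total is then a filter on assigned_date plus a SUM, which is what makes wrapping this in a view attractive.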