Oracle SQL Special Period Format - sql

I have a special fiscal period in format YYYYMMM, for example
Feb of 2015 is 2015002
Nov of 2014 is 2014011
I need to subtract months from a period: 2 months before 2015002 is 2014012. But I can't simply do
SELECT '2015001' - 2 FROM DUAL
How can I do that?

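You should first convert it to a date, then subtract months and convert back to the format you need. The quoted "0" in the format mask matches the literal zero that sits between the year and the two-digit month.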
with x(y) as (
select '2015002' from dual
)
select y,
to_date(y,'YYYY"0"MM'),
add_months(to_date(y,'YYYY"0"MM'),-2),
to_char(add_months(to_date(y,'YYYY"0"MM'),-2),'YYYY"0"MM')
from x
Results:
| Y | TO_DATE(Y,'YYYY"0"MM') | ADD_MONTHS(TO_DATE(Y,'YYYY"0"MM'),-2) | TO_CHAR(ADD_MONTHS(TO_DATE(Y,'YYYY"0"MM'),-2),'YYYY"0"MM') |
|---------|----------------------------|---------------------------------------|------------------------------------------------------------|
| 2015002 | February, 01 2015 00:00:00 | December, 01 2014 00:00:00 | 2014012 |
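The same period arithmetic can be sketched outside the database. This is a minimal illustration in Python, not part of the original answer; the function name shift_period is hypothetical, and it assumes the period is always 4 year digits followed by a 3-digit, zero-padded month.

```python
def shift_period(period: str, months: int) -> str:
    """Shift a fiscal period written as YYYYMMM (e.g. '2015002') by some months."""
    year, month = int(period[:4]), int(period[4:])
    # Count months from year 0 so that year boundaries need no special case.
    index = year * 12 + (month - 1) + months
    new_year, new_month0 = divmod(index, 12)
    return f"{new_year:04d}{new_month0 + 1:03d}"


print(shift_period("2015002", -2))  # 2014012
```

Working on a single month index is the same idea as ADD_MONTHS: the year rollover falls out of the integer division instead of being handled by hand.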

Related

Average of last 3 months (SQL vertica)

I need to find the average days past due for the last 3 months for each client. Not as a rolling/moving average, but one time number, always calculating the last 3 months, no matter if the data changes.
For example now the last data I have is from Sept 2022, so I need the average of Sept 2022, August 2022 and July 2022. But if the data changes and now I have October 2022, then I would need average of Oct, Sept, August and so on.
I tried this, but it calculates the wrong result:
CREATE TABLE AVERAGE_dpd
AS (
SELECT "SUM_WEIGHTED_AVG_PERMONTH"."NAME",
AVG("SUM_WEIGHTED_AVG_PERMONTH"."SUM")
OVER (PARTITION BY "SUM_WEIGHTED_AVG_PERMONTH"."NAME"
order by MONTH ("SUM_WEIGHTED_AVG_PERMONTH"."LAST DAY OF MONTH_NETDUEDATE") desc
rows between 2 preceding and CURRENT ROW)
as AVG3Months
FROM "SUM_WEIGHTED_AVG_PERMONTH");
Thank you so much for your help!
I assume you are expecting this result:
client_name | avg_last_3mth
-------------+--------------
Client 01 | 16.76
Client 02 | 5.75
Client 03 | -13.95
So I assume you have something like this as input data (and this is how we usually like the data to accompany the question):
month_begin | client_name | dpd
------------+-------------+----
2022-01-05 | Client 01 | 12
2022-01-06 | Client 01 | 14
2022-01-07 | Client 01 | 18
2022-01-08 | Client 01 | 17
2022-01-05 | Client 02 | 12
2022-01-06 | Client 02 | 14
2022-01-07 | Client 02 | 18
2022-01-08 | Client 02 | 17
2022-01-05 | Client 03 | 12
2022-01-06 | Client 03 | 14
2022-01-07 | Client 03 | 18
2022-01-08 | Client 03 | 17
With this input data, you probably want the rows with month_begin of the first of the current month (TRUNC(CURRENT_DATE,'MONTH')), plus the two previous months. And this is what I do, then I obviously group by the client name:
WITH
-- input data I made up, don't use in query ..
dpd(month_begin,client_name,dpd) AS (
SELECT DATE '2022-05-01','Client 01',12
UNION ALL SELECT DATE '2022-06-01','Client 01',14
UNION ALL SELECT DATE '2022-07-01','Client 01',18
UNION ALL SELECT DATE '2022-08-01','Client 01',17
UNION ALL SELECT DATE '2022-05-01','Client 02', 2
UNION ALL SELECT DATE '2022-06-01','Client 02', 4
UNION ALL SELECT DATE '2022-07-01','Client 02', 8
UNION ALL SELECT DATE '2022-08-01','Client 02', 7
UNION ALL SELECT DATE '2022-05-01','Client 03',22
UNION ALL SELECT DATE '2022-06-01','Client 03',24
UNION ALL SELECT DATE '2022-07-01','Client 03',28
UNION ALL SELECT DATE '2022-08-01','Client 03',27
)
-- real query starts here ..
SELECT
client_name
, AVG(dpd)::NUMERIC(5,2) AS avg_last_3mth
FROM dpd
WHERE month_begin >= TRUNC(CURRENT_DATE,'MONTH') - '2 MONTHS'::INTERVAL YEAR TO MONTH
GROUP BY
client_name;
-- out client_name | avg_last_3mth
-- out -------------+---------------
-- out Client 02 | 6.33
-- out Client 01 | 16.33
-- out Client 03 | 26.33
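Note that the query above anchors the window on CURRENT_DATE, while the question asks for the last 3 months relative to the newest month present in the data. A minimal Python sketch of that variant follows, using the made-up rows from the answer; the function name avg_last_3_months is hypothetical, and it assumes one row per client and month.

```python
from collections import defaultdict
from datetime import date


def avg_last_3_months(rows):
    """rows: iterable of (month_begin, client_name, dpd) tuples."""
    rows = list(rows)
    latest = max(month_begin for month_begin, _, _ in rows)
    # Month index of the earliest month to keep (latest month minus two).
    cutoff = latest.year * 12 + latest.month - 2
    values = defaultdict(list)
    for month_begin, client, dpd in rows:
        if month_begin.year * 12 + month_begin.month >= cutoff:
            values[client].append(dpd)
    return {client: round(sum(v) / len(v), 2) for client, v in values.items()}


rows = [
    (date(2022, 5, 1), "Client 01", 12), (date(2022, 6, 1), "Client 01", 14),
    (date(2022, 7, 1), "Client 01", 18), (date(2022, 8, 1), "Client 01", 17),
    (date(2022, 5, 1), "Client 02", 2), (date(2022, 6, 1), "Client 02", 4),
    (date(2022, 7, 1), "Client 02", 8), (date(2022, 8, 1), "Client 02", 7),
    (date(2022, 5, 1), "Client 03", 22), (date(2022, 6, 1), "Client 03", 24),
    (date(2022, 7, 1), "Client 03", 28), (date(2022, 8, 1), "Client 03", 27),
]
print(avg_last_3_months(rows))
```

In SQL the equivalent change would be to compare against the maximum month_begin in the table rather than against the current date.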

Calculate running sum of previous 3 months from monthly aggregated data

I have a dataset that I have aggregated at monthly level. The next part needs me to take, for every block of 3 months, the sum of the data at monthly level.
So essentially my input data (after aggregated to monthly level) looks like:
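| month | year | status | count_id |
|-------|------|--------|----------|
| 08    | 2021 | stat_1 | 1        |
| 09    | 2021 | stat_1 | 3        |
| 10    | 2021 | stat_1 | 5        |
| 11    | 2021 | stat_1 | 10       |
| 12    | 2021 | stat_1 | 10       |
| 01    | 2022 | stat_1 | 5        |
| 02    | 2022 | stat_1 | 20       |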
and then my output data to look like:
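| month | year | status | count_id | 3m_sum |
|-------|------|--------|----------|--------|
| 08    | 2021 | stat_1 | 1        | 1      |
| 09    | 2021 | stat_1 | 3        | 4      |
| 10    | 2021 | stat_1 | 5        | 8      |
| 11    | 2021 | stat_1 | 10       | 18     |
| 12    | 2021 | stat_1 | 10       | 25     |
| 01    | 2022 | stat_1 | 5        | 25     |
| 02    | 2022 | stat_1 | 20       | 35     |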
i.e. 3m_sum for Feb = Feb + Jan + Dec. I tried to do this using a self join and wrote a query along the lines of:
WITH CTE AS(
SELECT date_part('month',date_col) as month
,date_part('year',date_col) as year
,status
,count(distinct id) as count_id
FROM (date_col, status, transaction_id) as a
)
SELECT a.month, a.year, a.status, sum(b.count_id) as 3m_sum
from cte as a
left join cte as b on a.status = b.status
and b.month >= a.month - 2 and b.month <= a.month
group by 1,2,3
This query NEARLY works. Where it falls apart is in Jan and Feb. My data runs from August 2021 to April 2022. This means the value for Jan should be Nov + Dec + Jan, and the value for Feb should be Dec + Jan + Feb.
As I am joining on the MONTH number alone, the months Aug - Nov are all treated as being greater than the month of Jan/Feb, so the query doesn't compute the correct sum.
How can I adjust this bit to give the correct sum?
I did think of using a LAG function, but (even though I'm 99% sure a month won't ever be missed) I can't guarantee we will never have a month with 0 values, in which case my LAG function would sum the wrong rows.
I also tried doing the same join at individual date level (without aggregating in my nested query), but this gave vastly different numbers: I want the sum of the aggregated counts, and I think summing the individual rows duplicated a lot of the values that my COUNT DISTINCT is there to remove.
You can use a SUM with a window frame of 2 PRECEDING. To ensure you don't miss rows, use a calendar table and left-join all the results to it.
SELECT *,
SUM(a.count_id) OVER (ORDER BY c.year, c.month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM Calendar c
LEFT JOIN a ON a.year = c.year AND a.month = c.month
WHERE c.year >= 2021 AND c.year <= 2022;
You could also use LAG but you would need it twice.
It should be @Charlieface's answer. The only difference is that I get one result that differs from your expected result table (for 10/2021, 1 + 3 + 5 is 9, not 8):
WITH
-- your input - and I avoid keywords like "MONTH" or "YEAR"
-- and also identifiers starting with digits are forbidden -
indata(mm,yy,status,count_id,sum_3m) AS (
SELECT 08,2021,'stat_1',1,1
UNION ALL SELECT 09,2021,'stat_1',3,4
UNION ALL SELECT 10,2021,'stat_1',5,8
UNION ALL SELECT 11,2021,'stat_1',10,18
UNION ALL SELECT 12,2021,'stat_1',10,25
UNION ALL SELECT 01,2022,'stat_1',5,25
UNION ALL SELECT 02,2022,'stat_1',20,35
)
SELECT
*
, SUM(count_id) OVER(
ORDER BY yy,mm
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS sum_3m_calc
FROM indata;
-- out mm | yy | status | count_id | sum_3m | sum_3m_calc
-- out ----+------+--------+----------+--------+-------------
-- out 8 | 2021 | stat_1 | 1 | 1 | 1
-- out 9 | 2021 | stat_1 | 3 | 4 | 4
-- out 10 | 2021 | stat_1 | 5 | 8 | 9
-- out 11 | 2021 | stat_1 | 10 | 18 | 18
-- out 12 | 2021 | stat_1 | 10 | 25 | 25
-- out 1 | 2022 | stat_1 | 5 | 25 | 25
-- out 2 | 2022 | stat_1 | 20 | 35 | 35
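The two ideas in these answers (a 3-row window, plus a calendar so that missing months count as 0) can be sketched in Python as well. This is an illustrative sketch only; the function name rolling_3m_sum is hypothetical. It turns each (year, month) pair into a single month index, which also removes the year-boundary problem from the self join in the question.

```python
def rolling_3m_sum(rows):
    """rows: iterable of (month, year, count_id); returns {(year, month): 3-month sum}."""
    counts = {year * 12 + (month - 1): count for month, year, count in rows}
    first, last = min(counts), max(counts)
    result = {}
    for idx in range(first, last + 1):
        # Months with no row contribute 0, so gaps cannot shift the window.
        window = sum(counts.get(i, 0) for i in range(idx - 2, idx + 1))
        year, month0 = divmod(idx, 12)
        result[(year, month0 + 1)] = window
    return result


rows = [(8, 2021, 1), (9, 2021, 3), (10, 2021, 5), (11, 2021, 10),
        (12, 2021, 10), (1, 2022, 5), (2, 2022, 20)]
print(list(rolling_3m_sum(rows).values()))  # [1, 4, 9, 18, 25, 25, 35]
```

The same month index (year * 12 + month) could be used in the original self join condition in place of the bare month number.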

check date ranges with other date ranges

I have the table Distractions with the following columns:
id startTime endTime (possibly null)
I also have two parameters that define a period: pstart and pend.
I have to find all distractions within the period and count the hours.
For example, we have:
Distractions:
`id` `startTime` `endTime`
1 01.01.2014 00:00 03.01.2014 00:00
2 25.03.2014 00:00 02.04.2014 00:00
3 27.03.2014 00:00 null
The columns contain a time component, but it is not used.
The period is pstart = 01.01.2014 and pend = 31.03.2014.
For the example above the result is:
for id = 1 - 72 hours
for id = 2 - 168 hours (7 days from 25 to 31 - end of period)
for id = 3 - 120 hours (5 days from 27 to 31 - the distraction is not completed, so the end of the period is used)
The sum is 360.
My code:
select
sum ((ds."endTime" - ds."startTime")*24) as hoursCount
from "Distractions" ds
--where ds."startTime" >= :pstart and ds."endTime" <= :pend
-- I don't know how to create where condition properly.
You'll have to handle date ranges that fall outside the input range, and also account for starttime and endtime being null.
This where clause keeps only the rows that overlap the period. I have substituted a null starttime with a very early date and a null endtime with a date far in the future.
where coalesce(endtime,date'9999-12-31') >= :pstart
and coalesce(starttime,date'0000-01-01') <= :pend
Once you have filtered the records, adjust the date values so that anything starting before :pstart is moved forward to :pstart, and anything ending after :pend is moved back to :pend. Subtracting the two adjusted dates gives the number of days. There is a catch, though: since the time is 00:00:00, the subtraction does not count the final day, so add 1 to include it.
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table myt(
id number,
starttime date,
endtime date
);
insert into myt values( 1 ,date'2014-01-01', date'2014-01-03');
insert into myt values( 2 ,date'2014-03-25', date'2014-04-02');
insert into myt values( 3 ,date'2014-03-27', null);
insert into myt values( 4 ,null, date'2013-04-02');
insert into myt values( 5 ,date'2015-03-25', date'2015-04-02');
insert into myt values( 6 ,date'2013-12-25', date'2014-04-09');
insert into myt values( 7 ,date'2013-12-26', date'2014-01-09');
Query 1:
select id,
case when coalesce(starttime,date'0000-01-01') < date'2014-01-01'
then date'2014-01-01'
else starttime
end adj_starttime,
case when coalesce(endtime,date'9999-12-31') > date'2014-03-31'
then date'2014-03-31'
else endtime
end adj_endtime,
(case when coalesce(endtime,date'9999-12-31') > date'2014-03-31'
then date'2014-03-31'
else endtime
end -
case when coalesce(starttime,date'0000-01-01') < date'2014-01-01'
then date'2014-01-01'
else starttime
end
+ 1) * 24 hoursCount
from myt
where coalesce(endtime,date'9999-12-31') >= date'2014-01-01'
and coalesce(starttime,date'0000-01-01') <= date'2014-03-31'
order by 1
Results:
| ID | ADJ_STARTTIME | ADJ_ENDTIME | HOURSCOUNT |
|----|--------------------------------|--------------------------------|------------|
| 1 | January, 01 2014 00:00:00+0000 | January, 03 2014 00:00:00+0000 | 72 |
| 2 | March, 25 2014 00:00:00+0000 | March, 31 2014 00:00:00+0000 | 168 |
| 3 | March, 27 2014 00:00:00+0000 | March, 31 2014 00:00:00+0000 | 120 |
| 6 | January, 01 2014 00:00:00+0000 | March, 31 2014 00:00:00+0000 | 2160 |
| 7 | January, 01 2014 00:00:00+0000 | January, 09 2014 00:00:00+0000 | 216 |
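The clipping logic in the query can be checked with a small Python sketch. It is illustrative only; the function name overlap_hours is hypothetical, and it follows the answer's convention of counting both the first and the last day (the + 1).

```python
from datetime import date


def overlap_hours(start, end, pstart, pend):
    """Hours of [start, end] that fall inside [pstart, pend], counting both end days."""
    # A missing bound is treated as open-ended, so it is clipped to the period.
    lo = pstart if start is None else max(start, pstart)
    hi = pend if end is None else min(end, pend)
    if lo > hi:
        return 0  # the range lies entirely outside the period
    return ((hi - lo).days + 1) * 24


pstart, pend = date(2014, 1, 1), date(2014, 3, 31)
distractions = [
    (date(2014, 1, 1), date(2014, 1, 3)),
    (date(2014, 3, 25), date(2014, 4, 2)),
    (date(2014, 3, 27), None),
]
print(sum(overlap_hours(s, e, pstart, pend) for s, e in distractions))  # 360
```

Taking the later of the two starts and the earlier of the two ends is what the pair of CASE expressions does; in Oracle the same thing could be written with GREATEST and LEAST once the nulls are coalesced.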

Sql Server 2012 - Group data by varying timeslots

I have some data to analyze which is at half hour granularity, but would like to group it by 2, 3, 6, 12 hour and 2 days and 1 week to make some more meaningful comparisons.
|DateTime | Value |
|01 Jan 2013 00:00 | 1 |
|01 Jan 2013 00:30 | 1 |
|01 Jan 2013 01:00 | 1 |
|01 Jan 2013 01:30 | 1 |
|01 Jan 2013 02:00 | 2 |
|01 Jan 2013 02:30 | 2 |
|01 Jan 2013 03:00 | 2 |
|01 Jan 2013 03:30 | 2 |
Eg. 2 hour grouped result will be
|DateTime | Value |
|01 Jan 2013 00:00 | 4 |
|01 Jan 2013 02:00 | 8 |
To get the 2 hourly grouped result, I thought of this code -
CASE
WHEN DatePart(HOUR,DateTime)%2 = 0 THEN
CAST(CAST(DatePart(HOUR,DateTime) AS varchar) + '':00'' AS TIME)
ELSE
CAST(CAST(DATEPART(HOUR,DateTime) As Int) - 1 AS varchar) + '':00'' END Time
This seems to work alright, but I can't think of a way to generalize it to 3, 6, 12 hours.
I can for 6, 12 hours just use case statements and achieve result but is there any way to generalize so that I can achieve 2,3,6,12 hour granularity and also 2 days and a week level granularity? By generalize, I mean I could pass on a variable with desired granularity to the same query rather than having to constitute different queries with different case statements.
Is this possible? Please provide some pointers.
Thanks a lot!
I think you can use
Declare @Resolution int = 3 -- resolution in hours
Select
DateAdd(Hour,
DateDiff(Hour, 0, [DateTime]) / @Resolution * @Resolution, -- integer arithmetic
0) as bucket,
Sum([Value]) as total
From
YourTable -- placeholder table name
Group By
DateAdd(Hour,
DateDiff(Hour, 0, [DateTime]) / @Resolution * @Resolution, -- integer arithmetic
0)
Order By
bucket
This calculates the number of hours since a known fixed date, rounds down to the resolution size you're interested in, then adds them back on to the fixed date.
It will leave out buckets that contain no data, though.
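The round-down-to-resolution arithmetic can be sketched in Python to make the steps visible. This is a minimal illustration, not the answer's code; bucket_sums and EPOCH are hypothetical names, and 1900-01-01 is assumed as the fixed reference date, which is what 0 converts to as a SQL Server datetime.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1900, 1, 1)  # fixed reference date, playing the role of 0 in the T-SQL


def bucket_sums(rows, resolution_hours):
    """rows: iterable of (timestamp, value); returns {bucket_start: sum of values}."""
    sums = defaultdict(int)
    for ts, value in rows:
        # Whole hours since the reference date, rounded down to the bucket size.
        hours = int((ts - EPOCH).total_seconds() // 3600)
        bucket = EPOCH + timedelta(hours=hours // resolution_hours * resolution_hours)
        sums[bucket] += value
    return dict(sorted(sums.items()))


rows = [(datetime(2013, 1, 1, h, m), v)
        for h, m, v in [(0, 0, 1), (0, 30, 1), (1, 0, 1), (1, 30, 1),
                        (2, 0, 2), (2, 30, 2), (3, 0, 2), (3, 30, 2)]]
for bucket, total in bucket_sums(rows, 2).items():
    print(bucket, total)
```

Passing 48 or 168 as the resolution gives 2-day and 1-week buckets with the same code, with bucket boundaries aligned to the reference date.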

datetime order by

I have a website that contains news data, and I want to show the most recently updated data first, ordered by time.
I have a field column_time that contains 8 rows. Why, when I use this SQL:
select * from table_name order by waktu desc
is the result this:
28 Jan 2013 | 15:36:47
28 Jan 2013 | 15:30:48
27 Jan 2013 | 21:38:36
27 Jan 2013 | 21:38:32
27 Jan 2013 | 21:38:29
11 Feb 2013 | 20:41:05
11 Feb 2013 | 20:40:37
11 Feb 2013 | 20:36:11
and not this?
11 Feb 2013 | 20:41:05
11 Feb 2013 | 20:40:37
11 Feb 2013 | 20:36:11
28 Jan 2013 | 15:36:47
28 Jan 2013 | 15:30:48
27 Jan 2013 | 21:38:36
27 Jan 2013 | 21:38:32
27 Jan 2013 | 21:38:29
The column is sorted like character data, type varchar or text.
You probably want to use timestamp or datetime as data type, depending on your secret RDBMS.
Try this to order by the latest record (date and time, not just time) by converting to DATETIME:
Select * from table_name
Order by convert(Datetime,replace(your_column,' | ',' ')) desc
Or, if you just need to order by time regardless of date, use the following (you can also convert to TIME if you are on SQL Server 2008 or later):
Order by convert(Datetime, substring(your_column,
charindex('|',your_column,1)+2,len(your_column))) desc
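The difference between sorting the text and sorting the parsed value can be shown with a short Python sketch. It is illustrative only; parse_stamp is a hypothetical helper, and it assumes the column holds strings shaped like '28 Jan 2013 | 15:36:47' with English month abbreviations.

```python
from datetime import datetime


def parse_stamp(text):
    """Parse a value such as '28 Jan 2013 | 15:36:47' into a datetime."""
    return datetime.strptime(text, "%d %b %Y | %H:%M:%S")


rows = [
    "28 Jan 2013 | 15:36:47", "28 Jan 2013 | 15:30:48",
    "27 Jan 2013 | 21:38:36", "27 Jan 2013 | 21:38:32",
    "27 Jan 2013 | 21:38:29", "11 Feb 2013 | 20:41:05",
    "11 Feb 2013 | 20:40:37", "11 Feb 2013 | 20:36:11",
]

as_text = sorted(rows, reverse=True)                   # character order: '28 Jan' comes first
as_time = sorted(rows, key=parse_stamp, reverse=True)  # chronological: '11 Feb' comes first
print(as_text[0])
print(as_time[0])
```

Storing the value in a real date/time column avoids the conversion on every query and lets the database sort it correctly without any parsing.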