Week interval query starting on Mondays - sql

FIDDLE
I need to build a JasperReport. What I need to display is the total number of accounts processed, broken down into weekly intervals, with the number of activated and declined accounts.
For the weekly interval query I got thus far:
SELECT *
FROM account_details
WHERE DATE date_opened = DATE_ADD(2014-01-01, INTERVAL(1-DAYOFWEEK(2014-01-01)) +1 DAY)
This seems to be correct, but not POSTGRES correct. It keeps complaining about the 1-DAYOFWEEK. Here is what I will hopefully achieve:
UPDATE
It is pretty ugly, but I don't know of anything better. It does the job, though. I don't know if it can be refactored to at least look better. I also don't know how to handle division by zero at the moment.
SELECT to_char(d.day, 'YYYY/MM/DD - ') || to_char(d.day + 6, 'YYYY/MM/DD') AS Month
, SUM(CASE WHEN LOWER(situation) LIKE '%active%' THEN 1 ELSE 0 END) AS Activated
, SUM(CASE WHEN LOWER(situation) LIKE '%declined%' THEN 1 ELSE 0 END) AS Declined
, SUM(CASE WHEN LOWER(situation) LIKE '%declined%' OR LOWER(situation) LIKE '%active%' THEN 1 ELSE 0 END) AS Total
, to_char( 100.0 *( (SUM(CASE WHEN LOWER(situation) LIKE '%active%' THEN 1 ELSE 0 END)) / (SUM(CASE WHEN LOWER(situation) LIKE '%declined%' OR LOWER(situation) LIKE '%active%' THEN 1 ELSE 0 END))::real) , '99.9') AS percent_activated
, to_char( 100.0 *( (SUM(CASE WHEN LOWER(situation) LIKE '%declined%' THEN 1 ELSE 0 END)) / (SUM(CASE WHEN LOWER(situation) LIKE '%declined%' OR LOWER(situation) LIKE '%active%' THEN 1 ELSE 0 END))::real) , '99.9') AS percent_declined
FROM (
SELECT day::date
FROM generate_series('2014-08-01'::date, '2014-09-14'::date, interval '1 week') day
) d
JOIN account_details a ON a.date_opened >= d.day
AND a.date_opened < d.day + 6
GROUP BY d.day;

SELECT to_char(d.day, 'YYYY/MM/DD" - "')
|| to_char(d.day + 6, 'YYYY/MM/DD') AS week
, count(situation ILIKE '%active%' OR NULL) AS activated
, ...
FROM (
SELECT day::date
FROM generate_series('2014-08-11'::date
, '2014-09-14'::date
, '1 week'::interval) day
) d
LEFT JOIN account_details a ON a.date_opened >= d.day
AND a.date_opened < d.day + 7 -- 7, not 6!
GROUP BY d.day;
Related answers:
Weekly total sums
Calculate working hours between 2 dates in PostgreSQL
Best way to count records by arbitrary time intervals in Rails+Postgres
More about counting specific values:
For absolute performance, is SUM faster or COUNT?
SQL Query to Transpose Column Counts to Row Counts
Aside: You would typically use an enum or a look-up table and just store an ID for situation, not a lengthy text redundantly.
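The two points that matter in the query above (weeks that start on a Monday and cover a half-open range of 7 days, and the open question about division by zero) can be checked outside the database. The following is only a minimal Python sketch, not the answer's method: `week_start` and `weekly_report` are hypothetical helper names, and the substring tests merely mimic the `ILIKE '%active%'` and `ILIKE '%declined%'` matching from the question.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(d):
    """Return the Monday on or before d (date.weekday() is 0 for Monday)."""
    return d - timedelta(days=d.weekday())


def weekly_report(rows, first_monday, weeks):
    """rows: iterable of (date_opened, situation). Returns one dict per week."""
    activated, declined, total = Counter(), Counter(), Counter()
    for opened, situation in rows:
        s = situation.lower()
        monday = week_start(opened)
        if "active" in s:
            activated[monday] += 1
        if "declined" in s:
            declined[monday] += 1
        if "active" in s or "declined" in s:
            total[monday] += 1

    report = []
    for i in range(weeks):
        monday = first_monday + timedelta(weeks=i)
        t = total[monday]
        report.append({
            # Label shows Monday through Sunday; the bucket itself is [Monday, Monday + 7 days).
            "week": f"{monday:%Y/%m/%d} - {monday + timedelta(days=6):%Y/%m/%d}",
            "activated": activated[monday],
            "declined": declined[monday],
            "total": t,
            # Guard against division by zero: empty weeks get None instead of an error.
            "percent_activated": round(100.0 * activated[monday] / t, 1) if t else None,
        })
    return report


rows = [
    (date(2014, 8, 11), "Active"),
    (date(2014, 8, 17), "Declined by user"),  # a Sunday, still in the week of 2014-08-11
    (date(2014, 8, 18), "active"),
]
report = weekly_report(rows, date(2014, 8, 11), 3)
```

In SQL, the same guard is usually written by dividing by `NULLIF(total, 0)`, which yields NULL instead of raising an error when the week has no rows.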

Related

How to combine 2 queries in same table with different group by ? (ORACLE)

I need to calculate On Time Arrival and Departure. Query to get On Time Departure:
SELECT DEPAIRPORT as AIRPORT,
COUNT(case when A.STATUS = 'Scheduled' and
A.ACTUAL_BLOCKOFF is not null then 1 else NULL END) as SCHEDULED,
COUNT(case when ((A.ACTUAL_BLOCKOFF+ interval '7' hour) - (A.SCHEDULED_DEPDT+ interval '7' hour))*24*60 <= '+000000015 00:00:00.000000000' and
A.ACTUAL_BLOCKOFF is not null then 1 else NULL END) as ONTIME
FROM TABLE A GROUP BY DEPAIRPORT
and Query to calculate On Time Arrival:
SELECT COUNT(case when ((A.ACTUAL_BLOCKON + interval '7' hour) - (A.SCHEDULED_ARRDT+ interval '7' hour))*24*60 <= '+000000015 00:00:00.000000000' and
A.ACTUAL_BLOCKON is not null then 1 else NULL END) as ARRONTIME
FROM TABLE A
GROUP BY ARRIVALAIRPORT
How to combine these queries into 1 single query so I can display it like this table:
Name #Schedule #OnTimeDeparture #ArrivalOntime
AIRPORTX 41 35 20
Without sample data and expected output, it is difficult to tell exactly what you want. If you want to combine the two datasets, you can put them in WITH clauses and then join them together (LEFT JOIN or INNER JOIN, depending on the output required for airports that have no arrivals yet).
WITH dep
AS (SELECT depairport AS airport,
count(CASE
WHEN a.status = 'Scheduled'
AND a.actual_blockoff IS NOT NULL THEN 1
END) AS scheduled,
count(CASE
WHEN( ( a.actual_blockoff + interval '7' hour ) - (
a.scheduled_depdt + interval '7' hour ) ) *
24 *
60
<=
'+000000015 00:00:00.000000000'
AND a.actual_blockoff IS NOT NULL THEN 1
END) AS ontime
FROM tablea a
GROUP BY depairport),
arr
AS (SELECT arrivalairport AS airport,
count(CASE
WHEN( ( a.actual_blockon + interval '7' hour ) - (
a.scheduled_arrdt + interval '7' hour ) ) *
24 *
60
<=
'+000000015 00:00:00.000000000'
AND a.actual_blockon IS NOT NULL THEN 1
END) AS arrontime
FROM tablea a
GROUP BY arrivalairport)
SELECT dep.airport AS Name,
dep.scheduled AS "#Schedule",
dep.ontime AS "#OnTimeDeparture",
arr.arrontime AS "#ArrivalOntime"
FROM dep
left join arr -- Or Inner join depending on the expected output.
ON ( dep.airport = arr.airport );
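The shape of this answer (aggregate once per grouping key, then join the two results on the airport) can be tried on a toy table. The sketch below uses Python's `sqlite3` module rather than Oracle, and the `flights` table with its precomputed `dep_ontime` and `arr_ontime` flags is an assumption made only to keep the example short; it is not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights ("
    "dep_airport TEXT, arr_airport TEXT, dep_ontime INTEGER, arr_ontime INTEGER)"
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?)",
    [
        ("AAA", "BBB", 1, 1),
        ("AAA", "CCC", 0, 1),
        ("BBB", "AAA", 1, 0),
    ],
)

# One CTE per grouping key, joined on the airport code afterwards.
combined = conn.execute(
    """
    WITH dep AS (
        SELECT dep_airport AS airport,
               COUNT(*)        AS scheduled,
               SUM(dep_ontime) AS ontime
        FROM flights
        GROUP BY dep_airport
    ),
    arr AS (
        SELECT arr_airport AS airport,
               SUM(arr_ontime) AS arrontime
        FROM flights
        GROUP BY arr_airport
    )
    SELECT dep.airport, dep.scheduled, dep.ontime,
           COALESCE(arr.arrontime, 0) AS arrontime
    FROM dep
    LEFT JOIN arr ON dep.airport = arr.airport
    ORDER BY dep.airport
    """
).fetchall()
```

Airport CCC only appears as a destination, so the LEFT JOIN from `dep` drops it. If such airports should be listed too, a full outer join (or a union of both airport lists as the driving set) would be needed.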
You can use something like this:
select
max(SCHEDULED) as SCHEDULED,
max(ONTIME) as ONTIME,
max(ARRONTIME) as ARRONTIME
from (select
count(case when ... ) over(partition by DEPAIRPORT) as SCHEDULED,
count(case when ... ) over(partition by DEPAIRPORT) as ONTIME,
count(case when ... ) over(partition by ARRIVALAIRPORT) as ARRONTIME
from a );
But I suspect your question is incomplete. You also need a key to join the different flights.

SQL Efficiency on Date Range or Separate Tables

I'm calculating historical amounts from a table by year (e.g. 2015-2016, 2014-2015, etc.). I would like to know whether it's more efficient to do it in one batch or to repeat the query multiple times, each filtered by the required date range.
Thanks in advance!
OPTION 1:
select
id,
sum(case when year(getdate()) - year(txndate) between 5 and 6 then amt else 0 end) as amt_6_5,
...
sum(case when year(getdate()) - year(txndate) between 0 and 1 then amt else 0 end) as amt_1_0
from
mytable
group by
id
OPTION 2:
select
id, sum(amt) as amt_6_5
from
mytable
where
year(getdate()) - year(txndate) between 5 and 6
group by
id
...
select
id, sum(amt) as amt_1_0
from
mytable
where
year(getdate()) - year(txndate) between 0 and 1
group by
id
1. Unless you have resource issues, I would go with the CASE version.
Although it has no impact on the results, filtering on the requested period in the WHERE clause might give a significant performance advantage.
2. Your period definition creates overlaps: adjacent BETWEEN ranges share a boundary value, so the same row is summed into two buckets.
select id
,sum(case when year(getdate()) - year(txndate) = 6 then amt else 0 end) as amt_6
-- ...
,sum(case when year(getdate()) - year(txndate) = 0 then amt else 0 end) as amt_0
from mytable
where txndate >= dateadd(year, datediff(year,0, getDate())-6, 0)
group by id
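The overlap is easy to see by listing which buckets a given year difference falls into. This is a small Python sketch under the assumption that the elided columns in the question follow the same pattern (`between 4 and 5`, `between 3 and 4`, and so on); the function names are made up for illustration.

```python
def buckets_between(age):
    """Column names whose 'BETWEEN hi-1 AND hi' range contains age (both ends inclusive)."""
    return [f"amt_{hi}_{hi - 1}" for hi in range(1, 7) if hi - 1 <= age <= hi]


def bucket_equal(age):
    """Column name when each bucket is a single year difference, as in the answer."""
    return f"amt_{age}"


# A transaction whose year difference is 5 lands in two BETWEEN buckets...
double_counted = buckets_between(5)
# ...but in exactly one equality bucket.
single = bucket_equal(5)
```

Only the two extreme values (0 and 6) land in a single BETWEEN bucket; every year difference in between is counted twice.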
This may help you:
WITH CTE
AS
(
SELECT id,
(CASE WHEN year(getdate()) - year(txndate) BETWEEN 5 AND 6 THEN 'year_5-6'
WHEN year(getdate()) - year(txndate) BETWEEN 4 AND 5 THEN 'year_4-5'
...
END) AS my_year,
amt
FROM mytable
)
SELECT id,my_year,sum(amt)
FROM CTE
GROUP BY id,my_year
Here, the CTE assigns a year tag to each record (based on your conditions); the outer query then sums the amounts from the CTE, grouped by that tag.

How to count every half hour?

I have a query that counts rows for every hour, using a pivot-style layout.
How can I get the count for every 30 minutes instead?
For example: 8:00-8:29, 8:30-8:59, 9:00-9:29, etc., until 5:00.
SELECT CONVERT(varchar(8),start_date,1) AS 'Day',
SUM(CASE WHEN DATEPART(hour,start_date) = 8 THEN 1 ELSE 0 END) as eight ,
SUM(CASE WHEN DATEPART(hour,start_date) = 9 THEN 1 ELSE 0 END) AS nine,
SUM(CASE WHEN DATEPART(hour,start_date) = 10 THEN 1 ELSE 0 END) AS ten,
SUM(CASE WHEN DATEPART(hour,start_date) = 11 THEN 1 ELSE 0 END) AS eleven,
SUM(CASE WHEN DATEPART(hour,start_date) = 12 THEN 1 ELSE 0 END) AS twelve,
SUM(CASE WHEN DATEPART(hour,start_date) = 13 THEN 1 ELSE 0 END) AS one_clock,
SUM(CASE WHEN DATEPART(hour,start_date) = 14 THEN 1 ELSE 0 END) AS two_clock,
SUM(CASE WHEN DATEPART(hour,start_date) = 15 THEN 1 ELSE 0 END) AS three_clock,
SUM(CASE WHEN DATEPART(hour,start_date) = 16 THEN 1 ELSE 0 END) AS four_clock
FROM test
where user_id is not null
GROUP BY CONVERT(varchar(8),start_date,1)
ORDER BY CONVERT(varchar(8),start_date,1)
I use sql server 2012 (version Microsoft SQL Server Management Studio 11.0.3128.0)
Try using iif as below:
SELECT CONVERT(varchar(8),start_date,1) AS 'Day',
SUM(iif(DATEPART(hour,start_date) = 8 and
DATEPART(minute,start_date) >= 0 and
DATEPART(minute,start_date) <= 29,1,0)) as eight_tirty
FROM test where user_id is not null
GROUP BY CONVERT(varchar(8),start_date,1)
ORDER BY CONVERT(varchar(8),start_date,1)
To get counts by day and half hour, something like this should work.
SELECT day, half_hour, count(1) AS half_hour_count
FROM (
SELECT
CAST(start_date AS date) AS day,
DATEPART(hh, start_date)
+ 0.5*(DATEPART(n,start_date)/30) AS half_hour
FROM test
WHERE user_id IS NOT NULL
) qry
GROUP BY day, half_hour
ORDER BY day, half_hour;
Formatting the result could be done later.
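The bucket expression relies on integer division: the minute divided by 30 is 0 or 1, so the key is the hour plus either 0 or 0.5. Here is a minimal Python sketch of the same arithmetic, with a hypothetical `label` helper showing one way the formatting could be done afterwards.

```python
from collections import Counter
from datetime import datetime


def half_hour_bucket(ts):
    """Hour plus 0.5 for the second half of the hour, e.g. 08:45 -> 8.5."""
    return ts.hour + 0.5 * (ts.minute // 30)


def label(bucket):
    """Turn 8.0 into '8:00-8:29' and 8.5 into '8:30-8:59'."""
    hour = int(bucket)
    minute = 30 if bucket % 1 else 0
    return f"{hour}:{minute:02d}-{hour}:{minute + 29:02d}"


stamps = [
    datetime(2016, 8, 23, 8, 5),
    datetime(2016, 8, 23, 8, 29),
    datetime(2016, 8, 23, 8, 30),
    datetime(2016, 8, 23, 16, 59),
]
# Same grouping as GROUP BY day, half_hour in the query above.
counts = Counter((ts.date(), half_hour_bucket(ts)) for ts in stamps)
```

Note that this only returns buckets that contain at least one row, just like the query; empty half hours do not appear.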
You need a few things, and then this query just falls together.
First, assuming you need multiple dates, you're going to want what's known as a Calendar Table (hands down, probably the most useful analysis table).
Next, you're going to want either an existing Numbers table, if you have one, or just generate one on the fly:
WITH Halfs AS (SELECT CAST(0 AS INT) m
UNION ALL
SELECT m + 1
FROM Halfs
WHERE m < 24 * 2)
SELECT m
FROM Halfs
(recursive CTE - generates a table with a list of numbers starting at 0).
These two tables will provide the basis for a range query based on the timestamps in your main table. This will make it very easy for the optimizer to bucket rows for whatever aggregation you're doing. That's done by CROSS JOINing the two tables together in a subquery, as well as adding a couple of other derived columns:
WITH Halfs AS (SELECT CAST(0 AS INT) m
UNION ALL
SELECT m + 1
FROM Halfs
WHERE m < 24 * 2)
SELECT calendarDate, rangeGroup, rangeStart, rangeEnd
FROM (SELECT Calendar.calendarDate, Halfs.m rangeGroup,
DATEADD(minute, m * 30, CAST(Calendar.calendarDate AS DATETIME2)) rangeStart,
DATEADD(minute, (m + 1) * 30, CAST(Calendar.calendarDate AS DATETIME2)) rangeEnd
FROM Calendar
CROSS JOIN Halfs
WHERE Calendar.calendarDate >= CAST('20160823' AS DATE)
AND Calendar.calendarDate < CAST('20160830' AS DATE)
-- OR whatever your date range actually is.
) Range
ORDER BY rangeStart
(Note that if the range of dates is sufficiently large, it may be beneficial to save this off as a temporary table with indexes. For small tables and datasets, the performance gain isn't likely to be noticeable.)
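To see what the cross join of the calendar and the number list produces, the range table can be generated in a few lines of Python. This is only an illustrative sketch: `half_hour_ranges` is a made-up name, and it uses 48 slots per day numbered 0 to 47. (As written, the recursive CTE with `m < 24 * 2` also emits m = 48, whose range starts at midnight of the following day.)

```python
from datetime import date, datetime, time, timedelta


def half_hour_ranges(first_day, last_day_exclusive, slots_per_day=48):
    """Return (day, slot, range_start, range_end) tuples, one per half hour."""
    ranges = []
    day = first_day
    while day < last_day_exclusive:
        midnight = datetime.combine(day, time.min)
        for slot in range(slots_per_day):
            start = midnight + timedelta(minutes=30 * slot)
            # Half-open range: start is included, end is excluded.
            ranges.append((day, slot, start, start + timedelta(minutes=30)))
        day += timedelta(days=1)
    return ranges


ranges = half_hour_ranges(date(2016, 8, 23), date(2016, 8, 30))
```

Each event then belongs to exactly one row, the one where `range_start <= start_date < range_end`, which is the join condition used against the main table.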
Now that we have our ranges, it's trivial to get our groups, and pivot the table.
Oh, and SQL Server has a specific operator for PIVOTing.
WITH Halfs AS (SELECT CAST(0 AS INT) m
UNION ALL
SELECT m + 1
FROM Halfs
WHERE m < 3 * 2)
-- Intentionally limiting range for example only
SELECT calendarDate AS day, [0], [1], [2], [3], [4], [5], [6]
-- If you're displaying "nice" names,
-- do it at this point, or in the reporting application
FROM (SELECT Range.calendarDate, Range.rangeGroup, Test.user_id
FROM (SELECT Calendar.calendarDate, Halfs.m rangeGroup,
DATEADD(minute, m * 30, CAST(Calendar.calendarDate AS DATETIME2)) rangeStart,
DATEADD(minute, (m + 1) * 30, CAST(Calendar.calendarDate AS DATETIME2)) rangeEnd
FROM Calendar
CROSS JOIN Halfs
WHERE Calendar.calendarDate >= CAST('20160823' AS DATE)
AND Calendar.calendarDate < CAST('20160830' AS DATE)
-- OR whatever your date range actually is.
) Range
LEFT JOIN Test
ON Test.user_id IS NOT NULL
AND Test.start_date >= Range.rangeStart
AND Test.start_date < Range.rangeEnd
) AS DataTable
-- PIVOT does not accept COUNT(*); counting a column from Test
-- also gives 0 for ranges without any matching row.
PIVOT (COUNT(user_id)
FOR rangeGroup IN ([0], [1], [2], [3], [4], [5], [6])) AS PT
-- Only covers groups 0 through 6,
-- i.e. the first three and a half hours.
ORDER BY day
The pivot takes care of producing the individual columns, and COUNT automatically resolves the null rows to zero. That should be all you need.
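What the pivot step produces, one row per day with one count column per half-hour group and zeros where nothing matched, can be mimicked in Python to sanity-check expected output. This is a rough sketch with made-up names (`pivot_half_hours`, `days`, `stamps`), not a replacement for the query.

```python
from collections import Counter
from datetime import date, datetime


def pivot_half_hours(days, stamps, groups):
    """Return one row per day: [day, count for groups[0], count for groups[1], ...]."""
    counts = Counter((ts.date(), ts.hour * 2 + ts.minute // 30) for ts in stamps)
    # Counter returns 0 for missing keys, so empty ranges show up as zeros.
    return [[day] + [counts[(day, g)] for g in groups] for day in days]


days = [date(2016, 8, 23), date(2016, 8, 24)]  # plays the role of the Calendar table
stamps = [
    datetime(2016, 8, 23, 0, 10),
    datetime(2016, 8, 23, 0, 20),
    datetime(2016, 8, 23, 1, 45),
]
table = pivot_half_hours(days, stamps, range(7))
```

Driving the rows from `days` rather than from the events is what the Calendar table buys you: a day without any events still appears, filled with zeros.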

Convert 2 rows with multiple columns into 2 columns with multiple rows

I often run ad-hoc queries in SQL Server 2005/2008 where I would like to convert two rows in multiple columns into multiple rows having only two columns.
Given a query like this:
SELECT
SUM(CASE WHEN created_at IS NOT NULL THEN 1 END) AS 'TOTAL'
, SUM(CASE WHEN created_at > '2013-07-15' THEN 1 END) AS 'CREATED W/I LAST YEAR'
, SUM(CASE WHEN updated_at > '2013-07-15' THEN 1 END) AS 'MODIFIED W/I LAST YEAR'
, SUM(CASE WHEN updated_at < '2011-07-15' THEN 1 END) AS 'UNTOUCHED OVER 3 YEARS'
, SUM(CASE WHEN updated_at < '2009-07-15' THEN 1 END) AS 'UNTOUCHED OVER 5 YEARS'
-- , often there are more columns
FROM
mytable
WHERE
< filtering >
I would like it to display something like this:
TOTAL: 5000
CREATED W/I LAST YEAR: 500
MODIFIED W/I LAST YEAR: 1500
UNTOUCHED OVER 3 YEARS: 2000
UNTOUCHED OVER 5 YEARS: 1000
I want to stay DRY and not string together a bunch of SELECTs with UNIONs. I have never used PIVOT, UNPIVOT or CROSS APPLY. Most of the examples I have seen for UNPIVOT don't seem to apply to queries like the one above. Or am I just missing something? It seems simple enough, but I'm just not getting it.
;WITH t AS (
SELECT
SUM(CASE WHEN created_at IS NOT NULL THEN 1 END) AS 'TOTAL'
, SUM(CASE WHEN created_at > '2013-07-15' THEN 1 END) AS 'CREATED W/I LAST YEAR'
, SUM(CASE WHEN updated_at > '2013-07-15' THEN 1 END) AS 'MODIFIED W/I LAST YEAR'
, SUM(CASE WHEN updated_at < '2011-07-15' THEN 1 END) AS 'UNTOUCHED OVER 3 YEARS'
, SUM(CASE WHEN updated_at < '2009-07-15' THEN 1 END) AS 'UNTOUCHED OVER 5 YEARS'
-- , often there are more columns
FROM
mytable
WHERE
< filtering >
)
SELECT name, value
FROM t
UNPIVOT(value FOR name IN (
[TOTAL]
, [CREATED W/I LAST YEAR]
, [MODIFIED W/I LAST YEAR]
, [UNTOUCHED OVER 3 YEARS]
, [UNTOUCHED OVER 5 YEARS]
)) p
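For intuition, UNPIVOT turns the single wide row into (name, value) pairs, one per listed column, and leaves out columns whose value is NULL. The sketch below reproduces that effect in Python with `sqlite3` (which has no UNPIVOT operator) by pairing the column names from the cursor with the row values; the table contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (created_at TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("2014-01-01", "2014-02-01"),
        ("2010-05-05", "2010-06-06"),
        ("2012-01-01", "2013-08-01"),
    ],
)

cur = conn.execute(
    """
    SELECT
        SUM(CASE WHEN created_at IS NOT NULL THEN 1 END) AS "TOTAL"
      , SUM(CASE WHEN created_at > '2013-07-15' THEN 1 END) AS "CREATED W/I LAST YEAR"
      , SUM(CASE WHEN updated_at < '2009-07-15' THEN 1 END) AS "UNTOUCHED OVER 5 YEARS"
    FROM mytable
    """
)
names = [col[0] for col in cur.description]  # column aliases, in SELECT order
values = cur.fetchone()

# One (name, value) pair per column; NULL values are skipped, as UNPIVOT does.
unpivoted = [(name, value) for name, value in zip(names, values) if value is not None]
```

Because `SUM(CASE ... THEN 1 END)` yields NULL when no row matches, a category with a zero count disappears from the unpivoted output. Adding `ELSE 0` to the CASE expressions keeps it as a row with value 0.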

counting events over flexible ranges

I am trying to count events (which are rows in the event_table) in the year before and the year after a particular target date for each person. For example, say I have a person 100 and target date is 10/01/2012. I would like to count events in 9/30/2011-9/30/2012 and in 10/02/2012-9/30/2013.
My query looks like:
select *
from (
select id, target_date
from subsample_table
) as i
left join (
select id, event_date, count(*) as N
, case when event_date between target_date-365 and target_date-1 then 0
when event_date between target_date+1 and target_date+365 then 1
else 2 end as after
from event_table
group by id, target_date, period
) as h
on i.id = h.id
and i.target_date = h.event_date
The output should look something like:
id target_date after N
100 10/01/2012 0 1000
100 10/01/2012 1 0
It's possible that some people do not have any events in the before or after periods (or both), and it would be nice to have zeros in that case. I don't care about the events outside the 730 days.
Any suggestions would be greatly appreciated.
I think the following may approach what you are trying to accomplish.
select i.id
, i.target_date
, count(h.id) as N
, SUM(case when h.event_date between i.target_date-365 and i.target_date-1
then 1
else 0
end) AS Prior_
, SUM(case when h.event_date between i.target_date+1 and i.target_date+365
then 1
else 0
end) as After_
from subsample_table i
left join
event_table h
on i.id = h.id
group by i.id, i.target_date
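The requirement to get zeros for people with no events is what the LEFT JOIN plus `ELSE 0` combination provides. Below is a small, self-contained check of that behaviour using Python's `sqlite3`; it is a sketch, not Teradata syntax. The date arithmetic uses SQLite's `date()` modifiers instead of `target_date-365`, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subsample_table (id INTEGER, target_date TEXT);
    CREATE TABLE event_table (id INTEGER, event_date TEXT);
    INSERT INTO subsample_table VALUES (100, '2012-10-01'), (200, '2012-10-01');
    INSERT INTO event_table VALUES
        (100, '2012-09-30'),  -- day before the target: counts as before
        (100, '2011-10-02'),  -- exactly 365 days before: counts as before
        (100, '2012-10-02'),  -- day after the target: counts as after
        (100, '2012-10-01'),  -- on the target date: counts as neither
        (100, '2010-01-01');  -- outside both windows
    """
)

rows = conn.execute(
    """
    SELECT i.id, i.target_date,
           SUM(CASE WHEN h.event_date BETWEEN date(i.target_date, '-365 days')
                                          AND date(i.target_date, '-1 day')
                    THEN 1 ELSE 0 END) AS n_before,
           SUM(CASE WHEN h.event_date BETWEEN date(i.target_date, '+1 day')
                                          AND date(i.target_date, '+365 days')
                    THEN 1 ELSE 0 END) AS n_after
    FROM subsample_table i
    LEFT JOIN event_table h ON h.id = i.id
    GROUP BY i.id, i.target_date
    ORDER BY i.id
    """
).fetchall()
```

Person 200 has no events at all, so the LEFT JOIN produces a single row with a NULL `event_date`; the CASE falls through to `ELSE 0` and both counts come out as 0 rather than NULL.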
This is a generic answer. I don't know what date functions Teradata has, so I will use SQL Server syntax.
select id, target_date, sum(before) before, sum(after) after, sum(righton) righton
from yourtable t
join (
select id, target_date td
, case when yourdate >= dateadd(year, -1, target_date)
and yourdate < target_date then 1 else 0 end before
, case when yourdate <= dateadd(year, 1, target_date)
and yourdate > target_date then 1 else 0 end after
, case when yourdate = target_date then 1 else 0 end righton
from yourtable
where whatever
group by id, target_date) sq on t.id = sq.id and target_date = sq.td
where whatever
group by id, target_date
This answer assumes that an id can have more than one target date.