Count parts of total value as columns per row (pivot table) - sql

I'm stuck on a seemingly easy query and haven't managed to get it working over the last few hours.
I have a table files that holds file names and some values such as the number of records in the file, the DATE of creation (create_date), the DATE of processing (processing_date), and so on. There can be multiple files for a create date at different hours, and they will likely not be processed on the day of creation; in fact, processing can take three days or longer.
So let's assume I have these rows, as an example:
create_date | processing_date
------------------------------
2012-09-10 11:10:55.0 | 2012-09-11 18:00:18.0
2012-09-10 15:20:18.0 | 2012-09-11 13:38:19.0
2012-09-10 19:30:48.0 | 2012-09-12 10:59:00.0
2012-09-11 08:19:11.0 | 2012-09-11 18:14:44.0
2012-09-11 22:31:42.0 | 2012-09-21 03:51:09.0
What I want is a single query that groups by create_date truncated to the day, with 11 additional columns that count the files by the difference in days between processing_date and create_date, so that the result roughly looks like this:
create_date | diff0days | diff1days | diff2days | ... | diff10days
------------------------------------------------------------------
2012-09-10  | 0         | 2         | 1         | ... | 0
2012-09-11  | 1         | 0         | 0         | ... | 1
and so on, I hope you get the point :)
So far I have this, which works for a single aggregated column per create_date, for example a difference of 3 days:
SELECT TRUNC(f.create_date, 'DD') AS created, COUNT(1)
FROM files f
WHERE TRUNC(f.processing_date, 'DD') - TRUNC(f.create_date, 'DD') = 3
GROUP BY TRUNC(f.create_date, 'DD')
I tried combining the single queries and I tried sub-queries, but that didn't help or at least my knowledge about SQL is not sufficient.
What I need is a hint so that I can include the various differences as columns, like shown above. How could I possibly achieve this?

That's basically the pivoting problem. Use one conditional sum per difference; the ELSE 0 makes an empty bucket show 0 instead of NULL:
SELECT TRUNC(f.create_date, 'DD') AS created
     , SUM(CASE TRUNC(f.processing_date, 'DD') - TRUNC(f.create_date, 'DD')
             WHEN 0 THEN 1 ELSE 0 END) AS diff0days
     , SUM(CASE TRUNC(f.processing_date, 'DD') - TRUNC(f.create_date, 'DD')
             WHEN 1 THEN 1 ELSE 0 END) AS diff1days
     , SUM(CASE TRUNC(f.processing_date, 'DD') - TRUNC(f.create_date, 'DD')
             WHEN 2 THEN 1 ELSE 0 END) AS diff2days
     , ...
FROM files f
GROUP BY TRUNC(f.create_date, 'DD')
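The same conditional-aggregation idea can be tried out locally. What follows is only a minimal sketch, assuming Python's built-in sqlite3 module rather than Oracle: SQLite has no TRUNC for dates, so date() and julianday() stand in for it, and the helper name pivot_by_day_diff is made up for this illustration. The repeated SUM(CASE ...) columns are generated in a loop instead of being typed out.

```python
import sqlite3

# Sample rows from the question (fractional seconds dropped for brevity).
ROWS = [
    ("2012-09-10 11:10:55", "2012-09-11 18:00:18"),
    ("2012-09-10 15:20:18", "2012-09-11 13:38:19"),
    ("2012-09-10 19:30:48", "2012-09-12 10:59:00"),
    ("2012-09-11 08:19:11", "2012-09-11 18:14:44"),
    ("2012-09-11 22:31:42", "2012-09-21 03:51:09"),
]


def pivot_by_day_diff(rows, max_diff=10):
    """Count files per creation day and per whole-day processing delay."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (create_date TEXT, processing_date TEXT)")
    conn.executemany("INSERT INTO files VALUES (?, ?)", rows)

    # SQLite stand-in for TRUNC(processing_date,'DD') - TRUNC(create_date,'DD').
    diff = ("CAST(julianday(date(processing_date))"
            " - julianday(date(create_date)) AS INTEGER)")
    # One SUM(CASE ...) column per day difference: 0 .. max_diff.
    columns = ",\n       ".join(
        f"SUM(CASE WHEN {diff} = {n} THEN 1 ELSE 0 END) AS diff{n}days"
        for n in range(max_diff + 1)
    )
    sql = (f"SELECT date(create_date) AS created,\n       {columns}\n"
           "FROM files GROUP BY date(create_date) ORDER BY created")
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


for row in pivot_by_day_diff(ROWS):
    print(row)
```

For the sample rows this prints one tuple per creation day: 2012-09-10 with counts 0, 2, 1 followed by zeros, and 2012-09-11 with a 1 in the first and in the last (diff10days) position.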

SELECT CreateDate,
sum(CASE WHEN DateDiff(day, CreateDate, ProcessDate) = 1 THEN 1 ELSE 0 END) AS Diff1,
sum(CASE WHEN DateDiff(day, CreateDate, ProcessDate) = 2 THEN 1 ELSE 0 END) AS Diff2,
...
FROM table
GROUP BY CreateDate
ORDER BY CreateDate

Since you are using Oracle 11g, you can also get the desired result with a PIVOT query.
Here is an example:
-- sample of data from your question
SQL> create table Your_table(create_date, processing_date) as
(
  select '2012-09-10', '2012-09-11' from dual union all
  select '2012-09-10', '2012-09-11' from dual union all
  select '2012-09-10', '2012-09-12' from dual union all
  select '2012-09-11', '2012-09-11' from dual union all
  select '2012-09-11', '2012-09-21' from dual
)
;
Table created
SQL> with t2 as(
  select create_date
       , processing_date
       , to_date(processing_date, 'YYYY-MM-DD')
         - to_date(create_date, 'YYYY-MM-DD') dif
  from your_table
)
select create_date
     , max(diff0) diff0
     , max(diff1) diff1
     , max(diff2) diff2
     , max(diff3) diff3
     , max(diff4) diff4
     , max(diff5) diff5
     , max(diff6) diff6
     , max(diff7) diff7
     , max(diff8) diff8
     , max(diff9) diff9
     , max(diff10) diff10
from (select *
      from t2
      pivot(
        count(dif)
        for dif in ( 0 diff0
                   , 1 diff1
                   , 2 diff2
                   , 3 diff3
                   , 4 diff4
                   , 5 diff5
                   , 6 diff6
                   , 7 diff7
                   , 8 diff8
                   , 9 diff9
                   , 10 diff10
                   )
      ) pd
     ) res
group by create_date
;
Result:
Create_Date  Diff0  Diff1  Diff2  Diff3  Diff4  Diff5  Diff6  Diff7  Diff8  Diff9  Diff10
-----------------------------------------------------------------------------------------
2012-09-10   0      2      1      0      0      0      0      0      0      0      0
2012-09-11   1      0      0      0      0      0      0      0      0      0      1

Related

Count record based on dates like 0 to 7 days 7 to 14 days and 14 to 28 days

I have a table with three columns: ID, status and CreateDate. I want counts from the table based on CreateDate, for the
last 0-7 days, 7-14 days, 14-28 days, and 28-35 days only.
With a sample table that looks like this:
SQL> select trunc(sysdate) today from dual;
TODAY
----------
21.08.2021
SQL> with temp as
  (select id, createdate, trunc(sysdate) - createdate diff_days
   from test
  )
select * from temp
order by createdate desc;
        ID CREATEDATE  DIFF_DAYS
---------- ---------- ----------
         1 20.08.2021          1
         2 19.08.2021          2
         3 13.08.2021          8
         4 12.08.2021          9
         5 02.08.2021         19
         6 26.07.2021         26
6 rows selected.
SQL>
you could do it as follows. Note that your periods aren't correct: the same row can't (or rather, shouldn't) belong to two periods. If the date difference is exactly 7 days, the row can't be in both 0-7 and 7-14; it is either in the first or in the second period, not both.
SQL> with temp as
  (select id, createdate, trunc(sysdate) - createdate diff_days
   from test
  )
select
  sum(case when diff_days >= 0 and diff_days < 7 then 1 else 0 end) " 0- 6 days",
  sum(case when diff_days >= 7 and diff_days < 14 then 1 else 0 end) " 7-13 days",
  sum(case when diff_days >= 14 and diff_days < 21 then 1 else 0 end) "14-20 days",
  sum(case when diff_days >= 21 and diff_days < 28 then 1 else 0 end) "21-27 days"
from temp;
 0- 6 days  7-13 days 14-20 days 21-27 days
---------- ---------- ---------- ----------
         2          2          1          1
SQL>
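The point about non-overlapping periods can be checked outside the database as well. This is a small sketch in plain Python, not part of the answer above; the function names bucket_label and count_buckets are made up for illustration. Each bucket is a half-open range [start, start + 7), so a difference of exactly 7 days lands in the second bucket only.

```python
from collections import Counter


def bucket_label(diff_days, width=7, n_buckets=4):
    """Return the label of the half-open bucket [start, start + width) that
    contains diff_days, or None if it falls outside all buckets."""
    if diff_days < 0 or diff_days >= width * n_buckets:
        return None
    start = (diff_days // width) * width
    return f"{start}-{start + width - 1} days"


def count_buckets(diffs, width=7, n_buckets=4):
    """Count how many day differences land in each bucket."""
    labels = (bucket_label(d, width, n_buckets) for d in diffs)
    return Counter(label for label in labels if label is not None)


# DIFF_DAYS values from the sample table above.
counts = count_buckets([1, 2, 8, 9, 19, 26])
print(dict(counts))
```

With the six sample differences this gives 2, 2, 1 and 1 rows for the four buckets, the same counts as the query result.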

How to split the column values after a certain number

I have a dataset that looks like this:
ID HoursWorked TotalHours
23 1 1
23 1 2
23 1 3
23 0.5 3.5
23 1 4.5
23 1 5.5
23 1 6.5
23 1 7.5
23 1 8.5
61 1 1
61 1 2
What I want to do is if the total hours hits 8 hours, I want to split that row (e.g. 8.5 in the sample data above) so that an employee always has the total hours of 8. If someone works over 8 hours it should continue after hitting 8 in the totalhours column. For example, I want something like this as my final result.
ID HoursWorked TotalHours
23 1 1
23 1 2
23 1 3
23 0.5 3.5
23 1 4.5
23 1 5.5
23 1 6.5
23 1 7.5
23 0.5 8 *
23 0.5 8.5 *
61 1 1
61 1 2
As you can see the row which originally had 8.5 for its totalhours got broken down into two different rows.
I couldn't think of any way to do this in SQL Server. I'd appreciate any help on this.
See if this works:
select ID, HoursWorked, TotalHours from table_name where TotalHours <= 8
union
select ID, HoursWorked - (TotalHours - 8) as HoursWorked, 8 as TotalHours from table_name where TotalHours > 8
union
select ID, (TotalHours - 8) as HoursWorked, TotalHours from table_name where TotalHours > 8
This seems rather complicated. This approach takes all the rows up to 8 hours. It then finds the first row that passes 8 hours and splits that one as needed; the part up to 8 hours is hoursworked minus the overshoot (totalhours - 8):
select id, hoursworked, totalhours
from t
where totalhours <= 8
union all
select t.id, v.hoursworked, v.totalhours
from (select t.*, row_number() over (partition by id order by totalhours) as seqnum
      from t
      where totalhours > 8
     ) t cross apply
     (values (case when seqnum = 1 then hoursworked - (totalhours - 8) end,
              case when seqnum = 1 then 8 end
             ),
             (case when seqnum = 1 then totalhours - 8 else hoursworked end,
              totalhours
             )
     ) v(hoursworked, totalhours)
where v.hoursworked > 0
order by id, totalhours;
Here is a db<>fiddle.
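To make the splitting rule concrete, here is a minimal sketch in plain Python rather than SQL Server. It assumes TotalHours is the running total per ID, as in the sample data; the function name split_at_cap is made up for illustration. Only the row whose running total crosses the cap is split, and all other rows pass through unchanged.

```python
def split_at_cap(rows, cap=8):
    """Split the row whose running total crosses `cap` into two rows.

    rows: iterable of (id, hours_worked, total_hours), where total_hours is
    the running total per id, as in the sample data.
    """
    result = []
    for emp_id, hours, total in rows:
        before = total - hours  # running total before this row
        if before < cap < total:
            result.append((emp_id, cap - before, cap))   # part up to the cap
            result.append((emp_id, total - cap, total))  # part beyond the cap
        else:
            result.append((emp_id, hours, total))
    return result


sample = [
    (23, 1, 1), (23, 1, 2), (23, 1, 3), (23, 0.5, 3.5), (23, 1, 4.5),
    (23, 1, 5.5), (23, 1, 6.5), (23, 1, 7.5), (23, 1, 8.5),
    (61, 1, 1), (61, 1, 2),
]
for row in split_at_cap(sample):
    print(row)
```

For the sample data the row (23, 1, 8.5) becomes (23, 0.5, 8) and (23, 0.5, 8.5), matching the two starred rows in the desired result.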

SQL (Vertica) - Calculate number of users who returned to the app at least x days in the past 7 days

Suppose I have my table like:
uid day_used_app
--- -------------
1 2012-04-28
1 2012-04-29
1 2012-04-30
2 2012-04-29
2 2012-04-30
2 2012-05-01
2 2012-05-21
2 2012-05-22
Suppose I want the number of unique users who returned to the app at least 2 different days in the last 7 days (from 2012-05-03).
So as an example to retrieve the number of users who have used the application on at least 2 different days in the past 7 days:
select count(distinct case when num_different_days_on_app >= 2
then uid else null end) as users_return_2_or_more_days
from (
select uid,
count(distinct day_used_app) as num_different_days_on_app
from table
where day_used_app between current_date() - 7 and current_date()
group by 1
)
This gives me:
users_return_2_or_more_days
---------------------------
2
The question I have is:
What if I want to do this for every day up to now, so that my table looks like the one below, where the second field is the number of unique users who returned on 2 or more different days within the week prior to the date in the first field?
date users_return_2_or_more_days
-------- ---------------------------
2012-04-28 2
2012-04-29 2
2012-04-30 3
2012-05-01 4
2012-05-02 4
2012-05-03 3
Would this help?
WITH
-- your original input, don't use in "real" query ...
input(uid,day_used_app) AS (
SELECT 1,DATE '2012-04-28'
UNION ALL SELECT 1,DATE '2012-04-29'
UNION ALL SELECT 1,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-04-29'
UNION ALL SELECT 2,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-05-01'
UNION ALL SELECT 2,DATE '2012-05-21'
UNION ALL SELECT 2,DATE '2012-05-22'
)
-- end of input, start "real" query here, replace ',' with 'WITH'
,
one_week_b4 AS (
SELECT
uid
, day_used_app
, day_used_app -7 AS day_used_1week_b4
FROM input
)
SELECT
one_week_b4.uid
, one_week_b4.day_used_app
, count(*) AS users_return_2_or_more_days
FROM one_week_b4
JOIN input
ON input.day_used_app BETWEEN one_week_b4.day_used_1week_b4 AND one_week_b4.day_used_app
GROUP BY
one_week_b4.uid
, one_week_b4.day_used_app
HAVING count(*) >= 2
ORDER BY 1;
Output is:
uid|day_used_app|users_return_2_or_more_days
1|2012-04-29 | 3
1|2012-04-30 | 5
2|2012-04-29 | 3
2|2012-04-30 | 5
2|2012-05-01 | 6
2|2012-05-22 | 2
Does that help your needs?
Marco the Sane ...
SELECT DISTINCT
t1.day_used_app,
(
SELECT SUM(CASE WHEN t.num_visits >= 2 THEN 1 ELSE 0 END)
FROM
(
SELECT uid,
COUNT(DISTINCT day_used_app) AS num_visits
FROM table
WHERE day_used_app BETWEEN t1.day_used_app - 7 AND t1.day_used_app
GROUP BY uid
) t
) AS users_return_2_or_more_days
FROM table t1
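The logic of the correlated subquery can be spelled out in a few lines of plain Python. This is only a sketch for checking the numbers on the sample data, not Vertica code; the function name returning_users_per_day is made up, and the window is assumed to be inclusive on both ends, like BETWEEN day - 7 AND day. As with SELECT DISTINCT on day_used_app, only dates that occur in the data get a row.

```python
from datetime import date, timedelta

# (uid, day_used_app) rows from the question.
VISITS = [
    (1, date(2012, 4, 28)), (1, date(2012, 4, 29)), (1, date(2012, 4, 30)),
    (2, date(2012, 4, 29)), (2, date(2012, 4, 30)), (2, date(2012, 5, 1)),
    (2, date(2012, 5, 21)), (2, date(2012, 5, 22)),
]


def returning_users_per_day(visits, min_days=2, window_days=7):
    """For each date in the data, count the users with at least `min_days`
    distinct visit days in the window [day - window_days, day]."""
    result = {}
    for day in sorted({d for _, d in visits}):
        start = day - timedelta(days=window_days)
        days_per_user = {}
        for uid, visit_day in visits:
            if start <= visit_day <= day:
                days_per_user.setdefault(uid, set()).add(visit_day)
        result[day] = sum(
            1 for days in days_per_user.values() if len(days) >= min_days
        )
    return result


for day, users in returning_users_per_day(VISITS).items():
    print(day, users)
```

On the eight sample rows this yields 0, 1, 2, 2 for 2012-04-28 through 2012-05-01, then 0 for 2012-05-21 and 1 for 2012-05-22.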

SQL intersect two timestamp pairs and group up by hours

I have a little problem.
I have a table, let's call it "events", with columns like: type (1 or 0), timestamp start, timestamp end.
I want to group the events by hour (60-minute periods)
into 4 columns, each calculating one of the following:
How many minutes per hour there was neither a type 1 nor a type 0 event.
How many minutes per hour there was a type 1 event and, at the same time, no type 2 event.
How many minutes per hour there was a type 2 event and, at the same time, no type 1 event.
How many minutes per hour there was a type 2 event and a type 1 event at the same time.
Result should look like this:
hour 00 10 01 11
12 10 20 20 10
13 5 15 25 15
Each row should always sum to 60 minutes.
Is it possible to do this in SQL? I need it in Vertica, so I can use Vertica's functions too.
Interesting question! Here is a query that gets you what you need. I mocked up the following table with some dummy data, and show the results of the query at the end. As you required, the totals always add up to 60 minutes within each hour.
SETUP:
create table public.time_event_test(event_timestamp timestamptz, event_type int);
insert into public.time_event_test(event_timestamp,event_type) select getutcdate() as event_timestamp, 1 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',5,getutcdate()) as event_timestamp, 1 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',1,getutcdate()) as event_timestamp, 1 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',1,getutcdate()) as event_timestamp, 2 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',3,getutcdate()) as event_timestamp, 2 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',6,getutcdate()) as event_timestamp, 2 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',90,getutcdate()) as event_timestamp, 2 as event_type;
QUERY:
select date_trunc('hour',dat) as hr
, 60 - sum(case when event_type1 = 1 or event_type2 = 1 then 1 else 0 end) as type_00
, sum(case when event_type1 = 0 and event_type2 = 1 then 1 else 0 end) as type_01
, sum(case when event_type1 = 1 and event_type2 = 0 then 1 else 0 end) as type_10
, sum(case when event_type1 = 1 and event_type2 = 1 then 1 else 0 end) as type_11
from (
select date_trunc('minute',event_timestamp) as dat
, max(case when event_type = 1 then 1 else 0 end) as event_type1
, max(case when event_type = 2 then 1 else 0 end) as event_type2
from public.time_event_test
group by 1
) x
group by 1 order by 1;
RESULTS:
hr | type_00 | type_01 | type_10 | type_11
------------------------+---------+---------+---------+---------
2016-12-21 01:00:00+00 | 52 | 3 | 2 | 3
2016-12-21 02:00:00+00 | 59 | 1 | 0 | 0
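The two-step logic of the query (flag each minute per event type, then count minutes per hour) can be mirrored in plain Python. This is a sketch under the same assumptions as the mocked-up table: events are single timestamps with event_type 1 or 2, not start/end pairs. The function name minutes_by_type_per_hour and the fixed timestamps are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def minutes_by_type_per_hour(events):
    """events: iterable of (timestamp, event_type) with event_type 1 or 2.

    Returns {hour: {"00": n, "01": n, "10": n, "11": n}}, where the first
    digit flags type 1 and the second digit flags type 2 within a minute.
    """
    # Step 1: which event types were seen in each minute.
    types_per_minute = defaultdict(set)
    for ts, event_type in events:
        types_per_minute[ts.replace(second=0, microsecond=0)].add(event_type)

    # Step 2: count minutes per hour; every hour starts with 60 free minutes.
    result = {}
    for minute, types in types_per_minute.items():
        hour = minute.replace(minute=0)
        counts = result.setdefault(hour, {"00": 60, "01": 0, "10": 0, "11": 0})
        key = ("1" if 1 in types else "0") + ("1" if 2 in types else "0")
        counts[key] += 1
        counts["00"] -= 1  # this minute is no longer event-free
    return result


base = datetime(2016, 12, 21, 1, 10)  # arbitrary fixed start, an assumption
offsets = [(0, 1), (5, 1), (1, 1), (1, 2), (3, 2), (6, 2), (90, 2)]
events = [(base + timedelta(minutes=m), t) for m, t in offsets]
for hour, counts in sorted(minutes_by_type_per_hour(events).items()):
    print(hour, counts)
```

Because every flagged minute is subtracted from the initial 60, the four counts for an hour always sum to 60, which is the property the question asks for.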

Select Where Date Between

I would like to SELECT a table calendar and combine the results with the days of the month.
I mean,
Table: Calendar
ID TEAM EMPLOYER START END
17 19 8 04/08/2014 18:01:00 11/08/2014 07:59:00
18 19 39 11/08/2014 18:01:00 18/08/2014 07:59:00
19 19 44 18/08/2014 18:01:00 25/08/2014 07:59:00
20 19 38 25/08/2014 18:01:00 01/09/2014 07:59:00
And I have a SELECT for the days of the month.
Select Days.Dt
From (Select Trunc(To_Date('2014', 'YYYY'), 'y') - 1 + Rownum Dt
From All_Objects
Where Rownum <= Add_Months(Trunc(To_Date('2014', 'YYYY'), 'y'), 12) -
Trunc(To_Date('2014', 'YYYY'), 'y')) Days
Where To_Char(Dt, 'mm/yyyy') = '08/2014'
What I want is something like this:
DAY EMPLOYER_END EMPLOYER_START
1 01/08/2014
2 02/08/2014
3 03/08/2014
4 04/08/2014 4
5 05/08/2014 4 4
6 06/08/2014 4 4
7 07/08/2014 4 4
8 08/08/2014 4 4
9 09/08/2014 4 4
10 10/08/2014 4 4
11 11/08/2014 4 39
12 12/08/2014 39 39
The employer always starts at 18:01 and always ends at 07:59.
Does anyone know if this is possible,
and how I can do it?
Thanks!
Your desired results do not match your sample data. However, I think you want something like this:
with dates as (
Select Days.Dt
From (Select Trunc(To_Date('2014', 'YYYY'), 'y') - 1 + Rownum Dt
From All_Objects
Where Rownum <= Add_Months(Trunc(To_Date('2014', 'YYYY'), 'y'), 12) -
Trunc(To_Date('2014', 'YYYY'), 'y')
) Days
Where To_Char(Dt, 'mm/yyyy') = '08/2014'
)
select d.dt,
sum(case when c.employer_start = d.dt then 0 else 1 end) as employer_end,
sum(case when c.employer_end = d.dt then 1 else 0 end) as employer_start
from dates d left outer join
calendar c
on d.dt between c.employer_start and c.employer_end
group by d.dt
order by d.dt;
I guess this can be useful to you:
WITH mindates AS
(SELECT TRUNC(MIN(startdate),'month') st_date,
TRUNC(MAX(enddate)) ed_date
FROM calendar
) ,
dates AS
(SELECT st_date+ rownum-1 AS dates_col
FROM mindates,
dual
CONNECT BY rownum <= (ed_date- st_date)+1
)
SELECT d.dates_col dates,
MIN((
CASE
WHEN d.dates_col=c.startdate
THEN NULL
ELSE c.employer
END)) AS employer_end,
MIN((
CASE
WHEN d.dates_col=c.enddate
THEN NULL
ELSE c.employer
END )) AS employer_start
FROM dates d
LEFT OUTER JOIN calendar c
ON d.dates_col BETWEEN c.startdate AND c.enddate
GROUP BY d.dates_col
ORDER BY d.dates_col;
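One way to read the desired output is: for each day, EMPLOYER_END is the employer on duty in the morning (until 07:59) and EMPLOYER_START is the employer on duty from 18:01. Under that assumption, which is an interpretation and not something the question states outright, the calendar can be sketched in plain Python. The names on_duty and duty_calendar are made up for illustration, and the shift rows are taken from the sample Calendar table.

```python
from datetime import date, datetime, time, timedelta

# (employer, start, end) from the sample Calendar table.
SHIFTS = [
    (8, datetime(2014, 8, 4, 18, 1), datetime(2014, 8, 11, 7, 59)),
    (39, datetime(2014, 8, 11, 18, 1), datetime(2014, 8, 18, 7, 59)),
    (44, datetime(2014, 8, 18, 18, 1), datetime(2014, 8, 25, 7, 59)),
    (38, datetime(2014, 8, 25, 18, 1), datetime(2014, 9, 1, 7, 59)),
]


def on_duty(shifts, moment):
    """Return the employer whose shift covers `moment`, or None."""
    for employer, start, end in shifts:
        if start <= moment <= end:
            return employer
    return None


def duty_calendar(shifts, first_day, last_day):
    """One row per day: (day, employer at 07:59, employer at 18:01)."""
    rows = []
    day = first_day
    while day <= last_day:
        morning = datetime.combine(day, time(7, 59))
        evening = datetime.combine(day, time(18, 1))
        rows.append((day, on_duty(shifts, morning), on_duty(shifts, evening)))
        day += timedelta(days=1)
    return rows


for row in duty_calendar(SHIFTS, date(2014, 8, 1), date(2014, 8, 31)):
    print(row)
```

With the sample shifts, 4 August has only an evening employer (8), 5 to 10 August show 8 in both columns, and 11 August shows the handover from 8 to 39.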