Oracle SQL running total and daily total using a window function

From the EMPLOYEE table, I want to count the records (employees hired) per day and also get a running total across days.
The format of the input is like this:
rownum Hired_date_time
1 1/10/2012 11:00
2 1/10/2012 13:00
3 20/11/2012 10:00
4 20/11/2012 15:00
5 20/11/2012 16:00
6 30/12/2012 1:00
The desired output:
Hired_date    Hired_per_day    TOTAL_number_of_employees
1/10/2012     2                2
20/11/2012    3                5
30/12/2012    1                6
No problem for the GROUPING PER DAY:
select trunc(Hired_date_time) as "Hired_date" ,
count(*) as "Hired_per_day"
from employee
group by trunc(Hired_date_time)
order by trunc(Hired_date_time);
Question: how can I get a running total (in the last column) using a window function?

select trunc(hired),
count(*) hired_today,
sum(count(*)) over (order by trunc(hired)) as running_total
from emp
group by trunc(hired)
http://sqlfiddle.com/#!4/4bd36/9

select distinct trunc(hire_date) as hired_date,
count(*) over (partition by trunc(hire_date)) as hired_per_day,
count(*) over (order by trunc(hire_date)) as total_number_of_employees
from employee
order by hired_date

Related

I have a table of call data; I want to count the unique accounts called every day and sum the unique accounts called per month

I have a table with two columns: one holds an account number and the other holds the date. Sample data is given below.
Date account
9/8/2020 555
9/8/2020 666
9/8/2020 777
9/8/2020 888
9/9/2020 555
9/9/2020 999
9/10/2020 555
9/10/2020 222
9/10/2020 333
9/11/2020 666
9/11/2020 111
I would like to calculate the number of unique accounts called each day and accumulate it over the month. For example, if account number 555 is called on 8 Sept, 9 Sept and 10 Sept, it should be counted only once in the cumulative sum. The result should look like this:
date Cumulative Unique Accounts Called SO Far this month
9/8/2020 4
9/9/2020 5
9/10/2020 7
9/11/2020 8
Thank you in advance for your help.
You can do this with aggregation and window functions. First, get the first date for each account, then aggregate and accumulate:
select min_date,
count(*) as as_of_date,
sum(count(*)) over (partition by year(min_date), month(min_date)
order by min_date
) as cumulative_unique_count
from (select account, min(date) as min_date
from t
group by account, year(date), month(date)
) t
group by min_date;
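As a rough check of the "first date per account, then aggregate and accumulate" idea, here is a minimal sketch that runs the same query shape against the sample data. It assumes SQLite (via Python's sqlite3, version 3.25 or later for window functions) rather than the asker's database, a hypothetical table name `calls`, ISO-formatted dates, and `strftime('%Y-%m', ...)` standing in for `year()`/`month()`.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings so they sort correctly.
rows = [
    ("2020-09-08", 555), ("2020-09-08", 666), ("2020-09-08", 777), ("2020-09-08", 888),
    ("2020-09-09", 555), ("2020-09-09", 999),
    ("2020-09-10", 555), ("2020-09-10", 222), ("2020-09-10", 333),
    ("2020-09-11", 666), ("2020-09-11", 111),
]

conn = sqlite3.connect(":memory:")
# "calls" and "call_date" are hypothetical names chosen for this sketch.
conn.execute("CREATE TABLE calls (call_date TEXT, account INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?)", rows)

query = """
SELECT min_date,
       COUNT(*) AS new_accounts,
       SUM(COUNT(*)) OVER (PARTITION BY strftime('%Y-%m', min_date)
                           ORDER BY min_date) AS cumulative_unique_count
FROM (SELECT account, MIN(call_date) AS min_date      -- first call of each account per month
      FROM calls
      GROUP BY account, strftime('%Y-%m', call_date)) AS first_calls
GROUP BY min_date
ORDER BY min_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the sample data, the cumulative column comes out as 4, 5, 7, 8, which matches the desired output in the question.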
You can try the below -
with cte as
(
select date,count(*) as total from
(
select date,account,row_number() over(partition by account order by date) as rn
from tablename
)A where rn=1 group by date
)
select date,sum(total) over(order by date) as cum_sum
from cte

How to group consecutive rows together in SQL by multiple columns

I have rows in a query that return something like:
Date User Time Location Service Count
1/1/2018 Nick 12:00 Location A X 1
1/1/2018 Nick 12:01 Location A Y 1
1/1/2018 John 12:02 Location B Z 1
1/1/2018 Harry 12:03 Location A X 1
1/1/2018 Harry 12:04 Location A X 1
1/1/2018 Harry 12:05 Location B Y 1
1/1/2018 Harry 12:06 Location B X 1
1/1/2018 Nick 12:07 Location A X 1
1/1/2018 Nick 12:08 Location A Y 1
The query returns the locations visited by a user and a count of picks done at each location. Results are sorted by user and time ascending. I need to group them so that CONSECUTIVE rows with the same User and Location are merged, with a SUM of the Count column and a comma-separated list of the unique values in the Service column. The final result should look something like this:
Date User Start Time End Time Location Service Count
1/1/2018 Nick 12:00 12:01 Location A X,Y 2
1/1/2018 John 12:02 12:02 Location B Z 1
1/1/2018 Harry 12:03 12:04 Location A X 2
1/1/2018 Harry 12:05 12:06 Location B X,Y 2
1/1/2018 Nick 12:07 12:08 Location A X,Y 2
I'm not sure where to start. Maybe lag or partition clauses? hoping an SQL guru can help here...
This is a gaps and islands problem. One method for solving it uses row_number():
select Date, User, min(Time) as start_time, max(time) as end_time,
Location,
listagg(Service, ',') within group (order by service),
count(*) as cnt
from (select t.*,
row_number() over (partition by date order by time) as seqnum,
row_number() over (partition by user, date, location order by time) as seqnum_2
from t
) t
group by Date, User, Location, (seqnum - seqnum_2);
It is a bit tricky to explain how this works. My suggestion is to run the subquery and you will see how the difference of row numbers defines the groups that you are looking for.
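To make the difference-of-row-numbers idea easier to see, here is a small sketch that runs only the subquery on the sample rows and prints both row numbers next to their difference. It assumes SQLite through Python's sqlite3 (3.25 or later) instead of Oracle, and uses hypothetical names (`visits`, `user_name`, `visit_time`); the Date column is left out because every sample row shares the same date.

```python
import sqlite3

# Sample rows from the question: (user, time, location).
visits = [
    ("Nick", "12:00", "Location A"), ("Nick", "12:01", "Location A"),
    ("John", "12:02", "Location B"),
    ("Harry", "12:03", "Location A"), ("Harry", "12:04", "Location A"),
    ("Harry", "12:05", "Location B"), ("Harry", "12:06", "Location B"),
    ("Nick", "12:07", "Location A"), ("Nick", "12:08", "Location A"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_name TEXT, visit_time TEXT, location TEXT)")
conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", visits)

query = """
SELECT user_name, visit_time, location,
       ROW_NUMBER() OVER (ORDER BY visit_time) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY user_name, location
                          ORDER BY visit_time) AS seqnum_2
FROM visits
ORDER BY visit_time
"""
islands = []
for user_name, visit_time, location, seqnum, seqnum_2 in conn.execute(query):
    diff = seqnum - seqnum_2  # constant within each run of consecutive rows
    islands.append((user_name, location, diff))
    print(user_name, visit_time, location, seqnum, seqnum_2, diff)
```

Within a run of consecutive rows for the same user and location, both counters advance together, so the difference stays constant. When another user's rows intervene, only the overall counter advances, so the next run gets a new difference. Note that Nick's second run and Harry's Location B run share the same difference, which is why user and location must stay in the GROUP BY alongside it.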
Use lag to get user and location values of previous row. Then use a running sum to generate a new group whenever the user and location change. Finally aggregate on the classified groups,user,location and date.
select Date, User, min(Time) as start_time, max(Time) as end_time, Location,
listagg(Service, ',') within group (order by Service),
count(*) as cnt
from (select Date, User, Time, Location, Service,
sum(case when prev_location = Location and prev_user = User then 0 else 1 end) over (order by Date, Time) as grp
from (select Date, User, Time, Location, Service,
lag(Location) over (order by Date, Time) as prev_location,
lag(User) over (order by Date, Time) as prev_user
from t
) t
) t
group by Date, User, Location, grp;
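The lag-plus-running-sum approach can be tried end to end with the sketch below. It is an approximation under stated assumptions: SQLite through Python's sqlite3 (3.25 or later) instead of Oracle, hypothetical names (`visits`, `user_name`, `visit_time`), the Date column omitted because all sample rows share one date, and `GROUP_CONCAT(DISTINCT ...)` swapped in for `listagg`, which SQLite does not have.

```python
import sqlite3

# Sample rows from the question: (user, time, location, service).
visits = [
    ("Nick", "12:00", "Location A", "X"), ("Nick", "12:01", "Location A", "Y"),
    ("John", "12:02", "Location B", "Z"),
    ("Harry", "12:03", "Location A", "X"), ("Harry", "12:04", "Location A", "X"),
    ("Harry", "12:05", "Location B", "Y"), ("Harry", "12:06", "Location B", "X"),
    ("Nick", "12:07", "Location A", "X"), ("Nick", "12:08", "Location A", "Y"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (user_name TEXT, visit_time TEXT, location TEXT, service TEXT)"
)
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", visits)

query = """
WITH flagged AS (
    -- 1 when the user or location differs from the previous row (or there is no previous row)
    SELECT user_name, visit_time, location, service,
           CASE WHEN LAG(user_name) OVER (ORDER BY visit_time) = user_name
                 AND LAG(location)  OVER (ORDER BY visit_time) = location
                THEN 0 ELSE 1 END AS is_new_group
    FROM visits
),
grouped AS (
    -- the running sum of the flags numbers the groups 1, 2, 3, ...
    SELECT *, SUM(is_new_group) OVER (ORDER BY visit_time) AS grp
    FROM flagged
)
SELECT user_name, MIN(visit_time), MAX(visit_time), location,
       GROUP_CONCAT(DISTINCT service), COUNT(*)
FROM grouped
GROUP BY grp, user_name, location
ORDER BY grp
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

This yields five groups, one per consecutive run. The order of services inside each concatenated list is not guaranteed by SQLite here, so sort it in the application if the order matters.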

Grouping sets of data in Oracle SQL

I have been trying to separate groups in data being stored on my oracle database for more accurate analysis.
Current Output
Time Location
10:00 A111
11:00 A112
12:00 S111
13:00 S234
17:00 A234
18:00 S747
19:00 A878
Desired Output
Time Location Group Number
10:00 A111 1
11:00 A112 1
12:00 S111 1
13:00 S234 1
17:00 A234 2
18:00 S747 2
19:00 A878 3
I have been trying to use over and partition by to assign the values, but I can only get it to increment on every row, not only on a change. I also tried using lag but struggled to make use of it.
I only need the value in the second column to start from 1 and increment when the first letter of field 1 changes (using substr).
This is my attempt using row_number but I am far off I think. There would be a time column in the output as well not shown above.
select event_time, st_location,
Row_Number() over (partition by SUBSTR(location,1,1) order by event_time) as groupnumber
from pic
Any help would be really appreciated!
Edit:
Time Location Group Number
10:00 A-10112 1
11:00 A-10421 1
12:00 ST-10621 1
13:00 ST-23412 1
17:00 A-19112 2
18:00 ST-74712 2
19:00 A-87812 3
It is a gap and island problem. Use the following code:
select location,
dense_rank() over (partition by SUBSTR(location,1,1) order by grp)
from
(
select (row_number() over (order by time)) -
(row_number() over (partition by SUBSTR(location,1,1) order by time)) grp,
location,
time
from data
) t
order by time
The main idea is in the subquery which isolates consecutive sequences of items (computation of grp column). The rest is simple once you have the grp column.
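Here is a minimal sketch of that query on the question's first data set, printing the intermediate grp value next to the final group number. It assumes SQLite through Python's sqlite3 (3.25 or later) in place of Oracle, and reuses the asker's `pic` table name with hypothetical column names `event_time` and `location`.

```python
import sqlite3

# Sample rows from the question: (time, location).
rows = [
    ("10:00", "A111"), ("11:00", "A112"), ("12:00", "S111"), ("13:00", "S234"),
    ("17:00", "A234"), ("18:00", "S747"), ("19:00", "A878"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pic (event_time TEXT, location TEXT)")
conn.executemany("INSERT INTO pic VALUES (?, ?)", rows)

query = """
SELECT event_time, location, grp,
       DENSE_RANK() OVER (PARTITION BY SUBSTR(location, 1, 1)
                          ORDER BY grp) AS group_number
FROM (SELECT event_time, location,
             -- overall position minus position within the same first letter
             ROW_NUMBER() OVER (ORDER BY event_time)
           - ROW_NUMBER() OVER (PARTITION BY SUBSTR(location, 1, 1)
                                ORDER BY event_time) AS grp
      FROM pic) AS t
ORDER BY event_time
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On this data grp comes out as 0, 0, 2, 2, 2, 3, 3. Ranking those values separately for each first letter gives group numbers 1, 1, 1, 1, 2, 2, 3, matching the desired output.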
select DENSE_RANK() over(partition by SUBSTR("location",1,1) ORDER BY SUBSTR("location",1,2))
as Rownumber,
"location" from Table1;
Demo
http://sqlfiddle.com/#!4/21120/16

Get MAX count but keep the repeated calculated value if highest

I have the following table, I am using SQL Server 2008
BayNo FixDateTime FixType
1 04/05/2015 16:15:00 tyre change
1 12/05/2015 00:15:00 oil change
1 12/05/2015 08:15:00 engine tuning
1 04/05/2016 08:11:00 car tuning
2 13/05/2015 19:30:00 puncture
2 14/05/2015 08:00:00 light repair
2 15/05/2015 10:30:00 super op
2 20/05/2015 12:30:00 wiper change
2 12/05/2016 09:30:00 denting
2 12/05/2016 10:30:00 wiper repair
2 12/06/2016 10:30:00 exhaust repair
4 12/05/2016 05:30:00 stereo unlock
4 17/05/2016 15:05:00 door handle repair
For any given day, I need to find the highest number of fixes made in a given bay. If that highest number occurs on more than one day, each of those days should also appear in the result set.
I would like to see the result set as follows:
BayNo FixDateTime noOfFixes
1 12/05/2015 00:15:00 2
2 12/05/2016 09:30:00 2
4 12/05/2016 05:30:00 1
4 17/05/2016 15:05:00 1
I managed to get the counts for each, but I am struggling to get the max and keep the repeated highest value. Can someone help, please?
Use window functions.
Get the count for each day by bayno and also find the min fixdatetime for each day per bayno.
Then use dense_rank to compute the highest ranked row for each bayno based on the number of fixes.
Finally get the highest ranked rows.
select distinct bayno,minfixdatetime,no_of_fixes
from (
select bayno,minfixdatetime,no_of_fixes
,dense_rank() over(partition by bayno order by no_of_fixes desc) rnk
from (
select t.*,
count(*) over(partition by bayno,cast(fixdatetime as date)) no_of_fixes,
min(fixdatetime) over(partition by bayno,cast(fixdatetime as date)) minfixdatetime
from tablename t
) x
) y
where rnk = 1
You are looking for rank() or dense_rank(). I would write the query like this:
select bayno, thedate, numFixes
from (select bayno, cast(fixdatetime as date) as thedate,
count(*) as numFixes,
rank() over (partition by bayno order by count(*) desc) as seqnum
from t
group by bayno, cast(fixdatetime as date)
) b
where seqnum = 1;
Note that this returns the date in question. The date does not have a time component.

Getting a row with two group by constraints

I have a table
TIMESTAMP ID Name
5/30/2016 11:45 1 Ben
5/30/2016 11:45 2 Ben
5/30/2016 23:15 2 Ben
5/30/2016 7:30 1 Peter
5/30/2016 6:05 1 Peter
5/30/2016 14:40 2 May
5/30/2016 1:05 1 May
Now, I need to get the MIN timestamp for each distinct Name.
Then if there are more than one MIN entry, choose the one with the MAX ID.
So the result should be
TIMESTAMP ID Name
5/30/2016 11:45 2 Ben
5/30/2016 6:05 1 Peter
5/30/2016 1:05 1 May
I tried using the query below:
SELECT MIN(TIMESTAMP),NAME FROM TBLSAMPLE WHERE TIMESTAMP BETWEEN TO_DATE('5/30/2016', 'MM/DD/YYYY' ) AND TO_DATE('5/30/2016', 'MM/DD/YYYY' ) + 1
GROUP BY NAME
and I could get the minimum time. But once I add MAX(ID), the result returns an entry that does not match any of the rows.
Your help is really appreciated.
You can do this with row_number():
select t.*
from (select t.*,
row_number() over (partition by name order by timestamp asc, id desc) as seqnum
from tblsample t
) t
where seqnum = 1;
Your question doesn't specify a condition on the dates. But if you want to add a where clause, then add it to the subquery.
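As a quick check of the row_number() approach on the sample rows, here is a minimal sketch. It assumes SQLite through Python's sqlite3 (3.25 or later) rather than Oracle, ISO-formatted timestamps, and a hypothetical column name `ts` in place of TIMESTAMP.

```python
import sqlite3

# Sample rows from the question: (timestamp, id, name).
rows = [
    ("2016-05-30 11:45", 1, "Ben"), ("2016-05-30 11:45", 2, "Ben"),
    ("2016-05-30 23:15", 2, "Ben"),
    ("2016-05-30 07:30", 1, "Peter"), ("2016-05-30 06:05", 1, "Peter"),
    ("2016-05-30 14:40", 2, "May"), ("2016-05-30 01:05", 1, "May"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblsample (ts TEXT, id INTEGER, name TEXT)")
conn.executemany("INSERT INTO tblsample VALUES (?, ?, ?)", rows)

query = """
SELECT ts, id, name
FROM (SELECT ts, id, name,
             -- earliest timestamp first; ties broken by the highest id
             ROW_NUMBER() OVER (PARTITION BY name
                                ORDER BY ts ASC, id DESC) AS seqnum
      FROM tblsample) AS ranked
WHERE seqnum = 1
ORDER BY name
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the tie-break sits inside the ORDER BY of the window, Ben's two 11:45 rows resolve to the one with ID 2, and each returned row is an actual row from the table rather than a mix of separately aggregated columns.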