Getting aggregated values - sql

I have the below table -
UserId Some_Value1 Datetime
1 0 24-11-2016 13:00
1 0 24-11-2016 13:45
1 1 24-11-2016 15:00
1 1 24-11-2016 17:15
2 0 25-11-2016 11:23
2 1 25-11-2016 13:22
2 0 25-11-2016 14:01
2 1 25-11-2016 18:00
As you can see, Some_Value1 is 1 when the gap between a row's Datetime and the previous row's Datetime is greater than 1 hour for the same UserId.
Instead of 1 and 0, I'm trying to get a sequence number that increments each time the gap is greater than 1 hour.
Something like the result below:
UserId Some_Value1 Datetime Some_Value2
1 0 24-11-2016 13:00 1
1 0 24-11-2016 13:45 1
1 1 24-11-2016 15:00 2
1 1 24-11-2016 17:15 3
2 0 25-11-2016 11:23 1
2 1 25-11-2016 13:22 2
2 0 25-11-2016 14:01 2
2 1 25-11-2016 18:00 3
I'm trying to achieve this using postgres or Redshift. Tagging Oracle and Mysql to reach a larger audience and get pseudo-code as SQL query would mostly be the same for all databases except for inbuilt functions.
All of the above data is for representational purpose only. This is just sample data and the real scenario would have more random data. Hence the code needs to be dynamic and not hard-coded.

sum (Some_Value1) over
(
partition by UserId
order by Datetime
rows unbounded preceding
) + 1 as Some_Value2

CREATE TABLE #Table (UserId INT,Some_Value1 INT,_Datetime VARCHAR(20),_Sum INT)
INSERT INTO #Table (UserId ,Some_Value1 ,_Datetime )
SELECT 1,0,'24-11-2016 13:00' UNION ALL
SELECT 1,0,'24-11-2016 13:45' UNION ALL
SELECT 1,1,'24-11-2016 15:00' UNION ALL
SELECT 1,1,'24-11-2016 17:15' UNION ALL
SELECT 2,0,'25-11-2016 11:23' UNION ALL
SELECT 2,1,'25-11-2016 13:22' UNION ALL
SELECT 2,0,'25-11-2016 14:01' UNION ALL
SELECT 2,1,'25-11-2016 18:00'
SELECT UserId ,Some_Value1,_Datetime,sum (Some_Value1) OVER(PARTITION BY UserId ORDER BY _Datetime) + 1 as Some_Value2
FROM #Table
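The running-sum idea from the answers above can be tried end to end with a minimal sketch. It uses Python's built-in sqlite3 module rather than Postgres or Redshift, and assumes the bundled SQLite is 3.25 or newer (needed for window functions). The table name t, the column name Dt, and the ISO date format are assumptions chosen for this sketch so that text ordering matches chronological ordering.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# (an assumption of this sketch) so ORDER BY sorts them chronologically.
ROWS = [
    (1, 0, "2016-11-24 13:00"),
    (1, 0, "2016-11-24 13:45"),
    (1, 1, "2016-11-24 15:00"),
    (1, 1, "2016-11-24 17:15"),
    (2, 0, "2016-11-25 11:23"),
    (2, 1, "2016-11-25 13:22"),
    (2, 0, "2016-11-25 14:01"),
    (2, 1, "2016-11-25 18:00"),
]


def running_series(rows):
    """Return rows extended with a per-user running sum of Some_Value1, plus 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (UserId INT, Some_Value1 INT, Dt TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT UserId, Some_Value1, Dt,
               SUM(Some_Value1) OVER (
                   PARTITION BY UserId
                   ORDER BY Dt
                   ROWS UNBOUNDED PRECEDING
               ) + 1 AS Some_Value2
        FROM t
        ORDER BY UserId, Dt
        """
    ).fetchall()
    conn.close()
    return result


series = [row[3] for row in running_series(ROWS)]
print(series)
```

Storing the timestamp as text in dd-mm-yyyy order, as in the VARCHAR example above, only sorts correctly by coincidence; a real date/time type or an ISO-formatted string is the safer choice.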


Oracle - get row repeated n times, with dynamic value on date

I'm sure there is an easy way to do this in SQL Oracle, but I'm not able to find it.
I have a table ( temp ) , with 3 columns (id_1, id_2, date ), where date is Date type. Each line is unique.
The output I want is to repeat each line 15 times, with a new date column that holds the original date in the first row, the original + 1 day in the second, the original + 2 days in the third, and so on, for each original row.
Define a helper CTE with 15 rows and simply cross join it with your original table:
with days as (
select rownum -1 as day_offset
from dual connect by level <= 15)
select
a.id1, a.id2, a.date_d + b.day_offset new_date_d
from tab a
cross join days b
order by 1,2,3;
Example - for the sample data ...
select * from tab;
ID1 ID2 DATE_D
---------- ---------- -------------------
1 1 01.01.2020 00:00:00
2 2 01.01.2019 00:00:00
... you will get the following output
ID1 ID2 NEW_DATE_D
---------- ---------- -------------------
1 1 01.01.2020 00:00:00
1 1 02.01.2020 00:00:00
1 1 03.01.2020 00:00:00
1 1 04.01.2020 00:00:00
1 1 05.01.2020 00:00:00
....
2 2 13.01.2019 00:00:00
2 2 14.01.2019 00:00:00
2 2 15.01.2019 00:00:00
30 rows selected.
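The cross join above amounts to pairing every source row with the offsets 0 to 14. A minimal sketch of the same logic in plain Python, outside the database, may make the row count easier to see. The function name expand_rows and the tuple layout are assumptions for illustration only.

```python
from datetime import date, timedelta


def expand_rows(rows, n_days=15):
    """Pair each (id1, id2, date_d) row with day offsets 0..n_days-1."""
    expanded = [
        (id1, id2, date_d + timedelta(days=offset))
        for (id1, id2, date_d) in rows
        for offset in range(n_days)
    ]
    # Mirror the ORDER BY 1,2,3 of the SQL version.
    return sorted(expanded)


# The two sample rows shown above.
tab = [(1, 1, date(2020, 1, 1)), (2, 2, date(2019, 1, 1))]
result = expand_rows(tab)
print(len(result))
print(result[0], result[-1])
```

Two source rows times 15 offsets gives the 30 rows reported by the query.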
Alternatively, you may use recursive subquery factoring.
The day offset is calculated recursively as days.day_offset + 1, kept below 15, and used to build the new date value:
with days( id1, id2, date_d, day_offset) as (
select id1, id2, date_d, 0 day_offset from tab
union all
select days.id1, days.id2, tab.date_d + days.day_offset + 1 as date_d,
days.day_offset + 1 as day_offset
from tab
join days
on tab.id1 = days.id1 and tab.id2 = days.id2
and days.day_offset +1 < 15)
select * from days
order by 1,2,3
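For experimenting with the recursive variant without an Oracle instance, here is a rough equivalent using Python's sqlite3 module. It is a sketch under assumptions, not a port: SQLite needs the RECURSIVE keyword, dates are stored as ISO text, the SQLite date() function replaces Oracle date arithmetic, and the recursive step advances the previous row's date by one day instead of joining back to tab.

```python
import sqlite3


def expand_with_recursive_cte(rows, n_days=15):
    """Repeat each (id1, id2, date_d) row n_days times, adding one day per step."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (id1 INT, id2 INT, date_d TEXT)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH RECURSIVE days(id1, id2, date_d, day_offset) AS (
            SELECT id1, id2, date_d, 0 FROM tab
            UNION ALL
            SELECT id1, id2, date(date_d, '+1 day'), day_offset + 1
            FROM days
            WHERE day_offset + 1 < ?
        )
        SELECT id1, id2, date_d FROM days
        ORDER BY 1, 2, 3
        """,
        (n_days,),
    ).fetchall()
    conn.close()
    return result


expanded = expand_with_recursive_cte([(1, 1, "2020-01-01"), (2, 2, "2019-01-01")])
print(len(expanded))
print(expanded[0], expanded[-1])
```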

Time series group by day and kind

I create a table using the command below:
CREATE TABLE IF NOT EXISTS stats (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
session_kind INTEGER NOT NULL,
ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
)
I insert some time series data using the command below:
INSERT INTO stats (session_kind) values (?1)
Some time after having executed several times the insert command, I have some time series data as below:
id session_kind ts
-----------------------------------------
1 0 2020-04-18 12:59:51 // day 1
2 1 2020-04-19 12:59:52 // day 2
3 0 2020-04-19 12:59:53
4 1 2020-04-19 12:59:54
5 0 2020-04-19 12:59:55
6 2 2020-04-19 12:59:56
7 2 2020-04-19 12:59:57
8 2 2020-04-19 12:59:58
9 2 2020-04-19 12:59:59
10 0 2020-04-20 12:59:51 // day 3
11 1 2020-04-20 12:59:52
12 0 2020-04-20 12:59:53
13 1 2020-04-20 12:59:54
14 0 2020-04-20 12:59:55
15 2 2020-04-20 12:59:56
16 2 2020-04-20 12:59:57
17 2 2020-04-20 12:59:58
18 2 2020-04-21 12:59:59 // day 4
I would like a command that groups my data by date, from the most recent day to the oldest, and counts the rows for each session_kind, like below (I don't want to pass any parameter to this command):
0 1 2 ts
-------------------------
0 0 1 2020-04-21 // day 4
3 2 3 2020-04-20 // day 3
2 2 4 2020-04-19 // day 2
1 0 0 2020-04-18 // day 1
How can I group my data as above?
You can do conditional aggregation:
select
sum(session_kind= 0) session_kind_0,
sum(session_kind= 1) session_kind_1,
sum(session_kind= 2) session_kind_2,
date(ts) ts_day
from mytable
group by date(ts)
order by ts_day desc
If you want something dynamic, then it might be simpler to put the results in rows rather than columns:
select date(ts) ts_day, session_kind, count(*) cnt
from mytable
group by date(ts), session_kind
order by ts_day desc, session_kind
If I understand correctly, you just want to count the rows per session_kind:
select date(ts) as ts_day,
sum(case when session_kind = 0 then 1 else 0 end) as cnt_0,
sum(case when session_kind = 1 then 1 else 0 end) as cnt_1,
sum(case when session_kind = 2 then 1 else 0 end) as cnt_2
from stats
group by date(ts);
You can also simplify this:
select date(ts) as ts_day,
sum( session_kind = 0 ) as cnt_0,
sum( session_kind = 1 ) as cnt_1,
sum( session_kind = 2 ) as cnt_2
from stats
group by date(ts);
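To check the conditional aggregation against the sample data in the question, here is a small sketch using Python's sqlite3 module. The helper name daily_counts and the fixed time of day are assumptions of the sketch; only the date part of ts matters for the grouping.

```python
import sqlite3

# session_kind values per day, copied from the sample data in the question.
KINDS_BY_DAY = {
    "2020-04-18": [0],
    "2020-04-19": [1, 0, 1, 0, 2, 2, 2, 2],
    "2020-04-20": [0, 1, 0, 1, 0, 2, 2, 2],
    "2020-04-21": [2],
}


def daily_counts(kinds_by_day):
    """Count rows per session_kind (0, 1, 2) for each day, newest day first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE stats (
            id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
            session_kind INTEGER NOT NULL,
            ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    for day, kinds in kinds_by_day.items():
        # The time of day is arbitrary here; the query only uses date(ts).
        conn.executemany(
            "INSERT INTO stats (session_kind, ts) VALUES (?, ?)",
            [(kind, f"{day} 12:59:59") for kind in kinds],
        )
    rows = conn.execute(
        """
        SELECT SUM(session_kind = 0),
               SUM(session_kind = 1),
               SUM(session_kind = 2),
               date(ts) AS ts_day
        FROM stats
        GROUP BY date(ts)
        ORDER BY ts_day DESC
        """
    ).fetchall()
    conn.close()
    return rows


counts = daily_counts(KINDS_BY_DAY)
for row in counts:
    print(row)
```

The SUM(condition) shorthand works in SQLite and MySQL because a comparison evaluates to 0 or 1; the CASE form is the portable spelling.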

Subtract subsequent row from previous row based on User

I have the following data and I want to subtract the previous row from the current row for each UserID. I tried the code below, but it does not give me what I want:
DECLARE @DATETBLE TABLE (UserID INT, Dates DATE)
INSERT INTO @DATETBLE VALUES
(1,'2018-01-01'), (1,'2018-01-02'), (1,'2018-01-03'),(1,'2018-01-13'),
(2,'2018-01-15'),(2,'2018-01-16'),(2,'2018-01-17'), (5,'2018-02-04'),
(5,'2018-02-05'),(5,'2018-02-06'),(5,'2018-02-11'), (5,'2018-02-17')
;with cte as (
select UserID,Dates, row_number() over (order by UserID) as seqnum
from @DATETBLE t
)
select t.UserID,t.Dates, datediff(day,tprev.Dates,t.Dates)as diff
from cte t left outer join
cte tprev
on t.seqnum = tprev.seqnum + 1;
Current Output
UserID Dates diff
1 2018-01-01 NULL
1 2018-01-02 1
1 2018-01-03 1
1 2018-01-13 10
2 2018-01-15 2
2 2018-01-16 1
2 2018-01-17 1
5 2018-02-04 18
5 2018-02-05 1
5 2018-02-06 1
5 2018-02-11 5
5 2018-02-17 6
My Expected Output
UserID Dates diff
1 2018-01-01 NULL
1 2018-01-02 1
1 2018-01-03 1
1 2018-01-13 10
2 2018-01-15 NULL
2 2018-01-16 1
2 2018-01-17 1
5 2018-02-04 NULL
5 2018-02-05 1
5 2018-02-06 1
5 2018-02-11 5
5 2018-02-17 6
Your tag (sql-server-2008) suggests using APPLY:
select t.userid, t.dates, datediff(day, t1.dates, t.dates) as diff
from @DATETBLE t outer apply
( select top (1) t1.*
from @DATETBLE t1
where t1.userid = t.userid and
t1.dates < t.dates
order by t1.dates desc
) t1;
If you have SQL Server 2012 or higher, you could use LAG() with a partition by UserID:
SELECT UserID
, Dates
, DATEDIFF(dd,COALESCE(LAG_DATES, Dates), Dates) as diff
FROM
(
SELECT UserID
, Dates
, LAG(Dates) OVER (PARTITION BY UserID ORDER BY Dates) as LAG_DATES
FROM @DATETBLE
) exp
This gives you 0 instead of NULL for the first date of each user, though; remove the COALESCE (DATEDIFF returns NULL when either date is NULL) if you want NULL there.
Since you tagged the post with SQL Server 2008, however, you may need a method that doesn't rely on this window function.
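What both answers compute, the difference to the previous date within the same user, can be cross-checked outside the database with a short Python sketch. The function name diff_per_user is an assumption for illustration; it yields None (the analogue of NULL) for each user's first row.

```python
from datetime import date
from itertools import groupby

# Sample data from the question.
DATETBLE = [
    (1, date(2018, 1, 1)), (1, date(2018, 1, 2)), (1, date(2018, 1, 3)), (1, date(2018, 1, 13)),
    (2, date(2018, 1, 15)), (2, date(2018, 1, 16)), (2, date(2018, 1, 17)),
    (5, date(2018, 2, 4)), (5, date(2018, 2, 5)), (5, date(2018, 2, 6)),
    (5, date(2018, 2, 11)), (5, date(2018, 2, 17)),
]


def diff_per_user(rows):
    """Return (user_id, date, days since that user's previous date or None)."""
    result = []
    # Sorting by (user_id, date) plays the role of PARTITION BY ... ORDER BY.
    for user_id, group in groupby(sorted(rows), key=lambda row: row[0]):
        previous = None
        for _, current in group:
            diff = None if previous is None else (current - previous).days
            result.append((user_id, current, diff))
            previous = current
    return result


diffs = [diff for _, _, diff in diff_per_user(DATETBLE)]
print(diffs)
```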

SQL: How do I modify this query to select unique by hour

(Looking for a better title)
Hello, I have the query below:
Declare @CDT varchar(23)
Declare @CDT2 varchar(23)
set @cdt = '2016-01-18 00:00:00.000'
set @cdt2 = '2016-01-26 00:00:00.000'
SELECT
spt.number AS [Hour of Day],
(SELECT COUNT(DISTINCT AgentId)
FROM history t2
WHERE DATEPART(HOUR, t2.calldatetime)=spt.number
AND projectid IN (5) and calldatetime between @cdt and @cdt2) AS [Project 5 ],
(SELECT COUNT(DISTINCT AgentId)
FROM history t2
WHERE DATEPART(HOUR, t2.calldatetime)=spt.number
AND projectid IN (124) and calldatetime between @cdt and @cdt2) AS [Project 124],
(SELECT COUNT(DISTINCT AgentId)
FROM history t2
WHERE DATEPART(HOUR, t2.calldatetime)=spt.number
AND projectid IN (576) and calldatetime between @cdt and @cdt2) AS [Project 576]
FROM master..spt_values spt
WHERE spt.number BETWEEN 0 AND 11 AND spt.type = 'p'
GROUP BY spt.number
ORDER BY spt.number
I now need to select a unique number per hour rather than a distinct amount overall.
For instance, if I run this with "select distinct(AgentId)" followed by the rest of the query, it gives me a count of AgentIds independent of the cases. How do I express "when AgentId is unique"?
I copied examples from the original question:
Project id Datetime Agentid
---------- ----------------------- ---------
5 11-23-2015 09:00:00.000 12
5 11-23-2015 10:00:00.000 12
6 11-23-2015 11:00:00.000 12
1 11-23-2015 12:00:00.000 3
3 11-23-2015 13:00:00.000 4
124 11-23-2015 14:00:00.000 7
124 11-23-2015 15:00:00.000 9
124 11-23-2015 16:00:00.000 10
576 11-23-2015 17:00:00.000 10
576 11-23-2015 18:00:00.000 44
576 11-23-2015 19:00:00.000 69
etc 11-23-2015 20:00:00.000 23
Expected output (Ignore the incorrect counts, assume they are correct from above^):
Datetime 5 124 576
------------- --- --- ---
09:00 - 09:59 0 4 5
10:00 - 10:59 4 3 1
11:00 - 11:59 5 2 1
12:00 - 12:59 1 1 1
13:00 - 13:59 6 1 1
14:00 - 14:59 6 1 1
15:00 - 15:59 7 1 2
16:00 - 16:59 8 1 3
17:00 - 17:59 9 1 3
18:00 - 18:59 1 1 2
19:00 - 19:59 12 1 0
20:00 - 20:59 0 0 0
so far
Hour of Day Project 5 Project 124 Project 576
0 0 0 0
1 0 0 0
2 0 0 0
3 0 0 0
4 0 0 0
5 0 0 0
6 0 0 0
7 0 0 0
8 0 0 0
9 0 0 0
10 0 0 0
11 0 0 0
I'm pretty sure you need to do this with subqueries:
SELECT
spt.number AS [Hour of Day],
(SELECT COUNT(DISTINCT AgentId)
FROM YourTable t2
WHERE DATEPART(HOUR, t2.yourdatetime)=spt.number
AND projectId IN (5)) AS [Project 5 ],
(SELECT COUNT(DISTINCT AgentId)
FROM YourTable t2
WHERE DATEPART(HOUR, t2.yourdatetime)=spt.number
AND projectId IN (124)) AS [Project 124],
(SELECT COUNT(DISTINCT AgentId)
FROM YourTable t2
WHERE DATEPART(HOUR, t2.yourdatetime)=spt.number
AND projectId IN (576)) AS [Project 576]
FROM master..spt_values spt
WHERE spt.number BETWEEN 0 AND 11 AND spt.type = 'p'
GROUP BY spt.number
ORDER BY spt.number
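As a way to sanity-check what the correlated subqueries should return, the per-hour distinct count can be sketched in plain Python. The function name distinct_agents_per_hour is an assumption, and the two rows marked as hypothetical are not in the question; they are added only so that one hour has a repeated agent and a second agent.

```python
from collections import defaultdict
from datetime import datetime


def distinct_agents_per_hour(rows, projects):
    """Map hour of day -> distinct agent counts, one count per project in `projects`."""
    seen = defaultdict(set)  # (hour, project_id) -> set of agent ids
    for project_id, call_dt, agent_id in rows:
        if project_id in projects:
            seen[(call_dt.hour, project_id)].add(agent_id)
    hours = sorted({hour for hour, _ in seen})
    return {
        hour: [len(seen.get((hour, project), ())) for project in projects]
        for hour in hours
    }


def at(hour, minute=0):
    # All sample rows fall on the same day as in the question.
    return datetime(2015, 11, 23, hour, minute)


history = [
    (5, at(9), 12), (5, at(10), 12), (6, at(11), 12), (1, at(12), 3), (3, at(13), 4),
    (124, at(14), 7), (124, at(15), 9), (124, at(16), 10),
    (576, at(17), 10), (576, at(18), 44), (576, at(19), 69),
    (5, at(9, 30), 12),  # hypothetical: same agent again in the 09:00 hour
    (5, at(9, 45), 13),  # hypothetical: a second agent in the 09:00 hour
]

per_hour = distinct_agents_per_hour(history, [5, 124, 576])
print(per_hour[9])
```

Agent 12 appears twice in the 09:00 hour for project 5 but is counted once, which is the behaviour COUNT(DISTINCT AgentId) gives per hour.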
Here is the table used by these queries:
DECLARE @wt TABLE (
projectid varchar(4) not null,
edate datetime not null,
agentid int not null );
If you want to get the counts by time and project, use this query:
SELECT edate, projectid, COUNT(*) as nentries
FROM @wt
GROUP BY edate, projectid;
I haven't dealt with bucketing the dates by hour; that is a separate issue.
To get a tabular result set as you have shown:
SELECT edate, [5] AS [Project 5], [124] AS [Project 124], [576] AS [Project 576]
FROM (
SELECT edate, CAST(projectid AS int) AS projectid
FROM @wt
WHERE ISNUMERIC(projectid) <> 0 ) AS s
PIVOT (
COUNT(projectid)
FOR projectid IN ([5], [124], [576])) AS p;
Here is the result set for the PIVOT query using the above data:
However, you have to specify the projects of interest in the query. If you want to have an arbitrary number of projects and get columns for each one, that is going to require dynamic SQL to construct the PIVOT query.
@Tab Alleman: I added some data to illustrate the conditions that will test your scenario. Here is the result set with the same PIVOT query:

sql query for finding slots for next four days avoiding sunday

I have a question about a SQL query.
I have a slots table. It contains the maximum number of slots per day, and the maximum slots for AM and PM:
DayName slots AM PM
1 Monday 50 30 20
2 Tuesday 50 30 20
3 Wednesday 50 30 20
4 Thursday 50 30 20
5 Friday 25 25 0
6 Saturday 15 15 0
7 Sunday 0 0 0
I have appointment table. This table is used for adding appointment
table structure
Appointdate iS_AM
8/7/2011 12:00:00 AM 1
8/5/2011 12:00:00 AM 1
8/6/2011 12:00:00 AM 1
8/2/2011 12:00:00 AM 1
8/2/2011 12:00:00 AM 1
8/2/2011 12:00:00 AM 0
8/3/2011 12:00:00 AM 0
8/4/2011 12:00:00 AM 1
8/4/2011 12:00:00 AM 0
If it is 1, it is AM; otherwise PM.
I need to display the remaining available slots for the next four days.
I need to skip Sundays.
How can I skip Sundays?
My query so far is this:
with cte as
(
select dateName(dw,appoint_date) dayN,convert(varchar(12),appoint_date,101) appoint_date, sum(case is_am when 1 then 1 else 0 end) as AM,
sum(case is_am when 0 then 1 else 0 end) as PM ,sum (case is_am when 0 then 1 when 1 then 1 end) as Total
from pda_appoint where
convert(varchar(12),appoint_date,111) between
Convert(varchar(10), getdate() ,111) and Convert(varchar(10), dateadd(dd,3,getdate()) ,111)
group by appoint_date
)
select p.AM-cte.AM as [Rem AM],p.PM-cte.PM as [Rem PM],p.slots-cte.Total as [Rem Total] from cte inner join pda_slots p on cte.dayN=day_name
The output is as follows:
Rem AM Rem PM Rem Total
28 19 47
30 19 49
29 19 48
23 0 23
I need to skip Sundays when calculating the next four days. Also, is my SQL query correct?
How about this?
SELECT TOP 4
dateName(dw,a.appoint_date) dayN,
(s.AM - SUM(case a.is_am when 1 then 1 else 0 end)) AS [Remaining AM],
(s.PM - SUM(case a.is_am when 0 then 1 else 0 end)) AS [Remaining PM],
(s.slots - COUNT(a.is_am)) AS [Remaining Total Slots]
FROM
pda_appoint a
INNER JOIN slot s ON dateName(dw,a.appoint_date) = s.DayName
WHERE
dateName(dw,a.appoint_date) != 'Sunday'
AND a.appoint_date > GETDATE()
GROUP BY a.appoint_date, s.AM, s.PM, s.slots
ORDER BY a.appoint_date
How about this:
declare @t table (DayName1 varchar(25), slots int, am int, pm int)
insert @t values('Monday',50,30,20)
insert @t values('Tuesday',50,30,20)
insert @t values('Wednesday',50,30,20)
insert @t values('Thursday',50,30,20)
insert @t values('Friday',50,30,20)
insert @t values('Saturday',50,30,20)
insert @t values('Sunday',50,30,20)
declare @t1 table (appoint_date datetime, is_am int)
insert @t1 values('8/9/2011',0)
insert @t1 values('8/10/2011',0)
insert @t1 values('8/10/2011',1)
/* You can create the below as a table-valued function that returns the values for the next 4 days. You need to pass @appoint_date as a parameter. */
declare @appoint_date datetime
set @appoint_date='8/6/2011'
;with cte as
(
select dateName(dw,@appoint_date) dayN,
convert(varchar(12),@appoint_date,101) appoint_date,
1 as num
Union all
select
dateName(dw,DATEADD(day, 1, appoint_date)) dayN,
convert(varchar(12),DATEADD(day, 1, appoint_date),101) appoint_date,
num+1
from cte
where num<5
)
select top 4 dayN,(c.AM-temp.AM) as AM,(c.PM-temp.PM) as PM,(c.Slots-Temp.Total) as Total
from
(
select TOP 4
dateName(dw,a.appoint_date) dayN,
SUM(case b.is_am when 1 then 1 else 0 end) AS AM,
SUM(case b.is_am when 0 then 1 else 0 end) as PM,
COUNT(b.is_am) AS Total
from cte a left outer join @t1 b
on a.appoint_date=b.appoint_date
where a.dayN !='Sunday'
group by a.appoint_date
)Temp
inner join @t c on Temp.dayN=c.dayname1
dayN AM PM Total
Saturday 30 20 50
Monday 30 20 50
Tuesday 30 19 49
Wednesday 29 19 48
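The core of the approach above (generate consecutive days, drop Sundays, keep the first four, then subtract booked appointments from the slot limits) can be sketched in plain Python for quick verification. The function names are assumptions for illustration. The slot limits are taken from the slots table in the question, not from the uniform 50/30/20 test values used in the answer, so the Saturday row differs from the output shown above.

```python
from collections import Counter
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Day name -> (total slots, AM slots, PM slots), from the question's slots table.
SLOTS = {
    "Monday": (50, 30, 20), "Tuesday": (50, 30, 20), "Wednesday": (50, 30, 20),
    "Thursday": (50, 30, 20), "Friday": (25, 25, 0), "Saturday": (15, 15, 0),
    "Sunday": (0, 0, 0),
}


def next_working_days(start, count=4):
    """Return the first `count` days from `start` onward that are not Sundays."""
    days = []
    current = start
    while len(days) < count:
        if current.weekday() != 6:  # date.weekday(): Monday is 0, Sunday is 6
            days.append(current)
        current += timedelta(days=1)
    return days


def remaining_slots(start, appointments, count=4):
    """Return (day name, remaining AM, remaining PM, remaining total) per day."""
    booked = Counter(appointments)  # (date, is_am) -> number of appointments
    result = []
    for day in next_working_days(start, count):
        name = WEEKDAYS[day.weekday()]
        total, am, pm = SLOTS[name]
        used_am = booked[(day, 1)]
        used_pm = booked[(day, 0)]
        result.append((name, am - used_am, pm - used_pm, total - used_am - used_pm))
    return result


# Appointments from the answer's test data: (appoint_date, is_am).
appointments = [(date(2011, 8, 9), 0), (date(2011, 8, 10), 0), (date(2011, 8, 10), 1)]
remaining = remaining_slots(date(2011, 8, 6), appointments)
for row in remaining:
    print(row)
```

Starting from Saturday 6 August 2011, Sunday is skipped and the four days returned are Saturday, Monday, Tuesday, and Wednesday, matching the day sequence in the result above.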