I know that this question is essentially a duplicate of an older question of mine, but quite a few things have changed since I asked it, so I thought I'd ask a new question.
I have a table that holds phone call records which has the following fields:
END: Holds the timestamp of when a call ended - Data Type: DATE
LINE: Holds the phone line that was used for a call - Data Type: NUMBER
CALLDURATION: Holds the duration of a call in seconds - Data Type: NUMBER
The table has entries like this:
END LINE CALLDURATION
---------------------- ------------------- -----------------------
25/01/2012 14:05:10 6 65
25/01/2012 14:08:51 7 1142
25/01/2012 14:20:36 5 860
I need to create a query that returns the number of concurrent phone calls based on the data from that table. The query should calculate that number in different intervals. What I mean by that is that the results of the query should only contain a new entry whenever a call was started or ended. As long as the number of concurrent phone calls stays the same there should not be any additional entry in the output.
To make this more clear, here is an example of everything the query should return based on the example entries from the previous table:
TIMESTAMP LINE CALLDURATION STATUS CURRENTLYUSEDLINES
---------------------- ----- ------------- ------- -------------------
25/01/2012 13:49:49 7 1142 1 1
25/01/2012 14:04:05 6 65 1 2
25/01/2012 14:05:10 6 65 -1 1
25/01/2012 14:06:16 5 860 1 2
25/01/2012 14:08:51 7 1142 -1 1
25/01/2012 14:20:36 5 860 -1 0
I got the following example query from a colleague but unfortunately I do not fully understand it and it also does not work exactly as it should because for calls with a duration of 0 seconds it would sometimes have "-1" in the CURRENTLYUSEDLINES-column:
SELECT COALESCE (SUM (STATUS) OVER (ORDER BY END ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING), 0) CURRENTLYUSEDLINES
FROM (SELECT END - CALLDURATION / 86400 AS TIMESTAMP,
LINE,
CALLDURATION,
1 AS STATUS
FROM t_calls
UNION ALL
SELECT END,
LINE,
CALLDURATION,
-1 AS STATUS
FROM t_calls) t
ORDER BY 1;
Now I am supposed to make that query work like in the example but I'm not sure how to do that.
Could someone help me out with this or at least explain this query so I can try fixing it myself?
I think this will solve your problem:
SELECT TIMESTAMP,
SUM(SUM(STATUS)) OVER (ORDER BY TIMESTAMP) as CURRENTLYUSEDLINES
FROM ((SELECT END - CALLDURATION / (24*60*60) AS TIMESTAMP,
COUNT(*) AS STATUS
FROM t_calls
GROUP BY END - CALLDURATION / (24*60*60)
) UNION ALL
(SELECT END, - COUNT(*) AS STATUS
FROM t_calls
GROUP BY END
)
) t
GROUP BY TIMESTAMP
ORDER BY 1;
This is a slight simplification of your query. Because the starts and the ends are both aggregated per timestamp, you can get 0s, but you should not get negative values.
You are getting negative values because the "ends" of the calls are being processed before the "starts". This version does all the work "at the same time", because there is only one row per timestamp.
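To see why one row per timestamp avoids the negative values, here is a minimal Python sketch of the same idea (not part of the original answer). Each call contributes +1 at its start and -1 at its end, the deltas are summed per timestamp, and a running total is taken over the sorted timestamps. The function name `concurrent_calls`, the tuple layout and the zero-length call at 14:30:00 are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import accumulate


def concurrent_calls(calls):
    """calls: iterable of (end: datetime, duration_seconds: int).

    Returns a list of (timestamp, lines_in_use), one entry per distinct timestamp.
    """
    delta = defaultdict(int)
    for end, duration in calls:
        delta[end - timedelta(seconds=duration)] += 1  # call starts
        delta[end] -= 1                                 # call ends
    stamps = sorted(delta)
    totals = accumulate(delta[s] for s in stamps)       # running sum of net deltas
    return list(zip(stamps, totals))


calls = [
    (datetime(2012, 1, 25, 14, 5, 10), 65),
    (datetime(2012, 1, 25, 14, 8, 51), 1142),
    (datetime(2012, 1, 25, 14, 20, 36), 860),
    (datetime(2012, 1, 25, 14, 30, 0), 0),  # hypothetical zero-length call
]
for stamp, in_use in concurrent_calls(calls):
    print(stamp, in_use)
```

For the zero-length call the +1 and the -1 land on the same timestamp and cancel out before the running total is computed, so the count never dips below zero.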
You can use an UNPIVOT (using a similar technique to my answer here):
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( END, LINE, CALLDURATION ) AS
SELECT CAST( TIMESTAMP '2012-01-25 14:05:10' AS DATE ), 6, 65 FROM DUAL UNION ALL
SELECT CAST( TIMESTAMP '2012-01-25 14:08:51' AS DATE ), 7, 1142 FROM DUAL UNION ALL
SELECT CAST( TIMESTAMP '2012-01-25 14:20:36' AS DATE ), 5, 860 FROM DUAL;
Query 1:
SELECT p.*,
SUM( status ) OVER ( ORDER BY dt, status DESC ) AS currentlyusedlines
FROM (
SELECT end - callduration / 86400 As dt,
t.*
FROM table_name t
)
UNPIVOT( dt FOR status IN ( dt As 1, end AS -1 ) ) p
Results:
| LINE | CALLDURATION | STATUS | DT | CURRENTLYUSEDLINES |
|------|--------------|--------|----------------------|--------------------|
| 7 | 1142 | 1 | 2012-01-25T13:49:49Z | 1 |
| 6 | 65 | 1 | 2012-01-25T14:04:05Z | 2 |
| 6 | 65 | -1 | 2012-01-25T14:05:10Z | 1 |
| 5 | 860 | 1 | 2012-01-25T14:06:16Z | 2 |
| 7 | 1142 | -1 | 2012-01-25T14:08:51Z | 1 |
| 5 | 860 | -1 | 2012-01-25T14:20:36Z | 0 |
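The ordering used in the window above (dt first, then status descending) can also be sketched in Python; this is an illustration under my own naming, not the query itself. Every call becomes two events, and on equal timestamps the start (+1) sorts before the end (-1), so one row per event is kept and the running total still stays non-negative. `call_events` and the tuple layout are assumed names.

```python
from datetime import datetime, timedelta


def call_events(calls):
    """calls: iterable of (end, line, duration_seconds).

    Yields (timestamp, line, duration, status, lines_in_use) per start/end event.
    """
    events = []
    for end, line, duration in calls:
        events.append((end - timedelta(seconds=duration), line, duration, 1))
        events.append((end, line, duration, -1))
    # Same ordering as ORDER BY dt, status DESC: starts sort before ends on ties.
    events.sort(key=lambda e: (e[0], -e[3]))
    in_use = 0
    for stamp, line, duration, status in events:
        in_use += status
        yield stamp, line, duration, status, in_use


calls = [
    (datetime(2012, 1, 25, 14, 5, 10), 6, 65),
    (datetime(2012, 1, 25, 14, 8, 51), 7, 1142),
    (datetime(2012, 1, 25, 14, 20, 36), 5, 860),
]
rows = list(call_events(calls))
for row in rows:
    print(*row)
```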
Related
The following table represents the results of a given test.
Every result for the same test is either a pass (error_id = 0) or a fail (error_id <> 0).
I need help writing a query that returns the number of runs since the last good run (error_id = 0) and the date of that run.
| Date       | test_id | error_id |
|------------|---------|----------|
| 2019-12-20 | 123     | 23       |
| 2019-12-19 | 123     | 23       |
| 2019-12-17 | 123     | 22       |
| 2019-12-18 | 123     | 0        |
| 2019-12-16 | 123     | 11       |
| 2019-12-15 | 123     | 11       |
| 2019-12-13 | 123     | 11       |
| 2019-12-12 | 123     | 0        |
So the result for this example should be:
| 2019-12-18 | 123 | 4 |
as test 123 passed on 2019-12-18 and this happened 4 runs ago.
I have a query to determine whether a given run is an error or not, but I have trouble applying an appropriate window function to it to get the wanted result:
select test_id, Date, error_id, (CASE WHEN error_id <> 0 THEN 1 ELSE 0 END) as is_error
from testresults
You can generate a row number, in reverse order from the sorting of the query itself:
SELECT test_date, test_id, error_code,
(row_number() OVER (ORDER BY test_date asc) - 1) as runs_since_last_pass
FROM tests
WHERE test_date >= (SELECT MAX(test_date) FROM tests WHERE error_code=0)
ORDER BY test_date DESC
LIMIT 1;
Note that this will run into issues if test_date is not unique. Better use a timestamp (precise to the millisecond) instead of a date.
Here's a DBFiddle: https://www.db-fiddle.com/f/8gSHVcXMztuRiFcL8zLeEx/0
If there's more than one test_id, you'll want to add a PARTITION BY clause to the row number function, and the subquery would become a bit more complex. It may be more efficient to come up with a way to do this by a JOIN instead of a subquery, but it would be more cognitively complex.
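As a rough illustration of what the per-test_id version has to compute, here is a small Python sketch. It is not a translation of the query above; `runs_since_last_pass`, the tuple layout and the sample rows are all hypothetical. It groups the runs by test, finds the latest passing date for each test, and counts the runs that came after it, which is what the row number minus one gives when dates are unique.

```python
from collections import defaultdict
from datetime import date


def runs_since_last_pass(results):
    """results: iterable of (test_date, test_id, error_code).

    Returns {test_id: (last_pass_date, runs_after_that_pass)} for tests
    that have passed at least once.
    """
    by_test = defaultdict(list)
    for test_date, test_id, error_code in results:
        by_test[test_id].append((test_date, error_code))

    summary = {}
    for test_id, runs in by_test.items():
        passes = [d for d, code in runs if code == 0]
        if not passes:
            continue  # never passed: nothing to report
        last_pass = max(passes)
        later_runs = sum(1 for d, _ in runs if d > last_pass)
        summary[test_id] = (last_pass, later_runs)
    return summary


# Hypothetical sample data, not taken from the question.
sample = [
    (date(2019, 12, 12), 123, 0),
    (date(2019, 12, 13), 123, 11),
    (date(2019, 12, 14), 123, 0),
    (date(2019, 12, 15), 123, 7),
    (date(2019, 12, 16), 123, 7),
    (date(2019, 12, 12), 456, 3),
    (date(2019, 12, 16), 789, 0),
]
print(runs_since_last_pass(sample))
```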
I think you just want aggregation and some filtering:
select id, count(*),
max(date) filter (where error_id = 0) as last_success_date
from t
where date > (select max(t2.date) from t t2 where t2.error_id = 0)
group by id;
You have to use the Maximum date of the good runs for every test_id in your query. You can try this query:
select tr2.Date_error, tr.test_id, count(tr.error_id) from
testresults tr inner join (select max(Date_error) as Date_error, test_id
from testresults where error_id=0 group by test_id) tr2 on
tr.test_id=tr2.test_id and tr.Date_error >=tr2.Date_error
group by tr.test_id, tr2.Date_error
This should do the trick:
select count(*) from table t,
(select max(date) date from table where error_id = 0) good
where t.date >= good.date
Basically you are counting the rows that have a date >= the date of the last success.
Please note: if you need the number of days instead, that is a completely different query:
select now()::date - max(test_date) last_valid from tests
where error_code = 0;
The below set represents the sales of a product in consecutive weeks.
22,19,20,23,16,14,15,15,18,21,24,10,17
...
weekly sales table
date sales
week-1 : 22
week-2 : 19
week-3 : 20
...
week-12 : 10
week-13 : 17
I need to find the longest run of higher sales figures for consecutive weeks, i.e week-6 to week-11 represented by 14,15,15,18,21,24.
I am trying to use a recursive CTE to move forward to the next week(s) to find if the sales value is equal or higher. As long as the value is equal or higher, keep on moving to the next week, recording the ROWNUMBER of the anchor member (represents the starting week number) and the week number of the iterated row. With this approach, there are redundant recursive calls. For example, when cte is called for week-2, it iterates week-3, week-4 and week-5 as the sales values are higher on each week from its previous week. Now, after week-2, the cte should be called for week-5 as week-3, week-4 and week-5 have already been visited.
Basically, if I have already visited a row of filt_coll in my recursive calls, I do not want it to be passed to the CTE again. The rows marked as redundant should not be found and the values for actualweek column should be unique.
I know the sql below does not give a solution to my problem of finding the longest run of higher values. I can work out that from the max count of startweek column. For now, I am trying to figure out how to eliminate the redundant recursive calls.
START_WEEK | SALES | SALESLAG | SALESLEAD | ACTUALWEEK
1 | 22 | 0 | -3 | 1
2 | 19 | -3 | 1 | 2
2 | 20 | 1 | 3 | 3
2 | 23 | 3 | -7 | 4
3 | 20 | 1 | 3 | 3 <-(redundant)
3 | 23 | 3 | -7 | 4 <-(redundant)
4 | 23 | 3 | -7 | 4 <-(redundant)
6 | 14 | -2 | 1 | 6
...
with
-- begin test data
raw_data (sales) as
(
select '22,19,20,23,16,14,15,15,18,21,24,10,17' from dual
)
,
derived_tbl(week, sales) as
(
select level, regexp_substr(sales, '([[:digit:]]+)(,|$)', 1, level, null, 1)
from raw_data connect by level <= regexp_count(sales,',')+1
)
-- end test data
,
coll(week, sales, saleslag, saleslead) as
(
select week, sales,
nvl(sales - (lag(sales) over (order by week)), 0),
nvl((lead(sales) over (order by week) - sales), 0)
from derived_tbl
)
,
filt_coll(week, sales, saleslag, saleslead) as
(
select week, sales, saleslag, saleslead
from coll
where not (saleslag < 0 and saleslead < 0)
)
,
cte(startweek, sales, saleslag, saleslead, actualweek) as
(
select week, sales, saleslag, saleslead, week from filt_coll
-- where week not in (select week from cte)
-- *** want to achieve the effect of the above commented out line
union all
select cte.startweek, cl.sales, cl.saleslag, cl.saleslead, cl.week
from filt_coll cl, cte
where cl.week = cte.actualweek + 1 and cl.sales >= cte.sales
)
select * from cte
order by 1,actualweek
;
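For reference, the goal described above (the longest run of equal-or-higher weekly sales, with every week visited only once) can be sketched outside SQL as a single pass. This is a Python illustration of the intended result, not a fix for the recursive CTE; the function name `longest_non_decreasing_run` is an assumption.

```python
def longest_non_decreasing_run(sales):
    """Return (start_week, end_week) of the longest run where each week's
    sales are equal to or higher than the previous week's (weeks are 1-based).
    Ties keep the earliest run.
    """
    if not sales:
        return None
    best_start, best_len = 0, 1
    run_start = 0
    for i in range(1, len(sales)):
        if sales[i] < sales[i - 1]:
            run_start = i  # run broken: the next run starts here
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return best_start + 1, best_start + best_len


weekly_sales = [22, 19, 20, 23, 16, 14, 15, 15, 18, 21, 24, 10, 17]
print(longest_non_decreasing_run(weekly_sales))  # (6, 11)
```

Each week is compared with its predecessor exactly once, which is the "no redundant visits" behaviour the commented-out line in the CTE is trying to achieve.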
Let's say I have a table as below
date add_days
2015-01-01 5
2015-01-04 2
2015-01-11 7
2015-01-20 10
2015-01-30 1
What I want to do is compute the days_balance, i.e. whether date is greater or smaller than the previous date + N days (add_days), and take the cumulative sum of the day counts if the rows form a continuous series.
So the algorithm should work like
for i in 2:N_rows {
days_balance[i] := date[i-1] + add_days[i-1] - date[i]
if days_balance[i] >= 0 then
date[i] := date[i] + days_balance[i]
}
The expected result should be as follows
date days_balance
2015-01-01 0
2015-01-04 2
2015-01-11 -3
2015-01-20 -2
2015-01-30 0
Is it possible in pure SQL? I imagine it should be with some conditional joins, but cannot see how it could be implemented.
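To make the loop above concrete, here is a direct Python transcription of it (an illustration only, since the question asks for SQL). The function name `days_balance` and the list-of-tuples input are assumptions. The important detail is that a non-negative balance shifts date[i] forward before row i+1 is evaluated.

```python
from datetime import date, timedelta


def days_balance(rows):
    """rows: list of (date, add_days) sorted by date.

    Follows the loop from the question: the balance for row i is
    date[i-1] + add_days[i-1] - date[i], and a non-negative balance
    pushes date[i] forward before the next row is evaluated.
    Returns the original dates paired with their balances.
    """
    dates = [d for d, _ in rows]   # working copy, entries may be shifted
    balances = [0] * len(rows)     # the first row has no predecessor
    for i in range(1, len(rows)):
        balance = (dates[i - 1] + timedelta(days=rows[i - 1][1]) - dates[i]).days
        balances[i] = balance
        if balance >= 0:
            dates[i] = dates[i] + timedelta(days=balance)
    return [(d, b) for (d, _), b in zip(rows, balances)]


rows = [
    (date(2015, 1, 1), 5),
    (date(2015, 1, 4), 2),
    (date(2015, 1, 11), 7),
    (date(2015, 1, 20), 10),
    (date(2015, 1, 30), 1),
]
for d, b in days_balance(rows):
    print(d, b)
```

On the sample table this yields the balances 0, 2, -3, -2, 0, which matches the expected result listed in the question.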
I'm posting another answer since it may be useful to compare the two: they use different methods (this one does an n^2-style join, the other one uses a recursive CTE). This one takes advantage of the fact that you don't have to calculate the days_balance for each previous row before calculating it for a particular row; you just need to sum values from the previous rows.
drop table junk
create table junk(date DATETIME, add_days int)
insert into junk values
('2015-01-01',5 ),
('2015-01-04',2 ),
('2015-01-11',7 ),
('2015-01-20',10 ),
('2015-01-30',1 )
;WITH cte as
(
select ROW_NUMBER() OVER (ORDER BY date) i, date, add_days, ISNULL(DATEDIFF(DAY, LAG(date) OVER (ORDER BY date), date), 0) days_since_prev
FROM Junk
)
, combinedWithAllPreviousDaysCte as
(
select i [curr_i], date [curr_date], add_days [curr_add_days], days_since_prev [curr_days_since_prev], 0 [prev_add_days], 0 [prev_days_since_prev] from cte where i = 1 --get first row explicitly since it has no preceding rows
UNION ALL
select curr.i [curr_i], curr.date [curr_date], curr.add_days [curr_add_days], curr.days_since_prev [curr_days_since_prev], prev.add_days [prev_add_days], prev.days_since_prev [prev_days_since_prev]
from cte curr
join cte prev on curr.i > prev.i --join to all previous days
)
select curr_i, curr_date, SUM(prev_add_days) - curr_days_since_prev - SUM(prev_days_since_prev) [days_balance]
from combinedWithAllPreviousDaysCte
group by curr_i, curr_date, curr_days_since_prev
order by curr_i
outputs:
+--------+-------------------------+--------------+
| curr_i | curr_date | days_balance |
+--------+-------------------------+--------------+
| 1 | 2015-01-01 00:00:00.000 | 0 |
| 2 | 2015-01-04 00:00:00.000 | 2 |
| 3 | 2015-01-11 00:00:00.000 | -3 |
| 4 | 2015-01-20 00:00:00.000 | -5 |
| 5 | 2015-01-30 00:00:00.000 | -5 |
+--------+-------------------------+--------------+
Well, I think I have it with a recursive CTE (sorry, I only have Microsoft SQL Server available to me at the moment, so it may not comply with PostgreSQL).
Also I think the expected results you had were off (see comment above). If not, this can probably be modified to conform to your math.
drop table junk
create table junk(date DATETIME, add_days int)
insert into junk values
('2015-01-01',5 ),
('2015-01-04',2 ),
('2015-01-11',7 ),
('2015-01-20',10 ),
('2015-01-30',1 )
;WITH cte as
(
select ROW_NUMBER() OVER (ORDER BY date) i, date, add_days, ISNULL(DATEDIFF(DAY, LAG(date) OVER (ORDER BY date), date), 0) days_since_prev
FROM Junk
)
,recursiveCte (i, date, add_days, days_since_prev, days_balance, math) as
(
select top 1
i,
date,
add_days,
days_since_prev,
0 [days_balance],
CAST('no math for initial one, just has zero balance' as varchar(max)) [math]
from cte where i = 1
UNION ALL --recursive step now
select
curr.i,
curr.date,
curr.add_days,
curr.days_since_prev,
prev.days_balance - curr.days_since_prev + prev.add_days [days_balance],
CAST(prev.days_balance as varchar(max)) + ' - ' + CAST(curr.days_since_prev as varchar(max)) + ' + ' + CAST(prev.add_days as varchar(max)) [math]
from cte curr
JOIN recursiveCte prev ON curr.i = prev.i + 1
)
select i, DATEPART(day,date) [day], add_days, days_since_prev, days_balance, math
from recursiveCTE
order by date
And the results are like so:
+---+-----+----------+-----------------+--------------+------------------------------------------------+
| i | day | add_days | days_since_prev | days_balance | math |
+---+-----+----------+-----------------+--------------+------------------------------------------------+
| 1 | 1 | 5 | 0 | 0 | no math for initial one, just has zero balance |
| 2 | 4 | 2 | 3 | 2 | 0 - 3 + 5 |
| 3 | 11 | 7 | 7 | -3 | 2 - 7 + 2 |
| 4 | 20 | 10 | 9 | -5 | -3 - 9 + 7 |
| 5 | 30 | 1 | 10 | -5 | -5 - 10 + 10 |
+---+-----+----------+-----------------+--------------+------------------------------------------------+
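The recurrence used in the recursive step, and the reason the join-based answer can get the same numbers from plain sums, can be checked with a few lines of Python. This is only a sketch of the arithmetic; the list names are mine, and the values are copied from the add_days and days_since_prev columns above.

```python
from itertools import accumulate

add_days = [5, 2, 7, 10, 1]
days_since_prev = [0, 3, 7, 9, 10]  # gap to the previous row, 0 for the first

# Recursive form: balance[i] = balance[i-1] - days_since_prev[i] + add_days[i-1]
recursive = [0]
for i in range(1, len(add_days)):
    recursive.append(recursive[-1] - days_since_prev[i] + add_days[i - 1])

# Closed form: sum of all earlier add_days minus the sum of the gaps up to row i
added = [0] + list(accumulate(add_days))[:-1]
elapsed = list(accumulate(days_since_prev))
closed_form = [a - e for a, e in zip(added, elapsed)]

print(recursive)
print(closed_form)
```

Both lists come out as 0, 2, -3, -5, -5, matching the days_balance column in the two result tables.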
I don’t quite get how your algorithm returns your expected results? But let me share a technique I came up with that might help.
This will only work if the end result of your data is to be exported to Excel, and even then it won’t work in all scenarios depending on what format you export your dataset in, but here it is....
If you’re familiar with Excel formulas, what I discovered is that if you write an Excel formula in your SQL as another field, Excel will execute that formula for you as soon as you export the result (the method that works best for me is simply copying and pasting it into Excel, so that it doesn’t get formatted as text).
So for your example, here’s what you could do (noting again I don’t understand your algorithm, so this is probably wrong, but it’s just to give you the concept)
SELECT
date
, add_days
, '=INDEX($1:$65536,ROW()-1,COLUMN()-2)'
||'+INDEX($1:$65536,ROW()-1,COLUMN()-1)'
||'-INDEX($1:$65536,ROW(),COLUMN()-2)'
AS "days_balance[i]"
,'=IF(INDEX($1:$65536,ROW(),COLUMN()-1)>=0'
||',INDEX($1:$65536,ROW(),COLUMN()-3)'
||'+INDEX($1:$65536,ROW(),COLUMN()-1))'
AS "date[i]"
FROM
myTable
ORDER BY /*Ensure to order by whatever you need for your formula to work*/
The key part to making this work is using the INDEX formula function to select a cell based on the position of the current cell. ROW()-1 means "take the value from the previous record", and COLUMN()-2 means "take the value from two columns to the left of the current one". You can't use cell references like A2+B2-A3, because the row numbers won't change on export and such references assume fixed column positions.
I used SQL string concatenation with || just so it's easier to read on screen.
I tried this one in excel; it didn’t match your expected results. But if this technique works for you then just correct the excel formula to suit.
How do I retrieve only the max value of each group of consecutive values?
I have a telephone database with only unique values, and I want to get only the highest number of each group of consecutive telephone numbers (TelNr). I am struggling with this.
id | TeNr | Position
1 | 100 | SLMO2.1.3
2 | 101 | SLMO2.3.4
3 | 103 | SLMO2.4.1
4 | 104 | SLMO2.3.2
5 | 200 | SLMO2.5.1
6 | 201 | SLMO2.5.2
7 | 204 | SLMO2.5.5
8 | 300 | SLMO2.3.5
9 | 301 | SLMO2.6.2
10 | 401 | SLMO2.4.8
Result should be:
TelNr
101
104
201
204
301
401
I have tried almost every tip I could find so far, and I either get all TelNr or no number at all, which is useless in my case.
Any brilliant idea to run this with SQLITE?
So you're searching for gaps and want to get the first value of those gaps.
This is probably the best way to get them, try to check for a row with the current TeNr plus 1 and if there's none you found it:
select t1.TeNr, t1.TeNr + 1 as unused_TeNr
from tab as t1
left join Tab as t2
on t2.TeNr = t1.TeNr + 1
where t2.TeNr is null
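The same "no row with TeNr + 1" test can be sketched in Python to check it against the sample data from the question. This is an illustration only, and `run_ends` is a name I made up.

```python
def run_ends(numbers):
    """Return the numbers whose successor (n + 1) is absent, in ascending order."""
    present = set(numbers)
    return sorted(n for n in present if n + 1 not in present)


te_nr = [100, 101, 103, 104, 200, 201, 204, 300, 301, 401]
print(run_ends(te_nr))  # [101, 104, 201, 204, 301, 401]
```

The set lookup plays the role of the left join: a number is kept only when its successor is not found.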
Edit:
To get the range of missing values you need to use some old-style SQL as SQLite doesn't seem to support ROW_NUMBER, etc.
select
TeNr + 1 as RangeStart,
nextTeNr - 1 as RangeEnd,
nextTeNr - TeNr - 1 as cnt
from
(
select TeNr,
( select min(TeNr) from tab as t2
where t2.TeNr > t1.TeNr ) as nextTeNr
from tab as t1
) as dt
where nextTeNr > TeNr + 1
It's probably not very efficient, but it might be OK if the number of rows is small and/or there's an index on TeNr.
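For comparison, the range query above boils down to pairing each number with the next larger one and keeping the pairs that are more than 1 apart. A small Python sketch of that logic, with `missing_ranges` as an assumed name:

```python
def missing_ranges(numbers):
    """Return (range_start, range_end, count) for every gap between
    consecutive existing numbers."""
    values = sorted(set(numbers))
    return [
        (lo + 1, hi - 1, hi - lo - 1)
        for lo, hi in zip(values, values[1:])  # each value with its successor
        if hi > lo + 1
    ]


te_nr = [100, 101, 103, 104, 200, 201, 204, 300, 301, 401]
for gap in missing_ranges(te_nr):
    print(gap)
```

Sorting once and walking adjacent pairs avoids the per-row min() lookup that the correlated subquery performs.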
Getting each value in the gap as a row in your result set is very hard unless your version of SQLite supports recursive queries:
with recursive cte (TeNr, missing, maxTeNr) as
(
select
min(TeNr) as TeNr, -- start of range of existing numbers
0 as missing, -- 0 = TeNr exists, 1 = TeNr is missing
max(TeNr) as maxTeNr -- end of range of existing numbers
from tab
union all
select
cte.TeNr + 1, -- next TeNr, if it doesn't exist tab.TeNr will be NULL
case when tab.TeNr is not null then 0 else 1 end,
maxTeNr
from cte left join tab
on tab.TeNr = cte.TeNr + 1
where cte.TeNr + 1 < maxTeNr
)
select TeNr
from cte
where missing = 1
Depending on your data this might return a huge amount of rows.
You might also use the result of the previous RangeStart/RangeEnd query as input to this recursion.
I have a table with 2 columns. UTCTime and Values.
UTCTime is in 15-minute increments. I want a query that compares each value to the previous value within a one-hour span and displays a number between 0 and 4 depending on how many values stay constant. In other words, there is an entry for every 15-minute increment, and the value can remain constant, so I just need to compare each value to the previous one, per hour.
For example
+---------+-------+
| UTCTime | Value |
+---------+-------+
| 12:00   | 18.2  |
| 12:15   | 87.3  |
| 12:30   | 55.91 |
| 12:45   | 55.91 |
| 1:00    | 37.3  |
| 1:15    | 47.3  |
| 1:30    | 47.3  |
| 1:45    | 47.3  |
| 2:00    | 37.3  |
+---------+-------+
In this case, I just want a query that compares the 12:45 value to the 12:30 value, the 12:30 value to the 12:15 value, and so on. Since we are comparing within a one-hour span only, the number of constant values must be between 0 and 4 (0 means there are no constant values, 1 means there is one, as in the 12:00 hour of the example above).
The query should display:
+----------+----------------+
| UTCTime  | ConstantValues |
+----------+----------------+
| 12:00    | 1              |
| 1:00     | 2              |
+----------+----------------+
I just wanted to mention that I am new to SQL programming.
Thank you.
See SQL fiddle here
Below is the query you need, along with a working solution. Note: I changed the timeframe to 24 hrs.
;with SourceData(HourTime, Value, RowNum)
as
(
select
datepart(hh, UTCTime) HourTime,
Value,
row_number() over (partition by datepart(hh, UTCTime) order by UTCTime) RowNum
from foo
union
select
datepart(hh, UTCTime) - 1 HourTime,
Value,
5
from foo
where datepart(mi, UTCTime) = 0
)
select cast(A.HourTime as varchar) + ':00' UTCTime, sum(case when A.Value = B.Value then 1 else 0 end) ConstantValues
from SourceData A
inner join SourceData B on A.HourTime = B.HourTime and
(B.RowNum = (A.RowNum - 1))
group by cast(A.HourTime as varchar) + ':00'
select SUBSTRING_INDEX(UTCTime,':',1) as time,value, count(*)-1 as total
from foo group by value,time having total >= 1;
fiddle
Mine isn't much different from Vasanth's, same idea different approach.
The idea is that a self-join lets you carry it out simply. You could also use the LEAD() function to look at rows ahead of your current row, but in this case that would require a big CASE statement to cover every outcome.
;WITH T
AS (
SELECT a.UTCTime,b.VALUE,ROW_NUMBER() OVER(PARTITION BY a.UTCTime ORDER BY b.UTCTime DESC)'RowRank'
FROM (SELECT *
FROM #Table1
WHERE DATEPART(MINUTE,UTCTime) = 0
)a
JOIN #Table1 b
ON b.UTCTIME BETWEEN a.UTCTIME AND DATEADD(hour,1,a.UTCTIME)
)
SELECT T.UTCTime, SUM(CASE WHEN T.Value = T2.Value THEN 1 ELSE 0 END)
FROM T
JOIN T T2
ON T.UTCTime = T2.UTCTime
AND T.RowRank = T2.RowRank -1
GROUP BY T.UTCTime
If you run the portion inside the ;WITH T AS ( ) on its own, you'll see that it gets us the hour we're looking at and the values for that hour ordered by time. That result is then used in the portion below, which joins T to itself and compares each row with the next row (hence the RowRank - 1 in the JOIN).
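The windowing and the row-to-next-row comparison can be traced in Python as well. This is a rough sketch of the logic, not a translation of the T-SQL; `constant_values_per_hour` is an assumed name, and the date and the 24-hour times are made up because the question only lists clock times.

```python
from datetime import datetime, timedelta


def constant_values_per_hour(readings):
    """readings: list of (timestamp, value) at 15-minute steps.

    For every reading on the hour, compare each reading in the window
    [hour, hour + 1h] (both ends included) with the one before it and
    count the equal pairs. Hours with a single reading are skipped.
    """
    readings = sorted(readings)
    result = {}
    for start, _ in readings:
        if start.minute != 0:
            continue
        window = [v for t, v in readings if start <= t <= start + timedelta(hours=1)]
        pairs = list(zip(window, window[1:]))  # each value with its successor
        if pairs:
            result[start] = sum(1 for prev, cur in pairs if prev == cur)
    return result


day = datetime(2020, 1, 1)  # hypothetical date; the question only gives times
values = [18.2, 87.3, 55.91, 55.91, 37.3, 47.3, 47.3, 47.3, 37.3]
readings = [(day + timedelta(hours=12, minutes=15 * i), v) for i, v in enumerate(values)]
for hour, count in constant_values_per_hour(readings).items():
    print(hour.strftime("%H:%M"), count)
```

With the sample values this gives 1 for the first hour and 2 for the second, the counts the question asks for. The last hour has only one reading, so it produces no pair and no row, just as the inner join drops it.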