Oracle, SQL request to retrieve data by week - sql

I have a table that logs actions performed by users. I want to count, per week, the users whose ID changed from one beginning with K to one beginning with A between 01/01/2019 and today (20/06/2019). The USERID is unique to each user. In the example below, user 1000 changed ID from K to A, because the last action logged under a K ID is older than the first action logged under an A ID; user 1002 changed as well, for the same reason.
My log table looks like this:
ID date action USERID
KF12 01/01/2019 Create 1000
KG45 11/06/2019 Create 1002
KI89 06/05/2019 Modify 1003
AO22 20/03/2019 Delete 1000
AI88 20/06/2019 Delete 1002
..
Here is what I tried. It is incomplete, and I have no idea how to count the changes by week:
select distinct USERID, max(DATE_USER) over (partition by USERID)
FROM
HISTORY
WHERE
USERID in (Select distinct USERID
from HISTORY
where ID like 'K%'
and DATE_USER >= to_date('1.1.' || 2019, 'DD.MM.YYYY')
and DATE_USER < to_date('20.06.' || 2019 , 'DD.MM.YYYY')
INTERSECT
select distinct USERID
from HISTORY
where ID like 'A%'
and DATE_USER >= to_date('1.1.' || 2019, 'DD.MM.YYYY')
and DATE_USER < to_date('19.06.' || 2019 , 'DD.MM.YYYY'))
and ID like 'A%'
;
In this example the expected result is users 1000 and 1002, who changed on 20/03/2019 and 20/06/2019 respectively, so the result should look like this:
WEEKNUMBER COUNTOFCHANGE
25 1
12 1

Instead of analytic functions, you can use a self join to achieve the same result:
-- DATA PREPARATION
WITH LOGS(ID, "DATE",ACTION, USERID) AS
(SELECT 'KF12',TO_DATE('01/01/2019','DD/MM/RRRR'),'Create',1000 FROM DUAL UNION ALL
SELECT 'KG45',TO_DATE('11/06/2019','DD/MM/RRRR'),'Create',1002 FROM DUAL UNION ALL
SELECT 'KI89',TO_DATE('06/05/2019','DD/MM/RRRR'),'Modify',1003 FROM DUAL UNION ALL
SELECT 'AO22',TO_DATE('20/03/2019','DD/MM/RRRR'),'Delete',1000 FROM DUAL UNION ALL
SELECT 'AI88',TO_DATE('20/06/2019','DD/MM/RRRR'),'Delete',1002 FROM DUAL)
-- ACTUAL QUERY
SELECT
WK,
COUNT(DISTINCT USERID)
FROM
(
SELECT
TO_CHAR(L2."DATE", 'WW') WK,
L2.USERID
FROM
LOGS L1
JOIN LOGS L2 ON ( L1.USERID = L2.USERID
AND L1."DATE" < L2."DATE"
AND L1.ID LIKE 'K%'
AND L2.ID LIKE 'A%' )
)
GROUP BY
WK
Cheers!!

Use LAG to find each row's previous ID, keep the rows where the ID went from 'K' to 'A', and count as needed:
select wk, count(distinct USERID) n
from
(select log.*, to_char(dat,'ww') wk, lag(ID) over(partition by USERID order by dat) prev_id
from log) t
where substr(t.ID,1,1) = 'A' and substr(t.prev_id,1,1) = 'K'
group by wk
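For readers who want to check the LAG logic outside the database, here is a minimal Python sketch of the same idea. It is an illustration only, not Oracle code: the names `oracle_ww` and `count_changes_by_week` and the tuple layout are assumptions, and the week number imitates the 'WW' format model (days 1 to 7 of the year are week 1).

```python
from collections import defaultdict
from datetime import date


def oracle_ww(d):
    """Week of year the way TO_CHAR(d, 'WW') counts it: days 1-7 are week 1."""
    return (d.timetuple().tm_yday - 1) // 7 + 1


def count_changes_by_week(rows):
    """rows: iterable of (id, action_date, user_id). Returns {week: user count}."""
    by_user = defaultdict(list)
    for log_id, action_date, user_id in rows:
        by_user[user_id].append((action_date, log_id))

    users_per_week = defaultdict(set)
    for user_id, events in by_user.items():
        events.sort()  # chronological order, like ORDER BY dat
        # Pair every row with its predecessor, which is what LAG provides.
        for (_, prev_id), (cur_date, cur_id) in zip(events, events[1:]):
            if prev_id.startswith("K") and cur_id.startswith("A"):
                users_per_week[oracle_ww(cur_date)].add(user_id)
    return {week: len(users) for week, users in users_per_week.items()}


# Sample rows from the question.
LOGS = [
    ("KF12", date(2019, 1, 1), 1000),
    ("KG45", date(2019, 6, 11), 1002),
    ("KI89", date(2019, 5, 6), 1003),
    ("AO22", date(2019, 3, 20), 1000),
    ("AI88", date(2019, 6, 20), 1002),
]
```

Calling `count_changes_by_week(LOGS)` groups the two changes into weeks 12 and 25, matching the expected result in the question.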

Related

SQL counting days with gap / overlapping

I am working on a "counting days" problem almost identical to this one. I have a list of date ranges and need to count how many days are covered, counting overlapping days only once and leaving out the gaps. Same input and output.
From: Markus Jarderot
Input
ID d1 d2
1 2011-08-01 2011-08-08
1 2011-08-02 2011-08-06
1 2011-08-03 2011-08-10
1 2011-08-12 2011-08-14
2 2011-08-01 2011-08-03
2 2011-08-02 2011-08-06
2 2011-08-05 2011-08-09
Output
ID hold_days
1 11
2 8
SQL to find time elapsed from multiple overlapping intervals
But for the life of me I couldn't understand Markus Jarderot's solution.
SELECT DISTINCT
t1.ID,
t1.d1 AS date,
DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d1) AS n
FROM Orders t1
LEFT JOIN Orders t2 -- Join for any events occurring while this
ON t2.ID = t1.ID -- is starting. If this is a start point,
AND t2.d1 <> t1.d1 -- it won't match anything, which is what
AND t1.d1 BETWEEN t2.d1 AND t2.d2 -- we want.
GROUP BY t1.ID, t1.d1, t1.d2
HAVING COUNT(t2.ID) = 0
Why does DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d1) take MIN(d1) from the entire table? Is that regardless of ID?
And what does t1.d1 BETWEEN t2.d1 AND t2.d2 do? Is that to ensure only overlapping intervals are considered?
The same goes for the GROUP BY: I think it is there so that identical periods are discarded? I tried to trace the solution by hand but only got more confused.
This is mostly a duplicate of my answer here (including explanation) but with the inclusion of grouping on an id column. It should use a single table scan and does not require a recursive sub-query factoring clause (CTE) or self joins.
Oracle 11g R2 Schema Setup:
CREATE TABLE your_table ( id, usr, start_date, end_date ) AS
SELECT 1, 'A', DATE '2017-06-01', DATE '2017-06-03' FROM DUAL UNION ALL
SELECT 1, 'B', DATE '2017-06-02', DATE '2017-06-04' FROM DUAL UNION ALL -- Overlaps previous
SELECT 1, 'C', DATE '2017-06-06', DATE '2017-06-06' FROM DUAL UNION ALL
SELECT 1, 'D', DATE '2017-06-07', DATE '2017-06-07' FROM DUAL UNION ALL -- Adjacent to previous
SELECT 1, 'E', DATE '2017-06-11', DATE '2017-06-20' FROM DUAL UNION ALL
SELECT 1, 'F', DATE '2017-06-14', DATE '2017-06-15' FROM DUAL UNION ALL -- Within previous
SELECT 1, 'G', DATE '2017-06-22', DATE '2017-06-25' FROM DUAL UNION ALL
SELECT 1, 'H', DATE '2017-06-24', DATE '2017-06-28' FROM DUAL UNION ALL -- Overlaps previous and next
SELECT 1, 'I', DATE '2017-06-27', DATE '2017-06-30' FROM DUAL UNION ALL
SELECT 1, 'J', DATE '2017-06-27', DATE '2017-06-28' FROM DUAL UNION ALL -- Within H and I
SELECT 2, 'K', DATE '2011-08-01', DATE '2011-08-08' FROM DUAL UNION ALL -- Your data below
SELECT 2, 'L', DATE '2011-08-02', DATE '2011-08-06' FROM DUAL UNION ALL
SELECT 2, 'M', DATE '2011-08-03', DATE '2011-08-10' FROM DUAL UNION ALL
SELECT 2, 'N', DATE '2011-08-12', DATE '2011-08-14' FROM DUAL UNION ALL
SELECT 3, 'O', DATE '2011-08-01', DATE '2011-08-03' FROM DUAL UNION ALL
SELECT 3, 'P', DATE '2011-08-02', DATE '2011-08-06' FROM DUAL UNION ALL
SELECT 3, 'Q', DATE '2011-08-05', DATE '2011-08-09' FROM DUAL;
Query 1:
SELECT id,
SUM( days ) AS total_days
FROM (
SELECT id,
dt - LAG( dt ) OVER ( PARTITION BY id
ORDER BY dt ) + 1 AS days,
start_end
FROM (
SELECT id,
dt,
CASE SUM( value ) OVER ( PARTITION BY id
ORDER BY dt ASC, value DESC, ROWNUM ) * value
WHEN 1 THEN 'start'
WHEN 0 THEN 'end'
END AS start_end
FROM your_table
UNPIVOT ( dt FOR value IN ( start_date AS 1, end_date AS -1 ) )
)
WHERE start_end IS NOT NULL
)
WHERE start_end = 'end'
GROUP BY id
Results:
| ID | TOTAL_DAYS |
|----|------------|
| 1 | 25 |
| 2 | 13 |
| 3 | 9 |
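The running-sum idea behind the query (each start date counts +1, each end date counts -1, and a covered stretch ends when the sum returns to 0) can be sketched in Python as follows. This is a hedged illustration rather than a translation of the SQL: the function name `total_days` and the tuple layout are assumptions, and both end dates are counted as covered days, as in the results above.

```python
from collections import defaultdict
from datetime import date


def total_days(rows):
    """rows: iterable of (id, start_date, end_date), both ends inclusive."""
    events = defaultdict(list)
    for key, start, end in rows:
        events[key].append((start, 1))   # an interval opens
        events[key].append((end, -1))    # an interval closes

    totals = {}
    for key, evs in events.items():
        # Same ordering as the query: by date, starts before ends on the same day.
        evs.sort(key=lambda e: (e[0], -e[1]))
        depth, total, opened = 0, 0, None
        for day, value in evs:
            if depth == 0:
                opened = day             # first start of a new covered stretch
            depth += value
            if depth == 0:
                total += (day - opened).days + 1
        totals[key] = total
    return totals


# Rows with id 2 and 3 from the schema setup above.
RANGES = [
    (2, date(2011, 8, 1), date(2011, 8, 8)),
    (2, date(2011, 8, 2), date(2011, 8, 6)),
    (2, date(2011, 8, 3), date(2011, 8, 10)),
    (2, date(2011, 8, 12), date(2011, 8, 14)),
    (3, date(2011, 8, 1), date(2011, 8, 3)),
    (3, date(2011, 8, 2), date(2011, 8, 6)),
    (3, date(2011, 8, 5), date(2011, 8, 9)),
]
```

For these rows `total_days(RANGES)` gives 13 for id 2 and 9 for id 3, the same totals as the result table.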
The brute force method is to create all days (in a recursive query) and then count:
with dates(id, day, d2) as
(
select id, d1 as day, d2 from mytable
union all
select id, day + 1, d2 from dates where day < d2
)
select id, count(distinct day)
from dates
group by id
order by id;
Unfortunately there is a bug in some Oracle versions and recursive queries with dates don't work there. So try this code and see whether it works in your system. (I have Oracle 11.2 and the bug still exists there; so I guess you need Oracle 12c.)
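If the recursive query is not available, the brute-force idea itself is easy to try outside the database. The Python sketch below is an assumption-laden illustration (the name `distinct_days` and the tuple layout are made up here): it expands every range into its individual days and counts the distinct ones per id, counting both end dates.

```python
from datetime import date, timedelta


def distinct_days(rows):
    """rows: iterable of (id, d1, d2). Counts distinct days, ends inclusive."""
    days = {}
    for key, d1, d2 in rows:
        seen = days.setdefault(key, set())
        day = d1
        while day <= d2:          # same stop condition as "day < d2" plus the anchor row
            seen.add(day)
            day += timedelta(days=1)
    return {key: len(seen) for key, seen in days.items()}


# Input from the question.
MYTABLE = [
    (1, date(2011, 8, 1), date(2011, 8, 8)),
    (1, date(2011, 8, 2), date(2011, 8, 6)),
    (1, date(2011, 8, 3), date(2011, 8, 10)),
    (1, date(2011, 8, 12), date(2011, 8, 14)),
    (2, date(2011, 8, 1), date(2011, 8, 3)),
    (2, date(2011, 8, 2), date(2011, 8, 6)),
    (2, date(2011, 8, 5), date(2011, 8, 9)),
]
```

Because both end dates are counted, this sketch yields 13 and 9 for the sample input rather than the 11 and 8 shown in the question, which measures each stretch as end minus start.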
I guess Markus' idea is to find all starting points that are not within other ranges and all ending points that aren't. Then just take the first starting point till the first ending point, then the next starting point till the next ending point, etc. As Markus isn't using a window function to number starting and ending points, he must find a more complicated way to achieve this. Here is the query with ROW_NUMBER. Maybe this gives you an idea of what to look for in Markus' query.
select startpoint.id, sum(endpoint.day - startpoint.day)
from
(
select id, d1 as day, row_number() over (partition by id order by d1) as rn
from mytable m1
where not exists
(
select *
from mytable m2
where m1.id = m2.id
and m1.d1 > m2.d1 and m1.d1 <= m2.d2
)
) startpoint
join
(
select id, d2 as day, row_number() over (partition by id order by d1) as rn
from mytable m1
where not exists
(
select *
from mytable m2
where m1.id = m2.id
and m1.d2 >= m2.d1 and m1.d2 < m2.d2
)
) endpoint on endpoint.id = startpoint.id and endpoint.rn = startpoint.rn
group by startpoint.id
order by startpoint.id;
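The pairing of start points and end points can also be traced by hand with a small Python sketch. The function name `hold_days` and the tuple layout are assumptions for illustration; the two filters mirror the NOT EXISTS conditions of the query, and the sorted lists play the role of ROW_NUMBER (here the end points are sorted by their own date).

```python
from datetime import date


def hold_days(rows):
    """rows: iterable of (id, d1, d2). Sums end - start over merged stretches."""
    by_id = {}
    for key, d1, d2 in rows:
        by_id.setdefault(key, []).append((d1, d2))

    result = {}
    for key, spans in by_id.items():
        # A start point counts only if it does not fall inside another range.
        starts = sorted(d1 for d1, _ in spans
                        if not any(o1 < d1 <= o2 for o1, o2 in spans))
        # An end point counts only if it does not fall inside another range.
        ends = sorted(d2 for _, d2 in spans
                      if not any(o1 <= d2 < o2 for o1, o2 in spans))
        # n-th start pairs with n-th end, as the join on rn does.
        result[key] = sum((e - s).days for s, e in zip(starts, ends))
    return result


# Input from the question.
MYTABLE = [
    (1, date(2011, 8, 1), date(2011, 8, 8)),
    (1, date(2011, 8, 2), date(2011, 8, 6)),
    (1, date(2011, 8, 3), date(2011, 8, 10)),
    (1, date(2011, 8, 12), date(2011, 8, 14)),
    (2, date(2011, 8, 1), date(2011, 8, 3)),
    (2, date(2011, 8, 2), date(2011, 8, 6)),
    (2, date(2011, 8, 5), date(2011, 8, 9)),
]
```

For ID 1 the surviving start points are 1 and 12 August and the surviving end points are 10 and 14 August, so the sum is 9 + 2 = 11; ID 2 gives 8, as in the expected output of the question.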
If all your intervals start at different dates, consider them in ascending order by d1, counting how many days there are from d1 to the start of the next interval.
You can discard an interval if it is contained in another one.
The last interval won't have a follower.
This query should give you how many days each interval contributes:
select a.id, a.d1,nvl(min(b.d1), a.d2) - a.d1
from orders a
left join orders b
on a.id = b.id and a.d1 < b.d1 and a.d2 between b.d1 and b.d2
group by a.id, a.d1
Then group by id and sum the days.

SQL- how to retrieve by similar dates

Okay, so I have a table with a user_id column and a submitted_dtm column.
I want to find instances where users submitted multiple records within 1 day of each other, and count how many times that has happened.
I've tried something like
select * from table_t t where
(select count(*) from table_t t2 where
t.user_id = t2.user_id and
t.pk!=t2.pk and
t.submitted_dtm between t2.submitted_dtm-.5 and t2.submitted_dtm+.5)>0;
The problem is that this query returns a result for each record in a date group. Instead, I just want a result per date group. Ideally, I'd just get the count in that group.
That is, if I have 6 records:
user_id submitted_dtm
--------------------------
1 12/04/2017 1:15
1 12/04/2017 5:50
2 11/25/2017 2:00
2 11/25/2017 3:25
2 11/25/2017 6:05
2 10/06/2017 4:00
I want 2 results, a count of 2 and a count of 3.
Is it possible to do this in sql?
Following up on Dessma's answer.
select user_id, trunc(submitted_dtm), count(1)
from table_t
group by user_id, trunc(submitted_dtm)
having count(1) > 1;
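To see what the GROUP BY / HAVING combination returns for the sample rows, the same grouping can be sketched in Python. The name `same_day_groups` and the tuple layout are assumptions; `ts.date()` stands in for TRUNC(submitted_dtm).

```python
from collections import Counter
from datetime import datetime


def same_day_groups(rows):
    """rows: iterable of (user_id, submitted_dtm). Keeps groups with > 1 row."""
    counts = Counter((user_id, ts.date()) for user_id, ts in rows)
    return {key: n for key, n in counts.items() if n > 1}


# The six records from the question.
SUBMISSIONS = [
    (1, datetime(2017, 12, 4, 1, 15)),
    (1, datetime(2017, 12, 4, 5, 50)),
    (2, datetime(2017, 11, 25, 2, 0)),
    (2, datetime(2017, 11, 25, 3, 25)),
    (2, datetime(2017, 11, 25, 6, 5)),
    (2, datetime(2017, 10, 6, 4, 0)),
]
```

Note that this groups by calendar day, so two submissions just before and just after midnight land in different groups even though they are minutes apart.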
In Oracle 12.1 and higher, you can solve such problems easily with the match_recognize clause. Link to documentation (with examples) below; my only note about the solution below is that I left the date in DATE data type, especially important if the output is used in further computations. If it isn't, you can wrap within TO_CHAR() with whatever format model is appropriate for your users.
https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956
with
inputs ( user_id, submitted_dtm ) as (
select 1, to_date('12/04/2017 1:15', 'mm/dd/yyyy hh24:mi') from dual union all
select 1, to_date('12/04/2017 5:50', 'mm/dd/yyyy hh24:mi') from dual union all
select 2, to_date('11/25/2017 2:00', 'mm/dd/yyyy hh24:mi') from dual union all
select 2, to_date('11/25/2017 3:25', 'mm/dd/yyyy hh24:mi') from dual union all
select 2, to_date('11/25/2017 6:05', 'mm/dd/yyyy hh24:mi') from dual union all
select 2, to_date('10/06/2017 4:00', 'mm/dd/yyyy hh24:mi') from dual
)
-- End of simulated inputs (for testing only, not part of the solution).
-- SQL query begins below this line. Use your actual table and column names.
select user_id, submitted_dtm, cnt
from inputs
match_recognize(
partition by user_id
order by submitted_dtm
measures trunc(a.submitted_dtm) as submitted_dtm,
count(*) as cnt
pattern ( a b+ )
define b as trunc(submitted_dtm) = trunc(a.submitted_dtm)
);
USER_ID SUBMITTED_DTM CNT
---------- ------------------- ----------
1 2017-12-04 00:00:00 2
2 2017-11-25 00:00:00 3
I don't have data to test it, but I suspect something like this would do the trick:
SELECT t.user_id, To_char(t.submitted_dtm, 'dd/mm/yyyy'), COUNT(*)
FROM table_t t
INNER JOIN table_t t2
ON t.user_id = t2.user_id
AND t.pk != t2.pk
AND t.submitted_dtm BETWEEN t2.submitted_dtm - .5 AND
t2.submitted_dtm + .5
-- user_id is qualified because both t and t2 expose that column
GROUP BY t.user_id, To_char(t.submitted_dtm, 'dd/mm/yyyy')
HAVING COUNT(*) > 1
This is a general idea of how to get the instances.
select user_id, t1.submitted_dtm t1submitted, t2.submitted_dtm t2submitted
from table_t t1 join table_t t2 using (user_id)
where t2.submitted_dtm > t1.submitted_dtm
and t2.submitted_dtm - t1.submitted_dtm <= 1;
The last line could be modified somehow depending on what you mean by within a day.
To count the instances, create a derived table from the above and select count(*) from it.

Oracle distinct listagg with count

I need to display a comma-separated description per table row but I need the list to be distinct and have counts for some of the descriptions available.
every_description needs_count
---------------------------------
Bred yes
From Vendor yes
Grouped no
Removed yes
Separated no
Weaned yes
So in a day, the description could be something like Bred, Grouped, Weaned and I have this working now using LISTAGG and removing the duplicates using the solution mentioned here but I need to add counts for some of the descriptions like 5 Bred, Grouped, 2 Weaned.
Here's my current query where I'm stuck:
WITH cages AS (
SELECT 1234 AS id FROM DUAL
UNION SELECT 5678 AS id FROM DUAL
UNION SELECT 9012 AS id FROM DUAL
UNION SELECT 3456 AS id FROM DUAL
), cage_comments AS (
SELECT 1234 AS cage_id, 'Bred' AS description, TO_DATE('11/14/2017', 'MM/DD/YYYY') AS event_date FROM DUAL
UNION SELECT 5678 AS cage_id, 'Grouped' AS description, TO_DATE('11/14/2017', 'MM/DD/YYYY') AS event_date FROM DUAL
UNION SELECT 9012 AS cage_id, 'Weaned' AS description, TO_DATE('11/14/2017', 'MM/DD/YYYY') AS event_date FROM DUAL
UNION SELECT 3456 AS cage_id, 'Weaned' AS description, TO_DATE('11/14/2017', 'MM/DD/YYYY') AS event_date FROM DUAL
UNION SELECT 3456 AS cage_id, 'Bred' AS description, TO_DATE('11/02/2017', 'MM/DD/YYYY') AS event_date FROM DUAL
UNION SELECT 3456 AS cage_id, 'Grouped' AS description, TO_DATE('11/14/2017', 'MM/DD/YYYY') AS event_date FROM DUAL
), calendar AS (
SELECT dt
FROM (
SELECT TRUNC(LAST_DAY(TO_DATE(&month || '/01/' || &year, 'MM/DD/YYYY')) - ROWNUM + 1) dt
FROM DUAL CONNECT BY ROWNUM <= 31
)
WHERE dt >= TRUNC(TO_DATE(&month || '/01/' || &year, 'MM/DD/YYYY'), 'MM')
ORDER BY dt ASC
)
SELECT
cal.dt,
(
SELECT
CASE
WHEN COUNT(cc.cage_id) > 0 THEN RTRIM(
REGEXP_REPLACE(
(LISTAGG(cc.description, ',') WITHIN GROUP (ORDER BY cc.description)),
'([^,]*)(,\1)+($|,)',
'\1\3'
),
','
)
ELSE NULL
END
FROM cages c
LEFT JOIN cage_comments cc ON cc.cage_id = c.id
WHERE cc.event_date = cal.dt
) AS description
FROM calendar cal
ORDER BY cal.dt
In short - I'm just having difficulties adding the COUNT for some of the descriptions for that day. In the case above I would like to say 1 Bred for November 2, 2017 and 1 Bred, Grouped, 2 Weaned for November 14, 2017.
Currently you are aggregating all the descriptions (without DISTINCT) and then you remove the duplicate descriptions with a regular expression replace. This is very inefficient - it would be better to select distinct and then apply LISTAGG.
This becomes even more important if you need to add the count. Take the result of your join and GROUP BY description. (In particular, this will take care of DISTINCT). In the SELECT for this aggregate step, include the count. Then join the result to the additional table at the top of your question, and re-write the argument to LISTAGG to include a CASE expression, equal to the count (and a space) when the needs_count value is 'yes'.
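The steps described above (group by description, count, then prefix the count only where needs_count is 'yes') can be sketched in Python to make the intended output concrete. This is an illustration under assumptions: the names `NEEDS_COUNT` and `describe_day` are made up here, and the lookup table simply copies the one at the top of the question.

```python
from collections import Counter

# Lookup copied from the question: True where needs_count is 'yes'.
NEEDS_COUNT = {
    "Bred": True,
    "From Vendor": True,
    "Grouped": False,
    "Removed": True,
    "Separated": False,
    "Weaned": True,
}


def describe_day(descriptions):
    """Builds the distinct, sorted list for one day, with counts where needed."""
    counts = Counter(descriptions)        # GROUP BY description + COUNT(*)
    parts = []
    for description in sorted(counts):    # WITHIN GROUP (ORDER BY description)
        if NEEDS_COUNT.get(description, False):
            parts.append(f"{counts[description]} {description}")
        else:
            parts.append(description)
    return ", ".join(parts)
```

With the sample comments for November 14, 2017, `describe_day(["Bred", "Grouped", "Weaned", "Weaned", "Grouped"])` returns `1 Bred, Grouped, 2 Weaned`, which is the format asked for in the question.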
You could use:
SELECT
cal.dt,
(
SELECT LISTAGG(CASE WHEN COUNT(*)=1 THEN ''
ELSE CAST(COUNT(*) AS VARCHAR2(10)) || ' ' END || description, ', ')
WITHIN GROUP (ORDER BY description)
FROM cages c
LEFT JOIN cage_comments cc ON cc.cage_id = c.id
WHERE cc.event_date = cal.dt
GROUP BY cc.event_date, cc.description
) AS description
FROM calendar cal
ORDER BY cal.dt;

Filter rows by those created within a close timeframe

I have an application where users create orders that are stored in an Oracle database. I'm trying to find a bug that only happens when a user creates an order within 30 seconds of the last order they created.
Here is the structure of the order table:
order_id | user_id | creation_date
I would like to write a query that can give me a list of orders where the creation_date is within 30 seconds of the last order for the same user. The results will hopefully help me find the bug.
I tried using the Oracle LAG() function but it doesn't seem to work in the WHERE clause.
Any thoughts?
SELECT O.*
FROM YourTable O
WHERE EXISTS (
SELECT *
FROM YourTable O2
WHERE
O.creation_date > O2.creation_date
AND O.user_id = O2.user_id
AND O.creation_date - (30 / 86400) <= O2.creation_date
);
See this in action in a Sql Fiddle.
You can use the LAG function if you want, you would just have to wrap the query into a derived table and then put your WHERE condition in the outer query.
SELECT distinct
t1.order_id, t1.user_id, t1.creation_date
FROM
YourTable t1
join YourTable t2
on t2.user_id = t1.user_id
and t2.creation_date between t1.creation_date - 30/86400 and t1.creation_date
and t2.rowid <> t1.rowid
order by 3 desc
Example of using LAG(). The gap has to be computed from the whole timestamp: subtracting only the EXTRACT(SECOND ...) parts ignores minutes and hours, so 13:57:27 and 13:59:16 would look 11 seconds apart. For the real table, add PARTITION BY user_id to the OVER clause.
SELECT id, time_diff_in_seconds
, creation_date, prev_date
FROM
(
SELECT id, creation_date, prev_date
, ROUND((CAST(creation_date AS DATE) - CAST(prev_date AS DATE)) * 86400) time_diff_in_seconds
FROM
(
SELECT id, creation_date
, LAG(creation_date) OVER (ORDER BY creation_date) prev_date
FROM
( -- Table/data --
SELECT 1 id, timestamp '2013-03-20 13:56:58' creation_date FROM dual
UNION ALL
SELECT 2, timestamp '2013-03-20 13:57:27' FROM dual
UNION ALL
SELECT 3, timestamp '2013-03-20 13:59:16' FROM dual
)))
--WHERE time_diff_in_seconds <= 30
/
ID TIME_DIFF_IN_SECONDS
--------------------------
1 <<-- NULL, no previous row
2 29 <<-- returned if the WHERE is uncommented
3 109
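The per-user version of this check (compare each order with the same user's previous order) can be sketched in Python. Everything named here is hypothetical: the function `orders_close_to_previous`, the tuple layout, and the user ids 7 and 8 in the sample rows, which reuse the three timestamps above plus one extra order.

```python
from datetime import datetime, timedelta


def orders_close_to_previous(orders, window=timedelta(seconds=30)):
    """orders: iterable of (order_id, user_id, creation_date).

    Returns the ids of orders created within `window` of the same user's
    previous order, i.e. LAG partitioned by user and ordered by date.
    """
    last_seen = {}
    flagged = []
    for order_id, user_id, created in sorted(orders, key=lambda o: (o[1], o[2])):
        previous = last_seen.get(user_id)
        if previous is not None and created - previous <= window:
            flagged.append(order_id)
        last_seen[user_id] = created
    return flagged


# Hypothetical sample: user 7 has three orders, user 8 has one.
ORDERS = [
    (1, 7, datetime(2013, 3, 20, 13, 56, 58)),
    (2, 7, datetime(2013, 3, 20, 13, 57, 27)),
    (3, 7, datetime(2013, 3, 20, 13, 59, 16)),
    (4, 8, datetime(2013, 3, 20, 13, 57, 0)),
]
```

Only order 2 is flagged: it follows order 1 by 29 seconds, order 3 follows order 2 by 109 seconds, and order 4 belongs to a different user.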

Comparing values sql

I have a table from which I have to report the present status and the date from which this status is applicable.
Example:
Status date
1 26 July
1 24 July
1 22 July
2 21 July
2 19 July
1 16 July
0 14 July
Given this, I want to display the current status as 1 and the date as 22 July. I am not sure how to go about this.
Status date
1 25 July
1 24 July
1 20 July
In this case, I want to show the status as 1 and the date as 20 July.
This should pull what you need using very standard SQL:
-- Get the oldest date that is the current Status
select Status, min(date) as date
from MyTable
where date > (
-- Get the most recent date that isn't the current Status
select max(date)
from MyTable
where Status != (
-- Get the current Status
select Status -- May need max/min here for multiple statuses on same date
from MyTable
where date = (
-- Get the most recent date
select max(date)
from MyTable
)
)
)
group by Status
I'm assuming that the date column is of a data type suitable for sorting properly (as in, not a string, unless you can cast it).
This is a little inelegant, but it should work
SELECT status, date
FROM my_table t
WHERE status = ALL (SELECT status
FROM my_table
WHERE date = ALL(SELECT MAX(date) FROM my_table))
AND date = ALL (SELECT MIN(date)
FROM my_table t1
WHERE t1.status = t.status
AND NOT EXISTS (SELECT *
FROM my_table t2
WHERE t2.date > t1.date AND t2.status <> t1.status))
Another option is to use a window function like LEAD (or LAG depending on how you order your results). In this example we mark the row when the status changes with the date, order the results and exclude rows other than the first one:
with test_data as (
select 1 status, date '2012-07-26' status_date from dual union all
select 1 status, date '2012-07-24' status_date from dual union all
select 1 status, date '2012-07-22' status_date from dual union all
select 2 status, date '2012-07-21' status_date from dual union all
select 2 status, date '2012-07-19' status_date from dual union all
select 1 status, date '2012-07-16' status_date from dual union all
select 0 status, date '2012-07-14' status_date from dual)
select status, as_of
from (
select status
, case when status != lead(status) over (order by status_date desc) then status_date else null end as_of
from test_data
order by as_of desc nulls last
)
where rownum = 1;
Addendum:
The LEAD and LAG functions accept two more parameters: offset and default. The offset defaults to 1, and default defaults to null. The default allows you to determine what value to consider when you are at the beginning or end of the result set. In your case when the status has never changed, a default is needed. In this example I supplied -1 as a status default because I am assuming that status value is not part of your expected set:
with test_data as (
select 1 status, date '2012-07-25' status_date from dual union all
select 1 status, date '2012-07-24' status_date from dual union all
select 1 status, date '2012-07-20' status_date from dual)
select status, as_of
from (
select status
, case when status != lead(status,1,-1) over (order by status_date desc) then status_date else null end as_of
from test_data
order by as_of desc nulls last
)
where rownum = 1;
You can play around with the case condition (equals/not equals), the order by clause in the lead function, and the desired default to accomplish your needs.
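As a cross-check of what these queries are meant to return, the rule "walk back from the newest row until the status changes" can be sketched in Python. The function name `current_status_since` and the tuple layout are assumptions; the sample rows reuse the test data from the LEAD example.

```python
from datetime import date


def current_status_since(rows):
    """rows: iterable of (status, status_date). Returns (status, since)."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)  # newest first
    status, since = ordered[0]
    for other_status, other_date in ordered[1:]:
        if other_status != status:
            break                 # the current run of this status ends here
        since = other_date
    return status, since


CHANGED = [
    (1, date(2012, 7, 26)), (1, date(2012, 7, 24)), (1, date(2012, 7, 22)),
    (2, date(2012, 7, 21)), (2, date(2012, 7, 19)),
    (1, date(2012, 7, 16)), (0, date(2012, 7, 14)),
]
NEVER_CHANGED = [
    (1, date(2012, 7, 25)), (1, date(2012, 7, 24)), (1, date(2012, 7, 20)),
]
```

`current_status_since(CHANGED)` returns status 1 as of 22 July, and `current_status_since(NEVER_CHANGED)` returns status 1 as of 20 July, which covers the case where the status has never changed without needing a default value.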