How to fetch records that have an alternate entry - sql

I need help fetching records that have an alternating set of entries associated with a unique value (e.g. user_id).
I want the output to be only (1111, 2222, 3333).
Here is the scenario:
user_id 1111 attended the .net course from 2005-01-01 to 2006-12-31,
then attended java from 2007-01-01 to 2009-12-31,
and later came back to .net.
I want to retrieve this kind of user_id.
user_id 4444 should not be in the output, because his courses never alternate.
UPDATE: 4444 attended Java from 2007 to 2009 and again from 2010 to 2012. Later he attended .net but never came back to Java, so he must be excluded from the output.
If GROUP BY is used, it considers records irrespective of the alternating course name.
We could write a procedure that loops and compares the alternating course names, but I want to know whether a query can do this.

You can use two INNER JOIN operations:
SELECT DISTINCT t1.user_id
FROM mytable AS t1
INNER JOIN mytable AS t2
ON t1.user_id = t2.user_id AND t1.id < t2.id AND t1.course_name <> t2.course_name
INNER JOIN mytable AS t3
ON t2.user_id = t3.user_id AND t2.id < t3.id AND t1.course_name = t3.course_name
I assume that id is an auto-increment field that reflects the order the rows have been inserted in the DB. Otherwise, you should use a date field in its place.
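As a quick check of the join logic, here is a minimal sketch that runs the query with Python's built-in sqlite3 module. SQLite only stands in for the asker's database, and the rows are hypothetical sample data matching the scenario in the question (id is assumed to reflect insert order). The select list is qualified as t1.user_id so the column reference is not ambiguous.

```python
import sqlite3

# Hypothetical sample rows: (id, user_id, course_name); id reflects insert order.
ROWS = [
    (1, 1111, ".net"), (2, 1111, "java"), (3, 1111, ".net"),
    (4, 2222, "java"), (5, 2222, ".net"), (6, 2222, ".net"), (7, 2222, "java"),
    (8, 3333, "java"), (9, 3333, ".net"), (10, 3333, "java"), (11, 3333, ".net"),
    (12, 4444, "java"), (13, 4444, "java"), (14, 4444, ".net"), (15, 4444, ".net"),
]

# t1 = earlier course, t2 = a different course after it, t3 = the t1 course again.
QUERY = """
SELECT DISTINCT t1.user_id
FROM mytable AS t1
INNER JOIN mytable AS t2
  ON t1.user_id = t2.user_id AND t1.id < t2.id AND t1.course_name <> t2.course_name
INNER JOIN mytable AS t3
  ON t2.user_id = t3.user_id AND t2.id < t3.id AND t1.course_name = t3.course_name
ORDER BY t1.user_id
"""

def users_with_return():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (id INTEGER PRIMARY KEY, user_id INTEGER, course_name TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", ROWS)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result

print(users_with_return())  # [1111, 2222, 3333]
```

User 4444 drops out because no java row follows his first .net row.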

Same as Giorgos Betsos' answer, only with select distinct to prevent duplicates.
SELECT DISTINCT t1.user_id
FROM mytable AS t1
INNER JOIN mytable AS t2
ON t1.user_id = t2.user_id AND t1.Start_Date < t2.Start_Date AND
t1.course_name <> t2.course_name
INNER JOIN mytable AS t3
ON t2.user_id = t3.user_id AND t2.Start_Date < t3.Start_Date AND
t1.course_name = t3.course_name
EDIT: Using Start_Date since the answer has been updated and IDs are not necessarily sequential.

This is a version utilizing Windowed Aggregate Functions instead of multiple self joins:
SELECT DISTINCT user_id
FROM
(
SELECT user_id
,course_name
,start_date
,RANK() -- number all courses
OVER (PARTITION BY user_id
ORDER BY start_date)
-
RANK() -- number each course
OVER (PARTITION BY user_id, course_name
ORDER BY start_date) AS x
FROM tab
) dt
GROUP BY user_id, course_name
HAVING MIN(x) <> MAX(x) -- same course but another inbetween
If a user takes the same course several times in a row, x stays the same; if another course came in between, x changes.
User who returns to java (min and max differ for java):
java 1 - 1 = 0
java 2 - 2 = 0 <--- min
.net 3 - 1 = 2
java 4 - 3 = 1 <--- max
User who never returns (min and max are equal for each course):
java 1 - 1 = 0
java 2 - 2 = 0
.net 3 - 1 = 2
.net 4 - 2 = 2
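The arithmetic above can be reproduced with a small Python sketch. It assumes one user's course names are already sorted by start_date and that the dates are distinct, so RANK behaves like a simple position counter; the function names are made up for illustration.

```python
from collections import defaultdict

def rank_differences(courses):
    """Return (course, x) pairs, where x = overall rank - rank within the course."""
    seen = defaultdict(int)
    diffs = []
    for overall_rank, course in enumerate(courses, start=1):
        seen[course] += 1  # rank of this row within its own course
        diffs.append((course, overall_rank - seen[course]))
    return diffs

def has_return(courses):
    """True if some course has differing x values (the HAVING MIN(x) <> MAX(x) test)."""
    by_course = defaultdict(list)
    for course, x in rank_differences(courses):
        by_course[course].append(x)
    return any(min(xs) != max(xs) for xs in by_course.values())

print(rank_differences(["java", "java", ".net", "java"]))
# [('java', 0), ('java', 0), ('.net', 2), ('java', 1)]
print(has_return(["java", "java", ".net", "java"]))  # True
print(has_return(["java", "java", ".net", ".net"]))  # False
```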

This uses a single table scan and does not rely on GROUP BY:
WITH table_name ( user_id, start_date, end_date, course_name, id ) AS (
SELECT 1111, DATE '2005-01-01', DATE '2006-12-31', '.net', 1 FROM DUAL UNION ALL
SELECT 1111, DATE '2007-01-01', DATE '2009-12-31', 'java', 2 FROM DUAL UNION ALL
SELECT 1111, DATE '2010-01-01', DATE '2020-12-31', '.net', 3 FROM DUAL UNION ALL
SELECT 2222, DATE '2005-01-01', DATE '2006-12-31', 'java', 4 FROM DUAL UNION ALL
SELECT 2222, DATE '2007-01-01', DATE '2008-12-31', '.net', 5 FROM DUAL UNION ALL
SELECT 2222, DATE '2009-01-01', DATE '2012-12-31', '.net', 6 FROM DUAL UNION ALL
SELECT 2222, DATE '2013-01-01', DATE '2016-12-31', 'java', 7 FROM DUAL UNION ALL
SELECT 3333, DATE '2005-01-01', DATE '2007-12-31', 'java', 8 FROM DUAL UNION ALL
SELECT 3333, DATE '2007-01-01', DATE '2008-12-31', '.net', 9 FROM DUAL UNION ALL
SELECT 3333, DATE '2009-01-01', DATE '2013-12-31', 'java', 10 FROM DUAL UNION ALL
SELECT 3333, DATE '2014-01-01', DATE '2016-12-31', '.net', 11 FROM DUAL UNION ALL
SELECT 4444, DATE '2007-01-01', DATE '2009-12-31', 'java', 12 FROM DUAL UNION ALL
SELECT 4444, DATE '2010-01-01', DATE '2012-12-31', 'java', 13 FROM DUAL UNION ALL
SELECT 4444, DATE '2013-01-01', DATE '2015-12-31', '.net', 14 FROM DUAL UNION ALL
SELECT 4444, DATE '2016-01-01', DATE '2016-12-31', '.net', 15 FROM DUAL
)
SELECT DISTINCT user_id
FROM (
SELECT user_id,
LEAD( course_name )
OVER ( PARTITION BY user_id, course_name ORDER BY start_date )
AS next_same_course,
LEAD( course_name )
OVER ( PARTITION BY user_id ORDER BY start_date )
AS next_course
FROM table_name
)
WHERE next_same_course IS NOT NULL
AND next_course <> next_same_course;
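To see why the two LEAD columns are enough, here is a rough Python sketch of the same test for a single user. It assumes the course names are already ordered by start_date; the helper name is hypothetical.

```python
def returns_to_course(courses):
    """True if some course is followed by a different course and then occurs again."""
    for i, course in enumerate(courses):
        later = courses[i + 1:]
        recurs_later = course in later            # next_same_course IS NOT NULL
        if recurs_later and later[0] != course:   # next_course <> next_same_course
            return True
    return False

SAMPLE = {
    1111: [".net", "java", ".net"],
    2222: ["java", ".net", ".net", "java"],
    3333: ["java", ".net", "java", ".net"],
    4444: ["java", "java", ".net", ".net"],
}
print([user for user, courses in SAMPLE.items() if returns_to_course(courses)])
# [1111, 2222, 3333]
```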

Related

How do you combine query results from different rows into one?

My original query:
SELECT desc, start_date
FROM foo.bar
WHERE desc LIKE 'Fall%' AND desc NOT LIKE '%Med%'
UNION
SELECT desc, end_date
FROM foo.bar
WHERE desc LIKE 'Spring%' AND desc NOT LIKE '%Med%'
ORDER BY start_date;
With this query, I get (roughly) the data set I am looking for. I now need to take that data and combine the results taking two at a time in order and then produce a result like:
DESC                      START_DATE END_DATE
------------------------- ---------- ---------
Fall 1971 - Spring 1972   15-AUG-71  15-MAY-72
Fall 1971 - Spring 1972   15-AUG-72  15-MAY-73
Here DESC is a concatenation of the DESC values from rows 1 and 2, START_DATE is the date from row 1, and END_DATE is the date from row 2. The same pattern continues for the entire data set.
Any help with a query that produces this result is greatly appreciated. I'm not sure whether I'm heading down the right path or whether the original query is just wrong.
As stated above, I tried the supplied query, which gives me the data I need. However, I've been unsuccessful in finding a way to format it into my desired output. It should also be noted that I am running this on an Oracle database.
Instead of union, use each of those queries as CTEs (with a slight modification - include row number you'll later use in JOIN):
Sample data:
with test (description, datum) as
  (select 'Fall 1971'  , date '1971-08-15' from dual union all
   select 'Spring 1972', date '1972-05-15' from dual union all
   select 'Fall 1972'  , date '1972-08-15' from dual union all
   select 'Spring 1973', date '1973-05-15' from dual union all
   select 'Fall 1973'  , date '1973-08-15' from dual union all
   select 'Spring 1974', date '1974-05-15' from dual union all
   select 'Fall 1974'  , date '1974-08-15' from dual union all
   select 'Spring 1975', date '1975-05-15' from dual
  ),
Query begins here: t_start and t_end represent your current queries
t_start as
  (select description, datum,
          row_number() over (order by datum) rn
   from test
   where description like 'Fall%' and description not like '%Med%'
  ),
t_end as
  (select description, datum,
          row_number() over (order by datum) rn
   from test
   where description like 'Spring%' and description not like '%Med%'
  )
Finally:
select s.description ||' - '|| e.description as description,
       s.datum start_date,
       e.datum end_date
from t_start s join t_end e on s.rn = e.rn
order by s.rn;
DESCRIPTION START_DAT END_DATE
------------------------- --------- ---------
Fall 1971 - Spring 1972 15-AUG-71 15-MAY-72
Fall 1972 - Spring 1973 15-AUG-72 15-MAY-73
Fall 1973 - Spring 1974 15-AUG-73 15-MAY-74
Fall 1974 - Spring 1975 15-AUG-74 15-MAY-75
SQL>
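The pairing idea (number the Fall rows and the Spring rows separately, then match equal positions) can also be sketched in Python. This is only an illustration with a few of the sample rows; pair_terms is a hypothetical name, and zip plays the role of the join on rn.

```python
from datetime import date

TERMS = [
    ("Fall 1971", date(1971, 8, 15)), ("Spring 1972", date(1972, 5, 15)),
    ("Fall 1972", date(1972, 8, 15)), ("Spring 1973", date(1973, 5, 15)),
]

def pair_terms(rows):
    def pick(prefix):
        # Same filter as the CTEs: matching prefix, no 'Med' rows, ordered by date.
        chosen = [r for r in rows if r[0].startswith(prefix) and "Med" not in r[0]]
        return sorted(chosen, key=lambda r: r[1])

    # zip matches the n-th Fall row with the n-th Spring row (like s.rn = e.rn).
    return [(f"{s[0]} - {e[0]}", s[1], e[1]) for s, e in zip(pick("Fall"), pick("Spring"))]

for description, start_date, end_date in pair_terms(TERMS):
    print(description, start_date, end_date)
```

Like the inner join, zip silently drops a trailing Fall row that has no Spring partner.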
You can also use the MODEL clause to avoid scanning the table twice:
with data(description,datum) as (
select 'Fall 1971' , date '1971-08-15' from dual union all
select 'Spring 1972', date '1972-05-15' from dual union all
select 'Fall 1972' , date '1972-08-15' from dual union all
select 'Spring 1973', date '1973-05-15' from dual union all
select 'Fall 1973' , date '1973-08-15' from dual union all
select 'Spring 1974', date '1974-05-15' from dual union all
select 'Fall 1974' , date '1974-08-15' from dual union all
select 'Spring 1975', date '1975-05-15' from dual
)
select description, start_date, end_date
from (
select rn, desc1 as description, start_date, end_date
from (
select row_number() over(order by datum) as rn, description, datum
from data
where description not like '%Med%'
)
model
dimension by (rn)
measures (
cast(' ' as varchar2(256)) as desc1, description, cast(NULL as DATE) start_date, cast(NULL as DATE) end_date , datum
)
rules (
desc1[mod(rn,2)=1] = description[cv()] || ' - ' || description[cv()+1],
start_date[mod(rn,2)=1] = datum[cv()],
end_date[mod(rn,2)=1] = datum[cv()+1]
)
)
where mod(rn,2)=1
;

How to make a query showing purchases of a client on the same day, but only if those were made in different stores (oracle)?

I want to show cases of clients with at least 2 purchases on the same day. But I only want to count those purchases that were made in different stores.
So far I have:
Select Purchase.PurClientId, Purchase.PurDate, Purchase.PurId
from Purchase
join
(
Select count(Purchase.PurId),
Purchase.PurClientId,
to_date(Purchase.PurDate)
from Purchases
group by Purchase.PurClientId,
to_date(Purchase.PurDate)
having count (Purchase.PurId) >=2
) k
on k.PurClientId=Purchase.PurClientId
But I have no clue how to make it count purchases only if those were made in different stores. The column which would allow to identify shop is Purchase.PurShopId.
Thanks for help!
You can use:
SELECT PurId,
PurDate,
PurClientId,
PurShopId
FROM (
SELECT p.*,
COUNT(DISTINCT PurShopId) OVER (
PARTITION BY PurClientId, TRUNC(PurDate)
) AS num_stores
FROM Purchase p
)
WHERE num_stores >= 2;
Or
SELECT *
FROM Purchase p
WHERE EXISTS(
SELECT 1
FROM Purchase x
WHERE p.purclientid = x.purclientid
AND p.purshopid != x.purshopid
AND TRUNC(p.purdate) = TRUNC(x.purdate)
);
Which, for the sample data:
CREATE TABLE purchase (
purid PRIMARY KEY,
purdate,
purclientid,
PurShopId
) AS
SELECT 1, DATE '2021-01-01', 1, 1 FROM DUAL UNION ALL
SELECT 2, DATE '2021-01-02', 1, 1 FROM DUAL UNION ALL
SELECT 3, DATE '2021-01-02', 1, 2 FROM DUAL UNION ALL
SELECT 4, DATE '2021-01-03', 1, 1 FROM DUAL UNION ALL
SELECT 5, DATE '2021-01-03', 1, 1 FROM DUAL UNION ALL
SELECT 6, DATE '2021-01-04', 1, 2 FROM DUAL;
Both output:
PURID PURDATE             PURCLIENTID PURSHOPID
----- ------------------- ----------- ---------
    2 2021-01-02 00:00:00           1         1
    3 2021-01-02 00:00:00           1         2
db<>fiddle here
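For readers without an Oracle instance, the EXISTS variant can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite replaces Oracle, dates are stored as ISO text, and SQLite's date() function stands in for TRUNC.

```python
import sqlite3

# Same sample rows as above: (purid, purdate, purclientid, purshopid).
ROWS = [
    (1, "2021-01-01", 1, 1), (2, "2021-01-02", 1, 1), (3, "2021-01-02", 1, 2),
    (4, "2021-01-03", 1, 1), (5, "2021-01-03", 1, 1), (6, "2021-01-04", 1, 2),
]

QUERY = """
SELECT p.purid, p.purdate, p.purclientid, p.purshopid
FROM purchase p
WHERE EXISTS (
    SELECT 1
    FROM purchase x
    WHERE p.purclientid = x.purclientid
      AND p.purshopid != x.purshopid
      AND date(p.purdate) = date(x.purdate)  -- SQLite stand-in for TRUNC
)
ORDER BY p.purid
"""

def same_day_different_stores():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purchase (purid INTEGER PRIMARY KEY, purdate TEXT, "
        "purclientid INTEGER, purshopid INTEGER)"
    )
    conn.executemany("INSERT INTO purchase VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

print(same_day_different_stores())
# [(2, '2021-01-02', 1, 1), (3, '2021-01-02', 1, 2)]
```

Purchases 4 and 5 share a day but also share a store, so they are not returned.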

How to get min and max from 2 tables in SQL

I am trying to get the start date from the min ID (ID=1) and the end date from the max ID (ID=3), but I am not sure how to retrieve them. Following is my data -
Table1 and Table2 are the source tables. I am trying to get output like the 3rd table.
My requirement is to get the start date from the first record and the end date from the last record. The first and last records can be recognized by the ID field: the min ID is the first record and the max ID is the last record.
Please help me!
Here's one option; presuming you use Oracle (since you mention Oracle SQL Developer), the x inline view selects
start_date which belongs to name with the lowest ID column value for that name (i.e. first_value partition by name order by id)
end_date which belongs to name with the highest ID column value for that name (i.e. first_value partition by name order by id DESC)
with
  -- sample data
  t1 (pid, name) as
    (select 123, 'xyz' from dual union all
     select 234, 'pqr' from dual
    ),
  t2 (id, name, start_date, end_date) as
    (select 1, 'xyz', date '2020-01-01', date '2020-07-20' from dual union all
     select 2, 'xyz', date '2020-02-01', date '2020-05-30' from dual union all
     select 3, 'xyz', date '2020-06-30', date '2020-07-30' from dual union all
     --
     select 1, 'pqr', date '2020-04-30', date '2020-09-30' from dual union all
     select 2, 'pqr', date '2020-05-30', date '2020-09-30' from dual union all
     select 3, 'pqr', date '2020-06-30', date '2020-07-01' from dual
    )
select a.pid,
       x.name,
       max(x.start_date) start_date,
       max(x.end_date) end_date
from t1 a join
  (
   -- start_date: always for the lowest T2.ID value row
   -- end_date  : always for the highest T2.ID value row
   select b.name,
          first_value(b.start_date) over (partition by b.name order by b.id ) start_date,
          first_value(b.end_date) over (partition by b.name order by b.id desc) end_date
   from t2 b
  ) x
  on a.name = x.name
group by a.pid,
         x.name
order by a.pid;
PID NAME START_DATE END_DATE
---------- ---- ---------- ----------
123 xyz 01/01/2020 07/30/2020
234 pqr 04/30/2020 07/01/2020
SQL>
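The same "start date of the lowest ID, end date of the highest ID" rule can be sketched in Python, which may help when checking the result by hand. The rows are the t2 sample data from above; the function name is made up for illustration.

```python
from datetime import date

# (id, name, start_date, end_date), copied from the t2 sample data.
T2 = [
    (1, "xyz", date(2020, 1, 1), date(2020, 7, 20)),
    (2, "xyz", date(2020, 2, 1), date(2020, 5, 30)),
    (3, "xyz", date(2020, 6, 30), date(2020, 7, 30)),
    (1, "pqr", date(2020, 4, 30), date(2020, 9, 30)),
    (2, "pqr", date(2020, 5, 30), date(2020, 9, 30)),
    (3, "pqr", date(2020, 6, 30), date(2020, 7, 1)),
]

def first_start_last_end(rows):
    """Map each name to (start_date of its lowest id, end_date of its highest id)."""
    result = {}
    for name in {r[1] for r in rows}:
        group = [r for r in rows if r[1] == name]
        first = min(group, key=lambda r: r[0])  # row with the lowest id
        last = max(group, key=lambda r: r[0])   # row with the highest id
        result[name] = (first[2], last[3])
    return result

for name, (start_date, end_date) in sorted(first_start_last_end(T2).items()):
    print(name, start_date, end_date)
```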

Return Month wise count if no data for month return 0 as count in oracle sql

I have a table with data for January to March (up to the current month), and I am able to get the month-wise count. But the user requirement is to display zero for the rest of the months. Kindly suggest.
For example:
select count(a.emp_id) as cnt ,to_char(a.due_date,'MONTH') as Process_Month from EMP_Request a
where a.due_date is not null
group by to_char(a.due_date,'MONTH')
Output:
cnt Process_month
20 JANUARY
35 FEBRUARY
26 MARCH
Desired output:
cnt Process_month
20 JANUARY
35 FEBRUARY
26 MARCH
0 APRIL
0 MAY
…….
….
….
0 DECEMBER
Please assist.
Use WWV_FLOW_MONTHS_MONTH to get all the months, then left join it to your query on the month name derived from the date column (not tested):
with cte as
(
SELECT month_display as month FROM WWV_FLOW_MONTHS_MONTH
) , cnt as
(
select count(a.emp_id) as cnt ,
to_char(a.due_date,'MONTH') as Process_Month from EMP_Request a
where a.due_date is not null
group by to_char(a.due_date,'MONTH')
) select cte.month, coalesce(cnt.cnt, 0) as cnt from cte left join cnt on upper(cte.month) = trim(cnt.Process_Month)
Right join months generator with your query:
select to_char(to_date(mth_num, 'MM'), 'MONTH') month, nvl(cnt, 0) cnt
from (
select count(emp_id) as cnt, to_char(due_date, 'mm') mth_num
from emp_request where due_date is not null
group by to_char(due_date, 'mm')) e
right join (
select to_char(level, 'fm00') mth_num
from dual connect by level <= 12) m using (mth_num)
order by mth_num
dbfiddle demo
Months generator is a simple hierarchical query which gives us 12 values 01, 02... 12:
select to_char(level, 'fm00') mth_num from dual connect by level <= 12
You can also use system views to get these numbers:
select to_char(rownum, 'fm00') mth_num from all_objects where rownum <= 12
or this syntax:
select to_char(column_value, 'fm00') mth_num
from table(sys.odcivarchar2list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12))
It's better to work on numbers which you can sort properly and convert to month names in the last step. This way you have natural months order.
If you want to be sure that month names are always in English, independent of local settings, then use to_char with a third parameter, like here:
select to_char(sysdate, 'month', 'nls_date_language=english') from dual
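The "generate all twelve months, then outer join the counts" idea boils down to a lookup with a default of zero. A minimal Python sketch, using the counts from the question's output and hard-coded English month names (both the function name and the dictionary layout are assumptions for illustration):

```python
MONTHS = (
    "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
    "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
)

def fill_missing_months(counts_by_month_number):
    """Return (month name, count) for all 12 months; missing months get 0."""
    # Working on month numbers keeps the natural order; names are added last.
    return [(MONTHS[n - 1], counts_by_month_number.get(n, 0)) for n in range(1, 13)]

for month, cnt in fill_missing_months({1: 20, 2: 35, 3: 26}):
    print(cnt, month)
```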
This is a general problem which is not really a sql problem. SQL doesn't really know about what months you are interested in. So the solution is to tell it in a sub query.
Here is a solution that doesn't use external tables. You simply select all months of the year and outer join your data.
select TO_CHAR(TO_DATE(available_months.m,'MM'),'MONTH') , NVL(sum(data.cnt),0) from
(select to_number(to_char(sysdate,'MM')) m, 7 cnt from dual) data,
(select 1 m from dual union select 2 from dual union select 3 from dual union select 4 from dual
union select 5 from dual union select 6 from dual union select 7 from dual
union select 8 from dual union select 9 from dual union select 10 from dual
union select 11 from dual union select 12 from dual) available_months
where
data.m (+) = available_months.m
group by available_months.m
order by available_months.m;
Or with your data query included it should look like this (not tested):
select TO_CHAR(TO_DATE(available_months.m,'MM'),'MONTH') , NVL(sum(data.cnt),0) from
(select count(a.emp_id) as cnt, to_number(to_char(a.due_date,'MM')) as m from EMP_Request a where a.due_date is not null group by to_number(to_char(a.due_date,'MM'))) data,
(select 1 m from dual union select 2 from dual union select 3 from dual union select 4 from dual
union select 5 from dual union select 6 from dual union select 7 from dual
union select 8 from dual union select 9 from dual union select 10 from dual
union select 11 from dual union select 12 from dual) available_months
where
data.m (+) = available_months.m
group by available_months.m
order by available_months.m;

Month counts between dates

I have the below table. I need to count how many ids were active in a given month. So thinking I'll need to create a row for each id that was active during that month so that id can be counted each month. A row should be generated for a term_dt during that month.
active_dt  term_dt    id
1/1/2018              101
1/1/2018   5/15/2018  102
3/1/2018   6/1/2018   103
1/1/2018   4/25/2018  104
Apparently this is a "count number of overlapping intervals" problem. The algorithm goes like this:
Create a sorted list of all start and end points
Calculate a running sum over this list, add one when you encounter a start and subtract one when you encounter an end
If two points are same then perform subtractions first
You will end up with list of all points where the sum changed
Here is a rough outline of the query. It is for SQL Server but could be ported to any RDBMS that supports window functions:
WITH cte1(date, val) AS (
SELECT active_dt, 1 FROM #t AS t
UNION ALL
SELECT COALESCE(term_dt, '2099-01-01'), -1 FROM #t AS t
-- if end date is null then assume the row is valid indefinitely
), cte2 AS (
SELECT date, SUM(val) OVER(ORDER BY date, val) AS rs
FROM cte1
)
SELECT YEAR(date) AS YY, MONTH(date) AS MM, MAX(rs) AS MaxActiveThisYearMonth
FROM cte2
GROUP BY YEAR(date), MONTH(date)
DB Fiddle
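The running-sum algorithm described above can also be sketched in plain Python. This assumes the question's four rows and, as in the query, treats a missing term_dt as 2099-01-01; the function name is hypothetical.

```python
from datetime import date

FAR_FUTURE = date(2099, 1, 1)  # stand-in for an open-ended term_dt, as in the query

ROWS = [  # (active_dt, term_dt) for ids 101 to 104
    (date(2018, 1, 1), None),
    (date(2018, 1, 1), date(2018, 5, 15)),
    (date(2018, 3, 1), date(2018, 6, 1)),
    (date(2018, 1, 1), date(2018, 4, 25)),
]

def max_active_per_month(rows):
    events = []
    for active_dt, term_dt in rows:
        events.append((active_dt, 1))                # a start adds one
        events.append((term_dt or FAR_FUTURE, -1))   # an end subtracts one
    events.sort()  # by date, then value, so -1 is applied before +1 on equal dates
    running = 0
    peaks = {}
    for day, val in events:
        running += val
        key = (day.year, day.month)
        peaks[key] = max(peaks.get(key, running), running)
    return peaks

for (year, month), peak in sorted(max_active_per_month(ROWS).items()):
    print(year, month, peak)
```

Like the query, this reports only months in which a start or an end falls; months without any event (February 2018 here) do not appear.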
I was toying with a simpler query, that seemed to do the trick, for Oracle:
with candidates (month_start) as (
select to_date ('2018-' || column_value || '-01','YYYY-MM-DD')
from
table
(sys.odcivarchar2list('01','02','03','04','05',
'06','07','08','09','10','11','12'))
), sample_data (active_dt, term_dt, id) as (
select to_date('01/01/2018', 'MM/DD/YYYY'), null, 101 from dual
union select to_date('01/01/2018', 'MM/DD/YYYY'),
to_date('05/15/2018', 'MM/DD/YYYY'), 102 from dual
union select to_date('03/01/2018', 'MM/DD/YYYY'),
to_date('06/01/2018', 'MM/DD/YYYY'), 103 from dual
union select to_date('01/01/2018', 'MM/DD/YYYY'),
to_date('04/25/2018', 'MM/DD/YYYY'), 104 from dual
)
select c.month_start, count(1)
from candidates c
join sample_data d
on c.month_start between d.active_dt and nvl(d.term_dt,current_date)
group by c.month_start
order by c.month_start
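The join condition (month start between active_dt and the term date, or today if there is none) can be checked with a short Python sketch. The today parameter is an assumption added here so the result does not depend on when the code runs; the function name is made up.

```python
from datetime import date

ROWS = [  # (active_dt, term_dt) for ids 101 to 104
    (date(2018, 1, 1), None),
    (date(2018, 1, 1), date(2018, 5, 15)),
    (date(2018, 3, 1), date(2018, 6, 1)),
    (date(2018, 1, 1), date(2018, 4, 25)),
]

def active_counts(rows, year, today):
    """Count rows whose active period covers the first day of each month of `year`."""
    counts = {}
    for month in range(1, 13):
        month_start = date(year, month, 1)
        n = sum(
            1 for active_dt, term_dt in rows
            if active_dt <= month_start <= (term_dt or today)
        )
        if n:  # an inner join returns no row for months without a match
            counts[month_start] = n
    return counts

for month_start, n in active_counts(ROWS, 2018, today=date(2018, 10, 15)).items():
    print(month_start, n)
```

Note that an id counts for a month only if it is active on the first day of that month.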
An alternative solution would be to use a hierarchical query, e.g.:
WITH your_table AS (SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, NULL term_dt, 101 ID FROM dual UNION ALL
SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, to_date('15/05/2018', 'dd/mm/yyyy') term_dt, 102 ID FROM dual UNION ALL
SELECT to_date('01/03/2018', 'dd/mm/yyyy') active_dt, to_date('01/06/2018', 'dd/mm/yyyy') term_dt, 103 ID FROM dual UNION ALL
SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, to_date('25/04/2018', 'dd/mm/yyyy') term_dt, 104 ID FROM dual)
SELECT active_month,
COUNT(*) num_active_ids
FROM (SELECT add_months(TRUNC(active_dt, 'mm'), -1 + LEVEL) active_month,
ID
FROM your_table
CONNECT BY PRIOR ID = ID
AND PRIOR sys_guid() IS NOT NULL
AND LEVEL <= FLOOR(months_between(coalesce(term_dt, SYSDATE), active_dt)) + 1)
GROUP BY active_month
ORDER BY active_month;
ACTIVE_MONTH NUM_ACTIVE_IDS
------------ --------------
01/01/2018 3
01/02/2018 3
01/03/2018 4
01/04/2018 4
01/05/2018 3
01/06/2018 2
01/07/2018 1
01/08/2018 1
01/09/2018 1
01/10/2018 1
Whether this is more or less performant than the other answers is up to you to test.