Oracle: Select rows when value in one column changes - sql

I have the following table:
PLACE USER_ID Date
---------- ---------- -----------------------------
ABC 4 14/04/20 12:05:29,255000000
ABC 4 14/04/20 15:42:28,389000000
ABC 4 14/04/20 18:33:20,202000000
ABC 4 14/04/20 22:51:28,339000000
XYZ 4 14/04/20 11:07:23,335000000
XYZ 2 14/04/20 12:15:12,123000000
ABC 4 13/04/20 22:09:33,255000000
QWE 4 13/04/20 10:18:29,144000000
XYZ 2 14/04/20 10:05:47,255000000
I need to get the rows where the place changes, ordered by date, for the user_id that I select.
So the desired result should be this (for user_id 4):
PLACE USER_ID DATE
---------- ---------- -----------------------------
ABC 4 14/04/20 12:05:29,255000000
XYZ 4 14/04/20 11:07:23,335000000
ABC 4 13/04/20 22:09:33,255000000
QWE 4 13/04/20 10:18:29,144000000
I tried with min date but in my example, I lose data if the user goes back to that place:
SELECT MIN(DATE), PLACE FROM user_places WHERE USER_ID=4 GROUP BY PLACE
Result I get (missing one row):
PLACE USER_ID DATE
---------- ---------- -----------------------------
XYZ 4 14/04/20 11:07:23,335000000
ABC 4 13/04/20 22:09:33,255000000
QWE 4 13/04/20 10:18:29,144000000
Thanks in advance!

In Oracle 12.1 and higher, gaps-and-islands problems like this one are an easy task for the match_recognize clause. For example:
Table setup
alter session set nls_timestamp_format = 'dd/mm/rr hh24:mi:ss,ff';
create table user_places (place, user_id, date_) as
select 'ABC', 4, to_timestamp('14/04/20 12:05:29,255000000') from dual union all
select 'ABC', 4, to_timestamp('14/04/20 15:42:28,389000000') from dual union all
select 'ABC', 4, to_timestamp('14/04/20 18:33:20,202000000') from dual union all
select 'ABC', 4, to_timestamp('14/04/20 22:51:28,339000000') from dual union all
select 'XYZ', 4, to_timestamp('14/04/20 11:07:23,335000000') from dual union all
select 'XYZ', 2, to_timestamp('14/04/20 12:15:12,123000000') from dual union all
select 'ABC', 4, to_timestamp('13/04/20 22:09:33,255000000') from dual union all
select 'QWE', 4, to_timestamp('13/04/20 10:18:29,144000000') from dual union all
select 'XYZ', 2, to_timestamp('14/04/20 10:05:47,255000000') from dual
;
commit;
Query and output
select place, user_id, date_
from (select * from user_places where user_id = 4)
match_recognize (
order by date_
all rows per match
pattern (a {- b* -} )
define b as place = a.place
)
order by date_ desc -- if needed
;
PLACE USER_ID DATE_
----- ------- ---------------------------
ABC 4 14/04/20 12:05:29,255000000
XYZ 4 14/04/20 11:07:23,335000000
ABC 4 13/04/20 22:09:33,255000000
QWE 4 13/04/20 10:18:29,144000000
A few things to note here:
DATE is a reserved keyword. Not a good column name. I used DATE_ instead; notice the trailing underscore.
I hardcoded the value 4. Of course, the better practice is to make that into a bind variable.
If you really only need to do this for one user_id at a time, it is most efficient to do what I did - filter the rows first, in a subquery. However, if you need to do this for all user id's in the same query, you don't need a subquery; you select from the table itself, and you need to add partition by user_id right at the top of the match_recognize clause, before order by date_.
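As a minimal sketch of that last note (not run against a live database; it assumes the same user_places table and DATE_ column created in the setup above), the all-users version might look like this:

```sql
-- Sketch only: same pattern as above, but partitioned so that
-- each user_id is matched independently of the others.
select place, user_id, date_
from   user_places
match_recognize (
  partition by user_id            -- one independent row sequence per user
  order by date_
  all rows per match
  pattern ( a {- b* -} )          -- keep a, exclude the repeats of the same place
  define b as place = a.place
)
order by user_id, date_ desc      -- if needed
;
```

The subquery is gone because there is no longer a single user to filter on; the partition by clause does the job of keeping the users apart.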

You can use lag() in a subquery to retrieve the "previous" place, and then filter on rows where the previous place is different from the current place:
select place, user_id, date_ -- date_ rather than DATE, which is a reserved word in Oracle
from (
select t.*, lag(place) over(partition by user_id order by date_) lag_place
from mytable t
) t
where lag_place is null or place <> lag_place
This gives you the expected output for all users. If you want only for user 4, then you can filter in the subquery (and there is no need to partition by user):
select place, user_id, date_ -- date_ rather than DATE, which is a reserved word in Oracle
from (
select t.*, lag(place) over(order by date_) lag_place
from mytable t
where user_id = 4
) t
where lag_place is null or place <> lag_place

Related

case statement after where clause to omit the row of data if satisfied

Hi I have a table as below and I'm trying to extract the data from them if and only if the below condition is satisfied.
ID Rank
45689 1
54789 2
98765 1
96541 2
98523 3
92147 4
96741 2
99999 10
If the ID starts with 4 and 9 or 5 and 9 and have same Rank then omit them. If ID starts with 9 and no matching Rank with other ID (starting with 4 or 5) then show them as result.
So My Output should look like
ID Rank
98523 3
92147 4
99999 10
How can I use case statement in where clause to filter the data?
If I understand correctly, you want to select only those ID that begin with a 9, and have a rank that is not also the rank of (another) ID that begins with 4 or 5. Is that correct?
The query below is for the case ID is of string data type (although it will work OK, probably, if ID is numeric data type - through implicit conversion).
select *
from your_table
where id like '9%'
and rank not in (
select rank
from your_table
where substr(id, 1, 1) in ('4', '5')
)
;
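One caveat with NOT IN: if the subquery returns even one NULL rank, the NOT IN condition is never true and the query returns no rows. A hedged sketch of a NOT EXISTS variant that avoids this, assuming the same your_table with ID and RANK columns (the aliases t and x are mine):

```sql
-- Sketch only: NOT EXISTS version of the query above.
-- Rows whose rank is NULL in the 4/5 group no longer empty the result.
select t.*
from   your_table t
where  t.id like '9%'
and    not exists (
         select null
         from   your_table x
         where  x.rank = t.rank
         and    substr(x.id, 1, 1) in ('4', '5')
       )
;
```

If RANK is declared NOT NULL, the two forms should return the same rows, so this only matters when nulls are possible.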
One option would be using the SUM() analytic function along with a conditional expression, such as
WITH t2 AS
(
SELECT SUM(CASE WHEN SUBSTR(id,1,1) IN ('5','9') OR
SUBSTR(id,1,1) IN ('4','9') THEN 1 END ) OVER
(PARTITION BY Rank) AS count, t.*
FROM t -- your original table
)
SELECT id, rank
FROM t2
WHERE count = 1
You can use an analytic function to only query the table once:
SELECT id,
rank
FROM (
SELECT t.*,
COUNT( CASE WHEN id LIKE '4%' OR id LIKE '5%' THEN 1 END )
OVER ( PARTITION BY Rank )
AS num_match
FROM table_name t
WHERE id LIKE '4%'
OR id LIKE '5%'
OR id LIKE '9%'
)
WHERE id LIKE '9%'
AND num_match = 0;
Which, for the sample data:
CREATE TABLE table_name ( ID, Rank ) AS
SELECT 45689, 1 FROM DUAL UNION ALL
SELECT 54789, 2 FROM DUAL UNION ALL
SELECT 98765, 1 FROM DUAL UNION ALL
SELECT 96541, 2 FROM DUAL UNION ALL
SELECT 98523, 3 FROM DUAL UNION ALL
SELECT 92147, 4 FROM DUAL UNION ALL
SELECT 96741, 2 FROM DUAL UNION ALL
SELECT 99999, 10 FROM DUAL;
Outputs:
ID | RANK
----: | ---:
98523 | 3
92147 | 4
99999 | 10

Concat multiple rows based on If/Then to populate a 5 digit code

Hoping someone can help.
I have data as follows in two separate columns in a table called StudentRace
Student_ID RaceCD
---------- ------
123456 1
123456 2
589645 4
987654 3
987654 4
I am looking for a way to combine the data for each student, by student ID, into a 00000 format. For example: Student_ID 123456 Race: 12000; Student_ID 589645 Race: 00040; Student_ID 987654 Race: 00340. It needs to be a subquery, as it is part of a large report that pulls 50+ fields. If anyone is able to help I would greatly appreciate it. I am using Toad Data Point for Oracle to create my query.
The result can be achieved with a single group by without any joins.
Setup
CREATE TABLE studentrace
(
Student_ID,
RaceCD
)
AS
SELECT 123456, 1 FROM DUAL
UNION ALL
SELECT 123456, 2 FROM DUAL
UNION ALL
SELECT 589645, 4 FROM DUAL
UNION ALL
SELECT 987654, 3 FROM DUAL
UNION ALL
SELECT 987654, 4 FROM DUAL;
Query
SELECT student_id, LPAD (SUM (racecd * POWER (10, 5 - racecd)), 5, '0') AS race
FROM studentrace
GROUP BY student_id;
Result
STUDENT_ID RACE
_____________ ________
589645 00040
987654 00340
123456 12000
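To see why the sum works, it may help to look at what each row contributes before aggregation: race code n is placed in the n-th digit from the left, i.e. multiplied by 10^(5 - n). The sketch below assumes the same studentrace table as in the setup, and that each (student_id, racecd) pair appears only once; a duplicated pair would be counted twice by the SUM.

```sql
-- Sketch only: per-row contribution to the 5-digit code.
select student_id,
       racecd,
       racecd * power(10, 5 - racecd) as contribution
from   studentrace
order  by student_id, racecd;

-- For the sample data:
--   123456, 1 -> 10000
--   123456, 2 ->  2000   (10000 + 2000 = 12000)
--   589645, 4 ->    40
--   987654, 3 ->   300
--   987654, 4 ->    40   (300 + 40 = 340, left-padded to 00340)
```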
You can use a partitioned outer join:
SELECT t.Student_id,
LISTAGG( COALESCE( t.raceCD, 0 ) ) WITHIN GROUP ( ORDER BY r.race )
AS RaceCDs
FROM ( SELECT LEVEL AS race
FROM DUAL
CONNECT BY LEVEL <= 5 ) r
LEFT OUTER JOIN table_name t
PARTITION BY ( t.Student_ID )
ON ( r.race = t.RaceCD )
GROUP BY t.student_id
Which, for your test data:
CREATE TABLE table_name ( Student_ID, RaceCD ) AS
SELECT 123456, 1 FROM DUAL UNION ALL
SELECT 123456, 2 FROM DUAL UNION ALL
SELECT 589645, 4 FROM DUAL UNION ALL
SELECT 987654, 3 FROM DUAL UNION ALL
SELECT 987654, 4 FROM DUAL
Outputs:
STUDENT_ID | RACECDS
---------: | :------
123456 | 12000
589645 | 00040
987654 | 00340

how to write sql query to select rows with max value in one column

My Table looks like this.
Id | Name | Ref | Date | From
10 | Ant | 100 | 2017-02-02 | David
10 | Ant | 300 | 2016-01-01 | David
2 | Cat | 90 | 2017-09-09 | David
2 | Cat | 500 | 2016-02-03 | David
3 | Bird | 150 | 2017-06-28 | David
This is the result I want.
Id | Name | Ref | Date | From
3 | Bird | 150 | 2017-06-28 | David
2 | Cat | 500 | 2016-02-03 | David
10 | Ant | 300 | 2016-01-01 | David
My target is the highest Ref per Id, ordered by Order Date desc.
Could you please tell me about how to write a sql query using pl/sql.
This kind of requirement (where you need the max or min by one column, grouped by another, but you need all the data from the max or min row) is pretty much what analytic functions are for. I used row_number - if ties are possible, you need to clarify the assignment (see my Comment under your question), and depending on the details, another analytic function may be more appropriate - perhaps rank().
with
my_table ( id, name, ref, dt, frm ) as (
select 10, 'Ant' , 100, date '2017-02-02', 'David' from dual union all
select 10, 'Ant' , 300, date '2016-01-01', 'David' from dual union all
select 2, 'Cat' , 90, date '2017-09-09', 'David' from dual union all
select 2, 'Cat' , 500, date '2016-02-03', 'David' from dual union all
select 3, 'Bird', 150, date '2017-06-28', 'David' from dual
)
-- End of simulated table (for testing purposes only, not part of the solution).
-- SQL query begins BELOW THIS LINE.
select id, name, ref, dt, frm
from (
select id, name, ref, dt, frm,
row_number() over (partition by id order by ref desc, dt desc) as rn
from my_table
)
where rn = 1
order by dt desc
;
ID NAME REF DT FRM
-- ---- --- ---------- -----
3 Bird 150 2017-06-28 David
2 Cat 500 2016-02-03 David
10 Ant 300 2016-01-01 David
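If ties on ref should all be returned rather than broken by date, a rank() variant is one possibility. This is only a sketch under that assumption; it reuses the my_table name and columns from the simulated table above, and rnk is an alias I made up:

```sql
-- Sketch only: keep every row that shares the highest ref within its id.
select id, name, ref, dt, frm
from (
       select id, name, ref, dt, frm,
              rank() over (partition by id order by ref desc) as rnk
       from   my_table
     )
where  rnk = 1
order  by dt desc
;
```

With row_number() exactly one row per id survives; with rank() two rows with the same top ref both get rnk = 1 and both are returned.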
You can use this
SELECT
Id
,Name
,Ref
,[Date]
FROM(
SELECT
*
, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Ref DESC) AS Row#
FROM yourtable
) A WHERE Row# = 1
ORDER BY A.[Date] DESC
Another solution with a self join (Idea came from here: How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL? ):
with
my_table ( id, name, ref, dt, frm ) as (
select 10, 'Ant' , 100, date '2017-02-02', 'David' from dual union all
select 10, 'Ant' , 300, date '2016-01-01', 'David' from dual union all
select 10, 'Ant' , 300, date '2015-01-01', 'David' from dual union all
select 2, 'Cat' , 90, date '2017-09-09', 'David' from dual union all
select 2, 'Cat' , 500, date '2016-02-03', 'David' from dual union all
select 3, 'Bird', 150, date '2017-06-28', 'David' from dual
)
-- End of simulated table (for testing purposes only, not part of the solution).
-- SQL query begins BELOW THIS LINE.
select m1.*
from my_table m1
left join my_table m2
on m1.id = m2.id and (
-- this is basically a comparator: order by ref desc, dt desc
m1.ref < m2.ref or (
m1.ref = m2.ref and
m1.dt < m2.dt
)
) where m2.id is null order by m1.dt desc
;
ID NAME REF DT FRM
---------- ---- ---------- --------- -----
3 Bird 150 28-JUN-17 David
2 Cat 500 03-FEB-16 David
10 Ant 300 01-JAN-16 David
Use the "better than" SQL principle:
select a.Id, a.Name, a.Ref, a.Dt, a.frm
from table_name a
left join table_name b on a.id = b.id and b.ref > a.ref -- b.ref > a.ref would make b "better" than a
where b.id is null -- Now check and make sure there is nothing "better"
; -- rows tied on the highest ref are all returned
SELECT Id, Name, Max(Ref) as Ref, Min(`Date`) as `Date`
From Forge
Group By Id, Name
Order by Min(`Date`) desc;

Oracle SQL (Toad): Expand table

Suppose I have an SQL (Oracle Toad) table named "test", which has the following fields and entries (dates are in dd/mm/yyyy format):
id ref_date value
---------------------
1 01/01/2014 20
1 01/02/2014 25
1 01/06/2014 3
1 01/09/2014 6
2 01/04/2015 7
2 01/08/2015 43
2 01/09/2015 85
2 01/12/2015 4
I know from how the table has been created that, since there are value entries for id = 1 for February 2014 and June 2014, the values for March through May 2014 must be 0. The same applies to July and August 2014 for id = 1, and for May through July 2015 and October through November 2015 for id = 2.
Now, if I want to calculate, say, the median of the value column for a given id, I will not arrive at the correct result using the table as it stands - as I'm missing 5 zero entries for each id.
I would therefore like to create/use the following (potentially just temporary table)...
id ref_date value
---------------------
1 01/01/2014 20
1 01/02/2014 25
1 01/03/2014 0
1 01/04/2014 0
1 01/05/2014 0
1 01/06/2014 3
1 01/07/2014 0
1 01/08/2014 0
1 01/09/2014 6
2 01/04/2015 7
2 01/05/2015 0
2 01/06/2015 0
2 01/07/2015 0
2 01/08/2015 43
2 01/09/2015 85
2 01/10/2015 0
2 01/11/2015 0
2 01/12/2015 4
...on which I could then compute the median by id:
select id, median(value) as med_value from test group by id
How do I do this? Or would there be an alternative way?
Many thanks,
Mr Clueless
In this solution, I build a table with all the "needed dates" and value of 0 for all of them. Then, instead of a join, I do a union all, group by id and ref_date and ADD the values in each group. If the date had a row with a value in the original table, then that's the resulting value; and if it didn't, the value will be 0. This avoids a join. In almost all cases a union all + aggregate will be faster (sometimes much faster) than a join.
I added more input data for more thorough testing. In your original question, you have two id's, and for both of them you have four positive values. You are missing five values in each case, so there will be five zeros (0) which means the median is 0 in both cases. For id=3 (which I added) I have three positive values and three zeros; the median is half of the smallest positive number. For id=4 I have just one value, which then should be the median as well.
The solution includes, in particular, an answer to your specific question - how to create the temporary table (which most likely doesn't need to be a temporary table at all, but an inline view). With factored subqueries (in the WITH clause), the optimizer decides whether to treat them as temporary tables or inline views; you can see what the optimizer decided if you look at the Explain Plan.
with
inputs ( id, ref_date, value ) as (
select 1, to_date('01/01/2014', 'dd/mm/yyyy'), 20 from dual union all
select 1, to_date('01/02/2014', 'dd/mm/yyyy'), 25 from dual union all
select 1, to_date('01/06/2014', 'dd/mm/yyyy'), 3 from dual union all
select 1, to_date('01/09/2014', 'dd/mm/yyyy'), 6 from dual union all
select 2, to_date('01/04/2015', 'dd/mm/yyyy'), 7 from dual union all
select 2, to_date('01/08/2015', 'dd/mm/yyyy'), 43 from dual union all
select 2, to_date('01/09/2015', 'dd/mm/yyyy'), 85 from dual union all
select 2, to_date('01/12/2015', 'dd/mm/yyyy'), 4 from dual union all
select 3, to_date('01/01/2016', 'dd/mm/yyyy'), 12 from dual union all
select 3, to_date('01/03/2016', 'dd/mm/yyyy'), 23 from dual union all
select 3, to_date('01/06/2016', 'dd/mm/yyyy'), 2 from dual union all
select 4, to_date('01/11/2014', 'dd/mm/yyyy'), 9 from dual
),
-- the "inputs" table constructed above is for testing only,
-- it is not part of the solution.
ranges ( id, min_date, max_date ) as (
select id, min(ref_date), max(ref_date)
from inputs
group by id
),
prep ( id, ref_date, value ) as (
select id, add_months(min_date, level - 1), 0
from ranges
connect by level <= 1 + months_between( max_date, min_date )
and prior id = id
and prior sys_guid() is not null
),
v ( id, ref_date, value ) as (
select id, ref_date, sum(value)
from ( select id, ref_date, value from prep union all
select id, ref_date, value from inputs
)
group by id, ref_date
)
select id, median(value) as median_value
from v
group by id
order by id -- ORDER BY is optional
;
ID MEDIAN_VALUE
-- ------------
1 0
2 0
3 1
4 9
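The month-generation step in prep can be tried in isolation. A minimal sketch for a single, hard-coded range (the two date literals are just the first and last dates of id = 1 from the question):

```sql
-- Sketch only: one row per month from January to September 2014.
-- months_between returns 8 here, so level runs from 1 to 9.
select add_months(date '2014-01-01', level - 1) as ref_date
from   dual
connect by level <= 1 + months_between(date '2014-09-01', date '2014-01-01')
;
```

The "1 +" is what makes the last month part of the output; the extra prior id = id and prior sys_guid() is not null conditions in prep are only needed because that query generates rows for several ids at once.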
A second approach, assuming ref_date is of DATE data type:
with int1 as (select id
, max(ref_date) as max_date
, min(ref_date) as min_date from test group by id )
, s(n) as (select level -1 from dual connect by level <= 1 + (select max(months_between(max_date, min_date)) from int1 ) ) -- 1 + so that the month of max_date is generated too
select i.id
, add_months(i.min_date,s.n) as ref_date
, nvl(value,0) as value
from int1 i
join s on add_months(i.min_date,s.n) <= i.max_date
LEFT join test t on t.id = i.id and add_months(i.min_date,s.n) = t.ref_date
And with median
with int1 as (select id
, max(ref_date) as max_date
, min(ref_date) as min_date from test group by id )
, s(n) as (select level -1 from dual connect by level <= 1 + (select max(months_between(max_date, min_date)) from int1 ) ) -- 1 + so that the month of max_date is generated too
select i.id
, MEDIAN(nvl(value,0)) as value
from int1 i
join s on add_months(i.min_date,s.n) <= i.max_date
LEFT join test t on t.id = i.id and add_months(i.min_date,s.n) = t.ref_date
group by i.id

SQL find nearest number

Say I have a table like the following (I'm on Oracle 10g btw)
NAME VALUE
------ ------
BOB 1
BOB 2
BOB 4
SUZY 1
SUZY 2
SUZY 3
How can I select all rows where value is closest to, but not greater than, a given number. For example if I want to find all the rows where value is closest to 3 I would get:
NAME VALUE
------ ------
BOB 2
SUZY 3
This seems like it should be simple... but I'm having no luck.
Thanks!
SELECT name, max(value)
FROM tbl
WHERE value <= 3
GROUP BY name
This works (SQLFiddle demo):
SELECT name, max(value)
FROM mytable
WHERE value <= 3
GROUP BY name
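Based on hagensoft's answer:
SELECT *
FROM (
SELECT name, max(value) AS value
FROM tbl
WHERE value <= 3
GROUP BY name
)
WHERE ROWNUM <= 2
ROWNUM is assigned before GROUP BY is applied, so to limit the number of output rows (here, to 2), the ROWNUM filter has to go in an outer query around the grouped result.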
WITH v AS (
SELECT 'BOB' NAME, 1 value FROM dual
UNION ALL
SELECT 'BOB', 2 FROM dual
UNION ALL
SELECT 'BOB', 4 FROM dual
UNION ALL
SELECT 'SUZY', 1 FROM dual
UNION ALL
SELECT 'SUZY', 2 FROM dual
UNION ALL
SELECT 'SUZY', 3 FROM dual
)
SELECT *
FROM v
WHERE (name, value) IN (SELECT name, MAX(value)
FROM v
WHERE value <= :num
GROUP BY name)
;