Extracting most recent date from VARCHAR2 - sql

Data:
USER_ID VIOLATION_DATES
--------------------------------
1 18-Jul-21 > 24-Jul-21
2 05-Aug-21
3 09-Jun-21
1 18-Jul-21
I have a table that has columns for Users and their dates of violations. I want to extract the most recent violation for each user.
This is the query I've written:
select
USER_ID,
max(to_date(VIOLATION_DATES, 'DD-MON-YY')) as Most_Recent_VIOLATIONS
from
table
group by
USER_ID
However I get this error:
ORA-01830: date format picture ends before converting entire input string
I believe it has something to do with the way the most recent violation is appended (18-Jul-21 > 24-Jul-21). Can anyone provide any clarity on how I can extract the most recent date for each user? For example:
USER_ID VIOLATION_DATES
--------------------------
1 24-Jul-21
2 05-Aug-21
3 09-Jun-21
I understand that this storage method isn't ideal but this is out of my control.

You can split the multi-value string into separate rows, then "group by" as normal (the with clause is just to provide test data - substitute your actual table):
with demo (user_id, violation_dates) as
( select 1, '18-Jul-21 > 24-Jul-21' from dual union all
select 2, '05-Aug-21' from dual union all
select 3, '09-Jun-21' from dual union all
select 1, '18-Jul-21' from dual )
select user_id
, max(to_date(regexp_substr(d.violation_dates, '[^> ]+', 1, r.rn) default null on conversion error,'DD-MON-YY')) as violation_date
from demo d
cross apply
( select rownum as rn
from dual connect by rownum <= regexp_count(d.violation_dates,'>') +1 ) r
group by user_id
order by 1;
"default null on conversion error" requires Oracle 12.2 or later.
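To make the split-then-aggregate idea easier to check outside the database, here is a minimal Python sketch of the same logic. The function name latest_violation and the in-memory rows list are illustrative assumptions, not part of the answer above. Pieces that fail to parse are skipped, loosely mirroring "default null on conversion error", and the %b month parsing assumes an English locale.

```python
from datetime import date, datetime

def latest_violation(rows):
    """Return {user_id: most recent date} from (user_id, 'DD-Mon-YY > DD-Mon-YY') rows."""
    latest = {}
    for user_id, raw in rows:
        # One stored value may hold several dates separated by '>'.
        for piece in raw.split(">"):
            try:
                parsed = datetime.strptime(piece.strip(), "%d-%b-%y").date()
            except ValueError:
                # Skip anything that is not a valid date instead of failing.
                continue
            if user_id not in latest or parsed > latest[user_id]:
                latest[user_id] = parsed
    return latest

# Sample data copied from the question.
rows = [
    (1, "18-Jul-21 > 24-Jul-21"),
    (2, "05-Aug-21"),
    (3, "09-Jun-21"),
    (1, "18-Jul-21"),
]
result = latest_violation(rows)
for user_id in sorted(result):
    print(user_id, result[user_id].strftime("%d-%b-%y"))
```

The comparison is done on parsed dates, not on the strings, which is the same reason the SQL converts with to_date before taking max.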

Related

SQL Query to select a specific part of the values in a column

I have a table in a database, and one of its columns (Id in the table below) has the format AAA-BBBBBBB-CCCCCCC, where A, B and C are all digits (0-9). I want to write a SELECT query that returns only the BBBBBBB-CCCCCCC part of this column. I am new to SQL, so I am not sure how to do this. I tried using SPLIT_PART on - but I am not sure how to join the second and third parts.
Table -
Id                   Name        Age
-------------------  ----------  ---
123-4567890-1234567  First Name  199
456-7890123-4567890  Hulkamania  200
So when the query is written the output should be like
Output
4567890-1234567
7890123-4567890
As mentioned in the request comments, you should not store a combined number, when you are interested in its parts. Store the parts in separate columns instead.
However, as the format is fixed 'AAA-BBBBBBB-CCCCCCC', it is very easy to get the substring you are interested in. Just take the string from the fifth position on:
select substr(col, 5) from mytable;
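You can also take the right-hand part of the column, dropping the first four characters so that the result starts at the fifth (RIGHT and LEN are SQL Server functions):
SELECT RIGHT(Id, LEN(Id)-4) AS TrimmedId FROM mytable;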
Another option using regexp_substr
with x ( c1,c2,c3 ) as
(
select '123-4567890-1234567', 'First Name' , 199 from dual union all
select '456-7890123-4567890', 'Hulkamania' , 200 from dual
)
select regexp_substr(c1,'[^-]+',1,2)||'-'||regexp_substr(c1,'[^-]+',1,3) as result from x ;
Demo
SQL> with x ( c1,c2,c3 ) as
(
select '123-4567890-1234567', 'First Name' , 199 from dual union all
select '456-7890123-4567890', 'Hulkamania' , 200 from dual
)
select regexp_substr(c1,'[^-]+',1,2)||'-'||regexp_substr(c1,'[^-]+',1,3) as result from x ;
RESULT
--------------------------------------------------------------------------------
4567890-1234567
7890123-4567890
SQL>
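The answers above fall into two families: cut at a fixed position, or split on the delimiter. A small Python sketch of both, assuming the same two sample Ids (the function names are hypothetical and only for illustration):

```python
def drop_prefix_fixed(value):
    # Fixed-width approach: skip the four characters 'AAA-',
    # which corresponds to substr(col, 5) in SQL (1-based position 5).
    return value[4:]

def drop_prefix_split(value):
    # Delimiter approach: keep everything after the first '-'.
    # This still works if the first group is not exactly three digits.
    return value.split("-", 1)[1]

ids = ["123-4567890-1234567", "456-7890123-4567890"]
fixed = [drop_prefix_fixed(v) for v in ids]
split = [drop_prefix_split(v) for v in ids]
print(fixed)
print(split)
```

The fixed-position version is shorter but relies on the format never changing; the delimiter version tolerates a first group of any length.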

Oracle - Split the parameter by comma and check if the parameter exist in Column

I am new to Oracle and not sure if there are any inbuilt functions to do this task.
I have a column that contains Product_ID's separated by comma.
Product_ID
123,234,546,789,487
I am passing a list of Product_ID's separated by a comma as varchar2.
so, I am passing "234,789" as varchar2.
I want to find if 234 and 789 exist in that column and if exists get that row.
How can I do this?
If you want to check that all the values in your input list are included in the column then you can use:
SELECT *
FROM table_name t
WHERE EXISTS (
WITH input ( value ) AS (
SELECT '123,789' FROM DUAL -- Your input value
)
SELECT 1
FROM input
WHERE ','||t.product_id||',' LIKE '%,' || REGEXP_SUBSTR( value, '[^,]+', 1, LEVEL ) || ',%'
CONNECT BY LEVEL <= REGEXP_COUNT( value, '[^,]+' )
HAVING COUNT(*) = REGEXP_COUNT( value, '[^,]+' )
)
Which, for the sample data:
CREATE TABLE table_name ( Product_ID ) AS
SELECT '123,234,546,789,487' FROM DUAL
Outputs:
| PRODUCT_ID |
| :------------------ |
| 123,234,546,789,487 |
If you want to check that at least one value in your input list is in the column then you can use the same query without the line containing the HAVING clause.
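The difference between the "all values" and "at least one value" checks can be sketched in Python with sets. This is only an illustration of the logic, not a replacement for the query; the helper names split_ids, contains_all and contains_any are assumptions made up for this example.

```python
def split_ids(csv_text):
    # Turn '123, 234' into {'123', '234'}; whole tokens are compared,
    # so '23' never matches '234' (the SQL wraps values in commas for the same reason).
    return {part.strip() for part in csv_text.split(",") if part.strip()}

def contains_all(column_value, wanted):
    # Every requested ID must be present (the query with the HAVING clause).
    return split_ids(wanted) <= split_ids(column_value)

def contains_any(column_value, wanted):
    # At least one requested ID is present (the query without the HAVING clause).
    return bool(split_ids(wanted) & split_ids(column_value))

column_value = "123,234,546,789,487"
print(contains_all(column_value, "234,789"))
print(contains_all(column_value, "234,999"))
print(contains_any(column_value, "234,999"))
```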
Here's one way - making the comma-separated number lists into JSON arrays so that we can split them using json_table, then re-aggregating as nested tables so that we can compare with the submultiset operator:
create type table_of_pid as table of number;
/
with
sample_data (product_id) as (
select '123,234,546,789,487' from dual union all
select '333,444,555,666,888' from dual
)
, user_input (product_list) as (
select '234,789' from dual
)
select *
from sample_data
where ( select cast(collect(pid) as table_of_pid) as input_pid
from user_input cross apply
json_table('[' || product_list || ']', '$[*]'
columns pid number path '$')
)
submultiset
( select cast(collect(pid) as table_of_pid) as input_pid
from json_table('[' || product_id || ']', '$[*]'
columns pid number path '$')
)
;
PRODUCT_ID
-------------------
123,234,546,789,487
Your inputs violate First Normal Form, the most basic sanity requirement in a relational database. If the data were in normal form, you wouldn't need any of the JSON trickery. Still, the aggregation into a collection and the submultiset comparison would be the correct approach even if the data were already in normal form.
It is a bad idea to store comma-separated values into a single column.
One option to do what you asked for is in the following example; read comments within code. Note that - for large tables - performance WILL suffer.
SQL> set ver off
SQL>
SQL> with
test (product_id) as
-- your sample table
(select '123,234,546,789,487' from dual union all
select '111,222,333' from dual
),
test_split as
-- you have to split it into rows
(select product_id,
regexp_substr(product_id, '[^,]+', 1, column_value) val
from test
cross join table(cast(multiset(select level from dual
connect by level <= regexp_count(product_id, ',') + 1
) as sys.odcinumberlist))
),
parameter_split as
-- split input parameter into rows as well
(select regexp_substr('&&par_id', '[^,]+', 1, level) val
from dual
connect by level <= regexp_count('&&par_id', ',') + 1
)
-- join split values, return the result
select distinct t.product_id
from test_split t join parameter_split p on p.val = t.val;
Enter value for par_id: 123,546
PRODUCT_ID
-------------------
123,234,546,789,487
SQL> undefine par_id
SQL> /
Enter value for par_id: 333
PRODUCT_ID
-------------------
111,222,333
SQL>

Oracle SQL - Select Max Integer Value from UserID String

I want to select the MAX value of the integer associated with the UserID in my Oracle database table to generate the next username for users with similar UserID.
The UserID contains values such as below. There is no fixed pattern of characters before the integer as the string is a username.
TKe10
TKe9
TKe12
TomKelly13
TomKelly9
PJames12
PJames7
I tried using the query below but it always gives TKe9 OR TomKelly9 OR PJames7 as the MAX value.
SELECT * FROM
(SELECT MAX(UserID) from PV_USERS
WHERE REGEXP_LIKE (UserID, '^'|| '<some_user_id>'|| '[^A-Za-z][0-9]*'));
I have also tried using ORDER BY DESC WHERE ROWNUM<=1 but it also gives the same output.
You need to extract just the numeric part of the ID, which you can do with
regexp_substr(userid, '[0-9]*$')
and then convert that to a number before finding the maximum (otherwise you'll still be doing string comparison, and sorting 9 before 10):
max(to_number(regexp_substr(userid, '[0-9]*$')))
and you probably want to allow for the ID root you're checking to not exist at all yet, which you can do with nvl() or coalesce():
select coalesce(max(to_number(regexp_substr(userid, '[0-9]*$'))), 0) as max_num
from pv_users
where regexp_like(userid, '^'|| 'TomKelly'|| '[0-9]*');
MAX_NUM
----------
13
select coalesce(max(to_number(regexp_substr(userid, '[0-9]*$'))), 0) as max_num
from pv_users
where regexp_like(userid, '^'|| 'PJames'|| '[0-9]*');
MAX_NUM
----------
12
select coalesce(max(to_number(regexp_substr(userid, '[0-9]*$'))), 0) as max_num
from pv_users
where regexp_like(userid, '^'|| 'NewName'|| '[0-9]*');
MAX_NUM
----------
0
... and then add 1 and append back onto the root to get the next ID.
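Depending on your business rules, you might want to make the filter case-insensitive. Note too that the filter pattern has no end anchor, so a root such as TKe would also match an ID like TKelly5; if one root can be a prefix of another, end the pattern with $ (that is, '[0-9]*$').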
You should be aware that two sessions performing this operation simultaneously will see the same result, so both would try to create the same ID, e.g. TomKelly14. You either need to serialise this generation operation, or include a fall back - like checking if you get a PK violation when you try to insert the new value into the table and repeating if that happens.
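The extract, convert, take the maximum, add one sequence can be sketched in Python as follows. The function next_user_id and the existing list are hypothetical names for illustration; the sketch anchors the match to the whole ID and does nothing about the concurrency issue just described, which still has to be handled in the database.

```python
import re

def next_user_id(root, existing_ids):
    """Build the next ID for `root` from the largest numeric suffix already in use."""
    # Match the root followed only by digits, so 'TKe' does not match 'TKelly5'.
    pattern = re.compile(re.escape(root) + r"(\d*)")
    numbers = [0]  # fallback when the root does not exist yet (like coalesce(..., 0))
    for user_id in existing_ids:
        match = pattern.fullmatch(user_id)
        if match and match.group(1):
            # Compare as integers, otherwise '9' would sort after '10'.
            numbers.append(int(match.group(1)))
    return f"{root}{max(numbers) + 1}"

existing = ["TKe10", "TKe9", "TKe12", "TomKelly13", "TomKelly9", "PJames12", "PJames7"]
for root in ("TomKelly", "PJames", "NewName"):
    print(next_user_id(root, existing))
```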
with temp as
(
select 'Tke10' userid from dual
union all
select 'Tke9' userid from dual
union all
select 'Tke12' userid from dual
union all
select 'Tomkelly13' userid from dual
union all
select 'Tomkelly9' userid from dual
union all
select 'Pjames12' userid from dual
union all
select 'Pjames7' userid from dual
)
select A||B from (
select
substr(userid,1,instr(userid,to_number(regexp_substr(userid,'\d+$')))-1) A
,max(to_number(regexp_substr(userid,'\d+$'))) B
from temp
group by substr(userid,1,instr(userid,to_number(regexp_substr(userid,'\d+$')))-1)
)
;

How to loop a single output from a query

Assuming the query below returns a single value:
SELECT date FROM dual WHERE name = 'max'
21-05-2011
How can I loop over that single result so that it returns:
21-05-2011
21-06-2011
21-07-2011
21-08-2011
Expanding on comments, you can use the same mechanism shown in your previous question, just replacing the fixed date value with the column name and dual with your real table.
With your pseudocode:
select add_months(date, level-1)
from dual
where name = 'max'
connect by level <= 4
... but as dual is a real table that doesn't have those columns and date is a reserved word, it would be more like:
select add_months(your_date_column, level-1)
from your_table
where name = 'max'
connect by level <= 4
As a demo, I'll create a table with a single row that matches your filter and has your date:
create table your_table (name varchar2(10), your_date_column date);
insert into your_table (name, your_date_column)
values ('max', date '2011-05-21');
select add_months(your_date_column, level-1)
from your_table
where name = 'max'
connect by level <= 4;
ADD_MONTHS(YOUR_DATE_COLUMN,LEVEL-1)
------------------------------------
2011-05-21
2011-06-21
2011-07-21
2011-08-21
You just need to substitute your actual table and column names.
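For readers who want to see what the row generator produces without a database, here is a rough Python sketch that plays the role of level (1 to 4) and add_months. The add_months helper below is a simplified stand-in written for this example: it clamps the day to the length of the target month and does not reproduce Oracle's special handling of dates that fall on the last day of a month.

```python
import calendar
from datetime import date

def add_months(start, months):
    """Shift a date by whole months, clamping the day to the target month's length."""
    index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(start.day, last_day))

# 'level' runs from 1 to 4, as in: connect by level <= 4
series = [add_months(date(2011, 5, 21), level - 1) for level in range(1, 5)]
for d in series:
    print(d.strftime("%d-%m-%Y"))
```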

Returning the row with the most recent timestamp from each group

I have a table (Postgres 9.3) defined as follows:
CREATE TABLE tsrs (
id SERIAL PRIMARY KEY,
customer_id INTEGER NOT NULL REFERENCES customers,
timestamp TIMESTAMP WITHOUT TIME ZONE,
licensekeys_checksum VARCHAR(32));
The pertinent details here are the customer_id, the timestamp, and the licensekeys_checksum. There can be multiple entries with the same customer_id; some of those may have matching licensekeys_checksum entries, and some may be different. There will never be rows with equal checksum and equal timestamps.
I want to return a table containing 1 row for each group of rows with matching licensekeys_checksum entries. The row returned for each group should be the one with the newest / most recent timestamp.
Sample Input:
1, 2, 2014-08-21 16:03:35, 3FF2561A
2, 2, 2014-08-22 10:00:41, 3FF2561A
2, 2, 2014-06-10 10:00:41, 081AB3CA
3, 5, 2014-02-01 12:03:23, 299AFF90
4, 5, 2013-12-13 08:14:26, 299AFF90
5, 6, 2013-09-09 18:21:53, 49FFA891
Desired Output:
2, 2, 2014-08-22 10:00:41, 3FF2561A
2, 2, 2014-06-10 10:00:41, 081AB3CA
3, 5, 2014-02-01 12:03:23, 299AFF90
5, 6, 2013-09-09 18:21:53, 49FFA891
I have managed to piece together a query based on the comments below, and hours of searching on the internet. :)
select * from tsrs
inner join (
select licensekeys_checksum, max(timestamp) as mts
from tsrs
group by licensekeys_checksum
) x on x.licensekeys_checksum = tsrs.licensekeys_checksum
and x.mts = tsrs.timestamp;
It seems to work, but I am unsure. Am I on the right track?
Your query in the question should perform better than the queries in the (previously) accepted answer. Test with EXPLAIN ANALYZE.
DISTINCT ON is typically simpler and faster:
SELECT DISTINCT ON (licensekeys_checksum) *
FROM tsrs
ORDER BY licensekeys_checksum, timestamp DESC NULLS LAST;
Detailed explanation:
Select first row in each GROUP BY group?
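As a cross-check of what "latest row per checksum" should return for the sample input, here is a small Python sketch of the same greatest-per-group logic. The function latest_per_checksum is a hypothetical helper; timestamps are kept as fixed-width ISO strings, which compare correctly as text, and NULL timestamps are not handled.

```python
def latest_per_checksum(rows):
    """Keep, for each checksum, the row with the newest timestamp."""
    best = {}
    for row in rows:
        _row_id, _customer_id, ts, checksum = row
        # Replace the stored row only when this one is more recent.
        if checksum not in best or ts > best[checksum][2]:
            best[checksum] = row
    return list(best.values())

# Sample input copied from the question.
rows = [
    (1, 2, "2014-08-21 16:03:35", "3FF2561A"),
    (2, 2, "2014-08-22 10:00:41", "3FF2561A"),
    (2, 2, "2014-06-10 10:00:41", "081AB3CA"),
    (3, 5, "2014-02-01 12:03:23", "299AFF90"),
    (4, 5, "2013-12-13 08:14:26", "299AFF90"),
    (5, 6, "2013-09-09 18:21:53", "49FFA891"),
]
latest = latest_per_checksum(rows)
for row in latest:
    print(row)
```

Because the question guarantees that no two rows share both checksum and timestamp, each group has exactly one winner and no tie-breaker is needed.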
Alternative deduplication, using NOT EXISTS(...)
SELECT *
FROM tsrs t
WHERE NOT EXISTS (
SELECT *
FROM tsrs x
WHERE x.customer_id = t.customer_id -- same customer
AND x.licensekeys_checksum = t.licensekeys_checksum -- same checksum
AND x.timestamp > t.timestamp -- but more recent
);
Try this
select *
from tsrs
where (timestamp,licensekeys_checksum) in (
select max(timestamp)
,licensekeys_checksum
from tsrs
group by licensekeys_checksum)
or
with cte as (
select id
,customer_id
,timestamp
,licensekeys_checksum
,row_number () over (partition by licensekeys_checksum ORDER BY timestamp DESC) as rk
from tsrs)
select id
,customer_id
,timestamp
,licensekeys_checksum
from cte where rk=1 order by id
Reference : Window Functions, row_number(), and CTE