Oracle SQL Regexp - sql

I have comma-separated string values in two different columns and need to compare a specific value between these two columns.
Example:
Column A: A123,B234,I555,K987
Column B: AAA1,A123,B234,I555,K987
I want to check whether the value B234 from Column A (which starts at the 6th position) matches B234 from Column B (which starts at the 11th position). I have a few hundred such records and need to check each of them.

The way you put it, you'd compare "words" - the 2nd one in column A against the 3rd one in column B (sample data in lines #1 - 4; query you might be interested in begins at line #5):
SQL> with test (cola, colb) as
2 (select 'A123,B234,I555,K987', 'AAA1,A123,B234,I555,K987' from dual union all
3 select 'XYZ' , 'DEF' from dual
4 )
5 select *
6 from test
7 where regexp_substr(cola, '\w+', 1, 2) = regexp_substr(colb, '\w+', 1, 3);
COLA COLB
------------------- ------------------------
A123,B234,I555,K987 AAA1,A123,B234,I555,K987
SQL>
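The same "n-th word" comparison can be sketched outside the database. This is only an illustration in Python, assuming the same sample rows; the helper names `nth_word` and `words_match` are made up for this sketch and are not Oracle functions.

```python
import re

def nth_word(csv_value, n):
    """Return the n-th (1-based) run of word characters, or None if there are fewer than n."""
    words = re.findall(r"\w+", csv_value)
    return words[n - 1] if len(words) >= n else None

def words_match(col_a, col_b, pos_a=2, pos_b=3):
    """Compare word pos_a of col_a with word pos_b of col_b."""
    a = nth_word(col_a, pos_a)
    b = nth_word(col_b, pos_b)
    # Mirror the SQL behaviour: a missing word (NULL) never compares equal.
    return a is not None and a == b

rows = [
    ("A123,B234,I555,K987", "AAA1,A123,B234,I555,K987"),
    ("XYZ", "DEF"),
]
matching = [row for row in rows if words_match(*row)]
print(matching)
```

As in the query, only the first sample row survives, because the second row has no 2nd or 3rd word to compare.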

Related

How to search for comma delimited string Oracle SQL? [duplicate]

I'm using Oracle APEX 4.2. I have a table with a column called 'versions'. In the 'versions' column, each row holds a list of values separated by commas, e.g. '1,2,3,4'.
I'm trying to create a Select List whose list of values will be each of the values that are separated by commas for one of the rows. What would the SQL query for this be?
Example:
Table Name: Products
Name | Versions
--------------------
myProd1 | 1,2,3
myProd2 | a,b,c
Desired output:
Two Select Lists.
The first one is obvious, I just select the name column from the products table. This way the user can select whatever product they want.
The second one is the one I'm not sure about. Let's say the user has selected 'myProd1' from the first Select List. Then the second Select List should contain the following list of values for the user to select from: '1', '2' or '3'.
After reading your latest comments, I understand that what you want is not an LOV but rather a list item, although it can be an LOV too. The first list item/LOV holds all products, and the user selects one of them, e.g. Prod1, Prod2, Prod3... The second list item holds the versions, converted from comma-separated values (as in your example) to rows (as in my examples below). As I understand it, a single product may have many values, e.g. Prod1 has values 1,2,3,4, but the user may pick only one of them. Correct? This is why you need to convert the comma-separated values to a table. The first query is something like this:
SELECT prod_id
FROM your_prod_table
/
id
--------
myProd1
myProd2
.....
The second query should select all versions where prod_id is in your_prod_table:
SELECT version FROM your_versions_table
WHERE prod_id IN (SELECT prod_id FROM your_prod_table)
/
Versions
--------
1,2,3,4 -- myProd1 values
a,b,c,d -- myProd2 values
.....
The above will return all versions for each product, e.g. all values for myProd1 etc.
Use my examples for converting comma-separated values to a table: replace the hardcoded '1,2,3,4' with the value column from your table, and replace dual with your table name.
If you need products and versions in a single query and a single result, simply join (or outer join) both tables:
SELECT p.prod_id, v.version
FROM your_prod_table p
, your_versions_table v
WHERE p.prod_id = v.prod_id
/
In this case you will get something like this in the output:
id | Values
------------------
myProd1 | 1,2,3,4
myProd2 | a,b,c,d
If you convert the comma-separated values to a table in the above query, you will get this, all in one list or LOV:
id | Values
------------------
myProd1 | 1
myProd1 | 2
myProd1 | 3
myProd1 | 4
myProd2 | a
myProd2 | b
myProd2 | c
myProd2 | d
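To make the target shape concrete, here is a small Python sketch of the same comma-to-rows expansion. It is an illustration only, not APEX code; `expand_versions` is a hypothetical helper and the sample rows are the ones from the output above.

```python
def expand_versions(products):
    """Yield one (product, version) pair per comma-separated token."""
    for name, versions in products:
        for token in versions.split(","):
            token = token.strip()
            if token:  # skip empty tokens, as the '[^,]+' pattern does
                yield name, token

products = [("myProd1", "1,2,3,4"), ("myProd2", "a,b,c,d")]
pairs = list(expand_versions(products))
for name, version in pairs:
    print(name, "|", version)
```

Filtering `pairs` on the selected product gives the values for the second list.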
I hope this helps. Again, you may use an LOV or list values, if available in APEX. Two separate lists of values, one for products and one for versions, make more sense to me. With list items you will need two separate queries as above, and it will be easier to do the comma-to-table conversion for the versions only. But it is up to you.
Comma to table examples:
-- Comma to table - regexp_count --
SELECT trim(regexp_substr('1,2,3,4', '[^,]+', 1, LEVEL)) str_2_tab
FROM dual
CONNECT BY LEVEL <= regexp_count('1,2,3,4', ',')+1
/
-- Comma to table - Length -
SELECT trim(regexp_substr('1,2,3,4', '[^,]+', 1, LEVEL)) token
FROM dual
CONNECT BY LEVEL <= length('1,2,3,4') - length(REPLACE('1,2,3,4', ',', ''))+1
/
-- Comma to table - instr --
SELECT trim(regexp_substr('1,2,3,4', '[^,]+', 1, LEVEL)) str_2_tab
FROM dual
CONNECT BY instr('1,2,3,4', ',', 1, LEVEL - 1) > 0
/
All three queries return the same rows (the second one under the column name TOKEN):
STR_2_TAB
----------
1
2
3
4
Comma to table - PL/SQL-APEX example. For LOV you need SQL not PL/SQL.
DECLARE
v_array apex_application_global.vc_arr2;
v_string varchar2(2000);
BEGIN
-- Convert delimited string to array
v_array:= apex_util.string_to_table('alpha,beta,gamma,delta', ',');
FOR i in 1..v_array.count LOOP
dbms_output.put_line('Array: '||v_array(i));
END LOOP;
-- Convert array to delimited string
v_string:= apex_util.table_to_string(v_array,'|');
dbms_output.put_line('String: '||v_string);
END;
/

Querying a subset of an array in Snowflake, including some values but excluding other values

I am attempting to subset on certain elements within an array in a Snowflake database, including some elements but excluding others.
Example:
SELECT column1
FROM table
WHERE array_contains('cats'::variant, column1)
LIMIT 6;
with output:
Row column1
1 ["cats","dogs"]
2 ["horses","cows","cats"]
3 ["cats"]
4 ["cats","fish","turtles"]
5 ["cats","turtles","dogs"]
6 ["fish","cats"]
BUT how would I write a query that selects rows with "cats" in the array, but also excludes rows that have "cows" and "fish" even if "cats" is in those arrays as well? The goal would only be to return rows 1, 3, and 5 out of the above output, and exclude the other rows/arrays that have "cows" and/or "fish" in them even if "cats" happens to be in the array as well.
The desired subsetted output from above should be:
Row column1
1 ["cats","dogs"]
2 ["cats"]
3 ["cats","turtles","dogs"]
Just use NOT, ARRAY_CONTAINS and AND:
with t as (
select array_construct('dogs', 'cats') column1
union all select array_construct('dogs', 'cats', 'fish')
)
SELECT column1
FROM t
WHERE array_contains('cats'::variant, column1)
AND NOT array_contains('cows'::variant, column1)
AND NOT array_contains('fish'::variant, column1);
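The include/exclude logic is easy to check against the six sample rows from the question. The following Python sketch only models the predicate; `keep_row` is a hypothetical name and nothing here is Snowflake API.

```python
def keep_row(animals, required="cats", excluded=("cows", "fish")):
    """True when the required element is present and no excluded element is."""
    return required in animals and not any(item in animals for item in excluded)

rows = [
    ["cats", "dogs"],
    ["horses", "cows", "cats"],
    ["cats"],
    ["cats", "fish", "turtles"],
    ["cats", "turtles", "dogs"],
    ["fish", "cats"],
]
kept = [(number, row) for number, row in enumerate(rows, start=1) if keep_row(row)]
print(kept)
```

Rows 1, 3 and 5 are kept, which is the subset the question asks for.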

Finding a value in multiple columns in Oracle table

I have a table like below
ID         | NUMBER 1 | NUMBER 2 | NUMBER 3 | LOC
1-14H-4950 |          | 0616167  | 4233243  | CA
A-522355   |          | 1234567  |          | TN
A-522357   |          | 9876543  |          | WY
A-522371   |          | 1112223  |          | WA
A-522423   | 1234567  | 2345678  | 1234567  | NJ
A-A-522427 | 9876543  | 6249853  | 6249853  | NJ
and I have a bunch of values (1234567, 9876543, 0616167, 1112223, 999999, etc.) that will be used in the WHERE clause. If a value from the WHERE clause is found in one of the three number columns (NUMBER 1, NUMBER 2 or NUMBER 3), I have to write that row to Output 1 (it's like VLOOKUP in Excel).
If the value is found in more than one of the three columns, the row goes to a different output, Output 2, with the flag Multiple Match. If the value is not found in any of the three columns, it should also go to Output 2, with the flag No Match. I tried using a self join and OR clauses, but was not able to get what I want.
I want to write the SQL to generate both outputs. The outputs will include all the columns from the above table. For example:
Output 1 from above sample data will look like
ID NUMBER 1 NUMBER 2 NUMBER 3 LOC
1-14H-4950 0616167 4233243 CA
A-522371 1112223 WA
Output 2 will be like:
ID NUMBER 1 NUMBER 2 NUMBER 3 LOC Flag
A-522423 1234567 2345678 1234567 NJ Multiple Match
A-A-522427 9876543 6249853 6249853 NJ Multiple Match
1234 No Match
I want to write the SQL to generate both outputs.
A single SELECT statement cannot produce two result sets.
The main question is: why split the output when the only difference is the FLAG column? If you really need two separate outputs, you can do one of the following:
Create a common cursor for the query in which the FLAG column is calculated, and split the output into separate screens in the UI (this is the approach I would recommend).
drop table test_dt;
create table test_dt as
select '1-14h-4950' id,null num1,616167 num2,4233243 num3,'ca' loc from dual
union all
select 'a-522355',null ,1234567,null,'tn' from dual union all
select 'a-522357',null ,9876543,null,'wy' from dual union all
select 'a-522371',null ,1112223,null,'wa' from dual union all
select 'a-522423',1234567,2345678,1234567,'nj' from dual union all
select 'a3-522423',null,null,null,'nj' from dual union all
select 'a-a-522427',9876543,6249853,6249853,'nj' from dual;
--
select
d.*,
case when t.cc_ndv=0 and t.cc_null=3 then 'Not matching'
when t.cc_ndv=(3-t.cc_null) then 'Once'
else 'Multiple match'
end flag
--t.cc_ndv,
--t.cc_null
from test_dt d ,lateral(
select
count(distinct case level when 1 then num1
when 2 then num2
when 3 then num3
end ) cc_ndv,
count(distinct case level when 1 then nvl2(num1,null,1)
when 2 then nvl2(num2,null,2)
when 3 then nvl2(num3,null,3)
end ) cc_null
from dual connect by level<=3 and sys_guid()is not null
) t;
Or
create a procedure (see dbms_sql.return_result) that returns several result sets,
and process the data of each cursor / result set separately.
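For comparison, the rule as stated in the question (count how many of the three number columns hold one of the searched values) can be sketched in Python. This models the asker's rule rather than the SQL above, which counts distinct values per row. The name `classify`, the "Single Match" label and the column placement of the sample values are assumptions; the third row mirrors the all-NULL row from the test table.

```python
def classify(row, wanted):
    """Flag a row by how many of its three number columns hold a wanted value."""
    hits = sum(1 for value in row["nums"] if value is not None and value in wanted)
    if hits == 0:
        return "No Match"
    return "Single Match" if hits == 1 else "Multiple Match"

# Values are kept as strings so that a leading zero (0616167) is not lost.
wanted = {"1234567", "9876543", "0616167", "1112223", "999999"}
rows = [
    {"id": "1-14H-4950", "nums": (None, "0616167", "4233243"), "loc": "CA"},
    {"id": "A-522423", "nums": ("1234567", "2345678", "1234567"), "loc": "NJ"},
    {"id": "A3-522423", "nums": (None, None, None), "loc": "NJ"},
]
flags = {row["id"]: classify(row, wanted) for row in rows}
print(flags)
```

Splitting the flagged rows into two outputs is then a matter of filtering on the flag, which matches the suggestion to compute one result set and separate it afterwards.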

Oracle: Select multiple values from a column while satisfying condition for some values

I have a column COL in a table which has integer values like 1, 2, 3, 10, 11, and so on. Uniqueness in the table is created by an ID. Each ID can be associated with multiple COL values. For example
ID | COL
——————————
1 | 2
————+—————
1 | 3
————+—————
1 | 10
————+—————
is valid.
What I want to do is select only the COL values from the table that are greater than 3, AND (the problematic part) also select the value that is the MAX of 1, 2, and 3, if they exist at all. So in the table above, I would want to select values [3, 10] because 10 is greater than 3 and 3 = MAX(3, 2).
I know I can do this with two SQL statements, but it's sort of messy. Is there a way of doing it with one statement only?
SELECT col FROM table
WHERE
col > 3
UNION
SELECT MAX(col) FROM table
WHERE
col <= 3
This query does not assume you want the results per ID, because you don't explicitly mention it.
I don't think you need PL/SQL for this; SQL is enough.
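The two branches of the UNION can be traced with a few lines of Python. This is a sketch under the same "not per ID" assumption, with `select_values` as a hypothetical name. One deliberate difference: in SQL, MAX over zero rows still returns a single NULL row, whereas this sketch simply adds nothing in that case.

```python
def select_values(cols, threshold=3):
    """Values above the threshold, plus the largest value at or below it."""
    result = {c for c in cols if c > threshold}  # UNION removes duplicates
    at_or_below = [c for c in cols if c <= threshold]
    if at_or_below:
        result.add(max(at_or_below))
    return sorted(result)

picked = select_values([2, 3, 10])
print(picked)
```

For the sample table this yields 3 and 10, the values the question expects.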

SQL - suppressing duplicate *adjacent* records

I need to run a SELECT statement (DB2 SQL) that does not pull adjacent duplicate rows based on a certain field. Specifically, I am trying to find out when data changes, which is made difficult because it might change back to its original value.
That is to say, I have a table that vaguely resembles the below, sorted by Letter and then by Date:
A, 5, 2009-01-01
A, 12, 2009-02-01
A, 12, 2009-03-01
A, 12, 2009-04-01
A, 9, 2009-05-01
A, 9, 2009-06-01
A, 5, 2009-07-01
And I want to get the results:
A, 5, 2009-01-01
A, 12, 2009-02-01
A, 9, 2009-05-01
A, 5, 2009-07-01
discarding adjacent duplicates but keeping the last row (despite it having the same number as the first row). The obvious:
Select Letter, Number, Min(Update_Date) from Table group by Letter, Number
does not work -- it doesn't include the last row.
Edit: As there seems to have been some confusion, I have clarified the month column into a date column. It was meant as a human-parseable short form, not as actual valid data.
Edit: The last row is not important BECAUSE it is the last row, but because it has a "new value" that is also an "old value". Grouping by NUMBER would wrap it in with the first row; it needs to remain a separate entity.
Depending on which DB2 you're on, there are analytic functions which can make this problem easy to solve. An example in Oracle is below, but the select syntax appears to be pretty similar.
create table t1 (c1 char, c2 number, c3 date);
insert into t1 VALUES ('A', 5, DATE '2009-01-01');
insert into t1 VALUES ('A', 12, DATE '2009-02-01');
insert into t1 VALUES ('A', 12, DATE '2009-03-01');
insert into t1 VALUES ('A', 12, DATE '2009-04-01');
insert into t1 VALUES ('A', 9, DATE '2009-05-01');
insert into t1 VALUES ('A', 9, DATE '2009-06-01');
insert into t1 VALUES ('A', 5, DATE '2009-07-01');
SELECT C1, C2, C3
FROM (SELECT C1, C2, C3,
LAG(C2) OVER (PARTITION BY C1 ORDER BY C3) AS PRIOR_C2,
LEAD(C2) OVER (PARTITION BY C1 ORDER BY C3) AS NEXT_C2
FROM T1
)
WHERE C2 <> PRIOR_C2
OR PRIOR_C2 IS NULL -- to pick up the first value
ORDER BY C1, C3;
C C2 C3
- ---------- -------------------
A 5 2009-01-01 00:00:00
A 12 2009-02-01 00:00:00
A 9 2009-05-01 00:00:00
A 5 2009-07-01 00:00:00
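The "compare with the previous row" idea behind LAG can also be sketched procedurally. The Python below is an illustration only, using the sample data from the question; `changes_only` is a made-up helper name.

```python
from datetime import date
from itertools import groupby

def changes_only(rows):
    """Keep rows whose number differs from the previous row of the same letter."""
    kept = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))  # by letter, then date
    for _, group in groupby(ordered, key=lambda r: r[0]):
        previous = None  # plays the role of LAG(...) being NULL on the first row
        for row in group:
            if previous is None or row[1] != previous:
                kept.append(row)
            previous = row[1]
    return kept

rows = [
    ("A", 5, date(2009, 1, 1)),
    ("A", 12, date(2009, 2, 1)),
    ("A", 12, date(2009, 3, 1)),
    ("A", 12, date(2009, 4, 1)),
    ("A", 9, date(2009, 5, 1)),
    ("A", 9, date(2009, 6, 1)),
    ("A", 5, date(2009, 7, 1)),
]
result = changes_only(rows)
for letter, number, day in result:
    print(letter, number, day.isoformat())
```

The final 5 is kept because it is compared only with the 9 before it, not with the first 5.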
This is not possible with set based commands (i.e. group by and such).
You may be able to do this by using cursors.
Personally, I would get the data into my client application and do the filtering there.
The first thing you'd have to do is identify the sequence within which you wish to view/consider the data. Values of "Jan, Feb, Mar" don't help, because the data's not in alphabetical order. And what happens when you flip from Dec to Jan? Step 1: identify a sequence that uniquely defines each row with regards to your problem.
Next, you have to be able to compare item #x with item #x-1, to see if it has changed. If changed, include; if not changed, exclude. Trivial when using procedural code loops (cursors in SQL), but would you want to use those? They tend not to perform too well.
One SQL-based way to do this is to join the table to itself under two aliases, with the join clause being "T1.SequenceVal = T2.SequenceVal - 1". Throw in a comparison, make sure you don't toss the very first row of the set (where there is no x-1), and you're done. Note that performance may suck if the "SequenceVal" is not indexed.
Using an "EXCEPT" clause is one way to do it. See below for the solution. I've included all of my test steps here. First, I created a session table (this will go away after I disconnect from my database).
CREATE TABLE session.sample (
letter CHAR(1),
number INT,
update_date DATE
);
Then I imported your sample data:
IMPORT FROM sample.csv OF DEL INSERT INTO session.sample;
Verified that your sample information is in the database:
SELECT * FROM session.sample;
LETTER NUMBER UPDATE_DATE
------ ----------- -----------
A 5 01/01/2009
A 12 02/01/2009
A 12 03/01/2009
A 12 04/01/2009
A 9 05/01/2009
A 9 06/01/2009
A 5 07/01/2009
7 record(s) selected.
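I wrote this with an EXCEPT clause, and used "WITH" to try to make it clearer. The common table expression selects every row that is followed, exactly one month later, by a row with the same letter and number. Then, I exclude all of those rows from a select on the whole table. Note that this keeps the last row of each run of duplicates and assumes the rows are exactly one month apart; joining on s2.update_date + 1 MONTH instead would keep the first row of each run.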
WITH rows_with_previous AS (
SELECT s.*
FROM session.sample s
JOIN session.sample s2
ON s.letter = s2.letter
AND s.number = s2.number
AND s.update_date = s2.update_date - 1 MONTH
)
SELECT *
FROM session.sample
EXCEPT ALL
SELECT *
FROM rows_with_previous;
Here is the result:
LETTER NUMBER UPDATE_DATE
------ ----------- -----------
A 5 01/01/2009
A 12 04/01/2009
A 9 06/01/2009
A 5 07/01/2009
4 record(s) selected.
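The direction of the one-month join decides which row of each run survives. The Python sketch below models that anti-join on the sample data. The helpers `add_months` and `without_neighbour_match` are hypothetical, and the date arithmetic assumes first-of-month dates as in the sample.

```python
from datetime import date

def add_months(d, n):
    """Shift a first-of-month date by n months (the sample data only uses day 1)."""
    index = d.year * 12 + (d.month - 1) + n
    return date(index // 12, index % 12 + 1, d.day)

def without_neighbour_match(rows, offset):
    """Drop rows that have a same-letter, same-number row `offset` months away."""
    present = set(rows)
    return [r for r in rows if (r[0], r[1], add_months(r[2], offset)) not in present]

rows = [
    ("A", 5, date(2009, 1, 1)),
    ("A", 12, date(2009, 2, 1)),
    ("A", 12, date(2009, 3, 1)),
    ("A", 12, date(2009, 4, 1)),
    ("A", 9, date(2009, 5, 1)),
    ("A", 9, date(2009, 6, 1)),
    ("A", 5, date(2009, 7, 1)),
]
keeps_last = without_neighbour_match(rows, +1)   # same direction as the query above
keeps_first = without_neighbour_match(rows, -1)  # the rows the question asks for
print([r[2].month for r in keeps_last], [r[2].month for r in keeps_first])
```

With an offset of +1 the surviving months are January, April, June and July, as in the result above; with -1 they are January, February, May and July.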