ORACLE SQL IN Clause (SQL Query) - sql

I have a colon-delimited column with values like 1:2:3:. I want to turn this into 1,2,3. My query looks like this:
select name
from status where id IN (SELECT REPLACE(NEXT_LIST,':',',')
FROM status);
but I get this error:
ORA-01722: invalid number

(1, 2, 3, 4) is different from ('1, 2, 3, 4'). IN requires the former, a list of values; you give it the latter, a string.
You have two options mainly:
Build the query dynamically, i.e. get the list first, then use this to build a query string.
Tokenize the string. This can be done with a custom pipelined function or a recursive query, maybe also via some XML functions. Google "Oracle tokenize string" to find a method that suits you.
UPDATE Option #3: Use LIKE as in ':1:2:3:4:' like '%:3:%'
(This requires your next_list to contain only simple numbers separated with colons. No leading zeros, no blanks, no other characters.)
select name
from status
where (select ':' || next_list || ':' from status) like '%:' || id || ':%'

I agree with Thorsten, but I wonder whether it would work if we just replace one more time. I mean like this:
select name
from status where id IN (SELECT replace(REPLACE(NEXT_LIST,':',','),'''','')
FROM status);

The REPLACE function returns a string, so the nested query returns string values (with colons replaced by commas), not a list of numbers. When Oracle evaluates id IN (str_value), it tries to convert str_value to a number and raises ORA-01722: invalid number, because values like '1,2,3' cannot be parsed as a number.
The "pure SQL" approach is to use a custom function that checks whether a number is in a colon-separated list:
-- you need Oracle 12c to use function in the WITH clause
-- on earlier versions just unwrap CASE statement and put it into query
WITH
FUNCTION in_list(p_id NUMBER, p_list VARCHAR2) RETURN NUMBER DETERMINISTIC IS
BEGIN
RETURN CASE WHEN
instr(':' || p_list || ':', ':' || p_id || ':') > 0
THEN 1 ELSE 0 END;
END;
SELECT *
FROM status
WHERE in_list(id, next_list) = 1;
Here I assume that values in the next_list column are strings containing numbers separated by colons, without spaces. In general, you would need to adapt the function to match your specific list format.
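To see why both the list and the id are wrapped in colons before comparing, here is a minimal Python sketch of the same test. The function names are made up for illustration and are not part of any answer above; the second function is only there to show the false positive that the wrapping avoids.

```python
def in_list(item_id: int, colon_list: str) -> bool:
    """Delimiter-wrapped substring test, mirroring the instr() check above."""
    return f":{item_id}:" in f":{colon_list}:"


def naive_in_list(item_id: int, colon_list: str) -> bool:
    """Unwrapped substring test (hypothetical), shown only as a counterexample."""
    return str(item_id) in colon_list


print(in_list(2, "1:2:3"))        # True
print(in_list(1, "11:12"))        # False: ":1:" does not occur in ":11:12:"
print(naive_in_list(1, "11:12"))  # True: "1" is a substring of "11"
```

The same reasoning applies to the LIKE '%:3:%' variant: without the surrounding colons, id 1 would match a list that only contains 11.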

Related

How to make a list of quoted strings from the string values of a column in postgresql?

select my_col from test;
Out:
my_col
x
y
z
How can I change the output of the three rows into an output of a list of three quoted strings in postgresql, so that it looks like:
Out:
'x','y','z'
If I run string_agg(my_val, ''','''), I get
Out:
x','y','z
If I run quote_literal() on top of this output, I get:
Out:
'x'',''y'',''z'
I need this list of quoted strings as an input for the argument of a function (stored procedure). The function works by passing the 'x','y','z' as the argument by hand. Therefore, it is all just about the missing leading and trailing quote.
Side remark, not for the question: it would then get read into the function as variadic _v text[] so that I can check for its values in the function with where t.v = any(_v).
You seem to want:
select string_agg('''' || my_val || '''', ',') my_val_agg
from test
That is: concatenate the quotes around the values before aggregating them - then all that is left is to add the , separator in between.
'''' is there to produce a single quote. We can also use the escape string syntax (E'...') in Postgres, where a backslash escapes the quote:
select string_agg(E'\'' || my_val || E'\'', ',') my_val_agg
from test
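The order of operations is the whole point: quote each value first, then join. As a rough illustration outside the database, the Python sketch below reproduces both outputs from the question. The function names are hypothetical, and it assumes the values themselves contain no single quotes.

```python
def join_then_nothing(values):
    """What string_agg(my_val, ''',''') does: the separator supplies the inner quotes only."""
    return "','".join(values)


def quote_then_join(values):
    """What the answer does: wrap every value in quotes, then join with a plain comma."""
    return ",".join("'" + v + "'" for v in values)


rows = ["x", "y", "z"]
print(join_then_nothing(rows))  # x','y','z
print(quote_then_join(rows))    # 'x','y','z'
```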

Issue with replace function in Oracle

I want to replace 6 of the last 10 digits in a string with XXXXXX. The length of the string can be 16 or 19.
I am using the queries below:
SELECT REPLACE('0000000000000000000',SUBSTR('0000000000000000000',-10,6), 'XXXXXX') FROM DUAL;
--Actual Output --XXXXXXXXXXXXXXXXXX0
--Expected Output--000000000XXXXXX0000
SELECT REPLACE('1234561234561234561',SUBSTR('1234561234561234561',-10,6), 'XXXXXX') FROM DUAL;
--Actual Output --123XXXXXXXXXXXX4561
--Expected Output--123456123XXXXXX4561
SELECT REPLACE('0004421640006525212',SUBSTR('0004421640006525212',-10,6), 'XXXXXX') FROM DUAL;
--Actual Output --000442164XXXXXX5212
--Expected Output--000442164XXXXXX5212
Why do the first two give the wrong result, and how can I fix the query?
If the length of the string was always 19 you could do:
substr('0004421640006525212', 1, 9) || 'XXXXXX' || substr('0004421640006525212', -4)
With two possible lengths you could use a case expression to decide the second argument for the first substr() call, based on the actual string length; or you could allow for any length (of at least 10, anyway) with:
substr('0004421640006525212', 1, length('0004421640006525212') - 10) || 'XXXXXX' || substr('0004421640006525212', -4)
or with a placeholder/column for brevity:
substr(str, 1, length(str) - 10) || 'XXXXXX' || substr(str, -4)
Or maybe simpler, but slower, you could use a regular expression:
regexp_replace('0004421640006525212', '^(.*?)(.{6})(.{4})$', '\1XXXXXX\3')
The regular expression splits the string into three groups; working backwards, (.{4})$ is a group of exactly four characters at the end of the string; then (.{6}) is a group of exactly six characters (the ones you want to replace); then ^(.*?) is a group of any/all the remaining characters from the start of the string. The replacement pattern keeps the first and third groups - with \1 and \3 - and puts the fixed Xs between those. The second group - of six characters - is discarded.
SQL Fiddle getting the values, and a couple of shorter ones, from a table to avoid having to repeat them all; which also shows the first version doesn't work properly with varying lengths.
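The same three-group pattern can be tried outside Oracle. This is a small Python sketch using the standard re module, not the Oracle function itself; mask_with_regex is a name invented for the example, and the 16-digit input is a made-up value.

```python
import re

# Group 1: everything before the last ten characters (lazy match).
# Group 2: the six characters to mask. Group 3: the final four characters.
PATTERN = re.compile(r"^(.*?)(.{6})(.{4})$")


def mask_with_regex(s: str) -> str:
    """Keep groups 1 and 3, replace group 2 with six X characters."""
    return PATTERN.sub(r"\1XXXXXX\3", s)


print(mask_with_regex("0004421640006525212"))  # 000442164XXXXXX5212
print(mask_with_regex("1234561234561234561"))  # 123456123XXXXXX4561
print(mask_with_regex("1234567890123456"))     # 123456XXXXXX3456
```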
The replace function replaces every occurrence of one string with another. It doesn't know or care how the second argument is generated; it doesn't know you're getting it from a particular position in the same string.
When you do:
REPLACE('0004421640006525212',SUBSTR('0004421640006525212',-10,6), 'XXXXXX')
the SUBSTR() evaluates to '000652', so it's effectively:
REPLACE('0004421640006525212','000652', 'XXXXXX')
and that does what you want, because that substring only appears once in the original string. But with:
REPLACE('1234561234561234561',SUBSTR('1234561234561234561',-10,6), 'XXXXXX')
the SUBSTR() evaluates to '456123', so it's effectively:
REPLACE('1234561234561234561','456123', 'XXXXXX')
and that appears multiple times in the original string:
1234561234561234561
   ^^^^^^
         ^^^^^^
and both of those are replaced. With all zeros it's even worse; the SUBSTR() is now '000000', so it matches three times:
0000000000000000000
^^^^^^
      ^^^^^^
            ^^^^^^
and all three of those are replaced.
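Python's str.replace also replaces every non-overlapping occurrence, so it can be used to reproduce the behaviour described here and to compare it with slicing by position. This is only an illustrative sketch; both function names are hypothetical, and the positional version assumes the string has at least ten characters.

```python
def mask_by_replace(s: str) -> str:
    """Mimics REPLACE(s, SUBSTR(s, -10, 6), 'XXXXXX'): every occurrence is replaced."""
    target = s[-10:-4]  # six characters starting ten from the end
    return s.replace(target, "XXXXXX")


def mask_by_position(s: str) -> str:
    """Mimics substr(s, 1, length(s) - 10) || 'XXXXXX' || substr(s, -4)."""
    return s[:-10] + "XXXXXX" + s[-4:]


for value in ("0000000000000000000", "1234561234561234561", "0004421640006525212"):
    print(mask_by_replace(value), mask_by_position(value))
# XXXXXXXXXXXXXXXXXX0 000000000XXXXXX0000
# 123XXXXXXXXXXXX4561 123456123XXXXXX4561
# 000442164XXXXXX5212 000442164XXXXXX5212
```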

Extracting a value from a key/value pair stored in a text field

I need to extract the value from a key/value pair stored in a text field using sql in oracle 11g.
I can detect the "key" with
SELECT *
FROM mytable
WHERE valuet2 LIKE '%' || chr(10) || 'F;' || '%'
but I'm not confident this is the best way to do the search, and I don't know how to return the value of variable length (up to, but not including the carriage return).
This is the text field I need to search against and extract the value from.
;Please Select;*
E;Expelled
F;Expelled Following Suspension
N;In-School Suspension
S;Out-of-School Suspension
BS;Bus Suspension
101;Detention
130;Conference / Warning
131;Parent Contact / Conference
200;Loss of Recess
I'm querying a separate table that stores the "key", so I need to do the lookup from this text field to determine what that key value represents. I will be pushing this query out to other servers that will have their own unique combinations of key/value pairs, and I cannot anticipate what those may be. Therefore, I cannot write a decode.
You can use regular expression functions and take advantage of the 'm' modifier, which instructs Oracle to treat ^ and $ as the start-of-line and end-of-line anchors (rather than matching only at the beginning and the end of the string). Something like this:
select regexp_substr(valuet2, '^F;(.*)$', 1, 1, 'm', 1)
from mytable
where regexp_like(valuet2, '^F;', 'm')
;
Brief demo:
create table mytable (valuet2 varchar2(4000));
insert into mytable(valuet2) values(
';Please Select;*
E;Expelled
F;Expelled Following Suspension
N;In-School Suspension
S;Out-of-School Suspension
BS;Bus Suspension
101;Detention
130;Conference / Warning
131;Parent Contact / Conference
200;Loss of Recess'
);
select regexp_substr(valuet2, '^F;(.*)$', 1, 1, 'm', 1) as myval
from mytable
where regexp_like(valuet2, '^F;', 'm')
;
MYVAL
----------------------------------------
Expelled Following Suspension
Here F is hardcoded, but you can replace it with a bind variable; the query needs to be tweaked slightly. Please write back if you need help with that.
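For comparison, the same line-anchored lookup can be sketched in Python, where re.MULTILINE plays the role of the 'm' modifier. This is not the Oracle query with a bind variable, just an illustration of the matching logic; lookup is a hypothetical helper, and re.escape is used on the assumption that keys may contain regex metacharacters.

```python
import re

TEXT = """;Please Select;*
E;Expelled
F;Expelled Following Suspension
N;In-School Suspension
S;Out-of-School Suspension
BS;Bus Suspension
101;Detention"""


def lookup(key: str, text: str):
    """Return the value on the line that starts with 'key;', or None if absent."""
    match = re.search(rf"^{re.escape(key)};(.*)$", text, flags=re.MULTILINE)
    return match.group(1) if match else None


print(lookup("F", TEXT))  # Expelled Following Suspension
print(lookup("S", TEXT))  # Out-of-School Suspension (the BS line is not matched)
print(lookup("X", TEXT))  # None
```

Anchoring at the start of the line is what keeps key S from matching the BS line, which a plain LIKE '%S;%' search would not guarantee.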
I'm not 100% sure what you are looking for, but
select substr(valuet2,length(:val)+2) FROM mytable
WHERE valuet2 LIKE :val || ';%';
will return everything after the ; where the part before it matches :val

SQL Regular expression Function

I'm trying to understand the meaning of this regular expression function and its purpose in the select statement.
create or replace FUNCTION REPS_MTCH(string_orig IN VARCHAR2 , string_new IN VARCHAR2, score IN NUMBER)
RETURN PLS_INTEGER AS
BEGIN
IF string_orig IS NULL AND string_new IS NULL THEN
RETURN 0;
ELSIF utl_match.jaro_winkler_similarity(replace(REGEXP_REPLACE(UPPER(string_orig), '[^a-z|A-Z|0-9]+', ''),' ',''),replace(REGEXP_REPLACE(UPPER(string_new), '[^a-z|A-Z|0-9]+', ''),' ','')) >= score THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
END;
The REPS_MTCH function is called in the select statement below, which matches names in the temp table REPS_MTCH_D_STDNT_TMP against the master table REPS_MTCH_D_STDNT_MSTR.
SELECT
REPS_MTCH(REPS_MTCH_D_STDNT_TMP.FIRST_NAME,REPS_MTCH_D_STDNT_MSTR.FIRST_NAME,85) AS first_match_score,
What is the purpose of the REPS_MTCH function in this select statement?
In the above function, REGEXP_REPLACE removes every character that is not alphanumeric or a pipe (|). It is also wrapped in a redundant call to the regular REPLACE function, which only removes spaces that REGEXP_REPLACE has already removed. The test could be rewritten as follows and still behave identically, since the inputs are uppercased before the replace operations occur:
ELSIF utl_match.jaro_winkler_similarity(
REGEXP_REPLACE(UPPER(string_orig), '[^A-Z|0-9]+', '')
,REGEXP_REPLACE(UPPER(string_new) , '[^A-Z|0-9]+', '')
) >= score
THEN RETURN 1;
I simply removed the extra replace operation, the unnecessary lower case a-z and the extra pipe (|) character from the regular expression's character classes.
The JARO_WINKLER_SIMILARITY function just computes a score from 0 (not similar) to 100 (identical) for the remaining alphanumeric and pipe characters. You can check out the Wikipedia entry on Jaro-Winkler distance if you want to know more about it.
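The claim that the simplified expression behaves the same as the original can be checked on a few inputs. The sketch below reproduces only the normalisation step in Python (it does not implement Jaro-Winkler); both function names are invented for the example, and the sample names are made up.

```python
import re


def normalize_original(s: str) -> str:
    """Original form: uppercase, strip non-alphanumeric/non-pipe, then remove spaces."""
    return re.sub(r"[^a-z|A-Z|0-9]+", "", s.upper()).replace(" ", "")


def normalize_simplified(s: str) -> str:
    """Simplified form from the answer: one regex, no extra REPLACE."""
    return re.sub(r"[^A-Z|0-9]+", "", s.upper())


for name in ("O'Brien, John", "mary-ann  smith", "a|b c"):
    print(normalize_original(name), normalize_simplified(name))
# OBRIENJOHN OBRIENJOHN
# MARYANNSMITH MARYANNSMITH
# A|BC A|BC
```

Note that the pipe survives in both versions, because inside a character class | is a literal character rather than an alternation operator.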

PostgreSQL function to select max values of split record

I have a number of tables 'App_build', 'Server_build' with a column called 'buildid' and it contains a large number of records. I.e.:
buildid
-----------
Application1_BLD_01
Application1_BLD_02
Application1_BLD_03
Application2_BLD_01
Application3_BLD_01
Application3_BLD_02
Application4_1_0_0_1 - old format to be disregarded
Application4_1_0_0_2
Application4_BLD_03
I want to write a function called getmax(tablename) i.e. getmax('App_build')
which will return a recordset which lists the highest values only. I.e:
buildid
--------
Application1_BLD_03
Application2_BLD_01
Application3_BLD_02
Application4_BLD_03
I am new to SQL so am not sure how to start - I guess I can use a split command and then the MAX function but I have no idea where to start.
Any help will be great.
Assuming current version PostgreSQL 9.2 for lack of information.
Plain SQL
The simple query could look like this:
SELECT max(buildid)
FROM app_build
WHERE buildid !~ '\d+_\d+_\d+_\d+$' -- to exclude old format
GROUP BY substring(buildid, '^[^_]+')
ORDER BY substring(buildid, '^[^_]+');
The WHERE condition used a regular expression:
buildid !~ '\d+_\d+_\d+_\d+$'
Excludes buildid that end in 4 integer numbers divided by _.
\d .. character class shorthand for digits. Only one backslash \ in modern PostgreSQL with standard_conforming_strings = ON.
+ .. 1 or more of preceding atom.
$ .. As last character: anchored to the end of the string.
There may be a cheaper / more accurate way, you did not properly specify the format.
GROUP BY and ORDER BY extract the string before the first occurrence of _ with substring() as the app name to group and order by. The regexp explained:
^ .. As first character: anchor search expression to start of string.
[^_] .. Character class: any character that is not _.
Does the same as split_part(buildid, '_', 1). But split_part() may be faster ..
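The filter, group and max steps of the query can be traced on the sample data with a short Python sketch. It is a model of the logic only, not a replacement for the SQL; get_max and OLD_FORMAT are names chosen for the example. Like max(buildid) in the query, it compares whole strings, so it assumes the build numbers are zero-padded to the same width.

```python
import re

# Same pattern as in the WHERE clause: four integers separated by _ at the end.
OLD_FORMAT = re.compile(r"\d+_\d+_\d+_\d+$")


def get_max(build_ids):
    """Return the highest buildid per application prefix, skipping the old format."""
    best = {}
    for build_id in build_ids:
        if OLD_FORMAT.search(build_id):
            continue  # old format, disregarded
        app = build_id.split("_", 1)[0]  # same as split_part(buildid, '_', 1)
        if app not in best or build_id > best[app]:
            best[app] = build_id
    return [best[app] for app in sorted(best)]


rows = [
    "Application1_BLD_01", "Application1_BLD_02", "Application1_BLD_03",
    "Application2_BLD_01", "Application3_BLD_01", "Application3_BLD_02",
    "Application4_1_0_0_1", "Application4_1_0_0_2", "Application4_BLD_03",
]
print(get_max(rows))
# ['Application1_BLD_03', 'Application2_BLD_01', 'Application3_BLD_02', 'Application4_BLD_03']
```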
Function
If you want to write a function where the table name is variable, you need dynamic SQL. That is a plpgsql function with EXECUTE:
CREATE OR REPLACE FUNCTION getmax(_tbl regclass)
RETURNS SETOF text AS
$func$
BEGIN
RETURN QUERY
EXECUTE format($$
SELECT max(buildid)
FROM %s
WHERE buildid !~ '\d+_\d+_\d+_\d+$'
GROUP BY substring(buildid, '^[^_]+')
ORDER BY substring(buildid, '^[^_]+')$$, _tbl);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM getmax('app_build');
Or if you are, in fact, using mixed case identifiers:
SELECT * FROM getmax('"App_build"');
->SQLfiddle demo.
More info on the object identifier class regclass in this related question:
Table name as a PostgreSQL function parameter
What you want is a groupwise_max. It can be done with MAX() but the usual way is left join:
SELECT b1.buildid
FROM builds AS b1
LEFT JOIN builds AS b2 ON
split_part(b1.buildid, '_', 1)=split_part(b2.buildid, '_', 1)
AND
split_part(b1.buildid, '_', 3)::int<split_part(b2.buildid, '_', 3)::int
WHERE b2.buildid IS NULL;
But since you're using PG it can be done with DISTINCT ON ()
SELECT DISTINCT ON (split_part(buildid, '_', 1)) buildid
FROM builds
ORDER BY split_part(buildid, '_', 1),split_part(buildid, '_', 3)::int DESC
http://sqlfiddle.com/#!12/308bf/9