remove extra + from text SQL - sql

this refers to a question asked by someone else previously
previous question
my question is how do I adapt this solution so that before any function/script is ran the name and value fields are stripped of any additional + and updated so no additional + remain.
For e.g.
Name Value
A+B+C+ 1+2+3+
A++B 1++2
this should be updated to
Name Value
A+B+C 1+2+3
A+B 1+2
once this update has taken place, I can run the solution provided in the previous question.
Thanks

You need to replace ++ with + and to remove the + at the end of the string.
/* sample data */
with input(Name, Value) as (
select 'A+B+C+' ,'1+2+3+' from dual union all
select 'A++B' ,'1++2' from dual
)
/* query */
select trim('+' from regexp_replace(name, '\+{2,}', '+') ) as name,
trim('+' from regexp_replace(value, '\+{2,}', '+') ) as value
from input
If you need to update a table, you may need:
update yourTable
set name = trim('+' from regexp_replace(name, '\+{2,}', '+') ),
value= trim('+' from regexp_replace(value, '\+{2,}', '+') )
In a more compact way, without the external trim ( assuming you have no leading +):
/* sample data */
with input(Name, Value) as (
select 'A+B+C+' ,'1+2+3+' from dual union all
select 'A++B+++C+' ,'1++2+++3+' from dual union all
select 'A+B' ,'1+2' from dual
)
/* query */
select regexp_replace(name, '(\+)+(\+|$)', '\2') as name,
regexp_replace(value, '(\+)+(\+|$)', '\2') as value
from input

You could use something on the lines of:
Select substr('1+2+3+', 0, length('1+2+3+')-1) from dual ;
Select replace('1++2', '++', '+') from dual;
I'm assuming you have the output already present in a variable you can play with.
EDIT:
Here's a function that can solve the problem (You can call this function in your select clauses thereby solving the problem):
CREATE OR REPLACE Function ReplaceChars
( name_in IN varchar2 )
RETURN varchar2
IS
changed_string varchar2(20) ;
BEGIN
changed_string:=replace(name_in, '++', '+') ;
CASE WHEN substr(changed_string, -1) in ('+')
then
changed_string:=substr(changed_string,0, length(changed_string) - 1) ;
else changed_string:=changed_string ;
end CASE ;
RETURN changed_string;
END;

You can use the below:
LTRIM(RTRIM (REGEXP_REPLACE (column_name, '\+{2,}', '+'), '+'),'+')
Eg:
SELECT LTRIM(RTRIM (REGEXP_REPLACE ('+A+++B+C+++D++', '\+{2,}', '+'), '+'),'+') VALUE
FROM DUAL;
returns output: A+B+C+D

if youre working with ssms, GIVE IT A GO:::
UPDATE tablename
SET colname=
CASE colname WHEN LIKE '%++%' THEN
WHILE colname LIKE '%++%'
(REPLACE(colname,++,+))
END LOOP
WHEN LIKE '%+' THEN
SUBSTR(colname, 1, LENGTH(colname)-1)
WHEN LIKE '+%' THEN
SUBSTR(colname, 2, LENGTH(colname))
ELSE
colname
END

Related

How to include the code into a REPLACE function in oracle?

User #psaraj12 helped me with a ticket here about finding ascii character in a string in my DB with the following code:
with test (col) as (
select
'L.A.D'
from
dual
union all
select
'L.􀈉.D.'
from
dual
)
select
col,
case when max(ascii_of_one_character) >= 65535 then 'NOT OK' else 'OK' end result
from
(
select
col,
substr(col, column_value, 1) one_character,
ascii(
substr(col, column_value, 1)
) ascii_of_one_character
from
test cross
join table(
cast(
multiset(
select
level
from
dual connect by level <= length(col)
) as sys.odcinumberlist
)
)
)
group by
col
having max(ascii_of_one_character) >= 4000000000;
The script looks for characters of a certain range GROUPs them and marks displays them.
Is it possible to include this in a REPLACE statement of a similar sort:
REPLACE(table.column, max(ascii_of_one_character) >= 4000000000, '')
EDIT: As per #flyaround answer this is the code I use changed a little bit:
with test (col) as (
select skunden.name1
from skunden
)
select col
, REGEXP_REPLACE(col, 'max(ascii_of_one_character)>=4000000000', '') as cleaned
, CASE WHEN REGEXP_COUNT(col, 'max(ascii_of_one_character)>=4000000000') > 0 THEN 0 ELSE 1 END as isOk
from test;
Coming back to your original code, because my suggested REGEX_REPLACE is not working sufficient with high surrogates. Your approach is already very effective, so I jumped into it to have a solution here.
MERGE
INTO skunden
USING (
select
id as innerId,
name as innerName,
case when max(ascii_of_one_character) >= 65535 then 0 else 1 end isOk,
listagg(case when ascii_of_one_character <65535 then one_character end , '') within group (order by rn) as cleaned
from
(
select
id,
name,
substr(name, column_value, 1) one_character,
ascii(
substr(name, column_value, 1)
) ascii_of_one_character
, rownum as rn
from
skunden cross
join table(
cast(
multiset(
select
level
from
dual connect by level <= length(name)
) as sys.odcinumberlist
)
)
)
group by
id, name
having max(ascii_of_one_character) >= 4000000000
)
ON (skunden.id = innerId)
WHEN MATCHED THEN
UPDATE
SET name = cleaned
;
On MERGE you can't use the referencing column for an update. Therefore you should use the unique key (I used 'id' in my example) of your table.
The resulting value will be 'L..D' for your example value of 'L.􀈉.D.'
If I got your question correctly you would like to remove characters with a higher decimal representation of characters than specified.
You could check to use REGEXP_REPLACE for this, like:
with test (col) as (
select
'L.A.D'
from
dual
union all
select
'L.􀈉.D.'
from
dual
)
select col
, REGEXP_REPLACE(col, '[^\u00010000-\u0010FFFF]+$', '') as cleaned
, CASE WHEN REGEXP_COUNT(col, '[^\u00010000-\u0010FFFF]+$') > 0 THEN 0 ELSE 1 END as isOk
from test;

Substring from underscore and onwards in Oracle

I have a string with under score and some characters. I need to apply substring and get values to the left excluding underscore. So I applied below formula and its working correctly for those strings which have underscore (_). But for strings without (_) it is bringing NULL. Any suggestions how this can be handled in the substring itself.
Ex: ABC_BASL ---> Works correctly; ABC ---> gives null
My formula as below -
select SUBSTR('ABC_BAS',1,INSTR('ABC_BAS','_')-1) from dual;
ABC
select SUBSTR('ABC',1,INSTR('ABC','_')-1) from dual;
(NULL)
You could use a CASE expression to first check for an underscore:
WITH yourTable AS (
SELECT 'ABC_BAS' AS col FROM dual UNION ALL
SELECT 'ABC' FROM dual
)
SELECT
CASE WHEN col LIKE '%\_%' ESCAPE '\'
THEN SUBSTR(col, 1, INSTR(col, '_') - 1)
ELSE col END AS col_out
FROM yourTable;
Use regular expression matching:
SELECT REGEXP_SUBSTR('ABC_BAS', '(.*)([_]|$)?', 1, 1, NULL, 1) FROM DUAL;
returns 'ABC', and
SELECT REGEXP_SUBSTR('ABC', '(.*)([_]|$)?', 1, 1, NULL, 1) FROM DUAL;
also returns 'ABC'.
db<>fiddle here
EDIT
The above gives correct results, but I missed the easiest possible regular expression to do the job:
SELECT REGEXP_SUBSTR('ABC_BAS', '[^_]*') FROM DUAL;
returns 'ABC', as does
SELECT REGEXP_SUBSTR('ABC', '[^_]*') FROM DUAL;
db<>fiddle here
Yet another approach is to use the DECODE in the length parameter of the substr as follows:
substr(str,
1,
decode(instr(str,'_'), 0, lenght(str), instr(str,'_') - 1)
)
You seem to want everything up to the first '_'. If so, one method usesregexp_replace():
select regexp_replace(str, '(^[^_]+)_.*$', '\1')
from (select 'ABC' as str from dual union all
select 'ABC_BAS' from dual
) s
A simpler method is:
select regexp_substr(str, '^[^_]+')
from (select 'ABC' as str from dual union all
select 'ABC_BAS' from dual
) s
Here is a db<>fiddle.
I'd use
regexp_replace(text,'_.*')
or if performance was a concern,
substr(text, 1, instr(text||'_', '_') -1)
For example,
with demo(text) as
( select column_value
from table(sys.dbms_debug_vc2coll('ABC', 'ABC_DEF', 'ABC_DEF_GHI')) )
select text
, regexp_replace(text,'_.*')
, substr(text, 1, instr(text||'_', '_') -1)
from demo;
TEXT REGEXP_REPLACE(TEXT,'_.*') SUBSTR(TEXT,1,INSTR(TEXT||'_','_')-1)
------------ --------------------------- -------------------------------------
ABC ABC ABC
ABC_DEF ABC ABC
ABC_DEF_GHI ABC ABC
Ok i think i got it. Add nvl to the substring and insert the condition as below -
select nvl(substr('ABC',1,instr('F4001Z','_')-1),'ABC') from dual;

SQL to find upper case words from a column

I have a description column in my table and its values are:
This is a EXAMPLE
This is a TEST
This is a VALUE
I want to display only EXAMPLE, TEST, and VALUE from the description column.
How do I achieve this?
This could be a way:
-- a test case
with test(id, str) as (
select 1, 'This is a EXAMPLE' from dual union all
select 2, 'This is a TEST' from dual union all
select 3, 'This is a VALUE' from dual union all
select 4, 'This IS aN EXAMPLE' from dual
)
-- concatenate the resulting words
select id, listagg(str, ' ') within group (order by pos)
from (
-- tokenize the strings by using the space as a word separator
SELECT id,
trim(regexp_substr(str, '[^ ]+', 1, level)) str,
level as pos
FROM test t
CONNECT BY instr(str, ' ', 1, level - 1) > 0
and prior id = id
and prior sys_guid() is not null
)
-- only get the uppercase words
where regexp_like(str, '^[A-Z]+$')
group by id
The idea is to tokenize every string, then cut off the words that are not made by upper case characters and then concatenate the remaining words.
The result:
1 EXAMPLE
2 TEST
3 VALUE
4 IS EXAMPLE
If you need to handle some other character as an upper case letter, you may edit the where condition to filter for the matching words; for example, with '_':
with test(id, str) as (
select 1, 'This is a EXAMPLE' from dual union all
select 2, 'This is a TEST' from dual union all
select 3, 'This is a VALUE' from dual union all
select 4, 'This IS aN EXAMPLE' from dual union all
select 5, 'This IS AN_EXAMPLE' from dual
)
select id, listagg(str, ' ') within group (order by pos)
from (
SELECT id,
trim(regexp_substr(str, '[^ ]+', 1, level)) str,
level as pos
FROM test t
CONNECT BY instr(str, ' ', 1, level - 1) > 0
and prior id = id
and prior sys_guid() is not null
)
where regexp_like(str, '^[A-Z_]+$')
group by id
gives:
1 EXAMPLE
2 TEST
3 VALUE
4 IS EXAMPLE
5 IS AN_EXAMPLE
Here's another solution. It was inspired by Aleksej's answer.
The idea? Get all the words. Then aggregate only fully uppercased to a list.
Sample data:
create table descriptions (ID int, Description varchar2(100));
insert into descriptions (ID, Description)
select 1 as ID, 'foo Foo FOO bar Bar BAR' as Description from dual
union all select 2, 'This is an EXAMPLE TEST Description VALUE' from dual
;
Query:
select id, Description, listagg(word, ',') within group (order by pos) as UpperCaseWords
from (
select
id, Description,
trim(regexp_substr(Description, '\w+', 1, level)) as word,
level as pos
from descriptions t
connect by regexp_instr(Description, '\s+', 1, level - 1) > 0
and prior id = id
and prior sys_guid() is not null
)
where word = upper(word)
group by id, Description
Result:
ID | DESCRIPTION | UPPERCASEWORDS
-- | ----------------------------------------- | ------------------
1 | foo Foo FOO bar Bar BAR | FOO,BAR
2 | This is an EXAMPLE TEST Description VALUE | EXAMPLE,TEST,VALUE
It is possible to achieve this thanks to the REGEXP_REPLACE function:
SELECT REGEXP_REPLACE(my_column, '(^[A-Z]| |[a-z][A-Z]*|[A-Z]*[a-z])', '') AS Result FROM my_table
It uses a regex which replaces first upper case char of the line and converts every lower case char and space with blanks.
Try this:
SELECT SUBSTR(column_name, INSTR(column_name,' ',-1) + 1)
FROM your_table;
This should do the trick:
SELECT SUBSTR(REGEXP_REPLACE(' ' || REGEXP_REPLACE(description, '(^[A-Z]|[a-z]|[A-Z][a-z]+|[,])', ''), ' +', ' '), 2, 9999) AS only_upper
FROM (
select 'Hey IF you do not know IT, This IS a test of UPPERCASE and IT, with good WILL and faith, Should BE fine to be SHOWN' description
from dual
)
I have added condition to strip commas, you can add inside that brakets other special characters to remove.
ONLY_UPPER
-----------------------------------
IF IT IS UPPERCASE IT WILL BE SHOWN
This is a function based on some of the regular expression answers.
create or replace function capwords(orig_string varchar2)
return varchar2
as
out_string varchar2(80);
begin
out_string := REGEXP_REPLACE(orig_string, '([a-z][A-Z_]*|[A-Z_]*[a-z])', '');
out_string := REGEXP_REPLACE(trim(out_string), '( *)', ' ');
return out_string;
end;
/
Removes strings of upper case letters and underscores that have lower case letters
on either end. Replaces multiple adjacent spaces with one space.
Trims extra spaces off of the ends. Assumes max size of 80 characters.
Slightly edited output:
>select id,str,capwords(str) from test;
ID STR CAPWORDS(STR)
---------- ------------------------------ ------------------
1 This is a EXAMPLE EXAMPLE
2 This is a TEST TEST
3 This is a VALUE VALUE
4 This IS aN EXAMPLE IS EXAMPLE
5 This is WITH_UNDERSCORE WITH_UNDERSCORE
6 ThiS IS aN EXAMPLE IS EXAMPLE
7 thiS IS aN EXAMPLE IS EXAMPLE
8 This IS wiTH_UNDERSCORE IS
If you only need to "display" the result without changing the values in the column then you can use CASE WHEN (in the example Description is the column name):
Select CASE WHEN Description like '%EXAMPLE%' then 'EXAMPLE' WHEN Description like '%TEST%' then 'TEST' WHEN Description like '%VALUE%' then 'VALUE' END From [yourTable]
The conditions are not case sensitive even if you write it all in uppercase.
You can add Else '<Value if all conditions are wrong>' before the END in case there are descriptions that don't contain any of the values. The example will return NULL for those cases, and writing ELSE Description will return the original value of that row.
It also works if you need to update. It is simple and practical, easy way out, haha.

I want to remove part of string from a string

Thank you in advance.
I want to remove string after . including ., but length is variable and string can be of any length.
1)Example:
Input:- SCC0204.X and FRK0005.X and RF0023.X and ADF1010.A and HGT9010.V
Output: SCC0204 and FRK0005 and RF0023 and ADF1010.A and HGT9010.V
I tried using the charindex but as the length keeps on changing i wasn't able to do it. I want to trim the values with ending with only X
Any help will be greatly appreciated.
Assuming there is only one dot
UPDATE TABLE
SET column_name = left(column_name, charindex('.', column_name) - 1)
For SELECT
select left(column_name, charindex('.', column_name) - 1) AS col
from your_table
Hope this helps. The code only trims the string when the value has a decimal "." in it and if that value is equal to .X
;WITH cte_TestData(Code) AS
(
SELECT 'SCC0204.X' UNION ALL
SELECT 'FRK0005.X' UNION ALL
SELECT 'RF0023.X' UNION ALL
SELECT 'ADF1010.A' UNION ALL
SELECT 'HGT9010.V' UNION ALL
SELECT 'SCC0204' UNION ALL
SELECT 'FRK0005'
)
SELECT CASE
WHEN CHARINDEX('.', Code) > 0 AND RIGHT(Code,2) = '.X'
THEN SUBSTRING(Code, 1, CHARINDEX('.', Code) - 1)
ELSE Code
END
FROM cte_TestData
If the criteria is only to replace remove .X then probably this should also work
;WITH cte_TestData(Code) AS
(
SELECT 'SCC0204.X' UNION ALL
SELECT 'FRK0005.X' UNION ALL
SELECT 'RF0023.X' UNION ALL
SELECT 'ADF1010.A' UNION ALL
SELECT 'HGT9010.V' UNION ALL
SELECT 'SCC0204' UNION ALL
SELECT 'FRK0005'
)
SELECT REPLACE (Code,'.X','')
FROM cte_TestData
Use LEFT String function :
DECLARE #String VARCHAR(100) = 'SCC0204.XXXXX'
SELECT LEFT(#String,CHARINDEX('.', #String) - 1)
I think your best bet here is to create a function that parses the string and uses regex. I hope this old post helps:
Perform regex (replace) in an SQL query
However, if the value you need to trim is constantly ".X", then you should use
select replace(string, '.x', '')
Please check the below code. I think this will help you.
DECLARE #String VARCHAR(100) = 'SCC0204.X'
IF (SELECT RIGHT(#String,2)) ='.X'
SELECT LEFT(#String,CHARINDEX('.', #String) - 1)
ELSE
SELECT #String
Update: I just missed one of the comments where the OP clarifies the requirement. What I put together below is how you would deal with a requirement to remove everything after the first dot on strings ending with X. I leave this here for reference.
;WITH cte_TestData(Code) AS
(
SELECT 'SCC0204.X' UNION ALL -- ends with '.X'
SELECT 'FRK.000.X' UNION ALL -- ends with '.X', contains multiple dots
SELECT 'RF0023.AX' UNION ALL -- ends with '.AX'
SELECT 'ADF1010.A' UNION ALL -- ends with '.A'
SELECT 'HGT9010.V' UNION ALL -- ends with '.V'
SELECT 'SCC0204.XF' UNION ALL -- ends with '.XF'
SELECT 'FRK0005' UNION ALL -- totally clean
SELECT 'ABCX' -- ends with 'X', not dots
)
SELECT
orig_string = code,
newstring =
SUBSTRING
(
code, 1,
CASE
WHEN code LIKE '%X'
THEN ISNULL(NULLIF(CHARINDEX('.',code)-1, -1), LEN(code))
ELSE LEN(code)
END
)
FROM cte_TestData;
FYI - SQL Server 2012+ you could simplify this code like this:
SELECT
orig_string = code,
newstring =
SUBSTRING(code, 1,IIF(code LIKE '%X', ISNULL(NULLIF(CHARINDEX('.',code)-1, -1), LEN(code)), LEN(code)))
FROM cte_TestData;
With SUBSTRING you can achieve your requirements by below code.
SELECT SUBSTRING(column_name, 0, CHARINDEX('.', column_name)) AS col
FROM your_table
If you want to remove fixed .X from string you can also use REPLACE function.
SELECT REPLACE(column_name, '.X', '') AS col

Reg Expression in oracle?

I have a string like this '102/103/104/106'
Now if i pass 102 as input then output should be the next field that is 103. if 103 then output should be 104 and if 106 then output should be null(as for last field I don't have any further expression). I can do this using procedure by splitting the string into arrays and comparing. But can I do this through sql statement something like this
select '102/103/104/106' from dual where [expression 102 or 103].
Thanks!!
You can do it in pure SQL with something like this:
--convert your string into rows
with vals as (
select
substr('102/103/104/106',
instr('102/103/104/106', '/', 1, level)-3,
3
) col,
level lvl
from dual
connect by level <= length('102/103/104/106')-length(replace('102/103/104/106', '/'))+1
)
select *
from (
select col,
lead(col) over (order by lvl) next_val -- find the next value in the list
from vals
)
where col = :val;
Basically, convert your string into rows by parsing it. Then use the analytic lead to find the "next" value.
-- p_whole_string = '102/103/104/106'
-- p_prev = '102'
select
regexp_substr(p_whole_string, '(^|/)' || p_prev || '/([^/]+)', 1, 1, null, 2)
as next
from dual;
Added NVL to return last value if 106 is entered:
SELECT NVL(REGEXP_SUBSTR('102/103/104/106', '(^|/)' || '106' || '/([^/]+)', 1, 1, null, 2), REGEXP_SUBSTR('102/103/104/106', '[^/]+$')) as nxt
FROM dual
/
works for Oracle form 10 up.
SELECT
REGEXP_SUBSTR(
REGEXP_SUBSTR('102/103/104/106', '(^|/)102/[^/]+'), -- returns 102/103
'[^/]+',1,2) val -- takes second part
FROM DUAL;
with parameters looks like this:
-- p_string_to_search = '102/103/104/106'
-- p_string_to_match = '102'
SELECT
REGEXP_SUBSTR(
REGEXP_SUBSTR(p_string_to_search, '(^|/)' || p_string_to_match ||'/[^/]+'), -- returns 102/103
'[^/]+',1,2) val -- takes second part
FROM DUAL;