I am trying to find a way to replace the string in between two "anchor points" in my VARCHAR2 column.
These "anchors" are <? and ?> and I want to remove (replace with '') everything that is between those two symbols.
I've already tried the REPLACE() function, e.g. SELECT REPLACE(my_varchar2_column,'<? % ?>') FROM my_table; using % as a wildcard, but that didn't work. No error was thrown, but the result wasn't what I expected: REPLACE interprets the % literally and not as a wildcard.
Does anyone have an idea how to achieve a replacement like this?
Example for current content of the column:
text I want to keep <? cryptic stuff in betweeen ?> text I want to keep as well
By replacing everything between <? and ?>, I want to remove that whole passage from my column's text. The expected result looks like this:
text I want to keep text I want to keep as well
You can use Oracle's REGEXP_REPLACE function.
First argument = the column (or string) to search in.
Second argument = the regular expression pattern to search for.
Third argument = the replacement text (NOTE: if this argument is omitted, the matched substrings are deleted).
SELECT
REGEXP_REPLACE('text I want to keep <? cryptic stuff in betweeen ?> text ','<\?.*\?>')
FROM DUAL
OUTPUT:
text I want to keep text
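One thing worth checking with the pattern above is greediness: .* matches as much as possible, so a value containing two <? ... ?> passages would lose the text between them as well. The following is a minimal sketch of that difference, using Python's re module as a stand-in for the database regex engine (an assumption; the Oracle flavor is not identical, and the function names are hypothetical):

```python
import re

def strip_greedy(text: str) -> str:
    # Greedy: matches from the first "<?" to the LAST "?>" in the string.
    return re.sub(r"<\?.*\?>", "", text)

def strip_non_greedy(text: str) -> str:
    # Non-greedy: each "<? ... ?>" passage is matched and removed separately.
    return re.sub(r"<\?.*?\?>", "", text)

sample = "keep A <? x ?> keep B <? y ?> keep C"
print(repr(strip_greedy(sample)))      # "keep B" is lost
print(repr(strip_non_greedy(sample)))  # "keep B" survives
```

With a single passage per value, as in the question's example, both variants give the same result.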
select substr( my_varchar2_column, 1, INSTR( my_varchar2_column, '<?') - 1 ) || substr( my_varchar2_column, INSTR( my_varchar2_column, '?>') + 2 ) from yourtable;
Would that do the trick? You find the position of the beginning and end tags with the INSTR function and use it to SUBSTR the values before and after the tags.
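The position-based idea (find the start tag, find the end tag, keep what lies before and after) can be sketched as follows. This is an illustration in Python rather than Oracle SQL, and cut_between is a hypothetical helper name; it also shows one way to handle values that contain no tags at all:

```python
def cut_between(text: str, start: str = "<?", end: str = "?>") -> str:
    """Remove the first start...end passage, tags included."""
    i = text.find(start)
    if i == -1:
        return text  # no start tag: leave the value unchanged
    j = text.find(end, i + len(start))
    if j == -1:
        return text  # no end tag after the start tag: leave unchanged
    # Keep everything before the start tag and everything after the end tag.
    return text[:i] + text[j + len(end):]

print(repr(cut_between("a<?b?>c")))
```

Unlike a regex replace, this removes only the first passage; it would need a loop to handle several passages in one value.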
I need some help with the following. I have a text field in SQL that records a list of times separated by '|'. For example:
'14613|15474|3832|148|5236|5348|1055|524' Each value is a time in milliseconds. The field can be any length; for example, '3215|2654' or '4565' (only one value) are both perfectly valid. I need to take this field and replace every number with the value -1000.
So '14613|15474|3832|148|5236|5348|1055|524' will become '-1000|-1000|-1000|-1000|-1000|-1000|-1000|-1000'
Or '3215|2654' => '-1000|-1000', or '4565' => '-1000'.
I tried regexp_replace(times_field,'[[:digit:]]','-1000','g'), but it replaces each digit rather than the complete number, so in this example:
'3215|2654', which should become '-1000|-1000', I get:
'-1000-1000-1000-1000|-1000-1000-1000-1000'. I have tried other combinations and more regexp options, but I'm stuck.
I'd appreciate your help, thanks!
We can try using REGEXP_REPLACE here:
UPDATE yourTable
SET times_field = REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g');
If instead you don't really want to alter your data but rather just view your data this way, then use a select:
SELECT
times_field,
REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g') AS times_field_replace
FROM yourTable;
Note that in either case we pass g as the fourth parameter to REGEXP_REPLACE to do a global replacement of all pipe-separated numbers.
[[:digit:]] - matches a single digit [0-9]
+ Quantifier - matches between one and unlimited times, as many times as possible
So your regexp should look like this:
regexp_replace(times_field,'[[:digit:]]+','-1000','g')
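To see why the + quantifier matters, here is a small sketch comparing the two patterns. It uses Python's re module as a stand-in for the database regex engine (an assumption; re.sub replaces all matches by default, which corresponds to the 'g' flag above), and the function names are hypothetical:

```python
import re

def mask_each_digit(field: str) -> str:
    # Without "+": every single digit is replaced on its own.
    return re.sub(r"[0-9]", "-1000", field)

def mask_each_number(field: str) -> str:
    # With "+": each run of digits is replaced as one unit.
    return re.sub(r"[0-9]+", "-1000", field)

print(mask_each_digit("3215|2654"))   # -1000-1000-1000-1000|-1000-1000-1000-1000
print(mask_each_number("3215|2654"))  # -1000|-1000
```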
I am using Snowflake SQL. I would like to remove the characters in a string that come after a special character, ~. How can I do that?
Here is the whole scenario. I have a string like 'CK#123456~fndkjfgdjkg'. I want only the number after #, and nothing after ~. The length of this number varies per value; it might be 1, 3, or 5 digits. I also want to add a condition in the WHERE clause requiring this number to equal check_num from another table after joining. I am trying REGEXP_SUBSTR(A.SRC_TXT, '(?<=CK#)(.+?\b)') = C.CHK_NUM in the WHERE condition, but I get the error 'No repititive argument after ?'.
You can use a regex for this
-- To remove the ~ and the single character after it
select regexp_replace('fo~o bar','~.', '');
-- returns 'fo bar'
--If you want to keep the ~
select regexp_replace('fo~o bar','~.', '~');
-- returns 'fo~ bar'
--If you want to remove everything after the ~
select regexp_replace('fo~o bar','~.*', '');
--returns 'fo'
If you need to remove other specific character sets after a ~, you can probably do this with a slightly more complicated regex, but I'd need examples of your desired input/output to help with that.
EDIT for updated question
This regex replace should get what you need.
select regexp_replace('CK#123456~fndkjfgdjkg','CK#(\\d*)~.*', '\\1');
-- returns 123456
(\\d*) captures any number of digits in a row, and the \\1 replaces the match with the contents of the first set of parentheses, which is your run of digits. The CK# and ~.* are there to make sure the whole string gets matched and replaced.
If the CK# can vary as well, you can use .*? like this.
select regexp_replace('ABCD123HI#123456~fndkjfgdjkg','.*?#(\\d*)~.*', '\\1')
-- returns 123456
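The capture-group idea can be tried outside the database as well. Below is a minimal sketch in Python (an assumption: Python's re is used only to illustrate the pattern, and its flavor differs from Snowflake's; check_number is a hypothetical name). It searches for the digits between # and ~ instead of replacing the whole string, and returns None when the value does not have that shape:

```python
import re
from typing import Optional

def check_number(src: str) -> Optional[str]:
    """Return the digits between '#' and '~', or None if there are none."""
    match = re.search(r"#(\d+)~", src)
    return match.group(1) if match else None

print(check_number("CK#123456~fndkjfgdjkg"))         # 123456
print(check_number("ABCD123HI#123456~fndkjfgdjkg"))  # 123456
print(check_number("no delimiters here"))            # None
```

Returning None for non-matching input is a design choice for this sketch; a replace-based pattern instead returns the input unchanged when nothing matches, which may matter when the result is compared in a join condition.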
I'd probably do something like the following, easy enough but not as cool as RegEx type of functions.
set my_string='fooo~12345';
set search_for_me = '~';
SELECT SUBSTR($my_string, 1, DECODE(position($search_for_me, $my_string), 0, length($my_string), position($search_for_me, $my_string)));
I hope this helps...Rich
It looks like lookaheads and lookbehinds are not supported in the REGEXP functions, although they seem to work in the PATTERN clause of a LIST command. The Snowflake documentation makes no mention either way of lookaheads or lookbehinds.
In your example:
It seems that the query engine is looking for that repeating argument, where you are attempting a lookbehind
You have not specified what you wanted extracted. You have two capture groups, but in this scenario everything would be returned
Since you are looking to remove everything after ~ you have a delimiter, why not use it in your REGEXP_SUBSTR function?
Try the following:
SELECT $1,REGEXP_SUBSTR($1,'\\w+#(.+?)~',1,1,'is',1)
FROM VALUES
('CK#123456~fndkjfgdjkg')
,('QH#128fklj924~fndkjfgdjkg')
;
This looks for:
One or more word characters
Followed by #
Capturing one or more characters up to and not including ~
Returns the characters within the capture group
You can change the .+? to \\d+? to make sure the pattern is only digits. Backslashes must be escaped with a backslash.
The descriptions for each argument of the function can be found here:
https://docs.snowflake.net/manuals/sql-reference/functions/regexp_substr.html
You could check this!!
select substr('CK#123456~fndkjfgdjkg',4,6) from dual;
OUTPUT
123456
https://docs.snowflake.net/manuals/sql-reference/functions/substr.html
I want to use regexp_replace to replace all blanks with '_'.
I use this statement:
select regexp_replace('"<div_class="CCL-temp-border"><div_class="input-group_moveDivEnd"_style="margin-bottom:_5px;_top:_auto;_left:_auto;_width:_100%;_position:_relative;_opacity:_1;_filter:_none;"_data-id="moveDivEnd_1545116285310">_; <span_class="input-group-addon_CCL-te (...)"', '\s', '_', 'g')
But the result is this:
"<div_class="CCL-temp-border"><div_class="input-group_moveDivEnd"_style="margin-bottom:_5px;_top:_auto;_left:_auto;_width:_100%;_position:_relative;_opacity:_1;_filter:_none;"_data-id="moveDivEnd_1545116285310">_;_______<span_class="input-group-addon_CCL-t (...)"
My statement is this:
select case when length(topiccontent)=0 THEN '_' else coalesce(regexp_replace(replace(replace(replace(topiccontent,chr(13), '_'),chr(10),'_'),' ','_'),'\s', '_', 'g'),'_') end as topiccontent
from ccl_topics
You can see the blank still exists, why?
I found out why it can't be replaced.
There is a character limit when the data is copied out of the database.
The omitted part is shown as (...).
So (...) is not made of real characters from the column; it is an ellipsis that marks the truncated part.
For example, if a column value has more than 600 characters and you copy and paste it, the result is cut off with an ellipsis mark.
Input string: ["1189-13627273","89-13706681","118-13708388"]
Expected Output: ["14013627273","14013706681","14013708388"]
What I am trying to achieve is to replace any numbers till the '-' for each item with hard coded text like '140'
SELECT replace(value_to_replace, '-', '140')
FROM (
VALUES ('1189-13627273-77'), ('89-13706681'), ('118-13708388')
) t(value_to_replace);
check this
I found the right way to achieve that using the below regular expression.
SELECT REGEXP_REPLACE (string_to_change, '\\"[0-9]+\\-', '140')
You don't need a regexp for this; it's as easy as concatenating 140 with the substring after the - (or with the second part when you split by -).
select '140'||substring('89-13706681' from position('-' in '89-13706681')+1 for 1000)
select '140'||split_part('89-13706681','-',2)
Also, it's important to consider whether you might have values that don't contain a -, and what the output should be in that case.
Use regexp_replace(text,text,text) function to do so giving the pattern to match and replacement string.
First argument is the value to be replaced, second is the POSIX regular expression and third is a replacement text.
Example
SELECT regexp_replace('1189-13627273', '.*-', '140');
Output: 14013627273
Sample data set query
SELECT regexp_replace(value_to_replace, '.*-', '140')
FROM (
VALUES ('1189-13627273'), ('89-13706681'), ('118-13708388')
) t(value_to_replace);
Caution! The pattern .*- is greedy: it replaces every character up to and including the last occurrence of - with the text 140.
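The caution about .*- can be made concrete with a value that contains two dashes, such as '1189-13627273-77'. The sketch below uses Python's re module as a stand-in for the POSIX engine (an assumption), with hypothetical function names, and contrasts the greedy pattern with one anchored to the leading digits:

```python
import re

def replace_greedy(value: str) -> str:
    # ".*-" runs up to the LAST dash, so everything before it is replaced.
    return re.sub(r".*-", "140", value)

def replace_leading_digits(value: str) -> str:
    # "^[0-9]+-" only matches the digits before the FIRST dash.
    return re.sub(r"^[0-9]+-", "140", value)

print(replace_greedy("1189-13627273-77"))          # 14077
print(replace_leading_digits("1189-13627273-77"))  # 14013627273-77
```

Both functions leave a value without any dash unchanged, which is one possible answer to the question of what should happen in that case.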
I have a varchar column, and each field contains a single word, but there is a random number of pipe characters before and after the word.
Something like this:
MyVarcharColumn
'|||Apple|||||'
'|||||Pear|||||'
'||Leaf|'
When I query the table, I wish to replace the multiple pipes to a single one, so the result would be like this:
MyVarcharColumn
'|Apple|'
'|Pear|'
'|Leaf|'
I cannot figure out how to solve this with the REPLACE function. Does anybody know how?
vkp's method absolutely solves your issue. Another method that works, and that will also work in a variety of other situations, is a triple REPLACE():
SELECT REPLACE(REPLACE(REPLACE('|||Apple|||||', '|', '><'), '<>',''), '><','|')
This method lets you keep a delimiter between multiple strings, whereas vkp's method would concatenate the strings and put a delimiter only at the very beginning and the very end.
SELECT REPLACE(REPLACE(REPLACE('|||Apple|||||Banana||||||||||', '|', '><'), '<>',''), '><','|')
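The triple REPLACE() works in three steps: every | becomes the pair ><, then every adjacent <> (the seam between two neighbouring pairs) is deleted so that each run collapses to a single ><, and finally each remaining >< is turned back into |. Here is a minimal sketch of the same steps using Python's str.replace (collapse_pipes is a hypothetical name, and the trick assumes the data itself never contains < or >):

```python
def collapse_pipes(value: str) -> str:
    """Collapse every run of '|' into a single '|' using three replaces."""
    step1 = value.replace("|", "><")  # '|||' -> '><><><'
    step2 = step1.replace("<>", "")   # '><><><' -> '><'
    return step2.replace("><", "|")   # '><' -> '|'

print(collapse_pipes("|||Apple|||||"))                  # |Apple|
print(collapse_pipes("|||Apple|||||Banana||||||||||"))  # |Apple|Banana|
```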
One way is to replace all the | with blanks and add a pipe character at the beginning and the end of string.
select '|'+replace(mycolumn,'|','')+'|' from tablename