Replace multiple repeating characters with a single one - sql

I have a varchar column in which each field contains a single word, with a random number of pipe characters before and after the word.
Something like this:
MyVarcharColumn
'|||Apple|||||'
'|||||Pear|||||'
'||Leaf|'
When I query the table, I want to replace each run of multiple pipes with a single pipe, so the result would look like this:
MyVarcharColumn
'|Apple|'
'|Pear|'
'|Leaf|'
I cannot figure out how to solve this with the REPLACE function. Does anybody know how?

vkp's method absolutely solves your issue. Another method that works, and that will also work in a variety of other situations, is a triple REPLACE():
SELECT REPLACE(REPLACE(REPLACE('|||Apple|||||', '|', '><'), '<>',''), '><','|')
This method keeps a delimiter between multiple strings, whereas vkp's method concatenates the strings and puts a delimiter only at the very beginning and the very end.
SELECT REPLACE(REPLACE(REPLACE('|||Apple|||||Banana||||||||||', '|', '><'), '<>',''), '><','|')
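To see why the three nested replaces collapse each run of pipes, here is a rough Python sketch of the same steps. The function name collapse_pipes is made up for illustration, and the sketch assumes the data never contains '<' or '>' on its own, since those characters serve as temporary markers.

```python
def collapse_pipes(text: str) -> str:
    """Collapse each run of '|' into a single '|' using three plain replaces."""
    # Step 1: every pipe becomes the two-character marker '><'.
    marked = text.replace("|", "><")
    # Step 2: inside a run, neighbouring markers meet as '<>'; deleting those
    # pairs leaves exactly one '><' per run.
    squeezed = marked.replace("<>", "")
    # Step 3: turn the surviving marker back into a pipe.
    return squeezed.replace("><", "|")


print(collapse_pipes("|||Apple|||||"))                  # |Apple|
print(collapse_pipes("|||Apple|||||Banana||||||||||"))  # |Apple|Banana|
```

Python's str.replace scans left to right without overlapping matches, which is the behaviour this sketch relies on to mirror REPLACE().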

One way is to replace all the | characters with empty strings and add a pipe character at the beginning and the end of the string.
select '|'+replace(mycolumn,'|','')+'|' from tablename

Related

Postgres SQL regexp_replace replace all number

I need some help with the following. I have a text field in SQL that stores a list of times separated by '|'. For example:
'14613|15474|3832|148|5236|5348|1055|524'. Each value is a time in milliseconds. The field can be of any length; for example, '3215|2654' or '4565' (only one value) is perfectly valid. I need to take this field and replace every number with the value -1000.
So '14613|15474|3832|148|5236|5348|1055|524' becomes '-1000|-1000|-1000|-1000|-1000|-1000|-1000|-1000'.
Or '3215|2654' => '-1000|-1000', or '4565' => '-1000'.
I tried regexp_replace(times_field,'[[:digit:]]','-1000','g'), but it replaces each digit rather than each complete number, so in this example:
'3215|2654', which must become '-1000|-1000', I get:
'-1000-1000-1000-1000|-1000-1000-1000-1000'. I tried other combinations and more regexp options, but I'm stuck.
Please help, thanks!
We can try using REGEXP_REPLACE here:
UPDATE yourTable
SET times_field = REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g');
If instead you don't really want to alter your data but rather just view your data this way, then use a select:
SELECT
times_field,
REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g') AS times_field_replace
FROM yourTable;
Note that in either case we pass g as the fourth parameter to REGEXP_REPLACE to do a global replacement of all pipe-separated numbers.
[[:digit:]] matches a single digit [0-9].
The + quantifier matches between one and unlimited times, as many times as possible.
So your regexp should look like this:
regexp_replace(times_field,'[[:digit:]]+','-1000','g')
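The difference between matching one digit and matching a whole run of digits can be checked outside the database. The following is a small Python sketch, not PostgreSQL itself: it assumes Python's re module, where the word boundary is written \b rather than \y, and the variable names are only illustrative.

```python
import re

times = "3215|2654"

# One digit per match: every digit is replaced on its own.
per_digit = re.sub(r"[0-9]", "-1000", times)

# One or more digits per match: each whole number is replaced once.
per_number = re.sub(r"[0-9]+", "-1000", times)

# The same with word boundaries around the number (\b here, \y in PostgreSQL).
bounded = re.sub(r"\b[0-9]+\b", "-1000", times)

print(per_digit)   # -1000-1000-1000-1000|-1000-1000-1000-1000
print(per_number)  # -1000|-1000
print(bounded)     # -1000|-1000
```

re.sub replaces every match by default, which plays the role of the 'g' flag in regexp_replace.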

How can I remove characters in a string after a specific special character (~) in snowflake sql?

I am using Snowflake SQL. I would like to remove the characters in a string that follow a special character, ~. How can I do that?
Here is the whole scenario. I have a string like 'CK#123456~fndkjfgdjkg'. I want only the number after #, and nothing after ~. The length of this number varies from value to value; it might be 1, 3, or 5 digits. I want to add a condition to the WHERE clause requiring this number to equal check_num from another table after joining. I am trying REGEXP_SUBSTR(A.SRC_TXT, '(?<=CK#)(.+?\b)') = C.CHK_NUM in the WHERE condition, and I am getting the error 'No repititive argument after ?'.
You can use a regex for this
-- To remove just the character after a ~
select regexp_replace('fo~o bar','~.', '');
-- returns 'fo bar'
-- If you want to keep the ~
select regexp_replace('fo~o bar','~.', '~');
-- returns 'fo~ bar'
-- If you want to remove everything after the ~
select regexp_replace('fo~o bar','~.*', '');
-- returns 'fo'
If you need to remove other specific character sets after a ~, you can probably do this with a slightly more complicated regex, but I'd need examples of your desired input/output to help with that.
EDIT for updated question
This regex replace should get what you need.
select regexp_replace('CK#123456~fndkjfgdjkg','CK#(\\d*)~.*', '\\1');
-- returns 123456
(\\d*) matches any number of consecutive digits, and \\1 replaces the match with the contents of the first set of parentheses, which is your run of digits. The CK# and ~.* are there to make sure the whole string is matched and replaced.
If the CK# can vary as well, you can use .*? like this.
select regexp_replace('ABCD123HI#123456~fndkjfgdjkg','.*?#(\\d*)~.*', '\\1')
-- returns 123456
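The capture-group replacement can be tried quickly in Python. This is only a sketch of the same idea, assuming Python's re module (where a raw string needs a single backslash instead of Snowflake's doubled one); the function name check_number is hypothetical.

```python
import re


def check_number(src_txt: str) -> str:
    """Replace the whole string with the digits captured between '#' and '~'."""
    # .*?   lazily skips the prefix up to the '#'
    # (\d*) captures the run of digits
    # ~.*   consumes the '~' and everything after it
    return re.sub(r".*?#(\d*)~.*", r"\1", src_txt)


print(check_number("CK#123456~fndkjfgdjkg"))         # 123456
print(check_number("ABCD123HI#123456~fndkjfgdjkg"))  # 123456
```

When the pattern does not match at all, the input comes back unchanged, so a join condition built on this would compare the full original text in that case.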
I'd probably do something like the following, easy enough but not as cool as RegEx type of functions.
set my_string='fooo~12345';
set search_for_me = '~';
SELECT SUBSTR($my_string, 1, DECODE(position($search_for_me, $my_string), 0, length($my_string), position($search_for_me, $my_string)));
I hope this helps...Rich
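For comparison, here is a rough Python rendering of the position-based approach. The function names are made up for illustration. As far as I can tell, the SUBSTR expression as written keeps the ~ itself, because the 1-based position of the delimiter is used as the length; the second function shows one way to drop it as well.

```python
def keep_through_delimiter(text: str, delimiter: str = "~") -> str:
    """Mirror SUBSTR(s, 1, DECODE(POSITION(d, s), 0, LENGTH(s), POSITION(d, s)))."""
    # str.find is 0-based and returns -1 when the delimiter is missing, so
    # adding 1 gives the 1-based position (0 when missing) that POSITION reports.
    position = text.find(delimiter) + 1
    length = len(text) if position == 0 else position
    return text[:length]


def drop_from_delimiter(text: str, delimiter: str = "~") -> str:
    """Keep only what comes before the first delimiter, or everything if absent."""
    return text.partition(delimiter)[0]


print(keep_through_delimiter("fooo~12345"))  # fooo~
print(drop_from_delimiter("fooo~12345"))     # fooo
```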
It looks like lookaheads and lookbehinds are not supported in the REGEXP functions, although they seem to work in the PATTERN clause of a LIST command. The Snowflake documentation makes no mention either way of lookaheads or lookbehinds.
In your example:
It seems that the query engine is looking for that repeating argument, where you are attempting a lookbehind
You have not specified what you wanted extracted. You have two capture groups, but in this scenario everything would be returned
Since you are looking to remove everything after ~ you have a delimiter, why not use it in your REGEXP_SUBSTR function?
Try the following:
SELECT $1,REGEXP_SUBSTR($1,'\\w+#(.+?)~',1,1,'is',1)
FROM VALUES
('CK#123456~fndkjfgdjkg')
,('QH#128fklj924~fndkjfgdjkg')
;
This looks for:
One or more word characters
Followed by #
Capturing one or more characters up to, but not including, ~
Returns the characters within the capture group
You can change the .+? to \\d+? to make sure the pattern is only digits. Backslashes must be escaped with a backslash.
The descriptions for each argument of the function can be found here:
https://docs.snowflake.net/manuals/sql-reference/functions/regexp_substr.html
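The capture-group extraction described above can be sketched in Python to check the pattern against sample values. This assumes Python's re module rather than Snowflake: the 'is' parameters are approximated with re.IGNORECASE and re.DOTALL, and the names below are hypothetical.

```python
import re

# Word characters, then '#', then a lazy capture up to the first '~'.
ANY_CHARS = re.compile(r"\w+#(.+?)~", re.IGNORECASE | re.DOTALL)
# Same shape, but the capture group accepts digits only.
DIGITS_ONLY = re.compile(r"\w+#(\d+?)~")


def extract(pattern: re.Pattern, src_txt: str):
    """Return the first capture group, or None when the pattern does not match."""
    match = pattern.search(src_txt)
    return match.group(1) if match else None


print(extract(ANY_CHARS, "CK#123456~fndkjfgdjkg"))        # 123456
print(extract(ANY_CHARS, "QH#128fklj924~fndkjfgdjkg"))    # 128fklj924
print(extract(DIGITS_ONLY, "QH#128fklj924~fndkjfgdjkg"))  # None
```

The digits-only variant returns nothing for the second value because letters sit between the # and the ~, which is the stricter behaviour you would want when comparing against a numeric check number.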
You could try this, provided the number always starts at position 4 and is exactly 6 characters long:
select substr('CK#123456~fndkjfgdjkg',4,6) from dual;
OUTPUT
123456
https://docs.snowflake.net/manuals/sql-reference/functions/substr.html

How can I replace a string pattern with blank in hive?

I have a string as:
https://maps.googleapis.com/maps/api/staticmap?center=41.892532+-87.63811&zoom=11&scale=2&size=280x320&maptype=roadmap&format=png&visual_refresh=true%7C&markers=size:mid%7Ccolor:0x8000ff%7Clabel:1%7C2413+S+State+St++Chicago+IL+60616%7C&markers=size:mid%7Ccolor:0x8000ff%7Clabel:2%7C3000+N+Halsted+St++Chicago+IL+60657%7C&markers=size:mid%7Ccolor:0x8000ff%7Clabel:3%7C++++&key=AIzaSyBNEAQcC5niAEeiP3zkA_nuWGvtl0IOEs4
I want to replace the '++++' pattern at the end with a blank, but not the single occurrences of '+'. I tried the regexp_replace and translate functions in Hive, but they replace all the single occurrences of '+' as well.
Use
regexp_replace(string,'[+]{4}','')
The pattern '[+]{4}' means the + character four times in a row.
Test:
select regexp_replace('++markers=size:mid%7Ccolor:0x8000ff%7Clabel:3%7C++++&','[+]{4}','');
Result:
OK
++markers=size:mid%7Ccolor:0x8000ff%7Clabel:3%7C&
Did you try this?
replace(string, '++++', '')
Admittedly, this will replace all occurrences of '++++', but your string only has one of them.
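Both suggestions can be sanity-checked with a short Python sketch. It assumes Python's re module rather than Hive, and the sample string is a shortened, made-up fragment in the same shape as the URL in the question.

```python
import re

# Shortened sample: a double '+', single '+' signs, and one run of four.
sample = "St++Chicago+IL%7Clabel:3%7C++++&zoom=11"

# '[+]{4}' matches exactly four consecutive plus signs.
via_regex = re.sub(r"[+]{4}", "", sample)

# A plain substring replace does the same when the run is exactly '++++'.
via_replace = sample.replace("++++", "")

print(via_regex)    # St++Chicago+IL%7Clabel:3%7C&zoom=11
print(via_replace)  # St++Chicago+IL%7Clabel:3%7C&zoom=11
```

Putting the + inside a character class avoids having to escape it, since a bare + is a quantifier in a regular expression.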

Cut string after first occurrence of a character

I have strings like 'keepme:cutme' or 'string-without-separator' which should become respectively 'keepme' and 'string-without-separator'. Can this be done in PostgreSQL? I tried:
select substring('first:last' from '.+:')
But this leaves the : in and won't work if there is no : in the string.
Use split_part():
SELECT split_part('first:last', ':', 1) AS first_part
Returns the whole string if the delimiter is not there. And it's simple to get the 2nd or 3rd part etc.
Substantially faster than functions using regular expression matching. And since we have a fixed delimiter we don't need the magic of regular expressions.
Related:
Split comma separated column data into additional columns
regexp_replace() may be overkill for what you need, but it also gives you the additional power of regular expressions, for instance if the strings use multiple delimiters.
Example use:
select regexp_replace( 'first:last', E':.*', '');
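As a quick cross-check of both answers, here is a Python sketch of the same two ideas: splitting on a fixed delimiter and stripping with a regular expression. The function names are invented for illustration, and Python's str.split and re.sub stand in for split_part() and regexp_replace().

```python
import re


def first_part(text: str, delimiter: str = ":") -> str:
    """Keep everything before the first delimiter (whole string if absent)."""
    return text.split(delimiter, 1)[0]


def first_part_regex(text: str) -> str:
    """Delete the first ':' and everything that follows it."""
    return re.sub(r":.*", "", text, flags=re.DOTALL)


for value in ("keepme:cutme", "string-without-separator", "a:b:c"):
    print(first_part(value), first_part_regex(value))
```

Both versions leave a string without a separator untouched, which is the case the original substring() attempt could not handle.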
SQL Select to pick everything after the last occurrence of a character
select right('first:last', charindex(':', reverse('first:last')) - 1)

replace two characters in one cell

I am using this query to replace one character in a cell:
select replace(id,',','')id from table
But I want to replace two characters in a cell.
If the cell contains this data (1,3.1), I want it to look like this (131).
How can I replace two different characters in one cell?
Use TRANSLATE() instead of REPLACE(). It replaces each occurrence of a character in the first string with the character at the same position in the second. To remove characters, simply cut the replacement string short:
select translate(id, '1,.', '1') id from table
Note that the second string cannot be null. Hence the need to include 1 (or some other character) in both strings.
Find out more.
Obviously the more characters you need to convert/remove the more attractive TRANSLATE() becomes. The main use for REPLACE is changing patterns (such as words) rather than individual characters.
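The character-by-character mapping can be illustrated with a small Python sketch. The helper sql_translate is hypothetical and only approximates the behaviour described above: characters in the first string with no partner in the second are removed.

```python
def sql_translate(text: str, from_chars: str, to_chars: str) -> str:
    """Map each character of from_chars to the one at the same index in to_chars."""
    table = {}
    for index, char in enumerate(from_chars):
        if ord(char) in table:
            continue  # the first occurrence of a character decides its mapping
        # Characters beyond the end of to_chars map to None, i.e. are deleted.
        table[ord(char)] = to_chars[index] if index < len(to_chars) else None
    return text.translate(table)


print(sql_translate("1,3.1", "1,.", "1"))  # 131
```

Here '1' maps to itself, while ',' and '.' have no counterpart and are dropped, which is why the placeholder character is needed in both strings.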
You can use
select replace(translate(id,',.',' '),' ','') from table;
or
select regexp_replace('1,3.1','[,.]','') from dual;
or
select replace(replace(id,',',''),'.','') from table;
Call the replace again.
select replace(replace(id,',',''), '.','') id from table
Do this:
select REPLACE(REPLACE(id,',',''),'.','')
Or use a regular expression:
select regexp_replace(id, '[.,]', '') id from table
Find out more