How to Replace backslash(\) in data in Impala - impala

select translate('Parts and Supplies \ Compressors','\','0');
**Output **: Parts and Supplies Compressors
I tried replacing the \ with 0 but translate or regex_replace or replace are not working.
Please let me know a workaround for this

add one more \ to escape the replace character.
replace(mycol,'\\','|')
if you want to split this string, you can use split part function.
select split_part( replace(mycol,'\\','|'),'|',1) split_first part

Related

TERADATA REGEXP_SUBSTR Get string between two values

I am fairly new to teradata, but I was trying to understand how to use REGEXP_SUBSTR
For example I have the following cell value = ABCD^1234567890^1
How can I extract 1234567890
What I attempted to do is the following:
REGEXP_SUBSTR(x, '(?<=^).*?(?=^)')
But this didnt seem to work.
Can anyone help?
It might (or might not) be possible to use REGEXP_SUBSTR() to handle this, but you would need to use a capture group. An alternative here would be to do a regex replacement instead:
SELECT x, REGEXP_REPLACE(x, '^.*?\^|\^.*$', '') AS output
FROM yourTable;
The regex pattern used here matches:
^.*?\^ everything from the start to the first ^
| OR
\^.*$ everything from the second ^ to the end
We then replace with empty string to remove the content being matched.

Extract characters between a string and the first occurrence of something in BigQuery

I want to extract a set of characters between "u1=" and the first semi-colon using a regex. For instance, given the following string: id=1w54;name=nick;u1=blue;u2=male;u3=ohio;u5=
The desired regex output should be just blue.
I tested (?<=u1=)[^;]* on https://regex101.com and it works. However, when I run this in BigQuery, using regexp_extract(string, '(?<=u1=)[^;]*') , I get an error that reads "Cannot parse regular expression: invalid perl operator: (?<"
I'm confused why this isn't working in BQ. Any help would be appreciated.
You can use regexp_extract() like this:
regexp_extract(string, 'u1=([^;]+)')

Equivalent of Presto REPLACE function in Hive

I have a condition currently in presto like below
replace(replace(replace(replace(replace(replace(replace(replace(UPPER(FIELDNAME),' ',''),'LLC',''),'INC',''),'INTERNATIONAL',''),'LTD',''),'.',''),',',''),'QMT','')
What will be the equivalent function in Hive that will serve the exact same prurpose? Will regexp_replace work for the above scenario?
Used regexp_replace in the above condition instead of replace, but it is throwing me the below error.
FAILED: SemanticException [Error 10014]: Line 1:255 Wrong arguments ''('': No matching method for class org.apache.hadoop.hive.ql.udf.UDFRegExpReplace with (string, string). Possible choices: _FUNC_(string, string, string) (state=42000,code=10014)
It would be greatful if someone can help on this. Thanks
That is true. regexp_replace is the command that you can use. You can give the list of substrings to be substituted as a list separated by '|' and you need to use '\' to give any special characters which might act as wildcards like - . or ? or * etc. Below is the sample usage.
SELECT regexp_replace(upper(fieldname),'LTD|QMT|INTERNATIONAL|ORG|INC|\\.|,| ','') from my_table;

Regexp_Extract BigQuery anything up to "|"

I'm fairly new to coding and I was wondering if you could give me a hand writing some regular expression for BigQuery SQL.
Basically I would like to extract everything before the bar sign "|" for one of my column.
Example:
Source string:
bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed
Desired output:
bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah
I thought about using the REGEXP_EXTRACT(string, delimiter) function but I'm totally unable to write some regex (LOL). Therefore I had a look over Stack, and have found stuff like:
SELECT REGEXP_EXTRACT( String_Name , "\S*\s*\|" ) ,
# or
SELECT REGEXP_EXTRACT( String_Name , '.+?(?=|)')
But every time I get error messages like " invalid perl operator: (?= " or "Illegal escape space"
Would you have any suggestions on why I get these messages and/or how could I proceed to extract these strings?
Many many thanks in advance <3
You can use SPLIT instead:
SELECT SPLIT("bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed", "|")[OFFSET(0)]
Prefix the pattern string with r:
SELECT REGEXP_EXTRACT(String_Name, r'\S*\s*\|')
This is the syntax for a raw string constant. You can review what this means in the documentation.

REGEXP REPLACE with backslashes in Spark-SQL

I have a string containing \s\ keyword. Now, I want to replace it with NULL.
select string,REGEXP_REPLACE(string,'\\\s\\','') from test
But unable to replace with the above statement in spark sql
input: \s\help
output: help
want to use regexp_replace
To replace one \ in the actual string you need to use \\\\ (4 backslashes) in the pattern of the regexep_replace. Please do look at https://stackoverflow.com/a/4025508/9042433 to understand why 4 backslashes are needed to replace just one backslash
So, the required statement would become like below
select name, regexp_replace(name, '\\\\s\\\\', '') from test
Below screenshot has examples for better understanding