I am trying to get the string between parentheses, but I always get an empty value.
String_Input: select sum(OUTPUT_VALUE) from table_name
Output : OUTPUT_VALUE
What I tried here:
select regexp_extract(String_Input,"/\\(([^)]+)\\)/") from table_name;
Any suggestions for getting the value?
If you need to get the value without the parentheses, you should indicate that you want capturing group 1 by passing it as the third argument to the regexp_extract function. Besides, you should remove the / delimiters, because they are parsed as literal symbols.
select regexp_extract(String_Input,"\\(([^)]+)\\)", 1) from table_name;
From the Hive documentation:
The 'index' parameter is the Java regex Matcher group() method index. See docs/api/java/util/regex/Matcher.html for more information on the 'index' or Java regex group() method.
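The difference between group 0 and group 1, and the effect of the stray / delimiters, can be illustrated outside Hive. The following is a minimal sketch in Python's re module, which is assumed here only as a stand-in for the Java regex engine that Hive uses; the variable names are illustrative.

```python
import re

text = "select sum(OUTPUT_VALUE) from table_name"

# With the / delimiters the slashes are literal characters, so nothing matches.
with_slashes = re.search(r"/\(([^)]+)\)/", text)

# Without them the pattern matches: group(0) is the whole match,
# group(1) is the first capturing group (the text inside the parentheses).
match = re.search(r"\(([^)]+)\)", text)
whole = match.group(0)
inner = match.group(1)

print(with_slashes)  # None
print(whole)         # (OUTPUT_VALUE)
print(inner)         # OUTPUT_VALUE
```

In Hive the failed match shows up as an empty string rather than None, which matches the empty value described in the question.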
Try this:
\((.*?)\)
In Hive:
select regexp_extract(String_input,'\\((.*?)\\)')
from table_name
I am trying to remove template text like &#x; or &#xx; or &#xxx; from a long string.
Note: x / xx / xxx is a number of unknown length, and the column type is CLOB.
for example:
SELECT 'H&#39;ello wor&#177;ld' FROM dual
A desirable result:
Hello world
I know that regexp_replace should be used, but how do I use this function to remove this text?
You can use
SELECT REGEXP_REPLACE(col,'&&#\d+;')
FROM t
where
& is doubled to escape the substitution character,
\d matches a digit and the following + allows one or more occurrences of it,
and ; ends the pattern.
Alternatively, use a single ampersand ('&#\d+;') in the pattern, as in the demo. Because the ampersand has a special meaning in Oracle, its usage is a bit problematic.
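The pattern itself can be checked independently of Oracle. Here is a small sketch in Python, assuming the sample string is in its numeric-entity form; the function name strip_numeric_entities is hypothetical, and Python needs no doubling of the ampersand.

```python
import re

def strip_numeric_entities(text: str) -> str:
    """Remove every &#<digits>; sequence, whatever the number's length."""
    return re.sub(r"&#\d+;", "", text)

cleaned = strip_numeric_entities("H&#39;ello wor&#177;ld")
print(cleaned)  # Hello world
```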
In case you wanted to remove the entities because you don't know how to replace them by their character values, here is a solution:
UTL_I18N.UNESCAPE_REFERENCE( xmlquery( 'the_double_quoted_original_string' RETURNING content).getStringVal() )
In other words, the original 'H&#39;ello wor&#177;ld' should be passed to XMLQUERY as '"H&#39;ello wor&#177;ld"'.
And the result will be 'H'ello wor±ld'
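For comparison, decoding rather than deleting the entities looks like this in Python. This is only an analogy to the Oracle call above: html.unescape is Python's standard-library decoder and is not claimed to behave identically to UTL_I18N.UNESCAPE_REFERENCE.

```python
import html

# Decode numeric character references into the characters they stand for.
decoded = html.unescape("H&#39;ello wor&#177;ld")
print(decoded)  # H'ello wor±ld
```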
I tried to run the following query:
select * from table where regexp_like('^{{', text_field)
And got the following error:
too big number for repeat range
Thinking perhaps regexp_like is confusing { for the repeat count operator, I also tried the following variations:
select * from table where regexp_like('^\{\{', text_field)
select * from table where regexp_like('^[{][{]', text_field)
select * from table where regexp_like('^[[:punct:]]{2}', text_field)
None of which worked. For now, text_field like '{{' suffices, but I may want to include a more flexible version of this that would require regular expressions. What's wrong with my approach here? And what does this error message mean?
You are passing the arguments to Presto's regexp_like function in the wrong order; the string comes first and the pattern second:
regexp_like(string, pattern)
Evaluates the regular expression pattern and determines if it is
contained within string. This function is similar to the LIKE
operator, except that the pattern only needs to be contained within
string, rather than needing to match all of string. In other words,
this performs a contains operation rather than a match operation. You
can match the entire string by anchoring the pattern using ^ and $:
SELECT regexp_like('1a 2b 14m', '\d+b'); -- true
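To make the argument order concrete, here is a rough Python sketch of a helper with the same (string, pattern) signature. The helper is hypothetical and uses Python's re engine, not Presto's, so it only illustrates the contains semantics and what happens when the arguments are swapped.

```python
import re

def regexp_like(string: str, pattern: str) -> bool:
    """True if pattern is found anywhere in string (contains, not full match)."""
    return re.search(pattern, string) is not None

# Correct order: the data comes first, the pattern second.
correct = regexp_like("{{name}}", r"^\{\{")

# Swapped order: the data is now interpreted as the pattern.
swapped = regexp_like(r"^\{\{", "{{name}}")

print(correct)                             # True
print(swapped)                             # False
print(regexp_like("1a 2b 14m", r"\d+b"))   # True
```

With the arguments swapped, each value of text_field is compiled as a regular expression, which would explain an error about a repeat range when a row happens to contain something that parses as one.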
I require a select query that adds a space to the data based on the placement of the capital letters, i.e. 'HelpMe' would be displayed as 'Help Me'. Note that I cannot use a stored function to do this; it must be done in the query itself. The data is of variable length and the query must be in SQL. Any help will be appreciated.
Thanks
You need to use a user-defined function for this until MS gives us support for regular expressions. A solution would be something like:
SELECT col1, dbo.RegExReplace(col1, '([A-Z])',' \1') FROM Table
This would produce a leading space, though, which you can remove with LTRIM.
Replace regular expression function:
http://connect.microsoft.com/SQLServer/feedback/details/378520
About dbo.RegexReplace you can read at:
TSQL Replace all non a-z/A-Z characters with an empty string
Assuming you are using Oracle, you can use the following:
REGEXP_REPLACE
SELECT REGEXP_REPLACE('ILikeToWatchCSIMiami',
'([A-Z.])', ' \1')
AS RX_REPLACE
FROM dual
;
Managed to get this output: SQLFiddle
But as you can see, it doesn't handle words such as CSI well.
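The behavior on acronyms is easy to reproduce. The sketch below uses Python's re module rather than Oracle. The first function mirrors the '([A-Z])' replacement from the answers; the second is a swapped-in alternative based on lookarounds, which are a Python feature here and are not assumed to be available in Oracle's REGEXP_REPLACE. Both function names are hypothetical.

```python
import re

def space_before_capitals(text: str) -> str:
    """Insert a space before every capital letter, then drop the leading space."""
    return re.sub(r"([A-Z])", r" \1", text).lstrip()

def space_between_words(text: str) -> str:
    """Split only at lower-to-upper boundaries and before the last capital of an acronym."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])", " ", text)

print(space_before_capitals("HelpMe"))                # Help Me
print(space_before_capitals("ILikeToWatchCSIMiami"))  # I Like To Watch C S I Miami
print(space_between_words("ILikeToWatchCSIMiami"))    # I Like To Watch CSI Miami
```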
How to write a regular expression to match a string if at least 3 characters from the start are matching?
Here is how my SQL query looks right now -
SELECT * FROM tableName WHERE columnName REGEXP "^[a-zA-Z]{3}someString";
You cannot use CONCAT or the like with REGEXP; it will fail. The easiest way to do it is:
$query = 'SELECT * FROM Test WHERE colb REGEXP "^'.substr($mystring,0,3).'"';
Another is:
SELECT * FROM Test WHERE LEFT(colb, 3) = LEFT("{$mystring}", 3)
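The idea behind both variants is to compare only the first three characters. A minimal sketch of that logic in Python follows; the function name is hypothetical, and re.escape is added on the assumption that the search string may contain regex metacharacters, which the PHP snippet above does not guard against.

```python
import re

def starts_with_same_three(value: str, needle: str) -> bool:
    """True if value begins with the first three characters of needle."""
    prefix = needle[:3]
    # Escape the prefix so characters like . or + are matched literally.
    return re.match(re.escape(prefix), value) is not None

print(starts_with_same_three("something else", "someString"))  # True
print(starts_with_same_three("sunny day", "someString"))       # False
```

When building the SQL text from user input, a parameterized query is generally safer than string concatenation.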
Please use jQuery and jqSQL plugin. Note that symbol $ must be escaped in SQL query with this plugin.
I got this query and want to extract the value between the brackets.
select de_desc, regexp_substr(de_desc, '\[(.+)\]', 1)
from DATABASE
where col_name like '[%]';
It however gives me the value with the brackets such as "[TEST]". I just want "TEST". How do I modify the query to get it?
The third parameter of the REGEXP_SUBSTR function indicates the position in the target string (de_desc in your example) where you want to start searching. Assuming a match is found in the given portion of the string, it doesn't affect what is returned.
In Oracle 11g, there is a sixth parameter to the function, that I think is what you are trying to use, which indicates the capture group that you want returned. An example of proper use would be:
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]', 1,1,NULL,1) from dual;
Where the last parameter, 1, indicates the number of the capture group you want returned. Here is a link to the documentation that describes the parameter.
10g does not appear to have this option, but in your case you can achieve the same result with:
select substr( match, 2, length(match)-2 ) from (
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]') match FROM dual
);
since you know that a match will have exactly one excess character at the beginning and end. (Alternatively, you could use RTRIM and LTRIM to remove brackets from both ends of the result.)
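Both approaches, asking for the capture group directly and trimming one character from each end of the full match, can be compared side by side. This sketch uses Python's re module as a stand-in for Oracle's REGEXP_SUBSTR, so the call shapes differ even though the pattern is the same.

```python
import re

text = "abc[def]ghi"
match = re.search(r"\[(.+)\]", text)

with_brackets = match.group(0)     # whole match, like REGEXP_SUBSTR without a group argument
via_group = match.group(1)         # capture group 1, like the sixth parameter in 11g
via_slice = with_brackets[1:-1]    # drop the first and last character, like the SUBSTR workaround

print(with_brackets)  # [def]
print(via_group)      # def
print(via_slice)      # def
```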
You need to do a replace and use a regex pattern that matches the whole string.
select regexp_replace(de_desc, '.*\[(.+)\].*', '\1') from DATABASE;
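The replace approach works because the pattern consumes the entire string and the replacement keeps only group 1. A short Python sketch of the same pattern follows; it is assumed here as an illustration and is not Oracle's engine.

```python
import re

def between_brackets(text: str) -> str:
    """Replace the whole string with the text captured between [ and ]."""
    return re.sub(r".*\[(.+)\].*", r"\1", text)

print(between_brackets("abc[def]ghi"))  # def
print(between_brackets("[TEST]"))       # TEST
print(between_brackets("no brackets"))  # no brackets
```

Note that a string with no match comes back unchanged rather than empty, so rows without brackets may need to be filtered out separately.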