I am working with Sybase SQL and want to exclude all entries that look like this:
(NOT PRESENT)
So I tried using:
SELECT col FROM table WHERE col NOT LIKE '(%)'
Do you guys know what is happening? I think I need to escap ( somehow, but I do not know how. The following returns an error:
SELECT col FROM table WHERE col NOT LIKE '\(%\)' ESCAPE '\'
Kind Regards
Try this :
SELECT col FROM table WHERE col NOT LIKE ('(%)')
You might find this helpful
Sybase Event Stream Processor 5.0 CCL Programmers Guide - String Functions
like()
Scalar. Determines whether a given string matches a specified pattern string.
Syntax
like ( string, pattern )
Parameters
string A string.
pattern A pattern of characters, as a string. Can contain wildcards.
Usage
Determines whether a string matches a pattern string. The function returns 1 if the string matches the pattern, and 0 otherwise. The pattern argument can contain wildcards: '_' matches a single arbitrary character, and '%' matches 0 or more arbitrary characters. The function takes in two strings as its arguments, and returns an integer.
Note: In SQL, the infix notation can also be used: sourceString like patternString.
Example
like ('MSFT', 'M%T') returns 1.
Related
I am trying to remove template text like &#x; or &#xx; or &#xxx; from long string
Note: x / xx / xxx - is number, The length of the number is unknown, The cell type is CLOB
for example:
SELECT 'H'ello wor±ld' FROM dual
A desirable result:
Hello world
I know that regexp_replace should be used, But how do you use this function to remove this text?
You can use
SELECT REGEXP_REPLACE(col,'&&#\d+;')
FROM t
where
& is put twice to provide escaping for the substitution character
\d represents digits and the following + provides the multiple occurrences of them
ending the pattern with ;
or just use a single ampersand ('&#\d+;') for the pattern as in the case of Demo , since an ampersand has a special meaning for Oracle, a usage is a bit problematic.
In case you wanted to remove the entities because you don't know how to replace them by their character values, here is a solution:
UTL_I18N.UNESCAPE_REFERENCE( xmlquery( 'the_double_quoted_original_string' RETURNING content).getStringVal() )
In other words, the original 'H'ello wor±ld' should be passed to XMLQUERY as '"H'ello wor±ld"'.
And the result will be 'H'ello wo±ld'
how do I write a SQL where statement that checks if a string contains some substring and a number. For example:
string: macsea01
where string like 'macsea' plus a number
Regex is the most obvious solution to this question. Without more detail about the specific format of the string, I can suggest the following, which will match a sequence of a letter in the alphabet followed immediately by a digit:
where column_name like '%[a-zA-Z][0-9]%'
If you're literally looking for macsea at the beginning of the string followed by a digit, it would be:
where column_name like 'macsea[0-9]%'
Regex seem to bee a little slippery here, depending on your needs you can for instance divide the string into several parts, first the text part, and take the rest of the string, try to convert it into a number.
Somthing like this (but I think this perticular code is broken
where substring(column_name, 1, 6) = 'macsea' and cast(substring(column_name, 7, 1000) as int) > 0
I am using Snowflake SQL. I would like to remove characters from a string after a special character ~. How can I do that?
here is the whole scenario. Let me explain. I do have a string like 'CK#123456~fndkjfgdjkg'. Now, i want only the number after #.And not anything after ~. This is number length varies for that field value. It might be 1 or 5 or 3. And i want to add the condition in where class where this number is equal to check_num from other table after joining. I am trying REGEXP_SUBSTR(A.SRC_TXT, '(?<=CK#)(.+?\b)') = C.CHK_NUM in the where condition. I am getting the error as 'No repititive argument after ?'
You can use a regex for this
-- To remove just the character after a ~
select regexp_replace('fo~o bar','~.', '');
-- returns 'fo bar'
--If you want to keep the ~
select regexp_replace('fo~o bar','~.', '~');
-- returns 'fo~ bar'
--If you want to remove everything after the ~
select regexp_replace('fo~o bar','~.*', '');
--returns 'fo'
If you need to remove other specific character sets after a ~, you can probably do this with a slightly more complicated regex, but I'd need examples of your desired input/output to help with that.
EDIT for updated question
This regex replace should get what you need.
select regexp_replace('CK#123456~fndkjfgdjkg','CK#(\\d*)~.*', '\\1');
-- returns 123456
(\\d*) gets ANY number of digits in a row, and the \\1 causes it to replace the match with what was in the first set of parenthesis, which is your list of digits. the CK# and ~.* are there to make sure the whole string gets matched and replaced.
If the CK# can vary as well, you can use .*? like this.
select regexp_replace('ABCD123HI#123456~fndkjfgdjkg','.*?#(\\d*)~.*', '\\1')
-- returns 123456
I'd probably do something like the following, easy enough but not as cool as RegEx type of functions.
set my_string='fooo~12345';
set search_for_me = '~';
SELECT SUBSTR($my_string, 1, DECODE(position($search_for_me, $my_string), 0, length($my_string), position($search_for_me, $my_string)));
I hope this helps...Rich
It looks like lookahead and lookbehinds are not supported in REGEXP functions, they seem to work in the PATTERN clause of a LIST command. Snowflake documentation makes no mention either way of lookahead or lookbehinds.
In your example:
It seems that the query engine is looking for that repeating argument, where you are attempting a lookbehind
You have not specified what you wanted extracted. You have two capture groups, but in this scenario everything would be returned
Since you are looking to remove everything after ~ you have a delimiter, why not use it in your REGEXP_SUBSTR function?
Try the following:
SELECT $1,REGEXP_SUBSTR($1,'\\w+#(.+?)~',1,1,'is',1)
FROM VALUES
('CK#123456~fndkjfgdjkg')
,('QH#128fklj924~fndkjfgdjkg')
;
This looks for:
One or more word characters
Followed by #
Capturing one or more characters upto and not including ~
Returns the characters within the capture group
You can change the .+? to \\d+? to make sure the pattern is only digits. Backslashes must be escaped with a backslash.
The descriptions for each argument of the function can be found here:
https://docs.snowflake.net/manuals/sql-reference/functions/regexp_substr.html
You could check this!!
select substr('CK#123456~fndkjfgdjkg',4,6) from dual;
OUTPUT
123456
https://docs.snowflake.net/manuals/sql-reference/functions/substr.html
I am busy building a lookup table for specific names of merchants. I tried to make use of the following regex but it's returning less results than the standard "like" function in Netezza SQL. Please refer to below:
SQL Like function: where trim(upper(a.MRCH_NME)) like '%CNA %' -- returns 4622 matches
Regex function in Netezza SQL: where array_combine(regexp_extract_all(trim(upper(a.MRCH_NME)),'.*CNA\s','i'),'|') = 'CNA' -- returns 2226 matches
I looked at the two result sets and found that strings such as the following aren't matched:
!C CNA INT ARR
*CNA PLATZ 0400
015764 CNA CRAD
C#CNA PARK 0
I made use of the following regex expression: /.*CNA\s'/
Any idea why the above strings aren't being returned as matches?
Thank you.
You probably should be using regexp_like:
SELECT *
FROM yourTable
WHERE REGEXP_LIKE(MRCH_NME, 'CNA[ ]', 'i');
This would be logically identical to the following query using LIKE:
SELECT *
FROM yourTable
WHERE MRCH_NME LIKE '%CNA ';
It seems to me the problem is more with your code rather than the regex. Look: like '%CNA %' returns all entries that contain a CNA substring followed with a literal space anywhere inside the entry. The '.*CNA\s' regex matches any 0+ chars other than newline followed with CNA and **any whitespace char*.
Acc. to this reference, \s matches "a white space character. White space is defined as [\t\n\f\r\p{Z}].
Thus, you should in fact just use
WHERE REGEXP_LIKE(MRCH_NME, 'CNA ', 'i')
or, better with a word boundary check:
WHERE REGEXP_LIKE(MRCH_NME, '\bCNA\b', 'i')
where \b marks a transition from a word to non-word and non-word to word character, thus ensuring a whole word search and justifying the regex usage.
If you do not need to match the merchant name as a whole word, use the regular LIKE with '%CNA %', it should be more efficient.
CASE
WHEN <in_data> LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]' THEN SUBSTR(<in_data>,1,3)
ELSE '000'
END
We're doing a migration project from Sybase to Teradata, and having a problem figuring this one out :) I'm still new to Teradata.
I would like to ask the equivalent TD code for this -
LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]' to Teradata
Basically, it just checks whether the digits are numeric value.
Can someone give me a hint on this
You can also use REGEXP_SUBSTR to directly extract the three digits:
COALESCE(REGEXP_SUBSTR(in_data,'^[0-9]{3}(?=[0-9]{3}$)'), '000')
This looks for the first three digits and then does a lookahead for three following digits without adding them to the overall match.
^ indicates the begin of the string, '$' the end, so there are no other characters before or after the six digits. (?=...) is a so-called "lookahead", i.e. those three digits are checked, but ignored.
If there's no match the regex returns NULL which is changed to '000'.
You need to use regexp instead of like, since [0-9][0-9][0-9][0-9][0-9][0-9] is a regular expression.
To do an exact match, you need to add anchors. ie, to match the string which contains an exact 6 digit chars.
regexp '^[0-9]{6}$'
or
regexp '^[[:digit:]]{6}$'