Wildcard usage for ^ - sql

I want to verify a field,
if a field is only number like 12345, return nothing
if a field is something like 1234-1, or -123331, return nothing
if a field is a12341, or 34j123 or 99933hh return 1
if a field is a1234-, or sodf233- return 1.
Basically just check if there is a non number character in this field, but allow the dash to be in.
Here is my thoughts:
select 1
from dbo.random.field
where ISNUMBERIC(field)=0 and field not like '%-%'
Use this to check if there is letter, and then, if there is a -, but my test case always like this:
12345 Pass
12345a Failed
12345- Pass
1234a- Pass, but this should fail.
So what am I doing wrong here?

Let me assume you are using SQL Server, based on isnumeric(). You can use like:
where field like 'a%[0-9]%' and
field not like 'a%[^-0-9]%'
The first checks that the column starts with 'a' and has at least one digit. The second checks that there are no non-digits or non-hyphens after the 'a'.
You can generalize the a to any letter using [a-z] (assuming case insensitivity) or to any non-digit using [^0-9].
EDIT:
For your revised question, you just seem to want a letter. You can use:
select *
from (values ('12345'), ('1234-1'), ('1234-1'), ('a12341'), ('34j123'), ('99933hh'), ('a1234-'), ('sodf233-')) v(field)
where field like '%[^-0-9]%';

So this is what I did in the end
like '%[^-0-9]%'

Related

BigQuery - Using regexp with LIKE operator (?)

I'd like to get productids from url and I've almost finetuned a query to do it but still there is an issue I cannot solve.
The url usually looks like this:
/xp-pen/toll-spe43-deco-pro-small-medium-spe43-tobuy-p665088831/
or
/harry-potter-es-a-tuz-serlege-2019-m19247107/
As you can see there are two types of ids:
in general, ids start with '-p'
ids of some special products start with '-m'
I created this case when statement:
CASE
WHEN MAX(hits.page.pagePath) LIKE '%-p%'
THEN MAX(REGEXP_REPLACE(REGEXP_EXTRACT(
hits.page.pagePath, '-p[0-9]+/'), '\\-|p|/', ''))
WHEN MAX(hits.page.pagePath) LIKE '%-m%'
THEN MAX(REGEXP_REPLACE(REGEXP_EXTRACT(
hits.page.pagePath, '-m[0-9]+/'), '\\-|m|/', ''))
ELSE NULL
END AS productId
It's a little complicated at the first look but I really needed a regexp_replace and a regexp_extract because '-p' or '-m' characters doesn't appear only before the id but it can be multiplied times in a url.
The problem with my code is that there are some special cases when the url looks like this:
/elveszett-profeciak-2019-m17855487/
As you can see the id starts with '-m' but the url also contains '-p'. In this case the result is empty value in the query.
I think it could be solved by modifying the like operator in the when part of the case when statement: LIKE '%-p%' or LIKE '%-m%'
It would be great to have a regexp expression after or instead of the LIKE operator. Something similar to the parameter of '-p[0-9]+/' what I used in regexp_extract function.
So what I would need is to define in the when part of the statement that if the '-p' or '-m' text is followed by numbers in the urls
I'm not sure it's possible to do or not in BQ.
So what I would need is to define in the when part of the statement that if the '-p' or '-m' text is followed by numbers in the urls
I think you want '-p' and '-m' followed by digits. If so, I think this does what you want:
select regexp_extract(url, '-[pm][0-9]+')
from (select '/xp-pen/toll-spe43-deco-pro-small-medium-spe43-tobuy-p665088831/' as url union all
select '/elveszett-profeciak-2019-m17855487/' union all
select '/harry-potter-es-a-tuz-serlege-2019-m19247107/'
) x

Postgres SQL regexp_replace replace all number

I need some help with the next. I have a field text in SQL, this record a list of times sepparates with '|'. For example
'14613|15474|3832|148|5236|5348|1055|524' Each value is a time in milliseconds. This field could any length, for example is perfect correct '3215|2654' or '4565' (only 1 value). I need get this field and replace all number with -1000 value.
So '14613|15474|3832|148|5236|5348|1055|524' will be '-1000|-1000|-1000|-1000|-1000|-1000|-1000|-1000'
Or '3215|2654' => '-1000|-1000' Or '4565' => '-1000'.
I try use regexp_replace(times_field,'[[:digit:]]','-1000','g') but it replace each digit, not the complete number, so in this example:
'3215|2654' than must be '-1000|-1000', i get:
'-1000-1000-1000-1000|-1000-1000-1000-1000', I try with other combinations and more options of regexp but i'm done.
Please need your help, thanks!!!.
We can try using REGEXP_REPLACE here:
UPDATE yourTable
SET times_field = REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g');
If instead you don't really want to alter your data but rather just view your data this way, then use a select:
SELECT
times_field,
REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g') AS times_field_replace
FROM yourTable;
Note that in either case we pass g as the fourtb parameter to REGEXP_REPLACE to do a global replacement of all pipe separated numbers.
[[:digit:]] - matches a digit [0-9]
+ Quantifier - matches between one and unlimited times, as many times as possible
your regexp must look like
regexp_replace(times_field,'[[:digit:]]+','-1000','g')

PostgreSQL - find matching line in char/string column?

How can I find matching line in char/string type column?
For example let say I have column called text and some row has content of:
12345\nabcdf\nXKJKJ
(where \n are real new lines)
Now I want to find related row if any of lines match. For example, I have value 12345,
then it should find match. But if I have value 123, It would not.
I tried using like but it finds in both cases, when I have matching value (like 12345) and partially matching value (like 123).
For example something like this, but to have boundary for checking whole line:
SELECT id
FROM my_table
WHERE text like [SOME_VALUE]
Update
Maybe its not yet clear what Im asking. But basically I want something equivalent what you can do with regular expression,
like this: https://regexr.com/5akj1
Here regular expression /^123$/m would not match my string, it would only match if it would have been with pattern /^12345$/m (when I use pattern, value is dynamic, so pattern would change depending what value I got).
You may use regexp_replace and then check that the replaced string is not equal to the original column value:
select count(*)
from dummy
where regexp_replace(mytext, '(?m)^1234$', '') <> mytext;
You have a demo here.
Bear in mind that I have used the (?m) modifier, which makes ^ and $ match begin and end of line instead of begin and end of string.
You should be able to use ~ for matching:
where mytext ~ '(\n|^)1234(\n|$)'

How can I extract a substring from a character column without using SUBSTR()?

I have a questions regarding below data.
You clearly can see each EMP_IDENTIFIER has connected with EMP_ID.
So I need to pull only identifier which is 10 characters that will insert another column.
How would I do that?
I did some traditional way, using INSTR, SUBSTR.
I just want to know is there any other way to do it but not using INSTR, SUBSTR.
EMP_ID(VARCHAR2)EMP_IDENTIFIER(VARCHAR2)
62049 62049-2162400111
6394 6394-1368000222
64473 64473-1814702333
61598 61598-0876000444
57452 57452-0336503555
5842 5842-0000070666
75778 75778-0955501777
76021 76021-0546004888
76274 76274-0000454999
73910 73910-0574500122
I am using Oracle 11g.
If you want the second part of the identifier and it is always 10 characters:
select t.*, substr(emp_identifier, -10) as secondpart
from t;
Here is one way:
REGEXP_SUBSTR (EMP_IDENTIFIER, '-(.{10})',1,1,null,1)
That will give the 1st 10 character string that follows a dash ("-") in your string. Thanks to mathguy for the improvement.
Beyond that, you'll have to provide more details on the exact logic for picking out the identifier you want.
Since apparently this is for learning purposes... let's say the assignment was more complicated. Let's say you had a longer input string, and it had several groups separated by -, and the groups could include letters and digits. You know there are at least two groups that are "digits only" and you need to grab the second such "purely numeric" group. Then something like this will work (and there will not be an instr/substr solution):
select regexp_substr(input_str, '(-|^)(\d+)(-|$)', 1, 2, null, 2) from ....
This searches the input string for one or more digits ( \d means any digit, + means one or more occurrences) between a - or the beginning of the string (^ means beginning of the string; (a|b) means match a OR b) and a - or the end of the string ($ means end of the string). It starts searching at the first character (the second argument of the function is 1); it looks for the second occurrence (the argument 2); it doesn't do any special matching such as ignore case (the argument "null" to the function), and when the match is found, return the fragment of the match pattern included in the second set of parentheses (the last argument, 2, to the regexp function). The second fragment is the \d+ - the sequence of digits, without the leading and/or trailing dash -.
This solution will work in your example too, it's just overkill. It will find the right "digits-only" group in something like AS23302-ATX-20032-33900293-CWV20-3499-RA; it will return the second numeric group, 33900293.

Regular expression to validate whether the data contains numeric ( or empty is also valid)

i have to validate the data contains numeric or not and if it is not numeric return 0 and if it is numeric or empty return 1.
Below is my query i tried in SQL.
SELECT dbo.Regex('^[0-9]','123') -- This is returning 1.
SELECT dbo.Regex('^[0-9]','') -- this is not returning 1 but i want to return as 1 and i try to put space in "pattern" also it is not working...
please can any one help....
Thanks in advance
Try:
SELECT dbo.Regex('^[0-9]*$','123')
or better yet:
SELECT dbo.Regex('^\d*$','123')
\d is standard shorthand for [0-9]. Note that when you use [0-9] it means "match exactly one digit. The * wildcard means match zero or more so \d* means match zero or more digits, which includes an empty result.
Your Regex is not quite right: "^[0-9]" checks only, if the first character in the string is a number. Your regex would return 1 on "3abcde".
Use
"^\d*$"
to check if the field contains only numbers (or nothing). If you want to allow whitespace, use
"^\s*\d*\s*$"
If you want to allow signs and decimals, try
"^\s*(+|-)?\d*\.?\d*\s*$"
Try ^[0-9]*$, which will match 0 or more digits only:
SELECT dbo.Regex('^[0-9]*$','')
should return 1, as should...
SELECT dbo.Regex('^[0-9]*$','123')