RegEx in Oracle SQL seems not to work - sql

I've got a small issue with regex in oracle.
I have a table as follows with 3 cases of string formats:
In my table, col1 holds the Strings I have, col2 is target.
Case 1: W1234W4321
Case 2: W1234,W4321
Case 3: W1234/W4321
(Length and actual numbers vary)
Now I've set up this little regex: [\d,/]W.* to split off the second value, which follows a digit, a comma, or a slash.
I've tested the regex in the tool RegExBuddy, where the result is as expected.
When updating my table with the following query, cases 2 and 3 are updated, but case 1 is still null in col2.
update nyTable set col2 = regexp_substr(col1, '[\d,/]W.*');
Is this some issue related to oracle (maybe not understanding the \d)?

http://docs.oracle.com/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm
\d: A digit character. It is equivalent to the POSIX class [[:digit:]].
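One likely explanation, offered here as an assumption rather than something the quoted documentation states: inside a bracket expression Oracle follows POSIX rules, where a backslash is an ordinary character, so [\d,/] would match a backslash, the letter d, a comma, or a slash, but no digit. The Python sketch below only emulates that reading (Python's own regex engine does treat \d inside brackets as a digit class, so the backslash is escaped by hand). The sample strings come from the question; the helper name first_match is hypothetical.

```python
import re

samples = ["W1234W4321", "W1234,W4321", "W1234/W4321"]

# Emulates the assumed POSIX reading of [\d,/]:
# a literal backslash, the letter 'd', a comma, or a slash.
posix_reading = re.compile(r"[\\d,/]W.*")

# Spelling the digits out sidesteps the problem. In Oracle this would be
# '[0-9,/]W.*' or '[[:digit:],/]W.*'.
explicit_digits = re.compile(r"[0-9,/]W.*")


def first_match(pattern, text):
    """Return the first match, or None (regexp_substr would return NULL)."""
    m = pattern.search(text)
    return m.group(0) if m else None


posix_results = [first_match(posix_reading, s) for s in samples]
explicit_results = [first_match(explicit_digits, s) for s in samples]
print(posix_results)
print(explicit_results)
```

Under that reading, case 1 finds no match, which lines up with col2 staying null for it while cases 2 and 3 are updated.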

Related

Query to find if a column contains both number and decimal only

I have a column and want to check whether it contains only the digits 0-9 and a decimal point. In the version of SQL I am using, the query below does not seem to work:
select *
from tablename
where columnname like '%[^.0-9]%'
I also tried columnname like '%[0-9]%' and columnname not like '%.%', but a value with a negative sign is not captured. Please advise.
The column data type is float. Can someone provide a query that checks whether the column contains only the digits 0-9 and, optionally, a decimal point? Those two are permitted. For example, given the values 9, 9.99 and -1.24, the query should output -1.24: I need the values that contain something other than digits and a decimal point.
The issue with your LIKE clause is bad predicate logic: LIKE '%[^.0-9]%' should be NOT LIKE '%[^0-9.]%'.
Take this sample data.
DECLARE @table TABLE (SomeNbr VARCHAR(32));
INSERT @table VALUES ('x'),('0'),('0.12'),('999'),('-29.33'),('88.33.22'),('9-9-'),('11-');
What you were trying to do would be accomplished like this:
SELECT t.someNbr
FROM @table AS t
WHERE someNbr NOT LIKE '%[^0-9.]%';
The problem here is that we also return "88.33.22", which is not a valid float, and miss "-29.33", which is. You can handle hyphens by adding a hyphen to your LIKE pattern:
SELECT t.someNbr, LEN(t.SomeNbr)-LEN(REPLACE(t.SomeNbr,'.',''))
FROM @table AS t
WHERE someNbr NOT LIKE '%[^0-9.-]%';
But now we also pick up "9-9-" and values with 2+ dots. To ensure that each value starts with a number OR a hyphen, that a hyphen exists only at the front of the string (if at all), and that we have a maximum of one dot:
--==== This will do a good job but can still be broken
SELECT t.someNbr
FROM @table AS t
WHERE someNbr NOT LIKE '%[^0-9.-]%' -- Can only contain numbers, dots and hyphens
AND LEN(t.SomeNbr)-LEN(REPLACE(t.SomeNbr,'.','')) < 2 -- can have up to 1 dot
AND LEN(t.SomeNbr)-LEN(REPLACE(t.SomeNbr,'-','')) < 2 -- can have up to 1 hyphen
AND PATINDEX('%-%',t.SomeNbr) < 2 -- hyphen can only be in the front
This does the trick and returns:
someNbr
--------------------------------
0
0.12
999
-29.33
All that said - **DON'T DO ANY OF THIS ^^^**. There is no need to parse numbers this way except to show others why not to. I can still break this. The way I return valid floats in a scenario like this is with TRY_CAST or TRY_CONVERT. This returns what you need and will perform better.
--==== Best Solution
SELECT t.someNbr
FROM @table AS t
WHERE TRY_CAST(t.SomeNbr AS float) IS NOT NULL;
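As a rough analogue outside the database, the same "attempt the conversion and keep what succeeds" idea can be sketched in Python. The helper name try_cast_float is hypothetical, and Python's float() is not identical to SQL Server's TRY_CAST (for example, it also accepts strings such as 'inf' and 'nan'), so this only illustrates the principle on the sample data above.

```python
def try_cast_float(text):
    """Return the float value of text, or None when the conversion fails."""
    try:
        return float(text)
    except ValueError:
        return None


sample = ["x", "0", "0.12", "999", "-29.33", "88.33.22", "9-9-", "11-"]

# Keep only the strings that convert. Note the explicit "is not None":
# a successful conversion of "0" yields 0.0, which is falsy.
valid = [s for s in sample if try_cast_float(s) is not None]
print(valid)
```

The conversion attempt rejects the malformed values without any hand-written rule for dots or hyphens, which is the point of the answer.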

How to select rows that have numbers as a value?

I have a table with a column of type VARCHAR2(255 BYTE). I would like to select only the rows that have numbers as a value, discarding any other values such as "lala" or "1z". I just want pure numbers from 1 to ..... 999999999 (digits only, in other words) :P
Could you tell me how to do it?
If you're using Oracle 12c R2 or later, use the built-in validate_conversion() function:
select *
from your_table
where validate_conversion(your_column as number) = 1
validate_conversion() returns 1 when the proposed conversion would succeed and 0 when it wouldn't. It also supports date and timestamp conversions.
Something like this is the usual option. You could use regexp, but it's usually a bit slower.
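select column1
from tableA
-- the leading 'x' keeps the third argument non-empty: Oracle treats '' as NULL, and translate() returns NULL when any argument is NULL
where translate(column1, 'x1234567890', 'x') is null;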
Here's the regexp version kfinity referred to. The regex matches a line consisting of 1 or more digits.
select column1
from tableA
where regexp_like(column1, '^\d+$');
You don't want zero to start a number. So it seems like regular expressions are the way to go:
where regexp_like(column1, '^[1-9][0-9]*$');
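The difference between the two patterns is easy to check outside the database. The sketch below uses Python's re module as a stand-in for regexp_like. It writes the digit class as [0-9] because Python's \d also matches non-ASCII digits, and it uses fullmatch instead of the ^...$ anchors. The sample values are made up for illustration.

```python
import re

any_digits = re.compile(r"[0-9]+")            # counterpart of '^\d+$'
no_leading_zero = re.compile(r"[1-9][0-9]*")  # counterpart of '^[1-9][0-9]*$'

values = ["123", "007", "0", "1z", "lala", "999999999", ""]

digits_only = [v for v in values if any_digits.fullmatch(v)]
positive_ints = [v for v in values if no_leading_zero.fullmatch(v)]
print(digits_only)
print(positive_ints)
```

The second pattern drops "007" and "0", which matters if the column is meant to hold numbers starting from 1.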

Comparing two empty Strings in Oracle SQL

Hi, today I ran into a weird situation. I had a where clause with a condition that returns a String, and I wanted to check whether it's empty. When it returns an empty string, Oracle still treats the two as different Strings. So I went further and prepared some simple queries:
select 1 from dual where 1 = 1;
returns: 1
select 1 from dual where 'A' = 'A';
returns: 1
And now what I cannot understand:
select 1 from dual where '' = '';
No result.
Even if I check if they are different there is still no result.
select 1 from dual where '' != '';
No result.
Can someone explain this to me?
Oracle treats empty strings as NULL. It's a gotcha. Make a note of it and hope it never bites you in the butt in production.
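To see why neither query returns a row, it helps to model SQL's three-valued logic. The following Python sketch is only a model, not Oracle itself: None stands for NULL, all helper names are hypothetical, and to_oracle mimics the conversion of '' to NULL.

```python
def to_oracle(value):
    """Model Oracle treating the empty string as NULL (None here)."""
    return None if value == "" else value


def sql_eq(a, b):
    """SQL '=': the result is unknown (None) when either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_ne(a, b):
    """SQL '!=': the result is unknown (None) when either side is NULL."""
    if a is None or b is None:
        return None
    return a != b


def where(condition):
    """A WHERE clause keeps a row only when the condition is true."""
    return condition is True


empty = to_oracle("")
print(where(sql_eq("A", "A")))      # the row is returned
print(where(sql_eq(empty, empty)))  # unknown, so no row
print(where(sql_ne(empty, empty)))  # unknown as well, so no row
```

Both comparisons involving the empty string come out as unknown, and a WHERE clause discards unknown just as it discards false.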
The reason is as @Captain Kenpachi explained. If you want to compare two strings (or other values of the same type) and want to be tolerant of NULLs (or empty strings, which Oracle treats the same way), then you need to involve an IS test.
You could try the common cheat of using a rogue value that will never be used, but Murphy's Law dictates that one day someone will use it. This technique also has the drawback that the rogue value must match the type of the thing you are comparing, i.e. comparing strings needs a rogue string while comparing dates needs a rogue date. This also means you can't cut-and-paste it liberally without applying a little thought. Example:
WHERE NVL(col1,'MyRogueValue')=NVL(col2,'MyRogueValue')
The standard version is to explicitly test for NULLs
WHERE (col1=col2 OR (col1 IS NULL AND col2 IS NULL))
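The opposite becomes WHERE NOT(col1=col2 OR (col1 IS NULL AND col2 IS NULL)). Note that this evaluates to unknown rather than true when exactly one of the two columns is NULL, so those rows are not returned.
I have seen a long-winded version of the opposite (as seen in Toad's data compare tool) that does cover that case: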
WHERE (col1<>col2 OR (col1 IS NULL AND col2 IS NOT NULL) OR (col1 IS NOT NULL AND col2 IS NULL))
Oracle does have a handy DECODE function that basically means IF a IS b THEN c ELSE d, so equality is WHERE DECODE(col1,col2,1,0)=1 and the opposite is WHERE DECODE(col1,col2,1,0)=0. You may find this a little slower than the explicit IS test. It is proprietary to Oracle but helps make up for the empty string problem.
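The trade-off between the rogue-value trick and a NULL-tolerant comparison can be sketched in Python, with None standing in for NULL. Both helper names are hypothetical, and decode_eq only mimics the way DECODE treats two NULLs as equal; it is not Oracle's implementation.

```python
ROGUE = "MyRogueValue"


def nvl_eq(a, b, rogue=ROGUE):
    """Model NVL(a, rogue) = NVL(b, rogue)."""
    a = rogue if a is None else a
    b = rogue if b is None else b
    return a == b


def decode_eq(a, b):
    """Model DECODE(a, b, 1, 0) = 1, where two NULLs count as equal."""
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a == b


pairs = [("x", "x"), ("x", "y"), (None, None), (None, "x"), (None, ROGUE)]
nvl_results = [nvl_eq(a, b) for a, b in pairs]
decode_results = [decode_eq(a, b) for a, b in pairs]
print(nvl_results)
print(decode_results)
```

The two approaches agree until real data happens to contain the rogue value (the last pair), at which point the NVL version reports a NULL and a non-NULL value as equal.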

How do I sort a VARCHAR column in PostgreSQL that contains words and numbers?

I need to order a select query by a varchar column, using numerical and text order. The query will be run from a Java program, using JDBC over PostgreSQL.
If I use ORDER BY in the select clause I obtain:
1
11
2
abc
However, I need to obtain:
1
2
11
abc
The problem is that the column can also contain text.
This question is similar (but targeted for SQL Server):
How do I sort a VARCHAR column in SQL server that contains words and numbers?
However, the solution proposed did not work with PostgreSQL.
Thanks in advance, regards,
I had the same problem and the following code solves it:
SELECT ...
FROM table
order by
CASE WHEN column < 'A'
THEN lpad(column, size, '0')
ELSE column
END;
The size var is the length of the varchar column, e.g. 255 for varying(255).
You can use regular expression to do this kind of thing:
select THECOL from ...
order by
case
when substring(THECOL from '^\d+$') is null then 9999
else cast(THECOL as integer)
end,
THECOL
First you use regular expression to detect whether the content of the column is a number or not. In this case I use '^\d+$' but you can modify it to suit the situation.
If the regexp doesn't match, return a big number so this row will fall to the bottom of the order.
If the regexp matches, convert the string to number and then sort on that.
After this, sort regularly with the column.
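The same two-level ordering can be sketched as a sort key in Python, which may help when checking the expected order from the application side. This is an illustration, not the query itself: instead of the 9999 sentinel it uses a leading flag to place non-numeric values last, and the function name natural_key is hypothetical.

```python
import re


def natural_key(value):
    """Sort all-digit strings numerically first, everything else as text after."""
    if re.fullmatch(r"[0-9]+", value):
        return (0, int(value), value)
    return (1, 0, value)


rows = ["1", "11", "2", "abc"]
ordered = sorted(rows, key=natural_key)
print(ordered)
```

Using a flag rather than a large sentinel number avoids the case where a genuine numeric value is bigger than the sentinel and ends up sorted after the text rows.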
I'm not aware of any database having a "natural sort", like some know to exist in PHP. All I've found is various functions:
Natural order sort in Postgres
Comment in the PostgreSQL ORDER BY documentation

SQL query - LEFT 1 = char, RIGHT 3-5 = numbers in Name

I need to filter out junk data in a SQL table (SQL Server 2008). I need to identify these records and pull them out.
Char[0] = A..Z, a..z
Char[1] = 0..9
Char[2] = 0..9
Char[3] = 0..9
Char[4] = 0..9
{No blanks allowed}
Basically, a clean record will look like this:
T1234, U2468, K123, P50054 (4 record examples)
Junk data looks like this:
T12.., .T12, MARK, TP1, SP2, BFGL, BFPL (7 record examples)
Can someone please assist with a SQL query to do a LEFT and RIGHT method and extract those characters, and do a LIKE IN or something?
A function would be great though!
The following should work in a few different systems:
SELECT *
FROM TheTable
WHERE Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9]%'
AND Data NOT LIKE '% %'
This approach will indeed match P2343, P23423JUNK, and other similar text but requires that the format is A0000*.
Now, if the OP implies a format of 1st position is a character and all succeeding positions are numeric, as in A0+, then use the following (in SQL Server and a good deal of other database systems):
SELECT *
FROM TheTable
WHERE SUBSTRING(Data, 1, 1) LIKE '[A-Za-z]'
AND SUBSTRING(Data, 2, LEN(Data) - 1) NOT LIKE '%[^0-9]%'
AND LEN(Data) >= 5
To incorporate this into a SQL Server 2008 function, since this appears to be what you'd like most, you can write:
CREATE FUNCTION ufn_IsProperFormat(@Data VARCHAR(50))
RETURNS BIT
AS
BEGIN
RETURN
CASE
WHEN SUBSTRING(@Data, 1, 1) LIKE '[A-Za-z]'
AND SUBSTRING(@Data, 2, LEN(@Data) - 1) NOT LIKE '%[^0-9]%'
AND LEN(@Data) >= 5 THEN 1
ELSE 0
END
END
...and call into it like so:
SELECT *
FROM TheTable
WHERE dbo.ufn_IsProperFormat(Data) = 1
...this query needs to change for Oracle queries because Oracle doesn't appear to support bracket notation in LIKE clauses:
SELECT *
FROM TheTable
WHERE REGEXP_LIKE(Data, '^[A-Za-z]\d{4,}$')
This is the expansion gbn is doing in his answer, but these versions allow for varying string lengths without the OR conditions.
EDIT: Updated to support examples in SQL Server and Oracle for ensuring the format A0+, so that A1324, A2342388, and P2342 match but A2342JUNK and A234 do not.
The Oracle REGEXP_LIKE code was borrowed from Mark's post but updated to support 4 or more numeric digits.
Added a custom SQL Server 2008 approach which implements these techniques.
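For a quick check of the A0+ rule outside the database, the format can be expressed as a Python regular expression. This is a sketch using the examples from the edit note above; the function name is_proper_format is hypothetical, and [0-9] is used rather than \d so that only ASCII digits match.

```python
import re

# One letter followed by four or more digits, and nothing else.
PROPER_FORMAT = re.compile(r"[A-Za-z][0-9]{4,}")


def is_proper_format(data):
    """Return True when data is one letter followed by at least four digits."""
    return PROPER_FORMAT.fullmatch(data) is not None


examples = ["A1324", "A2342388", "P2342", "A2342JUNK", "A234"]
results = {s: is_proper_format(s) for s in examples}
print(results)
```

The first three examples pass and the last two fail, matching the behaviour described for the SQL Server and Oracle versions.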
Depends on your database. Many have regex functions (note examples not tested so check)
e.g. Oracle
SELECT x
FROM table
WHERE REGEXP_LIKE(x, '^[A-Za-z][[:digit:]]{4}$')
Sybase uses LIKE
Given that you're allowing between 3 and 6 digits for the number in your examples then it's probably better to use the ISNUMERIC() function on the 2nd character onwards:
SELECT *
FROM TheTable
-- start with a letter
WHERE Data LIKE '[A-Za-z]%'
-- everything from 2nd character onwards is a number
AND ISNUMERIC( SUBSTRING( Data, 2, 50 ) ) = 1
-- number doesn't have a decimal place
AND Data NOT LIKE '%.%'
For more information look at the ISNUMERIC function on MSDN.
Also note that:
I've limited the 2nd part with the number to 50 characters maximum, change this to suit your needs.
Strictly speaking you should check for currency symbols etc, as ISNUMERIC allows them, as well as +/- and some others
A better option might be to create a function that checks that each character after the first is between 0 and 9 (ASCII codes 48 to 57).
You can't use Regular Expressions in SQL Server, so you have to use OR. Correcting David Andres' answer...
WHERE
(
Data LIKE '[A-Za-z][0-9][0-9][0-9]'
OR
Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9]'
OR
Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9][0-9]'
)
David's answer allows "D1234junk" through
You also only need "[A-Z]" if you don't have case sensitivity