I am using Firebird 2.1.
I have a job order number that may have 1 or 2 alpha characters, then 4 or 5 digits, then possibly a suffix of 1 alpha character and 2 digits.
I want to extract the 4-5 digit number in the middle.
I tried the following to find the position of the first digit, but it returned 0:
POSITION('%[0-9]%',JOBHEADER.ORDERNUMBER,1) AS "FIRST NUMBER"
I am not sure if I can use wildcards with the POSITION function. I guess I could try and check the 2nd or 3rd character for a number, but I really need the wild card feature to then find the next alpha after I find the position of the first number. Or maybe there is another solution to extract the number.
I have found something similar:
CASE WHEN SUBSTRING(ordernumber FROM 2 FOR 5) SIMILAR TO '[0-9]+'
THEN SUBSTRING(ordernumber FROM 2 FOR 5)
ELSE SUBSTRING(ordernumber FROM 3 FOR 5)
END as PROJECTNUMBER
But since the number could start anywhere in the first 5 characters, an IF/CASE statement gets quite big.
No, you can't do this with POSITION. POSITION searches for the exact substring in a given string. However, with Firebird 3, you could use SUBSTRING with regular expressions to extract the value, for example:
substring(ordernumber similar '%#"[[:DIGIT:]]+#"%' escape '#')
The regular expression must cover the entire string, while the #" encloses the term to extract (the # is the explicitly defined escape symbol). You may need to use more complex patterns like [^[:DIGIT:]]*#"[[:DIGIT:]]+#"([^[:DIGIT:]]%)? to avoid edge-cases in greediness.
If you know the pattern is always 1 or 2 alpha, 4 or 5 digits you want to extract, possibly followed by 1 alpha and 2 numbers, you could also use [[:ALPHA:]]{1,2}#"[[:DIGIT:]]{4,5}#"([[:ALPHA:]][[:DIGIT:]]{1,2})?. If the pattern isn't matched, null is returned.
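To experiment with this pattern outside Firebird, here is a rough Python equivalent. It is only a sketch: Python's `re` syntax differs from SQL `SIMILAR TO`, the capture group stands in for the #" markers, and the function name and sample order numbers are invented for illustration.

```python
import re

# Rough Python counterpart of
# [[:ALPHA:]]{1,2}#"[[:DIGIT:]]{4,5}#"([[:ALPHA:]][[:DIGIT:]]{1,2})?
# The capture group plays the role of the #" markers in the SQL pattern.
ORDER_PATTERN = re.compile(r"[A-Za-z]{1,2}([0-9]{4,5})(?:[A-Za-z][0-9]{1,2})?")


def extract_job_number(order_number):
    """Return the 4-5 digit middle part, or None when the pattern does not match."""
    # fullmatch mirrors the rule that the SQL pattern must cover the entire string.
    match = ORDER_PATTERN.fullmatch(order_number)
    return match.group(1) if match else None


print(extract_job_number("AB12345C01"))  # hypothetical order number
print(extract_job_number("X1234"))       # hypothetical order number
print(extract_job_number("12345"))       # no leading letter, so no match
```

As in the SQL version, a value that does not fit the whole pattern yields no result (None here, null in Firebird).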
See also:
README.substring_similar.txt
regular expression syntax
Be aware that the SQL standard regular expression syntax supported by Firebird is a bit odd, and less powerful than regular expressions common in other languages.
Using PSQL
To solve this using PSQL, under Firebird 2.1, you can use something like:
create or alter procedure extract_number(input_value varchar(50))
  returns (output_value bigint)
as
  declare char_position integer = 0;
  declare number_string varchar(20) = '';
  declare current_char char(1);
begin
  while (char_position < char_length(input_value)) do
  begin
    char_position = char_position + 1;
    current_char = substring(input_value from char_position for 1);
    if ('0' <= current_char and current_char <= '9') then
    begin
      number_string = number_string || current_char;
    end
    else if (char_length(number_string) > 0) then
    begin
      -- switching from numeric to non-numeric, found first number occurrence in string
      leave;
    end
  end
  output_value = iif(char_length(number_string) > 0, cast(number_string as bigint), null);
end
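The scanning logic of the procedure can be tried out quickly in Python. This is a sketch of the same idea (collect the first run of digits, stop at the first non-digit after it), not a replacement for the PSQL; the function name and test values are made up.

```python
def extract_first_number(input_value):
    """Return the first run of consecutive digits as an int, or None if there is none."""
    number_string = ""
    for current_char in input_value:
        if "0" <= current_char <= "9":
            number_string += current_char
        elif number_string:
            # Switched from digits back to a non-digit: the first number is complete.
            break
    return int(number_string) if number_string else None


print(extract_first_number("AB12345C01"))  # hypothetical order number
print(extract_first_number("ABC"))         # no digits at all
```

Like the procedure, this returns only the first number it meets, so the digits of a trailing suffix are ignored.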
Related
I have this query
SELECT text
FROM book
WHERE lyrics IS NULL
AND MOD(TO_NUMBER(SUBSTR(text,18,16)),5) = 1
sometimes the string is something like this $OK$OK$OK$OK$OK$OK$OK, sometimes something like #P,351811040302663;E,101;D,07112018134733,07012018144712;G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0
I would like to know whether it is possible to prevent ORA-01722: invalid number, because in some cases the characters at that position are not a number.
I run this query inside a procedure that processes all the rows in a cursor; if one row does not contain a number, I can't process any rows.
You could use VALIDATE_CONVERSION if it's Oracle 12c Release 2 (12.2),
WITH book(text) AS
(SELECT '#P,351811040302663;E,101;D,07112018134733,07012018144712;G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0'
FROM DUAL
UNION ALL SELECT '$OK$OK$OK$OK$OK$OK$OK'
FROM DUAL
UNION ALL SELECT '12I45678912B456781234567812345671'
FROM DUAL)
SELECT *
FROM book
WHERE CASE
WHEN VALIDATE_CONVERSION(SUBSTR(text,18,16) AS NUMBER) = 1
THEN MOD(TO_NUMBER(SUBSTR(text,18,16)),5)
ELSE 0
END = 1 ;
Output
TEXT
12I45678912B456781234567812345671
Assuming the condition should be true if and only if the 16-character substring starting at position 18 is made up of 16 digits, and the number is equal to 1 modulo 5, then you could write it like this:
...
where .....
and case when translate(substr(text, 18, 16), 'z0123456789', 'z') is null
and substr(text, 33, 1) in ('1', '6')
then 1 end
= 1
This will check that the substring is made up of all-digits: the translate() function will replace every occurrence of z in the string with itself, and every occurrence of 0, 1, ..., 9 with nothing (it will simply remove them). The odd-looking z is needed due to Oracle's odd implementation of NULL and empty strings (you can use any other character instead of z, but you need some character so no argument to translate() is NULL). Then - the substring is made up of all-digits if and only if the result of this translation is null (an empty string). And you still check to see if the last character is 1 or 6.
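The same idea can be sketched in Python to see how the check behaves on the sample strings from the question. The names are illustrative, and the explicit length check is an assumption added here to enforce the "made up of 16 digits" condition stated above.

```python
# Deletion table: every ASCII digit is removed, everything else is kept.
DELETE_DIGITS = str.maketrans("", "", "0123456789")


def is_all_digits(fragment, expected_length=16):
    """True when fragment has the expected length and contains only 0-9."""
    return len(fragment) == expected_length and fragment.translate(DELETE_DIGITS) == ""


def matches_condition(text):
    # SUBSTR(text, 18, 16) is 1-based, so it corresponds to text[17:33] here.
    fragment = text[17:33]
    # A decimal number equals 1 modulo 5 exactly when its last digit is 1 or 6.
    return is_all_digits(fragment) and fragment[-1] in ("1", "6")


samples = [
    "#P,351811040302663;E,101;D,07112018134733,07012018144712;G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0",
    "$OK$OK$OK$OK$OK$OK$OK",
    "12I45678912B456781234567812345671",
]
print([s for s in samples if matches_condition(s)])
```

No string is ever converted to a number, so there is nothing that could raise a conversion error.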
Note that I didn't use any regular expressions; this is important if you have a large amount of data, since standard string functions like translate() are much faster than regular expression functions. Also, everything is based on character data type - no math functions like mod(). (Same as in Thorsten's answer, which was only missing the first part of what I suggested here - checking to see that the entire substring is made up of digits.)
SELECT text
FROM book
WHERE lyrics IS NULL
AND case when regexp_like(SUBSTR(text,18,16),'^[0-9]+$') then MOD(TO_NUMBER(SUBSTR(text,18,16)),5)
else null
end = 1;
I've been working on this for days and can't seem to work it out. Basically, I need to return the digits from a field before a forward slash, e.g. if the field was 1234/TEXT I want to return 1234. I can't just use LEFT(fieldname, 4) as the number of digits varies, e.g. 12345/TEXT, so it needs to be anything left of the forward slash. In the world of MS Access, it is something like this, and it works:
Left(TABLE!FIELD,InStr(1,TABLE!FIELD,"/")-1)
However, how do I convert this to be used in an IBM\DB2 system? The DB2 SQL seems somewhat different to 'normal' SQL.
Thanks!
Rather than INSTR, maybe LOCATE
LOCATE(char, string)
char is the search term
string is the string being searched
You can achieve this by combining LOCATE with SUBSTR;
Locate information
Substring information
Cheat sheet (for this example);
SUBSTRING('FIELD','START POSITION', 'LENGTH')
LOCATE('SEARCH STRING', 'SOURCE STRING')
SUBSTRING lets you retrieve specific characters from a string, i.e.;
AFIELD = 'Hello'
SUBSTRING(AFIELD,4,2)
Result = 'lo' (position 4 and 5 of Hello)
LOCATE returns the position of the first character of the search string it finds as a number, i.e.;
AFIELD = 'Hello'
LOCATE('ello', AFIELD)
Result = 2 (it starts at position 2)
So you can combine these to do what you want, example;
XTABLE has 1 column called ACOL with the following values in it;
123467/ABCD
1321/ABDD
1123467/ABCD
To just retrieve the numbers;
SELECT SUBSTRING(ACOL,1, LOCATE('/',ACOL)-1)
FROM XRDK/XTABLE
Result;
123467
1321
1123467
What are we doing?
SUBSTRING(
ACOL,
1,
LOCATE('/',ACOL)-1
)
SUBSTRING(
Field ACOL,
Starting at position 1,
Length; using locate set this to where I find a '/' and subtract 1 from the
resulting position (without the -1 you'd have the / on the end)
)
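To check the position arithmetic without a DB2 system at hand, the two functions can be imitated in Python. This is only a sketch with 1-based positions like the SQL functions; the helper names are made up, and returning None when there is no slash is an assumption, since the SQL expression would ask for a length of -1 in that case.

```python
def locate(search, source):
    """1-based position of search in source, or 0 when it is absent (like LOCATE)."""
    return source.find(search) + 1


def substring(source, start, length):
    """1-based substring of the given length (like SUBSTRING(source, start, length))."""
    return source[start - 1:start - 1 + length]


def digits_before_slash(value):
    position = locate("/", value)
    if position == 0:
        # No slash found: position - 1 would be -1, so bail out explicitly.
        return None
    return substring(value, 1, position - 1)


for value in ["123467/ABCD", "1321/ABDD", "1123467/ABCD"]:
    print(digits_before_slash(value))
```

If the column can contain values without a slash, the query needs a similar guard, for example a CASE around the expression.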
Try this
SELECT SUBSTRING(CAST (ROUND(COLUMN,2) AS DECIMAL(6,2)), 0, locate('/',CAST (ROUND(COLUMN,2) AS DECIMAL(6,2))))
FROM TABLE
If I have a number (such as 88) and I want to perform a LIKE query in Rails on a primary ID column to return all records that contain that number at the end of the ID (e.g. 88, 288, etc.), how would I do that? Here's the code to generate the result, which works fine in SQLite:
@item = Item.where("id like ?", "88").all
In PostgreSQL, I'm running into this error:
PG::Error: ERROR: operator does not exist: integer ~~ unknown
How do I do this? I've tried converting the number to a string, but that doesn't seem to work either.
Based on Erwin's Answer:
This is a very old question, but in case someone needs it, there is one very simple answer, using ::text cast:
Item.where("(id::text LIKE ?)", "%#{numeric_variable}").all
This way, you find the IDs that end with the number.
Use % wildcard to the left only if you want the number to be at the end of the string.
Use % wildcard to the right also, if you want the number to be anywhere in the string.
Simple case
LIKE is for string/text types. Since your primary key is an integer, you should use a mathematical operation instead.
Use modulo to get the remainder of the id value, when divided by 100.
Item.where("id % 100 = 88")
This will return Item records whose id column ends with 88
1288
1488
1238872388
862388
etc...
Match against arbitrary set of final two digits
If you are going to do this dynamically (e.g. match against an arbitrary set of two digits, but you know it will always be two digits), you could do something like:
Item.where(["id % 100 = ?", last_two_digits])
Match against any set or number of final digits
If you wanted to match an arbitrary number of digits, so long as they were always the final digits (as opposed to digits appearing elsewhere in the id field), you could add a custom method on your model. Something like:
class Item < ActiveRecord::Base
  ...
  # Class method, so it can be called as Item.find_by_final_digits(...)
  def self.find_by_final_digits(num_digits, digit_pattern)
    # Where `num_digits` is the number of final digits to match
    # and `digit_pattern` is the set of final digits you're looking for
    Item.where(["id % ? = ?", 10**num_digits, digit_pattern])
  end
  ...
end
Using this method, you could find id values ending in 88, with:
Item.find_by_final_digits(2, 88)
Match against a range of final digits, of any length
Let's say you wanted to find all id values that end with digits between 09 and 12, for whatever reason. Maybe they represent some special range of codes you're looking up. To do this you could do another custom method to use Postgres' BETWEEN to find on a range.
def self.find_by_final_digit_range(num_digits, start_of_range, end_of_range)
  Item.where(["id % ? BETWEEN ? AND ?", 10**num_digits, start_of_range, end_of_range])
end
...and could be called using:
Item.find_by_final_digit_range(2, 9, 12)
...of course, this is all just a little crazy, and probably overkill.
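The arithmetic behind these queries can be checked on plain integers. The following Python sketch applies the same modulo tests outside the database; the function names and sample ids are made up for illustration.

```python
def ends_with_digits(record_id, num_digits, digit_pattern):
    """Same test as `id % 10**num_digits = digit_pattern`."""
    return record_id % 10 ** num_digits == digit_pattern


def ends_in_range(record_id, num_digits, start_of_range, end_of_range):
    """Same test as `id % 10**num_digits BETWEEN start_of_range AND end_of_range`."""
    return start_of_range <= record_id % 10 ** num_digits <= end_of_range


ids = [88, 288, 1288, 881, 109, 512, 1238872388]  # made-up sample ids
print([i for i in ids if ends_with_digits(i, 2, 88)])
print([i for i in ids if ends_in_range(i, 2, 9, 12)])
```

One caveat of the numeric approach: leading zeros in the pattern are not significant, so an id of 8 also satisfies a two-digit match against 08.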
The LIKE operator is for string types only.
Use the modulo operator % for what you are trying to do:
@item = Item.where("(id % 100) = ?", "88").all
I doubt it "works" in SQLite, even though it coerces the numeric types to strings. Without leading % the pattern just won't work.
-> sqlfiddle demo
Cast to text and use LIKE as you intended for arbitrary length:
@item = Item.where("id::text LIKE ('%'::text || ?)", "12345").all
Or, mathematically:
@item = Item.where("id % (10 ^ length(?))::bigint = ?", "12345", 12345).all
The LIKE operator does not work with numeric types, and id is a numeric type, so you can convert it to text with concat first:
SELECT * FROM TABLE_NAME WHERE concat("id") LIKE '%ID%'
I have two strings. I would like to know how many leading characters are the same in both strings.
For example, take 'xyzabc' and 'xyzadh'. I would like to know if there is a function that gives the index at which the similarity breaks. In this case it would be 4, because the strings are the same up to 'xyza'. If the strings are 'xyzabc' and 'xymabc', then the result should be 2.
I would like to use it as select func('xyzabc', 'xyzwer'); to get the required answer. Kindly let me know if such a function exists in SQL.
Thanks a lot in advance!!!
One way to do this is with regular expressions. Here is the idea. You would use regexp_substr(). The regular expression would, in your example, be 'xyzwer|xyzwe|xyzw|xyz|xy|x'. You would then measure the length of the matching substring. In other words, you need to convert one of the strings to a regular expression.
An alternative is to use a giant case statement:
(case when left(str1, 10) = left(str2, 10) then 10
when left(str1, 9) = left(str2, 9) then 9
...
when left(str1, 1) = left(str2, 1) then 1
else 0
end)
This assumes that the longest string has 10 characters.
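To verify the expected results independently of any SQL dialect, the comparison can be sketched in Python. This only shows the logic the CASE expression approximates (compare character by character until the first difference); the function name is made up.

```python
def common_prefix_length(first, second):
    """Number of leading characters that are identical in both strings."""
    count = 0
    for a, b in zip(first, second):
        if a != b:
            break
        count += 1
    return count


print(common_prefix_length("xyzabc", "xyzadh"))
print(common_prefix_length("xyzabc", "xymabc"))
```

Unlike the CASE expression, this has no fixed upper limit on the string length.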
I need to filter out junk data in a SQL table (SQL Server 2008). I need to identify these records and pull them out.
Char[0] = A..Z, a..z
Char[1] = 0..9
Char[2] = 0..9
Char[3] = 0..9
Char[4] = 0..9
{No blanks allowed}
Basically, a clean record will look like this:
T1234, U2468, K123, P50054 (4 record examples)
Junk data looks like this:
T12.., .T12, MARK, TP1, SP2, BFGL, BFPL (7 record examples)
Can someone please assist with a SQL query to do a LEFT and RIGHT method and extract those characters, and do a LIKE IN or something?
A function would be great though!
The following should work in a few different systems:
SELECT *
FROM TheTable
WHERE Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9]%'
AND Data NOT LIKE '% %'
This approach will indeed match P2343, P23423JUNK, and other similar text but requires that the format is A0000*.
Now, if the OP implies a format of 1st position is a character and all succeeding positions are numeric, as in A0+, then use the following (in SQL Server and a good deal of other database systems):
SELECT *
FROM TheTable
WHERE SUBSTRING(Data, 1, 1) LIKE '[A-Za-z]'
AND SUBSTRING(Data, 2, LEN(Data) - 1) NOT LIKE '%[^0-9]%'
AND LEN(Data) >= 5
To incorporate this into a SQL Server 2008 function, since this appears to be what you'd like most, you can write:
CREATE FUNCTION ufn_IsProperFormat(@Data VARCHAR(50))
RETURNS BIT
AS
BEGIN
    RETURN
        CASE
            WHEN SUBSTRING(@Data, 1, 1) LIKE '[A-Za-z]'
                AND SUBSTRING(@Data, 2, LEN(@Data) - 1) NOT LIKE '%[^0-9]%'
                AND LEN(@Data) >= 5 THEN 1
            ELSE 0
        END
END
...and call into it like so:
SELECT *
FROM TheTable
WHERE dbo.ufn_IsProperFormat(Data) = 1
...this query needs to change for Oracle queries because Oracle doesn't appear to support bracket notation in LIKE clauses:
SELECT *
FROM TheTable
WHERE REGEXP_LIKE(Data, '^[A-Za-z]\d{4,}$')
This is the expansion gbn is doing in his answer, but these versions allow for varying string lengths without the OR conditions.
EDIT: Updated to support examples in SQL Server and Oracle for ensuring the format A0+, so that A1324, A2342388, and P2342 match but A2342JUNK and A234 do not.
The Oracle REGEXP_LIKE code was borrowed from Mark's post but updated to support 4 or more numeric digits.
Added a custom SQL Server 2008 approach which implements these techniques.
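The A0+ rule (one letter followed by four or more digits and nothing else) can be tried against the examples from this answer with a small Python sketch. It illustrates the rule only and is not part of the SQL Server or Oracle code; the names are made up.

```python
import re

# One ASCII letter followed by four or more ASCII digits, nothing else.
PROPER_FORMAT = re.compile(r"[A-Za-z][0-9]{4,}")


def is_proper_format(data):
    """True when the whole value is a letter followed by at least four digits."""
    return PROPER_FORMAT.fullmatch(data) is not None


for value in ["A1324", "A2342388", "P2342", "A2342JUNK", "A234"]:
    print(value, is_proper_format(value))
```

Under this rule, K123 from the question's list of clean records is rejected because it has only three digits; lower the minimum if such values should pass.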
Depends on your database. Many have regex functions (note examples not tested so check)
e.g. Oracle
SELECT x
FROM table
WHERE REGEXP_LIKE(x, '^[A-Za-z][[:digit:]]{4}$')
Sybase uses LIKE
Given that you're allowing between 3 and 6 digits for the number in your examples then it's probably better to use the ISNUMERIC() function on the 2nd character onwards:
SELECT *
FROM TheTable
-- start with a letter
WHERE Data LIKE '[A-Za-z]%'
-- everything from 2nd character onwards is a number
AND ISNUMERIC( SUBSTRING( Data, 2, 50 ) ) = 1
-- number doesn't have a decimal place
AND Data NOT LIKE '%.%'
For more information look at the ISNUMERIC function on MSDN.
Also note that:
I've limited the 2nd part with the number to 50 characters maximum, change this to suit your needs.
Strictly speaking you should check for currency symbols etc, as ISNUMERIC allows them, as well as +/- and some others
A better option might be to create a function that checks that each character after the first is between 0 and 9 (or between 48 and 57 if you're comparing ASCII codes).
You can't use Regular Expressions in SQL Server, so you have to use OR. Correcting David Andres' answer...
WHERE
(
Data LIKE '[A-Za-z][0-9][0-9][0-9]'
OR
Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9]'
OR
Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9][0-9]'
)
David's answer allows "D1234junk" through
You also only need "[A-Z]" if you don't have case sensitivity