Oracle INSTR replacement in SQLite

I'm currently porting part of an application from an Oracle to a SQLite backend (Java, using plain JDBC). One Oracle-specific feature often being used is the INSTR function with three arguments:
INSTR(<string>, <search-string>, <position>)
This function searches within a string for a search string starting from a certain position. The third parameter can either be positive or negative. If it's negative, the search works backwards starting at the end of the string.
This function isn't available in SQLite and the best I could come up with is an alternative by nesting some other functions:
If <position> is positive:
LENGTH(<string>) - LENGTH(SUBSTR(SUBSTR(<string>,
<position>), STRPOS(SUBSTR(<string>, <position>),
<search-string>) + 1))
If <position> is negative (in our case -1 is the only negative value being used):
LENGTH(<string>) - LENGTH(REPLACE(<string>,
RTRIM(<string>, REPLACE(<string>, <search-string>,
'')), ''))
This seems to give the desired result, but you can see why I'm not really in favor of this approach, especially because the original code uses INSTR a lot and nests it as well. It becomes a maintenance disaster afterwards.
Is there a more elegant approach or could I be missing some other native solution for what seems to be a rather trivial task?

SQL
CASE WHEN position = 0
THEN INSTR(string, substring)
WHEN position > 0
THEN INSTR(SUBSTR(string, position), substring) + position - 1
WHEN position < 0
THEN LENGTH(RTRIM(REPLACE(string,
substring,
REPLACE(HEX(ZEROBLOB(LENGTH(substring))),
'00',
'¬')),
string)) - LENGTH(substring) + 1
END
It assumes the ¬ character won't appear in the string being searched (in the unlikely event that this assumption is false, it can of course be changed to a different rarely used character).
SQLFiddle Demo
Some worked examples here: http://sqlfiddle.com/#!5/7e40f9/5
Credits
The positive position method was adapted from Tim Biegeleisen's answer. (But a zero value needs to be handled separately).
The negative position method used the method described in this question as a starting point.
The creation of a string consisting of a character repeated n times was taken from this simplified answer.
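To sanity-check the CASE expression above, here is a minimal Python sketch that runs it in an in-memory SQLite database through the standard sqlite3 module. The helper name instr_case and the bound parameters :s, :sub and :pos are my own choices for this sketch, not part of the answer. One thing it brings out: when the substring is absent, the positive and negative branches do not return 0, so code that depends on the not-found case would need an extra guard.

```python
import sqlite3

# The CASE expression from the answer, with :s, :sub and :pos as bound
# parameters (hypothetical names chosen for this sketch).
INSTR_CASE = """
SELECT CASE
         WHEN :pos = 0 THEN INSTR(:s, :sub)
         WHEN :pos > 0 THEN INSTR(SUBSTR(:s, :pos), :sub) + :pos - 1
         WHEN :pos < 0 THEN LENGTH(RTRIM(REPLACE(:s,
                                                 :sub,
                                                 REPLACE(HEX(ZEROBLOB(LENGTH(:sub))),
                                                         '00',
                                                         '¬')),
                                         :s)) - LENGTH(:sub) + 1
       END
"""

conn = sqlite3.connect(":memory:")


def instr_case(s, sub, pos):
    """Evaluate the CASE expression in SQLite for one set of arguments."""
    params = {"s": s, "sub": sub, "pos": pos}
    return conn.execute(INSTR_CASE, params).fetchone()[0]


if __name__ == "__main__":
    text = "catsondogsonhats"
    print(instr_case(text, "on", 1))    # forward search from the start
    print(instr_case(text, "on", 7))    # forward search from position 7
    print(instr_case(text, "on", -1))   # last occurrence
    print(instr_case(text, "zz", 7))    # absent substring, positive branch
    print(instr_case(text, "zz", -1))   # absent substring, negative branch
```

Wrapping the positive branch in a check such as "INSTR(...) = 0" and the negative branch in a check that the substring occurs at all would be one way to return 0 for misses, at the cost of an even longer expression.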

Actually, SQLite does support an INSTR function, but it has no third parameter, so it always searches from the very beginning of the string.
We can work around this by passing a substring to INSTR and then adding the substring's offset to the position found.
So, as an example, Oracle's call:
INSTR('catsondogsonhats', 'on', 7)
which would return 11, would become:
INSTR(SUBSTR('catsondogsonhats', 7), 'on') + 6
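If the database driver lets you register application-defined SQL functions, the nesting problem can also be sidestepped by defining the three-argument form once. The sketch below does this with Python's sqlite3 module and its create_function method; the function name ora_instr is hypothetical, the backward-search logic reflects my reading of Oracle's behaviour for negative positions (including the assumption that position 0 yields 0), and whether your JDBC driver offers a comparable hook is something you would need to check.

```python
import sqlite3


def ora_instr(s, sub, pos):
    """Emulate three-argument INSTR; semantics are an assumption, see lead-in."""
    if s is None or sub is None or pos is None:
        return None
    if pos > 0:
        # Forward search starting at the 1-based position pos.
        return s.find(sub, pos - 1) + 1
    if pos < 0:
        # Latest allowed 0-based start of a match, counted back from the end.
        last_start = len(s) + pos
        if last_start < 0:
            return 0
        # rfind returns -1 on a miss, so adding 1 maps a miss to 0.
        return s.rfind(sub, 0, last_start + len(sub)) + 1
    return 0  # position 0 is treated as "not found" in this sketch


conn = sqlite3.connect(":memory:")
# Register under a distinct name so SQLite's built-in two-argument INSTR stays intact.
conn.create_function("ora_instr", 3, ora_instr)


def query(sql):
    """Run a single-value SELECT and return that value."""
    return conn.execute(sql).fetchone()[0]


if __name__ == "__main__":
    print(query("SELECT ora_instr('catsondogsonhats', 'on', 7)"))
    print(query("SELECT ora_instr('catsondogsonhats', 'on', -1)"))
```

With this in place, nested calls such as ora_instr(x, 'on', ora_instr(x, 'on', 1) + 1) stay close to the original Oracle text.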

Regex match first number if it does not appear at the end

I am currently facing a regex problem to which I apparently cannot find an answer.
My regex is embedded in a Teradata SQL query of the form:
REGEXP_SUBSTR(column, 'regex_pattern')
I want to find the first appearance of any number except if it appears at the end of the string.
For Example:
"YEL2X30" -> "2"
"YEL19XYZ05" -> "19"
"YELLOW05" -> ""
I tried it with '[0-9]+(?!$)/' but this returns me a blank String always.
Thanks in Advance!
Shot in the dark here, since I'm unfamiliar with Teradata and the SQL functionality it supports. However, reading the docs on the REGEXP_SUBSTR() function, it seems you may want to use the optional 3rd and 4th arguments along with a slightly different regular expression:
[0-9]+(?![0-9]|$)
Meaning: 1+ Digits that are not followed by either the end of the string or another digit.
I'd believe the following syntax may work now to retrieve the 1st appearance of any number from the matching results:
REGEXP_SUBSTR(column, '[0-9]+(?![0-9]|$)', 1, 1)
The 3rd parameter states the position in the source string at which the search starts, whereas the 4th selects which match to return when there are several (that is how I read the docs). For example, abc123def456ghi789 would return 123.
Fiddling around in online IDEs gave me this:
CREATE TABLE TBL (TST varchar(100));
INSERT INTO TBL values ('YEL2X30'), ('YEL19XYZ05'), ('YELLOW05'), ('abc123def456ghi789');
SELECT REGEXP_SUBSTR(TST, '[0-9]+(?![0-9]|$)', 1, 1) as 'RESULTS' FROM TBL;
Resulted in:
RESULTS
2
19
NULL
123
NOTE: I also noticed that leaving out the 3rd and 4th parameter made no difference since they will default back to 1 without explicitly mentioning them. I tested this over here.
Possibly the simplest way is to look for digits followed by a non-digit. Then keep all the digits:
regexp_substr(regexp_substr(column, '[0-9]+[^0-9]'), '[0-9]+')
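Both suggestions can be tried outside the database. The following is a small sketch using Python's re module, under the assumption that its lookahead syntax behaves like the engine behind REGEXP_SUBSTR for these patterns; the function names are made up for illustration.

```python
import re

# Digits that are followed neither by another digit nor by the end of the string.
NOT_AT_END = re.compile(r"[0-9]+(?![0-9]|$)")


def first_number_not_at_end(text):
    """Lookahead variant: return the first qualifying run of digits, or None."""
    match = NOT_AT_END.search(text)
    return match.group(0) if match else None


def first_number_before_non_digit(text):
    """Nested variant: find digits plus one non-digit, then keep only the digits."""
    outer = re.search(r"[0-9]+[^0-9]", text)
    if outer is None:
        return None
    return re.search(r"[0-9]+", outer.group(0)).group(0)


if __name__ == "__main__":
    for sample in ["YEL2X30", "YEL19XYZ05", "YELLOW05", "abc123def456ghi789"]:
        print(sample, first_number_not_at_end(sample), first_number_before_non_digit(sample))
```

The extra [0-9] inside the lookahead matters: without it, the engine can backtrack and return the leading part of a trailing number, for example 0 out of 05.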

How can I set a minimal amount of numbers after the decimal dot (0.9=>0.90)

I'm using google bigquery, and a column has values I want to round. If I do, and the rounded value ends in a zero, the zero is not displayed.
I've tried the function FORMAT, which apparently has some .number function, but I have no idea how to use it. Whenever I include any 2 things separated by a comma inside its brackets, it complains that it only takes 1 value.
You would use FORMAT() with the precision specifier. For four decimal places always -- including zeros:
select format('%.4f', 1.23)
If the BQ documentation does not answer your questions, note that the function seems to be inspired by the classic C printf()/sprintf() functions.
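Since the precision specifier follows printf conventions, its effect can be previewed in any language with printf-style formatting. Here is a minimal Python sketch; the helper name fixed_decimals is hypothetical, and the corresponding BigQuery call for two places would presumably be FORMAT('%.2f', 0.9).

```python
def fixed_decimals(value, places=2):
    """Format value with exactly `places` digits after the decimal point."""
    # "%.*f" takes the precision from the argument tuple, as in C's printf.
    return "%.*f" % (places, value)


if __name__ == "__main__":
    print(fixed_decimals(0.9))       # two decimals, trailing zero kept
    print(fixed_decimals(1.23, 4))   # four decimals, padded with zeros
```

The result is a string, so it is best applied at the very end, after any arithmetic or rounding.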
I don't know whether BigQuery has a better way (I have never used it before), but I guess this will fix your problem, since I just tried it in their console.
Cast your float to a string and then check if your last digit is a 0. In case it's not add it:
SELECT case when RIGHT(cast(0.9 as string), 1) <> '0' then cast(0.9 as string)+'0' else cast(0.9 as string) end as FormattedNumber

What is the XQuery equivalent of the ESQL COALESCE function?

I'm trying to convert WMB 7 mapping nodes to IIB 9 nodes. The auto-convert process turns some ESQL functions into XQuery functions.
Specifically, it turns the ESQL function
COALESCE (var0, var1)
(which returns the first non-null value, as in if var0 = null then var1 else var0) into
XQUERY (var0,var1)
Is it a correct conversion?
If it is, can someone provide a link to API? I couldn't find this on XQuery syntax and operators manuals.
XQuery is not an API, but a standard, and the full syntax can be found online: XQuery 1.0 and XQuery 3.0 (there is no 2.0). You'll also find many manuals, tutorials etc.
XQuery relies on XPath, which is even more widely used than XQuery and can be found in libraries for almost every general-purpose language.
Your expression is correct XQuery, in that it considers everything a sequence, and the comma concatenates (and flattens) two sequences.
XPath does not know NULL, but it knows xsi:nil and (), the latter being the empty sequence. An empty sequence is removed from the result.
I am not sure what XQuery processor is used underneath, but the correct expression should be ($var0, $var1)[1] (see note 2), which works the same way as your COALESCE operation (see note 1). In XPath and XQuery, variables are referenced with the $ sign. The number of variables or expressions separated by the comma is unbounded. If all are the empty sequence (null), the result is the empty sequence.
Without [1], it will return all items that are non-null and discard the rest. You can use another index, like [3] to get the third non-null value. If no such value exists, it will return null (empty sequence).
1 COALESCE does not behave exactly the way you originally described it. I believe it behaves like if var0 == null then var1 else var0, that is, it selects the first non-null value (I've updated the OP).
2 As Florent has explained in the comments, a warning about this expression is in order. If you have $var1 := (1, 2) and $var2 := (3, 4), the expression ($var1, $var2)[1] will return 1, not (1, 2), because sequences cannot contain subsequences, and indexing a sequence with [x] returns the xth value of the flattened sequence. You can safeguard your expression with (zero-or-one($var1), zero-or-one($var2))[1].
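The flattening behaviour described in note 2 can be modelled outside XQuery. The sketch below is only a Python analogy, not XQuery itself: None stands in for the empty sequence, lists and tuples stand in for sequences, and the function names are invented for illustration.

```python
def xquery_sequence(*operands):
    """Model the comma operator: concatenate the operands and flatten them."""
    flat = []
    for operand in operands:
        if operand is None:
            continue  # None plays the role of the empty sequence ()
        if isinstance(operand, (list, tuple)):
            flat.extend(xquery_sequence(*operand))  # no nested sequences
        else:
            flat.append(operand)
    return flat


def coalesce(*operands):
    """Model ($var0, $var1, ...)[1]: first item of the flattened sequence."""
    flat = xquery_sequence(*operands)
    return flat[0] if flat else None


if __name__ == "__main__":
    print(coalesce(None, "fallback"))   # first operand empty, second wins
    print(coalesce((1, 2), (3, 4)))     # flattening: a single item, not (1, 2)
```

The second call shows why the zero-or-one() guard is worth considering when an operand may hold more than one item.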

How do SQL/SQLite wildcards work? LIKE operator

How do wildcards in SQLite work, or how does the LIKE operator match?
For example, let's say:
1: LIKE('s%s%', 's12s12')
2: LIKE('asdaska', '%sk%')
In the 1st example, what does % match after the 1st s, and how does it decide whether to keep matching % or to match the s after %?
In the 2nd example, if the first s in the string is matched first, then FALSE would be returned.
Both examples return TRUE. From my programming knowledge, my guess is that LIKE works something like a recursive function: when two possibilities appear, the function calls itself with two different parameters and ORs the results, so if one call returns true, the calling function returns true immediately. If that is so, then the LIKE operator is quite slow to use on large DBs.
P.S. There is one more wildcard, '_', which matches exactly one character.
I couldn't find any detailed documentation of the LIKE operator.
% matches zero or more characters, _ matches exactly one.
Your first pattern 's%s%' would match 'ss', 's1s', 's1111s', 'ss1111', etc.
However, if you wrote 's_s_', it would match 's1s1', but none of the above.
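To make the matching rules concrete, here is a minimal Python sketch of a backtracking matcher for % and _, checked against SQLite's own LIKE operator through the sqlite3 module. It illustrates the idea from the question only; it is not SQLite's actual implementation, it ignores ESCAPE clauses, and the names like_match and sqlite_like are invented for this example.

```python
import sqlite3


def like_match(pattern, text):
    """Return True if text matches pattern, where % is any run and _ is one char."""
    p, t = pattern.lower(), text.lower()  # LIKE ignores ASCII case by default

    def match(i, j):
        if i == len(p):
            return j == len(t)  # pattern used up: text must be used up too
        if p[i] == "%":
            # Try every possible length for the run that % absorbs.
            return any(match(i + 1, k) for k in range(j, len(t) + 1))
        if j < len(t) and (p[i] == "_" or p[i] == t[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


conn = sqlite3.connect(":memory:")


def sqlite_like(pattern, text):
    """Ask SQLite itself whether text LIKE pattern."""
    return bool(conn.execute("SELECT ? LIKE ?", (text, pattern)).fetchone()[0])


if __name__ == "__main__":
    for pattern, text in [("s%s%", "s12s12"), ("%sk%", "asdaska"), ("s_s_", "ss")]:
        print(pattern, text, like_match(pattern, text), sqlite_like(pattern, text))
```

When % has to try several split points the matcher backtracks, which is the behaviour the question guesses at; for short patterns the number of attempts stays small.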

What is taking IsNumeric() so long in .NET?

Was doing some benchmarking with IsNumeric today and compared it to the following function:
Private Function IsNumeric(ByVal str As String) As Boolean
If String.IsNullOrEmpty(Str) Then Return False
Dim c As Char
For i As Integer = 0 To Str.Length - 1
c = Str(i)
If Not Char.IsNumber(c) Then Return False
Next
Return True
End Function
I was pretty surprised with the result.
With a numeric value, this one was about 8-10 times faster than the regular IsNumeric(), and with an empty or non-numeric value it was 1000-1500 times faster.
What is taking IsNumeric so long? Is there something else going on under the hood that I should consider before replacing it with the function above?
I use IsNumeric in about 50 different places all over my site, mostly for validation of forms and query strings.
Where is your check for locale-specific delimiters and decimal places? Negation? Exponential notation?
You see, your function is only a tiny subset of what numeric strings can be.
1,000,000.00
1,5E59
-123456789
You're missing all of these.
A single char can only contain 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
but a full string may contain any of the following:
1
1234
12.34
-1234
-12.34
0001
so IsNumeric ought to be somewhat slower.
There are also cultural and internationalization issues. For example, if you are processing a Unicode string, does IsNumeric also handle numbers written in other languages?
Generally speaking, I would avoid duplicating or replacing language features like this, especially in a language like VB, which is bound to be implementing many use cases for you. At the very least, I would name your method something distinct so that it is not confusing to other developers.
The speed difference is bound to come from the fact that your code is doing far less work than the VB language feature.
I would use Integer.TryParse() or Double.TryParse() depending on the data type you are using, since both of these account for regional differences.
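The gap between a digits-only check and a real parse can be shown in a few lines. This is a Python sketch of the idea rather than VB code: digits_only mirrors the loop from the question, parses_as_number plays the role of a TryParse-style call, and both names are hypothetical. Python's float() is not locale-aware, so grouped input such as 1,000,000.00 is rejected here.

```python
def digits_only(text):
    """Mirror the hand-written check: non-empty and every character is a digit."""
    return bool(text) and all(ch.isdigit() for ch in text)


def parses_as_number(text):
    """TryParse-style check: attempt the conversion and report success."""
    try:
        float(text)
    except (TypeError, ValueError):
        return False
    return True


if __name__ == "__main__":
    for sample in ["1234", "0001", "-1234", "12.34", "1.5E59", "abc"]:
        print(sample, digits_only(sample), parses_as_number(sample))
```

Signed, fractional and exponent forms pass the parse but fail the digits-only loop, which is the extra work the answers above attribute to IsNumeric.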