SUBSTRING() function behaving differently in SELECT and WHERE clause? [duplicate]


This question already has an answer here:
Why is it that a change in query plan suddenly seems to break a query
(1 answer)
Closed 4 months ago.
This post was edited and submitted for review 4 months ago and failed to reopen the post:
Original close reason(s) were not resolved
I have a table with the following column:
server
SLQ-ABCD-001
SLQ-ABCA-002
SLP-DBMSA-003
SLD-ABC-004
SLS-123456-005
I would like to be able to filter the rows based on a substring of this column, specifically, the string between those hyphens; there will always be three characters before the first hyphen and three characters after the second hyphen.
Here's what I have tried:
AND substring(server, 5, (len(server)-8)) in ('ABC', 'DBMSA')
AND substring(server, charindex('-', server)+1,(charindex('-',server, charindex('-', server)+1)-(charindex('-', server)+1))) in ('ABC', 'DBMSA')
Both of these work perfectly fine (the expected substrings are returned) when used in the SELECT clause, but give the error below when used in the WHERE clause:
Invalid length parameter passed to the LEFT or SUBSTRING function.
I am not able to use the simpler approach, AND server like '%ABC%', because I have more than one combination of characters to look for, and because that comma-separated list will be dynamically parsed into the query for this use case.
Is there any way this type of filter can be achieved in SQL Server?
EDIT
After @DaleK helped me realize that I might have some bad data (server names with length < 8), which I could have missed when testing the expression in the SELECT clause because other filters in my WHERE clause were hiding it, here's how I managed to get around it:
SELECT *
from
(SELECT *
from my_original_table
where
--all my other filters that helped me eliminate the bad data
) my_filtered_table
where substring(server, 5, (len(server)-8)) in ('ABC', 'DBMSA');
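To see why short names matter, here is a minimal Python sketch (not T-SQL) of the length arithmetic in substring(server, 5, len(server) - 8). The name "X-1" is made-up bad data and safe_middle is a hypothetical helper; its guard plays the role that a CASE WHEN len(server) >= 8 wrapper would play in SQL Server.

```python
from typing import Optional


def middle_length(server: str) -> int:
    """Length argument the question passes to SUBSTRING: len(server) - 8."""
    return len(server) - 8


def safe_middle(server: str) -> Optional[str]:
    """Return the text between the hyphens, or None when the name is too short."""
    n = middle_length(server)
    if n < 0:
        # A negative length is what triggers "Invalid length parameter" in SQL Server.
        return None
    # SUBSTRING is 1-based, so start position 5 is index 4 in Python.
    return server[4:4 + n]


# "X-1" is hypothetical bad data, shorter than the 8 characters the formula assumes.
for name in ["SLQ-ABCD-001", "SLP-DBMSA-003", "X-1"]:
    print(name, middle_length(name), safe_middle(name))
```

As far as I know, SQL Server is free to evaluate WHERE predicates in whatever order the plan chooses, so a guard inside the expression itself is likely more robust than relying on another filter, or a derived table, to remove the short rows first.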
As for the "question being duplicate" part, I think the error in that question is encountered in the SELECT statement, whereas in my case the expression worked fine in the SELECT statement and only errored when used in the WHERE clause.
As for a solution, the one provided by @Isolated seems to work perfectly fine!

One simpler approach you can try is the (often misused) parsename function:
Example being
with sampledata as (
select * from (
values('SLQ-ABCD-001'),('SLQ-ABCA-002'),('SLP-DBMSA-003'),('SLD-ABC-004'),('SLS-123456-005')
)x([server])
)
select [server]
from sampledata
cross apply(values(Replace([server], '-','.')))v(v)
where ParseName(v,2) in ('ABC', 'DBMSA');
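As a rough illustration of what the query above does, here is a Python sketch that mimics the relevant part of PARSENAME: split on dots, count parts from the right, and give up when there are more than four parts. parse_name is a hypothetical stand-in written for this example, and it ignores PARSENAME's handling of brackets and quoted identifiers.

```python
from typing import Optional


def parse_name(value: str, part: int) -> Optional[str]:
    """Simplified PARSENAME: part 1 is the rightmost piece, part 4 the leftmost."""
    parts = value.split(".")
    if len(parts) > 4 or not 1 <= part <= 4:
        return None  # more than four parts is treated as "not an object name"
    if part > len(parts):
        return None  # the requested part does not exist
    return parts[-part]


servers = ["SLQ-ABCD-001", "SLQ-ABCA-002", "SLP-DBMSA-003", "SLD-ABC-004", "SLS-123456-005"]
wanted = {"ABC", "DBMSA"}

# Same idea as Replace([server], '-', '.') followed by ParseName(v, 2).
matches = [s for s in servers if parse_name(s.replace("-", "."), 2) in wanted]
print(matches)
```

The four-part limit is the main caveat: a server name that contains more than three hyphens would not be matched by this approach.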

No need for substring. You could nest left and right with len such as this:
with my_data as (
select 'SLQ-ABCD-001' as server union all
select 'SLQ-ABCA-002' union all
select 'SLP-DBMSA-003' union all
select 'SLD-ABC-004' union all
select 'SLS-123456-005'
)
select server
from my_data
where left(right(server, len(server) - 4), len(right(server, len(server) - 4))- 4) in ('ABC', 'DBMSA')
server
SLP-DBMSA-003
SLD-ABC-004
And left(right(server, len(server) - 4), len(right(server, len(server) - 4))- 4) works fine in the select clause too.

Related

Oracle SQL: Filtering rows with non-numeric characters

My question is very similar to this one: removing all the rows from a table with columns A and B, where some records include non-numeric characters (looking like '1234#5' or '1bbbb'). However, the solutions I read around don't seem to work for me. For example,
SELECT count(*) FROM tbl
--962060;
SELECT count(*)
FROM tbl
WHERE (REGEXP_like(A,'[^0-9]') OR REGEXP_like(B,'[^0-9]') ) ;
--17
SELECT count(*)
FROM tbl
WHERE (REGEXP_like(A,'[0-9]') and REGEXP_like(B,'[0-9]') )
;
--962060
From the 3rd query, I'd expect to see (962060-17)=962043. Why is it still 962060? An alternative query like this also gives the same answer:
SELECT count(*)
FROM tbl
WHERE (REGEXP_like(A,'[[:digit:]]')and REGEXP_like(B,'[[:digit:]]') )
;
--962060
Of course, I could bypass the problem by doing query1 minus query2, but I'd like to learn how to do that using regular expressions.

If you use a regexp, you should take into account that the pattern may match any part of the string. For your example, you need to specify that the whole string must contain only digits: ^ is the beginning of the string and $ is the end. You may also use \d, which matches a digit.
SELECT count(*)
FROM tbl
WHERE (REGEXP_like(A,'^[0-9]+$') and REGEXP_like(B,'^[0-9]+$') )
or
SELECT count(*)
FROM tbl
WHERE (REGEXP_like(A,'^\d+$') and REGEXP_like(B,'^\d+$') )
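The difference between the unanchored and the anchored pattern can be checked quickly outside Oracle. This is a small Python sketch using the re module (whose search semantics are similar to REGEXP_LIKE for these simple patterns); the sample values come from the question, and the two helper names are made up for the example.

```python
import re


def contains_digit(value: str) -> bool:
    """Unanchored pattern: true as soon as any single digit appears anywhere."""
    return re.search(r"[0-9]", value) is not None


def only_digits(value: str) -> bool:
    """Anchored pattern: every character from start to end must be a digit."""
    return re.search(r"^[0-9]+$", value) is not None


for value in ["12345", "1234#5", "1bbbb"]:
    print(value, contains_digit(value), only_digits(value))
```

Every sample value contains at least one digit, which is why the third query in the question still counts all 962060 rows.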

I know you specifically asked for a regex solution, but translate can solve this kind of question as well (and usually faster, because regexes use more processing power):
select count(1)
from tbl
where translate(a, 'x0123456789', 'x') is null
and translate(b, 'x0123456789', 'x') is null;
What this does: translate the characters 0123456789 to null, and if the result is null, then the input must have been all digits. The 'x' is just there because the third argument to translate can not be null.
Thought I should add this here, might be helpful to other readers.
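The same "delete the digits and see what is left" idea can be sketched in Python with str.translate. This is only an illustration of the technique, not Oracle code; is_all_digits is a hypothetical helper, and an empty Python string stands in for Oracle's NULL.

```python
# Translation table that deletes the characters 0-9 and leaves everything else alone.
DELETE_DIGITS = str.maketrans("", "", "0123456789")


def leftover(value: str) -> str:
    """Whatever remains after every digit has been removed."""
    return value.translate(DELETE_DIGITS)


def is_all_digits(value: str) -> bool:
    """True when nothing is left over, mirroring the 'is null' test in the query.

    Like the Oracle version, an empty input also passes, because there is
    nothing left over in that case either.
    """
    return leftover(value) == ""


for value in ["12345", "1234#5", "1bbbb"]:
    print(value, repr(leftover(value)), is_all_digits(value))
```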

How to substring records with variable length

I have a table which has a column with doc locations, such as AA/BB/CC/EE
I am trying to get only one of these parts, let's say just the CC part (which has variable length). So far I've tried the following:
SELECT RIGHT(doclocation,CHARINDEX('/',REVERSE(doclocation),0)-1)
FROM Table
WHERE doclocation LIKE '%CC %'
But I'm not getting the expected result

Use PARSENAME function like this,
DECLARE @s VARCHAR(100) = 'AA/BB/CC/EE'
SELECT PARSENAME(replace(@s, '/', '.'), 2)

This is painful to do in SQL Server. One method is a series of string operations. I find this simplest using outer apply (unless I need subqueries for a different reason):
select *
from t outer apply
(select stuff(t.doclocation, 1, patindex('%/%/%', t.doclocation), '') as doclocation2) t2 outer apply
(select left(t2.doclocation2, charindex('/', t2.doclocation2)) as cc
) t3;

The PARSENAME function is meant to return the specified part of an object name and should not be used for this purpose, as it only parses strings with at most four parts (see the SQL Server PARSENAME documentation at MSDN).
SQL Server 2016 has a new function, STRING_SPLIT, but if you don't use SQL Server 2016 you have to fall back on the solutions described here: How do I split a string so I can access item x?

The question is not clear I guess. Can you please specify which value you need? If you need the values after CC, then you can do the CHARINDEX on "CC". Also the query does not seem correct as the string you provided is "AA/BB/CC/EE" which does not have a space between it, but in the query you are searching for space WHERE doclocation LIKE '%CC %'
SELECT SUBSTRING(doclocation,CHARINDEX('CC',doclocation)+2,LEN(doclocation))
FROM Table
WHERE doclocation LIKE '%CC %'

SQL Like condition fails to run

I've been tasked to develop a query that behaves essentially like the following one:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*textToSearch*'
The textToSearch is a string which contains information about the condition in which a given device is tested (Voltage, Current, Frequency, etc) in the following format as an example:
[V:127][PF:1][F:50][I:65]
The objective is to recover a list of any and all tests performed at a voltage of 127 volts, so the SQL developed would look like the following:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*V:127*'
This works as intended, but there is a problem: because data was entered improperly, there are cases in which the textToSearch string looks like the following examples:
[V.127][PF:1][F:50][I:65]
[V.230][PF:1][F:50][I:65]
As you can see, my previous SQL transaction does not work as it does not meet the conditions.
If I try to do the following transaction with the objective of ignoring improper data format:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*V*127*'
The transaction is not successful and returns an error.
What am I doing wrong for this transaction not to work? Am I approaching this problem wrong?
I also see a potential problem with this transaction. Suppose there were a group of test conditions like the following:
[V.127][PF:1][F:50][I:127]
[V.230][PF:1][F:50][I:127]
Would it return the values of both points given that both meet the condition of the transaction stated above?
In conclusion, my questions are:
What is wrong with the LIKE '*V*127*' condition for it not to work?
What implications has working with this condition? Can it return more information than desired if I am not careful?
I hope it is clear what I am asking for, if it isn't, please point out what is not clear and I will try to clarify it

One choice is to look for any character between the "V" and the "127":
WHERE TestConditions LIKE '%V_127%'
Note that % is the wildcard for a string of any length and _ is the wildcard for a single character.
You can also use regular expressions:
WHERE regexp_like(TestConditions, 'V[.:]127')
Note that regular expressions match anywhere in the string, so wildcards at the beginning and end are not needed.
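To get a feel for how the two wildcards behave, including the over-matching risk raised in the question, here is a small Python sketch that translates a LIKE pattern into a regular expression. like_to_regex and sql_like are hypothetical helpers written for this example; they cover only % and _ and ignore ESCAPE clauses and any vendor-specific extensions.

```python
import re


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a LIKE pattern: % becomes '.*', _ becomes '.', the rest is literal."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def sql_like(value: str, pattern: str) -> bool:
    """LIKE must match the whole value, hence fullmatch rather than search."""
    return like_to_regex(pattern).fullmatch(value) is not None


rows = [
    "[V:127][PF:1][F:50][I:65]",
    "[V.127][PF:1][F:50][I:65]",
    "[V.230][PF:1][F:50][I:127]",
]
for row in rows:
    print(row, sql_like(row, "%V_127%"), sql_like(row, "%V%127%"))
```

The last row shows the difference: with % between V and 127 the 230-volt test is matched through its I:127 entry, while the single-character _ keeps the match tied to the voltage field.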

You could check for both cases (although this will decrease performance)
SELECT *
FROM tblTestData
WHERE (TestConditions LIKE '%V:127%' OR TestConditions LIKE '%V.127%')
It is better to clean the data in your database if only old records have this problem.

Oracle recommends regular expressions for this kind of condition. You could build a regular expression for your case:
WITH your_table AS (
SELECT '[V.127][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V.230][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V:127][PF:1][F:50][I:65]' text_to_search FROM dual
)
SELECT *
FROM your_table
WHERE REGEXP_LIKE(text_to_search,'\[V[.:]127\]','i')
Or you could use the good old LIKE operator. In this case, you need to know that:
% matches zero or more characters
_ matches only one character
So you should use an underscore to match the : or the .
WITH your_table AS (
SELECT '[V.127][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V.230][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V:127][PF:1][F:50][I:65]' text_to_search FROM dual
)
SELECT *
FROM your_table
WHERE text_to_search LIKE '%V_127%';

ORACLE substitute variable in IN statement

I have simple query like
SELECT * FROM temp t WHERE t.id IN (:IDs)
When executed, it prompts me (in Oracle SQL Developer) to enter a value for the IDs variable.
When I enter, for example, 169, everything runs smoothly, but when I try to enter multiple IDs, like 169,170,171, I get the error Invalid Number, even when putting the value into ''.
I'm used to working with MS SQL and MySQL, so this is a little confusing to me.
Does anyone have any suggestions?

The problem is the varying IN list. In SQL Developer, when you are prompted to enter the value for the bind variable, you are simply passing it as 169,170,171, which is not treated as a set of values.
What you could do is, have multiple binds -
SELECT * FROM temp t WHERE t.id IN (:ID1, :ID2)
When prompted, enter value for each bind.
UPDATE Alright, if the above solution looks ugly, then I would prefer the below solution -
WITH DATA AS
(SELECT to_number(trim(regexp_substr(:ids, '[^,]+', 1, LEVEL))) ids
FROM dual
CONNECT BY instr(:ids, ',', 1, LEVEL - 1) > 0
)
SELECT * FROM temp t WHERE t.id IN
(SELECT ids FROM data
)
/
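The CONNECT BY query above is essentially a string splitter: it pulls each run of non-comma characters out of the single bound string and converts it to a number. A minimal Python sketch of the same idea, with parse_ids as a hypothetical helper name:

```python
import re
from typing import List


def parse_ids(ids: str) -> List[int]:
    """Split a comma-separated string into integers.

    The pattern [^,]+ is the same one passed to regexp_substr in the query;
    strip() plays the role of trim, and int() the role of to_number.
    """
    return [int(token.strip()) for token in re.findall(r"[^,]+", ids)]


print(parse_ids("169,170,171"))
print(parse_ids("169"))
```

This also shows why the original query fails: the whole string "169,170,171" is a single value, and converting it to a number in one go is what produces the invalid number error.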

If you wrap them in quotes, you get an error; Oracle doesn't accept quoted values here. Use just the numbers, without quotes,
i.e.: (169,170,171,...)

You can define a substitution variable that holds the whole list, like so:
define IDS = (169,170,171);
and then use it like so:
SELECT * FROM temp t WHERE t.id IN &IDS;

Parse a string before the Last Index Of a character in SQL Server

I started with this but is it the best way to perform the task?
select
reverse(
substring(reverse(some_field),
charindex('-', reverse(some_field)) + 1,
len(some_field) - charindex('-', reverse(some_field))))
from SomeTable
How does SQL Server treat the multiple calls to reverse(some_field)?
Besides a UDF and iterating through the string looking for charindex of the '-' and storing the last index of it, is there a more efficient way to perform this task in T-SQL?
Note that what I have works, I just am really wondering if it is the best way about it.
Below are some sample values for some_field.
s2-st, s1-st, s3-st, s3-sss-zzz, s4-sss-zzzz
EDIT:
Sample output for this would be...
s1, s2, s3-sss, s3, s4-sss
The solution ErikE wrote actually gets the end of the string, that is, everything after the last hyphen. I just modified his version to get everything before it instead, using a similar method with the left function. Thanks for all of your help.
select left(some_field, abs(charindex('-', reverse(some_field)) - len(some_field)))
from (select 's2-st' as some_field
union select 's1-st'
union select 's3-st'
union select 's3-sss-zzz'
union select 's4-sss-zzzz') as SomeTable

May I suggest this simplification of your expression:
select right(some_field, charindex('-', reverse(some_field)) - 1)
from SomeTable
Also, there's no harm, as far as I know, in specifying 8000 characters in length with the substring function when you want the rest of the string. As long as it's not varchar(max), it works just fine.
If this is something you have to do all the time, over and over, how about #1 splitting out the data into separate columns and storing it that way, or #2 adding a calculated column with an index on it, which will perform the calculation once at update/insert time and not again later.
Last, I don't know if SQL Server is smart enough to reverse(some_field) only once and inject it into the other instance. When I get some time I'll try to figure it out.
Update
Oops, somehow I got backwards what you wanted. Sorry about that. The new expression you showed can still be simplified a little:
select left(some_field, len(some_field) - charindex('-', reverse(some_field)))
from (
select 's2-st'
union all select 's1-st'
union all select 's3-st'
union all select 's3-sss-zzz'
union all select 's4-sss-zzzz'
union all select 's5'
) X (some_field)
The abs() in your expression was just reversing the sign. So I put + len - charindex instead of + charindex - len and all is well now. It even works for strings without dashes.
One more thing to mention: your UNION SELECTs should be UNION ALL SELECT because without the ALL, the engine has to remove duplicates just as if you'd indicated SELECT DISTINCT. Simply get in the habit of using ALL and you'll be much better off. :)
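To make the arithmetic in left(some_field, len(some_field) - charindex('-', reverse(some_field))) easier to follow, here is a Python sketch that mirrors it step by step. The charindex helper is a hypothetical emulation of the T-SQL function (1-based position, 0 when not found), which is what makes the no-hyphen case work.

```python
def charindex(needle: str, haystack: str) -> int:
    """Emulate T-SQL CHARINDEX: 1-based position of needle, or 0 if absent."""
    return haystack.find(needle) + 1


def before_last_hyphen(some_field: str) -> str:
    """Everything before the last '-', or the whole string if there is none."""
    # Position of the first '-' in the reversed string, i.e. the distance of
    # the last '-' from the end of the original string.
    cut = charindex("-", some_field[::-1])
    # left(some_field, len - cut): keep the leading len - cut characters.
    return some_field[: len(some_field) - cut]


for value in ["s2-st", "s1-st", "s3-st", "s3-sss-zzz", "s4-sss-zzzz", "s5"]:
    print(value, "->", before_last_hyphen(value))
```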

Not sure about #1, but I would say that you might be better off doing this in code. Is there a reason you have to do it in the database?
Are you experiencing performance problems because of some similar code, or is this purely hypothetical?

I am also not sure how SQL Server handles the multiple calls to REVERSE and CHARINDEX.
You can eliminate the last call to CHARINDEX since you want to take everything to the end of the string:
select
reverse(
substring(reverse(some_field),
charindex('-', reverse(some_field)) + 1,
len(some_field)))
from SomeTable
Although I would recommend against it, you could also replace the LEN function call with the size of the column:
select
reverse(
substring(reverse(some_field),
charindex('-', reverse(some_field)) + 1,
1024))
from SomeTable
I am curious how much of a difference either of these changes would make.

The 3 inner reverses are discrete from each other. The outer reverse will reverse anything that is already reversed by the inner ones.
ErikE's approach is best as a pure TSQL solution. You don't need LEN
