SQL - Version comparison

This is my first question here.
I am building an SQL query in which I need to verify that the version of object B is always lower than or equal to the version of object A. This is a link table; here is an example:
The query is:
SELECT *
FROM TABLE
WHERE B_VERSION <= A_VERSION
As you can see, it works for the first 2 rows but not the third, because AA0 is treated as smaller than H08 when it shouldn't be (after Z99 the next version number is AA0, so the <= operator no longer works).
So I would like to parse the versions and compare how many letters each one has, and use the <= operator only if both versions have the same number of letters.
However, I don't know how to do that in an SQL query, and I didn't find anything useful on Google either. Do you have a solution?
Thanks in advance

The key to solving this problem is the PATINDEX function.
This query takes the value of A_VERSION and finds the first occurrence of a digit, then uses this position to split the value into two parts. The first part is alphabetic, so it is right-aligned by padding it on the left with spaces; the second part is numeric, so it is padded on the left with zeros ('0').
The same process occurs for B_VERSION.
Note that in this example each part is assumed to be at most 5 characters long, so this will work for versions ranging from A0 to ZZZZZ99999. Feel free to adjust as needed.
SELECT *
FROM TABLE
-- Sort key of A_VERSION: letters right-aligned to 5 characters, digits zero-padded to 5 characters
WHERE RIGHT(SPACE(5)
            + SUBSTRING(A_VERSION,
                        1,
                        PATINDEX('%[0-9]%', A_VERSION) - 1), 5)
    + RIGHT(REPLICATE('0', 5)
            + SUBSTRING(A_VERSION,
                        PATINDEX('%[0-9]%', A_VERSION),
                        LEN(A_VERSION)), 5)
-- must be greater than or equal to the same key built from B_VERSION (that is, B <= A)
   >= RIGHT(SPACE(5)
            + SUBSTRING(B_VERSION,
                        1,
                        PATINDEX('%[0-9]%', B_VERSION) - 1), 5)
    + RIGHT(REPLICATE('0', 5)
            + SUBSTRING(B_VERSION,
                        PATINDEX('%[0-9]%', B_VERSION),
                        LEN(B_VERSION)), 5)
If you are going to do this operation in many places, you might consider creating a function for this operation.
Hope this helps.
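To make the padding idea easier to check outside the database, here is a minimal Python sketch of the same sort key. It is only a model of the logic, not the SQL itself; the names version_key and b_not_newer are made up for illustration, and treating a version without digits as "all letters" is an assumption (the SQL above would raise an error in that case).

```python
import re

def version_key(version: str, width: int = 5) -> str:
    """Build a fixed-width key: letters right-aligned with spaces, digits left-padded with zeros."""
    match = re.search(r"[0-9]", version)
    # Assumption: with no digit at all, the whole string counts as the letter part.
    split = match.start() if match else len(version)
    letters, digits = version[:split], version[split:]
    # rjust + slicing plays the role of RIGHT(SPACE(5) + ..., 5) and RIGHT(REPLICATE('0', 5) + ..., 5)
    return letters.rjust(width)[-width:] + digits.rjust(width, "0")[-width:]

def b_not_newer(a_version: str, b_version: str) -> bool:
    """True when B's version is lower than or equal to A's version."""
    return version_key(b_version) <= version_key(a_version)
```

Because a space sorts before every letter, a one-letter version such as H08 ends up below a two-letter version such as AA0, which is the behaviour the plain <= comparison was missing.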

Many thanks! It helped a lot. However, I am using SQL Developer and cannot use PATINDEX there; I found the equivalent, REGEXP_INSTR, which works very similarly.
I used this algorithm, which keeps the rows where VERSION_B has fewer letters than VERSION_A, plus the rows where both have the same number of letters and VERSION_B is not greater than VERSION_A:
WHERE
(REGEXP_INSTR(VERSION_B, '[0-9]') < REGEXP_INSTR(VERSION_A, '[0-9]')) OR
(REGEXP_INSTR(VERSION_B, '[0-9]') = REGEXP_INSTR(VERSION_A, '[0-9]') AND VERSION_B <= VERSION_A)
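The condition above can be modelled in a few lines of Python to see which pairs it accepts. This is a sketch of the logic only; first_digit_pos and b_not_newer are hypothetical helper names, and the sample versions are taken from the question.

```python
import re

def first_digit_pos(version: str) -> int:
    """1-based position of the first digit, or 0 when there is none (like REGEXP_INSTR)."""
    match = re.search(r"[0-9]", version)
    return match.start() + 1 if match else 0

def b_not_newer(version_a: str, version_b: str) -> bool:
    pos_a = first_digit_pos(version_a)
    pos_b = first_digit_pos(version_b)
    # Fewer letters in B means B is older; with equal letter counts, compare the strings directly.
    return pos_b < pos_a or (pos_b == pos_a and version_b <= version_a)
```

One caveat worth keeping in mind: the direct string comparison in the second branch only orders versions correctly when the numeric parts have the same number of digits (H08 and H10 compare fine, H8 and H10 would not).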

Related

Problem with using SUBSTRING and CHARINDEX

I have a column (RCV1.ECCValue) in a table which 99% of the time has a constant string format- example being:
T0-11.86-273
The part between the two hyphens is a percentage. I'm using the SQL below to obtain this figure; it works fine and returns 11.86 for the example above, as long as the data in that table is in this format:
'Percentage' = round(SUBSTRING(RCV1.ECCValue,CHARINDEX('-',RCV1.ECCValue)+1, CHARINDEX('-',RCV1.ECCValue,CHARINDEX('-',RCV1.ECCValue)+1) -CHARINDEX('-',RCV1.ECCValue)-1),2) ,
However...this table is updated from an external source and very occasionally the separators differ, for example:
T0-11.86_273
when this occurs I get the error:
Invalid length parameter passed to the LEFT or SUBSTRING function.
I'm very new to SQL and have got myself out of many challenges, but this one has me stuck. Any help would be much appreciated. Is there a better way to extract this percentage value?
Replace '_' with '-' in the string passed to the CHARINDEX call that computes the length argument of SUBSTRING:
'Percentage' = round(SUBSTRING(RCV1.ECCValue,CHARINDEX('-',RCV1.ECCValue)+1, CHARINDEX('-',replace(RCV1.ECCValue,'_','-'),CHARINDEX('-',RCV1.ECCValue)+1) -CHARINDEX('-',RCV1.ECCValue)-1),2) ,
If you can guarantee the structure of these strings, you can try parsename
select round(parsename(translate(replace('T0-11.86_273','.',''),'-_','..'),2), 2)/100
Breakdown of steps:
1. Remove the . character from the percentage value using replace.
2. Replace - or _, whichever is present, with . using translate.
3. Parse the second element (counting from the right) using parsename.
4. Round to 2 decimal places, which also implicitly casts the string to a numeric type.
5. Divide by 100 to restore the number as a percentage.
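The steps above can be traced with a small Python model. It only mirrors the string handling and is not a replacement for the SQL; middle_percentage is a made-up name, and the sketch assumes, like the query, that the percentage always has exactly two decimal places.

```python
def middle_percentage(value: str) -> float:
    # Step 1: drop the decimal point, '11.86' becomes '1186'
    no_point = value.replace(".", "")
    # Step 2: turn both separators into '.', as TRANSLATE(..., '-_', '..') does
    dotted = no_point.translate(str.maketrans("-_", ".."))
    # Step 3: PARSENAME(..., 2) picks the second part counting from the right
    second_from_right = dotted.split(".")[-2]
    # Steps 4 and 5: convert to a number and divide by 100 to restore the decimal point
    return int(second_from_right) / 100
```

The division by 100 is what hard-codes the two-decimal assumption: a value such as 5.5 would come back as 0.55, so this approach only fits data where that format is guaranteed.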
Use NULLIF to turn a CHARINDEX result of 0 (separator not found) into NULL, so that such values yield NULL instead of an error:
round(
    SUBSTRING(
        RCV1.ECCValue,
        NULLIF(CHARINDEX('-', RCV1.ECCValue), 0) + 1,
        NULLIF(CHARINDEX('-',
                         RCV1.ECCValue,
                         NULLIF(CHARINDEX('-', RCV1.ECCValue), 0) + 1
               ), 0)
        - NULLIF(CHARINDEX('-', RCV1.ECCValue), 0) - 1
    ),
2)
I strongly recommend that you place the repeated values in CROSS APPLY (VALUES ...) to avoid having to repeat yourself. And do use whitespace, it's free.
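As a rough illustration of what the NULLIF version computes, the following Python sketch returns None wherever the SQL would return NULL. The function name extract_percentage is hypothetical; the inputs are the two sample values from the question.

```python
from typing import Optional

def extract_percentage(value: str) -> Optional[float]:
    first = value.find("-")            # -1 when absent (CHARINDEX would return 0)
    if first == -1:
        return None                    # NULLIF(..., 0) makes the whole expression NULL
    second = value.find("-", first + 1)
    if second == -1:
        return None                    # e.g. 'T0-11.86_273' has no second hyphen
    return round(float(value[first + 1:second]), 2)
```

The point of the design is that a malformed row produces an empty result that can be filtered or reported, rather than aborting the whole query with the "Invalid length parameter" error.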

How to implement MAX function on a text column in SQL Server?

I'm using SQL Server 2005 and have a column that contains serial numbers, which are nvarchar(50).
My problem is selecting max(serial_no) from the table. The serial numbers used to have a length of only 7, but new ones now have a length of 15. Whenever I select the max, I get a result with a length of 7, which means it is an old serial number. I also can't filter to select only records with a length of 15, because then I'll miss some other data in my query.
Old serial numbers look like this...
'SNGD001'
..., and new ones look like this:
'SN14ABCD0000001'
Edit: I tried creating a dummy table without the old serial numbers (5 characters long), and I'm getting correct results.
As has been mentioned, your question is a bit hard to follow. If the max value could be either one of your old serial numbers or one of your new ones, I believe the following should do the trick:
SELECT MAX(RIGHT('0000000' + REVERSE(LEFT(REVERSE(YourTextColumn), PATINDEX('%[a-z]%', REVERSE(YourTextColumn)) - 1)), 7))
FROM YourTable
It finds the first non-numeric character from the right and keeps everything to the right of it. It then left-pads the resulting numeric string with zeros to 7 characters and applies the MAX function.
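A small Python model of that expression may help to see what MAX actually compares. It is a sketch, with padded_suffix as an invented name, and it assumes serial numbers consist only of letters and digits, so "everything after the last letter" is the same as "the trailing digits".

```python
import re

def padded_suffix(serial: str, width: int = 7) -> str:
    """Trailing digits of the serial, left-padded with zeros to a fixed width."""
    digits = re.search(r"[0-9]*$", serial).group(0)
    # Same effect as RIGHT('0000000' + digits, 7)
    return digits.rjust(width, "0")[-width:]

serials = ["SNGD001", "SNGD389", "SN14ABCD0000002"]
largest = max(padded_suffix(s) for s in serials)
```

Note that the result is the largest numeric suffix across all prefixes, with the prefix discarded, so old and new serial numbers are compared on their sequence part alone.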
Your question is a little tough to follow without good sample data to get a bearing on. In future, I suggest showing a few more examples of your data to give better context, especially with sequencing. It appears you want the MAX() of serial_no so you can determine the next sequential serial number to assign. However, your serial number appears to be a concatenation of a prefix string and a sequence number. So your data MIGHT HAVE BEEN along the lines of the following (the last 3 digits are the sequence):
SNGD001
SNGD002
SNGD003
...
SNGD389, etc...
and your new data along these lines (the last 7 digits are the sequence):
SN14ABCD0000001
SN14ABCD0000002
SN14ABCD0000003
...
SN14ABCD0002837
If this is correct, then you basically need to look at the max based on the leading 4 or 8 characters of the string PLUS the suffix converted to a number. For starters, let's go with that to see if we are on the right track; you can then easily concatenate the prefix and sequence number together at the end to determine the next available number.
So, based on the above samples, you may want to know the last number for each prefix:
SNGD389 and
SN14ABCD0002837, respectively.
If the above is correct, I might start with...
select
    -- prefix: first 4 characters of old (7-character) serials, first 8 of new ones
    case when LEN( RTRIM( yt.serial_no )) = 7
         then LEFT( yt.serial_no, 4 )
         else LEFT( yt.serial_no, 8 ) end as SerialPrefix,
    -- sequence: last 3 digits of old serials, last 7 of new ones, compared as integers
    MAX( case when LEN( RTRIM( yt.serial_no )) = 7
              then CONVERT(INT, RIGHT( yt.serial_no, 3 ))
              else CONVERT(INT, RIGHT( yt.serial_no, 7 )) end ) as SerialSequence
from
    YourTable yt
group by
    -- no column alias here: GROUP BY takes the expression itself
    case when LEN( RTRIM( yt.serial_no )) = 7
         then LEFT( yt.serial_no, 4 )
         else LEFT( yt.serial_no, 8 ) end
Which would result in (based on the sample data I presented):
SerialPrefix SerialSequence
SNGD 389
SN14ABCD 2837
Because the SerialSequence column is numeric, you can add 1 to it, left-pad it with zeros to the original width, and concatenate the prefix and sequence back together to create
SNGD390
SN14ABCD0002838
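The whole procedure (split, take the max per prefix, add 1, zero-fill, concatenate) can be sketched in Python as follows. The helper names split_serial and next_serials are invented for illustration, and the 4+3 / 8+7 split is the same guess about the data layout that the query above makes.

```python
def split_serial(serial: str):
    """Return (prefix, sequence number, sequence width) for an old or new style serial."""
    serial = serial.rstrip()
    if len(serial) == 7:                      # old style: 4-character prefix + 3 digits
        return serial[:4], int(serial[-3:]), 3
    return serial[:8], int(serial[-7:]), 7    # new style: 8-character prefix + 7 digits

def next_serials(serials):
    """Map each prefix to the next serial number after its current maximum."""
    best = {}
    for serial in serials:
        prefix, seq, width = split_serial(serial)
        if prefix not in best or seq > best[prefix][0]:
            best[prefix] = (seq, width)
    # Add 1, zero-fill to the original width, and glue the prefix back on
    return {p: f"{p}{seq + 1:0{width}d}" for p, (seq, width) in best.items()}
```

Called with the sample serials listed above, this yields SNGD390 for the SNGD prefix and SN14ABCD0002838 for the SN14ABCD prefix.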

Average Row [SQL]

I wasn't quite sure what to write in the subject line.
The point is this: I want to average Speed01, Speed02, Speed03 and Speed04 within each row:
SELECT
    Table01.Test_No,
    Table01.Speed01,
    Table01.Speed02,
    Table01.Speed03,
    Table01.Speed04,
    -- I want to create a new column that holds this average:
    AVG(Table01.Speed01, Table01.Speed02, Table01.Speed03, Table01.Speed04) as "Average"
    -- I have tried this, but it did not work.
FROM
    Table01
Any of the Speed columns may be missing a value. Sometimes Speed02 has no number while the others do; sometimes Speed04 is missing as well; sometimes only one column (for example, only Speed01) has data. It depends on whether the sensor was able to catch the speed of the test material.
It will be a big help if you can find the solution. I'm newbie here.
THANK YOU ^^
AVG is a SQL aggregate function, therefore not applicable. So simply do the math. Average is sum divided by count:
(SPEED01 + SPEED02 + SPEED03 +SPEED04)/4
To deal with missing values, use COALESCE (or ISNULL in SQL Server) to treat them as zero in the sum:
(COALESCE(SPEED01, 0) + COALESCE(SPEED02, 0) + COALESCE(SPEED03, 0) + COALESCE(SPEED04, 0))
That leaves the denominator. You need to add 1 for every non-null value. For example:
(COALESCE(SPEED01/SPEED01,0) + COALESCE(SPEED02/SPEED02,0) + ...)
You can also use CASE, depending on the supported SQL dialect, to avoid a possible division by zero when a speed is 0:
CASE WHEN SPEED01 IS NULL THEN 0 ELSE 1 END
OR you can normalize the data, extract all SPEEDs into a 1:M relation and use the AVG aggregate, avoiding all these issues. Not to mention the possibility to add a 5th measurement, then a 6th and so on and so forth!
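The "sum of the present values divided by the count of the present values" rule is easy to state in Python, which may help when checking the SQL result by hand. This is only a sketch; row_average is a hypothetical name and None stands in for NULL.

```python
from typing import Optional, Sequence

def row_average(speeds: Sequence[Optional[float]]) -> Optional[float]:
    """Average of the readings that are present; None when every reading is missing."""
    present = [s for s in speeds if s is not None]
    if not present:
        return None            # avoids dividing by a zero count
    return sum(present) / len(present)
```

A reading of 0 still counts as a measurement here, which is the case the CASE-based denominator handles and the SPEED/SPEED trick does not.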
Just add the columns and divide them by 4. To deal with the "missing" values use coalesce to treat NULL values as zero (note that a missing reading then counts as a speed of 0 and pulls the average down):
SELECT Test_No,
(coalesce(Speed01,0) + coalesce(Speed02,0) + coalesce(Speed03,0) + coalesce(Speed04,0)) / 4 as "Average"
FROM Table01;
You didn't mention your DBMS (Postgres, Oracle, ...), but the above is ANSI (standard) SQL and should run on nearly every DBMS.
As I understand your question, Table01.Speed01, Table01.Speed03 and Table01.Speed04 are nullable and of type int, whereas Table01.Speed02 is nullable and of type nvarchar:
SELECT
Table01.Test_No,
(
ISNULL(Table01.Speed01, 0) +
CASE ISNUMERIC(Table01.Speed02) WHEN 0 THEN 0 ELSE CAST(Table01.Speed02 AS int) END +
ISNULL(Table01.Speed03, 0) +
ISNULL(Table01.Speed04, 0)
)/4 AS AVG
FROM Table01

How can I use LEFT & RIGHT Functions in SQL to get last 3 characters?

I have a Char(15) field, in this field I have the data below:
94342KMR
947JCP
7048MYC
I need to break this down: I need the last 3 characters on the RIGHT, and whatever is to the LEFT of them. My issue is that the code on the left is not always the same length, as you can see.
How can I accomplish this in SQL?
Thank you
SELECT RIGHT(RTRIM(column), 3),
LEFT(column, LEN(column) - 3)
FROM table
Use RIGHT with RTRIM (to avoid complications with a fixed-length column), and LEFT coupled with LEN (to grab only what you need, excluding the last 3 characters).
If there's ever a situation where the length is <= 3, then you'll probably have to use a CASE expression so that LEFT never receives a negative length.
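For comparison, the same split including the short-value guard looks like this in Python. It is a sketch of the logic rather than SQL; split_code is a made-up name, and returning an empty left part for values of 3 characters or fewer is an assumption about what the guard should do.

```python
def split_code(field: str):
    """Return (last 3 characters, everything before them) for a space-padded CHAR value."""
    value = field.rstrip()          # a CHAR(15) column pads with trailing spaces
    if len(value) <= 3:
        return value, ""            # assumption: nothing is left over for short values
    return value[-3:], value[:-3]
```

For example, split_code("94342KMR       ") gives ("KMR", "94342").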
You can use RTRIM or cast your value to VARCHAR:
SELECT RIGHT(RTRIM(Field),3), LEFT(Field,LEN(Field)-3)
Or
SELECT RIGHT(CAST(Field AS VARCHAR(15)),3), LEFT(Field,LEN(Field)-3)
Here is an alternative using SUBSTRING:
SELECT
SUBSTRING([Field], LEN([Field]) - 2, 3) [Right3],
SUBSTRING([Field], 0, LEN([Field]) - 2) [TheRest]
FROM
[Fields]
select right(rtrim('94342KMR'),3)
This fetches the last 3 characters.
select substring(rtrim('94342KMR'),1,len('94342KMR')-3)
This fetches the remaining characters.

SQL query - LEFT 1 = char, RIGHT 3-5 = numbers in Name

I need to filter out junk data in SQL (SQL Server 2008) table. I need to identify these records, and pull them out.
Char[0] = A..Z, a..z
Char[1] = 0..9
Char[2] = 0..9
Char[3] = 0..9
Char[4] = 0..9
{No blanks allowed}
Basically, a clean record will look like this:
T1234, U2468, K123, P50054 (4 record examples)
Junk data looks like this:
T12.., .T12, MARK, TP1, SP2, BFGL, BFPL (7 record examples)
Can someone please assist with a SQL query to do a LEFT and RIGHT method and extract those characters, and do a LIKE IN or something?
A function would be great though!
The following should work in a few different systems:
SELECT *
FROM TheTable
WHERE Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9]%'
AND Data NOT LIKE '% %'
This approach also matches P2343, P23423JUNK, and similar text; it only requires that the value starts with the format A0000.
Now, if the OP means that the first position is a letter and all succeeding positions are numeric, as in A0+, then use the following (in SQL Server and a good many other database systems):
SELECT *
FROM TheTable
WHERE SUBSTRING(Data, 1, 1) LIKE '[A-Za-z]'
AND SUBSTRING(Data, 2, LEN(Data) - 1) NOT LIKE '%[^0-9]%'
AND LEN(Data) >= 5
To incorporate this into a SQL Server 2008 function, since this appears to be what you'd like most, you can write:
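CREATE FUNCTION ufn_IsProperFormat(@Data VARCHAR(50))
RETURNS BIT
AS
BEGIN
    RETURN
        CASE
            -- first character is a letter
            WHEN SUBSTRING(@Data, 1, 1) LIKE '[A-Za-z]'
            -- no non-digit anywhere after the first character
            AND SUBSTRING(@Data, 2, LEN(@Data) - 1) NOT LIKE '%[^0-9]%'
            -- at least 4 digits follow the letter
            AND LEN(@Data) >= 5 THEN 1
            ELSE 0
        END
END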
...and call into it like so:
SELECT *
FROM TheTable
WHERE dbo.ufn_IsProperFormat(Data) = 1
...this query needs to change for Oracle queries because Oracle doesn't appear to support bracket notation in LIKE clauses:
SELECT *
FROM TheTable
WHERE REGEXP_LIKE(Data, '^[A-Za-z]\d{4,}$')
This is the expansion gbn is doing in his answer, but these versions allow for varying string lengths without the OR conditions.
EDIT: Updated to support examples in SQL Server and Oracle for ensuring the format A0+, so that A1324, A2342388, and P2342 match but A2342JUNK and A234 do not.
The Oracle REGEXP_LIKE code was borrowed from Mark's post but updated to support 4 or more numeric digits.
Added a custom SQL Server 2008 approach which implements these techniques.
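The A0+ rule described in the edit note (one letter followed by four or more digits and nothing else) can be checked quickly with a Python regular expression. This is a sketch for experimenting with sample values, not part of either SQL solution; is_proper_format is a hypothetical name chosen to echo the function above.

```python
import re

# One ASCII letter followed by at least four ASCII digits, nothing before or after
PROPER_FORMAT = re.compile(r"[A-Za-z][0-9]{4,}")

def is_proper_format(data: str) -> bool:
    return PROPER_FORMAT.fullmatch(data) is not None
```

Using fullmatch plays the role of the ^ and $ anchors in the Oracle pattern, so trailing junk such as A2342JUNK is rejected.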
Depends on your database. Many have regex functions (note examples not tested so check)
e.g. Oracle
SELECT x
FROM table
WHERE REGEXP_LIKE(x, '^[A-Za-z][[:digit:]]{4}$')
Sybase uses LIKE
Given that you're allowing between 3 and 5 digits for the number in your examples then it's probably better to use the ISNUMERIC() function on the 2nd character onwards:
SELECT *
FROM TheTable
-- start with a letter
WHERE Data LIKE '[A-Za-z]%'
-- everything from 2nd character onwards is a number
AND ISNUMERIC( SUBSTRING( Data, 2, 50 ) ) = 1
-- number doesn't have a decimal place
AND Data NOT LIKE '%.%'
For more information look at the ISNUMERIC function on MSDN.
Also note that:
I've limited the 2nd part with the number to 50 characters maximum, change this to suit your needs.
Strictly speaking you should check for currency symbols etc, as ISNUMERIC allows them, as well as +/- and some others
A better option might be to create a function that checks that each character after the first is between 0 and 9 (or between 48 and 57 if you're using ASCII codes).
You can't use Regular Expressions in SQL Server, so you have to use OR. Correcting David Andres' answer...
WHERE
(
Data LIKE '[A-Za-z][0-9][0-9][0-9]'
OR
Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9]'
OR
Data LIKE '[A-Za-z][0-9][0-9][0-9][0-9][0-9]'
)
David's answer allows "D1234junk" through
You also only need "[A-Z]" if you don't have case sensitivity
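To see how the three OR'ed LIKE patterns behave on the sample data from the question, here is a small Python check. It is an illustrative sketch only; is_clean is an invented name, and the single pattern with {3,5} is assumed to be equivalent to the three fixed-length LIKE patterns above.

```python
import re

# One letter followed by exactly 3, 4 or 5 digits (the three LIKE patterns combined)
CLEAN = re.compile(r"[A-Za-z][0-9]{3,5}")

def is_clean(data: str) -> bool:
    return CLEAN.fullmatch(data) is not None

clean_examples = ["T1234", "U2468", "K123", "P50054"]
junk_examples = ["T12..", ".T12", "MARK", "TP1", "SP2", "BFGL", "BFPL", "D1234junk"]
```

All four clean examples pass and every junk example, including D1234junk, is rejected.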