ORDER BY specific numerical value in string [SQL]

I have a column, ID, that I would like to ORDER in a specific way. The column has a varchar data type, and each value always starts with a letter (typically P) followed by three to four digits, possibly followed by an underscore or further letters. I have tried a few options and none return what I want.
SELECT [ID] FROM MYTABLE
ORDER BY <one of the options below>
Options tried:
(1) LEN(ID), ID ASC
(2) LEFT(ID,2)
(3) SUBSTRING(ID,2,4) ASC
(4) ROW_NUMBER() OVER (ORDER BY SUBSTRING(ID,2,4))
(5) SUBSTRING(ID,PATINDEX('%[0-9]%',ID),LEN(ID))
(6) LEFT(ID, PATINDEX('%[0-9]%', ID)-1)
Option 1 comes closest to what I am looking for, except when an underscore or alphabetic characters follow the numeric part. The results from Option 1 are below:
P100
P208
P218
P301
P305
P306
P4200
P4510
P4511
P4512
P5011
P1400A
P4125H
P4202A
P4507L
P4706A
P1001_2
P2103_B
P4368_RL
I would like to see:
P100
P208
P218
P301
P305
P306
P1001_2
P1400A
P2103_B
P4125H
P4200
P4202A
P4368_RL
P4507L
P4510
P4511
P4512
P4706A
P5011

ORDER BY
CAST(SUBSTRING(id, 2, 4) AS INT),
SUBSTRING(id, 6, 3)
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9464
And one that's still less complex than a getOnlyNumbers() UDF, but copes with a numeric part of varying length.
CROSS APPLY
(
SELECT
tail_start = PATINDEX('%[0-9][^0-9]%', id + '_')
)
stats
CROSS APPLY
(
SELECT
numeric = CAST(SUBSTRING(id, 2, stats.tail_start-1) AS INT),
alpha = RIGHT(id, LEN(id) - stats.tail_start)
)
id_tuple
ORDER BY
id_tuple.numeric,
id_tuple.alpha
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9499
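The fragments above leave out the SELECT and FROM clauses. As a minimal sketch, the variable-length version could be assembled into a complete query like the one below. The table variable @ids (a stand-in for MYTABLE) and the alias num (used here in place of numeric) are assumptions made purely for illustration.

```sql
-- Hypothetical sample data standing in for MYTABLE.
DECLARE @ids TABLE (id varchar(10));
INSERT @ids (id) VALUES ('P4200'), ('P100'), ('P1400A'), ('P1001_2');

SELECT t.id
FROM @ids AS t
CROSS APPLY
(
    -- Position of the last digit of the leading number.
    -- The appended '_' guarantees a non-digit follows the digits.
    SELECT tail_start = PATINDEX('%[0-9][^0-9]%', t.id + '_')
) AS stats
CROSS APPLY
(
    SELECT num   = CAST(SUBSTRING(t.id, 2, stats.tail_start - 1) AS int),
           alpha = RIGHT(t.id, LEN(t.id) - stats.tail_start)
) AS id_tuple
ORDER BY id_tuple.num,
         id_tuple.alpha;
-- Rows come back as P100, P1001_2, P1400A, P4200.
```

Sorting on the integer first and the trailing text second is what puts P1001_2 ahead of P1400A even though it is the longer string.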
Finally, one that can cope with there being no number at all (but still assumes the first character exists and should be ignored).
CROSS APPLY
(
SELECT
tail_start = NULLIF(PATINDEX('%[0-9][^0-9]%', id + '_'), 0)
)
stats
CROSS APPLY
(
SELECT
numeric = CAST(SUBSTRING(id, 2, stats.tail_start-1) AS INT),
alpha = RIGHT(id, LEN(id) - ISNULL(stats.tail_start, 1))
)
id_tuple
ORDER BY
id_tuple.numeric,
id_tuple.alpha
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9507

This is a rather strange way to sort but now that I understand it I figured out a solution. I am using a table valued function here to strip out only the numbers from a string. Since the function returns all numeric characters I also need to check for the _ and only pass in the part of the string before that.
Here is the function.
create function GetOnlyNumbers
(
@SearchVal varchar(8000)
) returns table as return
with MyValues as
(
select substring(@SearchVal, N, 1) as number
, t.N
from cteTally t
where N <= len(@SearchVal)
and substring(@SearchVal, N, 1) like '[0-9]'
)
select distinct NumValue = STUFF((select number + ''
from MyValues mv2
order by mv2.N
for xml path('')), 1, 0, '')
from MyValues mv
This function is using a tally table. If you have one you can tweak that code slightly to fit. Here is my tally table. I keep it as a view.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
GO
Next, of course, we need some data to work with. In this case I just created a table variable to represent your actual table.
declare @Something table
(
SomeVal varchar(10)
)
insert @Something values
('P100')
, ('P208')
, ('P218')
, ('P301')
, ('P305')
, ('P306')
, ('P4200')
, ('P4510')
, ('P4511')
, ('P4512')
, ('P5011')
, ('P1400A')
, ('P4125H')
, ('P4202A')
, ('P4507L')
, ('P4706A')
, ('P1001_2')
, ('P2103_B')
, ('P4368_RL')
With all the legwork and setup behind us we can get to the actual query needed to accomplish this.
select s.SomeVal
from @Something s
cross apply dbo.GetOnlyNumbers(case when charindex('_', s.SomeVal) = 0 then s.SomeVal else left(s.SomeVal, charindex('_', s.SomeVal) - 1) end) x
order by convert(int, x.NumValue)
This returns the rows in the order you listed them in your question.

You can break down ID in steps to extract the number. Then, order by the number and ID. I like to break down long string manipulation into steps using CROSS APPLY. You can do it inline (it'd be long) or bundle it into an inline TVF.
SELECT t.*
FROM MYTABLE t
CROSS APPLY (SELECT NoP = STUFF(ID, 1, 1, '')) nop
CROSS APPLY (SELECT FindNonNumeric = LEFT(NoP, ISNULL(NULLIF(PATINDEX('%[^0-9]%', NoP)-1, -1), LEN(NoP)))) fnn
CROSS APPLY (SELECT Number = CONVERT(INT, FindNonNumeric)) num
ORDER BY Number
, ID;

I think your best bet is to create a function that strips the numbers out of the string, like this one, and then sort by that. Even better, as @SeanLange suggested, would be to use that function to store the number value in a new column and sort by that.
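One way to store the number in its own column is a persisted computed column, sketched below under some assumptions: the table dbo.IdDemo and the column IdNumber are hypothetical names, and the expression uses built-in string functions rather than a user-defined function so that the column can be persisted. It also assumes, as in the question, that the first character should be skipped and that the digits fit in an int.

```sql
-- Hypothetical demo table; dbo.IdDemo and IdNumber are illustrative names.
CREATE TABLE dbo.IdDemo
(
    ID varchar(10) NOT NULL,
    -- Digits that follow the first character, up to the first non-digit.
    -- The appended 'x' guarantees PATINDEX always finds a non-digit.
    IdNumber AS CAST(SUBSTRING(ID, 2,
                     PATINDEX('%[^0-9]%', SUBSTRING(ID, 2, 10) + 'x') - 1) AS int) PERSISTED
);

INSERT dbo.IdDemo (ID)
VALUES ('P4200'), ('P100'), ('P1001_2'), ('P1400A');

-- The sort now uses the stored integer, with ID as a tie-breaker.
SELECT ID
FROM dbo.IdDemo
ORDER BY IdNumber, ID;
```

An ID with no digits after the first character yields an empty string, which CAST turns into 0, so such rows would sort first.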

Related

How to extract a string between two of the SAME delimiters T-SQL?

I'm wanting to extract part of a string from a value which has a number of the same delimiters.
Here is an example of the data I am working with (these file paths could be even longer depending on the depth of the file):
FilePath:
Q:\12345\downloads\randomfilename.png
Q:\123_4566\downloads\randomfilename.pdf
Q:\CCCMUD\downloads\randomfilename.mp4
I want to extract part of the string between the first two delimiters ( \ ) for every row into a new column e.g.
12345
123_4566
CCCMUD
I know I need to be using SUBSTRING and CHARINDEX but I'm not sure how. I would appreciate any help. Thanks.
Use CHARINDEX twice:
SELECT *, SUBSTRING(path, pos1 + 1, pos2 - pos1 - 1)
FROM tests
CROSS APPLY (SELECT NULLIF(CHARINDEX('\', path), 0)) AS ca1(pos1)
CROSS APPLY (SELECT NULLIF(CHARINDEX('\', path, pos1 + 1), 0)) AS ca2(pos2)
-- NULLIF is used to convert 0 value (character not found) to NULL
Test on db<>fiddle
In all your examples, the first \ is at character 3 in the string. If so, then you can simply use:
select v.*,
substring(filepath, 4, charindex('\', filepath, 4) - 4)
from (values ('Q:\123_4566\downloads\randomfilename.pdf')) v(filepath)
DECLARE @s table (path varchar(4000));
INSERT @s(path) VALUES
('Q:\12345\downloads\randomfilename.png'),
('Q:\123_4566\downloads\randomfilename.pdf'),
('Q:\CCCMUD\downloads\randomfilename.mp4');
SELECT folder = LEFT(o, CHARINDEX('\', o) - 1) FROM
(
SELECT o = SUBSTRING(path, CHARINDEX('\', path) + 1, 4000)
FROM @s
) AS o;
Output:
folder
----------
12345
123_4566
CCCMUD
This will error, though, for paths that don't contain two \ characters. So you may want to add a filter to the inner query (or determine how you want to handle the output differently in that case):
WHERE path LIKE '%\%\%'
An easy and efficient way to do this is to use an ordinal splitter (like this one). To make sure the split value only contains numbers you could add WHERE try_cast(ds.Item as int) is not null. Something like this
splitter
CREATE FUNCTION [dbo].[DelimitedSplit8K_LEAD]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1
FROM cteTally t
WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000))
FROM cteStart s
;
query
select ds.*
from @s s
cross apply dbo.DelimitedSplit8K_LEAD(s.[path], '\') ds
where ds.ItemNumber=2
and try_cast(ds.Item as int) is not null;
ItemNumber Item
2 12345

SQL Server 2014 : Convert two comma separated string into two columns

I have two comma-separated strings that need to be converted into a temp table with two columns, matched up by index.
If the input strings are as below
a = 'abc,def,ghi'
b = 'aaa,bbb,ccc'
then output should be
column1 | column2
------------------
abc | aaa
def | bbb
ghi | ccc
Let us say I have a function fnConvertCommaSeparatedStringToColumn which takes a comma-separated string and a delimiter as parameters and returns a column with the values. I use this on both strings and get two columns, to verify that the count is the same on both sides. But it would be nice to have them in a single temp table. How can I do that?
Let us say I have function which ... returns a column with values.
At that point, the basic idea is to select the column and use the row_number() function with both of your strings. Then you can JOIN the two together using the row_number() result as the matching field for the join.
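The row_number() and JOIN idea could be sketched as follows. Because the asker's function is not shown, this sketch swaps in a small inline splitter built on a list of positions so that it runs on its own; the CTE names n, a_parts and b_parts, the column names, and the temp table #pairs are all assumptions for illustration. Numbering by character position (rather than by an arbitrary order) keeps the two lists aligned.

```sql
DECLARE @a varchar(100) = 'abc,def,ghi';
DECLARE @b varchar(100) = 'aaa,bbb,ccc';

WITH n AS (
    -- Positions 1..100, enough for the varchar(100) inputs in this sketch.
    SELECT TOP (100) i = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_objects
),
a_parts AS (
    -- One row per element of @a, numbered by where the element starts.
    SELECT rn  = ROW_NUMBER() OVER (ORDER BY n.i),
           val = SUBSTRING(@a, n.i, CHARINDEX(',', @a + ',', n.i) - n.i)
    FROM n
    WHERE n.i <= LEN(@a)
      AND SUBSTRING(',' + @a, n.i, 1) = ','
),
b_parts AS (
    -- Same split for @b.
    SELECT rn  = ROW_NUMBER() OVER (ORDER BY n.i),
           val = SUBSTRING(@b, n.i, CHARINDEX(',', @b + ',', n.i) - n.i)
    FROM n
    WHERE n.i <= LEN(@b)
      AND SUBSTRING(',' + @b, n.i, 1) = ','
)
SELECT rn = a.rn, column1 = a.val, column2 = b.val
INTO #pairs
FROM a_parts AS a
JOIN b_parts AS b ON b.rn = a.rn;

SELECT column1, column2
FROM #pairs
ORDER BY rn;
```

With an inner JOIN, surplus elements in the longer list are silently dropped, so a count check on both sides is still worthwhile if the lists may differ in length.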
One method is a recursive CTE:
with cte as (
select convert(varchar(max), null) as a_part, convert(varchar(max), null) as b_part,
convert(varchar(max), 'abc,def,ghi') + ',' as a,
convert(varchar(max), 'aaa,bbb,ccc') + ',' as b,
0 as lev
union all
select convert(varchar(max), left(a, charindex(',', a) - 1)),
convert(varchar(max), left(b, charindex(',', b) - 1)),
stuff(a, 1, charindex(',', a), ''),
stuff(b, 1, charindex(',', b), ''),
lev + 1
from cte
where a <> '' and lev < 10
)
select a_part, b_part
from cte
where lev > 0;
Here is a db<>fiddle.
Here's something a bit sneaky you can try.
I don't have your bespoke function so have used the built-in string_split function (SQL2016+) - for quickly testing, but assuming the parameters are the same. Ideally, your bespoke function should return its own row number in which case you'd use that instead of a rownumber function.
declare @a varchar(20)='abc,def,ghi', @b varchar(20)='aaa,bbb,ccc';
with v as (
select a.value A,b.value B,
row_number() over(partition by a.value order by (select 1/0))Arn,
row_number() over(partition by b.value order by (select 1/0))Brn
from fnConvertCommaSeparatedStringToColumn (@a,',')a
cross apply fnConvertCommaSeparatedStringToColumn (@b,',')b
)
select A,B from v
where Arn=Brn
I would suggest getting a (set based) function that can split a string, based on a delimiter, that returns the ordinal position as well. For example DelimitedSplit8k_LEAD. Then you can trivially split the value, and JOIN on the ordinal position:
DECLARE @a varchar(100) = 'abc,def,ghi';
DECLARE @b varchar(100) = 'aaa,bbb,ccc';
SELECT A.Item AS A,
B.Item AS B
FROM dbo.delimitedsplit8k_lead(@a,',') A
FULL OUTER JOIN dbo.delimitedsplit8k_lead(@b,',') B ON A.ItemNumber = B.ItemNumber;
db<>fiddle
I use a FULL OUTER JOIN and then if either column has a NULL value you know that the 2 delimited lists don't have the same number of delimited values.

How to Read Data Number by Number

I have a field that contains numbers such as the examples below in #Numbers. Each number within each row in #Numbers relates
to many different values that are contained within the #Area table.
I need to make a relationship from #Numbers to #Area using each number within each row.
CREATE TABLE #Numbers
(
Number int
)
INSERT INTO #Numbers
(
Number
)
SELECT 102 UNION
SELECT 1 UNION
SELECT 2
select * from #Numbers
CREATE TABLE #Area
(
Number int,
Area varchar(50)
)
INSERT INTO #Area
(
Number,
Area
)
SELECT 0,'Area1' UNION
SELECT 1,'Area2' UNION
SELECT 1,'Area3' UNION
SELECT 1,'Area5' UNION
SELECT 1,'Area8' UNION
SELECT 1,'Area9' UNION
SELECT 2,'Area12' UNION
SELECT 2,'Area43' UNION
SELECT 2,'Area25'
select * from #Area
It would return the following for 102:
102,Area2
102,Area3
102,Area5
102,Area8
102,Area9
102,Area1
102,Area12
102,Area43
102,Area25
For 1 it would return:
1,Area2
1,Area3
1,Area5
1,Area8
1,Area9
For 2 it would return:
2,Area12
2,Area43
2,Area25
Note how the numbers match up to the individual Areas and return the values accordingly.
Well, the OP has already marked an answer, which even got votes, and may not read this, but here is another option using a simple direct SELECT, which (according to the execution plan) seems to use far fewer resources:
SELECT *
FROM #Numbers t1
LEFT JOIN #Area t2 ON CONVERT(VARCHAR(10), t1.Number) like '%' + CONVERT(CHAR(1), t2.Number) + '%'
GO
Note! According to the execution plan, this solution accounts for only 27% of the cost while the selected answer (written by Squirrel) accounts for 73%. The execution plan can be misleading sometimes, though, so you should also check the IO and TIME statistics using the real table structure and real data.
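A minimal sketch of checking the IO and TIME statistics, assuming small table variables (@Numbers and @Area, hypothetical stand-ins for the real tables) so that the batch runs on its own:

```sql
-- Hypothetical stand-ins for the real tables.
DECLARE @Numbers TABLE (Number int);
DECLARE @Area TABLE (Number int, Area varchar(50));
INSERT @Numbers (Number) VALUES (102), (1), (2);
INSERT @Area (Number, Area) VALUES (0, 'Area1'), (1, 'Area2'), (2, 'Area12');

-- Report logical reads and CPU/elapsed time for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT t1.Number, t2.Area
FROM @Numbers t1
LEFT JOIN @Area t2
    ON CONVERT(varchar(10), t1.Number) LIKE '%' + CONVERT(char(1), t2.Number) + '%';

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The figures are returned as messages rather than as a result set. Running each candidate query between the ON and OFF statements against the real data gives numbers that can be compared directly.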
It looks like you need to extract each individual digit from #Numbers.Number and then use it to join to #Area.
; with tally as
(
select n = 1
union all
select n = n + 1
from tally
where n < 10
)
select n.Number, a.Area
from #Numbers n
cross apply
(
-- here it convert n.Number to string
-- then extract 1 digit
-- and finally convert back to integer
select num = convert(int,
substring(convert(varchar(10), n.Number),
t.n,
1)
)
from tally t
where t.n <= len(convert(varchar(10), n.Number))
) d
inner join #Area a on d.num = a.Number
order by n.Number
or if you prefer to do it in arithmetic and not string
; with Num as
(
select Number, n = 0, Num = Number / power(10, 0) % 10
from #Numbers
union all
select Number, n = n + 1, Num = Number / power(10, n + 1) % 10
from Num
where Number >= power(10, n + 1)
)
select n.Number, a.Area
from Num n
inner join #Area a on n.Num = a.Number
order by n.Number
Here is my idea. In theory, it should work.
Have a table (temp or permanent) with the values and their translations,
i.e.
ID value
1 Area1, Area2, Area7, Area8, Area15
2 Area28, Area35
etc
Take each row and put some special character between each digit. Use a function like string_split with that character to turn it into a column of values.
e.g. 0123 will then become something like 0|1|2|3, and when you run that through string_split you would get
0
1
2
3
Now join each value to your lookup table and return the Value.
Now you have rows with all the values that you want. Use another function like STUFF with FOR XML to put those values back into a single column.
This doesn't sound very efficient, but it is one way of achieving what you want.
Another is to do a replace(), but that would be very messy!
Create a third table called n which contains a single column also called n that contains integers from 1 to the maximum number of digits in your number. Make it 1000 if you like, doesn't matter. Then:
select #numbers.number, substring(convert(varchar,#numbers.number),n,1) as chr, Area
from #numbers
join n on n>0 and n <=len(convert(varchar,number))
join #area on #area.number=substring(convert(varchar,#numbers.number),n,1)
The middle column chr is just there to show you what it's doing, and would be removed from the final result.

Alphanumeric sort on nvarchar(50) column

I am trying to write a query that will return data sorted by an alphanumeric column, Code.
Below is my query:
SELECT *
FROM <<TableName>>
CROSS APPLY (SELECT PATINDEX('[A-Z, a-z][0-9]%', [Code]),
CHARINDEX('', [Code]) ) ca(PatPos, SpacePos)
CROSS APPLY (SELECT CONVERT(INTEGER, CASE WHEN ca.PatPos = 1 THEN
SUBSTRING([Code], 2,ISNULL(NULLIF(ca.SpacePos,0)-2, 8000)) ELSE NULL END),
CASE WHEN ca.PatPos = 1 THEN LEFT([Code],
ISNULL(NULLIF(ca.SpacePos,0)-0,1)) ELSE [Code] END) ca2(OrderBy2, OrderBy1)
WHERE [TypeID] = '1'
OUTPUT:
FFS1
FFS2
...
FFS12
FFS1.1
FFS1.2
...
FFS1.1E
FFS1.1R
...
FFS12.1
FFS12.2
FFS.12.1E
FFS12.1R
FFS12.2E
FFS12.2R
DESIRED OUTPUT:
FFS1
FFS1.1
FFS1.1E
FFS1.1R
....
FFS12
FFS12.1
FFS12.1E
FFS12.1R
What am I missing or overlooking?
EDIT:
Let me try to detail the table contents a little better. There are records for FFS1 - FFS12. Those are broken into X subs, i.e., FFS1.1 - FFS1.X to FFS12.1 - FFS12.X. The E and the R was not a typo, each sub record has two codes associated with it: FFS1.1E & FFS1.1R.
Additionally I tried using ORDER BY but it sorted as
FFS1
...
FFS10
FFS2
This will work for any count of parts separated by dots. The sorting is alphanumerical for each part separately.
DECLARE @YourValues TABLE(ID INT IDENTITY, SomeVal VARCHAR(100));
INSERT INTO @YourValues VALUES
('FFS1')
,('FFS2')
,('FFS12')
,('FFS1.1')
,('FFS1.2')
,('FFS1.1E')
,('FFS1.1R')
,('FFS12.1')
,('FFS12.2')
,('FFS.12.1E')
,('FFS12.1R')
,('FFS12.2E')
,('FFS12.2R');
--The query
WITH Splittable AS
(
SELECT ID
,SomeVal
,CAST(N'<x>' + REPLACE(SomeVal,'.','</x><x>') + N'</x>' AS XML) AS Casted
FROM @YourValues
)
,Parted AS
(
SELECT Splittable.*
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PartNmbr
,A.part.value(N'text()[1]','nvarchar(max)') AS Part
FROM Splittable
CROSS APPLY Splittable.Casted.nodes(N'/x') AS A(part)
)
,AddSortCrit AS
(
SELECT ID
,SomeVal
,(SELECT LEFT(x.Part + REPLICATE(' ',10),10) AS [*]
FROM Parted AS x
WHERE x.ID=Parted.ID
ORDER BY PartNmbr
FOR XML PATH('')
) AS SortColumn
FROM Parted
GROUP BY ID,SomeVal
)
SELECT ID
,SomeVal
FROM AddSortCrit
ORDER BY SortColumn;
The result
ID SomeVal
10 FFS.12.1E
1 FFS1
4 FFS1.1
6 FFS1.1E
7 FFS1.1R
5 FFS1.2
3 FFS12
8 FFS12.1
11 FFS12.1R
9 FFS12.2
12 FFS12.2E
13 FFS12.2R
2 FFS2
Some explanation:
The first CTE transforms your codes to XML, which allows each part to be addressed separately.
The second CTE returns each part together with a number.
The third CTE re-concatenates your code, but each part is padded to a length of 10 characters.
The final SELECT uses this new single-string-per-row in the ORDER BY.
Final hint:
This design is bad! You should not store these values in concatenated strings... Store them in separate columns and fiddle them together just for the output/presentation layer. Doing so avoids this rather ugly fiddle...
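As a rough sketch of what storing the parts separately could look like: the table variable @Codes and its column names below are hypothetical, chosen only to mirror the FFS12.1E shape from the question. The code string is assembled only in the SELECT list, and the sort works on the typed columns.

```sql
-- Hypothetical layout: one column per part of the code.
DECLARE @Codes TABLE
(
    Prefix varchar(10) NOT NULL,
    MainNo int         NOT NULL,
    SubNo  int         NULL,
    Suffix varchar(1)  NULL
);

INSERT @Codes (Prefix, MainNo, SubNo, Suffix)
VALUES ('FFS', 12, 1, NULL), ('FFS', 1, 1, 'R'), ('FFS', 2, NULL, NULL),
       ('FFS', 1, NULL, NULL), ('FFS', 1, 1, 'E'), ('FFS', 12, NULL, NULL),
       ('FFS', 1, 1, NULL);

-- Concatenate for display only; sort numerically on the parts.
SELECT Code = Prefix
            + CAST(MainNo AS varchar(10))
            + ISNULL('.' + CAST(SubNo AS varchar(10)), '')
            + ISNULL(Suffix, '')
FROM @Codes
ORDER BY Prefix, MainNo, SubNo, Suffix;
-- Rows come back as FFS1, FFS1.1, FFS1.1E, FFS1.1R, FFS2, FFS12, FFS12.1.
```

Because SQL Server sorts NULL before any value in ascending order, a parent code such as FFS1 lands ahead of its subs, and FFS2 sorts before FFS12 since MainNo is compared as an integer.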

Query Split string into rows

I have a table that looks like this:
ID Value
1 1,10
2 7,9
I want my result to look like this:
ID Value
1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
1 10
2 7
2 8
2 9
I'm after both a range between 2 numbers with , as the delimiter (there can only be one delimiter in the value) and how to split this into rows.
Splitting the comma separated numbers is a small part of this problem. The parsing should be done in the application and the range stored in separate columns. For more than one reason: Storing numbers as strings is a bad idea. Storing two attributes in a single column is a bad idea. And, actually, storing unsanitized user input in the database is also often a bad idea.
In any case, one way to generate the list of numbers is to use a recursive CTE:
with t as (
select t.*, cast(left(value, charindex(',', value) - 1) as int) as first,
cast(substring(value, charindex(',', value) + 1, 100) as int) as last
from table t
),
cte as (
select t.id, t.first as value, t.last
from t
union all
select cte.id, cte.value + 1, cte.last
from cte
where cte.value < cte.last
)
select id, value
from cte
order by id, value;
You may need to fiddle with the value of MAXRECURSION if the ranges are really big.
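For reference, a recursive CTE stops with an error after 100 levels of recursion unless a different limit is set with a query hint. A small sketch (the CTE name n and the bound of 500 are arbitrary choices for illustration):

```sql
WITH n AS (
    SELECT value = 1
    UNION ALL
    SELECT value + 1
    FROM n
    WHERE value < 500
)
SELECT cnt = COUNT(*)   -- 500 rows are generated
FROM n
OPTION (MAXRECURSION 1000);  -- default limit is 100; 0 removes the limit
```

The hint goes at the very end of the statement that consumes the CTE. The largest explicit limit is 32767, and 0 means no limit, which is best reserved for CTEs that are certain to terminate.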
Any table that has a field with multiple values such as this is a problem in terms of design. The only ways to deal with these records as they are is to split the values on the delimiter and put them into a temporary table, implement custom splitting code, or integrate a CTE as noted. Otherwise, redesign the original table to put the comma-delimited values into separate fields, e.g.
ID LOWLIMIT HILIMIT
1 1 10
This is similar to Gordon Linoff's variant, but with some differences:
--create temp table for data sample
DECLARE @Yourdata AS TABLE ( id INT, VALUE VARCHAR(20) )
INSERT @Yourdata
( id, VALUE )
VALUES ( 1, '1,10' ),
( 2, '7,9' )
--final query
;WITH Tally
AS ( SELECT MIN(CONVERT(INT, SUBSTRING(y.VALUE, 1, CHARINDEX(',', y.value) - 1))) AS MinV ,
MAX(CONVERT(INT, SUBSTRING(y.VALUE, CHARINDEX(',', y.value) + 1, 18))) AS MaxV
FROM @Yourdata AS y
UNION ALL
SELECT MinV = MinV + 1 , MaxV
FROM Tally
WHERE MinV < Maxv
)
SELECT y.id , t.minV AS value
FROM @Yourdata AS y
JOIN tally AS t ON t.MinV BETWEEN CONVERT(INT, SUBSTRING(y.VALUE, 1, CHARINDEX(',', y.value) - 1))
AND CONVERT(INT, SUBSTRING(y.VALUE, CHARINDEX(',', y.value) + 1, 18))
ORDER BY id, minV
OPTION ( MAXRECURSION 999 ) --change it if required
output