I have a column of strings of varying length, in which dashes (-) separate alphanumeric parts.
A string could look like "A1-2-3".
I need to order first by "A1", then by "2", then by "3".
I want to achieve the following order for the column:
A1
A1-1-1
A1-1-2
A1-1-3
A1-2-1
A1-2-2
A1-2-3
A1-7
A2-1-1
A2-1-2
A2-1-3
A2-2-1
A2-2-2
A2-2-3
A2-10-1
A2-10-2
A2-10-3
A10-1-1
A10-1-2
A10-1-3
A10-2-1
A10-2-2
A10-2-3
I can separate the string with the following code:
declare @string varchar(max) = 'A1-2-3'
declare @first varchar(max) = SUBSTRING(@string,1,charindex('-',@string)-1)
declare @second varchar(max) = substring(@string, charindex('-',@string) + 1, charindex('-',reverse(@string))-1)
declare @third varchar(max) = right(@string,charindex('-',reverse(@string))-1)
select @first, @second, @third
With the above logic I thought that I could use the following:
Note: this only handles strings with 2 dashes.
select barcode from tabelWithBarcodes
order by
case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(SUBSTRING(barcode,1,charindex('-',barcode)-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
SUBSTRING(barcode,1,(charindex('-',barcode)-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(substring(barcode, charindex('-',barcode) + 1, charindex('-',reverse(barcode))-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
substring(barcode, charindex('-',barcode) + 1, charindex('-',reverse(barcode))-1)
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(right(barcode,charindex('-',reverse(barcode))-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
right(barcode,charindex('-',reverse(barcode))-1)
end
But the sorting is not working for the second and third section of the string.
(I haven't added the code for checking if the string has only 1 or no dash in it for simplicity)
Not sure if I'm on the right path here.
Is anybody able to solve this?
This is not pretty, however...
USE Sandbox;
GO
WITH VTE AS(
SELECT V.SomeString
--Randomised order
FROM (VALUES ('A1-1-1'),
('A10-1-3'),
('A10-2-2'),
('A1-1-3'),
('A10-2-1'),
('A2-2-2'),
('A1-2-1'),
('A1-2-2'),
('A2-1-1'),
('A10-1-2'),
('B2-1-2'),
('A1'),
('A2-2-1'),
('A2-10-3'),
('A10-2-3'),
('A2-1-2'),
('B1-4'),
('A2-10-2'),
('A2-2-3'),
('A10-1-1'),
('A1-A1-3'),
('A1-7'),
('A2-10-1'),
('A2-1-3'),
('A1-1-2'),
('A1-2-3')) V(SomeString)),
Splits AS(
SELECT V.SomeString,
DS.Item,
DS.ItemNumber,
CONVERT(int,STUFF((SELECT '' + NG.token
FROM dbo.NGrams8k(DS.item,1) NG
WHERE TRY_CONVERT(int, NG.Token) IS NOT NULL
ORDER BY NG.position
FOR XML PATH('')),1,0,'')) AS NumericPortion
FROM VTE V
CROSS APPLY dbo.DelimitedSplit8K(V.SomeString,'-') DS),
Pivoted AS(
SELECT S.SomeString,
MIN(CASE V.P1 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P1Alpha,
MIN(CASE V.P1 WHEN S.Itemnumber THEN S.NumericPortion END) AS P1Numeric,
MIN(CASE V.P2 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P2Alpha,
MIN(CASE V.P2 WHEN S.Itemnumber THEN S.NumericPortion END) AS P2Numeric,
MIN(CASE V.P3 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P3Alpha,
MIN(CASE V.P3 WHEN S.Itemnumber THEN S.NumericPortion END) AS P3Numeric
FROM Splits S
CROSS APPLY (VALUES(1,2,3)) AS V(P1,P2,P3)
GROUP BY S.SomeString)
SELECT P.SomeString
FROM Pivoted P
ORDER BY P.P1Alpha,
P.P1Numeric,
P.P2Alpha,
P.P2Numeric,
P.P3Alpha,
P.P3Numeric;
This outputs:
A1
A1-1-1
A1-1-2
A1-1-3
A1-2-1
A1-2-2
A1-2-3
A1-7
A1-A1-3
A2-1-1
A2-1-2
A2-1-3
A2-2-1
A2-2-2
A2-2-3
A2-10-1
A2-10-2
A2-10-3
A10-1-1
A10-1-2
A10-1-3
A10-2-1
A10-2-2
A10-2-3
B1-4
B2-1-2
This makes use of 2 user-defined functions. Firstly, DelimitedSplit8K or DelimitedSplit8k_Lead (I used DelimitedSplit8k as I don't have the other on my sandbox at the moment). Then you also have NGrams8k.
I really should explain how this works, but yuck... (edit coming).
OK... (/sigh) What it does. Firstly, we split the data into its relevant parts using delimitedsplit8k(_lead). Then, within the SELECT we use FOR XML PATH to get (only) the numerical part of that string (for example, for 'A10' we get '10') and we convert it to a numerical value (an int).
Then we pivot that data out into its respective parts: the alphabetical part and the numerical part of each item. So, for the value 'A10-A1-12' we end up with the row:
'A', 10, 'A', 1, '', 12
Then, now that we've pivoted the data, we sort it by each column individually. And voila.
This will fall over if you have a value like 'A1A' or '1B1', and honestly, I'm not changing it to cater for that. This was messy, and really isn't what the RDBMS should be doing.
Up to 3 dashes can be covered by fiddling with replace & parsename & patindex:
declare @TabelWithBarcodes table (id int primary key identity(1,1), barcode varchar(20) not null, unique (barcode));
insert into @TabelWithBarcodes (barcode) values
('2-2-3'),('A2-2-2'),('A2-2-1'),('A2-10-3'),('A2-10-2'),('A2-10-1'),('A2-1-3'),('A2-1-2'),('A2-1-1'),
('A10-2-3'),('A10-2-2'),('A10-2-10'),('A10-1-3'),('AA10-A111-2'),('A10-1-1'),
('A1-7'),('A1-2-3'),('A1-2-12'),('A1-2-1'),('A1-1-3'),('B1-1-2'),('A1-1-1'),('A1'),('A10-10-1'),('A12-10-1'), ('AB1-2-E1') ;
with cte as
(
select barcode,
replace(BarCode, '-', '.')
+ replicate('.0', 3 - (len(BarCode)-len(replace(BarCode, '-', '')))) as x
from @TabelWithBarcodes
)
select *
, substring(parsename(x,4), 1, patindex('%[0-9]%',parsename(x,4))-1)
,cast(substring(parsename(x,4), patindex('%[0-9]%',parsename(x,4)), 10) as int)
,substring(parsename(x,3), 1, patindex('%[0-9]%',parsename(x,3))-1)
,cast(substring(parsename(x,3), patindex('%[0-9]%',parsename(x,3)), 10) as int)
,substring(parsename(x,2), 1, patindex('%[0-9]%',parsename(x,2))-1)
,cast(substring(parsename(x,2), patindex('%[0-9]%',parsename(x,2)), 10) as int)
,substring(parsename(x,1), 1, patindex('%[0-9]%',parsename(x,1))-1)
,cast(substring(parsename(x,1), patindex('%[0-9]%',parsename(x,1)), 10) as int)
from cte
order by
substring(parsename(x,4), 1, patindex('%[0-9]%',parsename(x,4))-1)
,cast(substring(parsename(x,4), patindex('%[0-9]%',parsename(x,4)), 10) as int)
,substring(parsename(x,3), 1, patindex('%[0-9]%',parsename(x,3))-1)
,cast(substring(parsename(x,3), patindex('%[0-9]%',parsename(x,3)), 10) as int)
,substring(parsename(x,2), 1, patindex('%[0-9]%',parsename(x,2))-1)
,cast(substring(parsename(x,2), patindex('%[0-9]%',parsename(x,2)), 10) as int)
,substring(parsename(x,1), 1, patindex('%[0-9]%',parsename(x,1))-1)
,cast(substring(parsename(x,1), patindex('%[0-9]%',parsename(x,1)), 10) as int)
extend each barcode to 4 groups by adding trailing .0 if missing
split each barcode in 4 groups
split each group in leading characters and trailing digits
sort by the leading character first
then by casting the digits as numeric
See db<>fiddle
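To make the steps above concrete, here is a minimal sketch that walks a single value through the same expressions. The value 'A2-10' and the names @barcode and @x are hypothetical, chosen only for illustration; the expressions are meant to mirror the query above, not replace it.

```sql
-- Hypothetical single value, used only to illustrate the steps.
DECLARE @barcode varchar(20) = 'A2-10';

-- Step 1: swap dashes for dots and pad with '.0' until there are 4 groups.
DECLARE @x varchar(40) =
    REPLACE(@barcode, '-', '.')
    + REPLICATE('.0', 3 - (LEN(@barcode) - LEN(REPLACE(@barcode, '-', ''))));

SELECT @x AS padded,                 -- 'A2.10.0.0'
       -- Step 2: PARSENAME counts groups from the right, so 4 is the first group.
       PARSENAME(@x, 4) AS group1,   -- 'A2'
       PARSENAME(@x, 3) AS group2,   -- '10'
       -- Step 3: split a group into leading letters and trailing digits.
       SUBSTRING(PARSENAME(@x, 4), 1,
                 PATINDEX('%[0-9]%', PARSENAME(@x, 4)) - 1) AS group1_alpha,  -- 'A'
       CAST(SUBSTRING(PARSENAME(@x, 4),
                      PATINDEX('%[0-9]%', PARSENAME(@x, 4)), 10) AS int) AS group1_num;  -- 2
```

Because PARSENAME handles at most 4 dot-separated parts, this only stretches to barcodes with up to 3 dashes, and it assumes the barcodes themselves contain no dots.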
An alternative approach would be to use your technique to split the string into its 3 component parts, then left-pad those strings with leading zeros (or characters of your choice). That avoids any issues where the parts may contain alphanumerics rather than just numerics. However, it does mean that parts with alphabetic prefixes of different lengths may not sort as you might expect... Here's the code to play with (using the definitions from @dnoeth's excellent answer):
;with cte as
(
select barcode
, case
when barcode like '%-%' then
substring(barcode,1,charindex('-',barcode)-1)
else
barcode
end part1
, case
when barcode like '%-%' then
substring(barcode, charindex('-',barcode) + 1, case
when barcode like '%-%-%' then
(charindex('-',barcode,charindex('-',barcode) + 1)) - 1
else
len(barcode)
end
- charindex('-',barcode))
else
''
end part2
, case
when barcode like '%-%-%' then
right(barcode,charindex('-',reverse(barcode))-1) --note: assumes you don't have %-%-%-%
else
''
end part3
from @TabelWithBarcodes
)
select barcode
, part1, part2, part3
, right('0000000000' + coalesce(part1,''), 10) lpad1
, right('0000000000' + coalesce(part2,''), 10) lpad2
, right('0000000000' + coalesce(part3,''), 10) lpad3
from cte
order by lpad1, lpad2, lpad3
DBFiddle Example
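As a rough illustration of what the padding does to the sort order, the sketch below pads a few hypothetical first parts on their own. The sample values and the column name lpad are assumptions for this example only.

```sql
-- Hypothetical sample parts; each is left-padded to 10 characters with zeros.
SELECT v.part,
       RIGHT('0000000000' + v.part, 10) AS lpad
FROM (VALUES ('A2'), ('A10'), ('AA1'), ('B2')) AS v(part)
ORDER BY lpad;
-- 'A2'  becomes '00000000A2'
-- 'A10' becomes '0000000A10'
-- 'AA1' becomes '0000000AA1'
-- 'B2'  becomes '00000000B2'
```

Under a collation where digits sort before letters, 'A2' lands before 'A10' as wanted. However, 'B2' also lands before 'A10', because shorter parts carry more leading zeros. That is the kind of surprise to check for before relying on this approach with mixed prefixes.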
Been trying for some hours to convert this to a query I can use with OPENQUERY in SQL Server 2014 (to use with Progress OpenEdge 10.2B via ODBC). Can't seem to get the escaping of the quote right. Can anyone offer some assistance? Is there a tool to do it?
(There's a SQL table called #tAPBatches that is used in this, but I omitted it from this code)
DECLARE
@NoDays AS INT = 30
,@Prefix AS VARCHAR(5) = 'M_AP_'
SELECT
@Prefix + LTRIM(CAST(gh.[Batch-Number] AS VARCHAR(20))) AS BatchNo
,gh.[Batch-Number] AS BatchNo8
, aph.[Reference-number] AS InvoiceNo
,aph.[Voucher-Number] AS VoucherNo
,aph.[Amount] AS InvoiceTotal
,gh.[Journal-Number] AS JournalNo
,4 AS FacilityID
,CASE aph.[voucher-type]
WHEN 'DM' THEN 5
ELSE 1
END AS DocType
,apb.[Batch-Desc] AS BatchDesc
,apb.[Posting-Date] AS PostingDate
,apb.[Posting-Period]
,apb.[Posting-Fiscal-Year]
,apb.[Batch-Status]
,apb.[Expected-Count]
,apb.[Expected-Amount]
,apb.[Posted-To-GL-By]
,'Broadview' AS FacilityName
,apb.[Date-Closed] AS BatchDate
,gh.[Posted-by] AS PostUser
,gh.[Posted-Date] AS PostDT
,gh.[Created-Date] AS CreateDT
,gh.[Created-By] AS CreateUser
,aph.[Supplier-Key] AS VendorID
,sn.[Supplier-Name]
,aph.[Invoice-Date] AS InvoiceDate
,-1 AS Total
,-1 AS Discount
,gh.[Posted-by] AS Username
,CASE gt.[Credit-Debit]
WHEN 'CR' THEN LEFT(CAST(gacr.[GL-Acct] AS VARCHAR(20)), 2) + '.' + SUBSTRING(CAST(gacr.[GL-Acct] AS VARCHAR(20)), 3, 6) + '.'
+ RIGHT(CAST(gacr.[GL-Acct] AS VARCHAR(20)), 3)
ELSE NULL
END AS GLCreditAcct
,CASE gt.[Credit-Debit]
WHEN 'DR' THEN LEFT(CAST(gacr.[GL-Acct] AS VARCHAR(20)), 2) + '.' + SUBSTRING(CAST(gacr.[GL-Acct] AS VARCHAR(20)), 3, 6) + '.'
+ RIGHT(CAST(gacr.[GL-Acct] AS VARCHAR(20)), 3)
ELSE NULL
END AS GLDebitAcct
,CASE gt.[Credit-Debit]
WHEN 'CR' THEN gacr.[Report-Label]
ELSE NULL
END AS GLCreditDesc
,CASE gt.[Credit-Debit]
WHEN 'DR' THEN gacr.[Report-Label]
ELSE NULL
END AS GLDebitDesc
,'D' AS [Status]
,aph.[PO-Number] AS PoNo
,aph.[Terms-Code] AS TermsCode
,aph.[Due-Date] AS DueDate
,'' AS Comments
,aph.[Discount-Date] AS DiscountDate
,aph.[Discount-Amount] AS DiscountAmount
,aph.[Discount-Taken] AS DiscountTaken
,aph.[Amount] AS APAmount
,gt.[Amount]
,'BA REGULAR ' AS CheckBookID --ToDO
,0 AS Transferred
,aph.[voucher-type] AS VoucherType
,gt.[Credit-Debit]
,gacr.[Account-type]
,aph.[Freight-Ref-Num]
FROM
[Progress].[GAMS1].pub.[GL-Entry-Header] gh
INNER JOIN [Progress].[GAMS1].pub.[gl-entry-trailer] gt ON gt.[System-ID] = gh.[System-ID] AND gt.[Origin] = gh.[Origin] AND gt.[Journal-Number] = gh.[Journal-Number]
INNER JOIN [Progress].[GAMS1].pub.[apinvhdr] aph ON (gh.[Journal-Number] = aph.[Journal-Number]
OR (gh.[Journal-Num-Reversal-Of] = aph.[Journal-Number] AND aph.[Journal-Number] <> ' ' AND gh.[Journal-Num-Reversal-Of] <> ' '))
AND gh.[system-id] = aph.[system-id-gl]
AND gh.origin = 'inv'
AND gh.[system-id] = 'arcade'
INNER JOIN [Progress].[GAMS1].pub.[APInvoiceBatch] apb ON gh.[Batch-number] = apb.[Batch-number]
AND apb.[system-id] = 'lehigh'
AND apb.[Posted-To-GL] = 1
INNER JOIN [Progress].[GAMS1].pub.[GL-accts] gacr ON gacr.[system-id] = gt.[system-id]
AND gacr.[Gl-Acct-Ptr] = gt.[GL-Acct-Ptr]
INNER JOIN [Progress].[GAMS1].pub.[suppname] sn ON sn.[Supplier-Key] = aph.[Supplier-Key]
AND sn.[system-id] = 'arcade'
WHERE
gh.[Posted-Date] > CAST(DATEADD(DAY, -@NoDays, GETDATE()) AS DATE)
AND case
when CAST(gh."Posting-Period" as int) < 10 then gh."Posting-Year" + '0' + ltrim(gh."Posting-Period")
else gh."Posting-Year" + Ltrim(gh."Posting-Period")
end > '201501'
AND gh.[Batch-number] NOT IN (SELECT
BatchNo COLLATE SQL_Latin1_General_CP1_CI_AS
FROM
#tAPBatches)
TIA
MArk
Here's an example of what's giving me a syntax error. This works, but "M_AP_" is a parameter passed to SP
DECLARE
@NoDays AS INT = 5
,@Prefix AS VARCHAR(5) = 'M_AP_';
DECLARE
@InterestDate AS varchar(20)
SELECT @InterestDate = CAST(CAST(DATEADD(DAY, -@NoDays, GETDATE()) AS DATE) AS VARCHAR(20))
SELECT * FROM OPENQUERY(PROGRESS,
'SELECT TOP 100 ''M_AP_'' + LTRIM(CAST(gh."Batch-Number" AS VARCHAR(20))) AS BatchNo
, gh."Batch-Number"
This works, but when I try to swap in the variable I get Incorrect Syntax near '+'
DECLARE
@NoDays AS INT = 5
,@Prefix AS VARCHAR(5) = 'M_AP_';
DECLARE
@InterestDate AS varchar(20)
SELECT @InterestDate = CAST(CAST(DATEADD(DAY, -@NoDays, GETDATE()) AS DATE) AS VARCHAR(20))
SELECT * FROM OPENQUERY(PROGRESS,
'SELECT TOP 100 '' ' + @Prefix + ' '' + LTRIM(CAST(gh."Batch-Number" AS VARCHAR(20))) AS BatchNo
, gh."Batch-Number"
FROM
"GAMS1".pub."GL-Entry-Header" gh
OPENQUERY will only support a string literal query that is less than 8K. You might be running into that limit if you've got even more code that you're not showing here. Make sure that your query is less than 8000 bytes, or create procedures or views to reduce the size of your query.
It only accepts a single string literal... so if you are trying to concatenate strings and parameters together, it will not work. There are some ways to work around this by using dynamic SQL or creating supporting tables or views for filters.
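One way the dynamic SQL workaround might look is sketched below, using only the first column of the query from the question. The variable names @remote and @sql are assumptions for this example. The idea is to build the whole OPENQUERY statement as a string, doubling the quotes once per nesting level, and then execute that string.

```sql
DECLARE @Prefix varchar(5) = 'M_AP_';

-- The remote query as it should reach Progress, with the prefix inlined as a literal.
DECLARE @remote nvarchar(max) =
    N'SELECT TOP 100 ''' + REPLACE(@Prefix, '''', '''''')
  + N''' + LTRIM(CAST(gh."Batch-Number" AS VARCHAR(20))) AS BatchNo'
  + N' FROM "GAMS1".pub."GL-Entry-Header" gh';

-- Wrap it in OPENQUERY, doubling every quote once more for the outer literal.
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM OPENQUERY(PROGRESS, '''
  + REPLACE(@remote, '''', '''''') + N''')';

PRINT @sql;                   -- inspect the generated statement first
EXEC sys.sp_executesql @sql;
```

Letting REPLACE do the second round of quote doubling avoids counting quotes by hand. The 8K limit on the literal passed to OPENQUERY still applies to the generated text.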
When the following query is executed, it fails to run unless the 2nd substring statement (commented out here) is uncommented. What is going on here that I am missing?
Uses the Northwind database
SELECT Substring(Contactname, Charindex(' ', Contactname) + 1, Len(Contactname))AS LastName,
Substring(Contactname, 1, Charindex(' ', Contactname) - 1) AS FirstName1
--, substring(ContactName, 1, 4) AS FirstName2
-- if this line is commented out then the query crashes with the error msg
--Invalid length parameter passed to the LEFT or SUBSTRING function.
,
Phone,
Orderid,
Orderdate
FROM customers
INNER JOIN orders
ON customers.Customerid = orders.Customerid
Charindex(' ', Contactname) - 1
Returns -1 if Contactname does not contain a space. This is an invalid length parameter.
There must be a Contactname that causes the Substring expression to fail but that is filtered out by the JOIN.
Presumably the compute scalar shifts around between the two plans and happens to be evaluated after the join when you have that line uncommented.
See SQL Server should not raise illogical errors for some discussion on this type of issue.
A way around this would be to append a space to the input to Charindex
Substring(Contactname, 1, Charindex(' ', Contactname + ' ' ) - 1)
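A quick way to try the appended-space workaround in isolation is the sketch below. The two names in the VALUES list are hypothetical sample data, not taken from a real query against Northwind.

```sql
-- Hypothetical sample names: one with a space, one without.
SELECT v.ContactName,
       SUBSTRING(v.ContactName, 1,
                 CHARINDEX(' ', v.ContactName + ' ') - 1) AS FirstName
FROM (VALUES ('Maria Anders'), ('Cher')) AS v(ContactName);
-- 'Maria Anders' gives 'Maria'; 'Cher' gives 'Cher' rather than raising an error.
```

With the extra space, CHARINDEX always finds a match, so the length argument is never negative regardless of where the optimizer evaluates the expression.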
You need to watch out for the negative cases: a null value, an empty string, or a person with only one name.
I used a Common Table Expression since I did not want the charindex() function all over the place.
Also, your first substring() did not subtract the correct number of characters.
-- Use the sample db
use [Northwind]
go
-- Watch out for null & one name
;
with cteContactsOrders
as
(
SELECT
Contactname as FullName,
Substring(IsNull(Contactname, ''), 1, 4) as FirstFour,
Charindex(' ', IsNull(Contactname, '')) as Pos,
Phone,
Orderid,
Orderdate
FROM
customers as c
INNER JOIN
orders as o
ON
c.Customerid = o.Customerid
)
select
co.*,
case
when Pos > 0 then substring(FullName, 1, Pos-1)
when Pos = 0 and len(ltrim(rtrim(FullName))) > 0 then FullName
else ''
end as FirstName,
case
when Pos > 0 then substring(FullName, Pos+1, len(FullName) - Pos)
else ''
end as LastName
from
cteContactsOrders co
The output on SQL Server 2014 CTP2.