How to separate multiple values with strings - sql

I can't find a proper function to separate my strings.
I tried the LEFT and RIGHT functions, but they only handle one value, not several. STRING_SPLIT separates all the values, but the pieces no longer line up with the rows they came from. That is, I need Atlanta in one column and IN in another.
(Location)
Atlanta, IN
Atlanta, IN
Cedar Rapids, IA
Cedar Rapids, IA
Indianapolis, IN
Indianapolis, IN
Dearborn, MI
SELECT *
FROM Salary_2
CROSS APPLY
string_split ("Location", ',')
(value)
Atlanta
IN
Cedar Rapids
IA
Indianapolis
IN
Dearborn
MI

You can do it using SUBSTRING and CHARINDEX
SELECT location, SUBSTRING(location, 0, CHARINDEX(',', location)) as city,
SUBSTRING(location, CHARINDEX(',', location)+2, 4) as code
FROM Salary_2
SUBSTRING extracts some characters from a string.
CHARINDEX searches for a substring in a string and returns its position.
Check it here: https://dbfiddle.uk/pcmBIH68
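The same idea can be made tolerant of rows that have no comma. This is only a sketch: the inline VALUES list stands in for Salary_2, and the 'Dearborn' row without a state is an invented test case. Appending a comma before searching means CHARINDEX always finds one, so LEFT never receives a negative length.

```sql
-- Sketch only: the VALUES list is a stand-in for the Salary_2 table.
SELECT v.Location,
       -- everything before the first comma (or the whole string if there is none)
       LEFT(v.Location, CHARINDEX(',', v.Location + ',') - 1) AS City,
       -- everything after the first comma, with the leading space trimmed
       LTRIM(SUBSTRING(v.Location, CHARINDEX(',', v.Location + ',') + 1, LEN(v.Location))) AS StateCode
FROM (VALUES ('Atlanta, IN'), ('Cedar Rapids, IA'), ('Dearborn')) AS v(Location);
```

For the row without a comma, City holds the whole value and StateCode comes back as an empty string.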

A numbers table makes it pretty quick. This has the advantage of not requiring an explicit loop via a script. It also lets you capture the order of the results if that is important.
declare @s varchar(255) = 'Atlanta, IN Cedar Rapids, IA Indianapolis, IN Dearborn, MI';
with data as (
select substring(@s,
5 + lag(n, 1, -4) over (order by n),
n - lag(n, 1, -4) over (order by n)
) as CityState, n
from numbers
where substring(@s, n, 1) = ','
)
)
select n, left(CityState, len(CityState) - 4), right(CityState, 2)
from data
order by n;
https://dbfiddle.uk/yD5iSVxj
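The query above assumes a table called numbers with an integer column n already exists. If it does not, one way to build it is sketched below; the range 1 to 1000 is an arbitrary choice that comfortably covers a varchar(255).

```sql
-- Sketch: build a small numbers table holding 1..1000.
CREATE TABLE numbers (n int NOT NULL PRIMARY KEY);

WITH d AS (
    SELECT v FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS t(v)
)
INSERT INTO numbers (n)
SELECT 1 + a.v + 10 * b.v + 100 * c.v   -- units, tens, hundreds
FROM d AS a
CROSS JOIN d AS b
CROSS JOIN d AS c;
```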

Related

Count Similar Substrings SQL query

I've tried a few scenarios and googled a lot, but still can't find a solution.
I have a table of user names with entries something like the below:
UserName
Cakes420
18Jack01
18Jack04
16Jack22
22Jack16
Mapple7609
Chrom44
chrom22
chrom77
013Cake
016Cake
122Cake
123Cake87
So I need a query that checks for all records that share 4 or more (in sequence) characters in the table.
So I need to return something like :
Characters  Times Used  Names Sharing
----------  ----------  ----------------------------------------------
Cake        5           Cakes420, 013Cake, 016Cake, 122Cake, 123Cake87
Chro        3           Chrom44, chrom22, chrom77
or anything similar as I'd prefer not to repeat patterns, but hey, at this stage if it returns the values properly, I don't mind.
The shared characters can naturally appear in any place in the string, which is what makes this so difficult.
Should you do this in T-SQL? Probably not.
Can you do this in T-SQL? Yes.
Sample data
create table Names
(
Name nvarchar(20)
);
insert into Names (Name) values
('Cakes420'),
('18Jack01'),
('18Jack04'),
('16Jack22'),
('22Jack16'),
('Mapple7609'),
('Chrom44'),
('chrom22'),
('chrom77'),
('013Cake'),
('016Cake'),
('122Cake'),
('123Cake87');
Solution
Using STRING_AGG() for easy concatenation. Available from SQL Server 2017. Alternatives available for older SQL versions (use the search box on this site, there are many examples).
with rcte as
(
select n.Name,
convert(nvarchar(4), substring(n.Name, 1, 4)) as Part,
1 as PartFrom
from Names n
where len(n.Name) >= 4
union all
select r.Name,
convert(nvarchar(4), substring(r.Name, r.PartFrom+1, r.PartFrom+4)),
r.PartFrom+1
from rcte r
where len(r.Name) >= r.PartFrom+4
),
cte_count as
(
select r.Part,
count(1) as PartCount
from rcte r
where r.Part not like '%[0-9]%' -- exclude parts with numbers in them
group by r.Part
having count(1) > 1
)
select c.Part,
c.PartCount,
string_agg(r.Name, ', ') as Names
from cte_count c
join rcte r
on r.Part = c.Part
group by c.Part,
c.PartCount
order by c.Part;
Result
Part PartCount Names
---- --------- ----------------------------------------------
Cake 5 Cakes420, 123Cake87, 122Cake, 016Cake, 013Cake
Chro 3 Chrom44, chrom22, chrom77
hrom 3 chrom77, chrom22, Chrom44
Jack 4 22Jack16, 16Jack22, 18Jack04, 18Jack01
Fiddle to see it in action with the intermediate CTE results.
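For versions before SQL Server 2017, the concatenation that STRING_AGG() performs is commonly done with STUFF and FOR XML PATH. The sketch below runs against the Names table from the sample data; to stay short it uses a hard-coded list of parts instead of the cte_count result, and whether LIKE matches 'chrom' and 'Chrom' alike depends on the collation.

```sql
-- Sketch: pre-2017 replacement for STRING_AGG, using the Names table above.
SELECT p.Part,
       STUFF((SELECT ', ' + n2.Name
              FROM Names AS n2
              WHERE n2.Name LIKE '%' + p.Part + '%'
              FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),
             1, 2, '') AS Names              -- STUFF removes the leading ', '
FROM (VALUES (N'Cake'), (N'Jack')) AS p(Part); -- hard-coded parts, for illustration only
```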
Let's use Itzik Ben-Gan's Tally Function to break out a list of substrings, then group them. This is called N-Gram, after the more common Trigram which is 3-character substrings.
I've removed one extra cross-join from the function to speed it up slightly, it's now good for up to varchar(65536):
CREATE OR ALTER FUNCTION dbo.GetNums(@num AS BIGINT)
RETURNS TABLE
AS
RETURN
WITH
L0 AS ( SELECT 1 AS c
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1)) AS D(c) ),
L1 AS ( SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B ),
L2 AS ( SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B ),
Nums AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM L2 )
SELECT TOP(@num)
rownum AS rn
FROM Nums
ORDER BY rownum;
GO
DECLARE @substringLen int = 4;
SELECT
Characters,
[Times Used] = COUNT(*),
[Names Sharing] = STRING_AGG(Username, ', ')
FROM (
SELECT DISTINCT
-- remove DISTINCT if you want to know about multiple in a single username
t.Username,
Characters = SUBSTRING(t.Username, n.rn, @substringLen)
FROM myTable t
CROSS APPLY dbo.GetNums (LEN(t.UserName) - @substringLen + 1) n
) t
GROUP BY t.Characters
HAVING COUNT(*) > 1
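To see what the N-gram expansion produces before any grouping, it can help to run it for a single name. This sketch uses an inline list of numbers instead of dbo.GetNums so that it runs on its own; the variable names are illustrative.

```sql
-- Sketch: list every 4-character substring of one user name.
DECLARE @name varchar(20) = 'Cakes420', @len int = 4;

SELECT n.rn AS StartPos,
       SUBSTRING(@name, n.rn, @len) AS Gram
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS n(rn)
WHERE n.rn <= LEN(@name) - @len + 1   -- last start position that still yields a full gram
ORDER BY n.rn;
-- Returns Cake, akes, kes4, es42, s420
```

The full query does the same for every row and then groups by Gram.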

I want to select the value before and after the comma

SQL query to select the value before and after the comma
There is a table called employee with three fields: Id, Name, DepartmentId.
DepartmentId consists of three comma-separated IDs: 201,301,401.
From this I want to execute
Select * from employee where DepartmentId =301
This will work if you're always looking to match on the number in the middle, e.g. 301 in your example.
SELECT *
FROM employee
WHERE
SUBSTRING (
DepartmentId,
CHARINDEX(',', DepartmentId, 0) + 1,
CHARINDEX(',', DepartmentId, CHARINDEX(',', DepartmentId, 0) + 1) - CHARINDEX(',', DepartmentId, 0) - 1
) = '301'
If 301 can be in any location in that field then you can just use LIKE
SELECT *
FROM employee
WHERE DepartmentId LIKE '%301%'
SELECT * FROM EMPLOYEE WHERE DEPARTMENTID LIKE '%301%'
Even if it works, I don't really understand how your DB is designed and why you need commas
You can use the parsename() function:
. . .
where parsename(replace(DepartmentId, ',', '.'), 2) = 301;
However, a LIKE predicate is also useful:
. . .
where DepartmentId LIKE '%301%';
Comma-separated columns are an antipattern. You want to normalize your departments.
However, here's one way to do it:
select * from employee
where '301' in (SELECT value FROM STRING_SPLIT(DepartmentId, ','))
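The normalization mentioned above could look roughly like this. It is a sketch, not the asker's schema: EmployeeDepartment is a hypothetical junction table, and only the Id column of employee is taken from the question.

```sql
-- Sketch: one row per (employee, department) instead of a comma-separated list.
CREATE TABLE EmployeeDepartment (
    EmployeeId   int NOT NULL,
    DepartmentId int NOT NULL,
    PRIMARY KEY (EmployeeId, DepartmentId)
);

-- The lookup becomes a plain equality test that can use an index.
SELECT e.*
FROM employee AS e
JOIN EmployeeDepartment AS ed
    ON ed.EmployeeId = e.Id
WHERE ed.DepartmentId = 301;
```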
If you have departments like 1301 or 3011, then a simple LIKE can return false matches.
Please check the following SQL code:
select * from employees where ','+departmentid+',' like '%,301,%'
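The difference between the two LIKE patterns is easiest to see side by side. The sample values below are invented for illustration.

```sql
-- Sketch: compare the naive pattern with the delimiter-wrapped pattern.
SELECT v.DepartmentId,
       CASE WHEN v.DepartmentId LIKE '%301%' THEN 1 ELSE 0 END AS NaiveMatch,
       CASE WHEN ',' + v.DepartmentId + ',' LIKE '%,301,%' THEN 1 ELSE 0 END AS DelimitedMatch
FROM (VALUES ('201,301,401'), ('1301,401'), ('301'), ('201,3011')) AS v(DepartmentId);
```

NaiveMatch is 1 for all four rows, while DelimitedMatch is 1 only for '201,301,401' and '301'.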
Another option is to split the departmentid column into a list of separate department id values.
If you are on a version before SQL Server 2016, you need your own custom string split function.
Then you can use the following SQL query:
select e.*
from employees as e
cross apply dbo.split(departmentid,',') as s
where s.val = '301'
If you are working on SQL Server 2016 or later, the built-in STRING_SPLIT function can be used, again in a CROSS APPLY query:
SELECT e.*
FROM employees as e
CROSS APPLY STRING_SPLIT(departmentid, ',')
WHERE value = '301'
One last method is an XML query: convert the comma-separated list into XML data as follows:
select
Id, Name, sqlXML.value('.','varchar(5)') as DepId
from (
SELECT
Id, Name,
convert(xml, '<root><t>' + REPLACE(Departmentid, ',', '</t><t>') + '</t></root>') as dlist
FROM employees
) tbl
CROSS APPLY dlist.nodes('/root/t') as XMLData(sqlXML)
WHERE sqlXML.value('.','varchar(5)') = '301'

SQL: how to join tables where the data in the columns is alike

I am managing a large database. I am trying to join two tables, but the data in the columns doesn't actually match: one has dashes and the other has spaces.
For example, GPD 142 pol (Partnumber) in the Company table and GPD-142-pol (PartNumber) in the Customer table.
My query is written like this:
SELECT *
FROM CompanyPartsList
JOIN SalesReport
On FordPartsList.[Company Part Number] = SalesReport.[Customer Part #]
I tried something like this:
SELECT *
FROM CompanyPartsList
JOIN SalesReport
On FordPartsList.[Company Part Number] Like SalesReport.[Customer Part #]
Any help would be appreciated.
Again, doing this will be very slow. A solution would be a trigger that maintains a correctly formatted column on either side.
SELECT *
FROM CompanyPartsList
JOIN SalesReport
On CompanyPartsList.[Company Part Number] = Replace(SalesReport.[Customer Part #],'-',' ')
Try replacing characters that can cause the values to be different.
SELECT
*
FROM
CompanyPartsList cpl,
SalesReport sr
WHERE
REPLACE(REPLACE(cpl.[Company Part Number],'-',''),' ','') = REPLACE(REPLACE(sr.[Customer Part #],'-',''),' ','')
At the end of the day, for the JOIN to work, you've got to have matching values. So your only option is to find a way to make them equal.
Given your examples, I'd experiment to see if there's a way that you could normalize the values to a standard. For example, you could try to remove all the spaces and hyphens on both sides by using REPLACE, and see if that does it for you.
If you're able to get matches that way, you've got two choices. You could always do that normalization on-the-fly when you do the JOIN, but that would probably be performance prohibitive. Or you could add another column to each table, which you update at the same time you update the real part#, setting it to that value with the spaces, hyphens, etc, removed.
Aside from that headache, your potential problem is the chance of clashes. What happens if you've got part#s "123-4B" and "12-34B"? If you use my proposal, those two products would appear the same.
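The extra-column idea can be sketched as a persisted computed column, so the engine keeps the stripped value in step with the real part number. PartKey and the index name are made-up names, and the same would be needed on SalesReport for its [Customer Part #] column. Indexing assumes the part number column is not a (max) type.

```sql
-- Sketch: store the normalized key alongside the real part number.
ALTER TABLE CompanyPartsList
ADD PartKey AS REPLACE(REPLACE([Company Part Number], '-', ''), ' ', '') PERSISTED;
GO

-- An index on the stored key lets the join avoid per-row REPLACE calls.
CREATE INDEX IX_CompanyPartsList_PartKey ON CompanyPartsList (PartKey);
```

The clash described above still applies: "123-4B" and "12-34B" get the same PartKey.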
Create a function as mentioned in question 1007697 and modify it slightly to strip everything except letters and digits:
Create Function [dbo].[RemoveNonAlphaNumericCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z0-9]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
Return @Temp
End
Then you could compare data, but it may be slow on a large table:
SELECT *
FROM CompanyPartsList
JOIN SalesReport
On dbo.RemoveNonAlphaNumericCharacters(CompanyPartsList.[Company Part Number])
= dbo.RemoveNonAlphaNumericCharacters(SalesReport.[Customer Part #])
If you end up wanting to use a function I would recommend using an inline table valued function instead of a scalar function that has a while loop inside. The performance of that scalar function is going to degrade quickly as the table gets larger. Here is an example of using an inline table valued function and a tally table so the replacement is set based.
If this was my code I would prefer to use the REPLACE option if that is a possibility.
CREATE FUNCTION [dbo].[StripNonAlphaNumeric_itvf]
(
@OriginalText VARCHAR(8000)
) RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
Tally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select STUFF(
(
SELECT SUBSTRING(@OriginalText, t.N, 1)
FROM tally t
WHERE
(
ASCII(SUBSTRING(@OriginalText, t.N, 1)) BETWEEN 48 AND 57 --numbers 0 - 9
OR
ASCII(SUBSTRING(@OriginalText, t.N, 1)) BETWEEN 65 AND 90 --UPPERCASE letters
OR
ASCII(SUBSTRING(@OriginalText, t.N, 1)) BETWEEN 97 AND 122 --LOWERCASE letters
)
AND n <= len(@OriginalText)
FOR XML PATH('')
), 1 ,0 , '') AS CleanedText
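Because the function above returns a table rather than a scalar, it is called with CROSS APPLY instead of inside the ON clause. A sketch of the join, using the table and column names from the question (the aliases are arbitrary):

```sql
-- Sketch: join on the cleaned part numbers produced by the inline function.
SELECT cpl.*, sr.*
FROM CompanyPartsList AS cpl
CROSS APPLY dbo.StripNonAlphaNumeric_itvf(cpl.[Company Part Number]) AS c
CROSS JOIN SalesReport AS sr
CROSS APPLY dbo.StripNonAlphaNumeric_itvf(sr.[Customer Part #]) AS s
WHERE s.CleanedText = c.CleanedText;   -- match rows whose cleaned values agree
```

This still cleans every row at query time, so on large tables it is worth comparing against the plain REPLACE approach.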

How to split and display distinct letters from a word in SQL?

Yesterday in a job interview I was asked this question and I had no clue about it. Suppose I have the word "Manhattan" and I want to display only the letters 'M','A','N','H','T'
in SQL. How do I do it?
Any help is appreciated.
Well, here is my solution (sqlfiddle) - it aims to use "relational SQL" operations, which may have been what the interviewer was going for conceptually.
Most of the work done is simply to turn the string into a set of (pos, letter) records as the relevant final applied DQL is a mere SELECT with a grouping and ordering applied.
select letter
from (
-- All of this just to get a set of (pos, letter)
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
-- Or use another form to create a "numbers table"
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
) as pairs
group by letter -- guarantees distinctness
order by min(pos) -- ensure output is ordered MANHT
The above query works in SQL Server 2008, but the "Numbers Table" may have to be altered for other vendors. Otherwise, there is nothing used that is vendor specific - no CTE, or cross application of a function, or procedural language code ..
That being said, the above is to show a conceptual approach - SQL is designed for use with sets and relations and multiplicity across records; the above example is, in some sense, merely a perversion of such.
Examining the intermediate relation,
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
uses a cross join to generate the Cartesian product of the string (1 row) with the numbers (9 rows); the substring function is then applied with the string and each number to obtain each character in accordance with its position. The resulting set contains the records-
POS LETTER
1 M
2 A
3 N
..
9 N
Then the outer select groups each record according to the letter, and the resulting records are ordered by the minimum (first) position at which each letter occurs. (Without the order by, the letters would still be distinct, but the final order would not be guaranteed.)
One way (if using SQL Server) is with a recursive CTE (Common Table Expression).
DECLARE @source nvarchar(100) = 'MANHATTAN'
;
WITH cte AS (
SELECT SUBSTRING(@source, 1, 1) AS c1, 1 as Pos
WHERE LEN(@source) > 0
UNION ALL
SELECT SUBSTRING(@source, Pos + 1, 1) AS c1, Pos + 1 as Pos
FROM cte
WHERE Pos < LEN(@source)
)
SELECT DISTINCT c1 from cte
SqlFiddle for this is here. I had to inline the @source for SqlFiddle, but the code above works fine in SQL Server.
The first SELECT generates the initial row (in this case 'M', 1). The second SELECT is the recursive part that generates the subsequent rows, with the Pos column incremented each time until the condition WHERE Pos < LEN(@source) no longer holds. The final select removes the duplicates. Internally, SELECT DISTINCT sorts the rows in order to facilitate the removal of duplicates, which is why the final output happens to be in alphabetic order. Since you didn't specify order as a requirement, I left it as-is. But you could modify it to use a GROUP BY instead, ordered on MIN(Pos), if you needed the output in the characters' original order.
This same technique can be used for things like generating all the Bigrams for a string, with just a small change to the general structure above.
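As a sketch of that small change, the version below takes two characters per step instead of one and stops one position earlier. The column name Bigram is an illustrative choice.

```sql
-- Sketch: recursive CTE that lists every bigram of the string in order.
DECLARE @source nvarchar(100) = 'MANHATTAN';

WITH cte AS (
    SELECT SUBSTRING(@source, 1, 2) AS Bigram, 1 AS Pos
    WHERE LEN(@source) >= 2
    UNION ALL
    SELECT SUBSTRING(@source, Pos + 1, 2), Pos + 1
    FROM cte
    WHERE Pos + 1 < LEN(@source)   -- the last bigram starts at LEN - 1
)
SELECT Pos, Bigram
FROM cte
ORDER BY Pos;
-- Returns MA, AN, NH, HA, AT, TT, TA, AN
```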
declare @charr varchar(99)
declare @lp int
set @charr='Manhattan'
set @lp=1
DECLARE @T1 TABLE (
FLD VARCHAR(max)
)
while(@lp<=LEN(@charr))
begin
if(not exists(select * from @T1 where FLD=(select SUBSTRING(@charr,@lp,1))))
begin
insert into @T1
select SUBSTRING(@charr,@lp,1)
end
set @lp=@lp+1
end
select * from @T1
Check this; it may help you.
Here's an Oracle version of @user2864740's answer. The only difference is how you construct the "numbers table" (plus slight differences in aliasing).
select letter
from (
select ns.n as pos, substr(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s from dual) ss
cross join (
SELECT LEVEL as n
FROM DUAL
CONNECT BY LEVEL <= 9
ORDER BY LEVEL) ns
) pairs
group by letter
order by min(pos)

Sort string as number in sql server

I have a column that contains data like this. Dashes indicate multiple copies of the same invoice, and these have to be sorted in ascending order.
790711
790109-1
790109-11
790109-2
I have to sort it in increasing order by this number, but since this is a varchar field it sorts in alphabetical order, like this:
790109-1
790109-11
790109-2
790711
To fix this, I tried replacing the dash with an empty string, casting the result to a number, and sorting on that:
select cast(replace(invoiceid,'-','') as decimal) as invoiceSort...............order by invoiceSort asc
while this is better and sorts like this
invoiceSort
790711 (790711) <-----this is wrong now as it should come later than 790109
790109-1 (7901091)
790109-2 (7901092)
790109-11 (79010911)
Someone suggested that I split the invoice id on the dash and order by the two parts,
e.g. order by split1 asc, split2 asc (790109, 1),
which I think would work, but how would I split the column?
The various split functions on the internet return a table, while in this case I would need a scalar function.
Are there any other approaches that can be used? The data is shown in a GridView, and GridView doesn't support sorting on two columns by default (I can implement it though :) ), so any simpler approach would be very welcome.
EDIT: Thanks for all the answers. While every answer is correct, I have chosen the one that let me incorporate these columns into the GridView sorting with minimal refactoring of the SQL queries.
Judicious use of REVERSE, CHARINDEX, and SUBSTRING, can get us what we want. I have used hopefully-explanatory columns names in my code below to illustrate what's going on.
Set up sample data:
DECLARE @Invoice TABLE (
InvoiceNumber nvarchar(10)
);
INSERT @Invoice VALUES
('790711')
,('790709-1')
,('790709-11')
,('790709-21')
,('790709-212')
,('790709-2')
SELECT * FROM @Invoice
Sample data:
InvoiceNumber
-------------
790711
790709-1
790709-11
790709-21
790709-212
790709-2
And here's the code. I have a nagging feeling the final expressions could be simplified.
SELECT
InvoiceNumber
,REVERSE(InvoiceNumber)
AS Reversed
,CHARINDEX('-',REVERSE(InvoiceNumber))
AS HyphenIndexWithinReversed
,SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))
AS ReversedWithoutAffix
,SUBSTRING(InvoiceNumber,1+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS AffixIncludingHyphen
,SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS AffixExcludingHyphen
,CAST(
SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS int)
AS AffixAsInt
,REVERSE(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber)))
AS WithoutAffix
FROM @Invoice
ORDER BY
-- WithoutAffix
REVERSE(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber)))
-- AffixAsInt
,CAST(
SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS int)
Output:
InvoiceNumber Reversed   HyphenIndexWithinReversed ReversedWithoutAffix AffixIncludingHyphen AffixExcludingHyphen AffixAsInt  WithoutAffix
------------- ---------- ------------------------- -------------------- -------------------- -------------------- ----------- ------------
790709-1      1-907097   2                         907097               -1                   1                    1           790709
790709-2      2-907097   2                         907097               -2                   2                    2           790709
790709-11     11-907097  3                         907097               -11                  11                   11          790709
790709-21     12-907097  3                         907097               -21                  21                   21          790709
790709-212    212-907097 4                         907097               -212                 212                  212         790709
790711        117097     0                         117097                                                         0           790711
Note that all you actually need is the ORDER BY clause, the rest is just to show my working, which goes like this:
Reverse the string, find the hyphen, get the substring after the hyphen, reverse that part: This is the number without any affix
The length of (the number without any affix) tells us how many characters to drop from the start in order to get the affix including the hyphen. Drop an additional character to get just the numeric part, and convert this to int. Fortunately we get a break from SQL Server in that this conversion gives zero for an empty string.
Finally, having got these two pieces, we simply ORDER BY (the number without any affix) and then by (the numeric value of the affix). This is the final order we seek.
The code would be more concise if SQL Server allowed us to say SUBSTRING(value, start) to get the string starting at that point, but it doesn't, so we have to say SUBSTRING(value, start, LEN(value)) a lot.
Try this one -
Query:
DECLARE @Invoice TABLE (InvoiceNumber VARCHAR(10))
INSERT @Invoice
VALUES
('790711')
, ('790709-1')
, ('790709-21')
, ('790709-11')
, ('790709-211')
, ('790709-2')
;WITH cte AS
(
SELECT
InvoiceNumber
, lenght = LEN(InvoiceNumber)
, delimeter = CHARINDEX('-', InvoiceNumber)
FROM @Invoice
)
SELECT InvoiceNumber
FROM cte
CROSS JOIN (
SELECT repl = MAX(lenght - delimeter)
FROM cte
WHERE delimeter != 0
) mx
ORDER BY
SUBSTRING(InvoiceNumber, 1, ISNULL(NULLIF(delimeter - 1, -1), lenght))
, RIGHT(REPLICATE('0', repl) + SUBSTRING(InvoiceNumber, delimeter + 1, lenght), repl)
Output:
InvoiceNumber
-------------
790709-1
790709-2
790709-11
790709-21
790709-211
790711
Try this
SELECT invoiceid FROM Invoice
ORDER BY
CASE WHEN PatIndex('%[-]%',invoiceid) > 0
THEN LEFT(invoiceid,PatIndex('%[-]%',invoiceid)-1)
ELSE invoiceid END * 1
,CASE WHEN PatIndex('%[-]%',REVERSE(invoiceid)) > 0
THEN RIGHT(invoiceid,PatIndex('%[-]%',REVERSE(invoiceid))-1)
ELSE NULL END * 1
SQLFiddle Demo
The above query uses two CASE expressions:
The first sorts on the first part of the invoice id (e.g. 790109 for 790109-1).
The second sorts on the part after the '-' (e.g. 1 for 790109-1).
For detailed understanding check the below SQLfiddle
SQLFiddle Detailed Demo
OR use 'CHARINDEX'
SELECT invoiceid FROM Invoice
ORDER BY
CASE WHEN CHARINDEX('-', invoiceid) > 0
THEN LEFT(invoiceid, CHARINDEX('-', invoiceid)-1)
ELSE invoiceid END * 1
,CASE WHEN CHARINDEX('-', REVERSE(invoiceid)) > 0
THEN RIGHT(invoiceid, CHARINDEX('-', REVERSE(invoiceid))-1)
ELSE NULL END * 1
Ordering by each part separately is the simplest and most reliable way to go, so why look for other approaches? Take a look at this simple query.
select *
from Invoice
order by Convert(int, SUBSTRING(invoiceid, 0, CHARINDEX('-',invoiceid+'-'))) asc,
Convert(int, SUBSTRING(invoiceid, CHARINDEX('-',invoiceid)+1, LEN(invoiceid)-CHARINDEX('-',invoiceid))) asc
Plenty of good answers here, but I think this one might be the most compact order by clause that is effective:
SELECT *
FROM Invoice
ORDER BY LEFT(InvoiceId,CHARINDEX('-',InvoiceId+'-'))
,CAST(RIGHT(InvoiceId,CHARINDEX('-',REVERSE(InvoiceId)+'-'))AS INT)DESC
Demo: - SQL Fiddle
Note, I added the '790709' version to my test, since some of the methods listed here aren't treating the no-suffix version as lesser than the with-suffix versions.
If your invoiceID varies in length, before the '-' that is, then you'd need:
SELECT *
FROM Invoice
ORDER BY CAST(LEFT(InvoiceId,CHARINDEX('-',InvoiceId+'-')-1)AS INT)
,CAST(RIGHT(InvoiceId,CHARINDEX('-',REVERSE(InvoiceId)+'-'))AS INT)DESC
Demo with varying lengths before the dash: SQL Fiddle
My version:
declare @Len int
-- @Len = digit count of the longest suffix, so the scaled prefix and the suffix cannot overlap
select @Len = (select max (len (invoiceid) - charindex ( '-', invoiceid)) from MyTable)
select
invoiceid ,
cast (SUBSTRING (invoiceid ,1,charindex ( '-', invoiceid )-1) as int) * POWER (10,@Len) +
cast (right(invoiceid, len (invoiceid) - charindex ( '-', invoiceid) ) as int )
from MyTable
You can implement this as a new column to your table:
ALTER TABLE MyTable ADD invoice_numeric_id int null
GO
declare @Len int
select @Len = (select max (len (invoiceid) - charindex ( '-', invoiceid)) from MyTable)
UPDATE MyTable
SET invoice_numeric_id = cast (SUBSTRING (invoiceid ,1,charindex ( '-', invoiceid )-1) as int) * POWER (10,@Len) +
cast (right(invoiceid, len (invoiceid) - charindex ( '-', invoiceid) ) as int )
One way is to split InvoiceId into its parts, and then sort on the parts. Here I use a derived table, but it could be done with a CTE or a temporary table as well.
select InvoiceId, InvoiceId1, InvoiceId2
from
(
select
InvoiceId,
substring(InvoiceId, 0, charindex('-', InvoiceId, 0)) as InvoiceId1,
substring(InvoiceId, charindex('-', InvoiceId, 0)+1, len(InvoiceId)) as InvoiceId2
FROM Invoice
) tmp
order by
cast((case when len(InvoiceId1) > 0 then InvoiceId1 else InvoiceId2 end) as int),
cast((case when len(InvoiceId1) > 0 then InvoiceId2 else '0' end) as int)
In the above, InvoiceId1 and InvoiceId2 are the component parts of InvoiceId. The outer select includes the parts, but only for demonstration purposes - you do not need to do this in your select.
The derived table (the inner select) grabs the InvoiceId as well as the component parts. The way it works is this:
When there is a dash in InvoiceId, InvoiceId1 will contain the first part of the number and InvoiceId2 will contain the second.
When there is not a dash, InvoiceId1 will be empty and InvoiceId2 will contain the entire number.
The second case above (no dash) is not optimal because ideally InvoiceId1 would contain the number and InvoiceId2 would be empty. To make the inner select work optimally would decrease the readability of the select. I chose the non-optimal, more readable, approach since it is good enough to allow for sorting.
This is why the ORDER BY clause tests for the length - it needs to handle the two cases above.
Demo at SQL Fiddle
Break the sort into two sections:
SQL Fiddle
MS SQL Server 2008 Schema Setup:
CREATE TABLE TestData
(
data varchar(20)
)
INSERT TestData
SELECT '790711' as data
UNION
SELECT '790109-1'
UNION
SELECT '790109-11'
UNION
SELECT '790109-2'
Query 1:
SELECT *
FROM TestData
ORDER BY
FLOOR(CAST(REPLACE(data, '-', '.') AS FLOAT)),
CASE WHEN CHARINDEX('-', data) > 0
THEN CAST(RIGHT(data, len(data) - CHARINDEX('-', data)) AS INT)
ELSE 0
END
Results:
| DATA |
-------------
| 790109-1 |
| 790109-2 |
| 790109-11 |
| 790711 |
Try:
select invoiceid ... order by Convert(decimal(18, 2), REPLACE(invoiceid, '-', '.'))
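One thing worth checking with the decimal conversion is how multi-digit suffixes compare. The sketch below runs it against the four sample values from the question; the inline VALUES list stands in for the real table.

```sql
-- Sketch: see where '-11' lands relative to '-2' after the decimal conversion.
SELECT v.invoiceid,
       CONVERT(decimal(18, 2), REPLACE(v.invoiceid, '-', '.')) AS AsDecimal
FROM (VALUES ('790711'), ('790109-1'), ('790109-11'), ('790109-2')) AS v(invoiceid)
ORDER BY AsDecimal;
```

Here 790109-11 becomes 790109.11 and 790109-2 becomes 790109.20, so -11 still sorts before -2. Approaches that cast the suffix to an integer separately do not have this issue.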