Select range of items in a varchar column (sql server) - sql

How can I select a range of items from a VARCHAR column in SQL Server?
I want to do something like:
SELECT TE.DESC
FROM PRODUCT P, ETYPE TE WHERE ( P.IDTYPE = TE.IDTYPE )
AND P.NUMBER BETWEEN '619' AND '623'
The P.NUMBER column can contain numbers and letters together, like 'abc123', so SQL Server can't correctly select what I want.
Is there some way to do it?
Thanks

First, you need to create a function which will strip all non-numeric characters from your NUMBER and return a float (or int), like so:
create function dbo.RemoveAlpha(@str varchar(1000))
returns float
AS
begin
while patindex('%[^0-9]%', @str) > 0
begin
set @str = stuff(@str, patindex('%[^0-9]%', @str), 1, '')
end
return convert(float, @str)
end
Then you can rewrite your query like so:
SELECT TE.DESC
FROM PRODUCT P, ETYPE TE
WHERE ( P.IDTYPE = TE.IDTYPE )
AND dbo.RemoveAlpha(P.NUMBER) BETWEEN 619 AND 623

You can get only the numbers using this double CASE:
SELECT TE.DESC
FROM PRODUCT P, ETYPE TE WHERE ( P.IDTYPE = TE.IDTYPE )
AND 1 = CASE ISNUMERIC(P.NUMBER)
WHEN 1 THEN
CASE WHEN CAST(P.NUMBER AS INT) BETWEEN 619 AND 623 THEN 1 ELSE 0 END
ELSE 0
END
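On SQL Server 2012 or later, the ISNUMERIC guard can probably be replaced by TRY_CONVERT, which returns NULL instead of raising an error when a value is not a valid int. A minimal sketch, assuming a table variable @Product that stands in for the real PRODUCT table (the sample rows are made up for illustration):

```sql
-- Hypothetical sample data standing in for the PRODUCT table.
DECLARE @Product TABLE (NUMBER varchar(20));
INSERT INTO @Product (NUMBER)
VALUES ('619'), ('620'), ('abc123'), ('700'), ('62x');

-- TRY_CONVERT yields NULL for values that are not valid integers,
-- and NULL BETWEEN ... is never true, so those rows drop out.
SELECT NUMBER
FROM @Product
WHERE TRY_CONVERT(int, NUMBER) BETWEEN 619 AND 623;
-- returns '619' and '620'
```

Unlike dbo.RemoveAlpha, this does not extract digits from mixed values such as 'abc123'; it simply ignores them, which matches the behaviour of the double CASE.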

Related

SQL Patindex / Regex - Match where there are 4 or less characters between 2 apostrophes

I have the following string:
'Siemens','Simatic','Microbox','PC','27','6ES7677AA200PA0','6ES7','677AA200PA0'
I want to remove any "terms" that are less than 5 characters. So in this case I'd like to remove 'PC', '27' and '6ES7'.
Which would result in:
'Siemens','Simatic','Microbox','6ES7677AA200PA0','677AA200PA0'
This is in SQL server and I have a function that accepts a regex command, so far it looks like this:
SELECT dbo.fn_StripCharacters(title, '/^''PC''$/')
I tried hardcoding it to remove 'PC', but I think it's removing all apostrophes and all 'P' and 'C' characters:
Siemens,Simati,Mirobox,,427B,6ES76477AA200A0,6ES7,6477AA200A0
This is the function I'm using:
CREATE FUNCTION [dbo].[fn_StripCharacters]
(
@String NVARCHAR(MAX),
@MatchExpression VARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
SET @MatchExpression = '%['+@MatchExpression+']%'
WHILE PatIndex(@MatchExpression, @String) > 0
SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
RETURN @String
END
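If you don't care about the particular order of the words retained after filtering out words of 4 characters or fewer, you could use STRING_SPLIT and STRING_AGG (both available from SQL Server 2017). Each split value still carries its two surrounding apostrophes, which is why the filter is LEN(value) > 6: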
WITH cte AS (
SELECT id, value
FROM yourTable
CROSS APPLY STRING_SPLIT(val, ',')
)
SELECT id, STRING_AGG(value, ',') AS val
FROM cte
WHERE LEN(value) > 6
GROUP BY id;
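As for why the original call strips single characters rather than the term 'PC': PATINDEX and LIKE patterns are not regular expressions, and fn_StripCharacters wraps its argument in '%[' and ']%', so whatever is passed in is treated as a set of individual characters. A small sketch of that effect, using a shortened made-up input string:

```sql
-- fn_StripCharacters builds '%[' + expr + ']%', so '/^''PC''$/'
-- becomes a set of single characters: / ^ ' P C $
DECLARE @s nvarchar(100) = N'''Siemens'',''PC''';

-- Position of the first character that belongs to that set.
-- Here it is the opening apostrophe, so this returns 1.
SELECT PATINDEX('%[/^''PC''$/]%', @s) AS FirstMatch;
```

Because the loop in the function removes one matching character per iteration, every apostrophe, P and C disappears (lowercase c as well under a case-insensitive collation).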

Order string alpha numerically A1-1-1, A1-2-1, A1-10-1, A1-2-2, A1-2-3 etc

I have a column with strings of different lengths, in which dashes (-) separate alphanumeric parts.
The string could look like "A1-2-3".
I need to order by first "A1" then "2" then "3"
I want to achieve the following order for the column:
A1
A1-1-1
A1-1-2
A1-1-3
A1-2-1
A1-2-2
A1-2-3
A1-7
A2-1-1
A2-1-2
A2-1-3
A2-2-1
A2-2-2
A2-2-3
A2-10-1
A2-10-2
A2-10-3
A10-1-1
A10-1-2
A10-1-3
A10-2-1
A10-2-2
A10-2-3
I can separate the string with the following code:
declare @string varchar(max) = 'A1-2-3'
declare @first varchar(max) = SUBSTRING(@string,1,charindex('-',@string)-1)
declare @second varchar(max) = substring(@string, charindex('-',@string) + 1, charindex('-',reverse(@string))-1)
declare @third varchar(max) = right(@string,charindex('-',reverse(@string))-1)
select @first, @second, @third
With the above logic I thought that I could use the following:
Note this only regards strings with 2 dashes
select barcode from tabelWithBarcodes
order by
case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(SUBSTRING(barcode,1,charindex('-',barcode)-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
SUBSTRING(barcode,1,(charindex('-',barcode)-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(substring(barcode, charindex('-',barcode) + 1, charindex('-',reverse(barcode))-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
substring(barcode, charindex('-',barcode) + 1, charindex('-',reverse(barcode))-1)
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(right(barcode,charindex('-',reverse(barcode))-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
right(barcode,charindex('-',reverse(barcode))-1)
end
But the sorting is not working for the second and third section of the string.
(I haven't added the code for checking if the string has only 1 or no dash in it for simplicity)
Not sure if I'm on the right path here.
Is anybody able to solve this?
This is not pretty, however...
USE Sandbox;
GO
WITH VTE AS(
SELECT V.SomeString
--Randomised order
FROM (VALUES ('A1-1-1'),
('A10-1-3'),
('A10-2-2'),
('A1-1-3'),
('A10-2-1'),
('A2-2-2'),
('A1-2-1'),
('A1-2-2'),
('A2-1-1'),
('A10-1-2'),
('B2-1-2'),
('A1'),
('A2-2-1'),
('A2-10-3'),
('A10-2-3'),
('A2-1-2'),
('B1-4'),
('A2-10-2'),
('A2-2-3'),
('A10-1-1'),
('A1-A1-3'),
('A1-7'),
('A2-10-1'),
('A2-1-3'),
('A1-1-2'),
('A1-2-3')) V(SomeString)),
Splits AS(
SELECT V.SomeString,
DS.Item,
DS.ItemNumber,
CONVERT(int,STUFF((SELECT '' + NG.token
FROM dbo.NGrams8k(DS.item,1) NG
WHERE TRY_CONVERT(int, NG.Token) IS NOT NULL
ORDER BY NG.position
FOR XML PATH('')),1,0,'')) AS NumericPortion
FROM VTE V
CROSS APPLY dbo.DelimitedSplit8K(V.SomeString,'-') DS),
Pivoted AS(
SELECT S.SomeString,
MIN(CASE V.P1 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P1Alpha,
MIN(CASE V.P1 WHEN S.Itemnumber THEN S.NumericPortion END) AS P1Numeric,
MIN(CASE V.P2 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P2Alpha,
MIN(CASE V.P2 WHEN S.Itemnumber THEN S.NumericPortion END) AS P2Numeric,
MIN(CASE V.P3 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P3Alpha,
MIN(CASE V.P3 WHEN S.Itemnumber THEN S.NumericPortion END) AS P3Numeric
FROM Splits S
CROSS APPLY (VALUES(1,2,3)) AS V(P1,P2,P3)
GROUP BY S.SomeString)
SELECT P.SomeString
FROM Pivoted P
ORDER BY P.P1Alpha,
P.P1Numeric,
P.P2Alpha,
P.P2Numeric,
P.P3Alpha,
P.P3Numeric;
This outputs:
A1
A1-1-1
A1-1-2
A1-1-3
A1-2-1
A1-2-2
A1-2-3
A1-7
A1-A1-3
A2-1-1
A2-1-2
A2-1-3
A2-2-1
A2-2-2
A2-2-3
A2-10-1
A2-10-2
A2-10-3
A10-1-1
A10-1-2
A10-1-3
A10-2-1
A10-2-2
A10-2-3
B1-4
B2-1-2
This makes use of 2 user-defined functions. First, DelimitedSplit8K or DelimitedSplit8k_Lead (I used DelimitedSplit8k as I don't have the other on my sandbox at the moment). Then you also have NGrams8k.
I really should explain how this works, but yuck... (edit coming).
OK... (/sigh) What it does. Firstly, we split the data into its relevant parts using DelimitedSplit8K(_Lead). Then, within the SELECT, we use FOR XML PATH to get (only) the numerical part of that string (for example, for 'A10' we get '10') and we convert it to a numerical value (an int).
Then we pivot that data out into its respective parts: the alphabetical part and the numerical part. So, for the value 'A10-A1-12' we end up with the row:
'A', 10, 'A', 1, '', 12
Then, now that we've pivoted the data, we sort it by each column individually. And voila.
This will fall over if you have a value like 'A1A' or '1B1', and honestly, I'm not changing it to cater for that. This was messy, and really isn't what the RDBMS should be doing.
Up to 3 dashes can be covered by fiddling with replace & parsename & patindex:
declare @TabelWithBarcodes table (id int primary key identity(1,1), barcode varchar(20) not null, unique (barcode));
insert into @TabelWithBarcodes (barcode) values
('2-2-3'),('A2-2-2'),('A2-2-1'),('A2-10-3'),('A2-10-2'),('A2-10-1'),('A2-1-3'),('A2-1-2'),('A2-1-1'),
('A10-2-3'),('A10-2-2'),('A10-2-10'),('A10-1-3'),('AA10-A111-2'),('A10-1-1'),
('A1-7'),('A1-2-3'),('A1-2-12'),('A1-2-1'),('A1-1-3'),('B1-1-2'),('A1-1-1'),('A1'),('A10-10-1'),('A12-10-1'), ('AB1-2-E1') ;
with cte as
(
select barcode,
replace(BarCode, '-', '.')
+ replicate('.0', 3 - (len(BarCode)-len(replace(BarCode, '-', '')))) as x
from @TabelWithBarcodes
)
select *
, substring(parsename(x,4), 1, patindex('%[0-9]%',parsename(x,4))-1)
,cast(substring(parsename(x,4), patindex('%[0-9]%',parsename(x,4)), 10) as int)
,substring(parsename(x,3), 1, patindex('%[0-9]%',parsename(x,3))-1)
,cast(substring(parsename(x,3), patindex('%[0-9]%',parsename(x,3)), 10) as int)
,substring(parsename(x,2), 1, patindex('%[0-9]%',parsename(x,2))-1)
,cast(substring(parsename(x,2), patindex('%[0-9]%',parsename(x,2)), 10) as int)
,substring(parsename(x,1), 1, patindex('%[0-9]%',parsename(x,1))-1)
,cast(substring(parsename(x,1), patindex('%[0-9]%',parsename(x,1)), 10) as int)
from cte
order by
substring(parsename(x,4), 1, patindex('%[0-9]%',parsename(x,4))-1)
,cast(substring(parsename(x,4), patindex('%[0-9]%',parsename(x,4)), 10) as int)
,substring(parsename(x,3), 1, patindex('%[0-9]%',parsename(x,3))-1)
,cast(substring(parsename(x,3), patindex('%[0-9]%',parsename(x,3)), 10) as int)
,substring(parsename(x,2), 1, patindex('%[0-9]%',parsename(x,2))-1)
,cast(substring(parsename(x,2), patindex('%[0-9]%',parsename(x,2)), 10) as int)
,substring(parsename(x,1), 1, patindex('%[0-9]%',parsename(x,1))-1)
,cast(substring(parsename(x,1), patindex('%[0-9]%',parsename(x,1)), 10) as int)
How it works:
1. Extend each barcode to 4 groups by adding a trailing .0 for each missing group.
2. Split each barcode into 4 groups.
3. Split each group into leading characters and trailing digits.
4. Sort by the leading characters first,
5. then by the digits cast to a number.
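To make those steps concrete, here is a small walk-through for a single value. It is only a sketch: the variable names @barcode and @x and the sample value 'A2-10' are chosen for illustration, and only the first group is decomposed.

```sql
DECLARE @barcode varchar(20) = 'A2-10';

-- Dashes become dots, and each missing group is padded with '.0'
-- so that PARSENAME always sees exactly 4 parts.
DECLARE @x varchar(40) =
    REPLACE(@barcode, '-', '.')
    + REPLICATE('.0', 3 - (LEN(@barcode) - LEN(REPLACE(@barcode, '-', ''))));

-- PARSENAME counts parts from the right, so part 4 is the leftmost group.
SELECT @x AS padded,                                              -- A2.10.0.0
       PARSENAME(@x, 4) AS group1,                                -- A2
       SUBSTRING(PARSENAME(@x, 4), 1,
                 PATINDEX('%[0-9]%', PARSENAME(@x, 4)) - 1)
           AS group1_letters,                                     -- A
       CAST(SUBSTRING(PARSENAME(@x, 4),
                 PATINDEX('%[0-9]%', PARSENAME(@x, 4)), 10) AS int)
           AS group1_number;                                      -- 2
```

Sorting on the letters first and the integer second is what puts A2 before A10, which a plain string sort would not do.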
An alternative approach would be to use your technique to split the string into its 3 component parts, then left-pad those strings with leading zeros (or characters of your choice). That avoids any issues where a part contains alphanumerics rather than just numerics. However, it does mean that parts with alphabetic prefixes of different lengths may not sort as you expect. Here's the code to play with (using the definitions from @dnoeth's excellent answer):
;with cte as
(
select barcode
, case
when barcode like '%-%' then
substring(barcode,1,charindex('-',barcode)-1)
else
barcode
end part1
, case
when barcode like '%-%' then
substring(barcode, charindex('-',barcode) + 1, case
when barcode like '%-%-%' then
(charindex('-',barcode,charindex('-',barcode) + 1)) - 1
else
len(barcode)
end
- charindex('-',barcode))
else
''
end part2
, case
when barcode like '%-%-%' then
right(barcode,charindex('-',reverse(barcode))-1) --note: assumes you don't have %-%-%-%
else
''
end part3
from @TabelWithBarcodes
)
select barcode
, part1, part2, part3
, right('0000000000' + coalesce(part1,''), 10) lpad1
, right('0000000000' + coalesce(part2,''), 10) lpad2
, right('0000000000' + coalesce(part3,''), 10) lpad3
from cte
order by lpad1, lpad2, lpad3

Cast substring to int only for numeric values in SQL

I have this query :
SUBSTRING (
dbo.Table.RNumber,
1,
CHARINDEX(
'+',
dbo.Table.RNumber
) - 1
) AS RoomNumber,
SUBSTRING (
dbo.Table.RNumber,
CHARINDEX(
'+',
dbo.Table.RNumber
) + 1,
LEN(
dbo.Table.RNumber
)
) AS HallNumber,
My RNumber column mostly looks like 2+3 or 3+5, but sometimes it is like x+5 or y+0. I want to convert the fields to int, but I want strings like "x" or "y" to be converted to 0. I googled it but couldn't find a solution. How can I do that? Thanks.
You can use a CASE expression. Try this (edited to use ISNUMERIC()):
CASE
WHEN isnumeric(SUBSTRING(dbo.Table.RNumber,1,CHARINDEX('+',dbo.Table.RNumber) - 1)) = 1
THEN SUBSTRING(dbo.Table.RNumber,1,CHARINDEX('+',dbo.Table.RNumber) - 1)
else 0
end AS RoomNumber,
CASE
WHEN isnumeric(SUBSTRING(dbo.Table.RNumber,CHARINDEX('+',dbo.Table.RNumber) + 1,LEN(dbo.Table.RNumber))) = 1
THEN SUBSTRING(dbo.Table.RNumber,CHARINDEX('+',dbo.Table.RNumber) + 1,LEN(dbo.Table.RNumber))
else 0
end AS HallNumber,
I hope this solves your problem.
Perhaps you can use ParseName() and Try_Convert()
Declare @YourTable table (SomeField varchar(50))
Insert Into @YourTable values
('2+3'),('3+5'),('x+5'),('y+0')
Select *
,RoomNumber = IsNull(Try_Convert(int,ParseName(Replace(SomeField,'+','.'),2)),0)
,HallNumber = IsNull(Try_Convert(int,ParseName(Replace(SomeField,'+','.'),1)),0)
From @YourTable
Returns
SomeField  RoomNumber  HallNumber
2+3        2           3
3+5        3           5
x+5        0           5
y+0        0           0
For versions prior to 2012, you can do it like this:
CASE
WHEN NOT columnName like '%[^0-9]%' -- Contains no non-digits
AND columnName like '%[0-9]%' -- contains at least one digit
THEN CAST(columnName as INT) ELSE NULL
END
(Note that this will reject negative numbers, but you can easily adapt it if you need to support them)
Alternatively, using ISNUMERIC, you must first cast to float, because ISNUMERIC accepts some strings that CAST(expression AS INT) does not accept:
CASE WHEN ISNUMERIC(columnName)=1
THEN CAST(CAST(columnName as float) as int) END
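The mismatch between ISNUMERIC and an int cast can be observed directly. The sketch below uses TRY_CONVERT, so it needs SQL Server 2012 or later even though the workaround above targets older versions; the three sample strings are chosen for illustration.

```sql
SELECT v.s,
       ISNUMERIC(v.s)          AS is_numeric,
       TRY_CONVERT(int, v.s)   AS as_int,
       TRY_CONVERT(float, v.s) AS as_float
FROM (VALUES ('12'), ('1e5'), ('x')) AS v(s);
-- '12'  : ISNUMERIC = 1, converts to int 12
-- '1e5' : ISNUMERIC = 1, but the int conversion fails (NULL here); float gives 100000
-- 'x'   : ISNUMERIC = 0, both conversions give NULL
```

A value such as '1e5' is why the direct cast to int can raise an error after ISNUMERIC has returned 1, and why the intermediate cast to float helps.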

SQL Server REPLACE AND CHECK IF EXISTS

I have to check the string with the following scenarios in WHERE condition.
The data ProductId stored in the database can be like
7314-3337 sometimes with - symbol and not prefixed with 19
73143337 sometimes without symbol and not prefixed with 19
1973143337 correct format
197314-3337 sometimes with - symbol
I need to filter records by ProductId, and the input is in the correct format, i.e. 1973143337:
WHERE P.ProductId=@ProductId
How can I filter if the data is stored in the other 3 formats?
How can I strip the - and prefix 19 if it is missing, in SQL Server?
Please check these 2 approaches.
The first is very simple and the second uses a small trick. (I think you should go with the second option, which covers everything.)
declare @t table (ProductId varchar(100))
insert into @t
values
('7314-3337')
,('73143337')
,('1973143337')
,('197314-3337')
,('73683337')
,('73143338')
declare @valuetosearch varchar(100) = '1973143337'
--this is very simple, but does not work in every scenario. the second approach is fine.
--select CHARINDEX ( '19','1973143337'), SUBSTRING('1973143337',3,len('1973143337'))
--select * from
--@t
--where
--replace(REPLACE(ProductId ,'-','') ,'19','') = replace(REPLACE(@valuetosearch ,'-','') ,'19','')
select * from
@t
where
REPLACE( case when CHARINDEX ( '19',ProductId) = 1
then SUBSTRING( ProductId ,3,LEN(ProductId))
else ProductId
end ,'-','')
=
REPLACE ( case when CHARINDEX ( '19',@valuetosearch) = 1
then SUBSTRING( @valuetosearch ,3,LEN(@valuetosearch))
else @valuetosearch
end ,'-','')
You should first sanitize your data, if it is not consistent then you won't be able to get the correct results.
For prefixing with 19:
UPDATE foo
SET ProductId = '19' + ProductId
WHERE Left(ProductID, 2) <> '19'
For removing the '-':
UPDATE foo
SET ProductId = REPLACE(ProductId, '-', '')
Then you should be able to get the results you want.
UPDATE:
You could construct a CTE with the results in a single format, and then, filter that CTE:
WITH cte (
FormattedPID
,ProductId
)
AS (
SELECT CASE
WHEN LEFT(ProductId, 2) = '19'
THEN REPLACE(ProductId, '-', '')
ELSE '19' + REPLACE(ProductId, '-', '')
END
,ProductId
FROM foo
)
SELECT FormattedPID
,ProductId
FROM cte
WHERE FormattedPID = @ProductID
You could make sure the column is in the correct format like this:
Remove the - by replacing it with an empty string (197314-3337 -> 1973143337, 7314-3337 -> 73143337).
Add 19 at the beginning (1973143337 -> 191973143337, 73143337 -> 1973143337).
Take the 10 rightmost characters of the result and compare to the input (191973143337 -> 1973143337, 1973143337 -> 1973143337).
In Transact-SQL:
WHERE RIGHT('19' + REPLACE(P.ProductId, '-', ''), 10) = @ProductId
Of course, this means no index seek for you, because we are applying functions to the column.
An alternative to that would be to produce the three non-standard formats out of the input:
cut off the initial 19 (1973143337 -> 73143337);
insert the - (1973143337 -> 197314-3337);
insert the - and cut off the 19 (1973143337 -> 197314-3337 -> 7314-3337).
In Transact-SQL:
WHERE P.ProductId IN (
@ProductId,
SUBSTRING(@ProductId, 3, 999999999),
STUFF(@ProductId, 7, 0, '-'),
SUBSTRING(STUFF(@ProductId, 7, 0, '-'), 3, 999999999)
)
This way if there is an index on P.ProductId, it will be used efficiently.
Both approaches assume that the length of the correct format is fixed.
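Both predicates can be checked side by side on a few sample rows. This is only a sketch: the table variable @t and its rows are made up, with the last row added as a value that should not match.

```sql
DECLARE @ProductId varchar(20) = '1973143337';

-- Hypothetical sample data covering the four stored formats plus one non-match.
DECLARE @t TABLE (ProductId varchar(20));
INSERT INTO @t (ProductId)
VALUES ('7314-3337'), ('73143337'), ('1973143337'), ('197314-3337'), ('73683337');

-- Approach 1: normalise the column (functions on the column, so no index seek).
SELECT ProductId
FROM @t
WHERE RIGHT('19' + REPLACE(ProductId, '-', ''), 10) = @ProductId;

-- Approach 2: expand the input into all four stored formats (index-friendly).
SELECT ProductId
FROM @t
WHERE ProductId IN (
    @ProductId,
    SUBSTRING(@ProductId, 3, 999999999),
    STUFF(@ProductId, 7, 0, '-'),
    SUBSTRING(STUFF(@ProductId, 7, 0, '-'), 3, 999999999)
);
```

Each query should return the first four rows and skip '73683337'.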

SQL Server - sum comma separated value from a column

There is a column in database which contains comma separated values like: 0.00,12.45,14.33 and so on.
I need to sum this inside a stored procedure. One way which I can think of is to split and convert it into a table using a function and then sum it.
Any other ideas?
Using Sql Server 2005+ CTE you can create a recursive select, something like
DECLARE @Table TABLE(
ID INT,
Vals VARCHAR(100)
)
INSERT INTO @Table SELECT 1, '0.00,12.45,14.33'
INSERT INTO @Table SELECT 2, '1,2,3,4'
;WITH ValList AS(
SELECT ID,
CAST(LEFT(Vals,PATINDEX('%,%', Vals) - 1) AS FLOAT) Val,
RIGHT(Vals,LEN(Vals) - PATINDEX('%,%', Vals)) Remainder
FROM @Table
UNION ALL
SELECT ID,
CAST(LEFT(Remainder,CASE WHEN PATINDEX('%,%', Remainder) = 0 THEN LEN(Remainder) ELSE PATINDEX('%,%', Remainder) - 1 END) AS FLOAT) Val,
RIGHT(Remainder,CASE WHEN PATINDEX('%,%', Remainder) = 0 THEN 0 ELSE LEN(Remainder) - PATINDEX('%,%', Remainder) END) Remainder
FROM ValList
WHERE LEN(Remainder) > 0
)
SELECT ID,
SUM(Val) AS Total
FROM ValList
GROUP BY ID
OUTPUT
ID          Total
----------- ----------------------
1           26.78
2           10
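On SQL Server 2016 or later (database compatibility level 130 and up), the built-in STRING_SPLIT function may be a simpler alternative to the recursive CTE. A minimal sketch using the same sample data; the decimal(18, 2) type is an assumption about the precision of the stored values:

```sql
DECLARE @Table TABLE (ID int, Vals varchar(100));
INSERT INTO @Table (ID, Vals)
VALUES (1, '0.00,12.45,14.33'), (2, '1,2,3,4');

-- STRING_SPLIT returns one row per comma-separated item in a column named value.
SELECT t.ID,
       SUM(CAST(s.value AS decimal(18, 2))) AS Total
FROM @Table AS t
CROSS APPLY STRING_SPLIT(t.Vals, ',') AS s
GROUP BY t.ID;
-- ID 1 -> 26.78, ID 2 -> 10.00
```

A non-numeric item in the list would make the CAST fail; TRY_CONVERT could be swapped in if such rows should be ignored.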
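Within a stored procedure you could try something like this (SQL Server does not allow EXEC of dynamic SQL inside a user-defined function, so a function won't do). Totally unsure if it will work tho!
CREATE PROCEDURE dbo.usp_sum_csv
@string varchar(100),
@result float OUTPUT
AS BEGIN
-- Turn '1,2,3' into 'SELECT @r = 1+2+3' and run it
DECLARE @sql nvarchar(400) = N'SELECT @r = ' + REPLACE(@string, ',', '+')
EXEC sp_executesql @sql, N'@r float OUTPUT', @r = @result OUTPUT
END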
Can't try it out on this comp.