SQL query to find the value matching the most starting characters

Simplified scenario: I have a table with the following fields/values:
ID value
1 '12345'
2 '1234'
3 '123'
4 '12'
5 '1'
I want to find the record whose value matches the most starting characters of A='1230'; it should correspond to ID=3.
The only implementation I have in mind now is basic: use a loop to iterate through the substrings of A and do the comparison.
Is there a better way to solve this?
Will appreciate your help

Try this:
Declare @valueToSearch int
Set @valueToSearch = 1230
;WITH cte
AS
(
SELECT ID, RANK() OVER(ORDER BY ABS(value - @valueToSearch)) AS num FROM Sample
)
SELECT ID FROM cte
WHERE num = (SELECT MIN(num) FROM cte)
This will give 2 as the result, because it picks the numerically closest value ('1234') rather than the longest matching prefix.

declare @q varchar(5)
select @q = '1230'
select top 1 number, substring(source.value, 1, number)
from master.dbo.spt_values, source
where type='p'
and number<=len(source.value)
and substring(source.value, 1, number) = substring(@q, 1, number)
order by number desc
Or using LIKE:
select top 1 * from #source s
where @q like value +'%'
order by len(value) desc
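To make the LIKE idea above easier to follow, here is a minimal Python sketch of the same logic: keep only the rows whose value is a prefix of the search string, then take the longest one. The function name and the in-memory row list are just assumptions for illustration, not part of the original answer.

```python
def longest_prefix_match(rows, target):
    """Return the (id, value) row whose value is the longest prefix of target, or None."""
    # Same filter as: WHERE @q LIKE value + '%'
    candidates = [(row_id, value) for row_id, value in rows if target.startswith(value)]
    # Same ordering as: ORDER BY LEN(value) DESC, taking the top row
    return max(candidates, key=lambda row: len(row[1]), default=None)


rows = [(1, "12345"), (2, "1234"), (3, "123"), (4, "12"), (5, "1")]
print(longest_prefix_match(rows, "1230"))  # (3, '123')
```

For the sample data this picks ID=3, which is what the question asks for.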

Related

How to split single cell into multiple columns in sql server 2008R2?

I want to split each name for individual columns
create table split_test(value integer,Allnames varchar(40))
insert into split_test values(1,'Vinoth,Kumar,Raja,Manoj,Jamal,Bala');
select * from split_test;
Value Allnames
-------------------
1 Vinoth,Kumar,Raja,Manoj,Jamal,Bala
Expected output
values N1 N2 N3 N4 N5 N6 N7.......N20
1 Vinoth Kumar Raja Manoj Jamal Bala
Using this example you can get an idea:
declare @str varchar(max)
set @str = 'Hello world'
declare @separator varchar(max)
set @separator = ' '
declare @Splited table(id int identity(1,1), item varchar(max))
set @str = REPLACE(@str,@separator,'''),(''')
set @str = 'select * from (values('''+@str+''')) as V(A)'
insert into @Splited
exec(@str)
select * from @Splited
Here is an sql statement using recursive CTE to split names into rows, then pivot rows into columns.
with names as
(select
value,
1 as name_id,
substring(Allnames,1,charindex(',',Allnames+',', 0)-1) as name,
substring(Allnames,charindex(',',Allnames, 0)+1, 40) as left_names
from split_test
union all
select
value,
name_id +1,
case when charindex(',',left_names, 0)> 0 then
substring(left_names,1,charindex(',',left_names, 0)-1)
else left_names end as name,
case when charindex(',',left_names, 0)> 0 then
substring(left_names,charindex(',',left_names, 0)+1, 40)
else '' end as left_names
from names
where ltrim(left_names)<>'')
select value,
[1],[2],[3],[4],[5],[6],[7],[8],[9]
from (select value,name_id,name from names) as t1
PIVOT (MAX(name) FOR name_id IN ( [1],[2],[3],[4],[5],[6],[7],[8],[9] ) ) AS t2
UPDATE
@KM.'s answer might be a better way to split data into rows without a recursive CTE. It should be more efficient than this one. So I followed that example and simplified the null-handling logic. Here is the result:
Step 1:
Create a table that includes all numbers from 1 to a number greater than the maximum length of the Allnames column.
CREATE TABLE Numbers( Number int not null primary key);
with n as
(select 1 as num
union all
select num +1
from n
where num<100)
insert into numbers
select num from n;
Step 2:
Join the split_test data (with a comma added at both ends) to the numbers table to find every position where a comma starts a part.
Then take the text between that comma and the next one for every such position. If there are null values, add them with a union.
select value ,
ltrim(rtrim(substring(allnames,number+1,charindex(',',substring(allnames,number,40),2)-2))) as name
from
(select value, ','+allnames+',' as allnames
from split_test) as t1
left join numbers
on number<= len(allnames)
where substring(allnames,number,1)=','
and substring(allnames,number,40)<>','
union
select value, Allnames
from split_test
where Allnames is null
Step 3: Pivot names from rows to columns like my first attempt above, omitted here.
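As a rough illustration of steps 2 and 3, here is a small Python sketch of the same idea: wrap the string in commas, find every comma position except the last, cut out the text up to the next comma, then pad the names into a fixed number of columns. The function names and the default of 9 columns are assumptions for the example only.

```python
def split_by_comma_positions(allnames, sep=","):
    """Split a delimited string by locating separator positions, like the numbers-table join."""
    padded = sep + allnames + sep
    # Every separator position except the trailing one starts a part
    starts = [i for i, ch in enumerate(padded) if ch == sep][:-1]
    names = []
    for start in starts:
        end = padded.index(sep, start + 1)  # position of the next separator
        names.append(padded[start + 1:end].strip())
    return names


def pivot_names(value, names, width=9):
    """Place names into a fixed number of columns, padding with None (NULL)."""
    return (value, *(names + [None] * width)[:width])


names = split_by_comma_positions("Vinoth,Kumar,Raja,Manoj,Jamal,Bala")
print(pivot_names(1, names))
# (1, 'Vinoth', 'Kumar', 'Raja', 'Manoj', 'Jamal', 'Bala', None, None, None)
```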

Truncating leading zeros in sql server

I need to represent the following records
DATA
000200AA
00000200AA
000020BCD
00000020BCD
000020ABC
AS
DATA CNT
200AA 1
20BCD 2
20ABC 2
Any ideas?
Use PATINDEX:
select count(test) as cnt,
substring(test, patindex('%[^0]%',test),len(test)) from (
select ('000200AA') as test
union
select '00000200AA' as test
union
select ('000020BCD') as test
union
select ('00000020BCD') as test
union
select ('000020ABC') as test
)ty
group by substring(test, patindex('%[^0]%',test),len(test))
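For comparison, the strip-then-group logic can be sketched in Python as below. This is only an illustration of the technique, not T-SQL; the variable names are made up. With the five sample rows exactly as listed in the question, the grouped counts come out as shown in the comment.

```python
from collections import Counter

data = ["000200AA", "00000200AA", "000020BCD", "00000020BCD", "000020ABC"]


def strip_leading_zeros(text):
    """Drop every '0' at the start of the string, keeping the rest unchanged."""
    return text.lstrip("0")


# Group by the stripped value and count rows per group
counts = Counter(strip_leading_zeros(item) for item in data)
print(dict(counts))  # {'200AA': 2, '20BCD': 2, '20ABC': 1}
```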
How about a nice recursive user-defined function?
CREATE FUNCTION dbo.StripLeadingZeros (
@input varchar(MAX)
) RETURNS varchar(MAX)
BEGIN
-- Note: SQL Server limits nested function calls to 32 levels, so this fails on inputs with more leading zeros than that
IF LEN(@input) = 0
RETURN @input
IF SUBSTRING(@input, 1, 1) = '0'
RETURN dbo.StripLeadingZeros(SUBSTRING(@input, 2, LEN(@input) - 1))
RETURN @input
END
GO
Then:
SELECT dbo.StripLeadingZeros(DATA) DATA, COUNT(DATA) CNT
FROM YourTable GROUP BY dbo.StripLeadingZeros(DATA)
DECLARE @String VARCHAR(32) = N'000200AA'
SELECT SUBSTRING ( @String ,CHARINDEX(N'2', @String),LEN(@String))
Depending on what you need in order to get the values, this code may differ.
Assuming a simple right 5 chars as Barry suggested, you can use RIGHT(data, 5) with GROUP BY and COUNT to get your results:
http://sqlfiddle.com/#!3/19ecd/2
Take a look at the STUFF function.
It inserts data into a string over a given range.
You can do this query:
SELECT RIGHT([DATA], LEN([DATA]) - PATINDEX('%[1-9]%',[DATA]) + 1) [DATA], COUNT(*) CNT
FROM YourTable
GROUP BY RIGHT([DATA], LEN([DATA]) - PATINDEX('%[1-9]%',[DATA]) + 1)

T-Sql count string sequences over multiple rows

How can I find subsets of data over multiple rows in sql?
I want to count the number of occurrences of a string (or number) before another string is found and then count the number of times this string occurs before another one is found.
All these strings can be in random order.
This is what I want to achieve:
I have one table with one column (columnx) with data like this:
A
A
B
C
A
B
B
The result I want from the query should be like this:
2 A
1 B
1 C
1 A
2 B
Is this even possible in sql or would it be easier just to write a little C# app to do this?
Since, as per your comment, you can add a column that will unambiguously define the order in which the columnx values go, you can try the following query (provided the SQL product you are using supports CTEs and ranking functions):
WITH marked AS (
SELECT
columnx,
sortcolumn,
grp = ROW_NUMBER() OVER ( ORDER BY sortcolumn)
- ROW_NUMBER() OVER (PARTITION BY columnx ORDER BY sortcolumn)
FROM data
)
SELECT
columnx,
COUNT(*)
FROM marked
GROUP BY
columnx,
grp
ORDER BY
MIN(sortcolumn)
;
You can see the method at work on SQL Fiddle.
If sortcolumn is an auto-increment integer column that is guaranteed to have no gaps, you can replace the first ROW_NUMBER() expression with just sortcolumn. But, I guess, that cannot be guaranteed in general. Besides, you might indeed want to sort on a timestamp instead of an integer.
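To see why the difference of the two ROW_NUMBER() values identifies each run, here is a minimal Python sketch of the same arithmetic. It assumes the input list is already in sortcolumn order; the function and variable names are only for illustration.

```python
from collections import defaultdict


def count_runs(values):
    """Count consecutive runs using (overall row number) - (row number within the same value)."""
    seen_per_value = defaultdict(int)
    groups = {}  # (value, grp) -> [position of first row, row count]
    for overall_rn, value in enumerate(values, start=1):
        seen_per_value[value] += 1
        # The difference stays constant while the same value repeats without interruption
        grp = overall_rn - seen_per_value[value]
        entry = groups.setdefault((value, grp), [overall_rn, 0])
        entry[1] += 1
    # Equivalent of ORDER BY MIN(sortcolumn)
    ordered = sorted(groups.items(), key=lambda item: item[1][0])
    return [(value, count) for (value, _grp), (_first, count) in ordered]


print(count_runs(["A", "A", "B", "C", "A", "B", "B"]))
# [('A', 2), ('B', 1), ('C', 1), ('A', 1), ('B', 2)]
```

Grouping on the value together with grp is what keeps the second run of A separate from the first.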
I don't think you can do it with a single select.
You can use a cursor:
create table my_Strings
(
my_string varchar(50)
)
insert into my_strings values('A'),('A'),('B'),('C'),('A'),('B'),('B') -- this multi-row VALUES syntax requires SQL Server 2008 or later
--select my_String from my_strings
declare @temp_result table(
string varchar(50),
nr int)
declare @myString varchar(50)
declare @myLastString varchar(50)
declare @nr int
set @myLastString='A' --set this with the value of your FIRST string on the table
set @nr=0
DECLARE string_cursor CURSOR
FOR
SELECT my_string as aux_column FROM my_strings
OPEN string_cursor
FETCH NEXT FROM string_cursor into @myString
WHILE @@FETCH_STATUS = 0 BEGIN
if (@myString = @myLastString) begin
set @nr=@nr+1
set @myLastString=@myString
end else begin
insert into @temp_result values (@myLastString, @nr)
set @myLastString=@myString
set @nr=1
end
FETCH NEXT FROM string_cursor into @myString
END
insert into @temp_result values (@myLastString, @nr)
CLOSE string_cursor;
DEALLOCATE string_cursor;
select * from @temp_result
Result:
A 2
B 1
C 1
A 1
B 2
Try this:
;with sample as (
select 'A' as columnx
union all
select 'A'
union all
select 'B'
union all
select 'C'
union all
select 'A'
union all
select 'B'
union all
select 'B'
), data
as (
select columnx,
Row_Number() over(order by (select 0)) id
from sample
) , CTE as (
select * ,
Row_Number() over(order by (select 0)) rno from data
) , result as (
SELECT d.*
, ( SELECT MAX(ID)
FROM CTE c
WHERE NOT EXISTS (SELECT * FROM CTE
WHERE rno = c.rno-1 and columnx = c.columnx)
AND c.ID <= d.ID) AS g
FROM data d
)
SELECT columnx,
COUNT(1) cnt
FROM result
GROUP BY columnx,
g
Result :
columnx cnt
A 2
B 1
C 1
A 1
B 2

Get N th row value in sql server

DECLARE @ActionNumber varchar(20)='EHPL-DES-SQ-1021'
set @ActionNumber=(select top 1 * from dbo.ANOSplit(@ActionNumber,'-')
order by ROW_NUMBER() OVER (ORDER BY items))
select @ActionNumber
From the above query I need to return the 2nd and 3rd items of the initial @ActionNumber
'EHPL-DES-SQ-1021' after Split().
The format of the ActionNumber is exactly as above, but DES, SQ and 1021 can change,
so I cannot use ORDER BY items ASC or ORDER BY items DESC because it will order alphabetically.
The above query returns 'EHPL'. How can I get DES and SQ?
You can do it with the ANOSplit function, but I would insert the result into a temp table or table variable.
As you said yourself, you can't just ORDER BY the values returned by the ANOSplit function because it will order alphabetically.
So you can use a temp table or table variable with an IDENTITY column, and use this for sorting:
DECLARE @ActionNumber varchar(20)='EHPL-DES-SQ-1021'
declare @tmp table
(
id int identity(1,1),
item varchar(20)
)
insert into @tmp (item)
select * from dbo.ANOSplit(@ActionNumber,'-')
select * from @tmp where id in (2,3)
The items will normally be inserted into the table in the order returned by the function, so after inserting, the lines with id 2 and 3 are the ones you want. Note that SQL Server does not formally guarantee this order without an ORDER BY on the SELECT.
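The idea of numbering the split items by position rather than sorting them can be sketched in Python like this. It is only an illustration: it uses Python's own split instead of dbo.ANOSplit, and the function name is made up.

```python
def items_at_positions(action_number, positions, sep="-"):
    """Number the split items 1, 2, 3, ... in their original order and pick the requested ones."""
    # enumerate(start=1) plays the role of the IDENTITY(1,1) column
    numbered = dict(enumerate(action_number.split(sep), start=1))
    return [numbered[position] for position in positions]


print(items_at_positions("EHPL-DES-SQ-1021", [2, 3]))  # ['DES', 'SQ']
```

Because the numbering comes from position and not from the item text, it keeps working when DES, SQ and 1021 change.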
Try using SUBSTRING with CHARINDEX:
DECLARE @ActionNumber varchar(20)='EHPL-DES-SQ-1021'
select SUBSTRING (@ActionNumber,CHARINDEX ('-',@ActionNumber,0) + 1, 3)
This isn't tested, but I think it will work:
DECLARE @ActionNumber varchar(20)='EHPL-DES-SQ-1021'
;WITH nCTE AS
(
SELECT
items,
ROW_NUMBER() OVER (ORDER BY items) AS RNum -- note: this numbers the items alphabetically
FROM dbo.ANOSplit(@ActionNumber,'-')
)
SELECT * FROM nCTE WHERE RNum = 2 --put n here

In SQL can a sequenced range selection be done more efficiently than my algorithm (see code) that uses a cursor?

I need to collapse multiple ranges of sequential numbers (1 or more) to sets of their minimum and maximum values. I have unique integers (no duplicates) stored in a table column.
The obvious way (to me) to solve this problem is to use a cursor (see my algorithm below) and iterate through every integer. However, it seems inefficient to me so I am wondering if there is a more efficient algorithm. Perhaps there is a way using common table expressions with recursion. I have more than 32767 integers though, so any solution will need to use option (MAXRECURSION 0) which sets unlimited recursion.
Following is a simplified test case for my existing algorithm using a cursor. It will output the minimum and maximum for each range of sequential numbers (e.g. 1-3, 9-11, 13-13, 15-16).
I am using MS SQL Server 2008. Please note comments begin with two dashes (--).
declare @minInt int, @maxInt int
declare @nextInt int, @prevInt int
--need a temporary table to store the ranges that were found
declare @rangeTable table (minInt int, maxInt int)
declare mycursor cursor for
select * from
(
select 1 as id union
select 2 as id union
select 3 as id union
select 9 as id union
select 10 as id union
select 11 as id union
select 13 as id union
select 15 as id union
select 16 as id
) tblRanges
order by id--order is needed for this algorithm if used with generic data
open mycursor
--initialise new sequence
fetch next from mycursor into @minInt
select @maxInt = @minInt--set the min and max to the smallest value
select @prevInt = @minInt--store the last int
declare @sequenceFound int
while @@FETCH_STATUS=0
begin
select @sequenceFound=1--set the default flag value to true
--loop while sequence found
while @@FETCH_STATUS=0 and @sequenceFound = 1
begin
fetch next from mycursor into @nextInt
if @nextInt = (@prevInt + 1)
begin
select @sequenceFound = 1
end
else
begin
select @sequenceFound = 0
end
select @prevInt = @nextInt--store the current value as the previous value for the next comparison
if @sequenceFound = 1 --if the nextInt is part of a sequence, then store the new maxInt
and @maxInt < @nextInt--should always be true for ordered output containing no duplicates
begin
select @maxInt = @nextInt
end
end--while sequenceFound
--store the sequence range and then check for more sequences
insert into @rangeTable (minInt,maxInt) values (@minInt,@maxInt)
--store the current value as the new minInt and maxInt for the next sequence iteration
select @minInt = @nextInt
select @maxInt = @nextInt
end--while more table rows found
select * from @rangeTable
close mycursor
deallocate mycursor
Courtesy of Itzik Ben-Gan:
WITH tblRanges AS
(
SELECT 1 AS ID UNION
SELECT 2 AS ID UNION
SELECT 3 AS ID UNION
SELECT 9 AS ID UNION
SELECT 10 AS ID UNION
SELECT 11 AS ID UNION
SELECT 13 AS ID UNION
SELECT 15 AS ID UNION
SELECT 16 AS ID
),
StartingPoints AS
(
SELECT ID, ROW_NUMBER() OVER(ORDER BY ID) AS rownum
FROM tblRanges AS A
WHERE NOT EXISTS
(SELECT *
FROM tblRanges AS B
WHERE B.ID = A.ID - 1)
),
EndingPoints AS
(
SELECT ID, ROW_NUMBER() OVER(ORDER BY ID) AS rownum
FROM tblRanges AS A
WHERE NOT EXISTS
(SELECT *
FROM tblRanges AS B
WHERE B.ID = A.ID + 1)
)
SELECT S.ID AS start_range, E.ID AS end_range
FROM StartingPoints AS S
JOIN EndingPoints AS E
ON E.rownum = S.rownum;
You can read a full explanation in his chapter of SQL Server MVP Deep Dives, called Gaps and Islands. He explains various techniques (including cursors) and compares them in terms of performance.
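The starting-points/ending-points query above can be mirrored in a few lines of Python, which may help in checking the logic. This is a sketch only, assuming unique integers as stated in the question; the function name is made up.

```python
def collapse_ranges(ids):
    """Collapse unique integers into (start, end) pairs of consecutive runs."""
    present = set(ids)
    # A starting point has no predecessor; an ending point has no successor
    starts = sorted(i for i in present if i - 1 not in present)
    ends = sorted(i for i in present if i + 1 not in present)
    # Pairing the n-th start with the n-th end is the join on rownum
    return list(zip(starts, ends))


print(collapse_ranges([1, 2, 3, 9, 10, 11, 13, 15, 16]))
# [(1, 3), (9, 11), (13, 13), (15, 16)]
```

Every run has exactly one start and one end, so the two sorted lists always have the same length and line up pairwise.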