SQL: Extract last 5 digits in a string after special char

I am struggling to extract the last 5 digits in title (a free-text field) after the special characters ': ' (a colon followed by a space). Sample records are as follows:
title column
1 ABC Requirement1 - 1,500 - 3,000 sq m : 12345
2 10,000 sft shed requirement
3 OFFICES REQUIRED 500/700 SQ FT : 56789
4 Land Acquisition : 34567
5 Storage Requirement : 12345
6 Land Requirement :100 sq.m
my result set should be as follows:
ID
1 12345
3 56789
4 34567
5 12345
It should only pick up the last 5 digits (the ID) after the special characters ': ' and ignore other records that have ': ' in the middle. I am trying to extract the ID values to join with another table. Any help is highly appreciated!

This query should return the result you want.
SELECT LEFT(SUBSTRING(Title, CHARINDEX(': ', Title) + 2, LEN(Title)), 5)
FROM #table
WHERE [Title] LIKE '%: %'
AND ISNUMERIC(LEFT(SUBSTRING(Title, CHARINDEX(': ', Title) + 2, LEN(Title)), 5)) = 1

Try this query --
;WITH CTE
AS (
SELECT Id
,CASE
WHEN CHARINDEX(':', Title, 1) > 1
THEN SUBSTRING(Title, CHARINDEX(':', Title, 1) + 2, 5)
END AS TitleID
FROM RequirementTable
)
SELECT ID
,TitleID
FROM CTE
WHERE ISNUMERIC(TitleID) = 1;

First, you should seriously reconsider the way you're storing your data if you need to go to these lengths to form a relation between records. This is potentially disastrous should your data ever include ': ' naturally and without ending in a foreign key value. And you most likely won't figure that out until it's too late and processing and/or other applications fail as a result.
However, to answer the question as it was asked, I have the same thing as @ChesterLin, but with sample data and including the 'ID' column in the output.
DECLARE @Temp TABLE (ID int, Title varchar(255))
INSERT INTO @Temp
VALUES
(1, 'ABC Requirement1 - 1,500 - 3,000 sq m : 12345'),
(2, '10,000 sft shed requirement'),
(3, 'OFFICES REQUIRED 500/700 SQ FT : 56789'),
(4, 'Land Acquisition : 34567'),
(5, 'Storage Requirement : 12345'),
(6, 'Land Requirement :100 sq.m')
SELECT ID, LEFT(SUBSTRING(Title, CHARINDEX(': ', Title) + 2, LEN(Title)), 5) AS [Extracted Value]
FROM @Temp
WHERE [Title] LIKE '%: %'
AND ISNUMERIC(LEFT(SUBSTRING(Title, CHARINDEX(': ', Title) + 2, LEN(Title)), 5)) = 1
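As a sketch of a stricter filter (not taken from the answers above): ISNUMERIC also returns 1 for strings such as '$' or '1e5', so a LIKE pattern that anchors five digits at the very end of the title may be safer. The table variable @Titles and the alias ExtractedID are names made up for this example, and it assumes the titles carry no trailing whitespace.

```sql
-- Sample data from the question (table variable name is hypothetical)
DECLARE @Titles TABLE (ID int, Title varchar(255));
INSERT INTO @Titles (ID, Title)
VALUES
(1, 'ABC Requirement1 - 1,500 - 3,000 sq m : 12345'),
(2, '10,000 sft shed requirement'),
(3, 'OFFICES REQUIRED 500/700 SQ FT : 56789'),
(4, 'Land Acquisition : 34567'),
(5, 'Storage Requirement : 12345'),
(6, 'Land Requirement :100 sq.m');

-- Keep only titles that END in ': ' followed by exactly five digits,
-- then take those five characters from the right-hand side.
SELECT ID, RIGHT(Title, 5) AS ExtractedID
FROM @Titles
WHERE Title LIKE '%: [0-9][0-9][0-9][0-9][0-9]';
-- Returns IDs 1, 3, 4 and 5 with 12345, 56789, 34567 and 12345
```

Because the pattern has no trailing wildcard, a title with ': ' somewhere in the middle but no five-digit suffix is skipped.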

You can get the last 5 characters with
SUBSTR(column, LENGTH(column) - 4, 5)
or, in SQL Server,
SELECT RIGHT('ABC Requirement1 - 1,500 - 3,000 sq m : 12345',5)
or, as a full query,
SELECT substr(title, length(title) - 4) from table_name;

substr(column, -5, 5)
With a negative start position (supported by Oracle and MySQL, not by SQL Server), this starts five characters from the end of the string and returns five characters.
Then cast it as INT.
select cast(substr(column, -5, 5) as INT) as ID from table_name
where isnumeric(substr(column, -5, 5)) = 1
I hope this will work. Or, something like this.

Related

STUFF doesn't work well with NULL Values and Grouping

I have table with below schema and data.
CREATE TABLE [dbo].[SearchTest]
(
[DocumentNumber] [int] NOT NULL,
[AlphaNumeric] [nvarchar](50) NULL,
[Integers] [int] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[SearchTest] ([DocumentNumber], [AlphaNumeric], [Integers])
VALUES (1, N'abc', 1)
INSERT [dbo].[SearchTest] ([DocumentNumber], [AlphaNumeric], [Integers])
VALUES (2, N'abc', 1)
INSERT [dbo].[SearchTest] ([DocumentNumber], [AlphaNumeric], [Integers])
VALUES (3, N'bcd', 2)
INSERT [dbo].[SearchTest] ([DocumentNumber], [AlphaNumeric], [Integers])
VALUES (4, N'bcd', 2)
GO
Table data:
I would like to do grouping using Alphanumeric and Integers column and get the DocumentNumber as comma separated value in my final result.
My final result should look like this,
Here is my query that gives the above output,
SELECT *
FROM
(SELECT
STUFF((SELECT ', ' + CAST(DocumentNumber AS VARCHAR(10)) [text()]
FROM SearchTest
WHERE AlphaNumeric = Result.Alphanumeric
OR Integers = Result.Integers
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, ' ') DocumentNumbers,
COUNT(DocumentNumber) TotalDocuments,
Result.AlphaNumeric,
Result.Integers
FROM
(SELECT *
FROM SearchTest
WHERE AlphaNumeric LIKE '%b%' OR Integers = 1) AS Result
GROUP BY
Result.AlphaNumeric, Result.Integers) AS Final
However the above query breaks if I have null values in Integers column.
For example, if I have NULL value in my integer columns as shown here:
Now my query breaks and I get the wrong results in my stuff query as shown below
Grouping works fine in the above query, but the STUFF part that builds DocumentNumbers gives the wrong result. In this case it has to be 2 in the first row and 1 in the second row.
Here is the expected result:
| DocumentNumbers| TotalDocuments| AlphaNumeric | Integers |
+----------------+---------------+---------------+---------------+
| 2 | 1 | abc | NULL |
| 1 | 1 | abc | 1 |
| 3, 4 | 2 | bcd | 2 |
Please assist on where I'm going wrong
You need to change the WHERE clause of the inner query to a) use AND instead of OR and b) to check for NULLs too.
SELECT stuff((SELECT concat(', ', documentnumber)
FROM searchtest st2
WHERE (st2.alphanumeric = st1.alphanumeric
OR st2.alphanumeric IS NULL
AND st1.alphanumeric IS NULL)
AND (st2.integers = st1.integers
OR st2.integers IS NULL
AND st1.integers IS NULL)
FOR XML PATH('')),
1,
2,
'') documentnumbers,
count(*) totaldocuments,
alphanumeric,
integers
FROM searchtest st1
WHERE st1.alphanumeric LIKE '%b%'
OR st1.integers = 1
GROUP BY st1.alphanumeric,
st1.integers;
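The extra IS NULL checks are needed because an = comparison involving NULL never evaluates to true. A minimal sketch of that behaviour, using two made-up variables @a and @b and assuming the default ANSI_NULLS ON setting (the inline DECLARE initialisation needs SQL Server 2008 or later); the INTERSECT form shown next to it is one NULL-safe alternative to spelling out the IS NULL conditions:

```sql
-- Two NULL values standing in for the Integers column of two rows
DECLARE @a int = NULL, @b int = NULL;

SELECT
    -- Plain equality: NULL = NULL is UNKNOWN, so the ELSE branch is taken
    CASE WHEN @a = @b THEN 'match' ELSE 'no match' END AS PlainEquals,
    -- INTERSECT treats two NULLs as the same value, so a row survives
    CASE WHEN EXISTS (SELECT @a INTERSECT SELECT @b)
         THEN 'match' ELSE 'no match' END AS NullSafeEquals;
-- PlainEquals = 'no match', NullSafeEquals = 'match'
```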
Following @GordonLinoff's comments on the question: this can be achieved easily using STRING_AGG(), provided you're using SQL Server 2017 or later. This simplifies the query as well.
Query:
SELECT *
FROM
(SELECT
STRING_AGG(Result.DocumentNumber, ', ') DocumentNumbers,
COUNT(DocumentNumber) TotalDocuments,
Result.AlphaNumeric,
Result.Integers
FROM
(SELECT *
FROM SearchTest
WHERE AlphaNumeric LIKE '%b%' OR Integers = 1) AS Result
GROUP BY
Result.AlphaNumeric, Result.Integers) AS Final
Expected Output:
I have edited the code now:
You can use the below code to attain this:
SELECT S.DocumentNumbers,COUNT(S.DocumentNumbers) AS TotalDocuments,S.AlphaNumeric,S.Integers FROM
(SELECT COALESCE(Stuff((SELECT ', ' + CAST(DocumentNumber AS VARCHAR(10))
FROM SearchTest T1
WHERE T1.AlphaNumeric=T2.AlphaNumeric AND T1.Integers=T2.Integers
FOR XML PATH(''), TYPE).value('.','NVARCHAR(MAX)'),1,2,' ')
,CAST (T2.DocumentNumber AS VARCHAR(20))) AS DocumentNumbers,T2.AlphaNumeric,T2.Integers
FROM SearchTest T2) S
GROUP BY DocumentNumbers,S.AlphaNumeric,S.Integers
ORDER BY S.Integers

SQL Query to parse numbers from name

The DBMS in this case is SQL Server 2012.
I need a SQL query that will grab just the numbers from a device name. I've got devices that follow a naming scheme that SHOULD look like this:
XXXnnnnn
or
XXXnnnnn-XX
Where X is a letter and n is a number which should be left padded with 0's where appropriate. However, not all of the names are properly padded in this way.
So, imagine you have a column that looks something like this:
Name
----
XXX01234
XXX222
XXX0390-A2
XXX00965-A1
I need an SQL query that will return results from this example column as follows.
Number
------
01234
00222
00390
00965
Anyone have any thoughts? I've tried things like casting the name first as a float and then as an int, but to be honest, I'm just not skilled enough with SQL yet to find the solution.
Any help is greatly appreciated!
SQL Server does not have great string parsing functions. For your particular example, I think a case statement might be the simplest approach:
select (case when number like '___[0-9][0-9][0-9][0-9][0-9]%'
then substring(number, 4, 5)
when number like '___[0-9][0-9][0-9][0-9]%'
then '0' + substring(number, 4, 4)
when number like '___[0-9][0-9][0-9]%'
then '00' + substring(number, 4, 3)
when number like '___[0-9][0-9]%'
then '000' + substring(number, 4, 2)
when number like '___[0-9]%'
then '0000' + substring(number, 4, 1)
else '00000'
end) as EmbeddedNumber
This might work :
SELECT RIGHT('00000'
+ SUBSTRING(Col, 1, ISNULL(NULLIF((PATINDEX('%-%', Col)), 0) - 1, LEN(Col))), 5)
FROM (SELECT REPLACE(YourColumn, 'XXX', '') Col
FROM YourTable)t
This will work even when XXX can be of different len:
DECLARE @t TABLE ( n NVARCHAR(50) )
INSERT INTO @t
VALUES ( 'XXXXXXX01234' ),
( 'XX222' ),
( 'X0390-A2' ),
( 'XXXXXXX00965-A1' )
SELECT REPLICATE('0', 5 - LEN(n)) + n AS n
FROM ( SELECT SUBSTRING(n, PATINDEX('%[0-9]%', n),
CHARINDEX('-', n + '-') - PATINDEX('%[0-9]%', n)) AS n
FROM @t
) t
Output:
n
01234
00222
00390
00965
If the first 3 characters always need to be removed, you can do something like this (it works as long as non-digit characters appear only after the '-' sign):
DECLARE @a AS TABLE ( a VARCHAR(100) );
INSERT INTO @a
VALUES
( 'XXX01234' ),
( 'XXX222' ),
( 'XXX0390-A2' ),
( 'XXX00965-A1' );
SELECT RIGHT('00000' + SUBSTRING(a, 4, CHARINDEX('-',a+'-')-4),5)
FROM @a
-- OUTPUT
01234
00222
00390
00965
Another option (extracts the digits that follow the first 3 characters):
SELECT
RIGHT('00000' + LEFT(REPLACE(a, LEFT(a, 3), ''),
COALESCE(NULLIF(PATINDEX('%[^0-9]%',
REPLACE(a, LEFT(a, 3), '')),
0) - 1,
LEN(REPLACE(a, LEFT(a, 3), '')))), 5)
FROM
@a;
-- OUTPUT
01234
00222
00390
00965

How to do Custom Sorting in SQL Server 2005

I want to do custom sort by Customercode for tblCustomer table.
CustomerCode consists of (3 chars of Surname) + 1 + (PostCode)
Here, the 1 is an index that is incremented when a customer with the same surname and postcode already exists.
E.g. ABB12615, ABB22615
So mainly I want to sort this by
First 3 Letters of Surname + Index + PostCode.
I tried to do in this manner :
ORDER BY CHARINDEX(SUBSTRING(customerCode, 1, 3), customerCode)
but it gives me output like this:
ABB12615
ABB12715
...
...
...
..
.
ABB22615
But I want output in this order:
ABB12615
ABB22615
ABB12715
and so on
Is it possible to do?
Based on your expected results you really want to sort on
Surname, postcode, index
which would be
ORDER BY SUBSTRING(customerCode, 1, 3),
SUBSTRING(customerCode, 5, 4),
SUBSTRING(customerCode, 4, 1)
Try this
SELECT *
FROM TABLE1
ORDER BY CASE WHEN COlumn1 = 'ABB12615' THEN 1
WHEN COlumn1 = 'ABB22615' THEN 2
WHEN COlumn1 = 'ABB12715' THEN 3
END
This code should sort the way you want.
-- play table
create table #postal
(
id int identity(1,1) primary key,
code varchar(16)
)
go
-- remove data
truncate table #postal;
go
-- add data
insert into #postal
values
('ABB12615'),
('ABB22615'),
('ABB12715'),
('AAA29615'),
('AAA19615');
go
-- sort
select
*
from
#postal
order by
substring(code, 1, 3),
substring(code, 5, len(code) - 4),
substring(code, 4, 1)
Output from the test run.
Yes, it's possible.
Assuming that your CustomerCode format will remain the same, you can use the below code.
Split CustomerCode with string functions and convert the numeric parts to integers before sorting. To reproduce the order shown in the question, the post code has to be sorted before the index:
select * from tblCustomer
ORDER BY
SUBSTRING(Customercode , 1, 3) --SurName
,CONVERT(INT,SUBSTRING(Customercode , 5, 5)) --Post Code; You can optionally remove the convert to int function if your post code will contain characters
,CONVERT(INT, SUBSTRING(Customercode , 4, 1)) --Index

How to concatenate the "overflow" of fields with character limits

I have a table with 3 address fields and each address field has a limit of 100 characters each.
I need to create a query to make the maximum character limit for each address field to be 30 characters long. If one address field is > 30 then I'll cut off the rest, but take the remainder and concatenate it onto the beginning of the next address field. I would do this until the last address field (address3) is filled up and then just get rid of the remainder on the last address field.
Is there a way to do this with an SQL query or with T-SQL?
You don't specify what to do with very short addresses, but my first crack at it would be something like this:
with temp as
(
select 1 id, 'abcdefghijklmnopqrstuvwxyz123456789' part1, 'second part' part2, 'third part' part3
),
concated as
(
SELECT id, part1 + part2 + part3 as whole
FROM temp
)
select id,
SUBSTRING(whole, 1, 30) f,
SUBSTRING(whole, 31,30) s,
SUBSTRING(whole, 61,30) t
from concated
This returns:
id | f | s | t
1 | abcdefghijklmnopqrstuvwxyz1234 | 56789second partthird part |
If that's not what you're looking for please specify the desired output for the above.
UPDATE:
Well... this appears to work but it's pretty gross. I'm sure someone can come up with a better solution.
with temp as
(
select 1 id, 'abcdefghijklmnopqrstuvwxyz123456789 ' part1, 'second part' part2, 'third part' part3
)
select id,
SUBSTRING(part1, 1, 30) f,
SUBSTRING(SUBSTRING(part1, 31, 70) + SUBSTRING(part2, 1,30),1,30) s,
SUBSTRING(SUBSTRING(SUBSTRING(SUBSTRING(part1, 31, 70) + SUBSTRING(part2, 1,30),31,70),1,30) + SUBSTRING(part3, 1,30),1,30) t
from temp
I think I'd go with the problem description, and write something that's "obviously" correct (provided I've understood your spec :-))
/* Setup data - second example stolen from Abe, first just showing that it works with short enough data */
declare @t table (ID int not null,Address1 varchar(100) not null,Address2 varchar(100) not null,Address3 varchar(100) not null)
insert into @t (ID,Address1,Address2,Address3)
values (1,'abc','def','ghi'),
(2,'abcdefghijklmnopqrstuvwxyz123456789 ', 'second part', 'third part')
/* Actual query - shift address pieces through the address fields, but only to later ones */
;with Shift1 as (
select
ID,SUBSTRING(Address1,1,30) as Address1,SUBSTRING(Address1,31,70) as Address1Over,Address2,Address3
from @t
), Shift2 as (
select
ID,Address1,SUBSTRING(Address1Over+Address2,1,30) as Address2,SUBSTRING(Address1Over+Address2,31,70) as Address2Over,Address3
from Shift1
), Shift3 as (
select
ID,Address1,Address2,SUBSTRING(Address2Over+Address3,1,30) as Address3
from Shift2
)
select * from Shift3
Result:
ID          Address1                       Address2                       Address3
----------- ------------------------------ ------------------------------ ------------------------------
1           abc                            def                            ghi
2           abcdefghijklmnopqrstuvwxyz1234 56789 second part              third part

Parse SQL field into multiple rows

How can I take a SQL table that looks like this:
MemberNumber JoinDate Associate
1234 1/1/2011 A1 free A2 upgrade A31
5678 3/15/2011 A4
9012 5/10/2011 free
And output (using a view or writing to another table or whatever is easiest) this:
MemberNumber Date
1234-P 1/1/2011
1234-A1 1/1/2011
1234-A2 1/1/2011
1234-A31 1/1/2011
5678-P 3/15/2011
5678-A4 3/15/2011
9012-P 5/10/2011
Where each row results in a "-P" (primary) output line as well as any A# (associate) lines. The Associate field can contain a number of different non-"A#" values, but the "A#"s are all I'm interested in (# is from 1 to 99). There can be many "A#"s in that one field too.
Of course a table redesign would greatly simplify this query, but sometimes we just need to get it done. I wrote the query below using multiple CTEs; I find it's easier to follow and see exactly what's going on, but you could simplify this further once you grasp the technique.
To inject your "P" primary row, you will see that I simply jammed it into the Associate column, but it might be better placed in a simple UNION outside the CTEs.
In addition, if you do choose to refactor your schema, the technique below can be used to "split" your Associate column into rows.
;with
Split (MemberNumber, JoinDate, AssociateItem)
as ( select MemberNumber, JoinDate, p.n.value('(./text())[1]','varchar(25)')
from ( select MemberNumber, JoinDate, n=cast('<n>'+replace(Associate + ' P',' ','</n><n>')+'</n>' as xml).query('.')
from #t
) a
cross apply n.nodes('n') p(n)
)
select MemberNumber + '-' + AssociateItem,
JoinDate
from Split
where left(AssociateItem, 1) in ('A','P')
order
by MemberNumber;
The XML method is not a great option performance-wise, as its speed degrades as the number of items in the "array" increases. If you have long arrays, the following approach might be of use to you:
--* should be physical table, but use this cte if needed
--;with
--number (n)
--as ( select top(50) row_number() over(order by number) as n
-- from master..spt_values
-- )
select MemberNumber + '-' + substring(Associate, n, isnull(nullif(charindex(' ', Associate + ' P', n)-1, -1), len(Associate)) - n+1),
JoinDate
from ( select MemberNumber, JoinDate, Associate + ' P' from #t
) t (MemberNumber, JoinDate, Associate)
cross
apply number n
where n <= convert(int, len(Associate)) and
substring(' ' + Associate, n, 1) = ' ' and
left(substring(Associate, n, isnull(nullif(charindex(' ', Associate, n)-1, -1), len(Associate)) - n+1), 1) in ('A', 'P');
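The query above expects a table named number with a single integer column n. One way such a table could be built, as a sketch only: the dbo schema, the 1000-row size, and the use of sys.all_objects as a row source are assumptions chosen for illustration.

```sql
-- Physical numbers table holding 1..1000 (size is an arbitrary choice)
CREATE TABLE dbo.number (n int NOT NULL PRIMARY KEY);

-- Cross-joining a system catalog view with itself yields more than
-- enough rows; ROW_NUMBER() turns them into consecutive integers.
INSERT INTO dbo.number (n)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
```

The table only needs to cover the longest Associate value plus the appended ' P', so a much smaller size would also do for the sample data.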
Try this new version
declare @t table (MemberNumber varchar(8), JoinDate date, Associate varchar(50))
insert into @t values ('1234', '1/1/2011', 'A1 free A2 upgrade A31'),('5678', '3/15/2011', 'A4'),('9012', '5/10/2011', 'free')
;with b(f, t, membernumber, joindate, associate)
as
(
select 1, 0, membernumber, joindate, Associate
from @t
union all
select t+1, charindex(' ',Associate + ' ', t+1), membernumber, joindate, Associate
from b
where t < len(Associate)
)
select MemberNumber + case when t = 0 then '-P' else '-'+substring(Associate, f,t-f) end NewMemberNumber, JoinDate
from b
where t = 0 or substring(Associate, f,1) = 'A'
--where t = 0 or substring(Associate, f,2) like 'A[1-9]'
-- order by MemberNumber, t
Result is the same as the requested output.
I would recommend altering your database structure by adding a link table instead of the "Associate" column. A link table would consist of two or more columns like this:
MemberNumber  Associate  Details
-----------------------------------
1234          A1         free
1234          A2         upgrade
1234          A31
5678          A4
Then the desired result can be obtained with a simple JOIN:
SELECT CONCAT(m.`MemberNumber`, '-', 'P'), m.`JoinDate`
FROM `members` m
UNION
SELECT CONCAT(m.`MemberNumber`, '-', IFNULL(a.`Associate`, 'P')), m.`JoinDate`
FROM `members` m
RIGHT JOIN `members_associates` a ON m.`MemberNumber` = a.`MemberNumber`
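The link table described above could be declared roughly as follows. This is a sketch: the column types, the nullable Details column, and the composite primary key are assumptions, since the answer only names the table and its columns.

```sql
-- One row per (member, associate) pair; Details is optional
CREATE TABLE members_associates (
    MemberNumber varchar(8)  NOT NULL,
    Associate    varchar(3)  NOT NULL,
    Details      varchar(50) NULL,
    PRIMARY KEY (MemberNumber, Associate)
);

-- The sample rows from the table shown in the answer
INSERT INTO members_associates (MemberNumber, Associate, Details)
VALUES
    ('1234', 'A1',  'free'),
    ('1234', 'A2',  'upgrade'),
    ('1234', 'A31', NULL),
    ('5678', 'A4',  NULL);
```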