Extracting text between two characters in SQL - sql

I am trying to extract the text between two characters using t-sql. I have been able to write it where it pulls the information close to what I want, but for some reason I am not getting what i am expecting(suprise, suprise). Could really use alittle help refining it. I am trying to extract part of the table name that is located between two [ ]. An example of the column data is as follows(this is a table that records all changes made to the database so the column text is basically SQL statements):
ALTER TABLE [TABLENAME].[MYTABLE] ADD
[VIP_CUSTOMER] [int] NULL
I am trying to extract part of the table name, in this example I just want 'MYTABLE'
Right now I am using:
select SUBSTRING(db.Event_Text, CHARINDEX('.', db.Event_Text) + 2, (CHARINDEX(']', db.Event_Text)) - CHARINDEX('', db.Event_Text) + Len(']')) as OBJName
FROM DBA_AUDIT_EVENT DB
WHERE DATABASE_NAME = 'XYZ'
But when I use this, I don't always get the results needed. Sometimes I get 'MYTABLE] ADD' and sometimes I get the part of the name I want, and sometimes depending on the length of the tablename I only get part the first part of the name with part of the name cut off at the end. Is there anyway to get this right, or is there a better way of writing it? Any help would be greatly appreciated. Thanks in advance.

Long, but here's a formula using the brackets:
Declare #text varchar(200);
Select #text='ALTER TABLE [TABLENAME].[MYTABLE] ADD [VIP_CUSTOMER] [int] NULL';
Select SUBSTRING(#text,
CHARINDEX('[', #text, CHARINDEX('[', #text) + 1 ) +1,
CHARINDEX(']', #text, CHARINDEX('[', #text, CHARINDEX('[', #text) + 1 ) ) -
CHARINDEX('[', #text, CHARINDEX('[', #text) + 1 ) - 1 );
Replace #text with your column name.

Give this a shot:
select SUBSTRING(db.Event_Text, CHARINDEX('.', db.Event_Text) + 2
, CHARINDEX(']', db.Event_Text) - 2) as OBJName
FROM DBA_AUDIT_EVENT DB
WHERE DATABASE_NAME = 'XYZ'

this is a pretty ugly way to get the length, but I've used something like this before:
select SUBSTRING(db.Event_Text,
CHARINDEX('.', db.Event_Text) + 2,
charindex('] ADD',db.Event_Text) - CHARINDEX('.',db.Event_Text)-2))
Give it a try, it may work for you.

Related

SQL Server query to delete text from text column

I have a SQL Server database with a table feedback that contains a text column comment. In that column I have tag data, for example
This is my record <tag>Random characters are here</tag> with information.
How do I write a query to update all of these records to remove the <tag></tag> and all of the text in between?
I'd like to write this to a different 'temporary' table to first verify the changes and then update the original table.
I am running SQL Server 2014 Express.
Thank you
Here is a function to remove tags..
CREATE FUNCTION [dbo].[RemoveTag](#text NVARCHAR(MAX), #tag as nvarchar(max))
RETURNS NVARCHAR(MAX)
AS
BEGIN
declare #startTagIndex as int
declare #endTagIndex as int
set #startTagIndex = CHARINDEX('<' + #tag + '>', #text)
if(#startTagIndex > 0) BEGIN
set #endTagIndex = CHARINDEX('</' + #tag + '>', #text, #startTagIndex)
if(#endTagIndex > 0) BEGIN
return LEFT(#text, #startTagIndex - 1) + RIGHT(#text, len(#text) - len(#tag) - #endTagIndex - 2)
END
END
return #text
END
Later you can use it like:
Update table set field = dbo.RemoveTag(field, 'tag')
If you want to write fields to other table then:
CREATE TABLE dbo.OtherTable (
OtherField nvarchar(MAX) NOT NULL
)
GO
INSERT INTO OtherTable (OtherField)
SELECT dbo.RemoveTag(field, 'tag') from table
Making a lot assumptions about the format of your string. But if they're valid then this is very simple:
left(s, charindex('<tag>', s - 1)) +
substring(s, charindex('</tag>', s) + 6, len(s))
Obviously we're basically assuming that the search strings appear only once and in the correct order. There's also an assumption that there will be matches. Also, I used len(s) as an easy upper bound on the number of characters to take from the right. You could just hard-code something appropriate if you felt like it since SQL Server doesn't error for going past the end. s is just a stand in for your char column.
http://sqlfiddle.com/#!3/771a3/8
Not sure if extra whitespace is going to be an issue so you might want to trim and add a space character in the middle.
rtrim(left(s, charindex('<tag>', s) - 1)) + ' ' +
ltrim(substring(s, charindex('</tag>', s) + 6, len(s)))
You can use CHARINDEX to find where your tags start and stop, SUBSTRING to get all text between < and >, and REPLACE to swap out the substring for ''.
Select Field,
Substring(FIELD, charindex('<', Field), CHARINDEX('>', Field,
(CHARINDEX('>', FIELD)) + 1) - charindex('<', Field)+1) as ToRemove,
replace (Field, Substring(FIELD, charindex('<', Field), CHARINDEX('>',
Field, (CHARINDEX('>', FIELD)) + 1) - charindex('<', Field)+1), '')
as FinalResult
from TableName
The output will be three columns, Field, ToRemove and FinalResult, but nothing will actually be updated.
I think the only way this will fail is if you have nested tags. <b><i>sometext</i></b>
To actually make the change:
Update #TableName set Field = replace (Field, Substring(FIELD, charindex('<', Field), CHARINDEX('>', Field, (CHARINDEX('>', FIELD)) + 1) - charindex('<', Field)+1), '')
Tested on SQL Server 2012.

SQL SERVER 2008 - Returning a portion of text using SUBSTRING AND CHARINDEX. Need to return all text UNTIL a specific char

I have a column called 'response' that contains lots of data about a person.
I'd like to only return the info after a specific string
But, using the method below I sometimes (when people have <100 IQ) get the | that comes directly after the required number..
I'd like any characters after the'PersonIQ=' but only before the pipe.
I'm not sure of the best way to achieve this.
Query speed is a concern and my idea of nested CASE is likely not the best solution.
Any advice appreciated. Thanks
substring(response,(charindex('PersonIQ=',response)+9),3)
This is my suggestion:
declare #s varchar(200) = 'aaa=bbb|cc=d|PersonIQ=99|e=f|1=2'
declare #iq varchar(10) = 'PersonIQ='
declare #pipe varchar(1) = '|'
select substring(#s,
charindex(#iq, #s) + len(#iq),
charindex(#pipe, #s, charindex(#iq, #s)) - (charindex(#iq, #s) + len(#iq))
)
Instead of the 3 in your formula you should calculate the space between #iq and #pipe with this last part of the formula charindex(#pipe, #s, charindex(#iq, #s)) - (charindex(#iq, #s) + len(#iq)), which gets the first #pipe index after #iq, and then substructs the index of the IQ value.
Assuming there's always a pipe, you could do this:
substring(stuff(reponse,1,charindex('PersonIQ=',reponse)-1,''),1,charindex('|',stuff(reponse,1,charindex('PersonIQ=',reponse)-1,''))-1)
Or, you could convert your string to xml and reference PersonIQ directly, e.g.:
--assuming your string looks something like this..
declare #s varchar(max) = 'asdaf=xxx|PersonIQ=100|xxx=yyy'
select convert(xml, '<x ' + replace(replace(#s, '=', '='''), '|', ''' ') + '''/>').value('(/x/#PersonIQ)[1]','int')

Substring with conditional statement

tl;dr
I don't understand how to conditionally change the length parameter of SUBSTRING(..)
Short enough, did read
I've got a text field in a sql table that I want to retrieve a substring from
There is a specific part of text I am having trouble retrieving a substring from, because I cannot guarantee the next string.
For example, I have:
... Tracking Code : /a/delimited/string AttributeW : ValueW ...
And
... Tracking Code : /a/different/delimited/string A random string ...
From both of those i want /a/delimited/string and /a/different/delimited/string respectively
My current sql looks something like:
DECLARE #TrackingStartStr VARCHAR(50), #TrackingEndStr VARCHAR(50)
SET #TrackingStartStr = 'Tracking Code :'
SET #TrackingEndStr = 'Some string that indicates the text is about to end'
SELECT
AField
,RTRIM(LTRIM(Substring(CAST([Body] AS VARCHAR(MAX))
,Charindex(#TrackingStartStr,CAST([Body] AS VARCHAR(MAX))) + LEN(#TrackingStartStr)
,charindex(#TrackingEndStr,CAST([Body] AS VARCHAR(MAX))) - (Charindex(#TrackingStartStr,CAST([Body] AS VARCHAR(MAX))) + LEN(#TrackingStartStr))
))) AS TrackingCode
From tbl_stupidTextTable
I don't know how to conditionally change what #TrackingEndStr is for each row.
Try:
select substring(stringfield,
charindex('/', stringfield, 1),
charindex(' ',
stringfield,
charindex('/', stringfield, 1)) -
charindex('/', stringfield, 1)) as val
from tbl
SQL Fiddle demo: http://sqlfiddle.com/#!6/cebab/16/0

Extract float from String/Text SQL Server

I have a Data field that is supposed to have floating values(prices), however, the DB designers have messed up and now I have to perform aggregate functions on that field. Whereas 80% of the time data is in correct format,eg. '80.50', sometime it is saved as '$80.50' or '$80.50 per sqm'.
The data field is nvarchar. What I need to do is extract the floating point number from the nvarchar. I came accross this: Article on SQL Authority
This, however, solves half my problem, or compound it, some might say. That function just returns the numbers in a string. That is '$80.50 per m2'will return 80502. Obviously that wont work. I tried to change the Regex from =>
PATINDEX('%[^0-9]%', #strAlphaNumeric) to=>
PATINDEX('%[^0-9].[^0-9]%', #strAlphaNumeric)
doesnt work. Any help would be appreciated.
This will do want you need, tested on (http://sqlfiddle.com/#!6/6ef8e/53)
DECLARE #data varchar(max) = '$70.23 per m2'
Select LEFT(SubString(#data, PatIndex('%[0-9.-]%', #data),
len(#data) - PatIndex('%[0-9.-]%', #data) +1
),
PatIndex('%[^0-9.-]%', SubString(#data, PatIndex('%[0-9.-]%', #data),
len(#data) - PatIndex('%[0-9.-]%', #data) +1))
)
But as jpw already mentioned a regular expression over a CLR would be better
This should work too, but it assumes that the float numbers are followed by a white space in case there's text after.
// sample data
DECLARE #tab TABLE (strAlphaNumeric NVARCHAR(30))
INSERT #tab VALUES ('80.50'),('$80.50'),('$80.50 per sqm')
// actual query
SELECT
strAlphaNumeric AS Original,
CAST (
SUBSTRING(stralphanumeric, PATINDEX('%[0-9]%', strAlphaNumeric),
CASE WHEN PATINDEX('%[ ]%', strAlphaNumeric) = 0
THEN LEN(stralphanumeric)
ELSE
PATINDEX('%[ ]%', strAlphaNumeric) - PATINDEX('%[0-9]%', strAlphaNumeric)
END
)
AS FLOAT) AS CastToFloat
FROM #tab
From the sample data above it generates:
Original CastToFloat
------------------------------ ----------------------
80.50 80,5
$80.50 80,5
$80.50 per sqm 80,5
Sample SQL Fiddle.
If you want something more robust you might want to consider writing an CLR-function to do regex parsing instead like described in this MSDN article: Regular Expressions Make Pattern Matching And Data Extraction Easier
Inspired on #deterministicFail, I thought a way to extract only the numeric part (although it's not 100% yet):
DECLARE #NUMBERS TABLE (
Val VARCHAR(20)
)
INSERT INTO #NUMBERS VALUES
('$70.23 per m2'),
('$81.23'),
('181.93 per m2'),
('1211.21'),
(' There are 4 tokens'),
(' No numbers '),
(''),
(' ')
select
CASE
WHEN ISNUMERIC(RTRIM(LEFT(RIGHT(RTRIM(LTRIM(n.Val)), 1+LEN(RTRIM(LTRIM(n.Val)))-PatIndex('%[0-9.-]%', RTRIM(LTRIM(n.Val)))), LEN(RIGHT(RTRIM(LTRIM(n.Val)), 1+LEN(RTRIM(LTRIM(n.Val)))-PatIndex('%[0-9.-]%', RTRIM(LTRIM(n.Val)))))- PATINDEX('%[^0-9.-]%',RIGHT(RTRIM(LTRIM(n.Val)), 1+LEN(RTRIM(LTRIM(n.Val)))-PatIndex('%[0-9.-]%', RTRIM(LTRIM(n.Val))))))))=1 THEN
RTRIM(LEFT(RIGHT(RTRIM(LTRIM(n.Val)), 1+LEN(RTRIM(LTRIM(n.Val)))-PatIndex('%[0-9.-]%', RTRIM(LTRIM(n.Val)))), LEN(RIGHT(RTRIM(LTRIM(n.Val)), 1+LEN(RTRIM(LTRIM(n.Val)))-PatIndex('%[0-9.-]%', RTRIM(LTRIM(n.Val)))))- PATINDEX('%[^0-9.-]%',RIGHT(RTRIM(LTRIM(n.Val)), 1+LEN(RTRIM(LTRIM(n.Val)))-PatIndex('%[0-9.-]%', RTRIM(LTRIM(n.Val)))))))
ELSE '0.0'
END
FROM #NUMBERS n

Increment a varchar in SQL

Basically, I want to increment a varchar in SQL the has a value of "ABC001".
I have code that adds one to an int, but I don't know how to get it working for a varchar:
SELECT
NXT_NO
FROM
TABLE
UPDATE
TABLE
SET
NXT_NO = NXT_NO + 1
Is there an easy way to increment if NXT_NO is a varchar?
I want:
ABC001
ABC002
ABC003
AND
It also needs to work with:
001, A0001, AB00001
Well, you can do something like this:
update table
set nxt_no = left(next_no, 3) +
right('0000000' + cast(substring(next_no, 4, 100)+1 as varchar(255)), 4)
A bit brute force in my opinion.
By the way, you could use an identity column to autoincrement ids. If you then want to put a fixed prefix in front, you ca use a calculated column. Or take Bohemian's advice and store the prefix and number in different columns.
update
[table]
set [nxt_no] = case when PATINDEX('%[0-9]%', [nxt_no]) > 0 then
left([nxt_no], PATINDEX('%[0-9]%', [nxt_no])-1) -- Text part
+ -- concat
right( REPLICATE('0', LEN([nxt_no]) - PATINDEX('%[0-9]%', [nxt_no])+1) + convert( varchar, convert(int, right([nxt_no], LEN([nxt_no]) - PATINDEX('%[0-9]%', [nxt_no])+1))+1), LEN([nxt_no]) - PATINDEX('%[0-9]%', [nxt_no])+1)
else
[nxt_no] end