SQL UPDATE specific characters in string - sql

I have a column with the following values (there are a lot more):
20150223-001
20150224-002
20150225-003
I need to write an UPDATE statement which will change the first 2 characters after the dash to 'AB'. Result has to be the following:
20150223-AB1
20150224-AB2
20150225-AB3
Could anyone assist me with this?
Thanks in advance.

Use this,
DECLARE @MyString VARCHAR(30) = '20150223-0000000001'
SELECT STUFF(@MyString,CHARINDEX('-',@MyString)+1,2,'AB')
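To check the position arithmetic outside the database, here is a minimal Python sketch of what STUFF combined with CHARINDEX does. The helper names stuff and replace_after_dash are made up for this illustration. One assumption differs from T-SQL on purpose: when there is no dash, CHARINDEX returns 0, so the STUFF call would overwrite the first two characters of the value; the sketch returns the value unchanged instead, which in SQL would correspond to adding a filter such as WHERE val LIKE '%-%'.

```python
def stuff(s, start, length, replacement):
    """Model T-SQL STUFF for in-range arguments: 1-based start,
    remove `length` characters, insert `replacement` in their place."""
    i = start - 1  # convert the 1-based SQL position to a 0-based index
    return s[:i] + replacement + s[i + length:]


def replace_after_dash(value, new="AB"):
    """Replace the characters right after the first dash with `new`."""
    dash = value.find("-")  # 0-based, -1 when missing (CHARINDEX would give 0)
    if dash == -1:
        return value  # assumption: leave values without a dash untouched
    # CHARINDEX('-', value) + 1 is dash + 2 in 1-based terms
    return stuff(value, dash + 2, len(new), new)


if __name__ == "__main__":
    for v in ["20150223-001", "20150224-002", "20150223-0000000001"]:
        print(v, "->", replace_after_dash(v))
```

Because the dash is located per value, the same call works whether the part after the dash has three digits or ten.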

If there is a lot of data, you could consider using the .WRITE clause, but it is limited to the VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX) data types.
If your column has one of these types, the .WRITE clause is the easiest option for this purpose. Example below:
UPDATE Codes
SET val.WRITE('AB',9,2)
GO
Other possible choice could be simple REPLACE:
UPDATE Codes
SET val=REPLACE(val,SUBSTRING(val,10,2),'AB')
GO
or STUFF:
UPDATE Codes
SET val=STUFF(val,10,2,'AB')
GO
I relied on the information that the column always holds 8 date characters followed by one dash. I prepared a table and tested some of the solutions mentioned here.
CREATE TABLE Codes(val NVARCHAR(MAX))
INSERT INTO Codes
SELECT TOP 500000 CONVERT(NVARCHAR(128),GETDATE()-CHECKSUM(NEWID())%1000,112)+'-00'+CAST(ABS(CAST(CHECKSUM(NEWID())%10000 AS INT)) AS NVARCHAR(128))
FROM sys.columns s1 CROSS JOIN sys.columns s2
I ran some tests on 10 million (10kk) rows with an NVARCHAR(MAX) column and got the following results:
+---------+------------+
| Method | Time |
+---------+------------+
| .WRITE | 28 seconds |
| REPLACE | 30 seconds |
| STUFF | 15 seconds |
+---------+------------+
As we can see, STUFF looks like the best option for updating part of a string. .WRITE should be considered when you insert or append new data into a string; then you can take advantage of minimal logging if the database recovery model is set to bulk-logged or simple, according to the MSDN article about the UPDATE statement: Updating Large Value Data Types
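One caveat about the REPLACE variant above, sketched in Python: REPLACE substitutes every occurrence of the search string, not only the one at position 10. If the two characters after the dash also appear in the date part, the date gets rewritten too. The function names here are invented for the illustration, and the fixed positions assume the 8-characters-plus-dash layout described in this answer.

```python
def replace_variant(val):
    """Model of REPLACE(val, SUBSTRING(val, 10, 2), 'AB')."""
    target = val[9:11]               # SUBSTRING(val, 10, 2), 1-based in SQL
    return val.replace(target, "AB")  # like REPLACE: hits every occurrence


def stuff_variant(val):
    """Model of STUFF(val, 10, 2, 'AB'): purely positional."""
    return val[:9] + "AB" + val[11:]


if __name__ == "__main__":
    sample = "20151002-001"          # the date part also contains "00"
    print(replace_variant(sample))   # date part is changed as well
    print(stuff_variant(sample))     # only positions 10-11 change
```

For values such as 20150223-001 both variants agree; they diverge only when the replaced pair occurs elsewhere in the string, which is why the positional STUFF form is the safer of the two.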

According to the OP's comment:
It's always 8 characters before the dash but the characters after the
dash can vary. It has to update the first two after the dash.
Use the following code:
DECLARE @MyString VARCHAR(30) = '20150223-0000000001'
SELECT REPLACE(@MyString,SUBSTRING(@MyString,9,3),'-AB')
Result:-
20150223-AB00000001

try,
update table set column=stuff(column,charindex('-',column)+1,2,'AB')

Declare @Table1 TABLE (DateValue Varchar(50))
INSERT INTO @Table1
SELECT '20150223-000000001' Union all
SELECT '20150224-000000002' Union all
SELECT '20150225-000000003'
SELECT DateValue,
CONCAT(SUBSTRING(DateValue,0,CHARINDEX('-',DateValue)),
REPLACE(LEFT(SUBSTRING(DateValue,CHARINDEX('-',DateValue)+1,Len(DateValue)),2),'00','-AB'),
SUBSTRING(DateValue,CHARINDEX('-',DateValue)+1,Len(DateValue))) AS ExpectedDateValue
FROM @Table1
Output
DateValue ExpectedDateValue
---------------------------------------------
20150223-000000001 20150223-AB000000001
20150224-000000002 20150224-AB000000002
20150225-000000003 20150225-AB000000003
To Update
Update @Table1
SEt DateValue= CONCAT(SUBSTRING(DateValue,0,CHARINDEX('-',DateValue)),
REPLACE(LEFT(SUBSTRING(DateValue,CHARINDEX('-',DateValue)+1,Len(DateValue)),2),'00','-AB'),
SUBSTRING(DateValue,CHARINDEX('-',DateValue)+1,Len(DateValue)))
From @Table1
SELECT * from @Table1
Output
DateValue
-------------
20150223-AB000000001
20150224-AB000000002
20150225-AB000000003

Related

SQL - Replace a particular part of column string value (between second and third slash)

In my SQLServer DB, I have a table called Documents with the following columns:
ID - INT
DocLocation - NTEXT
DocLocation has values in following format:
'\\fileShare1234\storage\ab\xyz.ext'
Now it seems these documents are stored in multiple file share paths.
We're planning to migrate all documents in one single file share path called say 'newFileShare' while maintaining the internal folder structure.
So basically '\\fileShare1234\storage\ab\xyz.ext' should be updated to '\\newFileShare\storage\ab\xyz.ext'
Two questions:
How do I query my DocLocation to extract DocLocations with unique file share values? Like 'fileShare1234' and 'fileShare6789' and so on..
In a single Update query how do I update my DocLocation values to newFileShare ('\\fileShare1234\storage\ab\xyz.ext' to '\\newFileShare\storage\ab\xyz.ext')
I think the trick would be extract and replace text between second and third slashes.
I've still not figured out how to achieve my first objective. I require those unique file shares for some other tasks.
As for the second objective, I've tried using REPLACE, but it will require multiple update statements, one per file share, as I've done below:
update Documents set DocLocation = REPLACE(Cast(DocLocation as NVarchar(Max)), '\\fileShare1234\', '\\newFileShare\')
The first step is fairly easy. If all your paths begin with \\, then you can find all the DISTINCT servers using SUBSTRING. I will make a simple script with a table variable to replicate some data. The value of 3 is in the query and it is the length of \\ plus 1 since SQL Server counts from 1.
DECLARE @Documents AS TABLE(
ID INT NOT NULL,
DocLocation NTEXT NOT NULL
);
INSERT INTO @Documents(ID, DocLocation)
VALUES (1,'\\fileShare56789\storage\ab\xyz.ext'),
(2,'\\fileShare1234\storage\ab\cd\xyz.ext'),
(3,'\\share4567890\w\x\y\z\file.ext');
SELECT DISTINCT SUBSTRING(DocLocation, 3, CHARINDEX('\', DocLocation, 3) - 3) AS [Server]
FROM @Documents;
The results from this are:
Server
fileShare1234
fileShare56789
share4567890
For the second part, we can just concatenate the new server name with the path that appears after the first \.
UPDATE @Documents
SET DocLocation = CONCAT('\\newfileshare\',
SUBSTRING(DocLocation, 3, LEN(CAST(DocLocation AS nvarchar(max))) - 2));
SELECT * FROM @Documents;
For some reason I cannot create a table with the results here, but the values I see are this:
\\newfileshare\fileShare56789\storage\ab\xyz.ext
\\newfileshare\fileShare1234\storage\ab\cd\xyz.ext
\\newfileshare\share4567890\w\x\y\z\file.ext
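Note that the values listed above still carry the old share name after \\newfileshare\, whereas the question asks for it to be dropped. A minimal Python sketch of the intended transformation follows; share_name and move_to_share are hypothetical helper names, and the sketch assumes every path starts with \\ and contains at least one more backslash after the share name. In SQL terms, the same idea would start the SUBSTRING at the position CHARINDEX finds for the third backslash rather than at position 3.

```python
def share_name(path):
    """Return the text between the leading \\\\ and the next backslash."""
    end = path.index("\\", 2)   # first backslash after the leading pair
    return path[2:end]


def move_to_share(path, new_share):
    """Swap the share name and keep the folder structure after it."""
    end = path.index("\\", 2)
    return "\\\\" + new_share + path[end:]


if __name__ == "__main__":
    paths = [
        r"\\fileShare56789\storage\ab\xyz.ext",
        r"\\fileShare1234\storage\ab\cd\xyz.ext",
        r"\\share4567890\w\x\y\z\file.ext",
    ]
    print(sorted({share_name(p) for p in paths}))  # the distinct shares
    for p in paths:
        print(move_to_share(p, "newFileShare"))
```

The set comprehension answers the first question (distinct shares) and move_to_share the second, without needing one REPLACE per old share name.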
Please try the following solution based on XML and XQuery.
Their data model is based on ordered sequences. Exactly what we need while processing fully qualified file path: [position() ge 4]
When you are comfortable, just run the UPDATE statement by updating the DocLocation column with the calculated result.
It is better to use NVARCHAR(MAX) instead of NText data type.
SQL
-- DDL and sample data population, start
DECLARE @tbl AS TABLE(ID INT IDENTITY PRIMARY KEY, DocLocation NVARCHAR(MAX));
INSERT INTO @tbl(DocLocation) VALUES
('\\fileShare56789\storage\ab\xyz.ext'),
('\\fileShare1234\storage\ab\cd\xyz.ext'),
('\\share4567890\w\x\y\z\file.ext');
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = '\'
, @newFileShare NVARCHAR(100) = 'newFileShare';
SELECT ID, DocLocation
, result = '\\' + @newFileShare + @separator +
REPLACE(c.query('data(/root/r[position() ge 4]/text())').value('text()[1]', 'NVARCHAR(MAX)'), SPACE(1), @separator)
FROM @tbl
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' +
REPLACE(DocLocation, @separator, ']]></r><r><![CDATA[') +
']]></r></root>' AS XML)) AS t(c);
Output
+----+---------------------------------------+--------------------------------------+
| ID | DocLocation | result |
+----+---------------------------------------+--------------------------------------+
| 1 | \\fileShare56789\storage\ab\xyz.ext | \\newFileShare\storage\ab\xyz.ext |
| 2 | \\fileShare1234\storage\ab\cd\xyz.ext | \\newFileShare\storage\ab\cd\xyz.ext |
| 3 | \\share4567890\w\x\y\z\file.ext | \\newFileShare\w\x\y\z\file.ext |
+----+---------------------------------------+--------------------------------------+
To get the unique list of shared folder paths, you can use this query:
SELECT distinct SUBSTRING(DocLocation,0,CHARINDEX('\',DocLocation,3))
from Documents
Your update command should work. Yes, you can merge a couple of REPLACE calls into one update, but it is better to run them separately:
update Documents
set DocLocation = REPLACE(DocLocation,'\\fileShare1234','\\newFileShare')
However, I recommend that you always store a relative address instead of the full path, for example: '\storage\ab\xyz.ext'

Using UPDATE statement with a SELECT statement converting hexadecimal to text

I have a table called Script_Data that has three columns - ScriptID (primary), RowOrder and ScriptData. Each row value for ScriptData is hexadecimal. For me to make sense of it, I convert it to text. I CAST() the ScriptData column into the VarChar data type using the following query
SELECT ScriptID, RowOrder, CAST(CAST(ScriptData AS varbinary(MAX)) AS varchar(MAX)) AS Converted_SD
FROM Script_Data
Is it possible to UPDATE values in the ScriptData column when converted? I know that I would typically do something like this if not for converting:
UPDATE Script_Data
SET ScriptData='Sales'
WHERE ScriptData='Marketing';
Is it even possible to do something like this when I have it converted from hex to text? I've tried so many different queries, most of which include subqueries, but all failed.
Converting it changes this
| ScriptID | RowOrder | ScriptData |
------------------------------------
| 5008 | 1 | 0x435669787|
to this (I'm over simplifying the results)
| ScriptID | RowOrder | ScriptData |
------------------------------------
| 5008 | 1 | Sales |
EDIT:
My best attempt seems to have been this query
UPDATE Script_Data
SET ScriptData='Engineering'
(SELECT ScriptID, RowOrder, CONVERT(varchar(max), ScriptData)
FROM Script_Data
WHERE ScriptData = 'Accounting')
But SQL is telling me that Implicit conversion from data type varchar to varbinary(max) is not allowed. Use the CONVERT function to run this query. I've tried to use CONVERT in creative ways to satisfy the error, but have not been successful. The ScriptData column is varbinary datatype with -1 length.
It seems you need to cast the new value to varbinary as part of the update.
UPDATE Script_Data SET
ScriptData = CAST('Engineering' AS VARBINARY(MAX))
WHERE CAST(ScriptData AS VARCHAR(MAX)) = 'Accounting'
I won't ask why you are storing strings as varbinary because I'm sure you realise life would be much easier if you just stored it as a varchar.
Here is the test script I used:
declare @ScriptData table (ScriptData varbinary(max));
insert into @ScriptData (ScriptData)
values (0x435669787), (convert(varbinary(max),'Sales'));
select *, convert(varchar(max),ScriptData,3), CAST(ScriptData AS varchar(MAX)) from @ScriptData;
update @ScriptData set
ScriptData = CAST('Marketing' AS VARBINARY(MAX))
where CAST(ScriptData AS varchar(MAX)) = 'Sales';
select *, convert(varchar(max),ScriptData,3), CAST(ScriptData AS varchar(MAX)) from @ScriptData;
For your SELECT query, the analogous UPDATE is to place conversion in the SET command assigning value to a new or different column, not the same column.
UPDATE is not a DDL (data definition language) but a DML (data manipulation language) command. Hence, it only adjusts data and does not change a column's defined data type. Consider an ALTER command to create a new VARCHAR(MAX) column, then run UPDATE to assign the value:
ALTER TABLE Script_Data ADD ScriptData_Text VARCHAR(MAX);
UPDATE Script_Data
SET ScriptData_Text = CAST(CAST(ScriptData AS varbinary(MAX)) AS varchar(MAX));
Also, since ALTER is a DDL command, use it very sparingly and never in application code or stored procedure since it can adjust table schema and column definitions.

creating a SQL table with multiple columns automatically

I must create an SQL table with 90+ fields. The majority of them are bit fields like N01, N02, N03 ... N89, N90. Is there a fast way of creating multiple fields, or is it possible to have one single field contain an array of true/false values? I need a solution that can also easily be queried.
There is no easy way to do this and it will be very challenging to do queries against such a table. Create a table with three columns - item number, bit field number and a value field. Then you will be able to write 'good' succinct T-SQL queries against the table.
At least you can generate ALTER TABLE scripts for bit fields, and then run those scripts.
DECLARE @COUNTER INT = 1
-- Loop over all 90 fields (N01 .. N90) and print one ALTER TABLE per field
WHILE @COUNTER <= 90
BEGIN
PRINT 'ALTER TABLE table_name ADD N' + RIGHT('00' + CONVERT(NVARCHAR(4), @COUNTER), 2) + ' bit'
SET @COUNTER += 1
END
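The same script generation can be done outside the database. Below is a small Python sketch that builds the column list for N01 to N90 and a single CREATE TABLE statement instead of 90 ALTER TABLE statements. The table name, the Id column and the function names are placeholders chosen for this example, not something given in the question.

```python
def bit_columns(count=90, prefix="N"):
    """Return column definitions such as 'N01 bit' ... 'N90 bit'."""
    return [f"{prefix}{i:02d} bit" for i in range(1, count + 1)]


def create_table_sql(table_name, count=90):
    """Build one CREATE TABLE statement with an Id column plus the bit fields."""
    columns = ["Id int NOT NULL PRIMARY KEY"] + bit_columns(count)
    body = ",\n    ".join(columns)
    return f"CREATE TABLE {table_name} (\n    {body}\n);"


if __name__ == "__main__":
    print(create_table_sql("table_name"))
```

The generated text can be pasted into a query window, which keeps the numbering consistent and avoids typing the 90 definitions by hand.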
TLDR: Use binary arithmetic.
For a structure like this
==============
Table_Original
==============
Id | N01| N02 |...
I would recommend an alternate table structure like this
==============
Table_Alternate
==============
Id | One_Col
This One_Col is of varchar type which will have value set as
cast(n01 as nvarchar(1)) + cast(n02 as nvarchar(1))+ cast(n03 as nvarchar(1)) as One_Col
I however feel that you'd use C# or some other programming language to set value into column. You can also use bit and bit-shift operations.
Whenever you need to get a value, you can use SQL or C# syntax(treating as string)
In sql query terms you can use a query like
SELECT SUBSTRING(one_col,@pos,1)
and @pos can be set like
DECLARE @Colname nvarchar(4)
SET @colname=N'N32'
-- ....
SET @pos= CAST(REPLACE(@colname,'N','') as INT)
Also you can use binary arithmetic too with ease in any programming language.
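As a rough illustration of both ideas in application code, the Python sketch below reads a flag from the '0'/'1' string form of One_Col and, alternatively, packs the flags into a single integer with shifts and masks. All function names are invented for the example. Keep in mind that 90 flags do not fit into one 64-bit integer column, so the integer form would need two columns or a binary type on the database side; Python integers have no such limit.

```python
def flag_from_string(one_col, colname):
    """String form: 'N32' maps to the 32nd character of one_col."""
    pos = int(colname.replace("N", ""))  # same idea as the @pos calculation
    return one_col[pos - 1] == "1"       # SQL SUBSTRING is 1-based


def set_flag(mask, n, value):
    """Integer form: set or clear flag number n (1-based) in mask."""
    bit = 1 << (n - 1)
    return (mask | bit) if value else (mask & ~bit)


def get_flag(mask, n):
    """Integer form: read flag number n (1-based) from mask."""
    return ((mask >> (n - 1)) & 1) == 1


if __name__ == "__main__":
    one_col = "10" + "0" * 88            # N01 is set, everything else is clear
    print(flag_from_string(one_col, "N01"), flag_from_string(one_col, "N32"))
    mask = set_flag(0, 32, True)
    print(get_flag(mask, 32), get_flag(mask, 31))
```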
Use three columns.
Table
ID NUMBER,
FIELD_NAME VARCHAR2(10),
VALUE NUMBER(1)
Example
ID FIELD VALUE
1 N01 1
1 N02 0
.
1 N90 1
.
2 N01 0
2 N02 1
.
2 N90 1
.
You can also OR an entire column for a fieldname (or fieldnameS):
select DECODE(SUM(VALUE), 0, 0, 1) from table where field_name = 'N01';
And even perform an AND
select EXP(SUM(LN(VALUE))) from table where field_name = 'N01';
(see http://viralpatel.net/blogs/row-data-multiplication-in-oracle/)
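A caveat on the EXP(SUM(LN(VALUE))) form: the logarithm is undefined at 0, so the product trick only behaves when every value is positive, and a 0 is exactly the case an AND has to detect. For 0/1 values, the minimum gives the same AND and the maximum gives OR (MIN(VALUE) and MAX(VALUE) in SQL). The Python sketch below shows the idea on a few made-up rows in the three-column layout; the function names and the sample data are assumptions for illustration only.

```python
# (ID, FIELD_NAME, VALUE) rows in the three-column layout
ROWS = [
    (1, "N01", 1), (2, "N01", 0),   # mixed values
    (1, "N02", 1), (2, "N02", 1),   # all set
    (1, "N03", 0), (2, "N03", 0),   # none set
]


def column_or(rows, field):
    """OR over one field: 1 if any row has the flag set."""
    return max(value for _, name, value in rows if name == field)


def column_and(rows, field):
    """AND over one field: 1 only if every row has the flag set."""
    return min(value for _, name, value in rows if name == field)


if __name__ == "__main__":
    for field in ("N01", "N02", "N03"):
        print(field, column_or(ROWS, field), column_and(ROWS, field))
```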

comparable varchar "arrays" in separate fields but on same row

I have a table that looks like this:
memberno(int)|member_mouth (varchar)|Inspected_Date (varchar)
-----------------------------------------------------------------------------
12 |'1;2;3;4;5;6;7' |'12-01-01;12-02-02;12-03-03' [7 members]
So by looking at how this table has been structured (poorly yes)
The value in the member_mouth field is a string delimited by ";".
The value in the Inspected_Date field is a string delimited by ";".
So, for each delimited value in member_mouth there is a matching inspected_date value at the same position inside the other string.
This table has about 4Mil records, we have an application written in C# that normalizes the data and stores it in a separate table. The problem now is because of the size of the table it takes a long time for this to process. (the example above is nothing compared to the actual table, it's much larger and has a couple of those string "array" fields)
My question is this: what would be the best and fastest way to normalize this data in an MSSQL proc, letting MSSQL do the work instead of a C# app?
The best way will be SQL itself. The approach in the code below worked well for me with 2-3 lakh (200,000-300,000) rows.
I am not sure how the code below behaves with 4 million rows, but it may help.
Declare @table table
(memberno int, member_mouth varchar(100),Inspected_Date varchar(400))
Insert into @table Values
(12,'1;2;3;4;5;6;7','12-01-01;12-02-02;12-03-03;12-04-04;12-05-05;12-07-07;12-08-08'),
(14,'1','12-01-01'),
(19,'1;5;8;9;10;11;19','12-01-01;12-02-02;12-03-03;12-04-04;12-07-07;12-10-10;12-12-12')
Declare @tableDest table
(memberno int, member_mouth varchar(100),Inspected_Date varchar(400))
The table will be like.
Select * from @table
See the code from here.
------------------------------------------
Declare @max_len int,
@count int = 1
Set @max_len = (Select max(Len(member_mouth) - len(Replace(member_mouth,';','')) + 1)
From @table)
While @count <= @max_len
begin
Insert into @tableDest
Select memberno,
SUBSTRING(member_mouth,1,charindex(';',member_mouth)-1),
SUBSTRING(Inspected_Date,1,charindex(';',Inspected_Date)-1)
from @table
Where charindex(';',member_mouth) > 0
union
Select memberno,
member_mouth,
Inspected_Date
from @table
Where charindex(';',member_mouth) = 0
Delete from @table
Where charindex(';',member_mouth) = 0
Update @table
Set member_mouth = SUBSTRING(member_mouth,charindex(';',member_mouth)+1,len(member_mouth)),
Inspected_Date = SUBSTRING(Inspected_Date,charindex(';',Inspected_Date)+1,len(Inspected_Date))
Where charindex(';',member_mouth) > 0
Set @count = @count + 1
End
------------------------------------------
Select *
from @tableDest
Order By memberno
------------------------------------------
------------------------------------------
Result.
You can take a reference here.
Splitting delimited values in a SQL column into multiple rows
Do it on the SQL Server side; if possible, an SSIS package would be great.
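Whichever side does the work, the core of the normalization is pairing the two delimited lists by position. A minimal Python sketch of that pairing is shown below; normalize_row is a hypothetical name, and raising an error on mismatched list lengths is an assumption about how bad rows should be handled, since the question does not say.

```python
def normalize_row(memberno, member_mouth, inspected_date, sep=";"):
    """Split both delimited strings and pair their items by position."""
    mouths = member_mouth.split(sep)
    dates = inspected_date.split(sep)
    if len(mouths) != len(dates):
        # assumption: a row whose lists differ in length is treated as invalid
        raise ValueError(f"member {memberno}: {len(mouths)} mouths, {len(dates)} dates")
    return [(memberno, m, d) for m, d in zip(mouths, dates)]


if __name__ == "__main__":
    source = [
        (14, "1", "12-01-01"),
        (19, "1;5;8", "12-01-01;12-02-02;12-03-03"),
    ]
    for row in source:
        for out in normalize_row(*row):
            print(out)
```

Each source row produces one output row per list item, which is the same shape the WHILE loop above fills into the destination table.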

Find and replace string in MySQL using data from another table

I have two MySQL tables, and I want to find and replace text strings in one using data in another.
Table texts:
+---------------------+
| messages |
+---------------------+
| 'thx guys' |
| 'i think u r great' |
| 'thx again' |
| ' u rock' |
+---------------------+
Table dictionary:
+--------------+---------------+
| bad_spelling | good_spelling |
+--------------+---------------+
| 'thx' | 'thanks' |
| ' u ' | ' you ' |
| ' r ' | ' are ' |
+--------------+---------------+
I want SQL to go through and look at every row in messages and replace every instance of bad_spelling with good_spelling, and to do this for all the pairs of bad_spelling and good_spelling.
The closest I have gotten is this:
update texts, dictionary
set texts.message = replace(texts.message,
dictionary.bad_spelling,
dictionary.good_spelling)
But this only changes "thx" to "thanks" (in two rows) and does not go on to replace " u " with " you" or " r " with " are ."
Any ideas how to make it use all the rows in dictionary in the replace statement?
PS forgot to mention that this is a small example and in the real thing I will have a lot of find/replace pairs, which may get added to over time.
I've never used MySQL, so this is just a theory based on my other database work. After reading the other answers, which try to use REPLACE(), I thought I could post this and give someone with MySQL syntax experience a few ideas for building a set-based solution.
Here is some SQL Server code that does most of the work for you:
DECLARE @Source table (Texts varchar(50))
INSERT @Source VALUES ('thx guys')
INSERT @Source VALUES ('i think u r great')
INSERT @Source VALUES ('thx again')
INSERT @Source VALUES ('u rock')
DECLARE @Dictionary table (bad_spelling varchar(50), good_spelling varchar(50))
INSERT @Dictionary VALUES ('thx', 'thanks')
INSERT @Dictionary VALUES ('u', 'you')
INSERT @Dictionary VALUES ('r', 'are')
SELECT
t.Texts,COALESCE(d.good_spelling,c.ListValue) AS WordToUse
FROM @Source t
CROSS APPLY dbo.FN_ListToTable(' ',t.Texts) c
LEFT OUTER JOIN @Dictionary d ON c.ListValue=d.bad_spelling
OUTPUT:
Texts WordToUse
------------------ ---------
thx guys thanks
thx guys guys
i think u r great i
i think u r great think
i think u r great you
i think u r great are
i think u r great great
thx again thanks
thx again again
u rock you
u rock rock
(11 row(s) affected)
It would be better to use a "real" PK than the actual "Texts" in the query above, but the OP doesn't list many columns in that table, so I use "Texts".
Using SQL Server you need to use some funky XML syntax to join the rows back together (so I won't show that code, as it doesn't matter), but using MySQL's GROUP_CONCAT() you should be able to concatenate the word rows back together into phrase rows.
The code for the (SQL Server) split function and how it works can be found here: SQL Server: Split operation
It does not go all the way because even though the replace had been run x times (where x is the number of rows in dictionary) only one update is retained (the last one).
Transactions don't write down intermediate results and therefore can't see them as input values for the next batch of replacements.
As (AFAIK) MySQL does not support recursive queries you'll have to resort to procedural approach.
You need to execute your query many times anyway. Since this is a clean-up operation, which you usually run only occasionally, I suggest you repeat the following query until nothing more is updated. I do not know how to do it with MySQL, but in SQL Server you would check the number of rows updated (which is the result of executing this UPDATE query) and run the UPDATE again until no rows are updated.
update texts,
dictionary
set texts.message = replace(texts.message, dictionary.bad_spelling, dictionary.good_spelling)
where texts.message <> replace(texts.message, dictionary.bad_spelling, dictionary.good_spelling)
You have to call Replace multiple times on the text:
Update ...
Set texts.message = Replace(
Replace(
Replace( texts.message, 'thx ', 'thanks ' )
, ' u ', ' you ')
, ' r ', ' are ')
EDIT Given that you said you had numerous replacements, you would need to do this in a cursor with multiple UPDATE statement calls. Something like (I haven't tested this at all, so beware):
Create Temporary Table ReplaceValues
(
BeforeText varchar(100) not null
, AfterText varchar(100) not null
)
Insert ReplaceValues(BeforeText, AfterText) Values('thx ', 'thanks ')
Insert ReplaceValues(BeforeText, AfterText) Values(' u ', ' you ')
Insert ReplaceValues(BeforeText, AfterText) Values(' r ', ' are ')
DECLARE done int DEFAULT 0;
DECLARE BeforeValue varchar(100);
DECLARE AfterValue varchar(100);
-- MySQL requires cursors to be declared before handlers
DECLARE ReplaceList CURSOR FOR Select BeforeText, AfterText From ReplaceValues;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
OPEN ReplaceList;
REPEAT
-- Fetch first, then update only if a row was actually returned
FETCH ReplaceList INTO BeforeValue, AfterValue;
If NOT done THEN
Update texts
Set texts.message = REPLACE(texts.message, BeforeValue, AfterValue);
END IF;
UNTIL done END REPEAT;
CLOSE ReplaceList;
You could wrap all this up into a procedure so that you can call it again later.
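The reason a loop or cursor is needed is that the multi-table UPDATE changes each row of texts only once, so only one dictionary pair gets applied per run. The logic the cursor implements is simply "apply every pair in turn to the running result", which the Python sketch below shows on the sample data from the question. The function name apply_dictionary is made up for this example, and the pairs are applied in the order listed, which is an assumption; with overlapping pairs the order could matter.

```python
def apply_dictionary(message, pairs):
    """Apply each (bad, good) replacement to the running result, in order."""
    for bad, good in pairs:
        message = message.replace(bad, good)
    return message


if __name__ == "__main__":
    dictionary = [("thx", "thanks"), (" u ", " you "), (" r ", " are ")]
    messages = ["thx guys", "i think u r great", "thx again", " u rock"]
    for m in messages:
        print(repr(apply_dictionary(m, dictionary)))
```

Because each replacement works on the output of the previous one, 'i think u r great' gets both of its corrections in a single pass over the dictionary.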