Performance issues generating a unique name - sql

I have a table 'Objects' in a SQL Server database. It contains the names (strings) of objects.
I have a list of names of new objects that need to be inserted into the 'Objects' table; they are held in a separate table, 'NewObjects'. This operation will be referred to as 'import' from here on.
For each record to be imported from 'NewObjects' into 'Objects', I need to generate a unique name if the record's name is already present in 'Objects'. The new name will be stored in the 'NewObjects' table alongside the old name.
DECLARE #NewObjects TABLE
(
...
Name varchar(20),
newName nvarchar(20)
)
I have implemented a stored procedure which generates a unique name for each record to be imported from 'NewObjects'. However, I am not happy with the performance for 1000 records in 'NewObjects'.
I would like help optimizing my code. Below is the implementation:
PROCEDURE [dbo].[importWithNewNames] #args varchar(MAX)
-- Sample of #args is like 'A,B,C,D' (a CSV string)
...
DECLARE #NewObjects TABLE
(
_index int identity PRIMARY KEY,
Name varchar(20),
newName nvarchar(20)
)
-- 'SplitString' function: a working implementation that is not a performance concern right now
INSERT INTO #NewObjects (Name)
SELECT * from SplitString(#args, ',')
declare #beg int = 1
declare #end int
DECLARE #oldName varchar(10)
-- get the count of the rows
select #end = MAX(_index) from #NewObjects
while #beg <= #end
BEGIN
select #oldName = Name from #NewObjects where #beg = _index
Declare #nameExists int = 0
-- this is our constant. We cannot change
DECLARE #MAX_NAME_WIDTH int = 5
DECLARE #counter int = 1
DECLARE #newName varchar(10)
DECLARE #z varchar(10)
select #nameExists = count(name) from Objects where name = #oldName
...
IF #nameExists > 0
BEGIN
-- create name based on pattern 'Fxxxxx'. Example: 'F00001', 'F00002'.
select #newName = 'F' + REPLACE(STR(#counter, #MAX_NAME_WIDTH, 0), ' ', '0')
while EXISTS (select top 1 1 from Objects where name = #newName)
OR EXISTS (select top 1 1 from #NewObjects where newName = #newName)
BEGIN
select #counter = #counter + 1
select #newName = 'F' + REPLACE(STR(#counter, #MAX_NAME_WIDTH, 0), ' ', '0')
END
select top 1 #z = #newName from Objects
update #NewObjects
set newName = #z where #beg = _index
END
select #beg = #beg + 1
END
-- finally, show the new names generated
select * from #NewObjects

DISCLAIMER: I am in no position to test these recommendations, so there may be syntax errors that you'll have to work out on your own as you implement them. They are here as a guide both to fix this procedure and to help you grow your skill set for future projects.
One optimization I noticed just skimming through, and one that matters more as you iterate over larger sets, is this code here:
select #nameExists = count(name) from Objects where name = #oldName
...
IF #nameExists > 0
consider changing it to this:
IF EXISTS (select name from Objects where name = #oldName)
Also, rather than doing this:
-- create name based on pattern 'Fxxxxx'. Example: 'F00001', 'F00002'.
select #newName = 'F' + REPLACE(STR(#counter, #MAX_NAME_WIDTH, 0), ' ', '0')
while EXISTS (select top 1 1 from Objects where name = #newName)
OR EXISTS (select top 1 1 from #NewObjects where newName = #newName)
BEGIN
select #counter = #counter + 1
select #newName = 'F' + REPLACE(STR(#counter, #MAX_NAME_WIDTH, 0), ' ', '0')
END
consider this:
DECLARE #maxName VARCHAR(20)
-- highest existing generated name; the LIKE pattern assumes #MAX_NAME_WIDTH = 5
SELECT #maxName = MAX(name) FROM Objects WHERE name LIKE 'F[0-9][0-9][0-9][0-9][0-9]'
IF (#maxName IS NOT NULL)
BEGIN
SET #counter = CAST(SUBSTRING(#maxName, 2, #MAX_NAME_WIDTH) AS INT) + 1
END
SET #newName = 'F' + REPLACE(STR(#counter, #MAX_NAME_WIDTH, 0), ' ', '0')
That way you are not iterating and running multiple queries just to find the maximum integer value of the generated names.
Further, based on what little context I have, you should be able to make one more optimization so that you only have to do the above once.
DECLARE #maxName VARCHAR(20)
IF (#beg = 1)
BEGIN
-- highest existing generated name; the LIKE pattern assumes #MAX_NAME_WIDTH = 5
SELECT #maxName = MAX(name) FROM Objects WHERE name LIKE 'F[0-9][0-9][0-9][0-9][0-9]'
IF (#maxName IS NOT NULL)
BEGIN
SET #counter = CAST(SUBSTRING(#maxName, 2, #MAX_NAME_WIDTH) AS INT) + 1
END
END
SET #newName = 'F' + REPLACE(STR(#counter, #MAX_NAME_WIDTH, 0), ' ', '0')
The reason you can make that optimization is that, unless other sessions are inserting records that look like yours (e.g. Fxxxxx) while this runs, you only have to find the MAX once and can then simply increment #counter on each pass through the loop.
In fact, you could pull this entire piece out of the loop. You should be able to extrapolate that pretty easily: just move the DECLARE and SET of #counter out, along with the code inside the IF (#beg = 1). But take it one step at a time.
Also, change this line:
select top 1 #z = #newName from Objects
to this:
SET #z = #newName
because you are running a query against a table just to copy one local variable into another. This is likely contributing to the performance issues. A good practice to get into: unless you're actually setting a variable from a SELECT statement, use the SET operation for local variables. There are some other places in your code where this applies; consider this line:
select #beg = #beg + 1
use this instead:
SET #beg = #beg + 1
Finally, as stated above regarding simply iterating #counter, at the end of the loop where you have this line:
select #beg = #beg + 1
just add a line:
SET #counter = #counter + 1
and you're golden!
So to recap: you can gather the maximum conflicting name just once, which gets rid of all those iterations. You're going to start using SET to get rid of wasteful lines like select top 1 #z = #newName from Objects, where you're querying a table just to copy one local variable into another. And you're going to use EXISTS instead of setting a variable from the aggregate function COUNT to do that work.
Let me know how these optimizations work.
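The recap above boils down to: scan the existing names once for the highest Fxxxxx value, then just keep counting. Here is a minimal sketch of that logic in Python rather than T-SQL, so it can be run without a database. The function name assign_new_names and the plain lists standing in for the Objects and NewObjects tables are assumptions made for illustration only.

```python
import re

MAX_NAME_WIDTH = 5  # the fixed width from the question
PATTERN = re.compile(r"F(\d{%d})" % MAX_NAME_WIDTH)

def assign_new_names(existing, incoming):
    """Return {old_name: new_name} for incoming names that clash with existing ones."""
    existing_set = set(existing)
    # One pass over the existing names to find the highest generated number.
    counter = 0
    for name in existing_set:
        m = PATTERN.fullmatch(name)
        if m:
            counter = max(counter, int(m.group(1)))
    result = {}
    for name in incoming:
        if name in existing_set:
            counter += 1  # no per-row lookup against the table is needed
            result[name] = "F" + str(counter).zfill(MAX_NAME_WIDTH)
    return result

print(assign_new_names(["A", "A1", "B", "F00001"], ["A", "B", "C", "D", "F00001"]))
```

Note that this always continues after the highest existing number, so unlike the original loop it never reuses gaps in the Fxxxxx sequence.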

You should avoid queries inside loops, especially against a table variable.
You could try using a temp table and indexing it on the newName column. I bet that would improve performance a bit.
But it would be better to rewrite it all, avoiding the loop with queries inside.
Setting up my test environment:
--this stands in for your Objects table; I fill it with some values for testing
DECLARE #Objects TABLE
(
_index int identity PRIMARY KEY,
Name varchar(20)
)
insert into #Objects(name)
values('A'),('A1'),('B'),('F00001')
--the parameter of your procedure
declare #args varchar(MAX)
set #args = 'A,B,C,D,F00001'
--#NewObjects2 is your #NewObjects, renamed because I ran your solution alongside mine when testing
DECLARE #NewObjects2 TABLE
(
_index int identity PRIMARY KEY,
Name varchar(20),
newName nvarchar(20)
)
INSERT INTO #NewObjects2 (Name)
SELECT * from SplitString(#args, ',')
declare #end int
select #end = MAX(_index) from #NewObjects2
DECLARE #MAX_NAME_WIDTH int = 5
Up to this point it is very similar to your solution.
Now, here is what I would do instead of your loop:
--generate names in the format FXXXXX, producing enough free names to give a new name to every row in #NewObjects2
--you should alter this to find the greatest FXXXXX name in Objects and start generating from that point, to avoid the overhead of creating names that are sure not to be used
with N_free as
(
select
0 as [count],
'F' + REPLACE(STR(0, #MAX_NAME_WIDTH, 0), ' ', '0') as [newName],
0 as fl_free,
0 as count_free
union all
select
N.[count] + 1 as [count],
'F' + REPLACE(STR(N.[count]+1, #MAX_NAME_WIDTH, 0), ' ', '0') as [newName],
OA.fl_free,
count_free + OA.fl_free as count_free
from
N_free N
outer apply
(select
case
when not exists(select name from #Objects
where Name = 'F' + REPLACE(STR(N.[count]+1, #MAX_NAME_WIDTH, 0), ' ', '0'))
then 1
else 0
end as fl_free) OA
where
N.count_free < #end
)
--return only those newNames that are free to be used
,newNames as (select ROW_NUMBER() over (order by [count]) as _index_name
,[newName]
from N_free where fl_free = 1
)
--update #NewObjects2, giving a new name to each row whose name is already used in Objects
update N2
set newName = V2.[newName]
from #NewObjects2 N2
inner join (select V._index,V.Name,newNames.[newName]
from( select row_number() over (partition by case when O.Name is not null
then 1
else 0
end
order by N._index) as _index_name
,N._index
,N.Name
,case when O.Name is not null
then 1
else 0
end as [fl_need_newName]
from #NewObjects2 N
left outer join #Objects O
on O.Name = N.Name
)V
left outer join newNames
on newNames._index_name = V._index_name
and V.fl_need_newName = 1
)V2
on V2._index = N2._index
option(MAXRECURSION 0)
select * from #NewObjects2
The results I got were the same as with your solution for this test setup.
You should check that it really generates the same result on your data.
The result of this query was:
_index  Name    newName
1       A       F00002
2       B       F00003
3       C       NULL
4       D       NULL
5       F00001  F00004
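The idea behind the CTE version (generate the free names, number them, and pair the k-th clashing row with the k-th free name) can be sketched outside the database as well. This is only an illustration in Python under stated assumptions: free_names and rename_clashes are hypothetical helpers, and the lists stand in for the #Objects and #NewObjects2 tables.

```python
from itertools import count

def free_names(existing, width=5):
    """Yield 'F00001', 'F00002', ... skipping names that are already in use."""
    taken = set(existing)
    for n in count(1):
        candidate = "F" + str(n).zfill(width)
        if candidate not in taken:
            yield candidate

def rename_clashes(existing, incoming):
    """Return (name, new_name) pairs; new_name is None when no rename is needed."""
    taken = set(existing)
    clashes = [name for name in incoming if name in taken]
    # Pair the k-th clashing row with the k-th free name, like the ROW_NUMBER join.
    mapping = dict(zip(clashes, free_names(existing)))
    return [(name, mapping.get(name)) for name in incoming]

for row in rename_clashes(["A", "A1", "B", "F00001"], ["A", "B", "C", "D", "F00001"]):
    print(row)
```

With the test data used above this produces the same pairs as the result table. Unlike a "continue after the maximum" approach, it also fills gaps in the existing Fxxxxx sequence.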

Related

SQL Loop through tables and columns to find which columns are NOT empty

I created a temp table #test containing 3 fields: ColumnName, TableName, and Id.
I would like to see which rows in the #test table (i.e. which columns in their respective tables) are not empty. That is, for every column name in the ColumnName field, and for the corresponding table in the TableName field, I would like to see whether the column is empty or not. I tried some things (see below) but didn't get anywhere. Help, please.
declare #LoopCounter INT = 1, #maxloopcounter int, #test varchar(100),
#test2 varchar(100), #check int
set #maxloopcounter = (select count(TableName) from #test)
while #LoopCounter <= #maxloopcounter
begin
DECLARE #PropIDs TABLE (tablename varchar(max), id int )
Insert into #PropIDs (tablename, id)
SELECT [tableName], id FROM #test
where id = #LoopCounter
set #test2 = (select columnname from #test where id = #LoopCounter)
declare #sss varchar(max)
set #sss = (select tablename from #PropIDs where id = #LoopCounter)
set #check = (select count(#test2)
from (select tablename
from #PropIDs
where id = #LoopCounter) A
)
print #test2
print #sss
print #check
set #LoopCounter = #LoopCounter + 1
end
To use variables as column names and table names in your #check query, you will need to use dynamic SQL.
There is most likely a better way to do this, but I can't think of one offhand. Here is what I would do.
Declare a cursor over the select rather than using a while loop as you have it. That way you don't have to count on sequential ids. The cursor would fetch the fields columnname, id and tablename.
In the loop, build a dynamic SQL statement:
Set #Sql = 'Insert Into #Temp2 (Cnt) Select Count(*) From ' + #tablename + ' Where ' + #columnname + ' Is Not Null And ' + #columnname + ' <> '''''
Exec(#Sql)
Then check #Temp2 for a value greater than 0, and if this is what you desire you can use the #id that was fetched to update your temp table. Create #Temp2 before the loop: a temp table created inside Exec() is dropped as soon as the dynamic batch ends, which is why the statement inserts into an existing table rather than using Select ... Into. Putting the result into a scalar variable would be preferred (sp_executesql with an OUTPUT parameter can do that), but using a temp table allows you to use an update join, so it works well in my opinion.
https://www.mssqltips.com/sqlservertip/1599/sql-server-cursor-example/
http://www.sommarskog.se/dynamic_sql.html
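To make the "build the statement as a string, then execute it" step concrete, here is a small runnable sketch. It uses Python with an in-memory SQLite database instead of SQL Server, purely so it can be executed anywhere; the table person, its columns, and the helper non_empty_columns are made up for the example.

```python
import sqlite3

def non_empty_columns(conn, pairs):
    """pairs: iterable of (table_name, column_name).
    Returns the pairs that have at least one non-NULL, non-empty value."""
    found = []
    for table, column in pairs:
        # Identifiers cannot be bound as parameters, so they are quoted and spliced in.
        t = '"' + table.replace('"', '""') + '"'
        c = '"' + column.replace('"', '""') + '"'
        sql = f"SELECT COUNT(*) FROM {t} WHERE {c} IS NOT NULL AND {c} <> ''"
        (cnt,) = conn.execute(sql).fetchone()
        if cnt > 0:
            found.append((table, column))
    return found

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, nickname TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [("Ann", None), ("Bob", "")])
print(non_empty_columns(conn, [("person", "name"), ("person", "nickname")]))
# [('person', 'name')]
```

The same shape carries over to T-SQL: one generated COUNT(*) statement per (table, column) pair, with the identifiers quoted before they are concatenated into the statement.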
Found a way to extract all non-empty tables from the schema, then just joined with the initial temp table that I had created.
select A.tablename, B.[row_count]
from (select * from #test) A
left join
(SELECT r.table_name, r.row_count, r.[object_id]
FROM sys.tables t
INNER JOIN (
SELECT OBJECT_NAME(s.[object_id]) table_name, SUM(s.row_count) row_count, s.[object_id]
FROM sys.dm_db_partition_stats s
WHERE s.index_id in (0,1)
GROUP BY s.[object_id]
) r on t.[object_id] = r.[object_id]
WHERE r.row_count > 0 ) B
on A.[TableName] = B.[table_name]
WHERE ROW_COUNT > 0
order by b.row_count desc
How about this one: a computed bitmask column that checks for NULLs. Each bit of the value tells you whether the corresponding column is non-NULL (1 for ID, 2 for val, 4 for valstr), so you read it in base 2.
CREATE TABLE FindNullComputedMask
(ID int
,val int
,valstr varchar(3)
,NotEmpty as
CASE WHEN ID IS NULL THEN 0 ELSE 1 END
|
CASE WHEN val IS NULL THEN 0 ELSE 2 END
|
CASE WHEN valstr IS NULL THEN 0 ELSE 4 END
)
INSERT FindNullComputedMask
SELECT 1,1,NULL
INSERT FindNullComputedMask
SELECT NULL,2,NULL
INSERT FindNullComputedMask
SELECT 2,NULL, NULL
INSERT FindNullComputedMask
SELECT 3,3,3
SELECT *
FROM FindNullComputedMask
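Reading the mask back is just a matter of testing bits. A short Python sketch of the encoding and decoding, using the four rows inserted above; the helper names not_empty_mask and decode are assumptions for illustration.

```python
COLUMNS = ["ID", "val", "valstr"]  # bit 0, bit 1, bit 2, as in the CASE expressions

def not_empty_mask(row):
    """Build the same bitmask as the computed column: one bit per non-NULL value."""
    mask = 0
    for bit, value in enumerate(row):
        if value is not None:
            mask |= 1 << bit
    return mask

def decode(mask):
    """List the columns whose bit is set, i.e. the non-NULL ones."""
    return [name for bit, name in enumerate(COLUMNS) if mask & (1 << bit)]

rows = [(1, 1, None), (None, 2, None), (2, None, None), (3, 3, "3")]
for row in rows:
    mask = not_empty_mask(row)
    print(row, mask, decode(mask))
```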

Change characters but keep length

I am migrating sensitive data to a database, and I need to hide details of the text. We would like to keep the volume and length of the text, but change the meaning.
For example:
"James has been well received, and should be helped when ever he finds it hard to speak"
should change to:
"jhdfy dfw aslk dfe kjdfkjd, kjf kjdsf df iotryy erhr lsdj jf ytwe it kjdf tr kjsdd"
Is there a way to update all rows, setting the text column to this kind of random text? I really only want to change the letters (a-z, A-Z) and keep the rest.
One option is to use a bunch of nested replaces . . . but that would probably hit on the maximum number of nested functions.
You could write a painful query using outer apply:
select
from t outer apply
(select replace(t.col, 'a', 'z') as col1) v1 outer apply
(select replace(v1.col1, 'b', 'y') as col2) v2 outer apply
. . .
However, you might want to write your own function. In other databases, this is called translate() (after the Unix command); SQL Server 2017 and later also have a built-in TRANSLATE(). If you search for SQL Server translate, I think you'll find examples on the web.
Another way is to split the string character by character, replace each letter with a random one, and then concatenate the characters back together to get the desired output:
DECLARE #str VARCHAR(MAX) = 'James has been well received, and should be helped when ever he finds it hard to speak'
;WITH Cte(orig, random) AS(
SELECT
SUBSTRING(t.a, v.number + 1, 1),
CASE
WHEN SUBSTRING(t.a, v.number + 1, 1) LIKE '[a-z]'
THEN CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)
ELSE SUBSTRING(t.a, v.number + 1, 1)
END
FROM (SELECT #str) t(a)
CROSS JOIN master..spt_values v
WHERE
v.number < LEN(t.a)
AND v.type = 'P'
)
SELECT
OriginalString = #str,
RandomString = (
SELECT '' + random
FROM Cte FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'
)
OK this is possible using a user defined function (UDF) and a view.
SQL Server does not allow random number generation in a UDF but does allow it in a view. Ref: http://blog.sqlauthority.com/2012/11/20/sql-server-using-rand-in-user-defined-functions-udf/
So here is the solution
CREATE VIEW [dbo].[rndView]
AS
SELECT RAND() rndResult
GO
CREATE FUNCTION [dbo].[RandFn]()
RETURNS float
AS
BEGIN
DECLARE #rndValue float
SELECT #rndValue = rndResult
FROM rndView
RETURN #rndValue
END
GO
CREATE FUNCTION [dbo].[randomstring] ( #stringToParse VARCHAR(MAX))
RETURNS
varchar(max)
AS
BEGIN
/*
A = 65
Z = 90
a = 97
z = 122
declare #stringToParse VARCHAR(MAX) = 'James has been well received, and should be helped when ever he finds it hard to speak'
Select [dbo].[randomstring] ( #stringToParse )
go
Update SpecialTable
Set SpecialString = [dbo].[randomstring] (SpecialString)
go
*/
declare #StringToreturn varchar(max) = ''
declare #charCounter int = 1
declare #len int = len(#stringToParse)
declare #thisRand int
declare #UpperA int = 65
declare #UpperZ int = 90
declare #LowerA int = 97
declare #LowerZ int = 122
declare #thisChar char(1)
declare #Random_Number float
declare #randomChar char(1)
WHILE #charCounter <= #len
BEGIN
SELECT #thisChar = SUBSTRING(#stringToParse, #charCounter, 1)
set #randomChar = #thisChar
--print #randomChar
SELECT #Random_Number = dbo.RandFn()
--print #Random_Number
--only swap if a-z or A-Z
if ASCII(#thisChar) >= #UpperA and ASCII(#thisChar) <= #UpperZ begin
--upper case: the +1 makes the last letter reachable after truncation to int
set #thisRand = #UpperA + (#Random_Number * convert(float, (#UpperZ-#UpperA+1)))
set #randomChar = CHAR(#thisRand)
--print #thisRand
end
if ASCII(#thisChar) >= #LowerA and ASCII(#thisChar) <= #LowerZ begin
--lower case
set #thisRand = #LowerA + (#Random_Number * convert(float, (#LowerZ-#LowerA+1)))
set #randomChar = CHAR(#thisRand)
end
--print #thisRand
--print #randomChar
set #StringToreturn = #StringToreturn + #randomChar
SET #charCounter = #charCounter + 1
END
--Select * from #returnList
return #StringToreturn
END
GO
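For comparison, the character-by-character replacement that both answers implement is easy to state outside SQL. The following Python sketch shows the intended behaviour (letters replaced by random letters of the same case, everything else kept, length unchanged); the function name scramble is an assumption, and this is not a drop-in replacement for the T-SQL function.

```python
import random
import string

def scramble(text, rng=random):
    """Replace each ASCII letter with a random letter of the same case."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(rng.choice(string.ascii_lowercase))
        elif "A" <= ch <= "Z":
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(ch)  # digits, spaces and punctuation are kept as they are
    return "".join(out)

original = "James has been well received, and should be helped when ever he finds it hard to speak"
print(scramble(original))
```

Because every letter position is drawn independently, word lengths and punctuation survive, which is the property the question asks for.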

Auto increment Alphanumeric ID in MSSQL

I have an existing stored procedure that generates employee IDs. The employee ID has the format EPXXXX: EP followed by 4 digits. I want to shorten my stored procedure.
Given the table (tblEmployee) above, below is the stored procedure for inserting a new employee with a new employee number. The process is: get the last employee ID, take its last 4 digits (the number), convert them to an integer, add 1, check whether the number is less than 10, 100 or 1000, or greater than or equal to 1000, and add the prefix before inserting the new record into the table.
create procedure NewEmployee
#EmployeeName VARCHAR(50)
AS
BEGIN
SET NOCOUNT ON
DECLARE #lastEmpID as VARCHAR(6)
SET #lastEmpID =
(
SELECT TOP 1 Employee_ID
FROM tblEmployee
ORDER BY Employee_ID DESC
)
DECLARE #empID as VARCHAR(4)
SET #empID =
(
SELECT RIGHT(#lastEmpID, 4)
)
DECLARE #numEmpID as INT
#numEmpID =
(
SELECT CONVERT(INT, #empID) + 1
)
DECLARE #NewEmployeeID as VARCHAR(6)
IF #numEmp < 10
SET #NewEmployee = SELECT 'EP000' + CONVERT(#EmpID)
IF #numEmp < 100
SET #NewEmployee = SELECT 'EP00' + CONVERT(#EmpID)
IF #numEmp < 1000
SET #NewEmployee = SELECT 'EP0' + CONVERT(#EmpID)
IF #numEmp >= 1000
SET #NewEmployee = SELECT 'EP' + CONVERT(#EmpID)
INSERT INTO tblEmployee(Employee_ID, Name)
VALUES (#NewEmployeeID, #EmployeeName)
END
Try this one -
CREATE PROCEDURE dbo.NewEmployee
#EmployeeName VARCHAR(50)
AS BEGIN
SET NOCOUNT ON;
INSERT INTO dbo.tblEmployee(Employee_ID, Name)
SELECT
'EP' + RIGHT('0000' + CAST(Employee_ID + 1 AS VARCHAR(4)), 4)
, #EmployeeName
FROM (
SELECT TOP 1 Employee_ID = CAST(RIGHT(Employee_ID, 4) AS INT)
FROM dbo.tblEmployee
ORDER BY Employee_ID DESC
) t
END
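The core of this answer is "strip the prefix, add one, left-pad with zeros". A tiny Python sketch of just that step, so the padding rule is easy to check by hand; next_employee_id is a hypothetical helper, not part of the procedure above.

```python
def next_employee_id(last_id, prefix="EP", width=4):
    """Return the ID after last_id; last_id may be None when the table is empty."""
    number = int(last_id[len(prefix):]) + 1 if last_id else 1
    if number >= 10 ** width:
        raise ValueError("ID range exhausted")
    return prefix + str(number).zfill(width)

print(next_employee_id("EP0009"))  # EP0010
print(next_employee_id(None))      # EP0001
```

Like the SQL it mirrors, this reads the last ID and then writes a new one, so two concurrent callers could compute the same value unless the read and the insert are serialized.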
I'm not suggesting this over what you currently have, but this is how I'd do it, and how I've implemented it in my application. It is fully dynamic and works for every kind of transaction you might have.
I have a table which holds the document number format:
CREATE TABLE INV_DOC_FORMAT(
DOC_CODE VARCHAR(10),
DOC_NAME VARCHAR(100),
PREFIX VARCHAR(10),
SUFFIX VARCHAR(10),
[LENGTH] INT,
[CURRENT] INT
)
It would hold data like:
INSERT INTO INV_DOC_FORMAT(DOC_CODE,DOC_NAME,PREFIX,SUFFIX,[LENGTH],[CURRENT])
VALUES('01','INV_UNIT','U','',5,0)
INSERT INTO INV_DOC_FORMAT(DOC_CODE,DOC_NAME,PREFIX,SUFFIX,[LENGTH],[CURRENT])
VALUES('02','INV_UNIT_GROUP','UG','',5,0)
Then I'd have a function or a procedure; here I have a function which generates the document number.
CREATE FUNCTION GET_DOC_FORMAT(#DOC_CODE VARCHAR(100))RETURNS VARCHAR(100)
AS
BEGIN
DECLARE #PRE VARCHAR(10)
DECLARE #SUF VARCHAR(10)
DECLARE #LENTH INT
DECLARE #CURRENT INT
DECLARE #FORMAT VARCHAR(100)
DECLARE #REPEAT VARCHAR(10)
IF NOT EXISTS(SELECT DOC_CODE FROM INV_DOC_FORMAT WHERE DOC_CODE=#DOC_CODE)
RETURN ''
SELECT #PRE= PREFIX FROM INV_DOC_FORMAT WHERE DOC_CODE=#DOC_CODE
SELECT #SUF= SUFFIX FROM INV_DOC_FORMAT WHERE DOC_CODE=#DOC_CODE
SELECT #LENTH= [LENGTH] FROM INV_DOC_FORMAT WHERE DOC_CODE=#DOC_CODE
SELECT #CURRENT= [CURRENT] FROM INV_DOC_FORMAT WHERE DOC_CODE=#DOC_CODE
SET #REPEAT=REPLICATE('0',(#LENTH-LEN(CONVERT(VARCHAR, #CURRENT+1))))
SET #FORMAT=#PRE + #REPEAT +CONVERT(VARCHAR, #CURRENT+1) + #SUF
RETURN #FORMAT
END
You can use the function like this:
INSERT INTO INV_UNIT(UNIT_CODE,UNIT_NAME,UNIT_ALIAS,APPROVED,APPROVED_USER_ID,APPROVED_DATE)
VALUES(DBO.GET_DOC_FORMAT('01'),#Unit_Name,#Unit_Alias,#APPROVED,#APPROVED_USER_ID,#APPROVED_DATE)
--After the transaction completes successfully, increment the counter
UPDATE INV_DOC_FORMAT SET [CURRENT]=[CURRENT]+1 WHERE DOC_CODE='01'
Or you can create a single procedure which handles all of this on its own.
Hope you get the idea.
Now, looking at your approach, you are making a mistake.
You are getting
SET #lastEmpID =
(
SELECT TOP 1 Employee_ID
FROM tblEmployee
ORDER BY Employee_ID DESC
)
the last employee ID, and then you are building the new ID from it. This can reuse an ID that was generated earlier but has since been deleted.
Suppose EP0010 was the most recent ID and that employee was later deleted. The next time you create an employee, they get the same ID that previously belonged to someone who no longer exists. I don't think that's a good idea.
And instead of this:
DECLARE #NewEmployeeID as VARCHAR(6)
IF #numEmp < 10
SET #NewEmployee = SELECT 'EP000' + CONVERT(#EmpID)
IF #numEmp < 100
SET #NewEmployee = SELECT 'EP00' + CONVERT(#EmpID)
IF #numEmp < 1000
SET #NewEmployee = SELECT 'EP0' + CONVERT(#EmpID)
IF #numEmp >= 1000
SET #NewEmployee = SELECT 'EP' + CONVERT(#EmpID)
which you use to pad with zeros, you could use the REPLICATE() function of SQL, as in my example above:
SET #REPEAT=REPLICATE('0',(#LENTH-LEN(CONVERT(VARCHAR, #CURRENT+1))))
I don't think you need a stored procedure. Try using ranking functions:
select
'EP'+RIGHT('000000'+ CAST(ROW_NUMBER() OVER (ORDER BY Name) AS VARCHAR(6)), 4)
AS [emp_code]
,
Name
FROM emp1 WITH(NOLOCK)
EDIT
select
'EP'+RIGHT('000000'+ CAST((ROW_NUMBER() OVER (ORDER BY Name)+10) AS VARCHAR(6)), 4)
AS [emp_code] --^Add the last Emp no.
,
Name
FROM emp1 WITH(NOLOCK)
The accepted answer works fine, but it does not work when there is no previous value (NULL), for example on an empty table. So I modified it as below; hope this helps others as well.
CREATE PROCEDURE dbo.NewEmployee
#EmployeeName VARCHAR(50)
AS BEGIN
SET NOCOUNT ON;
INSERT INTO dbo.tblEmployee(Employee_ID, Name)
SELECT
-- MAX over no rows (or only NULLs) yields NULL; ISNULL turns that into 0 so numbering starts at EP0001
'EP' + RIGHT('0000' + CAST(ISNULL(MAX(CAST(RIGHT(Employee_ID, 4) AS INT)), 0) + 1 AS VARCHAR(4)), 4)
, #EmployeeName
FROM dbo.tblEmployee
END

Finding Uppercase Character then Adding Space

I bought a SQL world city/state database. In the state table, the state names are run together, for example "NorthCarolina" or "SouthCarolina".
Is there a way in SQL to loop through a name, find the uppercase characters, and add a space,
so that "NorthCarolina" becomes "North Carolina"?
Create this function
if object_id('dbo.SpaceBeforeCaps') is not null
drop function dbo.SpaceBeforeCaps
GO
create function dbo.SpaceBeforeCaps(#s varchar(100)) returns varchar(100)
as
begin
declare #return varchar(100);
set #return = left(#s,1);
declare #i int;
set #i = 2;
while #i <= len(#s)
begin
if ASCII(substring(#s,#i,1)) between ASCII('A') and ASCII('Z')
set #return = #return + ' ' + substring(#s,#i,1)
else
set #return = #return + substring(#s,#i,1)
set #i = #i + 1;
end;
return #return;
end;
GO
Then you can use it to update your database
update tbl set statename = dbo.SpaceBeforeCaps(statename);
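The loop inside the function is simple enough to check by hand in any language. Here is the same logic as a short Python sketch (keep the first character, then put a space in front of every later A-Z character); space_before_caps is just an illustrative name.

```python
def space_before_caps(s):
    """Insert a space before every uppercase A-Z character except the first character."""
    if not s:
        return s
    out = [s[0]]
    for ch in s[1:]:
        if "A" <= ch <= "Z":
            out.append(" ")
        out.append(ch)
    return "".join(out)

for name in ["NorthCarolina", "SouthCarolina", "NewSouthWales", "Ohio"]:
    print(space_before_caps(name))
```

Running the update a second time would add nothing for names already spaced only if no capital follows a space; with this rule "North Carolina" would become "North  Carolina", so apply it once to the raw data.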
There are a couple of ways to approach this:
Construct a function using a pattern and PATINDEX.
Chain minimal REPLACE calls for each case (e.g. REPLACE(state_name, 'hC', 'h C') for your example). This is kind of a hack, but it might actually give you the best performance, since you have such a small set of replacements.
If you absolutely cannot create functions and need this as a one-off, you can use a recursive CTE to break the string up (and add the space at the same time where required), then recombine the characters using FOR XML. Elaborate example below:
-- some sample data
create table #tmp (id int identity primary key, statename varchar(100));
insert #tmp select 'NorthCarolina';
insert #tmp select 'SouthCarolina';
insert #tmp select 'NewSouthWales';
-- the complex query updating the "statename" column in the "#tmp" table
;with cte(id,seq,char,rest) as (
select id,1,cast(left(statename,1) as varchar(2)), stuff(statename,1,1,'')
from #tmp
union all
select id,seq+1,case when ascii(left(rest,1)) between ascii('A') and ascii('Z')
then ' ' else '' end + left(rest,1)
, stuff(rest,1,1,'')
from cte
where rest > ''
), recombined as (
select a.id, (select b.char+''
from cte b
where a.id = b.id
order by b.seq
for xml path, type).value('/','varchar(100)') fixed
from cte a
group by a.id
)
update t
set statename = c.fixed
from #tmp t
join recombined c on c.id = t.id
where statename != c.fixed;
-- check the result
select * from #tmp
----------- -----------
id statename
----------- -----------
1 North Carolina
2 South Carolina
3 New South Wales

Update multiple columns by loop?

I have a select statement which I want to convert into an update statement for all the columns in the table which have the name Variable[N].
For example, I want to do these things:
I want to be able to convert the SQL below into an update statement.
I have n columns with the name variable[N]. The example below only updates column variable63, but I want to dynamically run the update on all columns with names variable1 through variableN without knowing how many variable[N] columns I have in advance. Also, in the example below I get the updated result into NewCol. I actually want to update the respective variable column with the results if possible, variable63 in my example below.
I want to have a wrapper that loops over column variable1 through variableN and perform the same respective update operation on all those columns:
SELECT
projectid
,documentid
,revisionno
,configurationid
,variable63
,ISNULL(Variable63,
(SELECT TOP 1
variable63
FROM table1
WHERE
documentid = t.documentid
and projectid=t.projectid
and configurationid=t.configurationid
and cast(revisionno as int) < cast(t.revisionno as int)
AND Variable63 is NOT NULL
ORDER BY
projectid desc
,documentid desc
,revisionno desc
,configurationid desc
)) as NewCol
FROM table1 t;
There's no general way to loop through columns in SQL; you're supposed to know exactly what you want to modify. In some databases it is possible to query system tables to dynamically build an update statement (I know how to do that in InterBase and its successor Firebird), but you haven't told us which database engine you're using.
Below is a way you could update several fields that are null. COALESCE and CASE are two ways of doing the same thing, as are LEFT JOIN and NOT EXISTS. Use the ones you and your database engine are most comfortable with. Beware that all records will be updated, so this is not a good solution if your database contains millions of records, each record is large, and you want this query to be executed many times.
UPDATE table1 t
SET t.VARIABLE63 =
COALESCE(t.VARIABLE63,
(SELECT t0.VARIABLE63
FROM table1 t0
LEFT JOIN table1 tNot
ON tNot.documentid = t.documentid
AND tNot.projectid=t.projectid
AND tNot.configurationid=t.configurationid
AND cast(tNot.revisionno as int) > cast(t0.revisionno as int)
AND cast(tNot.revisionno as int) < cast(t.revisionno as int)
AND tNot.Variable63 is NOT NULL
WHERE t0.documentid = t.documentid
AND t0.projectid=t.projectid
AND t0.configurationid=t.configurationid
AND cast(t0.revisionno as int) < cast(t.revisionno as int)
AND t0.Variable63 is NOT NULL
AND tNot.Variable63 is NULL)),
t.VARIABLE64 = CASE WHEN t.VARIABLE64 IS NOT NULL then t.VARIABLE64
ELSE (SELECT VARIABLE64
FROM table1 t0
WHERE t0.documentid = t.documentid
AND t0.projectid=t.projectid
AND t0.configurationid=t.configurationid
AND cast(t0.revisionno as int) < cast(t.revisionno as int)
AND t0.Variable64 is NOT NULL
AND NOT EXISTS(SELECT 1
FROM table1 tNot
WHERE tNot.documentid = t.documentid
AND tNot.projectid=t.projectid
AND tNot.configurationid=t.configurationid
AND cast(tNot.revisionno as int) > cast(t0.revisionno as int)
AND cast(tNot.revisionno as int) < cast(t.revisionno as int)
AND tNot.Variable64 is NOT NULL)) END
OK, I think I got it. A script that loops through the columns with a cursor and runs an update command per column:
DECLARE #sql NVARCHAR(1000),
#cn NVARCHAR(1000)--,
--#r NVARCHAR(1000),
--#start INT
DECLARE col_names CURSOR FOR
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'PIVOT_TABLE'
ORDER BY ordinal_position
--SET #start = 0
DECLARE #op VARCHAR(max)
SET #op=''
OPEN col_names FETCH next FROM col_names INTO #cn
WHILE ##FETCH_STATUS = 0
BEGIN
--print #cn
IF UPPER(#cn)<> 'DOCUMENTID' and UPPER(#cn)<> 'CONFIGURATIONID' and UPPER(#cn)<> 'PROJECTID' and UPPER(#cn)<> 'REVISIONNO'
BEGIN
SET #sql = 'UPdate pt
set pt.' + #cn + ' = ((SELECT TOP 1 t.' + #cn + ' FROM pivot_table t WHERE t.documentid = pt.documentid and t.projectid=pt.projectid
and t.configurationid=pt.configurationid and cast(t.revisionno as int) < cast(pt.revisionno as int) AND t.' + #cn + ' is NOT NULL
ORDER BY cast(t.revisionno as int) desc)) from PIVOT_TABLE pt where pt.' + #cn + ' is NULL;'
EXEC Sp_executesql
#sql
--print #cn
END
FETCH next FROM col_names INTO #cn
END
CLOSE col_names
DEALLOCATE col_names;
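What each generated UPDATE does is fill a NULL with the nearest earlier non-NULL value for the same document, ordered by revision number as an integer. As a cross-check of that rule, here is a Python sketch that applies it to every non-key column at once; fill_from_previous and the sample rows are assumptions for illustration and do not touch any database.

```python
def fill_from_previous(rows, key_cols, rev_col):
    """rows: list of dicts. Fill None values from the nearest earlier revision
    that has a non-None value, within the same key."""
    fixed = set(key_cols) | {rev_col}
    last_seen = {}  # (key, column) -> last non-None original value
    out = []
    ordered = sorted(rows, key=lambda r: (tuple(r[k] for k in key_cols), int(r[rev_col])))
    for row in ordered:
        key = tuple(row[k] for k in key_cols)
        new_row = dict(row)
        for col, value in row.items():
            if col in fixed:
                continue
            if value is None:
                new_row[col] = last_seen.get((key, col))
            else:
                last_seen[(key, col)] = value
        out.append(new_row)
    return out

rows = [
    {"documentid": 1, "revisionno": "1", "variable1": "a", "variable2": None},
    {"documentid": 1, "revisionno": "2", "variable1": None, "variable2": "x"},
    {"documentid": 1, "revisionno": "10", "variable1": None, "variable2": None},
    {"documentid": 2, "revisionno": "1", "variable1": None, "variable2": None},
]
for r in fill_from_previous(rows, ["documentid"], "revisionno"):
    print(r)
```

The sample includes revision "10" on purpose: sorted as text it would come before "2", which is why the revision number is converted to an integer before ordering.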