I have two tables with the following (dummy) structure:
Table 1
idText sText(nvarchar(500))
1 Text with some keywords
2 Text2 with one keyword
3 Text3 with three keywords
Table 2
idText idKey sKeyword
1 1 some
1 2 keywords
2 3 one
3 4 with
3 2 keywords
3 5 three
Is there any way to execute a nested replace among all the related keywords from Table2?
There are some solutions around, like creating a function, but I do not think that is a good solution because it is not going to be reused anywhere else. I also tried a recursive CTE, but without success.
The result must be something like this:
Table 1
idText sText(nvarchar(500))
1 Text with Replaced_some Replaced_keywords
2 Text2 with Replaced_one keyword
3 Text3 Replaced_with Replaced_three Replaced_keywords
PS.
The replacement string is fixed, so you can use whatever string you prefer. The REPLACE expression would be something like this: replace(sText, sKeyword, 'Replaced_' + sKeyword)
IdKey is useless in this case; however, it is part of our real DB structure.
This is my failed attempt using a recursive CTE:
DECLARE @Table1 TABLE( ID int, sText nvarchar(200))
DECLARE @Table2 TABLE( ID int, sKeyword nvarchar(10))
INSERT INTO @Table1 VALUES(1, 'Text with some keywords')
INSERT INTO @Table1 VALUES(2, 'Text2 with one keyword')
INSERT INTO @Table1 VALUES(3, 'Text3 with three keywords')
INSERT INTO @Table2 VALUES(1, 'some')
INSERT INTO @Table2 VALUES(1, 'keywords')
INSERT INTO @Table2 VALUES(2, 'one')
INSERT INTO @Table2 VALUES(3, 'with')
INSERT INTO @Table2 VALUES(3, 'keywords')
INSERT INTO @Table2 VALUES(3, 'three')
;WITH CTE AS(
SELECT ID, sText FROM @Table1
UNION ALL
SELECT c.ID, CAST(REPLACE(sText, sKeyword, 'New_' + sKeyword) AS nvarchar(200)) FROM CTE c
INNER JOIN @Table2 t2 ON t2.ID = c.ID
)
SELECT * FROM CTE
The result is an infinite loop; it does not stop.
Any help will be appreciated.
Disclaimer: The function has been slimmed down as promised; I will update the description below accordingly in due time.
Per my current understanding of your problem, I think I can apply a function I designed for a more complex problem I had recently. There may be other solutions, and others will most likely propose them, so let me offer something a little less common.
Be advised, though, that it was meant to address something more complex than your case (explained later), and right now I sadly don't have time to slim it down, but I'll probably get to that tomorrow. I hope the comments help. Regardless, I'll summarize my function's objective for you:
There's a table that contains the messages to find and what to replace them with. The function receives a text value as input, uses a cursor to loop over that table, and for each record checks whether the input text contains something to replace, replacing it if applicable.
Two things to note about the original objective. First, the original used a nested loop to handle a keyword that occurs multiple times and therefore requires multiple replacements. Second, I also had to deal with wildcards, variable lengths, and whether or not the replacement flag was set in that table. These two things, plus others, are probably the reason you'll find lots of odd material flying around.
CREATE FUNCTION [JACKINABOX](@TextToUpdate nvarchar(500), @FilterId int)
RETURNS nvarchar(500) AS -- Sized to match sText in the question; the replaced text is longer than the input.
BEGIN
DECLARE @Keyword nvarchar(30)
DECLARE LonelyCursor CURSOR FOR
SELECT Keyword FROM ReplacementInformation WHERE Id = @FilterId
OPEN LonelyCursor ; FETCH NEXT FROM LonelyCursor INTO @Keyword
WHILE @@FETCH_STATUS = 0 -- While keywords remain to be processed.
BEGIN
-- REPLACE already rewrites every occurrence of the keyword, so no inner loop is needed.
-- Looping until CHARINDEX returns 0 would never end here, because 'Replaced_' + keyword still contains the keyword.
SET @TextToUpdate = REPLACE(@TextToUpdate, @Keyword, CONCAT('Replaced_', @Keyword))
FETCH NEXT FROM LonelyCursor INTO @Keyword
END
CLOSE LonelyCursor ; DEALLOCATE LonelyCursor
RETURN @TextToUpdate
END
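For comparison, the recursive CTE from the question can be made to terminate without a function. The attempt in the question joins every row to all of its keywords at every recursion level, so each level produces new rows and nothing ever stops it. A minimal sketch, assuming the table variables from the question: number the keywords per text and let each level apply exactly one of them. The Keywords CTE and the rn and step columns are names introduced here for illustration only.

```sql
DECLARE @Table1 TABLE (ID int, sText nvarchar(200));
DECLARE @Table2 TABLE (ID int, sKeyword nvarchar(10));

INSERT INTO @Table1 VALUES (1, 'Text with some keywords');
INSERT INTO @Table1 VALUES (2, 'Text2 with one keyword');
INSERT INTO @Table1 VALUES (3, 'Text3 with three keywords');
INSERT INTO @Table2 VALUES (1, 'some');
INSERT INTO @Table2 VALUES (1, 'keywords');
INSERT INTO @Table2 VALUES (2, 'one');
INSERT INTO @Table2 VALUES (3, 'with');
INSERT INTO @Table2 VALUES (3, 'keywords');
INSERT INTO @Table2 VALUES (3, 'three');

WITH Keywords AS (
    -- Give each keyword a position 1..n within its text.
    SELECT ID, sKeyword,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY sKeyword) AS rn
    FROM @Table2
),
CTE AS (
    -- Anchor: the untouched text, with zero keywords applied.
    SELECT t1.ID, t1.sText, 0 AS step
    FROM @Table1 AS t1
    UNION ALL
    -- Recursive step: apply only keyword number step + 1, so the
    -- recursion ends when a text runs out of keywords.
    SELECT c.ID,
           CAST(REPLACE(c.sText, k.sKeyword, 'Replaced_' + k.sKeyword) AS nvarchar(200)),
           c.step + 1
    FROM CTE AS c
    INNER JOIN Keywords AS k
            ON k.ID = c.ID
           AND k.rn = c.step + 1
)
-- Keep only the row where every keyword of that text has been applied.
SELECT c.ID, c.sText
FROM CTE AS c
WHERE c.step = (SELECT COUNT(*) FROM @Table2 AS t2 WHERE t2.ID = c.ID)
ORDER BY c.ID;
```

With the sample data this should return the three rows requested in the question. Two caveats: a keyword that is contained in another keyword (or in 'Replaced_') gets rewritten more than once, and the default MAXRECURSION limit of 100 caps the number of keywords per text unless OPTION (MAXRECURSION n) is added.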
I'm new to SQL so please forgive me if I use incorrect terminology and my question sounds confused.
I've been tasked with writing a stored procedure which will be sent 3 variables as strings (varchar I think). I need to take two of the variables and remove text from the end of the variable and only from the end.
The strings/text I need to remove from the end of the variables are
co
corp
corporation
company
lp
llc
ltd
limited
For example this string
Global Widgets LLC
would become
Global Widgets
However it should only apply once so
Global Widgets Corporation LLC
Should become
Global Widgets Corporation
I then need to use the altered variables to do a SQL query.
This is to be used as a backup for an integration piece we have which makes a callout to another system. The other system takes the same variables and uses Regex to remove the strings from the end of variables.
I've tried different combinations of PATINDEX, SUBSTRING, REPLACE, STUFF but cannot seem to come up with something that will do the job.
===============================================================
Edit: I want to thank everyone for the answers provided so far, but I left out some information that I didn't think was important but judging by the answers seems like it would affect the processing.
My proc will start something like
ALTER PROC [dbo].[USP_MyDatabaseTable] @variableToBeAltered nvarchar(50)
AS
I will then need to remove all , and . characters. I've already figured out how to do this. I will then need to do the processing on @variableToBeAltered (technically there will be two variables) to remove the strings I listed previously. I must then remove all spaces from @variableToBeAltered. (Again, I figured that part out.) Then finally I will use @variableToBeAltered in my SQL query, something like
SELECT [field1] AS myField
,[field2] AS myOtherField
FROM [MyData].[dbo].[MyDatabaseTable]
WHERE [field1] = (@variableToBeAltered);
I hope this information is more useful.
I'd keep all of your suffixes in a table to make this a little easier. You can then perform code like this either within a query or against a variable.
DECLARE @company_name VARCHAR(50) = 'Global Widgets Corporation LLC'
DECLARE @Suffixes TABLE (suffix VARCHAR(20))
INSERT INTO @Suffixes (suffix) VALUES ('LLC'), ('CO'), ('CORP'), ('CORPORATION'), ('COMPANY'), ('LP'), ('LTD'), ('LIMITED')
SELECT @company_name = SUBSTRING(@company_name, 1, LEN(@company_name) - LEN(suffix))
FROM @Suffixes
WHERE @company_name LIKE '%' + suffix
SELECT @company_name
The keys here are that you are only matching with strings that end in the suffix and it uses SUBSTRING rather than REPLACE to avoid accidentally removing copies of any of the suffixes from the middle of the string.
The @Suffixes table is a table variable here, but it makes more sense for you to just create it and fill it as a permanent table.
The query will just find the one row (if any) whose suffix matches the end of your string. If a match is found, the variable is set to a substring with the length of the suffix removed from the end. There will usually be a trailing space left behind; LEN and equality comparisons ignore trailing spaces, but wrap the result in RTRIM if you need it gone.
There are still a couple of potential issues to be aware of though...
First, if you have a company name like "Watco" then the "co" would be a false positive here. I'm not sure what can be done about that other than maybe making your suffixes include a leading space.
Second, if one suffix ends with one of your other suffixes then the ordering that they get applied could be a problem. You could get around this by only applying the row with the greatest length for suffix, but it gets a little more complicated, so I've left that out for now.
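Both issues can be handled in the same query. The following is only a sketch of the two ideas mentioned above, not a tested production version: the LIKE pattern requires a space before the suffix (so "Watco" is left alone), and TOP (1) with ORDER BY LEN(suffix) DESC applies only the longest matching suffix.

```sql
DECLARE @company_name varchar(50) = 'Global Widgets Corporation LLC';
DECLARE @Suffixes TABLE (suffix varchar(20));

INSERT INTO @Suffixes (suffix)
VALUES ('LLC'), ('CO'), ('CORP'), ('CORPORATION'),
       ('COMPANY'), ('LP'), ('LTD'), ('LIMITED');

-- Strip one suffix only: the longest one that ends the name and is
-- preceded by a space. The extra -1 removes that space as well.
SELECT TOP (1)
       @company_name = LEFT(@company_name,
                            LEN(@company_name) - LEN(suffix) - 1)
FROM @Suffixes
WHERE @company_name LIKE '% ' + suffix
ORDER BY LEN(suffix) DESC;

SELECT @company_name;  -- Global Widgets Corporation
```

If no suffix matches, the SELECT assigns nothing and the variable keeps its original value. Whether 'llc' also matches 'LLC' depends on the collation in use.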
Building on the answer given by Tom H, but applying across the entire table:
set nocount on;
declare @suffixes table(tag nvarchar(20));
insert into @suffixes values('co');
insert into @suffixes values('corp');
insert into @suffixes values('corporation');
insert into @suffixes values('company');
insert into @suffixes values('lp');
insert into @suffixes values('llc');
insert into @suffixes values('ltd');
insert into @suffixes values('limited');
declare @companynames table(entry nvarchar(100),processed bit default 0);
insert into @companynames values('somecompany llc',0);
insert into @companynames values('business2 co',0);
insert into @companynames values('business3',0);
insert into @companynames values('business4 lpx',0);
while exists(select * from @companynames where processed = 0)
begin
declare @currentcompanyname nvarchar(100) = (select top 1 entry from @companynames where processed = 0);
update @companynames set processed = 1 where entry = @currentcompanyname;
update @companynames
set entry = SUBSTRING(entry, 1, LEN(entry) - LEN(tag))
from @suffixes
where entry like '%' + tag
end
select * from @companynames
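The loop and the processed flag are not strictly needed, because the UPDATE inside the loop already touches every row. A possible set-based variant, offered as a sketch under the same sample data: CROSS APPLY picks, for each company, the longest suffix that ends the name after a space, and rows without a matching suffix are simply not updated. The aliases c, sf and s are introduced here for illustration.

```sql
DECLARE @suffixes TABLE (tag nvarchar(20));
INSERT INTO @suffixes (tag)
VALUES (N'co'), (N'corp'), (N'corporation'), (N'company'),
       (N'lp'), (N'llc'), (N'ltd'), (N'limited');

DECLARE @companynames TABLE (entry nvarchar(100));
INSERT INTO @companynames (entry)
VALUES (N'somecompany llc'), (N'business2 co'),
       (N'business3'), (N'business4 lpx');

-- One pass over the table: each row is matched against its longest
-- trailing suffix and trimmed once, including the separating space.
UPDATE c
SET entry = LEFT(c.entry, LEN(c.entry) - LEN(s.tag) - 1)
FROM @companynames AS c
CROSS APPLY (
    SELECT TOP (1) sf.tag
    FROM @suffixes AS sf
    WHERE c.entry LIKE N'% ' + sf.tag
    ORDER BY LEN(sf.tag) DESC
) AS s;

SELECT entry FROM @companynames;
```

With this data, 'somecompany llc' and 'business2 co' lose their suffix, while 'business3' and 'business4 lpx' stay as they are.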
You can use a query like below:
-- Assuming that you can maintain all patterns in a table or a temp table
CREATE TABLE tbl(pattern varchar(100))
INSERT INTO tbl values
('co'),('llc'),('beta')
--@a stores the string you need to manipulate, @lw & @b are helper variables
DECLARE @a nvarchar(100), @b nvarchar(100), @lw varchar(100)
SET @a='alpha beta gamma'
SET @b=''
-- @t is a flag: it becomes 1 once a pattern word has been found and removed
DECLARE @t int
SET @t=0
-- Loop until a pattern word is found or no text is left
WHILE(@t=0 AND LEN(@a)>0 )
BEGIN
-- Store the current last word in the @lw variable (the appended space guards against a string with no space left)
SET @lw=reverse(substring(reverse(@a),1, charindex(' ', reverse(@a)+' ') -1))
-- check if the word is in pattern dictionary. If yes, then Voila!
SELECT @t=1 FROM tbl WHERE @lw like pattern
-- remove the last word from @a
SET @a=LEFT(@a,LEN(@a)-LEN(@lw))
IF (@t<>1)
BEGIN
-- all words which were not pattern are joined back onto this stack
SET @b=CONCAT(@lw,@b)
END
END
-- get back the remaining word
SET @a=CONCAT(@a,@b)
SELECT @a
drop table tbl
Do note that this method overcomes Tom's problem of
if you have a company name like "Watco" then the "co" would be a false positive here. I'm not sure what can be done about that other than maybe making your suffixes include a leading space.
Use the REPLACE function in SQL Server 2012:
declare @var1 nvarchar(20) = 'ACME LLC'
declare @var2 nvarchar(20) = 'LLC'
SELECT CASE
WHEN ((PATINDEX('%'+@var2+'%',@var1) <= (LEN(@var1)-LEN(@var2)))
Or (SUBSTRING(@var1,PATINDEX('%'+@var2+'%',@var1)-1,1) <> SPACE(1)))
THEN @var1
ELSE
REPLACE(@var1,@var2,'')
END
Here is another way to overcome the 'Runco Co' situation.
declare @var1 nvarchar(20) = REVERSE('Runco Co')
declare @var2 nvarchar(20) = REVERSE('Co')
Select REVERSE(
CASE WHEN(CHARINDEX(' ',@var1) > LEN(@var2)) THEN
SUBSTRING(@var1,PATINDEX('%'+@var2+'%',@var1)+LEN(@var2),LEN(@var1)-LEN(@var2))
ELSE
@var1
END
)
I have a table that looks like this:
memberno(int)|member_mouth (varchar)|Inspected_Date (varchar)
-----------------------------------------------------------------------------
12 |'1;2;3;4;5;6;7' |'12-01-01;12-02-02;12-03-03' [7 members]
So, looking at how this table has been structured (poorly, yes):
The value in the member_mouth field is a string delimited by ";".
The value in the Inspected_Date field is a string delimited by ";".
So, for each delimited value in member_mouth there is a corresponding Inspected_Date value at the same position in its string.
This table has about 4 million records. We have an application written in C# that normalizes the data and stores it in a separate table. The problem now is that, because of the size of the table, this takes a long time to process. (The example above is nothing compared to the actual table, which is much larger and has a couple of those string "array" fields.)
My question is this: what would be the best and fastest way to normalize this data in an MSSQL proc, letting MSSQL do the work instead of a C# app?
The best way will be SQL itself. The approach in the code below worked well for me with 2-3 lakh (200,000-300,000) rows.
I am not sure how the code below behaves with 4 million rows, but it may help.
Declare @table table
(memberno int, member_mouth varchar(100),Inspected_Date varchar(400))
Insert into @table Values
(12,'1;2;3;4;5;6;7','12-01-01;12-02-02;12-03-03;12-04-04;12-05-05;12-07-07;12-08-08'),
(14,'1','12-01-01'),
(19,'1;5;8;9;10;11;19','12-01-01;12-02-02;12-03-03;12-04-04;12-07-07;12-10-10;12-12-12')
Declare @tableDest table
(memberno int, member_mouth varchar(100),Inspected_Date varchar(400))
The table will be like.
Select * from @table
See the code from here.
------------------------------------------
Declare @max_len int,
@count int = 1
Set @max_len = (Select max(Len(member_mouth) - len(Replace(member_mouth,';','')) + 1)
From @table)
While @count <= @max_len
begin
Insert into @tableDest
Select memberno,
SUBSTRING(member_mouth,1,charindex(';',member_mouth)-1),
SUBSTRING(Inspected_Date,1,charindex(';',Inspected_Date)-1)
from @table
Where charindex(';',member_mouth) > 0
union
Select memberno,
member_mouth,
Inspected_Date
from @table
Where charindex(';',member_mouth) = 0
Delete from @table
Where charindex(';',member_mouth) = 0
Update @table
Set member_mouth = SUBSTRING(member_mouth,charindex(';',member_mouth)+1,len(member_mouth)),
Inspected_Date = SUBSTRING(Inspected_Date,charindex(';',Inspected_Date)+1,len(Inspected_Date))
Where charindex(';',member_mouth) > 0
Set @count = @count + 1
End
------------------------------------------
Select *
from @tableDest
Order By memberno
Order By memberno
------------------------------------------
Result.
You can take a reference here.
Splitting delimited values in a SQL column into multiple rows
Do it on the SQL Server side; if possible, an SSIS package would be great.
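As an illustration of what doing it on the SQL Server side can look like without a WHILE loop, here is a sketch that splits both columns with a numbers ("tally") CTE and pairs the pieces by position. It assumes memberno is unique per row and that no string is longer than 400 characters; the Numbers, Mouths and Dates CTEs and the pos column are names introduced here, and the approach has not been measured against 4 million rows.

```sql
DECLARE @src TABLE (memberno int, member_mouth varchar(100), Inspected_Date varchar(400));
INSERT INTO @src VALUES
(12, '1;2;3', '12-01-01;12-02-02;12-03-03'),
(14, '1', '12-01-01');

WITH Numbers AS (
    -- Character positions 1..400 (enough for varchar(400)).
    SELECT TOP (400) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_columns
),
Mouths AS (
    -- n is the start of an element when the character before it is ';'
    -- (a ';' is prepended so that position 1 qualifies too).
    SELECT s.memberno,
           ROW_NUMBER() OVER (PARTITION BY s.memberno ORDER BY n.n) AS pos,
           SUBSTRING(s.member_mouth, n.n,
                     CHARINDEX(';', s.member_mouth + ';', n.n) - n.n) AS member_mouth
    FROM @src AS s
    INNER JOIN Numbers AS n
            ON n.n <= LEN(s.member_mouth)
           AND SUBSTRING(';' + s.member_mouth, n.n, 1) = ';'
),
Dates AS (
    SELECT s.memberno,
           ROW_NUMBER() OVER (PARTITION BY s.memberno ORDER BY n.n) AS pos,
           SUBSTRING(s.Inspected_Date, n.n,
                     CHARINDEX(';', s.Inspected_Date + ';', n.n) - n.n) AS Inspected_Date
    FROM @src AS s
    INNER JOIN Numbers AS n
            ON n.n <= LEN(s.Inspected_Date)
           AND SUBSTRING(';' + s.Inspected_Date, n.n, 1) = ';'
)
-- Pair the k-th mouth with the k-th date of the same member.
SELECT m.memberno, m.member_mouth, d.Inspected_Date
FROM Mouths AS m
INNER JOIN Dates AS d
        ON d.memberno = m.memberno
       AND d.pos = m.pos
ORDER BY m.memberno, m.pos;
```

For the two sample rows this yields four rows: three for member 12 and one for member 14. A permanent numbers table would usually be preferable to reading sys.all_columns on every run.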
sql2005
This is my simplified example:
(in reality there are 40+ tables in here, I only showed 2)
I got a table called tb_modules, with 3 columns (id, description, tablename as varchar):
1, UserType, tb_usertype
2, Religion, tb_religion
(Last column is actually the name of a different table)
I got an other table that looks like this:
tb_value (columns:id, tb_modules_ID, usertype_OR_religion_ID)
values:
1111, 1, 45
1112, 1, 55
1113, 2, 123
1114, 2, 234
So 45, 55, 123 and 234 are usertype OR religion IDs
(45 and 55 are usertype IDs; 123 and 234 are religion IDs)
Don't judge, I didn't design the database
Question
How can I make a select, showing * from tb_value, plus one column
That one column would be TITLE from the tb_usertype or RELIGIONNAME from the tb_religion table
I would like to make a general thing.
Was thinking initially about maybe a SQL function that returns a string, but I think I would need dynamic SQL, which is not ok in a function.
Anyone a better idea ?
At the beginning we have this -- which is quite messy.
To clean-up a bit I add two views and a synonym:
create view v_Value as
select
ID as ValueID
, tb_modules_ID as ModuleID
, usertype_OR_religion_ID as RemoteID
from tb_value ;
go
create view v_Religion as
select
ID
, ReligionName as Title
from tb_religion ;
go
create synonym v_UserType for tb_UserType ;
go
And now the model looks like
It is easier now to write the query
;
with
q_mod as (
select
m.ID as ModuleID
, coalesce(x1.ID , x2.ID) as RemoteID
, coalesce(x1.Title , x2.Title) as Title
, m.Description as ModuleType
from tb_Modules as m
left join v_UserType as x1 on m.TableName = 'tb_UserType'
left join v_Religion as x2 on m.TableName = 'tb_Religion'
)
select
a.ModuleID
, v.ValueID
, a.RemoteID
, a.ModuleType
, a.Title
from q_mod as a
join v_Value as v on (v.ModuleID = a.ModuleID and v.RemoteID = a.RemoteID) ;
There is an obvious pattern in this query, so it can be created as dynamic sql if you have to add another module-type table. When adding another table, use ID and Title to avoid having to use a view.
EDIT
To build dynamic sql (or query on application level)
Modify lines 6 and 7, the x-index is tb_modules.id
coalesce(x1. , x2. , x3. ..)
Add lines to the left join (below line 11)
left join v_SomeName as x3 on m.TableName = 'tb_SomeName'
The SomeName is tb_modules.description and x-index is matching tb_modules.id
EDIT 2
The simplest would probably be to package the above query into a view and then, each time the schema changes, dynamically create and run ALTER VIEW. This way the query would not change from the application's point of view.
Since we're all agreed the design is flaky, I'll skip any comments on that. The pattern of the query is this:
-- Query 1
select tb_value.*,tb_religion.religion_name as ANY_DESCRIPTION
from tb_value
JOIN tb_religion on tb_value.ANY_KIND_OF_ID = tb_religion.id
WHERE tb_value.module_id = 2
-- combine it with...
UNION ALL
-- ...Query 2
select tb_value.*,tb_userType.title as ANY_DESCRIPTION
from tb_value
JOIN tb_userType on tb_value.ANY_KIND_OF_ID = tb_userType.id
WHERE tb_value.module_id = 1
-- combine it with...
UNION ALL
-- ...Query 3
select lather, rinse, repeat for 40 tables!
You can actually define a view that hardcodes all 40 cases, and then put filters onto queries for the particular modules you want.
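A sketch of such a view for the two tables shown in the question, assuming the column names given there (TITLE in tb_usertype, RELIGIONNAME in tb_religion, and the tb_value columns as listed). The view name v_value_described, the Descr alias and the varchar(255) cast are assumptions made for this example; the cast is only there so that both branches of the UNION ALL return the same type.

```sql
-- Hypothetical view name; one SELECT per module, glued with UNION ALL.
CREATE VIEW v_value_described AS
SELECT v.id, v.tb_modules_ID, v.usertype_OR_religion_ID,
       CAST(u.TITLE AS varchar(255)) AS Descr
FROM tb_value AS v
INNER JOIN tb_usertype AS u
        ON u.id = v.usertype_OR_religion_ID
WHERE v.tb_modules_ID = 1          -- module 1 = UserType
UNION ALL
SELECT v.id, v.tb_modules_ID, v.usertype_OR_religion_ID,
       CAST(r.RELIGIONNAME AS varchar(255)) AS Descr
FROM tb_value AS v
INNER JOIN tb_religion AS r
        ON r.id = v.usertype_OR_religion_ID
WHERE v.tb_modules_ID = 2;         -- module 2 = Religion
GO

-- Filtering on the module lets the optimizer skip the other branches.
SELECT * FROM v_value_described WHERE tb_modules_ID = 2;
```

The remaining modules would each add one more UNION ALL branch of the same shape.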
To do this dynamically you need to be able to create a sql statement that looks like this
select tb_value.*, tb_usertype.title as Descr
from tb_value
inner join tb_usertype
on tb_value.extid = tb_usertype.id
where tb_value.tb_module_id = 1
union all
select tb_value.*, tb_religion.religionname as Descr
from tb_value
inner join tb_religion
on tb_value.extid = tb_religion.id
where tb_value.tb_module_id = 2
-- union 40 other tables
Currently you can not do that because you do not have any information in the db telling you which column to use from tb_religion and tb_usertype etc. You can add that as a new field in tb_module.
If you have fieldname to use in tb_module you can build a view that does what you want.
And you could add a trigger to table tb_modules that alters the view whenever tb_modules is modified. That way you do not need to use dynamic sql from the client when doing queries. The only thing you need to worry about is that the table needs to be created in the db before you add a new row to tb_modules
Edit 1
Of course the code in the trigger needs to dynamically build the alter view statement.
Edit 2 You also need to have a field with information about what column in tb_usertype and tb_religion etc. to join against tb_value.extid (usertype_OR_religion_ID). Or you can assume that the field will always be called id
Edit 3 Here is how you could build the trigger on tb_module that alters the view v_values. I have added fieldname as a column in tb_modules and I assume that the id field in the related tables is called id.
create trigger tb_modules_change on tb_modules after insert, delete, update
as
declare @sql nvarchar(max)
declare @moduleid int
declare @tablename varchar(50)
declare @fieldname varchar(50)
set @sql = 'alter view v_value as '
declare mcur cursor for
select id, tablename, fieldname
from tb_modules
open mcur
fetch next from mcur into @moduleid, @tablename, @fieldname
while @@FETCH_STATUS = 0
begin
set @sql = @sql + 'select tb_value.*, '+@tablename+'.'+@fieldname+' '+
'from tb_value '+
'inner join '+@tablename+' '+
'on tb_value.extid = '+@tablename+'.id '+
'where tb_value.tb_module_id = '+cast(@moduleid as varchar(10))
fetch next from mcur into @moduleid, @tablename, @fieldname
if @@FETCH_STATUS = 0
begin
set @sql = @sql + ' union all '
end
end
close mcur
deallocate mcur
exec sp_executesql @sql
Hm..there are probably better solutions available but here's my five cents:
SELECT
id,tb_modules_ID,usertype_OR_religion_ID,
COALESCE(
(SELECT TITLE FROM tb_usertype WHERE Id = usertype_OR_religion_ID),
(SELECT RELIGIONNAME FROM tb_religion WHERE Id = usertype_OR_religion_ID),
'N/A'
) AS SourceTable
FROM tb_value
Note that I don't have the possibility to check the statement right now so I'm reserving myself for any syntax errors...
First, using your current design the only reasonable solution is dynamic SQL. You should write a module in your middle-tier that queries for the appropriate table names and builds the queries on the fly. Trying to accomplish that in T-SQL will be a nightmare. T-SQL was not designed for string construction.
The right solution is to build a new database designed properly, migrate the data and scrap the existing design. The problems you will encounter with your current design will simply grow. It will be harder for new developers to learn the new system. It will be prone to errors. There will be no data integrity (e.g. forcing the attribute "Start Date" to be parsable as a date). Custom queries will be a chore to write and so on. Eventually, you will hit the day when the types of information desired from the system are simply too difficult to extract given the current design.
First take the undesigner out the back and put them out of their misery. They are hurting people.
Due to their incompetence, every time you add a row to Module, you have to modify every query that uses it. Good for www.dailywtf.com.
You do not have Referential Integrity either, because you cannot define an FK on the this_or_that column. Your data is exposed, probably to "code" written by the same undesigner. No doubt you are aware that this is where the deadlocks are created.
That is a "judgement", yes; it is there so that you understand the gravity of the undesign and can justify replacing it to your managers.
SQL was designed for Relational Databases, and that means Normalised. It is not good for mangled files. Sure, some queries may be better than others (just look at the answers), but there is no way to get around the undesign: any SQL query will be hamstrung and will need to change whenever a Module row is added.
"Dynamic" is reserved for Databases, not possible for flat files.
Two answers. One to stop the continuing idiocy of changing the existing queries every time a Module row is added (you're welcome); the second to answer your question.
Safe Future Queries
CREATE VIEW UserUserType_vw AS -- module 1 is UserType in the question; two views cannot share one name
SELECT [XxxxId] = id, -- replace Xxxx
[UserTypeId] = usertype_OR_religion_ID
FROM tb_value
WHERE tb_modules_ID = 1
GO
CREATE VIEW UserReligion_vw AS
SELECT [XxxxId] = id,
[ReligionId] = usertype_OR_religion_ID
FROM tb_value
WHERE tb_modules_ID = 2
From now on, make sure that all queries currently using the undesign are modified to use the correct View instead. Do not use the Views for Update/Delete/Insert.
Answer
Ok, now for the main question. I can think of other approaches, but this one is the best. You have stated that you want the third column to also be an unnormalised piece of chicken excreta and to supply the Title for [EITHER_Religion_OR_UserType_OR_This_OR_That]. Right, so you are teaching the user to be confused as well; when the number of modules grows, they will have great fun figuring out what the column contains. Yes, a problem does always compound itself.
SELECT [XxxxId] = id,
[Whatever] = CASE tb_modules_ID
WHEN 1 THEN ( SELECT title -- module 1 is UserType in the question
FROM tb_usertype
WHERE id = V.usertype_OR_religion_ID
)
WHEN 2 THEN ( SELECT religionname -- module 2 is Religion in the question
FROM tb_religion
WHERE id = V.usertype_OR_religion_ID
)
ELSE '(UnknownModule)' -- do not remove the brackets
END
FROM tb_value V
WHERE conditions... -- you need something here
This is called a Correlated Scalar Subquery.
It works on any version of Sybase since 4.9.2 with no limitations. And SQL 2005 (last time I looked, anyway, Aug 2009). But on MS you will get a StackTrace if the volume of tb_value is large, so make sure the WHERE clause has some conditions on it.
But MS have broken the server with their "new" 2008 codeline, so it does not work in all circumstances (the worse your mangled files, the less likely it will work; the better your database design, the more likely it will work). That is why some MS people pray every day for the next Service pack, and others never attend church.
I guess you want something like this:
Adding tables, and one row per table in tb_modules, is straightforward.
SET NOCOUNT ON
if OBJECT_ID('tb_modules') > 0 drop table tb_modules;
if OBJECT_ID('tb_value') > 0 drop table tb_value;
if OBJECT_ID('tb_usertype') > 0 drop table tb_usertype;
if OBJECT_ID('tb_religion') > 0 drop table tb_religion;
go
create table dbo.tb_modules (
id int,
description varchar(20),
tablename varchar(255)
);
insert into tb_modules values ( 1, 'UserType', 'tb_usertype');
insert into tb_modules values ( 2, 'Religion', 'tb_religion');
create table dbo.tb_value(
id int,
tb_modules_ID int,
usertype_OR_religion_ID int
);
insert into tb_value values ( 1111, 1, 45);
insert into tb_value values ( 1112, 1, 55);
insert into tb_value values ( 1113, 2, 123);
insert into tb_value values ( 1114, 2, 234);
create table dbo.tb_usertype(
id int,
UserType varchar(30)
);
insert into tb_usertype values ( 45, 'User_type_45');
insert into tb_usertype values ( 55, 'User_type_55');
create table dbo.tb_religion(
id int,
Religion varchar(30)
);
insert into tb_religion values ( 123, 'Religion_123');
insert into tb_religion values ( 234, 'Religion_234');
-- start of query
declare @sql varchar(max) = null
Select @sql = case when @sql is null then ' ' else @sql + char(10) + 'union all ' end
+ 'Select ' + str(id) + ' type, id, ' + description + ' description from ' + tablename from tb_modules
set @sql = 'select v.id, tb_modules_ID , usertype_OR_religion_ID , t.description
from tb_value v
join ( ' + @sql + ') as t
on v.tb_modules_ID = t.type and v.usertype_OR_religion_ID = t.id
'
Print @sql
exec( @sql)
I think it's intended to be used with dynamic sql.
Maybe break out each tb_value.tb_modules_ID row into its own temp table, named with the tb_modules.tablename.
Then have an sp iterate through the temp tables matching your naming convention (by prefix or suffix) building the sql and doing your join.
Sorry for the poor question wording; I wasn't sure how to describe this. I want to iterate through every row in a table and, while doing so, extract a column, parse the varchar that is in it and, depending on what it finds, insert rows into another table. Something along the lines of this:
DECLARE @string varchar(max);
foreach row in (select * from Table) {
set @string = row[column];
while (len(@string) > 0) {
-- Do all the parsing in here
if (found what was looking for)
insert into Table2 values(row[column2], row[column3]);
}
}
It would be really nice for this to be a stored procedure so for it to be done in SQL. I'm just not too sure on how to approach it. Thanks.
Edit:
This is basically the functionality I was hoping for:
Table 1 |
id_number | text |
1 Hello, test 532. Yay oh and test 111
2 test 932.
3 This is a test 315 of stuff test 555.
4 haflksdhfal test 311 sadjhfalsd
5 Yay.
I want to go through this table and parse all of the text columns to look for instances of 'test #' where # is a number. When it finds something inside of the text in that format it will insert that value into another table like:
Table 2 |
id_number | number
1 532
1 111
2 932
3 315
3 555
4 311
You are thinking procedurally instead of set based. You can probably write the whole thing as a single query:
INSERT INTO target_table (column list)
SELECT (column list)
FROM source_table
WHERE (parse your column) = (some criterion)
It is much easier to write, and probably a lot faster too.
If your parsing function is complicated, you can put it into a user-defined function instead of embedding it directly into the query.
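To make the set-based idea concrete for the 'test #' example in the question, here is one possible sketch using a recursive CTE: each level cuts the text off just before the next number that follows 'test ', and the final SELECT reads the leading digits. The Hits CTE and the rest column are names introduced here; the sample rows are the ones from the question.

```sql
DECLARE @Table1 TABLE (id_number int, textcol nvarchar(4000));
INSERT INTO @Table1 VALUES (1, N'Hello, test 532. Yay oh and test 111');
INSERT INTO @Table1 VALUES (2, N'test 932.');
INSERT INTO @Table1 VALUES (3, N'This is a test 315 of stuff test 555.');
INSERT INTO @Table1 VALUES (4, N'haflksdhfal test 311 sadjhfalsd');
INSERT INTO @Table1 VALUES (5, N'Yay.');

WITH Hits AS (
    -- Anchor: drop everything up to and including the first 'test ',
    -- so that rest starts at the first digit of the number.
    SELECT id_number,
           CAST(STUFF(textcol, 1, PATINDEX('%test [0-9]%', textcol) + 4, '')
                AS nvarchar(4000)) AS rest
    FROM @Table1
    WHERE PATINDEX('%test [0-9]%', textcol) > 0
    UNION ALL
    -- Recursive step: look for the next 'test <digit>' in what is left.
    SELECT id_number,
           CAST(STUFF(rest, 1, PATINDEX('%test [0-9]%', rest) + 4, '')
                AS nvarchar(4000))
    FROM Hits
    WHERE PATINDEX('%test [0-9]%', rest) > 0
)
-- The number is the run of digits at the start of rest; the appended
-- 'x' guarantees a non-digit is found when the text ends in digits.
SELECT id_number,
       LEFT(rest, PATINDEX('%[^0-9]%', rest + N'x') - 1) AS number
FROM Hits
ORDER BY id_number;
```

Prefixing this SELECT with INSERT INTO Table2 (id_number, number) would load the target table in one statement. Note that the pattern has no word boundary, so text such as 'contest 5' would also match, and the default MAXRECURSION of 100 limits the matches per row.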
In SQL Server 2008 you can do this
WITH testTable AS
(
SELECT 1 AS id_number, N'Hello, test 532. Yay oh and test 111' AS txt UNION ALL
SELECT 2, N'test 932.' UNION ALL
SELECT 3, N'This is a test 315 of stuff test 555.' UNION ALL
SELECT 4, N'haflksdhfal test 311 sadjhfalsd' UNION ALL
SELECT 5, N'Yay.'
)
SELECT id_number,display_term
FROM testTable
CROSS APPLY sys.dm_fts_parser('"' + REPLACE(txt,'"','""') + '"', 1033, 0,0)
WHERE TXT IS NOT NULL and
display_term NOT LIKE '%[^0-9]%' /*Or use LIKE '[0-9][0-9][0-9]' to only get 3
digit numbers*/
Returns
id_number display_term
----------- ------------------------------
1 532
1 111
2 932
3 315
3 555
4 311
Something like this, if you always have "Test (number)". It works on SQL Server 2005+
DECLARE @Table1 TABLE (id_number int, textcol nvarchar(MAX))
INSERT @Table1 VALUES (1, 'Hello, test 532. Yay oh and test 111')
INSERT @Table1 VALUES (2, 'test 932.')
INSERT @Table1 VALUES (3, 'This is a test 315 of stuff test 555.')
INSERT @Table1 VALUES (4, 'haflksdhfal test 311 sadjhfalsd')
INSERT @Table1 VALUES (5, 'Yay.')
;WITH cte AS
(
SELECT TOP 9999 CAST(ROW_NUMBER() OVER (ORDER BY c1.OBJECT_ID) AS varchar(6)) AS TestNum
FROM sys.columns c1 CROSS JOIN sys.columns c2
)
SELECT id_number, TestNum FROM
cte
JOIN
@Table1 ON PATINDEX('%Test ' + TestNum + '[^0-9]%', textcol) > 0
OR textcol LIKE '%Test ' + TestNum
ORDER BY
id_number
The feature you are looking for is called a CURSOR - here is an article on how to use them.
They are considered bad for performance and difficult to use correctly.
Rethink your problem and restate it so it can be solved in a set based operation.
Look at using table variables or sub queries for your complex condition.
You're after a cursor - see the MSDN docs here. Note that cursors should be avoided wherever possible - there are very few places that they're appropriate and can result in slow inefficient code - you're usually better off trying a set-based solution.
To do this with iteration, as you request, you can use a cursor. Using your sample information, below is how a cursor is laid out; put your row-by-row processing where my comment is.
DECLARE @CurrentRecord VARCHAR(MAX)
DECLARE db_cursor CURSOR FOR
SELECT [Column]
FROM [Table]
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @CurrentRecord
WHILE @@FETCH_STATUS = 0
BEGIN
--Your stuff here
FETCH NEXT FROM db_cursor INTO @CurrentRecord
END
CLOSE db_cursor
DEALLOCATE db_cursor
However, depending on what you are doing, and if this is something you do on a regular basis, I would recommend seeing whether you can extract the parsing into a user-defined function; then you could make it set based and not use a cursor. A cursor should be a "last ditch" effort.
I have a table with 1 column of varchar values. I am looking for a way to concatenate those values into a single value without a loop, if possible. If a loop is the most efficient way of going about this, then I'll go that way but figured I'd ask for other options before defaulting to that method. I'd also like to keep this inside of a SQL query.
Ultimately, I want to do the opposite of a split function.
Is it possible to do without a loop (or cursor) or should I just use a loop to make this happen?
Edit:
Since there was a very good answer associated with how to do it in MySql (as opposed to MS Sql like I initially intended), I decided to retag so others may be able to find the answer as well.
declare @concat varchar(max)
set @concat = ''
select @concat = @concat + col1 + ','
from tablename1
try this:
DECLARE @YourTable table (Col1 int)
INSERT INTO @YourTable VALUES (1)
INSERT INTO @YourTable VALUES (2)
INSERT INTO @YourTable VALUES (30)
INSERT INTO @YourTable VALUES (400)
INSERT INTO @YourTable VALUES (12)
INSERT INTO @YourTable VALUES (46454)
SELECT
STUFF(
(
SELECT
', ' + cast(Col1 as varchar(30))
FROM @YourTable
WHERE Col1<=400
ORDER BY Col1
FOR XML PATH('')
), 1, 2, ''
)
OUTPUT:
-------------------
1, 2, 12, 30, 400
(1 row(s) affected)
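On SQL Server 2017 and later the same result can be had without the FOR XML PATH trick, since those versions ship a built-in STRING_AGG aggregate. A short sketch using the same sample data; the Joined alias is just a name chosen for this example.

```sql
DECLARE @YourTable TABLE (Col1 int);
INSERT INTO @YourTable (Col1)
VALUES (1), (2), (30), (400), (12), (46454);

-- STRING_AGG concatenates the values with the given separator;
-- WITHIN GROUP controls the order of the pieces.
SELECT STRING_AGG(CAST(Col1 AS varchar(30)), ', ')
           WITHIN GROUP (ORDER BY Col1) AS Joined
FROM @YourTable
WHERE Col1 <= 400;
-- 1, 2, 12, 30, 400
```

On older versions the STUFF / FOR XML PATH form above remains the usual choice.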
I just tackled a problem like this and looping took forever. So, I concatenated the values in the presentation medium (in this case Crystal Reports) and it was very fast.
Just an idea.
If it is MySQL, you can use GROUP_CONCAT
SELECT a, GROUP_CONCAT(b SEPARATOR ',') FROM table GROUP BY a;
Probably dated now but check out Adam Machanic's post on the topic.
And this one is certainly dated; I wrote it in 2004.
Why do I prefer a function over "keeping it inside a SQL query"? Because you'll probably have to do this more than once. Why not encapsulate that code into a single module instead of repeating it all over the place?