SQL Server bringing back inconsistent results - sql-server-2005

I am working on some reports that were created before I started in my current job. One of these reports is based on a view (SQL Server 2005).
This view is incredibly large and unwieldy, and for now, I won't post it because I think it's just too big. I'm not sure how it was produced - I'm guessing that it was produced in the designer because I can't see someone actually writing stuff like this. It's several pages long, and references 5 other views. Bottom line - it's complicated, and needs to be refactored/redesigned, but until we get time for that we're stuck with it.
Anyway, I have to make some minor non-functional changes to it in order to move it to a different database and schema. In order to make sure I'm not changing what it actually returns, I'm amending a second version of the view. Let's call the first view vw_Data1 and my new view vw_Data2. Now, if I write:
SELECT Count(*) FROM
(
SELECT * FROM vw_Data1
UNION
SELECT * FROM vw_Data2
)
then I should get back the same number as if I just did
SELECT Count(*) FROM vw_Data1
as long as vw_Data1 and vw_Data2 return identical rows (which is what I want to check).
However, what I am finding is if I run the UNION query above several times, I get DIFFERENT RESULTS EACH TIME.
So, just to be clear, if I run:
SELECT Count(*) FROM
(
SELECT * FROM vw_Data1
UNION
SELECT * FROM vw_Data2
)
more than once, then I get different results each time.
As I say, I'm not posting the actual code yet, because the first thing I want to ask is simply this - how on earth can a query return different results?
There is one non-deterministic function used, and that is as part of the following (horrible) join:
LEFT OUTER JOIN dbo.vwuniversalreportingdata_budget
ON
CASE
WHEN dbo.f_tasks.ta_category = 'Reactive' THEN
CAST(dbo.f_tasks.ta_fkey_fc_seq AS VARCHAR(10))
+ ' | '
+ CAST(dbo.f_tasks.ta_fkey_fcc_seq AS VARCHAR(10))
+ ' | '
+ CAST(YEAR(DATEADD(MONTH, -3, dbo.f_tasks.ta_sched_date)) AS VARCHAR(10))
WHEN dbo.f_tasks.ta_category = 'Planned' THEN
CAST(dbo.f_tasks.ta_fkey_fc_seq AS VARCHAR(10))
+ ' | '
+ CAST(dbo.f_tasks.ta_fkey_fcc_seq AS VARCHAR(10))
+ ' | '
+ CAST(YEAR(DATEADD(MONTH, -3, dbo.f_tasks.ta_est_date)) AS VARCHAR(10))
WHEN dbo.f_tasks.ta_category = 'Periodic' THEN
CAST(dbo.f_tasks.ta_fkey_fc_seq AS VARCHAR(10))
+ ' | '
+ CAST(dbo.f_tasks.ta_fkey_fcc_seq AS VARCHAR(10))
+ ' | '
+ CAST(YEAR(DATEADD(MONTH, -3, dbo.f_tasks.ta_est_date)) AS VARCHAR(10))
END
= dbo.vwuniversalreportingdata_budget.id
The whole query is pretty disgusting like this. Anyway, any thoughts on how this could happen would be gratefully received. Is it something to do with the union, perhaps? I don't know. Help!

Related

SQL query throws error while using Alias in nested select

I posted one question today but that was too broad. So, I worked on that and now narrow down to some of parts. However, my query is returning syntax error while just copy paste the same text.
(
SELECT
TOP (100) PERCENT v_UpdateComplianceStatus.ResourceID, v_UpdateComplianceStatus.Status, CAST(DATEPART(yyyy, v_UpdateInfo.DatePosted) AS varchar(255)) + '-' + RIGHT('0' + CAST(DATEPART(mm, v_UpdateInfo.DatePosted) AS VARCHAR(255)), 2) AS MonthPosted, COUNT(1) AS Count
FROM
v_UpdateComplianceStatus
INNER JOIN
v_UpdateInfo
ON v_UpdateComplianceStatus.CI_ID = v_UpdateInfo.CI_ID
INNER JOIN
v_R_System
ON v_UpdateComplianceStatus.ResourceID = v_R_System.ResourceID
inner join
v_FullCollectionMembership fcm
on v_UpdateComplianceStatus.ResourceID = fcm.ResourceID
WHERE
(
v_R_System.Operating_System_Name_and0 like '%Workstation 6.1%'
and v_R_System.Obsolete0 = 0
)
AND
(
v_UpdateInfo.Severity IN
(
8,
10
)
)
AND
(
v_UpdateInfo.IsSuperseded = 0
)
AND
(
v_UpdateInfo.IsEnabled = 1
)
and fcm.CollectionID = 'ABC00328'
GROUP BY
v_UpdateComplianceStatus.ResourceID, v_UpdateComplianceStatus.Status, CAST(DATEPART(yyyy, v_UpdateInfo.DatePosted) AS varchar(255)) + '-' + RIGHT('0' + CAST(DATEPART(mm, v_UpdateInfo.DatePosted) AS VARCHAR(255)), 2)) FF
where Status =2
Group By MonthPosted ) E
where E.MonthPosted = E.MonthPosted
order by MonthPosted Desc
If I run the above query it is throwing error at
Group By MonthPosted ) EE
Not sure why it is giving error.
Msg 102, Level 15, State 1, Line 123
Incorrect syntax near 'EE'.
Some Important things which I discover.
This query works fine if I run some part of it.
(SELECT TOP (100) PERCENT v_UpdateComplianceStatus.ResourceID, v_UpdateComplianceStatus.Status, CAST(DATEPART(yyyy,
v_UpdateInfo.DatePosted) AS varchar(255)) + '-' + RIGHT('0' + CAST(DATEPART(mm, v_UpdateInfo.DatePosted) AS VARCHAR(255)), 2)
AS MonthPosted, COUNT(1) AS Count
FROM v_UpdateComplianceStatus INNER JOIN
v_UpdateInfo ON v_UpdateComplianceStatus.CI_ID = v_UpdateInfo.CI_ID INNER JOIN
v_R_System ON v_UpdateComplianceStatus.ResourceID = v_R_System.ResourceID
inner join v_FullCollectionMembership fcm on v_UpdateComplianceStatus.ResourceID=fcm.ResourceID
WHERE (v_R_System.Operating_System_Name_and0 like '%Workstation 6.1%' and v_R_System.Obsolete0 = 0) AND (v_UpdateInfo.Severity IN (8, 10)) AND (v_UpdateInfo.IsSuperseded = 0) AND (v_UpdateInfo.IsEnabled = 1)
and fcm.CollectionID='ABC00328'
GROUP BY v_UpdateComplianceStatus.ResourceID, v_UpdateComplianceStatus.Status, CAST(DATEPART(yyyy,
v_UpdateInfo.DatePosted) AS varchar(255)) + '-' + RIGHT('0' + CAST(DATEPART(mm, v_UpdateInfo.DatePosted) AS VARCHAR(255)), 2))
But if I Put Alias (FF) then it is throwing a syntax error.
Too long for a comment. Here is your first snippet condensed into something readable. Note - learn to write and post readable code. You have gone with FAR too much line spacing, indentation, and white space for anyone to easily read your code. The harder it is to read, the harder it is to understand. I've done my best to condense your first query into the essential elements.
(
SELECT blah blah,
CAST(DATEPART(yyyy, v_UpdateInfo.DatePosted) AS varchar(255)) + '-' +
RIGHT('0' + CAST(DATEPART(mm, v_UpdateInfo.DatePosted) AS VARCHAR(255)), 2) AS MonthPosted,
<aggregated column>
GROUP BY blah blah) as FF
where Status =2
Group By MonthPosted
) as E
where E.MonthPosted = E.MonthPosted
order by MonthPosted Desc
So what do YOU see wrong here. The where clause is pointless - does nothing useful. You probably introduced that error with all the editing. And apparently you cast numbers to large strings for no reason. That is just sloppy. It is concerning that you feel a need to cast a number to zero-filled string in the first place - that is probably an inefficient approach. If you really need to do that, look up the documentation for convert. Style 112 does what you need - all you need to do is take the first 6 characters of the converted string. Note 6 characters, not 255 characters or MAX characters. That will declutter your code significantly.
And now that you have edited your post multiple times, it logically makes no sense. There is no "EE" alias in your first query at all - so the error you posted cannot come from that snippet. Most likely the problem comes from the code you left out.
So now it is time to divide and conquer - a technique you can use to build complicated queries. Focus on that snippet ONLY. Write it as a complete query and run it, test it, validate that it works. When it does execute without errors and returns the correct results (results that you have actually verified and not just glanced at to see if numbers/values look "reasonable"), you can then add additional logic as needed. Usually it is best to add joins 1 by 1 to avoid creating a monster problem that is difficult to understand and diagnose. Often the use of CTEs can help. Put your starting query in a cte, get it working correctly. Example:
with cte1 as (...)
select * from cte1
order by ...;
Then add another cte to this first one and write it to use the first one, get it working. Example:
with cte1 as (...),
cte2 as (select ... from cte1 inner join ...)
select * from cte2
order by ...;
Repeat that as needed. Once everything works you can try to bring it all together and "beautify" it if needed.
And start thinking about your code. Use the appropriate datatypes, do NOT try to prematurely optimize things, learn and understand your schema, and stop using tricks. As I mentioned, "select top 100 percent" is generally pointless. Rhetorical question - why do you think this is needed as a part of this derived table. And use meaningful alias names. "E" and "EE" are not meaningful. Remember that someone will need to maintain this code, perhaps even modify it. And, of course, create an alias for every table/view and use it. Something short (but not too short) but meaningful. This will vastly improve readability - especially with those very long view (presumably) names.
Lastly, You said "This query works fine if i run some part of it." That just is not a useful thing to write since it means nothing to the reader. Which part? All of it? The first line? Just lines 2 through 5? It is difficult to have a technical discussion - do not add confusion by using terms that are imprecise.

SQL SERVER FOR XML PATH - Exporting data to excel using Integration Services

Using the help of SO and code below:
select t1.*,
stuff( (select '; ' + coalesce(data1, '') + ',' + coalesce(data2, '')
from table2 t2
where t2.FK_TBL1_ID = t1.id
for xml path ('')
), 1, 2, ''
) as Data1Data2
from table1 t1;
I successfully combined multiple rows with multiple columns into one row-one column in my sql view.
What i would ultimately like to achieve is to be able for each row with multiple columns combined, to break the line for the new record(row) when viewed inside an excel cell like below:
**Data1Data2Cell**
aaaa, bbbb;
cccc, dddd;
....
Same functionality can be achieved in Excel using ALT+ENTER on each cell.
I tried using Char(10) and Char(13) to no avail.
You should change your first line of STUFF to this
stuff( (select CHAR(10)+ coalesce(data1, '') + ',' + coalesce(data2, '')+'; '
This will introduce the necessary line break in the string correctly.
See working demo here
Please note that when you run this query in SSMS, you may not see the results correctly as typically results show up in grid view like below
To get the right view you should change your query results to text view by either pressing cntrl+ t or by clicking Query options from top menu(see screenshot below)
Finally after this your query will give you correct view of results like below

Typically long running query?

I inherited a poorly designed SQL Server implementation, with horrible database schemas, and several pitifully slow queries/views that take hours, or even days, to execute.
I'm curious: What might an experienced DBA/SQL programmer consider to be an unusually long time for a query to take? In my professional experience I'm not used to seeing queries (for views or reports, etc) that take more than maybe an hour or two to run. We have several here that take 1-2 days or more!
(This database has no relationships, no Primary Keys, no Foreign Keys, hardly any indexes, duplicate data all over the place, old data that shouldn't even be in the tables, temp tables everywhere, and so on...ugh!)
What do you consider within the realm of acceptable or normal for a lengthy query process?
I'm trying to do a sanity check to determine how awful this database really is...
This isn't an answer to your question, it's just easier to offer this script to you here than in a comment.
You might want to run this query and see what SQL Server thinks are the missing indexes it needs to perform better (a short-term solution while you migrate to your new schema and database). DO NOT BLINDLY APPLY THESE INDEXES. These are merely suggestions for you to consider, that SQL Server itself has identified as being useful when it runs. You might select one or two, perhaps tweaking include columns, and AFTER TESTING, apply some of them to help you speed up your existing system (I forget where this query came from, so I'm sadly unable to attribute the original author):
SELECT
migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS improvement_measure,
(migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)) AS [cumulative_impact],
OBJECT_NAME(OBJECT_ID) as TableName,
'CREATE INDEX [missing_index_' + CONVERT (varchar, mig.index_group_handle) + '_' + CONVERT (varchar, mid.index_handle)
+ '_' + LEFT (PARSENAME(mid.statement, 1), 32) + ']'
+ ' ON ' + mid.statement
+ ' (' + ISNULL (mid.equality_columns,'')
+ CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END
+ ISNULL (mid.inequality_columns, '')
+ ')'
+ ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement,
migs.*, mid.database_id, mid.[object_id]
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
WHERE migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 10
AND database_id = DB_ID()
ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC
Further, you can use the following script to find the "longest running queries" (along with their SQL Plans). There's a TON of information here, so play with the ORDER BY to bring different kinds of issues to your attention at the top (for instance, the longest running queries might run only once or twice, while one that runs thousand of times might not take SO long, but might consume far more resources all told):
SELECT [St].Text,
[Qp].[Query_Plan],
[Qs].*
FROM (SELECT TOP 50 *
FROM [Sys].[Dm_Exec_Query_Stats]
ORDER BY [Total_Worker_Time] DESC
) AS [Qs]
CROSS APPLY [Sys].[Dm_Exec_Sql_Text] ([Qs].[Sql_Handle]) AS [St]
CROSS APPLY [Sys].[Dm_Exec_Query_Plan] ([Qs].[Plan_Handle]) AS [Qp]
WHERE ([Qs].[Max_Worker_Time] > 300 OR [Qs].[Max_Elapsed_Time] > 300)
AND [Qs].execution_count > 1
ORDER BY min_elapsed_time DESC, max_elapsed_time DESC
These queries only return data from running instances. Stopping and restarting SQL Server erases all the collected data, so these queries are only really valuable for systems that have been up and running in "real world" situations for a while.
I hope you find these useful.

SQL query with subquery and concatenating variable using CRM on-premise data

I am working on a report where I need to provide a summary of notes for particular "activities/tasks".
Since the activity can accept multiple notes, I have to search for all the notes related to that activity. I then order it by date (new to old), and concatenate them with some other strings as such:
[Tom Smith wrote on 9/23/2016 1:21 pm] Client was out of office, left message. [Jane Doe wrote on 9/21/2016 3:24 pm] Client called asking about pricing.
The data comes from replicated tables of our on-premise CRM system, and I'm using SQL Server 2012. The tables I'm using are: AnnotationBase (contains the notes), ActivityPointerBase (contains the activities/tasks), and SystemUserID (to lookup usernames). Due to Data mismatch, I have to do some converting of the data types so that I can concatenate them properly, so that's why there's a lot of CAST and CONVERT. In addition, not all Activities have a NoteText associated with them, and sometimes the NoteText field is NULL, so I have to catch and filter the NULLs out (or it'll break my concatenated string).
I have written the following query:
DECLARE #Notes VarChar(Max)
SELECT
( SELECT TOP 5 #Notes = COALESCE(#Notes+ ', ', '') + '[' + CONVERT(varchar(max), ISNULL(sUB.FullName, 'N/A')) + ' wrote on ' + CONVERT(varchar(10), CAST(Anno.ModifiedOn AS DATE), 101) + RIGHT(CONVERT(varchar(32),Anno.ModifiedOn,100),8) + '] ' + CONVERT(varchar(max), ISNULL(Anno.NoteText, '')) --+ CONVERT(varchar(max), CAST(ModifiedOn AS varchar(max)), 101)--+ CAST(ModifiedOn AS varchar(max))
FROM [CRM_rsd].[dbo].[AnnotationBase] AS Anno
LEFT OUTER JOIN [CRM_rsd].[dbo].[systemUserBase] AS sUB
ON Anno.ModifiedBy = sUB.SystemUserId
WHERE Anno.ObjectId = Task.ActivityId--'0B48AB28-C08F-419A-8D98-9916BDFFDE4C'
ORDER BY Anno.ModifiedOn DESC
SELECT LEFT(#Notes,LEN(#Notes)-1)
) AS Notes
,Task.*
FROM [CRM_rsd].[dbo].[ActivityPointerBase] AS Task
WHERE Task.Subject LIKE '%Project On Hold%'
I know the above method is probably not very efficient, but the list of "Projects On Hold" is rather small (less than 500), so performance isn't a priority. What is a priority is to be able to get a consolidated and concatenated list of notes for each activity. I have been searching all over the internet for a solution, and I have tried many different methods. But I get the following errors:
Msg 102, Level 15, State 1, Line 3
Incorrect syntax near '='.
Msg 102, Level 15, State 1, Line 10
Incorrect syntax near ')'.
I envision two possible solutions to my problem:
My subquery errors are fixed or
Create a "view" of just the concatenated NotesText, grouped by ActivityId (which would work as a key), and then just query from that.
Yet even though I'm pretty sure my ideas would work, I can't seem to figure out how to concatenate a column and group at the same time.
What you are trying to do is display the records from one table (in your case ActivityPointerBase) and inside you want to add a calculated column with information from multiple records from another table (in your case AnnotationBase) merged in the rows.
There are multiple ways how you could achieve this that are different in terms of performance impact:
Option 1. You could write a scalar function that would receive as parameter the id of the task and would inside select the top 5 records, concatenating them in a procedural fashion and returning a varchar(max)
Option 2: You could use a subquery in combination with the FOR XML clause.
SELECT
SUBSTRING(
CAST(
(SELECT TOP 5 ', ' +
'[' + CONVERT(varchar(max), ISNULL(FullName, 'N/A')) +
' wrote on ' +
CONVERT(varchar(10), CAST(ModifiedOn AS DATE), 101) +
RIGHT(CONVERT(varchar(32),ModifiedOn,100),8) + '] ' +
CONVERT(varchar(max), ISNULL(NoteText, ''))
FROM [CRM_rsd].[dbo].[AnnotationBase] AS Anno
LEFT OUTER JOIN [CRM_rsd].[dbo].[systemUserBase] AS sUB ON Anno.ModifiedBy = sUB.SystemUserId
WHERE Anno.ObjectId = Task.ActivityId
ORDER BY Anno.ModifiedOn DESC
FOR XML PATH(''),TYPE
) AS VARCHAR(MAX)
),3,99999) AS Notes
,Task.*
FROM [CRM_rsd].[dbo].[ActivityPointerBase] AS Task
WHERE Task.Subject LIKE '%Project On Hold%'
What here happens is that by using the construct inside the CAST() we fetch the top 5 lines and make SQL server produce an XML with no element names, resulting in concatenation of the element values, we add comma as separator. Then we convert the XML to varchar(max) and remove the initial separator before the first record.
I prefer option 2, it will perform much better then using a scalar function.

How to count words in specific column against matching words in another table

I want to be able to:
extract specific words from column1 in Table1 - but only the words that are matched from Table2 from a column called word,
perform a(n individual) count of the number of words that have been found, and
put this information into a permanent table with a format, that looks like:
Final
Word | Count
--------+------
Test | 7
Blue | 5
Have | 2
Currently I have tried this:
INSERT INTO final (word, count)
SELECT
extext
, SUM(dbo.WordRepeatedNumTimes(extext, 'test')) AS Count
FROM [dbo].[TestSite_Info], [dbo].[table_words]
WHERE [dbo].[TestSite_Info].ExText = [dbo].[table_words].Words
GROUP BY ExText;
The function dbo.WordRepeatedNumTimes is:
ALTER function [dbo].[WordRepeatedNumTimes]
(#SourceString varchar(8000),#TargetWord varchar(8000))
RETURNS int
AS
BEGIN
DECLARE #NumTimesRepeated int
,#CurrentStringPosition int
,#LengthOfString int
,#PatternStartsAtPosition int
,#LengthOfTargetWord int
,#NewSourceString varchar(8000)
SET #LengthOfTargetWord = len(#TargetWord)
SET #LengthOfString = len(#SourceString)
SET #NumTimesRepeated = 0
SET #CurrentStringPosition = 0
SET #PatternStartsAtPosition = 0
SET #NewSourceString = #SourceString
WHILE len(#NewSourceString) >= #LengthOfTargetWord
BEGIN
SET #PatternStartsAtPosition = CHARINDEX (#TargetWord,#NewSourceString)
IF #PatternStartsAtPosition <> 0
BEGIN
SET #NumTimesRepeated = #NumTimesRepeated + 1
SET #CurrentStringPosition = #CurrentStringPosition + #PatternStartsAtPosition +
#LengthOfTargetWord
SET #NewSourceString = substring(#NewSourceString, #PatternStartsAtPosition +
#LengthOfTargetWord, #LengthOfString)
END
ELSE
BEGIN
SET #NewSourceString = ''
END
END
RETURN #NumTimesRepeated
END
When I run the above INSERT statement, no record is inserted.
In the table TestSite_Info is a column called Extext. Within this column, there is random text - one of the words being 'test'.
In the other table called Table_Words, I have a column called Words and one of the words in there is 'Test'. So in theory, as the word is a match, I would pick it up, put it into the table Final, and then next to the word (in another column) the count of how many times the word has been found within TestSite_Info.Extext.
Table_Words
id|word
--+----
1 |Test
2 |Onsite
3 |Here
4 |As
TestSite_Info
ExText
-------------------------------------------------
This is a test, onsite test , test test i am here
The expected Final table has been given at the top.
-- Update
Now that I have run Abecee block of code this actually works in terms of bringing back a count column and the id relating to the word.
Here are the results :
id|total
--+----
169 |3
170 |0
171 |5
172 |7
173 |1
174 |3
Taken from the following text which it is extracting from :
Test test and I went to and this was a test I'm writing rubbish hello
but I don't care about care and care seems care to be the word that you will see appear
four times as well as word word word word word, but a .!
who knows whats going on here.
So as you can see, the count for ID 172 appears 7 times (as a reference please see below to what ID numbers relate to in terms of words) which is incorrect it should appear appear 6 times (its added +1 for some reason) as well as ID 171 which is the word care, that appears 4 times but is showing up as 5 times on the count. Any ideas why this would be?
Also what I was really after was a way as you have quite kindly done of the table showing the ID and count BUT also showing the word it relates to as well in the final table, so I don't have to link back through the ID table to see what the actual word is.
Word|id
--+----
as |174
here |173
word |172
care |171
hello |170
test |169
You could work along the updated
WITH
Detail AS (
SELECT
W.id
, W.word
, T.extext
, (LEN(REPLACE(T.extext, ' ', ' ')) + 2
- LEN(REPLACE(' '
+ UPPER(REPLACE(REPLACE(REPLACE(REPLACE(T.extext, ' ', ' '), ':', ' '), '.', ' '), ',', ' '))
+ ' ', ' ' + UPPER(W.word) + ' ', '')) - 1
) / (LEN(W.word) + 2) count
FROM Table_Words W
JOIN TestSite_Info T
ON CHARINDEX(UPPER(W.word), UPPER(T.extext)) > 0
)
INSERT INTO Result
SELECT
id
, SUM(count) total
FROM Detail
GROUP BY id
;
(Had forgotten to count in the blanks added to the front and the end, missed a sign change, and got mixed up as for the length of the word(s) surrounded by blanks. Sorry about that. Thanks for testing it more thoroughly than I did originally!)
Tested on SQL Server 2008: Updated SQL Fiddle and 2012: Updated SQL Fiddle.
And with your test case as well.
It:
is pure SQL (no UDF required),
has room for some tuning:
Store words all lower / all upper case, unless case matters (Which would require to adjust the suggested solution.)
Store strings to check with all punctuation marks removed.
Please comment if and as further detail is required.
From what I could understand, this might do the job. However, things will be more clear if you post the schema
create table final(
word varchar(100),
count integer
);
insert into final (word, count)
select column1, count(*)
from table1, table2
where table1.column1 = table2.words
group by column1;
Thanks for all your help.
The best approach for this solution was :
WITH
Detail AS (
SELECT
W.id
, W.word
, T.extext
, (LEN(REPLACE(T.extext, ' ', ' ')) + 2
- LEN(REPLACE(' '
+ UPPER(REPLACE(REPLACE(REPLACE(REPLACE(T.extext, ' ', ' '), ':', ' '), '.', ' '), ',', ' '))
+ ' ', ' ' + UPPER(W.word) + ' ', '')) - 1
) / (LEN(W.word) + 2) count
FROM Table_Words W
JOIN TestSite_Info T
ON CHARINDEX(UPPER(W.word), UPPER(T.extext)) > 0
)
INSERT INTO Result
SELECT
id
, SUM(count) total
FROM Detail
GROUP BY id
Re-reading the problem description, the presented SELECT seems to try to align/join full "exText" strings with individual "words" - but based on equality. Then, it has "exText" in the SELECT list whilst "Result" seems to wait for individual words. (That would not fail the INSERT, though, as long as the field is not guarded by a foreign key constraint. But as the WHERE/JOIN is not likely to let any data through, this is probably never coming up as an issue anyway.)
For an alternative to the pure declarative approach, you might want to try along
INSERT INTO final (word, count)
SELECT
word
, SUM(dbo.WordRepeatedNumTimes(extext, word)) AS Count
FROM [dbo].[TestSite_Info], [dbo].[table_words]
WHERE CHARINDEX(UPPER([dbo].[table_words].Word), UPPER([dbo].[TestSite_Info].ExText)) > 0
GROUP BY word;
You have "Word" in your "Table_Words" description - but use "[dbo].[table_words].Words" in your WHERE condition…