SQL Server: OR statement within CASE statement

I have a stored procedure that is querying some employee records based on what the user sends over.
On the UI, the user will enter multiple data points such as email addresses, user IDs, or employee names. This stored procedure checks which data type they are providing and then searches that field in the database for the records.
Input to stored procedure:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<data>
<dataPoints>
<dataPoint>
<order>0</order>
<value>Jim Bob</value>
</dataPoint>
<dataPoint>
<order>1</order>
<value>Sally Jones</value>
</dataPoint>
</dataPoints>
</data>
</root>
Query:
@dataType VARCHAR (20), @data XML
AS
BEGIN
SET NOCOUNT ON;
BEGIN
-- Create a temp table
DECLARE @dataSet TABLE (data VARCHAR(100), [order] INT);
INSERT INTO @dataSet( data , [order] )
SELECT
ParamValues.x1.value('value[1]', 'VARCHAR(100)') ,
ParamValues.x1.value('order[1]', 'INT')
FROM
@data.nodes('/root/data/dataPoints/dataPoint') AS ParamValues(x1)
-- Search Employees
SELECT
ec.FirstName, ec.PreferredName, ec.LastName,
ec.NTID, ec.QID,
ec.DepartmentName, ec.SegmentName,
ec.CenterName, ec.RoleName, ec.MarketName,
ec.IncentivePlanName,
ec.CostCenterID,
ec.SupFirstName, ec.SupPreferredName, ec.SupLastName,
ec.SiloName,
ec.AreaName,
ec.PersonnelID,
d.[order]
FROM
Resources.emp.EmployeeComplete AS ec
INNER JOIN
@dataSet AS d ON d.data = CASE
WHEN @dataType = 'NTID' THEN ec.ntid
WHEN @dataType = 'QID' THEN ec.QID
WHEN @dataType = 'Emp ID' THEN ec.EmpID
WHEN @dataType = 'Email Address' THEN ec.Email
WHEN @dataType = 'Personnel ID' OR @dataType = 'Sap ID' THEN ec.PersonnelID
--WHEN @dataType = 'Name' THEN (
-- (ec.FirstName + ' ' + ec.LastName)
-- OR (ec.PreferredName + ' ' + ec.LastName)
-- OR (ec.LastName + ', ' + ec.FirstName)
-- OR (ec.LastName + ', ' + ec.PreferredName
-- )
END
FOR XML PATH ('employees'), ELEMENTS, TYPE, ROOT ('root');
In short, I take the multiple data points being searched and throw them into an XML string to pass to the stored procedure. Once they arrive, I put them into a temp table so that I can join that with my main employee records.
The problem / question:
You will see some commented-out code in my example, and this is where my issue is. There are three name fields in my database: First Name, Preferred Name, and Last Name.
I essentially need to test what the user provided and find employees based on the combination they entered. All the user selects in the UI is that they are providing a name, not the format that it's in.
For this reason, I need to check whether I can find records in a couple of different formats.
The issue is that I can't join my dataset using OR conditions in the CASE statement.
If @dataType = 'Name', I need to be able to join my temp table on a couple of the different possible combinations.
The one thing we do make them aware of is that they can't mix and match, meaning they can't combine a FirstName LastName search with a LastName, FirstName search.
I had trouble explaining this so please let me know if I need to somehow clarify.

Move the comparisons into the CASE. If one is true, let the THEN return 1, then check whether the CASE returned 1: if and only if it did, you've found a match. In the WHEN conditions you can use Boolean operators, so you can build your OR there (or use IN, as I did below).
...
CASE
WHEN @dataType = 'NTID' AND d.data = ec.ntid THEN 1
WHEN @dataType = 'QID' AND d.data = ec.QID THEN 1
...
WHEN (@dataType = 'Personnel ID' OR @dataType = 'Sap ID') AND d.data = ec.PersonnelID THEN 1
WHEN @dataType = 'Name' AND (d.data IN ('' + ec.FirstName + ' ' + ec.LastName,
'' + ec.PreferredName + ' ' + ec.LastName,
'' + ec.LastName + ', ' + ec.FirstName,
'' + ec.LastName + ', ' + ec.PreferredName)) THEN 1
END = 1
...
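To make the CASE ... END = 1 pattern concrete, here is a minimal self-contained sketch. The table variables @employees and @matches and all sample rows are hypothetical stand-ins for the real Resources.emp.EmployeeComplete table, and only two of the data types are shown.

```sql
-- Hypothetical sample data; names and columns are simplified stand-ins.
DECLARE @dataType VARCHAR(20) = 'Name';

DECLARE @employees TABLE (
    NTID VARCHAR(20), FirstName VARCHAR(50),
    PreferredName VARCHAR(50), LastName VARCHAR(50)
);
INSERT INTO @employees VALUES
    ('jbob',   'James', 'Jim',   'Bob'),
    ('sjones', 'Sally', 'Sally', 'Jones'),
    ('tsmith', 'Tom',   'Tommy', 'Smith');

DECLARE @dataSet TABLE (data VARCHAR(100), [order] INT);
INSERT INTO @dataSet VALUES ('Jim Bob', 0), ('Jones, Sally', 1);

DECLARE @matches TABLE (NTID VARCHAR(20), [order] INT);

-- The CASE yields 1 only when the branch for the active @dataType matches.
INSERT INTO @matches (NTID, [order])
SELECT ec.NTID, d.[order]
FROM @employees AS ec
INNER JOIN @dataSet AS d
    ON CASE
           WHEN @dataType = 'NTID' AND d.data = ec.NTID THEN 1
           WHEN @dataType = 'Name'
                AND d.data IN (ec.FirstName + ' ' + ec.LastName,
                               ec.PreferredName + ' ' + ec.LastName,
                               ec.LastName + ', ' + ec.FirstName,
                               ec.LastName + ', ' + ec.PreferredName) THEN 1
       END = 1;

SELECT NTID, [order] FROM @matches ORDER BY [order];
```

With this sample data, 'Jim Bob' is found through PreferredName + LastName and 'Jones, Sally' through LastName + FirstName, while the third employee is not returned.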

What about just putting ORs in your join?
FROM @dataSet AS d
INNER JOIN Resources.emp.EmployeeComplete AS ec
ON (@dataType = 'NTID' AND ec.ntid = d.data)
OR (@dataType = 'QID' AND ec.QID = d.data)
OR (@dataType = 'Emp ID' AND ec.EmpID = d.data)
OR (@dataType = 'Email Address' AND ec.Email = d.data)
OR ((@dataType = 'Personnel ID' OR @dataType = 'Sap ID') AND ec.PersonnelID = d.data)
OR (@dataType = 'Name' AND (ec.FirstName + ' ' + ec.LastName) = d.data)
OR (@dataType = 'Name' AND (ec.PreferredName + ' ' + ec.LastName) = d.data)
OR (@dataType = 'Name' AND (ec.LastName + ', ' + ec.FirstName) = d.data)
OR (@dataType = 'Name' AND (ec.LastName + ', ' + ec.PreferredName) = d.data)
I'm not sure whether SQL Server is smart enough to use the proper indexes on those fields if you have them, especially when columns are combined. You could create indexed views for those combined names, or computed columns that you can index. It may be better to split the search into separate queries in IF/ELSE branches, so SQL Server can optimize each query for the field you are joining on and execute just that one query.
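A rough sketch of the IF/ELSE idea, assuming hypothetical table variables (@employees, @matches) and sample rows in place of the real employee table. Table variables carry none of the indexes discussed above, so this only illustrates the shape of the branching, not the performance effect.

```sql
-- Hypothetical sample data standing in for Resources.emp.EmployeeComplete.
DECLARE @dataType VARCHAR(20) = 'QID';

DECLARE @employees TABLE (
    NTID VARCHAR(20), QID VARCHAR(20),
    FirstName VARCHAR(50), LastName VARCHAR(50)
);
INSERT INTO @employees VALUES
    ('jbob',   'Q100', 'Jim',   'Bob'),
    ('sjones', 'Q200', 'Sally', 'Jones');

DECLARE @dataSet TABLE (data VARCHAR(100), [order] INT);
INSERT INTO @dataSet VALUES ('Q200', 0), ('Q999', 1);

DECLARE @matches TABLE (NTID VARCHAR(20), [order] INT);

-- Each branch joins on exactly one column, so each statement gets its own plan.
IF @dataType = 'NTID'
    INSERT INTO @matches (NTID, [order])
    SELECT ec.NTID, d.[order]
    FROM @employees AS ec
    INNER JOIN @dataSet AS d ON ec.NTID = d.data;
ELSE IF @dataType = 'QID'
    INSERT INTO @matches (NTID, [order])
    SELECT ec.NTID, d.[order]
    FROM @employees AS ec
    INNER JOIN @dataSet AS d ON ec.QID = d.data;
ELSE IF @dataType = 'Name'
    INSERT INTO @matches (NTID, [order])
    SELECT ec.NTID, d.[order]
    FROM @employees AS ec
    INNER JOIN @dataSet AS d
        ON d.data IN (ec.FirstName + ' ' + ec.LastName,
                      ec.LastName + ', ' + ec.FirstName);

SELECT NTID, [order] FROM @matches ORDER BY [order];
```

The trade-off is more repeated SQL in exchange for simpler join predicates; whether it actually runs faster depends on the real indexes and data.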

Related

Iterate through databases with similar tables but slightly different column names

In SQL Server, I am trying to consolidate multiple databases with similar relational table structure into one database. Let's say each database contains a table called items and each items table has a column for the name of an item. However, this column name varies depending on the database we are in. Some databases may have Item_Name while others may have ItemName or Name, etc.
In the consolidated database, I am trying to append all of the items tables into one table, and I want to create a CASE expression that checks if a column name exists in the items table from any of the databases, and if it does, then to return the results from that table column.
This is what I have at the moment:
DECLARE @sql varchar(max)
SELECT @sql = @sql + 'UNION ALL
SELECT ''' + name + ''' AS DatabaseName,
CASE WHEN EXISTS (SELECT COLUMN_NAME
FROM ' + QUOTENAME(name) + '.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = ''items'' AND
COLUMN_NAME = ''Item_Name'')
THEN Item_Name
WHEN EXISTS (SELECT COLUMN_NAME
FROM ' + QUOTENAME(name) + '.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = ''items'' AND
COLUMN_NAME = ''ItemName'')
THEN ItemName
...
END AS NameOfItem
FROM ' + QUOTENAME(name) + 'dbo.Items'
FROM master.sys.databases
WHERE name <> 'master' ...
SET @sql = STUFF(@sql, 1, LEN(' UNION ALL'), '')
EXEC(@sql)
I believe this throws an error because of the CASE expression, since in the THEN clauses I am specifying a column that may not exist in that current table, even though it wouldn't satisfy the condition in the WHEN clause. How would I resolve this issue?

MS SQL Server - replace names while avoiding words containing the names

This is my first time posting on Stack Overflow, so please let me know if I can do anything better or provide more information.
I have been working on this issue for a few days now. I have a table with comments from employees about the company. Some of them could refer to specific employees in the company. For HR reasons, we want to replace any occurrence of an employee name with the word 'employee'. We aren't accounting for typos or misspellings.
An example of my desired outcome would be:
Input: 'I dislike dijon mustard. My boss Jon sucks.'
Name to search for: 'Jon'
Output: 'I dislike dijon mustard. My boss employee sucks.'
Another example:
Input: 'Aggregating data is boring. Greg is the worst person ever.'
Name to search for: 'Greg'
Output: 'Aggregating data is boring. employee is the worst person ever.'
I want to search the comments for occurrences of the employee names, but only if they aren't followed by other letters or numbers on either end. Occurrences with spaces or punctuation on either end of the name should be replaced.
So far I have tried the suggestions in the following threads:
How to replace a specific word in a sentence without replacing in substring in SQL Server
This yielded the following
update c
set c.Comment = rtrim(ltrim(Replace(replace(' ' + c.Comment + ' ',' ' + en.FirstName + ' ', 'employee'), ' ' + en.FirstName + ' ', 'employee')))
from AnswerComment c
join #EmployeeNames en on en.SurveyId = c.SurveyId
and c.Comment like '%' + en.FirstName + '%'
However, I got results like this:
Input: 'I hate bob.'
Name to search for: 'Bob'
Output: 'I hate bob.'
Input: 'Jon sucks'
Name to search for: 'Jon'
Output: 'employeesucks'
A coworker looked at this thread Replace whole word using ms sql server "replace"
and gave me the following based off of it:
DECLARE @token VARCHAR(10) = 'bob';
DECLARE @replaceToken VARCHAR(10) = 'employee';
DECLARE @paddedToken VARCHAR(10) = ' ' + @token + ' ';
DECLARE @paddedReplaceToken VARCHAR(10) = ' ' + @replaceToken + ' ';
;WITH Step1 AS (
SELECT CommentorId
, QuestionId
, Comment
, REPLACE(Comment, @paddedToken, @paddedReplaceToken) AS [Value]
FROM AnswerComment
WHERE SurveyId = 90492
AND Comment LIKE '%' + @token + '%'
), Step2 AS (
SELECT CommentorId
, QuestionId
, Comment
, REPLACE([Value], @paddedToken, @paddedReplaceToken) AS [Value]
FROM Step1
), Step3 AS (
SELECT CommentorId
, QuestionId
, Comment
, IIF(CHARINDEX(LTRIM(@paddedToken), [Value]) = 1, STUFF([Value], 1, LEN(TRIM(@paddedToken)), TRIM(@paddedReplaceToken)), [Value]) AS [Value]
FROM Step2
)
SELECT CommentorId
, QuestionId
, Comment
, IIF(CHARINDEX(REVERSE(RTRIM(@paddedToken)), REVERSE([Value])) = 1,
REVERSE(STUFF(REVERSE([Value]), CHARINDEX(REVERSE(RTRIM(@paddedToken)), REVERSE([Value])), LEN(RTRIM(@paddedToken)), REVERSE(RTRIM(@paddedReplaceToken)))),
[Value])
FROM Step3;
But I have no idea how I would implement this.
Another thread I can't find anymore suggested using %[^a-z0-9A-Z]% for searching, like this:
update c
set c.Comment = REPLACE(c.Comment, en.FirstName, 'employee')
from AnswerComment c
join #EmployeeNames en on en.SurveyId = c.SurveyId
and c.Comment like '%' + en.FirstName + '%'
and c.Comment not like '%[^a-z0-9A-Z]%' + en.FirstName + '%[^a-z0-9A-Z]%'
select @@ROWCOUNT [first names replaced]
This doesn't work for me. It replaces occurrences of the employee names even if they're part of a larger word, like in this example:
Input: 'I dislike dijon mustard.'
Name to search for: 'Jon'
Output: 'I dislike diemployee mustard.'
At this point it seems to me that it's impossible to accomplish this. Is there anything wrong with how I've implemented these, or anything obvious that I'm missing?
Here is a method that uses a combination of STUFF and PATINDEX.
It only replaces the first occurrence of the name in the comment,
so it might have to be executed more than once, until it no longer updates anything.
UPDATE c
SET c.Comment = STUFF(c.Comment, PATINDEX('%[^a-z0-9]'+en.FirstName+'[^a-z0-9]%', '/'+c.Comment+'/'), len(en.FirstName), 'employee')
FROM AnswerComment c
JOIN #EmployeeNames en ON en.SurveyId = c.SurveyId
WHERE '/'+c.Comment+'/' LIKE '%[^a-z0-9]'+en.FirstName+'[^a-z0-9]%';
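One way to run that update repeatedly is a WHILE loop that stops once @@ROWCOUNT reports no changed rows. This is only a sketch: the table variables @comments and @names, the sample rows, and the cap of 100 passes are assumptions made for illustration. The cap guards against looping forever if a name ever equals the replacement word.

```sql
-- Hypothetical sample data standing in for AnswerComment and the names table.
DECLARE @comments TABLE (Id INT, Comment VARCHAR(200));
INSERT INTO @comments VALUES
    (1, 'Jon likes dijon. Ask Jon.'),
    (2, 'Aggregating data is boring.');

DECLARE @names TABLE (FirstName VARCHAR(50));
INSERT INTO @names VALUES ('Jon'), ('Greg');

DECLARE @pass INT = 0;

WHILE @pass < 100  -- arbitrary safety cap (assumption)
BEGIN
    -- Replace one whole-word occurrence per comment on each pass.
    UPDATE c
    SET c.Comment = STUFF(c.Comment,
                          PATINDEX('%[^a-z0-9]' + n.FirstName + '[^a-z0-9]%',
                                   '/' + c.Comment + '/'),
                          LEN(n.FirstName),
                          'employee')
    FROM @comments AS c
    JOIN @names AS n
        ON '/' + c.Comment + '/' LIKE '%[^a-z0-9]' + n.FirstName + '[^a-z0-9]%';

    -- Stop as soon as a pass changes nothing.
    IF @@ROWCOUNT = 0 BREAK;

    SET @pass += 1;
END;

SELECT Id, Comment FROM @comments ORDER BY Id;
```

In this sample, both standalone occurrences of Jon in the first comment are replaced over two passes, while dijon and Aggregating are left alone because the name is surrounded by other letters there.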
Something like this seems to work.
declare @charsTable table (notallowed char(1))
insert into @charsTable (notallowed) values (',')
insert into @charsTable (notallowed) values ('.')
insert into @charsTable (notallowed) values (' ')
declare @input nvarchar(max) = 'Aggregating data is boring. Greg is the worst person ever.'
declare @name nvarchar(50) = 'Greg'
--declare @input nvarchar(max) = 'I dislike dijon mustard. You know who sucks? My boss Jon.'
--declare @name nvarchar(50) = 'Jon'
select case when @name + notallowed = value or notallowed + @name = value or notallowed + @name + notallowed = value then replace(value, @name, 'employee') else value end 'data()' from string_split(@input, ' ')
left join @charsTable on @name + notallowed = value or notallowed + @name = value or notallowed + @name + notallowed = value
for xml path('')
Results:
Aggregating data is boring. employee is the worst person ever.
I dislike dijon mustard. You know who sucks? My boss employee.

SQL: CASE statement without hard-coded values

I'm working on an issue with a stored procedure. I have the following:
SELECT @message = 'ID' + CAST(CASE WHEN @StoreID = 0 THEN 'BK'
WHEN @StoreID = 1 THEN 'MK'
END AS VARCHAR (50)) + char(13)
This sends an e-mail that says: ID: 'MK' or ID:'BK'.
Currently I'm hard-coding the CASE statement, but I need to pull the strings 'BK' and 'MK' from a different table altogether.
The @StoreID is from the Store_Orders table. The names are from the Store table.
One way I tried to get this was:
SELECT @message = 'ID' + StoreName from db.Store where StoreName = StoreID and StoreID = @StoreID
When I execute the code, it finds the correct store, but says:
Conversion failed when converting the varchar value 'MK' to data type int.
But I don't want to convert 'MK'; I want to display 'MK' by finding it via the @StoreID.
Are you looking for something like this:
SELECT CONCAT('ID: ' , CASE WHEN s.StoreID = 0
THEN (select '(''' + colname + ''')'
from othertable ot where ot.colname = s.colname)
WHEN s.StoreID = 1
THEN (select '(''' + colname + ''')'
from othertable ot where ot.colname = s.colname)
END) AS 'Message'
FROM Store_Orders s
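For what it's worth, the conversion error in the question comes from the predicate StoreName = StoreID: comparing a varchar column with an int column makes SQL Server convert the varchar side to int. A minimal sketch of a plain lookup without that predicate follows; the table variable @Store and its rows are hypothetical, and the real Store table's schema is assumed to have StoreID and StoreName columns as described in the question.

```sql
-- Hypothetical stand-in for the Store table.
DECLARE @Store TABLE (StoreID INT, StoreName VARCHAR(50));
INSERT INTO @Store VALUES (0, 'BK'), (1, 'MK');

DECLARE @StoreID INT = 1;
DECLARE @message VARCHAR(100);

-- Look the name up by ID only; no varchar-to-int comparison is involved.
SELECT @message = 'ID: ' + s.StoreName + CHAR(13)
FROM @Store AS s
WHERE s.StoreID = @StoreID;

SELECT @message AS Message;
```

If no store matches, @message stays NULL, so a real procedure would probably want a fallback value.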

SQL query returns multiple rows when trying to find specific value

I have two tables. One is called "Tasks" and the other is called "TaskDescription".
In "Tasks", the setup looks like this:
"taskID(primary)","FileID","TaskTypeID" and a bunch of other irrelevant columns.
Then in "TaskDescription", the setup looks like:
"TaskTypeID", "TaskTypeDesc"
So for example, if TaskTypeID is 1, then the description would be "admin",
or if TaskTypeID is 2, then TaskTypeDesc would be "Employee", etc.
The two tables have a relationship on the primary/foreign key "TaskTypeID".
What I am trying to do is get a task ID and the TaskTypeDesc where the FileID matches the @fileID (which I pass in as a param). However, my query returns multiple rows instead of a single row when I try to obtain the description.
this is my query:
SELECT taskid,
( 'Task ID: '
+ Cast(cf.taskid AS NVARCHAR(15)) + ' - '
+ Cast((SELECT DISTINCT td.tasktypedesc FROM casefiletaskdescriptions
td JOIN
casefiletasks cft ON td.tasktypeid=cft.tasktypeid WHERE cft.taskid =
1841 )AS
NVARCHAR(100))
+ ' - Investigator : ' + ( Cast(i.fname AS NVARCHAR(20)) + ' '
+ Cast(i.lname AS NVARCHAR(20)) ) ) AS
'Display'
FROM casefiletasks [cf]
JOIN investigators i
ON CF.taskasgnto = i.investigatorid
WHERE cf.fileid = 2011630988
AND cf.concluded = 0
AND cf.progressflag != 'Conclude'
I am trying to get the output to look like "Task ID: 1234 - Admin - Investigator : John Doe". However I am having trouble on this part:
CAST((select DISTINCT td.TaskTypeDesc from CaseFileTaskDescriptions td
JOIN CaseFileTasks cft ON td.TaskTypeID=cft.TaskTypeID
where cft.TaskID =1841 )as nvarchar(100))
This seems to work, but the problem is that I have to hard-code the value "1841" to make it work. Is there a way to assign a "taskID" variable the values being returned from the TaskID select query, or will that not work, since I think SQL runs everything at once instead of line by line?
EDIT-this is in Microsoft SQL Server Management Studio 2008
You can dynamically reference a column that exists in your FROM set. In this case, it would be any column from casefiletasks or investigators. You would replace 1841 with the table.column reference.
Update
Replacing your static integer with the column reference, your query would look like:
SELECT taskid,
( 'Task ID: '
+ Cast(cf.taskid AS NVARCHAR(15)) + ' - '
+ Cast((SELECT DISTINCT td.tasktypedesc FROM casefiletaskdescriptions
td JOIN
casefiletasks cft ON td.tasktypeid=cft.tasktypeid WHERE cft.taskid =
cf.taskid )AS
NVARCHAR(100))
+ ' - Investigator : ' + ( Cast(i.fname AS NVARCHAR(20)) + ' '
+ Cast(i.lname AS NVARCHAR(20)) ) ) AS
'Display'
FROM casefiletasks [cf]
JOIN investigators i
ON CF.taskasgnto = i.investigatorid
WHERE cf.fileid = 2011630988
AND cf.concluded = 0
AND cf.progressflag != 'Conclude'
Would this work as your inner query?
SELECT DISTINCT td.TaskTypeDesc FROM CaseFileTaskDescriptions td
JOIN CaseFileTasks cft ON td.TaskTypeID = cft.TaskTypeID
WHERE cft.TaskID = cf.TaskID
Why not just do another join instead of a subquery?
SELECT taskid,
( 'Task ID: '
+ Cast(cf.taskid AS NVARCHAR(15)) + ' - '
+ Cast(td.tasktypedesc AS NVARCHAR(100))
+ ' - Investigator : ' + ( Cast(i.fname AS NVARCHAR(20)) + ' '
+ Cast(i.lname AS NVARCHAR(20)) ) ) AS
'Display'
FROM casefiletasks [cf]
JOIN investigators i
ON CF.taskasgnto = i.investigatorid
JOIN casefiletaskdescriptions td
ON td.tasktypeid = cf.tasktypeid
WHERE cf.fileid = 2011630988
AND cf.concluded = 0
AND cf.progressflag != 'Conclude'

How to optimize MSSQL CASE WHEN queries

Here's a sample of my code:
SET @variable_out =
'Report: '
+ CASE WHEN (SELECT name FROM person WITH(NOLOCK) WHERE person_id = @person_id) != ''
THEN 'Name: ' + (SELECT name FROM person WITH(NOLOCK) WHERE person_id = @person_id) + CHAR(13)+CHAR(10)
ELSE 'Name: not found' + CHAR(13)+CHAR(10)
END
+ CASE WHEN (SELECT home_phone FROM person WITH(NOLOCK) WHERE person_id = @person_id) != ''
THEN 'Phone #: ' + (SELECT home_phone FROM person WITH(NOLOCK) WHERE person_id = @person_id) + CHAR(13)+CHAR(10)
ELSE 'Phone #: not found' + CHAR(13)+CHAR(10)
END
etc...
As you can see, I am redundantly performing two selects for each CASE WHEN... of the variable that I am constructing, and I would love to collapse this down to only one select for each line.
The only solution I know of would be to create a separate variable for each CASE WHEN..., run all of the selects beforehand, and then, if the variables aren't empty, concatenate them into @variable_out.
Is there a more clever way to accomplish this?
DECLARE @name the_same_datatype_as_name_field_from_person_table --Ex. VARCHAR(100)
,@home_phone the_same_datatype_as_homephone_field_from_person_table; --Ex. VARCHAR(15)
SELECT @name = NULLIF(p.name,''), @home_phone = NULLIF(p.home_phone,'')
FROM person p --WITH(NOLOCK)
WHERE p.person_id = @person_id;
SET @variable_out =
'Report: '
+ ISNULL('Name: ' + @name, 'Name: not found')
+ CHAR(13)+CHAR(10)
+ ISNULL('Phone #: ' + @home_phone, 'Phone #: not found')
+ CHAR(13)+CHAR(10);
Note:
Alternatively, you can declare the @name and @home_phone variables with the same data type as the @variable_out variable (e.g. VARCHAR).
Consider the pros and cons of NOLOCK.