SQL Server Full Text Search - Create one computed column - sql-server-2005

I am currently working on a project where I want to search for employees via just one input search term. For this I am using the SQL FTS.
The table schema looks like this
Employee table
EmployeeId, Firstname, Lastname
Sample data
1, John, Miller
2, Chuck, Norris
Address table
AddressId, EmployeeId, CityId, Street, StreetNumber
Sample data
1, 1, 1, Avenue, 12
2, 2, 2, Wimbledon Rd, 12
City table
CityId, Name, ZipCode
Sample data
1, Hamburg, 22335
2, London, 12345
So now I got the following search term:
John Hamburg: Means John AND Hamburg and should return 1 record.
John London: Means John AND London and should return 0 records since there is no John in London.
Norris Wimbledon: Means Norris AND Wimbledon and should return 1 record.
Now the problem is that CONTAINSTABLE only allows searching one table at a time. So applying "John AND Hamburg" to the Employee full-text catalog returns 0 records, since "Hamburg" is located in the City table.
So currently I can only use "OR" instead of "AND", like:
SELECT
(keyTblSp.RANK * 3) AS [Rank],
sp.*
FROM Employee sp
INNER JOIN
CONTAINSTABLE(Employee, *, 'John OR Hamburg', 1000) AS keyTblSp
ON sp.EmployeeId = keyTblSp.[KEY]
UNION ALL
SELECT
(keyTbl.RANK * 2) AS [Rank],
sp.*
FROM Employee sp
LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = sp.EmployeeId
INNER JOIN
CONTAINSTABLE([Address], *, 'John OR Hamburg', 1000) AS keyTbl
ON addr.AddressId = keyTbl.[KEY]
UNION ALL
SELECT
(keyTbl.RANK * 2) AS [Rank],
sp.*
FROM Employee sp
LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = sp.EmployeeId
LEFT OUTER JOIN [City] cty ON cty.CityId = addr.CityId
INNER JOIN
CONTAINSTABLE([City], *, 'John OR Hamburg', 1000) AS keyTbl
ON cty.CityId = keyTbl.[KEY]
As a result, not just the John who lives in Hamburg is returned, but every person named John and every person who lives in Hamburg.
One solution I could think of is to somehow compute a column in the Employee table that holds all the values needed for the full-text search, like:
Employee table
EmployeeId, Firstname, Lastname, FulltextColumn
Sample data
1 | John | Miller | John Miller Avenue 12 Hamburg 22335
So then I could do
SELECT
(keyTbl.RANK) AS [Rank],
sp.*
FROM Employee sp
INNER JOIN
CONTAINSTABLE([Employee], FulltextColumn, 'John AND Hamburg', 1000) AS keyTbl
ON sp.EmployeeId = keyTbl.[KEY]
Is this possible? Any other ideas?

You could use a join to require a match in both the address and the person's name.
SELECT
(keyTblSp.RANK * 3) AS [Rank],
sp.*
FROM Employee sp
INNER JOIN
CONTAINSTABLE(Employee, *, 'John OR Hamburg', 1000) AS keyTblSp
ON sp.EmployeeId = keyTblSp.[KEY]
join
(
SELECT
(keyTbl.RANK * 2) AS [Rank],
sp.*
FROM Employee sp
LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = sp.EmployeeId
INNER JOIN
CONTAINSTABLE([Address], *, 'John OR Hamburg', 1000) AS keyTbl
ON addr.AddressId = keyTbl.[KEY]
UNION ALL
SELECT
(keyTbl.RANK * 2) AS [Rank],
sp.*
FROM Employee sp
LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = sp.EmployeeId
LEFT OUTER JOIN [City] cty ON cty.CityId = addr.CityId
INNER JOIN
CONTAINSTABLE([City], *, 'John OR Hamburg', 1000) AS keyTbl
ON cty.CityId = keyTbl.[KEY]
) addr_matches
on addr_matches.EmployeeId = sp.EmployeeId
I think this would give you the results you specified. Obviously, though, it requires both a name and an address search term for a search to return any results. You didn't specify what happens if someone just searches for 'John'; if you will always get both a name and an address, the above should work fine.

I think the computed column is your best option. It'll be the most flexible, given that you don't know which tokens will be in the search query, it'll perform better, and your stored procedure will be smaller.
In order to create a computed column based on data in another table, you will have to create it using a UDF (user defined function) like this:
CREATE FUNCTION dbo.udf_ComputedColumnFunction (
    @EmployeeId INT
)
RETURNS VARCHAR(1000)
AS
BEGIN
    DECLARE @RET VARCHAR(1000)
    SELECT
        @RET = e.FirstName + ' ' + e.LastName + ' ' + a.Street + ' ' + a.StreetNumber + ' ' + c.Name + ' ' + c.ZipCode
    FROM Employee e
    INNER JOIN Address a ON a.EmployeeId = e.EmployeeId
    INNER JOIN City c ON c.CityId = a.CityId
    WHERE e.EmployeeId = @EmployeeId -- only build the string for the requested employee
    RETURN @RET
END
GO
ALTER TABLE Employee
ADD SearchColumn AS dbo.udf_ComputedColumnFunction(EmployeeId)
If you don't want to do that, you could:
Create an indexed view and add the FullText index onto that.
Create a lookup table populated by a trigger or by periodically running a stored procedure.

I think you should create an indexed view that combines all the columns you want to search into one single column, separated by spaces or dashes, as both are ignored by the SQL Server 2005 full-text engine. Then create a full-text index on that indexed view.
CONTAINSTABLE does not apply FORMSOF(INFLECTIONAL, ...) or FORMSOF(THESAURUS, ...) by default. These two are good options to configure and use.
If you only want "OR" semantics, use FREETEXTTABLE, as it applies both thesaurus and inflectional forms by default.
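The indexed-view suggestion could look roughly like the sketch below. This is not tested against the asker's schema: the names vEmployeeSearch, SearchText, IX_vEmployeeSearch and EmployeeCatalog are made up for illustration, the column types are assumed (the CASTs are only needed if StreetNumber and ZipCode are numeric), and it assumes the full-text catalog already exists, AddressId is a non-nullable key, and none of the concatenated columns is NULL.

```sql
-- Sketch only: vEmployeeSearch, SearchText, IX_vEmployeeSearch and
-- EmployeeCatalog are hypothetical names; column types are assumed.
CREATE VIEW dbo.vEmployeeSearch
WITH SCHEMABINDING   -- required before a view can be indexed
AS
SELECT
    a.AddressId,     -- one row per address, so this can act as the unique key
    e.EmployeeId,
    e.Firstname + ' ' + e.Lastname + ' ' + a.Street + ' '
        + CAST(a.StreetNumber AS VARCHAR(10)) + ' '
        + c.Name + ' ' + CAST(c.ZipCode AS VARCHAR(10)) AS SearchText
FROM dbo.Employee e
INNER JOIN dbo.[Address] a ON a.EmployeeId = e.EmployeeId
INNER JOIN dbo.City c ON c.CityId = a.CityId;
GO
-- The unique clustered index materializes the view and serves as the full-text key.
CREATE UNIQUE CLUSTERED INDEX IX_vEmployeeSearch
    ON dbo.vEmployeeSearch (AddressId);
GO
CREATE FULLTEXT INDEX ON dbo.vEmployeeSearch (SearchText)
    KEY INDEX IX_vEmployeeSearch ON EmployeeCatalog;
GO
-- "AND" now works because both terms live in the same indexed column.
SELECT v.EmployeeId, k.[RANK]
FROM CONTAINSTABLE(dbo.vEmployeeSearch, SearchText, 'John AND Hamburg') AS k
INNER JOIN dbo.vEmployeeSearch v ON v.AddressId = k.[KEY];
```

An employee with several addresses appears once per address in this view, so the final query may need a DISTINCT or GROUP BY on EmployeeId.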

Related

SQL hierarchyid type issue

The full task reads: display the employees whose immediate supervisor is younger and has worked at the company for less time.
Columns:
Manager name | Manager's hiring date | Manager's date of birth
Employee name | Employee's hiring date | Employee's date of birth
I already broke my head here:
SELECT
LastName + ' ' + FirstName AS SupervisorFullName,
HireDate,
BirthDate,
(SELECT LastName + ' ' + FirstName
FROM HumanResources.Employee AS subHrE
WHERE HrE.OrganizationNode.IsDescendantOf(subHrE.OrganizationNode) = 1
AND HrE.OrganizationLevel = HrE.OrganizationNode.GetLevel() + 1) AS EmployeeFullName,
(SELECT HireDate
FROM HumanResources.Employee AS subHrE
WHERE HrE.OrganizationNode.IsDescendantOf(subHrE.OrganizationNode) = 1
AND HrE.OrganizationLevel = HrE.OrganizationNode.GetLevel() + 1) AS HireDateEmp,
(SELECT BirthDate
FROM HumanResources.Employee AS subHrE
WHERE HrE.OrganizationNode.IsDescendantOf(subHrE.OrganizationNode) = 1
AND HrE.OrganizationLevel = HrE.OrganizationNode.GetLevel() + 1) AS BithDateEmp
FROM
HumanResources.Employee as HrE
JOIN
Person.Person as P ON HrE.BusinessEntityID = P.BusinessEntityID
ORDER BY
SupervisorFullName ASC
The AdventureWorks2016 database is used for this.
Full Schema AW2016
There's definitely a certain way to think about use of hierarchyid that, once you get the hang of it, opens a lot of possibilities. Here's what I came up with:
WITH FullPerson AS (
SELECT CONCAT_WS(' ', p.FirstName, p.MiddleName, p.LastName) AS [FullName],
e.HireDate,
e.BirthDate,
e.OrganizationNode
FROM HumanResources.Employee AS e
JOIN Person.Person AS p
ON p.BusinessEntityID = e.BusinessEntityID
)
SELECT
manager.OrganizationNode.ToString(),
manager.FullName,
manager.HireDate,
manager.BirthDate,
subordinate.OrganizationNode.ToString(),
subordinate.FullName,
subordinate.HireDate,
subordinate.BirthDate
FROM FullPerson AS subordinate
JOIN FullPerson AS manager
ON subordinate.OrganizationNode.GetAncestor(1) = manager.OrganizationNode
WHERE manager.HireDate > subordinate.HireDate
AND manager.BirthDate > subordinate.BirthDate;
Breaking it down, I create a common table expression to join Employee and Person as I'll need columns from both tables for both subordinates and their managers. The real trick is the join condition. subordinate.OrganizationNode.GetAncestor(1) = manager.OrganizationNode says "take the subordinate's OrganizationNode and go one level up the tree". What's amazing to me is that this sort of query can be supported by indexes and indeed there is an index on that column in the AdventureWorks schema! In addition to the columns you asked for, I added a human-readable representation of OrganizationNode to help with the visualization of how the data relates.
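To see what the join condition is doing in isolation, here is a small sketch using literal hierarchyid values instead of the AdventureWorks tables. The two node paths are invented for illustration.

```sql
-- Hypothetical nodes: an employee at /1/3/2/ and a candidate manager at /1/3/.
DECLARE @employee hierarchyid = hierarchyid::Parse('/1/3/2/');
DECLARE @manager  hierarchyid = hierarchyid::Parse('/1/3/');

SELECT
    @employee.GetAncestor(1).ToString() AS ParentNode,      -- one level up the tree
    @employee.GetLevel()                AS EmployeeLevel,   -- depth below the root
    CASE WHEN @employee.GetAncestor(1) = @manager
         THEN 1 ELSE 0 END              AS IsDirectReport,  -- the join condition used above
    @employee.IsDescendantOf(@manager)  AS IsAnywhereBelow; -- also 1 for deeper descendants
```

ParentNode comes back as /1/3/ and EmployeeLevel as 3. The difference between the last two columns is why GetAncestor(1) is the right tool for "immediate supervisor": IsDescendantOf would also match a manager's manager.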

SQL: How to concatenate two cells from one column of multiple rows if all of the other row's cells are equal

Looking for DepartmentName1 + ', ' + DepartmentName2
I'm trying to merge two rows into one row when only one column has different values. Specifically I'm trying to list the name, job title, gender, pay rate, hire date and department name of the top 100 highest paid employees of the AdventureWorks2017 database. Here is the code I have so far:
SELECT TOP 100 (P.FirstName + ' ' + P.LastName) AS Name, HRE.JobTitle, HRE.Gender,
CAST(HRPH.Rate AS Decimal(10,2)) AS PayRate, HRE.HireDate, HRD.Name AS Department
FROM ((((Person.Person AS P
INNER JOIN HumanResources.Employee AS HRE
ON P.BusinessEntityID = HRE.BusinessEntityID)
INNER JOIN
(SELECT BusinessEntityID, MAX(RateChangeDate) AS RCD, MAX(Rate) AS Rate
FROM HumanResources.EmployeePayHistory
GROUP BY BusinessEntityID) AS HRPH
ON HRE.BusinessEntityID = HRPH.BusinessEntityID)
INNER JOIN HumanResources.EmployeeDepartmentHistory AS HRDH
ON HRE.BusinessEntityID = HRDH.BusinessEntityID)
INNER JOIN HumanResources.Department AS HRD
ON HRDH.DepartmentID = HRD.DepartmentID)
ORDER BY HRPH.Rate DESC;
This gives me the following result:
Two questions:
How can I get every 'Name' to be listed only once, regardless of DepartmentName? For example: Rows 5 & 6 to be only Row 5: Laura Norman | Chief Financial Officer | F | 60.10 | 2009-01-31 | Executive, Finance.
OR, David Bradley...|...Marketing, Purchasing
Does my code include an employee that may have gotten a pay cut? Meaning, the RateChangeDate (RCD) is MAX but the Rate is not?
Using Microsoft SQL Server 2019
I bet you can make use of STRING_AGG() to aggregate the values with a delimiter into a single field.
SELECT TOP 100 (P.FirstName + ' ' + P.LastName) AS Name, HRE.JobTitle, HRE.Gender,
CAST(HRPH.Rate AS Decimal(10,2)) AS PayRate, HRE.HireDate, STRING_AGG(HRD.Name,',') AS Department
FROM ((((Person.Person AS P
INNER JOIN HumanResources.Employee AS HRE
ON P.BusinessEntityID = HRE.BusinessEntityID)
INNER JOIN
(SELECT BusinessEntityID, MAX(RateChangeDate) AS RCD, MAX(Rate) AS Rate
FROM HumanResources.EmployeePayHistory
GROUP BY BusinessEntityID) AS HRPH
ON HRE.BusinessEntityID = HRPH.BusinessEntityID)
INNER JOIN HumanResources.EmployeeDepartmentHistory AS HRDH
ON HRE.BusinessEntityID = HRDH.BusinessEntityID)
INNER JOIN HumanResources.Department AS HRD
ON HRDH.DepartmentID = HRD.DepartmentID)
GROUP BY P.FirstName,P.LastName,HRE.JobTitle, HRE.Gender, HRPH.Rate, HRE.HireDate
ORDER BY HRPH.Rate DESC;
To answer the second part, I took the liberty of creating an example that you may be able to work into your solution. The data you are working with lacks a unique key, and FirstName, LastName, and Gender together are obviously a bad candidate for one. You also mention RateChangeDate but do not say how to handle that value when the data is aggregated. The query below basically ignores RateChangeDate in the output and marks the records that have a decrease in pay. Another pass over the data is needed to remove those records; below I did it using a HAVING clause.
DECLARE @X TABLE (ID INT, Rate MONEY, RateChangeDate DATETIME, Department NVARCHAR(50))
INSERT @X VALUES
(1,25.00,'01/01/2021','A'),
(1,23.00,'05/01/2021','A'),
(2,25.00,'01/01/2021','A'),
(3,25.00,'01/01/2021','A'),
(3,26.00,'02/01/2021','A'),
(4,25.00,'01/01/2021','A'),
(4,25.00,'01/01/2021','B')
SELECT
    ID,
    SUM(LatestRate) AS LatestRate,
    MAX(MaxRateChange) AS RateChanges,
    Departments
FROM
(
    SELECT
        ID,
        STRING_AGG(Department,',') AS Departments,
        Rate,
        MAX(RateChangeDate) AS MaxRateChange,
        -- 1 when the previous rate for this ID was higher than the current one
        CASE WHEN LAG(Rate) OVER (PARTITION BY ID ORDER BY RateChangeDate) > Rate THEN 1 ELSE 0 END AS DecreaseInPay,
        -- keep the rate only on the row with the most recent change date
        CASE WHEN MAX(RateChangeDate) OVER (PARTITION BY ID) = RateChangeDate THEN Rate ELSE NULL END AS LatestRate
    FROM
        @X
    GROUP BY
        ID, Rate, RateChangeDate
) AS X
GROUP BY
    ID, Departments
HAVING
    MAX(DecreaseInPay) = 0
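On the pay-cut question itself: the subquery in the question takes MAX(RateChangeDate) and MAX(Rate) independently, so after a pay cut it pairs the latest date with the highest rate rather than the current one. If the goal is the current rate, one common pattern is to pick the most recent row per employee with ROW_NUMBER(). The sketch below uses a made-up table variable, @PayHistory, as a stand-in for HumanResources.EmployeePayHistory.

```sql
-- @PayHistory is a hypothetical stand-in for HumanResources.EmployeePayHistory.
DECLARE @PayHistory TABLE (BusinessEntityID INT, RateChangeDate DATETIME, Rate MONEY);
INSERT @PayHistory VALUES
    (1, '20210101', 25.00),
    (1, '20210501', 23.00),  -- pay cut: the latest rate is lower than the maximum
    (2, '20210101', 25.00),
    (2, '20210201', 26.00);

SELECT BusinessEntityID, RateChangeDate, Rate
FROM (
    SELECT BusinessEntityID, RateChangeDate, Rate,
           -- rn = 1 marks the most recent rate change for each employee
           ROW_NUMBER() OVER (PARTITION BY BusinessEntityID
                              ORDER BY RateChangeDate DESC) AS rn
    FROM @PayHistory
) AS latest
WHERE rn = 1;
```

For employee 1 this returns the 23.00 rate from May, whereas MAX(Rate) would have reported 25.00.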

SQL Join left join or left outer join

I have a question about SQL joins. I have a table employee with employeeid as the primary key and some other columns for the employee. There is another table called employeeaddress where employeeid is a foreign key. One employee can have many employeeaddresses, i.e. a one-to-many relationship.
If I want to write a query which will fetch the following columns
employee.employeeid, employee.empname,
employeeaddress.employeeaddressid, employeeaddress.addr1,
employeeaddress.addr2
So there can be an employee with no employeeaddress. But anyway I wanted to fetch all the employees who may have zero or multiple addresses.
Do I need to apply left join or left outer join? I want the following result for a table that has 2 employees John and Michael where John has two employeeaddresses with employeeaddressid 21 and 22 and Michael has no employeeaddress
1, John, 21, addr1 for John, addr2 for John
1, John, 22, another addr1 for John, another addr2 for John
2, Michael, NULL , NULL , NULL
The above result is arranged in the following fashion
employee.employeeid, employee.empname, employeeaddress.employeeaddressid, employeeaddress.addr1, employeeaddress.addr2
Please help.
Based on your description it sounds like you're looking for a query as follows. If you also wanted the address details, you'll just have to add a left join to the outer query.
Also, as comments have alluded to, LEFT JOIN is shorthand for LEFT OUTER JOIN; they will produce the same results.
SELECT *
FROM employee
INNER JOIN
(
    SELECT
        employee.employeeid,
        -- count only matched address rows, so employees without an address get 0
        COUNT(employeeaddress.employeeaddressid) AS addresscount
    FROM employee
    LEFT JOIN employeeaddress ON employeeaddress.employeeid = employee.employeeid
    GROUP BY employee.employeeid
) counts ON counts.employeeid = employee.employeeid
WHERE counts.addresscount = 0 -- Or 1, or 5 or > 1, etc.
LEFT JOIN should be all you need.
SQL Fiddle Example
SELECT e.employeeID ,
e.empName ,
ea.employeeAddressID ,
ea.addr1 ,
ea.addr2
FROM Employee e
LEFT JOIN EmployeeAddress ea ON ea.employeeID = e.employeeID
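To check this against the example from the question, here is a self-contained sketch that uses table variables in place of the real tables. The column types are assumed.

```sql
-- Table variables standing in for the employee and employeeaddress tables.
DECLARE @Employee TABLE (employeeID INT PRIMARY KEY, empName VARCHAR(50));
DECLARE @EmployeeAddress TABLE (employeeAddressID INT PRIMARY KEY, employeeID INT,
                                addr1 VARCHAR(50), addr2 VARCHAR(50));

INSERT @Employee VALUES (1, 'John'), (2, 'Michael');
INSERT @EmployeeAddress VALUES
    (21, 1, 'addr1 for John', 'addr2 for John'),
    (22, 1, 'another addr1 for John', 'another addr2 for John');

-- Every employee is kept; address columns are NULL where no address exists.
SELECT e.employeeID, e.empName, ea.employeeAddressID, ea.addr1, ea.addr2
FROM @Employee e
LEFT JOIN @EmployeeAddress ea ON ea.employeeID = e.employeeID
ORDER BY e.employeeID, ea.employeeAddressID;
```

This yields two rows for John (addresses 21 and 22) and one row for Michael with NULL in the three address columns, which matches the result asked for.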

Only return results from first table with two joins

I have 3 tables. I need to get lastname, firstname, and employee number from the first table and name from another table.
In order for me to get the name on table s there needs to be a match between the slsrep columns on table s and table sw.
The issue is that I only want to return the rows from the first table (p). There are only 700 records in the first table, but the query is pulling 900.
Basically, I just want to look at each row in the table p and match the name from table s.
This is what I currently have:
SELECT p.LastName,
p.FirstName,
p.EmpNo,
s.Name
FROM PDDA..PhoneDirectory p
LEFT OUTER JOIN nxtsql..swsmsn sw
ON p.EmpNo = sw.EmpNo
JOIN NxtSQL..SMSN s
ON sw.slsrep = s.slsrep
WHERE sw.statustype = 1
ORDER BY
p.LastName
There are lots of ways to do this. One is to use a sub-select to get s.Name:
SELECT p.LastName, p.FirstName, p.EmpNo, (
SELECT TOP 1 s.Name
FROM NxtSQL..SMSN s
INNER JOIN nxtsql..swsmsn sw
ON sw.slsrep = s.slsrep
WHERE p.EmpNo = sw.EmpNo
AND sw.statustype = 1
) AS Name
FROM PDDA..PhoneDirectory p
ORDER By p.LastName

how use distinct in second join table in sql server

I have a SQL table that consists of id, name, email, .... I have another SQL table that has id, email, emailstatus, but these two ids are different; they are not related. The only thing the two tables have in common is the email.
I would like to join these two tables, bringing all the info from table 1 and, if the email address in table 1 and table 2 is the same and emailstatus is 'Bounced', the info from table 2. But the query I am writing gives me more records than I expected, because there are multiple rows in tbl_Webhook (the second table) for each row in Applicant (the first table). I want to know if an applicant has EVER had an email bounce.
The query without the join returns 23000 records, but after the join it returns 42000 because of duplicates. How can I keep the same 23000 records and only add the info from the second table?
This is my query:
SELECT
A.[Id]
,A.[Application]
,A.[Loan]
,A.[Firstname]
,A.[Lastname]
,A.[Email]
,H.[Email], H.[EmailStatus] as BouncedEmail
FROM Applicant A (NOLOCK)
left outer join [tbl_Webhook] [H] (NOLOCK)
on A.Email = H.Email
and H.[event]='bounced'
This is a sample of the desired data:
id  email               name   emailFromTable2    emailstatus
1   test2@yahoo.com     lili   test2@yahoo.com    bounced
2   tesere@yahoo.com    mike   Null               Null
3   tedfd2@yahoo.com    nik    tedfd2@yahoo.com   bounced
4   tdfdft2@yahoo.com   sam    Null               Null
5   tedft2@yahoo.com    james  tedft2@yahoo.com   bounced
6   tedft2@yahoo.com    San    Null
Use a nested select for this type of query. I would write this as:
select id, application, loan, firstname, lastname, email,
       (case when BouncedEmail is not null then email end) as EmailFromTable2,
       BouncedEmail
from (SELECT A.[Id], A.[Application], A.[Loan], A.[Firstname], A.[Lastname], A.[Email],
             (case when exists (select 1
                                from tbl_WebHook h
                                where A.Email = h.Email and h.[event] = 'bounced'
                               )
                   then 'bounced'
              end) as BouncedEmail
      FROM Applicant A (NOLOCK)
     ) a
You can also do this with cross apply, but because you only really need one column, a correlated subquery also works.
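For comparison, the apply variant might look like the sketch below. It uses OUTER APPLY rather than CROSS APPLY, because CROSS APPLY would drop applicants that have no bounce. The table variables and their sample rows are invented for illustration and only carry the columns needed here.

```sql
-- Hypothetical stand-ins for Applicant and tbl_Webhook.
DECLARE @Applicant TABLE (Id INT, Firstname VARCHAR(50), Email VARCHAR(100));
DECLARE @Webhook   TABLE (Id INT, Email VARCHAR(100), [event] VARCHAR(20));

INSERT @Applicant VALUES (1, 'lili', 'test2@yahoo.com'), (2, 'mike', 'tesere@yahoo.com');
INSERT @Webhook VALUES
    (10, 'test2@yahoo.com',  'bounced'),
    (11, 'test2@yahoo.com',  'bounced'),    -- repeated bounce must not duplicate the applicant
    (12, 'tesere@yahoo.com', 'delivered');

SELECT A.Id, A.Firstname, A.Email,
       B.Email   AS EmailFromTable2,
       B.[event] AS BouncedEmail
FROM @Applicant A
OUTER APPLY (
    -- at most one bounce row per applicant, NULLs when there is none
    SELECT TOP (1) H.Email, H.[event]
    FROM @Webhook H
    WHERE H.Email = A.Email AND H.[event] = 'bounced'
) AS B;
```

The result has exactly one row per applicant: lili with a bounced email, mike with NULLs in the two webhook columns.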
;WITH DistinctEmails
AS
(
    -- keep only bounce events and number them per email address
    SELECT * , rn = ROW_NUMBER() OVER (PARTITION BY [Email] ORDER BY [Email])
    FROM [tbl_Webhook] (NOLOCK)
    WHERE [event] = 'bounced'
)
SELECT
A.[Id]
,A.[Application]
,A.[Loan]
,A.[Firstname]
,A.[Lastname]
,A.[Email]
,H.[Email], H.[EmailStatus] as BouncedEmail
FROM Applicant A (NOLOCK) left outer join DistinctEmails [H]
on A.Email = H.Email
and H.rn = 1 -- filter in the ON clause so applicants without a bounce are kept
I believe the query below should be enough to select the bounced emails for you:
SELECT
A.[Id]
,A.[Application]
,A.[Loan]
,A.[Firstname]
,A.[Lastname]
,A.[Email]
,H.[Email], H.[EmailStatus] as BouncedEmail
FROM Applicant A (NOLOCK)
Inner join [tbl_Webhook] [H] (NOLOCK)
on A.Email = H.Email
and H.[EmailStatus]='bounced'
Basically I just changed the join to an inner join and changed the second table's condition from event to EmailStatus. If you can provide your table structure and sample data, I believe I can help further.