Checking existence of all words words of a column of table 1 in other column of table 2 - sql

I have a table which contains product_name field. Then another table with models.
===products
product_id, product_name
===models
model_id, model_name
I am looking for a way to do the following.
Model names can have words separated by hyphen i.e JVC-600-BLACK
For each model I need to check the existence of each words of model in product name.
I'll need result in some where like below.
== results
model_id, product_id
If someone can point me in right direction, that would be a great help.
Notes
These are huge tables with about millions of records and number of
words in model_name are not fixed.
words in model may exist in any order or in between or other words in product name

Here's a function that splits the first string into parts using - as a delimiter and looks up each part in the second string, returning 1 if all parts were found and 0 otherwise.
CREATE FUNCTION dbo.func(#str1 varchar(max), #str2 varchar(max))
RETURNS BIT
AS
BEGIN
DECLARE #pos INT, #newPos INT,
#delimiter NCHAR(1)
SET #delimiter = '-'
SET #pos = 1
SET #newPos = 0
WHILE (#newPos < LEN(#str1))
BEGIN
SET #newPos = CHARINDEX(#delimiter, #str1, #pos)
IF #newPos = 0
SET #newPos = LEN(#str1)+1
DECLARE #data2 NVARCHAR(MAX)
SET #data2 = SUBSTRING(#str1, #pos, #newPos-#pos)
IF CHARINDEX(#data2, #str2) = 0
RETURN 0
SET #pos = #newPos + 1
IF #newPos = 0
BREAK
END
RETURN 1
END
You can use the above function for your problem as follows:
SELECT model_id, product_id
FROM models
JOIN products
ON dbo.func(models.model_name, products.product_name) = 1
It's not going to be fast, but I don't think a fast solution exists, since your problem doesn't allow for indexing. It may be possible to change the database structure to allow for this, but how exactly this can be done largely depends on what your data looks like.

I don't know if this solution is faster, for you to check if you care:
--=======================
-- sample data
-- ======================
declare #Products table
(
product_id int,
product_name nvarchar(max)
)
insert into #Products select 1, 'sdfsd def1 abc1klm1 sdljkfd'
insert into #Products select 2, 'sdfsd def2 abc2klm2 sdljkfd'
insert into #Products select 3, 'sdfsd def3 abc3klm3 sdljkfd'
declare #Models table
(
model_id int,
model_name nvarchar(max)
)
insert into #Models select 1, 'abc1-def1-klm1'
insert into #Models select 2, 'abc2-def2-klm2'
insert into #Models select 3, 'abc3-def3-klm3'
--=======================
-- solution
-- ======================
select t1.product_id, t2.model_id from #Products t1
cross join (
select
t1.model_id, Word = t2.r.value('.', 'nvarchar(max)')
from (select model_id, x = cast('<e>' + replace(model_name, '-', '</e><e>') + '</e>' as xml) from #Models ) t1
cross apply x.nodes('e') as t2 (r)
) t2
group by product_id, model_id
having min(charindex(word, product_name)) != 0

You may want to consider using the Full Text Search feature of SQL Server. In a nutshell, it catalogs all of the words (ignoring noise words like "and", "or", "a" and "the" by default but this list of noise worlds is configurable) in the tables and columns you specify when setting up the Full Text Catalog and offers a handful of functions that allow you to utilize that catalog to quickly find rows.

Related

SQL Server: efficiently search for many values on many to many columns?

I am creating a website using SQL Server. In the admin interface, I have two fields:
Subject: Math, English, History, ...
Grade: 1, 2, 3, 4, ...
Multiple values of a field can be assigned to a record.
Now in the frontend search, I would like a visitor to be able to select more than one value of a field for search. For example, someone may search for Subject being Math OR History and Grade being 1 OR 3.
What table design and SQL statement (or MS-proprietary statement) should I use to have efficient search?
Thanks and regards.
UPDATE
Thanks for all input!
I feel compelled to explain. I am technical and familiar with SQL. One thing I learned over my MANY years of programming experience is to be practical. For this question, I already have an initial design, but not sure how other folks to handle it for EFFICIENT SEARCH (there are always smarter folks out there). Here is my table design for storing a record:
Subject
type: varchar. record example: ,1,3, (each is the id of corresponding value)
Grade (this means applicable grade)
type: varchar. record example: ,1,2, (each is the id of corresponding values. this means a record is applicable to grade 1, 2)
Search example
where (subject LIKE '%,1,%' OR subject like '%,3,%') AND (grade like '%,1,%')
This design should lead to efficient search, but has drawbacks that it increases the complexity data management in the backend.
Another reason for my design is: the Subject and Grade each have a list of values that never/rarely change, and once a record is created, it never or rarely updates.
I am trying to strike a balance among simplicity, understandability, design, management, etc.
create table Subject (
SubjectId int identity(1, 1),
SubjectName nvarchar(255),
other fields.... )
create table GradingScale (
GradeId int identity(1, 1),
Grade int,
Description varchar(25),
other fields... )
create table Students (
StudentId int identity(1, 1),
StudentName nvarchar(255))
create table StudentGrades (
StudentId int,
SubjectId int,
GradeId int,
SemesterId int )
create FUNCTION [dbo].[fnArray] ( #Str varchar(max), #Delim varchar(1) = ' ', #RemoveDups bit = 0 )
returns #tmpTable table ( arrValue varchar(max))
as
begin
declare #pos integer
declare #lastpos integer
declare #arrdata varchar(8000)
declare #data varchar(max)
set #arrdata = replace(#Str,#Delim,'|')
set #arrdata = #arrdata + '|'
set #lastpos = 1
set #pos = 0
set #pos = charindex('|', #arrdata)
while #pos <= len(#arrdata) and #pos <> 0
begin
set #data = substring(#arrdata, #lastpos, (#pos - #lastpos))
if rtrim(ltrim(#data)) > ''
begin
if #RemoveDups = 0
begin
insert into #tmpTable ( arrValue ) values ( #data )
end
else
begin
if not exists( select top 1 arrValue from #tmpTable where arrValue = #data )
begin
insert into #tmpTable ( arrValue ) values ( #data )
end
end
end
set #lastpos = #pos + 1
set #pos = charindex('|', #arrdata, #lastpos)
end
return
end
select *
from Students st
inner join StudentGrades sg on sg.StudentId = st.StudentId
inner join Subject s on sg.SubjectId = s.SubjectId
inner join GradingScale gs on sg.GradeId = gs.GradeId
inner join dbo.fnArray(#subjects, ',') sArr on s.SubjectId = convert(int, sArr.arrValue)
inner join dbo.fnArray(#grades, ',') gArr on gs.GradeId = convert(int, gArr.arrValue)
obviously #subjectId and #gradeId could be passed in via some drop down selectors or however your UI is composed.
Edited to use dbo.fnArray, a little tool that can parse delimited strings into a list.
Now of course this would mean that if you had 2 subjects and 2 grades...like Show me all students that took ( Math and Science ) and scored ( 2 or 3 ) this would work. However if you wanted students who took Math and scored 2 or 3 or Students who took Science and scored a 3 you would have to rework the query.

What is the best way to join between two table which have coma seperated columns

Table1
ID Name Tags
----------------------------------
1 Customer1 Tag1,Tag5,Tag4
2 Customer2 Tag2,Tag6,Tag4,Tag11
3 Customer5 Tag6,Tag5,Tag10
and Table2
ID Name Tags
----------------------------------
1 Product1 Tag1,Tag10,Tag6
2 Product2 Tag2,Tag1,Tag5
3 Product5 Tag1,Tag2,Tag3
what is the best way to join Table1 and Table2 with Tags column?
It should look at the tags column which coma seperated on table 2 for each coma seperated tag on the tags column in the table 1
Note: Tables are not full-text indexed.
The best way is not to have comma separated values in a column. Just use normalized data and you won't have trouble with querying like this - each column is supposed to only have one value.
Without this, there's no way to use any indices, really. Even a full-text index behaves quite different from what you might thing, and they are inherently clunky to use - they're designed for searching for text, not meaningful data. In the end, you will not get much better than something like
where (Col like 'txt,%' or Col like '%,txt' or Col like '%,txt,%')
Using a xml column might be another alternative, though it's still quite a bit silly. It would allow you to treat the values as a collection at least, though.
I don't think there will ever be an easy and efficient solution to this. As Luaan pointed out, it is a very bad idea to store data like this : you lose most of the power of SQL when you squeeze what should be individual units of data into a single cell.
But you can manage this at the slight cost of creating two user-defined functions. First, use this brilliant recursive technique to split the strings into individual rows based on your delimiter :
CREATE FUNCTION dbo.TestSplit (#sep char(1), #s varchar(512))
RETURNS table
AS
RETURN (
WITH Pieces(pn, start, stop) AS (
SELECT 1, 1, CHARINDEX(#sep, #s)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(#sep, #s, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT pn AS SplitIndex,
SUBSTRING(#s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS SplitPart
FROM Pieces
)
Then, make a function that takes two strings and counts the matches :
CREATE FUNCTION dbo.MatchTags (#a varchar(512), #b varchar(512))
RETURNS INT
AS
BEGIN
RETURN
(SELECT COUNT(*)
FROM dbo.TestSplit(',', #a) a
INNER JOIN dbo.TestSplit(',', #b) b
ON a.SplitPart = b.SplitPart)
END
And that's it, here is a test roll with table variables :
DECLARE #A TABLE (Name VARCHAR(20), Tags VARCHAR(100))
DECLARE #B TABLE (Name VARCHAR(20), Tags VARCHAR(100))
INSERT INTO #A ( Name, Tags )
VALUES
( 'Customer1','Tag1,Tag5,Tag4'),
( 'Customer2','Tag2,Tag6,Tag4,Tag11'),
( 'Customer5','Tag6,Tag5,Tag10')
INSERT INTO #B ( Name, Tags )
VALUES
( 'Product1','Tag1,Tag10,Tag6'),
( 'Product2','Tag2,Tag1,Tag5'),
( 'Product5','Tag1,Tag2,Tag3')
SELECT * FROM #A a
INNER JOIN #B b ON dbo.MatchTags(a.Tags, b.Tags) > 0
I developed a solution as follows:
CREATE TABLE [dbo].[Table1](
Id int not null,
Name nvarchar(250) not null,
Tag nvarchar(250) null,
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Table2](
Id int not null,
Name nvarchar(250) not null,
Tag nvarchar(250) null,
) ON [PRIMARY]
GO
get sample data for Table1, it will insert 28000 records
INSERT INTO Table1
SELECT CustomerID,CompanyName, (FirstName + ',' + LastName)
FROM AdventureWorks.SalesLT.Customer
GO 3
sample data for Table2.. i need same tags for Table2
declare #tag1 nvarchar(50) = 'Donna,Carreras'
declare #tag2 nvarchar(50) = 'Johnny,Caprio'
get sample data for Table2, it will insert 9735 records
INSERT INTO Table2
SELECT ProductID,Name, (case when(right(ProductID,1)>=5) then #tag1 else #tag2 end)
FROM AdventureWorks.SalesLT.Product
GO 3
My Solution
create TABLE #dt (
Id int IDENTITY(1,1) PRIMARY KEY,
Tag nvarchar(250) NOT NULL
);
I've create temp table and i will fill with Distinct Tag-s in Table1
insert into #dt(Tag)
SELECT distinct Tag
FROM Table1
Now i need to vertical table for tags
create TABLE #Tags ( Tag nvarchar(250) NOT NULL );
Now i'am fill #Tags table with While, you can use Cursor but while is faster
declare #Rows int = 1
declare #Tag nvarchar(1024)
declare #Id int = 0
WHILE #Rows>0
BEGIN
Select Top 1 #Tag=Tag,#Id=Id from #dt where Id>#Id
set #Rows =##RowCount
if #Rows>0
begin
insert into #Tags(Tag) SELECT Data FROM dbo.StringToTable(#Tag, ',')
end
END
last step : join Table2 with #Tags
select distinct t.*
from Table2 t
inner join #Tags on (',' + t.Tag + ',') like ('%,' + #Tags.Tag + ',%')
Table rowcount= 28000 Table2 rowcount=9735 select is less than 2 second
I use this kind of solution with paths of trees. First put a comma at the very begin and at the very end of the string. Than you can call
Where col1 like '%,' || col2 || ',%'
Some database index the column also for the like(postgres do it partially), therefore is also efficient. I don't know sqlserver.

How to get query result with stored procedure (convert item quantity from one table into my unit defined in second table)

I have two MSSQL2008 tables like this:
I have problem on the unit conversion logic.
The result I expect like this :
1589 cigar = 1ball, 5slop, 8box, 2pcs
52 pen = 2box, 12pcs
Basically I'm trying to take number (qty) from one table and to convert (split) him into the units which I defined in other table!
Note : Both table are allowed to add new row and new data (dinamic)
How can I get these results through a SQL stored procedure?
i totally misunderstand the question lest time so previous answer is removed (you can see it in edit but it's not relevant for this question)... However i come up with solution that may solve your problem...
NOTE: one little think about this solution, if you enter the value in second table like this
+--------+-------+
| Item | qty |
+--------+-------+
| 'cigar'| 596 |
+--------+-------+
result for this column will be
598cigar = 0ball, 5slop, 8box, 0pcs
note that there is a ball and pcs is there even if their value is 0, that probably can be fix if you don't want to show that value but I let you to play with it...
So let's back to solution and code. Solution have two stored procedures first one is the main and that one is the one you execute. I call it sp_MainProcedureConvertMe. Here is a code for that procedure:
CREATE PROCEDURE sp_MainProcedureConvertMe
AS
DECLARE #srcTable TABLE(srcId INT IDENTITY(1, 1), srcItem VARCHAR(50), srcQty INT)
DECLARE #xTable TABLE(xId INT IDENTITY(1, 1), xVal1 VARCHAR(1000), xVal2 VARCHAR(1000))
DECLARE #maxId INT
DECLARE #start INT = 1
DECLARE #sItem VARCHAR(50)
DECLARE #sQty INT
DECLARE #val1 VARCHAR(1000)
DECLARE #val2 VARCHAR(1000)
INSERT INTO #srcTable (srcItem, srcQty)
SELECT item, qty
FROM t2
SELECT #maxId = (SELECT MAX(srcId) FROM #srcTable)
WHILE #start <= #maxId
BEGIN
SELECT #sItem = (SELECT srcItem FROM #srcTable WHERE srcId = #start)
SELECT #sQty = (SELECT srcQty FROM #srcTable WHERE srcId = #start)
SELECT #val1 = (CAST(#sQty AS VARCHAR) + #sItem)
EXECUTE sp_ConvertMeIntoUnit #sItem, #sQty, #val2 OUTPUT
INSERT INTO #xTable (xVal1, xVal2)
VALUES (#val1, #val2)
SELECT #start = (#start + 1)
CONTINUE
END
SELECT xVal1 + ' = ' + xVal2 FROM #xTable
GO
This stored procedure have two variables as table #srcTable is basically your second table but instead of using id of your table it's create new srcId which goes from 1 to some number and it's auto_increment it's done because of while loop to avoid any problems when there is some deleted values etc. so we wanna be sure that there wont be any skipped number or something like that.
There is few more variables some of them is used to make while loop work other one is to store data. I think it's not hard to figure out from code what are they used for...
While loop iterate throughout all rows from #srcTable take values processing them and insert them into #xTable which basically hold result.
In while loop we execute second stored procedure which have a task to calculate how many unit of something is there in specific number of item. I call her sp_ConvertMeIntoUnit and here is a code for her:
CREATE PROCEDURE sp_ConvertMeIntoUnit
#inItemName VARCHAR(50),
#inQty INT,
#myResult VARCHAR(5000) OUT
AS
DECLARE #rTable TABLE(rId INT IDENTITY(1, 1), rUnit VARCHAR(50), rQty INT)
DECLARE #yTable TABLE(yId INT IDENTITY(1, 1), yVal INT, yRest INT)
DECLARE #maxId INT
DECLARE #start INT = 1
DECLARE #quentity INT = #inQty
DECLARE #divider INT
DECLARE #quant INT
DECLARE #rest INT
DECLARE #result VARCHAR(5000)
INSERT INTO #rTable(rUnit, rQty)
SELECT unit, qty
FROM t1
WHERE item = #inItemName
ORDER BY qty DESC
SELECT #maxId = (SELECT MAX(rId) FROM #rTable)
WHILE #start <= #maxId
BEGIN
SELECT #divider = (SELECT rQty FROM #rTable WHERE rId = #start)
SELECT #quant = (#quentity / #divider)
SELECT #rest = (#quentity % #divider)
INSERT INTO #yTable(yVal, yRest)
VALUES (#quant, #rest)
SELECT #quentity = #rest
SELECT #start = (#start + 1)
CONTINUE
END
SELECT #result = COALESCE(#result + ', ', '') + CAST(y.yVal AS VARCHAR) + r.rUnit FROM #rTable AS r INNER JOIN #yTable AS y ON r.rId = y.yId
SELECT #myResult = #result
GO
This procedure contain three parametars it's take two parameters from the first one and one is returned as result (OUTPUT). In parameters are Item and Quantity.
There are also two variables as table #rTable we stored values as #rId which is auto increment and always will go from 1 to some number no matter what is there Id's in the first table. Other two values are inserted there from the first table based on #inItemName parameter which is sanded from first procedure... From the your first table we use unit and quantity and stored them with rId into table #rTable ordered by Qty from biggest number to lowest. This is a part of code for that
INSERT INTO #rTable(rUnit, rQty)
SELECT unit, qty
FROM t1
WHERE item = #inItemName
ORDER BY qty DESC
Then we go into while loop where we do some maths. Basically we store into variable #divider values from #rTable. In the first iteration we take the biggest value calculate how many times it's contain into the number (second parameter we pass from first procedure is qty from the yours second table) and store it into #quant than we also calculate modulo and store it into variable #rest. This line
SELECT #rest = (#quentity % #divider)
After that we insert our values into #yTable. Before we and with iteration in while loop we assign #quentity variable value of #rest value because we need to work just with the remainder not with whole quantity any more. In second iteration we take next (the second greatest number in our #rTable) number and procedure repeat itself...
When while loop finish we create a string. This line here:
SELECT #result = COALESCE(#result + ', ', '') + CAST(y.yVal AS VARCHAR) + r.rUnit FROM #rTable AS r INNER JOIN #yTable AS y ON r.rId = y.yId
This is the line you want to change if you want to exclude result with 0 (i talk about them at the beginning of answer)...
And at the end we store result into output variable #myResult...
Result of this stored procedure will return string like this:
+--------------------------+
| 1ball, 5slop, 8box, 2pcs |
+--------------------------+
Hope I didn't miss anything important. Basically only think you should change here is the name of the table and their columns (if they are different) in first stored procedure instead t2 here
INSERT INTO...
SELECT item, qty
FROM t2
And in second one instead of t1 (and column if needed) here..
INSERT INTO...
SELECT unit, qty
FROM t1
WHERE item = #inItemName
ORDER BY qty DESC
Hope i help a little or give you an idea how this can be solved...
GL!
You seem to want string aggregation – something that does not have a simple instruction in Transact-SQL and is usually implemented using a correlated FOR XML subquery.
You have not provided names for your tables. For the purpose of the following example, the first table is called ItemDetails and the second one, Items:
SELECT
i.item,
i.qty,
details = (
SELECT
', ' + CAST(d.qty AS varchar(10)) + d.unit
FROM
dbo.ItemDetails AS d
WHERE
d.item = i.item
FOR XML
PATH (''), TYPE
).value('substring(./text()[1], 3)', 'nvarchar(max)')
FROM
dbo.Items AS i
;
For the input provided in the question, the above query would return the following output:
item qty details
----- ----------- ------------------------------
cigar 1598 1pcs, 1000ball, 12box, 100slop
pen 52 1pcs, 20box
You can further arrange the data into strings as per your requirement. I would recommend you do it in the calling application and use SQL only as your data source. However, if you must, you can do the concatenation in SQL as well.
Note that the above query assumes that the same unit does not appear more than once per item in ItemDetails. If it does and you want to aggregate qty values per unit before producing the detail line, you will need to change the query a little:
SELECT
i.item,
i.qty,
details = (
SELECT
', ' + CAST(SUM(d.qty) AS varchar(10)) + d.unit
FROM
dbo.ItemDetails AS d
WHERE
d.item = i.item
GROUP BY
d.unit
FOR XML
PATH (''), TYPE
).value('substring(./text()[1], 3)', 'nvarchar(max)')
FROM
dbo.Items AS i
;

Remove a sentence from a paragraph that has a specific pattern with T-SQL

I have a large number of descriptions that can be anywhere from 5 to 20 sentences each. I am trying to put a script together that will locate and remove a sentence that contains a word with numbers before or after it.
before example: Hello world. Todays department has 345 employees. Have a good day.
after example: Hello world. Have a good day.
My main problem right now is identifying the violation.
Here "345 employees" is what causes the sentence to be removed. However, each description will have a different number and possibly a different variation of the word employee.
I would like to avoid having to create a table of all the different variations of employee.
JTB
This would make a good SQL Puzzle.
Disclaimer: there are probably TONS of edge cases that would blow this up
This would take a string, split it out into a table with a row for each sentence, then remove the rows that matched a condition, and then finally join them all back into a string.
CREATE FUNCTION dbo.fn_SplitRemoveJoin(#Val VARCHAR(2000), #FilterCond VARCHAR(100))
RETURNS VARCHAR(2000)
AS
BEGIN
DECLARE #tbl TABLE (rid INT IDENTITY(1,1), val VARCHAR(2000))
DECLARE #t VARCHAR(2000)
-- Split into table #tbl
WHILE CHARINDEX('.',#Val) > 0
BEGIN
SET #t = LEFT(#Val, CHARINDEX('.', #Val))
INSERT #tbl (val) VALUES (#t)
SET #Val = RIGHT(#Val, LEN(#Val) - LEN(#t))
END
IF (LEN(#Val) > 0)
INSERT #tbl VALUES (#Val)
-- Filter out condition
DELETE FROM #tbl WHERE val LIKE #FilterCond
-- Join back into 1 string
DECLARE #i INT, #rv VARCHAR(2000)
SET #i = 1
WHILE #i <= (SELECT MAX(rid) FROM #tbl)
BEGIN
SELECT #rv = IsNull(#rv,'') + IsNull(val,'') FROM #tbl WHERE rid = #i
SET #i = #i + 1
END
RETURN #rv
END
go
CREATE TABLE #TMP (rid INT IDENTITY(1,1), sentence VARCHAR(2000))
INSERT #tmp (sentence) VALUES ('Hello world. Todays department has 345 employees. Have a good day.')
INSERT #tmp (sentence) VALUES ('Hello world. Todays department has 15 emps. Have a good day. Oh and by the way there are 12 employees somewhere else')
SELECT
rid, sentence, dbo.fn_SplitRemoveJoin(sentence, '%[0-9] Emp%')
FROM #tmp t
returns
rid | sentence | |
1 | Hello world. Todays department has 345 employees. Have a good day. | Hello world. Have a good day.|
2 | Hello world. Todays department has 15 emps. Have a good day. Oh and by the way there are 12 employees somewhere else | Hello world. Have a good day. |
I've used the split/remove/join technique as well.
The main points are:
This uses a pair of recursive CTEs, rather than a UDF.
This will work with all English sentence endings: . or ! or ?
This removes whitespace to make the comparison for "digit then employee" so you don't have to worry about multiple spaces and such.
Here's the SqlFiddle demo, and the code:
-- Split descriptions into sentences (could use period, exclamation point, or question mark)
-- Delete any sentences that, without whitespace, are like '%[0-9]employ%'
-- Join sentences back into descriptions
;with Splitter as (
select ID
, ltrim(rtrim(Data)) as Data
, cast(null as varchar(max)) as Sentence
, 0 as SentenceNumber
from Descriptions -- Your table here
union all
select ID
, case when Data like '%[.!?]%' then right(Data, len(Data) - patindex('%[.!?]%', Data)) else null end
, case when Data like '%[.!?]%' then left(Data, patindex('%[.!?]%', Data)) else Data end
, SentenceNumber + 1
from Splitter
where Data is not null
), Joiner as (
select ID
, cast('' as varchar(max)) as Data
, 0 as SentenceNumber
from Splitter
group by ID
union all
select j.ID
, j.Data +
-- Don't want "digit+employ" sentences, remove whitespace to search
case when replace(replace(replace(replace(s.Sentence, char(9), ''), char(10), ''), char(13), ''), char(32), '') like '%[0-9]employ%' then '' else s.Sentence end
, s.SentenceNumber
from Joiner j
join Splitter s on j.ID = s.ID and s.SentenceNumber = j.SentenceNumber + 1
)
-- Final Select
select a.ID, a.Data
from Joiner a
join (
-- Only get max SentenceNumber
select ID, max(SentenceNumber) as SentenceNumber
from Joiner
group by ID
) b on a.ID = b.ID and a.SentenceNumber = b.SentenceNumber
order by a.ID, a.SentenceNumber
One way to do this. Please note that it only works if you have one number in all sentences.
declare #d VARCHAR(1000) = 'Hello world. Todays department has 345 employees. Have a good day.'
declare #dr VARCHAR(1000)
set #dr = REVERSE(#d)
SELECT REVERSE(RIGHT(#dr,LEN(#dr) - CHARINDEX('.',#dr,PATINDEX('%[0-9]%',#dr))))
+ RIGHT(#d,LEN(#d) - CHARINDEX('.',#d,PATINDEX('%[0-9]%',#d)) + 1)

Compare two list items

I am trying to compare a database field which stores list items (comma separated) with unfortunately a variable which is also a list item.
Example:
In this case, a user can belong to multiple groups, and content access is also allocated to multiple groups.
contentid | group
(1) (c,d)
(2) (a,c)
(3) (b)
So, I need to select all content where user is in group (a,c). In this case, contentid 1,2 should be returned.
Here's a safe but slow solution for SQL 2008
BEGIN
-- setup
DECLARE #tbl TABLE (
[contentid] INT
,[group] VARCHAR(MAX)
)
INSERT INTO #tbl VALUES
(1, 'c,d')
,(2, 'a,c')
,(3, 'd')
-- send your request as simple xml
DECLARE #param XML
SET #param = '<g>a</g><g>c</g>'
-- query
SELECT DISTINCT contentid
FROM #tbl t
INNER JOIN #param.nodes('/g') AS t2(g)
ON ',' + t.[group] + ',' LIKE '%,' + t2.g.value('.', 'varchar(max)') + ',%'
END
You just pass your query in as an XML snippet instead of a comma separated list.
If your group names are single characters or you can be sure the names are not character-subsets of each other (ie: GroupA, GroupAB), then the query can be optimized to.
ON t.[group] LIKE '%' + t2.g.value('.', 'varchar(max)') + '%'
If you're using a RDBMS without XML parsing capability you'll have to use string split your query into a temp table and work it that way.
You really should not be using comma separated values inside your columns. It would be much better if the [group] column only contained one value and you had repeated entries with a UNIQUE constraint on the composite (contentid, group).
You might find this question and answer useful : How do I split a string so I can access item x?
Or you could always use something like this :
create function SplitString(
#string varchar(max),
#delimiter char(1)
)
returns #items table (item varchar(max))
as
begin
declare #index int set #index = 0
if (#delimiter is null) set #delimiter = ','
declare #prevdelimiter int set #prevdelimiter = 0
while (#index < len(#string)) begin
if (substring(#string, #index, 1) = #delimiter) begin
insert into #items
select substring(#string, #prevdelimiter, #index-#prevdelimiter)
set #prevdelimiter = #index + 1
end
set #index = #index + 1
end
--last item (or only if there were no delimiters)
insert into #items
select substring(#string, #prevdelimiter, #index - #prevdelimiter + 1)
return
end
go
declare #content table(contentid int, [group] varchar(max))
insert into #content
select 1, 'c,d'
union
select 2, 'a,c'
union
select 3, 'b'
declare #groups varchar(max) set #groups = 'a,c'
declare #grouptable table(item varchar(max))
insert into #grouptable
select * from dbo.SplitString(#groups, ',')
select * From #content
where (select count(*) from #grouptable g1 join dbo.SplitString([group], ',') g2 on g1.item = g2.item) > 0