I'm trying to use keywords such as detergent, soap, dish, etc. to match two columns in my SQL table. If a keyword is found in both columns, I want another column that says it is matched. I am planning to use IF EXISTS, but I do not know the proper syntax.
Sample columns:
Column1               Column2
-----------------------------------------------
detergent powder      all powder detergent
dish washing liquid   dish liquid for washing
hand soap             hand liquid soap
Here is the simplest solution to your question. The trick is in the "virtual" column, aliased as Match, that we create in the select statement. This column is computed using a case statement to see if the search term appears in both of the columns. Note we need to use the like statement with wildcard operators %.
create table Example (Column1 varchar(max), Column2 varchar(max));
insert into Example select 'detergent powder', 'all powder detergent';
insert into Example select 'dish washing liquid', 'dish liquid for washing' ;
insert into Example select 'hand soap', 'hand liquid soap';
declare @search varchar(20) = 'detergent';
select Column1,
       Column2,
       case when Column1 like '%' + @search + '%' and
                 Column2 like '%' + @search + '%'
            then 'matched'
            else 'not matched' end as [Match]
from Example;
We could also create the Match column as a "real" column in the table and modify this script slightly to update that column based on the same criteria.
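A minimal sketch of that variation, assuming the Example table created above and a hypothetical new column named MatchStatus (the name is an assumption, not from the original answer). The GO separator is there because SQL Server compiles a whole batch before running it, so the UPDATE has to sit in a later batch than the ALTER TABLE that adds the column.

```sql
-- MatchStatus is a hypothetical column name chosen for this sketch.
ALTER TABLE Example ADD MatchStatus varchar(20) NULL;
GO

DECLARE @search varchar(20) = 'detergent';

-- Same CASE logic as the SELECT version, but stored in the table.
UPDATE Example
SET MatchStatus = CASE
                      WHEN Column1 LIKE '%' + @search + '%'
                       AND Column2 LIKE '%' + @search + '%'
                      THEN 'matched'
                      ELSE 'not matched'
                  END;

SELECT Column1, Column2, MatchStatus
FROM Example;
```

A stored column like this goes stale when the data or the search term changes, so the UPDATE would need to be rerun in those cases.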
Here's an example that checks if any of the 3 words appears in both columns.
Sample data:
CREATE TABLE Test (
Id INT IDENTITY(1,1) PRIMARY KEY,
Col1 VARCHAR(100),
Col2 VARCHAR(100)
);
INSERT INTO Test (Col1, Col2) VALUES
('detergent powder', 'all powder detergent'),
('dish washing liquid', 'dish liquid for washing'),
('hand soap', 'hand liquid soap'),
('soap dish', 'detergent');
Query:
SELECT t.*
, cast(
case
when exists (
select 1
from (values ('soap'),('detergent'),('dish')) s(search)
join (values (Col1),(Col2)) c(col)
on c.col like '%'+s.search+'%'
group by s.search
having count(*) = 2
) then 1 else 0 end as bit) as hasMatch
FROM Test t;
EXISTS checks whether a query returns at least one row.
The HAVING clause requires two matches per search word, i.e. one in each column.
But it can also be done without that GROUP BY & HAVING clause:
SELECT t.*
, cast(case when exists (
select 1
from (values ('soap'),('detergent'),('dish')) s(search)
where Col1 like '%'+s.search+'%'
and Col2 like '%'+s.search+'%'
) then 1 else 0 end as bit) as hasMatch
FROM Test t;
A test on rextester here
Table1
ID Name Tags
----------------------------------
1 Customer1 Tag1,Tag5,Tag4
2 Customer2 Tag2,Tag6,Tag4,Tag11
3 Customer5 Tag6,Tag5,Tag10
and Table2
ID Name Tags
----------------------------------
1 Product1 Tag1,Tag10,Tag6
2 Product2 Tag2,Tag1,Tag5
3 Product5 Tag1,Tag2,Tag3
What is the best way to join Table1 and Table2 on the Tags column?
For each comma-separated tag in the Tags column of Table1, it should look for a match among the comma-separated tags in the Tags column of Table2.
Note: Tables are not full-text indexed.
The best way is not to have comma separated values in a column. Just use normalized data and you won't have trouble with querying like this - each column is supposed to only have one value.
Without this, there's no way to use any indices, really. Even a full-text index behaves quite differently from what you might think, and they are inherently clunky to use - they're designed for searching for text, not meaningful data. In the end, you will not get much better than something like
where (Col = 'txt' or Col like 'txt,%' or Col like '%,txt' or Col like '%,txt,%')
Using an XML column might be another alternative, though it's still quite a bit silly. It would allow you to treat the values as a collection at least, though.
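To make the normalization advice concrete, here is a rough sketch of what such a layout could look like. The table and index names (dbo.CustomerTag, dbo.ProductTag, IX_ProductTag_Tag) are hypothetical and not part of the question; the idea is simply one row per tag instead of one comma-separated string.

```sql
-- Hypothetical normalized layout: one row per (entity, tag) pair.
CREATE TABLE dbo.CustomerTag (
    CustomerId int         NOT NULL,
    Tag        varchar(50) NOT NULL,
    PRIMARY KEY (CustomerId, Tag)
);

CREATE TABLE dbo.ProductTag (
    ProductId int         NOT NULL,
    Tag       varchar(50) NOT NULL,
    PRIMARY KEY (ProductId, Tag)
);

-- An index led by Tag lets the join seek on tag values.
CREATE INDEX IX_ProductTag_Tag ON dbo.ProductTag (Tag, ProductId);

-- Customers and products that share at least one tag,
-- with the number of tags they have in common.
SELECT ct.CustomerId,
       pt.ProductId,
       COUNT(*) AS SharedTags
FROM dbo.CustomerTag AS ct
JOIN dbo.ProductTag  AS pt
  ON pt.Tag = ct.Tag
GROUP BY ct.CustomerId, pt.ProductId;
```

With this shape the join is a plain equality on Tag, so no LIKE patterns or string splitting are needed.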
I don't think there will ever be an easy and efficient solution to this. As Luaan pointed out, it is a very bad idea to store data like this : you lose most of the power of SQL when you squeeze what should be individual units of data into a single cell.
But you can manage this at the slight cost of creating two user-defined functions. First, use this brilliant recursive technique to split the strings into individual rows based on your delimiter :
CREATE FUNCTION dbo.TestSplit (@sep char(1), @s varchar(512))
RETURNS table
AS
RETURN (
    WITH Pieces(pn, start, stop) AS (
        SELECT 1, 1, CHARINDEX(@sep, @s)
        UNION ALL
        SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
        FROM Pieces
        WHERE stop > 0
    )
    SELECT pn AS SplitIndex,
           SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS SplitPart
    FROM Pieces
)
Then, make a function that takes two strings and counts the matches :
CREATE FUNCTION dbo.MatchTags (@a varchar(512), @b varchar(512))
RETURNS INT
AS
BEGIN
    RETURN
    (SELECT COUNT(*)
     FROM dbo.TestSplit(',', @a) a
     INNER JOIN dbo.TestSplit(',', @b) b
     ON a.SplitPart = b.SplitPart)
END
And that's it. Here is a test run with table variables:
DECLARE @A TABLE (Name VARCHAR(20), Tags VARCHAR(100))
DECLARE @B TABLE (Name VARCHAR(20), Tags VARCHAR(100))
INSERT INTO @A ( Name, Tags )
VALUES
( 'Customer1','Tag1,Tag5,Tag4'),
( 'Customer2','Tag2,Tag6,Tag4,Tag11'),
( 'Customer5','Tag6,Tag5,Tag10')
INSERT INTO @B ( Name, Tags )
VALUES
( 'Product1','Tag1,Tag10,Tag6'),
( 'Product2','Tag2,Tag1,Tag5'),
( 'Product5','Tag1,Tag2,Tag3')
SELECT * FROM @A a
INNER JOIN @B b ON dbo.MatchTags(a.Tags, b.Tags) > 0
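On SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT function could stand in for the two user-defined functions. This is a swapped-in technique, not what the answer above uses; the sketch below reuses the same sample data, and the output column aliases are made up for illustration.

```sql
DECLARE @A TABLE (Name varchar(20), Tags varchar(100));
DECLARE @B TABLE (Name varchar(20), Tags varchar(100));

INSERT INTO @A (Name, Tags) VALUES
    ('Customer1', 'Tag1,Tag5,Tag4'),
    ('Customer2', 'Tag2,Tag6,Tag4,Tag11'),
    ('Customer5', 'Tag6,Tag5,Tag10');

INSERT INTO @B (Name, Tags) VALUES
    ('Product1', 'Tag1,Tag10,Tag6'),
    ('Product2', 'Tag2,Tag1,Tag5'),
    ('Product5', 'Tag1,Tag2,Tag3');

-- STRING_SPLIT returns one row per element, in a column named value.
-- Splitting both sides and comparing the pieces finds the shared tags.
SELECT a.Name   AS CustomerName,
       b.Name   AS ProductName,
       COUNT(*) AS SharedTags
FROM @A AS a
CROSS APPLY STRING_SPLIT(a.Tags, ',') AS ta
CROSS JOIN @B AS b
CROSS APPLY STRING_SPLIT(b.Tags, ',') AS tb
WHERE ta.value = tb.value
GROUP BY a.Name, b.Name;
```

Like the function-based version, this still splits every string for every pair of rows, so it is a convenience rather than a performance fix.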
I developed a solution as follows:
CREATE TABLE [dbo].[Table1](
Id int not null,
Name nvarchar(250) not null,
Tag nvarchar(250) null,
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Table2](
Id int not null,
Name nvarchar(250) not null,
Tag nvarchar(250) null,
) ON [PRIMARY]
GO
Get sample data for Table1; it will insert 28000 records:
INSERT INTO Table1
SELECT CustomerID,CompanyName, (FirstName + ',' + LastName)
FROM AdventureWorks.SalesLT.Customer
GO 3
Sample data for Table2. I need the same tags in Table2:
declare @tag1 nvarchar(50) = 'Donna,Carreras'
declare @tag2 nvarchar(50) = 'Johnny,Caprio'
Get sample data for Table2; it will insert 9735 records:
INSERT INTO Table2
SELECT ProductID,Name, (case when(right(ProductID,1)>=5) then @tag1 else @tag2 end)
FROM AdventureWorks.SalesLT.Product
GO 3
My Solution
create TABLE #dt (
Id int IDENTITY(1,1) PRIMARY KEY,
Tag nvarchar(250) NOT NULL
);
I've created a temp table, and I will fill it with the distinct tags from Table1:
insert into #dt(Tag)
SELECT distinct Tag
FROM Table1
Now I need a vertical table for the tags (one tag per row):
create TABLE #Tags ( Tag nvarchar(250) NOT NULL );
Now I fill the #Tags table using a WHILE loop. You can use a cursor, but WHILE is faster:
declare @Rows int = 1
declare @Tag nvarchar(1024)
declare @Id int = 0
WHILE @Rows>0
BEGIN
    -- ORDER BY makes sure the rows of #dt are visited in Id order
    Select Top 1 @Tag=Tag,@Id=Id from #dt where Id>@Id order by Id
    set @Rows =@@RowCount
    if @Rows>0
    begin
        insert into #Tags(Tag) SELECT Data FROM dbo.StringToTable(@Tag, ',')
    end
END
last step : join Table2 with #Tags
select distinct t.*
from Table2 t
inner join #Tags on (',' + t.Tag + ',') like ('%,' + #Tags.Tag + ',%')
Table1 row count = 28000, Table2 row count = 9735; the select takes less than 2 seconds.
I use this kind of solution with paths of trees. First put a comma at the very beginning and at the very end of the string. Then you can call
Where col1 like '%,' || col2 || ',%'
Some databases can also use an index on the column for LIKE (Postgres does so partially), so it can be efficient as well. I don't know about SQL Server.
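The reason for wrapping both sides in commas can be shown with the sample tags from the question, where Tag1 must not match Tag11. This is a small T-SQL sketch (so + replaces ||); the variable and column alias names are made up for illustration.

```sql
DECLARE @Tags varchar(100) = 'Tag2,Tag6,Tag4,Tag11';

SELECT
    -- Plain substring search: 'Tag11' contains 'Tag1', so this is a false positive.
    CASE WHEN @Tags LIKE '%Tag1%' THEN 1 ELSE 0 END AS NaiveMatch,
    -- Comma-wrapped search: ',Tag1,' does not occur in ',Tag2,Tag6,Tag4,Tag11,'.
    CASE WHEN ',' + @Tags + ',' LIKE '%,Tag1,%' THEN 1 ELSE 0 END AS DelimitedMatch;
-- NaiveMatch = 1, DelimitedMatch = 0
```

The leading and trailing commas also cover the first and last tag in the list, which is why a single pattern is enough.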
I have a query that has to filter our results from a text field based on certain keywords used in the textline .. currently the SQL statement looks like the below.
and (name like '%Abc%') or (name like '%XYZ%') or (name like '%CSV%')...
Is there a way to avoid multiple or statements and achieve the same results?
You could put your filter keywords into a table or temp table and query them like this:
select a.*
from table_you_are_searching a
inner join temp_filter_table b
on charindex(b.filtercolumn,a.searchcolumn) <> 0
A slightly more shorthand way of doing this if you have a large amount of different patterns is to use EXISTS and a table value constructor:
SELECT *
FROM T
WHERE EXISTS
( SELECT 1
FROM (VALUES ('abc'), ('xyz'), ('csv')) m (match)
WHERE T.Name LIKE '%' + m.Match + '%'
);
A similar approach can be applied with table valued parameters. Since this is usually a requirement where people want to pass a variable number of search terms for a match it can be quite a useful approach:
CREATE TYPE dbo.ListOfString AS TABLE (value VARCHAR(MAX));
Then a procedure can take this type:
CREATE PROCEDURE dbo.GetMatches @List dbo.ListOfString READONLY
AS
BEGIN
    SELECT *
    FROM T
    WHERE EXISTS
    ( SELECT 1
      FROM @List AS l
      WHERE T.Name LIKE '%' + l.value + '%'
    );
END
Then you can call this procedure:
DECLARE @T dbo.ListOfString;
INSERT @T VALUES ('abc'), ('xyz'), ('csv');
EXECUTE dbo.GetMatches @T;
Just to give you another option you could also try this, an IN statement mixed with a PATINDEX:
Select *
from tbl
Where 0 not in (PATINDEX('%Abc%', name), PATINDEX('%XYZ%', name), PATINDEX('%CSV%', name))
I'm working on a database which has the following table:
id location
1 Singapore
2 Vancouver
3 Egypt
4 Tibet
5 Crete
6 Monaco
My question is, how can I produce a query from this which would result in column names like the following without writing them into the query:
Query result:
Singapore , Vancouver, Egypt, Tibet, ...
< values >
how can I produce a query which would result in column names like the
following without writing them into the query:
Even with crosstab() (from the tablefunc extension), you have to spell out the column names.
Except, if you create a dedicated C function for your query. The tablefunc extension provides a framework for this, output columns (the list of countries) have to be stable, though. I wrote up a "tutorial" for a similar case a few days ago:
PostgreSQL row to columns
The alternative is to use CASE statements like this:
SELECT sum(CASE WHEN t.id = 1 THEN o.ct END) AS "Singapore"
, sum(CASE WHEN t.id = 2 THEN o.ct END) AS "Vancouver"
, sum(CASE WHEN t.id = 3 THEN o.ct END) AS "Egypt"
-- more?
FROM tbl t
JOIN (
SELECT id, count(*) AS ct
FROM other_tbl
GROUP BY id
) o USING (id);
ELSE NULL is optional in a CASE expression. The manual:
If the ELSE clause is omitted and no condition is true, the result is null.
Basics for both techniques:
PostgreSQL Crosstab Query
You could do this with some really messy dynamic SQL, but I wouldn't recommend it.
However, you could produce something like the below. Let me know if that structure is acceptable and I will post some SQL.
Location | Count
---------+------
Singapore| 1
Vancouver| 0
Egypt | 2
Tibet | 1
Crete | 3
Monaco | 0
drop table #yourtable;
create table #yourtable(id int, location varchar(25));
insert into #yourtable values
('1','Singapore'),
('2','Vancouver'),
('3','Egypt'),
('4','Tibet'),
('5','Crete'),
('6','Monaco');
drop table #temp;
create table #temp( col1 int );
Declare @Script as Varchar(8000);
Declare @Script_prepare as Varchar(8000);
Set @Script_prepare = 'Alter table #temp Add [?] varchar(100);'
Set @Script = ''
Select
    @Script = @Script + Replace(@Script_prepare, '?', [location])
From
    #yourtable
Where
    [id] is not null
Exec (@Script);
ALTER TABLE #temp DROP COLUMN col1 ;
select * from #temp;
I want to compare the individual words from the user input to individual words from a column in my table.
For example, consider these rows in my table:
ID Name
1 Jack Nicholson
2 Henry Jack Blueberry
3 Pontiac Riddleson Jack
Consider that the user's input is 'Pontiac Jack'. I want to assign weights/ranks for each match, so I can't use a blanket LIKE (WHERE Name LIKE @SearchString).
If Pontiac is present in any row, I want to award it 10 points. Each match for Jack gets another 10 points, etc. So row 3 would get 20 points, and rows 1 and 2 get 10.
I have split the user input into individual words, and stored them into a temporary table #SearchWords(Word).
But I can't figure out a way to have a SELECT statement that allows me to combine this. Maybe I'm going about this the wrong way?
Cheers,
WT
For SQL Server, try this:
SELECT Word, COUNT(Word) * 10 AS WordCount
FROM SourceTable
INNER JOIN SearchWords ON CHARINDEX(SearchWords.Word, SourceTable.Name) > 0
GROUP BY Word
What about this? (this is MySQL syntax, I think you only have to replace the CONCAT and do it with +)
SELECT names.id, count(searchwords.word) FROM names, searchwords WHERE names.name LIKE CONCAT('%', searchwords.word, '%') GROUP BY names.id
Then you would have a SQL result with the ID of the names-table and count of the words that match to that id.
You could do it via a common table expression that works out the weighting. For example:
--** Set up the example tables and data
DECLARE @Name TABLE (id INT IDENTITY, name VARCHAR(50));
DECLARE @SearchWords TABLE (word VARCHAR(50));
INSERT INTO @Name
    (name)
VALUES ('Jack Nicholson')
      ,('Henry Jack Blueberry')
      ,('Pontiac Riddleson Jack')
      ,('Fred Bloggs');
INSERT INTO @SearchWords
    (word)
VALUES ('Jack')
      ,('Pontiac');
--** Example SELECT with @Name selected and ordered by words in @SearchWords
WITH Order_CTE (weighting, id)
AS (
    SELECT COUNT(*) AS weighting
         , id
    FROM @Name AS n
    JOIN @SearchWords AS sw
      ON n.name LIKE '%' + sw.word + '%'
    GROUP BY id
)
SELECT n.name
     , cte.weighting
FROM @Name AS n
JOIN Order_CTE AS cte
  ON n.id = cte.id
ORDER BY cte.weighting DESC;
Using this technique, you can also apply a value to each search word if you want to. So you could make Jack more valuable than Pontiac. This would look something like this:
--** Set up the example tables and data
DECLARE @Name TABLE (id INT IDENTITY, name VARCHAR(50));
DECLARE @SearchWords TABLE (word VARCHAR(50), value INT);
INSERT INTO @Name
    (name)
VALUES ('Jack Nicholson')
      ,('Henry Jack Blueberry')
      ,('Pontiac Riddleson Jack')
      ,('Fred Bloggs');
--** Set up search words with associated value
INSERT INTO @SearchWords
    (word, value)
VALUES ('Jack',10)
      ,('Pontiac',20)
      ,('Bloggs',40);
--** Example SELECT with @Name selected and ordered by words and values in @SearchWords
WITH Order_CTE (weighting, id)
AS (
    SELECT SUM(sw.value) AS weighting
         , id
    FROM @Name AS n
    JOIN @SearchWords AS sw
      ON n.name LIKE '%' + sw.word + '%'
    GROUP BY id
)
SELECT n.name
     , cte.weighting
FROM @Name AS n
JOIN Order_CTE AS cte
  ON n.id = cte.id
ORDER BY cte.weighting DESC;
Seems to me that the best thing to do would be to maintain a separate table with all the individual words. Eg:
ID Word FK_ID
1 Jack 1
2 Nicholson 1
3 Henry 2
(etc)
This table would be kept up to date with triggers, and you'd have a non-clustered index on 'Word', 'FK_ID'. Then the SQL to produce your weightings would be simple and efficient.
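A rough sketch of how that could look. The table name dbo.NameWord and the index name are hypothetical, the sample rows are typed in by hand instead of being maintained by triggers, and #SearchWords(Word) is recreated here only so the script runs on its own.

```sql
-- Hypothetical word table: one row per word of each name.
CREATE TABLE dbo.NameWord (
    ID    int IDENTITY(1,1) PRIMARY KEY,
    Word  varchar(50) NOT NULL,
    FK_ID int         NOT NULL   -- ID of the row in the names table
);

CREATE NONCLUSTERED INDEX IX_NameWord_Word ON dbo.NameWord (Word, FK_ID);

-- Words of the three sample names from the question.
INSERT INTO dbo.NameWord (Word, FK_ID) VALUES
    ('Jack', 1), ('Nicholson', 1),
    ('Henry', 2), ('Jack', 2), ('Blueberry', 2),
    ('Pontiac', 3), ('Riddleson', 3), ('Jack', 3);

-- The user's input, already split into words.
CREATE TABLE #SearchWords (Word varchar(50) NOT NULL);
INSERT INTO #SearchWords (Word) VALUES ('Pontiac'), ('Jack');

-- Each matching word contributes 10 points to its name.
SELECT nw.FK_ID,
       COUNT(*) * 10 AS Score
FROM dbo.NameWord AS nw
JOIN #SearchWords AS sw
  ON sw.Word = nw.Word
GROUP BY nw.FK_ID
ORDER BY Score DESC;
```

With this data, name 3 scores 20 and names 1 and 2 score 10 each, which matches the weighting described in the question. Because the join is an equality on Word, it can use the index, unlike a LIKE with a leading wildcard.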
How about something like this....
Select id, MAX(names.name), count(id)*10 from names
inner join #SearchWords as sw on
names.name like '%'+sw.word+'%'
group by id
This assumes that the table with the names is called "names".
I think this is a simple problem, but I have not found the solution yet. I would like to get only the valid numbers from a column, as explained here.
Let's say we have a varchar column with the following values:
ABC
Italy
Apple
234.62
2:234:43:22
France
6435.23
2
Lions
Here the problem is to select numbers only
select * from tbl where answer like '%[0-9]%' would have done it but it returns
234.62
2:234:43:22
6435.23
2
Here, obviously, 2:234:43:22 is not desired as it is not valid number.
The desired result is
234.62
6435.23
2
Is there a way to do this?
You can use the following to only include valid characters:
SQL
SELECT * FROM @Table
WHERE Col NOT LIKE '%[^0-9.]%'
Results
Col
---------
234.62
6435.23
2
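One caveat with this pattern: it only rules out characters other than digits and dots, so strings such as 1.2.3, a lone dot, or an empty string also pass. Below is a sketch of a tighter filter, assuming at most one decimal point should be allowed; the last three sample values are made up for illustration and are not in the question.

```sql
DECLARE @Table TABLE (Col varchar(50));
INSERT INTO @Table (Col) VALUES
    ('ABC'), ('234.62'), ('2:234:43:22'), ('6435.23'), ('2'),
    ('1.2.3'), ('.'), ('');

SELECT Col
FROM @Table
WHERE Col NOT LIKE '%[^0-9.]%'   -- only digits and dots
  AND Col NOT LIKE '%.%.%'       -- at most one dot
  AND Col LIKE '%[0-9]%';        -- at least one digit
-- Returns 234.62, 6435.23 and 2.
```

Signs, thousands separators, and surrounding spaces are still rejected by this filter, so it would need adjusting if the data contains them.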
You can try this
ISNUMERIC (Transact-SQL)
ISNUMERIC returns 1 when the input expression evaluates to a valid numeric data type; otherwise it returns 0.
DECLARE @Table TABLE(
    Col VARCHAR(50)
)
INSERT INTO @Table SELECT 'ABC'
INSERT INTO @Table SELECT 'Italy'
INSERT INTO @Table SELECT 'Apple'
INSERT INTO @Table SELECT '234.62'
INSERT INTO @Table SELECT '2:234:43:22'
INSERT INTO @Table SELECT 'France'
INSERT INTO @Table SELECT '6435.23'
INSERT INTO @Table SELECT '2'
INSERT INTO @Table SELECT 'Lions'
SELECT *
FROM @Table
WHERE ISNUMERIC(Col) = 1
Try something like this - it works for the cases you have mentioned.
select * from tbl
where answer like '%[0-9]%'
and answer not like '%[:]%'
and answer not like '%[A-Z]%'
With SQL Server 2012 and later, you could use TRY_CAST/TRY_CONVERT to try converting to a numeric type, e.g. TRY_CAST(answer AS float) IS NOT NULL. Note, though, that this will match scientific notation too (1E+34). If you use decimal, scientific notation won't match.
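A minimal sketch of the TRY_CAST approach, using a table variable with some of the sample values from the question plus one scientific-notation string added for illustration. The decimal(18, 4) precision and scale are an assumption; they would need to fit the real data.

```sql
DECLARE @Table TABLE (Col varchar(50));
INSERT INTO @Table (Col) VALUES
    ('ABC'), ('Italy'), ('234.62'), ('2:234:43:22'),
    ('6435.23'), ('2'), ('1E+34');

-- TRY_CAST returns NULL instead of raising an error when the conversion fails,
-- so rows that are not valid decimals are filtered out.
SELECT Col
FROM @Table
WHERE TRY_CAST(Col AS decimal(18, 4)) IS NOT NULL;
```

As the answer notes, choosing decimal as the target type should leave out the scientific-notation value along with the non-numeric rows, while float would accept it.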
what might get you where you want in plain SQL92:
select * from tbl where lower(answer) = upper(answer)
or, if you also want to be robust for leading/trailing spaces:
select * from tbl where lower(answer) = trim(upper(answer))