How to search multiple values in a column with contains? - sql

I have a list of search items:
DROP TABLE IF EXISTS #searchitems
SELECT 'Road' as item
INTO #searchitems
UNION
SELECT 'Bike'
item
Road
Bike
And I want to look for rows that contain those items.
I put a Full Text Index on the Name column and have already tried this:
SELECT Name
FROM [AdventureWorksLT2019].[SalesLT].[Product]
WHERE CONTAINS(*, (Select item FROM #searchitems) )
But it does not work. With just one value it works, but not with a list of search values.
Is this even possible with CONTAINS on SQL Server?
I expect something like this:
SELECT distinct Name
FROM [AdventureWorksLT2019].[SalesLT].[Product]
WHERE CONTAINS(*, 'Road OR Bike' )
That gives the expected output, but why does it not work with an input list?

Refer to the documentation: https://learn.microsoft.com/fr-fr/sql/t-sql/queries/contains-transact-sql?view=sql-server-ver16
SELECT distinct Name
FROM [AdventureWorksLT2019].[SalesLT].[Product]
WHERE CONTAINS(*, 'Road OR Bike' )
This works. The official example is:
SELECT Name
FROM Production.Product
WHERE CONTAINS(Name, ' Mountain OR Road ')
Update
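Maybe this is your solution:
-- CONTAINS takes a literal or a variable as its search condition, not a subquery,
-- so build the 'Road OR Bike' string first (STRING_AGG needs SQL Server 2017 or later).
DECLARE @search NVARCHAR(4000) = (SELECT STRING_AGG(item, ' OR ') FROM #searchitems);
SELECT Name
FROM [AdventureWorksLT2019].[SalesLT].[Product]
WHERE CONTAINS(*, @search);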
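Since the CONTAINS search condition is ultimately just a string, it could also be assembled on the client before the query is sent. Below is a minimal sketch in Python, assuming the terms arrive as a plain list; `build_contains_condition` is a hypothetical helper name, not part of any driver. Each term is wrapped in double quotes so that multi-word items are treated as phrases.

```python
def build_contains_condition(terms):
    """Join search terms into a full-text condition such as '"Road" OR "Bike"'."""
    quoted = []
    for term in terms:
        # Drop embedded double quotes so a term cannot break out of its phrase
        term = term.replace('"', "").strip()
        if not term:
            continue  # an empty item would produce an invalid condition
        quoted.append('"' + term + '"')
    return " OR ".join(quoted)


condition = build_contains_condition(["Road", "Bike"])
print(condition)
```

The resulting string would then be passed as a bound parameter for the search condition rather than concatenated into the SQL text.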

One way is to use a CTE and JOIN it to your table with LIKE:
WITH searchitems AS
(SELECT 'Road' AS item
UNION ALL
SELECT 'Bike')
SELECT DISTINCT y.name
FROM yourtable y
JOIN searchitems s
ON y.name LIKE CONCAT('%',s.item,'%');
In the CTE, you can select as many items as desired.
Note that many LIKE conditions can slow down your query if your table holds a large number of rows (and similar approaches such as CONTAINS can be slow, too).
If possible, avoid this and use an exact string match instead to improve execution time.
In that case the approach shown above is no longer required; you can just use a standard IN clause:
SELECT name
FROM yourtable
WHERE
name IN ('Road','Bike','Hello World','Another Text');
Try out with some sample data: db<>fiddle
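To try the join-with-LIKE idea without a SQL Server instance, here is a small sketch using Python's built-in `sqlite3` module. SQLite stands in for the real database, the `product` and `searchitems` tables and their rows are made up for illustration, and string concatenation uses `||` instead of CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (name TEXT);
    INSERT INTO product VALUES ('Road-150 Red'), ('Mountain Bike Socks'), ('Helmet');
    CREATE TABLE searchitems (item TEXT);
    INSERT INTO searchitems VALUES ('Road'), ('Bike');
""")

# One row per (product, matching item) pair; DISTINCT collapses products
# that match more than one search item.
rows = conn.execute("""
    SELECT DISTINCT p.name
    FROM product AS p
    JOIN searchitems AS s
      ON p.name LIKE '%' || s.item || '%'
    ORDER BY p.name
""").fetchall()

matches = [name for (name,) in rows]
print(matches)
```

Only the two products containing 'Road' or 'Bike' come back; 'Helmet' is filtered out by the join.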

Related

How to efficiently select records matching substring in another table using BigQuery?

I have a table of several million strings that I want to match against a table of about twenty thousand strings like this:
#standardSQL
SELECT record.* FROM `record`
JOIN `fragment` ON record.name
LIKE CONCAT('%', fragment.name, '%')
Unfortunately this is taking an awfully long time.
Considering that the fragment table is only 20k records, can I load it into a JavaScript array using a UDF and match it that way? I'm trying to figure out how to do this right now, but perhaps there's already some magic I could do here to make this faster. I tried a CROSS JOIN and got "resources exceeded" fairly quickly. I've also tried using EXISTS, but I can't reference record.name inside that subquery's WHERE without getting an error.
Example using Public Data
This seems to reflect about the same amount of data ...
#standardSQL
WITH record AS (
SELECT LOWER(text) AS name
FROM `bigquery-public-data.hacker_news.comments`
), fragment AS (
SELECT LOWER(name) AS name, COUNT(*)
FROM `bigquery-public-data.usa_names.usa_1910_current`
GROUP BY name
)
SELECT record.* FROM `record`
JOIN `fragment` ON record.name
LIKE CONCAT('%', fragment.name, '%')
Below is for BigQuery Standard SQL
#standardSQL
WITH record AS (
SELECT LOWER(text) AS name
FROM `bigquery-public-data.hacker_news.comments`
), fragment AS (
SELECT DISTINCT LOWER(name) AS name
FROM `bigquery-public-data.usa_names.usa_1910_current`
), temp_record AS (
SELECT record, TO_JSON_STRING(record) id, name, item
FROM record, UNNEST(REGEXP_EXTRACT_ALL(name, r'\w+')) item
), temp_fragment AS (
SELECT name, item FROM fragment, UNNEST(REGEXP_EXTRACT_ALL(name, r'\w+')) item
)
SELECT AS VALUE ANY_VALUE(record) FROM (
SELECT ANY_VALUE(record) record, id, r.name name, f.name fragment_name
FROM temp_record r
JOIN temp_fragment f
USING(item)
GROUP BY id, name, fragment_name
)
WHERE name LIKE CONCAT('%', fragment_name, '%')
GROUP BY id
The above completed in 375 seconds, while the original query is still running at 2740 seconds and counting, so I will not even wait for it to complete.
Mikhail's answer appears to be faster, but let's have one that needs neither SPLIT nor separating the text into words.
First, compute a regular expression with all the words to be searched:
#standardSQL
WITH record AS (
SELECT text AS name
FROM `bigquery-public-data.hacker_news.comments`
), fragment AS (
SELECT name AS name, COUNT(*)
FROM `bigquery-public-data.usa_names.usa_1910_current`
GROUP BY name
)
SELECT FORMAT('(%s)',STRING_AGG(name,'|'))
FROM fragment
Now you can take that resulting string, and use it in a REGEX ignoring case:
#standardSQL
WITH record AS (
SELECT text AS name
FROM `bigquery-public-data.hacker_news.comments`
), largestring AS (
SELECT '(?i)(mary|margaret|helen|more_names|more_names|more_names|josniel|khaiden|sergi)'
)
SELECT record.* FROM `record`
WHERE REGEXP_CONTAINS(record.name, (SELECT * FROM largestring))
(~510 seconds)
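The same two steps (build one alternation pattern from the fragment list, then test every record against it) can be sketched outside BigQuery. This is an illustration in Python with the standard `re` module, not the BigQuery implementation; the sample names and comments are made up, and `build_name_pattern` is a hypothetical helper.

```python
import re


def build_name_pattern(names):
    """Compile a case-insensitive pattern like (mary|margaret|helen)."""
    # Escape each name so characters such as '.' or '+' are matched literally
    parts = [re.escape(name) for name in names if name]
    if not parts:
        raise ValueError("need at least one non-empty name")
    return re.compile("(" + "|".join(parts) + ")", re.IGNORECASE)


pattern = build_name_pattern(["mary", "margaret", "helen"])
comments = ["I asked Helen about it", "nobody mentioned", "MARY wrote this"]
matched = [c for c in comments if pattern.search(c)]
print(matched)
```

As with the LIKE version, this is a substring match, so a word such as "summary" would also count as a hit for "mary".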
As alluded to in my question, I worked on a version using a JavaScript UDF which solves this, albeit more slowly than the answer I accepted. For completeness, I'm posting it here because perhaps someone (like myself in the future) may find it useful.
CREATE TEMPORARY FUNCTION CONTAINS_ANY(str STRING, fragments ARRAY<STRING>)
RETURNS STRING
LANGUAGE js AS """
for (var i in fragments) {
if (str.indexOf(fragments[i]) >= 0) {
return fragments[i];
}
}
return null;
""";
WITH record AS (
SELECT text AS name
FROM `bigquery-public-data.hacker_news.comments`
WHERE text IS NOT NULL
), fragment AS (
SELECT name AS name, COUNT(*)
FROM `bigquery-public-data.usa_names.usa_1910_current`
WHERE name IS NOT NULL
GROUP BY name
), fragment_array AS (
SELECT ARRAY_AGG(name) AS names, COUNT(*) AS count
FROM fragment
GROUP BY LENGTH(name)
), records_with_fragments AS (
SELECT record.name,
CONTAINS_ANY(record.name, fragment_array.names)
AS fragment_name
FROM record INNER JOIN fragment_array
ON CONTAINS_ANY(name, fragment_array.names) IS NOT NULL
)
SELECT * EXCEPT(rownum) FROM (
SELECT record.name,
records_with_fragments.fragment_name,
ROW_NUMBER() OVER (PARTITION BY record.name) AS rownum
FROM record
INNER JOIN records_with_fragments
ON records_with_fragments.name = record.name
AND records_with_fragments.fragment_name IS NOT NULL
) WHERE rownum = 1
The idea is that the list of fragments is small enough to be processed in an array, similar to Felipe's answer using regular expressions. The first thing I do is create a fragment_array table which is grouped by the fragment lengths, a cheap way of preventing an oversized array, which I found can cause UDF timeouts.
Next I create a table called records_with_fragments that joins those arrays to the original records, finding only those which contain a matching fragment using the JavaScript UDF CONTAINS_ANY(). This will result in a table containing some duplicates since one record may match multiple fragments.
The final SELECT then pulls in the original record table, joins to records_with_fragments to determine which fragment matched, and also uses the ROW_NUMBER() function to prevent duplicates, e.g. only showing the first row of each record as uniquely identified by its name.
Now, the reason I do the join in the final query is because in my actual data there are more fields I want besides just the string being matched. Earlier on in my actual data I create a table of DISTINCT strings which then later need to be re-joined.
Voila! Not the most elegant but it gets the job done.
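For readers who want to follow the logic without running BigQuery, the steps described above (batch the fragments by length, find the first fragment contained in each record, keep one hit per record) can be sketched in plain Python. This mirrors the description only; the function names other than the CONTAINS_ANY analogue are hypothetical and the sample data is made up.

```python
from collections import defaultdict


def contains_any(text, fragments):
    """Return the first fragment found in text, or None (same contract as the UDF)."""
    for fragment in fragments:
        if fragment in text:
            return fragment
    return None


def group_by_length(fragments):
    """Split fragments into smaller batches, like GROUP BY LENGTH(name)."""
    batches = defaultdict(list)
    for fragment in fragments:
        batches[len(fragment)].append(fragment)
    return list(batches.values())


def first_match_per_record(records, fragments):
    batches = group_by_length(fragments)
    result = {}
    for record in records:
        for batch in batches:
            hit = contains_any(record, batch)
            if hit is not None:
                result[record] = hit  # keep a single hit, like rownum = 1
                break
    return result


matches = first_match_per_record(
    ["Mary went home", "nothing to see", "ask Bob or Helen"],
    ["Bob", "Mary", "Helen"],
)
print(matches)
```

Like `indexOf` in the UDF, the `in` test here is case-sensitive, and records without any matching fragment are simply absent from the result.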

select TableData where ColumnData start with list of strings

The following query selects column data from a table where the column data starts with a OR b OR c. But what I am looking for is a way to select data which starts with any of a list of strings.
SELECT * FROM Table WHERE Name LIKE '[abc]%'
But I want something like
SELECT * FROM Table WHERE Name LIKE '[ab,ac,ad,ae]%'
Can anybody suggest the best way of selecting column data which starts with one of a list of strings? I don't want to use the OR operator, specifically a list of strings.
The most general solution you would have to use is this:
SELECT *
FROM Table
WHERE Name LIKE 'ab%' OR Name LIKE 'ac%' OR Name LIKE 'ad%' OR Name LIKE 'ae%';
However, certain databases offer pattern or regex support which you might be able to use. For example, in SQL Server, LIKE supports character sets in brackets, so you could write:
SELECT *
FROM Table
WHERE NAME LIKE 'a[bcde]%';
MySQL has a REGEXP operator which supports regex LIKE operations, and you could write:
SELECT *
FROM Table
WHERE NAME REGEXP '^a[bcde]';
Oracle and Postgres also have regex-like support.
To add to Tim's answer, another approach could be to join your table with a sub-query of those values:
SELECT *
FROM mytable t
JOIN (SELECT 'ab' AS value
UNION ALL
SELECT 'ac'
UNION ALL
SELECT 'ad'
UNION ALL
SELECT 'ae') v ON t.name LIKE v.value || '%'

simplify multiple sql queries different variable

I am trying to better automate my queries so I don't have to change the table name and where clause each time. Right now this is what I do:
Years: 2014, 2013, etc. I might put these variables into a table. I am also doing this on Oracle.
Colors: Red, Green, etc
select count(*) from Apples_2014
where Type = 'Red'
;
select count(*) from Apples_2014
where Type = 'Green'
;
select count(*) from Apples_2013
where Type = 'Red'
;
select count(*) from Apples_2013
where Type = 'Green'
;
Is there a simpler way to do this so I have only one query and then it gets run multiple times but with the different parameters?
Also, through some research I saw I can use &&, which then creates a popup each time in Toad. This isn't really efficient, but it kind of works.
ADDITION (original answer below. Editing after seeing your comment on your post.)
Sounds like you actually want to do grouping.
Running something like the below will give counts for all tables and all values at once in a single result set.
select count(*) as total, 'TABLENAME' as tablename, value
from Tablename
group by value
UNION ALL
select count(*) as total, 'TABLENAME2' as tablename, value
from Tablename2
group by value
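Here is a runnable sketch of that grouped UNION ALL, using Python's built-in `sqlite3` module in place of Oracle. The table names follow the question, but the rows are made up for illustration, and the ORDER BY is only added to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Apples_2014 (Type TEXT);
    CREATE TABLE Apples_2013 (Type TEXT);
    INSERT INTO Apples_2014 VALUES ('Red'), ('Red'), ('Green');
    INSERT INTO Apples_2013 VALUES ('Red'), ('Green'), ('Green'), ('Green');
""")

# One branch per table; the literal in each branch labels where the count came from.
rows = conn.execute("""
    SELECT 'Apples_2014' AS tablename, Type, COUNT(*) AS total
    FROM Apples_2014
    GROUP BY Type
    UNION ALL
    SELECT 'Apples_2013' AS tablename, Type, COUNT(*) AS total
    FROM Apples_2013
    GROUP BY Type
    ORDER BY tablename, Type
""").fetchall()

for row in rows:
    print(row)
```

All year and color combinations come back in one result set, so nothing has to be re-run per parameter.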
***
Original answer:
You can use bind variables in Oracle; it will then prompt for each value.
select count(*) from TABLE
where Type = :value
Assuming your table structures are very similar, you can union and add a parameter to handle the table name changes without switching to dynamic SQL (dynamic SQL is just writing the query as a concatenated string, which is not very efficient).
So like this....
select total
from (
select count(*) as total, 'TABLENAME' as tablename
from Tablename
where type = :value
UNION ALL
select count(*) as total, 'TABLENAME2' as tablename
from Tablename2
where type = :value
) a
where a.tablename = :tablename

TSQL Select where one column equals another column

I'm trying to create a search based on where items go together in the same table.
So if I input a value like a123 for column BoxNo then all values in the column Goeswith which are also a123 are selected. This below code is my attempt, but does not work.
SELECT *
FROM Equipment
WHERE (BoxNo LIKE '%') = GoesWith
thanks
If you want all rows where BoxNo and GoesWith have the same value then it's this:
SELECT *
FROM Equipment
WHERE BoxNo = GoesWith
Maybe you mean,
SELECT *
FROM tableName
WHERE 'a123' IN (BoxNo, GoesWith)
or maybe this,
SELECT *
FROM tableName
WHERE BoxNo LIKE '%a123%' AND
BoxNo = GoesWith
If you want to search all items like a123 in BoxNo column:
SELECT * From Equipment WHERE BoxNo LIKE '%a123%'
Or if you want to search for a123 in both columns:
DECLARE @Search Varchar(50) = 'a123'
SELECT * From Equipment WHERE BoxNo = @Search AND GoesWith = @Search
I suspect you meant 'as well as', hence OR:
select * from equipment where BoxNo='a123' OR GoesWith='a123'
Be careful to add parentheses around the OR if you need to add further constraints.
Were you looking for a general match of all Box records that have another entry that they "go with"?
If so, then you need a self-join (which uses aliases to identify the records):
select
b.BoxNo,
g.GoesWith
from
equipment as b
inner join equipment as g on b.BoxNo = g.GoesWith
This identifies all the records that have a matching box (and which thing they go with).
Change to a left join to include records that have no match.
It will produce multiple matches if a Box has several GoesWith entries, but by returning either DISTINCT b.* or DISTINCT g.* you can get a list of individual matches.
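To see the difference between the inner and the left self-join, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for SQL Server. The `item` column and the sample rows are assumptions added purely so the output is readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE equipment (item TEXT, BoxNo TEXT, GoesWith TEXT);
    INSERT INTO equipment VALUES
        ('camera', 'a123', NULL),
        ('lens',   'b456', 'a123'),
        ('tripod', 'c789', 'a123'),
        ('cable',  'd000', NULL);
""")

# b is the box side, g is the record that "goes with" that box.
QUERY = """
    SELECT b.item AS box_item, g.item AS goes_with_item
    FROM equipment AS b
    {kind} JOIN equipment AS g ON b.BoxNo = g.GoesWith
    ORDER BY b.item, g.item
"""

inner_rows = conn.execute(QUERY.format(kind="INNER")).fetchall()
left_rows = conn.execute(QUERY.format(kind="LEFT")).fetchall()

print(inner_rows)
print(left_rows)
```

The inner join lists the camera once per accessory, while the left join also keeps the boxes that nothing goes with, showing NULL (None in Python) on the right-hand side.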

Sql Server Full Text Search Single Result Column Searched Across Multiple Columns

I am trying to implement an AutoComplete search box(like google) using SQL Server 2008 and Full Text Search.
Say I have 3 columns that I want to search across and have created the proper indexes and what not.
The columns are ProductName, ProductNumber, and Color...
For the user input I want to search for possible matches across all three columns and suggest the proper search term.
So say the user starts typing "Bl"
I'd like to return a single column containing results like "Black" and "Blue", which come from the Color column, and also any matches from the other two columns (like ProductNumber: BL2300).
So basically I need to search across multiple columns and return a single column as the result. Is there a way to do this?
UPDATED following the OP's comment: If you created a FULLTEXT INDEX on different columns, then you can simply use CONTAINS or FREETEXT to search one of them, all of them, or some of them. Like this:
SELECT *
FROM YourTable
WHERE CONTAINS(*, @SearchTerm);
if you want to search all the columns that are included in the FULLTEXT INDEX, or:
SELECT *
FROM YourTable
WHERE CONTAINS((ProductName, ProductNumber, Color), @SearchTerm);
If you want to specify the columns that you want to search.
If you need the results in one column, you will have to do a UNION with a separate search for every column you want to search.
SELECT *
FROM YourTable
WHERE CONTAINS(ProductName, @SearchTerm)
UNION
SELECT *
FROM YourTable
WHERE CONTAINS(ProductNumber, @SearchTerm)
UNION
SELECT *
FROM YourTable
WHERE CONTAINS(Color, @SearchTerm)
If you do not need to associate the single columns, something like
SELECT * FROM Table WHERE ProductName LIKE @SearchTerm + '%'
UNION
SELECT * FROM Table WHERE ProductNumber LIKE @SearchTerm + '%'
UNION
SELECT * FROM Table WHERE Color LIKE @SearchTerm + '%'
is a good point to start from.
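To get a true single suggestion column, each UNION branch can select only the column it searched, under a common alias. The sketch below shows this with Python's built-in `sqlite3` module standing in for SQL Server and a plain LIKE prefix match instead of full-text search; the `product` table, its rows, and the `suggest` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (ProductName TEXT, ProductNumber TEXT, Color TEXT);
    INSERT INTO product VALUES
        ('Blade Guard', 'BG-100', 'Black'),
        ('Road Frame',  'BL2300', 'Blue'),
        ('Helmet',      'HL-200', 'Black');
""")


def suggest(prefix):
    """Return distinct values from any of the three columns that start with prefix."""
    rows = conn.execute("""
        SELECT ProductName AS suggestion FROM product WHERE ProductName LIKE :p
        UNION
        SELECT ProductNumber FROM product WHERE ProductNumber LIKE :p
        UNION
        SELECT Color FROM product WHERE Color LIKE :p
        ORDER BY suggestion
    """, {"p": prefix + "%"}).fetchall()
    return [value for (value,) in rows]


print(suggest("Bl"))
```

UNION (rather than UNION ALL) removes the duplicate 'Black'. SQLite's LIKE is case-insensitive for ASCII by default, which is why 'BL2300' is suggested for the input "Bl"; on SQL Server this depends on the column collation.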