Search for multiple values in IN clause located in where clause - sql

I'm extending some search functionality and need to search for multiple values in an IN clause. The user types a number into a textbox. The filter is dynamic: it adds a condition to the WHERE clause of the SQL, and only when a search is actually performed.
Let's say we put 1056 in our textbox to search for it.
SELECT
*
FROM
QueryTable
WHERE
(Select ID from SearchTable Where Number='1056') IN
(SELECT value FROM OPENJSON(JsonField,'$.Data.ArrayOfIds'))
This works because the lookup in SearchTable returns exactly one row, but I want it to work when the lookup returns multiple rows, something like this:
(Select ID from SearchTable Where Number like '%105%')
This obviously gives an error, because the subquery returns multiple rows. What is the approach for searching for multiple values (or an array of values) in an IN clause?

Use an EXISTS. I think this is what you need (no sample data to test with to confirm):
SELECT *
FROM QueryTable QT
WHERE EXISTS (SELECT 1
FROM SearchTable ST
JOIN OPENJSON(QT.JsonField,'$.Data.ArrayOfIds') J ON ST.ID = J.[value]
WHERE ST.Number LIKE '%105%')
Note: I've added aliases to everything, but I've guessed the alias for JsonField. Qualifying your columns is really important to avoid unexpected behaviour and to make your code easier to read for both yourself and others.
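To try the EXISTS pattern without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the SQLite build includes the JSON functions, and it uses json_each as a stand-in for OPENJSON. The QueryId column and all sample rows are hypothetical; only the table and JSON path names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SearchTable (ID INTEGER PRIMARY KEY, Number TEXT);
    CREATE TABLE QueryTable  (QueryId INTEGER PRIMARY KEY, JsonField TEXT);

    -- Hypothetical sample data
    INSERT INTO SearchTable VALUES (1, '1056'), (2, '1057'), (3, '2000');
    INSERT INTO QueryTable VALUES
        (10, '{"Data": {"ArrayOfIds": [1, 3]}}'),
        (11, '{"Data": {"ArrayOfIds": [2]}}'),
        (12, '{"Data": {"ArrayOfIds": [3]}}');
""")

# json_each plays the role of OPENJSON here: it expands the JSON array
# of the current QueryTable row into one row per element.
rows = conn.execute("""
    SELECT QT.QueryId
    FROM QueryTable AS QT
    WHERE EXISTS (SELECT 1
                  FROM json_each(QT.JsonField, '$.Data.ArrayOfIds') AS J
                  JOIN SearchTable AS ST ON ST.ID = J.value
                  WHERE ST.Number LIKE ?)
    ORDER BY QT.QueryId
""", ("%105%",)).fetchall()

matched_ids = [r[0] for r in rows]
print(matched_ids)
```

The pattern '%105%' matches two SearchTable rows here, and the EXISTS keeps every QueryTable row whose array contains at least one of their IDs, so the subquery may return any number of rows without raising an error.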

Related

SQL Where Condition having 15+ different strings for one column

I would like to implement for 1 column multiple specific parameters like:
select * from table1
where column1 = a or column1 = b or column1 = c ...
Can it be done in a better way? The SQL statement in use is already over 10 lines long; with the OR conditions it will grow by another 10 lines, and I expect it to make the query much slower.
You can use in:
select t.*
from t
where column in ( . . . );
The in list is pretty much equivalent to a bunch of or conditions. There are some nuances. For instance, all the values in the in list will be converted to the same type. If one is a number and the rest are strings, then in most databases all will be converted to numbers -- perhaps generating an error.
For performance, you want an index on t(column).
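When the list of values comes from application code, the IN list can be built with one placeholder per value. The following is a minimal sketch using Python's sqlite3 module; the sample rows, the index name, and the `wanted` list are hypothetical, and other drivers may use a different placeholder style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, column1 TEXT);
    CREATE INDEX idx_table1_column1 ON table1 (column1);
    INSERT INTO table1 (column1) VALUES ('a'), ('b'), ('c'), ('d'), ('e');
""")

wanted = ["a", "c", "e"]  # hypothetical search values

# Build "?, ?, ?" so every value is passed as a bound parameter
placeholders = ", ".join("?" for _ in wanted)
in_rows = conn.execute(
    f"SELECT column1 FROM table1 WHERE column1 IN ({placeholders}) "
    "ORDER BY column1",
    wanted,
).fetchall()

# The same filter written as a chain of OR conditions
or_rows = conn.execute(
    "SELECT column1 FROM table1 "
    "WHERE column1 = 'a' OR column1 = 'c' OR column1 = 'e' "
    "ORDER BY column1"
).fetchall()

print(in_rows == or_rows)
```

Both queries return the same rows; the IN form simply stays one line long no matter how many values are added.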

Is there a way to express AND in SIMILAR TO ignoring order of matches?

I have a Redshift table column that contains 1 to many hashtags (e.g. #a, #b, etc.). I want to write a query that finds rows where all tags from a given set exist (e.g. #a and #b) while not picking up other rows that have some but not all of the tags (e.g. only #a or only #b).
I can see how to do this with multiple LIKE conditions (e.g. field LIKE '%#a%' AND field LIKE '%#b%') but I would really like to do it with a single condition. I can see how to do this with SIMILAR TO but not in a way that ignores ordering. The following would work but only if I include all possible combinations of ordering.
SELECT * FROM table WHERE field SIMILAR TO '(%#a%)(%#b%)|(%#b%)(%#a%)'
This works but having to list all combinations of the tags I'm looking for would be a royal pain and prone to error. Is there a way to express 'AND' in SIMILAR TO (or another function) in Redshift that ignores order?
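Make sure to capture the whole tag in any position and not match on incomplete tags (this assumes the tags are stored back to back, e.g. #a#b):
SELECT *
FROM table
WHERE (field LIKE '%#a#%' OR field LIKE '%#a') AND
(field LIKE '%#b#%' OR field LIKE '%#b')
Each tag must be either followed by another tag or at the end of the value. This avoids matching data such as #ac#b.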
Use AND and LIKE:
SELECT t.*
FROM table t
WHERE field LIKE '%#a%' AND
field LIKE '%#b%';
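The difference between plain substring matching and whole-tag matching is easy to check on a few rows. This is a minimal sketch in Python with sqlite3 rather than Redshift; the posts table and its rows are hypothetical, and it assumes tags are stored back to back with no separator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (field TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?)",
    [("#a#b",), ("#b#a",), ("#ac#b",), ("#a",)],  # hypothetical rows
)

# Substring match: any value containing both "#a" and "#b"
loose = sorted(r[0] for r in conn.execute("""
    SELECT field FROM posts
    WHERE field LIKE '%#a%' AND field LIKE '%#b%'
"""))

# Whole-tag match: each tag is followed by another tag or ends the value
strict = sorted(r[0] for r in conn.execute("""
    SELECT field FROM posts
    WHERE (field LIKE '%#a#%' OR field LIKE '%#a')
      AND (field LIKE '%#b#%' OR field LIKE '%#b')
"""))

print(loose)
print(strict)
```

The substring version also returns #ac#b, because #ac contains #a; the whole-tag version does not. Both ignore the order of the tags.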

Fastest way to check if any case of a pattern exist in a column using SQL

I am trying to write code that allows me to check if there are any cases of a particular pattern inside a table.
The way I am currently doing is with something like
select count(*)
from database.table
where column like (some pattern)
and seeing if the count is greater than 0.
I am curious to see if there is any way I can speed up this process as this type of pattern finding happens in a loop in my query and all I need to know is if there is even one such case rather than the total number of cases.
Any suggestions will be appreciated.
EDIT: I am running this inside a Teradata stored procedure for the purpose of data quality validation.
Using EXISTS will be faster if you don't actually need to know how many matches there are. Something like this would work:
IF EXISTS (
SELECT *
FROM bigTbl
WHERE label LIKE '%test%'
)
SELECT 'match'
ELSE
SELECT 'no match'
This is faster because once it finds a single match it can return a result.
If you don't need the actual count, the most efficient way in Teradata will use EXISTS:
select 1
where exists
( select *
from database.table
where column like (some pattern)
)
This will return an empty result set if the pattern doesn't exist.
In terms of performance, a better approach is to:
select the result set based on your pattern;
limit the result set's size to 1;
check whether a result was returned.
Doing this lets the query return as soon as the first matching record is encountered, instead of finding and counting every match. If nothing matches, the engine still has to examine every candidate row.
The actual query depends on the database you're using. In MySQL, it would look something like:
SELECT id FROM database.table WHERE column LIKE '%some pattern%' LIMIT 1;
In Oracle it would look like this:
SELECT id FROM database.table WHERE column LIKE '%some pattern%' AND ROWNUM = 1;
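Both ideas (EXISTS and limiting the result to one row) can be wrapped in a small yes/no check. Here is a minimal sketch in Python with sqlite3 standing in for the real database; the helper names and sample rows are hypothetical, and the bigTbl and label names are borrowed from the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bigTbl (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO bigTbl (label) VALUES (?)",
    [("alpha",), ("unit test",), ("testing",), ("omega",)],  # hypothetical rows
)

def has_match_exists(pattern):
    """Return True if at least one label matches, using EXISTS."""
    (found,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM bigTbl WHERE label LIKE ?)",
        (pattern,),
    ).fetchone()
    return bool(found)

def has_match_limit(pattern):
    """Return True if at least one label matches, using LIMIT 1."""
    row = conn.execute(
        "SELECT id FROM bigTbl WHERE label LIKE ? LIMIT 1",
        (pattern,),
    ).fetchone()
    return row is not None

print(has_match_exists("%test%"), has_match_limit("%test%"))
print(has_match_exists("%zzz%"), has_match_limit("%zzz%"))
```

Neither helper counts rows, so the database is free to stop at the first hit. How much that saves in practice depends on the engine and on how early a match appears.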

Replacement for 'OR' in SphinxQL

I'm currently trying to integrate Sphinx search engine into Python application. The problem is that SphinxQL doesn't support OR clause as common SQL does. There are some hacks to use, like writing expressions in SELECT like this:
SELECT id,(field1 = val1 OR field2 = val2) as expr FROM foo_bar WHERE expr = 1;
However, it doesn't work with strings, because they should be handled using MATCH function. So I decided to divide query into separate subqueries and combine results obtained. Yet there's still a problem of getting a proper META information, especially the total_found field. Sphinx counts it for separate queries, but rows obtained from these queries may intersect and I have no ability to check it (database is large).
I believe there must be a solution. I'm using Sphinxit (SphinxAlchemy has a version conflict with SQLAlchemy I'm using).
Repost from SphinxSearch forum:
I have a table I need to search in with text and numerical columns as well. I need to
write a query with OR condition; found out that there's a way to do it using SELECT
expressions like:
SELECT *, quantity>=50 OR quantity=0 AS mycond FROM table1 WHERE mycond = 1;
Unfortunately, it doesn't work with string attributes. This query isn't parsed:
SELECT *, category='foo' OR category='bar' AS mycond FROM table1 WHERE mycond = 1;
Yet this is working in Beta 2.2.3:
SELECT * FROM table1 WHERE category='foo';
What should I do to find the count of rows that fit at least one of the conditions, not every one of them?
I can make a few queries and merge the obtained items into one list, but I need to know how
many of these rows are in the database now.
For attribute / facet OR'ing, I think you're correct that the only way is to put an expression in the SELECT clause.
For strings, though, check out the documentation on the fulltext query syntax. You can't exactly use the OR keyword, but something like this should work:
SELECT id, name
FROM recipes
WHERE MATCH('(@ingredients chocolate) | (@name cake)')
LIMIT 10;

COUNT() Function in conjunction with NOT IN clause not working properly with varchar field (T-SQL)

I came across a weird situation when trying to count the number of rows that DO NOT have varchar values specified by a select statement. Ok, that sounds confusing even to me, so let me give you an example:
Let's say I have a field "MyField" in "SomeTable" and I want to count in how many rows MyField values do not belong to a domain defined by the values of "MyOtherField" in "SomeOtherTable".
In other words, suppose that I have MyOtherField = {1, 2, 3}, I wanna count in how many rows MyField value are not 1, 2 or 3. For that, I'd use the following query:
SELECT COUNT(*) FROM SomeTable
WHERE ([MyField] NOT IN (SELECT MyOtherField FROM SomeOtherTable))
And it works like a charm. Notice, however, that MyField and MyOtherField are int typed. If I run the exact same query against varchar typed fields, it returns 0 even though I know there are wrong values: I put them there! And if I count the opposite (how many rows ARE in the domain, rather than how many are not) simply by removing the NOT from the query above... well, THAT works! ¬¬
Yeah, there must be tons of workarounds to this but I'd like to know why it doesn't work the way it should. Furthermore, I can't simply alter the whole query as most of it is built inside a C# code and basically the only part I have freedom to change that won't have an impact in any other part of the software is the select statement that corresponds to the domain (whatever comes in the NOT IN clause). I hope I made myself clear and someone out there could help me out.
Thanks in advance.
For NOT IN, the condition is never true if the subquery returns a NULL value: x NOT IN (1, NULL) expands to x <> 1 AND x <> NULL, and any comparison with NULL is UNKNOWN rather than true. The accepted answer to this question elegantly describes why.
The NULLability of a column is independent of its data type, too: most likely your varchar column has NULL values.
To deal with this, use NOT EXISTS. For non-null values it works the same as NOT IN, so it is a compatible replacement:
SELECT COUNT(*) FROM SomeTable S1
WHERE NOT EXISTS (SELECT * FROM SomeOtherTable S2 WHERE S1.[MyField] = S2.MyOtherField)
gbn has a more complete answer, but I can't be bothered to remember all that. Instead I have the religious habit of filtering nulls out of my IN clauses:
SELECT COUNT(*)
FROM SomeTable
WHERE [MyField] NOT IN (
SELECT MyOtherField FROM SomeOtherTable
WHERE MyOtherField is not null
)
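The effect of a single NULL in the subquery can be reproduced in a few lines. This is a minimal sketch in Python with sqlite3 standing in for SQL Server (both treat NULL this way in NOT IN); the sample rows and the `count` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SomeTable (MyField TEXT);
    CREATE TABLE SomeOtherTable (MyOtherField TEXT);
    -- Hypothetical data: 'x' and 'y' are outside the domain
    INSERT INTO SomeTable VALUES ('a'), ('b'), ('x'), ('y');
    INSERT INTO SomeOtherTable VALUES ('a'), ('b'), (NULL);
""")

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

# The NULL in the subquery means NOT IN is never true
not_in = count("""
    SELECT COUNT(*) FROM SomeTable
    WHERE MyField NOT IN (SELECT MyOtherField FROM SomeOtherTable)
""")

# NOT EXISTS is unaffected by the NULL
not_exists = count("""
    SELECT COUNT(*) FROM SomeTable S1
    WHERE NOT EXISTS (SELECT 1 FROM SomeOtherTable S2
                      WHERE S1.MyField = S2.MyOtherField)
""")

# Filtering the NULLs out of the subquery also works
not_in_filtered = count("""
    SELECT COUNT(*) FROM SomeTable
    WHERE MyField NOT IN (SELECT MyOtherField FROM SomeOtherTable
                          WHERE MyOtherField IS NOT NULL)
""")

print(not_in, not_exists, not_in_filtered)
```

Since only the subquery can be changed in the original C# code, the filtered NOT IN variant is the one that fits that constraint.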