Need help with optimizing "not in" query - sql

I have an SQL query that I am looking to optimize.
Basically, I need to get all rows where ALERT_ORIGIN is "FOO" which do not have a corresponding row in the same table with ALERT_ORIGIN "BAR". The table contains about 17000 rows and there are only about 1000 records with ALERT_ORIGIN "BAR", so my query is supposed to give me about 16000 rows.
EDIT : The current query is very slow. I do not have any indexes currently.

I'm guessing that you have NULL values in the phone column which means NOT IN doesn't work (so it's "fix" not "optimise"). So I've written it with NOT EXISTS:
Q1.RECORD_ID is null
If it is slow rather than "wrong" then you need to use indexes. What do you have now?
For this query, you need an index on (ALERT_ORIGIN, PHONE, RECORD_ID).
Note: use single quotes for string delimiters
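To make the NULL problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name alerts, the index name, and the sample rows are made up for illustration; only the column names come from the question. It shows NOT IN returning nothing once the subquery yields a NULL, while NOT EXISTS returns the expected rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table; only the column names come from the question
    CREATE TABLE alerts (record_id INTEGER PRIMARY KEY, alert_origin TEXT, phone TEXT);
    CREATE INDEX idx_alerts_origin_phone ON alerts (alert_origin, phone, record_id);
    INSERT INTO alerts (alert_origin, phone) VALUES
        ('FOO', '111'), ('FOO', '222'), ('FOO', '333'),
        ('BAR', '111'), ('BAR', NULL);
""")

# NOT IN: the NULL phone in the 'BAR' rows makes every comparison unknown or false
not_in_rows = conn.execute("""
    SELECT phone FROM alerts
    WHERE alert_origin = 'FOO'
      AND phone NOT IN (SELECT phone FROM alerts WHERE alert_origin = 'BAR')
""").fetchall()

# NOT EXISTS: the NULL row simply never matches, so it does no harm
not_exists_rows = conn.execute("""
    SELECT a.phone FROM alerts a
    WHERE a.alert_origin = 'FOO'
      AND NOT EXISTS (SELECT 1 FROM alerts b
                      WHERE b.alert_origin = 'BAR' AND b.phone = a.phone)
    ORDER BY a.phone
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [('222',), ('333',)]
```

This runs on SQLite, so treat it as an illustration of the NULL semantics rather than of how your particular engine will plan the query.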


SQL Server where column in where clause is null

Let's say that we have a table named Data with Id and Weather columns. Other columns in that table are not important to this problem. The Weather column can be null.
I want to display all rows where Weather fits a condition, but if there is a null value in weather then display null value.
My SQL so far:
FROM Data d
WHERE (d.Weather LIKE '%'+COALESCE(NULLIF('',''),'sunny')+'%' OR d.Weather IS NULL)
My results are wrong, because that statement also shows rows where Weather is null when the condition matches nothing (let's say the user mistyped the value).
I found a similar topic, but I did not find an appropriate answer there:
SQL WHERE clause not returning rows when field has NULL value
Please help me out.
Your query is correct for the general task of treating NULLs as a match. If you wish to suppress NULLs when there are no other results, you can add an AND EXISTS ... condition to your query, like this:
FROM Data d
WHERE d.Weather LIKE '%'+COALESCE(NULLIF('',''),'sunny')+'%'
OR (d.Weather IS NULL AND EXISTS (SELECT * FROM Data dd WHERE dd.Weather LIKE '%'+COALESCE(NULLIF('',''),'sunny')+'%'))
The additional condition ensures that NULLs are treated as matches only if other matching records exist.
You can also use a common table expression to avoid duplicating the query, like this:
WITH cte (id, weather) AS
FROM Data d
WHERE d.Weather LIKE '%'+COALESCE(NULLIF('',''),'sunny')+'%'
You wrote: "statement also shows values where Weather is null if condition is not correct (let's say that the user mistyped sunny)."
This suggests that the constant 'sunny' is coming from end-user's input. If that is the case, you need to parameterize your query to avoid SQL injection attacks.
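As a rough sketch of what parameterizing could look like, here is the AND EXISTS version of the query run through Python's sqlite3 module with a named placeholder. The function name find_weather and the sample rows are assumptions for the demo, and the pattern is assembled in Python instead of with the + concatenation from the original query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Data (Id INTEGER PRIMARY KEY, Weather TEXT);
    INSERT INTO Data (Weather) VALUES ('sunny'), ('partly sunny'), ('rain'), (NULL);
""")

def find_weather(conn, term):
    """Hypothetical helper: user input is bound as a parameter, never spliced into the SQL."""
    pattern = f"%{term}%"
    sql = """
        SELECT Id, Weather FROM Data d
        WHERE d.Weather LIKE :pattern
           OR (d.Weather IS NULL
               AND EXISTS (SELECT 1 FROM Data dd WHERE dd.Weather LIKE :pattern))
        ORDER BY Id
    """
    return conn.execute(sql, {"pattern": pattern}).fetchall()

print(find_weather(conn, "sunny"))        # matches plus the NULL row
print(find_weather(conn, "sunnny"))       # typo: no matches, so no NULL row either
print(find_weather(conn, "' OR 1=1 --"))  # injection attempt is treated as plain text
```

Note that % and _ typed by the user still act as LIKE wildcards here; binding the value only prevents it from being parsed as SQL.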

What is the difference between the IN operator and = operator in SQL?

I am just learning SQL, and I'm wondering what the difference is between the following lines:
WHERE s.parent IN (SELECT l.parent .....)
WHERE s.parent = (SELECT l.parent .....)
IN will not generate an error if the subquery returns multiple results; it allows more than one value in the result returned by the subquery.
= will generate an error if the subquery returns more than one result.
When you use IN, you can compare against multiple values:
select * from tablename where student_name in ('mari','sruthi','takudu')
But when you use =, you can compare against only one value:
select * from tablename where student_name = 'sruthi'
I hope this is the right answer.
The "IN" clause is also much, much slower. If you have many results in the select portion of
IN (SELECT l.parent .....),
it will be extremely inefficient, as it actually generates a separate select statement for each and every result within the select statement. So if you return 'Cat', 'Dog', 'Cow',
it will essentially create a SQL statement for each result. If you have 200 results, you get the full SQL statement 200 times, which takes forever. (This was as of a few years ago; it may have improved by now, but it was horribly slow on big result sets.)
Much more efficient to do an inner join such as:
Select T.id, T.parent
from table1 as T
inner join (Select parent from table2) as T2 on T.parent = T2.parent
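One caveat with swapping IN for an inner join: the two are only equivalent if the joined column has no duplicates. The sketch below (sample rows are made up; table and column names follow the join above) runs both forms in SQLite through Python's sqlite3 module and shows the join repeating a row that IN returns once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, parent TEXT);
    CREATE TABLE table2 (parent TEXT);
    INSERT INTO table1 (parent) VALUES ('Cat'), ('Dog'), ('Cow');
    INSERT INTO table2 (parent) VALUES ('Cat'), ('Cat'), ('Dog');  -- 'Cat' appears twice
""")

# IN only asks "is there at least one match?", so each table1 row appears at most once
in_rows = conn.execute("""
    SELECT id, parent FROM table1
    WHERE parent IN (SELECT parent FROM table2)
    ORDER BY id
""").fetchall()

# The join pairs each table1 row with every matching table2 row
join_rows = conn.execute("""
    SELECT T.id, T.parent FROM table1 AS T
    INNER JOIN (SELECT parent FROM table2) AS T2 ON T.parent = T2.parent
    ORDER BY T.id
""").fetchall()

# Adding DISTINCT inside the derived table makes the join match the IN result
distinct_join_rows = conn.execute("""
    SELECT T.id, T.parent FROM table1 AS T
    INNER JOIN (SELECT DISTINCT parent FROM table2) AS T2 ON T.parent = T2.parent
    ORDER BY T.id
""").fetchall()

print(in_rows)             # [(1, 'Cat'), (2, 'Dog')]
print(join_rows)           # [(1, 'Cat'), (1, 'Cat'), (2, 'Dog')]
print(distinct_join_rows)  # [(1, 'Cat'), (2, 'Dog')]
```

This says nothing about which form is faster on your engine; it only shows that the results can differ.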
For future visitors.
Basically, in the case of equals (remember that here we are talking about where =), each cell value from table 1 is compared one by one to each cell value of all the rows from table 2; if it matches, that row (meaning the row from table 1 and table 2) is selected for the overall result set, otherwise it is not selected.
Now, in case of IN, complete result set on the right side of the IN will be used for comparison, so its like each value from table 1 will be checked on whether this cell value is present in the complete result set of the IN, if it is present then that value will be shown for all the rows of the IN’s result set, so let say IN result set has 20 rows, so that cell value from table 1 will be present in overall result set 20 times (i.e. that particular cell value will have 20 rows).
The whole emphasis is on the fact that in a comparison using =, the matching row from the second table is selected, while in the case of IN (and NOT IN) the complete result set on the right side is considered.
IN can match a value against more than one value; in other words, it checks whether a value is in a list of values. For example,
x in ('a', 'b', 'x') returns true, as x is in the list of values,
while = expects only one value. It's as simple as:
x = y returns false
x = x returns true
The general rule of thumb is:
The = expects a single value to compare with. Like this:
WHERE s.parent = 'father_name'
IN is extremely useful in scenarios where = cannot work i.e. scenarios where you need the comparison with multiple values.
WHERE s.parent IN ('father_name', 'mother_name', 'brother_name', 'sister_name')
Hope this is useful!!!
IN: this helps when a subquery returns more than one result.
=: this operator cannot handle a subquery that returns more than one result.
Like in this example:
Select LOC from dept where DEPTNO = (select DEPTNO from emp
where JOB='MANAGER');
Gives ERROR ORA-01427: single-row subquery returns more than one row
Instead use
Select LOC from dept where DEPTNO in (select DEPTNO from emp
where JOB='MANAGER');
1) = is also used as the comparison operator in join conditions, which IN is not.
2) You can pass multiple values in the IN list, which you can't do with =. For example,
SELECT * FROM [Products] where ProductID IN((select max(ProductID) from Products),
(select min(ProductID) from Products))
would work and give you the expected number of rows. However,
SELECT * FROM [Products] where ProductID = (select max(ProductID) from Products)
and ProductID =(select min(ProductID) from Products)
will give you no result. That means = isn't useful when the subquery is supposed to return multiple rows.

Minus operator in sql

I am trying to create a sql query with minus.
I have query1, which returns 28 rows with 2 columns.
I have query2, which returns 22 rows with the same 2 columns.
When I run query1 minus query2, it should show only the 28-22=6 rows.
But it is showing all 28 rows returned by query1.
Please advise.
Try using EXCEPT instead of MINUS. For Example:
Let's consider a case where you want to find out which tasks in a table haven't been assigned to you (so basically you are trying to find which tasks could be available to do).
SELECT TaskID, TaskType
FROM Tasks
EXCEPT
SELECT TaskID, TaskType
FROM Tasks
WHERE Username = 'Vidya'
That would return all the tasks that haven't been assigned to you. Hope that helps.
If MINUS won't work for you, the general form you want is the main query in the outer select and a variation of the other query in a not exists clause.
select <insert list of fields here>
from mytable a
join myothertable b
on b.aId = a.aid
where not exists (select * from tablec c where a.aid = c.aid)
The fields might not be exactly alike. Maybe one of the fields is char(10) and the other is char(20), and they both have the string "TEST" in them. They might "look" the same.
If the database you are working on supports "INTERSECT", try this query and see how many are perfectly matching results.
select field1, field2 from table1
INTERSECT
select field1, field2 from table2
To get the results you are expecting, this query should give you 22 rows.
something like this:
select field1, field2, ..., field_n
from tables
MINUS
select field1, field2, ..., field_n
from tables;
MINUS works on the same principle as set difference. Suppose you have sets A and B,
A = {1,2,3,4}; B = {3,5,6}
then, A-B = {1,2,4}
If A = {1,3,5} and B = {2,4,6}
then, A-B = {1,3,5}. Here count(A) is the same before and after the MINUS operation, as A does not contain any terms that overlap with set B.
Along similar lines, the result set of query2 may not have any terms that match the result of query1. Hence you are still getting 28 rows instead of 6.
Hope this helps.
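Here is a small sketch of that effect, run in SQLite through Python's sqlite3 module (SQLite spells the operator EXCEPT, not MINUS). The tables q1 and q2 and their rows are invented stand-ins for the two query results; one row in q2 carries a trailing space so that it only "looks" the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- q1 and q2 are hypothetical stand-ins for the results of query1 and query2
    CREATE TABLE q1 (field1 INTEGER, field2 TEXT);
    CREATE TABLE q2 (field1 INTEGER, field2 TEXT);
    INSERT INTO q1 VALUES (1, 'TEST'), (2, 'TEST'), (3, 'TEST');
    INSERT INTO q2 VALUES (1, 'TEST'), (2, 'TEST ');  -- second row has a trailing space
""")

# Only rows that are equal in every column are removed
difference = conn.execute("""
    SELECT field1, field2 FROM q1
    EXCEPT
    SELECT field1, field2 FROM q2
    ORDER BY field1
""").fetchall()

# INTERSECT shows how many rows really match exactly
common = conn.execute("""
    SELECT field1, field2 FROM q1
    INTERSECT
    SELECT field1, field2 FROM q2
""").fetchall()

# Normalizing the column on both sides removes the hidden mismatch
trimmed_difference = conn.execute("""
    SELECT field1, TRIM(field2) FROM q1
    EXCEPT
    SELECT field1, TRIM(field2) FROM q2
    ORDER BY 1
""").fetchall()

print(difference)          # [(2, 'TEST'), (3, 'TEST')]
print(common)              # [(1, 'TEST')]
print(trimmed_difference)  # [(3, 'TEST')]
```

If the INTERSECT count is lower than you expect, comparing the raw column values (lengths, padding, types) is usually the next thing to check.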
It returns the records from the upper query which are not contained in the second query.
In your case, for example, if
A={1,2,3,4,5...28} and B={29,30}, then A-B={1,2,3....28}

TSQL NOT EXISTS Why is this query so slow?

I am debugging an app which queries SQL Server 2005. I can't change the query but need to optimise things.
Running each of the selects separately is quick (<1 sec), e.g. select * from acscard, select id from employee... When joined together it takes 50 seconds.
Is it better to set uninteresting accesscardid fields to null or to '' when using EXISTS?
( SELECT Id FROM Employee
WHERE Employee.AccessCardId = ACSCard.acs_card_number )
WHERE Visit.AccessCardId = ACSCard.acs_card_number )
ORDER by acs_card_id
Do you have indexes on Employee.AccessCardId, Visit.AccessCardId, and ACSCard.acs_card_number?
The SELECT clause is not evaluated in an EXISTS clause. This:
...should raise an error for dividing by zero, but it won't. But you need to put something in the SELECT clause for it to be a valid query - it doesn't matter if it's NULL or a zero length string.
In SQL Server, NOT EXISTS (and NOT IN) are better than the LEFT JOIN/IS NULL approach if the columns being compared are not nullable (the values on either side can not be NULL). The columns compared should be indexed, if they aren't already.

Is there a way to use a wildcard in a "where" statement for MySQL?

I have a query that uses a where clause. At times I want to use it, and at other times I want to omit it completely to get back all results. I can certainly write two different queries, but I would like to cut down on code where I can for the sake of simplicity. Is there a way to do this in MySQL?
Take a query like:
SELECT * FROM my_table WHERE id = '3'
SELECT * FROM my_table
Is there a way to use the top query and still get back all records?
No, because the predicate in the first query may not actually retrieve all of the records from the table; it may use an index so that it only has to obtain the specific record(s) the query needs to return.
If you wanted to keep a predicate of that same form but still return all of the results, you would need to do something like this:
where id = 3 or id <> 3
or this:
where id = id
Note that to either of these, you'll have to add or id is null if id can be null.
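The NULL caveat is easy to check. Below is a quick sketch using Python's sqlite3 module with a made-up three-row my_table, one of whose ids is NULL; the helper count_where is hypothetical and only accepts the fixed predicates written in the script:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id TEXT);
    INSERT INTO my_table VALUES ('1'), ('3'), (NULL);
""")

def count_where(predicate):
    """Hypothetical demo helper; predicate is a hard-coded string, never user input."""
    return conn.execute(f"SELECT COUNT(*) FROM my_table WHERE {predicate}").fetchone()[0]

print(count_where("id = '3' OR id <> '3'"))   # 2: the NULL row is dropped
print(count_where("id = id"))                 # 2: NULL = NULL is not true
print(count_where("id = id OR id IS NULL"))   # 3: all rows
print(count_where("1"))                       # 3: all rows
```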
If you just want to have a predicate in your query, this will suffice:
where 1
but this is just redundant, and you may as well just leave the predicate out.
More food for thought...
I notice you quoted the '3'. If your ids are char data you could use the LIKE string comparison operator.
For a single value
SELECT * FROM my_table WHERE id LIKE '3'
For all values
SELECT * FROM my_table WHERE id LIKE '%'
Won't give you any values with NULL id though.
If I understand your question correctly, then YES
SELECT * FROM my_table WHERE 1=1
If you're building the SQL query as you go along, and you decide at the last minute that you want to negate/ignore the "WHERE" part of your query, you can append OR 1 to your where-clause. Remember that logically, X OR TRUE is true for all X.
sqlite> SELECT id FROM moz_downloads WHERE id < 405 LIMIT 10;
sqlite> SELECT id FROM moz_downloads WHERE id < 405 OR 1 LIMIT 10;
Note that I had to stick a LIMIT 10 in there to not get too many results for the demonstration, but the second statement's where-clause is id < 405 or 1.
It depends on the application, but you may or may not generate your queries at runtime. Some queries will always be the same, like SELECT * FROM recent_files, but some queries will be generated on the fly. In the latter case, you might have something like
something = make_safe_for_sql(get_something_from_user())
query = "SELECT * FROM data WHERE something=" + something
if should_ignore_something:
    query += " OR 1"
Note: Depending on your SQL engine, you might need to do OR 1=1 to evaluate to a boolean true.
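A variation on the same idea, sketched here with Python's sqlite3 module: instead of appending OR 1, add the WHERE clause only when it is wanted, and bind the value as a parameter so no escaping function is needed. The name build_query and the sample rows are assumptions for the demo; the data table and something column follow the snippet above:

```python
import sqlite3

def build_query(something=None):
    """Return (sql, params); the WHERE clause is added only when a filter is given."""
    sql = "SELECT id, something FROM data"
    params = []
    if something is not None:
        sql += " WHERE something = ?"
        params.append(something)
    return sql + " ORDER BY id", params

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER PRIMARY KEY, something TEXT);
    INSERT INTO data (something) VALUES ('a'), ('b'), ('a');
""")

filtered = conn.execute(*build_query("a")).fetchall()
everything = conn.execute(*build_query()).fetchall()

print(filtered)    # [(1, 'a'), (3, 'a')]
print(everything)  # [(1, 'a'), (2, 'b'), (3, 'a')]
```

The calling code stays the same in both cases, which is what the question was after.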
Your question is dubious. You are really saying that when there is no id==3 all entries should be returned. You can do that easily if you pull all entries and then sort them out using php:
$sql = mysql_query("SELECT * FROM my_table");
while ($row = mysql_fetch_array($sql)) {
    // do something
}
But as the table grows this will put an enormous stress on the database. You should go with the multiple query and enforce some kind of limit on the second query.
// try to get id == 3
SELECT * FROM my_table WHERE id = '3'
// if id == 3 returns 0 results
SELECT * FROM my_table LIMIT 5
Hope it helps!
You'd have to be using dynamic SQL, like
"SELECT * FROM my_table WHERE id " &
Then set qualifier to "= '3'" or to "1".