MySQL, NOT EXISTS, SELECT - sql

I am trying to select rows from one table only when there are no matching rows in another table.
My original query was:
SELECT * FROM `jos_datsogallery` as a WHERE a.published = 1
and a.approved=1 NOT EXISTS (SELECT * FROM `jos_datsogallery_votes`
As v WHERE v.vip=62 AND v.vpic=a.id) ORDER BY a.imgdate DESC
but it keeps failing.
I did some testing and shortened my query to:
SELECT * FROM `jos_datsogallery` WHERE EXISTS (SELECT 1)
Which is supposed to select everything from jos_datsogallery as 'EXISTS (SELECT 1)' is always true.
I tried phpMyAdmin:
1064 - You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near 'EXISTS (SELECT 1) LIMIT 0, 30' at line 1
What's wrong?
MySQL version: 4.0.27
MySQL doc: http://dev.mysql.com/doc/refman/4.1/en/exists-and-not-exists-subqueries.html

EXISTS is only supported in MySQL 4.1 and above. The documentation you linked is a combined manual for both 4.0 and 4.1, so it can be misleading about which versions actually support the keyword.
Since you updated your question to say that you're using 4.0.x, that is why it's not working for you.

Here is another way you can achieve the same, without using NOT EXISTS.
SELECT * FROM `jos_datsogallery` AS a
LEFT JOIN `jos_datsogallery_votes` AS v ON v.vip=62 AND v.vpic=a.id
WHERE a.published = 1 AND
a.approved=1 AND
v.vip IS NULL
ORDER BY a.imgdate DESC
Using a left join means the right-hand of the join (the jos_datsogallery_votes part) is allowed to not find any rows while still returning a result. When the right hand side of the join is not found, its columns will all have a value of NULL, which you can check on in the WHERE part of the query.
HTH
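For anyone who wants to see the LEFT JOIN / IS NULL pattern in action without a MySQL 4.0 server at hand, here is a minimal sketch using Python's built-in sqlite3 module. The `gallery` and `votes` tables are simplified stand-ins I made up, not the real Joomla schema, and SQLite is only assumed to behave like MySQL for this simple case. It runs the join-based query next to the NOT EXISTS form so the two results can be compared.

```python
import sqlite3

# Hypothetical miniature versions of the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gallery (id INTEGER PRIMARY KEY, published INT, approved INT);
    CREATE TABLE votes (vip INT, vpic INT);
    INSERT INTO gallery VALUES (1, 1, 1), (2, 1, 1), (3, 1, 0);
    INSERT INTO votes VALUES (62, 1), (99, 2);
""")

# Anti-join: keep gallery rows for which the LEFT JOIN found no vote by user 62.
anti_join = conn.execute("""
    SELECT a.id FROM gallery AS a
    LEFT JOIN votes AS v ON v.vip = 62 AND v.vpic = a.id
    WHERE a.published = 1 AND a.approved = 1 AND v.vip IS NULL
    ORDER BY a.id
""").fetchall()

# The same filter written with NOT EXISTS (needs subquery support).
not_exists = conn.execute("""
    SELECT a.id FROM gallery AS a
    WHERE a.published = 1 AND a.approved = 1
      AND NOT EXISTS (SELECT 1 FROM votes AS v
                      WHERE v.vip = 62 AND v.vpic = a.id)
    ORDER BY a.id
""").fetchall()

print(anti_join, not_exists)
```

Note that the user filter (`v.vip = 62`) sits in the ON clause, not in WHERE. Moving it to WHERE would discard the NULL-extended rows the pattern relies on.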

A little late, but I was searching for something similar and it might be of use to someone else. Another way to do this, is to use a count in the where clause. So, your query would be:
SELECT * FROM `jos_datsogallery` AS a
WHERE a.published = 1
AND a.approved=1
AND (
SELECT COUNT(*) FROM `jos_datsogallery_votes` AS v
WHERE v.vip=62
AND v.vpic=a.id
) = 0
ORDER BY a.imgdate DESC
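To illustrate the COUNT approach, here is a small sketch with Python's sqlite3 module. The table names and rows are invented for the example, and SQLite is only assumed to mirror MySQL here. It first shows the per-row count the subquery produces, then uses that count in the WHERE clause.

```python
import sqlite3

# Hypothetical sample data: picture 1 has two votes by user 62, the others none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gallery (id INTEGER PRIMARY KEY, published INT, approved INT);
    CREATE TABLE votes (vip INT, vpic INT);
    INSERT INTO gallery VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1);
    INSERT INTO votes VALUES (62, 1), (62, 1), (99, 2);
""")

# What the correlated subquery evaluates to for each gallery row.
counts = conn.execute("""
    SELECT a.id,
           (SELECT COUNT(*) FROM votes AS v
            WHERE v.vip = 62 AND v.vpic = a.id) AS votes_by_62
    FROM gallery AS a
    ORDER BY a.id
""").fetchall()

# Using that count as the filter: keep rows where it is zero.
unvoted = conn.execute("""
    SELECT a.id FROM gallery AS a
    WHERE a.published = 1 AND a.approved = 1
      AND (SELECT COUNT(*) FROM votes AS v
           WHERE v.vip = 62 AND v.vpic = a.id) = 0
    ORDER BY a.id
""").fetchall()

print(counts, unvoted)
```

Unlike EXISTS, COUNT has to count every matching vote before the comparison with 0 can be made, so this form may do more work when a row has many matches.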

Related

Unable to execute nested SQL queries in Spark SQL

I am trying to execute this query but it doesn't work:
SELECT COLUMN
FROM TABLE A
WHERE A.COLUM_1 = '9999-12-31' AND NOT EXISTS (SELECT 1 FROM TABLE2 ET WHERE ET.COl1 = A.COL2 LIMIT 1)
It results in an error which says the following:
"mismatched input FROM expecting"
I went through this post, which states that it is supported by Spark from version 2.0 onward.
I'm not sure that SparkSQL supports TOP. But it is not needed. Does this work?
SELECT t.COLUMN
FROM TABLE t
WHERE t.COLUM_1 = '9999-12-31' AND
NOT EXISTS (SELECT 1 FROM TABLE2 ET WHERE ET.COl1 = t.COL2);
This fixes a few other syntax issues with the query (such as no alias A).
LIMIT in the subquery is also not needed. NOT EXISTS should stop at the first match.

Exists / not exists: 'select 1' vs 'select field'

Which one of the two would perform better? (I was recently accused of not being careful with my code because I used the latter in Oracle.)
Select *
from Tab1
Where (not) exists(Select 1 From Tab2 Where Tab1.id = Tab2.id)
Select *
from Tab1
Where (not) exists(Select Field1 From Tab2 Where Tab1.id = Tab2.id)
Or are they both same?
Please answer both from SQL Server perspective as well as Oracle perspective.
I have googled (mostly from the SQL Server side) and found that there is still a lot of debate over this, although my present opinion/assumption is that the optimiser in both RDBMSs is mature enough to understand that all that is required from the subquery is a Boolean value.
Yes, they are the same. exists checks if there is at least one row in the sub query. If so, it evaluates to true. The columns in the sub query don't matter in any way.
According to MSDN, exists:
Specifies a subquery to test for the existence of rows.
And Oracle:
An EXISTS condition tests for existence of rows in a subquery.
The MySQL documentation is perhaps even more explicit:
Traditionally, an EXISTS subquery starts with SELECT *, but it could begin with SELECT 5 or SELECT column1 or anything at all. MySQL ignores the SELECT list in such a subquery, so it makes no difference.
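A quick way to check this claim yourself is to run the same EXISTS query with different select lists and compare the results. The sketch below does that with Python's sqlite3 module. The tables and rows are made up, and it only demonstrates SQLite's behaviour, which I am assuming matches the engines discussed here on this point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (id INT);
    CREATE TABLE tab2 (id INT, field1 TEXT);
    INSERT INTO tab1 VALUES (1), (2), (3);
    -- field1 is NULL for id 3 on purpose: the row still "exists".
    INSERT INTO tab2 VALUES (1, 'a'), (3, NULL);
""")

def ids_where_exists(select_list):
    """Return tab1 ids that pass EXISTS with the given subquery select list."""
    sql = f"""
        SELECT id FROM tab1
        WHERE EXISTS (SELECT {select_list} FROM tab2 WHERE tab1.id = tab2.id)
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql)]

# Try a constant, a column, the star, and NULL as the select list.
results = {s: ids_where_exists(s) for s in ("1", "field1", "*", "NULL")}
print(results)
```

Every variant returns the same ids, including the one where the selected column is NULL, because EXISTS only tests whether the subquery produced a row.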
I know this is old, but I want to add a few points I observed recently.
Even though EXISTS only checks for existence, when we write "select *" all columns will be expanded. Other than this slight overhead, there are no differences.
Source:
http://www.sqlskills.com/blogs/conor/exists-subqueries-select-1-vs-select/
Update:
The article I referred to seems not to be valid. Even when we write SELECT 1, SQL Server will expand all the columns.
Please refer to the link below for an in-depth analysis and performance statistics for the various approaches:
Subquery using Exists 1 or Exists *
The expression in the subquery's column list matters absolutely nothing, it will not even be executed:
select * from dual t1
where exists (
select 1/0 from dual t2
--^^^ division by 0
where t2.dummy = t2.dummy)
/
DUMMY
--------
X
The only thing to watch out for in my experience between using
"EXISTS(SELECT * ..." and "EXISTS(SELECT 1 ..." is that "*" is not allowed in schema-bound objects -- it will throw:
Syntax '*' is not allowed in schema-bound objects.

SELECT query to return a row from a table with all values set to Null

I need to write a query that returns a single row with every field empty. Gordon Linoff gave me a clue for this here:
SQL Empty query results
which is:
select t.*
from (select 1 as val
) v left outer join
table t
on 1 = 0;
This query works perfectly on PostgreSQL but gets an error when I try to execute it in Microsoft Access: it says that the expression 1 = 0 is not allowed. How can it be fixed to work in Microsoft Access?
Regards,
If the table has a numeric primary key column whose values are non-negative then the following query will work in Access. The primary key field is [ID].
SELECT t2.*
FROM
myTable AS t2
RIGHT JOIN
(
SELECT TOP 1 (ID * -1) AS badID
FROM myTable AS t1
) AS rowStubs
ON t2.ID = rowStubs.badID
This was tested with Access 2010.
I am offering this answer here, even though you didn't think it worked in my edit to your original question. What is the problem?
select t.*
from (select max(col) as maxval from table as t
) as v left join
table as t
on v.maxval < t.col;
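Here is a small sketch of why a join condition that can never be true gives back one all-NULL row. It uses Python's sqlite3 module with a made-up `months` table, so it shows the idea in SQLite only and says nothing about what Access accepts. The derived table always has exactly one row (the maximum), and no row of the table can exceed its own maximum, so the LEFT JOIN pads the right side with NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE months (id INT, description TEXT);
    INSERT INTO months VALUES (1, 'January'), (2, 'February');
""")

# The derived table v has one row; the ON condition never matches,
# so the LEFT JOIN returns that one row with every t column set to NULL.
null_row = conn.execute("""
    SELECT t.*
    FROM (SELECT MAX(id) AS maxval FROM months) AS v
    LEFT JOIN months AS t ON v.maxval < t.id
""").fetchall()

print(null_row)
```

The column names and types of the result still come from `months`, which is the point of the trick: an empty-valued row with the table's shape.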
You can use the following query, but it would still need a little "manual coding".
EDITS:
Actually, you do not need the SWITCH function. Modified query below.
Removed the reference to Description column from one line. Still, you would need to use a Text column name (such as Description) in the last line of the query.
For example, the following query would work for the Months table:
select Months.*
from Months
RIGHT OUTER JOIN
(select "" as DummyColumn from Months) Blank_Data
ON Months.Description = Blank_Data.DummyColumn; --hardcoded Description column

SQL Sybase Query Strange Behaviour

I've got 2 tables with exactly the same structure in the same Sybase database but they're separate tables.
This query works on one of the 2:
select * from table1 where
QUOTA_FIELD >
(SELECT
count(ACCOUNT) FROM
table1 As t1
where SECTOR = t1.SECTOR
AND
STATUS = 'QUOTA'
)
But for the second table I have to change it to this:
select * from table2 as tref where
QUOTA_FIELD >
(SELECT
count(ACCOUNT) FROM
table2 As t2
where tref.SECTOR = t2.SECTOR
AND
STATUS = 'QUOTA'
)
There's a restriction on where this will execute which means it needs to work like in the first query.
Does anyone have any ideas as to why the first might work as expected and the second wouldn't?
Since I am not yet allowed to comment, here as an answer to the question "does anyone...?":
No. I couldn't find anyone :)
This first query cannot work correctly, since it compares a column with itself (as long as the column names are all normal ASCII characters and not some similar looking UNICODE ones). Please give a proof that the result of this query is in every case the same as of query 2.
Also, the second query would normally be written like this: where SECTOR = tref.SECTOR...
You might be looking for something like this in query #1 :
select * from table1 t2 where
QUOTA_FIELD >
(SELECT
count(ACCOUNT) FROM
table1 As t1
where t2.SECTOR = t1.SECTOR
AND
t1.STATUS = 'QUOTA'
)
This explicitly specifies that the table in the subquery is joined with the table in the outer query (a correlated subquery).
If this works, use the same idea in query #2
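To make the scoping point concrete, here is a sketch with Python's sqlite3 module and three invented rows. It shows how SQLite resolves the unqualified column name. I am assuming, not claiming, that Sybase follows the same innermost-scope rule. With the unqualified `SECTOR`, the subquery compares the inner column with itself and counts every QUOTA row. With the outer alias, it counts per sector.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ACCOUNT TEXT, SECTOR TEXT, STATUS TEXT, QUOTA_FIELD INT);
    INSERT INTO table1 VALUES
        ('A1', 'tech',   'QUOTA', 2),
        ('A2', 'tech',   'QUOTA', 2),
        ('A3', 'energy', 'QUOTA', 2);
""")

# Unqualified SECTOR binds to the inner table t1, so the condition is
# t1.SECTOR = t1.SECTOR and the count is 3 for every outer row.
unqualified_rows = conn.execute("""
    SELECT ACCOUNT FROM table1
    WHERE QUOTA_FIELD > (SELECT COUNT(ACCOUNT) FROM table1 AS t1
                         WHERE SECTOR = t1.SECTOR AND STATUS = 'QUOTA')
""").fetchall()

# Qualifying with the outer alias makes the subquery truly correlated:
# tech has 2 QUOTA accounts, energy has 1.
qualified_rows = conn.execute("""
    SELECT tref.ACCOUNT FROM table1 AS tref
    WHERE tref.QUOTA_FIELD > (SELECT COUNT(t1.ACCOUNT) FROM table1 AS t1
                              WHERE tref.SECTOR = t1.SECTOR
                                AND t1.STATUS = 'QUOTA')
""").fetchall()

print(unqualified_rows, qualified_rows)
```

If the first form appears to work on one table, it may simply be that the data happens to give the same answer either way.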

Difference between EXISTS and IN in SQL?

What is the difference between the EXISTS and IN clause in SQL?
When should we use EXISTS, and when should we use IN?
The exists keyword can be used in that way, but really it's intended as a way to avoid counting:
--this statement needs to check the entire table
select count(*) from [table] where ...
--this statement is true as soon as one match is found
exists ( select * from [table] where ... )
This is most useful where you have if conditional statements, as exists can be a lot quicker than count.
The in is best used where you have a static list to pass:
select * from [table]
where [field] in (1, 2, 3)
When you have a table in an in statement it makes more sense to use a join, but mostly it shouldn't matter. The query optimiser should return the same plan either way. In some implementations (mostly older, such as Microsoft SQL Server 2000) in queries will always get a nested join plan, while join queries will use nested, merge or hash as appropriate. More modern implementations are smarter and can adjust the plan even when in is used.
EXISTS will tell you whether a query returned any results. e.g.:
SELECT *
FROM Orders o
WHERE EXISTS (
SELECT *
FROM Products p
WHERE p.ProductNumber = o.ProductNumber)
IN is used to compare one value to several, and can use literal values, like this:
SELECT *
FROM Orders
WHERE ProductNumber IN (1, 10, 100)
You can also use query results with the IN clause, like this:
SELECT *
FROM Orders
WHERE ProductNumber IN (
SELECT ProductNumber
FROM Products
WHERE ProductInventoryQuantity > 0)
With a rule-based optimizer:
EXISTS is much faster than IN when the subquery result is very large.
IN is faster than EXISTS when the subquery result is very small.
With a cost-based optimizer:
There is no difference.
I'm assuming you know what they do, and thus are used differently, so I'm going to understand your question as: When would it be a good idea to rewrite the SQL to use IN instead of EXISTS, or vice versa.
Is that a fair assumption?
Edit: The reason I'm asking is that in many cases you can rewrite an SQL based on IN to use an EXISTS instead, and vice versa, and for some database engines, the query optimizer will treat the two differently.
For instance:
SELECT *
FROM Customers
WHERE EXISTS (
SELECT *
FROM Orders
WHERE Orders.CustomerID = Customers.ID
)
can be rewritten to:
SELECT *
FROM Customers
WHERE ID IN (
SELECT CustomerID
FROM Orders
)
or with a join:
SELECT Customers.*
FROM Customers
INNER JOIN Orders ON Customers.ID = Orders.CustomerID
So my question still stands: is the original poster wondering what IN and EXISTS do, and thus how to use them, or is he asking whether rewriting an SQL statement that uses IN to use EXISTS instead, or vice versa, would be a good idea?
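One caveat with the three rewrites above is worth checking on real data: the EXISTS and IN forms return each customer once, while the plain INNER JOIN returns one row per matching order. The sketch below uses Python's sqlite3 module with made-up customers and orders to show this. The data is purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (ID INT, Name TEXT);
    CREATE TABLE Orders (OrderID INT, CustomerID INT);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Customer 1 has two orders, customer 2 has none.
    INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 3);
""")

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

with_exists = ids("""
    SELECT ID FROM Customers
    WHERE EXISTS (SELECT * FROM Orders WHERE Orders.CustomerID = Customers.ID)
    ORDER BY ID
""")
with_in = ids("""
    SELECT ID FROM Customers
    WHERE ID IN (SELECT CustomerID FROM Orders)
    ORDER BY ID
""")
join_sql = """
    SELECT {distinct} Customers.ID FROM Customers
    INNER JOIN Orders ON Customers.ID = Orders.CustomerID
    ORDER BY Customers.ID
"""
with_join = ids(join_sql.format(distinct=""))
with_join_distinct = ids(join_sql.format(distinct="DISTINCT"))

print(with_exists, with_in, with_join, with_join_distinct)
```

So the join version is only a drop-in replacement when each customer has at most one order, or when DISTINCT (or a GROUP BY) is added.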
EXISTS is much faster than IN when the subquery result is very large.
IN is faster than EXISTS when the subquery result is very small.
CREATE TABLE t1 (id INT, title VARCHAR(20), someIntCol INT)
GO
CREATE TABLE t2 (id INT, t1Id INT, someData VARCHAR(20))
GO
INSERT INTO t1
SELECT 1, 'title 1', 5 UNION ALL
SELECT 2, 'title 2', 5 UNION ALL
SELECT 3, 'title 3', 5 UNION ALL
SELECT 4, 'title 4', 5 UNION ALL
SELECT null, 'title 5', 5 UNION ALL
SELECT null, 'title 6', 5
INSERT INTO t2
SELECT 1, 1, 'data 1' UNION ALL
SELECT 2, 1, 'data 2' UNION ALL
SELECT 3, 2, 'data 3' UNION ALL
SELECT 4, 3, 'data 4' UNION ALL
SELECT 5, 3, 'data 5' UNION ALL
SELECT 6, 3, 'data 6' UNION ALL
SELECT 7, 4, 'data 7' UNION ALL
SELECT 8, null, 'data 8' UNION ALL
SELECT 9, 6, 'data 9' UNION ALL
SELECT 10, 6, 'data 10' UNION ALL
SELECT 11, 8, 'data 11'
Query 1
SELECT t1.*
FROM t1
WHERE not EXISTS (SELECT * FROM t2 WHERE t1.id = t2.t1id)
Query 2
SELECT t1.*
FROM t1
WHERE t1.id not in (SELECT t2.t1id FROM t2 )
If id in t1 contains NULL values, Query 1 will return those rows, but Query 2 will not.
That is because IN cannot compare anything with NULL (the comparison is neither true nor false), so it produces no result for NULL, whereas NOT EXISTS only checks whether the subquery returned a row.
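The two queries can be run against the sample data above with Python's sqlite3 module, as in the sketch below. It is SQLite rather than SQL Server, so the GO batch separators are dropped and the inserts are done with `executemany`. The NULL handling shown is standard SQL three-valued logic, but treat the port itself as my assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INT, title VARCHAR(20), someIntCol INT)")
conn.execute("CREATE TABLE t2 (id INT, t1Id INT, someData VARCHAR(20))")

# Same rows as the INSERT statements above; None becomes NULL.
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, 5)",
    [(1, "title 1"), (2, "title 2"), (3, "title 3"), (4, "title 4"),
     (None, "title 5"), (None, "title 6")],
)
t1_refs = [1, 1, 2, 3, 3, 3, 4, None, 6, 6, 8]
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?, ?)",
    [(i, ref, f"data {i}") for i, ref in enumerate(t1_refs, start=1)],
)

# Query 1: NOT EXISTS keeps the rows whose id is NULL (no match can be found).
query1 = [r[0] for r in conn.execute("""
    SELECT t1.title FROM t1
    WHERE NOT EXISTS (SELECT * FROM t2 WHERE t1.id = t2.t1id)
    ORDER BY t1.title
""")]

# Query 2: NOT IN evaluates to NULL or false for every row here.
query2 = [r[0] for r in conn.execute("""
    SELECT t1.title FROM t1
    WHERE t1.id NOT IN (SELECT t2.t1id FROM t2)
    ORDER BY t1.title
""")]

print(query1, query2)
```

With this data Query 2 returns nothing at all: ids 1 to 4 are found in the subquery, and a NULL id compared against the list gives NULL, which WHERE treats as not true.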
If you are using the IN operator, the SQL engine will scan all records fetched from the inner query. On the other hand if we are using EXISTS, the SQL engine will stop the scanning process as soon as it found a match.
IN supports only equality relations (or inequality when preceded by NOT).
It is a synonym to =any / =some, e.g
select *
from t1
where x in (select x from t2)
;
EXISTS supports various kinds of relations that cannot be expressed using IN, e.g. -
select *
from t1
where exists (select null
from t2
where t2.x=t1.x
and t2.y>t1.y
and t2.z like '%' || t1.z || '%'
)
;
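As a runnable illustration of such a multi-condition EXISTS, here is a sketch with Python's sqlite3 module. The rows are invented so that only one t1 row satisfies the equality, the inequality and the LIKE test at the same time. SQLite's `||` and LIKE are assumed to behave as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (x INT, y INT, z TEXT);
    CREATE TABLE t2 (x INT, y INT, z TEXT);
    INSERT INTO t1 VALUES (1, 10, 'ab'), (2, 10, 'zz');
    -- For x = 1 one t2 row passes all three tests.
    -- For x = 2 each t2 row fails either the y test or the LIKE test.
    INSERT INTO t2 VALUES (1, 20, 'cabd'), (2, 5, 'zzz'), (2, 30, 'q');
""")

matches = conn.execute("""
    SELECT t1.x FROM t1
    WHERE EXISTS (SELECT NULL FROM t2
                  WHERE t2.x = t1.x
                    AND t2.y > t1.y
                    AND t2.z LIKE '%' || t1.z || '%')
    ORDER BY t1.x
""").fetchall()

print(matches)
```

All three conditions must hold for the same t2 row, which is what a single IN on one column cannot express.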
And on a different note -
The alleged performance and technical differences between EXISTS and IN may result from specific vendors' implementations, limitations or bugs, but many times they are nothing but myths created by a lack of understanding of database internals.
The table definitions, the accuracy of the statistics, the database configuration and the optimizer version all have an impact on the execution plan and therefore on the performance metrics.
The EXISTS keyword evaluates to true or false, whereas the IN keyword compares a value against all values in the corresponding subquery column.
Also, SELECT 1 can be used with EXISTS. Example:
SELECT * FROM Temp1 where exists(select 1 from Temp2 where conditions...)
In my experience IN is less efficient, so EXISTS is faster.
I think,
EXISTS is when you need to match the results of query with another subquery.
Query#1 results need to be retrieved where SubQuery results match. Kind of a Join..
E.g. select customers table#1 who have placed orders table#2 too
IN is to retrieve if the value of a specific column lies IN a list (1,2,3,4,5)
E.g. Select customers who lie in the following zipcodes i.e. zip_code values lies in (....) list.
When to use one over the other... when you feel it reads appropriately (Communicates intent better).
As far as I know, when a subquery returns a NULL value the whole statement becomes NULL. In those cases we use the EXISTS keyword. If we want to compare particular values in subqueries, we use the IN keyword.
Which one is faster depends on the number of rows fetched by the inner query:
When your inner query fetches thousands of rows, EXISTS would be the better choice.
When your inner query fetches only a few rows, IN will be faster.
EXISTS evaluates to true or false, but IN compares multiple values. When you don't know whether the record exists or not, you should choose EXISTS.
Difference lies here:
select *
from abcTable
where exists (select null)
Above query will return all the records while below one would return empty.
select *
from abcTable
where abcTable_ID in (select null)
Give it a try and observe the output.
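If you'd rather not set up a database to try it, the two queries can be run in memory with Python's sqlite3 module, as sketched below. The three sample ids (one of them NULL) are made up. The behaviour shown is SQLite's, which I believe matches standard SQL on this point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE abcTable (abcTable_ID INT);
    INSERT INTO abcTable VALUES (1), (2), (NULL);
""")

# The subquery returns one row (containing NULL), so EXISTS is true
# for every outer row.
exists_rows = conn.execute(
    "SELECT abcTable_ID FROM abcTable WHERE EXISTS (SELECT NULL)"
).fetchall()

# "x IN (NULL)" is never true: it evaluates to NULL for every row.
in_rows = conn.execute(
    "SELECT abcTable_ID FROM abcTable WHERE abcTable_ID IN (SELECT NULL)"
).fetchall()

print(len(exists_rows), len(in_rows))
```

EXISTS looks at whether a row came back, not at what the row contains, while IN has to compare values and a comparison with NULL never yields true.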
The reason is that the EXISTS operator works on the "at least one found" principle. It returns true and stops scanning the table as soon as one matching row is found.
On the other hand, when the IN operator is combined with a subquery, MySQL must process the subquery first and then use its result to process the whole query.
The general rule of thumb is that if the subquery contains a large
volume of data, the EXISTS operator provides a better performance.
However, the query that uses the IN operator will perform faster if
the result set returned from the subquery is very small.
In certain circumstances, it is better to use IN rather than EXISTS. In general, if the selective predicate is in the subquery, then use IN. If the selective predicate is in the parent query, then use EXISTS.
https://docs.oracle.com/cd/B19306_01/server.102/b14211/sql_1016.htm#i28403
My understanding is that both should be the same as long as we are not dealing with NULL values.
It is the same reason why a query does not return the row for = NULL but does for IS NULL.
http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/
As far as the boolean vs comparator argument goes, to generate a boolean both values need to be compared, and that is how any if condition works. So I fail to understand how IN and EXISTS behave differently.
If a subquery returns more than one value, you may want the outer query to return rows whose column value matches any value in the subquery's result set. To do this, use the IN keyword.
You can also use a subquery to check whether a set of records exists. For this, use the EXISTS clause with a subquery. The EXISTS keyword always returns true or false.
I believe this has a straightforward answer. Why don't you check it from the people who developed that function in their systems?
If you are a MS SQL developer, here is the answer directly from Microsoft.
IN:
Determines whether a specified value matches any value in a subquery or a list.
EXISTS:
Specifies a subquery to test for the existence of rows.
I found that using EXISTS keyword is often really slow (that is very true in Microsoft Access).
I instead use the join operator in this manner :
should-i-use-the-keyword-exists-in-sql
If you can use where in instead of where exists, then where in is probably faster.
Using where in or where exists
will go through all results of your parent result. The difference here is that where exists will cause a lot of dependent subqueries. If you can avoid dependent subqueries, then where in will be the better choice.
Example
Assume we have 10,000 companies, each has 10 users (thus our users table has 100,000 entries). Now assume you want to find a user by his name or his company name.
The following query using where exists has an execution time of 141 ms:
select * from `users`
where `first_name` ='gates'
or exists
(
select * from `companies`
where `users`.`company_id` = `companies`.`id`
and `name` = 'gates'
)
This happens because a dependent subquery is executed for each user.
However, if we avoid the exists query and write it using:
select * from `users`
where `first_name` ='gates'
or users.company_id in
(
select id from `companies`
where `name` = 'gates'
)
Then dependent subqueries are avoided and the query runs in 0,012 ms.
I did a little exercise on a query that I have recently been using. I originally created it with INNER JOINs, but I wanted to see how it looked and worked with EXISTS, so I converted it. I will include both versions here for comparison.
SELECT DISTINCT Category, Name, Description
FROM [CodeSets]
WHERE Category NOT IN (
SELECT def.Category
FROM [Fields] f
INNER JOIN [DataEntryFields] def ON f.DataEntryFieldId = def.Id
INNER JOIN Section s ON f.SectionId = s.Id
INNER JOIN Template t ON s.Template_Id = t.Id
WHERE t.AgencyId = (SELECT Id FROM Agencies WHERE Name = 'Some Agency')
AND def.Category NOT IN ('OFFLIST', 'AGENCYLIST', 'RELTO_UNIT', 'HOSPITALS', 'EMS', 'TOWCOMPANY', 'UIC', 'RPTAGENCY', 'REP')
AND (t.Name like '% OH %')
AND (def.Category IS NOT NULL AND def.Category <> '')
)
ORDER BY 1
Here are the statistics:
Here is the converted version:
SELECT DISTINCT cs.Category, Name, Description
FROM [CodeSets] cs
WHERE NOT Exists (
SELECT * FROM [Fields] f
WHERE EXISTS (SELECT * FROM [DataEntryFields] def
WHERE def.Id = f.DataEntryFieldId
AND def.Category NOT IN ('OFFLIST', 'AGENCYLIST', 'RELTO_UNIT', 'HOSPITALS', 'EMS', 'TOWCOMPANY', 'UIC', 'RPTAGENCY', 'REP')
AND (def.Category IS NOT NULL AND def.Category <> '')
AND def.Category = cs.Category
AND EXISTS (SELECT * FROM Section s
WHERE f.SectionId = s.Id
AND EXISTS (SELECT * FROM Template t
WHERE s.Template_Id = t.Id
AND EXISTS (SELECT * FROM Agencies
WHERE Name = 'Some Agency' and t.AgencyId = Id)
AND (t.Name like '% OH %')
)
)
)
)
ORDER BY 1
The results, at least to me, were unimpressive.
If I were more technically knowledgeable about how SQL works, I could give you an answer, but take this example as you may and make your own conclusion.
The INNER JOIN and IN () is easier to read, however.
EXISTS is generally faster than IN.
If most of the filter criteria are in the subquery, it is better to use IN; if most of the filter criteria are in the main query, it is better to use EXISTS.
If you are using the IN operator, the SQL engine will scan all records fetched from the inner query. On the other hand if we are using EXISTS, the SQL engine will stop the scanning process as soon as it found a match.