Use of 1=2 in a SQL query - sql

Someone please explain the meaning of '1=2' in the below SQL query.
SELECT E.EmpID,
E.EmpName,
Country = CASE
WHEN T.Active = 'N'
AND 1 = 2 THEN 'Not Working Anymore'
ELSE C.Country_Name
END,
T.Contract_No
FROM Employees E (nolock)
INNER JOIN Contract T
ON T.Contract_No = E.Contract_No
LEFT JOIN Country C (nolock)
ON E.Country_ID = C.Country_ID
thanks
EDIT:- Corrected the slight mistake existed in the example SQL query given by me.
# ALL :- The query mentioned here is an example version of a big working query on which I have to reoslve something. I have created a sample scenario of SQL query for the sake of simplicity of question.

There is a good use for this 1=2 part of the WHERE clause if you are creating a table from another, but you don't want to copy any rows. For example:
CREATE TABLE ABC_TEMP AS
SELECT * FROM ABC WHERE 1=2;

when T.Active = 'N' and 1=2 then 'Not Working Anymore'
Simple, the above condition will never become true.
So the result will always be C.Country_Name

It is a common trick used in dynamic construction of SQL filter clauses. This allows the automated construction of "T.Active = 'N' and" with no check needed for a following clause, because "1=2" will always be appended.
Update:
Whether 1=1 or 1=2 is used depends on whether conjunctive or disjunctive normal form is supposed to be used in building the automated clauses. In this case, there seems to have been a mismatch of design and implementation.
Update 2
I believe most developers prefer conjunctive normal form, with major terms joind by AND, but disjunctive normal form is equal in expressive power and size of code.

It corresponds to a FALSE argument.
For example ;
select * from TABLE where 1=2
returns zero rows.

Use WHERE 1=2 if you don't want to retrieve any rows,
As 1=2 is always false.

adding and 1=2 will cause that case to always return false. To find out why it's there, ask the person who put it there.
I suspect it was put there so the author could force the first condition to be false and then he forgot to remove it.

I would guess that is a debug script. It is there to always return the negative part of the case. Probably on release that part is taken out.

1 = 2 means that we are giving a condition that will always be false; therefore no records will show ('NULL') for your rows...
ie
Create table empt_tgt
AS
Select empno, ename, job, mgr, sal
WHERE 1=2;
then assuming that empt_tgt has records for all those columns
when we perform the following statement:
SELECT * FROM empt_tgt
EMPT_TGT will be null ; meaning we will only see the column name empno, ename, job, mgr,sal no data...

I have found this in several bits of code at my company. In our case it generally gets left in as DEBUG code by mistake. Developers could use it as a place holder which looks like the case in your example.

People use 1=2 to check if their code is syntactically correct without the code performing anything. For example if you have a complicated UPDATE statement and you want to check if the code is correct without updating anything.

Related

Difference between SQL statements

I have come across two versions of an SQLRPGLE program and saw a change in the code as below:
Before:
Exec Sql SELECT 'N'
INTO :APRFLG
FROM LG751F T1
INNER JOIN LG752F T2
ON T1.ISBOLN = T2.IDBOLN AND
T1.ISITNO = T2.IDMDNO
WHERE T2.IDVIN = :M_VIN AND
T1.ISAPRV <> 'Y';
After:
Exec Sql SELECT case
when T1.ISAPRV <> 'Y' then 'N'
else T1.ISAPRV
end as APRFLG
INTO :APRFLG
FROM LG751F T1
join LG752F T2
ON T1.ISBOLN = T2.IDBOLN AND
T1.ISITNO = T2.IDMDNO
WHERE T2.IDVIN = :M_VIN AND
T1.ISAPRV <> 'Y'
group by T1.ISAPRV;
Could you please tell me if you see any difference in how the codes would work differently? The second SQL has a group by which is supposed to be a fix to avoid -811 SQLCod error. Apart from this, do you guys spot any difference?
They are both examples of poor coding IMO.
The requirement to "remove duplicates" is often an indication of a bad statement design and/or a bad DB design.
You appear to be doing an existence check, in which case you should be making use of the EXISTS predicate.
select 'N' into :APRFLG
from sysibm.sysdummy1
where exists (select 1
FROM LG751F T1
INNER JOIN LG752F T2
ON T1.ISBOLN = T2.IDBOLN
AND T1.ISITNO = T2.IDMDNO
WHERE
T2.IDVIN = :M_VIN
AND T1.ISAPRV <> 'Y');
As far as the original two statements, besides the group by, the only real difference is moving columns from the JOIN clause to the WHERE clause. However, the query engine in Db2 for i will rewrite both statements equivalently and come up with the same plan; since an inner join is used.
EDIT : as Mark points out, there JOIN and WHERE are the same in both the OP's statements. But I'll leave the statement above in as an FYI.
I don't find a compelling difference, other that the addition of the group by, that will have the effect of suppressing any duplicate rows that might have been output.
It looks like the developer intended for the query to be able to vary its output to be sometimes Y and sometimes N, but forgot to remove the WHERE clause that necessarily forces the case to always be true, and hence it to always output N. This kind of pattern is usually seen when the original report includes some spec like "don't include managers in the employee Sakarya report" and that then changes to "actually we want to know if the employee is a manager or not". What was a "where employee not equal manager" becomes a "case when employee is manager then... else.." but the where clause needs removing for it to have any effect
The inner keyword has disappeared from the join statement, but overall this should also be a non-op
Another option is just to use fetch first row only like this:
Exec Sql
SELECT 'N'
INTO :APRFLG
FROM LG751F T1
JOIN LG752F T2
ON T1.ISBOLN = T2.IDBOLN AND
T1.ISITNO = T2.IDMDNO
WHERE T2.IDVIN = :M_VIN AND
T1.ISAPRV <> 'Y'
fetch first row only;
That makes it more obvious that you only want a single row rather than trying to use grouping which necessitates the funky do nothing CASE statement. But I do like the EXISTS method Charles provided since that is the real goal, and having exists in there makes it crystal clear.
If your lead insists on GROUP BY, you can also GROUP BY 'N' and still leave out the CASE statement.

Do subselects do an implicit join?

I have a sql query that seems to work but I dont really understand why. Therefore I would very much appreciate if someone could help explain whats going on:
THE QUERY RETURNS: All organisations that dont have any comments that were not created by the consultant who created the organisation record.
SELECT \"organisations\".*
FROM \"organisations\"
WHERE \"organisations\".\"id\" NOT IN
(SELECT \"comments\".\"commentable_id\"
FROM \"comments\"
WHERE \"comments\".\"commentable_type\" = 'Organisation'
AND (comments.author_id != organisations.consultant_id)
ORDER BY \"comments\".\"created_at\" ASC
)
It seems to do so correctly.
The part I dont understand is why (comments.author_id != organisations.consultant_id) is working!? I dont understand how postgres even knows what "organisations" is inside that subselect? It is not defined in here.
If this was written as a join where I had joined comments to organisations then I would totally understand how you could do something like this but in this case its a subselect. How does it know how to map the comments and organisations table and exclude the ones where (comments.author_id != organisations.consultant_id)
That subselect happens in a row so it can see all columns of that row. You will probably get better performance with this
select organisations.*
from organisations
where not exists (
select 1
from comments
where
commentable_type = 'organisation' and
author_id != organisations.consultant_id
)
Notice that it is not necessary to qualify commentable_type since the one in comments has priority over any other outside the subselect. And if comments does not have a consultant_id column then it would be possible to take its qualifier out, although not recommended for better legibility.
The order by in your query buys you nothing, just added cost.
You are running a correlated subquery. http://technet.microsoft.com/en-us/library/ms187638(v=sql.105).aspx
This is commonly used in all databases. A subquery in the WHERE clause can refer to tables used in the parent query, and they often do.
That being said, your current query could likely be written better.
Here is one way, using an outer join with comments, where no matches are found based on your criteria -
select o.*
from organizations o
left join comments c
on c.commentable_type <> 'Organisation'
and c.author_id = o.consultant_id
where c.commentable_id is null

Questions about subquery in WHERE EXISTS(...)

I'm confused about this example in this tutorial page. http://www.postgresqltutorial.com/postgresql-subquery/
SELECT
first_name,
last_name
FROM
customer
WHERE
EXISTS (
SELECT
1
FROM
payment
WHERE
payment.customer_id = payment.customer_id
);
Could you please tell me the point of the subquery?
I understand that EXISTS converts the result set to a boolean "true" if there's at least one result returned from the subquery. But in the WHERE clause of the sub query, it would seem like it would always be "true", so a row will always be selected, so the EXISTS will always be "true".
Was that example meant to do this in the subquery?
WHERE
customer.customer_id = payment.customer_id
Also, I assume that a subquery that is part of the WHERE will run once for every "customer". Is that right?
Thanks.
You are absolutely right. That seems to be a typo in the PostgreSQL documentation... and quite a confusing one, by the way.
Regarding the last question, thinking of it running for each customer is a good approach too.

SQL: ... WHERE X IN (SELECT Y FROM ...)

Is the following the most efficient in SQL to achieve its result:
SELECT *
FROM Customers
WHERE Customer_ID NOT IN (SELECT Cust_ID FROM SUBSCRIBERS)
Could some use of joins be better and achieve the same result?
Any mature enough SQL database should be able to execute that just as effectively as the equivalent JOIN. Use whatever is more readable to you.
One reason why you might prefer to use a JOIN rather than NOT IN is that if the Values in the NOT IN clause contain any NULLs you will always get back no results. If you do use NOT IN remember to always consider whether the sub query might bring back a NULL value!
RE: Question in Comments
'x' NOT IN (NULL,'a','b')
≡ 'x' <> NULL and 'x' <> 'a' and 'x' <>
'b'
≡ Unknown and True and True
≡ Unknown
Maybe try this
Select cust.*
From dbo.Customers cust
Left Join dbo.Subscribers subs on cust.Customer_ID = subs.Customer_ID
Where subs.Customer_Id Is Null
SELECT Customers.*
FROM Customers
WHERE NOT EXISTS (
SELECT *
FROM SUBSCRIBERS AS s
JOIN s.Cust_ID = Customers.Customer_ID)
When using “NOT IN”, the query performs nested full table scans, whereas for “NOT EXISTS”, the query can use an index within the sub-query.
If you want to know which is more effective, you should try looking at the estimated query plans, or the actual query plans after execution. It'll tell you the costs of the queries (I find CPU and IO cost to be interesting). I wouldn't be surprised much if there's little to no difference, but you never know. I've seen certain queries use multiple cores on our database server, while a rewritten version of that same query would only use one core (needless to say, the query that used all 4 cores was a good 3 times faster). Never really quite put my finger on why that is, but if you're working with large result sets, such differences can occur without your knowing about it.

SQL - table alias scope

I've just learned ( yesterday ) to use "exists" instead of "in".
BAD
select * from table where nameid in (
select nameid from othertable where otherdesc = 'SomeDesc' )
GOOD
select * from table t where exists (
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
And I have some questions about this:
1) The explanation as I understood was: "The reason why this is better is because only the matching values will be returned instead of building a massive list of possible results". Does that mean that while the first subquery might return 900 results the second will return only 1 ( yes or no )?
2) In the past I have had the RDBMS complainin: "only the first 1000 rows might be retrieved", this second approach would solve that problem?
3) What is the scope of the alias in the second subquery?... does the alias only lives in the parenthesis?
for example
select * from table t where exists (
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
AND
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeOtherDesc' )
That is, if I use the same alias ( o for table othertable ) In the second "exist" will it present any problem with the first exists? or are they totally independent?
Is this something Oracle only related or it is valid for most RDBMS?
Thanks a lot
It's specific to each DBMS and depends on the query optimizer. Some optimizers detect IN clause and translate it.
In all DBMSes I tested, alias is only valid inside the ( )
BTW, you can rewrite the query as:
select t.*
from table t
join othertable o on t.nameid = o.nameid
and o.otherdesc in ('SomeDesc','SomeOtherDesc');
And, to answer your questions:
Yes
Yes
Yes
You are treading into complicated territory, known as 'correlated sub-queries'. Since we don't have detailed information about your tables and the key structures, some of the answers can only be 'maybe'.
In your initial IN query, the notation would be valid whether or not OtherTable contains a column NameID (and, indeed, whether OtherDesc exists as a column in Table or OtherTable - which is not clear in any of your examples, but presumably is a column of OtherTable). This behaviour is what makes a correlated sub-query into a correlated sub-query. It is also a routine source of angst for people when they first run into it - invariably by accident. Since the SQL standard mandates the behaviour of interpreting a name in the sub-query as referring to a column in the outer query if there is no column with the relevant name in the tables mentioned in the sub-query but there is a column with the relevant name in the tables mentioned in the outer (main) query, no product that wants to claim conformance to (this bit of) the SQL standard will do anything different.
The answer to your Q1 is "it depends", but given plausible assumptions (NameID exists as a column in both tables; OtherDesc only exists in OtherTable), the results should be the same in terms of the data set returned, but may not be equivalent in terms of performance.
The answer to your Q2 is that in the past, you were using an inferior if not defective DBMS. If it supported EXISTS, then the DBMS might still complain about the cardinality of the result.
The answer to your Q3 as applied to the first EXISTS query is "t is available as an alias throughout the statement, but o is only available as an alias inside the parentheses". As applied to your second example box - with AND connecting two sub-selects (the second of which is missing the open parenthesis when I'm looking at it), then "t is available as an alias throughout the statement and refers to the same table, but there are two different aliases both labelled 'o', one for each sub-query". Note that the query might return no data if OtherDesc is unique for a given NameID value in OtherTable; otherwise, it requires two rows in OtherTable with the same NameID and the two OtherDesc values for each row in Table with that NameID value.
Oracle-specific: When you write a query using the IN clause, you're telling the rule-based optimizer that you want the inner query to drive the outer query. When you write EXISTS in a where clause, you're telling the optimizer that you want the outer query to be run first, using each value to fetch a value from the inner query. See "Difference between IN and EXISTS in subqueries".
Probably.
Alias declared inside subquery lives inside subquery. By the way, I don't think your example with 2 ANDed subqueries is valid SQL. Did you mean UNION instead of AND?
Personally I would use a join, rather than a subquery for this.
SELECT t.*
FROM yourTable t
INNER JOIN otherTable ot
ON (t.nameid = ot.nameid AND ot.otherdesc = 'SomeDesc')
It is difficult to generalize that EXISTS is always better than IN. Logically if that is the case, then SQL community would have replaced IN with EXISTS...
Also, please note that IN and EXISTS are not same, the results may be different when you use the two...
With IN, usually its a Full Table Scan of the inner table once without removing NULLs (so if you have NULLs in your inner table, IN will not remove NULLS by default)... While EXISTS removes NULL and in case of correlated subquery, it runs inner query for every row from outer query.
Assuming there are no NULLS and its a simple query (with no correlation), EXIST might perform better if the row you are finding is not the last row. If it happens to be the last row, EXISTS may need to scan till the end like IN.. so similar performance...
But IN and EXISTS are not interchangeable...