Questions about subquery in WHERE EXISTS(...) - sql

I'm confused about this example in this tutorial page. http://www.postgresqltutorial.com/postgresql-subquery/
SELECT
    first_name,
    last_name
FROM
    customer
WHERE
    EXISTS (
        SELECT
            1
        FROM
            payment
        WHERE
            payment.customer_id = payment.customer_id
    );
Could you please tell me the point of the subquery?
I understand that EXISTS evaluates to true if the subquery returns at least one row. But the WHERE clause of the subquery compares payment.customer_id with itself, so it seems it would always be true: a row will always be selected, so the EXISTS will always be true.
Was that example meant to do this in the subquery?
WHERE
customer.customer_id = payment.customer_id
Also, I assume that a subquery that is part of the WHERE will run once for every "customer". Is that right?
Thanks.

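You are absolutely right. That looks like a typo in the tutorial, and quite a confusing one. As written, the subquery never refers to customer, so EXISTS is true for every customer as long as payment contains at least one row with a non-null customer_id. The condition you suggest, customer.customer_id = payment.customer_id, is what was meant.
Regarding the last question: yes, conceptually the subquery is evaluated once for every customer row. That is a good mental model, although the planner is free to execute it differently (for example as a semi-join) as long as the result is the same.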
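To see the difference concretely, here is a small sketch using Python's sqlite3 module. The rows are made up for illustration, and SQLite stands in for PostgreSQL; EXISTS should behave the same way for this case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE payment  (payment_id INTEGER PRIMARY KEY, customer_id INTEGER);
    -- Made-up sample data: Bob (2) has no payments
    INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Fox');
    INSERT INTO payment  VALUES (10, 1), (11, 1), (12, 3);
""")

# Tutorial version: the subquery never mentions customer, so it is not correlated
typo_rows = conn.execute("""
    SELECT first_name FROM customer
    WHERE EXISTS (SELECT 1 FROM payment
                  WHERE payment.customer_id = payment.customer_id)
    ORDER BY customer_id
""").fetchall()

# Corrected version: the subquery is correlated with the outer customer row
fixed_rows = conn.execute("""
    SELECT first_name FROM customer
    WHERE EXISTS (SELECT 1 FROM payment
                  WHERE payment.customer_id = customer.customer_id)
    ORDER BY customer_id
""").fetchall()

print(typo_rows)   # every customer, including Bob
print(fixed_rows)  # only customers with at least one payment
```

With the typo, Bob is returned even though he has no payments; with the corrected condition he is filtered out.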

Related

Working of subquery in SQL Oracle

I was trying to understand how nested queries (subqueries) work in Oracle SQL. Let's take an example where I have 2 tables: one where I hold all student information and one where I hold all the grades each student has received. I'm trying to find all students that received at least one 'A' grade from the grades table. I can do a simple join and get the output for this. The problem is that if a student has received an 'A' grade twice, his ID shows up twice. I know I can use the DISTINCT keyword to solve my problem, but I wanted to do this using nested queries, so this is what I typed:
select id from students where id in (select id from grades);
Now this query returns an output with no duplicates. I'm trying to get my head around this and how this nested query works in detail. What does the "where in" part also do? Really confused.
While it's not universally true that a distinct is a bad thing, it is often misused -- and I think your example is a good one where there is a better way.
In this case, I think your best bet is a semi-join. Here is a rough example:
select s.*
from students s
where exists (
    select null
    from grades g
    where
        s.student_id = g.student_id and
        g.grade = 'A'
)
Oracle does a pretty nice job of turning a subquery into a semi-join in the background, when it makes sense, but other DBMSs definitely benefit from writing the construct explicitly.
Actually the way it works is a join and a distinct - but Oracle is smart, it does it efficiently. It does what you would do: it takes the first student_id from the first table, and it tries to match it against rows in the second table. But, since you don't need the whole join, it will stop as soon as it finds a match - then it moves on to the second row in the first table.
I assume you meant the subquery to be select id from grades where grade = 'A', right?
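A quick way to convince yourself is to run the three variants side by side. This is only a sketch with made-up data, run on SQLite through Python's sqlite3 module rather than on Oracle, and it assumes (as in the question) that grades.id holds the student's id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grades   (id INTEGER, grade TEXT);  -- id = the student's id (assumed)
    INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Ann has two 'A' grades, Bob has none, Cy has one
    INSERT INTO grades VALUES (1, 'A'), (1, 'A'), (2, 'B'), (3, 'A');
""")

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# Plain join: one output row per matching grade, so Ann appears twice
join_ids = ids("""
    SELECT s.id FROM students s
    JOIN grades g ON g.id = s.id
    WHERE g.grade = 'A' ORDER BY s.id
""")

# IN: each student is tested once against the set of ids in the subquery
in_ids = ids("""
    SELECT id FROM students
    WHERE id IN (SELECT id FROM grades WHERE grade = 'A') ORDER BY id
""")

# EXISTS: each student is kept if at least one matching grade row exists
exists_ids = ids("""
    SELECT s.id FROM students s
    WHERE EXISTS (SELECT 1 FROM grades g WHERE g.id = s.id AND g.grade = 'A')
    ORDER BY s.id
""")

print(join_ids, in_ids, exists_ids)
```

The IN and EXISTS forms only filter rows of students; they never multiply them, which is why no DISTINCT is needed.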

Do subselects do an implicit join?

I have a SQL query that seems to work, but I don't really understand why. I would very much appreciate it if someone could help explain what's going on:
THE QUERY RETURNS: All organisations that don't have any comments that were not created by the consultant who created the organisation record.
SELECT "organisations".*
FROM "organisations"
WHERE "organisations"."id" NOT IN
    (SELECT "comments"."commentable_id"
     FROM "comments"
     WHERE "comments"."commentable_type" = 'Organisation'
       AND (comments.author_id != organisations.consultant_id)
     ORDER BY "comments"."created_at" ASC
    )
It seems to do so correctly.
The part I don't understand is why (comments.author_id != organisations.consultant_id) is working!? I don't understand how postgres even knows what "organisations" is inside that subselect. It is not defined in there.
If this was written as a join where I had joined comments to organisations, then I would totally understand how you could do something like this, but in this case it's a subselect. How does it know how to map the comments and organisations tables and exclude the ones where (comments.author_id != organisations.consultant_id)?
The subselect is evaluated for each row of the outer query, so it can see all the columns of that row. You will probably get better performance with this:
select organisations.*
from organisations
where not exists (
    select 1
    from comments
    where
        commentable_id = organisations.id and
        commentable_type = 'Organisation' and
        author_id != organisations.consultant_id
)
Notice that it is not necessary to qualify commentable_type, since a column of comments takes priority over any same-named column outside the subselect. And if comments does not have a consultant_id column, it would be possible to drop the qualifier on organisations.consultant_id too, although keeping it is better for legibility.
The order by in your query buys you nothing, just added cost.
You are running a correlated subquery. http://technet.microsoft.com/en-us/library/ms187638(v=sql.105).aspx
This is commonly used in all databases. A subquery in the WHERE clause can refer to tables used in the parent query, and they often do.
That being said, your current query could likely be written better.
Here is one way, using an outer join to comments and keeping only the organisations for which no matching comment is found:
select o.*
from organisations o
left join comments c
    on c.commentable_id = o.id
    and c.commentable_type = 'Organisation'
    and c.author_id <> o.consultant_id
where c.commentable_id is null
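To check that the three formulations agree, here is a small sketch on SQLite via Python's sqlite3 module, with made-up rows. Note that the NOT EXISTS and LEFT JOIN forms used here match comments to the organisation on commentable_id = id; without that condition they would not mean the same thing as the NOT IN query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, consultant_id INTEGER);
    CREATE TABLE comments (commentable_id INTEGER, commentable_type TEXT, author_id INTEGER);
    INSERT INTO organisations VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO comments VALUES
        (1, 'Organisation', 100),  -- org 1, written by its own consultant
        (2, 'Organisation', 100),  -- org 2, written by someone else
        (3, 'Person',       999);  -- not an organisation comment at all
""")

def ids(sql):
    """Run a query and return the organisation ids in order."""
    return [row[0] for row in conn.execute(sql)]

# Correlated NOT IN: the subquery refers to the outer row's consultant_id
not_in = ids("""
    SELECT o.id FROM organisations o
    WHERE o.id NOT IN (SELECT c.commentable_id FROM comments c
                       WHERE c.commentable_type = 'Organisation'
                         AND c.author_id != o.consultant_id)
    ORDER BY o.id
""")

# NOT EXISTS: the match on the organisation id moves inside the subquery
not_exists = ids("""
    SELECT o.id FROM organisations o
    WHERE NOT EXISTS (SELECT 1 FROM comments c
                      WHERE c.commentable_id = o.id
                        AND c.commentable_type = 'Organisation'
                        AND c.author_id != o.consultant_id)
    ORDER BY o.id
""")

# Anti-join: keep organisations for which the outer join found no comment
anti_join = ids("""
    SELECT o.id FROM organisations o
    LEFT JOIN comments c
           ON c.commentable_id = o.id
          AND c.commentable_type = 'Organisation'
          AND c.author_id <> o.consultant_id
    WHERE c.commentable_id IS NULL
    ORDER BY o.id
""")

print(not_in, not_exists, anti_join)
```

One caveat worth keeping in mind: if commentable_id can be NULL, NOT IN may return no rows at all, while NOT EXISTS is unaffected. The sample data here contains no NULLs.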

Use of 1=2 in a SQL query

Someone please explain the meaning of '1=2' in the below SQL query.
SELECT E.EmpID,
       E.EmpName,
       Country = CASE
                     WHEN T.Active = 'N'
                          AND 1 = 2 THEN 'Not Working Anymore'
                     ELSE C.Country_Name
                 END,
       T.Contract_No
FROM Employees E (nolock)
     INNER JOIN Contract T
         ON T.Contract_No = E.Contract_No
     LEFT JOIN Country C (nolock)
         ON E.Country_ID = C.Country_ID
thanks
EDIT: Corrected a slight mistake in the example SQL query I gave.
@ALL: The query shown here is a simplified version of a large working query in which I have to resolve something. I created a sample scenario for the sake of simplicity.
There is a good use for this 1=2 part of the WHERE clause if you are creating a table from another, but you don't want to copy any rows. For example:
CREATE TABLE ABC_TEMP AS
SELECT * FROM ABC WHERE 1=2;
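Here is a small sketch of that behaviour, run on SQLite through Python's sqlite3 module. The columns and rows of ABC are made up, and other databases may differ in details of CREATE TABLE ... AS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ABC (id INTEGER, name TEXT);  -- made-up structure
    INSERT INTO ABC VALUES (1, 'x'), (2, 'y');
    -- 1=2 is never true, so the column layout is copied but no rows are
    CREATE TABLE ABC_TEMP AS SELECT * FROM ABC WHERE 1=2;
""")

# PRAGMA table_info returns one row per column; index 1 is the column name
columns = [row[1] for row in conn.execute("PRAGMA table_info(ABC_TEMP)")]
row_count = conn.execute("SELECT COUNT(*) FROM ABC_TEMP").fetchone()[0]

print(columns)    # the same column names as ABC
print(row_count)  # no rows were copied
```

In SQLite at least, only the column layout is carried over this way; constraints and indexes on the source table are not.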
when T.Active = 'N' and 1=2 then 'Not Working Anymore'
Simple, the above condition will never become true.
So the result will always be C.Country_Name
It is a common trick used in dynamic construction of SQL filter clauses. This allows the automated construction of "T.Active = 'N' and" with no check needed for a following clause, because "1=2" will always be appended.
Update:
Whether 1=1 or 1=2 is used depends on whether conjunctive or disjunctive normal form is supposed to be used in building the automated clauses. In this case, there seems to have been a mismatch of design and implementation.
Update 2
I believe most developers prefer conjunctive normal form, with major terms joined by AND, but disjunctive normal form is equal in expressive power and size of code.
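As a rough illustration of that trick, here is a sketch in Python. The function build_where and the condition strings are hypothetical, not taken from the code in the question: each clause is appended to a neutral seed, so the builder never has to check whether it is adding the first condition.

```python
def build_where(conditions, joiner="AND"):
    """Build a WHERE clause by appending each condition to a neutral seed.

    1=1 is neutral for AND (it never filters anything out);
    1=2 is neutral for OR (it never lets anything in).
    """
    seed = "1=1" if joiner == "AND" else "1=2"
    clause = "WHERE " + seed
    for cond in conditions:
        clause += f" {joiner} {cond}"
    return clause

# Hypothetical filter conditions, for illustration only
filters = ["T.Active = 'N'", "C.Country_ID = 5"]

and_clause = build_where(filters)               # AND chain seeded with 1=1
or_clause = build_where(filters, joiner="OR")   # OR chain seeded with 1=2
empty_or = build_where([], joiner="OR")         # no conditions: matches nothing

print(and_clause)
print(or_clause)
print(empty_or)
```

Mixing the two, that is, AND-ing a condition with 1=2 as in the question, makes the whole condition false, which is the mismatch mentioned above. In real code the values should be passed as bound parameters rather than concatenated into the string.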
It corresponds to a FALSE argument.
For example ;
select * from TABLE where 1=2
returns zero rows.
Use WHERE 1=2 if you don't want to retrieve any rows,
As 1=2 is always false.
adding and 1=2 will cause that case to always return false. To find out why it's there, ask the person who put it there.
I suspect it was put there so the author could force the first condition to be false and then he forgot to remove it.
I would guess that is a debug script. It is there to always return the negative part of the case. Probably on release that part is taken out.
1 = 2 means that we are giving a condition that will always be false; therefore the query returns no records.
For example:
Create table empt_tgt
AS
Select empno, ename, job, mgr, sal
FROM emp -- source table name assumed; the original snippet omitted the FROM clause
WHERE 1=2;
Then, even if the source table has records for all those columns,
when we perform the following statement:
SELECT * FROM empt_tgt
empt_tgt will be empty, meaning we will only see the column names empno, ename, job, mgr, sal and no data.
I have found this in several bits of code at my company. In our case it generally gets left in as DEBUG code by mistake. Developers could use it as a place holder which looks like the case in your example.
People use 1=2 to check if their code is syntactically correct without the code performing anything. For example if you have a complicated UPDATE statement and you want to check if the code is correct without updating anything.

Only one expression can be specified in the select list when the subquery is not introduced with EXISTS

Here is my query. I got that error. Please help me. Thanks.
ASC
ALTER PROCEDURE [dbo].[sp_CostAllocation_Test]
    @CompanyCode VARCHAR(3),
    @EmpCode VARCHAR(600),
    @PayCode VARCHAR(600)
AS
SELECT
    CTPY33PAYRP.CTPAPECOD As EmployeeCode,
    CTPY33PAYRP.CTPAPPCOD As paycode,
    (select PY11RPTFPD.rpcol as columntotal from PY11RPTFPD where rppcod = CTPAPPCOD),
    (SELECT COCODE, CTPAPECOD, CTPAPPCOD
     FROM CTPY33PAYRP
     WHERE CTPY33PAYRP.COCODE = @CompanyCode
       AND CTPY33PAYRP.CTPAPECOD = @EmpCode
       AND CTPY33PAYRP.COCODE = @CompanyCode
       AND CTPY33PAYRP.CTPAPPCOD = @PayCode) As PayCode_Check,
    PY11RPTFPD.RPPCOD As PayType,
    (SELECT RPCOL, RPPCOD
     FROM PY11RPTFPD, CTPY33PAYRP
     WHERE CTPY33PAYRP.CTPAPPCOD = PY11RPTFPD.RPPCOD)
from CTPY33PAYRP, PY11RPTFPD
ORDER BY CTPAPECOD
I have to say your naming conventions aren't exactly transparent!
Without knowing the schemas for your tables it's a bit hard to say for sure, but I would guess that you are having trouble with this sub-query:
(SELECT COCODE,CTPAPECOD,CTPAPPCOD FROM CTPY33PAYRP
WHERE CTPY33PAYRP.COCODE = @CompanyCode AND CTPY33PAYRP.CTPAPECOD = @EmpCode
AND CTPY33PAYRP.COCODE = @CompanyCode AND CTPY33PAYRP.CTPAPPCOD = @PayCode) As PayCode_Check,
and with this sub-query:
(SELECT RPCOL,RPPCOD
FROM PY11RPTFPD,CTPY33PAYRP
WHERE CTPY33PAYRP.CTPAPPCOD=PY11RPTFPD.RPPCOD)
You are selecting multiple columns from one table in the first case, and from a join of two tables in the second case. There is nothing in either sub-query which restricts the results to a single row. If you are going to include a sub-query in your select list, it has to return a single value: one column, and no more than one row per row in your main query. A sub-query with multiple columns is not allowed there.
Since I have no clue from your table and column names what it is the query is meant to do, I can't give you much definitive advice about how to fix the syntax errors. I would say keep your sub-selects to one column each. This is what the error message is telling you. Also you should either correlate the subqueries with the main query so that only one value is possible or use an aggregate function in the sub-queries to ensure that only a single value is possible for each record in the main query.
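To make that advice concrete, here is a sketch with made-up tables (pay and report_col are hypothetical stand-ins, not your real schema), run on SQLite through Python's sqlite3 module. SQLite reports the multi-column case with a different message than SQL Server, but the rule is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the tables in the question
    CREATE TABLE pay (emp_code TEXT, pay_code TEXT);
    CREATE TABLE report_col (pay_code TEXT, col INTEGER);
    INSERT INTO pay VALUES ('E1', 'P1'), ('E2', 'P2');
    INSERT INTO report_col VALUES ('P1', 3), ('P1', 7), ('P2', 5);
""")

# A select-list subquery that returns two columns is rejected
error = None
try:
    conn.execute(
        "SELECT emp_code, (SELECT pay_code, col FROM report_col) FROM pay"
    ).fetchall()
except sqlite3.Error as exc:
    error = str(exc)

# Fix: one column, correlated with the outer row, and an aggregate so that
# exactly one value comes back for each row of the main query
rows = conn.execute("""
    SELECT p.emp_code,
           (SELECT MAX(r.col) FROM report_col r
            WHERE r.pay_code = p.pay_code) AS column_total
    FROM pay p
    ORDER BY p.emp_code
""").fetchall()

print(error)
print(rows)
```

SQL Server additionally raises an error when a single-column subquery returns more than one row, whereas SQLite quietly uses the first row, so the correlation plus aggregate is what makes the query safe on both.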
I will also say as an aside that you should learn ANSI join syntax. It seems tricky at first, but it is your friend once you get used to it.

SQL - ISNULL Record Value

I have a SQL statement where I need to display a value from another table if a joining record exists. To attempt this, I'm using ISNULL. As a demonstration, here is a sample query:
SELECT
FirstName,
LastName,
ISNULL(select top 1 birthdate from BirthRecords where [SSN]=p.SSN, false) as HasRecord
FROM
Person p
Please note, this is a small snippet. I know there is a better way to do this specific query. However, I cannot do an outer join in my FROM clause. Because of this, I'm trying to do an inline statement. I thought ISNULL was the correct approach. Can someone please explain how I should do this?
Thank you,
Try this and see if it works for ya.
SELECT
    p.FirstName,
    p.LastName,
    CASE WHEN R.BirthDate IS NULL THEN 0
         ELSE 1
    END as HasRecord
FROM
    Person p
    left join BirthRecords R on p.SSN = R.SSN
Use a left outer join to return the birthdate if it exists, otherwise null:
SELECT
FirstName,
LastName,
birthdate
FROM Person AS p
LEFT JOIN BirthRecords AS b ON p.SSN = b.SSN
Your question is incomplete. You should at least specify:
what DBMS you use (I guess MS SQL Server, because of ISNULL)
what does/does not work
That said, I don't think you can use ISNULL like this. According to the docs, the replacement and the original column must be type compatible. So you cannot use "false" as the replacement, it must be a date (like birthdate).
Doing a JOIN is really your best bet from a performance and readability perspective.
Why can you not do a JOIN?
Another option is to create a FUNCTION and call it. Something like GET_BIRTHDATE(p.SSN), which would return a scalar value of null or the date. This is clearly a performance issue because the function will get called on every row, so the JOIN would still be better.
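If a join in the FROM clause really is off the table, an inline existence test can be written with CASE and EXISTS instead of ISNULL. The sketch below uses made-up rows and runs on SQLite through Python's sqlite3 module; the same CASE WHEN EXISTS (...) expression should also be accepted by SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (SSN TEXT PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE BirthRecords (SSN TEXT, birthdate TEXT);
    -- Made-up sample data: only Ann has a birth record
    INSERT INTO Person VALUES ('111', 'Ann', 'Lee'), ('222', 'Bob', 'Ray');
    INSERT INTO BirthRecords VALUES ('111', '1990-01-01');
""")

# The correlated EXISTS is evaluated per person and yields 1 or 0
rows = conn.execute("""
    SELECT p.FirstName,
           p.LastName,
           CASE WHEN EXISTS (SELECT 1 FROM BirthRecords b WHERE b.SSN = p.SSN)
                THEN 1 ELSE 0
           END AS HasRecord
    FROM Person p
    ORDER BY p.SSN
""").fetchall()

print(rows)
```

This avoids the type problem with ISNULL, because the expression never mixes a date with a flag value.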