SQL WITH clause doesn't work - sql

I'm trying to execute a seemingly simple query that contains a WITH clause:
WITH sub AS (SELECT url FROM site WHERE id = 15)
SELECT * FROM search_result WHERE url = sub.url
But it doesn't work. I get
ERROR: missing FROM-clause entry for table "sub"
What's the matter?

Table expressions need to be used like tables. You're trying to use the value of sub as a scalar.
Try this (forgive me, Postgres is not my first SQL dialect).
WITH sub AS (SELECT url FROM site WHERE id = 15)
SELECT * FROM sub
INNER JOIN
search_result
ON
sub.url = search_result.url
EDIT: Alternatively, you could skip the WITH clause and go with:
SELECT * FROM
site
INNER JOIN
search_result
ON
site.url = search_result.url
WHERE
site.id = 15
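A minimal sketch of the point above, using Python's sqlite3 module as a stand-in for Postgres (the sample rows and the title column are invented for illustration, and SQLite words its error differently). It shows that a CTE is only visible once it is listed in FROM or JOIN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE search_result (id INTEGER PRIMARY KEY, url TEXT, title TEXT);
    INSERT INTO site VALUES (15, 'http://a.example'), (16, 'http://b.example');
    INSERT INTO search_result VALUES
        (1, 'http://a.example', 'first hit'),
        (2, 'http://b.example', 'other site'),
        (3, 'http://a.example', 'second hit');
""")

cte = "WITH sub AS (SELECT url FROM site WHERE id = 15) "

# Referencing sub.url without listing sub in FROM is rejected.
try:
    conn.execute(cte + "SELECT * FROM search_result WHERE url = sub.url")
    scalar_use_failed = False
except sqlite3.OperationalError:
    scalar_use_failed = True

# Joining against the CTE, as in the answer, works.
joined = conn.execute(
    cte + "SELECT search_result.title FROM sub "
    "INNER JOIN search_result ON sub.url = search_result.url "
    "ORDER BY search_result.id"
).fetchall()

print(scalar_use_failed, joined)
```

Only the two rows whose url matches site 15 come back from the join.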

Don't use a CTE at all for this simple case.
Contrary to what you seem to expect, the following simple query without a CTE will be slightly faster:
SELECT r.*
FROM search_result r
JOIN site s USING (url)
WHERE s.id = 15;
Test with EXPLAIN ANALYZE to verify.
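CTEs introduce an optimization barrier (in PostgreSQL before version 12; since version 12, a non-recursive, side-effect-free CTE that is referenced only once is inlined unless it is marked MATERIALIZED). They have many very good uses, but they won't make simple queries faster.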
Here is a thread on pgsql-performance that gives you more details as to why that is.

That's not the correct way to use a CTE:
With sub as (
SELECT url
FROM site
WHERE id = 15
)
SELECT *
FROM Search_Result SR
JOIN sub ON SR.url = sub.Url

You can just as easily do an inner join:
SELECT search_result.*
FROM
search_result
INNER JOIN
(SELECT url FROM site WHERE id = 15) as st
ON
search_result.url = st.url
This filters site inside the subquery, so the join works on a smaller set than it would if the WHERE clause were applied after the join. This may not matter in your case (the planner often pushes the filter down by itself), but it is something to consider.

Related

using an alias of a function in a sql condition

I have something like this:
SELECT
cansa1.NAME,
mod(cansa1.PRODUCT_ID, 1000000) prodIdHash
FROM CANSA_TABLE cansa1
INNER JOIN CUSER_TABLE cuser1 ON cansa1.PRODUCT_ID = cuser1.PRODUCT_ID
AND mod(cansa1.PRODUCT_ID, 1000000) = cuser1.PRODUCT_HASH
This query works, but I want to replace the second occurrence of the mod() function (in the inner join) to avoid executing it twice. I tried replacing it with the alias from the select clause, but that does not work. Any idea what I can use so that this query doesn't repeat the mod() function?
Sorry for my English.
Don't worry about executing it twice. The SQL engine will optimize the query and decide whether the function value is cached or computed twice. It may even rewrite the query, so that what is executed has a different structure than what you wrote, if it determines that this is more efficient.
If you really want to try to rewrite it then:
SELECT c.NAME,
c.prodIdHash
FROM (
SELECT name,
PRODUCT_ID,
mod(PRODUCT_ID, 1000000) As prodIdHash
FROM CANSA_TABLE
) c
INNER JOIN CUSER_TABLE u
ON ( c.PRODUCT_ID = u.PRODUCT_ID
AND c.prodIdHash = u.PRODUCT_HASH )
However, the SQL engine may rewrite the query and push the function to the outer scope, so you may need a seemingly irrelevant filter condition (here Oracle's ROWNUM) to materialize the inner query and prevent the calculation from being rewritten:
SELECT c.NAME,
c.prodIdHash
FROM (
SELECT name,
PRODUCT_ID,
mod(PRODUCT_ID, 1000000) As prodIdHash
FROM CANSA_TABLE
WHERE ROWNUM > 0
) c
INNER JOIN CUSER_TABLE u
ON ( c.PRODUCT_ID = u.PRODUCT_ID
AND c.prodIdHash = u.PRODUCT_HASH )
However, this really seems like a case of premature optimisation. You should check if there is actually a problem first before you try and apply an optimisation that probably is not needed.
You can use a derived table (i.e. a subquery in the FROM clause):
SELECT dt.NAME, dt.prodIdHash
FROM
(SELECT
cansa1.NAME,
cansa1.PRODUCT_ID,
mod(cansa1.PRODUCT_ID, 1000000) prodIdHash
FROM CANSA_TABLE cansa1) dt
INNER JOIN CUSER_TABLE cuser1 ON dt.PRODUCT_ID = cuser1.PRODUCT_ID
AND dt.prodIdHash = cuser1.PRODUCT_HASH
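The derived-table approach can be tried out with a small sketch like the one below. It runs on SQLite through Python's sqlite3 module, so it makes a few assumptions: the % operator stands in for mod() (which is not guaranteed to be available in every SQLite build), the sample rows are invented, and the subquery also selects product_id because the join condition needs it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cansa_table (product_id INTEGER, name TEXT);
    CREATE TABLE cuser_table (product_id INTEGER, product_hash INTEGER);
    INSERT INTO cansa_table VALUES (5000042, 'widget'), (7000099, 'gadget');
    INSERT INTO cuser_table VALUES (5000042, 42), (7000099, 1);
""")

# The hash is written once, inside the derived table, and reused by name
# in both the select list and the join condition.
rows = conn.execute("""
    SELECT dt.name, dt.prodIdHash
    FROM (SELECT name,
                 product_id,
                 product_id % 1000000 AS prodIdHash
          FROM cansa_table) dt
    INNER JOIN cuser_table u
        ON dt.product_id = u.product_id
       AND dt.prodIdHash = u.product_hash
""").fetchall()

print(rows)
```

Only 'widget' matches, because 7000099 hashes to 99 while the stored product_hash for that row is 1. Whether the engine evaluates the expression once or twice is still up to the optimizer.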

Select with table join

I want to get * from concrete_samples but only method_name and not the id from concrete_compaction_methods
SELECT * FROM concrete_samples,concrete_compaction_methods WHERE concrete_compaction_methods.id = concrete_samples.compaction_method AND workorder_id=1
This is currently returning everything I want EXCEPT it's giving me the id column of the methods table which I don't want.
The pseudocode of the statement I want is:
SELECT * FROM concrete_samples, SELECT method_name FROM concrete_compaction_methods WHERE concrete_compaction_methods.id = concrete_samples.compaction_method AND workorder_id=1
I've done some research and tried to use a union, but I don't think that's the correct or neatest solution.
Thank you
I strongly advise:
Learn to use proper JOIN syntax.
Use table aliases in your query.
Qualify all column references.
So the query looks more like this:
SELECT cs.*, ccm.method_name
FROM concrete_samples cs JOIN
concrete_compaction_methods ccm
ON ccm.id = cs.compaction_method
WHERE cs.workorder_id = 1;
I am guessing that workorder_id comes from concrete_samples rather than the other table.
Try below -
SELECT concrete_samples.*,method_name
FROM concrete_samples inner join concrete_compaction_methods
on concrete_compaction_methods.id = concrete_samples.compaction_method
where workorder_id=1
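As a quick check of the first suggestion, here is a sketch on SQLite via Python's sqlite3 module. The table definitions and rows are assumptions made up for the example; only the column names mentioned in the question are taken from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE concrete_compaction_methods (
        id INTEGER PRIMARY KEY, method_name TEXT);
    CREATE TABLE concrete_samples (
        id INTEGER PRIMARY KEY, workorder_id INTEGER, compaction_method INTEGER);
    INSERT INTO concrete_compaction_methods VALUES (1, 'rodding'), (2, 'vibration');
    INSERT INTO concrete_samples VALUES (10, 1, 2), (11, 2, 1);
""")

cur = conn.execute("""
    SELECT cs.*, ccm.method_name
    FROM concrete_samples cs
    JOIN concrete_compaction_methods ccm
      ON ccm.id = cs.compaction_method
    WHERE cs.workorder_id = 1
""")

# cs.* expands to the sample columns only; the methods table contributes
# just method_name, so its id column is not in the result.
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns, rows)
```

The only id in the result is the one from concrete_samples.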

Selecting ambiguous column from subquery with postgres join inside

I have the following query:
select x.id0
from (
select *
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0
) x;
Since id0 is in both sessions and clicked_products, I get the expected error:
column reference "id0" is ambiguous
However, to fix this problem in the past I simply needed to specify a table. In this situation, I tried:
select sessions.id0
from (
select *
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0
) x;
However, this results in the following error:
missing FROM-clause entry for table "sessions"
How do I return just the id0 column from the above query?
Note: I realize I can trivially solve the problem by getting rid of the subquery altogether:
select sessions.id0
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0;
However, I need to do further aggregations and so do need to keep the subquery syntax.
The only way you can do that is by using aliases for the columns returned from the subquery so that the names are no longer ambiguous.
Qualifying the column with the table name does not work, because sessions is not visible at that point (only x is).
True, this way you cannot use SELECT *, but you shouldn't do that anyway. For a reason why, your query is a wonderful example:
Imagine that you have a query like yours that works, and then somebody adds a new column with the same name as a column in the other table. Then your query suddenly and mysteriously breaks.
Avoid SELECT *. It is ok for ad-hoc queries, but not in code.
select x.id from
(select sessions.id0 as id, clicked_products.* from sessions
inner join
clicked_products on
sessions.id0 = clicked_products.session_id0 ) x;
However, you then have to list any other columns you need from the sessions table explicitly, since you cannot use SELECT * for both tables.
I assume:
select x.id from (select sessions.id0 id
from sessions
inner join clicked_products
on sessions.id0 = clicked_products.session_id0 ) x;
should work.
Another option is to use a Common Table Expression, which is more readable and easier to test.
But you still need aliases or unique column names.
In general, selecting everything with * is not a good idea -- reading all columns is a waste of IO.
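A small sketch of the aliasing approach with a further aggregation on top, run on SQLite through Python's sqlite3 module. The alias names session_id and click_id, the extra columns, and the sample rows are hypothetical; the point is only that the outer query can refer to nothing but the names the subquery exposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (id0 INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE clicked_products (
        id0 INTEGER PRIMARY KEY, session_id0 INTEGER, product TEXT);
    INSERT INTO sessions VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO clicked_products VALUES
        (100, 1, 'lamp'), (101, 1, 'desk'), (102, 2, 'chair');
""")

# Each id0 gets its own alias inside the subquery, so the outer query
# sees two distinct column names and can aggregate over either of them.
rows = conn.execute("""
    SELECT x.session_id, COUNT(*) AS clicks
    FROM (SELECT s.id0 AS session_id,
                 c.id0 AS click_id,
                 c.product
          FROM sessions s
          INNER JOIN clicked_products c ON s.id0 = c.session_id0) x
    GROUP BY x.session_id
    ORDER BY x.session_id
""").fetchall()

print(rows)
```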

Parent Query does not see columns of subquery when returning using SELECT *

I am currently learning SQL using Oracle SQL developer.
While writing queries I came up with three different versions of the same query.
SELECT
sh.share_id
FROM shares sh
LEFT JOIN trades tr
ON sh.share_id = tr.share_id
WHERE trade_id is NULL;
SELECT
sr.share_id
FROM (SELECT sh.share_id, tr.trade_id
FROM shares sh
LEFT JOIN trades tr
ON sh.share_id = tr.share_id) sr
WHERE sr.trade_id is NULL;
SELECT
sr.share_id
FROM (SELECT *
FROM shares sh
LEFT JOIN trades tr
ON sh.share_id = tr.share_id) sr
WHERE sr.trade_id is NULL;
The first two queries compile, run and return the same result set, but when I try to run the third query I get an error on its second line:
"SR"."SHARE_ID": invalid identifier.
I know that * in the SELECT statement selects all columns, so why am I getting this error?
From reading your comments: in your final query, the DBMS doesn't know which share_id to use for your SELECT sr.share_id, because the SELECT * of your subquery returns two share_id columns. You have to do something like your second query.
The problem is that select * is selecting all columns from both tables, even those with the same name. So, you are getting share_id twice. A simple fix is to use USING:
SELECT sr.share_id
FROM (SELECT *
FROM shares sh LEFT JOIN
trades tr
USING (share_id)
) sr
WHERE sr.trade_id is NULL;
Of course, this only fixes the reference to share_id.
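The effect of USING can be seen in a short sketch. It runs on SQLite through Python's sqlite3 module rather than Oracle, and the name column and sample rows are assumptions, but the behavior shown (the join column appears once in SELECT *) is what the answer relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shares (share_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE trades (trade_id INTEGER PRIMARY KEY, share_id INTEGER);
    INSERT INTO shares VALUES (1, 'ACME'), (2, 'GLOBEX');
    INSERT INTO trades VALUES (500, 1);
""")

# With USING, SELECT * lists the join column only once.
cur = conn.execute(
    "SELECT * FROM shares sh LEFT JOIN trades tr USING (share_id)")
columns = [d[0] for d in cur.description]
cur.fetchall()

# So the outer query can refer to sr.share_id without ambiguity.
untraded = conn.execute("""
    SELECT sr.share_id
    FROM (SELECT *
          FROM shares sh
          LEFT JOIN trades tr USING (share_id)) sr
    WHERE sr.trade_id IS NULL
""").fetchall()

print(columns, untraded)
```

Any other column name shared by both tables would still be duplicated, which is why listing the columns explicitly remains the more robust fix.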

SQL joins instead of nested query

select baseurl from tmp_page_tbl
where baseurl NOT IN ( select baseurl from page_lookup )
How do I write this query using joins instead of a nested query?
The idea is to get the baseurls from the tmp table that do not exist in the page_lookup table.
select t.baseurl
from tmp_page_tbl t
left outer join page_lookup p on t.baseurl = p.baseurl
where p.baseurl IS NULL
You could rewrite using joins like below:
SELECT t.baseurl from tmp_page_tbl as t
LEFT JOIN page_lookup as pl
ON t.baseurl=pl.baseurl
where pl.baseurl IS NULL
I'm not sure I would though unless you have a compelling reason. Below are a few links worth looking at:
http://explainextended.com/2009/09/15/not-in-vs-not-exists-vs-left-join-is-null-sql-server/
http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/
If you aren't selecting most of the table and you have an index on page_lookup.baseurl, then NOT EXISTS should be the most efficient.
select baseurl from tmp_page_tbl tmp
where not exists ( select 1 from page_lookup WHERE baseurl = tmp.baseurl );
On some RDBMSs you can also use MINUS (Oracle) or EXCEPT (Postgres). In some cases that is very efficient.
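To compare the variants side by side, here is a sketch on SQLite via Python's sqlite3 module (the sample URLs are invented, and SQLite spells the set operator EXCEPT). It checks that the four forms agree on NULL-free, duplicate-free data, and then shows one way they can diverge: a NULL in page_lookup makes NOT IN return nothing, while NOT EXISTS is unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tmp_page_tbl (baseurl TEXT);
    CREATE TABLE page_lookup (baseurl TEXT);
    INSERT INTO tmp_page_tbl VALUES ('a.example'), ('b.example'), ('c.example');
    INSERT INTO page_lookup VALUES ('b.example');
""")

queries = {
    "not_in": """SELECT baseurl FROM tmp_page_tbl
                 WHERE baseurl NOT IN (SELECT baseurl FROM page_lookup)
                 ORDER BY 1""",
    "left_join": """SELECT t.baseurl FROM tmp_page_tbl t
                    LEFT JOIN page_lookup p ON t.baseurl = p.baseurl
                    WHERE p.baseurl IS NULL
                    ORDER BY 1""",
    "not_exists": """SELECT t.baseurl FROM tmp_page_tbl t
                     WHERE NOT EXISTS (SELECT 1 FROM page_lookup p
                                       WHERE p.baseurl = t.baseurl)
                     ORDER BY 1""",
    "except": """SELECT baseurl FROM tmp_page_tbl
                 EXCEPT
                 SELECT baseurl FROM page_lookup
                 ORDER BY 1""",
}
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}

# A NULL in the lookup table changes the NOT IN result only.
conn.execute("INSERT INTO page_lookup VALUES (NULL)")
not_in_with_null = conn.execute(queries["not_in"]).fetchall()
not_exists_with_null = conn.execute(queries["not_exists"]).fetchall()

print(results)
print(not_in_with_null, not_exists_with_null)
```

Note also that EXCEPT removes duplicate rows from its result, so it is only interchangeable with the other forms when tmp_page_tbl.baseurl has no duplicates.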