SQL INTERSECT not supported in Phoenix, alternative for INTERSECT in Phoenix?

I have the following SQL expression:
SELECT SS_ITEM_SK AS POP_ITEM_SK
FROM (SELECT SS_ITEM_SK
FROM (SELECT SS_ITEM_SK,(ITEM_SOLD-ITEM_RETURNED) AS TOT_SOLD_QTY FROM (SELECT SS_ITEM_SK,COUNT(SS_ITEM_SK) AS ITEM_SOLD,COUNT(SR_ITEM_SK) AS ITEM_RETURNED FROM STORE_SALES1 right outer join STORE_RETURNS1 on SS_TICKET_NUMBER = SR_TICKET_NUMBER AND SS_ITEM_SK = SR_ITEM_SK GROUP BY SS_ITEM_SK)))
INTERSECT
SELECT CS_ITEM_SK AS POP_ITEM_SK FROM (SELECT CS_ITEM_SK
FROM (SELECT CS_ITEM_SK,(ITEM_SOLD-ITEM_RETURNED) AS TOT_SOLD_QTY FROM (SELECT CS_ITEM_SK,COUNT(CS_ITEM_SK) AS ITEM_SOLD,COUNT(CR_ITEM_SK) AS ITEM_RETURNED FROM CATALOG_SALES1 right outer join CATALOG_RETURNS1 on CS_ORDER_NUMBER = CR_ORDER_NUMBER and CS_ITEM_SK = CR_ITEM_SK GROUP BY CS_ITEM_SK)))
INTERSECT
SELECT WS_ITEM_SK AS POP_ITEM_SK FROM (SELECT WS_ITEM_SK
FROM (SELECT WS_ITEM_SK,(ITEM_SOLD-ITEM_RETURNED) AS TOT_SOLD_QTY FROM (SELECT WS_ITEM_SK,COUNT(WS_ITEM_SK) AS ITEM_SOLD,COUNT(WR_ITEM_SK) AS ITEM_RETURNED FROM WEB_SALES1 right outer join WEB_RETURNS1 on WS_ORDER_NUMBER = WR_ORDER_NUMBER AND WS_ITEM_SK = WR_ITEM_SK GROUP BY WS_ITEM_SK)))
Apache Phoenix does not support the INTERSECT keyword. Can somebody please help me rewrite the above query without using INTERSECT?

I think there are multiple ways you can do this:
Join Method
select * from ((query1 inner join query2 on column_names) inner join query3 on column_names)
Exists Method
(query1 where exists (query2 where exists (query3)) )
In Method
(query1 where column_name in (query2 where column_name in (query3)) )
References: https://blog.jooq.org/2015/10/06/you-probably-dont-use-sql-intersect-or-except-often-enough/
and http://phoenix.apache.org/subqueries.html
I would prefer EXISTS/IN over the join, though: if these queries return a lot of data, the join version may need to be tuned as described here:
https://phoenix.apache.org/joins.html
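To check that the three rewrites really return the same rows as INTERSECT, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module (not on Phoenix, which I have not tested here), and the tables store_items, catalog_items and web_items are hypothetical stand-ins for the three per-channel subqueries in the question.

```python
import sqlite3

# Hypothetical stand-ins for the three subqueries (store, catalog, web).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_items   (item_sk INTEGER);
    CREATE TABLE catalog_items (item_sk INTEGER);
    CREATE TABLE web_items     (item_sk INTEGER);
    INSERT INTO store_items   VALUES (1), (2), (3), (4);
    INSERT INTO catalog_items VALUES (2), (3), (4), (5);
    INSERT INTO web_items     VALUES (3), (4), (6);
""")

def run(sql):
    """Run a single-column query and return its values sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# Reference result: SQLite does support INTERSECT.
intersect_rows = run("""
    SELECT item_sk FROM store_items
    INTERSECT SELECT item_sk FROM catalog_items
    INTERSECT SELECT item_sk FROM web_items
""")

# IN method. DISTINCT mimics the duplicate removal that INTERSECT performs.
in_rows = run("""
    SELECT DISTINCT item_sk FROM store_items
    WHERE item_sk IN (SELECT item_sk FROM catalog_items)
      AND item_sk IN (SELECT item_sk FROM web_items)
""")

# EXISTS method, with correlated subqueries.
exists_rows = run("""
    SELECT DISTINCT s.item_sk FROM store_items s
    WHERE EXISTS (SELECT 1 FROM catalog_items c WHERE c.item_sk = s.item_sk)
      AND EXISTS (SELECT 1 FROM web_items w WHERE w.item_sk = s.item_sk)
""")

# Join method. Without DISTINCT, duplicate keys would multiply rows.
join_rows = run("""
    SELECT DISTINCT s.item_sk
    FROM store_items s
    JOIN catalog_items c ON c.item_sk = s.item_sk
    JOIN web_items w ON w.item_sk = s.item_sk
""")

print(intersect_rows, in_rows, exists_rows, join_rows)
```

One caveat: INTERSECT treats two NULLs as equal, while IN, EXISTS with = and an inner join never match NULL keys, so the rewrites only agree when the compared column has no NULLs.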

Related

PostgreSQL how to use with as

Anybody know why this isn't working? I'm getting: ERROR: syntax error at or near "most_recent"
with most_recent as (SELECT MAX(public."Master_playlist".updated_at)
FROM public."Master_playlist")
SELECT * from public."Playlist"
JOIN public."Master_playlist_playlist" on public."Playlist".id = public."Master_playlist_playlist".playlist_id
JOIN public."Master_playlist" on public."Master_playlist_playlist".master_playlist_id = public."Master_playlist".id
WHERE public."Master_playlist".updated_at = most_recent;
It is supposed to get the most recent date from Master_playlist and then use that date to select the Master_playlist row to join the rest of the query with.
Thanks! HM
The WITH clause creates a derived table, which you need to select from, using a join or a subquery. You also need to alias the column so you can refer to it afterwards, as in:
with most_recent as (
SELECT MAX(updated_at) max_updated_at
FROM public."Master_playlist"
)
SELECT *
from public."Playlist"
JOIN public."Master_playlist_playlist"
on public."Playlist".id = public."Master_playlist_playlist".playlist_id
JOIN public."Master_playlist"
on public."Master_playlist_playlist".master_playlist_id = public."Master_playlist".id
WHERE public."Master_playlist".updated_at = (SELECT max_updated_at FROM most_recent)
But here, it looks like it is simpler to use a row-limiting query:
select ...
from (
select *
from public."Master_playlist"
order by updated_at desc
limit 1
) mp
inner join public."Master_playlist_playlist" mpp
on mpp.master_playlist_id = mp.id
inner join public."Playlist" p
on p.id = mpp.playlist_id
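A small runnable sketch of both variants, assuming a simplified schema (lowercase table names and a name column on playlist are my own invention) and using SQLite through Python's sqlite3 module instead of PostgreSQL:

```python
import sqlite3

# Hypothetical, simplified version of the three tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_playlist (id INTEGER PRIMARY KEY, updated_at TEXT);
    CREATE TABLE playlist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE master_playlist_playlist (master_playlist_id INTEGER, playlist_id INTEGER);
    INSERT INTO master_playlist VALUES (1, '2020-01-01'), (2, '2020-03-01');
    INSERT INTO playlist VALUES (10, 'old mix'), (11, 'new mix'), (12, 'newer mix');
    INSERT INTO master_playlist_playlist VALUES (1, 10), (2, 11), (2, 12);
""")

# Variant 1: the CTE is a one-row table, so it is read with a scalar subquery.
cte_rows = conn.execute("""
    WITH most_recent AS (
        SELECT MAX(updated_at) AS max_updated_at FROM master_playlist
    )
    SELECT p.name
    FROM playlist p
    JOIN master_playlist_playlist mpp ON p.id = mpp.playlist_id
    JOIN master_playlist mp ON mpp.master_playlist_id = mp.id
    WHERE mp.updated_at = (SELECT max_updated_at FROM most_recent)
    ORDER BY p.name
""").fetchall()

# Variant 2: pick the newest master playlist with ORDER BY ... LIMIT 1.
limit_rows = conn.execute("""
    SELECT p.name
    FROM (SELECT * FROM master_playlist ORDER BY updated_at DESC LIMIT 1) mp
    JOIN master_playlist_playlist mpp ON mpp.master_playlist_id = mp.id
    JOIN playlist p ON p.id = mpp.playlist_id
    ORDER BY p.name
""").fetchall()

print(cte_rows, limit_rows)
```

The two variants only differ when several master playlists share the same latest updated_at: the CTE version keeps all of them, while LIMIT 1 keeps a single one.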

aggregate functions are not allowed in WHERE

I am using this query to find the unique records by latest date in PostgreSQL. The error I get is "aggregate functions are not allowed in WHERE". Following the linked question "How to fix error 'aggregate functions are not allowed in WHERE'", I tried to use an inner SELECT, but this did not work. Please help me fix the query. I am using pgAdmin III as the client.
SELECT Distinct t1.pa_serial_
,t1.homeownerm_name
,t1.districtvdc
,t1.date as firstrancheinspection_date
,t1.status
,t1.name_of_data_collector
,t1.fulcrum_id
,first_tranche_inspection_v2_reporting_questionnaire.date_reporting
From first_tranche_inspection_v2 t1
LEFT JOIN first_tranche_inspection_v2_reporting_questionnaire ON (t1.fulcrum_id = first_tranche_inspection_v2_reporting_questionnaire.fulcrum_parent_id)
where first_tranche_inspection_v2_reporting_questionnaire.date_reporting = (
select Max(first_tranche_inspection_v2_reporting_questionnaire.date_reporting)
from first_tranche_inspection_v2
where first_tranche_inspection_v2.pa_serial_ = t1.pa_serial_
);
You want to join the latest reporting questionnaire per inspection. In PostgreSQL you can use DISTINCT ON for this:
select fti.*, rq.*
from first_tranche_inspection_v2 fti
left join
(
select distinct on (fulcrum_parent_id) *
from first_tranche_inspection_v2_reporting_questionnaire
order by fulcrum_parent_id, date_reporting desc
) rq on rq.fulcrum_parent_id = fti.fulcrum_id;
Or use standard SQL's ROW_NUMBER:
select fti.*, rq.*
from first_tranche_inspection_v2 fti
left join
(
select
ftirq.*,
row_number() over (partition by fulcrum_parent_id order by date_reporting desc) as rn
from first_tranche_inspection_v2_reporting_questionnaire ftirq
) rq on rq.fulcrum_parent_id = fti.fulcrum_id and rq.rn = 1;
What you were trying to do should look like this:
select fti.*, rq.*
from first_tranche_inspection_v2 fti
left join first_tranche_inspection_v2_reporting_questionnaire rq
on rq.fulcrum_parent_id = fti.fulcrum_id
and (rq.fulcrum_parent_id, rq.date_reporting) in
(
select fulcrum_parent_id, max(date_reporting)
from first_tranche_inspection_v2_reporting_questionnaire
group by fulcrum_parent_id
);
This works, too, and only has the disadvantage that you read the table first_tranche_inspection_v2_reporting_questionnaire twice.
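Here is a minimal sketch comparing the ROW_NUMBER and the GROUP BY / IN variants on toy data. The tables inspection and report and their values are hypothetical, shortened stand-ins for the two tables above. It runs on SQLite through Python's sqlite3 module, so DISTINCT ON (PostgreSQL-specific) is left out, and it assumes SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Hypothetical, shortened stand-ins for the inspection and questionnaire tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inspection (fulcrum_id TEXT PRIMARY KEY, pa_serial TEXT);
    CREATE TABLE report (fulcrum_parent_id TEXT, date_reporting TEXT);
    INSERT INTO inspection VALUES ('a', 'PA-1'), ('b', 'PA-2'), ('c', 'PA-3');
    INSERT INTO report VALUES
        ('a', '2017-01-05'), ('a', '2017-02-10'),
        ('b', '2017-03-01');
""")

# Number the reports per inspection, newest first, and keep number 1.
row_number_rows = conn.execute("""
    SELECT i.fulcrum_id, r.date_reporting
    FROM inspection i
    LEFT JOIN (
        SELECT fulcrum_parent_id, date_reporting,
               ROW_NUMBER() OVER (PARTITION BY fulcrum_parent_id
                                  ORDER BY date_reporting DESC) AS rn
        FROM report
    ) r ON r.fulcrum_parent_id = i.fulcrum_id AND r.rn = 1
    ORDER BY i.fulcrum_id
""").fetchall()

# Keep only reports whose (parent, date) pair is the per-parent maximum.
group_by_rows = conn.execute("""
    SELECT i.fulcrum_id, r.date_reporting
    FROM inspection i
    LEFT JOIN report r
      ON r.fulcrum_parent_id = i.fulcrum_id
     AND (r.fulcrum_parent_id, r.date_reporting) IN (
            SELECT fulcrum_parent_id, MAX(date_reporting)
            FROM report
            GROUP BY fulcrum_parent_id)
    ORDER BY i.fulcrum_id
""").fetchall()

print(row_number_rows)
print(group_by_rows)
```

Because the filter sits in the ON clause of the LEFT JOIN, inspection 'c', which has no report at all, is still returned with a NULL date.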
The error most likely comes from the argument of MAX(): it refers to first_tranche_inspection_v2_reporting_questionnaire, which is a table of the outer query, not of the subquery. PostgreSQL then treats the aggregate as belonging to the outer query, where it ends up inside the WHERE clause. The aggregate has to range over a table listed in the subquery's own FROM clause.
One quick workaround might be to alias the tables, let MAX() refer to the subquery's own alias, and apply DISTINCT in an outer query over the result:
WITH cte AS (
SELECT t1.pa_serial_,
t1.homeownerm_name,
t1.districtvdc,
t1.date as firstrancheinspection_date,
t1.status,
t1.name_of_data_collector,
t1.fulcrum_id,
t2.date_reporting
FROM first_tranche_inspection_v2 t1
LEFT JOIN first_tranche_inspection_v2_reporting_questionnaire t2
ON t1.fulcrum_id = t2.fulcrum_parent_id
WHERE t2.date_reporting = (SELECT MAX(t.date_reporting)
FROM first_tranche_inspection_v2 t
WHERE t.pa_serial_ = t1.pa_serial_)
)
SELECT DISTINCT t.pa_serial_,
t.homeownerm_name,
t.districtvdc,
t.firstrancheinspection_date,
t.status,
t.name_of_data_collector,
t.fulcrum_id,
t.date_reporting
FROM cte t
Note that I went ahead and added an alias to the second table in your join, which leaves the query much easier to read.

Combine two queries, one based upon the other, into one

I have two queries, one based partly on the other. Is there a way of combining them into a single query?
SELECT tblIssues.*, tblIssues.NewsletterLookup
FROM tblIssues
WHERE (((tblIssues.NewsletterLookup)=5));
SELECT tblArea.ID, tblArea.AreaName
FROM tblArea LEFT JOIN Query2 ON tblArea.ID = Query2.[AreaLookup]
WHERE (((tblArea.Dormant)=False) AND ((Query2.tblIssues.NewsletterLookup) Is Null));
If you want to do this in a single query without Query2, you can use the equivalent SQL from Query2 as a subquery in your second example:
SELECT a.ID, a.AreaName
FROM
tblArea AS a
LEFT JOIN
(
SELECT i.*
FROM tblIssues AS i
WHERE i.NewsletterLookup=5
) AS sub
ON a.ID = sub.[AreaLookup]
WHERE
a.Dormant=False
AND sub.NewsletterLookup Is Null;
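The LEFT JOIN ... IS NULL pattern above is an anti-join: it keeps the areas that have no matching issue. A minimal sketch on toy data, using SQLite through Python's sqlite3 module instead of Access (so False is written as 0, and the sample rows are made up), with NOT EXISTS as a cross-check:

```python
import sqlite3

# Hypothetical sample data; Dormant uses 0/1 because SQLite has no boolean type.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblArea (ID INTEGER PRIMARY KEY, AreaName TEXT, Dormant INTEGER);
    CREATE TABLE tblIssues (ID INTEGER PRIMARY KEY, AreaLookup INTEGER, NewsletterLookup INTEGER);
    INSERT INTO tblArea VALUES (1, 'North', 0), (2, 'South', 0), (3, 'East', 1), (4, 'West', 0);
    INSERT INTO tblIssues VALUES (100, 1, 5), (101, 2, 7), (102, 3, 7);
""")

# Anti-join: active areas with no issue in newsletter 5.
anti_join_rows = conn.execute("""
    SELECT a.ID, a.AreaName
    FROM tblArea AS a
    LEFT JOIN (
        SELECT i.* FROM tblIssues AS i WHERE i.NewsletterLookup = 5
    ) AS sub ON a.ID = sub.AreaLookup
    WHERE a.Dormant = 0
      AND sub.NewsletterLookup IS NULL
    ORDER BY a.ID
""").fetchall()

# The same question asked with NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT a.ID, a.AreaName
    FROM tblArea AS a
    WHERE a.Dormant = 0
      AND NOT EXISTS (
          SELECT 1 FROM tblIssues AS i
          WHERE i.AreaLookup = a.ID AND i.NewsletterLookup = 5)
    ORDER BY a.ID
""").fetchall()

print(anti_join_rows, not_exists_rows)
```

North is dropped because it has an issue in newsletter 5, East because it is dormant; South (only in newsletter 7) and West (no issues at all) remain.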
You meant to perform a JOIN like
SELECT ti.*, ta.ID, ta.AreaName
FROM tblArea ta
LEFT JOIN tblIssues ti ON ta.ID = ti.[AreaLookup]
WHERE (ti.NewsletterLookup=5 OR ti.NewsletterLookup Is Null)
AND ta.Dormant=False;

Error message - Every derived table must have its own alias

I have this SQL Syntax but it's not working and receive this error:
"#1248 - Every derived table must have its own alias".
Could you help me?
SELECT *
FROM produse_comenzi
JOIN comenzi ON comenzi.id_comanda = produse_comenzi.id_comanda
JOIN (SELECT DISTINCT numar_factura FROM facturi)
ON facturi.id_comanda = comenzi.id_comanda
In the second join you are using a subquery, but you haven't given the result an alias, i.e. a name to identify the result by:
SELECT *
FROM produse_comenzi
JOIN comenzi
ON comenzi.id_comanda = produse_comenzi.id_comanda
JOIN (SELECT DISTINCT numar_factura FROM facturi) -- has no alias
ON facturi.id_comanda = comenzi.id_comanda
Instead, you should write:
SELECT *
FROM produse_comenzi
JOIN comenzi
ON comenzi.id_comanda = produse_comenzi.id_comanda
JOIN (SELECT DISTINCT numar_factura, id_comanda FROM facturi) AS facturi
ON facturi.id_comanda = comenzi.id_comanda
You must add an alias to each subquery that's being treated as a table:
SELECT *
FROM produse_comenzi
JOIN comenzi ON comenzi.id_comanda = produse_comenzi.id_comanda
JOIN (SELECT DISTINCT numar_factura, id_comanda FROM facturi) x
ON x.id_comanda = comenzi.id_comanda
Here I have named the result set x and referred to that in the join condition. You can change "x" to whatever you like.
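Besides the alias, the derived table also has to expose every column the ON clause uses. A minimal sketch of that point, assuming toy data and using SQLite through Python's sqlite3 module (SQLite does not raise MySQL's #1248 error, so only the missing-column part is reproduced here):

```python
import sqlite3

# Hypothetical sample rows for two of the tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comenzi (id_comanda INTEGER PRIMARY KEY);
    CREATE TABLE facturi (numar_factura TEXT, id_comanda INTEGER);
    INSERT INTO comenzi VALUES (1), (2);
    INSERT INTO facturi VALUES ('F-1', 1), ('F-1', 1), ('F-2', 2);
""")

# The derived table has an alias but does not select id_comanda,
# so the ON clause cannot see that column.
try:
    conn.execute("""
        SELECT * FROM comenzi
        JOIN (SELECT DISTINCT numar_factura FROM facturi) AS f
          ON f.id_comanda = comenzi.id_comanda
    """)
    missing_column_error = None
except sqlite3.OperationalError as exc:
    missing_column_error = str(exc)

# Alias plus join column: the query runs and duplicates are collapsed.
rows = conn.execute("""
    SELECT comenzi.id_comanda, f.numar_factura
    FROM comenzi
    JOIN (SELECT DISTINCT numar_factura, id_comanda FROM facturi) AS f
      ON f.id_comanda = comenzi.id_comanda
    ORDER BY comenzi.id_comanda
""").fetchall()

print(missing_column_error)
print(rows)
```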
This should fix it (SQL needs a name for each result set produced by a SELECT so that it can tell them apart):
SELECT *
FROM produse_comenzi AS table_1
JOIN comenzi AS table_2
ON table_2.id_comanda = table_1.id_comanda
JOIN (SELECT DISTINCT numar_factura, id_comanda FROM facturi) AS table_3
ON table_3.id_comanda = table_2.id_comanda

Need to understand multiple joins correctly

I was trying to join 3 tables - CurrentProducts, SalesInvoice and SalesInvoiceDetail. SalesInvoiceDetail contains FK/foreign key to the other two tables and some other columns. The first query is ok but the second is not. My question comes at the end of the code.
Right
select *
from CurrentProducts inner join
(dbo.SalesInvoiceDetail inner join dbo.SalesInvoice
on dbo.SalesInvoiceDetail.InvoiceID = dbo.SalesInvoice.InvoiceID
)
on dbo.SalesInvoiceDetail.ProductID = dbo.CurrentProducts.ProductID
Wrong
select *
from CurrentProducts inner join
(select * from
dbo.SalesInvoiceDetail inner join dbo.SalesInvoice
on dbo.SalesInvoiceDetail.InvoiceID = dbo.SalesInvoice.InvoiceID
)
on dbo.SalesInvoiceDetail.ProductID = dbo.CurrentProducts.ProductID
error - Incorrect syntax near the keyword 'on'.
Why is the second query wrong? Isn't it conceptually the same as the first one? That is, the inner join produces a result set, we select * from that result set, and then join it to CurrentProducts?
The first query is a "plain" join expressed with nested (parenthesized) join syntax. It can be rewritten as:
select
*
from
CurrentProducts
inner join dbo.SalesInvoiceDetail
on dbo.SalesInvoiceDetail.ProductID = dbo.CurrentProducts.ProductID
inner join dbo.SalesInvoice
on dbo.SalesInvoiceDetail.InvoiceID = dbo.SalesInvoice.InvoiceID
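The second query is a join where the second table is a subquery. When you join on a subquery, you must assign an alias to it and use that alias to refer to the columns returned by the subquery. Note that SQL Server also requires unique column names in a derived table, so if both inner tables have an InvoiceID column, select * inside the subquery has to be replaced with an explicit column list: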
select
*
from
CurrentProducts
inner join (select *
from dbo.SalesInvoiceDetail
inner join dbo.SalesInvoice
on SalesInvoiceDetail.InvoiceID = SalesInvoice.InvoiceID
) as foo on foo.ProductID = dbo.CurrentProducts.ProductID
You need to alias the inner query. Also, in the first one the parentheses are not needed.
select *
from CurrentProducts inner join
(select * from
dbo.SalesInvoiceDetail inner join dbo.SalesInvoice
on dbo.SalesInvoiceDetail.InvoiceID = dbo.SalesInvoice.InvoiceID
) A
on A.ProductID = dbo.CurrentProducts.ProductID
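To see that the flat join and the aliased derived table give the same rows, here is a minimal sketch on toy data. It runs on SQLite through Python's sqlite3 module, not on SQL Server, and the Name, Customer and Quantity columns are hypothetical. The derived table lists its columns explicitly so that InvoiceID does not appear twice.

```python
import sqlite3

# Hypothetical columns and rows; only the key columns come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CurrentProducts (ProductID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SalesInvoice (InvoiceID INTEGER PRIMARY KEY, Customer TEXT);
    CREATE TABLE SalesInvoiceDetail (InvoiceID INTEGER, ProductID INTEGER, Quantity INTEGER);
    INSERT INTO CurrentProducts VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO SalesInvoice VALUES (10, 'Alice'), (11, 'Bob');
    INSERT INTO SalesInvoiceDetail VALUES (10, 1, 3), (11, 1, 1), (11, 2, 5);
""")

# Flat form: three tables chained with two inner joins.
flat_rows = conn.execute("""
    SELECT p.Name, i.Customer, d.Quantity
    FROM CurrentProducts p
    INNER JOIN SalesInvoiceDetail d ON d.ProductID = p.ProductID
    INNER JOIN SalesInvoice i ON i.InvoiceID = d.InvoiceID
    ORDER BY p.Name, i.Customer
""").fetchall()

# Derived-table form: the subquery gets alias A, and the outer query
# can only see the columns the subquery selects.
derived_rows = conn.execute("""
    SELECT p.Name, A.Customer, A.Quantity
    FROM CurrentProducts p
    INNER JOIN (
        SELECT d.ProductID, d.Quantity, i.Customer
        FROM SalesInvoiceDetail d
        INNER JOIN SalesInvoice i ON i.InvoiceID = d.InvoiceID
    ) A ON A.ProductID = p.ProductID
    ORDER BY p.Name, A.Customer
""").fetchall()

print(flat_rows)
print(derived_rows)
```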