When can a subquery in the SELECT list not be removed?

Correlated subqueries are considered to be a bad habit. I believe that any SQL command with a subquery between SELECT and FROM (let's call it a SELECT subquery) can be rewritten into one without any. For example, a query like this
select *,
(
select sum(t2.sales)
from your_table t2
where t2.dates
between t1.dates - interval '3' day and
t1.dates and
t2.id = t1.id
) running_sales
from your_table t1
can be rewritten into the following one
select dd.id, dd.dates, dd.sales, sum(d.sales) running_sales
from your_table dd
join your_table d on d.dates
between (dd.dates - interval '3' day) and
dd.dates and
dd.id = d.id
group by dd.id, dd.dates, dd.sales
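The equivalence claimed above can be checked on a small data set. The following is only a sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in database, made-up rows, and dates stored as integer day numbers so that `dates - 3` plays the role of `dates - interval '3' day`.

```python
import sqlite3

# Hypothetical sample data: (id, dates, sales). Dates are integer day numbers,
# an assumption made so that "dates - 3" replaces "dates - interval '3' day".
rows = [(1, 1, 10), (1, 2, 20), (1, 5, 40), (2, 1, 5), (2, 4, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (id integer, dates integer, sales integer)")
conn.executemany("insert into your_table values (?, ?, ?)", rows)

# Version 1: correlated subquery in the SELECT list.
correlated = conn.execute("""
    select t1.id, t1.dates, t1.sales,
           (select sum(t2.sales)
              from your_table t2
             where t2.dates between t1.dates - 3 and t1.dates
               and t2.id = t1.id) as running_sales
      from your_table t1
     order by t1.id, t1.dates
""").fetchall()

# Version 2: self-join plus GROUP BY.
joined = conn.execute("""
    select dd.id, dd.dates, dd.sales, sum(d.sales) as running_sales
      from your_table dd
      join your_table d
        on d.dates between dd.dates - 3 and dd.dates
       and dd.id = d.id
     group by dd.id, dd.dates, dd.sales
     order by dd.id, dd.dates
""").fetchall()

print(correlated == joined)
```

Both versions agree on this data because every (id, dates, sales) combination occurs only once. If the table held two identical rows, the GROUP BY version would merge them into one output row while the correlated version would still return both.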
Problems may occur when there is more than one SELECT subquery. However, even in those cases it is possible to rewrite them into subqueries in the FROM clause and then perform a LEFT JOIN, in the following spirit
select *,
(
select sum(sales)
from dat dd
where dd.dates
between (d.dates - interval '3' day) and d.dates and
dd.id = d.id
) running_sales,
(
select sum(sales)
from dat dd
where dd.id = d.id
) total_sales
from dat d
can be rewritten into the following one
select d.*,
t_running.running_sales,
t_total.total_sales
from dat d
left join (
select dd.id, dd.dates, sum(d.sales) running_sales
from dat dd
join dat d on d.dates
between (dd.dates - interval '3' day) and
dd.dates and
dd.id = d.id
group by dd.id, dd.dates
) t_running on d.id = t_running.id and d.dates = t_running.dates
left join (
select d.id, sum(d.sales) total_sales
from dat d
group by d.id
) t_total on t_total.id = d.id
Could you please provide an example where it is not possible to get rid of the SELECT subquery? Please also add a link to a working example (e.g. dbfiddle or sqlfiddle) to make the discussion easier, thanks!

If the question is for a multiple-choice test (or something like that) :), it is not possible to get rid of the subquery in an EXISTS clause.
Another similar answer is IN (subquery), used at a different level of aggregation to avoid a Cartesian product.
(Same comment, by the way: correlated subqueries are not always considered a bad habit; it depends on optimization, structure, etc.
The WITH clause is a related way of using subqueries, and it is very practical for complex queries.)
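To illustrate the point about EXISTS and IN avoiding row multiplication, here is a small sketch. It assumes SQLite via Python's sqlite3 module and two hypothetical tables, `customers` and `orders`, that do not appear in the question. It does not prove that the subquery is impossible to remove; it only shows that a plain join is not a drop-in replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and rows, used only for illustration.
conn.executescript("""
    create table customers (id integer, name text);
    create table orders (customer_id integer, amount integer);
    insert into customers values (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    insert into orders values (1, 10), (1, 20), (3, 5);
""")

# EXISTS answers "is there at least one order?" once per customer.
with_exists = conn.execute("""
    select c.name
      from customers c
     where exists (select 1 from orders o where o.customer_id = c.id)
     order by c.name
""").fetchall()

# A plain join returns one row per matching order, so Ann appears twice.
with_join = conn.execute("""
    select c.name
      from customers c
      join orders o on o.customer_id = c.id
     order by c.name
""").fetchall()

print(with_exists)
print(with_join)
```

The join version needs an extra DISTINCT or GROUP BY to get back to one row per customer, which is the kind of bookkeeping the EXISTS form avoids.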

Related

How to do a join in Oracle tables?

I am trying to help a co-worker do an inner join on two oracle tables so he can build a particular graph on a report.
I have no Oracle experience, only SQL Server, and have gotten to what seems like the appropriate statement, but it does not work.
SELECT concat(concat(month("a.timestamp"),','),day("a.timestamp")) as monthDay
, min("a.data_value") as minTemp
, max("a.data_value") as maxTemp
, "b.forecast" as forecastTemp
, "a.timestamp" as date
FROM table1 a
WHERE "a.category" = 'temperature'
GROUP BY concat(concat(month("timestamp"),','),day("timestamp"))
INNER JOIN (SELECT "forecast"
, "timestamp"
FROM table2
WHERE "category" = 'temperature') b
ON "a.timestamp" = "b.timestamp"
It doesn't like my aliases for some reason. It doesn't like not having quotes for some reason.
Also when I use the fully scored names it still fails because:
ORA-00933 SQL command not properly ended
The order of the query should be
SELECT
FROM
INNER JOIN
WHERE
GROUP BY
as below
SELECT concat(concat(month("a.timestamp"),','),day("a.timestamp")) as monthDay
, min("a.data_value") as minTemp
, max("a.data_value") as maxTemp
, "b.forecast" as forecastTemp
, "a.timestamp" as date
FROM table1 a
INNER JOIN (SELECT "forecast"
, "timestamp"
FROM table2
WHERE "category" = 'temperature') b
ON "a.timestamp" = "b.timestamp"
WHERE "category" = 'temperature'
GROUP BY concat(concat(month("timestamp"),','),day("timestamp"))
In a flood of attempts, here's yet another one.
table2 can be moved out of the subquery; join it with table1 on category as well.
Note that all non-aggregated columns in the SELECT list have to be contained in the GROUP BY clause. It seems that a.timestamp contains more information than just month and day. If that's so, it will probably ruin the whole result set, as data won't be grouped by monthday but by the whole date; consider removing it from the SELECT if necessary.
SELECT TO_CHAR(a.timestamp,'mm.dd') monthday,
MIN(a.data_value) mintemp,
MAX(a.data_value) maxtemp,
b.forecast forecasttemp,
a.timestamp c_date
FROM table1 a
JOIN table2 b ON a.timestamp = b.timestamp
AND a.category = b.category
WHERE a.category = 'temperature'
GROUP BY TO_CHAR(a.timestamp,'mm.dd'),
b.forecast,
a.timestamp;
The correct (simplified) syntax of select is
SELECT <columns>
FROM table1 <alias>
JOIN table2 <alias> ON <join_condition>
WHERE <condition>
GROUP BY <group by columns>
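The clause order above can be tried out end to end. This is a sketch only: it assumes SQLite via Python's sqlite3 module rather than Oracle, so `strftime('%m-%d', ...)` stands in for `TO_CHAR(..., 'mm-dd')`, the timestamp column is renamed to the hypothetical `ts`, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented rows; the column "ts" is a hypothetical stand-in for "timestamp".
conn.executescript("""
    create table table1 (ts text, category text, data_value real);
    create table table2 (ts text, category text, forecast real);
    insert into table1 values
        ('2020-01-01 06:00', 'temperature', 1.0),
        ('2020-01-01 14:00', 'temperature', 7.0),
        ('2020-01-02 06:00', 'temperature', 2.0),
        ('2020-01-02 06:00', 'humidity', 80.0);
    insert into table2 values
        ('2020-01-01 06:00', 'temperature', 5.0),
        ('2020-01-01 14:00', 'temperature', 5.0),
        ('2020-01-02 06:00', 'temperature', 3.0);
""")

# SELECT, FROM, JOIN ... ON, WHERE, GROUP BY, in that order.
# Every non-aggregated column in the SELECT list also appears in GROUP BY.
result = conn.execute("""
    select strftime('%m-%d', a.ts) as monthday,
           min(a.data_value) as mintemp,
           max(a.data_value) as maxtemp,
           b.forecast as forecasttemp
      from table1 a
      join table2 b
        on a.ts = b.ts and a.category = b.category
     where a.category = 'temperature'
     group by strftime('%m-%d', a.ts), b.forecast
     order by monthday
""").fetchall()

for row in result:
    print(row)
```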
You are doing it wrong. Use subquery:
SELECT c.*, b.`forecast` as forecastTemp
FROM
(SELECT concat(concat(month(a.`timestamp`),','),day(a.`timestamp`)) as monthDay
, min(a.`data_value`) as minTemp
, max(a.`data_value`) as maxTemp
, a.`timestamp` as date
FROM table1 a
WHERE `category`='temperature'
GROUP BY concat(concat(month(`timestamp`),','),day(`timestamp`))) c
INNER JOIN (SELECT `forecast`
, `timestamp`
FROM table2
WHERE `category` = 'temperature') b
ON c.`timestamp` = b.`timestamp`;
In addition to the order of the components other answers have mentioned (where goes after join etc), you also need to remove all of the double-quote characters. In Oracle, these override the standard naming rules, so "a.category" is only valid if your table actually has a column named, literally, "a.category", e.g.
create table demo ("a.category" varchar2(10));
insert into demo ("a.category") values ('Weird');
select d."a.category" from demo d;
It's quite rare to need to do this.
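The same effect can be reproduced outside Oracle. The sketch below assumes SQLite via Python's sqlite3 module (so `text` replaces `varchar2`) and adds a second, ordinary `category` column for contrast; that extra column is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One column whose name literally contains a dot, plus an ordinary column.
conn.execute('create table demo ("a.category" text, category text)')
conn.execute("""insert into demo ("a.category", category) values ('Weird', 'plain')""")

# Quoted: the whole string a.category is the column name.
quoted = conn.execute('select d."a.category" from demo d').fetchone()[0]
# Unquoted: d is the table alias and category is the column.
unquoted = conn.execute("select d.category from demo d").fetchone()[0]

print(quoted, unquoted)
```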
The query should look something like this:
SELECT to_char(a.timestamp, 'MM-DD') as monthDay,
min(a.data_value) as minTemp,
max(a.data_value) as maxTemp,
b.forecast as forecastTemp
FROM table1 a JOIN
table2 b
ON a.timestamp = b.timestamp and b.category = 'temperature'
WHERE a.category = 'temperature'
GROUP BY to_char(a.timestamp, 'MM-DD'), b.forecast;
I'm not 100% sure this is what you want. Your query has numerous issues and complexities:
You don't need a subquery in the FROM clause.
You can use to_char() instead of the more complex date string processing.
The group by did not contain all the relevant fields.
Don't use double quotes, unless really, really needed.

Multiple Join count doesnt get 0

I've been trying to get data with joins, but the result doesn't include records that have no data in the second or third table.
Here is the query;
SELECT AUDIT_CONFIG.TITLE,AUDIT_CONFIG.AUDITOR_POOL,AUDIT_CONFIG.FREQUENCE,
TO_CHAR(TO_DATE(AUDIT_CONFIG.START_DATE,'yyyymmdd'),'dd/mm/yyyy') AS "START",
AUDIT_CONFIG.AUDIT_ID, TO_CHAR(MAX(AUDIT_DATES.AUDIT_DATE), 'dd/mm/yyyy') AS "FINISH",
TRUNC(MAX(AUDIT_DATES.AUDIT_DATE) - SYSDATE) DAY_TO,
(SELECT COUNT(DISTINCT UNIQ_ID) FROM SENDED_AUDIT) AS SCHEDULED,
(SELECT COUNT(*) FROM AUDIT_RESULTS WHERE PASSORFAIL='P') AS PASS,
(SELECT COUNT(*) FROM AUDIT_RESULTS WHERE PASSORFAIL='F') AS FAIL
FROM AUDIT_CONFIG
RIGHT JOIN AUDIT_DATES ON AUDIT_DATES.AUDIT_ID = AUDIT_CONFIG.AUDIT_ID
RIGHT JOIN SENDED_AUDIT ON SENDED_AUDIT.AUDIT_ID=AUDIT_CONFIG.AUDIT_ID
RIGHT JOIN AUDIT_RESULTS ON AUDIT_RESULTS.AUDIT_ID=AUDIT_CONFIG.AUDIT_ID
GROUP BY AUDIT_CONFIG.TITLE, AUDIT_CONFIG.AUDITOR_POOL, AUDIT_CONFIG.FREQUENCE,
TO_CHAR(TO_DATE(AUDIT_CONFIG.START_DATE, 'yyyymmdd'), 'dd/mm/yyyy'), AUDIT_CONFIG.AUDIT_ID;
And here is an image for understanding the problem (my query returns just the first row):
So any advice for getting the rows with a count of 0? Thanks in advance.
EDIT for Thorsten Kettner:
Solved now :) thank you for your help and time
Your query looks overly complicated
To start with: few people use right outer joins, because we find them less intuitive than left outer joins. It even seems you were confused by the joins and really wanted left joins instead.
Another thing is the count subqueries that are not related to the records in the main query. I don't think this is on purpose, is it?
Then you join sended_audit and audit_results - the same tables you are using in the count subqueries, but you don't use these joined records in your query.
I guess you want:
select
ac.title,
ac.auditor_pool,
ac.frequence,
to_char(to_date(ac.start_date, 'yyyymmdd'), 'dd/mm/yyyy') as "start",
ac.audit_id,
to_char(ad.max_date, 'dd/mm/yyyy') as "finish",
trunc(ad.max_date - sysdate) as day_to,
nvl(sa.scheduled, 0) as scheduled,
nvl(ar.pass, 0) as pass,
nvl(ar.fail, 0) as fail
from audit_config ac
left join
(
select audit_id, max(audit_date) as max_date
from audit_dates
group by audit_id
) ad on ad.audit_id = ac.audit_id
left join
(
select audit_id, count(distinct uniq_id) as scheduled
from sended_audit
group by audit_id
) sa on sa.audit_id = ac.audit_id
left join
(
select
audit_id,
count(case when passorfail = 'P' then 1 end) as pass,
count(case when passorfail = 'F' then 1 end) as fail
from audit_results
group by audit_id
) ar on ar.audit_id = ac.audit_id;
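A reduced version of this pattern (aggregate each child table first, then left join and replace NULL with 0) can be run as follows. It is a sketch that assumes SQLite via Python's sqlite3 module, so `coalesce` stands in for Oracle's `nvl`; the tables are cut down to two columns each and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, invented versions of audit_config and audit_results.
conn.executescript("""
    create table audit_config (audit_id integer, title text);
    create table audit_results (audit_id integer, passorfail text);
    insert into audit_config values (1, 'Safety'), (2, 'Quality');
    insert into audit_results values (1, 'P'), (1, 'P'), (1, 'F');
""")

# Audit 2 has no results at all; the left join keeps it and coalesce
# turns the missing counts into 0.
report = conn.execute("""
    select ac.title,
           coalesce(ar.pass, 0) as pass,
           coalesce(ar.fail, 0) as fail
      from audit_config ac
      left join (
            select audit_id,
                   count(case when passorfail = 'P' then 1 end) as pass,
                   count(case when passorfail = 'F' then 1 end) as fail
              from audit_results
             group by audit_id
           ) ar on ar.audit_id = ac.audit_id
     order by ac.audit_id
""").fetchall()

for row in report:
    print(row)
```

Aggregating inside the derived table before joining keeps one row per audit_id, so joining several child tables does not multiply rows against each other.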

Why do I get ORA-00907 in my SQL query?

I have this SQL query which a partner has done for a little project at university (this is the first time we use SQL), but we get the ora-00907 error and both of us don't know why.
I have checked the parentheses and they seem to be OK, so the problem must be something else.
select
persona.nombre,
anyo,
t2.total
from persona join
(
select
t1.idPersona,
count(produccion.anyo) as total,
anyo
from
(
select *
from produccion
join pelicula
on produccion.id = pelicula.id
) as pel
join
(
select *
from participa
where idPapel = 8
) as t1
on t1.idProduccion = pel.id
)
group by t1.idPersona
) as t2
on persona.id = t2.idPersona
where t2.total > 2
order by t2.total desc;
You are selecting * and grouping by a single column, which creates the problem. Either select only the columns named in the GROUP BY, or remove the GROUP BY.
select *
from (produccion join pelicula on produccion.id=pelicula.id) as pel
join
(select *
from participa
where idPapel=8) as t1
on t1.idProduccion=pel.id)
group by t1.idPersona
The code section above is a disallowed use of GROUP BY.
If the GROUP BY is really needed, I would suggest applying it later, at the end. Another option is to use an analytic function and filter out the unwanted records in the outer level of the query, which you already have.
You have lots of nested views, which makes your query rather hard to debug. You have lots of brackets, which need to match.
Anyway this is wrong: select t1.idPersona, count(produccion.anyo) as total, anyo. You'll need to include anyo in the GROUP BY clause, which will probably change the result set you want.
select persona.nombre,
t2.anyo,
t2.total
from persona join
(select t1.idPersona,
count(pel.anyo) as total,
anyo
from (select *
from produccion
join pelicula
on produccion.id=pelicula.id) pel
join
(select *
from participa
where idPapel=8) t1
on t1.idProduccion=pel.id
group by t1.idPersona, pel.anyo) t2
on persona.id=t2.idPersona
where t2.total>2
order by t2.total desc;
I think your query can be simplified/corrected like this:
select persona.nombre,
anyo,
t2.total
from persona
join (
select par.idPersona,
count(produccion.anyo) as total,
anyo
from produccion
join pelicula
on produccion.id = pelicula.id
left join participa par
on par.idProduccion = pelicula.id -- or produccion.id,
-- this was also an error in the original query,
-- since the subquery selected both
and par.idPapel = 8
group by par.idPersona
, anyo -- Was missing, but it also doesn't make sense, as this is what you count, so you'll just get 1's here. What do you want with this?
) t2 -- no AS here: Oracle does not accept AS before a table alias
on persona.id = t2.idPersona
where t2.total > 2
order by t2.total desc;

how to handle subquery returning more than one value error?

Is there any other way to write this query so that it won't get the error?
select sum(Travelled_value)
from travel_table
where customer_id=(select distinct f.CUSTOMER_ID as agg
from SEGMENT_table f
JOIN bookin_table t
ON f.CUSTOMER_ID=t.CUSTOMER_ID
where t.booking_date BETWEEN sysdate
AND sysdate+21 and f.type='NEW';)
Here, the three tables have customer_id in common.
I don't know if this will work, but it fixes many problems:
select sum(tt.Travelled_value)
from travel_table tt
where tt.customer_id in (select f.CUSTOMER_ID
from SEGMENT_table f JOIN
booking_table t
ON f.CUSTOMER_ID = t.CUSTOMER_ID
where t.booking_date between sysdate and sysdate+21 and
f.type = 'NEW'
);
Notes:
You have a semicolon in the middle of the query. It goes at the end.
select distinct is not needed in an in subquery.
You are using sysdate and comparing it to a date. Are you sure you don't want trunc(sysdate)? sysdate has a time component.
SELECT SUM(Travelled_value)
FROM travel_table
WHERE customer_id in
(SELECT f.CUSTOMER_ID
FROM SEGMENT_table f
JOIN bookin_table t
ON f.CUSTOMER_ID=t.CUSTOMER_ID
WHERE t.booking_date BETWEEN trunc(sysdate) AND trunc(sysdate+21)
AND f.type='NEW'
);
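The `IN (subquery)` form from the answers can be exercised on toy data. This is a sketch under several assumptions: SQLite via Python's sqlite3 module instead of Oracle, a hypothetical integer column `booking_day` with the range 0 to 21 standing in for `sysdate` to `sysdate+21`, and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented rows; booking_day is a hypothetical integer "days from today".
conn.executescript("""
    create table travel_table (customer_id integer, travelled_value integer);
    create table segment_table (customer_id integer, type text);
    create table booking_table (customer_id integer, booking_day integer);
    insert into travel_table values (1, 100), (2, 50), (3, 999);
    insert into segment_table values (1, 'NEW'), (2, 'NEW'), (3, 'OLD');
    insert into booking_table values (1, 3), (1, 10), (2, 5), (3, 4);
""")

query = """
    select sum(tt.travelled_value)
      from travel_table tt
     where tt.customer_id in (select {distinct} f.customer_id
                                from segment_table f
                                join booking_table t
                                  on f.customer_id = t.customer_id
                               where t.booking_day between 0 and 21
                                 and f.type = 'NEW')
"""

# The subquery yields customer 1 twice (two bookings) and customer 2 once.
# IN only tests membership, so the duplicate does not inflate the sum.
total = conn.execute(query.format(distinct="")).fetchone()[0]
total_distinct = conn.execute(query.format(distinct="distinct")).fetchone()[0]

print(total, total_distinct)
```

Both totals come out the same, which matches the note that `select distinct` is not needed inside an `in` subquery.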

access - row_number function?

I had this query, which gives me the desired results on postgres
SELECT
t.*,
ROW_NUMBER() OVER (PARTITION BY t."Internal_reference", t."Movement_date" ORDER BY t."Movement_date") AS "cnt"
FROM (SELECT
"Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
INNER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime
AND t."Internal_reference" = r."Internal_reference"
The issue is that I have to translate the query above to Access, where analytic functions do not exist.
I used this answer to build the query below
SELECT
t."Internal_reference",
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
COUNT(*) AS "cnt"
FROM (
SELECT "Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
LEFT OUTER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND t."Internal_reference" = r."Internal_reference"
GROUP BY
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
t."Internal_reference"
ORDER BY t.from_code
Issue is I only have 1 in the cnt column.
I tried to tweak it by removing the internal_reference (see below)
SELECT
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
COUNT(*) AS "cnt"
FROM (
SELECT "Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
LEFT OUTER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND t."Internal_reference" = r."Internal_reference"
GROUP BY
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date"
ORDER BY t.from_code
However, the results are even worse. The cnt is growing but it gives me the wrong cnt
Any help is more than welcome, as I'm slowly losing my sanity.
Thanks
Edit: Please find the sqlfiddle
I think Gordon-Linoff's code is close to what you want, but there are some typos I couldn't correct without a rewrite, so here's my attempt
SELECT
t1.Internal_reference,
t1.Movement_date,
t1.PO_Number as Combination_Of_Columns_Which_Make_This_Unique,
t1.Other_columns,
Count(1) AS Cnt
FROM
([LO-D4_Movements] AS t1
INNER JOIN [LO-D4_Movements] AS t2 ON
t1.Internal_reference = t2.Internal_reference AND
t1.Movement_date = t2.Movement_date)
INNER JOIN (
SELECT
t3.Internal_reference,
MAX(t3.Movement_date) AS Maxtime
FROM
[LO-D4_Movements] AS t3
GROUP BY
t3.Internal_reference
) AS r ON
t1.Internal_reference = r.Internal_reference AND
t1.Movement_date = r.Maxtime
WHERE
t1.PO_Number>=t2.PO_Number
GROUP BY
t1.Internal_reference,
t1.Movement_date,t1.PO_Number,
t1.Other_columns
ORDER BY
t1.Internal_reference,
t1.Movement_date,
Count(1);
In addition to within the max(movement_date) subquery, the main table is brought in twice. One version is the one for showing in your results, the other is for counting records to generate the sequence numbers.
Gordon said you need a unique id column for each row. And that's true if by "column" you mean to include derived columns also. Also it only needs to be unique within any combination of "internal_reference" and "Movement_date".
I've assumed, perhaps wrongly, that PO_Number will suffice. If not, concatenate it with other fields (and some delimiters) that will make it unique. The where clause will need updating to compare t1 and t2 for the "Combination of Columns which make this unique".
If there is no appropriate combination available, I'm not sure it can be done without VBA and/or temp tables, as The-Gambill suggested.
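The counting self-join described above can be tried on a few rows. This is a sketch only: it assumes SQLite via Python's sqlite3 module rather than Access, a simplified hypothetical table named `movements` with three columns, invented rows, and PO_Number being unique within each Internal_reference and Movement_date pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, invented stand-in for the LO-D4_Movements table.
conn.executescript("""
    create table movements (internal_reference text, movement_date text, po_number integer);
    insert into movements values
        ('A', '2020-01-01', 1),
        ('A', '2020-01-05', 2),
        ('A', '2020-01-05', 3),
        ('B', '2020-02-01', 7);
""")

# t1 is the row being displayed; t2 supplies the rows in the same group whose
# po_number is not larger, so count(*) is the row's position within the group.
# r restricts the output to the latest movement_date per internal_reference.
numbered = conn.execute("""
    select t1.internal_reference, t1.movement_date, t1.po_number, count(*) as cnt
      from movements t1
      join movements t2
        on t1.internal_reference = t2.internal_reference
       and t1.movement_date = t2.movement_date
      join (select internal_reference, max(movement_date) as maxtime
              from movements
             group by internal_reference) r
        on t1.internal_reference = r.internal_reference
       and t1.movement_date = r.maxtime
     where t1.po_number >= t2.po_number
     group by t1.internal_reference, t1.movement_date, t1.po_number
     order by t1.internal_reference, cnt
""").fetchall()

for row in numbered:
    print(row)
```

If two rows in the same group shared a po_number, both would receive the same count, which is why the tie-breaking column has to be unique within the group.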
This is a real pain in MS Access, as far as I know. One method is a correlated subquery, but you need a unique id column on each row:
SELECT t.*,
(SELECT COUNT(*)
FROM (SELECT "Internal_reference", MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference"
) as t2
WHERE t2."Internal_reference" = t."Internal_reference" AND
t2."Movement_date" = t."Movement_date" AND
t2.?? <= t.??
) as cnt
FROM (SELECT "Internal_reference", MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference"
) r INNER JOIN
dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND
t."Internal_reference" = r."Internal_reference";
The ?? is for the id or creation date or something to allow the counting of rows.