Multiple join count doesn't return 0 - sql

I've been trying to get data with joins. The problem is that the result doesn't include records that have no data in the second or third table.
Here is the query:
SELECT AUDIT_CONFIG.TITLE,AUDIT_CONFIG.AUDITOR_POOL,AUDIT_CONFIG.FREQUENCE,
TO_CHAR(TO_DATE(AUDIT_CONFIG.START_DATE,'yyyymmdd'),'dd/mm/yyyy') AS "START",
AUDIT_CONFIG.AUDIT_ID, TO_CHAR(MAX(AUDIT_DATES.AUDIT_DATE), 'dd/mm/yyyy') AS "FINISH",
TRUNC(MAX(AUDIT_DATES.AUDIT_DATE) - SYSDATE) DAY_TO,
(SELECT COUNT(DISTINCT UNIQ_ID) FROM SENDED_AUDIT) AS SCHEDULED,
(SELECT COUNT(*) FROM AUDIT_RESULTS WHERE PASSORFAIL='P') AS PASS,
(SELECT COUNT(*) FROM AUDIT_RESULTS WHERE PASSORFAIL='F') AS FAIL
FROM AUDIT_CONFIG
RIGHT JOIN AUDIT_DATES ON AUDIT_DATES.AUDIT_ID = AUDIT_CONFIG.AUDIT_ID
RIGHT JOIN SENDED_AUDIT ON SENDED_AUDIT.AUDIT_ID=AUDIT_CONFIG.AUDIT_ID
RIGHT JOIN AUDIT_RESULTS ON AUDIT_RESULTS.AUDIT_ID=AUDIT_CONFIG.AUDIT_ID
GROUP BY AUDIT_CONFIG.TITLE, AUDIT_CONFIG.AUDITOR_POOL, AUDIT_CONFIG.FREQUENCE,
TO_CHAR(TO_DATE(AUDIT_CONFIG.START_DATE, 'yyyymmdd'), 'dd/mm/yyyy'), AUDIT_CONFIG.AUDIT_ID;
And here is an image illustrating the problem (my query returns just the first row):
So, any advice for getting the rows with 0 counts? Thanks in advance.
EDIT for Thorsten Kettner:
Solved now :) thank you for your help and time

Your query looks overly complicated.
To start with: few people use right outer joins, because we find them less intuitive than left outer joins. It even seems you were confused by the joins and really wanted left joins instead.
Another thing is the count subqueries that are not related to the records in the main query. I don't think this is on purpose, is it?
Then you join sended_audit and audit_results - the same tables you are using in the count subqueries, but you don't use these joined records in your query.
I guess you want:
select
  ac.title,
  ac.auditor_pool,
  ac.frequence,
  to_char(to_date(ac.start_date, 'yyyymmdd'), 'dd/mm/yyyy') as "start",
  ac.audit_id,
  to_char(ad.max_date, 'dd/mm/yyyy') as "finish",
  trunc(ad.max_date - sysdate) as day_to,
  -- nvl turns the NULLs produced by the outer joins into 0
  nvl(sa.scheduled, 0) as scheduled,
  nvl(ar.pass, 0) as pass,
  nvl(ar.fail, 0) as fail
from audit_config ac
left join
(
  -- latest audit date per audit
  select audit_id, max(audit_date) as max_date
  from audit_dates
  group by audit_id
) ad on ad.audit_id = ac.audit_id
left join
(
  -- number of distinct scheduled audits per audit
  select audit_id, count(distinct uniq_id) as scheduled
  from sended_audit
  group by audit_id
) sa on sa.audit_id = ac.audit_id
left join
(
  -- pass/fail counts per audit; 'P' and 'F' are uppercase as in the question
  select
    audit_id,
    count(case when passorfail = 'P' then 1 end) as pass,
    count(case when passorfail = 'F' then 1 end) as fail
  from audit_results
  group by audit_id
) ar on ar.audit_id = ac.audit_id;
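To see why aggregating each table first and then left joining keeps the audits that have no rows elsewhere, here is a minimal sketch that runs the same pattern in SQLite through Python's sqlite3 module. It is not the Oracle query itself: the sample rows are invented, only three of the tables are modelled, COALESCE stands in for NVL, and the aliases passed and failed are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: audit 2 has no rows in the other tables.
conn.executescript("""
    CREATE TABLE audit_config (audit_id INTEGER, title TEXT);
    CREATE TABLE sended_audit (audit_id INTEGER, uniq_id INTEGER);
    CREATE TABLE audit_results (audit_id INTEGER, passorfail TEXT);
    INSERT INTO audit_config VALUES (1, 'Safety'), (2, 'Quality');
    INSERT INTO sended_audit VALUES (1, 10), (1, 10), (1, 11);
    INSERT INTO audit_results VALUES (1, 'P'), (1, 'P'), (1, 'F');
""")

rows = conn.execute("""
    SELECT ac.title,
           COALESCE(sa.scheduled, 0) AS scheduled,
           COALESCE(ar.passed, 0)    AS passed,
           COALESCE(ar.failed, 0)    AS failed
    FROM audit_config ac
    LEFT JOIN (SELECT audit_id, COUNT(DISTINCT uniq_id) AS scheduled
               FROM sended_audit
               GROUP BY audit_id) sa ON sa.audit_id = ac.audit_id
    LEFT JOIN (SELECT audit_id,
                      COUNT(CASE WHEN passorfail = 'P' THEN 1 END) AS passed,
                      COUNT(CASE WHEN passorfail = 'F' THEN 1 END) AS failed
               FROM audit_results
               GROUP BY audit_id) ar ON ar.audit_id = ac.audit_id
    ORDER BY ac.audit_id
""").fetchall()

print(rows)  # [('Safety', 2, 2, 1), ('Quality', 0, 0, 0)]
```

Because every derived table returns at most one row per audit_id, the joins cannot multiply rows, and the audit without any results still appears with zeros.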

Related

(probably) very simple SQL query needed

Having a slow day....could use some assistance writing a simple ANSI SQL query.
I have a list of individuals within families (first and last names), and a second table which lists a subset of those individuals. I would like to create a third table which flags every individual within a family if ANY of the individuals are not listed in the second table. The goal is essentially to flag "incomplete" families.
Below is an example of the two input tables, and the desired third table.
As I said...very simple...having a slow day. Thanks!
I think you want a left join and case expression:
select t1.*,
(case when t2.first_name is null then 'INCOMPLETE' else 'OK' end) as flag
from table1 t1 left join
table2 t2
on t1.first_name = t2.first_name and t1.last_name = t2.last_name;
Of course, this marks "Diane Thomson" as "OK", but I think that is an error in the question.
EDIT:
Oh, I see. The last name defines the family (that seems like a pretty big assumption). But you can do this with window functions:
select t1.*,
(case when count(t2.first_name) over (partition by t1.last_name) =
count(*) over (partition by t1.last_name)
then 'OK'
else 'INCOMPLETE'
end) as flag
from table1 t1 left join
table2 t2
on t1.first_name = t2.first_name and t1.last_name = t2.last_name;
That's not simple, at least not in SAS :-)
Standard SQL, when Windowed Aggregates are supported:
select ft.*,
-- counts differ when st.first_name is null due to the outer join
case when count(*) over (partition by ft.last_name)
= count(st.first_name) over (partition by ft.last_name)
then 'OK'
else 'INCOMPLETE'
end
from first_table as ft
left join second_table as st
on ft.first_name = st.first_name
and ft.last_name = st.last_name
Otherwise you need to use a standard aggregate and join back:
select ft.*, st.flag
from first_table as ft
join
(
select ft.last_name,
case when count(*)
= count(st.first_name)
then 'OK'
else 'INCOMPLETE'
end as flag
from first_table as ft
left join second_table as st
on ft.first_name = st.first_name
and ft.last_name = st.last_name
group by ft.last_name
) as st
on ft.last_name = st.last_name
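The aggregate-and-join-back variant can be tried out with a small sketch like the one below, which runs it in SQLite via Python's sqlite3 module. The names in the sample rows are made up (the question's tables were only shown as an image), and the inner aliases f, s and fam are chosen here to avoid reusing st for two different things.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: 'Dee Jones' is missing from second_table,
# so the whole Jones family should be flagged.
conn.executescript("""
    CREATE TABLE first_table (first_name TEXT, last_name TEXT);
    CREATE TABLE second_table (first_name TEXT, last_name TEXT);
    INSERT INTO first_table VALUES
        ('Ann', 'Smith'), ('Bob', 'Smith'), ('Cid', 'Jones'), ('Dee', 'Jones');
    INSERT INTO second_table VALUES
        ('Ann', 'Smith'), ('Bob', 'Smith'), ('Cid', 'Jones');
""")

flags = conn.execute("""
    SELECT ft.first_name, ft.last_name, fam.flag
    FROM first_table AS ft
    JOIN (
        -- one row per family: COUNT(*) counts all members,
        -- COUNT(s.first_name) only those found in second_table
        SELECT f.last_name,
               CASE WHEN COUNT(*) = COUNT(s.first_name)
                    THEN 'OK' ELSE 'INCOMPLETE' END AS flag
        FROM first_table AS f
        LEFT JOIN second_table AS s
          ON f.first_name = s.first_name AND f.last_name = s.last_name
        GROUP BY f.last_name
    ) AS fam ON ft.last_name = fam.last_name
    ORDER BY ft.last_name, ft.first_name
""").fetchall()

for row in flags:
    print(row)
```

Both Jones rows come back as INCOMPLETE and both Smith rows as OK, which is the per-family flag the question asks for.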
It is pretty easy to do in SAS if you want to take advantage of its non-ANSI SQL feature of automatically re-merging aggregate function results back onto detail records.
select
a.first
, a.last
, case when 1=max(missing(b.last)) then 'INCOMPLETE'
else 'OK'
end as flag
from table1 a left join table2 b
on a.last=b.last and a.first=b.first
group by 2
order by 2,1
;

When can a subquery behind SELECT not be removed?

Correlated subqueries are considered to be a bad habit. I believe that any SQL command with a subquery between SELECT and FROM (let's call it a SELECT subquery) can be rewritten into one without any. For example, a query like this
select *,
(
select sum(t2.sales)
from your_table t2
where t2.dates
between t1.dates - interval '3' day and
t1.dates and
t2.id = t1.id
) running_sales
from your_table t1
demo
can be rewritten into the following one
select dd.id, dd.dates, dd.sales, sum(d.sales) running_sales
from your_table dd
join your_table d on d.dates
between (dd.dates - interval '3' day) and
dd.dates and
dd.id = d.id
group by dd.id, dd.dates, dd.sales
demo
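As a quick check of this first rewrite, the following sketch runs both forms in SQLite through Python's sqlite3 module and compares the results. It makes several assumptions that are not in the question: the dates column is replaced by an integer column named day so that no interval arithmetic is needed, the sample rows are invented, and each (id, day) pair is unique. That last assumption matters, because the GROUP BY in the join form would merge duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; 'day' is an integer stand-in for the 'dates' column.
conn.executescript("""
    CREATE TABLE your_table (id INTEGER, day INTEGER, sales INTEGER);
    INSERT INTO your_table VALUES
        (1, 1, 10), (1, 2, 20), (1, 6, 5), (2, 1, 7), (2, 3, 1);
""")

# Form 1: correlated subquery in the SELECT list.
with_subquery = conn.execute("""
    SELECT t1.id, t1.day, t1.sales,
           (SELECT SUM(t2.sales)
            FROM your_table t2
            WHERE t2.day BETWEEN t1.day - 3 AND t1.day
              AND t2.id = t1.id) AS running_sales
    FROM your_table t1
    ORDER BY t1.id, t1.day
""").fetchall()

# Form 2: self join plus GROUP BY.
with_join = conn.execute("""
    SELECT dd.id, dd.day, dd.sales, SUM(d.sales) AS running_sales
    FROM your_table dd
    JOIN your_table d
      ON d.day BETWEEN dd.day - 3 AND dd.day
     AND d.id = dd.id
    GROUP BY dd.id, dd.day, dd.sales
    ORDER BY dd.id, dd.day
""").fetchall()

print(with_subquery)
print(with_subquery == with_join)  # True for this data
```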
Problems may occur when there is more than one SELECT subquery. However, even in those cases it is possible to rewrite them as subqueries behind FROM and then perform a LEFT JOIN. In that spirit, this query
select *,
(
select sum(sales)
from dat dd
where dd.dates
between (d.dates - interval '3' day) and d.dates and
dd.id = d.id
) running_sales,
(
select sum(sales)
from dat dd
where dd.id = d.id
) total_sales
from dat d
demo
can be rewritten into the following one
select d.*,
t_running.running_sales,
t_total.total_sales
from dat d
left join (
select dd.id, dd.dates, sum(d.sales) running_sales
from dat dd
join dat d on d.dates
between (dd.dates - interval '3' day) and
dd.dates and
dd.id = d.id
group by dd.id, dd.dates
) t_running on d.id = t_running.id and d.dates = t_running.dates
left join (
select d.id, sum(d.sales) total_sales
from dat d
group by d.id
) t_total on t_total.id = d.id
demo
Could you please provide an example where it is not possible to get rid of the SELECT subquery? Please be so kind as to add a working example link (e.g. dbfiddle or sqlfiddle) to make the potential discussion easier, thanks!
If the question is for a multiple-choice test (or something like that) :), it is not possible to get rid of the subquery of an EXISTS clause.
Another similar answer is IN (subquery), used at a different level of aggregation to avoid a Cartesian product.
(A side comment, by the way: correlated subqueries are not always considered a bad habit; it depends on optimization, structure, etc.
The WITH clause is another way of using subqueries, and it's very practical for complex queries.)

how to handle subquery returning more than one value error?

Is there any other way to write this query so that it won't raise the error?
select sum(Travelled_value)
from travel_table
where customer_id=(select distinct f.CUSTOMER_ID as agg
from SEGMENT_table f
JOIN bookin_table t
ON f.CUSTOMER_ID=t.CUSTOMER_ID
where t.booking_date BETWEEN sysdate
AND sysdate+21 and f.type='NEW';)
Here, the three tables have customer_id in common.
I don't know if this will work, but it fixes many problems:
select sum(tt.Travelled_value)
from travel_table tt
where tt.customer_id in (select f.CUSTOMER_ID
from SEGMENT_table f JOIN
booking_table t
ON f.CUSTOMER_ID = t.CUSTOMER_ID
where t.booking_date between sysdate and sysdate+21 and
f.type = 'NEW'
);
Notes:
You have a semicolon in the middle of the query. It goes at the end.
select distinct is not needed in an in subquery.
You are using sysdate and comparing it to a date. Are you sure you don't want trunc(sysdate)? sysdate has a time component.
SELECT SUM(Travelled_value)
FROM travel_table
WHERE customer_id in
(SELECT f.CUSTOMER_ID
FROM SEGMENT_table f
JOIN bookin_table t
ON f.CUSTOMER_ID=t.CUSTOMER_ID
WHERE t.booking_date BETWEEN trunc(sysdate) AND trunc(sysdate+21)
AND f.type='NEW'
);
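The effect of switching from = to IN can be sketched in SQLite through Python's sqlite3 module. Everything in the sample data is hypothetical, and the date range is replaced by an integer column days_ahead (an assumption made only for this illustration) so the example does not depend on the current date. Note that SQLite does not raise Oracle's "single-row subquery returns more than one row" error, so the sketch only shows what the IN form computes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: customers 1 and 2 are 'NEW' with bookings in range,
# and customer 1 has two bookings.
conn.executescript("""
    CREATE TABLE travel_table (customer_id INTEGER, travelled_value INTEGER);
    CREATE TABLE segment_table (customer_id INTEGER, type TEXT);
    CREATE TABLE booking_table (customer_id INTEGER, days_ahead INTEGER);
    INSERT INTO travel_table VALUES (1, 100), (1, 50), (2, 30), (3, 999);
    INSERT INTO segment_table VALUES (1, 'NEW'), (2, 'NEW'), (3, 'OLD');
    INSERT INTO booking_table VALUES (1, 5), (1, 10), (2, 20), (3, 2);
""")

SUBQUERY = """
    SELECT f.customer_id
    FROM segment_table f
    JOIN booking_table t ON f.customer_id = t.customer_id
    WHERE t.days_ahead BETWEEN 0 AND 21
      AND f.type = 'NEW'
"""

# The subquery yields several rows (with a duplicate), so '=' cannot be used.
matching = conn.execute(SUBQUERY + " ORDER BY f.customer_id").fetchall()

# IN accepts any number of rows; duplicates do not double-count the sum.
total = conn.execute(
    "SELECT SUM(tt.travelled_value) FROM travel_table tt "
    "WHERE tt.customer_id IN (" + SUBQUERY + ")"
).fetchone()[0]

print(matching)  # [(1,), (1,), (2,)]
print(total)     # 180
```

The duplicate customer 1 in the subquery result does not change the total, which is why select distinct is unnecessary inside IN.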

Multiple left joins with aggregation on same table causes huge performance hit in SAP HANA

I am joining two tables on HANA and, to get some statistics, I am LEFT joining the items table 3 times to get a total count, number of entries processed and number of errors, as shown below.
This is a dev system and the items table has only 1500 items. But the query below runs for 17 seconds.
When I remove any of the three aggregation terms (but leave the corresponding JOIN in place), the query executes almost immediately.
I have also tried adding indexes on the fields used in the specific JOINs, but that makes no difference.
select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by,
count( distinct rp.guid ),
count( distinct rp2.guid ),
count( distinct rp3.guid )
from zbsbpi_rk as rk
left join zbsbpi_rp as rp
on rp.header = rk.guid
left join zbsbpi_rp as rp2
on rp2.header = rk.guid
and rp2.processed = 'X'
left join zbsbpi_rp as rp3
on rp3.header = rk.guid
and rp3.result_status = 'E'
where rk.run_id = '0000000010'
group by rk.guid, run_id, status, created_at, created_by
I think you can rewrite your query to improve the performance:
select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by,
count( distinct rp.guid ),
count( distinct (CASE WHEN rp.processed = 'X' then rp.guid else null end) ),
count( distinct (CASE WHEN rp.result_status = 'E' then rp.guid else null end))
from zbsbpi_rk as rk
left join zbsbpi_rp as rp
on rp.header = rk.guid
where rk.run_id = '0000000010'
group by rk.guid, run_id, status, created_at, created_by
I'm not entirely sure whether the count distinct case construct will work on HANA, but you may try it.
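Whether HANA accepts the construct is left open above, but the equivalence of the two formulations can at least be checked in SQLite through Python's sqlite3 module. This is a sketch under assumptions: the rows are invented, only the columns needed for the counts are modelled, and nothing here says anything about HANA's optimizer. It also counts the intermediate rows of the three-join form, since joining the same item table three times multiplies the rows per header, which may be part of the slowdown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: header H1 has three items, header H2 has one.
conn.executescript("""
    CREATE TABLE zbsbpi_rk (guid TEXT, run_id TEXT);
    CREATE TABLE zbsbpi_rp (guid TEXT, header TEXT,
                            processed TEXT, result_status TEXT);
    INSERT INTO zbsbpi_rk VALUES ('H1', '10'), ('H2', '10');
    INSERT INTO zbsbpi_rp VALUES
        ('A', 'H1', 'X', 'S'),
        ('B', 'H1', 'X', 'E'),
        ('C', 'H1', '',  ''),
        ('D', 'H2', '',  'E');
""")

THREE_JOINS = """
    FROM zbsbpi_rk AS rk
    LEFT JOIN zbsbpi_rp AS rp  ON rp.header = rk.guid
    LEFT JOIN zbsbpi_rp AS rp2 ON rp2.header = rk.guid AND rp2.processed = 'X'
    LEFT JOIN zbsbpi_rp AS rp3 ON rp3.header = rk.guid AND rp3.result_status = 'E'
    WHERE rk.run_id = '10'
"""

three_joins = conn.execute(
    "SELECT rk.guid, COUNT(DISTINCT rp.guid), COUNT(DISTINCT rp2.guid), "
    "COUNT(DISTINCT rp3.guid) " + THREE_JOINS + " GROUP BY rk.guid ORDER BY rk.guid"
).fetchall()

# Rows produced before grouping: the joins multiply per header.
fanout = conn.execute("SELECT COUNT(*) " + THREE_JOINS).fetchone()[0]

one_join = conn.execute("""
    SELECT rk.guid,
           COUNT(DISTINCT rp.guid),
           COUNT(DISTINCT CASE WHEN rp.processed = 'X' THEN rp.guid END),
           COUNT(DISTINCT CASE WHEN rp.result_status = 'E' THEN rp.guid END)
    FROM zbsbpi_rk AS rk
    LEFT JOIN zbsbpi_rp AS rp ON rp.header = rk.guid
    WHERE rk.run_id = '10'
    GROUP BY rk.guid
    ORDER BY rk.guid
""").fetchall()

print(three_joins)  # [('H1', 3, 2, 1), ('H2', 1, 0, 1)]
print(one_join)     # same counts from a single join
print(fanout)       # 7 intermediate rows for only 4 items
```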
My apologies, but I forgot that I had posted this question here. I had posted the same question at answers.sap.com after not getting any joy here: https://answers.sap.com/questions/172096/multiple-left-joins-with-aggregation-on-same-table.html
I eventually came up with the solution, which was a bit of a "doh!" moment:
select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by,
count( distinct rp.guid ),
count( distinct rp2.guid ),
count( distinct rp3.guid )
from zbsbpi_rk as rk
join zbsbpi_rp as rp
on rp.header = rk.guid
left join zbsbpi_rp as rp2
on rp2.guid = rp.guid
and rp2.processed = 'X'
left join zbsbpi_rp as rp3
on rp3.guid = rp.guid
and rp3.result_status = 'E'
where rk.run_id = '0000000010'
group by rk.guid, run_id, status, created_at, created_by
The subsequent left joins needed only to be joined to the first join on the same table, as the first join contained a superset of all the records anyway.

Oracle/SQL: How can I show dates and counts even when count is zero?

So, I need to query a table and show counts per day even if that count is zero. I tried something like the below, but it doesn't show the days that have a count of zero. Anyone have any other ideas? Using Oracle, BTW. Many thanks!!
with the_dates as (
select to_date('080114','MMDDYY') + level - 1 as the_date
from dual
connect by level <= to_date('011716', 'MMDDYY')
- to_date('080114', 'MMDDYY') + 1
)
select distinct trunc(a.the_date), count(*)
from the_dates a
left outer join TableFoo f on a.the_date = to_date(admit_date, 'MMDDYYYY')
where f.customer_num = 10
group by trunc(a.the_date)
order by trunc(a.the_date);
The problem is that your where clause turns the left join into an inner join. So:
with . . .
select trunc(a.the_date), count(f.customer_num)
from the_dates a left outer join
TableFoo f
on a.the_date = to_date(admit_date, 'MMDDYYYY') and
f.customer_num = 10
group by trunc(a.the_date)
order by trunc(a.the_date);
Also, select distinct is almost never needed when using group by (and certainly not in this case).
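The difference between filtering in WHERE and filtering in ON can be reproduced with a small sketch in SQLite via Python's sqlite3 module. The three dates and the table_foo rows are invented for this illustration, and dates are stored as ISO text so that no to_date conversion is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: customer 10 has admissions only on 2014-08-01.
conn.executescript("""
    CREATE TABLE the_dates (the_date TEXT);
    CREATE TABLE table_foo (admit_date TEXT, customer_num INTEGER);
    INSERT INTO the_dates VALUES ('2014-08-01'), ('2014-08-02'), ('2014-08-03');
    INSERT INTO table_foo VALUES
        ('2014-08-01', 10), ('2014-08-01', 10), ('2014-08-02', 99);
""")

# Filter in WHERE: rows where f.customer_num is NULL (no match) are dropped,
# so the outer join behaves like an inner join.
filter_in_where = conn.execute("""
    SELECT a.the_date, COUNT(*)
    FROM the_dates a
    LEFT JOIN table_foo f ON a.the_date = f.admit_date
    WHERE f.customer_num = 10
    GROUP BY a.the_date
    ORDER BY a.the_date
""").fetchall()

# Filter in ON: every date survives, and COUNT(f.customer_num)
# ignores the NULLs of unmatched dates, giving 0.
filter_in_on = conn.execute("""
    SELECT a.the_date, COUNT(f.customer_num)
    FROM the_dates a
    LEFT JOIN table_foo f
      ON a.the_date = f.admit_date
     AND f.customer_num = 10
    GROUP BY a.the_date
    ORDER BY a.the_date
""").fetchall()

print(filter_in_where)  # [('2014-08-01', 2)]
print(filter_in_on)     # [('2014-08-01', 2), ('2014-08-02', 0), ('2014-08-03', 0)]
```

Counting a column from the outer-joined table rather than using COUNT(*) is what turns the unmatched days into 0 instead of 1.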