Optimizing a SELECT with sub SELECT query in Oracle - sql

Select id,
(Select sum(totalpay)
from Table2 t
where t.id = a.id
and t.transamt > 0
and t.paydt BETWEEN TRUNC(sysdate-0-7) and TRUNC(sysdate-0-1)) As Pay
from Table1 a
In spite of having indexes on transamt, paydt and id, the cost of the sub-query on Table2 is very high and it requires a full table scan.
Can this sub-query be optimized in any other way?
Please help.

Select t.id,
sum(totalpay) as Pay
from Table2 t join Table1
on t.id = Table1.id
where t.transamt > 0
and t.paydt BETWEEN TRUNC(sysdate-0-7) and TRUNC(sysdate-0-1)
group by t.id

Try this:
Select a.id,
pay.totalpay
from Table1 a
join (Select t.id, sum(totalpay) totalpay
from Table2 t
where t.transamt > 0
and t.paydt BETWEEN TRUNC(sysdate-0-7) and TRUNC(sysdate-0-1)
group by t.id
) pay
on a.id = pay.id
Push the GROUP BY on the joining column (id in this example) into the subquery, so that the totals are calculated once for all ids in Table2, and then join the result with Table1.
In the original query the result is calculated separately for every row of Table1, reading the full Table2 table each time.
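To make the difference concrete, here is a minimal sketch that runs both shapes of the query against an in-memory SQLite database from Python. SQLite is only a stand-in for Oracle here, the sample rows are made up, and the paydt filter is left out for brevity, so treat it as an illustration of the two query shapes rather than a benchmark.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE Table2 (id INTEGER, totalpay REAL, transamt REAL);
    INSERT INTO Table1 VALUES (1), (2), (3);
    INSERT INTO Table2 VALUES (1, 10, 5), (1, 20, 5), (2, 7, 0), (2, 3, 1);
""")

# Correlated form: conceptually the subquery runs once per Table1 row.
correlated = conn.execute("""
    SELECT a.id,
           (SELECT SUM(t.totalpay) FROM Table2 t
             WHERE t.id = a.id AND t.transamt > 0) AS pay
      FROM Table1 a
     ORDER BY a.id
""").fetchall()

# Pre-aggregated form: Table2 is grouped once, then joined to Table1.
joined = conn.execute("""
    SELECT a.id, p.totalpay
      FROM Table1 a
      JOIN (SELECT t.id, SUM(t.totalpay) AS totalpay
              FROM Table2 t
             WHERE t.transamt > 0
             GROUP BY t.id) p
        ON a.id = p.id
     ORDER BY a.id
""").fetchall()

print(correlated)
print(joined)
```

One thing to watch: the inner join drops Table1 rows that have no qualifying payments (id 3 in this sample), whereas the correlated form returns them with a NULL. A LEFT JOIN keeps them if that matters.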

Related

count duplicates in sql for 1 column

How can I fill out "category" in table 1 when there are multiple matching rows in table 2? I would like to fill it in as 'Multiple' if there are multiple categories per object.
Attempt:
select t1.*,
case when count(t2.category)>1 then 'Multiple' else cast(t2.category as varchar) end as category
from table1 t1
left join table2 t2 on t2.object=t1.object
Issue: it's asking for a GROUP BY, and I have over 80 columns. Is there any way to avoid the GROUP BY?
Create an expression for the value using a subquery:
select
t1.*,
case (select count(*) from table2 t2 where t2.object = t1.object)
when 0 then 'None'
when 1 then (select cast(max(t2.category) as varchar) from table2 t2 where t2.object = t1.object)
else 'Multiple'
end as category
from table1 t1
This avoids listing all the columns, either in the group by clause (the aggregation approach) or in the select clause (a query over a subquery).
I added None as a result because you used a left join, indicating that joining rows are optional.
This wouldn't perform well with large numbers of rows, but should be fine with modest table sizes.
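As a quick check of the subquery approach, the sketch below runs it against an in-memory SQLite database from Python. The sample objects and categories are invented, and the cast is dropped because the hypothetical category column is already text.

```python
import sqlite3

# Hypothetical sample data: 'a' has one category, 'b' has two, 'c' has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (object TEXT PRIMARY KEY);
    CREATE TABLE table2 (object TEXT, category TEXT);
    INSERT INTO table1 VALUES ('a'), ('b'), ('c');
    INSERT INTO table2 VALUES ('a', 'red'), ('b', 'red'), ('b', 'blue');
""")

# No GROUP BY is needed because each count is its own correlated subquery.
rows = conn.execute("""
    SELECT t1.*,
           CASE (SELECT COUNT(*) FROM table2 t2 WHERE t2.object = t1.object)
               WHEN 0 THEN 'None'
               WHEN 1 THEN (SELECT MAX(t2.category) FROM table2 t2
                             WHERE t2.object = t1.object)
               ELSE 'Multiple'
           END AS category
      FROM table1 t1
     ORDER BY t1.object
""").fetchall()

print(rows)
```

With 80 real columns, t1.* still expands to all of them and only the extra category expression has to be written out.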
Can you try this:
select
distinct
t1.*,
case when c.object is not null then 'Multiple' else t2.category end as category
from
table1 t1
left outer join
table2 t2
on t1.object = t2.object
left outer join
(select object from table2 group by object having count(*) > 1) c
on t1.object = c.object

Optimize SQL Query, need suggestions

I have a table in SQL Server having 4 columns:
Invoice No, Date, Amt and ID
I have to find invoices that have same Invoice No, date and Amt but different ID.
I'm populating the results doing self join but seems like it's not the optimized way to fetch results.
My query:
select * from table t1 join
table t2 on t1.invoice = t2.invoice
where t1.invoice=t2.invoice and t1.amount=t2.amount and t1.date =t2.date and t1.id!=t2.id
Kindly suggest me an optimized way to fetch the correct result.
Try this, using a left join and filtering out the nulls:
select * from (
select t1.invoiceno, t1.date, t1.amt, t1.id, t2.id as t2ID
from invoices t1
left join invoices t2 on t2.invoiceno = t1.invoiceno
and t2.date = t1.date
and t2.amt = t1.amt
and t2.id != t1.id) t3
where coalesce(t3.t2ID, 0) != 0
Indexes can speed up retrieval from large tables.
Use a subquery, but don't use one just to produce a single column.
I would use the subquery as a derived table to join against, just like in the first answer.
Use not exists:
select t1.* from table t1
where not exists( select 1 from
table t2 where t1.invoice = t2.invoice
and t1.amount=t2.amount
and t1.date =t2.date and t1.id=t2.id
having count(*)>1
)
have to find invoices that have same Invoice No, date and Amt but different ID.
Use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.Invoice = t.invoice and
t2.Date = t.date and
t2.amount = t.amount and
t2.id <> t.id
)
order by t.invoice, t.date, t.amount, t.id;
This will show the matching invoices on adjacent rows. For performance, you want an index on (invoice, date, amount, id).
If you just want triplets where this occurs, you can use aggregation:
select invoice, date, amount, min(id), max(id)
from t
group by invoice, date, amount
having count(distinct id) > 1;
Note: If there are more than two duplicates, this only shows two ids.
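The note about more than two duplicates is easy to see on a small data set. Below is a rough sketch using an in-memory SQLite database from Python (SQLite standing in for SQL Server, with made-up invoices): the EXISTS query lists every matching row, while the aggregation reports one row per duplicated triplet with only the smallest and largest id.

```python
import sqlite3

# Hypothetical sample data: INV1 appears three times with different ids.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (invoice TEXT, date TEXT, amount REAL, id INTEGER);
    INSERT INTO t VALUES
        ('INV1', '2020-01-01', 100, 1),
        ('INV1', '2020-01-01', 100, 2),
        ('INV1', '2020-01-01', 100, 3),
        ('INV2', '2020-01-02', 50, 4),
        ('INV3', '2020-01-03', 75, 5),
        ('INV3', '2020-01-03', 80, 6);
""")

# EXISTS: every row that has a twin with a different id.
matching_ids = [r[0] for r in conn.execute("""
    SELECT t.id
      FROM t
     WHERE EXISTS (SELECT 1
                     FROM t t2
                    WHERE t2.invoice = t.invoice
                      AND t2.date = t.date
                      AND t2.amount = t.amount
                      AND t2.id <> t.id)
     ORDER BY t.id
""")]

# Aggregation: one row per duplicated (invoice, date, amount) triplet.
groups = conn.execute("""
    SELECT invoice, date, amount, MIN(id), MAX(id)
      FROM t
     GROUP BY invoice, date, amount
    HAVING COUNT(DISTINCT id) > 1
""").fetchall()

print(matching_ids)
print(groups)
```

Here id 2 shows up in the EXISTS result but not in the aggregated row, which only carries ids 1 and 3.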

SQL Query, how to get data from two tables

Table 1:
ID (unique), Name, Address
Table 2:
RecordId, ID (key of table 1), Child name
In one query, I want to retrieve all rows of Table 1 with one additional column which will be the count of all record in table 2 from ID (that is number of children for each ID in table 1). Can't figure out how to format a query to retrieve this data.
Simply join and apply count:
select T1.*, COUNT(T2.RECORDID)AS T2COUNT from Table1 T1
INNER JOIN TABLE2 T2 ON T1.ID= T2.ID
--LEFT JOIN TABLE2 T2 ON T1.ID= T2.ID --if you need rows with 0 child records (from comments by @Cha)
GROUP BY T1.ID , T1.Name, T1.Address
The correct way of doing this is with an OUTER JOIN:
SELECT a.ID, a.Name, a.Address, b.cnt
FROM Table1 a
LEFT OUTER JOIN
(SELECT ID, count(*) cnt from Table2 GROUP BY ID) b
ON a.ID = b.ID
The way I would avoid is a correlated sub-query, which gives the right counts but can be evaluated once per row of Table1:
SELECT a.ID, a.Name, a.Address,
(SELECT count(*) FROM Table2 b WHERE b.ID = a.ID) as cnt
FROM Table1 a
Here is a discussion about correlated subqueries vs OUTER JOINs, if you are interested
Group by the table1 fields and count the matching records in table2.
Here T1 is the alias of table1 and T2 is the alias of table2:
select T1.ID, T1.Name, T1.Address, count(T2.ID) as total_records
from table1 as T1
left outer join table2 as T2 on T2.ID=T1.ID
group by T1.ID, T1.Name, T1.Address
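The practical difference between the INNER JOIN and LEFT JOIN variants in the answers above is what happens to parents without children. A small sketch with an in-memory SQLite database from Python (sample names and addresses are invented, and Child name is spelled ChildName):

```python
import sqlite3

# Hypothetical sample data: Ann has two children, Bob has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
    CREATE TABLE Table2 (RecordId INTEGER PRIMARY KEY, ID INTEGER, ChildName TEXT);
    INSERT INTO Table1 VALUES (1, 'Ann', 'A St'), (2, 'Bob', 'B St');
    INSERT INTO Table2 VALUES (10, 1, 'Kid1'), (11, 1, 'Kid2');
""")

query = """
    SELECT T1.ID, T1.Name, T1.Address, COUNT(T2.RecordId) AS T2COUNT
      FROM Table1 T1
      {join} Table2 T2 ON T1.ID = T2.ID
     GROUP BY T1.ID, T1.Name, T1.Address
     ORDER BY T1.ID
"""

# LEFT JOIN keeps Bob with a count of 0; INNER JOIN drops him entirely.
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

print(left_rows)
print(inner_rows)
```

Counting T2.RecordId rather than * matters with the LEFT JOIN: COUNT(*) would count the NULL-extended row and report 1 for Bob.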

tsql: alternative to select subquery in join

This is my table layout, simplified:
table1: pID (pkey), data
table2: rowID (pkey), pID (fkey), data, date
I want to select some rows from table1 joining one row from table2 per pID for the most recent date for that pID.
I currently do this with the following query:
SELECT * FROM table1 as a
LEFT JOIN table2 AS b ON b.rowID = (SELECT TOP(1) rowID FROM table2 WHERE pID = a.pID ORDER BY date DESC)
This way of working is slow, probably because it has to run a subquery for each row of table1. Is there a way to improve performance on this or do it another way?
You can try something along these lines: use the subquery to get the latest date per pID (grouping by pID), then join that with the first table. This way the subquery does not have to be executed for each row of Table1, which should result in better performance:
Select *
FROM Table1 a
INNER JOIN
(
SELECT pID, Max(Date) AS MaxDate FROM Table2
GROUP BY pID
) b
ON a.pID = b.pID
The sample SQL returns only the latest date per pID. If you need other columns from that row, join this result back to Table2 on pID and the date; adding them to the GROUP BY would change the grouping. Hope this helps.
Use the code below; note that I added ORDER BY Date DESC to get the most recent data:
select *
from table1 a
inner join table2 b on a.pID=b.pID
where b.rowID in(select top(1) t.rowID from table2 t where t.pID=a.pID order by Date desc)
I am using the code below in a similar scenario (I adapted it to your example):
SELECT b.*
FROM table1 AS a
left outer join (
SELECT c.*
FROM table2 c
inner join (
SELECT pID, max(date) as date
FROM table2
WHERE date <= <max_date>
group by pID
) d ON c.pID = d.pID AND c.date = d.date
) b ON a.pID = b.pID
The only problem with this approach is that you have to make sure the dates don't repeat for a given pID.
You can do this with the row_number() function and a subquery:
SELECT t1.*
FROM table1 t1 LEFT JOIN
(select t2.*, row_number() over (partition by pId order by date desc) as seqnum
from table2 t2
) t2
on t1.pId = t2.pId and t2.seqnum = 1;
Use the ROW_NUMBER() function to add a column that marks which row of table 2 comes first for each pID (partitioned by pID and ordered by rowDate descending).
Example:
WITH cte AS
(
SELECT
rowID AS t2RowId,
ROW_NUMBER() OVER (PARTITION BY pID ORDER BY rowDate DESC) AS rowNum
FROM table2 t2
) -- gets the t2RowIds + a column which says which is the latest for each pID
SELECT t1.*, t2.*
FROM table1 t1
LEFT JOIN
(
table2 t2
JOIN cte ON t2.rowID = cte.t2RowId AND cte.rowNum = 1
) ON t1.pID = t2.pID
This is guaranteed to return only 1 row from table2 per pID, even if multiple rows have the same date. You should of course make sure the date column in table 2 is indexed for performance (ideally with an index that also covers the primary key of table2).
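For anyone who wants to try the ROW_NUMBER() idea without a SQL Server instance, here is a rough sketch against an in-memory SQLite database from Python. It needs SQLite 3.25 or newer for window functions, the sample rows are invented, and the rowID DESC tie-breaker is my own assumption for deciding between rows that share a date.

```python
import sqlite3

# Hypothetical sample data: pID 1 has an old and a new row,
# pID 2 has two rows on the same date, pID 3 has no rows in table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pID INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE table2 (rowID INTEGER PRIMARY KEY, pID INTEGER,
                         data TEXT, date TEXT);
    INSERT INTO table1 VALUES (1, 'p1'), (2, 'p2'), (3, 'p3');
    INSERT INTO table2 VALUES
        (1, 1, 'old',   '2020-01-01'),
        (2, 1, 'new',   '2020-02-01'),
        (3, 2, 'tie-a', '2020-03-01'),
        (4, 2, 'tie-b', '2020-03-01');
""")

# Number the table2 rows per pID, newest first, and keep only number 1.
# The rowID DESC tie-breaker is an assumption, not part of the answers above.
rows = conn.execute("""
    SELECT t1.pID, t2.rowID, t2.data
      FROM table1 t1
      LEFT JOIN (SELECT x.*,
                        ROW_NUMBER() OVER (PARTITION BY pID
                                           ORDER BY date DESC, rowID DESC) AS seqnum
                   FROM table2 x) t2
        ON t1.pID = t2.pID AND t2.seqnum = 1
     ORDER BY t1.pID
""").fetchall()

print(rows)
```

Each pID comes back exactly once: the tied dates for pID 2 resolve to a single row, and pID 3 is kept with NULLs because of the LEFT JOIN.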

SQL Select Puzzle

Ok..here's what I want to do. I've oversimplified the example below:-
I have a table (Table1) with references in like this:
Table1_ID (PK)
Table1_Description
There's another table (Table2):-
Table2_ID (PK)
Table2_LinkedID (FK)
Table2_Status <--value is "open" or "completed"
Table2_LinkedID is linked to Table1_ID.
Ok. Now I have three queries that I want to connect together. Here is what I need.
First query:-
SELECT * FROM Table1
This works fine.
I want to add two additional columns to the query. The first is the total number of records in Table2 where the foreign key equals the primary key of table1 (ie SELECT *).
The second will be a count of records where Table2_Status = 'completed'
Does this make sense?
select t1.Table1_ID,
t1.Table1_Description,
t2.TotalCount,
t2.CompletedCount
from Table1 t1
left outer join (
select Table2_LinkedID,
count(*) as TotalCount,
count(case when Table2_Status = 'completed' then 1 end) as CompletedCount
from Table2
group by Table2_LinkedID
) t2 on t1.Table1_ID = t2.Table2_LinkedID
SELECT t1.*,
(SELECT COUNT(*) FROM Table2 WHERE Table2_LinkedID = t1.Table1_ID) cntTotal,
(SELECT COUNT(*) FROM Table2 WHERE Table2_LinkedID = t1.Table1_ID AND Table2_Status = 'completed') cntCompleted
FROM Table1 t1
Make sure to have a proper index for the foreign key and for Table2_Status for best performance.
You can make a simple GROUP BY with aggregates:
SELECT
Table1.ID,
Table1.Description,
Count(Table2.ID) AS TotalT2,
Sum(CASE WHEN Table2.Status = 'completed' THEN 1 ELSE 0 END) AS CountOfCompleted
FROM Table1
LEFT JOIN Table2 ON Table2.LinkedID = Table1.ID
GROUP BY Table1.ID, Table1.Description
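Here is a small sketch of the conditional-aggregation approach, run against an in-memory SQLite database from Python. It uses the column names from the question and made-up rows, so it only illustrates how the two counts come out of a single pass over the join.

```python
import sqlite3

# Hypothetical sample data: record 1 has three linked rows (two completed),
# record 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Table1_ID INTEGER PRIMARY KEY, Table1_Description TEXT);
    CREATE TABLE Table2 (Table2_ID INTEGER PRIMARY KEY,
                         Table2_LinkedID INTEGER,
                         Table2_Status TEXT);
    INSERT INTO Table1 VALUES (1, 'first'), (2, 'second');
    INSERT INTO Table2 VALUES
        (1, 1, 'open'), (2, 1, 'completed'), (3, 1, 'completed');
""")

# COUNT ignores the NULLs produced by the LEFT JOIN; the CASE turns the
# status test into 1 or 0 so SUM counts only the completed rows.
rows = conn.execute("""
    SELECT t1.Table1_ID,
           t1.Table1_Description,
           COUNT(t2.Table2_ID) AS TotalCount,
           SUM(CASE WHEN t2.Table2_Status = 'completed' THEN 1 ELSE 0 END)
               AS CompletedCount
      FROM Table1 t1
      LEFT JOIN Table2 t2 ON t2.Table2_LinkedID = t1.Table1_ID
     GROUP BY t1.Table1_ID, t1.Table1_Description
     ORDER BY t1.Table1_ID
""").fetchall()

print(rows)
```

Records with no linked rows still appear, with both counts at 0.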
Will this work for you?
Query 1:
select a.ID
, count(1) as Table2_RecordCount
from Table1 a
inner join Table2 b on b.LinkedID = a.ID
group by a.ID
Query 2:
select a.ID
, count(1) as Table2_RecordCount
from Table1 a
inner join Table2 b on b.LinkedID = a.ID
where b.[Status] = 'completed'
group by a.ID