Optimize SQL Query, need suggestions - sql

I have a table in SQL Server having 4 columns:
Invoice No, Date, Amt and ID
I have to find invoices that have same Invoice No, date and Amt but different ID.
I'm getting the results with a self join, but that doesn't seem like the most efficient way to fetch them.
My query:
select * from table t1 join
table t2 on t1.invoice = t2.invoice
where t1.invoice=t2.invoice and t1.amount=t2.amount and t1.date =t2.date and t1.id!=t2.id
Kindly suggest me an optimized way to fetch the correct result.

Try this, using a left join and filtering out the nulls:
select * from (
select t1.invoiceno, t1.date, t1.amt, t1.id, t2.id as t2ID
from invoices t1
left join invoices t2 on t2.invoiceno = t1.invoiceno
and t2.date = t1.date
and t2.amt = t1.amt
and t2.id != t1.id) t3
where t3.t2ID is not null

You can use indexes to speed up retrieval from large tables.
Use a subquery, but don't use one just to produce a single column.
I'd advise using the subquery as a derived table that you can join against,
just like in the first answer.

Use exists:
select t1.* from table t1
where exists (select 1 from
table t2 where t1.invoice = t2.invoice
and t1.amount = t2.amount
and t1.date = t2.date and t1.id <> t2.id
)

have to find invoices that have same Invoice No, date and Amt but different ID.
Use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.Invoice = t.invoice and
t2.Date = t.date and
t2.amount = t.amount and
t2.id <> t.id
)
order by t.invoice, t.date, t.amount, t.id;
This will show the matching invoices on adjacent rows. For performance, you want an index on (invoice, date, amount, id).
If you just want triplets where this occurs, you can use aggregation:
select invoice, date, amount, min(id), max(id)
from t
group by invoice, date, amount
having count(distinct id) > 1;
Note: If there are more than two duplicates, this only shows two ids.
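To see how the two queries above behave, here is a minimal sketch that runs them on made-up data. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for SQL Server; the table name t, the index name idx_t and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (invoice text, date text, amount real, id integer)")
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        ("A1", "2020-01-01", 100.0, 1),
        ("A1", "2020-01-01", 100.0, 2),  # same invoice/date/amount, different id
        ("B7", "2020-01-02", 50.0, 3),   # no matching row
    ],
)
# Composite index on the columns compared in the correlated subquery.
conn.execute("create index idx_t on t (invoice, date, amount, id)")

# Every row that has a "twin" with a different id.
exists_rows = conn.execute(
    """
    select t.invoice, t.id
    from t
    where exists (select 1
                  from t t2
                  where t2.invoice = t.invoice
                    and t2.date = t.date
                    and t2.amount = t.amount
                    and t2.id <> t.id)
    order by t.invoice, t.date, t.amount, t.id
    """
).fetchall()

# One row per duplicated (invoice, date, amount) triplet.
grouped_rows = conn.execute(
    """
    select invoice, date, amount, min(id), max(id)
    from t
    group by invoice, date, amount
    having count(distinct id) > 1
    """
).fetchall()

print(exists_rows)
print(grouped_rows)
```

The exists version returns one row per invoice record, while the aggregation returns one row per duplicated triplet.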

Related

SQL Server - How to get 2 values before and after timestamp from another table

I was just wondering if anyone can help me with this problem.
I have two tables - Table 1 & Table 2.
What I'm trying to do is to find 2 timestamps before and after 'date' from Table 1, so everything highlighted in pink in Table 2.
How can I do this in Microsoft SQL Server? ideally without using CTE. Unfortunately, CTE is not supported by Tableau.
Thank you in advance.
Here is one option using a CROSS JOIN, datediff() and row_number()
Select Date
,Value
,ID
From (
Select A.Date
,A.Value
,B.ID
,RN1 = row_number() over (order by datediff(second,a.date,b.date) desc)
,RN2 = row_number() over (order by datediff(second,a.date,b.date) asc)
From Table2 A
Cross Join Table1 B
) A
Where RN1 in (2,3)
or RN2 in (2,3)
Order By Date
If you want to select this specific range of rows in table2, the simplest approach might be union all:
(
select top (3) t2.date, t2.value
from table1 t1
inner join table2 t2 on t2.date < t1.date
order by t2.date desc
)
union all
(
select top (3) t2.date, t2.value
from table1 t1
inner join table2 t2 on t2.date > t1.date
order by t2.date
)
Both subqueries start from the reference date in table1 and then fetch the previous (resp. next) 3 records in table2, using order by and top. The combination of the two datasets gives you the result you want.
With a very small table1 and with the relevant index in place on table2(date), both subqueries should execute very efficiently. Adding column value to the table2 index might further help performance, by making the index covering for the query.
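A rough sketch of the same "3 before, 3 after" idea on made-up data follows. It assumes SQLite via Python's sqlite3 module rather than SQL Server, so top (3) becomes limit 3 and each branch is wrapped in a derived table; the sample rows and the index name idx_table2_date are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (date text)")
conn.execute("create table table2 (date text, value integer)")

# Hypothetical data: one reference date, ten candidate rows.
conn.execute("insert into table1 values ('2020-01-05')")
conn.executemany(
    "insert into table2 values (?, ?)",
    [(f"2020-01-{d:02d}", d * 10) for d in range(1, 11)],
)
# Index on date, with value added so the index covers the query.
conn.execute("create index idx_table2_date on table2 (date, value)")

neighbours = conn.execute(
    """
    select date, value from (
        select t2.date, t2.value
        from table1 t1
        join table2 t2 on t2.date < t1.date
        order by t2.date desc
        limit 3)            -- SQLite spelling of top (3)
    union all
    select date, value from (
        select t2.date, t2.value
        from table1 t1
        join table2 t2 on t2.date > t1.date
        order by t2.date
        limit 3)
    order by date
    """
).fetchall()

for row in neighbours:
    print(row)
```

The row for the reference date itself is excluded because both join conditions use strict comparisons.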

Column must appear in the GROUP BY clause or be used in an aggregate function?

The question is We want to sum the orders that are above $700.
Here is my code so far
SELECT sum(t2.amount), t1.name
FROM Salesperson t1
INNER JOIN Orders t2 ON t1.ID = t2. salesperson_id
WHERE t2.amount >= 700
However I get an error
Column "t1.name" must appear in the GROUP BY clause or aggregate function
Why is that?
You need to specify the fields that will be used to aggregate (group) the results, in this case the salesperson name.
SELECT sum(t2.amount), t1.name
FROM Salesperson t1
INNER JOIN Orders t2 ON t1.ID = t2.salesperson_id
WHERE t2.amount >= 700
GROUP BY t1.name
If you want each salesperson's total and only want to keep totals of at least 700, I think this will work:
SELECT sum(t2.amount), t1.name
FROM Salesperson t1
INNER JOIN Orders t2 ON t1.ID = t2.salesperson_id
GROUP BY t1.ID, t1.name
HAVING sum(t2.amount) >= 700;
The error occurs because a salesperson can have more than one order,
so you need to group by salesperson to make each row represent the total order amount for that salesperson.
I hope this answer is useful and fixes your problem.
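The difference between filtering individual orders in WHERE and filtering totals in HAVING is easy to miss, so here is a small sketch on made-up data. It assumes SQLite via Python's sqlite3 module; the sample salespeople and amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Salesperson (id integer primary key, name text)")
conn.execute("create table Orders (salesperson_id integer, amount integer)")
conn.executemany("insert into Salesperson values (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "insert into Orders values (?, ?)",
    [(1, 800), (1, 300), (2, 400), (2, 400)],  # hypothetical amounts
)

# WHERE filters individual orders before they are summed.
where_rows = conn.execute(
    """
    select sum(t2.amount), t1.name
    from Salesperson t1
    inner join Orders t2 on t1.id = t2.salesperson_id
    where t2.amount >= 700
    group by t1.name
    order by t1.name
    """
).fetchall()

# HAVING filters the per-salesperson totals after they are summed.
having_rows = conn.execute(
    """
    select sum(t2.amount), t1.name
    from Salesperson t1
    inner join Orders t2 on t1.id = t2.salesperson_id
    group by t1.id, t1.name
    having sum(t2.amount) >= 700
    order by t1.name
    """
).fetchall()

print(where_rows)
print(having_rows)
```

Bob has no single order of 700 or more, so he only shows up in the HAVING version, where his two orders are added together first.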

Use SELECT subquery within UNION

I have two tables:
Table 1
Table 2
I am trying to write a query to SELECT all records for both tables, using UNION (columns ID, Date and Amount). The tables are linked by ID. When selecting the records in Table 2 during the UNION, however, if the related ID in Table 1 has a True or False value of TRUE, I want to change the Date of the Table 2 record to the date in Table 1, ultimately achieving this:
Is this possible?
The 2nd query of UNION ALL should be a join of the tables:
select id, date, amount
from Table1
union all
select
t2.id,
case when t1.trueorfalse = 'TRUE' then t1.date else t2.date end as date,
t2.amount
from Table2 t2 inner join Table1 t1
on t1.id = t2.id
The CASE expression will return either the date from Table1 or from Table2.
If your database supports the Boolean data type maybe you should use TRUE instead of 'TRUE'.
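As a quick check of the UNION ALL approach, here is a sketch on made-up rows. It assumes SQLite via Python's sqlite3 module, a text column named trueorfalse holding 'TRUE' or 'FALSE', and hypothetical sample data; the final order by is only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Table1 (id integer, date text, amount integer, trueorfalse text)"
)
conn.execute("create table Table2 (id integer, date text, amount integer)")
conn.executemany(
    "insert into Table1 values (?, ?, ?, ?)",
    [(1, "2020-01-01", 10, "TRUE"), (2, "2020-02-01", 20, "FALSE")],
)
conn.executemany(
    "insert into Table2 values (?, ?, ?)",
    [(1, "2020-03-01", 5), (2, "2020-04-01", 7)],
)

rows = conn.execute(
    """
    select id, date, amount
    from Table1
    union all
    select
        t2.id,
        -- take the Table1 date only when the flag is TRUE
        case when t1.trueorfalse = 'TRUE' then t1.date else t2.date end as date,
        t2.amount
    from Table2 t2
    inner join Table1 t1 on t1.id = t2.id
    order by id, amount
    """
).fetchall()

for row in rows:
    print(row)
```

The Table2 row for id 1 picks up the Table1 date, while the row for id 2 keeps its own.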
This seems like a convenient place to use a lateral join, if your database supports it:
select v.*
from table1 t1 join
table2 t2
on t1.id = t2.id cross join lateral
(values (t1.id, t1.date, t1.amount),
(t2.id, (case when t1.true_or_false = 'true' then t1.date else t2.date end), t2.amount)
) v(id, date, amount);

How to find difference in rows for two tables in DB2 or SQL Server

In the table1 I have 1421144 rows and table2 has 1421134 rows.
I tried this query, but I don't get any rows returned.
select table1.ID
from table1
where ID not in (select ID from table2)
I have also used this query:
select ID from table1
except
select ID from table2
But I don't get any rows. Please help me, if the table1 has duplicates how can I get those duplicates?
Assuming ids are unique, you can use full outer join in either database:
select coalesce(t1.id, t2.id) as id,
(case when t1.id is null then 'T2 only' else 'T1 only' end)
from t1 full outer join
t2
on t1.id = t2.id
where t1.id is null or t2.id is null;
It is quite possible that the two tables have the same sets of ids, but there are duplicates. Try this:
select t1.id, count(*)
from t1
group by t1.id
having count(*) > 1;
and
select t2.id, count(*)
from t2
group by t2.id
having count(*) > 1;
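To illustrate why the row counts can differ while NOT IN and EXCEPT both come back empty, here is a small sketch. It assumes SQLite via Python's sqlite3 module and hypothetical ids, with a single duplicated id in table1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id integer)")
conn.execute("create table table2 (id integer)")
# Hypothetical data: id 2 appears twice in table1, once in table2.
conn.executemany("insert into table1 values (?)", [(1,), (2,), (2,), (3,)])
conn.executemany("insert into table2 values (?)", [(1,), (2,), (3,)])

count1 = conn.execute("select count(*) from table1").fetchone()[0]
count2 = conn.execute("select count(*) from table2").fetchone()[0]

# Both set-style comparisons only look at which ids exist, not how often.
not_in_rows = conn.execute(
    "select id from table1 where id not in (select id from table2)"
).fetchall()
except_rows = conn.execute(
    "select id from table1 except select id from table2"
).fetchall()

# Grouping exposes the id responsible for the extra row.
dup_rows = conn.execute(
    """
    select t1.id, count(*)
    from table1 t1
    group by t1.id
    having count(*) > 1
    """
).fetchall()

print(count1, count2, not_in_rows, except_rows, dup_rows)
```

Here table1 has one more row than table2, yet only the group by query reveals where the extra row comes from.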
If you have duplicates, try:
WITH Dups AS(
SELECT ID, COUNT(ID) OVER (PARTITION BY ID) AS DupCount
FROM Table1)
SELECT *
FROM Dups
WHERE DupCount > 1;
If you need to delete the dups, you can use the following syntax:
WITH Dups AS(
SELECT ID, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS DupCount
FROM Table1)
DELETE FROM Dups
WHERE DupCount > 1;
Obviously, however, check the data before you run a DELETE statement you got from a random on the internet. ;)
I guess you have a data type mismatch between the two tables; cast the IDs to integers and try your first query:
select table1.ID from table1
where cast(ID as int) not in (select cast(ID as int) from table2)
If the IDs are stored in a format other than int, cast them to varchar and
try with that data type instead.
NOT IN can take longer to execute; use a left join instead:
select t1id from
(
select t1.id t1Id, t2.Id t2Id from table1 t1 left join table2 t2
on cast(t1.id as int) = cast(t2.id as int)
) x where t2Id is null

Optimizing a SELECT with sub SELECT query in Oracle

Select id,
(Select sum(totalpay)
from Table2 t
where t.id = a.id
and t.transamt > 0
and t.paydt BETWEEN TRUNC(sysdate-0-7) and TRUNC(sysdate-0-1)) As Pay
from Table1 a
In spite of having indexes on transamt, paydt and id, the cost of the sub-query on Table2 is very expensive and requires a FULL TABLE scan.
Can this sub-query be optimized in any other way?
Please help.
Select t.id,
sum(totalpay) as Pay
from Table2 t join Table1 a
on t.id = a.id
where t.transamt > 0
and t.transamt > 0
and t.paydt BETWEEN TRUNC(sysdate-0-7) and TRUNC(sysdate-0-1)
group by t.id
Try this:
Select a.id,
pay.totalpay
from Table1 a
join (Select t.id, sum(totalpay) totalpay
from Table2 t
where t.transamt > 0
and t.paydt BETWEEN TRUNC(sysdate-0-7) and TRUNC(sysdate-0-1)
group by t.id
) pay
on a.id = pay.id
Push the group by on the joining column (id in this example) into the subquery, so the result is calculated once for all ids in Table2 and then joined with Table1.
In the original query the result is calculated separately for every row of Table1, reading the full Table2 each time.
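A small sketch comparing the correlated subquery with the pre-aggregated join is below. It assumes SQLite via Python's sqlite3 module instead of Oracle, so the TRUNC(sysdate...) bounds are replaced by two bound parameters with hypothetical dates, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Table1 (id integer)")
conn.execute(
    "create table Table2 (id integer, totalpay integer, transamt integer, paydt text)"
)
conn.executemany("insert into Table1 values (?)", [(1,), (2,), (3,)])
conn.executemany(
    "insert into Table2 values (?, ?, ?, ?)",
    [
        (1, 10, 1, "2020-01-02"),
        (1, 20, 1, "2020-01-03"),
        (1, 99, 0, "2020-01-03"),  # excluded: transamt is not > 0
        (2, 5, 1, "2020-01-04"),
        (2, 50, 1, "2020-02-01"),  # excluded: outside the date range
        (3, 7, 1, "2019-12-01"),   # excluded: outside the date range
    ],
)
date_range = ("2020-01-01", "2020-01-07")  # stands in for the sysdate bounds

# Original shape: the subquery is evaluated once per row of Table1.
correlated = conn.execute(
    """
    select a.id,
           (select sum(t.totalpay)
            from Table2 t
            where t.id = a.id
              and t.transamt > 0
              and t.paydt between ? and ?) as pay
    from Table1 a
    order by a.id
    """,
    date_range,
).fetchall()

# Suggested shape: aggregate Table2 once, then join the result to Table1.
pre_aggregated = conn.execute(
    """
    select a.id, pay.totalpay
    from Table1 a
    join (select t.id, sum(t.totalpay) as totalpay
          from Table2 t
          where t.transamt > 0
            and t.paydt between ? and ?
          group by t.id) pay
      on a.id = pay.id
    order by a.id
    """,
    date_range,
).fetchall()

print(correlated)
print(pre_aggregated)
```

One difference to keep in mind: the correlated version still returns ids without qualifying payments (with a NULL sum), whereas the inner join drops them. A left join would keep them if that matters.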