How to rewrite SQL query without IN clause

Schema:
Table A: AID(PK), RECEIVE_DATE
Table B: BID(PK), AID(FK), MESSAGE, ITEMID, ITEMTYPE
Tables A-to-B have a one-to-many mapping.
Here is a working SQL query (in SQL Server) that finds the latest message for each ITEMID, restricted to a given ITEMTYPE (say 'XYZ').
SELECT
b.MESSAGE, b.ITEMID
from a
inner join b on b.aid = a.aid AND b.ITEMTYPE = 'XYZ'
where a.receive_date in (select max(receive_date)
from a a1
inner join b b1 on b1.aid = a1.aid
where b1.itemid = b.itemid
);
How can we rewrite this SQL query without the IN clause (and also without using row numbering), since Oracle restricts the IN clause? The expression above fails with java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a list is 1000.

It isn't clear to me why you are getting ORA-01795. Your subquery only selects a max value, which should be a single value. In addition, the 1000 value limit only applies to a list of literals, not a subquery. In any case, you could rephrase this query using a join instead of WHERE IN:
SELECT
b.MESSAGE,
b.ITEMID
FROM a
INNER JOIN b
ON b.aid = a.aid AND b.ITEMTYPE = 'XYZ'
INNER JOIN
(
SELECT
b1.itemid,
MAX(receive_date) AS max_receive_date
FROM a a1
INNER JOIN b b1
ON b1.aid = a1.aid
GROUP BY b1.itemid
) t
ON b.itemid = t.itemid
WHERE a.receive_date = t.max_receive_date
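To see the join rewrite behave as described, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up for illustration, and SQLite stands in for Oracle here, so treat it as a check of the query logic only, not of Oracle behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (aid INTEGER PRIMARY KEY, receive_date TEXT);
    CREATE TABLE b (bid INTEGER PRIMARY KEY, aid INTEGER REFERENCES a(aid),
                    message TEXT, itemid INTEGER, itemtype TEXT);
    -- Hypothetical sample data, not from the question
    INSERT INTO a VALUES (1, '2020-01-01'), (2, '2020-02-01'), (3, '2020-03-01');
    INSERT INTO b VALUES
        (10, 1, 'old msg item 100', 100, 'XYZ'),
        (11, 2, 'new msg item 100', 100, 'XYZ'),
        (12, 1, 'only msg item 200', 200, 'XYZ'),
        (13, 3, 'other type item 300', 300, 'ABC');
""")

# Join each row to the per-item maximum date instead of using WHERE ... IN
latest_messages = conn.execute("""
    SELECT b.message, b.itemid
    FROM a
    INNER JOIN b ON b.aid = a.aid AND b.itemtype = 'XYZ'
    INNER JOIN (
        SELECT b1.itemid, MAX(a1.receive_date) AS max_receive_date
        FROM a a1
        INNER JOIN b b1 ON b1.aid = a1.aid
        GROUP BY b1.itemid
    ) t ON b.itemid = t.itemid
    WHERE a.receive_date = t.max_receive_date
    ORDER BY b.itemid
""").fetchall()

print(latest_messages)
```

Each 'XYZ' item comes back once, with the message attached to its most recent receive_date.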

EXISTS and IN tend to be interchangeable, and EXISTS performs better in some engines (not sure about Oracle) because it can return true on the first match rather than building a subset and checking against it. I'm not familiar with Oracle, but I imagine you could use the following to get around the 1000-expression limit on IN:
SELECT
b.MESSAGE, b.ITEMID
from a
inner join b on b.aid = a.aid AND b.ITEMTYPE = 'XYZ'
where exists (
SELECT 1
from a a1
inner join b b1 on b1.aid = a1.aid
where b1.itemid = b.itemid
having MAX(a1.receive_date) = a.receive_date
)

Related

Translating subquery to left join in sqlite

I have a query running against a SQLite database that uses a couple of subqueries. To accommodate some new requirements, I need to translate it to use joins instead. Below is a simplified version of the original query:
SELECT c.id AS category_id, b.budget_year,
(
SELECT sum(actual)
FROM lines l1
WHERE status = 'complete'
AND category_id = c.id
AND billing_year = b.budget_year
) AS actual,
(
SELECT sum(planned)
FROM lines l2
WHERE status IN ('forecasted', 'in-progress')
AND category_id = c.id
AND billing_year = b.budget_year
) AS rough_proposed
FROM categories AS c
LEFT OUTER JOIN budgets AS b ON (c.id = b.category_id)
GROUP BY c.id, b.budget_year;
The next query is my first attempt to convert it to use LEFT OUTER JOINs:
SELECT c.id AS category_id, b.budget_year, sum(l1.actual) AS actual, sum(l2.planned) AS planned
FROM categories AS c
LEFT OUTER JOIN budgets AS b ON (c.id = b.category_id)
LEFT OUTER JOIN lines AS l1 ON (l1.category_id = c.id
AND l1.billing_year = b.budget_year
AND l1.status = 'complete')
LEFT OUTER JOIN lines AS l2 ON (l2.category_id = c.id
AND l2.billing_year = b.budget_year
AND l2.status IN ('forecasted', 'in-progress'))
GROUP BY c.id, b.budget_year;
However, the actual and rough_proposed columns are much larger than expected. I am no SQL expert, and I am having a hard time understanding what is going on here. Is there a straightforward way to convert the subqueries to joins?
There is a problem with both your queries. However, the first query hides the problem, while the second query makes it visible.
Here is what's going on: you join lines twice, once as l1 and once more as l2. Before grouping, the result contains the same line multiple times whenever there are both actual lines and forecasted/in-progress lines. Each line is then counted multiple times, which inflates the sums.
The first query hides this, because it does not apply aggregation to actual and rough_proposed columns. SQLite picks the first entry for each group, which has the correct value.
You can fix your query by joining to lines only once, and counting the amounts conditionally, like this:
SELECT
c.id AS category_id
, b.budget_year
, SUM(CASE WHEN l.status = 'complete' THEN l.actual END) AS actual
, SUM(CASE WHEN l.status IN ('forecasted', 'in-progress') THEN l.planned END) AS planned
FROM categories AS c
LEFT OUTER JOIN budgets AS b ON (c.id = b.category_id)
LEFT OUTER JOIN lines AS l ON (l.category_id = c.id AND l.billing_year = b.budget_year)
GROUP BY c.id, b.budget_year
In this new query each row from lines is brought in only once; the decision to count it in one of the actual/planned columns is made inside the conditional expression embedded in the SUM aggregating function.
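The inflation is easy to reproduce. The sketch below, using Python's sqlite3 module and a few made-up rows (one category, one budget year, two complete lines and two planned lines), runs the double-join query and the conditional-aggregation query side by side. The column types and sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY);
    CREATE TABLE budgets (category_id INTEGER, budget_year INTEGER);
    CREATE TABLE lines (category_id INTEGER, billing_year INTEGER,
                        status TEXT, actual INTEGER, planned INTEGER);
    -- Hypothetical sample data
    INSERT INTO categories VALUES (1);
    INSERT INTO budgets VALUES (1, 2020);
    INSERT INTO lines VALUES
        (1, 2020, 'complete',    10, 0),
        (1, 2020, 'complete',    20, 0),
        (1, 2020, 'forecasted',   0, 5),
        (1, 2020, 'in-progress',  0, 7);
""")

# Joining lines twice pairs every l1 row with every l2 row (2 x 2 = 4 rows)
double_join = conn.execute("""
    SELECT c.id, b.budget_year, SUM(l1.actual), SUM(l2.planned)
    FROM categories AS c
    LEFT OUTER JOIN budgets AS b ON c.id = b.category_id
    LEFT OUTER JOIN lines AS l1 ON l1.category_id = c.id
        AND l1.billing_year = b.budget_year AND l1.status = 'complete'
    LEFT OUTER JOIN lines AS l2 ON l2.category_id = c.id
        AND l2.billing_year = b.budget_year
        AND l2.status IN ('forecasted', 'in-progress')
    GROUP BY c.id, b.budget_year
""").fetchall()

# Joining lines once and summing conditionally counts each row exactly once
conditional = conn.execute("""
    SELECT c.id, b.budget_year,
           SUM(CASE WHEN l.status = 'complete' THEN l.actual END),
           SUM(CASE WHEN l.status IN ('forecasted', 'in-progress') THEN l.planned END)
    FROM categories AS c
    LEFT OUTER JOIN budgets AS b ON c.id = b.category_id
    LEFT OUTER JOIN lines AS l ON l.category_id = c.id
        AND l.billing_year = b.budget_year
    GROUP BY c.id, b.budget_year
""").fetchall()

print(double_join)   # [(1, 2020, 60, 24)]  doubled
print(conditional)   # [(1, 2020, 30, 12)]  expected totals
```

With two rows on each side of the double join, every amount is counted twice, which matches the "much larger than expected" symptom in the question.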

Equivalent SQL Query using Cartesian Product Only

Given 3 tables a: {id, name_eng}, b: {id, name_spa} and c: {id, name_ita},
what is the equivalent "Cartesian product" query for this one:
select
a.name_eng,
b.name_spa,
c.name_ita
from
a inner join b on a.id = b.id
left outer join c on a.id = c.id
I don't know what you want; a Cartesian product would be this:
SELECT a.name_eng,
b.name_spa,
c.name_ita
FROM a
CROSS JOIN b
CROSS JOIN c
Or the implicit way:
SELECT a.name_eng,
b.name_spa,
c.name_ita
FROM a,b,c
If you want your previous query written with Cartesian products (why??), then this should do (on SQL Server 2000):
SELECT a.name_eng,
b.name_spa,
c.name_ita
FROM a,b,c
WHERE a.id = b.id
AND a.id *= c.id
If I wasn't clear enough with the "why??": you shouldn't use implicit joins since they are deprecated; you should always use proper explicit joins.
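The old-style outer join operator is not available in most engines, so as a rough check of the idea here is a sketch in Python with sqlite3. It compares the explicit join query with a version built from comma (Cartesian) joins plus WHERE filters. The UNION ALL / NOT EXISTS branch that stands in for the left outer join is my own substitution, not something from the answer above, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name_eng TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, name_spa TEXT);
    CREATE TABLE c (id INTEGER PRIMARY KEY, name_ita TEXT);
    -- Hypothetical sample data: id 3 has no row in b, id 2 has no row in c
    INSERT INTO a VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO b VALUES (1, 'uno'), (2, 'dos');
    INSERT INTO c VALUES (1, 'uno');
""")

explicit = conn.execute("""
    SELECT a.name_eng, b.name_spa, c.name_ita
    FROM a
    INNER JOIN b ON a.id = b.id
    LEFT OUTER JOIN c ON a.id = c.id
    ORDER BY a.name_eng
""").fetchall()

# Cartesian products filtered in WHERE; the second branch keeps the
# a/b pairs that have no match in c (the "outer" rows), padded with NULL.
cartesian = conn.execute("""
    SELECT a.name_eng, b.name_spa, c.name_ita
    FROM a, b, c
    WHERE a.id = b.id AND a.id = c.id
    UNION ALL
    SELECT a.name_eng, b.name_spa, NULL
    FROM a, b
    WHERE a.id = b.id
      AND NOT EXISTS (SELECT 1 FROM c WHERE c.id = a.id)
    ORDER BY 1
""").fetchall()

print(explicit)
print(cartesian)
```

Both queries return the same two rows on this data; the explicit join version is still the one to prefer.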

SQL SELECT and SUM from three

I have been racking my brain for hours over what I thought would be a simple SQL SELECT command. I searched everywhere and read all the questions related to mine. I tried an SQL command builder, and even read and worked through complete series of SQL tutorials and manuals to try to build it from scratch while understanding it (which is very important to me, given the next commands I'll eventually have to build...).
But now I'm stuck: I get the results I want, but only in separate SELECT commands that I can't seem to combine!
Here is my case: 3 tables, the first linked to the second by a common id, the second linked to the third by another common id, but no common id between the first and the third. Let's say:
Table A : id, name
Table B : id, idA, amount
Table C : id, idB, amount
Several names in Table A. Several amounts in Table B. Several amounts in Table C. Result wanted : each A.id and A.name, with the corresponding SUM of B.amount, and with the corresponding SUM of C.amount. Let's say :
A.id
A.name
SUM(B.amount) WHERE B.idA = A.id
SUM(C.amount) WHERE C.idB = B.id for each B which B.idA = A.id
It works for "the first three columns", and for "the first two columns and the fourth", with either a WHERE clause or a LEFT JOIN. But I can't manage to get all four columns together without messing everything up!
One could say "it's easy, just put an idA column in Table C"! That would be easier, sure. But is it really necessary? I don't think so, but I could be wrong! So I'm asking anyone with SQL skills (who will receive an eternal "SQL God" decoration from me) to answer, laughing, "That's so simple! Just do this and you're done! Stupid little newbies..." ;)
Running VB 2010 and MS SQL Server
Thanks for reading !
Try this:
SELECT A.Id, A.Name, ISNULL(SUM(B.amount), 0) as bSum, ISNULL(SUM(C2.Amount), 0) as cSum
FROM A
LEFT OUTER JOIN B ON A.Id = B.idA
LEFT OUTER JOIN (SELECT C.idB, SUM(C.AMOUNT) AS Amount FROM C GROUP BY C.idB) AS C2 ON C2.idB = B.Id
GROUP BY A.Id, A.Name
Try this:
SELECT
a.id,
a.name,
sum(x.amount) as amountb,
sum(x.amountc) as amountc
from a
left join (
select
b.id,
b.ida,
b.amount,
SUM(c.amount) as amountc
from b
left join c
on b.id = c.idb
group by
b.id,
b.amount,
b.ida
) x
on a.id = x.ida
group by
a.id,
a.name
This should give you the result set you're looking for. It sums all C.Amount's for each B.id, then adds it all together into a single result set. I tested it with a bit of sample data in MSSQL, and it works as expected.
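For anyone who wants to check the pre-aggregation idea without SQL Server, here is a small sketch in Python with sqlite3. The table layout follows the question; the rows are made up. It also runs a naive version that joins all three tables before summing, to show why C has to be summed per B.id first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, ida INTEGER, amount INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, idb INTEGER, amount INTEGER);
    -- Hypothetical sample data
    INSERT INTO a VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO b VALUES (1, 1, 10), (2, 1, 20), (3, 2, 5);
    INSERT INTO c VALUES (1, 1, 1), (2, 1, 2), (3, 2, 3);
""")

# Naive: b row 1 appears twice (once per matching c row), so its amount is double counted
naive = conn.execute("""
    SELECT a.id, a.name, SUM(b.amount), SUM(c.amount)
    FROM a
    LEFT JOIN b ON a.id = b.ida
    LEFT JOIN c ON b.id = c.idb
    GROUP BY a.id, a.name
    ORDER BY a.id
""").fetchall()

# Pre-aggregated: sum c per b.id first, then sum per a.id
pre_aggregated = conn.execute("""
    SELECT a.id, a.name, SUM(x.amount) AS amountb, SUM(x.amountc) AS amountc
    FROM a
    LEFT JOIN (
        SELECT b.id, b.ida, b.amount, SUM(c.amount) AS amountc
        FROM b
        LEFT JOIN c ON b.id = c.idb
        GROUP BY b.id, b.amount, b.ida
    ) x ON a.id = x.ida
    GROUP BY a.id, a.name
    ORDER BY a.id
""").fetchall()

print(naive)           # [(1, 'alpha', 40, 6), (2, 'beta', 5, None)]
print(pre_aggregated)  # [(1, 'alpha', 30, 6), (2, 'beta', 5, None)]
```

The B total for 'alpha' is 30 in the data; only the pre-aggregated query reports it correctly.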
Select a.id, a.name, sum(b.amount), sum(c.amount)
from a inner join b on a.id = b.idA
inner join c on b.id = c.idB
group by a.id, a.name
You need to add them separately:
select a.id, a.name, (coalesce(b.amount, 0.0) + coalesce(c.amount, 0.0))
from a left outer join
(select b.ida, sum(amount) as amount
from b
group by b.ida
) b
on a.id = b.ida left outer join
(select b.ida, sum(amount) as amount
from c join
b
on c.idb = b.id
group by b.ida
) c
on a.id = c.ida
The outer joins handle the case where b and c records don't both exist for a given id.

How to return rows matched in a table without multiple EXISTS clauses?

I want to pull back results from one table that match ALL specified values where the specified values are in another table. I can do it like this:
SELECT * FROM Contacts
WHERE
EXISTS (SELECT 1 FROM dbo.ContactClassifications WHERE ContactID = Contacts.ID AND ClassificationID = '8C62E5DE-00FC-4994-8127-000B02E10DA5')
AND EXISTS (SELECT 1 FROM dbo.ContactClassifications WHERE ContactID = Contacts.ID AND ClassificationID = 'D2E90AA0-AC93-4406-AF93-0020009A34BA')
AND EXISTS etc...
However that falls over when I get up to about 40 EXISTS clauses. The error message is "The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query."
The gist of this is to:
Select all contacts with any GUID from the IN list
Use COUNT(DISTINCT ...) to count the matching GUIDs for each ContactID
Use HAVING to keep only the contacts whose count equals the number of GUIDs you put into the IN list
SQL Statement
SELECT *
FROM dbo.Contacts c
INNER JOIN (
SELECT c.ID
FROM dbo.Contacts c
INNER JOIN dbo.ContactClassifications cc ON c.ID = cc.ContactID
WHERE cc.ClassificationID IN ('..', '..', 38 other GUIDS)
GROUP BY
c.ID
HAVING COUNT(DISTINCT cc.ClassificationID) = 40
) cc ON cc.ID = c.ID
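As a rough illustration of the IN / COUNT(DISTINCT) / HAVING approach, here is a sketch in Python with sqlite3. Short strings stand in for the GUIDs, the contact names are invented, and the IN list is built from a Python list so the expected count always matches the number of values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contacts (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ContactClassifications (ContactID INTEGER, ClassificationID TEXT);
    -- Hypothetical sample data; 'X', 'Y', 'Z' stand in for GUIDs
    INSERT INTO Contacts VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO ContactClassifications VALUES
        (1, 'X'), (1, 'Y'),
        (2, 'X'), (2, 'X'),
        (3, 'X'), (3, 'Y'), (3, 'Z');
""")

wanted = ["X", "Y"]  # every one of these must be present for a contact
placeholders = ", ".join("?" for _ in wanted)

query = f"""
    SELECT c.ID, c.Name
    FROM Contacts c
    INNER JOIN (
        SELECT cc.ContactID
        FROM ContactClassifications cc
        WHERE cc.ClassificationID IN ({placeholders})
        GROUP BY cc.ContactID
        HAVING COUNT(DISTINCT cc.ClassificationID) = ?
    ) m ON m.ContactID = c.ID
    ORDER BY c.ID
"""
matching = conn.execute(query, (*wanted, len(wanted))).fetchall()

print(matching)  # [(1, 'Ann'), (3, 'Cy')]
```

Bob has 'X' twice but never 'Y', so the DISTINCT in the count is what keeps him out of the result.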
One solution is to demand that no classification exists without a matching contact. That's a double negation:
select *
from contacts c
where not exists
(
select *
from ContactClassifications cc
where not exists
(
select *
from ContactClassifications cc2
where cc2.ContactID = c.ID
and cc2.ClassificationID = cc.ClassificationID
)
)
This type of problem is known as relational division.
SELECT c.*
FROM Contacts c
INNER JOIN
(SELECT cc.ContactID, COUNT(DISTINCT cc.ClassificationID) AS num_class
FROM ContactClassifications cc
WHERE cc.ClassificationID IN (....)
GROUP BY cc.ContactID
) b ON c.ID = b.ContactID
WHERE b.num_class = [number of distinct values - how many different values you put in "IN"]
If you run SQL Server 2005 or later, you can do much the same with CROSS APPLY, which is supposedly more efficient.

Inner join query

Please go through the attached image, where I described my scenario:
I want a SQL join query.
Have a look at something like
SELECT *
FROM Orders o
WHERE EXISTS (
SELECT 1
FROM OrderBooks ob INNER JOIN
Books b ON ob.BookID = b.BookID
WHERE o.OrderID = ob.OrderID
AND b.IsBook = @IsBook
)
The query will return all orders based on the given criteria.
So, when @IsBook = 1 it returns all Orders that have one or more linked entries that are Books, and when @IsBook = 0 it returns all Orders that have one or more linked entries that are not Books.
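Since the image with the real schema is not available, the following is only a guess at the shape of the tables, based on the names used in the query above. It runs the EXISTS query through Python's sqlite3 module with the IsBook flag passed as a named parameter; all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema, inferred from the query; sample rows are hypothetical
    CREATE TABLE Books (BookID INTEGER PRIMARY KEY, IsBook INTEGER);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY);
    CREATE TABLE OrderBooks (OrderID INTEGER, BookID INTEGER);
    INSERT INTO Books VALUES (1, 1), (2, 0);
    INSERT INTO Orders VALUES (100), (101), (102);
    INSERT INTO OrderBooks VALUES (100, 1), (101, 2), (102, 1), (102, 2);
""")

def orders_with_items(is_book):
    """Return the ids of orders having at least one item with the given IsBook flag."""
    rows = conn.execute("""
        SELECT o.OrderID
        FROM Orders o
        WHERE EXISTS (
            SELECT 1
            FROM OrderBooks ob
            INNER JOIN Books b ON ob.BookID = b.BookID
            WHERE o.OrderID = ob.OrderID
              AND b.IsBook = :IsBook
        )
        ORDER BY o.OrderID
    """, {"IsBook": is_book}).fetchall()
    return [order_id for (order_id,) in rows]

print(orders_with_items(1))  # [100, 102]
print(orders_with_items(0))  # [101, 102]
```

Order 102 contains both kinds of item, so it shows up for either value of the flag.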
An inner join combines two or more tables on a field they have in common. The two keys should be of the same type and length, regardless of their names.
Here is an example:
Table1
id  Name   Sex
1   Akash  Male
2   Kedar  Male
Similarly, another table:
Table2
id  Address  Number
1   Nadipur  18281794
2   Pokhara  54689712
Now we can perform an inner join using the following SQL statement:
select A.id, A.Name, B.Address, B.Number from Table1 A
INNER JOIN Table2 B
ON A.id = B.id
The above query returns the matching rows from both tables, which here are in a one-to-one relationship.
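The example can be tried out with a short Python script using sqlite3. The two tables and their rows are the ones shown above; the third row in Table1 is an extra, made-up row added only to show that an inner join drops rows without a match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY, Name TEXT, Sex TEXT);
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, Address TEXT, Number INTEGER);
    INSERT INTO Table1 VALUES
        (1, 'Akash', 'Male'),
        (2, 'Kedar', 'Male'),
        (3, 'Unmatched', 'Male');  -- hypothetical row with no partner in Table2
    INSERT INTO Table2 VALUES
        (1, 'Nadipur', 18281794),
        (2, 'Pokhara', 54689712);
""")

joined = conn.execute("""
    SELECT A.id, A.Name, B.Address, B.Number
    FROM Table1 A
    INNER JOIN Table2 B ON A.id = B.id
    ORDER BY A.id
""").fetchall()

for row in joined:
    print(row)
```

Only ids 1 and 2 appear in the output, because id 3 has no matching row in Table2.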