Sorry if I'm not wording this too well, but let me try to explain what I'm doing. I have a main object of class A, that has multiple objects of classes B, C, D and E.
such that:
Class ObjectA
{
ObjectB[] myObjectBs;
ObjectC[] myObjectCs;
ObjectD[] myObjectDs;
ObjectE[] myObjectEs;
}
where A---B mapping is 1 to many, for B, C, D and E. That is, all B,C,D,E objects are associated with only one object A.
I'm storing the data for all these objects in a database, with Table A holding all the data for the instances of Class A, etc.
Now, when getting the data for this at run time on the fly, I'm running 5 different queries for each object.
(very simplified psuedocode)
objectA=sql("select * from tableA where id=#id#");
objectA.setObjectBs(sql("select * from tableB where a_id=#id#");
objectA.setObjectCs(sql("select * from tableC where a_id=#id#");
objectA.setObjectDs(sql("select * from tableD where a_id=#id#");
objectA.setObjectEs(sql("select * from tableE where a_id=#id#");
if that makes sense.
Now, I'm wondering, is this the most efficient way of doing it? I feel like there should be a way to get all this info in 1 query, but doing something like "select * from a,b,c,d,e where a.id = #id# and b.a_id = #id# and c.a_id = #id# and d.a_id = #id# and e.a_id = #id#" will give a result set with all the columns of A,B,C,D,E for each row, and there will be many many more rows that I'd be needing.
If there was only one array of objects (like just ObjectBs) it could be done with a simple join and then handled by my database framework. If the relationships were A(one)....B(many) and B(one)....C(many) it could be done with two joins and work. But for A(one)....B(many) and A(one)....C(many) etc I can't think of a good way to do joins or return this data without having too many rows, as with joins if A has 10 Bs and 10Cs, it'll return 100 rows rather than 20.
So, is the way I'm currently doing it, with 5 different selects, the most efficient (which it seems like its not), or is there a better way of doing it?
Also, If I were to grab a large set of these at once (say, 5000 ObjectAs and all the associated Bs, Cs, Ds, and Es), would there be a way to do it without running a ton of consecutive queries one after the other?
You can try iBatis using N+1 Select Lists
http://ibatis.apache.org/docs/dotnet/datamapper/ch03s05.html
Hth.
There is a huge performance issue with N+1 selects (check https://github.com/balak1986/prime-cache-support-in-ibatis2/blob/master/README). So please don't use it unless there is no other way of achieving this.
Luckily iBatis has groupBy property, that is created exactly to map data for these kind of complex object.
Check the example from http://www.cforcoding.com/2009/06/ibatis-tutorial-aggregation-with.html
Related
I have a list of 2500 obj numbers stored in Excel for which I need to run the below SQL:
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a
JOIN
AQ$_QUEUES b ON a.objno = b.table_objno
WHERE
a.objno = 19551;
Is there any way I can write a loop on above SQL with objno feeding from a list or from a different table? I also want to store/produce all the results from each loop run as a single output.
I considered the option to upload the numbers into a new table and add a where condition:
a.objno=(SELECT newtab.objectno FROM newtab);
However, the logic I'll be writing in the query would exclude certain objectno results. Let's say that the associated objectno has certain queue_comment as of certain date associated with that objectno. I do not want to pull that record. This condition would match with some objectno and wouldn't match with others. Having that condition and running the query against all the objectno is returning 0 results. I couldn't share the original logic as it would reveal certain business rules and it'll be a violation of some policy.
So, I need to run the query on each objectno separately and combine the results.
I'm totally new to SQL and got this task assigned. I'm aware of the regular loop, for in SQL, but I don't think I can apply them in this situation.
Any guidance or reference links to helpful topics is much appreciated as well.
Thanks in advance for the help.
One option is to upload the object numbers from Excel sheet to a table in the database and run the query as following. Assuming newtab is the table where the objectno are uploaded.
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a JOIN AQ$_QUEUES b on a.objno = b.table_objno
WHERE
a.objno IN (SELECT newtab.objectno FROM newtab);
I have used a subquery here, join to the aq$ can work as well.
Reading the comments and all I think you need to enhance your Excel with 2 additional columns and load to a new table.
IN can be used in the following way too:
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a
JOIN
AQ$_QUEUES b ON a.objno = b.table_objno
WHERE
(a.objno,a.table_comment,b.queue_comment) IN (19551,'something','something');
so with the new table will be:
WHERE
(a.objno,a.table_comment,b.queue_comment) IN
(select n.objno, n.table_comment, n.queue_comment from new_table n)
I have a local access database and in it a query which takes values from a form to populate a drop down menu. The weird (to me) thing is that with most options this query is quick (blink of an eye), but with a few options it's very slow (>10 seconds).
What the query is does is a follows: It populates a dropdown menu to record animals seen at a specific sighting, but only those animals which have not been recorded at that specific sighting yet (to avoid duplicate entries).
SELECT DISTINCT tblAnimals.AnimalID, tblAnimals.Nickname, tblAnimals.Species
FROM tblSightings INNER JOIN (tblAnimals INNER JOIN tblAnimalsatSighting ON tblAnimals.AnimalID = tblAnimalsatSighting.AnimalID) ON tblSightings.SightingID = tblAnimalsatSighting.SightingID
WHERE (((tblAnimals.Species)=[form]![Species]) AND ((tblAnimals.CurrentGroup)=[form]![AnimalGroup2]) AND ((tblAnimals.[Dead?])=False) AND ((Exists (select tblAnimalsatSighting.AnimalID FROM tblAnimalsatSighting WHERE tblAnimals.AnimalID = tblAnimalsatSighting.AnimalID AND tblAnimalsatSighting.SightingID = [form]![SightingID]))=False));
It performs well for all groups of 2 of the 4 possible species, for 1 species it performs well for 4 of the 5 groups, but not for the last group, and for the last species it performs very slowly for both groups. Anybody an idea what can be the cause of this kind of behavior? Is it problems with the query? Or duplicate entries in the tables which can cause this? I don't think it's duplicates in the tables, I've checked that, and there are some, but they appear both for groups where there are problems and where there aren't. Could I re-write the query so it performs faster?
As noted in our comments above, you confirmed that the extra joins were not really need and were in fact going to limit the results to animal that had already had a sighting. Those joins would also likely contribute to a slowdown.
I know that Access probably added most of the parentheses automatically but I've removed them and converted the subquery to a not exists form that's a lot more readable.
SELECT tblAnimals.AnimalID, tblAnimals.Nickname, tblAnimals.Species
FROM tblAnimals
WHERE
tblAnimals.Species = [form]![Species]
AND tblAnimals.CurrentGroup = [form]![AnimalGroup2]
AND tblAnimals.[Dead?] = False
AND NOT EXISTS (
SELECT tblAnimalsatSighting.AnimalID
FROM tblAnimalsatSighting
WHERE
tblAnimals.AnimalID = tblAnimalsatSighting.AnimalID
AND tblAnimalsatSighting.SightingID = [form]![SightingID]
);
I have one maybe stupid question.
Look at the query :
select count(a) as A, count(b) as b, count(a)+count(b) as C
From X
How can I sum up the two columns without repeating the code:
Something like:
select count(a) as A, count(b) as b, A+B as C
From X
For the sake of completeness, using a CTE:
WITH V AS (
SELECT COUNT(a) as A, COUNT(b) as B
FROM X
)
SELECT A, B, A + B as C
FROM V
This can easily be handled by making the engine perform only two aggregate functions and a scalar computation. Try this.
SELECT A, B, A + B as C
FROM (
SELECT COUNT(a) as A, COUNT(b) as B
FROM X
) T
You may get the two individual counts of a same table and then get the summation of those counts, like bellow
SELECT
(SELECT COUNT(a) FROM X )+
(SELECT COUNT(b) FROM X )
AS C
Let's agree on one point: SQL is not an Object-Oriented language. In fact, when we think of computer languages, we are thinking of procedural languages (you use the language to describe step by step how you want the data to be manipulated). SQL is declarative (you describe the desired result and the system works out how to get it).
When you program in a procedural languages your main concerns are: 1) is this the best algorithm to arrive at the correct result? and 2) do these steps correctly implement the algorithm?
When you program in a declarative language your main concern is: is this the best description of the desired result?
In SQL, most of your effort will be going into correctly forming the filtering criteria (the where clause) and the join criteria (any on clauses). Once that is done correctly, you're pretty much just down to aggregating and formating (if applicable).
The first query you show is perfectly formed. You want the number of all the non-null values in A, the number of all the non-null values in B, and the total of both of those amounts. In some systems, you can even use the second form you show, which does nothing more than abstract away the count(x) text. This is convenient in that if you should have to change a count(x) to sum(x), you only have to make a change in one place rather than two, but it doesn't change the description of the data -- and that is important.
Using a CTE or nested query may allow you to mimic the abstraction not available in some systems, but be careful making cosmetic changes -- changes that do not alter the description of the data. If you look at the execution plan of the two queries as you show them, the CTE and the subquery, in most systems they will probably all be identical. In other words, you've painted your car a different color, but it's still the same car.
But since it now takes you two distinct steps in 4 or 5 lines to explain what it originally took only one step in one line to express, it's rather difficult to defend the notion that you have made an improvement. In fact, I'll bet you can come up with a lot more bullet points explaining why it would be better if you had started with the CTE or subquery and should change them to your original query than the other way around.
I'm not saying that what you are doing is wrong. But in the real world, we are generally short of the spare time to spend on strictly cosmetic changes.
I would like to form a query where an associated collection has been
restricted, ideally with Hibernate Criteria or HQL, but I'd be
interested in how to do this in SQL. For example, say I have a Boy
class with a bidirectional one-to-many association to the Kites class.
I want to get a List of the Boys whose kites' lengths are in a range.
The problem is that the HQL/Criteria I know only gets me Boy objects
with a complete (unrestricted) Set of Kites, as given in these two
examples (the HQL is a guess). I.e., I get the Boys who have Kites
in the right range, but for each such Boy I get all of the Kites, not
just the ones in the range.
select new Boy(name) from Boy b
inner join Kite on Boy.id=Kite.boyId
where b.name = "Huck" and length >= 1;
Criteria crit = session.createCriteria(Boy.class);
crit.add(Restrictions.eq("name", "Huck"))
.createCriteria("kites")
.add(Restrictions.ge("length", new BigDecimal(1.0)));
List list = crit.list();
Right now the only way I have to get the correct Kite length Sets is
to iterate through the list of Boys and for each one re-query Kites
for the ones in the range. I'm hoping some SQL/HQL/Criteria wizard
knows a better way. I'd prefer to get a Criteria solution because my
real "Boy" constructor has quite a few arguments and it would be handy
to have the initialized Boys.
My underlying database is MySQL. Do not assume that I know much about
SQL or Hibernate. Thanks!
I'm no hibernate expert, but as you say you're interested in the SQL solution as well...:
In SQL, I assume you mean something like (with the addition of indices, keys, etc):
CREATE TABLE Boys (Id INT, Name VARCHAR(16))
CREATE TABLE Kites(Length FLOAT, BoyID INT, Description TEXT)
plus of course other columns &c that don't matter here.
All boys owning 1+ kites with lenghts between 1.0 and 1.5:
SELECT DISTINCT Boys.*
FROM Boys
JOIN Kites ON(Kites.BoyID=Boys.ID AND Kites.Length BETWEEN 1.0 AND 1.5)
If you also want to see the relevant kites' description, with N rows per boy owning N such kites:
SELECT Boys.*, Kites.Length, Kites.Description
FROM Boys
JOIN Kites ON(Kites.BoyID=Boys.ID AND Kites.Length BETWEEN 1.0 AND 1.5)
Hope somebody can help you integrate these with hybernate...!
It turns out that this is best done by reversing the join:
Criteria crit = session.createCriteria(Kite.class);
crit.add(Restrictions.ge("length", new BigDecimal(1.0))
.createCriteria("boy")
.add(Restrictions.eq("name", "Huck")));
List<Kite> list = crit.list();
Note that the list's Kites need to be aggregated into Boys, this
can be done easily with a HashMap.
Is there a direct LINQ syntax for finding the members of set A that are absent from set B? In SQL I would write this
SELECT A.* FROM A LEFT JOIN B ON A.ID = B.ID WHERE B.ID IS NULL
See the MSDN documentation on the Except operator.
var results = from itemA in A
where !B.Any(itemB => itemB.Id == itemA.Id)
select itemA;
I believe your LINQ would be something like the following.
var items = A.Except(
from itemA in A
from itemB in B
where itemA.ID == itemB.ID
select itemA);
Update
As indicated by Maslow in the comments, this may well not be the most performant query. As with any code, it is important to carry out some level of profiling to remove bottlenecks and inefficient algorithms. In this case, chaowman's answer provides a better performing result.
The reasons can be seen with a little examination of the queries. In the example I provided, there are at least two loops over the A collection - 1 to combine the A and B list, and the other to perform the Except operation - whereas in chaowman's answer (reproduced below), the A collection is only iterated once.
// chaowman's solution only iterates A once and partially iterates B
var results = from itemA in A
where !B.Any(itemB => itemB.Id == itemA.Id)
select itemA;
Also, in my answer, the B collection is iterated in its entirety for every item in A, whereas in chaowman's answer, it is only iterated upto the point at which a match is found.
As you can see, even before looking at the SQL generated, you can spot potential performance issues just from the query itself. Thanks again to Maslow for highlighting this.