Query for restricting associated entities - sql
I would like to form a query where an associated collection has been
restricted, ideally with Hibernate Criteria or HQL, but I'd be
interested in how to do this in SQL. For example, say I have a Boy
class with a bidirectional one-to-many association to the Kite class.
I want to get a List of the Boys whose kites' lengths are in a range.
The problem is that the HQL/Criteria I know only gets me Boy objects
with a complete (unrestricted) Set of Kites, as given in these two
examples (the HQL is a guess). I.e., I get the Boys who have Kites
in the right range, but for each such Boy I get all of the Kites, not
just the ones in the range.
select new Boy(b.name) from Boy b
inner join b.kites k
where b.name = 'Huck' and k.length >= 1
Criteria crit = session.createCriteria(Boy.class);
crit.add(Restrictions.eq("name", "Huck"))
.createCriteria("kites")
.add(Restrictions.ge("length", new BigDecimal(1.0)));
List list = crit.list();
Right now the only way I have to get the correct Kite length Sets is
to iterate through the list of Boys and for each one re-query Kites
for the ones in the range. I'm hoping some SQL/HQL/Criteria wizard
knows a better way. I'd prefer to get a Criteria solution because my
real "Boy" constructor has quite a few arguments and it would be handy
to have the initialized Boys.
My underlying database is MySQL. Do not assume that I know much about
SQL or Hibernate. Thanks!
I'm no Hibernate expert, but since you say you're interested in the SQL solution as well:
In SQL, I assume you mean something like this (with the addition of indices, keys, etc.):
CREATE TABLE Boys (Id INT, Name VARCHAR(16));
CREATE TABLE Kites(Length FLOAT, BoyID INT, Description TEXT);
plus of course other columns etc. that don't matter here.
All boys owning 1+ kites with lengths between 1.0 and 1.5:
SELECT DISTINCT Boys.*
FROM Boys
JOIN Kites ON(Kites.BoyID=Boys.ID AND Kites.Length BETWEEN 1.0 AND 1.5)
If you also want to see the relevant kites' description, with N rows per boy owning N such kites:
SELECT Boys.*, Kites.Length, Kites.Description
FROM Boys
JOIN Kites ON(Kites.BoyID=Boys.ID AND Kites.Length BETWEEN 1.0 AND 1.5)
Hope somebody can help you integrate these with Hibernate!
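The two queries above can be tried out without a MySQL server. The following is a minimal sketch that runs them against an in-memory SQLite database from Python; the sample boys and kites are invented for illustration, and SQLite column types stand in for the MySQL ones.

```python
import sqlite3

# In-memory database with made-up sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boys (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Kites (Length REAL, BoyID INTEGER, Description TEXT);
    INSERT INTO Boys VALUES (1, 'Huck'), (2, 'Tom'), (3, 'Jim');
    INSERT INTO Kites VALUES
        (1.2, 1, 'red diamond'),
        (3.0, 1, 'big box kite'),
        (1.5, 2, 'blue delta'),
        (0.5, 3, 'tiny sled');
""")

# Boys owning at least one kite in the range, one row per boy.
boys = conn.execute("""
    SELECT DISTINCT Boys.Id, Boys.Name
    FROM Boys
    JOIN Kites ON Kites.BoyID = Boys.Id AND Kites.Length BETWEEN 1.0 AND 1.5
    ORDER BY Boys.Id
""").fetchall()

# The same join without DISTINCT: one row per matching kite,
# so only the kites inside the range come back.
kites = conn.execute("""
    SELECT Boys.Name, Kites.Length, Kites.Description
    FROM Boys
    JOIN Kites ON Kites.BoyID = Boys.Id AND Kites.Length BETWEEN 1.0 AND 1.5
    ORDER BY Boys.Id, Kites.Length
""").fetchall()

print(boys)
print(kites)
```

Huck's 3.0 kite is filtered out by the join condition, which is the restriction the question asks for.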
It turns out that this is best done by reversing the join:
Criteria crit = session.createCriteria(Kite.class);
crit.add(Restrictions.ge("length", new BigDecimal(1.0)))
.createCriteria("boy")
.add(Restrictions.eq("name", "Huck"));
List<Kite> list = crit.list();
Note that the Kites in the list need to be aggregated into Boys; this
can be done easily with a HashMap keyed on the Boy.
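The aggregation step might look roughly like the sketch below. It is written in Python, with a dict playing the role of the HashMap; the Kite class and its boy_name field are stand-ins invented here for the Hibernate entity and kite.getBoy().getName().

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Kite:
    boy_name: str   # hypothetical stand-in for kite.getBoy().getName()
    length: float


# Pretend this is the List<Kite> returned by the reversed query.
kites = [Kite("Huck", 1.2), Kite("Huck", 2.5), Kite("Tom", 1.0)]

# Group the kites under their owner; each boy ends up with only
# the kites that passed the length restriction.
kites_by_boy = defaultdict(list)
for kite in kites:
    kites_by_boy[kite.boy_name].append(kite)

for boy_name, owned in kites_by_boy.items():
    print(boy_name, [k.length for k in owned])
```

In Java the same loop would typically fill a Map<Boy, List<Kite>>, which requires Boy to have sensible equals() and hashCode() implementations.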
Related
How can I assign pre-determined codes (1,2,3, etc,) to a JSON-type column in PostgreSQL?
I'm extracting a table of 2000+ rows of park details. One of the columns is of JSON type.
[Image of the table]
We have about 15 attributes like this, and we also have documentation of the pre-determined codes assigned to each attribute. Each row in the extracted table has a different set of attributes, as you can see in the image.
Right now, I use
cast(parks.services AS text) AS "details"
to get all the attributes for a particular park, or extract just one of them using the code below:
CASE
    WHEN cast(parks.services AS text) LIKE '%uncovered%' THEN '2'
    WHEN cast(parks.services AS text) LIKE '%{covered%' THEN '1'
END AS "details"
This time around, I need to extract these attributes by assigning them the codes. As an example, let's just say
Park 1 - {covered, handicap_access, elevator} to be {1,3,7}
Park 2 - {uncovered, always_open, handicap_access} to be {2,5,3}
I have thought of using a subquery to pre-assign the codes, but I cannot wrap my head around JSON operators; in fact, I don't know how to extract them on 2000+ rows. It would be helpful if someone could guide me on this topic. Thanks a lot!
You should really think about normalizing your tables. Don't store arrays. You should add a mapping table to map the parks to the attribute codes. This makes everything much easier and more performant.
Step-by-step demo: db<>fiddle
SELECT
    t.name,
    array_agg(c.code ORDER BY elems.index) as codes            -- 3
FROM mytable t,
    unnest(attributes) WITH ORDINALITY as elems(value, index)  -- 1
JOIN codes c ON c.name = elems.value                           -- 2
GROUP BY t.name
1. Extract the array elements into one record per element. Add WITH ORDINALITY to save the original order.
2. Join your codes on the elements.
3. Create the code arrays. To ensure the correct order, you can use the index values created by the WITH ORDINALITY clause.
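The normalized layout recommended above can be sketched as follows. This is an illustration only, run against in-memory SQLite from Python; the table names park_attribute and codes, the position column, and the codes for each attribute (taken from the example in the question) are assumptions of this sketch, not the real schema.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Lookup table: one row per documented attribute code.
    CREATE TABLE codes (name TEXT PRIMARY KEY, code INTEGER);
    -- Mapping table: one row per (park, attribute) instead of an array column.
    CREATE TABLE park_attribute (park TEXT, position INTEGER, attribute TEXT);

    INSERT INTO codes VALUES
        ('covered', 1), ('uncovered', 2), ('handicap_access', 3),
        ('always_open', 5), ('elevator', 7);
    INSERT INTO park_attribute VALUES
        ('Park 1', 1, 'covered'), ('Park 1', 2, 'handicap_access'),
        ('Park 1', 3, 'elevator'),
        ('Park 2', 1, 'uncovered'), ('Park 2', 2, 'always_open'),
        ('Park 2', 3, 'handicap_access');
""")

# A plain join replaces the LIKE-based CASE expression.
rows = conn.execute("""
    SELECT pa.park, c.code
    FROM park_attribute pa
    JOIN codes c ON c.name = pa.attribute
    ORDER BY pa.park, pa.position
""").fetchall()

# Collect the codes per park, keeping the original attribute order.
codes_by_park = {
    park: [code for _, code in group]
    for park, group in groupby(rows, key=lambda row: row[0])
}
print(codes_by_park)
```

The position column does the job that WITH ORDINALITY does in the Postgres query: it records the original order of the attributes.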
Sorting with many to many relationship
I have three tables: person, person_speaks_language and language.
person has 80 records
language has 2 records
I have the following records:
the first 10 persons speak one language
the first 70 persons (including the first group) speak 2 languages
the last 10 persons don't speak any language
Following this example, I want to sort the persons by language. How can I do it correctly?
I'm trying to use the following SQL, but the result seems quite strange:
SELECT "person".*
FROM "person"
LEFT JOIN "person_speaks_language" ON "person"."id" = "person_speaks_language"."person_id"
LEFT JOIN "language" ON "person_speaks_language"."language_id" = "language"."id"
ORDER BY "language"."name" ASC
dataset
71,Catherine,Porter,male,NULL
72,Isabelle,Sharp,male,NULL
73,Scott,Chandler,male,NULL
74,Jean,Graham,male,NULL
75,Marc,Kennedy,male,NULL
76,Marion,Weaver,male,NULL
77,Melvin,Fitzgerald,male,NULL
78,Catherine,Guerrero,male,NULL
79,Linnie,Strickland,male,NULL
80,Ann,Henderson,male,NULL
11,Daniel,Boyd,female,English
12,Ora,Beck,female,English
13,Hulda,Lloyd,female,English
14,Jessie,McBride,female,English
15,Marguerite,Andrews,female,English
16,Maurice,Hamilton,female,English
17,Cecilia,Rhodes,female,English
18,Owen,Powers,female,English
19,Ivan,Butler,female,English
20,Rose,Bishop,female,English
21,Franklin,Mann,female,English
22,Martha,Hogan,female,English
23,Francis,Oliver,female,English
24,Catherine,Carlson,female,English
25,Rose,Sanchez,female,English
26,Danny,Bryant,female,English
27,Jim,Christensen,female,English
28,Eric,Banks,female,English
29,Tony,Dennis,female,English
30,Roy,Hoffman,female,English
31,Edgar,Hunter,female,English
32,Matilda,Gordon,female,English
33,Randall,Cruz,female,English
34,Allen,Brewer,female,English
35,Iva,Pittman,female,English
36,Garrett,Holland,female,English
37,Johnny,Russell,female,English
38,Nina,Richards,female,English
39,Mary,Ballard,female,English
40,Adrian,Sparks,female,English
41,Evelyn,Santos,female,English
42,Bess,Jackson,female,English
43,Nicholas,Love,female,English
44,Fred,Perkins,female,English
45,Cynthia,Dunn,female,English
46,Alan,Lamb,female,English
47,Ricardo,Sims,female,English
48,Rosie,Rogers,female,English
49,Susan,Sutton,female,English
50,Mary,Boone,female,English
51,Francis,Marshall,male,English
52,Carl,Olson,male,English
53,Mario,Becker,male,English
54,May,Hunt,male,English
55,Sophie,Neal,male,English
56,Frederick,Houston,male,English
57,Edwin,Allison,male,English
58,Florence,Wheeler,male,English
59,Julia,Rogers,male,English
60,Janie,Morgan,male,English
61,Louis,Hubbard,male,English
62,Lida,Wolfe,male,English
63,Alfred,Summers,male,English
64,Lina,Shaw,male,English
65,Landon,Carroll,male,English
66,Lilly,Harper,male,English
67,Lela,Gordon,male,English
68,Nina,Perry,male,English
69,Dean,Perez,male,English
70,Bertie,Hill,male,English
1,Nelle,Gill,female,Spanish
2,Lula,Wright,female,Spanish
3,Anthony,Jensen,female,Spanish
4,Rodney,Alvarez,female,Spanish
5,Scott,Holmes,female,Spanish
6,Daisy,Aguilar,female,Spanish
7,Elijah,Olson,female,Spanish
8,Alma,Henderson,female,Spanish
9,Willie,Barrett,female,Spanish
10,Ada,Huff,female,Spanish
11,Daniel,Boyd,female,Spanish
12,Ora,Beck,female,Spanish
13,Hulda,Lloyd,female,Spanish
14,Jessie,McBride,female,Spanish
15,Marguerite,Andrews,female,Spanish
16,Maurice,Hamilton,female,Spanish
17,Cecilia,Rhodes,female,Spanish
18,Owen,Powers,female,Spanish
19,Ivan,Butler,female,Spanish
20,Rose,Bishop,female,Spanish
21,Franklin,Mann,female,Spanish
22,Martha,Hogan,female,Spanish
23,Francis,Oliver,female,Spanish
24,Catherine,Carlson,female,Spanish
25,Rose,Sanchez,female,Spanish
26,Danny,Bryant,female,Spanish
27,Jim,Christensen,female,Spanish
28,Eric,Banks,female,Spanish
29,Tony,Dennis,female,Spanish
30,Roy,Hoffman,female,Spanish
31,Edgar,Hunter,female,Spanish
32,Matilda,Gordon,female,Spanish
33,Randall,Cruz,female,Spanish
34,Allen,Brewer,female,Spanish
35,Iva,Pittman,female,Spanish
36,Garrett,Holland,female,Spanish
37,Johnny,Russell,female,Spanish
38,Nina,Richards,female,Spanish
39,Mary,Ballard,female,Spanish
40,Adrian,Sparks,female,Spanish
41,Evelyn,Santos,female,Spanish
42,Bess,Jackson,female,Spanish
43,Nicholas,Love,female,Spanish
44,Fred,Perkins,female,Spanish
45,Cynthia,Dunn,female,Spanish
46,Alan,Lamb,female,Spanish
47,Ricardo,Sims,female,Spanish
48,Rosie,Rogers,female,Spanish
49,Susan,Sutton,female,Spanish
50,Mary,Boone,female,Spanish
51,Francis,Marshall,male,Spanish
52,Carl,Olson,male,Spanish
53,Mario,Becker,male,Spanish
54,May,Hunt,male,Spanish
55,Sophie,Neal,male,Spanish
56,Frederick,Houston,male,Spanish
57,Edwin,Allison,male,Spanish
58,Florence,Wheeler,male,Spanish
59,Julia,Rogers,male,Spanish
60,Janie,Morgan,male,Spanish
61,Louis,Hubbard,male,Spanish
62,Lida,Wolfe,male,Spanish
63,Alfred,Summers,male,Spanish
64,Lina,Shaw,male,Spanish
65,Landon,Carroll,male,Spanish
66,Lilly,Harper,male,Spanish
67,Lela,Gordon,male,Spanish
68,Nina,Perry,male,Spanish
69,Dean,Perez,male,Spanish
70,Bertie,Hill,male,Spanish
Update
The expected results are: each person must appear only one time, using the language order.
To explain the case further, I'll take a new, small dataset, using only the person id and the language name:
1,English
2,English
3,English
4,English
19,English
1,Spanish
2,Spanish
3,Spanish
4,Spanish
5,Spanish
14,Spanish
15,Spanish
16,Spanish
19,Spanish
21,Spanish
25,Spanish
I'm using the same order, but if I use a limit, for example LIMIT 8, the results will be
1,English
2,English
3,English
4,English
19,English
1,Spanish
2,Spanish
3,Spanish
And the expected result is
1,English
2,English
3,English
4,English
19,English
5,Spanish
14,Spanish
15,Spanish
What I'm trying to do
What I'm trying to do is sort, paginate and filter a list of X that may have a many-to-many relationship with Y; in this case X is a person and Y is the language. I need to do it in a general way. I run into trouble when I want to order the list by some property of Y.
The list will be shown this way:
firstname , lastname , gender , languages
Daniel , Boyd , female , English Spanish
Ora , Beck , female , English
Anthony , Jensen , female , Spanish
....
I only need to return an array with the IDs in the correct order. The main reason I need each person to appear only once in the results is that the ORM I'm using tries to hydrate each result, and if I paginate the results using offset and limit, the results may not be what I expect.
Assumptions I'm making:
the relationships are many-to-many
I can't use string_agg or group_concat because I don't know the real data; I don't know whether the values are integers or strings
If you want each person to appear only once, then you need to aggregate by that person. If you then want the list of languages, you need to combine them in some way; concatenation comes to mind.
The use of double quotes suggests Postgres or Oracle to me. Here is Postgres syntax for this:
SELECT p.id, string_agg(l.name, ', ') as languages
FROM person p
LEFT JOIN person_speaks_language psl ON p.id = psl.person_id
LEFT JOIN language l ON psl.language_id = l.id
GROUP BY p.id
ORDER BY COUNT(l.name) DESC, languages;
Similar functionality to string_agg() exists in most databases.
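To show how aggregating by person interacts with LIMIT, here is a small sketch run against in-memory SQLite from Python, using the "small dataset" from the question. It swaps string_agg for MIN(l.name) as the sort key, which is an assumption of this sketch and works for any sortable column type. Person 30, who speaks no language, is invented to exercise the LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY);
    CREATE TABLE language (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person_speaks_language (person_id INTEGER, language_id INTEGER);
    INSERT INTO language VALUES (1, 'English'), (2, 'Spanish');
""")

english = [1, 2, 3, 4, 19]
spanish = [1, 2, 3, 4, 5, 14, 15, 16, 19, 21, 25]
people = sorted(set(english + spanish)) + [30]  # 30 speaks no language (invented)

conn.executemany("INSERT INTO person VALUES (?)", [(i,) for i in people])
conn.executemany("INSERT INTO person_speaks_language VALUES (?, 1)",
                 [(i,) for i in english])
conn.executemany("INSERT INTO person_speaks_language VALUES (?, 2)",
                 [(i,) for i in spanish])


def page(limit, offset=0):
    """Return one row per person, ordered by the person's first language."""
    return conn.execute("""
        SELECT p.id, MIN(l.name) AS first_language
        FROM person p
        LEFT JOIN person_speaks_language psl ON p.id = psl.person_id
        LEFT JOIN language l ON psl.language_id = l.id
        GROUP BY p.id
        ORDER BY MIN(l.name) IS NULL,  -- people without a language go last
                 MIN(l.name),
                 p.id
        LIMIT ? OFFSET ?
    """, (limit, offset)).fetchall()


first_page = page(8)
print(first_page)
```

Because the grouping happens before LIMIT is applied, each page contains whole persons rather than person-language rows, so the ORM never sees the same id twice.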
There is nothing wrong with Bertie Hill appearing in two rows, with one language each; that is the Tabular View of Data per the Relational Model. There are no dependencies on data values or the number of data values. It is completely correct and un-confused.
But here, the requirement is confused, because you really want three separate lists:
speaks one language
speaks two languages [or the number of languages currently in the language file]
speaks no language [on file]
... But you want those three lists in one list.
Concatenating data values is never, ever a good idea. It is a breach of rudimentary standards, specifically 1NF. It may be common, but it is a gross error. It may be taught by the so-called "theoreticians", but it remains a gross error. Even in a result set, yes.
It creates confusion, such as I have detailed at the top.
With concatenated strings, as the number of languages changes, the width of that concatenated field will grow, and eventually exceed the space wherever it appears (e.g. the width of the field on the screen).
These are just two of the many reasons why it is incorrect, not expandable, sub-standard.
By the way, in your "dataset" (it isn't the result set produced by your code), the sexes appear to be nicely mixed up.
Therefore the answer, and the only correct one, even if it isn't popular, is that your code is correct (it can be cleaned up, sure), and you have to educate the user re the dangers of sub-standard code or reports.
You can sort by person.name (rather than by language.name) and then write smarter SQL such that (e.g.) the person.name is not repeated on the second and subsequent rows for persons who speak more than one language, etc. That is just pretty printing.
The non-answer, for those who insist on sub-standard code that will break one day, is Gordon's response.
Response to Comments
In the Relational Model:
There is no order to the rows; that is deemed a physical or implementation aspect, which we have no control over, which changes anyway, and which we are warned not to rely upon. If order is sought in the output result set, then we must use ORDER BY; that is its purpose in life.
The data has meaning, and that meaning is carried in Relational Keys. Meaning cannot be carried in surrogates (i.e. ID columns).
Limiting myself to the files (they are not tables) that you have given, there is no such thing in the data as:
the first 10 persons who speaks one language
Obtaining persons who speak one language is simple, I believe you already understand that:
SELECT P.first_name,
       P.last_name
    FROM person P,
        (SELECT person_id
            FROM person_speaks_language
            GROUP BY person_id
            HAVING COUNT(*) = 1 -- change this for 2 languages, etc
        ) AS PL
    WHERE P.id = PL.person_id
But "first"? "First" by what criteria? Record creation date?
    ORDER BY date_created -- if it exists in the data
Record ID does not give first anything: as records are added and deleted, any "order" that may exist initially is completely lost.
You cannot extract meaning out of, or assign meaning to, something that, by definition, has no meaning. If the Record ID is relevant, i.e. you are going to use it for some purpose, then it is not a Record ID; name the field for what it actually is.
I fail to see the relevance of the difference between the "dataset" and the updated "small dataset". The "dataset" size is irrelevant, the field headings are irrelevant; what the result set means is relevant.
The problem is not some "limitation" in the Relational Model; the problem is (a) your fixed view of data values, and (b) your lack of understanding about what the Relational Model is and what it does, an understanding of which makes this whole question disappear, and we are left with a simple SQL (as tagged) "how to" question.
E.g. if I had a Relational Database, with persons and languages, with no ID columns, there is nothing that I cannot do with it, no report that I cannot produce from it, from the data.
Please try to use an example that conveys the meaning in the data, in what you are trying to do.
the expect results are: each person must be appear only one time
They already appear only once (for each language).
using the language order
Well, there is no order in the language file. We can give it some order, whatever order is meaningful to you, in the result set, based on the data. E.g. language.name. Of course, many persons speak each language, so what order would you like within language.name? How about last_name, first_name. The Record IDs are meaningless to the user, so I won't display them in the result set. NULL is also meaningless, and ambiguous, so I will make the meaning here explicit. This is pretty much what you have, tidied up:
SELECT  [language] = CASE
            WHEN name IS NULL THEN '[None]'
            ELSE name
            END,
        last_name,
        first_name
    FROM person P
        LEFT JOIN person_speaks_language PL
            ON P.id = PL.person_id
        LEFT JOIN language L
            ON PL.language_id = L.id
    ORDER BY name,
        last_name,
        first_name
But then you have:
And the expected result is
The example data of which contradicts your textual descriptions:
the expect results are: each person must be appear only one time using the language order
So now, if I ignore the text, and examine the example data re what you want (which is a horrible thing to do, because I am joining you in the incorrect activity of focussing on the data values, rather than understanding the meaning), it appears you want the person to appear only once, full stop, regardless of how many languages they speak. Your example data is meaningless, so I cannot be asked to reproduce it. See if this has some meaning.
SELECT  last_name,
        first_name,
        [language] = COALESCE( (        -- make meaning of null explicit
            SELECT  TOP 1               -- correlated subquery: get the "first" language
                    name
                FROM person_speaks_language PL
                    JOIN language L
                        ON PL.language_id = L.id
                WHERE P.id = PL.person_id   -- the subject person
                ORDER BY name               -- id would be meaningless
            ), '[None]' )
    FROM person P                       -- vector for person, once
    ORDER BY last_name,
        first_name
Now if you wanted only persons who speak a language (on file):
SELECT  last_name,
        first_name,
        [language] = (                  -- correlated subquery
            SELECT  TOP 1               -- get the "first" language
                    name
                FROM person_speaks_language PL
                    JOIN language L
                        ON PL.language_id = L.id
                WHERE P.id = PL.person_id   -- the subject person
                ORDER BY name               -- id would be meaningless
            )
    FROM person P,
        (
        SELECT DISTINCT person_id       -- just one occ, thanks
            FROM person_speaks_language PL  -- vector for speakers
        ) AS PL_1
    WHERE P.id = PL_1.person_id         -- join them to person fields
There, not an outer join anywhere to be seen, in either solution. LEFT or RIGHT will confuse you. Do not attempt to "get everything", so that you can "see" the data values, and then mangle, hack and chop away at the result set, in order to get what you want from that. No, forget about the data values and get only what you want from the record filing system.
Response to Update
I was trying to explain the case with a data set, I think I made things tougher than they actually were
Yes, you did. Reviewing the update then ...
The short answer is, get rid of the ORM. There is nothing in it of value: you can access the RDB directly from the queries that populate your objects. The way we did for decades before the flatulent beast came along. Especially if you understand and implement Open Architecture Standards.
Further, as evidenced, it creates masses of problems. Here, you are trying to work around the insane restrictions of the ORM. Pagination is a straightforward issue, if you have your data Normalised, and Relational Keys.
The long answer is ... please read this Answer. I trust you will understand that the approach you take to designing your app components, your design of windows, will change. All your queries will be simplified; you get only what you require for the specific window or object. The problem may well disappear entirely (except possibly for the pagination; you might need a method for that). Then please think about those architectural issues carefully, and make specific comments or questions.
Matching observations based on similarity of categorical variables
I was wondering if someone has a good way to match two observations based on categorical (non-ordinal) variables.
The exercise I am working on is matching mentees with mentors based on interests and other characteristics that are (non-ordinal or ordinal) categorical variables.
Variable          Possible Values
Sport             “Baseball”, “Football”, “Basketball” (…)
Marital Status    “Single, no kids”, “Single, young kids”, “Married, no kids”, “Married, young kids”, (…)
Job Level         1, 2, 3, 4, 5, 6
Industry          “Retail”, “Finance”, “Wholesale”, (…)
There are also indicators of whether each variable is important to the person.
I understand I could force marital status into one or two ordinal variables like (“Single”, “Married”, “Widow”) and (“no kids”, “young kids”, “grown kids”). But I don’t know how to handle industry and sport, as there is no logical order to them.
My plan was originally to use a clustering technique to find a match between the mentor and the mentee set based on the shortest distance between the given points. But that would ignore the fact that people can decide whether a variable is important to them or not (“Yes”, “No”).
Now I am thinking of just brute-forcing the logic with nested IF statements that check whether there is a perfect match based on the importance and the actual values, ELSE check whether there is a record that matches on all categories but one, etc. This seems very inefficient, so I was hoping that someone who has come across a similar problem could suggest a better way to handle this.
Would it make sense to create two variables, one for the importance sequence (e.g. "YesNoYesNoNo") and one for the interests (e.g. "BasketballSingleNokids6Retail"), and then employ fuzzy matching?
One approach would be to decide first on which variables you must have an exact match, join the two tables on those, then generate a score based on the other, non-mandatory matches and output the records where the score exceeds a threshold. The more mandatory matches you require, the better the query will perform.
E.g.
%let MATCH_THRESHOLD = 2; /*At least this many optional variables must match*/
proc sql;
  create table matches as
  select * from
    mentors a
    inner join mentees b
    /*Mandatory matches*/
    on a.m_var1 = b.m_var1
    and a.m_var2 = b.m_var2
    and ...
    /*Optional threshold-based matches*/
    where (a.o_var1 = b.o_var1) + (a.o_var2 = b.o_var2) + ... >= &MATCH_THRESHOLD;
quit;
Going further: if you have inconsistently entered data, you could use soundex or edit-distance matching rather than exact matching for the optional conditions. If some optional matches are worth more than others, you can weight their contribution to the score.
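The same scoring idea, including the optional weighting, can be sketched outside SAS. The Python below is an illustration under assumptions: the function names, the dictionary layout and the sample mentors and mentees are all invented, and it does a plain nested loop rather than a database join.

```python
def match_score(mentor, mentee, optional, weights=None):
    """Sum the weights of the optional variables on which both agree."""
    weights = weights or {}
    return sum(weights.get(var, 1)
               for var in optional
               if mentor[var] == mentee[var])


def find_matches(mentors, mentees, mandatory, optional, threshold, weights=None):
    """Pair people who agree on every mandatory variable and score enough."""
    matches = []
    for mentor in mentors:
        for mentee in mentees:
            if not all(mentor[v] == mentee[v] for v in mandatory):
                continue  # a mandatory variable differs: never a match
            score = match_score(mentor, mentee, optional, weights)
            if score >= threshold:
                matches.append((mentor["name"], mentee["name"], score))
    # Best-scoring pairs first; ties keep their original order.
    return sorted(matches, key=lambda m: -m[2])


# Invented sample data using the variables from the question.
mentors = [
    {"name": "Ann", "industry": "Retail", "sport": "Baseball",
     "marital": "Single, no kids", "job_level": 4},
    {"name": "Bob", "industry": "Finance", "sport": "Football",
     "marital": "Married, young kids", "job_level": 5},
]
mentees = [
    {"name": "Cid", "industry": "Retail", "sport": "Baseball",
     "marital": "Single, no kids", "job_level": 2},
    {"name": "Dee", "industry": "Retail", "sport": "Football",
     "marital": "Married, no kids", "job_level": 4},
]

optional = ["sport", "marital", "job_level"]
plain = find_matches(mentors, mentees, ["industry"], optional, threshold=2)
weighted = find_matches(mentors, mentees, ["industry"], optional, threshold=2,
                        weights={"job_level": 2})
print(plain)
print(weighted)
```

The per-person importance flags from the question could be folded in by giving a variable a weight of zero, or by moving it into the mandatory list, for the people who marked it accordingly.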
Filtering simultaneously on count of related objects and on count of related objects that satisfy a condition in Django
So I have models amounting to this (very simplified, obviously):
class Mystery(models.Model):
    name = models.CharField(max_length=100)

class Character(models.Model):
    mystery = models.ForeignKey(Mystery, related_name="characters")
    required = models.BooleanField(default=True)
Basically, in each mystery there are a number of characters, which can be essential to the story or not. The minimum number of actors that can stage a mystery is the number of required characters for that mystery; the maximum number is the number of characters total for the mystery.
Now I'm trying to query for mysteries that can be played by some given number of actors. It seemed straightforward enough using the way Django's filtering and annotation features function; after all, both of these queries work fine:
# Returns mystery objects with at least x characters in all
Mystery.objects.annotate(max_actors=Count('characters', distinct=True)).filter(max_actors__gte=x)
# Returns mystery objects with no more than x required characters
Mystery.objects.filter(characters__required=True).annotate(min_actors=Count('characters', distinct=True)).filter(min_actors__lte=x)
However, when I try to combine the two...
Mystery.objects.annotate(max_actors=Count('characters', distinct=True)).filter(characters__required=True).annotate(min_actors=Count('characters', distinct=True)).filter(min_actors__lte=x, max_actors__gte=x)
...it doesn't work. Both min_actors and max_actors come out containing the maximum number of actors.
The relevant parts of the actual query being run look like this:
SELECT `mysteries_mystery`.`id`,
       `mysteries_mystery`.`name`,
       COUNT(DISTINCT `mysteries_character`.`id`) AS `max_actors`,
       COUNT(DISTINCT `mysteries_character`.`id`) AS `min_actors`
FROM `mysteries_mystery`
LEFT OUTER JOIN `mysteries_character` ON (`mysteries_mystery`.`id` = `mysteries_character`.`mystery_id`)
INNER JOIN `mysteries_character` T5 ON (`mysteries_mystery`.`id` = T5.`mystery_id`)
WHERE T5.`required` = True
GROUP BY `mysteries_mystery`.`id`, `mysteries_mystery`.`name`
...which makes it clear that while Django is creating a second join on the character table just fine (the second copy of the table being aliased to T5), that table isn't actually being used anywhere and both of the counts are being selected from the non-aliased version, which obviously yields the same result both times. Even when I try to use an extra clause to select from T5, I get told there is no such table as T5, even as examining the output query shows that it's still aliasing the second character table to T5.
Another attempt to do this with extra clauses went like this:
Mystery.objects.annotate(max_actors=Count('characters', distinct=True)).extra(select={'min_actors': "SELECT COUNT(*) FROM mysteries_character WHERE required = True AND mystery_id = mysteries_mystery.id"}).extra(where=["`min_actors` <= %s", "`max_actors` >= %s"], params=[x, x])
But that didn't work because I can't use a calculated field in the WHERE clause, at least on MySQL. If only I could use HAVING, but alas, Django's .extra() does not and will never allow you to set HAVING parameters.
Is there any way to get Django's ORM to do what I want?
How about combining your Count()s:
Mystery.objects.annotate(max_actors=Count('characters', distinct=True), min_actors=Count('characters', distinct=True)).filter(characters__required=True).filter(min_actors__lte=x, max_actors__gte=x)
This seems to work for me, but I didn't test it with your exact models.
It's been a couple of weeks with no suggested solutions, so here's how I ended up going about it, for anyone else who might be looking for an answer:
Mystery.objects.annotate(max_actors=Count('characters', distinct=True)).filter(max_actors__gte=x, id__in=Mystery.objects.filter(characters__required=True).annotate(min_actors=Count('characters', distinct=True)).filter(min_actors__lte=x).values('id'))
In other words, filter on the first count and on IDs that match those in an explicit subquery that filters on the second count. Kind of clunky, but it works well enough for my purposes.
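For comparison, the result being asked for can be expressed in plain SQL with a single join, a conditional sum and HAVING. The sketch below runs that SQL against in-memory SQLite from Python rather than through Django's ORM. The table names follow the query shown in the question; the sample mysteries are invented. Unlike the filter on characters__required=True, it also keeps mysteries with no required characters at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mysteries_mystery (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE mysteries_character (
        id INTEGER PRIMARY KEY, mystery_id INTEGER, required INTEGER);
    INSERT INTO mysteries_mystery VALUES
        (1, 'Small cast'), (2, 'Flexible cast'), (3, 'Large cast');
    INSERT INTO mysteries_character (mystery_id, required) VALUES
        (1, 1), (1, 1),                          -- 2 required, 2 total
        (2, 1), (2, 1), (2, 0), (2, 0),          -- 2 required, 4 total
        (3, 1), (3, 1), (3, 1), (3, 1), (3, 1);  -- 5 required, 5 total
""")


def playable_by(actors):
    """Names of mysteries with required <= actors <= total characters."""
    rows = conn.execute("""
        SELECT m.name
        FROM mysteries_mystery m
        LEFT JOIN mysteries_character c ON c.mystery_id = m.id
        GROUP BY m.id, m.name
        HAVING COUNT(c.id) >= ?                    -- max_actors
           AND COALESCE(SUM(c.required), 0) <= ?   -- min_actors
        ORDER BY m.id
    """, (actors, actors)).fetchall()
    return [name for (name,) in rows]


print(playable_by(3))
```

Because required is stored as 0 or 1, SUM(c.required) counts only the required characters while COUNT(c.id) counts all of them, so both numbers come from one pass over the same join.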
Retrieving object data from multiple tables help
Sorry if I'm not wording this too well, but let me try to explain what I'm doing. I have a main object of class A that has multiple objects of classes B, C, D and E, such that:
Class ObjectA {
    ObjectB[] myObjectBs;
    ObjectC[] myObjectCs;
    ObjectD[] myObjectDs;
    ObjectE[] myObjectEs;
}
where the A---B mapping is 1 to many, and likewise for C, D and E. That is, all B, C, D and E objects are associated with only one object A.
I'm storing the data for all these objects in a database, with Table A holding all the data for the instances of Class A, etc.
Now, when getting the data for this at run time on the fly, I'm running 5 different queries for each object (very simplified pseudocode):
objectA=sql("select * from tableA where id=#id#");
objectA.setObjectBs(sql("select * from tableB where a_id=#id#"));
objectA.setObjectCs(sql("select * from tableC where a_id=#id#"));
objectA.setObjectDs(sql("select * from tableD where a_id=#id#"));
objectA.setObjectEs(sql("select * from tableE where a_id=#id#"));
if that makes sense.
Now, I'm wondering, is this the most efficient way of doing it? I feel like there should be a way to get all this info in 1 query, but doing something like
"select * from a,b,c,d,e where a.id = #id# and b.a_id = #id# and c.a_id = #id# and d.a_id = #id# and e.a_id = #id#"
will give a result set with all the columns of A, B, C, D, E for each row, and there will be many, many more rows than I need.
If there was only one array of objects (like just ObjectBs), it could be done with a simple join and then handled by my database framework.
If the relationships were A(one)....B(many) and B(one)....C(many), it could be done with two joins and work.
But for A(one)....B(many) and A(one)....C(many) etc., I can't think of a good way to do joins or return this data without having too many rows: with joins, if A has 10 Bs and 10 Cs, it'll return 100 rows rather than 20.
So, is the way I'm currently doing it, with 5 different selects, the most efficient (which it seems like it's not), or is there a better way of doing it?
Also, if I were to grab a large set of these at once (say, 5000 ObjectAs and all the associated Bs, Cs, Ds, and Es), would there be a way to do it without running a ton of consecutive queries one after the other?
You can try iBatis using N+1 Select Lists http://ibatis.apache.org/docs/dotnet/datamapper/ch03s05.html Hth.
There is a huge performance issue with N+1 selects (check https://github.com/balak1986/prime-cache-support-in-ibatis2/blob/master/README), so please don't use them unless there is no other way of achieving this.
Luckily, iBatis has a groupBy property, which was created exactly to map data for this kind of complex object. Check the example from http://www.cforcoding.com/2009/06/ibatis-tutorial-aggregation-with.html
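Independently of iBatis, one way to avoid both the N+1 pattern and the row explosion from joining several one-to-many tables at once is to run one query per child table for the whole batch of parents, then stitch the rows together in memory by a_id. The sketch below illustrates that approach in Python against in-memory SQLite; it is not the iBatis groupBy feature, and the name and label columns, the sample rows and the load_objects helper are assumptions made for the example (only tables B and C are shown).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY, a_id INTEGER, label TEXT);
    CREATE TABLE tableC (id INTEGER PRIMARY KEY, a_id INTEGER, label TEXT);
    INSERT INTO tableA VALUES (1, 'first'), (2, 'second');
    INSERT INTO tableB (a_id, label) VALUES (1, 'b1'), (1, 'b2'), (2, 'b3');
    INSERT INTO tableC (a_id, label) VALUES (1, 'c1'), (2, 'c2'), (2, 'c3');
""")


def load_objects(a_ids, child_tables=("tableB", "tableC")):
    """Load many A rows plus their children with one query per table."""
    placeholders = ",".join("?" * len(a_ids))

    # Query 1: all requested parents at once.
    objects = {
        a_id: {"name": name, **{table: [] for table in child_tables}}
        for a_id, name in conn.execute(
            f"SELECT id, name FROM tableA WHERE id IN ({placeholders})", a_ids)
    }

    # One further query per child table, however many parents there are.
    # Table names come from the fixed tuple above, never from user input.
    for table in child_tables:
        query = (f"SELECT a_id, label FROM {table} "
                 f"WHERE a_id IN ({placeholders}) ORDER BY id")
        for a_id, label in conn.execute(query, a_ids):
            objects[a_id][table].append(label)
    return objects


loaded = load_objects([1, 2])
print(loaded)
```

With five tables this costs five queries whether one A or several thousand are loaded, and each child row is transferred exactly once. For very large batches the id list may need to be split into chunks, since databases cap the number of bound parameters in a statement.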