nHibernate collections and alias criteria - nhibernate

I have a simple test object model in which there are schools, and a school has a collection of students.
I would like to retrieve a school and all its students who are above a certain age.
I carry out the following query, which obtains a given school and the children which are above a certain age:
public School GetSchoolAndStudentsWithDOBAbove(int schoolid, DateTime dob)
{
var school = this.Session.CreateCriteria(typeof(School))
.CreateAlias("Students", "students")
.Add(Expression.And(Expression.Eq("SchoolId", schoolid), Expression.Gt("students.DOB", dob)))
.UniqueResult<School>();
return school;
}
This all works fine and I can see the query going to the database and returning the expected number of rows.
However, when I carry out either of the following, it gives me the total number of students in the given school (regardless of the preceding request) by running another query:
foreach (Student st in s.Students)
{
Console.WriteLine(st.FirstName);
}
Assert.AreEqual(s.Students.Count, 3);
Can anyone explain why?

You made your query on the School class and you restricted your results on it, not on the mapped related objects.
Now there are many ways to do this.
You can make a static filter as IanL said, however its not really flexible.
You can just iterate the collection like mxmissile but that is ugly and slow (especially considering lazy loading considerations)
I would provide 2 different solutions:
In the first you maintain the query you have and you fire a dynamic filter on the collection (maintaining a lazy-loaded collection) and doing a round-trip to the database:
var school = GetSchoolAndStudentsWithDOBAbove(5, dob);
IQuery qDob = nhSession.CreateFilter(school.Students, "where DOB > :dob").SetDateTime("dob", dob);
IList<Student> dobedSchoolStudents = qDob.List<Student>();
In the second solution just fetch both the school and the students in one shot:
object result = nhSession.CreateQuery(
"select ss, st from School ss, Student st
where ss.Id = st.School.Id and ss.Id = :schId and st.DOB > :dob")
.SetInt32("schId", 5).SetDateTime("dob", dob).List();
ss is a School object and st is a Student collection.
And this can definitely be done using the criteria query you use now (using Projections)

Unfortunately s.Students will not contain your "queried" results. You will have to create a separate query for Students to reach your goal.
foreach(var st in s.Students.Where(x => x.DOB > dob))
Console.WriteLine(st.FirstName);
Warning: That will still make second trip to the db depending on your mapping, and it will still retrieve all students.
I'm not sure but you could possibly use Projections to do all this in one query, but I am by no means an expert on that.

You do have the option of filtering data. If it there is a single instance of the query mxmissle option would be the better choice.
Nhibernate Filter Documentation
Filters do have there uses, but depending on the version you are using there can be issues where filtered collections are not cached correctly.

Related

How (preloading) join table on custom column?

Imagine we have the following models:
type Company struct {
gorm.Model
Name string
Addresses []Address
}
type Address struct {
gorm.Model
CompanyID uint64
Street string
City string
Country string
}
I want to take all the companies(and their addresses) which have address in a specific location. Something like this:
SELECT company.*, address.* FROM company
INNER JOIN address ON address.company_id = company.id AND address.country = 'Bulgaria'
So if a company does not have address at the specific location, I will not get it as a result at all. I was trying something like that:
db.Joins("Addresses", "addresses.country = ?", "Bulgaria").Find(&companies)
However, it doesn't work, because GORM doesn't take the second argument of Joins(when preloading join used), so I should check the generated query and make something like that:
db.Where(`"Address".country = ?`, "Bulgaria").Joins("Addresses").Find(&companies)
Is there a better way/not hacky way? Have in mind all of the above code is mock of the real problem, I didn't want to expose the original models/queries.
You can use Preload to load Addresses into the Company object.
Based on your described conditions, where you don't want to load companies that don't match your filter, you should use an INNER JOIN
Two options:
First, if your table is named company, then your query should look like this:
db.Table("company").
Preload("Addresses").
Joins("INNER JOIN addresses a ON a.company_id = company.id").
Where("a.country = ?", "Bulgaria").
Find(&companies)
Second, if your table is named companies, then you should try this:
db.Preload("Addresses").
Joins("INNER JOIN addresses a ON a.company_id = companies.id").
Where("a.country = ?", "Bulgaria").
Find(&companies)
If you are using Gorm v2 you perform CRUD operations on has-one and one-to-many via associations:
var companies []Company
var addresses []Address
countries := []string{"Bulgaria"}
db.Model(&Address).Where("country IN (?)", countries).Find(&addresses)
// The key is using the previously fetched variable as the model for the association query
db.Model(&addresses).Association("CompanyID").Find(&companies)
This also works for many-to-many relations.
When comparing Gorm VS raw SQL queries (db.Raw(SQLquery)) you will typically see a performance hit. On the upside, you will not have to deal with the added complexity of raw sql string building in Go, which is pretty nice. I personally use a combination of both raw and gorm-built queries in my projects :)

Return results from more than one database table in Django

Suppose I have 3 hypothetical models;
class State(models.Model):
name = models.CharField(max_length=20)
class Company(models.Model):
name = models.CharField(max_length=60)
state = models.ForeignField(State)
class Person(models.Model):
name = models.CharField(max_length=60)
state = models.ForeignField(State)
I want to be able to return results in a Django app, where the results, if using SQL directly, would be based on a query such as this:
SELECT a.name as 'personName',b.name as 'companyName', b.state as 'State'
FROM Person a, Company b
WHERE a.state=b.state
I have tried using the select_related() method as suggested here, but I don't think this is quite what I am after, since I am trying to join two tables that have a common foreign-key, but have no key-relationships amongst themselves.
Any suggestions?
Since a Person can have multiple Companys in the same state. It is not a good idea to do the JOIN at the database level. That would mean that the database will (likely) return the same Company multiple times, making the output quite large.
We can prefetch the related companies, with:
qs = Person.objects.select_related('state').prefetch_related('state__company')
Then we can query the Companys in the same state with:
for person in qs:
print(person.state.company_set.all())
You can use a Prefetch-object [Django-doc] to prefetch the list of related companies in an attribute of the Person, for example:
from django.db.models import Prefetch
qs = Person.objects.prefetch_related(
Prefetch('state__company', Company.objects.all(), to_attr='same_state_companies')
)
Then you can print the companies with:
for person in qs:
print(person.same_state_companies)

Selecting related model: Left join, prefetch_related or select_related?

Considering I have the following relationships:
class House(Model):
name = ...
class User(Model):
"""The standard auth model"""
pass
class Alert(Model):
user = ForeignKey(User)
house = ForeignKey(House)
somevalue = IntegerField()
Meta:
unique_together = (('user', 'property'),)
In one query, I would like to get the list of houses, and whether the current user has any alert for any of them.
In SQL I would do it like this:
SELECT *
FROM house h
LEFT JOIN alert a
ON h.id = a.house_id
WHERE a.user_id = ?
OR a.user_id IS NULL
And I've found that I could use prefetch_related to achieve something like this:
p = Prefetch('alert_set', queryset=Alert.objects.filter(user=self.request.user), to_attr='user_alert')
houses = House.objects.order_by('name').prefetch_related(p)
The above example works, but houses.user_alert is a list, not an Alert object. I only have one alert per user per house, so what is the best way for me to get this information?
select_related didn't seem to work. Oh, and surely I know I can manage this in multiple queries, but I'd really want to have it done in one, and the 'Django way'.
Thanks in advance!
The solution is clearer if you start with the multiple query approach, and then try to optimise it. To get the user_alerts for every house, you could do the following:
houses = House.objects.order_by('name')
for house in houses:
user_alerts = house.alert_set.filter(user=self.request.user)
The user_alerts queryset will cause an extra query for every house in the queryset. You can avoid this with prefetch_related.
alerts_queryset = Alert.objects.filter(user=self.request.user)
houses = House.objects.order_by('name').prefetch_related(
Prefetch('alert_set', queryset=alerts_queryset, to_attrs='user_alerts'),
)
for house in houses:
user_alerts = house.user_alerts
This will take two queries, one for houses and one for the alerts. I don't think you require select related here to fetch the user, since you already have access to the user with self.request.user. If you want you could add select_related to the alerts_queryset:
alerts_queryset = Alert.objects.filter(user=self.request.user).select_related('user')
In your case, user_alerts will be an empty list or a list with one item, because of your unique_together constraint. If you can't handle the list, you could loop through the queryset once, and set house.user_alert:
for house in houses:
house.user_alert = house.user_alerts[0] if house.user_alerts else None

What is the potential performance issue with the following code and how would you suggest to fix it?

i had an interview with microsoft and they asked me this following question! i didn't knew how to solve it and i'm very interesting to know what's the solution
p.s: it's only for me to improve myself because i was denied..
anyways: please assume that EmployeeRepository and ServiceTicketsRepository are implementing EntityFramework ORM repositories. The actual storage is a SQL database in the cloud.
Bonus: what is the name of the anti-pattern?
//
// Return overall number of pending work tickets for all employees in the repository
//
public int GetTicketsForEmployees()
{
EmployeeRepository employeeRepository = new EmployeeRepository();
ServiceTicketsRepository serviceTicketRepository = new ServiceTicketRepository();
int ticketscount = 0;
var employees = employeeRepository.All.Select(e => new EmployeeSummary { Employee = e }).ToList();
foreach (var employee in employees)
{
var tickets = serviceTicketRepository.AllIncluding(t => t.Customer).Where(t => t.AssignedToID ==employee.Employee.ID).ToList();
ticketscount += tickets.Count();
}
return ticketscount;
{
This is called the 1 + N anti-pattern. It means that you will do 1 + N round trips to the database where N is the number of records in the Employee table.
It will do 1 query to find all employees, then for each employee do another query to find their tickets, in order to count them.
The performance issue is that when N grows, your application will do more and more round trips, each taking a few milliseconds. Even at only 1000 employees this will be slow.
In addition to the round trips, this code is fetching all the columns for all the rows in the Employee table and also from the Ticket table. This will add up to a lot of bytes and in the end might cause an out of memory exception when the number of Employees and Tickets have grown to a big amount.
The fix is to perform one query which counts all the tickets which belongs to employees and then only returning the count. This will become one round trip sending only a few bytes over the network.
I'am not a C# but what I can see from my side is you are not using any join procedure.
If you have 1 million of employees and you have about 1000 tickets per employee.
You will do a 1 billion of query (loop including) :/ and you just want to return a count of ticket reported by your employee
Edit : I supposed you are in a eager loading and during your loop your EntityFramework instance will be open for the all duration of your loop.
Edit 2 : With a inner join you wont have to repeat t => t.AssignedToID ==employee.Employee.ID The join will do that for you.

multiple results in a single call

When paging data, I want to not only return 10 results, but I also want to get the total number of items in all the pages.
How can I get the total count AND the results for the page in a single call?
My paged method is:
public IList GetByCategoryId(int categoryId, int firstResult, int maxResults)
{
IList<Article> articles = Session.CreateQuery(
"select a from Article as a join a.Categories c where c.ID = :ID")
.SetInt32("ID", categoryId)
.SetFirstResult(firstResult)
.SetMaxResults(maxResults)
.List<Article>();
return articles;
}
The truth is that you make two calls. But a count(*) call is very, very cheap in most databases and when you do it after the main call sometimes the query cache helps out.
Your counter call will often be a little different too, it doesn't actually need to use the inner joins to make sense. There are a few other little performance tweaks too but most of the time you don't need them.
I believe you actually can do what you ask. You can retieve the count and the page in one go in code but not in one SQL statement. Actually two queries are send to the database but in one round trip and the results are retrieved as an array of 2 elements. One of the elements is the total count as an integer and the second is an IList of your retrieved entities.
There are 2 ways to do that:
MultyQuery
MultiCriteria
Here is a sample taken from the links below:
IList results = s.CreateMultiQuery()
.Add("from Item i where i.Id > :id")
.Add("select count(*) from Item i where i.Id > :id")
.SetInt32("id", 50)
.List();
IList items = (IList)results[0];
long count = (long)((IList)results[1])[0];
Here is more info on how you can do that. It is really straight forward.
http://ayende.com/Blog/archive/2006/12/05/NHibernateMutliQuerySupport.aspx
http://ayende.com/Blog/archive/2007/05/20/NHibernate-Multi-Criteria.aspx
If you read the 2 articles above you will see that there is a performance gain of using this approach that adds value to the anyway more transparent and clear approach of doing paging with MultiQuery and MultiCriteria against the conventional way.
Please note that recent versions of NHibernate support the idea of futures, so you can do.
var items = s.CreateQuery("from Item i where i.Id > :id")
.SetInt32("id", 50)
.Future<Item>();
var count = s.CreateQuery("select count(*) from Item i where i.Id > :id")
.SetInt32("id", 50)
.FutureValue<long>();
This is a much more natural syntax, and it will still result in a single DB query.
You can read more about it here:
http://ayende.com/Blog/archive/2009/04/27/nhibernate-futures.aspx