Does Fluent Nhibernate Lazy loads IList from Criteria - nhibernate

I want to make a kind of "news feed" in the application of my company.
In the scenario, User actions will generate "Activity" of different kinds, and other users will see in their "news feed".
However, an "Activity" is not related to all users, and to determine the relation, we have a complex piece of code.
Here is my Activity class
public class Activity: IActivity
{
public virtual int Id { get; set; }
public virtual ActivityType Type { get; set; }
public virtual User User { get; set; }
public virtual bool IsVisibleToUser(User userLook)
{
// Complex business calculation etc.
return true;
}
}
I want to get latest 10 news that is visible to User. But since the Activity table will be quite huge, and performance is an issue, I want to do the best practice about it.
What i am about to do, is get 25 last Activity, and check if we fill the list to show to user. For example, if only 5 Activity is visible to user, i will get another 25 Activities and so on.
IList<Activity> resultList = session.CreateCriteria(typeof(Activity))
.SetMaxResults(25)
.AddOrder(Order.Desc("Id"))
.List<Activity>();
I want to learn, if I get the whole list ordered by Id, and check one by one if it is visible to User, would NHibernate only loads the objects that i use for me or not?
IList<Activity> resultList = session.CreateCriteria(typeof(Activity))
.AddOrder(Order.Desc("Id"))
.List<Activity>();
int count = 0;
foreach( Activity act in resultList){
if (act.IsVisible(CurrentUser)){
count++;
// Do something with act
if (count == 10)
break;
}
}
EDIT:
Here is ActivityMapping for Activity model.
public class ActivityMap : ClassMap<Activity>
{
public ActivityMap()
{
Id(x => x.Id);
Map(x => x.Type).CustomType(typeof(Int32));
References(x => x.User).Nullable();
}
}

If your question is about how the generated SQL would look like, my guess would be :
SELECT
this_.Id as Id0_0_,
this_.ActivityTypeas ActivityType_0_0_,
--Other fields
FROM dbo.ActivityType this_
WHERE
--condition
ORDER BY
--condition
Since you have mentioned that the Activity count is huge, you can make use of
ICriteria's SetFirstResult and SetMaxResult.
SetFirstResult(int) indicates the index of the first item that you wish to obtain and SetMaxResult(int) indicates the number of rows you wish to get, 25 in your case.
The ToList would load all the records in memory at once.
[UPDATE] If you need the records to be returned one by one, make use of Enumerable() -
If you expect your query to return a very large number of objects, but you don't expect to use them all, you might get better performance from the Enumerable() methods, which return a System.Collections.IEnumerable. The iterator will load objects on demand, using the identifiers returned by an initial SQL query (n+1 selects total).
Source - Link

No, the List() method pulls everything into memory at once.

Related

EF Core include related ids but not related entities

Before I go creating my own SQL scripts by hand for this, I have a scenario where I want to get the ids of a foreign key, but not the entirety of the foreign entities, using EF Core.
Right now, I'm getting the ids manually by looping through the related entities and extracting the ids one at a time, like so:
List<int> ClientIds = new List<int>();
for (var i = 0; i < Clients.length; i++){
ClientIds.add(Clients.ElementAt(i).Id);
}
To my understanding, this will either cause data returns much larger than needed (my entity + every related entity) or a completely separate query to be run for each related entity I access, which obviously I don't want to do if I can avoid it.
Is there a straightforward way to accomplish this in EF Core, or do I need to head over the SQL side and handle it myself?
Model:
public class UserViewModel {
public UserViewModel(UserModel userModel){
ClientIds = new List<int>();
for (var i = 0; i < UserModel.Clients.length; i++){
ClientIds.add(Clients.ElementAt(i).Id);
}
//...all the other class asignments
}
public IEnumerable<int> ClientIds {get;set;}
//...all the other irrelevant properties
}
Basically, I need my front-end to know which Client to ask for later.
It looks like you are trying to query this from within the parent entity. I.e.
public class Parent
{
public virtual ICollection<Client> Clients { get; set; }
public void SomeMethod()
{
// ...
List<int> ClientIds = new List<int>();
for (var i = 0; i < Clients.length; i++)
{
ClientIds.add(Clients.ElementAt(i).Id);
}
// ...
}
}
This is not ideal because unless your Clients were eager loaded when the Parent was loaded, this would trigger a lazy load to load all of the Clients data when all you want is the IDs. Still, it's not terrible as it would only result in one DB call to load the clients.
If they are already loaded, there is a more succinct way to get the IDs:
List<int> ClientIds = Clients.Select(x => x.Id).ToList();
Otherwise, if you have business logic involving the Parent and Clients where-by you want to be more selective about when and how the data is loaded, it is better to leave the entity definition to just represent the data state and basic rules/logic about the data, and move selective business logic outside of the entity into a business logic container that scopes the DbContext and queries against the entities to fetch what it needs.
For instance, if the calling code went and did this:
var parent = _context.Parents.Single(x => x.ParentId == parentId);
parent.SomeMethod(); // which resulted in checking the Client IDs...
The simplest way to avoid the extra DB call is to ensure the related entities are eager loaded.
var parent = _context.Parents
.Include(x => x.Clients)
.Single(x => x.ParentId == parentId);
parent.SomeMethod(); // which resulted in checking the Client IDs...
The problem with this approach is that it will still load all details about all of the Clients, and you end up in a situation where you end up defaulting to eager loading everything all of the time because the code might call something like that SomeMethod() which expects to find related entity details. This is the use-case for leveraging lazy loading, but that does have the performance overheads of the ad-hoc DB hits and ensuring that the entity's DbContext is always available to perform the read if necessary.
Instead, if you move the logic out of the entity and into the caller or another container that can take the relevant details, so that this caller projects down the data it will need from the entities in an efficient query:
var parentDetails = _context.Parents
.Where(x => x.ParentId == parentId)
.Select(x => new
{
x.ParentId,
// other details from parent or related entities...
ClientIds = x.Clients.Select(c => c.Id).ToList()
}).Single();
// Do logic that SomeMethod() would have done here, or pass these
// loaded details to a method / service to do the work rather than
// embedding it in the Entity.
This doesn't load a Parent entity, but rather executes a query to load just the details about the parent and related entities that we need. In this example it is projected into an anonymous type to hold the information we can later consume, but if you are querying the data to send to a view then you can project it directly into a view model or DTO class to serialize and send.

Performing a relevance search on a RavenDb index (lucene) across all indexed fields

Is it possible to execute a query (relevance search) in Lucene / RavenDb, where it would automatically search all fields in the index?
I have an index which has a lot of fields (40+), and I would like to search everywhere for it. Also, some fields have boosting applied.
My ideal query would be simply
red dog
And this would return all the documents, ordered by relevance, which contain these keywords.
Is this possible, or would I have to add a manual field which includes all the terms found in the 40 fields?
You have to have a field that will include all the terms you want.
See also the techniques outlined here: https://ayende.com/blog/153729/lazys-man-comprehensive-search-with-ravendb
public class Users_AllProperties : AbstractIndexCreationTask<User, Users_AllProperties.Result>
{
public class Result
{
public string Query { get; set; }
}
public Users_AllProperties()
{
Map = users =>
from user in users
select new
{
Query = AsDocument(user).Select(x => x.Value)
};
Index(x=>x.Query, FieldIndexing.Analyzed);
}
}

LightSwitch - bulk-loading all requests into one using a domain service

I need to group some data from a SQL Server database and since LightSwitch doesn't support that out-of-the-box I use a Domain Service according to Eric Erhardt's guide.
However my table contains several foreign keys and of course I want the correct related data to be shown in the table (just doing like in the guide will only make the key values show). I solved this by adding a Relationship to my newly created Entity like this:
And my Domain Service class looks like this:
public class AzureDbTestReportData : DomainService
{
private CountryLawDataDataObjectContext context;
public CountryLawDataDataObjectContext Context
{
get
{
if (this.context == null)
{
EntityConnectionStringBuilder builder = new EntityConnectionStringBuilder();
builder.Metadata =
"res://*/CountryLawDataData.csdl|res://*/CountryLawDataData.ssdl|res://*/CountryLawDataData.msl";
builder.Provider = "System.Data.SqlClient";
builder.ProviderConnectionString =
WebConfigurationManager.ConnectionStrings["CountryLawDataData"].ConnectionString;
this.context = new CountryLawDataDataObjectContext(builder.ConnectionString);
}
return this.context;
}
}
/// <summary>
/// Override the Count method in order for paging to work correctly
/// </summary>
protected override int Count<T>(IQueryable<T> query)
{
return query.Count();
}
[Query(IsDefault = true)]
public IQueryable<RuleEntryTest> GetRuleEntryTest()
{
return this.Context.RuleEntries
.Select(g =>
new RuleEntryTest()
{
Id = g.Id,
Country = g.Country,
BaseField = g.BaseField
});
}
}
public class RuleEntryTest
{
[Key]
public int Id { get; set; }
public string Country { get; set; }
public int BaseField { get; set; }
}
}
It works and all that, both the Country name and the Basefield loads with Autocomplete-boxes as it should, but it takes VERY long time. With two columns it takes 5-10 seconds to load one page.. and I have 10 more columns I haven't implemented yet.
The reason it takes so long time is because each related data (each Country and BaseField) requires one request. Loading a page looks like this in Fiddler:
This isn't acceptable at all, it should be a way of combining all those calls into one, just as it does when loading the same table without going through the Domain Service.
So.. that was a lot explaining, my question is: Is there any way I can make all related data load at once or improve the performance by any other way? It should not take 10+ seconds to load a screen.
Thanks for any help or input!s
My RIA Service queries are extremely fast, compared to not using them, even when I'm doing aggregation. It might be the fact that you're using "virtual relationships" (which you can tell by the dotted lines between the tables), that you've created using your RuleEntryTest entity.
Why is your original RuleEntry entity not related to both Country & BaseUnit in LightSwitch BEFORE you start creating your RIA entity?
I haven't used Fiddler to see what's happening, but I'd try creating "real" relationships, instead of "virtual" ones, & see if that helps your RIA entity's performance.

How to filter Azure logs, or WCF Data Services filters for Dummies

I am looking at my Azure logs in the WADLogsTable and would like to filter the results, but I'm clueless as to how to do so. There is a textbox that says:
"Enter a WCF Data Services filter to limit the entities returned"
What is the syntax of a "WCF Data Services filter"? The following gives me an InvalidValueType error saying "The value specified is invalid.":
Timestamp gt '2011-04-20T00:00'
Am I even close? Is there a handy syntax reference somewhere?
This query should be in the format:
Timestamp gt datetime'2011-04-20T00:00:00'
Remembering to put that datetime in there is the important bit.
This trips me up every time, so I use the OData overview for reference.
Adding to knightffhor's response, you can certainly write a query which filters by Timstamp but this is not recommended approach as querying on "Timestamp" attribute will lead to full table scan. Instead query this table on PartitionKey attribute. I'm copying my response from other thread here (Can I capture Performance Counters for an Azure Web/Worker Role remotely...?):
"One of the key thing here is to understand how to effectively query this table (and other diagnostics table). One of the things we would want from the diagnostics table is to fetch the data for a certain period of time. Our natural instinct would be to query this table on Timestamp attribute. However that's a BAD DESIGN choice because you know in an Azure table the data is indexed on PartitionKey and RowKey. Querying on any other attribute will result in full table scan which will create a problem when your table contains a lot of data.The good thing about these logs table is that PartitionKey value in a way represents the date/time when the data point was collected. Basically PartitionKey is created by using higher order bits of DateTime.Ticks (in UTC). So if you were to fetch the data for a certain date/time range, first you would need to calculate the Ticks for your range (in UTC) and then prepend a "0" in front of it and use those values in your query.
If you're querying using REST API, you would use syntax like:
PartitionKey ge '0<from date/time ticks in UTC>' and PartitionKey le '0<to date/time in UTC>'."
I've written a blog post about how to write WCF queries against table storage which you may find useful: http://blog.cerebrata.com/specifying-filter-criteria-when-querying-azure-table-storage-using-rest-api/
Also if you're looking for a 3rd party tool for viewing and managing diagnostics data, may I suggest that you take a look at our product Azure Diagnostics Manager: /Products/AzureDiagnosticsManager. This tool is built specifically for surfacing and managing Windows Azure Diagnostics data.
The answer I accepted helped me immensely in directly querying the table through Visual Studio. Eventually, however, I needed a more robust solution. I used the tips I gained here to develop some classes in C# that let me use LINQ to query the tables. In case it is useful to others viewing this question, here is roughly how I now query my Azure logs.
Create a class that inherits from Microsoft.WindowsAzure.StorageClient.TableServiceEntity to represent all the data in the "WADLogsTable" table:
public class AzureDiagnosticEntry : TableServiceEntity
{
public long EventTickCount { get; set; }
public string DeploymentId { get; set; }
public string Role { get; set; }
public string RoleInstance { get; set; }
public int EventId { get; set; }
public int Level { get; set; }
public int Pid { get; set; }
public int Tid { get; set; }
public string Message { get; set; }
public DateTime EventDateTime
{
get
{
return new DateTime(EventTickCount, DateTimeKind.Utc);
}
}
}
Create a class that inherits from Microsoft.WindowsAzure.StorageClient.TableServiceContext and references the newly defined data object class:
public class AzureDiagnosticContext : TableServiceContext
{
public AzureDiagnosticContext(string baseAddress, StorageCredentials credentials)
: base(baseAddress, credentials)
{
this.ResolveType = s => typeof(AzureDiagnosticEntry);
}
public AzureDiagnosticContext(CloudStorageAccount storage)
: this(storage.TableEndpoint.ToString(), storage.Credentials) { }
// Helper method to get an IQueryable. Hard code "WADLogsTable" for this class
public IQueryable<AzureDiagnosticEntry> Logs
{
get
{
return CreateQuery<AzureDiagnosticEntry>("WADLogsTable");
}
}
}
I have a helper method that creates a CloudStorageAccount from configuration settings:
public CloudStorageAccount GetStorageAccount()
{
CloudStorageAccount.SetConfigurationSettingPublisher(
(name, setter) => setter(RoleEnvironment.GetConfigurationSettingValue(name)));
string configKey = "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString";
return CloudStorageAccount.FromConfigurationSetting(configKey);
}
I create an AzureDiagnosticContext from the CloudStorageAccount and use that to query my logs:
public IEnumerable<AzureDiagnosticEntry> GetAzureLog(DateTime start, DateTime end)
{
CloudStorageAccount storage = GetStorageAccount();
AzureDiagnosticContext context = new AzureDiagnosticContext(storage);
string startTicks = "0" + start.Ticks;
string endTicks = "0" + end.Ticks;
IQueryable<AzureDiagnosticEntry> query = context.Logs.Where(
e => e.PartitionKey.CompareTo(startTicks) > 0 &&
e.PartitionKey.CompareTo(endTicks) < 0);
CloudTableQuery<AzureDiagnosticEntry> tableQuery = query.AsTableServiceQuery();
IEnumerable<AzureDiagnosticEntry> results = tableQuery.Execute();
return results;
}
This method takes advantage of the performance tip in Gaurav's answer to filter on PartitionKey rather than Timestamp.
If you wanted to filter the results by more than just date, you could filter the returned IEnumerable. But, you'd probably get better performance by filtering the IQueryable. You could add a filter parameter to your method and call it within the IQueryable.Where(). Eg,
public IEnumerable<AzureDiagnosticEntry> GetAzureLog(
DateTime start, DateTime end, Func<AzureDiagnosticEntry, bool> filter)
{
...
IQueryable<AzureDiagnosticEntry> query = context.Logs.Where(
e => e.PartitionKey.CompareTo(startTicks) > 0 &&
e.PartitionKey.CompareTo(endTicks) < 0 &&
filter(e));
...
}
In the end, I actually further abstracted most of these classes into base classes in order to reuse the functionality for querying other tables, such as the one storing the Windows Event Log.

How to efficiently filter a collection of child objects by a property with NHibernate

I've been trying unsuccessfully to filter a collection of child objects for a few hours how and have finally thrown my hands up! I'm new to NHibernate and was hoping for a couple of pointers on this. I've tried various ICriteria etc. with no luck. I'm just not getting it.
I have a parent object 'Post' with a collection of child objects 'Comment'. The collection is mapped as a set with inverse on the Comment side.
What I am trying to do is to return only comments with a status enum value of 'Comment.Approved'
The relevant portions of the entity classes are as follows:
public class Post
{
public virtual Guid Id { get; protected set; }
private ICollection<Comment> _comments;
public virtual ICollection<Comment> Comments
{
get { return _comments; }
protected set { _comments = value; }
}
}
public class Comment
{
public virtual Guid Id { get; protected set; }
public virtual Post Post { get; set; }
public virtual CommentStatus Status { get; set; }
}
My retrieval code looks like this at the moment:
var Id = __SomeGuidHere__;
var post = _session
.CreateCriteria<Post>()
.Add(Restrictions.Eq("Id", Id))
.UniqueResult<Post>();
var comments = _session.CreateFilter(post.Comments, "where Status = :status").SetParameter("status", CommentStatus.Approved).List<Comment>();
While this works the SQL doesn't appear to be very efficient, I expected to be able to translate the following SQL into something similar in HQL or an ICriteria of some sort:
SELECT * FROM posts p LEFT JOIN comments c ON p.PostId = c.PostId AND c.Status = 0 WHERE p.PostId = '66a2bf13-1330-4414-ac8a-9d9b00ea0705';
I've had a look at the various answers related to this type of query here and none of them seem to address this specific scenario.
There's probably something very simple I'm missing here but I'm too tired now to see it. Here's hoping someone better with NHibernate can point me in the right direction.
Thanks for your time.
Edit: Still struggling with this, some of the answers here are good in that I'm starting to think that my post entity needs to be re-thought to perform the filtration itself, or that I should implement a ViewModel to filter the comments I want. The question still remains however, even if only from an academic perspective.
I've updated the selection to HQL and tried:
var post = _session
.CreateQuery("select p from Post as p left join fetch p.Comments as c where p.Id = :id and c.Id in (select ac from p.Comments ac where ac.Status = :status)")
.SetParameter("id", Id)
.SetParameter("status", CommentStatus.Approved)
.UniqueResult<Post>();
This works as long as a post has an approved comment, otherwise I get no post due to the SQL generated using 'AND' in the where clause.
Anyone? I'm stumped now!
Update: Thanks to all who have replied, it has been useful and has forced me to re-evaluate portions of my model. Since the most frequent use of comments as children of a post is in the viewing of a post, only the approved comments should be viewable in this scenario. In most other scenarios that I can think of comments would be accessed directly and filtered by status which is of course straight forward.
I have updated my mappings to filter all post > comment loading to only load approved posts as follows (in FluentNHibernate):
HasMany(x => x.Comments).Where(x => x.Status == CommentStatus.Approved)
.AsSet()
.Inverse()
.KeyColumn("PostId")
.ForeignKeyConstraintName("PostComments")
.OrderBy("CreatedOn")
.Cascade.AllDeleteOrphan();
I wish I could mark all as the answer since all contributed to me working this thing out, but Peter was the first to point out that I may be thinking about the model incorrectly.
Thanks to all.
According to me the SQL is perfect for what you describe.
The database has its own optimizer and will know what to do to get your data efficiently delivered to your doorstep.
I think this would work fine:
var comments =
_session.CreateCriteria<Comment>
.CreateAlias("Post", "post")
.Add (Restriction.Eq("Status", status))
.Add (Restriction.Eq("post.Id", Id)).List();
I am not shure if you need the .CreateAlias part (if you don't need it, maybe Post.Id would work too - but I am not sure).
I would solve this using an extension method for IEnumerable<Comment>:
public static IEnumerable<Comment> FilterByStatus(this IEnumerable<Comment> comments, CommentStatus status)
{
return comments.Where(x => x.Status == status);
}
Let NHibernate return the entire collection then filter it as needed.