I am using the NHibernate.Search libraries in my project for free-text search. Recently, when I started getting more than 2100 results, I started getting a max parameter length error from SQL Server.
Does NHibernate.Search handle this situation? Does anyone have a workaround?
You could modify the NHibernate.Search code to take care of this, or use custom paging, i.e. get the number of hits for your search, then page the NHibernate.Search results accordingly:
public IList<TEntity> Search<TEntity>(Query query, bool? active, string orderBy)
{
    var search = NHibernate.Search.Search.CreateFullTextSession(this.session);

    // Total number of hits, without materializing any entities.
    var total = search.CreateFullTextQuery(query, typeof(TEntity)).ResultSize;

    var first = 0;
    var l = new List<TEntity>();
    while (total > 0)
    {
        // A page size of 1000 keeps each query's parameter count safely
        // below SQL Server's 2100-parameter limit.
        l.AddRange(search.CreateFullTextQuery(query, typeof(TEntity))
            .SetFirstResult(first)
            .SetMaxResults(1000)
            .List<TEntity>());
        first += 1000;
        total -= 1000;
    }
    return l;
}
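For context, a hypothetical call site might look like the following. Product, repository, and the "Name" field are illustrative assumptions; the QueryParser setup is standard Lucene.Net:
var version = Lucene.Net.Util.Version.LUCENE_29;
var analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer(version);
var parser = new Lucene.Net.QueryParsers.QueryParser(version, "Name", analyzer);
Lucene.Net.Search.Query query = parser.Parse("widget*");
// Fetches every matching Product in pages of 1000.
IList<Product> products = repository.Search<Product>(query, null, null);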
See: IFullTextQuery - exception if there are too many objects
I am trying to display document revisions in a paginated list. This will only work when I know the total results count so I can calculate the page count.
For common document queries this is possible with session.Query(...).Statistics(out var stats).
My current solution is the following:
// get paged revisions
List<T> items = await session.Advanced.Revisions.GetForAsync<T>(id, (pageNumber - 1) * pageSize, pageSize);
double totalResults = items.Count;
// get total results in case items count equals page size
if (pageSize <= items.Count || pageNumber > 1)
{
GetRevisionsCommand command = new GetRevisionsCommand(id, 0, 1, true);
session.Advanced.RequestExecutor.Execute(command, session.Advanced.Context);
command.Result.Results.Parent.TryGet("TotalResults", out totalResults);
}
The problem with this approach is that I need an additional request just to get the TotalResults property, which has in fact already been returned by the first request. I just don't know how to access it.
You are right: TotalResults is returned from the server but not parsed on the client side.
I opened an issue to fix that: https://issues.hibernatingrhinos.com/issue/RavenDB-15552
You can also get the total revisions count for a document by using the /databases/{dbName}/revisions/count?id={docId} endpoint. Client code, for example:
var docId = "company/1";
var dbName = store.Database;
var response = await store.GetRequestExecutor().HttpClient
    .GetAsync(store.Urls.First() + $"/databases/{dbName}/revisions/count?id={docId}");
var json = await response.Content.ReadAsStringAsync();
using var ctx = JsonOperationContext.ShortTermSingleUse();
using var obj = ctx.ReadForMemory(json, "revision-count");
obj.TryGet("RevisionsCount", out long revisionsCount);
Another way could be getting all the revisions:
var revisions = await session.Advanced.Revisions.GetForAsync<Company>(docId, 0, int.MaxValue);
Then use revisions.Count as the total count.
Hi, I know this question has most probably been asked before. How can I improve query performance on Sitecore queries? I installed dotTrace and got the results you can see in this image. The method I did not display in the image took 2,551 ms, and the Sitecore query took 2,344 ms, which is huge.
I am using Sitecore 7.2. I think there are around 10k item records in the database. We don't have more than 5 versions per item. The query we run is:
return rootItem.Axes.SelectSingleItem(string.Format("descendant::*[@@{0}='{1}']", attributeName, attributeValue));
attributeName = TemplateName.
Do you have any ideas how I can optimize the request?
Do not use the Sitecore query API for search. Use the Sitecore ContentSearch API:
public void Search(Item rootItem, string templateName)
{
    // Resolve the index for the context database, e.g. sitecore_master_index.
    var index = ContentSearchManager.GetIndex("sitecore_" + Sitecore.Context.Database.Name + "_index");
    using (var context = index.CreateSearchContext())
    {
        // Find the first descendant of rootItem with the given template name.
        var result = context.GetQueryable<SearchResultItem>()
            .FirstOrDefault(i => i.TemplateName == templateName && i.Paths.Contains(rootItem.ID));
        if (result != null)
        {
            Item resultItem = result.GetItem();
        }
    }
}
So I have a standard service reference proxy class for MS CRM 2013 (i.e. right-click, add reference, etc...). I then found the limitation that CRM data calls are limited to 50 results, and I wanted to get the full list of results. I found two methods; one looks more correct, but doesn't seem to work. I was wondering why it doesn't and/or if there is something I'm doing incorrectly.
Basic setup and process
crmService = new CrmServiceReference.MyContext(new Uri(crmWebServicesUrl));
crmService.Credentials = System.Net.CredentialCache.DefaultCredentials;
var accountAnnotations = crmService.AccountSet.Where(a => a.AccountNumber == accountNumber).Select(a => a.Account_Annotation).FirstOrDefault();
Using Continuation (something I want to work, but looks like it doesn't)
while (accountAnnotations.Continuation != null)
{
accountAnnotations.Load(crmService.Execute<Annotation>(accountAnnotations.Continuation.NextLinkUri));
}
Using that method, .Continuation is always null and accountAnnotations.Count is always 50 (but there are more than 50 records).
After struggling with .Continuation for a while, I've come up with the following alternative method (but it seems "not good"):
var accountAnnotationData = accountAnnotations.ToList();
var accountAnnotationFinal = accountAnnotations.ToList();
var index = 1;
while (accountAnnotationData.Count == 50)
{
accountAnnotationData = (from a in crmService.AnnotationSet
where a.ObjectId.Id == accountAnnotationData.First().ObjectId.Id
select a).Skip(50 * index).ToList();
accountAnnotationFinal = accountAnnotationFinal.Union(accountAnnotationData).ToList();
index++;
}
So the second method seems to work, but for any number of reasons it doesn't seem like the best. Is there a reason .Continuation is always null? Is there some setup step I'm missing or some nice way to do this?
The way to get the records from CRM is to use paging. Here is an example with a query expression, but you can also use FetchXML if you want (see the sketch after this example):
// Query using the paging cookie.
// Define the paging attributes.
// The number of records per page to retrieve.
int fetchCount = 3;
// Initialize the page number.
int pageNumber = 1;
// Initialize the number of records.
int recordCount = 0;
// Define the condition expression for retrieving records.
ConditionExpression pagecondition = new ConditionExpression();
pagecondition.AttributeName = "address1_stateorprovince";
pagecondition.Operator = ConditionOperator.Equal;
pagecondition.Values.Add("WA");
// Define the order expression to retrieve the records.
OrderExpression order = new OrderExpression();
order.AttributeName = "name";
order.OrderType = OrderType.Ascending;
// Create the query expression and add condition.
QueryExpression pagequery = new QueryExpression();
pagequery.EntityName = "account";
pagequery.Criteria.AddCondition(pagecondition);
pagequery.Orders.Add(order);
pagequery.ColumnSet.AddColumns("name", "address1_stateorprovince", "emailaddress1", "accountid");
// Assign the pageinfo properties to the query expression.
pagequery.PageInfo = new PagingInfo();
pagequery.PageInfo.Count = fetchCount;
pagequery.PageInfo.PageNumber = pageNumber;
// The current paging cookie. When retrieving the first page,
// pagingCookie should be null.
pagequery.PageInfo.PagingCookie = null;
Console.WriteLine("#\tAccount Name\t\t\tEmail Address");while (true)
{
// Retrieve the page.
EntityCollection results = _serviceProxy.RetrieveMultiple(pagequery);
if (results.Entities != null)
{
// Retrieve all records from the result set.
foreach (Account acct in results.Entities)
{
Console.WriteLine("{0}.\t{1}\t\t{2}",
++recordCount,
acct.EMailAddress1,
acct.Name);
}
}
// Check for more records, if it returns true.
if (results.MoreRecords)
{
// Increment the page number to retrieve the next page.
pagequery.PageInfo.PageNumber++;
// Set the paging cookie to the paging cookie returned from current results.
pagequery.PageInfo.PagingCookie = results.PagingCookie;
}
else
{
// If no more records are in the result nodes, exit the loop.
break;
}
}
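As mentioned above, the same paging can be done with FetchXML. A minimal sketch, assuming the same _serviceProxy and account entity as in the query expression example; the page and count attributes drive the paging, and results.MoreRecords and results.PagingCookie behave exactly as in the loop above:
// Minimal FetchXML paging sketch (attribute names are illustrative).
string fetchXml = string.Format(@"
<fetch mapping='logical' page='{0}' count='{1}'>
  <entity name='account'>
    <attribute name='name' />
    <attribute name='emailaddress1' />
    <order attribute='name' />
  </entity>
</fetch>", pageNumber, fetchCount);
EntityCollection results = _serviceProxy.RetrieveMultiple(new FetchExpression(fetchXml));
// On subsequent pages, add a paging-cookie attribute to the fetch element
// (escaped with SecurityElement.Escape) for efficient paging.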
I know variants of this question have been asked before (even by me), but I still don't understand a thing or two about this...
It was my understanding that one could retrieve more documents than the 128 default setting by doing this:
session.Advanced.MaxNumberOfRequestsPerSession = int.MaxValue;
And I've learned that a where clause should be an expression tree (Expression<Func<T, bool>>) instead of a Func, so that it's treated as IQueryable instead of IEnumerable. So I thought this should work:
public static List<T> GetObjectList<T>(Expression<Func<T, bool>> whereClause)
{
using (IDocumentSession session = GetRavenSession())
{
return session.Query<T>().Where(whereClause).ToList();
}
}
However, that only returns 128 documents. Why?
Note, here is the code that calls the above method:
RavenDataAccessComponent.GetObjectList<Ccm>(x => x.TimeStamp > lastReadTime);
If I add Take(n), then I can get as many documents as I like. For example, this returns 200 documents:
return session.Query<T>().Where(whereClause).Take(200).ToList();
Based on all of this, it would seem that the appropriate way to retrieve thousands of documents is to set MaxNumberOfRequestsPerSession and use Take() in the query. Is that right? If not, how should it be done?
For my app, I need to retrieve thousands of documents (that have very little data in them). We keep these documents in memory and use them as the data source for charts.
** EDIT **
I tried using int.MaxValue in my Take():
return session.Query<T>().Where(whereClause).Take(int.MaxValue).ToList();
And that returns 1024. Argh. How do I get more than 1024?
** EDIT 2 - Sample document showing data **
{
"Header_ID": 3525880,
"Sub_ID": "120403261139",
"TimeStamp": "2012-04-05T15:14:13.9870000",
"Equipment_ID": "PBG11A-CCM",
"AverageAbsorber1": "284.451",
"AverageAbsorber2": "108.442",
"AverageAbsorber3": "886.523",
"AverageAbsorber4": "176.773"
}
It is worth noting that since version 2.5, RavenDB has an "unbounded results API" to allow streaming. The example from the docs shows how to use this:
var query = session.Query<User>("Users/ByActive").Where(x => x.Active);
using (var enumerator = session.Advanced.Stream(query))
{
while (enumerator.MoveNext())
{
User activeUser = enumerator.Current.Document;
}
}
There is support for standard RavenDB queries and Lucene queries, and there is also async support (a sketch follows below).
The documentation can be found here. Ayende's introductory blog article can be found here.
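A minimal sketch of the async variant, assuming an IAsyncDocumentSession opened as asyncSession (method names may differ slightly between client versions):
var query = asyncSession.Query<User>("Users/ByActive").Where(x => x.Active);
using (var enumerator = await asyncSession.Advanced.StreamAsync(query))
{
    while (await enumerator.MoveNextAsync())
    {
        User activeUser = enumerator.Current.Document;
    }
}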
The Take(n) function will only give you up to 1024 results by default. However, you can change this default in Raven.Server.exe.config:
<add key="Raven/MaxPageSize" value="5000"/>
For more info, see: http://ravendb.net/docs/intro/safe-by-default
The Take(n) function will only give you up to 1024 results by default. However, you can use it together with Skip(n) to get all of them:
var points = new List<T>();
var nextGroupOfPoints = new List<T>();
const int ElementTakeCount = 1024;
int i = 0;
int skipResults = 0;
RavenQueryStatistics stats;
do
{
nextGroupOfPoints = session.Query<T>().Statistics(out stats).Where(whereClause).Skip(i * ElementTakeCount + skipResults).Take(ElementTakeCount).ToList();
i++;
skipResults += stats.SkippedResults;
points = points.Concat(nextGroupOfPoints).ToList();
}
while (nextGroupOfPoints.Count == ElementTakeCount);
return points;
RavenDB Paging
The number of requests per session is a separate concept from the number of documents retrieved per call. Sessions are short-lived and are expected to have only a few calls issued over them.
If you are getting more than 10 of anything from the store (even less than the default of 128) for human consumption, then something is wrong, or your problem requires different thinking than a truckload of documents coming from the data store.
RavenDB indexing is quite sophisticated. Good article about indexing here and facets here.
If you need to perform data aggregation, create a map/reduce index which results in aggregated data, e.g.:
Index (map):
from post in docs.Posts
select new { post.Author, Count = 1 }
Index (reduce):
from result in results
group result by result.Author into g
select new
{
    Author = g.Key,
    Count = g.Sum(x => x.Count)
}
Query:
var stats = session.Query<AuthorPostStats>("Posts/ByUser/Count")
    .Where(x => x.Author == author)
    .ToList();
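AuthorPostStats is not defined in the answer; a minimal sketch of the shape implied by the map/reduce output above:
public class AuthorPostStats
{
    public string Author { get; set; }
    public int Count { get; set; }
}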
You can also use a predefined index with the Stream method. You may use a Where clause on indexed fields.
var query = session.Query<User, MyUserIndex>();
// or, with a filter on an indexed field:
var query = session.Query<User, MyUserIndex>().Where(x => !x.IsDeleted);
using (var enumerator = session.Advanced.Stream<User>(query))
{
while (enumerator.MoveNext())
{
var user = enumerator.Current.Document;
// do something
}
}
Example index:
public class MyUserIndex: AbstractIndexCreationTask<User>
{
public MyUserIndex()
{
this.Map = users =>
from u in users
select new
{
u.IsDeleted,
u.Username,
};
}
}
Documentation: What are indexes?
Session : Querying : How to stream query results?
Important note: the Stream method will NOT track objects. If you change objects obtained from this method, SaveChanges() will not be aware of any change.
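For example, a small sketch of that behavior:
// Documents coming out of Stream are not tracked by the session:
var user = enumerator.Current.Document;
user.Username = "renamed";
session.SaveChanges(); // the rename is silently ignored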
Another note: you may get the following exception if you do not specify the index to use.
InvalidOperationException: StreamQuery does not support querying dynamic indexes. It is designed to be used with large data-sets and is unlikely to return all data-set after 15 sec of indexing, like Query() does.
I have two sets of search indexes:
TestIndex (used in our test environment) and ProdIndex (used in our PRODUCTION environment).
Lucene search query: +date:[20090410184806 TO 20091007184806] works fine for test index but gives this error message for Prod index.
"maxClauseCount is set to 1024"
If I execute following line just before executing search query, then I do not get this error.
BooleanQuery.SetMaxClauseCount(Int16.MaxValue);
searcher.Search(myQuery, collector);
Am I missing something here? Why am I not getting this error in the test index? The schema for the two indexes is the same; they only differ in the number of records/data. The PROD index has more records (around 1300) than the test one (around 950).
The range query essentially gets transformed to a boolean query with one clause for every possible value, ORed together.
For example, the query +price:[10 TO 13] is transformed to the boolean query
+(price:10 price:11 price:12 price:13)
assuming all the values 10-13 exist in the index.
I suppose all of your 1300 values fall in the range you have given. So the boolean query has 1300 clauses, which is higher than the default limit of 1024. In the test index, the limit of 1024 is not reached, as there are only 950 values.
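To make the expansion concrete, here is roughly what the rewritten query looks like when built by hand (a sketch in Lucene.Net 3.x syntax, reusing the price example above):
var expanded = new BooleanQuery();
expanded.Add(new TermQuery(new Term("price", "10")), Occur.SHOULD);
expanded.Add(new TermQuery(new Term("price", "11")), Occur.SHOULD);
expanded.Add(new TermQuery(new Term("price", "12")), Occur.SHOULD);
expanded.Add(new TermQuery(new Term("price", "13")), Occur.SHOULD);
// One SHOULD clause per distinct indexed value in the range; more than
// 1024 such clauses throws BooleanQuery.TooManyClauses.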
I had the same problem. My solution was to catch BooleanQuery.TooManyClauses and dynamically increase maxClauseCount.
Here is some code that is similar to what I have in production.
private static Hits searchIndex(Searcher searcher, Query query) throws IOException
{
boolean retry = true;
while (retry)
{
try
{
retry = false;
Hits myHits = searcher.search(query);
return myHits;
}
catch (BooleanQuery.TooManyClauses e)
{
// Double the number of boolean queries allowed.
// The default is in org.apache.lucene.search.BooleanQuery and is 1024.
String defaultQueries = Integer.toString(BooleanQuery.getMaxClauseCount());
int oldQueries = Integer.parseInt(System.getProperty("org.apache.lucene.maxClauseCount", defaultQueries));
int newQueries = oldQueries * 2;
log.error("Too many hits for query: " + oldQueries + ". Increasing to " + newQueries, e);
System.setProperty("org.apache.lucene.maxClauseCount", Integer.toString(newQueries));
BooleanQuery.setMaxClauseCount(newQueries);
retry = true;
}
    }
    // Unreachable in practice, but javac requires a return statement here.
    return null;
}
I had this same issue in C# code running with the Sitecore web content management system. I used Randy's answer above, but was not able to use the System get and set property functionality. Instead I retrieved the current count, incremented it, and set it back. Worked great!
catch (BooleanQuery.TooManyClauses e)
{
// Increment the number of boolean queries allowed.
// The default is 1024.
var currMaxClause = BooleanQuery.GetMaxClauseCount();
var newMaxClause = currMaxClause + 1024;
BooleanQuery.SetMaxClauseCount(newMaxClause);
retry = true;
}
Add this code:
using Lucene.Net.Search;
BooleanQuery.SetMaxClauseCount(2048);
Just put BooleanQuery.setMaxClauseCount(Integer.MAX_VALUE); and that's it.