Sitecore - get all indexed items - lucene

I am using Lucene search to get bucket items filtered by some string. This is my code:
var innerQuery = new FullTextQuery(myString);
var hits = searchContext.Search(innerQuery, searchIndex.GetDocumentCount());
Is there a query, or some other approach, that will let me get all indexed items? I tried passing an empty "myString", but I get an error saying it cannot be empty.

You can use Lucene.Net.Search.MatchAllDocsQuery:
SearchHits hits = searchContext.Search(new MatchAllDocsQuery(), int.MaxValue);

Related

Get total results for revisions (RavenDB Client API)

I am trying to display document revisions in a paginated list. This will only work when I know the total results count so I can calculate the page count.
For common document queries this is possible with session.Query(...).Statistics(out var stats).
My current solution is the following:
// get paged revisions
List<T> items = await session.Advanced.Revisions.GetForAsync<T>(id, (pageNumber - 1) * pageSize, pageSize);
double totalResults = items.Count;
// get total results in case items count equals page size
if (pageSize <= items.Count || pageNumber > 1)
{
    GetRevisionsCommand command = new GetRevisionsCommand(id, 0, 1, true);
    session.Advanced.RequestExecutor.Execute(command, session.Advanced.Context);
    command.Result.Results.Parent.TryGet("TotalResults", out totalResults);
}
The problem with this approach is that I need an additional request just to get the TotalResults property, which has already been returned by the first request. I just don't know how to access it.
You are right: TotalResults is returned from the server, but it is not parsed on the client side.
I opened an issue to fix that: https://issues.hibernatingrhinos.com/issue/RavenDB-15552
You can also get the total revision count for a document from the /databases/{dbName}/revisions/count?id={docId} endpoint. Example client code:
var docId = "company/1";
var dbName = store.Database;
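var url = store.Urls.First() + $"/databases/{dbName}/revisions/count?id={Uri.EscapeDataString(docId)}";
// Await both calls instead of blocking on .Result
var response = await store.GetRequestExecutor().HttpClient.GetAsync(url);
var json = await response.Content.ReadAsStringAsync();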
using var ctx = JsonOperationContext.ShortTermSingleUse();
using var obj = ctx.ReadForMemory(json, "revision-count");
obj.TryGet("RevisionsCount", out long revisionsCount);
Another way is to fetch all the revisions:
var revisions = await session.Advanced.Revisions.GetForAsync<Company>(docId, 0, int.MaxValue);
Then use revisions.Count as the total count. Note that this loads every revision just to count them.
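Once the total is known by either route, the page count for the paginated list follows from a ceiling division. A minimal sketch; the Paging class and its PageCount method are hypothetical helpers for illustration, not part of the RavenDB client:

```csharp
using System;

public static class Paging
{
    // Number of pages needed to show totalResults items, pageSize per page.
    public static int PageCount(long totalResults, int pageSize)
    {
        if (pageSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (totalResults <= 0)
            return 0;
        // Integer ceiling division avoids floating-point rounding.
        return (int)((totalResults + pageSize - 1) / pageSize);
    }
}
```

For example, Paging.PageCount(revisionsCount, pageSize) with 25 revisions and a page size of 10 gives 3 pages.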

TFS query on HTML-field for empty value

I want to use a WIQL query to get all work items that have an empty HTML field.
Is there a way to do so?
The query editor on the TFS web page cannot find work items with an empty HTML field such as Description, because there is no operator like "isEmpty" or "isNotEmpty". There is a UserVoice suggestion for this request, and according to it, the feature is under review.
As a workaround, you could use Excel to filter those work items. Write a simple query, export the work items to Excel, and then apply a filter in Excel.
You could also use the TFS object model API to find the work items with an empty field. Here is an example:
WorkItemStore workItemStore = teamProjectCollection.GetService<WorkItemStore>();
string queryString = "Select [State], [Title],[Description] From WorkItems Where [Work Item Type] = 'User Story' and [System.TeamProject] = 'TeamProjectName'";
// Create and run the query.
Query query = new Query(workItemStore, queryString);
WorkItemCollection witCollection = query.RunQuery();
foreach (WorkItem workItem in witCollection)
{
    // Check whether the field is empty; Convert.ToString returns "" for a null value
    if (string.IsNullOrWhiteSpace(Convert.ToString(workItem.Fields["Description"].Value)))
    {
        Console.WriteLine(workItem.Title);
    }
}
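A plain string comparison misses values that contain markup but no visible text, such as <div>&nbsp;</div>. Whether TFS stores such values for a visually empty field is an assumption here, so treat the following as a sketch. The HtmlFieldCheck class is a hypothetical helper, not part of the TFS object model:

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlFieldCheck
{
    // True when the value is null, blank, or HTML without any visible text.
    // Caveat: a value holding only an image tag also counts as empty here.
    public static bool IsEffectivelyEmpty(object fieldValue)
    {
        string html = Convert.ToString(fieldValue);
        if (string.IsNullOrWhiteSpace(html))
            return true;
        // Drop the tags, then decode entities such as &nbsp; into characters.
        string withoutTags = Regex.Replace(html, "<[^>]*>", string.Empty);
        string text = WebUtility.HtmlDecode(withoutTags);
        return string.IsNullOrWhiteSpace(text);
    }
}
```

In the loop above, the condition would become HtmlFieldCheck.IsEffectivelyEmpty(workItem.Fields["Description"].Value).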

Creating a MoreLikeThis query yields empty queries and terms

I'm trying to find documents similar to a given one, using the Lucene.Net MoreLikeThis class. For this I separate my documents into multiple fields, Title and Content. Creating the actual query results in an empty query without any interesting terms.
My code looks like this:
var queries = new List<Query>();
foreach(var docField in docFields)
{
    var similarSearch = new MoreLikeThis(indexReader);
    similarSearch.SetFieldNames(docField.fieldName);
    similarSearch.Analyzer = new GermanAnalyzer(Version.LUCENE_30, new HashSet<string>(StopWords));
    similarSearch.MinDocFreq = 1;
    similarSearch.MinTermFreq = 1;
    similarSearch.MinWordLen = 1;
    similarSearch.Boost = true;
    similarSearch.BoostFactor = boostFactor;
    using(var reader = new StringReader(docField.Content))
    {
        var searchQuery = similarSearch.Like(reader);
        // debugging purpose
        var queryString = searchQuery.ToString(); // empty
        var terms = similarSearch.RetrieveInterestingTerms(reader); // also empty
        queries.Add(searchQuery);
    }
}
var booleanQuery = new BooleanQuery();
foreach(var moreLikeThisQuery in queries)
{
booleanQuery.Add(moreLikeThisQuery, Occur.SHOULD);
}
var topDocs = indexSearcher.Search(booleanQuery, maxNumberOfResults); // and of course no results obtained
So the question is:
Why are there no terms, and why is no query generated?
I hope the important parts are clear; if not, please help me improve my first question :)
I got it to work.
The problem was that I was working on the wrong directory.
I have separate solutions for creating the index and for creating the queries, and the index location did not match between them.
So the general checklist would be:
Is your query-generating class fully initialized? (MinDocFreq, MinTermFreq and MinWordLen set, an Analyzer assigned, the field names set)
Is the IndexReader you use correctly initialized, and does it point at the right index?

Lucene DrillSideways with custom sort field or with TopScoreDocCollector

I am trying to implement drillsideways search with Lucene 4.6.1.
The following code works fine:
DrillSideways ds = new DrillSideways(searcher, taxoReader);
FacetSearchParams fsp = new FacetSearchParams(getAllFacetCounts());
DrillDownQuery ddq = new DrillDownQuery(fsp.indexingParams, mainQuery);
List<CategoryPath> paths = new ArrayList<CategoryPath>();
...
add category path
...
if (paths.size() >0)
ddq.add(paths.toArray(new CategoryPath[paths.size()]));
DrillSidewaysResult dsr = ds.search(null, ddq, 500, fsp); // <-- here
TopDocs topDocs = dsr.hits;
ScoreDoc[] hits = topDocs.scoreDocs;
// list search results
listSearchResults(searcher, hits, Math.min(500, topDocs.totalHits));
But what if I want to pass TopScoreDocCollector, like
// for now it is top score collector,
// but I may want to implement custom sort
TopScoreDocCollector topDocsCollector = TopScoreDocCollector.create(500, true);
DrillSidewaysResult dsr = ds.search(ddq, topDocsCollector, fsp);
the result is an empty set, with no errors. What is wrong?
I'm guessing you are referring to the value of DrillSidewaysResult.hits. That is the intended behavior, as noted in the documentation of DrillSidewaysResult:
Note that if you called DrillSideways.search(DrillDownQuery, Collector, FacetSearchParams), then hits will be null.
You should get your hits from the Collector instead, for example by calling topDocsCollector.topDocs().

Ordering a query by the string length of one of the fields

In RavenDB (build 2330) I'm trying to order my results by the string length of one of the indexed terms.
var result = session.Query<Entity, IndexDefinition>()
.Where(condition)
.OrderBy(x => x.Token.Length);
However, the results appear to be unsorted. Is this possible in RavenDB (or via a Lucene query), and if so, what is the syntax?
You need to add a field to IndexDefinition to order by, and set its SortOptions to Int or another appropriate numeric type. Do not leave it as String, which is the default, because string sorting is lexicographic.
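To see why a String sort is the wrong choice for a length, compare the two orderings on plain numbers. This is a standalone illustration, independent of RavenDB; the SortOrderDemo class is hypothetical:

```csharp
using System;
using System.Linq;

public static class SortOrderDemo
{
    // Orders the values by their text form, the way a string sort compares terms.
    public static int[] OrderAsStrings(int[] lengths)
    {
        return lengths
            .Select(n => n.ToString())
            .OrderBy(s => s, StringComparer.Ordinal)
            .Select(s => int.Parse(s))
            .ToArray();
    }

    // Orders the values numerically, which is what a length needs.
    public static int[] OrderAsNumbers(int[] lengths)
    {
        return lengths.OrderBy(n => n).ToArray();
    }
}
```

For the lengths 2, 10, 9, 100, the string ordering yields 10, 100, 2, 9, while the numeric ordering yields 2, 9, 10, 100.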
If you want to use the Linq API as in your example, you need to add a field named Token_Length to the index's Map function (see Matt's comment):
from doc in docs
select new
{
...
Token_Length = doc.Token.Length
}
And then you can query using the Linq API:
var result = session.Query<Entity, IndexDefinition>()
.Where(condition)
.OrderBy(x => x.Token.Length);
Or if you really want the field to be called TokenLength (or something other than Token_Length) you can use a LuceneQuery:
from doc in docs
select new
{
...
TokenLength = doc.Token.Length
}
And you'd query like this:
var result = session.Advanced.LuceneQuery<Entity, IndexDefinition>()
.Where(condition)
.OrderBy("TokenLength");