Lucene Taxonomy Facets: difference between hierarchical and DrillDown search

When I index fields using Lucene's taxonomy mechanism, I can specify the hierarchical path:
var doc = new Document();
doc.add(new FacetField("Date", "2010", "10", "15"));
In the query, I can specify the path in two places:
var query = new DrillDownQuery(config);
query.add("Date", "2010", "10"); // here
var collector = new FacetsCollector();
FacetsCollector.search(searcher, query, 10, collector);
var facets = new FastTaxonomyFacetCounts(taxoReader, config, collector);
var result = facets.getTopChildren(10, "Date", "2010", "10"); // here
Both places accept String... path as an argument. My simple tests show the same behavior in both cases. What is the difference, and why do we need two places to specify the path in a query?

Related

RavenDB Get document count after BulkInsertOperations

I am using RavenDB to bulk load some documents. Is there a way to get the count of documents loaded into the database?
For insert operations I am doing:
BulkInsertOperation _bulk = docStore.BulkInsert(null,
    new BulkInsertOptions { CheckForUpdates = true });
foreach (MyDocument myDoc in docCollection)
    _bulk.Store(myDoc);
_bulk.Dispose();
And right after that I call the following:
session.Query<MyDocument>().Count();
but I always get a number which is less than the count I see in Raven Studio.
By default, the query you are doing limits to a sane number of results, part of RavenDB's promise to be safe by default and not stream back millions of records.
To get the number of documents of a specific type in your database, you need a special map-reduce index whose job is to track the counts for each document type. Because this type of index deals directly with document metadata, it's easier to define it in Raven Studio than to create it in code.
The source for that index is in this question but I'll copy it here:
// Index Name: Raven/DocumentCollections
// Map Query
from doc in docs
let Name = doc["@metadata"]["Raven-Entity-Name"]
where Name != null
select new { Name , Count = 1}
// Reduce Query
from result in results
group result by result.Name into g
select new { Name = g.Key, Count = g.Sum(x=>x.Count) }
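To see what the Map and Reduce queries above compute, here is a minimal in-memory sketch using LINQ to Objects. It does not touch RavenDB at all: the `Doc` class, its `EntityName` property, and the `CollectionCountSketch` name are hypothetical stand-ins for a stored document and its Raven-Entity-Name metadata value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionCountSketch
{
    // Hypothetical stand-in for a stored document; only the entity name matters here.
    public sealed class Doc
    {
        public string EntityName { get; set; }
    }

    public static Dictionary<string, int> CountByCollection(IEnumerable<Doc> docs)
    {
        // Map: emit one { Name, Count = 1 } entry per document that has an entity name.
        var mapped = from doc in docs
                     let Name = doc.EntityName
                     where Name != null
                     select new { Name, Count = 1 };

        // Reduce: group the mapped entries by name and sum their counts.
        var reduced = from result in mapped
                      group result by result.Name into g
                      select new { Name = g.Key, Count = g.Sum(x => x.Count) };

        return reduced.ToDictionary(r => r.Name, r => r.Count);
    }

    public static void Main()
    {
        var docs = new[]
        {
            new Doc { EntityName = "MyDocuments" },
            new Doc { EntityName = "MyDocuments" },
            new Doc { EntityName = "Courses" },
            new Doc { EntityName = null } // skipped by the "where Name != null" filter
        };

        foreach (var pair in CountByCollection(docs))
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

The real index does the same grouping on the server and keeps the result up to date in the background, which is why its numbers can lag briefly after a bulk insert.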
To read the index results in your code, you need a class that mirrors the structure of the anonymous type produced by both the Map and Reduce queries:
public class Collection
{
    public string Name { get; set; }
    public int Count { get; set; }
}
Then, as Ayende notes in the answer to the previously linked question, you can get results from the index like this:
session.Query<Collection>("Raven/DocumentCollections")
    .Where(x => x.Name == "MyDocument")
    .FirstOrDefault();
Keep in mind, however, that indexes are updated asynchronously so after bulk-inserting a bunch of documents, the index may be stale. You can force it to wait by adding .Customize(x => x.WaitForNonStaleResults()) right after the .Query(...).
Raven Studio actually gets this data from the index Raven/DocumentsByEntityName which exists for every database, by sidestepping normal queries and getting metadata on the index. You can emulate that like this:
QueryResult result = docStore.DatabaseCommands.Query("Raven/DocumentsByEntityName",
    new Raven.Abstractions.Data.IndexQuery
    {
        Query = "Tag:MyDocument",
        PageSize = 0
    },
    includes: null,
    metadataOnly: true);
var totalDocsOfType = result.TotalResults;
That QueryResult contains a lot of useful data:
{
Results: [ ],
Includes: [ ],
IsStale: false,
IndexTimestamp: "2013-11-08T15:51:25.6463491Z",
TotalResults: 3,
SkippedResults: 0,
IndexName: "Raven/DocumentsByEntityName",
IndexEtag: "01000000-0000-0040-0000-00000000000B",
ResultEtag: "BA222B85-627A-FABE-DC7C-3CBC968124DE",
Highlightings: { },
NonAuthoritativeInformation: false,
LastQueryTime: "2014-02-06T18:12:56.1990451Z",
DurationMilliseconds: 1
}
A lot of that is the same data you get on any query if you request statistics, like this:
RavenQueryStatistics stats;
Session.Query<Course>()
    .Statistics(out stats)
    // Rest of query

Selectlist manipulation to merge 2 sets together in a single list

ViewBag.WList = new SelectList(_bdb.BMW.OrderBy(o => o.NAME).AsEnumerable(), "ID", "NAME");
This is how I've set up my SelectList in the controller, with the view code below.
@Html.DropDownList("WOptions", (SelectList)ViewBag.WList, "Select an option", new { style = "width:250px;" })
I have a table in my database that I can't add new records to and that I need to use as my main option list. Additionally, I need to add 2 new entries to cover exceptions.
Basically, my create form has an option list and a type list: one of the options isn't suitable, so 2 new choices were added in a separate dropdown, the type list.
Recording or displaying this is mostly not a problem, but I have a filter on my Kendo list based on the code above.
What is the best way to add the 2 options to the ward list? Can I do it with my existing code, or do I need a different method?
Additionally I've tried,
SelectList options = new SelectList(_bdb.BMW.OrderBy(o => o.NAME).AsEnumerable(), "ID", "NAME");
options.ToList().Insert(11, (new SelectListItem { Text = "Example1", Value = "11" }));
options.ToList().Insert(22, (new SelectListItem { Text = "Example2", Value = "22" }));
ViewBag.WList = options;
While this does still display the original selectlist it doesn't display the new items and this has so far been the only variant which didn't cause errors.
Any ideas or suggestions would be much appreciated.
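The LINQ extension method .ToList() returns a new List instance every time it is called. options.ToList() therefore does not give you the SelectList's internal list, so the two items are inserted into a temporary copy that is immediately discarded. To work around this, build the list first, modify it, and only then create the SelectList:
// ID must have the same type in every anonymous object (here: string), or this will not compile.
var bmws = _bdb.BMW.OrderBy(o => o.NAME).Select(x => new { ID = x.ID, NAME = x.NAME }).ToList();
// Insert(index, item) throws if index is greater than bmws.Count; use Add(item) to append at the end instead.
bmws.Insert(11, (new { ID = "11", NAME = "Example1" }));
bmws.Insert(22, (new { ID = "22", NAME = "Example2" }));
var options = new SelectList(bmws.AsEnumerable(), "ID", "NAME");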
I have not tested the code, but this should do the trick.
Hope this helps.
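The copy behaviour of .ToList() can be checked without MVC or a database. The following is a minimal sketch on a plain list of strings; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToListCopySketch
{
    // Mirrors the failing attempt: the item goes into a temporary copy that is thrown away.
    public static int CountAfterInsertIntoCopy(IEnumerable<string> source)
    {
        source.ToList().Insert(0, "Example1");
        return source.Count(); // the source itself is unchanged
    }

    // Mirrors the fix: keep a reference to the new list, modify it, and use that same list.
    public static List<string> InsertIntoKeptList(IEnumerable<string> source)
    {
        var items = source.ToList();
        items.Insert(0, "Example1");
        return items;
    }

    public static void Main()
    {
        var names = new List<string> { "A", "B", "C" };
        Console.WriteLine(CountAfterInsertIntoCopy(names));  // prints 3
        Console.WriteLine(InsertIntoKeptList(names).Count);  // prints 4
    }
}
```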

facets with ravendb

I am trying to work with the facet feature in RavenDB but I am getting strange results.
I have documents like:
{
"SearchableModel": "42LC2RR ",
"ModelName": "42LC2RR",
"ModelID": 490578,
"Name": "LG 42 Television 42LC2RR",
"Desctription": "fffff",
"Image": "1/4/9/8/18278941c",
"MinPrice": 9400.0,
"MaxPrice": 9400.0,
"StoreAmounts": 1,
"AuctionAmounts": 0,
"Popolarity": 3,
"ViewScore": 0.0,
"ReviewAmount": 2,
"ReviewScore": 45,
"Sog": "E-TV",
"SogID": 1,
"IsModel": true,
"Manufacrurer": "LG",
"ParamsList": [
"1994267",
"46570",
"4134",
"4132",
"4118",
"46566",
"4110",
"180676",
"239517",
"750771",
"2658507",
"2658498",
"46627",
"4136",
"169941",
"169846",
"145620",
"169940",
"141416",
"3190767",
"3190768",
"144720",
"2300706",
"4093",
"4009",
"1418470",
"179766",
"190025",
"170557",
"170189",
"43768",
"4138",
"67976",
"239516",
"3190771",
"141195"
],
}
Each entry in ParamsList represents a property of the product, and our application caches what each param represents.
When searching for a specific product, I would like to count all the returned attributes so that I can show the amount of each item after the search.
After searching for lg in the televisions category, I want to get:
Param: 4134, which represents LCD, and the amount: 65.
Unfortunately I am getting strange results: only some params are counted and some are not.
On some searches where I do get results back, I don't get any amounts back.
I am using the latest stable version of RavenDB.
index :
from doc in docs
from param in doc.ParamsList
select new {Name=doc.Name,Description=doc.Description,SearchNotVisible = doc.SearchNotVisible,SogID=doc.SogID,Param =param}
facet :
DocumentStore documentStore = new DocumentStore { ConnectionStringName = "Server" };
documentStore.Initialize();
using (IDocumentSession session = documentStore.OpenSession())
{
    List<Facet> _facets = new List<Facet>
    {
        new Facet { Name = "Param" }
    };
    session.Store(new FacetSetup { Id = "facets/Params", Facets = _facets });
    session.SaveChanges();
}
usage example :
IDictionary<string, IEnumerable<FacetValue>> facets = session.Advanced.DatabaseCommands.GetFacets("FullIndexParams", new IndexQuery { Query = "Name:lg" }, "facets/Params");
I tried many variations without success.
Does anyone have an idea what I am doing wrong?
Thanks
Use this index, it should resolve your problem:
from doc in docs
select new {Name=doc.Name,Description=doc.Description,SearchNotVisible = doc.SearchNotVisible,SogID=doc.SogID,Param = doc.ParamsList}
Which analyzer did you set for the "Name" field? I see you search by Name "lg". By default, RavenDB uses a keyword analyzer, which means you must search by the exact name. You should set another analyzer for the Name or Description field (StandardAnalyzer, for example).

Conditional patch requests in RavenDB

How do I patch a document in RavenDB conditionally? The code below patches all documents of type Patrons to MiddleInitial = JJJ. I would like to apply the same patch to the same Patrons document type, but only to those that have City = "New York".
store.DatabaseCommands.UpdateByIndex("Raven/DocumentsByEntityName",
    new IndexQuery { Query = "Tag:Patrons" },
    new[]
    {
        new PatchRequest
        {
            Type = PatchCommandType.Set,
            Name = "MiddleInitial",
            Value = "JJJ"
        }
    }, allowStale: false);
ZVenue,
You do it using:
store.DatabaseCommands.UpdateByIndex("Patrons/ByCity",
    new IndexQuery { Query = "City:\"New York\"" },
    new[]
    {
        new PatchRequest
        {
            Type = PatchCommandType.Set,
            Name = "MiddleInitial",
            Value = "JJJ"
        }
    }, allowStale: false);
Where the Patrons/ByCity index is defined as:
from p in docs.Patrons select new { p.City }
EDIT: It seems that I was wrong with this answer, because Ayende explains how to do it in his answer.
This is something that cannot be done at the moment. However, Matt Warren has implemented something based on IronJS to do this stuff. I don't know when and if it will become part of the main product, but you can certainly use his Github repo if you really need it.
Instead, I suggest you either patch the documents on your own or don't denormalize the data and use .Include() instead if that's applicable in your case.

Organising a Lucene Directory

I have a set of records. Each record can have multiple skills and a single state.
So you might have a record of skills a, b and c and a state of Victoria.
I need to be able to search the directory for any records that have say skill a in Victoria or skill a and c in Victoria.
I'm having trouble creating an effective directory that will allow me to search in the manner I want.
At first I created a directory with skills: a b c state:vic
then I tried skills: a,b,c state:vic
But searching these is not giving me the correct results. In fact, when I run these queries:
skills:a,b AND state:vic,
skills: a OR b AND state:vic,
skills: a OR skills:b AND state:vic
they all return the set of records for skills: a AND state:vic.
Any thoughts?
EDIT
Since posting this I have gotten it to work but am unsure if this is the right approach.
I have combined all the skills into a single field, separated by spaces. The skills are in GUID format.
Then in my search method I do this;
queryString = "skills:(skill1 OR skill2 OR skill3) AND state:Vic";
Query query = parser.Parse(queryString);
This works fine and it's very quick but is creating a queryString really the way to go with this or is there a better way?
You can use a BooleanQuery : http://geekswithblogs.net/rgupta/archive/2009/01/07/lucene-multifield-searches.aspx
You could build your BooleanQuery directly, instead of going via a QueryParser. This requires that you differentiate between the skills and state input fields when presenting your search interface to your users.
var skillQuery = new BooleanQuery {
    { new TermQuery(new Term("skills", "skill1")), Occur.SHOULD },
    { new TermQuery(new Term("skills", "skill2")), Occur.SHOULD },
    { new TermQuery(new Term("skills", "skill3")), Occur.SHOULD }
};
var query = new BooleanQuery() {
    { skillQuery, Occur.MUST },
    { new TermQuery(new Term("state", "Vic")), Occur.MUST }
};
// query = "+(skills:skill1 skills:skill2 skills:skill3) +state:Vic"
Allowing free-text entries when searching for skills still requires a call to QueryParser.Parse.
var analyzer = new StandardAnalyzer(Version.LUCENE_30);
var skillParser = new QueryParser(Version.LUCENE_30, "skills", analyzer);
var skillQuery = skillParser.Parse("skill1, skill2, skill3");
var query = new BooleanQuery() {
    { skillQuery, Occur.MUST },
    { new TermQuery(new Term("state", "Vic")), Occur.MUST }
};
// query = "+(skills:skill1 skills:skill2 skills:skill3) +state:Vic"
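As a rough way to reason about what "+(skills:skill1 skills:skill2 skills:skill3) +state:Vic" selects, the same logic can be written as an in-memory filter. This is only a sketch of the matching rule, not of Lucene itself: there is no index, analysis, or scoring, and the `Record` class and `SkillFilterSketch` name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkillFilterSketch
{
    // Hypothetical record: one state and any number of skills.
    public sealed class Record
    {
        public string State { get; set; }
        public HashSet<string> Skills { get; set; }
    }

    public static List<Record> Search(IEnumerable<Record> records,
                                      IEnumerable<string> wantedSkills,
                                      string state)
    {
        var wanted = wantedSkills.ToList();
        return records
            // Inner SHOULD clauses wrapped in a MUST: at least one wanted skill has to match.
            .Where(r => r.Skills.Overlaps(wanted))
            // Outer MUST clause: the state has to match as well.
            .Where(r => r.State == state)
            .ToList();
    }

    public static void Main()
    {
        var records = new[]
        {
            new Record { State = "Vic", Skills = new HashSet<string> { "skill1", "skill3" } },
            new Record { State = "Vic", Skills = new HashSet<string> { "skill9" } },
            new Record { State = "NSW", Skills = new HashSet<string> { "skill1" } }
        };

        var hits = Search(records, new[] { "skill1", "skill2" }, "Vic");
        Console.WriteLine(hits.Count); // prints 1
    }
}
```

To require every listed skill instead of any of them, each skill clause in the Lucene query would use Occur.MUST; the in-memory analogue is replacing Overlaps with IsSupersetOf.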