RavenDB - what are "string indexes"

In a question I asked on the RavenDB discussion boards about creating a large number of similar indexes, I was told not to make "so many indexes that do the same thing" and to create a "string index" instead. The exact words were:
There is no reason to have so many indexes that do the same thing.
Create an index with:
from d in docs
select new { d.Id, d.Name, Collection = d["@metadata"]["Raven-Entity-Name"] }
And query on that.
Reference Topic
I do not understand what this means at all. I have read the Raven documentation many times before today, and I am still lost.
The closest I can come up with is something like this:
RavenSession.Advanced.DocumentStore.DatabaseCommands.PutIndex("Index/Name",
new Raven.Client.Indexes.IndexDefinitionBuilder<IMayTransform>{
Map = results => from result in results
select new{
result.Id,
result.Name,
result["@metadata"]["Raven-Entity-Name"]
}
});
Update
Adding the interfaces involved, by request.
public interface IMayTransform : IHasIdentity, IHasName { }
public interface IHasIdentity {
/// <summary>
/// A string based identity that is used by raven to store the entity.
/// </summary>
string Id { get; set; }
}
public interface IHasName {
/// <summary>
/// The friendly display name for the entity.
/// </summary>
string Name { get; set; }
}
But this does not work. First of all, the ["@metadata"] part does not compile; I do not even understand the purpose of its existence in the example I was given. I also do not understand how I am supposed to query something like this, where it goes, where it is defined, or how it is called. I have literally no concept of what a "string index" is.
On top of that, I do not understand how to add analyzers to this. Any help is appreciated; this has been a horrible morning and I am frantic to get at least some work done, but this is leaving me very confused.
IMayTransform is an interface that every entity that needs to fit this implements.
Update (Again)
Between the answer here on Stack Overflow and the glorious help from Ayende on the RavenDB Google Groups, I have the following index that works. And I HATE IT.
public class EntityByName : AbstractIndexCreationTask {
public override IndexDefinition CreateIndexDefinition() {
return new IndexDefinition {
Map = "from doc in docs let collection = doc[\"@metadata\"][\"Raven-Entity-Name\"] select new { doc.Id, doc.Name, doc.Abbreviation, Group_Name = doc.Group.Name, Tags = doc.Tags.Select( r => r.Name ), collection };",
Indexes = {
{"Name", FieldIndexing.Analyzed },
{"Abbreviation", FieldIndexing.Analyzed },
{"Tags", FieldIndexing.Analyzed },
{"Group.Name", FieldIndexing.Analyzed }
},
Fields ={
{ "Name" },
{ "Abbreviation" },
{ "Group.Name" },
//{ "Tags" }
},
Stores ={
{ "Name", FieldStorage.Yes },
{ "Abbreviation", FieldStorage.Yes },
{ "Group.Name", FieldStorage.Yes },
//{ "Tags", FieldStorage.Yes }
},
SortOptions = {
{ "collection", SortOptions.String },
{ "Name", SortOptions.String }
}
};
}
}
This does exactly what I need it to do. It is fast, efficient, it works fine, it gives me no trouble, but hard coding everything into a string query is driving me nuts. I am closing this question and opening a new one concerning this, but if anyone has suggestions on ways I can get away from this, I would love to hear them. This is an issue of personal desire, not functional need.

Try it like this:
RavenSession.Advanced.DocumentStore.DatabaseCommands.PutIndex(
"Index/Name",
new IndexDefinition {
Map = "from d in docs select new { Id = d.Id, Name = d[\"@metadata\"][\"Raven-Entity-Name\"] }"
});
Now you can see what a "string index" is as well.
BTW: there is a built-in index that does something similar: Raven/DocumentsByEntityName
Usage:
public class Entity
{
public string Id { get; set; }
public string Name { get; set; }
}
session.Query<Entity>("Index/Name").Where(x => x.Name == "Foo");
Further Explanation:
There is no way to build indexes using interfaces only. To achieve what you want, Raven would need to implement a BaseEntity that exposes the metadata as properties, and all documents would have to derive from that entity. But that is not going to happen, and nobody wants that.
But when designing your documents, you could include the entity name yourself, like this:
public class EntityBase
{
public EntityBase()
{
this.TypeName = this.GetType().Name;
}
public string Name { get; set; }
public string TypeName { get; set; }
}
public class Person : EntityBase
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
Now you could build a RavenDB index based on EntityBase and query on TypeName.
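To illustrate the idea without a running RavenDB server, here is a minimal in-memory sketch. It only shows that the base constructor records the derived type's name and that documents can then be filtered on it; the Company class and the TypeNameSketch helper are hypothetical names added for illustration, and LINQ to Objects stands in for an actual index query.

```csharp
using System.Collections.Generic;
using System.Linq;

public class EntityBase
{
    public EntityBase()
    {
        // GetType() returns the runtime type, so each derived class records its own name.
        TypeName = GetType().Name;
    }

    public string Name { get; set; }
    public string TypeName { get; set; }
}

public class Person : EntityBase { }

// Hypothetical second document type, only here to make the filter meaningful.
public class Company : EntityBase { }

public static class TypeNameSketch
{
    // In-memory stand-in for querying an index on the TypeName field.
    public static List<EntityBase> OfTypeName(IEnumerable<EntityBase> docs, string typeName)
    {
        return docs.Where(d => d.TypeName == typeName).ToList();
    }
}
```

Because TypeName is an ordinary property, it is stored in the document itself and can be indexed like any other field.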

Related

Returning sub-properties from document in RavenDb via Index using Lucene

I'm using RavenDB 2.5 and what I want to do is query a Group (see below) providing a valid Lucene search term and get back a collection of Member instances (or even just Ids) that match. So, class definition:
public class Group {
public string Id { get; set; }
public IList<Member> Members { get; set; }
}
public class Member {
public string Name { get; set; }
public string Bio { get; set; }
}
And they are stored in the database as session.Store(groupInstance); as you'd expect. What I'd like to do is query and return the Member instances which match a given search term.
So, something like:
public class GroupMembers_BySearchTerm : AbstractIndexCreationTask {
public override IndexDefinition CreateIndexDefinition(){
return new IndexDefinition {
Map = "from g in docs.Groups select new { Content = new [] { g.Members.Select(m => m.Name), g.Members.Select(m => m.Bio) }, Id = g.Id }",
Indexes = { { "Id", FieldIndexing.Default }, { "Content", FieldIndexing.Analyzed } }
};
}
}
If I call this using something like:
session.Advanced.LuceneQuery<Group, GroupMembers_BySearchTerm>().Where("Id: myId AND Content: search terms").ToList();
I obviously get back a Group instance, but how can I get back the Members instead?

What about an index like this:
public class Members_BySearchTermAndGroup : AbstractIndexCreationTask {
public override IndexDefinition CreateIndexDefinition(){
return new IndexDefinition {
Map = @"from g in docs.Groups
from member in g.Members
select new {
GroupId = g.Id,
Name = member.Name,
Bio = member.Bio,
Content = new [] { member.Name, member.Bio },
}",
Indexes = {
{ "GroupId", FieldIndexing.Default },
{ "Content", FieldIndexing.Analyzed }
},
Stores = {
{ "Name", FieldStorage.Yes },
{ "Bio", FieldStorage.Yes }
}
};
}
}
If you take a closer look, you'll see that we are creating a new Lucene entry for each member inside a group. Consequently, you'll be able to query on those elements and retrieve them.
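The "one entry per member" behaviour can be sketched in plain LINQ to Objects. This is only an in-memory illustration of the map's shape, not how RavenDB builds index entries internally; MemberEntry and FanOutSketch are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Member
{
    public string Name { get; set; }
    public string Bio { get; set; }
}

public class Group
{
    public string Id { get; set; }
    public IList<Member> Members { get; set; }
}

// Hypothetical shape of a single index entry.
public class MemberEntry
{
    public string GroupId { get; set; }
    public string Name { get; set; }
    public string Bio { get; set; }
}

public static class FanOutSketch
{
    // Same structure as the map: the nested "from" produces one entry per (group, member) pair.
    public static List<MemberEntry> Map(IEnumerable<Group> groups)
    {
        return (from g in groups
                from member in g.Members
                select new MemberEntry { GroupId = g.Id, Name = member.Name, Bio = member.Bio })
               .ToList();
    }
}
```

A group with two members and a group with one member therefore yield three entries, each carrying the owning group's Id.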
Finally you can query your store like this (more info about searching):
session.Query<Member, Members_BySearchTermAndGroup>()
.Search(x => x.Content, "search terms")
.ProjectFromIndexFieldsInto<Member>()
.ToList();
I cannot check this right now but I guess that you need to project your results using the ProjectFromIndexFieldsInto(). Some more information about projections in this link.
or, following your example:
session.Advanced
.LuceneQuery<Member, Members_BySearchTermAndGroup>()
.Where("GroupId: myId AND Content: search terms")
.SelectFields<Member>()
.ToList();

Losing the record ID

I have a record structure where I have a parent record with many children records. On the same page I will have a couple queries to get all the children.
In a later query I get a record set that shows "Proxy" when I expand it. That is fine for getting data from the record, since everything is generally there. The only problem is that when I go to grab the record "ID" it is always "0", because the record is a proxy. This makes it pretty tough to build a dropdown list where I use the record ID as the "selected value". What makes this worse is that it is random: out of a list of 5 items, 2 of them will have an ID of "0" because they are proxies.
I can use evict to force it to load at times. However, when I need lazy loading (for grids), evicting is bad since it kills the lazy load and I can't display the grid contents on the fly.
I am using the following to start my session:
ISession session = FluentSessionManager.SessionFactory.OpenSession();
session.BeginTransaction();
CurrentSessionContext.Bind(session);
I even use ".SetFetchMode("MyTable", FetchMode.Eager)" within my queries and it still shows "Proxy".
Proxy is fine, but I need the record ID. Anyone else run into this and have a simple fix?
I would greatly appreciate some help on this.
Thanks.
Per request, here is the query I am running that will result in Patients.Children having an ID of "0" because it is showing up as "Proxy":
public IList<Patients> GetAllPatients()
{
return FluentSessionManager.GetSession()
.CreateCriteria<Patients>()
.Add(Expression.Eq("IsDeleted", false))
.SetFetchMode("Children", FetchMode.Eager)
.List<Patients>();
}

I have found the silver bullet that fixes the proxy issue where you lose your record ID!
I was using ClearCache to take care of the problem. That worked just fine for the first couple of layers in the record structure. However, in a scenario like Parent.Child.AnotherLevel.OneMoreLevel.DownOneMore, it would not fix the 4th and 5th levels. The method I came up with does. I also found that the problem mostly presented itself when a one-to-many mapping was followed by a many-to-one mapping. So here is the answer for everyone else out there who is running into the same problem.
Domain Structure:
public class Parent : DomainBase<int>
{
public virtual int ID { get { return base.ID2; } set { base.ID2 = value; } }
public virtual string Name { get; set; }
....
}
DomainBase:
public abstract class DomainBase<Y> : IDomainBase<Y>
{
public virtual Y ID //Everything has an identity Key.
{
get;
set;
}
protected internal virtual Y ID2 // Real identity Key
{
get
{
Y myID = this.ID;
if (typeof(Y).ToString() == "System.Int32")
{
if (int.Parse(this.ID.ToString()) == 0)
{
myID = ReadOnlyID;
}
}
return myID;
}
set
{
this.ID = value;
this.ReadOnlyID = value;
}
}
protected internal virtual Y ReadOnlyID { get; set; } // Real identity Key
}
IDomainBase:
public interface IDomainBase<Y>
{
Y ID { get; set; }
}
Domain Mapping:
public class ParentMap : ClassMap<Parent, int>
{
public ParentMap()
{
Schema("dbo");
Table("Parent");
Id(x => x.ID);
Map(x => x.Name);
....
}
}
ClassMap:
public class ClassMap<TEntityType, TIdType> : FluentNHibernate.Mapping.ClassMap<TEntityType> where TEntityType : DomainBase<TIdType>
{
public ClassMap()
{
Id(x => x.ID, "ID");
Map(x => x.ReadOnlyID, "ID").ReadOnly();
}
}

.Net WebAPI URI convention for advanced searching /filtering

I am relatively new to REST and Microsoft's WebAPI. We are implementing a hub REST service that will house several types of object gets and sets. As the lead on the project, I have been tasked with coming up with the URI design we will use. I was wondering what people's thoughts are on which is better. Yes, I specifically phrased that without using the word "standard".
Here are the options my team and I are currently entertaining:
Http://servername/API/REST/Ldap/AD/employees?username=jsmith
Http://servername/API/REST/Ldap/AD/employee/UserName?searchTerm=jsmith (this seems RPC to me)
Http://servername/API/REST/Ldap/AD/employees/getusername?searchterm=jsmith
We are also creating a SOAP version, hence the REST in the URI.
Thanks for the input

I recommend you take a look at Web API Design from Brian Mulloy. He has several recommendations when it comes to searching and filtering.
Simplify associations - sweep complexity under the ‘?’
Most APIs have intricacies beyond the base level of a resource.
Complexities can include many states that can be updated, changed,
queried, as well as the attributes associated with a resource. Make it
simple for developers to use the base URL by putting optional states
and attributes behind the HTTP question mark. Keep your API intuitive
by simplifying the associations between resources, and sweeping
parameters and other complexities under the rug of the HTTP question
mark.
Tips for search
While a simple search could be modeled as a resourceful API (for
example, dogs/?q=red), a more complex search across multiple resources
requires a different design. If you want to do a global search across
resources, we suggest you follow the Google model:
Global search
/search?q=fluffy+fur
Here, search is the verb; ?q represents the query.
Scoped search
To add scope to your search, you can prepend with the scope of the
search. For example, search in dogs owned by resource ID 5678
/owners/5678/dogs?q=fluffy+fur
Notice that the explicit search has been dropped from the URL and
instead it relies on the parameter 'q' to indicate the scoped query.
Pagination and partial response
Support partial response by adding optional fields in a comma
delimited list.
/dogs?fields=name,color,location
Use limit and offset to make it easy for developers to paginate
objects.
/dogs?limit=25&offset=50
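As a rough sketch of how limit and offset could be applied on the server side, assuming offset counts items (not pages) as in the quoted guidance: PagingSketch and Page are hypothetical names, and the in-memory sequence stands in for whatever data source the API actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Applies "?limit=25&offset=50" style paging: skip 'offset' items, then take at most 'limit'.
    public static List<T> Page<T>(IEnumerable<T> source, int limit, int offset)
    {
        if (limit < 0) throw new ArgumentOutOfRangeException("limit");
        if (offset < 0) throw new ArgumentOutOfRangeException("offset");
        return source.Skip(offset).Take(limit).ToList();
    }
}
```

With 100 items numbered from 1, a limit of 25 and an offset of 50 returns items 51 through 75. An offset past the end of the data simply returns an empty list.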

To Oppositional's comment, this is what I put together some time ago.
https://groups.google.com/d/msg/servicestack/uoMzASmvxho/CtqpZdju7NcJ
public class QueryBase
{
public string Query { get; set; }
public int Limit { get; set; }
public int Offset { get; set; }
}
[Route("/v1/users")]
public class User : IReturn<List<User>>
{
public string Id { get; set; }
public string FirstName { get; set; }
public string MiddleName { get; set; }
public string LastName { get; set; }
public string Email { get; set; }
}
public class RequestFilterAttribute : Attribute, IHasRequestFilter
{
#region IHasRequestFilter Members
public IHasRequestFilter Copy()
{
return this;
}
public int Priority
{
get { return -100; }
}
public void RequestFilter(IHttpRequest req, IHttpResponse res, object requestDto)
{
var query = req.QueryString["q"] ?? req.QueryString["query"];
var limit = req.QueryString["limit"];
var offset = req.QueryString["offset"];
var user = requestDto as QueryBase;
if (user == null) { return; }
user.Query = query;
user.Limit = limit.IsEmpty() ? int.MaxValue : int.Parse(limit);
user.Offset = offset.IsEmpty() ? 0 : int.Parse(offset);
}
#endregion
}
[Route("/v1/users/search", "GET")]
[RequestFilter]
public class SearchUser : QueryBase, IReturn<PagedResult<User>> { }
public class UsersService : Service
{
public static List<User> UserRepository = new List<User>
{
new User{ Id="1", FirstName = "Michael", LastName = "A", Email = "michaelEmail" },
new User{ Id="2", FirstName = "Robert", LastName = "B", Email = "RobertEmail" },
new User{ Id="3", FirstName = "Khris", LastName = "C", Email = "KhrisEmail" },
new User{ Id="4", FirstName = "John", LastName = "D", Email = "JohnEmail" },
new User{ Id="5", FirstName = "Lisa", LastName = "E", Email = "LisaEmail" }
};
public PagedResult<User> Get(SearchUser request)
{
var query = request.Query;
var users = request.Query.IsNullOrEmpty()
? UserRepository.ToList()
: UserRepository.Where(x => x.FirstName.Contains(query) || x.LastName.Contains(query) || x.Email.Contains(query)).ToList();
var totalItems = users.Count;
var totalPages = (int)Math.Ceiling((decimal)totalItems / (decimal)request.Limit);
var currentPage = request.Offset;
users = users.Skip(request.Offset * request.Limit).Take(request.Limit).ToList();
var itemCount = users.Count;
return new PagedResult<User>
{
TotalItems = totalItems,
TotalPages = totalPages,
ItemCount = itemCount,
Items = users,
CurrentPage = currentPage
};
}
}

ADO.NET Entity Framework Hierarchical Data

Consider the following database model:
And the following query code:
using (var context = new DatabaseEntities())
{
return context.Feed.ToHierarchy(f => f.Id_Parent, null);
}
Where ToHierarchy is an extension to ObjectSet<TEntity> as:
public static List<TEntity> ToHierarchy<TEntity, TProperty>(this ObjectSet<TEntity> set, Func<TEntity, TProperty> parentIdProperty, TProperty idRoot) where TEntity : class
{
return set.ToList().Where(f => parentIdProperty(f).Equals(idRoot)).ToList();
}
This would result in the following example JSON-formatted response:
[
{
"Description":"...",
"Details":[ ],
"Id":1,
"Id_Parent":null,
"Title":"...",
"URL":"..."
},
{
"Description":"...",
"Details":[
{
"Description":"...",
"Details":[ ],
"Id":4,
"Id_Parent":3,
"Title":"...",
"URL":"..."
},
{
"Description":"...",
"Details":[
{
"Description":"...",
"Details":[
{
"Description":"...",
"Details":[ ],
"Id":7,
"Id_Parent":6,
"Title":"...",
"URL":"..."
}
],
"Id":6,
"Id_Parent":5,
"Title":"...",
"URL":"..."
}
],
"Id":5,
"Id_Parent":3,
"Title":"...",
"URL":null
}
],
"Id":3,
"Id_Parent":null,
"Title":"...",
"URL":null
}
]
As you may have noticed, the ToHierarchy method is supposed to (and apparently does) retrieve all rows from a given (flat) set and return a hierarchical representation of them according to the "parent property".
When I was in the middle of my implementation, I quickly tried my code and, surprisingly, it worked! I imagine this sounds weird to many of you, but I really don't understand why or how that piece of code works, even though I wrote it myself...
Could you explain how it works?
P.S.: if you look closer, ToHierarchy is not nearly the same as .Include("Details").

It works because set.ToList() loads all records from the database table into your application, and the rest is done by EF and its change tracking mechanism, which ensures correct references between related entities.
By the way, you are filtering records in the memory of your application, not in the database. For example, if your table contains 10000 records and your filter should return only 10, you will still load all 10000 from the database.
You will find that implementing this with EF is quite hard because EF has no support for hierarchical data. You will always end up with a bad solution. The only good solution is to use a stored procedure and the database's support for hierarchical queries, for example common table expressions (CTEs) in SQL Server.
I just made this very simple example, and it works as I described in my comment:
public class SelfReferencing
{
public int Id { get; set; }
public string Name { get; set; }
public SelfReferencing Parent { get; set; }
public ICollection<SelfReferencing> Children { get; set; }
}
public class Context : DbContext
{
public DbSet<SelfReferencing> SelfReferencings { get; set; }
}
public class Program
{
static void Main(string[] args)
{
using (var context = new Context())
{
context.Database.Delete();
context.Database.CreateIfNotExists();
context.SelfReferencings.Add(
new SelfReferencing()
{
Name = "A",
Children = new List<SelfReferencing>()
{
new SelfReferencing()
{
Name = "AA",
Children = new List<SelfReferencing>()
{
new SelfReferencing()
{
Name = "AAA"
}
}
},
new SelfReferencing()
{
Name = "AB",
Children = new List<SelfReferencing>()
{
new SelfReferencing()
{
Name = "ABA"
}
}
}
}
});
context.SaveChanges();
}
using (var context = new Context())
{
context.Configuration.LazyLoadingEnabled = false;
context.Configuration.ProxyCreationEnabled = false;
var data = context.SelfReferencings.ToList();
}
}
}
It uses the code-first approach, but internally it is the same as when using EDMX. When I get the data, I have 5 entities in the list, and all of them have correctly configured Parent and Children navigation properties.
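The effect of that relationship fix-up can be imitated in plain C#, which may make it clearer why loading the flat set and then filtering on the root id yields a full tree. This is a conceptual sketch only, not EF's actual implementation; Node and HierarchySketch are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Node
{
    public Node()
    {
        Children = new List<Node>();
    }

    public int Id { get; set; }
    public int? ParentId { get; set; }
    public List<Node> Children { get; private set; }
}

public static class HierarchySketch
{
    // Step 1: materialize every row (what set.ToList() does).
    // Step 2: attach each row to its parent (what the change tracker's fix-up does).
    // Step 3: keep only the roots; their Children are already populated.
    public static List<Node> ToHierarchy(IEnumerable<Node> flat)
    {
        var all = flat.ToList();
        var byId = all.ToDictionary(n => n.Id);

        foreach (var node in all)
        {
            Node parent;
            if (node.ParentId.HasValue && byId.TryGetValue(node.ParentId.Value, out parent))
            {
                parent.Children.Add(node);
            }
        }

        return all.Where(n => n.ParentId == null).ToList();
    }
}
```

Note that every row is still loaded before the roots are selected, which is the memory cost described above.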

Implementing a flexible searching infrastructure using nHibernate

My aim is to implement a quite generic search mechanism. Here's the general idea:
you can search based on any property of the entity you're searching for (for example- by Employee's salary, or by Department name etc.).
Each property you can search by is represented by a class, which inherits from EntityProperty:
public abstract class EntityProperty<T>
where T:Entity
{
public enum Operator
{
In,
NotIn,
}
/// <summary>
/// Name of the property
/// </summary>
public abstract string Name { get; }
//Add a search term to the given query, using the given values
public abstract IQueryable<T> AddSearchTerm(IQueryable<T> query, IEnumerable<object> values);
public abstract IQueryable<T> AddSortingTerm(IQueryable<T> query);
protected Operator _operator = Operator.In;
protected bool _sortAscending = false;
public EntityProperty(Operator op)
{
_operator = op;
}
//use this c'tor if you're using the property for sorting only
public EntityProperty(bool sortAscending)
{
_sortAscending = sortAscending;
}
}
all of the properties you're searching / sorting by are stored in a simple collection class:
public class SearchParametersCollection<T>
where T: Entity
{
public IDictionary<EntityProperty<T>,IEnumerable<object>> SearchProperties { get; private set; }
public IList<EntityProperty<T>> SortProperties { get; private set; }
public SearchParametersCollection()
{
SearchProperties = new Dictionary<EntityProperty<T>, IEnumerable<object>>();
SortProperties = new List<EntityProperty<T>>();
}
public void AddSearchProperty(EntityProperty<T> property, IEnumerable<object> values)
{
SearchProperties.Add(property, values);
}
public void AddSortProperty(EntityProperty<T> property)
{
if (SortProperties.Contains(property))
{
throw new ArgumentException(string.Format("property {0} already exists in sorting order", property.Name));
}
SortProperties.Add(property);
}
}
now, all the repository class has to do is:
protected IEnumerable<T> Search<T>(SearchParametersCollection<T> parameters)
where T : Entity
{
IQueryable<T> query = this.Session.Linq<T>();
foreach (var searchParam in parameters.SearchProperties)
{
query = searchParam.Key.AddSearchTerm(query, searchParam.Value);
}
//add order
foreach (var sortParam in parameters.SortProperties)
{
query = sortParam.AddSortingTerm(query);
}
return query.AsEnumerable();
}
for example, here's a class which implements searching a user by their full name:
public class UserFullName : EntityProperty<User>
{
public override string Name
{
get { return "Full Name"; }
}
public override IQueryable<User> AddSearchTerm(IQueryable<User> query, IEnumerable<object> values)
{
switch (_operator)
{
case Operator.In:
//btw- this doesn't work with nHibernate... :(
return query.Where(u => (values.Cast<string>().Count(v => u.FullName.Contains(v)) > 0));
case Operator.NotIn:
return query.Where(u => (values.Cast<string>().Count(v => u.FullName.Contains(v)) == 0));
default:
throw new InvalidOperationException("Unrecognized operator " + _operator.ToString());
}
}
public override IQueryable<User> AddSortingTerm(IQueryable<User> query)
{
return (_sortAscending) ? query.OrderBy(u => u.FullName) : query.OrderByDescending(u => u.FullName);
}
public UserFullName(bool sortAscending)
: base(sortAscending)
{
}
public UserFullName(Operator op)
: base(op)
{
}
}
my questions are:
1. firstly- am I even on the right track? I don't know of any well-known method for achieving what I want, but I may be wrong...
2. it seems to me that the Properties classes should be in the domain layer and not in the DAL, since I'd like the controller layers to be able to use them. However, that prevents me from using any nHibernate-specific implementation of the search (i.e any other interface but Linq). Can anybody think of a solution that would enable me to utilize the full power of nH while keeping these classes visible to upper layers? I've thought about moving them to the 'Common' project, but 'Common' has no knowledge of the Model entities, and I'd like to keep it that way.
3. As you can see from my comment in the AddSearchTerm method, I haven't really been able to implement the 'in' operator using nH (I'm using nH 2.1.2 with the Linq provider). Any suggestions in that respect would be appreciated. (See also my question from yesterday.)
thanks!

If you need a good API to query NHibernate objects, then you should use ICriteria (for NH 2.x) or QueryOver (for NH 3.x).
You are overcomplicating your DAL with these searches. Ayende has a nice post about why you should not do it.

I ended up using query objects, which greatly simplified things.
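For readers wondering what a query object might look like here, the following is a minimal sketch, not the code I actually used. UsersByNameFragments is a hypothetical name; it builds the "full name contains any of these values" filter as a single expression tree of OR-ed Contains calls. It is exercised only against LINQ to Objects, so whether a given NHibernate Linq provider version translates it is an assumption that would need testing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class User
{
    public string FullName { get; set; }
}

// Hypothetical query object: one class encapsulates one search, including its sort order.
public class UsersByNameFragments
{
    private readonly IList<string> _fragments;

    public bool SortAscending { get; set; }

    public UsersByNameFragments(IEnumerable<string> fragments)
    {
        _fragments = fragments.ToList();
        SortAscending = true;
    }

    public IQueryable<User> Apply(IQueryable<User> source)
    {
        var query = source;
        if (_fragments.Count > 0)
        {
            query = query.Where(BuildAnyContains());
        }
        return SortAscending
            ? query.OrderBy(u => u.FullName)
            : query.OrderByDescending(u => u.FullName);
    }

    // Builds: u => u.FullName.Contains(f1) || u.FullName.Contains(f2) || ...
    private Expression<Func<User, bool>> BuildAnyContains()
    {
        var u = Expression.Parameter(typeof(User), "u");
        var fullName = Expression.Property(u, "FullName");
        var contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });

        Expression body = null;
        foreach (var fragment in _fragments)
        {
            var call = Expression.Call(fullName, contains, Expression.Constant(fragment, typeof(string)));
            body = body == null ? (Expression)call : Expression.OrElse(body, call);
        }
        return Expression.Lambda<Func<User, bool>>(body, u);
    }
}
```

The repository then only needs to call Apply on its IQueryable<User>, and the upper layers depend on the query object rather than on a collection of property classes.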
