RavenDB: How to index dictionary keys on a multi-map index?

I've got the below RavenDB MultiMap index that works and returns results. Now when I want to use the query and try to filter data I get the following message:
The field 'Stock_Key' is not indexed, cannot query/sort on fields that are not indexed.
I am trying to get all the products that have stock at certain warehouses. _request.Warehouses is a list of warehouse IDs that can be provided to the query. The product stock is saved in a separate collection in the database whose documents hold the same SKU.
var query = await _session
.Query<Products_SearchUniqueBySku.Result, Products_SearchUniqueBySku>()
.Where(x => x.Stock.Any(y => y.Key.In(_request.Warehouses) && y.Value > 0))
.ToListAsync();
I've been trying to get the keys indexed all day without success and would appreciate some help. I've also tried a few RQL variants in the RavenDB Studio, but they produce the same message. I'm not sure the RQL queries are written correctly, though.
from index 'Products/SearchUniqueBySku'
where Stock.589e90c9-09bb-4a04-94fb-cf92bde88f97 > 0
The field 'Stock.589e90c9-09bb-4a04-94fb-cf92bde88f97' is not indexed, cannot query/sort on fields that are not indexed
from index 'Products/SearchUniqueBySku'
where Stock_589e90c9-09bb-4a04-94fb-cf92bde88f97 > 0
The field 'Stock_589e90c9-09bb-4a04-94fb-cf92bde88f97' is not indexed, cannot query/sort on fields that are not indexed
The index that I've used:
using System;
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents.Indexes;
using MyProject.Models.Ordering.Entities;
using MyProject.Models.Ordering.Enums;
namespace MyProject.Ordering.Indexes;
public class Products_SearchUniqueBySku : AbstractMultiMapIndexCreationTask<Products_SearchUniqueBySku.Result>
{
public class Result
{
public string Sku { get; set; }
public ProductTypes Type { get; set; }
public string Name { get; set; }
public IDictionary<string, decimal> Stock { get; set; }
}
public Products_SearchUniqueBySku()
{
AddMap<Product>(
products => from product in products
where product.Type == ProductTypes.Simple
where product.Variants.Count == 0
select new
{
product.Sku,
product.Type,
product.Name,
Stock = new Dictionary<string, decimal>()
}
);
AddMap<Product>(
products => from product in products
where product.Type == ProductTypes.Simple
where product.Variants.Count > 0
from variant in product.Variants
select new
{
variant.Sku,
Type = ProductTypes.Variant,
Name = $"{product.Name} ({string.Join("/", variant.Mappings.Select(y => y.Value.Name))})",
Stock = new Dictionary<string, decimal>()
});
AddMap<StockItem>(
items => from item in items
group item by item.Sku
into grouped
select new
{
Sku = grouped.Key,
Type = ProductTypes.Variant,
Name = (string) null,
Stock = grouped.ToDictionary(x => x.Warehouse.Id, x => x.Stock)
}
);
Reduce = results => from result in results
group result by result.Sku
into grouped
let product = grouped.Single(x => x.Stock.Count == 0)
select new
{
Sku = grouped.Key,
product.Type,
product.Name,
Stock = grouped.SelectMany(x => x.Stock).ToDictionary(x => x.Key, x => x.Value),
};
}
}
The results when running it in the RavenDB Studio (only showing some, you get the idea):
from index 'Products/SearchUniqueBySku'
{
"Sku": "VANS-SH-38",
"Type": "Variant",
"Name": "Vans Men's Suede (Maat 38)",
"Stock": {
"589e90c9-09bb-4a04-94fb-cf92bde88f97": 10,
"98304a84-0f44-49ce-8438-8a959ca29b9d": 11
},
"@metadata": {
"@change-vector": null,
"@index-score": 1
}
},
{
"Sku": "889376",
"Type": "Simple",
"Name": "Apple Magic Trackpad (2021)",
"Stock": {
"589e90c9-09bb-4a04-94fb-cf92bde88f97": 15
},
"@metadata": {
"@change-vector": null,
"@index-score": 1
}
}
The models (most properties omitted for brevity):
public class StockItem
{
public EntityReference Warehouse { get; set; }
public string Sku { get; set; }
public decimal Stock { get; set; }
}
public class EntityReference
{
public string Id { get; set; }
public string Name { get; set; }
}
public class Product
{
public string Id { get; set; }
public string Name { get; set; }
public ProductTypes Type { get; set; }
public List<ProductVariant> Variants { get; set; }
}
public class ProductVariant
{
public string Sku { get; set; }
}
EDIT:
Building on the answer from @Ayende Rahien, I had to change the Reduce to the following:
Reduce = results => from result in results
group result by result.Sku
into grouped
let product = grouped.Single(x => x.Stock.Count == 0)
let stock = grouped.SelectMany(x => x.Stock).ToDictionary(x => x.Key, x => x.Value)
select new
{
product.Id,
....
Stock = stock,
_ = stock.Select(x => CreateField("Stock_" + x.Key, x.Value)) // <-- added
};
See Dynamic Fields for indexes (docs). I then got the following message:
Map and Reduce functions of a index must return identical types.
I solved this by adding _ = (string) null as a property to each of the AddMap functions (not sure it is the perfect solution, but it worked), and then the following query worked:
from index 'Products/SearchUniqueBySku'
where Stock_589e90c9-09bb-4a04-94fb-cf92bde88f97 > 0

You need to do this in the Reduce of the index:
_ = grouped.SelectMany(x => x.Stock).Select(x => CreateField("Stock_" + x.Key, x.Value))
The _ property marks the output as containing dynamic fields.
The emitted fields are named in the format Stock_$key.
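To make the naming convention concrete, here is a minimal in-memory sketch of what the flattening amounts to. It is plain C# and does not call RavenDB; StockFields, FieldName, Flatten and HasStockIn are hypothetical names used only for illustration, and the only thing taken from the answer is the "Stock_" + key format.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of RavenDB: models the Stock_$key naming only.
public static class StockFields
{
    // Same format as CreateField("Stock_" + key, value) in the Reduce.
    public static string FieldName(string warehouseId) => "Stock_" + warehouseId;

    // Turns a Stock dictionary into one named field per warehouse.
    public static Dictionary<string, decimal> Flatten(IDictionary<string, decimal> stock) =>
        stock.ToDictionary(kv => FieldName(kv.Key), kv => kv.Value);

    // True when at least one requested warehouse has a positive quantity.
    public static bool HasStockIn(IDictionary<string, decimal> fields, IEnumerable<string> warehouseIds) =>
        warehouseIds.Any(id => fields.TryGetValue(FieldName(id), out var qty) && qty > 0);
}
```

Filtering on several warehouses then becomes an OR over the corresponding Stock_$key fields, one per warehouse ID in the request.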

Related

Can I use an index as the source of an index in RavenDB

I'm trying to define an index in RavenDB that uses the output of another index as its input, but I can't get it to work.
I have the following entities & indexes defined.
SquadIndex produces the result I expect it to do but SquadSizeIndex doesn't even seem to execute.
Have I done something wrong or is this not supported?
class Country
{
public string Id { get; private set; }
public string Name { get; set; }
}
class Player
{
public string Id { get; private set; }
public string Name { get; set; }
public string CountryId { get; set; }
}
class Reference
{
public string Id { get; set; }
public string Name { get; set; }
}
class SquadIndex : AbstractIndexCreationTask<Player, SquadIndex.Result>
{
public SquadIndex()
{
Map = players => from player in players
let country = LoadDocument<Country>(player.CountryId)
select new Result
{
Country = new Reference
{
Id = country.Id,
Name = country.Name
},
Players = new[]
{
new Reference
{
Id = player.Id,
Name = player.Name
}
}
};
Reduce = results => from result in results
group result by result.Country
into g
select new Result
{
Country = g.Key,
Players = g.SelectMany(x => x.Players)
};
}
internal class Result
{
public Reference Country { get; set; }
public IEnumerable<Reference> Players { get; set; }
}
}
class SquadSizeIndex : AbstractIndexCreationTask<SquadIndex.Result, SquadSizeIndex.Result>
{
public SquadSizeIndex()
{
Map = squads => from squad in squads
select new Result
{
Country = squad.Country,
PlayerCount = squad.Players.Count()
};
Reduce = results => from result in results
group result by result.Country
into g
select new Result
{
Country = g.Key,
PlayerCount = g.Sum(x => x.PlayerCount)
};
}
internal class Result
{
public Reference Country { get; set; }
public int PlayerCount { get; set; }
}
}
No, you can't. The output of an index is not a set of documents that can be indexed.
You can use the scripted index results to chain indexes, but that isn't trivial.
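For this particular case, one way to avoid chaining altogether may be to compute the squad size directly from the Player documents in a single map/reduce. The sketch below shows that shape in plain LINQ over in-memory objects; it is not a RavenDB index definition, and SquadSize and CountByCountry are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory illustration only; SquadSize is a hypothetical name.
public static class SquadSize
{
    public sealed class Player
    {
        public string Id { get; set; }
        public string Name { get; set; }
        public string CountryId { get; set; }
    }

    public static Dictionary<string, int> CountByCountry(IEnumerable<Player> players) =>
        players
            // "Map": one entry per player, carrying a count of 1.
            .Select(p => new { p.CountryId, PlayerCount = 1 })
            // "Reduce": group by country and sum the counts.
            .GroupBy(x => x.CountryId)
            .ToDictionary(g => g.Key, g => g.Sum(x => x.PlayerCount));
}
```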

RavenDb : Search occurrences in text is slow

I would like to find the occurrences of a word in a text.
I have a class like this
public class Page
{
public string Id { get; set; }
public string BookId { get; set; }
public string Content { get; set; }
public int PageNumber { get; set; }
}
I have my index like this :
class Pages_SearchOccurrence : AbstractIndexCreationTask<Page, Pages_SearchOccurrence.ReduceResult>
{
public class ReduceResult
{
public string PageId { get; set; }
public int Count { get; set; }
public string Word { get; set; }
public string Content { get; set; }
}
public Pages_SearchOccurrence()
{
Map = pages => from page in pages
let words = page.Content
.ToLower()
.Split(new string[] { " ", "\n", ",", ";" }, StringSplitOptions.RemoveEmptyEntries)
from w in words
select new
{
page.Content,
PageId = page.Id,
Count = 1,
Word = w
};
Reduce = results => from result in results
group result by new { PageId = result.PageId, result.Word } into g
select new
{
Content = g.First().Content,
PageId = g.Key.PageId,
Word = g.Key.Word,
Count = g.ToList().Count()
};
Index(x => x.Content, Raven.Abstractions.Indexing.FieldIndexing.Analyzed);
}
}
Finally, my query is like this :
using (var session = documentStore.OpenSession())
{
RavenQueryStatistics stats;
var occurence = session.Query<Pages_SearchOccurrence.ReduceResult, Pages_SearchOccurrence>()
.Statistics(out stats)
.Where(x => x.Word == "works")
.ToList();
}
But RavenDB seems very slow (or my query is not good).
stats.IsStale is true, and the RavenDB Studio takes a long time and returns only a few results.
I have 1000 "Page" documents with about 1000 words of content per page.
Why is my query not okay, and how can I find the occurrences in a page?
Thank you for your help!
You are doing it wrong. You should set the Content field as Analyzed and use RavenDB's Search() operator. The slowness is most likely because of the amount of un-optimized work your index code is doing.
I have found a partial solution.
Perhaps I wasn't clear: my goal is to find the occurrences of a word in a page.
I want the hit count of a word in each page, and I would like to order by this count.
I changed my index like this:
class Pages_SearchOccurrence : AbstractIndexCreationTask<Page, Pages_SearchOccurrence.ReduceResult>
{
public class ReduceResult
{
public string Content { get; set; }
public string PageId { get; set; }
public string Count { get; set; }
public string Word { get; set; }
}
public Pages_SearchOccurrence()
{
Map = pages => from page in pages
let words = page.Content.ToLower().Split(new string[] { " ", "\n", ",", ";" }, StringSplitOptions.RemoveEmptyEntries)
from w in words
select new
{
page.Content,
PageId = page.Id,
Count = 1,
Word = w
};
Index(x => x.Content, Raven.Abstractions.Indexing.FieldIndexing.Analyzed);
Index(x => x.PageId, Raven.Abstractions.Indexing.FieldIndexing.NotAnalyzed);
}
}
Finally, my new query looks like this :
using (var session = documentStore.OpenSession())
{
var query = session.Query<Pages_SearchOccurrence.ReduceResult, Pages_SearchOccurrence>()
.Search((x) => x.Word, "works")
.AggregateBy(x => x.PageId)
.CountOn(x => x.Count)
.ToList()
.Results
.FirstOrDefault();
var listFacetValues = query.Value.Values;
var finalResult = listFacetValues.GroupBy(x => x.Hits).OrderByDescending(x => x.Key).Take(5).ToList();
}
finalResult gives me groups of FacetValue objects, each of which has a Hits property
(the Hits and Count properties of my FacetValue are the same here).
The Hits property gives me the result that I want, but this code does not look correct to me, and the RavenDB Studio doesn't like it either.
Do you have a better solution?
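The goal described above (hit count of one word per page, ordered by that count) can be sketched in plain LINQ to pin down the expected output. This is an in-memory illustration rather than a RavenDB index or query; WordOccurrences and PagesByHits are hypothetical names, and the separators are the ones used in the index above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory illustration only; WordOccurrences is a hypothetical name.
public static class WordOccurrences
{
    // Same separators as the Split call in the index.
    private static readonly string[] Separators = { " ", "\n", ",", ";" };

    // Returns (pageId, hits) pairs for pages containing the word, most hits first.
    public static List<KeyValuePair<string, int>> PagesByHits(
        IDictionary<string, string> contentByPageId, string word)
    {
        var needle = word.ToLowerInvariant();
        return contentByPageId
            .Select(p => new KeyValuePair<string, int>(
                p.Key,
                p.Value.ToLowerInvariant()
                    .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
                    .Count(w => w == needle)))
            .Where(kv => kv.Value > 0)
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal)
            .ToList();
    }
}
```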

Indexing list count within last month

I would like to be able to query the first 10 documents from a collection in RavenDB ordered by the count with a constraint in a sublist. This is my entity:
public class Post
{
public string Title { get; set; }
public List<Like> Likes { get; set; }
}
public class Like
{
public DateTime Created { get; set; }
}
I've tried with the following query:
var oneMonthAgo = DateTime.Today.AddMonths(-1);
session
.Query<Post>()
.OrderByDescending(x => x.Likes.Count(y => y.Created > oneMonthAgo))
.Take(10);
Raven complains that the count should be done at index time rather than query time. I've tried moving the count to an index using the following code:
public class PostsIndex : AbstractIndexCreationTask<Post>
{
public PostsIndex()
{
var month = DateTime.Today.AddMonths(-1);
Map = posts => from doc in posts
select
new
{
doc.Title,
LikeCount = doc.Likes.Count(x => x.Created > month),
};
}
}
When adding this index, Raven throws an error 500.
What should I do?
You can do this by creating a Map/Reduce index to flatten the Posts/Likes and then query over that.
The index:
public class PostLikesPerDay : AbstractIndexCreationTask<Post, PostLikesPerDay.Result>
{
public PostLikesPerDay()
{
Map = posts => from post in posts
from like in post.Likes
select new Result
{
Title = post.Title,
Date = like.Created,
Likes = 1
};
Reduce = results => from result in results
group result by new
{
result.Title,
result.Date.Date
}
into grp
select new Result
{
Title = grp.Key.Title,
Date = grp.Key.Date,
Likes = grp.Sum(l => l.Likes)
};
}
public class Result
{
public string Title { get; set; }
public DateTime Date { get; set; }
public int Likes { get; set; }
}
}
And the query:
using (var session = store.OpenSession())
{
var oneMonthAgo = DateTime.Today.AddMonths(-1);
var query = session.Query<PostLikesPerDay.Result, PostLikesPerDay>()
.Where(y => y.Date > oneMonthAgo)
.OrderByDescending(p => p.Likes)
.Take(10);
foreach (var post in query)
{
Console.WriteLine("'{0}' has {1} likes on {2:d}", post.Title, post.Likes, post.Date);
}
}
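Note that each result of this index is one post on one day, so the query above ranks post-days rather than posts. If the goal is the top posts by total likes over the whole window, the per-day rows still need to be summed per title. A minimal in-memory sketch of that last step follows; LikeTotals, DailyLikes and TopPosts are hypothetical names, and DailyLikes simply mirrors the shape of PostLikesPerDay.Result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory illustration only; LikeTotals is a hypothetical name.
public static class LikeTotals
{
    // Same shape as PostLikesPerDay.Result.
    public sealed class DailyLikes
    {
        public string Title { get; set; }
        public DateTime Date { get; set; }
        public int Likes { get; set; }
    }

    // Sums the per-day rows newer than 'since' per title and returns the top 'take' posts.
    public static List<KeyValuePair<string, int>> TopPosts(
        IEnumerable<DailyLikes> rows, DateTime since, int take) =>
        rows.Where(r => r.Date > since)
            .GroupBy(r => r.Title)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Sum(r => r.Likes)))
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal)
            .Take(take)
            .ToList();
}
```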

How to retrieve entities from a join using QueryOver in NHibernate

I'm trying to return multiple entities from a QueryOver query. I'm doing this code in a plain text editor so there may be syntactic errors, but it should get the idea across.
public class Product
{
public virtual int ID { get; set; }
public virtual string ProductName { get; set; }
public virtual List<Category> Categories { get; set; }
public virtual List<Inventory> Inventories { get; set; }
...
}
public class Category
{
public virtual int ID { get; set; }
public virtual string CategoryName { get; set; }
public virtual string Style { get; set; }
public virtual Product Product { get; set; }
...
}
public class Inventory
{
public virtual int ID { get; set; }
public virtual List<Discount> Discounts { get; set; }
public virtual Product Product { get; set; }
public virtual bool InStock { get; set; }
...
}
public class Discount
{
public virtual int ID { get; set; }
public virtual Inventory Inventory { get; set; }
public virtual decimal DiscountAmount { get; set; }
...
}
Now my goal is to take a product ID and a couple other options to pull back a Category, Inventory, and DiscountAmount in a single query. I've gotten this to work using HQL with this query:
var query = session.CreateQuery("select category, inventory, discount.DiscountAmount"
+ " from Product product"
+ " join product.Categories category"
+ " join product.Inventories inventory"
+ " left join inventory.Discounts discount"
+ " where product.ID = :productID"
+ " and category.Style = :style"
+ " and inventory.InStock = 1");
With this query I get a list of object arrays that each have a Category entity, an Inventory entity, and a DiscountAmount decimal. My goal is to use a QueryOver query to do this same query with no magic strings, but I can't get it to work. Here's what I've tried so far:
Product productAlias = null;
Category categoryAlias = null;
...
var query = session.QueryOver<Product>(() => productAlias)
.Where(() => productAlias.ID == productID)
.JoinAlias(() => productAlias.Categories, () => categoryAlias)
...
.Select(Projections.Property(() => categoryAlias.ID),
Projections.Property(() => discountAlias.Inventory),
Projections.Property(() => discountAlias.DiscountAmount));
This query only pulls back the ID for Category, and while it does pull the full Inventory entity back it uses a full additional database query to grab it.
...
.Select(Projections.Property(() => categoryAlias),
Projections.Property(() => inventoryAlias),
Projections.Property(() => discountAlias.DiscountAmount));
This query throws a runtime exception of "Could not resolve property: categoryAlias of : Product".
...
.Select(Projections.Property(() => categoryAlias.ID).WithAlias(() => ReturnClass.Category),
Projections.Property(() => inventoryAlias.ID).WithAlias(() => ReturnClass.Inventory),
Projections.Property(() => discountAlias.DiscountAmount).WithAlias(() => ReturnClass.DiscountAmount))
.TransformUsing(Transformers.AliasToBean<ReturnClass>());
This query throws a runtime exception of "Object of type Int32 cannot be converted to type Category".
...
.Select(Projections.Property(() => categoryAlias.ID))
.TransformUsing(Transformers.AliasToBean<Category>())
This query returns a default Category entity.
So is there any way to mimic the HQL query using the QueryOver API, or is my only option to choose between HQL or making multiple queries?
Edit: To be more clear, I really want to avoid magic strings as much as possible, so I'd really prefer strongly typed QueryOver queries. Currently I'm using a QueryOver query that returns the IDs for the Category and Inventory entities and then querying for them separately, but since I have to hit those tables in the first query anyway I'd rather return them all at once.
Edit 2: The exact SQL I'm trying to achieve is
Select Category.ID, Category.CategoryName, Category.Style, (other Category columns),
Inventory.ID, Inventory.InStock, (other Inventory columns),
Discount.DiscountAmount
From Products as Product
Inner join Categories as Category ...
Where Product.ID = @productID
And Category.Style = @style
And ...
I think you should use Fetch instead of JoinAlias:
.Fetch(product => product.Categories).Eager
and don't use Select; call .List<Product>() instead, then use LINQ to get the Categories and Inventories from each Product.
So here is an example of "eager" loading all of the subcategories for my categories.
No N+1 when iterating over the categories collection. Future is the key here.
Category catalias = null;
var subCategories = _session.QueryOver<Category>()
.JoinQueryOver(x => x.SubCategories, () => catalias, JoinType.LeftOuterJoin)
.Future<Category>();
var categories = _session.QueryOver<Category>().Where(x => x.ParentCategoryId == null).Future<Category>();

NHibernate Future<T> analyse

I have a code to query/paginate a ProductPrice list... My ProductPrice Object has a Product...
The code works fine...
But looking at log4net I have 2 SELECT happening...
Is that right?
My code :
var query = Session.QueryOver<ProductPrice>();
Product product = null;
query.JoinQueryOver(mg => mg.Product, () => product);
query.WhereRestrictionOn(() => product.Name).IsLike("Asics", MatchMode.Anywhere)
.OrderBy(() => product.Name);
var rowCountQuery = query.ToRowCountQuery();
totalCount = rowCountQuery.FutureValue<int>().Value;
var firstResult = pageIndex * pageSize;
ProductViewModel productViewModel = null;
var productsViewModel = query
.SelectList(l => l
.Select(() => product.Id).WithAlias(() => productViewModel.Id)
.Select(() => product.Name).WithAlias(() => productViewModel.Name)
.Select(mg => mg.Price).WithAlias(() => productViewModel.Price))
.TransformUsing(Transformers.AliasToBean<ProductViewModel>())
.Skip(firstResult)
.Take(pageSize)
.Future<ProductViewModel>();
edited
ProductPrice:
public class ProductPrice : Entity
{
public virtual string Sku { get; set; }
public virtual decimal Price { get; set; }
public virtual Product Product { get; set; }
...
}
Product:
public class Product : Entity
{
public virtual string Name { get; set; }
public virtual IList<ProductPrice> Prices { get; set; }
...
}
The mapping is generated by Fluent NHibernate...
Thanks
You're doing the ".Value" too soon to get the row count. You should keep it like:
var rowCountQuery = query.ToRowCountQuery();
var rowCount = rowCountQuery.FutureValue<int>();
This way the query is not really executed, just deferred.
After the main query, which seems ok, you may now really fetch the row count integer, and both queries should be sent at the same time to the database:
totalCount = rowCount.Value;
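The effect of reading .Value too early can be illustrated with a toy model of deferred queries. This is not NHibernate code: ToyBatch, ToyFuture, Defer and RoundTrips are hypothetical names, and the model only assumes that reading a deferred value sends everything queued so far in one round trip.

```csharp
using System;
using System.Collections.Generic;

// Toy model only; ToyBatch and ToyFuture are hypothetical, not NHibernate types.
public sealed class ToyBatch
{
    private readonly List<Action> _pending = new List<Action>();

    // Number of simulated trips to the database.
    public int RoundTrips { get; private set; }

    // Queues a query without running it.
    public ToyFuture<T> Defer<T>(Func<T> query)
    {
        var future = new ToyFuture<T>(this);
        _pending.Add(() => future.SetResult(query()));
        return future;
    }

    // Runs everything queued so far as a single round trip.
    internal void Flush()
    {
        if (_pending.Count == 0) return;
        RoundTrips++;
        foreach (var run in _pending) run();
        _pending.Clear();
    }
}

public sealed class ToyFuture<T>
{
    private readonly ToyBatch _batch;
    private bool _hasResult;
    private T _result;

    internal ToyFuture(ToyBatch batch) { _batch = batch; }

    internal void SetResult(T value) { _result = value; _hasResult = true; }

    // First access flushes the whole batch; later accesses reuse the result.
    public T Value
    {
        get
        {
            if (!_hasResult) _batch.Flush();
            return _result;
        }
    }
}
```

In this model, reading the count's Value before deferring the page query yields two round trips, while deferring both first and reading afterwards yields one.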