CosmosDB heterogeneous document collection - composite indexing

I'm using a single collection for all my documents and then instantiating them into POCOs using a "type" property. Things have been going great so far.
Now I need to add the ability to sort on multiple properties.
That doesn't work, and the error says I need a composite index. Fine, I understand.
But how would I create an indexing policy when it wants paths that won't exist in some document types, or may exist in more than one document type?
Do I really have to create a collection for each document type for this to work?
TIA

Cosmos DB will simply ignore documents that don't contain the indexed paths. Note that while composite indexes require you to explicitly specify the paths to include along with their sort order, for the regular index it's generally preferable to include all paths (i.e. "/*") and then specify the paths to exclude. That way you don't need to keep updating your indexing policy when you add new entity types to your collection.
Also note that a composite index can contain at most 8 paths. Currently queries will only use one path at a time, but this will change very soon to use multiple paths at the same time, which will significantly improve performance for queries that use them.
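For illustration, here is a minimal sketch of such a policy using the Python SDK (azure-cosmos); the /type partition key and the /name and /createdAt sort paths are placeholders for this example, not taken from your setup:

```python
from azure.cosmos import CosmosClient, PartitionKey

# Hypothetical account, database and container names.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
database = client.create_database_if_not_exists("app-db")

# Include everything for the regular index and add composite indexes only for
# the property combinations you actually ORDER BY. Documents that lack /name
# or /createdAt are simply skipped by the composite index; they cause no errors.
indexing_policy = {
    "indexingMode": "consistent",
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": "/\"_etag\"/?"}],
    "compositeIndexes": [
        [
            {"path": "/name", "order": "ascending"},
            {"path": "/createdAt", "order": "descending"},
        ]
    ],
}

container = database.create_container_if_not_exists(
    id="documents",
    partition_key=PartitionKey(path="/type"),
    indexing_policy=indexing_policy,
)
```

A query such as `SELECT * FROM c WHERE c.type = 'invoice' ORDER BY c.name ASC, c.createdAt DESC` can then be served by that composite index; documents of other types that lack those properties simply don't show up in the results.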

Related

Azure Search facets and filter

I'm using Azure Search on my e-commerce site, and now I want to implement filtering, but I can't decide on the right way to do it. The problem is that I have very different product types, so they have various attributes (properties), and I don't want to create an index with 50 attribute fields in order to facet on them all. Alternatively, I could define a few generic properties (like Attribute1, Attribute2, ...) and then determine their 'key' names according to the faceted values, but that doesn't sound great either. Is there a common or proven way to build filters on e-commerce sites?
Azure Search will do well if you have 50 fields that have sparse values. Assuming that sparseness comes with relatively low per-facet cardinality, it shouldn't have a bad performance impact, and by explicitly modeling attributes as fields you get to keep the nice, clean programming model that comes from having explicit fields.
The attribute-mapping approach would work, assuming all facets are of the same data type or can be grouped easily.
Another thing you can do for string facets is to have a prefix in the facet value (e.g. Cameras/Canon, Memory/MiniSD, etc.). That gives you counts per value within the parent scope. If you also have the parent scope (e.g. Camera, Memory) in a separate field you can filter by the whole scope if needed. This generalizes well into hierarchical facets (as long as they are all strings). Here is a blog post that explores this technique.
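As a rough sketch of querying with that prefix technique via the Python SDK (azure-search-documents) -- the index name and the categoryPath/categoryScope field names are assumptions made up for the example:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Hypothetical service endpoint, index and field names.
client = SearchClient(
    endpoint="https://<service>.search.windows.net",
    index_name="products",
    credential=AzureKeyCredential("<query-key>"),
)

# Facet on the prefixed value (e.g. "Cameras/Canon") while filtering to a
# parent scope stored in its own filterable field (e.g. "Cameras").
results = client.search(
    search_text="*",
    filter="categoryScope eq 'Cameras'",
    facets=["categoryPath,count:50"],
)

facets = results.get_facets() or {}
for facet in facets.get("categoryPath", []):
    # Each facet entry carries the value and the number of matching documents.
    print(facet["value"], facet["count"])
```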

Request for guidance on 'custom defined' result sets

I would like some guidance/thoughts on how to build functionality that lets users customize their datasets. I have added an image showing this functionality, though it is called 'queues' there.
A view is a segmentation of a result set where the conditions are defined either by the system (default views) or by the user.
I can create predefined indexes/projections for the default views, which are under my control, but I am stuck on the approach for letting a user create custom views.
I could create one big index with all properties and only query the fields on that index that appear in the conditions defined by the user. In that scenario the index is just one big blob of information; it is probably the easiest way, but it feels ugly.
I could dynamically create a new index based on the entered conditions, though I have never explored runtime-defined indexes before.
I could dynamically create a query with conditions; however, I would have to deal with stale results because I would be letting RavenDB define the index, and I would like to avoid index creation by RavenDB if possible.
Some guidance would be highly appreciated; how and with what parts of RavenDB can I efficiently accomplish this? I am not in search of a complete solution, since this is a personal project experimenting with RavenDB.
This question might be too broad/generic but here's my two cents.
Yes, I agree that one massive index would not be optimal. In many cases you can get creative by breaking down an index into smaller indexes.
I don't suggest creating run-time indexes based on how a user is using the application; that's what dynamic indexes are for. RavenDB will create an index and manage its importance, so if you have dynamic indexes that don't get used anymore, RavenDB will abandon them. If you're worried about staleness, you can wait for non-stale results.
I'm not clear on your use case, but maybe you could design your app in such a way that you save all the views (custom or default) as Raven documents. For example, given the picture you attached, "unassigned issues" and "due this week" would be two separate documents. This could allow you to keep a small number of static indexes.
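To make the "views as documents" idea a bit more concrete, here is a small sketch (plain Python, deliberately independent of any particular RavenDB client API; all names are made up):

```python
from dataclasses import dataclass, field
from typing import List, Optional

# A stored view is just another document: a name plus a list of conditions.
# Default views ("Unassigned issues", "Due this week") and user-defined views
# share the same shape, so a small set of static indexes over the fields that
# views are allowed to filter on can serve both.

@dataclass
class Condition:
    field_name: str   # e.g. "AssignedTo" or "DueDate"
    operator: str     # e.g. "equals", "before", "after"
    value: str

@dataclass
class ViewDefinition:
    id: str                     # e.g. "views/unassigned-issues"
    name: str                   # e.g. "Unassigned issues"
    owner: Optional[str]        # None for system-defined default views
    conditions: List[Condition] = field(default_factory=list)

unassigned = ViewDefinition(
    id="views/unassigned-issues",
    name="Unassigned issues",
    owner=None,
    conditions=[Condition("AssignedTo", "equals", "")],
)

# At query time, the stored conditions are translated into parameters for a
# query against a static index, rather than creating a new index per view.
```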

Managing the neo4j index's life cycle (CRUD)

I have limited (and disjointed) experience with databases, and nearly none with indexes. Based on web searches, reading books, and working with ORMs, my understanding can be summed up as follows:
An index in a database is similar to a book index in that it lists "stuff" that's in the book and tells you where to find it. This helps with lookup efficiency (and is most probably not the only benefit).
In (at least some) RDBMSs, primary key fields get automatically indexed, so you never have to directly manipulate them.
I'm tinkering with Neo4j and it seems you have to be deliberate about indexes, so now I need to understand them, but I cannot find clear answers to:
How are indexes managed in neo4j?
I know there's automatic indexing, how does it work?
If you choose to manually manage your own indexes, what can you control about them? Perhaps the index name, etc.?
I would appreciate answers or pointers to answers, thanks.
Neo4j uses Apache Lucene under the covers if you want index-engine-like capabilities for your data. You can index nodes and/or relationships; the index helps you look up a particular instance or set of nodes or relationships.
Manual Indexing:
You can create as many node/relationship indexes as you want, and you can specify a name for each index. The configuration can also be controlled, i.e. whether you want exact matching (the default) or Lucene's full-text indexing support. Once you have the index, you simply add nodes/relationships to it along with the key/value you want indexed. You do, however, need to take care of updating data in the index yourself if you make changes to the node properties.
Auto-Indexing:
Here you get one index for nodes and one index for relationships if you turn them on in the neo4j.properties file. You can specify which properties are to be indexed, and from the point of turning them on, the index is automatically managed for you, i.e. any nodes created after this point are added to the index and updated/removed automatically.
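For illustration, here is a small sketch of both flavours against the legacy (pre-2.0) indexing API, assuming the old py2neo 1.x Python client; treat the exact calls and names as an approximation rather than gospel:

```python
from py2neo import neo4j

graph_db = neo4j.GraphDatabaseService("http://localhost:7474/db/data/")

# --- Manual indexing: create a named node index (Lucene, exact matching by
# default), add a node under a key/value, then look it up again.
people = graph_db.get_or_create_index(neo4j.Node, "people")
alice, = graph_db.create({"name": "Alice"})
people.add("name", "Alice", alice)
hits = people.get("name", "Alice")   # -> [alice]

# If alice's "name" property changes later, the manual index entry must be
# removed and re-added by your own code; it is not updated for you.

# --- Auto-indexing: enabled via neo4j.properties, e.g.
#   node_auto_indexing=true
#   node_keys_indexable=name,email
#   relationship_auto_indexing=true
#   relationship_keys_indexable=since
# From then on, nodes/relationships created with those properties are kept in
# the "node_auto_index"/"relationship_auto_index" indexes automatically,
# including updates and removals.
```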
More reading:
http://docs.neo4j.org/chunked/stable/indexing.html
The above applies to versions < 2.0
2.0 adds more around the concept of indexing itself; you might want to go through:
http://www.neo4j.org/develop/labels
http://blog.neo4j.org/2013/04/nodes-are-people-too.html
Hope that helps.

Dynamic fields indexing on nested documents with RavenDB

How would you handle dynamic fields indexing on nested documents so that you can query dynamic fields of a deep graph object with RavenDB?
Using the example from the documentation: http://ravendb.net/docs/2.0/client-api/advanced/dynamic-fields
What if the value of a product's attribute is also a product? Think of a CMS with dynamic fields where everything is content, and a root entity content (for the DDD folks) may embed another one, and so on (a deep graph).
This is very important since aggregating child content instead of relating to it (as you would in the relational database world) is one of the core concepts of document databases.
If the data conforms to a pattern, such as with hierarchical data, then you can recurse into that data to index according to the recursion pattern.
You already found how to index dynamic fields. You can combine these techniques to handle almost any pattern that you can describe.
If the data is arbitrarily dynamic (i.e. you have no way of knowing what the object structure is ahead of time), then you are going to have a hard time reaching any particular field because you can't describe how to access it.
You can't be arbitrarily dynamic and be completely indexable at the same time.
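To illustrate the recursion idea only (this is not RavenDB's index syntax; in an actual index definition you would emit the same pairs with CreateField, as in the docs linked above), here is a sketch in Python assuming a content shape with an 'attributes' dict and embedded 'children':

```python
def dynamic_fields(content, prefix=""):
    """Recursively yield (field_name, value) pairs for a nested content
    document whose shape follows a known pattern: each content item has an
    'attributes' dict and an optional list of embedded 'children'."""
    for name, value in content.get("attributes", {}).items():
        yield (f"{prefix}{name}", value)
    for child in content.get("children", []):
        # Prefix child fields with the child's slug so that, e.g.,
        # "intro_title" can be queried on the root document.
        child_prefix = f"{prefix}{child.get('slug', 'child')}_"
        yield from dynamic_fields(child, child_prefix)

doc = {
    "attributes": {"title": "Home page"},
    "children": [
        {"slug": "intro", "attributes": {"title": "Welcome", "color": "blue"}},
    ],
}

print(dict(dynamic_fields(doc)))
# {'title': 'Home page', 'intro_title': 'Welcome', 'intro_color': 'blue'}
```

The key point is that the recursion only works because the shape is predictable; if the nesting were arbitrary, you couldn't describe how to walk it, which is the limitation described above.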

Can I get/set the index generated by checking the 'indexed' checkbox? (Core Data)

I have two entities. Entity A will hold many entity Bs, and order will matter.
If I check the little box that says 'indexed' in Xcode, how do I go about using that index, if I even can? (I know that I CAN use it in some way: http://cocoawithlove.com/2008/03/testing-core-data-with-very-big.html but I am not so spiffy with Obj-C yet.)
I have seen this: Indexed Relationships in Core Data, but it seems broken and too far over my head to fix myself.
"Index" doesn't mean what you think it means. In this context, "indexed" means like the index of a book: it adds a lookup table so the database can find individual records quickly. If you need to be able to sort the records into a specific order, use NSSortDescriptor with the NSFetchRequest. If the existing properties are not what you want to sort on, you'll need to add another property.