Dynamic fields indexing on nested documents with RavenDB

How would you handle dynamic-field indexing on nested documents, so that you can query the dynamic fields of a deep object graph with RavenDB?
Using the example from the documentation: http://ravendb.net/docs/2.0/client-api/advanced/dynamic-fields
What if the value of a product's attribute is itself a product? Think of a CMS with dynamic fields where everything is content, and a root content entity (for the DDD crowd) may embed another one, and so on: a deep graph.
This matters because aggregating child content instead of relating to it (as you would in the relational database world) is one of the core concepts of document databases.

If the data conforms to a pattern, such as with hierarchical data, then you can recurse into that data to index according to the recursion pattern.
You already found how to index dynamic fields. You can combine these techniques to index almost any pattern that you can describe.
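For example, here is a minimal sketch of that combination, using RavenDB's Recurse helper for hierarchical data together with CreateField for the dynamic fields; the Content class and its Attributes/Children shape are hypothetical:

using System.Collections.Generic;
using System.Linq;
using Raven.Client.Indexes;

public class Content
{
    public string Id { get; set; }
    public Dictionary<string, object> Attributes { get; set; } // the dynamic fields
    public List<Content> Children { get; set; }                // embedded child contents
}

public class Contents_ByAttributes : AbstractIndexCreationTask<Content>
{
    public Contents_ByAttributes()
    {
        Map = contents =>
            from content in contents
            // Recurse walks the embedded Children graph to any depth.
            // Depending on your RavenDB version the root itself may not be
            // yielded, in which case project its own Attributes separately too.
            from node in Recurse(content, c => c.Children)
            select new
            {
                // One dynamic Lucene field per attribute name/value pair.
                _ = node.Attributes.Select(a => CreateField(a.Key, a.Value))
            };
    }
}

You could then query a dynamic field by name, e.g. session.Advanced.LuceneQuery<Content>("Contents/ByAttributes").WhereEquals("Color", "red") with the 2.0-era client.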
If the data is arbitrarily dynamic (i.e. you have no way of knowing what the object structure is ahead of time), then you are going to have a hard time reaching any particular field because you can't describe how to access it.
You can't be arbitrarily dynamic and be completely indexable at the same time.

Related

What's the best database for my data structure?

I have two data structures that I need to store in a database. At this point I'm fairly sure that SQL and relational databases in general wouldn't work, but I'm also not sure what alternatives I have or which of them would be best. If there is a reasonable way to implement these structures in MySQL or something similar, I'm open to the idea.
Structure 1:
A nested tree, where the nodes are not defined ahead of time but are instead generated from the data. I have a lot of strings that I need to separate into trees such that each branch node is empty and each leaf node contains at most 200 strings, all beginning with the same prefix. I would use SQL, but considering I will regularly have upwards of 9.45x10^55 nodes (branch and leaf), I can't use the tree-traversal method; adding a single node would take too much time.
Structure 2:
I have an array of the leaf nodes from the structure above; however, every leaf node has its own data associated with it, though not contained within it.
From my (extremely limited) understanding of SQL, the second structure can be implemented in MySQL or something similar. The problem is that I need to be able to retrieve individual nodes from the second structure rather than the entire array of nodes. Also, I don't know the length of the array ahead of time, so I can't simply make a table with a fixed number of columns per node: I'd end up with over 9.09x10^55 columns when I will regularly be using only 5 or fewer.
If you have any recommendations as to what kind of database I could use to implement these structures relatively easily, or any advice pertaining to the implementation itself, it would be greatly appreciated.

Azure Search facets and filter

I'm using Azure Search on my e-commerce site and now want to implement filtering, but I can't choose the right way to do it. The problem is that I have very different product types, so they have various attributes (properties), and I don't want to create an index with 50 attribute fields just to facet them all. Alternatively, I could define a few generic properties (like Attribute1, Attribute2, ...) and then determine their 'key' names according to the faceted values, but that doesn't sound great either. Is there some common or proven way to build filters on e-commerce sites?
Azure Search will do well even if you have 50 fields with sparse values. Assuming that sparseness comes with relatively low per-facet cardinality, it shouldn't have a bad performance impact, and by explicitly modeling attributes as fields you keep the nice, clean programming model that comes with explicit fields.
The attribute-mapping approach would work, assuming all facets are of the same data type or can be grouped easily.
Another thing you can do for string facets is to have a prefix in the facet value (e.g. Cameras/Canon, Memory/MiniSD, etc.). That gives you counts per value within the parent scope. If you also have the parent scope (e.g. Camera, Memory) in a separate field you can filter by the whole scope if needed. This generalizes well into hierarchical facets (as long as they are all strings). Here is a blog post that explores this technique.
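One way that encoding can look with the classic Microsoft.Azure.Search .NET SDK; this is a hedged sketch, and the service, index, and field names are hypothetical:

using System.Collections.Generic;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

// Hypothetical index "products" with two facetable/filterable string fields:
// "category" (the parent scope, e.g. "Cameras") and "categoryPath"
// (the prefix-encoded value, e.g. "Cameras/Canon").
var client = new SearchIndexClient("<service-name>", "products",
    new SearchCredentials("<query-key>"));

var parameters = new SearchParameters
{
    // Facet on the encoded field: counts come back per value within the parent scope.
    Facets = new List<string> { "categoryPath" },
    // Filter by the whole parent scope via the separate scope field.
    Filter = "category eq 'Cameras'"
};

var results = client.Documents.Search("*", parameters);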

Mongo: query documents from multi-collection

Suppose there are two tables, student and class:
SELECT student.name, class.subj
FROM student
INNER JOIN class
ON student.class_id = class.class_id;
This works in SQL, but what about MongoDB? I know that MongoDB does not support joins, but I don't want to put everything in one collection; I want to keep two collections, query across them, and return the result as one piece of data. (For the reason I want to do it this way, please see this.) How can I do that?
Currently MongoDB does not support cross-collection queries, and AFAIK there is no plan to add such functionality; it conflicts with the whole concept of document-oriented databases.
We faced the same issue with MongoDB earlier while working on a Node.js project. Our solution was to put the subdocuments into another collection, with a reference back to the parent document via MongoDB's _id parameter. A large part of this was handled by the Mongoose ORM, but at its core it still makes two separate requests: one to retrieve the parent document and another to retrieve all of its children, with the parent document keeping an array listing the _ids of all its children.
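For illustration, a minimal sketch of that two-request pattern with the official MongoDB .NET driver (the collection and field names are hypothetical; the flow is the same with Mongoose on Node.js):

using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical schema: each document in "parents" keeps an array field
// "childIds" listing the _ids of its subdocuments in the "children" collection.
var db = new MongoClient("mongodb://localhost:27017").GetDatabase("app");
var parents = db.GetCollection<BsonDocument>("parents");
var children = db.GetCollection<BsonDocument>("children");

// Request 1: retrieve the parent document.
var parent = parents.Find(new BsonDocument("name", "example")).FirstOrDefault();

// Request 2: retrieve all of its children in a single $in query.
var filter = Builders<BsonDocument>.Filter.In("_id", parent["childIds"].AsBsonArray);
var childDocs = children.Find(filter).ToList();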
This is a difference in schema design between SQL and NoSQL. In SQL the schema is fixed and changing it is sometimes painful, but that fixed schema buys you the ability to run complex queries. In NoSQL there is no fixed schema; the schema lives in your head (and hopefully your documentation) and you have to follow it yourself, but in exchange you get good speed at the database level.
UPDATE: In the end we merged the two collections into one. There were still some problems querying subdocuments from the parent document, but it was fairly easy and did not change much for us. I would recommend looking into this rather than splitting into two separate collections. It also depends primarily on your database workload: will you be doing more read queries or more write queries? With a NoSQL schema you need to consider this too; if you read more, a single collection is the way to go.

Using QAbstractItemModel to represent data in a SQL database

I am trying to create a QTreeView to display data from a SQL database. This is a large database, so simply loading the data into a QStandardItemModel seems prohibitive.
None of Qt's pre-built SQL model classes are sufficient for the task. Therefore it seems necessary to subclass QAbstractItemModel.
In the first place, I can find no examples where this is done, so I am wondering whether it is the correct approach.
Implementing QAbstractItemModel::data is pretty straightforward. I am uncertain how to implement QAbstractItemModel::parent.
Qt's "Simple Tree Model Example" example would be informative, but in that example the tree structure is represented in memory with the TreeItem class. I could copy that, but if I am going to duplicate the database structure, it would be just as easy to use QStandardItemModel. If I need to maintain a separate data structure (in addition to the database and the QAbstractItemModel subclass) to represent the tree structure, is there any advantage to subclassing QAbstractItemModel over just using a QStandardItemModel?
The challenge in the tree structure is always being able to identify a model index's parent (i.e., overriding the parent() method). In the Simple Tree example this is done by storing the tree structure in a separate data structure, which is impractical for large SQL queries. For the right database structure you might be able to calculate the proper parent node given the child, but that is not guaranteed. The only alternative I can imagine is passing a quint32 to QAbstractItemModel::createIndex which encodes the item's parent.
One performance consideration might be useful. After giving up on subclassing QAbstractItemModel, I tried populating a QStandardItemModel from the database. I loaded about 1200 items into the model, plus four child items for each, using two separate database calls. This took about 3 seconds on a 2009 laptop, which is faster than I had expected. (And there would be performance gains from using a single query instead of repeated queries.)
In the end I went another route: several QTableViews in the GUI, with signals and slots to show different aspects of the data. My code is much simpler, and the proper functionality is in place, so this feels like the "right" solution.

Storing dynamic fields with Doctrine2

In our app we are looking to use Doctrine 2; however, there is one feature we want to offer that I am completely confused about how to implement.
We want our customers to be able to define custom fields on our standard objects. These fields would be created on the fly, and would not be part of the object definition that is known and mapped by Doctrine.
Our first thought was to use NoSQL (MongoDB or Amazon DynamoDB) to store some of this data, but since we want to use Doctrine to handle our core objects, we would like to stay within the realm of Doctrine to achieve this, without having to extend beyond it to store this data.
One idea was to use Doctrine's ability to serialize/unserialize complex objects and just keep a hash of custom field names and their values as an extra property on the object; however, this would not let us offer search over these fields if we ever wanted to.
Has anyone attempted this with Doctrine 2 or any ORM variant?
You could consider using Doctrine ODM, which is Doctrine 2 but for NoSQL - I believe they support at least MongoDB.
Another approach would be to use serialization, as you said. You probably shouldn't worry about search too much; I would recommend a separate full-text search engine (Solr, Elasticsearch, or another), as they provide much more versatility and performance than SQL full-text search.
Third, you could use Doctrine alongside NoSQL. In this case you should probably abstract your querying into a service class or similar, so that you can use Doctrine to query the data in your SQL DB and something else to query the remaining data.
Finally, you could consider a key-value table (the classic entity-attribute-value pattern): one column holds the key, another the value.