Mongo: query documents from multiple collections

There are two tables, student and class:
SELECT student.name, class.subj
FROM student
INNER JOIN class
ON student.class_id = class.class_id;
This works in SQL, but what about MongoDB?
I know that MongoDB does not support joins,
but I don't want to put everything in one collection.
I want to keep the data in two collections, query both, and return a single result.
For the reason I want to do it this way, please see this.
So how can I do that?

Currently MongoDB does not support cross-collection requests, and AFAIK there is no plan to add such functionality. It goes against the whole concept of document-based databases.
We faced the same issue with MongoDB earlier while working on a Node.js project. The solution for us was to put the subdocuments into another collection, each with a reference to its parent document by MongoDB's _id field. A large part of this was handled by the Mongoose ODM, but at its core it still does two separate requests: one to retrieve the parent document and another to retrieve all of its children. The parent document still holds an array field listing the _id of each of its children.
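Applied to the student and class example from the question, the two-request approach can be sketched roughly as follows. This is only an illustration: two in-memory lists stand in for the collections (with a real driver each list would come from its own query), and the sample names and the helper `join_students_with_classes` are made up for the example.

```python
# Two in-memory lists stand in for the "student" and "class" collections.
# The sample data is hypothetical.
students = [
    {"_id": 1, "name": "Ann", "class_id": 10},
    {"_id": 2, "name": "Bob", "class_id": 20},
    {"_id": 3, "name": "Cid", "class_id": 99},  # no matching class
]
classes = [
    {"class_id": 10, "subj": "Math"},
    {"class_id": 20, "subj": "Physics"},
]


def join_students_with_classes(students, classes):
    """Emulate INNER JOIN ... ON student.class_id = class.class_id in the app."""
    # Result of the "second request": index classes by key for O(1) lookups.
    subj_by_class_id = {c["class_id"]: c["subj"] for c in classes}
    # Keep only students whose class exists, as an inner join would.
    return [
        {"name": s["name"], "subj": subj_by_class_id[s["class_id"]]}
        for s in students
        if s["class_id"] in subj_by_class_id
    ]


rows = join_students_with_classes(students, classes)
print(rows)
```

The student without a matching class is dropped, which mirrors the INNER JOIN in the question; keeping such rows with an empty subject would give LEFT JOIN behaviour instead.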
This is a difference in schema design between SQL and NoSQL. In SQL the schema is fixed and changing it is sometimes painful, but in return the fixed schema lets you run complex queries. In NoSQL there is no fixed schema; the schema lives in your head (and perhaps in documentation) and you have to follow it yourself, but this gives you good speed at the database level.
UPDATE: In the end we merged the two collections into one. There were still some problems with querying subdocuments from the parent document, but it was pretty easy and did not change much for us. I would recommend looking into this rather than splitting the data into two separate collections. It also depends mainly on your workflow with the DB: will you be doing more read queries or more write queries? With a NoSQL schema you need to consider those points as well. If you mostly read, a single collection is the way to go.

Related

MongoDB: Get latest 10 items in different data model?

I'm trying to build a personal blog system with MongoDB. The blog system is designed to support both formal articles and short tweets, which have similar but different data structures. For example, an article must have a title, but a tweet must not; both have a creation date.
The design also includes a timeline feature. The timeline shows the latest 10 items, regardless of whether an item is an article or a tweet, and when the "load more" button is pressed, the next 10 items are shown...
So I think there are two methods to design the database schema:
1. Save articles and tweets into separate collections.
2. Save everything into a single collection, and add a "type" field to tell the items apart.
And I have three questions:
1. If I use method 1, how do I implement the "latest 10 items" query? Any explanation in words, or example code in the MongoDB query language, mongoose, mongoengine or anything else, is welcome.
2. If I use a traditional SQL DB (like MySQL), is there a common method to solve the "latest 10 items" problem?
3. Which method fits MongoDB's philosophy better?
Thanks
Question 1: If you are saving articles and tweets into separate collections, you'll either have to do application-side joining or use the $lookup operator, which will not work if you have sharded collections. I tend to avoid operators that have that limitation.
Question 2: I don't work with SQL, so I can't help you there.
Question 3: Saving everything into a single collection will definitely fit the MongoDB philosophy better. MongoDB should be fast at retrieving, although slower at inserts and updates. Doing application-side joining or having to use $lookup kind of throws its ability to embed documents out of the window.
As for your data model, here's my take. I used the Java driver and used to have a custom deserializer/serializer to handle the document-to-POJO mapping. I believe this is natively supported in Mongo Java Driver 3.5; I'm not sure whether Mongoose has it already. In any case, you may define a Model/Object that contains all fields of both models; it will then serialize accordingly, regardless of which type you're fetching from the DB using method 2. The model will get a little messy as you add more fields, though, so some clever naming might be necessary.
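For method 1, the application-side joining mentioned under Question 1 could look roughly like the sketch below. It is an illustration, not a driver example: plain lists stand in for the two collections, the sample documents and the helper `timeline_page` are made up, and with a real database each source would be a query sorted by creation date and limited to the number of items needed.

```python
import heapq
from datetime import datetime, timedelta
from itertools import islice

base = datetime(2024, 1, 1)
# Hypothetical sample data: articles on even minutes, tweets on odd minutes.
articles = [
    {"type": "article", "title": f"A{i}", "created": base + timedelta(minutes=2 * i)}
    for i in range(15)
]
tweets = [
    {"type": "tweet", "text": f"t{i}", "created": base + timedelta(minutes=2 * i + 1)}
    for i in range(15)
]


def newest_first(items):
    """Stand-in for a per-collection query sorted by creation date, descending."""
    return sorted(items, key=lambda d: d["created"], reverse=True)


def timeline_page(articles, tweets, page, page_size=10):
    """Return one page of the merged timeline, newest first (page 0 is the latest)."""
    # The newest N items overall are always among the newest N of each source,
    # so each source only has to supply (page + 1) * page_size items.
    needed = (page + 1) * page_size
    newest_articles = newest_first(articles)[:needed]
    newest_tweets = newest_first(tweets)[:needed]
    # Merge two already-sorted streams without re-sorting everything.
    merged = heapq.merge(
        newest_articles, newest_tweets, key=lambda d: d["created"], reverse=True
    )
    return list(islice(merged, page * page_size, needed))


page0 = timeline_page(articles, tweets, 0)
page1 = timeline_page(articles, tweets, 1)
print([item["created"].minute for item in page0])
```

With method 2 none of this is needed: a single query sorted by creation date with a skip and a limit returns the same page.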

DocumentDB vs. SQL Database

I have a question regarding the usage of a DocumentDB or SQL-Database.
E.g. I have categories which can have multiple child categories and so on. Every category can have multiple attributes and every attribute can have one or many values. Would it be better to use a schemaless solution like a DocumentDB, because I could add new sub categories etc. with no effort, or is it better to stick with a schema and use a SQL-Database?
Many thanks in advance.
As @DavidMakogon said, there is no standard, absolutely right answer; it is up to you and your application scenario. For the current need of storing a tree structure of categories with attributes, it is simple to design the database schema and develop the application with either option, as long as there are no additional conditions such as data volume or concurrency, and both are good choices.
For other considerations, there are two documents that may help you analyze which features you need in your application, or which option suits your scenario better, so you can make your choice:
MongoDB vs MySQL: Comparison Between RDBMS and Document Oriented Database. The comparison between DocumentDB and SQL Database is very similar.
10 things never to do with a relational database. I think the advantages of an RDBMS, and the scenarios it suits, are well known, but those of NoSQL are not.
Hope it helps.
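To make the comparison concrete, here is a small sketch of the category tree from the question stored as one nested document, together with a walk that flattens it into the (category, attribute, values) rows a relational schema would hold. The field names `name`, `attributes` and `children`, the sample categories, and the helper `walk` are all assumptions made for the example.

```python
# One nested document holding a category, its attributes and its children.
# All names and values here are hypothetical sample data.
category = {
    "name": "Electronics",
    "attributes": {"warranty": ["1 year", "2 years"]},
    "children": [
        {
            "name": "Phones",
            "attributes": {"color": ["black", "white"], "storage": ["64GB"]},
            "children": [],
        },
        {
            "name": "Laptops",
            "attributes": {"ram": ["8GB", "16GB"]},
            "children": [],
        },
    ],
}


def walk(node, path=()):
    """Yield (category path, attribute, values) for every category in the tree."""
    here = path + (node["name"],)
    for attr, values in node["attributes"].items():
        yield "/".join(here), attr, values
    for child in node["children"]:
        yield from walk(child, here)


rows = list(walk(category))
for row in rows:
    print(row)
```

Adding a sub category in the document form is just appending to a `children` list, while the flattened rows show what the same data looks like once it is split across tables.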

MongoDB embedding vs SQL foreign keys?

Are there any particular advantages to MongoDB's ability to embed objects within a document, compared to SQL's use of foreign keys for the same logic?
It seems to me that the only advantage is ease of use (and perhaps performance?), and even that seems like it could be easily abstracted away (e.g. Django seems to handle SQL's foreign keys pretty intuitively).
This boils down to a classic question of whether to embed or not.
Here are a few links to get started before I explain some more:
Where should I put activities timeline in mongodb, embedded in user or separately?
MongoDB schema design -- Choose two collection approach or embedded document
MongoDB schema for storing user location history
Now to answer more specifically.
You must remember the server-side use of foreign keys in SQL: JOINs. Embedding takes a single round trip to get all the data you need in a single document. JOINs do not: they are in fact two selections based upon a range, which are then merged to omit duplicates (with significant overhead on some data sets).
So the use of foreign keys is not purely app-dependent; it is also server- and database-dependent.
That being said, some people misunderstand embedding in MongoDB and try to make all their data fit into one document. Unfortunately, this is reinforced by the common belief that you should always try to embed everything. The links above, and others, provide useful guidance on this.
Now that we have cleared some things up, the main pros of embedding over JOINs are:
Single round trip
Easy to update the document in a lot of cases, unless you embed many levels deep
Can keep entity data with the entity it is related to
However, embedding has a few flaws:
The document must be paged in to get its values, which can be problematic with larger documents
Subdocuments are designed to be unique to their parent entity and not to require advanced querying, so you would not normally embed one separate entity inside another related one; i.e. a post could embed comments, but a user probably wouldn't embed posts, due to the query needs.
Nesting more than 3 levels deep could affect your ability to use things such as the atomic lock.
So, when used right, MongoDB's embedding can be hugely powerful compared with SQL JOINs, but you must understand when to use it.
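The single-round-trip point can be illustrated with the post and comments example. This is a minimal sketch with in-memory structures standing in for collections; the sample data and the helpers `load_embedded` and `load_referenced` are made up for the illustration.

```python
# Embedded layout: comments live inside the post document.
posts_embedded = {
    1: {
        "_id": 1,
        "title": "Hello",
        "comments": [
            {"author": "ann", "text": "Nice"},
            {"author": "bob", "text": "+1"},
        ],
    },
}

# Referenced layout: comments sit in their own collection and point back
# to the post through post_id, like a foreign key.
posts = {1: {"_id": 1, "title": "Hello"}}
comments = [
    {"post_id": 1, "author": "ann", "text": "Nice"},
    {"post_id": 1, "author": "bob", "text": "+1"},
    {"post_id": 2, "author": "cid", "text": "Other post"},
]


def load_embedded(post_id):
    """One read returns the post together with its comments."""
    return posts_embedded[post_id]


def load_referenced(post_id):
    """Two reads: the post first, then every comment that references it."""
    post = dict(posts[post_id])  # read 1
    post["comments"] = [  # read 2
        {"author": c["author"], "text": c["text"]}
        for c in comments
        if c["post_id"] == post_id
    ]
    return post


print(load_embedded(1) == load_referenced(1))
```

Both layouts produce the same result; the difference is how many reads it takes and how large the single document becomes as comments accumulate.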
The core strength of Mongo is in its document-view of data, and naturally this can be extended to a "POCO" view of data. Mongo clients like the NoRM Project in .NET will seem astonishingly similar to experienced Fluent NHibernate users, and this is no accident - your POCO data models are simply serialized to BSON and saved in Mongo 1:1. No mappings required.
Overall, the biggest difference between these two technologies is the model and how developers have to think about their data. Mongo is better suited to rapid application development.

Storing dynamic fields with Doctrine2

In our app, we are looking to use Doctrine2. However, there is one feature we want to offer, and I am completely confused as to how it would work.
We want our customers to be able to define custom fields on our standard objects. These fields would be created on the fly, and would not be part of the object definition that is known and mapped by Doctrine.
Our first thought was to use NoSQL (MongoDB or Amazon DynamoDB) to store some of this data, but since we want to use Doctrine to handle our core objects, we would like to stay within the realm of Doctrine and not have to go beyond it to store this data.
One thing on my mind was using Doctrine's ability to serialize/unserialize complex objects and simply keeping a hash of custom field names and their values as an extra property on the object. However, this would not let us offer a feature to search these fields if we ever wanted to allow that...
Has anyone ever attempted to do this with Doctrine2 or any ORM variant?
You could consider using Doctrine ODM, which is Doctrine 2 but for NoSQL - I believe they support at least MongoDB.
Another approach would be to use serialization, as you said. You probably shouldn't worry about search too much: I would recommend using a separate full-text search engine (Solr, ElasticSearch, or other), as they provide much more versatility and performance for search than SQL full-text search.
Third, you could use Doctrine alongside NoSQL. In this case, you should probably abstract your querying into a service class or similar, so that you can use Doctrine to query the data in your SQL DB and some other library to query the remaining data.
Finally, you could consider using a key-value table. One column represents the key, another the value.
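The key-value table idea can be sketched as follows. This shows only the SQL side, using Python's built-in sqlite3 module rather than Doctrine; the table and column names (`customer`, `custom_field`, `field_key`, `field_value`) and the helper functions are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per (object, custom field): one column for the key, one for the value.
    CREATE TABLE custom_field (
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        field_key   TEXT NOT NULL,
        field_value TEXT,
        PRIMARY KEY (customer_id, field_key)
    );
    """
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme'), (2, 'Globex')")
conn.executemany(
    "INSERT INTO custom_field VALUES (?, ?, ?)",
    [(1, "region", "EU"), (1, "tier", "gold"), (2, "region", "US")],
)


def custom_fields(customer_id):
    """Return all custom fields of one customer as a dict."""
    rows = conn.execute(
        "SELECT field_key, field_value FROM custom_field WHERE customer_id = ?",
        (customer_id,),
    )
    return dict(rows.fetchall())


def find_customers(field_key, field_value):
    """Search customers by a custom field, which a serialized blob would not allow."""
    rows = conn.execute(
        "SELECT c.name FROM customer c "
        "JOIN custom_field f ON f.customer_id = c.id "
        "WHERE f.field_key = ? AND f.field_value = ? ORDER BY c.name",
        (field_key, field_value),
    )
    return [name for (name,) in rows]


print(custom_fields(1))
print(find_customers("region", "US"))
```

Unlike the serialized hash, this layout keeps the custom fields searchable with plain SQL, at the cost of storing every value as text and needing a join to read them back.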

Is it possible to use Nhibernate with partition of an object over several tables?

We have a system that gathers large quantities of data each month and performs rather advanced calculations that grow the database even more. The customer requires that data from the last three years be stored for fast access, and that older data (up to ten years) remain accessible; access to the older data may be slower and may require some work. We want to avoid performance issues where the database and its tables grow out of proportion.
We discussed using SQL Enterprise (VERY costly and full of traps, since we don't have the know-how). Since our system has so many tables that reference each other, we are leaning towards creating some kind of history tables to which we move data monthly, and rewriting our select queries so that, based on parameters, they search the regular table, the history table, or both, depending on the situation.
Since we are also using NHibernate for mapping, I was wondering whether it is possible to create a mapping file that handles this (almost) by itself, using some sort of polymorphism or inheritance in which each object is stored in a different table based on parameters?
I know this sounds complicated and strange, and that there are other ways to do this, but for this question I would rather have people answer the question asked than suggest alternatives.
As far as I know, NHibernate can't do that (each class can be mapped to one table/view), but you can use SQL queries or stored procedures (depending on the version of NHibernate you are using) to populate mapped objects.
In your case you can create a combined view as a union of the different tables. Then you can use a SQL query to populate your entity.
Another solution is to create a summary object for your queries that is mapped to that view, so that you can use both HQL and criteria to query this object.
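The combined view could look roughly like this on the database side. The sketch uses Python's built-in sqlite3 module purely to show the SQL; the table, view and column names (`measurement`, `measurement_history`, `measurement_all`, `archived`) are hypothetical, and the NHibernate mapping of the view is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE measurement (id INTEGER PRIMARY KEY, taken_on TEXT, amount REAL);
    CREATE TABLE measurement_history (id INTEGER PRIMARY KEY, taken_on TEXT, amount REAL);
    -- One view over both tables; "archived" records which table a row came from.
    CREATE VIEW measurement_all AS
        SELECT id, taken_on, amount, 0 AS archived FROM measurement
        UNION ALL
        SELECT id, taken_on, amount, 1 AS archived FROM measurement_history;
    """
)
conn.execute("INSERT INTO measurement VALUES (3, '2024-05-01', 30.0)")
conn.executemany(
    "INSERT INTO measurement_history VALUES (?, ?, ?)",
    [(1, "2015-01-01", 10.0), (2, "2019-06-01", 20.0)],
)


def measurements_since(date_iso):
    """Query current and history rows through the single view."""
    rows = conn.execute(
        "SELECT id, archived FROM measurement_all "
        "WHERE taken_on >= ? ORDER BY taken_on",
        (date_iso,),
    )
    return rows.fetchall()


print(measurements_since("2019-01-01"))
```

The dates are stored as ISO strings so that plain string comparison orders them correctly; an entity mapped to the view would then see one logical table regardless of where a row physically lives.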
Short answer: "no". I would not create views, since you mention a lot of joining.
Personally, I would create summary tables and map to these directly, using a stateless session or at the very least mutable=false on the class definition. Think of these summary tables as denormalised data for reporting only. The only drawback is that if historical data changes on a regular basis, the summary tables also need to change. If historical data never changes, this should be simple to achieve.
I would also most probably store these summary tables in another catalog rather than adding to the size of the current system.
This one is not a quick win, I'm afraid.