Rails: mixing NoSQL & SQL databases

I'm looking for the best way (i.e. architecture) to have different kinds of DBs (MySQL + MongoDB) backing the same Rails app.
I was speculating on a main Rails 3.1 app mounting Rails 3.1 engines, each linked to a different kind of DB ...
... or a main Rails 3.0.x app routing to a Sinatra endpoint for each MySQL/MongoDB instance ...
Do you think this is possible? Any ideas or suggestions?
I noticed some other similar questions here, but I think that "mounting apps" is moving fast in Rails 3.1 / Rack / Sinatra and we all need to adjust our paradigms.
Thanks in advance
Luca G. Soave

There's no need to completely over-complicate things by running two apps just to have two types of database. It sounds like you need DataMapper. It'll do exactly what you need out of the box. Get the dm-rails gem to integrate it with Rails.
In DataMapper, unlike ActiveRecord, you have to provide all the details about your underlying data store: what fields it has, how they map to the attributes in your models, what the table names are (if in a database), what backend it uses, and so on.
Read the documentation... there's a bucket-load of code to give you an idea.
Each model is just a plain old Ruby object. The class definition just mixes in DataMapper::Resource, which gives you access to all of the DataMapper functionality:
class User
  include DataMapper::Resource

  property :id,            Serial
  property :username,      String
  property :password_hash, String
  property :created_at,    DateTime
end
You have a lot of control, however. For example, I can specify that this model is not stored in my default data store (repository) and that it's stored in one of the other configured data stores instead (which can be a NoSQL store, if you like).
class User
  include DataMapper::Resource

  storage_names[:some_other_repo] = 'whatever'

  # ... SNIP ...
end
Mostly DM behaves like ActiveRecord on steroids. You get all the basics, like finding records (except you never have to use the original field names if your model abstracts them away):
new_users = User.all(:created_at.gte => 1.week.ago)
You get validations, you get observers, you get aggregate handling... then you get a bunch of other stuff, like strategic eager loading (solves the n+1 query problem), lazy loading of large text/blob fields, and multiple repository support. The query logic is much nicer than AR's, in my opinion. Just have a read of the docs. They're human-friendly, not just an API reference.
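As a sketch of the multiple-repository support mentioned above (the repository names and the Mongo adapter are assumptions here, not part of the original answer):

# Configure two repositories (assumes the dm-mysql and dm-mongo adapter gems are installed)
DataMapper.setup(:default,   'mysql://localhost/myapp')
DataMapper.setup(:analytics, 'mongo://localhost/myapp_analytics')

# Queries run against :default unless you scope them explicitly:
DataMapper.repository(:analytics) do
  User.all(:created_at.gte => 1.week.ago)
end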
What's the downside? Well, many gems don't take into account that you might not be using ActiveRecord, so there's a bit more searching to do when you need a gem for something. This will get better over time though, since before Rails 3.x seamlessly integrating DM with Rails wasn't so easy.

I don't fully understand your question, i.e.:
what problem are you facing right now using Mongo and MySQL in the same app, and
what's the reason for going with multiple Rails apps dealing with different DBs?
Though I'm not an expert in Ruby & Rails (I picked them up a few months ago), I'd like to add something here.
I am currently building a Rails app that uses both Mongo and MySQL in the back end, with Mongoid and ActiveRecord as the drivers: MySQL for transactions and Mongo for all other kinds of data (mainly geospatial). It's quite straightforward. You can create different models inheriting from Mongoid and ActiveRecord.
class Item
  include Mongoid::Document

  field :name,     :type => String
  field :category, :type => String
end
and
class User < ActiveRecord::Base
end
And you can query both the same way (except for complex SQL joins; Mongoid also has some additional query patterns for geospatial kinds of queries):
Item.where(:category => 'car').skip(0).limit(10)
User.where(:name => 'ram')
It's a breeze. But there are some important points you need to know:
Create your ActiveRecord models before the Mongoid models. Once Mongoid is activated (rails g mongoid:config adds mongoid.yml), all the scaffolding and generators work against MongoDB; otherwise you would need to delete mongoid.yml every time before creating an ActiveRecord model.
And don't use Mongoid in a relational way. I know Mongoid provides a lot of options for defining relations; for example, belongs_to relations store the reference ids in the child documents, which is quite the opposite of Mongo's DBRef. It gets confusing when you abandon the Mongo idioms in favour of an ActiveRecord feel, so try to stick with its document nature: use embedding and DBRefs wherever appropriate. (Maybe someone can correct me if I'm wrong.)
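For example, a minimal sketch of the embedded style (the Post/Comment models are hypothetical, just to show the idiom):

class Post
  include Mongoid::Document

  field :title, :type => String
  embeds_many :comments              # comments live inside the post document
end

class Comment
  include Mongoid::Document

  field :body, :type => String
  embedded_in :post, :inverse_of => :comments
end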
Still, Mongoid is a great piece of work. It's fully loaded with features.

Related

Mongoid database schema

We're making an application using Ruby on Rails with Mongoid.
I've been reading up on the documentation of MongoDB and the Mongoid gem regarding how to design the database schema when using a document-based database. I've been going over it in my head for a while now, and would really appreciate some input from people with more experience than me, to know whether or not I'm completely lost :p
Anyhow, here's an overview (I've tried to keep it as simple as possible to keep the question factual) of the application we're making:
The application consists of the following entities:
Users, Subjects, Skills, Tasks, Hints and Tutorials.
They are organized in the following manner:
Subjects consist of a set of 1..n Skills.
Skills consist of a set of Tasks, (sub-)Skills, or both (i.e. skills can form a tree structure, where one main skill (say, Geometry) is the root and other skills are child nodes (for instance, the Pythagorean Rule might be a sub-skill)). However, all skills, regardless of whether they have sub-skills or not, should consist of 0..n Tasks.
Tasks have a set of 1..n Hints associated with them.
Hints are each associated with a particular Task.
Tutorials are associated with 1..n Skills (this skill can be either a root node or a leaf node in a skill tree).
Users can complete 0..n Tasks in order to complete 0..n Skills.
Now, we imagine that most calls to the database will be read queries: fetching the collection of skills/tasks completed by certain users, and displaying the various skill trees associated with a subject. The main write queries will probably relate various users to tasks, in the following form:
User A completes Task B
and so forth. Also, we imagine that the relative number of entities would be as follows: Users > Hints > Tasks > Skills > Tutorials > Subjects.
Currently, this is the solution we have in mind:
Subject.rb
  has_and_belongs_to_many :subjects

Skill.rb (uses Mongoid::Tree)
  has_and_belongs_to_many :subjects
  embeds_many :tasks

Task.rb
  embedded_in :skill, :inverse_of => :tasks
  embeds_many :hints

Hint.rb
  embedded_in :task, :inverse_of => :hints
We haven't really started implementing the tutorials and the connection between user and skills/tasks yet, but we imagine that the relationship between user and skill/tasks necessarily has to be N:N (which I guess is rather inefficient).
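For illustration, one hypothetical shape we could imagine for that N:N link, assuming Tasks were promoted to top-level documents instead of embeds (the model and field names here are made up):

class Completion
  include Mongoid::Document

  belongs_to :user   # referenced rather than embedded, since both sides grow large
  belongs_to :task
  field :completed_at, :type => DateTime
end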
Is using a document-based database for this kind of application a bad idea? And if not, how can we improve our schema to make it as efficient as possible?
Cheers, sorry for the wall of text :-)
In my honest opinion, unless some part of your data is going to be completely unstructured, I see no need to use a NoSQL solution for your problem.
I'm starting from the point of view that you do have database knowledge, so you are familiar with MySQL/PostgreSQL/etc.
I sincerely believe that PostgreSQL (or MySQL, for that matter) will be easier for you to set up, to write your code against, to maintain, and perhaps eventually to scale up.
My take on NoSQL is: use it when you have unstructured datasets and you need the flexibility of adding fields on the fly. There are pitfalls to using MongoDB.
It's not a silver bullet.
Some of the pitfalls, for instance: in() queries are slow; if you write something to Mongo and then want to immediately read it back, you have to expect that you won't get it (because of sharding in Mongo); map-reduce is kind of a hassle (I haven't tried the aggregation framework with Moped, but it does look promising); and indexing can sometimes have issues with sharded collections.

How do I add a (SQL) database to my Sinatra application?

When I started to code my Sinatra application I had never used it before. Note that I had, and still have, no experience with RoR. I had one .rb file and one .haml file and was happy. Now I've had to split the .rb file into about 10 'library' files as the whole application gets more and more complex.
I store some application logs/info in CSV files, and now I am getting conflicts when accessing the CSV file. So I think I need to introduce a "proper" database solution. I want it to be part of my Ruby (Sinatra) application.
How can I introduce a 'light' SQL database into my Sinatra application?
I am on ruby 1.8.7 (2010-08-16 patchlevel 302) [i386-mingw32] soon upgrading to 1.9 (hopefully)
I'd recommend looking at Sequel. It's very flexible and powerful, and works well with SQLite, MySQL, Postgres, Oracle and other DBMSs. It's not opinionated about how you talk to the database; you can use it as an ORM or with simple datasets, and it allows embedded SQL or more programmatic approaches.
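To give an idea, here's a minimal sketch of Sequel's dataset style against SQLite (the file and table names are invented for illustration):

require 'sequel'

DB = Sequel.sqlite('myapp.db')    # on-disk SQLite database

# Create the table if it doesn't exist yet
DB.create_table? :logs do
  primary_key :id
  String :level
  String :message
  Time   :created_at
end

logs = DB[:logs]
logs.insert(:level => 'info', :message => 'hello', :created_at => Time.now)
logs.where(:level => 'info').each { |row| p row }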
For an ORM, both ActiveRecord and Sequel are recommended. As for the database, I guess SQLite3 will be good enough for your needs; you could also choose MySQL or PostgreSQL.
If you want to use active_record, you'll find this article very useful.
And if Sequel is the choice, just read Sequel documents here.
After the gems are installed, you can start writing some code to connect to the db, then maybe some migration tasks to build the database tables (and don't forget to build some corresponding models). Both gems have a similar syntax for migrations. After that, import your CSV data and you're done.
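For instance, a Sequel migration could look like this (a sketch; the table, columns and file names are assumptions):

# db/migrate/001_create_logs.rb
Sequel.migration do
  change do
    create_table(:logs) do
      primary_key :id
      String :message
      Time   :created_at
    end
  end
end

# Run it with: sequel -m db/migrate sqlite://myapp.db

# Importing the CSV data afterwards (FasterCSV on Ruby 1.8.7; DB as set up above):
require 'fastercsv'

FasterCSV.foreach('logs.csv') do |row|
  DB[:logs].insert(:message => row[0], :created_at => row[1])
end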
I have had no trouble using either Active Record or DataMapper to add object persistence to my Sinatra apps. People also tell me Sequel is very good, but philosophically it is not worlds apart from Active Record, imho.
Active Record and Sequel favour a more database-centric approach, whereby you spell out your tables as a set of database and table definitions in a collection of migration files and merge them into a schema, which is then used to build or update your database tables. If you really care about the underlying SQL database then one of these is for you. I find them to be six of one, half a dozen of the other.
DataMapper is more object-centric: it lets you define the properties and object relationships you need in your object's own class definition; then, when your app launches, you make sure you call DataMapper.auto_upgrade! and it upgrades the database to suit your object graph. The upside is that you only have one place to look to find what properties your object might have. The downside is that you have less control over the specifics of the underlying database, though it's not impossible to tightly define the mappings. DataMapper works well when you care about object graphs over database tables.
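For what it's worth, a minimal sketch of that boot sequence in a Sinatra app (the connection URI and the Note model are placeholders):

require 'sinatra'
require 'data_mapper'   # meta-gem: pulls in dm-core, dm-migrations, etc.

DataMapper.setup(:default, "sqlite3://#{Dir.pwd}/app.db")

class Note
  include DataMapper::Resource

  property :id,   Serial
  property :body, Text
end

DataMapper.finalize
DataMapper.auto_upgrade!   # adds missing tables/columns non-destructively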
The good news is they pretty much all work in the same way once you have your mappings from object graph to SQL database tables defined. All support lazy or pre-emptive loading of related collections of objects, many_to_many relationships, polymorphism, etc, and tend to vary only in configuration and seeding details.
I often start projects using DataMapper just for its speed of throwing up and tearing down database schemas, as the app's object graph is still in flux; I refactor quickly to use Active Record when the schema has settled down. Next project I think I'll give Sequel a go though, as people do seem to rave about it.
I have had success using DataMapper with Sinatra, but like the other posts said, you can also use Sequel and Active Record. One advantage to using Active Record, though, is that if you ever want to use/learn RoR, Active Record is the default ORM, so that might be something you want to consider.
If you don't want to go the ORM route you can always use the sqlite-ruby gem, which will allow you to create and run SQL queries directly. Here is some example code from the website http://sqlite-ruby.rubyforge.org/
require 'sqlite'

db = SQLite::Database.new("data.db")
db.execute("select * from table") do |row|
  p row
end
db.close

Remove a database field when deleting property from a class using datamapper

I am using datamapper in a Sinatra application. I currently use the command
DataMapper.finalize.auto_upgrade!
to handle the migrations. I had two Classes (Artists and Events) with a 'has_n' and 'belongs_to' association. An Event 'belonged_to' one Artist and an Artist could have many Events associated with it.
I changed the association to be a many_to_many relationship by deleting the previous parts of the class definition which governed the original one_to_many association in the models and adding
has n, :artists, :through => Resource
to the Event class and the corresponding code to the Artist class. When I make a new Event, an error is kicked off:
#<DataObjects::IntegrityError: events.artist_id may not be NULL
The :artist_id field is a relic of the original association between the two classes. The new many-to-many association is accessed by event.artists[i] (where 'i' is just an integer index going from 0 to the number of associated artists - 1). Apparently the original association between the Artist and Event classes is still there? My guess is that the solution is not to just use the auto_upgrade method built into DataMapper, but rather to write an explicit migration. If there is a way to handle this type of change to the database and still have the auto_upgrade method work, that would be great!
If you need more details about my models or anything please ask and I'll gladly add them.
In my experience, DataMapper's auto_upgrade does not work very well -- or, to say the least, it doesn't work the way I expect it to. If you want to add a new column to your model, it will do what it should; try to do anything more sophisticated to a column and it probably won't behave as you expect.
For example, if you create a property of type String, it will initially have a length of 50 characters. If you notice that 50 characters is not enough to hold your string, adding :length => 100 to the model won't be enough to make auto_upgrade change the column's width.
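In other words, a sketch of that behavior:

# First version of the model: auto_upgrade! creates a 50-character column
property :name, String

# Widening it later in the model...
property :name, String, :length => 100

# ...does NOT alter the existing column; auto_upgrade! leaves it at 50.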
It seems you have stumbled upon another shortcoming, although one may argue that, in your case, maybe DataMapper's behavior isn't that bad (think of legacy databases). But the fact is that, when you changed the association, the Event's artist_id column wasn't removed, and then when you try to save an Event, you'll get an error because the database says it is a required field.
Notice that the error you are getting is not a validation error: DataMapper thinks everything looks ok, but gets an error from the database when trying to save the object.
Hope this helps!
Auto-upgrade is not a shortcoming at all! I think of auto-upgrade as a convenience feature of DataMapper. Its only intended purpose is to add columns for you, as far as I know. So it is great for getting a project started quickly, and for managing test and dev environments without having to write migrations, but it is not the best tool for making modifications to a mature, live project.
For that, DataMapper does have migrations! Use the dm-migrations gem. The shortcoming is that they are not very well documented... at all. I'm actually working on changing a current project of mine over to using migrations, and I hope to contribute some instructions to the dm-migrations GitHub wiki. But if you aren't ready to switch to migrations, you can also just update columns manually using a SQL client, and then continue to use auto-upgrade for new columns. That's what I have been doing for 2 years on my project :)
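For the situation in the question, a dm-migrations sketch to drop the leftover column might look like this (the migration number and names are assumptions; double-check against the gem itself, since the docs are thin):

require 'dm-migrations'
require 'dm-migrations/migration_runner'

migration 1, :drop_artist_id_from_events do
  up do
    modify_table :events do
      drop_column :artist_id
    end
  end

  down do
    modify_table :events do
      add_column :artist_id, Integer
    end
  end
end

migrate_up!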

Rails 3 - Devise with acts_as_audited possible?

I'd like to use Devise with acts_as_audited.
I have googled it, but the results weren't very clear.
What are its pros and cons?
I use PaperTrail here, which is newer but much the same thing, and the top of my Devise User model looks like this:
class User < ActiveRecord::Base
  has_paper_trail
And now I have a growing versions table in my DB with a row for every CRUD action on the User model.
The benefits are that all previous versions of your model's data are saved and stored in YAML, allowing you to rollback/undo.
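For example, a sketch using PaperTrail's documented API (the user id is hypothetical):

user = User.find(42)
user.versions.length        # one version per create/update/destroy
user.previous_version       # a reified User as it was before the last change
user.versions.last.reify    # rebuild the object from any stored version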
The cons? Only database size, and perhaps a small performance hit at write/update time.

What's the significant difference between active record and data mapper based ORMs?

Like Doctrine (Active Record) and Xyster (Data Mapper), what's the difference?
The difference is in how separate your domain objects are from the data access layer. With Active Record, it's all one object, which makes it very simple, especially if your classes map one-to-one to your database.
Data Mapper is more flexible, and easily allows your domain to be tested independently of any data access infrastructure code. But that flexibility comes at the price of complexity.
Like blockhead said, the difference lies in how you choose to separate Domain Objects from the Data Access Layer.
In a nutshell, Active"Record" maps an object to a record in the database.
Here, one object = one record.
From what I know, Data"Mapper" maps an object to data, but it need not be a record - it could be a file as well.
Here, one object need not be one record.
It's this way because of the goal of this pattern: to keep the in-memory representation and the persistent data store independent of each other and of the data mapper itself.
By not placing this 1 object = 1 record restriction, Data Mapper makes these two layers independent of each other.
Any suggestions/corrections to my answer are welcome, in case I was wrong somewhere.
The main difference is that in DataMapper the model is defined in the Ruby class itself:
class Post
  include DataMapper::Resource

  property :id,         Serial
  property :title,      String
  property :body,       Text
  property :created_at, DateTime
end
While in ActiveRecord the class is mostly an empty class and the framework scans the database. This means you need either a pre-defined database, or something like migrations to generate the schema; this keeps the data model separated from the ORM.
DataMapper.auto_migrate!
would generate the schema for you.
ActiveRecord is different in this regard:
class Post < ActiveRecord::Base
end
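With ActiveRecord, the schema lives in a migration instead; a Rails 3.1-style sketch (the columns are invented for illustration):

class CreatePosts < ActiveRecord::Migration
  def change
    create_table :posts do |t|
      t.string :title
      t.text   :body
      t.timestamps
    end
  end
end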
In DataMapper there is no need for migrations, as auto-migrations can generate the schema, or look at the differences between the model and the database and migrate for you. There is also support for manual migrations, which you can use for non-trivial cases.
Also, DataMapper's syntax is much more Ruby-friendly, and features like lazy loading when chaining conditions (like ActiveRecord in Rails 3) have been there from the beginning.
DataMapper also has an identity map: every record in the database maps to exactly one Ruby object, which is not true for ActiveRecord. So if you know that two database records are the same, you know that two references to the corresponding Ruby object will point to the same object too.
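A small sketch of what the identity map buys you (inside a repository block, where the map is active):

DataMapper.repository do
  a = Post.get(1)
  b = Post.get(1)
  a.equal?(b)   # => true -- the very same Ruby object, not two copies
end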
On the other hand, while Rails 3 may promise you interchangeable frameworks, the DataMapper railtie (dm-rails) is not production-ready and many features may not work.
See this page for more information.
I have to admit that I don't know doctrine or Xyster but I can at least give some insight into the difference between Active Records as implemented in Ruby versus ORMs such as SubSonic, Linq to SQL, nHibernate and Telerik. Hopefully, it will at least give you something to explore further.
Ruby's Active Record is its native data access library - it is not a mapping from an existing SQL interface library (e.g. .NET SqlDataTables) into the constructs of the language - it is the interface library. This gave the designers more latitude to build the library in a more integrated manner but it also required that they implement a broad range of SQL tools that you won't typically find in an ORM (e.g. DDL commands are a part of Ruby's Active Record interface).
ORMs are mapped to the underlying database structure using a manual step in which a code generator opens a database and scans through it, building objects corresponding to the tables (and stored procedures) that it finds. These objects are constructed using the low-level SQL programming constructs offered as part of the language (e.g. the .NET System.Data.Sql and SqlClient libraries). The objective here is to give record-oriented, relational databases a smoother, more fluent interface while you are programming: to reduce the "impedance mismatch" between the relational model and object-oriented programming.
As a side note, MS has taken a very "Active Record-like" step in building native language constructs into C# via Linq to SQL and Linq to Entities.
Hope this helps!