Would it be advisable to use the closure_tree gem to represent an ordered list? e.g. this Rails model:
class OrderedSet < ActiveRecord::Base
  acts_as_tree order: 'position' # :order is a supported option
end
My thoughts:
It might be convenient for representing an ordered dictionary, since the gem can generate a hash tree.
Maybe sometimes you just want a list of trees? A random example: a central strand of nerve cells in the arm, with neurons that branch out along it.
I'm the author of closure_tree.
If the order column is numeric, append_sibling and prepend_sibling methods are added to your model class; when you use them, position values are renumbered from 0 to siblings.count. There's no need to use another gem; I had to rely on specific ordering rules to make the majority of scopes order correctly.
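For example, a minimal sketch of reordering with those methods (this assumes a name column on the model; the method names are from the closure_tree README):

root = OrderedSet.create(name: 'root')
a = OrderedSet.create(name: 'a')
b = OrderedSet.create(name: 'b')
root.add_child(a) # position 0
root.add_child(b) # position 1

a.prepend_sibling(b) # move b before a; positions are renumbered from 0
root.children.reload.map(&:name) # => ["b", "a"]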
After playing with the gem, I realized that it doesn't directly manage the order position, so you need your own code to set that column.
Basically: to represent a list, use the acts_as_list gem or another option. To use an ordered list in conjunction with the nodes of a tree, use both gems (they should be compatible, but I am not sure).
In a Redis database I have a number of hashes corresponding to "story" objects.
I have a Sorted Set stories containing all keys of the above (the stories), enabling convenient retrieval of stories.
I now want to store arbitrary emoticons (i.e. the Unicode characters corresponding to "smiley face", etc.) with stories as "user emotions" corresponding to the emotion the story made the user feel.
I am thinking of:
creating new hashes called emotions containing single emoticons (one per emotion expressed)
creating a hash called story-emotions that enables efficient retrieval of and counting of all the emotions associated with a story
creating another new hash called user-story-emotions mapping user IDs to items in the story-emotions hash.
Typical queries will be:
retrieve all the emotions for a story for the current user
retrieve the count of each kind of emotion for the 50 latest stories
Does this sound like a sensible approach?
Very sensible, but I think I can help make it even more so.
To store the emoticons dictionary, use two Hashes. The first, let's call it emoticon-id, should have a field for each emoticon expressed. The field name is the actual Unicode sequence and the value is a unique integer, starting from 0 and increasing for each new emoticon added.
Another Hash, id-emoticon, should be put in place to do the reverse mapping, i.e. from field names that are ids to the actual Unicode values.
This gives you O(1) lookups for emoticons, and you should also consider caching this in your app.
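A hypothetical sketch with the redis-rb gem (the key names are assumptions, and production code would want a Lua script or MULTI block to make registration atomic):

require 'redis'
redis = Redis.new

# Look up an emoticon's id, registering it on first sight.
def emoticon_id(redis, emoticon)
  id = redis.hget('emoticon-id', emoticon)
  return id.to_i if id

  id = redis.incr('emoticon-id-seq') - 1  # ids start at 0
  redis.hset('emoticon-id', emoticon, id) # unicode -> id
  redis.hset('id-emoticon', id, emoticon) # id -> unicode
  id
end

emoticon_id(redis, "\u{1F600}") # => 0 on the first call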
To store the user-story-emotions data, look into Redis' Bitmaps. In short, use the emoticon id as the bit index to toggle whether that user expressed that emotion towards that story.
Note that in order to keep things compact, you'll want popular emotions to have low ids so your bitmaps remain as small as possible.
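For example (a sketch; the key layout is an assumption):

key = "user-story-emotions:#{user_id}:#{story_id}"
redis.setbit(key, emoticon_id, 1)   # the user expressed this emotion
redis.getbit(key, emoticon_id) == 1 # => true once set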
To store the aggregative story-emotions, a Sorted Set would be a better option. Elements can be either the id or the actual Unicode value, and the score should be the current count. This will allow you to fetch the top emoticons (ZREVRANGEBYSCORE) and/or page similarly to how you're doing with the recent 50 stories (I assume you're using the stories Sorted Set for that).
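Something along these lines (a sketch; the key name is an assumption):

redis.zincrby("story-emotions:#{story_id}", 1, emoticon_id) # bump the count
redis.zrevrange("story-emotions:#{story_id}", 0, 4, with_scores: true) # top 5 emotions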
Lastly, when serving the second query, use pipelining or Lua scripting when fetching the bulk of 50 story-emotion counter values in order to get more throughput and better concurrency.
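For instance, with redis-rb pipelining (a sketch under the same key assumptions):

story_ids = redis.zrevrange('stories', 0, 49) # the 50 latest stories

# One round trip for all 50 counter sets:
counts = redis.pipelined do |pipe|
  story_ids.each do |id|
    pipe.zrange("story-emotions:#{id}", 0, -1, with_scores: true)
  end
end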
I'm creating an app that has to store historical financial data for various stocks.
I currently have a stock table where the columns are stock symbol and stockname, along with numerical data which I'm trying to decide how to store.
For example, for the column stockprice, I want to store an entire hash where the key is the date as a string and the value is the stock price. This information should be easily accessible (fast random access). I've read a bit about serializing, but I wonder if this is the best option (or if it's even applicable at all). Is there a way to instead automatically generate an SQLite table for each stock entered, with columns representing the date and rows representing the stockprice?
I'd appreciate any insight into this matter, and perhaps some clarification on whether this is exactly where I should use serialization or whether there is a better alternative.
EDIT 1: Is ActiveModel Serialization relevant? (http://railscasts.com/episodes/409-active-model-serializers)
EDIT 2: Is it advisable to instead create a Stockprice model & table, where the model belongs_to a stock and a stock has_many stockprices? The stockprice table would have a regular id, a stock_id (for the stock it belongs to), a date column, and a stockprice value column. I'd appreciate some analysis of the run-time and memory usage of this in comparison to serialization, and how to analyze it in the future.
You are correct, it is possible to store it as a hash. I don't have any metrics for serialize but I would suggest starting this way and optimizing your data storage later if you begin to notice a significant impact on your application.
Your migration would look something like this (be sure to use the text data type):
def self.up
  add_column :stocks, :price, :text
end
In your model, you will need to add:
serialize :price
You will be able to create price as a hash and store it directly.
stock = Stock.new
stock.price = { Time.now.to_date.to_s => 25.2 } # date string => price
stock.save
EDIT: I would start with serialization unless you have designed functionality that is specific to stock prices. Since the convention in Rails is to have a model per table, you would end up with a class for stock_price. It isn't necessary to dedicate a class to stock_price unless you have specific methods in mind for it. Also, depending on your design, you may be able to keep the Stock class more cohesive by keeping the stock price as an attribute of stocks.
You mentioned "This information should be easily accessible (fast random access)", in which case a serialized column is not a good option. Let's say you keep 20 years of data: that's 20 * 365 key-value pairs in the serialized price. But you might be interested in only a subset of this for a use case, say plotting the last 6 months. If you go with the serialize option, that entire price field is transported from the db to the Ruby process and deserialized, and then you still need to filter the price hash in the Ruby process. With a separate table for prices, the db can do the filtering for you and you can get a fast response with good indices.
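A sketch of the separate-table approach from EDIT 2 (table and column names are assumptions):

class CreateStockPrices < ActiveRecord::Migration
  def change
    create_table :stock_prices do |t|
      t.references :stock
      t.date :date
      t.decimal :price, :precision => 10, :scale => 2
    end
    # Composite index so per-stock date-range queries are fast:
    add_index :stock_prices, [:stock_id, :date], :unique => true
  end
end

# The db then filters the date range instead of Ruby:
stock.stock_prices.where('date >= ?', 6.months.ago.to_date).order(:date)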
Have you explored any time-series DBs?
I have a listing of ~10,000 apps and I'd like to order them by certain columns, but I want to give certain columns more "weight" than others.
For instance, each app has overall_ratings and current_ratings. If the app has a lot of overall_ratings, that's worth 1.5, but the number of current_ratings would be worth, say 2, since the number of current_ratings shows the app is active and currently popular.
Right now there are probably 4-6 of these variables I want to take into account.
So, how can I pull that off? In the query itself? After the fact using just Ruby (remember, there are over 10,000 rows that would need to be processed here)? Something else?
This is a Rails 3.2 app.
Sorting 10,000 objects in plain Ruby doesn't seem like a good idea, especially if you just want the first 10 or so.
You can try to put your math formula in the query (using the order method from Active Record).
However, my favourite approach would be to create a float attribute to store the score and update that value with a before_save method.
I would read about dirty attributes so you only perform this scoring when some of your criteria are updated.
You may also create a rake task that re-scores your current objects.
This way you would keep the scoring functionality in Ruby (you could test it easily) and you could add an index to your float attribute so database queries have better performance.
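A sketch of that approach with the question's weights (the column names are assumptions):

class App < ActiveRecord::Base
  before_save :compute_score, :if => :ratings_changed?

  private

  def ratings_changed?
    # Dirty attribute checks, so we only re-score on relevant changes:
    overall_ratings_changed? || current_ratings_changed?
  end

  def compute_score
    self.score = 1.5 * overall_ratings + 2.0 * current_ratings
  end
end

# With an index on apps.score, the hot query stays in the database:
App.order('score DESC').limit(10)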
One option would be to let the DB do this work for you with a query like the following (I can't really test it since I don't have the db schema):
ActiveRecord::Base.connection.execute("
  SELECT a.*,
         (1.5 * (SELECT COUNT(*) FROM overall_ratings WHERE app_id = a.id) +
          2.0 * (SELECT COUNT(*) FROM current_ratings WHERE app_id = a.id))
         AS rating
  FROM apps a
  HAVING rating > 3
  ORDER BY rating DESC")
The idea is to count the overall and current ratings for a specific app id with the subqueries and weight them as desired.
I have a series of users who each make numerous posts (and the posts receive "views"), and I have sorted them in various ways in the past.
For example, here are the 8 most viewed users:
@most_viewed = User.all.sort_by { |user| user.views }.reverse
And then in my model (user.rb):
def views
  self.posts.sum(:views) || 0
end
This worked fine in the past.
The problem is that my sorting used the sort_by method, which doesn't play nice with the will_paginate gem because it returns an Array instead of a Relation.
How can I do the following things in Rails 3, Active Record Relation, Lazy Loading style? For example, as named scopes?
Users with at least one post
Users with the greatest number of total views on their posts
Users with the posts with the greatest number of views (not total)
Users with the most recent posts
Users with the most number of posts
You can still paginate an array, but you will have to require 'will_paginate/array'
To do this, you could create a file (e.g. will_paginate_array_fix.rb) in your config/initializers directory. Add this as the only line in that file:
require 'will_paginate/array'
Restart your server, and you should be able to paginate plain Ruby arrays.
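For example (a sketch using the asker's code):

@most_viewed = User.all.sort_by { |user| user.views }.reverse
@most_viewed = @most_viewed.paginate(:page => params[:page], :per_page => 8)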
You should use the order method. This will perform the sorting through SQL, which gives better performance in the first place :)
User.order("users.views DESC")
or for ascending order
User.order("users.views ASC")
This will return a relation, so you can add other stuff such as where clauses, scoping, etc..
User.order("views DESC").where("name like 'r%'")
This gives an ordered list of users whose names start with r.
See http://guides.rubyonrails.org/active_record_querying.html for the official explanation.
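One caveat: in the asker's setup, views is computed from posts rather than stored as a column on users, so ordering by users.views in SQL won't work as-is; the sort has to join posts and aggregate. A hedged sketch of Rails 3 scopes along those lines (association and column names assumed):

class User < ActiveRecord::Base
  has_many :posts

  # Users with at least one post:
  scope :with_posts, joins(:posts).uniq

  # Users ordered by total views across their posts:
  scope :by_total_views, joins(:posts)
    .select('users.*, SUM(posts.views) AS total_views')
    .group('users.id')
    .order('total_views DESC')
end

User.by_total_views.limit(8) # still a lazy Relation, so it composes with other scopes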
When I add a new document to my collection X, I need to get the last document that was inserted into that same collection, because some values of that document must influence the one I am currently inserting.
Basically, as a simple example, I would need to do this:
class X
  include Mongoid::Document
  include Mongoid::Timestamps

  field :sum,  :type => Integer
  field :misc, :type => Integer

  before_save :set_sum

  def set_sum
    self.sum = X.last.sum + self.misc
  end
end
How can I make sure that this process will never break if there are concurrent inserts? I must make sure that when self.sum = X.last.sum + self.misc is calculated, X.last.sum really belongs to the last document inserted into the collection.
This is critical to my system. It needs to be thread safe.
Alex
ps: this also needs to be performant; when there are 50k documents in the collection, it can't take long to get the last value...
This kind of behavior is equivalent to having an auto-incrementing field:
http://www.mongodb.org/display/DOCS/How+to+Make+an+Auto+Incrementing+Field
The cleanest way is to have a side collection with one (or more) docs representing the current total values.
Then in your client, before inserting the new doc, do a findAndModify() that atomically updates the totals AND retrieves the current total doc.
Part of the counters doc can be an auto-incremented _id, so that even if there are concurrent inserts, your documents will be correctly ordered as long as you sort by _id.
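A sketch of that pattern with the pre-2.0 mongo gem (the counters collection, its field names, and the seq field on X are assumptions):

require 'mongo'
db = Mongo::Connection.new.db('mydb')

# Atomically bump the running totals and fetch them in one step:
totals = db['counters'].find_and_modify(
  :query  => { :_id => 'x_totals' },
  :update => { '$inc' => { :seq => 1, :sum => misc } },
  :new    => true
)

# Insert the new doc carrying the sequence, so sorting by seq is safe:
X.create!(:seq => totals['seq'], :sum => totals['sum'], :misc => misc)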
Only caveat: if your client app dies after findAndModify and before insert, you will have a gap in there.
Either that's ok or you need to add extra protections like keeping a side log.
If you want to be 100% safe, you can also get inspiration from two-phase commit:
http://www.mongodb.org/display/DOCS/two-phase+commit
Basically, it is the proper way to do transactions with any db that spans more than one server (even SQL wouldn't help there).
best
AG
If you need to keep a running sum, this should probably be done on another document in a different collection. The best way to keep this running sum is to use the $inc atomic operation. There's really no need to perform any reads while calculating this sum.
You'll want to insert your X document into its collection, then also $inc a value on a different document that is meant for keeping this tally of all the misc values from the X documents.
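For instance (a sketch; the totals model is an assumption, and the argument style of Mongoid's inc helper varies between versions):

X.create!(:misc => misc)

# $inc is atomic on the server, so concurrent writers cannot clobber it:
totals = Total.find_or_create_by(:name => 'x_totals')
totals.inc(:sum, misc)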
Note: This still won't be transactional, because you're updating two documents in two different collections separately, but it will be highly performant and fully thread-safe.
For more info, check out all the MongoDB Atomic Operations.