Do you have any suggestions on how to authorize Elasticsearch results (using Tire) with CanCan?
My approach would be to let CanCan load and authorize the models and pass all authorized ids to the Tire search, or maybe the other way around.
What do you think?
A proper answer would need a bit more code to understand your situation correctly.
But, generally, as an example:
You'd store the user ID(s) of the users who can access/manipulate a document directly in the document (e.g. in an author_ids property).
You'd pass the user ID (based on the current_user.id value) to a term/terms filter.
Alternatively, you can use a boolean query, but filters would be faster and more efficient.
Your results are filtered.
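The steps above can be sketched as a plain request body, without depending on Tire itself. This is only an illustration: the author_ids field comes from the example above, while the helper name authorized_search_body and the sample query are assumptions. It uses the "filtered" query form of the Elasticsearch versions Tire was written against; newer Elasticsearch versions express the same thing with a bool query and a filter clause.

```ruby
require 'json'

# Hypothetical helper: restrict a full-text query to documents whose
# author_ids field contains the given user ID.
def authorized_search_body(query_string, user_id)
  {
    query: {
      filtered: {
        # The user's actual search
        query:  { query_string: { query: query_string } },
        # The authorization constraint, applied as a (cacheable) filter
        filter: { terms: { author_ids: [user_id] } }
      }
    }
  }
end

# In a controller, user_id would come from current_user.id
body = authorized_search_body('title:hello', 42)
puts JSON.pretty_generate(body)
```

Because the constraint lives in the filter and not in the scored query, it does not influence relevance and can be cached by Elasticsearch.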
Note, that there are many more possible approaches with elasticsearch and Tire, for instance:
You use the “filtered aliases” feature in elasticsearch, to set up a “virtual” index for each user, automatically filtered on the elasticsearch layer.
You inject this index name to searches in Tire.
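A rough sketch of the filtered alias idea, again as plain Ruby hashes. The index name 'documents' and the alias naming scheme are assumptions made for illustration; the body follows the shape of the Elasticsearch aliases API ("add" action with a "filter").

```ruby
require 'json'

# Hypothetical naming scheme for the per-user "virtual" index
def user_alias_name(user_id)
  "documents_user_#{user_id}"
end

# Body for the aliases API: adds an alias on top of a shared index that
# only exposes documents the user is listed on.
def filtered_alias_actions(index, user_id)
  {
    actions: [
      {
        add: {
          index:  index,
          alias:  user_alias_name(user_id),
          filter: { term: { author_ids: user_id } }
        }
      }
    ]
  }
end

actions = filtered_alias_actions('documents', 42)
puts JSON.pretty_generate(actions)
```

The body would be POSTed to the _aliases endpoint once per user; searches for that user then target the alias name in place of the real index name.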
Related
Let's say I store story documents in Elasticsearch like
{
  story_id: 1,
  title: "Hello World",
  content: "Foo. Bar.",
  likes: 2222
}
When the client (frontend) searches, they should have the option to like (or remove their like) any of the search results, but there should also be an indication of whether or not they liked each result already.
What is a good way to get this information to the client?
1) Perform a database query to get all the stories the user has liked and keep it in the client's local storage. When search results are retrieved, map the user's liked stories to the retrieved search results on the client. This would add the complexity of updating local storage as well as the API when a user likes a story. Also, the number of stories a user likes could get very large.
2) Keep a list of users that have liked a story within the document itself and when searching check if the user is in the list. This could blow up the search index size?
{ ...
  likes: [ 'foo_user', 'bar_user', ... ]
}
3) In the API, after the search, perform a database query to determine which stories in the search response the user has already liked, and map this info to the search results before returning the API response. This could slow down searches because an additional database query is required, but maybe it is inconsequential?
For this use case, the most common/mainstream approach would be your option 3:
Save every like as a record in the datastore.
Index the docs in Elasticsearch (ES), most probably with only the properties you will use for searching and aggregation, not the whole doc.
After a query/search from the frontend, look up the docs in ES and take their ids.
Go to the like records in the datastore and check whether the user has a like record for each of those ids.
Combine this info and return the whole doc to the frontend.
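The combining step can be sketched like this, with in-memory data standing in for both the ES response and the datastore. The helper name annotate_with_likes and the liked_by_user key are made up for the example; the story fields are the ones from the question.

```ruby
require 'set'

# Merge a "has the current user liked this?" flag into each search hit.
# liked_story_ids is the result of ONE datastore query restricted to the
# ids on the current results page, not the user's full like history.
def annotate_with_likes(hits, liked_story_ids)
  liked = liked_story_ids.to_set
  hits.map { |hit| hit.merge(liked_by_user: liked.include?(hit[:story_id])) }
end

# Stand-in for the hits returned by Elasticsearch
hits = [
  { story_id: 1, title: 'Hello World', likes: 2222 },
  { story_id: 2, title: 'Second story', likes: 10 }
]

# Stand-in for something like:
#   SELECT story_id FROM likes WHERE user_id = ? AND story_id IN (1, 2)
liked_story_ids = [2]

annotated = annotate_with_likes(hits, liked_story_ids)
annotated.each { |hit| puts "#{hit[:title]}: liked=#{hit[:liked_by_user]}" }
```

Since the lookup is bounded by the page size, its cost stays constant no matter how many stories the user has liked overall.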
The additional datastore lookup would not cost you much in either time or money, I would say. It wouldn't affect the user experience much either.
My only concern is that, because every query needs to check the likes collection, this request is not CDN/cache friendly.
UPDATE: Redefined what I am trying to do.
I have a model of Contact, this contact belongs to an account as does every other model in my account. I need all searches whether they be global or model specific to only query the containing account. I was told that I could do this with custom index names. I would like the index name to be the 'index-#{account-id}'. How would I achieve this in my active-models?
class Contact < ActiveRecord::Base
  include Tire::Model::Search
  include Tire::Model::Callbacks

  belongs_to :account

  mapping do
    indexes :first_name
    indexes :last_name
  end
end

class Account < ActiveRecord::Base
  has_many :contacts
end
You may want to check this comment at Tire's issues, which basically walks through some possible scenarios of the “tenant-based” index naming with Tire. I believe it's what you're after.
In elasticsearch itself, you have the option to have a separate index for every account, a filtered & routed index alias for every account, index templates, etc etc., so the toolkit is vast in this area.
Are you referring to having each account (user?) physically separated in its own index? This is generally referred to as 'multi-tenancy': http://en.wikipedia.org/wiki/Multitenancy
Assuming this is indeed what you set out to do:
Much has been said in the past about the 'need' for partitioning data per account/user (I assume you want this for security reasons; I'm not familiar with other reasons why you would want this, although I'm not an expert on multi-tenancy apps), as opposed to just having, say, a field accountid for Contact and making sure all your queries filter, at least, on accountid. IMO, a carefully designed query component where, say, every query used in the system inherits from a 'super-query' which is required to set accountid would suffice in a lot of cases.
Even if you don't know upfront what apps in the future will want to query these indices, you could still enforce the above by say, having a thin REST-service around ES and require all programs to interact with ES through this service. You could then have this service handle this type of security by enforcing an accountid or, probably better, by inferring the accountid by the current logged-in user doing the request.
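A minimal sketch of the 'super-query' idea, assuming plain Ruby classes that produce Elasticsearch request bodies. The class names are invented for illustration, and the field is spelled account_id here (the accountid field mentioned above). The point is only that no query can be built without an account and that the account filter is added in one place.

```ruby
# Base class every query in the system inherits from.
class AccountScopedQuery
  def initialize(account_id)
    raise ArgumentError, 'account_id is required' if account_id.nil?
    @account_id = account_id
  end

  # Subclasses override this with their actual search logic.
  def query
    { match_all: {} }
  end

  # The account filter is applied here and cannot be skipped by subclasses
  # that only override #query.
  def to_hash
    {
      query: {
        filtered: {
          query:  query,
          filter: { term: { account_id: @account_id } }
        }
      }
    }
  end
end

# Hypothetical concrete query
class ContactNameQuery < AccountScopedQuery
  def initialize(account_id, name)
    super(account_id)
    @name = name
  end

  def query
    { match: { first_name: @name } }
  end
end

request_body = ContactNameQuery.new(7, 'Ann').to_hash
```

A thin service in front of ES would take the account from the logged-in user and pass it to the constructor, so callers never supply it themselves.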
If you still want to pursue Multi-tenancy have a look at: http://elasticsearch-users.115913.n3.nabble.com/Multi-tenacy-td471400.html (quickly searched this, perhaps there's better stuff around) 'Kimchy' (the creator of ES) comments in that thread as well.
Regardless, the best way to have multi-tenancy in ES is probably to have one index per account/user. Within that you could have multiple 'types' (an ES construct), where Contact could be such a type.
http://www.elasticsearch.org/guide/reference/mapping/
http://www.elasticsearch.org/guide/reference/api/search/indices-types.html
Enforcing this in your models, as you are suggesting, is probably not the correct way IMO. Generally, you should keep your domain-models clean from any knowledge on the storage backend (including the index in which the data is stored)
To me, a better solution would be to have, as earlier suggested, a query-component in which the logic of choosing the correct index based on account/user would be contained. Going with the rest-service approach above, the dynamic indexname, as you suggested, could be derived from the logged-in user doing the request.
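As a sketch of such a query component: the index name format is the 'index-#{account-id}' scheme from the question, while AccountIndexResolver, SearchRequest and the 'contact' type name are hypothetical names chosen for the example. The models stay unaware of where their data is indexed.

```ruby
# Single place that knows how an account maps to an index name.
class AccountIndexResolver
  def self.index_name_for(account_id)
    raise ArgumentError, 'account_id is required' if account_id.nil?
    "index-#{account_id}"
  end
end

# Plain value object describing one search request
SearchRequest = Struct.new(:index, :type, :body)

# The account id would be inferred from the logged-in user,
# never taken from request parameters.
def contact_search_for(current_account_id, query_string)
  SearchRequest.new(
    AccountIndexResolver.index_name_for(current_account_id),
    'contact',
    { query: { query_string: { query: query_string } } }
  )
end

request = contact_search_for(12, 'smith')
puts "#{request.index}/#{request.type}"
```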
I realize that this probably wasn't a straight answer to your question, but I hope it was useful nonetheless.
A core piece of the application I'm working on is allowing the user to explore a complex dataset by progressively adding search terms. For example, you might start with a free-text search, then progressively add (or remove) some faceted search terms, move a slider to constrain some dimension of the returned results, etc.
Conceptually, it seems to me that the user is incrementally defining a set of constraints. These constraints are used to search the dataset, and the rendering of the results provides the UI affordances to add further search refinements. So building this in Rails, I'm thinking of having one of the models be the current set of search constraints, and controller actions add to or remove constraint terms from this model.
Assuming this is a sensible approach (which is part of my question!), I'm not sure how to approach this in Rails, since the search is an ephemeral, not persistent, object. I could keep the constraints model in the session store, but it seems rather a complex object to be marshalled into a cookie. On the other hand, I could store the constraints model in a database, but then I'll have a GC problem as the database fills up with constraint models from previous sessions.
So: how best to build up a complex interaction state in Rails?
Here are some pointers:
create a class XxxSearch with accessors for all the search facets: keywords, category, tags, whatever. This class should be ActiveModel compatible, and its instances are going to be used in conjunction with form_for @xxx_search. This class is not meant for persistence, only for temporarily holding search params and any associated logic. It may even act as a presenter for data: @xxx_search.results, or implement search data validations for each faceting step.
incrementally resubmit the form via a wizard technique, or even ad-hoc data insertion on a large form.
always submit the search via GET, as such:
the search is bookmarkable
you can chain the params to pagination links easily like: params_for(params[:search].merge(:page => 3))
you need NOT use the session, the data is forwarded via GET params, as such:
can keep using cookie session store
saves you from a lot of headaches when the last search is persisted and the user expects a new search context (I say this from experience)
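A stripped-down sketch of such a search object in plain Ruby. ArticleSearch and its three facets are placeholders; in a Rails app the class would additionally include the ActiveModel pieces needed by form_for, which are left out here so the example runs on its own.

```ruby
require 'uri'

# Non-persistent holder for the current set of search constraints.
class ArticleSearch
  ATTRIBUTES = [:keywords, :category, :page].freeze
  attr_accessor(*ATTRIBUTES)

  # Accepts symbol or string keys, as params hashes may use either.
  def initialize(params = {})
    params ||= {}
    ATTRIBUTES.each do |name|
      send("#{name}=", params[name] || params[name.to_s])
    end
  end

  # Only the constraints the user has actually set
  def to_params
    ATTRIBUTES.each_with_object({}) do |name, out|
      value = send(name)
      out[name] = value unless value.nil? || value.to_s.empty?
    end
  end

  # New search object with some constraints added or changed
  def merge(changes)
    self.class.new(to_params.merge(changes))
  end

  # Query string for GET links (pagination, facet links, bookmarks)
  def to_query
    URI.encode_www_form(to_params)
  end
end

search    = ArticleSearch.new('keywords' => 'elasticsearch', 'category' => 'ruby')
next_page = search.merge(page: 3)
puts next_page.to_query
```

Every refinement produces a new URL carrying the full state, so nothing has to be stored in the session or the database between requests.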
I had to solve this problem for several apps so I wrote a small gem with a DSL to describe these searches:
https://github.com/fortytools/forty_facets
I have some basic objects like Customer, Portfolio and ... with some associations to other objects. I can easily display the required information in the web page by reading object values. The problem is what to do when the value associated with the object is calculated and returned by a method, a value that makes sense only in a certain context and cannot be attached to the object as an instance variable. In this case, if I have a list of, say, Users, I have to pass the username of each user to the method to get the calculated value. This makes it hard to keep the association while displaying the values in the page.
An example to make this clear:
An application provides the functionality for users to keep track of each other's activities by letting them add whoever they want to a list. If a user performs a search on users, there's the option to follow each returned user. I want to make sure this option is disabled for those users that are already being followed. This functionality is provided by a method like isFollowed(String follower, String followee) which returns a boolean. How can I associate these boolean values with each user in the search result?
Solutions:
One thing I can think of is to add a followed instance variable to User class. But I don't think it's a good approach because this variable only makes sense in a certain context. It's not a part of User class in the domain.
The other way I can think of is to use Decoration or Wrappers in a way to extend the User class and add the attribute in the child class. But again what if I have several objects that need to be in the same context. In that case I have to extend all of them with the same boolean attribute in all classes.
I hope I could make it clear.
In principle, I don't see anything wrong with an instance method on User: bool IsFollowedBy(User user).
Of course, this could lead to performance issues. If that is the case, you can create a separate object for presentation purposes which bundles the data from User with whether he is being followed by the user performing the search. Then you can build a query which retrieves all the necessary data for such an object in a single roundtrip to the DB.
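A small sketch of such a presentation object, written in Ruby. UserSearchResult and build_search_results are invented names, and the list of followed usernames stands in for the result of one query restricted to the usernames on the current results page.

```ruby
require 'set'

# Presentation-only object: search result data plus the context-dependent
# flag, kept out of the User domain class.
UserSearchResult = Struct.new(:username, :followed)

# followed_usernames: the subset of the result usernames that the searching
# user already follows, fetched in a single roundtrip.
def build_search_results(usernames, followed_usernames)
  followed = followed_usernames.to_set
  usernames.map { |name| UserSearchResult.new(name, followed.include?(name)) }
end

results = build_search_results(%w[alice bob carol], %w[bob])
results.each do |result|
  puts "#{result.username}: follow button #{result.followed ? 'disabled' : 'enabled'}"
end
```

The same wrapper approach works for other objects in the same context without adding the attribute to each domain class.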
One solution is to avoid querying Entities (as in DDD/ORM) and query directly using a subquery/join, or even using some denormalized database. This is something the CQRS pattern suggests.
Another solution is to do the computations in the application layer (how many Users can you show on the same page anyway?), which is expensive, but you can implement some caching techniques to make things easier.
Struggling with a decision on how best to handle Client-level authentication with the following model hierarchy:
Client -> Store -> Product (Staff, EquipmentItem, etc.)
...where Client hasMany Stores, Store hasMany Products(hasMany Staff, hasMany EquipmentItem, etc.)
I've set up a HABTM relationship between User and Client, which is straightforward and accessible through the Auth session or a static method on the User model if necessary (see afterFind description below).
Right now, I'm waffling between evaluating the results in each model's afterFind callback, checking for relationship to Client based on the model I'm querying against the Clients that the current User is a member of. i.e. if the current model is Client, check the id; if the current model is a Store, check Store.clientid, and finally if Product, get parent Store from Item.storeid and check Store.clientid accordingly.
However, to keep in line with proper MVC, I return true or false from the afterFind, and then have to check the return from the calling action -- this is ok, but I have no way I can think of to determine if the Model->find (or Model->read, etc.) is returning false because of invalid id in the find or because of Client permissions in the afterFind; it also means I'd have to modify every action as well.
The other method I've been playing with is to evaluate the request in app_controller.beforeFilter and by breaking down the request into controller/action/id, I can then query the appropriate model(s) and eval the fields against the Auth.User.clients array to determine whether User has access to the requested Client. This seems ok, but doesn't leave me any way (afaik) to handle /controller/index -- it seems logical that the index results would reflect Client membership.
Flaws in both include a lengthy list of conditional "rules" I need to break down to determine where the current model/action/id is in the context of the client. All in all, both feel a little brittle and convoluted to me.
Is there a 3rd option I'm not looking at?
This sounds like a job for Cake ACL. It is a bit of a learning curve, but once you figure it out, this method is very powerful and flexible.
Cake's ACLs (Access Control Lists) allow you to match users to controllers down to the CRUD (Create Read Update Delete) level. Why use it?
1) The code is already there for you to use. The AuthComponent already has it built in.
2) It is powerful and integrated, allowing you to control permissions for every action in your site.
3) You will be able to find help from other cake developers who have already used it.
4) Once you get it set up the first time, it will be much easier and faster to implement full site permissions on any other application.
Here are a few links:
http://bakery.cakephp.org/articles/view/how-to-use-acl-in-1-2-x
http://book.cakephp.org/view/171/Access-Control-Lists
http://blog.jails.fr/cakephp/index.php?post/2007/08/15/AuthComponent-and-ACL
Or you could just google for CakePHP ACL