Building Logical queries using Sunspot - ruby-on-rails-3

I have User and Skill models linked by a has_many association, with indexing done using Sunspot.
Now the requirement is to implement logical queries.
Example "skill1 AND skill2"
"skill1 OR skill2"
"skill1 AND skill2 OR skill3"
Sunspot uses the dismax request handler, where the minimum_match option is used to implement OR queries.
The problem arises when AND and OR operators need to be combined in the same query.
How can this be implemented?
Regards,
Karan

Related

MarkLogic: Constrain SPARQL query scope by triple-range-query constraint

I would like to evaluate a SPARQL query against a limited document scope, which is based on a triple range query. Only embedded triples contained by documents which match a specific triple pattern should be part of the SPARQL evaluation scope. I'm using the Java SDK (via marklogic-rdf4j) to evaluate the SPARQL query. We're only using embedded/unmanaged triples.
I'm aware of the possibility to attach a structured query definition to a SPARQL query (by calling MarkLogicQuery::setConstrainingQueryDefinition), but the structured query syntax does not support triple-range-query constraints.
Is there any way to apply one or more triple-range-query constraints in a structured query definition? Or are there better alternatives?
Support for triple-range-query in structured queries has been requested before. I added your case to the ticket.
In the meantime, you might get away with using a custom constraint. A colleague and I put this together:
https://github.com/patrickmcelwee/triple-range-constraint/blob/master/triple-range-constraint.xqy
HTH!

Should I use Eloquent ORM or create big joins by Fluent?

Well, I'm using Eloquent ORM for a project that I'm developing, but it is bugging me with the performance issue. When I use only its own methods, I can see by its query log that it creates a lot of queries.
I'm trying to fetch data from a main table with 4 other tables, one related to it one-to-one and the others many-to-many. Eloquent creates about 6-7 queries for it, and that makes me afraid of performance issues. So I remove Eloquent's methods and create jumbo queries with Fluent, using lots of joins, which costs me code readability and practicality.
What I really need to know is: Does Eloquent sacrifice performance? Should I stick to it, or use just Fluent? And what is better, a few big joined queries or more small ones?
I'm going to extend Sebastian's answer.
I too have many-to-many and one-to-many relationships.
I have actually melded Eloquent's style of programming (it's easier on the eyes) with a bit of a join hack in Fluent. Keep in mind that Eloquent is built on top of Fluent, so you're not sacrificing anything unless you are writing bad queries.
Say you have a User model and a Phone model with a one-to-one or one-to-many relationship (a user can have many phone numbers). If you simply call where()->get() and then access $users->phone, Eloquent will run a select * for each ID. This is where eager loading (which Sebastian referenced, but too briefly to really explain) comes in: it prefetches all the required IDs and loads the related rows up front (you can verify this by running a query log profiler).
The added bonus is that you can eager load many relationships this way.
So there is no cut-and-dried answer to "does Eloquent cause a performance hit": it does if you don't use it the right way.
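The difference between lazy and eager loading described above does not depend on Laravel, so here is a minimal sketch of it in Python with the standard sqlite3 module. The users/phones schema, the run helper, and the function names are all made up for illustration; this is not how Eloquent is implemented, only the query pattern it is described as producing.

```python
import sqlite3

# Hypothetical schema for illustration: users and their phone numbers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE phones (id INTEGER PRIMARY KEY, user_id INTEGER, number TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO phones VALUES (1, 1, '111'), (2, 1, '112'), (3, 2, '221');
""")

queries = []  # every SQL statement issued through run() is recorded here


def run(sql, params=()):
    """Execute a statement and log it, like a very small query log."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def lazy_load():
    """One query for the users, then one more per user (the N+1 pattern)."""
    result = {}
    for user_id, name in run("SELECT id, name FROM users ORDER BY id"):
        rows = run(
            "SELECT number FROM phones WHERE user_id = ? ORDER BY id",
            (user_id,),
        )
        result[name] = [number for (number,) in rows]
    return result


def eager_load():
    """One query for the users, one query for all of their phones."""
    users = run("SELECT id, name FROM users ORDER BY id")
    ids = [user_id for user_id, _ in users]
    placeholders = ", ".join("?" for _ in ids)
    rows = run(
        "SELECT user_id, number FROM phones "
        f"WHERE user_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    names = dict(users)  # id -> name
    result = {name: [] for name in names.values()}
    for user_id, number in rows:
        result[names[user_id]].append(number)
    return result
```

Both functions return the same data; the lazy version issues one statement per user on top of the initial one, while the eager version always issues two.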
Now here is a small example of how I put both Eloquent and Fluent to use:
Within the Book model, I have defined a scope function that acts as a relationship function:
// In the Book model: a reusable query scope that joins the licensors table.
public function scopeLicensorStatus($query, $licensor_status)
{
    return $query->select('book.*')
        ->leftJoin('licensors as l', 'l.id', '=', 'book.licensor_id')
        ->where('l.status', '=', $licensor_status);
}

// Calling code: the scope chains with ordinary query builder methods.
$bookData = Book::licensorStatus('active')
    ->where('book.status', '=', 'active')
    ->whereIn('book.id', $recommendedIds)
    ->take($limit)
    ->skip($offset)
    ->get();
What this does is perform the join for me as a function and let me chain further commands from the outside. In the end (if you call toSql() instead of get()) you get a single query that matches the raw SQL. As you can see, a) the code is reusable if you foresee reusing the join with other constraints, b) you're not sacrificing speed, since the final query is a single one (it just needs to be written properly), and c) it looks nicer and is more readable, which is why we like Eloquent.
Hope this answer helps you dive a bit deeper into Eloquent.

NHibernate problem choosing between CreateSql and CreateCriteria

I have a very basic question about NHibernate. There are three entities: two are related to each other, and the third is not related to the other two. I have to fetch some selected columns from these three tables by joining them. Is it a good idea to use session.CreateSql(), or do we have to use session.CreateCriteria()? I am really confused here, as I could not write the Criteria queries and was forced to use CreateSql. Please advise.
In general, you should avoid writing SQL whenever possible.
One of the advantages of using an ORM is that it's implementation-agnostic.
That means that you don't know (and don't care) what the underlying database is, and you can switch DB providers or tweak the DB structure very easily.
If you write your own SQL statements, you run the risk of them not working on other providers, and you also have to maintain them yourself (for example, if you change the name of the underlying column for the Id property from 'Id' to 'Employee_Id', you'd have to change your SQL query, whereas with Criteria no change would be necessary).
Having said that, there's nothing stopping you from writing a Criteria or HQL query that pulls data from more than one table. For example (with HQL):
select emp.Id, dep.Name, po.Id
from Employee emp, Department dep, Posts po
where emp.Name like 'snake' //etc...
There are multiple ways to make queries with NH.
HQL, the classic way, a powerful object oriented query language. Disadvantage: appears in strings in the code (actually: there is no editor support).
Criteria, a classic way to create dynamic queries without string manipulations. Disadvantages: not as powerful as HQL and not as typesafe as its successors.
QueryOver, a successor of Criteria, which has a nicer syntax and is more type safe.
LINQ, now based on HQL, is more integrated than HQL and typesafe; choosing it is generally a matter of taste.
SQL as a fallback for cases where you need something you can't get the object oriented way.
I would recommend HQL or LINQ for regular queries, QueryOver (resp. Criteria) for dynamic queries and SQL only if there isn't any other way.
As for your specific problem, whose details I don't know: if all the information you need for the query is available in the object-oriented model, you should be able to solve it with HQL.

Nested queries using Arel (Rails3)

For example, I have 2 models:
Purchase (belongs_to :users)
User (has_many :purchases)
I want to select all users that have at least one purchase.
In SQL I would write like this:
SELECT * FROM `users` WHERE `id` IN (SELECT DISTINCT `buyer_id` FROM `purchases`)
And one more question: is there any complete documentation or a book that covers Arel?
Hmm, I'd like to answer my question... :)
buyers = purchases.project(:buyer_id).group(purchases[:buyer_id]) # <-- all buyers
busers = users.where(users[:id].in(buyers)) # <-- answer
The Rails Guide has really good documentation for ARel.
http://guides.rubyonrails.org/active_record_querying.html#conditions
The Rails API is also pretty useful for some of the more obscure options. I just google a specific term with "rails api" and it comes up first.
I don't believe that the code above issues a nested query. Instead, it appears that it would issue 2 separate SQL queries. You may have comparable speed (depending on how concerned you are with performance), but with 2 round trips to the server, it doesn't offer the same benefits of nested queries.
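For comparison, this is what the nested form from the question looks like when it is sent as a single statement. It is a minimal sketch using Python's standard sqlite3 module and a made-up in-memory data set; it says nothing about what SQL Arel actually generates, only what one round trip with a subquery returns.

```python
import sqlite3

# Hypothetical data: three users, two of whom have purchases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchases (id INTEGER PRIMARY KEY, buyer_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO purchases VALUES (1, 1), (2, 1), (3, 3);
""")


def users_with_purchases(connection):
    """Return users that have at least one purchase, in one statement."""
    sql = """
        SELECT id, name FROM users
        WHERE id IN (SELECT DISTINCT buyer_id FROM purchases)
        ORDER BY id
    """
    return connection.execute(sql).fetchall()


buyers = users_with_purchases(conn)
```

The subquery is evaluated by the database itself, so the list of buyer IDs never has to travel to the application and back.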

How to create dynamic and safe queries

A "static" query is one that remains the same at all times. For example, the "Tags" button on Stackoverflow, or the "7 days" button on Digg. In short, they always map to a specific database query, so you can create them at design time.
But I am trying to figure out how to do "dynamic" queries where the user basically dictates how the database query will be created at runtime. For example, on Stackoverflow, you can combine tags and filter the posts in ways you choose. That's a dynamic query albeit a very simple one since what you can combine is within the world of tags. A more complicated example is if you could combine tags and users.
First of all, when you have a dynamic query, it sounds like you can no longer use the substitution api to avoid sql injection since the query elements will depend on what the user decided to include in the query. I can't see how else to build this query other than using string append.
Secondly, the query could potentially span multiple tables. For example, if SO allows users to filter based on Users and Tags, and these probably live in two different tables, building the query gets a bit more complicated than just appending columns and WHERE clauses.
How do I go about implementing something like this?
The first rule is that users are allowed to specify values in SQL expressions, but not SQL syntax. All query syntax should be literally specified by your code, not user input. The values that the user specifies can be provided to the SQL as query parameters. This is the most effective way to limit the risk of SQL injection.
Many applications need to "build" SQL queries through code, because as you point out, some expressions, table joins, order by criteria, and so on depend on the user's choices. When you build a SQL query piece by piece, it's sometimes difficult to ensure that the result is valid SQL syntax.
I worked on a PHP class called Zend_Db_Select that provides an API to help with this. If you like PHP, you could look at that code for ideas. It doesn't handle any query imaginable, but it does a lot.
Some other PHP database frameworks have similar solutions.
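The rule from this answer (syntax comes from your code, values come from query parameters) can be sketched in a few lines. The example below uses Python's standard sqlite3 module; the schema, the ALLOWED_FILTERS whitelist, and build_query are hypothetical names invented for this sketch and are unrelated to Zend_Db_Select.

```python
import sqlite3

# Hypothetical whitelist: filter names the user may pick, mapped to SQL
# fragments written by the developer. User input never becomes SQL syntax.
ALLOWED_FILTERS = {
    "tag": "t.name = ?",
    "author": "u.name = ?",
}


def build_query(filters):
    """Return (sql, params) for a list of (filter_name, value) pairs."""
    sql = (
        "SELECT DISTINCT p.id, p.title FROM posts p "
        "JOIN users u ON u.id = p.user_id "
        "JOIN post_tags pt ON pt.post_id = p.id "
        "JOIN tags t ON t.id = pt.tag_id"
    )
    clauses, params = [], []
    for name, value in filters:
        if name not in ALLOWED_FILTERS:
            raise ValueError(f"unsupported filter: {name!r}")
        clauses.append(ALLOWED_FILTERS[name])  # developer-written fragment
        params.append(value)                   # user value, bound later
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY p.id", params


# Hypothetical demo schema and data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'first'), (2, 2, 'second');
    INSERT INTO tags VALUES (1, 'ruby'), (2, 'sql');
    INSERT INTO post_tags VALUES (1, 1), (2, 1), (2, 2);
""")

sql, params = build_query([("tag", "ruby"), ("author", "ann")])
rows = conn.execute(sql, params).fetchall()
```

The query text is still assembled by string concatenation, but only from fragments in the whitelist. In this simplified sketch each filter is meant to be used once; requiring two tags at the same time would need one join per tag.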
Though not a general solution, here are some steps that you can take to mitigate the dynamic yet safe query issue.
Criteria in which a column value must belong to a set of values of arbitrary size do not need to be dynamic. Consider using either the instr function or a special filtering table that you join against. This approach can easily be extended to multiple columns as long as the number of columns is known. Filtering on users and tags could easily be handled with this approach.
When the number of columns in the filtering criteria is arbitrary yet small, consider using different static queries for each possibility.
Only when the number of columns in the filtering criteria is arbitrary and potentially large should you consider using dynamic queries. In which case...
To be safe from SQL injection, either build or obtain a library that defends against that attack. Though more difficult, this is not an impossible task. This is mostly about escaping SQL string delimiters in the values to filter for.
To be safe from expensive queries, consider using views that are specially crafted for this purpose and some up front logic to limit how those views will get invoked. This is the most challenging in terms of developer time and effort.
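The filtering-table idea from the first step above can be sketched as follows, again with Python's standard sqlite3 module. The schema, the tag_filter temporary table, and posts_with_tags are assumptions made for this illustration; the point is only that the SELECT text stays fixed no matter how many values the user picks.

```python
import sqlite3

# Hypothetical demo schema and data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO posts VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO tags VALUES (1, 'ruby'), (2, 'sql'), (3, 'php');
    INSERT INTO post_tags VALUES (1, 1), (2, 2), (3, 3);
""")

# Static query: its text never changes, whatever the user selects.
POSTS_BY_FILTER = (
    "SELECT DISTINCT p.id, p.title FROM posts p "
    "JOIN post_tags pt ON pt.post_id = p.id "
    "JOIN tags t ON t.id = pt.tag_id "
    "JOIN tag_filter f ON f.name = t.name "
    "ORDER BY p.id"
)


def posts_with_tags(connection, wanted_tags):
    """Return posts carrying any of wanted_tags, via a temp filter table."""
    connection.execute(
        "CREATE TEMP TABLE IF NOT EXISTS tag_filter (name TEXT PRIMARY KEY)"
    )
    connection.execute("DELETE FROM tag_filter")
    # The variable-size set of values is loaded as data, one row per value.
    connection.executemany(
        "INSERT OR IGNORE INTO tag_filter (name) VALUES (?)",
        [(tag,) for tag in wanted_tags],
    )
    return connection.execute(POSTS_BY_FILTER).fetchall()


matches = posts_with_tags(conn, ["ruby", "sql"])
```

Because the user's values only ever enter the database as bound parameters of the INSERT, the size of the set can vary freely without any SQL being generated at runtime.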
If you were using Python to access your database, I would suggest you use the Django model system. There are many similar APIs both for Python and for other languages (notably in Ruby on Rails). I am saving so much time by avoiding the need to talk directly to the database with SQL.
From the example link:
# Model definition
from django.db import models

class Blog(models.Model):
    name = models.CharField(max_length=100)
    tagline = models.TextField()

    def __unicode__(self):
        return self.name

# Model usage (this is effectively an insert statement)
from mysite.blog.models import Blog
b = Blog(name='Beatles Blog', tagline='All the latest Beatles news.')
b.save()
The queries get much more complex: you pass around a query object and you can add filters and sort elements to it. When you are finally ready to use the query, Django creates an SQL statement that reflects all the ways you adjusted the query object. I think that it is very cute.
Other advantages of this abstraction
Django can create your models as database tables, complete with foreign keys and constraints
Many databases are supported (PostgreSQL, MySQL, SQLite, etc.)
Django inspects your models and creates an automatic admin site from them.
Well the options have to map to something.
A SQL query string CONCAT isn't a problem if you still use parameters for the options.