How to cast a MongoDB query and use Index

I have a SQL query that I want to convert to MongoDB while still using the index. Here is the SQL query:
SELECT * FROM "Data1" WHERE age > Cast('12' as int)
According to this SO answer, the above query can be converted to:
db.test.find("this.age > 12")
However, this syntax will not use any indexes. I wanted to know whether this issue has been fixed in MongoDB.

To make use of indexes, you need to use the $gt operator:
db.test.find({age:{$gt:12}})
The field age should be indexed.
"I wanted to know has this issue has been fixed for MongoDB."
No, it still can't be achieved.
The docs say that if you need JavaScript to evaluate and select documents, you have to use the $where operator, but it will not use the index.
db.test.find( { $where: "this.age>12" } );
From the docs,
$where evaluates JavaScript and cannot take advantage of indexes;
Hence, whenever the query criteria are evaluated as JavaScript, the query can never make use of indexes.
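
One way to keep the cast and still use a plain comparison is to do the cast in application code before building the filter. The following is only a minimal sketch of that idea: the helper name age_filter is hypothetical, and the PyMongo call is shown as a comment rather than executed.

```python
def age_filter(raw_age):
    """Cast the incoming value in application code and build a plain
    comparison filter, so the server never has to evaluate JavaScript."""
    return {"age": {"$gt": int(raw_age)}}


query = age_filter("12")
print(query)  # {'age': {'$gt': 12}}

# With a driver such as PyMongo (assumed setup, not run here) the filter
# would be passed straight to find(), for example:
#   collection.find(query)
```

Because the filter is an ordinary $gt document, it is the same shape as the indexed query shown above.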

Related

Using OrientDB 2.0.2, a LUCENE query does not seem to respect the "LIMIT n" operator

Using LUCENE inside of OrientDB seems to work fine, but there are very many LUCENE-specific query parameters that I would ordinarily pass directly to LUCENE (normally through Solr). The first one I need to pass is the result limiter such as SELECT * FROM V WHERE field LUCENE "Value" LIMIT 10.
If I use a value that only returns a few rows, I get the performance I expect, but if it matches a lot of rows, I need the limiter to get the result to return quickly. Otherwise I get a message in the console stating: The query would return more than 50000 records. Please consider using an index.
How do I pass additional LUCENE query parameters?

There's a known issue with the query parser, which is in the process of being fixed. Until then, the following workaround should help:
SELECT FROM (
SELECT * FROM V WHERE Field LUCENE 'Value'
) LIMIT 10
Alternatively, depending on which client libraries you're using you may be able to set a limit using the out-of-band query settings.

Parameterizing 'limit' and 'order' in sqlite3

I have a SQLite query that I'm looking to parameterize to avoid SQL injection.
So things like:
Select * From myTable Where id = $id
are fine if I have $id defined somewhere and pass that as a parameter to my db calls.
parameters.$id = 150;
db.all(myQuery, parameters, function (err, rows) {
results = rows;
});
I wonder if I need to go out of my way to also parameterize things that are sorted and paginated (both are inputs that users can give)...
I tried to do something like:
var sorter = JSON.parse(value);
parameters.$sortMethod = sorter.method;
parameters.$sortOrder = sorter.order;
sort_filter += 'ORDER BY $sortMethod $sortOrder';
No dice though. I'm guessing sqlite3 just doesn't let you parameterize things that are in ORDER, LIMIT and OFFSET. I thought there might be something really sneaky that folks out there could do by ending a SQLite statement prematurely in the ORDER BY and then starting a new malicious statement, but maybe sqlite3 only lets you execute one statement at a time (http://www.qtcentre.org/threads/54748-Execute-multiple-sql-command-in-SQLITE3)
Should I not worry about parameterizing things in order limit and offset? For reference, I'm running this on node.js with this sqlite library: https://github.com/mapbox/node-sqlite3
Thanks much in advance!

SQLite (and any other database) allows you to parameterize expressions, that is, any numbers, strings, blobs, or NULL values that appear in a statement.
This includes the values in the LIMIT/OFFSET clauses.
Anything else cannot be parameterized.
This would be table and column names, operators, or any other keyword (like SELECT, ORDER BY, or ASC).
If you need to change any parts of your SQL statements that are not expressions, you have to create the statement on the fly.
(There is no danger of SQL injection as long as your code constructs the statement by itself, not using any unchecked user data.)
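
As a rough illustration of the rule above, here is a sketch using Python's built-in sqlite3 module rather than node-sqlite3. The whitelists, the column names, and the fetch_page helper are assumptions made for the example. The sort column and direction are checked against fixed lists before being placed in the SQL text, while LIMIT and OFFSET are bound as ordinary parameters.

```python
import sqlite3

# Hypothetical whitelists: only these identifiers and keywords may reach the SQL text.
ALLOWED_COLUMNS = {"id", "name"}
ALLOWED_DIRECTIONS = {"ASC", "DESC"}


def fetch_page(conn, sort_column, sort_direction, limit, offset):
    direction = sort_direction.upper()
    if sort_column not in ALLOWED_COLUMNS or direction not in ALLOWED_DIRECTIONS:
        raise ValueError("unsupported sort option")
    # The identifier and keyword are interpolated only after the whitelist check;
    # LIMIT and OFFSET are plain values, so they are bound as parameters.
    sql = (
        f"SELECT id, name FROM myTable "
        f"ORDER BY {sort_column} {direction} LIMIT ? OFFSET ?"
    )
    return conn.execute(sql, (limit, offset)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO myTable (name) VALUES (?)", [("a",), ("b",), ("c",)])

rows = fetch_page(conn, "name", "desc", 2, 0)
print(rows)  # [(3, 'c'), (2, 'b')]
```

Anything outside the whitelists raises ValueError before a statement is built, so user input never becomes part of the SQL text unchecked.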

String matching in Peewee (SQL)

I am trying to query in Peewee with results that should have a specific substring in them.
For instance, if I want only activities with "Physics" in the name:
schedule = Session.select().join(Activity).where(Activity.name % "%Physics%").join(Course).join(StuCouRel).join(Student).where(Student.id == current_user.id)
The above example doesn't give any errors, but doesn't work correctly.
In python, I would just do if "Physics" in Activity.name, so I'm looking for an equivalent which I can use in a query.

You could also use these query methods: .contains(substring), .startswith(prefix), .endswith(suffix).
For example, your where clause could be:
.where(Activity.name.contains("Physics"))
I believe this is case-insensitive and behaves the same as LIKE '%Physics%'.

Quick answer:
just use Activity.name.contains('Physics')
Depending on the database backend you're using, you'll want to pick the right "wildcard". PostgreSQL and MySQL use "%", but for SQLite, if you're performing a LIKE query you will actually want to use "*" (although for ILIKE it is "%", which is confusing).
I'm going to guess you're using SQLite since the above query is failing. So to recap, with SQLite, if you want case-sensitive partial-string matching, use Activity.name % "*Physics*", and for case-insensitive matching, use Activity.name ** "%Physics%".
http://www.sqlite.org/lang_expr.html#like
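
The two wildcard characters correspond to two different SQLite operators, GLOB and LIKE. The sketch below uses Python's built-in sqlite3 module directly, not Peewee, to show how they differ. The activity table and its rows are made up for the example, and which operator a given Peewee version emits for % is taken from the answer above, not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (name TEXT)")
conn.executemany(
    "INSERT INTO activity (name) VALUES (?)",
    [("Physics 101",), ("Advanced physics",), ("Chemistry",)],
)

# LIKE uses % as the wildcard and ignores ASCII case by default.
like_rows = conn.execute(
    "SELECT name FROM activity WHERE name LIKE ? ORDER BY name", ("%Physics%",)
).fetchall()

# GLOB uses * as the wildcard and is case-sensitive.
glob_rows = conn.execute(
    "SELECT name FROM activity WHERE name GLOB ? ORDER BY name", ("*Physics*",)
).fetchall()

print(like_rows)  # [('Advanced physics',), ('Physics 101',)]
print(glob_rows)  # [('Physics 101',)]
```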

Rails: ActiveRecord db sort operation case insensitive

I am trying to learn Rails [by following the SaaS course on Coursera] and am working with a simple Movie table using ActiveRecord.
I want to display all movies with title sorted. I would like it to be sorted case insensitively.
I tried doing it this way:
Movie.all(:conditions => ["lower(title) = ?", title.downcase],:order => "title DESC")
=>undefined local variable or method `title' for #<MoviesController:0xb4da9a8>
I think it doesn't recognise lower(title).
Is this the best way to achieve a case-insensitive sort?
Thanks!

Use where and not all:
Movie.where("lower(title) = ?", title.downcase).order("title DESC")
I don't really understand the sort, though. Here you'll get all movies whose lowercased title equals title.downcase. Every match has the same lowercased title, so what is there to sort by title DESC?
To sort all movies reverse-alphabetically by lowercase title:
Movie.order("lower(title) DESC").all

You have to do this:
Movie.order("lower(title) DESC").all

A more robust solution is to use arel nodes. I'd recommend defining a couple scopes on the Movie model:
scope :order_by_title, -> {
order(arel_table['title'].lower.desc)
}
scope :for_title, ->(title) {
where(arel_table['title'].lower.eq title.downcase)
}
and then call Movie.for_title(title).order_by_title
The advantage over other answers listed is that .for_title and .order_by_title won't break if you alias the title column or join to another table with a title column, and they are sql escaped.
As rickypai mentioned, if you don't have an index on the column, the database will be slow. However, it's bad (normal) form to copy your data and apply a transform to another column, because then one column can become out of sync with the other. Unfortunately, earlier versions of MySQL didn't allow for many alternatives other than triggers. After 5.7.5 you can use virtual generated columns to do this. Then, for case-insensitive operations, you just use the generated column (which actually makes the Ruby more straightforward).
Postgres has a bit more flexibility in this regard, and will let you make indexes on functions without having to reference a special column, or you can make the column a case insensitive column.

Having MySQL perform an upper or lower case operation each time is quite expensive.
What I recommend is having a title column and a title_lower column. This way, you can easily display and sort with case insensitivity on the title_lower column without having MySQL perform upper or lower each time you sort.
Remember to index both or at least title_lower.
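
To make the database-side options concrete, here is a small sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL and Postgres here, and the movies table and index name are invented for the example. It sorts by lower(title) and creates an index on that expression (supported in SQLite 3.9 and later), which avoids a second column that has to be kept in sync.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT)")
conn.executemany(
    "INSERT INTO movies (title) VALUES (?)",
    [("alien",), ("Zodiac",), ("Brazil",), ("amelie",)],
)

# Index on the expression itself, so no title_lower column has to be maintained.
conn.execute("CREATE INDEX idx_movies_title_lower ON movies (lower(title))")

# Case-insensitive, reverse-alphabetical sort.
titles = [
    row[0]
    for row in conn.execute("SELECT title FROM movies ORDER BY lower(title) DESC")
]
print(titles)  # ['Zodiac', 'Brazil', 'amelie', 'alien']
```

Whether the planner actually uses the expression index for a given query is something to check with the database's own EXPLAIN output.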

How bad is my query?

Ok I need to build a query based on some user input to filter the results.
The query basically goes something like this:
SELECT * FROM my_table ORDER BY ordering_fld;
There are four text boxes in which users can choose to filter the data, meaning I'd have to dynamically build a "WHERE" clause into it for the first filter used and then "AND" clauses for each subsequent filter entered.
Because I'm too lazy to do this, I've just made every filter an "AND" clause and put a "WHERE 1" clause in the query by default.
So now I have:
SELECT * FROM my_table WHERE 1 {AND filters} ORDER BY ordering_fld;
So my question is, have I done something that will adversely affect the performance of my query or buggered anything else up in any way I should be remotely worried about?

MySQL will optimize your 1 away.
I just ran this query on my test database:
EXPLAIN EXTENDED
SELECT *
FROM t_source
WHERE 1 AND id < 100
and it gave me the following description:
select `test`.`t_source`.`id` AS `id`,`test`.`t_source`.`value` AS `value`,`test`.`t_source`.`val` AS `val`,`test`.`t_source`.`nid` AS `nid` from `test`.`t_source` where (`test`.`t_source`.`id` < 100)
As you can see, no 1 at all.
The documentation on WHERE clause optimization in MySQL mentions this:
Constant folding:
(a<b AND b=c) AND a=5
-> b>5 AND b=c AND a=5
Constant condition removal (needed because of constant folding):
(B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6)
-> B=5 OR B=6
Note 5 = 5 and 5 = 6 parts in the example above.

You can EXPLAIN your query:
http://dev.mysql.com/doc/refman/5.0/en/explain.html
and see if it does anything differently, which I doubt. I would use 1=1, just so it is clearer.
You might want to add LIMIT 1000 or something similar: when no parameters are used and the table gets large, will you really want to return everything?

WHERE 1 is a constant, deterministic expression which will be "optimized out" by any decent DB engine.

If there is a good way in your chosen language to avoid building SQL yourself, use that instead. I like Python and Django, and the Django ORM makes it very easy to filter results based on user input.
If you are committed to building the SQL yourself, be sure to sanitize user inputs against SQL injection, and try to encapsulate SQL building in a separate module from your filter logic.
Also, query performance should not be your concern until it becomes a problem, which it probably won't until you have thousands or millions of rows. And when it does come time to optimize, adding a few indexes on columns used for WHERE and JOIN goes a long way.

To improve performance, use column indexes on the fields listed in "WHERE".

Standard SQL Injection Disclaimers here...
One thing you could do to avoid SQL injection, since you know there are only four parameters, is to use a stored procedure to which you pass either a value or NULL for each field. I am not sure of the MySQL stored procedure syntax, but the query would boil down to
SELECT *
FROM my_table
WHERE Field1 = ISNULL(@Field1, Field1)
AND Field2 = ISNULL(@Field2, Field2)
...
ORDER BY ordering_fld

We did something similar not too long ago, and there are a few things we observed:
Setting up indexes on the columns we were (possibly) filtering on improved performance.
The WHERE 1 part can be left out completely if no filters are used (not sure if that applies to your case). It doesn't make a difference, but it 'feels' right.
SQL injection shouldn't be forgotten.
Also, if you 'only' have 4 filters, you could build a stored procedure, pass in null values, and check for them (just as n8wrl suggested in the meantime).

That will work - some considerations:
About dynamically built SQL in general, some databases (Oracle at least) will cache execution plans for queries, so if you end up running the same query many times it won't have to completely start over from scratch. If you use dynamically built SQL, you are creating a different query each time so to the database it will look like 100 different queries instead of 100 runs of the same query.
You'd probably just need to measure the performance to find out if it works well enough for you.
Do you need all the columns? Explicitly specifying them is probably better than using * anyways because:
You can visually see what columns are being returned
If you add or remove columns to the table later, they won't change your interface

Not bad, I didn't know this trick for getting rid of the 'is it the first filter?' question.
Though you should be ashamed of your code ( ^^ ), it doesn't do anything to performance, as any DB engine will optimize it away.

The only reason I've used WHERE 1 = 1 is for dynamic SQL; it's a hack to make appending WHERE clauses easier by using AND .... It is not something I would include in my SQL otherwise - it does nothing to affect the query overall because it always evaluates as being true and does not hit the table(s) involved so there aren't any index lookups or table scans based on it.
I can't speak to how MySQL handles optional criteria, but I know that using the following:
WHERE (@param IS NULL OR t.column = @param)
...is the typical way of handling optional parameters. COALESCE and ISNULL are not ideal because the query is still utilizing indexes (or worse, table scans) based on a sentinel value. The example I provided won't hit the table unless a value has been provided.
That said, my experience with Oracle (9i, 10g) has shown that it doesn't handle [ WHERE (@param IS NULL OR t.column = @param) ] very well. I saw a huge performance gain by converting the SQL to be dynamic, and used CONTEXT variables to determine what to add. My impression of SQL Server 2005 is that these are handled better.
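
The optional-parameter pattern from this answer can be tried out with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for the databases discussed, and the table, its columns, and the search helper are hypothetical. It demonstrates the behaviour of the pattern, not its performance on any particular engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (field1 TEXT, field2 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", "x"), ("a", "y"), ("b", "x")],
)

# One fixed statement; each filter is skipped when its parameter is NULL.
SQL = """
SELECT field1, field2 FROM my_table
WHERE (:field1 IS NULL OR field1 = :field1)
  AND (:field2 IS NULL OR field2 = :field2)
ORDER BY field1, field2
"""


def search(field1=None, field2=None):
    # None is bound as NULL, which switches the corresponding filter off.
    return conn.execute(SQL, {"field1": field1, "field2": field2}).fetchall()


print(search())               # [('a', 'x'), ('a', 'y'), ('b', 'x')]
print(search(field1="a"))     # [('a', 'x'), ('a', 'y')]
print(search("a", "y"))       # [('a', 'y')]
```

The SQL text never changes, so no user input is concatenated into it; only the bound values differ between calls.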

I have usually done something like this:
for (int i = 0; i < numConditions; i++) {
    // Leading spaces keep the keywords from running into the preceding text.
    sql += (i == 0 ? " WHERE " : " AND ");
    sql += dbFieldNames[i] + " = " + safeVariableValues[i];
}
Makes the generated query a little cleaner.

One alternative I sometimes use is to build the WHERE clause as an array and then join the parts together:
my @wherefields;
foreach my $c (@conditionfields) {
push @wherefields, "$c = ?";
}
my $sql = "select * from table";
if (@wherefields) { $sql .= " WHERE " . join(" AND ", @wherefields); }
The above is written in Perl, but most languages have some kind of join function.
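
The same join approach can be sketched in Python. The build_query helper and the four column names are assumptions made for the example; my_table and ordering_fld come from the question. Values are returned separately as bound parameters, and column names come only from a fixed tuple, so nothing typed by the user ends up in the SQL text.

```python
def build_query(filters):
    """filters maps column name -> value; empty or None values are skipped."""
    # Hypothetical fixed list of filterable columns; user input never supplies names.
    allowed = ("field1", "field2", "field3", "field4")
    clauses, params = [], []
    for column in allowed:
        value = filters.get(column)
        if value not in (None, ""):
            clauses.append(f"{column} = ?")
            params.append(value)
    sql = "SELECT * FROM my_table"
    if clauses:
        # The WHERE keyword is added only when at least one filter is present.
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY ordering_fld"
    return sql, params


print(build_query({}))
# ('SELECT * FROM my_table ORDER BY ordering_fld', [])
print(build_query({"field2": "x", "field1": "a"}))
# ('SELECT * FROM my_table WHERE field1 = ? AND field2 = ? ORDER BY ordering_fld', ['a', 'x'])
```

With this shape there is no need for a WHERE 1 placeholder, since the join handles the first-filter case on its own.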
