Django model query with not equal exclusion - sql

I would like to create query that has a not equal parameter, but in exclusion part.
All posts suggest to exclude the parameters that should be not equal.
So what I need is :
Object.objects.filter(param1=p).exclude(param2=False, param3=False, param4 != q)
I try
Object.objects.filter(param1=p).exclude(param2=False, param3=False).exclude(param4=q)
but, as expected, doesn't work it is the same as exclude(param2=False, param3=False, param4=q)
So how to do it, or how to rightly chain exclude statements, so that second exclude is pointed to first exclude and not first filter.

It's a kind of logical question.
Exclude (field1 != value1) is same as Include(field1 = value1)
So, you can use the filter() metbod as,
Object.objects.filter(param1=p, param4=q).exclude(param2=False, param3=False)
Or
Use Q() object with ~ symbol
from django.db.models import Q
Object.objects.filter(param1=p).exclude(~Q(param4=q), param2=False, param3=False)

Related

Passing Optional List argument from Django to filter with in Raw SQL

When using primitive types such as Integer, I can without any problems do a query like this:
with connection.cursor() as cursor:
cursor.execute(sql='''SELECT count(*) FROM account
WHERE %(pk)s ISNULL OR id %(pk)s''', params={'pk': 1})
Which would either return row with id = 1 or it would return all rows if pk parameter was equal to None.
However, when trying to use similar approach to pass a list/tuple of IDs, I always produce a SQL syntax error when passing empty/None tuple, e.g. trying:
with connection.cursor() as cursor:
cursor.execute(sql='''SELECT count(*) FROM account
WHERE %(ids)s ISNULL OR id IN %(ids)s''', params={'ids': (1,2,3)})
works, but passing () produces SQL syntax error:
psycopg2.ProgrammingError: syntax error at or near ")"
LINE 1: SELECT count(*) FROM account WHERE () ISNULL OR id IN ()
Or if I pass None I get:
django.db.utils.ProgrammingError: syntax error at or near "NULL"
LINE 1: ...LECT count(*) FROM account WHERE NULL ISNULL OR id IN NULL
I tried putting the argument in SQL in () - (%(ids)s) - but that always breaks one or the other condition. I also tried playing around with pg_typeof or casting the argument, but with no results.
Notes:
the actual SQL is much more complex, this one here is a simplification for illustrative purposes
as a last resort - I could alter the SQL in Python based on the argument, but I really wanted to avoid that.)
At first I had an idea of using just 1 argument, but replacing it with a dummy value [-1] and then using it like
cursor.execute(sql='''SELECT ... WHERE -1 = any(%(ids)s) OR id = ANY(%(ids)s)''', params={'ids': ids if ids else [-1]})
but this did a Full table scan for non empty lists, which was unfortunate, so a no go.
Then I thought I could do a little preprocessing in python and send 2 arguments instead of just the single list- the actual list and an empty list boolean indicator. That is
cursor.execute(sql='''SELECT ... WHERE %(empty_ids)s = TRUE OR id = ANY(%(ids)s)''', params={'empty_ids': not ids, 'ids': ids})
Not the most elegant solution, but it performs quite well (Index scan for non empty list, Full table scan for empty list - but that returns the whole table anyway, so it's ok)
And finally I came up with the simplest solution and quite elegant:
cursor.execute(sql='''SELECT ... WHERE '{}' = %(ids)s OR id = ANY(%(ids)s)''', params={'ids': ids})
This one also performs Index scan for non empty lists, so it's quite fast.
From the psycopg2 docs:
Note You can use a Python list as the argument of the IN operator using the PostgreSQL ANY operator.
ids = [10, 20, 30]
cur.execute("SELECT * FROM data WHERE id = ANY(%s);", (ids,))
Furthermore ANY can also work with empty lists, whereas IN () is a SQL syntax error.

Check existence of given text in a table

I have a course code name COMP2221.
I also have a function finder(int) that can find all codes matching a certain pattern.
Like:
select * from finder(20004)
will give:
comp2211
comp2311
comp2411
comp2221
which match the pattern comp2###.
My question is how to express "whether comp2221 is in finder(20004)" in a neat way?
How to express "whether comp2221 is in finder(20004)" in a neat way?
Use an EXISTS expression and put the test into the WHERE clause:
SELECT EXISTS (SELECT FROM finder(20004) AS t(code) WHERE code = 'comp2221');
Returns a single TRUE or FALSE. Never NULL and never more than one row - even if your table function finder() returns duplicates.
Or fork the function finder() to integrate the test directly. Probably faster.

Django select only rows with duplicate field values

suppose we have a model in django defined as follows:
class Literal:
name = models.CharField(...)
...
Name field is not unique, and thus can have duplicate values. I need to accomplish the following task:
Select all rows from the model that have at least one duplicate value of the name field.
I know how to do it using plain SQL (may be not the best solution):
select * from literal where name IN (
select name from literal group by name having count((name)) > 1
);
So, is it possible to select this using django ORM? Or better SQL solution?
Try:
from django.db.models import Count
Literal.objects.values('name')
.annotate(Count('id'))
.order_by()
.filter(id__count__gt=1)
This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet with only name and count. However, you can then use this to construct a regular QuerySet by feeding it back into another query:
dupes = Literal.objects.values('name')
.annotate(Count('id'))
.order_by()
.filter(id__count__gt=1)
Literal.objects.filter(name__in=[item['name'] for item in dupes])
This was rejected as an edit. So here it is as a better answer
dups = (
Literal.objects.values('name')
.annotate(count=Count('id'))
.values('name')
.order_by()
.filter(count__gt=1)
)
This will return a ValuesQuerySet with all of the duplicate names. However, you can then use this to construct a regular QuerySet by feeding it back into another query. The django ORM is smart enough to combine these into a single query:
Literal.objects.filter(name__in=dups)
The extra call to .values('name') after the annotate call looks a little strange. Without this, the subquery fails. The extra values tricks the ORM into only selecting the name column for the subquery.
try using aggregation
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)
In case you use PostgreSQL, you can do something like this:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value
duplicate_ids = (Literal.objects.values('name')
.annotate(ids=ArrayAgg('id'))
.annotate(c=Func('ids', Value(1), function='array_length'))
.filter(c__gt=1)
.annotate(ids=Func('ids', function='unnest'))
.values_list('ids', flat=True))
It results in this rather simple SQL query:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
Ok, so for some reason none of the above worked for, it always returned <MultilingualQuerySet []>. I use the following, much easier to understand but not so elegant solution:
dupes = []
uniques = []
dupes_query = MyModel.objects.values_list('field', flat=True)
for dupe in set(dupes_query):
if not dupe in uniques:
uniques.append(dupe)
else:
dupes.append(dupe)
print(set(dupes))
If you want to result only names list but not objects, you can use the following query
repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat='true')

How to specify multiple values in where with AR query interface in rails3

Per section 2.2 of rails guide on Active Record query interface here:
which seems to indicate that I can pass a string specifying the condition(s), then an array of values that should be substituted at some point while the arel is being built. So I've got a statement that generates my conditions string, which can be a varying number of attributes chained together with either AND or OR between them, and I pass in an array as the second arg to the where method, and I get:
ActiveRecord::PreparedStatementInvalid: wrong number of bind variables (1 for 5)
which leads me to believe I'm doing this incorrectly. However, I'm not finding anything on how to do it correctly. To restate the problem another way, I need to pass in a string to the where method such as "table.attribute = ? AND table.attribute1 = ? OR table.attribute1 = ?" with an unknown number of these conditions anded or ored together, and then pass something, what I thought would be an array as the second argument that would be used to substitute the values in the first argument conditions string. Is this the correct approach, or, I'm just missing some other huge concept somewhere and I'm coming at this all wrong? I'd think that somehow, this has to be possible, short of just generating a raw sql string.
This is actually pretty simple:
Model.where(attribute: [value1,value2])
Sounds like you're doing something like this:
Model.where("attribute = ? OR attribute2 = ?", [value, value])
Whereas you need to do this:
# notice the lack of an array as the last argument
Model.where("attribute = ? OR attribute2 = ?", value, value)
Have a look at http://guides.rubyonrails.org/active_record_querying.html#array-conditions for more details on how this works.
Instead of passing the same parameter multiple times to where() like this
User.where(
"first_name like ? or last_name like ? or city like ?",
"%#{search}%", "%#{search}%", "%#{search}%"
)
you can easily provide a hash
User.where(
"first_name like :search or last_name like :search or city like :search",
{search: "%#{search}%"}
)
that makes your query much more readable for long argument lists.
Sounds like you're doing something like this:
Model.where("attribute = ? OR attribute2 = ?", [value, value])
Whereas you need to do this:
#notice the lack of an array as the last argument
Model.where("attribute = ? OR attribute2 = ?", value, value) Have a
look at
http://guides.rubyonrails.org/active_record_querying.html#array-conditions
for more details on how this works.
Was really close. You can turn an array into a list of arguments with *my_list.
Model.where("id = ? OR id = ?", *["1", "2"])
OR
params = ["1", "2"]
Model.where("id = ? OR id = ?", *params)
Should work
If you want to chain together an open-ended list of conditions (attribute names and values), I would suggest using an arel table.
It's a bit hard to give specifics since your question is so vague, so I'll just explain how to do this for a simple case of a Post model and a few attributes, say title, summary, and user_id (i.e. a user has_many posts).
First, get the arel table for the model:
table = Post.arel_table
Then, start building your predicate (which you will eventually use to create an SQL query):
relation = table[:title].eq("Foo")
relation = relation.or(table[:summary].eq("A post about foo"))
relation = relation.and(table[:user_id].eq(5))
Here, table[:title], table[:summary] and table[:user_id] are representations of columns in the posts table. When you call table[:title].eq("Foo"), you are creating a predicate, roughly equivalent to a find condition (get all rows whose title column equals "Foo"). These predicates can be chained together with and and or.
When your aggregate predicate is ready, you can get the result with:
Post.where(relation)
which will generate the SQL:
SELECT "posts".* FROM "posts"
WHERE (("posts"."title" = "Foo" OR "posts"."summary" = "A post about foo")
AND "posts"."user_id" = 5)
This will get you all posts that have either the title "Foo" or the summary "A post about foo", and which belong to a user with id 5.
Notice the way arel predicates can be endlessly chained together to create more and more complex queries. This means that if you have (say) a hash of attribute/value pairs, and some way of knowing whether to use AND or OR on each of them, you can loop through them one by one and build up your condition:
relation = table[:title].eq("Foo")
hash.each do |attr, value|
relation = relation.and(table[attr].eq(value))
# or relation = relation.or(table[attr].eq(value)) for an OR predicate
end
Post.where(relation)
Aside from the ease of chaining conditions, another advantage of arel tables is that they are independent of database, so you don't have to worry whether your MySQL query will work in PostgreSQL, etc.
Here's a Railscast with more on arel: http://railscasts.com/episodes/215-advanced-queries-in-rails-3?view=asciicast
Hope that helps.
You can use a hash rather than a string. Build up a hash with however many conditions and corresponding values you are going to have and put it into the first argument of the where method.
WRONG
This is what I used to do for some reason.
keys = params[:search].split(',').map!(&:downcase)
# keys are now ['brooklyn', 'queens']
query = 'lower(city) LIKE ?'
if keys.size > 1
# I need something like this depending on number of keys
# 'lower(city) LIKE ? OR lower(city) LIKE ? OR lower(city) LIKE ?'
query_array = []
keys.size.times { query_array << query }
#['lower(city) LIKE ?','lower(city) LIKE ?']
query = query_array.join(' OR ')
# which gives me 'lower(city) LIKE ? OR lower(city) LIKE ?'
end
# now I can query my model
# if keys size is one then keys are just 'brooklyn',
# in this case it is 'brooklyn', 'queens'
# #posts = Post.where('lower(city) LIKE ? OR lower(city) LIKE ?','brooklyn', 'queens' )
#posts = Post.where(query, *keys )
now however - yes - it's very simple. as nfriend21 mentioned
Model.where(attribute: [value1,value2])
does the same thing

linq match word with boundaries

say i have a nvarchar field in my database that looks like this
1, "abc abccc dabc"
2, "abccc dabc"
3, "abccc abc dabc"
i need a select LINQ query that would match the word "abc" with boundaries not part of a string
in this case only row 1 and 3 would match
from row in table.AsEnumerable()
where row.Foo.Split(new char[] {' ', '\t'}, StringSplitOptions.None)
.Contains("abc")
select row
It's important to include the call to AsEnumerable, which means the query is executed on the client-side, else (I'm pretty sure) the Where clause won't get converted into SQL succesfully.
Maybe a regular expression like this (nb - not compiled or tested):
var matches = from a in yourCollection
where Regex.Match(a.field, ".*\sabc\s.*")
select a;
datacontext.Table.Where(
e => Regex.Match(e.field, #"(.*?[\s\t]|^)abc([\s\t].*?|$)")
);
or
datacontext.Table.Where(
e => e.Split(' ', '\t').Contains("abc");
);
For efficiency, you want to do as much of the filtering as possible on the server, and then the rest of the filtering on the client. You can't use Regex on the server (SQL Server doesn't support it) so the solution is to first use a LIKE-type search (by calling .Contains) then use Regex on the client to further refine the results:
db.MyTable
.Where (t => t.MyField.Contains ("abc"))
.AsEnumerable() // Executes locally from this point on
.Where (t => Regex.IsMatch (t.MyField, #"\babc\b"))
This ensures that you retrieve only the rows from SQL Server than contain the letters 'abc' (regardless of whether they're a word-boundary match or not) and use Regex on the client-side to further restrict the result set so that only matches that are on word boundaries are included.