Query list index with django - sql

I have a problem with Django when I use these two lists:
institution_ids = [(2,), (16,)]
project_ids = [(3,), (1,)]
in this query:
queryset = Patient.active.filter(tss_id__in=institution_ids, project_id__in=project_ids)
It gives me back all the combinations, but I need this kind of result:
queryset = Patient.active.filter(tss_id=institution_ids[0], project_id=project_ids[0])
queryset += Patient.active.filter(tss_id=institution_ids[1], project_id=project_ids[1])
How can I do this?
Thanks
Giuseppe

What you are looking for is an OR condition.
You can build it with Q objects.
In Django you cannot express OR directly through the filter() keyword
arguments. For that you have to use Q() objects: by combining Q()
objects with the | operator inside filter(), you get an OR between the
conditions.
For example-
If you want all objects whose tss_id is in institution_ids or whose project_id is in project_ids, you would use it like this:
queryset = Patient.active.filter(Q(tss_id__in=institution_ids) | Q(project_id__in=project_ids))
Pay attention: you need to import Q (from django.db.models import Q).
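Note that the question asks for the conditions paired by position rather than all combinations. A minimal sketch of that with Q objects (my own, assuming the two lists are aligned by position and each tuple holds a single id):
from functools import reduce
import operator
from django.db.models import Q

# One Q per (institution, project) pair, OR-ed together.
pairs = zip(institution_ids, project_ids)  # e.g. (2,) with (3,), (16,) with (1,)
condition = reduce(operator.or_, (Q(tss_id=inst[0], project_id=proj[0]) for inst, proj in pairs))
queryset = Patient.active.filter(condition)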
Another option -
Using union
New in Django 1.11. Uses SQL’s UNION operator to combine the results of two or more QuerySets.
The UNION operator selects only distinct values by default. To allow duplicate values, use the all=True argument.
For example :
queryset = Patient.active.filter(tss_id=institution_ids[0], project_id=project_ids[0])
queryset2 = Patient.active.filter(tss_id=institution_ids[1], project_id=project_ids[1])
combined_queryset = queryset.union(queryset2)
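The same pairing can be generalized with union() over all pairs (a sketch, assuming there is at least one pair and the lists are aligned by position):
querysets = [
    Patient.active.filter(tss_id=inst[0], project_id=proj[0])
    for inst, proj in zip(institution_ids, project_ids)
]
combined_queryset = querysets[0].union(*querysets[1:])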


how to dynamically build a select list from an API payload using PyPika

I have a JSON API payload containing tablename and columnlist - how do I build a SELECT query from it using pypika?
So far I have been able to use a string columnlist, but I am not able to do advanced querying using functions, analytics, etc.
from pypika import Table, Query, functions as fn

def generate_sql(tablename, collist):
    table = Table(tablename)
    columns = [str(table) + '.' + each for each in collist]
    q = Query.from_(table).select(*columns)
    return q.get_sql(quote_char=None)

tablename = 'customers'
collist = ['id', 'fname', 'fn.Sum(revenue)']
print(generate_sql(tablename, collist))  #1

table = Table(tablename)
q = Query.from_(table).select(table.id, table.fname, fn.Sum(table.revenue))
print(q.get_sql(quote_char=None))  #2
#1 outputs
SELECT "customers".id,"customers".fname,"customers".fn.Sum(revenue) FROM customers
#2 outputs correctly
SELECT id,fname,SUM(revenue) FROM customers
You should not be trying to assemble the query as strings yourself; that defeats the whole purpose of pypika.
In your case, where the table name and the columns arrive as text in a JSON object, you can use * to unpack the values from collist and the obj[key] syntax to look up a table attribute by name from a string:
q = Query.from_(table).select(*(table[col] for col in collist))
# SELECT id,fname,fn.Sum(revenue) FROM customers
Hmm... that doesn't quite work for the fn.Sum(revenue). The goal is to get SUM(revenue).
This can get much more complicated from this point. If you are only sending column names that you know belong to that table, the above solution is enough.
But if you have complex SQL expressions, referencing SQL functions or even different tables, I suggest you rethink the decision to send them as JSON. You might end up with something as complex as pypika itself, like a custom parser or whatever. In that case your better option would be to change the format of your JSON response object.
If you know you only need to support a very limited set of capabilities, it could be feasible. For example, you can assume the following constraints:
all column names refer to only one table, no joins or aliases
all functions will be prefixed by fn.
no fancy stuff like window functions, distinct, count(*)...
Then you can do something like:
from pypika import Table, Query, functions as fn
import re

tablename = 'customers'
collist = ['id', 'fname', 'fn.Sum(revenue / 2)', 'revenue % fn.Count(id)']

def parsed(cols):
    pattern = r'(?:\bfn\.[a-zA-Z]\w*)|([a-zA-Z]\w*)'
    subst = lambda m: f"{'' if m.group().startswith('fn.') else 'table.'}{m.group()}"
    yield from (re.sub(pattern, subst, col) for col in cols)

table = Table(tablename)
env = dict(table=table, fn=fn)
q = Query.from_(table).select(*(eval(col, env) for col in parsed(collist)))
print(q.get_sql(quote_char=None))  #2
Output:
SELECT id,fname,SUM(revenue/2),MOD(revenue,COUNT(id)) FROM customers
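If you would rather avoid eval() on strings coming from an API, a rough alternative sketch (my own, not from the answer above) is to whitelist the handful of functions you expect and map them to pypika functions explicitly; ALLOWED_FUNCS and to_term below are made-up names for illustration:
import re
from pypika import Table, Query, functions as fn

# Hypothetical whitelist: only these function names are accepted from the payload.
ALLOWED_FUNCS = {'fn.Sum': fn.Sum, 'fn.Count': fn.Count, 'fn.Avg': fn.Avg}

def to_term(table, col):
    # Matches e.g. 'fn.Sum(revenue)'; anything else is treated as a plain column name.
    m = re.fullmatch(r'(fn\.\w+)\((\w+)\)', col.strip())
    if m:
        func_name, field = m.groups()
        return ALLOWED_FUNCS[func_name](table[field])
    return table[col]

table = Table('customers')
collist = ['id', 'fname', 'fn.Sum(revenue)']
q = Query.from_(table).select(*(to_term(table, c) for c in collist))
print(q.get_sql(quote_char=None))  # SELECT id,fname,SUM(revenue) FROM customers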

search for django-objects tagged with all tags in a set

In my Django project I have a search function where you can specify tags, like "apple, banana", and by that query for objects of a certain model, tagged with taggit. When I do:
tag_set = Tag.objects.filter(Q(name__in=tag_list))
query_set = Model.objects.filter(Q(tags__in=tag_set))
this gives me objects tagged with either "apple" OR "banana". But I want the AND operator... I tried:
query_set = Model.objects.filter(reduce(operator.and_, (Q(tags__in=x) for x in tag_set)))
but then I get 'Tag' object is not iterable.
Any help?
You can work with:
queryset = Model.objects.all()
for tag in tag_list:
    queryset = queryset.filter(tags__name=tag)
This will make a JOIN for each tag, and thus eventually the queryset will only contain Model items that have all the necessary tags.
Another approach is counting the number of matched tags, so:
from django.db.models import Count

tag_set = set(tag_list)
Model.objects.filter(tags__name__in=tag_set).alias(
    ntags=Count('tags')
).filter(ntags=len(tag_set))
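If alias() is not available in your Django version, the same counting idea works with annotate() (my adaptation of the query above, same assumptions about the tags field):
from django.db.models import Count

tag_set = set(tag_list)
# Count how many of the wanted tags each object matched; keep only objects that matched them all.
Model.objects.filter(tags__name__in=tag_set).annotate(
    ntags=Count('tags')
).filter(ntags=len(tag_set))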

multiple parameter "IN" prepared statement

I was trying to figure out how can I set multiple parameters for the IN clause in my SQL query using PreparedStatement.
For example, in this SQL statement I'll have an indefinite number of ?.
select * from ifs_db where img_hub = ? and country IN (multiple ?)
I've read about this in
PreparedStatement IN clause alternatives?
However, I can't figure out how to apply it to my SQL statement above.
There's no standard way to handle this.
In SQL Server, you can use a table-valued parameter in a stored procedure and pass the countries in a table and use it in a join.
I've also seen cases where a comma-separated list is passed in and then parsed into a table by a function and then used in a join.
If your countries are standard ISO codes in a delimited list like '#US#UK#DE#NL#', you can use a rather simplistic construct like:
select * from ifs_db where img_hub = ? and ? LIKE '%#' + country + '#%'
Sormula will work for any data type (even custom types). This example uses ints for simplicity.
ArrayList<Integer> partNumbers = new ArrayList<Integer>();
partNumbers.add(999);
partNumbers.add(777);
partNumbers.add(1234);

// set up
Database database = new Database(getConnection());
Table<Inventory> inventoryTable = database.getTable(Inventory.class);
ArrayListSelectOperation<Inventory> operation =
        new ArrayListSelectOperation<Inventory>(inventoryTable, "partNumberIn");

// show results
for (Inventory inventory : operation.selectAll(partNumbers))
    System.out.println(inventory.getPartNumber());
You could use the setArray method, as mentioned in the javadoc below:
http://docs.oracle.com/javase/6/docs/api/java/sql/PreparedStatement.html#setArray(int, java.sql.Array)
Code:
PreparedStatement statement = connection.prepareStatement("Select * from test where field in (?)");
Array array = statement.getConnection().createArrayOf("VARCHAR", new Object[]{"AA1", "BB2","CC3"});
statement.setArray(1, array);
ResultSet rs = statement.executeQuery();

Django select only rows with duplicate field values

Suppose we have a model in Django defined as follows:
class Literal(models.Model):
    name = models.CharField(...)
    ...
The name field is not unique and thus can have duplicate values. I need to accomplish the following task:
Select all rows from the model that have at least one duplicate value of the name field.
I know how to do it using plain SQL (maybe not the best solution):
select * from literal where name IN (
    select name from literal group by name having count(name) > 1
);
So, is it possible to select this using the Django ORM? Or a better SQL solution?
Try:
from django.db.models import Count
(Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet with only name and count. However, you can then use this to construct a regular QuerySet by feeding it back into another query:
dupes = (Literal.objects.values('name')
         .annotate(Count('id'))
         .order_by()
         .filter(id__count__gt=1))

Literal.objects.filter(name__in=[item['name'] for item in dupes])
This was rejected as an edit. So here it is as a better answer
dups = (
    Literal.objects.values('name')
    .annotate(count=Count('id'))
    .values('name')
    .order_by()
    .filter(count__gt=1)
)
This will return a ValuesQuerySet with all of the duplicate names. However, you can then use this to construct a regular QuerySet by feeding it back into another query. The Django ORM is smart enough to combine these into a single query:
Literal.objects.filter(name__in=dups)
The extra call to .values('name') after the annotate call looks a little strange. Without it, the subquery fails. The extra values() call tricks the ORM into selecting only the name column for the subquery.
Try using aggregation:
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)
In case you use PostgreSQL, you can do something like this:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value
duplicate_ids = (Literal.objects.values('name')
                 .annotate(ids=ArrayAgg('id'))
                 .annotate(c=Func('ids', Value(1), function='array_length'))
                 .filter(c__gt=1)
                 .annotate(ids=Func('ids', function='unnest'))
                 .values_list('ids', flat=True))
It results in this rather simple SQL query:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
Ok, so for some reason none of the above worked for me; it always returned <MultilingualQuerySet []>. I used the following, much easier to understand but not so elegant, solution:
dupes = []
uniques = []

dupes_query = MyModel.objects.values_list('field', flat=True)
for dupe in dupes_query:
    if dupe not in uniques:
        uniques.append(dupe)
    else:
        dupes.append(dupe)

print(set(dupes))
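A slightly tighter pure-Python variant of the same idea (a sketch using collections.Counter; the model and field names are the ones from the snippet above):
from collections import Counter

# Count every value once, then keep only the values that appear more than once.
values = MyModel.objects.values_list('field', flat=True)
dupes = [value for value, count in Counter(values).items() if count > 1]
print(dupes)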
If you want only the list of names rather than the objects, you can use the following query:
repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat=True)

checking if all the items in a list occur in another list using linq

I am stuck with a problem here. I am trying to compare the items in a list to another list with many more items, using LINQ.
For example:
list 1: 10,15,20
list 2: 10,13,14,15,20,30,45,54,67,87
I should get TRUE if all the items in list 1 occur in list 2. So the example above should return TRUE
As you can see, I can't use SequenceEqual.
Any ideas?
EDIT:
list2 is actually not a list; it is a column in SQL that has the following values:
<id>673</id><id>698</id><id>735</id><id>1118</id><id>1120</id><id>25353</id>.
In LINQ I did the following queries, thanks to Jon Skeet's help:
var query = from e in db
            where e.taxonomy_parent_id == 722
            select e.taxonomy_item_id;
query is an IQueryable of longs at this moment.
var query2 = from e in db
             where query.Contains(e.taxonomy_item_id)
             where !lsTaxIDstring.Except(e.taxonomy_ids.Replace("<id>", "")
                                                       .Replace("</id>", "")
                                                       .Split(',').ToList())
                                 .Any()
             select e.taxonomy_item_id;
But now I am getting the error Local sequence cannot be used in LINQ to SQL implementation of query operators except the Contains() operator.
How about:
if (!list1.Except(list2).Any())
That's about the simplest approach I can think of. You could explicitly create sets etc if you want:
HashSet<int> set2 = new HashSet<int>(list2);
if (list1.All(x => set2.Contains(x)))
but I'd expect that to pretty much be the implementation of Except anyway.
This should be what you want:
!list1.Except(list2).Any()
var result = list1.All(i => list2.Any(i2 => i2 == i));