PyMongo error: 'Cursor' object has no attribute 'find'. How do I filter it? - pymongo

I want to query data from MongoDB once and then split the results into several groups.
The logic is like this:
objs = collection.find({"name": "BOB"})
t_list = ["Math", "History"]
for t in t_list:
    subjects = objs.find({"subjects": t})
    # do_something(subjects)
But PyMongo raises AttributeError: 'Cursor' object has no attribute 'find'.
Can PyMongo do this, i.e. query once and then filter out the data I want?
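A Cursor only iterates over the results of a query; it has no find method, so the filtering has to happen either in the query itself or in Python once the documents are fetched. Below is a minimal sketch of the second option. It assumes each document stores `subjects` as a list of strings (the question does not show the schema), and `group_by_subject` is a hypothetical helper name, not part of PyMongo.

```python
def group_by_subject(docs, wanted_subjects):
    """Bucket already-fetched documents by each wanted subject.

    Assumption: doc["subjects"] is a list of strings (a single string
    is also tolerated). Adjust this if your schema differs.
    """
    wanted = list(wanted_subjects)
    groups = {subject: [] for subject in wanted}
    for doc in docs:  # single pass, so a one-shot cursor works too
        subjects = doc.get("subjects", [])
        if isinstance(subjects, str):
            subjects = [subjects]
        for subject in wanted:
            if subject in subjects:
                groups[subject].append(doc)
    return groups


# With PyMongo the call would look roughly like this (not run here):
#   objs = collection.find({"name": "BOB"})
#   groups = group_by_subject(objs, ["Math", "History"])
# Plain dicts stand in for the query result below.
sample_docs = [
    {"name": "BOB", "subjects": ["Math", "Art"]},
    {"name": "BOB", "subjects": ["History"]},
    {"name": "BOB", "subjects": ["Math", "History"]},
]
groups = group_by_subject(sample_docs, ["Math", "History"])
```

If the result set is large, it may be better to let the server do the filtering with one query per subject, for example collection.find({"name": "BOB", "subjects": t}) inside the loop.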

Related

search for django-objects tagged with all tags in a set

In my Django project I have a search function where you can specify tags, like "apple, banana", and query for objects of a certain model tagged with taggit. When I do:
tag_set = Tag.objects.filter(Q(name__in=tag_list))
query_set = Model.objects.filter(Q(tags__in=tag_set))
This gives me objects tagged with either "apple" OR "banana", but I want the AND operator. I tried:
query_set = Model.objects.filter(reduce(operator.and_, (Q(tags__in=x) for x in tag_set)))
but then I get 'Tag' object is not iterable.
Any help?
You can work with:
queryset = Model.objects.all()
for tag in tag_list:
    queryset = queryset.filter(tags__name=tag)
This will make a JOIN for each tag, and thus eventually the queryset will only contain Model items that have all the necessary tags.
Another approach is counting the number of matched tags, so:
from django.db.models import Count
tag_set = set(tag_list)
Model.objects.filter(tags__name__in=tag_set).alias(
    ntags=Count('tags')
).filter(ntags=len(tag_set))

Query list index with django

I have a problem with Django when I use these two lists:
institution_ids [(2,), (16,)]
project_ids [(3,), (1,)]
in this query:
queryset = Patient.active.filter(tss_id__in=institution_ids, project_id__in=project_ids)
It gives me back all the combinations, but I need this kind of result:
queryset = Patient.active.filter(tss_id=institution_ids[0], project_id=project_ids[0])
queryset += Patient.active.filter(tss_id=institution_ids[1], project_id=project_ids[1])
How can I do this?
Thanks
Giuseppe
What you are looking for is an OR condition.
You can build it with Q objects.
In Django, we cannot directly use the OR operator to filter the
QuerySet. For this implementation, we have to use the Q() object. By
using the Q() object in the filter method, we will be able to use the
OR operator between the Q() objects.
For example, if you want all objects whose tss_id is in institution_ids or whose project_id is in project_ids, you can write:
queryset = Patient.active.filter(Q(tss_id__in=institution_ids) | Q(project_id__in=project_ids))
Note that you need to import Q (from django.db.models import Q).
Another option is to use union().
New in Django 1.11. Uses SQL’s UNION operator to combine the results of two or more QuerySets.
The UNION operator selects only distinct values by default. To allow duplicate values, use the all=True argument.
For example:
queryset = Patient.active.filter(tss_id=institution_ids[0], project_id=project_ids[0])
queryset2 = Patient.active.filter(tss_id=institution_ids[1], project_id=project_ids[1])
combined_queryset = queryset.union(queryset2)
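The question asks for entry i of one list to be matched with entry i of the other, and both lists hold one-element tuples (probably the output of values_list() without flat=True, though that is an assumption). A rough sketch of the pairing step in plain Python follows; `paired_ids` is a hypothetical helper name, and the Django part appears only in comments because it needs a configured project and was not run.

```python
institution_ids = [(2,), (16,)]
project_ids = [(3,), (1,)]


def paired_ids(institution_rows, project_rows):
    """Unwrap one-element tuples and pair them position by position."""
    return [(inst[0], proj[0]) for inst, proj in zip(institution_rows, project_rows)]


pairs = paired_ids(institution_ids, project_ids)

# In Django, the pairs could then be OR-ed together (untested sketch):
#   from functools import reduce
#   from operator import or_
#   from django.db.models import Q
#   condition = reduce(or_, (Q(tss_id=i, project_id=p) for i, p in pairs))
#   queryset = Patient.active.filter(condition)
# reduce() raises TypeError on an empty sequence, so guard against
# empty ID lists before building the condition.
```

This keeps everything in a single query, whereas union() needs one QuerySet per pair.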

SQLAlchemy range overlap query

There are two columns of type datetimerange[].
How can I write the following query in SQLAlchemy?
SQL:
unnest(range_array_1) && unnest(range_array_2)
I tried this approach, but it raises an error:
session.query(table_1.id).filter(func.unnest(table_1.range_array_1).overlaps(func.unnest(table_2.range_array_2)))
AttributeError: Neither 'Function' object nor 'Comparator' object has an attribute 'overlaps'

Passing table name and list of values as argument to psycopg2 query

Context
I would like to pass a table name along with query parameters in a psycopg2 query in a python3 function.
If I understand correctly, I should not format the query string using python .format() method prior to the execution of the query, but let psycopg2 do that.
Issue
I can't manage to pass both the table name and the parameters as arguments to my query.
Code sample
Here is a code sample:
import psycopg2
from psycopg2 import sql
connection_string = "host={} port={} dbname={} user={} password={}".format(*PARAMS.values())
conn = psycopg2.connect(connection_string)
curs = conn.cursor()
table = 'my_customers'
cities = ["Paris", "London", "Madrid"]
data = (table, tuple(cities))
query = sql.SQL("SELECT * FROM {} WHERE city = ANY (%s);")
curs.execute(query, data)
rows = curs.fetchall()
Error(s)
But I get the following error message:
TypeError: not all arguments converted during string formatting
I also tried replacing the data definition with:
data = (sql.Identifier(table), tuple(cities))
But then this error appears:
ProgrammingError: can't adapt type 'Identifier'
If I put ANY {} instead of ANY (%s) in the query string, in both previous cases this error shows:
SyntaxError: syntax error at or near "{"
LINE 1: ...* FROM {} WHERE c...
^
Initially, I didn't use the sql module and tried to pass the data as the second argument to curs.execute(), but then the table name was single-quoted in the command, which caused trouble. So I gave the sql module a try, hoping it is not a deprecated practice.
If possible, I would like to keep the curly braces {} for parameter substitution instead of %s, unless that is a bad idea.
Environment
Ubuntu 18.04 64 bit 5.0.0-37-generic x86_64 GNU/Linux
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
psycopg2.__version__
'2.8.4 (dt dec pq3 ext lo64)'
You want something like:
table = 'my_customers'
cities = ["Paris", "London", "Madrid"]
query = sql.SQL("SELECT * FROM {} WHERE city = ANY (%s)").format(sql.Identifier(table))
curs.execute(query, (cities,))
rows = curs.fetchall()

how to define a constant array and check if a value is in the array for Pig Latin

I want to define an array of user IDs in Pig and then filter data if the userId from the input is NOT in that array.
How do I do this in Pig Latin? Below is an example of what I intend to do.
Thanks
inputData = load '$INPUT' USING PigStorage('|') AS (useriD:chararray,controllerAction:chararray,url:chararray,browserName:chararray,IsMobile:chararray,exceptionDetails:chararray,renderTime:int,serviceHostId:int,auditEventTime:chararray);
filteredInput = filter inputData by controllerAction is not null and auditEventTime is not null and serviceHostId is not null and renderTime is not null and useriD in ('2be2df06-f4ba-4d87-8938-09d867d3f2fe','ac1ac6bf-d151-49fc-8c7c-2b52d2efbb58','f00aec16-36e5-46ae-b7cb-a0f1eeefe609','258890f9-102a-4f8e-a001-ae24d2e25269','cf221779-a077-472c-b377-cca4a9230e1b');
Thanks, Murali. I tried the approach of declaring a variable and then using FLATTEN and STRSPLIT to join. However, I get the following error:
Syntax error, unexpected symbol at or near 'flatteneduserids'
%declare REQUIRED_USER_IDS 'xxxxx,yyyyy,sssss' ;
inputData = load '$INPUT' USING PigStorage('|') AS (useriD:chararray,controllerAction:chararray,url:chararray,browserName:chararray,IsMobile:chararray,exceptionDetails:chararray,renderTime:int,serviceHostId:int,auditEventTime:chararray);
filteredInput = filter inputData by controllerAction is not null and auditEventTime is not null and serviceHostId is not null and renderTime is not null;
flatteneduserids = FLATTEN(STRSPLIT('$REQUIRED_USER_IDS',',')) AS (uid:chararray);
useridfilter = JOIN filteredInput BY useriD, flatteneduserids BY uid USING 'replicated';
So I tried another way of declaring flatteneduserids, which results in the error Undefined alias: IN
flatteneduserids = FOREACH IN GENERATE FLATTEN(STRSPLIT('$REQUIREDUSERIDS',',')) AS (uid:chararray);
I had a similar use case. I tried declaring the constant value with %define and referencing it inside the IN clause, but could not get it to work. (Refer: Declare a comma seperated string constant)
A thought worth contemplating...
If the values inside the IN clause are static reference data or metadata, I would suggest keeping them in a static file. We can then read that file at run time and do an inner join with the input data to retrieve the matching records.
input_data = LOAD '$INPUT' USING PigStorage('|') AS (user_id:chararray ...);
static_data = LOAD ... AS (req_user_id:chararray);
required_data = JOIN input_data BY user_id, static_data BY req_user_id USING 'replicated';
required_data_fmt = -- project required fields.
I was not able to figure out how to do this in memory.
So, as per Murali's suggestion, I added the user IDs to a file, loaded the file, and then did a join. That worked as expected for mr