django.db.utils.NotSupportedError: FOR UPDATE cannot be applied to the nullable side of an outer join

I've found this error in the server log and can't replicate the problem.
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/models/query.py", line 250, in __len__
self._fetch_all()
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/models/query.py", line 1186, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/models/query.py", line 54, in __iter__
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1065, in execute_sql
cursor.execute(sql, params)
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 100, in execute
return super().execute(sql, params)
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 68, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 77, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 85, in _execute
return self.cursor.execute(sql, params)
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/home/futilestudio/.venvs/36venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 85, in _execute
return self.cursor.execute(sql, params)
django.db.utils.NotSupportedError: FOR UPDATE cannot be applied to the nullable side of an outer join
I think it happens only on a PostgreSQL database; I tried SQLite before and it worked.
The problem is here:
match, created = Match.objects.update_or_create(
    match_id=draft_match.match_id,
    defaults={
        field.name: getattr(temp_match, field.name, None)
        for field in Match._meta.fields
        if field.name not in ['id', 'pk']
    }
)
Is there a problem with the attributes, or must I avoid update_or_create and do it another way?
EDIT
In [18]: match, created = Match.objects.update_or_create(match_id=draft_match.match_id, defaults={
...: field.name: getattr(temp_match, field.name, None) for field in Match._meta.fields if
...: not field.name in ['id', 'pk','created','modified','home_team','away_team']})
returns the same error, so I checked the defaults: none of them is a reverse ForeignKey. There are only two ForeignKeys, and even when I exclude them it raises the same error.
In [22]: {
    ...:     field.name: (getattr(temp_match, field.name, None), field) for field in Match._meta.fields if
    ...:     not field.name in ['id', 'pk', 'created', 'modified']}
Out[22]:
{'match_url': ('https://cestohlad.eu/sport-kansas-city-monterrey/',
<django.db.models.fields.URLField: match_url>),
'match_id': ('4ps3utZN', <django.db.models.fields.CharField: match_id>),
'datetime': (datetime.datetime(2019, 4, 12, 3, 0),
<django.db.models.fields.DateTimeField: datetime>),
'home_team': (<Team: Sporting Kansas City (USA) (CURGfJWt)>,
<django.db.models.fields.related.ForeignKey: home_team>),
'away_team': (<Team: Monterrey (Mex) (Ya23C2Zs)>,
<django.db.models.fields.related.ForeignKey: away_team>),
'home_score': (2,
<django.db.models.fields.PositiveSmallIntegerField: home_score>),
'away_score': (5,
<django.db.models.fields.PositiveSmallIntegerField: away_score>),
'home_odds': (Decimal('0.6215'),
<django.db.models.fields.DecimalField: home_odds>),
'away_odds': (Decimal('0.4850'),
<django.db.models.fields.DecimalField: away_odds>),
'under_odds': (Decimal('2.02'),
<django.db.models.fields.DecimalField: under_odds>),
'over_odds': (Decimal('1.84'),
<django.db.models.fields.DecimalField: over_odds>),
'total': (Decimal('2.75'), <django.db.models.fields.DecimalField: total>),
'total_real': (Decimal('7.00'),
<django.db.models.fields.DecimalField: total_real>),
'correct': (False, <django.db.models.fields.BooleanField: correct>),
'home_odds_raw': (Decimal('0.4464'),
<django.db.models.fields.DecimalField: home_odds_raw>),
'draw_odds_raw': (Decimal('0.2817'),
<django.db.models.fields.DecimalField: draw_odds_raw>),
'away_odds_raw': (Decimal('0.3484'),
<django.db.models.fields.DecimalField: away_odds_raw>)}

Related

Julia and dbscan clustering: how extract elements from resulting structure?

Warning: this is from a julia n00b!
After performing dbscan on a point-coordinate array in Julia (note that this is not the 'distance based method' that returns 'assignments' as part of the result structure, but the 'adjacency list' method; documentation here), I attempt to access the vector containing the indices, but I am at a loss when trying to retrieve the members of individual clusters:
dbr = dbscan(pointcoordinates, .1, min_neighbors = 10, min_cluster_size = 10)
13-element Array{DbscanCluster,1}:
DbscanCluster(17, [4, 12, 84, 90, 94, 675, 676, 737, 873, 965], [27, 108, 177, 880, 954, 1050, 1067])
DbscanCluster(10, Int64[], [46, 48, 51, 57, 188, 225, 226, 228, 270, 542])
DbscanCluster(11, [48, 51, 228], [46, 49, 57, 188, 225, 226, 270, 542])
DbscanCluster(14, [418, 759, 832, 988, 1046], [830, 831, 855, 865, 989, 991, 996, 1021, 1070])
DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069])
It is easy to retrieve a single cluster from the array:
> dbr[1]
DbscanCluster(17, [4, 12, 84, 90, 94, 675, 676, 737, 873, 965], [27, 108, 177, 880, 954, 1050, 1067])
But how do I get the stuff inside DbscanCluster?
a = dbr[1]
DbscanCluster(17, [4, 12, 84, 90, 94, 675, 676, 737, 873, 965], [27, 108, 177, 880, 954, 1050, 1067])
In [258]:
a[1]
MethodError: no method matching getindex(::DbscanCluster, ::Int64)
Thank you for your help, and sorry if I am missing something glaring!
What makes you say that DbscanCluster is a child of array?
julia> DbscanCluster <: AbstractArray
false
You might be confused by Array{DbscanCluster,1} in your result, but this just tells you that the object returned by the dbscan call is an Array the elements of which are of type DbscanCluster - this does not tell you anything about whether those elements themselves are subtypes of Array.
As for how to get the indexes, the docs for DbscanResult show that the type has three fields:
seeds::Vector{Int}: indices of cluster starting points
assignments::Vector{Int}: vector of cluster indices, giving the cluster each point was assigned to
counts::Vector{Int}: cluster sizes (number of assigned points)
each of which you can access with dot notation, e.g. dbr[1].assignments.
If you want to get, say, the counts for all 13 clusters in your results, you can broadcast getproperty like so:
getproperty.(dbr, :counts)
Note that counts does not exist in the case of the "adjacency lists" method of dbscan; there one can instead use:
getproperty.(dbr, :core_indices)

Why do I get the error "Ref expected, Object provided"?

With the Python API, I've created a document in the collection "spells" as follows:
>>> client.query(
... q.create(
... q.collection("spells"),
... {
... "data": {"name": "Mountainous Thunder", "element": "air", "cost": 15}
... }
... ))
{'ref': Ref(id=243802653698556416, collection=Ref(id=spells, collection=Ref(id=collections))), 'ts': 1568767179200000, 'data': {'name': 'Mountainous Thunder', 'element': 'air', 'cost': 15}}
Then, I've tried to get the document with its ts as follows:
>>> client.query(q.get(q.ref(q.collection("spells", "1568767179200000"))))
But the result is the error "Ref expected, Object provided":
>>> client.query(q.get(q.ref(q.collection("spells", "1568767179200000"))))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/dist-packages/faunadb/client.py", line 175, in query
return self._execute("POST", "", _wrap(expression), with_txn_time=True)
File "/usr/local/lib/python3.6/dist-packages/faunadb/client.py", line 242, in _execute
FaunaError.raise_for_status_code(request_result)
File "/usr/local/lib/python3.6/dist-packages/faunadb/errors.py", line 28, in raise_for_status_code
raise BadRequest(request_result)
faunadb.errors.BadRequest: Ref expected, Object provided.
I've no idea what was wrong, any suggestions are welcome!
I've solved this myself: I had passed both parameters to q.collection, when the document id belongs to q.ref.
The correct params are as follows:
>>> client.query(q.get(q.ref(q.collection("spells"),"243802585534824962")))
{'ref': Ref(id=243802585534824962, collection=Ref(id=spells, collection=Ref(id=collections))), 'ts': 1568767114140000, 'data': {'name': 'Mountainous Thunder', 'element': 'air', 'cost': 15}}
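In hindsight the confusion is easy to make: the long number inside Ref(id=...) is the document id and belongs to q.ref, while ts is the transaction timestamp in microseconds. If the goal is to read a document as of a given ts, the driver's q.get can take it separately; a small sketch, where the ts keyword is my reading of the faunadb Python driver's get(ref_, ts=None) signature:
# The document id goes into q.ref; the ts (microseconds) is a snapshot
# marker that q.get accepts as an optional argument.
doc = client.query(
    q.get(
        q.ref(q.collection("spells"), "243802585534824962"),
        ts=1568767114140000,  # assumption: read the document as of this ts
    )
)
print(doc)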

Querying a column with lists in it

I have a dataframe with columns with lists in them. How can I query these?
>>> df1.shape
(1812871, 7)
>>> df1.dtypes
CHROM object
POS int32
ID object
REF object
ALT object
QUAL int8
FILTER object
dtype: object
>>> df1.head()
CHROM POS ID REF ALT QUAL FILTER
0 20 60343 rs527639301 G [A] 100 [PASS]
1 20 60419 rs538242240 A [G] 100 [PASS]
2 20 60479 rs149529999 C [T] 100 [PASS]
3 20 60522 rs150241001 T [TC] 100 [PASS]
4 20 60568 rs533509214 A [C] 100 [PASS]
>>> df2 = df1.head(30)
>>> df3 = df1.head(3000)
I found a previous question, but the solutions do not quite work for me. The accepted solution does not work:
>>> df2[df2.ALT.apply(lambda x: x == ['TC'])]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2682, in __getitem__
return self._getitem_array(key)
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2726, in _getitem_array
indexer = self.loc._convert_to_indexer(key, axis=1)
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/indexing.py", line 1314, in _convert_to_indexer
indexer = check = labels.get_indexer(objarr)
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3259, in get_indexer
indexer = self._engine.get_indexer(target._ndarray_values)
File "pandas/_libs/index.pyx", line 301, in pandas._libs.index.IndexEngine.get_indexer
File "pandas/_libs/hashtable_class_helper.pxi", line 1544, in pandas._libs.hashtable.PyObjectHashTable.lookup
TypeError: unhashable type: 'numpy.ndarray'
The reason being, the booleans get nested:
>>> df2.ALT.apply(lambda x: x == ['TC']).head()
0 [False]
1 [False]
2 [False]
3 [True]
4 [False]
Name: ALT, dtype: object
So I tried the second answer, which seemed to work:
>>> c = np.empty(1, object)
>>> c[0] = ['TC']
>>> df2[df2.ALT.values == c]
CHROM POS ID REF ALT QUAL FILTER
3 20 60522 rs150241001 T [TC] 100 [PASS]
But strangely, it doesn't work when I try it on the larger dataframe:
>>> df3[df3.ALT.values == c]
Traceback (most recent call last):
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3078, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2688, in __getitem__
return self._getitem_column(key)
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2695, in _getitem_column
return self._get_item_cache(key)
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2489, in _get_item_cache
values = self._data.get(item)
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "/home/user/miniconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3080, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False
Which is probably because the result of the boolean comparison is different!
>>> df3.ALT.values == c
False
>>> df2.ALT.values == c
array([False, False, False, True, False, False, False, False, False,
False, False, False, False, False, False, False, False, False,
False, False, False, False, False, False, False, False, False,
False, False, False])
This is completely baffling to me.
I found that a hacky solution of casting the lists to tuples works for me:
import pandas as pd

df = pd.DataFrame({'CHROM': [20] * 5,
                   'POS': [60343, 60419, 60479, 60522, 60568],
                   'ID': ['rs527639301', 'rs538242240', 'rs149529999', 'rs150241001', 'rs533509214'],
                   'REF': ['G', 'A', 'C', 'T', 'A'],
                   'ALT': [['A'], ['G'], ['T'], ['TC'], ['C']],
                   'QUAL': [100] * 5,
                   'FILTER': [['PASS']] * 5})
df['ALT'] = df['ALT'].apply(tuple)
df[df['ALT'] == ('C',)]
This method works because tuples are immutable and hashable, so pandas compares each cell as a whole value and yields one Boolean per row, instead of the intra-list elementwise comparison that produced the nested Boolean series; lists are not hashable, which is also what broke the original lookup.
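A related note, hedged because it depends on how ALT was built (e.g. parsed from a VCF): if the cells are numpy arrays rather than plain lists, x == ['TC'] broadcasts elementwise, which is exactly the nested-Boolean behaviour shown above. Normalising each cell to a list makes the comparison return a single bool per row:
import pandas as pd

df = pd.DataFrame({'ALT': [['A'], ['G'], ['T'], ['TC'], ['C']]})

# list == list is a single bool, so the mask is a plain Boolean Series
# even if the original cells were arrays (list(x) normalises them).
mask = df['ALT'].apply(lambda x: list(x) == ['TC'])
print(df[mask])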

Pandas and timeseries

I have a dictionary of dataframes. I want to convert each dataframe in it to its respective timeseries. I am able to convert one nicely, but if I do it inside a loop, it complains. E.g.:
This works:
df = dfDict[4]
df['start_date'] = pd.to_datetime(df['start_date'])
df.set_index('start_date', inplace = True)
df.sort_index(inplace = True)
print df.head() works nicely.
But, this doesn't work:
tsDict = {}
for id, df in dfDict.iteritems():
    df['start_date'] = pd.to_datetime(df['start_date'])
    df.set_index('start_date', inplace=True)
    df.sort_index(inplace=True)
    tsDict[id] = df
It gives the following error message:
Traceback (most recent call last):
File "tsa.py", line 105, in <module>
main()
File "tsa.py", line 84, in main
df['start_date'] = pd.to_datetime(df['start_date'])
File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1997, in __getitem__
return self._getitem_column(key)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2004, in _getitem_column
return self._get_item_cache(key)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 1350, in _get_item_cache
values = self._data.get(item)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 3290, in get
loc = self.items.get_loc(item)
File "/usr/local/lib/python2.7/dist-packages/pandas/indexes/base.py", line 1947, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)
File "pandas/index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)
File "pandas/hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)
File "pandas/hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)
KeyError: 'start_date'
I am unable to see the subtle problem here...
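One guess, since the traceback fires on df['start_date'] inside the loop: set_index(..., inplace=True) removes start_date as a column, so any frame the loop visits twice (the same object stored under two keys, or a re-run in the same interpreter session) raises exactly this KeyError on the second visit. A defensive sketch under that assumption:
tsDict = {}
for id, df in dfDict.iteritems():
    # Skip frames that were already converted: after set_index(...,
    # inplace=True), 'start_date' is the index, not a column.
    if 'start_date' in df.columns:
        df['start_date'] = pd.to_datetime(df['start_date'])
        df.set_index('start_date', inplace=True)
        df.sort_index(inplace=True)
    tsDict[id] = df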

How to avoid error "Cannot compare type 'Timestamp' with type 'str'" pandas 0.16.0

I have various dataframes with this format
df.index
<class 'pandas.tseries.index.DatetimeIndex'>
[2009-10-23, ..., 2010-06-15]
Length: 161, Freq: None, Timezone: None
df.columns
Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')
Every now and then, at the execution of this line:
zeros_idx = df[ (df.A==0) | (df.B==0) | (df.C==0) | (df.D==0) ].index
I get the following error with this stack trace:
zeros_idx = df[ (df.A==0) | (df.B==0) | (df.C==0) | (df.D==0) ].index
File "/usr/lib64/python3.4/site-packages/pandas/core/ops.py", line 811, in f
return self._combine_series(other, na_op, fill_value, axis, level)
File "/usr/lib64/python3.4/site-packages/pandas/core/frame.py", line 3158, in _combine_series
return self._combine_match_columns(other, func, level=level, fill_value=fill_value)
File "/usr/lib64/python3.4/site-packages/pandas/core/frame.py", line 3191, in _combine_match_columns
left, right = self.align(other, join='outer', axis=1, level=level, copy=False)
File "/usr/lib64/python3.4/site-packages/pandas/core/generic.py", line 3143, in align
fill_axis=fill_axis)
File "/usr/lib64/python3.4/site-packages/pandas/core/generic.py", line 3225, in _align_series
return_indexers=True)
File "/usr/lib64/python3.4/site-packages/pandas/core/index.py", line 1810, in join
return_indexers=return_indexers)
File "/usr/lib64/python3.4/site-packages/pandas/tseries/index.py", line 904, in join
return_indexers=return_indexers)
File "/usr/lib64/python3.4/site-packages/pandas/core/index.py", line 1820, in join
return_indexers=return_indexers)
File "/usr/lib64/python3.4/site-packages/pandas/core/index.py", line 1830, in join
return_indexers=return_indexers)
File "/usr/lib64/python3.4/site-packages/pandas/core/index.py", line 2083, in _join_monotonic
join_index, lidx, ridx = self._outer_indexer(sv, ov)
File "pandas/src/generated.pyx", line 8558, in pandas.algos.outer_join_indexer_object (pandas/algos.c:157803)
File "pandas/tslib.pyx", line 823, in pandas.tslib._Timestamp.__richcmp__ (pandas/tslib.c:15585)
TypeError: Cannot compare type 'Timestamp' with type 'str'
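Two hedged guesses, since the frames themselves aren't shown. The traceback goes through _combine_series, which means one side of the | was a DataFrame rather than a Series; that happens when a column label is duplicated (df.A then returns a DataFrame), and the alignment then outer-joins the string column labels against the DatetimeIndex, i.e. Timestamp against str. A stray string hiding among the Timestamps of the index would trip the same comparison. Quick checks for both:
import pandas as pd

# 1) Duplicated column labels: any labels printed here are suspects.
print(df.columns[df.columns.duplicated()])

# 2) Non-Timestamp entries lurking in the supposedly-datetime index:
print([x for x in df.index if not isinstance(x, pd.Timestamp)])

# If strings crept into the index, re-parsing usually repairs it:
df.index = pd.to_datetime(df.index)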