How to randomly sample pairs of elements from a masked numpy array until the exhaustion of elements without any repetition? - numpy

I have an array of integer indices (e.g. arr_F_idx = np.arange(0,200)) and I would like to draw 100 pairs from it without repetition. I am using a masked array (I mask the drawn elements of arr_F_idx after each draw, as shown in the code below), but numpy.random.choice still seems to draw the masked elements.
arr_F_idx = np.arange(0,200)
draw = np.random.choice(arr_F_idx,2,replace=False)
arr2_Drawn_pairs[0,0] = draw[0]
arr2_Drawn_pairs[0,1] = draw[1]
arr_F_idx_dum1 = np.array(arr_F_idx == draw[0])
arr_F_idx = np.ma.array(arr_F_idx, mask = arr_F_idx_dum1)
arr_F_idx_dum2 = np.array(arr_F_idx == draw[1])
arr_F_idx = np.ma.array(arr_F_idx, mask = arr_F_idx_dum2)
for i in range(1,100):
    draw = np.random.choice(arr_F_idx,2,replace=False)
    arr2_Drawn_pairs[i,0] = draw[0]
    arr2_Drawn_pairs[i,1] = draw[1]
    arr_F_idx_dum1 = np.array(arr_F_idx == draw[0])
    arr_F_idx = np.ma.array(arr_F_idx, mask = arr_F_idx_dum1)
    arr_F_idx_dum2 = np.array(arr_F_idx == draw[1])
    arr_F_idx = np.ma.array(arr_F_idx, mask = arr_F_idx_dum2)
The (sample) output that I get is
arr_F_idx
masked_array(data=[--, --, --, --, 4, 5, --, --, --, --, --, 11, 12, 13,
--, --, 16, 17, --, 19, --, 21, 22, --, 24, --, --, --,
28, --, --, --, 32, 33, 34, --, --, 37, 38, 39, --, 41,
--, --, --, --, --, --, 48, --, --, --, --, --, --, --,
--, 57, 58, --, 60, --, --, 63, 64, --, --, --, --, --,
--, --, --, 73, --, --, 76, --, 78, --, --, --, --, --,
--, --, --, --, 88, 89, --, --, 92, --, --, --, --, --,
98, 99, --, 101, 102, --, --, --, 106, --, --, --, --,
111, --, --, --, --, 116, --, --, --, --, 121, --, 123,
124, 125, --, 127, --, --, --, 131, --, --, --, 135,
--, --, --, --, --, 141, --, --, --, --, --, --, --,
--, --, 151, --, --, --, 155, --, --, --, 159, --, 161,
--, --, --, 165, --, 167, --, 169, --, 171, --, --, --,
--, 176, --, 178, 179, --, --, --, 183, --, 185, --,
--, --, 189, 190, --, --, 193, 194, --, 196, --, 198,
199],
mask=[ True, True, True, True, False, False, True, True,
True, True, True, False, False, False, True, True,
False, False, True, False, True, False, False, True,
False, True, True, True, False, True, True, True,
False, False, False, True, True, False, False, False,
True, False, True, True, True, True, True, True,
False, True, True, True, True, True, True, True,
True, False, False, True, False, True, True, False,
False, True, True, True, True, True, True, True,
True, False, True, True, False, True, False, True,
True, True, True, True, True, True, True, True,
False, False, True, True, False, True, True, True,
True, True, False, False, True, False, False, True,
True, True, False, True, True, True, True, False,
True, True, True, True, False, True, True, True,
True, False, True, False, False, False, True, False,
True, True, True, False, True, True, True, False,
True, True, True, True, True, False, True, True,
True, True, True, True, True, True, True, False,
True, True, True, False, True, True, True, False,
True, False, True, True, True, False, True, False,
True, False, True, False, True, True, True, True,
False, True, False, False, True, True, True, False,
True, False, True, True, True, False, False, True,
True, False, False, True, False, True, False, False],
fill_value=999999)
The same happens for smaller ranges. For very small ranges it is not a problem, but as mentioned, I want to exhaust the original array until all of its elements are masked.
The problem seems to be that np.random.choice still draws the masked elements, even though I expected it not to (otherwise I do not see the point of a masked array). I may be doing something wrong. I would appreciate help with this, and also any simpler way to draw pairs without repetition both across and within pairs.
Edit: In fact, np.random.choice does draw masked elements, as can be seen in the output (e.g. number 177 is drawn twice and 191 three times):
arr2_Drawn_pairs
Out[53]:
array([[ 20., 49.],
[ 35., 114.],
[ 44., 42.],
[ 52., 140.],
[191., 59.], # 191 - the first time
[147., 144.],
[ 74., 143.],
[ 23., 43.],
[130., 1.],
[146., 166.],
[ 62., 80.],
[ 26., 138.],
[152., 71.],
[ 50., 87.],
[ 69., 9.],
[ 20., 65.],
[ 3., 162.],
[ 30., 104.],
[168., 145.],
[154., 54.],
[129., 2.],
[ 79., 170.],
[ 14., 188.],
[107., 30.],
[119., 188.],
[139., 94.],
[132., 158.],
[ 0., 69.],
[ 47., 27.],
[192., 72.],
[181., 160.],
[ 95., 162.],
[ 40., 25.],
[107., 8.],
[128., 10.],
[ 7., 83.],
[ 91., 173.],
[174., 10.],
[134., 82.],
[ 67., 52.],
[195., 172.],
[197., 96.],
[ 15., 188.],
[184., 164.],
[ 18., 180.],
[ 45., 27.],
[ 86., 84.],
[ 97., 128.],
[149., 6.],
[109., 85.],
[182., 62.],
[ 53., 68.],
[157., 81.],
[188., 25.],
[107., 45.],
[117., 86.],
[195., 47.],
[105., 103.],
[ 51., 162.],
[187., 162.],
[ 70., 97.],
[ 29., 156.],
[175., 177.],
[ 0., 10.],
[ 87., 46.],
[ 1., 119.],
[ 93., 90.],
[174., 53.],
[ 77., 84.],
[ 84., 66.],
[ 91., 186.],
[ 83., 59.],
[137., 140.],
[136., 186.],
[100., 195.],
[173., 81.],
[120., 115.],
[ 36., 46.],
[112., 148.],
[118., 103.],
[ 8., 128.],
[ 56., 65.],
[158., 145.],
[180., 122.],
[142., 126.],
[133., 45.],
[ 59., 173.],
[110., 119.],
[177., 31.], # 177 - the first time!
[ 82., 158.],
[ 53., 113.],
[ 85., 150.],
[126., 94.],
[ 61., 152.],
[ 93., 40.],
[ 1., 55.],
[ 96., 162.],
[153., 108.],
[163., 9.],
[ 75., 50.],
[101., 47.],
[178., 148.],
[188., 183.],
[ 69., 177.], # 177 - the second time!
[141., 16.],
[ 31., 28.],
[106., 147.],
[ 66., 176.],
[156., 96.],
[ 9., 21.],
[139., 57.],
[106., 11.],
[ 25., 2.],
[152., 69.],
[ 34., 169.],
[148., 191.], # 191 - the second time!
[105., 32.],
[187., 156.],
[105., 191.], # 191 - the third time!
[ 53., 128.],
[ 56., 30.],
[176., 7.],
[168., 150.],
[ 48., 101.],
[105., 167.]])
Edit 2: A brute-force way to get the result I want is perhaps to build a new array to sample from after every draw,
arr_F_idx_iter = np.append(arr_F_idx[:draw[0]], arr_F_idx[draw[0]+1:])
but I think the question of whether this can be done efficiently with masked arrays is still legitimate, since it would be quicker, and it may also point to a flaw in how masked arrays work.

A simple 1d masked array:
In [28]: m = np.ma.masked_array(np.arange(10), mask=np.random.randint(0,2,10))
In [29]: m
Out[29]:
masked_array(data=[--, --, 2, 3, --, --, 6, 7, --, --],
mask=[ True, True, False, False, True, True, False, False,
True, True],
fill_value=999999)
choice has no special knowledge of masked arrays, so it draws from the underlying data attribute:
In [31]: np.random.choice(m,3, replace=False)
Out[31]: array([4, 3, 8])
In [32]: np.random.choice(m.data,3, replace=False)
Out[32]: array([1, 8, 5])
If you want it to draw only from the unmasked elements, you need to give it an array containing just those, i.e. the compressed one:
In [33]: np.random.choice(m.compressed(),3, replace=False)
Out[33]: array([2, 6, 3])
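Applied to the loop in the question, one possible fix is to sample from compressed() on every iteration and mask the drawn entries afterwards. This is a minimal sketch rather than the asker's code: the names rng, pool and pairs are made up for illustration, and masking by position relies on each value being equal to its index, as it is for np.arange(200).

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# Pool of indices with an explicit all-False mask (nothing drawn yet).
pool = np.ma.array(np.arange(200), mask=np.zeros(200, dtype=bool))
pairs = np.empty((100, 2), dtype=int)

for i in range(100):
    # compressed() returns a plain ndarray holding only the unmasked values,
    # so choice can never pick something that was drawn before.
    draw = rng.choice(pool.compressed(), 2, replace=False)
    pairs[i] = draw
    # Assumption: value == position, so the drawn values double as indices.
    pool[draw] = np.ma.masked
```

After the loop every element of pool is masked and each index appears in pairs exactly once.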
In general, np functions are not mask-aware and do not work correctly on masked arrays. That's why there is a large set of np.ma functions (and ma methods). Often those functions use compressed() to get the unmasked values, or they replace masked values with some "innocent" fill value.
In [35]: m.data
Out[35]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [36]: m.compressed()
Out[36]: array([2, 3, 6, 7])
In [37]: m.filled()
Out[37]:
array([999999, 999999, 2, 3, 999999, 999999, 6, 7,
999999, 999999])
In [38]: m.filled(0)
Out[38]: array([0, 0, 2, 3, 0, 0, 6, 7, 0, 0])
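If masked arrays are not a requirement, the simpler route asked about in the question may be to shuffle the indices once and cut the result into pairs. A minimal sketch, assuming an even number of elements; rng and pairs are illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
arr_F_idx = np.arange(0, 200)

# A random permutation uses every element exactly once, so consecutive
# entries form 100 pairs with no repetition within or across pairs.
pairs = rng.permutation(arr_F_idx).reshape(-1, 2)
```

This avoids the per-draw bookkeeping entirely, at the cost of producing all pairs up front.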

Related

How can I get the row of the first True find in a numpy matrix?

I have the following matrix defined:
d = np.array(
[[False, False, False, False, False, True],
[False, False, False, False, False, True],
[False, False, False, False, True, True],
[False, False, False, False, True, True],
[False, False, False, True, True, True],
[False, False, False, True, True, True],
[False, False, True, True, True, True],
[False, False, True, True, True, True],
[False, True, True, True, True, True],
[False, True, True, True, True, True],
[ True, True, True, True, True, True],
[ True, True, True, True, True, True],
[ True, True, True, True, True, True],
[False, True, True, True, True, True],
[False, False, True, True, True, True],
[False, False, False, True, True, True],
[False, False, False, False, True, True],
[False, False, False, False, False, True],
[False, False, False, False, True, True],
[False, False, False, True, True, True],
[False, False, True, True, True, True],
[False, True, True, True, True, True],
[ True, True, True, True, True, True]])
And I would like to get a vector of length 6 containing the index of the first True occurrence in each column.
So the expected output would be:
fo = np.array([10, 8, 6, 4, 2, 0])
If a column contains no True values, it should ideally return NaN for that column.
I have tried:
np.sum(d, axis=0)
array([ 4, 8, 12, 16, 20, 23])
which, together with the length of the columns, would give the index, but that only works if each column has exactly two contiguous regions, one of False followed by one of True.
You can do this with argmax, which returns the index of the first True in each column. argmax also returns 0 for a column that is entirely False, so find those columns separately and overwrite their result. E.g., if the first column were all False:
ini = np.argmax(d == 1, 0)  # [0 8 6 4 2 0]; the leading 0 is only a placeholder for the all-False column
sec = (d == 0).all(0)       # True for columns that contain only False
ini[sec] = 1000             # sentinel value; to store NaN instead, convert first with ini.astype(float) (or object)
# [1000 8 6 4 2 0]
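The same idea with NaN as the marker for empty columns could be sketched as follows. The function name first_true_per_column is hypothetical, and the result is a float array because NaN cannot be stored in an integer array.

```python
import numpy as np

def first_true_per_column(mask):
    """Row index of the first True in each column, NaN where a column has none."""
    mask = np.asarray(mask, dtype=bool)
    # argmax returns the first maximum, i.e. the first True (or 0 if all False).
    first = mask.argmax(axis=0).astype(float)
    # Overwrite the misleading 0 for columns without any True.
    first[~mask.any(axis=0)] = np.nan
    return first

example = np.array([[False, False, True],
                    [False, True,  True],
                    [False, False, False]])
result = first_true_per_column(example)  # column 0 has no True at all
```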
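Alternatively, we can iterate through the columns of the NumPy array (the rows of d.T). NumPy arrays have no .index() method, so each column is converted to a list first. If True is in that list, .index() gives the position of its first occurrence; otherwise NaN is stored.
index_list = []
for column in d.T:
    values = column.tolist()
    if True in values:
        index_list.append(values.index(True))
    else:
        index_list.append(np.nan)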

Plotly Animation with slider

I want to add two moving points that represent the locations of two trains on each day. My day data, shown in the picture, runs from 0 to 7. However, in the resulting animation the slider does not stop on the integer days; it jumps from 1.75 to 2.25 or from 2.75 to 3.25 automatically. Can anyone help me solve this?
trainpath info
import plotly.graph_objects as go
import pandas as pd
dataset = pd.read_csv('trainpath.csv')
days = []
for k in range(len(dataset['day'])):
    if dataset['day'][k] not in days:
        days.append(dataset['day'][k])
t1 = [-1, 0, 1, 1, 1, 0, -1, -1, -1]
k1 = [-20, -20, -20, 0, 20, 20, 20, 0, -20]
# make list of trains
trains = []
for train in dataset["train"]:
    if train not in trains:
        trains.append(train)
# make figure
fig_dict = {
    "data": [go.Scatter(x=t1, y=k1,
                        mode="lines",
                        line=dict(width=2, color="blue")),
             go.Scatter(x=t1, y=k1,
                        mode="lines",
                        line=dict(width=2, color="blue"))],
    "layout": {},
    "frames": []
}
# fill in most of layout
fig_dict['layout']['title'] = {'text':'Train Animation'}
fig_dict["layout"]["xaxis"] = {"range": [-10, 10], "title": "xlocation", 'autorange':False, 'zeroline':False}
fig_dict["layout"]["yaxis"] = {"range": [-22, 22], "title": "ylocation", 'autorange':False, 'zeroline':False}
fig_dict["layout"]["hovermode"] = "closest"
fig_dict["layout"]["updatemenus"] = [
    {
        "buttons": [
            {
                "args": [None, {"frame": {"duration": 500, "redraw": False},
                                "fromcurrent": True, "transition": {"duration": 300,
                                                                    "easing": "quadratic-in-out"}}],
                "label": "Play",
                "method": "animate"
            },
            {
                "args": [[None], {"frame": {"duration": 0, "redraw": False},
                                  "mode": "immediate",
                                  "transition": {"duration": 0}}],
                "label": "Pause",
                "method": "animate"
            }
        ],
        "direction": "left",
        "pad": {"r": 10, "t": 87},
        "showactive": False,
        "type": "buttons",
        "x": 0.1,
        "xanchor": "right",
        "y": 0,
        "yanchor": "top"
    }
]
sliders_dict = {
    "active": 0,
    "yanchor": "top",
    "xanchor": "left",
    "currentvalue": {
        "font": {"size": 20},
        "prefix": "Day:",
        "visible": True,
        "xanchor": "right"
    },
    "transition": {"duration": 300, "easing": "cubic-in-out"},
    "pad": {"b": 10, "t": 50},
    "len": 0.9,
    "x": 0.1,
    "y": 0,
    "steps": []
}
# make data
day = 0
for train in trains:
    dataset_by_date = dataset[dataset['day']==day]
    dataset_by_date_and_train = dataset_by_date[dataset_by_date['train']==train]
    data_dict = {
        'x': list(dataset_by_date_and_train['x']),
        'y': list(dataset_by_date_and_train['y']),
        'mode': 'markers',
        'text': train,
        'marker': {
            'sizemode': 'area',
            'sizeref': 20,
            'size': 20,
            # 'size': list(dataset_by_date_and_train['quantity']) # this section can be used to increase or decrease the marker size to reflect the material quantity
        },
        'name': train
    }
    fig_dict['data'].append(data_dict)
# make frames
for day in days:
    frame = {'data': [go.Scatter(x=t1, y=k1,
                                 mode="lines",
                                 line=dict(width=2, color="blue")),
                      go.Scatter(x=t1, y=k1,
                                 mode="lines",
                                 line=dict(width=2, color="blue"))], 'name': str(day)}
    for train in trains:
        dataset_by_date = dataset[dataset['day'] == day]
        dataset_by_date_and_train = dataset_by_date[dataset_by_date['train'] == train]
        data_dict = {
            'x': list(dataset_by_date_and_train['x']),
            'y': list(dataset_by_date_and_train['y']),
            'mode': 'markers',
            'text': train,
            'marker': {
                'sizemode': 'area',
                'sizeref': 20,
                'size': 20,
                # 'size': list(dataset_by_date_and_train['quantity']) # this section can be used to increase or decrease the marker size to reflect the material quantity
            },
            'name': train
        }
        frame['data'].append(data_dict)
    fig_dict['frames'].append(frame)
    slider_step = {'args': [
        [day],
        {'frame': {'duration': 300, 'redraw': False},
         'mode': 'immediate',
         'transition': {'duration': 3000}}
    ],
        'label': day,
        'method': 'animate'}
    sliders_dict["steps"].append(slider_step)
    if day == 7:
        print('H')
fig_dict["layout"]["sliders"] = [sliders_dict]
fig = go.Figure(fig_dict)
fig.show()

Repeat elements from one array based on another

Given
a = np.array([1,2,3,4,5,6,7,8])
b = np.array(['a','b','c','d','e','f','g','h'])
c = np.array([1,1,1,4,4,4,8,8])
where a & b 'correspond' to each other, how can I use c to index b to get d, which 'corresponds' to c:
d = np.array(['a','a','a','d','d','d','h','h'])
I know how to do this by looping
for n in range(a.shape[0]):
    d[n] = b[np.argmax(a==c[n])]
but want to know if I can do this without loops.
Thanks in advance!
Since a here is just position + 1, you can simply use
In [33]: b[c - 1]
Out[33]: array(['a', 'a', 'a', 'd', 'd', 'd', 'h', 'h'], dtype='<U1')
I'm tempted to leave it at that, since the a example isn't enough to distinguish it from the argmax approach.
But we can test all a against all c with:
In [36]: a[:,None]==c
Out[36]:
array([[ True, True, True, False, False, False, False, False],
[False, False, False, False, False, False, False, False],
[False, False, False, False, False, False, False, False],
[False, False, False, True, True, True, False, False],
[False, False, False, False, False, False, False, False],
[False, False, False, False, False, False, False, False],
[False, False, False, False, False, False, False, False],
[False, False, False, False, False, False, True, True]])
In [37]: (a[:,None]==c).argmax(axis=0)
Out[37]: array([0, 0, 0, 3, 3, 3, 7, 7])
In [38]: b[_]
Out[38]: array(['a', 'a', 'a', 'd', 'd', 'd', 'h', 'h'], dtype='<U1')
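The broadcast comparison above builds a len(a) x len(c) boolean matrix. For larger inputs, a lookup through np.searchsorted may be cheaper. This is only a sketch of an alternative, not part of the answer above, and it assumes every value in c actually occurs in a; the names order and pos are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8])
b = np.array(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
c = np.array([1, 1, 1, 4, 4, 4, 8, 8])

# argsort makes the lookup work even when a is not already sorted.
order = np.argsort(a)
# Position of each c value within the sorted view of a, mapped back to a's own indices.
pos = order[np.searchsorted(a, c, sorter=order)]
d = b[pos]
```

If c may contain values missing from a, the positions need an extra check (for example a[pos] == c) before they are used.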

How to create an invoice and invoice lines via code - Odoo13

I am trying to create an invoice and invoice lines via python code.
Here is the code.
def createInvoice(self, date_ref):
    # Skipped some code here
    invoice_values = contract._prepare_invoice(date_ref)
    for line in contract_lines:
        invoice_values.setdefault('invoice_line_ids', [])
        invoice_line_values = line._prepare_invoice_line(
            invoice_id=False
        )
        if invoice_line_values:
            invoice_values['invoice_line_ids'].append(
                (0, 0, invoice_line_values)
            )
    invoices_values.append(invoice_values)
Values for
invoice_values = {'type': 'in_invoice', 'journal_id': 2, 'company_id': 1, 'line_ids': [(6, 0, [])],
'partner_id': 42, 'commercial_partner_id': 42, 'fiscal_position_id': False,
'invoice_payment_term_id': False, 'invoice_line_ids': [(6, 0, [])],
'invoice_partner_bank_id': False, 'invoice_cash_rounding_id': False,
'bank_partner_id': 42, 'currency_id': 130, 'invoice_date': datetime.date(2020, 11, 11),
'invoice_origin': 'Vendor COntract #1', 'user_id': 2, 'old_contract_id': 6}
invoice_line_values = {'move_id': False, 'journal_id': False, 'company_id': False,
'account_id': False, 'name': '[E-COM07] Large Cabinet', 'quantity': 1.0,
'price_unit': 1444.01, 'discount': 0.0, 'partner_id': False,
'product_uom_id': 1, 'product_id': 17, 'payment_id': False,
'tax_ids': [(6, 0, [])], 'analytic_line_ids': [(6, 0, [])],
'display_type': False, 'contract_line_id': 7, 'asset_id': False,
'analytic_account_id': False}
In create function of account move
vals = {'date': datetime.date(2020, 2, 11), 'type': 'in_invoice', 'journal_id': 2,
'company_id': 1, 'currency_id': 130, 'line_ids': [(6, 0, [])], 'partner_id': 42,
'commercial_partner_id': 42, 'fiscal_position_id': False, 'user_id': 2,
'invoice_date': datetime.date(2020, 12, 11), 'invoice_origin': 'Vendor COntract #1',
'invoice_payment_term_id': False,
'invoice_line_ids': [(6, 0, []), (0, 0, {'journal_id': False, 'company_id': False,
'account_id': 109, 'name': '[E-COM07] Large Cabinet', 'quantity': 1.0, 'price_unit': 1444.01,
'discount': 0.0, 'partner_id': False, 'product_uom_id': 1, 'product_id': 17,
'payment_id': False, 'tax_ids': [(6, 0, [19])], 'analytic_line_ids': [(6, 0, [])],
'analytic_account_id': False, 'display_type': False, 'exclude_from_invoice_tab': False,
'contract_line_id': 7, 'asset_id': False}), (0, 0, {'journal_id': False,
'company_id': False, 'account_id': 109, 'name': '[E-COM09] Large Desk', 'quantity': 1.0,
'price_unit': 8118.04, 'discount': 0.0, 'partner_id': False, 'product_uom_id': 1,
'product_id': 19, 'payment_id': False, 'tax_ids': [(6, 0, [19])],
'analytic_line_ids': [(6, 0, [])], 'analytic_account_id': False, 'display_type': False,
'exclude_from_invoice_tab': False, 'contract_line_id': 8, 'asset_id': False})],
'invoice_partner_bank_id': False, 'invoice_cash_rounding_id': False, 'bank_partner_id': 42,
'old_contract_id': 6}
It creates the account_move (invoice) but not the account_move_line records (invoice lines).
What am I missing here?
I finally found the solution:
'line_ids': [(6, 0, [])],
The line above caused the problem. After I removed it from invoice_values, it worked.
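In code, the fix described above amounts to dropping the empty 'line_ids' command from the values dictionary before it is passed on to create. A minimal sketch in plain Python; the helper name strip_empty_line_ids is hypothetical and is not part of Odoo:

```python
def strip_empty_line_ids(invoice_values):
    """Return a copy of the values without an empty 'line_ids' command list."""
    cleaned = dict(invoice_values)  # shallow copy, the caller's dict is untouched
    # [(6, 0, [])] means "replace the relation with an empty set of records".
    if cleaned.get('line_ids') == [(6, 0, [])]:
        cleaned.pop('line_ids')
    return cleaned

# Trimmed-down stand-in for the invoice_values shown in the question.
example_values = {
    'type': 'in_invoice',
    'line_ids': [(6, 0, [])],
    'invoice_line_ids': [(6, 0, []), (0, 0, {'name': '[E-COM07] Large Cabinet'})],
}
cleaned_values = strip_empty_line_ids(example_values)
```

The invoice line commands in 'invoice_line_ids' are left unchanged; only the conflicting key is removed.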

Tensorflow indicator matrix for top n values

Does anyone know how to extract the top n largest values per row of a rank 2 tensor?
For instance, if I wanted the top 2 values of a tensor of shape [2,4] with values:
[[40, 30, 20, 10], [10, 20, 30, 40]]
The desired condition matrix would look like:
[[True, True, False, False],[False, False, True, True]]
Once I have the condition matrix, I can use tf.select to choose actual values.
Thank you for assistance!
You can do it using the built-in tf.nn.top_k function:
a = tf.convert_to_tensor([[40, 30, 20, 10], [10, 20, 30, 40]])
b = tf.nn.top_k(a, 2)
print(sess.run(b))
TopKV2(values=array([[40, 30],
[40, 30]], dtype=int32), indices=array([[0, 1],
[3, 2]], dtype=int32))
print(sess.run(b).values)
array([[40, 30],
[40, 30]], dtype=int32)
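To get boolean True/False values, you can first get the k-th largest value of each row and then use tf.greater_equal:
kth = tf.reduce_min(b.values, axis=1, keepdims=True)  # per-row threshold; a global minimum only works when all rows share it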
top2 = tf.greater_equal(a, kth)
print(sess.run(top2))
array([[ True, True, False, False],
[False, False, True, True]], dtype=bool)
You can also use tf.contrib.framework.argsort:
a = [[40, 30, 20, 10], [10, 20, 30, 40]]
idx = tf.contrib.framework.argsort(a, direction='DESCENDING') # sorted indices
ranks = tf.contrib.framework.argsort(idx, direction='ASCENDING') # ranks
b = ranks < 2
# [[ True True False False] [False False True True]]
Moreover, you can replace 2 with a 1d tensor so that each row/column can have different n values.
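The rank trick (an argsort of the argsort) can be checked without a TensorFlow session. The following is a NumPy analogue, not TensorFlow code; the function name top_n_indicator is made up for illustration, and a 1-D n is reshaped to a column so that it applies per row.

```python
import numpy as np

def top_n_indicator(values, n):
    """Boolean matrix marking the n largest entries in each row."""
    values = np.asarray(values)
    # Column indices sorted by descending value, row by row.
    order = np.argsort(-values, axis=1, kind="stable")
    # Inverting the permutation gives each entry's rank (0 = largest).
    ranks = np.argsort(order, axis=1, kind="stable")
    n = np.asarray(n)
    if n.ndim == 1:
        n = n[:, None]  # one threshold per row
    return ranks < n

a = [[40, 30, 20, 10], [10, 20, 30, 40]]
same_n = top_n_indicator(a, 2)
per_row_n = top_n_indicator(a, [1, 3])
```

Because ranks are unique within a row, exactly n entries are marked per row even when values tie.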