NumPy: return indices using an unknown number of conditions

Consider two arrays, X and Y. X is a 2D data array (a grayscale image), and Y is an array of values by which X needs to be filtered, as follows:
X = np.array([[0,0,0,0,4], [0,1,1,2,3], [1,1,2,2,0], [0,0,2,2,3], [0,0,0,0,0]])
Y = np.array([1,2,3])
X:
[[0 0 0 0 4]
[0 1 1 2 3]
[1 1 2 2 0]
[0 0 2 2 3]
[0 0 0 0 0]]
Y:
[1 2 3]
I need to select the elements/indices of array X based on the values in array Y, such that:
Z = np.argwhere((X == Y[0]) | (X == Y[1]) | (X == Y[2]))
Z:
[[1 1]
[1 2]
[1 3]
[1 4]
[2 0]
[2 1]
[2 2]
[2 3]
[3 2]
[3 3]
[3 4]]
This can be done with a loop over the items of array Y, but is there a NumPy function that achieves this?
It is also achievable using multiple conditions in a single np.argwhere call; however, the number of conditions (the length of array Y) is unknown beforehand.
Thanks

The key is to prepare the correct mask. For that, use numpy.isin:
np.isin(X, Y)
The result is a boolean mask with the same shape as X. You can then get the indices with an appropriate function, such as np.argwhere or np.nonzero.
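As a minimal sketch of the full round trip, using the X and Y from the question: the mask from np.isin is passed to np.argwhere to reproduce Z. The names mask, rows, and cols are only illustrative.

```python
import numpy as np

X = np.array([[0, 0, 0, 0, 4],
              [0, 1, 1, 2, 3],
              [1, 1, 2, 2, 0],
              [0, 0, 2, 2, 3],
              [0, 0, 0, 0, 0]])
Y = np.array([1, 2, 3])  # may have any length

# True wherever an element of X equals any element of Y.
mask = np.isin(X, Y)

# (row, column) pairs, one per matching element.
Z = np.argwhere(mask)

# The same indices as two separate arrays, handy for indexing back into X.
rows, cols = np.nonzero(mask)

print(Z)
```

Because the mask is built from Y as a whole, nothing changes when Y grows or shrinks.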

Red: any alternative to using do for adding dynamic keys to a block

Is there an alternative syntax for :
a: [
b: [
1 2
]
]
append (do "a/b") 3
== [
b: [
1 2
]
]
I don't find using do very elegant (it resembles eval in JavaScript too much).
I tried to-path without success.
The simplest way is to use path notation to "address" the inner block directly:
>> a: [ b: [ 1 2 ] ]
== [b: [1 2]]
>> append a/b 3
== [1 2 3]
Re comment that you want a/b in a variable:
a: [b: [1 2 3]]
var: a/b
append var 4
probe a
== [b: [1 2 3 4]]
Given your initial assignment
a: [b: [1 2]]
== [b: [1 2]]
you want to append 3 to the inner block. You can get the inner block by
do "a/b"
== [1 2]
but you can also get it by
probe a/b
== [1 2]
which lets you append like this instead:
append a/b 3
== [1 2 3]
probe a
== [b: [1 2 3]]
In an Algol-style language, this would be something like a.b = append(a.b, 3): the a/b is an assignable dereference to the inner block.
ETA:
If you want to bottle up the dereference, the alternative to your do "a/b" could be to create a function:
ab: function [][a/b]
== func [][a/b]
append ab 7
== [1 2 7]
(Alternatively, ab: does [a/b].)
Why doesn't this work, though?
a: [b: [1 2 3]]
var: to-path "a/b"
append var 4
This does (note the GET):
a: [b: [1 2 3]]
var: load "a/b"
append get var 4
probe a
== [b: [1 2 3 4]]
Since path notation is just a shortcut for select, you can bypass the path by using select directly.
In Red:
>> a: [ b: [ 1 2 ] ]
== [b: [1 2]]
>> append select a 'b 3
== [1 2 3]
>> a
== [b: [1 2 3]]
In Rebol you have to write:
>> append select a to-set-word 'b 3
== [1 2 3]
By the way, why not use a: [ b [ 1 2 ] ]? Or do you want to assign the inner block to the global variable b? In that case a simple do a would do it, and you can use
>> do a
== [1 2]
>> append b 3
== [1 2 3]
a: [b: [ 1 2 3]]
append a/b 4
probe a
== [b: [1 2 3 4]]

Optimization Nurse Assignment

I'm new to CPLEX optimization, and at the moment I'm writing a model that should assign nurses to surgery cases that fit their competences, specialty…
I thought the model was working correctly, but when I run it, it assigns nurses to cases on which they are not allowed to work.
I hope somebody here will take the time to look at the model and can help me. Here is the existing model (very simple at the moment, with 5 nurses and one case):
.mod:
// indices
tuple nurses {
key int number_nurses ;
string roles ;
string specialty ;
string compentency ;
int shift_start ;
int shift_end ;
}
tuple shifts {
int shift_start ;
int shift_end ;
}
{shifts} shift = ... ;
{nurses} nurse = ... ;
int j = ... ;
range available_ORs = 1..j ;
{string} roles = ... ;
{string} specialty = ... ;
tuple cases {
key int number_cases ;
int start_time ;
int end_time ;
int duration ;
int demand_RN ;
int demand_ST ;
int available_ORs ;
string specialty_needed ;
string competency_needed ;
}
{cases} case = ... ;
//{string} shifts = ... ;
{string} competency = ... ;
int h = ... ;
range time_intervals = 1..h ;
// parameters
int P1 [nurse][shift] = ... ;
int P2 [nurse][roles][specialty][competency] = ... ;
int P3 [case][available_ORs] = ... ;
int P4 [time_intervals][case][specialty][competency] = ... ; // access the complexity within the case
int P5 [time_intervals][case][roles] = ... ;
int P6 [case][time_intervals] = ... ;
int P7 [case] = ... ;
int P8 [time_intervals][shift] = ...;
int M = ... ;
// decision variables
dvar boolean y [nurse][case][roles][time_intervals] in 0..1 ; // 1: nurse is assigned to case to perform role in time interval, 0: otherwise
dvar boolean x [nurse][case][roles] in 0..1 ;
dvar int de [time_intervals][case][roles] in 0..1 ;
dvar int dev [time_intervals][nurse] in 0..1 ;
dvar int dev2 [nurse][time_intervals] in 0..1 ;
dvar int xdev [nurse][available_ORs] in 0..1 ;
dvar int nc [nurse][case] in 0..1 ;
dvar int cd [nurse][case] in 0..1 ;
// deviation variable
dvar int DE ;
dvar int DS ;
dvar int DF ;
dvar int XDEV ;
dvar int NCT ;
dvar int CDT ;
// Objective function
minimize (DE+DF+DS+XDEV+NCT+CDT) ;
// Hard Constraints
subject to {
forall (i in nurse, h in time_intervals)
cons_01: // each nurse is assigned to at most one case in each time interval and performs a single role
sum (c in case,k in roles) y[i][c][k][h] <= 1 ;
forall (i in nurse, c in case, k in roles, h in time_intervals)
cons_02: // in each shift, cases will be assigned to the nurses who are working during their regular or authorized overtime hours
y[i][c][k][h] <= sum (s in shift) ((P1[i][s])*P8[h][s]) ;
forall (i in nurse)
cons_03: //total working hours for a nurse each day must be less than his or her total regular and overtime working hours
sum (c in case, k in roles, h in time_intervals) y[i][c][k][h] <= sum (s in shift, h in time_intervals) ((P1[i][s])*P8[h][s]) ;
forall (i in nurse, c in case, k in roles, h in time_intervals)
cons_04: // assigned to a case only if their skill level is high enough to handle the specialty requirements and have sufficient competency to deal with its procedural complexities
y[i][c][k][h] <= P6[c][h]*sum (q in specialty, p in competency) (P4[h][c][q][p]*P2[i][k][q][p]) ;
forall (c in case, k in roles, h in time_intervals)
cons_05: // see cons_04
sum (i in nurse) y[i][c][k][h] >= P6[c][h] ;
forall (i in nurse, c in case, k in roles)
cons_06: //nurses perform the same role for the entire duration of a case
sum (h in time_intervals) y[i][c][k][h] <= M * x[i][c][k] ;
forall (i in nurse, c in case)
cons_07: // see cons_06
sum (k in roles) x[i][c][k] <= 1 ;
}
// soft constraints
subject to {
forall (c in case, k in roles, h in time_intervals)
cons_08: // permit undercoverage, preferred that nurses work continuously during their regular hours rather than having idle periods
sum (i in nurse) y[i][c][k][h] * de[h][c][k] >= P5[h][c][k] * P6[c][h] ; // currently too many nurses, hence too many assignments
forall (c in case, k in roles)
cons_09: // see cons_08
DE >= sum (h in time_intervals) de[h][c][k] ;
/*forall (i in nurse, h in time_intervals)
cons_10: // avoid idle times: maximum number of times that nurses work nonconsecutive hours will be minimized along with idle time hours during the shift
-dev[h][i] <= sum (c in case, k in roles) y[i][c][k][h+1] - sum (c in case, k in roles) y[i][c][k][h] <=dev[h][i] ;*/
forall (i in nurse)
cons_11: // see cons_10
DS >= sum (h in time_intervals) dev[h][i] ;
forall(i in nurse, j in available_ORs)
cons_12: // account for the preference that nurses work continuously in one operating room: maximum number of ORs is reduced as much as possible
sum (c in case, k in roles, h in time_intervals) (y[i][c][k][h] * P3[c][j]) <= M * xdev[i][j] ;
forall (i in nurse)
cons_13: // see cons_12
sum (j in available_ORs) xdev[i][j] <= XDEV ;
forall (i in nurse, h in time_intervals)
cons_14: // record overtime: maximum number of times a nurse is working during overtime is as low as possible
sum (s in shift) P1[i][s] * 1 * sum (c in case, k in roles) y [i][c][k][h] <= dev2[i][h] ;
forall (i in nurse)
cons_15: // see cons_14
DF >= sum (h in time_intervals) dev2[i][h] ;
forall (i in nurse, c in case)
cons_16: // if a nurse is assigned to a surgery, (s)he will stay for the entire duration of the case unless there is a greater need for that nurse elsewhere; limit the movement of nurses between cases
sum (k in roles, h in time_intervals) y[i][c][k][h] - P7[c] + M * cd[i][c] + M * (1 - nc[i][c]) >= 0 ;
forall (i in nurse, c in case)
cons_17: // see cons_16
sum (k in roles, h in time_intervals) y[i][c][k][h] <= M * nc[i][c] ;
forall (i in nurse)
cons_18: // see cons_16
sum (c in case) nc[i][c] <= NCT ;
forall (i in nurse)
cons_19: // see cons_16
sum (c in case) cd[i][c] <= CDT ;
}
.dat:
nurse = {
< 1, "RN", "cardio", "high", 6, 14 > ,
< 2, "ST", "cardio", "high", 10, 18 > ,
< 3, "RN", "neuro", "high", 14, 22 > ,
< 4, "RN", "cardio", "intermediate", 14, 22 > ,
< 5, "RN", "cardio", "high", 22, 6 > ,
} ;
j = 2 ;
roles = { "RN" , "ST" } ; // RN can work as ST if necessary, but a ST can not work as RN
specialty = { "ortho", "neuro", "cardio" } ;
case = {
< 1, 8, 15, 7, 1, 0, 1, "cardio", "intermediate" > ,
} ;
shift = { < 6, 14 > < 10, 18 > < 14, 22 > < 22, 6 >} ;
competency = { "low", "intermediate", "high" } ;
h = 24 ;
// Parameter
P1 = [
[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
[0 0 1 0]
[0 0 0 1]
] ;
P2 = [
[[[1 1 1][0 0 0][0 0 0]][[1 1 1][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]][[1 1 1][0 0 0][0 0 0]]]
[[[0 0 0][1 1 1][0 0 0]][[0 0 0][1 1 1][0 0 0]]]
[[[1 1 0][0 0 0][0 0 0]][[1 1 0][0 0 0][0 0 0]]]
[[[1 1 1][0 0 0][0 0 0]][[1 1 1][0 0 0][0 0 0]]]
] ;
P3 = [
[1 0]
] ;
P4 = [
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 1 0][0 0 0][0 0 0]]]
[[[0 1 0][0 0 0][0 0 0]]]
[[[0 1 0][0 0 0][0 0 0]]]
[[[0 1 0][0 0 0][0 0 0]]]
[[[0 1 0][0 0 0][0 0 0]]]
[[[0 1 0][0 0 0][0 0 0]]]
[[[0 1 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
[[[0 0 0][0 0 0][0 0 0]]]
] ;
P5 = [
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[1 0]]
[[1 0]]
[[1 0]]
[[1 0]]
[[1 0]]
[[1 0]]
[[1 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
[[0 0]]
] ;
P6 = [
[0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ]
] ;
P7 = [7] ;
P8 = [
[0 0 0 1]
[0 0 0 1]
[0 0 0 1]
[0 0 0 1]
[0 0 0 1]
[1 0 0 0]
[1 0 0 0]
[1 0 0 0]
[1 0 0 0]
[1 1 0 0]
[1 1 0 0]
[1 1 0 0]
[1 1 0 0]
[0 1 1 0]
[0 1 1 0]
[0 1 1 0]
[0 1 1 0]
[0 0 1 0]
[0 0 1 0]
[0 0 1 0]
[0 0 1 0]
[0 0 0 1]
[0 0 0 1]
[0 0 0 1]
] ;
M = 10 ;
When I run it, the second nurse is assigned to the case, although nurse 2 is not allowed to work on this case because an RN is needed. I have already tried different ways to write it, but I can't find the reason why…
At the moment, nurse 1 and nurse 4 are assigned as RN (which is correct), and additionally nurse 2.
Can anybody give me some tips or help me solve the optimization problem correctly? I would be very grateful; thank you in advance.
Your model is infeasible, so CPLEX relaxed it: it relaxed some of the labeled constraints and returned the relaxed solution. Looking at the relaxed solution will help you debug your model.
In the CPLEX documentation, have a look at:
IDE and OPL > CPLEX Studio IDE > IDE Tutorials
Relaxing infeasible models
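One way to see where the infeasibility may come from is to count, hour by hour, how many nurses are on duty and compare that with what cons_05 demands. The sketch below is in Python rather than OPL and rests on one reading of the posted model: cons_02 only allows a nurse in hours covered by her shift (P1 and P8), cons_05 asks for at least one nurse per role in every hour the case is active (P6), and cons_01 lets a nurse fill only one role per hour. Skills (cons_04) are ignored, so this is a necessary condition only. The names on_duty, available, required, and short_hours are illustrative and not part of the model.

```python
import numpy as np

# Data copied from the posted .dat file.
# P1[i][s]: nurse i works shift s.
P1 = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1]])
# P8[h][s]: hour h lies inside shift s (24 rows, written run-length encoded).
P8 = np.array([[0, 0, 0, 1]] * 5 + [[1, 0, 0, 0]] * 4 + [[1, 1, 0, 0]] * 4
              + [[0, 1, 1, 0]] * 4 + [[0, 0, 1, 0]] * 4 + [[0, 0, 0, 1]] * 3)
# P6[c][h] for the single case: 1 while the case is running.
P6 = np.array([0] * 7 + [1] * 7 + [0] * 10)
n_roles = 2  # "RN" and "ST"

# Right-hand side of cons_02: may nurse i be assigned at hour h?
on_duty = (P8 @ P1.T) > 0          # shape (24, 5)
available = on_duty.sum(axis=1)    # nurses on duty per hour

# Assumed reading of cons_05 + cons_01: one distinct nurse per role per active hour.
required = n_roles * P6

# Report hours using the model's 1-based time_intervals.
short_hours = [h + 1 for h in range(24) if required[h] > available[h]]
print(short_hours)  # prints [8, 9]
```

With the posted data this flags hours 8 and 9: only nurse 1 is on duty there, yet both the RN and the ST role would have to be covered. Under these assumptions, that alone makes the hard constraints contradictory, and it would also explain why nurse 2 (an ST) shows up in the relaxed solution.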

How do I put values of a dataframe column in a 2D matrix?

I have a pandas dataframe with 3 columns: value, row_index, column_index. I would like to create a matrix in which the values of the dataframe are placed at the relevant rows and columns, and unknown elements are zeros.
I have written a for loop like this:
N_rows = df.row_index.max()
N_cols = df.column_index.max()
A = np.zeros((N_rows, N_cols))
for i in df.row_index:
    for j in df.column_index:
        np.put(A, i*N_cols+j, df['value'][(df.row_index==i) &
                                          (df.column_index==j)])
but it runs very slowly.
How can I do it faster?
I think you need pivot with fillna, then reindex to add the missing rows and columns, and finally values to get the NumPy array:
df = pd.DataFrame({'value':[2,4,5],
'row_index':[2,3,4],
'col_index':[0,2,3]})
print (df)
col_index row_index value
0 0 2 2
1 2 3 4
2 3 4 5
rows = np.arange(df.row_index.max()+1)
cols = np.arange(df.col_index.max()+1)
print (df.pivot(index='row_index', columns='col_index', values='value')
         .fillna(0)
         .reindex(index=rows, columns=cols, fill_value=0))
col_index 0 1 2 3
row_index
0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0
2 2.0 0.0 0.0 0.0
3 0.0 0.0 4.0 0.0
4 0.0 0.0 0.0 5.0
a = (df.pivot(index='row_index', columns='col_index', values='value')
       .fillna(0)
       .reindex(index=rows, columns=cols, fill_value=0)
       .values)
print (a)
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 2. 0. 0. 0.]
[ 0. 0. 4. 0.]
[ 0. 0. 0. 5.]]
Another solution with set_index and unstack:
print (df.set_index(['row_index', 'col_index'])['value']
.unstack(fill_value=0)
.reindex(index=rows, columns=cols, fill_value=0))
col_index 0 1 2 3
row_index
0 0 0 0 0
1 0 0 0 0
2 2 0 0 0
3 0 0 4 0
4 0 0 0 5
a = (df.set_index(['row_index', 'col_index'])['value']
       .unstack(fill_value=0)
       .reindex(index=rows, columns=cols, fill_value=0)
       .values)
print (a)
[[0 0 0 0]
[0 0 0 0]
[2 0 0 0]
[0 0 4 0]
[0 0 0 5]]
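A minor modification of @jezrael's solution: you can convert the pivoted frame straight to an array. Older pandas versions offered as_matrix() for this; it has since been removed, so use to_numpy() (or .values) instead. Note that without the reindex step, empty rows and columns are dropped:
df = pd.DataFrame({'value':[2,4,5],
'row_index':[2,3,4],
'col_index':[0,2,3]})
df.pivot(index='row_index', columns='col_index', values='value').fillna(0).to_numpy()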
# array([[ 2., 0., 0.],
# [ 0., 4., 0.],
# [ 0., 0., 5.]])
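If the pivot itself is not needed, the matrix can also be filled directly with NumPy integer-array indexing. This is a different technique from the answers above, sketched under the assumption that the indices are zero-based integers. The three arrays stand in for the dataframe columns (for example df['value'].to_numpy()).

```python
import numpy as np

# Stand-ins for df['value'], df['row_index'] and df['col_index'].
values = np.array([2, 4, 5])
row_index = np.array([2, 3, 4])
col_index = np.array([0, 2, 3])

# One extra row/column because indices start at 0.
A = np.zeros((row_index.max() + 1, col_index.max() + 1), dtype=values.dtype)

# Assign all values in one vectorized step; a repeated (row, col) pair
# would simply be overwritten rather than summed.
A[row_index, col_index] = values

print(A)
```

This avoids both the Python loop and the intermediate pivoted frame, and it keeps the integer dtype of the values.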

Fill pandas fields with tuples as elements by slicing

Sorry if this question has been asked before, but I did not find it here nor somewhere else:
I want to fill some of the fields of a column with tuples. Currently I would have to resort to:
import pandas as pd
df = pd.DataFrame({'a': [1,2,3,4]})
df['b'] = ''
df['b'] = df['b'].astype(object)
mytuple = ('x','y')
for l in df[df.a % 2 == 0].index:
    df.at[l, 'b'] = mytuple  # df.set_value(l, 'b', mytuple) in older pandas; set_value was removed in pandas 1.0
with df being (which is what I want)
a b
0 1
1 2 (x, y)
2 3
3 4 (x, y)
This does not look very elegant to me and probably not very efficient. Instead of the loop, I would prefer something like
df.loc[df.a % 2 == 0, 'b'] = np.array([mytuple] * sum(df.a % 2 == 0), dtype=tuple)
which (of course) does not work. How can I improve my above method by using slicing?
In [57]: df.loc[df.a % 2 == 0, 'b'] = pd.Series([mytuple] * len(df.loc[df.a % 2 == 0])).values
In [58]: df
Out[58]:
a b
0 1
1 2 (x, y)
2 3
3 4 (x, y)
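Another option, if rebuilding the whole column is acceptable, is to skip .loc and derive the column from the boolean mask. This is only a sketch of an alternative, not the approach from the answer above; the names mask and selected are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4]})
mytuple = ('x', 'y')

# Boolean Series marking the rows that should receive the tuple.
mask = df['a'] % 2 == 0

# Build the column element by element, so each tuple stays a single cell value
# instead of being unpacked across several cells.
df['b'] = [mytuple if selected else '' for selected in mask]

print(df)
```

The list mixes strings and tuples, so pandas stores the column with object dtype, and no explicit astype(object) is needed.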

How to access array indices in REBOL multidimensional arrays

I tried using an array to specify an index of a 2-dimensional array, but the pick function won't accept an array as its second argument:
print pick [[3 5] [3 1]] [2 1]
*** ERROR
** Script error: invalid argument: [2 2]
** Where: pick try do either either either -apply-
** Near: pick [[3 5] [3 1]] [2 2]
I found a workaround for this, but it's slightly more verbose:
print pick pick [[3 5] [3 1]] 2 1
[comment This prints "3".]
Is it possible to access an index of a multidimensional array without calling the pick function multiple times?
A more succinct way to PICK out an element from a multi-dimensional array is to use the PATH! syntax.
Here's an example in the Rebol console:
>> x: [[3 5] [3 1]]
== [[3 5] [3 1]]
>> x/2/1
== 3
>> x/2/2
== 1
>> x/1/(1 + 1) ;; use parens for expressions - transforms to x/1/2
== 5
>> p: 2
== 2
>> x/1/:p ;; use ":" for variable reference - transforms to x/1/2
== 5
>> x/(p - 1)/:p ;; mix and match at any level of array - transforms to x/1/2
== 5
>> x/3 ;; NONE is returned if index does not exist
== none
>> x/2
== [3 1]
>> x/2/3 ;; again out of range
== none
Another alternative would be the FIRST, SECOND .. TENTH functions:
>> second first [[3 5] [3 1]]
== 5
You can even mix and match:
>> x: [ [[1]] [[2]] [3 [4 5]] ]
== [[[1]] [[2]] [3 [4 5]]]
>> first pick x/3 2
== 4