how to implement a randlong in numpy


For the following code, I want to generate random values, but the values in dfCombined["TotalFreeSize"] can be as large as 9.941458e+11, and randint throws an error. What should I do?
I also can't find a randlong function.
# get average, std, and number of NaN values in col
average_age_test = dfCombined["TotalFreeSize"].mean()
std_age_test = dfCombined["TotalFreeSize"].std()
count_nan_age_test = dfCombined["TotalFreeSize"].isnull().sum()
rand_1 = np.random.randint(average_age_test - std_age_test, average_age_test + std_age_test, size = count_nan_age_test)

randint takes two int parameters, a lower bound and an upper bound, and returns an int between them.
mean() and std(), on the other hand, return float values.
What you should do is pass int values to randint.
You can:
average_age_test = int(dfCombined["TotalFreeSize"].mean())
std_age_test = int(dfCombined["TotalFreeSize"].std())
hope that helps! if it does please upvote :)
sample:
input:
A = int (9.941458e+11)
random.randint(0, A)
output:
153271550649L (type: long)
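If the error comes from the bounds not fitting NumPy's default integer type (an assumption, since the question does not quote the error message), asking for 64-bit integers explicitly may be enough. A minimal sketch, using a made-up array as a stand-in for dfCombined["TotalFreeSize"]:

```python
import numpy as np

# Hypothetical stand-in for dfCombined["TotalFreeSize"], with two missing values.
values = np.array([9.941458e11, 5.0e11, np.nan, 7.5e11, np.nan])

mean = np.nanmean(values)
std = np.nanstd(values, ddof=1)  # ddof=1 matches the pandas default for std()
n_missing = int(np.isnan(values).sum())

# randint needs integer bounds; dtype=np.int64 keeps values near 1e12 in range.
low, high = int(mean - std), int(mean + std)
fill = np.random.randint(low, high, size=n_missing, dtype=np.int64)
```

Note that np.random.randint excludes the upper bound, while random.randint from the standard library includes it.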

You can simply scale a random value between 0 and 1 (generated with random) to your desired maximum, and Python will cast it appropriately. This seems to work even for very large maximum values:
import random
# Python 2 code (xrange, long, print statement)
max_value = 9.941458e91  # notice the '91' exponent here!
for i in xrange(10):
    val = long(random.random() * max_value)
    print val
    print type(val)
    print
Sample output (single value):
45525909158271540655933151699075210889481964268830820171688052775301692920778132706971090944
<type 'long'>

Related

ORTools CP-Sat Solver Channeling Constraint dependant of x

I am trying to add the following constraints to my model. My problem: the function g() expects x as a binary numpy array, so the result arr_a depends on the current value of x at every step of the optimization!
Afterwards, I want the max of this array times x to be at most 50.
How can I add this constraint dynamically, so that arr_a is always calculated correctly from the value of x at each iteration, while telling the model to keep the constraint arr_a * x <= 50? Currently I get an error when adding the constraint to the model, because g() expects x as a numpy array to calculate arr_a, arr_b, arr_c (g uses np.where(x == 1) within its calculation).
#Init model
from ortools.sat.python import cp_model
model = cp_model.CpModel()
# Declare the variables
x = []
for i in range(self.ds.n_banks):
    x.append(model.NewIntVar(0, 1, "x[%i]" % (i)))
#add bool vars
a = model.NewBoolVar('a')
arr_a, arr_b, arr_c = g(df1,df2,df3,x)
model.Add((arr_a.astype('int32') * x).max() <= 50).OnlyEnforceIf(a)
model.Add((arr_a.astype('int32') * x).max() > 50).OnlyEnforceIf(a.Not())
Afterwards I add the target function, which naturally also depends on x.
model.Minimize(target(x))
def target(x):
    arr_a, arr_b, arr_c = g(df1,df2,df3,x)
    return (3 * arr_b * x + 2 * arr_c * x).sum()
EDIT:
My problem changed a bit and I managed to get it to run without errors. Nevertheless, I found that the constraint is never actually met! self-defined-function is a highly non-linear function that expects the indices where x == 1 and where x == 0 and returns a numpy array. It is also not possible to rebuild it with the predefined functions of the CP-SAT solver.
#Init model
model = cp_model.CpModel()
# Declare the variables
x = [model.NewIntVar(0, 1, "x[%i]" % (i)) for i in range(66)]
# add hints
[model.AddHint(x[i],np.random.choice(2, 1, p=[0.4, 0.6])[0]) for i in range(66)]
open_elements = [model.NewBoolVar("open_elements[%i]" % (i)) for i in range(66)]
closed_elements = [model.NewBoolVar("closed_elements[%i]" % (i)) for i in range(6)]
# open indices as bool vars
for i in range(66):
    model.Add(x[i] == 1).OnlyEnforceIf(open_elements[i])
    model.Add(x[i] != 1).OnlyEnforceIf(open_elements[i].Not())
    model.Add(x[i] != 1).OnlyEnforceIf(closed_elements[i])
    model.Add(x[i] == 1).OnlyEnforceIf(closed_elements[i].Not())
model.Add((self-defined-function(np.where(open_elements), np.where(closed_elements), some_array).astype('int32') * x - some_vector).all() <= 0)
Even when I apply a simpler function, it will not work properly.
model.Add((self-defined-function(x, some_array).astype('int32') * x - some_vector).all() <= 0)
I also tried the following:
arr_indices_open = []
arr_indices_closed = []
for i in range(66):
    if open_elements[i] == True:
        arr_indices_open.append(i)
    else:
        arr_indices_closed.append(i)
# final Constraint
arr_ = self-defined-function(arr_indices_open, arr_indices_closed, some_array)[0].astype('int32')
for i in range(66):
    model.Add(arr_[i] * x[i] <= some_other_vector[i])
Here is a minimal example of the self-defined function, with which I simply try to say that n_closed shall be smaller than 10. Even that condition is not met by the solver:
def self_defined_function(arr_indices_closed):
    return len(arr_indices_closed)
arr_ = self_defined_function(arr_indices_closed)
for i in range(66):
    model.Add(arr_ < 10)

I'm not sure I fully understand the question, but generally, if you want to optimize a function g(x), you'll have to implement it using the solver's primitives (docs).
That is easier when your calculation coincides with an existing solver function, e.g. if you're trying to calculate a linear expression, but it can get harder when you're trying to calculate something more complex. However, I believe that's the only way.

Negamax Cut-off Return Value?

I have a problem with my Negamax algorithm and hope someone can help me.
I'm writing it in Cython.
My search method is as follows:
cdef _search(self, object game_state, int depth, long alpha, long beta, int max_depth):
    if depth == max_depth or game_state.is_terminated:
        value = self.evaluator.evaluate(game_state)  # evaluates based on current player
        return value, []
    moves = self.prepare_moves(depth, game_state)  # getting moves and sorting
    max_value = LONG_MIN
    for move in moves:
        new_board = game_state.make_move(move)
        value, pv_moves = self._search(new_board, depth + 1, -beta, -alpha, max_depth, event)
        value = -value
        if max_value < value:
            max_value = value
            best_move = move
            best_pv_moves = pv_moves
        if alpha < max_value:
            alpha = max_value
        if max_value >= beta:
            return LONG_MAX, []
    best_pv_moves.insert(0, best_move)
    return alpha, best_pv_moves
In many examples you break after a cutoff is detected, but when I do this the algorithm doesn't find the optimal solution. I'm testing against some chess puzzles and I was wondering why this is the case. If I return the maximum number after a cutoff is detected, it works fine but it takes a long time (252 sec for depth 6)...
Speed: Nodes per Second : 21550.33203125
Or if you have other improvements let me know (I use transposition table, pvs and killer heuristics)

It turned out I was using the C limits:
cdef extern from "limits.h":
    cdef long LONG_MAX
    cdef long LONG_MIN
and when you try to negate LONG_MIN with -LONG_MIN, you get LONG_MIN back, apparently because of an overflow.

Retrieve indices for rows of a PyTables table matching a condition using `Table.where()`

I need the indices (as numpy array) of the rows matching a given condition in a table (with billions of rows) and this is the line I currently use in my code, which works, but is quite ugly:
indices = np.array([row.nrow for row in the_table.where("foo == 42")])
It also takes half a minute, and I'm sure that the list creation is one of the reasons why.
I have not found an elegant solution yet and I'm still struggling with the PyTables docs, so does anybody know a magical way to do this more beautifully and maybe also a bit faster? Maybe there is a special query keyword I am missing, since I have the feeling that PyTables should be able to return the indices of the matched rows as a numpy array.

tables.Table.get_where_list() gives indices of the rows matching a given condition

I read the PyTables source: where() is implemented in Cython, but it does not seem to be fast enough. Here is a more involved method that can speed things up:
Create some data first:
from tables import *
import numpy as np
class Particle(IsDescription):
    name = StringCol(16)     # 16-character String
    idnumber = Int64Col()    # Signed 64-bit integer
    ADCcount = UInt16Col()   # Unsigned short integer
    TDCcount = UInt8Col()    # unsigned byte
    grid_i = Int32Col()      # 32-bit integer
    grid_j = Int32Col()      # 32-bit integer
    pressure = Float32Col()  # float (single-precision)
    energy = Float64Col()    # double (double-precision)
h5file = open_file("tutorial1.h5", mode = "w", title = "Test file")
group = h5file.create_group("/", 'detector', 'Detector information')
table = h5file.create_table(group, 'readout', Particle, "Readout example")
particle = table.row
for i in range(1001000):
    particle['name'] = 'Particle: %6d' % (i)
    particle['TDCcount'] = i % 256
    particle['ADCcount'] = (i * 256) % (1 << 16)
    particle['grid_i'] = i
    particle['grid_j'] = 10 - i
    particle['pressure'] = float(i*i)
    particle['energy'] = float(particle['pressure'] ** 4)
    particle['idnumber'] = i * (2 ** 34)
    # Insert a new particle record
    particle.append()
table.flush()
h5file.close()
Read the column in chunks, append the matching indices of each chunk to a list, and finally concatenate the list into one array. You can change the chunk size according to your available memory:
h5file = open_file("tutorial1.h5")
table = h5file.get_node("/detector/readout")
size = 10000  # chunk size in rows
col = "energy"
buf = np.zeros(size, dtype=table.coldtypes[col])
res = []
for start in range(0, table.nrows, size):
    length = min(size, table.nrows - start)
    data = table.read(start, start + length, field=col, out=buf[:length])
    tmp = np.where(data > 10000)[0]
    tmp += start  # convert chunk-local positions to table row numbers
    res.append(tmp)
res = np.concatenate(res)
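The chunking logic above does not depend on PyTables itself, so it can be checked on a plain in-memory array first. The following is only a sketch: matching_indices and its parameters are hypothetical names, and the array stands in for a table column.

```python
import numpy as np


def matching_indices(column, predicate, chunk_size=10000):
    """Collect global indices where predicate(chunk) is True, chunk by chunk."""
    parts = []
    for start in range(0, len(column), chunk_size):
        chunk = np.asarray(column[start:start + chunk_size])
        # flatnonzero gives chunk-local positions; add the chunk offset.
        parts.append(np.flatnonzero(predicate(chunk)) + start)
    if not parts:
        return np.empty(0, dtype=np.intp)
    return np.concatenate(parts)


# Stand-in for a table column; the chunk size is deliberately not a divisor of 25.
energy = np.arange(25, dtype=np.float64) ** 2
idx = matching_indices(energy, lambda c: c > 100, chunk_size=7)
```

Comparing the result with np.flatnonzero(energy > 100) on the whole array is a quick way to confirm that the offsets are handled correctly, including for the last, shorter chunk.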

Switch on argument type

Using OpenSCAD, I have a module that, like cube(), has a size parameter that can be a single value or a vector of three values. Ultimately, I want a vector of three values.
If the caller passes a single value, I'd like all three values of the vector to be the same. I don't see anything in the language documentation about detecting the type of an argument. So I came up with this hack:
module my_cubelike_thing(size=1) {
dimensions = concat(size, size, size);
width = dimensions[0];
length = dimensions[1];
height = dimensions[2];
// ... use width, length, and height ...
}
When size is a single value, the result of the concat is exactly what I want: three copies of the value.
When size is a three-value vector, the result of the concat is a nine-value vector, and my code just ignores the last six values.
It works but only because what I want in the single value case is to replicate the value. Is there a general way to switch on the argument type and do different things depending on that type?

If size can only be a single value or a vector of 3 values, the type can be detected with the help of the special value undef:
a = [3,5,8];
// a = 5;
if (a[0] == undef) {
    dimensions = concat(a, a, a);
    // do something
    cube(size=dimensions,center=false);
}
else {
    dimensions = a;
    // do something
    cube(size=dimensions,center=false);
}
But assignments are only valid in the scope in which they are defined (see the OpenSCAD documentation).
So a lot of code is needed in each branch, and I would prefer to validate the type of size in an external script (e.g. Python 3) and write the OpenSCAD code with the variable assignments to a file, which can then be included in the OpenSCAD file. Here is my short test code:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
# size = 20
size = [20,15,10]
if isinstance(size, int):
    dimensions = [size, size, size]
elif isinstance(size, list):
    dimensions = size
else:
    # if other types possible
    pass
with open('variablen.scad', 'w') as wObj:
    for i, v in enumerate(['l', 'w', 'h']):
        wObj.write('{} = {};\n'.format(v, dimensions[i]))
os.system('openscad ./typeDef.scad')
content of variablen.scad:
l = 20;
w = 15;
h = 10;
and typeDef.scad can look like this
include <./variablen.scad>;
module my_cubelike_thing() {
    linear_extrude(height=h, center=false) square([l, w]);
}
my_cubelike_thing();
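If the external-script route is taken, the type check can be made a little more forgiving than a test for int and list only. A minimal sketch follows; normalize_size is a hypothetical helper, not part of OpenSCAD or of the script above, and the decision to reject anything that is not a number or a 3-element sequence is an assumption.

```python
from numbers import Real


def normalize_size(size):
    """Return [x, y, z] for a scalar or a 3-element sequence; reject anything else."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(size, Real) and not isinstance(size, bool):
        return [size, size, size]
    if isinstance(size, (list, tuple)) and len(size) == 3:
        return list(size)
    raise TypeError("size must be a number or a sequence of three numbers")


cube_dims = normalize_size(20)
box_dims = normalize_size([20, 15, 10])
```

Raising an error for unexpected input avoids silently writing an incomplete variablen.scad.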

PyOpenCL reduction Kernel on each pixel of image as array instead of each byte (RGB mode, 24 bits )

I'm trying to calculate the average Luminance of an RGB image. To do this, I find the luminance of each pixel i.e.
L(r,g,b) = X*r + Y*g + Z*b (some linear combination).
And then find the average by summing up luminance of all pixels and dividing by width*height.
To speed this up, I'm using pyopencl.reduction.ReductionKernel
The array I pass to it is a Single Dimension Numpy Array so it works just like the example given.
from PIL import Image
import numpy as np
im = Image.open('image_00000001.bmp')
data = np.asarray(im).reshape(-1) # so data is a one-dimensional array
# data.dtype is uint8, data.shape is (w*h*3, )
I want to adapt the following code from the example, i.e. I would change the datatype and the type of the arrays I'm passing. This is the example:
a = pyopencl.array.arange(queue, 400, dtype=numpy.float32)
b = pyopencl.array.arange(queue, 400, dtype=numpy.float32)
krnl = ReductionKernel(ctx, numpy.float32, neutral="0",
                       reduce_expr="a+b", map_expr="x[i]*y[i]",
                       arguments="__global float *x, __global float *y")
my_dot_prod = krnl(a, b).get()
Except, my map_expr will work on each pixel and convert each pixel to its luminance value.
And reduce expr remains the same.
The problem is, it works on each element in the array, and I need it to work on each pixel which is 3 consecutive elements at a time (RGB ).
One solution is to have three different arrays, one for R, one for G and one for B ,which would work, but is there another way ?

Edit: I changed the program to illustrate the char4 usage instead of float4:
import numpy as np
import pyopencl as cl
import pyopencl.array as cl_array
deviceID = 0
platformID = 0
workGroup=(1,1)
N = 10
testData = np.zeros(N, dtype=cl_array.vec.char4)
dev = cl.get_platforms()[platformID].get_devices()[deviceID]
ctx = cl.Context([dev])
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
Data_In = cl.Buffer(ctx, mf.READ_WRITE, testData.nbytes)
prg = cl.Program(ctx, """
__kernel void Pack_Cmplx( __global char4* Data_In, int N)
{
    int gid = get_global_id(0);
    //Data_In[gid] = 1; // This would change all components to one
    Data_In[gid].x = 1; // changing single component
    Data_In[gid].y = 2;
    Data_In[gid].z = 3;
    Data_In[gid].w = 4;
}
""").build()
prg.Pack_Cmplx(queue, (N,1), workGroup, Data_In, np.int32(N))
cl.enqueue_copy(queue, testData, Data_In)
print(testData)
I hope it helps.
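For checking a kernel's result, a CPU reference in plain NumPy can be handy. The sketch below treats every three consecutive bytes of the flat array as one pixel. The weights are an assumption (the common Rec. 601 luma coefficients), since the question only says "some linear combination", and mean_luminance is a hypothetical helper name.

```python
import numpy as np

# Assumed weights for X, Y, Z (Rec. 601 luma); substitute your own coefficients.
WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def mean_luminance(flat_rgb):
    """Average luminance of a flat uint8 array laid out as R, G, B, R, G, B, ..."""
    pixels = np.asarray(flat_rgb, dtype=np.float32).reshape(-1, 3)
    return float((pixels @ WEIGHTS).mean())


# Four made-up pixels: pure red, pure green, pure blue, white.
data = np.array([255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255], dtype=np.uint8)
avg = mean_luminance(data)
```

The reshape to (-1, 3) is a view on the same data, so the pixel grouping itself costs no extra copy beyond the conversion to float32.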
