How to use alias to simplify CUDA_VISIBLE_DEVICES

I can use alias gpu0='CUDA_VISIBLE_DEVICES=0' to set gpu0, but what if CUDA_VISIBLE_DEVICES=0,1,2?

I came up with a bash function for this. Put this function inside .bashrc:
gpu() {
    export CUDA_VISIBLE_DEVICES="$1"
}
and use it with gpu 0 or gpu 0,1.
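A quick way to confirm the variable is set (a minimal Python check, under the assumption that Python is launched from the same shell after calling gpu 0,1):

import os

# CUDA-aware frameworks only see the devices listed in this variable.
print(os.environ.get("CUDA_VISIBLE_DEVICES", "<not set>"))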

Sharing my script to generate aliases for all possible GPU combinations:
import itertools

num = 8
gpu_list = []
for i in range(num):
    # combinations of size i + 1 drawn from devices 0..num-1
    gpu_list.extend(itertools.combinations([str(j) for j in range(num)], i + 1))
print(gpu_list)

# Write one alias per combination; source gpu.txt (or append it to .bashrc)
# to make the aliases available in your shell.
with open('gpu.txt', 'w') as f:
    for gpu in gpu_list:
        sub1 = ''.join(gpu)
        sub2 = ','.join(gpu)
        f.write(f"alias gpu{sub1}='CUDA_VISIBLE_DEVICES={sub2}'\n")

Related

In pyscipopt, would it be possible to use a function containing an optimization model inside of my main optimization model?

I am using Jupyter Notebook. I have tried defining a function that contains an optimization model; it seems to work outside of my main model. When I tried using the function on a variable inside my main model, at first the kernel died; after I updated Anaconda, it now seems to do nothing.
My function:
def optfunc(x):
    mod = Model()
    y = mod.addVar("y", ub=2, lb=-1)
    consl = mod.addCons(y + x <= 3, "cons")
    mod.setObjective(y, "maximize")
    mod.optimize()
    sol = mod.getBestSol()
    return mod.getSolVal(sol, y)
My main model:
mainfunc = Model()
n = mainfunc.addVar("n", lb=1, ub=3)
c = optfunc(n)
const = mainfunc.addCons(n + 0.5 == 1, "cons")
mainfunc.setObjective(n, "maximize")
mainfunc.optimize()
sol = mainfunc.getBestSol()
print(mainfunc.getSolVal(sol, n))
This does not work. You cannot have a Model inside another Model; in particular, you cannot take a variable from the main Model (x) and treat it as if it were also a variable in the sub-model.
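One possible restructuring, sketched under the assumption that the sub-problem only needs a number from the main problem rather than the variable itself: solve the main model first, then pass the resulting value into optfunc.

from pyscipopt import Model

def optfunc(x_value):
    # x_value is a plain number here, not a Var from another model
    mod = Model()
    y = mod.addVar("y", ub=2, lb=-1)
    mod.addCons(y + x_value <= 3, "cons")
    mod.setObjective(y, "maximize")
    mod.optimize()
    return mod.getObjVal()

mainfunc = Model()
n = mainfunc.addVar("n", lb=1, ub=3)
# Note: the original constraint n + 0.5 == 1 forces n = 0.5, which conflicts
# with lb=1 and makes the model infeasible, so it is omitted in this sketch.
mainfunc.setObjective(n, "maximize")
mainfunc.optimize()

n_value = mainfunc.getVal(n)   # extract a numeric value from the solved main model
print(optfunc(n_value))

Whether two sequential solves are acceptable depends on what the author actually needs; if the two problems must be solved jointly, the sub-model's constraints have to be added directly to the main model instead.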

Is it okay to use complex control flow in tf.function?

I have the following Python function and I want to wrap it into @tf.function (originally the input arguments are numpy arrays, but for the sake of executing on GPU it's not a problem to convert them to TF tensors).
def reproject(before_frame, motion_vecs):
    reprojected_image = np.zeros((before_frame.shape[0], before_frame.shape[1], before_frame.shape[2]))
    for row_idx in range(before_frame.shape[0]):
        for col_idx in range(before_frame.shape[1]):
            for c_idx in range(before_frame.shape[2]):
                diff_u = int(round(
                    (before_frame.shape[1] * motion_vecs[row_idx][col_idx][0])
                ))
                diff_v = int(round(
                    (before_frame.shape[0] * motion_vecs[row_idx][col_idx][1])
                ))
                before_pixel_position = (
                    row_idx + diff_v,
                    col_idx + diff_u
                )
                if before_pixel_position[0] < before_frame.shape[0] and before_pixel_position[1] < before_frame.shape[1] \
                        and before_pixel_position[0] > 0 and before_pixel_position[1] > 0:
                    reprojected_image[row_idx][col_idx][c_idx] = before_frame[
                        before_pixel_position[0]
                    ][
                        before_pixel_position[1]
                    ][c_idx]
    return reprojected_image
I can see that in TensorFlow tutorials people use vectorized_map or map_fn instead of loops, and tf.cond instead of the if operator. Is using these functions the only option for control flow, and if so, what are the reasons behind it?
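For reference, a minimal sketch (a toy function, not the reprojection code above) of why plain Python control flow can still work inside tf.function: AutoGraph rewrites tensor-dependent for/if statements into tf.while_loop and tf.cond during tracing, so vectorized_map, map_fn and explicit tf.cond are not the only options, although vectorized operations are usually far faster than per-element graph loops.

import tensorflow as tf

@tf.function
def sum_positive(values):
    # A Python `for` over tf.range and a tensor-dependent `if` are converted
    # by AutoGraph into tf.while_loop / tf.cond when the function is traced.
    total = tf.constant(0.0)
    for i in tf.range(tf.shape(values)[0]):
        if values[i] > 0.0:
            total += values[i]
    return total

print(sum_positive(tf.constant([1.0, -2.0, 3.0])))  # tf.Tensor(4.0, shape=(), dtype=float32)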

could not convert string to float in python

I am trying to do Principal Component Analysis on a CSV file, but when I run the code I get this error:
C:\Users\Lenovo\Desktop>python pca.py
ValueError: could not convert string to float: Annee;NET;INT;SUB;LMT;DCT;IMM;EXP;VRD
This is my CSV file.
I tried removing spaces and anything else I could think of.
This is my Python script; I don't know what I am missing.
Note: I run this code under Python 2.7.
from sklearn.externals import joblib
import numpy as np
import glob
import os
import time
import numpy

my_matrix = numpy.loadtxt(open("pca.csv", "rb"), delimiter=",", skiprows=0)

def pca(dataMat, r, autoset_r=False, autoset_rate=0.9):
    """
    purpose: principal components analysis
    """
    print("Start to do PCA...")
    t1 = time.time()
    meanVal = np.mean(dataMat, axis=0)
    meanRemoved = dataMat - meanVal
    # normData = meanRemoved / np.std(dataMat)
    covMat = np.cov(meanRemoved, rowvar=0)
    eigVals, eigVects = np.linalg.eig(np.mat(covMat))
    eigValIndex = np.argsort(-eigVals)
    if autoset_r:
        r = autoset_eigNum(eigVals, autoset_rate)
        print("autoset: take top {} of {} features".format(r, meanRemoved.shape[1]))
    r_eigValIndex = eigValIndex[:r]
    r_eigVect = eigVects[:, r_eigValIndex]
    lowDDataMat = meanRemoved * r_eigVect
    reconMat = (lowDDataMat * r_eigVect.T) + meanVal
    t2 = time.time()
    print("PCA takes %f seconds" % (t2 - t1))
    joblib.dump(r_eigVect, './pca_args_save/r_eigVect.eig')
    joblib.dump(meanVal, './pca_args_save/meanVal.mean')
    return lowDDataMat, reconMat

def autoset_eigNum(eigValues, rate=0.99):
    eigValues_sorted = sorted(eigValues, reverse=True)
    eigVals_total = eigValues.sum()
    for i in range(1, len(eigValues_sorted) + 1):
        eigVals_sum = sum(eigValues_sorted[:i])
        if eigVals_sum / eigVals_total >= rate:
            break
    return i
It seems that NumPy has a problem parsing your header row as floats.
Try setting skiprows=1 in your np.loadtxt call in order to skip the table header.
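A minimal sketch of that fix, with one extra assumption: the header Annee;NET;INT;... in the error message is semicolon-separated, so if the data rows use ';' as well, the delimiter likely needs to change too.

import numpy as np

# Skip the header row; use ';' if the data rows are semicolon-separated
# (an assumption based on the header shown in the error message).
my_matrix = np.loadtxt(open("pca.csv", "rb"), delimiter=";", skiprows=1)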

Apache Beam job (Python) using Tensorflow Transform is killed by Cloud Dataflow

I'm trying to run an Apache Beam job based on TensorFlow Transform on Dataflow, but it gets killed. Has anyone experienced that behaviour? This is a simple example with DirectRunner that runs fine locally but fails on Dataflow (I change the runner accordingly):
import os
import csv
import datetime
import numpy as np
import tensorflow as tf
import tensorflow_transform as tft
from apache_beam.io import textio
from apache_beam.io import tfrecordio
from tensorflow_transform.beam import impl as beam_impl
from tensorflow_transform.beam import tft_beam_io
from tensorflow_transform.tf_metadata import dataset_metadata
from tensorflow_transform.tf_metadata import dataset_schema
import apache_beam as beam

NUMERIC_FEATURE_KEYS = ['feature_' + str(i) for i in range(2000)]

def _create_raw_metadata():
    column_schemas = {}
    for key in NUMERIC_FEATURE_KEYS:
        column_schemas[key] = dataset_schema.ColumnSchema(tf.float32, [], dataset_schema.FixedColumnRepresentation())
    raw_data_metadata = dataset_metadata.DatasetMetadata(dataset_schema.Schema(column_schemas))
    return raw_data_metadata

def preprocessing_fn(inputs):
    outputs = {}
    for key in NUMERIC_FEATURE_KEYS:
        outputs[key] = tft.scale_to_0_1(inputs[key])
    return outputs

def main():
    output_dir = '/tmp/tmp-folder-{}'.format(datetime.datetime.now().strftime('%Y%m%d%H%M%S'))
    RUNNER = 'DirectRunner'
    with beam.Pipeline(RUNNER) as p:
        with beam_impl.Context(temp_dir=output_dir):
            raw_data_metadata = _create_raw_metadata()
            _ = (raw_data_metadata | 'WriteInputMetadata' >> tft_beam_io.WriteMetadata(os.path.join(output_dir, 'rawdata_metadata'), pipeline=p))
            m = numpy_dataset = np.random.rand(100, 2000) * 100
            raw_data = (p
                        | 'CreateTestDataset' >> beam.Create([dict(zip(NUMERIC_FEATURE_KEYS, m[i, :])) for i in range(m.shape[0])]))
            raw_dataset = (raw_data, raw_data_metadata)
            transform_fn = (raw_dataset | 'Analyze' >> beam_impl.AnalyzeDataset(preprocessing_fn))
            _ = (transform_fn | 'WriteTransformFn' >> tft_beam_io.WriteTransformFn(output_dir))
            (transformed_data, transformed_metadata) = ((raw_dataset, transform_fn) | 'Transform' >> beam_impl.TransformDataset())
            transformed_data_coder = tft.coders.ExampleProtoCoder(transformed_metadata.schema)
            _ = transformed_data | 'WriteTrainData' >> tfrecordio.WriteToTFRecord(os.path.join(output_dir, 'train'), file_name_suffix='.gz', coder=transformed_data_coder)

if __name__ == '__main__':
    main()
Also, my production code (not shown) fails with the message: The job graph is too large. Please try again with a smaller job graph, or split your job into two or more smaller jobs.
Any hint?
The restriction on the pipeline description size is documented here:
https://cloud.google.com/dataflow/quotas#limits
There is a way around that: instead of creating stages for each tensor that goes into tft.scale_to_0_1, we can fuse them by first stacking them together and then passing them into tft.scale_to_0_1 with elementwise=True.
The result will be the same, because the min and max are computed per 'column' instead of across the whole tensor.
This would look something like this:
stacked = tf.stack([inputs[key] for key in NUMERIC_FEATURE_KEYS], axis=1)
scaled_stacked = tft.scale_to_0_1(stacked, elementwise=True)
for key, tensor in zip(NUMERIC_FEATURE_KEYS, tf.unstack(scaled_stacked, axis=1)):
    outputs[key] = tensor
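For completeness, a sketch of how that fused version could replace the preprocessing_fn from the question (same names as above; imports repeated so the snippet stands on its own):

import tensorflow as tf
import tensorflow_transform as tft

NUMERIC_FEATURE_KEYS = ['feature_' + str(i) for i in range(2000)]

def preprocessing_fn(inputs):
    # One analyzer over the stacked tensor instead of one per feature
    stacked = tf.stack([inputs[key] for key in NUMERIC_FEATURE_KEYS], axis=1)
    scaled_stacked = tft.scale_to_0_1(stacked, elementwise=True)
    outputs = {}
    for key, tensor in zip(NUMERIC_FEATURE_KEYS, tf.unstack(scaled_stacked, axis=1)):
        outputs[key] = tensor
    return outputs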

Importing a matrix from Python to Pyomo

I have a matrix defined in a Python file named matrix.py:
import numpy as np

N = 4
l = N
k = N
D = np.zeros((l, k))
for i in range(0, l):
    for j in range(0, k):
        if i == j:
            D[i, j] = 2
        else:
            D[i, j] = 0
D[0, 0] = (2 * N**2 + 1) / 6
D[-1, -1] = -(2 * N**2 + 1) / 6
print(D)
I want to use it in Pyomo, and I did:
import matrix
.
.
.
m.f_x1 = Var(m.N)
def f_x1_definition(model, i):
    for j in m.N:
        return m.f_x1[j] == sum(D[i, j] * m.x1[j] for j in range(value(m.n)))
m.f_x1_const = Constraint(m.N, rule=f_x1_definition)
But I get the following error:
NameError: global name 'D' is not defined
How can I do it?
When you import a module in Python using the syntax
import foo
everything defined in the foo module is available within the foo namespace. That is, if foo.py contains:
import numpy as np
a = 5
D = np.zeros((1,5))
when you import the module with import foo, you can access a and D with:
import foo
print(foo.a)
print(foo.D)
If you want to pull the symbols from foo directly into your local namespace, you would instead use the from ... import ... syntax:
from foo import a,D
print(a)
print(D)
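Applied to the question's code (assuming the file really is named matrix.py), the same two options look like this:

# Option 1: access D through the module namespace
import matrix
print(matrix.D)

# Option 2: pull D directly into the local namespace
from matrix import D
print(D)

With either form, the rule function in the question can refer to matrix.D[i, j] (or plain D[i, j] with Option 2) and the NameError goes away.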