Import a dictionary into the current scope as variables - numpy

I have a .mat file in which I previously saved processed data. When I run
dict = scipy.io.loadmat('training_data.mat')
I get back a dict that looks like this:
{'encoders': ..., 'decoders': ..., 'stuff': ...}
I want to selectively import the encoders and decoders variables into my current scope, with the same effect as:
encoders = dict['encoders']
decoders = dict['decoders']
How do I cleanly do this without typing 10-15 lines?

You could import a dictionary d into the global scope using
globals().update(d)
The same thing is impossible for local scopes, since modifying the dictionary returned by locals() results in undefined behaviour.
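If you only want some of the keys, a filtered update works too (a sketch; it assumes the requested keys exist in d):
wanted = ("encoders", "decoders")
globals().update({k: d[k] for k in wanted})  # only binds the names you asked for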
A slightly hacky trick you could use in this situation is to import the names into the dictionary of an on-the-fly created type:
d = {"encoders": 1, "decoders": 2}
t = type("", (), d)
print(t.encoders)
print(t.decoders)
This will at least be slightly more convenient than using d["decoders"] etc.
Alternatively, you could use exec to create your variables:
d = {"encoders": 1, "decoders": 2}
for k, v in d.items():
    exec(k + " = v")
This could also be done selectively.
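For example, restricted to the two names from the question (a sketch; it assumes module-level scope, since exec cannot bind function locals in Python 3):
for k in ("encoders", "decoders"):
    exec(k + " = d[k]")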

Related

How to input via gradio dropdowns using variable information stored in a data frame?

I'm trying to input info from an easily editable CSV file whose first row contains the variable names. I do this by iterating over the list of variables, so each variable becomes a new dropdown. With this method everything appears correctly in the interface, but something goes wrong when it comes to outputting the selected values. I suspect I need another approach, but I can't think of one short of generating a new gradio interface for each dropdown.
The following code hangs when you click Submit:
import gradio as gr
import pandas as pd
df = pd.read_csv(file_path)
# first row of the CSV as the variable names
variables = list(df.columns)
# filter out null values
df = df.dropna()
# rest of the rows as the options for each variable
options = {var: list(df[var]) for var in variables}
def assign_values(**kwargs):
    global save_variables
    save_variables.update(kwargs)
    return kwargs
save_variables = {}
inputs = [gr.inputs.Dropdown(label=var, choices=options[var], type="value") for var in variables]
outputs = ["text"]
gr.Interface(fn=assign_values, inputs=inputs, outputs=outputs, title="Assign Values").launch()
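One likely culprit (an assumption on my part, not a confirmed diagnosis): gr.Interface calls fn with one positional argument per input component, so a **kwargs-only signature never matches the call. A sketch that accepts positional values and zips them back onto the variable names:
def assign_values(*values):
    # pair each dropdown's selection with its variable name, in order
    selections = dict(zip(variables, values))
    save_variables.update(selections)
    return str(selections)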

Compact way to save JuMP optimization results in DataFrames

I would like to save all the variables and dual variables of my finished LP optimization in an efficient manner. My current solution works, but it is neither elegant nor suited to larger optimization programs with many variables and constraints, because I define and push! every single variable into DataFrames separately. Is there a way to iterate through the variables using all_variables(), and through the constraints using all_constraints() for the duals? While iterating, I would like to push the results into DataFrames with the variable index names as columns and save the DataFrames in a Dict().
A conceptual example for the variables would be:
Result_vars = Dict()
for vari in all_variables(Model)
    Result_vars["vari"] = DataFrame(data=[indexval(vari), value(vari)], columns=[index(vari), "Value"])
end
An example of how the variable is declared in JuMP:
@variable(Model, p[t=s_time,n=s_n,m=s_m], lower_bound=0, base_name="Expected production")
Result_vars[p] should then look approximately like this:
t,n,m,Value
1,1,1,50
2,1,1,60
3,1,1,145
Presumably, you want something like:
x = all_variables(model)
DataFrame(
    name = variable_name.(x),
    Value = value.(x),
)
If you want a more complicated structure, you need to write custom code.
T, N, M, primal_solution = [], [], [], []
for t in s_time, n in s_n, m in s_m
    push!(T, t)
    push!(N, n)
    push!(M, m)
    push!(primal_solution, value(p[t, n, m]))
end
DataFrame(t = T, n = N, m = M, Value = primal_solution)
See here for constraints: https://jump.dev/JuMP.jl/stable/constraints/#Accessing-constraints-from-a-model-1. You want something like:
for (F, S) in list_of_constraint_types(model)
    for con in all_constraints(model, F, S)
        @show dual(con)
    end
end
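Building on that pattern, a sketch (assuming a solved model in which every constraint has a dual) that collects all duals into a single DataFrame:
duals = DataFrame(constraint = String[], dual = Float64[])
for (F, S) in list_of_constraint_types(model)
    for con in all_constraints(model, F, S)
        # one row per constraint: its name and its dual value
        push!(duals, (name(con), dual(con)))
    end
end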
Thanks to Oscar, I have built a solution that can help automate the extraction of results.
The solution is built around a naming convention using base_name in the variable definition. One can copy-paste the variable definition into base_name, followed by a colon. E.g.:
@variable(Model, p[t=s_time,n=s_n,m=s_m], lower_bound=0, base_name="p[t=s_time,n=s_n,m=s_m]:")
The naming convention and syntax can be changed; comments can be added, for example, or base_name can simply be left undefined. The following function splits the base_name into the variable name, the sets (if needed), and the index:
function var_info(vars::VariableRef)
    split_conv = [":", "]", "[", ","]
    x_str = name(vars)
    if occursin(":", x_str)
        x_str = replace(x_str, " " => "")  # delete all spaces
        x_name, x_index = split(x_str, split_conv[1])  # split raw variable name + sets from the index
        x_name = replace(x_name, split_conv[2] => "")
        x_name, s_set = split(x_name, split_conv[3])  # split raw variable name from the sets
        x_set = split(s_set, split_conv[4])
        x_index = replace(x_index, split_conv[2] => "")
        x_index = replace(x_index, split_conv[3] => "")
        x_index = split(x_index, split_conv[4])
        return (x_name, x_set, x_index)
    else
        println("Var base_name not properly defined. Special syntax required in the form var[s=set]: ")
    end
end
The next functions create the columns and the index values, plus a column for the primal solution ("Value").
function create_columns(x)
    col_ind = [String(var_info(x)[2][col]) for col in 1:size(var_info(x)[2])[1]]
    cols = append!(["Value"], col_ind)
    return cols
end

function create_index(x)
    col_ind = [String(var_info(x)[3][ind]) for ind in 1:size(var_info(x)[3])[1]]
    index = append!([string(value(x))], col_ind)
    return index
end

function create_sol_matrix(varss, model)
    nested_sol_array = [create_index(xx) for xx in all_variables(model) if varss[1] == var_info(xx)[1]]
    sol_array = hcat(nested_sol_array...)
    return sol_array
end
Finally, the last function creates the Dict that holds all results of the variables as DataFrames, in the previously mentioned style:
function create_var_dict(model)
    Variable_dict = Dict(
        vars[1] => DataFrame(Dict(
            vars[2][1][cols] => create_sol_matrix(vars, model)[cols, :]
            for cols in 1:size(vars[2][1])[1]))
        for vars in unique([[String(var_info(x)[1]), [create_columns(x)]] for x in all_variables(model)]))
    return Variable_dict
end
When those functions are added to your script, you can simply retrieve all the solutions of the variables after the optimization by calling create_var_dict():
var_dict = create_var_dict(model)
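Individual results can then be looked up by the variable's base name, e.g. for the p variable declared above (assuming the naming convention was followed):
var_dict["p"]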
Be aware that these are nested functions: when you change the naming convention, you might have to update the other functions as well, and any comments you add must avoid the characters [, ], and ,.
This solution is obviously far from optimal. I believe there could be a more efficient solution falling back to MOI.

Generating DataFrames in a for loop in Scala Spark causes out of memory

I'm generating small DataFrames in a for loop. At each iteration of the loop, I pass the generated DataFrame to a function which returns a double. This simple process (which I thought could easily be taken care of by the garbage collector) blows up my memory. When I look at the Spark UI, each iteration of the loop adds a new "SQL{1-500}" entry (my loop runs 500 times). My question is: how do I drop this SQL object before generating a new one?
My code is something like this:
Seq.fill(500) {
    val data = (1 to 1000).map(_ => Random.nextInt(1000))
    val dataframe = createDataFrame(data)
    myFunction(dataframe)
    dataframe.unpersist()
}

def myFunction(df: DataFrame) = {
    df.count()
}
I tried to solve this problem with dataframe.unpersist() and sqlContext.clearCache(), but neither of them worked.
You have two places where I suspect something fishy is happening:
In the definition of myFunction: you really need to put the = before the body of the definition. I have had typos like that compile but produce really weird errors (note that I changed your myFunction for debugging purposes).
It is better to fill your Seq with something you know and then apply foreach or some such.
(You also need to replace random.nexInt with Random.nextInt. Also, you can only create a DataFrame from a Seq of a type that is a subtype of Product, such as a tuple, and you need to use sqlContext to call createDataFrame.)
This code works with no memory issues:
Seq.fill(500)(0).foreach { i =>
    val data = (1 to 1000).map(_.toDouble).toList.zipWithIndex
    val dataframe = sqlContext.createDataFrame(data)
    myFunction(dataframe)
}

def myFunction(df: DataFrame) = {
    println(df.count())
}
Edit: parallelizing the computation (across 10 cores) and returning the RDD of counts:
sc.parallelize(Seq.fill(500)(0), 10).map { i =>
    val data = (1 to 1000).map(_.toDouble).toList.zipWithIndex
    val dataframe = sqlContext.createDataFrame(data)
    myFunction(dataframe)
}

def myFunction(df: DataFrame) = {
    df.count()
}
Edit 2: the difference between declaring myFunction with = and without = is that the former is a (usual) function definition, while the latter is a procedure definition, which is only used for methods that return Unit. See this explanation. Here is the point illustrated in the Spark shell:
scala> def myf(df:DataFrame) = df.count()
myf: (df: org.apache.spark.sql.DataFrame)Long
scala> def myf2(df:DataFrame) { df.count() }
myf2: (df: org.apache.spark.sql.DataFrame)Unit

Serializing groovy map to string with quotes

I'm trying to persist a groovy map to a file. My current attempt is to write the string representation out and then read it back in and call evaluate on it to recreate the map when I'm ready to use it again.
The problem I'm having is that the toString() method of the map removes vital quotes from the values of the elements. When my code calls evaluate, it complains about an unknown identifier.
This code demonstrates the problem:
m = [a: 123, b: 'test']
print "orig: $m\n"
s = m.toString()
print " str: $s\n"
m2 = evaluate(s)
print " new: ${m2}\n"
The first two print statements almost work, but the quotes around the value for the key b are gone. Instead of showing [a: 123, b: 'test'], it shows [a: 123, b: test].
At this point the damage is done. The evaluate call chokes when it tries to evaluate test as an identifier and not a string.
So, my specific questions:
Is there a better way to serialize/de-serialize maps in Groovy?
Is there a way to produce a string representation of a map with proper quotes?
Groovy provides the inspect() method, which returns an object as a parseable string:
// serialize
def m = [a: 123, b: 'test']
def str = m.inspect()
// deserialize
m = Eval.me(str)
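A quick round-trip check (a sketch; it assumes the map contains only literals that inspect() can render):
def m = [a: 123, b: 'test']
// Eval.me should rebuild a map equal to the original
assert Eval.me(m.inspect()) == m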
Another way to serialize a Groovy map as a readable string is with JSON:
// serialize
import groovy.json.JsonBuilder
def m = [a: 123, b: 'test']
def builder = new JsonBuilder()
builder(m)
println builder.toString()
// deserialize
import groovy.json.JsonSlurper
def slurper = new JsonSlurper()
m = slurper.parseText('{"a": 123, "b": "test"}')
You can use myMap.toMapString().

overriding a method in R, using NextMethod

How does this work in R?
I am using a package (zoo 1.6-4) that defines an S3 class for time series sets.
I am writing a derived class in which I want to override a few methods, and I can't get past this one: [.zoo!
In my derived class, rows are indexed by timestamp as in zoo, but unlike zoo I allow only POSIXct values in the index. My users will be selecting columns all the time and slicing series only occasionally, so I want to offer obj[name] instead of obj[, name].
My objects have class c("delftfews", "zoo").
But how do I override a method?
I tried this:
"[.delftfews" <- function(x, i, j, drop=TRUE, ...) {
if (missing(i)) return(NextMethod())
if (all(class(i) == "character") && missing(j)) {
return(NextMethod('[', x=x, i=1:NROW(x), j=i, drop=drop, ...))
}
NextMethod()
}
But I get this error: Error in rval[i, j, drop = drop., ...] : incorrect number of dimensions.
I have worked around it by editing the source from zoo and removing those ..., but I don't get why that works. Can anybody explain what is going on here?
The problem is that with the above definition of [.delftfews this code:
library(zoo)
z <- structure(zoo(cbind(a = 1:3, b = 4:6)), class = c("delftfews", "zoo"))
z["a"]
# generates this call: `[.zoo`(x = 1:6, i = 1:3, j = "a", drop = TRUE, z, "a")
Your code does work as is if you write the call like this:
z[j = "a"]
# generates this call: `[.zoo`(x = z, j = "a")
I think what you want is to change the relevant line in [.delftfews to this:
return(NextMethod(.Generic, object = x, i = 1:NROW(x), drop = drop))
# z["a"] now generates this call: `[.zoo`(x = z, i = 1:3, j = "a", drop = TRUE)
A point of clarification: allowing only POSIXct index values does not allow indexing columns by name only. I'm not sure how you arrived at that conclusion.
You're overriding zoo correctly, but I think you misunderstand NextMethod. The error is caused by if (missing(i)) return(NextMethod()), which calls [.zoo if i is missing, but [.zoo requires i because zoo's internal data structure is a matrix. Something like this should work:
if (missing(i)) i <- 1:NROW(x)
though I'm not sure if you have to explicitly pass this new i to NextMethod...
You may be interested in the xts package, if you haven't already taken a look at it.