Getting a set of a list from a list of strings

Getting a set of a list from a list of strings - pandas

I have this dataframe:
df = pd.DataFrame({"c1":["[\"text\",\"text2\"]","[\"bla\",\"bla\",\"bla\"]"]})
and I'm removing the [] and " characters:
df["c2"] = df["c1"].apply(lambda x:re.sub('[\["\]]', "", x))
then I want to add df['c2'] to a list:
list = df['c2'].to_list()
Then I get this: ['text,text2', 'bla,bla,bla']
So far so good. But then I want a list with only unique values, which I could get using set(list).
The problem is that instead of ['text,text2', 'bla,bla,bla'] I need ['text','text2', 'bla','bla','bla'], so that applying set(list) gives what I expect:
['text','text2','bla']

First, don't use list as a variable. Second, once you get ['text,text2',...] you can use str.split. So your set would be
{y for x in df['c2'].str.split(',') for y in x}
Output:
{'bla', 'text', 'text2'}
Note: You can also use a regex directly to extract everything between the double quotes:
set(df['c1'].str.extractall('\"([^"]+)\"')[0])
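As a possible alternative, the strings in c1 from the question happen to be valid JSON arrays, so they could be parsed instead of stripped with a regex. The following is a minimal sketch under that assumption; the name c1_values is hypothetical and simply stands in for df['c1'].to_list().

```python
import json

# Sample data mirroring the c1 column from the question
# (assumption: every entry is a valid JSON array of strings).
c1_values = ['["text","text2"]', '["bla","bla","bla"]']

# Parse each string into a real list, then flatten everything into one set.
unique = {item for raw in c1_values for item in json.loads(raw)}

# Sorted only to get a stable display order; the set itself is unordered.
print(sorted(unique))
```

This avoids problems with values that themselves contain commas, which would break a plain split(',').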

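Try this (the list from the question is called values here, because naming it list shadows the built-in and makes list(set(new)) fail):
new = []
for l in values:  # values = df['c2'].to_list()
    new.extend(l.split(','))
new = list(set(new))
which results in new being (in arbitrary order, since sets are unordered)
['text2', 'text', 'bla']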
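A set does not keep insertion order, so the result can come back in any order. If the order of first appearance matters, one option is dict.fromkeys, which keeps keys in insertion order. This is a small sketch; the names values, tokens and unique_in_order are illustrative only.

```python
# The list obtained from df['c2'].to_list() in the question.
values = ['text,text2', 'bla,bla,bla']

# Split every entry on commas and flatten into one list of tokens.
tokens = [token for entry in values for token in entry.split(',')]

# dict keys are unique and keep insertion order, so this de-duplicates
# while preserving the order of first appearance.
unique_in_order = list(dict.fromkeys(tokens))

print(unique_in_order)  # ['text', 'text2', 'bla']
```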

Related

Swapping characters in Strings Python

I have been trying to make something like an encoder.
Here is my idea:
dict = {
    1: "!",
    2: "#"
}
in = 21 # Input number
out = ?
print(out) # Should print "#!"
Is there any way I could do this?

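What you want is exactly the translate method of str (note that in is a reserved word in Python, so the input is called num here):
x = "12"
y = "!#"
num = 21
txt = str(num)
mapping = str.maketrans(x, y)
out = txt.translate(mapping)
print(out) # prints "#!"
See the Python documentation for str.maketrans and str.translate for the complete reference.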
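Since the question starts from a dictionary, the translation table can also be built from it directly. A minimal sketch, assuming the keys are single digits; the names codes, table and encode are hypothetical and chosen to avoid shadowing dict or using the keyword in.

```python
# Mapping from the question, renamed so it does not shadow the built-in dict.
codes = {1: "!", 2: "#"}

# str.maketrans accepts a dict whose keys are single characters,
# so the integer digits are converted to strings first.
table = str.maketrans({str(digit): symbol for digit, symbol in codes.items()})


def encode(number: int) -> str:
    """Replace every digit of number that has an entry in codes."""
    return str(number).translate(table)


print(encode(21))  # prints "#!"
```

Digits without an entry in codes are left unchanged by translate.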

Compact way to save JuMP optimization results in DataFrames

I would like to save all my variables and dual variables of my finished lp-optimization in an efficient manner. My current solution works, but is neither elegant nor suited for larger optimization programs with many variables and constraints because I define and push! every single variable into DataFrames separately. Is there a way to iterate through the variables using all_variables() and all_constraints() for the duals? While iterating, I would like to push the results into DataFrames with the variable index name as columns and save the DataFrame in a Dict().
A conceptual example would be for variables:
Result_vars = Dict()
for vari in all_variables(Model)
    Result_vars["vari"] = DataFrame(data=[indexval(vari),value(vari)],columns=[index(vari),"Value"])
end
An example of the appearance of the declared variable in JuMP and DataFrame:
@variable(Model, p[t=s_time,n=s_n,m=s_m], lower_bound=0,base_name="Expected production")
And Result_vars[p] shall approximately look like:
t,n,m,Value
1,1,1,50
2,1,1,60
3,1,1,145

Presumably, you could do something like:
x = all_variables(model)
DataFrame(
    name = name.(x),
    Value = value.(x),
)
If you want a more complicated structure, you need to write custom code:
T, N, M, primal_solution = [], [], [], []
for t in s_time, n in s_n, m in s_m
    push!(T, t)
    push!(N, n)
    push!(M, m)
    push!(primal_solution, value(p[t, n, m]))
end
DataFrame(t = T, n = N, m = M, Value = primal_solution)
See here for constraints: https://jump.dev/JuMP.jl/stable/constraints/#Accessing-constraints-from-a-model-1. You want something like:
for (F, S) in list_of_constraint_types(model)
    for con in all_constraints(model, F, S)
        @show dual(con)
    end
end

Thanks to Oscar, I have built a solution that could help to automate the extraction of results.
The solution is built around a naming convention using base_name in the variable definition. One can copy and paste the variable definition into base_name, followed by a colon. E.g.:
@variable(Model, p[t=s_time,n=s_n,m=s_m], lower_bound=0,base_name="p[t=s_time,n=s_n,m=s_m]:")
The naming convention and syntax can be changed: comments can be added, for example, or one can simply not define a base_name. The following function splits the base_name into the variable name, the sets (if needed) and the index:
function var_info(vars::VariableRef)
    split_conv = [":","]","[",","]
    x_str = name(vars)
    if occursin(":",x_str)
        x_str = replace(x_str, " " => "") # Deletes all spaces
        x_name,x_index = split(x_str,split_conv[1]) # Splits raw variable name + sets from the index
        x_name = replace(x_name, split_conv[2] => "")
        x_name,s_set = split(x_name,split_conv[3]) # Splits raw variable name and sets
        x_set = split(s_set,split_conv[4])
        x_index = replace(x_index, split_conv[2] => "")
        x_index = replace(x_index, split_conv[3] => "")
        x_index = split(x_index,split_conv[4])
        return (x_name,x_set,x_index)
    else
        println("Var base_name not properly defined. Special Syntax required in form var[s=set]: ")
    end
end
The next functions create the columns and the index values plus columns for the primal solution ("Value").
function create_columns(x)
    col_ind=[String(var_info(x)[2][col]) for col in 1:size(var_info(x)[2])[1]]
    cols = append!(["Value"],col_ind)
    return cols
end
function create_index(x)
    col_ind=[String(var_info(x)[3][ind]) for ind in 1:size(var_info(x)[3])[1]]
    index = append!([string(value(x))],col_ind)
    return index
end
function create_sol_matrix(varss,model)
    nested_sol_array=[create_index(xx) for xx in all_variables(model) if varss[1]==var_info(xx)[1]]
    sol_array=hcat(nested_sol_array...)
    return sol_array
end
Finally, the last function creates the Dict which holds all results of the variables in DataFrames in the previously mentioned style:
function create_var_dict(model)
    Variable_dict=Dict(vars[1]
        =>DataFrame(Dict(vars[2][1][cols]
            =>create_sol_matrix(vars,model)[cols,:] for cols in 1:size(vars[2][1])[1]))
        for vars in unique([[String(var_info(x)[1]),[create_columns(x)]] for x in all_variables(model)]))
    return Variable_dict
end
When those functions are added to your script, you can simply retrieve all the solutions of the variables after the optimization by calling create_var_dict():
var_dict = create_var_dict(model)
Be aware that these functions call each other. If you change the naming convention, you might have to update the other functions as well. If you add comments to base_name, avoid using [, ], and , in them.
This solution is obviously far from optimal. I believe there could be a more efficient solution falling back to MOI.

difference between two lists that include duplicates

I have a problem with two lists which contain duplicates
a = [1,1,2,3,4,4]
b = [1,2,3,4]
I would like to be able to extract the difference between the two lists, i.e.
c = [1,4]
but if I do c = a-b I get c = []
It should be trivial, but I can't figure it out :(
I also tried to iterate over the bigger list and remove items from it when I find them in the smaller list, but I can't update the list on the fly, so that does not work either.
Does anyone have an idea?
Thanks

You see an empty c as a result because removing e.g. 1 removes all elements that are equal to 1.
groovy:000> [1,1,1,1,1,2] - 1
===> [2]
What you need instead is to remove each occurrence of a specific value separately. For that, you can use Groovy's Collection.removeElement(n), which removes a single element that matches the value. You can do it in a regular for loop, or you can use another Groovy collection method, e.g. inject, to reduce a copy of a by removing each occurrence separately.
def c = b.inject([*a]) { acc, val -> acc.removeElement(val); acc }
assert c == [1,4]
Keep in mind that the inject method receives a copy of the a list (the expression [*a] creates a new list from the elements of a). Otherwise, acc.removeElement() would modify the existing a list. The inject method is equivalent to the popular reduce or fold operation. Each iteration in this example can be visualized as:
--inject starts--
acc = [1,1,2,3,4,4]; val = 1; acc.removeElement(1) -> return [1,2,3,4,4]
acc = [1,2,3,4,4]; val = 2; acc.removeElement(2) -> return [1,3,4,4]
acc = [1,3,4,4]; val = 3; acc.removeElement(3) -> return [1,4,4]
acc = [1,4,4]; val = 4; acc.removeElement(4) -> return [1,4]
--inject ends--
PS: Kudos to almighty tim_yates who recommended improvements to that answer. Thanks, Tim!

The most readable version that comes to my mind is:
a = [1,1,2,3,4,4]
b = [1,2,3,4]
c = a.clone()
b.each {c.removeElement(it)}
If you use this frequently, you could add a method to the List metaClass:
List.metaClass.removeElements = { values -> values.each { delegate.removeElement(it) } }
a = [1,1,2,3,4,4]
b = [1,2,3,4]
c = a.clone()
c.removeElements(b)
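For comparison only, the same remove-one-occurrence-at-a-time idea can be sketched in Python; this is an illustration in a different language, not part of the Groovy answers. Python's list.remove drops only the first matching element, much like removeElement, and collections.Counter offers multiset subtraction as a second route.

```python
from collections import Counter

a = [1, 1, 2, 3, 4, 4]
b = [1, 2, 3, 4]

# Work on a copy so that a itself is left untouched.
c = list(a)
for value in b:
    if value in c:
        c.remove(value)  # removes only the first occurrence of value

# Alternative: treat both lists as multisets and subtract the counts.
# Counts that drop to zero or below are discarded.
c_counter = list((Counter(a) - Counter(b)).elements())

print(c)          # [1, 4]
print(c_counter)  # [1, 4]
```

The loop version keeps the remaining elements in their original positions, while the Counter version groups equal elements together, so the two can differ in ordering for other inputs.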

Invalid parameter number when using whereNotIn in laravel

I want to print a list of options from the database in my Blade view, showing every option except the ones that are already selected. What I've done so far is collect the selected options (which are saved in another table) as an array:
$ctg_id[] = [];
foreach($vendor->category as $item){
    $ctg_id[] = [$item->category_id];
}
Then I finally fetch the remaining options:
$category = ItemCategory::whereNotIn('id',$ctg_id)->get();
But it returns an error that says:
"SQLSTATE[HY093]: Invalid parameter number (SQL: select * from category_item where id not in (4, 3, ?))"
Is there anything wrong with my code?

You are building a two-dimensional array; you need to change it to a one-dimensional one:
$ctg_id = [];
foreach($vendor->category as $item){
    $ctg_id[] = $item->category_id;
}
I recommend using the pluck method instead, like this:
$category = ItemCategory::whereNotIn('id', $vendor->category->pluck('category_id'))->get();

Lua get index name of table as table

Is there any way to get every index value of a table?
Example:
local mytbl = {
    ["Hello"] = 123,
    ["world"] = 321
}
I want to get this:
{"Hello", "world"}

local t = {}
for k, v in pairs(mytbl) do
    table.insert(t, k) -- or t[#t + 1] = k
end
Note that the order of how pairs iterates a table is not specified. If you want to make sure the elements in the result are in a certain order, use:
table.sort(t)

We Keep Coding
