forNonBlank function in OpenRefine

I get an error when using forNonBlank in OpenRefine's Templating Export feature.
I have cells with multiple subjects that I want to capture in separate dcterms:subject xml elements. Example:
Geology--Alberta--Coal Valley. // Geology, Structural. // Geology, Stratigraphic--Cretaceous.
I am using OpenRefine's Templating Export option to export to XML, similarly to the process described here.
This expression works fine:
{{forEach(cells["dcterms:subject"].value.split(" // "), v, "<dcterms:subject>" + v + "</dcterms:subject>\n")}}
I get:
<dcterms:subject>Geology--Alberta--Coal Valley.</dcterms:subject>
<dcterms:subject>Geology, Structural.</dcterms:subject>
<dcterms:subject>Geology, Stratigraphic--Cretaceous.</dcterms:subject>
But when using forNonBlank as in:
{{forNonBlank(cells["dcterms:subject"].value.split(" // "), v, "<dcterms:subject>" + v + "</dcterms:subject>\n", "")}}
I get:
<dcterms:subject>[Ljava.lang.String;@16657412</dcterms:subject>
Is there something wrong with my coding, or is this a bug?
Thanks for your help.

forNonBlank isn't an iterative function, so the expression:
forNonBlank(cells["dcterms:subject"].value.split(" // "), v, "<dcterms:subject>" + v + "</dcterms:subject>\n", "")
tests whether the array created by the split is blank (the whole array, not each item in it) and, finding that it is not, assigns the entire array to the variable 'v'. Concatenating that array into the string is what produces the Java array representation seen in your output.
Essentially, 'forNonBlank' behaves like a combination of 'if' and 'isNonBlank', not of 'forEach' and 'isNonBlank'.
You have several options for doing what you want, but you need an iterator in there somewhere. For example:
forEach(cells["dcterms:subject"].value.split(" // "), v, forNonBlank(v, w, "<dcterms:subject>" + w + "</dcterms:subject>", "")).join("\n")
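The difference between the two expressions can be sketched outside OpenRefine. The Python below is only an emulation: is_blank, for_non_blank and wrap are hypothetical helpers written for this illustration, not part of GREL, and the blankness test is deliberately simplified.

```python
def is_blank(value):
    # Simplified stand-in for GREL blankness: None or an empty string.
    return value is None or value == ""

def for_non_blank(value, if_non_blank, if_blank):
    # Tests `value` once, as a whole, and never loops over it.
    return if_blank if is_blank(value) else if_non_blank(value)

def wrap(text):
    # Builds one XML element around whatever it is given.
    return "<dcterms:subject>" + str(text) + "</dcterms:subject>"

cell = "Geology--Alberta--Coal Valley. // Geology, Structural."
parts = cell.split(" // ")

# Whole-array test: the list is not blank, so it is stringified into ONE element.
whole = for_non_blank(parts, wrap, "")

# Per-item test: the loop supplies the iteration that forNonBlank lacks.
per_item = "\n".join(for_non_blank(p, wrap, "") for p in parts)

print(whole)
print(per_item)
```

The first print shows a single element containing the stringified list, which mirrors the symptom in the question; the second shows one element per subject.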

OpenEdge dynamic buffers... how do I avoid error 7328? ("unambiguous buffer field/reference for buffers known to a query")

Although I've been supporting (and extending) a legacy OE application for more than 10 years, I've never before been forced into the scary world of dynamic buffers... However, my luck has finally run out.
Let me start by saying I cannot believe how opaque the little OE documentation I could find is... the only Progress guide seems to be in the online documentation for v10.2 (thanks to the contributor on one of the forums for even that snippet).
Anyway, this should be almost trivial. Except that it doesn't work:
DEFINE VARIABLE hFileBuffer AS WIDGET-HANDLE.
DEFINE VARIABLE hFieldBuffer AS WIDGET-HANDLE.
DEFINE VARIABLE cWhere AS CHARACTER.
DEFINE VARIABLE hQuery AS HANDLE.
CREATE BUFFER hFileBuffer FOR TABLE "_File".
CREATE BUFFER hFieldBuffer FOR TABLE "_Field".
CREATE QUERY hQuery.
hQuery:SET-BUFFERS(hFileBuffer).
hQuery:ADD-BUFFER(hFieldBuffer).
cWhere = SUBSTITUTE(
"FOR EACH _File " +
" NO-LOCK, " +
" EACH _Field " +
" WHERE _Field.File-recid = _File._File-recid " +
" NO-LOCK"
).
message cWhere.
pause.
hQuery:Query-PREPARE(cWhere).
hQuery:Query-OPEN().
DELETE OBJECT hQuery.
DELETE OBJECT hFileBuffer.
DELETE OBJECT hFieldBuffer.
ASSIGN hQuery = ?
hFileBuffer = ?
hFieldBuffer = ?.
The output from "message" is (after removing redundant spaces):
FOR EACH _File NO-LOCK, EACH _Field WHERE _Field.File-recid = _File._File-recid NO-LOCK
which looks fine to me.
However I then get:
_Field File-recid must be a quoted constant or an unabbreviated, unambiguous buffer/field reference for buffers known to query . (7328)
I just cannot see what is ambiguous about "_Field.File-recid" or "_File._File-recid". Or am I missing something? (I should add that the equivalent works in good ol'-fashioned static OpenEdge!)
Hoping someone wiser than I can advise,
Allan.
There are two issues in your dynamic query string:
a) It's RECID(_file) and not _file._file-recid (no _file-recid field on _file)
b) It's _field._file-recid and not _field.file-recid (underscore missing)
cWhere = SUBSTITUTE(
"FOR EACH _File " +
" NO-LOCK, " +
" EACH _Field " +
" WHERE _Field._File-recid = recid(_file)" +
" NO-LOCK"
).
You can enable the display of hidden fields in the Data Dictionary to check these names.
Just an example on ABL Dojo to watch your query fly:
def var hbfile as handle no-undo.
def var hbfield as handle no-undo.
def var hq as handle no-undo.
def var cquery as char no-undo.
create buffer hbfile for table '_file'.
create buffer hbfield for table '_field'.
create query hq.
hq:set-buffers( hbfile, hbfield ).
cquery = substitute(
'for each &1 where &1._hidden = false'
+ ', each &2 where &2._file-recid = recid( &1 )'
+ ' break by &1._file-name',
hbfile:name,
hbfield:name
).
hq:query-prepare( cquery ).
hq:query-open().
do while hq:get-next():
if hq:first-of( 1 ) then
message hbfile::_file-name.
message ' ' hbfield::_field-name.
end.
finally:
delete object hq no-error.
delete object hbfile no-error.
delete object hbfield no-error.
end finally.
A few additional issues with your snippet:
Buffer handles are regular handles, so there is no need for the meaningless WIDGET- prefix (use HANDLE rather than WIDGET-HANDLE).
When working with dynamic buffers, it really helps to build the query from the :name of each buffer; this lets you change buffer names without causing the query to fail.

Building string from list of list of strings

I have this rather ugly way of building a string from a list:
val input = listOf("[A,B]", "[C,D]")
val builder = StringBuilder()
builder.append("Serialized('IDs((")
for (pt in input) {
builder.append(pt[0] + " " + pt[1])
builder.append(", ")
}
builder.append("))')")
The problem is that it adds a comma after the last element and if I want to avoid that I need to add another if check in the loop for the last element.
Is there a more concise way of doing this in Kotlin?
EDIT
End result should be something like:
Serialized('IDs((A B,C D))')
In Kotlin you can use joinToString for this kind of use case (it deals with inserting the separator only between elements).
It is very versatile because it lets you specify a transform function for each element (in addition to the more classic separator, prefix, and postfix). This makes it equivalent to mapping all elements to strings and then joining them together, but in a single call.
If input really is a List<List<String>> like you mention in the title and you assume in your loop, you can use:
input.joinToString(
prefix = "Serialized('IDs((",
postfix = "))')",
separator = ", ",
) { (x, y) -> "$x $y" }
Note that the syntax with (x, y) is a destructuring syntax that automatically gets the first and second element of the lists inside your list (parentheses are important).
If your input is in fact a List<String> as in listOf("[A,B]", "[C,D]") that you wrote at the top of your code, you can instead use:
input.joinToString(
prefix = "Serialized('IDs((",
postfix = "))')",
separator = ", ",
) { it.removeSurrounding("[", "]").replace(",", " ") }
val input = listOf("[A,B]", "[C,D]")
val result =
"Serialized('IDs((" +
input.joinToString(",") { it.removeSurrounding("[", "]").replace(",", " ") } +
"))')"
println(result) // Output: Serialized('IDs((A B,C D))')
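Kotlin provides an extension function joinToString (on Iterable) for this type of purpose.
input.joinToString(",", "Serialized('IDs((", "))')")
This inserts the separator only between elements. With no transform lambda the elements are emitted unchanged, so pass one (as in the examples above) to turn "[A,B]" into "A B".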

Compact way to save JuMP optimization results in DataFrames

I would like to save all my variables and dual variables of my finished lp-optimization in an efficient manner. My current solution works, but is neither elegant nor suited for larger optimization programs with many variables and constraints because I define and push! every single variable into DataFrames separately. Is there a way to iterate through the variables using all_variables() and all_constraints() for the duals? While iterating, I would like to push the results into DataFrames with the variable index name as columns and save the DataFrame in a Dict().
A conceptual example would be for variables:
Result_vars = Dict()
for vari in all_variables(Model)
Result_vars["vari"] = DataFrame(data=[indexval(vari),value(vari)],columns=[index(vari),"Value"])
end
An example of the appearance of the declared variable in JuMP and DataFrame:
@variable(Model, p[t=s_time,n=s_n,m=s_m], lower_bound=0,base_name="Expected production")
And Result_vars[p] shall approximately look like:
t,n,m,Value
1,1,1,50
2,1,1,60
3,1,1,145
Presumably, you could do something like:
x = all_variables(model)
DataFrame(
name = name.(x),
Value = value.(x),
)
If you want a more complicated structure, you need to write custom code.
T, N, M, primal_solution = [], [], [], []
for t in s_time, n in s_n, m in s_m
push!(T, t)
push!(N, n)
push!(M, m)
push!(primal_solution, value(p[t, n, m]))
end
DataFrame(t = T, n = N, m = M, Value = primal_solution)
See here for constraints: https://jump.dev/JuMP.jl/stable/constraints/#Accessing-constraints-from-a-model-1. You want something like:
for (F, S) in list_of_constraint_types(model)
for con in all_constraints(model, F, S)
@show dual(con)
end
end
Thanks to Oscar, I have built a solution that could help automate the extraction of results.
The solution is built around a naming convention using base_name in the variable definition. One can copy and paste the variable definition into base_name, followed by a colon. E.g.:
@variable(Model, p[t=s_time,n=s_n,m=s_m], lower_bound=0,base_name="p[t=s_time,n=s_n,m=s_m]:")
The naming convention and syntax can be changed, comments can e.g. be added, or one can just not define a base_name. The following function divides the base_name into variable name, sets (if needed) and index:
function var_info(vars::VariableRef)
split_conv = [":","]","[",","]
x_str = name(vars)
if occursin(":",x_str)
x_str = replace(x_str, " " => "") #Deletes all spaces
x_name,x_index = split(x_str,split_conv[1]) #splits raw variable name+ sets and index
x_name = replace(x_name, split_conv[2] => "")
x_name,s_set = split(x_name,split_conv[3])#splits raw variable name and sets
x_set = split(s_set,split_conv[4])
x_index = replace(x_index, split_conv[2] => "")
x_index = replace(x_index, split_conv[3] => "")
x_index = split(x_index,split_conv[4])
return (x_name,x_set,x_index)
else
println("Var base_name not properly defined. Special Syntax required in form var[s=set]: ")
end
end
The next functions create the columns and the index values plus columns for the primal solution ("Value").
function create_columns(x)
col_ind=[String(var_info(x)[2][col]) for col in 1:size(var_info(x)[2])[1]]
cols = append!(["Value"],col_ind)
return cols
end
function create_index(x)
col_ind=[String(var_info(x)[3][ind]) for ind in 1:size(var_info(x)[3])[1]]
index = append!([string(value(x))],col_ind)
return index
end
function create_sol_matrix(varss,model)
nested_sol_array=[create_index(xx) for xx in all_variables(model) if varss[1]==var_info(xx)[1]]
sol_array=hcat(nested_sol_array...)
return sol_array
end
Finally, the last function creates the Dict which holds all results of the variables in DataFrames in the previously mentioned style:
function create_var_dict(model)
Variable_dict=Dict(vars[1]
=>DataFrame(Dict(vars[2][1][cols]
=>create_sol_matrix(vars,model)[cols,:] for cols in 1:size(vars[2][1])[1]))
for vars in unique([[String(var_info(x)[1]),[create_columns(x)]] for x in all_variables(model)]))
return Variable_dict
end
When those functions are added to your script, you can simply retrieve all the solutions of the variables after the optimization by calling create_var_dict():
var_dict = create_var_dict(model)
Be aware: they are nested functions. When you change the naming convention, you might have to update the other functions as well. If you add more comments you have to avoid using [, ], and ,.
This solution is obviously far from optimal. I believe there could be a more efficient solution falling back to MOI.

method for serializing lua tables

I may have missed this, but is there a built-in method for serializing/deserializing lua tables to text files and vice versa?
I had a pair of methods in place to do this on a lua table with fixed format (e.g. 3 columns of data with 5 rows).
Is there a way to do this on lua tables with any arbitrary format?
For an example, given this lua table:
local scenes={
{name="scnSplash",
obj={
{
name="bg",
type="background",
path="scnSplash_bg.png",
},
{
name="bird",
type="image",
path="scnSplash_bird.png",
x=0,
y=682,
},
}
},
}
It would be converted into text like this:
{name="scnSplash",obj={{name="bg",type="background",path="scnSplash_bg.png",},{name="bird", type="image",path="scnSplash_bird.png",x=0,y=682,}},}
The format of the serialized text can be defined in any way, as long as the text string can be deserialized into an empty lua table.
I'm not sure why JSON library was marked as the right answer as it seems to be very limited in serializing "lua tables with any arbitrary format". It doesn't handle boolean/table/function values as keys and doesn't handle circular references. Shared references are not serialized as shared and math.huge values are not serialized correctly on Windows. I realize that most of these are JSON limitations (and hence implemented this way in the library), but this was proposed as a solution for generic Lua table serialization (which it is not).
One would be better off by using one of the implementations from TableSerialization page or my Serpent serializer and pretty-printer.
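Several of the limitations listed above come from the JSON format itself and can be reproduced in any language. The sketch below uses Python's standard json module purely as a stand-in (it is not the Lua library under discussion, and the variable names are made up for the illustration) to show what happens to shared references, circular references, and non-string keys.

```python
import json

# Shared reference: both keys point at the same dict before serialization.
shared = {"name": "bg"}
scene = {"first": shared, "second": shared}
restored = json.loads(json.dumps(scene))
# After the round trip the two values are equal but are separate copies.
shared_preserved = restored["first"] is restored["second"]

# Circular reference: a dict that contains itself cannot be encoded.
cyclic = {}
cyclic["self"] = cyclic
try:
    json.dumps(cyclic)
    cycle_error = None
except ValueError as exc:
    cycle_error = str(exc)

# Non-string keys: numbers and booleans come back as strings.
keys_after = list(json.loads(json.dumps({2: "two", False: "no"})).keys())

print(shared_preserved, cycle_error, keys_after)
```

A serializer aimed at arbitrary tables has to track already-visited tables itself to handle the first two cases, which plain JSON has no way to express.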
Lua alone doesn't have any such builtin, but implementing one is not difficult. A number of prebaked implementations are listed here: http://lua-users.org/wiki/TableSerialization
require "json"
local t = json.decode( jsonFile( "sample.json" ) )
reference here for a simple json serializer.
Add json.lua from rxi/json.lua to your project, then use it with:
local json = require("json")
local encoded = json.encode({
name = "J. Doe",
age = 42
})
local decoded = json.decode(encoded)
print(decoded.name)
Note that the code chokes if there are functions in the value you are trying to serialize. You have to fix lines 82 and 93 in the code to skip values that have the function type.
Small solution: keys could also be written without brackets, but only if they contain no minus signs or other special characters.
local nl = string.char(10) -- newline
function serialize_list (tabl, indent)
indent = indent and (indent.." ") or ""
local str = ''
str = str .. indent.."{"..nl
for key, value in pairs (tabl) do
local pr = (type(key)=="string") and ('["'..key..'"]=') or ""
if type (value) == "table" then
str = str..pr..serialize_list (value, indent)
elseif type (value) == "string" then
str = str..indent..pr..'"'..tostring(value)..'",'..nl
else
str = str..indent..pr..tostring(value)..','..nl
end
end
str = str .. indent.."},"..nl
return str
end
local str = serialize_list(tables) -- 'tables' stands for the table you want to serialize
print(str)

Why is MATLAB reporting my variable uninitialized?

I made a class and in one of its methods I needed to calculate the distance between two points. So I wrote an ordinary function named "remoteness" to do this for me.
Compilation Error:
At compilation, "remoteness" was determined to be a variable and this variable is uninitialized. "remoteness" is also a function name and previous versions of MATLAB would have called the function. However, MATLAB 7 forbids the use of the same name in the same context as both a function and a variable.
Error in ==> TRobot>TRobot.makeVisibilityGraph at 58
obj.visiblityGraph(k,k+1) = remoteness(:,obj.VGVertices(k),obj.VGVertices(:,k+1));
I thought the name remoteness might be a name of another function, but when I changed its name to kamran the error persisted. It should be noted that I can use the kamran function (or remoteness) in the command line without any problem.
Command line example:
>> kamran([0,0],[3,4])
ans = 5
The code of the kamran function is in a separate m file.
Code for kamran function:
function dist = kamran(v1,v2)
dist = sqrt( (v1(1) - v2(1)) ^2 + (v1(2) - v2(2)) ^2 );
Code example for how kamran function is used:
function obj = makeVisibilityGraph(obj)
verticesNumber = 0;
for num = 1: size(obj.staticObstacle,2)
verticesNumber = verticesNumber + size(obj.staticObstacle(num).polygon,2);
end
% in the below line, 2 is for start and goal vertices
obj.visibilityGraph = ones(2 + size(obj.VGVertices,2)) * Inf;
for j=1 : size(obj.staticObstacle,2)
index = size(obj.VGVertices,2);
obj.VGVertices = [obj.VGVertices, obj.staticObstacle(j).polygon];
obj.labelVGVertices = [obj.labelVGVertices, ones(1,size(obj.staticObstacle(j).polygon,2))* j ];
for k = index+1 : (size(obj.VGVertices,2)-1)
obj.visiblityGraph(k,k+1) = kamran(:,obj.VGVertices(k),obj.VGVertices(:,k+1));
end
% as the first and last point of a polygon are visible to each
% other, so set them visible to each other
obj.visibilityGraph(index+1,size(obj.VGVertices,2)) = ...
kamran( obj.VGVertices(:,index+1), obj.VGVertices(:,size(obj.VGVertices,2)));
end
end
You seem to be trying to use kamran as an array:
kamran(:,obj.VGVertices(k),obj.VGVertices(:,k+1));
Notice the first parameter ":"?
I would bet MATLAB assumes that kamran (as called here) should be a 3-dimensional array, and you are trying to select the subset containing
kamran(all-of-first-index, Nth-of-second, Mth-of-third)
The second invocation of kamran looks right:
kamran( obj.VGVertices(:,index+1), obj.VGVertices(:,size(obj.VGVertices,2)))
I do not know MATLAB but I notice on this line, you are running kamran with what looks like 3 arguments. In all other cases, it is executed with 2 arguments. Maybe there is something to that?
kamran(:,obj.VGVertices(k),obj.VGVertices(:,k+1));