How to apply a function to each value of an array and store the results - authorization

I am new to Open Policy Agent and the Rego language. I have an array of strings. Each of those strings needs to have some operation performed on them so they may be in the appropriate format for use later. Is there a way to apply a function to each element in the array and then store those processed elements as a result array?

That would normally be done with an array comprehension:
a := ["a", "a", "b", "c"]
b := [x | y := a[_]
x := upper(y)
x == "A"]
# b == ["A", "A"]

Related

OrderedDict contain list of DataFrame

I don't understand why method1 is fine but not the second ...
method 1
import pandas as pd
import collections
d = collections.OrderedDict([('key', []), ('key2', [])])
df = pd.DataFrame({'id': [1], 'test': ['ok']})
d['key'].append(df)
d
OrderedDict([('key', [ id test
0 1 ok]), ('key2', [])])
method 2
l = ['key', 'key2']
dl = collections.OrderedDict(zip(l, [[]]*len(l)))
dl
OrderedDict([('key', []), ('key2', [])])
dl['key'].append(df)
dl
OrderedDict([('key', [ id test
0 1 ok]), ('key2', [ id test
0 1 ok])])
dl == d True
The issue stems from creating the empty lists with [[]] * len(l). This does not create len(l) separate lists; it copies the reference to a single empty list multiple times, so you end up with a list whose elements all point to the same underlying object. Any change made to that list through an in-place operation (such as append) is then visible through every reference to it.
The same kind of issue arises when assigning one variable to another:
a = []
b = a
# `a` and `b` both point to the same underlying object.
b.append(1) # in-place operation changes the underlying object
print(a, b)
[1] [1]
To circumvent your issue, instead of using [[]] * len(l) you can use a generator expression or list comprehension to ensure a new empty list is created for each element in list l:
collections.OrderedDict(zip(l, ([] for _ in l)))
Using the generator expression ([] for _ in l) creates a new empty list for every element in l instead of copying the reference to a single empty list. The easiest way to check this is to use the id function to compare the ids of the underlying objects. Here we'll compare your original method to the new method:
# The ids come out the same, indicating that the objects are references to the same underlying list
>>> [id(x) for x in [[]] * len(l)]
[2746221080960, 2746221080960]
# The ids come out different, indicating that they point to different underlying lists
>>> [id(x) for x in ([] for _ in l)]
[2746259049600, 2746259213760]
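As a minimal sketch of the same check using identity comparison (is) rather than printed ids, the snippet below builds the dictionary both ways with the keys from the question. The names shared and independent are illustrative and not from the original post, and a plain string stands in for the DataFrame so that the example needs only the standard library.

```python
import collections

keys = ['key', 'key2']

# [[]] * n repeats a reference to ONE list object n times.
shared = collections.OrderedDict(zip(keys, [[]] * len(keys)))

# The generator evaluates [] once per key, so each key gets its own list.
independent = collections.OrderedDict((k, []) for k in keys)

# 'df' is a placeholder for the DataFrame appended in the question.
shared['key'].append('df')
independent['key'].append('df')

print(shared['key'] is shared['key2'])            # True
print(independent['key'] is independent['key2'])  # False
print(list(shared.values()))       # [['df'], ['df']]
print(list(independent.values()))  # [['df'], []]
```

Only the second construction leaves 'key2' empty after appending to 'key'.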

Extract a dataframe from a list of dataframes containing a substring

I have the following dataframes in python that are part of a list
dataframe_list= []## CREATE AN EMPTY LIST
import pandas as pd
A=pd.DataFrame()
A["name"]=["A", "A", "A"]
A["att"]=["New World", "Hello", "Big Day now"]
B=pd.DataFrame()
B["name"]=["A2", "A2", "A2"]
B["Col"]=["L", "B", "B"]
B["CC"]=["old", "Hello", "Big Day now"]
C=pd.DataFrame()
C["name"]=["Brave old World", "A", "A"]
The above dataframes are of different sizes. They are stored in a list as follows:
dataframe_list.append(A)
dataframe_list.append(B)
dataframe_list.append(C)
I am trying to extract the two dataframes that contain the word "world" (irrespective of case). I have tried the following code:
list1=["World"]
result=[x for x in dataframe_list if any(x.isin(list1) ) ]
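This, however, yields all the dataframes. The expected output is dataframes A and C. I am not sure where I am making a mistake.
Iterating over a DataFrame yields its column names, so any(x.isin(list1)) tests the column labels (non-empty strings, hence always truthy) rather than the cell values; isin also matches only whole cell values, not substrings. Instead, use DataFrame.stack to reshape each DataFrame into a Series, then test it with Series.str.contains for the word w rather than a one-element list. Word boundaries (\b) are added so that only whole words match: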
w="World"
result=[x for x in dataframe_list if x.stack().str.contains(rf"\b{w}\b", case=False).any()]
print (result)
[ name att
0 A New World
1 A Hello
2 A Big Day now, name
0 Brave old World
1 A
2 A]
EDIT: For a list of words, join the patterns with |, the regex "or":
list1=["World",'Hello']
pat = '|'.join(rf"\b{x}\b" for x in list1)
result=[x for x in dataframe_list if x.stack().str.contains(pat, case=False).any()]
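To see what the joined pattern matches, here is a small standard-library sketch that applies the same kind of pattern with re.search to plain strings instead of DataFrame cells. The helper name contains_any_word is hypothetical, and re.escape is an added precaution (not part of the answer above) for words that contain regex metacharacters.

```python
import re

def contains_any_word(text, words):
    """Return True if any of `words` occurs in `text` as a whole word, ignoring case."""
    # Same construction as the answer: \bword\b alternatives joined with |.
    pat = '|'.join(rf"\b{re.escape(w)}\b" for w in words)
    return re.search(pat, text, flags=re.IGNORECASE) is not None

words = ["World", "Hello"]
print(contains_any_word("Brave old World", words))  # True
print(contains_any_word("hello there", words))      # True (case is ignored)
print(contains_any_word("Worldwide news", words))   # False ("World" is only part of a word)
print(contains_any_word("Big Day now", words))      # False
```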

Select rows with missing value in a Julia dataframe

I have just started exploring Julia and am struggling with subsetting dataframes. I would like to select rows where LABEL has the value "B" and VALUE is missing. Selecting rows with "B" works fine, but trying to add a filter for missing fails. Any suggestions on how to solve this? Tips for good documentation on subsetting/filtering dataframes in Julia are welcome. I haven't found a solution in the Julia documentation.
using DataFrames
df = DataFrame(ID = 1:5, LABEL = ["A", "A", "B", "B", "B"], VALUE = ["A1", "A2", "B1", "B2", missing])
df[df[:LABEL] .== "B", :] # works fine
df[df[:LABEL] .== "B" && df[:VALUE] .== missing, :] # fails
Use:
filter([:LABEL, :VALUE] => (l, v) -> l == "B" && ismissing(v), df)
(a very similar example is given in the documentation of the filter function).
If you want to use getindex then write:
df[(df.LABEL .== "B") .& ismissing.(df.VALUE), :]
The fact that you need to use .& instead of && when working with arrays is not DataFrames.jl specific - this is a common pattern in Julia in general when indexing arrays with booleans.

Map columns with list and return corresponding list

import pandas as pd
df = pd.DataFrame({"a":["a","b","c"],"d":[1,2,3]})
Given an array ["a","b","c","c"], I want to use it to map column "a" and get the output [1,2,3,3], which comes from column "d". Is there a short way to do this without iterating over the rows?
Convert column a to the index with DataFrame.set_index, then use DataFrame.reindex with the list a and select column d:
a = ["a","b","c","c"]
L = df.set_index('a').reindex(a)['d'].tolist()
print (L)
[1, 2, 3, 3]
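Conceptually, the reindex call is a label lookup: each value in a is looked up in column "a" and the matching value from column "d" is returned. A rough plain-Python sketch of that idea follows, with lists standing in for the two columns. The names col_a, col_d, and lookup are illustrative, and the sketch assumes the values in column "a" are unique.

```python
# Plain lists standing in for the DataFrame columns "a" and "d".
col_a = ["a", "b", "c"]
col_d = [1, 2, 3]

# Build a label -> value mapping (assumes labels in col_a are unique).
lookup = dict(zip(col_a, col_d))

a = ["a", "b", "c", "c"]
L = [lookup.get(key) for key in a]
print(L)  # [1, 2, 3, 3]
```

One difference to keep in mind: a label missing from the mapping gives None here, whereas reindex fills missing labels with NaN.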

How can one assign an item contextualized Array to a positional?

In Rakudo Perl 6 item or $ can be used to evaluate an expression in item context. See https://docs.perl6.org/routine/item
I am using a library that returns an item contextualized Array. What is the proper way to remove the contextualization so it may be assigned to a @ variable?
For example:
my @a = $[<a b c>];
dd @a; # Outputs: Array @a = [["a", "b", "c"],]
Perl being Perl, there's more than one way to do it, such as
dd my @ = @$[<a b c>]; # via original array, equivalent to .list
dd my @ = $[<a b c>][]; # via original array, using zen slicing
dd my @ = |$[<a b c>]; # via intermediate Slip, equivalent to .Slip
dd my @ = $[<a b c>].flat; # via intermediate flattening Seq
The clearest solution is probably enforcing list context via @ or .list, and I'd avoid the .flat call because it has slightly different semantic connotations.
Just as a reminder, note that list assignment is copying, but if you used one of the approaches that just pull out the original array from its scalar container, you could also use binding. However, in that case you wouldn't even need to manually decontainerize as
dd my @ := $[<a b c>];
also gets you back your array as something list-y.
Flatten it:
my @a = $[<a b c>].flat;
dd @a; # Array @a = ["a", "b", "c"]