How should I set variables and nested data in UML diagrams? - oop

I'm trying to build a UML diagram from code with a bunch of nested data structures, to better view the relationships and to figure out the steps needed to process this data.
I'm working on an object diagram. How should I put a variable in a diagram like this? Is there a better way to represent this "list inside a dict, inside a pd.DataFrame, inside a dict... and so on" code than with associations (compositions, aggregations...)?
Python example:
x = 3
y = 'str'
z = [x, y]
a = [{'b': ['d', 'f', {'j': 'j', 'k': 'k'}], 'c': [[1, 2, 3], [4, 5, 6]]}, z, y]
My first attempt would be the following:
(Note: I added some values inside parentheses, but I know this is not correct; what would be the right way?)
I would like to see all this data in one view and clearly find the steps to process it.

Your diagram is almost correct.
The composition relationship is not allowed between objects, only between classes. Objects have links. The value of an object can be specified as shown in the following diagram (incomplete).
See UML 2.5 specification, figure 9.28.

You can model the value of an object inside the Specification property of the InstanceSpecification.


How do I use ggspectra()?

I would like to use the package ggspectra, but I can't figure out how to use it with my data type. With the examples given with two_suns.spct it works, more or less, but when I want to use my own data, which is w.length ~ Intensity/count, I can't get any plot with it. What do I have to do (with my own data)?
df[1:10, ]
Intensity w.length
1 0.00021348 1.235582e-21
2 0.00026164 1.008143e-21
3 0.00030980 8.514191e-22
4 0.00035796 7.368669e-22
5 0.00040612 6.494837e-22
6 0.00045428 5.806284e-22
7 0.00050244 5.249731e-22
8 0.00055060 4.790541e-22
9 0.00059876 4.405220e-22
10 0.00064693 4.077270e-22
(...)
I'm trying it via:
library(readxl)
library(ggplot2)
library(photobiology)
library(photobiologyWavebands)
library(ggspectra)
# wavelength from energy: lambda = h * c / E
h = 6.62607015e-34  # Planck constant (J s)
c = 299792458       # speed of light (m/s)
df$w.length = (h * c) / df$Energy_MeV

ggplot(df, aes(x = Energy_MeV, y = Intensity)) +
  geom_line()
The code line
ggplot(df) + geom_line()
does not work at all, as I get a message that aes() is necessary.
'ggspectra' is designed to work with spectral data stored in classes defined in package 'photobiology', as you noticed. These classes are based on data frames but store additional metadata in attributes, and they have strict expectations about the units used to express the spectral data, the units used to express wavelength, and the names of the columns used to store them. This approach has pros and cons.

Once we have created an object of one of these classes and pass it as an argument to ggplot(), R dispatches a ggplot() method specific to these classes that "knows" how to set aes() automatically. There are also autoplot() methods that build a whole ggplot object. A big advantage of keeping the metadata in attributes of the object where the data is stored is that this ensures their availability not only when plotting but for any other computations, now and in the future, helping ensure reproducibility. This, of course, requires additional work up front, as we need to create an object belonging to a special class and store both data and metadata in it.
When designing these packages I did not expect they would be used for anything other than light and ultraviolet radiation, expressed either as energy in W m-2 or as photons in mol s-1 m-2, and wavelength in nm. Just for completeness, I mention that when dealing with these units a data frame can be converted with the conversion constructor as.source_spct() into a source_spct object, if the data are already expressed in the expected units and the column names follow the naming conventions. Alternatively, a source_spct object can be created with the source_spct() constructor by passing suitable vectors as arguments, similarly to how a data frame is created. Additional arguments can be passed to set the metadata attributes.
Neither of these constructors will work in this case, as the spectral data in the question is clearly expressed in some other units, or is even a different quantity.

How can I force two jupyter sliders to interact with one another (non-trivially)? Is "tag" available for handler?

I want to create two ipywidget sliders, say one with value x, the other with value 1-x. When I change one slider, the other should be updated automatically. I am trying to use observe for the callback. I see that I might use owner and description to identify which slider was modified, but I don't think description is supposed to be used for this purpose; after all, description doesn't need to be unique in the first place. I wonder if I am missing something here.
from ipywidgets import widgets

x = 0.5
a = widgets.FloatSlider(min=0, max=1, description='a', value=x)
b = widgets.FloatSlider(min=0, max=1, description='b', value=1 - x)
display(a, b)

def on_value_change(change):
    if str(change['owner']).split("'")[1] == 'a':
        exec('b.value=' + str(1 - change['new']))
    else:
        exec('a.value=' + str(1 - change['new']))

a.observe(on_value_change, names='value')
b.observe(on_value_change, names='value')
Beginner with widgets, but I ran into the same question earlier and couldn't find a solution. I pieced together several sources and came up with something that seems to work.
Here's a model example of two sliders maintaining proportionality according to '100 = a + b', with the two sliders representing a and b:
caption = widgets.Label(value='If 100 = a + b :')
a, b = widgets.FloatSlider(description='a (= 100-b)'), \
       widgets.FloatSlider(description='b (= 100-a)')

def my_func1(a):
    # b as a function of a
    return 100 - a

def my_func2(b):
    # a as a function of b
    return 100 - b

l1 = widgets.dlink((a, 'value'), (b, 'value'), transform=my_func1)
l2 = widgets.dlink((b, 'value'), (a, 'value'), transform=my_func2)
display(caption, a, b)
To explain, as best as I understand... the key was to set up a directional link going each direction between the two sliders, and to provide a transform function for the math each direction across the sliders.
i.e.,:
l1 = widgets.dlink((a, 'value'), (b, 'value'),transform=my_func1)
What that is saying is this: .dlink((a, 'value'), (b, 'value'), transform=my_func1) means "the value of a is an input used to determine the value of b (the output)", and "the function describing b, as a function of a, is my_func1".
With the links described, you just need to define the aforementioned functions.
The function pertaining to direct link l1 is:
def my_func1(a): # defining b as function of a
return (100 - a)
Likewise (but in reverse), l2 is the 'vice versa' to l1, and my_func2 the 'vice versa' to my_func1.
I found this to work better for learning purposes, compared to the fairly common approach of using a listener (a.observe or b.observe) to log details (e.g. values) about the states of the sliders into a dictionary-like parameter (change), which can then be passed into the transform functions and indexed for variable assignments.
Good luck, hope that helps! More info at https://ipywidgets.readthedocs.io/en/latest/examples/Widget%20Events.html#Linking-Widgets
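For completeness, the observe-based approach from the question can also work without parsing string representations or using exec: the change['owner'] entry is the widget object itself, so it can be compared by identity. A minimal sketch, reusing the slider names from the question (display() omitted so it also runs outside a notebook):

```python
from ipywidgets import widgets

x = 0.5
a = widgets.FloatSlider(min=0, max=1, description='a', value=x)
b = widgets.FloatSlider(min=0, max=1, description='b', value=1 - x)

def on_value_change(change):
    # change['owner'] is the widget that fired the event; compare it by
    # identity instead of parsing its repr, and assign the value directly
    other = b if change['owner'] is a else a
    other.value = 1 - change['new']

a.observe(on_value_change, names='value')
b.observe(on_value_change, names='value')
```

Because traitlets only fires an event when a value actually changes, the two callbacks do not trigger each other endlessly: the second slider's callback tries to write back a value the first slider already has, which is a no-op.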

Reading TTree Friend with uproot

Is there an equivalent of TTree::AddFriend() with uproot ?
I have 2 parallel trees in 2 different files which I'd need to read with uproot.iterate and using interpretations (setting the 'branches' option of uproot.iterate).
Maybe I can do that by manually obtaining several iterators from iterate() calls on the files, and then calling next() on each iterator... but maybe there's a simpler way akin to AddFriend?
Thanks for any hint!
edit: I'm not sure I've been clear, so here's a bit more detail. My question is not about the usage of arrays, but about how to read them from different files. Here's a mockup of what I'm doing:
# I will fill this array and give it as input to my DNN;
# it's very big, so I will fill it in place
bigarray = ndarray((2, numentries), ...)

# get a handle on a tree, just to be able to build interpretations:
t0 = .. first tree in input_files
interpretations = dict(
    a=t0['a'].interpretation.toarray(bigarray[0]),
    b=t0['b'].interpretation.toarray(bigarray[1]),
)

# iterate with:
uproot.iterate(input_files, treename,
               branches=interpretations)
So what if a and b belong to 2 trees in 2 different files ?
In array-based programming, friends are implicit: you can JOIN any two columns after the fact—you don't have to declare them as friends ahead of time.
In the simplest case, if your arrays a and b have the same length and the same order, you can just use them together, like a + b. It doesn't matter whether a and b came from the same file or not. Even if one of these is jagged (like jets.phi) and the other is not (like met.phi), you're still fine, because the non-jagged array will be broadcast to match the jagged one.
Note that awkward.Table and awkward.JaggedArray.zip can combine arrays into a single Table or jagged Table for bookkeeping.
If the two arrays are not in the same order, possibly because each writer was individually parallelized, then you'll need some column to act as the key associating rows of one array with rows of the other. This is a classic database-style JOIN, and although Uproot and Awkward don't provide routines for it, Pandas does. (Look up "merging, joining, and concatenating" in the Pandas documentation; there's a lot!) You can maintain an array's jaggedness in Pandas by preparing the column with the awkward.topandas function.
The following issue talks about a lot of these things, though the users in the issue below had to join sets of files, rather than just a single tree. (In principle, a process would have to look ahead to all the files to see which contain which keys: a distributed database problem.) Even if that's not your case, you might find more hints there to see how to get started.
https://github.com/scikit-hep/uproot/issues/314
This is how I have "friended" (befriended?) two TTree's in different files with uproot/awkward.
import awkward
import uproot

iterate1 = uproot.iterate(["file_with_a.root"])  # has branch "a"
iterate2 = uproot.iterate(["file_with_b.root"])  # has branch "b"

for array1, array2 in zip(iterate1, iterate2):
    # join the arrays field by field
    for field in array2.fields:
        array1 = awkward.with_field(array1, getattr(array2, field), where=field)
    # array1 now has branches "a" and "b"
    print(array1.a)
    print(array1.b)
Alternatively, if it is acceptable to "name" the trees,
import awkward
import uproot

iterate1 = uproot.iterate(["file_with_a.root"])  # has branch "a"
iterate2 = uproot.iterate(["file_with_b.root"])  # has branch "b"

for array1, array2 in zip(iterate1, iterate2):
    # join the arrays under named records
    zippedArray = awkward.zip({"tree1": array1, "tree2": array2})
    # zippedArray now has branches "tree1.a" and "tree2.b"
    print(zippedArray.tree1.a)
    print(zippedArray.tree2.b)
Of course you can use array1 and array2 together without merging them like this. But if you have already written code that expects only 1 Array this can be useful.

NHibernate - drilling down from the aggregrate root

Given an aggregate root X, which has many Y, and Y which has many Z...
How can I drill down through the associations and select only those X's whose Z's have a certain property value?
IList Xs = Session.CreateCriteria(typeof(X))
    .CreateAlias("Ys", "Y")
    .CreateAlias("Y.Zs", "Z")
    .Add(Expression.Eq("Z.Property", 1))
    .List();
Doing this results in a PropertyAccessException, and I have no idea why.
Loading all Xs and testing their Z properties would be massively redundant.
I have tried it out, and in my test setup it works flawlessly. A PropertyAccessException can be caused by an unavailable setter or a type mismatch when a property is set. If you post some mapping and entity source code, it might help.

Obtaining all possible states of an object for a NP-Complete(?) problem in Python

Not sure that the example (or the actual use case) qualifies as NP-complete, but I'm wondering about the most Pythonic way to do the below, assuming that this is the algorithm available.
Say you have :
class Person:
    def __init__(self):
        self.status = 'unknown'

    def set(self, value):
        if value:
            self.status = 'happy'
        else:
            self.status = 'sad'
... blah. Maybe it's got their names or where they live or whatever.
and some operation that requires a group of Persons. (The key value is here whether the Person is happy or sad.)
Hence, given PersonA, PersonB, PersonC, PersonD, I'd like to end up with a list of the 2**4 possible combinations of sad and happy Persons, i.e.
[
[ PersonA.set(true), PersonB.set(true), PersonC.set(true), PersonD.set(true)],
[ PersonA.set(true), PersonB.set(true), PersonC.set(true), PersonD.set(false)],
[ PersonA.set(true), PersonB.set(true), PersonC.set(false), PersonD.set(true)],
[ PersonA.set(true), PersonB.set(true), PersonC.set(false), PersonD.set(false)],
etc..
Is there a good Pythonic way of doing this? I was thinking about list comprehensions (and modifying the object so that you could call it and get returned two objects, true and false), but the comprehension formats I've seen would require me to know the number of Persons in advance. I'd like to do this independent of the number of persons.
EDIT : Assume that whatever that operation that I was going to run on this is part of a larger problem set - we need to test out all values of Person for a given set in order to solve our problem. (i.e. I know this doesn't look NP-complete right now =) )
any ideas?
Thanks!
I think this could do it:
l = list()
for i in xrange(2 ** n):
    # create the list of n people
    sublist = [None] * n
    for j in xrange(n):
        sublist[j] = Person()
        sublist[j].set(i & (1 << j))
    l.append(sublist)
Note that if you wrote Person so that its constructor accepted the value, or such that the set method returned the person itself (but that's a little weird in Python), you could use a list comprehension. With the constructor way:
l = [ [Person(i & (1 << j)) for j in xrange(n)] for i in xrange(2 ** n)]
The runtime of the solution is O(n 2**n) as you can tell by looking at the loops, but it's not really a "problem" (i.e. a question with a yes/no answer) so you can't really call it NP-complete. See What is an NP-complete in computer science? for more information on that front.
According to what you've stated in your problem, you're right -- you do need itertools.product, but not exactly the way you've stated.
import itertools

truth_values = itertools.product((True, False), repeat=4)
people = (person_a, person_b, person_c, person_d)
all_people_and_states = [
    [person.set(truth) for person, truth in zip(people, combination)]
    for combination in truth_values
]
That should be more along the lines of what you mentioned in your question.
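Putting the pieces together, here is a complete runnable sketch, using a Person variant whose constructor accepts the value (as suggested in the other answer), so the whole thing fits in a list comprehension:

```python
import itertools

class Person:
    def __init__(self, value=None):
        self.status = 'unknown'
        if value is not None:
            self.set(value)

    def set(self, value):
        self.status = 'happy' if value else 'sad'

n = 4
# one list of n fresh Persons per combination of happy/sad states,
# 2**n combinations in total
all_states = [
    [Person(v) for v in combo]
    for combo in itertools.product((True, False), repeat=n)
]
print(len(all_states))                    # 16
print([p.status for p in all_states[0]])  # ['happy', 'happy', 'happy', 'happy']
```

Nothing here depends on knowing n in advance; changing n changes both the length of each sublist and the number of combinations.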
You can use a Cartesian product to get all possible combinations of people and states. Requires Python 2.6+.
import itertools
people = [person_a,person_b,person_c]
states = [True,False]
all_people_and_states = itertools.product(people,states)
The variable all_people_and_states is an iterator of tuples (x, y), where x is a person and y is either True or False. It will yield all possible pairings of people and states.