How do I use ggspectra()? - ggplot2

I would like to use the package ggspectra but I can't figure out how to use it in means of data type(?). With the examples given with two_suns.spct it works, more or less, but when I want to use my own data which is w.length ~ Intensity/count, I can't get any plot with it. What do I have to do (with my own data)?
df[1:10, ]
Intensity w.length
1 0.00021348 1.235582e-21
2 0.00026164 1.008143e-21
3 0.00030980 8.514191e-22
4 0.00035796 7.368669e-22
5 0.00040612 6.494837e-22
6 0.00045428 5.806284e-22
7 0.00050244 5.249731e-22
8 0.00055060 4.790541e-22
9 0.00059876 4.405220e-22
10 0.00064693 4.077270e-22
(...)
I'm trying it via:
library(readxl)
library(ggplot2)
library(photobiology)
library(photobiologyWavebands)
library(ggspectra)
Lambda = h*c / E
h = 6.62607015e-34
c = 299792458
df$w.length = (h * c) / df$Energy_MeV
ggplot(df, aes(x = Energy_MeV, y = Intensity)) +
geom_line() +
The code line
ggplot(df) + geom_line()
does not work at all as I receive the information that aes() is necessary.

'ggspectra' is designed to work with spectral data stored in classes defined in package 'photobiology' as you noticed. These classes are based on data frame but store additional metadata in attributes and have strict expectations about the units used to express the spectral data, the units used to express wavelength and the names of the columns used to store them. This approach has pros and cons. Once we have created an object of one of these classes, when we pass it as argument to ggplot(), R dispatches a ggplot() method specific to these classes that "know" how to set aes() automatically. There are also autoplot() methods that build a whole ggplot object. A big advantage of keeping the metadata in attributes of the object were the data is stored is that this ensures their availability not only when plotting but for any other computations now and in the future, helping ensure reproducibility. This of course, requires additional work up front as we need to create an object belonging to a special class and store both data and metadata in it.
When designing these packages I did not expect they would be used for anything other than light and ultraviolet radiation, expressed either as energy in W m-2 or as photons in mol s-1 m-2, and wavelength in nm. Just for completness, I mention that when dealing with these units a data frame con be converted with the conversion constructor as.source_spct() into a source_spct object if the data are already expressed in the expected units and the column names follow the naming conventions. Alternatively, a source_spct object can be created with the source_spct() constructor by passing suitable vectors as arguments, similarly to how a data frame is created. Additional arguments can be passed to set the metadata attributes.
Neither of these constructors will work in this case as clearly the spectral data in the question is expressed is some other units or even is a different quantity.

Related

Stan variable declarations: difference in use between var_type var_name[length] and vector[length] var_name

I am new at Stan and I'm struggling to understand the difference in how different variable declaration styles are used. In particular, I am confused about when I should put square brackets after the variable type and when I should put them after the variable name. For example, given int<lower = 0> L; // length of my data, let's consider:
real N[L]; // my variable
versus
vector[L] N; // my variable
From what I understand, both declare a variable N as a vector of length L.
Is the only difference between the two that the first way specifies the variable type? Can they be used interchangeably? Should they belong do different parts of the Stan code (e.g., data vs parameters or model)?
Thanks for explaining!
real name[size] and vector[size] name can be used pretty interchangeably. They are stored differently internally, so you can get better performance with one or the other. Some operations might also be restricted to one and the other (e.g. vector multiplication) and the optimal order to loop over them changes. E.g. with a matrix vs. a 2-D array, it is more efficient to loop over rows first vs. columns first, but those will come up if you have a more specific example. The way to read this is:
real name[size];
means name is an array of type real, so a bunch of reals that are stored together.
vector[size] name;
means that name is a vector of size size, which is also a bunch of reals stored together. But the vector data type in STAN is based on the eigen c++ library (c++) and that allows for other operations.
You can also create arrays of vectors like this:
vector[N] name[K];
which is going to produce an array of K vectors of size N.
Bottom line: You can get any model running with using vector or real, but they're not necessarily equivalent in the computational efficiency.

Pyiron automatically assigns bonds in the input files for lammps for O and H whenever they exist in the structure

Whenever I purposefully put H atoms inside a structure (with Fe and O in the structure as host atoms) as interstitials, I expect that it will be defined as an isolated atom with no bonds between it and the surrounding host atoms. However, depending on the location of the H interstitial, pyiron sometimes defines a bond between the additional H and the original O for lammps calculation.
This can be useful for the automatic detection of bonds. However, how can one control this feature whenever not needed?
I also believe there is a bug with the .define_bonds() function.
The inputs for the functions are:
species="O"
element_list=["H"]
cutoff_list=[2.0]
max_bond_list=1
bond_type_list=1
While I believe the max_bond_list and bond_type_list are supposed to be lists, it only works error-free if both are defined as integers instead because of the way they are defined. However, defining them as integers then messes up running the jobs because they should be iterables.

Redeclaring two Medium packages in One system component

I am new to modelica, and i don't have this much experience in it, but i got the basics of course. I am trying to model a micrfluidic network. The network consists of two sources of water and oil, controlled by two valves. The flow of the two mediums interact at a Tjunction and then into a tank or chamber. I don't care about the fluid properties of the mixture because its not my purpose. My question is how do redeclare two medium packages (water and oil) in one system component such as the Tjunction or a tank in order to simulate the system. In my real model, the two mediums doesn't meet, becuase every medium passes through the channels at a different time.
I attached the model with this message. Here's the link.
https://www.dropbox.com/s/yq6lg9la8z211uc/twomediumsv2.zip?dl=0
Thanks for the help .
I don't think you can redeclare a medium during simulation. In your case (where you don't need the mixing of the two fluids) you could create a new medium, for instance called OilWaterMixture, extending from Modelica.Media.Interfaces.PartialMedium.
If you look into the code of PartialMedium you'll see that it contains a lot of partial ("empty") functions that you should fill in in your new medium model. For example, in OilWaterMixture you should extend the function specificEnthalpy_pTX to return the specific enthalpy of your water/oil mixture, for a certain water/oil mixture (given by the mass fraction vector X). This could be done by adding the following model to the OilWaterMixture package:
redeclare function extends specificEnthalpy_pTX "Return specific enthalpy"
Oil = Modelica.Media.Incompressible.Examples.Essotherm650;
Water = Modelica.Media.Water.StandardWater;
algorithm
h_oil := Oil.h_pT(p,T);
h_water := Water.specificEnthalpy_pT(p,T);
h := X[0]*h_oil + X[1]*h_water;
end specificEnthalpy_pTX;
The mass fraction vector X is defined in PartialMedium and in OilWaterMixture you must define that it has two elements.
Again, since you are not going to actually use the mixing properties but only mass fraction vectors {0,1} or {1,0} the simple linear mixing equation should be adequate.
When you use OilWaterMixture in the various components, the error log will tell you which medium functions they need. So you probably don't need to extend all the partial functions in PartialMedium.

Arrays with attributes in Julia

I am making my first steps in julia, and I would like to reproduce something I achieved with numpy.
I would like to write a new array-like type which is essentially an vector of elements of arbitrary type, and, to keep the example simple, an scalar attribute such as the sampling frequency fs.
I started with something like
type TimeSeries{T} <: DenseVector{T,}
data::Vector{T}
fs::Float64
end
Ideally, I would like:
1) all methods that take a Vector{T} as argument to take on TimeSeries{T}.
e.g.:
ts = TimeSeries([1,2,3,1,543,1,24,5], 12.01)
median(ts)
2) that indexing a TimeSeries always returns a TimeSeries:
ts[1:3]
3) built-in functions that return a Vector to return a TimeSeries:
ts * 2
ts + [1,2,3,1,543,1,24,5]
I have started by implementing size, getindex and so on, but I definitely do not see how it could be possible to match points 2 and 3.
numpy has a quite comprehensive way to doing this: http://docs.scipy.org/doc/numpy/user/basics.subclassing.html. R also seems to allow linking attributes attr()<- to arrays.
Do you have any idea about the best strategy to implement this sort of "array with attributes".
Maybe I'm not understanding, why is for say point 3 it not sufficient to do
(*)(ts::TimeSeries, n) = TimeSeries(ts.data*n, ts.fs)
(+)(ts::TimeSeries, n) = TimeSeries(ts.data+n, ts.fs)
As for point 2
Base.getindex(ts::TimeSeries, r::Range) = TimeSeries(ts.data[r], ts.fs)
Or are you asking for some easier way where you delegate all these operations to the internal vector? You can clever things like
for op in (:(+), :(*))
#eval $(op)(ts::TimeSeries, x) = TimeSeries($(op)(ts.data,x), ts.fs)
end

Can I run a GA to optimize wavelet transform?

I am running a wavelet transform (cmor) to estimate damping and frequencies that exists in a signal.cmor has 2 parameters that I can change them to get more accurate results. center frequency(Fc) and bandwidth frequency(Fb). If I construct a signal with few freqs and damping then I can measure the error of my estimation(fig 2). but in actual case I have a signal and I don't know its freqs and dampings so I can't measure the error.so a friend in here suggested me to reconstruct the signal and find error by measuring the difference between the original and reconstructed signal e(t)=|x(t)−x^(t)|.
so my question is:
Does anyone know a better function to find the error between reconstructed and original signal,rather than e(t)=|x(t)−x^(t)|.
can I use GA to search for Fb and Fc? or do you know a better search method?
Hope this picture shows what I mean, the actual case is last one. others are for explanations
Thanks in advance
You say you don't know the error until after running the wavelet transform, but that's fine. You just run a wavelet transform for every individual the GA produces. Those individuals with lower errors are considered fitter and survive with greater probability. This may be very slow, but conceptually at least, that's the idea.
Let's define a Chromosome datatype containing an encoded pair of values, one for the frequency and another for the damping parameter. Don't worry too much about how their encoded for now, just assume it's an array of two doubles if you like. All that's important is that you have a way to get the values out of the chromosome. For now, I'll just refer to them by name, but you could represent them in binary, as an array of doubles, etc. The other member of the Chromosome type is a double storing its fitness.
We can obviously generate random frequency and damping values, so let's create say 100 random Chromosomes. We don't know how to set their fitness yet, but that's fine. Just set it to zero at first. To set the real fitness value, we're going to have to run the wavelet transform once for each of our 100 parameter settings.
for Chromosome chr in population
chr.fitness = run_wavelet_transform(chr.frequency, chr.damping)
end
Now we have 100 possible wavelet transforms, each with a computed error, stored in our set called population. What's left is to select fitter members of the population, breed them, and allow the fitter members of the population and offspring to survive into the next generation.
while not done
offspring = new_population()
while count(offspring) < N
parent1, parent2 = select_parents(population)
child1, child2 = do_crossover(parent1, parent2)
mutate(child1)
mutate(child2)
child1.fitness = run_wavelet_transform(child1.frequency, child1.damping)
child2.fitness = run_wavelet_transform(child2.frequency, child2.damping)
offspring.add(child1)
offspring.add(child2)
end while
population = merge(population, offspring)
end while
There are a bunch of different ways to do the individual steps like select_parents, do_crossover, mutate, and merge here, but the basic structure of the GA stays pretty much the same. You just have to run a brand new wavelet decomposition for every new offspring.