Is it possible to save a dictionary on the project level in pyiron? - pyiron

I have a large number of MD simulations, which are merely there for me to calculate the thermal expansion of a certain material. Since I'm not really interested in the details of these simulations, which nevertheless take up quite some space, I would like to erase them afterwards, while storing the thermal expansion coefficient and its error on the project level. So the full workflow would look like this:
Create a large number of MD simulations
Calculate the thermal expansion coefficient and its error from the MD simulations and store them in a dictionary
Store this dictionary in an hdf5 file on the project level
Erase all MD simulations

You can create a pyiron table to collect the thermal expansion and then keep only the pyiron table object rather than all the calculations. Here is an example of how to store bulk properties calculated from an energy-volume curve for each potential in a pyiron table:
https://github.com/pyiron/pyiron/blob/master/notebooks/data_mining.ipynb
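A rough sketch of that workflow, assuming the pyiron table API used in the linked notebook (the job filter, the extracted fields, and the clean-up loop are illustrative, not the only way to do it):

from pyiron import Project

pr = Project("thermal_expansion")
# ... create and run the MD jobs here (omitted) ...

# Collect only the quantities of interest into a pyiron table job.
table = pr.create_table("thermal_expansion_table")
table.filter_function = lambda job: job.status == "finished"
table.add["temperature"] = lambda job: job["output/generic/temperature"][-1]
table.add["volume"] = lambda job: job["output/generic/volume"][-1]
table.run()

# Fit volume vs. temperature from this dataframe to get the expansion
# coefficient and its error.
df = table.get_dataframe()

# Keep the table job, remove the raw MD jobs.
for name in pr.job_table().job:
    if name != "thermal_expansion_table":
        pr.remove_job(name)

The table job stays in the project, so the collected data survive the deletion of the MD jobs.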

Related

Estimating short- and long-run elasticities for dynamic panel with short T and large N

I want to estimate short- and long-run price elasticities for energy demand using a dynamic panel regression. My data consists of a large N (>1000) and a small T (12). I started with an ARDL representation as follows:
$EC_{it} = c + \sum_{j=1}^{p} \phi_j \, EC_{i,t-j} + \sum_{j=0}^{q} \theta_j \, X_{i,t-j} + \epsilon_{it}$
To estimate the parameters I would use the ARDL-PMG estimator; however, the literature tells me these estimates are biased for small T. For small-T dynamic panel models the Arellano-Bond estimator is proposed. However, is it possible to estimate short-run elasticities using this estimator? Furthermore, I cannot find how this estimator deals with I(0)/I(1) variables (which is clear for the ARDL specification).
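To be explicit about what I am after (assuming the usual log-log specification): the short-run elasticity is the impact coefficient $\theta_0$, and the long-run elasticity is $\sum_{j=0}^{q} \theta_j \big/ \bigl(1 - \sum_{j=1}^{p} \phi_j\bigr)$.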
Thanks in advance
Hein

Feature Selection for Text Classification with Information Gain in R

I'm trying to prepare my dataset, ideally for binary document classification with an SVM algorithm, in R.
The dataset is a combination of 150171 labelled variables and 2099 observations stored in a dataframe. The variables are a combination of uni- and bigrams retrieved from a text dataset.
When I try to calculate the information gain as a feature selection method, the error "cannot allocate vector of size X Gb" occurs, although I have already extended my memory and I am running on a 64-bit operating system. I tried the following package:
install.packages("FSelector")
library(FSelector)
# information gain of every feature with respect to the Usefulness label
value <- information.gain(Usefulness ~ ., dat_SentimentAnalysis)
Does anybody know a solution/any trick for this problem?
Thank you very much in advance!
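In case it helps to see what I would accept as a workaround, here is a sketch of the lower-memory route I am considering, in Python/scikit-learn rather than R, since mutual information on a sparse document-term matrix avoids the dense allocation (the toy texts and labels are placeholders for my real data):

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import mutual_info_classif

texts = ["good product works well", "bad service very bad",
         "works great", "terrible support"]    # placeholders for the 2099 documents
labels = [1, 0, 1, 0]                          # placeholder Usefulness labels

vectorizer = CountVectorizer(ngram_range=(1, 2))   # uni- and bigrams
X = vectorizer.fit_transform(texts)                # sparse matrix, never densified
ig = mutual_info_classif(X, labels, discrete_features=True)

top = np.argsort(ig)[::-1][:5000]   # keep the highest-scoring features
X_reduced = X[:, top]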

VTK / ITK Dice Similarity Coefficient on Meshes

I am new to VTK and am trying to compute the Dice Similarity Coefficient (DSC), starting from 2 meshes.
DSC can be computed as 2 Vab / (Va + Vb), where Vab is the overlapping volume of mesh A and mesh B.
To read a mesh (i.e. an organ contour exported in .vtk format using 3D Slicer, https://www.slicer.org) I use the following snippet:
#include <vtkSmartPointer.h>
#include <vtkGenericDataObjectReader.h>
#include <vtkPolyData.h>
#include <string>

// Read all data from the file
std::string inputFilename1 = "organ1.vtk";
vtkSmartPointer<vtkGenericDataObjectReader> reader1 =
    vtkSmartPointer<vtkGenericDataObjectReader>::New();
reader1->SetFileName(inputFilename1.c_str());
reader1->Update();
vtkSmartPointer<vtkPolyData> struct1 = reader1->GetPolyDataOutput();
I can compute the volume of the two meshes using vtkMassProperties (although I observed some differences between the volumes computed with VTK and the ones computed with 3D Slicer).
To then intersect the 2 meshes, I am trying to use vtkIntersectionPolyDataFilter. The output of this filter, however, is a set of lines that marks the intersection of the input vtkPolyData objects, and NOT a closed surface. I therefore need to somehow generate a mesh from these lines and compute its volume.
Do you know a good, accurate way to generate such a mesh, and how to do it?
Alternatively, I tried to use ITK as well. I found a package that is supposed to handle this problem (http://www.insight-journal.org/browse/publication/762, dated 2010) but I am not able to compile it against the latest version of ITK. It says that ITK must be compiled with the (now deprecated) ITK_USE_REVIEW flag ON. Needless to say, I compiled it with the new Module_ITKReview set to ON, and also with backward compatibility, but had no luck.
Finally, if you have any other alternative (scriptable) software/library to solve this problem, please let me know. I need to perform these computations automatically.
You could try vtkBooleanOperationPolyDataFilter
http://www.vtk.org/doc/nightly/html/classvtkBooleanOperationPolyDataFilter.html
filter->SetOperationToIntersection();
If your data is smooth and well-behaved, this filter works pretty well. However, sharp structures, e.g. the ones originating from a binary-image marching cubes algorithm, can cause problems for it. That said, vtkPolyDataToImageStencil doesn't necessarily perform any better in this regard.
My impression has been that boolean operations on polygons are not really ideal for "organs" of 100k polygons and more. It depends.
If you want to compute a Dice Similarity Coefficient, I suggest you first generate volumes (rasterize) from the meshes using vtkPolyDataToImageStencil.
Then it's easy to compute the DSC.
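For instance, here is a rough sketch in Python (VTK's Python bindings, so it stays scriptable). It assumes mesh_a and mesh_b are closed vtkPolyData surfaces you already loaded, and common_bounds is the union of their bounding boxes; the grid spacing is a placeholder that sets the accuracy/speed trade-off:

import numpy as np
import vtk
from vtk.util import numpy_support

def rasterize(polydata, bounds, spacing=(0.5, 0.5, 0.5)):
    # Blank image covering `bounds`, filled with 1.
    dims = [int((bounds[2 * i + 1] - bounds[2 * i]) / spacing[i]) + 1 for i in range(3)]
    image = vtk.vtkImageData()
    image.SetSpacing(spacing)
    image.SetOrigin(bounds[0], bounds[2], bounds[4])
    image.SetDimensions(dims)
    image.AllocateScalars(vtk.VTK_UNSIGNED_CHAR, 1)
    image.GetPointData().GetScalars().FillComponent(0, 1)

    # Stencil of the closed surface on that grid.
    stencil = vtk.vtkPolyDataToImageStencil()
    stencil.SetInputData(polydata)
    stencil.SetOutputOrigin(image.GetOrigin())
    stencil.SetOutputSpacing(image.GetSpacing())
    stencil.SetOutputWholeExtent(image.GetExtent())
    stencil.Update()

    # Zero out everything outside the mesh.
    clip = vtk.vtkImageStencil()
    clip.SetInputData(image)
    clip.SetStencilData(stencil.GetOutput())
    clip.ReverseStencilOff()
    clip.SetBackgroundValue(0)
    clip.Update()

    scalars = clip.GetOutput().GetPointData().GetScalars()
    return numpy_support.vtk_to_numpy(scalars).astype(bool)

a = rasterize(mesh_a, common_bounds)
b = rasterize(mesh_b, common_bounds)
dsc = 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())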
Good luck :)

Labview optimization VIs

I am trying to minimize a specific spectral coefficient with respect to a set of parameters involved in my array, using the Global Optimization VI, and the process gets stuck. Maybe I am using the wrong VI, I don't know. Here are screenshots of my code:
and the sub-VI that's referenced:
Basically it averages an array (whose values are a linear function of three parameters) over one dimension, then gets a certain coefficient of its power spectrum; after that, the main VI tries to minimize that coefficient with respect to the three aforementioned parameters. Any ideas?
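For reference, this Python sketch shows the structure of the computation (not LabVIEW; BASIS is a random stand-in for the real array, whose entries are linear in the three parameters, and the spectral index 5 is arbitrary):

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
BASIS = rng.standard_normal((3, 64, 256))  # stand-in for the real data

def build_array(a, b, c):
    # placeholder: any array whose entries are linear in the three parameters
    return a * BASIS[0] + b * BASIS[1] + c * BASIS[2]

def objective(params):
    arr = build_array(*params)
    profile = arr.mean(axis=0)                  # average over one dimension
    power = np.abs(np.fft.rfft(profile)) ** 2   # power spectrum of the average
    return power[5]                             # the coefficient to minimize

res = minimize(objective, x0=[1.0, 1.0, 1.0], method="Nelder-Mead")
print(res.x, res.fun)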

How to use GPS almanac information?

I need to use the data from this site: http://www.navcen.uscg.gov/?Do=gpsArchives&path=2012
to develop a small piece of software that plots a chart of satellite availability, something like this: http://i.stack.imgur.com/X0iGL.jpg
The user must set a day, a latitude/longitude position and a time zone; then my application must plot the satellite availability for 7 days (starting from the user's day) so they can choose the best day.
I'm not a GPS expert, so I don't know which data from the almanac to use, and how, to make the plot.
Any ideas?
If you're good at Matlab you could try using this
This program calculates the visible GPS satellites, using terrain data for high-accuracy prediction.
Inputs:
Coordinates of the station on Earth.
GPS almanac file.
Terrain data in "txt" format (just for DSM calculation).
Or you could go through the gpstk code and refer to how ComputeStationSatelliteVisibility is implemented.
The almanac contains the ellipse parameters that describe the orbit of each satellite around the Earth. Using these parameters you can determine where the satellites are positioned at a specific time, as seen from a specific position (a sketch of this computation follows after the links below).
Assume visibility over 170° of the sky: the 5° closest to the horizon on each side are hidden by houses or mountains.
Refer to :
http://www.navcen.uscg.gov/?pageName=gpsAlmanacs
and
http://www.navcen.uscg.gov/pdf/gps/Programmatically%20Accessing.pdf
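A condensed Python sketch of the propagation described above (alm stands for one parsed YUMA almanac record, with the usual field names; illustrative only, not production code):

import numpy as np

MU = 3.986005e14           # WGS-84 gravitational parameter [m^3/s^2]
OMEGA_E = 7.2921151467e-5  # Earth rotation rate [rad/s]

def sat_ecef(alm, t):
    # ECEF position of one satellite at GPS time-of-week t [s].
    a = alm["sqrt_a"] ** 2
    n = np.sqrt(MU / a ** 3)                 # mean motion
    M = alm["M0"] + n * (t - alm["toa"])     # mean anomaly
    E = M
    for _ in range(10):                      # Kepler's equation, fixed-point iteration
        E = M + alm["e"] * np.sin(E)
    nu = np.arctan2(np.sqrt(1 - alm["e"] ** 2) * np.sin(E), np.cos(E) - alm["e"])
    r = a * (1 - alm["e"] * np.cos(E))       # orbital radius
    u = nu + alm["omega"]                    # argument of latitude
    # longitude of the ascending node, corrected for Earth rotation
    lan = alm["Omega0"] + (alm["Omega_dot"] - OMEGA_E) * (t - alm["toa"]) - OMEGA_E * alm["toa"]
    x, y = r * np.cos(u), r * np.sin(u)      # in-plane coordinates
    i = alm["i"]
    return np.array([
        x * np.cos(lan) - y * np.cos(i) * np.sin(lan),
        x * np.sin(lan) + y * np.cos(i) * np.cos(lan),
        y * np.sin(i),
    ])

def elevation_deg(sat, station):
    # Elevation of the satellite above the station's horizon
    # (spherical-Earth "up"; use the geodetic normal for more accuracy).
    los = sat - station
    up = station / np.linalg.norm(station)
    return np.degrees(np.arcsin(np.dot(los / np.linalg.norm(los), up)))

A satellite counts as available at time t when elevation_deg(sat_ecef(alm, t), station) exceeds the 5° mask; sweeping t over 7 days gives the availability chart.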