Is there a reliable way to get __name__ in Manim when using multiprocessing - python-multiprocessing

I am working on a video that has a simulation of an ideal gas (points) or a realistic gas (circles) running in the background for parts of it. To make processing fast enough with a useful number of "molecules" for the realistic gas, I've been using the multiprocessing Python package. In plain Python this requires a check for __name__ == '__main__' to avoid bad multiprocessing behavior. In Manim 0.16 versions I was able to check for __name__ == 'py' instead, but this broke in the upgrade to the 0.17 versions. I eventually fixed it by finding out what __name__ is now: I had the main process run print("__name__ is now ", __name__), which now outputs "path.to.working.dir.main".
Is there an easy way to find out what to use from within the program without having to check and fix with every update?
In manim 0.16 versions, checking for
__name__ = "py"
was good enough. This changed in manim 0.17 versions and it took me a while to figure out why my animation was crashing. I finally resolved it by inserting a
print("__name__ is now ", __name__)
near the top of the code to find out what to use now. However, this seems like a fragile approach in the long run (will I have to go through this again in version 0.18?) and I'd like to find a better, more reliable, more automated way.
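One approach that might generalize (this is an assumption on my part; nothing here confirms it works under Manim's module loader) is to stop keying off __name__ entirely and instead ask the multiprocessing module whether the current process is the parent, which does not change when Manim imports the scene file under a different module name. A minimal sketch, with simulate_chunk standing in for the real per-molecule work:

import multiprocessing as mp

def simulate_chunk(seed):
    # Placeholder for the heavy gas-molecule computation.
    return seed * seed

def spawn_workers():
    with mp.Pool() as pool:
        return pool.map(simulate_chunk, range(8))

# Guard on process identity instead of comparing __name__ to a string that
# changes between Manim versions ("py" in 0.16, "path.to.working.dir.main" in 0.17).
# mp.parent_process() is None only in the parent process (Python 3.8+);
# mp.current_process().name == "MainProcess" is an older equivalent.
if mp.parent_process() is None:
    results = spawn_workers()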

Related

GUROBI only uses single core to setup problem with cvxpy (python)

I have a large MILP that I build with cvxpy and want to solve with GUROBI. When I use the solve() function of cvxpy, it takes a really, really long time to set up and does not start solving for hours. While doing that, only one core of my cluster is being used, and that core is at 100%. I would like to use multiple cores to build the model so that building it does not take so long. Running grbprobe also shows that Gurobi knows about the other cores, and for solving the problem it does use multiple cores.
I have tried running with different flags, i.e. turning presolve off and on, or setting the number of Threads to be used (this did not seem to make a difference, even for the solving).
I have also reduced the number of constraints in the problem, and it starts solving much faster, which means that this is definitely not a problem with the model itself.
The problem in its normal state has about 2200 constraints; when I reduced it to 150, it only took a couple of seconds until it started to search for a solution.
The problem is that I don't see anything, since it takes so long to get to the "set username parameters" flag, and I don't get any information on what the computer does in the meantime.
Is there a way to tell GUROBI or CVXPY that it can take more cpus for the build-up?
Is there another way to solve this problem?
Sorry. The first part of the solve (cvxpy model generation, setup, presolving, scaling, solving the root, preprocessing) is almost completely serial. The parallel part is when it really starts working on the branch-and-bound tree. For many problems, the parallel part is by far the most expensive, but not for all.
This is not only the case for Gurobi. Other high-end solvers have the same behavior.
There are options to do less presolving and preprocessing. That may get you into the branch-and-bound phase earlier. However, it is usually better not to touch these options.
Running things with verbose=True may give you more information. If you have more detailed questions, you may want to share the log.
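To make the verbose suggestion concrete, here is a minimal sketch of passing verbose=True and Gurobi parameters through cvxpy; the toy problem and the particular parameter values (Presolve, Threads) are illustrative assumptions, not taken from the question, and they mainly affect the solve rather than the model build-up:

import cvxpy as cp
import numpy as np

# Tiny toy MILP; stands in for the real 2200-constraint model.
x = cp.Variable(10, boolean=True)
c = np.arange(10)
prob = cp.Problem(cp.Maximize(c @ x), [cp.sum(x) <= 3])

# verbose=True prints the Gurobi log; extra keyword arguments are forwarded
# to Gurobi as solver parameters (e.g. Presolve, Threads).
prob.solve(solver=cp.GUROBI, verbose=True, Presolve=1, Threads=4)
print(prob.value)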

Tensorflow: The graph couldn't be sorted in topological order only when running in terminal

I encountered a problem while running a Python script on a Google Cloud compute instance, with Python 3.6 and TensorFlow 1.13.1. I have seen several people report similar problems with loops in the computational graph on Stack Overflow, but none of them really found the culprit. I observed something interesting, so maybe someone experienced can figure it out.
The error message is like this:
2019-05-28 22:28:57.747339: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-05-28 22:28:57.754195: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:704] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
My script for train.py will look like this:
import A, B, C
...

def main():
    ....

if __name__ == '__main__':
    main()
So I will show my two ways to run this script:
VERSION1:
In terminal,
python3 train.py
This gives me the error I stated above. When I only used the CPU, I noticed it throws something like failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected. So I added a GPU to my instance, but the loop in the computational graph is still there.
VERSION 2 (this is where the weird thing happens):
I simply copy the code in main, with nothing changed, to a Jupyter notebook and run it there. Then suddenly, no error occurs anymore.
I don't really know what's going on under the hood. I just notice that the messages at the beginning of the run are not the same between the two ways of running the code.
If you encounter the same problem, copying it to a Jupyter notebook might help directly. I would really like to share more info if someone has any ideas about what might possibly cause this. Thank you!
Well, it turns out that, no matter what, I chose a wrong way to build the graph at the beginning, one that I did not expect to create a loop. The loop error gave me the hint that I was doing something wrong. The interesting question I mentioned above is still not answered, though! However, I'd like to share my mistake so that anyone who sees the loop error can check whether they're doing the same thing as me.
In the input_fn, I used tensor.eval() to get the corresponding numpy.array in the middle of the pipeline, in order to interact with data outside of that function. I chose not to use tf.data.Dataset because the whole process is complicated and I couldn't compress the whole thing into a Dataset directly. But it turns out this approach sabotages the static computational-graph design of TensorFlow, so during training it trains on the same batch again and again. So my two cents of advice: if you want to do something super complex in your input_fn, you will likely be better off (and may only get correct behavior) by using the old-fashioned modelling approach, tf.placeholder.
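For completeness, here is a minimal TF 1.x sketch of the tf.placeholder / feed_dict pattern referred to above; the model, shapes, and the next_numpy_batch helper are made up for illustration and are not from the original code:

import numpy as np
import tensorflow as tf  # assumes TF 1.x, e.g. 1.13

x = tf.placeholder(tf.float32, shape=[None, 784], name="x")
y = tf.placeholder(tf.int64, shape=[None], name="y")

logits = tf.layers.dense(x, 10)  # stand-in for the real model
loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=logits)
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

def next_numpy_batch(batch_size=32):
    # Hypothetical data helper: arbitrary Python/NumPy preprocessing can happen
    # here, outside the graph, so there is no need for tensor.eval() in an input_fn.
    return np.random.rand(batch_size, 784), np.random.randint(0, 10, batch_size)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(100):
        batch_x, batch_y = next_numpy_batch()
        sess.run(train_op, feed_dict={x: batch_x, y: batch_y})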

Why is karate-gatling slow compared to JMeter

I have followed the example at karate-gatling-demo for creating a load test. For my use case, I converted a JMeter test to Karate. After making sure everything works, I compared the two. In the time it took karate-gatling to reach even 300 requests, JMeter had already made a few thousand. I thought it might have been the pause in the demo, but even after I removed it, the speed of the tests makes them unusable. I would really like to implement this, as we are already making strides toward using normal Karate tests as part of our CI process. Is there a reason they are so slow?
(I am using karate-gatling version 0.8.0.RC4)
To provide some info related to the two testing situations...
JMeter: 50 threads/users with 30 second ramp up and 50 loops
Karate-Gatling: repeat scenario 50 times, ramp to 50 users over 30 seconds
That is because this is still in the early stages of development, and this feedback helps. If possible, can you try 0.8.0.RC3 and see if that makes a difference? The test syntax needs a slight change, which you should be able to figure out from the version history. There was a fundamental change in the async model which probably has some issues.
Ideally I would love someone who knows Gatling internals to help, but this will take a little time to evolve as I look into it.
EDIT: Gatling support was released in 0.8.0 (final) and multiple teams have reported that it is working well for them.

How to speed up matrix functions such as expm function in scipy/numpy?

I'm using scipy and numpy to calculate the exponential of a 6*6 matrix many times.
Compared to Matlab, it's about 10 times slower.
The function I'm using is scipy.linalg.expm; I have also tried the deprecated methods scipy.linalg.expm2 and scipy.linalg.expm3, and those are only about two times faster than expm. My questions are:
1. What's wrong with expm2 and expm3, given that they are faster than expm?
2. I'm using the wheel package from http://www.lfd.uci.edu/~gohlke/pythonlibs/, and I found https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows. Is the wheel package compiled with MKL? If not, can I optimize numpy and scipy by compiling them myself with MKL?
3. Are there any other ways to optimize the performance?
Well, I think I have found the answers to questions 1 and 2 myself:
1. It seems expm2 and expm3 return an array rather than a matrix, but they are about 2 times faster than expm.
2. After a whole day trying to compile scipy with MKL, I succeeded. It's really hard to build scipy, especially on Windows with x64 and Python 3, and it turned out to be a waste of time: it's not even a bit faster than the whl package from http://www.lfd.uci.edu/~gohlke/pythonlibs/ .
I'm hoping someone can give an answer to question 3.
Your matrix is relatively small, so maybe the numerical part is not the bottleneck. You should use a profiler to make sure that the limitation is in the exponentiation.
You can also take a look at the source code of these implementations and write an equivalent function with less conditionals and checking.
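As a concrete version of the profiling suggestion (the random 6x6 matrix and the loop count below are arbitrary stand-ins for the real workload):

import cProfile
import numpy as np
from scipy.linalg import expm

A = np.random.rand(6, 6)

def run(n=10000):
    # Repeated 6x6 matrix exponentials, as in the question.
    for _ in range(n):
        expm(A)

# Shows how much time is spent inside expm versus Python-level overhead.
cProfile.run("run()", sort="cumulative")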

Speeding up Binary Integer programming model

Can anyone give me some tips to make a binary integer programming model faster?
I currently have a model that runs well with a very small number of variables, but as soon as I increase the number of variables, SCIP keeps running without giving me an optimal solution. I'm currently using SCIP with SoPlex to find an optimal solution.
You should have a look at the statistics (type display statistics in the interactive shell). Watch out for time-consuming heuristics that don't find a solution and try disabling them. You should also play around with the parameters to find settings better suited to your instances (a different branching rule or node selection, for example). Without further information, though, we won't be able to help you.
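If you drive SCIP from Python via PySCIPOpt (an assumption; the question doesn't say which interface is used), the suggestions above translate roughly into the following sketch:

from pyscipopt import Model, SCIP_PARAMSETTING

model = Model("binary_ip")
# ... add your binary variables and constraints here ...

model.setHeuristics(SCIP_PARAMSETTING.OFF)  # try OFF / FAST / AGGRESSIVE emphasis
model.setPresolve(SCIP_PARAMSETTING.FAST)   # experiment with presolve emphasis too
model.optimize()
model.printStatistics()                     # same output as "display statistics" in the shell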