I am doing some hardware work and I do not know the difference between a 4-way processor and an 8-way processor. Can anyone please clarify what "way" means?
Thanks a lot.
Number of CPU cores.
Not quite! "Way" counts logical processors, not physical cores. With Hyper-Threading (two hardware threads per core), an 8-way machine can actually be a quad-core, and a 4-way machine a dual-core.
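A quick way to see this on your own machine: Java's Runtime.availableProcessors() reports logical processors, so a quad-core with Hyper-Threading typically reports 8. A minimal sketch (illustrative only, not a formal definition of "way"):

```java
public class CpuCount {
    public static void main(String[] args) {
        // Reports logical processors: on a quad-core CPU with two hardware
        // threads per core (Hyper-Threading), this typically prints 8.
        System.out.println(Runtime.getRuntime().availableProcessors());
    }
}
```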
I successfully implemented a solver that fits my needs. However, I need to run the solver on 1500+ different "problems" precisely at 0:00, every day. Because my web app is in Ruby, I built a Quarkus "micro-service" that takes the data, calculates a solution, and returns it to my main app.
In my application.properties, I set:
quarkus.optaplanner.solver.termination.spent-limit=5s
which means each request takes ~5s to solve. But sending 1500 requests at once will saturate the CPU on my machine.
Is there a way to tell OptaPlanner to stop when the solution is good enough (for example, if the score is stable)? That way I could maybe reduce the time from 5s to 1-2s, depending on the problem.
What are your recommendations for my specific scenario?
The SolverManager will automatically queue solver jobs if too many come in, based on its parallelSolverCount configuration:
quarkus.optaplanner.solver-manager.parallel-solver-count=3
In this case, it will run 3 solvers in parallel. So if 7 datasets come in, it solves 3 of them right away and the other 4 later, as the earlier solvers terminate. However, if you use moveThreadCount=2, each solver uses at least 2 CPU cores, so you're occupying at least 6 CPU cores.
By default, parallelSolverCount is currently set to half your CPU cores (it currently ignores moveThreadCount). In containers, it's important to use JDK 11+: the CPU count visible inside a container often differs from that of the bare-metal machine.
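For illustration, submitting many problems to an injected SolverManager looks roughly like this; jobs beyond parallelSolverCount simply wait in the queue. A minimal sketch, where Solution_ stands in for your own @PlanningSolution class:

```java
import org.optaplanner.core.api.solver.SolverJob;
import org.optaplanner.core.api.solver.SolverManager;

// Solution_ is a placeholder for your @PlanningSolution class.
public class NightlyBatch<Solution_> {

    private final SolverManager<Solution_, Long> solverManager; // injected by Quarkus

    public NightlyBatch(SolverManager<Solution_, Long> solverManager) {
        this.solverManager = solverManager;
    }

    public Solution_ solveOne(Long problemId, Solution_ problem) throws Exception {
        // solve() returns immediately; the job actually runs once one of the
        // parallelSolverCount solver slots frees up.
        SolverJob<Solution_, Long> job = solverManager.solve(problemId, problem);
        // Blocks until this job's solver terminates (e.g. hits spent-limit).
        return job.getFinalBestSolution();
    }
}
```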
You can indeed tell the OptaPlanner Solvers to stop when the solution is good enough, for example when a certain score is attained or the score hasn't improved for an amount of time, or combinations thereof. See these OptaPlanner docs. Quarkus already exposes some of these (the rest currently still need a solverConfig.xml file); some Quarkus examples:
quarkus.optaplanner.solver.termination.spent-limit=5s
quarkus.optaplanner.solver.termination.unimproved-spent-limit=2s
quarkus.optaplanner.solver.termination.best-score-limit=0hard/-1000soft
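For terminations Quarkus doesn't expose, a solverConfig.xml along these lines combines conditions; a sketch with element names as in the OptaPlanner docs (treat the exact limit values as placeholders for your own tuning):

```xml
<solver>
  <termination>
    <!-- Stop as soon as either condition is met (OR is the default). -->
    <terminationCompositionStyle>OR</terminationCompositionStyle>
    <bestScoreLimit>0hard/-1000soft</bestScoreLimit>
    <unimprovedSecondsSpentLimit>2</unimprovedSecondsSpentLimit>
    <secondsSpentLimit>5</secondsSpentLimit>
  </termination>
</solver>
```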
I've just read something about how CPU cores interact with each other. I may be wrong on some points, so don't hesitate to correct me.
So a CPU will basically run instructions that are stored in the L2 or L3 cache. These instructions are addresses that reference objects in DRAM.
A multi-core CPU can run more instructions at once, which results in better performance. But there is a little problem with that: these cores have to interact with each other, and this slows the process down a bit.
So back to my question: why don't we use one CPU with a bigger cache? As far as I can tell, that should give more performance for less cost, right?
I know these are basic things I probably should know lol. I feel a little weird asking this.
Any answer would be welcome!
Multiple cores means you have duplicated circuitry which allows you to do more work in parallel. Each core has its own L1 dcache and icache, along with its own registers, decode units, execution pipelines, etc.
Just having a bigger cache and a 20 GHz clock won't give you as much performance, because you still have to share all the other resources.
As someone pointed out to me, I was forgetting the clock speed of CPUs.
Imagine a single core at 20 GHz: DRAM latency doesn't scale with the clock, so the core would spend most of its cycles stalled waiting on memory rather than doing useful work. Power and heat also make such clock rates impractical.
The same trade-off limits overclocking. -_-
I'm working on a password list generator program. This program needs to be as fast as possible, but it only uses 13% of the CPU.
What should I do to make it use all the CPU power available?
Heh. I thought it might be 8 cores. The reason is that your app is running on one thread and therefore only one core is being used. 13% is about 1/8 of 100 :)
If you can split the process up into 8 separate threads, then it will use the other 7 cores.
Obviously your program is only using one thread, and because of this not all cores of your CPU are used.
You have to make your program multithreaded, as in the sketch below.
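A minimal sketch of that split, assuming the candidates can be partitioned into independent ranges; generateRange is a hypothetical stand-in for your generator's inner loop:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelGenerator {
    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        long total = 1_000_000L;            // hypothetical number of candidates
        long chunk = total / cores;
        List<Future<Long>> results = new ArrayList<>();
        for (int i = 0; i < cores; i++) {
            final long start = i * chunk;
            final long end = (i == cores - 1) ? total : start + chunk;
            // Each task generates its own slice, so every core stays busy.
            results.add(pool.submit(() -> generateRange(start, end)));
        }
        long generated = 0;
        for (Future<Long> f : results) generated += f.get();
        pool.shutdown();
        System.out.println("generated " + generated + " candidates");
    }

    // Hypothetical stand-in for the real generator's inner loop.
    static long generateRange(long start, long end) {
        long count = 0;
        for (long i = start; i < end; i++) count++; // generate candidate #i here
        return count;
    }
}
```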
I'm writing an OpenGL C program, and I know that most graphical jobs are done by the GPU. My question is: can I use the GPU to compute things that are not graphics-related? For example, compute 1 + 2 + 3 + ... + 100?
You can, by using OpenCL or compute shaders (that's the DirectX name, but there's something similar in OpenGL). But in general it only makes sense for algorithms that are easy to parallelize and much bigger than your example.
You're looking for General Purpose GPU computing (GPGPU).
Check out CUDA and OpenCL.
I am not an expert on GPUs, but as far as I know, yes. Since the GPU is optimized for graphics operations, though, I can't speak to the performance and scalability.
Check this article.
Your question: can it do 1 + 2 + 3 + ... + 100?
Answer: yes.
This might raise another question: what is the advantage of using GPU hardware for computation?
Answer: it can execute hundreds of such operations ('1+2+...+100', '101+102+...+200', '201+202+...+300', ...) in parallel!
A GPU is capable of performing computations in parallel within fractions of a second. A GPU has hundreds of cores, and they can be utilized for non-graphics work as well. This many-core architecture can be exploited for scientific computations and simulations; read up on general-purpose GPU computing (GPGPU).
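To illustrate the data-parallel pattern a GPU exploits, here is the same kind of reduction expressed with Java's parallel streams on CPU cores. This is a CPU-side analogy, not GPU code: a GPU kernel applies the same split-and-combine idea across hundreds of cores.

```java
import java.util.stream.LongStream;

public class ParallelSums {
    public static void main(String[] args) {
        // Sum 1..100, 101..200, ..., 901..1000: ten independent reductions.
        // Each range is data-parallel, so the runtime can spread the work
        // across cores -- the same idea a GPU applies at much larger scale.
        long[] sums = LongStream.rangeClosed(0, 9)
                .parallel()
                .map(block -> LongStream
                        .rangeClosed(block * 100 + 1, (block + 1) * 100)
                        .sum())
                .toArray();
        for (int i = 0; i < sums.length; i++) {
            System.out.printf("%d..%d -> %d%n", i * 100 + 1, (i + 1) * 100, sums[i]);
        }
    }
}
```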
In the old (single-threaded) days we instructed our testing team to always report the CPU time, not the real time, of an application. That way, if they said an action took 5 CPU seconds in version 1 and 10 CPU seconds in version 2, we knew we had a problem.
Now, with more and more multi-threading, this doesn't seem to make sense anymore. Version 1 of an application could take 5 CPU seconds and version 2 could take 10 CPU seconds, yet version 2 could still be faster if version 1 is single-threaded and version 2 uses 4 threads (each consuming 2.5 CPU seconds).
On the other hand, using real time to compare performance isn't reliable either, since it can be influenced by lots of other factors (other applications running, network congestion, a very busy database server, a fragmented disk, ...).
What, in your opinion, is the best way to quantify performance?
Hopefully it's not intuition, since that is not an objective value and will probably lead to conflicts between the development team and the testing team.
Performance needs to be defined before it is measured.
Is it:
memory consumption?
task completion times?
disk space allocation?
Once defined, you can decide on metrics.
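If "task completion time" is the chosen metric, one concrete approach is to report wall-clock time and total CPU time together, so both of the question's scenarios stay visible. A minimal sketch using the standard ThreadMXBean; the thread count and busy-work loop are illustrative:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class WallVsCpuTime {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        final int workers = 4;
        long[] cpuNanos = new long[workers];
        long[] sink = new long[workers];  // keeps the JIT from discarding the work
        long wallStart = System.nanoTime();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            pool[i] = new Thread(() -> {
                long x = 0;
                for (long j = 0; j < 300_000_000L; j++) x += j;   // CPU-bound busy work
                sink[id] = x;
                cpuNanos[id] = threads.getCurrentThreadCpuTime(); // this thread's CPU time
            });
            pool[i].start();
        }
        for (Thread t : pool) t.join();

        long wallMs = (System.nanoTime() - wallStart) / 1_000_000;
        long cpuMs = 0;
        for (long n : cpuNanos) cpuMs += n / 1_000_000;
        // On a machine with >= 4 free cores, total CPU time comes out roughly
        // 4x the wall time: the same work, just spread over more threads.
        System.out.printf("wall = %d ms, total CPU = %d ms%n", wallMs, cpuMs);
    }
}
```

Reporting the pair makes the regression question answerable: more CPU time at equal or lower wall time suggests added parallelism, while more CPU time at higher wall time suggests a genuine slowdown.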