Is OptaPlanner the Right Tool for Scheduling Manufacturing Orders?

Would you consider OptaPlanner to be the right tool for planning manufacturing operations with multi-level routings (final product, subassembly1, subassembly2, subassembly11, subassembly12, ...)?
We are talking about several thousand manufacturing orders with 10-20 operations each.
Looks like project shop scheduling, I know. I'm just concerned about the amount of data and the ability to find an optimal solution in a reasonable amount of time...
Are there real world examples for this problem domain and OptaPlanner out there?

See the project job scheduling example. That's not our easiest or prettiest example, but it works and you can make it pretty.
As for scaling: if it ends up being a problem (I doubt it for only 1k entities), there are plenty of power-tweaking options (multithreaded solving, partitioned search, ...).

Related

Optimization algorithms optimizing an existing system connections

I am currently working on an existing infrastructure where I have about 1,000 customer sites connected to about 5 different hubs. A customer site may connect to one or two hubs to ensure reliability, but each customer site is connected to at least one hub. I want to check whether the current system is the best, or whether it can be optimised to give better connections from customer sites to hubs, to help improve connectivity and reliability. Can you suggest good optimisation algorithms to look into? Thank you.
Sounds like you're doing some variation of the Facility Location Problem.
This is a well-known problem, and while there are algorithms that can solve for the global optimum (Dijkstra's Algorithm, or other variants of Dynamic Programming), they do not scale well (i.e. you run into the curse of dimensionality). You could try this, but 1000 already sounds pretty big (it depends on your exact problem formulation though).
I'd recommend taking a look at the Coursera MOOC Discrete Optimization. You don't have to take the whole course, but in the "Assignments" section of the video lectures, he also explains a variant of the facility location problem and some possible approaches to think about; once you've decided which one you want to use, you can look deeper into that particular approach.
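To make the kind of approach more concrete, here is a minimal sketch (in C, with hypothetical names and a made-up cost matrix) of a greedy heuristic that connects each site to its cheapest hub and reports the total cost. A real tool would load real site-to-hub costs and then improve this starting solution with local search or an exact solver; this is only an illustration, not a recommendation of a specific algorithm.

```c
/* Sketch: greedy site-to-hub assignment as a starting point for a
 * facility-location-style optimisation.  All names and the cost data
 * are hypothetical; a real tool would load costs from your network data
 * and improve this initial solution with local search or a MIP solver. */
#include <stdio.h>

#define NUM_SITES 6      /* in the real problem this would be ~1000 */
#define NUM_HUBS  3      /* ~5 hubs in the question */

/* cost[s][h]: cost (latency, distance, ...) of connecting site s to hub h */
static const double cost[NUM_SITES][NUM_HUBS] = {
    {4.0, 2.5, 7.0}, {3.0, 6.0, 1.5}, {5.5, 2.0, 2.0},
    {1.0, 8.0, 4.0}, {6.0, 3.5, 2.5}, {2.0, 2.0, 9.0},
};

int main(void)
{
    int assignment[NUM_SITES];
    double total = 0.0;

    /* Greedy step: connect every site to its cheapest hub. */
    for (int s = 0; s < NUM_SITES; s++) {
        int best = 0;
        for (int h = 1; h < NUM_HUBS; h++)
            if (cost[s][h] < cost[s][best])
                best = h;
        assignment[s] = best;
        total += cost[s][best];
        printf("site %d -> hub %d (cost %.1f)\n", s, best, cost[s][best]);
    }
    printf("total connection cost: %.1f\n", total);

    /* Hub capacity constraints, or a second (redundant) hub per site,
     * would be added here and handled by the improvement step. */
    return 0;
}
```

The point of the sketch is only the structure of the problem (a cost per site/hub pair, an assignment, an objective to minimise); which improvement technique to layer on top is exactly what the course linked above walks through.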

Any place to get simple planning problems and datasets?

I am implementing a timetabling application with Drools Planner. Its examples include:
1. N queens
2. Allocating computer processes to n computers with constrained resources
3. Allocating beds for patients, each bed having special requirements
4. Allocating students to examinations
and so on.
I want to practice simple planning problems like #1 and #2 to hone my Drools Planner skills. I prefer state problems over path problems.
What are simple enough planning problems that have obvious feasible solutions?
Is there any online resource to get such problems and problem datasets?
As a general answer, you may check the list of datasets for machine learning research. It contains datasets of growing size that you can use for different things, including planning problems.
Project Euler and ECLiPSe CLP are also nice resources.

How do I work out the cost benefit of optimisation?

I want to figure out how much money I'd save if I optimise some part of my web app. If I save 100 cpu milliseconds over 50K calls to the app, how much electricity is that not using in a day? How about over a year?
I've tried to find some figures through Google, but my googling mojo is failing me at present.
You can't calculate something that specific. You can only conduct an experiment and see what happens.
But honestly I would rather spend time refactoring code for better maintainability and adding new features the customers will like and pay for, so that I won't have to think about electricity.
When "optimizing" it is always important to focus on what you want to "optimize" - in this case, your electricity bill. I would not even bother looking at changing code in an attempt to affect your electricity bill. I would look at the computer's power supply, cooling fans, heat sink, etc. and optimize those things for energy efficiency (buy new, more efficient components). More than likely it will cost less than several hours of a software engineer "optimizing" code for energy efficiency.
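To put rough numbers behind that advice, here is a back-of-envelope sketch of the figures in the question. The 100 W marginal power draw and the $0.15/kWh price are assumptions, not measurements, so treat the output as an order-of-magnitude estimate only.

```c
/* Back-of-envelope estimate of the electricity saved by shaving
 * 100 CPU ms off 50,000 calls per day.  The marginal power draw
 * (100 W) and electricity price ($0.15/kWh) are assumed values. */
#include <stdio.h>

int main(void)
{
    const double saved_ms_per_call = 100.0;
    const double calls_per_day     = 50000.0;
    const double marginal_watts    = 100.0;   /* assumed extra draw under load */
    const double price_per_kwh     = 0.15;    /* assumed electricity price */

    double cpu_seconds_per_day = saved_ms_per_call * calls_per_day / 1000.0; /* 5000 s */
    double kwh_per_day  = cpu_seconds_per_day / 3600.0 * marginal_watts / 1000.0;
    double kwh_per_year = kwh_per_day * 365.0;

    printf("CPU time saved per day : %.0f s (~%.1f h)\n",
           cpu_seconds_per_day, cpu_seconds_per_day / 3600.0);
    printf("Energy saved per day   : %.3f kWh\n", kwh_per_day);
    printf("Energy saved per year  : %.1f kWh (~$%.2f)\n",
           kwh_per_year, kwh_per_year * price_per_kwh);
    return 0;
}
```

Even under these fairly generous assumptions the saving comes out around 50 kWh (under ten dollars) per year, which supports the answers above: engineer time is the dominant cost here, not electricity.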

Profiling a VxWorks system

We've got a fairly large application running on VxWorks 5.5.1 that's been developed and modified for around 10 years now. We have some simple home-grown tools to show that we are not using too much memory or too much processor, but we don't have a good feel for how much headroom we actually have. It's starting to make it difficult to do estimates for future enhancements.
Does anybody have any suggestions on how to profile such a system? We've never had much luck getting the Wind River tools to work.
For bonus points: the other complication is that our system has very different behaviors at different times; during start-up it does a lot of stuff, then it sits relatively idle except for brief bursts of activity. If there is a profiler with some programmatic way to have it record state information, I think that'd be very useful too.
FWIW, this is compiled with GCC and written entirely in C.
I've done a lot of performance tuning of various kinds of software, including embedded applications. I won't discuss memory profiling - I think that is a different issue.
I can only guess where the "well-known" idea originated that to find performance problems you need to measure performance of various parts. That is a top-down approach, similar to the way governments try to control budget waste, by subdividing. IMHO, it doesn't work very well.
Measurement is OK for seeing if what you did made a difference, but it is poor at telling you what to fix.
What is good at telling you what to fix is a bottom-up approach, in which you examine a representative sample of microscopic units of what is being spent and find out the full explanation of why each one is being spent. This works for a simple statistical reason: if there is a reason why some percentage (for example 40%) of samples can be saved, then on average 40% of samples will show it, and it doesn't require a huge number of samples. It does require that you examine each sample carefully, and not just sort of aggregate them into bigger bunches.
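As a small illustration of that statistical claim (the 40% figure and the sample counts below are just example numbers, not from any real program), the chance of a 40% cost showing up in at least two of N random samples can be computed directly:

```c
/* Illustration of the sampling argument above: if some activity is on the
 * stack 40% of the time, how likely is it to appear in at least 2 of N
 * random samples?  The 40% and the sample counts are example numbers. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double p = 0.40;                 /* fraction of time spent in the hotspot */
    const int sample_counts[] = {5, 10, 20};

    for (int i = 0; i < 3; i++) {
        int n = sample_counts[i];
        /* P(at least 2 hits) = 1 - P(0 hits) - P(1 hit) */
        double p0 = pow(1.0 - p, n);
        double p1 = n * p * pow(1.0 - p, n - 1);
        printf("%2d samples: hotspot seen >= 2 times with probability %.3f\n",
               n, 1.0 - p0 - p1);
    }
    return 0;
}
```

With as few as 10 samples, a 40% cost shows up at least twice about 95% of the time, which is the "doesn't require a huge number of samples" point above.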
As a historical example, this is what Harry Truman did at the outbreak of the U.S. involvement in WW II. There was terrific waste in the defense industry. He just got in his car, drove out to the factories, and interviewed the people standing around. Then he went back to the U.S. Senate, explained what the problems were exactly, and got them fixed.
Maybe this is more of an answer than you wanted. Specifically, this is the method I use, and this is a blow-by-blow example of it.
ADDED: I guess the idea of finding-by-measuring is simply natural. Around '82 I was working on an embedded system, and I needed to do some performance tuning. The hardware engineer offered to put a timer on the board that I could read (providing from his plenty). IOW he assumed that finding performance problems required timing. I thanked him and declined, because by that time I knew and trusted the random-halt technique (done with an in-circuit-emulator).
If you have the Auxiliary Clock available, you could use the SPY utility (configurable via the config.h file) which does give you a very rough approximation of which tasks are using the CPU.
The nice thing about it is that it does not require being attached to the Tornado environment and you can use it from the Kernel shell.
Otherwise, btpierre's suggestion of using taskHookAdd has been used successfully in the past.
I've worked on systems that have had luck using locally-built monitoring utilities based on taskSwitchHookAdd and related functions (delete hook, etc).
"Simply" use this to track the number of ticks a given task runs. I realize that this is fairly gross scale information for profiling, but it can be useful depending on your needs.
To see how much cpu% each task is using, calculate the percentage of ticks assigned to each task.
To see how much headroom you have, add a lowest-priority "idle" task that just does "while(1){}", and see how much cpu% is assigned to it. Roughly speaking, that's your headroom.
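For what it's worth, here is a minimal sketch of that tick-counting idea as I understand it, not production code: a task switch hook charges elapsed ticks to the task being switched out, and a lowest-priority idle task provides the headroom measurement. The table size, task names, stack size and priority are assumptions, and the hook deliberately does nothing that could block.

```c
/* Sketch of per-task tick accounting on VxWorks 5.x using a task switch
 * hook plus a lowest-priority idle task.  Table sizes, names and
 * priorities are assumptions; hook code must stay short and non-blocking. */
#include <vxWorks.h>
#include <taskLib.h>
#include <taskHookLib.h>
#include <tickLib.h>
#include <stdio.h>

#define MAX_TRACKED_TASKS 64

typedef struct { int taskId; unsigned long ticks; } TASK_TICKS;

static TASK_TICKS taskTicks[MAX_TRACKED_TASKS];
static unsigned long lastSwitchTick;

/* Called by the kernel on every context switch; charge the elapsed
 * ticks to the task that is being switched out. */
static void tickCountSwitchHook (WIND_TCB *pOldTcb, WIND_TCB *pNewTcb)
{
    unsigned long now = tickGet ();
    int oldId = (int) pOldTcb;     /* task IDs are TCB pointers on 5.x */
    int i;

    for (i = 0; i < MAX_TRACKED_TASKS; i++) {
        if (taskTicks[i].taskId == oldId || taskTicks[i].taskId == 0) {
            taskTicks[i].taskId = oldId;
            taskTicks[i].ticks += now - lastSwitchTick;
            break;
        }
    }
    lastSwitchTick = now;
}

/* Lowest-priority task: whatever percentage of ticks it accumulates is
 * (roughly) your CPU headroom. */
static int idleMeasureTask (void)
{
    for (;;)
        ;                          /* never blocks, never yields */
}

STATUS tickCountStart (void)
{
    lastSwitchTick = tickGet ();
    if (taskSwitchHookAdd ((FUNCPTR) tickCountSwitchHook) != OK)
        return ERROR;
    /* priority 255 is the lowest; the 4 KB stack is an arbitrary choice */
    taskSpawn ("tIdleMeas", 255, 0, 4096, (FUNCPTR) idleMeasureTask,
               0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
    return OK;
}

/* Dump cpu% per task: each task's share of the accumulated ticks. */
void tickCountShow (void)
{
    unsigned long total = 0;
    int i;

    for (i = 0; i < MAX_TRACKED_TASKS && taskTicks[i].taskId != 0; i++)
        total += taskTicks[i].ticks;
    for (i = 0; i < MAX_TRACKED_TASKS && taskTicks[i].taskId != 0; i++) {
        char *name = taskName (taskTicks[i].taskId);
        printf ("%-16s %6.2f%%\n", name ? name : "(deleted)",
                total ? 100.0 * taskTicks[i].ticks / total : 0.0);
    }
}
```

Both tickCountStart and tickCountShow can be invoked from the kernel shell. Since the question mentions very different start-up and steady-state behaviour, it may be worth zeroing the taskTicks table between phases so each phase gets its own percentages.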

Commercial uses for grid computing?

I keep hearing from associates about grid computing which, from what I can gather, is highly distributed stuff along the lines of SETI@home.
Is anyone working on these sort of systems for business use? My interest is in figuring out if there's a commercial reason for starting software development in this field.
Rendering farms, such as Pixar's
Model evaluation, e.g. weather, financials, military
Architectural engineering, e.g. earthquakes
To list a few.
Grid computing is really only needed if you have a lot of WORK that needs to be done, like folding proteins, otherwise a simple server farm will likely be plenty.
Obviously Google are major users of grid computing; their entire search service relies on it, along with many of their other services.
Engines such as BigTable are based on using lots of nodes for storage and computation. These are commercially very useful because they're a good alternative to a small number of big servers, providing better redundancy and cost effective scaling.
The downside is that the software is fiendishly difficult to write, but Google seem to manage that one ok :)
So anything which requires big storage and/or lots of computation.
I used to work for these guys. Grid computing is used all over. Anyone who makes computer chips uses them to test designs before getting physical silicon cut. Financial websites use grids to calculate if you qualify for that loan. These days they are starting to replace big iron in a lot of places, as they tend to be cheaper to maintain over the long term.