Using a cloud service to transform a picture using a neural algorithm? - amazon-s3

Yesterday I tried to transform a picture into an artistic style using CNNs, based on A Neural Algorithm of Artistic Style by Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge, using a recent Torch implementation, as explained here:
https://github.com/mbartoli/neural-animation
It started the conversion correctly; the problem is that the process is very time-consuming: after 1 hour of processing, a simple picture was still not fully transformed. And I have to transform 1615 pictures. What's the solution here? Can I use the Google Cloud Platform to make this operation faster? Or some other kind of cloud service? Using my home PC is not the right solution. If I can use cloud power, how do I configure everything? Let me know, thanks.

Using Google Cloud Platform (GCP) here would seem to be a good use case. If we boil it down to what you have: an application which is CPU-intensive and takes a long time to run. Depending on the nature of the application, any single instance of it may run faster with more CPUs and/or more RAM. GCP allows YOU to choose the size of the machine on which your application runs, from VERY small to VERY large; the distinction is how much you are willing to pay. Remember, you only pay for what you use. If an application takes an hour to run on a machine with price X but takes 30 minutes on a different machine with price 2X, then the cost will still only be X, but you will have a result in 30 minutes rather than an hour. You would switch off the machine after the 30 minutes to stop being charged.
Since you also said that you have MANY images to process, this is where you can take advantage of horizontal scale. Instead of having just one machine, where each image takes an hour and all results are serialized, you can create an array of machines where each machine processes one picture. So if you had 50 machines, at the end of one hour you would have 50 images processed instead of one.
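To make the horizontal-scale idea concrete, here is a minimal sharding sketch (purely illustrative: the worker count, the WORKER_INDEX environment variable and the images.txt listing are assumptions, not part of any GCP API):

    # Hypothetical sharding sketch: split the image list across N worker VMs.
    # Each worker is started with its own index (here via a WORKER_INDEX
    # environment variable) and only processes its slice of the full list.
    import os

    NUM_WORKERS = 50                                  # illustrative; pick for your cost/time trade-off
    worker_index = int(os.environ.get("WORKER_INDEX", "0"))

    with open("images.txt") as f:                     # assumed file listing all 1615 image paths
        images = [line.strip() for line in f if line.strip()]

    for path in images[worker_index::NUM_WORKERS]:    # every Nth image, offset by this worker's index
        # Placeholder for the actual style-transfer command on this image.
        print(f"worker {worker_index} would process {path}")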
As for how to get this all going ... I'm afraid that is a much bigger story and one where a read of the GCP documentation will help tremendously. I suggest you read and have a play and THEN if you have specific questions, the community can try and provide specific answers.

Related

OptaPlanner VRPPD with Traffic and Time Windows

Is there currently a way to incorporate traffic patterns into OptaPlanner with the package and delivery VRP problem?
E.g. let's say I need to optimize 500 pickups and deliveries today and tomorrow amongst 30 vehicles, where each pickup has a 1-4 hr time window. I want to avoid busy areas of the city during rush hours when possible.
New pickups can also be added (or cancelled) in the meantime.
I'm sure this is a common problem. Does a decent solution exist for this in OptaPlanner?
Thanks!
Users often do this, but there is no out-of-the-box example of it.
There are several ways to do it, but one way is to add a third dimension to the distanceMatrix, indicating the departureTime. Typically you'd use a granularity of 15 minutes, 30 minutes or 1 hour.
There are 2 scaling concerns here:
Memory: a granularity of 15 minutes means 24 * 4 = 96 time buckets per day. Given that with 2 dimensions a 10k-location distanceMatrix already uses almost 2 GB of RAM, a third dimension of 96 buckets multiplies that roughly by 96, so memory can clearly become a concern.
Pre-calculation time: calculating the distance matrix can be time consuming. "Bulk algorithms" can help here. For example, the GraphHopper community edition doesn't support bulk distance calculations, but their enterprise version - as well as OSRM (which is free) - does. Getting a 3-dimensional matrix from the remote Google Maps API, or the remote enterprise GraphHopper API, can also create bandwidth concerns (see above: the distance matrix can become several GB in size, especially in non-binary formats such as JSON or CSV).
In any case, once that 3-dimensional matrix is there, it's just a matter of adjusting the OptaPlanner example's ArrivalTimeUpdateListener to use getDistance(from, to, departureTime).
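As a rough, language-agnostic illustration of the idea (this is not OptaPlanner code; the names and the matrix layout are made up):

    # Hypothetical sketch of a time-bucketed distance lookup (not the OptaPlanner API).
    BUCKET_MINUTES = 15
    BUCKETS_PER_DAY = 24 * 60 // BUCKET_MINUTES   # 96 buckets per day

    def get_distance(matrix, from_idx, to_idx, departure_minutes):
        """Look up travel time in a 3-dimensional matrix indexed as
        matrix[from][to][time_bucket]."""
        bucket = (departure_minutes // BUCKET_MINUTES) % BUCKETS_PER_DAY
        return matrix[from_idx][to_idx][bucket]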

Curious on how to use some basic machine learning in a web application

A co-worker and I had an idea to create a little web game where a user enters a chunk of data about themselves and then the application would write for them to sound like them in certain structures. (Trying to leave the idea a little vague.) We are both new to ML and thought this could be a fun first dive.
We have a decent bit of background with PHP, JavaScript (FE and Node), Ruby, and a little bit of other languages, and we have had an interest in learning Python for ML. Curious whether you can run a cost-efficient ML library for text well within a web app, given that most servers lack GPUs?
Perhaps you have to pay for one of the cloud-based systems, but I wanted to find the best entry point for this idea without racking up too much cost. (So far I have been reading about running PyTorch or TensorFlow, but it sounds like you lose a lot of efficiency running on CPUs.)
Thank you!
(My other thought is doing it via an iOS app and trying Apple's ML setup.)
It sounds like you are looking for something like TensorFlow.js.
Yes - before jumping into training something with deep learning (which might even be unnecessary for your purpose), try to build a nice and simple baseline for this.
Before deep learning (just a few years ago), people did similar tasks using n-gram-based language models. https://web.stanford.edu/~jurafsky/slp3/3.pdf
Essentially you try to predict the next few words probabilistically, given a small context (of n words; typically n is small, like 5 or 6).
This should be a lot of fun to work out and will certainly do quite well with a small amount of data. Also, such a model will run blazingly fast, so you don't have to worry about GPUs and compute.
To improve on these results with deep learning, you'll need to collect a ton of data first, and it will take work to get it fast on a web-based platform.
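As a minimal sketch of such a baseline (a tiny trigram model; the function names and toy training text are made up for illustration):

    import random
    from collections import defaultdict, Counter

    def train_ngram_model(text, n=3):
        """Count which word follows each (n-1)-word context."""
        words = text.split()
        model = defaultdict(Counter)
        for i in range(len(words) - n + 1):
            context = tuple(words[i:i + n - 1])
            model[context][words[i + n - 1]] += 1
        return model

    def generate(model, seed, length=20, n=3):
        """Repeatedly sample the next word in proportion to its observed count."""
        out = list(seed)
        for _ in range(length):
            counts = model.get(tuple(out[-(n - 1):]))
            if not counts:
                break
            words, weights = zip(*counts.items())
            out.append(random.choices(words, weights=weights)[0])
        return " ".join(out)

    # Example: train on whatever text the user typed about themselves.
    model = train_ngram_model("the cat sat on the mat and the cat slept on the mat")
    print(generate(model, seed=("the", "cat")))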

Using TensorFlow to identify Lego bricks?

Having read this article about a guy who uses TensorFlow to sort cucumbers into nine different classes, I was wondering if this type of process could be applied to a large number of classes. My idea would be to use it to identify Lego parts.
At the moment, a site like BrickLink describes more than 40,000 different parts, so it would be a bit different from the cucumber example, but I am wondering if it sounds suitable. There is no easy way to get hundreds of pictures for each part, but does the following process sound feasible:
take pictures of a part;
try to identify the part using TensorFlow;
if it does not identify the correct part, take more pictures and feed the neural network with them;
go on with the next part.
That way, each time we encounter a new piece, we "teach" the network its reference so that it can be better recognized the next time. Like that, and after hundreds of iterations monitored by a human, could we expect TensorFlow to be able to recognize the parts? At least the most common ones?
My question might sound stupid but I am not into neural networks so any advice is welcome. At the moment I have not found any way to identify a lego part based on pictures and this "cucumber example" sounds promising so I am looking for some feedback.
Thanks.
You can read about the work of Jacques Mattheij; he actually uses a customized version of Xception [1] running on https://keras.io/.
The introduction is Sorting 2 Metric Tons of Lego.
In Sorting 2 Tons of Lego, The Software Side, you can read:
The hard challenge to deal with next was to get a training set large enough to make working with 1000+ classes possible. At first this seemed like an insurmountable problem. I could not figure out how to make enough images and to label them by hand in acceptable time, even the most optimistic calculations had me working for 6 months or longer full-time in order to make a data set that would allow the machine to work with many classes of parts rather than just a couple.
In the end the solution was staring me in the face for at least a week before I finally clued in: it doesn't matter. All that matters is that the machine labels its own images most of the time and then all I need to do is correct its mistakes. As it gets better there will be fewer mistakes. This very rapidly expanded the number of training images. The first day I managed to hand-label about 500 parts. The next day the machine added 2000 more, with about half of those labeled wrong. The resulting 2500 parts were the basis for the next round of training 3 days later, which resulted in 4000 more parts, 90% of which were labeled right! So I only had to correct some 400 parts, rinse, repeat… So, by the end of two weeks there was a dataset of 20K images, all labeled correctly.
This is far from enough, some classes are severely under-represented so I need to increase the number of images for those, perhaps I'll just run a single batch consisting of nothing but those parts through the machine. No need for corrections, they'll all be labeled identically.
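The core of that workflow, as a rough sketch (every function passed in here is a stand-in for your own model, labelling UI and training code, not a real API):

    # Rough sketch of the self-labelling loop described above; predict_label,
    # correct_mistakes and retrain are stand-ins, not real library calls.
    def bootstrap_dataset(model, unlabeled_batches, hand_labeled,
                          predict_label, correct_mistakes, retrain):
        dataset = list(hand_labeled)                  # e.g. the ~500 hand-labelled parts
        for batch in unlabeled_batches:
            proposals = [(img, predict_label(model, img)) for img in batch]  # machine labels its own images
            dataset += correct_mistakes(proposals)    # a human only fixes the wrong labels
            model = retrain(model, dataset)           # mistakes shrink with each round
        return model, dataset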
A recent update is Sorting 2 Tons of Lego, Many Questions, Results.
[1] Chollet, François. Xception: Deep Learning with Depthwise Separable Convolutions. arXiv preprint arXiv:1610.02357, 2016.
I have started this using IBM Watson's Visual Recognition.
I had six different bricks to be recognized on the transport belt background.
I am actually thinking about TensorFlow, since I can have it running locally.
The TensorFlow for Poets codelab describes almost exactly what you want to achieve.
For a demo of the Watson version:
https://www.ibm.com/developerworks/community/blogs/ibmandgoogle/entry/Lego_bricks_recognition_with_Watosn_lego_and_raspberry_pi?lang=en
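For a concrete feel of the retraining approach both answers point at, here is a minimal transfer-learning sketch with Keras and a pretrained Xception backbone (assumptions: a recent TensorFlow 2.x, an image folder laid out as one subdirectory per part class, and illustrative layer sizes and class count):

    import tensorflow as tf
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import Xception

    NUM_CLASSES = 1000   # illustrative; Mattheij worked with 1000+ part classes

    # Reuse ImageNet features and only train a small classification head on top.
    base = Xception(weights="imagenet", include_top=False, pooling="avg",
                    input_shape=(299, 299, 3))
    base.trainable = False

    model = models.Sequential([
        layers.Rescaling(1.0 / 127.5, offset=-1, input_shape=(299, 299, 3)),  # Xception expects [-1, 1]
        base,
        layers.Dense(256, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Assumed layout: lego_images/<class_name>/<image>.jpg
    train_ds = tf.keras.utils.image_dataset_from_directory(
        "lego_images", image_size=(299, 299), batch_size=32)
    model.fit(train_ds, epochs=5)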

How to speed up Amazon EMR bootstrap?

I'm using Amazon EMR for some intensive computation, but it takes around 7 minutes to start computing. Is there some clever way to have my computation start immediately? The computation is a Python stream started from a user-facing website, so I can't really afford a long startup.
I might have simply missed an option in the ocean that is Amazon AWS. I just want simplicity in launching jobs (that's why I used EMR), scalability, and to pay only for what I use (and startup time is not useful).
I know this is an old question, but I have some insights to add for the next searcher who finds this thread in hopes of speeding up bootstrap times on Amazon EMR.
For a while I have wondered why my clusters took so long to start, usually about 15 minutes. That is a pretty big chunk of time for a job that usually completes in under 1 hour. Sometimes it pushes the job past 1 hour, but thankfully I believe AWS does not charge for the full bootstrap time.
The last couple of days I noticed my startup times had improved. You see, the spot market became very volatile during April and the first week of May. Normally I start my cluster entirely on spot instances, as failure is an option and the cost savings justify the technique in my case. However, after waiting 14 hours for clusters to start, I had to switch to On-Demand; I only have so much patience, and overnight usually exceeds it. The On-Demand clusters start in about 5 minutes. Now, having switched back to spot as the madness seems to have abated, I am back to 15 minutes for a cluster.
So if you are using spot instances for your core or master nodes, expect a longer startup time. I will be experimenting with using a small set of On-Demand instances in the core and augmenting with a large number of spot instances, to see if it helps startup and deals better with spot market volatility.
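If you launch clusters programmatically, that mix can be expressed per instance group. A hedged boto3 sketch (cluster name, release label, instance types, counts and region are placeholders, not recommendations):

    import boto3

    emr = boto3.client("emr", region_name="us-east-1")   # region is a placeholder

    emr.run_job_flow(
        Name="hybrid-cluster",                  # placeholder name
        ReleaseLabel="emr-5.30.0",              # use whatever release you already run
        Instances={
            "InstanceGroups": [
                # Keep the master and core on On-Demand so the cluster can always start...
                {"Name": "Master", "InstanceRole": "MASTER", "Market": "ON_DEMAND",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE", "Market": "ON_DEMAND",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
                # ...and add cheaper spot capacity as task nodes.
                {"Name": "Task", "InstanceRole": "TASK", "Market": "SPOT",
                 "InstanceType": "m5.xlarge", "InstanceCount": 10},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )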
This is pretty normal and there is little you can do about it. I'm starting 100+ node clusters and I've seen them take 15+ minutes before they start processing. Given the amount of work that's going on in the background, I'm pretty happy to allow them the 15 minutes or so to get the cluster configured and read in whatever data may be required. Nature of the beast, I'm afraid.
Where's your data source hosted?
If it's on S3 (probably) and you have many tiny files, it's the latency of each connection (per file) that is taking the time.
If that's the only reason, then your 7 minutes of start-up time would translate to roughly 5 minutes of reading from S3, which corresponds to roughly 1 GB of input files on S3.

Algorithmically suggest best node to perform demanding computation

At work we perform demanding numerical computations.
We have a network of several Linux boxes with different processing capabilities. At any given time, there can be anywhere from zero to dozens of people connected to a given box.
I created a script to measure the MFLOPS (Million of Floating Point Operations per Second) using the Linpack Benchmark; it also provides number of cores and memory.
I would like to use this information together with the load average (obtained using the uptime command) to suggest the best computer for performing a demanding computation. In other words, it's 3:00 pm; I have a meeting in two hours; I need to run a demanding process: what node will get me the answer fastest?
I envision a script which will output a suggestion along the lines of:
SUGGESTED HOSTS (IN ORDER OF PREFERENCE)
HOST1.MYNETWORK
HOST2.MYNETWORK
HOST3.MYNETWORK
Such suggestion should favor fast computers (high MFLOPS) if the load average is low and, as load average increases for a given node, it should favor available nodes instead (i.e., I'd rather run in a slower computer with no users than in an eight-core with forty dudes logged in).
How should I prioritize? What algorithm (rationale) would you use? Again, what I have is:
Load Average (1min, 5min, 15min)
MFLOPS measure
Number of users logged in
RAM (installed and available)
Number of cores (important to normalize the load average)
Any thoughts? Thanks!
You don't have enough data to make a well-informed decision. It sounds as though the scheduling is very volatile: "At any given time, there can be anywhere from zero to dozens of people connected to a given box." So the current load does not necessarily reflect the future load of the machines.
To properly assess which hosts someone should use to minimize computation time would require knowing when the current jobs will terminate. If a powerful machine is about to be done with most of its jobs, it would be a good candidate even though it currently has a high load.
If you want to guess purely from the current situation, you can do a weighted calculation to find out which hosts have the most MFLOPS available:
MFLOPS available = host's MFLOPS * (number of logical processors - load average) / number of logical processors
Sort the hosts by MFLOPS available and suggest them in a descending order.
This formula assumes that the MFLOPS of a host is linearly related to its load average. This might not be exactly true, but it's probably fairly close.
I would favor the most recent load average, since it's closer to the current/future situation, whereas jobs from 15 minutes ago might have completed by now.
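A minimal sketch of that ranking (the host measurements are hard-coded placeholders; in practice you'd feed in your Linpack MFLOPS figures and the 1-minute load average from uptime):

    def available_mflops(mflops, cores, load_1min):
        """Estimate spare capacity: scale peak MFLOPS by the fraction of idle cores."""
        return mflops * max(0.0, cores - load_1min) / cores

    # Hypothetical measurements gathered from each node.
    hosts = [
        {"name": "HOST1.MYNETWORK", "mflops": 12000, "cores": 8, "load": 6.5},
        {"name": "HOST2.MYNETWORK", "mflops": 4000,  "cores": 4, "load": 0.2},
        {"name": "HOST3.MYNETWORK", "mflops": 8000,  "cores": 8, "load": 2.0},
    ]

    print("SUGGESTED HOSTS (IN ORDER OF PREFERENCE)")
    for h in sorted(hosts, reverse=True,
                    key=lambda h: available_mflops(h["mflops"], h["cores"], h["load"])):
        print(h["name"])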
Have you considered a distributed approach to computation? Not all computations can be broken up such that more than one cpu can work on them. But perhaps your problem space can benefit from some parallelization. Have a look at Hadoop.
You don't need to know FLOPS. The Beowulf-style parallel computing center I go to (PDC) has a script for this, for sure.
PDC operates leading-edge, high-performance computers on a national level. PDC offers easily accessible computational resources that primarily cater to the ...