I'm working on a project using Hadoop, and now I want to test a data-intensive application on it. I've looked at the Apache Mahout machine learning algorithms. Are there any open source applications running over Hadoop that use Apache Mahout's machine learning algorithms?
You can start with the official Mahout page, Powered by Mahout, where you can find a list of commercial and academic uses of the Mahout software. I would guess some of them are open source, but I haven't checked myself.
As with Ray and Horovod, I want to run TensorFlow in a distributed fashion, this time using the Apache Ignite framework, but I can't seem to find good examples of how to achieve distributed training.
Are there any good notebooks or tutorials out there?
The use of TensorFlow with Ignite or GridGain Community Edition is described here: https://www.gridgain.com/docs/latest/integrations/tensorflow/deep-learn-tensor
More information can be found here: https://github.com/gridgain/gridgain/tree/master/modules/tensorflow
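If it helps, here is a minimal sketch of reading an Ignite cache as a tf.data source, based on the TensorFlow 1.x contrib integration those docs describe; the cache name and port below are placeholder assumptions:

```python
import tensorflow as tf
from tensorflow.contrib.ignite import IgniteDataset

# Placeholder cache name and thin-client port; replace with your own values.
dataset = IgniteDataset(cache_name="MY_CACHE", port=10800)

# Iterate over the cache contents as ordinary tf.data elements.
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    while True:
        try:
            print(sess.run(next_element))
        except tf.errors.OutOfRangeError:
            break
```

As I understand those docs, the distributed-training orchestration (starting TensorFlow workers on the Ignite cluster nodes) is driven from the Ignite side; the Python script mainly consumes the dataset.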
We are doing a project on "gait analysis" in Ubuntu using skeleton tracking. Which libraries will we need in order to start the project? Are there any available tutorials that can help us?
This link should get you started: kinect with python
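Tutorials aside, here's a small, tracker-agnostic sketch of the kind of per-frame computation gait analysis typically needs: a joint angle derived from three tracked 3-D joint positions (the coordinates below are made-up placeholders):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) formed by the segments b->a and b->c."""
    ba = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bc = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    # Clip guards against tiny floating-point overshoots outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Hypothetical per-frame (x, y, z) joint positions from your tracker.
hip, knee, ankle = (0.0, 1.0, 0.0), (0.0, 0.5, 0.05), (0.0, 0.0, 0.0)
print(joint_angle(hip, knee, ankle))  # knee flexion angle for this frame
```

Whatever skeleton-tracking library you pick, once it gives you joint coordinates per frame, this kind of angle-over-time series is the raw material for gait metrics.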
We develop a server-side solution, and to ease its deployment we would like to provide our customers with two options:
1. Docker image
2. VM image in OVA format
The images should be automatically created by our build machine.
As of today, we use Packer for this purpose. First we create the Docker image, and then install it into a preconfigured virtual machine image (using the 'virtualbox-ovf' builder). This works pretty well, but there are some problems with this solution.
First, our VM includes the Docker framework and two OSes (the host's and Docker's), so our VM image is roughly twice the size of the Docker image. Second, to base our solution on another Linux distro, we would have to manually configure a new base VM.
We are looking for a 'Dockerfile'-style solution to create and configure the VM automatically and then export it in OVA format. The 'virtualbox-iso' builder is the obvious way to do this, but the build process would be much longer.
If you are willing to use Debian as your base OS, then you could look at TurnKey Linux's TKLDev. There's probably a bit of a learning curve initially, but it's a pretty cool thing IMO (although I'm very biased; see the disclaimer below). TKLDev will build you a TurnKey (Debian-based) ISO with your software installed on top. Then, using Buildtasks, you can convert the ISO to OVA, VMDK, LXC, Docker, OpenStack, etc.
Unfortunately, Buildtasks is not very well documented, but significant chunks of it are in Bash, so if you are handy with a Linux command line you could work it out. Otherwise, ask on the TurnKey forums.
The initial development (migrating from Packer to TKLDev) may take a little while, but once the heavy lifting is done, creating an ISO (in a guest VM on a modern multicore PC) takes about 10-15 minutes, the OVA probably another ~5, and Docker another ~5.
If you wanted to make it build automatically, you could use a hook to trigger a fresh TKLDev build (including the Buildtasks image creation) every time a commit is made to a repo, as sketched below. I know that git supports this, and I assume that other version control systems allow something similar.
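For example, a minimal post-commit hook sketch; the build script path here is a hypothetical placeholder for whatever kicks off your TKLDev/Buildtasks run:

```python
#!/usr/bin/env python3
# .git/hooks/post-commit (must be executable). Runs after every commit
# and triggers an image rebuild for the new HEAD.
import subprocess

# Record which commit we are building (useful for tagging the artifacts).
commit = subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()
print(f"Triggering TKLDev/Buildtasks build for commit {commit}")

# Placeholder: replace with the script that runs your TKLDev build
# and the Buildtasks conversions (ISO -> OVA, Docker, etc.).
subprocess.run(["/usr/local/bin/rebuild-appliance", commit], check=True)
```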
Also if the appliance that you are making is open source then perhaps it could be added to the TurnKey Linux library?
Disclaimer: I work with TurnKey Linux. :)
FWIW, this is essentially the process we use to create our library of appliances in most virtualisation formats known to humankind!
My goal is to build a recommendation system, and after going through many articles I came across Mahout as a simple yet effective way to go. I already have XAMPP installed on my system.
How can I install Mahout? I need complete instructions, since I have worked with neither Cygwin nor Hadoop before, and everywhere I look I see these two mentioned very frequently. I first need to install it on my local machine before going on to install it on the server.
Here is a detailed instructions page for installing Apache Mahout with Hadoop on Windows. It's a bit tedious, but it can be done.
http://alans.se/blog/2010/mahout-on-hadoop-in-cygwin/
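Once it's installed, something like the following can serve as a smoke test: it shells out to Mahout's bin/mahout launcher to run the Hadoop-based item-based recommender job. The launcher path, input file, and flags are assumptions based on Mahout 0.x, so check `bin/mahout recommenditembased --help` for your version:

```python
# Shell out to Mahout's launcher to run the item-based recommender job.
# Paths and flags below are assumptions; adjust for your install.
import subprocess

subprocess.run([
    "bin/mahout", "recommenditembased",
    "--input", "ratings.csv",        # lines of: userID,itemID,rating
    "--output", "recommendations",   # output directory (HDFS or local)
    "--similarityClassname", "SIMILARITY_COSINE",
], check=True)
```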
I just followed the Hadoop (0.20.2) installation tutorial and did the setup. I can run a MapReduce program on the cluster through Eclipse. Now my problem is how to connect to the Hadoop cluster from my local system. The local system is Windows 7, and I have installed the Eclipse plugin for Hadoop. When I tried to connect to Hadoop from my local Windows system (my local system and the Hadoop system are in the same subnet), I got a connection timed out error.
In the Hadoop configuration files I have given the actual IP addresses.
I'm not sure which step I have missed.
I recently read that the Eclipse plugin won't work at all. But you can simply connect to your cluster with the configuration keys:
mapred.job.tracker
fs.default.name
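For example, the client-side conf files would point at the cluster like this (the host and ports are placeholders; copy the real values from your cluster's conf files):

```xml
<!-- core-site.xml (client side): where the NameNode lives.
     Host and ports below are placeholders; use your cluster's values. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.100:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml (client side): where the JobTracker lives. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.1.100:9001</value>
  </property>
</configuration>
```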
EDIT: a working version is available in this Apache Jira issue: Eclipse Plugin does not work with Eclipse Ganymede (3.4)