Is it possible to install a Hortonworks cluster on Docker on one single Linux machine - ambari

We want to build a test Hadoop cluster on one Linux machine, based on Docker containers.
Does Hortonworks (Cloudera) support this?
For example, HDP version 2.6.5.
For example, we need the following services:
HDFS (including at least 3 data nodes)
YARN
MapReduce2
Hive
ZooKeeper
Ambari Metrics
Kafka
Spark2
and all of these services should run on the one Linux machine.

@Jessica it is not possible to do a 3-node (HDFS) setup on a single-node machine. Other than that, a single-node cluster can run all of the services you have listed and is suitable for learning, training, and demos or proofs of concept.
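If a single-node cluster is acceptable, one common way to get all of those services onto one Linux box is to run the HDP sandbox image in Docker; it ships Ambari plus HDFS, YARN, MapReduce2, Hive, ZooKeeper, Kafka and Spark2 in a single container. A rough sketch, assuming the hortonworks/sandbox-hdp:2.6.5 image tag and the standard Ambari port (verify both against the official sandbox instructions):

    # Pull the HDP 2.6.5 sandbox image (tag is an assumption; check Docker Hub)
    docker pull hortonworks/sandbox-hdp:2.6.5

    # Run it as one privileged container and expose the Ambari UI on port 8080
    docker run -d --name sandbox-hdp \
      --hostname sandbox-hdp.hortonworks.com \
      --privileged \
      -p 8080:8080 \
      hortonworks/sandbox-hdp:2.6.5

    # Then open http://<docker-host>:8080 and start the services from Ambari

This gives you one NameNode and one DataNode: all the services run, but there is no real replication to test.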

Related

Running impala service alone in docker

I am trying to install Impala in a Docker container (using the MapR documentation). In this Docker container I am running only the Impala service; the remaining Hive and MapR-FS services will be running on a physical node. When starting impala-server (the Impala daemon) I get weird errors. I just wanted to know whether this kind of installation is possible or not.
Thanks for the help!!
It is possible, but it depends on your Impala and MapR version. Impala 2.2.0 is supported on MapR 5.x. Impala 2.5.0 is supported on MapR 5.1 and later. Check MapR's Impala compatibility documentation before proceeding.
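For example, you can confirm which versions you actually have before wiring the Impala container to the physical MapR nodes. A small sketch; the path to the MapR build file is an assumption based on a default install:

    # Inside the Impala container: print the Impala daemon version
    impalad --version

    # On the physical MapR node: print the installed MapR core version
    cat /opt/mapr/MapRBuildVersion

If the two versions are not a supported combination, startup errors like the ones you describe are likely.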

How to install cloudera on top of apache hadoop 2.7.1

I am currently working with Apache Hadoop 2.7.1; the cluster includes 1 name node and 3 data nodes.
Is it possible to install Cloudera Manager on an existing Apache Hadoop 2.7.1 cluster? If yes, could you please suggest how it can be done?
Thanks in advance.
No, that is not possible. It goes the other way around: you install Cloudera Manager, and then you deploy the Hadoop components from its web console.
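The usual flow is therefore the reverse of what you describe: install Cloudera Manager on a host first, then add your four machines and deploy CDH from its wizard. A rough sketch for CM 5 (the installer URL and port are assumptions to verify against the CM 5 installation guide):

    # On the host that will run Cloudera Manager Server
    wget https://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
    chmod u+x cloudera-manager-installer.bin
    sudo ./cloudera-manager-installer.bin

    # Then browse to http://<cm-host>:7180, add the 1 name node + 3 data node hosts,
    # and let the wizard deploy HDFS/YARN rather than reusing the Apache 2.7.1 install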

Configure Redis Cluster in Ubuntu Server 14.04

I've installed redis-server using apt-get install redis-server and everything went fine.
Now I'm trying to configure it in Cluster mode. The problem is that the tutorial supplied here http://redis.io/topics/cluster-tutorial uses a script called redis-trib.rb, which I can't find on my system.
Can you please tell me how I can configure Redis to run in Cluster mode without that script?
I would like to have a setup with two masters, each on a different machine.
Thank you very much.
I had the same problem with redis-trib.rb.
This tutorial explains how to create a Redis Cluster using only Redis commands: Configuring and Running Redis Cluster on Linux
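The gist of doing it without redis-trib.rb is: enable cluster mode in each instance's config, introduce the nodes to each other with CLUSTER MEET, and split the 16384 hash slots between your two masters. A rough sketch, assuming the default port 6379 and example IPs 10.0.0.1 / 10.0.0.2 (also note that Redis officially recommends at least three masters):

    # On each machine, add cluster settings to redis.conf and restart
    cat >> /etc/redis/redis.conf <<'EOF'
    cluster-enabled yes
    cluster-config-file nodes-6379.conf
    cluster-node-timeout 5000
    EOF
    sudo service redis-server restart

    # From machine A (10.0.0.1): introduce machine B to the cluster
    redis-cli cluster meet 10.0.0.2 6379

    # Split the 16384 hash slots between the two masters
    redis-cli -h 10.0.0.1 cluster addslots $(seq 0 8191)
    redis-cli -h 10.0.0.2 cluster addslots $(seq 8192 16383)

    # Check that the cluster state becomes ok
    redis-cli -h 10.0.0.1 cluster info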
You need Redis 3.0.0 beta to run Cluster. You won't find it in a Linux distribution, since they all ship a copy of the stable server (fortunately!). Redis 3.0.0 will go out as a stable release next week. You can find the source code of the stable release here: http://redis.io/download.
There is now a tutorial for Ubuntu at https://www.digitalocean.com/community/tutorials/how-to-configure-a-redis-cluster-on-ubuntu-14-04, which includes installing a PPA to supply 3.0.x. That tutorial covers only two nodes and does not reference redis-trib.rb.

Does cloudera distribution of hadoop use the control scripts?

Are the control scripts (start-dfs.sh and start-mapred.sh) used by CDH to start the daemons on a fully distributed cluster?
I downloaded and installed CDH 5, but could not find the control scripts in the installation, and I am wondering how CDH starts the daemons on the slave nodes.
Or, since the daemons are installed as services, do they simply start at system start-up, so there is no need for control scripts in CDH, unlike Apache Hadoop?
Not as such, no. If I recall correctly, Cloudera Manager invokes the daemons using Supervisor (http://supervisord.org/). So you manage the services in CM. CM itself runs an agent process as a service on each node, and you can find its start script in /etc/init.d. There is no need for you to install, start, or stop anything yourself: you install, deploy configuration, control, and monitor services in CM.
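You can see this on any managed node: the only related init script is the CM agent's, and the actual daemons run as children of its supervisord process. A quick check, assuming CDH 5 service names:

    # The Cloudera Manager agent is what is registered with init
    ls /etc/init.d/ | grep cloudera
    sudo service cloudera-scm-agent status

    # The Hadoop daemons themselves run under supervisord, not init scripts
    ps -ef | grep supervisord
    ps -ef | grep -E 'namenode|datanode'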

Accessing Hadoop clusters from eclipse

I just followed the Hadoop (0.20.2) installation tutorial and did the setup. I can run MapReduce programs on the cluster through Eclipse. Now my problem is how to connect to the Hadoop cluster from my local system. The local system is Windows 7 and I have installed the Eclipse plugin for Hadoop. When I tried to connect to Hadoop from my local system (my local system and the Hadoop system are on the same subnet), I got a connection timed out error.
In the Hadoop configuration files I have given the actual IP addresses.
Not sure which step I have missed?
I recently read that the Eclipse plugin won't work at all. But you can simply connect to your cluster with the configuration keys:
mapred.job.tracker
fs.default.name
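A quick way to narrow down the "connection timed out" is to check from the Windows machine that the ports behind those two keys are actually reachable. A sketch with placeholder host names and common default ports; take the real values from your core-site.xml and mapred-site.xml:

    # From the local machine: can you reach the two endpoints?
    telnet namenode-host 9000      # fs.default.name, e.g. hdfs://namenode-host:9000
    telnet jobtracker-host 9001    # mapred.job.tracker, e.g. jobtracker-host:9001

    # If these time out, make sure the daemons bind to the machine's real IP
    # (not 127.0.0.1) and that no firewall blocks the subnet, then enter the
    # same host:port values in the Eclipse plugin's Hadoop location settings.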
EDIT: here is a working version; see the Apache JIRA: Eclipse Plugin does not work with Eclipse Ganymede (3.4)