Write a YARN application for a Non-JVM application - hadoop-yarn

Assume I want to use yarn cluster to run a Non-JVM distributed application (e.g. .Net based. is this a good idea?). From what I have read so far, I need to develop a YARN application that consists of
a YARN client that is used to submit the job to the yarn framework
a YARN ApplicationMaster, which is the core to orchestra the application in the cluster.
It seems that the two pieces need to be written using Yarn APIs, which are offered as Jar libraries. It means they have to be written using one of the JVM languages. It seems it's possible to write the YARN client with REST APIs, correct? If yes, it means the client can be written with any language (e.g. C# on .Net). However, for application master, this does not seem to be the case, and it has to use JVM. Correct?
I'm new to YARN. Just want to confirm whether my understanding is correct or not.

The YARN Client and AppMaster need to be written in Java as they are the ones that write to the YARN Java API. The RESTful API, https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html, is really about offering up the commands that you can do from the CLI.
Fortunately, your "container" processes can be just created with just about anything. http://hortonworks.com/blog/apache-hadoop-yarn-concepts-and-applications/ says it best with the following quote:
"This allows the ApplicationMaster to work with the NodeManager to launch containers ranging from simple shell scripts to C/Java/Python processes on Unix/Windows to full-fledged virtual machines (e.g. KVMs)."
That said, if you are trying to bring a non-Java application (or even a Java one!) that is already a distributed application of sorts then the Slider framework, http://slider.incubator.apache.org/, is probably the best starting point for you.

Related

Develop an application with all its containers instantiated and used soon as the dev compilation, and do not wait deployment at delivery time for that

In my development environment, I have my IDE, a database, web-server... installed
A script exist: 80 different commands are ran for it.
Then, at delivery time (integration, acceptance), I have a big mess to execute a script that create many Docker containers, each having its goal: database, web-server, etc.
Their scripts are some subsets of the big one I'm using for my own local developer computer. But adapted.
It's very difficult to ensure the transition between my standalone - "flat" if I can say so - dev computer and the containerized version fitted for delivery.
I wonder if a way exists to develop directly an application being containerized at its early beginning :
With all the tree of its containers ready (and not a single one containing everything: it would be cheating...)
As soon as I compile my sources in my IDE : simple compilation would have for result binaries and files going in their due container
and it's in these containers that my application would be executed, even in development mode.
Is it possible? Is it already done by some of you?
Or does it have too much drawbacks to be attempted?

ASP.NET Core front-end developer workflow with VSCode and VS 2019

I haven't done any cshtml front-end development for a few years.
What's the current, generally accepted way for ASP.NET Core front-end developers to work across a range of tools on Windows?
By that, I mean a way to have the front-end JS build and the .NET project(s) also build and to work rapidly in the browser and code.
My thinking is.
We have much better command line story around dotnet today.
Some folk like VS Code.
Some folk prefer VS 2019, and some like either, depending.
We need to work on UI aspects sometimes.
But we also need to attach a debugger and debug the server logic sometimes.
The build server should have no problem, be simple, and rely mostly on build logic held in the repo.
Tooling, and kicking off the whole build and serve process should be understandable and familiar.
It should be pretty simple to get going after a team noob clones the repo.
My initial thought would be to setup NPM then use something like Gulp to kick off everything, including running dotnet run.
Then when running under the Visual Studio 2019 debugger, use the Task Runner Explorer to kick off the Gulp stuff but skip the dotnet run part.
(shame there doesn't seem to be a command line for start VS(Code or 2019) and attach debugger)
Now I'm expecting to get a "primarily opinion based" SO beating, but there are general trends and ideas that go into designing all these tools for how they can all play ball together and what the dev story looks like.
You've pretty much already described the process. However, I'll add a few things:
You don't need the dotnet run bit. Visual Studio and VS Code are both capable of debugging directly.
You can assign the gulp tasks to build tasks in Task Runner Explorer, so you really don't even then to think about running those directly. I'm not as sure on this aspect of VS Code, but I'm sure there's probably some extension to handle it, if it's not already built-in.
If you want true ease of development, the best thing you can do is use Docker. Just add a Dockerfile to each project that actually runs (i.e. not a class library) and set up the steps to build and run it there. In Visual Studio, you can right-click the project and choose Add > Docker Support, and it will actually generate a ready-made Dockerfile, though you may need to add a step or two to handle the client-side build steps. In any case, this then becomes truly click and run, with nothing to worry about. The story is even better when you use docker-compose, as then Visual Studio and VS Code can spin up your entire application stack all at once, including external dependencies such as a database, Redis instance, etc. If you haven't used Docker before, start now. It's absolutely revolutionary for development.
One note for CI/CD, as much as possible, you should add a YAML file to describe your CI/CD pipeline. Depending on the the actual provider you're using for build/release, there might be some differences, so consult the relevant documentation. (Azure DevOps, for example, doesn't currently support describing release pipelines in yaml, though you can still do your build that way.) In any case, this allows you to configure all this in code, and have it committed to source control.
You may consider the same for your infrastructure. Azure has ARM templates, AWS has CloudFormation, GCP has Deployment Manager. There's also third-party tools like Terraform or Ansible. All of these, in some form or fashion (usually JSON or YAML) allow you to define all the characteristics of the infrastructure you're going to deploy to and commit that to source control. This makes deployment and things like creating new environments as breeze.

Ionic2 client + Meteor server, which approach is better?

I want to have Meteor as a server and Ionic2 as a client. I currently have a headache with authentifiacation. It seems that there are two different approaches:
First is use of Meteor server and Meteor client with ionic-angular library. This approach described here
https://angular-meteor.com/tutorials/socially/angular2/ionic2
I guess the advantage of this method is use of Meteor native architecture, on the other hand I guess we're using Ionic2 just like a subframework and maybe loosing some stuff from native Ionic2.
The second is using separate Meteor server ('client' folder deleted completely) and native Ionic2. This approach described here
https://angular-meteor.com/tutorials/whatsapp2/ionic/authentication
This option is vice versa: use of native Ionic2, but it has to use libraries like meteor-client-side, accounts-base-client-side, accounts-password-client-side etc, which I'm not sure are native for Meteor.
The first approach looks better, because there is a ready-to-use UI component for authentification. But I wonder what issues I would have, when I come to the step of completing my applications for different types of devices.
Thank you in advance for your help.
These approaches are essentially the same for the authentication itself.
What you are pointing out is more about what mobile platform to choose to develop and run mobile projects.
In the first case, you use Meteor's built-in Cordova platform to run the app and Meteor's compiler and bundler plugins (like TypeScript package or Meteor core packages for Babel and UglifyJS etc) to develop the app. In the second case, you develop and run the app solely on Ionic 2 CLI.
But from the app logic point of view these approaches are absolutely same: you import the same Ionic 2 components and use the same Meteor packages with the only difference in the second case is that these packages are now NPMs not Atmosphere ones (essentially though they contain the same scripts since these NPMs are built from Atmosphere packages).
The reason why What’sApp clone is built in that way that differs from the Socially’s one is simply described in the README of
the What’sApp repo (see https://github.com/Urigo/Ionic2CLI-Meteor-WhatsApp). If to repeat: since Ionic is a one of the best Web frameworks that specializes solely in building mobile apps, it’s reasonable to guess that it’ll be (and likely it is) much more powerful in building them than Meteor itself. From that point of view the second approach seems more future-proof, I would say. You could think even of building your project in some way that will allow you to substitute Meteor easily with some another framework if you decide to use it at some point in the future.
If you are though concerned about using those NPMs mentioned in the second case (e.g., if the process of building them doesn’t look transparent to you), you could try this project https://github.com/Urigo/meteor-client-bundler to bundle Atmosphere packages you need into separate scripts and use them after.

What use cases of Docker on real projects

I have read what the Docker is but having hard time finding of what are the real scenarios of using Docker?
It would be great to see here your usages.
I'm replicating production environment with it, on commit on project with jenkins after building binaries i deploy there, launch the required daemons and run integration tests, all in a very short time (a few seconds over the time that takes the integration tests). Having no need to boot, and little overhead on memory/cpu/disk is great for that kind of things.
I could extend that use for development (just adding a volume where the code resides to my git repository, at least for scripting languages) to have the production environment with the code im actually editing, at a fraction of what virtualbox would require.
Also needed to test how to integrate some 3rd party code into a production system that modified DB. Cloned the DB in a container, installed the production system in another, launched both and iterated the integration until i did it well, going back to zero to try again in seconds, and faster, cheaper and more scriptable than doing it with VMs+snapshots.
Also run several desktop browser instances on containers, with their own plugins, cookies, data storage and so on separated. The docker repository example for desktop integration is a good start for it, but planning to test subuser to extend this kind of usage.
I've used Docker to implement a virtualized build server which any user could ask to run a build off their personal git branch in our canonical environment.
Each SSH connection made to the server was connected to a new container, ensuring that all builds were isolated from each other (a major pain point in the past), ensuring that the container's state couldn't be corrupted (since changes were all isolated to that single instance), and ensuring that even developers on platforms such as Windows where Docker (and other tools in our canonical build environment) couldn't be run locally would be able to run builds.
We use it for the following uses:
We have a Jenkins Container which we can use to bring up our Jenkins server. We mount the workspace using volumes so we can migrate the server easily just by copying the files and launching the container somewhere else.
We use a Jetty container to easily deploy our war files in our production and development environment.
We use a whole host of other monitoring tools such as Uptime which we have containers for so that we can bring them up and down on various hosts with a single command.
I use docker to build and test our software on several different Linux distributions (RHEL 4/5/6/7, Ubuntu 12.04, 14.04).
Docker makes it easy and fast to create minimalistic and consistent build environments.
Docker gives you the benefits that other virtualization solutions give you to a fraction of the recourse needed.

How to automate development environment setup? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
Every time a new developer joins the team or the computer a developer is using changes, the developer needs to do lots of work to setup the local development environment to make the current project work. As a SCRUM team we are trying to automate everything including deployment and tests so what I am asking is: is there a tool or a practice to make local development environment setup automated?
For example to setup my environment, first I had to install eclipse, then SVN, Apache, Tomcat, MySQL, PHP. After that I populated the DB and I had to do minor changes in the various configuration files etc... Is there a way to reduce this labor to one-click?
There are several options, and sometimes a combination of these is useful:
automated installation
disk imaging
virtualization
source code control
Details on the various options:
Automated Installation Tools for automating installation and configuration of a workstation's various services, tools and config files:
Puppet has a learning curve but is powerful. You define classes of machines (development box, web server, etc.) and it then does what is necessary to install, configure, and keep the box in the proper state. You asked for one-click, but Puppet by default is zero-click, as it checks your machine periodically to make sure it is still configured as desired. It will detect when a file or mode has been changed, and fix the problem. I currently use this to maintain a handful of RedHat Linux boxes, though it's capable of handling thousands. (Does not support Windows as of 2009-05-08).
Cfengine is another one. I've seen this used successfully at a shop with 70 engineers using RedHat Linux. Its limitations were part of the reason for Puppet.
SmartFrog is another tool for configuring hosts. It does support Windows.
Shell scripts. RightScale has examples of how to configure an Amazon EC2 image using shell scripts.
Install packages. On a Unix box it's possible to do this entirely with packages, and on Windows msi may be an option. For example, RubyWorks provides you with a full Ruby on Rails stack, all by installing one package that in turn installs other packages via dependencies.
Disk Images Then of course there are also disk imaging tools for storing an image of a configured host such that it can be restored to another host. As with virtualization, this is especially nice for test boxes, since it's easy to restore things to a clean slate. Keeping things continuously up-to-date is still an issue--is it worth making new images just to propagate a configuration file change?
Virtualization is another option, for example making copies of a Xen, VirtualPC, or VMWare image to create new hosts. This is especially useful with test boxes, as no matter what mess a test creates, you can easily restore to a clean, known state. As with disk imaging tools, keeping hosts up-to-date requires more manual steps and vigilance than if an automated install/config tool is used.
Source Code Control Once you've got the necessary tools installed/configured, then doing builds should be a matter of checking out what's needed from a source code repository and building it.
Currently I use a combination of the above to automate the process as follows:
Start with a barebones OS install on a VMWare guest
Run a shell script to install Puppet and retrieve its configs from source code control
Puppet to install tools/components/configs
Check out files from source code control to build and deploy our web application
I stumbled across this question and was very suprised that no one has mentioned Vagrant yet.
As Pete TerMaat and others have mentioned, virtualization is a great way to manage and automate development environments. Vagrant basically takes the pain away from setting up these virtual boxes.
Within minutes you can have a completely fresh copy of your favourite Linux distro up and running, and provisioned exactly the same way your production server is.
No more fighting with OSX or Windows to get PHP, MySQL, etc. installed. All software lives and runs inside the virtual machine. You can even SSH in with vagrant ssh. If you make a mistake or break something, just vagrant destroy it, and vagrant up to start over fresh.
Vagrant automatically creates a synced folder to your local file system, meaning you don't need to develop within the virtual machine (ie. using Vim). Use whatever your editor of choice is.
I now create a new "Vagrant box" for almost every project I do. All my settings are saved into the project repository, so it's easy to bring on another team member. They simply have to pull the repo, and run vagrant up, and they are literally ready to go.
This also makes it much easier to handle projects that have different software requirements. Maybe you have some projects that rely on PHP 5.3, but some newer ones that run PHP 5.4. Just install the version you want for that project.
Check it out!
One important point is to set up your projects in source control such that you can immediately build, deploy and run after checkout.
That means you should also checkin helper infrastructure, such as Makefiles, ant buildfiles etc., and settings for the tools, such as IDE project files.
That should take care of the setup hassle for individual projects.
For the basic machine setup, you could use a standard image. Another option is to use your platform's tools to automate installation. Under Linux, you could create a meta-package that depends on all the packages you need. Under Windows, a similar thing should be possible using MSI or the like.
Edit:
Ideally, instead of checking in helper infrastructure, you check in the information that allows the build to generate the helper infrastructure. This is the approach taken by e.g. the GNU build system (autotools etc.), or by Maven. This is even more elegant, because you can (theoretically) generate infrastructure for any (supported) build environment, thus you are not bound to e.g. one specific IDE, and settings in the helper infrastructure (paths etc.) don't need to duplicate the main project settings.
However, this also a more complex approach, so if you can't get it to work, I believe checking in stuff like IDE files directly is acceptable.
I like to use Virtual PC or VMware to virtualize the development environment. This provides a standard "dev environment" that could be shared among developers. You don't have to worry about software that the user could add to their system that may conflict with your development environment. It also provides me a way to work to two projects where the development environments can't both be on one system (using two different versions of a core technology).
Use puppet to configure both your development and production environment. Using a top-notch automation system is the only way to scale your ops.
There's always the option of using virtual machines (see e.g. VMWare Player). Create one environment and copy it over for each new employee with minimal configuration needed.
At a prior place we had everything (and I mean EVERYTHING) in SCM (clearcase then SVN). When a new developer can in they installed ClearCase|SVN and sucked down the repository. This also handles the case when you need to update a particular lib/tool as you can just have the dev teams update their environment.
We used two repo's for this so code and tools/config lived in separate places.
I highly recommend Blueprint from DevStructure. It's open-source and your use case is actually the exact reason we originally wrote the software. Our goals have somewhat changed, but it still is the perfect tool for what you are describing. In short, you can create reusable server configs - dead simple configuration management. I hope this helps!
https://github.com/devstructure/blueprint (Blueprint # Github)
I've been thinking about this myself. There are some other technologies that you could throw into the mix. Here's what I'm currently setting up:
PXE based pre-seeded installation images (Debian Squeeze). You can start up a bare-metal machine (or new virtual appliance) and select the image from the PXE boot menu. This has the major advantage of being able to install your environment on physical machines (in addition to virtual appliances).
Someone already mentioned Puppet. I use CFEngine but it's a similar deal. Essentially your configuration is documented and centralized in policy files which are continually enforced by an agent on the client.
if you don't want a rigid environment (i.e. developers may choose a combination of tool-sets) you can roll your own deb packages so new devs can type sudo apt-get install acmecorp-eclipse-env or sudo apt-get install acmecorp-intellij-env, for example.
Slightly off-topic, but if you run a Debian based environment (i.e. Ubuntu), consider installing apt-cacher (package proxy). In addition to saving bandwidth, it will make your installations much faster (since packages are cached on your local network).
If you're using OSX and working with Rails. I'd suggest either:
https://github.com/platform45/let-there-be-light
https://github.com/thoughtbot/laptop
If you use machines in a standard configuration, you can image the disk with a fresh perfectly configured install -- that's a very popular approach in many corporations (and not just for developers, either). If you need separately configured OS's, you can tar-bz2 all the added and changed files once a configured OS is turned into your desired setup, and just untar it as root to make your desired environment from scratch.
if you're using a linux flavor, you've probably got a package management system: thinks .rpm for fedora/redhat, or .deb for ubuntu/debian. many of the things you describe already have packages available: svn, eclipse, etc. you could roll your own packages for company specific software, create a repository (perhaps only available on the local network) and then your setup could be reduced to a single bash script which would add the company repo to /etc/apt/sources.list (debian/ubuntu) and then call a command like,
/home/newhire$ apt-get update && apt-get install some complete package list
you could use buildbot to then automate regular builds for company packages that change often.
Try out DevScript at http://nsnihalsahu.github.io/devscript .
Its one command like ,
devscript lamp or devscript laravel or devscript django . In around a few minutes ,depending on the speed of your internet co