Multiple docker containers - automation

I am reading about docker and I am trying to understand whether or not this is something I should learn to use.
From what I read, best practice states that you should have one process per container. That means I would need one container for JBoss, one for the database, one for file storage, one for the build server, ...
Would I then have to start each of these containers manually? Or is there some kind of dependency you can set up?
What about ordering and the requirements one process in a container can have, e.g. JBoss needing the database to be started before it starts?
Is this handled?

one process per container
This advice is valid if you want to follow a microservices architecture. Microservices have advantages but also drawbacks. Depending on your situation, you might find it more convenient to have a container running multiple processes.
Running multiple containers on one single host
If you want to start multiple containers together on a single Docker host, the easiest way is to use fig. The fig configuration file is very easy to understand, as its syntax mimics docker commands. There is a nice video presentation of fig by one of its authors, Aanand Prasad.
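As a hedged sketch (fig later evolved into Docker Compose; the image names, port and password below are placeholders, not taken from the question), a minimal fig.yml for a JBoss container linked to a database container could look like this:

# fig.yml -- illustrative only, adjust images/ports/credentials to your setup
db:
  image: postgres:9.3
  environment:
    - POSTGRES_PASSWORD=secret
jboss:
  image: jboss/wildfly
  links:
    - db
  ports:
    - "8080:8080"

Running fig up then starts both containers, and the link makes fig start db before jboss, although it does not wait for the database to finish initializing (see the note below).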
Note that tools such as fig, AFAIK, won't be able to wait for a first container to start and finish initializing before starting another container that depends on it. The way to handle this is to have the second container implement some kind of test and loop until the dependency is ready, then start its process. This can be achieved by different means (a wrapper script, straight in your application code, ...).
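For example, a minimal wrapper script used as the dependent container's entrypoint might look like this; the hostname db, port 5432 and the standalone.sh path are assumptions, and nc must be available in the image:

#!/bin/sh
# wait until the linked database accepts TCP connections, then start the real process
until nc -z db 5432; do
  echo "waiting for db..."
  sleep 1
done
exec /opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0

The exec at the end replaces the shell, so the application ends up as the container's foreground process.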
Running multiple processes in one container
As a Docker container will stop as soon as no process is running in the foreground, there are different techniques you can use (supervisor, running the first processes as daemons and the last one in the foreground, using phusion/baseimage, ...).
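As a rough sketch of the supervisor approach (program names and commands are placeholders), a supervisord.conf along these lines keeps supervisord itself in the foreground while it manages the other processes:

; supervisord.conf -- supervisord stays in the foreground (nodaemon) and manages the rest
[supervisord]
nodaemon=true

[program:jboss]
command=/opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0

[program:cron]
command=cron -f

The Dockerfile would then end with something like CMD ["supervisord", "-c", "/etc/supervisord.conf"].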

Related

How to use one container in a pipeline?

We are moving from Jenkins to GitLab CI. Every time a stage runs in the pipeline, a new container is created. I would like to know whether it is possible to reuse the container from the previous stage, i.e. a single container for the whole pipeline. The GitLab executor is Docker.
I want to preserve the state of one container.
No, this is not possible in a practical way with the docker executor. Each job is executed in its own container. There is no setting to change this behavior.
Keep in mind that jobs (even across stages) can run concurrently and that jobs can land on runners on completely different underlying machines. Therefore, this is not really practical.

.Net Core Hosted Services in a Load Balanced Environment

We are developing a Web API using .Net Core. To perform background tasks we have used Hosted Services.
The system is hosted in an AWS Elastic Beanstalk environment with a load balancer, so based on the load Beanstalk creates/removes instances of the system.
Our problem is this:
Since the background services also run inside the API, when the load balancer increases the number of instances, the number of background services increases as well, and there is a possibility of executing the same task multiple times. Ideally there should be only one instance of the background services.
One way to tackle this is to stop executing background services in the load balanced environment and have a dedicated, non-load-balanced, single-instance environment for background services only.
That is a bit of an ugly solution. So,
1) Is there a better solution for this?
2) Is there a way to identify the primary instance while in a load balanced environment? If so, I can conditionally register the hosted services.
Any help is really appreciated.
Thanks
I am facing the same scenario and thinking of a way to implement a custom service architecture that runs normally on all of the instances but takes advantage of a pub/sub broker and a distributed memory service, so that those small services contact each other and coordinate what needs to be done. It's complicated to develop, yes, but a very robust solution IMO.
You'll "have to" use a distributed "lock" system. For example, a distributed memory cache that sets a lock when a node of your cluster starts working on the background job. If another node tries to do the same job, it is blocked by the first lock as long as the work isn't done yet.
What I mean is: if your nodes don't share some kind of "sync handler", you can't handle this kind of situation. It could be a SQL application lock, a distributed memory cache or something else...
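A hedged sketch of the SQL application lock variant in C# (the connection string, lock name and runJobAsync delegate are placeholders, and sp_getapplock is specific to SQL Server):

// Only the node that wins the application lock runs the job; the others skip it.
using System;
using System.Data;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

public static class JobLock
{
    public static async Task RunIfLockAcquiredAsync(string connectionString, Func<Task> runJobAsync)
    {
        using var conn = new SqlConnection(connectionString);
        await conn.OpenAsync();

        using var cmd = conn.CreateCommand();
        cmd.CommandText = "sp_getapplock";
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@Resource", "background-job");  // lock name (placeholder)
        cmd.Parameters.AddWithValue("@LockMode", "Exclusive");
        cmd.Parameters.AddWithValue("@LockOwner", "Session");
        cmd.Parameters.AddWithValue("@LockTimeout", 0);              // fail fast if another node holds it
        var rc = cmd.Parameters.Add("@ReturnValue", SqlDbType.Int);
        rc.Direction = ParameterDirection.ReturnValue;
        await cmd.ExecuteNonQueryAsync();

        if ((int)rc.Value >= 0)       // 0 or 1 means the lock was granted
        {
            await runJobAsync();      // do the actual work while holding the lock
        }                             // the lock is released when the connection closes
    }
}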
There is something called a Mutex, but even that won't control this in a multi-instance environment. However, there are ways to control it to some level (maybe even 100%). One way would be to keep a tracker in the database. E.g. if the job has to run daily, then before starting your job in the background service you query the database for an entry for today; if there is none, you insert an entry and start your job.
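A hedged sketch of that tracker check; the JobRuns table, its columns and the job name are purely illustrative, and a unique key on (JobName, RunDate) is assumed so that only one instance can win the insert:

// Returns true only for the instance that managed to insert today's tracker row.
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

public static class DailyJobTracker
{
    public static async Task<bool> TryClaimTodayAsync(string connectionString)
    {
        const string sql = @"
            INSERT INTO JobRuns (JobName, RunDate)
            SELECT 'daily-report', CAST(GETDATE() AS date)
            WHERE NOT EXISTS (
                SELECT 1 FROM JobRuns
                WHERE JobName = 'daily-report' AND RunDate = CAST(GETDATE() AS date));";

        using var conn = new SqlConnection(connectionString);
        await conn.OpenAsync();
        using var cmd = new SqlCommand(sql, conn);
        // 1 row inserted => this instance claimed today's run; 0 => another instance already did
        return await cmd.ExecuteNonQueryAsync() == 1;
    }
}

The background service would call TryClaimTodayAsync before doing any work and simply skip the run when it returns false.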

Setting up a multi-broker cluster (kafka quickstart)

I am trying to follow the quickstart on the Apache Kafka homepage.
In Step 6: Setting up a multi-broker cluster, it says "edit the server properties":
config/server-1.properties:
broker.id=1
listeners = PLAINTEXT://:9093
log.dirs=/tmp/kafka-logs-1
My question is:
Do I use the vi editor to edit? If yes, do I just change those 'broker.id', 'listeners' and 'log.dirs' values as above?
Whichever tool you prefer, it needs to be able to change values and save the file. You could use sed if you really wanted to (like some Docker containers might do... On that note, you could even just use Docker Compose to start two Kafka containers side by side)
But, as the docs point out, those three properties should be unique across all brokers on a single machine, and all others can remain unchanged.
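For example, a hedged sketch of the sed route (following the quickstart, which copies the base file first; GNU sed syntax is assumed):

# copy the base config and change only the per-broker properties
cp config/server.properties config/server-1.properties
sed -i 's/^broker.id=0/broker.id=1/' config/server-1.properties
sed -i 's|^#\?listeners=.*|listeners=PLAINTEXT://:9093|' config/server-1.properties
sed -i 's|^log.dirs=.*|log.dirs=/tmp/kafka-logs-1|' config/server-1.properties

Repeat with broker.id=2, port 9094 and /tmp/kafka-logs-2 for the second extra broker.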
I would like to point out that having two broker processes on one machine, sharing a single disk, is not a good idea. You'd get better performance with a larger heap space on that one server.

Running multiple Kettle transformation on single JVM

We want to use pan.sh to execute multiple Kettle transformations. After exploring the script I found that it internally calls the spoon.sh script which runs PDI. The problem is that every time a new transformation starts, it creates a separate JVM for its execution (invoked via a .bat file); however, I want to group them to use a single JVM to overcome the memory constraints that the multiple JVMs are putting on the batch server.
Could somebody guide me on how can I achieve this or share the documentation/resources with me.
Thanks for the good work.
Use Carte. This is exactly what it is for. You can start up a server (on the local box if you like) and then submit your jobs to it. One JVM, one heap, shared resources.
The benefit of that is scalability: when your box becomes too busy, just add another one, also running Carte, and start sending some of the jobs to that other server.
There's an old but still current blog here:
http://diethardsteiner.blogspot.co.uk/2011/01/pentaho-data-integration-remote.html
As well as documentation on the Pentaho website.
Starting the server is as simple as:
carte.sh <hostname> <port>
There is also a status page, which you can use to query your carte servers, so if you have a cluster of servers, you can pick a quiet one to send your job to.
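For example (the port is just an example, and cluster/cluster are Carte's default credentials unless you have changed kettle.pwd):

# start a Carte server on this box
./carte.sh localhost 8081

# query the status page to see running/finished transformations and jobs
curl -u cluster:cluster http://localhost:8081/kettle/status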

AX 2009 Code Propagation with Load Balancing

I'm curious how AX 2009 handles code propagation when operating in a load balanced environment.
We have recently converted our AX server infrastructure from a single AOS instance to 3 AOS instances, one of which is a dedicated load balancer (effectively 2 user-facing servers). All share the same application files and database. Since then, we have had one user who has been having trouble receiving code updates made to the system. The changes generally take a few days before they can see them, and the changes don't seem to update all at once.
For example, a value was added to an ENUM field, and they were not able to see it on a form where it was used (though others connected to the same instance were). Now, this user can see the field in the dropdown as expected, but when connected to one of the instances it will not flow onto a report as it should. When connected to the other instance it works fine, and for any other user connected to either instance it works properly.
I'm not certain if this is related to the infrastructure changes, but it does seem odd that only one user is experiencing it. My understanding was that with this setup, code changes would propagate across the servers either immediately (due to sharing the Application Files), or at least in a reasonable amount of time (<1 day). Is this correct or have I been misinformed?
As your cache problems seem to be per user, go learn about AUC files.
The files are stored on the client computer and can be tricky to keep in sync. There are other problems as well.
Start AX by a script that deletes the AUC file before starting AX.
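A hedged example of such a startup script (the AUC files normally live in the user's local application data folder and are named ax_<GUID>.auc; the client path is a placeholder for your install location):

@echo off
rem delete the per-user AX cache files, then start the AX 2009 client
del /q "%LocalAppData%\ax_*.auc"
start "" "C:\Program Files\Microsoft Dynamics AX\50\Client\Bin\Ax32.exe"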
There is no cache coherency between AOS instances: import an XPO on one AOS server, and it is not visible on the other. You will either have to flush the cache manually or restart the other AOS. The simplest thing is to import on each server; this is especially true for labels, as that is the only way I know of to bring labels in sync.
I am sort of curious about this as well, but what I do know is that if a user has access to the AOT (member of the admin group or a group with developer access), the client will cache AOT elements more aggressively than it would without developer access.
Elements (like an Enum) might be cached at the client level, but also at the AOS level. Restarting the AOS (service) would flush out memory for that service, forcing it to reload elements upon restart.
I guess what I am suggesting is that you make sure the element is not cached client side. Either restart the client, or run "Refresh AOD" from the developer tools menu. If that doesn't help, try restarting the AOS the client connects to, and see if that helps.
I think it is safe to say that if you want to be absolutely sure every user has the most recent "copy" of any element, you should not develop on the application files shared by all of these services, but rather develop in an environment with 1 AOS. And when you need to move things to production, you need to take down all AOSes in production and move the changes over while the system is down.
In such cases it is often difficult to find the exact cause for a specific case.
I try to follow some best practices to avoid such situations:
- Use a separate environment for development
- Deploy code changes using layer files, not XPOs
- When deploying, stop all AOSs, deploy the files, delete the index files in the application directory, start one AOS, compile, sync the DB, then start the other AOSs (or even shut everything down and start again); see the sketch after this list
- Try to have the latest kernel versions for AOSs and clients
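A hedged sketch of that deployment step as a script; the AOS service name, share and application folder are placeholders (AX 2009 AOS services are typically named AOS50$<instance>, and the index file in the application directory is axapd.aoi):

rem stop all AOS instances before deploying
net stop AOS50$01

rem deploy the layer files (*.aod), then delete the index file so it gets rebuilt
copy /y \\deployshare\build\*.aod "C:\Program Files\Microsoft Dynamics AX\50\Application\Appl\Standard\"
del "C:\Program Files\Microsoft Dynamics AX\50\Application\Appl\Standard\axapd.aoi"

rem start one AOS again, then compile and sync the DB from a client before starting the rest
net start AOS50$01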