systemd vs GitLab CI/CD

This may be a crazy question -
I want to host an algo-trading system that starts every morning at 9:00 AM and runs until 3:00 PM. I'm considering hosting it either as a service using systemd or triggering it with GitLab CI/CD (so I can watch its activity there at any moment).
What is the best choice? Is CI/CD reliable for running a job for the whole day?

I know your bounty says you're looking for a canonical answer, but I don't think one really exists for this question: the right choice depends on your use case.
You can absolutely create a CI/CD job and set the timeout to 6 hours, however I don't think that's really what you want to do here. It sounds like you essentially just want a background job that kicks off every day and processes your trades. You may also want a notification if something in the job fails, or you may want the job to restart automatically.
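For reference, the CI/CD variant would look roughly like this in .gitlab-ci.yml (the job name and script are assumptions; the 9 AM trigger itself would be a pipeline schedule configured in the project settings):

    # Hypothetical job run only from a scheduled pipeline, with the timeout
    # raised to cover the 9:00-15:00 trading window.
    run-trading-day:
      timeout: 6 hours
      rules:
        - if: '$CI_PIPELINE_SOURCE == "schedule"'
      script:
        - ./run_trading_day.sh   # assumed entry point for the trading system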
Systemd would be the simplest way to do this, and KISS is always a good principle to follow when designing your solution. Using GitLab would require you to host the GitLab service itself, along with a runner that would execute the jobs each day, whereas Systemd would only require you to register the service.
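The systemd equivalent is a small service plus a timer unit. A minimal sketch, assuming a /opt/algotrader/run_trading_day.sh entry point (names, paths, and the stop mechanism are all assumptions):

    # /etc/systemd/system/algotrader.service  (assumed name and paths)
    [Unit]
    Description=Algo trading run (09:00-15:00)

    [Service]
    Type=simple
    # Assumed entry point for the trading system
    ExecStart=/opt/algotrader/run_trading_day.sh
    # Restart automatically if the script crashes mid-session
    Restart=on-failure
    # Hard upper bound so the run cannot outlive the trading window
    RuntimeMaxSec=6h

    # /etc/systemd/system/algotrader.timer  (fires the service every weekday morning)
    [Timer]
    OnCalendar=Mon..Fri 09:00

    [Install]
    WantedBy=timers.target

You would enable it with systemctl enable --now algotrader.timer, and journalctl -u algotrader.service gives you the kind of activity view the question asks about.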
If you scale up to the point where you're running many such jobs at once, you'll still likely be better off with a workflow manager such as Apache Airflow (or AWS Step Functions, etc.).
So overall, I wouldn't recommend a CI/CD solution to run what is effectively a job server. Start with Systemd while you're small, then migrate to a true workflow solution when you need to scale.

Related

Monitoring Yarn/Cloudera application logs in production

I am NOT talking about Cloudera or YARN system-level logs. I am talking about the logs of applications running on the Cloudera/YARN infrastructure.
We have tens of Java and Python applications running on our Cloudera infrastructure, and all of them generate application logs. I am looking for the best way to monitor these logs for errors and warnings. For a pure standalone Java application, we could traditionally use one of the log-scraper tools that send emails based on expression matching (to detect an error, warning, or any other special situation). I am looking for something similar that can monitor our application logs and email us in real time, for better production application support.
If thinking about this like traditional application log monitoring is not the right way, then I am happy to hear about any better industry-standard approaches. Thanks!
I guess the Elastic Stack (https://www.elastic.co/de/) could be one approach to solving this. You could use Filebeat to send your application logs to Logstash, which forwards them to Elasticsearch. You could then create a Watcher in Kibana which sends e.g. emails based on some triggering condition (we use a webhook to send notifications into an MS Teams channel).
This solution should work at least in near real time (roughly 1-2 minutes of delay, depending on your Watcher configuration).
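A minimal sketch of the Filebeat end of that pipeline, assuming the applications write to files under /var/log/myapp (the path and Logstash host are made up; the Watcher itself is configured separately in Kibana/Elasticsearch):

    # filebeat.yml
    filebeat.inputs:
      - type: log
        paths:
          - /var/log/myapp/*.log          # hypothetical application log directory
        multiline.pattern: '^\d{4}-'      # keep Java stack traces attached to their log event
        multiline.negate: true
        multiline.match: after

    output.logstash:
      hosts: ["logstash.example.com:5044"]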

Flink on YARN: use yarn-session or not?

There are two methods to deploy Flink applications on YARN. The first is to start a yarn-session and deploy all Flink applications into that session. The second is to deploy each Flink application on YARN as its own YARN application.
My question is: what's the difference between these two methods? Which one should I choose in a production environment?
I can't find any material about this.
I think the first method saves resources, since only one JobManager (YARN application master) is needed. But that is also its disadvantage: the single JobManager can become a bottleneck as more and more Flink applications are added.
Both modes have their uses in production environments.
Session mode generally makes sense when you will be running a bunch of short-lived jobs, and want to avoid the overhead of starting up a cluster for each one. On the other hand, there are security implications, as any credentials available to any of the jobs will be accessible to all of the jobs. Cluster-per-job mode may use more resources overall, but is, in some sense, more straightforward.
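For reference, the two submission styles look roughly like this on the command line (memory sizes and jar names are placeholders):

    # Session mode: start one long-running Flink session on YARN, then submit jobs into it
    ./bin/yarn-session.sh -d -jm 1024m -tm 4096m
    ./bin/flink run ./my-first-job.jar
    ./bin/flink run ./my-second-job.jar

    # Per-job mode: every submission brings up (and later tears down) its own YARN application
    ./bin/flink run -m yarn-cluster ./my-flink-job.jar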

How do I setup rolling deployment in Spinnaker?

I just started trying out Spinnaker. I have gone through the tutorial, https://www.spinnaker.io/guides/tutorials/codelabs/gcp-kubernetes-source-to-prod/, and got it working without issues.
Now I want to go a bit more advanced and do a rolling release or a canary deployment (https://www.spinnaker.io/concepts/#deployment-strategies), where it is possible, for instance, to only expose a new release to 5% of the customers.
I cannot find any guide on spinnaker.io (or google) on how to set that up. Can anyone guide me in the right direction?
I have been experimenting and doing PoCs with Spinnaker and canary deployments myself of late, and here is what I have found thus far.
To implement a rolling release, just create a Deploy stage in Spinnaker, and set the Deployment Strategy to RollingUpdate in your Server Group config. You will need to make sure that the Deployment checkbox is checked before you can change the Deployment Strategy.
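If it helps to relate that to plain Kubernetes terms, the RollingUpdate strategy corresponds to a Deployment's usual rollout settings, something like the following (the replica count and surge values are only illustrative, and this excerpt is not Spinnaker-specific):

    # Excerpt of a Kubernetes Deployment spec
    spec:
      replicas: 10
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 1   # at most one pod taken down at a time
          maxSurge: 1         # at most one extra pod created during the rollout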
For a canary deployment, it is a little more involved. I don't think the Canary stage supports Kubernetes Deployments yet, but apparently you can manually deploy a canary (e.g. 1 replica) into the same Kubernetes LoadBalancer where your app is running. This is done using a separate Spinnaker Server Group.
Then you can add a Manual Judgment stage to your Spinnaker pipeline that will pause until you test/validate the canary. Once the canary has been validated, you "Continue" the Manual Judgment, the new Server Group gets deployed, the old Server Group gets disabled, and the canary is destroyed.
If you don't want to use a Manual Judgment and want this fully automated, you can add an ACA (Automated Canary Analysis) stage. This involves setting up a judge that Spinnaker can connect to, which gathers various metrics and produces an ACA score. You can then use that score to decide whether to proceed with the deployment or stop it.

Multiple Mobilefirst-Server artifacts concurrent deploy

I use a batch procedure for deploying MFP v7 artifacts (wlapps and adapters).
The procedure is based on the standard ant tasks defined in worklight-ant-deployer.jar.
The MFP environment runs on a WAS cell and consists of a single AdminService application managing multiple WLRuntimes.
Is it possible to run two (or more) deploy tasks concurrently against different WLRuntime targets?
Furthermore, sticking to a single WLRuntime, is it possible to deploy multiple different artifacts concurrently?
Thanks in advance for any answer/comment.
Ciao, Stefano.
For a single WL runtime, all deployments are internally done sequentially. You can start the deployments concurrently, but internally only one deployment is executed after the other, due to a transaction-locking mechanism. If you start too many deployments in parallel, you may run into timeouts, though this is rare. By default, a deployment transaction waits for 20 minutes before it may time out.
Note: starting deployments in parallel here means using the ant tasks, the wladm tool, or the REST service directly. In the MobileFirst Admin Console UI, the deploy buttons are disabled while another deployment transaction is ongoing, so it is not easily possible to start deployments in parallel from the UI; the UI tries to prevent that.
Note 2: the 20 minutes mentioned above is for the locking mechanism itself. Ant/wladm has its own timeout parameters that may be lower, so with the ant tasks you might hit timeouts sooner than 20 minutes. See here.
For multiple WL runtimes, deployments can run concurrently. The locking mechanism mentioned above is per runtime, so deployments in one WL runtime will not influence any other WL runtime.
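So if you want to parallelize across runtimes from your batch procedure, a sketch using ant's built-in parallel task could look like this (the target names are made up; each one would wrap your existing worklight-ant-deployer calls for one WLRuntime):

    <!-- Hypothetical driver target: deploy to two runtimes concurrently.
         Within each runtime the server still serializes the transactions. -->
    <target name="deploy-all">
      <parallel threadCount="2">
        <antcall target="deploy-to-runtimeA"/>
        <antcall target="deploy-to-runtimeB"/>
      </parallel>
    </target>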

How to halt provisioning of one VM until another VM is done?

Using Vagrant+Chef Solo I'm setting up two VMs: #1 is a TeamCity server, #2 is a TeamCity agent. Provisioning is done by first installing the TeamCity server package on VM #1, then the agent VM is booted and requests data from the server which is used to install the agent. That whole thing works fine.
But now I want to alter the server after the agent is done provisioning. I want to modify the server's database directly, to change an attribute that is only available after the agent has spun up. Is there a way for one VM's provisioning to trigger another VM? Once the agent is done, I'd like to somehow resume provisioning the server so I can make the database edit.
Any thoughts, recommendations, or feedback welcomed. I'm new to Vagrant, Chef, and TeamCity, so there's a chance I'm missing a much easier solution.
* Why do I want to edit the DB directly, you may be wondering? TeamCity agents must be authorized before they can be used, and I want to do this programmatically. The solution I've found is to edit the DB directly, because the authorization functionality is not exposed via the TeamCity REST API (as far as I can tell).
If you can test whether the agent is installed/answering, you can add a ruby_block that loops over this test before continuing the recipe execution, as sketched below.
The loop should have a sleep and a counter to avoid looping forever.
I have no knowledge of TeamCity, so I can't tell if this is the best way.
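A rough sketch of such a guard, assuming the agent answers HTTP on agent-vm:9090 (the address, limits, and check are all assumptions; substitute whatever tells you the agent is really up):

    # Hypothetical wait-for-agent loop in a Chef recipe
    ruby_block 'wait_for_teamcity_agent' do
      block do
        require 'net/http'
        attempts = 0
        loop do
          attempts += 1
          begin
            res = Net::HTTP.get_response(URI('http://agent-vm:9090/'))  # assumed agent address
            break if res.is_a?(Net::HTTPSuccess)
          rescue StandardError
            # agent not reachable yet, keep waiting
          end
          raise 'TeamCity agent never came up' if attempts >= 30
          sleep 10
        end
      end
    end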
In general, Chef is designed to manage your system, not simply provision it (though this is less true in the modern cloud world with "golden image" strategies). Nonetheless, in your case your best bet is to just set up chef-client as a service that runs every 15 minutes. Once the client has finished provisioning, the next run on the server will be able to authorize it.
If you really want to "trigger" the one from the other, you'd either need to do that externally with something like etcd or Consul, or you would need to set up an SSH keypair between the boxes and add a ruby_block on the client that either does the database modification directly or calls chef-client on the server.