ACI container group disappears after successful exit - azure-container-instances

I am using ACI to run a single-shot container that reads a storage blob, computes for anything from a few seconds to a few hours depending on the blob contents, then writes results to another storage blob. Containers are spawned as needed from Node.js, and I periodically check for terminated containers to retrieve their exit codes, after which I delete them.
This normally works fine, but sometimes, when the computation completes very quickly and the container exits normally, Azure appears to delete the container group on its own. This means that I cannot retrieve the exit code, which is inconvenient. The container group is really gone: it appears neither in the list returned by the JavaScript ContainerGroups.listByResourceGroup function nor in the output of the Azure CLI "az container list" command.
Is this a known problem, and if so, is there a workaround? I guess I could just have my container sleep for a while before starting its computation, but without understanding the cause of the problem, I don't know how long to sleep for.
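For context, the kind of polling described above looks roughly like the following (a minimal sketch, assuming the @azure/arm-containerinstance and @azure/identity packages; the subscription ID is a placeholder):

import { ContainerInstanceManagementClient } from "@azure/arm-containerinstance";
import { DefaultAzureCredential } from "@azure/identity";

const client = new ContainerInstanceManagementClient(
  new DefaultAzureCredential(), "<subscription-id>"); // placeholder

// Fetch one spawned container group by name, read its exit code, then delete it.
// If Azure has already removed the group, the get() call fails with a 404.
export async function harvest(resourceGroup: string, groupName: string): Promise<number | undefined> {
  const group = await client.containerGroups.get(resourceGroup, groupName);
  const state = group.containers?.[0]?.instanceView?.currentState;
  if (state?.state !== "Terminated") return undefined; // still running; check again later
  const exitCode = state.exitCode;
  await client.containerGroups.beginDeleteAndWait(resourceGroup, groupName);
  return exitCode;
}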

Related

Best practice for storing "permanent" variables consumed by applications

I have an application that needs to store a "last_updated_at" value taken from a dataset that it obtains from an API, so that the next job can take that "last_updated_at" and look only at data after it when retrieving data from other APIs. At the end of its execution, the application refreshes "last_updated_at" and saves it. Then a job will come in tomorrow and start all over again from that stored value.
The question is: what is the best way to save that variable, and what is the best practice on where to save it (and retrieve it next time)?
This application comes from a GitHub repo; I built a container image from it and run the container on AWS, and on every push to the repo a new image is built. We often update that repo -> build the new image -> pull the image on the machines.
So with that context, where is the best place to save that "last_updated_at" value that needs to be read and updated on every execution? There will only be one machine holding and running the container; no other machines will have it. So what is best, considering we constantly update that repo and this is a production environment?
In a CSV or TXT file on the machine running the job?
In some cloud storage such as S3? (A sketch of this option follows the list.)
As an OS environment variable on the machine?
As an environment variable on the container running this?
In a file in the GitHub repo, in a folder of parameters?
In a CSV or TXT file inside the container on the machine running the job?
Any other way?
Lastly, should the answer depend on whether there is only one machine running the container, or more than one machine with the container but only one running at a given time?
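For concreteness, the S3 option would look something like this (a sketch only, assuming the AWS SDK for JavaScript v3; the bucket and key names are made up):

import { S3Client, GetObjectCommand, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({});
const BUCKET = "my-job-state";     // hypothetical bucket
const KEY = "last_updated_at.txt"; // hypothetical key

// Read the stored value at the start of a run; null means "first run, no state yet".
export async function readLastUpdatedAt(): Promise<string | null> {
  try {
    const res = await s3.send(new GetObjectCommand({ Bucket: BUCKET, Key: KEY }));
    return (await res.Body!.transformToString()).trim();
  } catch {
    return null;
  }
}

// Overwrite the stored value at the end of a successful run.
export async function writeLastUpdatedAt(value: string): Promise<void> {
  await s3.send(new PutObjectCommand({ Bucket: BUCKET, Key: KEY, Body: value }));
}

Keeping the value outside the image means a rebuild or redeploy never loses it, and it would keep working if a second machine ever took over.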

Terminated ACI not disappearing

I'm working on a new container image that runs my worker process to drain an Azure queue. Once the queue is empty, my app exits, and I'd like the ACI to be de-allocated and removed as well. What I am seeing is that the ACI sticks around. It is in a "Terminated" state with a restart count of 0, as I would expect (seen in the Azure Portal), but why is it not removed/deleted from the ACI list entirely?
I am using the Azure CLI to create these instances and am specifying the restart policy "never". Here is my command line (minus the image-specific details):
az container create --cpu 4 --memory 14 --restart-policy never --os-type windows --location eastus
I am also, of course, wondering where billing stops. Once I see the terminated state I am hoping that billing has stopped, though this is unclear. I can manually delete the ACI and it is gone immediately, but should exiting the app do the same?
If your container is in the terminated state, you are no longer being billed. The resource itself, however, remains until you delete it, in case you want to query the logs, events, or details of the container after termination. If you wish to delete existing container groups, writing some code on Azure Functions is a good route, so you can define when something should be deleted.
Check out this base example of such a concept:
https://github.com/dgkanatsios/AzureContainerInstancesManagement/tree/master/functions/ACIDelete
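As a rough sketch of that idea (not the linked sample itself), a timer-triggered Azure Function using the Node.js v4 programming model could sweep a resource group and delete any group whose container has terminated; the schedule, subscription, and resource group below are placeholders:

import { app, InvocationContext, Timer } from "@azure/functions";
import { ContainerInstanceManagementClient } from "@azure/arm-containerinstance";
import { DefaultAzureCredential } from "@azure/identity";

const client = new ContainerInstanceManagementClient(
  new DefaultAzureCredential(), "<subscription-id>");
const resourceGroup = "<resource-group>";

app.timer("cleanupTerminatedAci", {
  schedule: "0 */15 * * * *", // every 15 minutes
  handler: async (_timer: Timer, context: InvocationContext): Promise<void> => {
    for await (const group of client.containerGroups.listByResourceGroup(resourceGroup)) {
      // The list call may not include instanceView, so fetch the full group first.
      const full = await client.containerGroups.get(resourceGroup, group.name!);
      if (full.containers?.[0]?.instanceView?.currentState?.state === "Terminated") {
        context.log(`Deleting terminated group ${group.name}`);
        await client.containerGroups.beginDeleteAndWait(resourceGroup, group.name!);
      }
    }
  },
});

Grab anything you still need (logs, exit codes) before the delete, since it is final.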

Running multiple Kettle transformations on a single JVM

We want to use pan.sh to execute multiple Kettle transformations. After exploring the script I found that it internally calls the spoon.sh script, which runs in PDI. Now the problem is that every time a new transformation starts, it creates a separate JVM for its execution (invoked via a .bat file); however, I want to group them so they use a single JVM, to overcome the memory constraints that the multiple JVMs are putting on the batch server.
Could somebody guide me on how I can achieve this, or share documentation/resources with me?
Thanks for the good work.
Use Carte. This is exactly what it is for. You can start up a server (on the local box if you like) and then submit your jobs to it. One JVM, one heap, shared resources.
The benefit of that is scalability: when your box becomes too busy, just add another one, also running Carte, and start sending some of the jobs to that other server.
There's an old but still current blog post here:
http://diethardsteiner.blogspot.co.uk/2011/01/pentaho-data-integration-remote.html
as well as documentation on the Pentaho website.
Starting the server is as simple as:
carte.sh <hostname> <port>
There is also a status page, which you can use to query your Carte servers, so if you have a cluster of servers you can pick a quiet one to send your job to.
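Querying that status page is just an HTTP call, so picking a quiet server can be scripted; a minimal sketch (assuming Carte's default /kettle/status endpoint, its default cluster/cluster credentials, and Node 18+ for the global fetch):

// Ask a Carte server for its status page as XML, which lists running and
// finished transformations/jobs; parse it to decide how busy the server is.
export async function carteStatus(host: string, port: number): Promise<string> {
  const auth = Buffer.from("cluster:cluster").toString("base64"); // default login; change it in production
  const res = await fetch(`http://${host}:${port}/kettle/status/?xml=Y`, {
    headers: { Authorization: `Basic ${auth}` },
  });
  if (!res.ok) throw new Error(`Carte returned ${res.status}`);
  return res.text();
}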

Background jobs on Amazon Web Services

I am new to AWS, so I need some advice on how to correctly create background jobs. I've got some data (about 30 GB) that I need to:
a) download from some other server; it is a set of zip archives with links within an RSS feed
b) decompress into S3
c) process each file, or sometimes a group of decompressed files, perform data transformations, and store the results in SimpleDB/S3
d) repeat forever depending on RSS updates
Can someone suggest a basic architecture for a proper solution on AWS?
Thanks.
Denis
I think you should run an EC2 instance to perform all the tasks you need and shut it down when done. This way you will pay only for the time EC2 runs. Depending on your architecture, however, you might need to keep it running all the time; small instances are very cheap, though.
a) Download from some other server (zip archives with links in an RSS feed): you can use wget.
b) Decompress into S3: try s3-tools (github.com/timkay/aws/raw/master/aws).
c) Process each file or group of decompressed files, transform the data, and store it in SimpleDB/S3: write your own bash script.
d) Repeat forever depending on RSS updates: one more bash script to check for updates, run by cron.
First off, write some code that does a) through c). Test it, etc.
If you want to run the code periodically, it's a good candidate for a background-process workflow: add the job to a queue, and when it's deemed complete, remove it from the queue. Every hour or so, add a new job to the queue meaning "go fetch the RSS updates and decompress them".
You can do it by hand using AWS Simple Queue Service or any other background-job processing service/library. You'd set up a worker instance on EC2, or any other hosting solution, that will poll the queue, execute the task, and poll again, forever.
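To illustrate that worker loop (a sketch only, using the AWS SDK for JavaScript v3 and a hypothetical queue URL):

import { SQSClient, ReceiveMessageCommand, DeleteMessageCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});
const QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/rss-jobs"; // hypothetical

// Long-poll the queue, run the job, and delete the message only if it succeeded,
// so a failed job's message becomes visible again and is retried.
export async function workForever(runJob: (body: string) => Promise<void>): Promise<void> {
  for (;;) {
    const { Messages } = await sqs.send(new ReceiveMessageCommand({
      QueueUrl: QUEUE_URL,
      MaxNumberOfMessages: 1,
      WaitTimeSeconds: 20, // long polling keeps the request count down
    }));
    for (const msg of Messages ?? []) {
      try {
        await runJob(msg.Body ?? "");
        await sqs.send(new DeleteMessageCommand({
          QueueUrl: QUEUE_URL,
          ReceiptHandle: msg.ReceiptHandle!,
        }));
      } catch (err) {
        console.error("Job failed; message will reappear after the visibility timeout", err);
      }
    }
  }
}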
It may be easier to use Amazon Simple Workflow Service, which seems to be intended for what you're trying to do (automated workflows). Note: I've never actually used it.
I think deploying your code on an Elastic Beanstalk instance will do the job for you at scale, because you are processing a huge chunk of data here and a normal EC2 instance might max out its resources, mostly memory. The AWS SQS idea of batching the processing will also help optimize the process and effectively manage timeouts on your server side.

Persisting items being uploaded via web service to disk

I have a launchd daemon that every so often uploads some data via a web service using NSOperationQueue.
I need to be able to persist this data so that it can later be re-uploaded in the event of failure, even between sessions (in case the computer shuts down, for example).
This is not a high-load application; it probably receives items intermittently, no more than one or two every minute, often with gaps of several hours in between.
My initial implementation without this persistence in place is as follows:
Daemon receives data.
Daemon parses data into an object of type MyDataObject.
Daemon creates instance of NSOperation subclass with MyDataObject as the object to upload and adds it to its NSOperationQueue.
NSOperationQueue goes through and uploads MyDataObject via web service as it is able.
This part all functions just fine. The part I now want to add is persistence, in case of web service failure, computer shutdown, etc.
It seems like I could keep an NSMutableArray of MyDataObjects containing all the items that have not yet been uploaded, persist it with NSKeyedArchiver/NSKeyedUnarchiver, and observe the -isFinished key of each operation to remove items from the array, but it seems like there should be a simpler way to do this, with less room for things to go wrong, especially as far as thread safety goes.
Can somebody point me in the right direction?
You could add two operations per item. The first would store the item to local storage, and the second would depend on the first and would remove the item from local storage on success.
Then, when you want to restore any items from local storage, you create only the store-to-the-cloud operations, not the store-locally operations. As before, they remove the items from local storage only if they succeed, and if they don't succeed, they leave the items in local storage for the next attempt.
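The shape of that approach, sketched in TypeScript rather than Objective-C purely to show the ordering (the spool directory and function names are hypothetical): persist first, then upload, and clear the local copy only on success; on restart, rebuild the upload work from whatever is still on disk.

import { promises as fs } from "fs";
import * as path from "path";

const SPOOL_DIR = "/tmp/upload-spool"; // hypothetical spool location

// Persist the item, upload it, and remove the local copy only if the upload succeeded.
export async function enqueue(id: string, payload: string, upload: (p: string) => Promise<void>): Promise<void> {
  await fs.mkdir(SPOOL_DIR, { recursive: true });
  const file = path.join(SPOOL_DIR, `${id}.json`);
  await fs.writeFile(file, payload); // "store locally" step
  await upload(payload);             // "upload" step; throws on failure, leaving the file behind
  await fs.unlink(file);             // cleanup step, reached only on success
}

// On startup, re-create upload work for anything still sitting in the spool.
export async function resume(upload: (p: string) => Promise<void>): Promise<void> {
  await fs.mkdir(SPOOL_DIR, { recursive: true });
  for (const name of await fs.readdir(SPOOL_DIR)) {
    const file = path.join(SPOOL_DIR, name);
    await upload(await fs.readFile(file, "utf8"));
    await fs.unlink(file);
  }
}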