Where are files stored on CloudHub (Mule)?

Per my understanding, CloudHub is a PaaS offering and we can deploy applications directly to it. I have the following questions:
Can we create intermediate files on CloudHub? If yes, how do we define the path?
When we use SFTP to pull a file from a particular location, what should the path on the CloudHub worker be for processing?
Can we SSH into the CloudHub worker?
If we need to externalize the scheduler's cron timings (via configuration, to avoid code changes), what is the best practice for setting the cron expression?
All of the above questions relate to the CloudHub deployment model.
Thanks in advance

Schedulers are already externalized by the platform when you deploy to CloudHub; you can manage their schedules from Runtime Manager without a code change.
You can technically store files in /tmp, but don't expect them to persist: the worker's file system is ephemeral.
You cannot SSH into CloudHub workers.
Rather than downloading the entire SFTP file, saving it, and then working on it, I would suggest streaming it if possible. You can process JSON/XML/CSV payloads as a stream, and even use deferred DataWeave with them to enable end-to-end streaming.
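In Mule itself this is handled with streaming and deferred DataWeave rather than custom code, but to illustrate the general idea of processing a remote SFTP file without first saving it to disk, here is a minimal Python sketch using paramiko; the host, credentials, and path are placeholders:
import paramiko
# Connect to the SFTP server (host, credentials, and path are placeholders).
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect("sftp.example.com", username="mule", password="secret")
sftp = ssh.open_sftp()
# Read the remote file record by record instead of saving it to local disk first.
with sftp.open("/inbound/orders.csv", "rb") as remote_file:
    for raw_line in remote_file:
        record = raw_line.decode("utf-8").rstrip("\r\n")
        print(record)  # placeholder for the real per-record processing
sftp.close()
ssh.close()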

How to manually deploy a Mule application package on an on-premises cluster?

I'm looking for advice on how to manually (i.e. without using Runtime Manager, RM) deploy a Mule application package on an on-premises Mule cluster. The official documentation suggests using RM for this purpose, either via the GUI, CLI, or API. However, RM is not available in our environment.
I can manually deploy the package on a single node by copying it to the /apps folder, but this way the application is only deployed on that single node, not on the cluster.
I've tried using the AMC agent REST API for this purpose with the same result: it only deploys on a single node.
So, what's the correct way of manually deploying a Mule application on a Mule server cluster without using Anypoint RM?
We are on Mule 4.4 EE.
Copy the application jar file into the apps directory of every node. Mule clusters do not transfer applications between nodes.
Alternatively, you can use the Runtime Manager Agent; however, it also works on a per-node basis, so you need to send the same deployment request to each node.
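For the copy-the-JAR-to-every-node option, a minimal sketch of what such a deployment script could look like (it assumes SSH/SFTP access to each node with key-based authentication and the standard $MULE_HOME/apps hot-deployment directory; host names and paths are placeholders):
import paramiko
NODES = ["mule-node-1.example.com", "mule-node-2.example.com"]  # placeholder host names
APP_JAR = "my-app-1.0.0-mule-application.jar"                   # placeholder artifact
REMOTE_APPS_DIR = "/opt/mule/apps"                              # $MULE_HOME/apps on each node
# Copy the same application JAR to the hot-deployment directory of every node;
# the cluster does not replicate applications between nodes by itself.
for node in NODES:
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(node, username="mule")  # assumes key-based auth is already set up
    sftp = ssh.open_sftp()
    sftp.put(APP_JAR, f"{REMOTE_APPS_DIR}/{APP_JAR}")
    sftp.close()
    ssh.close()
    print(f"deployed {APP_JAR} to {node}")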
Each connector may or may not be cluster aware. Read each connector's documentation to understand how it behaves. In particular, the documentation of the VM connector states:
When running in cluster mode, persistent queues are instead backed by the memory grid. This means that when a Mule flow uses VM Connector to publish content to a queue, Mule runtime engine (Mule) decides whether to process that message in the same origin node or to send it out to the cluster to be picked up and processed by another node.
You can also register the nodes through the AMC agent with the Anypoint control plane, create a server group, and deploy through the control plane's Runtime Manager; it then takes care of deploying the same application to all n nodes.

How to build a development and production environment in Apache NiFi

I have two Apache NiFi servers, development and production, hosted on AWS. Currently the migration between development and production is done manually. I would like to know if it is possible to automate this process and ensure that people do not develop directly in production.
I thought about putting the entire NiFi configuration in GitHub and having it deploy the new configuration to the production server, but I don't know if that would be the correct approach.
One option is to use NiFi Registry: store the flows in the registry and share the registry between the development and production environments. You can then promote the latest version of a flow from dev to prod.
As you say, another option is to use Git to share the flow.xml.gz between environments together with a deploy script; the flow.xml.gz stores the data flow configuration/canvas. You can use parameterized flows (https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Parameters) to point NiFi at different external dev/prod services (e.g. a dev processor uses a dev database URL, while prod points to the prod database URL).
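For the Git option, the deploy script can be as simple as replacing the production flow.xml.gz and restarting NiFi. A minimal Python sketch, assuming a standard NiFi installation layout; all paths are placeholders:
import shutil
import subprocess
# Placeholder paths: a Git checkout holding the promoted flow, and the production NiFi install.
REPO_FLOW = "/opt/deploy/nifi-flows/flow.xml.gz"
NIFI_CONF_FLOW = "/opt/nifi/conf/flow.xml.gz"
NIFI_SH = "/opt/nifi/bin/nifi.sh"
# Back up the current flow, drop in the new one, then restart NiFi to load it.
shutil.copy2(NIFI_CONF_FLOW, NIFI_CONF_FLOW + ".bak")
shutil.copy2(REPO_FLOW, NIFI_CONF_FLOW)
subprocess.run([NIFI_SH, "restart"], check=True)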
One more option is to export all or part of the NiFi flow as a template and upload the template to your production NiFi; however, Registry is probably the better way of handling this. More info on templates here: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#templates.
I believe NiFi was not originally designed around separate environments; the intent was to allow live changes in production. I guess you would build your initial data flow against some test data in production and, once it's ready, start the live data flow. But I think it's reasonable to want separate environments.

Google Cloud Managed Tomcat Service

Does Google Cloud or AWS provide a managed Apache Tomcat service that just takes a WAR file and auto-scales as load increases and decreases? Not Compute Engine; I don't want to create a VM. This should be handled by a managed service.
Google App Engine can directly take and run a WAR file - just use the appcfg deployment method.
You will have more options if you package with Docker, as this provides an image that can be run in many places (multiple GCP, AWS, and Azure options, on-prem Kubernetes, etc.). This can even be as simple as a Dockerfile that just copies the WAR into a Jetty image:
FROM jetty:latest
COPY YOUR_WAR.war /var/lib/jetty/webapps
It might be better to explode the WAR, though; see the discussion in this question.
AWS provides AWS Elastic Beanstalk.
The AWS Elastic Beanstalk Tomcat platform is a set of environment configurations for Java web applications that can run in a Tomcat web container. Each configuration corresponds to a major version of Tomcat, like Java 8 with Tomcat 8.
Platform-specific configuration options are available in the AWS Management Console for modifying the configuration of a running environment. To avoid losing your environment's configuration when you terminate it, you can use saved configurations to save your settings and later apply them to another environment.
To save settings in your source code, you can include configuration files. Settings in configuration files are applied every time you create an environment or deploy your application. You can also use configuration files to install packages, run scripts, and perform other instance customization operations during deployments.
It also provides autoscaling:
The Auto Scaling group in your Elastic Beanstalk environment uses two Amazon CloudWatch alarms to trigger scaling operations. The default triggers scale when the average outbound network traffic from each instance is higher than 6 MB or lower than 2 MB over a period of five minutes. To use Amazon EC2 Auto Scaling effectively, configure triggers that are appropriate for your application, instance type, and service requirements. You can scale based on several statistics including latency, disk I/O, CPU utilization, and request count.
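To make the "just take a WAR file" workflow concrete, deploying to the Beanstalk Tomcat platform boils down to uploading the bundle and pointing an environment at the new version. A minimal sketch with boto3; the bucket, application, environment, and version names are placeholders:
import boto3
s3 = boto3.client("s3")
eb = boto3.client("elasticbeanstalk")
# Placeholder names for the bucket, application, environment, and version label.
BUCKET, KEY = "my-deploy-bucket", "builds/my-app-1.0.0.war"
APP, ENV, VERSION = "my-app", "my-app-prod", "v1.0.0"
# 1. Upload the WAR to S3.
s3.upload_file("target/my-app-1.0.0.war", BUCKET, KEY)
# 2. Register it as a new application version.
eb.create_application_version(
    ApplicationName=APP,
    VersionLabel=VERSION,
    SourceBundle={"S3Bucket": BUCKET, "S3Key": KEY},
)
# 3. Point the Tomcat environment at the new version.
eb.update_environment(EnvironmentName=ENV, VersionLabel=VERSION)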

Performance test a batch-based Java microservice

I have a batch-based microservice that runs at a particular interval through a Chronos job, and I have to performance test it. This microservice doesn't return any response; it downloads ZIP files from Amazon S3, extracts them, and uploads the individual files from each ZIP back to Amazon S3. I use JMeter to performance test web applications. Can I use JMeter for perf testing this batch-based microservice? If yes, what would I have to do?
Yes, you can use JMeter for this. Take a look at:
HTTP Request sampler - to mimic downloads and uploads
Save Responses to a file listener - to store downloaded zip files
OS Process Sampler - to unzip the downloaded files
Check our Performance Testing: Upload and Download Scenarios with Apache JMeter article for detailed information on JMeter configuration for file operations.
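For context, the batch job being tested essentially downloads a ZIP from S3, extracts it, and uploads the individual members back to S3. A rough Python sketch of that workload (bucket and key names are placeholders), which is what the samplers above would need to drive or reproduce:
import io
import zipfile
import boto3
s3 = boto3.client("s3")
SRC_BUCKET, SRC_KEY = "incoming-bucket", "batch/archive.zip"   # placeholder names
DEST_BUCKET, DEST_PREFIX = "processed-bucket", "extracted/"    # placeholder names
# Download the ZIP from S3, extract it in memory, and upload each member back to S3.
zip_bytes = s3.get_object(Bucket=SRC_BUCKET, Key=SRC_KEY)["Body"].read()
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    for name in archive.namelist():
        s3.put_object(Bucket=DEST_BUCKET, Key=DEST_PREFIX + name, Body=archive.read(name))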
You can't load test this service, because it has no particular HTTP endpoint. There is no load on this service since it's not being hit by any user.
You should use production monitoring instead to track any performance issue while the service is running.

Retrieve application config from secure location during task start

I want to make sure I'm not storing sensitive keys and credentials in source control or in Docker images. Specifically, I'd like to store my MySQL RDS application credentials and copy them in when the container/task starts. The documentation provides an example of retrieving the ecs.config file from S3, and I'd like to do something similar.
I'm using the Amazon ECS-optimized AMI with an Auto Scaling group that registers with my ECS cluster. I'm using the Ghost Docker image without any customization. Is there a way to configure what I'm trying to do?
You can define a volume on the host and map it into the container with read-only privileges.
Please refer to the following documentation on configuring volumes for an ECS task.
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_data_volumes.html
Even though the container does not have the config at build time, it will read the configs as if they are available in its own file system.
There are many ways to secure the config on the host OS.
In my past projects, I have achieved the same by disabling SSH into the host and injecting the config at boot-up using cloud-init.
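As an illustration of the fetch-at-startup pattern the question refers to (pulling configuration from S3 when the task starts, relying on the task or instance IAM role instead of baked-in keys), here is a minimal sketch; the bucket, key, and target path are placeholders:
import boto3
# Placeholder bucket/key and target path; the task's IAM role must allow s3:GetObject.
CONFIG_BUCKET = "my-app-config"
CONFIG_KEY = "ghost/production.env"
TARGET_PATH = "/etc/app/production.env"
# Pull the credentials/config from S3 at container start-up so nothing
# sensitive has to live in the image or in source control.
s3 = boto3.client("s3")
s3.download_file(CONFIG_BUCKET, CONFIG_KEY, TARGET_PATH)
In practice a step like this would run from the container entrypoint or from cloud-init/user data, along the lines described above.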