I have a batch-based microservice that runs at a particular interval through a Chronos job, and I have to performance test it. This microservice doesn't return any response; it downloads zip files from Amazon S3, extracts them, and uploads the individual files from the zip to Amazon S3. I use JMeter to performance test web applications. Can I use JMeter for perf testing this batch-based microservice? If yes, what would I have to do?
Yes, you can use JMeter for this. Take a look at:
HTTP Request sampler - to mimic downloads and uploads
Save Responses to a file listener - to store downloaded zip files
OS Process Sampler - to unzip the downloaded files
Check our Performance Testing: Upload and Download Scenarios with Apache JMeter article for detailed information on JMeter configuration for file operations.
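If you would rather keep the unzip step inside JMeter instead of shelling out through the OS Process Sampler, a JSR223 Sampler (Groovy engine) can do the extraction as well. This is only a minimal sketch in plain Java syntax that Groovy accepts; the paths downloaded.zip and extracted are placeholders for wherever your Save Responses to a file listener writes the archive.

import java.io.*;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Placeholder paths - point these at the location used by the "Save Responses to a file" listener
File zipFile = new File("downloaded.zip");
File outDir = new File("extracted");
outDir.mkdirs();

// Walk every entry in the archive and write it out to disk
ZipInputStream zis = new ZipInputStream(new FileInputStream(zipFile));
ZipEntry entry;
byte[] buffer = new byte[8192];
while ((entry = zis.getNextEntry()) != null) {
    if (entry.isDirectory()) { continue; }
    File outFile = new File(outDir, entry.getName());
    outFile.getParentFile().mkdirs();
    FileOutputStream fos = new FileOutputStream(outFile);
    int len;
    while ((len = zis.read(buffer)) > 0) {
        fos.write(buffer, 0, len);
    }
    fos.close();
    zis.closeEntry();
}
zis.close();

The extracted files can then be uploaded again with ordinary HTTP Request samplers, or you can stay with the OS Process Sampler approach above - whichever fits your test plan.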
You can't load test this service, because it has no particular HTTP endpoint. There is no load on this service since it's not being hit by any user.
You should use production monitoring instead to track any performance issue while the service is running.
I am trying to deploy (in CDK) scheduled Python ETL scripts as Batch jobs (Fargate?) to parse data from AWS and other tools we utilize. A Splunk Forwarder consumes this data and sends it to our Splunk Index. Am I going to need an S3 bucket for the output of logs from my ETL scripts? How can I deploy the Splunk Forwarder alongside these scripts?
There are about 5-6 scripts that I would like to deploy via CDK.
AWS Batch jobs can send STDERR and STDOUT to CloudWatch Logs. Depending on how logging is configured in your Python scripts, that may be the easy answer. If logging is configured to write to a file, then yes, I would recommend you upload the file to S3 after the ETL is finished.
Output from the scripts (the ETL results) will need to land someplace, and S3 is a great choice for that. Your Splunk Forwarder can be set up to monitor the bucket for new data and ingest it. If the scripts send data directly to the forwarder you should not need an S3 bucket, but I personally would recommend that you decouple the ETL data from the ingestion of the results into Splunk.
Splunk Forwarders (stable servers) would be deployed separately from AWS Batch resources.
Per my understanding, CloudHub is a PaaS offering and we can deploy applications directly to CloudHub. I have the below questions:
Can we create intermediate files on CloudHub? If yes, how can we define the path?
When we use SFTP to pull a file from a particular location, what should the path on the CloudHub server be for processing?
Can we SSH into the CloudHub server?
If we need to externalize the scheduler's cron timings (via config, etc., to avoid code changes), what is the best practice for setting the cron expression?
All the above questions are related to the CloudHub deployment model.
Thanks in advance
The scheduler already gets externalized in the platform when you deploy to CloudHub.
You can technically store the files in /temp, but don't expect them to persist. That would be an "ephemeral" file system.
You cannot SSH into the CloudHub server.
Rather than downloading the entire SFTP file, saving it, and then working on it, I would suggest streaming it if possible. You can process JSON/XML/CSV files as a stream, and even use deferred DataWeave with them, enabling end-to-end streaming.
Does Google Cloud or AWS provide managed Apache Tomcat which just takes a WAR file and does auto-scaling based on load increases and decreases? Not Compute Engine; I don't want to create a VM. This should be handled by a managed service.
Google App Engine can directly take and run a WAR file - just use the appcfg deployment method.
You will have more options if you package with Docker, as this then provides an image that can be run in many places (multiple GCP, AWS, and Azure options, on-prem Kubernetes, etc.). This can even be as simple as building a Dockerfile that just copies the WAR into a Jetty image:
FROM jetty:latest
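# copy the application WAR into Jetty's webapps directory so Jetty auto-deploys it on startup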
COPY YOUR_WAR.war /var/lib/jetty/webapps
It might be better to explode the WAR, though - see the discussion in this question.
AWS provides **AWS Elastic Beanstalk**.
The AWS Elastic Beanstalk Tomcat platform is a set of environment configurations for Java web applications that can run in a Tomcat web container. Each configuration corresponds to a major version of Tomcat, like Java 8 with Tomcat 8.
Platform-specific configuration options are available in the AWS Management Console for modifying the configuration of a running environment. To avoid losing your environment's configuration when you terminate it, you can use saved configurations to save your settings and later apply them to another environment.
To save settings in your source code, you can include configuration files. Settings in configuration files are applied every time you create an environment or deploy your application. You can also use configuration files to install packages, run scripts, and perform other instance customization operations during deployments.
It also provides autoscaling:
The Auto Scaling group in your Elastic Beanstalk environment uses two Amazon CloudWatch alarms to trigger scaling operations. The default triggers scale when the average outbound network traffic from each instance is higher than 6 MB or lower than 2 MB over a period of five minutes. To use Amazon EC2 Auto Scaling effectively, configure triggers that are appropriate for your application, instance type, and service requirements. You can scale based on several statistics including latency, disk I/O, CPU utilization, and request count.
My understanding of AWS X-Ray is that X-Ray is similar to Dynatrace, and I am trying to use X-Ray to monitor Apache performance. I do not see any documentation related to X-Ray with Apache except the below.
https://mvnrepository.com/artifact/com.amazonaws/aws-xray-recorder-sdk-apache-http
Can anyone please suggest whether it is possible to use AWS X-Ray with Apache, and if yes, can you also point to some documentation related to it? Thanks.
I assume that by "apache" you mean the Apache Tomcat servlet container, since you are referring to a Maven artifact (Maven being a Java build tool).
Disclaimer: I don't know what "dynatrace" is, and I don't know which logging you specifically want.
But as far as the Apache Tomcat servlet container and X-Ray goes - here is the link to get started:
http://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-java.html
Start by adding AWSXRayServletFilter as a servlet filter to trace incoming requests. A servlet filter creates a segment. While the segment is open, you can use the SDK client's methods to add information to the segment and create subsegments to trace downstream calls. The SDK also automatically records exceptions that your application throws while the segment is open.
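As a rough sketch of what that can look like (the X-Ray documentation more commonly shows the equivalent web.xml filter entry, and the segment name MyTomcatApp below is just a placeholder), the filter can also be registered programmatically:

import javax.servlet.FilterRegistration;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;

import com.amazonaws.xray.javax.servlet.AWSXRayServletFilter;

// Registers the X-Ray servlet filter so every incoming request gets its own segment
@WebListener
public class XRayStartupListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent event) {
        FilterRegistration.Dynamic registration = event.getServletContext()
                .addFilter("AWSXRayServletFilter", new AWSXRayServletFilter("MyTomcatApp")); // placeholder segment name
        // null dispatcher types means the default (REQUEST); "/*" traces every request
        registration.addMappingForUrlPatterns(null, false, "/*");
    }

    @Override
    public void contextDestroyed(ServletContextEvent event) {
        // nothing to clean up
    }
}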
As for the mentioned maven artifact:
aws-xray-recorder-sdk-apache-http – Instruments outbound HTTP calls made with Apache HTTP clients
So you'll need this if, let's say, a client makes a request to your Tomcat server and your Tomcat server in turn makes a request to another server, thus acting as a client itself in that case.
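For that outbound case the artifact provides a drop-in builder for the Apache HTTP client, so the downstream call is recorded as a subsegment of the request currently being traced. A minimal sketch; the URL is just an example:

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.util.EntityUtils;

// X-Ray's HttpClientBuilder is a drop-in replacement for Apache's HttpClientBuilder
import com.amazonaws.xray.proxies.apache.http.HttpClientBuilder;

public class DownstreamCall {

    public static String fetch() throws Exception {
        // The traced client records the call as a subsegment of the currently open segment
        CloseableHttpClient client = HttpClientBuilder.create().build();
        HttpGet request = new HttpGet("https://downstream.example.com/api/resource"); // example URL
        try (CloseableHttpResponse response = client.execute(request)) {
            return EntityUtils.toString(response.getEntity());
        } finally {
            client.close();
        }
    }
}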
I have this problem in which I need to compress (zip) multiple files from the web app (on the server) before streaming the zip down to the browser. I'm pulling the files from a separate service that connects to a SQL database. So there's a huge delay in opening the files from the service as well as a delay in compressing the files before the zipped package can be streamed to the browser. Ideally, I would like to have the DOWNLOAD button on the page make a call to a SignalR method on the server, which will then push a notification back to the client once the files are done compressing. That way, the browser won't request the server to stream the zipped file right away. It will only begin streaming once the files are done compressing.
background info: I'm using IIS 7.5 and MVC 4.
I've been reading up and watching videos on SignalR, but have only seen examples of chat hubs and pushing to multiple clients, etc. Would it be possible to only use SignalR for the client that is making the request? And if so, I would appreciate some example code or perhaps a link to a tutorial on how one could accomplish something like this. Thanks!
To achieve what you need, you will have to define 3 clients:
The Browser: it calls The Hub when a download is requested, then waits for a call from The Hub telling it to download the files.
The Server: it receives a notification from The Hub when the browser requests a download, and when everything is ready it calls The Hub to pass the files.
The Service: it receives the files from The Hub when they are passed from The Server, makes the files ready for download, then sends a notification to The Hub to inform The Browser.
Note
Storing large files in memory is not recommended, and neither is passing them through SignalR, unless that is the only way the server and the service can share the files. So if you have common storage - disk or database - then it's better to use that.