How to push AWS ECS Fargate CloudWatch logs to a UI so users can see their long-running tasks' real-time logs - amazon-cloudwatch

I am creating an app where long-running tasks are executed in ECS Fargate and their logs are pushed to CloudWatch. Now I am looking for a way to give users the ability to see those live logs in the UI while their tasks are running.
I am thinking of the below approach:
1. Save the logs temporarily in DynamoDB.
2. A DynamoDB stream with batching will trigger a Lambda.
3. The Lambda will trigger an AWS AppSync mutation with a None data source (a minimal sketch of this step is shown after the link below).
4. The UI client subscribes to that mutation to get real-time updates (the number of log lines per update depends on the batch size; for example, a batch size of 5 means 5 log lines).
https://aws.amazon.com/premiumsupport/knowledge-center/appsync-notify-subscribers-real-time/
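If you go with this approach, step 3 could look roughly like the sketch below. It is only a minimal illustration assuming an API-key-authorized AppSync endpoint, a publishLogs mutation backed by a None data source, and a DynamoDB table with taskId and message attributes; none of these names come from the question.
# Hypothetical sketch: Lambda triggered by a DynamoDB stream that forwards
# new log lines to an AppSync mutation backed by a None data source.
# The endpoint, API key, and mutation name (publishLogs) are assumptions.
import json
import os
import urllib.request

APPSYNC_URL = os.environ["APPSYNC_URL"]          # e.g. https://xxxx.appsync-api.us-east-1.amazonaws.com/graphql
APPSYNC_API_KEY = os.environ["APPSYNC_API_KEY"]

MUTATION = """
mutation PublishLogs($taskId: ID!, $lines: [String!]!) {
  publishLogs(taskId: $taskId, lines: $lines) {
    taskId
    lines
  }
}
"""

def handler(event, context):
    # Collect the log lines from the INSERT records in this stream batch.
    lines_by_task = {}
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue
        new_image = record["dynamodb"]["NewImage"]
        task_id = new_image["taskId"]["S"]
        lines_by_task.setdefault(task_id, []).append(new_image["message"]["S"])

    # One mutation per task; subscribed UI clients receive the pushed lines.
    for task_id, lines in lines_by_task.items():
        payload = json.dumps({
            "query": MUTATION,
            "variables": {"taskId": task_id, "lines": lines},
        }).encode("utf-8")
        req = urllib.request.Request(
            APPSYNC_URL,
            data=payload,
            headers={"Content-Type": "application/json", "x-api-key": APPSYNC_API_KEY},
        )
        urllib.request.urlopen(req)
    return {"published": sum(len(v) for v in lines_by_task.values())}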
Are there any other techniques or methods I could consider?

Why not use CloudWatch's built-in ability to export logs to an S3 bucket and add SNS so that clients can choose which topic they want to tail? That would remove the extra DynamoDB layer.

Related

Notification Service on AWS S3 bucket (prefix) size

I have a specific use case where a huge amount of data is continuously streamed into an S3 bucket.
We want a notification service on a specific folder (prefix) of the bucket, so that when the folder reaches a certain size (for example, 100 TB) a cleaning service is triggered via SNS and AWS Lambda.
I have checked the AWS documentation and did not find any direct support from AWS for this.
https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
We are planning to have a script that runs periodically, checks the total size of the S3 objects, and kicks off an AWS Lambda.
Is there a more elegant way to handle a case like this? Any suggestion or opinion is really appreciated.
Attach an S3 event trigger to a Lambda function so that it is invoked whenever a file is added to the S3 bucket.
Then check the file size inside the Lambda function (see the handler sketch after the configuration below). This eliminates the need to run a script periodically to check the size.
Below is a sample Serverless Framework configuration for adding an S3 trigger to a Lambda function.
s3_trigger:
  handler: lambda/lambda.s3handler
  timeout: 900
  events:
    - s3:
        bucket: ${self:custom.sagemakerBucket}
        event: s3:ObjectCreated:*
        existing: true
        rules:
          - prefix: csv/
          - suffix: .csv
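A handler matching the lambda/lambda.s3handler entry above could look roughly like this sketch; what you do once the size is known (running total, threshold check) is up to you.
# Hypothetical lambda/lambda.s3handler for the configuration above.
# It reads the size of each newly created object straight from the S3 event,
# so no separate listing script is needed.
def s3handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        size_bytes = record["s3"]["object"]["size"]  # size of the new object
        print(f"New object s3://{bucket}/{key} is {size_bytes} bytes")
        # Here you would add the size to a running total (e.g. in DynamoDB)
        # and trigger the cleaning service once the total crosses your limit.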
There is no direct method for obtaining the size of a folder in Amazon S3 (because folders do not actually exist).
Here are a few ideas...
Periodic Lambda function to calculate total
Create an Amazon CloudWatch Event to trigger an AWS Lambda function at specific intervals. The Lambda function would list all objects with the given Prefix (effectively a folder) and total the sizes. If it exceeds 100TB, the Lambda function could trigger the cleaning process.
However, if there are thousands of files in that folder, this would be somewhat slow. Each API call can only retrieve 1000 objects. Thus, it might take many calls to count the total, and this would be done every checking interval.
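A rough boto3 sketch of this first idea, assuming a placeholder bucket name and the csv/ prefix from the configuration above; the 100 TB threshold is taken from the question:
# Sketch of the periodic approach: sum the size of all objects under a prefix.
# Bucket name, prefix, and the threshold are placeholders.
import boto3

s3 = boto3.client("s3")
THRESHOLD_BYTES = 100 * 1024**4  # 100 TB

def handler(event, context):
    total = 0
    paginator = s3.get_paginator("list_objects_v2")
    # Each page returns at most 1000 objects, hence many calls for big prefixes.
    for page in paginator.paginate(Bucket="my-bucket", Prefix="csv/"):
        for obj in page.get("Contents", []):
            total += obj["Size"]
    if total > THRESHOLD_BYTES:
        # trigger the cleaning process, e.g. publish to SNS or invoke a Lambda
        print(f"Prefix exceeds threshold: {total} bytes")
    return {"total_bytes": total}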
Keep a running total
Configure Amazon S3 Events to trigger an AWS Lambda function whenever a new object is created with that Prefix. The Lambda function can increment a running total in a database. If the total exceeds 100TB, the Lambda function could trigger the cleaning process.
Which database to use? Amazon DynamoDB would be the quickest and it supports an 'increment' function, but you could be sneaky and just use AWS Systems Manager Parameter Store. This might cause a problem if new objects are created quickly because there's no locking. So, if files are coming in every few seconds or faster, definitely use DynamoDB.
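A rough sketch of the running-total idea, assuming a hypothetical DynamoDB table named PrefixSizeTotals keyed by prefix; DynamoDB's ADD update expression provides the atomic increment mentioned above:
# Sketch of the running-total approach with an atomic DynamoDB counter.
# Table name, key, and threshold are assumptions for illustration.
import boto3

dynamodb = boto3.client("dynamodb")
THRESHOLD_BYTES = 100 * 1024**4  # 100 TB

def handler(event, context):
    for record in event.get("Records", []):
        size = record["s3"]["object"]["size"]
        # ADD performs an atomic increment, so concurrent uploads are safe.
        resp = dynamodb.update_item(
            TableName="PrefixSizeTotals",
            Key={"prefix": {"S": "csv/"}},
            UpdateExpression="ADD total_bytes :s",
            ExpressionAttributeValues={":s": {"N": str(size)}},
            ReturnValues="UPDATED_NEW",
        )
        total = int(resp["Attributes"]["total_bytes"]["N"])
        if total > THRESHOLD_BYTES:
            print("Threshold exceeded, trigger the cleaning process")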
Slow motion
You did not indicate how often this 100TB limit is likely to be triggered. If it only happens after a few days, you could use Amazon S3 Inventory, which provides a daily CSV containing a listing of objects in the bucket. This solution, of course, would not be applicable if the 100TB limit is hit in less than a day.
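If the daily S3 Inventory report is acceptable, something like the following could total the sizes from one inventory CSV file; the file location and column order are assumptions, so check the manifest.json that S3 Inventory produces for the real schema and file list.
# Sketch for the S3 Inventory idea: sum object sizes from a daily inventory CSV.
import csv
import gzip
import io

import boto3

s3 = boto3.client("s3")

def total_size_from_inventory(bucket, key, prefix="csv/"):
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    total = 0
    # S3 Inventory CSVs are gzip-compressed; assumed columns: bucket, key, size.
    with gzip.open(io.BytesIO(body), "rt") as fh:
        for row in csv.reader(fh):
            if row[1].startswith(prefix):
                total += int(row[2])
    return total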

SQS and AWS Lambda Integration

I am developing an Audit Trail System that will act as a central location for all the critical events happening around the organization. I am planning to use Amazon SQS as a temporary queue to hold the messages, which in turn will trigger an AWS Lambda function to write the messages into an AWS S3 store. I want to segregate the data at the tenantId level (some identifiable id) and persist the messages as batches in S3, which will reduce the number of calls from Lambda to S3. Moreover, I want to trigger the Lambda every hour. But I have two issues here: first, the max batch size provided by SQS is 10; second, the Lambda trigger polls the SQS service on a regular basis, which is going to increase the number of calls to my S3. I want to create a manual batch of 1000 messages (say) before calling the S3 batch API. I am not very sure how to architect my system so that the above requirements can be met. Any help or ideas are very much appreciated!
Simplified Architecture:
Thanks!
I would recommend that you instead use Amazon Kinesis Data Firehose. It basically does what you're wanting to do:
Accepts incoming messages
Buffers them for a period of time
Writes output to S3 or Elasticsearch
This is all done as a managed service, and it can also integrate with AWS Lambda to provide custom processing (e.g. filtering out certain records).
However, you might have to do something special to segregate the data by tenantId. See: Can I customize partitioning in Kinesis Firehose before delivering to S3?
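For illustration, a producer could write audit events straight to Firehose as in the hedged sketch below; the delivery stream name audit-trail is a placeholder, and keeping tenantId inside each record is one way to let a later step partition the data.
# Hypothetical producer sketch: send audit events to a Kinesis Data Firehose
# delivery stream instead of SQS.
import json
import boto3

firehose = boto3.client("firehose")

def send_audit_event(tenant_id, event_type, payload):
    record = {
        "tenantId": tenant_id,   # kept in the record so a later step can partition by tenant
        "eventType": event_type,
        "payload": payload,
    }
    # Firehose buffers records by size/time and writes them to S3 in batches,
    # which replaces the manual 1000-message batching described in the question.
    firehose.put_record(
        DeliveryStreamName="audit-trail",
        Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
    )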

Is there a CloudWatch Alert and Notification metrics from Performance Insights?

I need to know if there are alert metrics in CloudWatch for RDS Performance Insights.
That is, can I trigger an alarm whenever there is high load or waits in SQL Server?
You may need to read Overview of Monitoring Amazon RDS
Amazon RDS automatically sends metrics to CloudWatch every minute for each active database. You are not charged additionally for Amazon RDS metrics in CloudWatch.
You can watch a single Amazon RDS metric over a specific time period, and perform one or more actions based on the value of the metric relative to a threshold you set.
You can create an alarm in RDS console and select the metric that is of your interest. Here is a snapshot to display that:
Amazon RDS Performance Insights recently released a feature that sends key performance metrics from Performance Insights to Amazon CloudWatch. Using this feature, you can set alerts on these metrics.
When Performance Insights is enabled, it automatically sends the following three metrics to CloudWatch:
DBLoad
DBLoadCPU
DBLoadNonCPU
https://aws.amazon.com/blogs/database/set-alarms-on-performance-insights-metrics-using-amazon-cloudwatch/
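As a hedged example, an alarm on DBLoad could be created with boto3 roughly like this; the instance identifier, threshold, and SNS topic ARN are placeholders.
# Sketch: create a CloudWatch alarm on the DBLoad metric published by
# Performance Insights.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="rds-high-db-load",
    Namespace="AWS/RDS",
    MetricName="DBLoad",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "my-sqlserver-instance"}],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=5,
    Threshold=4.0,                      # e.g. average active sessions above vCPU count
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:db-load-alerts"],
)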

Is there a way to replicate AWS managed Redis to Azure managed Redis in real time?

I have tried the below approach.
Step 1) Took a snapshot of AWS Redis and copied it to Azure.
Step 2) Listened to the keyspace notifications from AWS Redis, put them in a queue, then read these events from the queue and applied them to Azure Redis (a minimal listener sketch follows the problem list below).
Problems faced:
1) Every operation that affects AWS Redis data needs an extra query to AWS Redis to get the affected data, as it is not present in the notification.
2) Some operations like 'rename' are delivered as two notifications; handling these two notifications to reconstruct the actual operation will be a bit difficult, even with a queue.
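For reference, step 2 (listening to keyspace notifications and queuing them) could be sketched roughly as below with redis-py; host names and the queue are placeholders, and on ElastiCache notify-keyspace-events must be enabled through the parameter group rather than CONFIG SET.
# Minimal sketch of step 2: listen to key-event notifications from the source
# Redis and push them onto a queue for replay against Azure Redis.
import queue

import redis

events = queue.Queue()
source = redis.Redis(host="my-aws-redis.example.com", port=6379)

pubsub = source.pubsub()
# Key-event notifications: one message per command, channel names the command.
pubsub.psubscribe("__keyevent@0__:*")

for message in pubsub.listen():
    if message["type"] != "pmessage":
        continue
    command = message["channel"].decode().rsplit(":", 1)[-1]  # e.g. "set", "del"
    key = message["data"].decode()
    # As noted in problem 1, the value is not in the notification, so the
    # replayer must read it back from the source before applying it to Azure.
    events.put((command, key))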

push logs in S3 to dynamoDB continuously

We have our application logs pumped to S3 via Kinesis Firehose. We want this data to also flow to DynamoDB so that we can efficiently query it for presentation in a web UI (Ember app). The need for this is so that users are able to filter and sort the data and so on; basically, to support querying abilities via the web UI.
I looked into AWS Data Pipeline. It is reliable but more suited to one-time or scheduled imports. We want the flow of data from S3 to DynamoDB to be continuous.
What other options are there to achieve this? Moving data from S3 to DynamoDB isn't a very unique requirement, so how have you solved this problem?
Is an S3 event-triggered Lambda an option? If yes, how do I make that Lambda fault tolerant?
For Full Text Querying
For richer querying, you can design your solution as follows, using Amazon Elasticsearch as the destination.
Set up the Kinesis Firehose destination as Amazon Elasticsearch. This will allow you to do full-text querying from your web UI.
You can choose to either back up failed records only or all records. If you choose all records, Kinesis Firehose backs up all incoming source data to your S3 bucket concurrently with data delivery to Amazon Elasticsearch. 
For Basic Querying
If you plan to use DynamoDB to store the metadata of the logs, it is better to configure an S3 trigger to a Lambda which retrieves the file and writes the metadata to DynamoDB.
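A minimal sketch of that Lambda, assuming a hypothetical LogEntries table keyed by objectKey and lineNo and newline-delimited log files:
# Sketch of the S3-trigger-to-DynamoDB idea: on each PutObject event, read the
# delivered log file and write one item per log line to DynamoDB.
import boto3

s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("LogEntries")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        with table.batch_writer() as batch:   # batches writes to reduce API calls
            for i, line in enumerate(filter(None, body.splitlines())):
                batch.put_item(Item={
                    "objectKey": key,   # partition key (assumed schema)
                    "lineNo": i,        # sort key (assumed schema)
                    "message": line,
                })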
Is an S3 event triggered lambda an option?
This is definitely an option. You can create a PutObject event notification on your S3 bucket that calls your Lambda function; S3 invokes the function asynchronously.
if yes, then how to make this lambda fault tolerant?
By default, asynchronous invocations will retry twice upon failure. To ensure fault-tolerance beyond the two retries, you can use Dead Letter Queues and send the failed events to an SQS queue or SNS topic to be handled at a later time.
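For example, a dead-letter queue could be attached to the function roughly like this; the function name and queue ARN are placeholders.
# Sketch: attach a dead-letter queue to the Lambda so events that still fail
# after the two automatic retries are parked in SQS for later processing.
import boto3

lambda_client = boto3.client("lambda")

lambda_client.update_function_configuration(
    FunctionName="s3-to-dynamodb-loader",
    DeadLetterConfig={
        "TargetArn": "arn:aws:sqs:us-east-1:123456789012:failed-log-imports"
    },
)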