Fetching CloudWatch logs takes a long time to complete with a long gap - amazon-cloudwatch

I have a long-running job that normally takes an entire day to complete. There are some logs at the start and at the end of the process.
When I use the AWS SDK's 'GetLogEvents' to retrieve the log stream with the next forward token, I end up making hundreds of calls that return no event data before reaching the end.
https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_GetLogEvents.html
The log events limit per request is 100 items.
I considered putting the API call in a loop with a condition that stops once event data exists. However, that may run into rate limiting on the CloudWatch API (25 requests/s), so I want to find a better approach.
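For reference, a minimal sketch of that token loop, assuming boto3; the group and stream names are placeholders. GetLogEvents returns the same nextForwardToken once you reach the end of the available events, which is the usual termination check:

```python
import time

import boto3

logs = boto3.client("logs")

def read_stream(group, stream):
    """Page through a log stream using nextForwardToken."""
    token = None
    while True:
        kwargs = dict(logGroupName=group, logStreamName=stream,
                      startFromHead=True)
        if token:
            kwargs["nextToken"] = token
        resp = logs.get_log_events(**kwargs)
        yield from resp["events"]
        # The token stops changing once you have caught up with the stream;
        # for a still-running job you could sleep here and keep polling
        # instead of breaking.
        if resp["nextForwardToken"] == token:
            break
        token = resp["nextForwardToken"]
        time.sleep(0.1)  # pace the calls to stay under the rate limit

for event in read_stream("/my/log-group", "my-stream"):
    print(event["timestamp"], event["message"])
```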

Related

Rate Limit Pattern with Redis - Accuracy

Background
I have an application that sends HTTP requests to external servers. The application communicates with services that enforce strict rate limit policies, for example 5 calls per second. Any call above the allowed rate gets a 429 error code.
The application is deployed in the cloud and runs as multiple instances. The tasks come from a shared queue.
The allowed rate is kept in sync across instances using the Redis rate-limit pattern.
My current implementation
Assuming that the rate limit is 5 per second: I split time into multiple "windows". Each window has a maximum rate of 5. Before each call I check whether the counter is less than 5. If yes, I fire the request. If not, I wait for the next window (after a second).
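A minimal sketch of that fixed-window check, assuming redis-py; the key names and the 2-second TTL are illustrative:

```python
import time

import redis

r = redis.Redis()
LIMIT = 5  # allowed calls per one-second window

def acquire_slot():
    """Block until the current one-second window has a free slot."""
    while True:
        window = int(time.time())            # id of the current window
        count = r.incr(f"rate:{window}")     # INCR creates the key at 1
        if count == 1:
            r.expire(f"rate:{window}", 2)    # let old windows clean up
        if count <= LIMIT:
            return                           # slot acquired: fire the request
        time.sleep(max(0.0, window + 1 - time.time()))  # wait for next window
```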
The problem
To synchronize the instances through Redis, I need two Redis calls: INCR and EXPIRE. Let's say each call can take around 250 ms to return, so the check alone takes ~500 ms. That means in some cases you end up checking an old window, because by the time you get the answer the current second has already changed. If the next second then brings another 5 quick calls, this leads to 429s from the server.
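One way to cut that ~500 ms check in half is to run INCR and EXPIRE as a single atomic Lua script (a sketch assuming redis-py; this reduces round trips but does not by itself fix the window-boundary race):

```python
import redis

r = redis.Redis()

# INCR + EXPIRE as one atomic server-side step: one round trip instead of two.
incr_with_ttl = r.register_script("""
local count = redis.call('INCR', KEYS[1])
if count == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return count
""")

count = incr_with_ttl(keys=["rate:some-window"], args=[2])
print(count)
```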
Question
As you can see, this pattern does not really ensure that my application's rate stays at or below 5 calls/second.
How do you recommend doing it right?

JMeter - Active threads over time

I am new to JMeter; I have been working with it for the last month. The problem I am facing is with the graph that shows the active threads over time. What I want to achieve is a linear graph showing that every 2 seconds a new thread enters the application and does whatever it needs to do. My setup is as follows:
I cannot set the loop count to infinite, as each user executes different tasks that can be executed only once. The data cannot be reused to hit the services/tasks again with the same user.
The process is:
Login
Get Requests
Post requests
If I execute my scenario, I get the following graph:
What do I need to do in order to get something like the graph below?
You're dealing with a Listener, which means it will plot the first data point only when the first sampler reports its metrics.
If your first request takes 10 seconds, you will see the first dot in the Active Threads Over Time chart at 10 seconds, when 3 users are already online.
So if you want to see a "smooth" arrival of virtual users, you need to add a "synthetic" sampler with a response time of a couple of milliseconds before your other samplers (for example, a Dummy Sampler would be a perfect match); this way the listener will take it as the starting point.
Demo:

Understanding why you would want to process Message Queues at a future time

So I'm trying to understand what practical problems Queues solve. By reading all the information from Google, I get the high-level idea:
Push message to Queue for processing at a later time
So I'm looking at an architecture from Company A, and they have different use cases for job queueing, for example:
chat messages
file conversion
searching
heavy SQL queries
Why process it at a later time?
Here's my best guess...
Let's say I have an application that can process 10 "things" at a time.
My application then maxes out its processing capacity.
An 11th request comes in, so the app puts it in the Queue for later processing.
Assuming this is a valid Use Case, wouldn't adding more servers to process more "things" make sense? Is it because it's more costly to add more servers than employ a Queue and sacrifice response time a little bit?
Given my Use Case examples, what other problems would Queues solve for them?
Have you ever lined up at a bank when it is busy? You would have waited in a queue.
"But," you could say, "wouldn't adding more staff to process more customers make sense? Is it because it's more costly to add more staff than employ a Queue and sacrifice response time a little bit?"
That would be correct. It can be quite costly to staff a bank based on the peak number of customers who would arrive each day. It is cheaper to staff below this level and have some customers wait in a queue.
Also, the number of customers each day is not 100% predictable. A queue allows excess demand to wait without breaking the system.
Queues enable decoupling.
For example, imagine an online store where customers purchase an item. They select the item, provide a credit card number and click 'Purchase'. If the credit card is declined, the online store can immediately prompt them to re-enter the number. This interaction has to take place immediately while the customer is still online.
However, there is no need to have the customer wait while an invoice is generated, a record is added to the accounting system and inventory is pulled off the shelf. This can be decoupled from the ordering process. A good way to do it is to push the order into a queue, which can be handled by the next system.
If that 'next system' happens to be offline at the moment, there is no reason to cancel the whole sale. The transaction can be processed when the 'next system' comes back online. This is much better than failing the whole process just because one component (which is not required immediately) has a failure.
Bottom line: Queues are excellent. They enable better handling of failures. They make things more resilient (just wait a few minutes and try again!). They should be used whenever the process is compatible with a queuing architecture.
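To make the decoupling concrete, here is a minimal sketch using Python's standard-library queue as a stand-in for a real message broker; the payment and fulfillment functions are made-up stubs:

```python
import queue
import threading

orders = queue.Queue()  # stand-in for a real broker (SQS, RabbitMQ, ...)

def charge_card(order):    # made-up stub for the card processor
    return True

def fulfill(order):        # made-up stub for invoice/accounting/inventory
    print("fulfilled", order)

def purchase(order):
    """Synchronous part: validate payment while the customer waits."""
    if not charge_card(order):
        return "card declined, please re-enter your number"
    orders.put(order)      # hand the rest off; this returns instantly
    return "order confirmed"

def worker():
    """Asynchronous part: runs whenever the 'next system' is available."""
    while True:
        order = orders.get()
        try:
            fulfill(order)
        except Exception:
            orders.put(order)  # failure? requeue and try again later
        finally:
            orders.task_done()

threading.Thread(target=worker, daemon=True).start()
print(purchase({"item": "book", "card": "4242..."}))
orders.join()              # wait for fulfillment in this demo
```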
Let's do scenarios
Scenario 1, without a queue:
you request an endpoint /blabla/do-everything/
this request does the following:
download an image from a very slow FTP server
e.g. 1.5 sec (can error; retry? add +X sec)
attach the image to an email
send the email (3 sec)
e.g. 1 sec (can error; retry? add +X sec)
confirmation received > store the confirmation with a third-party tracking service
e.g. 1.5 sec (can error; retry? add +X sec)
when tracking confirms, update your data at another third party for big-data purposes
e.g. 2 sec (can error; retry? add +X sec)
... you get the idea
return the response e.g. 11 sec later (this is too slow), or more, or time out when everything failed
The end user says the internet was faster 20 years ago; maybe I need to change my internet connection, or my 16 threads.
Scenario 2, queue everything you can:
you request an endpoint /blabla/do-everything/
this request does the following:
queue a job "DO_EVERYTHING"
e.g. 0.02 sec
return the response in less than 0.250 sec
The end user says the website/app is so fast, I can keep my 56K internet connection.
With a queue/event system, a failed job can be retried later without affecting the end user.
You can pause jobs and add an unlimited number of tasks/steps after the original message.
Better fault tolerance.
Working with queues also allows a better micro/nano-service architecture and better testing, because you can test a single job instead of a full controller that does everything (see the sketch below).
Yes, it is maybe more work and more thinking, but in the end there is no need to think about work during the holidays.
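On that testing point: each queue job becomes a small function you can exercise directly, with no web server or queue in the loop. A sketch with made-up step functions:

```python
def download_image(url):          # made-up stand-ins for the real steps
    return b"image-bytes"

def send_email(to, attachment):
    return True

def handle_do_everything(payload):
    """One queue job = one small, independently retryable unit of work."""
    image = download_image(payload["ftp_url"])
    send_email(payload["to"], attachment=image)

# Testing the job needs no endpoint, no server, no queue -- just a payload:
def test_handle_do_everything():
    handle_do_everything({"ftp_url": "ftp://example.com/img.png",
                          "to": "user@example.com"})
```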

Twitch API: How can I know that a stream has finished?

I have a stream URL like https://www.twitch.tv/streams/26114851120/channel/31809543. The stream is online, and I need to catch the moment when the stream finishes.
I researched the Twitch API documentation and didn't find any events. My first thought was to send requests every few minutes and, when the stream goes offline, handle it. There would be a small delay, but that isn't critical.
But there are many streams that I must track, and I'm afraid Twitch could block me for this.
Are there any other ways to catch a stream's finish?
As best I can tell there's no way to directly listen for a stream going online or offline, but you can still monitor a large number of streams in spite of that.
There are a fair number of Q&As on the official Twitch developer site asking for this functionality, but all of the ones I could find are answered with the same "it's not currently possible."
Keep in mind that you can check the status of multiple channels simultaneously (up to 100 per request) using a comma-separated list and the limit query parameter: Get-Live-Streams
https://api.twitch.tv/kraken/streams?channel=Channel1,Channel2&limit=100
That'll return an object containing an array of online streams (the streams property).
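A sketch of that polling pattern, assuming the requests library and the name-based Kraken query shown above; CLIENT_ID stands in for your own app's ID:

```python
import time

import requests

CLIENT_ID = "your-client-id"          # Kraken expects this header
CHANNELS = ["Channel1", "Channel2"]   # up to 100 per request

def live_channels(channels):
    resp = requests.get(
        "https://api.twitch.tv/kraken/streams",
        params={"channel": ",".join(channels), "limit": 100},
        headers={"Client-ID": CLIENT_ID},
    )
    resp.raise_for_status()
    # the 'streams' property lists only channels that are currently online
    return {s["channel"]["name"] for s in resp.json()["streams"]}

online = set()
while True:
    now_live = live_channels(CHANNELS)
    for channel in online - now_live:
        print(f"{channel} just went offline")  # the moment we care about
    online = now_live
    time.sleep(60)  # results are cached ~1-3 min; polling faster is wasted
```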
Rate Limits
Twitch's official stance regarding rate limiting is a recommendation of no more than "about 1 request per second". That said, they don't throttle you for making several requests in immediate succession, but rather on the cumulative amount.
Note that there's a separate rate limit for IRC-related actions of 20 commands/messages per 30 seconds normally or 100 per 30 if a mod. Violating that will trigger a 30 minute lockout.
API-Side Caching
API results are also cached for 1-3 minutes, which reduces load on their end. Given that, there's not much value in polling anything more frequently than that (i.e. you should wait at least 1 minute before making the exact same request again, since you'd just get the same response).
You Can Still Monitor ~6000 Streams
Given the ability to check 100 streams at a time, the need to wait at least 1 minute per request to get new results, and an approximate rate limit of 1 request per second, you can theoretically check the status of about 6000 streams continuously (assuming you're not making other requests: 100 streams per request × 60 requests per minute).
PubSub For Monitoring Other Things
Currently the PubSub API doesn't have anything for monitoring streams going online, but you may want to keep it in mind for other polling-type actions (it currently deals with things like new subscriptions or donations).
Using The Embedded Player
One last thing worth noting is you can listen for a channel going online or offline when you're using the Twitch Embedded Player.
Might be a little late to reply, but now you can look into Twitch WebHooks.
They allow you to subscribe to specific streams and have Twitch notify your callback URL when a stream goes up or down.
This seems more accurate and bandwidth-saving than querying Twitch yourself.
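A sketch of the subscription call as the WebHooks hub documented it at the time (this API has since been replaced by EventSub, so treat the URL and fields as historical; the callback URL and user_id are illustrative):

```python
import requests

# Historical Twitch WebHooks hub; Twitch POSTs to your callback on changes.
requests.post(
    "https://api.twitch.tv/helix/webhooks/hub",
    headers={"Client-ID": "your-client-id"},
    json={
        "hub.callback": "https://example.com/twitch-callback",
        "hub.mode": "subscribe",
        # a notification with an empty 'data' array means the stream went down
        "hub.topic": "https://api.twitch.tv/helix/streams?user_id=31809543",
        "hub.lease_seconds": 864000,  # re-subscribe before this expires
    },
)
```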

Limiting a queue over time

I'm using an API that has usage limits, let's say: no more than 10 calls per second, and no more than 5000 calls per day.
I am handling these calls in a beanstalkd queue process job. How can I limit the processing of these jobs, keeping the API's limits in mind?
When you use beanstalkd you can pause a tube for a certain number of seconds.
When you reserve a job and you know the API call failed during that job, you can pause the tube for X seconds.
You can find the time needed to pause the tube either from your API response (usually it tells you that you are locked until time X), or start with something adaptive, like pausing for the next 60 seconds and increasing/decreasing on the go.
If you know you can delay, or disperse in advance, before placing the jobs into your queues, you can also add a delay to the job so it won't execute immediately; this way you can have your jobs distributed over time (see the sketch below).
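A sketch of that pre-dispersal idea, assuming the greenstalk Python client and a tube name of my choosing; the delay spaces jobs so at most 10 become ready in any one second:

```python
import json

import greenstalk

client = greenstalk.Client(("127.0.0.1", 11300), use="api-calls")

def enqueue_spread(payloads, per_second=10):
    """Give job i a delay of i // per_second seconds, so no single
    second has more than `per_second` jobs becoming ready."""
    for i, payload in enumerate(payloads):
        client.put(json.dumps(payload), delay=i // per_second)

# e.g. 5000 daily calls spread over ~500 seconds, never above 10/sec
enqueue_spread([{"call": n} for n in range(5000)])
```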
Also there is a great post about distributed rate limiting using redis
If all workers share durable state, they can update a shared status and collectively implement rate limiting.
If the only shared writable state is the queue itself, you could create ticketing tubes for the rate-limited jobs and have a rate-limit manager insert tickets (permission slips) to control when the jobs get run. It would need changes to the workers and a way to time out unused tickets, but it should be workable.
Edit: a "valid until" timestamp in the ticket might do it for per-second limits. Per-day limits might need a feedback tube to let the rate-limit manager know about actual usage (to implement a rolling 24-hour window instead of all 5000 getting reset at midnight).