AWS CloudWatch Node.js metrics send performance concern

I am new to using aws-sdk in a Node.js project. I was going through the guide mentioned here about sending metrics to AWS CloudWatch. The example below uses a promise-based approach for sending metrics. If I use this strategy to send a metric whenever a particular endpoint is triggered, how performant will it be?
The Node.js runtime is single-threaded, so adding work to the call stack on every request seems like a performance-degrading practice; in Java you could use daemon threads for sending metrics.
Should this be a concern? If so, what would be a better strategy for sending the metrics?
export const run = async () => {
  try {
    const data = await cwClient.send(new PutMetricDataCommand(params));
    console.log("Success", data.$metadata.requestId);
    return data;
  } catch (err) {
    console.log("Error", err);
  }
};

This can be solved by batching metrics in memory and making a single call periodically, as in the sketch below.
You can also try the aws-embedded-metrics library for Node.js. I have not used it, but it looks like it can flush metrics periodically.
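For illustration, a minimal sketch of the batching approach using the same SDK v3 client as the question. The recordMetric helper, the "MyApp" namespace, and the 60-second interval are assumptions made for this sketch, not anything the SDK prescribes:

import { CloudWatchClient, PutMetricDataCommand } from "@aws-sdk/client-cloudwatch";

const cwClient = new CloudWatchClient({});
const buffer = [];

// Cheap to call from a request handler: it only pushes onto an in-memory array.
export const recordMetric = (name, value) => {
  buffer.push({ MetricName: name, Value: value, Unit: "Count", Timestamp: new Date() });
};

// Flush everything buffered so far in a single PutMetricData call per interval.
// PutMetricData caps the number of datums per request, so a very busy service
// would need to chunk the buffer here.
setInterval(async () => {
  if (buffer.length === 0) return;
  const MetricData = buffer.splice(0, buffer.length);
  try {
    await cwClient.send(new PutMetricDataCommand({ Namespace: "MyApp", MetricData }));
  } catch (err) {
    console.error("Metric flush failed", err);
  }
}, 60_000).unref(); // unref() so the timer doesn't keep the process alive

This keeps the request path down to an array push; the network call happens on the timer, off the hot path.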

Related

Should I close the page in a puppeteer-cluster task closure when using a long-lasting cluster

I have a cluster for which I have defined a task. As per the example in the README.md, I have a closure that accepts a page instance as an argument. I navigate to the page and capture a screenshot. I don't do anything else with the page instance. In the README.md example, there's an await on the idle event and then the cluster is closed. However, I have a cluster that I virtually never want to close. Should I change the behaviour of my closure in that scenario to close the page?
I suspect I have a memory leak somewhere in my service, and one of the causes I am investigating is whether the cluster closes pages after I am done using them. I use the concurrency: Cluster.CONCURRENCY_CONTEXT option.
await cluster.task(async ({ page }) => {
  // ... my screenshot logic
  // do I need to do this?
  await page.close();
});

What's the best way to write synchronous code using async libraries

I've been asked to help write some server-side scripts that update calendars, contacts, e-mail and other company services from other internal services for compliance reasons. The code needs to access the LDAP server, an SQL database and e-mail server, compile and merge information in a peculiar way and then go through information in the calendars, contacts and update those depending on what's there and in the LDAP/SQL databases. This needs to be done a couple of times a day, so performance isn't particularly important.
I wanted to try to use Node.js for this and, after spending a few days with it, I'm having second thoughts as to whether Node.js is the right tool for the job. I'm an old-school programmer; I have C, C++, Unix, SQL, LDAP and TCP/IP in my little finger, but I've learned JavaScript/Node.js in a few days out of curiosity.
I use LDAP, SQL and CalDav/CardDAV modules from npm. These are based on Promises.
Thanks to promises, the code is super ugly, unreadable, and buggy whenever there's any kind of network problem. Things that are very easy in a classic language such as C or Java are suddenly a massive headache, for instance the fact that (say) LDAP information will arrive at a later stage, even though there's no benefit to async operations here since nothing can be done in parallel while waiting. This pattern repeats itself throughout the code: async operations complicating matters enormously for zero benefit. It's incredibly difficult to calculate any sort of aggregate values and apply them elsewhere when using these async functions.
My question is this: is there a simple way to invoke async functions that will simply return the value instead of invoking callbacks?
In JavaScript, no. Waiting for an async call to complete would block the only thread and stop all other work, so such blocking waits are not implemented.
But there is the async/await mechanism: syntactic sugar over asynchronous calls that mimics synchronous code.
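For example, a minimal sketch of how await makes aggregation read like straight-line code; the lookupLdap and querySql stubs are hypothetical stand-ins for the real LDAP and SQL modules:

// Stub async sources standing in for real LDAP/SQL lookups (assumptions for this sketch).
const lookupLdap = () => Promise.resolve(['alice', 'bob']);
const querySql = () => Promise.resolve(['carol']);

async function countUsers() {
  // Each await reads like a synchronous call, so aggregates are easy to compute.
  const ldapUsers = await lookupLdap();
  const sqlUsers = await querySql();
  return ldapUsers.length + sqlUsers.length;
}

countUsers().then(n => console.log('total users:', n));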
First of all, the vast majority of I/O calls in Node.js are asynchronous. So if you're totally uncomfortable with this, it may not be the right fit for what you're doing.
It is a (IMHO) brilliant language, and you'll find it more comfortable to use as you get used to it.
However, you can write code using Promises that looks very much like synchronous code using the async/await syntax. I find this much easier to deal with than chaining lots of .then() and .catch() calls.
You can also use try and catch in this context and it will work as you would expect.
This makes for much cleaner, more readable code.
Most Node.js modules support Promises, for those that don't you can often convert functions to returning a Promise rather than expecting a callback to be passed.
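For instance, a minimal sketch of converting a callback-style API using Node's built-in util.promisify; fs.readFile here is just an illustrative example of such an API:

const util = require('util');
const fs = require('fs');

// fs.readFile normally takes an (err, data) callback; promisify wraps it
// into a function that returns a Promise instead.
const readFile = util.promisify(fs.readFile);

async function printFile(path) {
  try {
    const text = await readFile(path, 'utf8');
    console.log(text);
  } catch (err) {
    console.error('Read failed:', err.message);
  }
}

printFile('./example.txt'); // placeholder path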
NB: Don't forget to await calls to async functions. I find this is one of the most common errors one makes when using this syntax.
For example:
function callApi() {
  return new Promise(resolve => setTimeout(() => resolve({ status: 'ok' }), 500));
}

async function testApi() {
  try {
    console.log("callApi: Calling function...");
    let result = await callApi();
    console.log("callApi result:", result);
  } catch (err) {
    console.error("callApi: Error:", err.message);
  }
}

testApi();
Then a further example to show catching an error (say the API fails for some reason):

function callApi() {
  return new Promise((resolve, reject) => setTimeout(() => reject(new Error("Api error")), 500));
}

async function testApi() {
  try {
    console.log("callApi: Calling function...");
    let result = await callApi();
    console.log("callApi result:", result);
  } catch (err) {
    console.error("callApi: Error:", err.message);
  }
}

testApi();

First Invocation of Google Cloud Function always times out

I have a Google Cloud Function that seems to time out after being inactive for a certain amount of time, or if I re-deploy it. Subsequent calls to the endpoint work just fine; it's just the initial invocation that doesn't work. The following is an over-simplified version of my cloud function. I basically use an Express app as a handler. Perhaps the issue is that the Express app isn't running the first time around, but runs on subsequent invocations?
const express = require('express');
const app = express();
const cors = require('cors');

app.use(cors());

app.get('/health', (req, res) => {
  res.send('OK');
});

module.exports = app;
I currently have the timeout set to 60s, and a route like the health route shouldn't take that long.
Some interesting log entries:
"Function execution took 60004 ms, finished with status: 'timeout'"
textPayload: "Error: Retry total timeout exceeded before any response was received
at repeat (/srv/functions/node_modules/google-gax/build/src/normalCalls/retries.js:80:31)
at Timeout.setTimeout [as _onTimeout] (/srv/functions/node_modules/google-gax/build/src/normalCalls/retries.js:113:25)
at ontimeout (timers.js:436:11)
at tryOnTimeout (timers.js:300:5)
at listOnTimeout (timers.js:263:5)
at Timer.processTimers (timers.js:223:10)"
Cloud Function execution time is limited by the timeout duration, which you can specify at function deployment time. By default, a function times out after 1 minute.
As stated in the official documentation:
When function execution exceeds the timeout, an error status is immediately returned to the caller. CPU resources used by the timed-out function instance are throttled and request processing may be immediately paused. Paused work may or may not proceed on subsequent requests, which can cause unexpected side effects.
Note that this period can be extended up to 9 minutes. In order to set the functions timeout limit you can use this gcloud command:
gcloud functions deploy FUNCTION_NAME --timeout=TIMEOUT FLAGS...
More details about your options can be found here.
If your code genuinely takes a long time to execute, you may also want to consider another serverless option, such as Cloud Run.
A Google Cloud Function can be thought of as the event handler for an incoming event request. A cloud function can be triggered by a REST request, Pub/Sub or Cloud Storage. For a REST request, consider the function that you supply as the one and only "handler" that the function offers.
The code that you supply (assuming Node.js) is a function that is passed an Express request object and a response object. In the body of the function, you are responsible for handling the request.
Specifically, your Cloud Function should not set up express or attempt to otherwise modify the environment. The Cloud Function provides the environment to be called externally and you provide the logic to be called. Everything else (scaling etc) is handled by Google.
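For illustration, a minimal handler in that spirit; the exported name health is just an example, and Cloud Functions calls the exported function directly with Express-style req/res objects:

exports.health = (req, res) => {
  // The platform supplies Express-style request/response objects,
  // so a single route needs no Express app of its own.
  res.status(200).send('OK');
};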

How can I create a "Health Check" for an Azure Storage Queue

So when developing an app utilizing Azure Storage Queues and a Web Job, I feel like I need some sort of health check (via API) to ensure my Azure Storage Queue is properly configured for each environment up to prod. I don't have access (directly) to view the Dashboard or Kudu.
My thought thus far was to just create an API route that returns a bool telling me whether I was able to create the queue if it doesn't exist and peek at a message (even if one doesn't exist), like:
public async Task<bool> StorageQueueHealthCheck()
{
    return await _queueManager.HealthCheck();
}
And the implementation:
public async Task<bool> HealthCheck()
{
    try
    {
        CloudQueue queue = _queueClient.GetQueueReference(QueueNames.reportingQueue);
        await queue.CreateIfNotExistsAsync(); // use the async variant inside an async method
        CloudQueueMessage peek = await queue.PeekMessageAsync(); // may be null if the queue is empty
        return true; // as long as we were able to peek at messages
    }
    catch (Exception)
    {
        return false;
    }
}
Is this a bad approach? Is there another way to "health check" certain Azure functionality when the dashboard is abstracted away? If I absolutely needed to, I could view the Kudu, but I would rather just use an API and hit it via Swagger.
Looks good. You can also try CloudQueue.FetchAttributesAsync(), since the payload would be smaller when the message size is large.
This is a good approach; just make sure you have a retry mechanism so that your health check does not return false for intermittent failures.
A second approach: instead of an API that performs the check only when triggered, run a console app (WebJob) that does this task on a regular interval (say, every minute) and, based on some rule (for example, all creates in the last 10 minutes threw errors), sends an email. This can be used in all environments.

Difference between FireAndForget and Async behavior for publishing

Currently, we are using StackExchange.Redis and, as it does not provide blocking pops, we are doing as suggested in the documentation:
db.ListLeftPush(key, newWork, flags: CommandFlags.FireAndForget);
sub.Publish(channel, "");
What is the difference between this and the following?
db.ListLeftPushAsync(key, newWork);
sub.Publish(channel, "");
We know the purpose of the commands; what we would like to know is whether there is any difference internally, or any risk of the two behaving differently (execution order, etc.).
There's a key difference between fire-and-forget and calling an async operation without awaiting it.
Fire-and-forget means that not only are you not waiting for the result, you don't care whether it works or not, while an async operation may surface an exception once it has completed if something goes wrong.
On the other hand, when you issue a fire-and-forget command, StackExchange.Redis doesn't try to retrieve the command result internally, which is better if you just want the so-called fire-and-forget behavior when issuing commands.
You can see this difference if you open the ConnectionMultiplexer source code and look at how the ExecuteAsyncImpl / ExecuteSyncImpl methods are implemented:
// For example, ExecuteAsyncImpl...
if (message.IsFireAndForget)
{
    TryPushMessageToBridge(message, processor, null, ref server);
    return CompletedTask<T>.Default(null); // F+F explicitly does not get async-state
}
else
{
    var tcs = TaskSource.CreateDenyExecSync<T>(state);
    var source = ResultBox<T>.Get(tcs);
    if (!TryPushMessageToBridge(message, processor, source, ref server))
    {
        ThrowFailed(tcs, ExceptionFactory.NoConnectionAvailable(IncludeDetailInExceptions, message.Command, message, server));
    }
    return tcs.Task;
}
Answer to an OP comment:
Hi. Thanks for your answer. We know the purpose of the commands, what we would like to know is if it has any difference internally or any risk of behaving differently (execution order etc.)
Since the async operation may not have finished when you publish the message on the Redis channel, it can happen that you publish the message but the operation never gets executed. You lose a lot of control.
When you send a fire-and-forget command, it might not be executed either, but you know that the attempt was made before you published the channel's message. Therefore, you shouldn't use unawaited async operations to implement the fire-and-forget pattern with StackExchange.Redis.
You may check this other related Q&A: Stackexchange.redis does fire and forget guarantees delivery?