Should I close the page in a puppeteer-cluster task closure when using a long-lasting cluster? - puppeteer-cluster

I have a cluster for which I have defined a task. As per example in the README.md, I have a closure which accepts a page instance as an argument. I navigate to the page and capture a screenshot. I don't do anything else with the page instance. In the README.md example, there's an await for idle event and then the cluster is closed. However I have a cluster which I virtually never want to close. Should I change the behaviour of my closure in that scenario to close the page?
I suspect I have got a memory leak somewhere in my service and one of the causes I am investigating is whether the cluster closes pages after I am done using them. I use concurrency: Cluster.CONCURRENCY_CONTEXT option.
await puppeteer.task(async ({ page }) => {
// ... my screenshot logic
// do I need to do this?
await page.close();
});

Expressjs send content after listen

I want to make a route in express js to send some content after 1000 ms.
Note: I can't use res.sendFile; it has to be a plain route.
This is the code for the route:
app.get('/r', (req,res)=>{
res.send("a")
setTimeout(()=>{
res.send("fter")
}, 1000)
})
app.listen(8080)
But I get the error: ERR_HTTP_HEADERS_SENT, I assume because the page has already been loaded.
I need my Node program to send it after the page has already loaded, so I can't just send an HTML, JS, or CSS script to do it. Is it possible? I can't seem to find how.
Well, if that is not possible, what I am really trying to do is after the page has loaded, execute js or send a message that the page can receive from the node program, like if there was res.execute_js('postMessage(1)')
EDIT based on your edit: As I understand it, you want a way to send different values from a Node app endpoint without using socketio. I've managed to replicate a similar experimental behavior using readable streams. To start, instead of answering the request with res.send(), you should use res.write(). In my case I did something like this:
app.post('/api', (req, res) => {
res.write("First");
setTimeout(() => {
res.write("Second");
res.end();
}, 1000);
});
This will write to a stream "First" then after 1000ms it'll write another "Second" chunk then end the stream, thus completing the POST request.
Now in the client, make the fetch response callback async and get the ReadableStream reader from the response, like so:
const reader = response.body.getReader();
Now we can read this stream. First, initialize an array to collect everything we read:
const output = [];
Now, to actually read the stream:
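let finished, current;
while (!finished) {
// reader.read() resolves to { value, done }, so rename both while destructuring
({ value: current, done: finished } = await reader.read());
if (finished) break;
output.push(current); // each chunk is a Uint8Array; use TextDecoder to get text
}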
If you inspect current in the loop, it will contain each value passed to res.write(), so the loop should read twice: "First" and, after 1000 ms, "Second".
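As a minimal sketch of the read loop described above: reader.read() resolves to an object with value and done fields, and each value is a Uint8Array that has to be decoded to get text. The names readChunks and fakeBody are hypothetical; fakeBody only stands in for response.body so the loop can be run without a server.

```javascript
// Read every chunk from a ReadableStream and return the chunks as decoded strings.
async function readChunks(stream) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  const output = [];
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries
    output.push(decoder.decode(value, { stream: true }));
  }
  return output;
}

// Hypothetical stand-in for response.body: emits each string as its own chunk.
function fakeBody(parts) {
  const encoder = new TextEncoder();
  return new ReadableStream({
    start(controller) {
      for (const part of parts) controller.enqueue(encoder.encode(part));
      controller.close();
    },
  });
}
```

With a real request you would pass response.body instead of fakeBody(...). Over a network, chunk boundaries are not guaranteed to line up with individual res.write() calls, so treat the one-chunk-per-write behavior as an assumption.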
EDIT: This is very experimental however, and I wouldn't recommend this in a production codebase. I'd suggest trying out socketio or a publish/subscribe mechanism instead.
Old answer: You're already sending "a" back, you should remove the first res.send() invocation at the top of the callback.
So, this is for all the people wondering. No, you cannot do this with pure express (there is a workaround, so keep reading).
The reason you can't do this is that when the user requests the URL, the server sends them a response, and the browser renders it. You can't then tell it to change the response, as the browser has already received one. Even if you send multiple chunks, as with res.write rather than res.send, the browser will just wait until it receives all the data.
Here are two workarounds:
    1. Use socket.io, cscnode, or another library to have events for updating text,
    2. Send hardcoded html, that updates text (1 was probably better)
That is all I think you can do.
To clarify the socketio option: you define an event for changing text that you can fire from Node; the browser listens for it and changes the text.

How would you redirect calls to the top object in Cypress?

In my application code, there are a lot of calls (like 100+) to the "top object" referring to window.top such as top.$("title") and so forth. Now, I've run into a problem using Cypress to perform end-to-end testing. When trying to log into the application, there are some calls to top.$(...) but DevTools shows an Uncaught TypeError: top.$ is not a function. This led my team and me to discover that the "top" our application is trying to reach is the Cypress environment itself.
The things I've tried before coming here are:
1) Trying to stub the window.top with the window object referencing our app. This resulted in us being told window.top is a read-only object.
2) Researching if Cypress has some kind of configuration that would smartly redirect calls to top in our code to be the top-most environment within our app. We figured we probably weren't the only ones coming across this issue.
If there were articles, I couldn't find any, so I came to ask if there was a way to do that, or if anyone would know of an alternate solution?
Another solution we considered: Looking into naming window objects so we can reference them by name instead of "window" or "top". If there isn't a way to do what I'm trying to do through Cypress, I think we're willing to do this as a last resort, but hopefully, we don't have to change that, since we're not sure how much of the app it will break upfront.
@Mikkel Not really sure what code I can provide to be useful, but here's the code that causes Cypress to throw the uncaught exception:
if (sample_condition) {
top.$('title').text(...).find('content') // Our iframe
} else {
top.$('title').text(page_title)
}
And there are more instances in our code where we access the top object, but they are generally similar. We found out the root cause of the issue is that within Cypress calls to "top" actually interface with Cypress instead of their intended environment which is our app.
This may not be a direct answer to your question, it's just expanding on your request for more information about the technique that I used to pass info from one script to another. I tried to do it within the same script without success - basically because the async nature of .then() stopped it from working.
This snippet is where I read a couple of IDs from sessionStorage and save them to a JSON file.
//
// At this point the cart is set up, and in sessionStorage
// So we save the details to a fixtures file, which is read
// by another test script (e2e-purchase.js)
//
cy.window().then(window => {
const contents = {
memberId: window.sessionStorage.getItem('memberId'),
cartId: window.sessionStorage.getItem('mycart')
}
cy.writeFile(`tests/cypress/fixtures/cart.json`, contents)
})
Another script loads the file as a fixture (fixtures/cart.json) to pull in those IDs:
cy.fixture(`cart`).then(cart => {
cy.visit(`/${cart.memberId}/${cart.cartId}`)
})

Results of S3 function call are being cached by my Lambda function

I have a lambda function that uses S3.listObjects to return a directory listing. The listing is sometimes (not always!) out of date - it doesn't contain recently uploaded objects and has old modification dates for the objects that it does have.
When I run the identical code locally it always works fine.
Clearly there is some sort of caching, but I don't understand where.
Here's the relevant code:
function listFiles() {
return new Promise(function (resolve, reject) {
const params = {
Bucket: "XXXXX",
Prefix: "YYYYY"
};
s3.listObjects(params, function (err, data) {
if (err) reject(err);
else resolve(data.Contents);
});
})
}
That is due to the Amazon S3 data consistency model. S3 provides read-after-write consistency for PUTs; however, other requests, including listObjects, are eventually consistent, which means there can be a delay in propagation.
In practice, the read-after-write consistency settles in a matter of seconds. It's not a guarantee, however. It's unlikely, but not impossible, that Amazon returns stale data minutes later, especially across zones. It's more likely, however, that your client is caching a previous response for that same URL.
You might have run into a side effect of your lambda container being reused. This is explained at a high-level here. One consequence of container reuse is that background processes, temporary files, and global variable modifications are still around when your lambda is re-invoked. Another article talking about how to guard for it.
If you are sending your logs to cloudwatch logs, you can confirm that a container is being reused if the logs for a lambda seem to be appended to the end of a previous log stream, instead of creating a new log stream.
When your lambda container gets reused, the global variables outside your handler function will be reused. For instance, if you change the loglevel of your logging calls to DEBUG at the end of your handler, if your container gets reused, it will start at the top of the handler in the same loglevel.
If you're using the default s3 client session (it seems like you are), then this connection stays in a global (singleton). If your s3 client connection is reused, it might pull the cached results of calls prior, and I would expect that connection to be reused in a later invocation.
One way to avoid this is to specify the If-None-Match request header. If the ETag of the object you're accessing doesn't match on the remote end, you'll get fresh data. You may set it to the last ETag you got (which you'd store in a global), or alternatively you may try setting a completely random value, which should act as a cache buster. It doesn't look like list_objects() accepts an If-None-Match header, however. You may try to create a new client session just for the current invocation.
This article on recursive lambdas discusses the issue.
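A minimal sketch of the container-reuse point, runnable without AWS: anything declared at module scope survives between warm invocations, whereas a client built inside the handler is new each time. The names makeHandler, createClient, and the fake listFiles method are hypothetical stand-ins (in a real function the factory might be something like () => new AWS.S3()). Whether a reused S3 client actually returns stale listings is the hypothesis of the answer above, not something this sketch demonstrates.

```javascript
// Module scope: evaluated once per container, then reused by every warm invocation.
let invocationCount = 0;

// Build a handler around a client factory (hypothetical; stands in for a real SDK client).
function makeHandler(createClient) {
  return async function handler(event) {
    invocationCount += 1;          // persists between invocations in a reused container
    const client = createClient(); // fresh client for this invocation only
    const files = await client.listFiles(event.prefix);
    return { invocationCount, files };
  };
}

// Fake client so the sketch runs locally; counts how many clients were built.
let clientsCreated = 0;
const handler = makeHandler(() => {
  clientsCreated += 1;
  return { listFiles: async (prefix) => [`${prefix}/a.txt`] };
});
```

Calling handler twice in the same process yields invocationCount values of 1 and then 2, while clientsCreated also reaches 2, showing that the counter is shared but the client is not.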

Convenient logging with protractor

I'm trying to make logging easier for devs writing selenium tests with protractor.
I'm looking at selenium-webdriver/lib/logging and am trying to figure out how to make a convenient logging system.
Here is an example spec:
it('should NOT show welcome before login', () => {
// convenient log here
expect(homepage.logo.isPresent()).toBe(true);
// log message that would occur after expect
expect(homepage.welcomeText.isPresent()).toBe(false);
// final log message
});
I'm not quite sure how to go about this.
I'm trying to avoid having to do (below) for every log message.
homepage.welcomeText.isPresent().then(() => console.log('foo bar'));
There is an npm package, log4js-protractor-appender, which will solve your problem. It is built specially for Protractor-based environments: it places all logger commands in the Protractor Control Flow and resolves Protractor promises before logging.
Protractor executes all of its commands in a Control Flow, and non-Protractor commands don't get executed in the order we would like. So regular logging needs extra effort from us to chain a non-Protractor command to a Protractor command.
Example:
browser.getCurrentUrl().then(function _logValue(url){
logger.info("The url is" + url);
});
But log4js-protractor-appender enables you to write something like this directly: browser.logger.info('Displayed text is:', browser.getCurrentUrl());
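If you prefer not to add a dependency, the manual chaining shown earlier can be wrapped once in a small helper. This is only a sketch: logAfter is a hypothetical name, not part of Protractor or log4js, and the resolved promise below is a stand-in for browser.getCurrentUrl().

```javascript
// Hypothetical helper: log a formatted message once `promise` resolves,
// then pass the resolved value through so the chain can continue.
function logAfter(promise, format, sink = console.log) {
  return promise.then((value) => {
    sink(format(value));
    return value;
  });
}

// Usage with a stand-in for browser.getCurrentUrl(); lines collects the output.
const lines = [];
const done = logAfter(
  Promise.resolve("http://localhost/home"),
  (url) => "The url is " + url,
  (line) => lines.push(line)
);
```

The sink parameter defaults to console.log; it is injectable here only so the output can be captured.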
For more details on how to implement this- Please check my blog post - How to implements logs for Protractor/JavaScript based Test Automation Frameworks
For expects, you can use toBeTruthy or toBeFalsy and include a message there; it will be logged if something goes wrong. The Page Object pattern says you must not have webdriver methods in spec files, meaning you may create a method that verifies whether something is present and logs in then(), as in your example. You can also implement an asyncLog function. A plain console.log() call goes on the call stack and executes before Protractor methods because of Protractor's Control Flow (managed promises): it wraps every Protractor method in a deferred promise, which is put in a callback queue that executes only after the stack is empty. Take a look at the following code. I didn't try it out with Protractor, but you can get the idea.
var promise = Promise.resolve();
function asyncLog(message) {
Promise.resolve().then(() => console.log(message));
}
console.log('Start');
promise
.then(() => console.log('This is then'))
asyncLog('This is Callback Queue log');
console.log('This is Call Stack log');
promise
.then(() => console.log('This is another then'))
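To make the resulting order explicit, here is the same experiment with the messages recorded in an array instead of printed immediately. It uses native promises only, so it illustrates the queueing idea and does not model Protractor's Control Flow; record and order are names introduced for this sketch.

```javascript
const order = [];
const record = (message) => order.push(message);

const promise = Promise.resolve();
function asyncLog(message) {
  // Deferred: runs only after the current synchronous code has finished
  Promise.resolve().then(() => record(message));
}

record('Start');
promise.then(() => record('This is then'));
asyncLog('This is Callback Queue log');
record('This is Call Stack log');
promise.then(() => record('This is another then'));

// Once the synchronous code and all queued promise callbacks have run, order is:
// Start, This is Call Stack log, This is then,
// This is Callback Queue log, This is another then
setTimeout(() => console.log(order.join('\n')), 0);
```

The two synchronous messages come first, and the three promise callbacks then run in the order they were queued.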

Best way to enforce user/authentication state in Ember.JS app

Working on my first EmberJS app. The entire app requires that a user be logged in. I'm trying to wrap my head around the best way to enforce that a user is logged in now (when the page is initially loaded) and in the future (when user is logged out and there is no refresh).
I have the user authentication hooks handled - right now I have an ember-data model and an associated store that handle authorizing a user and creating a user "session" (using sessionStorage).
What I don't know how to do is enforce that a user is authenticated when transitioning across routes, including the initial transition in the root route. Where do I put this logic? If I have an authentication statemanager, how do I hook that in to the routes? Should I have an auth route that is outside of the root routes?
Note: let me know if this question is poorly worded or I need to explain anything better, I will be glad to do so.
Edit:
I ended up doing something that I consider a little more ember-esque, albeit possibly a messy implementation. I have an auth statemanager that stores the current user's authentication key, as well as the current state.
Whenever something needs authentication, it simply asks the authmanager for it and passes a callback function to run with the authentication key. If the user isn't logged in, it pulls up a login form, holding off the callback function until the user logs in.
Here are some select portions of the code I'm using. It needs cleaning up, and I left out some stuff. http://gist.github.com/3741751
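The idea described in this edit can be sketched in plain JavaScript, independent of Ember. This is not the code from the gist: AuthManager, withAuth, login, logout, and showLoginForm are hypothetical names chosen for illustration.

```javascript
// Hypothetical auth manager: runs callbacks with the auth key,
// holding them back until the user has logged in.
class AuthManager {
  constructor(showLoginForm) {
    this.key = null;                    // current authentication key, if any
    this.pending = [];                  // callbacks waiting for a login
    this.showLoginForm = showLoginForm; // called when a login is required
  }

  withAuth(callback) {
    if (this.key !== null) {
      callback(this.key);
      return;
    }
    this.pending.push(callback);
    // Only open the form for the first waiting callback
    if (this.pending.length === 1) this.showLoginForm();
  }

  login(key) {
    this.key = key;
    const callbacks = this.pending;
    this.pending = [];
    callbacks.forEach((callback) => callback(key));
  }

  logout() {
    this.key = null;
  }
}
```

A caller would write something like auth.withAuth((key) => loadData(key)), where loadData is whatever needs the key; if nobody is logged in, the callback simply waits until login(key) is called.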
If you need to perform a check before initial state transition, there is a special function on the Ember.Application class called deferReadiness(). The comment from the source code:
By default, the router will begin trying to translate the current URL into
application state once the browser emits the DOMContentReady event. If you
need to defer routing, you can call the application's deferReadiness() method.
Once routing can begin, call the advanceReadiness() method.
Note that at the time of writing this function is available only in ember-latest.
In terms of rechecking authentication between route transitions, you can add hooks to the enter and exit methods of Ember.Route:
var redirectToLogin = function(router){
// Do your login check here.
if (!App.loggedIn) {
Ember.run.next(this, function(){
if (router.currentState.name != "login") {
router.transitionTo('root.login');
}
})
}
};
// Define the routes.
App.Router = Ember.Router.extend({
root: Ember.Route.extend({
enter: redirectToLogin,
login: Ember.Route.extend({
route: 'login',
exit: redirectToLogin,
connectOutlets: function(router){
router.get('applicationController').connectOutlet('login');
}
}),
....
})
});
The problem with such a solution is that Ember will actually transition to the new Route (and thus load all data, etc) before then transitioning back to your login route. So that potentially exposes bits of your app you don't want them seeing any longer. However, the reality is that all of that data is still loaded in memory and accessible via the JavaScript console, so I think this is a decent solution.
Also remember that since Ember.Route.extend returns a new object, you can create your own wrapper and then reuse it throughout your app:
App.AuthenticatedRoute = Ember.Route.extend({
enter: redirectToLogin
});
App.Router = Ember.Router.extend({
root: Ember.Route.extend({
index: App.AuthenticatedRoute.extend({
...
})
})
});
If you use the above solution then you can cherry pick exactly which routes you authenticate. You can also drop the "check if they're transitioning to the login screen" check in redirectToLogin.
I put together a super simple package to manage session and auth called Ember.Session https://github.com/andrewreedy/ember-session
Please also take a look at:
http://www.embercasts.com/
There are two screencasts there about authentication.
Thanks.