Loading very large pages in Chromium

How can I enable the latest versions of CefSharp/Cef to utilize more of the available memory on a computer?
Here is a test case:
I load an infinite page, like https://www.facebook.com/Google, then run a script that scrolls down the page, as I want to load as much as possible of that page.
With CefSharp/Cef 79 and earlier, I am able to scroll down to dates back in 2010.
With the latest CefSharp/Cef, the render process crashes when it reaches some point in 2015.
Here is the script:
async function wait(intervalInMilliseconds) {
    // setTimeout rather than setInterval, so no stray timer keeps firing after the promise resolves
    return new Promise((resolve) => setTimeout(resolve, intervalInMilliseconds));
}

async function unlimitedScroll() {
    for (let i = 0; ; ++i) {
        window.scrollTo(0, document.body.scrollHeight);
        await wait(2000);
        console.log(`Scroll: ${i}, total: ${window.performance.memory.totalJSHeapSize.toLocaleString()}, used: ${window.performance.memory.usedJSHeapSize.toLocaleString()}, limit: ${window.performance.memory.jsHeapSizeLimit.toLocaleString()}`);
    }
}

unlimitedScroll();
The browser crashes around the time the totalJSHeapSize reaches 4GB, but I am running on a 32GB machine, so I have memory to spare.
I tried passing the V8 flags --max_heap_size, --max_old_space_size and --max_semi_space_size via --js-flags, but it does not help. As a matter of fact, setting a large value for --max_semi_space_size makes the browser crash even sooner than with the defaults.

As it turned out, it was the introduction of V8 pointer compression (https://v8.dev/blog/pointer-compression) that imposes the hard 4GB limit on JavaScript heap space and prevents the browser from loading very large pages.
When using the Chromium Embedded Framework, this can be solved by making a custom build with pointer compression turned off.
I did this by following the instructions at https://bitbucket.org/chromiumembedded/cef/wiki/MasterBuildQuickStart.md and, before running the build process, manually editing the file chromium_git\chromium\src\v8\BUILD.gn to comment out the setting of the V8_COMPRESS_POINTERS flag.
The resulting libraries have the drawback of consuming significantly more memory, but the limit is gone, and when running on a machine with 32GB RAM or more, we can load those very large pages successfully.

Related

Filtering out assets from precaching in create-react-app

I'm using React 17 with cra-template-pwa to create a PWA. One of my UI libraries has several hundred static image resources that all get preloaded in the PWA (and I don't use most of them). This causes a long delay in enabling the PWA and even causes Lighthouse to crash. I'm looking at various approaches to fixing the problem, but as a quick fix just to run Lighthouse, I'd like to disable precaching for those assets. I haven't been able to find concrete info on how to do this. Any advice?
The cleanest solution would entail using the exclude option in the workbox-webpack-plugin configuration, but that requires ejecting in create-react-app.
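(For reference, if you did eject, the relevant part of the webpack config might look roughly like the sketch below; the surrounding plugin wiring and the swSrc path are assumptions based on a typical CRA setup.)
// webpack.config.js (ejected): a sketch, not the exact CRA configuration
new WorkboxWebpackPlugin.InjectManifest({
  swSrc: './src/service-worker.js',
  // Keep all .jpg assets out of the precache manifest entirely
  exclude: [/\.jpg$/],
}),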
Something you can do without ejecting, though, is to explicitly filter out entries from the injected self.__WB_MANIFEST array before passing the value to precacheAndRoute().
Your service-worker.js could look something like:
import {precacheAndRoute} from 'workbox-precaching';

// self.__WB_MANIFEST will be replaced with an
// Array<{url: string, revision: string}> during the build process.
// This will filter out all manifest entries with URLs ending in .jpg
// Adjust the criteria as needed.
const filteredManifest = self.__WB_MANIFEST.filter((entry) => {
  return !entry.url.endsWith('.jpg');
});

precacheAndRoute(filteredManifest);
The downsides of this approach are that your service-worker.js file will be a bit larger than necessary (since it will include inline {url, revision} entries that aren't needed), and that you'll end up triggering the service worker update flow more often than strictly necessary if the contents of one of your images change. Those unnecessary service worker updates won't actually harm anything or change the behavior of your web app, though.

Why do browsers with an "offline" option still behave mostly like apps "online"?

tl;dr: what's the logic behind browsers (Chrome, FF, Safari) behaving as an app that's online after clicking offline and not simply... go offline?
CPP, FOP, STP
I have a small socket.io app that fetches from Twitter's API to make an image gallery.
I wanted to style the divs that create a frame around the photos, but while the app was running I found that, when selecting the elements in the dev tools, whenever a new image was added Chrome emitted a "purple pulsing" (hereafter referred to as CPP) that kicked me out of the div I wanted to style and (rudely) put me at its parent div (the Gallery proper, if you will).
I started by shutting off my WiFi, which solved the problem with two drawbacks:
remembered the offline option in the network panel
needed a connection to read the socket.io docs :~)
Next I tried the offline option and found that, like the production version, CPP reëmerged, the image requests logging net::ERR_INTERNET_DISCONNECTED.
I realized that I could probably set the option reconnection: false in the socket.io bit, but alas, this novella of a question (which contains multitudes) still beckoned.
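(For what it's worth, that option goes on the client constructor, roughly as below; it only stops the socket from retrying, though, and wouldn't change the DevTools behavior in question.)
// index.html: create the socket with automatic reconnection disabled
var socket = io.connect('/', { reconnection: false });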
The Actual Question(s)
What, chez Google (and Firefox (Orange Pulse), Safari (Transparent Pulse), et al.), is the logic behind this behavior?
Why not Truly Sever the relevant tab's connection?
Better yet, why not let the poor developer both hold fast to their element and acknowledge visually that new elements are being thrown in?
The images are still fetched (!) which makes the Offline option seem even more misleading.
The docs from Google reference PWAs and those with service workers... does
Check the Offline checkbox to simulate a completely offline network experience.
apply only to them?
The Code that Kinda Could:
Here are the ~20 relevant lines at play (and here's the whole gig):
// app.js
var T = new Twit(config)
var stream = T.stream('statuses/filter', { track: '#MyHashtag' })

stream.on('tweet', function (tweet) { io.sockets.emit('tweet', tweet) })

function handler(request, response) {
  var stream = fs.createReadStream(__dirname + '/index.html')
  stream.pipe(response)
}
... and the index.html's relevant script:
// index.html
var socket = io.connect('/');
socket.on('tweet', function (tweet) {
  // '===' rather than '=' so this is a comparison, not an assignment
  if (someConditions === foo) {
    tweet_container.innerHTML = '<img src="'
      + tweet.user.profile_image_url +
      '" />'
  }
})
Nota Bene: I realize this question contains questions germane to polling, streams, networking, and topics whose names I'm not even familiar with, but my primary curiosity is "what's the logic behind behaving as an app that's online after clicking offline and not simply... going offline?" (and behaving as it does when disconnecting from WiFi).
P.P.S. Here's a quote from knee-deep in the socket.io docs:
If a certain client is not ready to receive messages (because of network slowness or other issues, or because they’re connected through long polling and is in the middle of a request-response cycle), if it doesn’t receive ALL the tweets related to bieber your application won’t suffer.
In that case, you might want to send those messages as volatile messages.
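(On the server that would be roughly a one-line change to the emit shown above; volatile is a standard socket.io flag meaning the message may simply be dropped if the client isn't ready to receive it.)
// app.js: emit tweets as volatile messages so clients that miss a few don't matter
stream.on('tweet', function (tweet) { io.sockets.volatile.emit('tweet', tweet) })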

Optimizing POSTs-per-second in Turbogears2

In a web game built on Turbogears v2.1.5, logged-in users POST a 16-byte message periodically. The server CPU reaches 100% when the POST rate is 60 POSTs-per-second. (For testing, we have removed all work such as updating the DB with each post; the server simply returns an empty response immediately.)
Using wrk to GET a 16-byte static file we see Turbogears reaching rates of ~500 requests-per-second and want to match or get close to that rate with our game's POSTs. We'd really like to be at 1,000 or more POSTs per second.
Setup: Turbogears v2.1.5, AWS c3.large, Windows Server 2008 R2, Intel Xeon E5-2680 v2 @ 2.8GHz.
Question: Are there tg2 settings or other changes that would let us in this scenario handle 500 or more POSTs-per-second?
If you are able to upgrade to TG2.3, the work in the more recent releases greatly improved the framework's performance out of the box ( http://blog.axant.it/archives/452 ).
Also, through the new minimal mode introduced in 2.3 ( http://turbogears.readthedocs.io/en/latest/turbogears/minimal/index.html ) you can easily disable any components you don't need, like i18n, sessions, etc., for further speed improvements (see the various X.enabled options at http://turbogears.readthedocs.io/en/latest/reference/config-options.html ). Disabling i18n and static file support usually gives a good performance boost.

PhantomJS crashes while running grunt-karma test cases

We are facing an issue while running Karma test cases with PhantomJS: PhantomJS crashes and gets disconnected.
Is that due to a memory leak or some other issue? Kindly let me know if someone has a suitable solution.
I found that a workaround is to break the test cases into multiple Grunt tasks, but since we have a lot of test cases (more than 1,500), that would not be feasible.
We are using the below versions
Node: 0.10.32
Karma: 0.12.24
PhantomJS: 1.9.8 (karma-phantomjs-launcher)
Please let me know the solutions asap.
There are two reasons I found that this can happen.
PhantomJS does not release memory until its tab is closed, so if your test suite is too large you could be running out of memory.
karma-phantomjs-launcher and karma-phantomjs2-launcher do not hook the stdout/stderr output of the browser process they start, so I've seen instances where the started browser just hangs and gets disconnected, most likely because its stderr output fills up.
The first problem can be worked around by splitting your test suite into smaller ones. Or, you could research if there is perhaps a way to tell PhantomJS to run its JavaScript garbage collection, but I have not gone down that road so can't provide much more detail there.
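(As a rough illustration of the splitting approach, and assuming grunt-karma is already in use, something like the sketch below runs the suite as two smaller PhantomJS sessions; the target names and file globs are made up.)
// Gruntfile.js: split one large suite into two smaller karma targets
grunt.loadNpmTasks('grunt-karma');
grunt.initConfig({
  karma: {
    options: { configFile: 'karma.conf.js', browsers: ['PhantomJS'], singleRun: true },
    unitPart1: { options: { files: ['test/unit/a-m/**/*.spec.js'] } },
    unitPart2: { options: { files: ['test/unit/n-z/**/*.spec.js'] } }
  }
});
grunt.registerTask('test-split', ['karma:unitPart1', 'karma:unitPart2']);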
The second problem can be fixed by:
using the latest karma-phantomjs-launcher version, which hooks the browser's stdout/stderr output (fixed in version 0.2.1)
using a version of karma-phantomjs2-launcher from its pull request #5, which brings in upstream changes from the base karma-phantomjs-launcher project and thus resolves the problem here as well.
I had the same kind of issue with handling random crashes. Though I did not find a way to avoid them, it is possible to restart the Grunt task upon a crash.
grunt.registerTask('karma-with-retry', function (opt) {
  var done = this.async();
  var count = 0;
  var retry = function () {
    grunt.util.spawn({
      cmd: "grunt",
      args: ["connect", "karma"], // your tasks
      opts: {
        stdio: 'inherit'
      }
    }, function (error, result, code) {
      count++;
      if (error && code === 90 /* Replace with code thrown by karma */) {
        if (count < 5) {
          grunt.log.writeln("Retrying karma tests upon error: " + code);
          retry();
        } else {
          done(false);
        }
      } else {
        done(result);
      }
    });
  };
  retry();
});
Source https://github.com/ariya/phantomjs/issues/12325#issuecomment-56246505
I was getting a Phantom crash when asserting the following line:
dom.should.be.instanceof(HTMLCollection);
It worked in Chrome, but Phantom was crashing without any useful error message.
I was able to see the real error message after running the same test in a PhantomJS_debug browser with the debug option set to true.
The following error message started showing up.
The instanceof assertion needs a constructor but object was given.
Instead of
PhantomJS has crashed. Please read the bug reporting guide at
<http://phantomjs.org/bug-reporting.html> and file a bug report.
So Chrome was OK with the assertion, but PhantomJS 2.1.1 crashes with the above error. Hope this helps.
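(A possible workaround, assuming the same chai should-style setup as the assertion above: compare the object's internal class tag instead of using instanceof, which avoids relying on HTMLCollection being exposed as a constructor function.)
// Instead of dom.should.be.instanceof(HTMLCollection), check the internal [[Class]] tag
Object.prototype.toString.call(dom).should.equal('[object HTMLCollection]');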

HTML5 Server-Sent Events prototyping - ambiguous error and repeated polling?

I'm trying to get to grips with Server-Sent Events as they fit my requirements perfectly and seem like they should be simple to implement; however, I can't get past a vague error and what looks like the connection repeatedly being closed and re-opened. Everything I have tried is based on this and other tutorials.
The PHP is a single script:
<?php
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');

function sendMsg($id, $msg) {
    echo "id: $id" . PHP_EOL;
    echo "data: $msg" . PHP_EOL;
    echo PHP_EOL;
    ob_flush();
    flush();
}

$serverTime = time();
sendMsg($serverTime, 'server time: ' . date("h:i:s", time()));
?>
and the JavaScript looks like this (run on body load):
function init() {
    var source;
    if (!!window.EventSource) {
        source = new EventSource('events.php');
        source.addEventListener('message', function(e) {
            document.getElementById('output').innerHTML += e.data + '<br />';
        }, false);
        source.addEventListener('open', function(e) {
            document.getElementById('output').innerHTML += 'connection opened<br />';
        }, false);
        source.addEventListener('error', function(e) {
            document.getElementById('output').innerHTML += 'error<br />';
        }, false);
    }
    else {
        alert("Browser doesn't support Server-Sent Events");
    }
}
I have searched around a bit but can't find information on
1. If Apache needs any special configuration to support server-sent events, and
2. How I can initiate a push from the server with this kind of setup (e.g. can I simply execute the PHP script from the CLI to give a push to the already-connected browser?)
If I run this JS in Chrome (16.0.912.77) it opens the connection, receives the time, then errors (with no useful information in the error object), then reconnects in 3 seconds and goes through the same process. In Firefox (10.0) I get the same behaviour.
EDIT 1: I thought the issue could be related to the server I was using, so I tested on a vanilla XAMPP install and the same error comes up. Should a basic server configuration be able to handle this without modification / extra configuration?
EDIT 2: The following is an example of output from the browser:
connection opened
server time: 01:47:20
error
connection opened
server time: 01:47:23
error
connection opened
server time: 01:47:26
error
Can anyone tell me where this is going wrong? The tutorials I have seen make it look like SSE is very straightforward. Also any answers to my two numbered questions above would be really helpful.
Thanks.
The problem is your PHP.
With the way your php script is written, only one message is sent per execution. That's how it works if you access the php file directly, and that's how it works if you access the file with an EventSource. So in order to make your php script send multiple messages, you need a loop.
<?php
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');

function sendMsg($id, $msg) {
    echo "id: $id" . PHP_EOL;
    echo "data: $msg" . PHP_EOL;
    echo PHP_EOL;
    ob_flush();
    flush();
}

while (true) {
    $serverTime = time();
    sendMsg($serverTime, 'server time: ' . date("h:i:s", time()));
    sleep(1);
}
?>
I have altered your code to include an infinite loop that waits 1 second after every message sent (following an example found here: Using server-sent events).
This type of loop is what I'm currently using and it eliminated the constant connection drop and reconnect every 3 seconds. However (and I've only tested this in Chrome), the connections are now only kept alive for 30 seconds. I will be continuing to figure out why this is the case and I'll post a solution when I find one, but until then this should at least get you closer to your goal.
Hope that helps,
Edit:
In order to keep the connection open for ridiculously long times with PHP, you need to set the max_execution_time (thanks to tomfumb for this). This can be accomplished in at least three ways:
If you can alter your php.ini, change the value for "max_execution_time." This will allow all of your scripts to run for the time you specify though.
In the script you wish to run for a long time, use the function ini_set(key, value), where key is 'max_execution_time' and value is the time in seconds you wish your script to run for.
In the script you wish to run for a long time, use the function set_time_limit(n) where n is the number of seconds that you wish your script to run.
Server-Sent Events are easy only when it comes to the JavaScript part. First of all, a lot of SSE tutorials on the internet close their connections in the server part, be it PHP or Java examples. This is really astonishing, because what you get then is just a different way of implementing an "Ajax polling" system with a strictly defined payload structure (and some minor features like client retry values set by the server side). You can easily implement that with a few lines of jQuery; no need for SSE then.
According to the SSE spec, I would say that the retry shouldn't be the normal way of implementing a client-side loop. For me, SSE is a one-way streaming method which relies on a server backend that does not close the connection after pushing the first data to the client.
In Java it's useful to use the Servlet 3 async spec in order to free the request thread immediately and do the processing/streaming in a different thread. This works so far, but still I don't like the 30-second connection lifetime for the EventSource request. Even if I am pushing data every 5 seconds, the connection will be terminated after 30 seconds (Chrome, Firefox). Of course SSE will reconnect by default after 3 seconds, but still I don't think this is the way it should be.
One problem is that some Java MVC frameworks don't have the ability to keep the connection open after sending data, so you end up coding against the bare Servlet API. After 24 hours of coding prototypes in Java, I am more or less disappointed, because the gain over a traditional jQuery Ajax loop is not THAT much. And the problem of polyfilling the SSE feature also remains.
The problem is not a server-side issue; this all happens on the client and is part of the spec (I know it sounds weird).
http://dev.w3.org/html5/eventsource/
"When a user agent is to reestablish the connection, the user agent must run the following steps. These steps are run asynchronously, not as part of a task. (The tasks that it queues, of course, are run like normal tasks and not asynchronously.)"
Queue a task to run the following steps:
If the readyState attribute is set to CLOSED, abort the task.
Set the readyState attribute to CONNECTING.
Fire a simple event named error at the EventSource object.
I can't see any need to have an error here, so I have modified your Init function to filter out the error event fired whilst connecting.
function init() {
    var CONNECTING = 0;
    var source;
    if (!!window.EventSource) {
        source = new EventSource('events.php');
        source.addEventListener('message', function (e) {
            document.getElementById('output').innerHTML += e.data + '<br />';
        }, false);
        source.addEventListener('open', function (e) {
            document.getElementById('output').innerHTML += 'connection opened<br />';
        }, false);
        source.addEventListener('error', function (e) {
            if (source.readyState != CONNECTING) {
                document.getElementById('output').innerHTML += 'error<br />';
            }
        }, false);
    }
    else {
        alert("Browser doesn't support Server-Sent Events");
    }
}
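(A small aside on the snippet above: the ready states are also exposed as constants on the EventSource constructor, so the hard-coded CONNECTING value isn't strictly needed; assuming a reasonably recent browser, the check can be written as below.)
if (source.readyState !== EventSource.CONNECTING) {
    document.getElementById('output').innerHTML += 'error<br />';
}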
There is no actual issue with the code that I can see. The answer selected as correct is, then, incorrect.
This sums up the behavior mentioned in the question (http://www.w3.org/TR/2009/WD-html5-20090212/comms.html):
"If such a resource (with the correct MIME type) completes loading (i.e. the entire HTTP response body is received or the connection itself closes), the user agent should request the event source resource again after a delay equal to the reconnection time of the event source. This doesn't apply for the error cases that are listed below."
The problem lies with the stream. I've successfully kept a single EventStream open before in Perl: just send the appropriate HTTP headers, start sending stream data, and never shut down the stream server-side. The issue is that it seems most HTTP libraries attempt to close the stream after it's been opened. This will cause the client to attempt to reconnect to the server, which is fully standards-compliant.
This means that it will appear that the problem is solved by running a while loop, for a couple of reasons:
A) The code will continue to send data, as if it were pushing out a large file
B) The code (php server) will never have the chance to attempt to close the connection
However, the problem here is obvious: to keep the stream alive, a constant stream of data must be sent. This results in wasteful utilization of resources, and negates any benefits the SSE stream is supposed to provide.
I'm not enough of a PHP guru to know, but I'd imagine that something in the PHP server, or later in the code, is prematurely closing the stream; I had to manipulate the stream at the socket level with Perl to keep it open, since HTTP::Response was closing the connection and causing the client browser to attempt to re-open it. In Mojolicious (another Perl web framework), this can be done by opening a Stream object and setting the timeout to zero, so that the stream never times out.
So, the proper solution here is not to use a while loop; it is to call the appropriate php functions for opening, and keeping open, a php stream.
I was able to do it by implementing a custom event loop. It seems that this HTML5 feature is not ready at all and has compatibility issues even with the latest version of Google Chrome. Here it is, working on Firefox (I can't get the message sent correctly on Chrome):
var source;

function Body_Load(event) {
    loopEvent();
}

function loopEvent() {
    if (source == undefined) {
        source = new EventSource("event/message.php");
    }
    source.onmessage = function(event) {
        _e("out").value = event.data;
        loopEvent();
    }
}
P.S.: _e is a function that calls document.getElementById(id).
According to the spec, the 3-second reconnection is by design when the connection is closed. PHP with a loop should theoretically stop this, but then the PHP script will be running indefinitely and wasting resources. You should try to avoid using Apache and PHP for SSE because of this issue.
The standard HTTP response should close a connection once the response is sent. You can change this with the header "Connection: keep-alive", which should tell the browser that the connection is meant to stay open, although this can cause problems if you're using proxies.
Node.js or something similar is a better engine to use for SSE than Apache/PHP, and since it's basically JavaScript, it's pretty easy to get to grips with.
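(A minimal sketch of the same time-pushing example in Node.js, using only the built-in http module; the port and interval are arbitrary.)
// server.js: a bare SSE endpoint that pushes the server time every 2 seconds
var http = require('http');

http.createServer(function (req, res) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive'
  });

  var timer = setInterval(function () {
    res.write('id: ' + Date.now() + '\n');
    res.write('data: server time: ' + new Date().toLocaleTimeString() + '\n\n');
  }, 2000);

  // Stop pushing once the client disconnects
  req.on('close', function () {
    clearInterval(timer);
  });
}).listen(8080);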
With Server-Sent Events, as the name suggests, the data should be traveling from server to client; if the client has to reconnect every three seconds to retrieve data from the server, then it is no different from other polling mechanisms. The purpose of SSE is to alert the client as soon as there is new data the client is unaware of. Since the server closes the connection even if the header is keep-alive, there is no other way than to run the PHP script in an infinite loop, but with a considerable sleep to avoid burdening the server. So far I don't see any other way out, and it's better than spamming the server every 3 seconds for new data.
I'm trying the same thing, with varying degrees of success.
I had the same problem with Firefox, running the same JS code as mentioned.
Using the Nginx server and some PHP that exited (i.e. no continual loop), I could get messages back to a request from Firefox only once the PHP exited.
Running the PHP as a script in PHP.exe, all is good on the console: strings are printed when flushed. However, Nginx doesn't send the data until the PHP has completed. Adding extra \r\n\r\n and flush() or ob_flush() did not help.
There is no pushing of data, as shown in Wireshark logs, just a delayed response packet to the GET.
I read that I need a "push" module for Nginx, which requires a rebuild from source.
So this is definitely an Nginx problem.
Using a socket in C, I was able to push data to Firefox as expected; the socket was kept open and no messages were missed. However, this has the disadvantage that I need to serve the page.html and the events/stream from the same socket, or Firefox will not connect due to cross-site URL problems. There are some ways around this in certain situations, but not for an iframe in a menu system. This approach did prove the point that SSE does work with Firefox, and there are pushed packets in the Wireshark log, where option 1 only had request/reply packets.
All this said, I still don't have a solution. I've tried to remove the buffering on the PHP and Nginx sides, but still nothing arrives until PHP finishes. Trying different header options (e.g. chunked) didn't help either.
I don't feel like writing a full-blown HTTP server in C, but that seems to be the only option that is working for me at the moment.
I'm about to try Apache, but most write-ups suggest that it is worse than Nginx at this job.