How to optimize intranet web site - optimization

I have create a web site (PHP, MySQL) for the intranet of my campus. The campus network has a proxy and web servers but I've used a PC in my workgroup as the server using XAMPP 1.7.7 for testing purpose. When I visiting the web site from the different PC in the same workgroup it takes more than 30 seconds to load the index page or other pages.
The web browser used is Firefox and it has bypass the proxy server for the local addresses. The index.php page is only 5KB of the size. And in the index.php I have destroyed current session if there any, database connection to retrieve latest 03 news and call two external css files. Used less than 5 images (total capacity less than 5 KB) in every page. The XAMPP is in default settings.
Is there something that I can do to optimize and decrease the loading time. Your opinions are welcome.

That is very slow. Your page is not that large. If the page loads fast locally don't bother trying to optimize more. How long does it take to copy a file over the workgroup? Or better yet, FTP a file to your 'server'?
You could use tracert (http://support.microsoft.com/kb/162326) to get a better idea where the slow down is.
More generally speaking, the ideal solution really depends on what you are ultimately trying to accomplish. XAMPP is obviously more geared towards personal localhost work. You got the ports open and host all setup correctly, seems your immediate problem is a network issue requiring further diagnoses.

Related

Is PageSpeed Insights bypassing Google CDN cache?

We're using Google Cloud Platform to host a WordPress site:
Google Load Balancer with CDN -> Instance Group with single VM -> Nginx + WordPress
From step 1 (only VM with WordPress, no cache) to the last step (whole setup with Load Balancer and CDN) I could progressively see the improvement when testing locally from my browser and from GTmetrix. But PageSpeed Insights always showed little improvement.
Now we're proud of an impressive 98/97 score in GTmetrix (woah!), but PSI still shows we're pretty average, specially on mobile (range from 45-55).
Problem: we're concerned about page ranking in Google so we'd like to make PSI happy as well. Also... our client won't understand that we did make an improvement while PSI still shows that score.
I was digging and found a few weird things about PSI:
When we adjusted cache-control in nginx, it was correctly detected by local browser and GTmetrix, but section Serve static assets with an efficient cache policy in PSI showed the old values for a few days.
The homepage has a background video hosted in 3 formats (mp4, webm, ogv). Clients are supposed to request only one of them (my browser and GTmetrix do), but PSI actually requests the 3 of them. I can see them in Avoid enormous network payloads section.
When a client requests our homepage, only the GET / request reaches our backend server (which is the expected behaviour) and the rest of the static assets are served from the CDN. But when testing from PSI, all requests reach our backend server. I can see them in nginx access log.
So... those 3 points are making us get a worse score in PSI (point 1 suddenly fixed itself yesterday after days since we changed cache-control), but for what I understand none of them should be happening. Is there something else I am missing?
Thanks in advance to those who can shed some light on this.
but PSI still shows we're pretty average, specially on mobile (range from 45-55).
PSI defaults to show you a mobile score on a simulated throttled connection. If you look at the desktop tab this is comparable to GT Metrix (which uses the same engine 'Lighthouse' under the hood without throttling so will give similar results on Desktop).
Sorry to tell you but the site is only average on mobile speed, test it by going to Performance tab in developer tools and enabling 'Network:Fast 3G' and 'CPU: 4x Slowdown' in the throttling options.
Plus the site seems really JavaScript computation heavy for some reason, PSI simulates a slower CPU so this is another factor. One script is taking nearly 1 second to evaluate.
Serve static assets with an efficient cache policy in PSI showed the old values for a few days.
This is far more likely to be a config issue than a PSI issue. PSI always runs from an empty cache. Perhaps the roll out across all CDNs is slow for some reason and PSI was requesting from a different CDN to you?
Videos - but PSI actually requests the 3 of them. I can see them in Avoid enormous network payloads section.
Do not confuse what you see here with what Google has used to actually run your test. This is calculated separately from all assets that it can download not based on the run data that is calculated by loading the page in a headless browser.
Also these assets are the same for desktop and mobile so it could be for some reason it is using one asset for the mobile test and one for the desktop test.
Either way it does indeed look like a bug but it will not affect your score as that is calculated in other ways.
all requests reach our backend server
Then this points to a similar problem as with point 1 - are you sure your CDN has fully deployed? Either that or you have some rule set up for a certain user agent / robots rule set up that bypasses your CDN. Most likely a robots rule needs updating.
What can you do?
double check your config, deployment etc. Ensure it has propagated to all CDN sites and that all of the DNS routing is working as expected.
Check that you don't have rules set for robots, I notice the site is 'noindex' so perhaps you do have something set up while you are testing things that is interfering.
Run an 'Audit' from Developer Tools in Google Chrome -> this uses exactly the same engine that PSI uses. This may give you better results as it uses your actual browser rather than a headless browser. Although for me this stops the videos loading at all so something strange is happening with that.

Self Web Hosting with cpanel - What are detailed software as well as hardware requirements do I need to meet?

I recently deployed my website in the live shared resource server in Linux environment. The website is a Property Listing Application that allows real estate agents to list their properties with 18 High Definition (HD) images per listing.
My Web Hosting Provider configured the php.ini file for upload_max_filesize at 20MB. I tried different possible configurations to increase the upload size whether through php scripting in cpnel or progammatically but I got connection time out without notification error.
I spoke to Techical Support Team who increased the upload_max_filesize to 50MB that allowed me to only upload 16 images of 3MB size each. But when I tried, the image upload failed still.
I noticed that the cpanel master settings override all the changes. I wanted to upgrade the package for VPS but its features don't convice my ideal server settings.
I decided to concider a self web hosting with control panel. I investigated some possibilities and did some research online to find out how I can accomplish this, I found a few tips that showed possibilities with drawbacks.
I would like to know if it is possible for the self web hosting. What will be server requirements like number of CPUs, Raid, RAM and HDD sizes, etc. What is the most trusted server manufacturers like Cisco, Dell, etc. For Software between Windows and Linux. What is the best fit and why? What is a good cpanel to consider?
I fixed my upload issues. The problem was not upload_max_filsize but the actual image resolution in my script that needed a slight modication. I just modified the image resolution and the upload file image work fine.
For the ideal cpanel, I found cpanel.net

Scrapy on Ubuntu web server getting 417 error

I have been developing a crawling script for a number of news websites and using Scrapy to handle the logic.
When I run my script on an Ubuntu web server (Digital Ocean, if that helps), a lot of the websites that return 200 on my local machine turn out to be 417 instead.
I was wondering how I should fix this, if it is a problem at all? I'm actually not quite sure if it is affecting the final output, but it seems like it has been.
Some of my own research has turned up:
http://www.checkupdown.com/status/E417.html . I've tried adding an Expect header to my requests, which hasn't worked
I've heard that it might be a problem with HTTP 1.1 vs 1.0? EDIT: Nope. Scrapy's HTTPDownloaderHandler automatically chooses 1.1 if it is available
417 is the error a web server gives you when your client says it expects content-types a,b,c, but the content that the server could deliver doesn't match any of these types.
This looks like a scrapy bug or, more likely, misconfiguration.
It seems either your public ip address was already banned or was banned while you scraped by the web server of the page you want to scrape. For the first situation you can reboot your instance to get a new public ip (at least this works on Amazon). For the second scenario, here are some tips from the official documentation to avoid this situation:
rotate your user agent from a pool of well-known ones from browsers
(google around to get a list of them)
disable cookies (see COOKIES_ENABLED) as some sites may use cookies to spot bot behaviour
use download delays (2 or higher). See DOWNLOAD_DELAY setting.
if possible, use Google cache to fetch pages, instead of hitting the
sites directly
use a pool of rotating IPs. For example, the free Tor
project or paid services like ProxyMesh
use a highly distributed downloader that circumvents bans internally, so you can just focus on parsing clean pages. One example of such downloaders is Crawlera
Additionally, you can reduce concurrent requests settings in your spider, that worked once for me.

Site functionality diminished over VPN/company network

We currently experience a diminished with one of our customers at our main production site. All subpages and resources seem to be affected as well.
The customer reports a completely broken experience for themselves with the site not working correctly at all, mostly due to assets not loading correctly.
We already started investigating and have found that - so far - nothing seems to be wrong with the site itself.
Quick rundown:
The production site has a Cloudflare layer and almost all of it's assets are delivered either via CDNjs or Amazon's Cloudfront (behind Cloudflare) - all assets are reachable via HTTP as well
The site uses SSL and enforces it (the dynamic cert from Cloudflare)
We could secure a HAR from one of the requests for the request to one of our sites, the request times are extremely long. If you like to try, here is an online HAR viewer, be sure to uncheck validation of the file.
The customer uses Internet Explorer 8 and Chrome (39). While the site is not optimized for IE8. It should run fine in Chrome, in fact, in runs in most browsers above IE9 just fine for all of us.
Notes
We already ruled out:
Virtual delivery problems (there could be physical limitations we are not aware of)
General faultiness of our setup (We tried three different open VPNs to verify this)
Being on the customers blacklist by accident (although we cannot be entirely sure of this)
SSL Server name indication (SNI) problems
(Potentially) a general problem with the customers network, the customer does not report any problems with "the rest of the internet".
The customer will not give access to their VPN/disclose security details so we cannot really test for the situation ourselves. We suspect that the customer uses an internal proxy that might cause the problems described, but we are not sure.
Questions
My questions here are:
Is there any known problem caused by internal networking in conjunction with our setup that can cause this behaviour?.
Are there potential problems on our end that we could have overlooked or things that we do different from other sites?
It seems the connection is being done (or routed) through a low bandwidth high latency link (or a very congested one). Most of the dns lookups and connects seems to be taking ~10s.
In the HAR you can see that it affects fonts.googleapis.com and cdnjs.cloudflare.com. https://www.google-analytics.com/analytics.js has no data captured. To me the affirmation that the customer does not report any problems with "the rest of the internet" seems kind of dubious, seeing that in this HAR it hasn't been able to load the analytics js and access to usual cdns are very slow.
My guesses (pick one or more):
they are testing in a machine different than the one they have no problems with "the rest of the internet"
this machine is very, very slow
it has some kind of content filtering, antivirus, whatever filtering the web (perhaps with a ssl certificate installed in order to forge & inspect https traffic)
the access is done through a congested route, or a low bandwidth high latency link
Two hotspots:
It happens sometime for CDN points to be inconsistent, I spent a lot of time to understand this issue. How? In a live session with the client when I opened each resource loaded one by one I understand there are differences between CDN access points (Mine eastern Europe - His central Europe ). CDN hosting was one of the biggest US player in the world, anyhow we fixed this by invalidating(deleting) all files from CDN as so new/correct ones were loaded.
You need to have CDN that supports serving files over HTTPS, then use that CDN for the SSL requests.

Making Plone site temporarily static for high traffic peak

We know there is a surge of traffic hitting a Plone site on a certain day. Last time this happened we couldn't crank enough power out of Plone to make it run smoothly.
Now I am asking what kind of tricks one could play to feed the horde temporarily? E.g.
Convert (part of) Plone site to static HTML files and images on a disk, serving them through Apache?
Cache the whole site in Varnish with very long expire time
Using some CDN service which automatically mirrors the site
We can change the site DNS if needed, but I hope all this could be achieved having contact form and other HTTP POST forms still working (if necessary we can hide them temporary)
I'd go with Varnish and something like a 60 second TTL. This is enough, because it means you'll get only a handful of requests per minute.
You need to test carefully, though, that response headers are set correctly so you don't have any "holes" in the cache that hammer Zope. Funkload to the rescue.
Martin