How to compare test website and live website - testing

We have our production server running our website. Then we have a test server which has exact same data but with changes to code to do some new functionality. This web app has over 500 pages.
Is there any program that can
Login to the test site
Crawl through each page and then save the page as html
Compare with the same page saved with live site?
This way we can make sure that new features that we add to our test site will not break the live site when code updates are applied to production.
I am currently trying to use WinHTTrack website copier and then comparing the test and live folders with some code comparison tool like beyond compare. This works ok but there are lot of files changed because of the domain name changes.
Looking forward to ideas / solutions for this problem.
Regards

Have you looked at using Watir for this? It's not exactly the thing you are looking for but it might allow you some more granularity in your tests and ensure the site is functionally identical rather than getting caught up on changing guids, timestamps and all the other things that tend to change across any significant size website from day to day as part of it's standard functionality.

Apparently you can't make consistent, reproduceable builds in your project, can you? I would recommend moving towards that in the long run, it will save you a lot of headaches. That way you would know exactly what was deployed to which server when, so there would be no more need to bend around backwards to get the deployed sources back like this...
I know this is not a direct solution to your problem... but maybe it is worth comparing, whether you would save more in the long run by investing the efforts into your build process now, instead of implementing this workaround (and then improving your build process anyway - because one day you will almost surely need to do that).

wget has a --convert-links option, there are also some options to preserve cookies that might let you do it logged in http://drupal.org/node/118759#comment-664498

use an Offline Downloader, download all files to your computer from both sources, then compare the folder contents using a free tool like Total Commander.
EDIT
Load both of your sources into a CVS, and compare it there.

Related

How to compare content between two web pages in different environments?

We are in the process of building a website from scratch from an existing website. The web page is an identical copy, and as the web page contains many pages we need a way to compare content between the sites. It is of course possible to do manually, but it takes both a lot of time and entails a risk of human errors.
I have seen that there are services that offer this by inputting two URLs which are then analyzed and where discrepancies are presented. However, these cannot be used as our test environment is local (built in Sitecore).
Is there a way to solve this without making our test environment available online (which is not possible)? For example, does software exist for this, or alternatively some service where you can compare a web page that is online with one that is local?
Note that we're only looking for content comparison (not visual).
(Un)fortunately there's many ways to do this, but fortunately there are some simple ones.
What I would do is:
Get a list of URLs for each site. If the Sitemap is exhaustive, then you could use that, if it's not you might want to run some Sitecore Powershell to get the lists.
Given the lists (from files, or Sitecore API or something), write a program to visit each URL, get the text of the page after it's done rendering, and save it to disk (something like Selenium is good for this and you can use any language). You'll want some folder structure like host/urlpart/urlpart/pagename.txt, basically the same as your content tree.
Use some filesystem diff program like WinMerge to compare the two folders
This is quick and dirty, but a good place to start.

Why is it better to work on a local website copy than live on the website?

I used to work on the live website when I'm editing a website (I'm working alone), but some people told me "it's the old way". I'm inclined to evolve and I like to work, but how can't I lose time doing this?
First, that means that I need to get a copy of the website on my computer. I need to copy the files, dump and restore the database, first waste of time. If my customer adds extension on the website in the meantime(for example, Wordpress) my modification should be impacted then I need to add it on my local copy to. If I need to modify the DB I will need to do it on the local copy too.
Secondly if I want to show a work in progress to my customer I need to apply all modifications to the live website and check than everything works, still a waste of time.
And finally when everything is ok, I need to update again the live website, files and DB.
So, there's two options:
this is not the correct way to do and there's tools to do all that transparently (I hope so)
this is not a waste of time but a needed time to work properly (I understand why agency have big prices and I'll keep my method)
It depends on the complexity of the project and the size of your team.
One of the major risk of working on a live site is the introduction bugs in production. You also want to have some confirmation of functionalities developed from QA or your customer before having your users access them.
Basically, you want to make sure your new code does not break the live site, so working on the local instance could help you in this way, and you could also deploy on the live test site you changes for approval and QA.
Also if you working with a larger team working on the live site just won't scale up and the risk of introducing bug is even higher.
You could consider using docker, to simplify development on your local machine.

Does vwo(visual website optimizer) slow down website load time in any manner?

Visual website optimizer is a A/B testing tool which can help one site owner to analyze his site with a modified of that. It puts a simple code in your website and make a new version of your web page.Then it show one version of your webpage to 50% of your visitors and another ver to rest of the 50%. This way the owner can analyze which ver of the site is generating more revenue & dump the other one.
So my question is can vwo reduce the site loading time somehow?Or what is the drawbacks of using vwo in a website?
Yes, there's a little bit of additional lag in load time, as the script that makes the decision has to call home to the VWO servers, see what variations should be served, then serve that particular page.
The trick to minimising that loading lag is to put the script absolutely first on the target page, so that nothing else is happening before the script fires (but you'll always have lag).
This blog post by VWO sums everything up: https://vwo.com/blog/how-vwo-affects-site-speed/
They write in that post:
Having said all this, we are confident that VWO’s best-in-class technology coupled with optimal campaign settings will ensure that your website never slows down
However, I would suggest to test it out on your page and see whether it works for you or not.

Composite C1 - develop locally, sync to live site

I have a couple of Composite C1 CMS websites.
To edit them currently I use the web based CMS on the live site.
However - I would like to update the (code & content) in Visual Studio locally - then sync to the web. However, if my local copy is older than that online (e.g. a non techy client has edited something on the live site) and I Web Deploy - it will go over the top of the new file on the server.
I need a solution that works out the newest change? I can't find anything in Google or the C1 docs.
How can I sync - preferably using Web Deploy. Do I need some kind of version control?
Is there a best practice for this - editing the live site through the web interface seems a bit dicey & is slow.
The general answer to this type of scenario seems to be to use the Package Creator. With that you can develop locally, add the files you've changed to a package, and install that package on a live site. This solution does not at all cover all the parts of you question though, and has certain limitations:
You cannot selectively add content to a package. It's all pages or no pages.
Adding datatypes is easy, but updating them later requires you to delete the datatype (and data), and recreate the datatype.
In my experience packages works well for incremental site updates, if you limit the packages content to be front end stuff, like css, images and such.
You say you need a solution that works out the newest changes - I believe the only solution to this is yourself, with the aid of some tooling. I don't think there's a silver bullet solution here.
Should you use a version control system? Yes! By all means. Even if you are not sharing your code with anyone, a VCS is a great way to get to know Composite C1 from a file system perspective, as you can carefully track what files are changed on disk, as you develop. This knowledge is crucial when you want to continuously add features the a website that is already alive and kicking - you need to know what to deploy, and what not to touch.
Make sure you read the docs on how Composite fits in VCS: http://docs.composite.net/Configuration/C1-and-Version-Control
I assume that your sites are using the XML data storage (if you where using SQL Data Store, your content would not be overridden upon sync).
This means that your entire web application lives in one folder on disk on the web server, which can be an advantage here.
I'll try to outline a solution that could work for you, although I must stress that I've never tried this - I'm making it up as I type.
Let's say you're using git, download the site in it's entirety from the production web server, and commit the whole damned thing* to your master branch.
Then you create a new feature branch from that commit, and start making the changes you want to deploy later, and carefully commit your work as you go along, making sure you only commit the changes that are needed for your feature to work, to the feature branch.
Now, you are ready to deploy, and you switch back the master branch, and again download the entire site and commit it to master.
You then merge your feature branch into the master branch, and have git do all the hard work of stitching you changes in with the changes from the live site. There are bound to be merge conflicts, and that is where you will have to jump in, and decide for yourself what content needs to go live.
After this is done and tested, you can web deploy the site up to the production environment.
Changes to the live site might have occurred while you where merging, so consider closing the site, or parts of it, during this process.
If you are using SQL Data Store i suggest paying for a tool like Red Gate's SQL Compare and SQL Data Compare or SQL Delta, to compare your dev database to the production database, and hand pick SQL scripts that can be applied to the production database along with your feature deployment.
'* Do consider using a .gitignore file to avoid committing certain files - refer to the docs for mere info.
I suppose you should use the Package Creator
Also have a look here: http://docs.composite.net/Configuration/C1-and-Version-Control

browser plugin to test a site's look when migrating

I'm thinking I need a browser plugin that does the following, and if it doesn't exist, it should. I may as well say FF for now, but it could be any browser.
The problem: when moving a website from one server to another, you need migration testing. It is a pain to click on every link by hand and compare it to the old host. You really need 2 machines or have to constantly thrash your hosts file.
The plugin:
Would allow you to specify an alternate hosts entry for a website. 2 entries would make it clear, one for live, one for test.
The plugin would crawl every link on the site, and render the page in the browser, and save an image of the entire page.
It would switch hosts and repeat, and save images in a second folder. Since the rendering engines match, the images should match. We need to switch hosts (like /etc/hosts) so all absolute links are the same for the site.
Now this could be part of the plugin or external, now that we have 2 folders of identically named images, we run an image-diff program on the whole batch. A quick test would be a bdiff or hash, or we could get more sophisticated and determine how different each image is.
This would save so much time. So can it be done with existing tools, or do I need to go write it?
Have a look at Selenium, it allows you to script interactions with the browser and verify content.
That is overengineered. What kind of website is it? How big? Which framework (PHP, JSP, Rails, etc.)? Why not copy the website onto the new server and grep the code for specific ties to the old server?
I'd concentrate on why you think the site would differ between two servers, and focus on testing those specific cases rather than the whole site. When a site is moved to a new machine the issues are generally very obvious from looking at a couple of pages.
Presumably they are both looking at the same data source, assuming there is a data source, otherwise a folder diff on the two installations would suffice. This being the case, it should be a simple task to identify which areas of the site are likely to be affected by a server migration.
Also, I wouldn't personally trust a machine matching two images to sign off system as ready to go live. There just isn't a substitute for real human testing. Yes it's time consuming, but how important is your site?
Try http://www.browsercam.com/ - free trial should allow you to specify main page and follow links to make screenshots automatically of the sub-pages as well.