I was wondering whether testing web applications by comparing screenshots is used in the industry, and whether it's a good approach.
Scenario:
- model image is taken by hand
- test compares only selected parts of an image
- test is written in selenium for example
- test has always the same data to work on
- test is always running on the same screen (the same resolution)
- test will compare images after some steps (for example: does the user account page look good after registration - we always have the same test data)
Does this have some advantages?
Is it a 'stable' approach to testing web apps?
Could it be useful as the last step of (for example) selenium test to verify the results?
What do you think about it?
I did a search for that topic but couldn't find any good resources.
There are solutions for this kind of testing; Applitools is one example.
It works by taking a baseline screenshot and then diffing each subsequent screenshot against that original image. There are 4 different comparison levels:
Exact (MatchLevel.EXACT) - pixel-to-pixel comparison
Strict (MatchLevel.STRICT) - Strict compares everything including content (text), fonts, layout, colors and position of each of the elements, but knows to ignore rendering changes that are not visible to the human eye
Content (MatchLevel.CONTENT) - Content works in a similar way to Strict, except that it ignores colors
Layout (MatchLevel.LAYOUT) - compares the layouts (i.e. structure) of the baseline and actual images. It ignores the content, color and other style changes between the pages.
A big advantage, in my opinion, is that this kind of testing can catch unexpected bugs (visual or otherwise) and in one go (you don't need to write multiple assertions, you just compare screenshots). You can write scripts with less code, as well.
Possible downsides are: dynamic content is hard to handle, sometimes it is impossible to take a screenshot, and, since screenshots are involved, test execution (working with image files) can take longer.
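At the Exact level, the comparison is conceptually just a pixel diff. Here is a minimal, library-free sketch of that idea (real tools like Applitools do far more; in practice you would capture screenshots via your test framework and compare them with an image library such as Pillow):

```python
# Toy pixel-level diff: an "image" here is a 2D list of (R, G, B) tuples.
def diff_ratio(baseline, actual, tolerance=0):
    """Return the fraction of pixels that differ by more than `tolerance`
    in any channel. 0.0 means an exact match."""
    assert len(baseline) == len(actual) and len(baseline[0]) == len(actual[0])
    total = len(baseline) * len(baseline[0])
    changed = 0
    for row_b, row_a in zip(baseline, actual):
        for px_b, px_a in zip(row_b, row_a):
            if any(abs(b - a) > tolerance for b, a in zip(px_b, px_a)):
                changed += 1
    return changed / total
```

A test would then assert something like `diff_ratio(baseline, screenshot) < 0.01`; raising `tolerance` moves you from Exact toward Strict-style forgiveness of invisible rendering noise.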
NOTE: I'm not involved with Applitools, but they have a site with many tutorial courses.
I hope it’s OK to post a completely naive question here for LibreOffice or OpenOffice experts.
I’m looking for advice on whether scripting LibreOffice or OpenOffice would be suitable for the following:
General Question
I’m looking to generate PDF reports, based on a combination of a “template” and a set of data (currently in JSON format) and inserted images.
This would act as a headless service (on Linux) that gets invoked from a web server whenever a user requests a PDF report.
We have a need to frequently modify/customise/generate new templates, hence the reluctance to go down the route of using something like ReportLab (plus I don't know ReportLab at all, so I'd face a huge learning curve that way).
Background
This is in contrast to using an approach of using a PDF library like Reportlab directly within the web server, and having to build up the template/report programmatically.
As LibreOffice/OpenOffice is obviously a lot faster for creating good-looking report "templates", this question is about doing both the template creation and the final template + data -> PDF generation directly within LibreOffice.
Some more specifics
The data values would mostly be substituted into fields in the template, with little to no processing of the values required.
However, there would be situations where some of the data is in “sets” that would be shown in a table type view, and the number of fields (and so number of table rows for instance) would need to vary per report, based on the number of values in that particular JSON data.
Additionally, I’d need to be able to include (“import”) images into the report. Some of the JSON data would be paths to image files, and I’d like to include those. Again, the number of images may vary between reports.
This wouldn't be high frequency at all, so would not need to run either LO/OO as a service, but could simply invoke when required with a sys call. Conceptually something like "LibreOffice --template 'make_fancy.report' <data.json> <output_file.pdf>"
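The sys-call idea above can be sketched like this. Note the real LibreOffice conversion switches are `--headless --convert-to pdf --outdir`; merging the JSON data into the template first is a separate step (e.g. a Basic/Python UNO macro), which this sketch does not cover:

```python
import subprocess

def convert_to_pdf_cmd(template_path, out_dir):
    """Build the LibreOffice headless command line that converts a
    document to PDF in `out_dir`. (Filling the template with JSON data
    beforehand would be done by a separate macro/UNO step.)"""
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", out_dir,
        template_path,
    ]

# Actually running it requires LibreOffice to be installed:
# subprocess.run(convert_to_pdf_cmd("make_fancy_report.odt", "/tmp/out"), check=True)
```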
If this approach would be reasonable in either LO or OO, what languages are best to script in? (Hopefully python3).
Lately I have needed to do an impact analysis for changing a DB column definition in a widely used table (like PRODUCT, USER, etc.). I find it a very time-consuming, boring and difficult task. Is there any known methodology for doing this?
The question also applies to changes to the application, file system, search engine, etc. At first I thought this kind of functional relationship should be pre-documented or somehow kept track of, but then I realized that everything can change, so it would be impossible to do so.
I don't even know what tags this question should have; please help.
Sorry for my poor English.
Sure. One can, technically at least, know what code touches the DB column (reads or writes it) by computing program slices.
Methodology:
- Find all SQL code elements in your sources, and determine which ones touch the column in question. (Careful: SELECT * may touch your column, so you need to know the schema.)
- Determine which variables read or write that column.
- Follow those variables wherever they go, and determine the code and variables they affect; follow all of those variables too. (This amounts to computing a forward slice.)
- Likewise, find the sources of the variables used to fill the column; follow them back to their code and sources, and follow those variables too. (This amounts to computing a backward slice.)
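The forward-slice step boils down to a transitive closure over a dependency graph. A toy sketch (the variable names here are purely illustrative; a real slicer works on actual data-flow edges extracted from the code):

```python
from collections import deque

# Toy dependency graph: an edge a -> b means "b is computed from a".
deps = {
    "product.price":  ["order.total"],
    "order.total":    ["invoice.amount", "report.revenue"],
    "invoice.amount": [],
    "report.revenue": [],
}

def forward_slice(graph, start):
    """Everything transitively affected by `start` (BFS over the graph)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

The backward slice is the same traversal over the reversed graph: start from the column and follow "is computed from" edges instead.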
All the elements of the slice are potentially affecting/affected by a change. There may be conditions in the slice-selected code that are clearly outside the conditions expected by your new use case, and you can eliminate that code from consideration. Everything else in the slices you may have to inspect/modify to make your change.
Now, your change may affect some other code (e.g., a new place that uses the DB column, or that combines the value from the DB column with some other value). You'll want to inspect the upstream and downstream slices of the code you change, too.
You can apply this process for any change you might make to the code base, not just DB columns.
Doing this manually is not easy in a big code base, and it certainly isn't quick. There is some automation for C and C++ code, but not much for other languages.
You can get a rough approximation by running test cases that involve your desired variable or action and inspecting the test coverage. (The approximation gets better if you also run test cases you are sure do NOT cover your desired variable or action, and eliminate all the code they cover.)
Ultimately this task cannot be fully automated or reduced to an algorithm; otherwise there would be a tool to preview refactored changes. The better the code was written in the first place, the easier the task.
Let me explain how to approach the answer: isolation is the key. Mapping everything to object properties can help you automate your review.
I can give you an example. If you can manage to map your specific case to the pattern below, it will save you a lot of work.
The OR/M change pattern
Like Hibernate or Entity Framework...
A change to a database column can be previewed simply by analysing which code uses the corresponding object property. Since all DB columns are mapped to object properties, and assuming no code uses raw SQL, you are good to go for your estimate.
This is a very simple pattern for change management.
To reduce a file system, network, or data file issue to the above pattern, you need other software patterns in place. That is, if you can reduce a complex scenario to a change in your objects' properties, you can let your IDE detect the affected code for you, including code that needs a slight modification to compile or has to be rewritten entirely.
If you want to manage a change in a remote service, wrap that service in an interface when you initially write your software. Then you will only have to modify its implementation.
If you want to manage a possible change in a data file format (e.g. a field-length change in a positional format, or column reordering), write a service that maps that file to an object (e.g. using the BeanIO parser)
If you want to manage a possible change in file system paths, design your application to use more runtime variables
If you want to manage a possible change in cryptography algorithms, wrap them in services (e.g. HashService, CryptoService, SignService)
If you do the above, your manual requirements review will be easier: the overall task is still manual, but it can be aided by automated tools. For example, you can change the name of a class property and see the side effects in the compiler.
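The wrap-it-in-a-service idea can be sketched concretely, reusing the HashService example from the list above (Python here, though the same shape applies in Java or C#):

```python
from abc import ABC, abstractmethod
import hashlib

class HashService(ABC):
    """Callers depend only on this interface, so a later change of
    algorithm (e.g. SHA-256 -> SHA-3) touches one class, and the
    compiler/IDE can show every affected call site."""
    @abstractmethod
    def digest(self, data: bytes) -> str: ...

class Sha256HashService(HashService):
    def digest(self, data: bytes) -> str:
        # The algorithm choice is isolated in this one implementation.
        return hashlib.sha256(data).hexdigest()
```

Swapping in a different implementation is then invisible to every caller, which is exactly what makes the impact analysis cheap.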
Worst case
Obviously, if you need to change the name, type and length of a specific column in a database, in software with plain SQL hardcoded and scattered in multiple places around the code, where (worse) many tables have similar column names, and with no project documentation (I did write "worst case", right?) across a total of 10,000+ classes, then you have no choice but to explore the project manually, using find tools but not relying on them.
And if you don't have a test plan - the document from which you can hope to derive a software test suite - it is time to make one.
Just adding my 2 cents: I'm assuming you're working in a production environment, so there should already be some form of unit tests, integration tests and system tests written.
If yes, then a good way to validate your changes is to run all these tests again and create any new tests which might be necessary.
And to state the obvious, do not integrate your code changes into the main production code base without running these tests.
Also bear in mind that changes which worked fine in a test environment may not work in a production environment.
Use some form of source code management system, like Subversion, GitHub, CVS etc. This enables you to roll back your changes.
I am investigating a replacement for iText and have been looking at the API and example code for PDFBox. I am slightly confused by its usage, though: it seems I need to create the page objects manually, which implies I need to know the number of pages beforehand, or at least work out when it's time to create a new page.
I generally use PDF generation for reports based on user configurable parameters which call stored procedures which can return varying amounts of data.
My question is quite simple: is it down to me to work out how much data will fit onto a page and create the pages programmatically?
The API seems to state that each page object represents a single page. From my experience with iText, I do not need to worry about this: I simply write my data to the document and the pages are created for me based on the content I place into it.
I recently made the switch from iText to PDFBox and ran into a similar issue. I asked this question and eventually worked out what I needed to do to generate reports with an unknown number of pages.
This model works well for generating reports containing lines of data generated from a ResultSet...though that's the only way I've been using it thus far. I may run into limitations, but for now, it's getting the job done.
And I should state that I am still laying out each page manually, but this method does at least generate my pages dynamically, depending on the number of results returned.
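The bookkeeping that iText did implicitly - deciding when a new page starts - is essentially just chunking the result rows by a rows-per-page capacity. A library-agnostic sketch of that step (Python pseudocode for the logic; with PDFBox each chunk would then be drawn onto its own PDPage):

```python
def paginate(rows, rows_per_page):
    """Split result-set rows into page-sized chunks; each chunk is
    rendered onto its own page object. len(result) is the page count."""
    return [rows[i:i + rows_per_page] for i in range(0, len(rows), rows_per_page)]
```

This assumes fixed-height rows; variable-height content needs a running y-coordinate check instead of a fixed capacity.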
I'm new to Selenium, and also to fuzz testing. I see that Selenium IDE only allows fixed test cases, whereas fuzz testing seems helpful.
So: what's behind fuzz testing, what kinds of tests does Selenium offer, and is this black-box or white-box testing?
Any help would be appreciated.
For a short answer:
Selenium is mostly about black-box testing, but you can do somewhat whiter-box testing with Selenium as well.
Selenium RC gives you much more freedom to do fuzz testing than Selenium IDE.
For a long answer, see below:
In this post I will try to explain the concept of randomly testing your web application using Selenium RC.
Generally speaking, a black-box testing technique like Selenium gives you good freedom to:
(1) Enter any value to a certain field
(2) Choose any field to test in a certain HTML form
(3) Choose any execution order/step to test a certain set of fields.
Basically you
use (1) to test a specific field in your HTML form (did you choose a good maximum length for the field?), your JavaScript handling of that field's value (e.g. turning "t" into today's date, or "+1" into tomorrow's date), and your back-end database's handling of that value (VARCHAR length, conversion of a numerical string into a numerical value, ...).
use (2) to test ALL possible fields
use (3) to test the interaction of the fields with each other: is a JavaScript alert popped up if the username field was not filled in before the password field; does a database trigger (e.g. in Oracle) fire when a certain condition is not met.
Note that testing EVERYTHING (all states of your program, constructed from all possible combinations of all variables) is not possible even in theory (e.g. consider testing a small function that parses a string: how many possible values does a string have?). Therefore, in reality, given limited resources (time, money, people), you want to test only the "most crucial" execution paths of your web application. A path is more "crucial" if it has more of these properties: (a) it is executed frequently, (b) a deviation from the specification causes serious loss.
Unfortunately, it is hard to know which execution cases are crucial, unless you have recorded all use cases of your application and selected the most frequent ones, which is a very time-consuming process. Furthermore, even a bug in the least-executed use case can cause a lot of trouble if it is a security hole (e.g. someone stealing all customers' passwords via a tiny bug in the URL handling of some PHP page).
That is why you need to randomly scan the testing space (i.e. the space of values used in those use cases), with the hope of run-something-and-scan-everything. This is called fuzz testing.
Using Selenium RC you can easily do all the phases (1), (2) and (3) - testing any value in any field under any execution step - by doing some programming in a supported language like Java, PHP, C#, Ruby, Perl or Python.
The following are the steps for all three phases (1), (2) and (3):
Create a list of your HTML fields so that you can easily iterate through them. If your HTML fields are not well structured (for legacy reasons), think about adding a new attribute that contains a specific id, e.g. selenium-id, to each HTML element, to (1) simplify XPath formation, (2) speed up XPath resolution and (3) avoid translation hassles. When choosing the values for these newly added selenium-ids, you can make iterating while fuzzing easier by (a) using consecutive numbers, or (b) using names that follow a consistent scheme.
Create a random variable to control the step, say rand_step
Create a random variable to control the field, say rand_field
Eventually, create a random variable to control the value entered into a certain field, say rand_value.
Now, inside your fuzzing algorithm, iterate first through the values of rand_step; within each such iteration, iterate through rand_field; and finally iterate through rand_value.
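The three random variables above (rand_step, rand_field, rand_value) can be sketched like this. The field and value lists are purely hypothetical, and seeding the RNG makes every run reproducible, which helps with the reproducibility drawback discussed below:

```python
import random

# Hypothetical form model: the fields of the page and candidate fuzz values.
FIELDS = ["username", "password", "birthdate"]
VALUES = ["", "a" * 300, "'; DROP TABLE users; --", "+1", "t"]

def generate_cases(n, seed=0):
    """Produce n random (step_order, field, value) test cases."""
    rng = random.Random(seed)  # fixed seed -> reproducible case list
    cases = []
    for _ in range(n):
        step_order = rng.sample(FIELDS, len(FIELDS))  # rand_step: execution order
        field = rng.choice(FIELDS)                    # rand_field: field under test
        value = rng.choice(VALUES)                    # rand_value: value to enter
        cases.append((step_order, field, value))
    return cases

# Each case would then be replayed through Selenium (type the value into
# the field in the given order, submit, and assert on the response).
```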
That said, fuzz testing helps you scan your whole application's use-case value space within a limited execution time. It has been said that "a plague of new vulnerabilities emerged that affected popular client-side applications including Microsoft Internet Explorer, Microsoft Word and Microsoft Excel; a large portion of these vulnerabilities were discovered through fuzzing".
But fuzz testing does not come without drawbacks. One of these is the difficulty of reproducing a test case given all that randomness. You can overcome this limitation by doing one of the following:
Generating the test cases beforehand into a batch file to be used over a certain period of time, and applying this file gradually
Generating the test cases on the fly, together with logging down those cases
Logging down only the failed cases.
To say more on whether Selenium is black box or white box:
Definitions about black-box and white-box
Black box: checks whether one box (usually the whole app) delivers the correct outputs when fed with inputs. Theoretically, your application is bug-free if ALL possible input-output pairs are verified.
White box: checks the control flow of the source. Theoretically, your application is bug free if ALL execution paths are visited without problem.
But in real life you cannot verify ALL input-output pairs, nor visit ALL execution paths, because you always have limited resources in:
Time
Money
People
With Selenium, you mimic the user by entering a value or performing a certain click on a web application, and you check whether the browser gives you the behaviour you want. You don't know and don't care how the inner functionality of the web application actually works. That's why typical Selenium testing is black-box testing.
For a contract job, I need to digitize a lot of old, scanned-graphics-only plenary debate protocol PDFs from the Federal Parliament of Germany.
The problem is that most of these files have a two-column format:
Sample Protocol http://sert.homedns.org/img/btp12001.png
I would love to read your answers to the following questions:
How can I split the two columns before feeding them into OCR?
Which commercial or open-source OCR software or framework do you recommend, and why?
Please note that any tool, programming language, framework etc. is fine. Don't hesitate to recommend esoteric products or libraries if you think they are cut out for the job ^__^!!
UPDATE: These documents have already been scanned by the parliament o_O: sample (same as the image above). There are lots of them, and I want to deliver on the contract ASAP, so I can't go fetch print copies of the same documents and cut and scan them myself. There are just too many of them.
Best Regards,
Cetin Sert
Cut the pages down the middle before you scan.
It depends on which OCR software you are using. A few years ago I did some work with an OCR API - I can't quite remember the name, but I think there are lots of alternatives. Anyway, this API allowed me to define regions on the page to OCR. If you always know roughly where the columns are, you could use an SDK to map out parts of the page.
I use OmniPage 17 for such things. It has a batch mode too, where you put the documents into one folder, from which they are grabbed, and the results are put into another.
It auto-recognizes the layout, including columns, or you can set the default layout to columns.
You can set many options for how the output should look.
But try a demo first to see whether it works correctly. At the moment I have problems with ligatures in some of my documents, so words like "fliegen" come out as "fl iegen" and you have to correct the spelling yourself.
Take a look at http://www.wisetrend.com/wisetrend_ocr_cloud.shtml (an online, REST API for OCR). It is based on the powerful ABBYY OCR engine. You can get a free account and try it with a few of your images to see if it handles the 2-column format (it should be able to do it). Also, there are a bunch of settings you can play with (see API documentation) - you may have to tweak some of them before it will work with 2 columns. Finally, as a solution of last resort, if the 2-column split is always in the same place, you can first create a program that splits the input image into two images (shouldn't be very difficult to write this using some standard image processing library), and then feed the resulting images to the OCR process.
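The splitting program mentioned as a last resort is straightforward if the gutter is always roughly central: find the brightest vertical stripe near the middle of the page and cut there. A pure-Python sketch of the idea (here an "image" is a 2D list of grayscale pixels, 0 = black, 255 = white; with Pillow you would do the same over the real pixel data and then `crop()` the two halves):

```python
def find_gutter(image, margin=0.25):
    """Return the x position of the whitespace gutter between two columns.
    Only the middle band of the page (excluding `margin` on each side)
    is searched, so page borders don't win."""
    width = len(image[0])
    lo, hi = int(width * margin), int(width * (1 - margin))

    def column_brightness(x):
        # Whiter columns have a higher total; the gutter is the whitest.
        return sum(row[x] for row in image)

    return max(range(lo, hi), key=column_brightness)

# Splitting is then just slicing each row at the detected position:
# left  = [row[:split] for row in image]
# right = [row[split:] for row in image]
```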