Best approach for creating random keystrokes for load testing a database-backed quicksearch webapp using Apache JMeter

Context:
I am load testing a prototype enterprise web app that performs quick searches on a large dataset. It's backed by a database and uses jQuery DataTables, backed by a servlet, to narrow the results on each keystroke.
I want to find out how it behaves under load, measure response time, stability, and usability under various loads, and come up with an SLA. The load in this case would be a number of users logging in and typing various search strings simultaneously.
Tools:
I am using Apache JMeter for this.
Question:
To make my load tests truly random and eliminate the effect of caching at the database level (or anywhere else), I want the HTTP requests for each search to be random. I want to do something like this: send a character, wait, send another character, send a backspace, send one more character, send two backspaces, and so on.
What is the most elegant/efficient way of doing something like that using JMeter?
Right now I am looking into using a CSV dataset and reading random characters from a large file, but I'm wondering if there is a better way.
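To make the CSV idea concrete, a small script could pre-generate the keystroke sequences, one search prefix per row, for JMeter's CSV Data Set Config. A rough sketch in Python (the alphabet, backspace probability, and file name are placeholders I'd tune):

import csv
import random
import string

ALPHABET = string.ascii_lowercase        # placeholder search alphabet
BACKSPACE_PROB = 0.2                     # placeholder chance of a backspace per keystroke

def keystroke_prefixes(max_keys=10):
    """Simulate typing: return the search-box contents after each keystroke."""
    prefixes, current = [], ""
    for _ in range(max_keys):
        if current and random.random() < BACKSPACE_PROB:
            current = current[:-1]       # backspace
        else:
            current += random.choice(ALPHABET)
        if current:
            prefixes.append(current)
    return prefixes

with open("search_prefixes.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for _ in range(1000):                # one simulated search session per iteration
        for prefix in keystroke_prefixes():
            writer.writerow([prefix])

Each HTTP sampler would then read the next prefix as its search parameter, so consecutive requests replay a plausible typing sequence.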

You can generate the random search strings by using JMeter functions.
Specifically, look at __Random and __char.
Basically, you'd have something like ${__char(${__Random(97,122)})} to generate a single random lowercase letter (97-122 are the character codes for a-z).
I'd also recommend having a CSV file with the top 100 most popular search terms to test against.

I don't know your use case, but it seems unlikely that people are going to be typing in random characters. If I am correct, simulating random keystrokes could be just as misleading as using a very small set of search keywords.
Instead, you should locate or develop a set of keywords that people are likely to use - possibly by scanning the content they will be searching? Then use that to populate what users enter when they search.
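For example, a quick way to derive such a set is to scan the searchable content and keep the most frequent terms. A minimal sketch, assuming the content is available as plain-text files (the paths and stop-word list are illustrative):

import collections
import re
from pathlib import Path

STOPWORDS = {"the", "and", "of", "to", "a", "in", "is", "for"}  # trimmed for brevity

counts = collections.Counter()
for path in Path("content").glob("**/*.txt"):   # assumed plain-text dump of the content
    words = re.findall(r"[a-z]{3,}", path.read_text(errors="ignore").lower())
    counts.update(w for w in words if w not in STOPWORDS)

# Top 100 terms, one per line - e.g. for JMeter's CSV Data Set Config
Path("keywords.csv").write_text("\n".join(w for w, _ in counts.most_common(100)))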

Related

How to compare content between two web pages in different environments?

We are in the process of rebuilding a website from scratch based on an existing website. The new site is meant to be an identical copy, and as it contains many pages we need a way to compare content between the sites. Doing it manually is of course possible, but it takes a lot of time and carries a risk of human error.
I have seen services that offer this: you input two URLs, which are then analyzed, and discrepancies are presented. However, these cannot be used because our test environment is local (built in Sitecore).
Is there a way to solve this without making our test environment available online (which is not possible)? For example, does software exist for this, or alternatively some service where you can compare a web page that is online with one that is local?
Note that we're only looking for content comparison (not visual).
(Un)fortunately there are many ways to do this, but fortunately some of them are simple.
What I would do is:
Get a list of URLs for each site. If the sitemap is exhaustive you could use that; if it's not, you might want to run some Sitecore PowerShell to get the lists.
Given the lists (from files, the Sitecore API, or similar), write a program that visits each URL, grabs the text of the page after it has finished rendering, and saves it to disk (something like Selenium is good for this, and you can use any language). You'll want a folder structure like host/urlpart/urlpart/pagename.txt, essentially mirroring your content tree; see the sketch below.
Use a filesystem diff program like WinMerge to compare the two folders.
This is quick and dirty, but a good place to start.
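For step 2, a minimal sketch with Selenium in Python (the URL-list file, output folder, and browser driver are assumptions):

import pathlib
import urllib.parse
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()              # assumes a ChromeDriver on the PATH

for url in pathlib.Path("urls.txt").read_text().split():
    driver.get(url)
    text = driver.find_element(By.TAG_NAME, "body").text   # rendered page text
    parts = urllib.parse.urlparse(url)
    out = pathlib.Path("dump", parts.netloc, parts.path.strip("/") or "index")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.with_suffix(".txt").write_text(text)

driver.quit()

Run it once against each environment, then point WinMerge at the two dump folders.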

How to pass random parameters to SilkTest Workbench or Classic Record&Play Scenario

I am new to SilkTest and I don't have any scripting background. What I need to do is record some test cases and then play them back to check my system. After getting used to it, I plan to learn scripting and dive into it, but first things first.
What I need is to pass randomly generated parameters (or parameters read randomly from a text file, or pre-defined ones) into the recordings, so that different parameters are used every time I run the tests. For example, there is a component in which I type some letters and the component filters the results based on the text. Then I select one of the results. Now, instead of recording the same letters every time, how can I use randomly chosen parameters?
Thanks
What you are looking for is called Active Data in Silk Test.
It allows enhancing your visual tests with external data, for example from an Excel file.
ActiveData testing enables you to leverage existing data in external files as input for powerful, comprehensive application testing solutions. ActiveData testing enables you to perform multiple transactions against test applications using a different set of data for each transaction without writing complicated code or compromising existing data.
You can find an introduction to Active Data in the online documentation or in the tutorial video.
I have a question: what version of Silk Test are you using, and which client (Silk Test Workbench, Silk4NET, or Silk4J)? Each of these clients has the ability to receive parameters from an external source, whether from a command line or from an external data file.
You indicate that you want random data, but do you really mean random data or external data? If it is random data that you want, you probably need to use a random number/string generator for the client that you are working with (.NET code for Workbench and Silk4NET, Java code for Silk4J).
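Purely as an illustration of the generator idea (sketched here in Python; in Workbench or Silk4NET you'd write the equivalent in .NET code, and in Silk4J in Java):

import random
import string

def random_search_text(min_len=2, max_len=8):
    """Return a random lowercase string to type into the filter component."""
    length = random.randint(min_len, max_len)
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))

def random_parameter(path="parameters.txt"):
    """Pick a random pre-defined parameter from a text file, one per line."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    return random.choice(lines)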

Determining all DNS queries required to display a website

I need to create a list of all DNS queries required to display a large number of sites (ideally up to 1,000,000). The list needs to assign the queries to the page that required them.
Example: visiting google.com requires DNS queries for google.com, ssl.gstatic.com, apis.google.com and other hosts. My list would read something along the lines of
google.com:google.com,ssl.gstatic.com,apis.google.com,...
(exact format not relevant here)
I currently have two ideas on how to do this:
Set up a DNS server with logging, and build a script that visits a given list of domains using my DNS server as the resolver
Build a script that loads the source code of each site (think Python's urllib2, for example), parses all embedded content, and constructs a list of the queries that would be needed (see the sketch below)
Both ideas have problems though. Visiting 1,000,000 domains with a gap of 2 seconds between visits (to make it possible to assign queries to the visited site afterwards), plus about 1 second per page load (which is pretty optimistic), would take over 34 days, probably longer. To build a parser, on the other hand, I would need a complete list of all possible forms of embedded content that can result in a DNS query; I would also need to fetch some of the target URLs themselves (think iframes), and some content is impossible to check for further queries (think Flash content that connects to other servers).
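To make idea 2 concrete, here is a minimal sketch of such a parser (the tag/attribute list is deliberately incomplete, which is exactly the weakness just described):

import urllib.parse
import urllib.request
from html.parser import HTMLParser

class HostCollector(HTMLParser):
    """Collect hostnames from common embedded-content attributes."""
    ATTRS = {"src", "href", "data"}          # incomplete on purpose
    def __init__(self, base):
        super().__init__()
        self.base, self.hosts = base, set()
    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.ATTRS and value:
                host = urllib.parse.urlparse(
                    urllib.parse.urljoin(self.base, value)).hostname
                if host:
                    self.hosts.add(host)

def hosts_for(url):
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
    collector = HostCollector(url)
    collector.feed(html)
    return collector.hosts

print("google.com:" + ",".join(sorted(hosts_for("http://google.com"))))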
I'm kind of stuck here and would appreciate some input on how to deal with this. It would be possible to shorten the list of URLs to maybe 100,000, but anything less would dramatically reduce the usefulness of the result.
For context: I need this list for my bachelor thesis, which deals with an attack strategy against a proposed DNS privacy extension.
You can use PhantomJS to do this, as it provides an interface that will let you capture network requests and log them, something along the lines of this example.
You'd need to write some simple JavaScript, but since PhantomJS lets you script page loads asynchronously, it should be fairly easy to gather the data you need within a reasonable time.
There is a tool that can do this and produce a graphical representation. It is part of dnssec-tools and is called dnspktflow (DNS Packet Flow).
It may not do exactly what you want, but it is open source, so you can see how they do it.

How to access results of Sonar metrics for use with applications like PowerPivot

I'm trying to run a number of applications with known failure rates through Sonar, in the hope of deciding which metrics are most valuable in determining whether a particular application will fail. Ultimately I'll be making some sort of algorithm that looks at the outputs of whatever metrics I'm using and generates a score from 1 to 100. I've got about 21 applications put through Sonar, and the results are stored in a MySQL database. I originally planned to use PowerPivot to find relationships in the data, but it seems the formatting of the tables doesn't lend itself well to that. Other questions on Stack Overflow have told me that Sonar's tables are unformatted and that I should instead use the Web Service API to get the information. I'm unfamiliar with web service APIs and was unsuccessful in trying to do what I wanted by reading Sonar's API documentation.
From an answer to another question:
http://nemo.sonarsource.org/api/timemachine?resource=org.apache.cxf:cxf&format=csv&metrics=ncloc,violations_density,comment_lines_density,public_documented_api_density,duplicated_lines_density,blocker_violations,critical_violations,major_violations,minor_violations
This looks very similar to what I'd like to have, except that I'm only looking at each application once (I'm analyzing a sample of all the live applications on a grid), which means the timemachine service isn't really what I'm looking for. Would it be possible to generate a similar table, except that instead of the stats for a particular application per date, it showed the statistics for an application and all of its classes, etc.?
If you're not familiar with the WS API, you can also create your own Sonar plugin to achieve whatever you want: it is written in Java and executes on every analysis you run. This way, in the code of this custom plugin, you can do whatever you want: flush the metrics you need to an output file, push them into a third-party system, etc.
Just take a look at how to write a plugin (most probably you will create a Decorator). There are also concrete examples to get you started faster.
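If the WS API route from the question turns out to be enough, the resources service (the non-timemachine counterpart of the URL quoted above) can return one entry per descendant resource. A hedged sketch, assuming the classic /api/resources service of this Sonar generation and its JSON output:

import csv
import json
import urllib.request

# Assumed: the classic /api/resources web service of this Sonar generation;
# depth=-1 asks for every descendant resource of the project (packages, files, classes).
BASE = "http://nemo.sonarsource.org/api/resources"
METRICS = "ncloc,violations_density,comment_lines_density"
url = BASE + "?resource=org.apache.cxf:cxf&metrics=" + METRICS + "&depth=-1&format=json"

resources = json.loads(urllib.request.urlopen(url).read().decode("utf-8"))

# Flatten to one row per resource so PowerPivot gets a plain table.
with open("sonar_metrics.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["resource"] + METRICS.split(","))
    for res in resources:
        vals = {m["key"]: m.get("val") for m in res.get("msr", [])}
        writer.writerow([res["key"]] + [vals.get(m) for m in METRICS.split(",")])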

"Safely" allow users to search with SQL

For example, I've often wanted to search Stack Overflow with
SELECT whatever FROM questions WHERE
views * N + votes * M > answers AND NOT(answered) ORDER BY views;
or something like that.
Is there any reasonable way to allow users to use SQL as a search/filter language?
I see a few problems with it:
Accessing/changing stuff (a carefully set up user account should fix that)
SQL injection (given the previous point, the worst they should be able to do is get back junk and crash their session)
DoS attacks with pathological queries
What indexes do you give them?
Edit: I'd like to allow joins and whatnot as well.
Accessing/changing stuff
No problem, just run the query as a restricted user with permission only to SELECT
SQL injection
Just sanitize the query
DoS attacks
Time-out the query and throttle access by IP. I guess you can also throttle CPU usage on some servers.
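For example, a sketch with PostgreSQL and psycopg2 (the connection details and query are placeholders; other servers have similar knobs, e.g. MySQL's MAX_EXECUTION_TIME hint):

import psycopg2

conn = psycopg2.connect(dbname="app", user="readonly_search")  # SELECT-only role
conn.set_session(readonly=True)
cur = conn.cursor()
cur.execute("SET statement_timeout = '2s'")   # kill pathological queries

user_query = "SELECT 1"   # placeholder for the (validated) user-supplied SELECT
try:
    cur.execute(user_query)
    rows = cur.fetchmany(500)                 # cap the result size as well
except psycopg2.errors.QueryCanceled:
    rows = []                                 # query exceeded its time budget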
If you do SQL-encode your users' input (and make sure to remove all ; as well!), I see no huge safety flaw (other than that we're still handing nukes out to psychos...) in having three input boxes: one for the table, one for the columns, and one for the conditions. They won't be able to have strings in their conditions, but queries like your example should work. You will do the actual pasting together of the SQL statement, so you'll be in control of what is actually executed. If your setup is good enough, you'll be safe.
BUT, I wouldn't for my life let my user enter SQL like that. If you want to really customize search options, give either a bunch of flags for the search field, or a bunch of form elements that can be combined at will.
Another option is to invent some kind of "markup language", sort of like Markdown (the framework SO uses for formatting all these questions and answers...), that you can translate into SQL. Then you can make sure that only "harmless" SELECTs are performed, and you can protect user data, etc.
In fact, if you ever implement this, you should see if you can run the commands from a separate account on the SQL server which only has access to the very basic needs, and obviously only read access.
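A minimal sketch of the three-input-box idea with whitelisting layered on top (the table and column names are illustrative, and this is a coarse filter rather than a security guarantee):

import re

# Illustrative whitelists; in practice, generate them from your schema.
TABLES = {"questions"}
COLUMNS = {"views", "votes", "answers", "answered"}

# Conditions may contain whitelisted identifiers, numbers, and a few safe operators.
TOKEN = re.compile(r"[A-Za-z_]+|\d+|[+*/<>=(),-]")

def build_query(table, columns, condition):
    if table not in TABLES:
        raise ValueError("unknown table")
    cols = [c.strip() for c in columns.split(",")]
    if not all(c in COLUMNS for c in cols):
        raise ValueError("unknown column")
    tokens = TOKEN.findall(condition)
    # every character must belong to a recognised token (spaces aside)
    if "".join(tokens) != condition.replace(" ", ""):
        raise ValueError("illegal characters in condition")
    for t in tokens:
        if t[0].isalpha() and t.upper() not in ("AND", "OR", "NOT") and t not in COLUMNS:
            raise ValueError("unknown identifier: " + t)
    return "SELECT %s FROM %s WHERE %s" % (", ".join(cols), table, condition)

# e.g. build_query("questions", "views, votes",
#                  "views * 2 + votes * 3 > answers AND NOT(answered)")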
Facebook does this with FQL. See the blog post or presentation.
I just thought of a strong sanitization method that could be used to restrict what can be run.
Use MySQL and grab its lex/yacc files
use the lex file as is
gut the yacc file to only the things you want to allow
use action rules that spit out the input on success.
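If gutting MySQL's grammar is too heavy, a weaker variant of the same idea is to tokenize the statement with an off-the-shelf parser and whitelist what you see. A sketch using Python's sqlparse (it tokenizes rather than fully parses, so treat it as a first filter only):

import sqlparse
from sqlparse.tokens import Keyword

FORBIDDEN = {"INTO", "INSERT", "UPDATE", "DELETE", "DROP", "ALTER",
             "GRANT", "CREATE", "TRUNCATE", "UNION"}

def is_allowed(query):
    statements = sqlparse.parse(query)
    if len(statements) != 1:                 # no stacked queries
        return False
    stmt = statements[0]
    if stmt.get_type() != "SELECT":          # plain SELECTs only
        return False
    keywords = {t.value.upper() for t in stmt.flatten()
                if t.ttype is not None and t.ttype in Keyword}
    return not (keywords & FORBIDDEN)

# is_allowed("SELECT views FROM questions WHERE votes > 10")  -> True
# is_allowed("SELECT 1; DROP TABLE questions")                -> False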