How to flush out suspended WCF workflows from the instance store? - .net-4.0

We have identified the need to flush out several different workflows that have been suspended/persisted for a long time (i.e. hung instances). This is so that our test environment can be flushed clean before acceptance tests are re-run.
The dirty solution is to use a SQL script to remove records from the InstancesTable and other related tables in the database.
What's the proper solution?
These are WCF workflows.
Test rig is running XP.

Using AppFabric you can use the UI, or I assume PowerShell commands, to delete individual instances. For development and test purposes I normally just recreate the database by running the SqlWorkflowInstanceStoreSchema.sql script again.

Found a way to do it (thanks to Pablo Rotondo on MSDN):
http://www.funkymule.com/post/2010/04/28/how-to-resume-suspended-workflows-in-net-40.aspx
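If you would rather clear the hung instances from code than by re-running the schema scripts, a minimal sketch in the spirit of that article might look like the following. The workflow type (LongRunningWorkflow), the connection string and the way the instance ID is obtained are all placeholders for your own environment:

using System;
using System.Activities;
using System.Activities.DurableInstancing;
using System.Threading;

class FlushSuspendedInstance
{
    static void Main(string[] args)
    {
        // Point the store at the same database the service persists to.
        var store = new SqlWorkflowInstanceStore(
            @"Data Source=.\SQLEXPRESS;Initial Catalog=WorkflowInstanceStore;Integrated Security=True");

        // The Id of the hung instance (e.g. read from the instance store's Instances view).
        Guid instanceId = Guid.Parse(args[0]);

        // Must be constructed with the same workflow definition that created the instance.
        var app = new WorkflowApplication(new LongRunningWorkflow()) { InstanceStore = store };

        var done = new ManualResetEvent(false);
        app.Completed = e => done.Set();

        app.Load(instanceId);                            // rehydrate the persisted instance
        app.Terminate("Flushed before acceptance run");  // or Cancel() to run cancellation handlers
        done.WaitOne(TimeSpan.FromSeconds(30));

        Console.WriteLine("Instance {0} terminated.", instanceId);
    }
}

With the store's default InstanceCompletionAction (DeleteAll), terminating the instance also removes its rows from the instance store, which is effectively the clean version of the SQL delete.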

Related

Running multiple Kettle transformation on single JVM

We want to use pan.sh to execute multiple Kettle transformations. After exploring the script I found that it internally calls the spoon.sh script which runs in PDI. Now the problem is that every time a new transformation starts it creates a separate JVM for its execution (invoked via a .bat file); however, I want to group them to use a single JVM to overcome the memory constraints that the multiple JVMs are putting on the batch server.
Could somebody guide me on how I can achieve this, or share documentation/resources with me?
Thanks for the good work.
Use Carte. This is exactly what it is for. You can start up a server (on the local box if you like) and then submit your jobs to it. One JVM, one heap, shared resources.
The benefit of that is scalability: when your box becomes too busy, just add another one, also running Carte, and start sending some of the jobs to that other server.
There's an old but still current blog here:
http://diethardsteiner.blogspot.co.uk/2011/01/pentaho-data-integration-remote.html
As well as documentation on the Pentaho website.
Starting the server is as simple as:
carte.sh <hostname> <port>
There is also a status page, which you can use to query your carte servers, so if you have a cluster of servers, you can pick a quiet one to send your job to.
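If you want to script the "pick a quiet one" step, a small sketch like this (in C#, but the same idea works from any language) could poll each server's status page. The host list, the /kettle/status/?xml=Y URL and the default cluster/cluster credentials are assumptions to adjust for your own Carte setup:

using System;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

class CartePicker
{
    // Hypothetical pool of Carte servers; adjust hosts and ports to your rig.
    static readonly string[] Servers = { "http://etl-box-1:8081", "http://etl-box-2:8081" };

    static void Main()
    {
        var quietest = Servers
            .Select(server => new { Server = server, Running = CountRunning(server) })
            .OrderBy(x => x.Running)
            .First();

        Console.WriteLine("Send the next job to {0} ({1} transformations running)",
            quietest.Server, quietest.Running);
    }

    static int CountRunning(string server)
    {
        using (var client = new WebClient())
        {
            // Carte ships with cluster/cluster as its default user; change this for your install.
            client.Credentials = new NetworkCredential("cluster", "cluster");
            string xml = client.DownloadString(server + "/kettle/status/?xml=Y");

            // Crude heuristic: count status entries reported as Running in the status XML.
            return Regex.Matches(xml, ">Running<").Count;
        }
    }
}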

Why cleanup a DB after a test run?

I have several test suites that read and write data from a dedicated database when they are run. My strategy is to assume that the DB is in an unreliable state before a test is run and if I need certain records in certain tables or an empty table I do that setup before the test is run.
My attitude is not to clean up the DB at the end of each test suite, because each test suite should do its cleanup and setup before it runs. Also, if I'm trying to "visually" debug a test suite, it helps that the final state of the DB persists after the tests have completed.
Is there a compelling reason to cleanup a DB after your tests have run?
Depends on your tests, what happens after your tests, and how many people are doing testing.
If you're just testing locally, then no, cleaning up after yourself isn't as important, so long as you're consistently employing this philosophy AND you have a process in place to make sure the database is in a known-good state before doing something other than testing.
If you're part of a team, then yes, leaving your test junk behind can screw up other people/processes, and you should clean up after yourself.
In addition to the previous answer, I'd like to mention that this is more relevant when executing integration tests, since integrated modules work together and in conjunction with infrastructure such as message queues and databases, and each independent part has to work correctly with the services it depends on.
Cleaning up a DB after a test run helps you to Isolate Test Data. A best practice here is to use transactions for database-dependent tests (e.g. component tests) and roll back the transaction when done. Use a small subset of data to effectively test behavior. Consider it a Database Sandbox, using the Isolate Test Data pattern; e.g. each developer can use lightweight DML to populate his local database sandbox and expedite test execution.
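A minimal sketch of the roll-back idea, assuming NUnit and System.Transactions (the table and connection string are placeholders for your sandbox database):

using System.Data.SqlClient;
using System.Transactions;
using NUnit.Framework;

[TestFixture]
public class OrderRepositoryTests
{
    private TransactionScope _scope;

    [SetUp]
    public void OpenTransaction()
    {
        // Everything the test writes happens inside this ambient transaction.
        _scope = new TransactionScope();
    }

    [TearDown]
    public void RollBack()
    {
        // Disposing without calling Complete() rolls back, so the sandbox stays clean.
        _scope.Dispose();
    }

    [Test]
    public void InsertedOrder_CanBeReadBack()
    {
        using (var connection = new SqlConnection(@"Server=.\SQLEXPRESS;Database=SandboxDb;Integrated Security=True"))
        {
            connection.Open(); // enlists in the ambient TransactionScope

            new SqlCommand("INSERT INTO Orders (Id, Total) VALUES (1, 9.99)", connection).ExecuteNonQuery();

            var total = (decimal)new SqlCommand("SELECT Total FROM Orders WHERE Id = 1", connection).ExecuteScalar();
            Assert.AreEqual(9.99m, total);
        }
    }
}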
Another advantage is that you decouple your database: ensure that the application is backward and forward compatible with your database, so you can deploy each independently. Patterns like Encapsulate Table with View, and NoSQL databases, ensure that you can deploy two application versions at once without either of them throwing database-related errors. This was particularly successful in a project where it was imperative to access the database using stored procedures.
All this is actually one of the concepts that is used in Virtual test labs.
In addition to the above answers, I'll add a few more points:
The DB shouldn't be cleaned after a test because that's where you have your test data, test results and all the history, which can be referred to later on.
The DB should be cleaned only if you are changing some application setting to run your (or any specific) test, so that it doesn't impact other testers.

Testing an n-tier web application - should my test project have its own database?

In an n-tier web-app, should I be running integration tests against a different database, one dedicated to testing the code? Is it standard practice to test against the production database as well?
You should never run untested code on production. After all, you don't want to discover that it has a bug that wipes out all data. That's what tests are supposed to find. And you should not have test/staging data in the production system. It is good practice to dump the data out of production and load it into another environment for periodic testing with real-world data.
You should have a test database (not shared with production). It's a good idea to wipe out the data before every test.
You can have smoke tests that run in production. They will pretend to be a user (agent) and visit many pages, maybe even create things (with a special tag so you can find them again and delete them).
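To illustrate the "wipe out the data before every test" point above, here is a short sketch (NUnit assumed; the connection string and table names are placeholders) that puts the dedicated test database into a known-empty state before each test:

using System.Data.SqlClient;
using NUnit.Framework;

[TestFixture]
public class CustomerIntegrationTests
{
    // Dedicated test database -- never the production connection string.
    private const string TestDb = @"Server=.\SQLEXPRESS;Database=MyApp_Test;Integrated Security=True";

    [SetUp]
    public void WipeTestData()
    {
        using (var connection = new SqlConnection(TestDb))
        {
            connection.Open();
            // Delete children before parents to satisfy foreign key constraints.
            foreach (var table in new[] { "Orders", "Customers" })
                new SqlCommand("DELETE FROM " + table, connection).ExecuteNonQuery();
        }
    }

    [Test]
    public void NewCustomer_IsPersisted()
    {
        // arrange/act/assert against the now-empty test database
    }
}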
I'd rather think of a different database user with its own data set. The database schema should be the same. I'd never run tests on the production database with the same database user. Test logic shouldn't even be delivered to the client, as it may lead to severe security issues.
In my opinion you need a full production-like data set for testing purposes, to be able to test every single feature of your application. You would also need an empty database (without any business data) for application clients to have as an initial point on delivery. Such a dataset shouldn't be tested, as there is no data with which to test the business logic.

Can TeamCity tests be run asynchronously

In our environment we have quite a few long-running functional tests which currently tie up build agents and force other builds to queue. Since these agents are only waiting on test results they could theoretically just be handing off the tests to other machines (test agents) and then run queued builds until the test results are available.
For CI builds (including unit tests) this should remain inline as we want instant feedback on failures, but it would be great to get a better balance between the time taken to run functional tests, the lead time of their results, and the throughput of our collective builds.
As far as I can tell, TeamCity does not natively support this scenario so I'm thinking there are a few options:
Spin up more agents and assign them to a 'Test' pool. Trigger functional build configs to run on these agents (triggered by successful CI builds). While this seems the cleanest, it doesn't scale very well, as we then have the lead time of purchasing licenses, and we will often need to run tests in alternate environments, which would temporarily double (or more) the required number of test agents.
Add builds or build steps to launch tests on external machines, then immediately mark the build as successful so queued builds can be processed; then, when the tests are complete, mark the build as succeeded/failed. This relies on being able to update the results of a previous build (REST API perhaps?). It also feels ugly to mark something as successful and then update it as failed later, but we could always be selective in what we monitor so we only see the final result.
Just keep spinning up agents until we no longer have builds queueing. The problem with this is that it's a moving target. If we knew where the plateau was (or whether it existed) this would be the way to go, but our usage pattern means this isn't viable.
Has anyone had success with a similar scenario, or knows pros/cons of any of the above I haven't thought of?
Your description of the available options seems to be pretty accurate.
If you want live updates of the build's progress you will need to have one TeamCity agent "busy" for each running build.
The only downside here seems to be the agent licenses cost.
If the testing builds just launch processes on other machines, the TeamCity agent processes themselves can be run on a low-end machine and even many agents on a single computer.
An extension of your second scenario could be two build configurations instead of a single one: one would start the external process, and the other would be triggered on the external process's completion and then publish all the external process's results as its own. It could also have a snapshot dependency on the starting build to maintain the relation.
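For that second configuration, one way to "publish all the external process's results as its own" is a small console step that reads the results back from wherever the external machines drop them and re-emits them as TeamCity service messages on stdout. The FetchExternalResults helper and the result shape below are hypothetical:

using System;
using System.Collections.Generic;

class PublishExternalResults
{
    // Hypothetical shape of a result pulled back from the external test machines.
    class ExternalResult { public string Name; public bool Passed; public string Message; }

    static void Main()
    {
        foreach (var result in FetchExternalResults())
        {
            // TeamCity picks these service messages up from stdout and records
            // them as tests belonging to this build.
            Console.WriteLine("##teamcity[testStarted name='{0}']", Escape(result.Name));
            if (!result.Passed)
                Console.WriteLine("##teamcity[testFailed name='{0}' message='{1}']",
                    Escape(result.Name), Escape(result.Message));
            Console.WriteLine("##teamcity[testFinished name='{0}']", Escape(result.Name));
        }
    }

    // Service messages need |, ', [, ] and line breaks escaped with a leading '|'.
    static string Escape(string value)
    {
        return value.Replace("|", "||").Replace("'", "|'")
                    .Replace("[", "|[").Replace("]", "|]")
                    .Replace("\r", "|r").Replace("\n", "|n");
    }

    static IEnumerable<ExternalResult> FetchExternalResults()
    {
        // Placeholder: read results from wherever the external rig drops them
        // (shared folder, REST endpoint, database, ...).
        yield return new ExternalResult { Name = "Checkout.HappyPath", Passed = true };
        yield return new ExternalResult { Name = "Checkout.DeclinedCard", Passed = false, Message = "Timeout" };
    }
}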
For anyone curious, we ended up buying more agents and assigning them to a test pool. Investigation proved that it isn't possible to update build results (I can definitely understand why this ugliness wouldn't be supported out of the box).

How to perform stress test against sharepoint site using threads

I want to analyze the performance (and hence the weak points) of a SharePoint site through stress-test activity. What needs to be done is to call some methods exposed via a web service that do the following things inside the SharePoint site:
-create a new group
-add a content to the group
-add an attachment to the content
-delete the content
-delete the previously created group
What is required is a simulation of a situation where there are 4500 users trying to do these operations concurrently (at the same time or more realistically within a short timespan, for example within 5 seconds).
We want to register the execution time of each operation (web method, for example "create new group"), too. I thought I could simulate these operations via a console application using threads and stopwatches. Is there anyone who has encountered a similar problem and can give me any existing solutions or hints to do it "the right way"? For example, how can I ensure that all the threads start at the same instant? Thanks in advance.
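For the console-app idea in the question, a minimal sketch of synchronising the start and timing each call could use a Barrier (new in .NET 4) and a Stopwatch. The GroupServiceClient proxy and its CreateGroup method are placeholders for the real web-service client:

using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;

class SharePointStressTest
{
    static void Main()
    {
        const int users = 100;                 // scale towards 4500 once the basics work
        var timings = new ConcurrentBag<long>();
        var barrier = new Barrier(users);      // releases only when all threads have arrived
        var threads = new Thread[users];

        for (int i = 0; i < users; i++)
        {
            threads[i] = new Thread(() =>
            {
                var client = new GroupServiceClient();   // hypothetical web-service proxy

                barrier.SignalAndWait();                  // every thread blocks here, then all start together

                var watch = Stopwatch.StartNew();
                client.CreateGroup("LoadTestGroup-" + Guid.NewGuid());
                watch.Stop();

                timings.Add(watch.ElapsedMilliseconds);
            });
            threads[i].Start();
        }

        foreach (var t in threads) t.Join();
        Console.WriteLine("CreateGroup: {0} calls, average {1:F0} ms", timings.Count, timings.Average());
    }
}

As the answers below point out, a single machine cannot realistically push thousands of truly concurrent users, so past a point you would move to a load test rig or a dedicated tool rather than more threads.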
I have been a user of Visual Studio Load Testing for 2 years, and I find it very powerful and easy to use. You can run integration tests, navigation of a web site, simulated database load, ... in fact, everything. Because it is an MS application, it is also fully compatible with all MS products like SharePoint: it's easier to call a WCF service from a unit test than from another technology (how would you test a netTcpBinding otherwise?). You can also use the Visual Studio Profiler to instrument your code (and see which line of code is expensive, or even the ADO.NET interactions). You can also easily extend the load testing through its many extensibility points.
One important thing is that VS load testing is "intrusive". It will not only collect response times and request lengths, but also all performance counters, database queries, and so on. All these metrics are saved in a dedicated database such as SQL Express for reporting. There is an add-on for Excel.
Just one important note (applicable to all load testing solutions!):
You can run load tests from a developer machine or even a single dedicated machine, but you usually can't generate enough traffic to really see how the application responds (your machine cannot simulate 500 concurrent users because of limited CPU/memory/network). In order to simulate a lot of users, you'll set up what is known as a Load Test Rig.
A test rig is made up of a Test Controller machine and one or more Test Agent machines as shown in Figure 1. The controller manages and coordinates the agent machines and the agents generate load against the application. The test controller is also responsible for collecting performance monitor data from the servers under test and optionally from the test rig machines.
Here are some links :
MSDN
Dave's introduction
Not saying Visual Studio Load Testing is not a great tool, but there are tools, like Tsung and Eventlet (and many others), that can support many thousands of concurrent users.
Good luck.