Running integration/e2e tests on top of a Kubernetes stack - testing

I’ve been digging a bit into the way people run integration and e2e tests in the context of Kubernetes and have been quite disappointed by the lack of documentation and feedbacks. I know there are amazing tools such as kind or minikube that allow to run resources locally. But in the context of a CI, and with a bunch of services, it does not seem to be a good fit, for obvious resources reasons. I think there are great opportunities with running tests for:
Validating manifests or helm charts
Validating the well behaving of a component as part of a bigger whole
Validating the global behaviour of a product
The point here is not really about the testing framework but more about the environment on top of which the tests could be run.
Do you share my thought? Have you ever experienced running such kind of tests? Do you have any feedbacks or insights about it?
Thanks a lot

Interesting question and something that I have worked on over the last couple of months for my current employer. Essentially we ship a product as docker images with manifests. When writing e2e tests I want to run the product as close to the customer environment as possible.
Essentially to solve this we have built scripts that interact with our standard cloud provider (GCloud) to create a cluster, deploy the product and then run the tests against it.
For the major cloud providers this is not a difficult tasks but can be time consuming. There are a couple of things that we have learnt the hard way to keep in mind while developing the tests.
Concurrency, this may sound obvious but do think about the number of concurrent builds your CI can run.
Latency from the cloud, don't assume that you will get an instant response to every command that you run in the cloud. Also think about the timeouts. If you bring up a product with lots of pods and services what is the acceptable start up time?
Errors causing build failures, this is an interesting one. We have seen errors in the build due to network errors when communicating with our test deployment. These are nearly always transitive. It is best to avoid these making the build fail.
One thing to look at is GitLab are providing some documentation on how to build and test images in their CI pipeline.

On my side I use travis-ci. I build my container image inside it, then run k8s with kind (https://kind.sigs.k8s.io/) inside travis-CI, and then launch my e2e tests.
Here is some additional information on this blog post: https://k8s-school.fr/resources/en/blog/k8s-ci/
And the scripts to install kind inside travis-ci in 2 lines: https://github.com/k8s-school/kind-travis-ci.git. It allows lots of customization on the k8s side (enable psp, change CNI plugin)
Here is an example: https://github.com/lsst/qserv-operator
Or I use Github Actions CI, which allows to install kind easily: https://github.com/helm/kind-action and provide plenty of features, and free worker nodes for open-source projects.
Here is an example: https://github.com/xrootd/xrootd-k8s-operator
Please note that Github action workers may not scale for large build/e2e tests. Travis-CI scales pretty well.
In my understanding, this workflow coud be moved to an on-premise gitlab CI where your application can interact with other services located inside your network.
One interesting thing is that you do not have to maitain a k8s cluster for your CI, kind will do it for you!

Related

Have you found a way of persisting the odoo core modules in v14 different form a volume? And so, it is possible deploying odoo in gcloud run?

I want to deploy odoo as cheap as possible. I tried with gcloud sql (15-30€/m) + cloud run. But after some minutes passed the odoo interface shows me a white screen with so many logs in the console similar to this:
GET 404 1.04 KB24 ms Chrome 91 https://bf-dev3-u7raxlu3nq-ew.a.run.app/web/content/290-f328144/1/website.assets_editor.css
My interpretation is that, as cloud run is stateless, and the web static files seems to be stored in the core module, after the container is killed this information is lost. As I've been one month working looking for a solution, before trying any another way of deploying I ask the community: Have you found a way of persisting the odoo core modules in v14 different form a volume? And so, it is possible deploying odoo in gcloud run?
Here I listed all the ideas that I tried:
First, I thought that this css files were store in the werkzeug session, so I tried two addons that stored this session in a place different from the filestore. These addons were camptocamp odoo-cloud-platform-14.0/session-redis and misc-addons-13.0/base_session_store_psql. But, then the problem persisted.
Then I read that the static css and js file generated in the web editor are stored in odoo as attachments, and the addons misc-addons-13.0/ir_attachment_s3 could store these files in s3. But, although I configured this addon the problem persisted.
Next, I found this link describing needing to regenerate assets so them to be stored in the db. But, although I did that the problem persisted.
Finally, I thought to deploy odoo in other ways. The way of directly in a vm seems to be the more minimalistic and standard, and so seem to have the more chances to work, although it will be difficult to implement gitops. It can be deployed containers in the vm through docker compose what will help deploying updates. Gke anthos seems to implement gitops too and seems to persist volumes, but in the description it shows gke anthos is stateless. Finally, there's the way of deploying in a k8s cluster, this way will implement containers and allow autoscaling vs the docker compose way in a vm. But it's true it seems to be more expensive and more difficult to implement. Regarding seem to be more expensive it is thought of trying little working nodes machines so the cost stays small during the night. Regarding the difficulty of deploying, it is desired to implement gitops so it seems argo or other should be added. Also, I heard gke autopilot has a good free tier and is easier to deploy.
Thanks in advance :)
Cloud Run isn't the good solution for that. Indeed, if the werkzeug session is persisted in memory, the same client isn't sure to access to the same instance each time, and thus to lost the file even in the middle of a session.
The best solution is to use VM with sticky session configuration. You can use old school deployment on Compute Engine, or Cloud Native solution with GKE/K8S. It's more or less the same cost if you have only 1 cluster (the first one is free)
Just a correction about GKE Anthos. I think you talk about Cloud Run on Anthos, and yes, it's like Cloud Run but use KNative on GKE to manage the containers, and it's also serverless. But GKE can handle stateful deployment, as you need with odoo

should E2E be run in production

Apologies if this is open ended.
Currently my team and I are working on our End to end (E2E) testing strategy and we seem to be unsure whether we should be executing our E2E tests against our staging site or on our production site. We have gathered that there are pros and cons to both.
Pro Staging Tests
wont be corrupting analytics data on production.
Can detect failure before hitting production.
Pro Production Tests
Will be using actual components of the system including database and other configurations and may capture issues with prod configs.
I am sometimes not sure if we are conflating E2E with monitoring services (if such a thing exists). Does someone have an opinion on the matter?
In addition when are E2E tests run? Since every member of the system is being tested there doesn't seem to be an owner of the test suite making it hard to determine when the E2E should be run. We were hoping that we could run E2E in some sort of a pipeline before hitting production. Does that mean I should run these test when either the Front end / backend changes? Or would you rather just run the execution of the E2E on an interval regardless of any change?
In my team experience has shown that test automation is better done in a dedicated test server periodically and new code is deployed only after being tested several sessions in a row successfully.
Local test run is for test automation development and debugging.
Test server - for scheduled runs, because - no matter how good you are at writing tests, at some point they will become many hours in a row to run and you need a reliable statistic of them over time with fake data that wont break production server.
I disagree with #MetaWhirledPeas on the point of pursuing only fast test runs. Your priority should always be better coverage and reduced flakiness. You can always reduce the run time by paralelization.
Running in production - I have seen many situations when test results in a funny state of the official site that makes the company reputation go down. Other dangers are:
Breaking your database
Making purchases from non existent users and start losing money
Creating unnecessary strain on the official site api, that make the client experience during the run bad or even cause the server to stop completely.
So, in our team we have a dedicated manual tester for the production site.
You might not have all the best options at your disposal depending on how your department/environment/projects are set up, but ideally you do not want to test in production.
I'd say the general desire is to use fake data as often as possible, and curate it to cover real-world scenarios. If your prod configs and setup are different than your testing environment, do the hard work to ensure your testing environment configuration matches prod as much as possible. This is easier to accomplish if you're using CI tools, but discipline is required no matter what your setup may be.
When the tests run is going to depend on some things.
If you've made your website and dependencies trivial to spin up, and if you are already using a continuous integration workflow, you might be able to have the code build and launch tests during the pull request evaluation. This is the ideal.
If you have a slow build/deploy process you'll probably want to keep a permanent test environment running. You can then launch the tests after each deployment to your test environment, or run them ad hoc.
You could also schedule the tests to run periodically, but usually this indicates that the tests are taking too long. Strive to create quick tests to leave the door open for integration with your CI tools at some point. Parallelization will help, but your biggest gains will come from using cy.request() to fly through repetitive tasks like logging in, and using cy.intercept() to stub responses instead of waiting for a service.

How do I run cucumber tests when testing an rest or graphql API

This is my first time playing with cucumber and also creating a suite which tests and API. My questions is when testing the API does it need to be running?
For example I've got this in my head,
Start express server as background task
Then when that has booted up (How would I know if that happened?) then run the cucumber tests?
I don't really know the best practises for this. Which I think is the main problem here sorry.
It would be helpful to see a .travis.yml file or a bash script.
I can't offer you a working example. But I can outline how I would approach the problem.
Your goal is to automate the verification of a rest api or similar. That is, making sure that a web application responds in the expected way given a specific question.
For some reason you want to use Cucumber.
The first thing I would like to mention is that Behaviour-Driven Development, BDD, and Cucumber are not testing tools. The purpose with BDD and Cucumber is to act as a communication tool between those who know what the system should do, those who write code to make it happen, and those who verify the behaviour. That’s why the examples are written in, almost, a natural language.
How would I approach the problem then?
I would verify the vast majority of the behaviour by calling the methods that make up the API from a unit test or a Cucumber scenario. That is, verify that they work properly without a running server. And without a database. This is fast and speed is important. I would probably verify more than 90% of the logic this way.
I would verify the wiring by firing up a server and verify that it is possible to reach the methods verified in the previous step. This is slow so I would do as little as possible here. I would, if possible, fire up the server from the code used to implement the verification. I would start the server as a part of the test setup.
This didn’t involve any external tools. It only involved your programming language and some libraries. The reason for doing it this way is that I want to to be as portable as possible. The fewer tools you use, the easier it gets to work with something.
It has happened that I have done some of the setup in my build tool and had it start a server before running the integration tests. This is usually more heavy weight and something I avoid if possible.
So, verify the behaviour without a server. Verify the wiring with a server. It is important to only verify the wiring in this step. The logic has been verified earlier, there is no need to repeat it.
Speed, as in a fast feedback loop, is very important. Building and testing the entire system should, in a good world, take seconds rather than minutes.
I have a working example if you're interested (running on travis).
I use docker-compose to launch the API & required components such as database, then I run cucumber-js tests against the running stack.
docker-compose is also used for local development & testing.
I've also released a library to help writing cucumber for APIs, https://github.com/ekino/veggies.

Integration testing of func in OSGI container

I'm using FuseESB to run my app, which is essensially OSGI container (Felix), i'd like to figure approach to test my OSGI services in integration mode (including outer dependencies like DB, outer services, etc). First on a thought is ability to run specific bundle into container which involve all app services into running tests defined in this bundle. Can somebody help with that kind of issue? THANKS!
There are differnt ways of testing this.
Since FuseESB is based on Apache Karaf you might test with the apache karaf-pax-exam tools to test a complete container setup automatically.
Another way of just testing your OSGi bundles in a OSGi container is to use pax-exam directly. Last but not least if you just want to test your service look-up functionality you might test with pojosr, it's quite nice for testing but has it's limits especially if you depend on container features.
That said you'll find information at the following pages:
Pax-Exam
Apache Karaf
sample how Pax-Web uses pax-exam in its iTests
You may find http://www.javabeat.net/2011/11/how-to-test-osgi-applications/ helpful as an overview of the various OSGi test options. Configuring PAX-Exam to pull in your whole FuseESB container and get all your app services present will involve certain challenges, but once you've got the knack it can be very handy.
bndtools as the possibility to do JUnit tests inside the container.

Newbie question about Continuous integration and selenium tests,

I'm very new to C.I. but I have recently inherited a project where Team City has just been implemented and I'm slowly getting my head around it. One thing we would like to do is run some Selenium Tests as part of the build process. I've created the selenium tests and can run them successfully via nunit-console on my development machine. The build server builds the project and then deploys it (A web forms application as it happens) to a staging server.
Before each selenium test we set the database to a known state, i.e. to only have certain records in place - that way each test is independent of the others. The problem is the staging server will be used by real "human" testers so this would cause them a problem with the database continually being reset (records being removed etc.) The question is should I really also deploy the application to a virtual directory on the build server and run the selenium tests against that and only deploy to the staging server if those tests pass?
Or have I got this stuff completely wrong? If so how do you do it in your organisation?
I suggest that you do not mix your automated and manual testing by allowing your testers to access the server that is staged for your automated tests. This can cause false negatives both in your automated and your manual tests. These 'bugs' are indeterministic and more than likely never reproducible (a very bad news). This will cause you a lot of unnecessary 'bug reports' and build failures.
So here is what you can do...
In addition to your current setup, you can create an extra staged server for your manual testers. This is the least you should do. You should probably create several of them, one for each tester.
And here comes the rant...
In my current project we recently found out that our testers (we had ~10 of them) reused one server. They claimed that since our app is going to have multiple concurrent users, it was a good idea that while they are testing the individual functionalities, they are also testing how these functionalities act while multiple users are working on the same server. WRONG!
If multiple users are a concern, there should be test cases for the specific concerns. If functionality#1 can interfere with functionality#2, it should be specifically tested and not just be 'tested-by-luck'.
Before this was explained to our manual testers, we had many false bug reports due to the fact that one tester was simply stepping on another tester's toes. (e.g. tester1 deleted a record that tester2 introduced to the system, etc...). This created a lot of unnecessary bug reports and these bugs were never reproducible.
Sorry about the rant, I hoped this still helps :)