General Questions on Helm Chart Tests

I'm working with Helm Chart testing, and I'm a bit confused. I have a couple of (dumb) questions:
What does the Test Pod do? Why even make a Test Pod? Why can't the original Pod do testing?
How does the Test Pod communicate with the original Pod?
Does the test Pod tell the original Pod to perform certain tasks?
If I want to check ports and hosts (ping a server), does the test Pod tell the original Pod to run a ping command, or does the test Pod perform the ping itself? And if the test Pod is the one running the tests, how can you be sure that the original Pod is working correctly (i.e., when it is not doing anything itself)?
Perhaps some theory on Helm tests would be very beneficial for me. If there is a document with detailed information on Helm tests, please let me know.

Imagine you want to know if Stack Overflow is working. So you run curl https://stackoverflow.com. 99.9% of the time, some magic with load balancers and databases and what not happens and you'll get a list of questions, but 0.1% of the time you get a "something is wrong" page. That is, by making an HTTP GET request from somewhere outside the application, you can still get a high-level view of whether it's working.
You can do as much or as little testing as you want from a test pod. Sending ICMP ECHO packets, as ping(1) does, isn't really that interesting, since it only verifies that the Service exists and the Kubernetes network layer is functional; it doesn't reach your application at all. A simple "are you alive" call can be interesting, but readiness and liveness probes do that too. I might try to write something that depends on the whole application and its dependencies being available, a little bit more than just seeing whether an HTTP endpoint responds.
The test tools you need probably aren't part of your main application. As an extreme example of this, many Go images are built on images FROM scratch that literally include nothing at all besides the application binary, so you need a separate image that contains curl and a shell. You could imagine running a larger test system that needs some sort of language runtime that doesn't match the normal language.
The test pod "makes normal requests" to the application service. Frequently this would be via HTTP, but it depends on what exactly the application does.
The test pod could make HTTP POST requests that cause some action, and verify its result. You could also argue that the test pod shouldn't modify data during a production rollout. Beyond the ordinary network-request path, there's not a way for the test pod to cause the application to "perform a task".
The test pod can make a known set of (HTTP) requests and verify the results that come back. If it gets 503 Service Unavailable that's almost certainly bad, for example. It can verify that a response has the right format as per an OpenAPI spec. There probably isn't a path for a random other pod to cause the application to run an arbitrary command and that wouldn't help you demonstrate the application is working correctly.
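To make that concrete, here is a minimal sketch of what a Helm test pod could look like. The "helm.sh/hook": test annotation is what marks the pod as a test, and it only runs when you invoke helm test with the release name after installing; the file name, service name, port, and /healthz path below are placeholders for whatever your chart actually exposes, not part of the question.
YAML
# templates/tests/http-check.yaml (hypothetical file in the chart)
apiVersion: v1
kind: Pod
metadata:
  name: "{{ .Release.Name }}-http-check"
  annotations:
    # This annotation makes Helm treat the pod as a test;
    # it is only created when you run "helm test <release-name>".
    "helm.sh/hook": test
spec:
  restartPolicy: Never
  containers:
    - name: http-check
      # A small image that contains curl and a shell, separate from the app image.
      image: curlimages/curl:8.5.0
      # Service name, port and path are placeholders for what your chart exposes.
      command: ["curl", "--fail", "http://{{ .Release.Name }}-myapp:80/healthz"]
The pod simply exits zero if the request succeeds and non-zero if it does not, which is all Helm needs to report the test as passed or failed.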

Related

Meteor, MUP & Cloudflare causing redirect/refresh loop

I have a meteor webapp that has been around for a few years. It hasn't been updated particularly often, so the version is a little bit old (Meteor 1.6.1.4); however, it runs locally without an issue, and I currently have a version of it deployed without issue on a Digital Ocean droplet, with Mongo on AtlasDB and the DNS on Cloudflare.
However, I've been running into an issue deploying updates with the Meteor Up (MUP) tool. On my production server, when I run mup deploy with my latest code, the deployment works and validates successfully; however, the live site now loops on page load. The page completes loading (including a call to the Stripe API front-end library), the images load, and as soon as that has happened the same page is loaded again, over and over. This happens on every page of the webapp. There are no errors logged in the console.
I'm almost sure this isn't a codebase issue, as I have a staging version of this same app running on an identical-spec droplet which I can deploy to without issue. The only difference between production and staging is that staging uses a LetsEncrypt cert generated by MUP and production uses a Cloudflare-issued cert. I can't remember exactly the reason for this, as it was the outcome of my last round of troubleshooting, which did result in a successful deployment. The LetsEncrypt configuration with MUP seemed to be problematic when I last set everything up. Either way, there is no obvious good reason why this error should occur.
So I think the issue is most likely something to do with Cloudflare, but I don't have many clues as to what. I've tried clearing the full cache after deploying. I cannot disable the Cloudflare proxy, as I then get an insecure-connection error.
For my next steps I'm thinking of setting up another staging droplet, but with Cloudflare in front in the same way, to see if I can get a non-critical, replicable version of the same error. From there I'm not sure what I would do to debug and fix it. I was also wondering if configuring a load balancer for this webapp might be smart at this point, though if it does SSL passthrough I wonder whether it would actually solve the underlying issue. That would also not really answer this question, but rather just avoid it. I'm also considering updating the Meteor app as far as I can, to see if there is any chance the codebase is part of the issue.
Any suggestions?

Running integration/e2e tests on top of a Kubernetes stack

I’ve been digging a bit into the way people run integration and e2e tests in the context of Kubernetes and have been quite disappointed by the lack of documentation and feedback. I know there are amazing tools such as kind or minikube that let you run resources locally. But in the context of a CI, and with a bunch of services, they do not seem to be a good fit, for obvious resource reasons. I think there are great opportunities with running tests for:
Validating manifests or helm charts
Validating that a component behaves well as part of a bigger whole
Validating the global behaviour of a product
The point here is not really about the testing framework but more about the environment on top of which the tests could be run.
Do you share my thoughts? Have you ever run these kinds of tests? Do you have any feedback or insights about it?
Thanks a lot
Interesting question and something that I have worked on over the last couple of months for my current employer. Essentially we ship a product as docker images with manifests. When writing e2e tests I want to run the product as close to the customer environment as possible.
To solve this, we have built scripts that interact with our standard cloud provider (GCloud) to create a cluster, deploy the product, and then run the tests against it.
For the major cloud providers this is not a difficult task, but it can be time consuming. There are a couple of things that we have learnt the hard way to keep in mind while developing the tests.
Concurrency: this may sound obvious, but do think about the number of concurrent builds your CI can run.
Latency from the cloud: don't assume that you will get an instant response to every command that you run in the cloud. Also think about timeouts. If you bring up a product with lots of pods and services, what is an acceptable start-up time?
Errors causing build failures: this is an interesting one. We have seen errors in the build due to network errors when communicating with our test deployment. These are nearly always transient. It is best to avoid letting these fail the build.
One thing to look at is that GitLab provides some documentation on how to build and test images in their CI pipeline.
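To make the cluster-per-build approach described above more concrete, here is a hedged sketch of what such a CI job could look like (GitLab CI syntax here, but the same idea applies to other CI systems); the zone, node count, deployment name, and test script are placeholders for illustration, not the author's actual scripts.
YAML
# Hypothetical .gitlab-ci.yml job; every name below is a placeholder.
e2e-tests:
  image: google/cloud-sdk:latest   # assumed to provide gcloud and kubectl
  script:
    # Create a disposable cluster; suffixing with the pipeline ID avoids
    # collisions between concurrent builds.
    - gcloud container clusters create e2e-$CI_PIPELINE_ID --num-nodes=3 --zone=europe-west1-b
    - gcloud container clusters get-credentials e2e-$CI_PIPELINE_ID --zone=europe-west1-b
    # Deploy the product and give the rollout a generous timeout; cloud
    # provisioning is not instant.
    - kubectl apply -f manifests/
    - kubectl rollout status deployment/my-product --timeout=15m
    # Run the actual test suite against the deployed stack.
    - ./run-e2e-tests.sh
  after_script:
    # Always tear the cluster down, even when the tests fail.
    - gcloud container clusters delete e2e-$CI_PIPELINE_ID --zone=europe-west1-b --quiet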
On my side I use Travis CI. I build my container image inside it, then run k8s with kind (https://kind.sigs.k8s.io/) inside Travis CI, and then launch my e2e tests.
Here is some additional information on this blog post: https://k8s-school.fr/resources/en/blog/k8s-ci/
And here are the scripts to install kind inside Travis CI in two lines: https://github.com/k8s-school/kind-travis-ci.git. It allows lots of customization on the k8s side (enabling PSPs, changing the CNI plugin).
Here is an example: https://github.com/lsst/qserv-operator
Or I use GitHub Actions, which makes it easy to install kind: https://github.com/helm/kind-action. It provides plenty of features, and free worker nodes for open-source projects.
Here is an example: https://github.com/xrootd/xrootd-k8s-operator
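In the same spirit, a minimal GitHub Actions workflow using helm/kind-action could look roughly like this; the image name, manifests path, deployment name, and test command are assumptions for illustration, not taken from the repositories above.
YAML
# Hypothetical .github/workflows/e2e.yaml; names below are placeholders.
name: e2e
on: [push, pull_request]
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build the container image
        run: docker build -t myapp:ci .
      - name: Create a throwaway kind cluster
        uses: helm/kind-action@v1
        with:
          # Name the cluster "kind" so that "kind load" below targets it by default.
          cluster_name: kind
      - name: Load the image into the kind cluster
        run: kind load docker-image myapp:ci
      - name: Deploy and wait for the rollout
        run: |
          kubectl apply -f manifests/
          kubectl rollout status deployment/myapp --timeout=5m
      - name: Run the e2e tests
        run: ./run-e2e-tests.sh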
Please note that GitHub Actions workers may not scale for large builds/e2e tests. Travis CI scales pretty well.
In my understanding, this workflow could be moved to an on-premises GitLab CI where your application can interact with other services located inside your network.
One interesting thing is that you do not have to maintain a k8s cluster for your CI; kind will do it for you!

How do I run Cucumber tests when testing a REST or GraphQL API

This is my first time playing with Cucumber and also creating a suite which tests an API. My question is: when testing the API, does it need to be running?
For example, I've got this in my head:
Start the Express server as a background task.
Then, when that has booted up (how would I know if that happened?), run the Cucumber tests?
I don't really know the best practices for this, which I think is the main problem here, sorry.
It would be helpful to see a .travis.yml file or a bash script.
I can't offer you a working example. But I can outline how I would approach the problem.
Your goal is to automate the verification of a REST API or similar. That is, making sure that a web application responds in the expected way given a specific question.
For some reason you want to use Cucumber.
The first thing I would like to mention is that Behaviour-Driven Development, BDD, and Cucumber are not testing tools. The purpose of BDD and Cucumber is to act as a communication tool between those who know what the system should do, those who write code to make it happen, and those who verify the behaviour. That’s why the examples are written in an almost natural language.
How would I approach the problem then?
I would verify the vast majority of the behaviour by calling the methods that make up the API from a unit test or a Cucumber scenario. That is, verify that they work properly without a running server. And without a database. This is fast and speed is important. I would probably verify more than 90% of the logic this way.
I would verify the wiring by firing up a server and verify that it is possible to reach the methods verified in the previous step. This is slow so I would do as little as possible here. I would, if possible, fire up the server from the code used to implement the verification. I would start the server as a part of the test setup.
This didn’t involve any external tools. It only involved your programming language and some libraries. The reason for doing it this way is that I want it to be as portable as possible. The fewer tools you use, the easier it gets to work with something.
It has happened that I have done some of the setup in my build tool and had it start a server before running the integration tests. This is usually more heavyweight and something I avoid if possible.
So, verify the behaviour without a server. Verify the wiring with a server. It is important to only verify the wiring in this step. The logic has been verified earlier, there is no need to repeat it.
Speed, as in a fast feedback loop, is very important. Building and testing the entire system should, in a good world, take seconds rather than minutes.
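Since the question asked for a .travis.yml, here is a rough sketch of how the CI job itself could start the Express server in the background and wait for it before running Cucumber (closer to the heavier-weight setup mentioned above than to in-test startup); the npm scripts, port, and /health path are assumptions about the project, not taken from the question.
YAML
# Hypothetical .travis.yml; the npm scripts, port and health path are assumptions.
language: node_js
node_js:
  - "node"
install:
  - npm ci
script:
  # Start the Express server in the background.
  - npm start &
  # Poll until the server answers on its (assumed) health endpoint, up to ~30 seconds.
  - for i in $(seq 1 30); do curl --silent --fail http://localhost:3000/health && break; sleep 1; done
  # Run the Cucumber features against the running server.
  - npx cucumber-js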
I have a working example if you're interested (running on Travis).
I use docker-compose to launch the API and required components such as the database, then I run cucumber-js tests against the running stack.
docker-compose is also used for local development & testing.
I've also released a library to help with writing Cucumber tests for APIs: https://github.com/ekino/veggies.
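For illustration, the docker-compose setup described above might be as small as this; the images, port, and environment variables are placeholders rather than the author's actual stack.
YAML
# Hypothetical docker-compose.yml for running the API and its database in CI.
version: "3.8"
services:
  db:
    image: postgres:13
    environment:
      POSTGRES_PASSWORD: test
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgres://postgres:test@db:5432/postgres
    depends_on:
      - db
After docker-compose up -d (and a short wait for the containers to become ready), the cucumber-js suite can point its HTTP client at http://localhost:3000; docker-compose down tears everything down afterwards.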

Running rspec tests (via guard) on remote instance

Probably my use case is specific, but I'm sure I'm not the only one.
I have a quite big Rails application, full of RSpec/Cucumber tests. Usually it takes like 30-40 minutes to run everything from scratch on an Intel i5. Yes, we are using Guard, so it doesn't run everything from the very beginning each time. But it's annoying anyway, and I want to distribute the load somehow.
I also have another development workstation with an i7, and my idea is to run the Guard loop on it. So I need something to automate running RSpec/Cucumber tests via Guard on the remote machine, but the general behaviour should stay the same: I change something, and Guard runs the tests for the changed part on the remote workstation without any additional action on my side. I don't want to push to the repo during development; of course we are using CI, but a local CI would not be very reasonable. And of course we are using parallel_tests, so my question is not about sharing load between CPU cores.
Ideas and suggestions are very welcome.
You could share the files with the fast computer (via SMB, for example), run the tests on the remote computer, and check the results via SSH?
You could mount your project working directory on the Remote machine and start Guard there, preferably over SSH so you see the console output. In addition you could use the GNTP notifier and send notifications from the remote machine to your development machine:
Ruby
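# In the Guardfile on the remote machine; 'development.local' is the local workstation that should receive the notifications.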
notification :gntp, :host => 'development.local', :password => 'secret'

How to test a Cocoa Touch app for the case when the network fails while downloading a file?

My iOS application, among its features, downloads files from a specific server. This downloading occurs entirely in the background while the user is working in the app. When a download is complete, the resource associated with the file appears on the app screen.
My users report some misbehavior involving missing resources that I could not reproduce. Some side information leads me to suspect that the problem is caused by the download of the resource's file being aborted midway. The app is then left with a partially downloaded file that never gets completed.
To confirm the hypothesis, to make sure any fix works, and to test for the network randomly vanishing under my feet, I would like to simulate the loss of the network in my test environment: the test server is web sharing on my development Mac, and the test device is the iOS Simulator running on the same Mac.
Is there a more convenient way to do that, than manually turning web sharing off on a breakpoint?
Depending on how you're downloading your file, one possible option would be to set the callback delegate to nil halfway through the download. It would still download the data, but your application would simply stop receiving callbacks. That said, I don't know if that's how the application would behave if it truly dropped the connection.
Another option would be to temporarily point the download request at some random file on an external web server, then halfway through just disconnect your computer from the internet. I've done that to test network connectivity issues and it usually works. The interesting problem in your case is that you're downloading from your own computer, so disconnecting won't help. This would just be so you can determine the order of callbacks within the application when this happens (does it make any callbacks at all? In what order?), so that you can simulate that behavior when actually pointed at your test server.
Combine both options together, I guess, to get the best solution.