Best practice for TypeORM migrations in a serverless app

We use TypeORM in a serverless application, and now we are thinking about how to run migrations in production.
In a classical application, the migration run command is part of the CI/CD pipeline. But if we copy this pattern to a serverless application, we end up with 100 CI/CD pipelines (one for each Lambda function), and when we update 100 Lambdas at once, 100 migration scripts run concurrently.
That doesn't look good IMHO. I hope there is a better practice for serverless apps.

I have a reasonably large serverless application with a lot of functions, and I treat the DB as separate from the functions: I have a central db folder with all the TypeORM entities and migrations in it, and a pipeline that triggers on changes within the migrations folder. It runs before deployment of any functions and simply applies any new migrations.
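As a very rough sketch of what that central setup can look like (the database type, paths, and environment variable names below are illustrative assumptions, not my exact layout):

// db/data-source.ts - shared TypeORM config used by the migrations pipeline
import "reflect-metadata";
import { DataSource } from "typeorm";

export const AppDataSource = new DataSource({
  type: "postgres",                        // assumption: Postgres
  url: process.env.DATABASE_URL,
  entities: ["db/entities/*.ts"],
  migrations: ["db/migrations/*.ts"],
  migrationsTableName: "typeorm_migrations",
});

// db/run-migrations.ts - the single step the migrations pipeline executes,
// so no individual function deployment ever runs migrations itself
import { AppDataSource } from "./data-source";

AppDataSource.initialize()
  .then(async (ds) => {
    const applied = await ds.runMigrations({ transaction: "each" });
    console.log(`Applied ${applied.length} new migration(s)`);
    await ds.destroy();
  })
  .catch((err) => {
    console.error("Migration run failed", err);
    process.exit(1);
  });

The individual function deployments never call runMigrations themselves; they just assume the migration step has already run.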
This setup hasn't worked out too badly for me; I just need to be careful when writing migrations that drop columns, rename fields, etc., knowing there will be a brief lag between migration completion and function deployment, so the 'old' code could be running against the migrated DB for a minute or two before the deployment of the 'new' function code finishes.
PS: If you find a better solution I'd be really keen to hear it; I have no idea what 'best practices' there are for this sort of thing in the serverless world.

How to test my pipeline before changing it?

When changing pipelines for my company, I often see a pipeline break under some specific condition we did not anticipate. We use YAML files to describe the pipelines (Azure DevOps).
We have multiple scenarios, such as:
Pipelines are run by automatic triggers, by other pipelines, and manually
Pipelines share the same templates
There are IF conditions for some jobs/steps based on parameters (user input)
In the end, I keep thinking of testing all scenarios before merging changes; we could create scripts to do that. But it's infeasible to actually RUN all scenarios because it would take forever, so I wonder how to test them without running them. Is that possible? Do you have any ideas?
Thanks!
I already tried the Preview endpoints from the Azure REST API, which are good, but they only validate the input, such as variables and parameters. We also needed to check which steps actually run and which variables are set in them.
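For what it's worth, as far as I can tell the same preview call can also return the expanded finalYaml, which gets closer to showing which steps survive the templates and IF conditions. A rough sketch (the organization, project, pipeline id, and parameter names are placeholders):

// Ask Azure DevOps to expand a YAML pipeline without queuing a real run,
// then inspect the resulting finalYaml for the steps/variables we expect.
const org = "my-org";                      // placeholder
const project = "my-project";              // placeholder
const pipelineId = 42;                     // placeholder
const pat = process.env.AZURE_DEVOPS_PAT ?? "";

async function previewPipeline(templateParameters: Record<string, string>): Promise<string> {
  const res = await fetch(
    `https://dev.azure.com/${org}/${project}/_apis/pipelines/${pipelineId}/preview?api-version=7.1-preview.1`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Basic " + Buffer.from(":" + pat).toString("base64"),
      },
      body: JSON.stringify({ previewRun: true, templateParameters }),
    }
  );
  if (!res.ok) throw new Error(`Preview failed: ${res.status}`);
  const run = await res.json();
  return run.finalYaml as string;          // fully expanded pipeline YAML
}

// Example: check that a conditional job is present for a given parameter value
previewPipeline({ environment: "qa" }).then((yaml) =>
  console.log(yaml.includes("DeployToQA") ? "QA job expanded" : "QA job missing")
);

That still only exercises the YAML expansion, not the behaviour of the steps, but it's cheap enough to run for each parameter combination before merging.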
As far as I know (I am still new to our ADO solutions), we have to fully run/schedule the pipeline to see that it works, and then wait a day or so for the scheduler to complete the automatic executions. At that point I have some failing pipelines for a couple of days that I need to fix.
I do get emails when a pipeline fails via something like this in the JSON that holds the metadata to create a job:
"settings": {
"name": "pipelineName",
"email_notifications": {
"on_failure": [
"myEmail#email.com"
],
"no_alert_for_skipped_runs": true
},
There's an equivalent extension, linked below, that can be added, but I have not done it this way and cannot verify that it works:
Azure Pipelines: Notification on Job failure
https://marketplace.visualstudio.com/items?itemName=rvo.SendEmailTask
I am not sure what actions your pipeline performs, but if jobs are being scheduled on external compute such as Databricks, there should be an email alert system you can use to detect failures.
Other than that, if you have multiple environments (dev, QA, prod), you could test in a non-production environment.
Or, if you have a dedicated storage location that is only for testing a pipeline, use that for the first few days and then re-point the pipeline at the real location once it has completed a few test runs.

Liquibase incremental snapshots

We've got a rather interesting use case where we're using Liquibase to deploy a database for our application, but we're not actually in control of the database. This means that we've got to add in a lot of extra logic around each time we run Liquibase to avoid encountering any errors during the actual run. One way we've done that is by generating snapshots of what the DB should look like for each release of our product and then comparing that snapshot with the running DB to confirm it's in a compatible state. The snapshot files for our complete database aren't gigantic, but if we have to keep a full one for every possible release, our software package could accumulate a lot of dead weight over time.
We've looked at using the Linux patch command to create delta files, since the differences between these snapshots will typically be very small (e.g. one column change), but the issue is the generated IDs in the snapshot, which are not consistent across runs:
"snapshotId": "aefa109",
"table": "liquibase.structure.core.Table#aefa103"
Is there any way to force the IDs to be consistent or attack this problem in a different way?
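One direction we've been sketching (purely hypothetical on our side, not validated) is normalizing those generated IDs before diffing, so the delta files only contain real schema differences:

// Replace Liquibase's run-specific snapshot IDs with deterministic placeholders
// before creating/applying patches. The file names and the exact ID format here
// are assumptions based on what our snapshots look like.
import { readFileSync, writeFileSync } from "fs";

function normalizeSnapshot(path: string): void {
  const raw = readFileSync(path, "utf8");
  const idMap = new Map<string, string>();

  // Matches the id both in "snapshotId": "aefa109" and in references like
  // "liquibase.structure.core.Table#aefa103".
  const normalized = raw.replace(
    /("snapshotId":\s*"|#)([0-9a-f]{6,8})/g,
    (_match: string, prefix: string, id: string) => {
      if (!idMap.has(id)) idMap.set(id, `ID_${idMap.size}`); // stable: order of first appearance
      return prefix + idMap.get(id)!;
    }
  );

  writeFileSync(path.replace(/\.json$/, ".normalized.json"), normalized);
}

normalizeSnapshot("snapshot-release-1.2.3.json");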
Thanks!
Perhaps we should change how we think about PROD deployments. When I read:
This means that we've got to add in a lot of extra logic around each time we run Liquibase to avoid encountering any errors during the actual run.
This is sort of an anti-pattern in the world of Liquibase. Typically, Liquibase is used in a CI/CD pipeline and deployments of SQL are done on "lower environments" to practice for the PROD deployment (which many do not have control over, so your situation is a common one).
When we try to accommodate possible errors during a PROD deployment, I feel we are already in a bad place with our deployment automation. We should have been testing the deploys on lower environments that look like PROD.
For example, your DB pipeline could look like this:
DEV -> QA -> PROD
Create the SQL for the deployment in a changelog
DEV & QA seeded with a restore from the current state of PROD (maybe minus the row data)
You would have full control in DEV (the wild west)
Less control in QA (typically only QA has access)
Iterate until you have no errors in your DEV & QA environments
Deploy to PROD
If you still have errors, I would argue that you must root-cause them and resolve the issues so that you end up with a pipeline that is automatable.
Hope that helps,
Ronak

Why cleanup a DB after a test run?

I have several test suites that read and write data from a dedicated database when they run. My strategy is to assume that the DB is in an unreliable state before a test runs; if I need certain records in certain tables, or an empty table, I do that setup before the test runs.
My attitude is not to clean up the DB at the end of each test suite, because each test suite should do its own cleanup and setup before it runs. Also, if I'm trying to "visually" debug a test suite, it helps that the final state of the DB persists after the tests have completed.
Is there a compelling reason to cleanup a DB after your tests have run?
It depends on your tests, what happens after your tests, and how many people are doing testing.
If you're just testing locally, then no, cleaning up after yourself isn't as important, so long as you're consistently employing this philosophy AND you have a process in place to make sure the database is in a known-good state before doing something other than testing.
If you're part of a team, then yes, leaving your test junk behind can screw up other people/processes, and you should clean up after yourself.
In addition to the previous answer, I'd also like to mention that this is most applicable when executing integration tests, since integrated modules work together and in conjunction with infrastructure such as message queues and databases, and each part must work correctly with the services it depends on.
This
cleanup a DB after a test run
helps you to isolate test data. A best practice here is to use transactions for database-dependent tests (e.g. component tests) and roll back the transaction when done. Use a small subset of data to test the behavior effectively. Think of it as a database sandbox, using the Isolate Test Data pattern. For example, each developer can use a lightweight DML script to populate their local database sandbox and expedite test execution.
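A minimal sketch of the transaction-per-test idea, assuming TypeORM and a Jest-style runner (the connection details and the "Order" entity are placeholders):

// Each test runs inside a transaction that is rolled back afterwards, so the
// database is left exactly as it was before the test.
import { DataSource, QueryRunner } from "typeorm";

const dataSource = new DataSource({
  type: "postgres",                        // placeholder connection details
  url: process.env.TEST_DATABASE_URL,
  entities: ["src/entities/*.ts"],
});

let runner: QueryRunner;

beforeAll(async () => {
  await dataSource.initialize();
});

beforeEach(async () => {
  runner = dataSource.createQueryRunner();
  await runner.connect();
  await runner.startTransaction();
});

afterEach(async () => {
  await runner.rollbackTransaction();      // undo everything the test wrote
  await runner.release();
});

afterAll(async () => {
  await dataSource.destroy();
});

it("creates an order", async () => {
  // "Order" is a placeholder entity name registered in the DataSource
  await runner.manager.insert("Order", { id: 1, status: "new" });
  const found = await runner.manager.findOneBy("Order", { id: 1 });
  expect(found).not.toBeNull();
});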
Another advantage is that you decouple your database: make sure the application is backward and forward compatible with your database so you can deploy each independently. Patterns like Encapsulate Table with View, and NoSQL databases, ensure that you can deploy two application versions at once without either of them throwing database-related errors. This was particularly successful in a project where it was imperative to access the database using stored procedures.
All of this is one of the concepts used in virtual test labs.
In addition to the above answers, I'll add a few more points:
The DB shouldn't be cleaned after a test when it holds the test data, test results, and history that you may want to refer to later.
The DB should be cleaned if you changed an application setting to run your (or any) specific test, so that it doesn't impact other testers.

Using IronWorker while reusing my existing work

My website is hosted on AWS Elastic Beanstalk (PHP). I use Yii Framework as an MVC.
A while ago I wanted to run a SQL query every day. I looked up how to run crons on Beanstalk, and it seemed complicated to merge the concepts of cloud and cron. I ran into IronWorker (http://www.iron.io/worker) and managed to create a worker that is currently doing its job fine.
Today I want to run a more complex cron: look for notifications in my database, decide whether to send an email, build an email template, and send the email (via AWS SES).
From what I understand, worker files are supposed to be self-contained items, with everything they need to work.
However, I have invested a lot of time and effort in building my MVC. I have complex models, verifications, an email templating engine, etc...
It seems very difficult to reuse the work I've done in an IronWorker. Even if I managed to port all of my code to a worker (which seems like a great deal of work), it means that any time I change my main code I need to make sure the worker also gets those changes. I would effectively have a "branch" of my code, and even more so if I create more workers in the future.
What is the correct approach?
Short-term, you could likely just use the scheduling capabilities in IronWorker and have the worker hit an endpoint in your application. The endpoint will then trigger the operations to run within your app environment.
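As a rough sketch of that thin-worker idea (the endpoint URL and token name are placeholders; the notification and email logic stays behind that endpoint in your Yii app):

// The scheduled worker does nothing but call back into the application, so all
// the models, verifications and email templating remain in one codebase.
async function main(): Promise<void> {
  const res = await fetch("https://example.com/cron/process-notifications", {
    method: "POST",
    headers: { Authorization: `Bearer ${process.env.CRON_TOKEN ?? ""}` },
  });
  if (!res.ok) {
    throw new Error(`Cron endpoint returned ${res.status}`);
  }
  console.log("Notification job triggered");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});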
Longer-term, we do suggest you look at more of a service-oriented approach, whereby you break your application up to be more loosely coupled and distributed. Here's a post on the subject; the advantages are many, especially around scalability and development agility.
https://blog.heroku.com/archives/2013/12/3/end_monolithic_app
You can also take a look at this Yii extension:
http://www.yiiframework.com/extension/yiiron/
We certainly don't want you to rewrite your app unnecessarily, but there are likely areas where you can decouple. I suggest creating a worker directory and making an effort to write the workers to be self-contained. That way, you can run them in a different environment and just pass payloads to the workers. (Push queues can also be used to push to these workers.) Once you get used to distributed async processing, it's a pretty easy process to manage.
(Note: I work at Iron.io)

How to flush out suspended WCF workflows from the instance store?

We have identified the need to flush out several different workflows that have been suspended/persisted for a long time (i.e. hung instances), so that our test environment can be flushed clean before acceptance tests are re-run.
The dirty solution is to use a SQL script to remove records from the InstancesTable and other related tables in the database.
What's the proper solution?
These are WCF workflows.
Test rig is running XP.
Using AppFabric you can use the UI, or I assume PowerShell commands, to delete individual instances. For development and test purposes I normally just recreate the database by running the SqlWorkflowInstanceStoreSchema.sql script again.
Found a way to do it (thanks to Pablo Rotondo on MSDN):
http://www.funkymule.com/post/2010/04/28/how-to-resume-suspended-workflows-in-net-40.aspx