How to create a Pipeline with Gitlab CI, just for the affected packages with Turborepo - gitlab-ci

So, I am in the process of integrating Turborepo into our Node.js (React, Next, Node) monorepo, which uses GitLab CI. The thing is, the example in the docs doesn't quite cover what I want.
For reference here is what they have in their Docs:
image: node:latest

# To use Remote Caching, uncomment the next lines and follow the steps below.
# variables:
#   TURBO_TOKEN: $TURBO_TOKEN
#   TURBO_TEAM: $TURBO_TEAM

stages:
  - build

build:
  stage: build
  script:
    - npm install
    - npm run build
    - npm run test
We have a few stages, beyond what their example shows:
install
build
package
What I would ideally like is to use Turborepo and GitLab downstream pipelines so that the stages run as follows:
The install stage should run when the root package.json has changed.
The build and package stages should run just for the affected packages (i.e., if shared-lib is changed, then shared-lib should run, as well as its two consumers app-a and app-b, in parallel).
I read the docs, and I can somehow make the downstream job run, but not just for the affected packages; it runs for all of them instead. The main problem is how to read the affected packages and their consumers, and run only those.
I read that with the latest version I can use the --dry commands to read those. But even assuming that works reliably (which, from my testing, it doesn't always), how can I put those packages as downstream jobs in GitLab?
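Here is the rough, untested direction I am experimenting with: one job runs turbo's dry run to list the affected packages, writes a child pipeline YAML with one build job per package, and a second job triggers it as a downstream pipeline. The jq path (.packages), the --filter='...[origin/main]' range, and the generated job layout are my assumptions from the docs, not a verified setup:

generate-pipeline:
  stage: build
  image: node:latest
  script:
    - apt-get update -qq && apt-get install -y -qq jq   # jq is not in the node image by default
    - npm ci
    # list everything affected since the default branch, including dependents;
    # assumes origin/main is available in the CI checkout (may need a deeper fetch)
    - npx turbo run build --filter='...[origin/main]' --dry=json > turbo-dry.json
    # write one child job per affected package (the root package "//" excluded)
    - |
      echo "stages: [build]" > child-pipeline.yml
      for pkg in $(jq -r '.packages[] | select(. != "//")' turbo-dry.json); do
        printf '"build:%s":\n  stage: build\n  script:\n    - npx turbo run build --filter=%s\n' "$pkg" "$pkg" >> child-pipeline.yml
      done
  artifacts:
    paths:
      - child-pipeline.yml

build-affected:
  stage: package
  trigger:
    include:
      - artifact: child-pipeline.yml
        job: generate-pipeline
    strategy: depend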

Related

Gitlab CI: create dist folder inside repository?

I just recently started using gitlab CI to automate some build/deploy steps. It works perfectly to build docker images etc, but I was wondering if it's possible to create a folder in the repository during a build step? For example I'm now making an npm utility package, but I'm just importing it in my other projects via a private gitlab repo (using deploy token), but the code of the util package is written in es6 and needs to be transpiled to commonJS to be used in the other packages. Manually I can run npm run build and it will output a dist folder with the transpiled code.
I was trying (and researching) if it's possible to automate this build process using .gitlab-ci but so far I couldn't find anything.
Anyone know how I can achieve this and/or if this is possible?
Thanks in advance!
Not sure if I got your question correctly, so add more details if not.
When your CI build creates new folders or files, they are written to the task runner's file system (no surprise here, I assume).
If you want to access these files from Gitlab's web UI you can define them as artifacts in your build job (see https://docs.gitlab.com/ee/user/project/pipelines/job_artifacts.html)
Your build job would look something like this (pseudo code written from memory, not tested on GitLab):
build:
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 week
UPDATE: If you want to upload the build artifact to an NPM registry, you could just build and push together.
build:
  script:
    - npm run build
    - npm publish <PARAMETERS>

Cache npm install Task in VSTS

I have configured a private agent in VSTS and have installed npm there globally. When I run npm install through my build task, it still installs the NPM packages for every build, which takes an awful lot of time - approximately 12 minutes.
How can I cache the NPM installations so that the build time is reduced?
We use npm-cache. npm-cache is a node module that calculates a hash of your package.json file; for every hash it creates a zip folder on your build server with the contents of node_modules. npm install is then reduced to extracting a zip on every build (of course, only in case you didn't actually change package.json).
The idea is: the first time, the tool downloads the npm packages and saves them locally; the second time, if package.json has not changed, it takes the packages from the local disk and copies them to the build agent folder. Only if package.json has changed does it download the packages from the internet.
Install the npm-cache on the build machine:
npm install npm-cache -g
In the build definition, add a Command Line task (Tool: C:\Windows\User\AppData\Roaming\npm\npm-cache, or just npm-cache if you add the tool to the PATH environment variable; Arguments: install npm; Working folder: $(Build.SourcesDirectory), or wherever package.json is located).
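If you use a YAML build definition instead of the classic editor, the equivalent step might look roughly like this (a sketch, untested; it assumes npm-cache is installed globally on the agent and available on the PATH):

steps:
# run npm-cache instead of npm install; skips the download when package.json is unchanged
- script: npm-cache install npm
  workingDirectory: $(Build.SourcesDirectory)
  displayName: 'Install dependencies via npm-cache'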
MS has finally implemented this feature (currently in beta) https://learn.microsoft.com/en-us/azure/devops/pipelines/caching/index?view=azure-devops#nodejsnpm
From there:
variables:
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: CacheBeta@0
  inputs:
    key: $(Build.SourcesDirectory)/package-lock.json
    path: $(npm_config_cache)
  displayName: Cache npm

- script: npm ci
Unfortunately, we cannot cache the NPM installations, as there is no such built-in feature for now.
However, there is already a user voice submitted to suggest the feature: Improve hosted build agent performance with build caches, and it seems the VSTS team is actively working on this now...
For now, you can try to Speed Up NPM INSTALL on Visual Studio Team Services.
Use Cache task
Caching is added to a pipeline using the Cache pipeline task. This task works like any other task and is added to the steps section of a job.
With the following configuration:
pool:
  name: Azure Pipelines
steps:
- task: Cache@2
  inputs:
    key: 'YOUR_WEB_DIR/package.json'
    path: 'YOUR_WEB_DIR/node_modules/'
- task: Npm@1
  inputs:
    command: 'install'
    workingDir: 'YOUR_WEB_DIR/frontend'
You can use the key YOUR_WEB_DIR/package-lock.json too, but be aware that this file might be changed by a later step such as npm install, so the hash will change as well.

Vue.js + git build process

I use vue.js + vue-cli + webpack to build my applications. During development I will run npm run dev to have webpack continuously watch my sources, compile everything, and reload the browser. To create production build, I can simply run npm run build. I would like to do this in a way that when I make a git commit, if my sources have changed, the build is created automatically.
My current approach is to simply use git pre and post commit hooks to automatically run npm run build and add the built files to the commit. This has the following downsides:
Even if other parts of the repo are changed, I re-run the build process for the Vue app, and it takes a very long time.
It makes resolving merge conflicts nearly impossible.
It creates a lot of cruft in the repo, ballooning its size
Typically I use a Vue.js frontend with a Django backend in the same repo, and deploy to Heroku or similar via a git push. What other methods are out there for accomplishing this task that don't have the above downsides?
Write a script in the package.json scripts section with something like
build && git commit -m "Build commit"
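For example, a minimal sketch of such a script (assuming your existing build script writes to dist/; the release name is just an example):

{
  "scripts": {
    "release": "npm run build && git add dist && git commit -m \"Build commit\""
  }
}

Then npm run release builds and commits in a single step.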

How to run build in local machine with drone.io

Does the build have to run on the drone.io server? Can I run the build locally? Since developers need to pass the build before pushing code to GitHub, I am looking for a way to run the build on a developer's local machine. Below is my .drone.yml file:
pipeline:
  build:
    image: node:latest
    commands:
      - npm install
      - npm test
      - npm run eslint
  integration:
    image: mongo-test
    commands:
      - mvn test
It includes two Docker containers. How do I run the build against this file with drone? I looked at the drone CLI, but it doesn't work the way I expected.
@BradRydzewski's comment is the right answer.
To run builds locally you use drone exec. You can check the docs.
Extending on his answer, you must execute the command in the root of your local repo, exactly where your .drone.yml file is. If your build relies on secrets, you need to feed these secrets through the command line using the --secret or --secrets-file option.
When running a local build, there is no cloning step. Drone will use your local git workspace and mount it in the step containers. So, if you check out some other commit/branch/whatever during the execution of the local build, you will mess things up because Drone will see those changes. So don't update your local repo while the build is running.
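A minimal usage sketch of the above (the repository path and secrets file name are just examples; the option name follows the answer above):

# run from the root of the local repository, where .drone.yml lives
cd ~/projects/my-repo
drone exec

# if the build relies on secrets, pass them in
drone exec --secrets-file=secrets.txt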

How to avoid reinstalling dependencies for each job in Gitlab CI

I'm using Gitlab CI 8.0 with gitlab-ci-multi-runner 0.6.0. I have a .gitlab-ci.yml file similar to the following:
before_script:
  - npm install

server_tests:
  script: mocha

client_tests:
  script: karma start karma.conf.js
This works but it means the dependencies are installed independently before each test job. For a large project with many dependencies this adds a considerable overhead.
In Jenkins I would use one job to install dependencies then TAR them up and create a build artefact which is then copied to downstream jobs. Would something similar work with Gitlab CI? Is there a recommended approach?
Update: I now recommend using artifacts with a short expire_in. This is superior to cache because it only has to write the artifact once per pipeline whereas the cache is updated after every job. Also the cache is per runner so if you run your jobs in parallel on multiple runners it's not guaranteed to be populated, unlike artifacts which are stored centrally.
Gitlab CI 8.2 adds runner caching which lets you reuse files between builds. However I've found this to be very slow.
Instead I've implemented my own caching system using a bit of shell scripting:
before_script:
  # unique hash of required dependencies
  - PACKAGE_HASH=($(md5sum package.json))
  # path to cache file
  - DEPS_CACHE=/tmp/dependencies_${PACKAGE_HASH}.tar.gz
  # Check if cache file exists and if not, create it
  - if [ -f $DEPS_CACHE ];
    then
      tar zxf $DEPS_CACHE;
    else
      npm install --quiet;
      tar zcf - ./node_modules > $DEPS_CACHE;
    fi
This will run before every job in your .gitlab-ci.yml and only install your dependencies if package.json has changed or the cache file is missing (e.g. first run, or file was manually deleted). Note that if you have several runners on different servers, they will each have their own cache file.
You may want to clear out the cache file on a regular basis in order to get the latest dependencies. We do this with the following cron entry:
@daily find /tmp/dependencies_* -mtime +1 -type f -delete
EDIT: This solution was recommended in 2016. In 2021, you might consider the caching docs instead.
A better approach these days is to make use of artifacts.
In the following example, the node_modules/ directory is immediately available to the lint job once the build stage has completed successfully.
build:
  stage: build
  script:
    - npm install -q
    - npm run build
  artifacts:
    paths:
      - node_modules/
    expire_in: 1 week

lint:
  stage: test
  script:
    - npm run lint
From docs:
cache: Use for temporary storage for project dependencies. Not useful for keeping intermediate build results, like jar or apk files. Cache was designed to be used to speed up invocations of subsequent runs of a given job, by keeping things like dependencies (e.g., npm packages, Go vendor packages, etc.) so they don’t have to be re-fetched from the public internet. While the cache can be abused to pass intermediate build results between stages, there may be cases where artifacts are a better fit.
artifacts: Use for stage results that will be passed between stages. Artifacts were designed to upload some compiled/generated bits of the build, and they can be fetched by any number of concurrent Runners. They are guaranteed to be available and are there to pass data between jobs. They are also exposed to be downloaded from the UI. Artifacts can only exist in directories relative to the build directory and specifying paths which don’t comply to this rule trigger an unintuitive and illogical error message (an enhancement is discussed at https://gitlab.com/gitlab-org/gitlab-ce/issues/15530 ). Artifacts need to be uploaded to the GitLab instance (not only the GitLab runner) before the next stage job(s) can start, so you need to evaluate carefully whether your bandwidth allows you to profit from parallelization with stages and shared artifacts before investing time in changes to the setup.
So, I use cache. When I don't need to update the cache (e.g., the build folder in a test job), I use policy: pull (see here).
I prefer to use cache because it removes the files when the pipeline has finished.
Example
image: node

stages:
  - install
  - test
  - compile

cache:
  key: modules
  paths:
    - node_modules/

install:modules:
  stage: install
  cache:
    key: modules
    paths:
      - node_modules/
  after_script:
    - node -v && npm -v
  script:
    - npm i

test:
  stage: test
  cache:
    key: modules
    paths:
      - node_modules/
    policy: pull
  before_script:
    - node -v && npm -v
  script:
    - npm run test

compile:
  stage: compile
  cache:
    key: modules
    paths:
      - node_modules/
    policy: pull
  script:
    - npm run build
I think it's not recommended, because all jobs of the same stage can be executed in parallel.
First, all jobs of build are executed in parallel.
If all jobs of build succeed, the test jobs are executed in parallel.
If all jobs of test succeed, the deploy jobs are executed in parallel.
If all jobs of deploy succeed, the commit is marked as success.
If any of the previous jobs fails, the commit is marked as failed and no jobs of a further stage are executed.
I have read that here:
http://doc.gitlab.com/ci/yaml/README.html
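For illustration, a minimal sketch of that behaviour (job names and scripts are made up):

stages:
  - build
  - test

# both build jobs run in parallel because they share the build stage
build:app-a:
  stage: build
  script: npm run build -- app-a

build:app-b:
  stage: build
  script: npm run build -- app-b

# starts only after both build jobs have succeeded
test:all:
  stage: test
  script: npm test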
Solved a problem with a symbolic link to a folder outside the working directory. The solution looks like this:
# .gitlab-ci.yml
before_script:
  - New-Item -ItemType SymbolicLink -Path ".\node_modules" -Target "C:\GitLab-Runner\cache\node_modules"
  - yarn
after_script:
  - (Get-Item ".\node_modules").Delete()
I know this is a rather dirty solution, but it saves a lot of time in the build process and extends the storage life.
GitLab introduced caching to avoid redownloading dependencies for each job.
The following Node.js example is inspired from the caching documentation.
image: node:latest

# Cache modules in between jobs
cache:
  key: $CI_COMMIT_REF_SLUG
  paths:
    - .npm/

before_script:
  - npm ci --cache .npm --prefer-offline

server_tests:
  script: mocha

client_tests:
  script: karma start karma.conf.js
Note that the example uses npm ci. This command is like npm install, but designed to be used in automated environments. You can read more about npm ci in the documentation and the command line arguments you can pass.
For further information, check Caching in GitLab CI/CD and the cache keyword reference.