Did "npm install" become deterministic in npm 7? - npm

Here https://github.blog/2021-02-02-npm-7-is-now-generally-available/
it's said:
The lockfile v2 unlocks the ability to do deterministic and reproducible builds to produce a package tree.
But I wonder is it the default behavior now for npm 7? That is, if there is a package-lock.json will npm install update top-most packages with imprecise versions like ^1.0.0 from package.json or it will always work the same way as yarn does?
If npm install is deterministic now, will I be right if I say that npm ci is mostly an equivalent of
rm -rf node_modules && npm install
with some additional checks?

Short Answer:
Yes!
Longer Answer:
Provided you have a package-lock.json or a yarn.lock file, both npm or yarn, respectively, do yield deterministic results.
One thing to note here is that yarn, using yarn.lock file, however, yields deterministic builds only for a specific version of yarn.
Yarn installs are guaranteed to be deterministic given a single
combination of yarn.lock and Yarn version. It is possible that a different version of Yarn will result in a different tree layout on disk.
Whereas npm's algorithms allow it to yield deterministic results even for different versions of the npm because npm tree building contract is entirely specified by the package-lock.json file.
You can find a more detailed explanation of the two in this Blog

Related

What is the difference between "npm install" and "npm ci"?

I'm working with continuous integration and discovered the npm ci command.
I can't figure what the advantages are of using this command for my workflow.
Is it faster? Does it make the test harder, okay, and after?
From the official documentation for npm ci:
In short, the main differences between using npm install and npm ci are:
The project must have an existing package-lock.json or npm-shrinkwrap.json.
If dependencies in the package lock do not match those in package.json, npm ci will exit with an error, instead of updating the package lock.
npm ci can only install entire projects at a time: individual dependencies cannot be added with this command.
If a node_modules is already present, it will be automatically removed before npm ci begins its install.
It will never write to package.json or any of the package-locks: installs are essentially frozen.
Essentially,
npm install reads package.json to create a list of dependencies and uses package-lock.json to inform which versions of these dependencies to install. If a dependency is not in package-lock.json it will be added by npm install.
npm ci (also known as Clean Install) is meant to be used in automated environments — such as test platforms, continuous integration, and deployment — or, any situation where you want to make sure you're doing a clean install of your dependencies.
It installs dependencies directly from package-lock.json and uses package.json only to validate that there are no mismatched versions. If any dependencies are missing or have incompatible versions, it will throw an error.
Use npm install to add new dependencies, and to update dependencies on a project. Usually, you would use it during development after pulling changes that update the list of dependencies but it may be a good idea to use npm ci in this case.
Use npm ci if you need a deterministic, repeatable build. For example during continuous integration, automated jobs, etc. and when installing dependencies for the first time, instead of npm install.
npm install
Installs a package and all its dependencies.
Dependencies are driven by npm-shrinkwrap.json and package-lock.json (in that order).
without arguments: installs dependencies of a local module.
Can install global packages.
Will install any missing dependencies in node_modules.
It may write to package.json or package-lock.json.
When used with an argument (npm i packagename) it may write to package.json to add or update the dependency.
when used without arguments, (npm i) it may write to package-lock.json to lock down the version of some dependencies if they are not already in this file.
npm ci
Requires at least npm v5.7.1.
Requires package-lock.json or npm-shrinkwrap.json to be present.
Throws an error if dependencies from these two files don't match package.json.
Removes node_modules and install all dependencies at once.
It never writes to package.json or package-lock.json.
Algorithm
While npm ci generates the entire dependency tree from package-lock.json or npm-shrinkwrap.json, npm install updates the contents of node_modules using the following algorithm (source):
load the existing node_modules tree from disk
clone the tree
fetch the package.json and assorted metadata and add it to the clone
walk the clone and add any missing dependencies
dependencies will be added as close to the top as is possible
without breaking any other modules
compare the original tree with the cloned tree and make a list of
actions to take to convert one to the other
execute all of the actions, deepest first
kinds of actions are install, update, remove and move
npm ci will delete any existing node_modules folder and relies on the package-lock.json file to install the specific version of each package. It is significantly faster than npm install because it skips some features. Its clean state install is great for ci/cd pipelines and docker builds! You also use it to install everything all at once and not specific packages.
While everyone else has answered the technical differences none explain in what situations to use both.
You should use them in different situations.
npm install is great for development and in the CI when you want to cache the node_modules directory.
When to use this? You can do this if you are making a package for other people to use (you do NOT include node_modules in such a release). Regarding the caching, be careful, if you plan to support different versions of Node.js remember that node_modules might have to be reinstalled due to differences between the Node.js runtime requirements. If you wish to stick to one version, stick to the latest LTS.
npm ci should be used when you are to test and release a production application (a final product, not to be used by other packages) since it is important that you have the installation be as deterministic as possible, this install will take longer but will ultimately make your application more reliable (you do include node_modules in such a release). Stick with LTS version of Node.js.
npm i and npm ci both utilize the npm cache if it exists, this cache lives normally at ~/.npm.
Also, npm ci respects the package-lock.json file. Unlike npm install, which rewrites the file and always installs new versions.
Bonus: You could mix them depending on how complex you want to make it. On feature branches in git you could cache the node_modules to increase your teams productivity and on the merge request and master branches rely on npm ci for a deterministic outcome.
The documentation you linked had the summary:
In short, the main differences between using npm install and npm ci are:
The project must have an existing package-lock.json or npm-shrinkwrap.json.
If dependencies in the package lock do not match those in package.json, npm ci will exit with an error, instead of updating the package lock.
npm ci can only install entire projects at a time: individual dependencies cannot be added with this command.
If a node_modules is already present, it will be automatically removed before npm ci begins its install.
It will never write to package.json or any of the package-locks: installs are essentially frozen.
The commands are very similar in functionality however the difference is in the approach taken to install the dependencies specified in your package.json and package-lock.json files.
npm ci performs a clean install of all the dependencies of your app whereas npm install may skip some installations if they already exist on the system. A problem may arise if the version already installed on the system isn't the one your package.json intended to install i.e. the installed version is different from the 'required' version.
Other differences would be that npm ci never touches your package*.json files. It will stop installation and show an error if the dependency versions do not match in the package.json and package-lock.json files.
You can read a much better explanation from the official docs here.
Additionally, you may want to read about package locks here.
It is worth having in mind that light node docker images like alpine do not have Python installed which is a dependency of node-gyp which is used by npm ci.
I think it's a bit opinionated that in order to have npm ci working you need to install Python as dependency in your build.
More info here Docker and npm - gyp ERR! not ok
npm ci - install exactly what is listed in package-lock.json
npm install - without changing any versions in package.json, use package.json to create/update package-lock.json, then install exactly what is listed in package-lock.json
npm update - update package.json packages to latest versions, then use package.json to create/update package-lock.json, then install exactly what is listed in package-lock.json
Or said a different way, npm ci changes 0 package files, npm install changes 1 package file, and npm update changes 2 package files.
It does a clean install, use it in situations where you would delete node_modules and re-run npm i.
I have no idea why some people think it's short for "continuous integration". There is an npm install command that can be run as npm i and an npm clean-install command that can be run as npm ci.
npm install is the command used to install the dependencies listed in a project's package.json file, while npm ci is a command that installs dependencies from a package-lock.json or npm-shrinkwrap.json file. The npm ci command is typically used in continuous integration (CI) environments, where the package-lock.json or npm-shrinkwrap.json file is checked into version control and should not be modified. Because npm ci installs dependencies from a locked file, it is a faster and more reliable way to install dependencies than npm install, which could install different versions of dependencies based on the state of the package.json file.

npm install with --no-package-lock flag - is existing package-lock.json used?

From the npm 5 doc:
The --no-package-lock argument will prevent npm from creating a package-lock.json file.
Does an npm install with --no-package-lock follows the package-lock.json (if already exists) deterministic install / nested locked versions ? Or does it completly ignore it ?
Answer from the #npm_support:
Using --no-package-lock skips the package-lock. It is neither read nor written as if the package-lock feature did not exist.
So the package-lock.json file isn't used at all when the --no-package-lock is on.
For deterministic install you must have an package-lock.json and use npm ci. See https://docs.npmjs.com/cli/v7/commands/npm-ci
This command is similar to npm install, except it's meant to be used in automated environments such as test platforms, continuous integration, and deployment -- or any situation where you want to make sure you're doing a clean install of your dependencies.

What is the difference between npm-shrinkwrap.json and package-lock.json?

With the release of npm#5, it will now write a package-lock.json unless a npm-shrinkwrap.json already exists.
I installed npm#5 globally via:
npm install npm#5 -g
And now, if a npm-shrinkwrap.json is found during:
npm install
a warning will be printed:
npm WARN read-shrinkwrap This version of npm
is compatible with lockfileVersion#1,
but npm-shrinkwrap.json was generated for lockfileVersion#0.
I'll try to do my best with it!
So my take-away is that I should replace the shrinkwrap with the package-lock.json.
Yet why is there a new format for it? What can the package-lock.json do that the npm-shrinkwrap.json cannot?
The files have exactly the same content, but there are a handful of differences in how npm handles them, most of which are noted on the docs pages for package-lock.json and npm-shrinkwrap.json:
package-lock.json is never published to npm, whereas npm-shrinkwrap is by default
package-lock.json files that are not in the top-level package are ignored, but shrinkwrap files belonging to dependencies are respected
npm-shrinkwrap.json is backwards-compatible with npm versions 2, 3, and 4, whereas package-lock.json is only recognized by npm 5+
You can convert an existing package-lock.json to an npm-shrinkwrap.json by running npm shrinkwrap.
Thus:
If you are not publishing your package to npm, the choice between these two files is of little consequence. You may wish to use package-lock.json because it is the default and its name is clearer to npm beginners; alternatively, you may wish to use npm-shrinkwrap.json for backwards compatibility with npm 2-4 if it is difficult for you to ensure everyone on your development team is on npm 5+. (Note that npm 5 was released on 25th May 2017; backwards compatibility will become less and less important the further we get from that date, as most people will eventually upgrade.)
If you are publishing your package to npm, you have a choice between:
using a package-lock.json to record exactly which versions of dependencies you installed, but allowing people installing your package to use any version of the dependencies that is compatible with the version ranges dictated by your package.json, or
using an npm-shrinkwrap.json to guarantee that everyone who installs your package gets exactly the same version of all dependencies
The official view described in the docs is that option 1 should be used for libraries (presumably in order to reduce the amount of package duplication caused when lots of a package's dependencies all depend on slightly different versions of the same secondary dependency), but that option 2 might be reasonable for executables that are going to be installed globally.
Explanation from NPM Developer:
The idea is definitely for package-lock.json to be the Latest and
Greatest in shrinkwrap technology, and npm-shrinkwrap.json to be
reserved for those precious few folks out there who care very much
about their libraries having an exact node_modules -- and for people
who want CI using npm#>=2 to install a particular tree without having
to bump its npm version.
The new lockfile ("package-lock.json") shares basically all of the
same code, the exact same format as npm-shrinkwrap (you can rename
them between one another!). It's also something the community seems to
understand: "it has a lockfile" seems to click so much faster with
people. Finally, having a new file meant that we could have relatively
low-risk backwards-compat with shrinkwrap without having to do weird
things like allow-publication mentioned in the parent post.
I think the idea was to have --save and shrinkwrap happen by default but avoid any potential issues with a shrinkwrap happening where it wasn't wanted. So, they just gave it a new file name to avoid any conflicts. Someone from npm explained it more thoroughly here:
https://www.reddit.com/r/javascript/comments/6dgnnq/npm_v500_released_save_by_default_lockfile_better/di3mjuk/
The relevant quote:
npm publishes most files in your source directory by default, and
people have been publishing shrinkwraps for years. We didn't want to
break compatibility. With --save and shrinkwrap by default, there was
a great risk of it accidentally making it in and propagating through
the registry, and basically render our ability to update deps and
dedupe... null.
So we chose a new name. And we chose a new name kind of all of a
sudden. The new lockfile shares basically all of the same code, the
exact same format
package-lock.json versions are guaranteed with only npm ci (since npm install overwrites package-lock.json if there is a conflict with package.json).
npm-shrinkwrap.json versions are guaranteed with both npm ci and npm install.

How do I force npm to reinstall a single package, even if the version number is the same?

In my Node.js project, I have a dependency on another local project. Oftentimes, I need to make a small change to the dependency and see how it affects my main project. In order to do this, I have to reinstall my dependency using npm.
I can use npm update to try to update my dependency, but this seems like it will only work if the version number has changed on the dependency. I don't want to have to change the version number on my dependency every time I change a line of code or two to make an experimental change in development.
I can rm -rf node_modules/; npm install to ensure that I get the latest versions of all of my dependencies. Downloading all of my non-local dependencies takes several minutes, breaking up my train of thought.
Is there a way to force npm to reinstall a single dependency, even if that dependency's version number hasn't changed?
When you run npm install, it will install any missing dependencies, so you can combine it with an uninstall like this:
npm uninstall some_module; npm install
With npm 5, uninstalled modules are removed from the package.json, so you should use:
npm uninstall some_module; npm install some_module
On npm v 6.14:
npm install module_name --force --no-save
You get a message stating:
npm WARN using --force I sure hope you know what you are doing.
And then it proceeds to uninstall and reinstall the package.
Note: if you don't specify the --no-save option, npm updates the package version on package.json to the highest version that is compatible with the existing SemVer rule.
If you do not want npm to update the package's version on package.json, keep the --no-save option.
Not the best answer, but just for information, you can run
npm ci
It is the same as npm install, but it will remove the existing node_modules folder, if any, and do a fresh install for all packages. This is useful if the files in node_modules have been changed for some reason and you want to revert them to their original state.

Are yarn and npm interchangeable in practice?

I have a project with a package.json file and an install bash script that, among other steps, runs npm install.
I'm thinking of updating the script so that it runs yarn install if yarn is available (to take advantage of yarn's caching, lockfile, etc), and falls back to npm install otherwise. As far as I can tell, all the packages seem to install and work ok either way.
Are yarn and npm interchangeable enough for this to be a viable approach, though? Or are there potential issues that this could lead to? Are we meant to just pick one, or is yarn interchangeable with npm in practice?
(nb. I've read this closely related question, but I'm asking this as a separate question because it's about explicitly supporting both yarn and npm install processes in a project)
Yarn and npm (version >=3.0.0) should be relatively compatible, especially moving from npm to Yarn, because compatibility is one of the stated goals of Yarn. As stated in Migrating from npm:
Yarn can consume the same package.json format as npm, and can install any package from the npm registry.
So, in theory, any package.json that is valid for npm should also work equally well for Yarn. Note that I say that npm v2 is probably less compatible - this is because npm migrated from a nested node_modules structure to a flat layout (which is what Yarn uses). That said, Yarn and npm v3 should produce very similar layouts, because, as stated in the issue I linked:
To a first approximation we should try to be very compatible with the node_modules layout for people who need that compatibility, because it'll be the most likely way to avoid long-tail compatibility problems.
However, you will not be able to take advantage of the Yarn.lock generated by Yarn, because (as the name suggests) it's only supported by Yarn, and npm shrinkwrap is not compatible.
Also, as noted by #RyanZim, older versions of Yarn don't support pre- and post-install hooks, but versions later than v0.16.1 do. If you rely on these hooks, you will need to specify to users that versions greater than v0.16.1 are required.
In summary, as long as you encounter no bugs and only use features that are shared by both package managers, you should have no issues whatsoever.