What does "Linking Dependencies" during npm / yarn install really do? - npm

For large web apps, npm install (or yarn install) does take a lot of time, mostly in a step called Linking Dependencies. What is happening here? Is it fetching the dependencies of the dependencies? Or something completely different? Which files are created during this step?

When you call yarn install, the following things happen in order:
1. Resolution: Yarn starts resolving dependencies by making requests to the registry and recursively looking up each dependency.
2. Downloading/Fetching: Next, Yarn looks in a global cache directory to see if the package needed has already been downloaded. If it hasn't, Yarn fetches the tarball for the package and places it in the global cache so it can work offline and won't need to download dependencies more than once. Dependencies can also be placed in source control as tarballs for full offline installs.
3. Linking: Finally, Yarn links everything together by copying all the files needed from the global cache into the local node_modules directory after identifying what's already there and what's not there.
"yarn install does take a lot of time, mostly in a step called Linking Dependencies"
You will notice that Step 3: Linking takes more time than Step 1: Resolution and Step 2: Fetching, where the actual download happens. By this step everything we need is already downloaded and ready, so why does it take so long? Did we miss anything?
Yes: the COPY into the local project's node_modules folder! This copy is not equivalent to copying one large 4.7GB ISO file. Instead it is a huge number of very small files (and when I say huge, I mean it can be 15k+ files :P ), which takes a long time to copy. (It is also important to note that when you download the packages, you download one tar file per package, whose contents must then be extracted into the cache, which also takes time.)
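To see how many small files are involved in your own project, you can count them. This is a minimal Node.js sketch, not something Yarn provides; the function name countEntries is made up for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively count the files and directories below `dir`.
// Symbolic links are counted as files and are not followed.
function countEntries(dir) {
  let files = 0;
  let dirs = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.isDirectory()) {
      const sub = countEntries(path.join(dir, entry.name));
      files += sub.files;
      dirs += 1 + sub.dirs;
    } else {
      files += 1;
    }
  }
  return { files, dirs };
}
```

Calling console.log(countEntries('node_modules')) from the project root prints both totals. Every one of those entries is a separate filesystem operation during linking.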
It is slowed down further by:
Anti-virus: Your antivirus sits in the middle and does a quick inspection of every single file Yarn tries to copy (in addition to Yarn's own check of whether the file already exists), which cuts the speed considerably. If you are on Windows, try adding your project's parent folder as an exclusion in Windows Defender.
Storage medium's transfer rate: SSDs can improve this speed hugely (Sorry, SSHDs and FireCudas will not help here, since each file is only copied once).
But is this efficient? Can I have it taken from the global node_modules (after creating one)?
Nope for both questions. Because of the way Node works, each package finds its dependencies only relative to its own location. In addition, each project may want a different version of the same package, to ensure it keeps working properly and is not broken by package updates.
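To make "relative to its own location" concrete, here is a simplified sketch of the directories Node searches for a bare require such as require('lodash'). It assumes POSIX-style paths and ignores NODE_PATH and other extras; the function name nodeModulesPaths is made up for illustration.

```javascript
const path = require('path');

// List every node_modules directory Node would try, starting at the
// requiring file's directory and walking up to the filesystem root.
function nodeModulesPaths(startDir) {
  const paths = [];
  let dir = path.posix.resolve(startDir);
  while (true) {
    // Node never appends node_modules to a directory already named node_modules.
    if (path.posix.basename(dir) !== 'node_modules') {
      paths.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the root
    dir = parent;
  }
  return paths;
}

console.log(nodeModulesPaths('/home/me/app/src'));
// [ '/home/me/app/src/node_modules',
//   '/home/me/app/node_modules',
//   '/home/me/node_modules',
//   '/home/node_modules',
//   '/node_modules' ]
```

There is no step in this walk that consults a shared, versioned global folder, which is why each project ends up with its own copy.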
Ideally, the project folder should be lean. An efficient way of doing this would be to have a global node_modules folder: any requested package is downloaded to this location if not already present AND used from there. Actually, Ruby does it this way. My global Ruby equivalent of the node_modules folder contains different versions of the same package for use in different projects.
But keep in mind that it would reduce project portability. It's a trade-off that any package manager (be it RubyGems or npm) has to make. I can just copy the Node project folder (which may in fact take hours, because the local node_modules folder gets copied as well) and expect it to work with just that folder. Copying a Ruby project, by contrast, takes only seconds to a few minutes, as there is no local packages (or gems, as they call them) folder, but running the project on a different system requires those gems to be present in that system's global gems folder.

The documentation for yarn install can be found here.
You can use the command
yarn install --verbose
This shows additional logs while installing dependencies.
The output will show what yarn install (or npm install) is doing.
It's good for debugging in case the process is failing or taking a long time.

The linking phase works in essentially 3 big steps:
1. Find every file that needs to be in node_modules
2. Check this list against what is already there and find what needs to be copied from cache to node_modules
3. Do the copy
Maybe this issue on GitHub will help you out.
https://github.com/yarnpkg/yarn/issues/1496
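The three steps can be sketched in a few lines of Node.js. This is only an illustration under simplifying assumptions (a single cache folder mirrored one-to-one into the target, presence checked by file name only), not Yarn's actual implementation; listFiles and linkFromCache are made-up names.

```javascript
const fs = require('fs');
const path = require('path');

// Step 1: list every file (as a relative path) that should end up in the target.
function listFiles(root, rel = '') {
  const out = [];
  for (const entry of fs.readdirSync(path.join(root, rel), { withFileTypes: true })) {
    const child = path.join(rel, entry.name);
    if (entry.isDirectory()) out.push(...listFiles(root, child));
    else out.push(child);
  }
  return out;
}

function linkFromCache(cacheDir, targetDir) {
  const needed = listFiles(cacheDir);
  // Step 2: keep only the files that are not in the target yet.
  const missing = needed.filter((rel) => !fs.existsSync(path.join(targetDir, rel)));
  // Step 3: do the copy, creating parent directories as needed.
  for (const rel of missing) {
    const dest = path.join(targetDir, rel);
    fs.mkdirSync(path.dirname(dest), { recursive: true });
    fs.copyFileSync(path.join(cacheDir, rel), dest);
  }
  return missing; // relative paths that were actually copied
}
```

Even in this toy version, every file costs at least one existence check and possibly one copy, so the run time grows with the number of files rather than with their total size.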

Related

Should I delete webpack and other libraries after bundling?

npm downloads a lot of files needed for webpack and the libraries. From what I understand, webpack generates one single bundle file that contains all the code the script needs to work. After I finish building my app, do I need to keep all those jquery/react files and webpack itself? Or should I just delete them?
It's common practice to make a project portable/shareable by following these steps:
1. Create a package.json and make sure it captures all dependencies, devDependencies and/or peerDependencies.
2. Add/commit the package.json and package-lock.json files to your version control.
3. Create a .gitignore file and add node_modules to it (in essence, this cuts out that baggage).
4. For production (e.g. to share the finished product with a client), build the project, which usually produces a small set of files, often in /build or /dist. You can then push that build output to AWS, Heroku, or the client's servers.
What does the above help you achieve?
You can easily start the project on any machine, as long as you run npm install, which reads from your package.json. In other words, node_modules is disposable: you can delete it whenever you like and recreate it with npm install.

NPM : Create an NPM package that adds files and folders to the project root directory

I've created a web app template that I use frequently for many different projects.
I would like to create an NPM package for it so that it's easier to install for new projects, separate the template from the project files, separate the template dependencies from the project dependencies, and allow easier updating of the template across all projects.
The issue I have is that I need some files/folders to be installed in the root directory (i.e. where package.json is saved). Most can go in the node_modules folder however I have some files that must be placed in the root directory.
For example, the template uses Next.js with a custom _app.js file. This must be in the root directory in a folder named pages. I also have various config files that must be in the root directory.
Can this be done with NPM, or does everything need to be installed in the node_modules folder? I'm having trouble finding anything on SO or Google that answers this, so if you happen to know a guide online on how to do this or can outline things I should search for it would be much appreciated.
With pure npm, everything has to go to the node_modules folder, so you can't solve your issue this way.
Maybe a templating tool such as grunt-init or Yeoman could be a solution here, although, unfortunately, you would then lose some of the benefits of being able to install a package via npm.
Another option might be to use GitHub template repositories, which were introduced recently.
Last but not least, you could keep the files' contents in the npm package and create pages/_app.js manually, having it simply require its contents from the npm module. This at least keeps the content portable, but of course you still have to set up the file and folder structure on your own.
Sorry that I don't have a better answer, but I hope it helps anyway.
PS: One "solution" might also be to use the postinstall step in an npm module's package.json file to create folder structure, copy files to where they should be and so on, but at least to me this feels more like a clumsy workaround than like a real solution.
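For what it's worth, a postinstall script along those lines could look like the sketch below. It assumes the package ships a template folder next to the script and that its package.json declares "postinstall": "node postinstall.js"; both names are hypothetical. It relies on npm setting the INIT_CWD environment variable to the directory where npm install was run.

```javascript
const fs = require('fs');
const path = require('path');

// Copy every file under `srcDir` into `destDir`, skipping files that already
// exist so a reinstall never overwrites the project's own edits.
function copyTemplate(srcDir, destDir) {
  const copied = [];
  fs.mkdirSync(destDir, { recursive: true });
  for (const entry of fs.readdirSync(srcDir, { withFileTypes: true })) {
    const src = path.join(srcDir, entry.name);
    const dest = path.join(destDir, entry.name);
    if (entry.isDirectory()) {
      copied.push(...copyTemplate(src, dest));
    } else if (!fs.existsSync(dest)) {
      fs.copyFileSync(src, dest);
      copied.push(dest);
    }
  }
  return copied;
}

// In postinstall.js (hypothetical layout), the call would be:
// copyTemplate(path.join(__dirname, 'template'), process.env.INIT_CWD);
```

Skipping existing files is a deliberate choice here: it keeps the script safe to rerun, at the cost of never updating a file the project has already received.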

Peer dependency that is also dev dependency of linked npm module is acting as a separate instance

In my app, I have these dependencies:
TypeORM
typeorm-linq-repository AS A LOCAL INSTALL ("typeorm-linq-repository": "file:../../../IRCraziestTaxi/typeorm-linq-repository"), which has both a dev dependency AND a peer dependency on TypeORM
The reason I use a "file:" installation of typeorm-linq-repository is that I am the developer and test changes in this app prior to pushing releases to npm.
I was previously using node ~6.10 (npm ~4), so when I used the "file:" installation, it just copied the published files over, which is what I want.
However, after upgrading to node 8.11.3 (npm 5.6.0), it now links the folder rather than copying the published files.
Note, if it matters, that my environment is Windows.
The problem is this: since both my app and the linked typeorm-linq-repository have TypeORM in their own node_modules folders, TypeORM is being treated as a separate "instance" of the module in each app.
Therefore, after creating a connection in the main app, when the code that accesses the connection in typeorm-linq-repository is reached, it throws an error of Connection "default" was not found..
I have searched tirelessly for a solution to this. I have tried --preserve-symlinks, but that does not work.
The only way for me to make this work right now is to manually create the folder in my app's node_modules and copy applicable files over, which is a huge pain.
How can I either tell npm to NOT symlink the "file:" installation or get it to use the same instance of the TypeORM module?
I made it work pretty easily, although I feel like it's kind of a band-aid. I will post the answer here to help anybody else who may be having this issue, but if anybody has a more proper solution, feel free to answer and I will accept.
The trick was to link my app's installation of TypeORM to the TypeORM folder in my other linked dependency's node_modules folder.
...,
"typeorm": "file:../../../IRCraziestTaxi/typeorm-linq-repository/node_modules/typeorm",
"typeorm-linq-repository": "file:../../../IRCraziestTaxi/typeorm-linq-repository",
...
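The underlying behaviour is easy to reproduce without TypeORM. The sketch below writes the same tiny stateful module (a stand-in named fake-orm, purely hypothetical) into two node_modules folders and shows that Node loads them as two independent instances, because the module cache is keyed by resolved file path.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write the same stateful module into two different node_modules folders.
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'dup-'));
const source = 'module.exports = { connections: new Map() };';
for (const owner of ['app', 'linked-lib']) {
  const dir = path.join(base, owner, 'node_modules', 'fake-orm');
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.js'), source);
}

const appCopy = path.join(base, 'app', 'node_modules', 'fake-orm');
const libCopy = path.join(base, 'linked-lib', 'node_modules', 'fake-orm');
const fromApp = require(appCopy);
const fromLib = require(libCopy);

// The app registers a connection; the other copy never sees it.
fromApp.connections.set('default', {});
console.log(fromApp === fromLib);                 // false
console.log(fromLib.connections.has('default'));  // false
console.log(require(appCopy) === fromApp);        // true: same path, same instance
```

Pointing both packages at one physical folder, as in the package.json entries above, makes both requires resolve to the same path and therefore to the same instance.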

Build for production Electron (using npm and webpack)

When I build my Electron app for production I still get a node_modules folder with the dependencies. The folder contains:
1. The dependencies installed via package.json. I already noticed that I can just delete them from the folder (since their code is inside the webpack bundle.js).
2. ffprobe-static, which takes up the most space at 40 MB.
3. Node.js modules such as ajv, deferential, debug, decamelize, etc. (158 folders in total; I don't even know most of them, let alone use them directly).
Regarding 2: Is it mandatory to have the binary for ffprobe-static? Can I use ffprobe-static with the ffmpeg.dll shipped alongside the Electron binary?
Regarding 3: Why do I need these and how can I get rid of them? Also, the Electron binary already comes with an 18.9 MB node.dll file. Again, can't I use this instead of shipping node_modules as well?

Dropbox selective syncing - pattern matching?

I use Dropbox on a daily basis and keep my programming projects in it.
It works great, but now that I have many projects, my node_modules directories are putting
a strain on Dropbox. Its syncing process is getting slow and eats up CPU time.
Is there any way to do a selective sync based on directory name or a mask pattern?
It would be nice to have a .gitignore equivalent to configure.
Any 3rd party software for that task?
There is a way to selectively sync but I don't believe it has any advanced rules like you're describing:
https://www.dropbox.com/help/175/en
There are two ways to resolve this problem:
1. Put node_modules above the project directory in the file tree. For example:
Project dir: c:/prj/myProjWrapper/myProj
Put package.json in c:/prj/myProjWrapper and run npm install there. Node.js will find the modules by searching the parent directories.
2. Windows and Linux only, not for Mac! In the project dir, create a .ds_store folder (Dropbox does not sync it). Put package.json into it and run npm install there. When starting Node.js, you must set NODE_PATH=./.ds_store/node_modules;.