Angular node_modules: how to clean up? [closed] - npm

Having an Angular project, I have a node_modules directory in my project dir.
It is pretty full with all the files of the modules I use.
I like to periodically save the project folder as a backup. Doing this takes a while because of node_modules.
Is it a bad idea to remove node_modules before the backup and then, after doing more coding, rebuild it with
npm install
?
Or is there a better way to have smaller backups?
EDIT
I use git and also make this directory backup. My question is about the directory backup only.

Your package.json file acts as a blueprint for your required node modules, recording the version of every module used in the project. Keeping a backup of node_modules therefore doesn't make sense: you can get it back at any time by running npm install in your project.
If you are using Git, you can ignore node_modules by adding the following in .gitignore file
# dependencies
/node_modules
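For the backup itself, a minimal sketch of both approaches (the backup path is hypothetical):
# drop the reinstallable part, copy, then restore when you resume coding
rm -rf node_modules
cp -r . /path/to/backup/my-project/
npm install
# or, without deleting anything, just exclude node_modules from the copy:
rsync -a --exclude node_modules ./ /path/to/backup/my-project/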

node_modules should not be backed up; it can be reinstalled whenever needed, since all the required dependency information is recorded in package.json and package-lock.json.
Please use a version control system such as git instead of manual backups.
Check this link for a better understanding:
https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control

Related

.gitignore node_modules/ not working for fd and ripgrep [closed]

I'm trying to set up fzf.vim for Vim on Windows 10.
You can use an alternative find command like ripgrep or fd, which are supposed to respect .gitignore.
My .gitignore file has this line, which is working fine for git commits, etc.:
node_modules/
My dir structure is
/working directory
    .gitignore file
    .git dir
    /node_modules dir
When I run
fd --type f
or
rg --files
It lists all files in node_modules.
I feel like this may be a windows problem.
How can I get these programs to use .gitignore to ignore node_modules?
Turns out, I was using a project where the git repo had not yet been initialized. So I had a .gitignore but did not have a .git folder, and by default ripgrep requires an actual git repository before it will use the .gitignore file. My solution was to use the following flag:
--no-require-git
More info here
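In short, a minimal sketch of the two fixes (the ripgrep flag is the one confirmed above; whether fd offers an equivalent flag depends on your fd version, so check fd --help):
# Option 1: initialize the repo, so rg and fd see a real git project
git init
# Option 2: tell ripgrep to honor .gitignore even without a .git folder
rg --files --no-require-git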

general question about node_modules and security

Can't find anything on this online and might be a non-issue, but I figured I'd ask here to make sure.
We run the Wordfence security plugin on a bunch of WordPress sites and have recently seen this "critical issue" reported:
Filename: wp-content/themes/theme-name/node_modules/webpack-assets-manifest/test/fixtures/client.js
File Type: Not a core, theme, or plugin file from wordpress.org.
Details: This file appears to be installed or modified by a hacker to perform malicious activity.
If you know about this file you can choose to ignore it to exclude it from future scans.
The matched text in this file is: require('./Ginger.jpg');
The issue type is: Backdoor:PHP/req_img.3645
Description: A backdoor known as req_img
Now, first of all, that doesn't look like a backdoor to me, especially since node_modules contents aren't executed unless I run npm (or yarn), as far as I understand. Is this more serious than I think?
Secondly, when running npm/yarn on the server, the node_modules folder gets chmod 775 (drwxrwxr-x) by default. Is it okay to leave it like that, or should we take any action?
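For the permissions part, a quick way to inspect what npm/yarn created and, as a precaution, remove group write (a hedged sketch using GNU coreutils; whether 775 is acceptable depends on who else is in that group on your server):
stat -c '%a %U:%G %n' node_modules
# 775 -> 755 for directories, 664 -> 644 for files
chmod -R g-w node_modules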

NPM : Create an NPM package that adds files and folders to the project root directory

I've created a web app template that I use frequently for many different projects.
I would like to create an NPM package for it so that it's easier to install for new projects, separate the template from the project files, separate the template dependencies from the project dependencies, and allow easier updating of the template across all projects.
The issue I have is that I need some files/folders to be installed in the root directory (i.e. where package.json is saved). Most can go in the node_modules folder; however, I have some files that must be placed in the root directory.
For example, the template uses Next.js with a custom _app.js file. This must be in the root directory in a folder named pages. I also have various config files that must be in the root directory.
Can this be done with NPM, or does everything need to be installed in the node_modules folder? I'm having trouble finding anything on SO or Google that answers this, so if you happen to know a guide online on how to do this or can outline things I should search for it would be much appreciated.
With pure npm, everything has to go to the node_modules folder, so you can't solve your issue this way.
Maybe going with a templating tool such as grunt init or yeoman could be a solution here, although – unfortunately – you'll then lose some of the benefits of being able to install a package via npm.
Another option might be to use GitHub template repositories, which have just been introduced recently.
Last but not least, one option might be to ship the files' contents in the npm package but create pages/_app.js manually, and inside of it simply require the contents from the npm module. This at least keeps the content portable, but it still requires you to set up the file and folder structure on your own.
Sorry that I don't have a better answer, but I hope it helps anyway.
PS: One "solution" might also be to use the postinstall step in an npm module's package.json file to create folder structure, copy files to where they should be and so on, but at least to me this feels more like a clumsy workaround than like a real solution.
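To illustrate that postinstall idea, here is a minimal sketch; the script name and copied files are hypothetical. npm sets INIT_CWD to the directory npm install was run from (npm >= 5.4), while the script itself runs inside node_modules/<template-package>:
# scripts/copy-template-files.sh, wired up in the template's package.json as
#   "scripts": { "postinstall": "sh scripts/copy-template-files.sh" }
mkdir -p "$INIT_CWD/pages"
cp -n pages/_app.js "$INIT_CWD/pages/_app.js"
cp -n next.config.js "$INIT_CWD/next.config.js"
# cp -n never overwrites files the user already customized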

What does "Linking Dependencies" during npm / yarn install really do?

For large web apps, npm install and yarn install take a lot of time, mostly in a step called Linking Dependencies. What is happening here? Is it fetching the dependencies of the dependencies? Or something completely different? Which files are created during this step?
When you call yarn install, the following things happen in order:
Resolution: Yarn starts resolving dependencies by making requests to the registry and recursively looking up each dependency.
Downloading/Fetching: Next, Yarn looks in a global cache directory to see if the package needed has already been downloaded. If it hasn't, Yarn fetches the tarball for the package and places it in the global cache so it can work offline and won't need to download dependencies more than once. Dependencies can also be placed in source control as tarballs for full offline installs.
Linking: Finally, Yarn links everything together by copying all the files needed from the global cache into the local node_modules directory after identifying what's already there and what's not there.
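As a side note on the "tarballs in source control" point from the Fetching step, Yarn 1.x supports an offline mirror; a minimal sketch (the mirror directory name is just a convention):
# .yarnrc
yarn-offline-mirror "./npm-packages-offline-cache"
# after committing the mirrored tarballs, installs work without a network:
yarn install --offline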
yarn install does take a lot of time, mostly in a step called Linking Dependencies
You should notice that Step 3: Linking takes more time than Step 1: Resolution and Step 2: Fetching, where the actual download happens. By that step we already have everything we need downloaded, so why does it take so long? Did we miss anything?
Yes: the copy into the local project's node_modules folder! The reason this is slow is that the copy is not equivalent to copying one large 4.7 GB ISO file. Instead it is a very large number of very small files (easily 15k+), which take a long time to copy. (It is also worth noting that when you download the packages, you download one large tar file per package, whose contents then have to be extracted into the cache, which also takes time.)
It is slower due to:
Anti-virus: your antivirus sits in the middle and does a quick inspection (in addition to yarn checking whether the file already exists) of every single file yarn tries to copy, cutting its speed considerably. If you are on Windows, try adding your project's parent folder as an exception to Windows Defender.
Storage medium's transfer rate: SSDs can improve this speed hugely. (Sorry, SSHDs and FireCudas will not help either; their caching only pays off for repeated reads, and this copy is a one-time operation.)
But is this efficient? Can't packages be taken from a global node_modules (after creating one)?
No to both questions, because of the way Node works: each package finds its dependencies only relative to its own location. Also, each project may want to use different versions of the same package, to ensure it keeps working and is not broken by package updates.
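You can see this relative resolution at work from inside any project; a minimal sketch (it assumes react is installed, and the printed paths will differ on your machine):
# print the first few directories Node would search for the package,
# walking up the tree one node_modules at a time
node -e "console.log(require.resolve.paths('react').slice(0, 3).join('\n'))"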
Ideally, the project folder should be lean. An efficient way of doing this would be to have a global node_modules folder: any requested package is downloaded if not already present and used from that location. This is actually how Ruby does it: its global gems folder is the equivalent of node_modules, and it can hold different versions of the same package for use in different projects.
But keep in mind that this reduces project portability. It's a trade-off that any package manager (be it RubyGems or npm) has to make. I can copy a Node project folder (which may in fact take hours, because I would be copying the local node_modules folder as well) and expect it to work on its own, whereas copying a Ruby project takes only seconds to a few minutes, as there is no local packages (or gems, as they call them) folder; but running that project on a different system requires those packages to be present in the global gems folder there.
The documentation for yarn install can be found here.
You can use the command
yarn install --verbose
Show additional logs while installing dependencies
The output will show what the yarn/npm install is doing.
It's good for debugging in case the process is failing or taking a long time.
The linking phase works in essentially three big steps:
1. Find every file that needs to be in node_modules
2. Check this list against what is already there, and find what needs to be copied from the cache to node_modules
3. Do the copy
Maybe this issue on Github will help you out.
https://github.com/yarnpkg/yarn/issues/1496

Latest (master/snapshot) Spark documentation - either online or run locally [closed]

Is there a location where the latest spark documentation has been built and is available?
For example, the 1.3.0 release candidate branch was cut five days ago, but it is not available from the Apache site; the newest available there is the already-in-production 1.2.0.
Even better would be the output of an AMPLab Jenkins build. But maybe someone just publishes it regularly in a publicly accessible location?
Alternatively, what is the procedure to generate HTML from the Spark markdown files? I can easily put up a local web server to serve them.
Nightly publication of documentation snapshots and build artifacts is on the Spark roadmap; see https://issues.apache.org/jira/browse/SPARK-1517.
For the SECOND part of my question - how to generate the docs locally - the docs/README.md does have instructions.
(Screenshots omitted: the built site is served at localhost:4000 and shows Spark version 1.3.0, which is not released yet.)
The instructions are copied here:
The markdown code can be compiled to HTML using the Jekyll
tool. Jekyll and a few dependencies must be
installed for this to work. We recommend installing via the Ruby Gem
dependency manager. Since the exact HTML output varies between
versions of Jekyll and its dependencies, we list specific versions
here in some cases:
$ sudo gem install jekyll
$ sudo gem install jekyll-redirect-from
Execute jekyll from the docs/ directory. Compiling the site with
Jekyll will create a directory called _site containing index.html as
well as the rest of the compiled files.
You can modify the default Jekyll build as follows:
# Skip generating API docs (which takes a while)
$ SKIP_API=1 jekyll build
# Serve content locally on port 4000
$ jekyll serve --watch
# Build the site with extra features used on the live page
$ PRODUCTION=1 jekyll build
Pygments
We also use pygments (http://pygments.org) for syntax highlighting in
documentation markdown pages, so you will also need to install that
(it requires Python) by running sudo pip install Pygments.
To mark a block of code in your markdown to be syntax highlighted by
Jekyll during the compile phase, use the following syntax:
{% highlight scala %}
// Your scala code goes here, you can replace scala with many other
// supported languages too.
{% endhighlight %}
Sphinx
We use Sphinx to generate Python API docs, so you will need to install
it by running sudo pip install sphinx.
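Putting the README pieces together, a complete local doc build might look like this (commands taken from the instructions above; exact gem and pip versions may vary by branch):
$ sudo gem install jekyll jekyll-redirect-from
$ sudo pip install Pygments sphinx
$ cd docs
$ SKIP_API=1 jekyll build
$ jekyll serve --watch
# then browse http://localhost:4000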