git submodule vs npm package? - git-submodules

I'm using a git submodule to build and share components between projects. The project is not in production yet, so at this point the submodule is serving well.
But I'm concerned about maintenance and deployment. Would it be a good idea to turn it into an npm package?

An npm package will allow fragmentation across different package versions. On the other hand, git submodules have a bit of a learning curve, and the tooling is really not that good. With git submodules, you have all the source in one folder.
If it's at all possible, I'd recommend using a plain monorepo for all projects. You may need to create build-time variables (via Babel plugins), and you may need some sort of "live config" that gets served from the backend. I worked with git submodules for a year and I've recently worked with a project that uses npm to share code.
I would recommend using only one git submodule, for all shared code, instead of several submodules. I would strongly consider using lerna, and use your one git submodule to track lerna's packages directory. And if the team decides they don't like git submodules, you can easily make this repo a sibling git repo, instead of a submodule. However, above all this, I'd recommend using a plain monorepo.
Here's a great talk on monorepos from Netflix: https://www.youtube.com/watch?v=VNqmHJtItCs (strong focus on discouraging npm-style packages)
Here's Google's infamous monorepo talk: https://www.youtube.com/watch?v=W71BTkUbdqE
This is a great site to read to help you think about good development flows: https://trunkbaseddevelopment.com/ (it primarily advocates for the monorepo approach)
If you are developing software for different clients (different people or companies paying you for similar projects), and have some agreement that the projects should be at least ~80% the same, you may really enjoy using build flags to get started on splitting functionality, but you should very proactively keep the code around the build flags clean, and refactor it into reusable components/packages. Give each client some sort of build-flags.json. Build flags should be named for features only, which in theory can all be individually toggled. Some code may be totally custom for each project; in that case, you may want to consider using dynamic imports. Generally, though, this is a pain point I have yet to fully solve, although I have plenty of unrefined ideas around it.
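To make the build-flags idea above a bit more concrete, here is a minimal sketch. It assumes a per-client build-flags.json that maps feature names to booleans; the flag names and both helper functions are hypothetical and not taken from any library.

```javascript
// Hypothetical contents of one client's build-flags.json (feature names are illustrative).
const clientAFlags = {
  invoicing: true,
  darkMode: false,
};

// Returns true only when the flag is explicitly set to true.
// Unknown flags count as off, so a missing entry never enables a feature by accident.
function isFeatureEnabled(flags, feature) {
  return flags[feature] === true;
}

// Returns the names of all enabled features, sorted for stable output.
function enabledFeatures(flags) {
  return Object.keys(flags)
    .filter((name) => isFeatureEnabled(flags, name))
    .sort();
}
```

Keeping every check behind one helper like this makes it easier to find and clean up flag-dependent code later, which is the refactoring discipline recommended above.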
If a monorepo is just not happening, I would actually recommend using npm packages+separate repos over git submodules, assuming you can do good semantic versioning of the package. (And, yalc seems to be a good tool for linking together packages, as opposed to the standard npm/yarn link)

My findings after trying lerna, npm workspaces, and git submodules: it is not a case of one versus the other.
The reason why I say this is because one can have submodules that are part of the monorepo. Doing exactly this made my development experience better as I could clone an existing project and actively develop it within the bigger project (monorepo). I could then contribute back to the cloned project once satisfied with the changes. This is something that you cannot do with npm workspaces alone. Hence my argument that it is not a case of one vs the other. They solve different problems and can therefore complement each other.
Before using npm workspaces I would use npm link all the time. npm workspaces make this use case of developing with multiple packages more convenient. Even when the team you work with does not use a monorepo, you could use one to develop multiple packages and test them in conjunction. Once satisfied, you can push the individual repos. This is something you cannot do with git alone.
Maybe you can think of more novel ways of combining the features of npm and git.

Related

Straight forward way to use your own NPM package without the NPM registry

I want to split up the code base of several of my projects into isolated, package-like projects. They should be easily usable through npm, but they do not seem significant enough to be published to the global npm registry.
So, my question is whether there is a middle way between handling them as local packages installed by path and publishing them to the global registry.
Concerns:
cluttering the npm registry with packages which don't seem to be significant enough to take up the name
the need to document and to create tests for each package seems to be too much and I would not sleep well publishing packages which are not well documented and tested
I take up a name which might be more appropriate to be used by a more sophisticated package and maintainers
I still want others to be able to easily try / use this package, to see if it fits their needs
Alternatives:
A) creating a private npm repository (with CouchDB?)
+ is pretty much identical to the npm repository and would be easy to use
+ the versioning is identical just pure semver lookup
- every user needs to set up this repository if they want to use this package or need it as a child dependency in their (public npm) package (even though this is unlikely)
- Need to invest time into setting it up and maintaining it
B) Using my username npm namespace
+ would solve pretty much every problem
- namespaces seem to be meant for projects and its sub packages which wouldn't be the case for my packages since their only connection is the creator
- it seems arrogant to prefix your packages with your name, like you are tagging it with a big sign THIS WAS DONE BY ME
C) Using GitHub with a special detached branch which contains the (tagged) releases
+ you could use it like the global npm repository since the npm resolving strategy allows the repository url with a semver range in place of the version
- special case which is bound to break
- GitHub is not meant to provide npm packages; almost no developer expects a git URL in place of the version range, and tools and firewalls might have problems with this
- workflow is really not meant this way neither for git nor for npm
D) using a local package and install package by its path
+ easy to setup and use
- no version management
- build steps must be done manually beforehand
- can not publish packages depending on those packages
- all dependencies have to be installed locally
E) making those packages more useful, implementing edge cases, writing documentation and testing the whole package
+ would resolve about all problems
- A LOT of extra work, primarily thinking about edge cases and giving the developer a good API
- sometimes you can't really get the name for your package (it collides with others), which results in weird names
- it is your responsibility, you have to maintain it, be responsible (test it well, edge cases)
- cluttering of the npm repository
So those are all the alternatives which came to mind when I tried to find a solution. Please leave a comment / answer if you have another idea or maybe you can remove / reduce the importance of those contra points.
Maybe you could include your own experience, so I get a better view for the whole problem.
Currently I would just try to make the package more helpful to the greater majority but this does not work in all cases.
Thank you all for your time!
Installing from git is a pretty standard feature in package managers. npm's support is not GitHub-specific; it is generic support for any git repo. Unless you can find some discussion about deprecating it from npm, I'd not worry about it. It's used internally in many companies for private packages.
Of course, there are still some trade-offs: build artifacts and maybe a slightly clumsier workflow. Things like npm outdated don't understand git semver. As for build artifacts, I have seen many projects commit them to the master branch to support direct git installs. If you look around older open source projects, for example, that's quite often the case.
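As a rough illustration of the git install described above: npm accepts a git URL with a `#semver:<range>` suffix as a dependency version and matches the range against the repository's tags. The package name and repository URL below are hypothetical, and `gitDependency` is only a helper for this sketch.

```javascript
// Builds a dependency specifier that points at a git repository.
// npm resolves "#semver:<range>" against the tags found in that repository.
function gitDependency(repoUrl, semverRange) {
  return `git+${repoUrl}#semver:${semverRange}`;
}

// Hypothetical package name and repository URL, for illustration only.
const dependencies = {
  "shared-components": gitDependency(
    "https://github.com/example/shared-components.git",
    "^1.2.0"
  ),
};

// Print the fragment as it would appear in package.json.
console.log(JSON.stringify({ dependencies }, null, 2));
```

This only works if releases are tagged with version numbers in the repository, and the consumer still needs any build artifacts to be present in the tagged commit.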
We went for a private repository with verdaccio running in a docker container, which is very similar to version A. It took some setup, but for our developers all it took was a single npm command to add the private repo "in front of" npm for all packages of the namespace we created. Granted, our packages are project specific, but in a private repository that does not really matter either way, does it?
We considered the local package option at first, but the drawbacks were just too big for us, even if it's very easy to set up.
I'm not sure this helps, but this is at least the setup we decided upon when we had the same issue a few months ago.
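The answer does not say which command was used. One plausible candidate is npm's per-scope registry setting, which ends up as a single line in .npmrc. The sketch below assumes a hypothetical scope name and a Verdaccio instance on localhost; `scopedRegistryLine` is an illustrative helper, not part of npm or Verdaccio.

```javascript
// Builds the .npmrc line that routes one scope to a private registry.
// The equivalent CLI form is: npm config set <scope>:registry <url>
function scopedRegistryLine(scope, registryUrl) {
  const normalizedScope = scope.startsWith("@") ? scope : `@${scope}`;
  return `${normalizedScope}:registry=${registryUrl}`;
}

// Hypothetical scope and host; 4873 is Verdaccio's default port.
console.log(scopedRegistryLine("mycompany", "http://localhost:4873/"));
```

With such a line in place, packages under that scope are fetched from the private registry, while everything else still comes from the public npm registry.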

How to handle shared dependencies when sharing components between multiple applications in a monorepo

I've got the following monorepo structure
root
--AppOne
----package.json
----node_modules
------styled-components
--AppTwo
----package.json
----node_modules
------styled-components
--Shared
----componentA
----package.json
----node_modules
------styled-components
My issue is that both AppOne and AppTwo use componentA from the Shared directory, and componentA depends on an npm package, for example styled-components.
This means that I need to have styled-components installed in all three directories. That increases the bundle size and, if the versions aren't the same, can stop the package from working as intended.
It also means I get the following error from styled-components:
It looks like there are several instances of 'styled-components' initialized in this application.
This may cause dynamic styles not rendering properly, errors happening during rehydration process and makes your application bigger without a good reason.
My question is - what is the best way to solve this situation? Ideally I only want this package installed in one place. Should I be installing it in Shared and using an alias in AppOne and AppTwo to use the package?
Any advice much appreciated!
Working on a big monorepo myself, I will say that there are many ways to solve this by writing custom scripts and triggering them from an npm "postinstall" script. You can probably also maintain this relationship between dependencies manually, in the peerDependencies of each package.json.
I prefer to rely on tooling to handle inter-dependency management.
In my project, I use Lerna which has a package hoisting feature exactly for this use case.
If Lerna is overkill for you, you should know that package hoisting is also provided by Yarn Workspaces. In fact, when Lerna is used on top of Yarn, Lerna simply delegates the package hoisting to Yarn Workspaces, so you really don't need Lerna for that.
I have also heard of Bolt in this regard. As of early 2019 it is promising, but much less mature than Yarn/Lerna.
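For the directory layout in the question, a Yarn Workspaces setup could look roughly like the sketch below, with the root package.json written as a JavaScript object. The version ranges are made up for illustration, and `findMismatchedRanges` is a hypothetical helper that compares range strings literally, which is a simplification of real semver compatibility.

```javascript
// Root package.json for Yarn Workspaces, shown as a JS object.
// Yarn requires "private": true on the workspace root.
const rootManifest = {
  private: true,
  workspaces: ["AppOne", "AppTwo", "Shared"],
};

// Hypothetical package manifests; the version ranges are invented for this sketch.
const manifests = {
  AppOne: { dependencies: { "styled-components": "^4.1.0" } },
  AppTwo: { dependencies: { "styled-components": "^4.1.0" } },
  Shared: { dependencies: { "styled-components": "^3.4.0" } },
};

// Lists dependencies that are declared with more than one distinct range string.
// Hoisting can only collapse a dependency to a single copy when the ranges are
// compatible, so mismatches reported here are worth aligning first.
function findMismatchedRanges(manifestsByPackage) {
  const rangesByDep = new Map();
  for (const manifest of Object.values(manifestsByPackage)) {
    for (const [dep, range] of Object.entries(manifest.dependencies || {})) {
      if (!rangesByDep.has(dep)) rangesByDep.set(dep, new Set());
      rangesByDep.get(dep).add(range);
    }
  }
  return [...rangesByDep]
    .filter(([, ranges]) => ranges.size > 1)
    .map(([dep, ranges]) => ({ dep, ranges: [...ranges].sort() }));
}
```

Once all three packages declare the same range, a single install at the root can place one copy of styled-components in the root node_modules, which is what removes the "several instances" warning.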

Package.JSON file dependencies

This is more of a conceptual question.
In Package.JSON file we have devDependencies and Dependencies. I understand what each is for. But are these in place simply to make it clearer to other developers what the dev dependencies and the production dependencies are when we distribute our files? If we weren't distributing our files would it make a difference if we put the devDependencies in the dependencies section? In my mind it shouldn't because the package.json is just used for npm installation and when we run our app through a bundler such as webpack it will only bundle up the modules required for deployment. In fact if we are not distributing our files theoretically do we even need a package.JSON file (although I see why we would want one so that we can move files from one place to another easily and just reinstall modules at the other end).
I believe this is a unique question because it asks not just what the difference between dev and normal deps is, but whether, in theory, there is a difference between the two if you don't publish. The alleged duplicate's answer does not talk about that, yet it's important for people's proper understanding of npm.
Back on your questions:
But are these in place simply to make it clearer to other developers what the dev dependencies and the production dependencies are when we distribute our files?
No. npm behaves differently depending on whether something is a dev or a "normal" dependency. See the alleged duplicate's accepted answer for more. For example, when you install a published package, npm doesn't install its dev dependencies unless you explicitly request them via a flag.
If we weren't distributing our files would it make a difference if we put the devDependencies in the dependencies section?
There is no functional difference, but only if you don't publish. It is still bad practice, harder to maintain, and so on, of course.
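The behavioural difference also shows up locally, without publishing. The sketch below is a simplified model, not npm's actual resolution logic, of which top-level packages end up installed for a project; the manifest contents and the `packagesToInstall` helper are hypothetical.

```javascript
// Simplified model of which top-level packages npm installs for a project.
// omitDev = true mirrors `npm install --omit=dev` (older npm versions: `--production`).
function packagesToInstall(manifest, { omitDev = false } = {}) {
  const prod = Object.keys(manifest.dependencies || {});
  const dev = omitDev ? [] : Object.keys(manifest.devDependencies || {});
  return [...prod, ...dev].sort();
}

// Hypothetical manifest, for illustration only.
const manifest = {
  dependencies: { react: "^16.8.0" },
  devDependencies: { webpack: "^4.29.0" },
};
```

So if build tools sit under dependencies instead of devDependencies, a production-only install on a server or in CI pulls them in as well, even though nothing was ever published.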

npm and bower installing only end-user/production files

Lately I have been wondering if there is any way to use bower or npm only as a consumer.
Let’s say I am not really interested in developing the package further, but simply in using it on my website/application.
So as I would first think:
npm install jquery
However, that brings me a huge tree of files, and the only one I need is the jquery/dist/jquery.min.js file.
I have tried the --production flag, but the same structure was downloaded.
Same goes for bower:
bower install jquery
Again, an extensive list of files, including a src folder with a lot of dev-only files.
I am sorry if I am wrongly assuming how package managers behave here, but it would be interesting to know how to use these package managers as a simple end user instead of a developer, in order to keep my project dependencies updated.
At the moment, I feel that it's just too much for what I need, and that by simply copying jquery.min.js over to my project src folder, it would be much cleaner/simpler.
If the concept of both npm and bower is different and someone can point it out it would be appreciated as well as any tips for an alternative package manager that only imports essential production files.
Apparently Volo does exactly what I was looking for following a concept where JS libraries should be kept as one single JS file.
Here, more information about the project’s design goals.

Alternatives to Git Submodules?

I feel that using Git submodules is somehow troublesome for my development workflow. I've heard about Git subtree and Gitslave.
Are there more tools out there for multiple-repository projects, and how do they compare?
Can these tools run on Windows?
Which is best for you depends on your needs, desires, and workflow. They are in some senses semi-isomorphic, it is just some are a lot easier to use than others for specific tasks.
gitslave is useful when you control and develop on the subprojects at more or less the same time as the superproject, and furthermore when you typically want to tag, branch, push, pull, etc. all repositories at the same time. gitslave has never been tested on Windows that I know of. It requires Perl.
git-submodule is better when you do not control the subprojects or, more specifically, wish to fix the subproject at a specific revision even as the subproject changes. git-submodule is a standard part of git and thus would work on Windows.
git-subtree provides a front-end to git's built-in subtree merge strategy. It is better when you prefer to have a single-repository "unified" git history. Unlike the subtree merge strategy, it is easier to export changes to the different (directory) trees back out to the original project, but it is not as automatic as it is with gitslave or even git-submodule.
repo is in theory similar to gitslave, but not as well documented for non-Android operations, as far as I have found. It is fairly dedicated to the Google Android development model and natively supports only a handful of git commands (though you can run arbitrary commands). The limited native support doesn't cover, for example, a centralized repository to push to, and checking out a branch seems fairly difficult.
kitenet's mr is what you would want to use if you have multiple version control systems in use, but is mostly limited for git-only superprojects due to its lowest common denominator approach. There are ways to run arbitrary commands, but they are not as well integrated.
For some use cases, I have liked each of the following two simple approaches:
Nested repositories. If your software project has a plugin mechanism, with each plugin in its own sub-directory, it can make sense to git-ignore these plugin directories and, in your local filesystem, to make each of them into its own git repository. This way, all your files form a single directory tree, but are managed in different git repositories. It will not confuse git.
Per-package repositories. For software projects where you use some kind of source code package management system (gem / bundler, npm, pear or the like), it can make sense to put your reused code into separate git repositories, then to make source packages from them, and then to install them with the package management tool into the parent project. Your parent project's git repository would only contain a reference to the required packages and their versions, while the actual code of these packages will be git-ignored, as is done with all other packages and external libraries. Compared to the nested repositories proposed above, this is a more elaborate approach, as it lets you specify which package version is to be installed.
I currently use submodules for development and not just for pulling in third-party libraries. There are some ways that you can make life easier with submodules, especially when they are the source of merge or rebase conflicts. Look to ls-tree to get the two commits involved in a conflict in the submodule. This is probably the most difficult part of submodules for people to deal with. For now, scripting will make this much easier to work with. Future versions of Git should have better native support for dealing with them.
Hope this helps.
We encountered a similar issue when using Git submodules in projects where we had dependencies in a variety of languages. To deal with them, we built and open-sourced a tool called MDLR ("Modular") that gives you declarative, version-controlled Git dependencies with functionality similar to Git submodules, but without the annoying workflow. You can install it and manage your dependencies with the instructions and downloads on the GitHub repo.