Design principles for how Linux package managers update themselves?

I know there are other applications as well, but yum/apt-get/aptitude/pacman are the core package managers for Linux distributions, so let's consider those.
Today I saw this on my Fedora 13 box:
(7/7): yum-3.2.28-4.fc13_3.2.28-5.fc13.noarch.drpm | 42 kB 00:00
And I started to wonder: how does such a package update itself? What design is needed to ensure a program can update itself?
Perhaps this question is too general, but I felt SO was more appropriate than Programmers.SE for such a question, it being more technical in nature. If there is a more appropriate place for this question, feel free to let me know and I can close it, or a moderator can move it.
Thanks.

I've no idea how those particular systems work, but...
Modern Unix systems will generally tolerate overwriting a running executable without a hiccup, so in theory you could just do it.
You could do the update in a chroot jail and then move the result into place, or something similar, to reduce the time during which the system is vulnerable. Add a journalling filesystem and this is a little safer still.
It occurs to me that the package manager needs to hold the package access database in memory as well, to guard against a race condition there. Again, the chroot-jail-and-copy option is available as a lower-risk alternative.
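As a rough illustration of that copy-then-move idea (the binary name and staging path below are invented for the example, and assume both sit on the same filesystem so the rename is atomic):
# stage the new version next to the old one, on the same filesystem
install -m 755 ./yum.new /usr/bin/yum.new
# rename() is atomic: new invocations see the new file, while any
# already-running copy keeps executing from the old inode until it exits
mv /usr/bin/yum.new /usr/bin/yum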

And I started to wonder: how does such a package update itself? What design is needed to ensure a program can update itself?
It's like a lot of things: you don't need to "design" specifically to solve this problem ... but you do need to be aware of certain "gotchas".
For instance, Unix helps by reference counting inodes, so "you" can delete a file you are still using and it's fine. However, this implies a few things you have to do. For instance, if you have plugins then you need to load them all before you start a transaction ... even if a plugin would only run at the end of the transaction (because you might have a different version of it by the end).
There are also some things you need to do to make sure that anything you are updating keeps working, like: put new files down before removing old files, and don't truncate old files, just unlink them. But those also help you :).
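A small sketch of why "unlink, don't truncate" matters for a file that may still be open (the library name is made up):
# unlink: the directory entry goes away, but processes that already have the
# file open keep reading the old inode until they close it
rm /usr/lib/libfoo.so.1
# truncate: the inode itself is emptied, so every process with the file open
# suddenly sees its contents change underneath it
: > /usr/lib/libfoo.so.1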
Using external programs, which you communicate with, can be tricky (because you can't exec a new copy of the old version after it's been updated). But this isn't often done, and when it is, it's for things like downloading ... which can somewhat easily be made to happen before any updates.
There are also things which aren't a concern in command-line clients like yum/apt. For instance, if you have a program which is going to run 2+ "updates", then you can have problems if the first update was to the package manager. Downgrades make this even more fun :).
Also, daemon-like processes should basically never "load" the package manager, but as with the other gotchas ... you tend to want to follow this anyway, for other reasons.
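This is also why you will sometimes see the advice to update the package manager in a transaction of its own before anything else, roughly:
# first transaction: only the package manager itself
yum -y update yum
# second invocation is a fresh process, so it runs the newly installed yum
yum -y update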

Related

Unable to derive module descriptor for legacy signed JAR

I'm trying to update a software system to JDK-11 using modules, and everything was going just fine right up until I slammed head-on into the aforementioned issue.
I have a legacy signed JAR that I need to incorporate for interaction with legacy systems. There's no way to update the JAR and no way to get a new version. The JAR must be signed in order to be usable (the whole "trusted code" deal and whatnot). The problem is that the JAR contains classes in the unnamed (root) package. Yeah. Stupid. Bad practice. Blablabla. It's still there, and I still need to use it.
I've not found any documentation or answers anywhere that would remotely suggest that what I need is possible. In fact, the opposite is true: everyone is adamant that in the "new"(ish) module system, no class may reside in the unnamed package.
Needless to say, I'm unable to modify the contents of the JAR or to get at the sources to produce a new one - and that's without even considering the issue of the signature...
That said: I refuse to believe the folks at Oracle would leave such a glaring oversight with regards to legacy code. As we all know, a lot of the time we have no choice but to use it for legitimate reasons, and we can't do anything to fix/update/refactor/etc... I would have hoped there was a mechanism added to the module system to support this, albeit for extreme cases only, etc...etc...
Disclaimer: I do fully understand why this isn't meant to be supported. What I'm having a hard time with is the lack of a workaround...
Thanks!
I've already tried:
creating a facade module that transitively adds the offending module (obviously no dice, same problem)
unpacking-and-repacking the module while temporarily disabling signature validation in a test env (fails because the class is apparently referenced within many other, properly-organized classes)
finding an updated module (no luck here, either)
beheading a chicken and roasting it over a pentagram while invoking the aid of ancient pagan gods (tasty, but didn't fix it)
curling up in a ball under my desk and weeping until execution succeeds (that's where I'm typing this from)...

Can a change to package-lock.json ever affect the deployment?

I'm reading the NPM docs about package-lock.json and my interpretation is that a committed change to it can never cause issues in the deployed version.
During the roll-out we run npm install, which creates (or overwrites) the lock file anyway. In my mind, the lock file is more of a receipt of the state of the world at install time than a pointer for how the installation should be performed.
However, I haven't been successful convincing my team that it is so. They feel uneasy relying on the statement above (not contradicting it nor arguing against it, just not entirely convinced to the degree that they would bet a testicle on it).
Is it at all possible that package-lock.json might affect the actual installation?
Since I'm new with the company, my track record of 10+ years has limited impact. And I'm myself humbly considering that even though the lock file never caused me any issues before, my experience might be irrelevant if the local environment is configured in a way I'm not familiar with yet. So I'm too cautious to bet my reputation as we're about to make a very important release.
In my mind, the lock file is more of a receipt of the state of the world at install time than a pointer for how the installation should be performed.
Maybe I am interpreting your statement wrong, but package-lock is a pointer for future installations, in a way. See the general documentation on package locks (a different link than the one you shared); the following statement from that doc might be helpful:
This file describes an exact, and more importantly reproducible node_modules tree. Once it's present, any future installation will base its work off this file, instead of recalculating dependency versions off package.json.
Reading the following discussion on this topic might be helpful to you too. Thanks!
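One concrete way to see the lock file steering an installation is to compare npm install with npm ci (the latter requires a lock file and installs exactly what it records):
# may update package-lock.json so that it satisfies the ranges in package.json
npm install
# installs exactly the tree recorded in package-lock.json, and fails
# if the lock file is missing or out of sync with package.json
npm ci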

How can a modified Julia package be used natively?

So, there is this cool package I've found, but it leaves a lot to be desired. Since it made more sense to modify it rather than build a new one myself, I changed the code in the corresponding source directory (C:\Users\[my username]\.julia\v0.4\[package name]\src). I made sure to modify not just the base.jl file, but also the [name of package].jl one, so that there are no issues with dependencies or the new functions I added. I tried running the package several times to ensure that Julia doesn't spit out any errors or exceptions (the original package had some deprecated stuff, which I also remedied). Still, I fail to use the additional functionality of the package that I augmented. Any help would be greatly appreciated.
I'm using Julia ver 0.4.2, on a Windows 7 machine. As an IDE I use Notepad++. Thanks
I'm not exactly sure what you tried, but here's a guess as to what's going on: if you've already loaded the package in your julia session, edits to the source files won't take effect unless you explicitly reload the package. There are some good workflow tips here, and more explanation of the module system here.
However, for a newbie the easiest thing might be to quit julia and restart.
As far as making changes to a package, as Gnumic commented, your best approach is to make a branch and commit your changes there. Once you become convinced your changes represent an improvement, consider sharing your changes with the rest of the world.
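A rough sketch of that branch-and-commit workflow, assuming the directory Pkg created is a plain git checkout (the package name, branch name, and commit message below are placeholders):
cd ~/.julia/v0.4/PackageName      # or C:\Users\<you>\.julia\v0.4\PackageName on Windows
git checkout -b my-tweaks         # keep your edits on their own branch
git commit -am "add extra functionality"
# then restart julia (or reload the package) so the edited source is picked up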

build script - how to do it

About 2 months ago I took over the build process at my current company. Even though I don't have much knowledge of it, I was the only one with enough time, so I didn't have much choice.
The situation is not that good, and I would like to do the following:
Labeling files in SourceSafe with version (example ProjectName PV 1.2)
GetFiles from SourceSafe to specific directory
Build VB6/C++/C# projects (yes, there are all kinds of them)
Build InstallShield setups
This is, for now, partly done using batch scripts (one for labeling and getting files, one for building, etc.), so when a build starts I pretty much have to babysit it.
A good part of this code could be reused.
Any recommendations on how to do this better? One big problem is the whole bunch of dependencies between projects. Also, labeling has to increment the version and, if necessary, change PV to EV.
I would like to minimize user interaction as much as possible. One click on one build script (Spolsky is god) and all is done: no need to increment the version, set where to get the files, and similar stuff.
Is batch scripting the best way to go? Should I do some of the functionality with MSBuild? Are there any other options?
Specific code is not needed; for now I just need an idea of how to improve it, even though specific code wouldn't hurt.
Tnx,
Marko
Since you already have a build system (even though some of it is currently "manual"), whatever you do, don't start over from scratch.
(1) Make sure you have a test machine (or Virtual Machine) on which to work. Thus you can make changes and improvements without having to worry about breaking anything.
(2) Put all of your build scripts and tools in version control, not just the source code. Then as you make changes, see if they work. If they do, then save them to version control. If they don't, then roll them back.
(3) Choose one area to work on at a time. Don't try to do everything at once. Going from a lot of manual work to "one-click" will take time no matter what build system you're working with.
Sounds like you want a continuous integration solution, like CC.Net. It has configuration options to do all the things you want and a great community to answer questions.
Also, batch scripting is probably not a good option. Sophisticated build and integration tools will let you feed parameters into the build and create different builds for different environments (test, production, etc.). Batch scripting will involve a lot of hand-coding and glue.
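Whichever tool you end up with, a useful first step is to wrap what already exists behind a single entry point so that "one click" becomes possible. A very rough sketch, where every script name and argument is a placeholder for the batch files you already have:
set -e                            # stop at the first failing step
VERSION="$1"                      # e.g. "PV 1.2", passed in once instead of edited by hand

./label-sources.sh  "$VERSION"    # placeholder: your existing labeling step
./get-sources.sh    build-area    # placeholder: fetch the labeled files into one directory
./build-projects.sh build-area    # placeholder: VB6/C++/C# builds, in dependency order
./build-setups.sh   build-area    # placeholder: the InstallShield setups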

Mercurial practices: use with IDEs and scalability

I am not an experienced user of SCM tools, even though I am convinced of their usefulness, of course.
I used some obscure commercial tool in a former job, Perforce in the current one, and played a bit with TortoiseSVN for my little personal projects, but I disliked having lots of .svn folders all over the place, making searches, backups and such more difficult.
Then I discovered the interest of distributed SCM and I chose to go the apparently simpler (than git) Mercurial way, still for my personal, individual needs. I am in the process of learning to use it properly, having read part of the wiki and being in the middle of the excellent PDF book.
I see often repeated, for example in Mercurial working practices, "don't hesitate to use multiple trees locally. Mercurial makes this fast and light-weight." and "for each feature you work on, create a new tree.".
This is interesting and sensible advice, but it clashes a bit with my habits from centralized SCM, where we have a "holy" central repository where branches are carefully planned (and handled by administrators), changelists must be checked by (senior) peers and must not break the builds, etc. :-) Starting to work on a new branch takes quite some time...
So I have two questions in the light of above:
How practical is it to do a lot of clones, in the context of IDEs and such? What if the project has configuration/settings files, makefiles or Ant scripts or shell scripts or whatever, needing path updates? (Yes, probably a bad idea...) For example, in Eclipse, if I want to compile and run a clone, I have to create yet another project, tweaking the Java build path, the Run/Debug targets, and so on, unless an Eclipse plugin eases that task. Am I missing some facility here?
How does that scale? I have read that Hg is OK for large code bases, but I am perplexed. At my job, we have a Java application (well, several, around a big common kernel) of some 2 million lines, weighing some 110 MB for the code alone. Doing a clean compile on my old (2004) Windows workstation takes some 15 minutes to generate the 50 MB of class files! I don't see myself cloning the whole project to change 3 files. So what are the practices here?
I haven't yet seen these questions addressed in my readings, so I hope this will make a useful thread.
You raise some good points!
How practical is it to do a lot of clones, in the context of IDEs and such?
You're right that it can be difficult to manage many clones when IDEs and other tools depend on absolute paths. Part of it can be solved by always using relative paths in your configuration files -- making sure that a source checkout can compile from any location is a good goal in itself, no matter what revision control system you use :-)
But when you cannot or don't want to bother with several clones, then please note that a single clone can cope with multiple branches. The "hgbook" emphasizes many clones since this is a conceptually simple and very safe way of working. When you get more experience you'll see that you can use multiple heads in a single repository (perhaps naming them with bookmarks) to do the same.
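For example, with bookmarks (shipped as an extension in older Mercurial releases and built in later on), several lines of work can live in one clone:
hg bookmark bug-434       # give the current head a name
hg update bug-434         # work on it
...
hg commit
hg update default         # switch back to the main line of development
hg bookmarks              # list the named heads in this repository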
How does that scale?
Cloning a 110 MB repository should be quite fast: it depends on how long it takes to write 110 MB to your disk. In a recent message to the Mercurial mailinglist it was reported that cloning 6.3 GB took 4 minutes -- scaling that down to 110 MB gives about 4 seconds. That should be fast enough that your tea is still warm :-) Part of the trick is that the history data are simply hard-linked (yes, also on Windows) and so it is only a matter of writing out the files in the working copy.
PhiLo: I'm new at this, but Mercurial also has "internal branches" that you can use within a single repository instead of cloning it.
Instead of
hg clone toto toto-bug-434
you can do
cd toto
hg branch bug-434
hg update bug-434
...
hg commit
hg update default
to create a branch and switch back and forth. Your built files that are not under revision control won't go away when you switch branches; some of them will just go out of date as the underlying sources are modified. Your IDE will rebuild what's needed and no more. It works much like CVS or Subversion.
You should still have clean 'incoming' and 'outgoing' repositories in addition to your 'work' repository. Just that your 'work' can serve multiple purposes.
That said, you should clone your work repo before attempting anything intricate. If anything goes wrong you can throw the clone away and start over.
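Concretely, that safety net is just another local clone (the directory names are invented):
hg clone work work-experiment     # cheap local copy before the risky change
cd work-experiment
...                               # try the intricate operation here
hg push ../work                   # if it worked, push the result back to 'work'
# if it went wrong, simply delete work-experiment and 'work' is untouched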
Question 1:
PIDA IDE has pretty good Mercurial integration. We also use Mercurial for development itself. Personally, I have about 15 concurrent clones going of some projects, and the IDE copes fine. We don't have the trouble of tweaking build scripts etc.; we can "clone and go".
It is so easy that in many cases I will clone to the bug number like:
hg clone http://pida.co.uk/hg pida-345
For bug #345, and I am ready to fix it.
If you are having to tweak build scripts depending on the actual checkout directory of your application, I might consider that your build scripts should be using some kind of project-relative path, rather than hard-coded paths.
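For shell-driven builds, one common way to get that project-relative behaviour is to resolve every path from the script's own location (the variable names here are just illustrative):
# resolve the project root from where this script lives, not from $PWD
PROJECT_ROOT="$(cd "$(dirname "$0")" && pwd)"
SRC_DIR="$PROJECT_ROOT/src"       # identical in every clone, wherever it is checked out
BUILD_DIR="$PROJECT_ROOT/build"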