Hide local filenames in Browserify output?

When you bundle your code with Browserify, each module you use is inlined in the resulting output and labeled with its local file path. So you can see file path strings in your bundled code.
But in theory, these strings could all be rewritten to "1", "2", etc, which might be a security advantage in some situations (and would save a few bytes).
Is there any option for this, or some transform that will do it? (It would have to rewrite the labels for each inlined module, and all the corresponding require calls.)

It seems that six months after your question, Browserify's author James Halliday published bundle-collapser, which does pretty much exactly what we were both after.
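If I remember its interface correctly, it can either post-process a finished bundle or run as a browserify plugin (the file names here are made up):
browserify main.js -p bundle-collapser/plugin > bundle.js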

The insert-globals option may be enabled with any of the following aliases:
--insert-globals
--ig
--fast
If you remove all of those, your build may be slightly slower, but the local file names will no longer appear in the output.
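In other words, a plain build without any of those aliases (main.js is a placeholder):
browserify main.js > bundle.js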

Snakemake: parameterized runs that re-use intermediate output files

Here is a complex problem that I am struggling to find a clean solution for:
Imagine having a Snakemake workflow with several rules that can be parameterized in some way. Now, we might want to test different parameter settings for some rules, to see how the results differ. However, ideally, if these rules depend on the output of other rules that are not parameterized, we want to re-use these non-changing files, instead of re-computing them for each of our parameter settings. Furthermore, if at all possible, all this should be optional, so that in the default case, a user does not see any of this.
There is inherent complexity in there (specifying which files are re-used, etc.). I am also aware that this is not exactly the intended use case of Snakemake ("reproducible workflows"), but more of a meta-feature for experimentation.
Here are some approaches:
Naive solution: Add wildcards for each possible parameter to the file paths. This gets ugly, hard to maintain, and hard to extend really quickly. Not a solution.
A nice approach might be to name each run and have an individual config file for that name which contains all the settings we need. Then we only need a wildcard for such a named set of parameter settings. That would probably require reading some table or meta-config file and processing it. That doesn't solve the re-use issue, though. Also, it means we need multiple config files for one snakemake call, and it seems that this is not possible (they would instead update each other, rather than being treated as individual configs to be run separately).
Somehow use sub-workflows, specifying individual config files each time, e.g., via a wildcard. Not sure this can be done (e.g., configfile: path/to/{config_name}.yaml). Still not a solution for file re-use.
Quick-and-dirty: Run all the rules up to the last output file that is shared between different configurations. Then, manually (or with some extra script) create directories with symlinks to this "base" run, with individual config files that specify the parameters for the per-config runs. This still necessitates calling snakemake individually for each of these directories, making cluster usage harder.
None of these solve all issues though. Any ideas appreciated!
Thanks in advance, all the best
Lucas
Snakemake now offers the Paramspace helper to solve this! https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html?highlight=parameter#parameter-space-exploration
I have not tried it yet, but it seems like the solution to the issue!
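A rough sketch of what that could look like (untested; params.tsv, the alpha column, my_tool, and all file paths are made up for illustration):

from snakemake.utils import Paramspace
import pandas as pd

# one row per parameter setting, one column per parameter (e.g. an "alpha" column)
paramspace = Paramspace(pd.read_csv("params.tsv", sep="\t"))

rule all:
    input:
        # one output per parameter instance, e.g. results/alpha~0.1.done
        expand("results/{params}.done", params=paramspace.instance_patterns)

rule run_config:
    input:
        # shared, parameter-independent input is computed only once
        "shared/base_output.txt"
    output:
        f"results/{paramspace.wildcard_pattern}.done"
    params:
        cfg=paramspace.instance  # dict with the parameter values for this instance
    shell:
        "my_tool --alpha {params.cfg[alpha]} {input} > {output}"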

using require in layer.conf in yocto

Considering all the freedom that Yocto gives to the developer, I have a question.
I would like to make this my_file.inc available only for recipes in one particular meta-layer. I know that, for instance, using the INHERIT keyword inside local.conf will make a my_class.bbclass file available globally for each recipe.
Is it a good practice to add this:
require my_file.inc
inside layer.conf? Or should I change my_file.inc to my_file.bbclass and add INHERIT = "my_file.bbclass" to layer.conf?
Any other possibilities?
Even if it seems to work, neither of your approaches is technically completely correct. The key point is that all .conf files are parsed first, and everything they contain is globally visible throughout the whole build process. So if you add something through the layer.conf file, it is not only being pulled in at an unexpected place, it also is not limited to that layer only and might therefore cause breakage at other places.
While I do not have a really good and clean solution, maybe the following can help you:
You can make your custom recipes react to certain keywords in DISTRO_FEATURES or MACHINE_FEATURES. Then you can create a two-stage approach:
Add the desired keyword in local.conf (or your MACHINE, or DISTRO, or whatever configuration)
Make the recipes react to it. If you need the mechanism in several places, it might be useful to put it into a .bbclass that your layer brings along and that you inherit in the respective recipes.
This way the effect is properly contained.
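A rough sketch of the two stages ("my-feature" and the configure switches are made up; the :append syntax is for current Yocto releases, older ones use _append):
# stage 1: in local.conf (or your MACHINE/DISTRO configuration)
DISTRO_FEATURES:append = " my-feature"
# stage 2: in a recipe (or .bbclass) of your layer, react to the keyword
EXTRA_OECONF += "${@bb.utils.contains('DISTRO_FEATURES', 'my-feature', '--enable-extra', '--disable-extra', d)}"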
Maybe section 5.1.3.2 of the Yocto Project documentation answers your question:
Avoid duplicating include files. Use append files (.bbappend) for each recipe that uses an include file. Or, if you are introducing a new recipe that requires the included file, use the path relative to the original layer directory to refer to the file. For example, use require recipes-core/package/file.inc instead of require file.inc. If you're finding you have to overlay the include file, it could indicate a deficiency in the include file in the layer to which it originally belongs. If this is the case, you should try to address that deficiency instead of overlaying the include file. For example, you could address this by getting the maintainer of the include file to add a variable or variables to make it easy to override the parts needing to be overridden.
So to avoid duplicate inclusion later, it would be better not to include your .inc file(s) this way.

IDE for Go capable of refactoring: variable, function, structure and package renaming

I am interested in any IDE (or even a script) that is capable of refactoring Go source code for variable renaming. For example, in Eclipse for Java, one can select a variable, an object, or a class, rename it, and it gets automatically renamed in all the files in the project. This feature is very useful when plain string replacement could cause substring collisions.
If you're interested in a script, use gofmt with the -r flag, like this:
gofmt -w -r 'OldFoo -> Foo' foopackage
From the docs:
Without an explicit path, it processes the standard input. Given a file, it operates on that file; given a directory, it operates on all .go files in that directory, recursively. (Files starting with a period are ignored.) By default, gofmt prints the reformatted sources to standard output.
EDIT: Today there are better tools for that: gorename for renaming and eg for general refactoring.
The gorename tool performs precise type-safe renaming of identifiers in Go source code.
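For example, a sketch of an invocation (the import path is made up; -from takes a quoted package path plus the identifier):
gorename -from '"github.com/you/foopackage".OldFoo' -to Foo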

Should I merge .pbxproj files with git using merge=union?

I'm wondering whether the merge=union option in .gitattributes makes sense for .pbxproj files.
The manpage states for this option:
Run 3-way file level merge for text files, but take lines from both versions, instead of leaving conflict markers. This tends to leave the added lines in the resulting file in random order and the user should verify the result.
Normally, this should be fine for the 90% case of adding files to the project. Does anybody have experience with this?
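For reference, the setup in question is a one-line entry in .gitattributes:
*.pbxproj merge=union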
Not a direct experience, but:
This SO question really advises against merging .pbxproj files:
The pbxproj file isn't really human mergable.
While it is plain ASCII text, it's a form of JSON. Essentially you want to treat it as a binary file.
(hence a gitignore solution)
Actually, Peter Hosey adds in the comment:
It's a property list, not JSON. Same ideas, different syntax.
Yet, according to this question:
The truth is that it's way more harmful to disallow merging of that .pbxproj file than it is helpful.
The .pbxproj file is simply JSON (similar to XML). From experience, just about the ONLY merge conflict you will ever get is if two people have added files at the same time. The solution in 99% of the merge conflict cases is to keep both sides of the merge.
So a merge 'union' (with a gitattributes merge directive) makes sense, but do some tests to see if it does the same thing as the script mentioned in the last question.
See also this question for potential conflicts.
See the Wikipedia article on Property List
I have been working with a large team lately and tried *.pbxproj merge=union, but ultimately had to remove it.
The problem was that braces would become out of place on a regular basis, which made the files unreadable. It is true that it does work most of the time - but fails maybe 1 out of 4 times.
We are back to using *.pbxproj -crlf -merge for now. This seems to be the best solution that is workable for us.
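That is, the .gitattributes entry we settled on, which disables both line-ending normalization and merging for the file:
*.pbxproj -crlf -merge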

Process for reducing the size of an executable

I'm producing a hex file to run on an ARM processor which I want to keep below 32K. It's currently a lot larger than that, and I wondered if someone might have advice on the best approach to slim it down.
Here's what I've done so far
So I've run 'size' on it to determine how big the hex file is.
Then I ran 'size' again to see how big each of the object files that link to create the hex file is. It seems the majority of the size comes from external libraries.
Then I used 'readelf' to see which functions take up the most memory.
I searched through the code to see if I could eliminate calls to those functions.
Here's where I get stuck: there are some functions which I don't call directly (e.g. _vfprintf), and I can't find what calls them so that I can remove the calls (as I think I don't need them).
So what are the next steps?
Response to answers:
As I can see, there are functions being called which take up a lot of memory. I cannot, however, find what is calling them.
I want to omit those functions (if possible) but I can't find what's calling them! They could be called from any number of library functions, I guess.
The linker is working as desired, I think; it only includes the relevant library files. But how do you know whether only the relevant functions are being included? Can you set a flag or something for that?
I'm using GCC
General list:
Make sure that you have the compiler and linker debug options disabled
Compile and link with all size options turned on (-Os in gcc)
Run strip on the executable
Generate a map file and check your function sizes. You can either get your linker to generate your map file (-M when using ld), or you can use objdump on the final executable (note that this will only work on an unstripped executable!) This won't actually fix the problem, but it will let you know of the worst offenders.
Use nm to investigate the symbols that are called from each of your object files. This should help in finding who's calling functions that you don't want called (a command sketch for these steps follows this list).
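A sketch of those steps with GCC and binutils (file names are made up):
gcc -Os -c main.c -o main.o
gcc -Os main.o -o firmware.elf -Wl,-Map=firmware.map
strip firmware.elf
nm --print-size --size-sort main.o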
The original question included a sub-question about including only relevant functions. gcc will include all functions within every object file that is used. To put that another way, if you have an object file that contains 10 functions, all 10 functions are included in your executable even if only 1 is actually called.
The standard libraries (eg. libc) will split functions into many separate object files, which are then archived. The executable is then linked against the archive.
By splitting into many object files the linker is able to include only the functions that are actually called. (this assumes that you're statically linking)
There is no reason why you can't do the same trick. Of course, you could argue that if the functions aren't called then you can probably remove them yourself.
If you're statically linking against other libraries you can run the tools listed above over them too to make sure that they're following similar rules.
Another optimization that might save you work is -ffunction-sections, -Wl,--gc-sections, assuming you're using GCC. A good toolchain will not need to be told that, though.
Explanation: GNU ld links sections, and GCC emits one section per translation unit unless you tell it otherwise. But in C++, the nodes in the dependency graph are objects and functions.
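In other words, something like this (a sketch, assuming GCC and GNU ld; file names are made up):
gcc -Os -ffunction-sections -c main.c
gcc -Os -Wl,--gc-sections main.o -o firmware.elf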
On deeply embedded projects I always try to avoid using any standard library functions. Even simple functions like "strtol()" blow up the binary size. If possible, simply avoid those calls.
In most deeply embedded projects you don't need a versatile "printf()" or dynamic memory allocation (many controllers have 32 KB or less of RAM).
Instead of the standard "printf()" I use a very simple custom one; it can only print numbers in hexadecimal or decimal format, nothing more. Most data structures are preallocated at compile time.
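As an illustration of how small such a helper can be, here is a minimal sketch in C (putchar_hw is a hypothetical board-specific output routine; no varargs, no format strings):

/* Print an unsigned int in the given base (10 or 16) - nothing more. */
extern void putchar_hw(char c);

static void print_uint(unsigned int v, unsigned int base)
{
    char buf[11];               /* enough digits for 32-bit values in base 10 */
    int i = 0;
    do {
        unsigned int d = v % base;
        buf[i++] = (char)((d < 10) ? ('0' + d) : ('a' + d - 10));
        v /= base;
    } while (v);
    while (i--)
        putchar_hw(buf[i]);     /* digits were collected in reverse order */
}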
Andrew EdgeCombe has a great list, but if you really want to scrape off every last byte, sstrip is a good tool that is missing from the list and can shave off a few more kB.
For example, when run on strip itself, it can shave off ~2kB.
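Usage is a single command; as far as I remember it rewrites the file in place (the file name is a placeholder):
sstrip firmware.elf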
From an old README (see the comments at the top of this indirect source file):
sstrip is a small utility that removes the contents at the end of an ELF file that are not part of the program's memory image.
Most ELF executables are built with both a program header table and a section header table. However, only the former is required in order for the OS to load, link and execute a program. sstrip attempts to extract the ELF header, the program header table, and its contents, leaving everything else in the bit bucket. It can only remove parts of the file that occur at the end, after the parts to be saved. However, this almost always includes the section header table, and occasionally a few random sections that are not used when running a program.
Note that due to some of the information it removes, an sstrip'd executable is rumoured to have issues with some tools. This is discussed more in the comments of the source.
Also... for an entertaining/crazy read on how to make the smallest possible executable, this article is worth a read.
Just to double-check and document for future reference: do you use Thumb instructions? They're 16-bit versions of the normal 32-bit instructions. Sometimes you might need two 16-bit instructions where one 32-bit instruction would have done, so it won't save a full 50% in code space.
A decent linker should take just the functions needed. However, you might need compiler and linker settings to package functions for individual linking.
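With GNU tools for ARM, Thumb code is selected with a compiler flag (a sketch; the exact toolchain prefix may differ):
arm-none-eabi-gcc -mthumb -Os -c main.c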
OK, so in the end I just reduced the project to its simplest form, then slowly added files one by one until the function that I wanted to remove appeared in the 'readelf' output. Then, when I had the file, I commented everything out and slowly added things back in until the function popped up again. So in the end I found out what called it and removed all those calls... Now it works as desired... sweet!
Must be a better way to do it, though.
To answer this specific need:
I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
If you want to analyze your code base to see who calls what, by whom a given function is called, and things like that, there is a great tool out there called "Understand C" provided by SciTools.
https://scitools.com/
I have used it very often in the past to perform static code analysis. It can really help to determine the library dependency tree, and it allows you to easily browse up and down the call tree, among other things.
They provide a limited time evaluation, then you must purchase a license.
You could look at something like executable compression.