I have a problem regarding the process of saving a bunch of PDFs (exported from Word documents).
The runtime of my program behaves a bit strangely, and that's why I am asking.
I want to save the files on a global (network) drive.
In my program I create a folder on that drive where I put all the PDFs.
Somehow, if I do this operation for the first time it is really slow.
But if I do the same operation for the same folder a second time, it is really fast (after I deleted the "old" PDFs, or the old PDFs were overwritten).
I am a bit frustrated and I cannot explain why that is.
Could somebody help, please?
I would be very happy for an answer.
Greetings
Jonas
I am using this simple code:
doc.ExportAsFixedFormat wholefile, ExportFormat:=wdExportFormatPDF
There are multiple reasons why the ExportAsFixedFormat method can be slow, but I would start by dealing with local files only: export to a local path first and then copy the finished PDF to the network drive. After that I'd experiment with the arguments that control document quality and the exported format. It also makes sense to play with the other parameters that your code sample leaves at their default values.
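A minimal sketch of that local-first approach, assuming Word VBA; the temp file name and the network path are placeholders, not taken from the question:

Dim localPath As String
Dim networkPath As String

' Hypothetical paths - adjust to your environment
localPath = Environ$("TEMP") & "\report.pdf"
networkPath = "\\server\share\reports\report.pdf"

' Export to the fast local disk first
doc.ExportAsFixedFormat OutputFileName:=localPath, _
    ExportFormat:=wdExportFormatPDF, _
    OptimizeFor:=wdExportOptimizeForOnScreen    ' smaller and usually faster than print quality

' Then move the finished PDF to the network drive in one step
FileCopy localPath, networkPath
Kill localPath

If the export itself is still slow on the first run, the quality-related arguments (OptimizeFor, BitmapMissingFonts and so on) are the next thing to vary.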
In VB6 I had (example) the following directories:
CommonCode
Parser
OrbitalDynamics
DnD
RelativisticBang
The first is standard inclusions I use (constants.vb, science.vb, constAstro.vb, DnD.vb...) while the others are projects which all use one or more of the common code files.
If I loaded OrbitalDynamics and, while in it, added a new constant in constants.vb then the next time I loaded the others, they also had the new constant immediately available.
I've rebuilt the common code files for VB2008Exp. I'm now trying to rebuild some of the projects. The problem is that I've not found a way to bring in the common code files. Every time I've tried it's copied them into the project rather than referenced them. Copying is, obviously, useless for the idea of common code.
Hopefully someone out there knows what I've missed, or knows some other method of using common code files in VB2008Exp. I can do this sort of thing in C++ and in C (both on Linux), and in VB6, but so far VB2008Exp has been very intractable.
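For what it's worth, a minimal sketch of what a linked (rather than copied) file reference looks like inside a VB2008 project (.vbproj) file, assuming the CommonCode directory sits next to the project directory; the paths are hypothetical:

<ItemGroup>
  <!-- Hypothetical relative paths: reference the shared files in place instead of copying them -->
  <Compile Include="..\CommonCode\constants.vb">
    <Link>constants.vb</Link>
  </Compile>
  <Compile Include="..\CommonCode\science.vb">
    <Link>science.vb</Link>
  </Compile>
</ItemGroup>

In the IDE this corresponds to using the drop-down arrow on the Add button in the Add Existing Item dialog and choosing "Add As Link", if that option is available in the Express edition.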
I am working on a project that will copy files to a database every time something is added to a specific directory. The program works fine when I'm testing with a small set of data, but I was wondering if someone could explain how the FileSystemWatcher.Created event works.
My main concern is when I use this on a larger scale the program may slow down when it handles 100,000+ files.
If this is an issue, could anyone explain whether there is some sort of workaround to watching the original folder, let's call it "C:\folder", such as polling a temp folder instead?
I have not tested the watcher with 100,000 files. However, in most cases you should not have so many files in a folder awaiting processing. I recommend a structure like
C:\folder
C:\folder\processing
C:\folder\archive
C:\folder\error
As soon as you begin working on a given file, move it into processing. If you successfully process it, move the file again to archive. If there is an error while processing a file, instead move it into error.
This will make it easier for you to keep the files organized and diagnose problems that occur in production.
With that file structure, you will not run into issues with large numbers of files in the folder you are watching, unless you receive files in incredibly large bursts compared to the speed with which they can be moved into the processing state.
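A rough sketch of that pattern in C# (the folder names follow the structure above; the database call is just a placeholder, and a production version would retry the move while the writer still holds the file open):

using System;
using System.IO;

class Watcher
{
    const string Incoming   = @"C:\folder";
    const string Processing = @"C:\folder\processing";
    const string Archive    = @"C:\folder\archive";
    const string Error      = @"C:\folder\error";

    static void Main()
    {
        var watcher = new FileSystemWatcher(Incoming);
        watcher.IncludeSubdirectories = false;      // ignore the sub-folders we move files into
        watcher.Created += OnCreated;
        watcher.EnableRaisingEvents = true;

        Console.WriteLine("Watching... press Enter to quit.");
        Console.ReadLine();
    }

    static void OnCreated(object sender, FileSystemEventArgs e)
    {
        string name = Path.GetFileName(e.FullPath);
        string working = Path.Combine(Processing, name);
        try
        {
            File.Move(e.FullPath, working);         // claim the file
            // ... copy the contents to the database here (placeholder) ...
            File.Move(working, Path.Combine(Archive, name));
        }
        catch (Exception)
        {
            if (File.Exists(working))
                File.Move(working, Path.Combine(Error, name));
        }
    }
}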
I am trying to figure out a way to access the Items from the context without having to declare an ItemGroup explicitly.
Currently I am trying this for the Copy task:
<Copy SourceFiles="C:\blabla\**\*.*" DestinationFiles="%(?.RecursiveDir)" />
What can I use in place of "?" to select the Items in context?
The reason is that I have an MSBuild project file being generated via XSLT, and there is an unknown number of folders and files in the input XML (some of them follow a different structure under the destination folder - in that case I intend to use different metadata in place of RecursiveDir). Is it possible to achieve this without the need to declare loads of ItemGroups (or an ItemGroup with lots of Items)?
I tried searching for this, but all I found were posts with ItemGroups declared.
Alexey Shcherbak wrote:
You want to refer to item metadata without explicitly declaring the item itself, so I have doubts you will be able to do this. Also, the Copy task requires that SourceFiles be of type ITaskItem[] (literally, it requires an item collection). Actually, the MSDN description of the Copy task has an exact example you could follow, but you have to declare an ItemGroup with nested Item elements inside.
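A minimal sketch of that ItemGroup-plus-Copy shape, using the wildcard from the question; the destination root is a placeholder:

<ItemGroup>
  <!-- One wildcard item; %(RecursiveDir) is populated from the ** part of the Include -->
  <SourceFiles Include="C:\blabla\**\*.*" />
</ItemGroup>

<Target Name="CopyAll">
  <Copy SourceFiles="@(SourceFiles)"
        DestinationFiles="@(SourceFiles->'C:\dest\%(RecursiveDir)%(Filename)%(Extension)')" />
</Target>

Note that a single wildcard item like this already covers an arbitrary number of folders and files, so you don't need one Item per file.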
You may wonder whether having an item with a lot of files makes MSBuild slow down.
The answer is: it depends =). It depends on what numbers you mean by a huge fileset =). It's true that the MSBuild engine emits and evaluates each item group in memory, and a huge fileset could lead to a bigger memory footprint. But MSBuild is not meant to work as your scripting language of choice (even PowerShell has issues with 250K+ files in one directory, and Windows itself does too). If you just need to perform a copy without accessing the full metadata (except the recursive dir), use the Exec task and invoke robocopy.exe - it works waaaay better than anything else (considering the tools available out of the box).
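For the Exec/robocopy route, something along these lines; the source and destination are placeholders, /E copies subdirectories, and robocopy's exit codes below 8 mean success, hence IgnoreExitCode:

<Target Name="RobocopyAll">
  <!-- robocopy returns codes 0-7 for success, so don't let MSBuild treat them as errors -->
  <Exec Command="robocopy.exe C:\blabla C:\dest /E /NFL /NDL" IgnoreExitCode="true" />
</Target>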
As an addition: huge numbers should be tested and evaluated before we declare that a concrete tool isn't acceptable for the job. I think that since MSBuild can deal with big solutions, it can probably deal with pretty big filesets too; it's just a question of resources and speed. But any tool has its insurmountable limits.
Actually, I meant not a robocopy extension but robocopy.exe itself (en.wikipedia.org/wiki/Robocopy); you can easily call it with the Exec task. And surely hard links are unbeatable in terms of "copy" speed =). But keep in mind that they only work within a single disk volume (because it's not an actual copy, it's just adding another file name to the same set of bytes =) ). In case you need an actual copy to another drive or over the network, robocopy will shine again =).
PS: 20k files are far from my definition of huge ;) We dealt with ~280k-300k small files, with a total volume of around 80 GB. PowerShell for the plumbing and robocopy for the actual bit-moving won that round.
I'm producing a hex file to run on an ARM processor which I want to keep below 32K. It's currently a lot larger than that and I wondered if someone might have some advice on what's the best approach to slim it down?
Here's what I've done so far
So I've run 'size' on it to determine how big the hex file is.
Then 'size' again to see how big each of the object files is that are linked to create the hex file. It seems the majority of the size comes from external libraries.
Then I used 'readelf' to see which functions take up the most memory.
I searched through the code to see if I could eliminate calls to those functions.
Here's where I get stuck: there are some functions which I don't call directly (e.g. _vfprintf) and I can't find what calls them, so that I can remove the call (as I think I don't need it).
So what are the next steps?
Response to answers:
As I can see, there are functions being called which take up a lot of memory. However, I cannot find what is calling them.
I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
The linker is working as desired, I think; it only includes the relevant library files. But how do you know if only the relevant functions are being included? Can you set a flag or something for that?
I'm using GCC
General list:
Make sure that you have the compiler and linker debug options disabled
Compile and link with all size options turned on (-Os in gcc)
Run strip on the executable
Generate a map file and check your function sizes. You can either get your linker to generate your map file (-M when using ld), or you can use objdump on the final executable (note that this will only work on an unstripped executable!) This won't actually fix the problem, but it will let you know of the worst offenders.
Use nm to investigate the symbols that are called from each of your object files. This should help in finding who's calling functions that you don't want called.
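By way of illustration, the corresponding GCC/binutils invocations might look something like this (file names such as firmware.elf are placeholders):

# size-optimised build, debug options off
gcc -Os -o firmware.elf main.c uart.c

# have the linker write a map file while it links
gcc -Os -Wl,-Map,firmware.map -o firmware.elf main.c uart.c

# largest symbols in the (unstripped) executable
nm --size-sort --print-size firmware.elf

# which undefined symbols does each object file drag in?
nm -u main.o

# finally, remove symbol and debug information
strip firmware.elf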
In the original question there was a sub-question about including only the relevant functions. gcc will include all functions within every object file that is used. To put that another way, if you have an object file that contains 10 functions, all 10 functions are included in your executable even if only 1 is actually called.
The standard libraries (eg. libc) will split functions into many separate object files, which are then archived. The executable is then linked against the archive.
By splitting into many object files the linker is able to include only the functions that are actually called. (this assumes that you're statically linking)
There is no reason why you can't do the same trick. Of course, you could argue that if the functions aren't called then you can probably remove them yourself.
If you're statically linking against other libraries you can run the tools listed above over them too to make sure that they're following similar rules.
Another optimization that might save you work is -ffunction-sections, -Wl,--gc-sections, assuming you're using GCC. A good toolchain will not need to be told that, though.
Explanation: GNU ld links sections, and GCC emits one section per translation unit unless you tell it otherwise. But in C++, the nodes in the dependency graph are objects and functions.
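A sketch of how those flags are typically combined (file names are placeholders):

# put every function and data object in its own section...
gcc -Os -ffunction-sections -fdata-sections -c main.c uart.c

# ...then let the linker discard the sections nothing references
gcc -Os -Wl,--gc-sections -o firmware.elf main.o uart.o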
On deeply embedded projects I always try to avoid using any standard library functions. Even simple functions like "strtol()" blow up the binary size. If possible just simply avoid those calls.
In most deeply embedded projects you don't need a versatile "printf()" or dynamic memory allocation (many controllers have 32kb or less RAM).
Instead of just using "printf()" I use a very simple custom "printf()"; this function can only print numbers in hexadecimal or decimal format, nothing more. Most data structures are preallocated at compile time.
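As an illustration, a stripped-down print routine along those lines might look like the following; uart_putc() stands in for whatever byte-output routine the target provides:

/* Minimal number printer: unsigned decimal or hex only, no field widths, no malloc. */
void uart_putc(char c);                /* assumed to exist elsewhere (placeholder) */

static void print_uint(unsigned int value, unsigned int base)
{
    char buf[12];                      /* enough for a 32-bit value in base 10 or 16 */
    const char digits[] = "0123456789abcdef";
    int i = 0;

    do {
        buf[i++] = digits[value % base];
        value /= base;
    } while (value != 0);

    while (i > 0)
        uart_putc(buf[--i]);           /* emit the digits in the right order */
}

/* usage: print_uint(status, 16); print_uint(count, 10); */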
Andrew EdgeCombe has a great list, but if you really want to scrape off every last byte, sstrip is a good tool that is missing from the list and can shave off a few more kB.
For example, when run on strip itself, it can shave off ~2kB.
From an old README (see the comments at the top of this indirect source file):
sstrip is a small utility that removes the contents at the end of an
ELF file that are not part of the program's memory image.
Most ELF executables are built with both a program header table and a
section header table. However, only the former is required in order
for the OS to load, link and execute a program. sstrip attempts to
extract the ELF header, the program header table, and its contents,
leaving everything else in the bit bucket. It can only remove parts of
the file that occur at the end, after the parts to be saved. However,
this almost always includes the section header table, and occasionally
a few random sections that are not used when running a program.
Note that due to some of the information that it removes, an sstrip'd executable is rumoured to have issues with some tools. This is discussed more in the comments of the source.
Also... for an entertaining/crazy look at how to make the smallest possible executable, this article is worth a read.
Just to double-check and to document it for future reference, but do you use Thumb instructions? They're 16-bit versions of the normal 32-bit instructions. Sometimes you might need two 16-bit instructions where one 32-bit instruction would do, so it won't save 50% in code space.
A decent linker should take just the functions needed. However, you might need compiler and linker settings to package functions for individual linking.
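For reference, with GCC the Thumb instruction set is selected per compilation with a flag along these lines (the compiler name and CPU are examples, adjust to your toolchain):

# build for Thumb; -mcpu is just an example value
arm-none-eabi-gcc -mthumb -mcpu=arm7tdmi -Os -c main.c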
OK, so in the end I just reduced the project to its simplest form, then slowly added files one by one until the function that I wanted to remove appeared in the 'readelf' output. Then, when I had the file, I commented everything out and slowly added things back in until the function popped up again. So in the end I found out what called it and removed all those calls... Now it works as desired... sweet!
Must be a better way to do it though.
To answer this specific need:
• I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
If you want to analyze your code base to see who calls what, by whom a given function is being called and things like that, there is a great tool out there called "Understand C" provided by SciTools.
https://scitools.com/
I have used it very often in the past to perform static code analysis. It can really help to determine the library dependency tree. Among other things, it lets you easily browse up and down the call tree.
They provide a limited time evaluation, then you must purchase a license.
You could look at something like executable compression.