I have a question about how DLLs are loaded and processed in memory. DLLs are shared libraries, so a DLL should only need to be loaded once. If one process loads a DLL (e.g. advapi32.dll) into memory, how does another process that also needs advapi32.dll refer to the copy that is already loaded? How can a common location be shared by every process?
I'm not entirely sure what your question is, but yes, if multiple processes import the same DLL, then the read-only sections of that DLL are typically mapped into all of those processes. On the other hand, sections that can change, like the BSS (variable) segment, get a copy in each process so that the changes one process makes are invisible to other processes. If you want certain changes to be shared between processes for your own DLL, you can mark a data section in the DLL as shared. Exactly how you do this depends on the development tools you're using.
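With Microsoft's compiler and linker, for example, a shared data section can be set up roughly like this (a minimal sketch; the section and variable names are just illustrative):

    // MSVC sketch: put a variable in a named section, then ask the linker to
    // mark that section Read/Write/Shared across every process loading the DLL.
    #pragma data_seg(".shared")
    volatile long g_usage_count = 0;   // must be initialized, or it won't land in this section
    #pragma data_seg()
    #pragma comment(linker, "/SECTION:.shared,RWS")

Every process that loads the DLL then reads and writes the same copy of g_usage_count instead of getting a private one.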
ALL,
I have an application which loads multiple DLLs. One of those DLLs has a memory leak.
From what I understand the best tool to find memory leaks is VLD ;-)
So I downloaded the latest release and installed it in the default location.
Now the documentation says that I need to include the vld.h file somewhere once and link to the VLD libraries and then just run the application.
My question is - should I include it in the DLL code where the leak occurs, or do I do that in the main application? And the same question for the linking...
Thank you.
Include vld.h in each source file (if you use precompiled headers, you can include vld.h once in the header) of each DLL, or of the specific DLL you suspect. Then rebuild them / it.
vld.h redefines allocation functions, so when a source file is compiled, all allocation functions become special functions from VLD. Thus VLD can save information about allocations and deallocations.
You can use VLD in your main application. But in this case, you will get information about allocations made by your main application code only.
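In practice it is just a one-line include in the code being rebuilt; a sketch, assuming a debug build of the suspect DLL:

    // In each source file of the suspect DLL (or once in its precompiled header):
    #include <vld.h>   // hooks the CRT allocation functions for this translation unit

No other code changes are needed; the leak report is produced when the process (or the debugging session) ends.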
Excerpt from Microsoft's "What is a .dll?":
"By using a DLL, a program can be modularized into separate
components. For example, an accounting program may be sold by module.
Each module can be loaded into the main program at run time if that
module is installed. Because the modules are separate, the load time
of the program is faster, and a module is only loaded when that
functionality is requested. Additionally, updates are easier to apply
to each module without affecting other parts of the program. For
example, you may have a payroll program, and the tax rates change each
year. When these changes are isolated to a DLL, you can apply an
update without needing to build or install the whole program again."
Ref: http://support.microsoft.com/kb/815065
DLLs are:
- loaded at runtime
- can be "dynamically loaded" (by multiple programs at the same time)
  - which allows saving of resources
  - lowers disk space requirements

But why do they promote "modularizing" programs? What would happen if there weren't .dll files? Could someone provide/expand on the example?
Modular programs provide a way of making a particular functionality available to many programs without having to include the same code in all of them. Also, they allow greater compatibility between programs since they would essentially use the same methods in common DLLs to obtain the same results.
One would write a program in a modular fashion such that different parts of the program could be maintained separately. Say you had some clever way of reading and writing your own data format to files. Say you make improvements to that technique. If the code for reading and writing the files lived in a DLL, you would only need to update the DLL. The program itself would remain unchanged.
If you have one monolithic EXE, you have to:
- pay for all the extra relink time, even if only one source file changed (this is painful if the EXE is > 80 MB, as is the case in large projects),
- ship the entire EXE for patches/updates, when you could ship a single DLL which is a fraction of the size.

Breaking it up into DLLs, you:
- get pluggability: the EXE is the host application and others can write DLLs that "plug into" the host via a well-defined interface. DLLs can be interchanged as long as they conform to the interface.
- can share code across other DLLs and EXEs.
- can have some DLLs loaded on demand, only if they're used, and unloaded when they're not needed.
- similar to the above, can have optional functionality. With a single EXE you have to download everything, even if some components are rarely used. With DLLs, you could have a system that downloads and installs features as needed.
The biggest advantage of dlls is probably during development of the original program. Without dlls you wouldn't be able to integrate with existing libraries without including the original source code. By including an existing library as a dll you don't need the source since it's all encapsulated in the dll. It would be a nightmare to develop in frameworks like .Net without dlls since you constantly include other libraries...
The alternative to breaking your program down in n > 1 pieces is to keep it in n == 1 piece. Why is this bad? Well it isn't always bad (maybe the BIOS is a good example?). But for user programs it usually is. Why? First we need to define what a program is.
What is a program?
A simple "program", roughly speaking, consists of an entry point (i.e. offset to the main function), functions and global variables. A function consists of instructions and information about what local variables are needed to run the function. To be executed a program must be loaded in primary memory/RAM (the aforementioned information). Because our program has functions (and not just jump statements), that implies the existence of a stack, which implies the existence of a containing environment managing the stack. (I suppose you could have a program that manages its own stack but I'd argue then your program is not a program anymore but an environment.) This environment contains the program, starts in the entry point and executes each instruction, be it "go to this part of the RAM and add it to whatever is in this register" or "If this register is all 0 then jump ahead this many instructions and resume execution there" indefinitely or until the program gives control back to its environment. (This is somewhat simplified - context switches in multi-process environments, illegal memory access, illegal instructions, etc. can also cause control to be taken from the program.)
Anyway, so we have two options: either load the entire program at once or have it stored and loaded in pieces.
n == 1
There are some advantages to doing it all at once:
Once the program is in memory, no disk access is required to execute further (unless the program explicitly asks for it).
Since the program is compiled/linked before execution begins you can do everything without any sort of string names/comparisons - go directly to the address (or an offset).
Functions are never out of sync with one another.
n > 1
There are some disadvantages, though, which mirror the advantages:
Most programs don't execute all code paths most of the time. I think there are studies showing that most programs spend most of their execution time in a small fraction of the instructions present in the program. In other words, something like 20% of the program is executed 80% of the time (I just made that particular figure up, but you get the idea). If we divide our program up enough and only load instruction sets (i.e. functions) as they are needed, then we won't waste time loading the 80% we'll never use in this run of the program. Along these lines, we can ultimately fit more concurrently executing programs in our RAM at once if we only load the fraction of each program we need.
Most programs share similar functions (i.e. storing data/trees/hashes/sorting/etc., reading input, writing output, etc.) and if each program has its own local copy then you can't reuse instruction code.
Many programs depend on the existence of others and are maintained by separate companies/groups/individuals. By releasing versioned modules we don't have to synchronize releases all the time.
Conclusion
These aren't the only points to consider, but they are the first ones that came to my mind. I'd recommend reading about compilers, linkers and operating systems. That will answer this question, and the other questions it has probably raised, more thoroughly than I can. To recap: DLLs aren't the "best" way of packaging executable programs in all situations and circumstances; they have particular uses, advantages and disadvantages.
I am integrating a third party's application with my DLL. The DLL is created and destroyed several times during each run of the third party's software.
From my DLL, I need to recognize if it is the same third party's run or a different one that is creating me. Is there a way to recognize which process of the third party software is creating me?
If the DLL is being unloaded each time, then it will likely need some kind of persistent storage between each time it is loaded. If the calling application does not provide this information, then the DLL itself will need to perform that.
One possibility might be to use named shared memory. If it does not exist, create it and then use that as the "flag" to know that it is being called again in the same execution. When the process exits, it will be destroyed. There are, of course, security implications with this that would need to be considered. Another process could potentially create that shared memory to make your DLL "think" that it was being called again in the same run when it was actually the first invocation.
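A minimal native sketch of that idea, assuming the Windows API and an illustrative mapping name:

    #include <windows.h>

    // CreateFileMapping succeeds whether or not the named object already exists;
    // GetLastError tells us which case we hit, which is exactly the "flag" we need.
    bool IsRepeatInvocationInThisRun()
    {
        HANDLE hMap = CreateFileMappingW(
            INVALID_HANDLE_VALUE,          // backed by the paging file, no real file
            nullptr,                       // default security -- consider tightening this
            PAGE_READWRITE,
            0, sizeof(DWORD),              // tiny mapping; only its existence matters
            L"Local\\MyDll_RunMarker");    // hypothetical name, scoped to this session
        if (hMap == nullptr)
            return false;                  // couldn't create or open; treat as first call

        bool alreadyExisted = (GetLastError() == ERROR_ALREADY_EXISTS);
        // The handle is deliberately never closed so the object outlives DLL unloads;
        // it is destroyed automatically when the host process exits.
        return alreadyExisted;
    }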
I want to embed a command-line utility in my C# application, so that I can grab its bytes as an array and run the executable without ever saving it to disk as a separate file (avoids storing executable as separate file and avoids needing ability to write temporary files anywhere).
I cannot find a method to run an executable from just its byte stream. Does Windows require it to be on a disk, or is there a way to run it from memory?
If Windows requires it to be on disk, is there an easy way in the .NET framework to create a virtual drive/file of some kind and map the file to the executable's memory stream?
You are asking for a very low-level, platform-specific feature to be implemented in a high-level, managed environment. Anything's possible...but nobody said it would be easy...
(BTW, I don't know why you think temp file management is onerous. The BCL does it for you: http://msdn.microsoft.com/en-us/library/system.io.path.gettempfilename.aspx )
1. Allocate enough memory to hold the executable. It can't reside on the managed heap, of course, so like almost everything in this exercise you'll need to PInvoke. (I recommend C++/CLI, actually, so as not to drive yourself too crazy.) Pay special attention to the attribute bits you apply to the allocated memory pages: get them wrong and you'll either open a gaping security hole or have your process be shut down by DEP (i.e., you'll crash). See http://msdn.microsoft.com/en-us/library/aa366553(VS.85).aspx
2. Locate the executable in your assembly's resource library and acquire a pinned handle to it.
3. Memcpy() the code from the pinned region of the managed heap to the native block.
4. Free the GCHandle.
5. Call VirtualProtect to prevent further writes to the executable memory block.
6. Calculate the address of the executable's Main function within your process' virtual address space, based on the base address you got from VirtualAlloc and the offset within the file as shown by DUMPBIN or similar tools.
7. Place the desired command line arguments on the stack. (Windows stdcall convention.) Any pointers must point to native or pinned regions, of course.
8. Jump to the calculated address. Probably easiest to use _call (inline assembly language).
9. Pray to God that the executable image doesn't have any absolute jumps in it that would've been fixed up by calling LoadLibrary the normal way. (Unless, of course, you feel like re-implementing the brains of LoadLibrary during step #3.)
10. Retrieve the return value from the EAX register.
11. Call VirtualFree.
Steps #5 and #11 should be done in a finally block and/or use the IDisposable pattern.
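To make steps 1, 3, 5, 8 and 11 a little more concrete, here is a very rough native sketch. It assumes image already points at raw, position-independent machine code (not a full PE file: no import resolution, relocation fixups or argument passing are attempted), so treat it as an illustration of the memory handling only:

    #include <windows.h>
    #include <cstring>

    typedef int (__stdcall *EntryFn)(void);

    int RunFromMemory(const unsigned char* image, size_t size, size_t entryOffset)
    {
        // Step 1: allocate writable pages outside the managed heap.
        void* block = VirtualAlloc(nullptr, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        if (block == nullptr) return -1;

        // Step 3: copy the code into the native block.
        std::memcpy(block, image, size);

        // Step 5: flip the pages to executable and drop write access (keeps DEP happy).
        DWORD oldProtect = 0;
        if (!VirtualProtect(block, size, PAGE_EXECUTE_READ, &oldProtect)) {
            VirtualFree(block, 0, MEM_RELEASE);
            return -1;
        }

        // Step 8: jump to the calculated entry point.
        EntryFn entry = reinterpret_cast<EntryFn>(static_cast<char*>(block) + entryOffset);
        int result = entry();

        // Step 11: release the block.
        VirtualFree(block, 0, MEM_RELEASE);
        return result;
    }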
The other main option would be to create a RAMdrive, write the executable there, run it, and cleanup. That might be a little safer since you aren't trying to write self-modifying code (which is tough in any case, but especially so when the code isn't even yours). But I'm fairly certain it will require even more platform API calls than the dynamic code injection option -- all of them requiring C++ or PInvoke, naturally.
Take a look at the "In Memory" section of this paper. Realize that it's from a remote DLL injection perspective, but the concept should be the same.
Remote Library Injection
Creating a RAMdisk or dumping the code into memory and then executing it are both possible, but extremely complicated solutions (possibly more so in managed code).
Does it need to be an executable? If you package it as an assembly, you can use Assembly.Load() from a memory stream - a couple of trivial lines of code.
Or if it really has to be an executable, what's actually wrong with writing a temp file? It'll take a few lines of code to dump it to a temp file, execute it, wait for it to exit, and then delete the temp file - it may not even get out of the disk cache before you've deleted it! Sometimes the simple, obvious solution is the best solution.
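For what it's worth, the underlying sequence is simple even at the Win32 level; a rough sketch with error handling omitted (in .NET the equivalents are Path.GetTempFileName, File.WriteAllBytes, Process.Start/WaitForExit and File.Delete):

    #include <windows.h>

    // Dump the embedded bytes to a temp file, run it, wait, then clean up.
    int RunEmbeddedExe(const unsigned char* bytes, DWORD size)
    {
        wchar_t dir[MAX_PATH], path[MAX_PATH];
        GetTempPathW(MAX_PATH, dir);
        GetTempFileNameW(dir, L"exe", 0, path);     // creates a unique temp file

        HANDLE file = CreateFileW(path, GENERIC_WRITE, 0, nullptr,
                                  CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
        DWORD written = 0;
        WriteFile(file, bytes, size, &written, nullptr);
        CloseHandle(file);

        STARTUPINFOW si = { sizeof(si) };
        PROCESS_INFORMATION pi = {};
        CreateProcessW(path, nullptr, nullptr, nullptr, FALSE, 0,
                       nullptr, nullptr, &si, &pi);
        WaitForSingleObject(pi.hProcess, INFINITE);

        DWORD exitCode = 0;
        GetExitCodeProcess(pi.hProcess, &exitCode);
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
        DeleteFileW(path);                          // remove the temp file afterwards
        return static_cast<int>(exitCode);
    }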
This is explicitly not allowed in Vista+. You can use some undocumented Win32 API calls in XP to do this but it was broken in Vista+ because it was a massive security hole and the only people using it were malware writers.
We have developed a number of custom dll's which are called by third-party Windows applications. These dlls are loaded / unloaded as required.
Most of the dlls call web services and these need to have urls, timeouts, etc configured.
Because the dll is not permanently in memory, it has to read the configuration every time it is invoked. This seems sub-optimal to me.
Is there a better way to handle this?
Note: The configurable information is in an xml file so that the IT department can alter as required. They would not accept registry edits.
Note: These DLLs cater for a number of third-party applications; essentially this implements an external EDMS interface. The vendors would not accept passing the required parameters.
Note: It's a .NET application and the DLL is written in C#. Essentially, there are both thick (Windows application) and thin clients that access this DLL when they need to perform some kind of EDMS operation. The EDMS interface is defined as a set of calls that have to be implemented in the DLL, and the DLL decides how to implement the EDMS functions, e.g. for some clients "Register Document" would update a DB and for others the same call would utilise a third-party EDMS system. There are no ASP clients.
My understanding is that the dll is loaded when the client wants to access an EDMS operation and is then unloaded when the call is finished. The client may not need to do another EDMS operation for a while (in some cases over an hour).
Use the registry to store your configuration information, it's definitely fast enough.
I think you need to provide more information. There are so many approaches at persisting configuration information. We don't even know the development platform. .Net?
I wouldn't rely on the registry unless I was sure it would always be available. You might get away with that on client machines, but you've already mentioned webservices.
XML file in the current directory seems to be very popular now for server side third-party dlls. But those configurations are optional.
If this is ASP, your Trust Level will be very important in choosing a configuration persistence method.
You may be able to use your application server's "Application Scope", which gets loaded once per lifetime of the application. Your DLL can invalidate that data if it detects it needs to.
I've used text files, XML files, database, various IPC like shared memory segments, application scope, to persist configuration information. It depends a lot on the specifics of your project.
Care to elaborate further?
EDIT. Considering your clarifications, I'd go with an XML file. This custom XML file would be loaded using a search path that has been predefined and documented. If this is ASP.Net you can use Server.MapPath() for example to check various folders like App_Data. The DLL would check the current directory for the configuration file first though. You can then use a "manager" thread that holds the configuration data and passes it to any child threads that require it. The sharing can use IPC like a shared memory segment.
This seems like a hassle, but you have to store the information in some scope... either on disk or in memory (application scope, session scope, DLL global scope, another process via IPC, etc.).
ASP.Net also gives you the ability to add custom configuration sections to standard configuration files like web.config. You can access those sections at will and they will not depend on when your DLL was loaded.
Why do you believe your DLL is being removed from memory?
Why don't you let the calling application fill out a data-structure with the stuff you need? Can be done as part of an init-call or so.
How often is the DLL getting unloaded? COM DLLs can control when they are unloaded via the DllCanUnloadNow function. If these are COM components, you could look at implementing some kind of timeout there to prevent frequent loads and unloads. Unless the DLL is reloading the configuration at a significant frequency, it is unlikely to be a real performance bottleneck.
Knowing that the DLL will reload its configuration at certain points is a useful feature, since it saves users wondering whether they have to restart the host process, reboot the machine, etc., for the configuration to take effect. You could even watch the file for changes to keep it up to date.
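A hedged sketch of that timeout idea for a COM DLL (the counters and the ten-minute grace period are illustrative): keep answering "still in use" from DllCanUnloadNow until some time has passed since the last object was released.

    #include <windows.h>

    static volatile LONG  g_lockCount       = 0;
    static volatile DWORD g_lastReleaseTick = 0;
    static const DWORD    kUnloadDelayMs    = 10 * 60 * 1000;   // ~10 minutes

    void OnObjectCreated()  { InterlockedIncrement(&g_lockCount); }
    void OnObjectReleased() { InterlockedDecrement(&g_lockCount); g_lastReleaseTick = GetTickCount(); }

    extern "C" HRESULT __stdcall DllCanUnloadNow()
    {
        if (g_lockCount > 0)
            return S_FALSE;                                      // live objects -- keep us loaded
        if (GetTickCount() - g_lastReleaseTick < kUnloadDelayMs)
            return S_FALSE;                                      // grace period not over yet
        return S_OK;                                             // safe to unload now
    }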
I think the best way for a DLL to get configuration information is via the application that is using it - either via implicit "Init"-calls, like Nils suggested, or via their configuration files.
DLLs shouldn't usually "configure themselves", as they can never be sure in which context they are used. Different users (as in applications) may have different configuration settings to make.
Since you said that the application is written in .NET, you should probably simply require them to put the necessary configuration for your DLL's functions in their configuration file ("whatever.exe.config") and access it from your DLL via AppSettings or even better via a custom configuration section.
Additionally, you may want to provide sensible default values for settings where that is possible (probably not for network addresses though).
If the DLLs are loaded and unloaded from memory only every hour or so, the inefficiency of the small initializations (reading a file / the registry) will be negligible.
However, if this happens more frequently, the bigger inefficiency is likely to be the physical loading and unloading of the DLLs themselves, which can cost more than the small initializations.
It might therefore be better to keep them pinned in memory. That way the initialization performed at load time does not get repeated, and you also avoid the inefficiency of loading and unloading. You solve two issues this way.
I could tell you how to do this in C++; I'm not sure how you would do it in C#. GetModuleHandle, plus making an extra LoadLibrary call on that module, is how I would do it in C++.
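A rough C++ sketch of that pinning trick (the module name is illustrative; GetModuleHandleEx with the PIN flag is a more direct alternative):

    #include <windows.h>

    // If the DLL is currently loaded, take an extra reference on it so the
    // loader keeps it resident until we FreeLibrary it (or the process exits).
    void PinDllInMemory()
    {
        if (GetModuleHandleW(L"MyEdms.dll") != nullptr)
        {
            HMODULE h = LoadLibraryW(L"MyEdms.dll");   // bumps the module reference count
            (void)h;  // deliberately never FreeLibrary'd -- keeps the DLL pinned
        }
        // Alternatively: GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_PIN, L"MyEdms.dll", &h)
        // pins the module for the remaining lifetime of the process in one call.
    }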
One way to do it is to have an interface in the DLL which specifies the required settings.
Then it's up to the "application project" to have a class that implements this interface and to pass it to the DLL at initialization. This leaves you free to change the implementation depending on the project: one might read from web.config while another reads from a DB.
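The same init-call idea sketched in C++ terms (the names are hypothetical; in C# it would simply be an interface implemented by the host and handed to the DLL's init method):

    #include <string>

    // Settings interface the DLL publishes in its header.
    struct IEdmsSettings
    {
        virtual std::wstring ServiceUrl() const = 0;
        virtual int          TimeoutSeconds() const = 0;
        virtual ~IEdmsSettings() {}
    };

    // DLL side: remember whatever the host hands us at initialization.
    static IEdmsSettings* g_settings = nullptr;

    extern "C" __declspec(dllexport) void InitEdms(IEdmsSettings* settings)
    {
        g_settings = settings;   // the host owns the object; the DLL only uses it
    }

    // Host side: one application might implement IEdmsSettings by reading its
    // own config file, another by querying a database -- the DLL doesn't care.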