Maybe my question is stupid, but I would like to get it cleared up. We know that functions are loaded into memory only once, and that when you create new objects only the instance variables get created; the functions are never created again. My question: suppose there is a server and all clients access a method named createCustomer(). Suppose all clients do something which fires createCustomer on the server. If the method is in the middle of executing and a new client fires it, will the new request be put on wait, or will the new request also start executing the method? How does it all get managed when there is only one copy of the function in memory? No book I have read answers this type of question, so I am posting here, where I am bound to get answers :).
Functions are code which is then executed in a memory context. The code can be run many times in parallel (literally in parallel on a multi-processor machine), but each of those calls will execute in a different memory context (from the point of view of local variables and such). At a low level this works because the functions will reference local variables as offsets into memory on something called a "stack" which is pointed to by a processor register called the "stack pointer" (or in some interpreted languages, an analog of that register at a higher level), and the value of this register will be different for different calls to the function. So the x local variable in one call to function foo is in a different location in memory than the x local variable in another call to foo, regardless of whether those calls happen simultaneously.
Instance variables are different, they're referenced via a reference (pointer) to the memory allocated to the instance of an object. Two running copies of the same function might access the same instance variable at exactly the same time; similarly, two different functions might do so. This is why we get into "threading" or concurrency issues, synchronization, locks, race conditions, etc. But it's also one reason things can be highly efficient.
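To make the distinction concrete, here is a minimal sketch in Python (not the language of the question; the Counter class and its names are made up for illustration). Each call gets its own copy of the local variable, while the instance variable is shared by every thread and so is guarded by a lock.

```python
import threading

class Counter:
    """Hypothetical shared object; 'total' plays the role of an instance variable."""
    def __init__(self):
        self.total = 0
        self._lock = threading.Lock()

    def add_squares(self, n):
        # 'partial' is a local variable: every call gets its own copy,
        # so concurrent calls never see each other's value.
        partial = 0
        for i in range(n):
            partial += i * i
        # 'total' lives in the instance, which all threads share, so the
        # read-modify-write is guarded by a lock.
        with self._lock:
            self.total += partial
        return partial

counter = Counter()
results = {}

def worker(n):
    results[n] = counter.add_squares(n)

sizes = (10, 100, 1000)
threads = [threading.Thread(target=worker, args=(n,)) for n in sizes]
for t in threads:
    t.start()
for t in threads:
    t.join()

expected = {n: sum(i * i for i in range(n)) for n in sizes}
print(results == expected)                       # True
print(counter.total == sum(expected.values()))   # True
```

There is only one copy of add_squares in memory, yet three calls run it at once without interfering, because the only thing they share is the instance state behind the lock.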
It's called "multi-threading". If each request has its own thread, and the object contains mutable data, each client will have the opportunity to modify the state of the object as they see fit. If the person who wrote the object isn't mindful of thread safety you could end up with an object that's in an inconsistent state between requests.
This is a basic threading issue, you can look it up at http://en.wikipedia.org/wiki/Thread_(computer_science).
Instead of thinking in terms of the code that is executed, try to think of the memory context of the thread that is being changed. It does not matter where the actual code is, or whether it is the same code, a duplicate, or something else.
Basically, the function can be called again while an earlier call is still running. The two calls are independent and may even run in parallel (on a multicore machine). This independence comes from each thread having its own stack; threads within one process share a virtual address space, whereas separate processes each get their own.
There are ways to synchronize calls, so additional callers have to wait until the first call finishes. This is also explained in the above link.
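As a rough illustration of making additional callers wait, here is a small Python sketch. The create_customer function, call_lock, and the bookkeeping counters are hypothetical names invented for this example; the point is only that a lock around the body lets one caller in at a time.

```python
import threading
import time

call_lock = threading.Lock()    # hypothetical lock guarding create_customer
state_lock = threading.Lock()   # protects the bookkeeping counters below
active = 0        # callers currently inside the guarded section
max_active = 0    # highest number of simultaneous callers observed

def create_customer(name):
    global active, max_active
    with call_lock:              # later callers block here until the lock is free
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)         # stand-in for the real work
        with state_lock:
            active -= 1
    return f"customer:{name}"

threads = [threading.Thread(target=create_customer, args=(f"c{i}",)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(max_active)   # 1
```

Without call_lock, all five calls could be inside the function together; with it, the other four queue up behind whichever caller got there first.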
I have a block of code like this in Kotlin.
    synchronized(this) {
        // do some work and produce a String
    }.also { /* it: String */
        logger.log(it)
    }
Can another thread come along and, with unlucky timing, change the it variable before the logging happens? (Many threads execute this piece of code concurrently.)
To expand on comments:
The synchronized block returns a reference that's passed into the also block as it; that reference is local to the thread, and so there's no possibility of it being affected by other threads.
In general, there's no guarantee about the object that that reference points to: if other threads have a reference to it, they could potentially change its state (or that of other objects it refers to). But in this case, it's a String; and in Kotlin, Strings are completely immutable. So there's no risk of that here.
Taking those together: the logging in OP's code is indeed thread-safe!
However:
We can't tell whether there could be race conditions or other concurrency issues within the synchronized block, before the String gets created and returned. It synchronizes on this, which prevents two threads simultaneously executing it on the same object; but if there are two instances of the class, each one could have a single thread running it. So for example there could be an issue if the block uses some common object that's not completely thread-safe, or some instance property that's set to the same reference in both instances, or if there's sharing in a more roundabout way. Obviously, this will depend upon the nature of the work done in that block, what objects it accesses, and what it does with them. So that's worth bearing in mind too.
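The per-instance nature of the lock can be sketched as follows. This is Python rather than Kotlin, and the Worker class, its shared_log list, and the barrier are all made up for illustration; a per-instance threading.Lock stands in for synchronized(this). The barrier can only be passed if both threads are inside their critical sections at the same time.

```python
import threading

class Worker:
    shared_log = []                       # one list shared by every instance

    def __init__(self, meet):
        self._lock = threading.Lock()     # per-instance, like synchronized(this)
        self._meet = meet

    def run(self, name):
        with self._lock:
            # Both threads can only get past the barrier if they are inside
            # their critical sections simultaneously; a timeout would raise.
            self._meet.wait(timeout=5)
            # Nothing here stops two instances touching shared state at once.
            Worker.shared_log.append(name)

meet = threading.Barrier(2)
a, b = Worker(meet), Worker(meet)
t1 = threading.Thread(target=a.run, args=("a",))
t2 = threading.Thread(target=b.run, args=("b",))
t1.start()
t2.start()
t1.join()
t2.join()

print(meet.broken, sorted(Worker.shared_log))   # False ['a', 'b']
```

If the two workers shared a single lock object instead, the second thread could never reach the barrier while the first was waiting, and the wait would time out.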
With regard to the atomic property attribute, Apple's documentation says the following:
This means that the synthesized accessors ensure that a value is
always fully retrieved by the getter method or fully set via the
setter method, even if the accessors are called simultaneously from
different threads.
What does "fully retrieved" or "fully set" mean?
Why is "fully retrieved" or "fully set" not enough to guarantee thread safety?
Note: I am aware there are many posts regarding atomicity on SO; please don't tag this as a duplicate unless the post specifically addresses the question above. After reading those posts, I still do not fully understand the atomic property.
Atomic means that calls to the getter/setter are synchronized. That way if one thread is setting the property at the same time as another thread is getting it, the one getting the property is guaranteed to get a valid return value. Without it being atomic, it would be possible that the getter retrieves a garbage value, or a pointer to an object that is immediately deallocated. When it's atomic, it will also ensure that if two threads try to set it at the same time, one will wait for the other to finish. If it weren't atomic and two threads tried to set it at the same time, you could end up with a garbage value being written, or possibly objects being over/under retained or over/under released.
So basically, if the property is being set, any other calls to set it or get it will wait for the method to return. Same for if the property is being gotten, any other calls to get it or set it will wait until that get finishes.
This is sometimes sufficient for thread safety, depending on what the property is used for. But often you want more than this level of synchronization. For example, suppose one block of code on a thread gets the value, makes some change to it, and wants to set it again without some other thread changing it in the meantime. You would need additional synchronization to make sure that you hold a lock from before you get the value until after you subsequently set it. You would want to do the same if you wanted to get an object and make some changes to that object without another thread trying to change it at the same time.
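Here is a small sketch of that get-modify-set problem, written in Python rather than Objective-C. AtomicBox is a hypothetical stand-in for an atomic property: its getter and setter are each locked individually. A barrier forces the unlucky interleaving so the lost update happens every time.

```python
import threading

class AtomicBox:
    """Hypothetical holder whose getter and setter are each individually locked."""
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def set(self, value):
        with self._lock:
            self._value = value

def unsafe_increment(box, barrier):
    current = box.get()         # atomic read
    barrier.wait(timeout=5)     # force both threads to read before either writes
    box.set(current + 1)        # atomic write, but based on a stale read

def safe_increment(box, outer_lock):
    with outer_lock:            # one lock held across the whole get-modify-set
        box.set(box.get() + 1)

def run_pair(target, *args):
    threads = [threading.Thread(target=target, args=args) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

lost = AtomicBox()
run_pair(unsafe_increment, lost, threading.Barrier(2))

kept = AtomicBox()
run_pair(safe_increment, kept, threading.Lock())

print(lost.get(), kept.get())   # 1 2
```

Every individual get and set was "fully" performed, yet one of the two increments vanished in the first run. Only the outer lock, held from the read through the write, preserves both.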
"Fully set" and "fully retrieved" means the code below will always print "0x11111111" or "0x22222222". It will never print things like "0x11112222" or "0x11221122". Without atomic or some other appropriate thread synchronization, those sorts of partial reads or partial updates are allowed for some data types on some CPU architectures.
// Thread 1
while (true) x = 0x11111111;
// Thread 2
while (true) x = 0x22222222;
// Thread 3
while (true) printf("0x%x\n", x);
It means the value will never be accessed when it's halfway through being written. It will always be one intended value or another, never an incompletely altered bit pattern.
It isn't enough to guarantee thread safety, because ensuring that the value of a variable is either fully written or not written at all is not enough to make all the code that uses that variable thread-safe. For example, there might be two variables that need to be updated together (the classic example is transferring credits from one account to another), but code in another thread could see one variable with the new value and one with the old.
Very often, you'll need to implement synchronization for whole systems, so the guarantees offered by atomic variables end up mattering very little much of the time.
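The account-transfer case might look like this as a Python sketch (the Bank class and its fields are hypothetical). The lock covers both writes, and readers take the same lock, so nobody can observe the state halfway through a transfer.

```python
import threading

class Bank:
    """Hypothetical two-account bank; the invariant is a + b == 100."""
    def __init__(self):
        self.a = 100
        self.b = 0
        self._lock = threading.Lock()

    def transfer(self, amount):
        with self._lock:             # both writes happen as one unit
            self.a -= amount
            self.b += amount

    def total(self):
        with self._lock:             # readers use the same lock, so they never
            return self.a + self.b   # see one new value and one old value

bank = Bank()
totals = []

def mover():
    for _ in range(1000):
        bank.transfer(1)
        bank.transfer(-1)

def auditor():
    for _ in range(1000):
        totals.append(bank.total())

threads = [threading.Thread(target=mover), threading.Thread(target=auditor)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(set(totals))   # {100}
```

Making a and b individually atomic would not help here: the invariant spans both variables, so the lock has to span both as well.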
I successfully hooked BeginScene/EndScene methods of DirectX9's DeviceEx, in order to override regions on the screen of a graphics application. I did it by overriding the first 'line' of the function pointed by the appropriate vtable entry (42 for EndScene) with an x86 jump command.
The problem is that when I want to call the original EndScene method, I have to write back the original code that was overridden by the jump. This operation is not thread-safe, and the application has two devices used by two threads.
I tried overriding the vtable entry, and also copying the vtable and overriding the COM interface pointer to it; neither way worked. I guess the original function pointer is cached somewhere or was optimized during compilation.
I thought about copying the whole original method body to another memory block, but I'm afraid of two problems: (1) (the easy one, I think) I don't know how to discover the length of the method, and (2) I don't know whether the function body stores offsets that are relative to the function's location in memory.
I'm trying to hook WPF's device, if it can help somehow.
Does anyone know a thread-safe way to do such hooking?
Answering my own question: it seems that for my purpose (running another method before or instead of the original one within my own process), a 'trampoline' is the answer. Generally, it means I need to create another code segment that does exactly what the overridden assembly instructions did and then jumps back into the rest of the original function.
Because it is not an easy task, using an external library is recommended.
A discussion about this topic:
How to create a trampoline function for hook
I am working on an application which calls the COM component of a partner's application. Ours is .NET, theirs isn't. I don't know much about COM; I know that the component we're calling is late-bound, i.e.
    Dim obj As Object = CreateObject("THIRDPARTY.ThirdPartyObject")
We then call a method on this COM object (Option Strict Off in the head of the VB file):
obj.AMethod(ByVal Arg1 As Integer, ByVal Arg2 As Integer, ByVal Arg3 as Boolean)
I am a bit nonplussed that even though this call works, this overload doesn't exist in the COM interop .dll that is created if I instead add a reference to the COM server using Add Reference. The only call to this method that it lists is AMethod().
However, this in itself is not what bothers me. What bothers me is that this call works for a while, THEN throws a TargetParameterCountException after a few dozen calls have executed successfully.
I ask thee thus, StackOverflow:
What. The. Hell.
The only thing I can guess at is that the documentation for the COM component states that this method is executed synchronously - so therefore maybe whatever's responsible for throwing that exception is being blocked until some indeterminate point in time? Other than that, I'm completely stumped at this bizarre, and more importantly inconsistent behaviour.
edit #1:
More significant information that I've just remembered - from time to time the call throws an ExecutionEngineException instead. It only took one glance at the documentation to realise that this is VERY BAD. Doing a little bit of digging suggests to me that the late-binding call is causing stack corruption, crashing the entire CLR. Presumably this means that the runtime is shooting down bad calls (with TargetParameterCountException) some of the time and missing them (ExecutionEngineException) others.
edit #2:
Answering David Lively's questions:
The call with zero arguments that's currently in the code has been there for a long time. I haven't been able to get hold of a manual for the third party's COM implementation past two major revisions ago, so it's possible that they've withdrawn that signature from service.
There is only one location that this method is called from.
This is one desktop app calling another, on the same machine. Nothing fancy.
The object is persisted throughout the scope of the user's interaction with the application, so there's never a new one created.
Unfortunately, it seems likely that there is indeed a bug in the implementation, as you suggest. The trouble with this vendor is that, when we report a bug, their response tends to follow the general form: i) deny there's a problem; ii) deny it's their problem; iii) refuse to fix it. These three steps tend to span a frustratingly long period of time.
No, it can't cause stack corruption. IDispatch::Invoke() is used to call the method, the arguments are packaged in an array. The stock implementation of IDispatch certainly would detect the argument mismatch, it uses the type library info to check. But it is conceivable that the COM server author implemented it himself. Imperfectly. It is something a C++ hacker might do, the stock implementation is dreadfully slow. The GC heap getting corrupted is the kind of thing that happens when imperfect code executes.
I haven't played with calling COM objects from VB in quite a while, but I'll take a wild guess:
I would expect an exception to be thrown if you're calling the object with too few or too many arguments, but it appears that's not the case. What is the real signature of the method you're calling?
In some languages and some situations, when you call a method, arguments are placed on the stack. If you place too many arguments, it's possible for the extraneous ones to remain on the stack after the method completes. This should cause lots of other problems, though.
Some possibilities/considerations:
The object is throwing this exception internally. This should be taken up with the author.
You're calling with too many parameters. If, as you said, the overload you're trying to call isn't published in the object's type library, you may actually be calling a different published method with a different signature. I'd REALLY expect a compiler error if this is the case.
Are your later calls taking place in the same part of your code, or is there a different execution branch that might be doing things a bit differently, and causing the error?
Are you running this from a desktop app/script, or a website? If a website, are you receiving a valid, expected response, or does the request hang as if an internal long-running process doesn't complete?
The object may be allocating and not releasing resources, which could cause undefined behavior when those resources are exhausted.
Are you releasing the object between calls, or is it recreated every time?
Also, re: your comments about late binding: the .CreateObject() method of instantiating a COM object is the normal, accepted way to do this. That shouldn't have anything to do with the issue. Based on the exceptions you listed, I'm strongly inclined to believe that there is an internal issue with the object.
Good luck.
OK, basically - false alarm. I've done it wrong - I've copied some code over from somewhere improperly and the thing I'm calling was never supposed to support that overload. What I find interesting is that the component didn't reject that late-bound call out of hand, but did everything it was supposed to do, at least initially.
I was experimenting with ways to get rid of some memory leaks within my application the other day when I realized that I know virtually nothing about cleaning up my resources. I did some research, and hoped that just calling the .dispose() would solve all of my problems. We have a table in our database that contains about 65,000 records. Obviously when I fill my dataset from the data adapter, the memory usage can get pretty high. When I called the dispose method on the dataset, I was surprised to find out that NONE of the memory got released. Why did this happen? Clearing the dataset doesn't help either.
IDisposable, and thus Dispose, is not meant for reducing memory pressure, although in some cases it might do so; it is meant for deterministic cleanup.
Consider this, you construct an object that maintains an active and open connection to your database server. This connection uses resources, both on your machine, and the server.
You could of course just leave the object be when you're done with it, and eventually it'll get picked up by the garbage collector, but suppose you want to make sure that at least the resources get freed, and thus the connection closed, when you're done with it. This is where IDisposable.Dispose comes into play.
It is used to clean up resources managed by the object.
It will, however, not free the managed memory allocated to the object. This is still left to the garbage collector, that will kick in at some later time to do that.
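The split between releasing a resource and freeing memory can be sketched with a rough Python analogue. This is not .NET code: the Connection class and its dispose method are hypothetical, and Python's with statement plays the role that a using block would play around an IDisposable. Note also that CPython's memory management differs from the .NET garbage collector, so only the resource-versus-memory distinction carries over.

```python
class Connection:
    """Hypothetical resource holder, standing in for an IDisposable object."""
    open_count = 0                        # simulated external resource usage

    def __init__(self):
        self.rows = list(range(65_000))   # managed data held by the object
        self.is_open = True
        Connection.open_count += 1

    def dispose(self):
        # Release the external resource now; safe to call more than once.
        if self.is_open:
            self.is_open = False
            Connection.open_count -= 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.dispose()
        return False                      # do not swallow exceptions

with Connection() as conn:
    inside = Connection.open_count        # resource is held inside the block

print(inside, Connection.open_count, conn.is_open, len(conn.rows))
# 1 0 False 65000
```

After the block the connection is closed deterministically, but the 65,000 rows are still in memory because the object is still reachable. Reclaiming that memory is a separate job for the runtime, done once no references remain.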
Do you actually have a memory problem, or do you just look at the memory usage in Task Manager or similar and go "that's a bit high."?
If the latter, then you should just leave it be for now. .NET will run garbage collection more often if you have less memory available, so unless you're in a situation where you get, or might suspect you will get soon, a memory overflow condition, you're probably not going to have any problems.
Let me explain what I mean by "more often".
If you have 8GB of memory in your machine, and only have Windows and Notepad running, most of that memory will be available. When you now run your program, even if it loads minor data blocks into memory, you can keep doing that for a long time, and memory usage will steadily grow. Exactly when the GC will kick in and try to reduce your memory footprint I don't know, but I can almost guarantee you that you will wonder why it gets so high.
Let's just for the sake of the argument say that your program will eventually use 2GB of memory.
Now, if you run your program on a machine that has less memory available, GC will occur more often, and will kick in on a lower limit, which might keep the memory usage below 500MB or possibly even less.
The important part to note here is that to get an accurate picture of how much memory your application actually requires, you can't rely on Task Manager or similar tools; you need something more targeted.
Calling Dispose() will only release unmanaged resources, such as file handles, database connections, unmanaged memory, etc. It will not release garbage collected memory.
Garbage-collected memory will only get released at the next collection, which usually happens when the runtime deems the managed heap full enough.
I'm going to point out something here that hasn't been explicitly mentioned: calling Dispose() will only clean up (free) unmanaged resources if the developer of the component has coded it.
What I mean is this: if you suspect you have a memory leak, calling Dispose() is not going to fix it if the original developer has done a lousy job and not correctly freed up unmanaged resources. For a bit more info, check this blog post. Take note of the statement "The behaviour of Dispose is defined by the developer."
Some objects will ask one or more other entities to do something on their behalf until further notice, to the detriment of other entities. If an object which did so were to disappear without informing those entities that their services were no longer needed, they would continue to act uselessly on behalf of an object that no longer needed them, to the continuing detriment of other entities that would want to use them.
In many cases, for an object "George" to tell an outside entity "Joe" that Joe's services were no longer needed, George would have to know that its own services were no longer needed. There are two normal means by which that can happen in .NET: finalization and IDisposable.
If an object overrides a method called Finalize, then when the object is created the .NET garbage collector will add it to a list of objects with registered finalizers. If the GC discovers that there exists no rooted reference to the object other than that list, the GC will remove the object from that list and add it to a strongly-rooted queue of objects which should have their Finalize method called as soon as possible. Such an object can then use its Finalize method to inform other entities that their services are no longer required.
Although finalization-based cleanup can sometimes work, there's no guarantee of timeliness. At one point during the design of .NET, Microsoft may have intended that finalization would be the primary cleanup method, but for a variety of reasons it cannot safely be relied upon.
The other cleanup approach, which should be the focus of one's efforts, is IDisposable. Basically, the idea behind IDisposable is simple: for every object that implements IDisposable, there should be one entity (generally either an object or a nested execution scope) which is responsible for ensuring that that object's IDisposable.Dispose method will get called sometime within the lifetime of the universe (which would imply sometime while a reference to the object still exists), and preferably as soon as code can tell that the object's services will no longer be required.
Note that IDisposable.Dispose generally promises that any outside entities which had been asked to do something on an object's behalf will be told that they no longer need to do so, but such a promise does not imply that the number of entities is non-zero. If an object hasn't asked any outside entities to do anything on its behalf, then delivering a message to "all" such entities doesn't require doing anything at all. On the other hand, the fact that a Dispose method may do nothing in some cases doesn't mean that it's guaranteed never to do anything in any case, nor that failure to call it in those cases where it would do something won't have detrimental effects.
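The George and Joe relationship can be sketched with an event subscription. This is a Python illustration with made-up Publisher and Listener classes, not .NET code: the listener asks the publisher to act on its behalf, and its dispose method tells the publisher to stop.

```python
class Publisher:
    """'Joe': an outside entity that does work on behalf of its subscribers."""
    def __init__(self):
        self.subscribers = []

    def publish(self, message):
        for callback in list(self.subscribers):
            callback(message)

class Listener:
    """'George': asks the publisher to notify it until further notice."""
    def __init__(self, publisher):
        self.received = []
        self._publisher = publisher
        publisher.subscribers.append(self.on_message)

    def on_message(self, message):
        self.received.append(message)

    def dispose(self):
        # Tell the publisher its services are no longer needed.
        if self._publisher is not None:
            self._publisher.subscribers.remove(self.on_message)
            self._publisher = None

joe = Publisher()
george = Listener(joe)
joe.publish("first")
george.dispose()
george.dispose()            # second call finds nothing left to do
joe.publish("second")

print(george.received, len(joe.subscribers))   # ['first'] 0
```

If dispose were never called, the publisher would keep a reference to the listener and keep calling it for every message, which is the "continuing to act uselessly" described above. The second dispose call also shows that a cleanup method which happens to have nothing to do is still valid.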