I have a block of code like this in Kotlin.
synchronized(this) {
    // do some work that produces a String
}.also { // it: String
    logger.log(it)
}
Can another thread, with unlucky timing, change the value of it before the logging happens? (There are a lot of threads executing this piece of code concurrently.)
To expand on comments:
The synchronized block returns a reference that's passed into the also block as it; that reference is local to the thread, and so there's no possibility of it being affected by other threads.
In general, there's no guarantee about the object that that reference points to: if other threads have a reference to it, they could potentially change its state (or that of other objects it refers to). But in this case, it's a String; and in Kotlin, Strings are completely immutable. So there's no risk of that here.
Taking those together: the logging in OP's code is indeed thread-safe!
However:
We can't tell whether there could be race conditions or other concurrency issues within the synchronized block, before the String gets created and returned. It synchronizes on this, which prevents two threads simultaneously executing it on the same object; but if there are two instances of the class, each one could have a single thread running it.

So, for example, there could be an issue if the block uses some common object that's not completely thread-safe, or some instance property that's set to the same reference in both instances, or if there's sharing in a more roundabout way. Obviously, this will depend upon the nature of the work done in that block, what objects it accesses, and what it does with them. So that's worth bearing in mind too.
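To make that concrete, here's a minimal Kotlin sketch of the pattern (the Worker class, the counter field, and the use of java.util.logging are illustrative assumptions, not from the question):

import java.util.logging.Logger

class Worker {
    private val logger = Logger.getLogger("Worker")
    private var counter = 0   // shared state, guarded by this instance's monitor

    fun doWork(): String =
        synchronized(this) {
            counter++                 // safe only against threads using the SAME instance
            "result #$counter"        // build and return an immutable String
        }.also {
            // `it` is a thread-local reference to an immutable String,
            // so nothing can change what gets logged here
            logger.info(it)
        }
}

If two Worker instances exist, one thread can be inside each block at the same time; counter is only safe here because each instance has its own. Any state shared between instances would need a shared lock.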
I am currently interning at a company and just starting to get into their code. I noticed that they have tasks that use singleton classes, but inside the singleton class there is a Future object that is used to fetch thread dumps.
The code goes something like this:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SingletonClass {
    private static final SingletonClass INSTANCE = new SingletonClass();
    private final ExecutorService x = Executors.newFixedThreadPool(1);

    public static SingletonClass getInstance() { return INSTANCE; }

    public void methodThatFetchesThreadDumps() {
        // a Future is used here
    }
}
Is it a good idea to use a future inside a singleton? What happens if the task using this singleton runs twice and overlaps? Wouldn’t using the singleton multiple times cause the future to give unexpected behavior?
This isn't necessarily a bad thing. The Future will make sure that the objects returned are visible across threads. The thread pool is fixed at size 1, so if there are concurrent requests, the second one blocks until the only worker thread becomes available, by which time it has finished the previous task and handed off its result. No overlap should occur.
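A rough Kotlin sketch of that arrangement (the object name and the dump-gathering logic are assumptions for illustration):

import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.Future

object ThreadDumpFetcher {
    // one worker thread: concurrent submissions queue up and run one at a time
    private val executor = Executors.newFixedThreadPool(1)

    fun fetchThreadDumps(): Future<String> =
        executor.submit(Callable {
            // stand-in for the real dump-gathering work
            Thread.getAllStackTraces().keys.joinToString("\n") { it.name }
        })
}

// Usage: get() blocks the caller until the single worker finishes this task.
// val dump: String = ThreadDumpFetcher.fetchThreadDumps().get()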
With regard to the atomic property attribute, Apple's documentation says the following:
This means that the synthesized accessors ensure that a value is always fully retrieved by the getter method or fully set via the setter method, even if the accessors are called simultaneously from different threads.
What does "fully retrieved" or "fully set" mean?
Why is "fully retrieved" or "fully set" not enough to guarantee thread safety?
Note: I am aware there are many posts regarding atomicity on SO; please don't mark this as a duplicate unless the linked post specifically addresses the questions above. After reading those posts, I still do not fully understand atomic properties.
Atomic means that calls to the getter/setter are synchronized. That way if one thread is setting the property at the same time as another thread is getting it, the one getting the property is guaranteed to get a valid return value. Without it being atomic, it would be possible that the getter retrieves a garbage value, or a pointer to an object that is immediately deallocated. When it's atomic, it will also ensure that if two threads try to set it at the same time, one will wait for the other to finish. If it weren't atomic and two threads tried to set it at the same time, you could end up with a garbage value being written, or possibly objects being over/under retained or over/under released.
So basically, if the property is being set, any other calls to set or get it will wait for that method to return. The same goes for getting: while the property is being read, any other calls to get or set it will wait until that read finishes.
This is sometimes sufficient for thread safety, depending on what the property is used for. But often you want more than this level of synchronization. For example, suppose one block of code on a thread gets the value, makes some change to it, and wants to set it again without some other thread changing the value in the meantime. You would have to do additional synchronization to make sure that you hold a lock from before you get it until after you subsequently set it. You would want to do the same if you wanted to get an object and make some changes to that object without another thread trying to change it at the same time.
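A Kotlin analogue of that situation may help (atomic accessors correspond loosely to a @Volatile field on the JVM; the Counter object is made up for the example):

object Counter {
    @Volatile private var count = 0     // individual reads and writes of count are atomic
    private val lock = Any()

    fun unsafeIncrement() {
        // get, add, set: two threads can both read the same old value,
        // and then one of the two increments is lost
        count = count + 1
    }

    fun safeIncrement() = synchronized(lock) {
        // one lock held across the whole get-modify-set sequence
        count = count + 1
    }
}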
"Fully set" and "fully retrieved" means the code below will always print "0x11111111" or "0x22222222". It will never print things like "0x11112222" or "0x11221122". Without atomic or some other appropriate thread synchronization, those sorts of partial reads or partial updates are allowed for some data types on some CPU architectures.
// x is a shared variable; assume a type/architecture combination
// where a single write is not atomic (so reads can see torn values)
int x;

// Thread 1
while (true) x = 0x11111111;

// Thread 2
while (true) x = 0x22222222;

// Thread 3
while (true) printf("0x%x\n", x);
It means the value will never be accessed when it's halfway through being written. It will always be one intended value or another, never an incompletely altered bit pattern.
It isn't enough to guarantee thread safety because ensuring that the value of a variable is either fully written or not written at all is not enough to make all the code that uses that variable thread-safe. For example, there might be two variables that need to be updated together (the classical example is transferring credits from one account to another), but code in another thread could see one variable with the new value and one with the old.
Very often, you'll need to implement synchronization for whole systems, so the guarantees offered by atomic variables often end up not mattering much.
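Here's a hedged Kotlin sketch of that account example (the Account class and ledgerLock are illustrative):

class Account {
    @Volatile var balance = 0L          // each individual read/write is atomic
}

private val ledgerLock = Any()

fun transfer(from: Account, to: Account, amount: Long) = synchronized(ledgerLock) {
    from.balance -= amount              // without the lock, another thread could
    to.balance += amount                // see the debit but not yet the credit
}

fun total(a: Account, b: Account): Long = synchronized(ledgerLock) {
    a.balance + b.balance               // consistent only because it takes the same lock
}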
I just created a singleton method, and I would like to know what @synchronized does, as I use it frequently but do not know its meaning.
It declares a critical section around the code block. In multithreaded code, @synchronized guarantees that only one thread can be executing that code in the block at any given time.
If you aren't aware of what it does, then your application probably isn't multithreaded, and you probably don't need to use it (especially if the singleton itself isn't thread-safe).
Edit: Adding some more information that wasn't in the original answer from 2011.
The @synchronized directive prevents multiple threads from entering any region of code that is protected by a @synchronized directive referring to the same object. The object passed to the @synchronized directive is the object that is used as the "lock." Two threads can be in the same protected region of code if a different object is used as the lock, and you can also guard two completely different regions of code using the same object as the lock.
Also, if you happen to pass nil as the lock object, no lock will be taken at all.
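JVM synchronized blocks follow the same rules, so a Kotlin illustration may help (the Store class and its fields are made up):

class Store {
    private val lockA = Any()
    private val lockB = Any()
    private val listA = mutableListOf<String>()
    private val listB = mutableListOf<String>()

    // Both regions use lockA: a thread in addA() excludes a thread in clearA().
    fun addA(s: String) = synchronized(lockA) { listA.add(s) }
    fun clearA() = synchronized(lockA) { listA.clear() }

    // A different lock object: addB() can run concurrently with addA()/clearA().
    fun addB(s: String) = synchronized(lockB) { listB.add(s) }
}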
From the Apple documentation here and here:
The @synchronized directive is a convenient way to create mutex locks on the fly in Objective-C code. The @synchronized directive does what any other mutex lock would do—it prevents different threads from acquiring the same lock at the same time.
The documentation provides a wealth of information on this subject. It's worth taking the time to read through it, especially given that you've been using it without knowing what it's doing.
Syntax:
@synchronized (key)
{
    // thread-safe code
}
Example:
- (void)appendExisting:(NSString *)val
{
    @synchronized (oldValue) {
        // stringByAppendingFormat: returns a new string, so assign the result
        oldValue = [oldValue stringByAppendingFormat:@"-%@", val];
    }
}
Now the above code is thread-safe: multiple threads can change the value, but only one at a time. It is just a simple example, though; note that reassigning the lock object itself inside the block is fragile, as the last answer below explains.
The @synchronized block automatically handles locking and unlocking for you. With @synchronized, you have an implicit lock associated with the object you are using to synchronize. Here is a very informative discussion on this topic: How does @synchronized lock/unlock in Objective-C?
Excellent answer here:
Help understanding class method returning singleton
with further explanation of the process of creating a singleton.
@synchronized is a thread-safety mechanism. The piece of code written inside the block becomes part of a critical section, which only one thread can execute at a time.
@synchronized applies the lock implicitly, whereas NSLock requires you to lock and unlock explicitly.
It only helps ensure thread safety; it doesn't guarantee it. It's like hiring an expert driver for your car: it doesn't guarantee the car won't have an accident, but it makes one much less likely.
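The JVM has the same implicit-versus-explicit split, if a concrete comparison helps (a Kotlin sketch; ReentrantLock plays the role NSLock plays in Objective-C):

import java.util.concurrent.locks.ReentrantLock

private val monitor = Any()
private val explicitLock = ReentrantLock()

// Implicit, like @synchronized: the lock is taken on entry and released
// on exit, even if the block throws.
fun implicitStyle() = synchronized(monitor) {
    // critical section
}

// Explicit, like NSLock: you must pair lock() with unlock() yourself.
fun explicitStyle() {
    explicitLock.lock()
    try {
        // critical section
    } finally {
        explicitLock.unlock()
    }
}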
Maybe my question is stupid, but I would like to get it cleared up. We know that functions are loaded into memory only once, and when you create new objects, only instance variables get created; functions are never duplicated. My question is: suppose there is a server and all clients access a method named createCustomer(). Suppose all clients do something that fires createCustomer() on the server. If the method is in the middle of execution and a new client fires it, will the new request be made to wait? Or will the new request also start executing the method? How does it all get managed when there is only one copy of the function in memory? No book mentions answers to this type of question, so I am posting here, where I am bound to get answers :).
Functions are code which is then executed in a memory context. The code can be run many times in parallel (literally in parallel on a multi-processor machine), but each of those calls will execute in a different memory context (from the point of view of local variables and such). At a low level this works because the functions will reference local variables as offsets into memory on something called a "stack" which is pointed to by a processor register called the "stack pointer" (or in some interpreted languages, an analog of that register at a higher level), and the value of this register will be different for different calls to the function. So the x local variable in one call to function foo is in a different location in memory than the x local variable in another call to foo, regardless of whether those calls happen simultaneously.
Instance variables are different, they're referenced via a reference (pointer) to the memory allocated to the instance of an object. Two running copies of the same function might access the same instance variable at exactly the same time; similarly, two different functions might do so. This is why we get into "threading" or concurrency issues, synchronization, locks, race conditions, etc. But it's also one reason things can be highly efficient.
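A compact Kotlin illustration of that difference (the Service class is hypothetical):

class Service {
    private var shared = 0      // heap: one copy per instance, visible to all threads

    fun work(n: Int): Int {
        val localX = n * 2      // stack: one copy per call, private to the calling thread
        shared += localX        // a data race unless callers synchronize
        return localX
    }
}

Two threads calling work() at the same time each get their own localX, but both touch the same shared field.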
It's called "multi-threading". If each request has its own thread, and the object contains mutable data, each client will have the opportunity to modify the state of the object as they see fit. If the person who wrote the object isn't mindful of thread safety you could end up with an object that's in an inconsistent state between requests.
This is a basic threading issue, you can look it up at http://en.wikipedia.org/wiki/Thread_(computer_science).
Instead of thinking in terms of code that is executed, try to think of memory context of a thread that is changed. It does not matter where and what the actual code happens to be, and if it is the same code or a duplicate or something else.
Basically, it can happen that the function is called while an earlier call is still executing. The two calls are independent and may even run in parallel (on a multicore machine). This independence is achieved by giving each thread its own stack; threads in the same process share a virtual address space, but each call's local variables live on its own thread's stack.
There are ways to synchronize calls, so additional callers have to wait until the first call finishes. This is also explained in the above link.
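In JVM terms, marking the method as synchronized is one such way; a minimal Kotlin sketch (the method name echoes the question, the rest is assumed):

class CustomerService {
    // @Synchronized compiles to synchronized(this) around the body:
    // a second concurrent caller waits until the first call returns.
    @Synchronized
    fun createCustomer(name: String) {
        // ... create and store the customer ...
    }
}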
Can I do any of the following? Will they properly lock/unlock the same object? Why or why not? Assume there are many identical threads using global variable "obj", which was initialized before all threads started.
1.
@synchronized (obj) {
    [obj release];
    obj = nil;
}
2.
@synchronized (obj) {
    obj = [[NSObject new] autorelease];
}
Short answer: no, they won't properly lock/unlock, and such approaches should be avoided.
My first question is why you'd want to do something like this, since these approaches nullify the purposes and benefits of using a @synchronized block in the first place.
In your second example, once a thread changes the value of obj, every subsequent thread that reaches the @synchronized block will synchronize on the new object, not the original object. For N threads, you'd be explicitly creating N autoreleased objects, and the runtime may create up to N recursive locks associated with those objects. Swapping out the object on which you synchronize within the critical section is a fundamental no-no of thread-safe concurrency. Don't do it. Ever. If multiple threads can safely access a block concurrently, just omit the @synchronized entirely.
In your first example, the results may be undefined, and certainly not what you want, either. If the runtime only uses the object pointer to find the associated lock, the code may run fine, but synchronizing on nil has no perceptible effect in my simple tests, so again you're using @synchronized in a pointless way, since it offers no protection whatsoever.
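The same pitfall exists with JVM synchronized blocks, where it may be easier to see (a hedged Kotlin sketch; Holder and its fields are made up):

object Holder {
    var obj: Any = Any()
    private val lock = Any()            // dedicated lock that never changes

    fun broken() = synchronized(obj) {
        // the lock was taken on whatever obj pointed to AT ENTRY;
        // the next caller will lock the NEW object, so two threads
        // can end up inside this "protected" block at once
        obj = Any()
    }

    fun fixed() = synchronized(lock) {
        obj = Any()   // swapping obj is harmless: the lock identity is stable
    }
}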
I'm honestly not trying to be harsh, since I figure you're probably just curious about the construct. I'm just wording this strongly to (hopefully) prevent you and others from writing code that is fatally flawed, especially if under the assumption that it synchronizes properly. Good luck!