Kotlin - How to lock a collection when accessing it from two threads

I wonder if anyone could assist: I'm trying to understand the correct way to access a collection in Kotlin from two threads.
The code below simulates a problem I'm having in a live system. One thread iterates over the collection while another thread can remove elements from that list.
I have tried adding @Synchronized to the collection's getter, but that still gives me a ConcurrentModificationException.
Can anyone let me know what the correct way of doing this would be?
class ListTest() {
    val myList = mutableListOf<String>()
        @Synchronized
        get() = field
    init {
        repeat(10000) {
            myList.add("stuff: $it")
        }
    }
}
fun main() = runBlocking<Unit> {
    val listTest = ListTest()
    launch(Dispatchers.Default) {
        delay(1L)
        listTest.myList.remove("stuff: 54")
    }
    launch {
        listTest.myList.forEach { println(it) }
    }
}

You are only synchronizing the getter, so by the time you start using the list reference it returned, the lock has already been released.
The kotlinx.coroutines library has the Mutex class (in kotlinx.coroutines.sync) for guarding manipulation of a shared mutable object. In coroutine code, Mutex is nicer than Java's synchronized because it suspends the coroutine instead of blocking the thread the coroutine runs on.
Your example would be poor design in the real world because your class publicly exposes a mutable list. But going along with it, this at least makes it safe to modify the list:
import kotlinx.coroutines.*
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
class ListTest {
    private val myListMutex = Mutex()
    private val myList = mutableListOf<String>()
    init {
        repeat(10000) {
            myList.add("stuff: $it")
        }
    }
    // Every access to the list goes through the mutex.
    suspend fun modifyMyList(block: MutableList<String>.() -> Unit) {
        myListMutex.withLock { myList.block() }
    }
}
fun main() = runBlocking<Unit> {
    val listTest = ListTest()
    launch(Dispatchers.Default) {
        delay(1L)
        // The list is the lambda's receiver (this), so its methods are called directly.
        listTest.modifyMyList { remove("stuff: 54") }
    }
    launch {
        listTest.modifyMyList { forEach { println(it) } }
    }
}
If you are not working with coroutines, you can use a plain Any as the lock (for example, private val myListLock = Any()) instead of a Mutex, and synchronized(myListLock) { } instead of withLock, just as you would in Java, to prevent the code inside the synchronized blocks from running at the same time.
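For the non-coroutine case, a minimal sketch might look like the following. LockedList, its methods and runLockedListDemo are hypothetical names chosen for illustration; only synchronized and thread come from the Kotlin standard library.

```kotlin
import kotlin.concurrent.thread

// Hypothetical wrapper: every access to the list goes through the same lock.
class LockedList {
    private val lock = Any()
    private val items = mutableListOf<String>()

    fun add(item: String): Boolean = synchronized(lock) { items.add(item) }

    fun remove(item: String): Boolean = synchronized(lock) { items.remove(item) }

    // Iteration holds the lock for its whole duration, so no other thread
    // can modify the list while the loop is running.
    fun forEachLocked(action: (String) -> Unit) = synchronized(lock) {
        items.forEach(action)
    }

    // Returns a copy so callers can iterate without holding the lock.
    fun snapshot(): List<String> = synchronized(lock) { items.toList() }
}

// Fills the list, then removes one element from a second thread while the
// calling thread iterates. Returns (elements seen while iterating, final size).
fun runLockedListDemo(): Pair<Int, Int> {
    val list = LockedList()
    repeat(10_000) { list.add("stuff: $it") }

    val remover = thread { list.remove("stuff: 54") }
    var seen = 0
    list.forEachLocked { seen++ }
    remover.join()

    return seen to list.snapshot().size
}
```

The number of elements seen depends on which thread wins the lock first, but the iteration can no longer fail with a ConcurrentModificationException, because the removal has to wait until the loop has released the lock (or run entirely before it).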

If you want to lock a collection, or any other object, for concurrent access, you can use almost the same construct as Java's synchronized keyword.
So while accessing such an object you would do:
fun someFun() {
    synchronized(yourCollection) {
        // read or modify yourCollection here
    }
}
You can also use the synchronizedCollection method from Java's Collections class, but this only makes individual method calls thread-safe. If you have to iterate over the collection, you still have to handle the synchronization manually.
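As a rough sketch of that last point, the example below uses Collections.synchronizedList, the list-specific sibling of synchronizedCollection. The function names buildSynchronizedList and countWithPrefix are hypothetical and exist only for this illustration.

```kotlin
import java.util.Collections

// Single calls such as add() or remove() are synchronized by the wrapper itself.
fun buildSynchronizedList(count: Int): MutableList<String> {
    val list: MutableList<String> = Collections.synchronizedList(mutableListOf<String>())
    repeat(count) { list.add("stuff: $it") }
    return list
}

// Iteration is a sequence of separate calls, so it must be guarded manually.
// The wrapper returned by Collections.synchronizedList locks on itself,
// which is why the block synchronizes on `list`.
fun countWithPrefix(list: MutableList<String>, prefix: String): Int = synchronized(list) {
    var n = 0
    for (item in list) {
        if (item.startsWith(prefix)) n++
    }
    n
}
```

Any other thread calling list.remove(...) while countWithPrefix is running blocks until the loop finishes, because both sides use the same monitor.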

Related

Issue IDE warning if annotated member is not surrounded with a particular block

I have a data structure which has members that are not thread safe and the caller needs to lock the resource for reading and writing as appropriate. Here's a minimal code sample:
class ExampleResource : LockableProjectItem {
    override val readWriteLock: ReadWriteLock = ReentrantReadWriteLock()
    @RequiresReadLock
    val nonThreadSafeMember: String = ""
}
interface LockableProjectItem {
    val readWriteLock: ReadWriteLock
}
fun <T : LockableProjectItem, Out> T.readLock(block: T.() -> Out): Out {
    try {
        readWriteLock.readLock().lock()
        return block(this)
    } finally {
        readWriteLock.readLock().unlock()
    }
}
fun <T : LockableProjectItem, Out> T.writeLock(block: T.() -> Out): Out {
    try {
        readWriteLock.writeLock().lock()
        return block(this)
    } finally {
        readWriteLock.writeLock().unlock()
    }
}
annotation class RequiresReadLock
A call to ExampleResource.nonThreadSafeMember might then look like this:
val resource = ExampleResource()
val readResult = resource.readLock { nonThreadSafeMember }
To make sure that the caller is aware that the resource needs to be locked, I would like the IDE to issue a warning for any access to a member that is annotated with @RequiresReadLock and is not surrounded by a readLock block. Is there any way to do this in IntelliJ without writing a custom plugin for the IDE?
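I think this is sort of a hack, but using context receivers might work. I don't think they are intended to be used in this way though. Note that context receivers are an experimental feature and have to be enabled with the -Xcontext-receivers compiler option.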
You can declare a dummy object to act as the context receiver, and add that as a context receiver to the property:
object ReadLock
class ExampleResource : LockableProjectItem {
    override val readWriteLock: ReadWriteLock = ReentrantReadWriteLock()
    // properties with context receivers cannot have a backing field, so we need to explicitly declare this
    private val nonThreadSafeMemberField: String = ""
    context(ReadLock)
    val nonThreadSafeMember: String
        get() = nonThreadSafeMemberField
}
Then in readLock, you pass the object:
fun <T : LockableProjectItem, Out> T.readLock(block: context(ReadLock) T.() -> Out): Out {
    // Acquire the lock before entering try, so that finally never unlocks a lock that was not acquired.
    readWriteLock.readLock().lock()
    try {
        return block(ReadLock, this)
    } finally {
        readWriteLock.readLock().unlock()
    }
}
Notes:
This will give you an error if you try to access nonThreadSafeMember without the context receiver:
val resource = ExampleResource()
val readResult = resource.nonThreadSafeMember //error
You can still access nonThreadSafeMember without acquiring a read lock by doing e.g.
with(ReadLock) { // with(ReadLock) doesn't acquire the lock, just gets the context receiver
    resource.nonThreadSafeMember // no error
}
But it's way harder to accidentally write something like this, which I think is what you are trying to prevent.
If you call another function inside readLock, and you want to access nonThreadSafeMember inside that function, you should mark that function with context(ReadLock) too. e.g.
fun main() {
    val resource = ExampleResource()
    val readResult = resource.readLock {
        foo(this)
    }
}
context(ReadLock)
fun foo(x: ExampleResource) {
    x.nonThreadSafeMember
}
The context receiver is propagated through.

Correct way of locking a mutex in Kotlin

I want to implement a simple thread-safe Buffer, using Kotlin Coroutines, because coroutines are already used within the project.
The buffer will be used both in multi-thread and single-thread contexts, so having suspend fun getMostRecentData() doesn't seem very reasonable (see code below).
This is what I have so far. The fact that I have to write all that code to lock the mutex makes me wonder if I'm doing something wrong.
Anyway here's the code:
class SafeBuffer(
    private val dispatcher: CoroutineDispatcher,
    private val bufferSize: Int
) {
    private val buffer = LinkedList<MyDataType>()
    private val mutex = Mutex()
    val size: Int
        get() = buffer.size
    // First approach: make a suspend fun
    // Not great because I will need a runBlocking{} statement somewhere, every time I want to access the buffer
    suspend fun getMostRecentData() : MyDataType? {
        mutex.withLock {
            return if (buffer.isEmpty()) null else buffer.last
        }
    }
    // Second approach: use a runBlocking block inside the function
    // Seems like it is missing the purpose of coroutines, and I'm not
    // sure it is actually thread safe if other context is used somehow?
    fun getMostRecentData() : MyDataType? {
        runBlocking(dispatcher) {
            mutex.withLock {
                return if (buffer.isEmpty()) null else buffer.last
            }
        }
    }
    /**** More code ****/
    (...)
}
So what's the most idiomatic/elegant way of achieving this?
Expanding on my comment, I think it would be idiomatic to have the buffer class only expose a suspend fun, as the consumer of the class would be responsible for figuring out how they want to use it (via runBlocking or from another coroutine). If you see this use case coming up a lot, an idiomatic approach may be to have an extension function on SafeBuffer to offer this functionality.
Extension functions are used all over the place in the coroutines API. In your code example, even Mutex.withLock is defined as an extension function.
class SafeBuffer(...) {
    private val buffer = LinkedList<MyDataType>()
    private val mutex = Mutex()
    suspend fun getMostRecentData() : MyDataType? =
        mutex.withLock {
            if (buffer.isEmpty()) null else buffer.last
        }
}
fun SafeBuffer.getMostRecentDataBlocking(): MyDataType? =
    runBlocking {
        getMostRecentData()
    }

Proper use of coroutines Dispatcher Main and Default

I'm trying to do a CPU-heavy calculation and then update the UI.
Below is my code:
private fun updateData() {
    GlobalScope.launch(Dispatchers.Default){ //work on default thread
        while (true){
            response.forEach {
                val out = doIntensiveWork()
                withContext(Dispatchers.Main){ //update on main thread
                    _data.postValue(out)
                    delay(1500L)
                }
            }
        }
    }
}
Is this way of using coroutines okay? Running the entire work on Main also has no visible effect and works fine:
private fun updateData() {
    GlobalScope.launch(Dispatchers.Main){ //work on Main thread
        while (true){
            response.forEach {
                val out = doIntensiveWork()
                _data.postValue(out)
                delay(1500L)
            }
        }
    }
}
Which one is recommended?
You should avoid using GlobalScope for the reason described here and here.
You should also do heavy computation off the main thread:
suspend fun doHeavyStuff(): Result = withContext(Dispatchers.Default) { // CPU-bound work; use Dispatchers.IO for blocking I/O instead
    // ...
}
suspend fun waitForHeavyStuf() = withContext(Dispatchers.Main) {
    val result = doHeavyStuff() // runs on a background thread, but the result comes back on the Main thread
    updateYourUI()
}
Documentation
More about Dispatchers here
I recommend Roman Elizarov's blog posts about Kotlin coroutines

How can I guarantee to get latest data when I use Coroutine in Kotlin?

Code A is from the architecture-samples project; you can see it here.
updateTasksFromRemoteDataSource() is a suspend function, so it may run asynchronously.
When I call getTasks(forceUpdate: Boolean) with the parameter true, I'm afraid that return tasksLocalDataSource.getTasks() will be executed before updateTasksFromRemoteDataSource() has finished.
I don't know whether Code B can guarantee that return tasksLocalDataSource.getTasks() is executed after updateTasksFromRemoteDataSource().
Code A
class DefaultTasksRepository(
private val tasksRemoteDataSource: TasksDataSource,
private val tasksLocalDataSource: TasksDataSource,
private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
) : TasksRepository {
override suspend fun getTasks(forceUpdate: Boolean): Result<List<Task>> {
// Set app as busy while this function executes.
wrapEspressoIdlingResource {
if (forceUpdate) {
try {
updateTasksFromRemoteDataSource()
} catch (ex: Exception) {
return Result.Error(ex)
}
}
return tasksLocalDataSource.getTasks()
}
}
private suspend fun updateTasksFromRemoteDataSource() {
val remoteTasks = tasksRemoteDataSource.getTasks()
if (remoteTasks is Success) {
// Real apps might want to do a proper sync, deleting, modifying or adding each task.
tasksLocalDataSource.deleteAllTasks()
remoteTasks.data.forEach { task ->
tasksLocalDataSource.saveTask(task)
}
} else if (remoteTasks is Result.Error) {
throw remoteTasks.exception
}
}
...
}
Code B
class DefaultTasksRepository(
private val tasksRemoteDataSource: TasksDataSource,
private val tasksLocalDataSource: TasksDataSource,
private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
) : TasksRepository {
override suspend fun getTasks(forceUpdate: Boolean): Result<List<Task>> {
// Set app as busy while this function executes.
wrapEspressoIdlingResource {
coroutineScope {
if (forceUpdate) {
try {
updateTasksFromRemoteDataSource()
} catch (ex: Exception) {
return Result.Error(ex)
}
}
}
return tasksLocalDataSource.getTasks()
}
}
...
}
Added Content
To Tenfour04: Thanks!
If somebody implements updateTasksFromRemoteDataSource() with launch, as in Code C, are you sure that return tasksLocalDataSource.getTasks() will be executed after updateTasksFromRemoteDataSource() when I call getTasks(forceUpdate: Boolean) with the parameter true?
Code C
class DefaultTasksRepository(
private val tasksRemoteDataSource: TasksDataSource,
private val tasksLocalDataSource: TasksDataSource,
private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
) : TasksRepository {
override suspend fun getTasks(forceUpdate: Boolean): Result<List<Task>> {
// Set app as busy while this function executes.
wrapEspressoIdlingResource {
if (forceUpdate) {
try {
updateTasksFromRemoteDataSource()
} catch (ex: Exception) {
return Result.Error(ex)
}
}
return tasksLocalDataSource.getTasks()
}
}
private suspend fun updateTasksFromRemoteDataSource() {
val remoteTasks = tasksRemoteDataSource.getTasks()
if (remoteTasks is Success) {
// Real apps might want to do a proper sync, deleting, modifying or adding each task.
tasksLocalDataSource.deleteAllTasks()
launch { //I suppose that launch can be fired
remoteTasks.data.forEach { task ->
tasksLocalDataSource.saveTask(task)
}
}
} else if (remoteTasks is Result.Error) {
throw remoteTasks.exception
}
}
}
New Added Content
To Joffrey: Thanks!
I think that Code D can be compiled.
In this case, when forceUpdate is true, tasksLocalDataSource.getTasks() may run before updateTasksFromRemoteDataSource() is done.
Code D
class DefaultTasksRepository(
private val tasksRemoteDataSource: TasksDataSource,
private val tasksLocalDataSource: TasksDataSource,
private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO,
private val myCoroutineScope: CoroutineScope
) : TasksRepository {
override suspend fun getTasks(forceUpdate: Boolean): Result<List<Task>> {
// Set app as busy while this function executes.
wrapEspressoIdlingResource {
if (forceUpdate) {
try {
updateTasksFromRemoteDataSource(myCoroutineScope)
} catch (ex: Exception) {
return Result.Error(ex)
}
}
return tasksLocalDataSource.getTasks()
}
}
private suspend fun updateTasksFromRemoteDataSource(myCoroutineScope: CoroutineScope) {
val remoteTasks = tasksRemoteDataSource.getTasks()
if (remoteTasks is Success) {
// Real apps might want to do a proper sync, deleting, modifying or adding each task.
tasksLocalDataSource.deleteAllTasks()
myCoroutineScope.launch {
remoteTasks.data.forEach { task ->
tasksLocalDataSource.saveTask(task)
}
}
} else if (remoteTasks is Result.Error) {
throw remoteTasks.exception
}
}
...
}
suspend functions look like regular functions from the call site's point of view because they execute sequentially just like regular synchronous functions.
What I mean by this is that the instructions following a plain call to a suspend function do not execute until the called function completes its execution.
This means that code A is fine (when forceUpdate is true, tasksLocalDataSource.getTasks() will never run before updateTasksFromRemoteDataSource() is done), and the coroutineScope in code B is unnecessary.
Now regarding code C, structured concurrency is here to save you.
Nobody can call launch without a CoroutineScope receiver.
Since DefaultTasksRepository doesn't implement CoroutineScope, code C as-is will not compile.
There are 2 ways to make this compile though:
Using GlobalScope.launch {}: this will indeed cause the problem you expect. The body of such a launch runs asynchronously and independently of the caller, so updateTasksFromRemoteDataSource can return before the launch's body is done. The only way to control this is to call .join() on the Job returned by launch (which suspends until the coroutine is done). This is why using GlobalScope is usually not recommended: it can "leak" coroutines.
Wrapping calls to launch in a coroutineScope {...} inside updateTasksFromRemoteDataSource: this ensures that all coroutines launched within the coroutineScope block have finished before the coroutineScope call completes. Note that everything inside the coroutineScope block may very well run concurrently, depending on how launch/async are used, but this is the whole point of using launch in the first place, isn't it?
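To illustrate the second option, here is a minimal sketch that assumes an in-memory stand-in for the local data source. FakeLocalStore, updateStore, refreshAndRead and demoRefresh are hypothetical names and are not part of architecture-samples; coroutineScope, launch, delay and runBlocking are the real kotlinx.coroutines functions.

```kotlin
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical stand-in for the local data source in the question.
class FakeLocalStore {
    private val tasks = CopyOnWriteArrayList<String>()

    suspend fun saveTask(task: String) {
        delay(10) // simulate slow I/O
        tasks.add(task)
    }

    fun getTasks(): List<String> = tasks.toList()
}

// coroutineScope does not return until every coroutine launched inside it
// has completed, so the saves may overlap but none of them outlives this call.
suspend fun updateStore(store: FakeLocalStore, remoteTasks: List<String>) = coroutineScope {
    remoteTasks.forEach { task ->
        launch { store.saveTask(task) }
    }
}

// The read only starts after updateStore has returned, i.e. after all saves.
suspend fun refreshAndRead(store: FakeLocalStore, remoteTasks: List<String>): List<String> {
    updateStore(store, remoteTasks)
    return store.getTasks()
}

// Blocking entry point, only for trying the sketch outside a coroutine.
fun demoRefresh(): List<String> = runBlocking {
    refreshAndRead(FakeLocalStore(), listOf("a", "b", "c"))
}
```

The order in which the tasks end up in the store is not guaranteed, since the saves run concurrently; what is guaranteed is that all three are present when getTasks() is called.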
Now with Code D, my answer for code C sort of still holds. Whether you pass a scope or use the GlobalScope, you're effectively creating coroutines with a bigger lifecycle than the suspending function that starts them.
Therefore, it does create the problem you fear.
But why would you pass a CoroutineScope if you don't want implementers to launch long lived coroutines in the provided scope?
Assuming you don't do that, it's unlikely that a developer would use the GlobalScope (or any scope) to do this. It's generally bad style to create long-lived coroutines from a suspending function. If your function is suspending, callers usually expect that when it completes, it has actually done its work.

How to use Kotlin's coroutines with collections

I'm fairly new to Kotlin and its coroutines module, and I'm trying to do something that seemed pretty simple to me at first.
I have a function (getCostlyList() below) that returns a List after some costly computation. This method is called multiple times sequentially. The results of all these calls are then merged into a Set.
private fun myFun(): Set<Int> {
    return (1..10)
        .flatMap { getCostlyList() }
        .toSet()
}
private fun getCostlyList(): List<Int> {
    // omitting costly code here...
    return listOf(...)
}
My goal would be to use coroutines to make these calls to this costly method asynchronously, but I am having trouble wrapping my head around this issue.
You can write something like this:
private suspend fun myFun(): Set<Int> = coroutineScope {
    (1..10)
        .map { async { getCostlyList() } }
        .awaitAll()
        .flatten()
        .toSet()
}
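One caveat worth sketching: async without arguments inherits the caller's dispatcher. If getCostlyList() is ordinary blocking CPU code and myFun() is called from a single-threaded context (for example a plain runBlocking), the calls still run one after another on that thread. Passing Dispatchers.Default to async is one way to let them run in parallel. In the sketch below, the seed parameter, the body of getCostlyList and myFunBlocking are assumptions made up for illustration; the original function takes no arguments and its body is omitted.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for the costly computation from the question.
fun getCostlyList(seed: Int): List<Int> = (seed..seed + 2).toList()

suspend fun myFun(): Set<Int> = coroutineScope {
    (1..10)
        // Dispatchers.Default is backed by a thread pool sized for CPU-bound work,
        // so the ten computations can run on several threads at once.
        .map { seed -> async(Dispatchers.Default) { getCostlyList(seed) } }
        .awaitAll() // suspends until every async has produced its list
        .flatten()
        .toSet()
}

// Blocking entry point, only for trying the sketch outside a coroutine.
fun myFunBlocking(): Set<Int> = runBlocking { myFun() }
```

Because awaitAll() returns the results in the order of the original list, the merged set does not depend on which computation happens to finish first.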