Do different locks cover each other? - kotlin

I would like to ask about the locking mechanism. Let's say I want to build something like a "very simple database". I have 3 maps, one for storing values and two others for storing indices "by user" and "by device", so that filtering by either is more efficient. I don't want inconsistencies across these three maps. Almost all methods manipulate 2 or all 3 of the maps, so synchronization is required and concurrent map types are not enough.
Question: Is it okay to mix synchronized(this) with synchronized(userIdIndices), or do I need to use this everywhere? If a thread enters, for example, the addDataPoint() method, would synchronized(this) also lock out getUserIdIndicesSize() and its synchronized(userIdIndices), as some kind of higher-order lock, or are these totally independent locks that would allow concurrent access?
@Component
class DatapointStore {
    private val dataPoints = HashMap<DatapointKey, Double>()
    private val userIdIndices = HashMap<Long, TreeSet<DatapointKey>>()
    private val deviceIdIndices = HashMap<Long, TreeSet<DatapointKey>>()
    fun getDataPointsSize() = synchronized(dataPoints) { dataPoints.size }
    fun getUserIdIndicesSize() = synchronized(userIdIndices) { userIdIndices.size }
    fun getDeviceIdIndicesSize() = synchronized(deviceIdIndices) { deviceIdIndices.size }
    fun addDataPoint(dataPoint: DatapointRequest) {
        synchronized(this) { ... }
    }
    fun filterByUserId(userId: Long): List<DatapointRequest> {
        synchronized(this) { ... }
    }
    fun filterByDeviceId(deviceId: Long): List<DatapointRequest> {
        synchronized(this) { ... }
    }
    fun deleteByUserId(userId: Long) {
        synchronized(this) { ... }
    }
    fun deleteByDeviceId(deviceId: Long) {
        synchronized(this) { ... }
    }
}
I expect it to be fully synchronized, so that one thread cannot add a new datapoint while another is deleting, filtering or checking the size, but I would not mind getUserIdIndicesSize() running while filterByDeviceId() is executing. That is why I want to mix two types of locks.

There's no relationship between two object mutexes. The lock represented by synchronized(this) is a separate and independent lock from the one created by synchronized(dataPoints), even when dataPoints is a property of this. If you want atomic updates across all three of your maps, you'll need to use a single shared mutex (like this) to access them all.
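A minimal sketch of the single-lock approach could look like the following. It assumes a hypothetical DatapointKey with userId, deviceId and timestamp fields (the real class isn't shown in the question), takes the key and value directly instead of a DatapointRequest, and uses a private lock object rather than this so that outside code cannot synchronize on the same monitor:

```kotlin
import java.util.TreeSet

// Hypothetical stand-in for the question's DatapointKey; the real class is not shown.
data class DatapointKey(val userId: Long, val deviceId: Long, val timestamp: Long) :
    Comparable<DatapointKey> {
    override fun compareTo(other: DatapointKey): Int =
        compareValuesBy(this, other, { it.userId }, { it.deviceId }, { it.timestamp })
}

class DatapointStore {
    // One private lock object guards all three maps.
    private val lock = Any()
    private val dataPoints = HashMap<DatapointKey, Double>()
    private val userIdIndices = HashMap<Long, TreeSet<DatapointKey>>()
    private val deviceIdIndices = HashMap<Long, TreeSet<DatapointKey>>()

    fun getDataPointsSize(): Int = synchronized(lock) { dataPoints.size }
    fun getUserIdIndicesSize(): Int = synchronized(lock) { userIdIndices.size }
    fun getDeviceIdIndicesSize(): Int = synchronized(lock) { deviceIdIndices.size }

    fun addDataPoint(key: DatapointKey, value: Double) {
        synchronized(lock) {
            // All three maps change inside one critical section.
            dataPoints[key] = value
            userIdIndices.getOrPut(key.userId) { TreeSet() }.add(key)
            deviceIdIndices.getOrPut(key.deviceId) { TreeSet() }.add(key)
        }
    }

    fun deleteByUserId(userId: Long) {
        synchronized(lock) {
            val keys = userIdIndices.remove(userId) ?: return
            for (key in keys) {
                dataPoints.remove(key)
                val deviceKeys = deviceIdIndices[key.deviceId] ?: continue
                deviceKeys.remove(key)
                if (deviceKeys.isEmpty()) deviceIdIndices.remove(key.deviceId)
            }
        }
    }
}
```

If concurrent reads turn out to matter, one option is a java.util.concurrent.locks.ReentrantReadWriteLock with the read and write extensions from kotlin.concurrent, still shared by all three maps.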

Related

How to wait for a flow to complete emitting the values

I have a function "getUser" in my Repository which emits an object representing a user based on the provided id.
flow function
fun getUser(id: String) = callbackFlow {
    val collectionReference: CollectionReference =
        FirebaseFirestore.getInstance().collection(COLLECTION_USERS)
    val query: Query = collectionReference.whereEqualTo(ID, id)
    query.get().addOnSuccessListener {
        val lst = it.toObjects(User::class.java)
        if (lst.isEmpty())
            offer(null)
        else
            offer(it.toObjects(User::class.java)[0])
    }
    awaitClose()
}
I need these values in another class. I loop over a list of ids and I add the collected user to a new list. How can I wait for the list to be completed when I collect the values, before calling return?
collector function
private fun computeAttendeesList(reminder: Reminder): ArrayList<User> {
    val attendeesList = arrayListOf<User>()
    for (friend in reminder.usersToShare) {
        repoScope.launch {
            Repository.getUser(friend).collect {
                it?.let { user ->
                    if (!attendeesList.contains(user))
                        attendeesList.add(user)
                }
            }
        }
    }
    return attendeesList
}
I do not want to use live data since this is not a UI-related class.
There are multiple problems to address in this code:
1. getUser() is meant to return a single User, but it currently returns a Flow<User> which will never end, and will never emit more than one user.
2. The way the list of users is built from multiple concurrent queries is not thread-safe (multiple launches are executed on the multi-threaded IO dispatcher, and they all update the same unsynchronized list directly).
3. The actual use case is to get a list of users from Firebase, but many single-ID queries are used instead of one query.
Solution to #1
Let's tackle #1 first. Here is a version of getUser() that suspends for a single User instead of returning a Flow:
// Returns null when no user matches the given id
suspend fun getUser(id: String): User? {
    val collectionReference = FirebaseFirestore.getInstance().collection(COLLECTION_USERS)
    val query = collectionReference.whereEqualTo(ID, id)
    return query.get().await().toObjects(User::class.java).firstOrNull()
}
// use the kotlinx-coroutines-play-services library instead
private suspend fun <T> Task<T>.await(): T {
    return suspendCancellableCoroutine { cont ->
        addOnCompleteListener {
            val e = exception
            if (e == null) {
                @Suppress("UNCHECKED_CAST")
                if (isCanceled) cont.cancel() else cont.resume(result as T)
            } else {
                cont.resumeWithException(e)
            }
        }
    }
}
It turns out that this await() function was already written (in a better way) and it's available in the kotlinx-coroutines-play-services library, so you don't need to actually write it yourself.
Solution to #2
If we could not rewrite the whole thing according to #3, we could deal with problem #2 this way:
private suspend fun computeAttendeesList(reminder: Reminder): List<User> {
    return reminder.usersToShare
        .map { friendId ->
            repoScope.async { Repository.getUser(friendId) }
        }
        .awaitAll() // suspends until every query has completed
        .filterNotNull() // drops ids that matched no user
}
Solution to #3
Instead, we could directly query Firebase for the whole list:
suspend fun getUsers(ids: List<String>): List<User> {
    val collectionReference = FirebaseFirestore.getInstance().collection(COLLECTION_USERS)
    val query = collectionReference.whereIn(ID, ids)
    return query.get().await().toObjects(User::class.java)
}
And then consume it in a very basic way:
private suspend fun computeAttendeesList(reminder: Reminder): List<User> {
    return Repository.getUsers(reminder.usersToShare)
}
Alternatively, you could make this function blocking (remove suspend) and wrap your call in runBlocking (if you really need to block the current thread).
Note that this solution didn't enforce any dispatcher, so if you want a particular scope or dispatcher, you can wrap one of the suspend function calls with withContext.

Correct way of locking a mutex in Kotlin

I want to implement a simple thread-safe Buffer, using Kotlin Coroutines, because coroutines are already used within the project.
The buffer will be used both in multi-thread and single-thread contexts, so having suspend fun getMostRecentData() doesn't seem very reasonable (see code below).
This is what I have so far. The fact that I have to write all that code to lock the mutex makes me wonder if I'm doing something wrong.
Anyway here's the code:
class SafeBuffer(
    private val dispatcher: CoroutineDispatcher,
    private val bufferSize: Int
) {
    private val buffer = LinkedList<MyDataType>()
    private val mutex = Mutex()
    val size: Int
        get() = buffer.size
    // First approach: make a suspend fun
    // Not great because I will need a runBlocking{} statement somewhere, every time I want to access the buffer
    suspend fun getMostRecentData() : MyDataType? {
        mutex.withLock {
            return if (buffer.isEmpty()) null else buffer.last
        }
    }
    // Second approach: use a runBlocking block inside the function
    // Seems like it is missing the purpose of coroutines, and I'm not
    // sure it is actually thread safe if other context is used somehow?
    fun getMostRecentData() : MyDataType? {
        runBlocking(dispatcher) {
            mutex.withLock {
                return if (buffer.isEmpty()) null else buffer.last
            }
        }
    }
    /**** More code ****/
    (...)
}
So what's the most idiomatic/elegant way of achieving this?
Expanding on my comment, I think it would be idiomatic to have the buffer class only expose a suspend fun, as the consumer of the class would be responsible for figuring out how they want to use it (via runBlocking or from another coroutine). If you see this use case coming up a lot, an idiomatic approach may be to have an extension function on SafeBuffer to offer this functionality.
Extension functions are used all over the place in the coroutines API. In your code example, even Mutex.withLock is defined as an extension function.
class SafeBuffer(...) {
    private val buffer = LinkedList<MyDataType>()
    private val mutex = Mutex()
    suspend fun getMostRecentData() : MyDataType? =
        mutex.withLock {
            if (buffer.isEmpty()) null else buffer.last
        }
}
fun SafeBuffer.getMostRecentDataBlocking(): MyDataType? =
    runBlocking {
        getMostRecentData()
    }

Kotlin - How to lock a collection when accessing it from two threads

I wondered if anyone could assist. I'm trying to understand the correct way to access a collection in Kotlin from two threads.
The code below simulates a problem I'm having in a live system. One thread iterates over the collection, but another thread can remove elements from that collection.
I have tried adding @Synchronized to the collection's getter, but that still gives me a ConcurrentModificationException.
Can anyone let me know what the correct way of doing this would be?
class ListTest() {
    val myList = mutableListOf<String>()
        @Synchronized
        get() = field
    init {
        repeat(10000) {
            myList.add("stuff: $it")
        }
    }
}
fun main() = runBlocking<Unit> {
    val listTest = ListTest()
    launch(Dispatchers.Default) {
        delay(1L)
        listTest.myList.remove("stuff: 54")
    }
    launch {
        listTest.myList.forEach { println(it) }
    }
}
You are only synchronizing the getter, so by the time you start using the list reference it returned, the lock has already been released.
The kotlinx.coroutines library has the Mutex class (kotlinx.coroutines.sync.Mutex) for guarding manipulation of a shared mutable object. In coroutine code, Mutex is nicer than Java's synchronized because it suspends the coroutine instead of blocking its thread.
Your example would be poor design in the real world because your class publicly exposes a mutable list. But going along with making it at least safe to modify the list:
class ListTest() {
    private val myListMutex = Mutex()
    private val myList = mutableListOf<String>()
    init {
        repeat(10000) {
            myList.add("stuff: $it")
        }
    }
    // The list is the receiver of block, so it is only reachable while the mutex is held
    suspend fun modifyMyList(block: MutableList<String>.() -> Unit) {
        myListMutex.withLock { myList.block() }
    }
}
fun main() = runBlocking<Unit> {
    val listTest = ListTest()
    launch(Dispatchers.Default) {
        delay(1L)
        // Inside the lambda the list is `this`, not `it`
        listTest.modifyMyList { remove("stuff: 54") }
    }
    launch {
        listTest.modifyMyList { forEach { println(it) } }
    }
}
If you are not working with coroutines, instead of a Mutex(), you can use an Any and instead of withLock use synchronized (myListLock) {} just like you would in Java to prevent code from within the synchronized blocks from running at the same time.
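For that non-coroutine case, a rough sketch of the same class with a plain lock object might look like this. The names withMyList and removeWhileIterating are made up for illustration and are not part of the original answer:

```kotlin
import kotlin.concurrent.thread

class ListTest {
    private val myListLock = Any()
    private val myList = mutableListOf<String>()

    init {
        repeat(10_000) { myList.add("stuff: $it") }
    }

    // Runs block while holding the lock, so iteration and removal never overlap.
    fun <R> withMyList(block: MutableList<String>.() -> R): R =
        synchronized(myListLock) { myList.block() }
}

// One thread removes an element while the caller iterates over the list.
// Returns the number of visited elements and the final list size.
fun removeWhileIterating(): Pair<Int, Int> {
    val listTest = ListTest()
    val remover = thread { listTest.withMyList { remove("stuff: 54") } }
    val visited = listTest.withMyList {
        var n = 0
        forEach { n++ } // the whole loop runs under the lock
        n
    }
    remover.join()
    return visited to listTest.withMyList { size }
}
```

Because the list never leaves the class, callers cannot iterate over it without going through the lock.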
If you want to lock a collection, or any object, for concurrent access, you can use almost the same construct as Java's synchronized keyword.
So while accessing such an object you would do
fun someFun() {
    synchronized(yourCollection) {
    }
}
You can also use the synchronizedCollection method from Java's Collections class, but this only makes individual method calls thread-safe. If you have to iterate over the collection, you still have to handle the synchronization manually.
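As a small illustration of that last point, here is a sketch using Collections.synchronizedList. The class SharedNames and its methods are hypothetical names chosen for this example:

```kotlin
import java.util.Collections

class SharedNames {
    // The wrapper synchronizes each individual call on the wrapper object itself.
    private val names: MutableList<String> =
        Collections.synchronizedList(mutableListOf<String>())

    // A single call: already thread-safe through the wrapper.
    fun add(name: String) {
        names.add(name)
    }

    // Iteration spans many calls, so hold the wrapper's lock for the whole loop.
    fun forEachName(action: (String) -> Unit) {
        synchronized(names) {
            for (name in names) action(name)
        }
    }
}
```

Synchronizing on the wrapper (not on the underlying list) is what makes the loop exclusive with respect to add().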

Parallelly consuming a long sequence in Kotlin

I have a function generating a very long sequence of work items. Generating these items is fast, but there are too many in total to store a list of them in memory. Processing the items produces no results, just side effects.
I would like to process these items across multiple threads. One solution is to have a thread read from the generator and write to a concurrent bounded queue, and a number of executor threads polling for work from the bounded queue, but this is a lot of things to set up.
Is there anything in the standard library that would help me do that?
I had initially tried
items.map { async(executor) { process(it) } }.forEach { it.await() }
But, as pointed out in how to implement parallel mapping for sequences in kotlin, this doesn't work for reasons that are obvious in retrospect.
Is there a quick way to do this (possibly with an external library), or is manually setting up a bounded queue in the middle my best option?
You can look at coroutines combined with channels.
If all work items can be emitted on demand through a producer channel, then it's possible to await each item and process it with a pool of threads.
An example:
sealed class Stream {
    object End: Stream()
    class Item(val data: Long): Stream()
}
val produceCtx = newSingleThreadContext("producer")
// A dummy producer that send one million Longs on its own thread
val producer = CoroutineScope(produceCtx).produce {
    for (i in (0 until 1000000L)) send(Stream.Item(i))
    send(Stream.End)
}
val workCtx = newFixedThreadPoolContext(4, "work")
val workers = Channel<Unit>(4)
repeat(4) { workers.offer(Unit) }
for(_nothing in workers) { // launch 4 times then wait for a task to finish
    launch(workCtx) {
        when (val item = producer.receive()) {
            Stream.End -> workers.close()
            is Stream.Item -> {
                workFunction(item.data) // Actual work here
                workers.offer(Unit) // Notify to launch a new task
            }
        }
    }
}
Your magic word would be .asSequence():
items
    .asSequence() // Creates a lazily evaluated sequence
    .forEach { launch { executor.process(it) } } // If you don't need the value afterwards, use 'launch', a.k.a. "fire and forget"
"but there are too many in total to store a list of them in memory"
Then don't map to list and don't collect the values, no matter if you work with Kotlin or Java.
As long as you are on the JVM, you can write yourself an extension function, that works the sequence in chunks and spawns futures for all entries in a chunk. Something like this:
@Suppress("UNCHECKED_CAST")
fun <T, R> Sequence<T>.mapParallel(action: (value: T) -> R?): Sequence<R?> {
    // At least one thread, even on a single-core machine (chunked(0) would throw)
    val numThreads = (Runtime.getRuntime().availableProcessors() - 1).coerceAtLeast(1)
    return this
        .chunked(numThreads)
        .map { chunk ->
            val threadPool = Executors.newFixedThreadPool(numThreads)
            try {
                return@map chunk
                    .map {
                        // CAUTION -> needs to be written like this
                        // otherwise the submit(Runnable) overload is called
                        // which always returns an empty Future!!!
                        val callable: () -> R? = { action(it) }
                        threadPool.submit(callable)
                    }
            } finally {
                threadPool.shutdown()
            }
        }
        .flatten()
        .map { future -> future.get() }
}
You can then just use it like:
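items
    .mapParallel { /* process an item */ }
    .forEach { /* handle the result */ }
As long as the workload per item is similar, this parallelizes well. Each chunk waits for its slowest item before the next chunk is submitted, so very uneven items leave threads idle.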
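The bounded-queue setup described in the question can also be sketched with the JDK alone. This is not taken from any of the answers: forEachParallel is a hypothetical helper, and the thread count and queue size are arbitrary defaults. It relies on ThreadPoolExecutor with a bounded ArrayBlockingQueue and CallerRunsPolicy, so the thread reading the sequence runs a task itself whenever the queue is full:

```kotlin
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit

// Processes every item on `threads` workers while keeping at most `queueSize` items queued.
fun <T> Sequence<T>.forEachParallel(
    threads: Int = 4,
    queueSize: Int = 64,
    action: (T) -> Unit
) {
    val pool = ThreadPoolExecutor(
        threads, threads, 0L, TimeUnit.MILLISECONDS,
        ArrayBlockingQueue<Runnable>(queueSize),
        // When the queue is full, the submitting thread runs the task itself,
        // which pauses consumption of the sequence (backpressure).
        ThreadPoolExecutor.CallerRunsPolicy()
    )
    try {
        for (item in this) pool.execute { action(item) }
    } finally {
        pool.shutdown()
        // Wait for the already queued tasks to finish.
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS)
    }
}
```

Only about threads + queueSize items are alive at any time, so the full sequence is never materialized. Exceptions thrown by action on a worker thread are not propagated to the caller in this sketch.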

How to achieve mutex on method in Kotlin and prioritize one thread before another?

I have two Kafka topics, my_priorized_topic and my_not_so_priorized_topic. I want a mutex on EventProcessor.doLogic, and I always want to handle messages from my_priorized_topic before messages from my_not_so_priorized_topic.
Can anyone give me some pointers how to solve this with Kotlin, maybe with coroutines?
class EventProcessor {
    fun doLogic(message: String) {
        ... // code which cannot be parallelized
    }
}
class KafkaConsumers(private val eventProcessor: EventProcessor) {
    @KafkaConsumer(topic = "my_priorized_topic")
    fun consumeFromPriorizedTopic(message: String) {
        eventProcessor.doLogic(message)
    }
    @KafkaConsumer(topic = "my_not_so_priorized_topic")
    fun consumeFromNotSoPrioritizedTopic(message: String) {
        eventProcessor.doLogic(message)
    }
}
You could create two Channels for your high and low priority tasks. Then to consume the events from the channels, use coroutines' select expression and put the high priority task channel first.
Example (the String is the event):
fun process(value: String) {
    // do what you want with the event
}
suspend fun selectFromHighAndLow(highPriorityChannel: ReceiveChannel<String>, lowPriorityChannel: ReceiveChannel<String>): String =
    select<String> {
        highPriorityChannel.onReceive { value ->
            value
        }
        lowPriorityChannel.onReceive { value ->
            value
        }
    }
val highPriorityChannel = Channel<String>()
val lowPriorityChannel = Channel<String>()
while (true) {
    process(selectFromHighAndLow(highPriorityChannel, lowPriorityChannel))
}
To send stuff to those channels, you can use channel.send(event).
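If coroutines are not an option, a different technique from the one above is a java.util.concurrent.PriorityBlockingQueue drained by a single thread. The sketch below is only an illustration under that assumption: PrioritizedProcessor, submitHigh, submitLow and processNext are hypothetical names, and doLogic stands for the question's EventProcessor.doLogic. The two Kafka listener methods would only enqueue, and one dedicated thread would call processNext() in a loop, which also gives the mutual exclusion on doLogic:

```kotlin
import java.util.concurrent.PriorityBlockingQueue
import java.util.concurrent.atomic.AtomicLong

// priority 0 = high, 1 = low; seq keeps arrival order within one priority.
private data class Event(val priority: Int, val seq: Long, val message: String)

class PrioritizedProcessor(private val doLogic: (String) -> Unit) {
    private val seq = AtomicLong()
    private val queue = PriorityBlockingQueue<Event>(
        16, compareBy<Event>({ it.priority }, { it.seq })
    )

    // Called from the listener of the prioritized topic.
    fun submitHigh(message: String) {
        queue.put(Event(0, seq.getAndIncrement(), message))
    }

    // Called from the listener of the less prioritized topic.
    fun submitLow(message: String) {
        queue.put(Event(1, seq.getAndIncrement(), message))
    }

    // Call from exactly one thread; blocks until an event is available.
    fun processNext() {
        doLogic(queue.take().message)
    }
}
```

A low-priority event that is already being processed is not interrupted; the priority only decides which waiting event is taken next.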