Kotlin Multiplatform: Ktor download a big file and save it to a local file [duplicate]

I've been spending far too much time trying to solve this problem. The code I posted below does work in terms of downloading a file, but the flow behaves very unexpectedly: the response.content.readAvailable() call seems to block until the whole file has finished downloading, and only then do the progress emissions happen. So you wait a long time for the file to download, and then in a split second you get all of the progress updates. Is there a way to read a certain number of bytes at a time, emit a progress update, and repeat until the file is done downloading? Or maybe a way to hook into the readAvailable() method and update the progress that way? Any help with this would be greatly appreciated.
Here's the code I found and modified, but it still does not work right:
suspend fun HttpClient.downloadFile(
    output: File,
    downloadUrl: String,
    md5Hash: String,
) = flow {
    try {
        val response = get<HttpResponse> { url(downloadUrl) }
        val contentLn = response.contentLength()?.toInt() ?: 0
        val data = ByteArray(contentLn)
        var offset = 0
        var bytesRemaining = contentLn
        do {
            val chunkSize = min(maxChunkSize, bytesRemaining)
            logger?.d { "Read Available:" }
            val result = response.content.readAvailable(data, offset, length = chunkSize)
            val progress = ((offset / contentLn.toDouble()) * 100).toInt()
            emit(DownloadResult.Progress(progress))
            logger?.d { "logged progress: $progress" }
            // delay(6000L) was here to test my assumption that readAvailable was blocking
            offset += chunkSize
            bytesRemaining -= chunkSize
        } while (result != -1)
        if (response.status.isSuccess()) {
            if (data.md5().hex == md5Hash) {
                output.writeBytes(data)
                emit(DownloadResult.Success)
            } else {
                emit(DownloadResult.ErrorCorruptFile)
            }
        } else {
            emit(DownloadResult.ErrorBadResponseCode(response.status.value))
        }
    } catch (e: TimeoutCancellationException) {
        emit(DownloadResult.ErrorRequestTimeout("Connection timed out", e))
    }
}

Finally, after a stupid amount of time, I solved this. What you need is to request an HttpStatement and call execute { } on it. That gives you access to the byte channel while it is still downloading.
A very crude implementation (that I'm not yet done with) is this:
get<HttpStatement>(url = downloadUrl).execute {
    var offset = 0
    val byteBufferSize = 1024 * 100
    val channel = it.receive<ByteReadChannel>()
    val contentLen = it.contentLength()?.toInt() ?: 0
    val data = ByteArray(contentLen)
    do {
        val currentRead = channel.readAvailable(data, offset, byteBufferSize)
        if (currentRead > 0) offset += currentRead // the final read returns -1; don't subtract it
        val progress = if (contentLen == 0) 0.0 else (offset / contentLen.toDouble()) * 100
        logger?.d { "progress: $progress" }
    } while (currentRead >= 0)
}
Two things to note about this solution: 1) I'm in the context of HttpClient, which is how I have access to get(). 2) I'm using a byte buffer size of 1024 * 100 so that the readAvailable call doesn't block for too long. That might not be strictly necessary, but one nice thing about it is that it determines how frequently you will publish your progress updates.
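Putting the two pieces together, here's a minimal sketch of the whole download as a flow built on HttpStatement.execute. It assumes the Ktor 1.x client API used above (get<HttpStatement>, response.content as a ByteReadChannel) and reuses the question's own DownloadResult and md5() helpers, which aren't shown here; the function name is mine:

fun HttpClient.downloadFileWithProgress(
    output: File,
    downloadUrl: String,
    md5Hash: String,
): Flow<DownloadResult> = flow {
    // Sketch only: DownloadResult and ByteArray.md5() come from the question,
    // and a Content-Length header is assumed, as in the question's code.
    val bufferSize = 1024 * 100 // also controls how often progress is emitted
    get<HttpStatement>(downloadUrl).execute { response ->
        if (!response.status.isSuccess()) {
            emit(DownloadResult.ErrorBadResponseCode(response.status.value))
            return@execute
        }
        val contentLen = response.contentLength()?.toInt() ?: 0
        val data = ByteArray(contentLen)
        var offset = 0
        do {
            // readAvailable returns as soon as some bytes have arrived
            // (up to bufferSize) instead of waiting for the whole body.
            val currentRead = response.content.readAvailable(data, offset, bufferSize)
            if (currentRead > 0) offset += currentRead
            val progress =
                if (contentLen == 0) 0
                else ((offset / contentLen.toDouble()) * 100).toInt()
            emit(DownloadResult.Progress(progress))
        } while (currentRead >= 0)
        if (data.md5().hex == md5Hash) {
            output.writeBytes(data)
            emit(DownloadResult.Success)
        } else {
            emit(DownloadResult.ErrorCorruptFile)
        }
    }
}

Emitting from inside execute works here because the block runs in the same coroutine as the flow builder; the response stays streaming for the lifetime of the block.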

Related

How can I download a large file with Ktor and Kotlin with a progress indicator?


How to increase the size limit of a mutable list in Kotlin?

I was attempting to solve the Multiset problem (https://codeforces.com/contest/1354/problem/D) on Codeforces using a Fenwick tree data structure. I passed the sample test cases but got a memory limit error after submitting; the test case is mentioned below.
(Basically the testcase is:
1000000 1000000
1.............1 //10^6 times
-1...........-1 //10^6 times).
I tried a similar test case in my IDE and got the error mentioned below.
(Similar to above, the testcase I provided is:
1000000 1
1.............1 //10^6 times
-1
)
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 524289 out of bounds for length 524289
at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
at java.base/java.util.Objects.checkIndex(Objects.java:373)
at java.base/java.util.ArrayList.get(ArrayList.java:426)
at MultisetKt.main(multiset.kt:47)
at MultisetKt.main(multiset.kt)
Here is my code:
private fun readInt() = readLine()!!.split(" ").map { it.toInt() }

fun main() {
    var (n, q) = readInt()
    var list = readInt() // modify the list to store it from index 1
    var finalList = listOf(0) + list
    val query = readInt()
    var bit = MutableList(n + 1) { 0 }

    fun update(i: Int, value: Int) {
        var index = i
        while (index < n) {
            bit[index] = bit[index] + value
            index += (index and -index)
        }
    }

    fun rangefunc(i: Int): Int {
        var su = 0
        var index = i
        while (index > 0) {
            su += bit[index]
            index -= (index and -index)
        }
        return su
    }

    fun find(x: Int): Int {
        var l = 1
        var r = n
        var ans = n
        var mid = 0
        while (l <= r) {
            mid = (l + r) / 2
            if (rangefunc(mid) >= x) {
                ans = mid
                r = mid - 1
            } else {
                l = mid + 1
            }
        }
        return ans
    }

    for (i in 1..n) {
        update(finalList[i], 1)
    }
    for (j in 0..q - 1) {
        if (query[j] > 0) {
            update(query[j], 1)
        } else {
            update(find(-query[j]), -1)
        }
    }
    if (rangefunc(n) == 0) {
        println(0)
    } else {
        println(find(1))
    }
}
I believe this is because the bit list is not able to store 10^6 elements, but I'm not sure. Please let me know what changes I should make in my code; any additional advice on how to deal with such cases in the future would also be welcome.
Thank you in advance :)
An ArrayList can store over 2 billion items (2 * 10^9), so that is not your issue. An IndexOutOfBoundsException means you tried to access an index of an ArrayList that is less than zero or greater than or equal to its size; in other words, an index it doesn't yet contain.
There's more code there than I have time to debug, but I would start at the line the stack trace points to and work out how it's possible for you to call bit[index] with an index that equals the size of the ArrayList.
To answer your literal question: you can explicitly use a LinkedList as your MutableList to avoid the size restriction, but it is heavier, and it is slower when accessing elements by index.
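To illustrate that trade-off concretely (a sketch with made-up sizes; this does not fix the index bug in the question):

import java.util.LinkedList

fun main() {
    // ArrayList-backed list: contiguous backing array, O(1) access by index.
    val arrayBacked: MutableList<Int> = MutableList(1_000_001) { 0 }

    // LinkedList also satisfies MutableList, with no backing array to
    // preallocate, but get(index) becomes an O(n) traversal, which is
    // far too slow for a Fenwick tree's hot loop.
    val linkedBacked: MutableList<Int> = LinkedList(List(1_000_001) { 0 })

    println(arrayBacked[524_289])  // constant time
    println(linkedBacked[524_289]) // walks the chain from the nearer end
}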

Get pending messages with Redis Streams and Spring Data

I use Redis Streams in my Spring Boot application. Within a scheduler I regularly want to get all the pending messages, check how long they have been processing, and re-trigger them if necessary.
My problem is now that I can get the pending messages, but I'm not sure how to get the payload.
My first approach used the pending and range operations. The downside here is that totalDeliveryCount is not increased by range, so I cannot use the range method:
val pendingMessages = stringRedisTemplate.opsForStream<String, Any>()
    .pending(redisStreamName, Consumer.from(redisConsumerGroup, instanceName))
return pendingMessages.filter { pendingMessage ->
    if (pendingMessage.totalDeliveryCount < maxDeliveryAttempts &&
        pendingMessage.elapsedTimeSinceLastDelivery > Duration.ofMillis(pendingTimeout.toLong())
    ) {
        return@filter true
    } else {
        ...
        return@filter false
    }
}.map { // map from PendingMessage::class to a MapRecord with the content
    val map = stringRedisTemplate.opsForStream()
        .range(redisStreamName, Range.just(it.idAsString)) // does not increase totalDeliveryCount!
    if (map != null && map.size > 0) {
        return@map map[0]
    } else {
        return@map null
    }
}.filterNotNull().toList()
My second approach used the pending and read operations. For the read operation I can specify an offset with the current ID. The problem is that I only get back IDs which are higher than the one specified:
val pendingMessages = stringRedisTemplate.opsForStream()
    .pending(redisStreamName, Consumer.from(redisConsumerGroup, instanceName))
return pendingMessages.filter { pendingMessage ->
    if (pendingMessage.totalDeliveryCount < maxDeliveryAttempts &&
        pendingMessage.elapsedTimeSinceLastDelivery > Duration.ofMillis(pendingTimeout.toLong())
    ) {
        return@filter true
    } else {
        ...
        return@filter false
    }
}.map { // map from PendingMessage::class to a MapRecord with the content
    val map = stringRedisTemplate.opsForStream<String, Any>()
        .read(
            it.consumer, StreamReadOptions.empty().count(1),
            StreamOffset.create(redisStreamName, ReadOffset.from(it.id))
        )
    if (map != null && map.size > 0 && map[0].id.value == it.idAsString) { // this check never matches
        return@map map[0]
    } else {
        return@map null
    }
}.filterNotNull().toList()
So when I use ReadOffset.from('1234-0'), I don't get the message 1234-0 but everything after that message. Is there a way to get the exact message while still honoring the totalDeliveryCount and elapsedTimeSinceLastDelivery statistics?
I'm using spring-data-redis 2.3.1.RELEASE
I'm using the following workaround now, which should be good for most cases:
return if (id.sequence > 0) {
    "${id.timestamp}-${id.sequence - 1}"
} else {
    "${id.timestamp - 1}-99999"
}
It relies on the fact that no more than 99999 messages are inserted per millisecond.
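To put the workaround into context, here's a sketch of how it could slot into the second approach. predecessorOf is my name for the helper; RecordId is the same Spring Data Redis type used above:

import org.springframework.data.redis.connection.stream.RecordId

// Build an ID just before the pending one. Reading "everything after"
// the predecessor then includes the pending message itself, and going
// through read() keeps the totalDeliveryCount accounting intact.
fun predecessorOf(id: RecordId): String =
    if (id.sequence > 0) {
        "${id.timestamp}-${id.sequence - 1}"
    } else {
        "${id.timestamp - 1}-99999" // assumes no more than 99999 messages per ms
    }

// Usage inside the map step of the second approach:
// val records = stringRedisTemplate.opsForStream<String, Any>().read(
//     it.consumer,
//     StreamReadOptions.empty().count(1),
//     StreamOffset.create(redisStreamName, ReadOffset.from(predecessorOf(it.id)))
// )
// val record = records?.firstOrNull()?.takeIf { r -> r.id.value == it.idAsString }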

How do I get a single document from a Firestore query within a loop

In my app I am running a for loop over a Firestore query. The query itself is meant to only return documents where the criteria of brand and location are met (String values) AND where a counter ("deal_number") in the document is greater than a comparative counter in the Users collection (Login.deal_number).
So essentially, I look up the user's last counter number and use that as a logical check against the counter in the deal document.
menuref = FirebaseFirestore.getInstance()
menuref.collection("Users").whereEqualTo("uid", userid)
    .addSnapshotListener { value, task ->
        if (task != null) {
            return@addSnapshotListener
        }
        for (document in value!!) {
            val seller_brand = document.getString("brand")!!
            val seller_location = document.getString("location")!!
            val deal_num = document.getLong("last_deal_num")!!
            // Login.deal_number is a companion object
            Login.deal_number = deal_num
            Log.d("Firestore_brand", seller_brand)
            Log.d("Firestore_location", seller_location)
            Log.d("lastdealnum", "${Login.deal_number}")
            menuref.collection("Car_Deals")
                .whereEqualTo("brand", seller_brand)
                .whereEqualTo(seller_location, "True")
                .whereGreaterThan("deal_number", Login.deal_number)
                .addSnapshotListener { value, task ->
                    if (task != null) {
                        return@addSnapshotListener
                    }
                    counter_deal = 0
                    for (document in value!!) {
                        val new_deal_num = document.getLong("deal_number")!!
                        Log.d("dealnumnew", "$new_deal_num")
                        if (new_deal_num == Login.deal_number) {
                            counter_deal = counter_deal + 1
                            break
                        } else if (new_deal_num < Login.deal_number) {
                            counter_deal = counter_deal + 1
                            break
                        } else if (new_deal_num > Login.deal_number && counter_deal < 1) {
                            Log.d("Tag_counter_deal", "$counter_deal")
                            Log.d("Tag_newdeal_num", "$new_deal_num")
                            Log.d("Tag_userdeal_num", "${Login.deal_number}")
                            counter_deal = counter_deal + 1
                            newdealnumref = FirebaseFirestore.getInstance().collection("Users")
                            newdealnumref.document(userid)
                                .update("last_deal_num", new_deal_num)
                                .addOnSuccessListener {
                                }.addOnFailureListener { e ->
                                    Log.w("firestore_create_error", "Error writing to document", e)
                                }
                            Log.d("newdealbrand", "$seller_brand $seller_location")
                            Log.d("newdeal", "New deal found")
                            dealCreatedNotificationChannel() // Android O channel creation
                            CodetoRunforNotification() // generic notification-building code, unchanged
                            with(NotificationManagerCompat.from(this)) {
                                notify(notify_counter, builder)
                                notify_counter++
                            }
                            counter_deal = 0
                            break
                        }
                    }
                }
        }
    }
Given the above, why does Firestore create multiple notifications when there should only be a single event? Is it because the filter does not apply correctly? Or is it the same effect you get with a RecyclerView/ListView, where you need to clear an array to prevent duplicates that meet the criteria?
This seems to be a common problem when driving a notification from a Firestore query. Is this even possible? I have tried all sorts of breaks in the for loop but keep getting multiple hits on the same document. I also tried using counter_deal to limit the number of notifications sent once the snapshot is triggered, with no luck.
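For what it's worth, one way to see where the duplicates come from: a snapshot listener delivers the full matching result set when it attaches and fires again after every change, so iterating over value!! re-processes documents that were already handled. A minimal sketch of reacting only to newly added documents via the SDK's documentChanges API (identifiers reused from the question; this is a standard Firestore pattern, not the asker's code):

// Sketch: menuref, seller_brand and Login.deal_number come from the question.
menuref.collection("Car_Deals")
    .whereEqualTo("brand", seller_brand)
    .whereGreaterThan("deal_number", Login.deal_number)
    .addSnapshotListener { value, error ->
        if (error != null || value == null) return@addSnapshotListener
        // Only react to documents added since the previous snapshot,
        // instead of re-walking the whole result set every time.
        for (change in value.documentChanges) {
            if (change.type == DocumentChange.Type.ADDED) {
                val newDealNum = change.document.getLong("deal_number") ?: continue
                Log.d("newdeal", "New deal found: $newDealNum")
                // build and fire exactly one notification here
            }
        }
    }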

Is there a way to merge filter and map into single operation in Kotlin?

The code below looks for "=" and then splits on it. If there's no "=", the element is filtered away first:
myPairStr.asSequence()
    .filter { it.contains("=") }
    .map { it.split("=") }
However, seeing that we have both
.filter { it.contains("=") }
.map { it.split("=") }
I wonder if there's a single operation that could combine them instead of doing each separately?
You can use mapNotNull instead of map.
myPairStr.asSequence().mapNotNull { it.split("=").takeIf { it.size >= 2 } }
The takeIf function returns null if the list returned by split has size 1, i.e. if "=" is not present in the string. mapNotNull then keeps only the non-null values and puts them in the list that is finally returned.
In your case, this solution will work. In other scenarios, the implementation (to merge filter & map) may be different.
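To make the behavior concrete, a quick usage sketch on made-up input:

fun main() {
    val myPairStr = listOf("a=1", "b=2", "no-separator", "c=3")

    val pairs = myPairStr.asSequence()
        .mapNotNull { it.split("=").takeIf { parts -> parts.size >= 2 } }
        .toList()

    // "no-separator" splits into a single-element list, takeIf returns
    // null for it, and mapNotNull drops it from the result.
    println(pairs) // [[a, 1], [b, 2], [c, 3]]
}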
I see your point, and under the hood split also does an indexOf check to find the appropriate parts.
I don't know of any function supporting both operations in a single one, even though such a function would basically be similar to what we already have in the private split implementation.
So if you really want both in one step (and need that functionality more often), you may want to implement your own splitOrNull function, basically copying the current (private) split implementation and adapting mainly three parts of it: the return type becomes List<String>?; if indexOf delivers -1, we just return null; and some default values make it easily usable (ignoreCase = false, limit = 0). I marked the changes with // added or // changed:
fun CharSequence.splitOrNull(delimiter: String, ignoreCase: Boolean = false, limit: Int = 0): List<String>? { // changed
    require(limit >= 0) { "Limit must be non-negative, but was $limit." }
    var currentOffset = 0
    var nextIndex = indexOf(delimiter, currentOffset, ignoreCase)
    if (nextIndex == -1 || limit == 1) {
        if (currentOffset == 0 && nextIndex == -1) // added
            return null // added
        return listOf(this.toString())
    }
    val isLimited = limit > 0
    val result = ArrayList<String>(if (isLimited) limit.coerceAtMost(10) else 10)
    do {
        result.add(substring(currentOffset, nextIndex))
        currentOffset = nextIndex + delimiter.length
        // Do not search for next occurrence if we're reaching limit
        if (isLimited && result.size == limit - 1) break
        nextIndex = indexOf(delimiter, currentOffset, ignoreCase)
    } while (nextIndex != -1)
    result.add(substring(currentOffset, length))
    return result
}
Having such a function in place, you can then combine both the contains/indexOf check and the split into one call:
myPairStr.asSequence()
    .mapNotNull {
        it.splitOrNull("=") // or: it.splitOrNull("=", limit = 2)
    }
Otherwise your current approach is already good enough. A variation of it is to check the size of the split after splitting, which removes the need to write contains('=') by just checking the expected size, e.g.:
myPairStr.asSequence()
    .map { it.split('=') }
    .filter { it.size > 1 }
If you want to split a $key=$value format where the value may itself contain additional =, you may want to use the following instead:
myPairStr.asSequence()
    .map { it.split('=', limit = 2) }
    .filter { it.size > 1 }
    // .associate { (key, value) -> key to value }
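For completeness, a quick sketch of that last variant on made-up input, showing why limit = 2 matters when a value itself contains '=':

fun main() {
    // Hypothetical input; the second value itself contains '='.
    val myPairStr = listOf("key1=a", "key2=b=c", "garbage")

    val result = myPairStr.asSequence()
        .map { it.split('=', limit = 2) } // at most one split: key and the rest
        .filter { it.size > 1 }           // drops entries without '='
        .associate { (key, value) -> key to value }

    println(result) // {key1=a, key2=b=c}
}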