Is there a way to group values and then calculate sum using RxJava - operators

I have a list of objects which is in form of a Flowable.
Example -
values count
A 2
B 3
C 4
A 5
C 1
I want to group the Flowable by value and then calculate the sum of the counts for each group. Is there a better way of doing this?
I have already tried generating a multimap and then defining a function in the subscriber to aggregate the results. However, I feel I am not using RxJava efficiently for aggregations in streams.
Flowable<Response> responseFlowable = Flowable.fromIterable(generateList());
replayResponseFlowable.toMultimap(response -> response.getValues(), response -> response.getCount()).subscribe(groups-> calculateSum(groups));
}
private static void calculateSum(Map<String,Collection<Integer>> groups)
{
//iterate over the map and calculate sum for each of the groups.
}
The expected result is:
A 7
B 3
C 5
I want to do this computation within the stream using RxJava instead of defining a custom method. How can I do it?

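You can move calculateSum into the stream to make it read better (this assumes calculateSum is changed to return the aggregated result instead of void):
replayResponseFlowable
    .toMultimap(response -> response.getValues(), response -> response.getCount())
    .map(groups -> calculateSum(groups))
    .subscribe(result -> {
        print(result);
    });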
Since you are only aggregating the counts, you don't need to keep track of all items. Instead you can keep only the sum of the count for each value.
This can be done using the .collectInto operator:
replayResponseFlowable
    .collectInto(new HashMap<String, Integer>(), (group, response) -> {
        // add this count to the running sum for its value
        group.merge(response.getValues(), response.getCount(), Integer::sum);
    })
    .subscribe(result -> {
        print(result);
    });
This will print:
emitted=[{value=A, count=7}, {value=B, count=3}, {value=C, count=5}]
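The accumulation step inside the collector can be tried without RxJava. Below is a minimal sketch in plain Java, assuming a hypothetical `Response` class with `getValues()` and `getCount()` (the question shows only those two getters). A `TreeMap` is used purely to make the printed order predictable.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CountSums {
    // Hypothetical stand-in for the question's Response class.
    public static final class Response {
        private final String values;
        private final int count;

        public Response(String values, int count) {
            this.values = values;
            this.count = count;
        }

        public String getValues() { return values; }
        public int getCount() { return count; }
    }

    // Runs the same merge step that the collector lambda applies to each item.
    public static Map<String, Integer> sumByValue(List<Response> responses) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Response r : responses) {
            // Store the count for a new key, otherwise add it to the existing sum.
            sums.merge(r.getValues(), r.getCount(), Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Response> input = List.of(
                new Response("A", 2), new Response("B", 3), new Response("C", 4),
                new Response("A", 5), new Response("C", 1));
        System.out.println(sumByValue(input)); // {A=7, B=3, C=5}
    }
}
```

Only one integer per distinct value is kept, which is why this needs less memory than collecting every count into a multimap first.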
If you want to get the result as the flowable emits items, use .scan:
replayResponseFlowable
    .scan(new HashMap<String, Integer>(), (group, response) -> {
        // add this count to the running sum for its value
        group.merge(response.getValues(), response.getCount(), Integer::sum);
        return group;
    })
    .subscribe(result -> {
        print(result);
    });
This will print:
emitted=[]
emitted=[{value=A, count=2}]
emitted=[{value=A, count=2}, {value=B, count=3}]
emitted=[{value=A, count=2}, {value=B, count=3}, {value=C, count=4}]
emitted=[{value=A, count=7}, {value=B, count=3}, {value=C, count=4}]
emitted=[{value=A, count=7}, {value=B, count=3}, {value=C, count=5}]
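One caveat, as far as I can tell: the lambda returns the same HashMap instance every time, so all emissions refer to one mutable map. Printing right away works, but a subscriber that stores the emissions would later see only the final totals in each of them. Below is a sketch of the running aggregation in plain Java (no RxJava) that copies the map at every step so each snapshot stays fixed. The `runningSums` helper and the `Map.Entry` input are illustrative assumptions, not part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RunningSums {
    // Returns one snapshot per step: the empty seed first, then the totals after each item.
    public static List<Map<String, Integer>> runningSums(List<Map.Entry<String, Integer>> items) {
        List<Map<String, Integer>> snapshots = new ArrayList<>();
        Map<String, Integer> current = new TreeMap<>();
        snapshots.add(current); // the seed, like the first emission of scan
        for (Map.Entry<String, Integer> item : items) {
            // Copy instead of mutating, so earlier snapshots keep their contents.
            Map<String, Integer> next = new TreeMap<>(current);
            next.merge(item.getKey(), item.getValue(), Integer::sum);
            snapshots.add(next);
            current = next;
        }
        return snapshots;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> items = List.of(
                Map.entry("A", 2), Map.entry("B", 3), Map.entry("C", 4),
                Map.entry("A", 5), Map.entry("C", 1));
        // Prints six lines, from {} up to {A=7, B=3, C=5}.
        runningSums(items).forEach(System.out::println);
    }
}
```

Inside the stream, the same effect could be had by returning a copy of the map from the scan lambda, at the cost of one allocation per item.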

Related

mapping custom object kotlin

I have a custom object:
data class MoneyTransaction(
val amount: Double,
val category: String
)
I have a list of MoneyTransaction. I want to create a map out of that list where keys are categories, and the values are the total amount according to the category. Kotlin has functions like groupBy, groupByTo, groupingBy. But there is no tutorial or documentation about those, so I can't figure it out. So far I got this:
val map = transactionList.groupBy({it.category},{it.amount})
But this doesn't give the total amount, just the separate amounts for each category.
Any help would be much appreciated.
First, group your transactions by category:
transactionList.groupBy { it.category }
This gives you a Map<String, List<MoneyTransaction>>. After that, sum up the amounts in each group:
transactionList.groupBy { it.category }
.mapValues { (_, transactionsInCategory) ->
transactionsInCategory.sumOf { it.amount }
}
This will give you a Map<String, Double> with the value representing the sum of all transactions in the category.
You can use groupingBy and then fold:
transactionList.groupingBy(MoneyTransaction::category)
    .fold(0.0) { acc, next -> acc + next.amount }
groupingBy here returns a Grouping<MoneyTransaction, String>, an intermediate object that describes how the elements will be grouped. fold then processes each group, starting from 0.0 and adding the amount of each transaction in turn.
Looking at the implementation, the groupingBy call doesn't do any actual grouping; it just creates a lazy Grouping object. So, effectively, you go through the collection only once.
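For readers coming from Java, roughly the same single-pass aggregation can be sketched with the standard Collectors API. This is a comparison, not part of the Kotlin answer. The `MoneyTransaction` class below is a hypothetical Java counterpart of the data class from the question, and `TreeMap` is chosen only so the result prints in a stable order.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class CategoryTotals {
    // Hypothetical Java counterpart of the Kotlin MoneyTransaction data class.
    public static final class MoneyTransaction {
        private final double amount;
        private final String category;

        public MoneyTransaction(double amount, String category) {
            this.amount = amount;
            this.category = category;
        }

        public double getAmount() { return amount; }
        public String getCategory() { return category; }
    }

    // Groups by category and sums the amounts while the elements are traversed once.
    public static Map<String, Double> totalsByCategory(List<MoneyTransaction> transactions) {
        return transactions.stream().collect(Collectors.groupingBy(
                MoneyTransaction::getCategory,
                TreeMap::new,
                Collectors.summingDouble(MoneyTransaction::getAmount)));
    }

    public static void main(String[] args) {
        List<MoneyTransaction> transactions = List.of(
                new MoneyTransaction(10.0, "food"),
                new MoneyTransaction(800.0, "rent"),
                new MoneyTransaction(5.5, "food"));
        System.out.println(totalsByCategory(transactions)); // {food=15.5, rent=800.0}
    }
}
```

As with groupingBy plus fold in Kotlin, no per-category list is built: the downstream collector keeps only a running sum for each key.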

Feedback on Lambdas

I was hoping someone could provide me some feedback on better/cleaner ways to do the following:
val driversToIncome = trips
.map { trip ->
// associate every driver to a cost (NOT UNIQUE)
trip.driver to trip.cost }
.groupBy (
// aggregate all costs that belong to a driver
keySelector = { (driver, _) -> driver },
valueTransform = { (_, cost) -> cost }
)
.map { (driver, costs) ->
// sum all costs for each driver
driver to costs.sum() }
.toMap()
You can do it like this:
val driversToIncome = trips
.groupingBy { it.driver }
.fold(0) { acc, trip -> acc + trip.cost }
This groups the trips by driver and sums the costs for each driver as it goes.
Note that groupingBy() does not do anything on its own; it only prepares the grouping operation. This solution avoids creating intermediate collections and does everything in a single loop.
fold() then calls the provided lambda sequentially on each item belonging to a given group. The lambda receives the result of the previous call (the accumulator) and returns a new result. In this way, it reduces each group of items to a single value.
You can read more about this kind of transformation in the documentation on Grouping and Aggregation. These operations aren't inventions of Kotlin, either: they exist in other languages and data transformation tools, so you can even read about them on Wikipedia.

Extend groupBy to include multiple aggregations

I implemented a groupBy function that groups columns and applies a particular aggregation, and it works. The issue is that I pass the chosen columns and aggregations as a Map[String,String], which means multiple aggregations cannot be performed on one column, for example sum, mean and max all on the same column.
Below is what works so far:
groupByFunction(input, Map("someSignal" -> "mean"))
def groupByFunction(dataframeDummy: DataFrame,
columnsWithOperation: Map[String,String],
someSession: String = "sessionId",
someSignal: String = "signalName"): DataFrame = {
dataframeDummy
.groupBy(
col(someSession),
col(someSignal)
).agg(columnsWithOperation)
}
Looking into it a bit more, I found that the agg function can take a list of columns, like below:
userData
.groupBy(
window(
(col(timeStampColumnName) / lit(millisSecondsPerSecond)).cast(TimestampType),
timeWindowInS.toString.concat(" seconds")
),
col(sessionColumnName),
col(signalColumnName)
).agg(
mean("physicalSignalValue"),
sum("physicalSignalValue")).show()
So I decided to try to manipulate the input to look like that. Below is how I did it:
val signalIdColumn = columnsWithOperation.toSeq.flatMap { case (key, list) => list.map(key -> _) }
val result = signalIdColumn.map(tuple =>
if (tuple._2 == "mean")
mean(tuple._1)
else if (tuple._2 == "sum")
sum(tuple._1)
else if (tuple._2 == "max")
max(tuple._1))
Now I have a list of columns, which is still a problem for the agg function.
I was able to solve it by using a sequence of tuples, Seq[(String, String)], instead of Map[String,String]:
def groupByFunction(dataframeDummy: DataFrame,
columnsWithOperation: Seq[(String, String)],
someSession: String = "sessionId",
someSignal: String = "signalName"): DataFrame = {
dataframeDummy
.groupBy(
col(someSession),
col(someSignal)
).agg(columnsWithOperation)
and then, using the information from the post below:
https://stackoverflow.com/a/34955432/2091294
userData
.groupBy(
col(someSession),
col(someSignal)
).agg(columnsWithOperation.head, columnsWithOperation.tail: _*)

How to get max double in list?? With only one output, using Kotlin

I have tried using .maxBy, .max() and collection.Max, but the only output I can get states that every element is the max.
val fileName = "src/products.txt"
var products = HashMap<Int, Pair<String, Double>>()
var inputFD = File(fileName).forEachLine {
var pieces = it.split(",")
println("Item# Description Price")
println("----- ------------- ------")
for ( (pro,ducts) in products.toSortedMap() ) {
var pax = mutableListOf(ducts).maxBy { it -> it.second }
var highest = listOf<Double>(ducts.second).max()
println("The highest priced record is ${highest}")
}
The file is set up like this: (111, shoe, 9.99)
The output looks like this:
The highest priced record is [(pants, 89.99)]
The highest priced record is [(shoes, 49.99)]
You are printing the value inside the for loop, so it is printed once for every product. Also, the variable is re-initialized on every iteration with a single element, so every value is the max.
Here is the right approach. Note that you can solve it without using mutable variables.
import java.io.File

val fileName = "src/products.txt"
// assumes each line looks like: 111, shoe, 9.99
val products = File(fileName).readLines() // read all lines from the file into a list
    .map { line -> line.split(",").map { it.trim() } } // split each line on commas and trim whitespace
    .associate { it[0].toInt() to (it[1] to it[2].toDouble()) } // item number -> (description, price)
println("Item# Description Price")
println("----- ------------- ------")
products.toSortedMap().forEach { (item, product) ->
    println("$item\t${product.first}\t${product.second}")
}
// printed once, outside the loop; maxByOrNull returns null for an empty map
println("The highest priced record is ${products.maxByOrNull { it.value.second }}")

How to typesafe reduce a Collection of Either to only Right

Maybe a stupid question but I just don't get it.
I have a Set<Either<Failure, Success>> and want to output a Set<Success> with Arrow-kt.
To keep only the Right values, you can map the set like this:
val successes = originalSet.mapNotNull { it.orNull() }.toSet()
or if you want the lefts:
val failures = originalSet.mapNotNull { it.swap().orNull() }.toSet()
The final toSet() is only needed if you want the result as a Set, because mapNotNull is an extension function on Iterable and always returns a List.
PS: No stupid questions :)
Update:
It can be done avoiding nullables:
val successes = originalSet
.map { it.toOption() }
.filter { it is Some }
.toSet()
We could potentially add Iterable<Option<A>>.filterSome and Iterable<Either<A, B>>.mapAsOptions functions.
Update 2:
That last example returns a Set<Option<Success>>. If you want to unwrap the results without using null then one thing you can try is to fold the Set:
val successes = originalSet
.fold(emptySet<Success>()) { acc, item ->
item.fold({ acc }, { acc + it })
}
This last option (unintended pun) doesn't require the use of Option.