java8 stream grouping aggregate - sum

Given a java class Something
class Something {
String parent;
String parentName;
String child;
Date at;
int noThings;
Something(String parent, String parentName, String child, Date at, int noThings) {
this.parent = parent;
this.parentName = parentName;
this.child = child;
this.at = at;
this.noThings = noThings;
}
String getParent() { return parent; }
String getChild() { return child; }
int getNoThings() { return noThings; }
}
I have a list of something objects,
List<Something> hrlySomethings = Arrays.asList(
new Something("parent1", "pname1", "child1", new Date("01-May-2015 10:00:00"), 4),
new Something("parent1", "pname1", "child1", new Date("01-May-2015 12:00:00"), 2),
new Something("parent1", "pname1", "child1", new Date("01-May-2015 17:00:00"), 8),
new Something("parent1", "pname1", "child2", new Date("01-May-2015 07:00:00"), 12),
new Something("parent1", "pname1", "child2", new Date("01-May-2015 17:00:00"), 14),
new Something("parent2", "pname2", "child3", new Date("01-May-2015 11:00:00"), 3),
new Something("parent2", "pname2", "child3", new Date("01-May-2015 16:00:00"), 2));
I want to group the objects by parent and child and then find the total/sum of the "noThings" field for the last 24 hours. The expected result is:
List<Something> dailySomethings = Arrays.asList(
new Something("parent1", "pname1", "child1", new Date("01-May-2015 00:00:00"), 14),
new Something("parent1", "pname1", "child2", new Date("01-May-2015 00:00:00"), 26),
new Something("parent2", "pname2", "child3", new Date("01-May-2015 00:00:00"), 5));
I'm trying to use streams to do this.
I can figure out how to use grouping to get a map of maps with the totals:
Map<String,Map<String,IntSummaryStatistics>> daily =
hrlySomethings.stream().collect(
Collectors.groupingBy(Something::getParent,
Collectors.groupingBy(Something::getChild,
Collectors.summarizingInt(Something::getNoThings))));
I can figure out how to get a distinct list based on parent and child,
Date startHour = new Date("01-May-2015 00:00:00");
int totalNoThings = 0; // don't know how to put sum in here
List<Something> newList
= hrlySomethings.stream()
.map((Something other) -> {
return new Something(other.getParent(),
other.getChild(), startHour, totalNoThings);
})
.distinct()
.collect(Collectors.toList());
But I don't know how to combine the two to get the distinct list, with the totals. Is this possible?

First, I assume that you are using java.util.Date (though I'd advise you to move to the new java.time API). Second, I assume that the Something class properly implements equals and hashCode. A few more getters are also necessary:
String getParentName() { return parentName; }
Date getAt() { return at; }
Under these assumptions your task can be solved like this:
List<Something> dailySomethings = hrlySomethings.stream().collect(
Collectors.groupingBy(
smth -> new Something(smth.getParent(),
smth.getParentName(),
smth.getChild(),
new Date(smth.getAt().getYear(),
smth.getAt().getMonth(),
smth.getAt().getDate()),
0),
Collectors.summingInt(Something::getNoThings)
)).entrySet().stream()
.map(entry -> new Something(entry.getKey().getParent(),
entry.getKey().getParentName(),
entry.getKey().getChild(),
entry.getKey().getAt(),
entry.getValue()))
.collect(Collectors.toList());
We use groupingBy only once, but create a suitable grouping key: a Something with parent, parentName and child copied from the original, at truncated to the beginning of the day, and noThings set to zero. This way you group exactly what you want. If you need only total sums, then summarizingInt is unnecessary; summingInt is enough. After that we transform the resulting map into a list by creating new Something objects where noThings is filled from the map values and the rest is filled from the keys.
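The same idea can be sketched with the java.time API instead of the deprecated Date accessors. This is only an illustration under stated assumptions: Reading is a hypothetical stand-in for Something that stores a LocalDateTime, and Key is a hypothetical dedicated grouping key whose equals and hashCode cover exactly the grouped fields, so the data class itself does not need them.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class DailyTotals {

    /** Hypothetical stand-in for Something, with a LocalDateTime timestamp. */
    static final class Reading {
        final String parent, parentName, child;
        final LocalDateTime at;
        final int noThings;

        Reading(String parent, String parentName, String child, LocalDateTime at, int noThings) {
            this.parent = parent;
            this.parentName = parentName;
            this.child = child;
            this.at = at;
            this.noThings = noThings;
        }

        int getNoThings() { return noThings; }
    }

    /** Hypothetical grouping key: parent, parentName, child and the calendar day. */
    static final class Key {
        final String parent, parentName, child;
        final LocalDate day;

        Key(Reading r) {
            this.parent = r.parent;
            this.parentName = r.parentName;
            this.child = r.child;
            this.day = r.at.toLocalDate(); // drop the time of day
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return parent.equals(k.parent) && parentName.equals(k.parentName)
                    && child.equals(k.child) && day.equals(k.day);
        }

        @Override public int hashCode() { return Objects.hash(parent, parentName, child, day); }
    }

    /** Groups once by the composite key and sums noThings per group. */
    static Map<Key, Integer> dailyTotals(List<Reading> hourly) {
        return hourly.stream().collect(Collectors.groupingBy(
                Key::new,
                LinkedHashMap::new, // keeps first-encounter order of the groups
                Collectors.summingInt(Reading::getNoThings)));
    }

    /** The sample data from the question. */
    static List<Reading> sample() {
        return Arrays.asList(
                new Reading("parent1", "pname1", "child1", LocalDateTime.of(2015, 5, 1, 10, 0), 4),
                new Reading("parent1", "pname1", "child1", LocalDateTime.of(2015, 5, 1, 12, 0), 2),
                new Reading("parent1", "pname1", "child1", LocalDateTime.of(2015, 5, 1, 17, 0), 8),
                new Reading("parent1", "pname1", "child2", LocalDateTime.of(2015, 5, 1, 7, 0), 12),
                new Reading("parent1", "pname1", "child2", LocalDateTime.of(2015, 5, 1, 17, 0), 14),
                new Reading("parent2", "pname2", "child3", LocalDateTime.of(2015, 5, 1, 11, 0), 3),
                new Reading("parent2", "pname2", "child3", LocalDateTime.of(2015, 5, 1, 16, 0), 2));
    }

    public static void main(String[] args) {
        // Sums per group for the sample data: child1 = 14, child2 = 26, child3 = 5
        dailyTotals(sample()).forEach((key, sum) ->
                System.out.println(key.parent + "/" + key.child + " on " + key.day + " -> " + sum));
    }
}
```

Mapping each map entry back to a result object works the same way as in the answer: read the grouped fields from the key and the total from the value.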

Related

Kotlin: create a list from another list

I want to create a list using Kotlin that contains items of another list, based on endDate equals to startDate and .. etc
Example:
listOf(
{id1, startDate=1, endDate=3},
{id3, startDate=5, endDate=6},
{id2, startDate=3, endDate=5},
{id4, startDate=10, endDate=12},
{id5, startDate=12, endDate=13},
{id6, startDate=13, endDate=16})
result listOf[{id1}, {id2}, {id3}], [{id4}, {id5}, {id6}] // these are two items
With the given dataset, this problem looks innocent at first glance, but it may quickly grow into a more complex one. Imagine a dataset that allows multiple possible results. Should the longest possible chains be preferred, or a result with balanced chain sizes?
A naive implementation may look like this (written inside a Kotest spec).
data class ListItem(
val id: String,
val startDate: Int,
val endDate: Int
)
given("another StackOverflow issue") {
val coll = listOf(
ListItem("id1", startDate = 1, endDate = 3),
ListItem("id3", startDate = 5, endDate = 6),
ListItem("id2", startDate = 3, endDate = 5),
ListItem("id4", startDate = 10, endDate = 12),
ListItem("id5", startDate = 12, endDate = 13),
ListItem("id6", startDate = 13, endDate = 16)
)
`when`("linking chain") {
/** final result ends up here */
val chains: MutableList<MutableList<ListItem>> = mutableListOf()
/** populate deque with items ordered by startDate */
val arrayDeque = ArrayDeque(coll.sortedBy { it.startDate })
/** loop is iterated at least once, hence do/while */
do {
/** add a new chain */
chains.add(mutableListOf())
/** get first element for chain */
var currentItem: ListItem = arrayDeque.removeFirst()
/** add first element to current chain */
chains.last().add(currentItem)
/** add items to current chain until chain is broken */
while (arrayDeque.any { it.startDate == currentItem.endDate }) {
/** get next element to add to chain and remove it from deque */
currentItem = arrayDeque
.first { it.startDate == currentItem.endDate }
.also { arrayDeque.remove(it) }
chains.last().add(currentItem)
}
} while (arrayDeque.any())
then("result should be as expected") {
chains.size shouldBe 2
chains.first().size shouldBe 3
chains.last().size shouldBe 3
chains.flatMap { it.map { innerItem -> innerItem.id } } shouldBe listOf(
"id1",
"id2",
"id3",
"id4",
"id5",
"id6",
)
}
}
}

Repast: query an agent set and count the number of agents in while loop

I want to achieve a logic like this:
while (count loading_docks with [status == "free"] > 0 and trucks with [status == "free" and type == "20'" and capacity < 1000] > 0) {
match a truck satisfying the above 3 conditions to a free dock for unloading cargo;
}
As can be seen, the query needs to be called and updated repeatedly in the while loop, and the second query is composed of 3 conditions (which is not easy with the AndQuery() method).
This is very easy to implement in NetLogo. What is a suitable and shorter way to achieve it in Repast?
UPDATE - the initial attempt
public void match_dock() {
for (Truck t: this.getTruck_queue()) {
if (this.Count_freeDock() > 0) {
Query<Object> fit_dock = new AndQuery(
new PropertyEquals(context, "status", 1),
new PropertyGreaterThanEquals(context, "max_veh", t.getTruck_type()));
double min = 10000;
Dock match = null;
for(Object o: fit_dock.query()) {
if (((Dock)o).getMax_veh() < min) {
match = (Dock)o;
}
}
match.setStatus(2);
match.getServe_list().add(t.getReq_id());
t.setServe_dock(match.getId());
// if (t.getServe_dock() != -1) {
// this.getTruck_queue().remove(t);
// }
}
}
}
public int Count_freeDock() {
List<Dock> free_list = new ArrayList<Dock>();
Query<Object> free_dock = new PropertyEquals<Object>(context, "status", 1);
for (Object o : free_dock.query()) {
if (o instanceof Dock) {
free_list.add((Dock)o);
}
}
return free_list.size();
}
There are three issues to fix:
1) The query of a particular agent set has to consider three conditions; AndQuery only composes two conditions. Is there a Query method which allows more than two conditions to be considered at the same time?
current problem:
Query<Object> pre_fit = new AndQuery(
new PropertyEquals(context, "status", 1),
new PropertyGreaterThanEquals(context, "max_veh", t.getTruck_type()));
Query<Object> fit_dock = new AndQuery(pre_fit, new PropertyEquals(context, "ops_type", 3));
The initial composition of two conditions works fine and queries fast. However, when I add the third condition "ops_type", the query becomes extremely slow. What is the reason behind this? Or is this the correct way to compose three conditions?
2) Is there a simpler way to query the size (count) of a particular agent set, other than writing a custom count function (as shown in the example)?
3) What is the shortest way to add (or copy) the queried agent set into a list for related list operations?
Updated code block:
public void match_dock() {
Iterator<Truck> truck_list = this.getTruck_queue().iterator();
while(truck_list.hasNext() && this.Count_freeDock() > 0) {
Truck t = truck_list.next();
// Query<Object> pre_fit = new AndQuery(
// new PropertyEquals(context, "status", 1),
// new PropertyGreaterThanEquals(context, "max_veh", t.getTruck_type()));
// Query<Object> ops_fit = new OrQuery<>(
// new PropertyEquals(context, "ops_type", 3),
// new PropertyEquals(context, "ops_type", this.getOps_type(t.getOps_type())));
// Query<Object> fit_dock = new AndQuery(pre_fit, new PropertyEquals(context, "ops_type", 3));
// Query<Object> fit_dock = new AndQuery(pre_fit, ops_fit);
Query<Object> pre_fit = new AndQuery(
new PropertyEquals(context, "status", 1),
new PropertyGreaterThanEquals(context, "max_veh", t.getTruck_type()));
Query<Object> q = new PropertyEquals(context, "ops_type", 3);
double min = 10000;
Dock match = null;
for (Object o : q.query(pre_fit.query())) {
// for(Object o: fit_dock.query()) {
if (((Dock)o).getMax_veh() < min) {
match = (Dock)o;
}
}
try {
match.setStatus(2);
match.getServe_list().add(t.getReq_id());
t.setServe_dock(match.getId());
if (t.getServe_dock() != -1) {
System.out.println("truck id " + t.getReq_id() + "serve dock: " + t.getServe_dock());
t.setIndock_tm(this.getTick());
truck_list.remove();
}
}
catch (Exception e){
// System.out.println("No fit dock found");
}
}
}
public int Count_freeDock() {
List<Dock> free_list = new ArrayList<Dock>();
Query<Object> free_dock = new PropertyEquals<Object>(context, "status", 1);
for (Object o : free_dock.query()) {
if (o instanceof Dock) {
free_list.add((Dock)o);
}
}
// System.out.println("free trucks: " + free_list.size());
return free_list.size();
}
UPDATE on 5/5
I have moved the query outside the while loop to make the problem easier to isolate.
The slowdown appears to be largely due to the use of "PropertyGreaterThanEquals":
With "PropertyGreaterThanEquals", the query runs very slowly regardless of whether the queried field is an int or a double. However, it returns the correct result.
With "PropertyEquals", the query runs in less than one second regardless of whether the queried field is an int or a double. However, the result is not correct, since the condition needs to be ">=".
public void match_dock() {
System.out.println("current tick is: " + this.getTick());
Iterator<Truck> truck_list = this.getTruck_queue().iterator();
Query<Object> pre_fit = new AndQuery(
new PropertyEquals(context, "status", 1),
new PropertyGreaterThanEquals(context, "max_veh", 30));
//new PropertyEquals(context, "max_veh", 30));
Query<Object> q = new PropertyEquals(context, "hv_spd", 240);
for (Object o : q.query(pre_fit.query())) {
if (o instanceof Dock) {
System.out.println("this object is: " + ((Dock)o).getId());
}
}
}
For 1, you could try chaining the queries like so:
Query<Object> pre_fit = new AndQuery(
new PropertyEquals(context, "status", 1),
new PropertyGreaterThanEquals(context, "max_veh", t.getTruck_type()));
Query<Object> q = new PropertyEquals(context, "ops_type", 3);
for (Object o : q.query(pre_fit.query())) { ...
I think this can be faster than embedding the AndQuery, but I'm not entirely sure.
For 2, I think some of the Iterables produced by a Query are in fact Java Sets. You could try to cast to one of those and then call size(). If it's not a Set, then you do in fact have to iterate, as the query filter conditions are actually applied as part of the iteration.
For 3, I think there are some Java methods for this. If the result is a java.util.Collection, new ArrayList<>(collection) copies it; for a plain Iterable you have to loop and add each element, since ArrayList has no constructor that takes an Iterable.
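Points 2 and 3 can be sketched with plain java.util code. The helper class and method names below (IterableUtils, count, toList) are hypothetical and not part of Repast; the sketch only assumes that a query result is some Iterable, which is how the question's code already consumes it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class IterableUtils {

    private IterableUtils() { }

    /** Counts elements; uses size() when the Iterable happens to be a Collection. */
    static int count(Iterable<?> items) {
        if (items instanceof Collection) {
            return ((Collection<?>) items).size();
        }
        int n = 0;
        for (Object ignored : items) {
            n++; // a lazily filtered Iterable applies its filter during this loop
        }
        return n;
    }

    /** Copies the elements into a new ArrayList, preserving iteration order. */
    static <T> List<T> toList(Iterable<? extends T> items) {
        if (items instanceof Collection) {
            return new ArrayList<>((Collection<? extends T>) items);
        }
        List<T> out = new ArrayList<>();
        for (T item : items) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> docks = Arrays.asList("dock1", "dock2", "dock3");
        // Wrap the list so it is only an Iterable, like a lazily evaluated query result.
        Iterable<String> lazy = docks::iterator;
        System.out.println(count(lazy));  // 3
        System.out.println(toList(lazy)); // [dock1, dock2, dock3]
    }
}
```

If both the count and the elements are needed, copying once with toList and calling size() on the copy avoids iterating the query result twice.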

Assigning values to ArrayList using mapTo

Previously I was using this code:
private val mItems = ArrayList<Int>()
(1..item_count).mapTo(mItems) { it }
/*
mItems will be: "1, 2, 3, 4, 5, ..., item_count"
*/
Now I am using a class instead of Int, and the class has an Int member named id.
class ModelClass(var id: Int = 0, var status: String = "smth")
So how can I use this method to fill the ArrayList in similar way?
//?
private val mItems = ArrayList<ModelClass>()
(1..item_count).mapTo(mItems) { mItems[position].id = it } // Something like this
//?
From the mapTo documentation:
Applies the given transform function to each element of the original collection and appends the results to the given destination.
Therefore, you just need to return the elements you want:
(1..item_count).mapTo(mItems) { ModelClass(it) }
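If you are OK with any MutableList (which is often ArrayList or similar), note that the initializer receives a zero-based index, so add 1 to match the 1..item_count range:
val mItems1 = MutableList(item_count) { i -> i + 1 }
val mItems2 = MutableList(item_count) { ModelClass(it + 1) }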

Fluent Assertions OnlyContain

Using FluentAssertions, I want to check a list only contains objects with certain values.
For example, I attempted to use a lambda;
myobject.Should().OnlyContain(x=>x.SomeProperty == "SomeValue");
However, this syntax is not allowed.
I'm pretty sure that should work. Check out this example unit test from FluentAssertions' GitHub repository:
[TestMethod]
public void When_a_collection_contains_items_not_matching_a_predicate_it_should_throw()
{
//-----------------------------------------------------------------------------------------------------------
// Arrange
//-----------------------------------------------------------------------------------------------------------
IEnumerable<int> collection = new[] { 2, 12, 3, 11, 2 };
//-----------------------------------------------------------------------------------------------------------
// Act
//-----------------------------------------------------------------------------------------------------------
Action act = () => collection.Should().OnlyContain(i => i <= 10, "10 is the maximum");
//-----------------------------------------------------------------------------------------------------------
// Assert
//-----------------------------------------------------------------------------------------------------------
act.ShouldThrow<AssertFailedException>().WithMessage(
"Expected collection to contain only items matching (i <= 10) because 10 is the maximum, but {12, 11} do(es) not match.");
}

Linq version of SQL "IN" statement

I have the following 3 tables as part of a simple "item tagging" schema:
==Items==
ItemId int
Brand varchar
Name varchar
Price money
Condition varchar
Description varchar
Active bit
==Tags==
TagId int
Name varchar
Active bit
==TagMap==
TagMapId int
TagId int (fk)
ItemId int (fk)
Active bit
I want to write a LINQ query to bring back Items that match a list of tags (e.g. TagId = 2,3,4,7). In my application context, examples of items would be "Computer Monitor", "Dress Shirt", "Guitar", etc. and examples of tags would be "electronics", "clothing", etc. I would normally accomplish this with a SQL IN Statement.
Something like
var TagIds = new int[] {12, 32, 42};
var q = from map in Context.TagMaps
where TagIds.Contains(map.TagId)
select map.Items;
should do what you need. This will generate an IN (12, 32, 42) clause (or, more specifically, a parameterized IN clause, if I'm not mistaken).
Given an array of items:
var list = new int[] {2, 3, 4};
use:
where list.Contains(tm.TagId)
List<int> tagIds = new List<int>() {2, 3, 4, 7};
int tagIdCount = tagIds.Count;
//
// Items that have any of the tags
// (any item may have any of the tags, not necessarily all of them)
//
var ItemsAnyTags = db.Items
.Where(item => item.TagMaps
.Any(tm => tagIds.Contains(tm.TagId))
);
//
// Items that have ALL of the tags
// (any item may have extra tags that are not mentioned).
//
var ItemIdsForAllTags = db.TagMap
.Where(tm => tagIds.Contains(tm.TagId))
.GroupBy(tm => tm.ItemId)
.Where(g => g.Count() == tagIdCount)
.Select(g => g.Key);
//
var ItemsWithAllTags = db.Items
.Where(item => ItemIdsForAllTags.Contains(item.ItemId));
//runs just one query against the database
List<Item> result = ItemsWithAllTags.ToList();
You can simply use,
var TagIds = new[] {12, 32, 42};
var prod = entities.TagMaps.Where(tagmaps => TagIds.Contains(tagmaps.TagId));
string[] names = {"John", "Cassandra", "Sarah"};
var results = (from n in db.Names
where names.Contains(n.Name)
select n).ToList();
You may create an extension method "IN()"
public static class Extension
{
public static bool IN(this object anyObject, params object[] list)
{ return list.Contains(anyObject); }
}
to be used like this
var q = from map in Context.TagMaps
where map.TagId.IN(2, 3, 4, 7)
select map.Items;
Or just use the inline array.Contains() notation:
var q = from map in Context.TagMaps
where new[]{2, 3, 4, 7}.Contains(map.TagId)
select map.Items;