"Distributing" a sequence among other sequences - sequence

I have an IObservable that produces a sequence like this:
A1, A2, A3, A4, B1, B2, B3, B4, C1, C2, C3, C4, D1, D2, D3, D4, ...
I would like to have a method:
IObservable<T>[] Distribute(IObservable<T>, count)
That, when called on an IObservable, would produce other IObservables - four, in this case - so that the output of each one would be the elements from a relative "position" in the original sequence (pseudo-code):
GetOutput(unzipped[0]) = { A1, B1, C1, D1, ... };
GetOutput(unzipped[1]) = { A2, B2, C2, D2, ... };
This is identical to a person dealing cards to N people (N = count): you give one card to each person in sequence, then start over.
The questions are:
Is it possible to get that functionality with existing Rx methods?
If not, how could I implement it myself? I feel that using Where with a modulo division in the index could be acceptable, except if the answer to question 1 above is "yes"...

When in Rx, you generally stay in IObservable land as long as possible. So a more natural signature would be IObservable<IObservable<T>> Distribute(IObservable<T> source, int count), which would look like this:
public static IObservable<IObservable<T>> Distribute<T>(this IObservable<T> source, int count)
{
    // Pair each element with its index, group by index modulo count,
    // then strip the index again inside each group.
    var toReturn = source.Select((t, i) => Tuple.Create(t, i))
                         .GroupBy(t => t.Item2 % count)
                         .Select(g => g.Select(t => t.Item1));
    return toReturn;
}
You can get it as an array like this:
public static IObservable<T>[] DistributeArray<T>(this IObservable<T> source, int count)
{
var toReturn = new IObservable<T>[count];
for (int i = 0; i < count; i++)
{
var enclosedI = i;
toReturn[enclosedI] = source
.Where((t, j) => j % count == enclosedI);
}
return toReturn;
}
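The index-modulo rule that both versions rely on can be checked without Rx. The following is a minimal Java sketch over a plain list, not Rx code; the names `RoundRobin` and `distribute` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Rx: deals items to `count` buckets in turn.
final class RoundRobin {
    static <T> List<List<T>> distribute(List<T> source, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        List<List<T>> buckets = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            buckets.add(new ArrayList<>());
        }
        // The element at index i goes to bucket i % count, like dealing cards.
        for (int i = 0; i < source.size(); i++) {
            buckets.get(i % count).add(source.get(i));
        }
        return buckets;
    }
}
```

With the sequence A1 ... D4 and count = 4, bucket 0 holds A1, B1, C1, D1, which matches unzipped[0] in the question.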

Matching partial words in two different columns

I am trying to weed out a certain kind of customer from our database. I've noticed a trend where people fill in their first name with a partial copy of their company name. An example would look like:
business_name           first_name
----------------------  ----------
locksmith taylorsville  locksmith
locksmith roy           locksmi
locksmith clinton       locks
locksmith farmington    locksmith
These are people I do not want being pulled in a query. They are bad eggs. I'm trying to put together a query with a WHERE statement (presumably) that isolates anyone who has a first name that contains at least a partial match to their business name, but I'm stumped and could use some help.
You can use the LIKE operator:
SELECT * FROM table WHERE business_name NOT LIKE CONCAT(first_name, '%')
% matches any sequence of characters, so this query excludes rows whose business_name starts with first_name.
You can employ a similarity-based approach.
Try the code at the bottom of this answer.
It produces a result like the one below:
business_name           partial_business_name  first_name  similarity
locksmith taylorsville  locksmith              locksmith   1.0
locksmith farmington    locksmith              locksmith   1.0
locksmith roy           locksmith              locksmi     0.7777777777777778
locksmith clinton       locksmith              locks       0.5555555555555556
So you can control what to filter out based on the similarity value.
** Code **
SELECT business_name, partial_business_name, first_name, similarity FROM
JS( // input table
(
SELECT business_name, REGEXP_EXTRACT(business_name, r'^(\w+)') AS partial_business_name, first_name AS first_name FROM
(SELECT 'locksmith taylorsville' AS business_name, 'locksmith' AS first_name),
(SELECT 'locksmith roy' AS business_name, 'locksmi' AS first_name),
(SELECT 'locksmith clinton' AS business_name, 'locks' AS first_name),
(SELECT 'locksmith farmington' AS business_name, 'locksmith' AS first_name),
) ,
// input columns
business_name, partial_business_name, first_name,
// output schema
"[{name: 'business_name', type:'string'},
{name: 'partial_business_name', type:'string'},
{name: 'first_name', type:'string'},
{name: 'similarity', type:'float'}]
",
// function
"function(r, emit) {
var _extend = function(dst) {
var sources = Array.prototype.slice.call(arguments, 1);
for (var i=0; i<sources.length; ++i) {
var src = sources[i];
for (var p in src) {
if (src.hasOwnProperty(p)) dst[p] = src[p];
}
}
return dst;
};
var Levenshtein = {
/**
* Calculate levenshtein distance of the two strings.
*
* @param str1 String the first string.
* @param str2 String the second string.
* @return Integer the levenshtein distance (0 and above).
*/
get: function(str1, str2) {
// base cases
if (str1 === str2) return 0;
if (str1.length === 0) return str2.length;
if (str2.length === 0) return str1.length;
// two rows
var prevRow = new Array(str2.length + 1),
curCol, nextCol, i, j, tmp;
// initialise previous row
for (i=0; i<prevRow.length; ++i) {
prevRow[i] = i;
}
// calculate current row distance from previous row
for (i=0; i<str1.length; ++i) {
nextCol = i + 1;
for (j=0; j<str2.length; ++j) {
curCol = nextCol;
// substitution
nextCol = prevRow[j] + ( (str1.charAt(i) === str2.charAt(j)) ? 0 : 1 );
// insertion
tmp = curCol + 1;
if (nextCol > tmp) {
nextCol = tmp;
}
// deletion
tmp = prevRow[j + 1] + 1;
if (nextCol > tmp) {
nextCol = tmp;
}
// copy current col value into previous (in preparation for next iteration)
prevRow[j] = curCol;
}
// copy last col value into previous (in preparation for next iteration)
prevRow[j] = nextCol;
}
return nextCol;
}
};
var the_partial_business_name, the_first_name;
try {
the_partial_business_name = decodeURI(r.partial_business_name).toLowerCase();
} catch (ex) {
the_partial_business_name = r.partial_business_name.toLowerCase();
}
try {
the_first_name = decodeURI(r.first_name).toLowerCase();
} catch (ex) {
the_first_name = r.first_name.toLowerCase();
}
emit({business_name: r.business_name, partial_business_name: the_partial_business_name, first_name: the_first_name,
similarity: 1 - Levenshtein.get(the_partial_business_name, the_first_name) / the_partial_business_name.length});
}"
)
ORDER BY similarity DESC
This was used in "How to perform trigram operations in Google BigQuery?" and is based on https://storage.googleapis.com/thomaspark-sandbox/udf-examples/pataky.js by @thomaspark, where Levenshtein distance is used to measure similarity.
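The similarity column in the result above is 1 minus the Levenshtein distance divided by the length of the first word of business_name. As a rough cross-check outside BigQuery, here is a small Java sketch of the same formula. The names `Similarity`, `levenshtein` and `similarity` are illustrative, and the two-row implementation is a standard one rather than a copy of the UDF.

```java
final class Similarity {
    // Standard dynamic-programming Levenshtein distance using two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                // Cheapest of insertion, deletion and substitution.
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }

    // Same formula as the query: 1 - distance / length of the reference string.
    // Assumes reference is non-empty.
    static double similarity(String reference, String candidate) {
        return 1.0 - (double) levenshtein(reference, candidate) / reference.length();
    }
}
```

For "locksmith" against "locksmi" the distance is 2, so the similarity is 1 - 2/9, in line with the 0.777... shown in the table.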
This will do the trick:
select * from TableName where lower(business_name) contains lower(first_name)
Use lower() in case the values contain upper-case letters. This returns the matching rows; negate the condition to exclude them instead. Hope it helps.

REGEXP_REPLACE pattern has to be const? Comparing strings in BigQuery

I'm trying to measure similarity between strings using Dice's Coefficient (aka Pair Similarity) in BigQuery. For a second I thought that I can do that using just standard functions.
Suppose I need to compare "gana" and "gano". Then I would "cook" these two strings upfront into 'ga|an|na' and 'ga|an|no' (lists of 2-grams) and do this:
REGEXP_REPLACE('ga|an|na', 'ga|an|no', '')
Then based on change in length I can calculate my coeff.
But once applied to the table I get:
REGEXP_REPLACE second argument must be const and non-null
Is there any workaround for that? With simple REPLACE() second argument can be a field.
Maybe there is a better way to do it? I know, I can do UDF instead. But I wanted to avoid them here. We are running big tasks and UDFs are generally slower (at least in my experience) and are subject to different concurrency limit.
You can embed JavaScript code inside BigQuery SQL queries.
To measure similarity you could use Levenshtein's distance with a query like this (from https://stackoverflow.com/a/33443564/132438):
SELECT *
FROM js(
(
SELECT title,target FROM
(SELECT 'hola' title, 'hello' target), (SELECT 'this is beautiful' title, 'that is fantastic' target)
),
title, target,
// Output schema.
"[{name: 'title', type:'string'},
{name: 'target', type:'string'},
{name: 'distance', type:'integer'}]",
// The function
"function(r, emit) {
var _extend = function(dst) {
var sources = Array.prototype.slice.call(arguments, 1);
for (var i=0; i<sources.length; ++i) {
var src = sources[i];
for (var p in src) {
if (src.hasOwnProperty(p)) dst[p] = src[p];
}
}
return dst;
};
var Levenshtein = {
/**
* Calculate levenshtein distance of the two strings.
*
* @param str1 String the first string.
* @param str2 String the second string.
* @return Integer the levenshtein distance (0 and above).
*/
get: function(str1, str2) {
// base cases
if (str1 === str2) return 0;
if (str1.length === 0) return str2.length;
if (str2.length === 0) return str1.length;
// two rows
var prevRow = new Array(str2.length + 1),
curCol, nextCol, i, j, tmp;
// initialise previous row
for (i=0; i<prevRow.length; ++i) {
prevRow[i] = i;
}
// calculate current row distance from previous row
for (i=0; i<str1.length; ++i) {
nextCol = i + 1;
for (j=0; j<str2.length; ++j) {
curCol = nextCol;
// substitution
nextCol = prevRow[j] + ( (str1.charAt(i) === str2.charAt(j)) ? 0 : 1 );
// insertion
tmp = curCol + 1;
if (nextCol > tmp) {
nextCol = tmp;
}
// deletion
tmp = prevRow[j + 1] + 1;
if (nextCol > tmp) {
nextCol = tmp;
}
// copy current col value into previous (in preparation for next iteration)
prevRow[j] = curCol;
}
// copy last col value into previous (in preparation for next iteration)
prevRow[j] = nextCol;
}
return nextCol;
}
};
var the_title;
try {
the_title = decodeURI(r.title).toLowerCase();
} catch (ex) {
the_title = r.title.toLowerCase();
}
emit({title: the_title, target: r.target,
distance: Levenshtein.get(the_title, r.target)});
}")
Below is a version tailored for similarity.
It was used in "How to perform trigram operations in Google BigQuery?" and is based on https://storage.googleapis.com/thomaspark-sandbox/udf-examples/pataky.js by @thomaspark.
SELECT text1, text2, similarity FROM
JS(
// input table
(
SELECT * FROM
(SELECT 'mikhail' AS text1, 'mikhail' AS text2),
(SELECT 'mikhail' AS text1, 'mike' AS text2),
(SELECT 'mikhail' AS text1, 'michael' AS text2),
(SELECT 'mikhail' AS text1, 'javier' AS text2),
(SELECT 'mikhail' AS text1, 'thomas' AS text2)
) ,
// input columns
text1, text2,
// output schema
"[{name: 'text1', type:'string'},
{name: 'text2', type:'string'},
{name: 'similarity', type:'float'}]
",
// function
"function(r, emit) {
var _extend = function(dst) {
var sources = Array.prototype.slice.call(arguments, 1);
for (var i=0; i<sources.length; ++i) {
var src = sources[i];
for (var p in src) {
if (src.hasOwnProperty(p)) dst[p] = src[p];
}
}
return dst;
};
var Levenshtein = {
/**
* Calculate levenshtein distance of the two strings.
*
* @param str1 String the first string.
* @param str2 String the second string.
* @return Integer the levenshtein distance (0 and above).
*/
get: function(str1, str2) {
// base cases
if (str1 === str2) return 0;
if (str1.length === 0) return str2.length;
if (str2.length === 0) return str1.length;
// two rows
var prevRow = new Array(str2.length + 1),
curCol, nextCol, i, j, tmp;
// initialise previous row
for (i=0; i<prevRow.length; ++i) {
prevRow[i] = i;
}
// calculate current row distance from previous row
for (i=0; i<str1.length; ++i) {
nextCol = i + 1;
for (j=0; j<str2.length; ++j) {
curCol = nextCol;
// substitution
nextCol = prevRow[j] + ( (str1.charAt(i) === str2.charAt(j)) ? 0 : 1 );
// insertion
tmp = curCol + 1;
if (nextCol > tmp) {
nextCol = tmp;
}
// deletion
tmp = prevRow[j + 1] + 1;
if (nextCol > tmp) {
nextCol = tmp;
}
// copy current col value into previous (in preparation for next iteration)
prevRow[j] = curCol;
}
// copy last col value into previous (in preparation for next iteration)
prevRow[j] = nextCol;
}
return nextCol;
}
};
var the_text1, the_text2;
try {
the_text1 = decodeURI(r.text1).toLowerCase();
} catch (ex) {
the_text1 = r.text1.toLowerCase();
}
try {
the_text2 = decodeURI(r.text2).toLowerCase();
} catch (ex) {
the_text2 = r.text2.toLowerCase();
}
emit({text1: the_text1, text2: the_text2,
similarity: 1 - Levenshtein.get(the_text1, the_text2) / the_text1.length});
}"
)
ORDER BY similarity DESC
REGEXP_REPLACE second argument must be const and non-null
Is there any workaround for that?
Below is just an idea/direction to address above question applied to logic you described:
I would "cook" these two strings upfront into 'ga|an|na' and 'ga|an|no' (lists of 2-grams) and do this: REGEXP_REPLACE('ga|an|na', 'ga|an|no', ''). Then based on change in length I can calculate my coeff.
The "workaround" is:
SELECT a.w AS w1, b.w AS w2, SUM(a.x = b.x) / COUNT(1) AS c
FROM (
SELECT w, SPLIT(p, '|') AS x, ROW_NUMBER() OVER(PARTITION BY w) AS pos
FROM
(SELECT 'gana' AS w, 'ga|an|na' AS p)
) AS a
JOIN (
SELECT w, SPLIT(p, '|') AS x, ROW_NUMBER() OVER(PARTITION BY w) AS pos
FROM
(SELECT 'gano' AS w, 'ga|an|no' AS p),
(SELECT 'gamo' AS w, 'ga|am|mo' AS p),
(SELECT 'kana' AS w, 'ka|an|na' AS p)
) AS b
ON a.pos = b.pos
GROUP BY w1, w2
Maybe there is a better way to do it?
Below is a simple example of how Pair Similarity can be approached here (including building the bigram sets and calculating the coefficient):
SELECT
a.word AS word1, b.word AS word2,
2 * SUM(a.bigram = b.bigram) /
(EXACT_COUNT_DISTINCT(a.bigram) + EXACT_COUNT_DISTINCT(b.bigram) ) AS c
FROM (
SELECT word, char + next_char AS bigram
FROM (
SELECT word, char, LEAD(char, 1) OVER(PARTITION BY word ORDER BY pos) AS next_char
FROM (
SELECT word, SPLIT(word, '') AS char, ROW_NUMBER() OVER(PARTITION BY word) AS pos
FROM
(SELECT 'gana' AS word)
)
)
WHERE next_char IS NOT NULL
GROUP BY 1, 2
) a
CROSS JOIN (
SELECT word, char + next_char AS bigram
FROM (
SELECT word, char, LEAD(char, 1) OVER(PARTITION BY word ORDER BY pos) AS next_char
FROM (
SELECT word, SPLIT(word, '') AS char, ROW_NUMBER() OVER(PARTITION BY word) AS pos
FROM
(SELECT 'gano' AS word)
)
)
WHERE next_char IS NOT NULL
GROUP BY 1, 2
) b
GROUP BY 1, 2
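For comparison, the same set-based calculation (twice the number of shared distinct bigrams, divided by the sum of the two distinct-bigram counts) can be sketched in a few lines of Java. This is only an illustration of the formula used by the query above; the names `Dice`, `bigrams` and `coefficient` are made up, and returning 0 for words too short to have a bigram is an assumption of this sketch.

```java
import java.util.HashSet;
import java.util.Set;

final class Dice {
    // Collects the distinct 2-character substrings of a word.
    static Set<String> bigrams(String word) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + 1 < word.length(); i++) {
            result.add(word.substring(i, i + 2));
        }
        return result;
    }

    // Dice coefficient on bigram sets: 2 * |common| / (|A| + |B|).
    // Returns 0 when neither word has a bigram (an assumption for this sketch).
    static double coefficient(String w1, String w2) {
        Set<String> a = bigrams(w1);
        Set<String> b = bigrams(w2);
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return 2.0 * common.size() / (a.size() + b.size());
    }
}
```

For 'gana' and 'gano' the bigram sets are {ga, an, na} and {ga, an, no}, two of which are shared, so the coefficient is 2 * 2 / (3 + 3) = 2/3.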

On Java 7's equals() and deepEquals()

Method description says:
Returns true if the arguments are deeply equal to each other and false
otherwise... Equality is determined by using the equals method
of the first argument.
Which (to me) suggests that objects are deeply equal if every object they hold a reference to is also equal according to its equals() method, and every object those objects reference is equal as well, and so on.
So equality is determined by using the equals method of the first argument.
How is this different from .equals()? Assume that equals is defined appropriately, so that an object is equal to another object if every field of the one is equal to the corresponding field of the other.
Can you please provide an example illustrating the difference between Objects.deepEquals() and Objects.equals()?
String[] firstArray = {"a", "b", "c"};
String[] secondArray = {"a", "b", "c"};
System.out.println("Are they equal 1 ? " + firstArray.equals(secondArray) );
System.out.println("Are they equal 2 ? " + Objects.equals(firstArray, secondArray) );
System.out.println("Are they deepEqual 1? " + Arrays.deepEquals(firstArray, secondArray) );
System.out.println("Are they deepEqual 2? " + Objects.deepEquals(firstArray, secondArray) );
will return
Are they equal 1 ? false
Are they equal 2 ? false
Are they deepEqual 1? true
Are they deepEqual 2? true
How come the "shallow" equals methods return false? This is because in Java, for arrays, equality is determined by object identity. In this example, firstArray and secondArray are distinct objects.
Doing String[] secondArray = firstArray instead will therefore return true for all four tests.
If at least one of the arguments to Objects.deepEquals is not an array, then Objects.deepEquals and Objects.equals behave the same.
Example:
import java.util.Arrays;
import java.util.Objects;
public class Main {
public static void main(String[] args) {
Integer[] x = { 1, 2 };
Integer[] y = { 1, 2 };
System.out.println(Objects.equals(x, y)); // false
System.out.println(Objects.deepEquals(x, y)); // true
System.out.println(Arrays.equals(x, y)); // true
System.out.println(Arrays.deepEquals(x, y)); // true
System.out.println();
int[][] a = { { 1, 2 }, { 3, 4 } };
int[][] b = { { 1, 2 }, { 3, 4 } };
System.out.println(Objects.equals(a, b)); // false
System.out.println(Objects.deepEquals(a, b)); // true
System.out.println(Arrays.equals(a, b)); // false
System.out.println(Arrays.deepEquals(a, b)); // true
}
}
Documentation and decompiled code:
Objects#equals(Object a, Object b): Returns true if the arguments are equal to each other and false otherwise. Consequently, if both arguments are null, true is returned and if exactly one argument is null, false is returned. Otherwise, equality is determined by using the equals method of the first argument.
public static boolean equals(Object a, Object b) {
return (a == b) || (a != null && a.equals(b));
}
Objects#deepEquals(Object a, Object b): Returns true if the arguments are deeply equal to each other and false otherwise. Two null values are deeply equal. If both arguments are arrays, the algorithm in Arrays.deepEquals is used to determine equality. Otherwise, equality is determined by using the equals method of the first argument.
public static boolean deepEquals(Object a, Object b) {
if (a == b)
return true;
else if (a == null || b == null)
return false;
else
return Arrays.deepEquals0(a, b);
}
Arrays#equals(Object[] a, Object[] a2): Returns true if the two specified arrays of Objects are equal to one another. The two arrays are considered equal if both arrays contain the same number of elements, and all corresponding pairs of elements in the two arrays are equal.
public static boolean equals(Object[] a, Object[] a2) {
if (a==a2)
return true;
if (a==null || a2==null)
return false;
int length = a.length;
if (a2.length != length)
return false;
for (int i=0; i<length; i++) {
if (!Objects.equals(a[i], a2[i]))
return false;
}
return true;
}
Arrays#deepEquals(Object[] a1, Object[] a2): Returns true if the two specified arrays are deeply equal to one another. Unlike the equals(Object[],Object[]) method, this method is appropriate for use with nested arrays of arbitrary depth.
public static boolean deepEquals(Object[] a1, Object[] a2) {
if (a1 == a2)
return true;
if (a1 == null || a2==null)
return false;
int length = a1.length;
if (a2.length != length)
return false;
for (int i = 0; i < length; i++) {
Object e1 = a1[i];
Object e2 = a2[i];
if (e1 == e2)
continue;
if (e1 == null)
return false;
// Figure out whether the two elements are equal
boolean eq = deepEquals0(e1, e2);
if (!eq)
return false;
}
return true;
}
static boolean deepEquals0(Object e1, Object e2) {
assert e1 != null;
boolean eq;
if (e1 instanceof Object[] && e2 instanceof Object[])
eq = deepEquals ((Object[]) e1, (Object[]) e2);
else if (e1 instanceof byte[] && e2 instanceof byte[])
eq = equals((byte[]) e1, (byte[]) e2);
else if (e1 instanceof short[] && e2 instanceof short[])
eq = equals((short[]) e1, (short[]) e2);
else if (e1 instanceof int[] && e2 instanceof int[])
eq = equals((int[]) e1, (int[]) e2);
else if (e1 instanceof long[] && e2 instanceof long[])
eq = equals((long[]) e1, (long[]) e2);
else if (e1 instanceof char[] && e2 instanceof char[])
eq = equals((char[]) e1, (char[]) e2);
else if (e1 instanceof float[] && e2 instanceof float[])
eq = equals((float[]) e1, (float[]) e2);
else if (e1 instanceof double[] && e2 instanceof double[])
eq = equals((double[]) e1, (double[]) e2);
else if (e1 instanceof boolean[] && e2 instanceof boolean[])
eq = equals((boolean[]) e1, (boolean[]) e2);
else
eq = e1.equals(e2);
return eq;
}
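The final else branch of deepEquals0 above has a consequence that is easy to miss: the array-aware comparison applies only when both arguments are arrays. If the arrays are wrapped in another object such as a List, Objects.deepEquals falls back to that object's own equals, which compares the inner arrays by identity. A small sketch (the class name `DeepEqualsDemo` and its method names are made up for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

final class DeepEqualsDemo {
    // Two lists, each holding a distinct int[] with the same contents.
    static boolean listsOfArrays() {
        List<int[]> x = Arrays.asList(new int[] { 1, 2 });
        List<int[]> y = Arrays.asList(new int[] { 1, 2 });
        // Not arrays, so this is x.equals(y); the int[] elements are compared by identity.
        return Objects.deepEquals(x, y);
    }

    // Two arrays, each holding a distinct int[] with the same contents.
    static boolean arraysOfArrays() {
        int[][] x = { { 1, 2 } };
        int[][] y = { { 1, 2 } };
        // Both arguments are arrays, so the Arrays.deepEquals algorithm is used.
        return Objects.deepEquals(x, y);
    }
}
```

Here listsOfArrays() returns false and arraysOfArrays() returns true, even though the nested contents are the same in both cases.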
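Arrays.deepEquals() is meant for nested arrays of arbitrary depth.
Arrays.equals() compares only one level, so it is suited to one-dimensional arrays whose elements are primitives or objects with a meaningful equals().
For example: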
import java.util.Arrays;

public class TwoDArray {
    public static void main(String[] args) {
        int[][] a = new int[2][2];
        int[][] b = new int[2][2];
        // Fill both arrays with identical contents.
        for (int i = 0; i < 2; i++) {
            for (int j = 0; j < 2; j++) {
                a[i][j] = i + j;
                b[i][j] = i + j;
            }
        }
        System.out.println(Arrays.deepEquals(a, b)); // prints true
        System.out.println(Arrays.equals(a, b));     // prints false: rows are compared by identity
    }
}
Attaching a very good example I found on javarevisited.blogspot.in:
import java.util.Arrays;
public class ArrayCompareTest {
public static void main(String args[]) {
//comparing primitive int arrays in Java
int[] i1 = new int[] {1,2,3,4};
int[] i2 = new int[] {1,2,3,4};
int[] i3 = new int[] {0,2,3,4};
//Arrays.equals() compare Array and return true if both array are equal
//i..e either both of them are null or they are identical in length, and each pair
//match each other e.g. i[0]=i2[0], i[1]=i2[1] and so on
//i1 and i2 should be equal as both contains same elements
boolean result = Arrays.equals(i1, i2);
System.out.println("Comparing int array i1: " + Arrays.toString(i1)
+ " and i1: " + Arrays.toString(i2));
System.out.println("Does array i1 and i2 are equal : " + result);
//array ii2 and i3 are not equals as only length is same, first pair is not same
result = Arrays.equals(i2, i3);
System.out.println("Comparing int array i2: " + Arrays.toString(i2)
+ " and i3: " + Arrays.toString(i3));
System.out.println("Does array i2 and i3 are equal : " + result);
//comparing floating point or double arrays in Java
double[] d1 = new double[] {1.5, 2.4, 3.2, 4,1};
double[] d2 = new double[] {1.5, 2.4, 3.2, 4,1};
double[] d3 = new double[] {0.0, 2.4, 3.2, 4,1};
//Comparing two floating-point arrays using Arrays.equals() in Java
//double array d1 and d2 should be equal - length same, each index matches
result = Arrays.equals(d1, d2);
System.out.println("Comparing double array d1: " + Arrays.toString(d1)
+ " and d2: " + Arrays.toString(d2));
System.out.println("Does double array d1 and d2 are equal : " + result);
//double array d2 and d3 is not equal - length same, first pair does not match
result = Arrays.equals(d2, d3);
System.out.println("Comparing double array d2: " + Arrays.toString(d2)
+ " and d3: " + Arrays.toString(d3));
System.out.println("Does double array d2 and d3 are same : " + result);
//comparing Object array, here we will use String array
String[] s1 = new String[]{"One", "Two", "Three"};
String[] s2 = new String[]{"One", "Two", "Three"};
String[] s3 = new String[]{"zero", "Two", "Three"};
//String array s1 and s2 is equal - length same, each pair matches
result = Arrays.equals(s1, s2);
System.out.println("Comparing two String array s1: " + Arrays.toString(s1)
+ " and s2: " + Arrays.toString(s2));
System.out.println("Are both String array s1 and s2 are equal : " + result);
//String array s2 and s3 is not equal - length same, first pair different
result = Arrays.equals(s2, s3);
System.out.println("Comparing two String array s2: " + Arrays.toString(s2)
+ " and s3: " + Arrays.toString(s3));
System.out.println("Are both String array s2 and s3 are equal : " + result);
//Comparing nested arrays with equals and deepEquals method
//Arrays.equals() method does not compare recursively,
//while deepEquals() compare recursively
//if any element inside Array is type of Array itself,
//as here second element is String array
Object[] o1 = new Object[]{"one", new String[]{"two"}};
Object[] o2 = new Object[]{"one", new String[]{"two"}};
System.out.println("Object array o1: " + Arrays.toString(o1) + " and o2: "
+ Arrays.toString(o2));
System.out.println("Comparing Object Array o1 and o2 with Arrays.equals : "
+ Arrays.equals(o1, o2));
System.out.println("Comparing Object Array o1 and o2 with Arrays.deepEquals : "
+ Arrays.deepEquals(o1, o2));
}
}
Output:
Comparing int array i1: [1, 2, 3, 4] and i1: [1, 2, 3, 4]
Does array i1 and i2 are equal : true
Comparing int array i2: [1, 2, 3, 4] and i3: [0, 2, 3, 4]
Does array i2 and i3 are equal : false
Comparing double array d1: [1.5, 2.4, 3.2, 4.0, 1.0] and d2: [1.5, 2.4, 3.2, 4.0, 1.0]
Does double array d1 and d2 are equal : true
Comparing double array d2: [1.5, 2.4, 3.2, 4.0, 1.0] and d3: [0.0, 2.4, 3.2, 4.0, 1.0]
Does double array d2 and d3 are same : false
Comparing two String array s1: [One, Two, Three] and s2: [One, Two, Three]
Are both String array s1 and s2 are equal : true
Comparing two String array s2: [One, Two, Three] and s3: [zero, Two, Three]
Are both String array s2 and s3 are equal : false
Object array o1: [one, [Ljava.lang.String;@19821f] and o2: [one, [Ljava.lang.String;@addbf1]
Comparing Object Array o1 and o2 with Arrays.equals : false
Comparing Object Array o1 and o2 with Arrays.deepEquals : true

What is the cleanest way to get the sum of numbers in a collection/list in Dart?

I don't like using an indexed array for no reason other than I think it looks ugly. Is there a clean way to sum with an anonymous function? Is it possible to do it without using any outside variables?
Dart iterables now have a reduce function (https://code.google.com/p/dart/issues/detail?id=1649), so you can do a sum pithily without defining your own fold function:
var sum = [1, 2, 3].reduce((a, b) => a + b);
int sum = [1, 2, 3].fold(0, (previous, current) => previous + current);
or with shorter variable names to make it take up less room:
int sum = [1, 2, 3].fold(0, (p, c) => p + c);
This is a very old question, but in 2022 there is a ready-made package for this: collection.
Add it as a dependency, then import
import 'package:collection/collection.dart';
and call the .sum extension method on the Iterable.
FULL EXAMPLE
import 'package:collection/collection.dart';
void main() {
final list = [1, 2, 3, 4];
final sum = list.sum;
print(sum); // prints 10
}
If the list is empty, .sum returns 0.
You might also be interested in list.average...
I still think this is cleaner and easier to understand for this particular problem.
num sum = 0;
[1, 2, 3].forEach((num e){sum += e;});
print(sum);
or
num sum = 0;
for (num e in [1,2,3]) {
sum += e;
}
There is not a clean way to do it using the core libraries as they are now, but if you roll your own foldLeft then there is
main() {
var sum = foldLeft([1,2,3], 0, (val, entry) => val + entry);
print(sum);
}
Dynamic foldLeft(Collection collection, Dynamic val, func) {
collection.forEach((entry) => val = func(val, entry));
return val;
}
I talked to the Dart team about adding foldLeft to the core collections and I hope it will be there soon.
Starting with Dart 2.6 you can use extensions to define a utility method on the List. This works for numbers (example 1) but also for generic objects (example 2).
extension ListUtils<T> on List<T> {
num sumBy(num f(T element)) {
num sum = 0;
for(var item in this) {
sum += f(item);
}
return sum;
}
}
Example 1 (sum all the numbers in the list):
var numbers = [1, 2, 3];
var sum = numbers.sumBy((number) => number);
Example 2 (sum all the Point.x fields):
var points = [Point(1, 2), Point(3, 4)];
var sum = points.sumBy((point) => point.x);
I'd just like to add a small detail to @tmaihoff's answer (about using the collection.dart package):
The sum getter he talks about only works for iterables of num values, like List<int> or Set<double>.
If you have a list of other object types that represent values (like Money, Decimal, Rational, or any others) you must map it to numbers. For example, to count the number of chars in a list of strings you can do:
// Returns 15.
['a', 'ab', 'abc', 'abcd', 'abcde'].map((e) => e.length).sum;
As of 2022, another way of doing it is to use the sumBy() method of the fast_immutable_collections package:
// Returns 15.
['a', 'ab', 'abc', 'abcd', 'abcde'].sumBy((e) => e.length);
Note: I'm the package author.
I suggest you to create this function in any common utility file.
T sum<T extends num>(T lhs, T rhs) => lhs + rhs;
int and double both extend num, so you can use this function to sum any numbers.
e.g.,
List<int> a = [1,2,3];
int result = a.reduce(sum);
print(result); // result will be 6
Herewith sharing my Approach:
void main() {
int value = sumTwo([1, 4, 3, 43]);
print(value);
}
int sumTwo(List<int> numbers) {
int sum = 0;
for (var i in numbers) {
sum = sum + i;
}
return sum;
}
If fold gives you a type error with doubles, you can use reduce instead:
var sum = [0.0, 4.5, 6.9].reduce((a, b) => a + b);
If you are planning on doing a number of mathematical operations on your list, it may be helpful to create another list type that includes .sum() and other operations by extending ListBase. Parts of this are inspired by this response with performance tweaks from this response.
import 'dart:collection';
import 'dart:core';
class Vector extends ListBase<num> {
  final List<num> _list;
  Vector() : _list = <num>[];
  Vector.fromList(List<num> lst) : _list = lst;
  @override
  set length(int l) {
    _list.length = l;
  }
  @override
  int get length => _list.length;
  @override
  num operator [](int index) => _list[index];
  @override
  void operator []=(int index, num value) {
    _list[index] = value;
  }
  // Though not strictly necessary, for performance reasons
  // you should implement add and addAll.
  @override
  void add(num value) => _list.add(value);
  @override
  void addAll(Iterable<num> all) => _list.addAll(all);
  // Sums all elements; the explicit <num> lets int and double values mix.
  num sum() => _list.fold<num>(0, (a, b) => a + b);
  /// Add additional vector functions here, like min, max, mean, normalize, etc.
}
And use it like so:
Vector vec1 = Vector();
vec1.add(1);
print(vec1); // => [1]
vec1.addAll([2,3,4,5]);
print(vec1); // => [1,2,3,4,5]
print(vec1.sum().toString()); // => 15
Vector vec = Vector.fromList([1.0,2.0,3.0,4.0,5.0]); // works for double too.
print(vec.sum().toString()); // => 15
A solution that has worked cleanly for me is:
var total = [1, 2, 3, 4].fold<int>(0, (e, t) => e + t); // result 10
Different ways to find the sum of all dart list elements,
Method 1: Using a loop :
This is the most commonly used method. Iterate through the list using a loop and add all elements of the list to a final sum variable. We are using one for loop here :
main(List<String> args) {
var sum = 0;
var given_list = [1, 2, 3, 4, 5];
for (var i = 0; i < given_list.length; i++) {
sum += given_list[i];
}
print("Sum : ${sum}");
}
Method 2: Using forEach :
forEach is another way to iterate through a list. We can also use this method to find out the total sum of all values in a dart list. It is similar to the above method. The only difference is that we don’t have to initialize another variable i and list.length is not required.
main(List<String> args) {
var sum = 0;
var given_list = [1, 2, 3, 4, 5];
given_list.forEach((e) => sum += e);
print("Sum : ${sum}");
}
Method 3: Using reduce :
reduce method combines all elements of a list iteratively to one single value using a given function. We can use this method to find out the sum of all elements as like below :
main(List<String> args) {
var given_list = [1, 2, 3, 4, 5];
var sum = given_list.reduce((value, element) => value + element);
print("Sum : ${sum}");
}
Method 4: Using fold :
fold() is similar to reduce. It combines all elements of a list iteratively to one single value using a function. It takes one initial value and calculates the final value based on the previous value.
main(List<String> args) {
var sum = 0;
var given_list = [1,2,3,4,5];
sum = given_list.fold(0, (previous, current) => previous + current);
print("Sum : ${sum}");
}
For more details: https://www.codevscolor.com/dart-find-sum-list-elements
extension DoubleArithmeticExtensions on Iterable<double> {
double get sum => length == 0 ? 0 : reduce((a, b) => a + b);
}
extension IntArithmeticExtensions on Iterable<int> {
int get sum => length == 0 ? 0 : reduce((a, b) => a + b);
}
Usage:
final actual = lineChart.data.lineBarsData[0].spots.map((s) => s.x).sum;

How to multiply several fields in a tuple by a given field of the tuple

For each row of data, I would like to multiply fields 1 through N by field 0. The data could have hundreds of fields per row (or a variable number of fields for that matter), so writing out each pair is not feasible. Is there a way to specify a range of fields, sort of like the following (incorrect) snippet?
A = LOAD 'foo.csv' USING PigStorage(',');
B = FOREACH A GENERATE $0*($1,..);
A UDF could come in handy here.
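Implement exec(Tuple input) and iterate over all fields of the tuple as follows (not tested). Note that this sketch returns the product of all fields as a single value; to get one output per field, build and return a Tuple instead:
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class MultiplyField extends EvalFunc<Long> {
    @Override
    public Long exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        try {
            long retVal = 1L;
            // Multiply every field of the tuple into a running product.
            for (int i = 0; i < input.size(); i++) {
                Long j = (Long) input.get(i);
                retVal *= j;
            }
            return retVal;
        } catch (Exception e) {
            throw new IOException("Caught exception processing input row", e);
        }
    }
}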
Then register your UDF and call it from your FOREACH.
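The per-row arithmetic the question asks for (fields 1 through N, each multiplied by field 0) can be sketched in plain Java, independent of Pig. The class `RowScaler` and the method `scaleByFirst` are hypothetical names; inside a real UDF the same loop would read from the input Tuple and return a new Tuple rather than a List.

```java
import java.util.ArrayList;
import java.util.List;

final class RowScaler {
    // Returns fields 1..N of the row, each multiplied by field 0.
    static List<Long> scaleByFirst(List<Long> row) {
        List<Long> out = new ArrayList<>();
        if (row == null || row.isEmpty()) {
            return out;
        }
        long factor = row.get(0);
        for (int i = 1; i < row.size(); i++) {
            out.add(factor * row.get(i));
        }
        return out;
    }
}
```

For a row of 2, 3, 4 this yields 6, 8; because the loop runs over row.size(), it works for any number of fields.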