#2652 List.groupBy()

SlimerDude Tue 10 Oct

Following on from this topic I think the following (parametrised) utility method would be a really useful addition to List:

** Groups items according to the return value of the given function.
** Items that return the same object are grouped together.
** All groups are placed in a Map, keyed off the object they were grouped by.
**
** Example:
**   animals := ["cat", "cow", "donkey", "dog", "snake"]
**   animals.groupBy |Str item -> Str| { item.chars.first.toChar }
**     // --> [c:[cat, cow], d:[donkey, dog], s:[snake]]
**
** Note the returned map is ordered, allowing you to split a List into
** arbitrary sections by using a function that operates on the index.
**
** Example:
**   [1,2,3,4,5].groupBy |v, i| { i / 3}.vals  // --> [[1,2,3], [4,5]]
**   [1,2,3,4,5].groupBy |v, i| { i % 2}.vals  // --> [[1,3,5], [2,4]]
**
[Obj:V[]] groupBy(|V item, Int index -> Obj| c)

I've found the exact same method here:

And methods that replicate taking the vals (in the 2nd split example) are found here:

I find a lot of use for the function, often to categorise data (as per the 1st animal example) and to split data into display columns (as per the 2nd split example).

As for Fantom, it has potential to be used in:

  • sys::Uri.query()
  • web::WebUtil.parseHeaders()
  • web::WebUtil.parseQVals()

And probably many other places once you know it exists.

Because groupBy() returns a parametrised Map, the compiler pod first needs to be updated to handle these types:

diff -r a60c8b8decfb src/compiler/fan/namespace/GenericTypes.fan
--- a/src/compiler/fan/namespace/GenericTypes.fan	Fri Oct 06 09:21:02 2017 -0400
+++ b/src/compiler/fan/namespace/GenericTypes.fan	Tue Oct 10 17:27:55 2017 +0100
@@ -86,10 +86,17 @@
     {
       t = parameterizeFuncType(nn)
     }
+    else if (nn is MapType)
+    {
+      t = parameterizeMapType(nn)
+    }
+    else if (nn.name.size == 1)
+    {
+      t = doParameterize(nn.name[0])
+    }
     else
     {
-      if (nn.name.size != 1) throw Err("Cannot parameterize $t.qname")
-      t = doParameterize(nn.name[0])
+      throw Err("Cannot parameterize $t.qname - $nn.typeof $nn")
     }
     return nullable ? t.toNullable : t
   }
@@ -106,6 +113,13 @@
     return FuncType(params, t.names, ret)
   }
 
+  private CType parameterizeMapType(MapType t)
+  {
+    k := parameterize(t.k)
+    v := parameterize(t.v)
+    return MapType(k, v)
+  }
+
   abstract CType doParameterize(Int ch)
 
 //////////////////////////////////////////////////////////////////////////

And here's a patch for List.groupBy() complete with Java and Javascript implementations, and a couple of unit tests.

diff -r a60c8b8decfb src/sys/fan/List.fan
--- a/src/sys/fan/List.fan	Fri Oct 06 09:21:02 2017 -0400
+++ b/src/sys/fan/List.fan	Tue Oct 10 17:34:51 2017 +0100
@@ -544,6 +544,25 @@
   **
   L intersection(L that)
 
+  **
+  ** Groups items according to the return value of the given function.
+  ** Items that return the same object are grouped together.
+  ** All groups are placed in a Map, keyed off the object they were grouped by.
+  **
+  ** Example:
+  **   animals := ["cat", "cow", "donkey", "dog", "snake"]
+  **   animals.groupBy |item -> Str| { item.chars.first.toChar }
+  **     // --> [c:[cat, cow], d:[donkey, dog], s:[snake]]
+  **
+  ** Note the returned map is ordered, allowing you to split a List into
+  ** arbitrary sections by using a function that operates on the index.
+  **
+  ** Example:
+  **   [1,2,3,4,5].groupBy |v, i| { i / 3}.vals  // --> [[1,2,3], [4,5]]
+  **   [1,2,3,4,5].groupBy |v, i| { i % 2}.vals  // --> [[1,3,5], [2,4]]
+  **
+  [Obj:V[]] groupBy(|V item, Int index -> Obj| c)
+
 //////////////////////////////////////////////////////////////////////////
 // Utils
 //////////////////////////////////////////////////////////////////////////
diff -r a60c8b8decfb src/sys/java/fan/sys/List.java
--- a/src/sys/java/fan/sys/List.java	Fri Oct 06 09:21:02 2017 -0400
+++ b/src/sys/java/fan/sys/List.java	Tue Oct 10 17:34:51 2017 +0100
@@ -14,6 +14,7 @@
 import java.util.Collection;
 import java.util.Comparator;
 import java.util.HashMap;
+import java.util.LinkedHashMap;
 import fanx.serial.*;
 import fanx.util.*;
 
@@ -917,6 +918,35 @@
     return acc;
   }
 
+  public final Map groupBy(Func f)
+  {
+    Type r = f.returns();
+    if (r == Sys.VoidType) throw ArgErr.make("func must return a value");
+
+    HashMap groups = new LinkedHashMap(size);
+    Object  key;
+    Object  val;
+    List    list;
+    for (int i=0; i<size; ++i)
+    {
+      val = values[i];
+
+      if (f.arity() == 1)
+        key = f.call(val);
+      else
+        key = f.call(val, Long.valueOf(i));
+
+      if (!groups.containsKey(key))
+        groups.put(key, new List(of, 16));
+
+      list = (List) groups.get(key);
+      list.add(val);
+    }
+
+    MapType mapType = new MapType(r, typeof());
+    return new Map(mapType, groups);
+  }
+
 //////////////////////////////////////////////////////////////////////////
 // Utils
 //////////////////////////////////////////////////////////////////////////
diff -r a60c8b8decfb src/sys/js/fan/List.js
--- a/src/sys/js/fan/List.js	Fri Oct 06 09:21:02 2017 -0400
+++ b/src/sys/js/fan/List.js	Tue Oct 10 17:34:51 2017 +0100
@@ -756,6 +756,32 @@
   return acc;
 }
 
+fan.sys.List.prototype.groupBy = function(f)
+{
+  var r = f.returns();
+  if (r == fan.sys.Void.$type) throw fan.sys.ArgErr.make("func must return a value");
+
+  var groups = new fan.sys.Map.make(r, this.$typeof());
+  groups.ordered$(true);
+  var  key, val, list;
+  for (var i=0; i<this.m_size; ++i)
+  {
+    val = this.m_values[i];
+
+    if (f.m_params.size() == 1)
+      key = f.call(val);
+    else
+      key = f.call(val, i);
+
+    if (!groups.containsKey(key))
+      groups.set(key, fan.sys.List.make(this.m_of));
+
+    list = groups.get(key);
+    list.add(val);
+  }
+  return groups;
+}
+
 //////////////////////////////////////////////////////////////////////////
 // Utils
 //////////////////////////////////////////////////////////////////////////
diff -r a60c8b8decfb src/testSys/fan/ListTest.fan
--- a/src/testSys/fan/ListTest.fan	Fri Oct 06 09:21:02 2017 -0400
+++ b/src/testSys/fan/ListTest.fan	Tue Oct 10 17:34:51 2017 +0100
@@ -1142,6 +1142,35 @@
   }
 
 //////////////////////////////////////////////////////////////////////////
+// GroupBy
+//////////////////////////////////////////////////////////////////////////
+
+  Void testGroupBy()
+  {
+    verifyErrMsg(ArgErr#, "func must return a value")
+    {
+      [,]->groupBy |Obj o| { }
+    }
+
+    animals := ["cat", "cow", "donkey", "dog", "snake"]
+    groups  := animals.groupBy |Str item -> Int| { item.chars.first }
+
+    verifyEq(groups.typeof, Int:Str[]#)
+    verifyEq(groups.ordered, true)
+    verifyEq(groups['a'], null)
+    verifyEq(groups['c'], ["cat", "cow"])
+    verifyEq(groups['d'], ["donkey", "dog"])
+    verifyEq(groups['s'], ["snake"])
+    verifyEq(groups.size, 3)
+
+    split := [1,2,3,4,5].groupBy(|Int v, Int i->Int| { i / 3}).vals
+    verifyEq(split, [[1,2,3], [4,5]])
+
+    split = [1,2,3,4,5].groupBy(|Int v, Int i->Int| { i % 2}).vals
+    verifyEq(split, [[1,3,5], [2,4]])
+  }
+
+//////////////////////////////////////////////////////////////////////////
 // Sort
 //////////////////////////////////////////////////////////////////////////

Login or Signup to reply.