#486 It-Block Proposal

brian Sat 21 Mar 2009

Background

This proposal is effectively a continuation of several different proposals and ideas which started with the core problem of how the construction process should work. Today construction is a two phase process:

  1. the actual constructor itself
  2. with-block attached to the constructor

We use this technique to avoid boilerplate constructors which require passing in every field which might need to be set, including const fields. For example:

b := Button
{
   mode = ButtonMode.push
   text = "Press Me"
}

In this case mode is a const field, but we allow you to set it in the with-block after the actual constructor is called. The obvious problem with this approach is that it doesn't provide the ability for the class to ensure that the object is correctly constructed - it has no hook after the with-block has run.

Surprisingly this problem has been extremely tricky to solve elegantly. But during the journey lots of issues have come up, including good ideas about how potential solutions might address builders, immutable setters, and potentially more expressive closures. I think most of the best ideas have revolved around how to package up with-blocks into a first class entity which can be passed into the constructor. This drastically simplifies the construction process because it puts the constructor back in charge of the entire process.

This proposal builds on the basic concept of turning a with-block into a first class chunk of code which can be passed into the constructor. I came back to the fundamental issue that there should be only one language feature for "packaging up a chunk of code" to pass around as a first class entity - and that should be closures.

This proposal is basically a set of changes to turn with-blocks into it-blocks which are a special form of closures.

Summary

First let me summarize the language changes:

  • it-blocks are closures with an implicit single argument called it (like Groovy)
  • it is a new keyword with similar implicit scoping semantics to this
  • the it parameter uses type inference based on the target expression of the closure
  • ambiguity between the scopes of it, this, or local variables is a compile time error (you must explicitly use this or it)
  • const fields on the it may be set in an it-block, but synthetic checks are generated to raise a runtime exception if the closure is executed outside the context of its constructor (basically puts the flag overhead into the closure, not the object)
  • declaration of an it-block where a function type is expected is lazy (the it-block is passed as a function to the target method)
  • declaration of an it-block where a function type is not expected is imperative (the it-block is immediately executed just like with-blocks are today)
  • it-blocks may not use the return keyword - so they can only return using implicit return when body is single expression (to avoid non-local return confusions)

Basic Examples

Before we dive into constructors, let's look at it-blocks in normal closure cases. Let's look at a simple example of a closure today:

list := ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]
list.each |Str s| { echo(s.size) }

We can omit the parameter using the implied it parameter:

list.each {  echo(it.size) }
list.each |Str it| { echo(it.size) } // long hand for above

Here it was inferred to be a Str because the it-block was a closure argument to Str[].each, which takes a function parameter typed as |Str,Int|. Since the second parameter is optional, the it-block had an inferred type of |Str|.
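A small sketch of that inference rule, assuming the it-block syntax proposed in this post. The Main wrapper and the sample list are only there to make it a complete script; the three calls show the same each method receiving a two-parameter closure, a one-parameter closure, and an it-block:

```fantom
class Main
{
  static Void main()
  {
    list := ["alpha", "bravo", "charlie"]

    // both parameters of |Str,Int| declared: item and index
    list.each |Str s, Int i| { echo("$i: $s") }

    // index omitted, so the closure is typed |Str|
    list.each |Str s| { echo(s.size) }

    // proposed it-block: same |Str| type, with the parameter implied as 'it'
    list.each { echo(it.size) }
  }
}
```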

Furthermore we treat it as an implied scope just like we typically imply the this parameter:

list.each {  echo(size) }

In the example above the it-block was mapped into a closure because that was what the target expression expected. However, we can apply an it-block to any expression and it is implied to be the target of the expression itself:

foo := "some string"
foo { echo(it.size) } // implied 'it' parameter to it-block
foo { echo(size) }    // implied 'it' scope

These cases differ from the one above because they are executed immediately, thus they have the exact same behavior as with-blocks today. What this means is that the notion of imperative code versus lazy code is controlled not at the call site, but by the target expression. This is not so different from LINQ where the target expression determines whether a lambda is compiled into an AST or a function.

Constructor Examples

Let's take a simple example with an implicit make:

class FooA
{
  Str bar  := ""  // mutable field
}

In both of these cases we are declaring an it-block with an implicit target of FooA:

f := FooA { bar = "bar" }
f { bar = "xyz" }

Because the implicit target doesn't take a closure, these it-blocks are imperative and immediately execute (just like with-blocks today). In the first case the it-block executes after the constructor (just like today). In the second case it executes against the base expression which is the local variable f (just like today).

But let's say we want to change bar to a const field:

class FooB
{
  const Str bar := ""  // immutable field
}

In the class above, no external class can set the bar field. This is because FooB doesn't declare a constructor with a closure argument. If FooB does wish to enable external classes to configure bar "with-block style", then it would have to declare a constructor like this:

class FooB
{
  new make(|This| f)
  {
    f(this)
    if (bar.isEmpty) throw ArgErr("bar cannot be empty")
  }

  const Str bar
}

Note that we allow the special This type to be used here as a convenience versus writing out the class name. Now we can write code like this:

FooB { bar = "bar" }  // it-block syntax sugar
FooB.make |FooB it| { it.bar = "bar" }  // long-hand

What this feature does is give control of the it-block back to the constructor. The constructor takes the it-block, decides when to process the it-block (as a function call), and then has the opportunity to perform error checking. So the constructor is now 100% in control of the construction process and can do error handling after it has applied the it-block.
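As a rough sketch of what that control buys, assuming the proposed semantics: the FooB class is the one above, and the Main wrapper and the empty-string case are added here for illustration. The second construction is expected to be rejected by the check inside make, because the it-block has already run by the time the check executes:

```fantom
class Main
{
  static Void main()
  {
    ok := FooB { bar = "bar" }   // it-block runs inside make, check passes
    echo(ok.bar)

    try
    {
      bad := FooB { bar = "" }   // it-block runs, then make's own check throws
      echo(bad.bar)              // not expected to be reached
    }
    catch (ArgErr e)
    {
      echo("rejected: $e")
    }
  }
}

class FooB
{
  new make(|This| f)
  {
    f(this)
    if (bar.isEmpty) throw ArgErr("bar cannot be empty")
  }

  const Str bar
}
```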

Builders

Last month there was a great discussion about adding first class builders to the language. The it-block feature actually allows you to elegantly define builder styled APIs:

class Foo
{
  static Foo make(|FooBuilder| f) { ... }
}

class FooBuilder
{
  Str bar
}

// the it parameter is typed as FooBuilder by the target Foo.make
Foo { bar = "bar" }

So the language doesn't provide any direct support for defining the builder class itself (which always seemed a bit icky to me). But the it-block syntax gives you the expressive power to use builders in a natural way. It leaves the door open for creative builder designs or even definition by DSLs.

Immutable Setters

We can apply the same builder technique to immutable setters:

const class Immutable
{
  new make(|This| b) : this.makeModify(null, b) {}

  private new makeModify(Immutable? orig, |This| b)
  {
    if (orig != null) this.copyFrom(orig)  // copy original
    b(this) // apply new modifications
    // error checking
  }

  This modify(|This| b) { makeModify(this, b) }

  const Str a := ""
  const Str b := ""
  const Str c := ""
}

m := Immutable { a = "foo" }
m = m.modify { b = "bar"; c = "baz" }

In this case we are allowing the it-block implicitly typed by modify to set const fields, the compiler allows this because it generates synthetic runtime checks. This lets us pass the field sets into a private constructor which copies the original, applies the new field sets, and performs error checking all within a single controlled construction process.
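To make the "synthetic runtime checks" concrete, here is a sketch under the rules proposed above. Label and applyLate are hypothetical names invented for this example, and the exact exception type is not specified in the proposal, so the catch is deliberately broad:

```fantom
class Main
{
  // hypothetical helper: receives the it-block as an ordinary function
  // and applies it to an object that has already been constructed
  static Void applyLate(Label x, |Label| f) { f(x) }

  static Void main()
  {
    a := Label { text = "first" }        // runs inside Label.make: allowed

    try
    {
      applyLate(a) { text = "second" }   // compiles, but runs outside any constructor
    }
    catch (Err e)
    {
      echo("rejected at run time: $e")
    }
  }
}

const class Label
{
  new make(|This| f) { f(this) }
  const Str text := ""
}
```

The point of the sketch is that both it-blocks look the same to the compiler; only the generated check at run time can tell that the second one is executing outside the constructor.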

Misc Notes

Null Fields: With this change we can turn on the enforcement of null checks after the constructor completes. If a class wishes to defer setting non-nullable fields to an it-block, then it must declare its constructors to take an it-block. The compiler can then generate the field null checks in the constructor itself (as the last thing that happens).

Readonly: per previous discussion, the readonly keyword goes away. If you wish to allow setting of const fields, then the class declares constructors with an it-block parameter. If you wish to prevent a const field from being set in it-blocks, then you use the Type name { private set } syntax as discussed.

Java Memory Model: as we have discussed, it may be possible with this change to actually make const fields final - although there are still a ton of issues getting to that point. I believe another solution is to synchronize the instance in each constructor which effectively safely publishes to other threads. I need to do more research.

Wrap Up

This is a long post, but I wanted to try to cover all the critical issues with a bunch of example code. I am actually pretty excited about this proposal, because I feel like it elegantly solves the problem and adds a lot of new expressive power to Fan. But despite working on this problem for 9 months now, I'm sure I've overlooked some dark corners.

So please give it some thought and give me your feedback.

tompalmer Sat 21 Mar 2009

How does Groovy handle nested blocks with a different it in each? Is it always easy to tell where it binds to?

brian Sat 21 Mar 2009

How does Groovy handle nested blocks with a different it in each? Is it always easy to tell where it binds to?

Beyond the use of an implicit it, what we are doing really isn't similar to Groovy. In Groovy binding variables is done dynamically. This article is the best I've seen on how Groovy actually implements their builder/with-block syntax.

We actually want to do our bindings at compile time, and the compiler knows everything which is important:

  1. the full list of slots available on this (if not a static method)
  2. the full list of slots available on it
  3. the full list of local variables in scope

If you attempt to use a local variable, then it hides the this and it scope (just like locals work against this in Fan and Java today). If you attempt to use a variable in both it and this scope, then it is a compile time error - you have to prefix with this or it.

So about this case, which I use a lot today:

x := 5; y := 7
pt := Point { x = x; y = y }

Technically that would now assign the local variables x and y to themselves. So we now need to write that code like:

x := 5; y := 7
pt := Point { it.x = x; it.y = y }

So that is just like a common idiom used in constructors. However the twist I am going to add is that the compiler is going to detect if you are assigning something to itself and report a compile time error. This will catch incorrect code, and is something that has bitten me in Java land too.

But back to your original root issue, what about nested it-blocks? I think in that case, the inner it hides the outer it. There is only one it in scope at a single time.

tompalmer Sat 21 Mar 2009

There is only one it in scope at a single time.

Thanks for clarifying. I think part of my concern was failing to notice that it blocks are always syntactically identified by the entire lack of params, such as for with blocks today.

So do implicit adds work the same as today except that adding locals overrides imported it method calls?

Also, is the "put a with block after any object" idea gone, or can you still do that? If you can, I think I have versioning risks again. Maybe just a withIt method on Obj would be good enough, so you could say things such as:

person.withIt {
  name = "Joe Bob"
}

Not 100% sure on the idea or the method name withIt, though.

Anyway, after digesting things, I think I like your recommendation. It's a lot like current with blocks in some ways, but gets rid of the LHS vs. RHS concept, adds the it reference availability, clears up some ambiguities, makes the feature more generally available, and so on. I would like to have return available in some ways, but I agree that with the current state of things otherwise, that disallowing it is the right decision.

So, summary again, I like this new proposal.

brian Sat 21 Mar 2009

Also, is the "put a with block after any object" idea gone, or can you still do that? If you can, I think I have versioning risks again.

No that idea is still there. The issue then becomes who determines if the it-block is executed immediately or lazily? If the call site determines it, then yes you want a different syntax, most likely I would go with this syntax:

someExpr: { it.foo }

However, in this proposal, the call site does not determine the behavior, the calling method does. I think this is important because it means I can start off with a simple constructor, then later go back and add lazy behavior by adding an it-block parameter. It gives me source level compatibility between the two (just like constructors versus factories do).

So there really aren't any risks, because Fan is designed to version methods by adding new parameters if you give them a default. And if you don't give it a default, then you will get a compile time error.
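A sketch of that versioning story, assuming the proposed binding rules. Widget, its name field, and the nullable defaulted parameter are illustrative choices, not part of the proposal text. The call site in main is meant to compile unchanged against either version of the class:

```fantom
class Main
{
  static Void main()
  {
    // written once; intended to work against version 1 or version 2
    w := Widget { name = "ok" }
    echo(w.name)
  }
}

// version 1 (no function parameter, so the it-block would run
// immediately after make returns):
//
//   class Widget
//   {
//     new make() {}
//     Str name := ""
//   }

// version 2: an it-block parameter is added with a default value,
// so the constructor now decides when the block runs and can validate
class Widget
{
  new make(|This|? f := null)
  {
    if (f != null) f(this)
    if (name.isEmpty) throw ArgErr("name cannot be empty")
  }

  Str name := ""
}
```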

qualidafial Sun 22 Mar 2009

How does Groovy handle nested blocks with a different it in each? Is it always easy to tell where it binds to?

Technically you could explicitly reference each nested "it" by parameterizing each block, right?

Foo |Foo foo| {
  bar = Bar |Bar bar| {
    baz = Baz |Baz baz| {
      foo.a = "a"
      bar.a = "a"
      baz.a = "a"
    }
  }
}

There may be a naming conflict in there (I'm new to Fan so I'm not sure) but you get the idea.

brian Sun 22 Mar 2009

Technically you could explicitly reference each nested "it" by parameterizing each block, right?

Actually, that code would not necessarily have the same semantics. One of the primary characteristics of an it-block is that it is executed immediately if the calling method doesn't expect a function. So in your example if Foo, Bar, and Baz all expected a function in their constructors, then yes that code works just like it-blocks would. Otherwise it would be a compile time error. But what you can do now via it-blocks is assign it to a local variable for use in an inner scope:

Foo 
{
  fooIt := it
  bar = Bar
  {
    barIt := it
    baz = Baz
    {
      fooIt.a = "a"
      barIt.a = "b"
      it.a    = "a"
    }
  }
}

qualidafial Sun 22 Mar 2009

One of the primary characteristics of an it-block is that it is executed immediately if the calling method doesn't expect a function.

Wouldn't it be useful to apply these semantics to all blocks? This is what I assumed you meant when I read your proposal, and it seems this would be both clean and widely useful.

This would also reduce the definition of it-blocks to just:

  • An it parameter of the type of the preceding expression is implied if no parameter list is given before the block
  • it is implied as the target of all unqualified expressions e.g. { echo (size) } has the same meaning as { it.echo (it.size) }.

JohnDG Mon 23 Mar 2009

This looks like a well-thought out and sound proposal to me.

It's unfortunate that modifications to an immutable object will not resemble modifications to a mutable object, as this may lead to a proliferation of code paths for functions that operate on both mutable and immutable structures.

I'm also not sure what I think of the decision to immediately execute it blocks with no means of intercepting -- but perhaps this is less relevant if we can always add another parameter to the method to give the call site some say as to when and if the block gets executed.

The it-block feature actually allows you to elegantly define builder styled APIs:

This will not be that useful until compiler plug-ins are supported, due to the sheer amount of boilerplate. But after plug-ins, we can write:

builder := Builder <[ Point.type ]>

qualidafial Mon 23 Mar 2009

How will this proposal affect existing serialized @collection objects?

qualidafial Mon 23 Mar 2009

new make(|This| f)
{
  f(this)
  if (bar.isEmpty) throw ArgErr("bar cannot be empty")
}

What happens if both the super- and subclass constructors take a |This| argument?

brian Mon 23 Mar 2009

Wouldn't it be useful to apply these semantics to all blocks?

It-blocks are not expressions by themselves, but rather declarative appendages to an expression (which determines their type inference). True closures on the other hand are arbitrary expressions which can be used anywhere in the language where an expression is expected. So I don't think it makes sense to apply the immediate execution to all closures - that could end up really confusing. It-blocks are really for declarative stuff which is why I think the immediate execution makes sense.

It's unfortunate that modifications to an immutable object will not resemble modifications to a mutable object

I am not sure they could ever be 100% the same, because you have to re-assign the result. Although maybe you are thinking of a syntax that implicitly updates the target variable (although I think that would be confusing). But do you agree it-blocks give you a great deal of expressiveness to implement immutable change strategies?

I'm also not sure what I think of the decision to immediately execute it blocks with no means of intercepting

I was a little unsure about this myself, but now that I've been writing up thoughts on the proposal this aspect has really grown on me. I think putting the API in charge of lazy versus immediate makes a lot of sense.

How will this proposal affect existing serialized @collection objects

So far, things work the same. I think it is an orthogonal discussion if we want to add the comma operator. Although I think with my new scope checking rules that we can remove any ambiguity at compile time - so the existing syntax might work as is.

What happens if both the super- and subclass constructors take a |This| argument?

This is an interesting use case we haven't talked about, but see other discussion.

qualidafial Tue 24 Mar 2009

It-blocks are not expressions by themselves, but rather declarative appendages to an expression (which determines their type inference). True closures on the other hand are arbitrary expressions which can be used anywhere in the language where an expression is expected.

Why not just make it-blocks a simple shorthand for closures?

f := { echo (it) } // shorthand
f := |Obj it| { it.echo (it) } // longhand

Another benefit is that if we replace with-blocks with it-blocks (which are closures) we get to lose the |,| syntax.

f := |,| { echo ("hello") }
f := { echo ("hello") }

So I don't think it makes sense to apply the immediate execution to all closures - that could end up really confusing. It-blocks are really for declarative stuff which is why I think the immediate execution makes sense.

I don't think there is a conflict here. The same rules that apply in your proposal should work at all call sites:

  • Target is a function which accepts a Func argument -> pass it-block (or any closure, really) to function, function may call the closure as needed.
  • All other expressions (including functions which do not accept a Func) -> Evaluate the expression, then call the closure with the value of the expression.

As stated before I'm still pretty new to Fan so I could be overlooking something important.

JohnDG Tue 24 Mar 2009

I am not sure they could ever be 100% the same, because you have to re-assign the result.

Yes, indeed, but that's usually not an issue since mutation methods typically return Void, which returns this. So you can write code that works with mutable or immutable objects. e.g.:

pt := pt { x = 1 }

I believe this would still be possible even with the it proposal so long as a with method were introduced (:).

But do you agree it-blocks give you a great deal of expressiveness to implement immutable change strategies?

For sure.

jodastephen Tue 24 Mar 2009

return

The it concept, particularly as applied to existing closures looks very nice:

list.each { echo(it) }

However, we have ended up back in a situation where you can't easily tell whether return is allowed (versus built in statements) and what it does. I believe that this is a flaw in Groovy that comes with these very lightweight closures. Fan shouldn't copy this mistake.

There are two approaches to solving this:

1) Require a prefix for it blocks, such as a # or :

list.each #{ echo(it) }

list.each: { echo(it) }

2) Allow long-return in it blocks (keeping short-return in closures):

list.each {
  if (it == null) return  // long-return, ends method
  echo(it)
}

list.each |Str str | {
  if (str == null) return   // short-return, next in loop
  echo(str)
}

Option 2 opens up the long return hole of weird exceptions. One solution to that would be for the declaration to have an ability to prevent/allow it block closures. This would be used to separate safe and unsafe blocks (eg. loops from listeners/callbacks). For example:

public Void each(|V<it>| block) { ... }  // declared to allow it blocks

Sets

On mutable/immutable sets to an existing object, it would seem to make more sense for these to be routed via a method:

This with(|This| itBlock) { itBlock(this) }

The compiler could determine whether with() was overridden for any class, and if not, optimise with immediate execution. If the class was const, then with() would always be called (perhaps all const classes should extend ConstObj, not Obj). The result from a const class with() would have to be assigned to a variable or used in some way.

Construction

This proposal leaves us with no way to tell the difference by reading the code as to whether a block following a constructor can set a const field or not. Maybe that isn't an issue - I can't remember the problems right now, and it's late....

qualidafial Tue 24 Mar 2009

This proposal leaves us with no way to tell the difference by reading the code as to whether a block following a constructor can set a const field or not.

I thought about this too. Maybe require := assignment in these cases?

brian Tue 24 Mar 2009

I don't think there is a conflict here. The same rules that apply in your proposal should work at all call sites

I definitely see your point, but it doesn't feel right to me. It-blocks are declarative in nature, but general closures are not. So I'd prefer to see a compile time error if I think I am passing a closure and I am not. But this is one of those cases where, if we stick to the current proposal, we can change our minds later without affecting backward compatibility.

I believe this would still be possible even with the it proposal so long as a with method were introduced (:)

Excellent point, I totally spaced out on that idea in my proposal. So let's enhance the proposal to introduce a with method which has similar semantics to constructor shortcuts:

Foo { ... }  =>  Foo.make { ... }
foo { ... }  =>  foo.with { ... }

However, we have ended up back in a situation where you can't easily tell whether return is allowed

Definitely an issue with no satisfactory solution. But I think the proposal is pretty simple - if you use a return in an it-block you get a compiler error. Not perfect, but the compiler prevents you from writing confusing code. Note that in most cases the syntax looks exactly like it does today, and you can't use return in a with-block today. So that aspect really isn't changing.

This proposal leaves us with no way to tell the difference by reading the code as to whether a block following a constructor can set a const field or not

I've given this a lot of thought, and I believe that setting a const field in an it-block has to be a runtime check. Without that level of flexibility it ties your hands in building immutable and builder-style solutions. It is not ideal, but a compile-time check is going to be extremely limiting. But setting a const field outside an it-block will always be immediately flagged as a compiler error. So I feel this is the right trade-off.

JohnDG Tue 24 Mar 2009

Excellent point, I totally spaced out on that idea in my proposal. So let's enhance the proposal to introduce a with method which has similar semantics to constructor shortcuts:

If the default implementation of with does what you would expect for immutable and mutable objects, then I'm 100% on board with the proposal. It feels like a very safe, modest proposal that simplifies a lot of common cases and cleans up with blocks as they exist today.

brian Tue 24 Mar 2009

If the default implementation of with does what you would expect for immutable and mutable objects, then I'm 100% on board

Just to be clear, what exactly is that? My thinking was that with wasn't a predefined method on Obj, but rather something a given class defines. To have with automatically implemented by immutable const classes would be very cool though. Although I think that behavior needs to be coupled with a default implementation of clone.

JohnDG Wed 25 Mar 2009

Just to be clear, what exactly is that? My thinking was that with wasn't a predefined method on Obj, but rather something a given class defines.

I thought it was widely agreed upon that a language with built-in support for immutability needs to provide a standardized way of creating modified clones of immutable data structures. At least, this made it onto more than one list of must-have features for a Fan 1.0 release.

If such a feature is not built in, then there will be no well-defined, boilerplate-free method of creating modified clones of the numerous immutable data structures that are sure to populate third-party libraries.

I wouldn't even mind if the predefined with method were final, such that from a functional point of view, on mutable objects, the method would run the it-block immediately, while with immutable objects, it would run the it-block on a clone. (From an implementation point of view, none of the above would happen and all the code would be inlined for maximum performance.)

To have with automatically implemented by immutable const classes would be very cool though.

Only because they share the same class hierarchy, you also need to provide with for mutable classes, as well, in which case the behavior is "run immediately".

Although I think that behavior needs to be coupled with a default implementation of clone.

This needs to be done anyway, since in a language with immutability, clone can be optimized such that immutable data structures (wherever they appear in the data hierarchy of the object being cloned) are not themselves cloned.

brian Wed 25 Mar 2009

I do not disagree that immutable setters need to be defined. Although I think this feature uses it-blocks, I think it is really a different feature in its own right. I will start a different discussion for it - we really don't have any concrete proposals for it yet.

JohnDG Wed 25 Mar 2009

Although I think this feature uses it-blocks, I think it is really a different feature in its own right.

it-blocks are the primary feature through which mutable objects will be modified. So if a different mechanism is chosen for immutable data structures, it again presents the problem that code cannot operate on both immutable and mutable data structures without separate code paths for each.

jodastephen Thu 26 Mar 2009

> However, we have ended up back in a situation where you can't easily tell whether return is allowed

Definitely an issue with no satisfactory solution. But I think the proposal is pretty simple - if you use a return in an it-block you get a compiler error. Not perfect, but the compiler prevents you from writing confusing code. Note that in most cases the syntax looks exactly like it does today, and you can't use return in a with-block today. So that aspect really isn't changing.

The problem with adding a long-return later is that you've let the cat out of the bag now.

Closures come in two types - those which you want to execute immediately, and those which are deferred for later execution (on another thread or as a callback).

The immediate execution cases, like list.each, are those that you want to allow long-returns in as they are safe. The callback cases are unsafe for long-return, and so you don't want to allow it.

So, while it-blocks would block any return now, they could be extended in the future to allow long-return. But since you'll have allowed callbacks to be defined using it-blocks, you can't stop the damaging long-returns from being coded.

I believe that the declaration-site should specify whether it-blocks are allowed (immediate execution) or not (callbacks).

(This gets more complex again when you consider that a two parameter closure, like looping around a map, should also be able to use long-returns. This suggests that the real issue isn't whether the block is an it-block, but whether it is called immediately or not.)

brian Thu 26 Mar 2009

The problem with adding a long-return later is that you've let the cat out of the bag now.

I think we have debated the long-return issue at length and never reached a solution. At this point between the core Fan code and Bespin we have 150K lines of Fan code, and it is pretty clear to me this whole issue is a red herring. On the rare occasion I've wanted long-returns, the eachWhile method solves the problem. So basically what I am saying is that I don't think the whole long-return issue should be brought into this discussion. Chances are 95% that we will never do them, and if we do them we can figure it out then. I don't want to make compromises on this feature which we have to do, based on a feature we are unlikely to ever do. This is about trade-offs, and in this proposal I think we have a solid plan for unification of with-blocks and closures.
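For readers who have not used it, a minimal sketch of the eachWhile pattern mentioned above. The list contents and the length test are made up for the example; the closure returns null to keep going and a non-null value to stop:

```fantom
class Main
{
  static Void main()
  {
    list := ["alpha", "bravo", "charlie", "delta"]

    // eachWhile keeps iterating while the function returns null and
    // stops at the first non-null result, which becomes its return value
    found := list.eachWhile |Str s->Obj?| { s.size > 5 ? s : null }

    echo(found)   // "charlie" is the first item longer than five characters
  }
}
```

This covers the usual "stop the loop early and hand back a value" case with an ordinary local return from the closure, which is why no long-return is needed for it.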

tompalmer Fri 27 Mar 2009

Also, with blocks today don't allow returns, or am I mistaken? So this policy also sticks to status quo (whether or not it's all my favorite). And the new it blocks continue the with block style of looking syntactically different: no keyword and no params.

brian Fri 27 Mar 2009

Also, with blocks today don't allow returns, or am I mistaken? So this policy also sticks to status quo (whether or not it's all my favorite).

correct - status quo

brian Tue 31 Mar 2009

I think we have pretty good consensus around this proposal. Any more comments or concerns about this direction? I plan to start work on it this week.

jodastephen Fri 3 Apr 2009

One thing I don't think you can do is to have a constructor with default parameter values, and an it-block.

class FooB {
  new make(Str a := "", Str b := "", |This| block) {
    block(this)
  }
}

I should also note that the design doesn't solve one of my goals, which was to prevent references to this escaping during construction (because those references are unsafe). However, that is perhaps a feature of construction, rather than it-blocks, and they can be treated almost as two separate problems to solve.

brian Sat 4 Apr 2009

One thing I don't think you can do is to have a constructor with default parameter values, and an it-block

Actually that is something I was planning to support independent of it-blocks, it happens anytime you have a closure parameter as the last argument. I am going to allow the closure to bind to the last argument (so you can still use the default params).
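A sketch of how that might look at the call site, assuming the binding rule described above is implemented as stated. The field c, the nullable defaulted block parameter, and the Main wrapper are assumptions added for illustration and differ slightly from the constructor in the question:

```fantom
class Main
{
  static Void main()
  {
    // no positional arguments: a and b take their defaults,
    // and the trailing it-block binds to the last parameter
    x := FooB { c = "set in block" }

    // one positional argument: b still defaults, block still binds last
    y := FooB("first") { c = "set in block" }

    echo("$x.a $y.a")
  }
}

class FooB
{
  new make(Str a := "", Str b := "", |This|? block := null)
  {
    this.a = a
    this.b = b
    if (block != null) block(this)
  }

  Str a
  Str b
  Str c := ""
}
```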

I should also note that the design doesn't solve one of my goals, which was to prevent references to this escaping during construction (because those references are unsafe).

I think that is really an orthogonal problem - it really doesn't have anything to do with whether you use it-blocks or not. Off the top of my head it seems like an impossible problem to solve, but I'm up for brainstorming on it.

brian Tue 7 Apr 2009

If you are working off the tip, be aware that I have pushed the initial support for it-blocks to hg.

This feature is by no means complete, but it-blocks now work well enough to replace the functionality of with-blocks. Currently a couple minor things are broken:

  • it-blocks used with a Java FFI constructor
  • complex facets
  • const field compile time checking
