Mike's corner of the web.

Applicative functors in uncurried languages

Sunday 9 September 2012 20:47

Note: this post assumes you already have some familiarity with applicative functors

In this post, I'll show how to implement applicative functors in JavaScript, specifically for options, and then show an alternative formulation that's arguably better suited to languages that generally have uncurried functions (that is, languages that tend to have functions that accept multiple arguments rather than a single argument).

First of all, let's implement the option type (otherwise known as the maybe type) in JavaScript as a functor:

var none = {
    map: function(func) {
        return none;
    },
    
    bind: function(func) {
        return none;
    },
    
    toString: function() {
        return "none";
    }
};

function some(value) {
    return {
        map: function(func) {
            return some(func(value));
        },
        
        bind: function(func) {
            return func(value);
        },
        
        toString: function() {
            return "some(" + value + ")";
        }
    };
}

var functor = {
    map: function(func, option) {
        return option.map(func);
    },
    unit: some,
    applyFunctor: function(funcOption, argOption) {
        return funcOption.bind(function(func) {
            return argOption.map(func);
        });
    }
};

We can then use option values as applicative functors. Let's try our implementation out to make sure it behaves as we expect:

var four = some(4);
var six = some(6);

function add(first, second) {
    return first + second;
}

function curry(func, numberOfArguments) {
    return function(value) {
        if (numberOfArguments === 1) {
            return func(value);
        } else {
            return curry(func.bind(null, value), numberOfArguments - 1);
        }
    };
}

functor.applyFunctor(functor.map(curry(add, 2), four), six);
// => some(10)
functor.applyFunctor(functor.map(curry(add, 2), none), six);
// => none
functor.applyFunctor(functor.map(curry(add, 2), four), none);
// => none

Note that the use of the functor required us to curry the add function. This isn't a problem in functional languages such as Haskell, since functions tend to be curried by default. However, in languages that usually define functions to have multiple arguments (uncurried languages, for short), such as JavaScript, things get a little untidy.
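
To make the role of curry concrete, here's a small usage example: it turns the two-argument add into a chain of single-argument functions.

var curriedAdd = curry(add, 2);
curriedAdd(4)(6);
// => 10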

My understanding of applicative functors is that they allow functors, or rather map, to be generalised to functions that accept more than one argument, such as add. Therefore, in an uncurried language, we might imagine the following cleaner API:

functor.applyFunctorUncurried(add, four, six);
// => some(10)
functor.applyFunctorUncurried(add, none, six);
// => none
functor.applyFunctorUncurried(add, four, none);
// => none

And such an API turns out to be not too hard to implement:

functor.applyFunctorUncurried = function(func) {
    var args = Array.prototype.slice.call(arguments, 1);
    return args.reduce(
        functor.applyFunctor,
        functor.unit(curry(func, args.length))
    );
};

Interestingly, the implementation of applyFunctorUncurried is most easily expressed in terms of the original applyFunctor. Cases like this help explain why functional languages tend to favour curried functions: currying often makes the implementation of higher-order functions such as applyFunctor much more straightforward.
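
To make the reduce explicit, the three-argument call from earlier unfolds into nested calls of the original applyFunctor (just an illustrative expansion, not extra API):

// functor.applyFunctorUncurried(add, four, six) is equivalent to:
functor.applyFunctor(
    functor.applyFunctor(
        functor.unit(curry(add, 2)),
        four
    ),
    six
);
// => some(10)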

This raises an interesting question: are these two formulations of applyFunctor of equal power? That is, is it possible to implement each in terms of the other? It's straightforward to see that we can implement applyFunctorUncurried in terms of applyFunctor since it's precisely the implementation above. What about implementing applyFunctor in terms of applyFunctorUncurried? This turns out to be pretty straightforward too:

function applyFunctor(funcFunctor, argFunctor) {
    return functor.applyFunctorUncurried(apply, funcFunctor, argFunctor);
}

function apply(func, value) {
    return func(value);
}
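
Reusing the earlier values as a quick check, this round-trip version of applyFunctor behaves just like the original:

applyFunctor(functor.map(curry(add, 2), four), six);
// => some(10)
applyFunctor(functor.map(curry(add, 2), none), six);
// => none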

Please let me know if you spot mistakes in any of the above -- I've not exactly been rigorous in proof!

I'd be curious to know if there are any languages that include the alternative formulation of applyFunctor, and whether there are common cases where the original formulation is preferable even in uncurried languages.

Topics: Functional programming, Language design, JavaScript

Peaks and troughs in software development

Monday 20 August 2012 19:47

The problem with a smooth development process is that every day is pretty much the same as the last. You might be writing great code and solving interesting problems with other passionate people, but constantly working on the same thing can begin to feel dull or even frustrating. By having a silky-smooth development process with reliable code and regular releases, you've removed those natural peaks and troughs, like the high of fixing another critical bug in production before you head home and crash. I think it was Steve Freeman who once mentioned that sometimes it's valuable to put some of those peaks and troughs back in, but preferably without putting critical bugs back in.

For instance, I like the idea of spending one day a week working on unprioritised work. It might be that the developers are keen to try out a new rendering architecture that'll halve page load times, or that there's a piece of code that can be turned into a separate library that'll be useful on other projects. Maybe there's a little visual bug that's never going to be deemed important enough to be prioritised, but a developer takes enough pride in their work to spend half an hour fixing it. This feels like a peak to me: there's a lot of value to the product in polishing the user experience, in refactoring the code, and trying out risky ideas, and the developers get to scratch some of their own itches.

However, its regularity can make it feel routine, and you're still working on the same product. As useful as these small, regular peaks and troughs are, I think you also need the occasional Everest. Maybe it's saying “This week, I'm going to try something I've never tried before that's completely unrelated to the project”. Or perhaps you need a Grand Canyon: “Today, we're just going to concentrate on being better programmers by doing a code retreat”. Finding something that works is hard, and you can't even reuse the same idea too much without risking its value as an artificial peak or trough. But I think it's important to keep trying. You don't just want a project and its team to be alive: you need them to be invigorated.

Topics: Software development

Safer mutation: change the value, change the name

Saturday 16 June 2012 12:33

Many advocates of functional programming suggest that the concept of state, the idea that a value can change and mutate over time, makes reasoning about your program much harder, leading to more bugs. Most languages allow some form of mutability, and can therefore implement both functional and imperative algorithms, even if the preference is strongly towards immutability. In a completely pure functional language, mutability is entirely removed. Since some concepts are arguably easier to understand and implement when using mutable state, this can mean certain problems are harder to solve in a purely functional language. But what if we allowed a limited form of mutability in such a way that we still preserve many of the nicer properties of functional programming, such as referential transparency?

To take a simple example: suppose we want to append an item to the end of a list. In an imperative language, we might write something like this:

list.append("first")

so now list has an extra item, meaning that the original value of list no longer exists. In a functional programming language, we'd create a new value instead of mutating the original list:

val longerList = list.append("first")

We can now use both list and longerList, since list was not modified during the append. This means we never need to reason about what state list is in – its value never changes. The trade-off is that a functional append tends to be more expensive than an imperative append. If we don't actually want to use list again, then this is arguably a bad trade-off. What if we could allow the list to be mutated under the covers, but still be able to present a programming model that appears to preserve immutability? So, we write the same code:

val longerList = list.append("first")

but list is now implemented as a mutable list. The compiler must now ensure that list is never used after the append operation. This means the actual implementation is effectively the same as when written in an imperative style, but we ensure that whenever we change the value of an object, we also change the name used to access it.
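
There's no compiler here to enforce this rule, but as a rough sketch of the behaviour we'd want it to guarantee, here's a runtime approximation in JavaScript (the names are purely illustrative): the append mutates the list in place but consumes the old handle, so any later use of that handle fails loudly.

function linearList(items) {
    var consumed = false;
    return {
        append: function(item) {
            if (consumed) {
                throw new Error("value used after it was consumed");
            }
            consumed = true;
            // Mutate the underlying array in place, but hand back a fresh
            // handle: the changed value gets a new name.
            items.push(item);
            return linearList(items);
        },
        toArray: function() {
            if (consumed) {
                throw new Error("value used after it was consumed");
            }
            return items.slice(0);
        }
    };
}

var list = linearList([]);
var longerList = list.append("first");
longerList.toArray();
// => ["first"]
list.toArray();
// => throws: value used after it was consumed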

This approach does have some severe limitations. For instance, sharing mutable state between many objects is likely to be impossible. If we allowed mutable state to be shared, then mutating that state inside one object would require marking all objects that hold that state to be unusable. In general, having the compiler keep track of this is likely to be unfeasible.

Yet this sharing of mutable state is arguably the worst form of mutability. It means that changing something in one part of your system could change something in another, far-away part of the system. This idea of changing the name whenever we change the value is most useful for mutability in the small, when we just want to implement a particular algorithm efficiently.

However, there might still be cases where you'd quite reasonably want to share mutable state between, say, just two objects. The more interesting question is: is it possible to handle this case without requiring the user to write an excessive number of hints to the compiler?

Topics: Language design, Functional programming

The opposite of <noscript>

Saturday 9 June 2012 21:49

HTML has the <noscript> tag for when you want an element to be displayed if and only if JavaScript is disabled, but what if you want the opposite? How do you display an element if and only if JavaScript is enabled? I came across a rather tidy solution on StackOverflow. In the <head>, we add the following:

<noscript>
  <style>
    .iff-javascript-enabled {
        display: none;
    }
  </style>
</noscript>

We then add the iff-javascript-enabled class to the appropriate elements:

<noscript><p>JavaScript is disabled</p></noscript>
<p class="iff-javascript-enabled">JavaScript is enabled</p>

The advantage of this solution over others is that there's no delay. Most other solutions hide the relevant elements by default, and then use JavaScript to show them, but this means that the elements are hidden until that piece of JavaScript fires. However, in some cases hiding the element until the JavaScript has run is exactly what you want. For instance, suppose an element does nothing until some JavaScript hooks up an onclick handler. Showing that element before the onclick handler is added might be frustrating, since clicking the element would do nothing.
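
For comparison, here's a minimal sketch of that JavaScript-driven alternative, assuming the stylesheet hides the class by default (outside of <noscript>) rather than only when scripting is disabled:

// Reveal elements that the stylesheet hides by default with
//     .iff-javascript-enabled { display: none; }
var elements = document.getElementsByClassName("iff-javascript-enabled");
for (var i = 0; i < elements.length; i++) {
    elements[i].style.display = "";
}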

Still, there's a simplicity to the <noscript> approach that I quite enjoy.

Topics: HTML, CSS, JavaScript

What is the expression problem?

Sunday 27 May 2012 21:00

View interactive version of this post

The problem

The expression problem is a tricky problem that crops up in many languages. It asks: given a set of functions that operate over a set of types, how do we allow both the set of functions and the set of types that those functions operate over to be extended, without losing type safety?

Algebraic data types

Let's say we have the abstract syntax tree (AST) for a simple mathematical language that contains literals and additions. In ML, we can represent a node with the algebraic data type (ADT) node, which has two data constructors, LiteralNode and AddNode:

datatype node
    = LiteralNode of real
    | AddNode of node * node

We can then define a function evaluate that turns the AST into a single number.

datatype node
    = LiteralNode of real
    | AddNode of node * node
                
fun evaluate (LiteralNode value) = value
  | evaluate (AddNode (left, right)) = (evaluate left) + (evaluate right)

Note that evaluate is type-safe since it handles all possible instances of node. Now, suppose we want to add another operation over nodes, say to turn the AST into a string. Using ADTs, this is simply a case of adding another function. Importantly, this doesn't require any modification to the existing source code.

datatype node
    = LiteralNode of real
    | AddNode of node * node
    
fun evaluate (LiteralNode value) = value
  | evaluate (AddNode (left, right)) = (evaluate left) + (evaluate right)
  
fun nodeToString (LiteralNode value) = Real.toString value
  | nodeToString (AddNode (left, right)) =
        "(" ^ (nodeToString left) ^ " + " ^ (nodeToString right) ^ ")"
  

The problem arises when we decide that we'd like a variant of our mathematical language with the negation operator. We'd like to be able to evaluate this extension of our mathematical language, but we're not concerned with turning negations into strings. There's no straightforward way of achieving this using ADTs -- we're forced to add another data constructor to node, which may not be possible if we don't own the original source code. We also add the appropriate case to evaluate.

datatype node
    = LiteralNode of real
    | AddNode of node * node
    | NegateNode of node
    
fun evaluate (LiteralNode value) = value
  | evaluate (AddNode (left, right)) = (evaluate left) + (evaluate right)
  | evaluate (NegateNode term) = ~(evaluate term)
  
fun nodeToString (LiteralNode value) = Real.toString value
  | nodeToString (AddNode (left, right)) =
        "(" ^ (nodeToString left) ^ " + " ^ (nodeToString right) ^ ")"
  

Even if we can modify our definition of node, we still have a problem: we can no longer safely create functions that operate over our original language. Consider the function nodeToString: since it no longer exhaustively matches all possible instances of node, it's not type-safe. To restore type safety, we're forced to update it to handle the case of NegateNode:

datatype node
    = LiteralNode of real
    | AddNode of node * node
    | NegateNode of node
    
fun evaluate (LiteralNode value) = value
  | evaluate (AddNode (left, right)) = (evaluate left) + (evaluate right)
  | evaluate (NegateNode term) = ~(evaluate term)
  
fun nodeToString (LiteralNode value) = Real.toString value
  | nodeToString (AddNode (left, right)) =
        "(" ^ (nodeToString left) ^ " + " ^ (nodeToString right) ^ ")"
  | nodeToString (NegateNode term) = "-" ^ (nodeToString term)
      

In general, ADTs make it easy to add extra functions that operate over existing data types, but difficult to safely extend those data types. Now, let's take a look at the same problem in an object-orientated language, specifically Java.

Object-orientation: interfaces and classes

We begin by defining the interface Node, along with two implementations, AddNode and LiteralNode:

public interface Node {
}
            
public class LiteralNode implements Node {
    private final double value;

    public LiteralNode(double value) {
        this.value = value;
    }
}

public class AddNode implements Node {
    private final Node left;
    private final Node right;

    public AddNode(Node left, Node right) {
        this.left = left;
        this.right = right;
    }
}
      

For the sake of readability, let's leave out the constructors:

public interface Node {
}
            
public class LiteralNode implements Node {
    private final double value;
}

public class AddNode implements Node {
    private final Node left;
    private final Node right;
}
      

Next, we want to evaluate each node to a single number. We add an evaluate method to the interface, and add appropriate implementations to the concrete classes.

public interface Node {
    double evaluate();
}
            
public class LiteralNode implements Node {
    private final double value;
    
    public double evaluate() {
        return value;
    }
}

public class AddNode implements Node {
    private final Node left;
    private final Node right;
    
    public double evaluate() {
        return left.evaluate() + right.evaluate();
    }
}
      

Unlike our approach with ADTs in ML, extending our language to support negation is straightforward. We simply add another implementation of Node, which doesn't require any modification of the original source code.

public interface Node {
    double evaluate();
}
            
public class NegateNode implements Node {
    private final Node term;
    
    public double evaluate() {
        return -term.evaluate();
    }
}

public class LiteralNode implements Node {
    private final double value;
    
    public double evaluate() {
        return value;
    }
}

public class AddNode implements Node {
    private final Node left;
    private final Node right;
    
    public double evaluate() {
        return left.evaluate() + right.evaluate();
    }
}
      

Unfortunately, safely adding another operation on nodes requires us to modify the original source code for Node, which may not always be possible. In our case, we want to be able to turn our original language of add and literal nodes into strings, so we need to add a nodeToString method on both the Node interface and the classes AddNode and LiteralNode:

public interface Node {
    double evaluate();
    String nodeToString();
}

public class NegateNode implements Node {
    private final Node term;
    
    public double evaluate() {
        return -term.evaluate();
    }
}
            
public class LiteralNode implements Node {
    private final double value;
    
    public double evaluate() {
        return value;
    }
    
    public String nodeToString() {
        return Double.toString(value);
    }
}

public class AddNode implements Node {
    private final Node left;
    private final Node right;
    
    public double evaluate() {
        return left.evaluate() + right.evaluate();
    }
    
    public String nodeToString() {
        return "(" + left.nodeToString() + " + " +
            right.nodeToString() + ")";
    }
}
      

Even if we can modify the original source code, by modifying the interface, we've forced all implementations of Node to implement nodeToString even though we only ever wanted to use such an operation on our original add and literal nodes. In particular, we're forced to add nodeToString to NegateNode:

public interface Node {
    double evaluate();
    String nodeToString();
}

public class NegateNode implements Node {
    private final Node term;
    
    public double evaluate() {
        return -term.evaluate();
    }
    
    public String nodeToString() {
        return "-" + term.nodeToString();
    }
}
            
public class LiteralNode implements Node {
    private final double value;
    
    public double evaluate() {
        return value;
    }
    
    public String nodeToString() {
        return Double.toString(value);
    }
}

public class AddNode implements Node {
    private final Node left;
    private final Node right;
    
    public double evaluate() {
        return left.evaluate() + right.evaluate();
    }
    
    public String nodeToString() {
        return "(" + left.nodeToString() + " + " +
            right.nodeToString() + ")";
    }
}
      

By using methods on interfaces, we have the opposite problem to ADTs: adding additional types of nodes without modifying or affecting existing code is straightforward, while it's difficult to safely add additional operations over those nodes.

Summary

In this particular example, our ideal solution would let us:

  • define AddNode and LiteralNode, and an operation evaluate over both of them.
  • add a third type of node, NegateNode, which evaluate can be performed on, without modification of the original source code.
  • add a second operation nodeToString over the original set of nodes, AddNode and LiteralNode, without modification of the original source code.
  • not be forced to implement nodeToString for NegateNode.

We can express these properties more generally as being able to:

  • define a set of data types and operations over those data types
  • add additional data types that can have the same operations applied to them, without modification of the original source code.
  • add additional operations over those data types, without modification of the original source code.
  • add these additional data types and operations independently. That is, if an extension ExtData adds a data type D, and another extension ExtOp adds an operation Op, we should be able to safely use both extensions without implementing the operation Op for the data type D, although we may choose to do so if we want to apply Op to D.

all while preserving type-safety.

Topics: Language design

Solving the expression problem with union types in Shed

Monday 30 April 2012 22:31

The expression problem is a tricky problem that crops up in many languages. It asks: given a set of functions that operate over a set of types, how do we allow both the set of functions and the set of types that those functions operate over to be extended, without losing type safety? If you're not familiar with the problem, I recommend reading the explanation by the author of Magpie. For our purposes, we'll use an abstract syntax tree for mathematical expressions as our data type. To start, let's have two sorts of node: addition operators and literals.

interface Node {}

def AddNode class(myLeft: Node, myRight: Node) implements Node => {
    public def left fun() => myLeft;
    public def right fun() => myRight;
}

def LiteralNode class(myValue: Double) implements Node => {
    public def value fun() => myValue;
}

(As an aside: due to the design of the language, we can't give the arguments to a class the same name as its getter, for instance def value fun() => value, since the body of the function would refer to the function rather than the class argument. Prepending each of the arguments with my is a poor solution, and although I have several ideas on how to rectify this, I'm still pondering the simplest, cleanest solution.)

Suppose we want to implement a function evaluate that evaluates the expression to a single value. Our first attempt at an implementation might look like this:

def evaluate fun(node: Node) : Double =>
    match(node,
        case(AddNode, evaluateAdd),
        case(LiteralNode, evaluateLiteral)
    );

def evaluateAdd fun(add: AddNode) =>
    evaluate(add.left()) + evaluate(add.right());

def evaluateLiteral fun(literal: LiteralNode) =>
    literal.value();

There's one immediate problem with this solution: it's not type-safe. If somebody adds another implementation of Node, then evaluate no longer covers all possible cases. The solution to this problem is to define a union type:

type StandardNode = AddNode | LiteralNode

and update evaluate by changing the type of its argument:

def evaluate fun(node: StandardNode) : Double =>
    match(node,
        case(AddNode, evaluateAdd),
        case(LiteralNode, evaluateLiteral)
    );

def evaluateAdd fun(add: AddNode) =>
    evaluate(add.left()) + evaluate(add.right());

def evaluateLiteral fun(literal: LiteralNode) =>
    literal.value();

This makes evaluate type-safe, but has had the unintended consequence of making evaluateAdd unsafe: add.left() and add.right() both have the type Node, yet evaluate only accepts the narrower type StandardNode. We fix this by adding type parameters to AddNode:

def AddNode class[T] => (myLeft: T, myRight: T) implements Node => {
    public def left fun() => myLeft;
    public def right fun() => myRight;
}

and modifying the type of the argument of evaluateAdd and updating the value of StandardNode:

def evaluateAdd fun(add: AddNode[StandardNode]) =>
    evaluate(add.left()) + evaluate(add.right());
    
type StandardNode = AddNode[StandardNode] | LiteralNode;

(At this point, the interface Node isn't really necessary any more, although there might be other reasons to keep it around.)

Suppose we now add NegateNode and the associated union type ExtendedNode:

def NegateNode class[T] => (myValue: T) => {
    public def value fun() => myValue;
}

type ExtendedNode =
    AddNode[ExtendedNode] |
    NegateNode[ExtendedNode] |
    LiteralNode;

ExtendedNode cannot reuse the definition of StandardNode since AddNode[ExtendedNode] is a subtype of ExtendedNode but not a subtype of StandardNode. The solution is to introduce another type parameter, this time on StandardNode and ExtendedNode:

type StandardNode[T] = AddNode[T] | LiteralNode;

type ExtendedNode[T] = StandardNode[T] | NegateNode[T];

We can then add the appropriate type parameters to the argument of evaluate:

def evaluate fun(node: StandardNode[StandardNode]) : Double =>
    match(node,
        case(AddNode[StandardNode[StandardNode]], evaluateAdd),
        case(LiteralNode, evaluateLiteral)
    );

But this doesn't work either: we need to specify the type parameter to the second reference to StandardNode, which is StandardNode, which also requires a type parameter... and so on. The solution is to add yet more types that fix the type parameter to themselves:

type StandardNodeF = StandardNode[StandardNodeF];
type ExtendedNodeF = ExtendedNode[ExtendedNodeF];

def evaluate fun(node: StandardNodeF) : Double =>
    match(node,
        case(AddNode[StandardNodeF], evaluateAdd),
        case(LiteralNode, evaluateLiteral)
    );

In order to evaluate an instance of ExtendedNode, we'd need to define the following:

def evaluateExtended fun(node: ExtendedNodeF) : Double =>
    match(node,
        case(AddNode[ExtendedNodeF], evaluateAddExtended),
        case(NegateNode[ExtendedNodeF], evaluateNegate),
        case(LiteralNode, evaluateLiteral)
    );

def evaluateAddExtended fun(add: AddNode[ExtendedNodeF]) =>
    evaluateExtended(add.left()) + evaluateExtended(add.right());
    
def evaluateNegate fun(negate: NegateNode[ExtendedNodeF]) =>
    -evaluateExtended(negate.value());

It seems reasonable to write evaluateNegate, but the definition of evaluateAddExtended seems virtually the same as before. The difference is the type parameter for AddNode, and the function we use to evaluate the sub-nodes. So, we introduce a type parameter and argument to abstract both:

def evaluateAdd fun[T] => fun(evaluator: Function[T, Double]) =>
    fun(add: AddNode[T]) =>
        evaluator(add.left()) + evaluator(add.right());

We can also perform a similar transformation on evaluateNegate and evaluate:

def evaluateNegate fun[T] => fun(evaluator: Function[T, Double]) =>
    fun(negate: NegateNode[T]) =>
        -evaluator(negate.value());

def evaluate fun[T] => fun(evaluator: Function[T, Double]) =>
    fun(node: T) : Double =>
        match(node,
            case(AddNode[T], evaluateAdd[T](evaluator)),
            case(LiteralNode, evaluateLiteral)
        );

Now we can rewrite evaluateExtended to use evaluate:

def evaluateExtended fun[T] => fun(evaluator: Function[T, Double]) =>
    fun(node: ExtendedNode[T]) : Double =>
        match(node,
            case(StandardNode[T], evaluate[T](evaluator)),
            case(NegateNode[T], evaluateNegate[T](evaluator))
        );

If we want to call evaluate or evaluateExtended we need to use a similar trick as with StandardNode and ExtendedNode to instantiate the functions:

def evaluateF fun(node: StandardNodeF) =>
    evaluate[StandardNodeF](evaluateF)(node);
    
def evaluateExtendedF fun(node: ExtendedNodeF) =>
    evaluateExtended[ExtendedNodeF](evaluateExtendedF)(node);

Hopefully you can now see how you'd extend the solution to include further node types. Although not covered here, it's also possible to create functions or classes to help combine evaluators, allowing functions written in this style to carry a bit less boilerplate.

If we imagine an ideal solution to the expression problem, we might argue that this solution is a little verbose, and I'd be inclined to agree. The question is: is it unnecessarily verbose? There's an argument to be made that it exposes the essential complexity of solving the expression problem, and that other, less verbose solutions hide rather than remove this complexity. On the one hand, those solutions let you express the same ideas more succinctly, without the clutter of constantly passing around the type parameter T and the evaluator argument as we do here. On the other hand, if you want to understand what's going on in this solution, you don't have to look very far, since everything is passed around explicitly.

On the whole, I think it's simpler than some solutions I've seen to the expression problem, and the verbosity isn't all-consuming. Pretty good for a first go, I reckon.

Topics: Language design, Shed

Shed programming language

Monday 30 April 2012 21:52

Looking around at many of the mainstream languages today, I can't help feeling as if they've become rather large and unwieldy. I'd say this is true of both scripting languages such as Python and Ruby, as well as languages common in the enterprise, such as C# and even Java. I'd argue that features such as (implementation) inheritance, extension methods, properties, attributes/decorators, and null act to complicate each language.

A little over a year ago, I thought about the set of features that I actually used and wondered what a slimmed-down object-orientated programming language would look like. To that end, I've been working on the Shed programming language. A non-exhaustive list of features would be:

  • Statically-typed
  • Functions, including
    • Closures
    • Functions as values
    • Lightweight anonymous function syntax
  • Interfaces
  • Classes that can implement interfaces
  • Object literals, which can also implement interfaces

Intentionally missing are:

  • Nulls
  • Implementation inheritance
  • Extension methods
  • Properties
  • Function overloading

To give a flavour, here's a naive implementation of the Fibonacci sequence:

def fibonacci fun(n: Double) =>
    if n <= 1
        then n
        else fibonacci(n - 1) + fibonacci(n - 2)

Note that the syntax is still changing. The current implementation of the compiler doesn't support operators yet, so to get the above compiling, you'd have to replace the operators with method calls.

The aim of implementing the language is fun and experimentation, rather than creating a language to write production code in. I'm pretty sure that Shed is a good fit for the code I tend to write, at least given my programming style. I don't know how well the language applies to other domains that I'm less familiar with, but I intend to enjoy finding out, and possibly extending the feature list as I go.

One of the main principles of the language is to optimise for the reader rather than the writer. I spend far more time wondering what some piece of code I wrote a few months ago does than typing new bits of code.

Following on from this, variables in Shed may only be introduced into a scope via an explicit declaration (aside from those in the default scope). This rules out wildcard imports such as:

import time.*;

since there's no way to tell exactly what variables have been added to the scope. In contrast, the following import adds only DateTime to the scope:

import time.DateTime;

As I implement various bits of Shed, I'll continue to post interesting problems and solutions I come across. Until then...

Topics: Language design, Shed

The impurity of object identity

Thursday 23 February 2012 13:22

While thinking about what subsets of common languages, such as JavaScript and Java, could be considered pure, it occurred to me that object identity in most languages is an unexpected source of impurity. That is, if you're using object identity, you're not writing purely functional code.

Take, for instance, this piece of JavaScript:

var createEmpty = function () {
    return {};
};

var first = createEmpty();
var second = createEmpty();
var isTheSame = first === second;

We create two empty objects, and then determine whether they represent the same object using the triple equals operator (using is in Python or == in Java would have much the same effect). Since they were constructed separately, isTheSame holds the value false. Yet in a purely functional language, calling the same function with the same arguments (in this case, no arguments) twice should return the exact same value.

Strictly speaking, it's the construction of the object that is the impure code: calling the identity operator with the same arguments will always return the same value, whereas each time we construct an object, we assign it a different identity. However, code that either contains no use of object identity, or contains no construction of objects can be considered pure. Treating object identity as the impure concept that cannot be used is, in my opinion, the more useful option: it's quite handy being able to construct objects.
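
For instance, comparing the two objects by value rather than by identity keeps the code referentially transparent. A crude sketch using JSON serialisation (fine for plain data like this, though not a general-purpose equality check):

var isStructurallyEqual =
    JSON.stringify(createEmpty()) === JSON.stringify(createEmpty());
// => true, however many times we call createEmpty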

Topics: Functional programming, Language design

Coupling classes to interfaces

Sunday 23 October 2011 09:49

Over on the 8th Light blog, Steven Degutis discusses the advantages of Go's approach to interfaces. Specifically, Go allows you to have a class implement an interface without explicitly mentioning the interface so long as it has methods with the right names and type signatures. He likes the idea you can write an interface, and have an existing class implement that interface without modification. I find the idea that a class can implement an interface just by the coincidence of having methods with matching names and type signatures to be terrifying.

To re-use Steven's example:

    type Predator interface {
        chasePrey(Prey) bool
        eatPrey(Prey)
    }
   
    type Lion struct{}
   
    func (self Lion) chasePrey(p Prey) bool {
        // ...
    }
   
    func (self Lion) eatPrey(p Prey) {
        // ...
    }

Since Lion has the methods chasePrey and eatPrey with the correct type signatures, it implements the interface Predator, yet the interface Predator is never mentioned in the definition of Lion. This is considered a Good Thing: to quote Steven:

But alas, in Java, the class itself must know ahead of time all the names of the interfaces it wants to conform to. This is an unfortunate form of temporal coupling.

[…]

Go doesn't care that, 5 years ago, Lion never specified which interfaces it implements, it just cares that it has the methods this interface needs, and rightly so.

I think it's a terrifying idea that my class could suddenly start implementing new interfaces that I'd never considered when writing the class. When I define an interface in Java, it has more meaning than just the name and type signature of each method. An interface is also a contract for the behaviour of the implementation of those methods, which can't be verified by names and types alone. I want to explicitly specify the interfaces that a class implements, as a declaration in the code that says “I understand what behaviour sub-types of the interface should have, and I've done my best to make sure that this class implements that behaviour”. To me, this is much more useful than knowing the type of a class is compatible with an interface.

Now, I'm not saying that Go has got it wrong. I think Go's interfaces provide weaker contracts in exchange for greater flexibility, but which is better depends on your programming style and preferences.

Topics: Software design

Sharing JavaScript between the browser and the server

Saturday 22 October 2011 09:20

When talking about node.js, I usually hear people give two reasons why they love it. The first is that it uses an event-driven model for concurrency, rather than using threads and blocking function calls. The second is that node.js allows you to use the same language, JavaScript, on both the browser and the server. You can then share logic that might otherwise be duplicated, such as validation of user input. Yet people often dismiss the latter point, saying that when they do web development, the amount of logic that ends up being duplicated is negligible, since the code in the browser and the code on the server address different concerns.

My question is: have we got this the wrong way round? Does code on the browser and server tend to address separate concerns precisely because sharing logic between them is hard? Once we start using the same language on both sides, we might start to see new ideas that this brings to the table. With modern web applications, we see more and more code that we might have previously expected to see on the server being brought into the browser. If we can easily move code between the browser and server, then we start to have an enormous amount of flexibility: we can easily change our minds about where some particular code should be executed, both when building our application and at run-time.
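
As a small sketch of the kind of sharing in question (the module and function names here are purely illustrative), a validation function can be written once and loaded both by the browser and by node.js:

// validation.js -- a hypothetical shared module
(function(exports) {
    // Used by browser-side form checks and server-side request handling alike.
    exports.isValidUsername = function(username) {
        return /^[a-z0-9_]{3,16}$/i.test(username);
    };
})(typeof module !== "undefined" ? module.exports : (window.validation = {}));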

Topics: JavaScript, Software design