Differences between “first-class function” mechanisms

Some languages (JavaScript, Python) have the notion that a function is an object:

//Javascript
var fn = console.log;

This means that functions can be treated like any other object (first-class functions), e.g. passed in as an argument to another function:

var invoker = function(toInvoke) {
    toInvoke();
};
invoker(fn); //will call console.log

Other languages (C++, C#, VB.NET) do not define functions as real objects:

//C#
Type t = Console.WriteLine.GetType();
//This code will not compile, because:
//"'Console.WriteLine()' is a method, which is not valid in the given context"

Rather, these languages may have objects that point to a function (such as C++ function pointers) and that can be passed around just like any other object. In the CLI, these wrapper objects are called delegates or delegate instances:

//C#
void Invoker(Action toInvoke) {
    toInvoke();
}

Action action = Console.WriteLine;
Invoker(action);

//also valid, and the toInvoke argument will now contain a delegate which points to Console.WriteLine
//Invoker(Console.WriteLine);

What differences in capability arise from these two mechanisms — “function object” vs “pointer-to-function as object”?


The two pieces of code are not equivalent to each other. A language can be implemented so that each method can be used directly as a first-class object. This affects the ABI, calling convention, and linking mechanism, but is not extraordinarily special. Most modern language implementations already attach a lot of metadata to each function.

However, the meaning of object.method differs substantially between its use in delegate assignment vs its use in simple variable assignment:

  • In C#, the code Func<…> m = obj.method; m() is equivalent to obj.method(). That is, a Func<…> is an object that knows which object it belongs to (i.e. the method is “bound” to a specific object). This behaves like a closure over the target object. The resulting Func<…> must therefore remember both the method and the target object. Since a method may be bound to more than one object, each binding results in a new value.

  • In contrast, obj.method in JavaScript merely resolves the method without binding it to an object. We have to do the binding ourselves: obj.method() is equivalent to var m = obj.method.bind(obj); m(). Binding a method to different objects produces values that are not equal to one another, while the unbound method is the same value however it is reached.
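To make the JavaScript half of this concrete, here is a minimal sketch. The objects `counter` and `other` are made up for illustration; the only thing it relies on is that `Function.prototype.bind` returns a new function on every call.

```javascript
// Hypothetical objects, used only for illustration.
const counter = {
    count: 0,
    increment: function () {
        this.count += 1;
        return this.count;
    }
};
const other = { count: 100, increment: counter.increment };

// The unbound method is one and the same function object,
// no matter which object we reach it through.
const sameUnbound = counter.increment === other.increment; // true

// Every call to bind creates a new function value, even for the same target.
const boundA = counter.increment.bind(counter);
const boundB = counter.increment.bind(counter);
const boundOther = counter.increment.bind(other);
const sameBound = boundA === boundB; // false

console.log(sameUnbound, sameBound);
console.log(boundA(), boundOther()); // 1 101
```

So the bound values carry the target with them, which is what a C# delegate does implicitly.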

In general, I prefer the C# approach, where obj.method() and every other use of obj.method share the same semantics. In JavaScript, by contrast, obj.method() has different semantics from the bare obj.method.


I see two areas of different behavior that arise from the two mechanisms:

  1. The pointer object is a separate object from the referenced function, with an identity distinct from the function and from other pointers to the same function.
  2. The pointer object can have behaviors beyond those of the original function. For example, in the CLI:
       • Delegate instances know about the relevant this (and can bind it to the method)
       • Delegate instances can contain pointers to multiple functions

(Note: This is from my experience in C# / VB.NET and JavaScript. Other languages may have different variants of either mechanism.)

Object reference equality

If a function is a “real object” then any variables pointing to the function are actually pointing to the same object:

//Javascript
var fn = console.log;
var fn1 = console.log;
console.log(fn === fn1); //prints true

Pointer objects have their own identity, even when both point to the same function:

//C#
Action action = Console.WriteLine;
Action action2 = Console.WriteLine;
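Console.WriteLine(Object.ReferenceEquals(action, action2)); //prints False (compilers from C# 11 on may cache static method group conversions, in which case this prints True)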

Additional behaviors of the pointer object

Target binding

Function objects have no knowledge of the object to which they are attached:

//Javascript
var a1 = {
    data: 5,
    writeData: function() {
        'use strict'; //otherwise `this` would be the global object; `this.data` would probably return `undefined`
        console.log(this.data);
    }
};
    
var action = a1.writeData;
action(); //Uncaught TypeError: Cannot read property 'data' of undefined

Therefore, part of calling the function as an instance method is the implicit binding of this within the function to the object:

a1.writeData(); //prints 5

We can also explicitly bind this with bind, apply, or call:

action = a1.writeData.bind(a1);
action(); //prints 5
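For completeness, a small sketch of the other two options, call and apply. The method `describe` and its parameters are made up for this example; unlike bind, both invoke the function immediately instead of returning a new one.

```javascript
'use strict';

// Hypothetical object; `describe` takes arguments so call and apply can be told apart.
const a1 = {
    data: 5,
    describe: function (prefix, suffix) {
        return prefix + this.data + suffix;
    }
};

const describe = a1.describe; // unbound: `this` is not remembered

const viaCall = describe.call(a1, '<', '>');     // arguments listed individually
const viaApply = describe.apply(a1, ['<', '>']); // arguments passed as an array
const viaBind = describe.bind(a1)('<', '>');     // bind first, invoke later

console.log(viaCall, viaApply, viaBind); // <5> <5> <5>
```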

However (as @amon pointed out in this answer), the delegate instance retains that information:

//C#
public class A {
    public int Data;
    public void WriteData() {
        Console.WriteLine(this.Data);
    }
}

var a1 = new A() { Data=4 };
Action action = a1.WriteData;

because action contains knowledge of the target of methods:

Console.WriteLine(action.Target == a1); //prints True

Multicast delegate

JavaScript variables and properties that refer to a function object work just like references to any other object, and therefore cannot refer to multiple function objects simultaneously.

On the other hand, a delegate instance in .NET can point to multiple functions:

//C#
public static class Writers {
    public static void WriteOne() {
        Console.WriteLine(1);
    }
    public static void WriteTwo() {
        Console.WriteLine(2);
    }
}

action = Writers.WriteOne;
action += Writers.WriteTwo;
action(); //prints 1, and then prints 2
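In JavaScript the same effect has to be built by hand. A rough sketch, assuming a hypothetical helper `combine` (not a built-in) that wraps several functions in one:

```javascript
// Hypothetical helper: returns a single function that calls each given function in order.
function combine(...fns) {
    return function (...args) {
        let last;
        for (const fn of fns) {
            last = fn.apply(this, args);
        }
        // Only the last result is returned, similar to invoking a multicast delegate.
        return last;
    };
}

const output = [];
const writeOne = function () { output.push(1); };
const writeTwo = function () { output.push(2); };

const action = combine(writeOne, writeTwo);
action();
console.log(output); // [ 1, 2 ]
```

The combined value is still just one ordinary function object; the list of targets lives in the closure rather than in the language's function type.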

While there are differences as you note, it is worth considering that a closure (i.e. a function with captured locally scoped variables, as in JavaScript) is computationally equivalent to an object with a single method. In terms of what can be done with them, at least on a theoretical level, there is no difference between them. In practice, there are two categories of actual difference:

  • Syntactic differences (i.e. how they interact with language features: how they are created, whether they need special syntax to invoke, which operations, such as function composition, can be performed on them automatically without writing adapters, and so on)

  • Compatibility differences (i.e. whether or not they can interact cleanly with preexisting code).

Both classes of difference can be overcome by writing trivial adapters that convert between a closure and an object as required.
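As a rough illustration of such adapters in JavaScript (the names `toObject`, `toClosure` and the method name `invoke` are invented for this sketch):

```javascript
// Closure form: the state lives in a captured local variable.
function makeCounter() {
    let count = 0;
    return function () {
        count += 1;
        return count;
    };
}

// Hypothetical adapters; the method name `invoke` is chosen for this example only.
function toObject(fn) {
    return { invoke: function (...args) { return fn(...args); } };
}
function toClosure(obj) {
    return function (...args) { return obj.invoke(...args); };
}

const asObject = toObject(makeCounter());
const backToClosure = toClosure(asObject);

const first = asObject.invoke(); // 1
const second = backToClosure();  // 2, because both forms share the same captured state
console.log(first, second);
```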
