Unified Function Syntax

ISO/IEC JTC1 SC22 WG21 N2763 = 08-0273 - 2008-09-19

Lawrence Crowl
Alisdair Meredith

This paper is a revision of N2582 = 08-0092 and reflects the consensus of the "other" library subgroup on 15 September 2008. The revision removes the handling of the nested declarator issue, as core language issue 681 addresses the problem. The revision removes the introduction of named lambdas and block-local function definitions. The intent is that these features be added in a later standard.

Introduction

The syntax of the new function declarator (N2541 New Function Declarator Syntax Wording) and the syntax of lambda expressions (N2550 Lambda Expressions and Closures: Wording for Monomorphic Lambdas (Revision 4)) are similar. As suggested by Alisdair Meredith (N2511 Named Lambdas and Local Functions), the two could be made more similar still, thus simplifying the programmer's view of the language. The British position (N2510 BSI Position on Lambda Functions) supports this work.

Such a unification would address the concerns of Daveed Vandevoorde (N2337 The Syntax of auto Declarations) that auto is too overloaded, being used both for the new function declarator syntax and for automatically deducing the type of a variable (N1984 Deducing the type of variable from its initializer expression (revision 4)).

This paper explores the syntactic unification of function declarations and lambda expressions. It takes the new lambda syntax (N2550) as the starting point for syntactic unification, so the specific syntactic suggestions in N2511 no longer apply. As a simplistic unification would introduce unfortunate irregularities in semantics, we also explore regularizing the semantics of such declarations.

Syntactic Unification

Our general approach is to replace the use of auto in N2541 with a lambda-introducer of the form [].


int x=0, y=0;
[]f(int z)->int {
    return x+y+z;
}
struct s {
    int k;
    []g(int z)->int;
};
[]s::g(int z)->int {
    return k+x+z;
}
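
For comparison, the same declarations can be written in the N2541 form that this proposal starts from, with auto in the position where the unified syntax places []. This is an illustrative sketch only; it reuses the names from the example above and relies on nothing beyond late-specified return types.

```cpp
int x = 0, y = 0;

// Namespace-scope function with a late-specified return type.
auto f(int z) -> int {
    return x + y + z;
}

struct s {
    int k;
    auto g(int z) -> int;   // member declaration, N2541 form
};

// Out-of-class member definition; k is reached through the implicit this.
auto s::g(int z) -> int {
    return k + x + z;
}
```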

As lambda expressions can implicitly deduce the return type in simple cases, a reasonable extension to the unified syntax might allow the same for inline function definitions. However, the Evolution Working Group looked at a similar extension in some detail earlier in the process and rejected it for the coming C++0x revision. We see no reason to reverse that decision at this late stage. So, the return type is required.

Finally, for more declarative consistency, the parameter list is required, even if empty, for function declarations. (We do not propose to add that requirement to lambda expressions.)
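
The asymmetry between the two forms can be sketched with ordinary functions and lambda expressions. The names answer, inc, get, and via_lambdas are hypothetical; the point is only that a lambda expression may omit its return type in simple cases and may omit an empty parameter list, while a function declaration spells out both.

```cpp
// Function: the return type and the (empty) parameter list are both written.
int answer() { return 42; }

int via_lambdas() {
    // Lambda with a deduced return type (int) for a single return statement.
    auto inc = [](int z) { return z + 1; };
    // Lambda with the empty parameter list omitted entirely.
    auto get = [] { return 41; };
    return inc(get());
}
```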

Future Semantic Regularization

The intent of unified function syntax is to enable future language features as compatible extensions. We discuss those extensions here, but stress that the proposed wording does not introduce named lambdas or nested functions.

For functions at namespace scope, the lambda-introducer of the form [] is semantically correct.

For functions at class scope, the lambda-introducer of the form [] is semantically correct, with the understanding that class-scope non-static functions still have an implicit this parameter. (A more explicit lambda-introducer form would be [this], but we see little advantage in requiring extra syntax where none was required before.)

For functions at block scope, a lambda-introducer of the form [] indicates a local function without access to the containing function's local variables. A lambda-introducer of any other form indicates a named lambda.


int h(int b) {
    []m(int z)->int // local function
        { return x+z; } // b is not in scope
    [&]n(int z)->int // named lambda
        { return b+x+z; } // b is in scope
}
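
The scoping distinction can be approximated with closure objects alone, assuming a namespace-scope x as in the earlier examples. This is only a sketch: m and n here are variables initialized from lambda expressions, not the named declarations discussed in this section, but the capture rules are the same. With [] the parameter b cannot be used in the body; with [&] it can.

```cpp
int x = 0;  // namespace-scope variable, usable without any capture

int h(int b) {
    // Empty introducer: no access to b; only x and the parameter z are usable.
    auto m = [](int z) -> int { return x + z; };
    // Capturing introducer: b is captured by reference and usable in the body.
    auto n = [&](int z) -> int { return b + x + z; };
    return m(1) + n(1);
}
```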

With these new declarations, the names declared would decay to the appropriate type:

Note that the above decay results in slightly different semantics for the following two lines:


[]   f     (int z)->int { return x+z; }
auto f = [](int z)->int { return x+z; };

The former defines a function and will decay to a function pointer. The latter uses a lambda expression to initialize a variable containing a closure object and will not decay to a function pointer.
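
The difference can be observed with type traits. The sketch below is an assumption-laden illustration rather than part of the proposal: apply and demo are hypothetical names, and the closure captures a local variable. Later revisions of the lambda wording gave capture-less closures a conversion to function pointer, so a capturing lambda is used here to keep the contrast visible.

```cpp
#include <type_traits>

int x = 0;

// An ordinary function: its name decays to a function pointer.
int f(int z) { return x + z; }

// Hypothetical helper that accepts only a function pointer.
int apply(int (*fn)(int), int v) { return fn(v); }

int demo() {
    int b = 10;
    // A closure object with a capture: a class type, not a function.
    auto c = [b](int z) -> int { return b + x + z; };

    static_assert(std::is_same<std::decay<decltype(f)>::type, int (*)(int)>::value,
                  "a function decays to a function pointer");
    static_assert(!std::is_convertible<decltype(c), int (*)(int)>::value,
                  "a capturing closure does not convert to a function pointer");

    return apply(f, 1) + c(2);  // apply(c, 2) would not compile
}
```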

Finally, we come to the issue of compatibility with existing block-local function declarations. Such declarations refer to a function at namespace scope rather than to a function at block scope. Use of such semantics has always seemed inconsistent, and so we propose to make the new syntax always declare a block-local function. Thus, forward local function declarations are reasonable.


int h(int b) {
    []even(unsigned n)->bool;
    []odd(unsigned n)->bool;
    []even(unsigned n)->bool {
        if ( n == 0 ) return true;
        else return odd(n-1);
    }
    []odd(unsigned n)->bool {
        if ( n == 0 ) return false;
        else return even(n-1);
    }
}
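
The existing semantics that this section contrasts against can be seen in current C++, where a block-scope function declaration names a function of the enclosing namespace. In the sketch below, is_even is a hypothetical wrapper. The declaration inside its block does not create a local function, so the definitions have to appear at namespace scope.

```cpp
bool is_even(unsigned n) {
    // Block-scope declaration: refers to a namespace-scope function named even.
    bool even(unsigned);
    return even(n);
}

// The block-scope declaration above did not introduce a local function,
// so the mutually recursive pair is declared and defined at namespace scope.
bool odd(unsigned n);

bool even(unsigned n) {
    if (n == 0) return true;
    else return odd(n - 1);
}

bool odd(unsigned n) {
    if (n == 0) return false;
    else return even(n - 1);
}
```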

Proposed Wording

The proposed wording shows changes from working draft standard N2723.

3.3.1 Point of declaration [basic.scope.pdecl]

In paragraph 9, edit

[Note: friend declarations refer to functions or classes that are members of the nearest enclosing namespace, but they do not introduce new names into that namespace (7.3.1.2). Function declarations at block scope and object declarations with the extern specifier at block scope The following kinds of block scope declarations refer to declarations that are members of an enclosing namespace, but they do not introduce new names into that scope.

end note]

3.5 Program and linkage [basic.link]

In paragraph 6, edit

The name of a function declared in block scope without a simple-type-specifier of lambda-introducer, and the name of a function or an object declared by a block scope extern declaration, have linkage. If a declared function has no linkage, the program is ill-formed. If there is a visible declaration of an entity with linkage having the same name and type, ignoring entities declared outside the innermost enclosing namespace scope, the block scope declaration declares that same entity and receives the linkage of the previous declaration. If there is more than one such matching entity, the program is ill-formed. Otherwise, if no matching entity is found, the block scope entity receives external linkage.

7.1.6.2 Simple type specifiers [dcl.type.simple]

In paragraph 1, edit

The simple type specifiers are

simple-type-specifier:
::opt nested-name-specifieropt type-name
::opt nested-name-specifieropt template simple-template-id
char
char16_t
char32_t
wchar_t
bool
short
int
long
signed
unsigned
float
double
void
auto
lambda-introducer
decltype ( expression )

In paragraph 2, edit

The auto specifier is a placeholder for a type to be deduced (7.1.6.4). The lambda-introducer specifier is used only for a late-specified return type (8.3.5 [dcl.fct]), and shall have the form []. The other simple-type-specifiers specify either a previously-declared user-defined type or one of the fundamental types (3.9.1). Table 9 summarizes the valid combinations of simple-type-specifiers and the types they specify.

In table 9, edit

Table 9: simple-type-specifiers and the types they specify
Specifier(s)              Type
type-name                 the type named
char                      "char"
unsigned char             "unsigned char"
signed char               "signed char"
char16_t                  "char16_t"
char32_t                  "char32_t"
bool                      "bool"
unsigned                  "unsigned int"
unsigned int              "unsigned int"
signed                    "int"
signed int                "int"
int                       "int"
unsigned short int        "unsigned short int"
unsigned short            "unsigned short int"
unsigned long int         "unsigned long int"
unsigned long             "unsigned long int"
unsigned long long int    "unsigned long long int"
unsigned long long        "unsigned long long int"
signed long int           "long int"
signed long               "long int"
signed long long int      "long long int"
signed long long          "long long int"
long long int             "long long int"
long long                 "long long int"
long int                  "long int"
long                      "long int"
signed short int          "short int"
signed short              "short int"
short int                 "short int"
short                     "short int"
wchar_t                   "wchar_t"
float                     "float"
double                    "double"
long double               "long double"
void                      "void"
auto                      type to be deduced
lambda-introducer         the late-specified return type
decltype(expression)      the type as defined below

7.1.6.4 auto specifier [dcl.spec.auto]

Remove paragraph 1.

The auto type-specifier signifies that the type of an object being declared shall be deduced from its initializer or specified explicitly at the end of a function declarator.

Remove paragraph 2.

The auto type-specifier may appear with a function declarator with a late-specified return type (8.3.5) in any context where such a declarator is valid, and the use of auto is replaced by the type specified at the end of the declarator.

Edit paragraph 3.

Otherwise, the type of the object is deduced from its initializer. The auto type-specifier signifies that the type of an object being declared shall be deduced from its initializer. The name of the object being declared shall not appear in the initializer expression. The auto type-specifier is allowed when declaring objects in a block (6.3), in namespace scope (3.3.5), and in a for-init-statement (6.5.3). The decl-specifier-seq shall be followed by one or more init-declarators, each of which shall have a non-empty initializer of either of the following forms:

= assignment-expression
( assignment-expression )
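
Both initializer forms can be illustrated with a short sketch. The function and variable names are hypothetical and are not part of the wording.

```cpp
#include <type_traits>

int deduce_demo() {
    auto a = 1 + 2;   // "= assignment-expression" form: a is deduced as int
    auto b(2.5);      // "( assignment-expression )" form: b is deduced as double

    static_assert(std::is_same<decltype(a), int>::value, "a is int");
    static_assert(std::is_same<decltype(b), double>::value, "b is double");

    return a + static_cast<int>(b);
}
```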

8.3.5 Functions [dcl.fct]

In paragraph 2,

In a declaration T D where D has the form

D1 ( parameter-declaration-clause ) cv-qualifier-seqopt ref-qualifieropt exception-specificationopt -> type-id

and the type of the contained declarator-id in the declaration T D1 is "derived-declarator-type-list T," T shall be the single type-specifier auto lambda-introducer and the derived-declarator-type-list shall be empty. Then the The type of the declarator-id in D is "function of (parameter-declaration-clause) cv-qualifier-seqopt ref-qualifieropt returning type-id". Such a function type has a late-specified return type.

Note that N2761 may add attribute specifiers to the declaration form above. That edit does not conflict. Note that N2757 may remove "and the derived-declarator-type-list shall be empty" in the text above.

In paragraph 3,

The type-id in this form includes the longest possible sequence of abstract-declarators. [Note: This resolves the ambiguous binding of array and function declarators. [Example:

auto [] f()->int(*)[4]; // function returning a pointer to array[4] of int
// not function returning array[4] of pointer to int

end example] —end note]
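
The binding rule can be checked with a sketch written in the working draft's auto form. Here grid is a hypothetical array introduced only so that f has something to return.

```cpp
#include <type_traits>

int grid[4] = {1, 2, 3, 4};

// The type-id after -> is taken as a whole: pointer to array[4] of int.
auto f() -> int (*)[4] { return &grid; }

static_assert(std::is_same<decltype(f()), int (*)[4]>::value,
              "f returns a pointer to array[4] of int");
```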

Within paragraph 12, edit

[Note: typedefs and late-specified return types are sometimes convenient when the return type of a function is complex. For example, the function fpif above could have been declared


typedef int IFUNC(int);
IFUNC* fpif(int);

or


auto [] fpif(int)->int(*)(int)

A late-specified return type is most useful for a type that would be more complicated to specify before the declarator-id:


template <class T, class U>
auto [] add(T t, U u) -> decltype(t + u);

rather than


template <class T, class U>
decltype((*(T*)0) + (*(U*)0)) add(T t, U u);

end note]
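
As a sketch of why the late-specified form helps, the add template can be given a body and checked in the working draft's auto form. The body and the static_assert are assumptions added for illustration; the note itself only declares the function.

```cpp
#include <type_traits>

// The parameters t and u are in scope after ->, so decltype can name them.
template <class T, class U>
auto add(T t, U u) -> decltype(t + u) { return t + u; }

static_assert(std::is_same<decltype(add(1, 2.5)), double>::value,
              "int + double yields double");
```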