ballerina-platform / ballerina-spec

Ballerina Language and Platform Specifications

License: Other

ballerina-spec's Introduction

Ballerina Specifications

Ballerina is a new programming language for writing network distributed applications.

This repository will contain specifications relating to Ballerina, and will be used to manage issues relating to those specifications.

Currently, it contains only the Ballerina Language Specification. The XHTML source is in the lang subdirectory. The 2019R3 release can be read using the following URL:

https://htmlpreview.github.io/?https://raw.githubusercontent.com/ballerina-platform/ballerina-spec/v2019R3/lang/spec.html

The live version is here:

https://htmlpreview.github.io/?https://raw.githubusercontent.com/ballerina-platform/ballerina-spec/master/lang/spec.html

There is a separate list of specifications of language extensions.

There is a separate list of proposals for new language features.

ballerina-spec's Issues

Built-in annotations

The current implementation has a concept of built-in annotations, which can be used without a module qualifier and without importing a module.

The spec currently requires that an annotation tag must reference an annotation.

Cooperative multitasking

At the moment it is too easy for a programmer to unintentionally write programs that have data races.

One idea for fixing this is to make multitasking cooperative by default. This would mean that a new strand runs on the same thread as its starting strand unless the programmer explicitly uses a keyword on the worker declaration to indicate that it should run on a new thread.

Add concept of configurability to language

Add configurability as a language-level concept. The source code would declare something as configurable, and there would then be a standard way at runtime to supply a value for it in place of the default.

This should apply to at least:

initializers for module level variables
values of annotations on module level variables
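
As a rough sketch of what the first case could look like, the snippet below uses the `configurable` qualifier that later Ballerina releases accept on module-level variables. The issue itself does not propose this syntax; the variable name `port` and its default are made up for illustration.

```ballerina
import ballerina/io;

// Hypothetical configurable module-level variable.
// The initializer is the default; a value supplied at runtime
// (for example through a configuration file) replaces it.
configurable int port = 8080;

public function main() {
    // Prints the supplied value, or the default when none is given.
    io:println(port);
}
```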

Can we allow function-body-block in an arrow-function-expr?

I am aware that a function-body-block is allowed in anonymous function expressions and that only an expression is allowed in an arrow-function-expr.

public function main() {
    var f = foo();
    io:println(f.call(5));
}

function foo() returns function(int) returns int {
    return i => {
        int k = i * i;
        return randomFunc(k);
    };
}

At the moment, I need an anonymous function expression to support the above requirement, which is verbose IMO.

Allow record types to specify default values for fields

The basic idea is to allow fields in a record type to specify a default value.

record T {
  boolean isX = false;
}

There are a number of details that need to be worked out:

When does a default get applied?
What is allowed as the specification of the default value and how does it get evaluated?
Does it affect when a value belongs to a type?
How does it work with unions, for example, when one arm has a default and another does not?
How does this relate to the idea of optional fields? Can optional fields have defaults?
How does it relate to the distinction between type and type descriptor?
Does field access make use of this?
Do convert or stamp typedesc methods use this?
How does it work with the *T type reference feature?
How does it relate to the concept of an implicit initial value? Does this concept need revising to work with concept of default?
Does this also apply to object fields?
How, if at all, should default of function arguments be revised so as to be harmonious with record field defaulting? See issue #27. (This is what makes it potentially incompatible.)
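
For orientation on the first question, here is a minimal sketch of one possible answer: the default is applied when a mapping constructor omits the field. It is written in the `type T record { ... };` form used by later Ballerina releases, and the extra field `name` is an assumption added only to show a field without a default.

```ballerina
import ballerina/io;

type T record {
    // Field with a default value, as in the issue.
    boolean isX = false;
    // Hypothetical field with no default; it must be supplied.
    string name;
};

public function main() {
    // isX is omitted here, so its default is filled in at construction.
    T t = {name: "example"};
    io:println(t.isX); // prints false
}
```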

Clarify difference between types and type descriptors

The spec has not been sufficiently clear about the difference between types and type descriptors. Some points still need to be clarified:

concept that type descriptors get resolved as part of evaluation
inherent type is actually type descriptor
contextually expected type is actually type descriptor

Type descriptor for any function

For most basic types, it is trivial to write a type descriptor that covers all values of that basic type. But for functions it is not possible. It feels like something that ought to be possible.

For example, if an annotation tag can only be attached to a function, then we might want to say that it is allowed only for a typedesc whose parameter is a subtype of any function.

I suspect we will have syntax ambiguities if we allow function by itself as a keyword to represent a type (we don't have anywhere else where we allow braces optionally).

We could allow function returns T to be the type of all functions returning type T no matter what their parameters; then function returns any|error would cover all functions.

Don't want to use function<T> since that is likely to conflict with syntax for parameterized types.

Add block-level type declarations

Type declarations are currently allowed only at module level.

This was partly because we were thinking of type descriptors as compile time constants, and compile time constants are allowed only at module level. But in fact, type descriptors are not compile time constants, because local object type descriptors need to have methods which refer to local variables (this is important for services). This becomes even more important with annotations, which have values evaluated at runtime that attach to type descriptors.

We should therefore allow type declarations at block level also.

Allow LHS of assignment to be _

At the moment the grammar does not allow the LHS of an assignment to be _. This was not intended. _ is allowed as a binding pattern, but destructuring assignment only allows a structural binding pattern (so that it does not overlap with the regular form of assignment).

Specify which keywords are reserved in which contexts

The spec doesn't get into the details of what keywords are reserved in what contexts.

We have a lot of keywords and they shouldn't be reserved everywhere.

Add conceptual material similar to what was in 0.980

In the 0.980 specification, the Evaluation semantics section has some conceptual material, particularly relating to services. This was outdated by changes to the language design. It would be nice to have something similar in the spec.

Syntax to make type of floating point literals explicit

At the moment, a floating point literal is treated as float or decimal based on the contextually required type, as set by, for example, a type cast. It would be better to have a specific syntax so as to allow all numeric values to be serialised unambiguously as literals.

Therefore we will allow suffixes of d and f to indicate decimal and float literals respectively.

We would also get rid of type casts from simple-const-expr.
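
A short sketch of how the proposed suffixes would read in practice; the variable names are illustrative only.

```ballerina
import ballerina/io;

public function main() {
    // The f suffix marks the literal as a float.
    float ratio = 1.5f;
    // The d suffix marks the literal as a decimal.
    decimal price = 1.5d;
    io:println(ratio);
    io:println(price);
}
```

With the suffix present, the type of the literal no longer depends on the contextually expected type.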

Unary expression syntaxes are confusing

Description:
The current unary expression syntax is a bit confusing, because some users may mistake it for a decrement or increment operation. This is because of the grammar rule below:

expression
    |    (ADD | SUB | BIT_COMPLEMENT | NOT | UNTAINT) expression

This allows users to write unary operations such as the following:

function func1() {
    int a = 1;
    a = --a;
    a = ++a;
    a = -+a;
    a = ------+------a;
}

Can we make this a bit less confusing?

Generalize error constructors to mappings and lists

Problem:
Error constructors use a function-like syntax in what seems to be a rather ad hoc way.

Solution:
Generalize error constructor (with changes from #2) into a functional constructor, which would be useable for both construction and pattern matching.

functional-constructor-expr := named-type-reference ( arg-list )
named-type-reference := variable-reference | error

The named-type-reference specifies the contextually expected type. The interpretation of arg-list depends on the basic type referred to:

for error, the argument list has an optional positional argument specifying the error reason string, and a named argument for each field in the detail record
for mappings, the argument list has a named argument for each field
for lists, the argument list has a positional argument for each member

Note that new is restricted to behavioural types (matching the current impl).

Limit on function parameter defaults to compile-time constants

Function parameter defaults would be considerably more useful if they were expressions evaluated on each function call, particularly if those expressions can involve previous parameters or self.

For example, many of the string library functions could usefully default to the length of the string. Without this, you end up having to allow () in the type, so as to be used as default value.
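
A minimal sketch of the kind of signature this would enable, assuming defaults may be expressions that refer to earlier parameters. The function `slice` and its parameter names are hypothetical and not part of any library.

```ballerina
import ballerina/io;

// Hypothetical helper: endIndex defaults to the length of s,
// which is evaluated on each call rather than fixed at compile time.
function slice(string s, int startIndex = 0, int endIndex = s.length()) returns string {
    return s.substring(startIndex, endIndex);
}

public function main() {
    // endIndex is omitted, so it defaults to the length of the string.
    io:println(slice("ballerina", 4)); // prints erina
}
```

No `()` is needed in the parameter type to stand in for "not supplied".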

$$ not allowed by string template grammar

The string template grammar does not allow '$$' in a string unless it is immediately before an interpolation.

int value = 10;
string s = string `Price - $${value}`; // supported. `s` contains Price - $10
string s = string `Hello $$ world`; // unsupported

However, we can add multiple $ symbols using an interpolation.

string s = string `Hello ${"$$"} world`;

Easy way to panic if the result of an expression is an error

In some cases, the type of a function call or other expression allows an error, but the programmer does not expect that the error will occur (because of the function arguments or program state). Thus, it will be a program bug if an error does occur, contrary to expectations.

At the moment, the user would write something like:

T|error x = foo();
if x is error {
   panic x;
}
else {
   // x is T
}

Apart from being verbose, this has the disadvantage of not generating quite the right error. Ideally, the error should be that foo returned an error unexpectedly, with the error that foo returned being in the detail record as the cause field.

One solution is to add an operator to deal with this case explicitly, which would be similar to check, but instead of returning it would panic. So the above example would be written

T x = checkpanic foo();

The keyword checkpanic is not very elegant, but it is a feature that it sticks out a bit.
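
To make the comparison concrete, here is a small self-contained sketch of the proposed operator. The function `parse` is a made-up stand-in for `foo`; it wraps `int:fromString`, which returns `int|error`.

```ballerina
import ballerina/io;

// Hypothetical function whose type allows an error.
function parse(string s) returns int|error {
    return int:fromString(s);
}

public function main() {
    // The argument is known to be a valid integer, so an error here
    // would be a program bug; checkpanic panics instead of returning it.
    int n = checkpanic parse("42");
    io:println(n); // prints 42
}
```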

Program startup/shutdown

It is not yet clear whether we have the right semantics here.

In particular, the fact that services in all imported modules are started seems strange, and not very consistent with how main is handled. What does it mean for a module import to be unused if the way to use a service is to import it?

There have also been requests from the Cellery team to restore multiple entry point capability.

Inferred error type for named workers

Use error<*> to mean an error, where the exact type of error is to be inferred (as with var). This is useful with workers where the rules on error propagation are necessarily quite subtle. This follows the inferred array type size syntax.

Specify lang.string functions

Description
The string basic type should define some convenient methods.

Possible methods

length() returns int;
isEmpty() returns boolean;
codePointAt(int) returns int;
charAt(int) returns string; // string of length 1
substring(int startIndex, int endIndex) returns string;
indexOf(string) returns int;
lastIndexOf(string) returns int;
contains(string) returns boolean;
startsWith(string) returns boolean;
endsWith(string) returns boolean;
replace(string fromString, string newSubstr) returns string;
replaceFirst(string
translate(string fromChars, string toChars) returns string;
trim() returns string; // what counts as whitespace? “ \r\n\t”
toLowerAscii() returns string; // converts A-Z to a-z, leaving other chars unchanged
toUpperAscii() returns string; // converts a-z to A-Z, leaving other chars unchanged
codePointArray() returns int[];

Functionality that is not appropriate for a method should go in the stdlib lang.string module.

Hyperlink grammar

Generate HTML in which each reference to a production is linked to the rule that defines it.

Deprecation design

Design how deprecation should work.

We want to be able to smoothly introduce new library modules.

This might use an annotation, and so may be affected by issue #17.

Positional parameters cannot have defaults

At the moment the spec does not allow positional parameters to have a default and so be optional. This is a bit limiting for API design, especially given that function overloading is not allowed.

Lang.* functions on containers to support functional-style programming

These are methods that have a parameter that is of type function.

Make range syntax work for list and string slices

We have a range syntax M ... N and M ..< N, which is used for iteration over lists and strings. It would be nice if the same syntax could work for slicing. Ideally, we want slices to be both lvalues and rvalues, so you could say something like:

v = w[1 ..<w.length()];
v[1 ..< 3] = w;

Currently, range syntax works as an expression returning a list of integers. However, if we are going to use it for slices also, we might need an alternative approach by, for example, having it return an iterable object.

Change to new closed record syntax in patterns in match statement

The spec has changed from { X; !...} to {| X |} syntax for closed records in types and binding patterns. This change also needs to be made for patterns in a match statement.

Say to comment via github issues

Spec should say to submit comments via github issues rather than ballerina-dev.

Repo license is wrong

The spec license is not Apache but rather

Creative Commons Attribution-NoDerivatives 4.0 International

https://creativecommons.org/licenses/by-nd/4.0/

Update specification of experimental features to better match implementation

Main features are:

Lock
Querying
1. Table queries
2. Streaming queries
Transactions

Type-safe function apply

Should be possible to apply a function value to a container representing an argument list in a type-safe way. This is probably just a generalization of the semantics of ...x.

This is for when the type of the function is known at compile time. There is also a need for a more dynamic, non type-safe version, for when the type is not known.

This relates to #27.

Add start expression

The start expression (or is it an action?) is implemented but is not in the spec.

The spec should define it.

Semantics are to start a function in its own strand and return a future.
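
A minimal sketch of those semantics as a usage example; the function `add` is made up for illustration.

```ballerina
import ballerina/io;

function add(int a, int b) returns int => a + b;

public function main() {
    // start runs add on its own strand and immediately yields a future.
    future<int> f = start add(1, 2);
    // Waiting on the future gives the function's result, or an error
    // if the strand did not complete normally.
    int|error result = wait f;
    io:println(result);
}
```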

Change async send from statement to action

At the moment, async send message -> worker is a statement, whereas sync send message ->> worker is an action. It is weird for constructs with very similar functionality to be in fundamentally different syntactic categories.

It would work better to make async send be an action that always returns nil.

Distinguish graceful and immediate shutdown in Listener interface

Should listener interface be changed to support both graceful shutdown and immediate shutdown, and, if so, how?

A binding pattern of _ should not match errors

Currently a binding pattern of _ always matches, meaning that it matches errors also. This makes it far too easy to ignore errors by doing

_ = someFunctionThatMightFail();

In particular, this is easier than doing what is usually the right thing, which is

check someFunctionThatMightFail();

We will fix this by saying that a binding pattern of _ only matches a value that belongs to type any.

Add table of contents

We need at least a minimal ToC for public release.

Auto-number sections and generate full ToC

Sections at all levels should be automatically numbered, with a table of contents generated from them.

There should be <section> elements wrapped around sections.

Allow splice in an array constructor

Allow ...x in an array constructor in a similar way to how it is allowed in a function call.

It would be the same as specifying each member of the list individually.

Would it also make sense in a tuple constructor?
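
A sketch of the proposed form, assuming a Ballerina version that accepts a spread member in a list constructor; the variable names are illustrative.

```ballerina
import ballerina/io;

public function main() {
    int[] x = [2, 3];
    // ...x stands for the members of x, listed individually.
    int[] y = [1, ...x, 4];
    io:println(y);
}
```

Under the proposal, `y` would be the same list as the one written `[1, 2, 3, 4]`.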

Resource method signatures and external resource methods

Description:

The current grammar allows us to have resource methods with an external function body. This brought up the following question.

Normally, the result of a function or a (remote) method is represented by the return value. In the external case, the function signature is good enough to understand the function/method semantics without knowing the implementation.

But resource methods are special. The result of a resource method will be handled using an action invocation. (in WebSocket that can be multiple action invocations). The return value is used to represent the end of the resource or error situation.

Technically we should be able to achieve the same with an external implementation. Conceptually, however, resource signatures are not intuitive enough to understand the resource semantics in the external case.

We have the following options.

Not allowing external resource methods. - which is inconsistent with other methods.
Fix resource signature.
Allow external resource methods - considering this is a rare case.

@jclark @sanjiva Thoughts?

Related Issues:
N/A

Evaluation order of module-level initializers

Spec should define the order in which module-level initializer expressions are evaluated. This makes a difference since they may have side-effects. The goal should be to move things around as little as necessary to ensure that the variables that an initializer depends on are initialized before the initializer is evaluated.

Killing a worker

It would be useful to be able to kill a worker if it's known that its results will not be used. However, given that workers can modify global state, killing a worker off at an arbitrary point might lead to global data being left in an inconsistent state.

Java threads have a non-preemptive killing mechanism (which it calls interrupt), which looks promising. The idea is that a strand would have a cancelled flag. Futures would have a builtin method cancel: f.cancel() would set the cancelled flag to true. A worker would check its cancelled flag at specific points (e.g. message send/receive and blocking IO), and would panic if the flag was set.

Declarative data transformations

We need a declarative level way to describe transformations in terms of a data flow graph, so that we can provide a graphical interface.

Consider whether this can handle XProc use case.

Particular simple cases

Parse strings into structures
1. DateTime/Duration
2. Binary
3. Xml
Hooks to call user-defined functions for arbitrary stuff

Easy way to create an array whose size is not a compile time constant

It's common to want to create an array where the size is known only at runtime.

In Java or C++, you can do new T[expr].

Rust has a syntax [expr; N] which creates an array of length N by cloning the result of evaluating expr.

Maybe we should have something like this in Ballerina. This relates to what the initial value for array members is and whether it can be specified.

Revised design for table type

There's a discussion here:

https://groups.google.com/forum/#!msg/ballerina-dev/b4GM_sGXA64/GJV75EaCGQAJ

Float/Decimal literal discriminator (f/d) conflicts with hex literals

Description:

decimal d = 0x22.Fd;

Interpreting the final letter 'd' seems ambiguous, as it could be the hex digit 'D' or the decimal discriminator 'd'.

Annotations on services

Annotations on services work by having each evaluation of a service constructor create a new type descriptor to which the annotations can be attached. This needs to be explained in the spec.

Depends on issue #11

Add lang.* methods for list and mapping

Add built-in methods for lists and mappings.

The methods on lists should be harmonious with those on strings.

Methods to support a functional style of programming (i.e. methods that take an argument of type function) are a separate issue #23.

Consider revising function argument defaulting

We should consider whether and, if so, how the design of function argument defaulting should be revised so as to

align well with record field defaulting #22
fix useability limitations of current design #25, #26

How should json type deal with Ballerina's multiple numeric types?

The problem is how the json type should deal with the existence in Ballerina of multiple numeric types, in particular both decimal and float.

We believe that different applications will want to use different Ballerina numeric types to represent numbers in JSON; the Ballerina json type should therefore allow control over which numeric types are used; this can be done by making the numeric type a parameter to json.

Thus json<decimal> would mean that JSON numbers are represented by Ballerina decimals, and json<float> would mean that JSON numbers are represented by Ballerina floats; json without a parameter would mean json<float|decimal|int>, and so have the same semantics as now.

The json parser, which converts from bytes/characters into Ballerina values, would be parameterized in the numeric type. If a resource function declares that a parameter is, for example, json<decimal>, the parser would get supplied with a parameter that tells it to convert JSON number syntax into Ballerina decimal values.

Specify object subtyping

The spec needs to specify how object subtyping works.

This will almost certainly depend on function subtyping #28.

Can we define this in terms of shapes? (ie type denotes set of shapes, and subtype means subset of denoted set of shapes)

It would be intuitive if every object type was a subtype of object {}, since then we would not need any special way to write a type corresponding to the object basic type.

Privacy

One significant complexity that relates only to objects is privacy. We need to be clear on what private means. There are two possible interpretations:

access only via self (private to this object)
access only by methods that belong to same object type descriptor (private to the object type)

Approach A

One possible approach is as follows. An object type S is a subtype of an object type T provided that the only differences between S and T are the following:

the order of declaration of fields and methods may differ between S and T
a method may be extern in one and not in the other
method bodies in S and T may be different
the function type of a method m in S may be a subtype of the function type of m in T
the type of a field f in S may be a subtype of the type of field f in T
S may have a method m, where T does not have any method m
S may have a field f, where T does not have any field f
the visibility (public/module/private) of a field or method x in S may be greater than the visibility of the corresponding field or method x in T (where public > module > private)

Approach B

An alternative approach is that private aspects of a value do not affect its type.

same
same
same
same
same
S may have a method m, where T does not have any method m or has a private method m
S may have a field f, where T does not have any field f or has a private field f
S and T may have different private fields and methods
a field or method x in S may be public and the field or method x in T may have module visibility

This only works with the first interpretation of private.

Specify function subtyping

The spec does not specify exactly how function subtyping should work.

Basic principle is that functions should be covariant in return type and contravariant in parameter types. Default values should not affect subtyping.
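
A small sketch of what that principle permits, using a made-up function `describe`. Its parameter type is wider than the target's (contravariance) and its return type is narrower (covariance), so the assignment would be allowed.

```ballerina
import ballerina/io;

// Accepts int|string, a supertype of int; returns string,
// a subtype of string|error.
function describe(int|string x) returns string => x.toString();

public function main() {
    // The target type takes only int and may return string|error.
    function (int) returns (string|error) f = describe;
    string|error r = f(7);
    io:println(r);
}
```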

Ideally we want to define this in terms of shapes (ie type denotes set of shapes, and subtype means subset of denoted set of shapes).

Relates to #27