5.4 Module Representation
5.4.1 Representation
5.4.1.1 Single Modules
Modules are represented as JavaScript object literals. Aside from the theModule field, which contains the compiled code of the module, they follow a JSON-structured schema. The format has several design goals:
- It should not require any free or global JavaScript identifiers beyond chapter 15 of the ES5 spec, allowing the compiler and runtime system to parameterize the module by its context.
It should be reasonable for a developer to write by hand, so that modules in pure JavaScript can seamlessly co-exist with compiled Pyret modules.
It should be simple for the compiler to produce and consume.
It should contain the information necessary for its dependents to be statically checked, without consulting the original source program.
A module, whether compiled or handwritten, has the following form:
module := { |
"requires": [<require>, ...], |
"provides": <provides>, |
"nativeRequires": [<nativeRequire>, ...], |
"theModule": <moduleFunction> |
} |
|
require := |
| { "import-type": "builtin", "name": <string> } |
| { "import-type": "dependency", "protocol": <string>, "args": [<string>, ...] } |
|
nativeRequire := |
| <string> |
|
provides := |
{ |
aliases: { <name>: <type>, ... }, |
values: { <name>: <type>, ... }, |
datatypes: { <name>: <type>, ... } |
// <type>s in shorthands cannot use shorthands as types |
// (described below) |
shorthands: { <name>: <type>, ... }, |
} |
|
prim-type := |
| "tany" | "Number" | "String" | "Boolean" | "Any" | "Nothing" |
|
|
type := |
| <prim-type> |
| <type-full> |
| <type-array> |
| <string-defined-in-shorthands> |
|
type-full := |
| { tag: "any" } |
| { tag: "name", origin: <require>, name: <string> } |
| { tag: "forall", args: [<string>, ...], onto: <type> } |
| { tag: "arrow", args: [<type>, ...], ret: <type> } |
| { tag: "tyapp", onto: <type>, args: [<type>, ...] } |
| { tag: "tyvar", name: <string> } |
| { tag: "record", fields: { <name>: <type> }, ... } |
| { tag: "data", |
name: <string>, |
params: [<string>, ...], |
variants: [<variant-full>, ...], |
methods: { <name>: <type>, ... } |
} |
|
variant-full := |
| { tag: "variant", |
name: <string>, |
vmembers: [<vmember-full>, ...] |
} |
| { tag: "singleton-variant", name: <string> } |
|
vmember-full := |
| { tag: "variant-member", name: <string>, kind: <variant-kind>, typ: <type> } |
|
variant-kind := |
| "normal" | "ref" |
|
type-array := |
| ["Array", <type>] |
| ["RawArray", <type>] |
| ["Option", <type>] |
| ["List", <type>] |
# type of args, resulting constructed type |
| ["Maker", <type>, <type>] |
| ["arrow", [<type>, ...], <type>] |
| ["data", |
<string>, |
[<string>, ...], |
[<variant-array>, ...], |
{ <name>: <type>, ... } |
] |
| ["tid", <string>] |
| ["forall", [<string>, ...], <type>] |
| ["local", <string>] |
| ["record", { <name>: <type>, ... }] |
| ["tyapp", <type>, [<type>, ...]] |
|
variant-array := |
| [<string>] |
| [<string>, [<vmember-array>, ...]] |
|
vmember-array := |
| [<string>, <type>] |
| ["ref", <string>, <type>] |
|
moduleFunction := |
| function(runtime, namespace, uri, <id>, ..., <id>, ...) { |
// compiled or handwritten JavaScript code |
} |
|
The first three fields—requires, provides, and nativeRequires—hold static information about the modules dependencies and exports.
requires
The requires field holds the compiled equivalent of an import line. This includes the kind of import, and any parameters that are part of the import statement. For example, the import line import file("./lib/helpers.arr") as H would show up in the compiled code as
{ "import-type": "dependency", |
"protocol": "file", |
"args": ["./lib/helpers.arr"] } |
Builtin imports, like lists and sets, have an import-type of "builtin":
{ "import-type": "builtin", name: "lists" } |
Note that the require can be generated from an import line without any special context information. For example, in the example above, the path is not resolved to an absolute path. This happens later in [REF]. This decision in large part supports the goal of handwritten modules, where it would be onerous to fill in absolute paths and keep track of them.
provides
provides describe the types exported from a module. This includes:
The types of exported values. So, for example, a program that defines
x :: Number = 22
would have the following in its in its compiled provides.values:
x: "Number"
The types of exported aliases. For example, a program that defines
type Point = { x :: Number, y :: Number }
would have the following in its compiled provides.aliases:
Point: {
tag: "record",
fields: { x: "Number", y: "Number" }
}
Any exported datatypes. For example, a program that defines
data Point:
| point(x :: Number, y :: Number)
end
would have the following in its compiled provides.datatypes:
Point: {
tag: "data",
name: "Point",
params: [],
variants: [
{
tag: "variant",
name: "point",
vmembers: [
{
tag: "variant-member",
kind: "normal",
name: "x"
typ: "Number"
},
{
tag: "variant-member",
kind: "normal",
name: "y"
typ: "Number"
}
]
}
],
methods: {}
}
Writing out all of the types fully, with tag and so on, is quite a bit of typing for handwritten modules. So these types can also be specified in an array notation, where the first element of the array is typically a string indicating the tag, and the rest of the array describes the type positionally. So, for example, the datatype for Point could be written:
Point: ["data", |
"Point", |
[], |
[ |
["variant", "point", [["x", "Number"], ["y", "Number"]]] |
], |
{} |
] |
Both styles are fully supported and can be interchanged.
Since often modules refer to the same type many times, it can be painful to write out the same type over and over in a handwritten specification. The provides declaration also allows specification of shorthands, which are not new types exported by the module, but rather shortcuts for writing out its types. For example, a module that implements dictionaries from a key type K to a value type V will likely use a type like this repeatedly:
["tyapp", ["local", "Dict"], [["tid", "K"], ["tid", "V"]]] |
That is, the locally-defined (within this module) type Dict, parameterized by two type variables. Instead of writing:
{ |
values: { |
"new-dict": ["forall", ["K", "V"], ["arrow", [], |
["tyapp", ["local", "Dict"], [["tid", "K"], ["tid", "V"]]]]], |
"set": ["forall", ["K", "V"], ["arrow", |
[ |
["tyapp", ["local", "Dict"], [["tid", "K"], ["tid", "V"]]] |
["tid", "K"], |
["tid", "V"] |
], |
["tyapp", ["local", "Dict"], [["tid", "K"], ["tid", "V"]]]]], |
"get": ["forall", ["K", "V"], ["arrow", |
[ |
["tyapp", ["local", "Dict"], [["tid", "K"], ["tid", "V"]]] |
["tid", "K"], |
], |
["Option", "V"]]] |
} |
// aliases and types and so on |
} |
It’s easier to define:
{ |
shorthands: { |
dOfKV: ["tyapp", ["local", "Dict"], [["tid", "K"], ["tid", "V"]]] |
}, |
values: { |
"new-dict": ["forall", ["K", "V"], ["arrow", [], dofKV]], |
"set": ["forall", ["K", "V"], ["arrow", |
[ dOfKV, ["tid", "K"], ["tid", "V"] ], |
dOfKV]], |
"get": ["forall", ["K", "V"], ["arrow", |
[ dOfKV, ["tid", "K"], ], |
["Option", "V"]]] |
} |
// aliases and types and so on |
} |
There are several examples of the uses of these declarations in [REF].
Some "shorthands with options" are predefined, namely Option, Array, RawArray, List, and Maker. The first four of these are straightforward, single-argument type constructors. The last one describes the type of list in [list: ...], namely, the type of the object whose fields allow for the construction of composite values. Makers accept two type arguments: the type of the ... arguments in the constructor notation, and the resulting type of the constructed value.
nativeRequires
Describe dependencies of the module that are not Pyret-based. The strings in nativeRequires are processed not by Pyret’s module loading system, but by a (configurable) use of RequireJS. This is discussed later in [REF].
Pyret distinguishes nativeRequires for several reasons. In some contexts, like running on Node, there needs to be some mechanism for accessing system libraries like fs for filesystem access. In addition, there are numerous JavaScript libraries implemented in RequireJS format, and it’s useful to have a way for handwritten Pyret modules to import and use them directly. To avoid using global scope or other mechanisms, the runtime uses RequireJS as a standard way to locate and load these modules.
Of course, this also assumes that code is run within a sandbox so it cannot simply eval its way to arbitrary behavior. While Pyret doesn’t currently run within something like Caja [REF] when evaled, it is a long-term goal.
theModule
The final field, theModule, holds a function that implements the module’s behavior, and constructs the values that it provides. Its arguments have a particular shape:
runtime – the first argument is the current runtime (as described in The Pyret Runtime). This is the entrypoint for most interesting built-in Pyret behavior, and used pervasively in compiled and handwritten modules alike.
namespace – the second argument is a dictionary object, called a namespace [REF], that holds the mappings for global identifiers and types available in the module. This is seldom useful in handwritten code; its main use is in modules at the REPL that have an interesting and changing set of globally-available names.
uri – the third argument is the URI of the module, as a string. Since the same code could be loaded for different purposes (e.g. a Google Drive module loaded both from a shared import and a path import), a module does not store its URI at compile time. The URI is provided when the module is instantiated, which can be used for logging, reporting error messages, and other unique module identification purposes.
requires ids – After uri, there should be a number of identifiers equal to the number of requires listed. These will hold the module objects [REF] for the specified dependencies when the module is loaded.
nativeRequires ids – After the requires ids, there should be a number of identifiers equal to the number of nativeRequires listed. These will hold the values returned from using RequireJS on the native dependencies when the module is loaded.
5.4.1.2 JavaScript Interop Example (node)
In command-line Pyret, the module import form js-file will look for a (relative) path to a JavaScript file in the JavaScript module format, and load it. Here is an example:
$ ls |
lib.js test.arr |
$ cat lib.js |
({ |
requires: [], |
nativeRequires: [], |
provides: { |
values: { "from-a-library": "String" } |
}, |
theModule: function(runtime, _, uri) { |
return runtime.makeModuleReturn({ |
"from-a-library": "I'm from a library!" |
}, {}); |
} |
}) |
$ cat test.arr |
import js-file("lib") as L |
|
print(L.from-a-library + "\n") |
⤇ pyret -q test.arr |
I'm from a library! |
The program didn't define any tests. |
5.4.1.3 Complete Programs
Modules as described in Single Modules lack the necessary information and context to run – their dependencies must still be provided, most crucially, and the runtime needs to know in which order to run them.
To this end, Pyret also specifies a format for complete programs, which contains all the information needed to run a program, given a runtime and an implementation of RequireJS. Running such a complete program, which can be done in several ways, is discussed in [REF]. This section lays out and motivates its structure. This structure is not intended to be written by hand.
program := { |
staticModules: <staticModules>, |
depMap: <depmap>, |
toLoad: [<uri>, ...], |
} |
|
depmap := { <uri>: { <dependency> : <uri>, ... }, ... } |
|
staticModules := { <uri>: <module>, ... } |
|
dependency := string encoding of <require> |
|
module := as above |
The dictionary of staticModules maps from uri to module structures as described in Single Modules. This includes all the Pyret-based modules and code that the program will use. It’s worth noting that the information in the provides block is (potentially) extraneous if the only goal is to run the program. However, if compiled modules are to provide enough information to e.g. type-check code that is linked against them in the future, it’s worth keeping this static information around.
The depmap indicates, for each listed require dependency, which module should be used to satisfy it. This is indicated by mapping from a string representation of the require to the URI for the appropriate module. The string encoding is straightforward, and creates a string that looks much like the original import line. For example, a require like:
{ "import-type": "dependency", "protocol": "file", "args": ["./lib/helpers.arr"] } |
would appear encoded as
file(./lib/helpers.arr) |
The toLoad list indicates the order in which the modules should be loaded. It should always be a valid topological sort of the graph implicit in depmap. In that sense, it’s not strictly necessary information, but it makes running a generated program much more straightforward, since its clear in which order to instantiate modules. This also makes it easy to determine the main entrypoint for the program, which is the last module indicated in the toLoad list. That is, the modules leading up to the last one are exactly its (transitive) dependencies, and run in order to create their exports, which will be used later in the toLoad list to instantiate further modules.
Concretely, the first few modules in the toLoad list are typically builtins, like lists and error, required for just about every program. Increasing indices in the toLoad list tend towards user-implemented code until finally reaching the main module that the user requested be compiled.