Indexed Substance Statements

The Substance language allows users to define indexed statements that expand into multiple statements. An indexed statement is a single Substance statement (as described above) with templated identifiers and an indexing clause. A common example is declaring an indexed set of variables:

substance
Vector v_i for i in [0, 2]
-- is equivalent to Vector v_0, v_1, v_2

Templated identifiers (like v_i in the example) are regular Substance identifiers consisting of alphanumeric characters and underscores, except that the last underscore and the substring following it denote an index variable. In the example, _i in v_i denotes an index variable i that takes on the integer values in the range [0, 2], i.e. 0, 1, and 2. This line of code is then expanded into three statements: Vector v_0, Vector v_1, and Vector v_2.
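
As a further illustration of the last-underscore rule, here is a minimal sketch. It assumes the Vector type from the example above; the identifier my_vec_k is hypothetical and chosen only to show an identifier with more than one underscore.

```substance
Vector my_vec_k for k in [1, 3]
  -- only the last underscore introduces the index variable `k`;
  -- the prefix `my_vec` is kept verbatim
  -- expected expansion: Vector my_vec_1, my_vec_2, my_vec_3
```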

INFO

The phrase i in [x, y], for some x and y, requires that i is an integer with x ≤ i ≤ y. As such, a statement like Vector v_i for i in [3, 0] has no effect, since no integer i is at least 3 and at most 0.
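
The boundary cases of this rule can be sketched as follows, assuming the same Vector type as above. The identifiers a_i and b_i are made up for illustration.

```substance
Vector a_i for i in [2, 2]
  -- both bounds are inclusive, so a single-value range
  -- expands into exactly one statement: Vector a_2

Vector b_i for i in [3, 0]
  -- no integer satisfies 3 <= i <= 0, so nothing is declared
```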

When templated identifiers collide with regular identifiers, the regular identifiers are shadowed within indexed statements:

substance
Vector v_i, v_j, vec1

Vector v_i for i in [0, 10]
  -- ok: expands into v_0, v_1, v_2, ..., v_10

Vector v_j for i in [0, 1]
  -- error: the range of `j` is not defined by `i in [0, 1]`
  -- notice that `v_j` here does not refer to the `v_j` defined above
  -- since `v_j` occurs in an indexed statement.

Orthogonal(v_i, vec1) for i in [0, 3]
  -- ok: expands into Orthogonal(v_0, vec1); Orthogonal(v_1, vec1);
  --     Orthogonal(v_2, vec1); Orthogonal(v_3, vec1)
  -- notice that `vec1` does not have an underscore, so it is treated
  -- as a regular substance variable

Orthogonal(v_i, v_j) for i in [0, 5]
  -- error: the range of `j` is not defined by `i in [0, 5]`

When multiple index variables and ranges are present, Substance takes all combinations (the Cartesian product) of the indices in the ranges:

substance
Orthogonal(v_i, v_j) for i in [0, 1], j in [1, 2]
  -- expands into Orthogonal(v_0, v_1); Orthogonal(v_0, v_2);
  -- Orthogonal(v_1, v_1); Orthogonal(v_1, v_2)

Conditional Filtering

Sometimes we do not want to iterate through all possible combinations, since some of them are undesirable. Substance allows users to filter the combinations with a Boolean expression in a where clause: every combination that makes the Boolean expression false is discarded.

substance
Vector v_i for i in [0, 10] where i % 2 == 0
  -- even indices: 0, 2, 4, 6, 8, 10

Orthogonal(v_i, v_j) for i in [0, 2], j in [0, 2] where i <= j
  -- triangular range: [0, 0], [0, 1], [0, 2], [1, 1], [1, 2], [2, 2]

Orthogonal(v_i, v_j) for i in [0, 3], j in [0, 3] where i + 1 == j
  -- consecutive pairs: [0, 1], [1, 2], [2, 3]

Edge(v_i, v_j) for i in [0, 4], j in [0, 4] where j == (i + 1) mod 5
  -- cyclic pairs: [0, 1], [1, 2], [2, 3], [3, 4], [4, 0]

Orthogonal(v_i, v_j) for i in [0, 3], j in [0, 3] where i % 2 == 0 && j == i + 1
  -- disjoint pairs of 2: [0, 1], [2, 3]

Specifically, the Boolean expressions may contain:

  • Boolean constants (true and false),
  • Unary logical operator (! for logical-not) followed by a Boolean expression,
  • Binary logical operators (&& for logical-and and || for logical-or) between Boolean expressions, and
  • Numerical comparisons (== for equality, != for inequality, < for less-than, > for greater-than, <= for less-than-or-equal-to, and >= for greater-than-or-equal-to) between numerical expressions.

Numerical expressions may contain:

  • Floating-point constants,
  • Index variables defined in the ranges (for example, i when the indexing clause is for i in [0, 2]),
  • The unary numerical operator (- for negation) followed by a numerical expression, and
  • Binary numerical operators (+ for plus, - for minus, * for multiplication, / for division, either % or mod for modulo, and ^ for power) between two numerical expressions.
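
Putting the two lists together, the following sketch combines logical and numerical operators in where clauses. It assumes the Vector type and the Orthogonal predicate used in the earlier examples; the identifiers u_i and w_i are hypothetical.

```substance
Vector u_i for i in [0, 2]

Orthogonal(u_i, u_j) for i in [0, 2], j in [0, 2] where i != j && i * j == 0
  -- distinct indices, at least one of which is 0:
  -- [0, 1], [0, 2], [1, 0], [2, 0]

Vector w_i for i in [0, 6] where i % 3 == 0 || i == 5
  -- multiples of 3, plus 5: w_0, w_3, w_5, w_6
```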

WARNING

Because of an internal tokenizer bug, expressions like +1 are always parsed as one single token +1 denoting the integer "positive one", instead of two tokens + and 1, regardless of their locations in the program. Similarly, -1 is always parsed as a single -1 instead of - and 1.

As such, expressions like 2+1 are always interpreted as 2 and +1 instead of the expected 2, +, and 1; the same occurs for 2-1 which is interpreted as 2 and -1 instead of the expected 2, -, and 1. In other words, they are interpreted as two numbers side-by-side instead of a number, an operator, and another number. This bug causes errors like

error
Error: Syntax error at line 1 col 39:

  Node n0_i for i in [0,15] where i == 2+1
                                        ^
Unexpected int_literal token: "+1".

This happens because the + operator is absorbed into the token +1, so the parser sees two adjacent numbers and cannot find an operator between them.

This bug has been documented here.

The workaround for this bug is to always put spaces around the + and - operators, writing expressions like 2 + 1 and n - 1, unless signed numbers like -1 and +3 are specifically required.
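
A minimal sketch of the workaround, reusing the Node type and the n0_i identifier from the error message above (Node is assumed to be declared in the accompanying Domain; m_i is a hypothetical identifier):

```substance
Node n0_i for i in [0, 15] where i == 2 + 1
  -- spaces around `+` keep it a separate operator token,
  -- so the filter keeps only i = 3: Node n0_3

Node m_i for i in [0, 15] where i - 1 == 2
  -- likewise for `-`: only i = 3 passes, giving Node m_3
```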

The default order of operations follows the usual conventions of other programming languages, and parentheses can be used to override it.
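
For instance, assuming the conventional precedence in which multiplication binds tighter than addition, parentheses change which index passes the filter. The identifiers p_i and q_i below are hypothetical, and the Vector type is assumed from the earlier examples.

```substance
Vector p_i for i in [0, 4] where i + 1 * 2 == 4
  -- read as i + (1 * 2) == 4, so only i = 2 passes: Vector p_2

Vector q_i for i in [0, 4] where (i + 1) * 2 == 4
  -- parentheses force the addition first, so only i = 1 passes: Vector q_1
```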

Duplications

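An indexed statement generates a list of Substance statements, and the existing rules on duplicates apply to the generated statements as well: as the examples below show, re-declaring an object is an error, whereas repeating a predicate is allowed.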

substance
Vector v_0

Vector v_i for i in [0, 2]
  -- error: `v_0` is declared and cannot be re-declared

Vector v_i, v_j for i in [0, 2], j in [0, 2]
  -- error: `v_0` is declared and cannot be re-declared

Orthogonal(v_0, v_1)

Orthogonal(v_i, v_j) for i in [0, 2], j in [0, 2] where i != j
  -- ok, because duplicated predicates are ok
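
One way to sidestep such a re-declaration error, sketched here using only the where clause described earlier, is to filter out the index that is already taken:

```substance
Vector v_0

Vector v_i for i in [0, 2] where i != 0
  -- skips i = 0, so only v_1 and v_2 are declared
  -- and `v_0` is not re-declared
```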

Accessing Individual Elements of an Indexed Set

An indexed statement generates identifiers by replacing each index variable with the string form of its integer value. Generated identifiers can therefore be used just like regular identifiers, as long as they exist:

substance
Vector v_i for i in [0, 2]
LinearlyDependent(v_0, v_2) -- ok
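
Conversely, an identifier outside the generated range does not exist and is expected to be rejected like any other undeclared name. A sketch, assuming the same Vector type and LinearlyDependent predicate:

```substance
Vector v_i for i in [0, 2]
LinearlyDependent(v_1, v_2) -- ok: both identifiers were generated
LinearlyDependent(v_0, v_3) -- not ok: `v_3` was never generated by the range [0, 2]
```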

Released under the MIT License.