SCM uses its type representations to speed evaluation. All of the
subr types (see Subr Cells) are tc7 types. Since the
tc7 field is in the low order bit position of the CAR it
can be retrieved and dispatched on quickly by dereferencing the SCM
pointer pointing to it and masking the result.
All the SCM Special Forms get translated to immediate symbols
(isym) the first time they are encountered by the interpreter
(ceval). The representation of these immediate symbols is
engineered to occupy the same bits as tc7. All the isyms
occur only in the CAR of lists.
If the CAR of a expression to evaluate is not immediate, then it
may be a symbol. If so, the first time it is encountered it will be
converted to an immediate type ILOC or GLOC
(see Immediates). The codes for ILOC and GLOC lower 7
bits distinguish them from all the other types we have discussed.
Once it has determined that the expression to evaluate is not immediate,
ceval need only retrieve and dispatch on the low order 7 bits of
the CAR of that cell, regardless of whether that cell is a
closure, header, or subr, or a cons containing ILOC or
GLOC.
In order to be able to convert a SCM symbol pointer to an immediate ILOC
or GLOC, the evaluator must be holding the pointer to the list in which
that symbol pointer occurs. Turning this requirement to an advantage,
ceval does not recursively call itself to evaluate symbols in
lists; It instead calls the macro EVALCAR. EVALCAR does
symbol lookup and memoization for symbols, retrieval of values for ILOCs
and GLOCs, returns other immediates, and otherwise recursively calls
itself with the CAR of the list.
ceval inlines evaluation (using EVALCAR) of almost all
procedure call arguments. When ceval needs to evaluate a list of
more than length 3, the procedure eval_args is called. So
ceval can be said to have one level lookahead. The avoidance of
recursive invocations of ceval for the most common cases (special
forms and procedure calls) results in faster execution. The speed of
the interpreter is currently limited on most machines by interpreter
size, probably having to do with its cache footprint. In order to keep
the size down, certain EVALCAR calls which don't need to be fast
(because they rarely occur or because they are part of expensive
operations) are instead calls to the C function evalcar.
There was some discussion a year ago about a "Forth" style Scheme interpreter. This is the only improvement I know of which would beat SCM in speed.
Provided there is still type code space available in SCM, if we devote some of the IMCAR codes to "inlined" operations, we should get a significant performance boost. What is eliminated is the having to look up aGLOCorILOCand then dispatch on the subr type. The IMCAR operation would be dispatched to directly. Another way to view this is that we make available special form versions ofCAR,CDR, etc. Since the actual operation code is localized in the interpreter, it is much easier than uncompilation and then recompilation to handle(trace car); For instance a switch gets set which tells the interpreter to instead always look up the values of the associated symbols.
symhash table.
symhash is an array of lists of ISYMs and pairs of symbols
and values.
ILOC) which specifies how many environment frames down and how
far in to go for the value. When this immediate object is subsequently
encountered, the value can be retrieved quickly.
GLOC. The low order bit is normally reserved for
GCmark; But, since references to variables in the code always occur in
the CAR position and the GCmark is in the CDR, there is no
conflict.
If the compile FLAG CAUTIOUS is #defined then the number of
arguments is always checked for application of closures. If the compile
FLAG RECKLESS is #defined then they are not checked. Otherwise,
number of argument checks for closures are made only when the function
position (whose value is the closure) of a combination is not an
ILOC or GLOC. When the function position of a combination
is a symbol it will be checked only the first time it is evaluated
because it will then be replaced with an ILOC or GLOC.
EVAL Returns the result of evaluating expression in
env. SIDEVAL evaluates expression in env when
the value of the expression is not used.
Both of these macros alter the list structure of expression as it
is memoized and hence should be used only when it is known that
expression will not be referenced again. The C function
eval is safe from this problem.
eval copies expression so that memoization
does not modify expression.