Changeset 15093

Nov 28, 2011, 6:32:59 PM (8 years ago)

New Linux ARM binaries.

The image and FASL versions changed on the ARM, but (if I did it right)
not on other platforms.

(The image and FASL versions are now architecture-specific. This may
make it somewhat easier and less disruptive to change them, since the
motivation for such a change is often also architecture-specific.)
The FASL and current image version are defined (in the "TARGET" package)
in the architecture-specific *-arch.lisp files; the min, max, and current
image versions are defined in the *constants*.h file for the architecture.

Most of the changes are ARM-specific.

Each TCR now contains a 256-word table at byte offset 256. (We've
been using about 168 bytes in the TCR, so there are still 88 bytes/22
words left for expansion.) The table is initialized at TCR-creation
time to contain the absolute addresses of the subprims (there are
currently around 130 defined); we try otherwise not to reference
subprims by absolute address. Jumping to a subprim is:

(ldr pc (:@ rcontext (:$ offset-of-subprim-in-tcr-table)))

and calling one involves loading its address from that table into a
register and doing (blx reg). We canonically use LR as the register,
since it's going to be clobbered by the blx anyway and there doesn't
seem to be a performance hazard there. The old scheme (which involved
using BA and BLA pseudoinstructions to jump to/call a hidden jump table
at the end of the function) is no longer supported.

ARM Subprims no longer need to be aligned (on anything more than an
instruction boundary.) Some remnants of the consequences of an old
scheme (where subprims had to "fit" in small regions and sometimes
had to jump out of line if they would overflow that region's bounds)
still remain, but we can repair that (and it'll be a bit more straightforward
to add new ARM subprims.) We no longer care (much) about where subprims
are mapped in memory, and don't have to bias suprimitive addresses by
a platform-specific constant (and have to figure out whether or not we've
already done so) on (e.g.) Android.

Rather than setting the first element (fn.entrypoint) of a
newly-created function to the (absolute) address of a subprim that updates
that entrypoint on the first call, we use a little LAP function to correct
the address before the function can be called.

Non-function objects that can be stored in symbols' function cells
(the UNDEFINED-FUNCTION object, the things that encapsulate
special-operator names and global macro-functions) need to be
structured like FUNCTIONS: the need to have a word-aligned entrypoint
in element 0 that tracks the CODE-VECTOR object in element 1. We
don't want these things to be of type FUNCTION, but do want the GC to
adjust the entrypoint if the codevector moves. We've been essentially
out of GVECTOR subtags on 32-bit platforms, largely because of the
constraints that vector/array subtags must be greater than other
subtags and numeric types be less. The first constraint is probably
reasonable, but the second isn't: other typecodes (tag-list, etc) may
be less than the maximum numeric typecode, so tests like NUMBERP can't
reliably involve a simple comparison. (As long as a mask of all
numeric typecodes will fit in a machine word/FIXNUM, a simple LOGBITP
test can be used instead.) Removed all portable and ARM-specific code
that made assumptions about numeric typecode ordering, made a few more
gvector typecodes available, and used one of them to define a new
"pseudofunction" type. Made the GC update the entrypoints of
pseudofunctions and used them for the undefined-function object and
for the function cells of macros/special-operators.

Since we don't need the subprim jump table at the end of each function
anymore, we can more easily revive the idea of embedded pc-relative
constant data ("constant pools") and initialize FPRs from constant
data, avoiding most remaining traffic between FPRs and GPRs.

I've had a fairly-reproducible cache-coherency problem: on the first
GC in the cold load, the thread misbehaves mysteriously when it
resumes. The GC tries to synchronize the I and D caches on the entire
range of addresses that may contain newly-moved code-vectors. I'm not
at all sure why, but walking that range and flushing the cache for
each code-vector individually seems to avoid the problem (and may actually
be faster.)

Fix ticket:894

Fixed a few typos in error messages/comments/etc.

I -think- that the non-ARM-specific changes (how FASL/image versions are
defined) should bootstrap cleanly, but won't know for sure until this is
committed. (I imagine that the buildbot will complain if not.)

59 edited

Note: See TracChangeset for help on using the changeset viewer.