Nov 28, 2011, 6:32:59 PM

New Linux ARM binaries.

The image and FASL versions changed on the ARM, but (if I did it right)
not on other platforms.

(The image and FASL versions are now architecture-specific. This may
make it somewhat easier and less disruptive to change them, since the
motivation for such a change is often also architecture-specific.)
The FASL and current image versions are defined (in the "TARGET" package)
in the architecture-specific *-arch.lisp files; the min, max, and current
image versions are defined in the *constants*.h file for the architecture.

Most of the changes are ARM-specific.

Each TCR now contains a 256-word table at byte offset 256. (We've
been using about 168 bytes in the TCR, so there are still 88 bytes/22
words left for expansion.) The table is initialized at TCR-creation
time to contain the absolute addresses of the subprims (there are
currently around 130 defined); we try otherwise not to reference
subprims by absolute address. Jumping to a subprim is:

(ldr pc (:@ rcontext (:$ offset-of-subprim-in-tcr-table)))

and calling one involves loading its address from that table into a
register and doing (blx reg). We canonically use LR as the register,
since it's going to be clobbered by the blx anyway and there doesn't
seem to be a performance hazard there. The old scheme (which involved
using BA and BLA pseudoinstructions to jump to/call a hidden jump table
at the end of the function) is no longer supported.
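
The geometry described above can be sketched in C. The struct and function names here are illustrative (the real TCR is defined in the kernel sources); only the offsets and counts come from the text: a 256-word subprim table at byte offset 256 in each TCR, filled with the absolute subprim addresses at TCR-creation time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of the TCR layout described above. */
#define SUBPRIM_TABLE_OFFSET 256   /* bytes from the start of the TCR */
#define SUBPRIM_TABLE_WORDS  256
#define NUM_SUBPRIMS         130   /* "currently around 130 defined" */

typedef struct tcr {
  /* Ordinary TCR fields occupy the first 256 bytes (about 168 in use,
     88 bytes / 22 words left for expansion on 32-bit ARM). */
  uintptr_t fields[SUBPRIM_TABLE_OFFSET / sizeof(uintptr_t)];
  uintptr_t subprim_table[SUBPRIM_TABLE_WORDS];
} tcr;

/* At TCR-creation time, copy the absolute subprim addresses in. */
void init_subprim_table(tcr *t, const uintptr_t *subprim_addresses, int n) {
  memset(t->subprim_table, 0, sizeof t->subprim_table);
  memcpy(t->subprim_table, subprim_addresses, n * sizeof(uintptr_t));
}
```

With this layout, the `(ldr pc ...)` above is a single load from `rcontext + SUBPRIM_TABLE_OFFSET + 4*index`, and a call loads the same slot into LR before the `blx`.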

ARM subprims no longer need to be aligned (on anything more than an
instruction boundary.) Some remnants of the consequences of an old
scheme (where subprims had to "fit" in small regions and sometimes
had to jump out of line if they would overflow that region's bounds)
still remain, but we can repair that (and it'll be a bit more straightforward
to add new ARM subprims.) We no longer care (much) about where subprims
are mapped in memory, and don't have to bias subprimitive addresses by
a platform-specific constant (or figure out whether or not we've
already done so) on (e.g.) Android.

Rather than setting the first element (fn.entrypoint) of a
newly-created function to the (absolute) address of a subprim that updates
that entrypoint on the first call, we use a little LAP function to correct
the address before the function can be called.
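
A minimal C sketch of that fixup, with hypothetical names standing in for the real objects (the actual fixup is a small LAP function): element 0 caches the absolute entrypoint, element 1 holds the code vector, and the fixup recomputes element 0 right after allocation instead of deferring to a first-call subprim.

```c
#include <stdint.h>

/* Illustrative layout: names and fields are assumptions, not the
   real object formats. */
typedef struct code_vector {
  uint32_t instructions[1];    /* first executable instruction */
} code_vector;

typedef struct function_object {
  uintptr_t entrypoint;        /* element 0: absolute address of first insn */
  code_vector *code;           /* element 1: the code vector */
} function_object;

/* Run once after allocation (and by the GC if the code vector moves):
   point element 0 at the code vector's first instruction. */
void fix_fn_entrypoint(function_object *fn) {
  fn->entrypoint = (uintptr_t)&fn->code->instructions[0];
}
```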

Non-function objects that can be stored in symbols' function cells
(the UNDEFINED-FUNCTION object, the things that encapsulate
special-operator names and global macro-functions) need to be
structured like FUNCTIONS: they need to have a word-aligned entrypoint
in element 0 that tracks the CODE-VECTOR object in element 1. We
don't want these things to be of type FUNCTION, but do want the GC to
adjust the entrypoint if the codevector moves. We've been essentially
out of GVECTOR subtags on 32-bit platforms, largely because of the
constraints that vector/array subtags must be greater than other
subtags and numeric types be less. The first constraint is probably
reasonable, but the second isn't: other typecodes (tag-list, etc) may
be less than the maximum numeric typecode, so tests like NUMBERP can't
reliably involve a simple comparison. (As long as a mask of all
numeric typecodes will fit in a machine word/FIXNUM, a simple LOGBITP
test can be used instead.) Removed all portable and ARM-specific code
that made assumptions about numeric typecode ordering, made a few more
gvector typecodes available, and used one of them to define a new
"pseudofunction" type. Made the GC update the entrypoints of
pseudofunctions and used them for the undefined-function object and
for the function cells of macros/special-operators.
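
The LOGBITP idea can be illustrated with a short C sketch. The typecode values here are invented for illustration (the real ones live in the *-arch.lisp files); the point is that as long as all numeric typecodes fit in one machine word, NUMBERP is a single shift-and-mask, regardless of how the typecodes are ordered relative to tag-list and friends.

```c
#include <stdint.h>

/* Invented typecodes: note tag_list deliberately falls between
   numeric typecodes, which would break a range-comparison NUMBERP. */
enum {
  tag_fixnum       = 0,
  tag_list         = 5,
  tag_bignum       = 9,
  tag_ratio        = 13,
  tag_single_float = 17,
  tag_double_float = 21,
  tag_complex      = 25
};

/* One bit per numeric typecode; must fit in a machine word/FIXNUM. */
#define NUMERIC_TYPECODE_MASK \
  ((1u << tag_fixnum) | (1u << tag_bignum) | (1u << tag_ratio) | \
   (1u << tag_single_float) | (1u << tag_double_float) | (1u << tag_complex))

/* LOGBITP-style membership test: shift and mask, no ordering assumed. */
int numberp(unsigned typecode) {
  return (NUMERIC_TYPECODE_MASK >> typecode) & 1u;
}
```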

Since we don't need the subprim jump table at the end of each function
anymore, we can more easily revive the idea of embedded pc-relative
constant data ("constant pools") and initialize FPRs from constant
data, avoiding most remaining traffic between FPRs and GPRs.

I've had a fairly-reproducible cache-coherency problem: on the first
GC in the cold load, the thread misbehaves mysteriously when it
resumes. The GC tries to synchronize the I and D caches on the entire
range of addresses that may contain newly-moved code-vectors. I'm not
at all sure why, but walking that range and flushing the cache for
each code-vector individually seems to avoid the problem (and may actually
be faster.)
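
A sketch of the per-object flush, assuming GCC/Clang's `__builtin___clear_cache` builtin and a hypothetical span list standing in for the walk over newly-moved code vectors:

```c
#include <stddef.h>

/* One (start, end) span per code vector found by walking the range. */
typedef struct code_vector_span {
  char *start;
  char *end;
} code_vector_span;

/* Instead of one cache flush over the whole address range, flush each
   code vector's span individually; returns the number flushed. */
size_t flush_code_vectors(const code_vector_span *vectors, size_t n) {
  size_t flushed = 0;
  for (size_t i = 0; i < n; i++) {
    /* Synchronize the I and D caches for just this code vector. */
    __builtin___clear_cache(vectors[i].start, vectors[i].end);
    flushed++;
  }
  return flushed;
}
```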

Fix ticket:894

Fixed a few typos in error messages/comments/etc.

I -think- that the non-ARM-specific changes (how FASL/image versions are
defined) should bootstrap cleanly, but won't know for sure until this is
committed. (I imagine that the buildbot will complain if not.)

1 file edited:

  • trunk/source/level-1/l1-clos.lisp

    r15001 → r15093

      381  381                                      (ash -1 $lfbits-noname-bit)))
      382  382               #+arm-target
    - 383                    (gvector :function
    - 384                             #.(ash (arm::arm-subprimitive-address '.SPfix-nfn-entrypoint) (- arm::fixnumshift))
    +      383               (%fix-fn-entrypoint
    +      384                (gvector :function
    +      385                         0
      385  386                        (%svref (if small
      386  387                                  #'%small-map-slot-id-lookup
      389  390                        table
      390  391                        (dpb 1 $lfbits-numreq
    - 391                                    (ash -1 $lfbits-noname-bit)))
    +      392                              (ash -1 $lfbits-noname-bit))))
      392  393               #+x86-target
      393  394               (%clone-x86-function (if small

      413  414                              (ash -1 $lfbits-noname-bit)))
      414  415               #+arm-target
    - 415                    (gvector :function
    - 416                             #.(ash (arm::arm-subprimitive-address '.SPfix-nfn-entrypoint) (- arm::fixnumshift))
    +      416               (%fix-fn-entrypoint
    +      417                (gvector :function
    +      418                         0
      417  419                        (%svref (if small
      418  420                                  #'%small-slot-id-value
      424  426                        #'%slot-id-ref-missing
      425  427                        (dpb 2 $lfbits-numreq
    - 426                                    (ash -1 $lfbits-noname-bit)))
    +      428                              (ash -1 $lfbits-noname-bit))))
      427  429               #+x86-target
      428  430               (%clone-x86-function (if small

      450  452                              (ash -1 $lfbits-noname-bit)))
      451  453               #+arm-target
    - 452                    (gvector :function
    - 453                             #.(ash (arm::arm-subprimitive-address '.SPfix-nfn-entrypoint) (- arm::fixnumshift))
    +      454               (%fix-fn-entrypoint
    +      455                (gvector :function
    +      456                         0
      454  457                        (%svref (if small
      455  458                                  #'%small-set-slot-id-value
      461  464                        #'%slot-id-set-missing
      462  465                        (dpb 3 $lfbits-numreq
    - 463                                    (ash -1 $lfbits-noname-bit)))
    +      466                              (ash -1 $lfbits-noname-bit))))
      464  467               #+x86-target
      465  468               (%clone-x86-function

     1706 1709                          (ash 1 $lfbits-aok-bit)))
     1707 1710            #+arm-target
    -1708                 (gvector :function
    -1709                          #.(ash (arm::arm-subprimitive-address '.SPfix-nfn-entrypoint) (- arm::fixnumshift))
    +     1711            (%fix-fn-entrypoint
    +     1712             (gvector :function
    +     1713                      0
     1710 1714                     *unset-fin-code*
     1711 1715                     wrapper
     1715 1719                     0
     1716 1720                     (logior (ash 1 $lfbits-gfn-bit)
    -1717                                  (ash 1 $lfbits-aok-bit)))))
    +     1721                             (ash 1 $lfbits-aok-bit))))))
     1718 1722    (setf (slot-vector.instance slots) fn)
     1719 1723    (when dt