OpenMCL calling conventions

x86

x86-64

The architectural CALL and RET instructions are not used, because CALL pushes a value onto the stack which will have an essentially arbitrary tag. It would be hard for the GC to look at that arbitrarily-tagged thing, recognize it as a pointer into a function, and then find that function.

(N.B.: It looks like this may be changing. See, e.g., r6290 ff.)

Therefore, there needs to be some way for the GC to recognize return addresses on the stack so that it can fix them up when it decides to move a function somewhere else in memory. If a return address is in a register, the GC again needs to be able to recognize that.

On x86-64, function objects have their own tag (#b1111). Two additional tags (#b0100 and #b1100) are reserved for tagged return addresses.

Both functions (tag #xf) and tagged return addresses (#x4 or #xc) can be JMPed to.
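
The tag test described above amounts to masking the low four bits of a word. A minimal sketch in Python, not OpenMCL code: the helper name `classify` is hypothetical, and only the three tags named above are distinguished. The two sample values are the first function pointer and return address from the backtrace shown further down this page.

```python
# Tags taken from the text above; every other low-nibble value is lumped together.
TAG_MASK = 0xF
FUNCTION_TAG = 0b1111
RETURN_ADDRESS_TAGS = (0b0100, 0b1100)

def classify(word):
    """Classify a 64-bit word by its low four bits (hypothetical helper)."""
    tag = word & TAG_MASK
    if tag == FUNCTION_TAG:
        return "function"
    if tag in RETURN_ADDRESS_TAGS:
        return "tagged return address"
    return "other"

print(classify(0x00003000404E922F))  # low nibble is #xf
print(classify(0x00003000404E9314))  # low nibble is #x4
```

Since the two return-address tags differ only in bit 3, testing that the low three bits equal #b100 would recognize both at once.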

While a function is executing, %fn (= %r13) usually points to the tagged address of the function (#xXXXX...XXXF), which is the address of the function's first machine instruction. A function object consists of a number of words that contain machine instructions and other immediate data, followed by some number of references to lisp objects that the function references as constants.

As the function's instructions are executed, the program counter contains arbitrarily tagged values, since x86 instructions are a variable number of bytes in length. Therefore, the value in the program counter doesn't make it easy for the GC to identify the containing function. However, as long as %fn (or something else...) refers to the function (i.e., is a pointer with tag #xf), the GC can find the function.

Suppose a function does a (simple, non-tail) call:

(defun foo (x) (bar x) nil)

This looks like:

  (movq (@ -8 (% rbp)) (% arg_z))                 ;[17]
  (movw ($ 8) (% nargs))                          ;[21]
  (movq (@ 'BAR (% fn)) (% temp0))                ;[26]
  (leaq (@ (:^ L53) (% fn)) (% ra0))              ;[33]
  (movq (@ 10 (% temp0)) (% fn))                  ;[40]
  (jmpq (% fn))                                   ;[44]
L53
  (leaq (@ (- (:^ L53)) (% ra0)) (% fn))

The instruction addresses that the disassembler prints are relative to the start of the function (where the tagged #xf pointer points); if the function is at "somewhere+15", the tagged return address at L53 is at "somewhere+68": its low 4 bits are #bx100. If the disassembler were a little smarter, it'd show something like:

  (jmpq (% fn))                                   ;[44]
  (:align 3)
L49
  (:long 53)
L53

The 32-bit word which precedes the tagged return address enables the GC to easily find the function associated with any return address.
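
A sketch of that lookup, assuming the layout in the listing above: the 32-bit word holds the distance from the tagged function pointer to the tagged return address, stored little-endian as on any x86. The `memory` buffer, the `BASE` address, and `function_from_return_address` are made up for illustration; this is not the GC's actual code.

```python
import struct

BASE = 0x1000            # hypothetical 16-byte-aligned "somewhere"
FN = BASE + 15           # tagged function pointer (low nibble #xf)
RET_OFFSET = 53          # L53 in the listing above

# Simulated memory starting at BASE; only the (:long 53) word is filled in.
memory = bytearray(128)
word_index = (FN + RET_OFFSET - 4) - BASE        # L49: 4 bytes before L53
struct.pack_into("<I", memory, word_index, RET_OFFSET)

def function_from_return_address(ra):
    """Read the 32-bit word just before ra and subtract it from ra."""
    (offset,) = struct.unpack_from("<I", memory, ra - 4 - BASE)
    return ra - offset

ra = FN + RET_OFFSET                             # "somewhere+68"
print(hex(ra & 0xF))                             # tag of the return address
print(hex(function_from_return_address(ra)))     # tagged function pointer
```

With these numbers the word sits at "somewhere+64", which is why the listing needs (:align 3) in front of it: the word must start on an 8-byte boundary so that the address 4 bytes later ends in #b100.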

Note that at instruction 26 we use %fn to access the symbol BAR. (The disassembler shows us this constant reference symbolically, but of course BAR is stored in the constants area of the function, somewhere beyond the end of the code.)

At instruction 40 we indirect through the symbol BAR and load a new value into %fn. If a GC were to take place at this point, %fn would no longer be referencing the calling function. Even if nothing else referenced that function (if it were an anonymous lambda that had just been compiled, for instance), the thing that convinces the GC to retain it is the fact that the tagged return address (loaded into %ra0 at instruction 33) is indirectly (and unambiguously) referencing the function.

When the callee (BAR) returns, it'll do so by JMPing to the address it received in %ra0 (L53); the code at L53 recovers %fn by subtracting a constant (53, which is what (:^ L53) - "address of L53" - refers to) from %ra0.

To see this, evaluate (ccl::dbg), and type "b" to get a backtrace.

(#x00002AAAAB186B68) #x00003000404E9314 : #<Function CALL-CHECK-REGS #x00003000404E922F> + 229
(#x00002AAAAB186BA0) #x00003000404E0DBC : #<Function TOPLEVEL-EVAL #x00003000404E0C0F> + 429
(#x00002AAAAB186BF0) #x00003000404E2C3C : #<Function READ-LOOP #x00003000404E24BF> + 1917
(#x00002AAAAB186DF8) #x00003000404E8BE4 : #<Function TOPLEVEL-LOOP #x00003000404E8B5F> + 133
(#x00002AAAAB186E28) #x0000300040425494 : #<Anonymous Function #x000030004042540F> + 133
(#x00002AAAAB186E40) #x000030004057F984 : #<Anonymous Function #x000030004057F67F> + 773
(#x00002AAAAB186EC8) #x000030004043961C : #<Function RUN-PROCESS-INITIAL-FORM #x000030004043935F> + 701
(#x00002AAAAB186F48) #x0000300040439F9C : #<Anonymous Function #x0000300040439E0F> + 397
(#x00002AAAAB186F98) #x000030004041AB5C : #<Anonymous Function #x000030004041A9EF> + 365

The first number is the address of the stack frame. The second number is the tagged return address associated with that frame: note that these addresses all end in #x4 or #xc. The 32-bit word stored before each return address enables the kernel debugger (and the GC) to find the tagged (#xf) pointer to the function object. (Note that all of the functions displayed in the backtrace have #xf in the low nibble.)
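
The numbers in the backtrace can be checked by hand: subtracting the printed offset from each return address gives the function pointer on the same line. A short Python check over the first four frames, with the values copied from the backtrace above (the list name `frames` is just for this example):

```python
# (tagged return address, tagged function pointer, printed offset)
frames = [
    (0x00003000404E9314, 0x00003000404E922F, 229),   # CALL-CHECK-REGS
    (0x00003000404E0DBC, 0x00003000404E0C0F, 429),   # TOPLEVEL-EVAL
    (0x00003000404E2C3C, 0x00003000404E24BF, 1917),  # READ-LOOP
    (0x00003000404E8BE4, 0x00003000404E8B5F, 133),   # TOPLEVEL-LOOP
]

for ra, fn, offset in frames:
    assert ra - offset == fn      # the same subtraction the code after a call performs
    assert fn & 0xF == 0xF        # function tag
    assert ra & 0x7 == 0x4        # low nibble is #x4 or #xc
    print(hex(ra), "-", offset, "=", hex(fn))
```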

This scheme wastes a bit of space, burns two tags, and assumes that it's reasonable to keep the "current function" and the "canonical return address" in registers, at least on function call and return.

(largely from email from Gary Byers)


rme