wiki:WatchedObjects

Version 5 (modified by rme, 5 years ago) (diff)

DWIM-y watch; update example accordingly

Overview

Clozure CL provides a way for lisp objects to be watched so that a condition will be signaled when a thread attempts to write to the watched object. For a certain class of bugs (someone is changing this value, but I don't know who), this can be extremely helpful.

Usage

WATCH

WATCH &optional object [function]

The WATCH function arranges for the specified object to be monitored for writes. This is accomplished by copying the object to its own set of virtual memory pages, which are then write-protected. This protection is enforced by the computer's memory-management hardware; the write-protection does not slow down reads at all.

When any write to the object is attempted, a WRITE-TO-WATCHED-OBJECT condition will be signaled.

When called with no arguments, WATCH returns a freshly-consed list of the objects currently being watched.

DWIM

WATCH operates at a fairly low level; it is not possible to avoid the details of the internal representation of objects.

Nevertheless, as a convenience, WATCHing a standard-instance, a hash-table, or a multi-dimensional or non-simple CL array will watch the underlying slot-vector, hash-table-vector, or data-vector, respectively.

Discussion

WATCH can monitor any memory-allocated lisp object.

In Clozure CL, a memory-allocated object is either a cons cell or a uvector.

WATCH operates on cons cells, not lists. In order to watch a chain of cons cells, each cons cell must be watched individually. Because each watched cons cell takes up its own own virtual memory page (4 Kbytes), it's only feasible to watch relatively short lists.

If a memory-allocated object isn't a cons cell, then it is a vector-like object called a uvector. A uvector is a memory-allocated lisp object whose first word is a header that describes the object's type and the number of elements that it contains.

So, a hash table is a uvector, as is a string, a standard instance, a double-float, a CL array or vector, and so forth.

Some CL objects, like strings and other simple vectors, map in a straightforward way onto the uvector representation. It is easy to understand what happens in such cases. The uvector index corresponds directly to the vector index:

? (defvar *s* "xxxxx")
*S*
? (watch *s*)
"xxxxx"
? (setf (char *s* 3) #\o)
> Error: Write to watched uvector "xxxxx" at index 3
>        Faulting instruction: (movl (% eax) (@ -5 (% r15) (% rcx)))
> While executing: SET-CHAR, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.

In the case of more complicated objects (e.g., a hash-table, a standard-instance, a package, etc.), the elements of the uvector are like slots in a structure. It's necessary to know which one of those "slots" contains the data that will be changed when the object is written to.

As mentioned above, watch knows about arrays, hash-tables, and standard-instances, and will automatically watch the appropriate data-containing element.

An example might make this clearer.

? (defclass foo ()
    (slot-a slot-b slot-c))
#<STANDARD-CLASS FOO>
? (defvar *a-foo* (make-instance 'foo))
*A-FOO*
? (watch *a-foo*)
#<SLOT-VECTOR #xDB00D>
;;; Note that WATCH has watched the internal slot-vector object
? (setf (slot-value *a-foo* 'slot-a) 'foo)
> Error: Write to watched uvector #<SLOT-VECTOR #xDB00D> at index 1
>        Faulting instruction: (movq (% rsi) (@ -5 (% r8) (% rdi)))
> While executing: %MAYBE-STD-SETF-SLOT-VALUE-USING-CLASS, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.

Looking at a backtrace would presumably show what object and slot name were written.

Note that even though the write was to slot-a, the uvector index was 1 (not 0). This is because the first element of a slot-vector is a pointer to the instance that owns the slots. We can retrieve that to look at the object that was modified:

1 > (uvref (write-to-watched-object-object *break-condition*) 0)
#<FOO #x30004113502D>
1 > (describe *)
#<FOO #x30004113502D>
Class: #<STANDARD-CLASS FOO>
Wrapper: #<CLASS-WRAPPER FOO #x300041135EBD>
Instance slots
SLOT-A: #<Unbound>
SLOT-B: #<Unbound>
SLOT-C: #<Unbound>
1 > 

UNWATCH

UNWATCH object [function]

The UNWATCH function ensures that the specified object is in normal, non-monitored memory. If the object is not currently being watched, UNWATCH does nothing and returns NIL. Otherwise, the newly unwatched object is returned.

WRITE-TO-WATCHED-OBJECT

WRITE-TO-WATCHED-OBJECT [condition]

This condition is signaled when a watched object is written to. There are three slots of interest: object, offset, and instruction.

The object slot is the actual object that was written.

The offset slot is the byte offset from the tagged object pointer to the starting address of the write.

The instruction slot contains the disassembled machine instruction that attempted the write.

The :report function for this condition will attempt to use the offset to determine what uvector index (or which half, either the car or the cdr, for a cons cell) was attempted to be written. The disassembled instruction will also be displayed, if possible.

Restarts

Two restarts are provided: one will skip over the faulting write instruction and proceed; the other offers to unwatch the object and continue.

Notes

Although some care has been taken to minimize potential problems arising from watching and unwatching objects from multiple threads, there may well be subtle race conditions present that could cause bad behavior.

For example, suppose that a thread attempts to write to a watched object. This causes the operating system to generate an exception. The lisp kernel figures out what the exception is, and calls back into lisp to signal the write-to-watched-object condition and perhaps handle the error.

Now, as soon lisp code starts running again (for the callback), it's possible that some other thread could unwatch the very watched object that caused the exception, perhaps before we even have a chance to signal the condition, much less respond to it.

Having the object unwatched out from underneath a handler may at least confuse it, if not cause deeper trouble. Use caution with unwatch.

Examples

Here are a couple more examples in addition to the above examples of watching a string and a standard-instance.

Fancy arrays

?  (defvar *f* (make-array '(2 3) :element-type 'double-float))
*F*
? (watch *f*)
#(0.0D0 0.0D0 0.0D0 0.0D0 0.0D0 0.0D0)
;;; Note that the above vector is the underlying data-vector for the array
? (setf (aref *f* 1 2) pi)
> Error: Write to watched uvector #<VECTOR 6 type DOUBLE-FLOAT, simple> at index 5
>        Faulting instruction: (movq (% rax) (@ -5 (% r8) (% rdi)))
> While executing: ASET, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.
1 > 

In this case, uvector index in the report is the row-major index of the element that was written to.

Hash tables

Hash tables are surprisingly complicated. The representation of a hash table includes an element called a hash-table-vector. The keys and values of the elements are stored pairwise in this vector.

One problem with trying to monitor hash tables for writes is that the underlying hash-table-vector is replaced with an entirely new one when the hash table is rehashed. A previously-watched hash-table-vector will not be the used by the hash table after rehashing, and writes to the new vector will not be caught.

? (defvar *h* (make-hash-table))
*H*
? (setf (gethash 'noise *h*) 'feep)
FEEP
? (watch *h*)
#<HASH-TABLE-VECTOR #xDD00D>
;;; underlying hash-table-vector
? (setf (gethash 'noise *h*) 'ding)
> Error: Write to watched uvector #<HASH-TABLE-VECTOR #xDD00D> at index 35
>        Faulting instruction: (lock)
>          (cmpxchgq (% rsi) (@ (% r8) (% rdx)))
> While executing: %STORE-NODE-CONDITIONAL, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.
;;; see what value is being replaced...
1 > (uvref (write-to-watched-object-object *break-condition*) 35)
FEEP
;;; backtrace shows useful context
1 > :b
*(1A109F8) : 0 (%STORE-NODE-CONDITIONAL ???) NIL
 (1A10A50) : 1 (LOCK-FREE-PUTHASH NOISE #<HASH-TABLE :TEST EQL size 1/60 #x30004117D47D> DING) 653
 (1A10AC8) : 2 (CALL-CHECK-REGS PUTHASH NOISE #<HASH-TABLE :TEST EQL size 1/60 #x30004117D47D> DING) 229
 (1A10B00) : 3 (TOPLEVEL-EVAL (SETF (GETHASH # *H*) 'DING) NIL) 709
 ...

Lists

As previously mentioned, WATCH only watches individual cons cells.

? (defun watch-list (list)
    (maplist #'watch list))
WATCH-LIST
? (defvar *l* (list 1 2 3))
*L*
? (watch-list *l*)
((1 2 3) (2 3) (3))
? (setf (nth 2 *l*) 'foo)
> Error: Write to the CAR of watched cons cell (3)
>        Faulting instruction: (movq (% rsi) (@ 5 (% rdi)))
> While executing: %SETNTH, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.