Changes between Version 6 and Version 7 of CclUnderGdb


Timestamp:
Feb 26, 2009, 3:03:43 PM (10 years ago)
Author:
gz
Comment:

reorganize

  • CclUnderGdb

    v6 v7  
    11== Running CCL under GDB ==
    22
     3=== Overview ===
     4The lisp wants to handle most of the signals that can be raised to
     5indicate an exception.  If the lisp's exception-handling code doesn't
     6know how to handle an exception, it enters the CCL kernel debugger
     7(there's no good and direct way to pass it to another debugger).  When an
     8exception occurs in foreign code, the kernel debugger tries to
     9note that fact.
    310
    4 > Any tips and tricks to share?[[br]]
    5 > What do I do about segfaults?  What about SIG40?[[br]]
     11GDB's much more likely to be able to make at least some sense
     12out of the state of things in the exception-in-foreign-code case
     13than the lisp's kernel debugger is.
    614
    7 You basically want to tell GDB to load (or "source", as a verb) a file
    8 that tells it about signals that're handled by the application
    9 and defines some macros (most of which have to do with printing
    10 lisp object values)
     15=== Loading GDB init file ===
     16
     17Before doing anything with lisp in GDB, you need to load (or "source",
     18as a verb) the file `ccl/lisp-kernel/linuxx8664/.gdbinit` (replace
     19`linuxx8664` with the directory for your platform).  This file tells GDB
     20about signals that need to be passed to lisp for handling, and defines
     21some macros (most of which have to do with printing lisp object values).
     22
     23That file will be sourced automatically if it (or a link to it)
     24is in the same directory as the executable (or, IIRC, in your
     25home directory.)  Otherwise, once in GDB, just do:
     26
    1127{{{
    1228(gdb) source ccl/lisp-kernel/linuxx8664/.gdbinit
    1329}}}
    14 That file will be sourced automatically if it (or a link to it)
    15 is in the same directory as the executable (or, IIRC, in your
    16 home directory.)
    17 
    18 I think that the "handle" forms in the .gdbinit file enumerate
    19 all of the signals that the lisp handles; there was a time
    20 last fall when at least one case was missing from the checked-in
    21 .gdbinit file.  The general idea is to say something like:
    22 {{{
    23 handle SIGQUIT pass nostop noprint
    24 }}}
    25 which tells GDB that if the target process gets a SIGQUIT, it should
    26 let the application handle it (GDB should "pass" it to the application)
    27 without stopping or printing anything.
    28 
    29 A SIGINT by default causes entry to GDB and is not passed to the
    30 application.  I sometimes find it useful to be able to interrupt
    31 the lisp via SIGINT (after entering GDB).  Doing something like
    32 {{{
    33 handle SIGINT pass stop print
    34 }}}
    35 causes GDB to ask for confirmation because "SIGINT is used by the
    36 debugger".  (It's not used in the same way that breakpoints and
    37 single-step exceptions are used, so I usually just sigh and give
    38 it the confirmation it craves.)
    3930
    4031
    41 GDB's very good at debugging C code that was compiled with debugging
    42 enabled and for which you have the source code.  (It's even better
    43 if optimization's toned down.)  If you're trying to debug C library
    44 code for which you have the source and for which debugging information
    45 was generated and not stripped, GDB's sort of in its element and
    46 offers lots of useful features.
     32=== Connecting GDB ===
    4733
    48 The lisp wants to handle most of the signals that can be raised to
    49 indicate an exception.  On x86-64 Linux, SIGSEGV means lots of
    50 things, and those things in turn mean different things when you're
    51 executing lisp code when they occur than they would if you were executing
    52 C ("foreign") code.  If the lisp's exception-handling code doesn't
    53 know how to handle an exception, it enters the kernel debugger (there's
    54 no good and direct way to pass it to another debugger.)  When an
    55 exception occurs in foreign code, the kernel debugger tries to
    56 note that fact; ideally, it would also disable some debugging
    57 commands that only make sense if the exception occurred while
    58 executing lisp code, but it leaves them enabled.  (The "L"
    59 kernel debugger command is very useful for seeing the values
    60 of lisp objects in registers at the point of the exception,
    61 but it will crash or misbehave if those registers don't contain
    62 lisp objects, as they wouldn't if the exception occurred in foreign
    63 code.)
     34When lisp is in the kernel debugger following an exception in foreign code:
    6435
     36(*) Note the PID, printed in brackets in the kernel debugger prompt (say it's `[1234]`).
    6537
    66 GDB's much more likely to be able to make at least some sense
    67 out of the state of things in the exception-in-foreign-code case
    68 than the lisp's kernel debugger is.  If GDB's already running
    69 (as opposed to having been attached after the fact), you can
    70 do this via the same technique that I described a few weeks ago
    71 (but it's a little easier if you don't have to play "guess
    72 which thread was in the kernel debugger.)  The general idea
    73 is:
     38(*) Do the `R` command to display raw (hex) register values and note the value in `RIP` (the program counter/instruction pointer), say it's `0x12345678`.
    7439
    75 a) In the kernel debugger do R to display raw (hex) register values and note the value in {{{RIP}}} (the program counter/instruction pointer.)
     40(*) If GDB is already running, drop into it (via !^C).  Otherwise, get a shell and do:
     41{{{
     42shell> gdb /path/to/ccl/lx86cl64  # location of lisp kernel
     43(gdb) source lisp-kernel/linuxx8664/.gdbinit
     44(gdb) attach 1234    # or whatever the PID is
     45}}}
    7646
    77 b) Drop into GDB (via !^C) and set a breakpoint at that address.
    78 If the address is 0x87654321, the GDB command to set that breakpoint
    79 would be:
     47(*) Set a breakpoint at the address of the exception:
    8048{{{
    81 (gdb) br *0x87654321
     49(gdb) br *0x12345678 # or whatever the RIP value is
    8250}}}
    8351The leading asterisk is necessary to prevent GDB from interpreting
    8452the integer as a line number.
    8553
    86 c)  Tell GDB to let the interrupted process continue
     54(*) Tell GDB to let lisp run:
    8755{{{
    8856(gdb) continue
     
    9159All other lisp threads should be suspended.
    9260
    93 d)  In the kernel debugger, use the "x" command, which exits
    94 from the kernel debugger resumes other threads.
     61(*) Back in the kernel debugger, use the `x` command, which exits from the kernel debugger and resumes other threads.
     62{{{
     63[1234] Clozure CL kernel debugger: x
     64}}}
    9565
    96 The next time any thread reaches the address of the breakpoint,
     66That should immediately break into gdb at the instruction that caused the
     67fault.
     68
     69More generally, the next time any thread reaches the address of the breakpoint,
    9770GDB will be entered.  It's hard to guarantee that the first thread
    9871that reaches that point will be the one that got the exception,
     
    10073time to wake up after being suspended.)
    10174
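The attach sequence above can also be handed to GDB on a single command line. The following is only a sketch under stated assumptions, not something from the CCL sources: `gdb_attach_cmd` is a hypothetical helper that merely prints the command, and the kernel path, PID, and `RIP` value are the example values used in the steps above. It relies on GDB's standard `-p` (attach to a PID), `-x` (source a command file), and `-ex` (run one command) options.

```shell
#!/bin/sh
# gdb_attach_cmd KERNEL PID RIP GDBINIT
# Hypothetical helper: print (do not run) a gdb command line that attaches to
# a lisp sitting in the kernel debugger. All argument values are placeholders.
gdb_attach_cmd() {
    kernel=$1; pid=$2; rip=$3; gdbinit=$4
    # -p attaches to the process, -x sources the .gdbinit file, and the
    # -ex commands (breakpoint, then continue) run in the order given.
    printf 'gdb %s -p %s -x %s -ex "break *%s" -ex continue\n' \
        "$kernel" "$pid" "$gdbinit" "$rip"
}

# Print the command for the example values from the steps above.
gdb_attach_cmd /path/to/ccl/lx86cl64 1234 0x12345678 lisp-kernel/linuxx8664/.gdbinit
```

Printing the command rather than running it makes it easy to check the PID and address before attaching; the `x` command still has to be typed in the kernel debugger afterwards.
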
    102 In GDB at that point,
     75=== Debugging in GDB ===
     76
    10377{{{
    10478(gdb) bt
    10579}}}
    106 will do a backtrace (at least as far back as the foreign function
     80will do a C backtrace (at least as far back as the foreign function
    10781call from lisp).
     82
    10883{{{
    10984(gdb) info registers
    11085}}}
    11186will show register values.
     87
     88{{{
     89(gdb) x/i $pc
     90}}}
     91disassembles the instruction at the pc/%rip.
    11292
    11393If the foreign code has symbolic debugging information and wasn't
     
    120100more quickly than you would otherwise.
    121101
     102Some Linux distributions provide debugging information and library source
     103for the standard libraries; on Fedora, this information is contained in
     104optional "debuginfo" packages.  If it's available, the information is
     105often very useful.
    122106
    123107As far as other tips and tricks ... I'm not sure what I could
     
    157141To enter GDB when lisp is starting up, set a breakpoint at *_SPfuncall, which is called soon after the image is loaded (and is rarely called thereafter, since funcall is inlined).
    158142
    159 ----
    160 To enter GDB when lisp is already in the kernel debugger after an exception in foreign code, first do the `R` command in the kernel debugger to display raw (hex) register values and note the value in {{{RIP}}} (the program counter/instruction pointer.)  Then:
    161 
    162 {{{
    163 shell> gdb /path/to/lisp-kernel
    164 (gdb) source lisp-kernel/linuxx8664/.gdbinit
    165 (gdb) attach <pid>    # pid is printed in brackets in the kernel debugger prompt
    166 (gdb) br *0x87654321  # or whatever the RIP value is
    167 (gdb) continue
    168 }}}
    169 
    170 Back in the kernel debugger:
    171 
    172 {{{
    173 [pid] Clozure CL kernel debugger: x
    174 }}}
    175 
    176 
    177 That should immediately break into gdb at the instruction that caused the
    178 fault.  At that point:
    179 
    180 {{{
    181 (gdb) x/i $pc   # disassembles the instruction at the pc/%rip
    182 (gdb) bt        # do a C backtrace
    183 }}}
    184 
    185 Some Linux distributions provide debugging information and library source
    186 for the standard libraries; on Fedora, this information is contained in
    187 optional "debuginfo" packages.  If it's available, the information is
    188 often very useful.
    189 
    190 ----
    191143To cause GC (including the EGC) to run integrity checks on entry, add `-DGC_INTEGRITY_CHECKING` to the CDEFINES in the kernel Makefile and rebuild the kernel.  Alternatively, you can `(setq ccl::*gc-event-status-bits* 4)` at any time for the same effect.
    192144
    193145If you look at the .gdbinit file, there are a number of useful lisp-related commands defined there.  Try them...
    194146
     147=== Signal handling ===
     148
     149The "handle" forms in the .gdbinit file enumerate
     150all of the signals that the lisp handles.  The general
     151idea is to say something like:
     152{{{
     153handle SIGQUIT pass nostop noprint
     154}}}
     155which tells GDB that if the target process gets a SIGQUIT, it should
     156let the application handle it (GDB should "pass" it to the application)
     157without stopping or printing anything.
     158
     159A SIGINT by default causes entry to GDB and is not passed to the
     160application.  I sometimes find it useful to be able to interrupt
     161the lisp via SIGINT (after entering GDB).  Doing something like
     162{{{
     163handle SIGINT pass stop print
     164}}}
     165causes GDB to ask for confirmation because "SIGINT is used by the
     166debugger".  (It's not used in the same way that breakpoints and
     167single-step exceptions are used, so I usually just sigh and give
     168it the confirmation it craves.)
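
If a signal that the lisp handles ever turns out to be missing from the `.gdbinit` file, the same kind of `handle` line can be generated for it and sourced separately. A minimal sketch, assuming a hypothetical helper named `handle_lines` (not part of CCL); `SIGQUIT` appears only because it is the example used above, not as a claim about which signals need adding.

```shell
#!/bin/sh
# handle_lines SIGNAL...
# Hypothetical helper: print one GDB "handle" line per signal name, telling
# GDB to pass the signal to the application without stopping or printing.
handle_lines() {
    for sig in "$@"; do
        printf 'handle %s pass nostop noprint\n' "$sig"
    done
}

# Emit the line for the example signal discussed above.
handle_lines SIGQUIT
```

Redirecting the output to a file (the name is up to you) gives something GDB can load with `source`, and `(gdb) info signals SIGQUIT` then shows the settings GDB is actually using for that signal.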