Ticket #899 (closed defect: fixed)
segfault in LinuxX8632/WindowsX8632 build using postmodern library
Reported by: emit | Owned by: rme
Keywords: segfault sigsegv linuxx8632 postmodern | Cc:
I've reproduced the problem with the binary release (1.7-r14925M) and with a rebuild from svn (1.7-r15146M). The problem does not occur with the 64-bit Linux version. I've tested the LinuxX8632 build on 32-bit Linux 2.6.32 (Debian squeeze) and on a 64-bit install running Linux 3.0 (Arch Linux). The PostgreSQL server is version 8.4 (not relevant).
steps to reproduce:
    ;; install latest postmodern somehow.
    ;; I've tried the one from quicklisp dist and
    ;; also the latest git version from
    ;; http://marijnhaverbeke.nl/postmodern/#download
    (ql:quickload :postmodern)
    (pomo:connect-toplevel "dbname" "user" "pass" "host")
    (pomo:query "select '42'")
expected result: (("42")) <--- ok in sbcl and LinuxX8664 ccl
in LinuxX8632 ccl I get:
    > Error: Database error: Unexpected end of file on #<BASIC-TCP-STREAM :CLOSED #x188F5736>
    > While executing: (:INTERNAL #:G5027 CL-POSTGRES:EXEC-QUERY), in process listener(1).
    > Type :POP to abort, :R for a list of available restarts.
    > Type :? for other options.
At first it looks like a bug in the library, but tcpdump reveals mangled data in which the length of the simple-describe-message packet is incorrectly declared as 1. This four-byte length value in the packet's header is computed by the define-message macro in postmodern/cl-postgres/messages.lisp.
The length appears to be computed correctly at macro-expansion time, but when the message is finally sent through (simple-describe-message socket) in cl-postgres:send-query, the code reads garbage instead.
The integer-writer macro (postmodern/cl-postgres/communicate.lisp) creates a set of functions called uint1, uint2, uint4, etc., which send bytes to the socket through write-byte. I changed every occurrence of write-byte in this macro to my-write-byte and defined:
    (defun my-write-byte (value socket)
      (format t "~a~%" value)
      (write-byte value socket))
I'm not sure if it's relevant, but I also commented out all the #.*optimize* declarations. With every write-byte intercepted by my-write-byte, the value of each byte sent for integer fields is printed. When it reaches the length field, the output ends with the following error:
    ;;; (pomo:query "select '42'")
    ;; protocol.lisp (send-query ..
    80   ; 'P' --- Parse command packet (simple-parse-message ..)
    0
    0
    0
    19   ; 19 bytes (length of parse packet)
    0
    0
    ;;;;; string select '42' is sent here
    0    ; number of param data types (0)
    0
    68   ; 'D' --- Describe command (simple-describe-message ..)
    0
    0
    0
    > Error: Fault during read of memory address #x0
    > While executing: CCL::VALID-HEADER-P, in process listener(1).
    > Type :POP to abort, :R for a list of available restarts.
    > Type :? for other options.
    1 > q
    ;;;; length of message = 1 in 32bit ccl. 6 in sbcl and 64bit ccl
No backtrace is available. The last byte (the least significant byte of the length value) should be 6, which tells PostgreSQL that the packet is six bytes long: a 4-byte length field plus 2 bytes of data. The value of the length variable, "base-length", computed during macro expansion of define-message for simple-describe-message, appears correct (= 6).
Sending 0x00000001 as the packet length naturally causes PostgreSQL to log the following error to syslog and drop the connection: invalid message length
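For reference, here is a minimal sketch of the bytes a correct Describe message should contain, so the expected length of 6 can be checked by hand. The helper names (uint4-bytes, bytes-uint4, describe-message-bytes) are hypothetical and are not part of cl-postgres. The sketch assumes the message describes the unnamed prepared statement, i.e. a payload of the byte 'S' followed by an empty, null-terminated name.

```lisp
;; Hypothetical helpers for illustration only; not cl-postgres code.

(defun uint4-bytes (value)
  "Return VALUE as a list of four bytes, most significant byte first."
  (loop for shift from 24 downto 0 by 8
        collect (ldb (byte 8 shift) value)))

(defun bytes-uint4 (bytes)
  "Decode a list of four bytes (most significant first) into an integer."
  (reduce (lambda (acc b) (+ (* acc 256) b)) bytes :initial-value 0))

(defun describe-message-bytes ()
  "Bytes of a Describe message for the unnamed prepared statement (assumed)."
  (let ((payload (list (char-code #\S) 0)))      ; 'S' + empty name terminator
    (append (list (char-code #\D))               ; message type 'D' = 68
            ;; The length counts its own 4 bytes plus the payload,
            ;; but not the leading type byte.
            (uint4-bytes (+ 4 (length payload)))
            payload)))
```

Under these assumptions, (describe-message-bytes) yields 68 followed by the length bytes 0 0 0 6, whereas the 32-bit build was observed to send 0 0 0 1, which (bytes-uint4 '(0 0 0 1)) decodes to 1.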
If I repeat with gdb attached to the process I get:
    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0xf740ab70 (LWP 5227)]
    0x1040547e in ?? ()
    (gdb) bt
    #0  0x1040547e in ?? ()
    #1  0x10404c65 in ?? ()
    #2  0x104400a5 in ?? ()
    #3  0x1044749d in ?? ()
    #4  0x08054445 in _SPFret1valn () at ../x86-spentry32.s:955
    #5  0x100d8d55 in ?? ()
    #6  0x08054445 in _SPFret1valn () at ../x86-spentry32.s:955
    #7  0x10506c9d in ?? ()
    #8  0x10502c25 in ?? ()
    #9  0x1086eb75 in ?? ()
    #10 0x100899f5 in ?? ()
I'm not certain but I'm inclined to believe this is a bug in ccl.