Ticket #670 (closed defect: worksforme)

Opened 4 years ago

Last modified 4 years ago

Clozure CL64 crash when viewing a webpage using javascript

Reported by: raffael cavallaro
Owned by:
Priority: major
Milestone:
Component: IDE
Version: trunk
Keywords: webkit IDE 64-bit intel browser-window
Cc:

Description (last modified by rme)

From a fresh launch of the trunk 64-bit Intel IDE (Clozure CL64), execute:

(require 'webkit)

which will load the webkit example from the ccl:examples directory.

When webkit is done loading, execute:

(ccl::browser-window "http://www.apple.com")

This will drop you into AltConsole with the following error and backtrace, which appear to be related to JavaScript execution. NB: the same series of steps does not produce an error under the 32-bit Intel IDE (Clozure CL32).

Unhandled exception 4 at 0x256157e28c4f, context->regs at #x7fff5fbfcfe0
Exception occurred while executing foreign code
? for help
[4160] Clozure CL kernel debugger: ?
(G)  Set specified GPR to new value
(R)  Show raw GPR/SPR register values
(L)  Show Lisp values of tagged registers
(F)  Show FPU registers
(S)  Find and describe symbol matching specified name
(B)  Show backtrace
(T)  Show info about current thread
(X)  Exit from this debugger, asserting that any exception was handled
(P)  Propagate the exception to another handler (debugger or OS)
(K)  Kill Clozure CL process
(?)  Show this help
[4160] Clozure CL kernel debugger: r
%rax = 0x000000001549a080      %r8  = 0x000000000000003f
%rcx = 0x0000000015b77838      %r9  = 0x00000000156496a0
%rdx = 0x0000000000000000      %r10 = 0x000000001549a2c0
%rbx = 0x0000000015c1e480      %r11 = 0x0000000015b80e40
%rsp = 0x00007fff5fbfd4b0      %r12 = 0x00000000000001e8
%rbp = 0x00007fff5fbfd520      %r13 = 0x0000000015710148
%rsi = 0x0000000015710148      %r14 = 0xffff000000000000
%rdi = 0x0000000015638000      %r15 = 0xffff000000000002
%rip = 0x0000256157e28c4f   %rflags = 0x00010202
[4160] Clozure CL kernel debugger: l
[4160] Clozure CL kernel debugger: f
f00: 0x1549a640 (4.072277e-26), 0x000000001549a640 (1.764547e-315)
f01: 0x1549a6c0 (4.072317e-26), 0x000000001549a6c0 (1.764547e-315)
f02: 0x1549a740 (4.072356e-26), 0x000000001549a740 (1.764548e-315)
f03: 0x1549a7c0 (4.072396e-26), 0x000000001549a7c0 (1.764549e-315)
f04: 0x00000002 (2.802597e-45), 0x0000000000000002 (9.881313e-324)
f05: 0x00730065 (1.056122e-38), 0x0061002e00730065 (7.565563e-307)
f06: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
f07: 0x00000000 (0.000000e+00), 0x3ff0000000000000 (1.000000e+00)
f08: 0x00000000 (0.000000e+00), 0x3ff0000000000000 (1.000000e+00)
f09: 0x00000000 (0.000000e+00), 0x3ff0000000000000 (1.000000e+00)
f10: 0x00000000 (0.000000e+00), 0x3ff0000000000000 (1.000000e+00)
f11: 0x00000000 (0.000000e+00), 0x3ff0000000000000 (1.000000e+00)
f12: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
f13: 0x00000000 (0.000000e+00), 0x4030000000000000 (1.600000e+01)
f14: 0x3b808081 (3.921569e-03), 0x3b8080813b808081 (4.368036e-22)
f15: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
mxcsr = 0x00001fa1
[4160] Clozure CL kernel debugger: b
current thread: tcr = 0x1007b0, native thread ID = 0x207, interrupts enabled


(#x000000000044BF30) #x0000300000DE030C : #<Anonymous Function #x0000300000DE025F> + 173
(#x000000000044BF50) #x0000300001178884 : #<Function (:OBJC-DISPATCH run) #x000030000117864F> + 565
(#x000000000044BF88) #x0000300001178164 : #<Function EVENT-LOOP #x0000300001177FBF> + 421
[4160] Clozure CL kernel debugger: t
Current Thread Context Record (tcr) = 0x1007b0
Control (C) stack area:  low = 0x7fff5f99c000, high = 0x7fff5fbffac0
Value (lisp) stack area: low = 0x200000, high = 0x44c000
Exception stack pointer = 0x7fff5fbfd4b0
[4160] Clozure CL kernel debugger: 
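
As a rough cross-check of the debugger output above, the following sketch tests whether the reported stack pointer lies on the C stack, which would be consistent with the "executing foreign code" message. All constants are copied from the transcript. The helper name `in_area` is made up for illustration, and treating the parenthesised backtrace addresses as frame addresses is an assumption.

```python
# Values copied from the kernel debugger transcript in the ticket description.
C_STACK = (0x7fff5f99c000, 0x7fff5fbffac0)   # Control (C) stack area: low, high
VALUE_STACK = (0x200000, 0x44c000)           # Value (lisp) stack area: low, high
RSP = 0x00007fff5fbfd4b0                     # %rsp / exception stack pointer
# Parenthesised addresses from the backtrace (assumed to be frame addresses).
LISP_FRAMES = [0x44BF30, 0x44BF50, 0x44BF88]


def in_area(addr, area):
    """Return True if addr lies in the half-open range [low, high)."""
    low, high = area
    return low <= addr < high


rsp_on_c_stack = in_area(RSP, C_STACK)
frames_on_value_stack = all(in_area(f, VALUE_STACK) for f in LISP_FRAMES)
c_stack_headroom = RSP - C_STACK[0]   # bytes between %rsp and the low end

print("rsp on C stack:", rsp_on_c_stack)
print("lisp frames on value stack:", frames_on_value_stack)
print("C stack headroom:", hex(c_stack_headroom))
```

If both checks hold and the headroom is large, a plain stack overflow looks unlikely as the cause, though this says nothing about what the foreign code was doing.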

Change History

comment:1 Changed 4 years ago by rme

  • Description modified

We've encountered this before, I think.

See http://paste.lisp.org/display/95994

I'd better include that text here.

Program received signal EXC_BAD_INSTRUCTION, Illegal instruction/operand.
0x0000263775a12cb2 in ?? ()
1: x/i $pc  0x263775a12cb2:	(bad)  
(gdb) where
#0  0x0000263775a12cb2 in ?? ()
#1  0x00007fff876d4d83 in JSC::Interpreter::execute ()
Previous frame inner to this frame (gdb could not unwind past this frame)

[from gb] In the several times that I've gotten this to crash, and in rme's original paste, the low 12 bits of %rip have consistently been #xcb2; the upper 52 bits have been random.

(gdb) info mach-region $pc
Region from 0x538079412000 to 0x5380f9400000 (rwx, max rwx; copy, private, not-reserved) (17 sub-regions)
(gdb) p/x $pc-0x538079412000
$1 = 0xcb2
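
The offset arithmetic can be reproduced outside gdb. This is a minimal sketch, assuming 4 KiB pages (so the page offset is the low 12 bits); the addresses are the ones quoted in this ticket, and the helper names `page_offset` and `page_base` are made up for illustration.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages (assumption): the page offset is the low 12 bits


def page_offset(addr):
    """Low 12 bits of an address."""
    return addr & (PAGE_SIZE - 1)


def page_base(addr):
    """Address with the low 12 bits cleared."""
    return addr & ~(PAGE_SIZE - 1)


region_base = 0x538079412000        # start of the region from `info mach-region`
pc_gdb = region_base + 0xcb2        # reconstructed from `p/x $pc-0x538079412000`
pc_paste = 0x0000263775a12cb2       # $pc in rme's paste
rip_ticket = 0x0000256157e28c4f     # %rip in the ticket description

for label, addr in [("gdb session", pc_gdb),
                    ("rme's paste", pc_paste),
                    ("ticket description", rip_ticket)]:
    print(f"{label}: base={page_base(addr):#x} offset={page_offset(addr):#x}")
```

The two gdb crashes share the offset 0xcb2, while the %rip in the ticket description ends in 0xc4f.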

The vmmap program can do a better job than GDB can of identifying the address in question:

JS JIT generated code  0000538079412000-0000538079413000 [    4K] rwx/rwx SM=COW  
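
For scale, the sizes of the two ranges reported above can be computed directly. This is only a sketch using the addresses printed by gdb and vmmap. The "inferred" start of a full 2 GiB reservation is a guess that assumes the reservation ends where the gdb region ends, not something either tool reported.

```python
TWO_GIB = 2 * 1024 ** 3

# Region reported by gdb's `info mach-region $pc`.
gdb_start, gdb_end = 0x538079412000, 0x5380f9400000
# Page reported by vmmap as "JS JIT generated code".
jit_start, jit_end = 0x538079412000, 0x538079413000

region_size = gdb_end - gdb_start
jit_page_size = jit_end - jit_start
shortfall = TWO_GIB - region_size          # how far short of 2 GiB the region is

# Hypothetical: if a full 2 GiB reservation ended at gdb_end, it would start here.
inferred_reservation_start = gdb_end - TWO_GIB

print(f"gdb region size: {region_size:#x} bytes")
print(f"JIT page size:   {jit_page_size:#x} bytes")
print(f"short of 2 GiB by {shortfall:#x} bytes")
print(f"inferred reservation start: {inferred_reservation_start:#x}")
```

The gdb region is slightly under 2 GiB, and the JIT page is a single 4 KiB page at its start.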

So, we're on a page in the middle of a 2GB reserved region, and the address of that reserved region seems to vary a lot from run to run (perhaps intentionally). There is in fact what appears to be JITted code on either side of that address:

(gdb) x/7i $pc-20
0x538079412c9e:	mov    $0x7fff843f4ff0,%r11
0x538079412ca8:	callq  *%r11
0x538079412cab:	mov    %rax,0x18(%r13)
0x538079412caf:	jmpq   0x5380794121b7
0x538079412cb4:	mov    %rax,0x8(%rsp)
0x538079412cb9:	movq   $0x158fc120,0x10(%rsp)
0x538079412cc2:	mov    %rsp,%rdi

Unfortunately, the address we're dying at is in the middle of the JMPQ at 0x...caf; whatever caused the apparent transfer to 0x...cb2 causes us to try to execute the most significant bytes of the JMPQ's displacement (#xff #xff in this case).
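
The #xff #xff bytes can be checked by re-encoding the jump by hand. This is a sketch that assumes the JIT emitted the 5-byte near-jump form (opcode E9 followed by a signed 32-bit little-endian displacement). The gdb listing, where the next instruction starts 5 bytes later at 0x...cb4, is consistent with that assumption. `encode_jmp_rel32` is a made-up helper, not part of any library.

```python
import struct


def encode_jmp_rel32(insn_addr, target):
    """Encode `jmp target` at insn_addr as E9 + rel32 (assumed encoding).

    The displacement is relative to the address of the next instruction,
    which is insn_addr + 5 for this 5-byte form.
    """
    rel = target - (insn_addr + 5)
    return b"\xe9" + struct.pack("<i", rel)


jmp_addr = 0x538079412caf    # address of the jmpq in the gdb listing
target = 0x5380794121b7      # its destination
crash_pc = 0x538079412cb2    # where execution actually ended up

code = encode_jmp_rel32(jmp_addr, target)
bytes_at_crash = code[crash_pc - jmp_addr:]   # bytes of the jmpq from crash_pc on

print("jmpq bytes:    ", code.hex(" "))
print("bytes at crash:", bytes_at_crash.hex(" "))
```

Under that assumption the backward displacement sign-extends so that its top two bytes are ff ff, and those are the bytes at 0x...cb2. As far as I know, FF followed by a ModRM byte of FF selects the /7 slot of that opcode group, which is not a defined instruction, so an illegal-instruction exception is the expected result.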

It's plausible that something intended to transfer control to 0x...cb2 and got the math wrong. I guess it's possible that CCL could influence this somehow, but I can't imagine how it would do so. (Even if CCL were randomly scribbling over memory, the fact that the JIT regions' addresses seem to vary from run to run suggests that the random scribbling would have to be incredibly lucky to keep finding those regions and scribbling on them.)

comment:2 Changed 4 years ago by rme

This seems to be working in 10.6.5.

Of course, Apple's web page is different now.

comment:3 Changed 4 years ago by rme

  • Status changed from new to closed
  • Resolution set to worksforme