1 | Using the Linux oprofile system-level profiler with CCL |
---|
2 | ======================================================= |
---|
3 | |
---|
4 | 'oprofile' (<http://oprofile.sourceforge.net>) is a system-level profiler |
---|
5 | that's available for most modern Linux distributions. |
---|
6 | |
---|
7 | Use of oprofile and its companion programs isn't really documented here; |
---|
8 | what -is- described is a way of generating symbolic information that |
---|
9 | enables profiling summaries generated by the 'opreport' program to |
---|
10 | identify lisp functions meaningfully. |
---|
11 | |
---|
12 | |
---|
13 | Generating a lisp image for use with oprofile |
---|
14 | --------------------------------------------- |
---|
15 | |
---|
16 | Modern Linux uses the 'ELF" (Executable and Linking Format) object |
---|
17 | file format; the oprofile tools can associate symbolic names with |
---|
18 | addresses in a memory-mapped file if that file appears to be an |
---|
19 | ELF object file and if it contains ELF symbol information that |
---|
20 | describes those memory regiions. So, the general idea is to make |
---|
21 | a lisp heap image that looks enough like an ELF shared library |
---|
22 | to fool the oprofile tools (we don't actually load heap images |
---|
23 | via ELF dynamic linking technology, but we can make it look like |
---|
24 | we did.) |
---|
25 | |
---|
26 | Prerequisites |
---|
27 | ------------- |
---|
28 | |
---|
29 | 1) oprofile itself, which is almost certainly available via your |
---|
30 | distributions's package management system if not already preinstalled. |
---|
31 | 2) 'libelf', which provides utilities for reading and writing ELF |
---|
32 | files (and is likewise likely preinstalled or readily installable.) |
---|
33 | |
---|
34 | Generating ELF symbols for lisp functions. |
---|
35 | ----------------------------------------- |
---|
36 | In order to create a lisp heap image which can be used for oprofile- |
---|
37 | based profiling, we need to: |
---|
38 | |
---|
39 | 1) load any code that we want to profile |
---|
40 | 2) generate a file that contains ELF symbol information describing |
---|
41 | the names and addresses of all lisp functions. |
---|
42 | |
---|
43 | This step involves doing (from within CCL) |
---|
44 | |
---|
45 | ? (require "ELF") |
---|
46 | "ELF" |
---|
47 | ("ELF") |
---|
48 | |
---|
49 | ? (ccl::write-elf-symbols-to-file "home:elf-symbols") |
---|
50 | |
---|
51 | The argument to CCL::WRITE-ELF-SYMBOLS-TO-FILE can be any writable |
---|
52 | pathname. The function will do whatever's necessary to nail lisp |
---|
53 | functions down in memory (so that they aren't moved by GC), then |
---|
54 | write an ELF object file to the indicated pathname. This typically |
---|
55 | takes a few seconds. |
---|
56 | |
---|
57 | 3) Generating a lisp heap image in which the ELF symbols generated |
---|
58 | in the previous step are prepended. |
---|
59 | |
---|
60 | The function CCL:SAVE-APPLICATION provides a :PREPEND-KERNEL argument, |
---|
61 | which is ordinarily used to save a standalone application in which |
---|
62 | the kernel and heap image occupy a single file. :PREPEND-KERNEL |
---|
63 | doesn't really care what it's prepending to the image, and we can |
---|
64 | just as easily ask it to prepend the ELF symbol file generated in |
---|
65 | the previous step. |
---|
66 | |
---|
67 | ? (save-application "somewhere/image-for-profiling" :prepend-kernel |
---|
68 | "home:elf-symbols") |
---|
69 | |
---|
70 | If you then run |
---|
71 | |
---|
72 | shell> ccl64 somewhare/image-for-profiling |
---|
73 | |
---|
74 | any lisp code sampled by oprofile in that image will be identified |
---|
75 | "symbolically" by opreport. |
---|
76 | |
---|
77 | Example |
---|
78 | ------- |
---|
79 | ;;; Define some lisp functions that we want to profile and save |
---|
80 | ;;; a profiling-enabled image. In this case, we just want to |
---|
81 | ;;; define the FACTORIAL funcion, to keep things simple. |
---|
82 | ? (defun fact (n) (if (zerop n) 1 (* n (fact (1- n))))) |
---|
83 | FACT |
---|
84 | ? (require "ELF") |
---|
85 | "ELF" |
---|
86 | ("ELF") |
---|
87 | ? (ccl::write-elf-symbols-to-file "home:elf-symbols") |
---|
88 | "home:elf-symbols" |
---|
89 | ? (save-application "home:profiled-ccl" :prepend-kernel "home:elf-symbols") |
---|
90 | |
---|
91 | ;;; Setup oprofile with (mostly) default arguments. This example was |
---|
92 | ;;; run on a Fedora 8 system where an uncompressed 'vmlinux' kernel |
---|
93 | ;;; image isn't readily available. |
---|
94 | |
---|
95 | ;;; Note that use of 'opcontrol' generally requires root access, e.g., |
---|
96 | ;;; 'sudo' or equivalent: |
---|
97 | |
---|
98 | [~] gb@rinpoche> sudo opcontrol --no-vmlinux --setup |
---|
99 | |
---|
100 | ;;; Start the profiler |
---|
101 | |
---|
102 | [~] gb@rinpoche> sudo opcontrol --start |
---|
103 | Using 2.6+ OProfile kernel interface. |
---|
104 | Using log file /var/lib/oprofile/samples/oprofiled.log |
---|
105 | Daemon started. |
---|
106 | Profiler running. |
---|
107 | |
---|
108 | ;;; Start CCL with the "profiled-ccl" image created above. |
---|
109 | ;;; Invoke "(FACT 10000)" |
---|
110 | |
---|
111 | [~] gb@rinpoche> ccl64 profiled-ccl |
---|
112 | Welcome to Clozure Common Lisp Version 1.2-r9198M-trunk (LinuxX8664)! |
---|
113 | ? (null (fact 10000)) |
---|
114 | NIL |
---|
115 | ? (quit) |
---|
116 | |
---|
117 | ;;; We could stop the profiler (opcontrol --stop) here; instead, |
---|
118 | ;;; we simply flush profiling data to disk, where 'opreport' can |
---|
119 | ;;; find it. |
---|
120 | |
---|
121 | [~] gb@rinpoche> sudo opcontrol --dump |
---|
122 | |
---|
123 | ;;; Ask opreport to show us where we were spending time in the |
---|
124 | ;;; 'profiled-ccl' image. |
---|
125 | |
---|
126 | [~] gb@rinpoche> opreport -l profiled-ccl | head |
---|
127 | CPU: Core 2, speed 1596 MHz (estimated) |
---|
128 | Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000 |
---|
129 | samples % symbol name |
---|
130 | 6417 65.2466 <Compiled-function.(:INTERNAL.MULTIPLY-UNSIGNED-BIGNUM-AND-1-DIGIT-FIXNUM.MULTIPLY-BIGNUM-AND-FIXNUM).(Non-Global)..0x30004002453F> |
---|
131 | 3211 32.6487 <Compiled-function.%MULTIPLY-AND-ADD4.0x300040000AAF> |
---|
132 | 17 0.1729 <Compiled-function.%%ONE-ARG-DCODE.0x3000401740AF> |
---|
133 | 11 0.1118 <Compiled-function.%UNLOCK-RECURSIVE-LOCK-OBJECT.0x30004007F7DF> |
---|
134 | 10 0.1017 <Compiled-function.AUTO-FLUSH-INTERACTIVE-STREAMS.0x3000404ED6AF> |
---|
135 | 7 0.0712 <Compiled-function.%NANOSLEEP.0x30004040385F> |
---|
136 | 7 0.0712 <Compiled-function.%ZERO-TRAILING-SIGN-DIGITS.0x300040030F3F> |
---|
137 | |
---|
138 | I think that we can conclude that (FACT 10000) mostly involves multiplying |
---|
139 | bignums by small fixnums. |
---|
140 | |
---|
141 | Issues |
---|
142 | ------ |
---|
143 | CCL::WRITE-ELF-SYMBOLS-TO-FILE currently only works on x86-64; it certainly |
---|
144 | -could- be made to work on ppc32/ppc64 as well. |
---|
145 | |
---|
146 | So far, no one has been able to make oprofile/opreport options that're |
---|
147 | supposed to generate call-stack info generate meaningful call-stack info. |
---|
148 | |
---|
149 | As of a few months ago, there was an attempt to provide symbol info |
---|
150 | for oprofile/opreport "on the fly", e.g., for use in JIT compilation |
---|
151 | or other incremental compilations scenarios. That's obviously more nearly |
---|
152 | The Right Thing, but it might be awhile before that experimental code |
---|
153 | makes it into widespread use. |
---|
154 | |
---|
155 | |
---|