Clozure CL Documentation


1. About Clozure CL
1.1. Introduction to Clozure CL
2. Obtaining, Installing, and Running Clozure CL
2.1. Releases and System Requirements
2.2. Obtaining Clozure CL
2.3. Command Line Set Up
2.4. Personal Customization with the Init File
2.5. Command Line Options
2.6. Using Clozure CL with GNU Emacs and SLIME
2.7. Example Programs
3. Building Clozure CL from its Source Code
3.1. Building Definitions
3.2. Setting Up to Build
3.3. Building Everything
3.4. Building the kernel
3.5. Building the heap image
4. Using Clozure CL
4.1. Introduction
4.2. Trace
4.3. Advising
4.4. Directory
4.5. Unicode
4.6. Pathnames
4.7. Memory-mapped Files
4.8. Static Variables
4.9. Saving Applications
4.10. Concatenating FASL Files
4.11. Floating Point Numbers
4.12. Watched Objects
4.13. Code Coverage
5. The Clozure CL IDE
5.1. Introduction
5.2. Building the IDE
5.3. Running the IDE
5.4. IDE Features
5.5. IDE Sources
5.6. The Application Builder
6. Programming with Threads
6.1. Threads Overview
6.2. (Intentionally) Missing Functionality
6.3. Implementation Decisions and Open Questions
6.4. Porting Code from the Old Thread Model
6.5. Background Terminal Input
6.6. The Threads which Clozure CL Uses for Its Own Purposes
6.7. Threads Dictionary
7. Programming with Sockets
7.1. Overview
7.2. Sockets Dictionary
8. Running Other Programs as Subprocesses
8.1. Overview
8.2. Examples
8.3. Limitations and known bugs
8.4. External-Program Dictionary
9. Streams
9.1. Stream Extensions
9.2. Creating Your Own Stream Classes with Gray Streams
10. Writing Portable Extensions to the Object System using the MetaObject Protocol
10.1. Overview
10.2. Implementation status
10.3. Concurrency issues
11. Profiling
11.1. Using the Linux oprofile system-level profiler
11.2. Using Apple's CHUD metering tools
12. The Foreign-Function Interface
12.1. Specifying And Using Foreign Types
12.2. Foreign Function Calls
12.3. Referencing and Using Foreign Memory Addresses
12.4. The Interface Database
12.5. Using Interface Directories
12.6. Using Shared Libraries
12.7. The Interface Translator
12.8. Case-sensitivity of foreign names in CCL
12.9. Reading Foreign Names
12.10. Tutorial: Using Basic Calls and Types
12.11. Tutorial: Allocating Foreign Data on the Lisp Heap
12.12. The Foreign-Function-Interface Dictionary
13. The Objective-C Bridge
13.1. Changes in 1.2
13.2. Using Objective-C Classes
13.3. Instantiating Objective-C Objects
13.4. Calling Objective-C Methods
13.5. Defining Objective-C Classes
13.6. Defining Objective-C Methods
13.7. Loading Frameworks
13.8. How Objective-C Names are Mapped to Lisp Symbols
14. Platform-specific notes
14.1. Overview
14.2. Unix/Posix/Darwin Features
14.3. Cocoa Programming in Clozure CL
14.4. Building an Application Bundle
14.5. Recommended Reading
14.6. Operating-System Dictionary
15. Understanding and Configuring the Garbage Collector
15.1. Heap space allocation
15.2. The Ephemeral GC
15.3. GC Page reclamation policy
15.4. "Pure" areas are read-only, paged from image file
15.5. Weak References
15.6. Weak References Dictionary
15.7. Garbage-Collection Dictionary
16. Implementation Details of Clozure CL
16.1. Threads and exceptions
16.2. Register usage and tagging
16.3. Heap Allocation
16.4. GC details
16.5. The ephemeral GC
16.6. Fasl files
16.7. The Objective-C Bridge
17. Modifying Clozure CL
17.1. Contributing Code Back to the Clozure CL Project
17.2. Using Clozure CL in "development" and in "user" mode
17.3. The Kernel Debugger
17.4. Using AltiVec in Clozure CL LAP functions
17.5. Development-Mode Dictionary
18. Questions and Answers
18.1. How can I do nonblocking (aka "unbuffered" and "raw") IO?
18.2. I'm using the graphics demos. Why doesn't the menubar change?
18.3. I'm using Slime and Cocoa. Why doesn't *standard-output* seem to work?
Glossary of Terms
Symbol Index

List of Tables

3.1. Platform-specific filename conventions
4.1. Line Termination Keywords

Chapter 1. About Clozure CL

1.1. Introduction to Clozure CL

Clozure CL is a fast, mature, open source Common Lisp implementation that runs on Linux, Mac OS X and BSD on either Intel x86-64 or PPC. Clozure CL was forked from Macintosh Common Lisp (MCL) in 1998 and the development has been entirely separate since. Ports to IA32 and Windows are under development.

When it was forked from MCL in 1998, the new Lisp was named OpenMCL. Recently, Clozure renamed its Lisp to Clozure CL, partly because its ancestor MCL has lately been released as open source. Clozure thought it might be confusing for users if there were two independent open-source projects with such similar names. The new name also reflects Clozure CL's current status as the flagship product of Clozure Associates.

Furthermore, the new name refers to Clozure CL's ancestry: in its early years, MCL was known as Coral Common Lisp, or "CCL". For years the package that contains most of Clozure CL's implementation-specific symbols has been named "CCL", an acronym that once stood for the name of the Lisp product. It seems fitting that "CCL" once again stands for the name of the product.

Some commands and source files may still refer to "OpenMCL" instead of Clozure CL.

Clozure CL compiles to native code and supports multithreading using native OS threads. It includes a foreign-function interface, and supports both Lisp code that calls external code, and external code that calls Lisp code. Clozure CL can create standalone executables on all supported platforms.

On Mac OS X, Clozure CL supports building GUI applications that use OS X's native Cocoa frameworks, and the OS X distributions include an IDE written with Cocoa, and distributed with complete sources.

On all supported platforms, Clozure CL can run as a command-line process, or as an inferior Emacs process using either SLIME or ILISP.

Features of Clozure CL include:

  • Very fast compilation speed.

  • A fast, precise, compacting, generational garbage collector written in hand-optimized C. The sizes of the generations are fully configurable. Typically, a generation can be collected in a millisecond on modern systems.

  • Fast execution speed, competitive with other Common Lisp implementations on most benchmarks.

  • Robust and stable. Customers report that their CPU-intensive, multi-threaded applications run for extended periods on Clozure CL without difficulty.

  • Full native OS threads on all platforms. Threads are automatically distributed across multiple cores. The API includes support for shared memory, locking, and blocking for OS operations such as I/O.

  • Full Unicode support.

  • Full SLIME integration.

  • An IDE on Mac OS X, fully integrated with the Macintosh window system and User Interface standards.

  • Excellent debugging facilities. The names of all local variables are available in a backtrace.

  • A complete, mature foreign function interface, including a powerful bridge to Objective-C and Cocoa on Mac OS X.

  • Many extensions, including files mapped to Common Lisp vectors for fast file I/O; thread-local hash tables and streams to eliminate locking overhead; cons hashing support; and much more.

  • Very efficient use of memory.

Although it's an open-source project, available free of charge under a liberal license, Clozure CL is also a fully-supported product of Clozure Associates. Clozure continues to extend, improve, and develop Clozure CL in response to customer and user needs, and offers full support and development services for Clozure CL.

Chapter 2. Obtaining, Installing, and Running Clozure CL

2.1. Releases and System Requirements

As of this writing, Clozure CL 1.4 is the latest release; it was made in October 2009. For up-to-date information about releases, please see http://ccl.clozure.com/.

Clozure CL runs on the following platforms:

  • Linux (x86, x86-64, ppc32, ppc64)

  • Mac OS X 10.4 and later (x86, x86-64, ppc32, ppc64)

  • FreeBSD 6.x and later (x86, x86-64)

  • Solaris (x86, x86-64)

  • Microsoft Windows XP and later (x86, x86-64)

2.1.1. LinuxPPC

Clozure CL requires version 2.2.13 (or later) of the Linux kernel and version 2.1.3 (or later) of the GNU C library (glibc) at a bare minimum.

2.1.2. Linux x86

Because of the nature of Linux distributions, it's difficult to give precise version number requirements. In general, a "fairly modern" (no more than two or three years old) kernel and C library are more likely to work well than older versions.

2.1.3. FreeBSD x86

Clozure CL should run on FreeBSD 6.x and 7.x. FreeBSD 7 users will need to install the "compat6x" package in order to use the distributed Clozure CL kernel, which is built on a FreeBSD 6.x system.

2.1.4. Mac OS X (ppc and x86)

Clozure CL runs under Mac OS X versions 10.4 and later. Post-1.4 versions will require at least 10.5.

64-bit versions of Clozure CL naturally require 64-bit processors (e.g., a G5 or Core 2 processor). Some early Intel-based Macintoshes used processors that don't support 64-bit operation, so the 64-bit Clozure CL will not run on them, although the 32-bit Clozure CL will.

2.1.5. Microsoft Windows

At the moment, the 32-bit Clozure CL does not run under 64-bit Windows.

2.2. Obtaining Clozure CL

There are two main ways to obtain Clozure CL. For Mac OS X, there are disk images that can be used to install Clozure CL in the usual Macintosh way. For other OSes, Subversion is the best way to obtain Clozure CL. Mac OS X users can also use Subversion if they prefer. Tarballs are available for those who prefer them, but if you have Subversion installed, it is simpler and more flexible to use Subversion than tarballs.

There are three popular ways to use Clozure CL: as a stand-alone double-clickable application (Mac OS X only), as a command-line application, or with Emacs and SLIME. The following sections describe these options.

2.2.1. The Mac Way

If you are using Mac OS X then you can install and use Clozure CL in the usual Macintosh way. Download and mount a disk image, then drag the ccl folder to the Applications folder or wherever you wish. After that you can double-click the Clozure CL application found inside the ccl directory. The disk images are available at ftp://clozure.com/pub/release/1.4/

So that Clozure CL can locate its source code, and for other reasons explained in Section 4.6.2, “Predefined Logical Hosts”, you should keep the Clozure CL application in the ccl directory. If you use a shell, you can set the value of the CCL_DEFAULT_DIRECTORY environment variable to explicitly indicate the location of the ccl directory. If you choose to do that, then the ccl directory and the Clozure CL application can each be in any location you find convenient.

2.2.2. Getting Clozure CL with Subversion

It is very easy to download, install, and build Clozure CL using Subversion. This is the preferred way to get either the latest, or a specific version of Clozure CL, unless you prefer the Mac Way. Subversion is a source code control system that is in wide use. Many OSes come with Subversion pre-installed. A complete, buildable and runnable set of Clozure CL sources and binaries can be retrieved with a single Subversion command.

Day-to-day development of Clozure CL takes place in an area of the Subversion repository known as the trunk. At most times, the trunk is perfectly usable, but occasionally it can be unstable or totally broken. If you wish to live on the bleeding edge, the following command will fetch a copy of the trunk for Darwin x86 (both 32- and 64-bit versions):

          
svn co http://svn.clozure.com/publicsvn/openmcl/trunk/darwinx86/ccl
        

To get a trunk Clozure CL for another platform, replace "darwinx86" with one of the following names (all versions include both 32- and 64-bit binaries):

  • darwinx86

  • linuxx86

  • freebsdx86

  • solarisx86

  • windows

  • linuxppc

  • darwinppc

Release versions of Clozure CL are intended to be stable. While bugs will be fixed in the release branches, enhancements and new features will go into the trunk. To get the 1.4 release of Clozure CL type:

          
svn co http://svn.clozure.com/publicsvn/openmcl/release/1.4/darwinx86/ccl
        

The above command will fetch the complete sources and binaries for the Darwin x86 build of Clozure CL. To get a Clozure CL for another platform, replace "darwinx86" with one of the following names (all versions include both 32- and 64-bit binaries):

  • darwinx86

  • linuxx86

  • freebsdx86

  • solarisx86

  • windows

  • linuxppc

  • darwinppc

These distributions contain complete sources and binaries. They use Subversion's "externals" features to share common sources; the majority of source code is the same across all versions.
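
The checkout URLs above follow a regular pattern, so the choice of platform and branch can be captured in a small helper. The following is a minimal sketch in POSIX shell, not part of the Clozure CL distribution: the function name ccl_svn_url is hypothetical, while the platform names and URL layout are taken from the commands shown above.

```shell
# ccl_svn_url PLATFORM [BRANCH]
# Print the Subversion URL for PLATFORM. BRANCH is "trunk" (the default)
# or a release number such as "1.4". (Hypothetical helper, for illustration.)
ccl_svn_url() {
    platform=$1
    branch=${2:-trunk}
    case $platform in
        darwinx86|linuxx86|freebsdx86|solarisx86|windows|linuxppc|darwinppc) ;;
        *) echo "unknown platform: $platform" >&2; return 1 ;;
    esac
    base=http://svn.clozure.com/publicsvn/openmcl
    if [ "$branch" = trunk ]; then
        echo "$base/trunk/$platform/ccl"
    else
        echo "$base/release/$branch/$platform/ccl"
    fi
}
```

With this definition, svn co "$(ccl_svn_url linuxx86 1.4)" would request the 1.4 release for Linux x86, and svn co "$(ccl_svn_url linuxx86)" would request the trunk.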

Once the checkout is complete you can build Clozure CL by running the lisp kernel and executing the rebuild-ccl function. For example:

          
joe:ccl> ./dx86cl64
Welcome to Clozure Common Lisp Version 1.2  (DarwinX8664)!
? (rebuild-ccl :full t)

<lots of compilation output>

? (quit)
joe:ccl>
        

If you don't have a C compiler toolchain installed, rebuild-ccl will not work. Please refer to Chapter 3, Building Clozure CL from its Source Code for additional details.

2.2.2.1. Checking Subversion Installation

If svn co doesn't work, then make sure that Subversion is installed on your system. Bring up a command line shell and type:

          
shell> svn
        

If Subversion is installed, you will see something like:

          
Type 'svn help' for usage
        

If Subversion is not installed, you will see something like:

          
-bash: svn: command not found
        

If Subversion is not installed, you'll need to figure out how to install it on your OS. You can find information about obtaining and installing Subversion at the Subversion Packages page.

2.2.3. Tarballs

Tarballs are available at ftp://clozure.com/pub/release/1.4/. Download and extract one on your local disk. Then edit the Clozure CL shell script to set the value of CCL_DEFAULT_DIRECTORY and start up the appropriate Clozure CL kernel. See Section 2.3.1, “The ccl Shell Script” for more information about the Clozure CL shell scripts.

2.3. Command Line Set Up

Sometimes it's convenient to use Clozure CL from a Unix shell command line. This is especially true when using Clozure CL as a way to run Common Lisp utilities.

2.3.1. The ccl Shell Script

Clozure CL needs to be able to find the ccl directory in order to support features such as require and provide, access to foreign interface information (see The Interface Database) and the Lisp build process (see Building Clozure CL from its Source Code). Specifically, it needs to set up logical pathname translations for the "ccl:" logical host. If this logical host isn't defined (or isn't defined correctly), some things might work, some things might not, and it'll generally be hard to invoke and use Clozure CL productively.

Clozure CL uses the value of the environment variable CCL_DEFAULT_DIRECTORY to determine the filesystem location of the ccl directory; the ccl shell script is intended to provide a way to invoke Clozure CL with that environment variable set correctly.
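
One common shell idiom for this kind of setup is to assign a default only when the variable has no value yet, so that a setting made elsewhere (for example in a login profile) wins. The sketch below illustrates the idea; it is not the text of the distributed ccl script, and the function name and the paths in the comments are hypothetical.

```shell
# set_ccl_default_directory DIR
# Export CCL_DEFAULT_DIRECTORY, using DIR only if the variable is
# currently unset or empty. (Hypothetical helper, for illustration.)
set_ccl_default_directory() {
    : "${CCL_DEFAULT_DIRECTORY:=$1}"
    export CCL_DEFAULT_DIRECTORY
}

# Example call, assuming the ccl directory lives in your home directory:
#   set_ccl_default_directory "$HOME/ccl"
```

Because the assignment is conditional, calling the function a second time with a different directory leaves the first value in place.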

There are two versions of the shell script: "ccl/scripts/ccl" is used to invoke 32-bit implementations of Clozure CL and "ccl/scripts/ccl64" is used to invoke 64-bit implementations.

To use the script:

  1. Copy the script to a directory that is on your PATH. This is often /usr/local/bin or ~/bin. It is better to do this than to add ccl/scripts to your PATH, because the script needs to be edited, and editing it in-place means that Subversion sees the script as modified.

  2. Edit the definition of CCL_DEFAULT_DIRECTORY near the beginning of the shell script so that it refers to your ccl directory. Alternatively, set the value of the CCL_DEFAULT_DIRECTORY environment variable in your .cshrc, .tcshrc, .bashrc, .bash_profile, .MacOSX/environment.plist, or wherever you usually set environment variables. If there is an existing definition of the variable, the ccl script will not override it.

  3. Ensure that the shell script is executable, for example:

    $ chmod +x ~/ccl/ccl/scripts/ccl64

    This command grants execute permission to the named script. If you are using a 32-bit platform, substitute "ccl" in place of "ccl64".

    Warning

    The above command won't work if you are not the owner of the installed copy of Clozure CL. In that case, you can use the "sudo" command like this:

    $ sudo chmod +x ~/ccl/ccl/scripts/ccl64

    Give your password when prompted.

    If the "sudo" command doesn't work, then you are not an administrator on the system you're using, and you don't have the appropriate "sudo" permissions. In that case you'll need to get help from the system's administrator.
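
Steps 1 and 3 above (copying the script and making it executable) can be sketched as a single shell function. This is only an illustration: the function name install_ccl_script is hypothetical, the paths in the comments are examples, and step 2 (pointing CCL_DEFAULT_DIRECTORY at your ccl directory) still has to be done separately.

```shell
# install_ccl_script SRC DEST_DIR
# Copy the launcher script SRC into DEST_DIR (created if necessary)
# and grant execute permission on the copy.
# (Hypothetical helper, for illustration.)
install_ccl_script() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" &&
    cp "$src" "$dest_dir/" &&
    chmod +x "$dest_dir/$(basename "$src")"
}

# Example call, assuming ~/bin is on your PATH:
#   install_ccl_script ~/ccl/scripts/ccl64 ~/bin
```

Working on a copy keeps the original in ccl/scripts untouched, so Subversion does not report it as modified.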

Note that most people won't need both ccl and ccl64 scripts. You only need both if you sometimes run 32-bit Clozure CL and sometimes run 64-bit Clozure CL. You can rename the script that you use to whatever you want. For example, if you are on a 64-bit system, and you only use Clozure CL in 64-bit mode, then you can rename ccl64 to ccl so that you only need to type "ccl" to run it.

Once this is done, it should be possible to invoke Clozure CL by typing ccl or ccl64 at a shell prompt:

> ccl [args ...]
Welcome to Clozure CL Version 1.2 (DarwinPPC32)!
?
      

The ccl shell script passes all of its arguments to the Clozure CL kernel. See Section 2.3.2, “Invocation” for more information about these arguments. When invoked this way, the Lisp should be able to initialize the "ccl:" logical host so that its translations refer to the "ccl" directory. To test this, you can call probe-file in Clozure CL's read-eval-print loop:

? (probe-file "ccl:level-1;level-1.lisp")  ;returns the physical pathname of the file
#P"/Users/alms/my_lisp_stuff/ccl/level-1/level-1.lisp"
      

2.3.2. Invocation

Assuming that the shell script is properly installed, it can be used to invoke Clozure CL from a shell prompt:

shell> ccl args
	    

ccl runs a 32-bit session; ccl64 runs a 64-bit session.

2.4. Personal Customization with the Init File

By default, Clozure CL tries to load the file "home:ccl-init.lisp" or the compiled "home:ccl-init.fasl" upon starting up. Clozure CL does this by executing (load "home:ccl-init"). If it's unable to load the file (for example, because the file doesn't exist), Clozure CL doesn't signal an error or warning; it just completes its startup normally.

On Unix systems, if "ccl-init.lisp" is not present, Clozure CL will look for ".ccl-init.lisp" (post 1.2 versions only).

The "home:" prefix to the filename is a Common Lisp logical host, which Clozure CL initializes to refer to your home directory. Clozure CL therefore looks for either of the files ~/ccl-init.lisp or ~/ccl-init.fasl.

Because the init file is loaded the same way as normal Lisp code is, you can put anything you want in it. For example, you can change the working directory, and load packages that you use frequently.

To suppress the loading of this init-file, invoke Clozure CL with the --no-init option.
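
When startup behaves unexpectedly, it can help to see which candidate init files actually exist. The following sketch lists them from the shell. It assumes that the "home:" logical host corresponds to $HOME, as described above; the function name list_ccl_init_files is hypothetical, and the function only reports which files exist, not which one Clozure CL would choose to load.

```shell
# list_ccl_init_files [HOME_DIR]
# Print each init file named in this section that exists in HOME_DIR
# (default: $HOME). (Hypothetical helper, for illustration.)
list_ccl_init_files() {
    home_dir=${1:-$HOME}
    for name in ccl-init.lisp ccl-init.fasl .ccl-init.lisp; do
        if [ -f "$home_dir/$name" ]; then
            echo "$home_dir/$name"
        fi
    done
    return 0
}
```

An empty result means that none of the files is present, in which case startup proceeds as if --no-init had been given.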

2.5. Command Line Options

When using Clozure CL from the command line, the following options may be used to modify its behavior. The exact set of Clozure CL command-line arguments may vary per platform and slowly changes over time. The current set of command line options may be retrieved by using the --help option.

  • -h (or --help). Provides a definitive (if somewhat terse) summary of the command line options accepted by the Clozure CL implementation and then exits.

  • -V (or --version). Prints the version of Clozure CL then exits. The version string is the same value that is returned by LISP-IMPLEMENTATION-VERSION.

  • -K character-encoding-name (or --terminal-encoding character-encoding-name). Specifies the character encoding to use for *TERMINAL-IO* (see Section 4.5.4, “Character Encodings”). Specifically, the character-encoding-name string is uppercased and interned in the KEYWORD package. If an encoding named by that keyword exists, CCL:*TERMINAL-CHARACTER-ENCODING-NAME* is set to the name of that encoding. CCL:*TERMINAL-CHARACTER-ENCODING-NAME* defaults to NIL, which is a synonym for :ISO-8859-1.

    For example:

    shell> ccl -K utf-8
    	      

    has the effect of making the standard CL streams use :UTF-8 as their character encoding.

  • -n (or --no-init). If this option is given, the init file is not loaded. This is useful if Clozure CL is being invoked by a shell script that should not be affected by whatever customizations a user might have in place.

  • -e form (or --eval). An expression is read (via READ-FROM-STRING) from the string form and evaluated. If form contains shell metacharacters, it may be necessary to escape or quote them to prevent the shell from interpreting them.

  • -l path (or --load path). Loads the file specified by path.

  • -T n (or --set-lisp-heap-gc-threshold n). Sets the Lisp gc threshold to n. (see Section 15.3, “GC Page reclamation policy”)

  • -Q (or --quiet). Suppresses printing of heralds and prompts when the --batch command line option is specified.

  • -R n (or --heap-reserve). Reserves n bytes for heap expansion. The default is 549755813888. (see Section 15.1, “Heap space allocation”)

  • -S n (or --stack-size n). Sets the size of the initial control stack to n. (see Section 6.3.1, “Thread Stack Sizes”)

  • -Z n (or --thread-stack-size n). Sets the size of the first thread's stack to n. (see Section 6.3.1, “Thread Stack Sizes”)

  • -b (or --batch). Execute in "batch mode". End-of-file from *STANDARD-INPUT* causes Clozure CL to exit, as do attempts to enter a break loop.

  • --no-sigtrap. An obscure option for running under GDB.

  • -I image-name (or --image-name image-name). Specifies the image name for the kernel to load. Defaults to the kernel name with ".image" appended.

The --load and --eval options can each be provided multiple times. They're executed in the order specified on the command line, after the init file (if there is one) is loaded and before the toplevel read-eval-print loop is entered.
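
As an illustration of how these options combine, the sketch below wraps a non-interactive run in a shell function. It uses only options described in this section; the function name run_lisp_batch and the CCL variable are hypothetical, and the default launcher name "ccl64" is an assumption about your setup.

```shell
# run_lisp_batch FILE FORM
# Skip the init file, load FILE, evaluate FORM, then quit.
# Options are processed in the order given on the command line.
# CCL names the launcher to run; "ccl64" is an assumed default.
# (Hypothetical helper, for illustration.)
run_lisp_batch() {
    "${CCL:-ccl64}" --no-init --batch --load "$1" --eval "$2" --eval '(quit)'
}

# Example call; the single quotes keep the shell from interpreting
# the parentheses in the form:
#   run_lisp_batch setup.lisp '(print (+ 1 2))'
```

Passing the form as one quoted argument is what protects shell metacharacters, as noted for the --eval option above.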

2.6. Using Clozure CL with GNU Emacs and SLIME

SLIME (see the SLIME web page) is an Emacs mode for interacting with Common Lisp systems. Clozure CL is well-supported by SLIME.

See the InstallingSlime topic on the Clozure CL wiki for some tips on how to get SLIME running with Clozure CL.

2.7. Example Programs

A number (OK, a small number) of example programs are distributed in the "ccl:examples;" directory of the source distribution. See the README-OPENMCL-EXAMPLES text file in that directory for information about prerequisites and usage.

Some of the example programs are derived from C examples in textbooks, etc.; in those cases, the original author and work are cited in the source code.

Unless the original author or contributor claims other rights, you're free to incorporate any of this example code or derivative thereof in any of your own works without restriction. In doing so, you agree that the code was provided "as is", and that no other party is legally or otherwise responsible for any consequences of your decision to use it.

If you've developed Clozure CL examples that you'd like to see added to the distribution, please send mail to the Clozure CL mailing lists. Any such contributions would be welcome and appreciated (as would bug fixes and improvements to the existing examples).

Chapter 3. Building Clozure CL from its Source Code

Clozure CL, like many other Lisp implementations, consists of a kernel and a heap image. The kernel is an ordinary C program, and is built with a C compiler. It provides very basic and fundamental facilities, such as memory management, garbage collection, and bootstrapping. All the higher-level features are written in Lisp, and compiled into the heap image. Both parts are needed to have a working Lisp implementation; neither the kernel nor the heap image can stand alone.

You may already know that, when you have a C compiler which is written in C, you need a working C compiler to build the compiler. Similarly, the Clozure CL heap image includes a Lisp compiler, which is written in Lisp. You therefore need a working Lisp compiler in order to build the Lisp heap image.

Where will you get a working Lisp compiler? No worries; you can use a precompiled copy of a (slightly older and compatible) version of Clozure CL. This section explains how to do all this.

In principle it should be possible to use another implementation of Common Lisp as the host compiler, rather than an older Clozure CL; this would be a challenging and experimental way to build, and is not described here.

3.1. Building Definitions

The following terms are used in subsequent sections; it may be helpful to refer to these definitions.

fasl files are the object files produced by compile-file. fasl files store the machine code associated with function definitions and the external representation of other lisp objects in a compact, machine-readable form. fasl is short for “FASt Loading”. Clozure CL uses different pathname types (extensions) to name fasl files on different platforms; see Table 3.1, “Platform-specific filename conventions”.

The Lisp kernel is a C program with a fair amount of platform-specific assembly language code. Its basic job is to map a lisp heap image into memory, transfer control to some compiled lisp code that the image contains, handle any exceptions that occur during the execution of that lisp code, and provide various other forms of runtime support for that code. Clozure CL uses different filenames to name the lisp kernel files on different platforms; see Table 3.1, “Platform-specific filename conventions”.

A heap image is a file that can be quickly mapped into a process's address space. Conceptually, it's not too different from an executable file or shared library in the OS's native format (ELF or Mach-O/dyld format); for historical reasons, Clozure CL's own heap images are in their own (fairly simple) format. The term full heap image refers to a heap image file that contains all of the code and data that comprise Clozure CL. Clozure CL uses different filenames to name the standard full heap image files on different platforms; see Table 3.1, “Platform-specific filename conventions”.

A bootstrapping image is a minimal heap image used in the process of building Clozure CL itself. The bootstrapping image contains just enough code to load the rest of Clozure CL from fasl files. It may help to think of the bootstrapping image as the egg and the full heap image as the chicken. Clozure CL uses different filenames to name the standard bootstrapping image files on different platforms; see Table 3.1, “Platform-specific filename conventions”.

Each supported platform (and possibly a few as-yet-unsupported ones) has a uniquely named subdirectory of ccl/lisp-kernel/; each such subdirectory contains a Makefile and may contain some auxiliary files (linker scripts, etc.) that are used to build the lisp kernel on a particular platform. The platform-specific name of the kernel build directory is described in Table 3.1, “Platform-specific filename conventions”.

3.1.1. Platform-specific filename conventions

Table 3.1. Platform-specific filename conventions

Platform      kernel        full-image      boot-image         fasl extension  kernel-build directory
DarwinPPC32   dppccl        dppccl.image    ppc-boot.image     .dfsl           darwinppc
LinuxPPC32    ppccl         ppccl.image     ppc-boot           .pfsl           linuxppc
DarwinPPC64   dppccl64      dppccl64.image  ppc-boot64.image   .d64fsl         darwinppc64
LinuxPPC64    ppccl64       ppccl64.image   ppc-boot64         .p64fsl         linuxppc64
LinuxX8664    lx86cl64      lx86cl64.image  x86-boot64         .lx64fsl        linuxx8664
LinuxX8632    lx86cl        lx86cl.image    x86-boot32         .lx32fsl        linuxx8632
DarwinX8664   dx86cl64      dx86cl64.image  x86-boot64.image   .dx64fsl        darwinx8664
DarwinX8632   dx86cl        dx86cl.image    x86-boot32.image   .dx32fsl        darwinx8632
FreeBSDX8664  fx86cl64      fx86cl64.image  fx86-boot64        .fx64fsl        freebsdx8664
FreeBSDX8632  fx86cl        fx86cl.image    fx86-boot32        .fx32fsl        freebsdx8632
SolarisX64    sx86cl64      sx86cl64.image  sx86-boot64        .sx64fsl        solarisx64
SolarisX86    sx86cl        sx86cl.image    sx86-boot32        .sx32fsl        solarisx86
Win64         wx86cl64.exe  wx86cl64.image  wx86-boot64.image  .wx64fsl        win64
Win32         wx86cl.exe    wx86cl.image    wx86-boot32.image  .wx32fsl        win32
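
Scripts that need to start the right kernel can look the name up from the table. The sketch below covers a few rows only. The kernel names come from Table 3.1, but the function name ccl_kernel_for is hypothetical, and the operating-system and machine strings are assumptions about what uname -s and uname -m typically report; check them on your own system before relying on this.

```shell
# ccl_kernel_for OS MACHINE
# Print the kernel name from Table 3.1 for a few common combinations.
# OS and MACHINE are expected to look like the output of "uname -s"
# and "uname -m" (assumed values, not taken from the Clozure CL docs).
# (Hypothetical helper, for illustration.)
ccl_kernel_for() {
    case "$1/$2" in
        Linux/x86_64)   echo lx86cl64 ;;
        Linux/i?86)     echo lx86cl ;;
        Linux/ppc64)    echo ppccl64 ;;
        Linux/ppc)      echo ppccl ;;
        Darwin/x86_64)  echo dx86cl64 ;;
        Darwin/i386)    echo dx86cl ;;
        FreeBSD/amd64)  echo fx86cl64 ;;
        FreeBSD/i386)   echo fx86cl ;;
        *) echo "no known kernel for $1/$2" >&2; return 1 ;;
    esac
}
```

A typical call would be ccl_kernel_for "$(uname -s)" "$(uname -m)"; combinations that are not listed make the function fail rather than guess.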

3.2. Setting Up to Build

At a given time, there are generally two versions of Clozure CL that you might want to use (and therefore might want to build from source):

  • The released version

  • The development version, called the "trunk", which may contain both interesting new features and interesting new bugs

All versions are available for download from svn.clozure.com via the Subversion source control system.

For example, to get a released version (1.3 in this example), use a command like:

	svn co http://svn.clozure.com/publicsvn/openmcl/release/1.3/xxx/ccl
      

To get the trunk version, use:

	svn co http://svn.clozure.com/publicsvn/openmcl/trunk/xxx/ccl
      

Change the "xxx" to one of the following names: darwinx86, linuxx86, freebsdx86, solarisx86, windows, linuxppc, or darwinppc.

In the case of released versions, there may also be tar archives available. See the Clozure CL Trac for details.

Subversion client programs are pre-installed on Mac OS X 10.5 and later and are typically either pre-installed or readily available on Linux and FreeBSD platforms. The Subversion web page contains links to Subversion client programs for many platforms; users of Mac OS X 10.4 can also install Subversion clients via Fink or MacPorts.

3.3. Building Everything

Given that you now have everything you need, do the following in a running Clozure CL to bring your Lisp system completely up to date.

? (ccl:rebuild-ccl :full t)
    

That call to the function rebuild-ccl performs the following steps:

  • Deletes all fasl files and other object files in the ccl directory tree

  • Runs an external process that does a make in the current platform's kernel build directory to create a new kernel. This step can only work if the C compiler and related tools are installed; see Section 3.4.1, “Kernel build prerequisites”.

  • Does (compile-ccl t) in the running lisp, to produce a set of fasl files from the “higher level” lisp sources.

  • Does (xload-level-0 :force) in the running lisp, to compile the lisp sources in the “ccl:level-0;” directory into fasl files and then create a bootstrapping image from those fasl files.

  • Runs another external process, which causes the newly compiled lisp kernel to load the new bootstrapping image. The bootstrapping image then loads the “higher level” fasl files and a new copy of the platform's full heap image is then saved.

If all goes well, it'll all happen without user intervention and with some simple progress messages. If anything goes wrong during execution of either of the external processes, the process output is displayed as part of a lisp error message.

rebuild-ccl is essentially just a short cut for running all the individual steps involved in rebuilding the system. You can also execute these steps individually, as described below.

3.4. Building the kernel

The Lisp kernel is the executable that you run to use Lisp. It doesn't actually contain the entire Lisp implementation; rather, it loads a heap image which contains the specifics (the "library", as it might be called if this were a C program). The kernel also provides runtime support to the heap image, such as garbage collection, memory allocation, exception handling, and the OS interface.

The Lisp kernel file has different names on different platforms. See Table 3.1, “Platform-specific filename conventions”. On all platforms the lisp kernel sources reside in ccl/lisp-kernel.

This section gives directions on how to rebuild the Lisp kernel from its source code. Most Clozure CL users will rarely have to do this. You probably will only need to do it if you are attempting to port Clozure CL to a new architecture or extend or enhance its kernel in some way. As mentioned above, this step happens automatically when you do

? (rebuild-ccl :full t)
      

3.4.1. Kernel build prerequisites

The Clozure CL kernel can be built with the following widely available tools:

  • cc or gcc - the GNU C compiler

  • ld - the GNU linker

  • m4 or gm4 - the GNU m4 macro processor

  • as - the GNU assembler (version 2.10.1 or later)

  • make - either GNU make or, on FreeBSD, the default BSD make program

In general, the more recent the versions of those tools, the better. Some versions of gcc 3.x on Linux have difficulty compiling some of the kernel source code correctly, so gcc 4.0 should be used if possible. On Mac OS X, the versions of the tools distributed with Xcode should work fine; on Linux, the versions of the tools installed with the OS (or available through its package management system) should work fine if they're "recent enough". On FreeBSD, the installed version of the m4 program doesn't support some features that the kernel build process depends on, so the GNU version of the m4 macro processor (called gm4 on FreeBSD) should be installed.

Note

In order to build the lisp kernel on Mac OS X 10.6 Snow Leopard, you must install the optional 10.4 support when installing Xcode.

3.4.2. Using "make" to build the lisp kernel

With those tools in place, do:

shell> cd ccl/lisp-kernel/PLATFORM
shell> make
	    

That'll assemble several assembly language source files, compile several C source files, and link the kernel, writing it two directories up (../../ relative to the build directory), that is, into the ccl directory.

3.5. Building the heap image

The initial heap image is loaded by the Lisp kernel, and provides most of the language implementation. The heap image captures the entire state of a running Lisp (except for external resources, such as open files and TCP sockets). After it is loaded, the contents of the new Lisp process's memory are exactly the same as those of the old Lisp process when the image was created.

The heap image is how we get around the fact that we can't run Lisp code until we have a working Lisp implementation, and we can't make our Lisp implementation work until we can run Lisp code. Since the heap image already contains a fully-working implementation, all we need to do is load it into memory and start using it.

If you're building a new version of Clozure CL, you need to build a new heap image.

(You might also wish to build a heap image if you have a large program that is very complicated or time-consuming to load, so that you will be able to load it once, save an image, and thenceforth never have to load it again. At any time, a heap image capturing the entire memory state of a running Lisp can be created by calling the function ccl:save-application.)
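A minimal sketch of that "load once, save an image" idea follows. The helper name save-preloaded-image and both file arguments are hypothetical; only ccl:save-application comes from the text above. Note that save-application does not return to the caller: the running lisp exits once the image has been written.

```lisp
;; Hypothetical helper: load a slow-to-load program, then snapshot the
;; resulting memory state into IMAGE-FILE.
(defun save-preloaded-image (source-file image-file)
  ;; Load the program whose load time we want to avoid in the future.
  (load source-file)
  ;; Write the heap image; the current lisp process terminates afterwards.
  (ccl:save-application image-file))
```

A typical call would look like (save-preloaded-image "my-large-program.lisp" "my-large-program.image"), after which the kernel can be started with the saved image instead of the standard one.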

3.5.1. Development cycle

Creating a new Clozure CL full heap image consists of the following steps:

  1. Using your existing Clozure CL, create a bootstrapping image.

  2. Using your existing Clozure CL, recompile your updated Clozure CL sources.

  3. Invoke Clozure CL with the bootstrapping image you just created (rather than with the existing full heap image).

When you invoke Clozure CL with the bootstrapping image, it starts up, loads all of the Clozure CL fasl files, and saves out a new full heap image. Voila. You've created a new heap image.

A few points worth noting:

  • There's a circular dependency between the full heap image and the bootstrapping image, in that each is used to build the other.

  • There are some minor implementation differences, but the environment in effect after the bootstrapping image has loaded its fasl files is essentially equivalent to the environment provided by the full heap image; the latter loads a lot faster and is easier to distribute, of course.

  • If the full heap image doesn't work (because of an OS compatibility problem or other bug), it's very likely that the bootstrapping image will suffer the same problems.

Given a bootstrapping image and a set of up-to-date fasl files, the development cycle usually involves editing lisp sources (or updating those sources via cvs update), recompiling modified files, and using the bootstrapping image to produce a new heap image.

3.5.2. Generating a bootstrapping image

The bootstrapping image isn't provided in Clozure CL distributions. It can be built from the source code provided in distributions (using a lisp image and kernel provided in those distributions) using the procedure described below.

The bootstrapping image is built by invoking a special utility inside a running Clozure CL heap image to load files contained in the ccl/level-0 directory. The bootstrapping image loads several dozen fasl files. After it's done so, it saves a heap image via save-application. This process is called "cross-dumping".

Given a source distribution, a lisp kernel, and a heap image, one can produce a bootstrapping image by first invoking Clozure CL from the shell:

shell> ccl
Welcome to Clozure CL .... !
?
	  

then calling ccl:xload-level-0 at the lisp prompt

? (ccl:xload-level-0)
	  

This function compiles the lisp sources in the ccl/level-0 directory if they're newer than the corresponding fasl files and then loads the resulting fasl files into a simulated lisp heap contained in data structures inside the running lisp. That simulated heap image is then written to disk.

xload-level-0 should be called whenever your existing boot image is out-of-date with respect to the source files in “ccl:level-0;”. Calling:

? (ccl:xload-level-0 :force)
      

forces recompilation of the level-0 sources.

3.5.3. Generating fasl files

Calling:

? (ccl:compile-ccl)
	  

at the lisp prompt compiles any fasl files that are out-of-date with respect to the corresponding lisp sources; (ccl:compile-ccl t) forces recompilation. ccl:compile-ccl reloads newly-compiled versions of some files; ccl:xcompile-ccl is analogous, but skips this reloading step.

Unless there are bootstrapping considerations involved, it usually doesn't matter whether these files are reloaded after they're recompiled.

Calling compile-ccl or xcompile-ccl in an environment where fasl files don't yet exist may produce warnings to that effect whenever files are required during compilation; those warnings can be safely ignored. Depending on the maturity of the Clozure CL release, calling compile-ccl or xcompile-ccl may also produce several warnings about undefined functions, etc. They should be cleaned up at some point.

3.5.4. Building a full image from a bootstrapping image

To build a full image from a bootstrapping image, just invoke the kernel with the bootstrapping image as an argument:

$ cd ccl                        # wherever your ccl directory is
$ ./KERNEL BOOT_IMAGE
	  

Where KERNEL and BOOT_IMAGE are the names of the kernel and boot image appropriate to the platform you are running on. See Table 3.1, “Platform-specific filename conventions”.

That should load a few dozen fasl files (printing a message as each file is loaded.) If all of these files successfully load, the lisp will print a prompt. You should be able to do essentially everything in that environment that you can in the environment provided by a "real" heap image. If you're confident that things loaded OK, you can save that image.

? (ccl:save-application "image_name") ; Overwriting the existing heap image
	  

Where image_name is the name of the full heap image for your platform. See Table 3.1, “Platform-specific filename conventions”.

If things go wrong in the early stages of the loading sequence, errors are often difficult to debug; until a fair amount of code (CLOS, the CL condition system, streams, the reader, the read-eval-print loop) is loaded, it's generally not possible for the lisp to report an error. Errors that occur during these early stages ("the cold load") sometimes cause the lisp kernel debugger to be invoked; it's primitive, but can sometimes help one to get oriented.

Chapter 4. Using Clozure CL

4.1. Introduction

The Common Lisp standard allows considerable latitude in the details of an implementation, and each particular Common Lisp system has some idiosyncrasies. This chapter describes ordinary user-level features of Clozure CL, including features that may be part of the Common Lisp standard, but which may have quirks or details in the Clozure CL implementation that are not described by the standard. It also describes extensions to the standard; that is, features of Clozure CL that are not part of the Common Lisp standard at all.

4.2. Trace

Clozure CL's tracing facility is invoked by an extended version of the Common Lisp trace macro. Extensions allow tracing of methods, as well as finer control over tracing actions.

TRACE {keyword global-value}* {spec | (spec {keyword local-value}*)}* [Macro]

The trace macro encapsulates the functions named by specs, causing trace actions to take place on entry and exit from each function. The default actions print a message on function entry and exit. Keyword/value options can be used to specify changes in the default behavior.

Invoking (trace) without arguments returns a list of functions being traced.

A spec is either a symbol that is the name of a function, or an expression of the form (setf symbol), or a specific method of a generic function in the form (:method gf-name {qualifier}* ({specializer}*)), where a specializer can be the name of a class or an EQL specializer.

A spec can also be a string naming a package, or equivalently a list (:package package-name), to request that all functions in the package be traced.
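The spec forms described above can be illustrated with a short sketch. All of the definitions here (circle, area, double, setting) are hypothetical and exist only for the example; the output printed on *trace-output* is not reproduced because its exact layout is not specified here.

```lisp
;; Hypothetical definitions used only for this illustration.
(defclass circle ()
  ((radius :initarg :radius :accessor radius)))

(defgeneric area (shape))

(defmethod area ((shape circle))
  (* pi (radius shape) (radius shape)))

(defun double (x) (* 2 x))

(defvar *setting* nil)

(defun (setf setting) (new-value)
  (setf *setting* new-value))

(trace double)                    ; a symbol naming a function
(trace (setf setting))            ; a (setf symbol) spec
(trace (:method area (circle)))   ; one specific method of AREA

;; Each of these calls should now print entry and exit messages.
(double 21)
(setf (setting) 5)
(area (make-instance 'circle :radius 1))

(trace)      ; with no arguments, returns a list of what is being traced
(untrace)    ; standard CL: with no arguments, removes all tracing
```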

By default, whenever a traced function is entered or exited, a short message is printed on *trace-output* showing the arguments on entry and values on exit. Options specified as key/value pairs can be used to modify this behavior. Options preceding the function specs apply to all the functions being traced. Options specified along with a spec apply to that spec only and override any global options. The following options are supported:

:methods {T | nil}

If true, and if applied to a spec naming a generic function, arranges to trace all the methods of the generic function in addition to the generic function itself.

:inside outside-spec | ({outside-spec}*)

Inhibits all trace actions unless the current invocation of the function being traced is inside one of the outside-spec's, i.e. unless a function named by one of the outside-spec's is currently on the stack. outside-spec can name a function, a method, or a package, as above.

:if form, :condition form

Evaluates form whenever the function being traced is about to be entered, and inhibits all trace actions if form returns nil. The form may reference the lexical variable ccl::args, which is a list of the arguments in this call. :condition is just a synonym for :if, though if both are specified, both must return non-nil.

:before-if form

Evaluates form whenever the function being traced is about to be entered, and inhibits the entry trace actions if form returns nil. The form may reference the lexical variable ccl::args, which is a list of the arguments in this call. If both :if and :before-if are specified, both must return non-nil in order for the before entry actions to happen.

:after-if form

Evaluates form whenever the function being traced has just exited, and inhibits the exit trace actions if form returns nil. The form may reference the lexical variable ccl::vals, which is a list of values returned by this call. If both :if and :after-if are specified, both must return non-nil in order for the after exit actions to happen.

:print-before form

Evaluates form whenever the function being traced is about to be entered, and prints the result before printing the standard entry message. The form may reference the lexical variable ccl::args, which is a list of the arguments in this call. To see multiple forms, use values: :print-before (values (one-thing) (another-thing)).

:print-after form

Evaluates form whenever the function being traced has just exited, and prints the result after printing the standard exit message. The form may reference the lexical variable ccl::vals, which is a list of values returned by this call. To see multiple forms, use values: :print-after (values (one-thing) (another-thing)).

:print form

Equivalent to :print-before form :print-after form.

:eval-before form

Evaluates form whenever the function being traced is about to be entered. The form may reference the lexical variable ccl::args, which is a list of the arguments in this call.

:eval-after form

Evaluates form whenever the function being traced has just exited. The form may reference the lexical variable ccl::vals, which is a list of values returned by this call.

:eval form

Equivalent to :eval-before form :eval-after form.

:break-before form

Evaluates form whenever the function being traced is about to be entered, and if the result is non-nil, enters a debugger break loop. The form may reference the lexical variable ccl::args, which is a list of the arguments in this call.

:break-after form

Evaluates form whenever the function being traced has just exited, and if the result is non-nil, enters a debugger break loop. The form may reference the lexical variable ccl::vals, which is a list of values returned by this call.

:break form

Equivalent to :break-before form :break-after form.

:backtrace-before form, :backtrace form

Evaluates form whenever the function being traced is about to be entered. The form may reference the lexical variable ccl::args, which is a list of the arguments in this call. The value returned by form is interpreted as follows:

nil

does nothing

:detailed

prints a detailed backtrace to *trace-output*.

(:detailed integer)

prints the top integer frames of a detailed backtrace to *trace-output*.

integer

prints the top integer frames of a terse backtrace to *trace-output*.

anything else

prints a terse backtrace to *trace-output*.

Note that unlike with the other options, :backtrace is equivalent to :backtrace-before only, not both before and after, since it's usually not helpful to print the same backtrace both before and after the function call.

:backtrace-after form

Evaluates form whenever the function being traced has just exited. The form may reference the lexical variable ccl::vals, which is a list of values returned by this call. The value returned by form is interpreted as follows:

nil

does nothing

:detailed

prints a detailed backtrace to *trace-output*.

(:detailed integer)

prints the top integer frames of a detailed backtrace to *trace-output*.

integer

prints the top integer frames of a terse backtrace to *trace-output*.

anything else

prints a terse backtrace to *trace-output*.

:before action

specifies the action to be taken just before the traced function is entered. action is one of:

:print

The default, prints a short indented message showing the function name and the invocation arguments.

:break

Equivalent to :before :print :break-before t

:backtrace

Equivalent to :before :print :backtrace-before t

function

Any other value is interpreted as a function to call on entry instead of printing the standard entry message. It is called with its first argument being the name of the function being traced, the remaining arguments being all the arguments to the function being traced, and ccl:*trace-level* bound to the current nesting level of trace actions.

:after action

specifies the action to be taken just after the traced function exits. action is one of:

:print

The default, prints a short indented message showing the function name and the returned values.

:break

Equivalent to :after :print :break-after t

:backtrace

Equivalent to :after :print :backtrace-after t

function

Any other value is interpreted as a function to call on exit instead of printing the standard exit message. It is called with its first argument being the name of the function being traced, the remaining arguments being all the values returned by the function being traced, and ccl:*trace-level* bound to the current nesting level of trace actions.
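A few of the options above are sketched here on hypothetical functions (scale, report, and other-caller are invented for the example). Option forms are written unquoted in the trace macro and are evaluated each time the traced function is called, which is why the (:detailed 2) backtrace value has to be quoted. The printed trace output is not shown.

```lisp
;; Hypothetical functions used only for this illustration.
(defun scale (x factor) (* x factor))
(defun report (x) (scale x 10))
(defun other-caller (x) (scale x 2))

;; :inside - trace SCALE only while REPORT is on the stack.
(trace (scale :inside report))
(report 1)          ; the call to SCALE happens inside REPORT
(other-caller 1)    ; REPORT is not on the stack for this call
(untrace scale)

;; :if - trace only calls whose first argument is negative.
;; The form reads the argument list through ccl::args.
(trace (scale :if (minusp (first ccl::args))))
(scale -1 3)
(scale 1 3)
(untrace scale)

;; :print-before / :print-after - print an extra value around the
;; standard messages, using ccl::args on entry and ccl::vals on exit.
(trace (scale :print-before (list :argument-count (length ccl::args))
              :print-after (list :value-count (length ccl::vals))))
(scale 2 3)
(untrace scale)

;; :backtrace - an integer asks for that many frames of a terse
;; backtrace; a quoted (:detailed n) list asks for detailed frames.
(trace (report :backtrace 3))
(report 1)
(untrace report)

(trace (report :backtrace '(:detailed 2)))
(report 1)
(untrace report)
```

Options written before the specs, as in the TRACE syntax line, would apply to every spec in the same call rather than to a single one.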

CCL:*TRACE-LEVEL* [Variable]

Variable bound to the current nesting level during execution of before and after trace actions. The default printing actions use it to determine the amount of indentation.

CCL:*TRACE-MAX-INDENT* [Variable]

The default before and after print actions will not indent by more than the value of ccl:*trace-max-indent* regardless of the current trace level.

CCL:TRACE-FUNCTION spec &key {keyword value}* [Function]

This is a functional version of the TRACE macro. spec and keywords are as for TRACE, except that all arguments are evaluated.
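Because trace-function evaluates its arguments, the spec is quoted and a function object can be passed directly as a :before action. The sketch below assumes the calling convention described for :before above (function name first, then the call's arguments); log-entry, square, and *entry-log* are hypothetical names.

```lisp
;; Hypothetical log of traced calls, most recent first.
(defvar *entry-log* '())

;; Called on entry instead of printing the standard entry message.
;; NAME is the traced function's name; ARGS are the call's arguments.
(defun log-entry (name &rest args)
  (push (list name args ccl:*trace-level*) *entry-log*))

(defun square (x) (* x x))

;; Arguments are evaluated here, so the spec is quoted.
(ccl:trace-function 'square :before #'log-entry)

(square 7)

(untrace square)
```

After the call, *entry-log* should hold one entry describing the call to square; the exit message is still printed by the default :after action.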

CCL:*TRACE-PRINT-LEVEL* [Variable]

The default print actions bind CL:*PRINT-LEVEL* to this value while printing. Note that this rebinding is only in effect during the default entry and exit messages. It does not apply to printing of :print-before/:print-after forms or any explicit printing done by user code.

CCL:*TRACE-PRINT-LENGTH* [Variable]

The default print actions bind CL:*PRINT-LENGTH* to this value while printing. Note that this rebinding is only in effect during the default entry and exit messages. It does not apply to printing of :print-before/:print-after forms or any explicit printing done by user code.

CCL:*TRACE-BAR-FREQUENCY* [Variable]

By default, this is nil. If non-nil it should be an integer, and the default entry and exit messages will print a | instead of space every this many levels of indentation.
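Since these are ordinary special variables, one way to use them is to bind them around the calls of interest. This is only a sketch; the function total and the particular values are arbitrary choices for the example.

```lisp
;; Hypothetical function used only for this illustration.
(defun total (numbers) (reduce #'+ numbers))

(trace total)

;; Limit how much of each argument the default messages print, cap the
;; indentation, and ask for a | every two levels of indentation.
(let ((ccl:*trace-print-length* 3)
      (ccl:*trace-print-level* 2)
      (ccl:*trace-max-indent* 20)
      (ccl:*trace-bar-frequency* 2))
  (total (list 1 2 3 4 5 6 7 8 9 10)))

(untrace total)
```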

4.3. Advising

The advise macro can be thought of as a more general version of trace. It allows code that you specify to run before, after, or around a given function, for the purpose of changing the behavior of the function. Each piece of added code is called a piece of advice. Each piece of advice has a unique name, so that you can have multiple pieces of advice on the same function, including multiple :before, :after, and :around pieces of advice.

The :name and :when keywords serve to identify the piece of advice. A later call to advise with the same values of :name and :when will replace the existing piece of advice; a call with different values will not.

[Macro]

advise spec form &key when name
Add a piece of advice to the function or method specified by spec according to form.

Arguments and Values:

spec--- A specification of the function on which to put the advice. This is either a symbol that is the name of a function or generic function, or an expression of the form (setf symbol), or a specific method of a generic function in the form (:method symbol {qualifiers} (specializer {specializer})).

form--- A form to execute before, after, or around the advised function. The form can refer to the variable arglist that is bound to the arguments with which the advised function was called. You can exit from form with (return).

name--- A name that identifies the piece of advice.

when--- An argument that specifies when the piece of advice is run. There are three allowable values. The default is :before, which specifies that form is executed before the advised function is called. Other possible values are :after, which specifies that form is executed after the advised function is called, and :around, which specifies that form is executed around the call to the advised function. Use (:do-it) within form to indicate invocation of the original definition.

Examples:

The function foo, already defined, does something with a list of numbers. The following code uses a piece of advice to make foo return zero if any of its arguments is not a number. Using :around advice, you can do the following:

(advise foo (if (some #'(lambda (n) (not (numberp n))) arglist)
	      0
	      (:do-it))
	:when :around :name :zero-if-not-nums)
	

To do the same thing using a :before piece of advice:

(advise foo (if (some #'(lambda (n) (not (numberp n))) arglist)
	      (return 0))
	:when :before :name :zero-if-not-nums)
	

[Macro]

unadvise spec &key when name
Remove the piece or pieces of advice matching spec, when, and name.

Description:

The unadvise macro removes the piece or pieces of advice matching spec, when, and name. When the value of spec is t and the values of when and name are nil, unadvise removes every piece of advice; when spec is t, the argument when is nil, and name is non-nil, unadvise removes all pieces of advice with the given name.

Arguments and Values:

The arguments have the same meaning as in advise.

[Macro]

advisedp spec &key when name
Return a list of the pieces of advice matching spec, when, and name.

Description:

The advisedp macro returns a list of existing pieces of advice that match spec, when, and name. When the value of spec is t and the values of when and name are nil, advisedp returns all existing pieces of advice.

Arguments and Values:

The arguments have the same meaning as in advise.
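The three macros can be used together as in the following sketch, which follows the :around example given earlier. The function add-all and the variable *advised-result* are hypothetical. The code assumes it is evaluated in a package that uses the CCL package (such as CL-USER), so that arglist names the variable that advise binds.

```lisp
;; Hypothetical function used only for this illustration.
(defun add-all (&rest numbers)
  (apply #'+ numbers))

;; Return zero instead of signalling an error on non-numeric arguments.
(advise add-all
        (if (some #'(lambda (n) (not (numberp n))) arglist)
            0
            (:do-it))
        :when :around :name :zero-if-not-nums)

;; Record what the advised function returns for a bad argument.
(defparameter *advised-result* (add-all 1 'a))

;; List the advice currently attached to ADD-ALL.
(advisedp add-all)

;; Remove just this piece of advice, identified by :when and :name.
(unadvise add-all :when :around :name :zero-if-not-nums)
```

Once the advice is removed, add-all behaves as originally defined, so a call with a non-numeric argument signals an error again.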

4.4. Directory

Clozure CL's DIRECTORY function accepts the following implementation-dependent keyword arguments:

:files boolean

If true, includes regular (non-directory) files in DIRECTORY's output. Defaults to T.

:directories boolean

If true, includes directories in DIRECTORY's output. Defaults to NIL.

:all boolean

If true, includes files and directories whose names start with a dot character in DIRECTORY's output. (Entries whose name is "." or ".." are never included.) Defaults to T.

:follow-links boolean

If true, includes the TRUENAMEs of symbolic or hard links in DIRECTORY's output; if false, includes the link filenames without attempting to resolve them. Defaults to T.

Note that legacy HFS alias files are treated as plain files.
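The keyword arguments above can be combined as in this sketch. The variable *home-wild* is a hypothetical name, and listing the user's home directory is an arbitrary choice made only so that the example has something to look at.

```lisp
;; A wildcard pathname matching entries in the user's home directory.
(defparameter *home-wild*
  (merge-pathnames "*.*" (user-homedir-pathname)))

;; Defaults: regular files only, including dot files.
(directory *home-wild*)

;; Subdirectories only.
(directory *home-wild* :directories t :files nil)

;; Regular files, skipping names that start with a dot.
(directory *home-wild* :all nil)

;; Files and directories, leaving links unresolved.
(directory *home-wild* :directories t :follow-links nil)
```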

4.5. Unicode

All characters and strings in Clozure CL fully support Unicode by using UTF-32. There is only one CHARACTER type and one STRING type in Clozure CL. There has been a lot of discussion about this decision which can be found by searching the openmcl-devel archives at http://clozure.com/pipermail/openmcl-devel/. Suffice it to say that we decided that the simplicity and speed advantages of only supporting UTF-32 outweigh the space disadvantage.

4.5.1. Characters

There is one CHARACTER type in Clozure CL. All CHARACTERs are BASE-CHARs. CHAR-CODE-LIMIT is now #x110000, which means that all Unicode characters can be directly represented. As of Unicode 5.0, only about 100,000 of 1,114,112 possible CHAR-CODEs are actually defined. The function CODE-CHAR knows that certain ranges of code values (notably the surrogate range #xd800-#xdfff) will never be valid character codes and will return NIL for arguments in that range, but may return a non-NIL value (an undefined/non-standard CHARACTER object) for other unassigned code values.

Clozure CL supports character names of the form u+xxxx, where xxxx is a sequence of one or more hex digits. The value of the hex digits denotes the code of the character. The + character is optional, so #\u+0020, #\U0020, and #\U+20 all refer to the #\Space character.

Characters with codes in the range #xa0-#x7ff also have symbolic names. These are the names from the Unicode standard with spaces replaced by underscores. So #\Greek_Capital_Letter_Epsilon can be used to refer to the character whose CHAR-CODE is #x395. To see the complete list of supported character names, look just below the definition for register-character-name in ccl:level-1;l1-reader.lisp.
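The character syntax described in this section can be checked at the lisp prompt with a few forms. This is only a sketch that restates the examples from the text as code.

```lisp
;; All three u+ spellings read as the same character as #\Space.
(char= #\u+0020 #\U0020 #\U+20 #\Space)

;; A symbolic Unicode name, with underscores in place of spaces.
(char-code #\Greek_Capital_Letter_Epsilon)

;; Every Unicode code point is below this limit.
char-code-limit

;; A code in the surrogate range never names a character.
(code-char #xd800)
```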

4.5.2. External Formats

OPEN, LOAD, and COMPILE-FILE all take an :EXTERNAL-FORMAT keyword argument. The value of :EXTERNAL-FORMAT can be :DEFAULT (the default value), a line termination keyword (see Section 4.5.3, “Line Termination Keywords”), a character encoding keyword (see Section 4.5.4, “Character Encodings”), an external-format object created using CCL::MAKE-EXTERNAL-FORMAT (see make-external-format), or a plist with keys: :DOMAIN, :CHARACTER-ENCODING and :LINE-TERMINATION. If argument is a plist, the result of (APPLY #'MAKE-EXTERNAL-FORMAT argument) will be used.

If :DEFAULT is specified, then the value of CCL:*DEFAULT-EXTERNAL-FORMAT* is used. If no line-termination is specified, then the value of CCL:*DEFAULT-LINE-TERMINATION* is used, which defaults to :UNIX. If no character encoding is specified, then CCL:*DEFAULT-FILE-CHARACTER-ENCODING* is used for file streams and CCL:*DEFAULT-SOCKET-CHARACTER-ENCODING* is used for socket streams. The initial value of both of these variables is NIL, which is a synonym for :ISO-8859-1.

Note that the set of keywords used to denote CHARACTER-ENCODINGs and the set of keywords used to denote line-termination conventions are disjoint: a keyword denotes at most a character encoding or a line termination convention, but never both.

EXTERNAL-FORMATs are objects (structures) with three read-only fields that can be accessed via the functions: EXTERNAL-FORMAT-DOMAIN, EXTERNAL-FORMAT-LINE-TERMINATION and EXTERNAL-FORMAT-CHARACTER-ENCODING.

[Function]

make-external-format &key domain character-encoding line-termination => external-format
Either creates a new external format object, or returns an existing one with the same specified slot values.

Arguments and Values:

domain---This is used to indicate where the external format is to be used. Its value can be almost anything. It defaults to NIL. There are two domains that have a pre-defined meaning in Clozure CL: :FILE indicates encoding for a file in the file system and :SOCKET indicates i/o to/from a socket. The value of domain affects the default values for character-encoding and line-termination.

character-encoding---A keyword that specifies the character encoding for the external format (see Section 4.5.4, “Character Encodings”). Defaults to :DEFAULT which means if domain is :FILE use the value of the variable CCL:*DEFAULT-FILE-CHARACTER-ENCODING* and if domain is :SOCKET, use the value of the variable CCL:*DEFAULT-SOCKET-CHARACTER-ENCODING*. The initial value of both of these variables is NIL, which means the :ISO-8859-1 encoding.

line-termination---A line termination keyword (see Section 4.5.3, “Line Termination Keywords”). Defaults to :DEFAULT which means use the value of the variable CCL:*DEFAULT-LINE-TERMINATION*.

external-format---An external-format object as described above.

Description:

Despite the function's name, it doesn't necessarily create a new, unique EXTERNAL-FORMAT object: two calls to MAKE-EXTERNAL-FORMAT with the same arguments made in the same dynamic environment return the same (eq) object.
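The following sketch builds an external format, uses it to write a file, and reads the file back using the plist form of :EXTERNAL-FORMAT. The variable names and the scratch file path are hypothetical. Symbols are written with the ccl:: prefix, following the spelling used for MAKE-EXTERNAL-FORMAT earlier in this section.

```lisp
;; An external format for UTF-8 files with Unix line endings.
(defparameter *utf-8-unix*
  (ccl::make-external-format :domain :file
                             :character-encoding :utf-8
                             :line-termination :unix))

;; The read-only fields can be inspected with the accessor functions.
(ccl::external-format-character-encoding *utf-8-unix*)
(ccl::external-format-line-termination *utf-8-unix*)

;; Hypothetical scratch file and some text containing a non-ASCII
;; character (Greek small letter epsilon, code #x3b5).
(defparameter *ef-demo-path* "/tmp/ccl-external-format-demo.txt")
(defparameter *ef-demo-text*
  (format nil "epsilon: ~C" (code-char #x3b5)))

;; Write using the external-format object.
(with-open-file (out *ef-demo-path* :direction :output
                                    :if-exists :supersede
                                    :external-format *utf-8-unix*)
  (write-line *ef-demo-text* out))

;; Read back using an equivalent plist.
(with-open-file (in *ef-demo-path*
                    :external-format '(:character-encoding :utf-8
                                       :line-termination :unix))
  (read-line in))
```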

4.5.3. Line Termination Keywords

Line termination keywords indicate which characters are used to indicate the end of a line. On input, the external line termination characters are replaced by #\Newline and on output, #\Newlines are converted to the external line termination characters.

Table 4.1. Line Termination Keywords

keyword      character(s)
:UNIX        #\Linefeed
:MACOS       #\Return
:CR          #\Return
:CRLF        #\Return #\Linefeed
:CP/M        #\Return #\Linefeed
:MSDOS       #\Return #\Linefeed
:DOS         #\Return #\Linefeed
:WINDOWS     #\Return #\Linefeed
:INFERRED    see below
:UNICODE     #\Line_Separator

:INFERRED means that a stream's line-termination convention is determined by looking at the contents of a file. It is only useful for FILE-STREAMs that are open for :INPUT or :IO. The first bufferful of data is examined, and if a #\Return character occurs before any #\Linefeed character, then the line termination type is set to :MACOS, otherwise it is set to :UNIX.
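To see the conversion at work, the sketch below writes two lines with :CRLF line termination, looks at the raw octets in the file, and then reads the lines back. The scratch file path and the helper file-octets are hypothetical. Based on the table above, each line in the file should end with octets 13 and 10.

```lisp
;; Hypothetical scratch file for the demonstration.
(defparameter *crlf-demo-path* "/tmp/ccl-crlf-demo.txt")

;; Each #\Newline written here is converted to #\Return #\Linefeed.
(with-open-file (out *crlf-demo-path* :direction :output
                                      :if-exists :supersede
                                      :external-format :crlf)
  (write-line "one" out)
  (write-line "two" out))

;; Hypothetical helper: return the file's contents as a list of octets.
(defun file-octets (path)
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (loop for octet = (read-byte in nil nil)
          while octet
          collect octet)))

;; Inspect the external representation.
(file-octets *crlf-demo-path*)

;; On input, the two-character terminator is turned back into #\Newline.
(with-open-file (in *crlf-demo-path* :external-format :crlf)
  (list (read-line in) (read-line in)))
```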

4.5.4. Character Encodings

Internally, all characters and strings in Clozure CL are in UTF-32. Externally, files or socket streams may encode characters in a wide variety of ways. The International Organization for Standardization, widely known as ISO, defines many of these character encodings. Clozure CL implements some of these encodings as detailed below. These encodings are part of the specification of external formats (see Section 4.5.2, “External Formats”). When reading from a stream, characters are converted from the specified external character encoding to UTF-32. When writing to a stream, characters are converted from UTF-32 to the specified character encoding.

Internally, CHARACTER-ENCODINGs are objects (structures) that are named by character encoding keywords (:ISO-8859-1, :UTF-8, etc.). The structures contain attributes of the encoding and functions used to encode/decode external data, but unless you're trying to define or debug an encoding there's little reason to know much about the CHARACTER-ENCODING objects and it's usually preferable to refer to a character encoding by its name.

4.5.4.1. Encoding Problems

On output to streams with character encodings that can encode the full range of Unicode—and on input from any stream—"unencodable characters" are represented using the Unicode #\Replacement_Character (= #\U+fffd); the presence of such a character usually indicates that something got lost in translation. Either data wasn't encoded properly or there was a bug in the decoding process.

4.5.4.2. Byte Order Marks

The endianness of a character encoding is sometimes explicit, and sometimes not. For example, :UTF-16BE indicates big-endian, but :UTF-16 does not specify endianness. A byte order mark is a special character that may appear at the beginning of a stream of encoded characters to specify the endianness of a multi-byte character encoding. (It may also be used with UTF-8 character encodings, where it is simply used to indicate that the encoding is UTF-8.)

Clozure CL writes a byte order mark as the first character of a file or socket stream when the endianness of the character encoding is not explicit. Clozure CL also expects a byte order mark on input from streams where the endianness is not explicit. If a byte order mark is missing from input data, that data is assumed to be in big-endian order.

A byte order mark from a UTF-8 encoded input stream is not treated specially and just appears as a normal character from the input stream. It is probably a good idea to skip over this character.
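One way to skip a leading byte order mark is to peek at the first character and discard it if it is U+FEFF. The helper below is a hypothetical sketch, not a Clozure CL function; it is demonstrated on a string stream so that it does not depend on any file.

```lisp
;; Hypothetical helper: if the next character on STREAM is the byte
;; order mark (U+FEFF), consume it. Returns STREAM either way.
(defun skip-byte-order-mark (stream)
  (when (eql (peek-char nil stream nil nil) (code-char #xfeff))
    (read-char stream))
  stream)

;; The leading U+FEFF is dropped and the rest of the line is returned.
(with-input-from-string (in (format nil "~Chello" (code-char #xfeff)))
  (skip-byte-order-mark in)
  (read-line in))

;; Input without a byte order mark is left untouched.
(with-input-from-string (in "hello")
  (skip-byte-order-mark in)
  (read-line in))
```

The same helper could be called right after opening a file with :external-format :utf-8, before any other reads from the stream.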

4.5.4.3. DESCRIBE-CHARACTER-ENCODINGS

The set of character encodings supported by Clozure CL can be retrieved by calling CCL:DESCRIBE-CHARACTER-ENCODINGS.

[Function]

describe-character-encodings
Writes descriptions of defined character encodings to *terminal-io*.

Description:

Writes descriptions of all defined character encodings to *terminal-io*. These descriptions include the names of the encoding's aliases and a doc string which briefly describes each encoding's properties and intended use.

4.5.4.4. Supported Character Encodings

The list of supported encodings is reproduced here. Most encodings have aliases, e.g. the encoding named :ISO-8859-1 can also be referred to by the names :LATIN1 and :IBM819, among others. Where possible, the keywordized name of an encoding is equivalent to the preferred MIME charset name (and the aliases are all registered IANA charset names.)

:ISO-8859-1

An 8-bit, fixed-width character encoding in which all character codes map to their Unicode equivalents. Intended to support most characters used in most Western European languages.

Clozure CL uses ISO-8859-1 encoding for *TERMINAL-IO* and for all streams whose EXTERNAL-FORMAT isn't explicitly specified. The default for *TERMINAL-IO* can be set via the -K command-line argument (see Section 2.5, “Command Line Options”).

ISO-8859-1 just covers the first 256 Unicode code points, where the first 128 code points are equivalent to US-ASCII. That should be pretty much equivalent to what earlier versions of Clozure CL did that only supported 8-bit characters, but it may not be optimal for users working in a particular locale.

Aliases: :ISO_8859-1, :LATIN1, :L1, :IBM819, :CP819, :CSISOLATIN1

:ISO-8859-2

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in most languages used in Central/Eastern Europe.

Aliases: :ISO_8859-2, :LATIN2, :L2, :CSISOLATIN2

:ISO-8859-3

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in most languages used in Southern Europe.

Aliases: :ISO_8859-3, :LATIN3, :L3, :CSISOLATIN3

:ISO-8859-4

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in most languages used in Northern Europe.

Aliases: :ISO_8859-4, :LATIN4, :L4, :CSISOLATIN4

:ISO-8859-5

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in the Cyrillic alphabet.

Aliases: :ISO_8859-5, :CYRILLIC, :CSISOLATINCYRILLIC, :ISO-IR-144

:ISO-8859-6

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in the Arabic alphabet.

Aliases: :ISO_8859-6, :ARABIC, :CSISOLATINARABIC, :ISO-IR-127

:ISO-8859-7

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in the Greek alphabet.

Aliases: :ISO_8859-7, :GREEK, :GREEK8, :CSISOLATINGREEK, :ISO-IR-126, :ELOT_928, :ECMA-118

:ISO-8859-8

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in the Hebrew alphabet.

Aliases: :ISO_8859-8, :HEBREW, :CSISOLATINHEBREW, :ISO-IR-138

:ISO-8859-9

An 8-bit, fixed-width character encoding in which codes #x00-#xcf map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in the Turkish alphabet.

Aliases: :ISO_8859-9, :LATIN5, :CSISOLATIN5, :ISO-IR-148

:ISO-8859-10

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in Nordic alphabets.

Aliases: :ISO_8859-10, :LATIN6, :CSISOLATIN6, :ISO-IR-157

:ISO-8859-11

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in the Thai alphabet.

:ISO-8859-13

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in Baltic alphabets.

:ISO-8859-14

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in Celtic languages.

Aliases: :ISO_8859-14, :ISO-IR-199, :LATIN8, :L8, :ISO-CELTIC

:ISO-8859-15

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in Western European languages (including the Euro sign and some other characters missing from ISO-8859-1).

Aliases: :ISO_8859-15, :LATIN9

:ISO-8859-16

An 8-bit, fixed-width character encoding in which codes #x00-#x9f map to their Unicode equivalents and other codes map to other Unicode character values. Intended to provide most characters found in Southeast European languages.

Aliases: :ISO_8859-16, :ISO-IR-226, :LATIN10

:MACINTOSH

An 8-bit, fixed-width character encoding in which codes #x00-#x7f map to their Unicode equivalents and other codes map to other Unicode character values. Traditionally used on Classic MacOS to encode characters used in western languages.

Aliases: :MACOS-ROMAN, :MACOSROMAN, :MAC-ROMAN, :MACROMAN

:UCS-2

A 16-bit, fixed-length encoding in which characters with CHAR-CODEs less than #x10000 can be encoded in a single 16-bit word. The endianness of the encoded data is indicated by the endianness of a byte-order-mark character (#u+feff) prepended to the data; in the absence of such a character on input, the data is assumed to be in big-endian order.

:UCS-2BE

A 16-bit, fixed-length encoding in which characters with CHAR-CODEs less than #x10000 can be encoded in a single 16-bit big-endian word. The encoded data is implicitly big-endian; byte-order-mark characters are not interpreted on input or prepended to output.

:UCS-2LE

A 16-bit, fixed-length encoding in which characters with CHAR-CODEs less than #x10000 can be encoded in a single 16-bit little-endian word. The encoded data is implicitly little-endian; byte-order-mark characters are not interpreted on input or prepended to output.

:US-ASCII

A 7-bit, fixed-width character encoding in which all character codes map to their Unicode equivalents.

Aliases: :CSASCII, :CP637, :IBM637, :US, :ISO646-US, :ASCII, :ISO-IR-6

:UTF-16

A 16-bit, variable-length encoding in which characters with CHAR-CODEs less than #x10000 can be encoded in a single 16-bit word and characters with larger codes can be encoded in a pair of 16-bit words. The endianness of the encoded data is indicated by the endianness of a byte-order-mark character (#u+feff) prepended to the data; in the absence of such a character on input, the data is assumed to be in big-endian order. Output is written in native byte-order with a leading byte-order mark.

:UTF-16BE

A 16-bit, variable-length encoding in which characters with CHAR-CODEs less than #x10000 can be encoded in a single 16-bit big-endian word and characters with larger codes can be encoded in a pair of 16-bit big-endian words. The endianness of the encoded data is implicit in the encoding; byte-order-mark characters are not interpreted on input or prepended to output.

:UTF-16LE

A 16-bit, variable-length encoding in which characters with CHAR-CODEs less than #x10000 can be encoded in a single 16-bit little-endian word and characters with larger codes can be encoded in a pair of 16-bit little-endian words. The endianness of the encoded data is implicit in the encoding; byte-order-mark characters are not interpreted on input or prepended to output.

:UTF-32

A 32-bit, fixed-length encoding in which all Unicode characters can be encoded in a single 32-bit word. The endianness of the encoded data is indicated by the endianness of a byte-order-mark character (#u+feff) prepended to the data; in the absence of such a character on input, input data is assumed to be in big-endian order. Output is written in native byte order with a leading byte-order mark.

Alias: :UCS-4

:UTF-32BE

A 32-bit, fixed-length encoding in which all Unicode characters can be encoded in a single 32-bit word. The encoded data is implicitly big-endian; byte-order-mark characters are not interpreted on input or prepended to output.

Alias: :UCS-4BE

:UTF-8

An 8-bit, variable-length character encoding in which characters with CHAR-CODEs in the range #x00-#x7f can be encoded in a single octet; characters with larger code values can be encoded in 2 to 4 bytes.

:UTF-32LE

A 32-bit, fixed-length encoding in which all Unicode characters can be encoded in a single 32-bit word. The encoded data is implicitly little-endian; byte-order-mark characters are not interpreted on input or prepended to output.

Alias: :UCS-4LE

:WINDOWS-31J

An 8-bit, variable-length character encoding in which character code points in the range #x00-#x7f can be encoded in a single octet; characters with larger code values can be encoded in 2 bytes.

Aliases: :CP932, :CSWINDOWS31J

:EUC-JP

An 8-bit, variable-length character encoding in which character code points in the range #x00-#x7f can be encoded in a single octet; characters with larger code values can be encoded in 2 to 3 bytes.

Alias: :EUCJP
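
As a minimal sketch of how these names are used, an encoding name or one of its aliases can be passed as the :EXTERNAL-FORMAT argument when opening a stream. The file name and the helper functions below are hypothetical, and the example assumes a writable /tmp directory.

```lisp
;; Hypothetical scratch file; any writable path will do.
(defparameter *demo-file* "/tmp/ccl-encoding-demo.txt")

;; A one-character string: LATIN SMALL LETTER E WITH ACUTE (U+00E9).
(defparameter *demo-text* (string (code-char #xE9)))

(defun write-demo-file ()
  "Write *DEMO-TEXT* to *DEMO-FILE*, encoded as UTF-8."
  (with-open-file (out *demo-file* :direction :output
                                   :if-exists :supersede
                                   :external-format :utf-8)
    (write-string *demo-text* out)))

(defun read-demo-file (encoding)
  "Return the first line of *DEMO-FILE*, decoded according to ENCODING."
  (with-open-file (in *demo-file* :external-format encoding)
    (read-line in)))
```

After (write-demo-file), reading the file back with :utf-8 should reproduce the original one-character string, while reading it with the alias :latin1 decodes the same two octets as two separate characters.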

4.5.4.5. Encoding and Decoding Strings

Clozure CL provides functions to encode and decode strings to and from vectors of type (simple-array (unsigned-byte 8)).

[Function]

count-characters-in-octet-vector vector &key start end external-format

Description:

Returns the number of characters that would be produced by decoding vector (or the subsequence thereof delimited by start and end) according to external-format.

[Function]

decode-string-from-octets vector &key start end external-format string

Description:

Decodes the octets in vector (or the subsequence of it delimited by start and end) into a string according to external-format.

If string is supplied, output will be written into it. It must be large enough to hold the decoded characters. If string is not supplied, a new string will be allocated to hold the decoded characters.

Returns, as multiple values, the decoded string and the position in vector where the decoding ended.

[Function]

encode-string-to-octets string &key start end external-format use-byte-order-mark vector vector-offset

Description:

Encodes string (or the substring delimited by start and end) into external-format and returns, as multiple values, a vector of octets containing the encoded data and an integer that specifies the offset into the vector where the encoded data ends.

When use-byte-order-mark is true, a byte-order mark will be included in the encoded data.

If vector is supplied, output will be written to it. It must be of type (simple-array (unsigned-byte 8)) and be large enough to hold the encoded data. If it is not supplied, the function will allocate a new vector.

If vector-offset is supplied, data will be written into the output vector starting at that offset.

[Function]

string-size-in-octets string &key start end external-format use-byte-order-mark

Description:

Returns the number of octets required to encode string (or the substring delimited by start and end) into external-format.

When use-byte-order-mark is true, the returned size will include space for a byte-order mark.
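
The following sketch shows how the four functions above might be combined in a UTF-8 round trip. The variable *sample* and the function utf-8-round-trip are hypothetical names introduced only for illustration.

```lisp
;; "caf" followed by U+00E9, built without non-ASCII source characters.
(defparameter *sample*
  (concatenate 'string "caf" (string (code-char #xE9))))

(defun utf-8-round-trip (string)
  "Encode STRING as UTF-8 and decode it again.
Returns the octet vector and the decoded string as two values."
  (let ((octets (ccl:encode-string-to-octets string :external-format :utf-8)))
    (values octets
            (ccl:decode-string-from-octets octets :external-format :utf-8))))
```

For this four-character string the octet vector should hold five octets, because U+00E9 takes two octets in UTF-8. CCL:STRING-SIZE-IN-OCTETS should report the same size before any encoding is done, and CCL:COUNT-CHARACTERS-IN-OCTET-VECTOR should report four characters for the encoded vector.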

4.6. Pathnames

4.6.1. Pathname Expansion

Leading tilde (~) characters in physical pathname namestrings are expanded in the way that most shells do:

"~user/..." can be used to refer to an absolute pathname rooted at the home directory of the user named "user".

"~/..." can be used to refer to an absolute pathname rooted at the home directory of the current user.

4.6.2. Predefined Logical Hosts

Clozure CL sets up logical pathname translations for two logical hosts: ccl and home.

The CCL logical host should point to the ccl directory. Clozure CL uses it for a variety of purposes, including locating Clozure CL source code, require and provide, accessing foreign function information, and the Clozure CL build process. It is set to the value of the environment variable CCL_DEFAULT_DIRECTORY, which is set by the ccl shell script (see Section 2.3.1, “The ccl Shell Script”). If CCL_DEFAULT_DIRECTORY is not set, the host is set to the directory containing the current heap image.
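
To see where these hosts point in a running image, the standard functions TRANSLATE-LOGICAL-PATHNAME and LOGICAL-PATHNAME-TRANSLATIONS can be used. The helper names below are hypothetical, and the output depends entirely on the local installation.

```lisp
(defun logical-host-root (host)
  "Return the physical pathname that the logical host named HOST maps to."
  (translate-logical-pathname (concatenate 'string host ":")))

(defun show-predefined-hosts ()
  "Print the physical location behind each predefined logical host."
  (dolist (host '("ccl" "home"))
    (format t "~&~A: => ~A~%" host (logical-host-root host))))
```

Evaluating (logical-pathname-translations "ccl") lists the full set of translation rules for the host rather than just its root.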

4.6.3. OS X (Darwin)

Clozure CL assumes that pathname strings are decomposed UTF-8.

4.6.4. Linux

Pathname strings are treated as null-terminated strings coded in the encoding named by the value returned by the function CCL:PATHNAME-ENCODING-NAME. This value may be changed with SETF.

4.6.5. FreeBSD

Pathname strings are treated as null-terminated strings encoded according to the current locale; a future release may change this convention to use UTF-8.

4.7. Memory-mapped Files

In release 1.2 and later, Clozure CL supports memory-mapped files. On operating systems that support memory-mapped files (including Mac OS X, Linux, and FreeBSD), the operating system can arrange for a range of virtual memory addresses to refer to the contents of an open file. As long as the file remains open, programs can read values from the file by reading addresses in the mapped range.

Using memory-mapped files may in some cases be more efficient than reading the contents of a file into a data structure in memory.

Clozure CL provides the functions CCL:MAP-FILE-TO-IVECTOR and CCL:MAP-FILE-TO-OCTET-VECTOR to support memory-mapping. These functions return vectors whose contents are the contents of memory-mapped files. Reading an element of such a vector returns data from the corresponding position in the file.

Without memory-mapped files, a common idiom for reading the contents of files might be something like this:

;; Read the whole file into a freshly allocated vector of octets.
(with-open-file (stream pathname :direction :input
                                 :element-type '(unsigned-byte 8))
  (let ((vector (make-array (file-length stream)
                            :element-type '(unsigned-byte 8))))
    (read-sequence vector stream)
    vector))
    

Using a memory-mapped file produces the same result in that, like the above example, it yields a vector whose contents are the same as the contents of the file. It differs in that the above example creates a new vector in memory and copies the file's contents into it; using a memory-mapped file instead arranges for the vector's elements to point to the file's contents on disk directly, without copying them into memory first.

The vectors returned by CCL:MAP-FILE-TO-IVECTOR and CCL:MAP-FILE-TO-OCTET-VECTOR are read-only; any attempt to change an element of a vector returned by these functions results in a memory-access error. Clozure CL does not currently support writing data to memory-mapped files.

Vectors created by CCL:MAP-FILE-TO-IVECTOR and CCL:MAP-FILE-TO-OCTET-VECTOR are required to respect Clozure CL's limit on the total size of an array. That means that you cannot use these functions to create a vector longer than ARRAY-TOTAL-SIZE-LIMIT, even if the filesystem supports file sizes that are larger. The value of ARRAY-TOTAL-SIZE-LIMIT is (EXPT 2 24) on 32-bit platforms and (EXPT 2 56) on 64-bit platforms.

CCL:MAP-FILE-TO-IVECTOR pathname element-type [Function]

pathname

The pathname of the file to be memory-mapped.

element-type

The element-type of the vector to be created. Specified as a type-specifier that names a subtype of either SIGNED-BYTE or UNSIGNED-BYTE.

The map-file-to-ivector function tries to open the file at pathname for reading. If that succeeds, it maps the file's contents to a range of virtual addresses and returns a read-only vector whose element-type is given by element-type and whose contents are the contents of the memory-mapped file.

The returned vector is a displaced-array whose element-type is (UPGRADED-ARRAY-ELEMENT-TYPE element-type). The target of the displaced array is a vector of type (SIMPLE-ARRAY element-type (*)) whose elements are the contents of the memory-mapped file.

Because of alignment issues, the mapped file's contents start a few bytes (4 bytes on 32-bit platforms, 8 bytes on 64-bit platforms) into the vector. The displaced array returned by CCL:MAP-FILE-TO-IVECTOR hides this overhead, but it's usually more efficient to operate on the underlying simple 1-dimensional array. Given a displaced array (like the value returned by CCL:MAP-FILE-TO-IVECTOR), the function ARRAY-DISPLACEMENT returns the underlying array and the displacement index in elements.

Currently, Clozure CL supports only read operations on memory-mapped files. If you try to change the contents of an array returned by map-file-to-ivector, Clozure CL signals a memory error.

CCL:UNMAP-IVECTOR displaced-array [Function]

If the argument is a displaced-array returned by map-file-to-ivector, and if it has not yet been unmapped by this function, then unmap-ivector undoes the memory mapping, closes the mapped file, and changes the displaced-array so that its target is an empty vector (of length zero).

CCL:MAP-FILE-TO-OCTET-VECTOR pathname [Function]

This function is a synonym for (CCL:MAP-FILE-TO-IVECTOR pathname '(UNSIGNED-BYTE 8)). It is provided as a convenience for the common case of memory-mapping a file as a vector of bytes.

CCL:UNMAP-OCTET-VECTOR displaced-array [Function]

This function is a synonym for CCL:UNMAP-IVECTOR.
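
As a rough sketch of how these functions fit together, the hypothetical function below maps a file, reads it through the underlying simple vector obtained from ARRAY-DISPLACEMENT, and unmaps it afterwards. It assumes the file exists and is not empty.

```lisp
(defun sum-file-octets (pathname)
  "Return the sum of all bytes in the file named by PATHNAME.
The file is memory-mapped read-only rather than copied into a fresh vector."
  (let ((mapped (ccl:map-file-to-octet-vector pathname)))
    (unwind-protect
         ;; ARRAY-DISPLACEMENT exposes the underlying simple vector and the
         ;; index at which the file's contents begin.
         (multiple-value-bind (data offset) (array-displacement mapped)
           (loop for i from offset below (+ offset (length mapped))
                 sum (aref data i)))
      ;; Undo the mapping and close the file, even on a non-local exit.
      (ccl:unmap-octet-vector mapped))))
```

Wrapping the work in UNWIND-PROTECT is a design choice of this sketch, not a requirement of the API; it simply ensures that the mapping does not outlive the computation.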

4.8. Static Variables

Clozure CL supports the definition of static variables, which may not be dynamically bound. The value of a static variable is therefore the same across all threads; changing the value in one thread changes it for all threads.

Attempting to dynamically rebind a static variable (for instance, by using LET, or using the variable name as a parameter in a LAMBDA form) signals an error. Static variables are shared global resources; a dynamic binding is private to a single thread.

Static variables therefore provide a simple way to share mutable state across threads. They also provide a simple way to introduce race conditions and obscure bugs into your code, since every thread reads and writes the same instance of a given static variable. You must take care, therefore, in how you change the values of static variables, and use normal multithreaded programming techniques, such as locks or semaphores, to protect against race conditions.

In Clozure CL, access to a static variable is usually faster than access to a special variable that has not been declared static.

DEFSTATIC var value &key doc-string [Macro]

var

The name of the new static variable.

value

The initial value of the new static variable.

doc-string

A documentation string that is assigned to the new variable.

Proclaims the variable special, assigns the variable the supplied value, and assigns the doc-string to the variable's VARIABLE documentation. Also marks the variable static, so that any attempt to dynamically rebind var signals an error.
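
The sketch below illustrates the advice above about protecting shared state: a static counter is updated only while a lock is held. The variable, lock, and function names are hypothetical; CCL:MAKE-LOCK and CCL:WITH-LOCK-GRABBED are described in the Threads Dictionary.

```lisp
;; A counter shared by every thread; it cannot be rebound with LET.
(ccl:defstatic *request-count* 0)

;; Hypothetical lock guarding the counter.
(defvar *request-count-lock* (ccl:make-lock "request-count"))

(defun note-request ()
  "Increment the shared counter under the lock and return its new value."
  (ccl:with-lock-grabbed (*request-count-lock*)
    (incf *request-count*)))
```

Any thread may call note-request; because the increment happens while the lock is held, concurrent callers should not lose updates.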

4.9. Saving Applications

Clozure CL provides the function CCL:SAVE-APPLICATION, which creates a file containing an archived Lisp memory image.

Clozure CL consists of a small executable called the Lisp kernel, which implements the very lowest level features of the Lisp system, and an image, which contains the in-memory representation of most of the Lisp system, including functions, data structures, variables, and so on. When you start Clozure CL, you are launching the kernel, which then locates and reads an image file, restoring the archived image in memory. Once the image is fully restored, the Lisp system is running.

Using CCL:SAVE-APPLICATION, you can create a file that contains a modified image, one that includes any changes you've made to the running Lisp system. If you later pass your image file to the Clozure CL kernel as a command-line parameter, it then loads your image file instead of its default one, and Clozure CL starts up with your modifications.

If this scenario seems to you like a convenient way to create an application, that's just as intended. You can create an application by modifying the running Lisp until it does what you want, then use CCL:SAVE-APPLICATION to preserve your changes and later load them for use.

In fact, you can go further than that. You can replace Clozure CL's toplevel function with your own, and then, when the image is loaded, the Lisp system immediately performs your tasks rather than the default tasks that make it a Lisp development system. If you save an image in which you have done this, the resulting Lisp system is your tool rather than a Lisp development system.

You can go a step further still. You can tell CCL:SAVE-APPLICATION to prepend the Lisp kernel to the image file. Doing this makes the resulting image into a self-contained executable binary. When you run the resulting file, the Lisp kernel immediately loads the attached image file and runs your saved system. The Lisp system that starts up can have any behavior you choose. It can be a Lisp development system, but with your customizations; or it can immediately perform some task of your design, making it a specialized tool rather than a general development system.

In other words, you can develop any application you like by interactively modifying Clozure CL until it does what you want, then using CCL:SAVE-APPLICATION to preserve your changes in an executable image.

On Mac OS X, the application builder uses CCL:SAVE-APPLICATION to create the executable portion of the application bundle. Double-clicking the application bundle runs the executable image created by CCL:SAVE-APPLICATION.

Also on Mac OS X, Clozure CL supports an object type called MACPTR, which is the type of pointers into the foreign (Mac OS) heap. Examples of commonly used MACPTR objects are Cocoa windows and other dynamically-allocated Mac OS system objects.

Because a MACPTR object is a pointer into a foreign heap that exists for the lifetime of the running Lisp process, and because a saved image is used by loading it into a brand new Lisp process, saved MACPTR objects cannot be relied on to point to the same things when reconstituted from a saved image. In fact, a restored MACPTR object might point to anything at all—for example an arbitrary location in the middle of a block of code, or a completely nonexistent virtual address.

For that reason, CCL:SAVE-APPLICATION converts all MACPTR objects to DEAD-MACPTR objects when writing them to an image file. A DEAD-MACPTR is functionally identical to a MACPTR, except that code that operates on MACPTR objects distinguishes them from DEAD-MACPTR objects and can handle them appropriately—signaling errors, for example.

As of Clozure CL 1.2, there is one exception to the conversion of MACPTR to DEAD-MACPTR objects: a MACPTR object that points to the address 0 is not converted, because address 0 can always be relied upon to refer to the same thing.

As of Clozure CL 1.2, the constant CCL:+NULL-PTR+ refers to a MACPTR object that points to address 0.

On all supported platforms, you can use CCL:SAVE-APPLICATION to create a command-line tool that runs the same way any command-line program does. Alternatively, if you choose not to prepend the kernel, you can save an image and then later run it by passing it as a command-line parameter to the ccl or ccl64 script.

SAVE-APPLICATION filename &key toplevel-function init-file error-handler application-class clear-clos-caches (purify t) impurify (mode #o644) prepend-kernel [Function]

filename

The pathname of the file to be created when Clozure CL saves the application.

toplevel-function

The function to be executed after startup is complete. The toplevel is a function of no arguments that performs whatever actions the lisp system should perform when launched with this image.

If this parameter is not supplied, Clozure CL uses its default toplevel. The default toplevel runs the read-eval-print loop.

init-file

The pathname of a Lisp file to be loaded when the image starts up. You can place initialization expressions in this file, and use it to customize the behavior of the Lisp system when it starts up.

error-handler

The error-handling mode for the saved image. The supplied value determines what happens when an error is not handled by the saved image. Valid values are :quit (Lisp exits with an error message); :quit-quietly (Lisp exits without an error message); or :listener (Lisp enters a break loop, enabling you to debug the problem by interacting in a listener). If you don't supply this parameter, the saved image uses the default error handler (:listener).

application-class

The CLOS class that represents the saved Lisp application. Normally you don't need to supply this parameter; CCL:SAVE-APPLICATION uses the class CCL:LISP-DEVELOPMENT-SYSTEM. In some cases you may choose to create a custom application class; in that case, pass the name of the class as the value for this parameter.

clear-clos-caches

If true, ensures that CLOS caches are emptied before saving the image. Normally you don't need to supply this parameter, but if for some reason you want to ensure the CLOS caches are clear when the image starts up, you can pass any true value.
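
A minimal sketch of building a standalone command-line tool with these parameters might look like the following. The function names and the greeting are hypothetical. Note that saving an image ends the running Lisp session, so build-hello is only defined here, not called.

```lisp
(defun hello-main ()
  "Hypothetical toplevel function: print a greeting, then exit."
  (format t "Hello from a saved Clozure CL image.~%")
  (finish-output)
  (ccl:quit 0))

(defun build-hello (output-path)
  "Write a self-contained executable to OUTPUT-PATH.
Calling this function ends the current Lisp session once the image is saved."
  (ccl:save-application output-path
                        :toplevel-function #'hello-main
                        :error-handler :quit
                        :prepend-kernel t))
```

Here :error-handler :quit is chosen because a command-line tool usually should exit with a message, rather than enter a break loop, when an error goes unhandled.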

4.10. Concatenating FASL Files

Multiple fasl files can be concatenated into a single file.

[Function]

fasl-concatenate out-file fasl-files &key (:if-exists :error)
Concatenate several fasl files, producing a single output file.

Arguments and Values:

out-file--- Name of the file in which to store the concatenation.

fasl-files--- List of names of fasl files to concatenate.

:if-exists--- As for OPEN, defaults to :error

Description:

Creates a fasl file which, when loaded, will have the same effect as loading the individual input fasl files in the specified order. The single file might be easier to distribute or install, and loading it may be at least a little faster than loading the individual files (since it avoids the overhead of opening and closing each file in succession.)

The PATHNAME-TYPE of the output file and of each input file defaults to the current platform's fasl file type (.dx64fsl, for example). If any of the input files has a different type/extension, an error is signaled, but the function makes little other effort to verify that the input files are real fasl files for the current platform.
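
One way to use this function, sketched here under the assumption that the source files compile cleanly, is to compile a list of files and bundle the results. The function name build-bundle is hypothetical.

```lisp
(defun build-bundle (bundle-name source-files)
  "Compile each of SOURCE-FILES and concatenate the resulting fasl files,
in order, into a single fasl file named by BUNDLE-NAME."
  ;; COMPILE-FILE returns the pathname of the fasl file it wrote.
  (let ((fasl-files (mapcar #'compile-file source-files)))
    (ccl:fasl-concatenate bundle-name fasl-files :if-exists :supersede)))
```

Loading the bundle afterwards with LOAD should then have the same effect as loading each compiled file in the order given.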

4.11. Floating Point Numbers

In Clozure CL, the Common Lisp types short-float and single-float are implemented as IEEE single precision values; double-float and long-float are IEEE double precision values. On 64-bit platforms, single-floats are immediate values (like fixnums and characters).

Floating-point exceptions are generally enabled and detected. By default, threads start up with overflow, division-by-zero, and invalid enabled, and the rounding mode is set to nearest. The functions SET-FPU-MODE and GET-FPU-MODE provide user control over floating-point behavior.

[Function]

get-fpu-mode &optional mode
Return the state of exception-enable and rounding-mode control flags for the current thread.

Arguments and Values:

mode--- One of the keywords :rounding-mode, :overflow, :underflow, :division-by-zero, :invalid, :inexact.

Description:

If mode is supplied, returns the value of the corresponding control flag for the current thread.

Otherwise, returns a list of keyword/value pairs which describe the floating-point exception-enable and rounding-mode control flags for the current thread.

rounding-mode--- One of :nearest, :zero, :positive, :negative

overflow, underflow, division-by-zero, invalid, inexact --- If true, the floating-point exception is signaled. If NIL, it is masked.

[Function]

set-fpu-mode &key rounding-mode overflow underflow division-by-zero invalid inexact
Set the state of exception-enable and rounding-mode control flags for the current thread.

Arguments and Values:

rounding-mode--- If supplied, must be one of :nearest, :zero, :positive, or :negative.

overflow, underflow, division-by-zero, invalid, inexact---NIL to mask the exception, T to signal it.

Description:

Sets the current thread's exception-enable and rounding-mode control flags to the indicated values for arguments that are supplied, and preserves the values associated with those that aren't supplied.
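
Because GET-FPU-MODE with no argument returns keyword/value pairs, one plausible pattern is to save the flags, change them temporarily, and restore them afterwards. The function call-with-fpu-mode below is a hypothetical helper, and it assumes that the list returned by GET-FPU-MODE can be passed back to SET-FPU-MODE unchanged.

```lisp
(defun call-with-fpu-mode (thunk &rest mode-settings)
  "Call THUNK with MODE-SETTINGS (keyword arguments accepted by
CCL:SET-FPU-MODE) in effect, then restore the thread's previous flags."
  (let ((saved (ccl:get-fpu-mode)))
    (unwind-protect
         (progn
           (apply #'ccl:set-fpu-mode mode-settings)
           (funcall thunk))
      ;; SAVED holds keyword/value pairs, so it can be handed straight
      ;; back to SET-FPU-MODE.
      (apply #'ccl:set-fpu-mode saved))))
```

For example, (call-with-fpu-mode #'some-computation :division-by-zero nil) would run the hypothetical function some-computation with the division-by-zero exception masked for the current thread only.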

4.12. Watched Objects

As of release 1.4, Clozure CL provides a way for lisp objects to be watched so that a condition will be signaled when a thread attempts to write to the watched object. For a certain class of bugs (someone is changing this value, but I don't know who), this can be extremely helpful.

4.12.1. WATCH

[Function]

watch &optional object
Monitor a lisp object for writes.

Arguments and Values:

object--- Any memory-allocated lisp object.

Description:

The WATCH function arranges for the specified object to be monitored for writes. This is accomplished by copying the object to its own set of virtual memory pages, which are then write-protected. This protection is enforced by the computer's memory-management hardware; the write-protection does not slow down reads at all.

When any write to the object is attempted, a WRITE-TO-WATCHED-OBJECT condition will be signaled.

When called with no arguments, WATCH returns a freshly-consed list of the objects currently being watched.

WATCH returns NIL if the object cannot be watched (typically because the object is in a static or pure memory area).

DWIM:

WATCH operates at a fairly low level; it is not possible to avoid the details of the internal representation of objects. Nevertheless, as a convenience, WATCHing a standard-instance, a hash-table, or a multi-dimensional or non-simple CL array will watch the underlying slot-vector, hash-table-vector, or data-vector, respectively.

Discussion:

WATCH can monitor any memory-allocated lisp object.

In Clozure CL, a memory-allocated object is either a cons cell or a uvector.

WATCH operates on cons cells, not lists. In order to watch a chain of cons cells, each cons cell must be watched individually. Because each watched cons cell takes up its own virtual memory page (4 Kbytes), it's only feasible to watch relatively short lists.

If a memory-allocated object isn't a cons cell, then it is a vector-like object called a uvector. A uvector is a memory-allocated lisp object whose first word is a header that describes the object's type and the number of elements that it contains.

So, a hash table is a uvector, as is a string, a standard instance, a double-float, a CL array or vector, and so forth.

Some CL objects, like strings and other simple vectors, map in a straightforward way onto the uvector representation. It is easy to understand what happens in such cases. The uvector index corresponds directly to the vector index:


? (defvar *s* "xxxxx")
*S*
? (watch *s*)
"xxxxx"
? (setf (char *s* 3) #\o)
> Error: Write to watched uvector "xxxxx" at index 3
>        Faulting instruction: (movl (% eax) (@ -5 (% r15) (% rcx)))
> While executing: SET-CHAR, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.

    

In the case of more complicated objects (e.g., a hash-table, a standard-instance, a package, etc.), the elements of the uvector are like slots in a structure. It's necessary to know which one of those "slots" contains the data that will be changed when the object is written to.

As mentioned above, watch knows about arrays, hash-tables, and standard-instances, and will automatically watch the appropriate data-containing element.

An example might make this clearer.


? (defclass foo ()
    (slot-a slot-b slot-c))
#<STANDARD-CLASS FOO>
? (defvar *a-foo* (make-instance 'foo))
*A-FOO*
? (watch *a-foo*)
#<SLOT-VECTOR #xDB00D>
;;; Note that WATCH has watched the internal slot-vector object
? (setf (slot-value *a-foo* 'slot-a) 'foo)
> Error: Write to watched uvector #<SLOT-VECTOR #xDB00D> at index 1
>        Faulting instruction: (movq (% rsi) (@ -5 (% r8) (% rdi)))
> While executing: %MAYBE-STD-SETF-SLOT-VALUE-USING-CLASS, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.

    

Looking at a backtrace would presumably show what object and slot name were written.

Note that even though the write was to slot-a, the uvector index was 1 (not 0). This is because the first element of a slot-vector is a pointer to the instance that owns the slots. We can retrieve that to look at the object that was modified:


1 > (uvref (write-to-watched-object-object *break-condition*) 0)
#<FOO #x30004113502D>
1 > (describe *)
#<FOO #x30004113502D>
Class: #<STANDARD-CLASS FOO>
Wrapper: #<CLASS-WRAPPER FOO #x300041135EBD>
Instance slots
SLOT-A: #<Unbound>
SLOT-B: #<Unbound>
SLOT-C: #<Unbound>
1 >
 
    

4.12.2. UNWATCH

[Function]

unwatch object
Stop monitoring a lisp object for writes.

Description:

The UNWATCH function ensures that the specified object is in normal, non-monitored memory. If the object is not currently being watched, UNWATCH does nothing and returns NIL. Otherwise, the newly unwatched object is returned.

4.12.3. WRITE-TO-WATCHED-OBJECT

[Condition]

WRITE-TO-WATCHED-OBJECT
Condition signaled when a write to a watched object is attempted.

Discussion:

This condition is signaled when a watched object is written to. There are three slots of interest:

object--- The actual object that was the destination of the write.

offset--- The byte offset from the tagged object pointer to the address of the write.

instruction--- The disassembled machine instruction that attempted the write.

Restarts:

A few restarts are provided: one will skip over the faulting write instruction and proceed; another offers to unwatch the object and continue.

There is also an emulate restart. In some common cases, the faulting write instruction can be emulated, enabling the write to be performed without having to unwatch the object (and therefore let other threads potentially write to it). If the faulting instruction isn't recognized, the emulate restart will not be offered.
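
Restarts can also be chosen programmatically from a handler. The sketch below logs each write to a watched object and then lets the write proceed by invoking the emulate restart when it is offered. FIND-RESTART-NAMED and CALL-EMULATING-WATCHED-WRITES are hypothetical names; the sketch assumes that the restart described above is named EMULATE, and it uses double-colon references because it does not assume that the condition type and its reader are exported from the CCL package.

```lisp
;;; Hypothetical helpers, not part of Clozure CL.

(defun find-restart-named (name &optional condition)
  "Return the first active restart whose name is spelled NAME, or NIL."
  (find name (compute-restarts condition)
        :key (lambda (restart) (symbol-name (restart-name restart)))
        :test #'string=))

(defun call-emulating-watched-writes (thunk)
  "Call THUNK; log writes to watched objects and emulate them when possible."
  (handler-bind ((ccl::write-to-watched-object
                   (lambda (condition)
                     (format *error-output* "~&Write to watched object ~s~%"
                             (ccl::write-to-watched-object-object condition))
                     ;; Assumption: the restart is named EMULATE.  If it is
                     ;; not offered, decline so the error is handled elsewhere.
                     (let ((restart (find-restart-named "EMULATE" condition)))
                       (when restart
                         (invoke-restart restart))))))
    (funcall thunk)))
```

Because the handler declines when no emulate restart is available, an unrecognized faulting instruction still ends up in the break loop as usual.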

4.12.4. Notes

Although some care has been taken to minimize potential problems arising from watching and unwatching objects from multiple threads, there may well be subtle race conditions present that could cause bad behavior.

For example, suppose that a thread attempts to write to a watched object. This causes the operating system to generate an exception. The lisp kernel figures out what the exception is, and calls back into lisp to signal the write-to-watched-object condition and perhaps handle the error.

Now, as soon as lisp code starts running again (for the callback), it's possible that some other thread could unwatch the very watched object that caused the exception, perhaps before we even have a chance to signal the condition, much less respond to it.

Having the object unwatched out from underneath a handler may at least confuse it, if not cause deeper trouble. Use caution with unwatch.

4.12.5. Examples

Here are a couple more examples in addition to the above examples of watching a string and a standard-instance.

4.12.5.1. Fancy arrays

?  (defvar *f* (make-array '(2 3) :element-type 'double-float))
*F*
? (watch *f*)
#(0.0D0 0.0D0 0.0D0 0.0D0 0.0D0 0.0D0)
;;; Note that the above vector is the underlying data-vector for the array
? (setf (aref *f* 1 2) pi)
> Error: Write to watched uvector #<VECTOR 6 type DOUBLE-FLOAT, simple> at index 5
>        Faulting instruction: (movq (% rax) (@ -5 (% r8) (% rdi)))
> While executing: ASET, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.
1 > 
  

In this case, the uvector index in the report is the row-major index of the element that was written to.
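
The reported index can be reproduced with the standard function ARRAY-ROW-MAJOR-INDEX. This is a small sketch; DATA-VECTOR-INDEX is a hypothetical name, and it assumes the array is not displaced into its data vector at a nonzero offset.

```lisp
;;; Hypothetical helper: map array subscripts to the index that a
;;; watched-write report would show for the underlying data vector.
(defun data-vector-index (array &rest subscripts)
  "Return the row-major index of the element of ARRAY at SUBSCRIPTS."
  (apply #'array-row-major-index array subscripts))
```

For the 2x3 array above, (data-vector-index *f* 1 2) is 1*3 + 2 = 5, which matches the index in the error report.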

4.12.5.2. Hash tables

Hash tables are surprisingly complicated. The representation of a hash table includes an element called a hash-table-vector. The keys and values of the elements are stored pairwise in this vector.

One problem with trying to monitor hash tables for writes is that the underlying hash-table-vector is replaced with an entirely new one when the hash table is rehashed. A previously-watched hash-table-vector will no longer be used by the hash table after rehashing, and writes to the new vector will not be caught.

? (defvar *h* (make-hash-table))
*H*
? (setf (gethash 'noise *h*) 'feep)
FEEP
? (watch *h*)
#<HASH-TABLE-VECTOR #xDD00D>
;;; underlying hash-table-vector
? (setf (gethash 'noise *h*) 'ding)
> Error: Write to watched uvector #<HASH-TABLE-VECTOR #xDD00D> at index 35
>        Faulting instruction: (lock)
>          (cmpxchgq (% rsi) (@ (% r8) (% rdx)))
> While executing: %STORE-NODE-CONDITIONAL, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.
;;; see what value is being replaced...
1 > (uvref (write-to-watched-object-object *break-condition*) 35)
FEEP
;;; backtrace shows useful context
1 > :b
*(1A109F8) : 0 (%STORE-NODE-CONDITIONAL ???) NIL
 (1A10A50) : 1 (LOCK-FREE-PUTHASH NOISE #<HASH-TABLE :TEST EQL size 1/60 #x30004117D47D> DING) 653
 (1A10AC8) : 2 (CALL-CHECK-REGS PUTHASH NOISE #<HASH-TABLE :TEST EQL size 1/60 #x30004117D47D> DING) 229
 (1A10B00) : 3 (TOPLEVEL-EVAL (SETF (GETHASH # *H*) 'DING) NIL) 709
 ...
  

4.12.5.3. Lists

As previously mentioned, WATCH only watches individual cons cells.

? (defun watch-list (list)
    (maplist #'watch list))
WATCH-LIST
? (defvar *l* (list 1 2 3))
*L*
? (watch-list *l*)
((1 2 3) (2 3) (3))
? (setf (nth 2 *l*) 'foo)
> Error: Write to the CAR of watched cons cell (3)
>        Faulting instruction: (movq (% rsi) (@ 5 (% rdi)))
> While executing: %SETNTH, in process listener(1).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.
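
The value returned by watch-list is simply the list of cons cells that make up the list, because MAPLIST calls its function on each successive tail. The sketch below uses only standard Common Lisp to show which cell a given write touches; LIST-CELLS is a hypothetical name.

```lisp
;;; Hypothetical helper: collect the cons cells that make up LIST.
(defun list-cells (list)
  "Return a fresh list whose elements are the successive tails of LIST."
  (maplist #'identity list))
```

For the list (1 2 3) this returns ((1 2 3) (2 3) (3)). The third element is the same cell as (nthcdr 2 list), and its CAR is what (setf (nth 2 list) ...) writes, which is why the error above names the cons cell (3).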
  

4.13. Code Coverage

4.13.1. Overview

In Clozure CL 1.4 and later, code coverage provides information about which paths through generated code have been executed and which haven't. For each source form, it can report one of three possible outcomes:

  • Not covered: this form was never entered.

  • Partly covered: This form was entered, and some parts were executed and some weren't.

  • Fully covered: Every bit of code generated from this form was executed.

4.13.2. Limitations

While the information gathered for coverage of generated code is complete and precise, the mapping back to source forms is of necessity heuristic, and depends a great deal on the behavior of macros and the path of the source forms through compiler transforms. Source information is not recorded for variables, which further limits the source mapping. In practice, there is often enough information scattered about a partially covered function to figure out which logical path through the code was taken and which wasn't. If that doesn't work, you can try disassembling to see which parts of the compiled code were not executed: in the disassembled code there will be references to #<CODE-NOTE [xxx] ...> where xxx is NIL if the code that follows was never executed and non-NIL if it was.

Sometimes the situation can be improved by modifying macros to try to preserve more of the input forms, rather than destructuring and rebuilding them.

Because the code coverage information is associated with compiled functions, load-time toplevel expressions do not get reported on. You can work around this by creating a function and calling it. I.e. instead of

(progn
  (do-this)
  (setq that ...) ...)
  

do:

(defun init-this-and-that ()
  (do-this)
  (setq that ...)  ...)
(init-this-and-that)
  

Then you can see the coverage information in the definition of init-this-and-that.

4.13.3. Usage

In order to gather code coverage information, you first have to recompile all your code to include code coverage instrumentation. Compiling files will generate code coverage instrumentation if CCL:*COMPILE-CODE-COVERAGE* is true:

(setq ccl:*compile-code-coverage* t) 
(recompile-all-your-files) 
  

The compilation process will be many times slower than normal, and the fasl files will be many times bigger.
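
Because instrumented compilation is slow, it can be convenient to enable it only around the files of interest instead of setting the variable globally. The following is a minimal sketch; COMPILE-AND-LOAD-INSTRUMENTED is a hypothetical name and the file list is supplied by the caller.

```lisp
;;; Hypothetical helper, not part of Clozure CL.
(defun compile-and-load-instrumented (source-files)
  "Compile and load each of SOURCE-FILES with coverage instrumentation.
Returns the list of fasl pathnames that were loaded."
  ;; Bind the variable dynamically so that later compilations in the
  ;; same session are not instrumented by accident.
  (let ((ccl:*compile-code-coverage* t))
    (mapcar (lambda (source)
              (let ((fasl (compile-file source)))
                (unless fasl
                  (error "Compilation of ~a failed" source))
                (load fasl)
                fasl))
            source-files)))
```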

When you execute functions loaded from instrumented fasl files, they will record coverage information every time they are executed. The system keeps track of which instrumented files have been loaded.

The following functions can be used to manage the coverage data:

[Function]

report-coverage output-file &key (external-format :default) (statistics t) (html t)
Generate code coverage report

Arguments and Values:

html--- If non-nil, this will generate an HTML report, consisting of an index file and one html file for each instrumented source file that has been loaded in the current session. The individual source file reports are stored in the same directory as the index file.

external-format--- Controls the external format of the html files.

statistics--- If :statistics is non-nil, a comma-separated file is also generated with the summary of statistics. You can specify a filename for the statistics argument, otherwise "statistics.csv" is created in the output directory. See documentation of ccl:coverage-statistics below for a description of the values in the statistics file.

Example:

If you've loaded foo.lx64fsl and bar.lx64fsl, and have run some tests, you could do

(CCL:REPORT-COVERAGE "/my/dir/coverage/report.html")
    

and this would generate report.html, foo_lisp.html and bar_lisp.html, and statistics.csv all in /my/dir/coverage/.
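
A typical session resets the counters, runs the tests, and then writes the report. The sketch below wraps those steps in one function; RUN-WITH-COVERAGE-REPORT is a hypothetical name, and it assumes the instrumented fasl files have already been loaded.

```lisp
;;; Hypothetical helper, not part of Clozure CL.
(defun run-with-coverage-report (test-function report-file)
  "Reset coverage data, call TEST-FUNCTION, then write a report to REPORT-FILE."
  (ccl:reset-coverage)
  (unwind-protect
       (funcall test-function)
    ;; Write the report even if the tests signal an error, so that
    ;; partial results are not lost.
    (ccl:report-coverage report-file)))
```

For example, (run-with-coverage-report #'run-my-tests "/my/dir/coverage/report.html"), where run-my-tests stands for your own test entry point.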

[Function]

reset-coverage
Resets all coverage data back to the "Not Executed" state

Summary:

Resets all coverage data back to the "Not Executed" state

[Function]

clear-coverage
Forget about all instrumented files that have been loaded.

Summary:

Gets rid of the information about which instrumented files have been loaded, so ccl:report-coverage will not report any files, and ccl:save-coverage-in-file will not save any info, until more instrumented files are loaded.

[Function]

save-coverage-in-file pathname
Save all coverage info to a file so you can restore it later.

Summary:

Saves all coverage info in a file, so you can restore the coverage state later. This allows you to combine multiple runs or continue in a later session. Equivalent to (ccl:write-coverage-to-file (ccl:save-coverage) pathname).

[Function]

restore-coverage-from-file pathname
Load coverage state from a file.

Summary:

Restores the coverage data previously saved with CCL:SAVE-COVERAGE-IN-FILE, for the set of instrumented fasls that were loaded both at save and restore time. I.e. coverage info is only restored for files that have been loaded in this session. For example if in a previous session you had loaded "foo.lx86fsl" and then saved the coverage info, in this session you must load the same "foo.lx86fsl" before calling ccl:restore-coverage-from-file in order to retrieve the stored coverage info for "foo". Equivalent to (ccl:restore-coverage (ccl:read-coverage-from-file pathname)).

[Function]

save-coverage
Returns a snapshot of the current coverage data.

Summary:

Returns a snapshot of the current coverage data. A snapshot is a copy of the current coverage state. It can be saved in a file with ccl:write-coverage-to-file, reinstated back as the current state with ccl:restore-coverage, or combined with other snapshots with ccl:combine-coverage.

[Function]

restore-coverage snapshot
Reinstalls a coverage snapshot as the current coverage state.

Summary:

Reinstalls a coverage snapshot as the current coverage state.

[Function]

write-coverage-to-file snapshot pathname
Save a coverage snapshot in a file.

Summary:

Saves the coverage snapshot in a file. The snapshot can be loaded back with ccl:read-coverage-from-file or loaded and restored with ccl:restore-coverage-from-file. Note that the file created is actually a lisp source file and can be compiled for faster loading.

[Function]

read-coverage-from-file pathname
Return the coverage snapshot saved in a file.

Summary:

Returns the snapshot saved in pathname. Doesn't affect the current coverage state. pathname can be the file previously created with ccl:write-coverage-to-file or ccl:save-coverage-in-file, or it can be the name of the fasl created from compiling such a file.
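
Snapshots also make it possible to run code without disturbing the coverage data collected so far. The following is a sketch that uses only the functions described above; CALL-PRESERVING-COVERAGE is a hypothetical name.

```lisp
;;; Hypothetical helper, not part of Clozure CL.
(defun call-preserving-coverage (thunk)
  "Call THUNK, then put the coverage state back to what it was before."
  (let ((snapshot (ccl:save-coverage)))
    (unwind-protect
         (funcall thunk)
      ;; Discard whatever THUNK recorded by reinstalling the snapshot.
      (ccl:restore-coverage snapshot))))
```

This can be useful for warm-up or setup code that should not count toward the coverage of the tests proper.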

[Function]

coverage-statistics
Returns a sequence of coverage-statistics objects, one per source file.

Summary:

Returns a sequence of ccl:coverage-statistics objects, one for each source file, containing the same information as that written to the statistics file by ccl:report-coverage. The following accessors are defined for ccl:coverage-statistics objects:

ccl:coverage-source-file

the name of the source file corresponding to this information

ccl:coverage-expressions-total

the total number of expressions

ccl:coverage-expressions-entered

the number of source expressions that have been entered (i.e. at least partially covered)

ccl:coverage-expressions-covered

the number of source expressions that were fully covered

ccl:coverage-unreached-branches

the number of conditionals with one branch taken and one not taken

ccl:coverage-code-forms-total

the total number of code forms. A code form is an expression in the final stage of compilation, after all macroexpansion and compiler transforms and simplification

ccl:coverage-code-forms-covered

the number of code forms that have been entered

ccl:coverage-functions-total

the total number of functions

ccl:coverage-functions-fully-covered

the number of functions that were fully covered

ccl:coverage-functions-partly-covered

the number of functions that were partly covered

ccl:coverage-functions-not-entered

the number of functions never entered
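
These accessors can be used to print a quick summary without generating a full report. The following is a minimal sketch; COVERAGE-PERCENT and PRINT-COVERAGE-SUMMARY are hypothetical names, and treating a file with no expressions as 100% covered is an arbitrary choice made here.

```lisp
;;; Hypothetical helpers, not part of Clozure CL.

(defun coverage-percent (covered total)
  "Return COVERED as a rounded percentage of TOTAL (100 when TOTAL is zero)."
  (if (zerop total)
      100
      (values (round (* 100 covered) total))))

(defun print-coverage-summary (&optional (stream *standard-output*))
  "Print one line per instrumented source file to STREAM."
  ;; MAP works whether the result is a list or a vector.
  (map nil
       (lambda (stats)
         (format stream "~&~a: ~d% of expressions fully covered, ~
                         functions never entered: ~d~%"
                 (ccl:coverage-source-file stats)
                 (coverage-percent (ccl:coverage-expressions-covered stats)
                                   (ccl:coverage-expressions-total stats))
                 (ccl:coverage-functions-not-entered stats)))
       (ccl:coverage-statistics)))
```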

[Variable]

*compile-code-coverage*
When true, instrument functions for code coverage.

Summary:

This variable controls whether functions are instrumented for code coverage. Files compiled while this variable is true will contain code coverage instrumentation.

[Macro]

without-compiling-code-coverage
Don't record code coverage for forms within the body.

Summary:

This macro arranges for body not to record internal details of code coverage. It will be considered totally covered if it's entered at all. The Common Lisp macros ASSERT and CHECK-TYPE use this macro.
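
User-defined checking macros can use it in the same way, so that their error branches do not show up as partly covered code. The following is a sketch; ENSURE is a hypothetical macro, and the sketch assumes that without-compiling-code-coverage takes its body as a sequence of forms.

```lisp
;;; Hypothetical macro, not part of Clozure CL.
(defmacro ensure (test-form message)
  "Signal an error with MESSAGE unless TEST-FORM is true."
  ;; The error branch is rarely taken in passing tests; wrapping the
  ;; expansion keeps it from being reported as partial coverage.
  `(ccl:without-compiling-code-coverage
     (unless ,test-form
       (error "~a" ,message))))
```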

Chapter 5. The Clozure CL IDE

5.1. Introduction

Clozure CL ships with the complete source code for an integrated development environment written using Cocoa on Mac OS X. This chapter describes how to build and use that environment, referred to hereafter simply as "the IDE".

The IDE provides a programmable text editor, listener windows, an inspector for Lisp data structures, and a means of easily building a Cocoa application in Lisp. In addition, its source code provides an example of a fairly complex Cocoa application written in Lisp.

The current version of the IDE has seen the addition of numerous features and many bugfixes. Although it's by no means a finished product, we hope it will prove more useful than previous versions, and we plan additional work on the IDE for future releases.

5.2. Building the IDE

Building the Clozure CL IDE is now a very simple process.

  1. In a shell session, cd to the ccl directory.

  2. Run ccl from the shell. The easiest way to do this is generally to execute the ccl or ccl64 command.

  3. Evaluate the form (require :cocoa-application)

For example, assuming that the Clozure CL distribution is installed in "/usr/local/ccl", the following sequence of shell interactions builds the IDE:

oshirion:ccl mikel$ ccl64
Welcome to Clozure Common Lisp Version 1.2-r9198M-trunk  (DarwinX8664)!
? (require :cocoa-application)
;Loading #P"ccl:cocoa-ide;fasls;cocoa-utils.dx64fsl.newest"...
;Loading #P"ccl:cocoa-ide;fasls;cocoa-defaults.dx64fsl.newest"...

[...many lines of "Compiling" and "Loading" omitted...]

Saving application to /usr/local/ccl/Clozure CL.app/

oshirion:ccl mikel$ 

    

Clozure CL compiles and loads the various subsystems that make up the IDE, then constructs a Cocoa application bundle named "Clozure CL.app" and saves the Lisp image into it. Normally Clozure CL creates the application bundle in the root directory of the Clozure CL distribution.

5.3. Running the IDE

After it has been built, you can run the "Clozure CL.app" application normally, by double-clicking its icon. When launched, the IDE initially displays a single listener window that you can use to interact with Lisp. You can type Lisp expressions for evaluation at the prompt in the listener window. You can also use Hemlock editing commands to edit the text of expressions in the listener window.

5.4. IDE Features

5.4.1. Editor Windows

You can open an editor window either by choosing Open from the File menu and then selecting a text file, or by choosing New from the File menu. You can also evaluate the expression (ed) in the listener window; in that case Clozure CL creates a new window as if you had chosen New from the File menu.

Editor windows implement Hemlock editing commands. You can use all the editing and customization features of Hemlock within any editor window (including listener windows).

5.4.2. The Lisp Menu

The Lisp menu provides several commands for interacting with the running Lisp session, in addition to the ways you can interact with it by evaluating expressions. You can evaluate a selected range of text in any editing buffer. You can compile and load the contents of editor windows (please note that in the current version, Clozure CL compiles and loads the contents of the file associated with an editor window; that means that if you try to load or compile a window that has not been saved to a file, the result is an error).

You can interrupt computations, trigger breaks, and select restarts from the Lisp menu. You can also display a backtrace or open the Inspector window.

5.4.2.1. Checking for Updates

At the bottom of the Lisp menu is an item entitled "Check for Updates". If your copy of Clozure CL came from the Clozure Subversion server (which is the preferred source), and if your internet connection is working, then you can select this menu item to check for updates to your copy of Clozure CL.

When you select "Check for Updates", Clozure CL uses the svn program to query the Clozure Subversion repository and determine whether new updates to Clozure CL are available. (This means that on Mac OS X versions earlier than 10.5, you must ensure that the Subversion client software is installed before using the "Check for Updates" feature. See the wikiHow page on installing Subversion for more information.) If updates are available, Clozure CL automatically downloads and installs them. After a successful download, Clozure CL rebuilds itself, and then rebuilds the IDE on the newly-rebuilt Lisp. Once this process is finished, you should quit the running IDE and start the newly built one (which will be in the same place that the old one was).

Normally, Clozure CL can install updates and rebuild itself without any problems. Occasionally, an unforeseen problem (such as a network outage, or a hardware failure) might interrupt the self-rebuilding process, and leave your copy of Clozure CL unusable. If you are expecting to update your copy of Clozure CL frequently, it might be prudent to keep a backup copy of your working environment ready in case of such situations. You can also always obtain a full, fresh copy of Clozure CL from Clozure's repository.

5.4.3. The Tools Menu

The Tools menu provides access to the Apropos and Processes windows. The Apropos window searches the running Lisp image for symbols that match any text you enter. You can use the Apropos window to quickly find function names and other useful symbols. The Processes window lists all threads running in the current Lisp session. If you double-click a process entry, Clozure CL opens an Inspector window on that process.

5.4.4. The Inspector Window

The Inspector window displays information about a Lisp value. The information displayed varies from the very simple, in the case of a simple data value such as a character, to the complex, in the case of structured data such as lists or CLOS objects. The left-hand column of the window's display shows the names of the object's attributes; the right-hand column shows the values associated with those attributes. You can inspect the values in the right-hand column by double-clicking them.

Inspecting a value in the right-hand column changes the Inspector window to display the double-clicked object. You can quickly navigate the fields of structured data this way, inspecting objects and the objects that they refer to. Navigation buttons at the top left of the window enable you to retrace your steps, backing up to return to previously-viewed objects, and going forward again to objects you navigated into previously.

You can change the contents of a structured object by evaluating expressions in a listener window. The refresh button (marked with a curved arrow) updates the display of the Inspector window, enabling you to quickly see the results of changing a data structure.

5.5. IDE Sources

Clozure CL builds the IDE from sources in the "objc-bridge" and "cocoa-ide" directories in the Clozure CL distribution. The IDE as a whole is a relatively complicated application, and is probably not the best place to look when you are first trying to understand how to build Cocoa applications. For that, you might benefit more from the examples in the "examples/cocoa/" directory. Once you are familiar with those examples, though, and have some experience building your own application features using Cocoa and the Objective-C bridge, you might browse through the IDE sources to see how it implements its features.

The search path for Clozure CL's REQUIRE feature includes the "objc-bridge" and "cocoa-ide" directories. You can load features defined in these directories by using REQUIRE. For example, if you want to use the Cocoa features of Clozure CL from a terminal session (or from an Emacs session using SLIME or ILISP), you can evaluate (require :cocoa).

5.6. The Application Builder

One important feature of the IDE currently has no Cocoa user interface: the application builder. The application builder constructs a Cocoa application bundle that runs a Lisp image when double-clicked. You can use the application builder to create Cocoa applications in Lisp. These applications are exactly like Cocoa applications created with Xcode and Objective-C, except that they are written in Lisp.

To make the application builder available, evaluate the expression (require :build-application). Clozure CL loads the required subsystems, if necessary.

BUILD-APPLICATION &key (name "MyApplication") (type-string "APPL") (creator-string "OMCL") (directory (current-directory)) (copy-ide-resources t) (info-plist NIL) (nibfiles NIL) (main-nib-name NIL) (application-class 'GUI::COCOA-APPLICATION) (toplevel-function NIL) [Function]

The build-application function constructs an application bundle, populates it with the files needed to satisfy Mac OS X that the bundle is a launchable application, and saves an executable Lisp image to the proper subdirectory of the bundle. Assuming that the saved Lisp image contains correct code, a user can subsequently launch the resulting Cocoa application by double-clicking its icon in the Finder, and the saved Lisp environment runs.

The keyword arguments control various aspects of the application bundle as BUILD-APPLICATION builds it.

name

Specifies the application name of the bundle. BUILD-APPLICATION creates an application bundle whose name is given by this parameter, with the extension ".app" appended. For example, using the default value for this parameter results in a bundle named "MyApplication.app".

type-string

Specifies the type of bundle to create. You should normally never need to change the default value, which Mac OS X uses to identify application bundles.

creator-string

Specifies the creator code, which uniquely identifies the application under Mac OS X. The default creator code is that of Clozure CL. For more information about reserving and assigning creator codes, see Apple's developer page on the topic.

directory

The directory in which BUILD-APPLICATION creates the application bundle. By default, it creates the bundle in the current working directory. Unless you use CURRENT-DIRECTORY to set the working directory, the bundle may be created in some unexpected place, so it's safest to specify a full pathname for this argument. A typical value might be "/Users/foo/Desktop/" (assuming, of course, that your username is "foo").

copy-ide-resources

Whether to copy the resource files from the IDE's application bundle. By default, BUILD-APPLICATION copies nibfiles and other resources from the IDE to the newly-created application bundle. This option is often useful when you are developing a new application, because it enables your built application to have a fully-functional user interface even before you have finished designing one. By default, the application uses the application menu and other UI elements of the IDE until you specify otherwise. Once your application's UI is fully implemented, you may choose to pass NIL for the value of this parameter, in which case the IDE resources are not copied into your application bundle.

info-plist

A user-supplied NSDictionary object that defines the contents of the Info.plist file to be written to the application bundle. The default value is NIL, which specifies that the Info.plist from the IDE is to be used if copy-ide-resources is true, and a new dictionary created with default values is to be used otherwise. You can create a suitable NSDictionary object using the function make-info-dict. For details on the parameters to this function, see its definition in "ccl/cocoa-ide/builder-utilities.lisp".

nibfiles

A list of pathnames, where each pathname identifies a nibfile created with Apple's Interface Builder application. BUILD-APPLICATION copies each nibfile into the appropriate place in the application bundle, enabling the application to load user-interface elements from them as needed. It is safest to provide full pathnames to the nibfiles in the list. Each nibfile must be in ".nib" format, not ".xib" format, in order that the application can load it.

main-nib-name

The name of the nibfile to load initially when launching. The user-interface defined in this nibfile becomes the application's main interface. You must supply the name of a suitable nibfile for this parameter, or the resulting application uses the Clozure CL user interface.

application-class

The name of the application's CLOS class. The default value is the class provided by Clozure CL for graphical applications. Supply the name of your application class if you implement one. If not, Clozure CL uses the default class.

toplevel-function

The toplevel function that runs when the application launches. Normally the default value, which is Clozure CL's toplevel, works well, but in some cases you may wish to customize the behavior of the application's toplevel. The best source of information about writing your own toplevel is the Clozure CL source code, especially the implementations of TOPLEVEL-FUNCTION in "ccl/level-1/l1-application.lisp".

BUILD-APPLICATION creates a folder named "name.app" in the directory directory. Inside that folder, it creates the "Contents" folder that Mac OS X application bundles are expected to contain, and populates it with the "MacOS" and "Resources" folders, and the "Info.plist" and "PkgInfo" files that must be present in a working application bundle. It takes the contents of the "Info.plist" and "PkgInfo" files from the parameters to BUILD-APPLICATION. If copy-ide-resources is true then it copies the contents of the "Resources" folder from the "Resources" folder of the running IDE.

The work needed to produce a running Cocoa application is minimal. In fact, if you supply BUILD-APPLICATION with a valid nibfile and pathnames, it builds a running Cocoa application that displays your UI. It doesn't need you to write any code at all to do this. Of course, the resulting application doesn't do anything apart from displaying the UI defined in the nibfile. If you want your UI to accomplish anything, you need to write the code to handle its events. But the path to a running application with your UI in it is very short indeed.

Please note that BUILD-APPLICATION is a work in progress. It can easily build a working Cocoa application, but it still has limitations that may in some cases prove inconvenient. For example, in the current version it provides no easy way to specify an application delegate different from the default. If you find the current limitations of BUILD-APPLICATION too restrictive, and want to try extending it for your use, you can find the source code for it in "ccl/cocoa-ide/build-application.lisp". You can see the default values used to populate the "Info.plist" file in "ccl/cocoa-ide/builder-utilities.lisp".

For more information on how to use BUILD-APPLICATION, see the Currency Converter example in "ccl/examples/cocoa/currency-converter/".
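
As an illustration of how the keyword arguments fit together, the following sketch builds a bundle around a single nibfile. It is meant to be evaluated in an IDE listener. HELLO-APP-ARGUMENTS and BUILD-HELLO-APP are hypothetical names, the application name "Hello" is a placeholder, and the sketch assumes that main-nib-name is the nibfile's name without its ".nib" extension.

```lisp
;;; Hypothetical helpers, not part of Clozure CL.

(defun hello-app-arguments (directory nibfile)
  "Return the keyword arguments to pass to BUILD-APPLICATION."
  (list :name "Hello"
        :directory directory
        :nibfiles (list nibfile)
        ;; Assumption: the main nib is named by the file name alone.
        :main-nib-name (pathname-name nibfile)))

(defun build-hello-app (directory nibfile)
  "Build Hello.app in DIRECTORY using the UI defined in NIBFILE."
  (require :build-application)
  (apply 'ccl::build-application (hello-app-arguments directory nibfile)))
```

A call might look like (build-hello-app "/Users/foo/Desktop/" "/Users/foo/hello/MainMenu.nib"), using full pathnames as recommended above.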

5.6.1. Running the Application Builder From the Command Line

It's possible to automate use of the application builder by running a call to CCL:BUILD-APPLICATION from the terminal command line. For example, the following command, entered at a shell prompt in Mac OS X's Terminal window, builds a working copy of the Clozure CL environment called "Foo.app":

ccl -b -e "(require :cocoa)" -e "(require :build-application)" -e "(ccl::build-application :name \"Foo\")"
      

You can use the same method to automate building your Lisp/Cocoa applications. Clozure CL handles each Lisp expression passed with a -e argument in order, so you can simply evaluate a sequence of Lisp expressions as in the above example to build your application, ending with a call to CCL:BUILD-APPLICATION. The call to CCL:BUILD-APPLICATION can process all the same arguments as if you evaluated it in a Listener window in the Clozure CL IDE.

Building a substantial Cocoa application (rather than just reproducing the Lisp environment using defaults, as is done in the above example) is likely to involve a relatively complicated sequence of loading source files and perhaps evaluating Lisp forms. You might be best served to place your command line in a shell script that you can more easily edit and test.

One potentially complicated issue concerns loading all your Lisp source files in the right order. You might consider using ASDF to define and load a system that includes all the parts of your application before calling CCL:BUILD-APPLICATION. ASDF is "another system-definition facility", a sort of make for Lisp, and is included in the Clozure CL distribution. You can read more about ASDF at the ASDF home page.
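
A system definition for a small application might look like the following sketch. The system name "foo-app" and the three file names are hypothetical; the :serial option tells ASDF to compile and load the files in the order listed.

```lisp
;; Make sure ASDF itself is loaded before its package is referenced.
(require :asdf)

;; Hypothetical system: three source files that depend on one another
;; in the order given.
(asdf:defsystem "foo-app"
  :serial t
  :components ((:file "package")
               (:file "model")
               (:file "ui")))
```

With the definition saved as "foo-app.asd" next to the source files and ASDF able to find it, loading the system (with asdf:load-system in recent ASDF versions) before the call to CCL:BUILD-APPLICATION brings in all the files in the right order.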

Alternatively, you could use the standard features of Common Lisp to load your application's files in the proper order.

Chapter 6. Programming with Threads

6.1. Threads Overview

Clozure CL provides facilities which enable multiple threads of execution (threads, sometimes called lightweight processes or just processes, though the latter term shouldn't be confused with the OS's notion of a process) within a lisp session. This document describes those facilities and issues related to multithreaded programming in Clozure CL.

Wherever possible, I'll try to use the term "thread" to denote a lisp thread, even though many of the functions in the API have the word "process" in their name. A lisp-process is a lisp object (of type CCL:PROCESS) which is used to control and communicate with an underlying native thread. Sometimes, the distinction between these two (quite different) objects can be blurred; other times, it's important to maintain.

Lisp threads share the same address space, but maintain their own execution context (stacks and registers) and their own dynamic binding context.
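
The following sketch illustrates the separate dynamic binding contexts. A binding established with LET in one thread is not visible to a thread created while that binding is in effect; the new thread sees the global value instead. The names *SETTING* and VALUE-SEEN-BY-NEW-THREAD are hypothetical.

```lisp
;;; Hypothetical example variable with a global value.
(defvar *setting* :global)

(defun value-seen-by-new-thread ()
  "Return the value of *SETTING* as observed by a freshly created thread."
  (let ((done (ccl:make-semaphore))
        (seen nil))
    ;; Rebind *SETTING* in the current thread only.
    (let ((*setting* :rebound-in-parent))
      (ccl:process-run-function "binding-demo"
                                (lambda ()
                                  ;; SEEN is a lexical variable, so it is
                                  ;; shared with the creating thread.
                                  (setq seen *setting*)
                                  (ccl:signal-semaphore done)))
      (ccl:wait-on-semaphore done))
    seen))
```

Under these assumptions the function is expected to return :GLOBAL rather than :REBOUND-IN-PARENT.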

Traditionally, Clozure CL's threads have been cooperatively scheduled: through a combination of compiler and runtime support, the currently executing lisp thread arranged to be interrupted at certain discrete points in its execution (typically on entry to a function and at the beginning of any looping construct). This interrupt occurred several dozen times per second; in response, a handler function might observe that the current thread had used up its time slice and another function (the lisp scheduler) would be called to find some other thread that was in a runnable state, suspend execution of the current thread, and resume execution of the newly selected thread. The process of switching contexts between the outgoing and incoming threads happened in some mixture of Lisp and assembly language code; as far as the OS was concerned, there was one native thread running in the Lisp image and its stack pointer and other registers just happened to change from time to time.

Under Clozure CL's cooperative scheduling model, it was possible (via the use of the CCL:WITHOUT-INTERRUPTS construct) to defer handling of the periodic interrupt that invoked the lisp scheduler; it was not uncommon to use WITHOUT-INTERRUPTS to gain safe, exclusive access to global data structures. In some code (including much of Clozure CL itself) this idiom was very common: it was (justifiably) believed to be an efficient way of inhibiting the execution of other threads for a short period of time.

The timer interrupt that drove the cooperative scheduler was only able to (pseudo-)preempt lisp code: if any thread called a blocking OS I/O function, no other thread could be scheduled until that thread resumed execution of lisp code. Lisp library functions were generally attuned to this constraint, and did a complicated mixture of polling and "timed blocking" in an attempt to work around it. Needless to say, this code is complicated and less efficient than it might be; it meant that the lisp was a little busier than it should have been when it was "doing nothing" (waiting for I/O to be possible.)

For a variety of reasons - better utilization of CPU resources on single and multiprocessor systems and better integration with the OS in general - threads in Clozure CL 0.14 and later are preemptively scheduled. In this model, lisp threads are native threads and all scheduling decisions involving them are made by the OS kernel. (Those decisions might involve scheduling multiple lisp threads simultaneously on multiple processors on SMP systems.) This change has a number of subtle effects:

  • It is possible for two (or more) lisp threads to be executing simultaneously, possibly trying to access and/or modify the same data structures. Such access really should have been coordinated through the use of synchronization objects regardless of the scheduling model in effect; preemptively scheduled threads increase the chance of things going wrong at the wrong time and do not offer lightweight alternatives to the use of those synchronization objects.

  • Even on a single-processor system, a context switch can happen on any instruction boundary. Since (in general) other threads might allocate memory, this means that a GC can effectively take place at any instruction boundary. That's mostly an issue for the compiler and runtime system to be aware of, but it means that certain practices (such as trying to pass the address of a lisp object to foreign code) that were always discouraged are now discouraged ... vehemently.

  • There is no simple and efficient way to "inhibit the scheduler" or otherwise gain exclusive access to the entire CPU.

  • There are a variety of simple and efficient ways to synchronize access to particular data structures.

As a broad generalization: code that's been aggressively tuned to the constraints of the cooperative scheduler may need to be redesigned to work well with the preemptive scheduler (and code written to run under Clozure CL's interface to the native scheduler may be less portable to other CL implementations, many of which offer a cooperative scheduler and an API similar to that of Clozure CL versions before 0.14). At the same time, there's a large overlap in functionality in the two scheduling models, and it'll hopefully be possible to write interesting and useful MP code that's largely independent of the underlying scheduling details.

The keyword :OPENMCL-NATIVE-THREADS is on *FEATURES* in 0.14 and later and can be used for conditionalization where required.
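
A minimal sketch of such conditionalization, using the standard #+ and #- reader conditionals. The function name NATIVE-THREADS-P is hypothetical and is not part of Clozure CL.

```lisp
;; NATIVE-THREADS-P is a hypothetical helper, not part of Clozure CL.
(defun native-threads-p ()
  "Return T if this code was read in a lisp that advertises native threads."
  ;; Exactly one of the two forms below survives the reader.
  #+openmcl-native-threads t
  #-openmcl-native-threads nil)
```

Code that must also run on a cooperative scheduler can guard its native-thread-specific forms the same way.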

6.2. (Intentionally) Missing Functionality

Much of the functionality described above is similar to that provided by Clozure CL's cooperative scheduler. Some other parts of the cooperative scheduler's functionality make no sense in a native threads implementation:

  • PROCESS-RUN-REASONS and PROCESS-ARREST-REASONS were SETFable process attributes; each was just a list of arbitrary tokens. A thread was eligible for scheduling (roughly equivalent to being "enabled") if its arrest-reasons list was empty and its run-reasons list was not. I don't think that it's appropriate to encourage a programming style in which otherwise runnable threads are enabled and disabled on a regular basis (it's preferable for threads to wait for some sort of synchronization event to occur if they can't occupy their time productively.)

  • There were a number of primitives for maintaining process queues; that's now the OS's job.

  • Cooperative threads were based on coroutining primitives associated with objects of type STACK-GROUP. STACK-GROUPs no longer exist.

6.3. Implementation Decisions and Open Questions

6.3.1. Thread Stack Sizes

When you use MAKE-PROCESS to create a thread, you can specify a stack size. Clozure CL does not impose a limit on the stack size you choose, but there is some evidence that choosing a stack size larger than the operating system's limit can cause excessive paging activity, at least on some operating systems.

The maximum stack size is operating-system-dependent. You can use shell commands to determine what it is on your platform. In bash, use "ulimit -s -H" to find the limit; in tcsh, use "limit -h s".

This issue does not affect programs that create threads using the default stack size, which you can do either by specifying no value for the :stack-size argument to MAKE-PROCESS, or by specifying the value CCL::*default-control-stack-size*.

If your program creates threads with a specified stack size, and that size is larger than the OS-specified limit, you may want to consider reducing the stack size in order to avoid possible excessive paging activity.
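
As an illustration, the following sketch creates a thread either with the default control stack size or with an explicit one. RUN-WITH-STACK is a hypothetical helper name; the byte count passed to it is an arbitrary example value, not a recommendation.

```lisp
;; RUN-WITH-STACK is a hypothetical helper, not part of Clozure CL.
(defun run-with-stack (name function &optional stack-size)
  "Run FUNCTION in a new thread named NAME.
If STACK-SIZE (in bytes) is omitted, the default control stack size is used."
  (let ((p (if stack-size
               (ccl:make-process name :stack-size stack-size)
               (ccl:make-process name))))
    (ccl:process-preset p function)
    (ccl:process-enable p)
    p))
```

For example, (run-with-stack "worker" #'do-work (* 1024 1024)) requests a one-megabyte control stack, where DO-WORK stands for any function of no arguments.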

6.3.2.  As of August 2003:

  • It's not clear that exposing PROCESS-SUSPEND/PROCESS-RESUME is a good idea: it's not clear that they offer ways to win, and it's clear that they offer ways to lose.

  • It has traditionally been possible to reset and enable a process that's "exhausted". (As used here, the term "exhausted" means that the process's initial function has run and returned and the underlying native thread has been deallocated.) One of the principal uses of PROCESS-RESET is to "recycle" threads; enabling an exhausted process involves creating a new native thread (and stacks and synchronization objects and ...), and this is the sort of overhead that such a recycling scheme is seeking to avoid. It might be worth trying to tighten things up and declare that it's an error to apply PROCESS-ENABLE to an exhausted thread (and to make PROCESS-ENABLE detect this error.)

  • When native threads that aren't created by Clozure CL first call into lisp, a "foreign process" is created, and that process is given its own set of initial bindings and set up to look mostly like a process that had been created by MAKE-PROCESS. The life cycle of a foreign process is certainly different from that of a lisp-created one: it doesn't make sense to reset/preset/enable a foreign process, and attempts to perform these operations should be detected and treated as errors.

6.4. Porting Code from the Old Thread Model

Older versions of Clozure CL used what are often called "user-mode threads", a less versatile threading model which does not require specific support from the operating system. This section discusses how to port code which was written for that mode.

It's hard to give step-by-step instructions; there are certainly a few things that one should look at carefully:

  • It's wise to be suspicious of most uses of WITHOUT-INTERRUPTS; there may be exceptions, but WITHOUT-INTERRUPTS is often used as shorthand for WITH-APPROPRIATE-LOCKING. Determining what type of locking is appropriate and writing the code to implement it is likely to be straightforward and simple most of the time.

  • I've only seen one case where a process's "run reasons" were used to communicate information as well as to control execution; I don't think that this is a common idiom, but may be mistaken about that.

  • It's certainly possible that programs written for cooperatively scheduled lisps that have run reliably for a long time have done so by accident: resource-contention issues tend to be timing-sensitive, and decoupling thread scheduling from lisp program execution affects timing. I know that there is or was code in both Clozure CL and commercial MCL that was written under the explicit assumption that certain sequences of open-coded operations were uninterruptable; it's certainly possible that the same assumptions have been made (explicitly or otherwise) by application developers.
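
The first point above can be sketched as follows: a shared counter that might once have been protected by WITHOUT-INTERRUPTS is protected by a lock instead. The names *COUNTER*, *COUNTER-LOCK*, INCREMENT-COUNTER and HAMMER-COUNTER are hypothetical; only MAKE-LOCK, WITH-LOCK-GRABBED, the semaphore functions and PROCESS-RUN-FUNCTION come from Clozure CL.

```lisp
;; All names defined here are hypothetical illustrations.
(defvar *counter* 0)
(defvar *counter-lock* (ccl:make-lock "counter"))

(defun increment-counter ()
  ;; Cooperative-scheduler code might have said
  ;; (without-interrupts (incf *counter*)); with native threads
  ;; that no longer excludes other threads, so take a lock.
  (ccl:with-lock-grabbed (*counter-lock*)
    (incf *counter*)))

(defun hammer-counter (nthreads nincrements)
  "Increment the counter from NTHREADS threads and return the final value."
  (let ((done (ccl:make-semaphore)))
    (setq *counter* 0)
    (dotimes (i nthreads)
      (ccl:process-run-function
       (format nil "incrementer-~d" i)
       (lambda ()
         (dotimes (j nincrements)
           (increment-counter))
         (ccl:signal-semaphore done))))
    ;; Wait once for each worker before reading the counter.
    (dotimes (i nthreads)
      (ccl:wait-on-semaphore done))
    *counter*))
```

With the lock in place, (hammer-counter 4 1000) counts every increment; without it, updates from different threads could be lost.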

6.5. Background Terminal Input

6.5.1. Overview

Unless and until Clozure CL provides alternatives (via window streams, telnet streams, or some other mechanism) all lisp processes share a common *TERMINAL-IO* stream (and therefore share *DEBUG-IO*, *QUERY-IO*, and other standard and internal interactive streams.)

It's anticipated that most lisp processes other than the "Initial" process run mostly in the background. If a background process writes to the output side of *TERMINAL-IO*, that may be a little messy and a little confusing to the user, but it shouldn't really be catastrophic. All I/O to Clozure CL's buffered streams goes thru a locking mechanism that prevents the worst kinds of resource-contention problems.

Although the problems associated with terminal output from multiple processes may be mostly cosmetic, the question of which process receives input from the terminal is likely to be a great deal more important. The stream locking mechanisms can make a confusing situation even worse: competing processes may "steal" terminal input from each other unless locks are held longer than they otherwise need to be, and locks can be held longer than they need to be (as when a process is merely waiting for input to become available on an underlying file descriptor).

Even if background processes rarely need to intentionally read input from the terminal, they may still need to do so in response to errors or other unanticipated situations. There are tradeoffs involved in any solution to this problem. The protocol described below allows background processes which follow it to reliably prompt for and receive terminal input. Background processes which attempt to receive terminal input without following this protocol will likely hang indefinitely while attempting to do so. That's certainly a harsh tradeoff, but since attempts to read terminal input without following this protocol only worked some of the time anyway, it doesn't seem to be an unreasonable one.

In the solution described here (and introduced in Clozure CL 0.9), the internal stream used to provide terminal input is always locked by some process (the "owning" process.) The initial process (the process that typically runs the read-eval-print loop) owns that stream when it's first created. By using the macro WITH-TERMINAL-INPUT, background processes can temporarily obtain ownership of the terminal and relinquish ownership to the previous owner when they're done with it.

In Clozure CL, BREAK, ERROR, CERROR, Y-OR-N-P, YES-OR-NO-P, and CCL:GET-STRING-FROM-USER are all defined in terms of WITH-TERMINAL-INPUT, as are the :TTY user-interfaces to STEP and INSPECT.

6.5.2. An example

? Welcome to Clozure CL Version (Beta: linux) 0.9!
?

? (process-run-function "sleeper" #'(lambda () (sleep 5) (break "broken")))
#<PROCESS sleeper(1) [Enabled] #x3063B33E>

?
;;
;; Process sleeper(1) needs access to terminal input.
;;
      

This example was run under ILISP; ILISP often gets confused if one tries to enter input and "point" doesn't follow a prompt. Entering a "simple" expression at this point gets it back in synch; that's otherwise not relevant to this example.

()
NIL
? (:y 1)
;;
;; process sleeper(1) now controls terminal input
;;
> Break in process sleeper(1): broken
> While executing: #<Anonymous Function #x3063B276>
> Type :GO to continue, :POP to abort.
> If continued: Return from BREAK.
Type :? for other options.
1 > :b
(30C38E30) : 0 "Anonymous Function #x3063B276" 52
(30C38E40) : 1 "Anonymous Function #x304984A6" 376
(30C38E90) : 2 "RUN-PROCESS-INITIAL-FORM" 340
(30C38EE0) : 3 "%RUN-STACK-GROUP-FUNCTION" 768
1 > :pop
;;
;; control of terminal input restored to process Initial(0)
;;
?
      

6.5.3. A more elaborate example.

If a background process ("A") needs access to the terminal input stream and that stream is owned by another background process ("B"), process "A" announces that fact, then waits until the initial process regains control.

? Welcome to Clozure CL Version (Beta: linux) 0.9!
?

? (process-run-function "sleep-60" #'(lambda () (sleep 60) (break "Huh?")))
#<PROCESS sleep-60(1) [Enabled] #x3063BF26>

? (process-run-function "sleep-5" #'(lambda () (sleep 5) (break "quicker")))
#<PROCESS sleep-5(2) [Enabled] #x3063D0A6>

?       ;;
;; Process sleep-5(2) needs access to terminal input.
;;
()
NIL

? (:y 2)
;;
;; process sleep-5(2) now controls terminal input
;;
> Break in process sleep-5(2): quicker
> While executing: #<Anonymous Function #x3063CFDE>
> Type :GO to continue, :POP to abort.
> If continued: Return from BREAK.
Type :? for other options.
1 >     ;; Process sleep-60(1) will need terminal access when
;; the initial process regains control of it.
;;
()
NIL
1 > :pop
;;
;; Process sleep-60(1) needs access to terminal input.
;;
;;
;; control of terminal input restored to process Initial(0)
;;

? (:y 1)
;;
;; process sleep-60(1) now controls terminal input
;;
> Break in process sleep-60(1): Huh?
> While executing: #<Anonymous Function #x3063BE5E>
> Type :GO to continue, :POP to abort.
> If continued: Return from BREAK.
Type :? for other options.
1 > :pop
;;
;; control of terminal input restored to process Initial(0)
;;

?
      

6.5.4. Summary

This scheme is certainly not bulletproof: imaginative use of PROCESS-INTERRUPT and similar functions might be able to defeat it and deadlock the lisp, and any scenario where several background processes are clamoring for access to the shared terminal input stream at the same time is likely to be confusing and chaotic. (An alternate scheme, where the input focus was magically granted to whatever thread the user was thinking about, was considered and rejected due to technical limitations.)

The longer-term fix would probably involve using network or window-system streams to give each process unique instances of *TERMINAL-IO*.

Existing code that attempts to read from *TERMINAL-IO* from a background process will need to be changed to use WITH-TERMINAL-INPUT. Since that code was probably not working reliably in previous versions of Clozure CL, this requirement doesn't seem to be too onerous.

Note that WITH-TERMINAL-INPUT both requests ownership of the terminal input stream and promises to restore that ownership to the initial process when it's done with it. An ad hoc use of READ or READ-CHAR doesn't make this promise; this is the rationale for the restriction on the :Y command.
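
A minimal sketch of a background-safe prompt written with WITH-TERMINAL-INPUT. ASK-FROM-BACKGROUND is a hypothetical helper name; the choice of *QUERY-IO* for both the prompt and the reply is an assumption made for the example.

```lisp
;; ASK-FROM-BACKGROUND is a hypothetical helper, not part of Clozure CL.
(defun ask-from-background (prompt)
  "Print PROMPT and return one line typed by the user.
Ownership of terminal input is requested for the duration of the body
and given back afterwards."
  (ccl:with-terminal-input
    (format *query-io* "~&~a " prompt)
    (finish-output *query-io*)
    (read-line *query-io*)))
```

When this is called from a background process, the user would see the "needs access to terminal input" announcement shown in the examples above and would grant access with the :Y command before typing a reply.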

6.6. The Threads which Clozure CL Uses for Its Own Purposes

In the "tty world", Clozure CL starts out with 2 lisp-level threads:

? :proc
1 : -> listener     [Active]
0 :    Initial      [Active]
    

If you look at a running Clozure CL with a debugging tool, such as GDB, or Apple's Thread Viewer.app, you'll see an additional kernel-level thread on Darwin; this is used by the Mach exception-handling mechanism.

The initial thread, conveniently named "initial", is the one that was created by the operating system when it launched Clozure CL. It maps the heap image into memory, does some Lisp-level initialization, and, when the Cocoa IDE isn't being used, creates the thread "listener", which runs the top-level loop that reads input, evaluates it, and prints the result.

After the listener thread is created, the initial thread does "housekeeping": it sits in a loop, sleeping most of the time and waking up occasionally to do "periodic tasks". These tasks include forcing output on specified interactive streams, checking for and handling control-C interrupts, etc. Currently, those tasks also include polling for the exit status of external processes and handling some kinds of I/O to and from those processes.

In this environment, the initial thread does these "housekeeping" activities as necessary, until ccl:quit is called; quitting interrupts the initial thread, which then ends all other threads in as orderly a fashion as possible and calls the C function #_exit.

The short-term plan is to handle each external-process in a dedicated thread; the worst-case behavior of the current scheme can involve busy-waiting and excessive CPU utilization while waiting for an external process to terminate in some cases.

The Cocoa features use more threads. Adding a Cocoa listener creates two threads:

      ? :proc
      3 : -> Listener     [Active]
      2 :    housekeeping  [Active]
      1 :    listener     [Active]
      0 :    Initial      [Active]
    

The Cocoa event loop has to run in the initial thread; when the event loop starts up, it creates a new thread to do the "housekeeping" tasks which the initial thread would do in the terminal-only mode. The initial thread then becomes the one to receive all Cocoa events from the window server; it's the only thread which can.

It also creates one "Listener" (capital-L) thread for each listener window, with a lifetime that lasts as long as the window does. So, if you open a second listener, you'll see five threads altogether:

      ? :proc
      4 : -> Listener-2   [Active]
      3 :    Listener     [Active]
      2 :    housekeeping  [Active]
      1 :    listener     [Active]
      0 :    Initial      [Active]
    

Unix signals, such as SIGINT (control-C), invoke a handler installed by the Lisp kernel. Although the OS doesn't make any specific guarantee about which thread will receive the signal, in practice, it seems to be the initial thread. The handler just sets a flag and returns; the housekeeping thread (which may be the initial thread, if Cocoa's not being used) will check for the flag and take whatever action is appropriate to the signal.

In the case of SIGINT, the action is to make the thread being interrupted enter a break loop. When there's more than one Lisp listener active, it's not always clear which thread that should be, since it really depends on the user's intentions, which there's no way to divine programmatically. To make its best guess, the handler first checks whether the value of ccl:*interactive-abort-process* is a thread, and, if so, uses it. If that fails, it chooses the thread which currently "owns" the default terminal input stream; see Section 6.5, Background Terminal Input.

In the bleeding-edge version of the Cocoa support which is based on Hemlock, an Emacs-like editor, each editor window has a dedicated thread associated with it. When a keypress event comes in which affects that specific window the initial thread sends it to the window's dedicated thread. The dedicated thread is responsible for trying to interpret keypresses as Hemlock commands, applying those commands to the active buffer; it repeats this in a loop, until the window closes. The initial thread handles all other events, such as mouse clicks and drags.

This thread-per-window scheme makes many things simpler, including the process of entering a "recursive command loop" in commands like "Incremental Search Forward", etc. (It might be possible to handle all Hemlock commands in the Cocoa event thread, but these "recursive command loops" would have to maintain a lot of context/state information; threads are a straightforward way of maintaining that information.)

Currently (August 2004), when a dedicated thread needs to alter the contents of the buffer or the selection, it does so by invoking methods in the initial thread, for synchronization purposes, but this is probably overkill and will likely be replaced by a more efficient scheme in the future.

The per-window thread could probably take more responsibility for drawing and handling the screen than it currently does; -something- needs to be done to buffer screen updates a bit better in some cases: you don't need to see everything that happens during something like indentation; you do need to see the results...

When Hemlock is being used, listener windows are editor windows, so in addition to each "Listener" thread, you should also see a thread which handles Hemlock command processing.

The Cocoa runtime may make additional threads in certain special situations; these threads usually don't run lisp code, and rarely if ever run much of it.

6.7. Threads Dictionary

[Function]

all-processes => result
Obtain a fresh list of all known Lisp threads.

Values:

result---a list of all lisp processes (threads) known to Clozure CL.

Description:

Returns a list of all lisp processes (threads) known to Clozure CL as of the precise instant it's called. It's safe to traverse this list and to modify the cons cells that comprise that list (it's freshly consed.) Since other threads can create and kill threads at any time, there's generally no way to get an "accurate" list of all threads, and (generally) no sense in which such a list can be accurate.
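
A small sketch of traversing that list. PROCESS-NAMES is a hypothetical helper name; PROCESS-NAME is the Clozure CL accessor for the name given to a process when it was created.

```lisp
;; PROCESS-NAMES is a hypothetical helper, not part of Clozure CL.
(defun process-names ()
  "Return the names of the threads known at the moment of the call."
  ;; The list is freshly consed, so it is safe to map over it even
  ;; while other threads are being created or killed.
  (mapcar #'ccl:process-name (ccl:all-processes)))
```

The result is a snapshot only: a thread named in it may already have exited by the time the caller looks at the list.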

[Function]

make-process name &key persistent priority class stack-size vstack-size tstack-size initial-bindings use-standard-initial-bindings => process
Creates and returns a new process.

Arguments and Values:

name---a string, used to identify the process.

persistent---if true, requests that information about the process be retained by SAVE-APPLICATION so that an equivalent process can be restarted when a saved image is run. The default is nil.

priority---ignored. It shouldn't be ignored of course, but there are complications on some platforms. The default is 0.

class---the class of process object to create; should be a subclass of CCL:PROCESS. The default is CCL:PROCESS.

stack-size---the size, in bytes, of the newly-created process's control stack; used for foreign function calls and to save function return address context. The default is CCL:*DEFAULT-CONTROL-STACK-SIZE*.

vstack-size---the size, in bytes, of the newly-created process's value stack; used for lisp function arguments, local variables, and other stack-allocated lisp objects. The default is CCL:*DEFAULT-VALUE-STACK-SIZE*.

tstack-size---the size, in bytes, of the newly-created process's temp stack; used for the allocation of dynamic-extent objects. The default is CCL:*DEFAULT-TEMP-STACK-SIZE*.

use-standard-initial-bindings---when true, the global "standard initial bindings" (see DEF-STANDARD-INITIAL-BINDING) are put into effect in the new thread before any bindings specified by :initial-bindings are. The default is t.

initial-bindings---an alist of (symbol . valueform) pairs, which can be used to initialize special variable bindings in the new thread. Each valueform is used to compute the value of a new binding of symbol in the execution environment of the newly-created thread. The default is nil.

process---the newly-created process.

Description:

Creates and returns a new lisp process (thread) with the specified attributes. process will not begin execution immediately; it will need to be preset (given an initial function to run, as by process-preset) and enabled (allowed to execute, as by process-enable) before it's able to actually do anything.

If a valueform in initial-bindings is a function, it is called, with no arguments, in the execution environment of the newly-created thread; the primary value it returns is used for the binding of the corresponding symbol.

Otherwise, valueform is evaluated in the execution environment of the newly-created thread, and the resulting value is used.
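
The following sketch exercises both kinds of valueform and the preset/enable steps. The variable and function names are hypothetical. Following the description above, the new thread should see *FROM-FUNCTION* bound to the value returned by the function and *FROM-FORM* bound to the value of the form.

```lisp
;; All names defined here are hypothetical illustrations.
(defvar *from-function* :global)
(defvar *from-form* :global)

(defun initial-bindings-demo ()
  "Return the values the new thread sees for the two variables."
  (let* ((seen nil)
         (done (ccl:make-semaphore))
         (p (ccl:make-process
             "bindings-demo"
             :initial-bindings
             (list
              ;; A function valueform: called with no arguments.
              (cons '*from-function* (lambda () :computed-by-function))
              ;; Any other valueform: evaluated in the new thread.
              (cons '*from-form* '(+ 1 2))))))
    ;; The process does nothing until it is preset and enabled.
    (ccl:process-preset p (lambda ()
                            (setq seen (list *from-function* *from-form*))
                            (ccl:signal-semaphore done)))
    (ccl:process-enable p)
    (ccl:wait-on-semaphore done)
    seen))
```

The calling thread's own values of the two variables are unaffected, since the bindings are established only in the new thread.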

[Function]

process-suspend process => result
Suspends a specified process.

Arguments and Values:

process---a lisp process (thread).

result---T if process had been runnable and is now suspended; NIL otherwise. That is, T if process's process-suspend-count transitioned from 0 to 1.

Description:

Suspends process, preventing it from running, and stopping it if it was already running. This is a fairly expensive operation, because it involves a few calls to the OS. It also risks creating deadlock if used improperly, for instance, if the process being suspended owns a lock or other resource which another process will wait for.

Each call to process-suspend must be reversed by a matching call to process-resume before process is able to run. What process-suspend actually does is increment the process-suspend-count of process.

A process can't suspend itself, though this once worked and this documentation has claimed that it did.

Notes:

process-suspend was previously called process-disable. process-enable now names a function for which there is no obvious inverse, so process-disable is no longer defined.

[Function]

process-resume process => result
Resumes a specified process which had previously been suspended by process-suspend.

Arguments and Values:

process---a lisp process (thread).

result---T if process had been suspended and is now runnable; NIL otherwise. That is, T if process's process-suspend-count transitioned from 1 to 0.

Description:

Undoes the effect of a previous call to process-suspend; if all such calls are undone, makes the process runnable. Has no effect if the process is not suspended. What process-resume actually does is decrement the process-suspend-count of process, to a minimum of 0.

Notes:

This was previously called PROCESS-ENABLE; process-enable now does something slightly different.

[Function]

process-suspend-count process => result
Returns the number of currently-pending suspensions applicable to a given process.

Arguments and Values:

process---a lisp process (thread).

result---The number of "outstanding" process-suspend calls on process, or NIL if process has expired.

Description:

An "outstanding" process-suspend call is one which has not yet been reversed by a call to process-resume. A process expires when its initial function returns, although it may later be reset.

A process is runnable when it has a process-suspend-count of 0, has been preset as by process-preset, and has been enabled as by process-enable. Newly-created processes have a process-suspend-count of 0.
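
A sketch of how the three functions above fit together. SUSPEND-RESUME-DEMO is a hypothetical name, and the idle worker exists only to give the example something to suspend.

```lisp
;; SUSPEND-RESUME-DEMO is a hypothetical helper, not part of Clozure CL.
(defun suspend-resume-demo ()
  "Suspend and resume an idle worker; return the suspend counts observed."
  (let ((worker (ccl:process-run-function
                 "idler"
                 (lambda () (loop (sleep 0.05)))))
        (counts '()))
    (ccl:process-suspend worker)          ; one outstanding suspension
    (push (ccl:process-suspend-count worker) counts)
    (ccl:process-resume worker)           ; reversed by a matching resume
    (push (ccl:process-suspend-count worker) counts)
    (ccl:process-kill worker)             ; tidy up the worker
    (nreverse counts)))
```

Going by the descriptions above, the first count reflects the single outstanding process-suspend call and the second reflects its reversal by process-resume.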

[Function]

process-preset process function &rest args => result
Sets the initial function and arguments of a specified process.

Arguments and Values:

process---a lisp process (thread).

function---a function, designated by itself or by a symbol which names it.

args---a list of values, appropriate as arguments to function.

result---undefined.

Description:

Typically used to initialize a newly-created or newly-reset process, setting things up so that when process becomes enabled, it will begin execution by applying function to args. process-preset does not enable process, although a process must be process-preset before it can be enabled. Processes are normally enabled by process-enable.

[Function]

process-enable process &optional timeout
Begins executing the initial function of a specified process.

Arguments and Values:

process---a lisp process (thread).

timeout---a time interval in seconds. May be any non-negative real number the floor of which fits in 32 bits. The default is 1.

result---undefined.

Description:

Tries to begin the execution of process. An error is signaled if process has never been process-preset. Otherwise, process invokes its initial function.

process-enable attempts to synchronize with process, which is presumed to be reset or in the act of resetting itself. If this attempt is not successful within the time interval specified by timeout, a continuable error is signaled, which offers the opportunity to continue waiting.

A process cannot meaningfully attempt to enable itself.

Notes:

It would be nice to have more discussion of what it means to synchronize with the process.

[Function]

process-run-function process-specifier function &rest args => process
Creates a process, presets it, and enables it.

Arguments and Values:

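process-specifier---either name, or a list of keyword arguments (:name, :persistent, :priority, :class, :stack-size, :vstack-size, :tstack-size) whose values are described below.

name---a string, used to identify the process. Passed to make-process.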

function---a function, designated by itself or by a symbol which names it. Passed to process-preset.

persistent---a boolean, passed to make-process.

priority---ignored.

class---a subclass of CCL:PROCESS. Passed to make-process.

stack-size---a size, in bytes. Passed to make-process.

vstack-size---a size, in bytes. Passed to make-process.

tstack-size---a size, in bytes. Passed to make-process.

process---the newly-created process.

Description:

Creates a lisp process (thread) via make-process, presets it via process-preset, and enables it via process-enable. This means that process will immediately begin to execute. process-run-function is the simplest way to create and run a process.
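
A minimal sketch of running a computation in another thread and collecting its result. COMPUTE-IN-THREAD is a hypothetical helper; the semaphore is used only so that the caller knows when the result is ready.

```lisp
;; COMPUTE-IN-THREAD is a hypothetical helper, not part of Clozure CL.
(defun compute-in-thread (function &rest args)
  "Apply FUNCTION to ARGS in a new thread and return its primary value."
  (let ((result nil)
        (done (ccl:make-semaphore)))
    (ccl:process-run-function
     "compute"
     ;; The trailing arguments to process-run-function are passed
     ;; to this function when the new thread starts.
     (lambda (f a)
       (setq result (apply f a))
       (ccl:signal-semaphore done))
     function args)
    (ccl:wait-on-semaphore done)
    result))
```

For example, (compute-in-thread #'+ 1 2 3) performs the addition in a new thread and returns the sum to the caller.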

[Function]

process-interrupt process function &rest args => result
Arranges for the target process to invoke a specified function at some point in the near future, and then return to what it was doing.

Arguments and Values:

process---a lisp process (thread).

function---a function.

args---a list of values, appropriate as arguments to function.

result---the result of applying function to args if process is the current-process, otherwise NIL.

Description:

Arranges for process to apply function to args at some point in the near future (interrupting whatever process was doing.) If function returns normally, process resumes execution at the point at which it was interrupted.

process must be in an enabled state in order to respond to a process-interrupt request. It's perfectly legal for a process to call process-interrupt on itself.

process-interrupt uses asynchronous POSIX signals to interrupt threads. If the thread being interrupted is executing lisp code, it can respond to the interrupt almost immediately (as soon as it has finished pseudo-atomic operations like consing and stack-frame initialization.)

If the interrupted thread is blocking in a system call, that system call is aborted by the signal and the interrupt is handled on return.

It is still difficult to reliably interrupt arbitrary foreign code (that may be stateful or otherwise non-reentrant); the interrupt request is handled when such foreign code returns to or enters lisp.

Notes:

It would probably be better for result to always be NIL, since the present behavior is inconsistent.

Process-interrupt works by sending signals between threads, via the C function #_pthread_kill. It could be argued that it should be done in one of several possible other ways under Darwin, to make it practical to asynchronously interrupt things which make heavy use of the Mach nanokernel.
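
A sketch of interrupting a running thread. INTERRUPT-DEMO is a hypothetical name; the worker loop and the :PING argument exist only for the example.

```lisp
;; INTERRUPT-DEMO is a hypothetical helper, not part of Clozure CL.
(defun interrupt-demo ()
  "Interrupt a worker and report what the interrupt function observed."
  (let* ((done (ccl:make-semaphore))
         (seen nil)
         (worker (ccl:process-run-function
                  "worker"
                  (lambda () (loop (sleep 0.05))))))
    (ccl:process-interrupt
     worker
     (lambda (tag)
       ;; This runs in WORKER, not in the thread that requested it.
       (setq seen (list tag (eq ccl:*current-process* worker)))
       (ccl:signal-semaphore done))
     :ping)
    (ccl:wait-on-semaphore done)
    (ccl:process-kill worker)
    seen))
```

The second element of the returned list records whether the function ran in the target thread; after the function returns, the worker goes back to its loop until it is killed.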

[Variable]

*CURRENT-PROCESS*
Bound in each process, to that process itself.

Value Type:

A lisp process (thread).

Initial Value:

Bound separately in each process, to that process itself.

Description:

Used when lisp code needs to find out what process it is executing in. Shouldn't be set by user code.

See Also:
all-processes

[Function]

process-reset process &optional kill-option => result
Causes a specified process to cleanly exit from any ongoing computation.

Arguments and Values:

process---a lisp process (thread).

kill-option---an internal argument, must be nil.

result---undefined.

Description:

Causes process to cleanly exit from any ongoing computation and enter a state where it can be process-preset. This is implemented by signaling a condition of type PROCESS-RESET; user-defined condition handlers should generally refrain from attempting to handle conditions of this type.

The kill-option argument is for internal use only and should not be specified by user code.

A process can meaningfully reset itself.

There is in general no way to know precisely when process has completed the act of resetting or killing itself; a process which has either entered the limbo of the reset state or exited has few ways of communicating either fact. process-enable can reliably determine when a process has entered the "limbo of the reset state", but can't predict how long the clean exit from ongoing computation might take: that depends on the behavior of unwind-protect cleanup forms, and of the OS scheduler.

Resetting a process other than *current-process* involves the use of process-interrupt.

[Function]

process-kill process => result
Causes a specified process to cleanly exit from any ongoing computation, and then exit.

Arguments and Values:

process---a lisp process (thread).

result---undefined.

Description:

Entirely equivalent to calling (PROCESS-RESET PROCESS T). Causes process to cleanly exit from any ongoing computation, and then exit.
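
A sketch showing that a killed process still runs its unwind-protect cleanup forms. KILL-DEMO is a hypothetical name, and the five-second limit is an arbitrary value chosen for the example.

```lisp
;; KILL-DEMO is a hypothetical helper, not part of Clozure CL.
(defun kill-demo ()
  "Kill a looping worker; return true if its cleanup form ran."
  (let* ((started (ccl:make-semaphore))
         (cleaned-up (ccl:make-semaphore))
         (worker (ccl:process-run-function
                  "doomed"
                  (lambda ()
                    (unwind-protect
                         (progn
                           (ccl:signal-semaphore started)
                           (loop (sleep 0.05)))
                      ;; Runs while the worker unwinds on its way out.
                      (ccl:signal-semaphore cleaned-up))))))
    ;; Make sure the worker is inside the unwind-protect first.
    (ccl:wait-on-semaphore started)
    (ccl:process-kill worker)
    (ccl:timed-wait-on-semaphore cleaned-up 5)))
```

The timed wait is there because, as noted under process-reset, there is in general no way to know precisely when the target has finished exiting.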

[Function]

process-abort process &optional condition => NIL
Causes a specified process to process an abort condition, as if it had invoked abort.

Arguments and Values:

process---a lisp process (thread).

condition---a lisp condition. The default is NIL.

Description:

Entirely equivalent to calling (process-interrupt process (lambda () (abort condition))). Causes process to transfer control to the applicable handler or restart for abort.

If condition is non-NIL, process-abort does not consider any handlers which are explicitly bound to conditions other than condition.

[Variable]

*TICKS-PER-SECOND*
Bound to the clock resolution of the OS scheduler.

Value Type:

A positive integer.

Initial Value:

The clock resolution of the OS scheduler. Currently, both LinuxPPC and DarwinPPC yield an initial value of 100.

Description:

This value is ordinarily of marginal interest at best, but, for backward compatibility, some functions accept timeout values expressed in "ticks". This value gives the number of ticks per second.

[Function]

process-whostate process => whostate
Returns a string which describes the status of a specified process.

Description:

This information is primarily for the benefit of debugging tools. whostate is a terse report on what process is doing, or not doing, and why.

If the process is currently waiting in a call to process-wait or process-wait-with-timeout, its process-whostate will be the value which was passed to that function as whostate.

Notes:

This should arguably be SETFable, but doesn't seem to ever have been.

[Function]

process-allow-schedule
Used for cooperative multitasking; probably never necessary.

Description:

Advises the OS scheduler that the current thread has nothing useful to do and that it should try to find some other thread to schedule in its place. There's almost always a better alternative, such as waiting for some specific event to occur. For example, you could use a lock or semaphore.

Notes:

This is a holdover from the days of cooperative multitasking. All modern general-purpose operating systems use preemptive multitasking.

[Function]

process-wait whostate function &rest args => result
Causes the current lisp process (thread) to wait for a given predicate to return true.

Arguments and Values:

whostate---a string, which will be the value of process-whostate while the process is waiting.

function---a function, designated by itself or by a symbol which names it.

args---a list of values, appropriate as arguments to function.

result---NIL.

Description:

Causes the current lisp process (thread) to repeatedly apply function to args until the call returns a true result, then returns NIL. After each failed call, yields the CPU as if by process-allow-schedule.

As with process-allow-schedule, it's almost always more efficient to wait for some specific event to occur; this isn't exactly busy-waiting, but the OS scheduler can do a better job of scheduling if it's given the relevant information. For example, you could use a lock or semaphore.
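As a minimal sketch of the test/yield loop described above (assuming Clozure CL; wait-for-flag-demo and its flag cell are hypothetical names used only for this illustration), the following starts a thread that sets a flag a moment later and waits for it with process-wait. A semaphore would normally be the better tool; the point here is only to show how function and args are passed.

```lisp
(defun wait-for-flag-demo ()
  "Wait, via CCL:PROCESS-WAIT, until another thread sets a flag."
  (let ((flag (list nil)))            ; one-element list used as a mutable cell
    (ccl:process-run-function "flag-setter"
                              (lambda ()
                                (sleep 0.1)
                                (setf (car flag) t)))
    ;; #'CAR is applied to FLAG repeatedly until it returns true.
    (ccl:process-wait "waiting for flag" #'car flag)
    (car flag)))
```

While the caller is waiting, its process-whostate should read "waiting for flag".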

[Function]

process-wait-with-timeout whostate ticks function &rest args => result
Causes the current thread to wait for a given predicate to return true, or for a timeout to expire.

Arguments and Values:

whostate---a string, which will be the value of process-whostate while the process is waiting.

ticks---either a positive integer expressing a duration in "ticks" (see *ticks-per-second*), or NIL.

function---a function, designated by itself or by a symbol which names it.

args---a list of values, appropriate as arguments to function.

result---T if process-wait-with-timeout returned because its function returned true, or NIL if it returned because the duration ticks has been exceeded.

Description:

If ticks is NIL, behaves exactly like process-wait, except for returning T. Otherwise, function will be tested repeatedly, in the same kind of test/yield loop as in process-wait until either function returns true, or the duration ticks has been exceeded.

Having already read the descriptions of process-allow-schedule and process-wait, the astute reader has no doubt anticipated the observation that better alternatives should be used whenever possible.
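Because the timeout is expressed in ticks, callers that think in seconds need to convert. The helper below is a sketch under that assumption; wait-up-to is a hypothetical name, not part of Clozure CL, and it only checks the result for truth rather than for the exact value T.

```lisp
(defun wait-up-to (seconds predicate)
  "Poll PREDICATE for at most SECONDS; return true if it succeeded in time."
  ;; Convert seconds to ticks, keeping the result a positive integer.
  (let ((ticks (max 1 (round (* seconds ccl:*ticks-per-second*)))))
    (ccl:process-wait-with-timeout "wait-up-to" ticks predicate)))
```

For example, (wait-up-to 0.1 (lambda () nil)) should return NIL after roughly a tenth of a second.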

[Macro]

without-interrupts &body body => result
Evaluates its body in an environment in which process-interrupt requests are deferred.

Arguments and Values:

body---an implicit progn.

result---the primary value returned by body.

Description:

Executes body in an environment in which process-interrupt requests are deferred. As noted in the description of process-interrupt, this has nothing to do with the scheduling of other threads; it may be necessary to inhibit process-interrupt handling when (for instance) modifying some data structure (for which the current thread holds an appropriate lock) in some manner that's not reentrant.
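The situation described above might look like the following sketch. The account structure and the deposit function are hypothetical; the idea is that two related slots are updated together, with the lock keeping other threads out and without-interrupts keeping a process-interrupt function from running between the two updates.

```lisp
(defstruct account
  (lock (ccl:make-lock "account"))     ; one lock per account
  (balance 0)
  (history '()))

(defun deposit (account amount)
  "Add AMOUNT to ACCOUNT, recording it in the history; return the new balance."
  (ccl:with-lock-grabbed ((account-lock account))
    ;; Defer PROCESS-INTERRUPT requests so that BALANCE and HISTORY
    ;; are never observed out of step by an interrupting function.
    (ccl:without-interrupts
      (incf (account-balance account) amount)
      (push amount (account-history account))))
  (account-balance account))
```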

[Function]

make-lock &optional name => lock
Creates and returns a lock object, which can be used for synchronization between threads.

Arguments and Values:

name---any lisp object; saved as part of lock. Typically a string or symbol which may appear in the process-whostates of threads which are waiting for lock.

lock---a newly-allocated object of type CCL:LOCK.

Description:

Creates and returns a lock object, which can be used to synchronize access to some shared resource. lock is initially in a "free" state; a lock can also be "owned" by a thread.

[Macro]

with-lock-grabbed (lock) &body body => result
Waits until a given lock can be obtained, then evaluates its body with the lock held.

Arguments and Values:

lock---an object of type CCL:LOCK.

body---an implicit progn.

result---the primary value returned by body.

Description:

Waits until lock is either free or owned by the calling thread, then executes body with the lock owned by the calling thread. If lock was free when with-lock-grabbed was called, it is restored to a free state after body is executed.
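A typical use is protecting a shared counter. The sketch below assumes Clozure CL; *counter*, *counter-lock*, increment-counter, and run-counter-demo are hypothetical names chosen for the example.

```lisp
(defvar *counter* 0)
(defvar *counter-lock* (ccl:make-lock "counter-lock"))

(defun increment-counter (n)
  "Add 1 to *COUNTER* N times, holding *COUNTER-LOCK* for each update."
  (dotimes (i n)
    (ccl:with-lock-grabbed (*counter-lock*)
      (incf *counter*))))

(defun run-counter-demo (&key (threads 4) (increments 1000))
  "Run THREADS workers that each increment the counter INCREMENTS times."
  (setf *counter* 0)
  (let ((workers (loop for i below threads
                       collect (ccl:process-run-function
                                (format nil "worker-~d" i)
                                #'increment-counter increments))))
    ;; Wait for every worker to finish before reading the total.
    (dolist (w workers)
      (ccl:join-process w))
    *counter*))
```

With the lock in place the result should always be threads times increments; without it, updates from different threads could be lost.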

[Function]

grab-lock lock
Waits until a given lock can be obtained, then obtains it.

Arguments and Values:

lock---an object of type CCL:LOCK.

Description:

Blocks until lock is owned by the calling thread.

The macro with-lock-grabbed could be defined in terms of grab-lock and release-lock, but it is actually implemented at a slightly lower level.

[Function]

release-lock lock
Relinquishes ownership of a given lock.

Arguments and Values:

lock---an object of type CCL:LOCK.

Description:

Signals an error of type CCL:LOCK-NOT-OWNER if lock is not already owned by the calling thread; otherwise, undoes the effect of one previous grab-lock. If this means that release-lock has now been called on lock the same number of times as grab-lock has, lock becomes free.
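When with-lock-grabbed does not fit (for instance, when the lock is taken in one function and released in another), grab-lock and release-lock can be paired by hand. The following is only a sketch; call-with-lock-held is a hypothetical helper, and with-lock-grabbed remains the safer choice where it applies.

```lisp
(defun call-with-lock-held (lock thunk)
  "Call THUNK with LOCK owned by the calling thread, then release it once."
  (ccl:grab-lock lock)
  (unwind-protect
       (funcall thunk)
    ;; Undo exactly one GRAB-LOCK, even on a non-local exit.
    (ccl:release-lock lock)))
```

Because each release-lock undoes one grab-lock, nested calls on the same lock from the same thread are fine: the lock becomes free only when the outermost call returns.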

[Function]

try-lock lock => result
Obtains the given lock, but only if it is not necessary to wait for it.

Arguments and Values:

lock---an object of type CCL:LOCK.

result---T if lock has been obtained, or NIL if it has not.

Description:

Tests whether lock can be obtained without blocking - that is, either lock is already free, or it is already owned by *current-process*. If it can, causes it to be owned by the calling lisp process (thread) and returns T. Otherwise, the lock is already owned by another thread and cannot be obtained without blocking; NIL is returned in this case.
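The two outcomes can be observed by probing a lock from a second thread, once while the lock is free and once while the calling thread holds it. This is a sketch only; try-from-other-thread and try-lock-demo are hypothetical names.

```lisp
(defun try-from-other-thread (lock)
  "Return T if a fresh thread can obtain LOCK without waiting, else NIL."
  (ccl:join-process
   (ccl:process-run-function
    "try-lock-probe"
    (lambda ()
      (if (ccl:try-lock lock)
          (progn (ccl:release-lock lock) t)
          nil)))))

(defun try-lock-demo ()
  "Probe a lock while it is free, then while the caller owns it."
  (let ((lock (ccl:make-lock "probe-demo")))
    (list (try-from-other-thread lock)
          (ccl:with-lock-grabbed (lock)
            (try-from-other-thread lock)))))
```

The first probe should succeed and the second should fail, so try-lock-demo is expected to return (T NIL).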

[Function]

make-read-write-lock => read-write-lock
Creates and returns a read-write lock, which can be used for synchronization between threads.

Arguments and Values:

read-write-lock---a newly-allocated object of type CCL:READ-WRITE-LOCK.

Description:

Creates and returns an object of type CCL:READ-WRITE-LOCK. A read-write lock may, at any given time, belong to any number of lisp processes (threads) which act as "readers"; or, it may belong to at most one process which acts as a "writer". A read-write lock may never be held by a reader at the same time as a writer. Initially, read-write-lock has no readers and no writers.

Notes:

There probably should be some way to atomically "promote" a reader, making it a writer without releasing the lock, which could otherwise cause delay.

[Macro]

with-read-lock (read-write-lock) &body body => result
Waits until a given lock is available for read-only access, then evaluates its body with the lock held.

Arguments and Values:

read-write-lock---an object of type CCL:READ-WRITE-LOCK.

body---an implicit progn.

result---the primary value returned by body.

Description:

Waits until read-write-lock has no writer, ensures that *current-process* is a reader of it, then executes body.

After executing body, if *current-process* was not a reader of read-write-lock before with-read-lock was called, the lock is released. If it was already a reader, it remains one.

[Macro]

with-write-lock (read-write-lock) &body body => result
Waits until the given lock is available for write access, then executes its body with the lock held.

Arguments and Values:

read-write-lock---an object of type CCL:READ-WRITE-LOCK.

body---an implicit progn.

result---the primary value returned by body.

Description:

Waits until read-write-lock has no readers and no writer other than *current-process*, then ensures that *current-process* is the writer of it. With the lock held, executes body.

After executing body, if *current-process* was not the writer of read-write-lock before with-write-lock was called, the lock is released. If it was already the writer, it remains the writer.
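A read-write lock suits data that is read often and changed rarely. The sketch below keeps settings in an association list; *settings*, *settings-lock*, get-setting, and set-setting are hypothetical names used only for illustration.

```lisp
(defvar *settings* '())
(defvar *settings-lock* (ccl:make-read-write-lock))

(defun get-setting (key)
  "Look up KEY; any number of readers may do this at the same time."
  (ccl:with-read-lock (*settings-lock*)
    (cdr (assoc key *settings*))))

(defun set-setting (key value)
  "Store VALUE under KEY; the writer has exclusive access while it runs."
  (ccl:with-write-lock (*settings-lock*)
    (let ((entry (assoc key *settings*)))
      (if entry
          (setf (cdr entry) value)
          (push (cons key value) *settings*))
      value)))
```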

[Function]

make-semaphore => semaphore
Creates and returns a semaphore, which can be used for synchronization between threads.

Arguments and Values:

semaphore---a newly-allocated object of type CCL:SEMAPHORE.

Description:

Creates and returns an object of type CCL:SEMAPHORE. A semaphore has an associated "count" which may be incremented and decremented atomically; incrementing it represents sending a signal, and decrementing it represents handling that signal. semaphore has an initial count of 0.

[Function]

signal-semaphore semaphore => result
Atomically increments the count of a given semaphore.

Arguments and Values:

semaphore---an object of type CCL:SEMAPHORE.

result---an integer representing an error identifier which was returned by the underlying OS call.

Description:

Atomically increments semaphore's "count" by 1; this may enable a waiting thread to resume execution.

Notes:

result should probably be interpreted and acted on by signal-semaphore, because it is not likely to be meaningful to a lisp program, and the most common cause of failure is a type error.

[Function]

wait-on-semaphore semaphore => result
Waits until the given semaphore has a positive count which can be atomically decremented.

Arguments and Values:

semaphore---an object of type CCL:SEMAPHORE.

result---an integer representing an error identifier which was returned by the underlying OS call.

Description:

Waits until semaphore has a positive count that can be atomically decremented; this will succeed exactly once for each corresponding call to SIGNAL-SEMAPHORE.

Notes:

result should probably be interpreted and acted on by wait-on-semaphore, because it is not likely to be meaningful to a lisp program, and the most common cause of failure is a type error.

[Function]

timed-wait-on-semaphore semaphore timeout => result
Waits until the given semaphore has a positive count which can be atomically decremented, or until a timeout expires.

Arguments and Values:

semaphore---An object of type CCL:SEMAPHORE.

timeout---a time interval in seconds. May be any non-negative real number the floor of which fits in 32 bits. The default is 1.

result---T if timed-wait-on-semaphore returned because it was able to decrement the count of semaphore; NIL if it returned because the duration timeout has been exceeded.

Description:

Waits until semaphore has a positive count that can be atomically decremented, or until the duration timeout has elapsed.
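The following sketch shows both outcomes of a timed wait: one with nothing signaled, and one after another thread has signaled the semaphore. semaphore-demo is a hypothetical name, and the timeouts are arbitrary.

```lisp
(defun semaphore-demo ()
  "Return a list of two results: a timed-out wait, then a successful one."
  (let ((sem (ccl:make-semaphore))
        (results '()))
    ;; The count is 0 and nobody signals, so this wait should time out.
    (push (ccl:timed-wait-on-semaphore sem 0.1) results)
    ;; Another thread increments the count; the next wait can decrement it.
    (ccl:process-run-function "signaller"
                              (lambda () (ccl:signal-semaphore sem)))
    (push (ccl:timed-wait-on-semaphore sem 5) results)
    (nreverse results)))
```

Given the result values described above, the function is expected to return (NIL T).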

[Function]

process-input-wait fd &optional timeout
Waits until input is available on a given file-descriptor.

Arguments and Values:

fd---a file descriptor, which is a non-negative integer used by the OS to refer to an open file, socket, or similar I/O connection. See ccl::stream-device.

timeout---either NIL or a time interval in milliseconds. Must be a non-negative integer. The default is NIL.

Description:

Wait until input is available on fd. This uses the select() system call, and is generally a fairly efficient way of blocking while waiting for input. More accurately, process-input-wait waits until it's possible to read from fd without blocking, or until timeout, if it is not NIL, has been exceeded.

Note that it's possible to read without blocking if the file is at its end - although, of course, the read will return zero bytes.

Notes:

process-output-wait is the corresponding function for output; it accepts the same optional timeout parameter.

[Function]

process-output-wait fd &optional timeout
Waits until output is possible on a given file descriptor.

Arguments and Values:

fd---a file descriptor, which is a non-negative integer used by the OS to refer to an open file, socket, or similar I/O connection. See ccl::stream-device.

timeout---either NIL or a time interval in milliseconds. Must be a non-negative integer. The default is NIL.

Description:

Wait until output is possible on fd or until timeout, if it is not NIL, has been exceeded. This uses the select() system call, and is generally a fairly efficient way of blocking while waiting to output.

If process-output-wait is called on a network socket which has not yet established a connection, it will wait until the connection is established. This is an important use, often overlooked.

Notes:

process-input-wait is the corresponding function for input; it accepts the same optional timeout parameter.

[Macro]

with-terminal-input &body body => result
Executes its body in an environment with exclusive read access to the terminal.

Arguments and Values:

body---an implicit progn.

result---the primary value returned by body.

Description:

Requests exclusive read access to the standard terminal stream, *terminal-io*. Executes body in an environment with that access.

[Variable]

*REQUEST-TERMINAL-INPUT-VIA-BREAK*
Controls how attempts to obtain ownership of terminal input are made.

Value Type:

A boolean.

Initial Value:

NIL.

Description:

Controls how attempts to obtain ownership of terminal input are made. When NIL, a message is printed on *TERMINAL-IO*; it's expected that the user will later yield control of the terminal via the :Y toplevel command. When T, a BREAK condition is signaled in the owning process; continuing from the break loop will yield the terminal to the requesting process (unless the :Y command was already used to do so in the break loop.)

[Toplevel Command]

(:y p)
Yields control of terminal input to a specified lisp process (thread).

Arguments and Values:

p---a lisp process (thread), designated either by an integer which matches its process-serial-number, or by a string which is equal to its process-name.

Description:

:Y is a toplevel command, not a function. As such, it can only be used interactively, and only from the initial process.

The command yields control of terminal input to the process p, which must have used with-terminal-input to request access to the terminal input stream.

[Function]

join-process process &optional default => values
Waits for a specified process to complete and returns the values that that process's initial function returned.

Arguments and Values:

process---a process, typically created by process-run-function or by make-process

default---A default value to be returned if the specified process doesn't exit normally.

values---The values returned by the specified process's initial function if that function returns, or the value of the default argument, otherwise.

Description:

Waits for the specified process to terminate. If the process terminates "normally" (if its initial function returns), returns the values that that initial function returns. If the process does not terminate normally (e.g., if it's terminated via process-kill) and a default argument is provided, returns the value of that default argument. If the process doesn't terminate normally and no default argument is provided, signals an error.

A process can't successfully join itself, and only one process can successfully receive notification of another process's termination.
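A common pattern is to start several threads and collect their results with join-process. The sketch below assumes Clozure CL; sum-in-thread and parallel-sum are hypothetical names, and both threads are expected to return normally, so no default is supplied.

```lisp
(defun sum-in-thread (numbers)
  "Start a thread whose initial function returns the sum of NUMBERS."
  (ccl:process-run-function "summer"
                            (lambda () (reduce #'+ numbers))))

(defun parallel-sum (list-a list-b)
  "Sum two lists in separate threads and add the results."
  (let ((a (sum-in-thread list-a))
        (b (sum-in-thread list-b)))
    ;; JOIN-PROCESS returns what each thread's initial function returned.
    (+ (ccl:join-process a)
       (ccl:join-process b))))
```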

Chapter 7. Programming with Sockets

7.1. Overview

Clozure CL supports the socket abstraction for interprocess communication. A socket represents a connection to another process, typically (but not necessarily) a TCP/IP network connection to a client or server running on some other machine on the network.

All symbols mentioned in this chapter are exported from the CCL package. As of version 0.13, these symbols are additionally exported from the OPENMCL-SOCKET package.

Clozure CL supports three types of sockets: TCP sockets, UDP sockets, and Unix-domain sockets. This should be enough for all but the most esoteric network situations. All sockets are created by make-socket. The type of socket depends on the arguments to it, as follows:

tcp-stream

A buffered bi-directional stream over a TCP/IP connection. tcp-stream is a subclass of stream, and you can read and write to it using all the usual stream functions. Created by (make-socket :address-family :internet :type :stream :connect :active ...) or by (accept-connection ...).

file-socket-stream

A buffered bi-directional stream over a "UNIX domain" connection. file-socket-stream is a subclass of stream, and you can read and write to it using all the usual stream functions. Created by (make-socket :address-family :file :type :stream :connect :active ...) or by (accept-connection ...).

listener-socket

A passive socket used to listen for incoming TCP/IP connections on a particular port. A listener-socket is not a stream. It doesn't support I/O. It can only be used to create new tcp-streams by accept-connection. Created by (make-socket :type :stream :connect :passive ...).

file-listener-socket

A passive socket used to listen for incoming UNIX domain connections named by a file in the local filesystem. A file-listener-socket is not a stream. It doesn't support I/O. It can only be used to create new file-socket-streams by accept-connection. Created by (make-socket :address-family :file :type :stream :connect :passive ...).

udp-socket

A socket representing a packet-based UDP/IP connection. A udp-socket supports I/O but it is not a stream. Instead, you must use the special functions send-to and receive-from to read and write to it. Created by (make-socket :type :datagram ...).

7.2. Sockets Dictionary

[Function]

make-socket &key address-family type connect eol format remote-host remote-port local-host local-port local-filename remote-filename keepalive reuse-address nodelay broadcast linger backlog input-timeout output-timeout connect-timeout auto-close deadline

Arguments and Values:

address-family---The address/protocol family of this socket. Currently only :internet (the default), meaning IP, and :file, referring to UNIX domain addresses, are supported.

type---One of :stream (the default) to request a connection-oriented socket, or :datagram to request a connectionless socket.

connect---This argument is only relevant to sockets of type :stream. One of :active (the default) to request a TCP or file-socket stream, or :passive to request a file or TCP listener socket.

eol---This argument is currently ignored (it is accepted for compatibility with Franz Allegro).

format---One of :text (the default), :binary, or :bivalent. This argument is ignored for :stream sockets for now, as :stream sockets are currently always bivalent (i.e. they support both character and byte I/O). For :datagram sockets, this argument is ignored (the format of a datagram socket is always :binary).

remote-host---Required for TCP streams, it specifies the host to connect to (in any format acceptable to lookup-hostname). Ignored for listener sockets. For UDP sockets, it can be used to specify a default host for subsequent calls to send-to or receive-from.

remote-port---Required for TCP streams, it specifies the port to connect to (in any format acceptable to lookup-port). Ignored for listener sockets. For UDP sockets, it can be used to specify a default port for subsequent calls to send-to or receive-from.

remote-filename---Required for file-socket streams, it specifies the name of a file in the local filesystem (e.g., NOT mounted via NFS, AFP, SMB, ...) which names and controls access to a UNIX-domain socket.

local-host---Allows you to specify a local host address for a listener or UDP socket, for the rare case where you want to restrict connections to those coming to a specific local address for security reasons.

local-port---Specify a local port for a socket. Most useful for listener sockets, where it is the port on which the socket will listen for connections.

local-filename---Required for file-listener-sockets. Specifies the name of a file in the local filesystem which is used to name a UNIX-domain socket. The actual filesystem file should not previously exist when the file-listener-socket is created; its parent directory should exist and be writable by the caller. The file used to name the socket will be deleted when the file-listener-socket is closed.

keepalive---If true, enables the periodic transmission of "keepalive" messages.

reuse-address---If true, allows the reuse of local ports in listener sockets, overriding some TCP/IP protocol specifications. You will need this if you are debugging a server.

nodelay---If true, disables Nagle's algorithm, which tries to minimize TCP packet fragmentation by introducing transmission delays in the absence of replies. Try setting this if you are using a protocol which involves sending a steady stream of data with no replies and are seeing significant degradations in throughput.

broadcast---If true, requests permission to broadcast datagrams on a UDP socket.

linger---If specified and non-nil, should be the number of seconds the OS is allowed to wait for data to be pushed through when a close is done. Only relevant for TCP sockets.

backlog---For a listener socket, specifies the number of connections which can be pending but not accepted. The default is 5, which is also the maximum on some operating systems.

input-timeout---The number of seconds before an input operation times out. Must be a real number between zero and one million. If an input operation takes longer than the specified number of seconds, an input-timeout error is signalled. (see Section 9.1.4, “Stream Timeouts and Deadlines”)

output-timeout---The number of seconds before an output operation times out. Must be a real number between zero and one million. If an output operation takes longer than the specified number of seconds, an output-timeout error is signalled. (see Section 9.1.4, “Stream Timeouts and Deadlines”)

connect-timeout---The number of seconds before a connection attempt times out. [TODO: what are acceptable values?] If a connection attempt takes longer than the specified number of seconds, a socket-error is signalled. This can be useful if the specified interval is shorter than the interval that the OS's socket layer imposes, which is sometimes a minute or two.

auto-close---When non-nil, any resulting socket stream will be closed when the GC can prove that the stream is unreferenced. This is done via CCL's termination mechanism [TODO add xref].

deadline---Specifies an absolute time in internal-time-units. If an I/O operation on the stream does not complete before the deadline then a COMMUNICATION-DEADLINE-EXPIRED error is signalled. A deadline takes precedence over any input/output timeouts that may be set. (see Section 9.1.4, “Stream Timeouts and Deadlines”)

Description:

Creates and returns a new socket.

[Function]

accept-connection (socket listener-socket) &key wait

Arguments and Values:

socket---The listener-socket to listen on.

wait---If true (the default), and there are no connections waiting to be accepted, waits until one arrives. If false, returns NIL immediately.

Description:

Extracts the first connection on the queue of pending connections, accepts it (i.e. completes the connection startup protocol) and returns a new tcp-stream or file-socket-stream representing the newly established connection. The tcp stream inherits any properties of the listener socket that are relevant (e.g. :keepalive, :nodelay, etc.) The original listener socket continues to be open listening for more connections, so you can call accept-connection on it again.
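To show how a listener socket, accept-connection, and an active tcp-stream fit together, here is a sketch of a one-shot exchange over the loopback interface. echo-once is a hypothetical name. The example assumes that passing 0 as :local-port lets the OS choose a free port (the usual BSD sockets convention) and that local-port then reports the port actually chosen.

```lisp
(defun echo-once ()
  "Send one line to a local server thread and return its upcased reply."
  (ccl:with-open-socket (listener :connect :passive
                                  :local-host "127.0.0.1"
                                  :local-port 0      ; assumption: OS picks a port
                                  :reuse-address t)
    (let* ((port (ccl:local-port listener))
           (server
             (ccl:process-run-function
              "echo-server"
              (lambda ()
                ;; Blocks until the client below connects.
                (let ((conn (ccl:accept-connection listener)))
                  (unwind-protect
                       (progn
                         (write-line (string-upcase (read-line conn)) conn)
                         (finish-output conn))
                    (close conn)))))))
      (ccl:with-open-socket (client :remote-host "127.0.0.1"
                                    :remote-port port)
        (write-line "hello" client)
        (finish-output client)          ; push the buffered line to the server
        (prog1 (read-line client)
          (ccl:join-process server))))))
```

The listener stays open for the whole exchange, so a longer-lived server would simply call accept-connection on it again in a loop.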

[Function]

dotted-to-ipaddr dotted &key errorp

Arguments and Values:

dotted---A string representing an IP address in the "nn.nn.nn.nn" format

errorp---If true (the default) an error is signaled if dotted is invalid. If false, NIL is returned.

Description:

Converts a dotted-string representation of a host address to a 32-bit unsigned IP address.

[Function]

ipaddr-to-dotted ipaddr &key values

Arguments and Values:

ipaddr---A 32-bit integer representing an internet host address

values---If false (the default), returns a string in the form "nn.nn.nn.nn". If true, returns four values representing the four octets of the address as unsigned 8-bit integers.

Description:

Converts a 32-bit unsigned IP address into a dotted-string representation or, if values is true, into its four octets.
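The relationship between the dotted form and the 32-bit integer is plain bit packing, most significant octet first. The portable helpers below are a sketch of that arithmetic only; octets-to-ipaddr and ipaddr-octets are hypothetical names, not part of Clozure CL.

```lisp
(defun octets-to-ipaddr (a b c d)
  "Pack four octets into one 32-bit unsigned integer, A most significant."
  (logior (ash a 24) (ash b 16) (ash c 8) d))

(defun ipaddr-octets (ipaddr)
  "Return the four octets of IPADDR as four values, most significant first."
  (values (ldb (byte 8 24) ipaddr)
          (ldb (byte 8 16) ipaddr)
          (ldb (byte 8 8) ipaddr)
          (ldb (byte 8 0) ipaddr)))
```

For "192.168.0.1" this gives 192*2^24 + 168*2^16 + 0*2^8 + 1 = 3232235521, which is the value (dotted-to-ipaddr "192.168.0.1") should return; (ipaddr-to-dotted 3232235521) should give back the string.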

[Function]

ipaddr-to-hostname ipaddr &key ignore-cache

Arguments and Values:

ipaddr---a 32-bit integer representing an internet host address

ignore-cache---This argument is ignored (it is accepted for compatibility with Franz Allegro)

Description:

Converts a 32-bit unsigned IP address into a host name string

[Function]

lookup-hostname host

Arguments and Values:

host---Specifies the host. It can be either a host name string such as "clozure.com", or a dotted address string such as "192.168.0.1", or a 32-bit unsigned IP address such as 3232235521.

Description:

Converts a host spec in any of the acceptable formats into a 32-bit unsigned IP address

[Function]

lookup-port port protocol

Arguments and Values:

port---Specifies the port. It can be either a string, such as "http" or a symbol, such as :http, or an unsigned port number. Note that a string is case-sensitive. A symbol is lowercased before lookup.

protocol---Must be one of "tcp" or "udp".

Description:

Finds the port number for the specified port and protocol

[Function]

receive-from (socket udp-socket) size &key buffer extract offset

Arguments and Values:

socket---The socket to read from

size---Maximum number of bytes to read. If the packet is larger than this, any extra bytes are discarded.

buffer---If specified, must be an octet vector which will be used to read in the data. If not specified, a new buffer will be created (of type determined by socket-format).

extract---If true, the subsequence of the buffer corresponding only to the data read in is extracted and returned as the first value. If false (the default) the original buffer is returned even if it is only partially filled.

offset---Specifies the start offset into the buffer at which data is to be stored. The default is 0.

Description:

Reads a UDP packet from a socket. If no packets are available, waits for a packet to arrive. Returns four values:

  1. The buffer with the data

  2. The number of bytes read

  3. The 32-bit unsigned IP address of the sender of the data

  4. The port number of the sender of the data

[Function]

send-to (socket udp-socket) buffer size &key remote-host remote-port offset

Arguments and Values:

socket---The socket to write to

buffer---A vector containing the data to send. It must be an octet vector.

size---Number of octets to send

remote-host---The host to send the packet to, in any format acceptable to lookup-hostname. The default is the remote host specified in the call to make-socket.

remote-port---The port to send the packet to, in any format acceptable to lookup-port. The default is the remote port specified in the call to make-socket.

offset---The offset in the buffer where the packet data starts

Description:

Send a UDP packet over a socket.
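send-to and receive-from can be exercised together by sending a datagram to a socket in the same Lisp image. This is a sketch under several assumptions: udp-round-trip is a hypothetical name, the message is plain ASCII, :local-port 0 lets the OS pick a free port, and the loopback interface does not drop the packet.

```lisp
(defun udp-round-trip (message)
  "Send MESSAGE (ASCII text) to ourselves over UDP and return what arrives."
  (ccl:with-open-socket (receiver :type :datagram
                                  :local-host "127.0.0.1"
                                  :local-port 0)     ; assumption: OS picks a port
    (ccl:with-open-socket (sender :type :datagram)
      (let ((octets (map '(vector (unsigned-byte 8)) #'char-code message))
            (buffer (make-array 512 :element-type '(unsigned-byte 8))))
        (ccl:send-to sender octets (length octets)
                     :remote-host "127.0.0.1"
                     :remote-port (ccl:local-port receiver))
        ;; Blocks until a packet arrives; COUNT is the number of bytes read.
        (multiple-value-bind (data count)
            (ccl:receive-from receiver (length buffer) :buffer buffer)
          (map 'string #'code-char (subseq data 0 count)))))))
```

Supplying the buffer explicitly avoids depending on the type of buffer that receive-from would otherwise allocate.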

[Function]

shutdown socket &key direction

Arguments and Values:

socket---The socket to shut down (typically a tcp-stream)

direction---One of :input to disallow further input, or :output to disallow further output.

Description:

Shuts down part of a bidirectional connection. This is useful if e.g. you need to read responses after sending an end-of-file signal.

[Function]

socket-os-fd socket

Arguments and Values:

socket---The socket

Description:

Returns the native OS's representation of the socket, or NIL if the socket is closed. On Unix, this is the Unix 'file descriptor', a small non-negative integer. Note that it is rather dangerous to mess around with tcp-stream fd's, as there is all sorts of buffering and asynchronous I/O going on above the OS level. listener-socket and udp-socket fd's are safer to mess with directly as there is less magic going on.

[Function]

remote-host socket

Arguments and Values:

socket---The socket

Description:

Returns the 32-bit unsigned IP address of the remote host, or NIL if the socket is not connected.

[Function]

remote-port socket

Arguments and Values:

socket---The socket

Description:

Returns the remote port number, or NIL if the socket is not connected.

[Function]

local-host socket

Arguments and Values:

socket---The socket

Description:

Returns the 32-bit unsigned IP address of the local host.

[Function]

local-port socket

Arguments and Values:

socket---The socket

Description:

Returns the local port number

[Function]

socket-address-family socket

Arguments and Values:

socket---The socket

Description:

Returns :internet or :file, as appropriate.

[Function]

socket-connect socket

Arguments and Values:

socket---The socket

Description:

Returns :active for tcp-stream, :passive for listener-socket, and NIL for udp-socket.

[Function]

socket-format socket

Arguments and Values:

socket---The socket

Description:

Returns the socket format as specified by the :format argument to make-socket.

[Function]

socket-type socket

Arguments and Values:

socket---The socket

Description:

Returns :stream for tcp-stream and listener-socket, and :datagram for udp-socket.

[Class]

SOCKET-ERROR

Description:

The class of OS errors signaled by socket functions

Superclasses:

simple-error

[Function]

socket-error-code socket-error

Arguments and Values:

socket-error---the condition

Description:

The OS error code of the error

[Function]

socket-error-identifier socket-error

Arguments and Values:

socket-error---the condition

Description:

A symbol representing the error code in a more OS-independent way.

One of: :address-in-use :connection-aborted :no-buffer-space :connection-timed-out :connection-refused :host-unreachable :host-down :network-down :address-not-available :network-reset :connection-reset :shutdown :access-denied or :unknown.
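The identifier is convenient for deciding how to react to a failure without inspecting OS-specific error codes. The following sketch (try-connect is a hypothetical name) attempts a TCP connection and, on failure, returns NIL together with the identifier.

```lisp
(defun try-connect (host port)
  "Return the remote port on success, or NIL and an error identifier."
  (handler-case
      (ccl:with-open-socket (s :remote-host host :remote-port port)
        (ccl:remote-port s))
    (ccl:socket-error (e)
      ;; Second value is one of the keywords listed above.
      (values nil (ccl:socket-error-identifier e)))))
```

Connecting to a local port on which nothing is listening would typically yield NIL and :connection-refused, though the exact identifier depends on the OS.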

[Function]

socket-error-situation socket-error

Arguments and Values:

socket-error---the condition

Description:

A string describing the context where the error happened. On Linux, this is the name of the system call which returned the error.

[Method]

close (socket socket) &key abort

Arguments and Values:

socket---The socket to close

abort---If false (the default), closes the socket in an orderly fashion, finishing up any buffered pending I/O, before closing the connection. If true, aborts/ignores pending I/O. (For listener and udp sockets, this argument is effectively ignored since there is never any buffered I/O to clean up).

Description:

The close generic function can be applied to sockets. It releases the operating system resources associated with the socket.

[Macro]

with-open-socket (var . make-socket-args) &body body

Arguments and Values:

var---variable to bind

make-socket-args---arguments suitable for passing to make-socket

body---body to execute

Description:

Executes body with var bound to the result of applying make-socket to make-socket-args. The socket gets closed on exit.

Chapter 8. Running Other Programs as Subprocesses

8.1. Overview

Clozure CL provides primitives to run external Unix programs, to select and connect Lisp streams to their input and output sources, to (optionally) wait for their completion and to check their execution and exit status.

All of the global symbols described below are exported from the CCL package.

This implementation is modeled on (and uses some code from) similar facilities in CMUCL.

8.2. Examples

;;; Capture the output of the "uname" program in a lisp string-stream
;;; and return the generated string (which will contain a trailing
;;; newline.)
? (with-output-to-string (stream)
    (run-program "uname" '("-r") :output stream))
;;; Write a string to *STANDARD-OUTPUT*, the hard way.
? (run-program "cat" () :input (make-string-input-stream "hello") :output t)
;;; Find out that "ls" doesn't expand wildcards.
? (run-program "ls" '("*.lisp") :output t)
;;; Let the shell expand wildcards.
? (run-program "sh" '("-c" "ls *.lisp") :output t)

These last examples will only produce output if Clozure CL's current directory contains .lisp files, of course.

8.3. Limitations and known bugs

  • Clozure CL and the external process may get confused about who owns which streams when input, output, or error are specified as T and wait is specified as NIL.

  • External processes that need to talk to a terminal device may not work properly; the environment (SLIME, ILISP) under which Clozure CL is run can affect this.

8.4. External-Program Dictionary

[Function]

run-program program args &key (wait t) pty sharing input if-input-does-not-exist output (if-output-exists :error) (error :output) (if-error-exists :error) status-hook external-format
Invokes an external program as an OS subprocess of lisp.

Arguments and Values:

program---A string or pathname which denotes an executable file. The PATH environment variable is used to find programs whose name doesn't contain a directory component.

args---A list of simple-strings

wait---Indicates whether or not run-program should wait for the EXTERNAL-PROCESS to complete or should return immediately.

pty---This option is accepted but currently ignored; it's intended to make it easier to run external programs that need to interact with a terminal device.

sharing---Sets a specific sharing mode (see :SHARING) for any streams created within RUN-PROGRAM when INPUT, OUTPUT or ERROR are requested to be a :STREAM.

input---Selects the input source used by the EXTERNAL-PROCESS. May be any of the following:

  • NIL Specifies that a null input stream (e.g., /dev/null) should be used.

  • T Specifies that the EXTERNAL-PROCESS should use the input source with which Clozure CL was invoked.

  • A string or pathname. Specifies that the EXTERNAL-PROCESS should receive its input from the named existing file.

  • :STREAM Creates a Lisp stream opened for character output. Any data written to this stream (accessible as the EXTERNAL-PROCESS-INPUT-STREAM of the EXTERNAL-PROCESS object) appears as input to the external process.

  • A stream. Specifies that the lisp stream should provide input to the EXTERNAL-PROCESS.

if-input-does-not-exist---If the input argument specifies the name of an existing file, this argument is used as the if-does-not-exist argument to OPEN when that file is opened.

output---Specifies where standard output from the external process should be sent. Analogous to input above.

if-output-exists---If output is specified as a string or pathname, this argument is used as the if-exists argument to OPEN when that file is opened.

error---Specifies where error output from the external process should be sent. In addition to the values allowed for output, the keyword :OUTPUT can be used to indicate that error output should be sent where standard output goes.

if-error-exists---Analogous to if-output-exists.

status-hook---A user-defined function of one argument (the EXTERNAL-PROCESS structure.) This function is called whenever Clozure CL detects a change in the status of the EXTERNAL-PROCESS.

external-format---The external format (see Section 4.5.2, “External Formats”) for all of the streams (input, output, and error) used to communicate with the external process.

Description:

Runs the specified program in an external (Unix) process, returning an object of type EXTERNAL-PROCESS if successful.

[Function]

signal-external-process proc signal-number

Arguments and Values:

proc---An EXTERNAL-PROCESS, as returned by RUN-PROGRAM.

signal-number---A small integer.

Description:

Sends the signal denoted by signal-number to the specified external process. (Typically, it is only useful to call this function if the EXTERNAL-PROCESS was created with :WAIT NIL.) Returns T if successful; signals an error otherwise.
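A sketch of how this might be used to stop a long-running subprocess. TERMINATE-EXTERNAL-PROCESS is a hypothetical helper; the default of 15 is the number conventionally assigned to SIGTERM on Unix systems, and the polling interval is an arbitrary choice.

```lisp
(defun terminate-external-process (proc &key (signal 15) (timeout 5))
  "Send SIGNAL to PROC, then poll for up to TIMEOUT seconds until PROC
is no longer :RUNNING.  Returns the values of EXTERNAL-PROCESS-STATUS."
  (signal-external-process proc signal)
  (loop repeat (* 10 timeout)
        while (eq (external-process-status proc) :running)
        do (sleep 0.1))
  (external-process-status proc))

;; Usage: start a process without waiting for it, then stop it.
(let ((proc (run-program "sleep" '("60") :wait nil)))
  (terminate-external-process proc))
```

Polling is used because the status change is noticed asynchronously; a status-hook function passed to run-program would be an alternative.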

[Function]

external-process-id proc
Returns the "process ID" of an OS subprocess, a positive integer which identifies it.

Arguments and Values:

proc---An EXTERNAL-PROCESS, as returned by RUN-PROGRAM.

Description:

Returns the process id assigned to the external process by the operating system. This is typically a positive, 16-bit number.

[Function]

external-process-input-stream proc
Returns the lisp stream which is used to write input to a given OS subprocess, if it has one.

Arguments and Values:

proc---An EXTERNAL-PROCESS, as returned by RUN-PROGRAM.

Description:

Returns the stream created when the input argument to run-program is specified as :STREAM.

[Function]

external-process-output-stream proc
Returns the lisp stream which is used to read output from an OS subprocess, if there is one.

Arguments and Values:

proc---An EXTERNAL-PROCESS, as returned by RUN-PROGRAM.

Description:

Returns the stream created when the output argument to run-program is specified as :STREAM.

[Function]

external-process-error-stream proc
Returns the stream which is used to read "error" output from a given OS subprocess, if it has one.

Arguments and Values:

proc---An EXTERNAL-PROCESS, as returned by RUN-PROGRAM.

Description:

Returns the stream created when the error argument to run-program is specified as :STREAM.
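The stream accessors above can be used together to treat an external program as a filter. This is a minimal sketch under the assumption that the text is small enough to fit in the OS pipe buffers (larger inputs would need a second thread to avoid deadlock); PIPE-THROUGH is a hypothetical name.

```lisp
(defun pipe-through (program args text)
  "Feed TEXT to PROGRAM's standard input; return its standard output as a string."
  (let* ((proc (run-program program args
                            :input :stream :output :stream :wait nil))
         (to-child (external-process-input-stream proc))
         (from-child (external-process-output-stream proc)))
    (write-string text to-child)
    (close to-child)                    ; the child now sees end-of-file
    (unwind-protect
         (with-output-to-string (result)
           (loop for char = (read-char from-child nil nil)
                 while char
                 do (write-char char result)))
      (close from-child))))

;; Usage: upcase a string the hard way.
(pipe-through "tr" '("a-z" "A-Z") "hello")
```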

[Function]

external-process-status proc
Returns information about whether an OS subprocess is running; or, if not, why not; and what its result code was if it completed.

Arguments and Values:

proc---An EXTERNAL-PROCESS, as returned by RUN-PROGRAM.

Description:

Returns, as multiple values, a keyword denoting the status of the external process (one of :running, :stopped, :signaled, or :exited), and the exit code or terminating signal if the first value is other than :running.
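For example, the two values can be used to recover a program's exit code. A small sketch (EXIT-CODE-OF is a hypothetical helper; run-program waits for completion by default, so the status is final by the time it is examined):

```lisp
(defun exit-code-of (program args)
  "Run PROGRAM to completion.  Return its exit code, or NIL if it
did not exit normally (for instance, if it was killed by a signal)."
  (multiple-value-bind (status code)
      (external-process-status (run-program program args))
    (and (eq status :exited) code)))

;; Usage
(exit-code-of "sh" '("-c" "exit 3"))
```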

Chapter 9. Streams

9.1. Stream Extensions

9.1.1. Stream External Encoding

Clozure CL streams have an external-encoding attribute that may be read using STREAM-EXTERNAL-ENCODING and set using (SETF STREAM-EXTERNAL-ENCODING).

9.1.2. Additional keywords for OPEN and MAKE-SOCKET

OPEN and MAKE-SOCKET have each been extended to take the additional keyword arguments: :CLASS, :SHARING, and :BASIC.

:CLASS

A symbol that names the desired class of the stream. The specified class must inherit from FILE-STREAM for OPEN.

:SHARING

Specifies how a stream can be used by multiple threads. The possible values are: :PRIVATE, :LOCK and :EXTERNAL. :PRIVATE is the default. NIL is also accepted as a synonym for :EXTERNAL.

:PRIVATE

Specifies that the stream can only be accessed by the thread that created it. This is the default. (There was some discussion on openmcl-devel about the idea of "transferring ownership" of a stream; this has not yet been implemented.) Attempts to do I/O on a stream with :PRIVATE sharing from a thread other than the stream's owner yield an error.

:LOCK

Specifies that all access to the stream requires the calling thread to obtain a lock. There are separate "read" and "write" locks for IO streams. This makes it possible, for instance, for one thread to read from such a stream while another thread writes to it. (See also make-read-write-lock, with-read-lock, and with-write-lock.)

:EXTERNAL

Specifies that I/O primitives enforce no access protocol. This may be appropriate for some types of application which can control stream access via application-level protocols. Note that since even the act of reading from a stream changes its internal state (and simultaneous access from multiple threads can therefore lead to corruption of that state), some care must be taken in the design of such protocols.

:BASIC

A boolean that selects between the two stream implementations: if true, the stream is an instance of CCL::BASIC-STREAM; if false, it is a Gray stream, i.e. an instance of FUNDAMENTAL-STREAM (see Section 9.1.3, “Basic Versus Fundamental Streams”). Defaults to T.
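As a sketch of the :SHARING keyword in use, the following opens a file stream with :SHARING :LOCK so that several threads may write to it. WRITE-FROM-THREADS is a hypothetical name. The lock protects the stream's internal state; this sketch does not rely on whole lines staying contiguous.

```lisp
(defun write-from-threads (path nthreads)
  "Have NTHREADS threads each write one line to the file at PATH."
  (with-open-file (log path :direction :output
                            :if-exists :supersede
                            :if-does-not-exist :create
                            :sharing :lock)   ; each access takes the stream's lock
    (let ((done (make-semaphore)))
      (dotimes (i nthreads)
        (let ((id i))                   ; fresh binding for each closure
          (process-run-function (format nil "writer-~d" id)
                                (lambda ()
                                  (format log "line from thread ~d~%" id)
                                  (signal-semaphore done)))))
      ;; Wait for every writer before the stream is closed.
      (dotimes (i nthreads)
        (wait-on-semaphore done)))))
```

With the default :SHARING :PRIVATE, the writer threads would instead get an error, since they are not the stream's owner.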

9.1.3. Basic Versus Fundamental Streams

Gray streams (see Section 9.2, “Creating Your Own Stream Classes with Gray Streams”) all inherit from FUNDAMENTAL-STREAM whereas basic streams inherit from CCL::BASIC-STREAM. The tradeoff between FUNDAMENTAL and BASIC streams is entirely between flexibility and performance, potential or actual. I/O primitives can recognize BASIC-STREAMs and exploit knowledge of implementation details. FUNDAMENTAL stream classes can be subclassed and extended in a standard way (the Gray streams protocol).

For existing stream classes (FILE-STREAMs, SOCKETs, and the internal CCL::FD-STREAM classes used to implement file streams and sockets), a lot of code can be shared between the FUNDAMENTAL and BASIC implementations. The biggest difference should be that the shared code can be reached from I/O primitives like READ-CHAR without going through some steps that are there to support generality and extensibility; skipping those steps when that support isn't needed can improve I/O performance.

The Gray stream method STREAM-READ-CHAR should work on appropriate BASIC-STREAMs. (There may still be cases where such methods are undefined; such cases should be considered bugs.) It is not guaranteed that Gray stream methods would ever be called by I/O primitives to read a character from a BASIC-STREAM, though there are still cases where this happens.

A simple loop reading 2M characters from a text file runs about 10X faster when the file is opened with the new defaults (:SHARING :PRIVATE :BASIC T) than it did before these changes were made. That sounds good, until one realizes that the "equivalent" C loop can be about 10X faster still ...

9.1.4. Stream Timeouts and Deadlines

A stream that is associated with a file descriptor has attributes and accessors: STREAM-INPUT-TIMEOUT, STREAM-OUTPUT-TIMEOUT, and STREAM-DEADLINE. All three accessors have corresponding SETF methods. STREAM-INPUT-TIMEOUT and STREAM-OUTPUT-TIMEOUT are specified in seconds and can be any positive real number less than one million. When a timeout is set and the corresponding I/O operation takes longer than the specified interval, an error is signalled. The error is INPUT-TIMEOUT for input and OUTPUT-TIMEOUT for output. STREAM-DEADLINE specifies an absolute time in internal-time-units. If an I/O operation on the stream does not complete before the deadline then a COMMUNICATION-DEADLINE-EXPIRED error is signalled. A deadline takes precedence over any input/output timeouts that may be set.
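A minimal sketch of an input timeout in use, assuming a stream that is associated with a file descriptor (such as a socket). READ-LINE-WITH-TIMEOUT and its return conventions are hypothetical.

```lisp
(defun read-line-with-timeout (stream seconds)
  "Read a line from STREAM, giving up after SECONDS seconds of waiting.
Returns the line, :EOF at end of file, or :TIMED-OUT."
  (setf (stream-input-timeout stream) seconds)
  (handler-case (read-line stream nil :eof)
    (input-timeout () :timed-out)))
```

Note that the timeout stays set on the stream after the function returns; a caller that wants the previous behavior back would need to save and restore the old value.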

9.1.5. Open File Streams

Historically, Clozure CL and MCL maintained a list of open file streams in the value of CCL:*OPEN-FILE-STREAMS*. This functionality has been replaced with the thread-safe function: CCL:OPEN-FILE-STREAMS and its two helper functions: CCL:NOTE-OPEN-FILE-STREAM and CCL:REMOVE-OPEN-FILE-STREAM. Maintaining this list helps to ensure that streams get closed in an orderly manner when the lisp exits.

[Function]

open-file-streams => stream-list
Returns the list of file streams that are currently open.

Values:

stream-list---A list of open file streams. This is a copy of an internal list so it may be destructively modified without ill effect.

Description:

Returns a list of open file streams.

[Function]

note-open-file-stream file-stream
Adds a file stream to the internal list of open file streams that is returned by open-file-streams.

Arguments:

file-stream---A file stream.

Description:

Adds a file stream to the internal list of open file streams that is returned by open-file-streams. This function is thread-safe. It will usually only be called from custom stream code when a file-stream is created.

[Function]

remove-open-file-stream file-stream
Removes file stream from the internal list of open file streams that is returned by open-file-streams.

Arguments:

file-stream---A file stream.

Description:

Removes a file stream from the internal list of open file streams that is returned by open-file-streams. This function is thread-safe. It will usually only be called from custom stream code when a file-stream is closed.
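The bookkeeping can be observed from the listener. In this sketch, LISTED-AS-OPEN-P is a hypothetical helper and the file name is arbitrary; it assumes that OPEN and CLOSE note and remove file streams as described above.

```lisp
(defun listed-as-open-p (stream)
  "True if STREAM appears in the list returned by OPEN-FILE-STREAMS."
  (and (member stream (open-file-streams)) t))

(let ((s (open "open-streams-demo.tmp" :direction :output
                                        :if-exists :supersede
                                        :if-does-not-exist :create)))
  (format t "~&listed while open: ~a~%" (listed-as-open-p s))
  (close s)
  (format t "~&listed after close: ~a~%" (listed-as-open-p s)))
```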

9.2. Creating Your Own Stream Classes with Gray Streams

9.2.1. Overview

This section is still being written and revised, because it is woefully incomplete. The dictionary section currently only lists a couple of functions. Caveat lector.

Gray streams are an extension to Common Lisp. They were proposed for standardization by David Gray (the astute reader now understands their name) quite some years ago, but not accepted, because they had not been tried sufficiently to find conceptual problems with them.

They have since been implemented by quite a few modern Lisp implementations. However, they do indeed have some inadequacies, and each implementation has addressed these in different ways. The situation today is that it's difficult to even find out how to get started using Gray streams. This is why standards are important.

Here's a list of some classes which you might wish for your new stream class to inherit from:

fundamental-stream
fundamental-input-stream
fundamental-output-stream
fundamental-character-stream
fundamental-binary-stream
fundamental-character-input-stream
fundamental-character-output-stream
fundamental-binary-input-stream
fundamental-binary-output-stream
ccl::buffered-stream-mixin
ccl::buffered-input-stream-mixin
ccl::buffered-output-stream-mixin
ccl::buffered-io-stream-mixin
ccl::buffered-character-input-stream-mixin
ccl::buffered-character-output-stream-mixin
ccl::buffered-character-io-stream-mixin
ccl::buffered-binary-input-stream-mixin
ccl::buffered-binary-output-stream-mixin
ccl::buffered-binary-io-stream-mixin
file-stream
file-input-stream
file-output-stream
file-io-stream
file-character-input-stream
file-character-output-stream
file-character-io-stream
file-binary-input-stream
file-binary-output-stream
file-binary-io-stream
ccl::fd-stream
ccl::fd-input-stream
ccl::fd-output-stream
ccl::fd-io-stream
ccl::fd-character-input-stream
ccl::fd-character-output-stream
ccl::fd-character-io-stream
ccl::fd-binary-input-stream
ccl::fd-binary-output-stream
ccl::fd-binary-io-stream

All of these are defined in ccl/level-1/l1-streams.lisp, except for the ccl:file-* ones, which are in ccl/level-1/l1-sysio.lisp.

According to the original Gray streams proposal, you should inherit from the most specific of the fundamental-* classes which applies. In Clozure CL, though, you will usually want buffering for better performance (unless you know of some reason not to), and in that case you should instead inherit from the appropriate ccl::buffered-* class. The buffering you get this way is exactly the same as the buffering which is used on ordinary, non-Gray streams, and force-output will work properly on it.

Notice the -mixin suffix in the names of all the ccl::buffered-* classes? The suffix means that the class is not "complete" by itself; you still need to inherit from a fundamental-* stream, even if you also inherit from a *-mixin stream. You might consider making your own class like this. (Oddly, though, the ccl::buffered-* classes do themselves inherit from the fundamental-* streams.)

If you want to be able to create an instance of your class with the :class argument to (open) and (with-open-file), you should make it inherit from one of the file-* classes. If you do this, it's not necessary to inherit from any of the other classes (though it won't hurt anything), since the file-* classes already do.

When you inherit from the file-* classes, you can use (call-next-method) in any of your methods to get the standard behavior. This is especially useful if you want to create a class which performs some simple filtering operation, such as changing everything to uppercase or to a different character encoding. If you do this, you will definitely need to specialize ccl::select-stream-class. Your method on ccl::select-stream-class should accept an instance of the class, but pay no attention to its contents, and return a symbol naming the class to actually be instantiated.

If you need to make your functionality generic across all the different types of stream, probably the best way to implement it is to make it a mixin, define classes with all the variants of input, output, io, character, and binary, which inherit both from your mixin and from the appropriate other class, then define a method on ccl::select-stream-class which chooses from among those classes.

Note that some of these classes are internal to the CCL package. If you try to inherit from those without the ccl:: prefix, you'll get an error which may confuse you, calling them "forward-referenced classes". That just means you used the wrong symbol, so add the prefix.

Here's a list of some generic functions which you might wish to specialize for your new stream class, and which ought to be documented at some point.

stream-direction stream =>
stream-device stream direction =>
stream-length stream &optional new =>
stream-position stream &optional new =>
streamp stream => boolean
stream-write-char output-stream char =>
stream-write-entire-string output-stream string =>
stream-read-char input-stream =>
stream-unread-char input-stream char =>
stream-force-output output-stream => nil
stream-maybe-force-output output-stream => nil
stream-finish-output output-stream => nil
stream-clear-output output-stream => nil
close stream &key abort => boolean
stream-fresh-line stream => t
stream-line-length stream => length
interactive-stream-p stream => boolean
stream-clear-input input-stream => nil
stream-listen input-stream => boolean
stream-filename stream => string
ccl::select-stream-class instance in-p out-p char-p => class
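To give a feel for what specializing these generic functions looks like, here is a minimal sketch of an unbuffered Gray stream that upcases characters on their way to another stream. The class and function names are hypothetical, and the example relies on the default string-writing methods falling back to STREAM-WRITE-CHAR.

```lisp
(defclass upcasing-output-stream (fundamental-character-output-stream)
  ((target :initarg :target :reader upcasing-target))
  (:documentation "Hypothetical example: upcases characters sent to TARGET."))

(defmethod stream-write-char ((stream upcasing-output-stream) char)
  (write-char (char-upcase char) (upcasing-target stream))
  char)

(defmethod stream-force-output ((stream upcasing-output-stream))
  (force-output (upcasing-target stream))
  nil)

(defun upcase-via-stream (string)
  "Write STRING through an UPCASING-OUTPUT-STREAM and return the result."
  (with-output-to-string (out)
    (let ((up (make-instance 'upcasing-output-stream :target out)))
      (write-string string up))))

;; Usage
(upcase-via-stream "hello, gray streams")
```

A production-quality class would more likely inherit from one of the ccl::buffered-* classes, as discussed above; this sketch trades performance for brevity.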

The following functions are standard parts of Common Lisp, but behave in special ways with regard to Gray streams.

open-stream-p stream => generalized-boolean
input-stream-p stream => generalized-boolean
output-stream-p stream => generalized-boolean
stream-element-type stream =>
stream-error-stream =>
open
close
with-open-file

Specifically, (open) and (with-open-file) accept a new keyword argument, :class, which may be a symbol naming a class; the class itself; or an instance of it. The class so given must be a subtype of 'stream, and an instance of it with no particular contents will be passed to ccl::select-stream-class to determine what class to actually instantiate.

The following are standard, and do not behave specially with regard to Gray streams, but probably should.

stream-external-format

9.2.2. Extending READ-SEQUENCE and WRITE-SEQUENCE

9.2.2.1. Overview

The "Gray Streams" API is based on an informal proposal that was made before ANSI CL adopted the READ-SEQUENCE and WRITE-SEQUENCE functions; as such, there is no "standard" way for the author of a Gray stream class to improve the performance of these functions by exploiting knowledge of the stream's internals (e.g., the buffering mechanism it uses.)

In the absence of any such knowledge, READ-SEQUENCE and WRITE-SEQUENCE are effectively just convenient shorthand for a loop which calls READ-CHAR/READ-BYTE/WRITE-CHAR/WRITE-BYTE as appropriate. The mechanism described below allows subclasses of FUNDAMENTAL-STREAM to define more specialized (and presumably more efficient) behavior.

9.2.2.2. Notes

READ-SEQUENCE and WRITE-SEQUENCE do a certain amount of sanity-checking and normalization of their arguments before dispatching to one of the methods STREAM-READ-LIST, STREAM-WRITE-LIST, STREAM-READ-VECTOR, or STREAM-WRITE-VECTOR (see Section 9.2.4, “Gray Streams Dictionary”). If an individual method can't do anything particularly clever, CALL-NEXT-METHOD can be used to handle the general case.

9.2.2.3. Example

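(defclass my-string-input-stream (fundamental-character-input-stream)
  ((string :initarg :string :accessor my-string-input-stream-string)
   (index :initform 0 :accessor my-string-input-stream-index)
   (length)))

;;; The LENGTH slot caches the length of the underlying string.
(defmethod initialize-instance :after ((stream my-string-input-stream) &key)
  (with-slots (string length) stream
    (setf length (length string))))

(defmethod stream-read-vector ((stream my-string-input-stream) vector start end)
  (if (not (typep vector 'simple-base-string))
      (call-next-method)
      (with-slots (string index length) stream
        ;; Copy characters until VECTOR is full or the string is exhausted;
        ;; return the index just past the last element stored.
        (do* ((outpos start (1+ outpos)))
             ((or (= outpos end)
                  (= index length))
              outpos)
          (setf (schar vector outpos)
                (schar string index))
          (incf index)))))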

9.2.3. Multibyte I/O

All heap-allocated objects in Clozure CL that cannot contain pointers to lisp objects are represented as ivectors. Clozure CL provides low-level functions, STREAM-READ-IVECTOR and STREAM-WRITE-IVECTOR, to efficiently transfer data between buffered streams and ivectors. There's some overlap in functionality between the functions described here and the ANSI CL READ-SEQUENCE and WRITE-SEQUENCE functions.

As used here, the term "octet" means roughly the same thing as the term "8-bit byte". The functions described below transfer a specified sequence of octets between a buffered stream and an ivector, and don't really concern themselves with higher-level issues (like whether that octet sequence is within bounds or how it relates to the logical contents of the ivector.) For these reasons, these functions are generally less safe and more flexible than their ANSI counterparts.

9.2.4. Gray Streams Dictionary

[Generic Function]

stream-read-list stream list count

Arguments and Values:

stream---a stream, presumably a fundamental-input-stream.

list---a list. When a STREAM-READ-LIST method is called by READ-SEQUENCE, this argument is guaranteed to be a proper list.

count---a non-negative integer. When a STREAM-READ-LIST method is called by READ-SEQUENCE, this argument is guaranteed not to be greater than the length of the list.

Description:

Should try to read up to count elements from stream into the list list, returning the number of elements actually read (which may be less than count in case of a premature end-of-file.)

[Generic Function]

stream-write-list stream list count

Arguments and Values:

stream---a stream, presumably a fundamental-output-stream.

list---a list. When a STREAM-WRITE-LIST method is called by WRITE-SEQUENCE, this argument is guaranteed to be a proper list.

count---a non-negative integer. When a STREAM-WRITE-LIST method is called by WRITE-SEQUENCE, this argument is guaranteed not to be greater than the length of the list.

Description:

Should try to write the first count elements of list to stream. The return value of this method is ignored.

[Generic Function]

stream-read-vector stream vector start end

Arguments and Values:

stream---a stream, presumably a fundamental-input-stream

vector---a vector. When a STREAM-READ-VECTOR method is called by READ-SEQUENCE, this argument is guaranteed to be a simple one-dimensional array.

start---a non-negative integer. When a STREAM-READ-VECTOR method is called by READ-SEQUENCE, this argument is guaranteed to be no greater than end and not greater than the length of vector.

end---a non-negative integer. When a STREAM-READ-VECTOR method is called by READ-SEQUENCE, this argument is guaranteed to be no less than start and not greater than the length of vector.

Description:

Should try to read successive elements from stream into vector, starting at element start (inclusive) and continuing through element end (exclusive.) Should return the index of the vector element beyond the last one stored into, which may be less than end in case of premature end-of-file.

[Generic Function]

stream-write-vector stream vector start end

Arguments and Values:

stream---a stream, presumably a fundamental-output-stream

vector---a vector. When a STREAM-WRITE-VECTOR method is called by WRITE-SEQUENCE, this argument is guaranteed to be a simple one-dimensional array.

start---a non-negative integer. When a STREAM-WRITE-VECTOR method is called by WRITE-SEQUENCE, this argument is guaranteed to be no greater than end and not greater than the length of vector.

end---a non-negative integer. When a STREAM-WRITE-VECTOR method is called by WRITE-SEQUENCE, this argument is guaranteed to be no less than start and not greater than the length of vector.

Description:

Should try to write successive elements of vector to stream, starting at element start (inclusive) and continuing through element end (exclusive.)

[Generic Function]

ccl::stream-device s direction
Returns the OS file descriptor associated with a given lisp stream.

Method Signatures:
ccl::stream-device (s stream) direction => fd
Arguments and Values:

s---a stream.

direction---either :INPUT or :OUTPUT.

fd---a file descriptor, which is a non-negative integer used by the OS to refer to an open file, socket, or similar I/O connection. NIL if there is no file descriptor associated with s in the direction given by direction.

Description:

Returns the file descriptor associated with s in the direction given by direction. It is necessary to specify direction because the input and output file descriptors may be different; the most common case is when one of them has been redirected by the Unix shell.
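A short sketch of a call, using an arbitrary temporary file name. Since the symbol is internal to the CCL package, the ccl:: prefix is needed. OUTPUT-FD-OF-NEW-FILE is a hypothetical name.

```lisp
(defun output-fd-of-new-file (path)
  "Open PATH for output and return the OS file descriptor behind the stream."
  (with-open-file (s path :direction :output
                          :if-exists :supersede
                          :if-does-not-exist :create)
    (ccl::stream-device s :output)))

;; Usage
(output-fd-of-new-file "stream-device-demo.tmp")
```

The descriptor is only meaningful while the stream is open; here it is returned purely for inspection.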

[Generic Function]

stream-read-ivector stream ivector start-octet max-octets

Description:

Reads up to max-octets octets from stream into ivector, storing them at start-octet. Returns the number of octets actually read.

Arguments:

stream---An input stream. The method defined on BUFFERED-INPUT-STREAMs requires that the size in octets of an instance of the stream's element type is 1.

ivector---Any ivector.

start-octet---A non-negative integer.

max-octets---A non-negative integer. The return value may be less than the value of this parameter if EOF was encountered.
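A minimal sketch of reading with this function, assuming a binary file stream whose element type is (UNSIGNED-BYTE 8) so that one element corresponds to one octet. READ-OCTETS is a hypothetical helper.

```lisp
(defun read-octets (path count)
  "Read up to COUNT octets from the file at PATH.
Returns a vector holding only the octets actually read."
  (let ((buffer (make-array count :element-type '(unsigned-byte 8))))
    (with-open-file (s path :element-type '(unsigned-byte 8))
      (let ((nread (stream-read-ivector s buffer 0 count)))
        ;; NREAD is smaller than COUNT if end-of-file came first.
        (subseq buffer 0 nread)))))
```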

[Generic Function]

stream-write-ivector stream ivector start-octet max-octets

Description:

Writes max-octets octets to stream from ivector, starting at start-octet. Returns max-octets.

Arguments:

stream---An output stream. The method defined on BUFFERED-OUTPUT-STREAMs requires that the size in octets of an instance of the stream's element type is 1.

ivector---Any ivector

start-octet---A non-negative integer.

max-octets---A non-negative integer.

Examples:
;;; Write the contents of a (SIMPLE-ARRAY(UNSIGNED-BYTE 16) 3) 
;;; to a character file stream. Read back the characters.
(let* ((a (make-array 3 
                      :element-type '(unsigned-byte 16)
                      :initial-contents '(26725 27756 28449))))
  (with-open-file (s "junk"
                     :element-type 'character
                     :direction :io
                     :if-does-not-exist :create
                     :if-exists :supersede)
    ;; Write six octets (three elements).
    (stream-write-ivector s a 0 6)
    ;; Rewind, then read a line
    (file-position s 0)
    (read-line s)))

;;; Write a vector of DOUBLE-FLOATs. Note that (to maintain
;;; alignment) there are 4 octets of padding before the 0th 
;;; element of a (VECTOR DOUBLE-FLOAT).
;;; (Note that (= (- arch::misc-dfloat-offset 
;;;                  arch::misc-data-offset) 4))
(defun write-double-float-vector
    (stream vector &key (start 0) (end (length vector)))
  (check-type vector (vector double-float))
  (let* ((start-octet (+ (* start 8)
                         (- arch::misc-dfloat-offset
                            arch::misc-data-offset)))
         (num-octets (* 8 (- end start))))
    (stream-write-ivector stream vector start-octet num-octets)))

Chapter 10. Writing Portable Extensions to the Object System using the MetaObject Protocol

10.1. Overview

Clozure CL supports a fairly large subset of the semi-standard MetaObject Protocol (MOP) for CLOS, as defined in chapters 5 and 6 of "The Art Of The Metaobject Protocol", (Kiczales et al, MIT Press 1991, ISBN 0-262-61074-4); this specification is also available online at http://www.alu.org/mop/index.html.

10.2. Implementation status

The keyword :openmcl-partial-mop is on *FEATURES* to indicate the presence of this functionality.

All of the symbols defined in the MOP specification (whether implemented or not) are exported from the "CCL" package and from an "OPENMCL-MOP" package.

construct                                   status
------------------------------------------  ------
accessor-method-slot-definition             +
add-dependent                               +
add-direct-method                           +
add-direct-subclass                         +
add-method                                  +
class-default-initargs                      +
class-direct-default-initargs               +
class-direct-slots                          +
class-direct-subclasses                     +
class-direct-superclasses                   +
class-finalized-p                           +
class-prototype                             +
class-slots                                 +
compute-applicable-methods                  -
compute-applicable-methods-using-classes    -
compute-class-precedence-list               +
compute-direct-initargs                     +
compute-discriminating-function             -
compute-effective-method                    +
compute-effective-slot-definition           +
compute-slots                               +
direct-slot-definition-class                +
effective-slot-definition-class             +
ensure-class                                +
ensure-class-using-class                    +
ensure-generic-function-using-class         +
eql-specializer-object                      +
extract-lambda-list                         +
extract-specializer-names                   +
finalize-inheritance                        +
find-method-combination                     +
funcallable-standard-instance-access        +
generic-function-argument-precedence-order  +
generic-function-declarations               +
generic-function-lambda-list                +
generic-function-method-class               +
generic-function-method-combination         +
generic-function-methods                    +
generic-function-name                       +
intern-eql-specializer                      +
make-method-lambda                          -
map-dependents                              +
method-function                             +
method-generic-function                     +
method-lambda-list                          +
method-qualifiers                           +
method-specializers                         +
reader-method-class                         +
remove-dependent                            +
remove-direct-method                        +
remove-direct-subclass                      +
remove-method                               +
set-funcallable-instance-function           -
slot-boundp-using-class                     +
slot-definition-allocation                  +
slot-definition-initargs                    +
slot-definition-initform                    +
slot-definition-initfunction                +
slot-definition-location                    +
slot-definition-name                        +
slot-definition-readers                     +
slot-definition-type                        +
slot-definition-writers                     +
slot-makunbound-using-class                 +
slot-value-using-class                      +
specializer-direct-generic-functions        +
specializer-direct-methods                  +
standard-instance-access                    +
update-dependent                            +
validate-superclass                         +
writer-method-class                         +

Note that those generic functions whose status is "-" in the table above deal with the internals of generic function dispatch and method invocation (the "Generic Function Invocation Protocol".) Method functions are implemented a bit differently in Clozure CL from what the MOP expects, and it's not yet clear if or how this subprotocol can be well-supported.

Those constructs that are marked as "+" in the table above are nominally implemented as the MOP document specifies (deviations from the specification should be considered bugs; please report them as such.) Note that some CLOS implementations in widespread use (e.g., PCL) implement some things (ENSURE-CLASS-USING-CLASS comes to mind) a bit differently from what the MOP specifies.
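As a small illustration of the introspective part of the protocol, the sketch below lists the slots and direct superclasses of a class using only constructs marked "+" in the table. The class names and the helper functions are hypothetical, and the current package is assumed to use CCL.

```lisp
(defclass demo-point ()
  ((x :initarg :x :initform 0)
   (y :initarg :y :initform 0)))

(defclass demo-colored-point (demo-point)
  ((color :initarg :color :initform :black)))

(defun effective-slot-names (class-name)
  "Return the names of all effective slots of the class named CLASS-NAME."
  (let ((class (find-class class-name)))
    ;; CLASS-SLOTS is only meaningful once the class has been finalized.
    (unless (class-finalized-p class)
      (finalize-inheritance class))
    (mapcar #'slot-definition-name (class-slots class))))

(defun direct-superclass-names (class-name)
  "Return the names of the direct superclasses of the class named CLASS-NAME."
  (mapcar #'class-name (class-direct-superclasses (find-class class-name))))

;; Usage
(effective-slot-names 'demo-colored-point)
(direct-superclass-names 'demo-colored-point)
```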

10.3. Concurrency issues

The entire CLOS class and generic function hierarchy is effectively a (large, complicated) shared data structure; it's not generally practical for a thread to request exclusive access to all of CLOS, and the effects of volitional modification of the CLOS hierarchy (via class redefinition, CHANGE-CLASS, etc) in a multithreaded environment aren't always tractable.

Native threads exacerbate this problem (in that they increase the opportunities for concurrent modification and access.) The implementation should try to ensure that a thread's view of any subset of the CLOS hierarchy is consistent (to the extent that that's possible) and should try to ensure that incidental modifications of the hierarchy (cache updates, etc.) happen atomically; it's not generally possible for the implementation to guarantee that a thread's view of things is correct and current.

If you are loading code and defining classes in the most usual way, which is to say, via the compiler, using only a single thread, these issues are probably not going to affect you much.

If, however, you are making finicky changes to the class hierarchy while you're running multiple threads which manipulate objects related to each other, more care is required. Before doing such a thing, you should know what you're doing and already be aware of what precautions to take, without being told. That said, if you do it, you should seriously consider what your application's critical data is, and use locks for critical code sections.
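As a minimal sketch of that last piece of advice, the following uses the lock primitives from the Threads Dictionary (CCL:MAKE-LOCK and CCL:WITH-LOCK-GRABBED) to keep two related pieces of shared state consistent. The names *INSTANCES*, *INSTANCE-COUNT*, *INSTANCES-LOCK* and NOTE-INSTANCE are hypothetical and exist only for illustration.

```lisp
;;; Hypothetical shared state: a list of objects and a count that must
;;; always agree with the length of that list.
(defvar *instances* '())
(defvar *instance-count* 0)
(defvar *instances-lock* (ccl:make-lock "instances"))

(defun note-instance (object)
  "Record OBJECT; the lock makes the two updates appear as one step."
  (ccl:with-lock-grabbed (*instances-lock*)
    (push object *instances*)
    (incf *instance-count*)))

(defun instance-snapshot ()
  "Readers take the same lock, so they never see the list and the
count out of step with each other."
  (ccl:with-lock-grabbed (*instances-lock*)
    (values (copy-list *instances*) *instance-count*)))
```

Any thread that touches this state has to go through the same lock for the protection to mean anything.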

Chapter 11. Profiling

11.1. Using the Linux oprofile system-level profiler

oprofile is a system-level profiler that's available for most modern Linux distributions.

Use of oprofile and its companion programs isn't really documented here; what is described is a way of generating symbolic information that enables profiling summaries generated by the opreport program to identify lisp functions meaningfully.

11.1.1. Generating a lisp image for use with oprofile

Modern Linux uses the "ELF" (Executable and Linking Format) object file format; the oprofile tools can associate symbolic names with addresses in a memory-mapped file if that file appears to be an ELF object file and if it contains ELF symbol information that describes those memory regions. So, the general idea is to make a lisp heap image that looks enough like an ELF shared library to fool the oprofile tools (we don't actually load heap images via ELF dynamic linking technology, but we can make it look like we did.)

11.1.2. Prerequisites

  • oprofile itself, which is almost certainly available via your distribution's package management system if not already preinstalled.

  • libelf, which provides utilities for reading and writing ELF files (and is likewise likely preinstalled or readily installable.)

11.1.3. Generating ELF symbols for Lisp functions

In order to create a lisp heap image which can be used for oprofile-based profiling, we need to:

  1. Load any code that we want to profile.

  2. Generate a file that contains ELF symbol information describing the names and addresses of all lisp functions.

    This step involves doing (from within Clozure CL)

    ? (require "ELF")
    "ELF"
    ("ELF")
    
    ? (ccl::write-elf-symbols-to-file "home:elf-symbols")
    	    

    The argument to CCL::WRITE-ELF-SYMBOLS-TO-FILE can be any writable pathname. The function will do whatever's necessary to nail lisp functions down in memory (so that they aren't moved by GC), then write an ELF object file to the indicated pathname. This typically takes a few seconds.

  3. Generate a lisp heap image in which the ELF symbols generated in the previous step are prepended.

    The function CCL:SAVE-APPLICATION provides a :PREPEND-KERNEL argument, which is ordinarily used to save a standalone application in which the kernel and heap image occupy a single file. :PREPEND-KERNEL doesn't really care what it's prepending to the image, and we can just as easily ask it to prepend the ELF symbol file generated in the previous step.

    ? (save-application "somewhere/image-for-profiling"
        :prepend-kernel "home:elf-symbols")
    	    

    If you then run

    shell> ccl64 somewhere/image-for-profiling
    	    

    any lisp code sampled by oprofile in that image will be identified "symbolically" by opreport.

11.1.4. Example

;;; Define some lisp functions that we want to profile and save
;;; a profiling-enabled image.  In this case, we just want to 
;;; define the FACT (factorial) function, to keep things simple.
? (defun fact (n) (if (zerop n) 1 (* n (fact (1- n)))))
FACT
? (require "ELF")
"ELF"
("ELF")
? (ccl::write-elf-symbols-to-file "home:elf-symbols")
"home:elf-symbols"
? (save-application "home:profiled-ccl" :prepend-kernel "home:elf-symbols")

;;; Set up oprofile with (mostly) default arguments.  This example was
;;; run on a Fedora 8 system where an uncompressed 'vmlinux' kernel
;;; image isn't readily available.

;;; Note that use of 'opcontrol' generally requires root access, e.g.,
;;; 'sudo' or equivalent:

[~] gb@rinpoche> sudo opcontrol --no-vmlinux --setup

;;; Start the profiler

[~] gb@rinpoche> sudo opcontrol --start
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
Profiler running.

;;; Start CCL with the "profiled-ccl" image created above.
;;; Invoke "(FACT 10000)"

[~] gb@rinpoche> ccl64 profiled-ccl 
Welcome to Clozure Common Lisp Version 1.2-r9198M-trunk  (LinuxX8664)!
? (null (fact 10000))
NIL
? (quit)

;;; We could stop the profiler (opcontrol --stop) here; instead,
;;; we simply flush profiling data to disk, where 'opreport' can
;;; find it.

[~] gb@rinpoche> sudo opcontrol --dump

;;; Ask opreport to show us where we were spending time in the
;;; 'profiled-ccl' image.

[~] gb@rinpoche> opreport -l profiled-ccl | head
CPU: Core 2, speed 1596 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        symbol name
6417     65.2466  <Compiled-function.(:INTERNAL.MULTIPLY-UNSIGNED-BIGNUM-AND-1-DIGIT-FIXNUM.MULTIPLY-BIGNUM-AND-FIXNUM).(Non-Global)..0x30004002453F>
3211     32.6487  <Compiled-function.%MULTIPLY-AND-ADD4.0x300040000AAF>
17        0.1729  <Compiled-function.%%ONE-ARG-DCODE.0x3000401740AF>
11        0.1118  <Compiled-function.%UNLOCK-RECURSIVE-LOCK-OBJECT.0x30004007F7DF>
10        0.1017  <Compiled-function.AUTO-FLUSH-INTERACTIVE-STREAMS.0x3000404ED6AF>
7         0.0712  <Compiled-function.%NANOSLEEP.0x30004040385F>
7         0.0712  <Compiled-function.%ZERO-TRAILING-SIGN-DIGITS.0x300040030F3F>
	

11.1.5. Issues

CCL::WRITE-ELF-SYMBOLS-TO-FILE currently only works on x86-64; it certainly -could- be made to work on ppc32/ppc64 as well.

So far, no one has been able to get the oprofile/opreport options that are supposed to generate call-stack info to generate meaningful call-stack info.

As of a few months ago, there was an attempt to provide symbol info for oprofile/opreport "on the fly", e.g., for use in JIT compilation or other incremental compilation scenarios. That's obviously more nearly The Right Thing, but it might be a while before that experimental code makes it into widespread use.

11.2. Using Apple's CHUD metering tools

11.2.1. Prerequisites

Apple's CHUD metering tools are available (as of this writing) from:

ftp://ftp.apple.com/developer/Tool_Chest/Testing_-_Debugging/Performance_tools/.

The CHUD tools are also generally bundled with Apple's XCode tools. CHUD 4.5.0 (which seems to be bundled with XCode 3.0) seems to work well with this interface; later versions may have problems. Versions of CHUD as old as 4.1.1 may work with 32-bit PPC versions of CCL; later versions (not sure exactly -what- versions) added x86, ppc64, and x86-64 support.

One way to tell whether any version of the CHUD tools is installed is to try to invoke the "shark" command-line program (/usr/bin/shark) from the shell:

shell> shark --help
	

and verify that it prints a usage summary.

CHUD consists of several components, including command-line programs, GUI applications, kernel extensions, and "frameworks" (collections of libraries, headers, and other resources which applications can use to access functionality provided by the other components.) Past versions of Clozure CL/OpenMCL have used the CHUD framework libraries to control the CHUD profiler. Even though the rest of CHUD is currently 64-bit aware, the frameworks are unfortunately still only available as 32-bit libraries, so the traditional way of controlling the profiling facility from Clozure CL has only worked from DarwinPPC32 versions.

Two of the CHUD component programs are of particular interest:

  1. The "Shark" application (often installed in "/Developer/Applications/Performance Tools/Shark.app"), which provides a graphical user interface for exploring and analyzing profiling results and provides tools for creating "sampling configurations" (see below), among other things.

  2. The "shark" program ("/usr/bin/shark"), which can be used to control the CHUD profiling facility and to collect sampling data, which can then be displayed and analyzed in Shark.app.

The fact that these two (substantially different) programs have names that differ only in alphabetic case may be confusing. The discussion below tries to consistently distinguish between "the shark program" and "the Shark application".

11.2.2. Usage synopsis

? (defun fact (n) (if (zerop n) 1 (* n (fact (1- n)))))
FACT
? (require "CHUD-METERING")
"CHUD-METERING"
("CHUD-METERING")
? (chud:meter (null (fact 10000)))
NIL	      ; since that large number is not NULL
	  

and, a few seconds after the result is returned, a file whose name is of the form "session_nnn.mshark" will open in Shark.app.

The first time that CHUD:METER is used in a lisp session, it'll do a few things to prepare subsequent profiling sessions. Those things include:

  • creating a directory to store files that are related to using the CHUD tools in this lisp session. This directory is created in the user's home directory and has a name of the form:

    profiling-session-<lisp-kernel>-<pid>_<mm>-<dd>-<yyyy>_<h>.<m>.<s>
    	      

    where <pid> is the lisp's process id, <lisp-kernel> is the name of the lisp kernel (of all things ...), and the other values provide a timestamp.

  • doing whatever needs to be done to ensure that currently-defined lisp functions don't move around as the result of GC activity, then writing a text file describing the names and addresses of those functions to the profiling-session directory created above. The naming conventions for and format of that file are described at:

    http://developer.apple.com/documentation/DeveloperTools/Conceptual/SharkUserGuide/MiscellaneousTopics/chapter_951_section_4.html#//apple_ref/doc/uid/TP40005233-CH14-DontLinkElementID_42

  • running the shark program ("/usr/bin/shark") and waiting until it's ready to receive signals that control its operation.

This startup activity typically takes a few seconds; after it's been completed, subsequent use of CHUD:METER doesn't involve that overhead. (See the discussion of :RESET below.)

After any startup activity is complete, CHUD:METER arranges to send a "start profiling" signal to the running shark program, executes the form, sends a "stop profiling" signal to the shark program, and reads its diagnostic output, looking for the name of the ".mshark" file it produces. If it's able to find this filename, it arranges for "Shark.app" to open it.

11.2.3. Profiling "configurations"

By default, a shark profiling session will:

  • use "time based" sampling, to periodically interrupt the lisp process and note the value of the program counter and at least a few levels of call history.

  • do this sampling once every millisecond

  • run for up to 30 seconds, unless told to stop earlier.

This is known as "the default configuration"; it's possible to use items on the "Config" menu in the Shark application to create alternate configurations which provide different kinds of profiling parameters and to save these configurations in files for subsequent reuse. (The set of things that CHUD knows how to monitor is large and interesting.)

You can use alternate profiling configurations (created and "exported" via Shark.app) with CHUD:METER, but the interface is a little awkward.

11.2.4. Reference

CHUD:*SHARK-CONFIG-FILE* [Variable]

When non-null, this should be the pathname of an alternate profiling configuration file created by the "Config Editor" in Shark.app.

CHUD:METER form &key (reset nil) (debug-output nil) [Macro]

Executes FORM (an arbitrary lisp form) and returns whatever result(s) it returns, with CHUD profiling enabled during the form's execution. Tries to determine the name of the session file (*.mshark) to which the shark program wrote profiling data and opens this file in the Shark application.

Arguments:

debug-output

when non-nil, causes output generated by the shark program to be echoed to *TERMINAL-IO*. For debugging.

reset

when non-nil, terminates any running instance of the shark program created by previous invocations of CHUD:METER in this lisp session, generates a new .spatch file (describing the names and addresses of lisp functions), and starts a new instance of the shark program; if CHUD:*SHARK-CONFIG-FILE* is non-NIL when this new instance is started, that instance is told to use the specified config file for profiling (in lieu of the default profiling configuration.)
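Putting the variable and the :RESET argument together, a session that switches to an alternate configuration might look roughly like the sketch below. The pathname "home:my-sampling.cfg" is a hypothetical placeholder for a file exported from Shark.app, FACT is the function defined in the usage synopsis, and the forms are meant to be evaluated one at a time in the listener (the CHUD package doesn't exist until the REQUIRE has run).

```lisp
(require "CHUD-METERING")

;; Hypothetical pathname: substitute the configuration file that was
;; exported from Shark.app.
(setq chud:*shark-config-file* "home:my-sampling.cfg")

;; :RESET T terminates any running shark instance and starts a new one,
;; which is told to use the config file named above.
(chud:meter (null (fact 10000)) :reset t)

;; Later calls without :RESET keep using that same shark instance.
(chud:meter (null (fact 5000)))
```

The awkward part is that changing CHUD:*SHARK-CONFIG-FILE* by itself has no effect on an already-running shark instance; per the description of RESET above, the new setting is only consulted when a new instance is started.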

11.2.5. Acknowledgement

Both Dan Knapp and Hamilton Link have posted similar CHUD interfaces to openmcl-devel in the past; Hamilton's also reported bugs in the spatch mechanism to CHUD developers (and gotten those bugs fixed.)

Chapter 12. The Foreign-Function Interface

12.1. Specifying And Using Foreign Types

12.1.1. Overview

CCL provides a fairly rich language for defining and specifying foreign data types (this language is derived from CMUCL's "alien type" system.)

In practice, most foreign type definitions are introduced into CCL via its interface database (see Section 12.4, The Interface Database), though it's also possible to define foreign types interactively and/or programmatically.

CCL's foreign type system is "evolving" (a polite word for not-quite-complete): there are some inconsistencies involving package usage, for instance. Symbols used in foreign type specifiers should be keywords, but this convention isn't always enforced.

Foreign type, record, and field names are case-sensitive; CCL uses some escaping conventions to allow keywords to be used to denote these names.

12.1.1.1. Type Annotations

As of version 1.2, CCL supports annotating the types of foreign pointers on Mac OS X. Forms that create pointers to foreign memory—that is, MACPTRs—store with the MACPTR object a type annotation that identifies the foreign type of the object pointed to. Calling PRINT-OBJECT on a MACPTR attempts to print information about the identified foreign type, including whether it was allocated on the heap or the stack, and whether it's scheduled for automatic reclamation by the garbage collector.

Support for type annotation is not yet complete. In particular, some uses of PREF and SLOT-VALUE do not yet take type annotations into account, and neither do DESCRIBE and INSPECT.

12.1.1.2. Foreign Types as Classes

Some types of foreign pointers take advantage of the support for type annotations, and pointers of these types can be treated as instances of known classes. Specifically, a pointer to an :<NSR>ect is recognized as an instance of the built-in class NS:NS-RECT, a pointer to an :<NSS>ize is treated as an instance of NS:NS-SIZE, a pointer to an :<NSP>oint is recognized as an instance of NS:NS-POINT, and a pointer to an :<NSR>ange is recognized as an instance of NS:NS-RANGE.

A few more obscure structure types also support this mechanism, and it's possible that a future version will support user definition of similar type mappings.

This support for foreign types as classes provides the following conveniences for each supported type:

  • a PRINT-OBJECT method is defined

  • a foreign type name is created and treated as an alias for the corresponding type. As an example, the name :NS-RECT is a name for the type that corresponds to NS:NS-RECT, and you can use :NS-RECT as a type designator in RLET forms to specify a structure of type NS-RECT.

  • the class is integrated into the type system so that (TYPEP R 'NS:NS-RECT) is implemented with fair efficiency.

  • inlined accessors and SETF inverses are defined for the structure type's fields. In the case of an :<NSR>ect, for example, the fields in question are the fields of the embedded point and size, so that NS:NS-RECT-X, NS:NS-RECT-Y, NS:NS-RECT-WIDTH, NS:NS-RECT-HEIGHT and SETF inverses are defined. The accessors and setter functions typecheck their arguments and the setters handle coercion to the appropriate type of CGFLOAT where applicable.

  • an initialization function is defined; for example,

    (NS:INIT-NS-SIZE s w h)
              

    is roughly equivalent to

    (SETF (NS:NS-SIZE-WIDTH s) w
          (NS:NS-SIZE-HEIGHT s) h)
              

    but might be a little more efficient.

  • a creation function is defined; for example

    (NS:NS-MAKE-POINT x y)
              

    is functionally equivalent to

    (LET ((P (MAKE-GCABLE-RECORD :NS-POINT)))
      (NS:INIT-NS-POINT P X Y)
      p)
              
  • a macro is defined which, like RLET, stack-allocates an instance of the foreign record type, optionally initializes that instance, and executes a body of code with a variable bound to that instance.

    For example,

    (ns:with-ns-range (r loc len)
      (format t "~& range has location ~s, length ~s" 
         (ns:ns-range-location r) (ns:ns-range-length r)))
              

12.1.2. Syntax of Foreign Type Specifiers

  • Some foreign types are built in: keywords denote primitive, built-in types such as the IEEE-double-float type (denoted :DOUBLE-FLOAT), in much the same way as certain symbols (CONS, FIXNUM, etc.) define primitive CL types.

  • Constructors such as :SIGNED and :UNSIGNED can be used to denote signed and unsigned integer subtypes (analogous to the CL type specifiers SIGNED-BYTE and UNSIGNED-BYTE.) :SIGNED is shorthand for (:SIGNED 32) and :UNSIGNED is shorthand for (:UNSIGNED 32).

  • Aliases for other (perhaps more complicated) types can be defined via CCL:DEF-FOREIGN-TYPE (sort of like CL:DEFTYPE or the C typedef facility). The type :CHAR is defined as an alias for (:SIGNED 8) on some platforms, as (:UNSIGNED 8) on others.

  • The construct (:STRUCT name) can be used to refer to a named structure type; (:UNION name) can be used to refer to a named union type. It isn't necessary to enumerate a structure or union type's fields in order to refer to the type.

  • If X is a valid foreign type reference, then (:* X) denotes the foreign type "pointer to X". By convention, (:* T) denotes an anonymous pointer type, vaguely equivalent to "void*" in C.

  • If a fieldlist is a list of lists, each of whose CAR is a foreign field name (keyword) and whose CADR is a foreign type specifier, then (:STRUCT name ,@fieldlist) is a definition of the structure type name, and (:UNION name ,@fieldlist) is a definition of the union type name. Note that it's necessary to define a structure or union type in order to include that type in a structure, union, or array, but only necessary to "refer to" a structure or union type in order to define a type alias or a pointer type.

  • If X is a defined foreign type, then (:array X &rest dims) denotes the foreign type "array of X". Although multiple array dimensions are allowed by the :array constructor, only single-dimensioned arrays are (at all) well-supported in CCL.
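The specifier syntax above can be exercised interactively. The following is a sketch rather than a definitive recipe: the structure name :DEMO-POINT, its fields, and the alias :DEMO-POINT-PTR are hypothetical, and the example assumes that the RLET and PREF macros (not described in this section) accept the structure name and the dotted accessor form shown.

```lisp
;; Define a named structure type with two 32-bit signed fields.
;; Passing NIL as the name means "just put this definition in effect".
(ccl:def-foreign-type nil
  (:struct :demo-point
    (:x (:signed 32))
    (:y (:signed 32))))

;; Define an alias for "pointer to that structure"; only a reference
;; to the structure type is needed here, not its field list.
(ccl:def-foreign-type :demo-point-ptr (:* (:struct :demo-point)))

(defun demo-point-sum ()
  "Stack-allocate a :DEMO-POINT, initialize it, and add its fields."
  (ccl:rlet ((p :demo-point :x 3 :y 4))
    (+ (ccl:pref p :demo-point.x)
       (ccl:pref p :demo-point.y))))
```

Under those assumptions, (DEMO-POINT-SUM) should return 7.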

12.2. Foreign Function Calls

12.2.1. Overview

CCL provides a number of constructs for calling foreign functions from Lisp code (all of them based on the function CCL:%FF-CALL). In many cases, CCL's interface translator provides information about the foreign function's entrypoint name and argument and return types; this enables the use of the #_ reader macro (described below), which may be more concise and/or more readable than other constructs.

CCL also provides a mechanism for defining callbacks: lisp functions which can be called from foreign code.

There's no supported way to directly pass lisp data to foreign functions: scalar lisp data must be coerced to an equivalent foreign representation, and lisp arrays (notably strings) must be copied to non-GCed memory.
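As a rough illustration of that copying step, the sketch below uses CCL's WITH-CSTRS macro (not described in this section) to copy a lisp string into a temporary NUL-terminated foreign buffer, then passes the buffer's address to the C library function strlen via the CCL:EXTERNAL-CALL macro. It assumes that strlen is visible to the running lisp under that name (some older Darwin versions may want a leading underscore) and that the result fits in 32 bits. C-STRLEN is a hypothetical name.

```lisp
(defun c-strlen (string)
  "Return the length of STRING as computed by the C library's strlen()."
  ;; BUF is bound to a MACPTR that addresses a foreign copy of STRING;
  ;; the copy only exists for the duration of the body.
  (ccl:with-cstrs ((buf string))
    (ccl:external-call "strlen"
                       :address buf          ; argument type, then argument
                       :unsigned-fullword))) ; the return type comes last
```

The lisp string itself never leaves the lisp heap; only the address of the temporary copy is handed to foreign code.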

12.2.1.1. Type Designators for Arguments and Return Values

The types of foreign argument and return values in foreign function calls and callbacks can be specified by any of the following keywords:

:UNSIGNED-BYTE

The argument/return value is of type (UNSIGNED-BYTE 8)

:SIGNED-BYTE

The argument/return value is of type (SIGNED-BYTE 8)

:UNSIGNED-HALFWORD

The argument/return value is of type (UNSIGNED-BYTE 16)

:SIGNED-HALFWORD

The argument/return value is of type (SIGNED-BYTE 16)

:UNSIGNED-FULLWORD

The argument/return value is of type (UNSIGNED-BYTE 32)

:SIGNED-FULLWORD

The argument/return value is of type (SIGNED-BYTE 32)

:UNSIGNED-DOUBLEWORD

The argument/return value is of type (UNSIGNED-BYTE 64)

:SIGNED-DOUBLEWORD

The argument/return value is of type (SIGNED-BYTE 64)

:SINGLE-FLOAT

The argument/return value is of type SINGLE-FLOAT

:DOUBLE-FLOAT

The argument/return value is of type DOUBLE-FLOAT

:ADDRESS

The argument/return value is a MACPTR.

:VOID or NIL

Not valid as an argument type specifier; specifies that there is no meaningful return value.

On some platforms, a small positive integer N can also be used as an argument specifier; it indicates that the corresponding argument is a pointer to an N-word structure or union which should be passed by value to the foreign function. Exactly which foreign structures are passed by value and how is very dependent on the Application Binary Interface (ABI) of the platform; unless you're very familiar with ABI details (some of which are quite baroque), it's often easier to let higher-level constructs deal with these details.
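To show how these keywords pair up with values in practice, here is a small sketch using the CCL:EXTERNAL-CALL macro: each argument is preceded by its type designator, and the final keyword names the return type. It assumes that the C library function abs is visible to the running lisp under that name; C-ABS is a hypothetical wrapper.

```lisp
(defun c-abs (n)
  "Call the C library's abs() on N, a (SIGNED-BYTE 32)."
  (ccl:external-call "abs"
                     :signed-fullword n   ; C 'int' argument
                     :signed-fullword))   ; C 'int' return value
```

With that definition, (C-ABS -5) returns 5.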

12.2.1.2. External Entrypoints and Named External Entrypoints

PowerPC machine instructions are always aligned on 32-bit boundaries, so the two least significant bits of the first instruction ("entrypoint") of a foreign function are always 0. CCL often represents an entrypoint address as a fixnum that's binary-equivalent to the entrypoint address: if E is an entrypoint address expressed as a signed 32-bit integer, then (ash E -2) is an equivalent fixnum representation of that address. An entrypoint address can also be encapsulated in a MACPTR (see Section 12.3, Referencing and Using Foreign Memory Addresses), but that's somewhat less efficient.
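The arithmetic behind that fixnum representation is easy to check in portable Common Lisp. The two functions below are hypothetical helpers written for this illustration; they are not part of CCL.

```lisp
(defun entrypoint->fixnum (address)
  "Drop the two always-zero low bits of a 4-byte-aligned ADDRESS."
  (assert (zerop (logand address 3)) (address)
          "Address #x~X is not aligned on a 32-bit boundary." address)
  (ash address -2))

(defun fixnum->entrypoint (n)
  "Recover the original address by restoring the two low zero bits."
  (ash n 2))
```

For example, (ENTRYPOINT->FIXNUM #x1000) is #x400, and (FIXNUM->ENTRYPOINT #x400) is #x1000 again; no information is lost because the discarded bits were known to be zero.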

Although it's possible to use fixnums or macptrs to represent entrypoint addresses, it's somewhat cumbersome to do so. CCL can cache the addresses of named external functions in structure-like objects of type CCL:EXTERNAL-ENTRY-POINT (sometimes abbreviated as EEP). Through the use of LOAD-TIME-VALUE, compiled lisp functions are able to reference EEPs as constants; the use of an indirection allows the CCL runtime system to ensure that the EEP's address is current and correct.

12.2.2. Return Conventions for C Structures

On some platforms, C functions that are defined to return structures do so by reference: they actually accept a first parameter of type "pointer to returned struct/union" - which must be allocated by the caller - and don't return a meaningful value.

Exactly how a C function that's defined to return a foreign structure does so is dependent on the ABI (and on the size and composition of the structure/union in many cases.)

12.3. Referencing and Using Foreign Memory Addresses

12.3.1. Overview

12.3.1.1. Basics

For a variety of technical reasons, it isn't generally possible to directly reference arbitrary absolute addresses (such as those returned by the C library function malloc(), for instance) in CCL. In CCL (and in MCL), such addresses need to be encapsulated in objects of type CCL:MACPTR; one can think of a MACPTR as being a specialized type of structure whose sole purpose is to provide a way of referring to an underlying "raw" address.

It's sometimes convenient to blur the distinction between a MACPTR and the address it represents; it's sometimes necessary to maintain that distinction. It's important to remember that a MACPTR is (generally) a first-class Lisp object in the same sense that a CONS cell is: it'll get GCed when it's no longer possible to reference it. The "lifetime" of a MACPTR doesn't generally have anything to do with the lifetime of the block of memory its address points to.

It might be tempting to ask "How does one obtain the address encapsulated by a MACPTR?". The answer to that question is that one generally doesn't do that: addresses aren't first-class objects, and there's no way to refer to one directly. (The function %PTR-TO-INT, described in the dictionary below, returns the address as an integer when that's really needed.)

Two MACPTRs that encapsulate the same address are EQL to each other.
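A quick way to see this at the listener is sketched below. The address 4096 is arbitrary and is never dereferenced; MACPTR-EQL-DEMO is a hypothetical name.

```lisp
(defun macptr-eql-demo ()
  "Compare two separately created MACPTRs that hold the same address."
  (let ((a (ccl:%int-to-ptr 4096))
        (b (ccl:%int-to-ptr 4096)))
    ;; EQL compares the encapsulated addresses.  EQ compares object
    ;; identity, and A and B are ordinarily two distinct lisp objects.
    (values (eql a b) (eq a b))))
```

The first value is T; the second is ordinarily NIL, which is a reminder to compare MACPTRs with EQL rather than EQ.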

There are a small number of ways to directly create a MACPTR (and there's a fair amount of syntactic sugar built on top of those primitives.) These primitives will be discussed in greater detail below, but they include:

  • Creating a MACPTR with a specified address, usually via the function CCL:%INT-TO-PTR.

  • Referencing the return value of a foreign function call (see Section 12.2, Foreign Function Calls) that's specified to return an address.

  • Referencing a memory location that's specified to contain an address.

All of these primitive MACPTR-creating operations are usually open-coded by the compiler; it has a fairly good notion of what low-level operations "produce" MACPTRs and which operations "consume" the addresses that they encapsulate, and will usually optimize out the introduction of intermediate MACPTRs in a simple expression.

One consequence of the use of MACPTR objects to encapsulate foreign addresses is that (naively) every reference to a foreign address causes a MACPTR to be allocated.

Consider a code fragment like the following:

(defun get-next-event ()
  "get the next event from a hypothetical window system"
  (loop
     (let* ((event (#_get_next_window_system_event))) ; via an FF-CALL
       (unless (null-event-p event)
         (handle-event event)))))
        

As this is written, each call to the (hypothetical) foreign function #_get_next_window_system_event will return a new MACPTR object. Ignoring for the sake of argument the question of whether this code fragment exhibits a good way to poll for external events (it doesn't), it's not hard to imagine that this loop could execute several million times per second (producing several million MACPTRs per second.) Clearly, the "naive" approach is impractical in many cases.

12.3.1.2. Stack allocation of—and destructive operations on—MACPTRs.

If certain conditions held in the environment in which GET-NEXT-EVENT ran—namely, if it was guaranteed that neither NULL-EVENT-P nor HANDLE-EVENT cached or otherwise retained their arguments (the "event" pointer)—there'd be a few alternatives to the naive approach. One of those approaches would be to use the primitive function %SETF-MACPTR (described in greater detail below) to destructively modify a MACPTR (to change the value of the address it encapsulates.) The GET-NEXT-EVENT example could be re-written as:

(defun get-next-event ()
  (let* ((event (%int-to-ptr 0)))     ; create a MACPTR with address 0
    (loop
       (%setf-macptr event (#_get_next_window_system_event)) ; re-use it
       (unless (null-event-p event)
         (handle-event event)))))
        

That version's a bit more realistic: it allocates a single MACPTR outside of the loop, then changes its address to point to the current address of the hypothetical event structure on each loop iteration. If there are a million loop iterations per call to GET-NEXT-EVENT, we're allocating a million times fewer MACPTRs per call; that sounds like a Good Thing.

An Even Better Thing would be to advise the compiler that the initial value (the null MACPTR) bound to the variable event has dynamic extent (that value won't be referenced once control leaves the extent of the binding of that variable.) Common Lisp allows us to make such an assertion via a DYNAMIC-EXTENT declaration; CCL's compiler can recognize the "primitive MACPTR-creating operation" involved and can replace it with an equivalent operation that stack-allocates the MACPTR object. If we're not worried about the cost of allocating that MACPTR on every iteration (the cost is small and there's no hidden GC cost), we could move the binding back inside the loop:

(defun get-next-event ()
  (loop
     (let* ((event (%null-ptr))) ; (%NULL-PTR) is shorthand for (%INT-TO-PTR 0)
       (declare (dynamic-extent event))
       (%setf-macptr event (#_get_next_window_system_event))
       (unless (null-event-p event)
         (handle-event event)))))
        

The idiom of binding one or more variables to stack-allocated MACPTRs, then destructively modifying those MACPTRs before executing a body of code is common enough that CCL provides a macro (WITH-MACPTRS) that handles all of the gory details. The following version of GET-NEXT-EVENT is semantically equivalent to the previous version, but hopefully a bit more concise:

(defun get-next-event ()
  (loop
     (with-macptrs ((event (#_get_next_window_system_event)))
       (unless (null-event-p event)
         (handle-event event)))))
        

12.3.1.3. Stack-allocated memory (and stack-allocated pointers to it.)

Fairly often, the blocks of foreign memory (obtained by malloc or something similar) have well-defined lifetimes (they can safely be freed at some point when it's known that they're no longer needed and it's known that they're no longer referenced.) A common idiom might be:

(with-macptrs ((p (#_allocate_foreign_memory size)))
  (unwind-protect
       (use-foreign-memory p)
    (#_deallocate_foreign_memory p)))
        

That's not unreasonable code, but it's fairly expensive for a number of reasons: foreign function calls are themselves fairly expensive (as is UNWIND-PROTECT), and most library routines for allocating and deallocating foreign memory (things like malloc and free) can be fairly expensive in their own right.

In the idiomatic code above, both the MACPTR P and the block of memory that's being allocated and freed have dynamic extent and are therefore good candidates for stack allocation. CCL provides the %STACK-BLOCK macro, which executes a body of code with one or more variables bound to stack-allocated MACPTRs which encapsulate the addresses of stack-allocated blocks of foreign memory. Using %STACK-BLOCK, the idiomatic code is:

(%stack-block ((p size))
              (use-foreign-memory p))
        

which is a bit more efficient and a bit more concise than the version presented earlier.
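For a version that can actually be run, the sketch below stack-allocates a four-byte block, fills it in, and reads it back using the byte accessors from the dictionary later in this section. SUM-STACK-BYTES is a hypothetical name.

```lisp
(defun sum-stack-bytes ()
  "Fill a stack-allocated 4-byte block with 10, 20, 30, 40 and sum it."
  (ccl:%stack-block ((p 4))
    ;; P is a stack-allocated MACPTR addressing 4 bytes of
    ;; stack-allocated foreign memory.
    (dotimes (i 4)
      (setf (ccl:%get-unsigned-byte p i) (* 10 (1+ i))))
    ;; Only the integer sum leaves the body; P and its memory must not
    ;; be referenced after %STACK-BLOCK exits.
    (loop for i below 4
          sum (ccl:%get-unsigned-byte p i))))
```

(SUM-STACK-BYTES) returns 100, and nothing has to be explicitly freed.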

%STACK-BLOCK is used as the basis for slightly higher-level things like RLET.

12.3.1.4. Caveats

Reading from, writing to, allocating, and freeing foreign memory are all potentially dangerous operations; this is no less true when these operations are performed in CCL than when they're done in C or some other lower-level language. In addition, destructive operations on Lisp objects can be dangerous, as can stack allocation if it's abused (if DYNAMIC-EXTENT declarations are violated.) Correct use of the constructs and primitives described here is reliable and safe; slightly incorrect use of these constructs and primitives can crash CCL.

12.3.2. Foreign-Memory-Addresses Dictionary

Unless otherwise noted, all of the symbols mentioned below are exported from the CCL package.

12.3.2.1. Scalar memory reference

Syntax

%get-signed-byte ptr &optional (offset 0)

%get-unsigned-byte ptr &optional (offset 0)

%get-signed-word ptr &optional (offset 0)

%get-unsigned-word ptr &optional (offset 0)

%get-signed-long ptr &optional (offset 0)

%get-unsigned-long ptr &optional (offset 0)

%%get-signed-longlong ptr &optional (offset 0)

%%get-unsigned-longlong ptr &optional (offset 0)

%get-ptr ptr &optional (offset 0)

%get-single-float ptr &optional (offset 0)

%get-double-float ptr &optional (offset 0)

Description

References and returns the signed or unsigned 8-bit byte, signed or unsigned 16-bit word, signed or unsigned 32-bit long word, signed or unsigned 64-bit long long word, 32-bit address, 32-bit single-float, or 64-bit double-float at the effective byte address formed by adding offset to the address encapsulated by ptr.

Arguments

ptr

A MACPTR

offset

A fixnum

All of the memory reference primitives described above can be used with SETF.
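The following sketch shows a few of these accessors used both as readers and, via SETF, as writers. It stores the byte value 255 and reads it back as both an unsigned and a signed byte, then round-trips a double-float at byte offset 8. SCALAR-REFERENCE-DEMO is a hypothetical name.

```lisp
(defun scalar-reference-demo ()
  "Write and read back a byte and a double-float in a 16-byte block."
  (ccl:%stack-block ((p 16))
    (setf (ccl:%get-unsigned-byte p 0) 255)   ; all eight bits set
    (setf (ccl:%get-double-float p 8) 1.5d0)  ; bytes 8 through 15
    (list (ccl:%get-unsigned-byte p 0)        ; same bits, read as unsigned
          (ccl:%get-signed-byte p 0)          ; same bits, read as signed
          (ccl:%get-double-float p 8))))
```

The result is (255 -1 1.5d0): the accessor chosen, not the memory itself, determines how the bits are interpreted.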

12.3.2.2. %get-bit [Function]

Syntax

%get-bit ptr bit-offset

Description

References and returns the bit-offsetth bit at the address encapsulated by ptr. (Bit 0 at a given address is the most significant bit of the byte at that address.) Can be used with SETF.

Arguments

ptr

A MACPTR

bit-offset

A fixnum
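The bit numbering can be checked with a one-byte block. The sketch below stores #x80 (only the most significant bit set) and then asks for bits 0 and 7; BIT-NUMBERING-DEMO is a hypothetical name.

```lisp
(defun bit-numbering-demo ()
  "Return the values of bit 0 and bit 7 of a byte containing #x80."
  (ccl:%stack-block ((p 1))
    (setf (ccl:%get-unsigned-byte p 0) #x80)
    (list (ccl:%get-bit p 0)     ; most significant bit of the byte
          (ccl:%get-bit p 7))))  ; least significant bit of the byte
```

Given the numbering described above, this should return (1 0).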

12.3.2.3. %get-bitfield [Function]

Syntax

%get-bitfield ptr bit-offset width

Description

References and returns an unsigned integer composed from the width bits found bit-offset bits from the address encapsulated by ptr. (The least significant bit of the result is the value of (%get-bit ptr (1- (+ bit-offset width))).) Can be used with SETF.

Arguments

 

ptr

A MACPTR

bit-offset

A fixnum

width

A positive fixnum
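The bit numbering used by %get-bit and %get-bitfield can be modeled in C as follows. This is a sketch, not CCL's implementation: the function names are hypothetical, and it assumes that bit offsets of 8 or more continue into the following bytes, each counted from its most significant bit.

```c
#include <assert.h>
#include <stddef.h>

/* Model of %get-bit: bit 0 is the most significant bit of the byte at ptr. */
int get_bit(const unsigned char *ptr, size_t bit_offset)
{
    assert(ptr != NULL);
    unsigned char byte = ptr[bit_offset / 8];
    unsigned shift = 7u - (unsigned)(bit_offset % 8);
    return (byte >> shift) & 1;
}

/* Model of %get-bitfield: gather width bits starting at bit_offset.
   The last bit gathered becomes the least significant bit of the result. */
unsigned long get_bitfield(const unsigned char *ptr, size_t bit_offset,
                           unsigned width)
{
    unsigned long result = 0;
    assert(ptr != NULL && width > 0 && width <= 32);
    for (unsigned i = 0; i < width; i++)
        result = (result << 1) | (unsigned long)get_bit(ptr, bit_offset + i);
    return result;
}
```

With the two bytes 0xA5 0xC3 in memory, get_bitfield(buf, 0, 4) is 0xA and get_bitfield(buf, 4, 8) is 0x5C, since the second field straddles the byte boundary.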

12.3.2.4. %int-to-ptr [Function]

Syntax

%int-to-ptr int

Description

Creates and returns a MACPTR whose address matches int.

Arguments

 

int

An (unsigned-byte 32) on 32-bit platforms; an (unsigned-byte 64) on 64-bit platforms

12.3.2.5. %inc-ptr [Function]

Syntax

%inc-ptr ptr &optional (delta 1)

Description

Creates and returns a MACPTR whose address is the address of ptr plus delta. The idiom (%inc-ptr ptr 0) is sometimes used to copy a MACPTR, e.g., to create a new MACPTR encapsulating the same address as ptr.

Arguments

 

ptr

A MACPTR

delta

A fixnum

12.3.2.6. %ptr-to-int [Function]

Syntax

%ptr-to-int ptr

Description

Returns the address encapsulated by ptr, as an (unsigned-byte 32) on 32-bit platforms or an (unsigned-byte 64) on 64-bit platforms.

Arguments

 

ptr

A MACPTR

12.3.2.7. %null-ptr [Macro]

Syntax

%null-ptr

Description

Equivalent to (%int-to-ptr 0).

12.3.2.8. %null-ptr-p [Function]

Syntax

%null-ptr-p ptr

Description

Returns T if ptr is a MACPTR encapsulating the address 0, NIL if ptr encapsulates some other address.

Arguments

 

ptr

A MACPTR

12.3.2.9. %setf-macptr [Function]

Syntax

%setf-macptr dest-ptr src-ptr

Description

Causes dest-ptr to encapsulate the same address that src-ptr does, then returns dest-ptr.

Arguments

 

dest-ptr

A MACPTR

src-ptr

A MACPTR

12.3.2.10. %incf-ptr [Macro]

Syntax

%incf-ptr ptr &optional (delta 1)

Description

Destructively modifies ptr, by adding delta to the address it encapsulates. Returns ptr.

Arguments

 

ptr

A MACPTR

delta

A fixnum
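The difference between the functions that create a new MACPTR (%int-to-ptr, %inc-ptr) and those that modify an existing one (%setf-macptr, %incf-ptr) may be easier to see in a small C model. The sketch below treats a MACPTR as a mutable box holding an address; the type and function names are hypothetical and are not how CCL represents MACPTRs internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a MACPTR: a mutable box holding a foreign address. */
typedef struct { uintptr_t address; } macptr;

/* Like %int-to-ptr and %ptr-to-int. */
macptr int_to_ptr(uintptr_t address) { macptr p = { address }; return p; }
uintptr_t ptr_to_int(macptr p) { return p.address; }

/* Like %null-ptr-p. */
int null_ptr_p(macptr p) { return p.address == 0; }

/* Like %inc-ptr: returns a fresh box; the argument is left unchanged,
   which is why (%inc-ptr ptr 0) yields a copy. */
macptr inc_ptr(macptr p, ptrdiff_t delta)
{
    macptr result = { p.address + (uintptr_t)delta };
    return result;
}

/* Like %incf-ptr: modifies the box in place and returns it. */
macptr *incf_ptr(macptr *p, ptrdiff_t delta)
{
    assert(p != NULL);
    p->address += (uintptr_t)delta;
    return p;
}

/* Like %setf-macptr: dest now holds the same address as src. */
macptr *setf_macptr(macptr *dest, const macptr *src)
{
    assert(dest != NULL && src != NULL);
    dest->address = src->address;
    return dest;
}
```

In this model, code that still holds the original box sees the change made by incf_ptr or setf_macptr, but is unaffected by inc_ptr.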

12.3.2.11. with-macptrs [Macro]

Syntax

with-macptrs (var expr)* &body body

Description

Executes body in an environment in which each var is bound to a stack-allocated macptr which encapsulates the foreign address yielded by the corresponding expr. Returns whatever value(s) body returns.

Arguments

 

var

A symbol (variable name)

expr

A MACPTR-valued expression

12.3.2.12. %stack-block [Macro]

Syntax

%stack-block (var expr)* &body body

Description

Executes body in an environment in which each var is bound to a stack-allocated macptr which encapsulates the address of a stack-allocated region of size expr bytes. Returns whatever value(s) body returns.

Arguments

 

var

A symbol (variable name)

expr

An expression which should evaluate to a non-negative fixnum

12.3.2.13. make-cstring [Function]

Syntax

make-cstring string

Description

Allocates a block of memory (via malloc) of length (1+ (length string)). Copies the string to this block and appends a trailing NUL byte; returns a MACPTR to the block.

Arguments

 

string

A lisp string
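The allocation that make-cstring performs corresponds roughly to the following C. This is a sketch under the assumption that each character occupies one byte; the function name and signature are chosen for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy length bytes from chars into a fresh malloc'ed block and
   append a trailing NUL. Returns NULL if malloc fails. */
char *make_cstring(const char *chars, size_t length)
{
    char *block;
    assert(chars != NULL || length == 0);
    block = malloc(length + 1);      /* one extra byte for the NUL */
    if (block == NULL)
        return NULL;
    if (length > 0)
        memcpy(block, chars, length);
    block[length] = '\0';
    return block;
}
```

Because the block comes from malloc, it is not reclaimed automatically; whoever receives the pointer is responsible for freeing it.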

12.3.2.14. with-cstrs [Macro]

Syntax

with-cstrs (var string)* &body body

Description

Executes body in an environment in which each var is bound to a stack-allocated macptr which encapsulates the address of a stack-allocated region into which the corresponding string (and a trailing NUL byte) has been copied. Returns whatever value(s) body returns.

Arguments

 

var

A symbol (variable name)

string

An expression which should evaluate to a lisp string

12.3.2.15. with-encoded-cstrs [Macro]

Syntax

with-encoded-cstrs ENCODING-NAME (varI stringI)* &body body

Description

Executes body in an environment in which each varI is bound to a macptr which encapsulates the address of a stack-allocated region into which the corresponding stringI (and a trailing NUL character) has been copied. Returns whatever value(s) body returns.

ENCODING-NAME is a keyword constant that names a character encoding. Each foreign string is encoded in the named encoding. Each foreign string has dynamic extent.

WITH-ENCODED-CSTRS does not automatically prepend byte-order marks to its output; the size of the terminating #\NUL character depends on the number of octets per code unit in the encoding.

The expression

(ccl:with-cstrs ((x "x")) (#_puts x))

is functionally equivalent to

(ccl:with-encoded-cstrs :iso-8859-1 ((x "x")) (#_puts x))
Arguments

 

varI

A symbol (variable name)

stringI

An expression which should evaluate to a lisp string
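The remark about the size of the terminating NUL can be illustrated in C. The sketch below is not CCL code: both function names are hypothetical, and the second function handles ASCII-only input and writes UTF-16 code units in the platform's native byte order, without a byte-order mark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Octets needed for n_units code units plus one terminating code unit. */
size_t encoded_cstring_octets(size_t n_units, size_t octets_per_unit)
{
    assert(octets_per_unit == 1 || octets_per_unit == 2 ||
           octets_per_unit == 4);
    return (n_units + 1) * octets_per_unit;
}

/* Encode an ASCII string as UTF-16 code units, followed by a 16-bit NUL.
   Returns the number of octets written, terminator included. */
size_t ascii_to_utf16(const char *src, uint16_t *dst, size_t dst_units)
{
    size_t i = 0;
    assert(src != NULL && dst != NULL && dst_units > 0);
    while (src[i] != '\0' && i + 1 < dst_units) {
        dst[i] = (uint16_t)(unsigned char)src[i];
        i++;
    }
    dst[i] = 0;                      /* terminator is a whole code unit */
    return (i + 1) * sizeof dst[0];
}
```

For the one-character string "x", a single-octet encoding needs 2 octets, while a two-octet encoding needs 4.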

12.3.2.16. %get-cstring [Function]

Syntax

%get-cstring ptr

Description

Interprets ptr as a pointer to a (NUL-terminated) C string; returns an equivalent lisp string.

Arguments

ptr

A MACPTR

12.3.2.17. %str-from-ptr [Function]

Syntax

%str-from-ptr ptr length

Description

Returns a lisp string of length length, whose contents are initialized from the bytes at ptr.

Arguments
ptr

A MACPTR

length

A non-negative fixnum

12.4. The Interface Database

12.4.1. Overview

CCL uses a set of database files which contain foreign type, record, constant, and function definitions derived from the operating system's header files, be that Linux or Darwin. An archive containing these database files (and the shell scripts which were used in their creation) is available; see the Distributions page for information about obtaining current interface database files.

Not surprisingly, different platforms use different database files.

CCL defines reader macros that consult these databases:

  • #$foo looks up the value of the constant definition of foo

  • #_foo looks up the foreign function definition for foo

In both cases, the symbol foo is interned in the "OS" package. The #$ reader macro has the side effect of defining foo as a constant (as if via DEFCONSTANT); the #_ reader macro has the side effect of defining foo as a macro which will expand into an EXTERNAL-CALL form.

It's important to remember that the side-effect happens when the form containing the reader macro is read. Macroexpansion functions that expand into forms which contain instances of those reader macros don't do what one might think that they do, unless the macros are expanded in the same lisp session as the reader macro was read in.

In addition, references to foreign type, structure/union, and field names (when used in the RREF/PREF and RLET macros) will cause these database files to be consulted.

Since the CCL sources contain instances of these reader macros (and references to foreign record types and fields), compiling CCL from those sources depends on the ability to find and use these database files (see Section 3.5, “Building the heap image”).

12.4.2. Other issues:

  • CCL now preserves the case of external symbols in its database files. See Case-sensitivity of foreign names in CCL for information about case in foreign symbol names.

  • The Linux databases are derived from a somewhat arbitrary set of Linux header files. Linux is enough of a moving target that it may be difficult to define a standard, reference set of interfaces from which to derive a standard, reference set of database files. This seems to be less of an issue with Darwin and FreeBSD.

For information about building the database files, see Section 12.7, “The Interface Translator”.

12.5. Using Interface Directories

12.5.1. Overview

As distributed, the "ccl:headers;" (for LinuxPPC) directory is organized like:

        headers/
        headers/gl/
        headers/gl/C/
        headers/gl/C/populate.sh
        headers/gl/constants.cdb
        headers/gl/functions.cdb
        headers/gl/records.cdb
        headers/gl/objc-classes.cdb
        headers/gl/objc-methods.cdb
        headers/gl/types.cdb
        headers/gnome/
        headers/gnome/C/
        headers/gnome/C/populate.sh
        headers/gnome/constants.cdb
        headers/gnome/functions.cdb
        headers/gnome/records.cdb
        headers/gnome/objc-classes.cdb
        headers/gnome/objc-methods.cdb
        headers/gnome/types.cdb
        headers/gtk/
        headers/gtk/C/
        headers/gtk/C/populate.sh
        headers/gtk/constants.cdb
        headers/gtk/functions.cdb
        headers/gtk/records.cdb
        headers/gtk/objc-classes.cdb
        headers/gtk/objc-methods.cdb
        headers/gtk/types.cdb
        headers/libc/
        headers/libc/C/
        headers/libc/C/populate.sh
        headers/libc/constants.cdb
        headers/libc/functions.cdb
        headers/libc/records.cdb
        headers/libc/objc-classes.cdb
        headers/libc/objc-methods.cdb
        headers/libc/types.cdb
      

i.e., as a set of parallel subdirectories, each with a lowercase name, each of which contains a set of 6 database files and a "C" subdirectory which contains a shell script used in the database creation process.

As one might assume, the database files in each of these subdirectories contain foreign type, constant, and function definitions (as well as Objective-C class and method info) that correspond roughly to the information contained in the header files associated with a "-dev" package in a Linux distribution. "libc" corresponds pretty closely to the interfaces associated with "glibc/libc6" header files, "gl" corresponds to an "openGL+GLUT" development package, and "gtk" and "gnome" contain interface information from the GTK+1.2 and GNOME libraries, respectively.

For Darwin, the "ccl:darwin-headers" directory contains a "libc" subdirectory, whose contents roughly correspond to those of "/usr/include" under Darwin, as well as subdirectories corresponding to the MacOSX Carbon and Cocoa frameworks.

To see the precise set of .h files used to generate the database files in a given interface directory, consult the corresponding "populate.sh" shell script (in the interface directory's "C" subdirectory.)

The intent is that this initial set can be augmented to meet local needs, and that this can be done in a fairly incremental fashion: one needn't have unrelated header files installed in order to generate interface databases for a package of interest.

Hopefully, this scheme will also make it easier to distribute patches and bug fixes.

CCL maintains a list of directories; when looking for a foreign type, constant, function, or record definition, it'll consult the database files in each directory on that list. Initially, the list contains an entry for the "libc" interface directory. CCL needs to be explicitly told to look in other interface directories should it need to do so.

12.5.2. Creating new interface directories

This example refers to "ccl:headers;", which is appropriate for LinuxPPC. The procedure's analogous under Darwin, where the "ccl:darwin-headers;" directory would be used instead.

To create a new interface directory, "foo", and a set of database files in that directory:

  1. Create a subdirectory of "ccl:headers;" named "foo".

  2. Create a subdirectory of "ccl:headers;foo;" named "C".

  3. Create a file in "ccl:headers;foo;C;" named "populate.sh".

    One way of accomplishing the above steps is:

                ? (close (open "ccl:headers;foo;C;populate.sh" :direction :output
                               :if-does-not-exist :create :if-exists :overwrite))
              
  4. Edit the file created above, using the "populate.sh" files in the distribution as guidelines.

    The file might wind up looking something like:

    #!/bin/sh
    h-to-ffi.sh `foo-config -cflags` /usr/include/foo/foo.h

Refer to Section 12.7, “The Interface Translator” for information about running the interface translator and .ffi parser.

Assuming that all went well, there should now be .cdb files in "ccl:headers;foo;". You can then do

          ? (use-interface-dir :foo)
	    

whenever you need to access the foreign type information in those database files.

12.6. Using Shared Libraries

12.6.1. Overview

CCL provides facilities to open and close shared libraries.

"Opening" a shared library, which is done with open-shared-library, maps the library's code and data into CCL's address space and makes its exported symbols accessible to CCL.

"Closing" a shared library, which is done with close-shared-library, unmaps the library's code and data and removes the library's symbols from the global namespace.

A small number of shared libraries (including libc, libm, libdl under Linux, and the "system" library under Darwin) are opened by the lisp kernel and can't be closed.

CCL uses data structures of type EXTERNAL-ENTRY-POINT to map a foreign function name (string) to that foreign function's current address. (A function's address may vary from session to session as different versions of shared libraries may load at different addresses; it may vary within a session for similar reasons.)

An EXTERNAL-ENTRY-POINT whose address is known is said to be resolved. When an external entry point is resolved, the shared library which defines that entry point is noted; when a shared library is closed, the entry points that it defines are made unresolved. An EXTERNAL-ENTRY-POINT must be in the resolved state in order to be FF-CALLed; calling an unresolved entry point causes a "last chance" attempt to resolve it. Attempting to resolve an entrypoint that was defined in a closed library will cause an attempt to reopen that library.

CCL keeps track of all libraries that have been opened in a lisp session. When a saved application is first started, an attempt is made to reopen all libraries that were open when the image was saved, and an attempt is made to resolve all entry points that had been referenced when the image was saved. Either of these attempts can fail "quietly", leaving some entry points in an unresolved state.

Linux shared libraries can be referred to either by a string which describes their full pathname or by their soname, a shorter string that can be defined when the library is created. The dynamic linker mechanisms used in Linux make it possible (through a series of filesystem links and other means) to refer to a library via several names; the library's soname is often the most appropriate identifier.

sonames are often less version-specific than other names for libraries; a program that refers to a library by the name "libc.so.6" is more portable than one which refers to "libc-2.1.3.so" or to "libc-2.2.3.so", even though the latter two names might each be platform-specific aliases of the first.

All of the global symbols described below are exported from the CCL package.

12.6.2. Limitations and known bugs

  • Don't get me started.

  • The underlying functionality has a poor notion of dependency; it's not always possible to open libraries that depend on unopened libraries, but it's possible to close libraries on which other libraries depend. It may be possible to generate more explicit dependency information by parsing the output of the Linux ldd and ldconfig programs.

12.6.3. Darwin Notes

Darwin shared libraries come in two (basic) flavors:

  • "dylibs" (which often have the extension ".dylib") are primarily intended to be linked against at compile/link time. They can be loaded dynamically, but can't be unloaded. Accordingly, OPEN-SHARED-LIBRARY can be used to open a .dylib-style library; calling CLOSE-SHARED-LIBRARY on the result of such a call produces a warning, and has no other effect. It appears that (due to an OS bug) attempts to open .dylib shared libraries that are already open can cause memory corruption unless the full pathname of the .dylib file is specified on the first and all subsequent calls.

  • "bundles" are intended to serve as application extensions; they can be opened multiple times (creating multiple instances of the library!) and closed properly.

Thanks to Michael Klingbeil for getting both kinds of Darwin shared libraries working in CCL.

12.7. The Interface Translator

12.7.1. Overview

CCL uses an interface translation system based on the FFIGEN system, which is described at this page. The interface translator makes the constant, type, structure, and function definitions in a set of C-language header files available to lisp code.

The basic idea of the FFIGEN scheme is to use the C compiler's frontend and parser to translate .h files into semantically equivalent .ffi files, which represent the definitions from the headers using a syntax based on S-expressions. Lisp code can then concentrate on the .ffi representation, without having to concern itself with the semantics of header file inclusion or the arcana of C parsing.

The original FFIGEN system used a modified version of the LCC C compiler to produce .ffi files. Since many OS header files contain GCC-specific constructs, CCL's translation system uses a modified version of GCC (called, somewhat confusingly, ffigen.)

See here for information on building and installing ffigen.

A component shell script called h-to-ffi.sh reads a specified .h file (and optional preprocessor arguments) and writes a (hopefully) equivalent .ffi file to standard output, calling the ffigen program with appropriate arguments.

For each interface directory subdir distributed with CCL (see Section 12.5, “Using Interface Directories”), a shell script (distributed with CCL as "ccl:headers;subdir;C;populate.sh", or under some other platform-specific headers directory) calls h-to-ffi.sh on a large number of the header files in /usr/include (or some other system header path) and creates a parallel directory tree in "ccl:headers;subdir;C;system;header;path;" (or "ccl:darwin-headers;subdir;C;system;header;path;", etc.), populating that directory with .ffi files.

A lisp function defined in "ccl:library;parse-ffi.lisp" reads the .ffi files in a specified interface directory subdir and generates new versions of the databases (files with the extension .cdb).

The CDB databases are used by the #$ and #_ reader macros and are used in the expansion of RREF, RLET, and related macros.

12.7.2. Details: rebuilding the CDB databases, step by step

  1. Ensure that the FFIGEN program is installed. See the "README" file generated during the FFIGEN build process for specific installation instructions. This example assumes LinuxPPC; for other platforms, substitute the appropriate headers directory.

  2. Edit the "ccl:headers;subdir;C;populate.sh" shell script. When you're confident that the files and preprocessor options match your environment, cd to the "ccl:headers;subdir;C;" directory and invoke ./populate.sh. Repeat this step until you're able to cleanly translate all files referenced in the shell script.

  3. Run CCL:

                  ? (require "PARSE-FFI")
                  PARSE-FFI
    
                  ? (ccl::parse-standard-ffi-files :SUBDIR)
                  ;;; lots of output ... after a while, shiny new .cdb files should
                  ;;; appear in "ccl:headers;subdir;"
              

    It may be necessary to call CCL::PARSE-STANDARD-FFI-FILES twice, to ensure that forward references are resolved.

12.8. Case-sensitivity of foreign names in CCL

12.8.1. Overview

As of release 0.11, CCL addresses the fact that foreign type, constant, record, field, and function names are case-sensitive and provides mechanisms to refer to these names via lisp symbols.

Previous versions of CCL have tried to ignore that fact, under the belief that case conflicts were rare and that many users (and implementors) would prefer not to deal with case-related issues. The fact that some information in the interface databases was incomplete or inaccessible because of this policy made it clearer that the policy was untenable. I can't claim that the approach described here is aesthetically pleasing, but I can honestly say that it's less unpleasant than other approaches that I'd thought of. I'd be interested to hear alternate proposals.

The issues described here have to do with how lisp symbols are used to denote foreign functions, constants, types, records, and fields. It doesn't affect how other lisp objects are sometimes used to denote foreign objects. For instance, the first argument to the EXTERNAL-CALL macros is now and has always been a case-sensitive string.

12.8.2. Foreign constant and function names

The primary way of referring to foreign constant and function names in CCL is via the #$ and #_ reader macros. These reader macro functions each read a symbol into the "OS" package, look up its constant or function definition in the interface database, and assign the value of the constant to the symbol or install a macroexpansion function on the symbol.

In order to observe case-sensitivity, the reader-macros now read the symbol with (READTABLE-CASE :PRESERVE) in effect.

This means that it's necessary to type the foreign constant or function name in correct case, but it isn't necessary to use any special escaping constructs when writing the variable name. For instance:

        (#_read fd buf n) ; refers to foreign symbol "read"
        (#_READ fd buf n) ; refers to foreign symbol "READ", which may
        ; not exist ...
        #$o_rdonly ; Probably doesn't exist
        #$O_RDONLY ; Exists on most platforms
      

12.8.3. Foreign type, record, and field names

Constructs like RLET expect a foreign type or record name to be denoted by a symbol (typically a keyword); RREF (and PREF) expect an "accessor" form, typically a keyword formed by concatenating a foreign type or record name with a sequence of one or more foreign field names, separated by dots. These names are interned by the reader as other lisp symbols are, with an arbitrary value of READTABLE-CASE in effect (typically :UPCASE.) It seems like it would be very tedious to force users to manually escape (via vertical bar or backslash syntax) all lowercase characters in symbols used to specify foreign type, record, and field names (especially given that many traditional POSIX structure, type, and field names are entirely lowercase.)

The approach taken by CCL is to allow the symbols (keywords) used to denote foreign type, record, and field names to contain angle brackets (< and >). Such symbols are translated to foreign names via the following set of conventions:

  • All instances of < and > in the symbol's pname are balanced and don't nest.

  • Any alphabetic characters in the symbol's pname that aren't enclosed in angle brackets are treated as lower-case, regardless of the value of READTABLE-CASE and regardless of the case in which they were written.

  • Alphabetic characters that appear within angle brackets are mapped to upper-case, again regardless of how they were written or interned.
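The conventions above, applied in the direction from a symbol's pname to a foreign name, can be sketched in C. This is an illustration of the rules only, not CCL's own code; the function name is hypothetical, and only ASCII letters are considered.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Translate a symbol pname such as "<W>INDOW" into a foreign name such
   as "Window". Returns 0 on success, or -1 if the angle brackets are
   unbalanced or nested, or if out is too small. */
int foreign_name_from_pname(const char *pname, char *out, size_t out_size)
{
    size_t n = 0;
    int in_brackets = 0;
    assert(pname != NULL && out != NULL);
    for (; *pname != '\0'; pname++) {
        unsigned char c = (unsigned char)*pname;
        if (c == '<') {
            if (in_brackets)
                return -1;           /* brackets must not nest */
            in_brackets = 1;
            continue;
        }
        if (c == '>') {
            if (!in_brackets)
                return -1;           /* closing bracket without an opener */
            in_brackets = 0;
            continue;
        }
        if (n + 1 >= out_size)
            return -1;               /* leave room for the trailing NUL */
        /* Inside brackets: upper-case. Outside: lower-case. */
        out[n++] = (char)(in_brackets ? toupper(c) : tolower(c));
    }
    if (in_brackets || n >= out_size)
        return -1;
    out[n] = '\0';
    return 0;
}
```

Under these rules the pnames "<W>INDOW" and "<w>indow" both yield "Window", while "WINDOW" yields "window".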

There may be many ways of "escaping" (with angle brackets) sequences of upper-case and non-lower-case characters in a symbol used to denote a foreign name. When translating in the other direction, CCL always escapes the longest sequence that starts with an upper-case character and doesn't contain a lower-case character.

It's often preferable to use this canonical form of a foreign type name.

The accessor forms used by PREF/RREF should be viewed as a series of foreign type/record and field names; upper-case sequences in the component names should be escaped with angle brackets, but those sequences shouldn't span components. (More simply, the separating dots shouldn't be enclosed, even if both surrounding characters need to be.)

Older POSIX code tends to use lower-case exclusively for type, record, and field names; there are only a few cases in the CCL sources where mixed-case names need to be escaped.

12.8.4. Examples

        ;;; Allocate a record of type "window".
        (rlet ((w :window)) ...)
        ;;; Allocate a record of type "Window", which is probably a
        ;;;  different type
        (rlet ((w :<w>indow)) ...)
        ;;; This is equivalent to the last example
        (rlet ((w :<w>INDOW)))
      

12.9. Reading Foreign Names

CCL provides several reader macros to make it more convenient to handle foreign type, function, variable, and constant names. Each of these reader macros reads symbols preserving the case of the source text, and selects an appropriate package in which to intern the resulting symbol. These reader macros are especially useful when your Lisp code interacts extensively with a foreign library—for example, when using Mac OS X's Cocoa frameworks.

These reader macros include "#_" to read foreign function names, "#&" to read foreign variable names (note that in earlier versions of OpenMCL the reader macro "#?" was used for this same purpose), "#$" to read foreign constant names, "#/" to read the names of foreign Objective-C methods, and "#>" to read keywords that can be used as the names of types, records, and accessors.

All of these reader macros preserve the case of the text that they read; beyond that similarity, each performs some additional work, unique to each reader macro, to create symbols suitable for a particular use. For example, the function, variable, and constant reader macros intern the resulting symbol in the "OS" package of the running platform, but the reader macro for Objective-C method names interns symbols in the "NEXTSTEP-FUNCTIONS" package.

You are likely to see these reader macros used extensively in Lisp code that works with foreign libraries; for example, CCL IDE code, which defines numerous Objective-C classes and methods, uses these reader macros extensively.

For more detailed descriptions of each of these reader macros, see the Foreign-Function-Interface Dictionary section.

12.10. Tutorial: Using Basic Calls and Types

This tutorial is meant to cover the basics of CCL for calling external C functions and passing data back and forth. These basics will provide the foundation for more advanced techniques which will allow access to the various external libraries and toolkits.

The first step is to build a simple C dynamic library, so that we can observe what actually passes between CCL and C. So, some C code is in order:

Create the file typetest.c, and put the following code into it:

#include <stdio.h>

void
void_void_test(void)
{
    printf("Entered %s:\n", __FUNCTION__);
    printf("Exited  %s:\n", __FUNCTION__);
    fflush(stdout);
}

signed char
sc_sc_test(signed char data)
{
    printf("Entered %s:\n", __FUNCTION__);
    printf("Data In: %d\n", (signed int)data);
    printf("Exited  %s:\n", __FUNCTION__);
    fflush(stdout);
    return data;
}

unsigned char
uc_uc_test(unsigned char data)
{
    printf("Entered %s:\n", __FUNCTION__);
    printf("Data In: %d\n", (signed int)data);
    printf("Exited  %s:\n", __FUNCTION__);
    fflush(stdout);
    return data;
}
    

This defines three functions. If you're familiar with C, notice that there's no main(), because we're just building a library, not an executable.

The function void_void_test() doesn't take any parameters, and doesn't return anything, but it prints two lines to let us know it was called. sc_sc_test() takes a signed char as a parameter, prints it, and returns it. uc_uc_test() does the same thing, but with an unsigned char. Their purpose is just to prove to us that we really can call C functions, pass them values, and get values back from them.

This code is compiled into a dynamic library on OS X 10.3.4 with the command:


      gcc -dynamiclib -Wall -o libtypetest.dylib typetest.c \
      -install_name ./libtypetest.dylib
    

Tip

Users of 64-bit platforms may need to pass options such as "-m64" to gcc, may need to give the output library a different extension (such as ".so"), and may need to use slightly different values for other options in order to create an equivalent test library.

The -dynamiclib tells gcc that we will be compiling this into a dynamic library and not an executable binary program. The output filename is "libtypetest.dylib". Notice that we chose a name which follows the normal OS X convention, being in the form "libXXXXX.dylib", so that other programs can link to the library. CCL doesn't need it to be this way, but it is a good idea to adhere to existing conventions.

The -install_name flag is primarily used when building OS X "bundles". In this case, we are not using it, so we put a placeholder into it, "./libtypetest.dylib". If we wanted to use typetest in a bundle, the -install_name argument would be a relative path from some "current" directory.

After creating this library, the first step is to tell CCL to open the dynamic library. This is done by calling open-shared-library.


      Welcome to CCL Version (Beta: Darwin) 0.14.2-040506!

      ? (open-shared-library "/Users/andewl/openmcl/libtypetest.dylib")
      #<SHLIB /Users/andewl/openmcl/libtypetest.dylib #x638EF3E>
    

You should use an absolute path here; using a relative one, such as just "libtypetest.dylib", would appear to work, but there are subtle problems which occur after reloading it. See the Darwin notes in Section 12.6.3 for details. It would be a bad idea anyway, because software should never rely on its starting directory being anything in particular.

This command returns a reference to the opened shared library, and CCL also adds one to the global variable ccl::*shared-libraries*:


      ? ccl::*shared-libraries*
      (#<SHLIB /Users/andewl/openmcl/libtypetest.dylib #x638EF3E>
       #<SHLIB /usr/lib/libSystem.B.dylib #x606179E>)
    

Before we call anything, let's check that the individual functions can actually be found by the system. We don't have to do this, but when something goes wrong, it helps to know how to find out whether this is the problem. We use external:


      ? (external "_void_void_test")
      #<EXTERNAL-ENTRY-POINT "_void_void_test" (#x000CFDF8) /Users/andewl/openmcl/libtypetest.dylib #x638EDF6>

      ? (external "_sc_sc_test")
      #<EXTERNAL-ENTRY-POINT "_sc_sc_test" (#x000CFE50) /Users/andewl/openmcl/libtypetest.dylib #x638EB3E>

      ? (external "_uc_uc_test")
      #<EXTERNAL-ENTRY-POINT "_uc_uc_test" (#x000CFED4) /Users/andewl/openmcl/libtypetest.dylib #x638E626>
    

Notice that the actual function names have been "mangled" by the C linker. The first function was named "void_void_test" in typetest.c, but in libtypetest.dylib, it has an underscore (a "_" symbol) before it: "_void_void_test". So, this is the name which you have to use. The mangling - the way the name is changed - may be different for other operating systems or other versions, so you need to "just know" how it's done...

Also, pay particular attention to the fact that a hexadecimal value appears in the EXTERNAL-ENTRY-POINT. (#x000CFDF8, for example - but what it is doesn't matter.) These hex numbers mean that the function can be dereferenced. Functions which aren't found will not have a hex number. For example:


      ? (external "functiondoesnotexist")
      #<EXTERNAL-ENTRY-POINT "functiondoesnotexist" {unresolved}  #x638E3F6>
    

The "unresolved" tells us that CCL wasn't able to find this function, which means you would get an error, "Can't resolve foreign symbol," if you tried to call it.

These external function references also are stored in a hash table which is accessible through a global variable, ccl::*eeps*.

At this point, we are ready to try our first external function call:


      ? (external-call "_void_void_test" :void)
      Entered void_void_test:
      Exited  void_void_test:
      NIL
    

We used external-call, which is the normal mechanism for accessing externally linked code. The "_void_void_test" is the mangled name of the external function. The :void refers to the return type of the function.

The next step is to try passing a value to C, and getting one back:


      ? (external-call "_sc_sc_test" :signed-byte -128 :signed-byte)
      Entered sc_sc_test:
      Data In: -128
      Exited  sc_sc_test:
      -128
    

The first :signed-byte gives the type of the first argument, and then -128 gives the value to pass for it. The second :signed-byte gives the return type. The return type is always given by the last argument to external-call.

Everything looks good. Now, let's try a number outside the range which fits in one byte:


      ? (external-call "_sc_sc_test" :signed-byte -567 :signed-byte)
      Entered sc_sc_test:
      Data In: -55
      Exited  sc_sc_test:
      -55
    

Hmmmm. A little odd. Let's look at the unsigned stuff to see how it reacts:


      ? (external-call "_uc_uc_test" :unsigned-byte 255 :unsigned-byte)
      Entered uc_uc_test:
      Data In: 255
      Exited  uc_uc_test:
      255
    

That looks okay. Now, let's go outside the valid range again:


      ? (external-call "_uc_uc_test" :unsigned-byte 567 :unsigned-byte)
      Entered uc_uc_test:
      Data In: 55
      Exited  uc_uc_test:
      55

      ? (external-call "_uc_uc_test" :unsigned-byte -567 :unsigned-byte)
      Entered uc_uc_test:
      Data In: 201
      Exited  uc_uc_test:
      201
    

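Since a signed byte can only hold values from -128 through 127, and an unsigned one can only hold values from 0 through 255, any number outside that range gets truncated: only the low eight bits of it are used. That is why 567 arrives as 55, and why -567 arrives as 201 when read as unsigned or as -55 when read as signed.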
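The arithmetic behind those results can be checked in plain C. The following is a minimal sketch and is not part of typetest.c; the helper names low_eight_bits, as_signed_byte, and show_truncation are made up for this illustration. It keeps only the low eight bits of a value and then reinterprets those bits as a two's-complement signed byte.

```c
#include <stdio.h>

/* Keep only the low eight bits of a value, as happens when an
   out-of-range number is passed as an :unsigned-byte.
   (Hypothetical helper, not part of typetest.c.) */
unsigned int low_eight_bits(int value)
{
    /* Conversion to unsigned is defined modulo 2^N, so the mask is portable. */
    return (unsigned int)value & 0xFFu;
}

/* Reinterpret the same eight bits as a two's-complement signed byte. */
int as_signed_byte(int value)
{
    int bits = (int)low_eight_bits(value);
    return (bits >= 128) ? bits - 256 : bits;
}

/* Print both interpretations for one input. */
void show_truncation(int value)
{
    printf("%d -> unsigned %u, signed %d\n",
           value, low_eight_bits(value), as_signed_byte(value));
    fflush(stdout);
}
```

With these helpers, low_eight_bits(567) is 55, low_eight_bits(-567) is 201, and as_signed_byte(-567) is -55, which matches the three transcripts above.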

What is important to remember is that external function calls have very few safety checks. Data outside the valid range for its type will silently do very strange things; pointers outside the valid range can very well crash the system.

That's it for our first example library. If you're still following along, let's add some more C code to look at the rest of the primitive types. Then we'll need to recompile the dynamic library, load it again, and then we can see what happens.

Add the following code to typetest.c:

int
si_si_test(int data)
{
    printf("Entered %s:\n", __FUNCTION__);
    printf("Data In: %d\n", data);
    printf("Exited  %s:\n", __FUNCTION__);
    fflush(stdout);
    return data;
}

long
sl_sl_test(long data)
{
    printf("Entered %s:\n", __FUNCTION__);
    printf("Data In: %ld\n", data);
    printf("Exited  %s:\n", __FUNCTION__);
    fflush(stdout);
    return data;
}

long long
sll_sll_test(long long data)
{
    printf("Entered %s:\n", __FUNCTION__);
    printf("Data In: %lld\n", data);
    printf("Exited  %s:\n", __FUNCTION__);
    fflush(stdout);
    return data;
}

float
f_f_test(float data)
{
    printf("Entered %s:\n", __FUNCTION__);
    printf("Data In: %e\n", data);
    printf("Exited  %s:\n", __FUNCTION__);
    fflush(stdout);
    return data;
}

double
d_d_test(double data)
{
    printf("Entered %s:\n", __FUNCTION__);
    printf("Data In: %e\n", data);
    printf("Exited  %s:\n", __FUNCTION__);
    fflush(stdout);
    return data;
}
    

The command line to compile the dynamic library is the same as before:


      gcc -dynamiclib -Wall -o libtypetest.dylib typetest.c \
      -install_name ./libtypetest.dylib
    

Now, restart CCL. This step is required because CCL cannot close and reload a dynamic library on OS X.

Have you restarted? Okay, try out the new code:


      Welcome to CCL Version (Beta: Darwin) 0.14.2-040506!

      ? (open-shared-library "/Users/andewl/openmcl/libtypetest.dylib")
      #<SHLIB /Users/andewl/openmcl/libtypetest.dylib #x638EF3E>

      ? (external-call "_si_si_test" :signed-fullword -178965 :signed-fullword)
      Entered si_si_test:
      Data In: -178965
      Exited  si_si_test:
      -178965

      ? ;; long is the same size as int on 32-bit machines.
      (external-call "_sl_sl_test" :signed-fullword -178965 :signed-fullword)
      Entered sl_sl_test:
      Data In: -178965
      Exited  sl_sl_test:
      -178965

      ? (external-call "_sll_sll_test"
      :signed-doubleword -973891578912 :signed-doubleword)
      Entered sll_sll_test:
      Data In: -973891578912
      Exited  sll_sll_test:
      -973891578912
    

Okay, everything seems to be acting as expected. However, just to remind you that most of this stuff has no safety net, here's what happens if somebody mistakes sl_sl_test() for sll_sll_test(), thinking that a long is actually a doubleword:


      ? (external-call "_sl_sl_test"
      :signed-doubleword -973891578912 :signed-doubleword)
      Entered sl_sl_test:
      Data In: -227
      Exited  sl_sl_test:
      -974957576192
    

Ouch. The C function sees a completely different value, with no warning that something is wrong. Even worse, the value that comes back to CCL (-974957576192) is close enough to the original (-973891578912) that a quick glance could miss the fact that something is wrong.
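One plausible reading of those numbers is that the 64-bit argument was split into two 32-bit words and that sl_sl_test only ever saw the high one. This is an interpretation of the transcript, not something the tutorial states, and the helper names below are hypothetical. The sketch splits a 64-bit value into its high and low words and shows what is left if the low word is replaced by zero.

```c
#include <stdint.h>

/* 2^32, the weight of the high word in a 64-bit value. */
#define TWO_TO_32 4294967296LL

/* High 32-bit word of a 64-bit value, using floor division so that
   negative inputs behave like two's-complement splitting.
   (Hypothetical helper for illustration.) */
int64_t high_word(int64_t value)
{
    int64_t q = value / TWO_TO_32;
    if (value % TWO_TO_32 < 0)
        q -= 1;              /* C division truncates toward zero */
    return q;
}

/* Low 32-bit word, as a non-negative number below 2^32. */
int64_t low_word(int64_t value)
{
    return value - high_word(value) * TWO_TO_32;
}

/* What remains if only the high word survives the round trip and
   the low word is read back as zero (an assumption about this
   particular call, inferred from the transcript). */
int64_t high_word_only(int64_t value)
{
    return high_word(value) * TWO_TO_32;
}
```

For -973891578912 the high word is -227, the value printed as "Data In", and high_word_only gives -974957576192, the value that came back to CCL.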

Finally, let's take a look at doing this with floating-point numbers.


      Welcome to CCL Version (Beta: Darwin) 0.14.2-040506!

      ? (open-shared-library "/Users/andewl/openmcl/libtypetest.dylib")
      #<SHLIB /Users/andewl/openmcl/libtypetest.dylib #x638EF3E>

      ? (external-call "_f_f_test" :single-float -1.256791e+11 :single-float)
      Entered f_f_test:
      Data In: -1.256791e+11
      Exited  f_f_test:
      -1.256791E+11

      ? (external-call "_d_d_test" :double-float -1.256791d+290 :double-float)
      Entered d_d_test:
      Data In: -1.256791e+290
      Exited  d_d_test:
      -1.256791D+290
    

Notice that the number ends with "...e+11" for the single-float, and "...d+290" for the double-float. Lisp has both of these float types itself, and the d instead of the e is how you specify which to create. If you tried to pass :double-float 1.0e2 to external-call, Lisp would be nice enough to notice and give you a type error. Don't get the :double-float wrong, though, because then there's no protection.

Congratulations! You now know how to call external C functions from within CCL, and pass numbers back and forth. Now that the basic mechanics of calling and passing work, the next step is to examine how to pass more complex data structures around.

12.10.1. Acknowledgement

This chapter was generously contributed by Andrew P. Lentvorski Jr.

12.11. Tutorial: Allocating Foreign Data on the Lisp Heap

Not every foreign function is so marvelously easy to use as the ones we saw in the last section. Some functions require you to allocate a C struct, fill it with your own information, and pass in a pointer to that struct. Some of them require you to allocate an empty struct that they will fill in so that you can read the information out of it.

There are generally two ways to allocate foreign data. The first way is to allocate it on the stack; the RLET macro is one way to do this. This is analogous to using automatic variables in C. In the jargon of Common Lisp, data allocated this way is said to have dynamic extent.

The other way is to heap-allocate the foreign data. This is analogous to calling malloc in C. Again in the jargon of Common Lisp, heap-allocated data is said to have indefinite extent. If a function heap-allocates some data, that data remains valid even after the function itself exits. This is useful for data which may need to be passed between multiple C calls or multiple threads. Also, some data may be too large to copy multiple times or may be too large to allocate on the stack.

The big disadvantage to allocating data on the heap is that it must be explicitly deallocated—you need to "free" it when you're done with it. Normal Lisp objects, even those with indefinite extent, are deallocated by the garbage collector when it can prove that they're no longer referenced. Foreign data, though, is outside the GC's ken: it has no way to know whether a blob of foreign data is still referenced by foreign code or not. It is thus up to the programmer to manage it manually, just as one does in C with malloc and free.

What that means is that, if you allocate something and then lose track of the pointer to it, there's no way to ever free that memory. That's what's called a memory leak, and if your program leaks enough memory it will eventually use up all of it! So, you need to be careful to not lose your pointers.
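For readers who want to see the two kinds of extent side by side in C terms, here is a minimal sketch. The function names are invented for this example. make_int_array returns heap memory that outlives the call and must later be passed to free, while sum_of_stack_array uses an automatic array that disappears when the function returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Heap-allocate an array of count ints, all set to fill.
   The memory has indefinite extent: it stays valid after this
   function returns, until the caller passes it to free().
   Returns NULL if the allocation fails. */
int *make_int_array(size_t count, int fill)
{
    int *data = malloc(count * sizeof *data);
    if (data == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        data[i] = fill;
    return data;
}

/* Sum an array through a pointer; works for heap or stack memory. */
long sum_int_array(const int *data, size_t count)
{
    long total = 0;
    assert(data != NULL);
    for (size_t i = 0; i < count; i++)
        total += data[i];
    return total;
}

/* Dynamic extent: the array lives on the stack and is gone
   as soon as this function returns. */
long sum_of_stack_array(void)
{
    int local[4] = {1, 2, 3, 4};
    return sum_int_array(local, 4);
}
```

If the pointer returned by make_int_array is dropped without a matching call to free, that block is leaked in exactly the sense described above.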

That disadvantage, though, is also an advantage for using foreign functions. Since the garbage collector doesn't know about this memory, it will never move it around. External C code needs this, because it doesn't know how to follow it to where it moved, the way that Lisp code does. If you allocate data manually, you can pass it to foreign code and know that no matter what that code needs to do with it, it will be able to, until you deallocate it. Of course, you'd better be sure it's done before you do. Otherwise, your program will be unstable and might crash sometime in the future, and you'll have trouble figuring out what caused the trouble, because there won't be anything pointing back and saying "you deallocated this too soon."

And, so, on to the code...

As in the last tutorial, our first step is to create a local dynamic library in order to help show what is actually going on between CCL and C. So, create the file ptrtest.c, with the following code:

#include <stdio.h>

void reverse_int_array(int * data, unsigned int dataobjs)
{
    int i, t;
    
    for(i=0; i<dataobjs/2; i++)
        {
            t = *(data+i);
            *(data+i) = *(data+dataobjs-1-i);
            *(data+dataobjs-1-i) = t;
        }
}

void reverse_int_ptr_array(int **ptrs, unsigned int ptrobjs)
{
    int *t;
    int i;
    
    for(i=0; i<ptrobjs/2; i++)
        {
            t = *(ptrs+i);
            *(ptrs+i) = *(ptrs+ptrobjs-1-i);
            *(ptrs+ptrobjs-1-i) = t;
        }
}

void
reverse_int_ptr_ptrtest(int **ptrs)
{
    reverse_int_ptr_array(ptrs, 2);
    
    reverse_int_array(*(ptrs+0), 4);
    reverse_int_array(*(ptrs+1), 4);
}
    

This defines three functions. reverse_int_array takes a pointer to an array of ints, and a count telling how many items are in the array, and loops through it putting the elements in reverse. reverse_int_ptr_array does the same thing, but with an array of pointers to ints. It only reverses the order the pointers are in; each pointer still points to the same thing. reverse_int_ptr_ptrtest takes an array of pointers to arrays of ints. (With me?) It doesn't need to be told their sizes; it just assumes that the array of pointers has two items, and that both of those are arrays which have four items. It reverses the array of pointers, then it reverses each of the two arrays of ints.
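Before involving CCL at all, it can be reassuring to check this behaviour from plain C. The sketch below is self-contained, so it repeats the reversal logic under different, made-up names (reverse_ints, reverse_int_ptrs, reverse_rows_and_contents) rather than linking against ptrtest.c. It assumes, as reverse_int_ptr_ptrtest does, two rows of four ints each.

```c
#include <assert.h>
#include <stddef.h>

/* Compact stand-in for reverse_int_array, repeated here only so
   that this sketch compiles by itself. */
void reverse_ints(int *data, size_t count)
{
    for (size_t i = 0; i < count / 2; i++) {
        int t = data[i];
        data[i] = data[count - 1 - i];
        data[count - 1 - i] = t;
    }
}

/* Compact stand-in for reverse_int_ptr_array: only the pointers
   change places; what they point to is untouched. */
void reverse_int_ptrs(int **ptrs, size_t count)
{
    for (size_t i = 0; i < count / 2; i++) {
        int *t = ptrs[i];
        ptrs[i] = ptrs[count - 1 - i];
        ptrs[count - 1 - i] = t;
    }
}

/* Same shape as reverse_int_ptr_ptrtest: two rows of four ints. */
void reverse_rows_and_contents(int **rows)
{
    assert(rows != NULL);
    reverse_int_ptrs(rows, 2);
    reverse_ints(rows[0], 4);
    reverse_ints(rows[1], 4);
}
```

Starting from rows holding 1 2 3 4 and 5 6 7 8, the pointer array ends up with the second row first, and the rows themselves read 8 7 6 5 and 4 3 2 1.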

Now, compile ptrtest.c into a dynamic library using the command:

      gcc -dynamiclib -Wall -o libptrtest.dylib ptrtest.c -install_name ./libptrtest.dylib
    

The function make-heap-ivector is the primary tool for allocating objects in heap memory. It allocates a fixed-size CCL object in heap memory. It returns both an array reference, which can be used directly from CCL, and a macptr, which can be used to access the underlying memory directly. For example:

      ? ;; Create an array of 3 4-byte-long integers
      (multiple-value-bind (la lap)
          (make-heap-ivector 3 '(unsigned-byte 32))
        (setq a la)
        (setq ap lap))
      ;Compiler warnings :
      ;   Undeclared free variable A, in an anonymous lambda form.
      ;   Undeclared free variable AP, in an anonymous lambda form.
      #<A Mac Pointer #x10217C>

      ? a
      #(1396 2578 97862649)

      ? ap
      #<A Mac Pointer #x10217C>
    

It's important to realize that the contents of the ivector we've just created haven't been initialized, so their values are unpredictable, and you should be sure not to read from them before you set them, to avoid confusing results.

At this point, a references an object which works just like a normal array. You can refer to any item of it with the standard aref function, and set them by combining that with setf. As noted above, the ivector's contents haven't been initialized, so that's the next order of business:

      ? a
      #(1396 2578 97862649)

      ? (aref a 2)
      97862649

      ? (setf (aref a 0) 3)
      3

      ? (setf (aref a 1) 4)
      4

      ? (setf (aref a 2) 5)
      5

      ? a
      #(3 4 5)
    

In addition, the macptr allows direct access to the same memory:

      ? (setq *byte-length-of-long* 4)
      4

      ? (%get-signed-long ap (* 2 *byte-length-of-long*))
      5

      ? (%get-signed-long ap (* 0 *byte-length-of-long*))
      3

      ? (setf (%get-signed-long ap (* 0 *byte-length-of-long*)) 6)
      6

      ? (setf (%get-signed-long ap (* 2 *byte-length-of-long*)) 7)
      7

      ? ;; Show that a actually got changed through ap
      a
      #(6 4 7)
    

So far, there is nothing about this object that could not be done much better with standard Lisp. However, the macptr can be used to pass this chunk of memory off to a C function. Let's use the C code to reverse the elements in the array:

      ? ;; Insert the full path to your copy of libptrtest.dylib
      (open-shared-library "/Users/andrewl/openmcl/openmcl/gtk/libptrtest.dylib")
      #<SHLIB /Users/andrewl/openmcl/openmcl/gtk/libptrtest.dylib #x639D1E6>

      ? a
      #(6 4 7)

      ? ap
      #<A Mac Pointer #x10217C>

      ? (external-call "_reverse_int_array" :address ap :unsigned-int (length a) :address)
      #<A Mac Pointer #x10217C>

      ? a
      #(7 4 6)

      ? ap
      #<A Mac Pointer #x10217C>
    

The array gets passed correctly to the C function, reverse_int_array. The C function reverses the contents of the array in-place; that is, it doesn't make a new array, just keeps the same one and reverses what's in it. Finally, the C function passes control back to CCL. Since the allocated array memory has been directly modified, CCL reflects those changes directly in the array as well.

There is one final bit of housekeeping to deal with. Before moving on, the memory needs to be deallocated:

      ? (dispose-heap-ivector a ap)
      NIL
    

The dispose-heap-ivector macro actually deallocates the ivector, releasing its memory into the heap for something else to use. Both a and ap now have undefined values.

When do you call dispose-heap-ivector? Anytime after you know the ivector will never be used again, but no sooner. If you have a lot of ivectors, say, in a hash table, you need to make sure that when whatever you were doing with the hash table is done, those ivectors all get freed. Unless there's still something somewhere else which refers to them, of course! Exactly what strategy to take depends on the situation, so just try to keep things simple unless you know better.

The simplest situation is when you have things set up so that a Lisp object "encapsulates" a pointer to foreign data, taking care of all the details of using it. In this case, you don't want those two things to have different lifetimes: You want to make sure your Lisp object exists as long as the foreign data does, and no longer; and you want to make sure the foreign data doesn't get deallocated while your Lisp object still refers to it.

If you're willing to accept a few limitations, you can make this easy. First, you can't let foreign code keep a permanent pointer to the memory; it has to always finish what it's doing, then return, and not refer to that memory again. Second, you can't let any Lisp code that isn't part of your encapsulating "wrapper" refer to the pointer directly. Third, nothing, either foreign code or Lisp code, should explicitly deallocate the memory.

If you can make sure all of these are true, you can at least ensure that the foreign pointer is deallocated when the encapsulating object is about to become garbage, by using CCL's nonstandard "termination" mechanism, which is essentially the same as what Java and other languages call "finalization".

Termination is a way of asking the garbage collector to let you know when it's about to destroy an object which isn't used anymore. Before destroying the object, it calls a function which you write, called a terminator.

So, you can use termination to find out when a particular macptr is about to become garbage. That's not quite as helpful as it might seem: It's not exactly the same thing as knowing that the block of memory it points to is unreferenced. For example, there could be another macptr somewhere to the same block; or, if it's a struct, there could be a macptr to one of its fields. Most problematically, if the address of that memory has been passed to foreign code, it's sometimes hard to know whether that code has kept the pointer. Most foreign functions don't, but it's not hard to think of exceptions.

You can use code such as this to make all this happen:

      (defclass wrapper (whatever)
        ((element-type :initarg :element-type)
         (element-count :initarg :element-count)
         (ivector)
         (macptr)))

      (defmethod initialize-instance ((wrapper wrapper) &rest initargs)
        (declare (ignore initargs))
        (call-next-method)
        (ccl:terminate-when-unreachable wrapper)
        (with-slots (ivector macptr element-type element-count) wrapper
          (multiple-value-bind (new-ivector new-macptr)
              (make-heap-ivector element-count element-type)
            (setq ivector new-ivector
                  macptr new-macptr))))

      (defmethod ccl:terminate ((wrapper wrapper))
        (with-slots (ivector macptr) wrapper
          (when ivector
            (dispose-heap-ivector ivector macptr)
            (setq ivector nil
                  macptr nil))))
    

The ccl:terminate method will be called on some arbitrary thread sometime (hopefully soon) after the GC has decided that there are no strong references to an object which has been the argument of a ccl:terminate-when-unreachable call.

If it makes sense to say that the foreign object should live as long as there's Lisp code that references it (through the encapsulating object) and no longer, this is one way of doing that.
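C has no garbage collector and therefore nothing like termination, but the guard inside the terminate method, which disposes of the memory only if it is still there and then clears the slots, is a pattern that carries over directly. The following sketch uses a made-up struct int_buffer and made-up function names; it only illustrates that guard and is not how CCL implements termination.

```c
#include <stdlib.h>

/* A small wrapper that owns one heap-allocated block.
   (Hypothetical type, for illustration only.) */
struct int_buffer {
    int *data;
    size_t count;
};

/* Allocate count zeroed ints. Returns 1 on success, 0 on failure. */
int int_buffer_init(struct int_buffer *buf, size_t count)
{
    buf->data = calloc(count, sizeof *buf->data);
    buf->count = (buf->data != NULL) ? count : 0;
    return buf->data != NULL;
}

/* Release the block if it is still owned, then clear the fields so
   that a second call is harmless, like the (when ivector ...) guard. */
void int_buffer_dispose(struct int_buffer *buf)
{
    if (buf->data != NULL) {
        free(buf->data);
        buf->data = NULL;
        buf->count = 0;
    }
}
```

Because the fields are cleared after the first disposal, calling int_buffer_dispose twice on the same wrapper does not free the block twice.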

Now we've covered passing basic types back and forth with C, and we've done the same with pointers. You may think this is all... but we've only done pointers to basic types. Join us next time for pointers... to pointers.

12.11.1. Acknowledgement

Much of this chapter was generously contributed by Andrew P. Lentvorski Jr.

12.12. The Foreign-Function-Interface Dictionary

[Reader Macro]

#_

Description:

Reads a symbol from the current input stream, with *PACKAGE* bound to the "OS" package and with readtable-case preserved.

Does a lookup on that symbol in the CCL interface database, signalling an error if no foreign function information can be found for the symbol in any active interface directory.

Notes the foreign function information, including the foreign function's return type, the number and type of the foreign function's required arguments, and an indication of whether or not the function accepts additional arguments (via e.g., the "varargs" mechanism in C).

Defines a macroexpansion function on the symbol, which expands macro calls involving the symbol into EXTERNAL-CALL forms where foreign argument type specifiers for required arguments and the return value specifier are provided from the information in the database.

Returns the symbol.

The effect of these steps is that it's possible to call foreign functions that take fixed numbers of arguments by simply providing argument values, as in:

          (#_isatty fd)
          (#_read fd buf n)

and to call foreign functions that take variable numbers of arguments by specifying the types of non-required args, as in:

          (with-cstrs ((format-string "the answer is: %d"))
            (#_printf format-string :int answer))
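The reason the types of the extra arguments have to be spelled out is visible on the C side: a variadic function receives no type information for its optional arguments and has to be told, or has to assume, what each one is. A minimal sketch, with a made-up function sum_ints:

```c
#include <stdarg.h>

/* Sum count int arguments. The callee cannot discover the argument
   types by itself; va_arg has to be told that each one is an int.
   (Hypothetical function, for illustration only.) */
long sum_ints(int count, ...)
{
    va_list args;
    long total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

If a caller passed a double where sum_ints expects an int, nothing would catch the mismatch, which is the same lack of a safety net that applies to the optional arguments of a call made through #_.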

You can query whether a given name is defined in the interface databases by appending the '?' character to the reader macro; for example:

          CL-USER> #_?printf
          T
          CL-USER> #_?foo
          NIL
        

[Reader Macro]

#&

Description:

In CCL 1.2 and later, the #& reader macro can be used to access foreign variables; this functionality depends on the presence of "vars.cdb" files in the interface database. The current behavior of the #& reader macro is to:

Read a symbol from the current input stream, with *PACKAGE* bound to the "OS" package and with readtable-case preserved.

Use that symbol's pname to access the CCL interface database, signalling an error if no appropriate foreign variable information can be found with that name in any active interface directory.

Use type information recorded in the database to construct a form which can be used to access the foreign variable, and return that form.

Please note that the set of foreign variables declared in header files may or may not match the set of foreign variables exported from libraries (we're generally talking about C and Unix here ...). When they do match, the form constructed by the #& reader macro manages the details of resolving and tracking changes to the foreign variable's address.

Future extensions (via prefix arguments to the reader macro) may offer additional behavior; it might be convenient (for instance) to be able to access the address of a foreign variable without dereferencing that address.

Foreign variables in C code tend to be platform- and package-specific (the canonical example, "errno", is typically not a variable when threads are involved).

In LinuxPPC,

? #&stderr

returns a pointer to the stdio error stream ("stderr" is a macro under OSX/Darwin).

On both LinuxPPC and DarwinPPC,

? #&sys_errlist

returns a pointer to a C array of C error message strings.

You can query whether a given name is defined in the interface databases by appending the '?' character to the reader macro; for example:

          CL-USER> #&?sys_errlist
          T
          CL-USER> #&?foo
          NIL
        

[Reader Macro]

#$

Description:

In CCL 0.14.2 and later, the #$ reader macro can be used to access foreign constants; this functionality depends on the presence of "constants.cdb" files in the interface database. The current behavior of the #$ reader macro is to:

Read a symbol from the current input stream, with *PACKAGE* bound to the "OS" package and with readtable-case preserved.

Use that symbol's pname to access the CCL interface database, signalling an error if no appropriate foreign constant information can be found with that name in any active interface directory.

Use type information recorded in the database to construct a form which can be used to access the foreign constant, and return that form.

Please note that the set of foreign constants declared in header files may or may not match the set of foreign constants exported from libraries. When they do match, the form constructed by the #$ reader macro manages the details of resolving and tracking changes to the foreign constant's address.

You can query whether a given name is defined in the interface databases by appending the '?' character to the reader macro; for example:

          CL-USER> #$?SO_KEEPALIVE
          T
          CL-USER> #$?foo
          NIL
        

[Reader Macro]

#/

Description:

In CCL 1.2 and later, the #/ reader macro can be used to access foreign functions on the Darwin platform. The current behavior of the #/ reader macro is to:

Read a symbol from the current input stream, with *PACKAGE* bound to the "NEXTSTEP-FUNCTIONS" package, with readtable-case preserved, and with any colons included.

Do limited sanity-checking on the resulting symbol; for example, any name that contains at least one colon is required also to end with a colon, to conform to Objective-C method-naming conventions.

Export the resulting symbol from the "NEXTSTEP-FUNCTIONS" package and return it.

For example, reading "#/alloc" interns and returns NEXTSTEP-FUNCTIONS:|alloc|. Reading "#/initWithFrame:" interns and returns NEXTSTEP-FUNCTIONS:|initWithFrame:|.

A symbol read using this macro can be used as an operand in most places where an Objective-C message name can be used, such as in the (OBJ:@SELECTOR ...) construct.

Please note: the reader macro is not rigorous about enforcing Objective-C method-naming conventions. Despite the simple checking done by the reader macro, it may still be possible to use it to construct invalid names.

The act of interning a new symbol in the NEXTSTEP-FUNCTIONS package triggers an interface database lookup of Objective-C methods with the corresponding message name. If any such information is found, a special type of dispatching function is created and initialized and the new symbol is given the newly-created dispatching function as its function definition.

The dispatching function knows how to call declared Objective-C methods defined on the message. In many cases, all methods have the same foreign type signature, and the dispatching function merely passes any arguments that it receives to a function that does an Objective-C message send with the indicated foreign argument and return types. In other cases, where different Objective-C messages have different type signatures, the dispatching function tries to choose a function that handles the right type signature based on the class of the dispatching function's first argument.

If new information about Objective-C methods is introduced (e.g., by using additional interface files or as Objective-C methods are defined from lisp), the dispatch function is reinitialized to recognize newly-introduced foreign type signatures.

The argument and result coercion that the bridge has traditionally supported is supported by the new mechanism (e.g., :<BOOL> arguments can be specified as lisp booleans and :<BOOL> results are returned as lisp boolean values, and an argument value of NIL is coerced to a null pointer if the corresponding argument type is :ID).

Some Objective-C methods accept variable numbers of arguments; the foreign types of non-required arguments are determined by the lisp types of those arguments (e.g., integers are passed as integers, floats as floats, pointers as pointers, record types by reference.)

Examples:

          ;;; #/alloc is a known message.
          ? #'#/alloc
          #<OBJC-DISPATCH-FUNCTION NEXTSTEP-FUNCTIONS:|alloc| #x300040E94EBF>
          ;;; Sadly, #/foo is not ...
          ? #'#/foo
          > Error: Undefined function: NEXTSTEP-FUNCTIONS:|foo|

          ;;; We can send an "init" message to a newly-allocated instance of
          ;;; "NSObject" by:

          (send (send ns:ns-object 'alloc) 'init)

          ;;; or by

          (#/init (#/alloc ns:ns-object))
        

Objective-C methods that "return" structures return them as garbage-collectable pointers when called via dispatch functions. For example, if "my-window" is an NS:NS-WINDOW instance, then

          (#/frame my-window)
        

returns a garbage-collectable pointer to a structure that describes that window's frame rectangle. This convention means that there's no need to use SLET or special structure-returning message send syntax; keep in mind, though, that #_malloc, #_free, and the GC are all involved in the creation and eventual destruction of structure-typed return values. In some programs these operations may have an impact on performance.

[Reader Macro]

#>

Description:

In CCL 1.2 and later, the #> reader macro reads the following text as a keyword, preserving the case of the text. For example:

          CL-USER> #>FooBar
          :<F>OO<B>AR
        

The resulting keyword can be used as the name of foreign types, records, and accessors.

[Function]

close-shared-library library &key completely
Stops using a shared library, informing the operating system that it can be unloaded if appropriate.

Values:

library---either an object of type SHLIB, or a string which designates one by its so-name.

completely---a boolean. The default is T.

Description:

If completely is T, sets the reference count of library to 0. Otherwise, decrements it by 1. In either case, if the reference count becomes 0, close-shared-library frees all memory resources consumed by library and causes any EXTERNAL-ENTRY-POINTs known to be defined by it to become unresolved.

[Macro]

defcallback name ({arg-type-specifier var}* &optional result-type-specifier) &body body

Values:

name---A symbol which can be made into a special variable

arg-type-specifier---One of the foreign argument-type keywords, described above, or an equivalent foreign type specifier. In addition, if the keyword :WITHOUT-INTERRUPTS is specified, the callback will be executed with lisp interrupts disabled if the corresponding var is non-NIL. If :WITHOUT-INTERRUPTS is specified more than once, the rightmost instance wins.

var---A symbol (lisp variable), which will be bound to a value of the specified type.

body---A sequence of lisp forms, which should return a value which can be coerced to the specified result-type.

Description:

Proclaims name to be a special variable; sets its value to a MACPTR which, when called by foreign code, calls a lisp function which expects foreign arguments of the specified types and which returns a foreign value of the specified result type. Any argument variables which correspond to foreign arguments of type :ADDRESS are bound to stack-allocated MACPTRs.

If name is already a callback function pointer, its value is not changed; instead, it's arranged that an updated version of the lisp callback function will be called. This feature allows for callback functions to be redefined incrementally, just like Lisp functions are.

defcallback returns the callback pointer, i.e., the value of name.
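Seen from the foreign side, a callback is simply a function pointer that some C routine calls back through. The sketch below uses the standard qsort function as a stand-in for any foreign code that accepts such a pointer; compare_ints and sort_ints are made-up names. A pointer produced by defcallback would be passed in the position where compare_ints appears, provided its argument and result types match what the C routine expects.

```c
#include <stdlib.h>

/* Comparison callback with the signature qsort requires.
   Returns a negative, zero, or positive value. */
int compare_ints(const void *left, const void *right)
{
    int a = *(const int *)left;
    int b = *(const int *)right;
    return (a > b) - (a < b);
}

/* Foreign code that "calls back": qsort invokes compare_ints
   through the function pointer it is handed. */
void sort_ints(int *data, size_t count)
{
    qsort(data, count, sizeof *data, compare_ints);
}
```

Note that both arguments to the comparison function are addresses, which is the case in which defcallback binds the corresponding variables to stack-allocated MACPTRs.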

[Macro]

def-foreign-type name foreign-type-spec

Values:

name---NIL or a keyword; the keyword may contain escaping constructs.

foreign-type-spec---A foreign type specifier, whose syntax is (loosely) defined above.

Description:

If name is non-NIL, defines name to be an alias for the foreign type specified by foreign-type-spec. If foreign-type-spec is a named structure or union type, additionally defines that structure or union type.

If name is NIL, foreign-type-spec must be a named foreign struct or union definition, in which case the foreign structure or union definition is put in effect.

Note that there are two separate namespaces for foreign type names, one for the names of ordinary types and one for the names of structs and unions. Which one name refers to depends on foreign-type-spec in the obvious manner.

[Macro]

external name => entry
Resolves a reference to an external symbol which is defined in a shared library.

Values:

name--- a simple-string which names an external symbol. Case-sensitive.

entry--- an object of type EXTERNAL-ENTRY-POINT which maintains the address of the foreign symbol named by name.

Description:

If there is already an EXTERNAL-ENTRY-POINT for the symbol named by name, finds it and returns it. If not, creates one and returns it.

Tries to resolve the entry point to a memory address, and identify the containing library.

Be aware that under Darwin, external functions which are callable from C have underscores prepended to their names, as in "_fopen".

[Macro]

external-call name {arg-type-specifier arg}* &optional result-type-specifier

Values:

name---A lisp string. See external, above.

arg-type-specifier---One of the foreign argument-type keywords, described above, or an equivalent foreign type specifier.

arg---A lisp value of type indicated by the corresponding arg-type-specifier

result-type-specifier---One of the foreign argument-type keywords, described above, or an equivalent foreign type specifier.

Description:

Calls the foreign function at the address obtained by resolving the external-entry-point associated with name, passing the values of each arg as a foreign argument of type indicated by the corresponding arg-type-specifier. Returns the foreign function result (coerced to a Lisp object of type indicated by result-type-specifier), or NIL if result-type-specifier is :VOID or NIL.

[Function]

%ff-call entrypoint {arg-type-keyword arg}* &optional result-type-keyword

Values:

entrypoint---A fixnum or MACPTR

arg-type-keyword---One of the foreign argument-type keywords, described above

arg---A lisp value of type indicated by the corresponding arg-type-keyword

result-type-keyword---One of the foreign argument-type keywords, described above

Description:

Calls the foreign function at address entrypoint, passing the values of each arg as a foreign argument of type indicated by the corresponding arg-type-keyword. Returns the foreign function result (coerced to a Lisp object of type indicated by result-type-keyword), or NIL if result-type-keyword is :VOID or NIL.

[Macro]

ff-call entrypoint {arg-type-specifier arg}* &optional result-type-specifier

Values:

entrypoint---A fixnum or MACPTR

arg-type-specifier---One of the foreign argument-type keywords, described above, or an equivalent foreign type specifier.

arg---A lisp value of type indicated by the corresponding arg-type-specifier

result-type-specifier---One of the foreign argument-type keywords, described above, or an equivalent foreign type specifier.

Description:

Calls the foreign function at address entrypoint, passing the value of each arg as a foreign argument of the type indicated by the corresponding arg-type-specifier. Returns the foreign function result (coerced to a Lisp object of the type indicated by result-type-specifier), or NIL if result-type-specifier is :VOID or NIL.

[Function]

foreign-symbol-address name

Values:

name---A lisp string.

Description:

Tries to resolve the address of the foreign symbol name. If successful, returns that address encapsulated in a MACPTR, else returns NIL.
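The following sketch shows one way FOREIGN-SYMBOL-ADDRESS and FF-CALL might be combined: the symbol is resolved to a MACPTR once, checked for NIL, and then called. The function name CURRENT-PID is hypothetical, and the sketch assumes that the C library function getpid is visible to the running lisp and that its result fits in an :INT.

```lisp
;; Hypothetical helper, not part of CCL.
(defun current-pid ()
  "Call the C function getpid through an explicitly resolved address."
  (let ((entry (ccl::foreign-symbol-address "getpid")))
    ;; FOREIGN-SYMBOL-ADDRESS returns NIL when the symbol cannot be resolved.
    (unless entry
      (error "Could not resolve the foreign symbol getpid"))
    ;; No arguments; the result is coerced from a C int.
    (ccl::ff-call entry :int)))
```

When the name is known at compile time, EXTERNAL-CALL does the same job with less ceremony; resolving the address by hand is mainly useful when the symbol might be absent.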

[Function]

foreign-symbol-entry name

Values:

name---A lisp string.

Description:

Tries to resolve the address of the foreign symbol name. If successful, returns a fixnum representation of that address, else returns NIL.

[Function]

free ptr

Values:

ptr---A MACPTR that points to a block of foreign, heap-allocated memory.

Description:

In CCL 1.2 and later, the CCL:FREE function invokes the foreign free function from the platform's standard C library to deallocate a block of foreign memory.

Previous versions of CCL implemented this function, but it was not exported.

If the argument to CCL:FREE is a gcable pointer (for example, an object returned by MAKE-GCABLE-RECORD) then CCL:FREE informs the garbage collector that the foreign memory has been deallocated before calling the foreign free function.

[Function]

make-heap-ivector element-count element-type => vector macptr size

Values:

element-count---A positive integer.

element-type---A type specifier.

vector---A lisp vector. The initial contents are undefined.

macptr---A pointer to the first byte of data stored in the vector.

size---The size of the returned vector in octets.

Description:

An "ivector" is a one-dimensional array that's specialized to a numeric or character element type.

MAKE-HEAP-IVECTOR allocates an ivector in foreign memory. The GC will never move this vector, and will in fact not pay any attention to it at all. The returned pointer to it can therefore be passed safely to foreign code.

The vector must be explicitly deallocated with DISPOSE-HEAP-IVECTOR.
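A minimal sketch of the allocate, use, dispose cycle follows. The function name HEAP-IVECTOR-DEMO is hypothetical; %GET-UNSIGNED-BYTE is used only to show that the returned pointer addresses the same octets as the Lisp vector.

```lisp
;; Hypothetical demo function, not part of CCL.
(defun heap-ivector-demo ()
  "Fill a heap-allocated octet vector from Lisp and read it back via its pointer."
  (multiple-value-bind (vec ptr)
      (ccl::make-heap-ivector 4 '(unsigned-byte 8))
    (unwind-protect
         (progn
           ;; The initial contents are undefined, so store every element.
           (dotimes (i 4)
             (setf (aref vec i) (* i 10)))
           ;; Read the third octet through the foreign pointer.
           (ccl::%get-unsigned-byte ptr 2))
      ;; The GC ignores this vector, so release it explicitly.
      (ccl::dispose-heap-ivector vec))))
```

The UNWIND-PROTECT is a design choice rather than a requirement: it keeps the foreign memory from leaking if the body signals an error.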

[Macro]

make-gcable-record typespec &rest initforms => result

Values:

typespec---A foreign type specifier, or a keyword which is used as the name of a foreign struct or union.

initforms---If the type denoted by typespec is scalar, a single value appropriate for that type; otherwise, a list of alternating field names and values appropriate for the types of those fields.

result--- A macptr which encapsulates the address of a newly-allocated record on the foreign heap. The foreign object returned by make-gcable-record is freed when the garbage collector determines that the MACPTR object that describes it is unreachable.

Description:

Allocates a block of foreign memory suitable to hold the foreign type described by typespec, in the same manner as MAKE-RECORD. In addition, MAKE-GCABLE-RECORD marks the returned object gcable; in other words, it informs the garbage collector that it may reclaim the object when it becomes unreachable.

In all other respects, MAKE-GCABLE-RECORD works the same way as MAKE-RECORD.

When using gcable pointers, it's important to remember the distinction between a MACPTR object (which is a lisp object, more or less like any other) and the block of foreign memory that the MACPTR object points to. If a gcable MACPTR object is the only thing in the world (lisp world or foreign world) that references the underlying block of foreign memory, then freeing the foreign memory when it becomes impossible to reference it is convenient and sane. If other lisp MACPTRs reference the underlying block of foreign memory or if the address of that foreign memory is passed to and retained by foreign code, having the GC free the memory may have unpleasant consequences if those other references are used.

Take care, therefore, not to create a gcable record unless you are sure that the returned MACPTR will be the only reference to the allocated memory that will ever be used.

[Macro]

make-record typespec &rest initforms => result

Values:

typespec---A foreign type specifier, or a keyword which is used as the name of a foreign struct or union.

initforms---If the type denoted by typespec is scalar, a single value appropriate for that type; otherwise, a list of alternating field names and values appropriate for the types of those fields.

result--- A macptr which encapsulates the address of a newly-allocated record on the foreign heap.

Description:

Expands into code which allocates and initializes an instance of the type denoted by typespec, on the foreign heap. The record is allocated using the C function malloc, and the user of make-record must explicitly call the function CCL:FREE to deallocate the record, when it is no longer needed.

If initforms is provided, its value or values are used in the initialization. When the type is a scalar, initforms is either a single value which can be coerced to that type, or no value, in which case binary 0 is used. When the type is a struct, initforms is a list, giving field names and the values for each. Each field is treated in the same way as a scalar is: If a value for it is given, it must be coerceable to the field's type; if not, binary 0 is used.

When the type is an array, initforms may not be provided, because make-record cannot initialize its values. make-record is also unable to initialize fields of a struct which are themselves structs. The user of make-record should set these values by another means.

A possibly-significant limitation is that it must be possible to find the foreign type at the time the macro is expanded; make-record signals an error if this is not the case.

Notes:

It is inconvenient that make-record is a macro, because this means that typespec cannot be a variable; it must be an immediate value.

If it weren't for this requirement, make-record could be a function. However, that would mean that any stand-alone application using it would have to include a copy of the interface database (see Section 12.4, “The Interface Database”), which is undesirable because it's large.
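As a small sketch of the MAKE-RECORD life cycle, the following allocates a scalar :INT on the foreign heap, stores and reads a value with PREF, and releases the memory with FREE. The function name MAKE-RECORD-DEMO is hypothetical.

```lisp
;; Hypothetical demo function, not part of CCL.
(defun make-record-demo ()
  "Allocate a foreign int, write 42 into it, read it back, and free it."
  ;; The typespec must be a literal, because MAKE-RECORD is a macro.
  (let ((p (ccl::make-record :int)))
    (unwind-protect
         (progn
           (setf (ccl::pref p :int) 42)
           (ccl::pref p :int))
      ;; Memory from MAKE-RECORD is never reclaimed automatically.
      (ccl::free p))))
```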

[Function]

open-shared-library name => library
Asks the operating system to load a shared library for CCL to use.

Values:

name---A SIMPLE-STRING which is presumed to be the so-name of or a filesystem path to the library.

library---An object of type SHLIB which describes the library denoted by name.

Description:

If the library denoted by name can be loaded by the operating system, returns an object of type SHLIB that describes the library; if the library is already open, increments a reference count. If the library can't be loaded, signals a SIMPLE-ERROR which contains an often-cryptic message from the operating system.

Examples:
;;; Try to do something simple.
          ? (open-shared-library "libgtk.so")
          > Error: Error opening shared library "libgtk.so": /usr/lib/libgtk.so: undefined symbol: gdk_threads_mutex
          > While executing: OPEN-SHARED-LIBRARY

          ;;; Grovel around, curse, and try to find out where "gdk_threads_mutex"
          ;;; might be defined. Then try again:

          ? (open-shared-library "libgdk.so")
          #<SHLIB libgdk.so #x3046DBB6>

          ? (open-shared-library "libgtk.so")
          #<SHLIB libgtk.so #x3046DC86>

          ;;; Reference an external symbol defined in one of those libraries.

          ? (external "gtk_main")
          #<EXTERNAL-ENTRY-POINT "gtk_main" (#x012C3004) libgtk.so #x3046FE46>

          ;;; Close those libraries.

          ? (close-shared-library "libgtk.so")
          T

          ? (close-shared-library "libgdk.so")
          T

          ;;; Reference the external symbol again.

          ? (external "gtk_main")
          #<EXTERNAL-ENTRY-POINT "gtk_main" {unresolved} libgtk.so #x3046FE46>
Notes:

It would be helpful to describe what an soname is and give examples of one.

Does the SHLIB still get returned if the library is already open?

[Macro]

pref ptr accessor-form

Values:

ptr---a MACPTR.

accessor-form---a keyword which names a foreign type or record, as described in Section 12.8.3, “Foreign type, record, and field names”.

Description:

References an instance of a foreign type (or a component of a foreign type) accessible via ptr.

Expands into code which references the indicated scalar type or component, or returns a pointer to a composite type.

PREF can be used with SETF.

RREF is a deprecated alternative to PREF. It accepts a :STORAGE keyword and rather loudly ignores it.

[Function]

%reference-external-entry-point eep

Values:

eep---An EXTERNAL-ENTRY-POINT, as obtained by the EXTERNAL macro.

Description:

Tries to resolve the address of the EXTERNAL-ENTRY-POINT eep; returns a fixnum representation of that address if successful, else signals an error.

[Macro]

rlet (var typespec &rest initforms)* &body body

Values:

var---A symbol (a lisp variable)

typespec---A foreign type specifier or foreign record name.

initforms---As described above, for make-record

Description:

Executes body in an environment in which each var is bound to a MACPTR encapsulating the address of a stack-allocated foreign memory block, allocated and initialized from typespec and initforms as per make-record. Returns whatever value(s) body returns.

Record fields that aren't explicitly initialized have unspecified contents.

[Macro]

rletz (var typespec &rest initforms)* &body body

Values:

var---A symbol (a lisp variable)

typespec---A foreign type specifier or foreign record name.

initforms---As described above, for ccl:make-record

Description:

Executes body in an environment in which each var is bound to a MACPTR encapsulating the address of a stack-allocated foreign memory block, allocated and initialized from typespec and initforms as per ccl:make-record.

Returns whatever value(s) body returns.

Unlike rlet, record fields that aren't explicitly initialized are set to binary 0.
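The difference from rlet can be seen in a short sketch. The function name RLETZ-DEMO is hypothetical; it reads a stack-allocated :INT before and after storing into it.

```lisp
;; Hypothetical demo function, not part of CCL.
(defun rletz-demo ()
  "Return the value of a stack-allocated int before and after assignment."
  (ccl::rletz ((n :int))
    ;; With RLETZ the uninitialized int is binary 0; with RLET this first
    ;; read would yield unspecified contents.
    (let ((before (ccl::pref n :int)))
      (setf (ccl::pref n :int) 7)
      (list before (ccl::pref n :int)))))
```

The storage is released when the body exits, so the MACPTR bound to N must not be used afterwards.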

[Function]

terminate-when-unreachable object

Values:

object---A CLOS object of a class for which there exists a method of the generic function ccl:terminate.

Description:

The "termination" mechanism is a way to have the garbage collector run a function right before an object is about to become garbage. It is very similar to the "finalization" mechanism which Java has. It is not standard Common Lisp, although other Lisp implementations have similar features. It is useful when there is some sort of special cleanup, deallocation, or releasing of resources which needs to happen when a certain object is no longer being used.

When the garbage collector discovers that an object is no longer referred to anywhere in the program, it deallocates that object, freeing its memory. However, if ccl:terminate-when-unreachable has been called on the object at any time, the garbage collector first invokes the generic function ccl:terminate, passing it the object as a parameter.

Therefore, to make termination do something useful, you need to define a method on ccl:terminate.

Because calling ccl:terminate-when-unreachable only affects a single object, rather than all objects of its class, you may wish to put a call to it in the initialize-instance method of a class. Of course, this is only appropriate if you do in fact want to use termination for all objects of a given class.

Example:
          (defclass resource-wrapper ()
            ((resource :initform nil :accessor resource)))

          (defmethod initialize-instance :after ((x resource-wrapper) &rest initargs)
             (declare (ignore initargs))
             (ccl:terminate-when-unreachable x))

          ;; DEALLOCATE stands in for whatever function releases the resource.
          (defmethod ccl:terminate ((x resource-wrapper))
             (when (resource x)
                (deallocate (resource x))))

[Function]

unuse-interface-dir dir-id

Values:

dir-id---A keyword whose pname, mapped to lower case, names a subdirectory of "ccl:headers;" (or "ccl:darwin-headers;")

Description:

Tells CCL to remove the interface directory denoted by dir-id from the list of interface directories which are consulted for foreign type and function information. Returns T if the directory was on the search list, NIL otherwise.

[Function]

use-interface-dir dir-id

Values:

dir-id---A keyword whose pname, mapped to lower case, names a subdirectory of "ccl:headers;" (or "ccl:darwin-headers;")

Description:

Tells CCL to add the interface directory denoted by dir-id to the list of interface directories which it consults for foreign type and function information. Arranges that that directory is searched before any others.

Note that use-interface-dir merely adds an entry to a search list. If the named directory doesn't exist in the file system or doesn't contain a set of database files, a runtime error may occur when CCL tries to open some database file in that directory, and it will try to open such a database file whenever it needs to find any foreign type or function information. unuse-interface-dir may come in handy in that case.

Examples:

One typically wants interface information to be available at compile-time (or, in many cases, at read-time). A typical idiom would be:

(eval-when (:compile-toplevel :execute)
  (use-interface-dir :GTK))

Using the :GTK interface directory makes available information on foreign types, functions, and constants. It's generally necessary to load foreign libraries before actually calling the foreign code, which for GTK can be done like this:

(load-gtk-libraries)

It should now be possible to do things like:

(#_gtk_widget_destroy w)

Chapter 13. The Objective-C Bridge

Mac OS X APIs use a language called Objective-C, which is approximately C with some object-oriented extensions modeled on Smalltalk. The Objective-C bridge makes it possible to work with Objective-C objects and classes from Lisp, and to define classes in Lisp which can be used by Objective-C.

The ultimate purpose of the Objective-C and Cocoa bridges is to make Cocoa (the standard user-interface framework on Mac OS X) as easy as possible to use from Clozure CL, in order to support the development of GUI applications and IDEs on Mac OS X (and on any platform that supports Objective-C, such as GNUstep). The eventual goal, which is much closer than it used to be, is complete integration of Cocoa into CLOS.

The current release provides Lisp-like syntax and naming conventions for the basic Objective-C operations, with automatic type processing and messages checked for validity at compile-time. It also provides some convenience facilities for working with Cocoa.

13.1. Changes in 1.2

Version 1.2 of Clozure CL exports most of the useful symbols described in this chapter; in previous releases, most of them were private in the CCL package.

There are several new reader macros that make it much more convenient than before to refer to several classes of symbols used with the Objective-C bridge. For a full description of these reader-macros, see the Foreign-Function-Interface Dictionary, especially the entries at the beginning, describing reader macros.

As in previous releases, 32-bit versions of Clozure CL use 32-bit floats and integers in data structures that describe geometry, font sizes and metrics, and so on. 64-bit versions of Clozure CL use 64-bit values where appropriate.

The Objective-C bridge defines the type NS:CGFLOAT as the Lisp type of the preferred floating-point type on the current platform, and defines the constant NS:+CGFLOAT+. On DarwinPPC32, the foreign types :cgfloat, :<NSUI>nteger, and :<NSI>nteger are defined by the Objective-C bridge (as 32-bit float, 32-bit unsigned integer, and 32-bit signed integer, respectively); these types are defined as 64-bit variants in the 64-bit interfaces.

Every Objective-C class is now properly named, either with a name exported from the NS package (in the case of a predefined class declared in the interface files) or with the name provided in the DEFCLASS form (with :METACLASS NS:+NS-OBJECT) which defines the class from Lisp. The class's Lisp name is now proclaimed to be a "static" variable (as if by DEFSTATIC, as described in the "Static Variables" section) and given the class object as its value. In other words:

(send (find-class 'ns:ns-application) 'shared-application)
    

and

(send ns:ns-application 'shared-application)
    

are equivalent. (Since it's not legal to bind a "static" variable, it may be necessary to rename some things so that unrelated variables whose names coincidentally conflict with Objective-C class names don't do so.)

13.2. Using Objective-C Classes

The class of most standard CLOS classes is named STANDARD-CLASS. In the Objective-C object model, each class is an instance of a (usually unique) metaclass, which is itself an instance of a "base" metaclass (often the metaclass of the class named "NSObject".) So, the Objective-C class named "NSWindow" and the Objective-C class "NSArray" are (sole) instances of their distinct metaclasses whose names are also "NSWindow" and "NSArray", respectively. (In the Objective-C world, it's much more common and useful to specialize class behavior such as instance allocation.)

When Clozure CL first loads foreign libraries containing Objective-C classes, it identifies the classes they contain. The foreign class name, such as "NSWindow", is mapped to an external symbol in the "NS" package via the bridge's translation rules, such as NS:NS-WINDOW. A similar transformation happens to the metaclass name, with a "+" prepended, yielding something like NS:+NS-WINDOW.

These classes are integrated into CLOS such that the metaclass is an instance of the class OBJC:OBJC-METACLASS and the class is an instance of the metaclass. SLOT-DESCRIPTION metaobjects are created for each instance variable, and the class and metaclass go through something very similar to the "standard" CLOS class initialization protocol (with a difference being that these classes have already been allocated.)

Performing all this initialization, which is done when you (require "COCOA"), currently takes several seconds; it could conceivably be sped up some, but it's never likely to be fast.

When the process is complete, CLOS is aware of several hundred new Objective-C classes and their metaclasses. Clozure CL's runtime system can reliably recognize MACPTRs to Objective-C classes as being CLASS objects, and can (fairly reliably but heuristically) recognize instances of those classes (though there are complicating factors here; see below.) SLOT-VALUE can be used to access (and, with care, set) instance variables in Objective-C instances. To see this, do:

      ? (require "COCOA")
    

and, after waiting a bit longer for a Cocoa listener window to appear, activate that Cocoa listener and do:

? (describe (ccl::send ccl::*NSApp* 'key-window))
    

This sends a message asking for the key window, which is the window that has the input focus (often the frontmost), and then describes it. As we can see, NS:NS-WINDOWs have lots of interesting slots.

13.3. Instantiating Objective-C Objects

Making an instance of an Objective-C class (whether the class in question is predefined or defined by the application) involves calling MAKE-INSTANCE with the class and a set of initargs as arguments. As with STANDARD-CLASS, making an instance involves initializing (with INITIALIZE-INSTANCE) an object allocated with ALLOCATE-INSTANCE.

For example, you can create an ns:ns-number like this:

      ? (make-instance 'ns:ns-number :init-with-int 42)
      #<NS-CF-NUMBER 42 (#x85962210)>
    

It's worth looking at how this would be done if you were writing in Objective-C:

      [[NSNumber alloc] initWithInt: 42]
    

Allocating an instance of an Objective-C class involves sending the class an "alloc" message, and then using those initargs that don't correspond to slot initargs as the "init" message to be sent to the newly-allocated instance. So, the example above could have been done more verbosely as:

      ? (defvar *n* (ccl::send (find-class 'ns:ns-number) 'alloc))
      *N*

      ? (setq *n* (ccl::send *n* :init-with-int 42))
      #<NS-CF-NUMBER 42 (#x16D340)>
    

That setq is important; this is a case where init decides to replace the object and return the new one, instead of modifying the existing one. In fact, if you leave out the setq and then try to view the value of *N*, Clozure CL will freeze. There's little reason to ever do it this way; this is just to show what's going on.

You've seen that an Objective-C initialization method doesn't have to return the same object it was passed. In fact, it doesn't have to return any object at all; in this case, the initialization fails and make-instance returns nil.

In some special cases, such as loading an ns:ns-window-controller from a .nib file, it may be necessary for you to pass the instance itself as one of the parameters to the initialization method. It goes like this:

      ? (defvar *controller*
          (make-instance 'ns:ns-window-controller))
      *CONTROLLER*

      ? (setq *controller*
              (ccl::send *controller*
                         :init-with-window-nib-name #@"DataWindow"
                         :owner *controller*))
      #<NS-WINDOW-CONTROLLER <NSWindowController: 0x1fb520> (#x1FB520)>
    

This example calls (make-instance) with no initargs. When you do this, the object is only allocated, and not initialized. It then sends the "init" message to do the initialization by hand.

There is an alternative API for instantiating Objective-C classes. You can call OBJC:MAKE-OBJC-INSTANCE, passing it the name of the Objective-C class as a string. In previous releases, OBJC:MAKE-OBJC-INSTANCE could be more efficient than OBJC:MAKE-INSTANCE in cases where the class did not define any Lisp slots; this is no longer the case. You can now regard OBJC:MAKE-OBJC-INSTANCE as completely equivalent to OBJC:MAKE-INSTANCE, except that you can pass a string for the classname, which may be convenient in the case that the classname is in some way unusual.
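As a sketch, the NSNumber example from earlier in this section could be written with the string-based API as follows. It assumes that the bridge has been loaded with (require "COCOA"); the function name MAKE-NUMBER-BY-NAME is hypothetical.

```lisp
;; Hypothetical helper, not part of the bridge.
(defun make-number-by-name (n)
  "Create an NSNumber holding the integer N, naming the class by a string."
  ;; Intended to be equivalent to (make-instance 'ns:ns-number :init-with-int n).
  (objc:make-objc-instance "NSNumber" :init-with-int n))
```

Sending the result the message 'int-value, as in (ccl::send (make-number-by-name 42) 'int-value), should give back the original integer.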

13.4. Calling Objective-C Methods

In Objective-C, methods are called "messages", and there's a special syntax to send a message to an object:

      [w alphaValue]
      [w setAlphaValue: 0.5]
      [v mouse: p inRect: r]
    

The first line sends the method "alphaValue" to the object w, with no parameters. The second line sends the method "setAlphaValue:", with the parameter 0.5. The third line sends the method "mouse:inRect:" - yes, all one long word - with the parameters p and r.

In Lisp, these same three lines are:

      (send w 'alpha-value)
      (send w :set-alpha-value 0.5)
      (send v :mouse p :in-rect r)
    

Notice that when a method has no parameters, its name is an ordinary symbol (it doesn't matter what package the symbol is in, as only its name is checked). When a method has parameters, each part of its name is a keyword, and the keywords alternate with the values.

These two lines break those rules, and both will result in error messages:

      (send w :alpha-value)
      (send w 'set-alpha-value 0.5)
    

Instead of (send), you can also invoke (send-super), with the same interface. It has roughly the same purpose as CLOS's (call-next-method); when you use (send-super), the message is handled by the superclass. This can be used to get at the original implementation of a method when it is shadowed by a method in your subclass.

13.4.1. Type Coercion for Objective-C Method Calls

Clozure CL's FFI handles many common conversions between Lisp and foreign data, such as unboxing floating-point args and boxing floating-point results. The bridge adds a few more automatic conversions:

NIL is equivalent to (%NULL-PTR) for any message argument that requires a pointer.

T/NIL are equivalent to #$YES/#$NO for any boolean argument.

A #$YES/#$NO returned by any method that returns BOOL will be automatically converted to T/NIL.
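The boolean conversions can be seen in a short sketch that flips a window's shadow. It assumes that w is an NS:NS-WINDOW and relies on the Cocoa messages "hasShadow" and "setHasShadow:"; the function name TOGGLE-SHADOW is hypothetical.

```lisp
;; Hypothetical helper, not part of the bridge.
(defun toggle-shadow (w)
  "Turn the shadow of the window W off if it is on, and on if it is off."
  ;; 'has-shadow returns a BOOL, which arrives in Lisp as T or NIL, so NOT
  ;; applies directly; the T or NIL passed to :set-has-shadow is converted
  ;; back to #$YES or #$NO by the bridge.
  (ccl::send w :set-has-shadow (not (ccl::send w 'has-shadow))))
```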

13.4.2. Methods which Return Structures

Some Cocoa methods return small structures, such as those used to represent points, rects, sizes and ranges. When writing in Objective-C, the compiler hides the implementation details. Unfortunately, in Lisp we must be slightly more aware of them.

Methods which return structures are called in a special way; the caller allocates space for the result, and passes a pointer to it as an extra argument to the method. This is called a Structure Return, or STRET. Don't look at me; I don't name these things.

Here's a simple use of this in Objective-C. The first line sends the "bounds" message to v1, which returns a rectangle. The second line sends the "setBounds:" message to v2, passing that same rectangle as a parameter.

        NSRect r = [v1 bounds];
        [v2 setBounds: r];
	  

In Lisp, we must explicitly allocate the memory, which is done most easily and safely with rlet. We do it like this:

(rlet ((r :<NSR>ect))
          (send/stret r v1 'bounds)
          (send v2 :set-bounds r))
      

The rlet allocates the storage (but doesn't initialize it), and makes sure that it will be deallocated when we're done. It binds the variable r to refer to it. The call to send/stret is just like an ordinary call to send, except that r is passed as an extra, first parameter. The third line, which calls send, does not need to do anything special, because there's nothing complicated about passing a structure as a parameter.

In order to make STRETs easier to use, the bridge provides two conveniences.

First, you can use the macros slet and slet* to allocate and initialize local variables to foreign structures in one step. The example above could have been written more tersely as:

(slet ((r (send v1 'bounds)))
      (send v2 :set-bounds r))
	  

Second, when one call to send is made inside another, the inner one has an implicit slet around it. So, one could in fact just write:

(send v2 :set-bounds (send v1 'bounds))
      

There are also several pseudo-functions provided for convenience by the Objective-C compiler, to make objects of specific types. The following are currently supported by the bridge: NS-MAKE-POINT, NS-MAKE-RANGE, NS-MAKE-RECT, and NS-MAKE-SIZE.

These pseudo-functions can be used within an SLET initform:

(slet ((p (ns-make-point 100.0 200.0)))
      (send w :set-frame-origin p))
      

Or within a call to send:

(send w :set-frame-origin (ns-make-point 100.0 200.0))
      

However, since these aren't real functions, a call like the following won't work:

(setq p (ns-make-point 100.0 200.0))
      

To extract fields from these objects, there are also some convenience macros: NS-MAX-RANGE, NS-MIN-X, NS-MIN-Y, NS-MAX-X, NS-MAX-Y, NS-MID-X, NS-MID-Y, NS-HEIGHT, and NS-WIDTH.
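As a sketch of how the field-extraction macros might be combined with SLET, the following computes the area of a view's bounds rectangle. It assumes that v is an NS:NS-VIEW (or another object that answers "bounds") and that the bridge is loaded; the function name VIEW-AREA is hypothetical.

```lisp
;; Hypothetical helper, not part of the bridge.
(defun view-area (v)
  "Return the area of the bounds rectangle of the view V."
  ;; SLET allocates temporary storage for the returned rectangle and binds
  ;; R to it for the duration of the body.
  (ccl::slet ((r (ccl::send v 'bounds)))
    ;; NS-WIDTH and NS-HEIGHT read fields of the rectangle as Lisp floats.
    (* (ccl::ns-width r) (ccl::ns-height r))))
```

Only the product, an ordinary Lisp number, leaves the SLET body; the rectangle itself must not be used once the body has exited.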

Note that there is also a send-super/stret for use within methods. Like send-super, it ignores any shadowing methods in a subclass, and calls the version of a method which belongs to its superclass.

13.4.3. Variable-Arity Messages

There are a few messages in Cocoa which take variable numbers of arguments. Perhaps the most common examples involve formatted strings:

[NSString stringWithFormat: @"%f %f", x, y]
      

In Lisp, this would be written:

(send (find-class 'ns:ns-string)
      :string-with-format #@"%f %f"
      (:double-float x :double-float y))
      

Note that it's necessary to specify the foreign types of the variables (in this example, :double-float), because the compiler has no general way of knowing these types. (You might think that it could parse the format string, but this would only work for format strings which are not determined at runtime.)

Because the Objective-C runtime system does not provide any information on which messages are variable arity, they must be explicitly declared. The standard variable arity messages in Cocoa are predeclared by the bridge. If you need to declare a new variable arity message, use (DEFINE-VARIABLE-ARITY-MESSAGE "myVariableArityMessage:").

13.4.4. Optimization

The bridge works fairly hard to optimize message sends, when it has enough information to do so. There are two cases when it does. In either, a message send should be nearly as efficient as when writing in Objective-C.

The first case is when both the message and the receiver's class are known at compile-time. In general, the only way the receiver's class is known is if you declare it, which you can do with either a DECLARE or a THE form. For example:

(send (the ns:ns-window w) 'center)
	  

Note that there is no way in Objective-C to name the class of a class. Thus the bridge provides a declaration, @METACLASS. The type of an instance of "NSColor" is ns:ns-color. The type of the class "NSColor" is (@metaclass ns:ns-color):

(let ((c (find-class 'ns:ns-color)))
  (declare ((ccl::@metaclass ns:ns-color) c))
  (send c 'white-color))
      

The other case that allows optimization is when only the message is known at compile-time, but its type signature is unique. Of the more-than-6000 messages currently provided by Cocoa, only about 50 of them have nonunique type signatures.

An example of a message with a type signature that is not unique is SET. It returns VOID for NSColor, but ID for NSSet. In order to optimize sends of messages with nonunique type signatures, the class of the receiver must be declared at compile-time.

If the type signature is nonunique or the message is unknown at compile-time, then a slower runtime call must be used.

When the receiver's class is unknown, the bridge's ability to optimize relies on a type-signature table which it maintains. When first loaded, the bridge initializes this table by scanning every method of every Objective-C class. When new methods are defined later, the table must be updated. This happens automatically when you define methods in Lisp. After any other major change, such as loading an external framework, you should rebuild the table:

? (update-type-signatures)
      

Because send and its relatives send-super, send/stret, and send-super/stret are macros, they cannot be funcalled, applied, or passed as arguments to functions.

To work around this, there are function equivalents to them: %send, %send-super, %send/stret, and %send-super/stret. However, these functions should be used only when the macros will not do, because they are unable to optimize.
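A sketch of a case where the functional form is needed: the message is supplied at run time as a list, so it has to be applied rather than written out in a send form. The function name SEND-TO-ALL is hypothetical, and the sketch assumes that %send takes the receiver followed by the same message arguments that send does.

```lisp
;; Hypothetical helper, not part of the bridge.
(defun send-to-all (objects &rest message)
  "Send MESSAGE, given in the same form as the arguments to SEND, to each of OBJECTS."
  (dolist (o objects)
    ;; SEND is a macro and cannot be applied; %SEND is an ordinary function.
    (apply #'ccl::%send o message)))
```

With a list of windows in the hypothetical variable windows, (send-to-all windows 'center) or (send-to-all windows :set-alpha-value 0.5) would send the message to each window in turn, at the cost of the slower runtime dispatch described above.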

13.5. Defining Objective-C Classes

You can define your own foreign classes, which can then be passed to foreign functions; the methods which you implement in Lisp will be made available to the foreign code as callbacks.

You can also define subclasses of existing classes, implementing your subclass in Lisp even though the parent class was in Objective-C. One such subclass is CCL::NS-LISP-STRING. It is also particularly useful to make subclasses of NS-WINDOW-CONTROLLER.

We can use the MOP to define new Objective-C classes, but we have to do something a little funny: the :METACLASS that we'd want to use in a DEFCLASS option generally doesn't exist until we've created the class (recall that Objective-C classes have, for the sake of argument, unique and private metaclasses.) We can sort of sleaze our way around this by specifying a known Objective-C metaclass object name as the value of the DEFCLASS :METACLASS object; the metaclass of the root class NS:NS-OBJECT, NS:+NS-OBJECT, makes a good choice. To make a subclass of NS:NS-WINDOW (that, for simplicity's sake, doesn't define any new slots), we could do:

(defclass example-window (ns:ns-window)
  ()
  (:metaclass ns:+ns-object))
    

That'll create a new Objective-C class named EXAMPLE-WINDOW whose metaclass is the class named +EXAMPLE-WINDOW. The class will be an object of type OBJC:OBJC-CLASS, and the metaclass will be of type OBJC:OBJC-METACLASS. EXAMPLE-WINDOW will be a subclass of NS-WINDOW.

13.5.1. Defining classes with foreign slots

If a slot specification in an Objective-C class definition contains the keyword :FOREIGN-TYPE, the slot will be a "foreign slot" (i.e. an Objective-C instance variable). Be aware that it is an error to redefine an Objective-C class so that its foreign slots change in any way, and Clozure CL doesn't do anything consistent when you try to.

The value of the :FOREIGN-TYPE initarg should be a foreign type specifier. For example, if we wanted (for some reason) to define a subclass of NS:NS-WINDOW that kept track of the number of key events it had received (and needed an instance variable to keep that information in), we could say:

(defclass key-event-counting-window (ns:ns-window)
  ((key-event-count :foreign-type :int
                    :initform 0
                    :accessor window-key-event-count))
  (:metaclass ns:+ns-object))
      

Foreign slots are always SLOT-BOUNDP, and the initform above is redundant: foreign slots are initialized to binary 0.
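The two claims above (always bound, initialized to binary 0) can be checked interactively. The following is a minimal sketch, assuming the Cocoa bridge has already been loaded (for example with (require "COCOA")); the class TICK-COUNTER and its slot are hypothetical names introduced only for this illustration.

```lisp
;; Assumes the Cocoa bridge is already loaded, e.g. via (require "COCOA").
;; TICK-COUNTER and TICK-COUNT are hypothetical names used only here.
(defclass tick-counter (ns:ns-object)
  ((tick-count :foreign-type :int
               :accessor tick-count))
  (:metaclass ns:+ns-object))

(let ((counter (make-instance 'tick-counter)))
  ;; Foreign slots are always bound, so no initform is needed.
  (format t "bound? ~S~%" (slot-boundp counter 'tick-count))
  ;; The instance variable starts out as binary 0.
  (format t "initial value: ~D~%" (tick-count counter))
  ;; The accessor writes through to the Objective-C instance variable.
  (setf (tick-count counter) 3)
  (format t "after setf: ~D~%" (tick-count counter)))
```

The class derives from NS:NS-OBJECT rather than NS:NS-WINDOW so that creating an instance has no window-system side effects.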

13.5.2. Defining classes with Lisp slots

A slot specification in an Objective-C class definition that doesn't contain the :FOREIGN-TYPE initarg defines a pretty-much normal lisp slot that'll happen to be associated with "an instance of a foreign class". For instance:

(defclass hemlock-buffer-string (ns:ns-string)
  ((hemlock-buffer :type hi::hemlock-buffer
                   :initform hi::%make-hemlock-buffer
                   :accessor string-hemlock-buffer))
  (:metaclass ns:+ns-object))
	  

As one might expect, this has memory-management implications: we have to maintain an association between a MACPTR and a set of lisp objects (its slots) as long as the Objective-C instance exists, and we have to ensure that the Objective-C instance exists (does not have its -dealloc method called) while lisp is trying to think of it as a first-class object that can't be "deallocated" while it's still possible to reference it. Associating one or more lisp objects with a foreign instance is something that's often very useful; if you were to do this "by hand", you'd have to face many of the same memory-management issues.

13.6. Defining Objective-C Methods

In Objective-C, unlike in CLOS, every method belongs to some particular class. This is probably not a strange concept to you, because C++ and Java do the same thing. When you use Lisp to define Objective-C methods, it is only possible to define methods belonging to Objective-C classes which have been defined in Lisp.

You can use either of two different macros to define methods on Objective-C classes. define-objc-method accepts a two-element list containing a message selector name and a class name, and a body. objc:defmethod superficially resembles the normal CLOS defmethod, but creates methods on Objective-C classes with the same restrictions as those created by define-objc-method.

13.6.1. Using define-objc-method

As described in the section Calling Objective-C Methods, the names of Objective-C methods are broken into pieces, each piece followed by a parameter. The types of all parameters must be explicitly declared.

Consider a few examples, meant to illustrate the use of define-objc-method. Let us define a class to use in them:

(defclass data-window-controller (ns:ns-window-controller)
  ((window :foreign-type :id :accessor window)
   (data :initform nil :accessor data))
  (:metaclass ns:+ns-object))
      

There's nothing special about this class. It inherits from ns:ns-window-controller. It has two slots: window is a foreign slot, stored in the Objective-C world; and data is an ordinary slot, stored in the Lisp world.

Here is an example of how to define a method which takes no arguments:

(define-objc-method ((:id get-window)
                     data-window-controller)
    (window self))
      

The return type of this method is the foreign type :id, which is used for all Objective-C objects. The name of the method is get-window. The body of the method is the single line (window self). The variable self is bound, within the body, to the instance that is receiving the message. The call to window uses the CLOS accessor to get the value of the window field.

Here's an example that takes a parameter. Notice that the name of the method without a parameter was an ordinary symbol, but with a parameter, it's a keyword:

(define-objc-method ((:id :init-with-multiplier (:int multiplier))
                     data-window-controller)
  (setf (data self) (make-array 100))
  (dotimes (i 100)
    (setf (aref (data self) i)
          (* i multiplier)))
  self)
      

To Objective-C code that uses the class, the name of this method is initWithMultiplier:. The name of the parameter is multiplier, and its type is :int. The body of the method does some meaningless things. Then it returns self, because this is an initialization method.

Here's an example with more than one parameter:

(define-objc-method ((:id :init-with-multiplier (:int multiplier)
                          :and-addend (:int addend))
                     data-window-controller)
  (setf (data self) (make-array 100))
  (dotimes (i 100)
    (setf (aref (data self) i)
          (+ (* i multiplier)
             addend)))
  self)
      

To Objective-C, the name of this method is initWithMultiplier:andAddend:. Both parameters are of type :int; the first is named multiplier, and the second is addend. Again, the method returns self.

Here is a method that does not return any value, a so-called "void method". Where our other methods said :id, this one says :void for the return type:

(define-objc-method ((:void :take-action (:id sender))
                     data-window-controller)
  (declare (ignore sender))
  (dotimes (i 100)
    (setf (aref (data self) i)
          (- (aref (data self) i)))))
      

This method would be called takeAction: in Objective-C. The convention for methods that are going to be used as Cocoa actions is that they take one parameter, which is the object responsible for triggering the action. However, this method doesn't actually need to use that parameter, so it explicitly ignores it to avoid a compiler warning. As promised, the method doesn't return any value.

There is also an alternate syntax, illustrated here. The following two method definitions are equivalent:

(define-objc-method ("applicationShouldTerminate:"
                     "LispApplicationDelegate")
    (:id sender :<BOOL>)
    (declare (ignore sender))
    nil)
  
(define-objc-method ((:<BOOL>
                        :application-should-terminate sender)
                       lisp-application-delegate)
    (declare (ignore sender))
    nil)
      

13.6.2. Using objc:defmethod

The macro OBJC:DEFMETHOD can be used to define Objective-C methods. It looks superficially like CL:DEFMETHOD in some respects.

Its syntax is

(OBJC:DEFMETHOD name-and-result-type 
               ((receiver-arg-and-class) &rest other-args) 
      &body body)
      

name-and-result-type is either an Objective-C message name, for methods that return a value of type :ID, or a list containing an Objective-C message name and a foreign type specifier for methods with a different foreign result type.

receiver-arg-and-class is a two-element list whose first element is a variable name and whose second element is the Lisp name of an Objective-C class or metaclass. The receiver variable name can be any bindable lisp variable name, but SELF might be a reasonable choice. The receiver variable is declared to be "unsettable"; i.e., it is an error to try to change the value of the receiver in the body of the method definition.

other-args are either variable names (denoting parameters of type :ID) or 2-element lists whose first element is a variable name and whose second element is a foreign type specifier.

Consider this example:

(objc:defmethod (#/characterAtIndex: :unichar)
    ((self hemlock-buffer-string) (index :<NSUI>nteger))
  ...)
      

The method characterAtIndex:, when invoked on an object of class HEMLOCK-BUFFER-STRING with an additional argument of type :<NSUI>nteger, returns a value of type :unichar.

Arguments that wind up as some pointer type other than :ID (e.g. pointers, records passed by value) are represented as typed foreign pointers, so that the higher-level, type-checking accessors can be used on arguments of type :ns-rect, :ns-point, and so on.

Within the body of methods defined via OBJC:DEFMETHOD, the local function CL:CALL-NEXT-METHOD is defined. It isn't quite as general as CL:CALL-NEXT-METHOD is when used in a CLOS method, but it has some of the same semantics. It accepts as many arguments as are present in the containing method's other-args list and invokes the version of the containing method that would have been invoked on instances of the receiver's class's superclass, passing it the receiver and the other provided arguments. (The idiom of passing the current method's arguments to the next method is common enough that the CALL-NEXT-METHOD in OBJC:DEFMETHODs should probably do this if it receives no arguments.)
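As an illustration of passing arguments on explicitly, here is a minimal sketch that overrides NSWindow's setTitle: and then defers to the inherited implementation. It assumes the Cocoa bridge is loaded; TITLE-LOGGING-WINDOW is a hypothetical class defined only for this example.

```lisp
;; Assumes the Cocoa bridge is already loaded, e.g. via (require "COCOA").
;; TITLE-LOGGING-WINDOW is a hypothetical class used only here.
(defclass title-logging-window (ns:ns-window)
  ()
  (:metaclass ns:+ns-object))

;; setTitle: takes one argument of type :ID and returns nothing,
;; so the result type is given explicitly as :VOID.
(objc:defmethod (#/setTitle: :void) ((self title-logging-window) title)
  (format t "~&Changing window title to ~S~%" title)
  ;; other-args has one element, so CALL-NEXT-METHOD takes one argument;
  ;; this invokes the superclass (NSWindow) implementation.
  (call-next-method title))
```

Because CALL-NEXT-METHOD does not default its arguments here, TITLE has to be passed along by hand.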

A method defined via OBJC:DEFMETHOD that returns a structure "by value" can do so by returning a record created via MAKE-GCABLE-RECORD, by returning the value returned via CALL-NEXT-METHOD, or by other similar means. Behind the scenes, there may be a pre-allocated instance of the record type (used to support native structure-return conventions), and any value returned by the method body will be copied to this internal record instance. Within the body of a method defined with OBJC:DEFMETHOD that's declared to return a structure type, the local macro OBJC:RETURNING-FOREIGN-STRUCT can be used to access the internal structure. For example:

(objc:defmethod (#/reallyTinyRectangleAtPoint: :ns-rect) 
  ((self really-tiny-view) (where :ns-point))
  (objc:returning-foreign-struct (r)
    (ns:init-ns-rect r (ns:ns-point-x where) (ns:ns-point-y where)
                        single-float-epsilon single-float-epsilon)
    r))
       

If the OBJC:DEFMETHOD creates a new method, then it displays a message to that effect. These messages may be helpful in catching errors in the names of method definitions. In addition, if an OBJC:DEFMETHOD form redefines a method in a way that changes its type signature, Clozure CL signals a continuable error.

13.6.3. Method Redefinition Constraints

Objective-C was not designed, as Lisp was, with runtime redefinition in mind. So, there are a few constraints about how and when you can replace the definition of an Objective-C method. Currently, if you break these rules, nothing will collapse, but the behavior will be confusing; so don't.

Objective-C methods can be redefined at runtime, but their signatures shouldn't change. That is, the types of the arguments and the return type have to stay the same. The reason for this is that changing the signature changes the selector which is used to call the method.

When a method has already been defined in one class, and you define it in a subclass, shadowing the original method, they must both have the same type signature. There is no such constraint, though, if the two classes aren't related and the methods just happen to have the same name.

13.7. Loading Frameworks

On Mac OS X, a framework is a structured directory containing one or more shared libraries along with metadata such as C and Objective-C header files. In some cases, frameworks may also contain additional items such as executables.

Loading a framework means opening the shared libraries and processing any declarations so that Clozure CL can subsequently call its entry points and use its data structures. Clozure CL provides the function OBJC:LOAD-FRAMEWORK for this purpose.

(OBJC:LOAD-FRAMEWORK framework-name interface-dir)
    

framework-name is a string that names the framework (for example, "Foundation", or "Cocoa"), and interface-dir is a keyword that names the set of interface databases associated with the named framework (for example, :foundation, or :cocoa).

Assuming that interface databases for the named frameworks exist on the standard search path, OBJC:LOAD-FRAMEWORK finds and initializes the framework bundle by searching OS X's standard framework search paths. Loading the named framework may create new Objective-C classes and methods, add foreign type descriptions and entry points, and adjust Clozure CL's dispatch functions.

If interface databases don't exist for a framework you want to use, you will need to create them. For more information about creating interface databases, see Creating new interface directories.

13.8. How Objective-C Names are Mapped to Lisp Symbols

There is a standard set of naming conventions for Cocoa classes, messages, etc. As long as they are followed, the bridge is fairly good at automatically translating between Objective-C and Lisp names.

For example, "NSOpenGLView" becomes ns:ns-opengl-view; "NSURLHandleClient" becomes ns:ns-url-handle-client; and "nextEventMatchingMask:untilDate:inMode:dequeue:" becomes (:next-event-matching-mask :until-date :in-mode :dequeue). What a mouthful.

To see how a given Objective-C or Lisp name will be translated by the bridge, you can use the following functions:

(ccl::objc-to-lisp-classname string)
(ccl::lisp-to-objc-classname symbol)
(ccl::objc-to-lisp-message string)
(ccl::lisp-to-objc-message keyword-list)
(ccl::objc-to-lisp-init string)
(ccl::lisp-to-objc-init keyword-list)

Of course, there will always be exceptions to any naming convention. Please tell us on the mailing lists if you come across any name translation problems that seem to be bugs. Otherwise, the bridge provides two ways of dealing with exceptions:

First, you can pass a string as the class name of MAKE-OBJC-INSTANCE and as the message to SEND. These strings will be directly interpreted as Objective-C names, with no translation. This is useful for a one-time exception. For example:

(ccl::make-objc-instance "WiErDclass")
(ccl::send o "WiErDmEsSaGe:WithARG:" x y)
    

Alternatively, you can define a special translation rule for your exception. This is useful for an exceptional name that you need to use throughout your code. Some examples:

(ccl::define-classname-translation "WiErDclass" weird-class)
(ccl::define-message-translation "WiErDmEsSaGe:WithARG:" (:weird-message :with-arg))
(ccl::define-init-translation "WiErDiNiT:WITHOPTION:" (:weird-init :option))
    

The normal rule in Objective-C names is that each word begins with a capital letter (except possibly the first). Using this rule literally, "NSWindow" would be translated as N-S-WINDOW, which seems wrong. "NS" is a special word in Objective-C that should not be broken at each capital letter. Likewise "URL", "PDF", "OpenGL", etc. Most common special words used in Cocoa are already defined in the bridge, but you can define new ones as follows:

(ccl::define-special-objc-word "QuickDraw")
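To make the role of special words concrete, here is a small sketch of the splitting rule in portable Common Lisp. It is not the bridge's implementation; the function names and the special-word list are invented for this illustration, and the real translation handles more cases (packages, message keywords, init prefixes).

```lisp
;; A hypothetical list of special words; the bridge keeps its own.
(defparameter *sketch-special-words* '("OpenGL" "URL" "PDF" "NS"))

(defun sketch-split-objc-name (name &optional (special-words *sketch-special-words*))
  "Split NAME into words: a special word where one matches, otherwise
one character followed by everything up to the next capital letter."
  (let ((words '())
        (i 0)
        (n (length name)))
    (loop while (< i n)
          do (let ((special (find-if (lambda (w)
                                       (let ((end (+ i (length w))))
                                         (and (<= end n)
                                              (string= w name :start2 i :end2 end))))
                                     special-words)))
               (if special
                   (progn (push special words)
                          (incf i (length special)))
                   (let ((end (or (position-if #'upper-case-p name :start (1+ i))
                                  n)))
                     (push (subseq name i end) words)
                     (setf i end)))))
    (nreverse words)))

(defun sketch-objc-to-lisp-name (name &optional (special-words *sketch-special-words*))
  "Join the upcased words of NAME with hyphens."
  (format nil "~{~A~^-~}"
          (mapcar #'string-upcase (sketch-split-objc-name name special-words))))

(sketch-objc-to-lisp-name "NSWindow" '())     ; => "N-S-WINDOW"
(sketch-objc-to-lisp-name "NSWindow")         ; => "NS-WINDOW"
(sketch-objc-to-lisp-name "NSOpenGLView")     ; => "NS-OPENGL-VIEW"
```

With an empty special-word list the literal rule produces N-S-WINDOW, which is the result the text above calls wrong; adding "NS" (and "OpenGL", "URL") keeps those words intact.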
    

Note that message keywords in a SEND such as (SEND V :MOUSE P :IN-RECT R) may look like the keyword arguments in a Lisp function call, but they really aren't. All keywords must be present and the order is significant. Neither (:IN-RECT :MOUSE) nor (:MOUSE) translates to "mouse:inRect:".

Also, as a special exception, an "init" prefix is optional in the initializer keywords, so (MAKE-OBJC-INSTANCE 'NS-NUMBER :INIT-WITH-FLOAT 2.7) can also be expressed as (MAKE-OBJC-INSTANCE 'NS-NUMBER :WITH-FLOAT 2.7).

Chapter 14. Platform-specific notes

14.1. Overview

The documentation and whatever experience you may have in using Clozure CL under Linux should also apply to using it under Darwin/MacOS X and FreeBSD. There are some differences between the platforms, and these differences are sometimes exposed in the implementation.

14.1.1. Differences Between 32-bit and 64-bit implementations

Fixnums on 32-bit systems are 30 bits long, and are in the range -536870912 through 536870911. Fixnums on 64-bit systems are 61 bits long, and are in the range -1152921504606846976 through 1152921504606846975. (see Section 16.2.4, “Tagging scheme”)
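The ranges quoted above follow directly from the bit widths. A minimal sketch in portable Common Lisp (the helper name FIXNUM-RANGE is made up for this example) reproduces them and shows how to inspect the limits of the running image:

```lisp
(defun fixnum-range (bits)
  "Return the inclusive (MIN MAX) range of a two's-complement
integer that is BITS bits wide."
  (let ((half (expt 2 (1- bits))))
    (list (- half) (1- half))))

(fixnum-range 30)   ; => (-536870912 536870911)
(fixnum-range 61)   ; => (-1152921504606846976 1152921504606846975)

;; The limits of whichever Lisp is actually running:
(list most-negative-fixnum most-positive-fixnum)
```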

Since we have much larger fixnums on 64-bit systems, INTERNAL-TIME-UNITS-PER-SECOND is 1000000 on 64-bit systems but remains 1000 on 32-bit systems. This enables much finer grained timing on 64-bit systems.
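Code that measures elapsed time should divide by INTERNAL-TIME-UNITS-PER-SECOND rather than assume either value. A minimal sketch (ELAPSED-SECONDS is a hypothetical helper, not part of Clozure CL):

```lisp
(defun elapsed-seconds (thunk)
  "Call THUNK and return the elapsed real time in seconds, as a rational."
  (let ((start (get-internal-real-time)))
    (funcall thunk)
    (/ (- (get-internal-real-time) start)
       internal-time-units-per-second)))

;; Works unchanged whether the unit is a millisecond or a microsecond.
(format t "~,6F seconds~%"
        (float (elapsed-seconds (lambda () (sleep 0.05)))))
```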

14.1.2. File-system case

Darwin and MacOS X use HFS+ file systems by default; HFS+ file systems are usually case-insensitive. Most of Clozure CL's filesystem and pathname code assumes that the underlying filesystem is case-sensitive; this assumption extends to functions like EQUAL, which assumes that #p"FOO" and #p"foo" denote different, un-EQUAL filenames. Since Darwin/MacOS X can also use UFS and NFS filesystems, the opposite assumption would be no more correct than the one that's currently made.
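When a program needs to guard against two names colliding on a case-insensitive volume, it has to make the comparison itself. A minimal sketch, using a hypothetical helper SAME-NAME-IGNORING-CASE-P:

```lisp
(defun same-name-ignoring-case-p (path-a path-b)
  "True if the namestrings of PATH-A and PATH-B differ at most in case."
  (string-equal (namestring path-a) (namestring path-b)))

;; EQUAL treats the two pathnames as different, as described above.
(equal #p"FOO" #p"foo")

;; The helper treats them as the same name, as HFS+ usually would.
(same-name-ignoring-case-p #p"FOO" #p"foo")   ; => T
```

This only compares names as written; it does not consult the file system to find out whether the volume is in fact case-insensitive.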

Whatever the best solution to this problem turns out to be, there are some practical considerations. Doing:

? (save-application "DPPCCL")
	  

on 32-bit DarwinPPC has the unfortunate side-effect of trying to overwrite the Darwin Clozure CL kernel, "dppccl", on a case-insensitive filesystem.

To work around this, the Darwin Clozure CL kernel expects the default heap image file name to be the kernel's own filename with the string ".image" appended, so the idiom would be:

? (save-application "dppccl.image")
	  

14.1.3. Line Termination Characters

MacOSX effectively supports two distinct line-termination conventions. Programs in its Darwin substrate follow the Unix convention of recognizing #\LineFeed as a line terminator; traditional MacOS programs use #\Return for this purpose. Many modern GUI programs try to support several different line-termination conventions (on the theory that the user shouldn't be too concerned about what conventions are used and that it probably doesn't matter). Sometimes this is true, other times ... not so much.

Clozure CL follows the Unix convention on both Darwin and LinuxPPC, but offers some support for reading and writing files that use other conventions (including traditional MacOS conventions) as well.

This support (and anything like it) is by nature heuristic: it can successfully hide the distinction between newline conventions much of the time, but could mistakenly change the meaning of otherwise correct programs (typically when files contain both #\Return and #\Linefeed characters or when files contain mixtures of text and binary data.) Because of this concern, the default settings of some of the variables that control newline translation and interpretation are somewhat conservative.
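The hazard with mixed input is easy to demonstrate. The sketch below is not Clozure CL's translation mechanism; CR-TO-LF and COUNT-LINEFEEDS are hypothetical helpers that apply the simplest possible rule, replacing every #\Return with #\Linefeed.

```lisp
(defun cr-to-lf (string)
  "Return a copy of STRING with every #\\Return replaced by #\\Linefeed."
  (substitute #\Linefeed #\Return string))

(defun count-linefeeds (string)
  (count #\Linefeed string))

;; A traditional MacOS line ending translates cleanly: one line break.
(count-linefeeds (cr-to-lf (format nil "one~Ctwo" #\Return)))               ; => 1

;; A Return followed by a Linefeed becomes two line breaks,
;; which changes the meaning of the text.
(count-linefeeds (cr-to-lf (format nil "one~C~Ctwo" #\Return #\Linefeed)))  ; => 2
```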

Although the issue of multiple newline conventions primarily affects MacOSX users, the functionality described here is available under LinuxPPC as well (and may occasionally be useful there.)

None of this addresses issues related to the third newline convention ("CRLF") in widespread use (since that convention isn't native to any platform on which Clozure CL currently runs). If Clozure CL is ever ported to such a platform, that issue might be revisited.

Note that some MacOS programs (including some versions of commercial MCL) may use HFS file type information to recognize TEXT and other file types and so may fail to recognize files created with Clozure CL or other Darwin applications (regardless of line termination issues.)

Unless otherwise noted, the symbols mentioned in this documentation are exported from the CCL package.

14.1.4. Single-precision trig & transcendental functions

Despite what Darwin's man pages say, early versions of its math library (up to and including at least OSX 10.2 (Jaguar)) don't implement single-precision variants of the transcendental and trig functions (#_sinf, #_atanf, etc.). Clozure CL worked around this by coercing single-precision args to double-precision, calling the double-precision version of the math library function, and coercing the result back to a SINGLE-FLOAT. These steps can introduce rounding errors (and potentially overflow conditions) that might not be present or as severe if true 32-bit variants were available.
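The three steps of that workaround can be written out in portable Common Lisp. This is only a sketch of the technique, not the code Clozure CL uses; SINGLE-SIN-VIA-DOUBLE is a hypothetical name.

```lisp
(defun single-sin-via-double (x)
  "Compute (sin X) for a SINGLE-FLOAT X by way of double precision."
  (declare (type single-float x))
  (let* ((wide   (coerce x 'double-float))   ; 1. widen the argument
         (result (sin wide)))                ; 2. double-precision call
    (coerce result 'single-float)))          ; 3. narrow the result

(single-sin-via-double 0.5f0)
```

The final coercion is where the extra rounding happens; for functions with a large range, it is also where an overflow could be signaled if the double-precision result does not fit in a SINGLE-FLOAT.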

14.1.5. Shared libraries

Darwin/MacOS X distinguishes between "shared libraries" and "bundles" or "extensions"; Linux and FreeBSD don't. In Darwin, "shared libraries" have the file type "dylib" : the expectation is that this class of file is linked against when executable files are created and loaded by the OS when the executable is launched. The latter class - "bundles/extensions" - are expected to be loaded into and unloaded from a running application, via a mechanism like the one used by Clozure CL's OPEN-SHARED-LIBRARY function.

14.2. Unix/Posix/Darwin Features

Clozure CL has several convenience functions which allow you to make Posix (portable Unix) calls without having to use the foreign-function interface. Each of these corresponds directly to a single Posix function call, as it might be made in C. There is no attempt to make these calls correspond to Lisp idioms, such as setf. This means that their behavior is simple and predictable.

For working with environment variables, there are CCL::GETENV and CCL::SETENV.

For working with user and group IDs, there are CCL::GETUID, CCL::SETUID, and CCL::SETGID. To find the home directory of an arbitrary user, as set in the user database (/etc/passwd), there is CCL::GET-USER-HOME-DIR.

For process IDs, there is CCL::GETPID.

For the system() function, there is CCL::OS-COMMAND. Ordinarily, it is better - both more efficient and more predictable - to use the features described in Chapter 8, Running Other Programs as Subprocesses. However, sometimes you may want to specifically ask the shell to invoke a command for you.
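A short session using several of these functions might look like the following sketch. It assumes Clozure CL on a Unix-like platform; the environment variable name CCL_DOC_EXAMPLE is made up for the illustration.

```lisp
;; SETENV returns zero on success, like the underlying Posix call.
(let ((status (ccl::setenv "CCL_DOC_EXAMPLE" "hello")))
  (format t "setenv returned ~D~%" status))

;; GETENV returns the value as a string, or NIL if the variable is unset.
(format t "CCL_DOC_EXAMPLE = ~S~%" (ccl::getenv "CCL_DOC_EXAMPLE"))

;; Process and user IDs are plain integers.
(format t "pid ~D, uid ~D~%" (ccl::getpid) (ccl::getuid))

;; The home directory comes from the user database, not from $HOME.
(format t "home directory: ~S~%" (ccl::get-user-home-dir (ccl::getuid)))
```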

14.3. Cocoa Programming in Clozure CL

Cocoa is one of Apple's APIs for GUI programming; for most purposes, development is considerably faster with Cocoa than with the alternatives. You should have a little familiarity with it, to better understand this section.

A small sample Cocoa program can be invoked by evaluating (REQUIRE 'TINY) and then (CCL::TINY-SETUP). This program provides a simple example of using several of the bridge's capabilities.

The Tiny demo creates Cocoa objects dynamically, at runtime, which is always an option. However, for large applications, it is usually more convenient to create your objects with Apple Interface Builder, and store them in .nib files to be loaded when needed. Both approaches can be freely mixed in a single program.

14.3.1. The Command Line and the Window System

Clozure CL is ordinarily a command-line application (it doesn't have a connection to the OSX Window server, doesn't have its own menubar or dock icon, etc.) By opening some libraries and jumping through some hoops, it's able to sort of transform itself into a full-fledged GUI application (while retaining its original TTY-based listener.) The general idea is that this hybrid environment can be used to test and prototype UI ideas and the resulting application can eventually be fully transformed into a bundled, double-clickable application. This is to some degree possible, but there needs to be a bit more infrastructure in place before many people would find it easy.

Cocoa applications use the NSLog function to write informational/warning/error messages to the application's standard output stream. When launched by the Finder, a GUI application's standard output is diverted to a logging facility that can be monitored with the Console application (found in /Applications/Utilities/Console.app). In the hybrid environment, the application's standard output stream is usually the initial listener's standard output stream. With two different buffered stream mechanisms trying to write to the same underlying Unix file descriptor, it's not uncommon to see NSLog output mixed with lisp output on the initial listener.

14.3.2. Writing (and reading) Cocoa code

The syntax of the constructs used to define Cocoa classes and methods has changed a bit (it was never documented outside of the source code and never too well documented at all), largely as the result of functionality offered by Randall Beer's bridge; the “standard name-mapping conventions” referenced below are described in his CocoaBridgeDoc.txt file, as are the constructs used to invoke (“send messages to”) Cocoa methods.

All of the symbols described below are currently internal to the CCL package.

ccl::@class
ccl::@selector
ccl::define-objc-method
ccl::define-objc-class-method

14.3.3. The Application Kit and Multiple Threads

The Cocoa API is broken into several pieces. The Application Kit, affectionately called AppKit, is the one which deals with window management, drawing, and handling events. AppKit really wants all these things to be done by a "distinguished thread".

Apple has published some guidelines which discuss these issues in some detail; see the Apple Multithreading Documentation, and in particular the guidelines on Using the Application Kit from Multiple Threads. The upshot is that there can sometimes be unexpected behavior when objects are created in threads other than the distinguished event thread; e.g., the event thread sometimes starts performing operations on objects that haven't been fully initialized.

It's certainly more convenient to do certain types of exploratory programming by typing things into a listener or evaluating a “defun” in an Emacs buffer; it may sometimes be necessary to be aware of this issue while doing so.

Each thread in the Cocoa runtime system is expected to maintain a current “autorelease pool” (an instance of the NSAutoreleasePool class); newly created objects are often added to the current autorelease pool (via the -autorelease method), and periodically the current autorelease pool is sent a “-release” message, which causes it to send “-release” messages to all of the objects that have been added to it.

If the current thread doesn't have a current autorelease pool, the attempt to autorelease any object will result in a severe-looking warning being written via NSLog. The event thread maintains an autorelease pool (it releases the current pool after each event is processed and creates a new one for the next event), so code that only runs in that thread should never provoke any of these severe-looking NSLog messages.

To try to suppress these messages (and still participate in the Cocoa memory management scheme), each listener thread (the initial listener and any created via the “New Listener” command in the IDE) is given a default autorelease pool; there are REPL colon-commands for manipulating the current listener's “toplevel autorelease pool”.

In the current scheme, every time that Cocoa calls lisp code, a lisp error handler is established which maps any lisp conditions to ObjC exceptions and arranges that this exception is raised when the callback to lisp returns. Whenever lisp code invokes a Cocoa method, it does so with an ObjC exception handler in place; this handler maps ObjC exceptions to lisp conditions and signals those conditions.

Any unhandled lisp error or ObjC exception that occurs during the execution of the distinguished event thread's event loop causes a message to be NSLog'ed and the event loop to (try to) continue execution. Any error that occurs in other threads is handled at the point of the outermost Cocoa method invocation. (Note that the error is not necessarily “handled” in the dynamic context in which it occurs.)

Both of these behaviors could possibly be improved; both of them seem to be substantial improvements over previous behaviors (where, for instance, a misspelled message name typically terminated the application.)

14.3.4. Acknowledgement

The Cocoa bridge was originally developed, and generously contributed by, Randall Beer.

14.4. Building an Application Bundle

You may have noticed that (require "COCOA") takes a long time to load. It is possible to avoid this by saving a Lisp heap image which has everything already loaded. There is an example file which allows you to do this, "ccl/examples/cocoa-application.lisp", by producing a double-clickable application which runs your program. First, load your own program. Then, do:

? (require "COCOA-APPLICATION")
    

When it finishes, you should be able to double-click the Clozure CL icon in the ccl directory, to quickly start your program.

The OS may have already decided that Clozure CL.app isn't a valid executable bundle, and therefore won't let you double-click it. If this happens to you, to force it to reconsider, just update the last-modified time of the bundle. In Terminal:

> touch Clozure\ CL.app
    

There is one important caveat.

Because of the way that the ObjC bridge currently works, a saved image is dependent upon the exact versions of the Cocoa libraries which were present when it was saved. Specifically, the dependency lies in the interface database. So, for example, an application produced under OS X 10.3.5 will not work under OS X 10.3.6. This is inconvenient when you wish to distribute an application you have built this way.

When an image which had contained ObjC classes (which are also CLOS classes) is re-launched, those classes are "revived": all preexisting classes have their addresses updated destructively, so that existing subclass/superclass/metaclass relationships are maintained. It's not possible (and may never be) to preserve foreign instances across SAVE-APPLICATION. (It may be the case that NSArchiver and NSCoder and related classes offer some approximation of that.)

14.5. Recommended Reading

Cocoa Documentation

This is the top page for all of Apple's documentation on Cocoa. If you are unfamiliar with Cocoa, it is a good place to start.

Foundation Reference for Objective-C

This is one of the two most important Cocoa references; it covers all of the basics, except for GUI programming. This is a reference, not a tutorial.

Application Kit Reference for Objective-C

This is the other; it covers GUI programming with Cocoa in considerable depth. This is a reference, not a tutorial.

Apple Developer Documentation

This is the site on which the above two documents are found; go here to find the documentation on any other Apple API. Also go here if you need general guidance about OS X, Carbon, Cocoa, Core Foundation, or Objective-C.

14.6. Operating-System Dictionary

[Function]

getenv name => value

Arguments and Values:

name---a string which is the name of an existing environment variable; case-sensitive

value---if there is an environment variable named name, its value, as a string; if there is not, NIL

Description:

Looks up the value of the environment variable named by name, in the OS environment.

[Function]

setenv name value => errno

Arguments and Values:

name---a string which is the name of a new or existing environment variable; case-sensitive

value---a string, to be the new value of the environment variable named by name

errno---zero if the function call completes successfully; otherwise, a platform-dependent integer which describes the problem

Description:

Sets the value of the environment variable named by name, in the OS environment. If there is no such environment variable, creates it.

[Function]

current-directory-name => path

Values:

path---a string, an absolute pathname in Posix format - with directory components separated by slashes

Description:

Looks up the current working directory of the Clozure CL process; unless it has been changed, this is the directory Clozure CL was started in.

[Function]

getuid => uid

Values:

uid---a non-negative integer, identifying a specific user account as defined in the OS user database

Description:

Returns the ("real") user ID of the current user.

[Function]

setuid uid => errno

Arguments and Values:

uid---a non-negative integer, identifying a specific user account as defined in the OS user database

errno---zero if the function call completes successfully; otherwise, a platform-dependent integer which describes the problem

Description:

Attempts to change the current user ID (both "real" and "effective"); fails unless the Clozure CL process has super-user privileges or the ID given is that of the current user.

[Function]

setgid gid => errno

Arguments and Values:

gid---a non-negative integer, identifying a specific group as defined in the OS user database

errno---zero if the function call completes successfully; otherwise, a platform-dependent integer which describes the problem

Description:

Attempts to change the current group ID (both "real" and "effective"); fails unless the Clozure CL process has super-user privileges or the ID given is that of a group to which the current user belongs.

[Function]

getpid => pid

Values:

pid---a non-negative integer, identifying an OS process

Description:

Returns the ID of the Clozure CL OS process.

[Function]

get-user-home-dir uid => path

Arguments and Values:

uid---a non-negative integer, identifying a specific user account as defined in the OS user database

path---a string, an absolute pathname in Posix format - with directory components separated by slashes; or NIL

Description:

Looks up and returns the defined home directory of the user identified by uid. This value comes from the OS user database, not from the $HOME environment variable. Returns NIL if there is no user with the ID uid.
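
The identity functions in this dictionary combine naturally. The sketch below reports the process ID, the real user ID, and that user's home directory; it prints a placeholder if the lookup returns NIL. The ccl:: prefix is an assumption made so the forms read whether or not the symbols are exported.

```lisp
;; Look up the home directory of the user running this Lisp.
(let* ((uid  (ccl::getuid))
       (home (ccl::get-user-home-dir uid)))
  (format t "~&Process ~D runs as user ~D; home directory: ~A~%"
          (ccl::getpid) uid (or home "<none>")))
```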

[Function]

os-command command-line => exit-code

Arguments and Values:

command-line---a string, obeying all the whitespace and escaping conventions required by the user's default system shell

exit-code---a non-negative integer, returned as the exit code of a subprocess; zero indicates success

Description:

Invokes the Posix function system(), which invokes the user's default system shell (such as sh or tcsh) as a new process, and has that shell execute command-line.

If the shell was able to find the command specified in command-line, then exit-code is the exit code of that command. If not, it is the exit code of the shell itself.

Notes:

By convention, an exit code of 0 indicates success. There are also other conventions; unfortunately, they are OS-specific, and the portable macros to decode their meaning are implemented by the system headers as C preprocessor macros. This means that there is no good, automated way to make them available to Lisp.
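
A minimal sketch of checking the result follows. It assumes a Unix-like system where the standard command true is on the shell's search path; only the zero/non-zero distinction is relied upon, since other codes are OS-specific.

```lisp
;; "true" is a standard Unix command that always exits with status 0.
(let ((exit-code (ccl::os-command "true")))
  (if (zerop exit-code)
      (format t "~&Command succeeded.~%")
      (format t "~&Command failed; raw exit code ~D~%" exit-code)))
```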

[Macro]

@class class-name

Arguments and Values:

class-name---a string which denotes an existing class name, or a symbol which can be mapped to such a string via the standard name-mapping conventions for class names

Description:

Used to refer to a known ObjC class by name. (Via the use of LOAD-TIME-VALUE, the results of a class-name -> class lookup are cached.)

@class is obsolete as of late 2004, because find-class now works on ObjC classes. It is described here only because some old code still uses it.

[Macro]

@selector string

Arguments and Values:

string---a string constant, used to canonically refer to an ObjC method selector

Description:

Used to refer to an ObjC method selector (method name). Uses LOAD-TIME-VALUE to cache the result of a string -> selector lookup.

[Macro]

objc:defmethod name-and-result-type ((receiver-arg-and-class) &rest other-args) &body body

Arguments and Values:

name-and-result-type---either an Objective-C message name, for methods that return a value of type :ID, or a list containing an Objective-C message name and a foreign type specifier for methods with a different foreign result type.

receiver-arg-and-class---a two-element list whose first element is a variable name and whose second element is the Lisp name of an Objective-C class or metaclass. The receiver variable name can be any bindable lisp variable name, but SELF might be a reasonable choice. The receiver variable is declared to be "unsettable"; i.e., it is an error to try to change the value of the receiver in the body of the method definition.

other-args---either variable names (denoting parameters of type :ID) or 2-element lists whose first element is a variable name and whose second element is a foreign type specifier.

Description:

Defines an Objective-C-callable method which implements the specified message selector for instances of the existing named Objective-C class.

For a detailed description of the features and restrictions of the OBJC:DEFMETHOD macro, see the section Using objc:defmethod.

[Macro]

define-objc-method (selector class-name) &body body

Arguments and Values:

selector---either a string which represents the name of the selector or a list which describes the method's return type, selector components, and argument types (see below.) If the first form is used, then the first form in the body must be a list which describes the selector's argument types and return value type, as per DEFCALLBACK.

class-name---either a string which names an existing ObjC class or a lisp symbol which can map to such a string via the standard name-mapping conventions for class names. (Note that the "canonical" lisp class name is such a symbol.)

Description:

Defines an ObjC-callable method which implements the specified message selector for instances of the existing ObjC class class-name.

[Macro]

define-objc-class-method (selector class-name) &body body

Arguments and Values:

As per DEFINE-OBJC-METHOD

Description:

Like DEFINE-OBJC-METHOD, only used to define methods on the class named by class-name and on its subclasses.

For both DEFINE-OBJC-METHOD and DEFINE-OBJC-CLASS-METHOD, the "selector" argument can be a list whose first element is a foreign type specifier for the method's return value type and whose subsequent elements are either:

  • a non-keyword symbol, which can be mapped to a selector string for a parameterless method according to the standard name-mapping conventions for method selectors.

  • a list of alternating keywords and variable/type specifiers, where the set of keywords can be mapped to a selector string for a parameterized method according to the standard name-mapping conventions for method selectors and each variable/type-specifier is either a variable name (denoting a value of type :ID) or a list whose CAR is a variable name and whose CADR is the corresponding argument's foreign type specifier.

[Variable]

CCL:*ALTERNATE-LINE-TERMINATOR*

Description:

This variable is currently only used by the standard reader macro function for #\; (single-line comments); that function reads successive characters until EOF, a #\Newline is read, or a character EQL to the value of *alternate-line-terminator* is read. In Clozure CL for Darwin, the value of this variable is initially #\Return; in Clozure CL for LinuxPPC, it's initially NIL.

Their default treatment by the #\; reader macro is the primary way in which #\Return and #\Linefeed differ syntactically; by extending the #\; reader macro to (conditionally) treat #\Return as a comment-terminator, that distinction is eliminated. This seems to make LOAD and COMPILE-FILE insensitive to line-termination issues in many cases. It could fail in the (hopefully rare) case where a LF-terminated (Unix) text file contains embedded #\Return characters. This mechanism also isn't adequate to handle cases where newlines are embedded in string constants or other tokens (and presumably should be translated from an external convention to the internal one): it doesn't change what READ-CHAR or READ-LINE "see", and that may be necessary to handle some more complicated cases.

[Keyword Argument]

:EXTERNAL-FORMAT

Description:

Per ANSI CL, Clozure CL supports the :EXTERNAL-FORMAT keyword argument to the functions OPEN, LOAD, and COMPILE-FILE. This argument is intended to provide a standard way of providing implementation-dependent information about the format of files opened with an element-type of CHARACTER. This argument can meaningfully take on the values :DEFAULT (the default), :MACOS, :UNIX, or :INFERRED in Clozure CL.

When defaulted to or specified as :DEFAULT, the format of the file stream is determined by the value of the variable CCL:*DEFAULT-EXTERNAL-FORMAT*. See below.

When specified as :UNIX, all characters are read from and written to files verbatim.

When specified as :MACOS, all #\Return characters read from the file are immediately translated to #\Linefeed (#\Newline); all #\Newline (#\Linefeed) characters are written externally as #\Return characters.

When specified as :INFERRED and the file is open for input, the first buffer-full of input data is examined; if a #\Return character appears in the buffer before the first #\Linefeed, the file stream's external-format is set to :MACOS; otherwise, it is set to :UNIX.
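
The inference rule just described can be sketched in portable Common Lisp. This is an illustration of the stated rule only, not Clozure CL's actual implementation; the function name infer-line-termination is hypothetical, and the buffer is modeled as a string.

```lisp
(defun infer-line-termination (buffer)
  "Return :MACOS if a #\\Return occurs in BUFFER before the first #\\Linefeed,
otherwise :UNIX.  BUFFER is a string holding the first chunk of the file."
  (let ((cr (position #\Return buffer))
        (lf (position #\Linefeed buffer)))
    (if (and cr (or (null lf) (< cr lf)))
        :macos
        :unix)))

(infer-line-termination (format nil "one~Ctwo~C" #\Return #\Return))     ; => :MACOS
(infer-line-termination (format nil "one~Ctwo~C" #\Linefeed #\Linefeed)) ; => :UNIX
```

Note that a buffer containing no line terminators at all falls through to :UNIX, which matches the "otherwise" clause above.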

All other values of :EXTERNAL-FORMAT - and any combinations that don't make sense, such as trying to infer the format of a newly-created output file stream - are treated as if :UNIX was specified. As mentioned above, the :EXTERNAL-FORMAT argument doesn't apply to binary file streams.

The translation performed when :MACOS is specified or inferred has a somewhat greater chance of doing the right thing than the *alternate-line-terminator* mechanism does; it probably has a somewhat greater chance of doing the wrong thing, as well.

[Variable]

CCL:*DEFAULT-EXTERNAL-FORMAT*

Description:

The value of this variable is used when :EXTERNAL-FORMAT is unspecified or specified as :DEFAULT. It can meaningfully be given any of the values :UNIX, :MACOS, or :INFERRED, each of which is interpreted as described above.

Because there's some risk that unsolicited newline translation could have undesirable consequences, the initial value of this variable in Clozure CL is :UNIX.

[Class]

CCL::NS-LISP-STRING

Superclasses:

NS:NS-STRING

Initargs:

:string---a Lisp string which is to be the content of the newly-created ns-lisp-string.

Description:

This class implements the interface of an NSString, which means that it can be passed to any Cocoa or Core Foundation function which expects one.

The string itself is stored on the Lisp heap, which means that its memory management is automatic. However, the ns-lisp-string object itself is a foreign object (that is, it has an objc metaclass), and resides on the foreign heap. Therefore, it is necessary to explicitly free it, by sending a dealloc message.

Examples:

You can create an ns-lisp-string with make-instance, just like any normal Lisp class:

? (defvar *the-string*
     (make-instance 'ccl::ns-lisp-string
                    :string "Hello, Cocoa."))

When you are done with the string, you must explicitly deallocate it:

? (ccl::send *the-string* 'dealloc)

You may wish to use an unwind-protect form to ensure that this happens:

(let (*the-string*)
  (unwind-protect (progn (setq *the-string*
                               (make-instance 'ccl::ns-lisp-string
                                              :string "Hello, Cocoa."))
                         (format t "~&The string is ~D characters long.~%"
                                 (ccl::send *the-string* 'length)))
    (when *the-string*
      (ccl::send *the-string* 'dealloc))))

Notes:

Currently, ns-lisp-string is defined in the file ccl/examples/cocoa-backtrace.lisp, which is a rather awkward place. It was probably not originally meant as a public utility at all. It would be good if it were moved someplace else. Use at your own risk.

Chapter 15. Understanding and Configuring the Garbage Collector

15.1. Heap space allocation

Releases 0.10 and later of CCL use a different memory management scheme than previous versions did. Those earlier versions would allocate a block of memory (of specified size) at startup and would allocate lisp objects within that block. When that block filled with live (non-GCed) objects, the lisp would signal a "heap full" condition. The heap size imposed a limit on the size of the largest object that could be allocated.

The new strategy involves reserving a very large (2GB on DarwinPPC32, 1GB on LinuxPPC, "very large" on 64-bit implementations) block at startup and consuming (and relinquishing) its contents as the size of the live lisp heap data grows and shrinks. After the initial heap image loads and after each full GC, the lisp kernel will try to ensure that a specified amount (the "lisp-heap-gc-threshold") of free memory is available. The initial value of this kernel variable is 16MB on 32-bit implementations and 32MB on 64-bit implementations; it can be manipulated from Lisp (see below).

The large reserved memory block consumes very little in the way of system resources; memory that's actually committed to the lisp heap (live data and the "threshold" area where allocation takes place) consumes finite resources (physical memory and swap space). The lisp's consumption of those resources is proportional to its actual memory usage, which is generally a good thing.

This scheme is much more flexible than the old one, but it may also increase the possibility that those resources can become exhausted. Neither the new scheme nor the old handles that situation gracefully; under the old scheme, a program that consumes lots of memory may have run into an artificial limit on heap size before exhausting virtual memory.

The -R or --heap-reserve command-line option can be used to limit the size of the reserved block and therefore bound heap expansion. Running

> openmcl --heap-reserve 8M

would provide an execution environment that's very similar to that provided by earlier CCL versions.

15.2. The Ephemeral GC

For many programs, the following observations are true to a very large degree:

  1. Most heap-allocated objects have very short lifetimes ("are ephemeral"): they become inaccessible soon after they're created.

  2. Most non-ephemeral objects have very long lifetimes: it's rarely productive for the GC to consider reclaiming them, since it's rarely able to do so. (An object that has survived a large number of GCs is likely to survive the next one. That's not always true of course, but it's a reasonable heuristic.)

  3. It's relatively rare for an old object to be destructively modified (via SETF) so that it points to a new one; therefore, most references to newly-created objects can be found in the stacks and registers of active threads. It's not generally necessary to scan the entire heap to find references to new objects (or to prove that such references don't exist), though it is necessary to keep track of the (hopefully exceptional) cases where old objects are modified to point at new ones.

"Ephemeral" (or "generational") garbage collectors try to exploit these observations: by concentrating on frequently reclaiming newly-created objects quickly, it's less often necessary to do more expensive GCs of the entire heap in order to reclaim unreferenced memory. In some environments, the pauses associated with such full GCs can be noticeable and disruptive, and minimizing the frequency (and sometimes the duration) of these pauses is probably the EGC's primary goal (though there may be other benefits, such as increased locality of reference and better paging behavior.) The EGC generally leads to slightly longer execution times (and slightly higher, amortized GC time), but there are cases where it can improve overall performance as well; the nature and degree of its impact on performance is highly application-dependent.

Most EGC strategies (including the one employed by CCL) logically or physically divide memory into one or more areas of relatively young objects ("generations") and one or more areas of old objects. Objects that have survived one or more GCs as members of a young generation are promoted (or "tenured") into an older generation, where they may or may not survive long enough to be promoted to the next generation and eventually may become "old" objects that can only be reclaimed if a full GC proves that there are no live references to them. This filtering process isn't perfect - a certain amount of premature tenuring may take place - but it usually works very well in practice.

It's important to note that, although a GC of the youngest generation is typically very fast (perhaps a few milliseconds on a modern CPU, depending on various factors), CCL's EGC is not concurrent and doesn't offer realtime guarantees.

CCL's EGC maintains three ephemeral generations; all newly created objects are created as members of the youngest generation. Each generation has an associated threshold, which indicates the number of bytes in it and all younger generations that can be allocated before a GC is triggered. These GCs will involve the target generation and all younger ones (and may therefore cause some premature tenuring); since the older generations have larger thresholds, they're GCed less frequently and most short-lived objects that make it into an older generation tend not to survive there very long.

The EGC can be enabled or disabled under program control; under some circumstances, it may be enabled but inactive (because a full GC is imminent.) Since it may be hard to know or predict the consing behavior of other threads, the distinction between the "active" and "inactive" state isn't very meaningful, especially when native threads are involved.

15.3. GC Page reclamation policy

After a full GC finishes, it'll try to ensure that at least (LISP-HEAP-GC-THRESHOLD) bytes of virtual memory are available; objects will be allocated in this block of memory until it fills up, the GC is triggered, and the process repeats itself.

Many programs reach near stasis in terms of the amount of logical memory that's in use after full GC (or run for long periods of time in a nearly static state), so the logical address range used for consing after the Nth full GC is likely to be nearly or entirely identical to the address range used after the N+1th full GC.

By default (and traditionally in CCL), the GC's policy is to "release" the pages in this address range: to advise the virtual memory system that the pages contain garbage and any physical pages associated with them don't need to be swapped out to disk before being reused and to (re-)map the logical address range so that the pages will be zero-filled by the virtual memory system when they're next accessed. This policy is intended to reduce the load on the VM system and keep CCL's working set to a minimum.

For some programs (especially those that cons at a very high rate), the default policy may be less than ideal: releasing pages that are going to be needed almost immediately - and zero-fill-faulting them back in, lazily - incurs unnecessary overhead. (There's a false economy associated with minimizing the size of the working set if it's just going to shoot back up again until the next GC.) A policy of "retaining" pages between GCs might work better in such an environment.

Functions described below give the user some control over this behavior. An adaptive, feedback-mediated approach might yield a better solution.

15.4. "Pure" areas are read-only, paged from image file

SAVE-APPLICATION identifies code vectors and the pnames of interned symbols and copies these objects to a "pure" area of the image file it creates. (The "pure" area accounts for most of what the ROOM function reports as "static" space.)

When the resulting image file is loaded, the pure area of the file is now memory-mapped with read-only access. Code and pure data are paged in from the image file as needed (and don't compete for global virtual memory resources with other memory areas.)

Code-vectors and interned symbol pnames are immutable: it is an error to try to change the contents of such an object. Previously, that error would have manifested itself in some random way. In the new scheme, it'll manifest itself as an "unhandled exception" error in the Lisp kernel. The kernel could probably be made to detect a spurious, accidental write to read-only space and signal a lisp error in that case, but it doesn't yet do so.

The image file should be opened and/or mapped in some mode which disallows writing to the memory-mapped regions of the file from other processes. I'm not sure of how to do that; writing to the file when it's mapped by CCL can have unpredictable and unpleasant results. SAVE-APPLICATION will delete its output file's directory entry and create a new file; one may need to exercise care when using file system utilities (like tar, for instance) that might overwrite an existing image file.

15.5. Weak References

In general, a "weak reference" is a reference to an object which does not prevent the object from being garbage-collected. For example, suppose that you want to keep a list of all the objects of a certain type. If you don't take special steps, the fact that you have a list of them will mean that the objects are always "live", because you can always reference them through the list. Therefore, they will never be garbage-collected, and their memory will never be reclaimed, even if they are referenced nowhere else in the program. If you don't want this behavior, you need weak references.

CCL supports weak references with two kinds of objects: weak hash tables and populations.

Weak hash tables are created with the standard Common Lisp function make-hash-table, which is extended to accept the keyword argument :weak. Hash tables may be weak with respect to either their keys or their values. To make a hash table with weak keys, invoke make-hash-table with the option :weak t, or, equivalently, :weak :key. To make one with weak values, use :weak :value. When the key is weak, the equality test must be #'eq (because it wouldn't make sense otherwise).

When garbage-collection occurs, key-value pairs are removed from the hash table if there are no non-weak references to the weak element of the pair (key or value).

In general, weak-key hash tables are useful when you want to use the hash to store some extra information about the objects you look up in it, while weak-value hash tables are useful when you want to use the hash as an index for looking up objects.
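
The two uses can be sketched as follows. The variable names *annotations* and *index* are hypothetical; both tables use an EQ test to stay within the restriction stated above. Whether and when an entry actually disappears depends on when a GC runs and on what else still references the object, so the sketch makes no claim about the counts after the GC.

```lisp
;; Weak on key: an entry disappears once its key is otherwise unreachable.
(defvar *annotations* (make-hash-table :test 'eq :weak :key))

;; Weak on value: an entry disappears once its value is otherwise unreachable.
(defvar *index* (make-hash-table :test 'eq :weak :value))

(let ((object (list 1 2 3)))
  (setf (gethash object *annotations*) "extra information about OBJECT")
  (setf (gethash :some-name *index*) object)
  ;; OBJECT is still strongly referenced here, so both entries are present.
  (format t "~&~D annotation(s), ~D index entry(ies)~%"
          (hash-table-count *annotations*)
          (hash-table-count *index*)))

;; Once OBJECT is unreachable, a later full GC may remove both entries.
(ccl::gc)
```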

A population encapsulates an object, causing certain references from the object to be considered weak. CCL supports two kinds of populations: lists, in which case the encapsulated object is a list of elements, which are spliced out of the list when there are no non-weak references to the element; and alists, in which case the encapsulated object is a list of conses which are spliced out of the list if there are no non-weak references to the car of the cons.

If you are experimenting with weak references interactively, remember that an object is not dead if it was returned by one of the last three interactively-evaluated expressions, because of the variables *, **, and ***. The easy workaround is to evaluate some meaningless expression before invoking gc, to get the object out of the REPL variables.

15.6. Weak References Dictionary

[Function]

make-population &key type initial-contents

Arguments and Values:

type---The type of population, one of :LIST (the default) or :ALIST

initial-contents---a sequence of elements (or conses, for :alist) to be used to initialize the population. The sequence itself (and the conses, in the case of an alist) is not stored in the population; a new list or alist is created to hold the elements.

Description:

Creates a new population of the specified type.

[Function]

population-type population

Description:

Returns the type of population, one of :LIST or :ALIST.

[Function]

population-contents population

Description:

Returns the list encapsulated in population. Note that as long as there is a direct (non-weak) reference to this list, it will not be modified by the garbage collector. Therefore it is safe to traverse the list, and even modify it, no different from any other list. If you want the elements to become garbage-collectable again, you must stop referring to the list directly.

[Function]

(setf (population-contents population) contents)

Description:

Sets the list encapsulated in population to contents. Contents is not copied; it is used directly.
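
A small registry built on a :LIST population might look like the sketch below. The names *live-widgets*, register-widget, count-live-widgets, and *w* are hypothetical, and the ccl:: prefix is used so the forms read whether or not the symbols are exported.

```lisp
(defvar *live-widgets* (ccl::make-population :type :list))

(defun register-widget (widget)
  "Record WIDGET without preventing it from being garbage-collected."
  ;; PUSH works here because POPULATION-CONTENTS is a SETF-able place.
  (push widget (ccl::population-contents *live-widgets*))
  widget)

(defun count-live-widgets ()
  "Return the number of registered widgets that are still in the population."
  (length (ccl::population-contents *live-widgets*)))

;; *W* holds a strong reference, so this widget stays in the population.
(defvar *w* (register-widget (list :widget 1)))
(count-live-widgets)
```

Widgets that are referenced only through the population become eligible to be spliced out at the next GC, so the count is a snapshot rather than a guarantee.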

15.7. Garbage-Collection Dictionary

[Function]

lisp-heap-gc-threshold

Description:

Returns the value of the kernel variable that specifies the amount of free space to leave in the heap after full GC.

[Function]

set-lisp-heap-gc-threshold new-threshold

Arguments and Values:

new-threshold---The requested new lisp-heap-gc-threshold.

Description:

Sets the value of the kernel variable that specifies the amount of free space to leave in the heap after full GC to new-threshold, which should be a non-negative fixnum. Returns the value of that kernel variable (which may be somewhat larger than what was specified).

[Function]

use-lisp-heap-gc-threshold

Description:

Tries to grow or shrink lisp's heap space, so that the free space is (approximately) equal to the current heap threshold. Returns NIL.
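
The three threshold functions are typically used together. The sketch below assumes the threshold is expressed in bytes (consistent with the 16MB and 32MB defaults mentioned earlier); the 64MB figure is an arbitrary example value, not a recommendation.

```lisp
(let ((old (ccl::lisp-heap-gc-threshold)))
  (format t "~&Threshold was ~D bytes.~%" old)
  ;; Request 64MB of free space after each full GC.  The value actually
  ;; installed may be somewhat larger than the one requested.
  (let ((new (ccl::set-lisp-heap-gc-threshold (* 64 1024 1024))))
    (format t "~&Threshold is now ~D bytes.~%" new))
  ;; Resize the heap now instead of waiting for the next full GC.
  (ccl::use-lisp-heap-gc-threshold))
```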

[Function]

egc arg

Arguments and Values:

arg---a generalized boolean

Description:

Enables the EGC if arg is non-nil, disables the EGC otherwise. Returns the previous enabled status. Although this function is thread-safe (in the sense that calls to it are serialized), it doesn't make a whole lot of sense to be turning the EGC on and off from multiple threads ...

[Function]

egc-enabled-p

Description:

Returns T if the EGC was enabled at the time of the call, NIL otherwise.
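
Because egc returns the previous enabled status, a save-and-restore idiom is straightforward. The macro name with-egc-disabled is hypothetical; this is a sketch that relies only on the behavior documented above, and, as noted, it is best used from a single controlling thread.

```lisp
(defmacro with-egc-disabled (&body body)
  "Run BODY with the EGC disabled, then restore its previous state."
  (let ((was-enabled (gensym "WAS-ENABLED")))
    `(let ((,was-enabled (ccl::egc nil)))   ; EGC returns the previous status
       (unwind-protect (progn ,@body)
         (ccl::egc ,was-enabled)))))

(with-egc-disabled
  (format t "~&EGC enabled inside the body? ~S~%" (ccl::egc-enabled-p)))
```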

[Function]

egc-active-p

Description:

Returns T if the EGC was active at the time of the call, NIL otherwise. Since this is generally a volatile piece of information, it's not clear whether this function serves a useful purpose when native threads are involved.

[Function]

egc-configuration

Description:

Returns, as multiple values, the sizes in kilobytes of the thresholds associated with the youngest ephemeral generation, the middle ephemeral generation, and the oldest ephemeral generation.

[Function]

configure-egc generation-0-size generation-1-size generation-2-size

Arguments and Values:

generation-0-size---the requested threshold size of the youngest generation, in kilobytes

generation-1-size---the requested threshold size of the middle generation, in kilobytes

generation-2-size---the requested threshold size of the oldest generation, in kilobytes

Description:

Puts the indicated threshold sizes in effect. Each threshold indicates the total size that may be allocated in that and all younger generations before a GC is triggered. Disables EGC while setting the values. (The provided threshold sizes are rounded up to a multiple of 64Kbytes in CCL 0.14 and to a multiple of 32KBytes in earlier versions.)
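
A sketch of inspecting and adjusting the thresholds follows. Doubling each value is an arbitrary choice made for illustration; it keeps the generations in ascending order of size, which is assumed here to be what the implementation expects. The values finally reported may differ from those requested because of the rounding described above.

```lisp
(multiple-value-bind (g0 g1 g2) (ccl::egc-configuration)
  (format t "~&Current thresholds: ~D KB, ~D KB, ~D KB~%" g0 g1 g2)
  ;; Request thresholds twice as large for every generation.
  (ccl::configure-egc (* 2 g0) (* 2 g1) (* 2 g2))
  (multiple-value-bind (n0 n1 n2) (ccl::egc-configuration)
    (format t "~&New thresholds: ~D KB, ~D KB, ~D KB~%" n0 n1 n2)))
```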

[Function]

gc-retain-pages arg

Arguments and Values:

arg---a generalized boolean

Description:

Tries to influence the GC to retain/recycle the pages allocated between GCs if arg is true, and to release them otherwise. This is generally a tradeoff between paging and other VM considerations.

[Function]

gc-retaining-pages

Description:

Returns T if the GC tries to retain pages between full GCs and NIL if it's trying to release them to improve VM paging performance.
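
For a program that conses at a very high rate, the policy might be switched around the allocation-heavy phase, as in this sketch. Since gc-retain-pages only "tries to influence" the GC, the value printed is reported rather than assumed.

```lisp
;; Ask the GC to keep pages mapped between GCs during a heavily consing phase.
(ccl::gc-retain-pages t)
(format t "~&Retaining pages? ~S~%" (ccl::gc-retaining-pages))

;; ... allocation-heavy work would go here ...

;; Return to the default policy of releasing pages.
(ccl::gc-retain-pages nil)
```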

Chapter 16. Implementation Details of Clozure CL

This chapter describes many aspects of OpenMCL's implementation as of (roughly) version 1.1. Details vary a bit between the three architectures (PPC32, PPC64, and x86-64) currently supported and those details change over time, so the definitive reference is the source code (especially some files in the ccl/compiler/ directory whose names contain the string "arch" and some files in the ccl/lisp-kernel/ directory whose names contain the string "constants".) Hopefully, this chapter will make it easier for someone who's interested to read and understand the contents of those files.

16.1. Threads and exceptions

Clozure CL's threads are "native" (meaning that they're scheduled and controlled by the operating system.) Most of the implications of this are discussed elsewhere; this section tries to describe how threads look from the lisp kernel's perspective (and especially from the GC's point of view.)

Clozure CL's runtime system tries to use machine-level exception mechanisms (conditional traps when available, illegal instructions, memory access protection in some cases) to detect and handle exceptional situations. These situations include some TYPE-ERRORs and PROGRAM-ERRORs (notably wrong-number-of-args errors), and also include cases like "not being able to allocate memory without GCing or obtaining more memory from the OS." The general idea is that it's usually faster to pay (very occasional) exception-processing overhead and figure out what's going on in an exception handler than it is to maintain enough state and context to handle an exceptional case via a lighter-weight mechanism when that exceptional case (by definition) rarely occurs.

Some emulated execution environments (the Rosetta PPC emulator on x86 versions of Mac OS X) don't provide accurate exception information to exception handling functions. Clozure CL can't run in such environments.

16.1.1. The Thread Context Record

When a lisp thread is first created (or when a thread created by foreign code first calls back to lisp), a data structure called a Thread Context Record (or TCR) is allocated and initialized. On modern versions of Linux and FreeBSD, the allocation actually happens via a set of thread-