Dec 13, 2012, 6:27:55 AM

Support using the "coding" option in a file's file options line (a
line at the start of a text file that contains name:value pairs
separated by semicolons bracketed by -*- sequences) to determine a
file's character encoding. Specifically:

  • OPEN now allows an external-format of :INFERRED; previously, this was shorthand for an external-format whose line termination was inferred and whose character encoding was taken from *DEFAULT-FILE-CHARACTER-ENCODING*. When an input file whose external-format is specified as :INFERRED is opened, its file options are parsed and the value of the "coding" option is used if such an option is found (and if the value names an encoding that CCL supports). If a supported "coding" option isn't found, *DEFAULT-FILE-CHARACTER-ENCODING* is used as before.
  • In the Cocoa IDE, the Hemlock command "Ensure File Options Line" (bound to Control-Meta-M by default) ensures that the first line in the current buffer is a file options line and fills in some plausible values for the "Mode", "Package", and "Coding" options. The "Process File Options" command (Control-Meta-m) can be used to process the file options line after it's been edited. (The file options line is always processed when the file is first opened; changes to the "coding" option affect how the file will be saved.)
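
To make the file options format concrete, here is a minimal sketch in portable Common Lisp of pulling the "coding" value out of such a line. PARSE-FILE-OPTIONS and FILE-OPTIONS-CODING are hypothetical names used only for illustration; this is not the parser that CCL or Hemlock actually uses, and it does not check whether the encoding is one that CCL supports.

```lisp
;;; Hypothetical helpers for illustration; not CCL's or Hemlock's code.

(defun parse-file-options (line)
  "Return an alist of (name . value) strings taken from the text between
the first two -*- sequences in LINE, or NIL if there is no such pair.
Option names are downcased; values are returned as written."
  (let* ((open (search "-*-" line))
         (close (and open (search "-*-" line :start2 (+ open 3)))))
    (when close
      (let ((options '())
            (start (+ open 3)))
        (loop
          (when (>= start close) (return))
          (let* ((semi (or (position #\; line :start start :end close) close))
                 (colon (position #\: line :start start :end semi)))
            ;; A segment without a colon (for example the blank left
            ;; after a trailing semicolon) is skipped silently.
            (when colon
              (push (cons (string-downcase
                           (string-trim " " (subseq line start colon)))
                          (string-trim " " (subseq line (1+ colon) semi)))
                    options))
            (setq start (1+ semi))))
        (nreverse options)))))

(defun file-options-coding (line)
  "Return the value of the \"coding\" option in LINE, or NIL."
  (cdr (assoc "coding" (parse-file-options line) :test #'string=)))
```

For example, (file-options-coding ";;; -*- Mode: Lisp; Package: CL-USER; Coding: utf-8; -*-") returns "utf-8", and a line with no -*- pair yields NIL.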

When a Lisp source file is opened in the IDE editor, the following
character encodings are tried in this order until one of them
succeeds:

  • if the "Open ..." panel was used to open the file and an encoding other than "Automatic" - which is now the default - is selected, that encoding is tried.
  • if a "coding" option is found, that encoding is tried.
  • the value of *DEFAULT-FILE-CHARACTER-ENCODING* is tried.
  • iso-8859-1 is tried. All files can be decoded in iso-8859-1.
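
The order above can be sketched as a small selection function. This is only an illustration under stated assumptions: CHOOSE-ENCODING is a hypothetical name, encodings are represented as keywords, and DECODES-P is a caller-supplied predicate standing in for "the file's contents decode cleanly in this encoding"; the IDE's real code is organized differently.

```lisp
;;; Hypothetical sketch of the fallback order; not the IDE's actual code.

(defun choose-encoding (panel-choice coding-option default-encoding decodes-p)
  "Return the first candidate encoding for which DECODES-P returns true.
PANEL-CHOICE is :AUTOMATIC or an encoding picked in the Open panel,
CODING-OPTION is the encoding named by the file options line or NIL,
and DEFAULT-ENCODING plays the role of *DEFAULT-FILE-CHARACTER-ENCODING*."
  (let ((candidates
          (remove nil
                  (list (unless (eq panel-choice :automatic) panel-choice)
                        coding-option
                        default-encoding
                        ;; Last resort: every byte sequence decodes here.
                        :iso-8859-1))))
    (find-if decodes-p candidates)))
```

With a predicate that accepts only :iso-8859-1, (choose-encoding :automatic :utf-8 :utf-8 (lambda (e) (eq e :iso-8859-1))) falls through to :iso-8859-1.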

This is all supposed to be what Emacs does and I think that it's
pretty close in practice.

A file that caused problems for Paul Krueger a few days ago
because its encoding (ISO-8859-1) wasn't guessed correctly
now has an explicit "coding" option and serves as a test case.

1 edited


  • trunk/source/cocoa-ide/hemlock/src/filecoms.lisp

    --- trunk/source/cocoa-ide/hemlock/src/filecoms.lisp (r14734)
    +++ trunk/source/cocoa-ide/hemlock/src/filecoms.lisp (r15536)
    @@ old line 61, new line 61 @@
               (cond
                ((find #\: string :start start :end end)
    -            (do ((opt-start start (1+ semi)) colon semi)
    +            (do ((opt-start start (1+ semi)) colon semi real-semi)
                     (nil)
                   (setq colon (position #\: string :start opt-start :end end))
                   (unless colon
    -                (loud-message "Missing \":\".  Aborting file options.")
    +                (unless real-semi
    +                  (loud-message "Missing \":\".  Aborting file options."))
                     (return-from do-file-options))
    -              (setq semi (or (position #\; string :start colon :end end) end))
    +              (setq semi (or (setq real-semi (position #\; string :start colon :end end)) end))
                   (let* ((option (nstring-downcase
                                   (trim-subseq string opt-start colon)))
    @@ old line 130, new line 131 @@
     (define-file-option "log" (buffer string)
       (declare (ignore buffer string)))
    +(define-file-option "base" (buffer string)
    +  (declare (ignore buffer string)))
    +(define-file-option "syntax" (buffer string)
    +  (declare (ignore buffer string)))
    +(define-file-option "coding" (buffer string)
    +  (hemlock-ext:set-buffer-external-format buffer string))
    @@ old line 245, new line 256 @@
                           (variable-value 'hemlock::current-package
                                           :buffer buffer)
    -                      "CL-USER")))
    +                      "CL-USER"))
    +                   (encoding-string (let* ((string (hemlock-ext:buffer-encoding-name buffer))
    +                                           (suffix (case (hi::buffer-line-termination buffer)
    +                                                     (:cr "mac")
    +                                                     (:crlf "dos"))))
    +                                      (if suffix
    +                                        (concatenate 'string string "-" suffix)
    +                                        string))))
                   (insert-string
                    mark
    -               (format nil ";;; -*- Mode: Lisp; Package: ~a -*-" package-name)))
    +               (format nil ";;; -*- Mode: Lisp; Package: ~a; Coding: ~a; -*-" package-name encoding-string)))
                 (insert-string
                  mark
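
The way the "Coding" value is assembled in the last hunk can be tried outside the editor with a standalone sketch. CODING-OPTION-VALUE and FILE-OPTIONS-LINE are hypothetical names; they take the encoding name and line termination as plain arguments instead of reading them from a Hemlock buffer, and the suffix logic and FORMAT string are copied from the hunk above.

```lisp
;;; Hypothetical standalone version of the logic in the diff above.

(defun coding-option-value (encoding-name line-termination)
  "Append \"-mac\" for :CR or \"-dos\" for :CRLF line termination;
any other value leaves ENCODING-NAME unchanged."
  (let ((suffix (case line-termination
                  (:cr "mac")
                  (:crlf "dos"))))
    (if suffix
        (concatenate 'string encoding-name "-" suffix)
        encoding-name)))

(defun file-options-line (package-name encoding-name line-termination)
  "Build a file options line using the format string from the diff."
  (format nil ";;; -*- Mode: Lisp; Package: ~a; Coding: ~a; -*-"
          package-name
          (coding-option-value encoding-name line-termination)))
```

For example, (file-options-line "CL-USER" "utf-8" :crlf) returns ";;; -*- Mode: Lisp; Package: CL-USER; Coding: utf-8-dos; -*-". Note the trailing semicolon before the closing -*-, which is the case the REAL-SEMI change in the first hunk stops warning about.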