Changeset 9628


Timestamp: May 30, 2008, 1:44:22 PM
Author: gb
Message:

UTF-8-MEMORY-ENCODE: remember to logior #x80 into the last octet of a
4-element sequence (as noted by Bob Cassels).
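To illustrate why the missing logior matters, here is a minimal Python sketch (not the CCL code; the function names are hypothetical). In a 4-octet UTF-8 sequence every octet after the lead must carry the `10xxxxxx` continuation marker, so the last octet needs `#x80` ORed in just like the two before it.

```python
def encode_4_octets(code: int) -> bytes:
    """Encode a code point in U+10000..U+10FFFF as four UTF-8 octets."""
    if not 0x10000 <= code <= 0x10FFFF:
        raise ValueError("code point does not need a 4-octet sequence")
    return bytes([
        0xF0 | (code >> 18),            # lead octet: 11110xxx
        0x80 | ((code >> 12) & 0x3F),   # continuation: 10xxxxxx
        0x80 | ((code >> 6) & 0x3F),    # continuation: 10xxxxxx
        0x80 | (code & 0x3F),           # continuation: the octet this changeset fixes
    ])


def encode_4_octets_without_marker(code: int) -> bytes:
    """Same as above, but the last octet lacks the #x80 marker (pre-fix behavior)."""
    good = encode_4_octets(code)
    return good[:3] + bytes([code & 0x3F])


if __name__ == "__main__":
    print(encode_4_octets(0x1F600).hex())
    print(encode_4_octets_without_marker(0x1F600).hex())
```

Without the marker, the final octet falls in the ASCII range, so a conforming decoder rejects the sequence or misreads it as a truncated character followed by an unrelated one.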

UTF-8-LENGTH-OF-MEMORY-ENCODING: note that lead octets from #x80 up to
(but not including) #xc2, and those at or above #xf8, are invalid and
will each count as a single #\Replacement_Character without consuming
additional octets. Make the second return value relative to the
"start" argument, which may be non-zero.
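The scan described in this message can be sketched in Python as follows. This is an illustrative analogue under stated assumptions, not the CCL implementation: the function name and the use of a `bytes` buffer in place of a raw pointer are hypothetical. It returns the number of characters and the number of octets consumed, counted from `start`.

```python
def length_of_memory_encoding(buf: bytes, noctets: int, start: int = 0):
    """Return (nchars, octets_consumed); the octet count is relative to start."""
    i = start
    end = start + noctets
    nchars = 0
    while i != end:
        code = buf[i]
        if code < 0xC2:
            step = 1    # ASCII, or an invalid lead octet (#x80..#xc1)
        elif code < 0xE0:
            step = 2
        elif code < 0xF0:
            step = 3
        elif code < 0xF8:
            step = 4
        else:
            step = 1    # invalid lead octet (#xf8 and above)
        if i + step > end:
            # The final sequence is truncated: stop before it.
            return nchars, i - start
        i += step
        nchars += 1
    return nchars, i - start


if __name__ == "__main__":
    data = b"xxA\xc3\xa9\x80\xff"
    print(length_of_memory_encoding(data, 5, start=2))
```

Returning `i - start` rather than `i` is what makes the second value meaningful when the caller passes a non-zero `start`: it reports how many of the `noctets` octets were accounted for, not an absolute offset.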


File: 1 edited

Index: branches/working-0711/ccl/level-0/l0-io.lisp
--- branches/working-0711/ccl/level-0/l0-io.lisp (r8705)
+++ branches/working-0711/ccl/level-0/l0-io.lisp (r9628)
@@ -81,5 +81,5 @@
                    (logior #x80 (the fixnum (logand #x3f (the fixnum (ash code -6))))))
              (setf (%get-unsigned-byte pointer (the fixnum (+ idx 3)))
-                   (logand #x3f code))
+                   (logior #x80 (logand #x3f code)))
              (incf idx 4))))))
 
@@ -147,13 +147,14 @@
         (end (+ start noctets))
         (nchars 0 (1+ nchars)))
-       ((= i end) (values nchars i))
+       ((= i end) (values nchars (- i start)))
     (let* ((code (%get-unsigned-byte pointer i))
-           (nexti (+ i (cond ((< code #x80) 1)
+           (nexti (+ i (cond ((< code #xc2) 1)
                              ((< code #xe0) 2)
                              ((< code #xf0) 3)
-                             (t 4)))))
+                             ((< code #xf8) 4)
+                             (t 1)))))
       (declare (type (unsigned-byte 8) code))
       (if (> nexti end)
-        (return (values nchars i))
+        (return (values nchars (- i start)))
         (setq i nexti)))))
 