Changes between Initial Version and Version 1 of Ticket #732

08/28/10 00:15:05

Specifying :external-format :utf-8 should cause both the input and output files to be treated as UTF-8 encoded. You probably do need to add :IF-DOES-NOT-EXIST :CREATE to the clause that opens the output file.
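For illustration, here is a minimal sketch of opening both files that way. The function name COPY-FORMS-UTF-8 and its two path arguments are hypothetical and not taken from the ticket; the :IF-EXISTS :SUPERSEDE option is an extra assumption so that the output file is rewritten on repeated runs.

```lisp
;; Sketch only: read every form from a UTF-8 input file and print it
;; to a UTF-8 output file. COPY-FORMS-UTF-8 is a hypothetical helper.
(defun copy-forms-utf-8 (in-path out-path)
  (with-open-file (in in-path
                      :direction :input
                      :external-format :utf-8)
    (with-open-file (out out-path
                         :direction :output
                         :external-format :utf-8
                         :if-exists :supersede          ; assumption: overwrite old output
                         :if-does-not-exist :create)    ; create the file when missing
      (let ((eof (gensym "EOF")))                       ; unique end-of-file marker
        (loop for form = (read in nil eof)
              until (eq form eof)
              do (print form out))))))

;; Hypothetical usage:
;; (copy-forms-utf-8 "roots.txt" "roots-out.txt")
```

Both streams are given the same external format, so characters read from the input are re-encoded as UTF-8 on output.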

As far as I can tell, your test worked correctly. "roots.txt" contains one form. The second element of the second element of that form is a token containing the two characters #\U+0915 and #\U+0943, which were encoded (in UTF-8) in the input file as the octet sequences #xe0 #xa4 #x95 and #xe0 #xa5 #x83. That same sequence of octets was written to the output file. It will look like garbage unless whatever's looking at it knows that the file is encoded in UTF-8; to something that does know the encoding, it'll look like two Devanagari characters.
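One way to check those octets independently is to write the two characters to a scratch file as UTF-8 and read the file back as raw bytes. This is only a sketch: the function name UTF-8-OCTETS-VIA-FILE and the scratch path are hypothetical, and it assumes the implementation accepts :utf-8 as an external format and writes no byte-order mark.

```lisp
;; Sketch only: return the list of octets produced when STRING is
;; written to PATH as UTF-8. UTF-8-OCTETS-VIA-FILE is a hypothetical helper.
(defun utf-8-octets-via-file (string path)
  ;; Write the characters through a UTF-8 character stream.
  (with-open-file (out path
                       :direction :output
                       :external-format :utf-8
                       :if-exists :supersede
                       :if-does-not-exist :create)
    (write-string string out))
  ;; Read the same file back as a binary stream, one octet at a time.
  (with-open-file (in path
                      :direction :input
                      :element-type '(unsigned-byte 8))
    (loop for octet = (read-byte in nil nil)
          while octet
          collect octet)))

;; Hypothetical usage with the two characters from the ticket:
;; (utf-8-octets-via-file
;;  (coerce (list (code-char #x0915) (code-char #x0943)) 'string)
;;  "/tmp/utf8-check.txt")
;; Under the assumptions above this should give the six octets
;; #xE0 #xA4 #x95 #xE0 #xA5 #x83.
```

If the octets match, the file is correct, and any garbage seen on screen comes from a viewer that assumes a different encoding.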

Unless I'm missing something, I don't see a bug here. I get the same results that you report, and they seem to be correct.


  • Ticket #732

    • Property Status changed from new to closed
    • Property Resolution set to invalid
  • Ticket #732 – Description

    initial  v1
    1        1    I'm doing an experiment in formatting some files that use Devanagari unicode characters. The input file is utf8, and it's my intention to produce a utf8 output file. The following function reads a sexp, and for each correctly prints a Devanagari word to the screen, and apparently writes the same word as garbage to the output file. Can you please tell me the right stream parameters? Thanks.
    3        4    (defun format-dict ()
    4        5      (let ((fi "/Users/kmorgan/documents/yoga/sanskrit/roots/roots.txt")
    13       14               ;(print-entry so x)
    14       15               ))))))