Ticket #358 (closed defect: fixed)

Opened 5 years ago

Last modified 5 years ago

filesystem character encoding

Reported by: stassats
Owned by: gb
Priority: minor
Milestone:
Component: Runtime (threads, GC)
Version: trunk
Keywords:
Cc:

Description

CCL doesn't properly handle filenames that contain Unicode characters outside Latin-1.

DIRECTORY does not list such filenames correctly, and OPEN cannot access files with Unicode pathnames.

This happens with -K utf-8 supplied (*default-file-character-encoding* => :UTF-8), on both 64-bit and 32-bit Linux.

Change History

comment:1 Changed 5 years ago by gb

  • Status changed from new to assigned

To the best of my knowledge, Linux is completely ignorant of pathname encoding: a file or directory just has a NUL-terminated string for a name, and whether that's encoded in UTF-8 or ASCII or ... is up to the application. (This is in contrast to the approach taken by Darwin - for example - where filenames are internally represented in a kind of weird decomposed UTF-8, regardless of the filesystem. Windows uses UTF-16; FreeBSD either uses UTF-8 or plans to standardize on that in the near future.)
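To make the Linux behavior concrete, here is a minimal sketch in Python (not CCL code; the helper names `list_names_as` and `demo` are made up for illustration). It creates a file whose name is a raw byte string, lists the directory as bytes, and decodes the same bytes under two different encodings. The OS stores and returns the bytes unchanged; only the application's choice of encoding decides what the name looks like.

```python
import os
import tempfile


def list_names_as(directory, encoding):
    """List `directory` (a bytes path) and decode each raw name with `encoding`."""
    # Passing bytes to os.listdir returns the names as undecoded bytes.
    return sorted(name.decode(encoding) for name in os.listdir(directory))


def demo():
    """Create one file with a UTF-8 encoded name and read it back two ways."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        raw_name = "фото.txt".encode("utf-8")  # the bytes the filesystem stores
        with open(os.path.join(tmp_bytes, raw_name), "wb"):
            pass  # an empty file is enough for the demonstration
        as_utf8 = list_names_as(tmp_bytes, "utf-8")
        as_latin1 = list_names_as(tmp_bytes, "latin-1")
        return as_utf8, as_latin1


if __name__ == "__main__":
    utf8_names, latin1_names = demo()
    print("decoded as UTF-8:  ", utf8_names)
    print("decoded as Latin-1:", [ascii(n) for n in latin1_names])
```

Decoding as Latin-1 never fails, because every byte maps to some character, but it yields a different string from the one the user typed. That is consistent with the symptoms in the report, assuming the filenames on disk were UTF-8 while being interpreted as Latin-1.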

Even if there's no good way for OPEN or DIRECTORY or ... to guess what encoding's in use, there should at least be a default for Linux and other OSes that don't impose or follow encoding conventions.

(Whatever that's called - perhaps *DEFAULT-PATHNAME-ENCODING* - the way that a pathname is encoded doesn't generally have anything to do with how the file's contents are encoded.)

comment:2 Changed 5 years ago by stassats

Maybe it's reasonable to determine default encoding by the value of LC_CTYPE or LANG.
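A rough sketch of what that lookup could look like, written in Python for illustration only (this is not how CCL does it; `codeset_from_locale` and `guess_pathname_encoding` are hypothetical names, and the "utf-8" fallback is an arbitrary assumption). It also consults LC_ALL first, which is not mentioned above but takes precedence over LC_CTYPE and LANG under POSIX.

```python
import os


def codeset_from_locale(value):
    """Return the codeset of a locale string like 'de_DE.ISO-8859-15@euro', or None."""
    if not value or "." not in value:
        return None  # e.g. "C", "POSIX", or "en_US" carry no explicit codeset
    codeset = value.split(".", 1)[1].split("@", 1)[0]  # drop any @modifier
    return codeset.lower() or None


def guess_pathname_encoding(env=os.environ, fallback="utf-8"):
    """Guess a pathname encoding from the locale environment variables."""
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            # The first non-empty variable decides; use the fallback
            # if it names no codeset.
            return codeset_from_locale(value) or fallback
    return fallback


if __name__ == "__main__":
    print(guess_pathname_encoding())
```

The result is only a guess: the locale describes what the current user's tools expect, not what encoding was used when a given file was created, so an explicit override would still be needed.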

comment:3 Changed 5 years ago by rme

r11200 and r11202 add some support for specifying the encoding used for filenames.

comment:4 Changed 5 years ago by rme

  • Status changed from assigned to closed
  • Resolution set to fixed

Given the presence of ccl:pathname-encoding-name and its setf inverse (which I just added to the manual with r12432), I'm going to close this ticket.

If this functionality is inadequate, please re-open the ticket.
