Ticket #970 (closed defect: invalid)

Opened 2 years ago

Last modified 2 years ago

DIRECTORY includes directories by default

Reported by: adapterik
Owned by:
Priority: normal
Milestone:
Component: ANSI CL Compliance
Version: trunk
Keywords: documentation
Cc:

Description

I think that DIRECTORY is out of compliance with the (albeit fuzzy in this regard) ANSI CL as well as the CCL documentation for said function.

According to the ANSI CL, DIRECTORY should return "files", and by "files" it is pretty clear it does not refer to directories. The spec gives plenty of wiggle room for extensions, however.

CCL provides the extension keyword options :DIRECTORIES and :FILES to specify whether directories and/or files are considered. :DIRECTORIES should default to nil according to the CCL docs (and my interpretation of ANSI CL compliance), but it actually defaults to T (judging from usage results and source code inspection). Thus the bare, naked usage of DIRECTORY, such as (DIRECTORY "*"), will include directories, if any are found, in the results.
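A minimal sketch of asking for files only, instead of relying on the default. The function name LIST-FILES-ONLY is hypothetical. The CCL branch uses the :DIRECTORIES and :FILES keywords named above; the fallback branch for other implementations is an assumption (that directories, when returned at all, come back with an empty name component) and may need adjusting per implementation.

```lisp
(defun list-files-only (wild)
  "Return the entries matching WILD, leaving out directories where possible."
  ;; CCL: let DIRECTORY itself skip directories via its extension keywords.
  #+ccl (directory wild :directories nil :files t)
  ;; Elsewhere (assumption): a returned directory has no name component,
  ;; so keep only entries that do have one.
  #-ccl (remove-if-not #'pathname-name (directory wild)))
```

Spelling out the keywords in the call keeps the result independent of whichever default a given CCL release uses.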

If CCL's DIRECTORY defaulted to :DIRECTORIES nil, it would be in compliance as far as this point goes. However, I think there is further work needed to ensure that DIRECTORY (and related functions like PROBE-FILE) are consistent and compliant in their treatment of pathnames.

That will be the subject of a follow-up, either here or on the mailing list.

Man, do I feel like a nit-picking Lisper now!

Change History

comment:1 follow-up: ↓ 2 Changed 2 years ago by gb

  • Status changed from new to closed
  • Resolution set to invalid

The 1.8 release notes mention that the default value of the :DIRECTORIES argument has changed; you're correct in noting that the documentation should be updated.

If you think the spec says enough about DIRECTORY's behavior for there to be a conformance issue here, I think that you should consider that very carefully.

People have been making the argument for years that DIRECTORY's output should include directories, since (after all, they claim) "directories are just files." A lot of different behaviors are (or are not) "intuitive". The spec (the glossary) defines a file to be "a named entry in a file system, having an implementation-defined nature." That doesn't seem to resolve the question of whether or not a directory is a file.

CCL's traditional behavior was inherited from MCL, and some people may prefer this. That behavior differs from that of most (all?) other implementations, which behave as if :DIRECTORIES T was specified (and this difference is noted in e.g. Chapter 15 of the book "Practical Common Lisp".)

DIRECTORY is (by definition) extremely implementation-dependent. (Among other things, it's defined to accept keyword arguments but no portable keyword arguments are defined.) It is almost laughable to see people trying to use DIRECTORY in portable code, but that doesn't stop people from doing so (and, as that chapter of Practical Common Lisp notes, it's possible to do so in practice in many cases if one is aware of some implementation-specific issues.)

I've had people literally question whether it was legal for DIRECTORY to accept keyword arguments, and after listening to this kind of thing several times a year for 10 years or so I figured that changing the default might improve the signal-to-noise ratio on the mailing list as far as this issue was concerned. (Yes, the change breaks backward compatibility and negatively affects anyone who expects the old behavior; that's a large part of the reason that this didn't change for ~10 years.)

comment:2 in reply to: ↑ 1 ; follow-up: ↓ 3 Changed 2 years ago by adapterik

Replying to gb:

> The 1.8 release notes mention that the default value of the :DIRECTORIES argument has changed; you're correct in noting that the documentation should be updated.
>
> If you think the spec says enough about DIRECTORY's behavior for there to be a conformance issue here, I think that you should consider that very carefully.

Granted, granted, this is murky territory. I really shouldn't be wading in here...

> People have been making the argument for years that DIRECTORY's output should include directories, since (after all, they claim) "directories are just files." A lot of different behaviors are (or are not) "intuitive". The spec (the glossary) defines a file to be "a named entry in a file system, having an implementation-defined nature." That doesn't seem to resolve the question of whether or not a directory is a file.

Yes, but if you follow up on "file system", you will find:

"a facility which permits aggregations of data to be stored in named files"

which I think makes it pretty clear that files need to be able to store data, and by that we should assume Lisp data (although "data" isn't cross-referenced in the glossary, à la the ubiquitous CLHS style). That clearly rules out directories. Also, the vast majority of references to files in the CLHS presume a regular container file, not a directory, not a link, etc.

> CCL's traditional behavior was inherited from MCL, and some people may prefer this. That behavior differs from that of most (all?) other implementations, which behave as if :DIRECTORIES T was specified (and this difference is noted in e.g. Chapter 15 of the book "Practical Common Lisp".)
>
> DIRECTORY is (by definition) extremely implementation-dependent. (Among other things, it's defined to accept keyword arguments but no portable keyword arguments are defined.) It is almost laughable to see people trying to use DIRECTORY in portable code, but that doesn't stop people from doing so (and, as that chapter of Practical Common Lisp notes, it's possible to do so in practice in many cases if one is aware of some implementation-specific issues.)
>
> I've had people literally question whether it was legal for DIRECTORY to accept keyword arguments, and after listening to this kind of thing several times a year for 10 years or so I figured that changing the default might improve the signal-to-noise ratio on the mailing list as far as this issue was concerned. (Yes, the change breaks backward compatibility and negatively affects anyone who expects the old behavior; that's a large part of the reason that this didn't change for ~10 years.)

Haha, okay.

I do remember being shocked in some implementations (CLISP, e.g., and I guess CCL) not to find directories in a (DIRECTORY "*"). However, I'm not as pessimistic as you. I don't think it's all that hard to explain the logic of DIRECTORY -- okay, you've got to figure it out first!

I think part of the problem is that the pathname/logical-pathname is too overloaded and punned to death. However, I think that proper usage of it, short of killing it off in a new Modern Common Lisp, is to get a definitive interpretation out there. I know, I really know, from my early days reading the diatribes and flame fests on c.l.l. what can happen when this is considered.

Let me mention a simple method:

1. The interpretation of the pathname spec for DIRECTORY requires a recognition that it can represent either a directory or a file, and not both.

2. If file name patterns are provided (a :NAME, :TYPE or :VERSION that is neither :UNSPECIFIC nor nil), then the directory slot refers to the directory path of the file.

3. If no file name patterns are supplied (or they are :UNSPECIFIC or nil), then the directory slot, if specified, refers to a directory itself.

  1. This implies that directories do not have a type or version. Directory paths are represented as lists of symbols and strings, so it is clear that directory components do not have a type or version.
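The rules above could be sketched as a small classifier. The names COMPONENT-PRESENT-P and PATHNAME-DESIGNATES are hypothetical, and the sketch checks only the name and type components: :VERSION is left out as a simplifying assumption, since implementations differ in how they fill it in.

```lisp
(defun component-present-p (value)
  "True when VALUE is a real pathname component, i.e. neither NIL nor :UNSPECIFIC."
  (and value (not (eq value :unspecific)) t))

(defun pathname-designates (pathspec)
  "Classify PATHSPEC as :FILE or :DIRECTORY, never both (rule 1)."
  (let ((p (pathname pathspec)))
    (if (or (component-present-p (pathname-name p))
            (component-present-p (pathname-type p)))
        ;; Rule 2: a name or type pattern is present, so the directory
        ;; slot is only the path leading to the file.
        :file
        ;; Rule 3: no file name patterns, so the directory slot is the
        ;; thing being named.
        :directory)))
```

For example, a pathname built with only a :DIRECTORY component would classify as :DIRECTORY, while adding a :NAME (even :WILD) would turn it into :FILE.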

Does that cover it?

The proviso is that if you really want to get at platform-specific features, that by definition would require more powerful tools, e.g. what CCL offers.

The same rules should apply to probe-file and friends.

I also wanted to mention the point that IF non-regular-file results are returned, the programmer MUST test the returned pathnames to determine whether they are files or directories, because anything that expects a file will not work if the pathname indeed points to a directory. In order to test whether the pathname points to a file or directory, you've gotta understand pathname anyway. This is an argument for the programmer constructing a proper pathname spec for DIRECTORY to start with, and for having that work sensibly.
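As a rough illustration of that test, here is a sketch that splits a list of returned pathnames into files and directories. PARTITION-ENTRIES is a hypothetical helper, and it assumes directories are returned in directory form (no name or type component); an implementation that puts the directory name in the name slot would need a different check.

```lisp
(defun partition-entries (pathnames)
  "Split PATHNAMES into two lists, returned as two values: files, then directories."
  (flet ((present-p (component)
           ;; NIL and :UNSPECIFIC both count as a missing component.
           (not (member component '(nil :unspecific)))))
    (let ((files '())
          (dirs '()))
      (dolist (p pathnames (values (nreverse files) (nreverse dirs)))
        (if (or (present-p (pathname-name p))
                (present-p (pathname-type p)))
            (push p files)
            (push p dirs))))))
```

A caller might use it as (multiple-value-bind (files dirs) (partition-entries (directory ...)) ...), and then hand only FILES to code that expects to open them.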

Finally, I wanted to underline that I think the :TEST extension, along with the other CCL extensions, allows a totally different approach to DIRECTORY that would be a very useful, if implementation-specific, alternative.

Erik.

comment:3 in reply to: ↑ 2 Changed 2 years ago by adapterik

Replying to adapterik:

> Replying to gb:
>
>   1. This implies that directories do not have a type or version. Directory paths are represented as lists of symbols and strings, so it is clear that directory components do not have a type or version.

Okay, I want to retract that to some degree. Although it is possible to consider a directory with a type or version, the representation of a directory spec as a list of keyword symbols and strings (or nil or :wild) does not say anything about this. I was getting the path representation confused with the directory as an object. In that case, I can see that a dotted suffix would get interpreted as a "type". In this case, I don't see a way to disambiguate a dotted file from a dotted directory.

Wouldn't a "file-type" slot have been nice?

However, it does seem to me that it could be agreed upon that a pathname with no name, type or version should be interpreted as the directory container itself, which is found in the directory slot, the (car (last (pathname-directory p))) of which is the name of the directory.
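A small sketch of that reading, assuming the pathname is already in directory form (no name, type or version). DIRECTORY-LEAF-NAME is a hypothetical helper name.

```lisp
(defun directory-leaf-name (p)
  "Return the last string in P's directory component, or NIL if there is none."
  (let ((leaf (car (last (pathname-directory p)))))
    ;; The last element may be a keyword such as :ABSOLUTE or :WILD rather
    ;; than a directory name, so only strings are returned.
    (and (stringp leaf) leaf)))
```

The STRINGP guard covers the root directory, whose directory component holds only :ABSOLUTE.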

On the other hand I can see why it is tempting to put the directory name in the name slot. After all, there it is, in ls. But then that pathname can't be sensibly used as a directory, e.g. with merge-pathnames. I guess that is why CCL returns pathnames that make sense, but is forgiving with users by allowing them to supply pathnames that bend the rules. But perhaps that is too asymmetric and just confuses people (like me)?

? (probe-file (make-pathname :type "gconf"))
#P"/home/epearson/.gconf/"
? (pathname-type (probe-file ".gconf"))
NIL
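The MERGE-PATHNAMES point can be seen in a sketch like the following, which uses only standard functions. The variable and function names, and the "/home/user/dir" path, are made up for illustration.

```lisp
;; The same directory written two ways.
(defparameter *as-directory*
  (make-pathname :directory '(:absolute "home" "user" "dir")))

(defparameter *as-file*
  (make-pathname :directory '(:absolute "home" "user") :name "dir"))

;; A bare file name to be placed "inside" the directory.
(defparameter *leaf* (make-pathname :name "file" :type "txt"))

(defun merged-directory (base)
  "Directory component obtained by merging *LEAF* against BASE."
  (pathname-directory (merge-pathnames *leaf* base)))
```

Merging against *AS-DIRECTORY* keeps "dir" in the directory list. Merging against *AS-FILE* does not, because MERGE-PATHNAMES only fills in components that are missing, and *LEAF* already has a name, so the "dir" in the name slot is simply dropped.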

Oh, god, now I'm going down the rabbit hole again.

Erik.

comment:4 Changed 2 years ago by adapterik

I don't want anyone to stress about this (but I doubt anyone but me is, since it is a laughable problem), so I'll consent to closing this.

Let me just leave a few shared thoughts. The first is that in order to build strong, portable applications, we need to remove ambiguity. The spec provides the framework for considering strict and tolerant applications. It provides wiggle room, but you don't need to use it. Wide tolerances lead to unpredictable behavior.

The second is that I'm sure there is a solution to the DIRECTORY and PATHNAME dilemma, one that solves the ambiguity problem with a small set of reasonable assumptions. The benefits for reliable portability are worth it.

The third is that there should be an enhancement in this area as part of an open CL extension process. There will always be room for, and Lisp certainly encourages, independent solutions, but I feel there is great benefit in a shared solution to common use cases.
