Ticket #849 (closed defect: invalid)

Opened 3 years ago

Last modified 3 years ago

Reader tokenization is wrong for +.e2

Reported by: pjb@… Owned by:
Priority: normal Milestone:
Component: ANSI CL Compliance Version: trunk
Keywords: reader token float symbol Cc:

Description (last modified by gb)

$ clall -r '(read-from-string "+.e2")'

International Allegro CL Free Express Edition --> +.E2, 4
Clozure Common Lisp            --> 0.0, 4
CLISP                          --> |+.E2|, 4
CMU Common Lisp                --> |+.E2|, 4
ECL                            --> 0.0, 4
SBCL                           --> |+.E2|, 4


CLHS is clear about it: per section 2.3.1, "Numbers as Tokens", the syntax for
floating-point numbers is:

    float          ::=  [sign]
                       {decimal-digit}*
                       decimal-point
                       {decimal-digit}+
                       [exponent]  
                        | 
                       [sign]
                       {decimal-digit}+
                       [decimal-point
                           {decimal-digit}*]
                       exponent    
    exponent       ::=  exponent-marker
                       [sign]
                       {digit}+    

That is, at least one decimal digit is required before or after the dot.

With no digit, the token +.e2 should be read as a symbol.
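
For reference, the grammar above can be checked mechanically. What follows is only a minimal sketch in portable Common Lisp: float-token-p is a hypothetical helper written for illustration, not a function from any implementation's reader, and it covers only the float production quoted above (not integers, ratios, or the potential-number rules).

```lisp
;; Hypothetical helper (an assumption for illustration, not implementation code).
(defun float-token-p (token)
  "Return T if TOKEN (a string) matches the float production of CLHS 2.3.1."
  (let ((pos 0)
        (len (length token)))
    (labels ((peek ()
               ;; Next unread character, or NIL at end of token.
               (and (< pos len) (char token pos)))
             (skip-sign ()
               (when (member (peek) '(#\+ #\-))
                 (incf pos)))
             (skip-digits ()
               ;; Consume decimal digits; return how many were consumed.
               (let ((start pos))
                 (loop while (and (peek) (digit-char-p (peek)))
                       do (incf pos))
                 (- pos start)))
             (skip-exponent ()
               ;; exponent ::= exponent-marker [sign] {digit}+
               (let ((start pos))
                 (cond ((and (peek)
                             (find (char-downcase (peek)) "defls")
                             (progn (incf pos)
                                    (skip-sign)
                                    (plusp (skip-digits))))
                        t)
                       (t (setf pos start)   ; no exponent here: back up
                          nil)))))
      (skip-sign)
      (let* ((before (skip-digits))
             (dot (when (eql (peek) #\.)
                    (incf pos)
                    t))
             (after (if dot (skip-digits) 0))
             (exponent (skip-exponent)))
        (and (= pos len)
             (or (and dot (plusp after))          ; first alternative
                 (and (plusp before) exponent))   ; second alternative
             t)))))
```

With this sketch, (float-token-p "+.e2") returns NIL, while (float-token-p "+.5e2") and (float-token-p "1.e2") return T.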

Change History

comment:1 Changed 3 years ago by gb

  • Status changed from new to closed
  • Resolution set to invalid
  • Description modified

The token in question is a "potential number": implementations are free to interpret it as a number or not, which in turn means that you can't really use such a token in portable code without escaping it. See CLHS section 2.3.1.1.

I'm not sure that CCL's interpretation of that token as a number is intentional, or that we want to advertise the syntax as a supported extension, but the spec calls such tokens (potential numbers that don't have standard number syntax) "reserved".

I think the CCL code that concludes that this is a number is probably just being sloppy about things, and if that code is used elsewhere, the sloppiness could affect something that's more stringently defined. Since in this case (reading a potential number) it's explicitly undefined whether a number or a symbol is returned, it's necessary to escape the token to remove the ambiguity.
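
A minimal sketch of what escaping looks like here, assuming the standard readtable (readtable case :upcase) and that both forms are read in the same package. A token that contains escape characters cannot be a potential number, so either form should give a symbol in any conforming implementation:

```lisp
;; Multiple escape: the characters between the bars are taken literally.
(read-from-string "|+.E2|")   ; a symbol whose name is "+.E2"

;; Single escape: the backslash makes the + strictly alphabetic, so the token
;; is no longer a potential number. (The backslash is doubled only because it
;; appears inside a Lisp string.)
(read-from-string "\\+.e2")   ; the same symbol; .e2 is upcased by the reader

;; Both spellings name the same symbol.
(eq (read-from-string "|+.E2|")
    (read-from-string "\\+.e2"))
```

The unescaped spelling +.e2 remains implementation-dependent, as the clall output in the description shows.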
