2.1 Character Set

  1. The only characters allowed outside of comments are the graphic_characters and format_effectors.


    Syntax

  2. character ::=
         graphic_character
       | format_effector
       | other_control_function
  3. graphic_character ::=
         identifier_letter
       | digit
       | space_character
       | special_character

    Static Semantics

  4. The character repertoire for the text of an Ada program consists of the collection of characters called the Basic Multilingual Plane (BMP) of the ISO 10646 Universal Multiple-Octet Coded Character Set, plus a set of format_effectors and, in comments only, a set of other_control_functions; the coded representation for these characters is implementation defined (it need not be a representation defined within ISO-10646-1).
  5. The description of the language definition in this International Standard uses the graphic symbols defined for Row 00: Basic Latin and Row 00: Latin-1 Supplement of the ISO 10646 BMP; these correspond to the graphic symbols of ISO 8859-1 (Latin-1); no graphic symbols are used in this International Standard for characters outside of Row 00 of the BMP. The actual set of graphic symbols used by an implementation for the visual representation of the text of an Ada program is not specified.
  6. The categories of characters are defined as follows:
  7. identifier_letter
    upper_case_identifier_letter | lower_case_identifier_letter
  8. upper_case_identifier_letter
    Any character of Row 00 of ISO 10646 BMP whose name begins
    ``Latin Capital Letter''.
  9. lower_case_identifier_letter
    Any character of Row 00 of ISO 10646 BMP whose name begins
    ``Latin Small Letter''.
  10. digit
    One of the characters 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9.
  11. space_character
    The character of ISO 10646 BMP named ``Space''.
  12. special_character
    Any character of the ISO 10646 BMP that is not reserved for a
    control function, and is not the space_character, an
    identifier_letter, or a digit.
  13. format_effector
    The control functions of ISO 6429 called character tabulation
    (HT), line tabulation (VT), carriage return (CR), line feed (LF),
    and form feed (FF).
  14. other_control_function
    Any control function, other than a format_effector, that is
    allowed in a comment; the set of other_control_functions allowed
    in comments is implementation defined.
  15. The following names are used when referring to certain special_characters:
    symbol   name                   symbol   name
      "      quotation mark           :      colon
      #      number sign              ;      semicolon
      &      ampersand                <      less-than sign
      '      apostrophe, tick         =      equals sign
      (      left parenthesis         >      greater-than sign
      )      right parenthesis        _      low line, underline
      *      asterisk, multiply       |      vertical line
      +      plus sign                [      left square bracket
      ,      comma                    ]      right square bracket
      -      hyphen-minus, minus      {      left curly bracket
      .      full stop, dot, point    }      right curly bracket
      /      solidus, divide
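
The category definitions in paragraphs 7 through 14 can be read as a decision procedure over BMP code positions. The following Python sketch is one illustrative reading, not part of the language definition. The names `classify`, `is_control`, and `FORMAT_EFFECTORS` are hypothetical. It assumes that Python's `unicodedata` names agree with the ISO 10646 names for Row 00, and that every non-format-effector control position is a candidate other_control_function (whether a given one is actually allowed in comments is implementation defined).

```python
import unicodedata

# Format effectors of paragraph 13: HT, LF, VT, FF, CR.
FORMAT_EFFECTORS = {0x09, 0x0A, 0x0B, 0x0C, 0x0D}


def is_control(code: int) -> bool:
    """Positions treated as reserved for control functions.

    Assumption: uses the ranges 0000-001F, 007F-009F and FFFE-FFFF,
    which this section excludes from graphic_character.
    """
    return code <= 0x1F or 0x7F <= code <= 0x9F or code >= 0xFFFE


def classify(ch: str) -> str:
    """Return the category name for a single BMP character."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("outside the Basic Multilingual Plane")
    if code in FORMAT_EFFECTORS:
        return "format_effector"
    if is_control(code):
        # Only a candidate: the allowed set is implementation defined.
        return "other_control_function"
    if ch == " ":
        return "space_character"
    if ch in "0123456789":
        return "digit"
    if code <= 0xFF:  # identifier_letters are confined to Row 00
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN CAPITAL LETTER"):
            return "upper_case_identifier_letter"
        if name.startswith("LATIN SMALL LETTER"):
            return "lower_case_identifier_letter"
    # Every remaining non-control BMP character is a special_character.
    return "special_character"
```

Under this reading, a letter outside Row 00 such as U+0100 is a special_character rather than an identifier_letter, because paragraphs 8 and 9 restrict identifier_letters to Row 00.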

    Implementation Permissions

  16. In a nonstandard mode, the implementation may support a different character repertoire; in particular, the set of characters that are considered identifier_letters can be extended or changed to conform to local conventions.


    NOTES

  17. (1) Every code position of ISO 10646 BMP that is not reserved for a control function is defined to be a graphic_character by this International Standard. This includes all code positions other than 0000 - 001F, 007F - 009F, and FFFE - FFFF.
  18. (2) The language does not specify the source representation of programs.
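
The ranges given in note (1) can be checked mechanically. The sketch below is illustrative only; `NON_GRAPHIC_RANGES` and `is_graphic_position` are hypothetical names, and the ranges are taken exactly as the note states them, with no further exclusions assumed.

```python
# Code-position ranges that note (1) excludes from graphic_character
# (inclusive bounds, hexadecimal as in the note).
NON_GRAPHIC_RANGES = [(0x0000, 0x001F), (0x007F, 0x009F), (0xFFFE, 0xFFFF)]


def is_graphic_position(code: int) -> bool:
    """True if a BMP code position is a graphic_character per note (1)."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("not a BMP code position")
    return not any(lo <= code <= hi for lo, hi in NON_GRAPHIC_RANGES)


# Count the graphic code positions across the whole BMP.
graphic_count = sum(is_graphic_position(c) for c in range(0x10000))
```

The three excluded ranges cover 32 + 33 + 2 = 67 code positions, so `graphic_count` comes to 65536 - 67 = 65469.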

