Version 2 (modified by gz, 6 years ago)

15. High-Level Text Primitives

This chapter discusses primitives that operate on higher level text forms than characters and words. For English text, there are functions that know about sentence and paragraph structures, and for Lisp sources, there are functions that understand this language. This chapter also describes mechanisms for organizing file sections into logical pages and for formatting text forms.

15.1. Indenting Text

Indent Function (initial value tab-to-tab-stop) [Variable]

The value of this variable determines how indentation is done, and it is a function which is passed a mark as its argument. The function should indent the line that the mark points to. The function may move the mark around on the line. The mark will be :left-inserting. The default simply inserts a tab character at the mark. A function for Lisp mode probably moves the mark to the beginning of the line, deletes horizontal whitespace, and computes some appropriate indentation for Lisp code.
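As an illustration of this contract, here is a minimal sketch of a custom indentation function. The name indent-two-spaces is hypothetical, and the sketch assumes it is evaluated in a package where Hemlock's symbols (line-start, insert-string, value, and so on) are accessible. It follows the pattern described above for Lisp mode, but always uses a fixed indentation instead of computing one.

```lisp
;; Hypothetical example: indent the line MARK points into by exactly
;; two spaces, discarding whatever leading whitespace was there.
(defun indent-two-spaces (mark)
  ;; Move to the beginning of the line.
  (line-start mark)
  ;; Remove existing whitespace around the mark (see delete-horizontal-space
  ;; later in this section).
  (delete-horizontal-space mark)
  ;; MARK is :left-inserting, so it ends up after the inserted text.
  (insert-string mark "  "))

;; Install it as the value of Indent Function.  Setting the variable
;; globally is an assumption of this sketch; a mode would more likely
;; define the variable with a mode-local value.
(setf (value indent-function) #'indent-two-spaces)
```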

Indent with Tabs (initial value nil) [Variable]
Spaces per Tab (initial value 8) [Variable]

Indent with Tabs should be true if indenting should use tabs whenever possible. If nil, the default, it only uses spaces. Spaces per Tab defines the size of a tab.

indent-region region [Function]
indent-region-for-commands region [Function]

indent-region invokes the value of Indent Function on every line of region. indent-region-for-commands uses indent-region but first saves the region for the Undo command.

delete-horizontal-space mark [Function]

This deletes the contiguous run of characters on both sides of mark whose Space attribute (see section 9.5) is 1.

15.2. Lisp Text Buffers

Hemlock bases its Lisp primitives on parsing a block of the buffer and annotating lines as to what kind of Lisp syntax occurs on the line or what kind of form a mark might be in (for example, string, comment, list, etc.). These do not work well if the block of parsed forms is exceeded when moving marks around these forms, but the block that gets parsed is somewhat programmable.

There is also a notion of a top level form which this documentation often uses synonymously with defun, meaning a Lisp form occurring in a source file delimited by parentheses with the opening parenthesis at the beginning of some line. The names of the functions include this inconsistency.

pre-command-parse-check mark for-sure [Function]
Parse Start Function (initial value start-of-parse-block) [Variable]
Parse End Function (initial value end-of-parse-block) [Variable]
Minimum Lines Parsed (initial value 50) [Variable]
Maximum Lines Parsed (initial value 500) [Variable]
Defun Parse Goal (initial value 2) [Variable]

pre-command-parse-check calls Parse Start Function and Parse End Function on mark to get two marks. It then parses all the lines between the marks including the complete lines they point into. When for-sure is non-nil, this parses the area regardless of any cached information about the lines. Every command that uses the following routines calls this before doing so.

The default values of the start and end variables use Minimum Lines Parsed, Maximum Lines Parsed, and Defun Parse Goal to determine how big a region to parse. These two functions always include at least the minimum number of lines before and after the mark passed to them. They try to include Defun Parse Goal number of top level forms before and after the mark passed them, but these functions never return marks that include more than the maximum number of lines before or after the mark passed to them.
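The interaction of the three variables can be modeled in plain Common Lisp. The following is only a simplified sketch of the rule stated above for one direction from the mark, not Hemlock's implementation: the function name lines-to-parse and its list-of-distances argument are hypothetical, and falling back to the maximum when too few top level forms exist is an assumption.

```lisp
;; Model of how far the parse block extends in one direction.
;; FORM-DISTANCES lists, nearest first, how many lines away from the
;; mark each successive top level form boundary lies.
(defun lines-to-parse (form-distances min-lines max-lines goal)
  (let ((wanted (if (>= (length form-distances) goal)
                    ;; Enough forms: reach the GOAL-th one.
                    (nth (1- goal) form-distances)
                    ;; Assumption: with too few forms, go as far as allowed.
                    max-lines)))
    ;; Never fewer than the minimum, never more than the maximum.
    (max min-lines (min wanted max-lines))))

;; With the initial values (50, 500, 2):
;;   (lines-to-parse '(10 30) 50 500 2)    two nearby forms, minimum applies
;;   (lines-to-parse '(120 260) 50 500 2)  extends to the second form
;;   (lines-to-parse '(400 900) 50 500 2)  second form too far, maximum applies
```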

form-offset mark count [Function]

This tries to move mark count forms forward if positive or -count forms backwards if negative. Mark is always moved. If there were enough forms in the appropriate direction, this returns mark, otherwise nil.
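Because form-offset moves mark even when it fails, callers that must not disturb the mark on failure can probe with a temporary copy. The following is a minimal sketch of that approach; the name try-forward-form is hypothetical, and for-sure is passed explicitly as nil.

```lisp
;; Move MARK over one form if there is one; otherwise leave MARK
;; where it was.  Returns MARK on success and nil on failure.
(defun try-forward-form (mark)
  ;; Make sure the lines around MARK have been parsed.
  (pre-command-parse-check mark nil)
  ;; PROBE is a temporary copy, so a failed form-offset only moves the copy.
  (with-mark ((probe mark))
    (when (form-offset probe 1)
      (move-mark mark probe))))
```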

top-level-offset mark count [Function]

This tries to move mark count top level forms forward if positive or -count top level forms backwards if negative. If there were enough top level forms in the appropriate direction, this returns mark, otherwise nil. Mark is moved only if this is successful.

mark-top-level-form mark1 mark2 [Function]

This moves mark1 and mark2 to the beginning and end, respectively, of the current or next top level form. Mark1 is used as a reference to start looking. The marks may be altered even if unsuccessful. If successful, this returns mark2; otherwise, it returns nil. Mark2 is left at the beginning of the line following the top level form if possible, but if the last line has text after the closing parenthesis, this leaves the mark immediately after the form.

defun-region mark [Function]

This returns a region around the current or next defun with respect to mark. Mark is not used to form the region. If there is no appropriate top level form, this signals an editor-error. This calls pre-command-parse-check first.
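A short usage sketch: the function below returns the text of the current or next defun as a string. The name current-defun-string is hypothetical; current-point and region-to-string are assumed to be available as described elsewhere in this manual.

```lisp
;; Return the text of the top level form at or after the current point.
;; defun-region calls pre-command-parse-check itself and signals an
;; editor-error if there is no appropriate top level form.
(defun current-defun-string ()
  (region-to-string (defun-region (current-point))))
```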

inside-defun-p mark [Function]
start-defun-p mark [Function]

These return, respectively, whether mark is inside a top level form or at the beginning of a line immediately before a character whose Lisp Syntax (see section 9.5) value is :opening-paren.

forward-up-list mark [Function]
backward-up-list mark [Function]

Respectively, these move mark immediately past a character whose Lisp Syntax (see section 9.5) value is :closing-paren or immediately before a character whose Lisp Syntax value is :opening-paren.

valid-spot mark forwardp [Function]

This returns t or nil depending on whether the character indicated by mark is a valid spot. When forwardp is non-nil, this tests the character after mark; otherwise, it tests the character before mark. Valid spots exclude commented text, the insides of strings, and quoted characters.

defindent name count [Function]

This defines the function with name to have count special arguments. indent-for-lisp, the value of Indent Function (see section 15.1) in Lisp mode, uses this to specially indent these arguments. For example, do has two, with-open-file has one, etc. There are many of these defined by the system including definitions for special Hemlock forms. Name is a simple-string, case insensitive and purely textual (that is, not read by the Lisp reader); therefore, "with-a-mumble" is distinct from "mumble:with-a-mumble".
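For example, assuming a hypothetical macro with-a-mumble that, like with-open-file, takes one special argument before its body, the following forms register its indentation. Because names are compared textually, the package-qualified spelling needs its own entry.

```lisp
;; One special argument, then the body.
(defindent "with-a-mumble" 1)

;; The qualified name is a different string, so it is defined separately.
(defindent "mumble:with-a-mumble" 1)
```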

15.3. English Text Buffers

This section describes some routines that understand basic English language forms.

word-offset mark count [Function]

This moves mark count words forward (if positive) or backwards (if negative). If mark is in the middle of a word, that counts as one. If there were count (-count if negative) words in the appropriate direction, this returns mark, otherwise nil. This always moves mark. A word lies between two characters whose Word Delimiter attribute value is 1 (see section 9.5).

sentence-offset mark count [Function]

This moves mark count sentences forward (if positive) or backwards (if negative). If mark is in the middle of a sentence, that counts as one. If there were count (-count if negative) sentences in the appropriate direction, this returns mark, otherwise nil. This always moves mark.

A sentence ends with a character whose Sentence Terminator attribute is 1 followed by two spaces, a newline, or the end of the buffer. The terminating character is optionally followed by any number of characters whose Sentence Closing Char attribute is 1. A sentence begins after a previous sentence ends, at the beginning of a paragraph, or at the beginning of the buffer.
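The sentence-ending rule can be modeled on an ordinary string in plain Common Lisp. This is only an illustrative sketch, not Hemlock's code: the function name sentence-end-p is hypothetical, and the default terminator and closing character sets below are assumptions standing in for the Sentence Terminator and Sentence Closing Char attributes.

```lisp
;; True if the character at INDEX of STRING ends a sentence: it is a
;; terminator, optionally followed by closing characters, and then by
;; two spaces, a newline, or the end of the string.
(defun sentence-end-p (string index
                       &key (terminators ".!?") (closers "\"')]"))
  (and (< -1 index (length string))
       (find (char string index) terminators)
       ;; Skip any closing characters after the terminator.
       (let ((next (or (position-if-not (lambda (c) (find c closers))
                                        string :start (1+ index))
                       (length string))))
         (or (= next (length string))                ; end of the text
             (char= (char string next) #\Newline)    ; newline
             (and (< (1+ next) (length string))      ; two spaces
                  (char= (char string next) #\Space)
                  (char= (char string (1+ next)) #\Space))))
       t))
```

Under this model, the period in "e.g. this" does not end a sentence, because it is followed by only one space.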

paragraph-offset mark count &optional prefix [Function]
Paragraph Delimiter Function (initial value ) [Variable]

This moves mark count paragraphs forward (if positive) or backwards (if negative). If mark is in the middle of a paragraph, that counts as one. If there were count (-count if negative) paragraphs in the appropriate direction, this returns mark, otherwise nil. This only moves mark if there were enough paragraphs.

Paragraph Delimiter Function holds a function that takes a mark, typically at the beginning of a line, and returns whether or not the current line should break the paragraph. default-para-delim-function returns t if the next character, the first on the line, has a Paragraph Delimiter attribute value of 1. This is typically a space, for an indented paragraph, or a newline, for a block style. Some modes require a more complicated determinant; for example, Scribe mode adds some characters to the set and special-cases certain formatting commands.

Prefix defaults to Fill Prefix (see section 15.5), and the right prefix is necessary to correctly skip paragraphs. If prefix is non-nil, and a line begins with prefix, then the scanning process skips the prefix before invoking the Paragraph Delimiter Function. Note, when scanning for paragraph bounds, and prefix is non-nil, lines are potentially part of the paragraph regardless of whether they contain the prefix; only the result of invoking the delimiter function matters.

The programmer should be aware of an idiom for finding the end of the current paragraph. Assume paragraphp is the result of moving mark one paragraph with paragraph-offset; then the following form correctly determines whether there actually is a current paragraph:

(or paragraphp
  (and (last-line-p mark)
       (end-line-p mark)
       (not (blank-line-p (mark-line mark))))) 

In this example, mark is at the end of the last paragraph in the buffer, and the buffer does not end with a newline character. paragraph-offset would have returned nil because it could not skip any paragraphs: mark was already at the end of the current, and last, paragraph. However, there is still a current paragraph on which to operate. mark-paragraph understands this problem.
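Put into context, the idiom might be wrapped as follows. This is a sketch only; the function name end-of-current-paragraph is hypothetical, and the default prefix (Fill Prefix) is used.

```lisp
;; Try to move MARK to the end of the current paragraph.  Returns MARK
;; if there is a current paragraph to operate on, otherwise nil.
(defun end-of-current-paragraph (mark)
  (let ((paragraphp (paragraph-offset mark 1)))
    (when (or paragraphp
              ;; paragraph-offset failed, but MARK may already sit at the
              ;; end of a final paragraph that has no trailing newline.
              (and (last-line-p mark)
                   (end-line-p mark)
                   (not (blank-line-p (mark-line mark)))))
      mark)))
```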

mark-paragraph mark1 mark2 [Function]

This marks the next or current paragraph, setting mark1 to the beginning and mark2 to the end. This uses Fill Prefix (see section 15.5). Mark1 is always on the first line of the paragraph, regardless of whether the previous line is blank. Mark2 is typically at the beginning of the line after the line the paragraph ends on. This returns mark2 on success. If this cannot find a paragraph, then the marks are left unmoved, and this returns nil.
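A typical use is to delimit a paragraph and then hand the resulting region to another primitive. The sketch below fills the paragraph at or after a mark using fill-region from section 15.5; the function name fill-paragraph-at and the error message are hypothetical.

```lisp
;; Fill the current or next paragraph relative to MARK, using the
;; default Fill Prefix and Fill Column.
(defun fill-paragraph-at (mark)
  ;; Work on copies so that MARK itself is not moved by mark-paragraph.
  (with-mark ((start mark :left-inserting)
              (end mark :left-inserting))
    (if (mark-paragraph start end)
        (fill-region (region start end))
        (editor-error "No paragraph to fill."))))
```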

15.4. Logical Pages

Logical pages are not supported at this time.

15.5. Filling

Filling is an operation on text that breaks long lines at word boundaries before a given column and merges shorter lines together in an attempt to make each line roughly the specified length. This is different from justification which tries to add whitespace in awkward places to make each line exactly the same length. Hemlock's filling optionally inserts a specified string at the beginning of each line. Also, it eliminates extra whitespace between lines and words, but it knows two spaces follow sentences (see section 15.3).
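The operation can be illustrated on plain strings in Common Lisp. The following is a simplified model, not Hemlock's algorithm: the names split-words and fill-words are hypothetical, it works on a string instead of a region, and it assumes every word ending in ".", "!", or "?" ends a sentence and is therefore followed by two spaces.

```lisp
;; Split TEXT on spaces, tabs, and newlines, dropping empty pieces.
(defun split-words (text)
  (let ((words '()) (start nil))
    (dotimes (i (length text))
      (let ((whitep (member (char text i) '(#\Space #\Newline #\Tab))))
        (cond ((and whitep start)
               (push (subseq text start i) words)
               (setf start nil))
              ((and (not whitep) (null start))
               (setf start i)))))
    (when start (push (subseq text start) words))
    (nreverse words)))

;; Return a list of lines, each beginning with PREFIX.  Words are packed
;; greedily; a line is broken before the word that would pass COLUMN.
(defun fill-words (text column &optional (prefix ""))
  (let ((lines '()) (line nil))
    (dolist (word (split-words text))
      (let ((gap (cond ((null line) "")
                       ;; Assumed sentence end: keep two spaces.
                       ((find (char line (1- (length line))) ".!?") "  ")
                       (t " "))))
        (if (and line
                 (> (+ (length line) (length gap) (length word)) column))
            ;; The word does not fit: finish this line and start another.
            (progn (push line lines)
                   (setf line (concatenate 'string prefix word)))
            (setf line (concatenate 'string (or line prefix) gap word)))))
    (when line (push line lines))
    (nreverse lines)))
```

Note that this model breaks and merges lines but never pads them, which is the distinction from justification drawn above.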

Fill Column (initial value 75) [Variable]
Fill Prefix (initial value nil) [Variable]

These variables hold the default values of the prefix and column arguments to Hemlock's filling primitives. If Fill Prefix is nil, then there is no fill prefix.

fill-region region &optional prefix column [Function]

This deletes any blank lines in region and fills it according to prefix and column. Prefix and column default to Fill Prefix and Fill Column.

fill-region-by-paragraphs region &optional prefix column [Function]

This finds paragraphs (see section 15.3) within region and fills them with fill-region. This ignores blank lines between paragraphs. Prefix and column default to Fill Prefix and Fill Column.
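As a usage sketch, the function below refills every paragraph of the current buffer as quoted text. The function name refill-quoted-buffer and the prefix and column values are hypothetical choices for illustration; buffer-region and current-buffer are assumed to be available as described elsewhere in this manual.

```lisp
;; Refill each paragraph in the current buffer, starting every line
;; with "> " and breaking lines before column 60.
(defun refill-quoted-buffer ()
  (fill-region-by-paragraphs (buffer-region (current-buffer)) "> " 60))
```

Omitting the last two arguments would use the current values of Fill Prefix and Fill Column instead.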
