(sed.info.gz) Escapes
Info Catalog
(sed.info.gz) Extended Commands
(sed.info.gz) sed Programs
GNU Extensions for Escapes in Regular Expressions
=================================================
Until this chapter, we have only encountered escapes of the form
`\^', which tell `sed' not to interpret the circumflex as a special
character, but rather to take it literally. For example, `\*' matches
a single asterisk rather than zero or more backslashes.
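As a brief illustration of this first kind of escape, the sketch below removes the special meaning of `*' and `^' in a basic regular expression. The variable names are only illustrative, and the commands assume a POSIX shell with `sed' on the path.

```shell
# \* matches one literal asterisk instead of acting as a repetition operator.
star_out=$(printf 'a*b\n' | sed 's/\*/STAR/')

# \^ matches a literal circumflex instead of anchoring the match.
lit_caret_out=$(printf 'x^2\n' | sed 's/\^/**/')

printf '%s\n' "$star_out" "$lit_caret_out"
```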
This chapter introduces another kind of escape(1)--that is, escapes
that are applied to a character or sequence of characters that
ordinarily are taken literally, and that `sed' replaces with a special
character. This provides a way of encoding non-printable characters in
patterns in a visible manner. There is no restriction on the
appearance of non-printing characters in a `sed' script, but when a
script is being prepared in the shell or by text editing, it is usually
easier to use one of the following escape sequences than the binary
character it represents:
`\a'
Produces or matches a BEL character, that is, an "alert" (ASCII 7).
`\f'
Produces or matches a form feed (ASCII 12).
`\n'
Produces or matches a newline (ASCII 10).
`\r'
Produces or matches a carriage return (ASCII 13).
`\t'
Produces or matches a horizontal tab (ASCII 9).
`\v'
Produces or matches a so-called "vertical tab" (ASCII 11).
`\cX'
Produces or matches `CONTROL-X', where X is any character. The
precise effect of `\cX' is as follows: if X is a lower case
letter, it is converted to upper case. Then bit 6 of the
character (hex 40) is inverted. Thus `\cz' becomes hex 1A, but
`\c{' becomes hex 3B, while `\c;' becomes hex 7B.
`\dXXX'
Produces or matches a character whose decimal ASCII value is XXX.
`\oXXX'
Produces or matches a character whose octal ASCII value is XXX.
`\xXX'
Produces or matches a character whose hexadecimal ASCII value is
XX.
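The `\cX' rule given in the list above can be checked with a little shell arithmetic. The function `ctrl_code' below is a hypothetical helper written for this illustration, not part of `sed'; it only reproduces the stated rule (convert a lower case letter to upper case, then invert hex 40) and assumes single-byte ASCII input.

```shell
# Hypothetical helper: print, as two hex digits, the byte that `\cX' denotes
# according to the rule described in the text.
ctrl_code() {
    # Numeric code of the first character of $1 (the leading-quote form of
    # a printf numeric argument is specified by POSIX).
    code=$(printf '%d' "'$1")
    # Lower case ASCII letters (97..122) are converted to upper case first.
    if [ "$code" -ge 97 ] && [ "$code" -le 122 ]; then
        code=$((code - 32))
    fi
    # Invert bit 6 (decimal 64, hex 40) and print the result.
    printf '%02X\n' $((code ^ 64))
}

ctrl_code z      # prints 1A
ctrl_code '{'    # prints 3B
ctrl_code ';'    # prints 7B
```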
`\b' (backspace) was omitted because of the conflict with the
existing "word boundary" meaning.
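The character escapes listed above can be tried from the shell as in the following sketch. It assumes that `sed' is GNU `sed'; other implementations may not accept these escapes. The variable names are illustrative only.

```shell
# Assumption: `sed' here is GNU sed.

# \t in the regular expression matches the tab that printf emits.
tab_out=$(printf 'a\tb\n' | sed 's/\t/,/')

# The same comma written numerically in the replacement:
# hexadecimal 2C, octal 054 and decimal 044 all denote ASCII 44.
hex_out=$(printf 'a b\n' | sed 's/ /\x2C/')
oct_out=$(printf 'a b\n' | sed 's/ /\o054/')
dec_out=$(printf 'a b\n' | sed 's/ /\d044/')

# Backspace (ASCII 8) has no letter escape of its own, since \b means
# "word boundary", but it can still be written numerically as \x08.
bs_out=$(printf 'ab\bc\n' | sed 's/\x08/<BS>/')

printf '%s\n' "$tab_out" "$hex_out" "$oct_out" "$dec_out" "$bs_out"
```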
Other escapes match a particular character class and are valid only
in regular expressions:
`\w'
Matches any "word" character. A "word" character is any letter or
digit or the underscore character.
`\W'
Matches any "non-word" character.
`\b'
Matches a word boundary; that is, it matches if the character to
the left is a "word" character and the character to the right is a
"non-word" character, or vice-versa.
`\B'
Matches everywhere but on a word boundary; that is, it matches if
the character to the left and the character to the right are
either both "word" characters or both "non-word" characters.
`\`'
Matches only at the start of pattern space. This differs from
`^', which in multi-line mode also matches just after each
embedded newline.
`\''
Matches only at the end of pattern space. This differs from
`$', which in multi-line mode also matches just before each
embedded newline.
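The class and anchor escapes above can be exercised as in the sketch below. It assumes GNU `sed' and the default ASCII sense of "word" character; the variable names are illustrative. The last two commands use `N' to join lines into one pattern space and the `M' flag to select multi-line mode, so that `^' and `\`' can be compared.

```shell
# Assumption: `sed' here is GNU sed.

# \w matches letters, digits and the underscore; \W matches everything else.
w_out=$(printf 'foo_1 bar!\n' | sed 's/\w/X/g')
W_out=$(printf 'foo_1 bar!\n' | sed 's/\W/-/g')

# \b requires a word boundary, so the "cat" inside "concat" is left alone.
b_out=$(printf 'cat concat cat\n' | sed 's/\bcat\b/dog/g')

# \B requires the absence of a boundary, so only the embedded "cat" changes.
B_out=$(printf 'cat concat\n' | sed 's/\Bcat/dog/g')

# In multi-line mode, ^ matches at the start of every line in pattern space,
# while \` matches only at the very start of pattern space.
caret_out=$(printf 'a\nb\nc\n' | sed 'N;N;s/^/X/gM')
start_out=$(printf 'a\nb\nc\n' | sed 'N;N;s/\`/X/gM')

printf '%s\n' "$w_out" "$W_out" "$b_out" "$B_out" "$caret_out" "$start_out"
```

With `^', every one of the three joined lines receives the `X' prefix; with `\`', only the first does.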
---------- Footnotes ----------
(1) All the escapes introduced here are GNU extensions, with the
exception of `\n'. In basic regular expression mode, setting
`POSIXLY_CORRECT' disables them inside bracket expressions.