Documentation on here-docs

Here-docs were borrowed into Perl from shell programming. They allow multiline strings without having to backslash your quoting delimiter. When you use a here-doc in single-quoted context, the other nastiness of single-quoted strings is avoided as well. A here-doc can be used in ANY place where a normal quoted string would be used.
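As a small illustration of the points above, the sketch below builds the same two-line string twice, once as an ordinary double-quoted string and once as a here-doc, and then passes a here-doc directly as a function argument. The variable names and the text are made up for the example.

```perl
use strict;
use warnings;

# Ordinary double-quoted string: the embedded quotes must be backslashed.
my $quoted = "She said \"hi\".\nHe said \"bye\".\n";

# Here-doc: the same text, with no backslashes needed for the quotes.
my $heredoc = <<"END";
She said "hi".
He said "bye".
END

print "identical\n" if $quoted eq $heredoc;

# A here-doc can stand wherever a quoted string can, for example as a
# function argument; the body starts on the line after the statement.
my $shout = uc(<<'END');
single 'quotes' need no escaping here
END
print $shout;
```

Note that the body of a here-doc always ends with a newline, which is why the double-quoted version needs a trailing \n to compare equal.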

1. A Primer in Notation

[borrowed from Chapter 1 of the Python Reference Manual]

This document is presented using a modified BNF grammar notation. This uses the following style of definition:

    name:      lc_letter (lc_letter | "_")*
    lc_letter: "a"..."z"

The first line states that a name is an lc_letter followed by a sequence of zero or more lc_letters and underscores. An lc_letter, in turn, is any of the single characters "a" through "z".

Each rule begins with a name (which is the name defined by the rule) and a colon. A vertical bar (|) is used to separate alternatives. A star (*) means zero or more repetitions of the preceding item; likewise, a plus (+) means one or more repetitions, and a phrase enclosed in square brackets ([ ]) means zero or one occurrences (in other words, the enclosed phrase is optional). The * and + operators bind as tightly as possible; parentheses (( )) are used for grouping. Literal strings are enclosed in quotes. White space is only meaningful to separate tokens. Rules are normally contained on a single line; rules with many alternatives may be formatted alternatively with each line after the first beginning with a vertical bar. Two literal characters separated by three dots mean a choice of any single character in the given (inclusive) range of ASCII characters. A phrase between angular brackets (< >) gives an informal description of the symbol defined; e.g., this could be used to describe the notion of 'any character but X' if needed.
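For readers more at home with regular expressions than with BNF, the name rule from the example grammar can be approximated in Perl as shown below. The subroutine name is_name and the list of candidates are invented for this illustration; it is only a sketch of how to read the notation, not part of the notation itself.

```perl
use strict;
use warnings;

# name:      lc_letter (lc_letter | "_")*
# lc_letter: "a"..."z"
sub is_name {
    my ($s) = @_;
    # One lowercase letter, then zero or more lowercase letters or underscores.
    return $s =~ /\A[a-z][a-z_]*\z/ ? 1 : 0;
}

for my $candidate ('foo', 'foo_bar', '_foo', 'Foo', '') {
    printf "%-10s %s\n", "'$candidate'",
        is_name($candidate) ? 'matches' : 'does not match';
}
```

The star in the grammar corresponds to the * quantifier in the regex, and the "a"..."z" range corresponds to the character class [a-z].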

2. Notation

A here-doc is initialized by <<, followed by an identifier. The identifier is the string that terminates the here-doc: it must appear on a line of its own at the end of the here-doc, followed by a newline character. A program that ends with a terminator and no trailing newline will complain that the terminator was never found. Whitespace between the << and the identifier is allowed only before a quoted identifier, and it can only be spaces or tabs; a newline before the identifier makes the here-doc use a blank line as its terminator. That use is deprecated -- Perl suggests you use "" as the here-doc identifier to indicate a blank line as the terminator.

A semicolon (;) is needed only if another Perl statement follows the here-doc. It can be placed either on the line that initializes the here-doc or on the line after the terminator. The semicolon must NOT directly follow the terminator on the same line -- there must be a newline after the terminator.
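The two semicolon placements described above can be sketched as follows. The variable names and the EOT identifier are arbitrary choices for this example; the third statement is an extra illustration that the rest of the initializing line is still ordinary Perl.

```perl
use strict;
use warnings;

# The semicolon may sit on the line that initializes the here-doc...
my $first = <<EOT;
line one
line two
EOT

# ...or on the line after the terminator, which must stand alone.
my $second = <<EOT
line one
line two
EOT
;

# The statement can also continue after the identifier on the same line.
my $third = <<EOT . "appended\n";
line one
EOT

print $first eq $second ? "same\n" : "different\n";
print $third;
```

Writing EOT; as the terminator line would not work: Perl would go on looking for a line containing exactly EOT.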

The format of a here-doc identifier:

    identifier: word | ws* ['"' dq* '"' | "'" sq* "'" | "`" bt* "`"]
    word:       (letter | digit | "_")+
    letter:     "A"..."Z" | "a"..."z"
    digit:      "0"..."9"
    ws:         "\t" | " "
    dq:         "\\\\" | '\\\"' | <any character except '"'>
    sq:         "\\\\" | "\\\'" | <any character except "'">
    bt:         "\\\\" | "\\\`" | <any character except "`">

The format of how a here-doc is initialized:

    initializer: "<<" identifier

The terminator must look exactly the same as the identifier used in the initializer of the here-doc, but without the surrounding quote characters, if any were used.

3. More on Identifiers

If no identifier is given -- that is, if only zero or more spaces follow the << in the place of the identifier -- the here-doc ends at the first blank line, meaning a line consisting solely of a newline. Perl would rather you use the empty string "" instead of an implicitly empty identifier, and will report a warning if the -w switch to Perl is used. (As of Perl 5.28, the implicitly empty identifier is a fatal error.)
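A minimal sketch of the recommended form, with the empty string written out as the identifier. The variable name and the text are placeholders; the line after "second line" must be completely empty, with no spaces or tabs on it.

```perl
use strict;
use warnings;

# An explicit empty-string identifier: the here-doc ends at the first
# blank line.
my $text = <<"";
first line
second line

print $text;
```

The blank line acts as the terminator, so it is not part of the string.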

The quotes used around an identifier change the way the body of the here-doc is interpreted. If they are single quotes, then the ENTIRE body of the text is taken character for character, and absolutely no interpolation is done. That means that you needn't backslash a backslash. If double quotes are used, the body is treated as a double-quoted string, and all interpolation is done as such. If you use no identifier, or an identifier without quotes, double-quoted interpolation is also performed. If backticks are used, the body of the here-doc is sent through the shell as commands, and the here-doc evaluates to their output.
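The four quoting styles can be compared side by side, as in the sketch below. The variable names and text are made up for the example, and the backtick case assumes a shell with an echo command is available.

```perl
use strict;
use warnings;

my $name = 'World';

# No quotes around the identifier: interpolated like a double-quoted string.
my $bare = <<EOT;
Hello, $name\t(tab before this)
EOT

# Double quotes: identical behaviour to the unquoted form.
my $double = <<"EOT";
Hello, $name\t(tab before this)
EOT

# Single quotes: the body is taken literally, backslashes included.
my $single = <<'EOT';
Hello, $name\t(no tab here)
EOT

# Backticks: the body is run as a shell command and its output returned.
# Assumes a shell with an echo command is available.
my $shell = <<`EOT`;
echo from the shell
EOT

print $bare, $double, $single, $shell;
```

Only the single-quoted form leaves $name and \t untouched; the other three interpolate the body first.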

Quotes around the identifier allow you to include spaces in the identifier, but they do not impose their interpolation on the identifier itself; that is, the two identifiers "A $ball" and 'A $ball' are the same.
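A short sketch of that last point, using the identifier from the paragraph above. The variable $ball and its value are assumptions made for the example: in both here-docs the terminator is the literal text A $ball, and only the body is affected by the kind of quote.

```perl
use strict;
use warnings;

my $ball = 'red';

# Double-quoted identifier: the body is interpolated, the identifier is not.
my $double = <<"A $ball";
The ball is $ball.
A $ball

# Single-quoted identifier: same terminator, but a literal body.
my $single = <<'A $ball';
The ball is $ball.
A $ball

print $double, $single;
```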

4. Examples

4.1. Working Examples

print <<EOF;
Hello, $name.
EOF

print <<"EOF";  # same as above
Hello, $name
EOF

print << "SAME THING";  # same as above
Hello, $name
SAME THING

print << 'EOF';  # single quote => no interpolation
This is not interpolated.
\n is a \ and an n
\\ is REALLY two backslashes!
EOF