(flex.info.gz) Options for Scanner Speed and Size
16.4 Options for Scanner Speed and Size
=======================================
`-C[aefFmr]'
controls the degree of table compression and, more generally,
trade-offs between small scanners and fast scanners.
`-C'
A lone `-C' specifies that the scanner tables should be
compressed but neither equivalence classes nor
meta-equivalence classes should be used.
`-Ca, --align, `%option align''
("align") instructs flex to trade off larger tables in the
generated scanner for faster performance because the elements
of the tables are better aligned for memory access and
computation. On some RISC architectures, fetching and
manipulating longwords is more efficient than with
smaller-sized units such as shortwords. This option can
quadruple the size of the tables used by your scanner.
`-Ce, --ecs, `%option ecs''
directs `flex' to construct "equivalence classes", i.e., sets
of characters which have identical lexical properties (for
example, if the only appearance of digits in the `flex' input
is in the character class "[0-9]" then the digits '0', '1',
..., '9' will all be put in the same equivalence class).
Equivalence classes usually give dramatic reductions in the
final table/object file sizes (typically a factor of 2-5) and
are pretty cheap performance-wise (one array look-up per
character scanned).
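The idea behind equivalence classes can be sketched in C for the
"[0-9]" example above. This is only an illustration, not the table
layout `flex' actually emits: the names `build_ec', `next_state' and
`match_digits' are hypothetical, and the two-state automaton is an
assumption chosen to keep the tables tiny. The point is that the
transition table is indexed by class rather than by character, at
the price of one extra array look-up per character.

```c
#include <stddef.h>

enum { NUM_CHARS = 256 };

/* Hypothetical helper: put all digits in class 1 and every other
 * character in class 0.  Returns the number of classes. */
static int build_ec(unsigned char ec[NUM_CHARS])
{
    for (int c = 0; c < NUM_CHARS; c++)
        ec[c] = (c >= '0' && c <= '9') ? 1 : 0;
    return 2;
}

/* Transition table indexed by [state][class] instead of
 * [state][character]: 2 x 2 entries rather than 2 x 256.
 * State 0 is the start state, state 1 means "inside a number";
 * -1 means "no transition". */
static const int next_state[2][2] = {
    { -1, 1 },
    { -1, 1 },
};

/* Length of the longest prefix of s that matches [0-9]+. */
static size_t match_digits(const char *s)
{
    unsigned char ec[NUM_CHARS];
    int state = 0;
    size_t n = 0;

    build_ec(ec);
    while (s[n] != '\0') {
        int cls = ec[(unsigned char)s[n]];  /* the extra look-up */
        int nxt = next_state[state][cls];
        if (nxt < 0)
            break;
        state = nxt;
        n++;
    }
    return n;
}
```

Under these assumptions the class map costs 256 bytes once, while
each state row shrinks from 256 entries to 2.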
`-Cf'
specifies that the "full" scanner tables should be generated -
`flex' should not compress the tables by taking advantage of
similar transition functions for different states.
`-CF'
specifies that the alternate fast scanner representation
(described below under the `--fast' flag) should be used.
This option cannot be used with `--c++'.
`-Cm, --meta-ecs, `%option meta-ecs''
directs `flex' to construct "meta-equivalence classes", which
are sets of equivalence classes (or characters, if equivalence
classes are not being used) that are commonly used together.
Meta-equivalence classes are often a big win when using
compressed tables, but they have a moderate performance
impact (one or two `if' tests and one array look-up per
character scanned).
`-Cr, --read, `%option read''
causes the generated scanner to _bypass_ use of the standard
I/O library (`stdio') for input. Instead of calling
`fread()' or `getc()', the scanner will use the `read()'
system call, resulting in a performance gain which varies
from system to system, but in general is probably negligible
unless you are also using `-Cf' or `-CF'. Using `-Cr' can
cause strange behavior if, for example, you read from `yyin'
using `stdio' prior to calling the scanner (because the
scanner will miss whatever text your previous reads left in
the `stdio' input buffer). `-Cr' has no effect if you define
`YY_INPUT()' (see Generated Scanner).
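The `stdio' pitfall described for `-Cr' can be reproduced without
`flex' at all. The following sketch assumes a POSIX system; the
function name `raw_bytes_after_getc' is hypothetical. It consumes one
character through `stdio' and then asks `read()' how much input is
still visible on the underlying file descriptor. On typical
implementations `getc()' has already pulled the whole (small) file
into the `stdio' buffer, so `read()' sees little or nothing of the
remaining text.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Write text to a temporary file, read one character via stdio,
 * then return how many bytes read() can still obtain from the same
 * descriptor (or -1 on error). */
static long raw_bytes_after_getc(const char *text)
{
    char buf[256];
    long n;
    FILE *fp = tmpfile();

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    rewind(fp);        /* flush pending output, seek back to start */
    (void)getc(fp);    /* stdio usually fills its entire buffer here */

    /* What a scanner built with -Cr would see on its next read. */
    n = (long)read(fileno(fp), buf, sizeof buf);
    fclose(fp);
    return n;
}
```

For an input such as "hello world", ten characters remain after the
`getc()' call, yet `read()' will normally report fewer than that,
which is the text a `-Cr' scanner would miss.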
The options `-Cf' or `-CF' and `-Cm' do not make sense together -
there is no opportunity for meta-equivalence classes if the table
is not being compressed. Otherwise the options may be freely
mixed, and are cumulative.
The default setting is `-Cem', which specifies that `flex' should
generate equivalence classes and meta-equivalence classes. This
setting provides the highest degree of table compression. You can
obtain faster-executing scanners at the cost of larger tables,
with the following ordering generally holding:
slowest & smallest
-Cem
-Cm
-Ce
-C
-C{f,F}e
-C{f,F}
-C{f,F}a
fastest & largest
Note that scanners with the smallest tables are usually generated
and compiled the quickest, so during development you will usually
want to use the default, maximal compression.
`-Cfe' is often a good compromise between speed and size for
production scanners.
`-f, --full, `%option full''
specifies "fast scanner". No table compression is done and
`stdio' is bypassed. The result is large but fast. This option
is equivalent to `-Cfr'.
`-F, --fast, `%option fast''
specifies that the _fast_ scanner table representation should be
used (and `stdio' bypassed). This representation is about as fast
as the full table representation `--full', and for some sets of
patterns will be considerably smaller (and for others, larger). In
general, if the pattern set contains both _keywords_ and a
catch-all, _identifier_ rule, such as in the set:
"case" return TOK_CASE;
"switch" return TOK_SWITCH;
...
"default" return TOK_DEFAULT;
[a-z]+ return TOK_ID;
then you're better off using the full table representation. If
only the _identifier_ rule is present and you then use a hash
table or some such to detect the keywords, you're better off using
`--fast'.
This option is equivalent to `-CFr'. It cannot be used with
`--c++'.
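The identifier-only approach mentioned above can be sketched in C as
follows. This is a minimal illustration under stated assumptions: the
token names come from the example rules, but their numeric values,
the `struct keyword' table and the helper `classify_identifier' are
hypothetical. A sorted table searched with the standard `bsearch()'
stands in for the hash table; any keyword look-up structure would
serve the same purpose.

```c
#include <stdlib.h>
#include <string.h>

/* Token codes: names from the example rules, values assumed. */
enum { TOK_ID = 256, TOK_CASE, TOK_DEFAULT, TOK_SWITCH };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name for bsearch(). */
static const struct keyword keywords[] = {
    { "case",    TOK_CASE    },
    { "default", TOK_DEFAULT },
    { "switch",  TOK_SWITCH  },
};

static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key,
                  ((const struct keyword *)elem)->name);
}

/* Hypothetical helper, meant to be called from the action of the
 * single [a-z]+ rule with the matched text. */
static int classify_identifier(const char *text)
{
    const struct keyword *kw =
        bsearch(text, keywords,
                sizeof keywords / sizeof keywords[0],
                sizeof keywords[0], cmp_keyword);

    return kw ? kw->token : TOK_ID;
}
```

With such a helper the scanner needs only the one rule, for example
`[a-z]+ return classify_identifier(yytext);', leaving the keyword
distinction to ordinary C code.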