DynaPDF Manual - Page 646

Function Reference
to Unicode with TranslateRawCode(). TranslateRawCode() converts the source string character by
character and calculates the width of each character in the same pass.
The parameter Width represents the width of the entire text record. The source array also provides
the displacement vector Advance. Advance is a vector even though only one coordinate is given; the
y-coordinate is always zero. Positive values of Advance move the cursor to the left; negative values
move it to the right in a non-rotated coordinate system. The string widths and the displacement
vector are measured in text space.
The displacement vector is often used to apply kerning between two characters but it can also be
used to emulate spaces or to move the cursor to an arbitrary position on the x-axis of the text line.
Because CID fonts do not support word spacing, spaces are very often emulated with the
displacement vector.
The source strings are not null-terminated. The array can also contain strings with zero length. In
this case only the displacement vector Advance must be considered.
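The handling described above can be sketched as follows. This is a minimal illustration in Python, not DynaPDF code: the SourceRecord class and the measure callable are hypothetical stand-ins for the elements of the source array and for a width calculation such as the one TranslateRawCode() performs. The sketch also assumes that Advance is applied before the string of the same record.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SourceRecord:
    """Hypothetical stand-in for one element of the source array."""
    advance: float  # displacement in text space; positive moves left
    text: bytes     # raw, non-null-terminated string; may be empty


def walk_source_records(
    records: List[SourceRecord],
    measure: Callable[[bytes], float],
) -> Tuple[List[Tuple[float, bytes]], float]:
    """Return (start_x, text) for every non-empty string plus the final x.

    Assumption: Advance is applied before the string of the same record.
    `measure` is a placeholder for whatever computes a string width in
    text space.
    """
    x = 0.0
    placed = []
    for rec in records:
        # Positive Advance moves the cursor left, negative moves it right.
        x -= rec.advance
        if len(rec.text) == 0:
            # Zero-length string: only the displacement counts.
            continue
        placed.append((x, rec.text))
        x += measure(rec.text)
    return placed, x
```

Because the strings are not null-terminated, the sketch relies on an explicit length (here the length of the bytes object) and never scans for a terminator.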
DynaPDF is delivered with the example text_search, which demonstrates how a text search
algorithm can be developed. This project should be used as a basis for developing your own code.
TShowTextArrayW
This is the preferred callback function to develop text extraction algorithms. See also Sub string
coordinates for further information.
The arrays Source and Kerning contain the source and translated Unicode strings of a text record.
Both arrays always contain the same number of elements (parameter Count).
The parameter Width represents the width of the entire text record. The kerning array also provides
the width of each sub record and the displacement vector Advance. Advance is a vector even though
only one coordinate is given; the y-coordinate is always zero. Positive values of Advance move the
cursor to the left; negative values move it to the right in a non-rotated coordinate system. The
string widths and the displacement vector are measured in text space.
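As an illustration of the sign convention, the following Python sketch computes the text-space x offset at which each sub record starts. The KerningRecord class is a hypothetical stand-in for an element of the Kerning array, not the library's actual structure, and the sketch assumes that Advance is applied before the string of the same sub record.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KerningRecord:
    """Hypothetical stand-in for one element of the Kerning array."""
    advance: float  # displacement in text space; positive moves left
    text: str       # translated Unicode string; may be empty
    width: float    # width of this sub record in text space


def sub_record_offsets(kerning: List[KerningRecord]) -> List[float]:
    """Return the text-space x offset at which each sub record starts."""
    offsets = []
    x = 0.0
    for rec in kerning:
        # Subtract Advance: positive values move the cursor to the left.
        x -= rec.advance
        offsets.append(x)
        # The sub record's own width then moves the cursor to the right.
        x += rec.width
    return offsets
```

The offsets are still in text space; mapping them to page coordinates would additionally require the text matrix and the current transformation, which this sketch leaves out.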
The displacement vector is often used to apply kerning between two characters but it can also be
used to emulate spaces or to move the cursor to an arbitrary position on the x-axis of the text line.
Because CID fonts do not support word spacing, spaces are very often emulated with the
displacement vector.
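A text extraction routine usually has to turn such emulated spaces back into real space characters. The Python sketch below shows one possible heuristic, under stated assumptions: the (advance, text) pairs are hypothetical stand-ins for the sub records of the Kerning array, Advance is assumed to apply before the string of the same sub record, and space_threshold is a value the caller must choose (for example a fraction of the font's space width); it is not supplied by the library.

```python
from typing import Iterable, Tuple


def join_with_emulated_spaces(
    records: Iterable[Tuple[float, str]],
    space_threshold: float,
) -> str:
    """Concatenate (advance, text) pairs, inserting a space for wide gaps.

    `space_threshold` is a heuristic chosen by the caller, in text space
    units; it is not a value supplied by the library.
    """
    parts = []
    for advance, text in records:
        # Negative Advance moves the cursor to the right, i.e. opens a gap.
        gap = -advance
        if parts and gap >= space_threshold:
            parts.append(" ")
        parts.append(text)
    return "".join(parts)
```

Small negative or positive values are treated as ordinary kerning and ignored. A threshold that is too low splits words, and one that is too high merges them, so it should be tuned against real documents.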
The source strings are required if the width of a sub string must be calculated. Note that it is not
possible to calculate the width of a sub string from the Unicode string.
It is possible that one or more sub records contain strings with a zero length. In this case, only the
displacement vector Advance must be considered.
DynaPDF is delivered with the examples text_extraction and text_coordinates, which demonstrate
how text extraction algorithms can be developed and how text coordinates must be calculated. One
of these projects should be used as a basis for developing your own code.