|
- % Copyright (C) 1991, 1995, 1996, 1998, 1999 Aladdin Enterprises. All rights reserved.
- %
- % This software is provided AS-IS with no warranty, either express or
- % implied.
- %
- % This software is distributed under license and may not be copied,
- % modified or distributed except as expressly authorized under the terms
- % of the license contained in the file LICENSE in this distribution.
- %
- % For more information about licensing, please refer to
- % http:
- % commercial licensing, go to http:
- % contact Artifex Software, Inc., 101 Lucas Valley Road #110,
- % San Rafael, CA 94903, U.S.A., +1(415)492-9861.
-
- % $Id: ps2ascii.ps 10152 2009-10-08 01:53:10Z alexcher $
- % Extract the ASCII text from a PostScript file. Nothing is displayed.
- % Instead, ASCII information is written to stdout. The idea is similar to
- % Glenn Reid's `distillery', only a lot more simple-minded, and less robust.
-
- % If SIMPLE is defined, just the text is written, with a guess at line
- % breaks and word spacing. If SIMPLE is not defined, lines are written
- % to stdout as follows:
- %
- % F <height> <width> (<fontname>)
- % Indicate the font height and the width of a space.
- %
- % P
- % Indicate the end of the page.
- %
- % S <x> <y> (<string>) <width>
- % Display a string.
- %
- % <width> and <height> are integer dimensions in units of 1/720".
- % <x> and <y> are integer coordinates, in units of 1/720", with the origin
- % at the lower left.
- % <string> and <fontname> are strings represented with the standard
- % PostScript escape conventions.
-
- % If COMPLEX is defined, the following additional types of lines are
- % written to stdout.
- %
- % C <r> <g> <b>
- % Indicate the current color.
- %
- % I <x> <y> <width> <height>
- % Note the presence of an image.
- %
- % R <x> <y> <width> <height>
- % Fill a rectangle.
- %
- % <r>, <g>, and <b> are RGB values expressed as integers between 0 and 1000.
- %
- % Note that future versions of this program (in COMPLEX mode) may add
- % other output elements, so programs parsing the output should be
- % prepared to ignore elements that they do not recognize.
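- %
- % For illustration only (hypothetical values, not captured from a real
- % run), non-SIMPLE output for a page holding a single string might look
- % like this:
- %
- % F 120 30 (Times-Roman)
- % S 720 7200 (Hello, world) 580
- % P
- %
- % Since 720 units make one inch, the S line above would place the string
- % 1 inch from the left edge and 10 inches above the bottom of the page.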
-
- % Note that this code will only work in all cases if systemdict is writable
- % and if `binding' the definitions of operators defined as procedures
- % is deferred. For this reason, it is normally invoked with
- % gs -q -dNODISPLAY -dDELAYBIND -dWRITESYSTEMDICT ps2ascii.ps
-
- % Thanks to:
- % J Greely <jgreely@cis.ohio-state.edu> for improvements to this code;
- % Jerry Whelan <jerryw@abode.ccd.bnl.gov> for motivating other improvements;
- % David M. Jones <dmjones@theory.lcs.mit.edu> for improvements noted below.
-
- %% Additional modifications by David M. Jones
- %% (dmjones@theory.lcs.mit.edu), December 23, 1997
- %%
- %% (a) Rewrote forall loop at the end of .show.write. This fixes a
- %% stack leakage problem, but the changes are more significant
- %% than that.
- %%
- %% .char.map includes the names of all characters in the
- %% StandardEncoding, ISOLatin1Encoding, OT1Encoding and
- %% T1Encoding vectors. Thus, if the Encoding vector for the
- %% current font contains a name that is not in .char.map, it's
- %% redundant to check if the Encoding vector is equal to one of
- %% the known vectors. Previous versions of ps2ascii would give
- %% up at this point, and substitute an asterisk (*) for the
- %% character. I've taken the liberty of instead using the
- %% OT1Encoding vector to translate the character, on the grounds
- %% that in the cases I'm most interested in, a font without a
- %% useful Encoding vector was most likely created by a DVI to PS
- %% converter such as dvips or DVILASER (and OT1Encoding is
- %% largely compatible with StandardEncoding anyway). [Note that
- %% this does not make my earlier changes to support dvips (see
- %% fix (a) under my 1996 changes) completely obsolete, since
- %% there's additional useful information I can extract in that
- %% case.]
- %%
- %% Overall, this should provide better support for some documents
- %% (e.g., DVILASER documents will no longer be translated into a
- %% series of *'s) without breaking any other documents any worse
- %% than they already were broken.
- %%
- %% (b) Fixed two bugs in dvips.df-tail: (1) changed "dup 127" to "dup
- %% 128" to fix fencepost error, and (2) gave each font its own
- %% FontName rather than having all fonts share the same name.
- %%
- %% (c) Added one further refinement to the heuristic for detecting
- %% paragraph breaks: do not ever start a new paragraph after a
- %% line ending in a hyphen.
- %%
- %% (d) Added a bunch of missing letters from the T1Encoding,
- %% OT1Encoding and ISOLatin1Encoding vectors to .letter.chars to
- %% improve hyphen-elimination algorithm. This still won't help
- %% if there's no useful Encoding vector.
- %%
- %% NOTE: A better solution to the problem of missing Encoding vectors
- %% might be to redefine definefont to check whether the Encoding
- %% vector is sensible and, if not, replace it by a default. This
- %% would alleviate the need for constant tests in the .show.write
- %% loop, as well as automatically solving the problem noted in fix
- %% (d) above, and the similar problem with .break.chars. This should
- %% be investigated. Also, the hyphen-elimination algorithm really
- %% needs to be looked at carefully and rethought.
-
- %%* Modifications to ps2ascii.ps by David M. Jones
- %%* (dmjones@theory.lcs.mit.edu), June 25-July 8, 1996
-
- %%* Modifications:
- %%*
- %%* (a) added code to give better support for dvips files by providing
- %%* FontBBox's, FontName's and Encoding vectors for downloaded
- %%* bitmap fonts. This is done by using dvips's start-hook to
- %%* overwrite the df-tail and D procedures that dvips uses to
- %%* define its Type 3 bitmap fonts. Thus, this change should
- %%* provide better support for dvips-generated PS files without
- %%* affecting the handling of other documents.
- %%*
- %%* (b) Fixed two bugs that could potentially affect any PS file, not
- %%* just those created by dvips: (1) added missing "get" operator
- %%* in .show.write and (2) fixed bug that caused a hyphen at the
- %%* end of a line to be replaced by a space rather than being
- %%* deleted. Note that the first bug was a source of stack
- %%* leakage, causing ps2ascii to run out of operand stack space
- %%* occasionally.
- %%*
- %%* Search for "%%* BF" to find these modifications.
- %%*
- %%* (c) Improved the heuristic for determining whether a line break
- %%* has occurred and whether a line break represents a paragraph
- %%* break. Previously, any change in the vertical position caused
- %%* a line break; now a line break is only registered if the
- %%* change is larger than the height of the current font. This
- %%* means that superscripts, subscripts, and such things as
- %%* shifted accents generated by TeX won't cause line breaks.
- %%* Paragraph-recognition is now done by comparing the indentation
- %%* of the new line to the indentation of the previous line and by
- %%* comparing the vertical distance between the new line and the
- %%* previous line to the vertical distance between the previous
- %%* line and its predecessor.
- %%*
- %%* (d) Added a hook for renaming the files where stdout and stderr
- %%* go.
- %%*
- %%* In general, my additions or changes to the code are described in
- %%* comments beginning with "%%*". However, there are numerous other
- %%* places where I have either re-formatted code or added comments to
- %%* the code while I was trying to understand it. These are usually
- %%* not specially marked.
- %%*
-
- /QUIET true def
- systemdict wcheck { systemdict } { userdict } ifelse begin
- /.max where { pop } { /.max { 2 copy lt { exch } if pop } bind def } ifelse
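- % Illustration (commented out): the fallback definition above leaves the
- % larger of its two operands on the stack, e.g.
- %   3 7 .max   % -> 7
- %   7 3 .max   % -> 7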
- /COMPLEX dup where { pop true } { false } ifelse def
- /SIMPLE dup where { pop true } { false } ifelse def
- /setglobal where
- { pop currentglobal /setglobal load true setglobal }
- { { } }
- ifelse
-
- % Define a way to store and retrieve integers that survives save/restore.
- /.i.string0 (0 ) def
- /.i.string .i.string0 length string def
- /.iget { cvi } bind def
- /.iput { exch
- /.inew {
-
- % We only want to redefine operators if they are defined already.
-
- /codef { 1 index where { pop def } { pop pop } ifelse } def
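- % Illustration (commented out; /.no.such.operator is a hypothetical name):
- %   /showpage { } codef            % showpage is already defined, so it is redefined
- %   /.no.such.operator { } codef   % not defined anywhere, so both operands are dropped
- % Note that the new definition goes into the current dictionary, which is
- % not necessarily the dictionary in which the name was found.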
-
- % Redefine the end-of-page operators.
-
- /erasepage { } codef
- /copypage { SIMPLE { (\014) } { (P\n) } ifelse
- /showpage { copypage erasepage initgraphics } codef
-
- % Redefine the fill operators to detect rectangles.
-
- /.orderrect % <llx> <lly> <urx> <ury> .orderrect <llx> <lly> <w> <h>
- { % Ensure llx <= urx, lly <= ury.
- 1 index 4 index lt { 4 2 roll } if
- dup 3 index lt { 3 1 roll exch } if
- exch 3 index sub exch 2 index sub
- } odef
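- % Worked illustration (commented out), tracing the stack by hand:
- %   10 20 5 8 .orderrect   % -> 5 8 5 12  (corners swapped; w = 10-5, h = 20-8)
- %   10 0 5 20 .orderrect   % -> 5 0 5 20  (only the x coordinates were reversed)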
- /.fillcomplex
- { % Do a first pass to see if the path is all rectangles in
- % the output coordinate system. We don't worry about overlapping
- % rectangles that might be partially not filled.
- % Stack: mark llx0 lly0 urx0 ury0 ... true mark x0 y0 ...
- mark true mark
- % Add a final moveto so we pick up any trailing unclosed subpath.
- 0 0 itransform moveto
- { .coord counttomark 2 gt
- { counttomark 4 gt { .fillcheckrect } { 4 2 roll pop pop } ifelse }
- if
- }
- { .coord }
- { cleartomark not mark exit }
- { counttomark -2 roll 2 copy counttomark 2 roll .fillcheckrect }
- pathforall cleartomark
- { .showcolor counttomark 4 idiv
- { counttomark -4 roll .orderrect
- (R )
- }
- repeat pop
- }
- { cleartomark
- }
- ifelse
- } odef
- /.fillcheckrect
- { % Check whether the current subpath is a rectangle.
- % If it is, add it to the list of rectangles being accumulated;
- % if not, exit the .fillcomplex loop.
- % The subpath has not been closed.
- % Stack: as in .fillcomplex, + newx newy
- counttomark 10 eq { 9 index 9 index 4 2 roll } if
- counttomark 12 ne { cleartomark not mark exit } if
- 12 2 roll
- % Check for the two possible forms of rectangles:
- % x0 y0 x0 y1 x1 y1 x1 y0 x0 y0
- % x0 y0 x1 y0 x1 y1 x0 y1 x0 y0
- 9 index 2 index eq 9 index 2 index eq and
- 10 index 9 index eq
- { % Check for first form.
- 7 index 6 index eq and 6 index 5 index eq and 3 index 2 index eq and
- }
- { % Check for second form.
- 9 index 8 index eq and
- 8 index 7 index eq and 5 index 4 index eq and 4 index 3 index eq and
- }
- ifelse not { cleartomark not mark exit } if
- % We have a rectangle.
- pop pop pop pop 4 2 roll pop pop 8 4 roll
- } odef
- /eofill { COMPLEX { .fillcomplex } if newpath } codef
- /fill { COMPLEX { .fillcomplex } if newpath } codef
- /rectfill { gsave newpath .rectappend fill grestore } codef
- /ueofill { gsave newpath uappend eofill grestore } codef
- /ufill { gsave newpath uappend fill grestore } codef
-
- % Redefine the stroke operators to detect rectangles.
-
- /rectstroke
- { gsave newpath
- dup type dup /arraytype eq exch /packedarraytype eq or
- { dup length 6 eq { exch .rectappend concat } { .rectappend } ifelse }
- { .rectappend }
- ifelse stroke grestore
- } codef
- /.strokeline % <fromx> <fromy> <tox> <toy> .strokeline <tox> <toy>
- % Note: fromx and fromy are in output coordinates;
- % tox and toy are in user coordinates.
- { .coord 2 copy 6 2 roll .orderrect
- % Add in the line width. Assume square or round caps.
- currentlinewidth 2 div dup .dcoord add abs 1 .max 5 1 roll
- 4 index add 4 1 roll 4 index add 4 1 roll
- 4 index sub 4 1 roll 5 -1 roll sub 4 1 roll
- (R )
- } odef
- /.strokecomplex
- { % Do a first pass to see if the path is all horizontal and vertical
- % lines in the output coordinate system.
- % Stack: true mark origx origy curx cury
- true mark null null null null
- { .coord 6 2 roll pop pop pop pop 2 copy }
- { .coord 1 index 4 index eq 1 index 4 index eq or
- { 4 2 roll pop pop }
- { cleartomark not mark exit }
- ifelse
- }
- { cleartomark not mark exit }
- { counttomark -2 roll 2 copy counttomark 2 roll
- 1 index 4 index eq 1 index 4 index eq or
- { pop pop 2 copy }
- { cleartomark not mark exit }
- ifelse
- }
- pathforall cleartomark
- 0 currentlinewidth .dcoord 0 eq exch 0 eq or and
- % Do the second pass to write out the rectangles.
- % Stack: origx origy curx cury
- { .showcolor null null null null
- { 6 2 roll pop pop pop pop 2 copy .coord }
- { .strokeline }
- { }
- { 3 index 3 index .strokeline }
- pathforall pop pop pop pop
- }
- if
- } odef
- /stroke { COMPLEX { .strokecomplex } if newpath } codef
- /ustroke
- { gsave newpath
- dup length 6 eq { exch uappend concat } { uappend } ifelse
- stroke grestore
- } codef
-
- % The image operators must read the input and note the dimensions.
- % Eventually we should redefine these to detect 1-bit-high all-black images,
- % since this is how dvips does underlining (!).
-
- /.noteimagerect % <width> <height> <matrix> .noteimagerect -
- { COMPLEX
- { gsave setmatrix itransform 0 0 itransform
- grestore .coord 4 2 roll .coord .orderrect
- (I )
- }
- { pop pop pop
- }
- ifelse
- } odef
- /colorimage where
- { pop /colorimage
- { 1 index
- { dup 6 add index 1 index 6 add index 2 index 5 add index }
- { 6 index 6 index 5 index }
- ifelse .noteimagerect gsave nulldevice
- } codef
- } if
- /.noteimage % Arguments as for image[mask]
- { dup type /dicttype eq
- { dup /Width get 1 index /Height get 2 index /ImageMatrix get }
- { 4 index 4 index 3 index }
- ifelse .noteimagerect
- } odef
- /image { .noteimage gsave nulldevice
- /imagemask { .noteimage gsave nulldevice
-
- % Output the current color if necessary.
- /.color.r .inew def
- .color.r -1 .iput % make sure we write the color at the beginning
- /.color.g .inew def
- /.color.b .inew def
- /.showcolor
- { COMPLEX
- { currentrgbcolor
- 1000 mul round cvi
- 3 1 roll 1000 mul round cvi
- exch 1000 mul round cvi
- % Stack: b g r
- dup
- 2 index
- 3 index
- { pop pop pop
- }
- { (C )
- dup
- ( )
- ( )
- (\n)
- }
- ifelse
- }
- if
- } bind def
-
- % Redefine `show'.
-
- % Set things up so our output will be in tenths of a point, with origin at
- % lower left. This isolates us from the peculiarities of individual devices.
-
- /.show.ident.matrix matrix def
- /.show.ident { % - .show.ident <scale> <matrix>
- %
- % % Assume the original transformation is well-behaved.
- % 0.1 0 2 index dtransform abs exch abs .max /.show.scale exch def
- % 0.1 dup 3 -1 roll scale
- gsave initmatrix
- % Assume the original transformation is well-behaved...
- 0.1 0 dtransform abs exch abs .max
- 0.1 dup scale .show.ident.matrix currentmatrix
- % ... but undo any rotation into landscape orientation.
- dup 0 get 0 eq {
- 1 get dup abs div 90 mul rotate
- .show.ident.matrix currentmatrix
- } if
- grestore
- } bind def
-
- /.coord { % <x> <y> .coord <x'> <y'>
- transform .show.ident exch pop itransform
- exch round cvi exch round cvi
- } odef
-
- /.dcoord { % <dx> <dy> .dcoord <dx'> <dy'>
- % Transforming distances is trickier, because
- % the coordinate system might be rotated.
- .show.ident pop 3 1 roll
- exch 0 dtransform
- dup mul exch dup mul add sqrt
- 2 index div round cvi
- exch 0 exch dtransform
- dup mul exch dup mul add sqrt
- 3 -1 roll div round cvi
- } odef
-
- % Remember the current X, Y, and height.
- /.show.x .inew def
- /.show.y .inew def
- /.show.height .inew def
-
- % Remember the last character of the previous string; if it was a
- % hyphen preceded by a letter, we didn't output the hyphen.
-
- /.show.last (\000) def
-
- % Remember the current font.
- /.font.name 130 string def
- /.font.name.length .inew def
- /.font.height .inew def
- /.font.width .inew def
-
- %%* Also remember indentation of current line and previous vertical
- %%* skip
-
- /.show.indent .inew def
- /.show.dy .inew def
-
- % We have to redirect stdout somehow....
-
- /.show.stdout { (%stdout) (w) file } bind def
-
- % Make sure writing will work even if a program uses =string.
- /.show.string =string length string def
- /.show.=string =string length string def
- /.show==only
- {
- dup type /stringtype eq
- { dup length
- { dup rcheck {
- } if
- } if
- .show.stdout exch write==only
-
- } odef
- /.show==4
- { 4 -1 roll .show==only ( )
- 3 -1 roll .show==only ( )
- exch .show==only ( )
- .show==only (\n)
- } odef
-
- /.showwidth % Same as stringwidth, but disable COMPLEX so that
- % we don't try to detect rectangles during BuildChar.
- { COMPLEX
- { /COMPLEX false def stringwidth /COMPLEX true def }
- { stringwidth }
- ifelse
- } odef
-
- /.showfont % <string> .showfont <string>
- { gsave
- % Try getting the height and width of the font from the FontBBox.
- currentfont /FontBBox .knownget not { {0 0 0 0} } if
- aload pop % llx lly urx ury
- exch 4 -1 roll % lly ury urx llx
- sub % lly ury dx
- 3 1 roll exch % dx ury lly
- sub % dx dy
- 2 copy .max 0 ne
- { currentfont /FontMatrix get dtransform
- }
- { pop pop
- % Fonts produced by dvips, among other applications, have
- % BuildChar procedures that bomb out when given unexpected
- % characters, and there is no way to determine whether a given
- % character will do this. So for Type 1 fonts, we measure a
- % typical character ('X'); for others, we punt.
- currentfont /FontType get 1 eq
- { (X) .showwidth pop dup 1.3 mul
- }
- { % No safe way to get the character size. Punt.
- 0 0
- }
- ifelse
- }
- ifelse .dcoord exch
- currentfont /FontName .knownget not { () } if
- dup type /stringtype ne {
- grestore
- % Stack: height width fontname
- SIMPLE
- { pop pop
- { 2 index
- 2 index
- 1 index
- { pop pop pop
- }
- { (F )
- 3 -1 roll dup
- exch dup
- dup length
-
- }
- ifelse
- }
- ifelse
- } odef
-
- % Define the letters -- characters which, if they occur followed by a hyphen
- % at the end of a line, cause the hyphen and line break to be ignored.
- /.letter.chars 100 dict def
- mark
- 65 1 90 { dup 32 add } for
- counttomark
- { StandardEncoding exch get .letter.chars exch dup put }
- repeat
- pop
-
- %%* Add the rest of the letters from the [O]T1Encoding and
- %%* ISOLatin1Encoding vectors
-
- mark
- /AE
- /Aacute
- /Abreve
- /Acircumflex
- /Adieresis
- /Agrave
- /Aogonek
- /Aring
- /Atilde
- /Cacute
- /Ccaron
- /Ccedilla
- /Dcaron
- /Eacute
- /Ecaron
- /Ecircumflex
- /Edieresis
- /Egrave
- /Eng
- /Eogonek
- /Eth
- /Gbreve
- /Germandbls
- /IJ
- /Iacute
- /Icircumflex
- /Idieresis
- /Idot
- /Igrave
- /Lacute
- /Lcaron
- /Lslash
- /Nacute
- /Ncaron
- /Ntilde
- /OE
- /Oacute
- /Ocircumflex
- /Odieresis
- /Ograve
- /Ohungarumlaut
- /Oslash
- /Otilde
- /Racute
- /Rcaron
- /Sacute
- /Scaron
- /Scedilla
- /Tcaron
- /Tcedilla
- /Thorn
- /Uacute
- /Ucircumflex
- /Udieresis
- /Ugrave
- /Uhungarumlaut
- /Uring
- /Yacute
- /Ydieresis
- /Zacute
- /Zcaron
- /Zdot
- /aacute
- /abreve
- /acircumflex
- /adieresis
- /ae
- /agrave
- /aogonek
- /aring
- /atilde
- /cacute
- /ccaron
- /ccedilla
- /dbar
- /dcaron
- /dotlessi
- /dotlessj
- /eacute
- /ecaron
- /ecircumflex
- /edieresis
- /egrave
- /eng
- /eogonek
- /eth
- /exclamdown
- /ff
- /ffi
- /ffl
- /fi
- /fl
- /gbreve
- /germandbls
- /iacute
- /icircumflex
- /idieresis
- /igrave
- /ij
- /lacute
- /lcaron
- /lslash
- /nacute
- /ncaron
- /ntilde
- /oacute
- /ocircumflex
- /odieresis
- /oe
- /ograve
- /ohungarumlaut
- /oslash
- /otilde
- /questiondown
- /racute
- /rcaron
- /sacute
- /scaron
- /scedilla
- /section
- /sterling
- /tcaron
- /tcedilla
- /thorn
- /uacute
- /ucircumflex
- /udieresis
- /ugrave
- /uhungarumlaut
- /uring
- /yacute
- /ydieresis
- /zacute
- /zcaron
- /zdot
- counttomark
- { .letter.chars exch dup put }
- repeat
- pop
-
- % Define a set of characters which, if they occur at the start of a line,
- % are taken as indicating a paragraph break.
- /.break.chars 50 dict def
- mark
- /bullet /dagger /daggerdbl /periodcentered /section
- counttomark
- { .break.chars exch dup put }
- repeat
- pop
-
- % Define character translation to ASCII.
- % We have to do this for the entire character set.
-
- /.char.map 500 dict def
-
- /.chars.def { counttomark 2 idiv { .char.map 3 1 roll put } repeat pop } def
-
- % Encode the printable ASCII characters.
-
- mark 32 1 126
- { 1 string dup 0 4 -1 roll put
- dup 0 get StandardEncoding exch get exch
- }
- for .chars.def
-
- % Encode accents.
- mark
- /acute (')
- /caron (^)
- /cedilla (,)
- /circumflex (^)
- /dieresis (")
- /grave (`)
- /ring (*)
- /tilde (~)
- .chars.def
-
- % Encode the ISO accented characters.
- mark 192 1 255
- { ISOLatin1Encoding exch get =string cvs
- dup 0 1 getinterval 1 index dup length 1 sub 1 exch getinterval
- .char.map 2 index known .char.map 2 index known and
- { .char.map 3 -1 roll get .char.map 3 -1 roll get concatstrings
- .char.map 3 1 roll put
- }
- { pop pop pop
- }
- ifelse
- }
- for .chars.def
-
- % Encode the remaining standard and ISO alphabetic characters.
-
- mark
- /AE (AE) /Eth (DH) /OE (OE) /Thorn (Th)
- /ae (ae) /eth (dh)
- /ffi (ffi) /ffl (ffl) /fi (fi) /fl (fl)
- /germandbls (ss) /oe (oe) /thorn (th)
- .chars.def
-
- % Encode the other standard and ISO characters.
-
- mark
- /brokenbar (|) /bullet (*) /copyright ((C)) /currency (#)
- /dagger (#) /daggerdbl (##) /degree (o) /divide (/) /dotaccent (.)
- /dotlessi (i)
- /ellipsis (...) /emdash (--) /endash (-) /exclamdown (!)
- /florin (f) /fraction (/)
- /guillemotleft (<<) /guillemotright (>>)
- /guilsinglleft (<) /guilsinglright (>) /hungarumlaut ("") /logicalnot (~)
- /macron (_) /minus (-) /mu (u) /multiply (*)
- /ogonek (,) /onehalf (1/2) /onequarter (1/4) /onesuperior (1)
- /ordfeminine (-a) /ordmasculine (-o)
- /paragraph (||) /periodcentered (*) /perthousand (o/oo) /plusminus (+-)
- /questiondown (?) /quotedblbase (") /quotedblleft (") /quotedblright (")
- /quotesinglbase (,) /quotesingle (') /registered ((R))
- /section ($) /sterling (#)
- /threequarters (3/4) /threesuperior (3) /trademark ((TM)) /twosuperior (2)
- /yen (Y)
- .chars.def
-
- % Encode a few common Symbol characters.
-
- mark
- /asteriskmath (*) /copyrightsans ((C)) /copyrightserif ((C))
- /greaterequal (>=) /lessequal (<=) /registersans ((R)) /registerserif ((R))
- /trademarksans ((TM)) /trademarkserif ((TM))
- .chars.def
-
- %%* Add a few characters from StandardEncoding and ISOLatin1Encoding
- %%* that were missing.
-
- mark
- /cent (c)
- /guilsinglleft (<)
- /guilsinglright (>)
- /breve (*)
- /Lslash (L/)
- /lslash (l/)
- .chars.def
-
- %%* Define the OT1Encoding and T1Encoding vectors for use with dvips
- %%* files. Unfortunately, there's no way of telling what font is
- %%* really being used within a dvips document, so we can't provide an
- %%* appropriate encoding for each individual font. Instead, we'll
- %%* just provide support for the two most popular text encodings, the
- %%* OT1 and T1 encodings, and just accept the fact that any font not
- %%* using one of those encodings will be rendered as gibberish.
- %%*
- %%* OT1 is Knuth's 7-bit encoding for the CMR text fonts, while T1
- %%* (aka the Cork encoding) is the 8-bit encoding used by the DC
- %%* fonts, a preliminary version of the proposed Extended Computer
- %%* Modern fonts. Unfortunately, T1 is not a strict extension of OT1;
- %%* they differ in positions 8#000 through 8#040, 8#074, 8#076, 8#134,
- %%* 8#137, 8#173, 8#174, 8#175 and 8#177, so we can't use the same
- %%* vector for both.
- %%*
- %%* Of course, we also can't reliably tell the difference between an
- %%* OT1-encoded font and a T1-encoded font based on the information in
- %%* a dvips-created PostScript file. As a best-guess solution, we'll
- %%* use the T1 encoding if the font contains any characters in
- %%* positions above 8#177 and the OT1 encoding if it doesn't.
-
- /T1Encoding 256 array def
-
- /OT1Encoding 256 array def
-
- %%* T1Encoding shares a lot with StandardEncoding, so let's start
- %%* there.
-
- StandardEncoding T1Encoding copy pop
-
- /OT1.encode {
- counttomark
- 2 idiv
- { OT1Encoding 3 1 roll put }
- repeat
- cleartomark
- } def
-
- /T1.encode {
- counttomark
- 2 idiv
- { T1Encoding 3 1 roll put }
- repeat
- cleartomark
- } def
-
- mark
- 8#000 /grave
- 8#001 /acute
- 8#002 /circumflex
- 8#003 /tilde
- 8#004 /dieresis
- 8#005 /hungarumlaut
- 8#006 /ring
- 8#007 /caron
-
- 8#010 /breve
- 8#011 /macron
- 8#012 /dotaccent
- 8#013 /cedilla
- 8#014 /ogonek
- 8#015 /quotesinglbase
- 8#016 /guilsinglleft
- 8#017 /guilsinglright
-
- 8#020 /quotedblleft
- 8#021 /quotedblright
- 8#022 /quotedblbase
- 8#023 /guillemotleft
- 8#024 /guillemotright
- 8#025 /endash
- 8#026 /emdash
- 8#027 /cwm
-
- 8#030 /perthousandzero
- 8#031 /dotlessi
- 8#032 /dotlessj
- 8#033 /ff
- 8#034 /fi
- 8#035 /fl
- 8#036 /ffi
- 8#037 /ffl
-
- %% 8#040 through 8#176 follow StandardEncoding
-
- 8#177 /hyphen
- T1.encode
-
- mark
- 8#200 /Abreve
- 8#201 /Aogonek
- 8#202 /Cacute
- 8#203 /Ccaron
- 8#204 /Dcaron
- 8#205 /Ecaron
- 8#206 /Eogonek
- 8#207 /Gbreve
- 8#210 /Lacute
- 8#211 /Lcaron
- 8#212 /Lslash
- 8#213 /Nacute
- 8#214 /Ncaron
- 8#215 /Eng
- 8#216 /Ohungarumlaut
- 8#217 /Racute
- 8#220 /Rcaron
- 8#221 /Sacute
- 8#222 /Scaron
- 8#223 /Scedilla
- 8#224 /Tcaron
- 8#225 /Tcedilla
- 8#226 /Uhungarumlaut
- 8#227 /Uring
- 8#230 /Ydieresis
- 8#231 /Zacute
- 8#232 /Zcaron
- 8#233 /Zdot
- 8#234 /IJ
- 8#235 /Idot
- 8#236 /dbar
- 8#237 /section
- 8#240 /abreve
- 8#241 /aogonek
- 8#242 /cacute
- 8#243 /ccaron
- 8#244 /dcaron
- 8#245 /ecaron
- 8#246 /eogonek
- 8#247 /gbreve
- 8#250 /lacute
- 8#251 /lcaron
- 8#252 /lslash
- 8#253 /nacute
- 8#254 /ncaron
- 8#255 /eng
- 8#256 /ohungarumlaut
- 8#257 /racute
- 8#260 /rcaron
- 8#261 /sacute
- 8#262 /scaron
- 8#263 /scedilla
- 8#264 /tcaron
- 8#265 /tcedilla
- 8#266 /uhungarumlaut
- 8#267 /uring
- 8#270 /ydieresis
- 8#271 /zacute
- 8#272 /zcaron
- 8#273 /zdot
- 8#274 /ij
- 8#275 /exclamdown
- 8#276 /questiondown
- 8#277 /sterling
-
- 8#300 /Agrave
- 8#301 /Aacute
- 8#302 /Acircumflex
- 8#303 /Atilde
- 8#304 /Adieresis
- 8#305 /Aring
- 8#306 /AE
- 8#307 /Ccedilla
- 8#310 /Egrave
- 8#311 /Eacute
- 8#312 /Ecircumflex
- 8#313 /Edieresis
- 8#314 /Igrave
- 8#315 /Iacute
- 8#316 /Icircumflex
- 8#317 /Idieresis
- 8#320 /Eth
- 8#321 /Ntilde
- 8#322 /Ograve
- 8#323 /Oacute
- 8#324 /Ocircumflex
- 8#325 /Otilde
- 8#326 /Odieresis
- 8#327 /OE
- 8#330 /Oslash
- 8#331 /Ugrave
- 8#332 /Uacute
- 8#333 /Ucircumflex
- 8#334 /Udieresis
- 8#335 /Yacute
- 8#336 /Thorn
- 8#337 /Germandbls
-
- 8#340 /agrave
- 8#341 /aacute
- 8#342 /acircumflex
- 8#343 /atilde
- 8#344 /adieresis
- 8#345 /aring
- 8#346 /ae
- 8#347 /ccedilla
- 8#350 /egrave
- 8#351 /eacute
- 8#352 /ecircumflex
- 8#353 /edieresis
- 8#354 /igrave
- 8#355 /iacute
- 8#356 /icircumflex
- 8#357 /idieresis
- 8#360 /eth
- 8#361 /ntilde
- 8#362 /ograve
- 8#363 /oacute
- 8#364 /ocircumflex
- 8#365 /otilde
- 8#366 /odieresis
- 8#367 /oe
- 8#370 /oslash
- 8#371 /ugrave
- 8#372 /uacute
- 8#373 /ucircumflex
- 8#374 /udieresis
- 8#375 /yacute
- 8#376 /thorn
- 8#377 /germandbls
-
- T1.encode
-
- %%* Now copy T1Encoding into OT1Encoding and make a few changes.
-
- T1Encoding OT1Encoding copy pop
-
- mark
- 8#000 /Gamma
- 8#001 /Delta
- 8#002 /Theta
- 8#003 /Lambda
- 8#004 /Xi
- 8#005 /Pi
- 8#006 /Sigma
- 8#007 /Upsilon
-
- 8#010 /Phi
- 8#011 /Psi
- 8#012 /Omega
- 8#013 /ff
- 8#014 /fi
- 8#015 /fl
- 8#016 /ffi
- 8#017 /ffl
-
- 8#020 /dotlessi
- 8#021 /dotlessj
- 8#022 /grave
- 8#023 /acute
- 8#024 /caron
- 8#025 /breve
- 8#026 /macron
- 8#027 /ring
-
- 8#030 /cedilla
- 8#031 /germandbls
- 8#032 /ae
- 8#033 /oe
- 8#034 /oslash
- 8#035 /AE
- 8#036 /OE
- 8#037 /Oslash
-
- 8#040 /polishslash
-
- 8#042 /quotedblright
-
- 8#074 /exclamdown
- 8#076 /questiondown
-
- 8#134 /quotedblleft
- 8#137 /dotaccent
-
- 8#173 /endash
- 8#174 /emdash
- 8#175 /hungarumlaut
- 8#177 /dieresis
- OT1.encode
-
- %%* And add a few characters from the OT1Encoding
-
- mark
- /Gamma (\\Gamma )
- /Delta (\\Delta )
- /Theta (\\Theta )
- /Lambda (\\Lambda )
- /Xi (\\Xi )
- /Pi (\\Pi )
- /Sigma (\\Sigma )
- /Upsilon (\\Upsilon )
-
- /Phi (\\Phi )
- /Psi (\\Psi )
- /Omega (\\Omega )
-
- /dotlessj (j)
- /ff (ff)
-
- /cwm ()
-
- /perthousandzero (0)
-
- /polishslash ()
-
- /Abreve (A*)
- /Aogonek (A,)
- /Cacute (C')
- /Ccaron (C^)
- /Dcaron (D^)
- /Ecaron (E^)
- /Eogonek (E,)
- /Gbreve (G*)
- /Lacute (L')
- /Lcaron (L^)
- /Nacute (N')
- /Ncaron (N^)
- /Eng (NG)
- /Ohungarumlaut (O"")
- /Racute (R')
- /Rcaron (R^)
- /Sacute (S')
- /Scaron (S^)
- /Scedilla (S,)
- /Tcaron (T^)
- /Tcedilla (T,)
- /Uhungarumlaut (U"")
- /Uring (U*)
- /Ydieresis (Y")
- /Zacute (Z')
- /Zcaron (Z^)
- /Zdot (Z.)
- /IJ (IJ)
- /Idot (I.)
- /dbar (d-)
- /abreve (a*)
- /aogonek (a,)
- /cacute (c')
- /ccaron (c^)
- /dcaron (d^)
- /ecaron (e^)
- /eogonek (e,)
- /gbreve (g*)
- /lacute (l')
- /lcaron (l^)
- /nacute (n')
- /ncaron (n^)
- /eng (ng)
- /ohungarumlaut (o"")
- /racute (r')
- /rcaron (r^)
- /sacute (s')
- /scaron (s^)
- /scedilla (s,)
- /tcaron (t^)
- /tcedilla (t,)
- /uhungarumlaut (u"")
- /uring (u*)
- /zacute (z')
- /zcaron (z^)
- /zdot (z.)
- /ij (ij)
- /Germandbls (SS)
- .chars.def
-
- %%* We extend the df-tail command to stick in an Encoding vector (see
- %%* above for a discussion of the T1 and OT1 encodings), put in a
- %%* FontName (which will just be dvips's name for the font, i.e., Fa,
- %%* Fb, etc.) and give each font a separate FontBBox instead of
- %%* letting them all share a single one.
-
- /dvips.df-tail % id numcc maxcc df-tail
- {
- /nn 9 dict N
- nn begin
- %%
- %% Choose an encoding based on the highest position occupied.
- %%
- dup 128 gt { T1Encoding } { OT1Encoding } ifelse
- /Encoding X
- /FontType 3 N
- %%
- %% It's ok for all the fonts to share a FontMatrix, but they
- %% need to have separate FontBBoxes
- %%
- /FontMatrix fntrx N
- /FontBBox [0 0 0 0] N
- string /base X
- array /BitMaps X
- %%
- %% And let's throw in a FontName for good measure
- %%
- dup ( ) cvs
- %%
- %% Make sure each font gets its own private FontName. -- dmj,
- %% 12/23/97
- %%
- dup length string copy
- /FontName X
- /BuildChar {CharBuilder} N
- end
- dup { /foo setfont }
- 2 array copy cvx N
- load
- 0 nn put
- /ctr 0 N
- [
- } def
-
- %%* This is functionally equivalent to dvips's /D procedure, but it
- %%* also calculates the Font Bounding Box while defining the
- %%* characters.
-
- /dvips.D % char-data ch D - : define character bitmap in current font
- {
- /cc X % char-data
- dup type /stringtype ne {]} if % char-data
-
- /ch-xoff where
- { pop }
- { dup /Cd exch def
- /ch-width { Cw } def
- /ch-height { Ch } def
- /ch-xoff { Cx } def
- /ch-yoff { Cy } def
- /ch-dx { Cdx } def
- } ifelse
- /ch-data X
- nn /base get cc ctr put % (stores ctr at the cc'th position of base)
- nn /BitMaps get
- ctr
- ch-data % BitMaps ctr char-data
- sf 1 ne {
- dup dup length 1 sub dup 2 index S get sf div put
- } if
- put % puts char-data into BitMaps at index ctr
- /ctr ctr 1 add N
- %%
- %% Make sure the Font Bounding Box encloses the Bounding Box of the
- %% current character
- %%
- nn /FontBBox get % BB
-
- dup % calculate new llx
- dup 0 get
- ch-xoff
- .min
- 0 exch put
-
- dup % calculate new lly
- dup 1 get
- ch-yoff ch-height sub
- .min
- 1 exch put
-
- dup % calculate new urx
- dup 2 get
- ch-dx ch-width add
- .max
- 2 exch put
-
- dup 3 get % calculate new ury
- ch-yoff
- .max
- 3 exch put
-
- } def
-
- %%* Define start-hook to replace df-tail and D by our versions.
- %%* Unfortunately, the user can redefine start-hook and thus bypass
- %%* these changes, but I don't see an obvious way around that.
-
- userdict /start-hook {
- TeXDict /df-tail /dvips.df-tail load bind put
- TeXDict /D /dvips.D load bind put
- } put
-
- %%* Introduce a symbolic constant for hyphens. (Need to make
- %%* allowance for hyphen being in different place?)
-
- /.hyphen 45 def
-
- % Write out a string. If it ends in a letter and a hyphen,
- % don't write the hyphen, and set .show.last to a hyphen;
- % otherwise, set .show.last to the character (or \000 if it was a hyphen).
- /.show.write % <string>
- {
- dup length 1 ge
- { dup dup length 1 sub get % string last_char
- dup .hyphen eq % string last_char hyphen?
- { % string last_char
- 1 index length 1 gt
- { 1 index dup length 2 sub get }
- {
- ifelse % string last_char prev-char
- currentfont /Encoding get exch get % look up prev-char
-
- { % Remove the hyphen % string last_char
- exch % last_char string
- dup length 1 sub % last_char string len-1
- 0 exch getinterval % last_char string-1
- exch % string-1 last_char
- }
- { pop 0 } % string 0
- ifelse
- }
- if
-
- % The last character is kept in .show.last.
- % If .show.last == hyphen, then the last char of the
- % previous string was a hyphen.
- }
- if % string
- currentfont /FontType get 0 ne
- {
- { % begin forall % c
- dup % c c
- currentfont /Encoding get % c c vec
- exch get % c name
- dup
- { exch pop }
- { pop OT1Encoding exch get }
- ifelse % name
-
- .show.stdout exch writestring
- }
- forall
- }
- { (\0) dup 0 get 0 eq
- { 0 1 put
- (%stderr) (w) file dup
- (*** Warning: composite font characters dumped without decoding.\n) writestring
- closefile
- }
- { pop
- }
- ifelse
- .show.stdout exch writestring
- }
- ifelse
- } odef
-
- /.showstring1 { % string
- currentpoint .coord % string x y
- 3 -1 roll dup .showwidth % x y string dx dy
- 1 index % x y string dx dy dx
- 0 rmoveto % x y string dx dy
- .dcoord pop % x y string width
- SIMPLE
- { % x y string width
- 2 index % x y string width y
-
- %%*
- %%* Replaced test "has y changed" by "has y changed by more
- %%* than the current font height" so that subscripts and
- %%* superscripts won't cause line/paragraph breaks
- %%*
- sub abs dup % x y string width dy dy
-
- gt
- { % x y string width dy
-
- %%* Vertical position has changed by more than the font
- %%* height, so we now try to figure out whether we've
- %%* started a new paragraph or merely a new line, using a
- %%* variety of heuristics.
-
- %%* If any of the following is true, we start a new
- %%* paragraph:
-
- %%* (a) the current vertical shift is more than 1.1 times
- %%* the previous vertical shift, where 1.1 is an
- %%* arbitrarily chosen factor that could probably be
- %%* refined.
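- %%* Illustration only, assuming shifts are measured in units
- %%* of 1/720": if the previous shift was 120 (12 pt leading),
- %%* a shift of 140 exceeds 1.1 * 120 = 132 and counts as a
- %%* paragraph break under this test, while a shift of 125
- %%* does not.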
-
- dup % x y string width dy dy
-
- gt
- exch
-
- %%* Save the new vertical shift
-
-
-
- %%* (b) The vertical shift is more than 1.3 times the
- %%* "size" of the current font. I've removed this
- %%* test since it's not really very useful.
-
- %%*
- %%*
- %%* gt % x y string width bool
- %%* .show.height .iget 0 gt and % only perform test if font
- %%* % height is nonzero
- %%* or
-
- %%* (c) the first character of the new line is one of the
- %%* .break.chars
-
- 2 index length % x y string width newpar? len
- 0 gt % x y string width newpar? len>0?
- {
- 2 index 0 get % x y string width newpar? s
- currentfont /Encoding get
- exch get % x y string width newpar? s_enc
-
- }
- if % x y string width newpar?
-
- %%* (d) The indentation of the new line is greater than
- %%* the indentation of the previous line.
-
- 4 index
-
- gt
- or
-
- %%* HOWEVER, if the line ends in a hyphen, we do NOT begin
- %%* a new paragraph (cf. comment at end of BF2). --dmj,
- %%* 12/23/97
-
-
- and
-
- % newpar?
- { (\n\n) } % Paragraph
- { % Line
- %%*
- %%* BF2: If last character on a line is
- %%* a hyphen, we omit the hyphen and
- %%* run the lines together. Of
- %%* course, this will fail if a word
- %%* with an explicit hyphen (e.g.,
- %%* X-ray) is split across two lines.
- %%* Oh, well. (What should we do
- %%* about a hyphen that ends a
- %%* "paragraph"? Perhaps that should
- %%* inhibit a paragraph break.)
- %%*
-
- { () }
- { (\n) }
- ifelse % x y string width char
- }
- ifelse
-
-
-
-
-
- }
- { % x y string width dy
- % If the word processor split a hyphenated word within
- % the same line, put out the hyphen now.
- pop
-
- }
- ifelse
- %%*
- %%* If we have moved more than 1 point to
- %%* the right, should we interpret it as a
- %%* space? This needs to be looked at
- %%* more closely.
- %%*
- 3 index % x y string width x
-
- { ( )
- if
- % x y string width
- 4 1 roll % width x y string
- .show.write pop % width x
- add
- }
- { (S )
- ifelse
- } odef
-
- /.showstring
- { dup () eq { pop } { .showstring1 } ifelse
- } bind def
-
- % Redefine all the string display operators.
-
- /show {
- .showfont
- .showcolor
- .showstring
- } codef
-
- % We define all the other operators in terms of .show1.
-
- /.show1.string ( ) def
- /.show1 {
- /ashow
- { .showfont .showcolor
- { .show1 2 copy rmoveto } forall
- pop pop
- } codef
- /awidthshow
- { .showfont .showcolor
- { dup .show1 4 index eq { 4 index 4 index rmoveto } if
- 2 copy rmoveto
- }
- forall
- pop pop pop pop pop
- } codef
- /widthshow
- { .showfont .showcolor
-
- {
- .showstring .showstring
- 2 index 2 index rmoveto
- } loop
- .showstring pop pop
- } codef
- /kshow
- { .showfont .showcolor
- %**************** Should construct a closure, in case the procedure
- %**************** affects the operand stack.
- { .show1 dup exec } forall pop
- } codef
-
- % We don't really do the right thing with the Level 2 show operators,
- % but we do something semi-reasonable.
- /xshow { pop show } codef
- /yshow { pop show } codef
- /xyshow { pop show } codef
- /glyphshow
- { currentfont /Encoding .knownget not { {} } if
- 0 1 2 index length 1 sub
- { % Stack: glyph encoding index
- 2 copy get 3 index eq { exch pop exch pop null exit } if
- pop
- }
- for null eq { (X) dup 0 4 -1 roll put show } { pop } ifelse
- } codef
-
- end
-
- % Bind the operators we just defined, and all the others if we didn't
- % do it before.
-
- DELAYBIND { .bindnow } if
-
- % Make systemdict read-only if it wasn't already.
-
- systemdict wcheck { systemdict readonly pop } if
-
- % Restore the current local/global VM mode.
-
- exec
|