The texts offered for downloading from this Henry James website are designated ASCII because they contain only characters defined in the 7-bit character set of the American Standard Code for Information Interchange. They are generated from the HTML or, more recently, XHTML text found on the website by a Perl program, written by the editor, which strips out or replaces characters (chiefly mark-up) that do not belong to the source text(s) or to the character set.
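The general idea of such a conversion can be sketched briefly. What follows is an illustration in Python, not the editor's Perl program: the names TextExtractor, strip_markup and non_ascii_characters and the sample sentence are all invented for the purpose. It simply discards every tag and reports any character left outside the 7-bit range; it does not reproduce the conventions this site applies to emphasis or diacritics.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects character data and discards the mark-up around it.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        # convert_charrefs=True turns references such as &iuml; into characters
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(xhtml):
    """Return the text content of an (X)HTML fragment with all tags removed."""
    extractor = TextExtractor()
    extractor.feed(xhtml)
    extractor.close()
    return "".join(extractor.parts)


def non_ascii_characters(text):
    """Return the sorted list of distinct characters outside 7-bit ASCII."""
    return sorted({ch for ch in text if ord(ch) > 127})


# Invented sample fragment, not taken from the website.
sample = '<p class="first">A na&iuml;ve <span>reader</span>.</p>'
plain = strip_markup(sample)
print(plain)
print(non_ascii_characters(plain))
```

Any character reported by non_ascii_characters would still have to be replaced or removed before the result could fairly be called ASCII; how that is done here is a matter for the conventions described on this page.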

The aim has been to provide text suitable for computer analysis of linguistic features rather than to offer one for the casual reader. Apart from the inconvenience of reading on a computer display instead of paper, the latter will be better served by the typographical features of the XHTML. Thus, for example, no words are broken over line ends by using hyphens; sometimes this leads to very ragged right-hand margins, but broken words would upset concordances and word counters.
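The effect of refusing to break words can be shown with a small sketch. It uses Python's standard textwrap module and is offered as an assumption about how such wrapping might be done, not as a description of the program actually used; the function name wrap_whole_words, the line width and the sample sentence are invented.

```python
import textwrap


def wrap_whole_words(text, width=40):
    """Wrap text at spaces only, never splitting or hyphenating a word.

    A word longer than the width is left whole on a line of its own,
    which is what produces a ragged right-hand margin.
    """
    return textwrap.wrap(
        text,
        width=width,
        break_long_words=False,   # never cut an over-long word in two
        break_on_hyphens=False,   # never break inside a hyphenated compound
    )


# Invented sample sentence.
sentence = "She gave a half-apologetic, half-incredulous little laugh and said nothing"
for line in wrap_whole_words(sentence, width=20):
    print(line)
```

Because every word survives intact on a single line, a concordance or word counter that reads the output line by line sees the same words as the source.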

Conventions implemented in the XHTML version of the text are retained by the conversion as detailed below. Users should note particularly the handling of printed emphasis (italic) characters and of diacritics.

The ‘retained’ conventions include :

