A DATABASE OF
Peter Garside and
The early decades of the nineteenth century represent a period of
unparalleled development in the novel. While many of the ideological
battles surrounding fiction had been fought in the charged atmosphere
of the 1790s, the anti-Jacobin reaction to the polemical aspects
of the novel necessitated a reinterpretation of the role of the
novel at the turn of the century. Writers such as Maria Edgeworth
and Hannah More, and later Walter Scott and Jane Austen, did much
to make this period significant. However, the era was also the time
of less notable, but still prolific, writers, such as Mary Meeke,
the Porter sisters, Anthony Frederick Holstein, and Barbara Hofland.
The initial aim of the database
project was to create a tool to allow a broad and sophisticated
level of analysis of over two thousand titles from the period
1800–29. Fields allowing analysis of gender distributions, publisher
popularity, authorial status, prices, translations, etc. were
created during the first phase of the database. As well as enabling
the study of broad statistical data, fuller bibliographical information
can also be consulted on a per-record basis. Thus, analysis can
take place on two levels: the general (for spans of years, types
of fiction, specific authors) and the individual (studies of individual
texts, with rich information about the work concerned).
The first phase of this project has involved extracting basic
bibliographical details of over 2,200 fictional works from the period.
This led to the creation of specific fields forming part of an Access
97 database, which were divided into different types of categories:
The main fields of the database refer particularly to the
bibliographical details of the titles concerned, and can be analysed
in a variety of ways from a record-by-record basis to a full statistical
gathering over the three-decade period.
and Date: provide standardised
information on authors, including variant spellings and married
names, as well as the year of first publication for the title
both full titles (for close examination) and short-titles (for
quick consultation) are provided
follows seven categories split along two axes: i) gender type (male,
female, or unknown); ii) status of gender ascription for male and female: named
(i.e. a gender-specific, authentic name appears on title-page),
identified (gender has been ascribed
through scholarship, authorial chains, etc.), implied
(gender has been defined through unidentified pseudonyms, generic
phrases such as ‘By the daughter of a clergyman’, ‘By an officer
…’, and so on). The Broad Gender
field offers a summary of the (detailed) Gender field by condensing
these variations into simple Male, Female, Unknown categories
for a more general analysis
Authorial Status records whether a title was published anonymously,
pseudonymously, or with the author’s
name explicitly stated (nominally).
Gender: unlike the Broad Gender field, this simply records
texts about which modern scholarship can be absolutely sure
as far as authorial gender is concerned (i.e. ‘By a Lady’ does
not provide enough evidence that a woman wrote the text, and
is therefore treated as Unknown). This is most useful when combined
with the Authorial Status field to provide an accurate record
of how many men and women we can be sure were writing
under their own names or not
details list the number of subscribers and page numbers of subscription
lists featured in the work concerned
details provide information on the name of the translator of
a foreign work into English; whether the work is an authenticated
foreign original or whether it is an ‘implied’ translation;
and first located translations into French and/or German
fields give details of all known prices,
holdings, further editions,
and miscellaneous textual-bibliographical notes
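As a rough illustration of how the Broad Gender field condenses the detailed Gender field described above, the sketch below maps seven detailed values onto Male, Female, and Unknown. The exact labels stored in the database are not given in this report, so the strings used here (e.g. 'Male (named)') and the function name are assumptions made purely for the example.

```python
# Hypothetical labels for the seven detailed Gender categories; the
# database's actual wording is not reproduced in this report.
DETAILED_TO_BROAD = {
    "Male (named)": "Male",
    "Male (identified)": "Male",
    "Male (implied)": "Male",
    "Female (named)": "Female",
    "Female (identified)": "Female",
    "Female (implied)": "Female",
    "Unknown": "Unknown",
}


def broad_gender(detailed: str) -> str:
    """Condense a detailed Gender value into Male, Female, or Unknown."""
    # Anything unrecognised is treated as Unknown rather than guessed at.
    return DETAILED_TO_BROAD.get(detailed, "Unknown")


print(broad_gender("Female (implied)"))  # Female
```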
The Publisher fields represent detailed aspects of
the text as a manufactured item in the publishing world, rather
than as an authored work. It was necessary to provide a separate
section for this aspect of the database because there are cases
of a single title being issued by different publishers in different
forms at different prices.
publisher: the main publishing partnership (i.e. those
that generally appear at the beginning of the publisher’s imprint)
with the fullest names of partners is provided as a field
for the sake of standardised searches the varying partnerships
which constituted ventures are also subsumed into a more generic
field, allowing continuity of searches (e.g. William Lane, the
Minerva Press, A. K. Newman, are all grouped under ‘Minerva’
allowing users to search all Minerva titles without worrying
about the changing nature of the firm)
details include number of volumes,
pagination, and format
(octavo, duodecimo, etc.)
details for individual volumes are taken as a subset of the
all known prices field, where sources indicate a preponderance
or agreement as to price (e.g. two out of three reviews agree
on a price, which is then recorded in this field). Prices have
been converted into decimal figures, for the sake of statistical
analysis: e.g. 10s 6d is entered as 10.5, etc. In total, 83.5%
of all records in the database have an ‘agreed’ price
publishers (cases where firms have played a subsidiary
role in a publishing enterprise, and appear after the first
firm on the publisher’s imprint) are also recorded, so that
users can see which publishers played ‘second fiddle’ more often
than others (e.g. ‘A. K. Newman & Co’ was secondary
publisher in only a negligible number of cases, as compared to
well over 500 titles where his firm was primary).
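The decimal conversion of agreed prices mentioned above (10s 6d entered as 10.5) amounts to expressing pence as twelfths of a shilling. The following is a minimal sketch of that arithmetic, not the procedure actually used during data entry; the function name and the accepted input format are assumptions, and prices quoted in pounds are not handled.

```python
import re

# Matches prices written like "10s 6d", "12s", or "6d" (shillings and pence only).
_PRICE = re.compile(r"^\s*(?:(\d+)s)?\s*(?:(\d+)d)?\s*$")


def to_decimal_shillings(price: str) -> float:
    """Convert a shillings-and-pence string to decimal shillings (12d = 1s)."""
    match = _PRICE.match(price)
    if not match or not (match.group(1) or match.group(2)):
        raise ValueError(f"unrecognised price: {price!r}")
    shillings = int(match.group(1) or 0)
    pence = int(match.group(2) or 0)
    return shillings + pence / 12


print(to_decimal_shillings("10s 6d"))  # 10.5
```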
The first phase of data entry is now complete, and
the current records are closed. The flexibility of Access 97 allows
information to be parsed in a number of complementary ways.
Forms allow users to view and manipulate data in a number
of ways. In this example, a tabbed system is used with related
fields grouped in separate pages: Main, Publication Details,
Formatting/Price, Translations, and Notes.
Forms display material record by record,
allowing users to examine each title individually (e.g.
see Figs 1a and 1b). Because forms
can be designed relatively quickly, they can display full or partial
details as the user requires.
Searches can be made in any field: for instance, users can search for keywords
in titles, author names, publishers, and so forth.
Forms also allow
the application of Filters, which
enable users to specify criteria in fields on the form in any
combination: the application will then search through all the
records and only return those which follow these conditions.
For instance, a
user could fill in a form to return the following details: All
Female-gendered authors published by Longmans after 1810. The
database would then display each individual record for analysis
at whatever level of detail the form has been designed.
Fig 1b. This
second example of a form contains all the details for each
record on one sheet. While not particularly suitable for viewing
data, it is ideal for the actual process of data-entry itself.
Filters can be
applied on records which have already been filtered to provide
an even more localised level of specificity. Filters can also
be saved as queries (see below) for use later.
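The filtering behaviour described above (criteria combined across fields, then a further filter applied to the already-filtered records) can be sketched outside Access as follows. This is only an illustration of the logic: the record structure, field names, and sample rows are invented for the example and are not drawn from the database.

```python
from dataclasses import dataclass


@dataclass
class Record:
    author: str
    broad_gender: str
    publisher: str
    year: int


# Invented sample rows for illustration only; not taken from the database.
records = [
    Record("Author A", "Female", "Longmans", 1812),
    Record("Author B", "Male", "Longmans", 1815),
    Record("Author C", "Female", "Colburn", 1820),
    Record("Author D", "Female", "Longmans", 1805),
    Record("Author E", "Female", "Longmans", 1824),
]


def apply_filter(rows, **criteria):
    """Keep rows whose fields equal every given criterion (combined with AND)."""
    return [r for r in rows if all(getattr(r, k) == v for k, v in criteria.items())]


# First filter: female-gendered authors published by Longmans.
first = apply_filter(records, broad_gender="Female", publisher="Longmans")
# Second filter applied to the already-filtered rows: after 1810 only.
second = [r for r in first if r.year > 1810]

print([r.author for r in second])  # ['Author A', 'Author E']
```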
Queries offer a far more sophisticated degree of analysis
than the simpler forms, and tend to return data on a larger scale.
They can be constructed either to return information such as bibliographical
details (e.g. author names, titles, publishers), or more significantly
statistical information (e.g. total works by female authors; numbers
of titles published within specific period; maximum, minimum,
average, most frequent prices of texts, etc.). See Fig
2 for an example of a simple query design.
Queries can be constructed by selecting
the relevant fields for analysis from a list, specifying conditions,
and the ways in which the data should be analysed. The usual method
would employ a simple query, which returns the data in list form
(see Fig 3a), or one which seeks
to return statistical data such as totals, percentages, etc. within
specific categories (see Fig 3b).
Users can also
employ Boolean operators (AND, OR,
NOT) to include and exclude different criteria: e.g. a query can
be set up for all titles in the 1820s by non-male authors, published
by Longmans or Colburn, with the word ‘domestic’ on the title-page.
If an even more
sophisticated level of analysis is necessary than the usual query
boxes, complex searches can be created using Structured
Query Language (SQL), a standard way of constructing database
queries. By entering an appropriately organised SQL statement,
users can return detailed statistical figures with complex selection
procedures, such as the percentage of women writers from the 1810s
using pseudonyms, as they figured in the London and Edinburgh
markets. The simpler example of an SQL statement below
returns the annual totals of foreign works translated
into English, arranged by gender (‘Main Listing’ refers to the
table containing the Main Fields
listed in section 2):
TRANSFORM Count([Main Listing].ID) AS CountOfID
SELECT [Main Listing].Year
FROM [Main Listing]
WHERE ((([Main Listing].[Implied Translations From]) Is Not Null))
OR ((([Main Listing].[Translations From]) Is Not Null))
GROUP BY [Main Listing].Year
PIVOT [Main Listing].Gender;
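The PIVOT clause in the statement above is specific to Access crosstab queries. For readers working outside Access, the same kind of year-by-gender count can be approximated with conditional aggregation, sketched here with Python's built-in sqlite3 module. The simplified table and column names, the fixed set of three gender columns, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_listing ("
    "id INTEGER PRIMARY KEY, year INTEGER, gender TEXT, "
    "translations_from TEXT, implied_translations_from TEXT)"
)
# Invented rows for illustration only.
conn.executemany(
    "INSERT INTO main_listing "
    "(year, gender, translations_from, implied_translations_from) "
    "VALUES (?, ?, ?, ?)",
    [
        (1800, "Female", "French", None),
        (1800, "Male", None, "German"),
        (1800, "Female", None, None),  # not a translation: excluded
        (1801, "Female", "French", None),
        (1801, "Female", "German", None),
    ],
)

# One row per year, one column per gender, counting translated titles only.
rows = conn.execute(
    """
    SELECT year,
           SUM(CASE WHEN gender = 'Female' THEN 1 ELSE 0 END) AS female,
           SUM(CASE WHEN gender = 'Male' THEN 1 ELSE 0 END) AS male,
           SUM(CASE WHEN gender NOT IN ('Female', 'Male') THEN 1 ELSE 0 END) AS unknown
    FROM main_listing
    WHERE translations_from IS NOT NULL
       OR implied_translations_from IS NOT NULL
    GROUP BY year
    ORDER BY year
    """
).fetchall()

print(rows)  # [(1800, 1, 1, 0), (1801, 2, 0, 0)]
```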
Fig 2. A
simple query design, which surveys annual totals of output
Summaries of queries can also be prepared for output, whether
as printed copy or as HTML pages ready for the web. Access enables
this through the use of Reports, which can be designed on a single
page as simply as forms, while running for hundreds of pages once
the data has been processed through the forms. A typical example
of this usage in ongoing research has been the provision of checklists
with Author, Year, Short-Title, and Publisher details in order
to examine the output by the top five publishers of the period.
Manipulating the Results
Once the user has acquired the data needed from Access,
it is a simple matter of exporting into a suitable package as
requirements demand. If the material is to be further examined
it can be exported (at the click of a button) as a spreadsheet
into Excel 97, which is far more flexible and sophisticated than
Access as far as statistical analysis is concerned. These spreadsheet
data can then be used as the basis for creating graphs to illustrate
trends, preponderance, etc. The graphs which feature in our Cardiff
Corvey articles have been constructed using this procedure.
The examples which follow also demonstrate (albeit at a rather
simplistic level) the kinds of information which can be acquired
from the database.
If more detailed
information, with less of a statistical bent, is required, then
reports can be exported as rich-text documents as simply as the
transfer into Excel. These documents can then be used in any word
processing package for incorporation into studies, checklists,
and so on. Again,
reports can also be exported as HTML documents for use on the
web. The checklist accompanying the report on our Corvey Microfiche
Edition (CME) cataloguing project, which employs a similar system
of data-keeping as our fiction database, was presented using this
Fig 3a. This
screen capture shows a part of a comprehensive listing of
all short-titles published in the 1810s by the Minerva Press,
arranged alphabetically by author.
Query 1: This first request from
the database took less than two minutes to construct and run:
it requests the top ten female novelists during 1800–29 whose
own names appeared on first edition title-pages. It also lists
the total numbers of their works published this way.
- GENLIS, Stéphanie Félicité,
Comtesse de 
- WARD, Catherine George 
- HOFLAND, Barbara 
- STANHOPE, Louisa Sydney 
- HARVEY, Jane 
- MEEKE, Mary 
- ROCHE, Regina Maria 
- OPIE, Amelia Alderson 
- PORTER, Anna Maria 
- MOSSE, Henrietta Rouviere 
The results are
interesting because if the top ten female novelists were required,
whether they published pseudonymously, anonymously, or under their
own names, Anna Maria Porter and Henrietta
Mosse fall into 11th and 13th places respectively, with ten titles
each in total. In fact, many of the figures are rearranged, with
Barbara Hofland at the top, followed by Mary Meeke.
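The contrast drawn above, between ranking authors by titles issued under their own names and ranking them by all titles, comes down to counting with and without a restriction on authorial status. The sketch below illustrates this with invented data; the status labels, author names, and function name are assumptions and do not reproduce the database's figures.

```python
from collections import Counter

# Invented (author, authorial status) pairs for illustration only.
titles = [
    ("Author A", "nominal"), ("Author A", "nominal"),
    ("Author A", "anonymous"), ("Author A", "pseudonymous"),
    ("Author B", "nominal"), ("Author B", "pseudonymous"),
    ("Author B", "pseudonymous"), ("Author B", "anonymous"),
    ("Author B", "anonymous"),
    ("Author C", "nominal"), ("Author C", "nominal"), ("Author C", "nominal"),
]


def top_authors(rows, n, statuses=None):
    """Rank authors by number of titles, optionally restricted to given statuses."""
    counts = Counter(
        author for author, status in rows
        if statuses is None or status in statuses
    )
    return counts.most_common(n)


# Titles published under the author's own name only.
print(top_authors(titles, 2, statuses={"nominal"}))  # [('Author C', 3), ('Author A', 2)]
# All titles, however published: the ranking changes.
print(top_authors(titles, 2))  # [('Author B', 5), ('Author A', 4)]
```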
Fig 3b. This
extract displays the results of the query design shown in
Query 2: A user
can easily request a listing of all the works of a particular
author by dates of publication and publisher through another simple
database query: additionally a query can list the main holding
library for the source text (here the presence of first editions
in the Corvey Microfiche Edition (CME) is listed with an asterisk
followed by the ISBN). In this case, ‘Ann[e] of Swansea’ (i.e.
Anne Julia Kemble Hatton):
- CAMBRIAN PICTURES (1810. London.
- SICILIAN MYSTERIES (1812.
London. Colburn, Henry) *CME 3-628-48690-4
- CONVICTION (1814. London.
Minerva: Newman, Anthony King; & Co) *CME 3-628-48744-7
- SECRET AVENGERS (1815. London.
Minerva: Newman, Anthony King; & Co) *CME 3-628-48805-2
- CHRONICLES OF AN ILLUSTRIOUS
HOUSE (1816. London. Minerva: Newman, Anthony King; & Co)
- GONZALO DE BALDIVIA (1817.
London. Minerva: Newman, Anthony King; & Co) *CME 3-628-48802-8
- SECRETS IN EVERY MANSION (1818.
London. Minerva: Newman, Anthony King; & Co) *CME 3-628-48806-0
- CESARIO ROSALBA (1819. London.
Minerva: Newman, Anthony King; & Co) *CME 3-628-48742-0
- LOVERS AND FRIENDS (1821.
London. Minerva: Newman, Anthony King; & Co) *CME 3-628-48804-4
- GUILTY OR NOT GUILTY (1822.
London. Minerva: Newman, Anthony King; & Co) *CME 3-628-48803-6
- WOMAN'S A RIDDLE (1824. London.
Minerva: Newman, Anthony King; & Co) *CME 3-628-48789-7
- DEEDS OF THE OLDEN TIME (1826.
London. Minerva: Newman, Anthony King; & Co) *CME 3-628-48797-8
- UNCLE PEREGRINE'S HEIRESS
(1828. London. Minerva: Newman, Anthony King; & Co) *CME
The user could
easily request far more detailed information for these titles,
such as the full title as it appears on the title-page, further
editions, translations, etc.
Query 3 (Fig 4):
The final example shows how the database can again be used as
the basis of complex and/or significant analysis of broad sweeps
of data from the period. In this case, a graph of price fluctuations
has been calculated, showing the minimum, maximum, and average
prices per volume. A query requesting analysis of the Year field
followed by the Min, Max, and Average options for an expression
Price/Vol was run, imported into Excel 97, and a graph was created
from this data.
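The per-year minimum, maximum, and average of a Price/Vol expression can be sketched as below. This is an illustration only: it assumes the expression means a title's total price divided by its number of volumes, and the sample rows are invented rather than taken from the database.

```python
from collections import defaultdict
from statistics import mean

# Invented (year, total price in decimal shillings, number of volumes) rows.
titles = [
    (1800, 10.5, 3),
    (1800, 8.0, 2),
    (1801, 21.0, 3),
    (1801, 12.0, 3),
    (1801, 4.0, 1),
]

# Group the per-volume price of each title under its year.
per_volume = defaultdict(list)
for year, price, volumes in titles:
    per_volume[year].append(price / volumes)  # assumed meaning of Price/Vol

# For each year: (minimum, maximum, average) price per volume.
summary = {
    year: (min(values), max(values), mean(values))
    for year, values in sorted(per_volume.items())
}

for year, (lowest, highest, average) in summary.items():
    print(year, lowest, highest, average)
```

The resulting rows are in the shape that would then be passed to a spreadsheet or plotting package for graphing.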
4. Example Graph Extrapolated From Database
The first phase of the database project, involving
the entry of basic bibliographical data for each title, was completed
early in 1999 after two years’ work. There are a few minor elements
of data entered from the first phase that could be developed more
appropriately for statistical analysis. For instance, our Translation
and Further Editions fields at the moment are simply entered as
notes: an appropriate forward move would be to split the information
contained in these into separate fields, as with our Publisher
category, specifying Year, Place of Publication, and Title.
The main thrust
of future development, however, will be towards the individual
records themselves, in terms of production and especially readership.
In May 1999 a research application made by the Centre for Editorial
and Intertextual Research to develop a second phase was rated
Alpha Plus by the Arts and Humanities Research Board, resulting
in the award of funding in the amount of £30,000, to support
the employment of a post-doctoral Research Associate.
Phase
Two will consist of one year’s extensive data-collection,
concentrating especially on library catalogues and reviews, followed
by the processing of the material after collection. To this end,
a post is now available for a Research Associate who would be
responsible, from January 2000, for gathering pertinent information
in the first year (see section 5 for more details). This would
involve the examination of a variety of sources for information
which would then be added to the records as appropriate, in order
to build on our perceptions of the presentation of and reaction
to fiction of the early nineteenth century. Sources marked for
examination include the following:
- Details of holdings in circulating library
catalogues: the project already holds xeroxes of a substantial
number belonging to the period (we have already processed those
from the Newman (London), Kinnear (Edinburgh), and Bettison
(Cheltenham) circulating libraries)
- Subscription lists: the project
has xeroxes of more than 60 lists
- Reviews: this new phase of the
project will enable inclusion of material from a wider band
of contemporary journals
- Newspaper announcements and advertisements
- Publishing papers: details from
archives such as the Longman Papers (the microfilms of which
are being purchased by the Centre) concerning print runs, copies
- Anecdotal information: collected
from contemporary memoirs, etc.
In terms of miscellaneous additions,
we also aim to improve the database by including,
wherever possible, information such as biographical details
of authors, review transcripts, facsimiles of title pages, and
other significant matter (e.g. illustrations). As well as
this, we would improve on our user interface, so that standard
searches can be made by non-specialist users with as much ease
Our aims at this
stage are clear and appropriately narrowed, however, and we are
focused on developing the aspects of reception we have detailed
above before proceeding, in the longer term, to the inclusion
of further materials.
31 December, 2001
This document is maintained by Anthony