Introduction
It has always surprised me that there is no standard for transcribing
English parish registers. We see all manner of variations, from a literal
line by line transcript, to a simple index of surnames and dates, and all
manner of formats in between. I suppose that the variations have evolved
from handwritten and typewritten transcripts of the days before computers,
and in many ways these methods and styles have been carried forward,
often reluctantly, into the computer age.
There is the school of thought that there is nothing like good old paper
for archiving information, and with this I would certainly have no argument,
for computer data, stored on whatever media, can hardly be considered to
be permanent, or indeed, readable by future generations of computers and
software. However, data stored in a computer database has one outstanding
advantage: it can be searched and sorted in the blink of an eye.
Imagine, even with a well-indexed paper transcript, never mind an original
parish baptism register, how long it would take to find every baptism of
a SMITH in a large parish with some 200,000 baptisms. Life is just too short.
A computer database can do it in one or two seconds, ready to print. It can
also sort out all of the families for you.
A database is the ideal way to store information such as a parish register.
It can be sorted and searched in seconds (or less), even with a file containing
hundreds of thousands of records. We can still print out a meaningful
transcription onto paper, and the database can also be used to
create an instant index for the printed copy. On the computer itself an
index is no longer needed, as the information can be searched for and
found incredibly quickly. But we need to consider the way that the data is
entered in the first place, and hence the point of this article.
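As a rough illustration of that kind of search and sort, here is a minimal
sketch in Python using its built-in sqlite3 module. The table name, the
column names and the sample rows are all invented for the example, and the
dates are stored year-first purely so that they sort correctly as text. It
is one possible way of doing it, not a recommendation of any particular
program.

```python
import sqlite3

# Hypothetical table and column names; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE baptisms ("
    "entry_no INTEGER, baptism_date TEXT, first_name TEXT, "
    "surname TEXT, father TEXT, mother TEXT)"
)
rows = [
    (1, "1875-03-26", "John", "SMITH", "James", "Mary"),
    (2, "1876-05-14", "Ann", "JONES", "Thomas", "Sarah"),
    (3, "1873-11-02", "Mary", "SMITH", "James", "Mary"),
    (4, "1874-08-09", "William", "SMITH", "Henry", "Eliza"),
]
conn.executemany("INSERT INTO baptisms VALUES (?, ?, ?, ?, ?, ?)", rows)


def find_surname(connection, surname):
    """Return every baptism for one surname, grouped by parents, then by date."""
    cursor = connection.execute(
        "SELECT baptism_date, first_name, father, mother FROM baptisms "
        "WHERE surname = ? ORDER BY father, mother, baptism_date",
        (surname,),
    )
    return cursor.fetchall()


smiths = find_surname(conn, "SMITH")
for baptism_date, first_name, father, mother in smiths:
    print(baptism_date, first_name, "child of", father, "and", mother)
```

Ordering by father and mother before the date is what keeps the children of
each couple together, which is the "sorting out of families" in miniature.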
There is one point which I should make very clear right from the start. Some people
like the idea of entering the data into a spreadsheet program such as Microsoft
Excel. Quite simply, it isn't man enough for the task. A spreadsheet is not
intended for this type of data; it is intended for performing calculations.
Sure, it has a "table" format similar to a database, but a database program
is so much more powerful when it comes to doing searches, queries and filters.
Furthermore, a spreadsheet runs out of space (there are only
so many records that you can enter before it is full up!), whereas a database's
size is, for practical purposes, limited only by the size of your hard disk.
Forget the idea of using
spreadsheets for this type of data!
I would thoroughly recommend the exercise of transcribing parish registers
to anyone who has a computer with database software. There is nothing really
difficult about learning to use a database, and in many ways it is easier
than using a modern word processor. The exercise is boring, but you get an
incredible amount of satisfaction from the knowledge that you are helping
other fellow researchers. You also gain a great deal of experience in reading
old handwriting. A good hint, if you haven't done it before, is to start
by transcribing the post-1813 registers for a parish first. The handwriting
is usually a little clearer, and you get to know the surnames and places
referred to. That makes it a whole lot easier when you come to do the older
registers.
There is a down-side though. Someone has to enter all of the information
into the computer database, typing it in line by line. It is a painfully
slow process, and mind-blowingly boring. It could never, however, be
considered a thankless task, and I would recommend the exercise to
anyone, no matter how slowly they type. What we should consider, though, is
the format in which the information is transcribed, so that it can be easily
searched.
There are basically three types of computer software which can be used: a
word processor, a spreadsheet, or a database. Only one of these will
do the task that we want really successfully. A database was invented for
this type of information, so that it can be sorted and searched with great
ease. A database stores the information in what are known as fields,
with one field for each item of data. A collection of fields makes up the
information for one record, that is, all of the information about the one
person or event (e.g. a baptism). A collection of records makes up the whole
database file.
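The idea of fields and records can be sketched in a few lines of Python.
This is only an illustration: the class name BaptismRecord is made up, and
the field names are simply borrowed from the sample baptism entry used in
this article.

```python
from dataclasses import dataclass, fields


@dataclass
class BaptismRecord:
    """One record: all of the information about one baptism."""
    entry_no: str       # each attribute is one field
    birth_date: str
    baptism_date: str
    first_name: str
    sex: str
    father: str
    mother: str


# The database file is then a collection of records.
register = [
    BaptismRecord("0001", "20 Jan 1875", "26 Mar 1875",
                  "John", "son", "James", "Mary"),
]

field_names = [f.name for f in fields(BaptismRecord)]
print(field_names)
print(register[0].first_name)
```

Because every record has the same named fields, a program can pick out one
item, such as the first name, from any record without having to read the
whole line.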
The use of commas and quotation marks in a database file
It is very tempting to enter data such as: Margin note:
Illegitimate, or 13, High Street, Newent. Commas and quotation
marks should never be used in a database! They present a big problem
if the data needs to be exported to different database software. I prefer
to use Microsoft Access as my database software, but not everyone else does;
furthermore, others may have different types of computers.
There is a long-established standard for transferring data between
different databases, and virtually every database program can import
files of this type. The file type is known as a Comma Separated Values
file (sometimes called Comma Separated Variable), or CSV file for short.
It is a simple text-only computer file. Each field of data is separated
by a comma, and may be enclosed in speech marks. It looks something
like this:
No,Birth date,Baptism date,First name,sex,Father,Mother .... etc.
0001,20 Jan 1875,26 Mar 1875,John,son,James,Mary ... etc.
So if our data contains commas or speech marks, then it is impossible for
it to be transferred sensibly to a different database or a database in a
different type of computer. Well, not quite impossible, but it entails hours
of work editing the CSV file to get rid of all of the extra commas and speech
marks. Believe me, it is a pain!
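To see the problem in action, here is a small Python sketch. It assumes a
very simple import routine that splits each line at every comma, which is
my assumption rather than a description of any particular program. The
"Abode" field and the second row are invented for the example.

```python
# Hypothetical header and rows; "Abode" is an invented field for illustration.
header = "No,Birth date,Baptism date,First name,sex,Father,Abode"
clean_line = "0001,20 Jan 1875,26 Mar 1875,John,son,James,High Street Newent"
comma_line = "0002,3 Feb 1875,28 Mar 1875,Ann,dau,Thomas,13, High Street, Newent"


def naive_split(line):
    """Split on every comma, as a very simple import routine might."""
    return line.split(",")


expected = len(naive_split(header))
clean_fields = naive_split(clean_line)
comma_fields = naive_split(comma_line)

# The clean row matches the header; the row with commas in the address does not.
print(expected, len(clean_fields), len(comma_fields))
```

The address has been broken into three separate fields, so everything from
that point on no longer lines up with its heading.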
Never use speech marks or commas in database files!
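One way to enforce this rule is to check each record before it is saved.
The following Python sketch is only a suggestion: the function names, the
field names and the sample record are all hypothetical, and it treats only
the comma and the double speech mark as forbidden characters.

```python
def find_problem_fields(record):
    """Return the names of fields whose text contains a comma or a speech mark."""
    forbidden = {",", '"'}
    return [name for name, value in record.items()
            if any(ch in forbidden for ch in value)]


def clean_field(value):
    """Drop commas and speech marks, then collapse any doubled spaces."""
    for ch in (",", '"'):
        value = value.replace(ch, "")
    return " ".join(value.split())


# Invented sample record for illustration.
record = {
    "first_name": "John",
    "abode": "13, High Street, Newent",
    "note": 'Margin note: "Illegitimate"',
}

problems = find_problem_fields(record)
cleaned = {name: clean_field(value) for name, value in record.items()}

print(problems)
print(cleaned["abode"])
```

It is probably safer to report the problem fields and correct them by hand
than to clean them automatically, since the transcriber can then decide how
best to reword the entry.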