California Digital Newspaper Collection

I discovered the California Digital Newspaper Collection a few months ago. Now, as part of my daily routine, I log in and correct text. It is most interesting as it contains scans of newspapers from 1846 to 2012.

To start, I was working on the Alta California from San Francisco in the 1849-1850 time frame. The CDNC scanned the newspapers and used OCR software to convert them to text. Naturally, this doesn’t work all the time: the papers can be difficult to read, and the software often fails to “translate” them correctly.

There are a few thousand people with CDNC accounts, and we correct the texts using the scanned image of the newspaper’s page as a guide. Currently, I’ve corrected 16,000 lines of text and hold a rank of 31st in “most lines corrected.” It is very interesting reading most of the time, but there are times when I just power through the want-ads, very verbose speeches, or lists of things that need correcting but are a bit tedious. The third time through the shipping column gets pretty boring, but it needs to be done.

If you like history and have some time to correct a few lines, you can read all about it here:

http://cdnc.ucr.edu/

Here is an example taken from the San Francisco CALL just after the April 1906 earthquake. Many businesses advertised new addresses like this:

THEO. TREYER, the wigmaker, of 331 Kear-
ny st.,  S. F. can be found at his residence,
842 52d st,. bet. Grove and Genoa, Oakland.

Or this rather bizarre one:

An insane Chinaman at the Presidio Hospital was killed yesterday by a delirious mental patient. The Oriental’s skull was crushed by an iron bar in the hands of his crazy aggressor.

 
