| Case Studies | |
| Case 6: Data Conversion of Historic Records | |
The client is a court whose recorded cases date back to the 14th century. All of these proceedings are recorded in old Gothic and Victorian English. |
|
| Client’s requirement: | |
All the proceedings from the 14th century onward are stored on microfilm, microfiche and paper, all of which are fading. These past proceedings are of public interest, but present-day lawyers, law students and the public at large did not have access to them. To make the cases widely accessible, the court wanted to convert them all into a digital format that could be viewed over the internet. The criminal court, through a university in the UK that acted as consultant, commissioned VITL to provide a solution to the court’s requirement. |
|
| Solution: | |
Starting with microfilms of the original Proceedings, page images were digitized to create TIFF files, from which GIF files were then created for transmission over the internet. In order to create a fully searchable resource, it was necessary to digitize the entire text and not just page images of the Proceedings, so that the text could be searched for any desired character string. Current optical character recognition programmes cannot consistently read eighteenth-century fonts, particularly where the original pages are faded or damaged. Consequently, the text had to be typed manually. This was done by a process known as 'double re-keying', whereby the text is typed in twice, by two different typists, and the two files are then compared by computer. Differences are identified and then resolved manually. Because the text was written in archaic English, some of its characters were unidentifiable to the operators, so VITL developed a training strategy to educate them in the older forms of the language. With a perfectly clear original text, this conversion resulted in an accuracy rate of 99.8%. However, the fourteenth, seventeenth and eighteenth-century originals are often faded or suffer from 'bleed through' (where print on the other side of the page shows through), and these defects are sometimes exacerbated by the processes of microfilming and image digitization. Consequently, not all text could be transcribed with optimal accuracy. Where re-keyers had particular difficulty reading text because of the poor quality of the original, a torn-page symbol appears on screen next to the transcribed text. By clicking on the thumbnail icon of the original page, users can view an image of the original and interpret the text for themselves. Where a perfectly accurate reading of the text is required, users are therefore advised to open the original page image files and read the original. |
|
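The comparison step in double re-keying can be illustrated with a minimal sketch. It assumes each typist's output is available as a plain string and uses Python's standard `difflib` module to list the word spans on which the two keyings disagree. The function name `compare_keyings` and the sample sentences are hypothetical and are not taken from the project, whose actual comparison software is not described here.

```python
import difflib


def compare_keyings(first: str, second: str):
    """Return (tag, words_from_first, words_from_second) for each disagreement."""
    a = first.split()
    b = second.split()
    # autojunk=False keeps frequent words from being treated as noise
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    disagreements = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is 'replace', 'delete' or 'insert'
            disagreements.append((tag, a[i1:i2], b[j1:j2]))
    return disagreements


# Hypothetical sample: the second typist misread one word.
typist_a = "The prisoner was indicted for stealing a silver watch"
typist_b = "The prisoner was indited for stealing a silver watch"

for tag, left, right in compare_keyings(typist_a, typist_b):
    print(tag, left, right)
```

Each reported span would then go to a human reviewer, who checks it against the page image and decides which reading is correct. This corresponds to the manual resolution step described above.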
| Results: | |
This project has created a fully digitized and structured version of all surviving published trial accounts from the 14th to the 19th century, and made them available as a searchable online resource. |
|
