ncse: Images by category
Purpose of this archive
This archive was created to offer a more convenient way to find images which represent specific subjects. The subjects belong to a taxonomy developed for the NCSE project – a copy of the taxonomy expressed in SKOS/RDF is included for reference.
Note that this archive is not comprehensive. Many small, low-resolution images have been left out, as have numerous duplicates of images repeated in advertisements or used as page ornaments. In several cases the automated process of finding and aggregating images could not match an index entry to its image, or an image to an index entry.
Contents
1. Archive description (this document)
2. Five compressed folders containing images from the following journals:
a. Leader
b. Monthly Repository
c. Northern Star
d. Publishers' Circular
e. Tomahawk
3. Four index files: two types of index, each provided in two formats
a. category_instance_index.txt
b. category_instance_index.xml
c. journal_issue_to_image_index.txt
d. journal_issue_to_image_index.xml
4. Image subject taxonomy file
Structure and contents of compressed folders of journal images
When uncompressed, each folder of images has no internal structure; it simply contains a collection of image files. The filenames take the form [EDITION PREFIX][DATE][TYPE&NUMBER][FILETYPE SUFFIX], where:
[EDITION PREFIX] is one of the edition abbreviations used in the main archive for the journal (e.g. ‘LDR’, ‘CLDR’)
[DATE] is the date of the issue that contains the image
[TYPE&NUMBER] is a combination of a category abbreviation (e.g. ‘Ad’ for advert, ‘Pg’ for page, ‘Pc’ for picture) and a number which refers to the page of the issue on which the image occurs
[FILETYPE SUFFIX] is either .jpg or .png
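The filename scheme above can be split into its four parts with a regular expression. The following is only a minimal sketch: this document does not specify the date layout or whether the parts are separated by any character, so the pattern assumes an eight-digit year-month-day date and tolerates optional '-' or '_' separators. The names ImageName and parse_image_name are hypothetical and are not part of the archive.

```python
import re
from typing import NamedTuple, Optional


class ImageName(NamedTuple):
    edition: str  # edition prefix, e.g. 'LDR' or 'CLDR'
    date: str     # issue date, kept exactly as written in the filename
    kind: str     # category abbreviation, e.g. 'Ad', 'Pg', 'Pc'
    number: str   # page-related number, kept as a string to preserve zeros
    suffix: str   # '.jpg' or '.png'


# ASSUMPTION: the date is YYYYMMDD with optional '-' or '_' between its
# parts, and the four filename parts may be joined by an optional '-' or '_'.
# Adjust the pattern after inspecting the real filenames.
PATTERN = re.compile(
    r"^(?P<edition>[A-Z]+)"
    r"[-_]?"
    r"(?P<date>\d{4}[-_]?\d{2}[-_]?\d{2})"
    r"[-_]?"
    r"(?P<kind>[A-Za-z]+)"
    r"(?P<number>\d+)"
    r"(?P<suffix>\.(?:jpg|png))$"
)


def parse_image_name(filename: str) -> Optional[ImageName]:
    """Return the parts of an image filename, or None if it does not match."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    return ImageName(**match.groupdict())
```

Filenames that do not fit the assumed layout return None rather than raising an error, so a whole folder can be scanned and the unmatched names reviewed by hand.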
NOTE: the folder for each journal does not contain every graphic image from every issue. This is because:
a) there are often multiple instances of a small image (e.g. one used in a recurring advertisement), and such duplicates have been left out
b) it was not always possible to make an automatic match between an index reference and the relevant image file
Indexes
There are two types of index:
· category instances – structured as a taxonomy of subject types; for each type it shows where images manifesting that type occur in journal issues.
· journal issues – lists the images that occur in each journal issue and, for each image, the types of subject it manifests.
Each type of index is provided in both XML format and in plain text.
NOTE: the indexes are not comprehensive. There may be images which are not mentioned in the indexes. Conversely, as mentioned above, an image referenced in an index may be absent from the compressed image folder. Such an image can still be seen by downloading the main archive for the journal in question and then either:
a) viewing the relevant PDF file from the compressed folder of PDFs (the quickest and most convenient way, but probably not the highest resolution), or
b) finding the original image file within the Document.zip file for that issue in the main compressed journal archive file (less convenient, but probably better resolution than the PDF version)
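To find out in advance which indexed images will need to be fetched from the main archive, the referenced filenames can be compared with the contents of an uncompressed image folder. The sketch below assumes the filenames have already been collected from one of the index files; how to extract them is not shown, because the layout of the index files is not described in this document. The function name missing_images is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def missing_images(referenced: Iterable[str], image_folder: str) -> List[str]:
    """Return the referenced filenames that are not present in image_folder.

    ASSUMPTION: 'referenced' holds bare filenames (no directory part) spelled
    exactly as they appear in the uncompressed folder, which has no
    internal structure.
    """
    present = {p.name for p in Path(image_folder).iterdir() if p.is_file()}
    return sorted(set(referenced) - present)
```

Any filename returned by this function would need to be looked up in the main archive for the journal in question.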