King's College London
11 files:
README_images_by_category.docx (document, 26.63 kB)
category_instance_index.txt (text, 216.27 kB)
category_instance_index.xml (text, 1.88 MB)
journal_issue_to_image_index.txt (text, 553.37 kB)
journal_issue_to_image_index.xml (text, 2.46 MB)
Leader_images.zip (archive, 1.45 MB)
Monthly_Repository_images.zip (archive, 947.39 kB)
Northern_Star_images.zip (archive, 20.98 MB)
Publishers_Circular_images.zip (archive, 119.11 MB)
Tomahawk_images.zip (archive, 131.04 MB)
image_subject_taxonomy.xml (text, 110.55 kB)

ncse: Images by category

dataset
Posted on 2024-10-28, 17:04. Authored by Mark Turner.

Purpose of this archive

This archive was created to offer a more convenient way to find images which represent specific subjects. The subjects belong to a taxonomy developed for the NCSE project – a copy of the taxonomy expressed in SKOS/RDF is included for reference.

Note that this archive is not comprehensive. Many small, low-resolution images have been left out, as have numerous duplicates of repeated images used in advertisements or as page ornaments. In several cases the automated process of finding and aggregating images could not match an index entry to an image (or vice versa).

Contents

1. Archive description (this document)

2. Five compressed folders containing images from the following journals:

a. Leader

b. Monthly Repository

c. Northern Star

d. Publishers Circular

e. Tomahawk

3. Four index files, comprising two types, and two formats for each type

a. category_instance_index.txt

b. category_instance_index.xml

c. journal_issue_to_image_index.txt

d. journal_issue_to_image_index.xml

4. Image subject taxonomy file
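
Because the taxonomy is supplied in SKOS/RDF, its subject labels can be listed with a few lines of Python. This is only a sketch: it assumes the file is an RDF/XML serialisation that uses skos:prefLabel elements, which this description does not confirm, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard SKOS core namespace (W3C); whether the taxonomy file uses
# skos:prefLabel for its subject names is an assumption.
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def preferred_labels(xml_data):
    """Return the text of every skos:prefLabel element in an RDF/XML document."""
    root = ET.fromstring(xml_data)
    return [
        element.text.strip()
        for element in root.iter(f"{{{SKOS_NS}}}prefLabel")
        if element.text
    ]
```

For the file in this archive, the call might look like preferred_labels(open("image_subject_taxonomy.xml", "rb").read()).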

Structure and contents of compressed folders of journal images

When uncompressed, each folder of images has no internal structure and simply contains a collection of image files. The filenames are in the form [EDITION PREFIX][DATE][TYPE&NUMBER][FILETYPE SUFFIX] where

[EDITION PREFIX] will be one of the same edition abbreviations found in the main archive for the Journal (e.g. ‘LDR’, ‘CLDR’)

[DATE] is the date of the issue that contains the image

[TYPE&NUMBER] is a combination of a category abbreviation (e.g. ‘Ad’ for advert, ‘Pg’ for page, ‘Pc’ for picture) and a number which refers to the issue page the image occurs on.

[FILETYPE SUFFIX] will be either .jpg or .png
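
As a rough illustration of the naming scheme above, the following Python sketch splits a filename into its parts. The description does not specify the date format, the separators (if any) or the width of the number, so the pattern is an assumption, and the function name is hypothetical.

```python
import re

# Assumed layout (not confirmed by the archive description):
#   - the edition prefix is the leading run of letters
#   - a two-letter type code (capital + lowercase, e.g. Ad, Pg, Pc) followed
#     by digits sits immediately before the suffix
#   - whatever lies in between is treated as the date, in unknown format
FILENAME_RE = re.compile(
    r"^(?P<edition>[A-Za-z]+)"
    r"(?P<date>.*?)"
    r"(?P<type>[A-Z][a-z])"
    r"(?P<number>\d+)"
    r"\.(?P<suffix>jpg|png)$"
)


def parse_image_filename(name):
    """Split an image filename into its parts, or return None if it does not fit."""
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    # Drop any separator characters left around the date portion.
    parts["date"] = parts["date"].strip("_-. ")
    return parts
```

Calling parse_image_filename("LDR_1850-03-30_Ad00101.jpg") on that invented filename would give the edition as "LDR", the type as "Ad" and the number as "00101". Real filenames in the archive may need a stricter or different pattern.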


NOTE: the folder for each journal does not contain every graphic image from every issue. This is because

a) there are often multiple instances of a small image (e.g. one that is used in a recurring advertisement)

b) it was not always possible to make an automatic match between an index reference and the relevant image file

Indexes

There are two types of index:

· category instances: this index is structured as a taxonomy of subject types and, for each type, shows where images manifesting that type occur in journal issues.

· journal issues: this index lists the images that occur in journal issues and, for each image, lists the types of subject it manifests.

Each type of index is provided in both XML format and in plain text.
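
The two index types carry the same relationships viewed from opposite directions. The Python sketch below shows the idea by inverting a small issue-to-image mapping into a category-to-instance mapping. The nested-dictionary layout, the function name and the sample data in the usage note are hypothetical; they are not the schema of the supplied XML or text files.

```python
from collections import defaultdict


def invert_issue_index(issue_index):
    """Turn {issue: {image: [subjects]}} into {subject: [(issue, image), ...]}."""
    category_index = defaultdict(list)
    for issue, images in issue_index.items():
        for image, subjects in images.items():
            for subject in subjects:
                category_index[subject].append((issue, image))
    return dict(category_index)
```

For example, an issue index of {"Issue A": {"img1": ["Animals", "Buildings"]}, "Issue B": {"img2": ["Animals"]}} would place both images under "Animals" and only img1 under "Buildings".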

NOTE: the indexes are not comprehensive. There may be images which are not mentioned in the indexes. Conversely, as mentioned above, an image referenced in an index may not be present in the compressed image folder. Such an image can still be seen by downloading the main archive for the journal in question and then doing one of the following: viewing the relevant PDF file from the compressed folder of PDFs (the quickest and most convenient way, but probably not the highest resolution), or finding the original image file within the Document.zip file for that issue in the main compressed journal archive file (less convenient, but probably better resolution than the PDF version).
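
To see which indexed images will need to be looked up in the main journal archive, the referenced names can be compared with what is actually in an uncompressed folder. This is a minimal sketch: it assumes the index references have already been turned into expected filenames, which the index files may not support directly, and both function names are hypothetical.

```python
from pathlib import Path


def names_in_folder(folder):
    """Return the names of the files directly inside an uncompressed image folder."""
    return [path.name for path in Path(folder).iterdir() if path.is_file()]


def find_missing(referenced, present):
    """Return the referenced filenames that are not present, in their original order."""
    present_set = set(present)
    return [name for name in referenced if name not in present_set]
```

A typical call would be find_missing(expected_names, names_in_folder("Leader_images")), where expected_names is a list built from one of the indexes and the folder name is whatever the archive was extracted to.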

Funding

AHRC Resource Enhancement Grant (RE/APN No:18, 260)


Temporal coverage

1806-1890