updown.data.readers
A Reader simply reads data from disk and returns it _almost_ as is. Readers should be
used by a PyTorch Dataset. Heavy data pre-processing is not
recommended in the reader, such as tokenizing words to integers, embedding tokens, or passing
an image through a pre-trained CNN. Each reader must implement at least two methods:

- __len__ to return the length of data this Reader can read.
- __getitem__ to return data based on an index or a primary key (such as image_id).
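The contract above can be sketched with a toy reader. SimpleCaptionsReader below is a hypothetical class (not part of updown.data.readers) that returns records almost as-is, leaving tokenization and other pre-processing to downstream components:

```python
# A minimal sketch of the Reader contract: __len__ plus __getitem__,
# with no heavy pre-processing inside the reader itself.
class SimpleCaptionsReader:
    def __init__(self, records):
        # records: list of (image_id, caption) tuples, as if read from disk.
        self._records = list(records)

    def __len__(self):
        # Length of data this Reader can read.
        return len(self._records)

    def __getitem__(self, index):
        # Return data based on an index, almost as-is.
        image_id, caption = self._records[index]
        return {"image_id": image_id, "caption": caption}


reader = SimpleCaptionsReader([(42, "a cat on a mat"), (7, "a dog")])
print(len(reader))  # 2
print(reader[0])    # {'image_id': 42, 'caption': 'a cat on a mat'}
```

A PyTorch Dataset would typically hold such a reader and apply tokenization in its own __getitem__.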
class updown.data.readers.ImageFeaturesReader(features_h5path: str, in_memory: bool = False)[source]

 Bases: object

 A reader for H5 files containing pre-extracted image features. A typical image features
 file should have at least two H5 datasets, named image_id and features. It may optionally
 have other H5 datasets, such as boxes (for bounding box coordinates), width and height
 for image size, and others. This reader only reads image features, because our UpDown
 captioner baseline does not require anything other than image features.

 Example of an H5 file:
 image_bottomup_features.h5
    |--- "image_id" [shape: (num_images, )]
    |--- "features" [shape: (num_images, num_boxes, feature_size)]
    +--- .attrs {"split": "coco_train2017"}
- Parameters
 - features_h5path : str
 Path to an H5 file containing image ids and features corresponding to one of the four splits used: "coco_train2017", "coco_val2017", "nocaps_val", "nocaps_test".
 - in_memory : bool
 Whether to load the features in memory. Beware, these files are sometimes tens of GBs in size. Set this to true if you have sufficient RAM.
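The file layout above can be reproduced with a few lines of h5py. This is a sketch with toy sizes and random values, not real extracted features; it assumes h5py and numpy are available:

```python
# Write a tiny H5 file with the layout expected by ImageFeaturesReader,
# then read it back into an image_id -> features mapping.
import os
import tempfile

import numpy as np
import h5py

path = os.path.join(tempfile.mkdtemp(), "image_bottomup_features.h5")

num_images, num_boxes, feature_size = 3, 4, 5
with h5py.File(path, "w") as f:
    f.create_dataset("image_id", data=np.array([11, 22, 33]))
    f.create_dataset(
        "features", data=np.random.rand(num_images, num_boxes, feature_size)
    )
    f.attrs["split"] = "coco_train2017"

# Build the mapping a reader would expose: features keyed by image_id.
with h5py.File(path, "r") as f:
    ids = f["image_id"][:]
    feats = {int(i): f["features"][idx] for idx, i in enumerate(ids)}

print(feats[22].shape)  # (4, 5)
```

Keeping the mapping keyed by image_id (rather than by positional index) is what lets __getitem__ act on a primary key.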
class updown.data.readers.CocoCaptionsReader(captions_jsonpath: str)[source]

 Bases: object

 A reader for annotation files containing training captions. These are JSON files in COCO format.
- Parameters
 - captions_jsonpath : str
 Path to a JSON file containing training captions in COCO format (COCO train2017 usually).
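COCO captions files store each caption under an "annotations" list, pairing an image_id with a caption string. A sketch of reading that structure with the standard library, using a toy in-memory stand-in for the real file (which also carries "images" and "info" fields):

```python
# Parse a COCO-style captions JSON and collect (image_id, caption) pairs.
import json
import os
import tempfile

coco_like = {
    "annotations": [
        {"image_id": 1, "caption": "a bowl of fruit on a table"},
        {"image_id": 1, "caption": "fruit in a bowl"},
        {"image_id": 2, "caption": "a red bus on the street"},
    ]
}
path = os.path.join(tempfile.mkdtemp(), "captions.json")
with open(path, "w") as f:
    json.dump(coco_like, f)

# Read back as (image_id, caption) pairs -- the natural unit a
# captions reader returns, one caption per __getitem__ call.
with open(path) as f:
    annotations = json.load(f)["annotations"]
pairs = [(ann["image_id"], ann["caption"]) for ann in annotations]
print(len(pairs))  # 3
```

Note that one image_id can appear several times, since COCO provides multiple reference captions per image.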
class updown.data.readers.ConstraintBoxesReader(boxes_jsonpath: str)[source]

 Bases: object

 A reader for annotation files containing detected bounding boxes (in COCO format). The
 JSON file should have categories, images and annotations fields (similar to COCO
 instance annotations).

 For our use cases, the detections are from an object detector trained using Open Images.
 These can be produced for any set of images by following instructions here.
- Parameters
 - boxes_jsonpath : str
 Path to a JSON file containing bounding box detections in COCO format (nocaps val/test usually).
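A sketch of what parsing such a file involves: the three COCO-style fields, with detected boxes grouped by image_id. The toy data below only mimics the structure (real annotations come from an Open Images-trained detector), and COCO bboxes are [x, y, width, height]:

```python
# Group COCO-style box detections by image_id, resolving category names.
from collections import defaultdict

boxes_json = {
    "categories": [{"id": 1, "name": "dog"}, {"id": 2, "name": "car"}],
    "images": [{"id": 100}, {"id": 101}],
    "annotations": [
        {"image_id": 100, "category_id": 1, "bbox": [10, 20, 50, 60], "score": 0.9},
        {"image_id": 100, "category_id": 2, "bbox": [0, 0, 30, 40], "score": 0.7},
        {"image_id": 101, "category_id": 1, "bbox": [5, 5, 25, 25], "score": 0.8},
    ],
}

# Map category ids to readable class names.
class_names = {c["id"]: c["name"] for c in boxes_json["categories"]}

# Collect (class_name, bbox, score) triples per image.
boxes_per_image = defaultdict(list)
for ann in boxes_json["annotations"]:
    boxes_per_image[ann["image_id"]].append(
        (class_names[ann["category_id"]], ann["bbox"], ann["score"])
    )

print(len(boxes_per_image[100]))  # 2
```

Grouping by image_id up front lets a reader answer "which constraint boxes belong to this image?" with a single dictionary lookup.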