How to train your captioner?
We manage experiments through config files – a config file should contain arguments specific
to a particular experiment, such as those defining model architecture or optimization
hyperparameters. Other arguments, such as GPU IDs or the number of CPU workers, should be
declared in the script and passed in as argparse-style arguments.
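As a sketch of this split, the script-level (machine-specific) arguments could be declared as follows. The argument names match those used in the commands below; default values here are illustrative only:

```python
import argparse

# Script-level arguments: machine-specific, not part of the experiment config.
parser = argparse.ArgumentParser(description="Train a captioner.")
parser.add_argument("--config", required=True, help="Path to experiment config (.yaml).")
parser.add_argument(
    "--config-override", nargs="*", default=[],
    help="Key-value pairs overriding config values, e.g. OPTIM.BATCH_SIZE 250",
)
parser.add_argument("--gpu-ids", nargs="+", type=int, default=[0], help="GPU IDs to use.")
parser.add_argument(
    "--serialization-dir", default="checkpoints/experiment",
    help="Directory for checkpoints and Tensorboard logs.",
)

# Parse a sample command line (normally parse_args() reads sys.argv).
args = parser.parse_args(
    ["--config", "configs/updown_nocaps_val.yaml", "--gpu-ids", "0", "1"]
)
print(args.gpu_ids)  # → [0, 1]
```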
UpDown Captioner (without CBS)
Train a baseline UpDown Captioner with all the default hyperparameters as follows. This
reproduces the results of the first row in the nocaps val table from our paper.
python scripts/train.py \
--config configs/updown_nocaps_val.yaml \
--gpu-ids 0 --serialization-dir checkpoints/updown
Refer to updown.config.Config for default hyperparameters.
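For illustration only, a config file pairing with the commands here might have a nested structure like the following – the actual keys and defaults live in updown.config.Config, and only OPTIM.BATCH_SIZE is taken from this document:

```yaml
# Hypothetical structure -- consult updown.config.Config for the real keys.
MODEL:
  EMBEDDING_SIZE: 300
OPTIM:
  BATCH_SIZE: 150
  LR: 0.015
```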
For other configurations, write your own config file, and/or override specific values by
passing a set of key-value pairs through the --config-override argument. For example:
python scripts/train.py \
--config configs/updown_nocaps_val.yaml \
--config-override OPTIM.BATCH_SIZE 250 \
--gpu-ids 0 --serialization-dir checkpoints/updown-baseline
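Overrides of this form can be implemented by walking dotted keys into a nested config dict. A minimal sketch, not the repository's actual Config implementation:

```python
def apply_overrides(config, overrides):
    """Apply a flat [key, value, key, value, ...] list of dotted-key
    overrides to a nested config dict, in place."""
    for key, value in zip(overrides[0::2], overrides[1::2]):
        node = config
        parts = key.split(".")
        # Descend to the parent of the leaf key.
        for part in parts[:-1]:
            node = node[part]
        # Cast the new value to the type of the existing one (e.g. int).
        old = node[parts[-1]]
        node[parts[-1]] = type(old)(value)

# Example: mimic `--config-override OPTIM.BATCH_SIZE 250`.
config = {"OPTIM": {"BATCH_SIZE": 150, "LR": 0.015}}
apply_overrides(config, ["OPTIM.BATCH_SIZE", "250"])
print(config["OPTIM"]["BATCH_SIZE"])  # → 250
```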
Note
This configuration uses randomly initialized word embeddings, which are learned during
training. It is not possible to run Constrained Beam Search on this checkpoint.
UpDown Captioner (with CBS)
Train a baseline UpDown Captioner with Constrained Beam Search decoding during evaluation.
This reproduces the results of the second row in the nocaps val table from our paper.
python scripts/train.py \
--config configs/updown_plus_cbs_nocaps_val.yaml \
--gpu-ids 0 --serialization-dir checkpoints/updown_plus_cbs
The only difference from the original config is the word embedding size: here it is set to
the GloVe dimension (300), and the embeddings are frozen during training. A checkpoint
trained using this config can also be run without Constrained Beam Search decoding.
Additional Details
Multi-GPU Training
Multi-GPU training is fully supported; pass GPU IDs as --gpu-ids 0 1 2 3.
Saving Model Checkpoints
This script serializes model checkpoints every few iterations and keeps track of the best
performing checkpoint based on overall CIDEr score.
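The best-checkpoint bookkeeping can be sketched as follows; this is a simplified stand-in for the script's actual checkpoint manager, keyed on CIDEr:

```python
class BestCheckpointTracker:
    """Track the checkpoint with the highest CIDEr score seen so far."""

    def __init__(self):
        self.best_cider = float("-inf")
        self.best_path = None

    def update(self, cider_score, checkpoint_path):
        # Return True when this checkpoint becomes the new best.
        if cider_score > self.best_cider:
            self.best_cider = cider_score
            self.best_path = checkpoint_path
            return True
        return False

# Hypothetical scores at two serialization steps.
tracker = BestCheckpointTracker()
tracker.update(0.712, "checkpoints/updown/ckpt_5000.pth")
tracker.update(0.698, "checkpoints/updown/ckpt_10000.pth")
print(tracker.best_path)  # → checkpoints/updown/ckpt_5000.pth
```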
Logging
This script logs loss curves and metrics to Tensorboard; log files are written to
--serialization-dir.
Execute tensorboard --logdir /path/to/serialization_dir --port 8008 and visit
localhost:8008 in the browser.