detectron2.data¶
-
detectron2.data.
DatasetCatalog
(dict)¶ A global dictionary that stores information about the datasets and how to obtain them.
It contains a mapping from strings (which are names that identify a dataset, e.g. “coco_2014_train”) to a function which parses the dataset and returns the samples in the format of list[dict].
The returned dicts should be in Detectron2 Dataset format (see DATASETS.md for details) if used with the data loader functionalities in data/build.py and data/detection_transform.py.
The purpose of having this catalog is to make it easy to choose different datasets, by just using the strings in the config.
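The name-to-function mapping described above can be pictured with a plain dictionary. This is only a sketch of the idea, not the library's implementation: `catalog`, `load_my_dataset`, and the sample records are hypothetical names and data chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the catalog: dataset name -> loader function.
catalog: Dict[str, Callable[[], List[dict]]] = {}


def load_my_dataset() -> List[dict]:
    # A real loader would parse annotation files; these records are made up.
    return [
        {"file_name": "img_0.jpg", "height": 480, "width": 640, "image_id": 0},
        {"file_name": "img_1.jpg", "height": 480, "width": 640, "image_id": 1},
    ]


# Registering stores the function itself, so nothing is parsed yet.
catalog["my_dataset_train"] = load_my_dataset

# The loader only runs when the dataset is requested by its name.
dataset_dicts = catalog["my_dataset_train"]()
```

Storing a function rather than the parsed list keeps registration cheap, and a config only needs to carry the string name.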
-
detectron2.data.
MetadataCatalog
(dict)¶ MetadataCatalog is a global dictionary that provides access to the Metadata of a given dataset.
The metadata associated with a certain name is a singleton: once created, the metadata will stay alive and will be returned by future calls to get(name).
It’s like global variables, so don’t abuse it. It’s meant for storing knowledge that’s constant and shared across the execution of the program, e.g. the class names in COCO.
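The singleton behaviour can be sketched in a few lines. This is an illustrative analogue under stated assumptions, not the library's code: `_metadata` and `get_metadata` are hypothetical names.

```python
from types import SimpleNamespace
from typing import Dict

# Hypothetical module-level store: one namespace object per dataset name.
_metadata: Dict[str, SimpleNamespace] = {}


def get_metadata(name: str) -> SimpleNamespace:
    # Create the object on first access, then always return the same one.
    if name not in _metadata:
        _metadata[name] = SimpleNamespace(name=name)
    return _metadata[name]


# Attributes set in one place are visible wherever the name is looked up.
get_metadata("mydataset").thing_classes = ["person", "dog"]
same = get_metadata("mydataset")
```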
-
detectron2.data.
build_detection_test_loader
(dataset: Union[List[Any], torch.utils.data.Dataset], *, mapper: Callable[[Dict[str, Any]], Any], sampler: Optional[torch.utils.data.Sampler] = None, batch_size: int = 1, num_workers: int = 0, collate_fn: Optional[Callable[[List[Any]], Any]] = None) → torch.utils.data.DataLoader[source]¶ Similar to build_detection_train_loader, with default batch size = 1, and sampler = InferenceSampler. This sampler coordinates all workers to produce the exact set of all samples.
- Parameters
dataset – a list of dataset dicts, or a pytorch dataset (either map-style or iterable). They can be obtained by using DatasetCatalog.get() or get_detection_dataset_dicts().
mapper – a callable which takes a sample (dict) from dataset and returns the format to be consumed by the model. When using cfg, the default choice is DatasetMapper(cfg, is_train=False).
sampler – a sampler that produces indices to be applied on dataset. Defaults to InferenceSampler, which splits the dataset across all workers. Sampler must be None if dataset is iterable.
batch_size – the batch size of the data loader to be created. Defaults to 1 image per worker since this is the standard when reporting inference time in papers.
num_workers – number of parallel data loading workers
collate_fn – same as the argument of torch.utils.data.DataLoader. By default, no collation is done and a list of data is returned.
- Returns
DataLoader – a torch DataLoader, that loads the given detection dataset, with test-time transformation and batching.
Examples:
data_loader = build_detection_test_loader(
    DatasetCatalog.get("my_test"),
    mapper=DatasetMapper(...))

# or, instantiate with a CfgNode:
data_loader = build_detection_test_loader(cfg, "my_test")
-
detectron2.data.
build_detection_train_loader
(dataset, *, mapper, sampler=None, total_batch_size, aspect_ratio_grouping=True, num_workers=0, collate_fn=None)[source]¶ Build a dataloader for object detection with some default features.
- Parameters
dataset (list or torch.utils.data.Dataset) – a list of dataset dicts, or a pytorch dataset (either map-style or iterable). It can be obtained by using DatasetCatalog.get() or get_detection_dataset_dicts().
mapper (callable) – a callable which takes a sample (dict) from dataset and returns the format to be consumed by the model. When using cfg, the default choice is DatasetMapper(cfg, is_train=True).
sampler (torch.utils.data.sampler.Sampler or None) – a sampler that produces indices to be applied on dataset. If dataset is map-style, the default sampler is a TrainingSampler, which coordinates an infinite random shuffle sequence across all workers. Sampler must be None if dataset is iterable.
total_batch_size (int) – total batch size across all workers.
aspect_ratio_grouping (bool) – whether to group images with similar aspect ratio for efficiency. When enabled, it requires each element in dataset be a dict with keys “width” and “height”.
num_workers (int) – number of parallel data loading workers
collate_fn – a function that determines how to do batching, same as the argument of torch.utils.data.DataLoader. Defaults to do no collation and return a list of data. No collation is OK for small batch size and simple data structures. If your batch size is large and each sample contains too many small tensors, it’s more efficient to collate them in data loader.
- Returns
torch.utils.data.DataLoader – a dataloader. Each output from it is a list[mapped_element] of length total_batch_size / num_workers, where mapped_element is produced by the mapper.
-
detectron2.data.
get_detection_dataset_dicts
(names, filter_empty=True, min_keypoints=0, proposal_files=None, check_consistency=True)[source]¶ Load and prepare dataset dicts for instance detection/segmentation and semantic segmentation.
- Parameters
names (str or list[str]) – a dataset name or a list of dataset names
filter_empty (bool) – whether to filter out images without instance annotations
min_keypoints (int) – filter out images with fewer keypoints than min_keypoints. Set to 0 to do nothing.
proposal_files (list[str]) – if given, a list of object proposal files that match each dataset in names.
check_consistency (bool) – whether to check if datasets have consistent metadata.
- Returns
list[dict] – a list of dicts following the standard dataset dict format.
-
detectron2.data.
load_proposals_into_dataset
(dataset_dicts, proposal_file)[source]¶ Load precomputed object proposals into the dataset.
The proposal file should be a pickled dict with the following keys:
“ids”: list[int] or list[str], the image ids
“boxes”: list[np.ndarray], each is an Nx4 array of boxes corresponding to the image id
“objectness_logits”: list[np.ndarray], each is an N sized array of objectness scores corresponding to the boxes.
“bbox_mode”: the BoxMode of the boxes array. Defaults to BoxMode.XYXY_ABS.
-
class
detectron2.data.
Metadata
[source]¶ Bases:
types.SimpleNamespace
A class that supports simple attribute setter/getter. It is intended for storing the metadata of a dataset and making it accessible globally.
Examples:
# somewhere when you load the data:
MetadataCatalog.get("mydataset").thing_classes = ["person", "dog"]

# somewhere when you print statistics or visualize:
classes = MetadataCatalog.get("mydataset").thing_classes
-
class
detectron2.data.
DatasetFromList
(*args, **kwds)[source]¶ Bases:
torch.utils.data.Dataset
Wrap a list to a torch Dataset. It produces elements of the list as data.
-
__init__
(lst: list, copy: bool = True, serialize: bool = True)[source]¶ - Parameters
lst (list) – a list which contains elements to produce.
copy (bool) – whether to deepcopy the element when producing it, so that the result can be modified in place without affecting the source in the list.
serialize (bool) – whether to hold memory using serialized objects. When enabled, data loader workers can use shared RAM from the master process instead of making a copy.
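The effect of the copy flag can be illustrated with a small stand-in class. This is a sketch of the described behaviour only: `ListDataset` and `copy_items` are hypothetical names, and serialization is left out.

```python
import copy
from typing import Any, List


class ListDataset:
    """Hypothetical minimal analogue of a list-backed dataset."""

    def __init__(self, lst: List[Any], copy_items: bool = True) -> None:
        self._lst = lst
        self._copy = copy_items

    def __len__(self) -> int:
        return len(self._lst)

    def __getitem__(self, idx: int) -> Any:
        item = self._lst[idx]
        # With copying enabled the caller gets an independent object.
        return copy.deepcopy(item) if self._copy else item


source = [{"image_id": 0, "annotations": []}]
safe = ListDataset(source, copy_items=True)
item = safe[0]
item["annotations"].append("edited")  # the source list is left untouched
```

With copying disabled, the same in-place edit would show up in `source`, which is why a mapper that mutates its input usually wants the copy.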
-
-
class
detectron2.data.
MapDataset
(dataset, map_func)[source]¶ Bases:
torch.utils.data.Dataset
Map a function over the elements in a dataset.
-
__init__
(dataset, map_func)[source]¶ - Parameters
dataset – a dataset where map function is applied. Can be either map-style or iterable dataset. When given an iterable dataset, the returned object will also be an iterable dataset.
map_func – a callable which maps the element in dataset. map_func can return None to skip the data (e.g. in case of errors). How None is handled depends on the style of dataset. If dataset is map-style, it randomly tries other elements. If dataset is iterable, it skips the data and tries the next.
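The two ways of handling a None result can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: `get_mapped`, `iter_mapped`, `keep_even`, and the `max_tries` limit are hypothetical.

```python
import random
from typing import Any, Callable, Iterable, Iterator, Optional, Sequence


def get_mapped(
    data: Sequence[Any],
    idx: int,
    map_func: Callable[[Any], Optional[Any]],
    rng: random.Random,
    max_tries: int = 100,
) -> Any:
    """Map-style: when map_func returns None, retry at a random index."""
    cur = idx
    for _ in range(max_tries):
        out = map_func(data[cur])
        if out is not None:
            return out
        cur = rng.randrange(len(data))
    raise RuntimeError("map_func returned None too many times")


def iter_mapped(
    data: Iterable[Any], map_func: Callable[[Any], Optional[Any]]
) -> Iterator[Any]:
    """Iterable-style: when map_func returns None, skip to the next item."""
    for x in data:
        out = map_func(x)
        if out is not None:
            yield out


def keep_even(x: int) -> Optional[int]:
    # Pretend that odd inputs are broken samples.
    return x * 10 if x % 2 == 0 else None


streamed = list(iter_mapped(range(6), keep_even))
fallback = get_mapped(list(range(6)), 1, keep_even, random.Random(0))
```

A map-style dataset has to return something for every requested index, so it substitutes another element; an iterable dataset can simply yield fewer items.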
-
-
class
detectron2.data.
ToIterableDataset
(*args, **kwds)[source]¶ Bases:
torch.utils.data.IterableDataset
Convert an old indices-based (also called map-style) dataset to an iterable-style dataset.
-
__init__
(dataset: torch.utils.data.Dataset, sampler: torch.utils.data.Sampler, shard_sampler: bool = True)[source]¶ - Parameters
dataset – an old-style dataset with __getitem__
sampler – a cheap iterable that produces indices to be applied on dataset.
shard_sampler – whether to shard the sampler based on the current pytorch data loader worker id. When an IterableDataset is forked by pytorch’s DataLoader into multiple workers, it is responsible for sharding its data based on worker id so that workers don’t produce identical data.
Most samplers (like our TrainingSampler) do not shard based on dataloader worker id and this argument should be set to True. But certain samplers may be already sharded, in that case this argument should be set to False.
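Sharding by worker id can be sketched with `itertools.islice`. The helper name `shard` is hypothetical, and this shows only one plausible way to split a sampler's output so that no two workers produce the same index.

```python
import itertools
from typing import Iterable, Iterator


def shard(indices: Iterable[int], worker_id: int, num_workers: int) -> Iterator[int]:
    # Worker k keeps items k, k + num_workers, k + 2 * num_workers, ...
    return itertools.islice(indices, worker_id, None, num_workers)


sampler_output = list(range(10))
per_worker = [list(shard(sampler_output, w, 3)) for w in range(3)]
```

Without this step every forked worker would iterate the same sampler and emit identical data.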
-
-
class
detectron2.data.
DatasetMapper
(*args, **kwargs)[source]¶ Bases:
object
A callable which takes a dataset dict in Detectron2 Dataset format and maps it into a format used by the model.
This is the default callable to be used to map your dataset dict into training data. You may need to follow it to implement your own one for customized logic, such as a different way to read or transform images. See Dataloader for details.
The callable currently does the following:
Reads the image from “file_name”
Applies cropping/geometric transforms to the image and annotations
Converts the data and annotations to Tensor and Instances
-
__init__
(is_train: bool, *, augmentations: List[Union[detectron2.data.transforms.Augmentation, detectron2.data.transforms.Transform]], image_format: str, use_instance_mask: bool = False, use_keypoint: bool = False, instance_mask_format: str = 'polygon', keypoint_hflip_indices: Optional[numpy.ndarray] = None, precomputed_proposal_topk: Optional[int] = None, recompute_boxes: bool = False)[source]¶ NOTE: this interface is experimental.
- Parameters
is_train – whether it’s used in training or inference
augmentations – a list of augmentations or deterministic transforms to apply
image_format – an image format supported by detection_utils.read_image().
use_instance_mask – whether to process instance segmentation annotations, if available
use_keypoint – whether to process keypoint annotations, if available
instance_mask_format – one of “polygon” or “bitmask”. Process instance segmentation masks into this format.
keypoint_hflip_indices – see detection_utils.create_keypoint_hflip_indices()
precomputed_proposal_topk – if given, will load pre-computed proposals from dataset_dict and keep the top k proposals for each image.
recompute_boxes – whether to overwrite bounding box annotations by computing tight bounding boxes from instance mask annotations.
detectron2.data.detection_utils module¶
Common data processing utilities that are used in a typical object detection data pipeline.
-
exception
detectron2.data.detection_utils.
SizeMismatchError
[source]¶ Bases:
ValueError
Raised when the loaded image has a different width/height from the annotation.
-
detectron2.data.detection_utils.
convert_image_to_rgb
(image, format)[source]¶ Convert an image from given format to RGB.
- Parameters
image (np.ndarray or Tensor) – an HWC image
format (str) – the format of input image, also see read_image
- Returns
(np.ndarray) – (H,W,3) RGB image in 0-255 range, can be either float or uint8
-
detectron2.data.detection_utils.
check_image_size
(dataset_dict, image)[source]¶ Raise an error if the image does not match the size specified in the dict.
-
detectron2.data.detection_utils.
transform_proposals
(dataset_dict, image_shape, transforms, *, proposal_topk, min_box_size=0)[source]¶ Apply transformations to the proposals in dataset_dict, if any.
- Parameters
dataset_dict (dict) – a dict read from the dataset, possibly contains fields “proposal_boxes”, “proposal_objectness_logits”, “proposal_bbox_mode”
image_shape (tuple) – height, width
transforms (TransformList) –
proposal_topk (int) – only keep top-K scoring proposals
min_box_size (int) – proposals with either side smaller than this threshold are removed
The input dict is modified in-place, with the above-mentioned keys removed. A new key “proposals” will be added. Its value is an Instances object which contains the transformed proposals in its fields “proposal_boxes” and “objectness_logits”.
-
detectron2.data.detection_utils.
transform_instance_annotations
(annotation, transforms, image_size, *, keypoint_hflip_indices=None)[source]¶ Apply transforms to box, segmentation and keypoints annotations of a single instance.
It will use transforms.apply_box for the box, and transforms.apply_coords for segmentation polygons & keypoints. If you need anything more specially designed for each data structure, you’ll need to implement your own version of this function or the transforms.
- Parameters
- Returns
dict – the same input dict with fields “bbox”, “segmentation”, “keypoints” transformed according to transforms. The “bbox_mode” field will be set to XYXY_ABS.
-
detectron2.data.detection_utils.
annotations_to_instances
(annos, image_size, mask_format='polygon')[source]¶ Create an Instances object used by the models, from instance annotations in the dataset dict.
- Parameters
- Returns
Instances – It will contain fields “gt_boxes”, “gt_classes”, “gt_masks”, “gt_keypoints”, if they can be obtained from annos. This is the format that builtin models expect.
-
detectron2.data.detection_utils.
annotations_to_instances_rotated
(annos, image_size)[source]¶ Create an Instances object used by the models, from instance annotations in the dataset dict. Compared to annotations_to_instances, this function is for rotated boxes only.
-
detectron2.data.detection_utils.
build_augmentation
(cfg, is_train)[source]¶ Create a list of default Augmentation from config. Now it includes resizing and flipping.
- Returns
list[Augmentation]
-
detectron2.data.detection_utils.
create_keypoint_hflip_indices
(dataset_names: Union[str, List[str]]) → List[int][source]¶ - Parameters
dataset_names – list of dataset names
- Returns
list[int] – a list of size=#keypoints, storing the horizontally-flipped keypoint indices.
-
detectron2.data.detection_utils.
filter_empty_instances
(instances, by_box=True, by_mask=True, box_threshold=1e-05, return_mask=False)[source]¶ Filter out empty instances in an Instances object.
- Parameters
instances (Instances) –
by_box (bool) – whether to filter out instances with empty boxes
by_mask (bool) – whether to filter out instances with empty masks
box_threshold (float) – minimum width and height to be considered non-empty
return_mask (bool) – whether to return boolean mask of filtered instances
- Returns
Instances – the filtered instances.
tensor[bool], optional – boolean mask of filtered instances
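The box criterion can be sketched on plain tuples. This is an illustration, not the library's code: it assumes boxes are (x0, y0, x1, y1) and that a box counts as non-empty when both its width and height are strictly greater than the threshold; `nonempty_mask` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed (x0, y0, x1, y1) layout


def nonempty_mask(boxes: Sequence[Box], box_threshold: float = 1e-5) -> List[bool]:
    # A box is kept only if both sides exceed the threshold (assumption).
    return [
        (x1 - x0) > box_threshold and (y1 - y0) > box_threshold
        for x0, y0, x1, y1 in boxes
    ]


boxes = [(0.0, 0.0, 10.0, 5.0), (3.0, 3.0, 3.0, 8.0), (1.0, 1.0, 2.0, 2.0)]
mask = nonempty_mask(boxes)
kept = [b for b, m in zip(boxes, mask) if m]
```

The second box has zero width, so it is dropped; the mask corresponds to what return_mask would expose alongside the filtered result.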
detectron2.data.datasets module¶
-
detectron2.data.datasets.
load_coco_json
(json_file, image_root, dataset_name=None, extra_annotation_keys=None)[source]¶ Load a json file with COCO’s instances annotation format. Currently supports instance detection, instance segmentation, and person keypoints annotations.
- Parameters
json_file (str) – full path to the json file in COCO instances annotation format.
image_root (str or path-like) – the directory where the images in this json file exist.
dataset_name (str or None) – the name of the dataset (e.g., coco_2017_train). When provided, this function will also do the following:
Put “thing_classes” into the metadata associated with this dataset.
Map the category ids into a contiguous range (needed by standard dataset format), and add “thing_dataset_id_to_contiguous_id” to the metadata associated with this dataset.
This option should usually be provided, unless users need to load the original json content and apply more processing manually.
extra_annotation_keys (list[str]) – list of per-annotation keys that should also be loaded into the dataset dict (besides “iscrowd”, “bbox”, “keypoints”, “category_id”, “segmentation”). The values for these keys will be returned as-is. For example, the densepose annotations are loaded in this way.
- Returns
list[dict] – a list of dicts in Detectron2 standard dataset dicts format (see Using Custom Datasets) when dataset_name is not None. If dataset_name is None, the returned category_ids may be non-contiguous and may not conform to the Detectron2 standard format.
Notes
This function does not read the image files. The results do not have the “image” field.
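The contiguous id mapping mentioned for dataset_name can be sketched as below. The category ids and annotations are made up, and ordering the original ids by value is an assumption of this sketch.

```python
# Made-up sparse category ids, as they might appear in an annotation file.
category_ids = [1, 2, 4, 7, 90]

# Map each original id to its position in sorted order: 0, 1, 2, ...
id_map = {cid: i for i, cid in enumerate(sorted(category_ids))}

annos = [{"category_id": 7}, {"category_id": 1}]
remapped = [{**a, "category_id": id_map[a["category_id"]]} for a in annos]
```

Keeping `id_map` around (as the “thing_dataset_id_to_contiguous_id” metadata does) allows predictions to be translated back to the original ids later.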
-
detectron2.data.datasets.
load_sem_seg
(gt_root, image_root, gt_ext='png', image_ext='jpg')[source]¶ Load semantic segmentation datasets. All files under “gt_root” with “gt_ext” extension are treated as ground truth annotations and all files under “image_root” with “image_ext” extension as input images. Ground truth and input images are matched using file paths relative to “gt_root” and “image_root” respectively without taking into account file extensions. This works for COCO as well as some other datasets.
- Parameters
gt_root (str) – full path to ground truth semantic segmentation files. Semantic segmentation annotations are stored as images with integer values in pixels that represent corresponding semantic labels.
image_root (str) – the directory where the input images are.
gt_ext (str) – file extension for ground truth annotations.
image_ext (str) – file extension for input images.
- Returns
list[dict] – a list of dicts in detectron2 standard format without instance-level annotation.
Notes
This function does not read the image and ground truth files. The results do not have the “image” and “sem_seg” fields.
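Matching by relative path without extension can be sketched like this. It is an illustration only: `pair_by_stem` and the paths are hypothetical, the output key “sem_seg_file_name” is assumed here, and no files are read.

```python
import posixpath
from typing import Dict, List


def pair_by_stem(
    image_files: List[str], gt_files: List[str], image_root: str, gt_root: str
) -> List[Dict[str, str]]:
    def key(path: str, root: str) -> str:
        # Path relative to its root, with the file extension dropped.
        return posixpath.splitext(posixpath.relpath(path, root))[0]

    gt_by_key = {key(p, gt_root): p for p in gt_files}
    pairs = []
    for img in sorted(image_files):
        k = key(img, image_root)
        if k in gt_by_key:  # images without ground truth are left out
            pairs.append({"file_name": img, "sem_seg_file_name": gt_by_key[k]})
    return pairs


pairs = pair_by_stem(
    ["/data/images/val/a.jpg", "/data/images/val/b.jpg"],
    ["/data/gt/val/a.png"],
    image_root="/data/images",
    gt_root="/data/gt",
)
```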
-
detectron2.data.datasets.
register_coco_instances
(name, metadata, json_file, image_root)[source]¶ Register a dataset in COCO’s json annotation format for instance detection, instance segmentation and keypoint detection (i.e., Type 1 and 2 in http://cocodataset.org/#format-data, namely instances*.json and person_keypoints*.json in the dataset).
This is an example of how to register a new dataset. You can do something similar to this function, to register new datasets.
- Parameters
name (str) – the name that identifies a dataset, e.g. “coco_2014_train”.
metadata (dict) – extra metadata associated with this dataset. You can leave it as an empty dict.
json_file (str) – path to the json instance annotation file.
image_root (str or path-like) – directory which contains all the images.
-
detectron2.data.datasets.
convert_to_coco_json
(dataset_name, output_file, allow_cached=True)[source]¶ Converts dataset into COCO format and saves it to a json file. dataset_name must be registered in DatasetCatalog and in detectron2’s standard format.
- Parameters
dataset_name – reference from the config file to the catalogs; must be registered in DatasetCatalog and in detectron2’s standard format
output_file – path of the json file to save to
allow_cached – if json file is already present then skip conversion
-
detectron2.data.datasets.
register_coco_panoptic
(name, metadata, image_root, panoptic_root, panoptic_json, instances_json=None)[source]¶ Register a “standard” version of COCO panoptic segmentation dataset named name. The dictionaries in this registered dataset follow detectron2’s standard format. Hence it’s called “standard”.
- Parameters
name (str) – the name that identifies a dataset, e.g. “coco_2017_train_panoptic”
metadata (dict) – extra metadata associated with this dataset.
image_root (str) – directory which contains all the images
panoptic_root (str) – directory which contains panoptic annotation images in COCO format
panoptic_json (str) – path to the json panoptic annotation file in COCO format
sem_seg_root (none) – not used, to be consistent with register_coco_panoptic_separated.
instances_json (str) – path to the json instance annotation file
-
detectron2.data.datasets.
register_coco_panoptic_separated
(name, metadata, image_root, panoptic_root, panoptic_json, sem_seg_root, instances_json)[source]¶ Register a “separated” version of COCO panoptic segmentation dataset named name. The annotations in this registered dataset will contain both instance annotations and semantic annotations, each with its own contiguous ids. Hence it’s called “separated”.
It follows the setting used by the PanopticFPN paper:
The instance annotations directly come from polygons in the COCO instances annotation task, rather than from the masks in the COCO panoptic annotations.
The two formats have small differences: polygons in the instance annotations may overlap, while the mask annotations are produced by labeling the overlapped polygons with depth ordering.
The semantic annotations are converted from panoptic annotations, where all “things” are assigned a semantic id of 0. All semantic categories will therefore have ids in contiguous range [1, #stuff_categories].
This function will also register a pure semantic segmentation dataset named name + '_stuffonly'.
- Parameters
name (str) – the name that identifies a dataset, e.g. “coco_2017_train_panoptic”
metadata (dict) – extra metadata associated with this dataset.
image_root (str) – directory which contains all the images
panoptic_root (str) – directory which contains panoptic annotation images
panoptic_json (str) – path to the json panoptic annotation file
sem_seg_root (str) – directory which contains all the ground truth segmentation annotations.
instances_json (str) – path to the json instance annotation file
-
detectron2.data.datasets.
load_lvis_json
(json_file, image_root, dataset_name=None, extra_annotation_keys=None)[source]¶ Load a json file in LVIS’s annotation format.
- Parameters
json_file (str) – full path to the LVIS json annotation file.
image_root (str) – the directory where the images in this json file exist.
dataset_name (str) – the name of the dataset (e.g., “lvis_v0.5_train”). If provided, this function will put “thing_classes” into the metadata associated with this dataset.
extra_annotation_keys (list[str]) – list of per-annotation keys that should also be loaded into the dataset dict (besides “bbox”, “bbox_mode”, “category_id”, “segmentation”). The values for these keys will be returned as-is.
- Returns
list[dict] – a list of dicts in Detectron2 standard format. (See Using Custom Datasets )
Notes
This function does not read the image files. The results do not have the “image” field.
-
detectron2.data.datasets.
register_lvis_instances
(name, metadata, json_file, image_root)[source]¶ Register a dataset in LVIS’s json annotation format for instance detection and segmentation.
- Parameters
-
detectron2.data.datasets.
get_lvis_instances_meta
(dataset_name)[source]¶ Load LVIS metadata.
- Parameters
dataset_name (str) – LVIS dataset name without the split name (e.g., “lvis_v0.5”).
- Returns
dict – LVIS metadata with keys: thing_classes
-
detectron2.data.datasets.
load_voc_instances
(dirname: str, split: str, class_names: Union[List[str], Tuple[str, …]])[source]¶ Load Pascal VOC detection annotations to Detectron2 format.
- Parameters
dirname – Contain “Annotations”, “ImageSets”, “JPEGImages”
split (str) – one of “train”, “test”, “val”, “trainval”
class_names – list or tuple of class names
detectron2.data.samplers module¶
-
class
detectron2.data.samplers.
TrainingSampler
(*args, **kwds)[source]¶ Bases:
torch.utils.data.Sampler
In training, we only care about the “infinite stream” of training data. So this sampler produces an infinite stream of indices and all workers cooperate to correctly shuffle the indices and sample different indices.
The sampler in each worker effectively produces indices[worker_id::num_workers] where indices is an infinite stream of indices consisting of shuffle(range(size)) + shuffle(range(size)) + … (if shuffle is True) or range(size) + range(size) + … (if shuffle is False)
Note that this sampler does not shard based on pytorch DataLoader worker id. A sampler passed to pytorch DataLoader is used only with map-style dataset and will not be executed inside workers. But if this sampler is used in a way that it gets executed inside a dataloader worker, then extra work needs to be done to shard its outputs based on worker id. This is required so that workers don’t produce identical data. ToIterableDataset implements this logic. This note is true for all samplers in detectron2.
-
__init__
(size: int, shuffle: bool = True, seed: Optional[int] = None)[source]¶ - Parameters
size (int) – the total number of data of the underlying dataset to sample from
shuffle (bool) – whether to shuffle the indices or not
seed (int) – the initial seed of the shuffle. Must be the same across all workers. If None, will use a random seed shared among workers (require synchronization among all workers).
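The indices[worker_id::num_workers] pattern over an infinite stream can be sketched with generators. This is a plain-Python illustration, not the library's implementation: `infinite_indices` and `worker_stream` are hypothetical names, and Python's `random` module stands in for whatever generator the sampler really uses.

```python
import itertools
import random
from typing import Iterator


def infinite_indices(size: int, shuffle: bool = True, seed: int = 0) -> Iterator[int]:
    # Endless concatenation of (optionally shuffled) passes over range(size).
    rng = random.Random(seed)
    while True:
        order = list(range(size))
        if shuffle:
            rng.shuffle(order)
        yield from order


def worker_stream(
    size: int, worker_id: int, num_workers: int, shuffle: bool = True, seed: int = 0
) -> Iterator[int]:
    # Every worker builds the same stream (same seed) and keeps its own stride.
    stream = infinite_indices(size, shuffle, seed)
    return itertools.islice(stream, worker_id, None, num_workers)


first = [
    list(itertools.islice(worker_stream(4, w, 2, shuffle=False), 4))
    for w in range(2)
]
```

Because each worker regenerates the identical stream, the seed has to match across workers; otherwise the strided slices would no longer partition one shared shuffle.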
-
-
class
detectron2.data.samplers.
RandomSubsetTrainingSampler
(*args, **kwds)[source]¶ Bases:
detectron2.data.samplers.distributed_sampler.TrainingSampler
Similar to TrainingSampler, but only samples a random subset of indices. This is useful when you want to estimate the accuracy vs data-number curves by training the model with different subset_ratio.
-
__init__
(size: int, subset_ratio: float, shuffle: bool = True, seed_shuffle: Optional[int] = None, seed_subset: Optional[int] = None)[source]¶ - Parameters
size (int) – the total number of data of the underlying dataset to sample from
subset_ratio (float) – the ratio of subset data to sample from the underlying dataset
shuffle (bool) – whether to shuffle the indices or not
seed_shuffle (int) – the initial seed of the shuffle. Must be the same across all workers. If None, will use a random seed shared among workers (require synchronization among all workers).
seed_subset (int) – the seed to randomize the subset to be sampled. Must be the same across all workers. If None, will use a random seed shared among workers (require synchronization among all workers).
-
-
class
detectron2.data.samplers.
InferenceSampler
(*args, **kwds)[source]¶ Bases:
torch.utils.data.Sampler
Produce indices for inference across all workers. Inference needs to run on the exact set of samples, therefore when the total number of samples is not divisible by the number of workers, this sampler produces a different number of samples on different workers.
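One way to split an exact set of samples across workers is to hand out contiguous ranges whose sizes differ by at most one. The sketch below assumes that scheme; `shard_range` is a hypothetical helper and may not match the library's exact partitioning.

```python
def shard_range(total: int, world_size: int, rank: int) -> range:
    # The first (total % world_size) workers each take one extra sample.
    base, extra = divmod(total, world_size)
    sizes = [base + (1 if r < extra else 0) for r in range(world_size)]
    begin = sum(sizes[:rank])
    return range(begin, begin + sizes[rank])


shards = [list(shard_range(10, 4, r)) for r in range(4)]
```

Every index appears exactly once across the shards, so no sample is dropped or duplicated when the results are gathered.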
-
class
detectron2.data.samplers.
RepeatFactorTrainingSampler
(*args, **kwds)[source]¶ Bases:
torch.utils.data.Sampler
Similar to TrainingSampler, but a sample may appear more times than others based on its “repeat factor”. This is suitable for training on class imbalanced datasets like LVIS.
-
__init__
(repeat_factors, *, shuffle=True, seed=None)[source]¶ - Parameters
repeat_factors (Tensor) – a float vector, the repeat factor for each index. When it’s full of ones, it is equivalent to TrainingSampler(len(repeat_factors), ...).
shuffle (bool) – whether to shuffle the indices or not
seed (int) – the initial seed of the shuffle. Must be the same across all workers. If None, will use a random seed shared among workers (require synchronization among all workers).
-
static
repeat_factors_from_category_frequency
(dataset_dicts, repeat_thresh)[source]¶ Compute (fractional) per-image repeat factors based on category frequency. The repeat factor for an image is a function of the frequency of the rarest category labeled in that image. The “frequency of category c” in [0, 1] is defined as the fraction of images in the training set (without repeats) in which category c appears. See LVIS: A Dataset for Large Vocabulary Instance Segmentation (>= v2) Appendix B.2.
- Parameters
- Returns
torch.Tensor – the i-th element is the repeat factor for the dataset image at index i.
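The computation can be sketched in plain Python. The formula used here comes from the LVIS paper rather than from this page: a category with frequency f gets factor max(1, sqrt(repeat_thresh / f)), and an image takes the largest factor among its categories. The function name `repeat_factors`, the sample data, and the list return type (instead of a tensor) are assumptions of this sketch.

```python
import math
from collections import Counter
from typing import Dict, List


def repeat_factors(dataset_dicts: List[dict], repeat_thresh: float) -> List[float]:
    num_images = len(dataset_dicts)

    # Count, for each category, the number of images it appears in.
    image_count: Counter = Counter()
    for d in dataset_dicts:
        for cat in {a["category_id"] for a in d["annotations"]}:
            image_count[cat] += 1

    # Category-level factor: rare categories get a factor above 1.
    cat_rep: Dict[int, float] = {
        c: max(1.0, math.sqrt(repeat_thresh / (n / num_images)))
        for c, n in image_count.items()
    }

    # Image-level factor: driven by the rarest category in the image.
    return [
        max((cat_rep[a["category_id"]] for a in d["annotations"]), default=1.0)
        for d in dataset_dicts
    ]


data = [
    {"annotations": [{"category_id": 0}]},
    {"annotations": [{"category_id": 0}]},
    {"annotations": [{"category_id": 0}]},
    {"annotations": [{"category_id": 0}, {"category_id": 1}]},
]
factors = repeat_factors(data, repeat_thresh=1.0)
```

Category 1 appears in a quarter of the images, so with a threshold of 1.0 the only image containing it is repeated twice as often as the others.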
-