Datasets#
build_pipeline | Build an albumentations pipeline for image augmentation.
build_dataset | Build a dataset.
ImageFolder | Dataset in the ImageFolder style.
image_folder_collate_fn | Collate function for ImageFolder.
Dataset Classes#
ImageFolder#
- class saliency_metrics.datasets.ImageFolder(img_root, pipeline, smap_root=None, smap_extension='.png', cls_to_ind_file=None)[source]#
Dataset in the ImageFolder style.
Compared to torchvision.datasets.ImageFolder, this class can load an image and its corresponding saliency map (abbreviated as “smap”) simultaneously. It is assumed that the dataset folder has the following hierarchy:

    # images
    root/images/dog/dog_0.jpg
    root/images/dog/dog_1.jpg
    ...
    root/images/cat/cat_0.jpg
    root/images/cat/cat_1.jpg
    ...

    # saliency maps
    root/smaps/dog/dog_0.png
    root/smaps/dog/dog_1.png
    ...
    root/smaps/cat/cat_0.png
    root/smaps/cat/cat_1.png
    ...
Note
An image and its corresponding saliency map must have the same spatial size. Please pre-process the images and saliency maps in advance.
The file names (without extensions) of an image and its corresponding saliency map must be consistent, e.g. "dog_0.jpg" and "dog_0.png".
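The naming convention above can be illustrated with a small standard-library sketch. The helper `expected_smap_path` is hypothetical and not part of saliency_metrics. It only shows how an image path might be mapped to the saliency map path that the folder hierarchy implies, assuming both trees share the same class sub-folders.

```python
from pathlib import Path
from typing import Optional


def expected_smap_path(img_path: str, img_root: str, smap_root: str,
                       smap_extension: Optional[str] = ".png") -> Path:
    """Hypothetical helper: map an image path to its expected saliency map path."""
    img = Path(img_path)
    # Path of the image relative to the image root, e.g. "dog/dog_0.jpg".
    relative = img.relative_to(img_root)
    # Reuse the image extension when no saliency map extension is given.
    extension = img.suffix if smap_extension is None else smap_extension
    return Path(smap_root) / relative.with_suffix(extension)


smap_path = expected_smap_path("root/images/dog/dog_0.jpg", "root/images", "root/smaps")
print(smap_path.as_posix())  # root/smaps/dog/dog_0.png
```

Running a check like this over the image folder before building the dataset is one way to spot missing or misnamed saliency maps early.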
Each sample is a dict containing the following fields:
- "img": (Union[torch.Tensor, numpy.ndarray]) Transformed image. The image is converted to a torch.Tensor with shape (num_channels, height, width) if ToTensorV2 (or ToTensor) is in the transform pipeline. Otherwise, it is a numpy.ndarray with shape (height, width, num_channels).
- "smap": (numpy.ndarray) Saliency map with shape (height, width). This field exists only when smap_root is not None.
- "target": (int) Ground truth label.
- "meta": (dict) A dictionary containing meta information such as the image path (key "img_path") and the original size (key "ori_size") of the image.
- Parameters
  - img_root (str) – Root of the image folders.
  - smap_root (Optional[str]) – Root of the saliency map folders. If None, no saliency maps will be loaded.
  - smap_extension (Optional[str]) – File extension of the saliency maps. This argument only takes effect when smap_root is not None. If smap_extension is None, the extension of the images is used, which assumes that all images share the same extension.
  - cls_to_ind_file (Optional[str]) – Path of a file (json, yaml etc.) that can be de-serialized to a dictionary mapping class names to indices. If None, the class names (folder names under img_root) are sorted and mapped to consecutive indices in that order. For example, ["a", "b"] will be mapped to [0, 1], respectively.
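The default behaviour described for cls_to_ind_file can be sketched in a few lines of plain Python. The function `default_cls_to_ind` is a hypothetical illustration, not the library's implementation. It assumes the class names are the folder names found under img_root.

```python
import json
from typing import Dict, List


def default_cls_to_ind(class_names: List[str]) -> Dict[str, int]:
    """Hypothetical helper mirroring the documented default: sort names, then enumerate."""
    return {name: index for index, name in enumerate(sorted(class_names))}


cls_to_ind = default_cls_to_ind(["dog", "cat"])
print(cls_to_ind)  # {'cat': 0, 'dog': 1}

# A JSON document with this content could serve as a cls_to_ind_file.
print(json.dumps(cls_to_ind))  # {"cat": 0, "dog": 1}
```

Passing an explicit mapping file is useful when the label indices must match those of a pretrained classifier rather than the alphabetical order of the folders.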
Examples
    # ImageFolder is imported as well because the final assertion refers to it.
    from saliency_metrics.datasets import ImageFolder, build_dataset

    # Albumentations transforms, each described by a config dict.
    pipeline = [
        dict(type="Resize", height=5, width=5),
        dict(type="Normalize", mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
        dict(type="ToTensorV2"),
    ]

    cfg = dict(
        type="ImageFolder",
        img_root="path/to/data/images/",
        pipeline=pipeline,
        smap_root="path/to/data/smaps/",
        smap_extension=".png",
        cls_to_ind_file="path/to/data/cls_to_ind_file.json",
    )

    dataset = build_dataset(cfg)
    assert isinstance(dataset, ImageFolder)
Functions#
build_dataset#
build_pipeline#
- saliency_metrics.datasets.build_pipeline(cfg, default_args=None)[source]#
Build an albumentations pipeline for image augmentation.
    import numpy as np
    import albumentations as A
    from saliency_metrics.datasets import build_pipeline

    # Random grayscale image; the upper bound of randint is exclusive.
    img = np.random.randint(0, 256, (250, 250), dtype=np.uint8)
    # pascal_voc boxes are [x_min, y_min, x_max, y_max] and must have non-zero area.
    bboxes = [[10, 10, 100, 100]]
    labels = [0]

    # build single augmentation
    cfg_1 = dict(type="GaussianBlur", blur_limit=(3, 7), p=0.5)
    pipeline_1 = build_pipeline(cfg_1)
    img_1 = pipeline_1(image=img)["image"]

    # build multiple augmentations and perform transformation for bounding boxes
    cfg_2 = [
        dict(type="RandomCrop", height=200, width=200),
        dict(type="Resize", height=224, width=224),
        dict(type="ToTensorV2"),
    ]
    default_args = dict(bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]))
    pipeline_2 = build_pipeline(cfg_2, default_args=default_args)
    output_2 = pipeline_2(image=img, bboxes=bboxes, labels=labels)
    img_2, bboxes_2, labels_2 = output_2["image"], output_2["bboxes"], output_2["labels"]
image_folder_collate_fn#
- saliency_metrics.datasets.image_folder_collate_fn(batch, smap_as_tensor=True)[source]#
Collate function for saliency_metrics.datasets.image_folder.ImageFolder.
The collated batch is a dict that contains:
- "img": (Tensor) images with shape (batch_size, num_channels, height, width).
- "target": (Tensor) targets with shape (batch_size,).
- "smap": (Optional[Union[Tensor, ndarray]]) saliency maps with shape (batch_size, height, width).
- "meta": A dict that contains:
  - "img_path": (List[str]) image paths with the length of batch_size.
  - "ori_size": (List[Tuple[int, int]]) list of original spatial sizes of images.
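The batch layout listed above can be illustrated with a NumPy-only sketch. The function `collate_samples` is a hypothetical stand-in, not the library's image_folder_collate_fn: it uses NumPy arrays where the real function returns tensors, and it ignores the smap_as_tensor option. It only shows how per-sample dicts might be merged into the documented shapes.

```python
from typing import Any, Dict, List

import numpy as np


def collate_samples(batch: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical NumPy stand-in for a collate function over sample dicts."""
    collated: Dict[str, Any] = {
        # Stack per-sample arrays along a new leading batch dimension.
        "img": np.stack([sample["img"] for sample in batch]),
        "target": np.asarray([sample["target"] for sample in batch]),
        # Gather each meta field into a list with one entry per sample.
        "meta": {
            "img_path": [sample["meta"]["img_path"] for sample in batch],
            "ori_size": [sample["meta"]["ori_size"] for sample in batch],
        },
    }
    # Saliency maps are optional, so only collate them when every sample has one.
    if all("smap" in sample for sample in batch):
        collated["smap"] = np.stack([sample["smap"] for sample in batch])
    return collated


# Four dummy samples shaped like the fields described for ImageFolder.
samples = [
    {
        "img": np.zeros((3, 5, 5), dtype=np.float32),
        "smap": np.zeros((5, 5), dtype=np.float32),
        "target": index,
        "meta": {"img_path": f"dog_{index}.jpg", "ori_size": (250, 250)},
    }
    for index in range(4)
]
batch = collate_samples(samples)
print(batch["img"].shape, batch["smap"].shape, batch["target"].shape)  # (4, 3, 5, 5) (4, 5, 5) (4,)
```

Stacking only works when all images in a batch share the same shape, which is one reason to include a resizing transform in the pipeline.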