Datasets

  • build_pipeline – Build an albumentations pipeline for image augmentation.

  • build_dataset – Build a dataset.

  • ImageFolder – Dataset in the ImageFolder style.

  • image_folder_collate_fn – Collate function for saliency_metrics.datasets.image_folder.ImageFolder.

Dataset Classes

ImageFolder

class saliency_metrics.datasets.ImageFolder(img_root, pipeline, smap_root=None, smap_extension='.png', cls_to_ind_file=None)

Dataset in the ImageFolder style.

Unlike torchvision.datasets.ImageFolder, this class can load an image and its corresponding saliency map (abbreviated as “smap”) simultaneously. The dataset folder is assumed to have the following hierarchy:

# images
root/images/dog/dog_0.jpg
root/images/dog/dog_1.jpg
...

root/images/cat/cat_0.jpg
root/images/cat/cat_1.jpg
...

# saliency maps
root/smaps/dog/dog_0.png
root/smaps/dog/dog_1.png
...

root/smaps/cat/cat_0.png
root/smaps/cat/cat_1.png
...

Note

  1. An image and its corresponding saliency map must have the same spatial size. Please pre-process the images and saliency maps in advance.

  2. The file names (without extensions) of an image and its corresponding saliency map must be consistent, e.g. "dog_0.jpg" and "dog_0.png".
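The naming rule in the note can be checked before the dataset is built. The following is a minimal sketch that uses only the standard library; the helper name find_missing_smaps is hypothetical and not part of saliency_metrics. It checks file names only and does not compare spatial sizes.

```python
from pathlib import Path
from typing import List


def find_missing_smaps(img_root: str, smap_root: str, smap_extension: str = ".png") -> List[str]:
    """Return image paths (relative to img_root) that have no matching saliency map.

    Hypothetical helper, not part of saliency_metrics. For every
    <img_root>/<class>/<name>.<ext> it expects
    <smap_root>/<class>/<name><smap_extension> to exist.
    """
    img_dir, smap_dir = Path(img_root), Path(smap_root)
    missing = []
    for img_path in sorted(img_dir.glob("*/*")):
        if not img_path.is_file():
            continue
        rel = img_path.relative_to(img_dir)
        # Same class folder and file stem, but with the saliency map extension.
        expected = (smap_dir / rel).with_suffix(smap_extension)
        if not expected.is_file():
            missing.append(rel.as_posix())
    return missing
```

For a complete dataset, find_missing_smaps("root/images", "root/smaps") returns an empty list.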

Each sample is a dict containing the following fields:

  • "img": (Union[torch.Tensor, numpy.ndarray]) Transformed image. The image is converted to torch.Tensor with shape (num_channels, height, width) if ToTensorV2 (or ToTensor) is in the transform pipeline. Otherwise, it is a numpy.ndarray with shape (height, width, num_channels).

  • "smap": (numpy.ndarray) Saliency map with shape (height, width). This field exists only when smap_root is not None.

  • "target": (int) Ground truth label.

  • "meta": (dict) A dictionary containing meta information like image path (with key "img_path") and original size (with key "ori_size") of the image.

Parameters
  • img_root (str) – Root of the image folders.

  • pipeline (List[Dict]) – Config of transform pipeline.

  • smap_root (Optional[str]) – Root of the saliency map folders. If None, no saliency maps will be loaded.

  • smap_extension (Optional[str]) – File extension of the saliency maps. This argument only has an effect when smap_root is not None. If smap_extension is None, the extension of the images is used, which assumes that all the images share the same extension.

  • cls_to_ind_file (Optional[str]) – Path of a file (json, yaml etc.) that can be de-serialized to a dictionary mapping class names to indices. If None, the class names (folder names under img_root) are sorted and mapped to consecutive indices starting from 0. For example, ["a", "b"] will be mapped to [0, 1], respectively.
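The default class mapping described for cls_to_ind_file can be reproduced and saved with a few lines of standard-library Python. This is a sketch that mirrors the documented behaviour, not the library's implementation; both helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def default_cls_to_ind(img_root: str) -> Dict[str, int]:
    """Map the sorted class folder names under img_root to 0, 1, ...

    Illustrative only; mirrors the documented default, not the library code.
    """
    class_names = sorted(p.name for p in Path(img_root).iterdir() if p.is_dir())
    return {name: ind for ind, name in enumerate(class_names)}


def save_cls_to_ind(mapping: Dict[str, int], file_path: str) -> None:
    """Write the mapping to a JSON file that de-serializes to a dict."""
    with open(file_path, "w") as f:
        json.dump(mapping, f, indent=2)
```

Writing the mapping to a file makes the class indices explicit, so they stay fixed even if class folders are added or renamed later.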

Examples

from saliency_metrics.datasets import ImageFolder, build_dataset

pipeline = [
    dict(type="Resize", height=5, width=5),
    dict(type="Normalize", mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
    dict(type="ToTensorV2"),
]

cfg = dict(
    type="ImageFolder",
    img_root="path/to/data/images/",
    pipeline=pipeline,
    smap_root="path/to/data/smaps/",
    smap_extension=".png",
    cls_to_ind_file="path/to/data/cls_to_ind_file.json",
)

dataset = build_dataset(cfg)
assert isinstance(dataset, ImageFolder)

get_cls_to_ind()

Get the dictionary mapping class names to indices.

Return type

Dict[str, int]

Returns

A dict mapping class names to indices.

get_ind_to_cls()

Get the dictionary mapping indices to class names.

Return type

Dict[int, str]

Returns

A dict mapping indices to class names.
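The two mappings are inverses of each other. A minimal sketch of that relationship, assuming class indices are unique; invert_cls_to_ind is a hypothetical helper and not part of the library.

```python
from typing import Dict


def invert_cls_to_ind(cls_to_ind: Dict[str, int]) -> Dict[int, str]:
    """Invert a class-name-to-index mapping (hypothetical helper)."""
    ind_to_cls = {ind: name for name, ind in cls_to_ind.items()}
    # Two class names sharing one index would silently overwrite each other.
    if len(ind_to_cls) != len(cls_to_ind):
        raise ValueError("class indices must be unique")
    return ind_to_cls


cls_to_ind = {"cat": 0, "dog": 1}
ind_to_cls = invert_cls_to_ind(cls_to_ind)
print(ind_to_cls[1])  # dog
```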

Functions

build_dataset

saliency_metrics.datasets.build_dataset(cfg, default_args=None)

Build a dataset.

Parameters
  • cfg (Dict) – A config dict. It should at least contain the field “type”, which is the registered name of the dataset.

  • default_args (Optional[Dict]) – Other default arguments.

Return type

Dataset

Returns

An instance of torch.utils.data.Dataset.
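This page does not describe how the "type" field is resolved or how default_args is merged. The sketch below shows one common registry pattern for config-driven builders. Every name in it (DATASETS, register, build_from_cfg, ToyDataset) is hypothetical, and the rule that values in cfg take precedence over default_args is an assumption, not documented behaviour.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical registry: maps a registered name to a class.
DATASETS: Dict[str, Callable[..., Any]] = {}


def register(cls):
    """Register a class under its own name."""
    DATASETS[cls.__name__] = cls
    return cls


def build_from_cfg(cfg: Dict, default_args: Optional[Dict] = None) -> Any:
    """Instantiate the class named by cfg["type"] with the remaining fields."""
    args = dict(cfg)  # copy so the caller's config is not modified
    if "type" not in args:
        raise KeyError('cfg must contain the field "type"')
    # Assumption: values in cfg take precedence over default_args.
    for key, value in (default_args or {}).items():
        args.setdefault(key, value)
    cls = DATASETS[args.pop("type")]
    return cls(**args)


@register
class ToyDataset:
    """Stand-in for a real dataset class."""

    def __init__(self, img_root: str, smap_root: Optional[str] = None):
        self.img_root = img_root
        self.smap_root = smap_root


dataset = build_from_cfg(
    dict(type="ToyDataset", img_root="path/to/data/images/"),
    default_args=dict(smap_root="path/to/data/smaps/"),
)
```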

build_pipeline

saliency_metrics.datasets.build_pipeline(cfg, default_args=None)

Build an albumentations pipeline for image augmentation.

import numpy as np
import albumentations as A
from saliency_metrics.datasets import build_pipeline

img = np.random.randint(0, 256, (250, 250), dtype=np.uint8)  # upper bound is exclusive
# pascal_voc format: [x_min, y_min, x_max, y_max]
bboxes = [[10, 10, 100, 100]]
labels = [0]

# build single augmentation
cfg_1 = dict(type="GaussianBlur", blur_limit=(3, 7), p=0.5)
pipeline_1 = build_pipeline(cfg_1)
img_1 = pipeline_1(image=img)["image"]

# build multiple augmentations and perform transformation for bounding boxes
cfg_2 = [
    dict(type="RandomCrop", height=200, width=200),
    dict(type="Resize", height=224, width=224),
    dict(type="ToTensorV2")
]
default_args = dict(bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]))
pipeline_2 = build_pipeline(cfg_2, default_args=default_args)
output_2 = pipeline_2(image=img, bboxes=bboxes, labels=labels)
img_2, bboxes_2, labels_2 = output_2["image"], output_2["bboxes"], output_2["labels"]

Parameters
  • cfg (Union[Dict, List]) – Config dictionary. If cfg is a dict, the function returns a single albumentations augmentation. If cfg is a list of dicts, the function first builds each augmentation separately and then composes them into an albumentations.Compose.

  • default_args (Optional[Dict]) – Other default arguments.

Return type

Union[object, Compose]

Returns

A single albumentations augmentation or albumentations.Compose.
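The dict-versus-list behaviour of cfg can be illustrated without albumentations. In this sketch, simple numeric functions stand in for augmentations; TRANSFORMS, ToyCompose, build_single and build_toy_pipeline are all hypothetical names, and the code is not the library's implementation.

```python
from typing import Callable, Dict, List, Union


def make_scale(factor: float) -> Callable[[float], float]:
    """Toy transform factory: multiply the input by factor."""
    return lambda x: x * factor


def make_shift(offset: float) -> Callable[[float], float]:
    """Toy transform factory: add offset to the input."""
    return lambda x: x + offset


# Hypothetical lookup table standing in for the augmentation classes.
TRANSFORMS = {"Scale": make_scale, "Shift": make_shift}


class ToyCompose:
    """Apply transforms in order, loosely analogous to a composed pipeline."""

    def __init__(self, transforms: List[Callable[[float], float]]):
        self.transforms = list(transforms)

    def __call__(self, x: float) -> float:
        for transform in self.transforms:
            x = transform(x)
        return x


def build_single(cfg: Dict) -> Callable[[float], float]:
    args = dict(cfg)  # copy so the caller's config is not modified
    return TRANSFORMS[args.pop("type")](**args)


def build_toy_pipeline(cfg: Union[Dict, List[Dict]]):
    # A dict yields a single transform; a list is built item by item and composed.
    if isinstance(cfg, dict):
        return build_single(cfg)
    return ToyCompose([build_single(c) for c in cfg])


single = build_toy_pipeline(dict(type="Scale", factor=2.0))
composed = build_toy_pipeline([dict(type="Scale", factor=2.0), dict(type="Shift", offset=1.0)])
print(single(3.0))    # 6.0
print(composed(3.0))  # 7.0
```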

image_folder_collate_fn

saliency_metrics.datasets.image_folder_collate_fn(batch, smap_as_tensor=True)

Collate function for saliency_metrics.datasets.image_folder.ImageFolder.

The collated batch is a dict that contains:

  • "img": (Tensor) images with shape (batch_size, num_channels, height, width).

  • "target": (Tensor) targets with shape (batch_size,).

  • "smap": (Optional[Union[Tensor, ndarray]]) saliency maps with shape (batch_size, height, width). They are batched to a torch.Tensor if smap_as_tensor is True, otherwise to a numpy.ndarray.

  • "meta": A dict that contains:

    • "img_path": (List[str]) image paths with the length of batch_size.

    • "ori_size": (List[Tuple[int, int]]) list of original spatial sizes of images.

Parameters
  • batch (List[Dict]) – A batch of data with length of batch_size.

  • smap_as_tensor (bool) – If True, batch the saliency maps to a torch.Tensor. Otherwise, batch them to a numpy.ndarray.

Return type

Dict

Returns

Collated batch.
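The structure of the collated batch, in particular how "meta" turns from a list of dicts into a dict of lists, can be sketched in plain Python. Here lists stand in for the stacked torch.Tensor or numpy.ndarray; toy_collate is a hypothetical name, and dropping "smap" when the samples lack it is an assumption about the behaviour, not a documented fact.

```python
from typing import Any, Dict, List


def toy_collate(batch: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Collate per-sample dicts into one batch dict.

    Illustrative only: plain lists stand in for the stacked arrays
    that a real collate function would return.
    """
    collated: Dict[str, Any] = {
        "img": [sample["img"] for sample in batch],
        "target": [sample["target"] for sample in batch],
        # "meta" becomes a dict of lists rather than a list of dicts.
        "meta": {
            "img_path": [sample["meta"]["img_path"] for sample in batch],
            "ori_size": [sample["meta"]["ori_size"] for sample in batch],
        },
    }
    # Assumption: saliency maps are collated only when every sample carries one.
    if all("smap" in sample for sample in batch):
        collated["smap"] = [sample["smap"] for sample in batch]
    return collated


batch = [
    {"img": [[0.0]], "smap": [[1.0]], "target": 0,
     "meta": {"img_path": "dog_0.jpg", "ori_size": (4, 4)}},
    {"img": [[0.5]], "smap": [[0.2]], "target": 1,
     "meta": {"img_path": "cat_0.jpg", "ori_size": (8, 6)}},
]
collated = toy_collate(batch)
```

With the real function, the usual route is to pass it as the collate_fn argument of torch.utils.data.DataLoader; to change smap_as_tensor, it can be wrapped with functools.partial first.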