megatron.data.image_folder.make_dataset
- megatron.data.image_folder.make_dataset(directory: str, class_to_idx: Dict[str, int], data_per_class_fraction: float, extensions: Tuple[str, ...] | None = None, is_valid_file: Callable[[str], bool] | None = None) → List[Tuple[str, int]]
Generates a list of samples of the form (path_to_sample, class).
- Parameters:
directory (str) – Root dataset directory.
class_to_idx (Dict[str, int]) – Dictionary mapping class name to class index.
extensions (optional) – A list of allowed extensions. Either extensions or is_valid_file should be passed. Defaults to None.
is_valid_file (optional) – A function that takes the path of a file and checks whether the file is valid (used to check for corrupt files). extensions and is_valid_file should not both be passed. Defaults to None.
- Raises:
ValueError – If extensions and is_valid_file are both None or both not None.
- Returns:
samples of the form (path_to_sample, class)
- Return type:
List[Tuple[str, int]]
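
The contract described above (exactly one of extensions or is_valid_file, and a result made of (path_to_sample, class) pairs) can be illustrated with a minimal standalone sketch. This is not the Megatron implementation: `make_dataset_sketch` is a hypothetical name, the directory layout (one subdirectory per class name) and the case-insensitive extension match are assumptions, and `data_per_class_fraction` is left out because its semantics are not documented here.

```python
import os
from typing import Callable, Dict, List, Optional, Tuple


def make_dataset_sketch(
    directory: str,
    class_to_idx: Dict[str, int],
    extensions: Optional[Tuple[str, ...]] = None,
    is_valid_file: Optional[Callable[[str], bool]] = None,
) -> List[Tuple[str, int]]:
    """Hypothetical illustration of the documented contract, not the library code."""
    # Exactly one of the two filters must be supplied: both None or both
    # set is an error, matching the documented ValueError condition.
    if (extensions is None) == (is_valid_file is None):
        raise ValueError(
            "Pass exactly one of extensions or is_valid_file, not both or neither"
        )

    if extensions is not None:
        # Assumption: extensions are matched case-insensitively as suffixes.
        allowed = tuple(ext.lower() for ext in extensions)

        def check(path: str) -> bool:
            return path.lower().endswith(allowed)
    else:
        check = is_valid_file

    samples: List[Tuple[str, int]] = []
    # Assumption: each class lives in a subdirectory named after the class.
    for class_name in sorted(class_to_idx):
        class_dir = os.path.join(directory, class_name)
        if not os.path.isdir(class_dir):
            continue
        for root, _, fnames in sorted(os.walk(class_dir)):
            for fname in sorted(fnames):
                path = os.path.join(root, fname)
                if check(path):
                    samples.append((path, class_to_idx[class_name]))
    return samples
```

Calling the sketch with `extensions=(".jpg",)` on a directory containing `cat/` and `dog/` subfolders would return the matching image paths paired with the indices from `class_to_idx`; passing both filters, or neither, raises ValueError.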