megatron.data.gpt_dataset.build_train_valid_test_datasets
- megatron.data.gpt_dataset.build_train_valid_test_datasets(data_prefix: str | None, data_impl: str, splits_string: str, train_valid_test_num_samples: List[int], seq_length: int, seed: int, skip_warmup: bool, train_data_prefix=None, valid_data_prefix=None, test_data_prefix=None)
Build the train, validation, and test datasets.
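The signature does not say how `splits_string` is interpreted. As a rough illustration only, the sketch below assumes it is a comma-separated list of relative weights (for example `"8,1,1"`) that divides a corpus of `size` documents into three contiguous index ranges. The helper name `split_boundaries` and the weight format are assumptions made for this example; they are not taken from this page and may not match what the library does.

```python
from typing import List


def split_boundaries(splits_string: str, size: int) -> List[int]:
    """Hypothetical helper: map a weight string to [0, train_end, valid_end, size]."""
    # Assumed format: comma-separated relative weights, e.g. "8,1,1".
    weights = [float(s) for s in splits_string.split(",")]
    # Pad with zeros (or truncate) so there are exactly three weights.
    weights = (weights + [0.0, 0.0, 0.0])[:3]
    total = sum(weights)
    if total <= 0:
        raise ValueError("splits_string must contain at least one positive weight")

    # Accumulate the rounded share of each split into boundary indices.
    bounds = [0]
    for w in weights:
        bounds.append(bounds[-1] + int(round(w / total * size)))

    # Rounding can overshoot or undershoot; shift the boundaries so the last one equals size.
    drift = bounds[-1] - size
    bounds[1:] = [b - drift for b in bounds[1:]]
    return bounds


train_start, train_end, valid_end, test_end = split_boundaries("8,1,1", 100)
```

Under these assumptions, documents `[train_start, train_end)` would go to the train split, `[train_end, valid_end)` to validation, and `[valid_end, test_end)` to test.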