
Configure the Dataset Viewer

The Dataset Viewer supports many data file formats, covering text, tabular, image, and audio data. It also separates the train/validation/test splits based on file and folder names.

To configure the Dataset Viewer for your dataset, first make sure your dataset is in a supported data format.

Configure dropdowns for splits or subsets

In the Dataset Viewer, you can view the train/validation/test splits of a dataset and, when the dataset defines them, choose between multiple subsets (e.g. one per language).

To define those dropdowns, name the data files or their folders after the splits they belong to (train/validation/test). You can also define the splits manually using YAML.
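As a minimal sketch of the manual YAML approach, the block below would go in the YAML section at the top of the dataset's README.md. It assumes a dataset with two subsets, each with a train and a test split. The subset names (en, fr) and the file paths are hypothetical placeholders, not taken from any real repository; replace them with your own.

```yaml
---
configs:
# Each entry under "configs" becomes one choice in the subset dropdown.
- config_name: en            # hypothetical subset name
  data_files:
  # Each split listed here becomes one choice in the split dropdown.
  - split: train
    path: "en/train.csv"     # hypothetical path
  - split: test
    path: "en/test.csv"      # hypothetical path
- config_name: fr            # hypothetical subset name
  data_files:
  - split: train
    path: "fr/train.csv"     # hypothetical path
  - split: test
    path: "fr/test.csv"      # hypothetical path
---
```

For a dataset with a single subset, one configs entry is enough. With this layout the folder names carry the subset and the file names carry the split, so the explicit YAML mainly serves to make the mapping unambiguous.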

For more information, see the documentation on Data files Configuration and the collections of example datasets. The Image Dataset doc page describes several ways to structure a dataset with images.

Disable the viewer

The Dataset Viewer can be disabled. To do this, add a YAML section to the dataset’s README.md file (create the file if it does not already exist) and set the viewer property to false.

---
viewer: false
---

Private datasets

For private datasets, the Dataset Viewer is enabled for PRO users and Enterprise Hub organizations.
