Datasets
A dataset is a collection of samples (prompt, response) pairs that is used to train a model to perform a specific task. Within a dataset you can have any number of versions (each with their own model).
You should make new datasets for the following reasons:
The problem the model should solve is different from any other dataset.
The model will be evaluated differently.
Datasets are designed to help you iterate on solving a single problem with a set of ever improving models .You should not create datasets that mix different problems, even if the they are very similar types of problems.
e.g a tweet classifier and a report classifier are both classifiers but should be in separate datasets.
Last updated