Models

Models are fine-tuned LLMs, each trained on a specific version of a dataset. Each dataset version has exactly one model, and each model belongs to exactly one version. Currently, all models share the same base model, Mistral 7B, because it is an excellent generalist base model.

How long do models take to train?

Models typically take 20-30 minutes to train, but this depends on how busy our GPUs are, so training can sometimes take much longer.

How large are the model weights?

With Mistral 7B, the weights (stored in a .bin file) are usually around 15 GB.
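
As a rough sanity check on that figure, the size can be estimated from the parameter count. The sketch below assumes the weights are stored at 16-bit precision (2 bytes per parameter), which this page does not state, and uses an approximate parameter count of 7.24 billion for Mistral 7B. The function name is purely illustrative.

```python
# Back-of-the-envelope estimate of checkpoint size.
# Assumption: weights are stored in 16-bit precision (fp16/bf16), 2 bytes each.
PARAM_COUNT = 7_240_000_000  # approximate parameter count of Mistral 7B
BYTES_PER_PARAM = 2          # assumed storage precision, not confirmed by this page


def estimate_weight_size_gb(param_count: int, bytes_per_param: int) -> float:
    """Return the approximate size in gigabytes (1 GB = 10**9 bytes)."""
    return param_count * bytes_per_param / 1e9


size_gb = estimate_weight_size_gb(PARAM_COUNT, BYTES_PER_PARAM)
print(f"Estimated weight size: {size_gb:.1f} GB")
```

Under these assumptions the estimate lands a little under 15 GB, in line with the figure above. The actual file size can differ slightly depending on how the file is serialized.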
