As noted by 9to5Mac, Apple has released a family of open-source large language models called OpenELM, which Apple describes as "a family of Open-source Efficient Language Models."

A large language model (LLM) is a language model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. Apple says that OpenELM offers performance similar to other open language models, but with less training data.

OpenELM consists of small models designed to perform efficiently at text generation tasks. There are eight OpenELM models in total: four pre-trained and four instruction-tuned, ranging from 270 million to 3 billion parameters.

From Apple: To this end, we release OpenELM, a state-of-the-art open language model. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy. For example, with a parameter budget of approximately one billion parameters, OpenELM exhibits a 2.36% improvement in accuracy compared to OLMo while requiring 2× fewer pre-training tokens.

Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations. We also release code to convert models to MLX library for inference and fine-tuning on Apple devices. This comprehensive release aims to empower and strengthen the open research community, paving the way for future open research endeavors.
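The layer-wise scaling Apple mentions can be illustrated with a minimal sketch: instead of giving every transformer layer the same width, the per-layer budget (attention heads and feed-forward multiplier) grows from the first layer to the last. The function and parameter names below are hypothetical illustrations of the general idea, not Apple's implementation:

```python
def layer_wise_scaling(num_layers, min_heads, max_heads,
                       min_ffn_mult, max_ffn_mult):
    """Sketch of layer-wise parameter allocation.

    Each layer i gets a budget interpolated linearly between a minimum
    (first layer) and maximum (last layer), rather than a uniform width.
    """
    configs = []
    for i in range(num_layers):
        # Interpolation factor: 0.0 at the first layer, 1.0 at the last.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        heads = int(round(min_heads + t * (max_heads - min_heads)))
        ffn_mult = min_ffn_mult + t * (max_ffn_mult - min_ffn_mult)
        configs.append({"layer": i, "heads": heads,
                        "ffn_mult": round(ffn_mult, 2)})
    return configs

# Example: a 4-layer model whose head count grows from 4 to 8 and whose
# feed-forward multiplier grows from 2.0 to 4.0 across layers.
for cfg in layer_wise_scaling(4, 4, 8, 2.0, 4.0):
    print(cfg)
```

The intuition is that later layers can use more capacity without spending the parameter budget uniformly, which is how OpenELM claims better accuracy per parameter than uniformly sized models.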

Article provided with permission from AppleWorld.Today