25 Apr

Microsoft announced today that Phi-3, the third generation of its Phi family of Small Language Models (SLMs), comprises "the most capable and cost-effective small language models (SLMs) available," outperforming some larger and similarly sized models.

A Small Language Model (SLM) is an AI model designed to be highly efficient at specific language-related tasks. Unlike Large Language Models (LLMs), which are well suited to a broad range of general tasks, SLMs are trained on smaller datasets, making them more efficient and economical for particular use cases.

Microsoft explained that Phi-3 comes in several versions, the smallest of which is Phi-3 Mini, a 3.8-billion-parameter model trained on 3.3 trillion tokens (by comparison, Llama-3's training corpus contains roughly 15 trillion tokens). Despite its very small size, Phi-3 Mini can handle 128K tokens of context. This puts it on par with GPT-4 and gives it an advantage over Mistral Large and Llama-3 in terms of token capacity.

In other words, large models such as Mistral Large and Llama-3 (as deployed on Meta.ai) may lose track of a protracted conversation or prompt before this lightweight model starts to falter.

One of Phi-3 Mini's biggest advantages is that it can be installed and run on an ordinary smartphone. Microsoft tested the idea on an iPhone 14 and found that the model ran smoothly, generating 14 tokens per second. Because it needs only 1.8GB of VRAM to run, Phi-3 Mini is a lightweight and efficient alternative for users with more focused needs.
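The 1.8GB figure is consistent with storing the model's 3.8 billion parameters at 4 bits each; the bit-width is an assumption here (Microsoft's on-device tests used a 4-bit quantized model), but the arithmetic is a useful sanity check:

```python
# Back-of-the-envelope memory estimate for Phi-3 Mini's weights.
# Assumption: weights quantized to 4 bits per parameter.
params = 3.8e9            # 3.8 billion parameters
bits_per_param = 4        # assumed 4-bit quantization
total_bytes = params * bits_per_param / 8
gib = total_bytes / 2**30 # convert bytes to GiB
print(f"{gib:.1f} GiB")   # ≈ 1.8 GiB, matching the reported footprint
```

At full 16-bit precision the same model would need roughly four times as much memory (about 7.1 GiB), which is why quantization is what makes on-phone inference practical.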

While Phi-3 Mini is a useful alternative for users with specific needs, it may be less suitable for advanced developers or those with broad requirements. For example, startups that need a chatbot, or individuals using LLMs for data analysis, can use Phi-3 Mini for tasks such as organizing data, extracting information, mathematical reasoning, and building agents. Given internet connectivity, the model can become considerably more capable, compensating for its smaller knowledge base with real-time data.

Phi-3 Mini scores highly on benchmarks because Microsoft concentrates on selecting the most relevant data for its training set. In fact, the broader Phi family performs poorly on tasks requiring factual knowledge, but its strong reasoning abilities set it apart from its main rivals. The 14-billion-parameter Phi-3 Medium model routinely outperforms sophisticated LLMs such as GPT-3.5, which powers ChatGPT's free tier, while the Mini version beats complex models such as Mixtral-8x7B on most synthetic benchmarks.

It is important to note, however, that unlike Phi-2, Phi-3 is not fully open source. Rather, it is an "open model": accessible and usable, but under a license less permissive than Phi-2's, which allows more extensive use and commercial applications.

Microsoft said it will introduce additional Phi-3 family models in the coming weeks, including the aforementioned Phi-3 Medium and Phi-3 Small (7 billion parameters).

April 2024, Cryptoniteuae
