Mistral AI: Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 18:01, 1 March 2024

Mistral AI
Company type: Private
Industry: Artificial intelligence
Founded: 28 April 2023
Founders: Arthur Mensch
Headquarters: Paris, France
Products:
  • Mistral 7B
  • Mixtral 8x7B
  • Mistral Medium
  • Mistral Large
Website: mistral.ai

Mistral AI is a French company selling artificial intelligence (AI) products. It was founded in April 2023 by former employees of Meta Platforms and Google DeepMind.[1] The company raised 385 million euros in December 2023,[2] when it was valued at more than $2 billion.[3][4][5] It is one of the European leaders in the field of AI.

It produces open-source large language models,[6] citing the foundational importance of open-source software and positioning its models as a response to proprietary ones.[7]

As of December 2023, two models have been published and are available as weights.[8] Another prototype, "Mistral Medium", is available via API only.[9]

History

Mistral AI was co-founded in April 2023 by Arthur Mensch, Guillaume Lample and Timothée Lacroix. Before co-founding Mistral AI, Arthur Mensch worked at Google DeepMind, Google's artificial intelligence laboratory, while Guillaume Lample and Timothée Lacroix worked at Meta Platforms.[10]

In June 2023, the start-up completed its first funding round, raising 105 million euros (US$117 million) from investors including the American fund Lightspeed Venture Partners, Eric Schmidt, Xavier Niel and JCDecaux. The Financial Times then estimated its valuation at €240 million, or about US$267 million.

On 27 September 2023, the company made its language processing model “Mistral 7B” available under the free Apache 2.0 license. This model has 7 billion parameters, a small size compared to its competitors.

On 10 December 2023, Mistral AI announced that it had raised 385 million euros (US$428 million) in its second funding round. The round notably involved the Californian fund Andreessen Horowitz, BNP Paribas and the software publisher Salesforce.[11]

On 11 December 2023, the company released the “Mixtral 8x7B” model, which has 46.7 billion parameters but uses only 12.9 billion per token thanks to its mixture of experts architecture. The model handles five languages (French, Spanish, Italian, English and German) and, according to its developers' tests, outperforms the "LLaMA 2 70B" model from Meta. A version trained to follow instructions, called “Mixtral 8x7B Instruct”, is also offered.[12]

On 26 February 2024, Microsoft announced a new partnership with the company to expand its presence in the rapidly evolving artificial intelligence industry. Under the agreement, Mistral's language models will be available on Microsoft's Azure cloud, and the multilingual conversational assistant "Le Chat", similar in style to ChatGPT, will be launched.[13]

Models

Mistral 7B

Mistral 7B is a 7.3B-parameter language model using the transformer architecture. It was officially released on September 27, 2023, via a BitTorrent magnet link[14] and Hugging Face.[15] The model was released under the Apache 2.0 license. The release blog post claimed the model outperforms LLaMA 2 13B on all benchmarks tested, and is on par with LLaMA 34B on many benchmarks tested.[16]

Mistral 7B uses a similar architecture to LLaMA, but with some changes to the attention mechanism. In particular it uses Grouped-query attention (GQA) intended for faster inference and Sliding Window Attention (SWA) intended to handle longer sequences.

Sliding window attention (SWA) reduces the computational cost and memory requirement for longer sequences. In standard attention, memory use grows with the number of tokens, which at inference time reduces cache availability and leads to higher latency and lower throughput. With SWA, each token can only attend to a fixed number of tokens from the previous layer, in a "sliding window" of 4096 tokens, within a total context length of 32768 tokens. Because the attention span is fixed, Mistral 7B can limit the size of its cache by using a rolling buffer cache.
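
The window mask and the rolling buffer can be sketched in plain Python. This is an illustration only, not Mistral's implementation: the names `sliding_window_mask` and `RollingBufferCache` are hypothetical, the choice to count the current token as part of the window is an assumption, and a toy window of 3 stands in for the 4096 tokens mentioned above.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Assumption: the window covers the current token and the window - 1
    tokens before it; future positions are always masked out.
    """
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


class RollingBufferCache:
    """Fixed-size cache: the entry for position i lives in slot i % window."""

    def __init__(self, window: int) -> None:
        self.window = window
        self.slots: list = [None] * window
        self.next_pos = 0  # absolute position of the next token to store

    def append(self, kv) -> None:
        # Once the buffer is full, this overwrites the oldest entry.
        self.slots[self.next_pos % self.window] = kv
        self.next_pos += 1

    def contents(self) -> list:
        """Cached entries ordered from oldest to newest."""
        start = max(0, self.next_pos - self.window)
        return [self.slots[p % self.window] for p in range(start, self.next_pos)]


mask = sliding_window_mask(seq_len=6, window=3)
cache = RollingBufferCache(window=3)
for kv in ["kv0", "kv1", "kv2", "kv3", "kv4"]:
    cache.append(kv)
print(mask[4])           # position 4 sees positions 2, 3 and 4 only
print(cache.contents())  # the three most recent entries
```

The cache never holds more than `window` entries, so its memory use stays constant however long the sequence grows.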

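Mistral 7B uses grouped-query attention (GQA), which is a variant of the standard attention mechanism. Instead of giving every query head its own key and value heads, GQA divides the query heads into groups, and each group shares a single key head and value head, which shrinks the key-value cache and speeds up inference.[17]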
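
A minimal sketch of how grouped-query attention shares key and value heads, written in plain Python for readability rather than speed. The function names, the nested-list tensor layout and the toy sizes (4 query heads sharing 2 key/value heads) are assumptions made for illustration and are not taken from the Mistral code base.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_query_attention(queries, keys, values):
    """Causal attention in which several query heads share one key/value head.

    queries: [n_q_heads][seq_len][d]
    keys, values: [n_kv_heads][seq_len][d], with n_q_heads % n_kv_heads == 0
    """
    n_q_heads, n_kv_heads = len(queries), len(keys)
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("query heads must divide evenly into key/value heads")
    group_size = n_q_heads // n_kv_heads
    outputs = []
    for h, q_head in enumerate(queries):
        kv = h // group_size  # key/value head shared by this query head's group
        k_head, v_head = keys[kv], values[kv]
        d = len(q_head[0])
        head_out = []
        for i, q in enumerate(q_head):
            # Causal: position i only looks at positions 0..i.
            scores = [
                sum(a * b for a, b in zip(q, k_head[j])) / math.sqrt(d)
                for j in range(i + 1)
            ]
            weights = softmax(scores)
            head_out.append(
                [sum(w * v_head[j][c] for j, w in enumerate(weights)) for c in range(d)]
            )
        outputs.append(head_out)
    return outputs


q = [[1.0, 0.0], [0.0, 1.0]]          # one head: 2 positions, d = 2
queries = [q, q, q, q]                # 4 query heads
keys = [q, q]                         # 2 key heads
values = [q, [[2.0, 0.0], [0.0, 2.0]]]  # 2 value heads
out = grouped_query_attention(queries, keys, values)
print(out[0] == out[1], out[2] == out[3])  # heads in the same group agree
```

Only the key and value heads need to be cached during generation, so in this toy setup the cache holds 2 heads instead of 4.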

Both a base model and an "instruct" model were released, with the latter receiving additional tuning to follow chat-style prompts. The fine-tuned model is only intended for demonstration purposes, and does not have guardrails or moderation built in.[16]

Mixtral 8x7B

Much like Mistral's first model, Mixtral 8x7B was released via BitTorrent on December 9, 2023,[6] with a Hugging Face release and a blog post following two days later.[12]

Unlike the previous Mistral model, Mixtral 8x7B uses a sparse mixture of experts architecture. The model has 8 distinct groups of "experts", giving it a total of 46.7B usable parameters.[18][19] Because each token uses only 12.9B of those parameters, the model runs at roughly the speed and cost of a 12.9B-parameter model.[12]
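
The sketch below illustrates sparse expert routing in plain Python: a router scores every expert for a token, only the highest-scoring experts run, and their outputs are blended. It assumes that two of the eight experts are selected per token and that the selected scores are normalised with a softmax. The function names and the toy "experts" (simple scaling functions) are hypothetical, not Mixtral's actual layers.

```python
import math


def top_k_route(router_logits: list[float], k: int = 2) -> list[tuple[int, float]]:
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    ranked = sorted(
        range(len(router_logits)), key=lambda e: router_logits[e], reverse=True
    )
    chosen = ranked[:k]
    peak = max(router_logits[e] for e in chosen)
    exps = [math.exp(router_logits[e] - peak) for e in chosen]
    total = sum(exps)
    return [(e, x / total) for e, x in zip(chosen, exps)]


def moe_layer(token: list[float], router_logits: list[float], experts, k: int = 2):
    """Weighted sum of the outputs of the k selected experts for one token."""
    out = [0.0] * len(token)
    for expert_index, weight in top_k_route(router_logits, k):
        expert_out = experts[expert_index](token)  # unselected experts never run
        out = [o + weight * y for o, y in zip(out, expert_out)]
    return out


# Eight toy experts: expert number s multiplies its input by s (1 through 8).
experts = [lambda x, s=s: [s * v for v in x] for s in range(1, 9)]
logits = [0.0, 0.0, 5.0, 0.0, 0.0, 5.0, 0.0, 0.0]  # router prefers experts 2 and 5
print(top_k_route(logits))
print(moe_layer([1.0, 2.0], logits, experts))
```

All eight experts must be kept in memory, but each token pays the compute cost of only the selected ones, which is why the total and per-token parameter counts differ.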

Mistral AI's testing shows the model beats both LLaMA 2 70B and GPT-3.5 in most benchmarks.[20]

Mistral Medium

Unlike Mistral 7B and Mixtral 8x7B, Mistral Medium is a closed-source prototype available only through the Mistral API.[21] It is trained on various languages, including English, French, Italian, German and Spanish, as well as on code, and scores 8.6 on MT-Bench.[22] It is Mistral's highest-performing large language model, ranking above Claude and below GPT-4 on the LMSys Chatbot Arena Elo leaderboard.[23]

The number of parameters and the architecture of Mistral Medium are not known, as Mistral has not published information about them.

References

  1. "France's unicorn start-up Mistral AI embodies its artificial intelligence hopes". Le Monde.fr. 2023-12-12. Retrieved 2023-12-16.
  2. Metz, Cade (2023-12-10). "Mistral, French A.I. Start-Up, Is Valued at $2 Billion in Funding Round". The New York Times.
  3. Fink, Charlie. "This Week In XR: Epic Triumphs Over Google, Mistral AI Raises $415 Million, $56.5 Million For Essential AI". Forbes. Retrieved 2023-12-16.
  4. "A French AI start-up may have commenced an AI revolution, silently". Hindustan Times. 2023-12-12.
  5. "French AI start-up Mistral secures €2bn valuation". Financial Times.
  6. "Buzzy Startup Just Dumps AI Model That Beats GPT-3.5 Into a Torrent Link". Gizmodo. 2023-12-12. Retrieved 2023-12-16.
  7. "Bringing open AI models to the frontier". Mistral AI. 2023-09-27. Retrieved 2024-01-04.
  8. "Open-weight models and Mistral AI Large Language Models". docs.mistral.ai. Retrieved 2024-01-04.
  9. "Endpoints and Mistral AI Large Language Models". docs.mistral.ai.
  10. "France's unicorn start-up Mistral AI embodies its artificial intelligence hopes". Le Monde.fr. 2023-12-12.
  11. "Mistral lève 385 M€ et devient une licorne française". Le Monde Informatique. 2023-12-11.
  12. "Mixtral of experts". mistral.ai. 2023-12-11. Retrieved 2024-01-04.
  13. Bableshwar (2024-02-26). "Mistral Large, Mistral AI's flagship LLM, debuts on Azure AI Models-as-a-Service". techcommunity.microsoft.com. Retrieved 2024-02-26.
  14. Goldman, Sharon (2023-12-08). "Mistral AI bucks release trend by dropping torrent link to new open source LLM". VentureBeat. Retrieved 2024-01-04.
  15. Coldewey, Devin (2023-09-27). "Mistral AI makes its first large language model free for everyone". TechCrunch. Retrieved 2024-01-04.
  16. "Mistral 7B". mistral.ai. Mistral AI. 2023-09-27. Retrieved 2024-01-04.
  17. Jiang, Albert Q.; Sablayrolles, Alexandre; Mensch, Arthur; Bamford, Chris; Chaplot, Devendra Singh; Casas, Diego de las; Bressand, Florian; Lengyel, Gianna; Lample, Guillaume (2023-10-10). "Mistral 7B". arXiv:2310.06825v1 [cs.CL].
  18. "Mixture of Experts Explained". huggingface.co. Retrieved 2024-01-04.
  19. Marie, Benjamin (2023-12-15). "Mixtral-8x7B: Understanding and Running the Sparse Mixture of Experts". Medium. Retrieved 2024-01-04.
  20. Franzen, Carl (2023-12-11). "Mistral shocks AI community as latest open source model eclipses GPT-3.5 performance". VentureBeat. Retrieved 2024-01-04.
  21. "Pricing and rate limits | Mistral AI Large Language Models". docs.mistral.ai. Retrieved 2024-01-22.
  22. Mistral AI (2023-12-11). "La plateforme". mistral.ai. Retrieved 2024-01-22.
  23. "LMSys Chatbot Arena Leaderboard - a Hugging Face Space by lmsys". huggingface.co. Retrieved 2024-01-22.