French AI startup Mistral has launched its first multimodal model, the Pixtral 12B, which can handle both text and images, according to Techcrunch. The model uses 12 billion parameters and is based on Mistral’s Nemo 12B text model. Pixtral 12B can answer questions about images via URLs or base64-encoded images, such as how many copies of a certain object are visible.
Most generative AI (genAI) models have been partially trained on copyrighted material, leading to lawsuits from copyright owners. (AI companies argue that the tactic should be classified as fair use.)
It is unclear what image data Mistral used to develop the Pixtral 12B.