
StepFun has officially released Step-R1-V-Mini, a general-purpose multimodal reasoning model that accepts text and image input and produces text output. According to the announcement, the model offers strong instruction following, high-precision image perception, and the ability to complete complex reasoning tasks.