A few months after launching Qwen3-VL, Alibaba has released a detailed technical report on the open multimodal model. The data shows the system excels at image-based math tasks and can analyze hours o...
Similar Articles (10 found)
70.8% similar
Things we learned about LLMs in 2024
31st December 2024
A lot has happened in the world of Large Language Models over the course of 2024. Here's a rev...
64.7% similar
Table of Contents
- Video Understanding and Grounding with Qwen 2.5
- Enhanced Video Comprehension Ability in Qwen 2.5 Models
- Dynamic Frame Rate (FP...
64.7% similar
Veo 3 shows emergent zero-shot abilities across many visual tasks, indicating that video models are on a path to becoming vision foundation models, jus...
62.5% similar
OpenAI just released two open-weight models, gpt-oss-120b and gpt-oss-20b, after months of anticipation (you can try them here).
That means anyone with ...
60.3% similar
Evaluating LLMs for my personal use case
Summary
It's great that AI can win maths Olympiads, but that's not what I'm doing. I mostly ask basic Rust, P...
59.9% similar
How We Cut Inference Costs from $46K to $7.5K Fine-Tuning Qwen-Image-Edit
Running quality inference at scale is something we think about a lot at Oxen...
59.4% similar
GPT-5: Key characteristics, pricing and model card
7th August 2025
I've had preview access to the new GPT-5 model family for the past two weeks (see r...
58.7% similar
Table of Contents
- SmolVLM to SmolVLM2: Compact Models for Multi-Image VQA
- SmolVLM 1: A Compact Yet Capable Vision-Language Model
- What Is SmolVLM...
58.5% similar
2 Years of ML vs. 1 Month of Prompting
November 7, 2025
Recalls at major automakers cost hundreds of millions of dollars a year. It's a huge issue. To...
58.1% similar
> the generation of 281,128 augmented examples, from which 1,000 were
held out as a benchmark test set.
This model is trained on a custom dataset of 2...