I suspect that calculating the average RGB of each emoji, and then picking for each subsection of the image the emoji whose average is closest to that subsection's average RGB, would have resulted in better emoji choices and better output (simpl...
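A minimal sketch of that idea, in plain Python with no imaging library. Everything here is an assumption for illustration: the function names, the square tile size, the image represented as a list of rows of `(r, g, b)` tuples, and the use of squared Euclidean distance in RGB space. The emoji averages would have to be computed up front by running `average_rgb` over each emoji's rendered pixels.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]
RGB = Tuple[float, float, float]


def average_rgb(pixels: Sequence[Pixel]) -> RGB:
    """Per-channel mean over a non-empty sequence of (r, g, b) pixels."""
    n = len(pixels)
    return (
        sum(p[0] for p in pixels) / n,
        sum(p[1] for p in pixels) / n,
        sum(p[2] for p in pixels) / n,
    )


def squared_distance(a: RGB, b: RGB) -> float:
    """Squared Euclidean distance between two colors in RGB space."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_emoji(color: RGB, palette: Dict[str, RGB]) -> str:
    """Name of the palette entry whose average color is closest to `color`."""
    return min(palette, key=lambda name: squared_distance(color, palette[name]))


def to_emoji_grid(
    image: List[List[Pixel]], tile: int, palette: Dict[str, RGB]
) -> List[List[str]]:
    """Split the image into tile x tile subsections and pick one emoji per subsection."""
    height, width = len(image), len(image[0])
    grid = []
    for top in range(0, height, tile):
        row = []
        for left in range(0, width, tile):
            # Edge tiles are clipped to the image bounds.
            block = [
                image[y][x]
                for y in range(top, min(top + tile, height))
                for x in range(left, min(left + tile, width))
            ]
            row.append(nearest_emoji(average_rgb(block), palette))
        grid.append(row)
    return grid


# Hypothetical precomputed emoji averages (illustrative values, not measured).
EXAMPLE_PALETTE: Dict[str, RGB] = {
    "red_square": (220.0, 40.0, 40.0),
    "blue_square": (40.0, 80.0, 220.0),
    "green_square": (40.0, 180.0, 60.0),
}
```

Plain RGB distance does not track perceived color difference very well, so a perceptual color space might give better matches, but that goes beyond what I'm suggesting here.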