Use your own customized open-source Large Language Model
You've built it. Now unleash it.
You already fine-tuned a model (great!). Now it's time to use it. Convert it to GGUF, quantize for local hardw...
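To make the last step concrete, here is a minimal sketch of running an already converted and quantized GGUF file locally with the llama-cpp-python bindings. The model path, quantization level, and prompt are placeholders, and the GGUF conversion itself is assumed to have been done beforehand (for example with llama.cpp's conversion and quantization tools); this is not the article's exact setup.

```python
# Minimal sketch: run a locally converted + quantized GGUF model with llama-cpp-python.
# Assumes `pip install llama-cpp-python` and that the GGUF file below already exists;
# the file name and prompt are hypothetical placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="./my-finetuned-model-Q4_K_M.gguf",  # hypothetical quantized model file
    n_ctx=2048,                                      # context window for local inference
)

result = llm(
    "Explain in one sentence why running a fine-tuned model locally is useful.",
    max_tokens=64,
)
print(result["choices"][0]["text"].strip())
```

Quantization presets such as Q4_K_M trade a small amount of accuracy for a much smaller memory footprint, which is what makes running a fine-tuned model on consumer hardware practical.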
Similar Articles (10 found)
70.4% similar
An HN user asked me how I run LLMs locally with some specific questions; I'm documenting it here for everyone.
Before I begin I would like to credit t...
69.6% similar
2 Years of ML vs. 1 Month of Prompting
November 7, 2025
Recalls at major automakers cost hundreds of millions of dollars a year. It's a huge issue. To...
67.6% similar
The edge is back. This time, it speaks.
Let's be honest.
Talking to ChatGPT is fun.
But do you really want to send your "lock my screen" or "write a n...
65.3% similar
There has been a lot of interest on HN in fine-tuning open-source LLMs recently (eg. Anyscale's post at
https://news.ycombinator.com/item?id=37090632)...
64.2% similar
> the generation of 281,128 augmented examples, from which 1,000 were
held out as a benchmark test set.
This model is trained on a custom dataset of 2...
64.1% similar
Table of Contents
- Setting Up LLaVA/BakLLaVA with vLLM: Backend and API Integration
- Why vLLM for Multimodal Inference
- Configuring Your Developmen...
63.1% similar
The field of applied AI, which typically involves building pipelines that connect data to Large Language Models (LLMs) in a way that generates busines...
62.9% similar
The Rise of Multimodal LLMs and Efficient Serving with vLLM
In this tutorial, you will learn how multimodal LLMs like LLaVA, GPT-4V, and BakLLaVA comb...
62.7% similar
I want everything local: no cloud, no remote code execution.
That's what a friend said. That one-line requirement, albeit simple, would need multiple...
61.6% similar
How We Cut Inference Costs from $46K to $7.5K Fine-Tuning Qwen-Image-Edit
Running quality inference at scale is something we think about a lot at Oxen...