What Models? — Pick the right model for your GPU in seconds

About

What Models? is a free tool built to answer a simple question that doesn't have a great answer anywhere else: given my GPU, which LLMs can I actually run — and how fast?

I built this after spending too much time downloading models only to find they didn't fit in VRAM, or ran too slowly to be useful. The calculations aren't complicated, but they're tedious to do by hand for every model you're considering. This tool does them all at once.
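To give a feel for the kind of arithmetic involved, here is a minimal sketch. It is not the site's actual code or formulas. It assumes two common rules of thumb: a model fits if its quantized file size plus some headroom for context and runtime buffers is within the card's VRAM, and generation speed is capped at roughly memory bandwidth divided by model size. The class names, the 1.5 GB overhead, and the GPU and model figures are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    vram_gb: float          # total VRAM on the card
    bandwidth_gb_s: float   # memory bandwidth from the datasheet


@dataclass
class Model:
    name: str
    file_size_gb: float     # size of the quantized GGUF file


def fits_in_vram(model: Model, gpu: Gpu, overhead_gb: float = 1.5) -> bool:
    """Assumed rule of thumb: weights plus fixed headroom must fit in VRAM.

    overhead_gb is a hypothetical allowance for the KV cache and runtime
    buffers; real usage grows with context length.
    """
    return model.file_size_gb + overhead_gb <= gpu.vram_gb


def max_tokens_per_second(model: Model, gpu: Gpu) -> float:
    """Rough upper bound: each generated token reads all weights once,
    so speed is limited by bandwidth / model size. Real speeds are lower."""
    return gpu.bandwidth_gb_s / model.file_size_gb


# Made-up example values, not real hardware or model data.
gpu = Gpu("example-gpu", vram_gb=12.0, bandwidth_gb_s=360.0)
models = [
    Model("example-7b-q4", file_size_gb=4.5),
    Model("example-34b-q4", file_size_gb=20.0),
]

for m in models:
    if fits_in_vram(m, gpu):
        print(f"{m.name}: fits, up to ~{max_tokens_per_second(m, gpu):.0f} tok/s")
    else:
        print(f"{m.name}: does not fit in {gpu.vram_gb:.0f} GB")
```

Doing this once is trivial. Repeating it for every model and quantization you're considering is the tedious part.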

The site is open source. If you spot a mistake in the data, want a model or GPU added, or have a suggestion, the best place to raise it is GitHub Issues.

How the data is sourced

Model weight sizes and VRAM figures come from bartowski's GGUF releases on Hugging Face, consistently the most reliable source for Q4_K_M and Q8_0 quantizations. GPU specs come from manufacturer datasheets. Benchmark scores (MMLU, SWE-bench) are pulled from public leaderboards.

Built by

BenD10 on GitHub.