OpenAI intros two open-weight language models that can run on consumer GPUs — optimized to run on devices with just 16GB of memory


OpenAI has developed a pair of new open-weight language models optimized for consumer GPUs. In a blog post, OpenAI announced “gpt-oss-120b” and “gpt-oss-20b”, the former designed to run on a single 80GB GPU and the latter optimized to run on edge devices with just 16GB of memory.

Both models use a Transformer architecture with a mixture-of-experts (MoE) design, an approach popularized by DeepSeek R1. Despite their focus on consumer GPUs, both gpt-oss-120b and gpt-oss-20b support context lengths of up to 131,072 tokens, the longest available for local inference. gpt-oss-120b activates 5.1 billion parameters per token, and gpt-oss-20b activates 3.6 billion. Both models use alternating dense and locally banded sparse attention patterns, along with grouped multi-query attention with a group size of 8.
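The figures above can be put in perspective with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the ~117B total parameter count for gpt-oss-120b is an assumption implied by the model's name, and the 64-query-head layout is a hypothetical example used to show how a grouped multi-query attention group size of 8 shrinks the number of key/value heads.

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Fraction of weights a mixture-of-experts model touches per token."""
    return active_params_b / total_params_b

# gpt-oss-120b: 5.1B active parameters per token (from the article),
# ~117B total (an assumption, implied by the "120b" name).
frac_120b = active_fraction(5.1, 117.0)

# Grouped multi-query attention with group size 8 (from the article):
# every 8 query heads share one key/value head, so the KV cache shrinks
# by the same factor of 8.
query_heads = 64                      # hypothetical head count, for illustration
group_size = 8                        # from the article
kv_heads = query_heads // group_size  # shared key/value heads

print(f"gpt-oss-120b weights active per token: {frac_120b:.1%}")
print(f"KV heads with GQA (group=8): {kv_heads} instead of {query_heads}")
```

Under these assumptions, only a few percent of the 120b model's weights are active for any given token, which is what lets an MoE model of this size run with the memory and compute budget of a much smaller dense model.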
