OpenAI intros two open-weight language models that can run on consumer GPUs — optimized to run on devices with just 16GB of memory


OpenAI has developed a pair of new open-weight language models optimized for consumer GPUs. In a blog post, OpenAI announced “gpt-oss-120b” and “gpt-oss-20b”, the former designed to run on a single 80GB GPU and the latter optimized to run on edge devices with just 16GB of memory.

Both models use a Transformer architecture with a mixture-of-experts (MoE) design, an approach popularized by DeepSeek R1. Despite their design focus on consumer GPUs, both support context lengths of up to 131,072 tokens, the longest available for local inference. gpt-oss-120b activates 5.1 billion parameters per token, while gpt-oss-20b activates 3.6 billion. Both models use alternating dense and locally banded sparse attention patterns, along with grouped multi-query attention with a group size of 8.
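The gap between a model's total parameter count and its active-per-token count comes from MoE routing: each token is sent to only a few "expert" sub-networks, so most of the model's weights sit idle on any given forward pass. The sketch below illustrates the routing idea in plain Python; the expert count, top-k value, and router scores are made-up numbers for illustration and are not taken from the gpt-oss architecture.

```python
import math

def top_k_experts(scores, k):
    """Return the indices of the k highest-scoring experts for one token."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def softmax(xs):
    """Normalize raw router scores into gating weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical router output: 8 experts, each token routed to its top 2.
# Only those 2 experts' parameters are "active" for this token.
router_scores = [0.1, 2.3, -0.5, 1.7, 0.0, -1.2, 0.9, 0.4]
chosen = top_k_experts(router_scores, k=2)
weights = softmax([router_scores[i] for i in chosen])

print("experts used for this token:", chosen)  # -> [1, 3]
print("gating weights:", [round(w, 3) for w in weights])
```

In a real MoE layer the chosen experts' outputs are combined using these gating weights; because only 2 of 8 experts run per token here, roughly a quarter of the layer's expert parameters are active, which is the same effect that lets gpt-oss-120b activate just 5.1 billion of its parameters per token.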
