Back to AI Pulse

GEMMA 4 12B WITH MULTI TOKEN PREDICTION ON GMKTEC EVO-X2

Video shows 14+ tokens per second at nearly 30k context.

GEMMA 4 12B WITH MULTI TOKEN PREDICTION ON THE GMKTEC EVO-X2 IS A DIFFERENT LEVEL ran mtp in llama.cpp and you can immediately see the difference draft candidates get accepted almost instantly video shows 14+ tokens per second at nearly 30k context and the gpu holds steady with

Source