GEMMA 4 12B WITH MULTI TOKEN PREDICTION ON THE GMKTEC EVO-X2 IS A DIFFERENT LEVEL

Ran MTP in llama.cpp and you can immediately see the difference. Draft candidates get accepted almost instantly. Video shows 14+ tokens per second at nearly 30k context, and the GPU holds steady with
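
For anyone wondering what "draft candidates get accepted" means, here is a minimal sketch of greedy draft verification, the general idea behind speculative-style decoding. It is not llama.cpp's actual code, and the function name `accept_draft` and the toy token IDs are made up for illustration. The assumption is that the main model checks all drafted tokens in one pass and returns its own choice at every position, plus one extra token at the end.

```python
def accept_draft(draft, verified):
    """Return the tokens kept after one verification pass (greedy sketch).

    draft:    tokens proposed by the draft / MTP head (hypothetical input)
    verified: the main model's own pick at each draft position, plus one
              extra token after the last draft position, so
              len(verified) == len(draft) + 1
    """
    kept = []
    for proposed, actual in zip(draft, verified):
        if proposed != actual:
            # First mismatch: keep the main model's token and stop.
            # Later verified tokens were conditioned on a wrong prefix.
            kept.append(actual)
            return kept
        kept.append(proposed)
    # Every draft token matched, so the extra token comes for free.
    kept.append(verified[len(draft)])
    return kept


# All three drafts match: four tokens from a single main-model pass.
full_match = accept_draft([11, 12, 13], [11, 12, 13, 14])
# Second draft is wrong: one accepted draft plus the correction.
early_miss = accept_draft([11, 12, 13], [11, 99, 13, 14])
```

The more drafts that match, the more tokens each main-model pass yields, which is where the speedup comes from when acceptance is high.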