AgentPerf treats agentic AI as a long-running systems workload, where memory, networking, batching, MoE routing, and latency control all matter together. The workload includes long context windows, with requests ranging from 5K to 131K tokens and averaging about 27K tokens, which stresses