Network packet classification is a central building block for important network services such as QoS routing and firewalling. Accordingly, a wide range of classification schemes has been proposed, each with its own specific set of characteristics. Yet while novel algorithms continue to be developed at a rapid pace, tool support for proper benchmarking remains scarce, making it hard for researchers and engineers to evaluate and compare these algorithms under changing scenarios. In this paper, we present the Classification Algorithm Testing Environment (CATE). CATE consistently and reproducibly extracts key performance characteristics, such as memory footprint and matching speed, of a predefined set of classification algorithms from a highly customizable set of benchmarks. In addition, we demonstrate that CATE can be used to gain new insights into both the input parameter sensitivity and the scalability of even well-studied algorithms.