A benchmark for ML inference latency on mobile devices
Z. Li, M. Paolieri, L. Golubchik
Abstract: Inference latency prediction on mobile devices is essential for multiple applications, including collaborative inference and neural architecture search. Training accurate latency predictors using ML techniques requires sufficient and representative data; however, collecting such data is challenging. To overcome this challenge, in this work we construct a comprehensive dataset for predicting inference latency on mobile devices. Our dataset contains 102 real-world CNNs, 69 real-world ViTs, and 1000 synthetic CNNs measured across 174 diverse experimental environments on mobile platforms, accounting for critical factors affecting inference latency, including hardware heterogeneity, data representations, and ML frameworks. Our code is available at: https://github.com/qed-usc/mobile-ml-benchmark.git.
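The abstract describes training ML-based latency predictors from measurement data. The sketch below illustrates the general idea under stated assumptions: the feature columns (GFLOPs, parameter count, input resolution) and the synthetic measurements are illustrative placeholders, not the paper's actual dataset schema or method.

```python
# Hypothetical sketch of a latency predictor; features and data are
# illustrative placeholders, not the benchmark's actual schema.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Illustrative per-model features: compute cost, size, input resolution.
X = np.column_stack([
    rng.uniform(0.1, 10.0, n),       # GFLOPs (assumed feature)
    rng.uniform(1.0, 100.0, n),      # parameters, millions (assumed feature)
    rng.choice([128, 224, 384], n),  # input resolution (assumed feature)
])
# Synthetic "measured" latency in ms: placeholder only, not real data.
y = 5.0 * X[:, 0] + 0.1 * X[:, 1] + 0.01 * X[:, 2] + rng.normal(0.0, 0.5, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_tr, y_tr)
mape = mean_absolute_percentage_error(y_te, model.predict(X_te))
print(f"held-out MAPE: {mape:.3f}")
```

In practice, a predictor trained this way is only as good as the coverage of the measurement data, which is why the dataset spans many device/framework/representation combinations.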
Proceedings of EdgeSys, pp. 31-36, ACM, 2024.
Keywords: Performance Models, ML Systems, Edge AI, Applications