Journal Article · 2022
Fault Tolerance in Distributed Machine Learning Systems
Prof. Mohamed Youssef · Faculty of Computer Science
Journal of Parallel and Distributed Computing
64 citations
fault tolerance · distributed ML · federated learning