Understanding and Improving the Efficiency of Failure Resilience for Big Data Frameworks