Copyset-merging Algorithm Design and Evaluation
This work extends the original copyset algorithm to a dynamic setting. The new algorithm reduced data-loss probability by 15-40% under concurrent node failures with minimal performance degradation.
My contributions:
- Identified the copyset degeneration problem and solved it by proposing a data migration mechanism and a copyset-merging algorithm.
- Implemented the initial simulation prototype.
- Designed the performance evaluation experiments, modified the HDFS source code (in Java), and conducted microbenchmarks (with a teammate) on CloudLab.