diff --git a/BDPA_Assign2_WJIN.md b/BDPA_Assign2_WJIN.md
index 8f69dfbb5dcd7f4c0fb75d8a3d3dd622aac15fcd..72ff5b3ec209f8fe5de4855784a9480d6c7c5468 100644
--- a/BDPA_Assign2_WJIN.md
+++ b/BDPA_Assign2_WJIN.md
@@ -250,6 +250,7 @@ To avoid redundant parses of the input file, some intuition is needed. In this a
 * The two instances are exactly the two documents that we need to compare for each key. Calculate similarity and emit key pairs that are similar.
 
 Following code can be found [here](src/similarity/NaiveApproach.java).
+
 **Mapper:**
 ```java
 @Override
@@ -346,6 +347,7 @@ In this part, the implementation is more trivial:
 * At reduce phase, for each key, compute similarity if severals document id are represented. Since the words are sorted by frequency, ideally there will be much less comparaison needed
 
 Following code can be found [here](src/similarity/PrefilteringApproach.java).
+
 **Mapper:**
 ```java
 @Override