@@ -250,6 +250,7 @@ To avoid redundant parses of the input file, some intuition is needed. In this a
* The two instances are exactly the two documents that we need to compare for each key. Calculate similarity and emit key pairs that are similar.
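
The comparison in the last step can be sketched in plain Java as below. This is a minimal sketch and is not taken from `NaiveApproach.java`: the class and method names are hypothetical, and Jaccard similarity over whitespace-separated words is an assumption, since the similarity measure is not named here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the repository.
final class PairSimilaritySketch {

    // Jaccard similarity |A ∩ B| / |A ∪ B| over the distinct words of each document.
    // Assumption: documents are plain strings with whitespace-separated words.
    static double jaccard(String docA, String docB) {
        Set<String> a = new HashSet<>(Arrays.asList(docA.trim().split("\\s+")));
        Set<String> b = new HashSet<>(Arrays.asList(docB.trim().split("\\s+")));

        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);

        // |A ∪ B| = |A| + |B| - |A ∩ B|; split() always yields at least one token,
        // so the union is never empty.
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    // A pair would be emitted only when it reaches the chosen threshold.
    static boolean isSimilar(String docA, String docB, double threshold) {
        return jaccard(docA, docB) >= threshold;
    }
}
```

In a reducer, a check like `isSimilar(first, second, threshold)` would decide whether the key pair is written to the output; the threshold value itself is left to the caller.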
The code below can be found [here](src/similarity/NaiveApproach.java).
**Mapper:**
```java
@Override
...
...
@@ -346,6 +347,7 @@ In this part, the implementation is more trivial:
* In the reduce phase, for each key, compute the similarity only if several document ids are present. Since the words are sorted by frequency, ideally far fewer comparisons are needed.
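
To make the role of the frequency ordering concrete, here is a minimal sketch of how a prefix of rare words might be selected for each document. It is not taken from `PrefilteringApproach.java`: the names are hypothetical, and the prefix length `|d| - ceil(t * |d|) + 1` is the usual prefix-filtering bound for a Jaccard threshold `t`, which is an assumption about the measure used.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the repository.
final class PrefixFilterSketch {

    // Number of leading (rarest) words to index for a document of docSize distinct words.
    // Assumption: Jaccard threshold t, giving |d| - ceil(t * |d|) + 1, clamped to docSize.
    static int prefixLength(int docSize, double threshold) {
        int length = docSize - (int) Math.ceil(threshold * docSize) + 1;
        return Math.max(0, Math.min(docSize, length));
    }

    // Sorts the distinct words of a document by ascending global frequency
    // (ties broken alphabetically) and keeps only the prefix.
    static List<String> prefix(List<String> words, Map<String, Integer> frequency, double threshold) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(Comparator
                .comparingInt((String w) -> frequency.getOrDefault(w, 0))
                .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(sorted.subList(0, prefixLength(sorted.size(), threshold)));
    }
}
```

Only the words in this prefix would be emitted as keys, so two documents meet in a reducer only when they share at least one rare word, which is where the reduction in comparisons would come from.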
The code below can be found [here](src/similarity/PrefilteringApproach.java).