From a9001215b8a2c8d3f49664e0f2cb865b91dcd3af Mon Sep 17 00:00:00 2001
From: Wen Yao Jin <wen-yao.jin@student.ecp.fr>
Date: Sat, 11 Mar 2017 21:48:13 +0100
Subject: [PATCH] Update BDPA_Assign2_WJIN.md

---
 BDPA_Assign2_WJIN.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/BDPA_Assign2_WJIN.md b/BDPA_Assign2_WJIN.md
index bd9c9c4..8f69dfb 100644
--- a/BDPA_Assign2_WJIN.md
+++ b/BDPA_Assign2_WJIN.md
@@ -249,6 +249,7 @@ To avoid redundant parses of the input file, some intuition is needed. In this a
 * In the reduce phase, process only keys with `two` instances. This way we ignore empty documents: since empty documents are not in the input file, their keys appear only once. As empty documents are infrequent, the computation time is not much affected.
 * The two instances are exactly the two documents that need to be compared for each key. Compute their similarity and emit the key pairs that are similar (see the reducer sketch after this list).
 
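 A minimal sketch of the reduce side described above, assuming the key is a document-id pair, the value is the document's text, Jaccard similarity, and a 0.8 threshold (class, helper and parameter names here are illustrative assumptions, not the project's actual code):
 
 ```java
 import java.io.IOException;
 import java.util.*;
 
 import org.apache.hadoop.io.DoubleWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mapreduce.Reducer;
 
 public class NaiveSimilarityReducer extends Reducer<Text, Text, Text, DoubleWritable> {
     private static final double THRESHOLD = 0.8; // assumed similarity threshold
 
     @Override
     protected void reduce(Text pair, Iterable<Text> docs, Context context)
             throws IOException, InterruptedException {
         // Collect the word sets attached to this document pair.
         List<Set<String>> wordSets = new ArrayList<>();
         for (Text doc : docs) {
             wordSets.add(new HashSet<>(Arrays.asList(doc.toString().split("\\s+"))));
         }
         // Keys seen only once involve an empty document: skip them.
         if (wordSets.size() != 2) {
             return;
         }
         double sim = jaccard(wordSets.get(0), wordSets.get(1));
         if (sim >= THRESHOLD) {
             context.write(pair, new DoubleWritable(sim));
         }
     }
 
     // Jaccard similarity: |A ∩ B| / |A ∪ B|
     private static double jaccard(Set<String> a, Set<String> b) {
         Set<String> inter = new HashSet<>(a);
         inter.retainAll(b);
         Set<String> union = new HashSet<>(a);
         union.addAll(b);
         return union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
     }
 }
 ```
 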
+The following code can be found [here](src/similarity/NaiveApproach.java).
 **Mapper:**
 ```java
       @Override
@@ -344,6 +345,7 @@ In this part, the implementation is more straightforward:
 * Since the map phase outputs only document ids and not the documents themselves, a hashmap for document retrieval is needed at the reduce phase. We load it in the `setup` function.
 * At the reduce phase, for each key, compute the similarity if several document ids are present. Since the words are sorted by frequency, ideally far fewer comparisons are needed (see the reducer sketch after this list).
 
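 A minimal sketch of the pre-filtering reduce side described above, assuming the key is a word, the values are document ids, and a corpus file with one `id<TAB>text` line per document (the file name `documents.txt`, its layout, the class name and the 0.8 threshold are illustrative assumptions):
 
 ```java
 import java.io.BufferedReader;
 import java.io.IOException;
 import java.io.InputStreamReader;
 import java.util.*;
 
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.DoubleWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mapreduce.Reducer;
 
 public class PrefilterSimilarityReducer extends Reducer<Text, Text, Text, DoubleWritable> {
     private static final double THRESHOLD = 0.8;                // assumed threshold
     private final Map<String, Set<String>> corpus = new HashMap<>();
 
     @Override
     protected void setup(Context context) throws IOException {
         // Load the id -> word-set hashmap once per reducer (assumed file layout).
         FileSystem fs = FileSystem.get(context.getConfiguration());
         try (BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(new Path("documents.txt"))))) {
             String line;
             while ((line = in.readLine()) != null) {
                 String[] parts = line.split("\t", 2);
                 corpus.put(parts[0], new HashSet<>(Arrays.asList(parts[1].split("\\s+"))));
             }
         }
     }
 
     @Override
     protected void reduce(Text word, Iterable<Text> ids, Context context)
             throws IOException, InterruptedException {
         List<String> docIds = new ArrayList<>();
         for (Text id : ids) {
             docIds.add(id.toString());
         }
         // Only words shared by several documents generate candidate pairs.
         for (int i = 0; i < docIds.size(); i++) {
             for (int j = i + 1; j < docIds.size(); j++) {
                 Set<String> a = corpus.get(docIds.get(i));
                 Set<String> b = corpus.get(docIds.get(j));
                 if (a == null || b == null) {
                     continue; // id missing from the corpus file
                 }
                 Set<String> inter = new HashSet<>(a);
                 inter.retainAll(b);
                 Set<String> union = new HashSet<>(a);
                 union.addAll(b);
                 double sim = union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
                 if (sim >= THRESHOLD) {
                     context.write(new Text(docIds.get(i) + "," + docIds.get(j)), new DoubleWritable(sim));
                 }
             }
         }
     }
 }
 ```
 
 Note that in this sketch a pair sharing several words would be emitted more than once; deduplication of output pairs is left out.
 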
+The following code can be found [here](src/similarity/PrefilteringApproach.java).
 **Mapper:**
 ```java
 @Override
-- 
GitLab