diff --git a/BDPA_Assign2_WJIN.md b/BDPA_Assign2_WJIN.md
index c6708ab14952bcb554ff4a99f6aaed9a6e884f88..744a466f81a0872b420c820d7a86ab07c64c9608 100644
--- a/BDPA_Assign2_WJIN.md
+++ b/BDPA_Assign2_WJIN.md
@@ -400,10 +400,13 @@ The hadoop job overview:
 
 #### 3 Justification of difference
 
+The similar-document output can be found [here](similardoc). Remember that we used a sampled file, so there are far fewer similar documents than there would be on the full input. Even so, we can see that similar documents are very rare relative to the size of the sampled file.
+
 | Job       | # of comparaison | Execution Time |
 |:----------------:|:----------------:|:--------------:|
 | NaiveApproach               | 365085           | 7m 50s         |
 | PrefilteringApproach               | 976               | 15s         |
-The naive approach takes O(n) computational time and memory, thus needs much more time, even in the shuffle and sort phase.  
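+The naive approach compares every pair of documents, so its computational time and memory grow as O(n^2) (the 365085 comparisons above equal 855 × 854 / 2); it therefore needs much more time, even in the shuffle and sort phase.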
+
 The prefiltering approach is very efficient when similar documents are rare and documents are not very long, which is exactly our case. This explains the drastic performance difference.