Update Report.md

Commit fd406f7a authored 8 years ago by Meiqi Guo
--- a/Report.md
+++ b/Report.md
@@ -364,6 +364,7 @@ public void reduce(Text key, Iterable<Text> values, Context context)
    	 }
      }
 ```
 [The execution time](https://gitlab.my.ecp.fr/2014guom/BigDataProcessAssignment2/blob/master/output/Hadoop_IndexApproach.PNG) is `42 seconds`, much less than with the Naive Approach.
 [The number of comparisons](https://gitlab.my.ecp.fr/2014guom/BigDataProcessAssignment2/blob/master/output/counter_IndexApproach.PNG) is 17, far fewer than with the Naive Approach.
@@ -374,4 +375,14 @@ You can find the overview of hadoop below:
 See the complete code [here](https://gitlab.my.ecp.fr/2014guom/BigDataProcessAssignment2/blob/master/IndexApproach.java). I didn't commit the output since it's empty for the sample.
+### Explain and justify the difference
+Approach       | Execution time | Number of comparisons
+-------------- | -------------- | ---------------------
+Naive Approach | 4 min 15 s     | 11476
+Index Approach | 42 s           | 17
+
+We can clearly see that the Index Approach is faster than the Naive Approach, even on a sample dataset.
+This is reasonable because the second method uses the inverted index to reduce the number of pair comparisons, which makes it possible to skip the (huge) number of comparisons between documents that cannot be similar.
+The first method, in contrast, compares every pair of documents, so it takes O(n²) computational time and memory for n documents and thus needs much more time.
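
The gap between the two comparison counts can be illustrated with a small standalone sketch. It is not the Hadoop code from the report: the class `ComparisonCount`, its method names, and the toy documents are hypothetical, and the index here is simplified to cover every word of every document. As a side note, 11476 equals 152 × 151 / 2, which would be consistent with an all-pairs comparison over 152 documents, although the report does not state the document count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the report's code base.
class ComparisonCount {

    // Naive approach: every unordered pair of documents is compared once.
    static long naivePairs(int n) {
        return (long) n * (n - 1) / 2;
    }

    // Simplified index approach (assumption: all words are indexed):
    // two documents become a candidate pair only if they share a word.
    static long indexedPairs(List<Set<String>> docs) {
        // Inverted index: word -> ids of the documents containing it.
        Map<String, List<Integer>> index = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String word : docs.get(id)) {
                index.computeIfAbsent(word, w -> new ArrayList<>()).add(id);
            }
        }
        // Collect distinct candidate pairs, encoded as a single long.
        Set<Long> candidates = new HashSet<>();
        for (List<Integer> posting : index.values()) {
            for (int i = 0; i < posting.size(); i++) {
                for (int j = i + 1; j < posting.size(); j++) {
                    // Ids were appended in increasing order, so get(i) < get(j).
                    candidates.add((long) posting.get(i) * docs.size() + posting.get(j));
                }
            }
        }
        return candidates.size();
    }

    public static void main(String[] args) {
        // Toy documents; only the first two share a word ("index").
        List<Set<String>> docs = List.of(
                Set.of("hadoop", "index"),
                Set.of("index", "pair"),
                Set.of("naive"),
                Set.of("similar"));
        System.out.println("naive pairs:   " + naivePairs(docs.size()));
        System.out.println("indexed pairs: " + indexedPairs(docs));
    }
}
```

On the toy input the naive count grows with the square of the number of documents, while the indexed count depends only on how many documents actually share a word, which is the effect the table above shows on the sample dataset.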