From 6bb4cfa97d53ed1fa964515b8bcf22e29edf86e9 Mon Sep 17 00:00:00 2001
From: Meiqi Guo <mei-qi.guo@student.ecp.fr>
Date: Fri, 17 Mar 2017 03:04:37 +0100
Subject: [PATCH] Update README.md

---
 README.md | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index e65b96a..85a5e27 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ else if (stopWords.contains(word)){
 word.set(token.replaceAll("[^A-Za-z0-9]+", "").toLowerCase())
 ```
 
-**keep each unique word only once per line**
+**Keep each unique word only once per line**
 
 
 We define a *hashset* where we store words
@@ -53,8 +53,16 @@ I used two counters:
 * the other one is to record the number of lines for the output, named *FinalLineNumCounter*, which means the number after removing all empty lines. 
 
 The result is shown as below:
+
+NUM = 124787
+
+Final_NUM = 114815
+
+So 9972 lines (124787 - 114815), nearly 10000, are empty.
+
+
 ![](https://gitlab.my.ecp.fr/2014guom/BigDataProcessAssignment2/blob/master/output/counters.PNG)
-<img src="https://gitlab.my.ecp.fr/2014guom/BigDataProcessAssignment2/blob/master/output/counters.PNG" width="100px" height="80px" alt="简书">
+
 
 **Order the tokens of each line in ascending order of global frequency**
 
-- 
GitLab