> HTML title preprocessing


Were the HTML titles put through any text processing, such as stemming or lemmatization, before the hashes were calculated?

Posted by: shestakoff_andrey @ Sept. 13, 2016, 10:42 a.m.


We used lemmatization to preprocess the titles.
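
The thread does not give the exact pipeline, so the following is only a rough sketch of what "lemmatize, then hash" could look like. The toy lemma table, the whitespace tokenization, the lowercasing, and the choice of MD5 are all assumptions made for illustration, not details confirmed here.

```python
import hashlib
from typing import List

# Hypothetical toy lemma table (assumption). A real pipeline would use a
# proper lemmatizer rather than a hand-written dictionary.
TOY_LEMMAS = {
    "cats": "cat",
    "running": "run",
    "ran": "run",
    "titles": "title",
}


def lemmatize(word: str) -> str:
    """Map a word to its base form; words not in the table pass through unchanged."""
    return TOY_LEMMAS.get(word, word)


def hash_token(token: str) -> str:
    """Hash one token. MD5 is an assumption; the real hash function is not stated."""
    return hashlib.md5(token.encode("utf-8")).hexdigest()


def preprocess_title(title: str) -> List[str]:
    """Lowercase, split on whitespace, lemmatize each token, then hash it."""
    tokens = title.lower().split()
    return [hash_token(lemmatize(t)) for t in tokens]


if __name__ == "__main__":
    # Different inflections of the same words end up with identical hashes.
    print(preprocess_title("Cats running") == preprocess_title("cat ran"))
```

In this sketch, inflected forms such as "cats" and "cat" collapse to the same hash, which is the practical effect of lemmatizing before hashing.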

Posted by: vfedorenko @ Sept. 16, 2016, 1:28 p.m.

Does that mean the titles contain only meaningful words, without words like 'and', 'hello', 'we', 'get', etc.?

Posted by: xjSJTU @ Sept. 17, 2016, 1:40 p.m.
