Our project needed Chinese word segmentation, so I asked a colleague for a snippet and am recording it here for future use.
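import java.io.IOException;
import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public static String detailData(String text) throws IOException {
    StringBuilder result = new StringBuilder();
    // Create the analyzer; true enables IK's maximum-word-length mode
    Analyzer anal = new IKAnalyzer(true);
    StringReader reader = new StringReader(text);
    // Tokenize the input
    TokenStream ts = anal.tokenStream("", reader);
    CharTermAttribute term = ts.getAttribute(CharTermAttribute.class);
    try {
        // Append each term followed by the "#@@#" separator
        while (ts.incrementToken()) {
            result.append(term.toString()).append("#@@#");
        }
    } finally {
        // Close the reader even if tokenization throws
        reader.close();
    }
    return result.toString();
}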
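Since detailData appends "#@@#" after every term, the caller has to split the returned string again. Below is a minimal sketch of one way to do that using only the JDK. The class name TokenSplitDemo, the helper splitTokens, and the sample string are all made up for illustration; the sample is not real IKAnalyzer output.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenSplitDemo {
    // Separator that detailData appends after every term.
    static final String SEPARATOR = "#@@#";

    // Splits a separator-terminated string into its terms, skipping empty pieces.
    static List<String> splitTokens(String joined) {
        List<String> tokens = new ArrayList<String>();
        int start = 0;
        int idx;
        while ((idx = joined.indexOf(SEPARATOR, start)) >= 0) {
            if (idx > start) {
                tokens.add(joined.substring(start, idx));
            }
            start = idx + SEPARATOR.length();
        }
        // Keep any trailing text that is not followed by a separator.
        if (start < joined.length()) {
            tokens.add(joined.substring(start));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical input standing in for a detailData result.
        String sample = "中文#@@#分词#@@#";
        for (String token : splitTokens(sample)) {
            System.out.println(token);
        }
    }
}
```

If the terms are only ever consumed as a list, it may be simpler to have the method return a List<String> directly instead of joining and re-splitting.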
Note: put attachments 1 and 2 in lib, and attachment 3 in the src root directory.
- IKAnalyzer3.2.3Stable.jar (1.1 MB)
- lucene-core-3.6.0.jar (1.5 MB)
- src根目录.zip (746 Bytes)
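Files placed in the src root normally end up at the root of the classpath after compilation. If segmentation misbehaves, a quick check that the file is actually visible there can help. The sketch below assumes the resource is named IKAnalyzer.cfg.xml, which is the configuration file name IK Analyzer conventionally looks for; the post does not say what the zip contains, so adjust the name as needed. ClasspathCheck and isOnClasspath are hypothetical names.

```java
public class ClasspathCheck {
    // Returns true if the named resource can be found from the classpath root.
    static boolean isOnClasspath(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName) != null;
    }

    public static void main(String[] args) {
        // Assumed file name; not confirmed by the attachment list above.
        String name = "IKAnalyzer.cfg.xml";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```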