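Today we will implement part of the word-memorization feature. Before getting started, here is the download link for the app again: http://apk.91.com/Soft/Android/com.carlos.yueci-4.html
My lab has recently taken on new work, so my time will be tighter and this project needs to move faster.
First, here is the feature analysis copied over from Part 2 of this series:
Feature 2: memorizing words.
Approach: this uses a second database, the word bank for memorization. We need a TXT file containing the words; by parsing it we extract the words to memorize, store them in the database, and then present them according to certain rules.
Techniques involved:
1) A database, similar to the database work in the earlier parts;
2) Parsing the words in the TXT file with string-parsing functions;
3) A word state machine: an algorithm that presents words according to certain rules and drives the memorization flow. (This one really is fiddly.)
4) File browsing: a simple file browser for locating the TXT word file on the SD card and importing it into the word bank. This is a fairly independent feature.
Today's main work is to implement the database and parse the TXT word list; then, space permitting, I will analyze the memorization algorithm of the 拓词 app.
I. The database.
To avoid conflicts with the database used by the dictionary feature, we create a separate DatabaseHelper instead of putting several tables in one database. The two then operate independently and mistakes are less likely.
This database records the words to memorize and their definitions, along with the number of correct answers, the number of wrong answers, the grasp level, and so on. I borrowed the "grasp level" idea from 拓词: the grasp level decides how often a word appears.
First, create a DatabaseHelper subclass: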
package com.carlos.database;

import android.content.ContentValues;
import android.content.Context;
import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;
import android.database.sqlite.SQLiteDatabase.CursorFactory;
import android.database.sqlite.SQLiteOpenHelper;

public class DataBaseHelper extends SQLiteOpenHelper {

    public Context mContext = null;
    // Holds the "name" constructor argument, which is also the database file
    // name. Queries use it as the table name, so it has to be "glossary" to
    // match the table created in onCreate().
    public String tableName = null;
    public static int VERSION = 1;

    public SQLiteDatabase dbR = null;
    public SQLiteDatabase dbW = null;

    public DataBaseHelper(Context context, String name, CursorFactory factory, int version) {
        super(context, name, factory, version);
        mContext = context;
        tableName = name;
        dbR = this.getReadableDatabase();
        dbW = this.getWritableDatabase();
    }

    public DataBaseHelper(Context context, String name, CursorFactory factory) {
        this(context, name, factory, VERSION);
    }

    public DataBaseHelper(Context context, String name) {
        this(context, name, null);
    }

    @Override
    public void onCreate(SQLiteDatabase db) {
        db.execSQL("create table glossary(word text,interpret text,"
                + "right int,wrong int,grasp int,learned int)");
    }

    @Override
    public void onUpgrade(SQLiteDatabase arg0, int arg1, int arg2) {
        // No schema upgrades yet.
    }

    /**
     * Inserts a word and its definition, resetting all statistics to 0.
     *
     * @param word      the word
     * @param interpret its definition
     * @param overWrite whether to overwrite an existing row for this word
     * @return true if a row was inserted or overwritten, false if the word
     *         already existed and overWrite was false
     */
    public boolean insertWordInfoToDataBase(String word, String interpret, boolean overWrite) {
        Cursor cursor = dbR.query(tableName, new String[]{"word"}, "word=?",
                new String[]{word}, null, null, "word");
        if (cursor.moveToNext()) {
            // The word is already in the table.
            if (overWrite) {
                ContentValues values = new ContentValues();
                values.put("interpret", interpret);
                values.put("right", 0);
                values.put("wrong", 0);
                values.put("grasp", 0);
                values.put("learned", 0);
                dbW.update(tableName, values, "word=?", new String[]{word});
                cursor.close();
                return true;
            } else {
                cursor.close();
                return false;
            }
        } else {
            // New word: insert a fresh row.
            ContentValues values = new ContentValues();
            values.put("word", word);
            values.put("interpret", interpret);
            values.put("right", 0);
            values.put("wrong", 0);
            values.put("grasp", 0);
            values.put("learned", 0);
            dbW.insert(tableName, null, values);
            cursor.close();
            return true;
        }
    }
}
db.execSQL("create table glossary(word text,interpret text,right int,wrong int,grasp int,learned int)");
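word: the word;
interpret: its definition;
right: number of correct answers;
wrong: number of wrong answers;
grasp: grasp level;
learned: flags whether the word has already been studied.
I also added a method, insertWordInfoToDataBase, that external code can call to import a word and its definition into the database. Its overWrite parameter controls whether an existing entry for the same word is overwritten.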
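To make the overWrite semantics concrete, here is a minimal plain-Java sketch that stands in for the table with a HashMap. It is not the SQLite code from this post: GlossarySketch, Row and insertWordInfo are hypothetical names used only for illustration, and only the return values and the reset-on-overwrite behavior are modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the glossary table (the post uses SQLite).
final class GlossarySketch {

    // One row: the definition plus the four counters, all 0 after an import.
    static final class Row {
        final String interpret;
        int right, wrong, grasp, learned;

        Row(String interpret) {
            this.interpret = interpret;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    // Same return convention as insertWordInfoToDataBase:
    // false only when the word exists and overWrite is false.
    boolean insertWordInfo(String word, String interpret, boolean overWrite) {
        if (rows.containsKey(word) && !overWrite) {
            return false; // existing row is left untouched
        }
        rows.put(word, new Row(interpret)); // new row, or existing row reset
        return true;
    }

    Row get(String word) {
        return rows.get(word);
    }
}
```

Importing the same word twice with overWrite set to false keeps the first definition and its statistics; with overWrite set to true the second import replaces the definition and resets the counters.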
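Using the database then works much as it did for the dictionary feature: instantiate the DatabaseHelper subclass, obtain the readableDatabase and writableDatabase, and perform inserts, deletes, updates and queries.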
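II. Parsing the TXT word list.
You may have been wondering in what form the words actually get into the word bank. Importing from TXT was something of an accident: I happened to be studying from a PDF word list, so I converted the PDF to TXT and then looked for a way to parse it. The content of the TXT file looks like this:
hello int.你好,哈喽 beat v.打败,战胜happy a.欢乐幸福
number n.数字 alphabetical a.按字母表顺序的 wind n.风
Each line may contain several words, and there must be one or more spaces between a word and its definition.
How do we parse this? The basic idea is to separate the words from the definitions while keeping them matched one to one. My first thought was regular expressions; the steps are:
1. First, find all the words in a line of the word list and store them in an ArrayList<String>, using the regular expression "[a-zA-Z]+[ ]+": [a-zA-Z]+ matches one or more English letters, and [ ]+ requires one or more spaces right after them.
hello int.你好,哈喽 beat v.打败,战胜happy a.欢乐幸福
This extracts three words: hello, beat, happy.
2. Then replace every word in the line with a marker, <S%%E>. You can design a different marker, but it should be something unusual. The string becomes:
<S%%E>int.你好,哈喽 <S%%E>v.打败,战胜<S%%E>a.欢乐幸福
3. Then append one more marker, <S%%E>, to the end of the string:
<S%%E>int.你好,哈喽 <S%%E>v.打败,战胜<S%%E>a.欢乐幸福<S%%E>
4. Now everything between %E> and <S% must be a definition, so we match it with the regular expression %E>[^<S%%E>]+<S%. The part [^<S%%E>]+ is a negated character class: it matches one or more characters other than <, S, %, E and >, so a match cannot run across a marker and swallow several definitions at once. (For the same reason, a definition that itself contains one of those characters, such as an uppercase S or E, will not be matched.)
%E>int.你好,哈喽 <S%
%E>v.打败,战胜<S%
%E>a.欢乐幸福<S%
Then drop the first three and the last three characters (that is, keep the 4th through the 4th-from-last character) to strip the markers and get the final definitions:
int.你好,哈喽
v.打败,战胜
a.欢乐幸福
Now we have each word and its definition. Inserting them into the database created earlier imports the word data; this uses the insertWordInfoToDataBase method of the database helper above.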
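The four steps can be tried outside Android with a small plain-Java sketch. The two regular expressions and the marker are taken from the text; the class name MarkerParseSketch, the method parseLine, returning a map instead of writing to the database, and trimming the definitions are my own assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone version of the marker-based parsing steps.
final class MarkerParseSketch {
    static final String MARK = "<S%%E>";
    static final Pattern WORD = Pattern.compile("[a-zA-Z]+[ ]+");
    static final Pattern INTERPRET = Pattern.compile("%E>[^<S%%E>]+<S%");

    static Map<String, String> parseLine(String line) {
        // Step 1: every run of letters followed by spaces is a word.
        List<String> words = new ArrayList<>();
        Matcher wordMatcher = WORD.matcher(line);
        while (wordMatcher.find()) {
            words.add(wordMatcher.group().trim());
        }

        // Steps 2 and 3: replace each word with the marker, then append one more.
        String marked = WORD.matcher(line)
                .replaceAll(Matcher.quoteReplacement(MARK)) + MARK;

        // Step 4: each "%E>...<S%" match is one definition; strip 3 chars per side.
        // Trimming is an addition in this sketch, not part of the steps above.
        List<String> interprets = new ArrayList<>();
        Matcher interpretMatcher = INTERPRET.matcher(marked);
        while (interpretMatcher.find()) {
            String hit = interpretMatcher.group();
            interprets.add(hit.substring(3, hit.length() - 3).trim());
        }

        // Pair words and definitions by position, up to the shorter list.
        Map<String, String> result = new LinkedHashMap<>();
        int count = Math.min(words.size(), interprets.size());
        for (int i = 0; i < count; i++) {
            result.put(words.get(i), interprets.get(i));
        }
        return result;
    }
}
```

Calling MarkerParseSketch.parseLine on the first sample line should yield the three words hello, beat and happy as keys, each mapped to its definition.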
I wrapped this logic in a class:
package com.carlos.text_parser;

import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import android.content.Context;

import com.carlos.database.DataBaseHelper;

public class WordListParser {

    public DataBaseHelper dbHelper = null;
    public Context context = null;
    public String tableName = null;

    public WordListParser() {
    }

    public WordListParser(Context context, String tableName) {
        this.context = context;
        this.tableName = tableName;
        dbHelper = new DataBaseHelper(context, tableName);
    }

    /** Parses one line of the word list and stores every word/definition pair. */
    public void parse(String lineStr) {
        String strInterpret = "";
        String str = "";
        char[] charArray = null;
        Pattern patternWord = Pattern.compile("[a-zA-Z]+[ ]+");
        Pattern patternInterpret = Pattern.compile("%E>[^<S%%E>]+<S%");
        Matcher matcherWord = patternWord.matcher(lineStr);
        Matcher matcherInterpret = null;
        ArrayList<String> wordList = new ArrayList<String>();
        ArrayList<String> interpretList = new ArrayList<String>();

        // Step 1: collect the words.
        while (matcherWord.find()) {
            str = matcherWord.group();
            charArray = str.toCharArray();
            if (charArray.length > 0 && (charArray[0] >= 'A' && charArray[0] <= 'Z')) {
                // Convert an uppercase first letter to lowercase.
                charArray[0] += ('a' - 'A');
                str = new String(charArray, 0, charArray.length);
            }
            wordList.add(str.trim());
        }
        if (wordList.size() <= 0)
            return;

        // Steps 2 and 3: replace the words with markers and append a final marker.
        matcherWord.reset(lineStr);
        if (matcherWord.find()) {
            strInterpret = matcherWord.replaceAll("<S%%E>");
        }
        strInterpret += "<S%%E>";

        // Step 4: extract the definitions and strip 3 marker characters per side.
        matcherInterpret = patternInterpret.matcher(strInterpret);
        while (matcherInterpret.find()) {
            str = matcherInterpret.group();
            interpretList.add(new String(str.toCharArray(), 3, str.length() - 6));
        }

        // Pair words with definitions by position and store them.
        int countWord = wordList.size();
        int countInterpret = interpretList.size();
        int count = countWord > countInterpret ? countInterpret : countWord;
        for (int i = 0; i < count; i++) {
            dbHelper.insertWordInfoToDataBase(wordList.get(i), interpretList.get(i), true);
        }
    }

    // Earlier character-by-character approach, kept for reference:
    // public boolean isOfAnWord(int index, char[] str) {
    //     int i = index;
    //     for (; i < str.length; i++) {
    //         if (isAlpha(str[i]) == false)
    //             break;
    //     }
    //     if (i == index)
    //         return false;
    //     if (i >= str.length)
    //         return true;
    //     if (str[i] == '.')
    //         return false;
    //     return true;
    // }
    //
    // public boolean isAlpha(char ch) {
    //     if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z')) {
    //         return true;
    //     } else
    //         return false;
    // }
    //
    // public boolean isChinese(char ch) {
    //     if (ch > 129)
    //         return true;
    //     else
    //         return false;
    // }
}
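One thing to watch in the definition pattern: [^<S%%E>] is a character class, so it rejects any single <, S, %, E or > rather than the whole marker string. The sketch below shows the effect on a definition containing an uppercase S, and compares it with a reluctant quantifier. The reluctant pattern is an alternative I am suggesting, not something the app uses, and InterpretPatternSketch and its methods are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison of two ways to match the text between markers.
final class InterpretPatternSketch {
    // Pattern from the post: excludes the characters <, S, %, E, > one by one.
    static final Pattern CHAR_CLASS = Pattern.compile("%E>[^<S%%E>]+<S%");
    // Suggested alternative: take as little as possible up to the next "<S%".
    static final Pattern RELUCTANT = Pattern.compile("%E>(.+?)<S%");

    // Returns the first definition found with the post's pattern, or null.
    static String firstWithCharClass(String marked) {
        Matcher m = CHAR_CLASS.matcher(marked);
        if (!m.find()) {
            return null;
        }
        String hit = m.group();
        return hit.substring(3, hit.length() - 3);
    }

    // Returns the first definition found with the reluctant pattern, or null.
    static String firstWithReluctant(String marked) {
        Matcher m = RELUCTANT.matcher(marked);
        return m.find() ? m.group(1) : null;
    }
}
```

For the marked string <S%%E>n.SOS!<S%%E> the character-class pattern finds nothing, while the reluctant pattern returns n.SOS!. I have not tried the reluctant version against real word lists, so treat it as a starting point only.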
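III. Analysis of the 拓词 algorithm
Admittedly, the memorization flow of "悦词" is modeled on 拓词. But since I built this app to make my own studying easier, I kept only the core of the algorithm. Here is the basic idea:
It is built around grasp levels and implemented as a state machine.
The state variables are:
1. process: the overall learning progress for the day. Its possible values are:
public final static int GRASP_89=8; // grasp level 8-9: reviewing previously learned words (likewise below)
public final static int GRASP_67=6;
public final static int GRASP_45=4;
public final static int GRASP_23=2;
public final static int GRASP_01=0;
public final static int LEARN_NEW_WORD=10; // learning the day's new words
2. processLearnNewWord: the sub-progress within the new-word phase. Its possible values are:
public final static int STEP_1_NEWWORD1=0; // step 1: learn 10 new words
public final static int STEP_2_REVIEW_20=1; // step 2: review 20 old words
public final static int STEP_3_NEWWORD2=2; // step 3: learn about 10 more new words
public final static int STEP_4_REVIEW_6=3; // step 4: review about 6 old words
3. wordCount: the number of words already studied in the current stage.
When a new day starts, the app automatically resets process to GRASP_89 (how this is done will be covered later) and begins reviewing learned words, starting with the highest grasp levels and working down. Roughly 20% of all words at grasp level 8-9 are reviewed, 45% of level 6-7, 60% of level 4-5, 70% of level 2-3, and 85% of level 0-1. These proportions can be adjusted to taste (the WordBox code in this post passes 0.1, 0.3, 0.4, 0.5 and 0.5 instead). Once all old words have been reviewed, the new-word learning phase begins.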
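As a rough illustration of how a proportion turns into a number of review words, the sketch below counts how many words one level serves before the stage advances. It assumes the stage ends once wordCount >= count * percent, which is the check used in getWordByAccurateGrasp; ReviewQuotaSketch and wordsServed are hypothetical names.

```java
// Hypothetical helper: how many words one review level serves per day.
final class ReviewQuotaSketch {

    // count:   number of learned words at this grasp level
    // percent: proportion of them to review (for example 0.5)
    static int wordsServed(int count, double percent) {
        int served = 0;
        // A word is served while the counter is still below count * percent;
        // an empty level serves nothing and the stage advances immediately.
        while (count > 0 && served < count * percent) {
            served++;
        }
        return served;
    }
}
```

With 40 learned words at a level and a proportion of 0.5, this gives 20 review words; with 7 words and 0.3 it gives 3, because the counter has to reach 2.1 before the stage ends.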
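In the new-word phase, the app first presents ten new words, then switches to reviewing about 20 words at grasp level 0-1, then presents another 10 new words, then reviews about 6 words at grasp level 0-1, and then starts the next round. The new-word phase keeps cycling through these four steps until the day's study is finished.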
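The four-step cycle can be pictured with a stripped-down sketch. It uses fixed batch sizes of 10, 20, 10 and 6 and ignores both the database and the small random variation in batch size, so it is only a simplified model; NewWordCycleSketch, BATCH and nextStep are hypothetical names.

```java
// Hypothetical, deterministic model of the four-step new-word cycle.
final class NewWordCycleSketch {
    static final int STEP_1_NEWWORD1 = 0;
    static final int STEP_2_REVIEW_20 = 1;
    static final int STEP_3_NEWWORD2 = 2;
    static final int STEP_4_REVIEW_6 = 3;

    // Nominal batch size of each step (assumed fixed in this sketch).
    static final int[] BATCH = {10, 20, 10, 6};

    private int step = STEP_1_NEWWORD1;
    private int wordCount = 0;

    // Returns the step that serves the next word, then updates the counters.
    int nextStep() {
        int current = step;
        wordCount++;
        if (wordCount >= BATCH[step]) {
            step = (step + 1) % BATCH.length; // after step 4, wrap to step 1
            wordCount = 0;
        }
        return current;
    }
}
```

Calling nextStep() repeatedly yields ten 0s, twenty 1s, ten 2s and six 3s, and then the pattern repeats.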
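For words answered incorrectly, I added a queue, LinkedList<WordInfo> wrongWordList, to hold them. The code checks the queue periodically (about every 19 to 20 review words, and at the end of each review step in the new-word phase); if it is not empty, the state machine enters the wrong-word state. The program then pops the recently missed words in order until the queue is empty, after which the state machine returns to its previous state. The wrong-word state is controlled by the state variable processWrong, which takes priority over process: while processWrong is true the state machine serves wrong words first, and once processWrong goes back to false it resumes the original process.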
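The priority of processWrong over the normal flow can be sketched in isolation. The queue and flag names follow the text; WrongWordQueueSketch, recordWrong, checkpoint and popWrongOrNull are hypothetical, and plain strings stand in for WordInfo objects.

```java
import java.util.LinkedList;

// Hypothetical, database-free model of the wrong-word queue.
final class WrongWordQueueSketch {
    private final LinkedList<String> wrongWordList = new LinkedList<>();
    private boolean processWrong = false;

    // A wrong answer appends the word to the queue.
    void recordWrong(String word) {
        wrongWordList.offer(word);
    }

    // Periodic check: enter the wrong-word state only if the queue has words.
    void checkpoint() {
        if (!wrongWordList.isEmpty()) {
            processWrong = true;
        }
    }

    // While in the wrong-word state, returns queued words in order;
    // returns null when the normal process should supply the next word.
    String popWrongOrNull() {
        if (!processWrong) {
            return null;
        }
        String word = wrongWordList.poll();
        if (wrongWordList.isEmpty()) {
            processWrong = false; // queue drained: back to the normal process
        }
        return word;
    }

    boolean inWrongState() {
        return processWrong;
    }
}
```

After two wrong answers and one checkpoint, the next two pops return those two words in the order they were missed, and the third pop returns null so the regular flow takes over again.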
That is the basic idea. Below it is wrapped in a class, WordBox:
package com.carlos.wordcontainer;

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.Random;

import android.content.ContentValues;
import android.content.Context;
import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;

import com.carlos.database.DataBaseHelper;

public class WordBox {

    public Context context = null;
    public String tableName = null;
    private DataBaseHelper dbHelper = null;
    private SQLiteDatabase dbR = null, dbW = null;

    public final static int GRASP_89 = 8;
    public final static int GRASP_67 = 6;
    public final static int GRASP_45 = 4;
    public final static int GRASP_23 = 2;
    public final static int GRASP_01 = 0;
    public final static int LEARN_NEW_WORD = 10;

    public final static int LEARNED = 1;
    public final static int UNLEARNED = 0;

    public static int process = GRASP_89;        // overall learning progress
    public static int wordCount = 0;             // words studied in the current stage
    public static boolean processWrong = false;  // whether to serve wrong words next

    public final static int STEP_1_NEWWORD1 = 0;
    public final static int STEP_2_REVIEW_20 = 1;
    public final static int STEP_3_NEWWORD2 = 2;
    public final static int STEP_4_REVIEW_6 = 3;
    public static int processLearnNewWord = STEP_1_NEWWORD1;

    public LinkedList<WordInfo> wrongWordList = null;
    public Random rand = null;

    public WordBox(Context context, String tableName) {
        this.context = context;
        this.tableName = tableName;
        dbHelper = new DataBaseHelper(context, tableName);
        dbR = dbHelper.getReadableDatabase();
        dbW = dbHelper.getWritableDatabase();
        wrongWordList = new LinkedList<WordInfo>();
        rand = new Random();
    }

    @Override
    protected void finalize() throws Throwable {
        dbR.close();
        dbW.close();
        dbHelper.close();
        super.finalize();
    }

    public void removeWordFromDatabase(String word) {
        dbW.delete(tableName, "word=?", new String[]{word});
    }

    /**
     * Returns the number of words at a given grasp level.
     * Multiple conditions in a where clause must be joined with "and" or "or".
     */
    public int getWordCountByGrasp(int grasp, int learned) {
        Cursor cursor = dbR.query(tableName, new String[]{"word"}, "grasp=? and learned=?",
                new String[]{grasp + "", learned + ""}, null, null, null);
        int count = cursor.getCount();
        cursor.close();
        return count;
    }

    /** Percentage of words whose grasp level is between 3 and 10. */
    public int getTotalLearnProgress() {
        int learnCount = 0;
        int totalCount = 0;
        Cursor cursor = dbR.query(tableName, new String[]{"word"},
                "grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=?",
                new String[]{"3", "4", "5", "6", "7", "8", "9", "10"}, null, null, null);
        learnCount = cursor.getCount();
        Cursor cursorTotal = dbR.query(tableName, new String[]{"word"}, "word like?",
                new String[]{"%"}, null, null, null);
        totalCount = cursorTotal.getCount();
        cursor.close();
        cursorTotal.close();
        if (totalCount == 0) {
            return 0;
        }
        return (int) (((float) learnCount / (float) totalCount) * 100);
    }

    /** Number of words whose grasp level is still below 3. */
    public int getWordCountOfUnlearned() {
        Cursor cursorTotal = dbR.query(tableName, new String[]{"word"}, "word like?",
                new String[]{"%"}, null, null, null);
        int totalCount = cursorTotal.getCount();
        Cursor cursor = dbR.query(tableName, new String[]{"word"},
                "grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=?",
                new String[]{"3", "4", "5", "6", "7", "8", "9", "10"}, null, null, null);
        int learnCount = cursor.getCount();
        cursor.close();
        cursorTotal.close();
        return totalCount - learnCount;
    }

    /**
     * Picks a random word whose grasp level lies in [fromGrasp, toGrasp].
     * The learned flag distinguishes studied from unstudied words at grasp 0.
     */
    public WordInfo getWordByGraspByRandom(int fromGrasp, int toGrasp, int learned) {
        int totalCount = 0, temp = 0;
        ArrayList<Integer> graspsNotEmpty = new ArrayList<Integer>();
        for (int i = fromGrasp; i <= toGrasp; i++) {
            temp = getWordCountByGrasp(i, learned);
            totalCount += temp;
            if (temp > 0)
                graspsNotEmpty.add(i);  // remember grasp levels that have words
        }
        if (totalCount <= 0) {
            // No words in the given range. Callers should avoid passing an empty table.
            return null;
        }
        int length = graspsNotEmpty.size();
        if (length <= 0)
            return null;
        // Pick a random non-empty grasp level, then a random word at that level.
        int graspInt = graspsNotEmpty.get(rand.nextInt(length));
        int count = getWordCountByGrasp(graspInt, learned);
        // The cursor starts before the first row, so moving 1..count rows is valid.
        int index = rand.nextInt(count) + 1;
        Cursor cursor = dbR.query(tableName,
                new String[]{"word", "interpret", "right", "wrong", "grasp"},
                "grasp=? and learned=?", new String[]{graspInt + "", learned + ""},
                null, null, null);
        cursor.move(index);
        String word = cursor.getString(cursor.getColumnIndex("word"));
        String interpret = cursor.getString(cursor.getColumnIndex("interpret"));
        int wrong = cursor.getInt(cursor.getColumnIndex("wrong"));
        int right = cursor.getInt(cursor.getColumnIndex("right"));
        int grasp = cursor.getInt(cursor.getColumnIndex("grasp"));
        cursor.close();
        return new WordInfo(word, interpret, wrong, right, grasp);
    }

    /** Picks a random word from the whole word bank. */
    public static int lastGetIndex = 0;

    public WordInfo getWordByRandom() {
        int count = 0;
        Cursor cursor = dbR.query(tableName,
                new String[]{"word", "interpret", "right", "wrong", "grasp"},
                "word like?", new String[]{"%"}, null, null, null);
        if ((count = cursor.getCount()) <= 0) {
            cursor.close();
            return null;
        }
        // Try up to 6 times to avoid returning the same row as last time.
        int i = 0;
        int index = 0;
        while (i < 6) {
            index = rand.nextInt(count) + 1;
            if (index != lastGetIndex)
                break;
            i++;
        }
        lastGetIndex = index;
        cursor.move(index);
        String word = cursor.getString(cursor.getColumnIndex("word"));
        String interpret = cursor.getString(cursor.getColumnIndex("interpret"));
        int wrong = cursor.getInt(cursor.getColumnIndex("wrong"));
        int right = cursor.getInt(cursor.getColumnIndex("right"));
        int grasp = cursor.getInt(cursor.getColumnIndex("grasp"));
        cursor.close();
        return new WordInfo(word, interpret, wrong, right, grasp);
    }

    String[] logProcess = new String[]{"G01", "", "G23", "", "G45", "", "G67", "", "G89", "", "NEW WORD"};
    String[] logLearn = new String[]{"NEW1", "REVIEW20", "NEW2", "REVIEW6"};

    /** External entry point: called after a tap to get the next word. */
    public WordInfo popWord() {
        WordInfo wordInfo = null;
        // (Debug output of the state variables went here.)
        if (processWrong) {
            return getWrongWord();
        }
        // The cases deliberately fall through: when a stage has nothing left
        // to serve, the next stage is tried immediately.
        switch (process) {
        case GRASP_89: {
            if ((wordInfo = getWordByAccurateGrasp(GRASP_89, GRASP_67, 0.1)) != null)
                return wordInfo;
        }
        case GRASP_67: {
            if ((wordInfo = getWordByAccurateGrasp(GRASP_67, GRASP_45, 0.3)) != null)
                return wordInfo;
        }
        case GRASP_45: {
            if ((wordInfo = getWordByAccurateGrasp(GRASP_45, GRASP_23, 0.4)) != null)
                return wordInfo;
        }
        case GRASP_23: {
            if ((wordInfo = getWordByAccurateGrasp(GRASP_23, GRASP_01, 0.5)) != null)
                return wordInfo;
        }
        case GRASP_01: {
            if ((wordInfo = getWordByAccurateGrasp(GRASP_01, LEARN_NEW_WORD, 0.5)) != null)
                return wordInfo;
        }
        case LEARN_NEW_WORD: {
            return learnNewWord();
        }
        default:
            break;
        }
        return null;
    }

    /** Called with the user's answer to update the word's statistics. */
    public void feedBack(WordInfo wordInfo, boolean isRight) {
        if (wordInfo == null)
            return;  // guard against a possible null pointer
        String word = wordInfo.getWord();
        int right = wordInfo.getRight();
        int wrong = wordInfo.getWrong();
        int graspInt = 0;
        // Update the right/wrong counters.
        if (isRight) {
            right++;
        } else {
            wrong++;
        }
        // grasp = right - 2 * wrong, clamped to the range 0..10.
        if (right - 2 * wrong < 0) {
            graspInt = 0;
        } else if (right - 2 * wrong > 10) {
            graspInt = 10;
        } else {
            graspInt = right - 2 * wrong;
        }
        // Update the database. update() should only touch the columns put
        // into values; handled this way for now.
        ContentValues values = new ContentValues();
        values.put("right", right);
        values.put("wrong", wrong);
        values.put("grasp", graspInt);
        values.put("learned", LEARNED);
        dbW.update(tableName, values, "word=?", new String[]{word});
        // On a wrong answer, store the word in the wrong-word queue.
        if (isRight == false) {
            wordInfo.setRight(right);
            wordInfo.setWrong(wrong);
            wordInfo.setGrasp(graspInt);
            wrongWordList.offer(wordInfo);
        }
    }

    /** Serves words during the new-word phase. The cases fall through on purpose. */
    public WordInfo learnNewWord() {
        // An Easter egg is planned here.
        WordInfo wordInfo = null;
        switch (processLearnNewWord) {
        case STEP_1_NEWWORD1: {
            if ((wordInfo = getWordByGraspByRandom(GRASP_01, GRASP_01, UNLEARNED)) == null
                    || wordCount > rand.nextInt(3) + 9) {
                processLearnNewWord = STEP_2_REVIEW_20;
                wordCount = 0;
                // No unlearned words left: every word has been studied.
                if (getWordCountByGrasp(GRASP_01, UNLEARNED) <= 0) {
                    process = GRASP_89;
                }
            } else {
                wordCount++;
                return wordInfo;
            }
        }
        case STEP_2_REVIEW_20: {
            if ((wordInfo = getWordByGraspByRandom(0, 2, LEARNED)) == null) {
                processLearnNewWord = STEP_3_NEWWORD2;
                wordCount = 0;
            } else {
                wordCount++;
                if (wordCount > rand.nextInt(3) + 19) {
                    processLearnNewWord = STEP_3_NEWWORD2;
                    wordCount = 0;
                    if (wrongWordList.size() > 0)
                        processWrong = true;
                }
                return wordInfo;
            }
        }
        case STEP_3_NEWWORD2: {
            if ((wordInfo = getWordByGraspByRandom(GRASP_01, GRASP_01, UNLEARNED)) == null
                    || wordCount > rand.nextInt(3) + 9) {
                processLearnNewWord = STEP_4_REVIEW_6;
                wordCount = 0;
            } else {
                wordCount++;
                return wordInfo;
            }
        }
        case STEP_4_REVIEW_6: {
            if ((wordInfo = getWordByGraspByRandom(0, 2, LEARNED)) == null) {
                processLearnNewWord = STEP_1_NEWWORD1;
                wordCount = 0;
                // A non-null value should be returned here, otherwise the caller
                // risks a null pointer exception (the default branch would run).
                // The workaround is to fill the gap with a random word.
                return getWordByRandom();
            } else {
                wordCount++;
                if (wordCount > rand.nextInt(3) + 5) {
                    processLearnNewWord = STEP_1_NEWWORD1;
                    wordCount = 0;
                    if (wrongWordList.size() > 0)
                        processWrong = true;
                }
                return wordInfo;
            }
        }
        default:
            return null;
        }
    }

    /** Serves words during a review stage; returns null when the stage is finished. */
    public WordInfo getWordByAccurateGrasp(int currentGrasp, int nextGrasp, double percent) {
        int count = 0;
        if ((count = getWordCountByGrasp(currentGrasp, LEARNED)
                + getWordCountByGrasp(currentGrasp + 1, LEARNED)) <= 0
                || wordCount >= count * percent) {
            process = nextGrasp;
            wordCount = 0;
            return null;
        } else {
            wordCount++;
            // Periodically switch to wrong words, but only if there are any.
            if (wordCount % (rand.nextInt(2) + 19) == 0 && wrongWordList.size() > 0) {
                processWrong = true;
            }
            // Picking a single random level such as rand.nextInt(2)+currentGrasp
            // could return null when that level is empty, so the whole range is
            // passed and empty levels are excluded inside the method.
            return getWordByGraspByRandom(currentGrasp, currentGrasp + 1, LEARNED);
        }
    }

    /** Serves the next wrong word. Called only when the wrong-word queue is non-empty. */
    public WordInfo getWrongWord() {
        WordInfo word = null;
        word = wrongWordList.poll();
        if (wrongWordList.size() <= 0) {
            processWrong = false;  // stop serving wrong words
        }
        return word;
    }
}
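Some details are explained in the code comments. Callers use popWord() to obtain a WordInfo object, and feedBack to report the result (right or wrong) back to the database, which updates the word's grasp level.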
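The grasp update inside feedBack boils down to one formula: right minus twice wrong, clamped to the range 0 to 10. Here it is as a stand-alone sketch; GraspSketch and graspOf are hypothetical names, and the arithmetic is taken from the feedBack code in this post.

```java
// Hypothetical stand-alone version of the grasp formula used in feedBack.
final class GraspSketch {

    // right and wrong are the counters after the current answer was recorded.
    static int graspOf(int right, int wrong) {
        int raw = right - 2 * wrong; // a wrong answer costs two correct ones
        return Math.max(0, Math.min(10, raw));
    }
}
```

For example, 3 correct and 1 wrong answers give grasp 1, while 1 correct and 1 wrong give grasp 0 because the raw value of -1 is clamped.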
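There are also methods for fetching random words and so on, which will come up when we build the memorization UI. If anything is unclear, feel free to leave a comment.
Here is the WordInfo class as well: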
package com.carlos.wordcontainer;

public class WordInfo {

    public String word;
    public String interpret;
    public int wrong;
    public int right;
    public int grasp;

    public WordInfo(String word, String interpret, int wrong, int right, int grasp) {
        super();
        this.word = word;
        this.interpret = interpret;
        this.wrong = wrong;
        this.right = right;
        this.grasp = grasp;
    }

    public String getWord() {
        return word;
    }

    public void setWord(String word) {
        this.word = word;
    }

    public String getInterpret() {
        return interpret;
    }

    public void setInterpret(String interpret) {
        this.interpret = interpret;
    }

    public int getWrong() {
        return wrong;
    }

    public void setWrong(int wrong) {
        this.wrong = wrong;
    }

    public int getRight() {
        return right;
    }

    public void setRight(int right) {
        this.right = right;
    }

    public int getGrasp() {
        return grasp;
    }

    public void setGrasp(int grasp) {
        this.grasp = grasp;
    }
}
OK, that's it for this part. The algorithm side of word memorization is done; in the next part we will build the memorization UI. Stay tuned.