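A while ago I was looking for something to do and tried to see what I could build with the Douban API, which is how I ran into this problem: XML parsing.
Our teacher hasn't covered it yet, so I had to look it up myself.
There are two main models for parsing XML documents, SAX and DOM. Both can be used on iOS, so I won't go into much detail here; I chose SAX.
On iOS, SAX parsing is done with the built-in NSXML classes in Foundation. The core is the NSXMLParser class and its delegate protocol, NSXMLParserDelegate, and most of the actual parsing work happens in the class that implements NSXMLParserDelegate. The protocol defines many callback methods. As the SAX parser walks the XML document from top to bottom, these methods are triggered when it reaches the start of the document, a start tag, character data, an end tag, or the end of the document. There are quite a few of them; the five most commonly used are listed below.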
Triggered when the document starts:
-(void)parserDidStartDocument:(NSXMLParser *)parser
Triggered when a new start tag is encountered. namespaceURI is the namespace, qualifiedName is the qualified name, and attributes is a dictionary holding the element's attributes:
-(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
Triggered when character data is found:
-(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
Triggered when an end tag is encountered:
-(void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
Triggered when the document ends:
-(void)parserDidEndDocument:(NSXMLParser *)parser
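To watch these five callbacks fire in order without building a whole app, here is a minimal sketch. `EventTracer` and `TraceEvents` are names I made up for illustration; only `NSXMLParser` and the delegate methods come from Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper class (not from the original post): records the order
// in which NSXMLParser fires its delegate callbacks.
@interface EventTracer : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *events;
@end

@implementation EventTracer

- (void)parserDidStartDocument:(NSXMLParser *)parser {
    self.events = [NSMutableArray array];
    [self.events addObject:@"startDocument"];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    [self.events addObject:[@"start:" stringByAppendingString:elementName]];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.events addObject:[@"chars:" stringByAppendingString:string]];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    [self.events addObject:[@"end:" stringByAppendingString:elementName]];
}

- (void)parserDidEndDocument:(NSXMLParser *)parser {
    [self.events addObject:@"endDocument"];
}

@end

// Parses an XML string and returns the recorded callback sequence.
static NSArray *TraceEvents(NSString *xml) {
    EventTracer *tracer = [[EventTracer alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = tracer;   // the parser does not retain its delegate
    [parser parse];             // synchronous: all callbacks fire before this returns
    return tracer.events;
}
```

Calling `TraceEvents(@"<a>hi</a>")` should return an array that begins with `startDocument` and `start:a` and ends with `end:a` and `endDocument`. The text in between may arrive as one or several `chars:` entries.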
Let's walk through the whole call and parse sequence with a concrete example.
First, here is the XML file we are going to parse, "info.xml":
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <person id="1">
        <firstName>Wythe</firstName>
        <lastName>xu</lastName>
        <age>22</age>
    </person>
    <person id="2">
        <firstName>li</firstName>
        <lastName>si</lastName>
        <age>31</age>
    </person>
    <person id="3">
        <firstName>Dipen</firstName>
        <lastName>Shah</lastName>
        <age>24</age>
    </person>
</root>
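Next comes the header file, "ViewController.h":
#import <UIKit/UIKit.h>

@interface ViewController : UIViewController <NSXMLParserDelegate>

@property (nonatomic, strong) NSXMLParser *parser;
@property (nonatomic, strong) NSMutableArray *person;   // one dictionary per <person>
@property (nonatomic, strong) NSString *currenttag;     // name of the element being read

@end
And then its implementation file, "ViewController.m":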
#import "ViewController.h"

@interface ViewController ()
@end

@implementation ViewController

@synthesize parser = _parser, person = _person, currenttag = _currenttag;

- (id)initWithNibName:(NSString *)nibNameOrNil bundle:(NSBundle *)nibBundleOrNil
{
    self = [super initWithNibName:nibNameOrNil bundle:nibBundleOrNil];
    if (self) {
        // Custom initialization
    }
    return self;
}

- (void)viewDidLoad
{
    [super viewDidLoad];
    // Load info.xml from the app bundle and parse it synchronously.
    NSString *xmlFilePath = [[NSBundle mainBundle] pathForResource:@"info" ofType:@"xml"];
    NSData *data = [[NSData alloc] initWithContentsOfFile:xmlFilePath];
    self.parser = [[NSXMLParser alloc] initWithData:data];
    self.parser.delegate = self;
    [self.parser parse];
    NSLog(@"%@", _person);
}

- (void)didReceiveMemoryWarning
{
    [super didReceiveMemoryWarning];
    // Dispose of any resources that can be recreated.
}

#pragma mark delegate method

- (void)parserDidStartDocument:(NSXMLParser *)parser
{
    _person = [[NSMutableArray alloc] init];
    NSLog(@"start parse 1");
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    _currenttag = elementName;
    if ([_currenttag isEqualToString:@"person"]) {
        // A new <person> begins: create its dictionary and record the id attribute.
        NSMutableDictionary *dict = [[NSMutableDictionary alloc] init];
        NSString *_id = [attributeDict objectForKey:@"id"];
        if (_id) {   // setObject:forKey: throws on nil, so skip a missing id
            [dict setObject:_id forKey:@"id"];
        }
        [_person addObject:dict];
    }
    NSLog(@"start element");
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    NSMutableDictionary *dict = [_person lastObject];
    // Only text inside firstName, lastName or age is stored, under the tag's own name.
    // The parser may deliver one element's text in several pieces, so append
    // to any value already stored instead of overwriting it.
    NSArray *fields = @[@"firstName", @"lastName", @"age"];
    if (dict && _currenttag && [fields containsObject:_currenttag]) {
        NSString *previous = dict[_currenttag] ?: @"";
        dict[_currenttag] = [previous stringByAppendingString:string];
    }
    NSLog(@"found characters");
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    // Clear the current tag so whitespace between elements is ignored.
    _currenttag = nil;
    NSLog(@"end element");
}

- (void)parserDidEndDocument:(NSXMLParser *)parser
{
    NSLog(@"parse end");
}

@end
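One thing the listing above does not do is check whether parsing actually succeeded. `-parse` returns a BOOL, and on failure the parser's `parserError` property holds the reason (the delegate can also implement `parser:parseErrorOccurred:`). A minimal sketch, where `XMLWellFormednessError` is a helper name I invented for this example:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original project): returns nil when
// the data parses cleanly, otherwise the error reported by NSXMLParser.
static NSError *XMLWellFormednessError(NSData *data) {
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    BOOL ok = [parser parse];          // NO if parsing failed or was aborted
    return ok ? nil : parser.parserError;
}
```

In `viewDidLoad` the same idea would amount to testing the result of `[self.parser parse]` before logging `_person`.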
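Using breakpoints and the log output, we can see that the overall parse sequence is: document start, start tag, characters found, end tag, document end.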
Execution result:
2014-09-10 16:45:32.920 xmlforblog[3820:60b] start parse 1
2014-09-10 16:45:32.921 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.922 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.922 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.922 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.922 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.923 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.923 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.923 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.923 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.924 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.924 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.924 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.924 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.925 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.925 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.925 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.925 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.926 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.926 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.928 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.929 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.929 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.929 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.930 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.930 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.930 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.930 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.931 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.931 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.931 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.931 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.931 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.932 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.932 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.932 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.932 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.933 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.933 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.933 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.933 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.934 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.934 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.934 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.934 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.935 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.935 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.935 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.935 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.936 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.936 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.936 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.936 xmlforblog[3820:60b] parse end
2014-09-10 16:45:32.936 xmlforblog[3820:60b] (
    {
        age = 22;
        firstName = Wythe;
        id = 1;
        lastName = xu;
    },
    {
        age = 31;
        firstName = li;
        id = 2;
        lastName = si;
    },
    {
        age = 24;
        firstName = Dipen;
        id = 3;
        lastName = Shah;
    }
)
Our own processing happens mainly in two of these callbacks, the start-tag one and the found-characters one:
-(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
-(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
When we hit a start tag, we first check its name. If it is person, the data that follows belongs to that person, so we create a mutable dictionary up front to hold its values later.
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    _currenttag = elementName;
    if ([_currenttag isEqualToString:@"person"]) {
        NSMutableDictionary *dict = [[NSMutableDictionary alloc] init];
        NSString *_id = [attributeDict objectForKey:@"id"];
        if (_id) {   // setObject:forKey: throws on nil, so skip a missing id
            [dict setObject:_id forKey:@"id"];
        }
        [_person addObject:dict];
    }
    NSLog(@"start element");
}
When characters are found, we check the current tag name and save the text under that name in the dictionary we just created.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    NSMutableDictionary *dict = [_person lastObject];
    // The parser may deliver one element's text in several pieces, so append
    // to any value already stored instead of overwriting it.
    NSArray *fields = @[@"firstName", @"lastName", @"age"];
    if (dict && _currenttag && [fields containsObject:_currenttag]) {
        NSString *previous = dict[_currenttag] ?: @"";
        dict[_currenttag] = [previous stringByAppendingString:string];
    }
    NSLog(@"found characters");
}
By repeating this process over and over, we end up parsing the entire XML document.
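Because `parser:foundCharacters:` can be called more than once for a single element, another common arrangement is to collect text in a buffer and only store it when the end tag arrives. The sketch below is one way to do that for the same info.xml layout. `PersonCollector` and `ParsePeople` are hypothetical names, and this is a standalone variant, not the ViewController from this post.

```objc
#import <Foundation/Foundation.h>

// Hypothetical standalone delegate: buffers character data per element and
// commits it to the current person's dictionary when the element ends.
@interface PersonCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *people;
@property (nonatomic, strong) NSMutableString *buffer;
@end

@implementation PersonCollector

- (void)parserDidStartDocument:(NSXMLParser *)parser {
    self.people = [NSMutableArray array];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"person"]) {
        NSMutableDictionary *dict = [NSMutableDictionary dictionary];
        NSString *identifier = attributeDict[@"id"];
        if (identifier) {
            dict[@"id"] = identifier;
        }
        [self.people addObject:dict];
    }
    // Start a fresh buffer for the text of this element.
    self.buffer = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // A nil buffer (between elements) silently ignores the append.
    [self.buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    NSSet *fields = [NSSet setWithObjects:@"firstName", @"lastName", @"age", nil];
    NSMutableDictionary *dict = [self.people lastObject];
    if (dict && [fields containsObject:elementName]) {
        dict[elementName] = [self.buffer stringByTrimmingCharactersInSet:
                                [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    }
    self.buffer = nil;   // drop whitespace that follows the end tag
}

@end

// Parses an XML string shaped like info.xml into an array of dictionaries.
static NSArray *ParsePeople(NSString *xml) {
    PersonCollector *collector = [[PersonCollector alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    [parser parse];
    return collector.people;
}
```

The trade-off is one extra property, in exchange for values that are complete and trimmed no matter how the parser splits the text.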
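One more thing: this only covers ordinary documents. If, like I once did, you learn this and go straight to parsing the XML returned by the Douban API, you will find it doesn't work. That's because many sites have a lot of data and use namespaces to avoid clashing tag names, and parsing XML with namespaces works slightly differently from what is shown here.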
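As a small preview of why plain tag-name comparisons stop matching, here is a sketch that records the `elementName` values the delegate receives, with and without NSXMLParser's `shouldProcessNamespaces` option. `NamespaceProbe` and `ElementNames` are hypothetical names, and the sample document and its namespace URI in the usage note are made up for illustration; they are not Douban's actual format.

```objc
#import <Foundation/Foundation.h>

// Hypothetical probe: collects the elementName passed to didStartElement.
@interface NamespaceProbe : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *seen;
@end

@implementation NamespaceProbe

- (void)parserDidStartDocument:(NSXMLParser *)parser {
    self.seen = [NSMutableArray array];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    [self.seen addObject:elementName];
}

@end

// Returns the element names reported for xml, with namespace processing on or off.
static NSArray *ElementNames(NSString *xml, BOOL processNamespaces) {
    NamespaceProbe *probe = [[NamespaceProbe alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.shouldProcessNamespaces = processNamespaces;   // defaults to NO
    parser.delegate = probe;
    [parser parse];
    return probe.seen;
}
```

For a document such as `<entry xmlns:db="http://example.com/ns"><db:title>x</db:title></entry>`, the default setting reports the prefixed name `db:title`, so a comparison against `title` never matches. With `shouldProcessNamespaces` set to YES, `elementName` should be the local name `title`, with the namespace available in `namespaceURI`.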
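I'll write about parsing XML documents with namespaces in a future post, so stay tuned.
This post dragged on for almost a month, and today I finally finished it. I can't be this lazy from now on.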