Characteristics
XML handling has two major phases.
- Parse the XML document and build data (a tree of elements)
  - xmerl_scan (the whole element tree is processed at once)
  - SAX type parser
    - xmerl_eventp
    - erlsom_sax (developed by a 3rd party, not bundled with Erlang/OTP)
- Access (traverse) elements within the data
  - XPATH (xmerl_xpath)
  - XSLT (xmerl_xs)
  - callback (hook) function from a SAX type parser
  - hand-made logic
    - traverse the tree
    - extract tuples from the element tree (list) with the 'list comprehension' technique
So an XML processing approach is characterized by its combination of parsing method and access method.
Matrix of methods
parse method | access method | samples on the Web | by
xmerl_scan | xmerl_xpath | Parsing Atom with Erlang | Sam Ruby
xmerl_scan | xmerl_xpath (with useful MACRO) | XML processing in Erlang | Torbjörn Törnkvist
xmerl_scan | hand-made (traverse tree by lists:foldl) | Return Erlang Data from XML | Muharem Hrnjadovic
xmerl_scan | hand-made (use list comprehension) | | Hakan Mattson
xmerl_eventp | callback (hook) function | | Torbjörn Törnkvist
erlsom_sax | callback function | | Willem de Jong
Operation example
example-1 : xmerl_scan + xmerl_xpath
If you know exactly which elements you need and the source XML file is not too large, parse with xmerl_scan and access the elements with xmerl_xpath.
Note: as of Erlang/OTP R13B01, xmerl_xpath supports XPath 1.0.
Inspired by Torbjörn Törnkvist's code.
sample xml data ("e.xml")
<Envelope><Title>envelope title</Title>
<InnerEnv>
<IDNUM>403276</IDNUM>
<ItemName>Name String</ItemName>
<Pages>0</Pages>
</InnerEnv>
</Envelope>
code
-module(example1).
-export([go/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Extract {Name, Text} from an XPath result that holds exactly one
%% element whose first child is a text node.
-define(Val(X),
        (fun() ->
             [#xmlElement{name = N,
                          content = [#xmlText{value = V} | _]}] = X,
             {N, V}
         end)()).

go(File) ->
    {Xml, _} = xmerl_scan:file(File),
    [?Val(xmerl_xpath:string("/Envelope/Title", Xml)),
     ?Val(xmerl_xpath:string("//IDNUM", Xml)),
     ?Val(xmerl_xpath:string("//ItemName", Xml)),
     ?Val(xmerl_xpath:string("//Pages", Xml))].
results
1> example1:go("e.xml").
[{'Title',"envelope title"},
 {'IDNUM',"403276"},
 {'ItemName',"Name String"},
 {'Pages',"0"}]
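When only the text values are needed, the XPath expression itself can select the text nodes, so no pattern-matching macro is required. The following is a minimal sketch, not part of the original sample: the module name xpath_text_sketch and the helper values/2 are hypothetical, and it assumes that xmerl_xpath handles the text() node test.

```erlang
-module(xpath_text_sketch).
-export([values/2]).
-include_lib("xmerl/include/xmerl.hrl").

%% Hypothetical helper: parse an XML string and return the text values
%% selected by an XPath expression ending in text().
values(XmlString, Path) ->
    {Doc, _Rest} = xmerl_scan:string(XmlString),
    %% Keep only text nodes; any other node type is filtered out.
    [V || #xmlText{value = V} <- xmerl_xpath:string(Path, Doc)].
```

For example, values(Xml, "//IDNUM/text()") can be called with the sample document as a string. Unlike the macro, this returns an empty list instead of failing with badmatch when nothing matches.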
example-2 : xmerl_scan + traverse element tree by lists:foldl
If you want to translate the whole XML document into another schema, traverse the whole tree with the lists:foldl function.
Inspired by Muharem Hrnjadovic's code.
sample xml data ("e.xml")
<Envelope><Title>envelope title</Title>
<InnerEnv>
<IDNUM>403276</IDNUM>
<ItemName>Name String</ItemName>
<Pages>0</Pages>
</InnerEnv>
</Envelope>
code
-module(example2).
-export([go/1]).
-include_lib("xmerl/include/xmerl.hrl").

go(File) ->
    {R, _} = xmerl_scan:file(File),
    io:format("~p~n", [lists:reverse(traverse(R, []))]).

%% Fold over the tree; text nodes are selected by their chain of parents.
traverse(R, L) when is_record(R, xmlElement) ->
    lists:foldl(fun traverse/2, L, R#xmlElement.content);
traverse(#xmlText{parents = [{'Title', _}, _], value = V}, L) ->
    [{title, V} | L];
traverse(#xmlText{parents = [{'IDNUM', _}, _, _], value = V}, L) ->
    [{idnum, V} | L];
traverse(#xmlText{parents = [{'ItemName', _}, _, _], value = V}, L) ->
    [{itemname, V} | L];
traverse(#xmlText{parents = [{'Pages', _}, _, _], value = V}, L) ->
    [{pages, V} | L];
traverse(_R, L) ->
    L.
results
2> example2:go("e.xml").
[{title,"envelope title"},
 {idnum,"403276"},
 {itemname,"Name String"},
 {pages,"0"}]
ok
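The matrix also lists a hand-made approach based on list comprehensions, for which no code is given here. The following is a minimal sketch of that idea and not Hakan Mattson's code: the module name lc_sketch and the function inner_values/1 are hypothetical, and it assumes the sample layout where the values sit directly under InnerEnv.

```erlang
-module(lc_sketch).
-export([inner_values/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Hypothetical helper: return {Name, Text} for every child element of
%% <InnerEnv> whose first child is a text node.
inner_values(XmlString) ->
    {#xmlElement{content = Top}, _Rest} = xmerl_scan:string(XmlString),
    %% A record pattern in a generator acts as a filter, so whitespace
    %% text nodes and other elements are skipped silently.
    [{Name, Value}
     || #xmlElement{name = 'InnerEnv', content = Inner} <- Top,
        #xmlElement{name = Name,
                    content = [#xmlText{value = Value} | _]} <- Inner].
```

Compared with the lists:foldl version, the nesting of generators mirrors the nesting of the document, which keeps the code short for shallow, fixed layouts.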
example-3 : SAX type parsing operation
If computational resources are limited, the whole XML document cannot be processed at once. In that case, use a SAX type parser, whose callback functions process each element as soon as it has been parsed.
erlsom_sax is a well-known SAX type parser, and Willem de Jong has posted code based on it.
see http://blog.tornkvist.org/blog.yaws?id=1193209275268448
Here is code based on the xmerl_sax_parser library, inspired by the code above.
sample xml data ("e.xml")
<Envelope>
<Title>envelope title</Title>
<InnerEnv>
<IDNUM>403276</IDNUM>
<ItemName>Name String</ItemName>
<Pages>0</Pages>
</InnerEnv>
</Envelope>
code
-module(example3).
-export([go/1]).

go(File) ->
    %% State is {stack of open tags, accumulated {Tag, Text} pairs}.
    Options = [{event_fun, fun eventfun/3},
               {event_state, {[], []}}],
    case xmerl_sax_parser:file(File, Options) of
        {ok, {_Stack, Acc}, _Rest} -> lists:reverse(Acc);
        Other -> Other
    end.

eventfun({ignorableWhitespace, _}, _Location, State) ->
    State;
eventfun({startElement, _, Tag, _, _}, _Location, {Stack, Acc}) ->
    {[Tag | Stack], Acc};
eventfun({characters, Value}, _Location, {[Tag | _] = Stack, Acc}) ->
    {Stack, [{Tag, Value} | Acc]};
eventfun({endElement, _, _, _}, _Location, {[_ | Rest], Acc}) ->
    {Rest, Acc};
eventfun(_Event, _Location, State) ->
    State.
results
6> example3:go("e.xml").
[{"Title","envelope title"},
{"IDNUM","403276"},
{"ItemName","Name String"},
{"Pages","0"}]
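The point of the SAX style is that the event state can stay small no matter how large the document is. As a minimal sketch of that idea, the following counts elements while keeping only an integer as state. The module name sax_count_sketch and the function count/1 are hypothetical; the sketch assumes xmerl_sax_parser:stream/2 is given the document as a binary.

```erlang
-module(sax_count_sketch).
-export([count/1]).

%% Hypothetical helper: count the elements in an XML binary without
%% building a tree; the event state is just a running integer.
count(XmlBinary) ->
    Options = [{event_fun, fun on_event/3}, {event_state, 0}],
    {ok, Count, _Rest} = xmerl_sax_parser:stream(XmlBinary, Options),
    Count.

%% Each opening tag increments the counter; all other events pass through.
on_event({startElement, _Uri, _Local, _QName, _Attrs}, _Location, N) ->
    N + 1;
on_event(_Event, _Location, N) ->
    N.
```

The same shape works with xmerl_sax_parser:file/2 as used above; only the entry point changes.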
You may want to look at vtd-xml which is a lot easier than SAX...
There is also xmerl_sax_parser, new in R13.
ReplyDeleteThe tribe standard is very good continues to refuel
ReplyDelete..............................................
日本美女寫真dvd ■色妹妹情愛影音網 ■美女寫真集 ■火辣淫妹視訊網 ■sex520免費影片 ■免費視訊聊天秀 ■台灣kiss色情貼圖 ■成人免費聊天 ■免費視訊妹 ■上班族聊天室F1 ■日本色情網站 ■視訊色妹妹 ■美女寫真集影片 ■視訊聊天交友網 ■免費視訊聊天區 ■173 視訊美女聊天 ■成人性站 ■金瓶梅影音視訊聊天室 ■台灣美女寫真集圖片館 ■免費情色視訊 ■免費視訊秀 ■美眉-交友視訊 ■成人片分享區 ■免費視訊辣妹聊天室 ■美女寫真集圖片 ■2009真情寫真 ■18成人色情小說網 ■免費視訊聊天網 ■日本美女寫真集 ■後宮視訊情色網 ■sex888影片分享區 ■後宮電影院 ■美女影片試看 ■火辣妹妹視訊網 ■173影音視訊live秀 ■我愛78論壇 ■免費視訊交友 ■成人片線上看 ■免費視訊秀視訊交友 ■日本免費視訊 ■
ReplyDelete