This package implements a simple HTML parser.
Example:
(parse-html-string "<html><head><title>Test</title></head>
<body><h1>Little Test</h1>
<p>How dy? <a href=\"/check.html\">Check this</a></p>
<ul><li>one<li>two<li>three</ul></body></html>")
--> ((:html nil (:head nil (:title nil "Test")) "
" (:body nil (:h1 nil "Little Test") "
" (:p nil "How dy? " (:a (:href "/check.html") "Check this")) "
" (:ul nil (:li nil "one" (:li nil "two" (:li nil "three")))))))
Sexp html format:
element ::= (tag (&rest attributes) &rest contents) .
tag ::= (or symbol string) . -- usually a keyword
attributes ::= list of (name value) .
contents ::= list of element | string .
name ::= (or symbol string) . -- usually a keyword.
value ::= string .
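The grammar above can be explored with a few list operations. The following is a minimal sketch, not part of the package: the SKETCH- names are hypothetical helpers introduced only for illustration (the package documents its own accessors, such as ELEMENT-TAG and ELEMENT-CHILDREN), and the attribute list is assumed to be a flat name/value list, as in the (:href "/check.html") example above.

```lisp
;; Hypothetical helpers, not part of the package.
;; An element is treated as (tag attributes . contents).
(defun sketch-tag (element) (first element))
(defun sketch-attributes (element) (second element))
(defun sketch-contents (element) (cddr element))

(defun sketch-collect-text (node)
  "Return the list of strings found in NODE, depth first."
  (if (stringp node)
      (list node)
      (loop for child in (sketch-contents node)
            append (sketch-collect-text child))))

;; A fragment of the parsed example shown earlier.
(defparameter *sketch-element*
  '(:p nil "How dy? " (:a (:href "/check.html") "Check this")))

(sketch-collect-text *sketch-element*)
;; => ("How dy? " "Check this")

;; Assumption: attributes are a flat name/value list, so GETF applies.
(getf (sketch-attributes (second (sketch-contents *sketch-element*))) :href)
;; => "/check.html"
```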
License:
AGPL3
Copyright Pascal J. Bourguignon 2003 - 2015
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program.
If not, see <http://www.gnu.org/licenses/>
|
(parse-html-file pathname &key verbose external-format) |
function |
DO: Parse the HTML file PATHNAME.
VERBOSE: When true, writes some information to *TRACE-OUTPUT*.
EXTERNAL-FORMAT: The external format used to open the HTML file.
RETURN: A list of HTML elements.
SEE ALSO: ELEMENT-TAG, ELEMENT-ATTRIBUTES, ATTRIBUTE-NAMED, ELEMENT-CHILDREN.
|
(parse-html-stream stream &key verbose) |
function |
DO: Parse the HTML stream STREAM.
VERBOSE: When true, writes some information to *TRACE-OUTPUT*.
RETURN: A list of HTML elements.
SEE ALSO: ELEMENT-TAG, ELEMENT-ATTRIBUTES, ATTRIBUTE-NAMED, ELEMENT-CHILDREN.
|
(parse-html-string string &key start end verbose) |
function |
DO: Parse the HTML in STRING (between START and END).
VERBOSE: When true, writes some information to *TRACE-OUTPUT*.
RETURN: A list of HTML elements.
SEE ALSO: ELEMENT-TAG, ELEMENT-ATTRIBUTES, ATTRIBUTE-NAMED, ELEMENT-CHILDREN.
|
(unparse-html html &optional stream) |
function |
Writes the HTML source reconstituted from the HTML sexp back onto STREAM.
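To illustrate what turning the sexp format back into HTML text involves, here is a minimal sketch. It is not the package's implementation of UNPARSE-HTML: SKETCH-UNPARSE is a hypothetical name, attributes are assumed to be a flat name/value list as in the parsing example, and the sketch does no character escaping and always emits a closing tag.

```lisp
;; Hypothetical illustration only, not the package's UNPARSE-HTML.
(defun sketch-unparse (node &optional (stream *standard-output*))
  "Write NODE (a string or an element) to STREAM as HTML text."
  (if (stringp node)
      (write-string node stream)
      (destructuring-bind (tag attributes &rest contents) node
        ;; ~( ... ~) downcases the printed tag or attribute name.
        (format stream "<~(~A~)" tag)
        ;; Assumption: attributes are a flat name/value list.
        (loop for (name value) on attributes by #'cddr
              do (format stream " ~(~A~)=\"~A\"" name value))
        (write-char #\> stream)
        (dolist (child contents)
          (sketch-unparse child stream))
        (format stream "</~(~A~)>" tag)))
  (values))

(with-output-to-string (out)
  (sketch-unparse '(:p nil "How dy? " (:a (:href "/check.html") "Check this"))
                  out))
;; => "<p>How dy? <a href=\"/check.html\">Check this</a></p>"
```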
|
(write-html-text html &optional stream) |
function |
Writes a textual rendering of the HTML on STREAM. Some reStructuredText formatting is used. Simple tables are rendered, but colspan and rowspan are ignored.