Secure HTML parser and filter

Screenshot of the Web user interface to test the secure HTML filter class
Package:
Secure HTML parser and filter
Summary:
Parse and filter insecure HTML tags and CSS styles
Groups:
HTML, Security
Author:
Manuel Lemos
Description:
This package can be used to parse and filter insecure HTML tags and CSS styles.

It comes with a general purpose markup parser class that can parse any type of markup documents like HTML, XML and DTD files.

There are several other classes that can be chained together to retrieve the document token elements returned by the main markup parser class and filter the document elements in an useful way.

The markup validator filter class validates a document against a DTD, eventually removing invalid tags and attributes.

The safe HTML filter class uses several white lists to process HTML tags and data returned by the markup validator class and discards potentially harmful HTML tags and CSS that could be used to perform cross-site scripting (XSS) or cross-site request forgery (CSRF) security attacks.

The filtered HTML tokens can be reassembled to return a well-formed and secure HTML document.

The HTML links filter class can extract the links contained in an HTML document.

The DTD parser and CSS parser are utility classes used by the other classes.


Powered by Gewgley