XML Entity Expansion in PHP
Vulnerable example
The following PHP script parses the content of an uploaded XML file using the php-xml
library.
<?php
$xml = file_get_contents($_FILES["file_to_upload"]["tmp_name"]);
libxml_disable_entity_loader(false);
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NOENT);
echo($dom->textContent);
?>
The script parses XML data unsafely. An attacker can upload a crafted XML document that declares the /etc/passwd system file as an external entity, and the script discloses the file's contents when the parser expands the entity reference.
<!DOCTYPE d [<!ENTITY e SYSTEM "/etc/passwd">]><t>&e;</t>
Prevention
In PHP, several XML processing libraries rely on the libxml2 library to do the actual parsing, and libxml2 may resolve external entities depending on how it is used and on the libxml2 version installed on the system.
The libxml2 parser substitutes entities when the LIBXML_NOENT option is passed to the loading function. To prevent the code from resolving any external entity, libxml_disable_entity_loader(true) can be called to disable entity loading regardless of the parser options. This setting may break other libxml2-based functions that deal with URIs. The function is deprecated as of PHP 8.0, which requires libxml2 2.9.0 or later, where external entities are not loaded unless LIBXML_NOENT is requested.
The fixed code parses the XML without loading external entities.
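<?php
$xml = file_get_contents($_FILES["file_to_upload"]["tmp_name"]);
// Disable the external entity loader before parsing.
libxml_disable_entity_loader(true);
$dom = new DOMDocument();
// LIBXML_NOENT is no longer passed, so entities are not substituted.
$dom->loadXML($xml);
echo($dom->textContent);
?>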
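On PHP 8.0 and later, libxml_disable_entity_loader() is deprecated, so the same goal can be approached through parser options alone. The following is a minimal sketch, not a definitive implementation: the helper name parse_untrusted_xml() is hypothetical, and rejecting every document that carries a DOCTYPE is an assumption that only suits applications which never need DTDs.

```php
<?php
// Hypothetical helper: parse untrusted XML without expanding entities.
function parse_untrusted_xml(string $xml): DOMDocument
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // LIBXML_NOENT and LIBXML_DTDLOAD are deliberately NOT passed.
    // LIBXML_NONET additionally forbids network access while parsing.
    $ok = $dom->loadXML($xml, LIBXML_NONET);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException('Malformed XML');
    }

    // Assumption: the application never needs a DTD, so any DOCTYPE
    // (the only place entities can be declared) is refused outright.
    if ($dom->doctype !== null) {
        throw new InvalidArgumentException('DOCTYPE declarations are not allowed');
    }

    return $dom;
}
```

With this helper, the upload script would call parse_untrusted_xml($xml) inside a try/catch block and echo the text content only when no exception is thrown.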
Other functions that may be exploited through XXE:
XMLReader::read()
DOMDocument::loadXML()
DOMDocument::loadHTML()
simplexml_load_string()
simplexml_load_file()
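The same precautions apply to the functions listed above. As an illustration for SimpleXML, the sketch below loads a string without LIBXML_NOENT and refuses documents that declare a DOCTYPE. The helper name load_simplexml_untrusted() is hypothetical, and the blanket DOCTYPE rejection is an assumption that may be too strict for applications that rely on DTDs.

```php
<?php
// Hypothetical helper: SimpleXML counterpart of a hardened XML loader.
function load_simplexml_untrusted(string $xml): SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // No LIBXML_NOENT, so entity references are not substituted;
    // LIBXML_NONET forbids network access while parsing.
    $sx = simplexml_load_string($xml, SimpleXMLElement::class, LIBXML_NONET);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Compare strictly with false: an empty element is also "falsy".
    if ($sx === false) {
        throw new InvalidArgumentException('Malformed XML');
    }

    // Inspect the underlying DOM document for a DOCTYPE declaration.
    $doc = dom_import_simplexml($sx)->ownerDocument;
    if ($doc->doctype !== null) {
        throw new InvalidArgumentException('DOCTYPE declarations are not allowed');
    }

    return $sx;
}
```

A call such as load_simplexml_untrusted('<t>hi</t>') returns the root element, while the /etc/passwd payload shown earlier is rejected because it declares a DOCTYPE.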
References
OWASP - XML External Entity (XXE) Processing
OWASP - XML External Entity Prevention Cheat Sheet