Hacking Articles|Raj Chandel's Blog: Comprehensive Guide on XXE Injection

XML is a markup language that is commonly used in web development. It is used for storing and transporting data. So, today in this article, we will learn how an attacker can use this vulnerability to gain the information and try to defame web-application.

XXE Testing Methodology:

● Introduction to XML

● Introduction to XXE Injection

● Impacts

● XXE For SSRF

o Local File

o Remote File

● XXE Billion Laugh Attack

● XXE Using File Upload

● XXE to Remote Code Execution

● Countermeasures

Introduction to XML

What are XML and Entity?

XML stands for “Extensible Markup Language”,It is the most common language for storing and transporting data. It is a self-descriptive language. It does not contain any predefined tags like <p>, <img>, etc. All the tags are user-defined depending upon the data it is representing for example. <email></email>, <message></message> etc.

● Version: It is used to specify what version of XML standard is being used.

o Values: 1.0

● Encoding: It is declared to specify the encoding to be used. Default encoding that is used in XML is UTF-8.

o Values: UTF-8, UTF-16, ISO-10646-UCS-2, ISO-10646-UCS-4, Shift_JIS, ISO-2022-JP, ISO-8859-1 to ISO-8859-9, EUC-JP

● Standalone: It informs the parser if the document has any link to external source or there is any reference to external document. Default value is no.

o Values: yes, no

What is an Entity?

Like there are variable in programming languages we have XML Entity. They are the way of representing data that are present inside an XML document. There are various built-in entities in XML language like < and > which are used for less than and greater than in XML language. All of these are metacharacters that are generally represented using entities that appear in data. XML external entities are the entities which are located outside DTD.

The declaration of an external entity uses the SYSTEM keyword and must specify a URL from which the value of the entity should be loaded. For example

<!ENTITY ignite SYSTEM "URL">

In this syntax Ignite is the name of the entity,

SYSTEM is the keyword used,

URL is the URL that we want to get by performing XXE attack.

What is the Document Type Definition (DTD)?

It is used for declaration of the structure of XML document, types of data value that it can contain, etc. DTD can be present inside the XML file or can be defined separately. It is declared at the beginning of XML using <!DOCTYPE>.

There are several types of DTDs and the one we are interested in is external DTDs. There are two types of external DTDs:

SYSTEM: System identifier enables us to specify the external file location that contains the DTD declaration.

<!DOCTYPE ignite SYSTEM "URL" [...] >

PUBLIC: Public identifiers provide a mechanism to locate DTD resources and are written as below −

As you can see, it begins with the keyword PUBLIC, followed by a specialized identifier. Public identifiers are used to identify an entry in a catalogue.

<!DOCTYPE raj PUBLIC "URL"

Introduction to XXE

An XXE is a type of attack that is performed against an application in order to parse its XML input. In this attack XML input containing a reference to an external entity is processed by a weakly configured XML parser. Like in Cross-Site Scripting (XSS) we try to inject scripts similarly in this we try to insert XML entities to gain crucial information.

There are several types of DTDs and the one we are interested in is external DTDs. There are two types of external DTDs:

SYSTEM: System identifier enables us to specify the external file location that contains the DTD declaration.

In this XML external entity payload is sent to the server and the server sends that data to an XML parser that parses the XML request and provides with the desired output to the server. Then server returns that output to the attacker.

Impacts

XML External Entity (XXE) can possess a severe threat to a company or a web developer. XXE has always been in Top 10 list of OWASP. It is common as lots of website uses XML in the string and transportation of data and if the countermeasures are not taken then this information will be compromised. Various attacks that are possible are:

● Server-Side Request Forgery

● DoS Attack

● Remote Code Execution

● Cross-Site Scripting

The CVSS score of XXE is 7.5 and its severity is Medium with –

● CWE-611: Improper Restriction of XML External Entity.

● CVE-2019-12153: Local File SSRF

● CVE-2019-12154: Remote File SSRF

● CVE-2018-1000838: Billion Laugh Attack

● CVE-2019-0340: XXE via File Upload

Performing XXE Attack to perform SSRF:

Server-Side Request Forgery (SSRF) is a web vulnerability where the hacker injects server-side HTML codes to get control over the site or to redirect the output to the attacker's server. File types for SSRF attacks are –

Local File:

These are the files that are present on the website domain like robots.txt, server-info, etc. So, let’s use “bWAPP” to perform an XXE attack at a level set to low.

Now we will fire up our BurpSuite and intercept after pressing Any Bugs? button and we will get the following output on burp:

We can see that there is no filter applied so XXE is possible so we will send it to the repeater and there we will perform our attack. We will try to know which field is vulnerable or injectable because we can see there are two0 fields i.e., login and secret.

So, we will test it as follows:

In the repeater tab, we will send the default request and observe the output in the response tab.

It says “bee’s secret has been reset” so it seems that login is injectable but let's verify this by changing it from bee and then sending the request.

Now again we will be observing its output in response tab:

We got the output “ignite’s secret has been reset” so it makes it clear that login is injectable. Now we will perform our attack.

Now as we know which field is injectable, let’s try to get robots.txt file. And for this, we’ll be using the following payload –

<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE reset [

<!ENTITY ignite SYSTEM "http://192.168.1.15/bWAPP/robots.txt">

<reset><login>&ignite;</login><secret>Any bugs?</secret></reset>

Understanding the payload:

We have declared a doctype with the name “reset” and then inside that declared an entity named “ignite”. We are using SYSTEM identifier and then entering the URL to robots.txt. Then in login, we are entering “&ignite;” to get the desired information.

After inserting the above code, we will click on send and will get output like below in the response tab:

We can see in the above output that we got all the details that are present in the robots.txt. This tells us that SSRF of the local file is possible using XXE.

So now, let's try to understand how it all worked. Firstly, we will inject the payload and it will be passed on to the server and as there are no filters present to avoid XXE the server sends the request to an XML parser and then sends the output of the parsed XML file. In this case, robots.txt was disclosed to the attacker using XML query.

Remote File:

These are the files that attacker injects a remotely hosted malicious scripts in order to gain admin access or crucial information. We will try to get /etc/passwd for that we will enter the following command.

<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE reset [

<!ENTITY ignite SYSTEM "file:///etc/passwd">

]><reset><login>&ignite;</login><secret>Any bugs?</secret></reset>

After entering the above command as soon as we hit the send button we’ll be reflected with the passwd file !!

XXE Billion Laugh Attack-DOS

These are aimed at XML parsers in which both, well-formed and valid, XML data crashes the system resources when being parsed. This attack is also known as XML bomb or XML DoS or exponential entity expansion attack.

Before performing the attack, lets know why it is known as Billion Laugh Attack?

“For the first time when this attack was done, the attacker used lol as the entity data and the called it multiple times in several following entities. It took exponential amount of time to execute and its result was a successful DoS attack bringing the website down. Due to usage of lol and calling it multiple times that resulted in billions of requests we got the name Billion Laugh Attack”

Before using the payload lets understand it:

In this, we see that at 1 we have declared the entity named “ignite” and then calling ignite in several other entities thus forming a chain of callbacks which will overload the server. At 2 we have called entity &ignite9; We have called ignite9 instead of ignite as ignite9 calls ignite8 several times and each time ignite8 is called ignite7 is initiated and so on. Thus, the request will take an exponential amount of time to execute and as a result, the website will be down.

Above command results in DoS attack and the output that we got is:

Now after entering the XML command we will not see any output in response field and also bee box is not accessible and it will be down.

XXE Using File Upload

XXE can be performed using the file upload method. We will be demonstrating this using Port Swigger lab “Exploiting XXE via Image Upload”. The payload that we will be using is:

<?XML version="1.0" standalone="yes"?>

<!DOCTYPE reset [

<!ENTITY xxe SYSTEM "file:///etc/hostname"> ] >

</svg>

Understanding the payload: We will be making an SVG file as only image files are accepted by the upload area. The basic syntax of the SVG file is given above and in that, we have added a text field that will

We will be saving the above code as “payload.svg”. Now on portswigger, we will go on a post and comment and then we will add the made payload in the avatar field.

Now we will be posting the comment by pressing Post Comment button. After this, we will visit the post on which we posted our comment, and we will see our comment in the comments section.

Let’s check its page source in order to find the comment that we posted. You will find somewhat similar to what I got below

We will be clicking on the above link and we will get the flag in a new window as follows:

This can be verified by submitting the flag and we will get the success message.

Understanding the whole concept: So, when we uploaded the payload in the avatar field and filled all other fields too our comment was shown in the post. Upon examining the source file, we got the path where our file was uploaded. We are interested in that field as our XXE payload was inside that SVG file and it will be containing the information that we wanted, in this case, we wanted”/etc/domain”. After clicking on that link, we were able to see the information.

XXE to Remote code Execution

Remote code execution is a very server web application vulnerability. In this an attacker is able to inject its malicious code on the server in order to gain crucial information. To demonstrate this attack I have used XXE LAB. We will follow below steps to download this lab and to run this on our Linux machine:

git clone https://github.com/jbarone/xxelab.git

cd xxelab

vagrant up

In our terminal we will get somewhat similar output as following:

Now once it's ready to be use we will open the browser and type: http://192.168.33.10/ and we will see the site looks like this:

We will be entering our details and intercepting the request using Burp Suite. In Burp Suite we will see the request as below:

We will send this request to repeater and we will see which field is vulnerable. So, firstly we will send the request as it is and observe the response tab:

We can notice that we see only email so we will further check with one more entry to verify that this field is the vulnerable one among all the fields.

From the above screenshot it's clear that the email field is vulnerable. Now we will enter our payload:

<!DOCTYPE root [

<!ENTITY ignite SYSTEM "expect://id"> ]>

Lets understand the payload before implementing it:

We have created a doctype with the name ”root” and under that we created an entity named “ignite” which is asking for “expect://id”. If expect is being accepted in a php page then remote code execution is possible. We are fetching the id so we used “id” in this case.

And we can see that we got the uid,gid and group number successfully. This proves that our remote code execution was successful in this case.

Mitigation Steps –

● The safest way to prevent XXE is always to disable DTDs (External Entities) completely. Depending on the parser, the method should be similar to the following:

factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

● Also, DoS attacks can be prevented by disabling DTD. If it is not possible to disable DTDs completely, then external entities and external document type declarations must be disabled in the way that's specific to each parser.

● Another method is using CDATA for ignoring the external entities. CDATA is character data which provides a block which is not parsed by the parser.

<data><!CDATA [ "'& > characters are ok in here] ]></data>

XSS via XXE

Nowadays we can see that scripts are blocked by web applications so there is a way of trespassing this. We can use the CDATA of XML to carry out this attack. We will also see CDATA in our mitigation step. We have used the above XXE LAB to perform XSS. So, we have the same intercepted request as in the previous attack and we know that the email field is vulnerable so we will be injecting our payload in that field only. Payload that we gonna use is as below:

<![CDATA[<]]>img src="" onerror=javascript:alert(1)<![CDATA[>]]>

Understanding the payload: As we know that in most of the input fields < and > are blocked so we have included it inside the CDATA. CDATA is character data and the data inside CDATA is not parsed by XML parser and is as it is pasted in the output.

Lets see this attack:

We will enter the above command in between the email field and we will observe the output in the response tab.

We can see that we have got the image tag embedded in the field with our script. We will right click on it and select the option “Show response in browser”

We will copy the above link and paste it in the browser and we will be shown an alert box saying “1” as we can observe in the below screenshot.

So, the screenshot makes us clear that we were able to do Cross-Site Scripting using XML.

JSON and Content Manipulation

JSON is JavaScript Object Notation which is also used for storing and transporting data like XML. We can convert JSON to XML and still get the same output as well as get some juicy information using it. We can also do content manipulation so that XML can be made acceptable. We will be using WebGoat for this purpose. In WebGoat we will be performing an XXE attack.

We can see that the intercepted request looks like above. We will change its content-type and replace JSON with XML code. XML code that we will be using is:

<?xml?>
<!DOCTYPE root [
<!ENTITY ignite SYSTEM "file:///">
]>
<comment>
<text>
&ignite;
</text>
</comment>

We will be observing that our comment will be posted with the root file.

So in this we learnt how we can perform XML injection on JSON fields and also how we can pass XML by manipulating its content-type.

Let us understand what happened above:

JSON is the same as XML language so we can get the same output using XML as we will expect from a JSON request. In the above we saw that JSON has text value so we replaced the JSON request with the above payload and got the root information. If we would have not changed its content type to application/xml then our XML request would not have been passed.

Blind XXE

As we have seen in above attacks we were seeing which field is vulnerable. But, when there is a different output on our provided input then we can use Blind XXE for this purpose. We will be using portswigger lab for demonstrating Blind XXE. For this we will be using burp collaborator which is present in BurpSuite professional version only. We are using a lab named “Blind XXE with out-of-band interaction via XML parameter Entities”. When we visit the lab we will see a page like below:

We will click on View details and we will be redirected to the below page in which we will be intercepting the “check stock” request.

We will be getting intercepted request as below:

We can see that if we normally send the request we will get the number of stocks. Now we will fire up the burp collaborator from the burp menu and we will see the following window.