Metadata mining: Don't let them follow your breadcrumbs!

Oct 1, 2014 • pentest

Information about metadata and how it can be used to find out more than you originally thought

Metadata means data about data. Sounds too weird? Well, you can think of it as a very small file that stores information about a bigger file. Oops, now that’s strange: a file’s info saved in another file (doesn’t that go into an infinite loop?). For the purpose of understanding, metadata is the section of a file that describes what the file is, providing additional information that is generally not directly visible to the user. Usually, when a file is created, several pieces of information such as “last modified” and “author who modified” are saved within the file as metadata. For example, an image file usually contains meta information like “Width”, “Height”, “Resolution” and “Compression”. So what is the risk in adding a little more info about the file? Let us analyse it.
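To make the image example concrete, here is a minimal Python sketch that reads the “Width”, “Height” and “Compression” fields straight out of the header of a PNG file. The function name `read_png_header` and the sample bytes are made up for illustration. It only looks at the mandatory IHDR chunk that starts every PNG and ignores everything else in the file.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes) -> dict:
    """Return the metadata stored in the IHDR chunk of a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # A chunk is: 4-byte big-endian length, 4-byte type, then the chunk data.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing IHDR chunk")
    (width, height, bit_depth, color_type,
     compression, _filter, interlace) = struct.unpack(">IIBBBBB", data[16:29])
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": color_type,
        "compression": compression,
        "interlace": interlace,
    }


# Hypothetical sample: only the first 29 bytes of a 640x480 PNG, built by hand.
sample = (
    PNG_SIGNATURE
    + struct.pack(">I4s", 13, b"IHDR")
    + struct.pack(">IIBBBBB", 640, 480, 8, 2, 0, 0, 0)
)
print(read_png_header(sample))
```

None of these values are visible when you simply look at the picture, yet anyone who has the file can read them.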

The metadata inside an image file may not seem very useful to attackers, although it has mattered before: an Anonymous hacker named w0rmer from the team CabinCr3w was caught by the police after they got his GPS coordinates from an image he uploaded. But metadata means more than just file properties. Consider a developer designing a website for a client: chances are high that he will add meaningful comments to the code so that the client can understand it better. But are the comments useful only to the client? Let us take a piece of code and analyse this (code sample taken from OWASP):

   <div class="table2">
     <div class="col1">1</div><div class="col2">Mary</div>
     <div class="col1">2</div><div class="col2">Peter</div>
     <div class="col1">3</div><div class="col2">Joe</div>
   <!-- Query: SELECT id, name FROM app.users WHERE active='1' -->
   </div>

Did you spot any issue in the above code? There is a comment that exposes how the data is fetched from the database. An attacker who reads it learns the table and column names and can try an SQL injection to gain unauthorized access to the database.
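Finding such comments does not take any special skill. As a rough sketch, the snippet below uses Python's standard `html.parser` module to list every comment in a page. The class name `CommentCollector` and the helper `find_comments` are made up for this example, and the page is a shortened copy of the sample above.

```python
from html.parser import HTMLParser
from typing import List


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment the parser encounters."""

    def __init__(self) -> None:
        super().__init__()
        self.comments: List[str] = []

    def handle_comment(self, data: str) -> None:
        # Called by HTMLParser with the text between <!-- and -->.
        self.comments.append(data.strip())


def find_comments(html: str) -> List[str]:
    """Return all comments found in the given HTML string."""
    parser = CommentCollector()
    parser.feed(html)
    parser.close()
    return parser.comments


page = """
<div class="table2">
  <div class="col1">1</div><div class="col2">Mary</div>
<!-- Query: SELECT id, name FROM app.users WHERE active='1' -->
</div>
"""

for comment in find_comments(page):
    print(comment)
```

Run the same helper against your own pages before deployment to see exactly what a visitor could read in the source.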

So what is the best way to keep attackers from mining your metadata? The answer is as simple as you might have thought. Either don’t put comments or other sensitive information in the code you write, or make sure you remove the comments and other unneeded content from the source code before you deploy it. But if you have already deployed hundreds of pages, going through all of them manually to remove the comments is not practical. Absolute HTML Compressor comes to the rescue here. This small tool can automate the process of removing comments from your source code.
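If you would rather script the clean-up yourself, the idea can be sketched in a few lines of Python. This is not how Absolute HTML Compressor works internally, just a simple regular-expression approach under some assumptions. The function name `strip_html_comments` is made up, and the pattern removes anything shaped like `<!-- ... -->`, which includes conditional comments and comment-like text inside `<script>` or `<style>` blocks, so review the output before deploying it.

```python
import re

# Non-greedy match from "<!--" to the nearest "-->", across line breaks.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(html: str) -> str:
    """Return the HTML with every <!-- ... --> block removed."""
    return COMMENT_RE.sub("", html)


page = """<div class="table2">
  <div class="col1">1</div><div class="col2">Mary</div>
<!-- Query: SELECT id, name FROM app.users WHERE active='1' -->
</div>
"""

print(strip_html_comments(page))
```

To cover hundreds of pages, the same function can be applied to each file in a loop as part of the build or deployment step.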

Anirudh Anand

Security Engineer @CRED | Web Application Security ♥ | Google, Microsoft, Zendesk, Gitlab Hall of Fames | Blogger | CTF lover - @teambi0s | Certs - eWDP, OSCP