Sunday, November 30, 2014

FOCA Metadata Analysis Tool

Written by Pranshu Bajpai

Foca is an easy-to-use GUI Tool for Windows that automates the process of searching a website for documents, downloading them and extracting information from them. Foca also helps in structuring and storing the Metadata revealed. Here we explore the importance of Foca for Penetration Testers.

Figure 1: Foca ‘New Project’ Window

Penetration Testers are well-versed in utilizing every bit of Information for constructing sophisticated attacks in later phases.  This information is collected in the ‘Reconnaissance’ or ‘Information gathering’ phase of the Penetration Test. A variety of tools help Penetration Testers in this phase. One such Tool is Foca.
Documents are commonly found on websites, created by internal users for a variety of purposes. Releasing such Public Documents is a common practice and no one thinks twice before doing so. However, these public documents contain important information like the ‘creator’ of the document, the ‘date’ it was written on, the ‘software’ used for creating the document etc. To a Black Hat Hacker who is looking to compromise systems, such details may provide crucial information about the internal users and the software deployed within the organization.

What is this ‘Metadata’ and Why would we be interested in it?
The one line definition of Metadata would be “A set of data that describes and gives information about other data”. So when a Document is created, its Metadata would be the name of the ‘User’ who created it, ‘Time’ when it was created, ‘Time’ it was last modified, the ‘folder path’ and so on. As Penetration Testers we are interested in metadata because we like to collect all possible information before proceeding with the attack. Abraham Lincoln said “Give me six hours to chop down a tree and I will spend the first four sharpening the axe”. Metadata analysis is part of the Penetration Tester’s act of ‘sharpening the axe’. This Information would reveal the internal users, their emails, their software and much more.
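To make the idea concrete, here is a minimal Python sketch of reading this kind of Metadata by hand. It is not how Foca works internally; it only assumes that a ‘Docx’ file is a ZIP package whose core properties live in `docProps/core.xml`, which is part of the Office Open XML format. The function name `read_core_metadata`, the sample builder and the made-up user names are hypothetical and exist only for this illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside docProps/core.xml of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly name -> element to look up (illustrative selection, not exhaustive).
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_metadata(docx_file):
    """Return the core properties found in a .docx path or file object."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    found = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[name] = element.text
    return found


def make_sample_docx():
    """Build a tiny in-memory package with made-up metadata for the demo."""
    core = (
        '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}" '
        'xmlns:dcterms="{dcterms}">'
        "<dc:creator>jdoe</dc:creator>"
        "<cp:lastModifiedBy>asmith</cp:lastModifiedBy>"
        "<dcterms:created>2014-11-30T10:00:00Z</dcterms:created>"
        "</cp:coreProperties>"
    ).format(**NS)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for key, value in read_core_metadata(make_sample_docx()).items():
        print(key, "=", value)
```

Pointing `read_core_metadata` at a real ‘Docx’ file path should work the same way, as long as the file actually contains a `docProps/core.xml` part.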

Gathering Metadata
As shown in Figure 1, Foca organizes various Projects, each relating to a particular domain. So if you’re frequently analyzing Metadata from several domains as a Pen Tester, the results can be stored in an orderly fashion. Foca lets you crawl ‘Google’, ‘Bing’ and ‘Exalead’ looking for publicly listed documents (Figure 2).

Figure 2: Foca searching for documents online as well as detecting insecure methods
You can discover documents of several types, such as ‘Doc’, ‘Docx’ and ‘Pdf’.
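The same kind of search can be approximated by hand. The Python sketch below builds one query per file type using the `site:` and `filetype:` operators that Google and Bing understand. The exact queries Foca sends are not shown in this article, so treat this as an assumption about the general technique rather than a description of the tool. The function name and the default file types are hypothetical.

```python
# File types to look for; an illustrative default, not Foca's full list.
DEFAULT_TYPES = ("doc", "docx", "pdf")


def build_document_queries(domain, file_types=DEFAULT_TYPES):
    """Return one 'site:<domain> filetype:<ext>' query string per file type."""
    domain = domain.strip().lower()
    if not domain:
        raise ValueError("domain must not be empty")
    return [
        f"site:{domain} filetype:{ext.lstrip('.').lower()}"
        for ext in file_types
    ]


if __name__ == "__main__":
    for query in build_document_queries("example.com"):
        print(query)
```

Each string can be pasted into a search engine as-is. Other engines may use different operators, so the format would need adjusting for them.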

Once the documents are listed, you have to explicitly ‘Download All’ (Figure 3).

Figure 3: Downloading Documents to a Local Drive
 Once you have the Documents in your local drive, you can ‘Extract All Metadata’ (Figure 4).

Figure 4: Extracting All Metadata from the downloaded documents
This Metadata will be stored under appropriate tabs in Foca. For Example, the ‘Documents’ tab would hold the list of all the documents collected, further classified into ‘Doc’, ‘Docx’, ‘Pdf’ etc. After ‘Extracting Metadata’, you can see ‘numbers’ next to ‘Users’, ‘Folders’, ‘Software’, ‘Emails’ and ‘Passwords’ (Figure 5). These ‘Numbers’ depend on how much Metadata the documents have revealed. If the documents were a part of a database then you would also see important information about the database like the ‘name of the database’, ‘the tables contained in it’, the ‘columns in the tables’ etc.

Figure 5: Foca showing the ‘numbers’ related to Metadata collected

Figure 6: Metadata reveals Software being used internally
Such Information can be employed during attacks. For Example, ‘Users’ can be profiled and corresponding names can be tried as ‘Usernames’ for login panels. Another Example would be that of finding out the exact software version being used internally and then trying to exploit a weakness in that software version, either over the network or by social engineering (Figure 6).
At the same time, Foca employs ‘Fuzzing’ techniques to look for ‘Insecure Methods’ (Figure 2).
Clearly, Information that should stay within the organization is leaving the organization without the administrators’ knowledge. This may prove to be a critical security flaw. It’s just a matter of ‘who’ understands the importance of this information and ‘how’ to misuse it.
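As an illustration of the ‘Usernames’ idea mentioned above, here is a small Python sketch that turns a full name recovered from Metadata into a few common login-name patterns for an authorized test. The patterns are assumptions about typical naming conventions, not something Foca produces, and the function name is hypothetical.

```python
def candidate_usernames(full_name):
    """Return common login-name guesses derived from a person's full name."""
    # Keep only purely alphabetic words, lower-cased.
    parts = [word for word in full_name.lower().split() if word.isalpha()]
    if not parts:
        return []
    if len(parts) == 1:
        return [parts[0]]
    first, last = parts[0], parts[-1]
    candidates = [
        first + last,        # johndoe
        first + "." + last,  # john.doe
        first[0] + last,     # jdoe
        first + last[0],     # johnd
        last + first[0],     # doej
        first,
        last,
    ]
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(candidates))


if __name__ == "__main__":
    print(candidate_usernames("John Doe"))
```

Middle names are ignored here for brevity. A real engagement would adapt the patterns to whatever convention the organization's exposed email addresses suggest.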
So Can Foca Tell Us Something About the Network?
Yes and this is one of the best features in Foca. Based on the Metadata in the documents, Foca attempts to map the Network for you. This can be a huge bonus for Pen Testers. Understanding the Network is crucial, especially in Black Box Penetration Tests.

Figure 7: Network Mapping using Foca
As seen in Figure 7, a lot of Network information may be revealed by Foca. A skilled attacker can leverage this information to his advantage and cause a variety of security problems. For example, ‘DNS Snoop’ in Foca can be used to determine what websites the internal users are visiting and at what time.
So is Foca Perfect for Metadata Analysis?
There are other Metadata Analyzers out there like Metagoofil, Cewl and Libextractor. However, Foca seems to stand out, mainly because of its very easy-to-use interface and the neat way in which it organizes Information. Pen Testers work every day on a variety of command line tools and while they enjoy the smoothness of working in ‘shell’, their appreciation is not lost for a stable GUI tool that automates things for them. Foca does exactly that.
However, Foca has not been released for ‘Linux’ and works under ‘Windows only’, which may be a drawback for Penetration Testers because many of us prefer working on Linux. The creators of Foca joked about this issue at DEFCON 18: “Foca does not support Linux whose symbol is a Penguin. Foca (Seal) eats Penguins”.

Protection Against Such Inadvertent Information Compromise
Clearly, public release of documents on websites is essential. The solution to the problem lies in making sure that such documents do not cough up critical information about systems, software and users. Such documents should be analyzed internally before release over the web. Foca can be used to import and analyze local documents as well. It is wise to first extract and remove the Metadata locally, using a tool called ‘OOMetaExtractor’, before releasing documents over the web. Also, a plugin called ‘IIS Metashield Protector’ can be installed on your server; it cleans each document of all its Metadata before the server serves it.
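For readers who want to see what such cleaning involves, here is a rough Python sketch for ‘Docx’ files only. It is not how OOMetaExtractor or Metashield Protector work; it simply copies the ZIP package and empties `docProps/core.xml`, where the author and date properties are stored. It does not touch `docProps/app.xml`, comments, tracked changes or embedded objects, so it is an assumption-laden illustration and not a replacement for a dedicated tool. The function name is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CORE_PART = "docProps/core.xml"
CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"


def strip_core_metadata(source, destination):
    """Copy an OOXML package, removing every property in docProps/core.xml."""
    # Keep the conventional 'cp' prefix when the XML is written back out.
    ET.register_namespace("cp", CP_NS)
    with zipfile.ZipFile(source) as src, \
            zipfile.ZipFile(destination, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PART:
                root = ET.fromstring(data)
                for child in list(root):
                    root.remove(child)  # drop creator, dates, etc.
                data = ET.tostring(root, encoding="utf-8")
            # Writing by name also resets the stored entry timestamps.
            dst.writestr(item.filename, data)


if __name__ == "__main__":
    # Build a small in-memory package with made-up metadata, then clean it.
    original = io.BytesIO()
    with zipfile.ZipFile(original, "w") as package:
        package.writestr(
            CORE_PART,
            '<cp:coreProperties xmlns:cp="' + CP_NS + '" '
            'xmlns:dc="http://purl.org/dc/elements/1.1/">'
            "<dc:creator>jdoe</dc:creator></cp:coreProperties>",
        )
        package.writestr("word/document.xml", "<doc/>")
    original.seek(0)
    cleaned = io.BytesIO()
    strip_core_metadata(original, cleaned)
    with zipfile.ZipFile(cleaned) as package:
        print(package.read(CORE_PART).decode("utf-8"))
```

Both arguments may be file paths or file-like objects. Running the cleaned file through a Metadata analyzer afterwards is a sensible check that nothing sensitive remains.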


Like many security tools, Foca can be used for good or bad. It depends on who extracts the required information first, the administrator or the attacker. Ideally an administrator would not only analyze documents locally before release, but also go a step further and implement a Security Policy within the organization to make sure such Metadata content is minimized (or falsified). But it is surprising how the power of the information contained in Metadata has been belittled and ignored. A reason for this may be that there are more direct threats to security that administrators would like to focus their attention on, rather than small bits of Information in the Metadata. But it is to be remembered that if Hackers have the patience to go ‘Dumpster Diving’, they will surely go for Metadata Analysis, and an administrator’s ignorance is the Hacker’s bliss.

On the Web

                     http:// – Foca Official Website
