Extracting economic information from underground marketplaces

Author: Valentin Habsburg-Lothringen
Supervisor: Wolfgang Kastner, Gilbert Wondracek
Type: Master Thesis
Finished: 2013-03-06

The primary goal of this Master's thesis is to develop a focused crawling engine for information extraction from underground marketplaces. The crawling targets are forums in which illegal trade relations are established, forums in which illicit services such as botnet rental are offered, and public web stores for dual-use goods, i.e. goods that can be used for both harmful and harmless purposes, such as chemicals.
Today's general-purpose crawlers are not sufficient for forum crawling because they typically limit the search depth and have difficulty handling dynamic sites that make extensive use of JavaScript and XML (e.g. AJAX). Focused crawlers can usually handle either forums or web stores, but not both. However, participants in the underground economy carry out their trade in both.
A generic crawling engine architecture makes it possible to crawl various kinds of underground marketplaces with the same software system. The engine also includes a component that supports API-based crawling. In addition, some marketplace operators block crawlers once they detect them. The crawler created in this work has built-in capabilities that greatly reduce the likelihood of detection and is therefore able to extract information covertly.
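The idea of one generic engine serving several marketplace types can be illustrated with a minimal sketch. This is not the implementation from the thesis: the names `MarketplaceAdapter`, `ForumAdapter`, `StoreApiAdapter`, `Listing`, and `crawl`, the URL schemes, and the in-memory `FAKE_SITE` data are all hypothetical and chosen only for illustration. The sketch assumes that site-specific logic (where to start, how to turn a fetched page or API response into records and follow-up links) can be isolated in an adapter, while the crawl loop stays the same for every source.

```python
from abc import ABC, abstractmethod
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Listing:
    """One extracted offer (hypothetical record layout)."""
    source: str
    title: str
    price: str


class MarketplaceAdapter(ABC):
    """Site-specific logic; the crawl loop below never changes."""

    @abstractmethod
    def seeds(self) -> Iterable[str]:
        """Return the URLs the crawl starts from."""

    @abstractmethod
    def parse(self, url: str, payload: dict) -> Tuple[List[Listing], List[str]]:
        """Return (extracted listings, follow-up URLs) for one fetched page."""


class ForumAdapter(MarketplaceAdapter):
    """Assumed forum layout: a board links to threads, threads hold posts."""

    def seeds(self):
        return ["forum://board"]

    def parse(self, url, payload):
        items = [Listing("forum", p["subject"], p.get("price", "n/a"))
                 for p in payload.get("posts", [])]
        return items, payload.get("threads", [])


class StoreApiAdapter(MarketplaceAdapter):
    """Assumed paginated store API: each page names the next one."""

    def seeds(self):
        return ["api://products?page=1"]

    def parse(self, url, payload):
        items = [Listing("store", p["name"], p["price"])
                 for p in payload.get("products", [])]
        nxt = payload.get("next")
        return items, [nxt] if nxt else []


def crawl(adapter: MarketplaceAdapter,
          fetch: Callable[[str], dict],
          max_pages: int = 100) -> List[Listing]:
    """Breadth-first crawl that works with any adapter and any fetch function."""
    queue = deque(adapter.seeds())
    seen = set(queue)
    found: List[Listing] = []
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()
        pages += 1
        items, links = adapter.parse(url, fetch(url))
        found.extend(items)
        for link in links:
            if link not in seen:  # visit every page at most once
                seen.add(link)
                queue.append(link)
    return found


# In-memory stand-in for real sites, so the sketch runs without network access.
FAKE_SITE = {
    "forum://board": {"posts": [], "threads": ["forum://thread/1", "forum://thread/2"]},
    "forum://thread/1": {"posts": [{"subject": "Offer A", "price": "10"}],
                         "threads": ["forum://thread/2"]},
    "forum://thread/2": {"posts": [{"subject": "Offer B"}], "threads": []},
    "api://products?page=1": {"products": [{"name": "Item X", "price": "5"}],
                              "next": "api://products?page=2"},
    "api://products?page=2": {"products": [{"name": "Item Y", "price": "7"}],
                              "next": None},
}


def fake_fetch(url: str) -> dict:
    return FAKE_SITE[url]


if __name__ == "__main__":
    for adapter in (ForumAdapter(), StoreApiAdapter()):
        for listing in crawl(adapter, fake_fetch):
            print(listing)
```

Passing the fetch function in as a parameter keeps the loop independent of how a page is actually retrieved, so in this sketch an HTML forum and an API-backed store differ only in their adapter.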
The experimental evaluation shows that the software is capable of extracting data from distinct types of underground marketplaces within a reasonable time frame and without excessive resource usage. Thus, this tool can help researchers study both qualitative and quantitative aspects of the goods and services offered in the underground economy.

