Detecting Illicit content in picture streams

Author: Martin Stütz
Supervisor: Wolfgang Kastner, Christian Platzer
Type: Master Thesis
Finished: 2013-06-14



Abstract:

This thesis presents an algorithm for detecting nudity or pornography in colour images, combined with an age classification based on facial geometry. It extends the nudity detection algorithm proposed by Rigan Ap-apid with support vector machines and targeted approaches for eliminating false positives.


The nudity detection was based on skin detection, which was used to locate skin areas in images. The sizes, shapes and placements of the detected skin regions were used as features. Based on these features and the total amount of skin in the image, a support vector machine classified the image as non-pornographic or pornographic. The classification did not differentiate between nudity and pornography; the two terms are used as synonyms in this thesis.
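
As a rough illustration of the "total amount of skin" feature, the following Python sketch labels pixels with a fixed RGB threshold rule and computes the share of skin pixels. The rule, the function names `is_skin` and `skin_fraction`, and the toy image are assumptions made for this example; the abstract does not state which skin detector the thesis finally used.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def is_skin(pixel: Pixel) -> bool:
    """Fixed RGB threshold rule (an assumption, not the thesis's detector)."""
    r, g, b = pixel
    return (
        r > 95 and g > 40 and b > 20
        and max(pixel) - min(pixel) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_fraction(image: List[List[Pixel]]) -> float:
    """Share of pixels labelled as skin; one possible input feature for an SVM."""
    total = sum(len(row) for row in image)
    if total == 0:
        return 0.0
    skin = sum(is_skin(p) for row in image for p in row)
    return skin / total


# Hypothetical 2x2 image: two skin-toned pixels, one blue, one grey.
toy_image = [
    [(220, 170, 140), (200, 150, 120)],
    [(30, 60, 200), (128, 128, 128)],
]
print(skin_fraction(toy_image))  # 0.5
```

In a full pipeline, region-level features such as size, shape and placement would be computed from the connected skin areas and passed to the classifier together with this fraction.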

Analyzed images were scanned for faces using OpenCV. The positions of the eyes, mouth, nose and chin were extracted from the located faces. A support vector machine used the differences between these facial features to predict the age in two classes (<18 or >18).
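
One plausible way to turn landmark positions into geometry features is sketched below in Python: all pairwise distances between landmarks, divided by the distance between the eyes so that the values do not depend on image scale. The landmark names, the coordinates and the normalisation are assumptions for illustration; the abstract does not specify the exact feature definition used in the thesis.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def geometry_features(landmarks: Dict[str, Point]) -> List[float]:
    """Pairwise landmark distances divided by the inter-eye distance."""
    eye_span = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    if eye_span == 0:
        raise ValueError("eye positions coincide; cannot normalise")
    names = sorted(landmarks)  # fixed order keeps the feature vector stable
    return [
        math.dist(landmarks[a], landmarks[b]) / eye_span
        for a, b in combinations(names, 2)
    ]


# Hypothetical landmark coordinates in pixels.
face = {
    "left_eye": (30.0, 40.0),
    "right_eye": (70.0, 40.0),
    "nose": (50.0, 60.0),
    "mouth": (50.0, 80.0),
    "chin": (50.0, 100.0),
}
features = geometry_features(face)
print(len(features))  # 10 pairwise distances for 5 landmarks
```

A vector like `features` could then be given to a two-class classifier; because of the normalisation, the same face at a different image size yields the same values.
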
Several skin detection approaches were evaluated on the Compaq dataset, including image processing before and after skin detection. The skin detection achieved a recall of 82.3% with a false positive rate of 11.4%. Pornography detection achieved a recall of 65.7%, a precision of 39.8% and a false positive rate of 6.4% on a dataset containing 12524 non-pornographic images and 811 images showing pornography.
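
The three reported rates are linked through the confusion-matrix counts, and the class imbalance explains why precision is low despite a modest false positive rate. The Python sketch below reconstructs approximate counts from the rounded rates and dataset sizes given above; the counts are derived estimates for illustration, not figures taken from the thesis.

```python
def recall(tp: int, fn: int) -> float:
    """Share of pornographic images that were detected."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of images flagged as pornographic that really are."""
    return tp / (tp + fp)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of non-pornographic images that were wrongly flagged."""
    return fp / (fp + tn)


positives, negatives = 811, 12524  # dataset sizes from the abstract

# Approximate counts implied by the rounded rates (estimates only).
tp = round(0.657 * positives)
fp = round(0.064 * negatives)
fn = positives - tp
tn = negatives - fp

print(f"recall    = {recall(tp, fn):.3f}")
print(f"precision = {precision(tp, fp):.3f}")
print(f"fpr       = {false_positive_rate(fp, tn):.3f}")
```

With these estimates the precision comes out at roughly 0.40, in line with the reported 39.8% once rounding is taken into account: the non-pornographic class is about 15 times larger, so even a 6.4% false positive rate produces more false alarms than true detections.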


The presented approach yielded a 6.8% higher recall, an 8.9% higher precision and a 2.2% lower false positive rate than our best-performing reference algorithm for nudity detection.

Age classification was tested on 2957 images showing faces at several image scaling sizes. These images were taken from part A of the FG-NET Aging dataset and from the Labeled Faces in the Wild dataset. Age classification delivered a recall of 47.7% and a precision of 71.3% with a false positive rate of 2.9%. Only images from which facial features could be extracted were included in these rates.

Based on the classification results, a GUI-based software tool was developed that enables users to run pornography detection and age classification on eligible test images.
