ACI Generated or Modified Data Sets

Cyber Defense Exercise Day 1 - 6 of 16

Erik Dean, United States Military Academy
Gregory Conti, Army Cyber Institute
Gregory Conti, United States Military Academy
Thomas Cook, Army Cyber Institute
Thomas Cook, United States Military Academy
Benjamin Sangster, United States Military Academy
David Raymond, Army Cyber Institute

Description

Unlabeled network traffic data is readily available to the security research community, but there is a severe shortage of labeled datasets that allow validation of experimental results. The labeled DARPA datasets of 1998 and 1999, while innovative at the time, are of only marginal utility in today’s threat environment. In this paper, we demonstrate that network warfare competitions can be instrumented to generate modern labeled datasets. Our contributions include design parameters for such competitions, as well as results and analysis from a test implementation of our techniques. Our results indicate that network warfare competitions can generate scientifically valuable labeled datasets, and that such games can thus serve as engines for producing future datasets on a routine basis.