USMA Research Unit Affiliation
Army Cyber Institute
Feature engineering and selection is a critical step in the implementation of any machine learning system. In application areas such as intrusion detection for cybersecurity, this task is complicated by the diverse data types and value ranges present in both raw data packets and derived data fields. Additionally, the time- and context-specific nature of the data requires domain expertise to engineer the features properly while minimizing information loss. Many previous efforts in this area naively apply feature engineering techniques that have been successful in image recognition applications. In this work, we use network packet dataflows from the Defense Research and Engineering Network (DREN) and the Engineer Research and Development Center's (ERDC) high-performance computing systems to experimentally analyze various methods of feature engineering. The results of this research provide insight into the suitability of the features for machine-learning-based cybersecurity applications.
Artificial Intelligence and Robotics Commons, Databases and Information Systems Commons, Digital Communications and Networking Commons, Electrical and Computer Engineering Commons, Information Security Commons, OS and Networks Commons, Theory and Algorithms Commons