Deception technologies like honeypots produce comprehensive log reports, but
often lack interoperability with EDR and SIEM technologies. A key bottleneck is
that existing information transformation plugins perform well on static logs
(e.g. geolocation), but face limitations when it comes to parsing dynamic log
topics (e.g. user-generated content). In this paper, we present a run-time
system (GPT-2C) that leverages large pre-trained models (GPT-2) to parse
dynamic logs generated by a Cowrie SSH honeypot. Our fine-tuned model achieves
89% inference accuracy in the new domain and demonstrates acceptable execution
latency.
