Everybody knows the buzzword-bingo winning square of “Machine Learning”, but how can we apply it to a real problem? More importantly, what output can we derive from doing some analysis? This talk will cover clustering of file types (working with unlabeled data) based on various static features. Then, using information from the clusters, is it possible to automatically generate Yara signatures to go hunting for similar files? We believe so, and we’ll show you how you can do this at home.
Bio: David has been in the security field for over 10 years. He enjoys static file analysis and tearing apart shellcode. He’s starting to add various data analysis techniques to his toolbox, where before he relied only on hex editors, debuggers, and disassemblers.
Mike Sconzo enjoys solving (or at least attempting to solve) interesting security problems with data analysis. He’s spent most of his career on the defensive side and is constantly looking for new ways to detect suspicious and malicious behavior. His background is heavy in network analysis, and most of the techniques he has explored revolve around network forensics use cases. Mike also really dislikes talking about himself in the 3rd person.