Building a good machine learning model takes more than good data and well-selected features. Each model also has its own set of hyperparameters: values that are set before training begins and that influence how the model learns. In this article, we go over grid search, a technique for systematically selecting the best hyperparameters for a model!
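To make the idea concrete, here is a minimal sketch of grid search in plain Python. It is not the article's implementation. The `grid_search` helper, the `toy_score` function, and the hyperparameter names and values are all hypothetical, and `toy_score` merely stands in for "train a model with these settings and return its validation score."

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Score every combination in param_grid and return the best one.

    score_fn   -- callable taking hyperparameters as keyword arguments and
                  returning a number where higher is better (assumed)
    param_grid -- dict mapping hyperparameter name -> list of candidate values
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # product() enumerates the full Cartesian grid of candidate values.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, max_depth):
    # Hypothetical stand-in for a validation score; it peaks at
    # learning_rate=0.1 and max_depth=4 by construction.
    return -((learning_rate - 0.1) ** 2) - (max_depth - 4) ** 2


grid = {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 4, 8]}
best_params, best_score = grid_search(toy_score, grid)
print(best_params, best_score)
```

Note that the number of combinations is the product of the list lengths (nine here), so the cost grows quickly as hyperparameters or candidate values are added.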
While two malware samples might appear completely different to a human analyst, those same samples, stripped of their identities and reduced to a vectorized representation of their most important qualities, might be found by a machine to have been twins all along. Insights like this are the goal of "clustering," a family of machine learning techniques based on finding the similarities and differences among a massive number of data points. What follows is an overview of one of those techniques, K-means.
Unsupervised machine learning can give us insights that supervised learning cannot. Here, we go over one of these algorithms, MinHash.
InQuest uses a variety of machine learning algorithms to model the features of the malware that we collect and to gain new insights from that data. Here we travel down the branching rabbit hole of random forests and gradient boosting.
Machine learning is one of the most versatile fields in all of computer science, with applications ranging from physics to art history. So, of course, it has a myriad of uses in the detection and diagnosis of malicious programs, uses that we at InQuest would be remiss not to put to work ourselves. Here we go over some of the many ways ML algorithms are being leveraged for our purposes.