The pace of machine learning adoption for cybersecurity is increasing. This may appear to be obvious (virtually no new security product or version is released without a claim to artificial intelligence), but a new report confirms this with hard figures. While around 20% of firms used ML prior to 2019, closer to 60% will be using it by the end of the year.
The Capgemini Research Institute queried 850 senior executives from IT, cybersecurity and OT in seven sectors across 10 countries, and compiled the report 'Reinventing Cybersecurity with Artificial Intelligence'. The report will not help the understanding of artificial intelligence in cybersecurity, but it does provide information on its current use.
Sadly, it does not differentiate between the different types of artificial intelligence. It says that AI in cybersecurity is "a set of capabilities that allows organizations to detect, predict and respond to cyberthreats in real time using machine and deep learning." Relatively few security products use deep learning -- it is perhaps really a technology for the future. Most current products employ machine learning -- but here the report makes no differentiation between the types of machine learning.
This is a weakness in the report. It doesn't mention that machine learning can be supervised or unsupervised, nor the difference in false positive returns between the two. With unsupervised ML, the algorithms teach themselves. It is probably fair to suggest that this technology is not yet sufficiently mature -- and the result is a higher number of false positives. Supervised ML, where human experts continue to teach the system, currently remains the more effective approach, and anecdotally, companies receive better results from supervised machine learning.
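The distinction can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than anything taken from the report: the feature (failed logins per hour), the toy numbers, and the function names are all hypothetical, and real products use far richer models. The point is only the difference in inputs. The unsupervised detector sees raw values and decides for itself what looks unusual, while the supervised one learns its cut-off from examples an analyst has already labelled.

```python
from statistics import mean, pstdev

# Hypothetical toy feature: failed logins per hour, one value per host.
unlabeled = [2, 3, 1, 4, 2, 3, 30, 2, 12]

def unsupervised_flags(values, k=2.0):
    """Flag values more than k standard deviations above the mean.

    No labels are used: the data alone defines what counts as unusual.
    """
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if (v - mu) / sigma > k]

# Hypothetical analyst-labelled examples: (value, is_malicious).
labeled = [(2, False), (4, False), (12, False), (25, True), (30, True)]

def supervised_threshold(examples):
    """Learn a cut-off as the midpoint between the highest benign
    and the lowest malicious labelled example."""
    max_benign = max(v for v, bad in examples if not bad)
    min_malicious = min(v for v, bad in examples if bad)
    return (max_benign + min_malicious) / 2

def supervised_flags(values, threshold):
    """Flag values above a threshold learned from labelled data."""
    return [v for v in values if v > threshold]
```

In the unsupervised case the only tuning knob is `k`, and lowering it flags more hosts whether or not an analyst would consider them malicious. In the supervised case the analyst's labels fix the boundary directly. This toy says nothing about real-world false positive rates; it only shows where human expertise enters each approach.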
The result of this overly high-level view of artificial intelligence is that the figures quoted cannot provide a detailed analysis of the current state of either artificial intelligence in cybersecurity, or its effectiveness in different areas. For example, the report highlights that 64% of companies say that AI lowers the cost of detecting and responding to breaches. Thirty-six percent say it does not -- but there is no explanation of why or how AI has failed more than one-third of all organizations. Could it be as simple as many companies adopting the unsupervised route and having to handle large numbers of false positives? The report does not tell us.
A similar question could be asked about faster response to breaches. This is, after all, one of the primary selling points of ML in cybersecurity. Seventy-four percent of companies have experienced a time saving that averages 12%. But 26% of companies have experienced no time saving. It would be useful for a better understanding of machine learning solutions if we understood why they have failed in these cases.
Sixty-nine percent of organizations claim higher accuracy in detecting breaches -- but 31% have had no improvement. Sixty percent claim higher efficiency from analysts supported by ML -- but 40% claim no improvement. Since machine learning security products are being sold -- and bought -- on the basis of better detections and higher speeds, understanding the failures could help buyers make better decisions.
Another fundamental weakness in the report (PDF) is a lack of clarity over whether the AI employed by the respondents is developed in-house or bought from a security vendor. While the former is possible for the largest of firms, it will be difficult and probably involve a range of issues not experienced by users of vendor-supplied solutions.
One potentially strong area in the report is the development of a recommended use case quadrant based on benefits against complexity. Unsurprisingly, malware detection, intrusion detection and fraud detection all figure in the high benefit, low complexity quadrant. These are classic uses for a wide range of ML-based cybersecurity products.
Surprisingly, perhaps, endpoint protection and user behavioral analysis are at the other end of the scale. Given that attacks enter via endpoints, and the potential for user behavioral analysis to protect credentials and detect lateral adversarial movement on the network, the 'low benefit' classification could be questioned.
It isn't clear how this quadrant was developed -- whether from independent expert opinion or from the survey respondents' own use cases. Either way, the quadrants agree with the respondents' use: the average implementation of low complexity/high benefit use cases was 54%, while the other extreme of high complexity and low benefit was just 42%. The value of the quadrant is further reduced by having malware detection appear in three separate quadrants, and intrusion detection appear in two.
The main problem with the report is that it does not sufficiently define or focus its own purpose -- and as a result it tries to say too much, too shallowly. For example, malware detection in OT is described as one of five high potential use cases. It would be easy to assume that this is recommending the use of ML-based anti-malware in OT environments; but in reality OT administrators are reluctant to install anti-malware, either because legacy industrial control systems (ICS) do not have the processing capacity, or for fear that such software might interfere with existing smoothly running devices.
That doesn't mean that ML cannot help protect OT. In December 2018, NIST published NISTIR 8219. It examines and demonstrates how off-the-shelf behavioral anomaly detection (BAD) systems can improve visibility, identify new devices, detect assets that have disappeared, and detect anomalies that might indicate a malicious presence. They do so in a non-intrusive manner; that is, without any interruption or performance impact to the ICS network. ML-enhanced BAD would simply be better, and a good solution to malware detection within OT.
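The three checks described above (new devices, vanished assets, unusual behavior) can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any product evaluated in NISTIR 8219 works: the device names, the packets-per-minute feature, the function names, and the simple standard-deviation test are all hypothetical, and the test is a plain statistical baseline rather than machine learning. It does reflect the non-intrusive idea, in that it only compares passively observed data against a stored baseline and never touches the devices themselves.

```python
from statistics import mean, pstdev

def compare_assets(baseline, observed):
    """Return (new_devices, missing_devices) as sets.

    baseline: set of device names seen during a known-good period.
    observed: set of device names seen in the current capture.
    """
    return observed - baseline, baseline - observed

def traffic_anomalies(history, current, k=3.0):
    """Return a sorted list of devices whose current traffic deviates
    from their own history by more than k standard deviations.

    history: dict of device -> list of past packets-per-minute readings.
    current: dict of device -> latest packets-per-minute reading.
    Devices with fewer than two historical readings are skipped.
    """
    flagged = []
    for device, value in current.items():
        past = history.get(device, [])
        if len(past) < 2:
            continue
        mu, sigma = mean(past), pstdev(past)
        if sigma == 0:
            # Perfectly constant history: any change is a deviation.
            if value != mu:
                flagged.append(device)
        elif abs(value - mu) / sigma > k:
            flagged.append(device)
    return sorted(flagged)

# Hypothetical example data.
baseline = {"plc-1", "plc-2", "hmi-1"}
observed = {"plc-1", "hmi-1", "laptop-7"}
new_devices, missing_devices = compare_assets(baseline, observed)

history = {"plc-1": [100, 102, 98, 101, 99], "hmi-1": [40, 42, 41, 39, 38]}
current = {"plc-1": 100, "hmi-1": 90}
anomalous = traffic_anomalies(history, current)
```

Here `laptop-7` shows up as a new device, `plc-2` as a missing asset, and `hmi-1` as anomalous because its traffic has more than doubled against a stable history. An ML-enhanced system would replace the fixed `k` threshold with a learned model of normal behavior.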
The basic premise of this Capgemini report is to confirm the increasing use of AI (actually in the form of ML) within cybersecurity. This, however, is already self-evident to all security professionals. The report is titled, 'Reinventing Cybersecurity with Artificial Intelligence'. That has already happened, and is confirmed by the number of Capgemini respondents who intend to install ML over the coming months. What is necessary is an explanation of how different forms of AI will work best in different areas, and the specific advantages that can be delivered. Sadly, this is not delivered by this report.