The long-standing problem of email attacks
For several years now, the digital revolution has radically changed people’s daily lives, both their private and work lives. The use of the Internet and its services has become indispensable for interaction between people and a priority for all companies that want to remain competitive on the market.
Although digitalization has been growing for decades, the coronavirus pandemic accelerated it even further, forcing many activities to move online. Among Internet services, e-mail has further consolidated its position as an essential tool for exchanging information and coordinating business processes. In 2017 alone, 269 billion e-mails were sent and received every day, and this figure is expected to reach almost 362 billion daily e-mails in 2024.
The widespread use of e-mail has made it an attractive channel for spreading malware, phishing, and fraud (Business Email Compromise). Not all e-mails contain useful information for the recipient: according to a recent study, more than 55% of e-mail traffic is spam, also known as junk mail. Spammers use it for a variety of purposes, some of which are harmless from a security point of view (though still an inconvenience), while others are downright malicious. One difficulty for security analysts and victims, beyond separating expected messages from unwanted ones, is distinguishing unwanted but innocuous messages from unsolicited dangerous ones. For security analysts, this is a real search for a needle in a haystack.
Typically, spam involves harvesting e-mail addresses by crawling the web and sending unsolicited bulk messages (e.g. advertising) to reach as many users as possible. In most cases, these are “just” a nuisance to the recipient and to network operators. The most worrying aspect, however, is that cybercriminals exploit this noise to deliver malicious e-mails that attempt security breaches. Attackers also commonly target a specific person, usually a company executive, an individual with special privileges, or a prominent person.
In recent years, it has become increasingly common to come across e-mails that impersonate an authoritative site or company in order to trick the victim into performing harmful actions, such as installing malicious software (e.g. ransomware) or issuing a fraudulent credit transfer. Because of its rapid spread, this kind of attack has attracted the attention of government agencies (e.g. Europol, FBI), which have stated that email attacks are increasing in number and severity, with about 12 billion US dollars in financial losses worldwide from incidents reported between 2013 and 2018. According to Google, its Gmail service blocks more than 100 million phishing emails every day, and in the last year 18 million were related to COVID-19. Very often, these messages are sent by experienced groups of hackers and cybercriminals whose sole purpose is to obtain money or sensitive information illicitly, using tricks such as social engineering. The spectrum of email attacks is in fact varied. It ranges from legacy attacks that are purely technical, still feasible because of vulnerabilities in the SMTP protocol and its configurations, to more sophisticated socio-technical methods made possible by modern machine learning and social engineering techniques. Hence the need to take into account not only the security of IT systems but also the human factor, which is often the weak link in the defense chain.
Email security awareness is therefore essential to protect an organization against email attacks. It is now common practice to run periodic phishing awareness campaigns that train employees to be vigilant and to recognize and report fake emails that get past traditional spam filters.
Despite these efforts, these training programs still need to become more effective. They should be tailored to the specific human cognitive vulnerabilities that criminals exploit to trick employees in a moment of distraction. Curbing this long-standing problem is a difficult challenge, especially for large companies, as described in this article, which also proposes a solution.
The paper describes a collaborative approach to the early detection of malicious spam emails and its application in large companies. Through the joint effort of employees and security analysts over the last two years, a large dataset of potentially malicious spam emails has been collected, with each email labeled as critical or irrelevant spam. By analyzing the main distinguishing characteristics of dangerous emails, the authors identified a set of both traditional and novel features, then tested and optimized it with common supervised machine learning classifiers.
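To make the idea of turning an email into classifier features concrete, here is a minimal sketch using only the Python standard library. The four features, the word list, and the names `extract_features`, `domain_of`, and `SAMPLE_EMAIL` are illustrative assumptions; they are not the feature set described in the paper.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Hypothetical signals chosen for illustration only.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
URGENCY_WORDS = ("urgent", "immediately", "verify", "suspended", "invoice")


def domain_of(header_value):
    """Return the lower-cased domain of an address header, or '' if absent."""
    _, address = parseaddr(header_value or "")
    if "@" not in address:
        return ""
    return address.rpartition("@")[2].lower()


def extract_features(raw_email):
    """Map one raw message to a small dict of numeric features."""
    msg = message_from_string(raw_email)
    body_parts = []
    has_attachment = False
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.get_filename():
            has_attachment = True
        elif part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            body_parts.append(payload.decode(charset, errors="replace"))
    body = "\n".join(body_parts)
    text = (msg.get("Subject", "") + " " + body).lower()
    from_domain = domain_of(msg.get("From"))
    reply_domain = domain_of(msg.get("Reply-To"))
    return {
        "url_count": len(URL_PATTERN.findall(body)),
        # A Reply-To domain that differs from the From domain is a common spoofing hint.
        "reply_to_mismatch": int(bool(reply_domain) and reply_domain != from_domain),
        "urgency_word_count": sum(text.count(word) for word in URGENCY_WORDS),
        "has_attachment": int(has_attachment),
    }


# Made-up message using reserved example domains.
SAMPLE_EMAIL = """\
From: "IT Support" <support@example.com>
Reply-To: helpdesk@example.net
Subject: Urgent: verify your account
Content-Type: text/plain; charset="utf-8"

Your mailbox will be suspended. Verify immediately at
http://login.example.net/reset or https://example.net/help
"""

if __name__ == "__main__":
    print(extract_features(SAMPLE_EMAIL))
```

Each labeled email would yield one such feature vector, and a supervised classifier would then be trained on the vectors together with their critical or irrelevant labels.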
The results of these experiments are useful in two ways. They reveal exactly which cognitive vulnerabilities affect the company’s people, so that a specific awareness and training campaign can be designed around them, and they make it possible to configure and train machine learning engines that can detect in advance possible security incidents caused by malicious emails. The extensive experimental results show that Support Vector Machine and Random Forest classifiers achieve the best performance, with the optimized set of only 36 features achieving 91.6% Recall and 95.2% Precision. The results are validated with experiments conducted on more than 40,000 people. The human factor in cybersecurity scenarios is crucial, and having virtuous users enables a very effective collaborative approach.
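For readers less familiar with the two metrics, Precision is the share of emails flagged as critical that really are critical, and Recall is the share of truly critical emails that were flagged. The sketch below computes both on a handful of made-up labels; the function name `precision_recall` and the toy data are assumptions for illustration, not the paper's evaluation code or results.

```python
def precision_recall(y_true, y_pred, positive="critical"):
    """Compute (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels (not real data): 5 critical emails, 2 irrelevant ones.
Y_TRUE = ["critical", "critical", "critical", "irrelevant",
          "irrelevant", "critical", "critical"]
Y_PRED = ["critical", "critical", "irrelevant", "irrelevant",
          "critical", "critical", "irrelevant"]

if __name__ == "__main__":
    precision, recall = precision_recall(Y_TRUE, Y_PRED)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

In a malicious-email setting, low Recall means dangerous messages reach users unflagged, while low Precision means analysts spend time on harmless spam, which is why both figures are reported together.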
The figure below shows the life cycle of a spam email under this approach, and how the defense ecosystem also interfaces with the Threat Intelligence and Information Sharing MISP Platform deployed within the CONCORDIA 2020 European project.
By learning from previous security incidents, this approach allows early detection of malicious spam emails, provides ongoing tuning of spam filters, and, through its integration with the CONCORDIA Threat Intelligence platform, enables early sharing of the retrieved indicators with the other partners of the consortium.
(By De Lutiis Paolo, TELECOM Italia)