Through synthetic training sample dataset generation and ML training.

Preface

Cracking CAPTCHAs is already a well-documented and established process, which this article looks to expand on. We will take a general view of how we’ve cracked CAPTCHAs under undesirable conditions. This article is not meant to be a how-to or detailed guide to replicate our steps. However, it may give you some inspiration for your specific challenge.

We believe that the methods laid out in this article are novel and significantly improve the efficiency of automated CAPTCHA solving compared with traditional approaches, especially when the target CAPTCHA system offers poor sample harvesting opportunities.

Ethics

We bypass human verification checks to maintain automatic information collection pipelines. The use of the methods we have developed only extends as far as what is required to automate our collection process.

If a CAPTCHA or other human verification system is poorly designed and not adequately rate limited, condition checked and so on, bypassing it at scale may, in the worst case, amount to a DDoS (Distributed Denial of Service) attack. A correctly implemented human verification system should mitigate this even when the check itself is bypassed. At the very least, unethical manipulation of these verification systems can lead to spam posts and comments and otherwise undesirable automated “bot” interaction. We do not condone this type of use.

The Problem

There are several well-established methods for automating the solving of CAPTCHAs, and the right one depends on the complexity of the CAPTCHA. If we start at the easy end of the spectrum, we are presented with a fairly basic alphabetical CAPTCHA.

With a simple distorted background, one might apply denoise filters or Gaussian blurring to the image to reduce or remove the “stars”, or random dot pixels, scattered across its background.

This process gives us a less noisy picture. If the source sample is a colour image, we can further convert it to grayscale, which improves edge detection.

The image can then be processed through a standard OCR (Optical Character Recognition) library. In our experience this can result in a 0.1% failure rate, yielding excellent, stable solutions.
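As a rough illustration of the clean-up steps described above, the sketch below uses Pillow to remove isolated noise pixels and convert to grayscale. The median filter and its 3-pixel kernel are our assumptions (the text only says “denoise filters or Gaussian blurring”), the function name `preprocess_for_ocr` is hypothetical, and the OCR call itself is left out because it depends on whichever OCR library you use.

```python
from PIL import Image, ImageFilter


def preprocess_for_ocr(img: Image.Image, kernel: int = 3) -> Image.Image:
    """Denoise with a median filter, then convert to 8-bit grayscale.

    The median filter is one possible denoise step (an assumption);
    it removes isolated "star" pixels while keeping character edges sharp.
    """
    denoised = img.filter(ImageFilter.MedianFilter(size=kernel))
    return denoised.convert("L")


if __name__ == "__main__":
    # Stand-in for a noisy CAPTCHA: white canvas with a few isolated dark dots.
    demo = Image.new("RGB", (40, 20), "white")
    for xy in [(5, 5), (20, 10), (33, 14)]:
        demo.putpixel(xy, (0, 0, 0))
    cleaned = preprocess_for_ocr(demo)
    print(cleaned.mode, cleaned.size)
```

The cleaned image would then be handed to the OCR library of your choice.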

In some cases, a good test of how easily a CAPTCHA can be solved is to feed it to Google Translate as an image and have Google Translate attempt to read the text and translate the letters back into English. If it can, then you have a very good chance that rudimentary OCR libraries will also work for you.

But this article is not about the easy end of the challenge…

What we are dealing with is a CAPTCHA that is alphanumeric, mixes upper and lower case, places and rotates its characters randomly, and draws random disruption lines across the image and characters. Furthermore, and most importantly, the target source is a Tor Onion website that at the best of times loads slowly and at the worst of times is offline or responds with backend timeout errors. This is a point we will discuss in more detail.

The image complexity of the source CAPTCHA means it’s nearly impossible to read effectively with OCR. The difficulty comes from the disruption pattern of randomly arranged background lines (an outward star pattern) and from each character being independently disrupted with seemingly random lines of various lengths and widths. Combine all that with the offset angle of each character and it’s beyond what most OCR or OpenCV methods can handle.

Therefore, for more complex CAPTCHAs, image manipulation (removing noise, greyscaling etc.) is typically not sufficient. These challenges usually require machine learning to achieve an acceptable failure rate and sufficient solving speed.

The biggest factor in achieving a model that solves accurately is having a large enough sample base. In some cases, many thousands of samples are required for training. This is certainly true of a CAPTCHA that mixes upper case, lower case and numerical characters and randomises all of them along with its disruption patterns or lines: the larger the sample set, the more accurate the model the training will produce.

So how do you get thousands of samples from a source that is slow to load and has poor availability, both conditions of the source being a Tor website? Harvesting samples this way would be far too inefficient and we can’t hang around!

Even with a target source that responds reasonably quickly, has good availability, and can be harvested without aggressively hitting rate limits, who would want to sit there endlessly solving eight thousand captchas to feed to an optical character recognition model?

I know that’s not going to be me! Sure, there are options to outsource or crowdsource these problems, but those options cost time and money and are likely to introduce errors into our training sample data. None of these is desirable, so how do we get 100% accurate sample data cheaply, without human solving, without having to harvest the source, and in a way that scales?

The Solution

The solution we came up with was first to not focus on the solving of the CAPTCHAs, or the training of our model, or anything that was a direct result or outcome of the end goal we are driving towards. Instead, we looked at how the CAPTCHAs are constructed; what do they look like and what are their elemental parts.

We know harvesting is not an optimal option, so we have put that aside. Doing so leaves us with a handful of maybe 20 or so harvested solved CAPTCHA samples. Nowhere near enough to start training but it’s enough to start focusing on the sample set we have.

If we look at how the CAPTCHA is constructed and try to break its construction down piece by piece, in a way “reverse engineering” it, we might either: 1) be able to generate our own “synthetic” CAPTCHAs on demand and at scale, all 100% accurately pre-solved, or 2) understand the method of construction well enough to identify the library or process with which the CAPTCHA is constructed and reimplement it for ourselves, with the same 100% accurately pre-solved outcome.

In our case, and for the example in this article, we took the former path. We chose it because we spent some time trying to identify the particular CAPTCHA library but found no exact match. In the interest of not burning too much time or depending on external factors, we decided to create our own synthetic CAPTCHA generation process.

To create our CAPTCHAs, we used Pillow (a fork of PIL, the Python Imaging Library), an image manipulation library that offers a wide range of features well suited to the job at hand.

We start by defining a few values that we have observed to be fixed, such as a defined image size (in our case, 280 by 50 pixels) and use this to create a simple image.

Then we define our character set (a to z, A to Z, 0 to 9), as we know this to be fixed.

Using `random.choice` we can pick the required number of characters. In our case, the CAPTCHA uses a fixed length of 6 characters.
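The fixed values described so far can be captured in a few lines. This is a minimal sketch: the constant and function names are ours, and a seeded `random.Random` instance is used only so that runs are repeatable.

```python
import random
import string

# Fixed properties observed in the source samples (values from the article).
IMAGE_SIZE = (280, 50)      # width, height in pixels
CAPTCHA_LENGTH = 6          # characters per CAPTCHA
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def random_captcha_text(rng: random.Random, length: int = CAPTCHA_LENGTH) -> str:
    """Pick `length` characters independently and uniformly from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only for repeatability
    print(random_captcha_text(rng))
```

Because the text is chosen before the image is drawn, every generated image comes with its correct label for free.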

The text font is also important and from our source samples we see it is fixed: therefore we try to match the font type as closely as possible. Font size also remains constant. This will be important in ensuring that our training is as accurate as possible when our model is presented with real sample data.

To kick things off, the process establishes the dimensions of the image canvas, akin to laying out a piece of paper before beginning a drawing. We then construct a blank white background canvas, awaiting the arrival of the CAPTCHA characters. But here’s where the true artistry takes centre stage: the process methodically layers complexity onto each character.

With each character in the CAPTCHA text, our process doesn’t simply slap it onto the canvas; instead, it treats each letter as an individual brushstroke, adding specific characteristics at every turn. We begin by precisely measuring the width and height of each character, ensuring that characters are not cut off at the edges, that they correctly fit and fill the CAPTCHA, and that they resemble the source CAPTCHA text. Then, as in the source samples, we introduce randomness into the mix, spacing out the letters with varying degrees of separation, akin to scattering Scrabble pieces.

We also introduce a touch of chaos by randomly rotating each character, giving it a tilt that defies conventional alignment. This accurately resembles the source samples and adds to the difficulty of solving this CAPTCHA.

Yet the process doesn’t stop there. No, it goes above and beyond, adorning our canvas with a riotous display of crisscrossing lines, as if an abstract artist had gone wild with a brush. These random lines serve as a digital labyrinth, obscuring the text beneath a veil of confusion and intrigue.

We then overlay lines of random length and weight across each character, aligned to the character’s angle, closely matching those of the source samples.
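The steps above can be sketched with Pillow roughly as follows. This is an illustration under stated assumptions, not our production generator: the default Pillow font stands in for the matched font, and the rotation range, line counts, colours and tile size are placeholder values that you would tune against real samples. The function names are ours.

```python
import math
import random
import string
from PIL import Image, ImageDraw, ImageFont

ALPHABET = string.ascii_letters + string.digits
WIDTH, HEIGHT, LENGTH = 280, 50, 6          # sizes quoted in the article


def render_glyph(ch, font, angle):
    """Draw one character on its own tile, rotate it, then crop to the ink."""
    tile = Image.new("L", (48, 48), 0)                   # 0 = empty in the mask
    ImageDraw.Draw(tile).text((16, 16), ch, fill=255, font=font)
    tile = tile.rotate(angle, expand=True)               # random per-character tilt
    box = tile.getbbox()                                 # measured glyph extents
    return tile.crop(box) if box else tile


def make_captcha(rng, font=None):
    """Return (text, image) for one synthetic, pre-solved CAPTCHA."""
    if font is None:
        font = ImageFont.load_default()                  # placeholder font (assumption)
    text = "".join(rng.choice(ALPHABET) for _ in range(LENGTH))
    img = Image.new("RGB", (WIDTH, HEIGHT), "white")
    draw = ImageDraw.Draw(img)

    # Background disruption: lines radiating from one point (star pattern).
    cx, cy = rng.randint(0, WIDTH - 1), rng.randint(0, HEIGHT - 1)
    for _ in range(rng.randint(8, 16)):                  # count is a guess
        end = (rng.randint(0, WIDTH - 1), rng.randint(0, HEIGHT - 1))
        draw.line([(cx, cy), end], fill=(160, 160, 160), width=1)

    slot = WIDTH // LENGTH                               # horizontal slot per character
    for i, ch in enumerate(text):
        angle = rng.uniform(-30, 30)                     # range is a guess
        glyph = render_glyph(ch, font, angle)
        w, h = glyph.size
        x = i * slot + rng.randint(0, max(0, slot - w))  # random spacing, kept in bounds
        y = rng.randint(0, max(0, HEIGHT - h))
        img.paste((0, 0, 0), (x, y, x + w, y + h), mask=glyph)

        # Disruption line through the character, aligned to its tilt.
        a = math.radians(angle)
        half = rng.uniform(0.4, 0.8) * max(w, h)
        mx, my = x + w / 2, y + h / 2
        dx, dy = half * math.cos(a), -half * math.sin(a)
        draw.line([(mx - dx, my - dy), (mx + dx, my + dy)],
                  fill=(0, 0, 0), width=rng.randint(1, 2))
    return text, img


if __name__ == "__main__":
    text, img = make_captcha(random.Random(0))
    print(text, img.size)
```

Each glyph is drawn as a mask on its own tile so that it can be rotated and measured independently before being pasted onto the shared canvas.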

Now that we have a way to populate our image canvas, we have a working framework with which we can iterate to get an output that resembles the source samples as closely as possible.

For now, we generate a few hundred samples. Each image file is named after its randomly selected CAPTCHA text, which gives us a sample set that has already been solved.
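With the label stored in the file name, building the labelled sample list is just a directory scan. A minimal sketch, assuming a hypothetical layout of one `.png` file per CAPTCHA and the 6-character alphanumeric labels described earlier:

```python
from pathlib import Path
from typing import List, Tuple


def load_labelled_samples(directory: Path,
                          expected_length: int = 6) -> List[Tuple[Path, str]]:
    """Return (image_path, label) pairs, taking the label from the file stem.

    Files whose stem is not a plausible CAPTCHA label are skipped.
    """
    samples = []
    for path in sorted(directory.glob("*.png")):
        label = path.stem                      # "aB3dE9.png" -> "aB3dE9"
        if (len(label) == expected_length
                and label.isascii() and label.isalnum()):
            samples.append((path, label))
    return samples
```

One caveat with this naming scheme: on a case-insensitive file system, two labels that differ only in case (for example `aB3dE9` and `Ab3dE9`) map to the same file, so a separate label index may be safer there.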

After that, we compared each iteration’s output closely to the source and made tweaks and adaptations. For each iteration of the CAPTCHA generator we looked closely at just one specific attribute to simplify the synthesis process. We first adjusted the randomly scattered background lines (their length, width and count), then moved on to tweaking the letter placement and random angles to closely match the apparent pseudo-randomness of the sample data set.

Following sufficient tweaking and iteration, we were producing CAPTCHAs that, at least visually, very closely matched our source samples. They match so closely that, if mixed with real samples, they are difficult to distinguish. This is the level of synthesis we were looking to achieve.

Example synthetic captcha on the left, real on the right

Next steps

Now that we have a way to produce synthetic CAPTCHAs that very closely match our target, it’s time to produce a few thousand of them. This is easily and quickly done by specifying the total count in our process loop, and out pop 5,000 freshly generated, pre-solved CAPTCHAs, all nicely labelled and ready for shoving into our training process.

For model training, we chose the TensorFlow framework alongside the ONNX Runtime machine learning model accelerator. This combination worked well for us for both training accuracy and efficiency. All training was conducted on an Nvidia GPU.

Following initial training, using just our best-produced synthetic CAPTCHA samples as our data set, we achieved a CER (character error rate) of 3.26%. For a first run of a model trained purely against a synthetic data set, that was not too bad at all. But we knew we could do better.
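For readers unfamiliar with the metric, CER is commonly computed as the total character-level edit distance between predictions and correct labels, divided by the total number of characters in the correct labels. The sketch below implements that common definition; it is not taken from our training code, and the function names are ours.

```python
from typing import Iterable, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(pairs: Iterable[Tuple[str, str]]) -> float:
    """pairs holds (reference, prediction); returns total edits / total reference chars."""
    pairs = list(pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    if total_chars == 0:
        raise ValueError("no reference characters to score against")
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    return total_edits / total_chars


if __name__ == "__main__":
    # One wrong character out of twelve.
    print(character_error_rate([("aB3dE9", "aB3dE9"), ("Xy7Qw2", "Xy7Qw3")]))
```

Note that CER counts characters, not whole CAPTCHAs: a 6-character CAPTCHA with a single wrong character is still a failed solve, so the per-CAPTCHA failure rate is higher than the CER.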

Now that we had a model to work with, we could use it to start solving real target CAPTCHAs. This would allow us to generate a larger pool of real CAPTCHA samples, with a solve set, and mix those in with our synthetic set. We were aiming for a mix of 5k synthetic CAPTCHAs and 1k real CAPTCHAs harvested with our newly trained, albeit unoptimised, model.

With a framework in place that would interface with the target website, collect CAPTCHAs, generate a text prediction, check that prediction with the website and, if solved, store the solved and labelled CAPTCHA image, we generated about 1,000 samples over a short time.
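The shape of that harvesting loop is sketched below. The three callables (`fetch_captcha`, `predict`, `verify`) are hypothetical stand-ins for the site-specific fetching code, the trained model and the website’s own accept or reject response; none of them reflect a real API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class HarvestStats:
    attempts: int = 0
    solved: int = 0


def harvest_real_samples(
    fetch_captcha: Callable[[], bytes],      # hypothetical: download one CAPTCHA image
    predict: Callable[[bytes], str],         # hypothetical: current model's guess
    verify: Callable[[bytes, str], bool],    # hypothetical: did the site accept the guess?
    target: int,
    max_attempts: int,
) -> Tuple[List[Tuple[bytes, str]], HarvestStats]:
    """Collect (image, label) pairs, keeping only guesses the site confirmed."""
    samples: List[Tuple[bytes, str]] = []
    stats = HarvestStats()
    while len(samples) < target and stats.attempts < max_attempts:
        stats.attempts += 1
        try:
            image = fetch_captcha()
        except OSError:                      # timeouts and connection errors: try again
            continue
        guess = predict(image)
        if verify(image, guess):             # the site itself acts as the labeller
            samples.append((image, guess))
            stats.solved += 1
    return samples, stats
```

Only confirmed solves are stored, so every harvested label is correct; the failed guesses are discarded rather than corrected.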

Feeding these back into the training data mix, we brought the CER down to 2.77%.

We were confident that even at 2.77% the error rate was better than a human could achieve, and we were also confident that our methodology was working.

Our remaining task was to iterate once more, using this slightly more optimised model to generate a slightly larger set of labelled real CAPTCHAs.

We were able to go from the initial model, with a worse CER (orange line) to the best model (green line) in only a few training iterations.

The model training improvements are best shown in the graph below, with each improvement reaching a lower CER sooner and holding it for longer (more stable).

At that point we settled on a final model with a CER of 1.4%, opting for the best-performing mix of real and synthetic CAPTCHAs.

Our final ML model diagram:

Once the efficacy of this model was validated, it was simply a task of plugging it into the collection pipeline and bringing it to life in our production collection system. The automated solver process has been running stably ever since, and most of the disruption we’ve observed has been due solely to the target source going offline and being unavailable.

Bias and Variance

A key consideration during the training process was to be aware of, and mitigate where possible, overfitting and overtraining our model. Rather than the terms “overfitting” and “overtraining”, I like to use Bias and Variance for the two potential pitfalls of ML training, as they better explain the undesirable conditions that may occur. I won’t dive into too much detail around these ML concepts, as fully understanding them would probably need a PhD. The best way I can describe what my simple mind can understand is as follows.

Because of the nature of our novel, one might say clever, iterative process for training a CAPTCHA solver from a very small original source data set, we are potentially adding bias into our training process. For example, any data set solved by the first model is solved by a model with a predefined bias towards a particular set, style or character combination. This can result in a new data set that is biased towards what the previous model was good at solving, thereby amplifying the bias in our next model’s training.

This bias would result in a real-world regression in CER, as the model is not optimised to solve a wider range of character combinations and randomisation characteristics.
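One simple way to look for this kind of bias, offered here as an illustrative sketch rather than a description of our tooling, is to compare how often each character appears in the model-labelled harvest against what a uniform character distribution would give. The threshold and function names are assumptions.

```python
import string
from collections import Counter
from typing import Dict, Iterable, List

ALPHABET = string.ascii_letters + string.digits


def character_coverage(labels: Iterable[str]) -> Dict[str, float]:
    """Ratio of observed to uniformly expected frequency for every character.

    1.0 means the character appears as often as a uniform draw would predict;
    values near 0 mean the harvested set rarely contains it.
    """
    counts = Counter(ch for label in labels for ch in label)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels supplied")
    expected = total / len(ALPHABET)
    return {ch: counts.get(ch, 0) / expected for ch in ALPHABET}


def underrepresented(labels: Iterable[str], threshold: float = 0.5) -> List[str]:
    """Characters seen less than `threshold` times as often as expected."""
    coverage = character_coverage(labels)
    return [ch for ch, ratio in coverage.items() if ratio < threshold]
```

Characters that the previous model struggles with will tend to show up as underrepresented in the harvested set, which is one signal that the synthetic share of the mix needs to compensate.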

Our second pitfall, overfitting, can arise at both extremes: from an overly varied training set or from an insufficiently varied one (i.e. creeping into bias). Although we could train a single model to solve many different types of CAPTCHA, beyond just this one example, by using a very varied data set, doing so without careful tuning could result in “overfitting” our data set and a worse CER, as our model would essentially be training on more noise than signal.

We therefore considered both Bias and Variance closely, ensuring a healthy ratio of varied, correctly labelled real CAPTCHAs harvested from source to synthetically generated CAPTCHAs with a randomly distributed character set. An optimal CER band was then discovered through iterative A/B testing of the data set mix and the number of training iterations until a stable plateau was identified.
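Building a training set at a given real-to-synthetic ratio can be sketched as below, so that different mixes can be compared against each other. The function name and parameters are ours, and the ratio shown in the usage note is only an example, not the mix we settled on.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def mix_datasets(synthetic: Sequence[T], real: Sequence[T],
                 real_fraction: float, total: int,
                 rng: random.Random) -> List[T]:
    """Sample `total` items, of which roughly `real_fraction` are real."""
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must be between 0 and 1")
    n_real = round(total * real_fraction)
    n_synthetic = total - n_real
    if n_real > len(real) or n_synthetic > len(synthetic):
        raise ValueError("not enough samples for the requested mix")
    mixed = rng.sample(list(real), n_real) + rng.sample(list(synthetic), n_synthetic)
    rng.shuffle(mixed)                      # avoid real/synthetic ordering effects
    return mixed
```

For example, `mix_datasets(synthetic, real, real_fraction=0.25, total=600, rng=random.Random(0))` draws 150 real and 450 synthetic samples; training one model per candidate fraction and comparing their CER on a held-out set of real CAPTCHAs is one way to run the comparison.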

Conclusion

We deploy a final model, incorporating a mix of synthetic and real CAPTCHAs, achieving a CER of 1.4%. The automated solver process seamlessly integrates into our production collection system, ensuring stability and efficiency.

By leveraging synthetic sample training data generation, we’ve advanced CAPTCHA cracking. Our approach offers an effective and efficient solution for CAPTCHA cracking without significant human involvement or effort allowing for effective automated data collection.

With this capability, we are able to add value for our customers by automating collection from otherwise programmatically inaccessible sources, where we would otherwise need a human to manually solve the CAPTCHA, access the page, insert any updates and then alert our customers. Automation is key to what we do at speed and at scale, especially when dealing with many hundreds of collection sources as we do.

Photo by Kaffeebart on Unsplash.

Investigation, Ransomware

Ransomware – State of Play February 2024

by Daniel Collyer

March 14, 2024

SOS Intelligence is currently tracking 180 distinct ransomware groups, with data collection covering 348 relays and mirrors.

In the reporting period, SOS Intelligence has identified 395 instances of publicised ransomware attacks. These have been identified through the publication of victim details and data on ransomware blog sites accessible via Tor. Our analysis is presented below:

LockBit has maintained its position as the most active and popular ransomware strain.

This is despite significant law enforcement interruption, the impact of which will be discussed further below.

Despite law enforcement action towards the end of 2023, ALPHV/Blackcat has maintained a strong presence online and continues to post victim data. However, owing to how the ransomware process operates, this could be seen to be victims compromised before law enforcement takedown of ALPHV/Blackcat infrastructure.

Increased activity has been identified amongst the BianLian, Play, QiLin, BlackBasta, 8base and Hunters ransomware strains. This increase may be attributed to these strains absorbing affiliates from LockBit and ALPHV/Blackcat as those services went offline.

This month, Ransomhub, AlphaLocker, Mogilevich, & Blackout have emerged as new strains. Mogilevich has been observed targeting high-value victims, including Epic Games, luxury car company Infiniti, and the Irish Department of Foreign Affairs.

Group targeting continues to follow familiar patterns in terms of the victim’s country of origin.

Attacks have increased in South American countries, particularly in Argentina, which may be a response to presidential elections in November 2023 in which the far-right libertarian Javier Milei was elected.

Targeting continues to follow international, geopolitical lines. Heavy targeting follows countries that have supported Ukraine against Russia. Attacks against Sweden continued as it pressed ahead with preparations to join NATO. This highlights the level of support ransomware groups continue to show towards the Russian state, and they will continue to use cyber crime to destabilise and weaken Western and pro-Ukrainian states.

Manufacturing and Construction & Engineering have remained the key targeted industries for February. These industries would be more reliant on technology to continue their business activities, and so it logically follows that they would be more likely to pay a ransom to regain access to compromised computer systems. The Financial, Retail & Wholesale, Legal, and Education sectors have also seen increased activity over the period. Health & Social Care has seen a significant increase over the period. This is likely in response to several groups, including ALPHV/Blackcat, reacting to law enforcement activity and allowing their affiliates to begin targeting these industries.

We are seeing a shift in tactics for certain industries, particularly those where data privacy carries a higher importance (such as legal or healthcare), where threat actors are not deploying encryption software and instead relying solely on data exfiltration as the main source of material for blackmail and extortion.

LockBit Takedown

On 20 February, an international law enforcement effort was successful in taking control of and shutting down the infrastructure of the LockBit ransomware strain. Much has been disclosed and said regarding the takedown, some of it speculative. However, it was confirmed by the UK’s National Crime Agency (NCA) and the US’s Federal Bureau of Investigation (FBI) that control of the group’s dark web domains and infrastructure was obtained, providing them with significant information regarding the activity of the LockBit group and its affiliates.

Since then, multiple LockBit blog sites have re-emerged, and new data continues to be published. However, it is not clear whether or not this is new activity since the takedown. It is more likely that these are victims compromised before law enforcement activity which are only now being blackmailed with data release.

We are continuing to monitor the ransomware landscape at this time to properly analyse the impact this takedown will have. This event has had a significant impact on the reputation of the LockBit group, with many affiliates angry at the perceived lack of operational security resulting in the possible identification of their real-world identities. We are anticipating many of these will look to gain access to the affiliate programs of other strains, and so we will expect to see a significant increase in reported attacks from those strains in the coming weeks and months. As for LockBit, the threat actors behind the group remain active, and it is likely we will see a re-emergence as a new group in due course.

ALPHV/Blackcat exit scam

The ALPHV/Blackcat group is making headlines for all the wrong reasons. After first having their leak site taken over by law enforcement, they now appear to have absconded with affiliate funds.

In February 2024, ALPHV/Blackcat announced an attack against healthcare provider Change Healthcare (part of UnitedHealth Group). Following this, a ransom of $22 million was paid to ALPHV. Several days later, the responsible affiliate took to the cybercrime forum RAMP to state that they hadn’t been paid their share of the spoils (potentially up to 90%). It appears now that the group has collapsed from within, ending with a final exit scam as they shut down operations. The group have further claimed to have sold their source code in the process, so we may see copycat groups emerge in due course.

While the dissolution of a notorious group should be celebrated, especially following successful law enforcement activity, it should be noted that shutting down in this way presents a significant risk to recent victims. The affiliate responsible for the Change Healthcare data, as well as affiliates who may have been similarly affected, are likely to still hold victim data and so, for those victims, there remains a risk that they may be further blackmailed as affiliates attempt to recoup their lost earnings.

Photo by FLY:D on Unsplash

Investigation, Ransomware

Ransomware – State of Play January 2024

by Daniel Collyer

February 15, 2024

SOS Intelligence currently tracks 173 distinct ransomware groups, with data collection covering 324 relays and mirrors.

In the reporting period, SOS Intelligence has identified 274 instances of publicised ransomware attacks. These were identified through the publication of victim details and data on ransomware blog sites accessible via Tor. Our analysis is presented below:

Threat Actor Activity

LockBit has remained the market leader, maintaining a market share of approximately 23%. BlackBasta, Akira, Trigona, 8base and BianLian have seen significant increases in activity over the month, while there have been decreases in activity from Cactus, Werewolves, Siegedsec, Dragonforce, and Play.

January is typically a quieter month for ransomware threat actors. In 2022, the volume of attacks in January was 17% lower than the yearly average. In 2023, this gap widened to 54%. This slowing of activity is likely due to the proximity of several national and religious holidays observed globally between December and January. However, in 2024, we observed a significant increase in attacks across January. Two factors stand out as possible causes for this:

Ongoing global hostilities

It has been observed that pro-Russian cybercriminal groups have been vocally supportive of the ongoing war in Ukraine, and have diverted significant resources to targeting the supporters of Ukraine. Similar patterns have been noted in the targeting of victims in countries which have shown support for Israel.

Although ransomware groups and threat actors are primarily financially motivated, their resources and skills are often seen turned against perceived enemies of the state, blurring the lines between criminal and hostile state activity.

Counter Ransomware Initiative

The Counter Ransomware Initiative (CRI) is a US-led group of 50 nations and organisations dedicated to promoting solidarity and support in the face of ransomware activity. In October 2023, CRI members pledged not to pay ransoms when faced with cyber attacks.

As a result, it is expected that the number of observed postings to ransomware blogs will increase as victims no longer pay ransoms. This may show an increase in victims’ data being published, rather than an overall increase in the number of victims.

Country Targeting

As stated above, ransomware threat actors’ choice of targets can be politically motivated, as well as financially. This is why we continue to see the majority of attacks target the USA, UK, Canada, France, Germany and Italy. As members of the G7, these countries have strong economies and therefore possess lucrative targets for financially-minded threat actors. However, this surge in activity may be politically motivated. Continued support for Israel and Ukraine may give certain threat actors additional motivation to target those countries.

This month has seen an increase in attacks against victims in Sweden. Sweden is in the process of joining NATO, which appears to have presented the country as a target for pro-Russian threat actors in support of the Russian state. Sweden’s membership would increase NATO’s presence in and around the Baltic Sea, a key waterway for allowing the Russian Navy into the North Sea and onward into the Atlantic. Furthermore, it would increase a NATO presence close to Russia’s border with the rest of Europe.

Industry Targeting

Manufacturing, Construction & Engineering, and Logistics & Transportation have remained the key targeted industries for January. These industries would be more reliant on technology to continue their business activities, so it logically follows that they would be more likely to pay a ransom to regain access to compromised computer systems. The Financial and Education sectors have also seen increased activity over the period.

ALPHV/Blackcat

In December 2023, law enforcement agencies from multiple jurisdictions targeted the ALPHV/Blackcat ransomware group, disrupting the group’s activities and seizing its domain. Shortly after, the domain was “un-seized” before law enforcement agencies took back control. As a result of this action, the operators behind ALPHV/Blackcat have publicly withdrawn their rules regarding the targeting of Critical National Infrastructure (CNI), in apparent revenge for law enforcement activity.

Since the takedown, ALPHV/Blackcat activity has slowed but does not appear to have stopped. In recent weeks they claim to have targeted and stolen confidential and sensitive data from Trans-Northern Pipelines in Canada, as well as Technica, a contractor working with the US Department of Defence, FBI, and USAF.

The veracity of these claims is still being investigated, and so should be taken with a grain of salt. The ALPHV/Blackcat group has been hurt by law enforcement, impacting their operations and losing them customers. Therefore, it is possible that exaggerated claims are being made to save face and their reputation amongst the cybercrime community.

Photo by FLY:D on Unsplash

Investigation, Opinion, Ransomware

Cybersecurity in 2024 – A Forward Look

by Daniel Collyer

January 24, 2024

2023 was a record year for cybercrime and threat actor activity, and we anticipate 2024 to be a continuation of this upward trend. Below we discuss a few key items we consider will be at the forefront of 2024’s cybersecurity landscape.

Expansion of ransomware operations

2023 was a record year for ransomware operators. Reported attacks were nearly double the numbers seen in 2022. The most successful groups operated as-a-service (RaaS), allowing them time to improve and develop their product whilst others worked to deploy the malware and bring in the money.

Law enforcement has been extremely active against these groups, taking down infrastructure relating to HIVE and ALPHV variants. However, in the latter’s case, this has seemingly slowed, but not halted, their operations and they remain active in some capacity into 2024. Current data has shown a slight decline in the number of posts to their leak site; however, this is a common pattern seen across many different variants and is likely due to the links to Russia and periods of inactivity over the holiday period.

We expect this year to be no exception to the continued growth of ransomware operations. It remains a lucrative opportunity for threat actors and the RaaS operating model allows less-skilled operators to partake in this criminal activity.

It is anticipated that ransomware tactics will expand to provide further opportunities to “motivate” victims into paying a ransom for their data. This will include the threat of deployment of “Wiper” malware – designed to fully delete an infected device or network in the event of non-compliance.

An increase in Supply Chain Attacks

It is highly anticipated that supply chain compromise will continue to be a tactic of choice for financially motivated and nation-state threat actors, who routinely and opportunistically scan the internet to identify unpatched systems ripe for exploitation.

The efficiency of supply chain attacks will likely be improved by both the infection and dissemination of software packages granting third-party access. This in turn allows threat actors to select and target their victims on a larger scale, leading to increased levels of compromise and wider attack surfaces for the deployment of malicious code. Subsequently, this will allow threat actors to better maintain persistence within victim networks, granting more time to conduct reconnaissance, analyse connected networks, and spread to encompass more victims.

It is anticipated that supply chain attacks will target vulnerabilities in generative AI ecosystems. With AI and LLMs being utilised more and more to improve productivity, inevitably supply chains are becoming more interconnected. Failure to properly secure these components within the supply chain could be fatal, allowing threat actors to poison AI training data, manipulate updates, inject malicious algorithms, engage in prompt engineering, or exploit vulnerabilities as an entry point to compromise organisations’ data or systems.

The growth of AI-driven cyber-crime

AI has seen a massive boom in 2023, and this is expected to continue into 2024 and beyond as it becomes increasingly integrated into all manner of processes and procedures.

In 2024, we anticipate a surge in threat actors embracing AI to improve the quality and speed of development of the tools in their arsenal. This will include a quick and cost-effective way to develop new malware and ransomware variants. We also expect to see the increasing use of deepfake technologies to improve the standard of phishing and impersonation to support cyber-enabled frauds and business email compromise (BEC).

In contrast, it is anticipated that cyber security will employ a proactive strategy; as threat actors continue to harness the potential of AI and machine learning, cyber defenders will look to utilise similar techniques to counter these offensive tactics. The cyber security industry is already making substantial investments into the use of AI for defensive purposes, and this is expected to grow and be adopted by more in the field. Generative AI (GenAI)-powered capabilities such as automated code generation, reverse engineering, and document exploitation will reach previously unthinkable levels of sophistication and speed.

It is believed that GenAI will provide an improved toolkit to those targeting the human element when seeking to compromise network security. GenAI will give threat actors an easier way to develop more convincing phishing messages at scale, create video and audio deepfakes, and collect information on their targets. This highlights the need for an increased focus on awareness training in 2024 to better prepare staff and colleagues for the inevitable surge of phishing attacks.

Key Global Events

Geopolitics is a key motivator for threat actors in certain sectors, particularly nation-states and hacktivists. Many key global events are scheduled for this year, providing high-profile targets for those who would seek to manipulate these events for their own gains.

Elections are due to be held in the following countries:

Taiwan
USA
Iran
Russia
Ukraine
South Korea
India
Austria
United Kingdom
European Parliament

The BRICS group is due to expand, taking on the following new members: Egypt, Ethiopia, Iran, Saudi Arabia, and the United Arab Emirates. BRICS is now seen as an economic group to rival the G7, so it is anticipated that this expansion will lead to increased targeting of G7 financial institutions.

In July, the 2024 Summer Olympic Games will be held in Paris, France. Such events provide numerous opportunities for threat actors to make financial gains through fraudulent ticketing, and through phishing to obtain financial data and credentials. Furthermore, they provide a canvas with global attention for those with a hacktivist agenda, ensuring their message reaches a wide audience.

Regulatory Changes Driving Threat Actor Innovation

Changes to regulations regarding the reporting of significant breaches, implemented in the USA by the Securities and Exchange Commission (SEC), will force threat actors to hone and improve their stealth methods. We anticipate seeing increased focus on encryption and evasion techniques to allow threat actors to maintain undetected persistence within victim networks, avoiding both a report to the SEC and the forensic-level scrutiny that would be expected to follow. We believe that threat actors may look to non-material systems as a lower-risk target and entry point, quietly building their access, persistence and privileges from there before targeting higher-value network resources.

We are also beginning to see ransomware groups using this new reporting requirement as an additional blackmail tool, threatening to report victims to the SEC themselves if their demands are not met. It is expected that this tactic will expand in use over the year to come.

What’s in store for SOS Intelligence in 2024

2024 looks to be an exciting year for SOS Intelligence.

Our team is growing further, with a full-time developer joining in early 2024. This will allow us to focus on improving the usability of the product, implementing new features, and generating new data collection streams.

One of our key focus areas will be to improve the quality of the context around the data we provide. Improvements made to the platform will allow customers to see pertinent information relating to data sources, giving context to the risk and threat posed by that source. This will allow customers to make more informed decisions about the risks to their business or that of their clients.

We will also be looking to expand and improve the quality of our data collection. One particular focus will be on improving the reporting of CVEs. We aim to expedite alerts of new, high-risk vulnerabilities to our clients and subscribers so they can better mitigate and protect against the risks they pose.

SOS Intelligence has been diligently monitoring the digital landscape over 2023. Our recent findings are a stark reminder of the rising threat of phishing attacks. Over the past year, we have observed over half a million unique credentials compromised through phishing, and with the growth of GenAI techniques, we expect that number to grow in 2024.

One standout feature of our technology is our real-time alert system. This capability ensures that our clients are promptly notified when their staff have fallen victim to phishing, allowing for a swift response and effective risk mitigation.

The unique services we provide at SOS Intelligence aren’t just about securing your digital assets; they are a practical investment in proactive cybersecurity. Join us in creating a more secure digital environment.

Header Photo by freestocks on Unsplash

Investigation, Ransomware

Ransomware – State of Play December 2023

by Daniel Collyer

January 11, 2024

SOS Intelligence is currently tracking 170 distinct ransomware groups, with data collection covering 319 relays and mirrors.

In the reporting period, SOS Intelligence has identified 373 instances of publicised ransomware attacks. These have been identified through the publication of victim details and data on ransomware blog sites accessible via Tor. Our analysis is presented below:

We first look at strain activity. As ever, the ransomware landscape is dominated by strains using affiliate models (Ransomware-as-a-Service (RaaS)). LockBit remains the most active strain, and while there has been a decrease in overall activity, it maintains a 22% market share. 8base, AlphV and Play remain significantly active, but this month we have also seen significant activity by Hunters (RaaS), Cactus (RaaS), and Dragonforce.

Dragonforce are a newly emerged group, with little known about them at the time of print. Given the level of successful disruption by law enforcement during 2023, it is suspected that this group may be a rebranding of a previous threat group.

The Werewolves group has been observed increasing their level of attacks. The group appears relatively new; however, they have taken responsibility for a 2022 attack on the Electric Company of Ghana which resulted in significant power outages. The veracity of this claim is not known. Their level of activity is called into question by several of their victims also appearing on the LockBit breach site: six identical posts were seen across both sites. Additionally, the ransomware used is a public domain version of LockBit3, while their attacks make use of tools leaked from the Conti group. This would seem to indicate that the group was previously an affiliate of LockBit.

What makes this group stand out is the targeting of Russian victims. Ransomware groups and operators are quite often pro-Russian, with several groups supporting the Russian government publicly in its war against Ukraine. The targeting may explain a potential split from LockBit, and hint at a possible location for the group.

Finally, we have observed increased activity from the SiegedSec group. They appear focused more on data exfiltration, and are politically, rather than financially, motivated. Their focus has been on hacktivism, with a significant focus on targeting Israel and the USA.

As seen in previous months, the USA remains the primary target of ransomware groups and threat actors. We have observed a steady release of data from Canada, France, Germany, Italy, and the UK. As members of the G7, these countries have strong economies and therefore possess lucrative targets for financially-minded threat actors.

However, this surge in activity may be politically motivated. In recent weeks these countries have all shown support for Israel in its conflict with Hamas, which may give certain threat actors additional motivation to target those countries. As highlighted previously, there have also been significant increases in the targeting of Israel and Russia.

Manufacturing, Construction and Engineering, and IT and Technology have remained the key targeted industries for December. These industries would be more reliant on technology in order to continue their business activities, and so it logically follows that they would be more likely to pay a ransom in order to regain access to compromised computer systems. The Financial and Education sectors have also seen increased activity over the period.

Photo by FLY:D on Unsplash

Investigation, Ransomware

Ransomware – State of Play November 2023

by Daniel Collyer

December 6, 2023

SOS Intelligence is currently tracking 166 distinct ransomware groups. Data collection covers 309 relays and mirrors, 110 of which are currently online.

In the reporting period, SOS Intelligence has identified 437 instances of publicised ransomware attacks. These have been identified through the publication of victim details and data on ransomware blog sites accessible via Tor. Our analysis is presented below:

As in previous months, the ransomware landscape is dominated by strains using affiliate models. LockBit remains the most active strain, and has seen a 73% increase in breach posts when compared to the previous month. High on the list is 8base, who released a large amount of data on 30th November. In contrast to the other high-profile groups observed, it is believed that the 8base group do not have their own proprietary ransomware, but instead rely on using other ransomware-as-a-service (RaaS) variants, such as Phobos.

As seen in previous months, the USA remains the primary target of ransomware groups and threat actors. We have observed an increased release of data from France, Germany and Italy, while the UK and Canada have remained high on the list of targeted countries.

As members of the G7, these countries have strong economies and therefore possess lucrative targets for financially-minded threat actors. However, this surge in activity may be politically motivated. In recent weeks these countries have all shown support for Israel in its conflict with Hamas, which may give certain threat actors additional motivation to target those countries.

Logistics, manufacturing, and construction have remained the key targeted industries for November. These industries would be more reliant on technology in order to continue their business activities, and so it logically follows that they would be more likely to pay a ransom in order to regain access to compromised computer systems. We are seeing a shift in tactics for certain industries, particularly those where data privacy carries a higher importance (such as legal or healthcare), where threat actors are not deploying encryption software and instead relying solely on data exfiltration as the main source of material for blackmail and extortion.

New for this month, we have also considered victim ownership: whether victims are privately or publicly owned. Within breach sites, the publicised victims are overwhelmingly privately owned. Publicly-owned victims tend to be either smaller, local government entities or educational districts within the US school system. Higher-level public entities, while offering a lucrative target for hostile state actors, may be more than a financially-motivated threat actor wishes to take on, owing to the likely increased law enforcement effort to obtain a judicial outcome.

Photo by FLY:D on Unsplash

Investigation, Ransomware

New SOS Intelligence Pricing

by Amir Hadzipasic

November 23, 2023

Investigation, Ransomware

Ransomware Statistics for October 2023

by Daniel Collyer

November 8, 2023

SOS Intelligence is currently tracking 163 distinct ransomware groups. Data collection covers 299 relays and mirrors, 93 of which are currently online.

In the reporting period, SOS Intelligence has identified 337 instances of publicised ransomware attacks. These have been identified through the publication of victim details and data on ransomware blog sites accessible via Tor. Our analysis is presented below:

Our first graph shows attacks organised by strain. The most prominent threat groups have been AlphV/BlackCat, Play, and LockBit3. All three operate a Ransomware-as-a-Service (RaaS) business model, which would increase the number of threat actors using them, so it is no surprise to see these variants appearing responsible for more attacks.

Secondly, we have looked at the spread of victims by country of origin. The USA remains the target of choice for many ransomware groups and threat actors, owing to the value of its economy and the likelihood of victims to pay ransoms.

A significant number of victims have been identified in Bulgaria, all of whom were targeted by the RansomedVC strain. RansomedVC does operate a RaaS business model, so it is hypothesised that this has been a single threat actor specifically targeting Bulgarian retail businesses.

Finally, we have looked at the targeted industries. Business Services, Manufacturing and Retail sectors have experienced significantly more targeting. This is likely due to their reliance on technology to undertake their business functions: a company more reliant on technology is more likely to pay if their services and networks are disrupted.

Photo by FLY:D on Unsplash

Investigation, Opinion, The Dark Web, Tips

The Dark Web in 2023 – SOS Intelligence Special Report

by Daniel Collyer

September 13, 2023

Investigation, The Dark Web

Stealer Logs – what you need to know

by Daniel Collyer

August 2, 2023
