Blogs

The latest cybersecurity trends, best practices, security vulnerabilities, and more

The State of AI Models: Performance, Cost, and Applications

Not all large language models are created equal!

By Martin Holste · March 31, 2025

I still get the newspaper delivered. I don’t know why I get a kick out of an analog news delivery device, but I read it every day. Last week, it suddenly shrank considerably. It went from around 30-40 pages to about 8, feeling more like a restaurant menu than a newspaper. Why? Ransomware. My local paper (or rather, its parent conglomerate) was taken down by a crime gang. Unable to pay the ransom, the company lost access to all of the publishing tools it relied upon and had to start laying out the paper by hand. Add it to the ever-growing list of businesses that I, as a consumer, have received breach notifications from, lost service from, or otherwise heard about being hacked.

So how does a shiny new technology like generative AI keep my local newspaper from shrinking? It comes down to knowing when criminals first enter an environment and booting them out before they can gain the foothold needed to launch a ransomware attack. We measure this ability as mean time to detect (MTTD), and this is where GenAI is changing the game and giving defenders a chance to turn things around.
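To make the metric concrete, here is a minimal sketch of how MTTD could be computed: the average gap between the moment of initial compromise and the moment of detection. The incident timestamps and the `mean_time_to_detect` helper are made up for illustration and do not come from any real data set or product.

```python
from datetime import datetime, timedelta


def mean_time_to_detect(incidents):
    """Average gap between initial compromise and detection."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum(
        (detected - compromised for compromised, detected in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incidents: (time of initial compromise, time of detection)
incidents = [
    (datetime(2025, 3, 1, 9, 0), datetime(2025, 3, 1, 13, 0)),    # 4 hours
    (datetime(2025, 3, 5, 22, 0), datetime(2025, 3, 6, 6, 0)),    # 8 hours
    (datetime(2025, 3, 9, 14, 0), datetime(2025, 3, 9, 14, 30)),  # 30 minutes
]

mttd = mean_time_to_detect(incidents)
print(mttd)  # 4:10:00
```

The lower this number, the less time an attacker has to move from initial access to encryption.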

Here's how it works: GenAI can automate the investigation process, finding critical alerts and assessing their severity so that defenders know the instant an attacker is in the environment. This auto-investigation involves several key steps:

Determining "good" or "bad" activity: GenAI analyzes whether an activity, like using PowerShell, is malicious or benign.
Identifying involved parties: GenAI examines user profiles, IP addresses, and standard tool usage to understand who is involved and what their roles are.
Defining "normal" behavior: GenAI understands what tools and activities are typical for users and can flag deviations.
Reconstructing the sequence of events: GenAI pieces together the story of what happened based on the available evidence.
Making decisions: GenAI evaluates all factors and determines the appropriate response.

This auto-investigation capability helps quickly identify and prioritize critical alerts so that security teams can focus on the most pressing issues. That is how defenders get a chance to find and evict ransomware criminals before they can do damage.
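The five steps above can be pictured as a small pipeline. The sketch below is only an illustration of the flow, not how Trellix implements it: the `Event` fields, the `BASELINE` table, and the rule that stands in for the model's good-or-bad judgment are all assumptions. In a real system, an LLM would make that judgment with far more context.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: str  # ISO 8601, so string order matches time order
    user: str
    tool: str
    detail: str


# Hypothetical baseline of the tools each user normally runs
BASELINE = {
    "svc_mdm": {"powershell.exe", "msiexec.exe"},
    "alice": {"outlook.exe", "excel.exe"},
}


def is_deviation(event):
    """Flag a tool the user does not normally run (stand-in for the model's judgment)."""
    return event.tool not in BASELINE.get(event.user, set())


def investigate(events):
    """Walk the steps and return a verdict plus the supporting evidence."""
    timeline = sorted(events, key=lambda e: e.timestamp)    # reconstruct the sequence
    parties = sorted({e.user for e in timeline})            # identify involved parties
    deviations = [e for e in timeline if is_deviation(e)]   # good/bad versus "normal"
    verdict = "escalate" if deviations else "deprioritize"  # make a decision
    return {
        "verdict": verdict,
        "parties": parties,
        "timeline": timeline,
        "deviations": deviations,
    }


# Hypothetical events, deliberately listed out of order
events = [
    Event("2025-03-01T09:05:00", "alice", "powershell.exe", "encoded command"),
    Event("2025-03-01T09:00:00", "svc_mdm", "powershell.exe", "update policy check"),
]

result = investigate(events)
print(result["verdict"], result["parties"])
```

Here the service account running PowerShell matches its baseline, while the same tool run by an office user does not, so the alert is escalated.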

But how can you trust that AI is making the right security decision for you? This is where expertise in different generative AI models is key, because the wrong model can make the wrong decision.

Choosing the right large language models (LLMs)

First, let’s cover what we mean by generative AI models. A large language model (LLM) is an artificial intelligence (AI) system that can understand and generate human language. LLMs are trained on large amounts of data using machine learning techniques. Their ability to generate new content, much as a human would through cognitive thought, is where the term “GenAI” comes from.

Different LLMs have different strengths and weaknesses, so an evaluation framework is the only reliable way to ensure that you’re using the right model for the job. At Trellix, we use a purpose-built system in which we test different prompts and models and have AI evaluate the responses to decide which is best.

Figure 1: You can see a wide range in model performance along the axes that matter to cybersecurity professionals, including quality, cost, accuracy, and detail.

Our framework measures the level of detail in each analysis and the correctness of the responses, then weighs both against the price of running the model. Based on these evaluations, we’ve chosen the following models:

Anthropic's Claude Sonnet: When we need the most thorough, knowledgeable, and complex answers, Claude is our model of choice. It performs detailed analysis with incredible skill, and it performs machine-level tasks like decoding and info lookups better than a human.

Amazon Nova Micro: For tasks that require formatting, straightforward decisions, and quick analysis, Nova Micro is the perfect fit. Its price-to-performance ratio lets us use generative AI in situations that would be otherwise cost-prohibitive.

Amazon Nova Lite: When it comes to coding and automating tasks, such as generating new plugins or enhancing existing functions, Nova Lite takes the lead. It lets us create new content at will.
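As a rough illustration of how detail, correctness, and price can be weighed against one another, here is a minimal sketch of a scoring step. It is not the Trellix framework: the model names, scores, costs, and the 70/30 weighting are all hypothetical, and in practice the detail and correctness scores would come from an AI judge grading each response.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    model: str
    detail: float       # judge score from 0 to 1 (hypothetical)
    correctness: float  # judge score from 0 to 1 (hypothetical)
    cost_usd: float     # cost of running the test prompt (hypothetical)


def quality(ev, correctness_weight=0.7):
    """Blend correctness and detail; the weighting is an assumption."""
    return correctness_weight * ev.correctness + (1 - correctness_weight) * ev.detail


def best_overall(evals):
    """Highest quality regardless of price."""
    return max(evals, key=quality)


def best_value(evals, min_quality):
    """Cheapest model that still clears the quality bar, or None."""
    good_enough = [ev for ev in evals if quality(ev) >= min_quality]
    return min(good_enough, key=lambda ev: ev.cost_usd, default=None)


evals = [
    Evaluation("model-a", detail=0.90, correctness=0.95, cost_usd=0.030),
    Evaluation("model-b", detail=0.60, correctness=0.85, cost_usd=0.002),
    Evaluation("model-c", detail=0.80, correctness=0.40, cost_usd=0.010),
]

print(best_overall(evals).model)      # model-a
print(best_value(evals, 0.75).model)  # model-b
```

Separating "best overall" from "best value" mirrors the idea of using a top-tier model for the hardest analyses and a cheaper one wherever it is good enough.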

What about the other models out there? How do they compare against our selections, and against each other? Our evaluations find that the quality of analyses can range from superhuman performance to flat-out incorrect. Here’s a recent benchmark rating the model analyses compared with a human:

Figure 2: Model performance ranges from “Mistaken” to “Better than human.”

What does this difference in model performance look like in a real-world security scenario? Here is how three models responded when asked to perform a task beyond what a human can do: decoding base64-encoded text.

Claude Sonnet 3.5 V2

 The base64 decoded PowerShell script is checking and managing Windows Update settings
 - Script appears to be managing the conflict between MDM and GPO update settings
 - This appears to be legitimate MDM automation
 - The sequence of events shows expected MDM management activities
 - Other events (4688, 4662, 4670) are consistent with system management tasks
 - The base64 encoding is a standard practice for AirWatch/Workspace ONE

In this example, Claude Sonnet 3.5 V2 has gone beyond what a human can do, decoding the command on the fly and arriving at a better-than-human decision.

Gemini 1.5 Pro

 The long base64 encoded command line argument is highly suspicious. The decoded PowerShell script (provided in the supporting info) checks a registry value related to updates and then, if configured to allow automatic updates, removes a registry key related to Windows Updates (`HKLM:\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate`) and restarts the `wuauserv` service. Disabling automatic updates is a common tactic used by attackers to prevent systems from patching vulnerabilities. This is not standard behavior for MDM software.

Here we see Gemini 1.5 Pro perform a detailed decoding and analysis, but then make a critical error and arrive at the wrong conclusion. Disabling automatic updates is in fact standard behavior for MDM software, because the MDM software controls the updates itself.

Nova Pro

 The presence of base64-encoded strings in PowerShell scripts is common for configuration data or encoded commands in legitimate scenarios.
 - The data suggests routine system maintenance and configuration activities. The processes and commands align with standard administrative tasks.
 - There is no clear indication of anomalous behavior beyond the use of base64 encoding, which is not uncommon in scripting.

Nova Pro arrives at the right conclusion, but it has to assume what was inside the encoded command.
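For reference, the decoding step itself is deterministic and easy to reproduce outside of a model. PowerShell's `-EncodedCommand` argument is Base64 over UTF-16LE text, so a short sketch like the one below recovers the script. The sample command is made up for illustration and is not the script from the alert discussed above.

```python
import base64


def decode_powershell_command(encoded):
    """Decode a PowerShell -EncodedCommand argument (Base64 over UTF-16LE text)."""
    raw = base64.b64decode(encoded, validate=True)
    return raw.decode("utf-16-le")


# Hypothetical sample: encode a short command the way PowerShell expects it,
# then decode it again to show the round trip.
sample_script = "Restart-Service wuauserv"
encoded = base64.b64encode(sample_script.encode("utf-16-le")).decode("ascii")

decoded = decode_powershell_command(encoded)
print(decoded)  # Restart-Service wuauserv
```

Decoding is the easy part. The comparisons above show that the harder part is judging whether the decoded script is benign in context.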

How Trellix Wise uses GenAI models

So how does all of this analysis of model efficacy benefit you? In 2024, Trellix introduced Trellix Wise, which brings GenAI to the Trellix Security Platform. Built on over a decade of AI modeling and 25 years in threat intelligence, analytics, and machine learning, Trellix Wise capabilities relieve alert fatigue and surface stealthy threats, ensuring no threat is missed.

It enhances Trellix Managed Detection and Response (MDR) capabilities, with pre-training that focuses on valuable detections rather than requiring analysts to figure out effective prompts for chatbots. The platform offers a differentiated approach, with extensive third-party integrations, that leverages GenAI to address high-priority use cases such as ransomware and identity theft, where the speed and efficacy of Trellix Wise provide a crucial advantage.

Generative AI is at the heart of Trellix Wise, enabling automated investigations and decision-making. It can:

Automatically investigate alerts: Determining the severity and scope of potential threats.
Understand context: Recognizing normal behavior versus malicious activity.
Create detailed stories: Providing a complete picture of what happened.
Make decisions: Escalating or deprioritizing alerts based on comprehensive analysis.

By automating alert triage, investigation, and response, Trellix Wise enables security teams to work more efficiently and effectively. As the threat landscape continues to evolve, AI will become increasingly critical in defending against sophisticated attacks. With its rich history of innovation and commitment to AI-driven security, Trellix Wise is best positioned to lead the way into the future.

To experience Trellix Wise for yourself, take our interactive self-guided product tour.

If you’re interested in a security prompt engineering challenge, we have a capture-the-flag workshop that’s both fun and a great way to understand how LLMs work with security data. Contact us today at ai@trellix.com to schedule a custom one for your organization!

Figure 3: The Trellix “Un-prompted” workshop involves a capture the flag exercise to test your prompting skills and see how Trellix Wise can help you avoid the time and effort of prompt engineering.
