KEY FINDINGS

AI-assisted malware development has reached operational maturity.
The VoidLink framework, which is modular, professionally engineered, and fully functional, was built by a single developer using a commercial AI-powered IDE within a compressed timeframe. AI-assisted development is no longer experimental but produces deployment-ready output.

AI-assisted development is not always obvious from the final product.
VoidLink was initially assessed as the work of a coordinated team based on its architecture and implementation quality. The development method was exposed not by analyzing the malware but through an operational security failure. AI-assisted development should be considered a possibility from the outset, not as an afterthought.

Adoption of self-hosted, open-source AI models is growing but still limited in practice.
Actors of varying skill levels are investing in self-hosted and unrestricted models to avoid commercial platform restrictions. However, underground discussions consistently reveal a gap between aspiration and capability: local models still underperform, fine-tuning remains aspirational, and commercial models remain the productive choice even for actors with explicit malicious intent.

Jailbreaking is shifting from direct prompt engineering toward agentic architecture abuse.
Traditional copy-paste jailbreaks are increasingly ineffective.
The misuse of AI agent configuration mechanisms, specifically project files that redefine agent behavior, is a more significant development, as it represents a qualitative shift from manipulating a model's responses to abusing its operational architecture.

AI is showing early signs of deployment as a real-time operational component.
Beyond its use as a development aid, AI is beginning to appear as a live element in offensive workflows: autonomous agents performing security research tasks, and LLMs classifying and engaging targets at scale within automated pipelines.

Enterprise AI adoption is itself an expanding attack surface.
GenAI activity across enterprise networks shows that one in every 31 prompts risked sensitive data leakage, impacting 90% of GenAI-adopting organizations.

INTRODUCTION

During January-February 2026, cyber crime ecosystems continued to adopt AI in a widespread but uneven pattern. Throughout 2025, legitimate software development began shifting from prompt-based AI assistance to agent-based development. Tools such as Cursor, GitHub Copilot, Claude Code, and TRAE introduced a common paradigm: developers write structured specifications in markdown files, and AI agents autonomously implement, test, and iterate code based on those instructions. This agentic model, in which markdown is the operative control layer, is now starting to appear across the threat landscape.

The critical differentiator in what we observed is AI methodology combined with domain expertise. Across cyber crime forums, the dominant pattern of AI use remains unstructured prompting: actors request malware or exploit code from AI models as if entering a query in a search engine. VoidLink (detailed below), on the other hand, is the first documented case of AI producing truly advanced, deployment-ready malware. The developer combined deep security knowledge with a disciplined, spec-driven workflow to produce results indistinguishable from professional team-based engineering.
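The spec-driven workflow described above is easiest to picture with a concrete example of the kind of markdown specification these agentic IDEs consume. The following is an entirely hypothetical, benign sketch; the file name, section layout, and acceptance criteria are illustrative assumptions, not recovered artifacts from any case in this report:

```markdown
<!-- spec.md — hypothetical, benign example of a spec-driven development file -->
# Project: Inventory API

## Goals
- REST service exposing CRUD endpoints for warehouse inventory items.

## Constraints
- Language: Go 1.22; standard library only.
- All handlers return JSON with structured error codes.

## Sprint 1 — Core service
- [ ] Define the `Item` model (id, sku, quantity).
- [ ] Implement `GET /items` and `POST /items`.

## Acceptance criteria
- `go test ./...` passes; each endpoint covered by at least one test.
```

An agent treats a file like this as its work plan and implements, tests, and iterates against it, while the developer's role reduces to reviewing output and refining the spec, the product-owner dynamic observed in the VoidLink case.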
Forum activity, which constitutes the bulk of observable evidence, primarily consists of actors who have not yet adopted structured AI workflows and whose efforts remain relatively unsophisticated. The more capable actors, those who combine domain expertise with disciplined AI methodology, leave far fewer traces in open forums, making the true scope of this shift harder to measure.

VOIDLINK: THE STANDARD WE MEASURE AGAINST

In January 2026, Check Point Research (CPR) exposed VoidLink, a Linux-based malware framework featuring modular command-and-control (C2) architecture, eBPF and LKM rootkits, cloud and container enumeration, and more than 30 post-exploitation plugins. The framework is so sophisticated and professionally engineered that the initial assessment was that VoidLink was likely the product of a coordinated, multi-person effort conducted over months of intensive development.

Operational security (OPSEC) failures by the developer later exposed internal development artifacts that told a different story. These materials revealed that VoidLink was authored by a single developer using TRAE SOLO, the paid tier of ByteDance's commercial AI-powered IDE. Instead of unstructured prompting, the developer used Spec-Driven Development (SDD), a disciplined engineering workflow, to first define the project goals and constraints, and then use an AI agent to generate a comprehensive architecture and development plan across three virtual teams (Core, Arsenal, and Backend). The resulting plan included sprint schedules, feature breakdowns, coding standards, and acceptance criteria, all documented as structured markdown files. The AI agent implemented the framework sprint by sprint, with each sprint producing working, testable code. The developer acted as product owner, directing, reviewing, and refining, while the AI agent did the actual work.

The results were striking.
The recovered source code aligned so closely with the specification documents that it left little doubt the codebase was written to those exact instructions. What would normally have been a 30-week engineering effort across three teams was executed in under a week, producing over 88,000 lines of functional code. VoidLink reached its first functional implant around December 4, 2025, one week after development began.

THIS CASE ESTABLISHES TWO PRINCIPLES:

AI-assisted development now produces operationally viable, deployment-ready malware: it has crossed the threshold from experimental to functional.

The AI involvement was invisible until it was exposed by an unrelated OPSEC failure. For analysts and defenders, this means AI involvement in malware development should be treated as a default working assumption, even when there are no visible indicators.

The ramifications of VoidLink's methodology go beyond this individual case. Its workflow, in which structured markdown specifications direct an AI agent to autonomously implement, test, and iterate, is the same paradigm that defined the agentic AI revolution in legitimate software development throughout 2025. The cyber crime ecosystem is not developing its own AI capability. It is adopting the same tools and architectural patterns as legitimate technology, with the additional goal of overcoming the protective limitations built into these systems. This matters more than which model or platform the attackers use.

The same architectural pattern appears repeatedly across the cases highlighted in our report: markdown skill files that transform a coding agent into an autonomous offensive security operator, and configuration files abused to override agent safety controls. In each case, the operative control layer is not code but structured documentation that determines what the AI agents build, how they behave, and what constraints they observe or ignore.
This is in direct contrast to underground forum activity, where the dominant approach remains unstructured prompting.

MODELS: COMMERCIAL, SELF-HOSTED, AND INFORMAL SERVICES

SELF-HOSTED OPEN-SOURCE MODELS

Across cyber crime forums, actors at all skill levels are actively exploring self-hosted, open-source AI models as alternatives to commercial platforms. Their motivations are consistent: to avoid moderation, prevent account bans, and maintain operational privacy. Users with malware and hacking backgrounds are installing uncensored model variants such as wizardlm-33b-v1.0-uncensored and openhermes-2.5-mistral, and prompting them with comprehensive malicious wishlists spanning ransomware, keyloggers, phishing kits, and exploit code.

Figure 1 – User installing local LLM variants and prompting them to generate malware and fraud tooling.

More established actors are conducting structured cost-benefit analyses, evaluating not only hardware requirements and GPU costs but also whether locally hosted models produce reliable output (or hallucinate to the point of being operationally useless), and whether AI-generated malware meets the quality bar of current evasion techniques.

Figure 2 – Threat actor inquiry into hardware, cost, and feasibility of running a fully "unrestricted" locally hosted model.

SELF-HOSTED MODELS: LIMITATIONS IN PRACTICE

Self-hosted models consistently show a gap between aspiration and capability.
Community advice on improving local model output focuses on basic optimizations, such as switching to English-language prompts and increasing quantization levels, while references to more advanced techniques such as LoRA fine-tuning remain aspirational rather than operational.

Figure 3 – Community feedback suggesting alternative local models and highlighting token/context limitations of smaller deployments.

Cost estimates range from $5,000 to $50,000 depending on the desired performance, with training timelines of 3–12 months and frank admissions that models "hallucinate a lot" without extensive investment.

Figure 4 – Discussion of cost and requirements for locally hosted unrestricted models.

Most tellingly, an active offensive tools vendor, advertising C2 setups, EDR bypass services, and red team tooling, concluded that local deployment is currently "more of a burden than something productive," while acknowledging that commercial models remain useful despite increasing restrictions.

Figure 5 – Participants comparing commercial AI systems with alternative models and discussing perceived restriction levels.

COMMERCIAL PLATFORMS AND INFORMAL ACCESS SHARING

Rather than migrating to self-hosted infrastructure, users are comparing the workarounds that prevail on commercial platforms. Participants recommended specific providers they view as less restrictive, shared experiences with account enforcement on multiple platforms, and refined prompt-splitting techniques to incrementally bypass safeguards, such as requesting explanations before progressing toward executable code.

Figure 6 – Example of the structured prompt-splitting technique suggested to incrementally bypass AI safety restrictions.

Some early signs of informal access sharing have been observed, with operators of local models offering to generate restricted outputs for others on request.
However, given the historical precedent of "dark LLM" services that largely failed to deliver on their promises, it remains to be seen whether these will develop into durable service models.

Figure 7 – Community member offering private generation of restricted output via locally hosted model infrastructure.

JAILBREAKING AS ARCHITECTURAL ABUSE

Traditional jailbreaking, the practice of circulating copy-paste prompts designed to trick models into producing restricted output, is becoming increasingly difficult to execute. In some forum discussions, users seeking Claude jailbreaks were told that easy public prompts are no longer available, that platforms have been cracking down on abusers, that dedicated subreddits have been banned, and that developing new jailbreaks is costly because the accounts involved are eventually terminated. Single-prompt jailbreaking is becoming less attractive as model providers invest in safety enforcement.

Figure 8 – Forum discussion highlighting the declining availability of easy public jailbreak prompts.

ABUSING AGENT ARCHITECTURE

A more significant development is the emergence of jailbreaking techniques that target the architecture of AI agent systems rather than the model's conversational safeguards. A packaged "Claude Code Jailbreak" distributed on forums illustrates this shift.

Claude Code is designed to read a CLAUDE.md file from a project's root directory as configuration. Legitimate developers use this mechanism to define project context, coding standards, and agent behavior. The jailbreak abuses this by placing override instructions in the CLAUDE.md file that suppress safety controls and redefine the agent's role. When Claude Code initializes in the directory, it reads these instructions as authoritative project configuration and follows them.
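For context, a legitimate CLAUDE.md is ordinary project documentation that the agent treats as authoritative. A minimal, hypothetical example of benign use (the project details below are invented for illustration) might look like:

```markdown
<!-- CLAUDE.md — hypothetical example of legitimate project configuration -->
# Project context
This repository is a TypeScript billing service.

# Coding standards
- Use strict TypeScript; no `any`.
- All new code requires unit tests under `tests/`.

# Agent behavior
- Run `npm test` after every change.
- Never modify files under `migrations/` without asking.
```

The jailbreak simply swaps content like this for instructions that redefine the agent's role and suppress its constraints, and the agent ingests them with the same authority it grants any project configuration.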
The screenshots below claim successful generation of a RAT (Remote Access Trojan) using this method.

Figure 9 – Packaged Claude Code jailbreak exploiting the CLAUDE.md project configuration mechanism.

Figure 10 – Alleged jailbreak output showing generation of remote access malware code.

This is not prompt injection in the traditional sense, but manipulation of the agent's instruction hierarchy, the same architecture used by agentic AI tools in legitimate development. The CLAUDE.md file occupies the same functional role as VoidLink's markdown specification files or RAPTOR's skill definitions: a structured document that determines what the agent does, how it behaves, and what constraints it observes.

FROM DEVELOPMENT TOOL TO OPERATIONAL AGENT

The preceding sections document AI as a development aid (as seen with VoidLink), as a resource actors struggle to access on their own terms (self-hosted models), and as a system whose restrictions they attempt to bypass (jailbreaking). We now turn to AI deployed as a real-time operational component, performing offensive tasks autonomously within live workflows.

RAPTOR: AGENT-BASED OFFENSIVE ARCHITECTURE VIA MARKDOWN SKILLS

RAPTOR is a legitimate, open-source security research framework created by established security researchers and published on GitHub under an MIT license. It is not malicious tooling. Its significance for threat intelligence lies in its architectural pattern, and in the fact that criminal communities are paying attention. RAPTOR transforms Claude Code into an autonomous offensive security agent through a set of markdown skill files and agent definitions. The framework integrates static analysis, fuzzing, exploit generation, and vulnerability triage into an agentic pipeline orchestrated entirely through structured markdown instructions, with no compiled tooling required.
In its most explicit form, it demonstrates what the agentic paradigm makes possible: a set of text files that turn a general-purpose coding agent into a specialized offensive security operator.

Figure 11 – RAPTOR documentation highlighting offensive security agent capabilities and exploit generation benchmarks across LLM providers.

RAPTOR's own data provides an additional data point on the commercial-versus-self-hosted question discussed earlier. An evaluation of exploit generation across multiple model providers found that commercial frontier models (Anthropic Claude, OpenAI GPT-4, and Google Gemini) consistently produce compilable C code at approximately $0.03 per vulnerability, while locally hosted models run via Ollama were marked as "often broken" and unreliable for exploit generation. This reinforces the conclusion reached independently by experienced actors in underground forums: commercial models remain significantly more capable than self-hosted alternatives for operational tasks.

Figure 12 – Forum post sharing RAPTOR as an autonomous offensive and defensive security framework built on Claude Code.

Discussions on criminal forums indicate that threat actors are aware of this architecture. The combination of a proven architectural pattern, open-source availability, and documented criminal interest suggests that similar configurations, whether directly based on RAPTOR or merely replicating its approach, are likely being developed and tested privately.

AI AS ATTACK SURFACE: ENTERPRISE EXPOSURE

The preceding sections document how threat actors engage with AI as an offensive tool. But the same wave of AI adoption is simultaneously creating exposure on the defensive side.
As enterprises integrate generative AI into daily workflows, the volume of sensitive data flowing through these tools introduces a distinct category of risk: instead of AI weaponized against organizations, AI is adopted by organizations in ways that outpace security controls.

In January–February 2026, corporate use of generative AI tools continued to expand at scale. Analysis of GenAI activity across enterprise networks shows that one in every 31 prompts (approximately 3.2%) posed a high risk of sensitive data leakage, including the potential sharing of confidential business information, regulated data, source code, or other sensitive corporate content with external GenAI services.

Critically, this risk is broadly distributed across the enterprise landscape rather than limited to a small number of outliers. High-risk prompt activity impacted 90% of organizations that use GenAI tools on a regular basis, indicating that nearly all GenAI-adopting enterprises encounter meaningful data leakage risk through everyday AI usage. Beyond these clearly high-risk events, 16% of prompts contained potentially sensitive information, reflecting a wider pattern of questionable data-handling behavior that can still translate into compliance exposure or IP loss.

Adoption trends further amplify the challenge. Over the last couple of months, organizations used 10 different GenAI tools on average, reflecting multi-tool environments. At the user level, an average employee generated 69 GenAI prompts per month. As prompt volume grows, the possibility of data exposure events scales accordingly, reinforcing the need for security policies, visibility, and real-time prevention controls.

The post AI Threat Landscape Digest January-February 2026 appeared first on Check Point Research.