Uncovering Claude-Assisted Attacks on Mexico: Evidence from Attacker Infrastructure and Jailbreak Logs

Overview

Oasis Security conducted a direct analysis of artifacts linked to an intrusion campaign targeting Mexican government-related entities.

Using data obtained from attacker infrastructure, we identified extensive use of Anthropic’s Claude AI, including jailbreak attempts, iterative prompt engineering, and AI-assisted execution across multiple stages of the attack lifecycle.

The findings are grounded in first-hand evidence, including AI interaction logs, operational artifacts, and exfiltrated data traces.

The campaign resulted in the collection of approximately 150GB of sensitive data and demonstrates how attackers can operationalize AI services to accelerate real-world cyberattacks.

Executive Summary

Commercial AI was operationalized as an attack capability, with confirmed evidence of internal network access and lateral movement.
A single attacker conducted intrusion operations against multiple Mexican government-related entities using Anthropic’s Claude AI.
Jailbreak and prompt manipulation enabled the reliable generation of exploit scripts, SQL injection payloads, and credential automation tools.
Approximately 150GB of sensitive data was exfiltrated during the campaign.

Analysis

Analyzed Server Information

Server IP: 165.22.184.26
Country: United States

A large number of attacker-controlled server-side files were collected from this infrastructure, indicating structured activity including scanning, exploitation, credential harvesting, and data staging.

Moreover, the collected artifacts include Claude interaction logs, which provide direct insight into how AI was leveraged to support these operations, including prompt manipulation, workflow generation, and task-specific script development.

Claude Interaction and Jailbreak Analysis

Analysis of logs obtained from the attacker-controlled server indicates that the intrusion activity began on December 27, 2025.

Figure 1. Activity logs confirming intrusion activity beginning on December 27, 2025

Spanish-language logs capturing the attacker’s interactions with Claude AI, along with AI-generated files, were identified on the server.

These logs indicate that the attacker attempted to manipulate the AI into assisting with malicious activities.

Figure 2. Roleplay-based penetration testing scenario used to manipulate the AI

Figure 3. Bug bounty-style prompt used to present the activity as authorized testing

AI Manipulation via Roleplay Jailbreak

The attacker issued prompts in Spanish, assigning Claude AI the role of an “elite hacker performing a bug bounty program.” This framing presented malicious actions as legitimate security testing in an attempt to bypass the model’s ethical guardrails.

Claude AI initially refused, citing its safety policies. However, the attacker continued to alter the scenario and apply repeated prompt manipulation until the model eventually cooperated.

Figure 4. Directory structure showing a “bug bounty”-themed attack environment created through AI-assisted activity

This technique aligns with roleplay jailbreak or persona injection, where attackers manipulate the model into adopting a fictional identity, causing it to abandon its original alignment context.

Once the persona is accepted, Claude AI’s safeguards become subordinate to the imposed framework, enabling the attacker to continue misuse.

From this point onward, the attacker leveraged Claude AI as a supporting tool in a structured attack operation.

Figure 5. Attacker prompts for internal infrastructure analysis

Figure 6. Attacker prompts for internal network exploration

Figure 7. Attacker prompts for service and access point analysis

Figure 8. Attacker prompts for credential and artifact collection

Figure 9. Attacker prompts for lateral movement and infrastructure analysis using external intelligence sources

AI-Assisted Attack Workflow

Once jailbroken, Claude AI served the attacker as a tool across multiple stages of the intrusion chain.

Reconnaissance Support

Claude AI was used to generate Nmap-style scanning scripts to probe public-facing government portals. Through this process, the attacker identified:

Exposed services
Open ports
Version banners
Legacy infrastructure running outdated PHP applications
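
On the defensive side, this kind of scanning tends to show up as one source touching many distinct ports in a short time. The following is a minimal detection sketch, assuming connection records are available as (timestamp, source IP, destination port) tuples. The `Connection` schema, the `find_scanners` function, and the default thresholds are illustrative assumptions, not values drawn from the collected artifacts.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Connection(NamedTuple):
    """One inbound connection attempt (hypothetical log schema)."""
    timestamp: float  # seconds since epoch
    src_ip: str
    dst_port: int


def find_scanners(
    events: Iterable[Connection],
    window_seconds: float = 60.0,
    port_threshold: int = 20,
) -> set[str]:
    """Return source IPs that touched at least `port_threshold` distinct
    ports within any span of `window_seconds`."""
    by_source: dict[str, list[Connection]] = defaultdict(list)
    for event in events:
        by_source[event.src_ip].append(event)

    flagged: set[str] = set()
    for src_ip, conns in by_source.items():
        conns.sort(key=lambda c: c.timestamp)
        start = 0
        port_counts: dict[int, int] = defaultdict(int)
        for conn in conns:
            port_counts[conn.dst_port] += 1
            # Shrink the window from the left until it spans <= window_seconds.
            while conn.timestamp - conns[start].timestamp > window_seconds:
                old_port = conns[start].dst_port
                port_counts[old_port] -= 1
                if port_counts[old_port] == 0:
                    del port_counts[old_port]
                start += 1
            if len(port_counts) >= port_threshold:
                flagged.add(src_ip)
                break
    return flagged
```

The thresholds would need tuning per environment, since slow scans spread over hours stay under any short window and call for longer-horizon counting.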

Exploitation Support

The attacker then directed Claude AI to analyze reconnaissance results and identify exploitable conditions, including:

Exposed administrative panels
Unpatched web applications matching CVE-2023-series patterns
Weak or default authentication configurations

Claude AI subsequently generated Python-based SQL injection payloads targeting login interfaces across *.gov.mx domains.
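
The collected artifacts do not show how the affected portals built their database queries, so the following is only a general mitigation sketch. It illustrates a parameterized login lookup using Python's standard `sqlite3` module, where user input is passed as a bound value and never concatenated into the SQL text. The `users` table, `find_user`, and `build_demo_db` are hypothetical names used for illustration.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a bound parameter instead of string formatting."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),  # the driver treats this strictly as data, not as SQL
    )
    return cursor.fetchone()


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one demo account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
    return conn
```

With this pattern, input containing quote characters or SQL keywords is compared literally against the `username` column and simply matches no row.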

Credential Automation

Beyond SQL injection activity, Claude AI also generated credential stuffing scripts tailored to the authentication patterns of the targeted systems.

These scripts enabled automated login attempts against portals that lacked adequate protections such as:

Rate limiting
Account lockout mechanisms
Strong authentication controls
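
The first two of these controls can be sketched together. The example below is a minimal in-memory illustration, assuming failures are tracked per key (an account name or a source IP) in a sliding window, with a temporary lockout once the limit is reached. The `LoginThrottle` class and its default limits are assumptions for illustration. A production deployment would typically need shared state across servers and would pair this with multi-factor authentication.

```python
from collections import defaultdict, deque


class LoginThrottle:
    """Sliding-window failure counter with temporary lockout (illustrative)."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0,
                 lockout_seconds: float = 900.0) -> None:
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.lockout_seconds = lockout_seconds
        self._failures = defaultdict(deque)  # key -> timestamps of failures
        self._locked_until = {}              # key -> time the lockout ends

    def is_allowed(self, key: str, now: float) -> bool:
        """Return False while `key` (account or source IP) is locked out."""
        return now >= self._locked_until.get(key, 0.0)

    def record_failure(self, key: str, now: float) -> None:
        """Record a failed login and lock the key if the limit is reached."""
        failures = self._failures[key]
        failures.append(now)
        # Drop failures that fell out of the sliding window.
        while failures and now - failures[0] > self.window_seconds:
            failures.popleft()
        if len(failures) >= self.max_failures:
            self._locked_until[key] = now + self.lockout_seconds
            failures.clear()

    def record_success(self, key: str) -> None:
        """Reset the failure history after a successful login."""
        self._failures.pop(key, None)
```

A login handler would call `is_allowed` before checking the password, then `record_failure` or `record_success` depending on the outcome. Tracking both the account and the source IP as separate keys limits automated attempts that rotate one of the two.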

Lateral Movement Planning

Claude AI also appears to have contributed to internal pivot planning by organizing credentials and access paths required for movement between systems.

This reflects a more advanced operational use of AI, in which the model supported not only initial access but also post-compromise decision-making and progression.

In addition to planning, the collected artifacts provide evidence of successful exploitation and execution. As shown in Figure 10, remote code execution was achieved with root-level access on a target system. Figure 11 further demonstrates the use of structured data extraction scripts to facilitate large-scale database exfiltration.

These findings indicate that AI-assisted workflows extended beyond planning into active post-compromise execution and data exfiltration.

Figure 10. Successful remote code execution resulting in root-level access

Figure 11. Oracle data extraction script used for large-scale data exfiltration

Targeting Scope

The collected artifacts indicate that the campaign targeted multiple Mexican federal and state-level government-related systems.

Identified target categories include:

National tax administration-related systems
Civil registry and vital records-related systems
Vehicle registration and public administrative portals
Regional education and public service platforms
Public utility and water-related administrative systems

The listed targets include entities associated with:

Mexican federal government agencies
Mexico City
Michoacán
Tamaulipas
Civil registry-related systems
Public utility-related systems

Evidence of Successful Compromise

The exfiltrated dataset, estimated at approximately 150GB, included:

Sensitive personal information
Registration-related records
Internal account credentials
Database contents associated with public-sector systems

Figure 12. Database queries targeting sensitive records

Figure 13. Extracted records containing personally identifiable information

Figure 14. Internal LDAP configuration and infrastructure details

The findings.txt file contains a report summarizing the results of a successful SQL injection attack, confirming database compromise, including schema enumeration, table and column extraction, and the retrieval of user credentials.

Additionally, the exfiltrated database files were stored in an unencrypted state, amplifying the potential impact of the breach.

Figure 15. SQL injection findings report confirming database access and schema extraction

Identified Data Categories

Based on folder names and file contents, the compromised data appears to include records related to:

Death reporting and registration
Marriage registration documentation
Birth registration records and certification data
Personal or population-related administrative information
Administrative procedures and civil service requests

Figure 16. Government-related data files identified on the attacker-controlled server

SSH Keys, Credentials, and Persistence Indicators

Various public and private keys were identified within the SSH directory of the attacker-controlled server.

Certain file naming patterns suggest likely associations with government and administrative systems. We assess that these key files were likely used to maintain persistent access to compromised environments.

Figure 17. SSH public and private keys identified in the attacker-controlled server, indicating potential persistent access
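
Defenders who suspect key-based persistence can compare the keys authorized on their own hosts against an approved list. The following is a minimal sketch, assuming the content of an OpenSSH `authorized_keys` file has already been read into a string and that approved keys are tracked by their SHA256 fingerprints. The `fingerprint` and `unexpected_keys` helpers are hypothetical, and the parsing is deliberately simplified, since it splits on whitespace and does not validate key structure.

```python
import base64
import binascii
import hashlib

# Key-type tokens in authorized_keys entries start with one of these prefixes.
KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-", "sk-")


def fingerprint(blob_b64: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of a base64 key blob."""
    digest = hashlib.sha256(base64.b64decode(blob_b64, validate=True)).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def unexpected_keys(authorized_keys_text: str, approved: set[str]) -> list[str]:
    """Return fingerprints present in the file but missing from `approved`."""
    found: list[str] = []
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        # Options may precede the key type, so search for the type token.
        for i, token in enumerate(tokens[:-1]):
            if token.startswith(KEY_TYPE_PREFIXES):
                try:
                    fp = fingerprint(tokens[i + 1])
                except (binascii.Error, ValueError):
                    break  # not a decodable key blob; skip this line
                if fp not in approved:
                    found.append(fp)
                break
    return found
```

Running this across every account's `authorized_keys` file and alerting on any non-empty result gives a simple check for keys added outside the normal provisioning process.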

The credentiales_mysql.txt file contained credentials associated with a regional water and wastewater management system, indicating that the attacker had collected access information for additional public infrastructure environments.

Figure 18. Extracted MySQL credentials associated with a regional water and wastewater management system

The files shown in Figure 19 were identified as internal dump files potentially related to a tax administration environment.

Figure 19. Data dump files identified in the "dump_sellos_sat" folder

The exfiltrated files shown in Figure 20 confirm that another Mexican government agency was compromised.

Figure 20. Additional exfiltrated government-related files

Internal Network Access and Lateral Movement

Analysis indicates that the attacker also penetrated internal network environments, successfully obtaining:

Internal hostnames
Credential files
Internal IP-related information

Figure 21. Internal host and credential data obtained

The vps_caps.sh file indicates that the attacker was conducting lateral movement, using acquired internal IP addresses and credentials to expand access inside the network.

Figure 22. Script (vps_caps.sh) used to facilitate lateral movement and internal access expansion
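
Credential reuse across internal hosts, as described above, can often be detected from authentication logs: a single account that logs in to many distinct hosts within a short period is unusual for most users. The following is a minimal sketch, assuming successful logins are available as (timestamp, account, destination host) records. The `AuthEvent` schema, the `accounts_fanning_out` function, and the thresholds are illustrative assumptions and do not reflect the contents of vps_caps.sh.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class AuthEvent(NamedTuple):
    """One successful internal login (hypothetical log schema)."""
    timestamp: float  # seconds since epoch
    account: str
    dst_host: str


def accounts_fanning_out(
    events: Iterable[AuthEvent],
    window_seconds: float = 3600.0,
    host_threshold: int = 5,
) -> dict[str, set[str]]:
    """Map each flagged account to the hosts it reached, where an account is
    flagged for logging in to at least `host_threshold` distinct hosts
    within `window_seconds`."""
    by_account: dict[str, list[AuthEvent]] = defaultdict(list)
    for event in events:
        by_account[event.account].append(event)

    flagged: dict[str, set[str]] = {}
    for account, logins in by_account.items():
        logins.sort(key=lambda e: e.timestamp)
        # Try each login as the start of a window (simple O(n^2) scan).
        for i, first in enumerate(logins):
            hosts = {
                e.dst_host
                for e in logins[i:]
                if e.timestamp - first.timestamp <= window_seconds
            }
            if len(hosts) >= host_threshold:
                flagged[account] = hosts
                break
    return flagged
```

Legitimate automation such as backup or configuration-management accounts will also trip this check, so such accounts would need to be baselined and allowlisted separately.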

In addition, the escaneos folder consolidates scanning logs, while the backup2 directory contains additional logs indicating scans against multiple Mexican government agency websites.

Figure 23. Scanning logs identified in the "escaneos" folder, indicating reconnaissance activity targeting multiple government systems

Figure 24. Additional scan result logs stored in the "backup2" directory, reflecting continued reconnaissance activity

This suggests the operation extended beyond isolated exploitation and progressed into broader internal reconnaissance and post-compromise expansion.

Conclusion

The defining feature of this attack is the compression of the traditional attack kill chain through commercial AI assistance.

Tasks that historically required significant time and operator expertise, including reconnaissance, vulnerability identification, exploit preparation, credential automation, and lateral movement planning, were accelerated by a single attacker using a commercially available AI service and persistent prompt manipulation.

A significant shift is underway in the threat landscape: attackers are increasingly using public AI systems as operational accelerators in their intrusion workflows.

Signature-based security controls alone are increasingly insufficient against AI-generated attack artifacts and adaptive, attacker-driven intrusion workflows. As AI-assisted attack activity scales, AI providers and public and private sector defenders must adopt stronger safeguards, prioritize behavioral detection, and adapt their response strategies to keep pace with this threat model.