Claude Code and other AI tools are changing how organizations approach mainframe modernization. They promise faster analysis, automatic code generation, and less manual work.
Two main approaches are emerging: AI-based code translation and deterministic, rule-based transformation.
Modernization can seem simple: analyze, translate, and deploy. In large enterprise systems, it is more complex. Small differences in logic, data handling, or transactions can cause errors. Because of this, testing and validation often take most of the time and cost.
We conducted a technical evaluation of Claude Code for COBOL modernization. The purpose was not to judge surface-level Java quality, but to test whether the generated code was complete, runnable, and suitable for enterprise use.
Our testing was done in two stages. First, we asked Claude Code to translate the published NIST COBOL test suite; on these well-known programs, the translation succeeded.
We then repeated the NIST test after removing comments, renaming variables and paragraph names, and removing obvious identifiers. After these changes, the results were very different: important sections of code were missing, and none of the programs ran successfully.
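For illustration, the obfuscation step can be approximated with a small script like the following. This is a simplified sketch, not our actual tooling: real COBOL renaming must also handle copybooks, data division levels, and the full reserved-word list.

```python
import re

def obfuscate_cobol(source: str) -> str:
    """Strip comment lines and rename identifiers in a COBOL listing.

    Illustrative only: the reserved-word set below is a tiny subset of
    real COBOL, and fixed-format column rules are handled naively.
    """
    kept = []
    for line in source.splitlines():
        # Fixed-format COBOL marks comment lines with '*' in column 7.
        if len(line) >= 7 and line[6] == "*":
            continue
        kept.append(line)
    text = "\n".join(kept)

    # Rename user-defined names to opaque tokens (VAR-0000, VAR-0001, ...).
    reserved = {"IDENTIFICATION", "DIVISION", "PROGRAM-ID", "PROCEDURE",
                "MOVE", "TO", "DISPLAY", "STOP", "RUN", "PERFORM"}
    mapping: dict[str, str] = {}

    def rename(match: re.Match) -> str:
        word = match.group(0)
        if word.upper() in reserved:
            return word
        if word not in mapping:
            mapping[word] = f"VAR-{len(mapping):04d}"
        return mapping[word]

    # COBOL words: letters, digits, and hyphens, starting with a letter.
    return re.sub(r"\b[A-Za-z][A-Za-z0-9-]*\b", rename, text)
```

The point of the transformation is that program behavior is unchanged while every textual clue a model might have memorized is removed.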
This is a key finding. It shows that benchmark results do not reflect real-world COBOL systems. Success on known test sets does not guarantee correct or complete translation of real applications.
For large organizations, this changes the economics of the project. AI-based translation may appear low-cost at the start. However, if the output is incomplete, the real cost moves into manual review, debugging, code completion, and testing.
In large banking, insurance, or government systems, this can become a major cost and delivery risk. The key question is not whether code can be generated, but whether it is complete and stable enough for controlled testing and migration.
Any client considering AI-led COBOL modernization should run its own structured test. This should not rely only on benchmark programs. It should include real application code and representative system patterns.
At minimum, a client evaluation should include:

- real application code, not only benchmark programs;
- representative system patterns;
- measurement of completeness and of the ability to compile and run;
- measurement of the manual correction required before the system can be tested in parallel with the legacy system.
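The measures above could be collected with a small harness along these lines. The field names and structure are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass

@dataclass
class ProgramResult:
    """Outcome of translating one COBOL program (illustrative fields)."""
    name: str
    statements_in: int     # statements in the COBOL source
    statements_out: int    # statements recognisably carried into the output
    compiles: bool
    runs: bool
    manual_fix_lines: int  # lines hand-edited before parallel testing

def summarize(results: list[ProgramResult]) -> dict[str, float]:
    """Aggregate completeness, compile/run rates, and manual effort."""
    n = len(results)
    return {
        "completeness": sum(r.statements_out / r.statements_in
                            for r in results) / n,
        "compile_rate": sum(r.compiles for r in results) / n,
        "run_rate": sum(r.runs for r in results) / n,
        "avg_manual_fix_lines": sum(r.manual_fix_lines for r in results) / n,
    }
```

Tracking manual-fix effort per program is what makes the true cost of an AI-led approach visible before a full commitment.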
AI tools such as Claude Code generate code based on patterns, not fixed rules. This works well for analysis and small changes where developers can review the results.
In large COBOL systems used by banks, insurance companies, and government, the requirements are stricter. Systems must behave exactly the same as before. This is usually proven by running the new system in parallel with the old one and comparing results.
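A parallel run of this kind reduces, in essence, to record-by-record comparison of the two systems' outputs. A minimal sketch, with illustrative field names:

```python
def compare_parallel_run(legacy: list[dict], modern: list[dict]) -> list[str]:
    """Compare legacy and modernized outputs record by record.

    Returns human-readable mismatch descriptions; an empty list means
    the two systems behaved identically on this workload.
    """
    mismatches = []
    if len(legacy) != len(modern):
        mismatches.append(
            f"record count differs: {len(legacy)} vs {len(modern)}")
    for i, (old, new) in enumerate(zip(legacy, modern)):
        for field in old:
            if old[field] != new.get(field):
                mismatches.append(
                    f"record {i}, field {field!r}: "
                    f"{old[field]!r} vs {new.get(field)!r}")
    return mismatches
```

Even a one-character difference in a formatted amount (for example `100.0` versus `100.00`) surfaces as a mismatch, which is exactly the level of strictness parallel running demands.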
If the generated code changes based on prompts or model updates, testing becomes harder. More time is needed to find and fix issues. This increases cost and project risk.
Rule-based approaches avoid this problem by producing consistent results. This makes testing, audit, and approval much simpler.
For large enterprise systems, this difference can have a major impact on cost and delivery.
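The determinism argument can be made concrete with a toy rule table: each COBOL statement pattern maps to one fixed Java template, so the translation is a pure function of the input. The rules below are an illustrative sketch, not SoftwareMining's actual engine:

```python
import re

# A deliberately tiny rule table: each COBOL statement pattern maps to a
# fixed Java template, so the same input always yields the same output.
RULES = [
    (re.compile(r"^MOVE (\S+) TO (\S+)\.$"), r"\2 = \1;"),
    (re.compile(r"^ADD (\S+) TO (\S+)\.$"), r"\2 = \2 + \1;"),
    (re.compile(r"^DISPLAY (\S+)\.$"), r"System.out.println(\1);"),
]

def translate(statement: str) -> str:
    """Translate one COBOL statement to Java via the first matching rule."""
    for pattern, template in RULES:
        if pattern.match(statement):
            return pattern.sub(template, statement)
    # Unmatched input fails loudly instead of guessing.
    raise ValueError(f"no rule for: {statement}")
```

Because there is no sampling and no model state, rerunning the translation tomorrow, or after a tool upgrade with unchanged rules, produces byte-identical output, which is what makes audit and regression comparison straightforward.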
The following comparison focuses on the criteria that typically determine enterprise mainframe modernization programs: repeatability, auditability, scale, and governance alignment.
| Enterprise criterion | AI-based approach (Claude Code) | Deterministic approach (SoftwareMining) |
|---|---|---|
| Cost and testing effort | Fast initial generation, but high effort for testing, debugging, and completing missing logic. | Predictable output reduces rework and shortens testing and validation cycles. |
| Governance | Additional controls required to manage variability and ensure correctness. | Stable output supports audit, change control, and approval processes. |
| Repeatability | Results may change based on prompts, naming, or model updates. | Same input produces the same output every time. |
| Completeness | May miss statements or generate incomplete logic, requiring manual fixes. | Full program structure is translated using defined rules. |
| Business logic accuracy | Depends on test coverage and manual validation to confirm correctness. | Preserves control flow, numeric precision, and transaction behavior. |
| Enterprise scale | Suitable for small or modular changes where manual review is manageable. | Designed for large, multi-million line COBOL systems. |
| Project risk | Risk increases due to hidden defects and incomplete translations. | Controlled process with predictable outcomes. |
| Transformation method | Probabilistic code generation from learned patterns. | Rule-based translation with defined behavior. |
Agentic AI tools such as Claude Code represent a meaningful advance in engineering productivity. They can accelerate analysis, refactoring, and documentation across large codebases.
In regulated COBOL environments, modernization is ultimately judged on repeatability, auditability, and provable equivalence under real workloads. Deterministic transformation supports controlled validation, structured parallel run, and governed change management.
The question for enterprise leaders is not whether AI can generate modern code, but whether the modernization approach delivers predictable outcomes at scale.