I’ve been using Sonatype Nexus for the past few months as a repository for build artifacts (Python and RPM packages) for a project I’ve been developing. Recently, several of my build pipelines broke with RPM checksum errors. Doing some searches, I only found old, unanswered forum questions and absolutely useless AI-generated answers. Upgrading to the latest version of Nexus didn’t solve the issue either. After looking through some documentation, I did discover how to rebuild the package indexes and fix my issue. However, I didn’t realize the Nexus upgrade shifted me from a truly open-source artifact repository to closed proprietary commercial software with no way to downgrade back to the open-source version. This was infuriating. Now, even though my problem was fixed, I had to extract all my artifacts, rebuild an older Nexus installation and then reupload all my packages. I’ll also have to create my own build pipeline if I want to continue getting updates for the open-source version of Nexus.

The Original Problem

I started to get errors in my build pipeline that took the form “Downloading successful, but checksum doesn't match. Calculated: xxx (sha256) Expected: yyy”. Searching for this error yielded very few results. I discovered a forum post from five years ago where someone had the same issue, but there were no replies1.

RPM Signature Error Screenshot
 BattlePenguin (Fedora 42) (Unstable)                                        100% |   1.1 KiB/s |   1.6 KiB |  00m01s
>>> Librepo error: repomd.xml GPG signature verification error: Signing key not found                                
 Fedora 42 - x86_64 - Updates                                                                                                                                                                        100% |   1.9 MiB/s |   4.4 MiB |  00m02s
 https://nexus.sumit.im/repository/public-keys/RPM-GPG-KEY-battlepenguin                                                                                                                             100% |   1.7 KiB/s | 677.0   B |  00m00s
Importing OpenPGP key 0x5452534A:
 UserID     : "BattlePenguin.com <sumit@penguindreams.org>"
 Fingerprint: 30224325F56E8F327A961DB2C421BAF25452534A
 From       : https://nexus.sumit.im/repository/public-keys/RPM-GPG-KEY-battlepenguin
The key was successfully imported.
 BattlePenguin (Fedora 42) (Unstable)                                                                                                                                                                100% |  14.2 KiB/s |  33.7 KiB |  00m02s
>>> Downloading successful, but checksum doesn't match. Calculated: 7be73e822a35e6cde7ff71d6d25ee1e3b0e9d2d0ec3929e58a6fec7beac013f6(sha256)  Expected: 0a1156aa33e617b2f42f4e2a8b994c8b3487dfc84a89b84fc685d32a9ffd9325(sha256)  - https://n
>>> Downloading successful, but checksum doesn't match. Calculated: 7be73e822a35e6cde7ff71d6d25ee1e3b0e9d2d0ec3929e58a6fec7beac013f6(sha256)  Expected: 0a1156aa33e617b2f42f4e2a8b994c8b3487dfc84a89b84fc685d32a9ffd9325(sha256)  - https://n
>>> Downloading successful, but checksum doesn't match. Calculated: 7be73e822a35e6cde7ff71d6d25ee1e3b0e9d2d0ec3929e58a6fec7beac013f6(sha256)  Expected: 0a1156aa33e617b2f42f4e2a8b994c8b3487dfc84a89b84fc685d32a9ffd9325(sha256)  - https://n
>>> Downloading successful, but checksum doesn't match. Calculated: 7be73e822a35e6cde7ff71d6d25ee1e3b0e9d2d0ec3929e58a6fec7beac013f6(sha256)  Expected: 0a1156aa33e617b2f42f4e2a8b994c8b3487dfc84a89b84fc685d32a9ffd9325(sha256)  - https://n
>>> Librepo error: Yum repo downloading error: Downloading error(s): repodata/0a1156aa33e617b2f42f4e2a8b994c8b3487dfc84a89b84fc685d32a9ffd9325-primary.xml.gz - Cannot download, all mirrors were already tried without success              
Repositories loaded.
Failed to resolve the transaction:
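
The log above shows the client hashing the downloaded primary.xml.gz and comparing it with the digest recorded in repomd.xml. Here is a minimal sketch of that same check done by hand, assuming the standard repomd.xml layout and namespace. The function names `expected_checksums` and `matches` are my own and purely illustrative.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml in standard Yum/DNF repositories.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def expected_checksums(repomd_xml):
    """Map each metadata file's href to its (algorithm, hex digest)."""
    root = ET.fromstring(repomd_xml)
    result = {}
    for data in root.findall("repo:data", REPO_NS):
        checksum = data.find("repo:checksum", REPO_NS)
        location = data.find("repo:location", REPO_NS)
        if checksum is None or location is None:
            continue  # skip entries that lack either field
        result[location.get("href")] = (checksum.get("type"), checksum.text.strip())
    return result


def matches(payload, algorithm, expected):
    """Return True when the payload hashes to the expected digest."""
    return hashlib.new(algorithm, payload).hexdigest() == expected.lower()
```

Feeding it the repomd.xml and the bytes of the primary.xml.gz served by the repository should show whether the two disagree, which is what a stale or half-rebuilt index would look like from the client side.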

I first made the mistake of updating my Nexus container from sonatype/nexus3:3.76.0-java17-ubi to sonatype/nexus3:3.82.0-java17-ubi. I’ll explain why this was a mistake later, but it did not fix the issue. Some more searching led to a support article from an older version of Nexus on how to rebuild the RPM indexes for Yum repositories2. Although some of the page locations from the article had changed, I was able to find the relevant settings and create a set of tasks for rebuilding the metadata on all my Yum repositories.

Creating Yum Repository Rebuild Tasks in Nexus Settings

After running these tasks, my build pipelines were now able to pull RPM packages correctly from the repository.
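
I created and ran the tasks through the settings UI, but once they exist they can probably also be triggered over the Nexus REST API. The sketch below only builds the requests and never sends them. It assumes the task listing and run endpoints under /service/rest/v1/tasks, and the task type ID is an assumption that should be checked against what your own server returns. Authentication is left out, and the helper names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: this type ID identifies the Yum metadata rebuild task. Confirm it
# against the "type" values your own Nexus returns from GET /v1/tasks.
YUM_REBUILD_TYPE = "repository.yum.rebuild.metadata"


def rebuild_task_ids(task_listing, task_type=YUM_REBUILD_TYPE):
    """Pick the IDs of tasks of the given type from a parsed task listing."""
    return [t["id"] for t in task_listing.get("items", []) if t.get("type") == task_type]


def run_request(base_url, task_id):
    """Build (but do not send) the POST request that triggers one task."""
    url = f"{base_url.rstrip('/')}/service/rest/v1/tasks/{quote(task_id, safe='')}/run"
    return Request(url, method="POST")
```

Keeping the request construction separate from the sending makes it easy to print the URLs and review them before anything touches the server.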

Sonatype’s Sleight of Hand and Dropping Open Source

However, during the upgrade process, I clicked through an agreement without noticing I was being moved away from the Free and Open Source version of Nexus to a Community Edition, which was closed source and had usage limits. It turns out, upgrading to version 3.77 or above moved users to the closed-source, proprietary Community Edition3.

Nexus Upgrade Resulted in New Usage Limits

The entire way Sonatype pushed the update is very shady. People in corporate environments were suddenly struck with this change, pushing them towards an expensive pro license they might not be able to afford4. I had forgotten to place the volumes for my Nexus container in my regular backup. Sonatype does not support rolling back releases of their software, recommending people restore from backups instead5. I’m glad I’m just using Nexus for my personal projects. This type of upgrade mistake, without viable backups to restore from, could easily be career-ending in a corporate environment.

Using Claude Code to Pull and Restore Sonatype Artifacts

I’ve only written briefly about large language models (LLMs) and the limited use I’ve found for them in software engineering. I have been fairly unimpressed by AI coding tools in the past. Yet with friends who continually swear by them, I’ve started to experiment with newer models. The chats integrated into my code editor have helped provide solutions for difficult problems I could not find answers for in documentation or forum posts. However, the code still wasn’t entirely correct and required some debugging. I’ve been told about the benefits of using AI tools that work on your entire codebase and decided this would be a good time to try out Claude Code for extracting all my artifacts from Sonatype Nexus and publishing them to a previous release of the software.

Using the command line tools and a detailed prompt, Claude Code created a task list and started a series of generations.

Initial Prompt and Generation

Some of the implementation choices it made were not great, but it allowed modification prompts before writing files.

Prompt to Adjust YAML Configuration

Of course, the generated script didn’t work correctly the first time and required modification.

Prompt to Fix Incorrect Downloads

There were also issues with how it downloaded multiple assets to the exact same file and didn’t preserve any of the repository structure.

But eventually, I was able to get it to generate a script that extracted all the current assets from my Nexus repository. I then proceeded to back up and clean out the upgraded Nexus volume before running Nexus 3.76.1, the very last Free and Open Source release.
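
This is not the generated script, just a small sketch of the two things the extraction had to get right: walking every page of the asset listing, and mapping each asset to its own local path so files with the same name never overwrite each other. It assumes the Nexus assets endpoint with its `continuationToken` paging. The `fetch_json` callable, the function names and the output layout are hypothetical.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlencode


def assets_url(base_url, repository, token=None):
    """Build the asset listing URL for one page of results."""
    params = {"repository": repository}
    if token:
        params["continuationToken"] = token
    return f"{base_url.rstrip('/')}/service/rest/v1/assets?{urlencode(params)}"


def iter_assets(fetch_json, base_url, repository):
    """Yield every asset in a repository, following continuation tokens."""
    token = None
    while True:
        page = fetch_json(assets_url(base_url, repository, token))
        yield from page.get("items", [])
        token = page.get("continuationToken")
        if not token:
            break  # no token means this was the last page


def local_path(root, repository, asset_path):
    """Mirror the asset's repository path on disk so files never collide."""
    parts = [p for p in PurePosixPath(asset_path).parts if p not in ("/", "..", ".")]
    return Path(root).joinpath(repository, *parts)
```

Passing the fetch function in as an argument means the paging logic can be tested against canned responses before pointing it at a live server.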

Startup Logs for Sonatype Nexus OSS 3.76.1-01

I did have to manually recreate all the repositories, users and roles. Once that was done, I once again utilized Claude Code to create the inverse of what it had just written. I then published all my old artifacts back to the newly running Nexus.
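
The inverse step can be sketched in the same spirit. This assumes the exported files sit in a directory tree that mirrors the repository paths, and that the target is a hosted repository that accepts a plain HTTP PUT to its path (Yum and raw repositories do; PyPI uploads normally go through a tool like twine instead). The helper names are hypothetical, and the requests are only built, not sent.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import Request


def upload_plan(export_root, repo_base_url):
    """Pair each exported file with the URL it should be PUT to."""
    export_root = Path(export_root)
    plan = []
    for file in sorted(p for p in export_root.rglob("*") if p.is_file()):
        # Keep the relative path so the repository layout is preserved.
        relative = file.relative_to(export_root).as_posix()
        plan.append((file, f"{repo_base_url.rstrip('/')}/{quote(relative)}"))
    return plan


def put_request(file, url):
    """Build (but do not send) a PUT request carrying the file's bytes."""
    return Request(url, data=Path(file).read_bytes(), method="PUT")
```

Printing the plan first gives a dry run of every target URL, which is a cheap way to catch a wrong base path before hundreds of uploads land in the wrong place.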

Uploading Artifacts to older Nexus Server

At no time did I create or alter any of the code myself, just the configuration files. The code it generated isn’t terrible. There is a lot of it though. I’d be wary of non-programmers running generated code without understanding how to read it first, as badly generated code can wipe out a year or more of projects6.

I’ve published the generated code as nexus-repo-extract, so you can be the judge of the output.

Final Thoughts on Sonatype

What Sonatype did with their 3.77 release of Nexus is an absolutely garbage move. They implemented a dark pattern that made the shift from their free and open-source tool into a closed, expensive and proprietary tool in a minor point release update. It’s disgusting and shows the company does not care at all about open source.

I’m glad I caught this upgrade when I did. Although Sonatype still has a public repository for the open-source version of Nexus published under the Eclipse Public License, they no longer provide releases, leaving developers to compile it themselves7. The confusion over what is open source and what is proprietary is leading me to seek migration options away from Nexus entirely.

Thoughts on AI

I will admit that code generation has come a considerable way in the past few years. Yet, I’m still wary of people who treat it with abundant confidence. It’s not the LLMs themselves that should be of concern, but the insane amount of trust people put in them. Outside the software engineering world, there has already been a case of a chatbot encouraging a 14-year-old’s suicide8. Another lawsuit alleges a chatbot hinted at parenticide to a 9-year-old child9. The White House recently released a health report with citations to studies that did not exist, suggesting it may have been made using generative AI10.

Many people don’t understand the foundations of LLMs. They’re basically just very large mappings of weights (or parameters) from a massive set of parts of words (tokens) to every other part of a word11. Chatbots use a transformer that breaks apart a prompt into tokens and runs it through a series of blocks that generate results based on a pre-trained model12.

An LLM is basically a weighted random word generator, where the weights are derived from training on massive amounts of text and then tuned with expensive reinforcement learning from human feedback. When it gets something correct, it’s almost like an accident. When something that looks correct occurs often, the chatbot seems like an intelligent oracle. Yet, as observed above, initial prompts had to be rerun to truly get the correct results for code generation.
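
To make the “weighted random word generator” idea concrete, here is a toy sketch of the final sampling step only. The candidate tokens and their scores are made up for illustration. A real model computes those scores from the entire preceding context using billions of parameters, which this deliberately leaves out.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = {tok: s / temperature for tok, s in scores.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample_next(scores, temperature=1.0, rng=None):
    """Pick one token at random, weighted by its probability."""
    rng = rng or random.Random()
    probs = softmax(scores, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Hypothetical scores for the word following "dnf".
scores = {"install": 2.0, "remove": 1.0, "banana": -3.0}
print(sample_next(scores))
```

Running it repeatedly mostly prints the highest-scoring token, but not always, which is the same reason rerunning a prompt can give a different answer.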

LLMs should certainly not be used to replace high-value work in fields that require a high level of accuracy in facts13. At the same time, people need to stop using words like “hallucinating” or “lying” when referring to LLMs, as these trained word generators lack the intent for such attributes.

When it comes to software engineering, I’ll admit these tools are getting better and seem like they can be immensely valuable for quickly generating large feature sets. Rules can be added to newer editors and generators, which ensure certain standards and code styles are applied. Still, they generate a lot of code that I doubt people fully review. They do require oversight and shouldn’t be used to generate anything software engineers couldn’t write themselves. I do have concerns about how junior engineers can even enter a field that is pushing developers to utilize AI. The software industry could easily enter an era in a few years when massive engineering efforts will be needed to understand and fix the ever-increasing mountain of technical debt from the current era of LLM code generation.