Refactored the Nix initialization script to reduce duplicated code and
centralize the installation workflow. The core functionality remains
unchanged, but all installer calls now use a unified function with retry
support to ensure resilient downloads in CI and container environments.
Key improvements:
- Added download retry logic (5 minutes total, 20-second intervals)
- Consolidated installer invocation into `install_nix_with_retry`
- Reduced code duplication across container/host install paths
- Preserved existing installation behavior for all environments
- Maintained `nixbld` group and build-user handling
- Improved consistency and readability without altering semantics
This prevents intermittent failures such as:
"curl: (6) Could not resolve host: nixos.org"
and makes Nix setup in CI pipelines resilient to transient network errors.
https://chatgpt.com/share/693b13ce-fdcc-800f-a7bc-81c67478edff
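The retry behavior described above (5 minutes total, 20-second intervals) could look roughly like the following. This is a minimal sketch, not the actual `install_nix_with_retry`: the names `retry_with_budget`, `download_nix_installer`, `RETRY_TOTAL_SECONDS` and `RETRY_INTERVAL_SECONDS` are hypothetical and chosen for illustration only.

```shell
# Hypothetical sketch: retry a command until a total time budget is spent.
# The variable and function names are illustrative, not taken from init-nix.sh.
RETRY_TOTAL_SECONDS="${RETRY_TOTAL_SECONDS:-300}"       # 5 minutes in total
RETRY_INTERVAL_SECONDS="${RETRY_INTERVAL_SECONDS:-20}"  # 20 seconds between attempts

# retry_with_budget CMD [ARGS...]
# Runs CMD repeatedly until it succeeds or the waiting budget is used up.
retry_with_budget() {
  local elapsed
  elapsed=0
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$elapsed" -ge "$RETRY_TOTAL_SECONDS" ]; then
      echo "[init-nix] giving up after ${elapsed}s: $*" >&2
      return 1
    fi
    echo "[init-nix] attempt failed, retrying in ${RETRY_INTERVAL_SECONDS}s" >&2
    sleep "$RETRY_INTERVAL_SECONDS"
    elapsed=$((elapsed + RETRY_INTERVAL_SECONDS))
  done
}

# Hypothetical wrapper: fetch the Nix installer script into the file given as $1.
download_nix_installer() {
  retry_with_budget curl -fsSL https://nixos.org/nix/install -o "$1"
}
```

The budget counts only the time spent sleeping, so a slow failing download can push the wall-clock total above five minutes. That trade-off keeps the sketch short.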
Implement `ensure_nix_build_group()` and use it in all code paths where Nix is installed as root.
This resolves Nix installation failures on Ubuntu containers (root, no systemd) where the installer aborts with:
```
error: the group 'nixbld' specified in 'build-users-group' does not exist
```
The fix standardizes creation of the `nixbld` group and `nixbld1..10` build users across:
* container root mode
* systemd host daemon installs
* root-on-host without systemd (Debian/Ubuntu CI case)
This makes Nix initialization deterministic across all test distros and fixes failing Ubuntu E2E runs.
https://chatgpt.com/share/693b0e1a-e5d4-800f-8a89-7d91108b0368
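For illustration, creating the build group and users could be sketched as below. This is not the real `ensure_nix_build_group()`: the function name `ensure_build_group_sketch`, the `run` helper and the `DRY_RUN` switch are assumptions added so the example can be tried without root. The `useradd` flags follow the pattern documented for multi-user Nix installs.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# ensure_build_group_sketch [GROUP] [COUNT]
# Creates GROUP (default: nixbld) and users GROUP1..GROUPCOUNT (default: 10)
# if they do not exist yet. Meant to run as root unless DRY_RUN=1 is set.
ensure_build_group_sketch() {
  local group count i nologin_path
  group="${1:-nixbld}"
  count="${2:-10}"
  # Fall back to /bin/false when no nologin binary is on PATH.
  nologin_path="$(command -v nologin 2>/dev/null || echo /bin/false)"

  # Create the group only when it is missing.
  if ! getent group "$group" >/dev/null 2>&1; then
    run groupadd -r "$group"
  fi

  # Create each missing build user as a member of the group.
  i=1
  while [ "$i" -le "$count" ]; do
    if ! id "${group}${i}" >/dev/null 2>&1; then
      run useradd -r -M -N -g "$group" -G "$group" -d /var/empty \
        -s "$nologin_path" -c "Nix build user $i" "${group}${i}"
    fi
    i=$((i + 1))
  done
}
```

Running `DRY_RUN=1 ensure_build_group_sketch` prints the `groupadd` and `useradd` commands that would be executed, which is handy for checking the result on a new distro image.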
The init-nix.sh script previously hardcoded /usr/bin/bash as the login shell
for the 'nix' user, which exists on Arch but not on Debian. This caused the
Nix single-user installer (run via `su - nix`) to fail silently or break in
unpredictable ways on Debian-based images.
We now resolve the shell dynamically via `command -v bash` and fall back to
/bin/sh on minimal systems. This makes Nix installation deterministic across
Arch, Debian, Ubuntu, Fedora, CentOS and CI containers.
https://chatgpt.com/share/6939e97f-c93c-800f-887b-27c7e67ec46d
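A minimal sketch of the dynamic shell lookup, assuming a helper named `resolve_login_shell` (a hypothetical name, not necessarily the one used in init-nix.sh):

```shell
# Hypothetical helper: print the path of a usable login shell.
# Prefers bash wherever it is installed and falls back to /bin/sh.
resolve_login_shell() {
  local shell_path
  shell_path="$(command -v bash 2>/dev/null || true)"
  if [ -z "$shell_path" ]; then
    shell_path="/bin/sh"
  fi
  printf '%s\n' "$shell_path"
}

# Possible use when creating the user (requires root, so left commented out):
#   useradd -m -s "$(resolve_login_shell)" nix
```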
In GitHub's Fedora-based CI containers the directory /nix may already exist
(e.g. from the base image or a previous build layer) and is often owned by
root:root. In this situation the Nix single-user installer aborts with:
"directory /nix exists, but is not writable by you"
This caused the container build to fail during `init-nix.sh`, leaving no
working `nix` binary on PATH. As a result, the runtime wrapper
(pkgmgr-wrapper.sh) reported:
"[pkgmgr-wrapper] ERROR: 'nix' binary not found on PATH."
Local runs did not show the issue because a previous installation had already
created /nix with correct ownership.
This commit makes container-mode Nix initialization fully idempotent:
• If /nix does not exist → create it with owner nix:nixbld (existing logic).
• If /nix exists but has wrong owner/group → forcibly chown -R nix:nixbld.
• A warning is emitted if /nix remains non-writable after correction.
This guarantees that the Nix installer always has writable access to /nix
and prevents the installer from aborting in CI. As a result, `pkgmgr --help`
works again inside Fedora CI containers.
https://chatgpt.com/share/69384149-9dc8-800f-8148-55817ece8e21
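The first two cases of the ownership handling above could be sketched as follows. The function name `ensure_dir_owner` is hypothetical, and `stat -c` assumes GNU coreutils, as found in the Linux CI images discussed here. The writability warning is left out of the sketch.

```shell
# ensure_dir_owner DIR OWNER:GROUP
# Hypothetical helper: make sure DIR exists and is owned by OWNER:GROUP.
# It is a no-op when the ownership is already correct, so repeated runs are safe.
ensure_dir_owner() {
  local dir want have
  dir="$1"
  want="$2"

  # Case 1: the directory is missing, so create it with the desired owner.
  if [ ! -d "$dir" ]; then
    mkdir -p "$dir"
    chown "$want" "$dir"
    return
  fi

  # Case 2: the directory exists but has the wrong owner or group.
  have="$(stat -c '%U:%G' "$dir")"
  if [ "$have" != "$want" ]; then
    echo "[init-nix] fixing ownership of $dir: $have -> $want" >&2
    chown -R "$want" "$dir"
  fi
}

# Possible use in container mode (requires root, so left commented out):
#   ensure_dir_owner /nix nix:nixbld
```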
References:
- Current ChatGPT conversation: https://chatgpt.com/share/6935d6d7-0ae4-800f-988a-44a50c17ba48
- Extended discussion: https://chatgpt.com/share/6935d734-fd84-800f-9755-290902b8cee8
Summary:
This commit performs a major cleanup and modernization of the installation pipeline:
1. Introduced a new capability-detection subsystem:
- Capabilities (python-runtime, make-install, nix-flake) are detected per installer/layer.
- Installers run only when they add new capabilities.
- Prevents duplicated work such as Python installers running when Nix already provides the runtime.
2. Removed deprecated pkgmgr.yml manifest installer:
- Dependency resolution is now delegated entirely to real package managers (Nix, pip, make, distro build tools).
- Simplifies layering and avoids unnecessary recursion.
3. Reworked OS-specific installers:
- Arch PKGBUILD now uses 'makepkg --syncdeps --cleanbuild --install --noconfirm'.
- Debian installer now builds proper .deb packages with dpkg-buildpackage and installs them.
- RPM installer now builds packages using rpmbuild and installs them via rpm.
4. Switched from remote GitHub flakes to local-flake execution:
- Wrapper now executes: nix run /usr/lib/package-manager#pkgmgr
- Avoids lock-file write attempts and improves reliability in CI.
5. Added a `bash -i`-based integration test:
- Correctly sources ~/.bashrc and evaluates alias + venv activation.
- `pkgmgr --help` output is now printed for debugging without failing the tests.
6. Updated unit tests across all installers:
- Removed references to manifest installer.
- Adjusted expectations for new behaviors (makepkg, dpkg-buildpackage, rpmbuild).
- Added capability subsystem tests.
7. Improved flake.nix packaging logic:
- The entire project source tree is copied into the runtime closure.
- pkgmgr wrapper now executes runpy inside the packaged directory.
Together, these changes create a predictable, layered, capability-driven installer pipeline with consistent behavior across Arch, Debian, RPM, Nix, and Python layers.
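The capability rule from point 1 ("installers run only when they add new capabilities") can be illustrated with a small shell sketch. It is not the actual subsystem. The function names, the `name:cap1,cap2` spec format and the capability sets assigned to each installer are assumptions made for the example.

```shell
# has_word LIST WORD
# Succeeds if WORD is one of the space-separated items in LIST.
has_word() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# plan_installers "name:cap1,cap2" ...
# Hypothetical planner: walks the installers in order and prints the name of
# each one that contributes at least one capability not provided so far.
plan_installers() {
  local provided spec name caps cap adds
  provided=""
  for spec in "$@"; do
    name="${spec%%:*}"
    caps="$(printf '%s' "${spec#*:}" | tr ',' ' ')"
    adds=0
    for cap in $caps; do
      if ! has_word "$provided" "$cap"; then
        adds=1
        provided="$provided $cap"
      fi
    done
    if [ "$adds" -eq 1 ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Example with assumed capability sets: the Python installer is skipped
# because the Nix layer already provides python-runtime.
plan_installers \
  "nix:nix-flake,python-runtime" \
  "python:python-runtime" \
  "make:make-install"
```

With these inputs the sketch prints `nix` and `make`, one per line. The order of the arguments stands in for the layer order, so an earlier layer always wins when two installers offer the same capability.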