This article was drafted during NaNoWriMo 2020.
I hit a frustrating problem while deploying to production at work, so here is a little postmortem.
The problem came from a Jenkins build, in a repository that had not been updated for a week. We wanted to deploy from a release branch to master. The error came from the installation of MongoDB server 4.2 in a Dockerfile (not an official Dockerfile from MongoDB, but an install of MongoDB server in our own Dockerfile).
I’ll give the solution right away for people who don’t want to read the whole thing.
The problem was that MongoDB server 4.3 and 4.4 had a change in a post-installation script (only for Debian, I think) that adds a call to systemctl. That change was backported to MongoDB server 4.2 in October, and was part of an update to the mongodb-server package around November 15.
When systemctl is not present at the time the post-installation script is called (in my case, in a Docker container based on a Debian image), the installation of the package fails and the build breaks.
To circumvent that problem, you can run ln -s /bin/true /usr/local/bin/systemctl (or RUN ln -s /bin/true /usr/local/bin/systemctl in the Dockerfile) before trying to install
mongodb-server. This will skip the systemctl call in the post-installation script.
You should probably not do that if you need systemctl elsewhere in your system, though.
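One way to respect that caveat is to create the stub only when no systemctl is already available. The following is a minimal sketch, not something from our actual Dockerfile: the helper name stub_systemctl_if_missing is hypothetical, and it assumes a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper: create a no-op "systemctl" in the given directory,
# but only when no systemctl is already reachable through PATH.
stub_systemctl_if_missing() {
  target_dir=$1
  if command -v systemctl >/dev/null 2>&1; then
    echo "systemctl already present, leaving it alone"
    return 0
  fi
  # The link is named "systemctl" but points at /bin/true.
  ln -s /bin/true "$target_dir/systemctl"
}

# In a Dockerfile this would run as root before installing the package, e.g.:
#   stub_systemctl_if_missing /usr/local/bin
```

With this guard, an image that really ships systemd keeps its real systemctl, and only images without it get the no-op link.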
At first I thought our deployment pipeline had a hiccup. The build was fine on the develop branch and on the release branch. It sat there for a full week with no problem. I also checked the last commit from the week before and it was a change to the frequency of a cron job. Not really a build-breaking change usually.
I launched the build again. Our infrastructure is a bit weak: we have failing builds every other day that succeed when they are relaunched. This time it did not solve the problem. I also relaunched the build on develop to see if it was a problem with the environment, but it failed too.
I could not pinpoint the problem yet, but it did not seem to come from the new code, nor from the pipeline. Also, the Dockerfile used an official node image with an exact version (like node:X.X.X-slim), so I expected things to be pretty stable between runs. The node image is itself based on a Debian image.
I summoned my inner Sherlock Holmes and went investigating further.
My first attempts did not work, so I had to dig into the error message a little more.
/var/lib/dpkg/info/mongodb-org-server.postinst: 43: /var/lib/dpkg/info/mongodb-org-server.postinst: systemctl: not found
dpkg: error processing package mongodb-org-server (--configure):
subprocess installed post-installation script returned error exit status 127
dpkg: dependency problems prevent configuration of mongodb-org:
mongodb-org depends on mongodb-org-server; however:
Package mongodb-org-server is not configured yet.
The error message above contains some information:
- “/var/lib/dpkg/info/mongodb-org-server.postinst: 43: /var/lib/dpkg/info/mongodb-org-server.postinst: systemctl: not found” means that at line 43 of the file mongodb-org-server.postinst, there was a call to systemctl that failed because the program was not found.
- “dpkg: error processing package mongodb-org-server (--configure):” tells us that it is the mongodb-org-server package, called with the --configure option, that went badly.
- “subprocess installed post-installation script returned error exit status 127” gives a bit more detail about the line above, but not much. It reads like a weirdly worded message.
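The exit status 127 in that last message is the value a POSIX shell reports when a command cannot be found, which matches the “systemctl: not found” line. Here is a small sketch to see it for yourself; the command name no-such-command-xyz is made up for the demonstration.

```shell
#!/bin/sh
# Run a command that does not exist in a child shell and capture its exit status.
# "no-such-command-xyz" is a made-up name chosen so that the lookup fails.
status=0
sh -c 'no-such-command-xyz' 2>/dev/null || status=$?

# A shell reports 127 when the command was not found.
echo "exit status: $status"
```

The post-installation script hit the same situation: the shell running it could not find systemctl, returned 127, and dpkg treated that as a failed configuration step.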
If you put the first line of the error message (/var/lib/dpkg/info/mongodb-org-server.postinst: 43: /var/lib/dpkg/info/mongodb-org-server.postinst: systemctl: not found) into a search engine, you find a SO post [1] that had only one answer at the time, and a post with a very similar error. The answer was cryptic to me, however, so I had to dig again.
In the SO post, there is a link to the official Dockerfile for MongoDB 4.4 so I went to take a look.
After reading it a bit and searching for systemctl, I found this line in the Dockerfile [2]:
So that was a good hint. Thanks to all the developers who comment their code :) I tried adding RUN ln -s /bin/true /usr/local/bin/systemctl to my Dockerfile and it worked! The build was fine now.
But it’s not over. I had to understand the fix to see whether it would come back to bite our project later, or whether it adds some constraints on the repository because we skipped a step in a post-install script.
The part that failed is here [3]:
systemctl is part of a group of software called systemd.
systemd provides a system and service manager that runs as PID 1 and starts the rest of the system (from their website).
/bin/true is a command (/bin/false exists too!). It always returns 0, which the shell treats as success (a truthy value). This is like using a constant for true in shell scripts.
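To make that concrete, here is a short sketch. It uses the plain names true and false, which most shells provide as builtins behaving like /bin/true and /bin/false; the daemon-reload argument is only there to show that arguments are ignored.

```shell
#!/bin/sh
# true ignores its arguments and always exits with status 0.
true daemon-reload
true_status=$?

# false always exits with a non-zero status.
false_status=0
false || false_status=$?

echo "true: $true_status, false: $false_status"
```

This is why a systemctl that is secretly true “succeeds” whatever it is asked to do.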
I already knew ln as a way to create links between files on the system, but I had never seen it used to replace the behavior of one command with another.
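The trick can be tried safely outside of /usr/local/bin. The sketch below does it in a throwaway directory and assumes a standalone true binary, as on Debian with GNU coreutils; on systems where true is part of a multi-call binary such as BusyBox, a link with another name may not behave the same way. The variable names are mine.

```shell
#!/bin/sh
# Sketch: shadow a command name with a symlink to "true", inside a
# throwaway directory so that nothing system-wide changes.
stub_dir=$(mktemp -d)

# Locate the "true" binary; its path varies between systems.
true_bin=""
for candidate in /bin/true /usr/bin/true; do
  if [ -x "$candidate" ]; then true_bin=$candidate; break; fi
done
if [ -z "$true_bin" ]; then echo "no true binary found" >&2; exit 1; fi

# The link is named "systemctl" but points at "true".
ln -s "$true_bin" "$stub_dir/systemctl"

# Put the stub directory first on PATH, in a subshell only.
stub_status=0
( PATH="$stub_dir:$PATH"; systemctl daemon-reload ) || stub_status=$?
echo "stubbed systemctl exited with $stub_status"

rm -rf "$stub_dir"
```

Because the shell picks the first match on PATH, the link answers to the name systemctl and the call returns success without doing anything.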
So we have seen that using ln to link /usr/local/bin/systemctl to /bin/true allowed the build to succeed. But is it right to make the build pass with a step that is skipped?
In fact, I don’t know the answer to this one for sure (if you know, please tell me). My understanding is that the post-install script uses this command to update the configuration of a service if it exists. In my case I don’t have that service, so it should not matter if it is not updated.
Another thing I would like to understand is why this happened to us this Thursday and not a week ago on the last build, or even a few months before. Answering that could help prevent other build failures in the future.
As it turns out, looking around in the repository for the MongoDB Dockerfile where I found the trick to solve my problem, there is a comment [7]:
It indicates that the change for 4.3+ might have been backported. Let’s find out.
The following commit [8] is for the change in February, and it was backported to 4.2 [9] on October 15 (it was also backported to 4.0 [10], then reverted on November 11).
We are getting closer, but not quite there: we last built our project successfully on November 12, and it failed on November 19. So what could have caused this?
My best guess right now is that the Debian package mongodb-server we use in the Dockerfile was updated between the two dates.
I ended up looking at a website for Debian packages, trying to get the changelog, but I hit a 404 on the changelog page. Not helpful.
I also found the dpkg list for mongodb-server, but I did not find an update date in there.
Right now, I don’t know how to get the information apart from contacting a maintainer directly. I’ll edit the post when/if I find the answer.
I tried to recap what I learned in an answer on the original SO post I found at first, to help others in the same boat.
Amusingly, the day just before I wrote my answer, someone else answered that question as well. It would have saved me time, but I would not have looked into the problem as much. I am glad I did, because I discovered a lot of stuff.
This one should be common sense, but we hit this problem around 4:30 p.m. on a Thursday and that was not fun. It did not break anything, but it adds a lot of stress to break a build at the time when you just want to end your day.
I’d like to thank my colleague Maxime who was with me at that time and did some initial investigation on the Jenkins builds with me.
While I looked around, I typed “systemctl not found” into a search engine and found another SO post [11] that proposed installing systemd. I did not try it, but I expect it might have worked.
But it is not the right solution for me. We would not use systemctl in this repository, so I feel it is better to skip the part of the script we don’t need than to install a “dead” package.
All in all, this problem took only a couple of hours to get fixed but it was annoying because of the circumstances. Having a build fail because of an external package being updated is never fun.
But it was a good learning experience: I discovered systemctl, I got to look at the MongoDB open-source repository and the docker-library repository, and the impact on the project stayed low, so not too bad. 😀