#4784 accepted infra

Update servers to latest FreeBSD 13.1

Reported by: Amar Takhar Owned by: Amar Takhar
Priority: normal Milestone: Indefinite
Component: admin Version:
Severity: normal Keywords: funded project-1
Cc: Blocked By:
Blocking: #3606, #4785, #4786, #4787, #4789, #4790

Description

This list of tasks is not exhaustive; there are many other changes that need to happen during this transition. It has been close to 9 years since this hardware was set up, so there is going to be a lot that is missing here.

  • Switch jails from the no-longer-maintained EZJail to Bastille.
  • Update all services to the latest version from ports.
  • Modernise config files; many services, such as Postfix, use older configurations or require older versions.
  • This will require rolling downtime of services during OSL business hours.
  • The ticket will be updated with changes as work is completed.
  • Weekly certificate updates to avoid certificate expiration.

Change History (14)

comment:1 Changed on 01/19/23 at 14:59:39 by Amar Takhar

Blocking: 4785 added

comment:2 Changed on 01/19/23 at 15:02:40 by Amar Takhar

Blocking: 4786 added

comment:3 Changed on 01/19/23 at 15:11:37 by Amar Takhar

Blocking: 4787 added

comment:4 Changed on 01/19/23 at 15:23:27 by Amar Takhar

Blocking: 4789 added

comment:5 Changed on 01/19/23 at 15:27:26 by Amar Takhar

Blocking: 4790 added

comment:6 Changed on 01/19/23 at 16:32:57 by Amar Takhar

Blocking: 3606 added

comment:7 Changed on 01/25/23 at 21:04:24 by Amar Takhar

Keywords: funded project-1 added; need-funding removed
Owner: set to Amar Takhar
Status: new → accepted

This project has been funded by an anonymous donor. Thank you!

comment:8 Changed on 01/26/23 at 16:33:41 by Amar Takhar

The first machine to be changed will be build1, then service3. Since there are no user-facing services on these machines, no notification is required.

It will take a few days to sort out exactly how things should be set up. The general plan is to have everything set up the same way on all the machines. I'll get backups running and set up certbot crontabs, clearing out as many of these tickets as possible.

After this, it will be a matter of replicating this setup on the other machines, which will require downtime of various services.

All work on user-facing machines will have to be done during OSL hours in case there are any issues. There shouldn't be, as all our machines have IPMI, but you never know.

comment:9 Changed on 02/01/23 at 18:20:23 by Amar Takhar

A lot has been going on over the past week with the general server update. There are dozens of services we run, and we need a system that will be easy to maintain going forward.

Each service really needs to be insular so we can swap one out for another as updates happen or the requirements of the project change. This has been a difficulty with the current setup.

As a general methodology, the following rules have been followed:

  • Spend more time on setup for less long-term maintenance.
  • A longer restoration is OK for less long-term maintenance.
  • Heavyweight solutions that require more upkeep are dropped in favour of simple crontabs and scripts located in one directory.
  • Setup is consistent across all servers and jails.

We have not had any major issues in the last 9 years. If we had chosen more heavyweight options, we would have gained nothing but added complexity and ongoing maintenance.

As the project expands, horizontal scaling is OK, but we don't want maintenance to grow exponentially. Having a static cost associated with maintaining services will make it far easier for the project to expand.

If, in the future, we need to go with more heavyweight options, we can. Going backwards is a lot more work, but switching from, say, a crontabbed backup to a service is trivial.
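
As a rough illustration of what a crontabbed backup could look like, here is a minimal Python sketch. It is not the script used on these servers: the function name `backup`, the archive naming scheme and the `keep` retention count are all assumptions made for the example.

```python
import tarfile
from datetime import date
from pathlib import Path
from typing import Optional


def backup(source: Path, dest_dir: Path, keep: int = 7,
           today: Optional[date] = None) -> Path:
    """Write a dated .tar.gz of `source` into `dest_dir`, then prune old archives.

    Hypothetical helper: names, layout and retention policy are illustrative.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    today = today or date.today()
    dest_dir.mkdir(parents=True, exist_ok=True)

    # One archive per day; a second run on the same day overwrites it.
    archive = dest_dir / f"{source.name}-{today.isoformat()}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(str(source), arcname=source.name)

    # ISO dates sort lexicographically, so sorting by name sorts by date.
    archives = sorted(dest_dir.glob(f"{source.name}-*.tar.gz"))
    for old in archives[:-keep]:
        old.unlink()
    return archive
```

A script like this would be run from a single cron entry, with the schedule and paths left to the deployment. Because it has no daemon and no state beyond the archive directory, replacing it later with a backup service would not require undoing anything.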

The following software will be deployed on each server:

There are still a lot more decisions to be made, but those will have to happen as setup progresses. The major point of doing in-depth planning was to avoid any major rewind of deployments if another, better solution is discovered late in the process.

comment:10 Changed on 02/01/23 at 18:30:29 by Amar Takhar

I should have added that iocage was initially considered, as it's what I've used in the past, but the project is dead. iXsystems has moved from FreeBSD to Linux and has stopped contributing. The original author(s) are no longer maintaining it; see:

https://github.com/iocage/iocage/issues/1289

We are currently using EZJail, which was abandoned 6 years ago. Migrating away from it is not an easy task, and it has been working for what we need it for.

The BastilleBSD author is extremely active and the project is well maintained.

Last edited on 02/01/23 at 18:31:04 by Amar Takhar

comment:11 Changed on 02/01/23 at 20:22:12 by Amar Takhar

The tentative order is as follows:

  • build1
  • service3
  • service4
  • service1
  • service2

build1 and service3 do not have anything that affects user-facing services, so no notification is required.

The other machines will require some service notification, which is TBD. Based on how the first two updates go, I'll have a much better idea of how long it will take.
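
The order and notification rule above can be written down as a small Python structure. This is only an illustrative sketch: the `Host` class, the `plan` function and the `user_facing` flag are hypothetical names, and marking service4, service1 and service2 as user-facing is an assumption based on the statement that the other machines will require notification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Host:
    name: str
    user_facing: bool  # assumption for every host except build1 and service3


# Tentative order, as listed in this comment.
ROLLOUT = [
    Host("build1", False),
    Host("service3", False),
    Host("service4", True),
    Host("service1", True),
    Host("service2", True),
]


def plan(hosts: List[Host]) -> List[Tuple[str, str]]:
    """Return (host name, action) pairs in upgrade order."""
    steps = []
    for host in hosts:
        if host.user_facing:
            steps.append((host.name, "notify users, then upgrade"))
        else:
            steps.append((host.name, "upgrade without notification"))
    return steps


if __name__ == "__main__":
    for name, action in plan(ROLLOUT):
        print(f"{name}: {action}")
```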

comment:12 Changed on 02/03/23 at 03:21:27 by Amar Takhar

Instead of trying to do all this work during OSL operating hours, only the firewalls need to be done that way. The IPMI setup is stable enough, and traffic is far lower on the weekends, so it makes more sense to do the rest then.

If we do have an issue, which I doubt, we can wait until Monday. Initially I thought there might be more critical changes involved, but there aren't; it is just a wipe and clean install. I will verify everything works while processing build1 and service3.

comment:13 Changed on 02/06/23 at 17:43:37 by Amar Takhar

The BastilleBSD configs have been settled. Now it's time to start transitioning the machines. I have been testing this locally in VMs, trying various configurations with 3 different instances and several jails.

I'll most likely be doing one machine per day and will give notice the day before if possible, or at the very least a few hours of notice, if a service is going down. I'll send an email to both the users and devel mailing lists.

comment:14 Changed on 02/09/24 at 01:21:02 by Amar Takhar

Priority: highest → normal