Season 4 — Episode 7

CentOS has become the de facto standard operating system for many organizations since it’s basically the same thing as RHEL, rebranded and without commercial support.

CentOS was originally a community project, but over time Red Hat has become more influential in its direction and has shifted it to a "Stream" model, effectively moving CentOS 8 from being a downstream release (built after RHEL) to an upstream release (built before RHEL). This isn't exactly suitable for production use and thus many users are looking for alternatives.

Paul joins us today to share insights and give some advice on how you can evaluate your options and migrate your infrastructure as painlessly as possible.

Transcript

Ben FordHello and welcome to today's episode of the Pulling the Strings podcast, as always, powered by Puppet. My name is Ben Ford. I am our developer relations director here at Puppet and pretty active in the community as @binford2k. We may have talked already. Today we're talking with Paul Reed. He's a principal sales engineer here at Puppet, and I've got to warn you, he told me before this that because he's a sales engineer, he knows how to talk, so we may end up going down some rabbit holes here. So Paul and I go way back and we sort of share some of the same juvenile, cynical humor. You know, lots of times where we've got businessy sort of meetings happening, we've got this back channel going. There's kind of like a Mystery Science Theater 3000 banter, sort of poking good-hearted, snarky fun. You know, as one does. So I'll probably get some kind of jab on Slack at some point. So if I go really quiet for a second, don't worry, I'm just stifling a giggle or something. So how's it going, Paul? Would you like to introduce yourself?

Paul ReedYeah, hi, it's going great. Thanks for having me on the podcast today. I guess I could give myself a bit of an intro here, so folks, I'm Paul Reed, as Ben mentioned, principal sales engineer. I work in the Northeast mostly, but I'll help out anybody who's got a need. I am Canadian if you haven't picked up yet by the accent, but don't hold that against me.

Ben FordI heard it in some of the "aboots" earlier.

Paul ReedNo doubt about it. I get that. I get "Blame Canada." I get, you name it. It's all fun. It's all good fun, as you said. Ben and I joke quite often through Slack and I might stifle a laugh here and there as well.

Ben FordSo again, I was talking with Alex, a product manager who also is up in Canada. He's Toronto, I believe, and this winter he actually built an ice rink in his backyard. Have you ever done something like that?

Paul ReedI haven't, but Alex lives about 19 kilometers from me. That's I want to say about 12 miles. So I should go over to his place next winter and have myself a skate.

Ben FordThere you go. So let's start with the kind of the story around the CentOS end of life. It was, I remember when it first hit and I was kind of surprised and shocked, and I was like, I don't even know how to respond to this. So could you tell us kind of what happened and a little bit of the background there?

Paul ReedYeah. So I think a lot of people were taken aback by the situation that happened with CentOS being discontinued, in terms of CentOS Core being a downstream version of Red Hat that people could use for free. A lot of people eventually settled on other variations of Linux, but it was a long time coming, and none of those solutions were ready at the time.

Ben FordSo when you say downstream, what does that mean in context here? You'd said upstream and downstream?

Paul ReedYeah, yeah, absolutely. So downstream represents a version that is built from the packages in the stable Red Hat Enterprise release, whereas upstream versions are basically development channels of Red Hat that are built before the stable release of Red Hat.

Ben FordGotcha. So like upstream feeds into RHEL, which then gets distributed out and people use it in their production environments, whereas downstream comes out of RHEL and people can choose to use that instead of RHEL when they deploy into production?

Paul ReedRight, exactly. I mean, you can think of it as upstream versions being development streams and downstream versions being production stable streams.

Ben FordIs that why they decided to call it Stream? Or what does the name Stream release mean anyway?

Paul ReedYeah, I think that was just a way to differentiate between the old CentOS 8, which they're now referring to as CentOS 8 Core, and the upstream version, which is CentOS 8 Stream, of the same major version release.

Ben FordInteresting. I didn't actually realize that, so Core and Stream theoretically are kind of the same thing if you go back far enough. But now they've diverged and Stream is upstream of RHEL.

Paul ReedExactly.

Ben FordAnd Core is basically just kind of legacy.

Paul ReedYep, exactly.

Ben FordSo how does that impact production use of CentOS?

Paul ReedWell, it really depends because that could mean something different depending on the organization that you work for or what's acceptable to you as a user. So, for instance, some organizations have decided to actually go with CentOS 8 Stream as their default operating system as opposed to using one of the variants or even flipping over to Red Hat directly. You know, it really depends on the level of acceptability of risk, I guess you could say, that you want to have in your environment. There's nothing wrong with CentOS Stream as an operating system. In fact, it's very much like Fedora, and we know there's a lot of people that use Fedora as an upstream version of RHEL today.

Ben FordYeah, I was kind of wondering, at this point, what is the real difference between CentOS Stream and Fedora?

Paul ReedYeah, not much, to be honest. So they're both upstream variants of RHEL. I think Fedora is a little bit more on the development side, where it's less stable, I think you could say, than what CentOS would be. And then obviously now CentOS Stream edition is less stable than what Red Hat Enterprise is.

Ben FordYeah, that makes a lot of sense. I sort of have gotten the impression over the years that Fedora is kind of like the more community-oriented form of the Red Hat family, where the entire stack of RHEL and CentOS and whatever other variants there are is the more enterprise-oriented one. Is that reasonably accurate, you think?

Paul ReedYeah, I would say so. I mean, the customers that I see that are using all of these variants typically use Fedora in some development capacity, and then they were using CentOS in production. But now, you know, they have either shifted to something else or to RHEL directly.

Ben FordAnd you did mention that there are now a handful of downstream RHEL alternatives, which would take the place of what CentOS used to be. And the two that I'm familiar with now are Alma and Rocky. I think you'd mentioned those two. But to be honest, I'll admit that I haven't actually been following their development as closely as maybe I should have. Could you talk a little bit about where they're positioned and what sort of differences there are and maybe how you might choose between them or something?

Paul ReedYeah. You know, practically there's not much difference between the two. They are both separate community projects, and the development resources that they have available are different. There are different contributors to each project, but at the core, they're both based off of Red Hat Enterprise. So you get the same packages, the same baseline functionality, they're all RHEL 8 variants at this point. So really, there's not much difference. I mean, for my functional testing, for everything that was practical to me, the only real difference was the logos, and I picked Rocky because I liked the look of it better than the Alma logo. Literally. I mean, the migration process was a little different. They each have slightly different tools for that. But in terms of the base packages of what the OS can do, it's basically the same.

Ben FordHonestly, I'm pretty interested in seeing like how they diverge over time because I feel like that answer is going to be very different in five years or so. But such is the development of open source, right?

Paul ReedAbsolutely.

Ben FordI have been seeing another name and I'm not really familiar with it, so I'd love it if you could clarify. I've seen the name Vault. So there's CentOS Stream, which was the first thing announced. But then what is CentOS Vault?

Paul ReedYes. OK. So when CentOS 8 was archived, CentOS 8 Core that is, in favor of Stream, basically all of the packages and the update servers and all of that stuff, all of it was moved to an archive, which they call Vault. So CentOS 8 Vault is basically the last version that was ever and ever will be available of CentOS 8 Core. And in fact, that's very important. I'm glad you mentioned that because in order to upgrade to any of the Stream editions, or to Rocky or to Alma, you actually need to change your update servers to point to that Vault, manually or using a task, which we could talk about in a minute, in order to get the machine to that latest version before the upgrade will happen.
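
As a rough illustration of the manual Vault switch Paul describes, the sketch below rewrites the text of a CentOS 8 repo file so that it stops using the mirror list and reads from the Vault instead. It assumes the stock repo files carry a `mirrorlist=` line and a commented-out `#baseurl=` line pointing at `mirror.centos.org`; the function name and the sample file content are illustrative and are not taken from Paul's module.

```python
import re


def point_repo_at_vault(repo_text: str) -> str:
    """Return repo file text with mirrorlist disabled and baseurl aimed at the Vault."""
    out = []
    for line in repo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("mirrorlist="):
            # Comment out the mirror list so dnf falls back to baseurl.
            out.append("#" + stripped)
        elif re.match(r"#?\s*baseurl=", stripped):
            # Uncomment baseurl (if needed) and swap the mirror host for the Vault host.
            enabled = stripped.lstrip("#").strip()
            out.append(enabled.replace("mirror.centos.org", "vault.centos.org"))
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Illustrative sample of what a stock repo file is assumed to look like.
SAMPLE = """[baseos]
name=CentOS Linux $releasever - BaseOS
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=BaseOS
#baseurl=http://mirror.centos.org/$contentdir/$releasever/BaseOS/$basearch/os/
gpgcheck=1
"""

if __name__ == "__main__":
    print(point_repo_at_vault(SAMPLE))
```

The function works on text rather than on files in place, so the change can be previewed and tested before anything under `/etc/yum.repos.d/` is touched. Running it twice gives the same result as running it once.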

Ben FordOh, I get it so that makes a lot of sense. It's like a point in time pin of getting you to the place where then you can choose where you want to go from there.

Paul ReedYeah, exactly. And it's actually kind of not a good thing that Red Hat did to the community in moving that over to Vault, because what they basically did was break updates on any existing CentOS 8 Core machines that were out there that weren't already at the latest version. So wherever they happened to be at that point in time, they could no longer get updates, which left them exposed to security vulnerabilities.

Ben FordThe updates just stopped.

Paul ReedYeah, exactly, the updates just stopped. So not fun.

Ben FordSo how were they, I mean, this is kind of like tangent, and maybe we don't need to go there or not, but how were they expecting that you would go from the way that you've been working all this time to the new stream? Like, was there a migration path that was put together for that?

Paul ReedYeah. Red Hat Professional Migration Services, which is a paid for service, I guess that was the official way to do it. Now that said, you know, you could sidestep that by manually changing your update repos to the Vault versions, then doing the update to get to the latest version and then choose your own adventure from there.

Ben FordNow, that seems like a real stumbling block. I'm glad that people have written some tools to automate so much of this. And actually that does bring up another question. Do you anticipate very many people migrating away from RHEL or this whole Red Hat family because of this?

Paul ReedI do. I mean, there's obviously going to be the camp of people that decide that, hey, you know what, now is the time to get official support for my operating system and migrate directly to Red Hat Enterprise. And there's nothing wrong with that. It's an amazing operating system, and without that, you wouldn't have any of these other variants that are free to use. So absolutely, we need to support them in some capacity. Now that said, it's not an option for everyone. I mean, there are some people that just for various reasons can't use Red Hat Enterprise, whether it's cost of licensing or it's just practicality. So in those cases, yeah, I mean, you'll see a lot more of an uptick of going to one of the other variants. Now prior to, I guess, this critical event, CentOS 8 was the clear choice, the RHEL variant that everyone used. There are others out there as well. I mean, there's, you know, Scientific Linux, there's Oracle Linux. There's a few others that are based on, you know, the Red Hat package management system and the same packages. But up until now, there hasn't really been that divergence. So I see what you're saying. It would be interesting to see where Rocky and Alma go in the future, like three to five years.

Ben FordYeah, it is kind of funny watching everybody kind of swarm and write their CentOS-to-whatever-distribution guides and scripts and whatnot. So I remember, this is kind of a little tangent and personal story here, but I remember way back in the day, one of my first big migrations was actually away from Debian to this new distro, very short-lived, that Ian Murdock built called Progeny. I think it was called Progeny. It was sort of built on the ideas that Debian had, but from a commercially supported company. And it was in a lot of ways the same kind of story as the Red Hat family and RHEL and CentOS, Fedora and everything. But it was before a lot of the modern automation and config management tools existed, and that migration, I don't even remember why we did it, to be honest, but that migration was a real beast. We were eventually successful, but it was a real ride, a lot of manual fixing and going in and running tools and like fixing a broken set or something. And I don't know if there were really better options at the time, but that would be an anomaly now, I think. There's so many tools today that make that far less stressful. And looking back, I'm like, man, that was a real wild west time. I'm really glad that I won't ever have to do that again. And one of the things that people have built to make these things easier was that module you were talking about earlier. Could you tell us a little bit about the module and how it works?

Paul ReedAbsolutely. Before I get into that, you know, you tangentially just got me nostalgic. So where I started with Linux, it was in the manager. And then a couple of years later, I got into the Gentoo project, so absolutely everything was built from source on every machine. So you can imagine going through all of your applications and everything. Yeah, those were not fun days. I'm glad that we're in the state where we are today, where we've got the right tooling, the right automation to do this stuff, and not only do the stuff correctly, but do it at scale.

Ben FordAnd repeatedly, too.

Paul ReedRight. So you mentioned the module, so I'll cut to the chase and get to talking about the module. So I wrote really a quick Puppet module that's a collection of tasks that help with the process of migration. The first thing it has is a task to switch your CentOS 8 Core machines that are still running out there, if you happen to have them, to the Vault version of the upstream repositories. When I say upstream repositories, I don't mean upstream in terms of the version, but I do mean their update servers. It switches them over to Vault for you so that you can then run your yum or dnf updates successfully. So the first task right there? Just run that against all of your ailing CentOS 8 hosts and your updates will start working again. Of course, they will only update to the last version of CentOS 8 Core that was available, but that'll get you at least to the point where you can make a decision about where to go from there. Which brings me to the other tasks within the module, which will help you run through the process of migrating to either Rocky or Alma, depending on which one you want to go to. Again, there's other choices out there aside from those two operating systems, and you can see how I wrote those tasks. The code is freely available and you can choose your own adventure, even write your own task if you want to go to something like Oracle or Scientific Linux or even true Red Hat 8 directly.

Ben FordAnd I kind of creeped on the module a little bit and saw that, for completeness' sake, you also have tasks to go to CentOS Stream and to RHEL in case anybody wants to go with those and upgrade that way.

Paul ReedYeah, yeah, exactly. And fair warning. We tested it on a few VM instances, so it may not work everywhere, but test it on your own test servers first. And if you like the process, feel free to use it.

Ben FordGiant disclaimer warning here.

Paul ReedGiant disclaimer. I mean, you could seriously screw up your environment with this. So know what you're doing. Don't just run this on all of your production SQL servers, for instance.

Ben FordAnd one of the neat things about these tasks I look at is that you didn't really actually write the scripts yourself, so you didn't write the migration itself. You wrapped the upstream scripts like the one to migrate to Alma or to Rocky or whatnot. And to me, that seems like a much more maintainable solution because somebody else, like the team that is supporting Rocky, is also making this update script for you, and you don't have to go in and replicate everything. Could you tell us a little bit about how that works? And you sort of hinted at maybe doing that same idea for using like CentOS to OEL script?

Paul ReedYeah, exactly. So reuse where you can, right? In the case of Rocky Linux, they publish a GitHub repo, I believe, that has all of the migration scripts and stuff there. So why not just use what they've already produced? So in that case, I just wrote the task so that it goes out, runs the git clone on that repo and then runs the latest version of that script. So if there's any problems, you can talk to that team instead of me.
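
The wrap-an-upstream-script pattern can be sketched as a small task written in Python (Puppet tasks can be written in any language the target can run, and by default receive their parameters as a JSON object on stdin). This is a minimal sketch and not the code from Paul's module: the parameter names `repo_url` and `script`, the working directory, and the result format are all hypothetical.

```python
import json
import subprocess
import sys


def build_commands(repo_url, script, workdir="/tmp/migration-tools"):
    """Return the argv lists that clone the upstream repo and run its script."""
    return [
        # Shallow clone so the task always picks up the latest published script.
        ["git", "clone", "--depth", "1", repo_url, workdir],
        ["bash", f"{workdir}/{script}"],
    ]


def run(params, runner=subprocess.run):
    """Run each command in order and stop at the first failure."""
    for argv in build_commands(params["repo_url"], params["script"]):
        result = runner(argv, capture_output=True, text=True)
        if result.returncode != 0:
            return {
                "status": "failed",
                "command": " ".join(argv),
                "stderr": result.stderr,
            }
    return {"status": "ok"}


def main(stdin=sys.stdin):
    """Task entry point: read JSON parameters, run, print a JSON result."""
    outcome = run(json.load(stdin))
    print(json.dumps(outcome))
    return 0 if outcome["status"] == "ok" else 1
```

A real task file would end with `sys.exit(main())` and ship alongside a JSON metadata file describing its parameters. Passing the command runner in as an argument keeps the logic testable without cloning anything or touching a real host.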

Ben FordSomebody else's problem?

Paul ReedYeah. But I mean, as an example, though, it's a very simple task and it really showcases the ability to do this kind of stuff at scale. I mean, there's not one of us in IT who hasn't googled for an answer for something. And, you know, just show me the three lines of code that I need to do what I need to do. So just grab that, wrap it in a task and you're good to go at scale.

Ben FordAbsolutely. And it's like the abstraction is, they're responsible for building the tool that upgrades a machine. So that lets you take this machine and turn it into this other kind of machine. But it's your responsibility, and now ours with this module, to take that and turn it into something that you can replicate across your entire infrastructure.

Paul ReedRight, exactly. You know, I work with a lot of customers that are not dealing with a hundred machines or two hundred machines. They're literally dealing with thousands of these. So when they have to do a migration, especially of an operating system, that's not an easy task. Not easy, especially if you have to go in and do them by hand or run even a script on each of them individually. I mean, it takes an eon to get through or an army of people to go through and do it. And then, you know, you mentioned consistency earlier. That's a huge thing, human error, when you do it in people terms. So, yeah, removing all of that reduces the risk.

Paul ReedBefore I worked for Puppet, I used to joke, I worked for one of the large three letter companies out there and I would do a lot of migration work. And the joke was that I was going to put on my resume that I watch progress bars all day because literally that's what I did. I mean, I have six instances on the screen at the same time doing things, but they all just have a progress bar at some certain point along the way during the migration.

Ben FordThat sounds so miserable. So, it sounds like that is sort of the benefit of using the Puppet tasks for this. Is that what you're thinking or are there other reasons for wrapping these scripts into tasks?

Paul ReedYeah, I mean, consistency and scale. You know, for myself, it's also a reference. Right now, I don't have to go out to Stack Exchange or whatever it is online that Google points me at for these scripts because I have a local reference that I've used and I trust.

Ben FordAnd I like that if you have something that you're using, like PE or whatnot, you can put it in the console and then get reports, where you're not looking at OneNote or whatnot.

Paul ReedAbsolutely.

Ben FordYeah, you've got this whole list of all the reports that are coming back. You need to know what failed and, you know, whatnot.

Paul ReedWell, and I was going to say the whole audit trail, like, if you do have an army of people that are running these tasks against thousands of endpoints, you'll know who did which ones and when they did it. So you have that reporting, so that you're not tracking this on a spreadsheet that maybe somebody forgot to fill out, and then you run the migration on something that's already been run. You know, it probably won't screw anything up, but it is a waste of that person's time. So you're saving overhead there, tremendous overhead.

Ben FordThat is a really good point. And that kind of maybe brings up another idea of like, how do you know which machines you should run this task on? And like, how do you know that you haven't missed any or that you haven't skipped over any?

Paul ReedYeah, exactly. You need some form of report that will show you that, that's what the PE console can do. Now, I mean, there's other ways you could do that as well. Like, these tasks come in. They're just a set of tasks. You run one before you run the other. But with Puppet plans, you can actually run the stuff sequentially. You could build a task that went out and talked to, say, ServiceNow or some other type of ticketing system that would open and close the ticket for you. So not only do you have like the workflow to do the migration, but you have also now automated the workflow for tracking all these changes. And when you're doing this stuff again at scale, that's so important to be able to automate that workflow.

Ben FordYes, definitely. So I saw in your repo, I saw a quick PQL query to identify all the CentOS nodes. Is that like, can we trust that the facts are going to be updated and whatnot so you can look at facts and say, show me all the CentOS 8 machines and then show me all the machines that are updated and all of that. Is that something you would use for tracking progress and making sure you didn't miss machines?

Paul Reed100 percent. So what happens as part of the task is, by default, there's a reboot that happens after the migration. You could queue it up and wait for a maintenance window if you want, so it won't officially take effect until then. But after that reboot occurs, or in fact the next Puppet run that occurs regardless, will update those facts so that you could use that as tracking. So something that's really cool is we have a new product called HTP, which may not be its official name when it goes to market, but what it does is it tracks changes over time. So you'll actually see graphs with a drawdown of the number of CentOS 8 hosts and an uptick of the number of, say, Rocky or Alma Linux hosts as a result. So you can see your burn down rates for your migration right in that tool. You could do that as well through Splunk or any other data visualization tool. But yeah, I think it's really cool to be able to track projects like this, rather than, like you said, that Excel sheet.
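
The fact-based tracking discussed here can be approximated with a few lines of Python. The sketch below assumes the node list has already been fetched, for example from PuppetDB, and that each entry carries the standard `os` fact. The sample data, the function names, and the assumption that every node in the list started out as CentOS 8 are illustrative; the PQL string only shows the kind of query that would list the nodes still waiting to be migrated.

```python
from collections import Counter

# Example PQL for listing nodes that still report CentOS 8 in their facts.
CENTOS8_QUERY = (
    'inventory[certname] { facts.os.name = "CentOS" '
    'and facts.os.release.major = "8" }'
)


def os_breakdown(nodes):
    """Count nodes per (OS name, major release) pair."""
    return Counter(
        (node["os"]["name"], node["os"]["release"]["major"]) for node in nodes
    )


def migration_progress(nodes, source=("CentOS", "8")):
    """Summarize how many in-scope nodes have left the source OS."""
    counts = os_breakdown(nodes)
    total = sum(counts.values())
    remaining = counts.get(source, 0)
    migrated = total - remaining
    percent = round(100 * migrated / total, 1) if total else 0.0
    return {
        "total": total,
        "remaining": remaining,
        "migrated": migrated,
        "percent_migrated": percent,
    }


# Made-up sample of fact data for four nodes in scope for the migration.
SAMPLE_NODES = [
    {"certname": "web01", "os": {"name": "CentOS", "release": {"major": "8"}}},
    {"certname": "web02", "os": {"name": "Rocky", "release": {"major": "8"}}},
    {"certname": "db01", "os": {"name": "Rocky", "release": {"major": "8"}}},
    {"certname": "app01", "os": {"name": "AlmaLinux", "release": {"major": "8"}}},
]

if __name__ == "__main__":
    print(migration_progress(SAMPLE_NODES))
```

Recording this summary after each wave gives the numbers behind a burn down chart, whichever visualization tool ends up drawing it.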

Ben FordYeah, I really like that idea because not only is it something tactical that you can use as you're making progress through the migration, but it's something that you can surface and show as a report to the decision makers in your business that are sort of tracking the long term bigger position of your infrastructure, too. Kind of show the value of your team and show how quickly you're able to respond, you know, show all those metrics that the pointy-haired people like to look at.

Paul ReedI geek out over those metrics, too, it's fun to watch, to be honest. I mean, especially if you are doing something at scale like this, you really see the impact that you have directly by just writing a few lines of task code to accomplish big things.

Ben FordThat is really cool. So how does somebody get started with this module?

Paul ReedSo this module can be simply added to your Puppet Enterprise infrastructure in your Puppetfile. The instructions are on the module page on the Forge. That doesn't necessarily mean you have to use it through Puppet Enterprise. I think the tasks themselves will work through Bolt as well. So if you're a Bolt user, you can use the same module and those tasks will be available through, like, a Bolt task run.

Ben FordSo it's basically just like any other task. You just install the module, get it to the right place on the machine and run it. And if you're using PE, you can stick it on the primary server and interface with it through the console. But if you have developers working out of their own workstations, you could still do it that way as well with Bolt and SSH, if I'm understanding correctly.

Paul ReedYep, yep, 100 percent. But the downside of using Bolt is you don't get the benefit of all that tracking, the logging, the output saved into the database. So I personally like to do it through PE because it does retain the output of that task. So if something does go wrong and my terminal window is closed because of a security timeout or something like that, I still have access to all of that output. Maybe there are failure messages, errors, warnings, something that I can go in and clean up to try again. So yeah, that would be a real reason to do it through Puppet Enterprise. But yeah, like you said, there's no reason why you can't run these tasks through Bolt. If you only have a handful of servers and you're not concerned about the output or audit trailing or any of that, yeah, absolutely, run it through Bolt.

Ben FordRight on, that's pretty cool. So if you're working with different environments, like you got like dev and staging and prod and all of that, do you have tactics for handling all these migrations for the different sort of environments?

Paul ReedYeah, I mean, as is typical for any type of staged deployment, you would want to start with your dev servers, start with the easy things, run a few tests before running anything at scale. And then when you do things at scale, run it for one environment. We've got concurrency settings in the products as well, so you can limit the number that are actually happening at a time and then monitor it as it goes through and watch your failures. If there are failures, maybe you want to stop the task before it goes any further, just to analyze that and maybe correct what the issue is, and you can use other tasks or Puppet code or anything to kind of remediate that type of stuff at scale as well. But yeah, I mean, just a typical staged approach, wave after wave, doing your production stuff last so that you've worked out all the bugs first.
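
The staged approach described here (dev first, production last, a cap on how many hosts run at once, and a halt when failures appear) can be sketched in Python as follows. The environment names, the `run_batch` callback, and the `max_failures` threshold are hypothetical stand-ins for whatever actually launches the task and reports per-node results; this is not how PE or Bolt implement their own concurrency settings.

```python
def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def staged_rollout(nodes_by_env, run_batch, order=("dev", "staging", "prod"),
                   concurrency=2, max_failures=0):
    """Run waves environment by environment and halt once failures exceed the limit.

    `run_batch` takes a list of node names and returns {node: succeeded}.
    """
    log = []        # (environment, batch) pairs in the order they were run
    failures = []   # nodes whose run reported failure
    for env in order:
        for batch in batches(nodes_by_env.get(env, []), concurrency):
            results = run_batch(batch)
            log.append((env, batch))
            failures += [node for node, ok in results.items() if not ok]
            if len(failures) > max_failures:
                # Stop before touching later waves so the issue can be analyzed.
                return {"completed": False, "halted_in": env,
                        "failures": failures, "log": log}
    return {"completed": True, "halted_in": None,
            "failures": failures, "log": log}


if __name__ == "__main__":
    fleet = {"dev": ["d1", "d2", "d3"], "staging": ["s1"], "prod": ["p1"]}
    # Pretend every node migrates cleanly.
    print(staged_rollout(fleet, lambda batch: {node: True for node in batch}))
```

With `max_failures=0` the rollout stops at the first failed node, so production is never reached until the earlier waves come back clean.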

Ben FordI think we're coming a little bit towards the end here, up to closing here. It sounds like a lot of what we're saying is that Puppet tasks, whether you do it by hand or run it through PE for the tracking and auditing and reporting, using them effectively really helps with taking these little migration tasks, like individual scripts, and running them at scale across your infrastructure and kind of multiplying the effect that you can have. And even if you actually do have to monitor it, you can use, like you were saying just a minute ago, you can use tasks to see what is happening and see how changes are rolling out across your infrastructure. And whether that's migrating an operating system like we're talking about today or any other things, I mean, I can see like doing database schema upgrades or things like that this way. The idea of taking a tool that already exists and wrapping it into a task, to like, I don't know, infrastructure-ize it. See, I just made up a word today. It makes it a lot easier to take those things and just ramp them up to scale. Do you have anything else you want to add to that?

Paul ReedYeah, I mean, you basically nailed it. When you Puppetize these types of things, whether it's infrastructure-ize or task-ize or any other "ize" that you want to do, you're reducing the risk, right? I build tasks for anything that I'm going to do more than once. So you get rid of the risk of inconsistency, human error. You add the capability of logging and audit trailing. It's so important, and you can't really, you know, whether it's an infrastructure of like 10 to 100 machines or if it's hundreds of thousands, you need to use the right tool for the right job. And this stuff, I think, I wish I had this stuff ten years ago.

Ben FordAbsolutely. It's really neat. I like that idea that you mentioned there of taking scripts to do a thing and just kind of bolting on all of the other stuff, like the reporting and the auditing, to those existing scripts instead of reinventing that every single time.

Paul ReedYeah, exactly. Don't even get me into the downstream stuff that you can do, like sending it off to Splunk and ServiceNow and all the other great integrations that just extend even beyond what we've talked about.

Ben FordYeah, I've been getting started with some of the data visualization of our content across the Forge, and it's like my brain is spinning with all the ideas of what we can do with data.

Paul ReedYeah, I think we're all going to be metric geeks this year.

Ben FordI mean, the industry is growing up, you know, that's a sign we're starting to be able to look at numbers and track numbers and track our improvement. You know, if you can't track it, you can't improve it.

Paul ReedExactly.

Ben FordCool. So I will make sure to drop a link to your module into the show notes when we publish it. Are you open to collaboration like for people to file pull requests or issues or anything?

Paul ReedAbsolutely. And just to be completely honest, I might not see it. So just also shoot me an email if you have done anything in terms of the PR against that repo, but yeah, more than welcome to have anybody contribute if they feel they can add any value to it.

Ben FordCool. Well, that means that we're going to have to put your email on the show notes, too. So I hope you know what you just did. So can you be reached in Slack or anything else too? Are you active?

Paul ReedYeah, I am active in the community Slack, not so much in the channels, but if you ping me one on one, I absolutely will respond.

Ben FordCool. And that's PSR in the community Slack and I'll drop that into the notes, too. Yeah, and I don't know if this is really feasible or not, so feel free to shoot me down, but I think it'd be really neat to get an office hour or two set up, so that you can sort of help guide people through this upgrade process and migrating to non-EOL distributions. But we can talk offline about that.

Paul ReedAbsolutely. I'd be willing to help people if they're stuck, obviously.

Ben FordWell, thanks for showing up. Thanks for coming here. This was really informative and super helpful. It kind of cleared up a lot of, you know, vague ideas or misconceptions that I've had about the whole process. And thanks everybody for listening. So that is a wrap today. And again, thanks everybody for showing up. And thank you for being here on our Pulling the Strings podcast.

Paul ReedThanks for having me.