We're all familiar with Puppet's main forte: managing configuration for computers, servers, cloud instances, etc. But as with any tool, community members often… color outside the lines.
Carrying on with our community spotlight series, this episode highlights one of those unusual usages. Corey Osman started this project by using Puppet to manage tiny IoT devices but ended up building a whole management framework out of Puppet and Bolt technologies. Let's get started and hear his story.
Today's musical intro was recorded by Manny, a software engineer on our hosted services team.
Ben Ford is a developer advocate at Puppet.
Ben: [00:00:10] Hi everybody, my name's Ben Ford. I'm the developer advocate here at Puppet. This is Corey Osman. Could you introduce yourself real quick?
Corey: [00:00:16] Yeah, hi. So my name's Corey Osman. I've been in the Puppet community space for, well, actually since 2009 and I've been writing DevOps tools and doing DevOps consulting and I've now been in blockchain since October.
Ben: [00:00:33] Wow. Okay, so Corey's got a pretty unusual project he's been working on. Do you think you could tell us a little bit about that project and maybe how you got into this blockchaining?
Corey: [00:00:42] Yeah. So I'll start off with how I started, when I started. It was almost a year ago that I got involved in the blockchain community and I got my first set of mining hardware. For those listening that don't actually know what mining hardware is and why you would use it, it's essentially a motherboard with some RAM and a hard drive and some GPUs or CPUs attached. So that is what mining hardware is and I got my first set back in October and now I have a bunch more...so much more that I needed a warehouse.
Ben: [00:01:25] And something to automate it, of course.
Corey: [00:01:27] Yeah. And in the mining community we don't call it a data center because real data centers have air conditioning and proper networking equipment and good ventilation and dust-free areas. We want to keep our costs low so we have a lot of unusual things. We have, we call them open-air frames, where it's literally just the motherboard and the equipment. It's not in a computer case; it's exposed to the elements, except for water. It's very low budget.
Ben: [00:02:01] So what did you need to get started with this?
Corey: [00:02:03] Yeah. So to get started with, you know, mining, you need the hardware, and once you get up around five systems - or in the mining community we don't call them systems, we call them rigs. And once you get up to five rigs you start to realize that maintaining these systems is annoying. And so you need some software and you really have two options for operating systems. And if you can guess, it's gonna be Windows or Linux - nothing runs on OSX really. You can't run Docker containers. But it's Windows or Linux, and because I'm a Linux fan I chose Linux, and there are specific mining distros that just have all the tools that you need to do the actual computations with. So that's one of the pieces, the software, and we talked about the hardware already. And, you know, one of the pieces of software that is not required but definitely helps you manage is configuration tools, and also orchestration tools, especially when you have as many systems as I do. You start to run into these, you know, SSH for-loop problems and it just begs you to use Puppet and Bolt.
Ben: [00:03:36] So what are these kind of tasks? I'm assuming you're running things across all of your mining, not systems, mining rigs.
Corey: [00:03:44] Yeah. So a lot of the tasks are actually the same thing that you would have in a normal data center. You need to update software. That's one of the things, and you need to configure software as well. The big difference here is that these systems are disposable. In fact all of my systems run off of just a little SSD pen drive that I plug into the USB. So I just dd the operating system over to the USB drive and then plug in the USB drive and it's more of like a live boot type situation. So I don't really care about trying to maintain a snowflake, especially because of the software that I've built on top of this.
Ben: [00:04:39] Are you using like Puppet and Bolt out of the box right now or did you have to build anything more on top of it?
Corey: [00:04:44] Initially. So there's really one configuration file that needs to be managed and at first I was just using git to swap out different configuration files and then was like, well this is a problem Puppet solves easily. So then I started installing other services with you know package resources and using just Puppet in general how it's normally used. And then of course that invites the other problem is do I want to start a server agent setup for this little mining thing I have going in a warehouse and I was like, well I really don't want to be that heavy so.
Ben: [00:05:29] Right well you're trying to like eke out every single bit of performance you can. Because performance equals dollars.
Corey: [00:05:36] Correct. So I just use puppet apply. And from the beginning I essentially just run commands to call git pull to pull down the necessary Puppet modules. And that was the beginning of it, where it was just like a quick bash script, you know, and it was git pull. And then I would run puppet apply and "include site.pp" which would trigger everything else, and that worked great for a while. And then I just started to nerd out on making it better and I decided that I wanted to build software specifically for a lot of these use cases. Some of the common tasks are just configuring that configuration file. But a lot of it has to do with changing what we're mining because we want to mine the most profitable thing. So we're constantly changing the parameters of the configuration file and we're constantly updating the mining software used. There's at least a dozen different mining programs that are installed on the system. And if you don't stay up to date they no longer work because the blockchain is constantly changing. The next thing that really hurts in this space is that now you have all these systems, so you use Bolt's SSH ability to easily run these same tasks on all the systems...or, I'm sorry, rigs.
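[Editor's note: The early "git pull, then puppet apply" workflow Corey describes could be sketched roughly as follows. This is an illustration only, written in Python rather than the bash he mentions, and not his actual script. The repository layout (`modules/` and `manifests/site.pp` under one checkout) and the `/opt/rig-config` path in the usage note are assumptions.]

```python
import subprocess
from pathlib import Path


def build_commands(repo_dir: str) -> list[list[str]]:
    """Return the two commands in order: update the checkout, then apply it."""
    repo = Path(repo_dir)
    return [
        # Pull the latest Puppet code into the local checkout.
        ["git", "-C", str(repo), "pull"],
        # Apply the entry-point manifest locally, with no Puppet server involved.
        # The modules/ and manifests/ layout is an assumed convention.
        [
            "puppet",
            "apply",
            f"--modulepath={repo / 'modules'}",
            str(repo / "manifests" / "site.pp"),
        ],
    ]


def run(repo_dir: str) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in build_commands(repo_dir):
        subprocess.run(command, check=True)
```

[Calling `run("/opt/rig-config")` on a rig would need both git and Puppet installed. Keeping command construction separate from execution makes the sequence easy to inspect without touching a real system.]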
Ben: [00:07:19] Could you tell us a little bit about kind of the tooling that you wrote for this? And I mean we don't have to get too far down in the weeds or anything but just a little bit about how it like how it works.
Corey: [00:07:27] I wanted to write my own tool to do this for not just me but also many other people and I was like, well, you know, maybe I can sell this. So I sort of combined a lot of these tasks into just a single CLI app.
Ben: [00:07:44] So what does the tool do?
Corey: [00:07:47] So the tool does a lot of things and configuration is just one component of it. You know, we need to figure out, from a mining perspective, is our hardware on fire? So we need to figure out what is the safe operating temperature of this equipment. And you know, if we can't measure it we don't really know where we are. So part of this tool, it tells you what the temperature of each GPU is, and at the moment you just need to know: do I need to turn off my equipment right now? So the distro operating system does actually back off some of the computation cycles and will try to lower the temperature by not working as hard, as well as increasing the speed of the fan to help cool and move the hot air away from the equipment. So some of that is monitoring and there's a bit of configuration in there. And also there's reporting in the tool. And you know, when I look at it after I built all this stuff, these are very common with what we have in the sysadmin space. We have monitoring, we have configuring and we have reports that, you know, God forbid they get sent in emails, but usually you have a dashboard to tell you, you know, like a synopsis of what happened in the last couple of days. So a lot of these things that I've been used to using in the sysadmin space I've sort of put into my tool. So that's kind of a basic usage of the CLI part of it.
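[Editor's note: The "do I need to turn off my equipment right now" check could look something like the sketch below. It is not taken from Corey's tool. The two temperature thresholds and the GPU names are hypothetical, and real safe limits depend on the specific card.]

```python
# Hypothetical thresholds in degrees Celsius; real limits vary by card.
THROTTLE_AT_C = 75
SHUTDOWN_AT_C = 90


def action_for(temp_c: float) -> str:
    """Map a single GPU temperature to a recommended action."""
    if temp_c >= SHUTDOWN_AT_C:
        return "shutdown"
    if temp_c >= THROTTLE_AT_C:
        return "throttle"
    return "ok"


def rig_report(temps_c: dict[str, float]) -> dict[str, str]:
    """Return the recommended action for every GPU in a rig."""
    return {gpu: action_for(temp) for gpu, temp in temps_c.items()}
```

[For example, `rig_report({"gpu0": 62, "gpu1": 78})` would flag only the second card for throttling. Reading the temperatures themselves is left out because it depends on the driver and distro.]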
Ben: [00:09:34] So that sounds like there's a whole lot of configuration data going around. Some of it is like configuring your nodes and some of it is like configuring the software and then some of it is probably configuring your own systems. So what are you doing to manage the data part of the configuration management?
Corey: [00:09:50] The thing is, you want to maintain the highest rate of return. And in order to do that you need to make your custom BIOS for your GPU or download one that someone has customized and you need to flash that BIOS onto the GPU. You also need to overclock the GPU as well because you don't want to run at the slowest speeds. It's too stable. So you have to figure out what the best overclock settings are for each GPU. Now this might sound easy in that you could just buy the same card and apply the same setting to all those cards. But back when I was getting into it you could not find more than two of the same cards at once unless you had a lot of money, so you ended up building these Frankenstein boxes with five or six different vendors. And so you would have a different overclock setting for each card. And so it really becomes unmanageable at that point.
Ben: [00:11:04] So you've got these different layers you've got like not only do you have the CPU but you've got like the different GPUs that are installed and you've got this complex like layering of all of these different tools into one system so you get like this Venn diagram of settings you have to yeah figure out for every single node.
Corey: [00:11:22] So let me ask you a question. How many GPUs do you think you can fit in a mining rig?
Ben: [00:11:29] I am not sure I want to know the answer to that question.
Corey: [00:11:34] OK, well I've got 11. That is my highest number. And into one motherboard - I use these motherboards that allow up to 19, but I don't go that high because it just becomes completely unmanageable, and if that rig goes down, like, oh man. And so I have them sectioned out. But you can imagine 11 different types of cards running in the same system, all with different configurations. On top of that, not all silicon is the same. And so you have to figure out, hey, this GPU crashed with this setting so now I got to back off. I got to back off some of the settings, so you have to like apply the setting and back off for like each individual card almost. I mean you get lucky sometimes in that it just works on most of them but there's always a few outliers. So that is the unique configuration data that you need to apply to every card. One of the other things is you often move these cards from rig to rig, and the tool that I wrote allows you to carry the overclock profile along with the card.
Ben: [00:12:55] So you really do have a whole bunch of special snowflakes and it's just that your system allows you to like account for like each of the snowflakes combined with each of the other snowflakes.
Corey: [00:13:04] Correct. Yeah, so that is the huge ability of this tool, that you can carry these specific snowflake profiles from rig to rig and not have to worry about like where it is because Puppet will determine what kind of card it is.
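[Editor's note: One way to picture a profile that "travels with the card" is a lookup keyed by the card rather than by the rig, with per-model defaults as a fallback. The sketch below is an assumption about the shape of such a lookup, not Corey's implementation. The card IDs, model names, and clock values are made up for illustration and are not tuning advice.]

```python
# Hypothetical data. Defaults are per card model; overrides are per physical card.
MODEL_DEFAULTS = {
    "rx580": {"core_mhz": 1150, "mem_mhz": 2000},
}
CARD_OVERRIDES = {
    # This particular card crashed at the default core clock, so it backs off.
    "card-0a1b": {"core_mhz": 1100},
}


def overclock_for(card_id: str, model: str) -> dict[str, int]:
    """Merge model defaults with any override recorded for this exact card."""
    settings = dict(MODEL_DEFAULTS.get(model, {}))
    settings.update(CARD_OVERRIDES.get(card_id, {}))
    return settings
```

[Because nothing in the lookup mentions a rig, moving a card to another motherboard leaves its settings unchanged. That is the property described above.]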
Ben: [00:13:24] Right. Well how about like classification, like what kind of profiles do you have to write to enable all of this to work?
Corey: [00:13:32] So for the classification piece basically every rig gets the same class, you know. So I had to build a special module, not special, but I just built the module for my software and it - right now it just works on the single mining distro, ethOS. But I can totally move to other distros and basically any other operating system; Windows is a little bit more difficult. So I have that Puppet module and that will lay down the configuration for it and I have a boatload of Hiera data that is used to carry forward all the overclock settings. But one of the things that is really, you know, when you go to classify a rig you're setting up the mining profile, and that's what I'm calling it. You also have a site profile and you have an owner profile and inside each of those profiles is more configuration data, as in like who owns this system or where is this system located and how much does power cost at this location. We need that configuration data to determine profitability at this location because when we go to move it the profitability might be more or less. Power is a huge factor in calculating those and so we put that data in the site classification part. I don't wanna say site.pp, haha. And so that gets carried along. We use geo lookup to help determine which location it is. So those are kind of some of the pieces. The big piece is being able to swap out mining profiles, so let's say that you want to mine Ethereum one day and then you want to go mine Zcash another day and then you want to swap over to something like Ravencoin the other day. You have the ability to just swap out the name of the mining profile and all of that information gets looked up through Puppet and Hiera. It's really nice.
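[Editor's note: The point about power cost per site comes down to simple arithmetic: daily profit is daily revenue minus the cost of the energy the rig draws in 24 hours. The sketch below is only a worked illustration. The function names, wattage, revenue, and electricity prices are hypothetical.]

```python
def daily_power_cost(watts: float, usd_per_kwh: float) -> float:
    """Cost of running a constant load for 24 hours at a given price per kWh."""
    kilowatts = watts / 1000
    return kilowatts * 24 * usd_per_kwh


def daily_profit(revenue_usd: float, watts: float, usd_per_kwh: float) -> float:
    """Daily revenue minus the daily power bill for one rig."""
    return revenue_usd - daily_power_cost(watts, usd_per_kwh)


# Same hypothetical rig (1000 W, 5 USD/day revenue) at two hypothetical sites.
cheap_site = daily_profit(5.0, 1000, 0.05)
pricey_site = daily_profit(5.0, 1000, 0.20)
```

[With these made-up numbers the same rig is profitable at the cheaper site and slightly loss-making at the more expensive one. That is why the power price belongs in the site profile rather than in the rig's own data.]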
Ben: [00:15:59] Who else might be interested in this? Like who would be your audience if you were to distribute this? And how would they get started using your tool?
Corey: [00:16:06] I've been in the DevOps space for a while and I've been consulting and a lot of the lessons I've learned from there is if you make a tool difficult to install it's not going to get used. The curl bash command is probably the easiest way to get something installed so that's essentially how you can install this software. I give you a curl command and you pipe it into bash and that runs a script, and that script then installs the package. If you don't have a package these days you can't guarantee that the person is going to run the same Ruby runtime that you expect them to use, nor have all of the gems. So if you don't package these things together it just makes it more difficult for the person to install. This is important because a lot of the people in the mining community do not come from - they don't have sysadmin backgrounds, right. They don't even know what configuration management is, so when you tell them that they can configure all of their systems at once, their mind is just completely blown. It's amazing. You're like, well, I've been using this tool called Puppet, you know, for almost a decade. You know, I don't even know what I would use outside of that. So it's fun to tell them these tools have existed and they can use them to configure their systems because right now most people hand edit everything like back in the old days.
Ben: [00:17:51] So it sounds like you're actually even enabling them to do configuration management and use Puppet without even really knowing that they're using Puppet. So that's something I think to be proud of. So what's next on your roadmap?
Corey: [00:18:04] Right, so next on my roadmap is really releasing this tool. Right now it's just me and some friends using this tool and I have a cloud offering that sort of brings together the fleet view. What's interesting about this is that the fleet view gives you like a bird's eye view of everything and all of your rigs together, whereas using the CLI tool itself just gives you the single view of that system in the fleet view. Also, and this is where it gets interesting I think, because right now all my configuration data is in Hiera. And for me to tell someone who doesn't know Puppet that they need to learn YAML to configure their systems is out of the question I think. So I built a UI for this and that means that you just type in a form with the quote Hiera data but they don't know that. It's just a form field and that gets sent down to the node - I'm sorry, the rig - and the rig uses that as configuration data and it essentially just writes it out to a common.yaml file and then uses normal Puppet means to compile the catalog. So I'm really abstracting out a lot of the hard parts of knowing YAML to begin with, and I'm essentially wrapping the puppet apply command because I'm using it internally, you know, in pure Ruby code. I'm not shelling out or anything like that. So the user doesn't know exactly how it's even done. They just know that their system is configured as soon as they type that reconfigure command. So getting around to the question, you know, what's next is releasing this tool to the masses and making sure that I don't have a bunch of security holes in it and packaging it up in a way that allows me to get paid to continue to write more of these quote sysadmin tools for miners.
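[Editor's note: The "form fields in, common.yaml out" step could be sketched as below. Corey says his tool does this in Ruby. This Python version is only an illustration of the idea and makes several assumptions: the form data is a flat mapping of simple Hiera-style keys to scalar values, the key names shown are hypothetical, and values are serialized as JSON scalars, which YAML parsers also accept.]

```python
import json
from pathlib import Path


def render_hiera_yaml(fields: dict[str, object]) -> str:
    """Render flat key/value form data as a small YAML document.

    Values are written with json.dumps, so strings are double-quoted and
    numbers and booleans are emitted in a form YAML also understands.
    Keys are assumed to be plain names such as "miner::profile".
    """
    lines = ["---"]
    for key in sorted(fields):
        lines.append(f"{key}: {json.dumps(fields[key])}")
    return "\n".join(lines) + "\n"


def write_common_yaml(data_dir: str, fields: dict[str, object]) -> Path:
    """Write the rendered data to common.yaml inside an existing data directory."""
    path = Path(data_dir) / "common.yaml"
    path.write_text(render_hiera_yaml(fields), encoding="utf-8")
    return path
```

[The user only ever sees form fields. The file this produces is ordinary Hiera data, so a normal Puppet run can consume it without knowing where it came from.]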
Ben: [00:20:30] That sounds like a really incredibly fascinating use of Puppet. Honestly I'd love to see a blog post about some of the details later on. Where can I follow up? Where can I see your release notification when you're ready to release?
Corey: [00:20:45] So I have a blog. It's logicminds.github.io. I have grand plans for this tool and a new business sector for myself. I will be revealing more on that blog as to where you can go learn more about it and maybe how you can build something like this yourself. So check out logicminds.github.io.
Ben: [00:21:12] Awesome. Well thanks very much for talking to us today, Corey.