Introducing Wash

Have you ever had to:

  • List all your AWS EC2 instances or Kubernetes pods?
  • Read a GCP Compute instance’s console output, or an AWS S3 object’s content?
  • Exec a command on a Kubernetes pod or GCP Compute Instance?
  • Find all AWS EC2 instances with a particular tag, or Docker containers/Kubernetes pods/GCP Compute instances with a specific label?

If so, then some parts of the following table might look familiar to you. If not, then here’s how AWS, Docker, Kubernetes, and GCP recommend that you do some of these tasks.

| Task | Target | Command |
|---|---|---|
| List all | AWS EC2 instances | `aws ec2 describe-instances --profile foo --query 'Reservations[].Instances[].InstanceId' --output text` |
| | Docker containers | `docker ps --all` |
| | Kubernetes pods | `kubectl get pods --all-namespaces` |
| | GCP Compute instances | `gcloud compute instances list` |
| Read | Console output of an EC2 instance | `aws ec2 get-console-output --profile foo --instance-id bar` |
| | Console output of a Google Compute instance | `gcloud compute instances get-serial-port-output foo` |
| | An S3 object's content | `aws s3api get-object content.txt --profile foo --bucket bar --key baz && cat content.txt && rm content.txt` |
| | A GCP Storage object's content | `gsutil cat gs://foo/bar` |
| Exec `uname` on | An EC2 instance | `ssh -i /path/my-key-pair.pem ec2-user@195.70.57.35 uname` |
| | A Docker container | `docker exec foo uname` |
| | A Kubernetes pod | `kubectl exec foo uname` |
| | A Google Compute instance | `gcloud compute ssh foo --command uname` |
| Find by 'owner' tag/label | EC2 instances | `aws ec2 describe-instances --profile foo --query 'Reservations[].Instances[].InstanceId' --filters Name=tag-key,Values=owner --output text` |
| | Docker containers | `docker ps --filter "label=owner"` |
| | Kubernetes pods | `kubectl get pods --all-namespaces --selector=owner` |
| | Google Compute instances | `gcloud compute instances list --filter="labels.owner:*"` |

As the table shows, you need different commands to List/Read/Exec/Find different things. Furthermore, these commands require you to install different applications, each with its own set of (possibly conflicting) dependencies and its own calling conventions. For example, to complete all the Find tasks in the table, you need to:

  • Use the aws ec2 describe-instances, docker ps, kubectl get pods, gcloud compute instances list commands (4 different commands).
  • Install the aws, docker, kubectl, and gcloud applications (4 different applications). Note that aws and gcloud are Python applications, so you must also install Python. Also, gcloud only works with Python 2, so if you only have Python 3 installed, you must also install Python 2 in a way that lets you easily switch back to Python 3 for your other applications. This is not easy, especially if you are not familiar with the Python ecosystem.
  • Learn four different-but-similar DSLs for filtering, which effectively means four different-but-similar ways of constructing and combining predicates on structured data (e.g. GCP’s filter expressions, Kubernetes’ field selectors and label selectors, the --filters option of aws ec2 describe-instances, and docker ps filtering).
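
To make that overhead concrete, here is a minimal sketch of the kind of cheat-sheet wrapper you might end up writing just to remember the four Find invocations. The function name find_owner_cmd is hypothetical, the commands are copied from the table above (with foo as the placeholder AWS profile), and the function only prints each command instead of running it, so it does not assume any of the four tools are installed.

```shell
# Hypothetical helper (not part of any vendor tool): print, but do not run,
# the vendor-specific command that finds resources with an 'owner' tag/label.
find_owner_cmd() {
  case "$1" in
    ec2)
      # AWS: JMESPath query plus the --filters option
      echo "aws ec2 describe-instances --profile foo --query 'Reservations[].Instances[].InstanceId' --filters Name=tag-key,Values=owner --output text" ;;
    docker)
      # Docker: key=value filter on ps
      echo 'docker ps --filter "label=owner"' ;;
    kubernetes)
      # Kubernetes: label selector
      echo 'kubectl get pods --all-namespaces --selector=owner' ;;
    gcp)
      # GCP: filter expression
      echo 'gcloud compute instances list --filter="labels.owner:*"' ;;
    *)
      echo "unknown target: $1" >&2
      return 1 ;;
  esac
}
```

Calling find_owner_cmd docker prints the Docker variant, and so on. Every branch asks the same question ("does this thing have an owner?") in a different syntax, which is exactly the duplication described above.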

That’s a lot to install and learn for some very basic tasks. It naturally raises the question of whether there’s a better way to perform these tasks, one that (1) is more expressive than what’s presented here and (2) does not require you to learn a different command for each kind of thing.

The answer is Wash

With Wash, here’s what the invocations look like:

| Task | Target | Command |
|---|---|---|
| List all | AWS EC2 instances | `ls aws/foo/resources/ec2/instances` |
| | Docker containers | `ls docker/containers` |
| | Kubernetes pods | `ls kubernetes/foo/bar/pods` |
| | GCP Compute instances | `ls gcp/foo/compute` |
| Read | Console output of an EC2 instance | `cat aws/foo/resources/ec2/instances/bar/console.out` |
| | Console output of a Google Compute instance | `cat gcp/foo/compute/bar/console.out` |
| | An S3 object's content | `cat aws/foo/resources/s3/bar/baz` |
| | A GCP Storage object's content | `cat gcp/foo/storage/bar/baz` |
| Exec `uname` on | An EC2 instance | `wexec aws/foo/resources/ec2/instances/bar uname` |
| | A Docker container | `wexec docker/containers/foo uname` |
| | A Kubernetes pod | `wexec kubernetes/foo/bar/pods/baz uname` |
| | A Google Compute instance | `wexec gcp/foo/compute/bar uname` |
| Find by 'owner' tag/label | EC2 instances | `find aws/foo -k '*ec2*instance' -meta '.tags[?].key' owner` |
| | Docker containers | `find docker -k '*container' -meta '.labels.owner' -exists` |
| | Kubernetes pods | `find kubernetes/foo/bar -k '*pod' -meta '.metadata.labels.owner' -exists` |
| | Google Compute instances | `find gcp/foo -k '*compute*instance' -meta '.labels.owner' -exists` |

Comparing the two tables, we immediately see that using Wash means:

  • You no longer have to learn different commands to execute a task across different things. All you need is one command per task (ls for List, cat for Read, wexec for Exec, and find for Find).
  • You no longer have to install a bunch of different tools. All you need to install is the Wash binary.
  • You no longer have to learn different DSLs for filtering stuff. All you need to learn is find’s expression syntax and its individual primaries. Once you do that, you can filter on almost any conceivable property of your specific thing.
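
Because these are ordinary shell commands operating on paths, they should also compose the way you would expect. The sketch below is an illustration rather than anything that ships with Wash: the function name wexec_owned is hypothetical, and it assumes that you are inside a Wash shell, that Wash's find prints one matching path per line, and that the label lives at '.labels.owner' (as in the Docker and GCP rows above; EC2 and Kubernetes use different metadata keys).

```shell
# Hypothetical helper: run a command on every entry of a given kind under ROOT
# that has an 'owner' label.
# Usage: wexec_owned ROOT KIND COMMAND [ARGS...]
wexec_owned() {
  root=$1
  kind=$2
  shift 2
  # Assumption: find emits one entry path per line.
  find "$root" -k "$kind" -meta '.labels.owner' -exists |
    while IFS= read -r entry; do
      # Redirect stdin so the command cannot swallow the rest of the path list.
      wexec "$entry" "$@" </dev/null
    done
}
```

Inside a Wash shell, wexec_owned docker '*container' uname would then run uname on each labeled container, and swapping in gcp/foo and '*compute*instance' would do the same for Compute instances without changing the function.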

Furthermore, Wash is built on a plugin architecture. It ships with some default plugins for Docker, AWS, Kubernetes, and GCP, but you can easily extend it with your own plugin and write it in any language you want. People have written plugins for all sorts of things such as IoT devices, Goodreads, GitHub, PuppetDB, and even Spotify.

If you’d like to learn more about Wash or contribute to its development, then please check out puppetlabs.github.io/wash.

Michael Smith is a principal software engineer at Puppet.