For the last few weeks, I’ve been working on a CLI tool to help developers of the qbeast-spark open-source project test their code changes. I’ll show you how I did it using setuptools.
A few weeks ago, we at Qbeast were still running tests manually. This involved several repetitive steps that cost our developers a considerable amount of time. The steps are necessary for testing, but they are needlessly time-consuming. In short, they consist of “simple things” such as creating clusters in Amazon EMR with the required dependencies, running Spark applications on these clusters, checking the available datasets in Amazon S3, and a few others. Each one seems easy, but it becomes tedious when you have to remember and run several commands and fix problems by hand. In other words, it was work that could be automated.
As a solution, we needed a tool to automate all these steps: something like a Command Line Interface (CLI) that lets us run easy commands which carry out the whole process automatically. We decided to call it qauto (‘q’ for ‘Qbeast’ and ‘auto’ for… well, our CEO has a gift for naming things…). Of course, this will not be the name of your application, but you can get some inspiration from it.
This tool would let us run something like qauto cluster create or qauto benchmark run: easy commands that wrap and simplify complex ones.
You’d say: complex? Yes. If you have never checked the number of options available when creating a cluster in Amazon EMR with their CLI, you’ll feel overwhelmed: take a look. There are more than 30 different options at the time of writing! And most of them stay the same across runs (except, perhaps, the cluster name and the number of machines).
So, why not create a simple command that lets you specify only the necessary options for your day-to-day commands?
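To make the idea concrete, here is a minimal sketch of such a wrapper using only the standard library. The function name, the default values (release label, instance type), and the choice of which options to fix are illustrative assumptions, not the actual qauto configuration. The sketch only builds the argument list for aws emr create-cluster and does not execute it.

```python
import shlex

# Options that rarely change between runs.
# These values are illustrative assumptions, not the real qauto defaults.
DEFAULT_OPTIONS = {
    "--release-label": "emr-6.5.0",
    "--applications": "Name=Spark",
    "--instance-type": "m5.xlarge",
}


def build_create_cluster_command(cluster_name, number_of_nodes=2):
    """Return the `aws emr create-cluster` argument list for one cluster.

    Only the cluster name and the node count vary; everything else
    comes from DEFAULT_OPTIONS.
    """
    command = [
        "aws", "emr", "create-cluster",
        "--name", cluster_name,
        "--instance-count", str(number_of_nodes),
        "--use-default-roles",
    ]
    for flag, value in DEFAULT_OPTIONS.items():
        command += [flag, value]
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is created in AWS.
    print(shlex.join(build_create_cluster_command("my-cluster", 3)))
```

The returned list could later be handed to subprocess.run once you are happy with the options; keeping command construction separate from execution makes the wrapper easy to inspect and test.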
With Python, you can create a simple tool to wrap these commands. Let’s see how to do it:
The qauto directory will contain the different .py files that make up the code of your application. You can have as many of these files as you need to structure your code properly.
The setup.py file will be used to let the system install the application.
The __init__.py file will contain the different packages that you have in your application. Imagine you have a utils.py Python file; then your init file must include it.
The main.py file will contain the code for your application. In our case, we will write a simple example with a few options, but you can extend it to your liking.
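As a rough illustration, a setup.py for this layout might look like the configuration fragment below. The version number, the unpinned click dependency, and the exact entry-point strings are assumptions; in particular, the qw entry is inferred from the alias for the aws group that is used later in this post.

```python
# setup.py (hypothetical sketch, not the original file)
from setuptools import find_packages, setup

setup(
    name="qauto",
    version="0.1.0",               # assumed version
    packages=find_packages(),      # picks up the qauto package directory
    install_requires=["click"],    # main.py imports click
    entry_points={
        "console_scripts": [
            # "<command name>=<module path>:<callable>"
            "qauto=qauto.main:main",  # full application
            "qw=qauto.main:aws",      # assumed alias that starts at the aws group
        ],
    },
)
```

Each console_scripts entry makes pip generate an executable with that name, which calls the referenced function when you run it.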
Using the click library, define a main group to wrap everything into the main application:
import click

@click.group()
def main():
    pass
The main group can contain other sub-groups. In our example, we’re going to add an aws group to main, which will in turn contain another sub-group:
@main.group()
def aws():
    """AWS Cloud Provider commands"""

@aws.group()
def cluster():
    """Cluster-related commands."""
Finally, define the qauto aws cluster create <cluster-name> --number-of-nodes <number-of-nodes> command:
@cluster.command("create")
@click.argument("cluster-name")
@click.option("--number-of-nodes", help="Number of nodes for the cluster",
              default=2, show_default=True)
def aws_create_cluster(cluster_name, number_of_nodes):
    # your program logic
    pass
Following this basic structure, the final result for the main.py file could be something like:
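A minimal sketch of such a main.py, assembled from the snippets above, might look like this. The docstrings on main and aws_create_cluster and the echo line are assumptions that stand in for the real program logic.

```python
import click


@click.group()
def main():
    """qauto: entry point that wraps every sub-group (hypothetical docstring)."""


@main.group()
def aws():
    """AWS Cloud Provider commands"""


@aws.group()
def cluster():
    """Cluster-related commands."""


@cluster.command("create")
@click.argument("cluster-name")
@click.option("--number-of-nodes", help="Number of nodes for the cluster",
              default=2, show_default=True)
def aws_create_cluster(cluster_name, number_of_nodes):
    """Create a cluster with the given name."""
    # Placeholder for the real logic: only report what would be created.
    click.echo(f"Creating cluster {cluster_name!r} with {number_of_nodes} nodes")
```

Note that click turns the cluster-name argument into the cluster_name parameter, and that the number of nodes is passed as an option, for example qauto aws cluster create my-cluster --number-of-nodes 3.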
With this in place, we now have a “create” command nested inside a few groups. Following the group structure from the main group, the full command is qauto aws cluster create. But… wait a second! We defined an alias for “aws” in our setup.py file, so an alternative is qw cluster create (providing the required arguments, of course!).
Easy, isn’t it?
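If you are curious how a command name ends up pointing at a group, the sketch below uses the standard library to take apart two entry-point strings. The strings themselves are assumptions about what the console_scripts section of setup.py contains; the point is only that each one maps a command name to a module and a callable inside it.

```python
from importlib.metadata import EntryPoint

# Assumed console_scripts entries in "<command>=<module>:<callable>" form.
CONSOLE_SCRIPTS = [
    "qauto=qauto.main:main",  # the whole application
    "qw=qauto.main:aws",      # alias that starts directly at the aws group
]


def parse_entry_point(spec):
    """Split a console_scripts string into an EntryPoint object."""
    name, value = spec.split("=", 1)
    return EntryPoint(name=name.strip(), value=value.strip(),
                      group="console_scripts")


for spec in CONSOLE_SCRIPTS:
    entry_point = parse_entry_point(spec)
    print(f"{entry_point.name}: module={entry_point.module}, "
          f"callable={entry_point.attr}")
```

Because qw points at the aws group instead of main, qw cluster create behaves like qauto aws cluster create with the first level already chosen.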
Once you have finished building your application, or whenever you want to test it, you can install it easily using pip, Python’s package installer. From the root directory of your project, run pip install -e . to install your new application. From then on, you can run qauto <command> (or whatever name you gave your application).
The -e option installs your program in editable mode: you won’t need to reinstall it after making changes.
To uninstall it, run pip uninstall qauto (or the name you specified for your application).
Setuptools is a powerful Python library that lets you package your Python projects. It is useful whenever you want to build something that is easy to install and run. We used it to create a CLI that simplifies a command with many redundant options. In the same way, you can add other commands and structure your code to suit your needs.