Author: Eric Ávila

Publish your SBT project to the Central Repository

Paola Pardo & Eric Ávila

You and your team have been working hard on the very first release of your beloved project. Now it’s time to make it public for the ease of software usage 🚀

In this post, we will be explaining how to publish an sbt project to the Central Repository through Sonatype OSSRH Nexus Manager.

First-time preparation for releasing artifacts

Sonatype credentials and GPG keys must be set up before publishing an artifact. These steps have to be done only once and require some human revision.

Sonatype Setup

Sonatype OSSRH (OSS Repository Hosting) provides a repository hosting service for open source project binaries. It uses Maven Repository format and it would allow you to:

  • deploy snapshots
  • stage releases
  • promote releases and sync to the Central Repository

Can I change, delete or modify the published artifacts? → Quick and short answer: No.

Be careful and use -SNAPSHOT suffix on your version to test binaries before moving to a definitive stage.

For more information, click here 👈

1. Register to JIRA

There are some configurations that require human interaction (see here why). Sonatype uses JIRA to manage requests, so if you don’t have an account, it’s time to do so.

2. Open a JIRA ticket to solicit the domain

Now you have to create a new ticket requesting the namespace for your packages.

It’s very simple, but here it’s the request we made in case you want some inspiration:

3. Set your domain accordingly

Right after you publish the ticket, you will receive an automatic notification to configure the domain properly.

After setting TXT in your domain and editing the status of the ticket, you have to wait for the congratulations message.

Let’s move on to the next step!

GPG Setup

In order to sign the artifacts that you want to publish, you will need to create a private/public key pair. Using your tool of choice, create it and upload the public key to a key server when asked, or upload it manually.

I’ll show how to do so on the Linux command line:

1. Generate a GPG key

$> gpg --gen-key 

2. List keys to know it’s present in your machine

Once the key pair is generated, we can list them along with any other keys installed:

$> gpg --list-keys 
/Users/xxx/.gnupg/pubring.gpg
----------------------------------
pub   rsa2048 2012-02-14 [SCEA] [expires: 2028-02-09]
      <public-key>
uid           [ultimate] Eugene Yokota <eed3si9n@gmail.com>
sub   rsa2048 2012-02-14 [SEA] [expires: 2028-02-09]

3. Upload the public key to a server, so you will be able to sign packages and verify them

Since other people need your public key to verify your files, you have to distribute your public key to a key server:

$> gpg --keyserver keyserver.ubuntu.com --send-keys <public-key>

This first key will be set as default for your system, so now the sbt-pgp plugin will be able to use it.

Releasing an artifact

Now that you are registered in the OSS Sonatype Repository and configured the GPG keys to sign your library, it’s time to prepare your project to compile and produce the corresponding artifacts.

1. Prepare build.sbt

Add ‘sbt-sonatype’ and ‘sbt-pgp’ plugins to your project/plugins.sbt  file.

addSbtPlugin("org.xerial.sbt" % "sbt-sonatype" % "3.9.9")
addSbtPlugin("com.github.sbt" % "sbt-pgp" % "2.1.2")

In your build.sbt you have to add the reference to the remote Sonatype repository and some settings to accomplish the Maven Central repository requirements.

// Repository for releases on Maven Central using Sonatype
publishTo := sonatypePublishToBundle.value
sonatypeCredentialHost := "s01.oss.sonatype.org"

publishMavenStyle := true
sonatypeProfileName := "io.qbeast" // Your sonatype groupID

// Reference the project OSS repository
import xerial.sbt.Sonatype._
sonatypeProjectHosting := Some(
  GitHubHosting(user = "Qbeast-io", repository = "qbeast-spark", email = "info@qbeast.io"))

// Metadata referring to licenses, website, and SCM (source code management)
licenses:= Seq(
  "APL2" -> url("https://www.apache.org/licenses/LICENSE-2.0.txt"))
homepage := Some(url("https://qbeast.io/"))
scmInfo := Some(
  ScmInfo(
    url("https://github.com/Qbeast-io/qbeast-spark"),
    "scm:git@github.com:Qbeast-io/qbeast-spark.git"))
// Optional: if you want to publish snapshots 
// (which cannot be released to the Central Repository)
// You must set the sonatypeRepository in which to upload the artifacts
sonatypeRepository := {
  val nexus = "https://s01.oss.sonatype.org/"
  if (isSnapshot.value) nexus + "content/repositories/snapshots"
  else nexus + "service/local"
}
<!-- (Optional) pomExtra field, where you can reference developers, 
		among other things. This configuration must be in XML format, like
		in the example below, and it will be included in your .pom file. -->
pomExtra := 
	<developers>
    <developer>
      <id>osopardo1</id>
      <name>Paola Pardo</name>
      <url>https://github.com/osopardo1</url>
    </developer>
  </developers>

2. Sonatype credentials ~/.sbt/1.0/sonatype.sbt

Apart from the key, you need to set up a credentials file for the Sonatype server. Create a file in $HOME/.sbt/1.0/sonatype.sbt. This file will contain the credentials for Sonatype:

credentials += Credentials("Sonatype Nexus Repository Manager",
				"s01.oss.sonatype.org", // all domains registered since February 2021
				"(username)",
				"(password)")

3. Publish, stage and close

The easiest way is to run the following commands.

sbt clean
sbt publishSigned
sbt sonatypeBundleRelease

Please note that executing the third command is a definitive step and there’s no way back.

The full guide and explanation of what the next commands do can be found here: https://github.com/xerial/sbt-sonatype#publishing-your-artifact. I recommend reading it if it’s your first time doing so, to better understand the process.

Let’s explain the commands step by step.

  1. The first command does the cleaning of your target/ directory inside the project.
project-root$> sbt clean

   2. The second command creates all the required artifacts to publish to Maven Central. These files include different JARs (jar, jar+javadoc, jar+sources), a POM file with metadata required for publishing in Maven (qbeast-spark_x.xx-y.y.y.pom), and all the required checksum/CRC files.

project-root$> sbt publishSigned

    3. The third command prepares the Sonatype repository, uploads JARs, and releases them to the public, syncing with the Maven Central repository.

CAUTION. Executing this command is a definitive step and there’s no way back.

project-root$> sbt sonatypeBundleRelease

If you want to control each of the steps of sonatypeBundleRelease, you can run:

project-root$> sbt sonatypePrepare
project-root$> sbt sonatypeBundleUpload
project-root$> sbt sonatypeRelease

And finally…

In 10 minutes you will have your package ready for others to download and use it 🔥

Don’t worry if it does not appear on the Maven Central Repository in the following hours, since it takes a little longer to sync. But you can play with it right away!

We hope that you find this post useful 🙌 And don’t hesitate to share your projects with us 😃

About Qbeast
Qbeast is here to simplify the lives of the Data Engineers and make Data Scientists more agile with fast queries and interactive visualizations. For more information, visit qbeast.io
© 2020 Qbeast. All rights reserved.
Share:

Back to menu

Continue reading

Reduce Repetitive Tasks and Development Time by Writing your own Tool in Python

For the last weeks, I’ve been working on a CLI tool to help developers of the qbeast-spark open-source project test their changes to the code. I’ll show you how I did it using setuptools.

Motivation

Some weeks ago, we, at Qbeast, were running tests manually, which involved several repetitive steps, which stole our developers a considerable amount of time. These steps are necessary for testing, but they are unnecessarily time-consuming. In a few words, these steps consist of “simple things”, such as creating clusters in Amazon EMR with the required dependencies, running spark applications on these clusters, checking available Datasets in Amazon S3, and other few things. Things that seem easy to achieve but complex when you have to run and remember several commands and fix problems manually. Something that could be automated somehow.

We had to develop a tool to automate all these steps as a solution… Something like a Command Line Interface (CLI), which lets us run easy commands doing the whole process automatically. We decided to call it qauto (‘q’ for ‘Qbeast’ and ‘auto’ for… well, our CEO has a gift naming things…). Of course, this will not be the name of your application, but you can get some inspiration from it.

The CLI

This tool would let us run something like qauto cluster create or qauto benchmark run: Easy commands that wrap and ease complex ones.
You’d say: complex? – Yes. If you check the number of available options when you try to create a cluster in Amazon EMR using their CLI (if you never have), you’ll feel overwhelmed: take a look. There are more than 30 different options at the time of writing this! And most of these options will remain the same in all runs (except cluster name and the number of machines, maybe?).

So, why not create a simple command that lets you specify only the necessary options for your day-to-day commands?

Setuptools – Package your Python projects

With Python, you can create a simple tool to wrap these commands. Let’s see how to do it:

  1. Create the following file structure in your directory:

  1. As you can see, the project folder contains a directory named qauto and a setup.py file. The qauto directory will contain different .py files, which will indeed be the code of your application. You can have as many files of these as you want to structure your code correctly.
  2. The setup.py file will be used to let the system install the application.
  3. The __init__.py will contain the different packages that you have in your application. Imagine you have a main.py and a utils.py python files, then your init file must include:
  4. The main.py will contain the code for your application. In our case, we will write a simple example with a few options, but you can extend it to your like.
    1. Application entry point. This is the part where your code begins its execution. We will create a main group to wrap everything into the main application.
      import click
      
      @click.group()
      def main():
      pass
    2. This main group can contain other sub-groups. In our example, we’re going to add an aws group to the main, which will indeed contain another sub-group:
      @main.group()
      def aws():
          """AWS Cloud Provider commands"""
      
      @aws.group()
      def cluster():
          """Cluster-related commands."""
      
    3. Groups can contain commands, which are the real “executable things”. These commands may have arguments and options (mandatory/optional). In the following example, we are implementing the qauto aws cluster create <cluster-name> <number-of-nodes> command:
      @cluster.command("create")
      @click.argument("cluster-name")
      @click.option("--number-of-nodes", help="Number of nodes for the cluster", default=2, show_default=True)
      def aws_create_cluster(cluster_name, number_of_nodes):
          # your program logic

Following this basic structure, the final result for the main.py file could be something like:

With this made up, we currently have a command “create” inside some groups. Following the group structure from the main group, we can see the command itself is qauto aws cluster create. But… wait a second! We defined an alias for “aws” in our setup.py file, so an alternative will be qw cluster create (obviously providing some required arguments!).

Easy, isn’t it?

Installing your new custom-CLI

Once you have finished building your application -or you want to test it- you can install it easily using Python’s pip installer. Being in the root directory of your project, you can run pip install -e . to install your new application. From now on, you can run qauto <command> (or the name you specified for your application).

  • The -e option installs your program in editable mode: You won’t need to re-install it if you make some changes.
  • To uninstall it, just run pip uninstall qauto, or the name you specified for your application.

Conclusion

Setuptools is a powerful Python library that lets you package your python projects. It can be used for many applications when building something easy to run. We used it to create a CLI, which eases a command containing many redundant options. In the same way, you can add other commands and structure your code to your needs.

About Qbeast
Qbeast is here to simplify the lives of the Data Engineers and make Data Scientists more agile with fast queries and interactive visualizations. For more information, visit qbeast.io
© 2020 Qbeast. All rights reserved.
Share:

Back to menu

Continue reading

Create awesome GIFs from a terminal: Nice-looking animations with Terminalizer

Have you ever wanted to generate cool GIFs from a terminal output? Do you want to have fancy animations to show some code snippets?

Using terminalizer, you will be able to create fantastic animations by following this simple guide!

The solution

1. First, you need to install NodeJS v12.21.0 (LTS) from https://nodejs.org/download/release/v12.21.0/. Other versions may not be compatible with terminalizer, and you might have problems when using it.

2. After that, install terminalizer globally by using the command:

npm install -g terminalizer

Now you can use the following commands to record, play and share a GIF:

# Start recording a demo in a file called my_demo.yml
terminalizer record my_demo

# Now run the commands you want to appear in the GIF.
# When you have finished, press Ctrl+D (⌘+D) to stop recording.

# You can play the demo you just recorded by using the play option.te
terminalizer play my_demo.yml

# At this point you can customize several things,
# check the "🌟 Pro tips" section below.

# If you're happy with the result, render the GIF from your YML
# file. This will create a file in your current directory.
terminalizer render my_demo.yml

🌟 Pro tips: You can edit and customize your GIF before rendering it by modifying the content of the .yml file.

  • For example, you can change the colours or the style by changing the theme and the frameBox objects in the YML:
theme:
    background: "#28225C"
(...)
frameBox:
    type: floating
    title: Terminalizer Rocks!
  • You can also edit the content and the timing of the output by modifying the records object:
# Records, feel free to edit them
records:
  - delay: 50
    content: "\e[35mErics-MacBook-Air\e[0m:~ eric$ "

So with a few modifications, we can get something like this:

You can find more customization options and tips at the original repo on GitHub: https://github.com/faressoft/terminalizer

Summary

Terminalizer is a beautiful and easy-to-use tool to create GIFs from your console/terminal output. It allows you to create GIFs by recording a session and customizing everything as desired in a simple YML format, perfect for newcomers to use it.

About Qbeast
Qbeast is here to simplify the lives of the Data Engineers and make Data Scientists more agile with fast queries and interactive visualizations. For more information, visit qbeast.io
© 2020 Qbeast. All rights reserved.
Share:

Back to menu

Continue reading

Contact us info@qbeast.io

C/ Roc Boronat 117, 2a Planta, 08018 Barcelona

© 2020 Qbeast
Design by Xurris