Author: Paola Pardo

Scala Test Dive-in: Public, Private and Protected methods

We all know that code can be tested in different ways. This pill does not try to explain which is the best way to check that your Scala project works as it should. Instead, it provides some tips and tricks for testing public, private, and protected methods.

Public Methods

Public methods are the functions inside a class that can be called from outside, through an instance of the class. Testing them is no rocket science. With ScalaTest, Matchers and clues make failure messages easier to read, so you can understand what went wrong.

Imagine we want to test a MathUtils class that has simple methods min and max:

class MathUtils {
  def min(x: Int, y: Int): Int = if (x <= y) x else y

  def max(x: Int, y: Int): Int = if (x >= y) x else y

}

This is what your test could look like:

import org.scalatest.AppendedClues.convertToClueful
import org.scalatest.matchers.should.Matchers
import org.scalatest.flatspec.AnyFlatSpec

class MathUtilsTest extends AnyFlatSpec with Matchers {

  "MathUtils" should "compute min correctly" in {
    val min = 10
    val max = 20
    val mathUtils = new MathUtils()
    mathUtils.min(min, max) shouldBe min withClue s"Min is not $min"
  }

  it should "compute max correctly" in {
    val min = 10
    val max = 20
    val mathUtils = new MathUtils()
    mathUtils.max(min, max) shouldBe max withClue s"Max is not $max"
  }
}

Private Methods

Private methods are methods that cannot be accessed from any class other than the one in which they are declared.

Testing these functions is trickier. You have different ways of proceeding: copy and paste the implementation into a test class (which is off the table), use Mockito, or try PrivateMethodTester.

Let’s write a private method on the class MathUtils:

class MathUtils {

  def min(x: Int, y: Int): Int = if (x <= y) x else y

  def max(x: Int, y: Int): Int = if (x >= y) x else y

  private def sum(x: Int, y: Int): Int = {
    x + y
  }

  def sum(x: Int, y: Int, z: Int): Int = {
    val aux = sum(x, y)
    sum(aux, z)
  }

}

PrivateMethodTester is a ScalaTest trait that facilitates the testing of private methods. You have to mix it into your test class in order to take advantage of it.


import org.scalatest.AppendedClues.convertToClueful
import org.scalatest.matchers.should.Matchers
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.PrivateMethodTester

class MathUtilsPrivateTest extends AnyFlatSpec with Matchers with PrivateMethodTester {

  "MathUtils" should "compute sum correctly" in {
    val x = 1
    val y = 2

    val mathUtils = new MathUtils()
    // 'sum is the name of the private method; [Int] is its return type
    val sumPrivateMethod = PrivateMethod[Int]('sum)
    val privateSum = mathUtils invokePrivate sumPrivateMethod(x, y)
    privateSum shouldBe (x + y) withClue s"Sum is not ${x + y}"
  }
}

In val sumPrivateMethod = PrivateMethod[Int]('sum) there are two parts:

  • [Int] is the return type of the method
  • ('sum) is the name of the method to call, given as a Symbol. Symbol literals such as 'sum are deprecated since Scala 2.13 and were dropped in Scala 3, where Symbol("sum") is the equivalent.

The expression mathUtils invokePrivate sumPrivateMethod(x, y) returns the result of the call, which you can store in a val and compare against the expected value. You need an instance of the class/object to invoke the method on; otherwise, the method will not be found.
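PrivateMethodTester relies on reflection to reach the private method. As a rough illustration of the idea, and not ScalaTest's actual implementation, the sketch below calls the private sum with plain Java reflection. It assumes the compiler emits the private method under its source name sum, which should hold for a simple class like this one but is not guaranteed in general. ReflectionSketch and invokePrivateSum are hypothetical names introduced only for this example.

```scala
// Copy of the MathUtils class from above, so the sketch is self-contained.
class MathUtils {
  private def sum(x: Int, y: Int): Int = x + y

  def sum(x: Int, y: Int, z: Int): Int = sum(sum(x, y), z)
}

// Hypothetical helper, for illustration only.
object ReflectionSketch {
  def invokePrivateSum(target: MathUtils, x: Int, y: Int): Int = {
    // Look up the two-argument overload declared on the class itself.
    // Assumption: the private method keeps its source name "sum" in the bytecode.
    val method = classOf[MathUtils].getDeclaredMethod("sum", classOf[Int], classOf[Int])
    // Lift the JVM access check for this Method object.
    method.setAccessible(true)
    // Reflection works with boxed values: box the arguments, unbox the result.
    method.invoke(target, Int.box(x), Int.box(y)).asInstanceOf[Int]
  }
}
```

In a test you would still prefer PrivateMethodTester, which hides the lookup and the boxing behind invokePrivate.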

Protected Methods

A protected method is like a private method, except that subclasses can use it too: it can only be invoked from within the implementation of a class or its subclasses.

For example, suppose we decide to make the sum method protected instead of private. The MathUtils class would look like this:

class MathUtils {
  def min(x: Int, y: Int): Int = if (x <= y) x else y

  def max(x: Int, y: Int): Int = if (x >= y) x else y

  protected def sum(x: Int, y: Int): Int = x + y

}

If we create a new object from MathUtils and try to call the sum method from a test, the code does not compile: the compiler (or your IDE) reports an error along the lines of ‘sum is not accessible from this place’.

But don’t worry, we have a solution for that as well.

We can write a subclass specifically for this test and override the method with a public one that delegates to super, since a protected method can be invoked from within its subclasses.


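import org.scalatest.AppendedClues.convertToClueful
import org.scalatest.matchers.should.Matchers
import org.scalatest.flatspec.AnyFlatSpec

// Test-only subclass: the override is public and delegates to the protected method
class MathUtilsTestClass extends MathUtils {
  override def sum(x: Int, y: Int): Int = super.sum(x, y)
}

class MathUtilsProtectedTest extends AnyFlatSpec with Matchers {
  "MathUtils" should "compute sum correctly" in {
    val x = 1
    val y = 2
    val mathUtilsProtected = new MathUtilsTestClass()
    mathUtilsProtected.sum(x, y) shouldBe (x + y) withClue s"Sum is not ${x + y}"
  }
}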
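A variation on the same idea, sketched here under the assumption that you would rather not change the visibility of sum itself: the test subclass can expose the protected method under a different public name. ExposedMathUtils, exposedSum and ExposedMathUtilsDemo are hypothetical names used only for this example.

```scala
// Copy of the MathUtils class with the protected method, to keep the sketch self-contained.
class MathUtils {
  protected def sum(x: Int, y: Int): Int = x + y
}

// Test-only subclass (hypothetical name): sum stays protected,
// and a public wrapper forwards to it.
class ExposedMathUtils extends MathUtils {
  def exposedSum(x: Int, y: Int): Int = sum(x, y)
}

object ExposedMathUtilsDemo {
  def main(args: Array[String]): Unit = {
    val mathUtils = new ExposedMathUtils()
    // Plain assertion; in a ScalaTest suite this would be a matcher instead.
    assert(mathUtils.exposedSum(1, 2) == 3, "Sum is not 3")
  }
}
```

The wrapper keeps the original method protected everywhere, at the cost of one extra name that exists only in the test sources.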

Summary

Now you can test the different types of methods in your Scala project: public, private, and protected. For more information about Scala, functional programming, and style, feel free to ask us or check out our other pills!

About Qbeast
Qbeast is here to simplify the lives of the Data Engineers and make Data Scientists more agile with fast queries and interactive visualizations. For more information, visit qbeast.io
© 2020 Qbeast. All rights reserved.

Read from public S3 bucket with Spark

S3 Hadoop Compatibility

Trying to read from public Amazon S3 object storage with Spark can cause many errors related to Hadoop versions.

Here are some tips to configure your Spark application.

Spark Configuration

To read the S3 public bucket, you need to start a spark-shell with Spark 3.1.1 or later and Hadoop 3.2 dependencies.

If you have to update the binaries to a compatible version to use this feature, follow these steps:

  • Download the Spark tarball from the Apache archive
$ > wget https://archive.apache.org/dist/spark/spark-3.1.1/spark-3.1.1-bin-hadoop3.2.tgz
  • Decompress the files
$ > tar xzvf spark-3.1.1-bin-hadoop3.2.tgz
  • Update the SPARK_HOME environment variable
$ > export SPARK_HOME=$PWD/spark-3.1.1-bin-hadoop3.2

Once Spark is ready to run, start the shell with the following configuration:

$ > $SPARK_HOME/bin/spark-shell \
--conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider \
--packages com.amazonaws:aws-java-sdk:1.12.20,org.apache.hadoop:hadoop-common:3.2.0,org.apache.hadoop:hadoop-client:3.2.0,org.apache.hadoop:hadoop-aws:3.2.0

The org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider class provides anonymous credentials, so no AWS keys are needed to access the public S3 bucket.

And to read the file:

val df = spark.read
  .format("parquet")
  .load("s3a://qbeast-public-datasets/store_sales")

Summary

We do not know of a working setup with Hadoop 2.7 for AWS S3. If you still want to try it, remember to include the following option:

--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem

Code Formatting with Scalafmt

Whether you are starting a Scala project or collaborating on one, this guide introduces two commonly used tools for improving code style.

Scalastyle and Scalafmt

Scalastyle is a handy tool for checking coding style in Scala, similar to what Checkstyle does in Java. Scalafmt formats code so that it looks consistent across the people on your team, and it integrates with your toolchain.

Installation

For the installation, you need to add the following to the plugins.sbt file under the project folder.

addSbtPlugin("org.scalameta" % "sbt-scalafmt" % "2.4.2") 
addSbtPlugin("org.scalastyle" %% "scalastyle-sbt-plugin" % "1.0.0")

Running sbt scalastyleGenerateConfig then creates a Scalastyle configuration file named scalastyle-config.xml. For Scalafmt, create a .scalafmt.conf file at the root of the project, where you can write rules to maintain consistency across the project.

For example:

# This style is copied from
# https://github.com/apache/spark/blob/master/dev/.scalafmt.conf
version = "2.7.5"
align = none
align.openParenDefnSite = false
align.openParenCallSite = false
align.tokens = []
optIn = {
  configStyleArguments = false
}
danglingParentheses = false
docstrings = JavaDoc
maxColumn = 98
newlines.topLevelStatements = [before,after]

Quickstart

When you open a project that contains a .scalafmt.conf file in your IDE, you will be prompted to use it.

Choose the scalafmt formatter, and it will be used to format files at compile time.

You can also check the style manually with:

sbt scalastyle

Another useful feature is that you can configure your IDE to reformat on save.

Alternatively, force code formatting from the command line:

sbt scalafmt # Format main sources

sbt test:scalafmt # Format test sources

sbt scalafmtCheck # Check if the Scala sources under the project have been formatted

sbt scalafmtSbt # Format *.sbt and project/*.scala files

sbt scalafmtSbtCheck # Check if the files have been formatted by scalafmtSbt

More tricks

Scaladocs

sbt also checks the format of the Scaladocs when publishing artifacts. The following command checks and generates them:

sbt doc

Header Creation

Sometimes a header must be present in all files. You can do so by using this plugin: https://github.com/sbt/sbt-header

First, add it to plugins.sbt:

addSbtPlugin("de.heikoseeberger" % "sbt-header" % "5.6.0")

Include the header you want to show in your build.sbt:

headerLicense := Some(HeaderLicense.Custom("Copyright 2021 Qbeast Pills"))

And check it at compile time with:

Compile / compile := (Compile / compile).dependsOn(Compile / headerCheck).value

To automate the creation of headers in all files, run:

sbt headerCreate

Using println

Scalastyle configurations often have strict rules against print statements such as println. Yet we all debug like this now and then.

The quick solution is to wrap your code:

// scalastyle:off println
<your beautiful piece of code>
// scalastyle:on println

But make sure you delete these comments before pushing any commits 😉
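As a concrete illustration of the wrapping described above, the sketch below switches the check off only around a single debug statement. DebugExample and describe are hypothetical names, and the example assumes your Scalastyle configuration defines a rule whose id is println, as the comments above imply.

```scala
// Hypothetical example object, for illustration only.
object DebugExample {

  // Builds the debug message separately, so only the print itself needs the exemption.
  def describe(values: Seq[Int]): String =
    s"count=${values.size}, sum=${values.sum}"

  def main(args: Array[String]): Unit = {
    // scalastyle:off println
    println(describe(Seq(1, 2, 3)))
    // scalastyle:on println
  }
}
```

Keeping the off/on pair as tight as possible around the statement makes it easier to spot and remove before committing.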

