A case to introduce Scala in a Java world
The Purple Tentacle is me: I want the blue water back. Yes, I'm never happy.
New job, big company, Java team, no Scala for miles around.
My goal: move some not-that-big Java projects to Scala and build the momentum to create new projects in Scala.
Why? While I like Java as it gains more capabilities, I simply love Scala. The existing projects were in good ol' verbose Java, and I know Scala, and the benefits we could expect from it, quite well.
I think Scala makes code easier to read (when we don't use "esoteric" libraries), to share with others, and to refactor. Scala leads to more robust code and better productivity, and developers produce fewer bugs. This is all due to the powerful Scala type system (among other things) and the style of coding it encourages.
The experienced developers around me were inclined to learn and change. Only the managers were lukewarm about it: "If we start using Scala, we need to validate it with the company hierarchy first. Finding Scala developers is more difficult than finding Java ones."
I need to be persuasive. I need to explain why we are going to be more productive using Scala, why it's good for developers to use it, why we need to spread it: what the drawbacks of Java are, and what the strengths of Scala are.
Years ago, coming from .NET, I had to jump into both modern Java and Scala at the same time (I'd never been past Java 1.5 and Ant before that). The learning curve was quite steep, but who isn't looking for a challenge nowadays? Scala is a challenge that opens our minds to the broad world of FP and type systems. I learned so much about programming thanks to Scala.
In this article, I'll talk about what I lived through and learned from my own experience. You may have a different point of view; feel free to share it.
We'll look into Java's drawbacks and the improper practices I've seen in projects, and consider how Scala can help us avoid them.
What was wrong?
Logorrhea
We all know it: Java is terribly verbose. Some people like it, others just live with it. Lambdas helped tremendously to reduce verbosity, but it's not enough.
var will help, data classes will help, pattern matching will help, but not every project can upgrade its JVM like that (Java 9+). Most of us will be stuck on Java 8 for a while.
Accidental verbosity doesn't help at all to understand the code; on the contrary. Java conveys way more info than it should. It's not only a question of types but also of API design. The more you must read in a piece of code, the more you have to hold in your mind to reason about it.
I prefer not to think a lot when I code or review: just the bare minimum, just the business flow, not the programming language itself. Scala makes the code more declarative whereas Java tends to be imperative: with imperative code, you need to "run the program" in your head to understand what's going on.
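To give an idea of what "declarative" means here, this is a minimal sketch. The Developer class and the seniorScalaNames function are made up for the illustration; they don't come from any of the projects I mention.

```scala
object VerbosityExample {
  // One line gives an immutable class with equals, hashCode, toString and copy.
  final case class Developer(name: String, language: String, yearsOfExperience: Int)

  // Declarative: the operation names say what we want, not how to loop.
  def seniorScalaNames(devs: List[Developer]): List[String] =
    devs
      .filter(d => d.language == "Scala" && d.yearsOfExperience >= 5)
      .map(_.name)
      .sorted
}
```

The equivalent Java 8 code would need a class with a constructor, getters, equals and hashCode before even reaching the business logic.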
Even Maven (the ubiquitous Java build tool) is quite verbose and "hard" to understand. In Scala, we have sbt, aka "Simple Build Tool" (let's not kid ourselves, it's far from simple). Still, when a friend of yours helps a project and translates "263 lines of unreadable maven xml to 34 lines of sbt definition", I prefer to read the 34 (unobfuscated) lines to understand the build.
https://twitter.com/guizmaii/status/1032929860430323715
The not-so-happy path
Because of the checked-exception constraints in Java, a lot of exceptions are rethrown as RuntimeExceptions. This happens because:
- the interface did not plan to throw anything
- the interface did not plan to throw the exception an implementation could (which makes sense: why would the interface know that?)
- "we don't care" to declare them and prefer to spare the downstream code from try-catch. Raise your hand if you have never done that.
Let's put aside the bad smell of the catch-all Exception handler, used to avoid writing multiple catch blocks.
When reviewing how multiple functions work together, try-catch can blur the logic. It's a "side output" that short-circuits the functions, something you'd rather not have to think about. The code should always read in one direction, not several.
An analogy I like is Railway Oriented Programming: errors are handled explicitly in the types (as we do in Scala), which lets us compose functions safely.
Most functions have a happy path and a not-so-happy path (errors). Here both are explicit in the return type, such as Either[Error, Int], which allows composition and error traversal.
https://www.slideshare.net/ScottWlaschin/railway-oriented-programming
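Here is a small sketch of the idea with the standard library's Either (right-biased since Scala 2.12). The CalcError type and the parse, divide and ratio functions are hypothetical names chosen for this example; CalcError plays the role of Error in Either[Error, Int].

```scala
object RailwayExample {
  sealed trait CalcError
  final case class NotANumber(raw: String) extends CalcError
  case object DivisionByZero extends CalcError

  // The exception is caught once, at the edge, and turned into a value.
  def parse(raw: String): Either[CalcError, Int] =
    try Right(raw.trim.toInt)
    catch { case _: NumberFormatException => Left(NotANumber(raw)) }

  def divide(a: Int, b: Int): Either[CalcError, Int] =
    if (b == 0) Left(DivisionByZero) else Right(a / b)

  // The for-comprehension reads as the happy path; the first Left short-circuits.
  def ratio(rawA: String, rawB: String): Either[CalcError, Int] =
    for {
      a <- parse(rawA)
      b <- parse(rawB)
      r <- divide(a, b)
    } yield r
}
```

The caller sees in the signature that ratio can fail, and with which errors, without reading its body.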
Annotations break layers
In the projects I was working on, the models were intertwined with annotations from Jackson and ORMs. Guice was all over the place too. Fortunately, I didn't find any Spring inside (which relies heavily on annotations).
Annotations can have good sides:
- Bring "magical" behaviors at runtime without typing much (like with Guice or Spring).
- Add metadata on top of classes or member variables (like Jackson).
- Make it easy to create RESTful HTTP web services (Spring, JAX-RS).
They are all good use cases, but they are a double-edged sword.
Some annotations force us to break layer boundaries and merge code that should not be tied together. Often, annotations force tight coupling. This is totally against the rules.
I may be too strict (Software Craftsmanship, anyone?) but projects should be separated into reusable layers: model, persistence, service, api…
Read more about Clean Architecture: https://android.jlelse.eu/thoughts-on-clean-architecture-b8449d9d02df
Call it what you want: the point is that framework choices should be deferred as far out in the layers as we can. Frameworks must not be part of the core business (the domain), and we should find a proper separation of concerns in code.
If I want to use some repository interface, I don't want to import Spring because you added a Spring annotation on it. I may want to provide my own implementation for some reason.
Layering makes it easy to understand the scope and the impact of the code we are reading as well as its module concerns.
Here is another example with serialization/deserialization: if I'm reading some code about a Car model to understand how we consider it, I truly do not care how it is serialized. This is totally orthogonal to the core domain, so why add coupling to the code?
Moreover, a Car could be serialized into different forms (JSON, Avro, Protobuf…). Are you going to add more and more annotations from different frameworks on top of it? These form different models and should not be tied together; they should even live in different packages and independent modules. What if I want to reuse your model but without the dependencies you carried with your annotations? I'm stuck.
Annotations make it hard to know what relies on them or uses them. How do we know it's not dead code? You can't just use your IDE to "find references". It's another world, another language, like the Upside Down world. This prevents codebase exploration and leads to uncertainty.
In Java, it's too easy to use annotations all over the place without thinking about the layers and dependencies.
In Scala, we have frameworks that generally don't work with annotations but with proper code (through code generation, macros, and implicits). Therefore, it's natural to package this code in another module, without impacting the core code. Not all annotations are bad; we'll see that later.
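As a rough sketch of what "proper code instead of annotations" can look like, here is a hand-rolled typeclass. JsonEncoder, toJson and the two objects standing in for modules are assumptions made for this example; real libraries usually derive such encoders for you, and this naive encoder does not escape strings.

```scala
// "domain" module: no framework import, no annotation.
object Domain {
  final case class Car(brand: String, seats: Int)
}

// "json" module: depends on the domain, never the other way around.
object JsonModule {
  import Domain.Car

  trait JsonEncoder[A] {
    def encode(value: A): String
  }

  // Hypothetical, naive encoder (no escaping), only here to show the layering.
  implicit val carEncoder: JsonEncoder[Car] = new JsonEncoder[Car] {
    def encode(car: Car): String =
      s"""{"brand":"${car.brand}","seats":${car.seats}}"""
  }

  def toJson[A](value: A)(implicit encoder: JsonEncoder[A]): String =
    encoder.encode(value)
}
```

An Avro or Protobuf encoder would live in yet another module, and "find references" on carEncoder works like on any other value.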
Runtime checks instead of compile-time checks
Relying on runtime is like playing with fire
Many Java features and frameworks rely on runtime reflection, introspection, and classpath scanning.
As I said, in the projects I wanted to convert to Scala, the code was covered with Guice annotations. It's "nice" but it's difficult to understand the dependency graph. Because it is built at runtime and uses reflection, we don't even try, and we let the runtime crash at startup, or not… Surprise! What a waste of our time. Who has never had this issue? Again, this leads to uncertainty.
With Spring, you have so many annotations available; most of them have an impact at runtime and use magic strings as parameters. Special tribute to @HystrixCommand(fallbackMethod="newList"), whose fallback will be triggered if the circuit is open. If you rename the fallback function but not this magic string, you'll only notice in production, when it crashes (correct me if I'm wrong).
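For comparison, here is a sketch of the same intent where the fallback is a real function reference instead of a string. withFallback is a hypothetical helper written for this example, not a Hystrix or Spring API, and it is not a circuit breaker: it only shows that the compiler can check the link between a call and its fallback.

```scala
import scala.util.Try

object FallbackExample {
  // Hypothetical helper: both arguments are by-name, so the fallback
  // is only evaluated when the primary call throws a non-fatal exception.
  def withFallback[A](primary: => A)(fallback: => A): A =
    Try(primary).getOrElse(fallback)

  def fetchList(): List[String] =
    throw new IllegalStateException("service unavailable")

  def newList(): List[String] = List.empty

  // Renaming newList without updating this line is a compile error,
  // not a crash in production.
  def safeList: List[String] = withFallback(fetchList())(newList())
}
```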
Also, this sort of code is idiomatic in Java:
if (obj instanceof Integer) {
    int intValue = ((Integer) obj).intValue();
    // ...
} else if (obj instanceof String) {
    // ...
}
Because there is no smart pattern matching (yet, but it's gonna be awesome), we sometimes see this code. But the compiler can't guarantee its completeness at compile time. What if a developer passes another type for obj and forgets to add a condition here? In Scala, pattern matching is powerful, ubiquitous, and checked at compile time.
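A minimal sketch of the Scala counterpart, assuming a small sealed hierarchy (Value, IntValue and StringValue are names invented for the example):

```scala
object MatchExample {
  // sealed: all subtypes must be declared in this file, so the compiler knows them all.
  sealed trait Value
  final case class IntValue(value: Int) extends Value
  final case class StringValue(value: String) extends Value

  def describe(v: Value): String = v match {
    case IntValue(number)  => s"int: $number"
    case StringValue(text) => s"string: $text"
    // Adding a third subtype of Value without a case here makes the compiler
    // warn that the match may not be exhaustive.
  }
}
```

With warnings turned into errors in the build, a forgotten case stops the compilation instead of surfacing at runtime.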
Scala programmers barely use annotations. They are mostly employed to generate boilerplate, like with @BeanProperty (to support Java compatibility (!)) or scalameta. They are also used to check function behavior at compile time, like with @tailrec, which ensures the function is tail-recursive. Scala is compile-time oriented, which is why the programs are more robust.
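For instance, @tailrec from scala.annotation can be used like this (the sum function is just an illustration):

```scala
import scala.annotation.tailrec

object TailrecExample {
  def sum(numbers: List[Int]): Int = {
    // The compiler rejects this method if the recursive call
    // is not in tail position.
    @tailrec
    def loop(rest: List[Int], acc: Int): Int = rest match {
      case Nil          => acc
      case head :: tail => loop(tail, acc + head)
    }
    loop(numbers, 0)
  }
}
```

If someone later rewrites the last case as head + loop(tail, acc), the code no longer compiles, instead of risking a StackOverflowError on a long list.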
Note that not all Java annotations are bad: those that are only processed at compile time to generate boilerplate, such as google/auto, mapstruct, Immutables, or lombok, are fine. That's the good way to use annotations.
A lack of semantics
Often, I find the code not semantic enough in traditional Java. The Collections API doesn't offer many functions. I still see home-made for-loops: nothing conveys why we are looping or what we want to achieve.
The Java Stream API (more semantic, more fluent) is not used enough because it does not offer enough features and has some peculiarities.
In Scala, the Collections API is more complete: it contains many common functions (fold, reduce, exists, diff, filter, head, tail, sliding, sum, zip…) and doesn't have the weird Collector thing. In Java, because of this lack of functions, you often have to rely on snippets found on StackOverflow or use a separate third-party collections library (like Guava, jOOλ, vavr).
Using properly named methods, you know what the effect will be without reading the code of the given function: it conveys way more meaning than classic for-loops, where you need to dive into the code to understand its (side) effects.
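A few of these methods on a made-up list of temperatures, as a quick sketch using only the standard library:

```scala
object CollectionsExample {
  val temperatures: List[Int] = List(12, 15, 11, 18, 21)

  val total: Int = temperatures.sum
  val hottest: Int = temperatures.foldLeft(Int.MinValue)((max, t) => math.max(max, t))
  val hasHotDay: Boolean = temperatures.exists(_ > 20)

  // Pair each day with the next one and count the days that got warmer.
  val rises: Int =
    temperatures.zip(temperatures.tail).count { case (prev, next) => next > prev }

  // Sum over every window of 3 consecutive days.
  val movingSums: List[Int] = temperatures.sliding(3).map(_.sum).toList

  // Remove the listed values (one occurrence each).
  val withoutColdDays: List[Int] = temperatures.diff(List(11, 12))
}
```

Each line names its intent; none of them needs an index variable or a mutable accumulator.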
A lack of abstraction
null is still a thing in Java. We always assume things can be null in Java. I hate null tests so much. There are some annotations to avoid null tests, like @Nullable or @NotNull, but nullability should be part of the type itself, not part of the outer world that annotations are.
Who has never worked with Guava's Preconditions? (example taken from Apache Druid):
Preconditions.checkNotNull(task, "task");
Preconditions.checkNotNull(status, "status");
Preconditions.checkArgument(
    task.getId().equals(status.getId()),
    "Task/Status ID mismatch[%s/%s]",
    task.getId(), status.getId());
The code is littered with Preconditions that throw RuntimeExceptions. Every function in the class needs to check the variables it uses again and again. You never know exactly the state of your instance, so you add defensive code. This is the worst code ever because nobody will dare to remove it now. It either means the model is wrong or the type system is not strong enough.
These kinds of null tests or annotations are not needed in idiomatic Scala, so they can't pollute the code. We use the Scala type system and abstractions to deal with absence, like Option, which we map over. In Java, Optional is unfortunately not ubiquitous because of its peculiarities and poor API.
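A minimal sketch of what this looks like with Option. The Task and Status classes below are simplified stand-ins invented for the example, not the Druid classes quoted above.

```scala
object OptionExample {
  final case class Status(id: String)
  // The type itself says the status may be absent: no annotation, no null check.
  final case class Task(id: String, status: Option[Status])

  def statusLabel(task: Task): String =
    task.status
      .map(_.id.toUpperCase)
      .getOrElse("UNKNOWN")

  // Two possible absences (unknown task, task without status) flattened into one Option.
  def findStatus(tasks: Map[String, Task], taskId: String): Option[Status] =
    tasks.get(taskId).flatMap(_.status)
}
```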
The F word
All this leads us to Functional Programming.
Java is rarely used with FP in mind (vavr is trying). Java code is often mutable and impure, making it hard to reason about (i.e. it's not referentially transparent; see my previous post, "Why Referential Transparency matters", to know why it matters).
Also, there are not a lot of fluent interfaces in Java. Stream and CompletableFuture are fluent but it's not universal. In Scala, they are everywhere because the language and libraries often embrace FP.
If we think about a program, its purpose is to process data like this: input → transformation → output. FP is exactly about this:
Output = Program(input).map(f1).map(f2)
This is a functional style: you pass and compose functions, which is easy to read and understand. To understand the whole, you divide and conquer: you just need to understand each function separately.
Those functions (here f1 and f2) don't rely on an external context (this). They use the variables we give them and return a result.
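Here is one possible reading of that line as real code, with List standing in for Program. The bodies of f1 and f2 are arbitrary examples.

```scala
object PipelineExample {
  // Each function only uses its parameter: no `this`, no hidden state.
  val f1: String => Int = _.trim.length
  val f2: Int => String = n => s"$n characters"

  def program(input: List[String]): List[String] =
    input.map(f1).map(f2)

  // Composing first and mapping once gives the same result.
  val f1ThenF2: String => String = f1.andThen(f2)
}
```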
Functions should respect 3 rules:
- Be Total: no partial functions that can produce something other than the declared return type (like exceptions).
- Be Deterministic: given a fixed input, the output should be the same (no randomization, no dependency on an external source outside the parameters).
- Be Free of Side-effects: no mutation outside of the function scope (like println, which alters stdout). Such effects must be declared in the return type of the function, so that the caller is aware of this behavior (often IO).
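A small sketch of the three rules. The function names are made up, and the IO class is a deliberately tiny, hypothetical wrapper written for this example; libraries such as cats-effect or ZIO provide the real thing.

```scala
object RulesExample {
  // Partial: throws for b == 0 although the signature promises an Int.
  def unsafeDivide(a: Int, b: Int): Int = a / b

  // Total: every input gives a value of the declared return type.
  def safeDivide(a: Int, b: Int): Option[Int] =
    if (b == 0) None else Some(a / b)

  // Deterministic: the current time is a parameter, not read from a clock.
  def isExpired(nowMillis: Long, deadlineMillis: Long): Boolean =
    nowMillis > deadlineMillis

  // Side effect as a value: nothing runs until unsafeRun() is called.
  final class IO[A](val unsafeRun: () => A) {
    def map[B](f: A => B): IO[B] = new IO(() => f(unsafeRun()))
  }

  // The return type tells the caller that an effect is involved.
  def log(message: String): IO[Unit] = new IO(() => println(message))
}
```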
I won't explain more of this marvellous world here; you can read more about it on my blog: https://www.sderosiaux.com/articles/2018/08/15/types-never-commit-too-early-part1/
Contrast this with OOP: you have a long class hierarchy where each parent contains a bit of the state. The whole aggregation forms the state of the instance. It's crazy hard to reason about: you have a this with tons of variables the methods of the class can use, but often they only need one or two. And this is where you start adding special conditions because some can be null, or the whole set of instance fields can be incoherent.
In FP, there is no need to combine data and functions into objects. The OO approach concerns itself with encapsulation, protected and private data and members to protect itself against confusing mutable state. Once you no longer have mutable state, all of that justification evaporates.
This is why FP is powerful: the scope of a function is reduced to only what it needs. Nothing is stateful in FP. The state flows from function to function.
This makes FP programs easy to test. You don't need to start your dependency injection framework or mock your ORM to test something. You just call any public function with some parameters, test the output, and you're done.
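To illustrate, here is a made-up pricing function (Item and total are hypothetical names, and prices are in cents to keep the arithmetic exact):

```scala
object PricingExample {
  final case class Item(name: String, unitPriceCents: Int, quantity: Int)

  // Pure: the result depends only on the arguments.
  def total(items: List[Item], discountPercent: Int): Int = {
    val gross = items.map(i => i.unitPriceCents * i.quantity).sum
    gross - gross * discountPercent / 100
  }
}
```

Testing it is a function call and a comparison, for example checking that total(List(Item("book", 1000, 2), Item("pen", 250, 4)), 10) equals 2700. No container to start, nothing to mock.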
Should we move to Scala?
It's not going into extinction
Some people say that Scala is in danger of becoming irrelevant. They support their belief with GitHub & TIOBE stats which point out that Scala is in decline. Moreover, they imply that fewer and fewer people are interested in Scala (no source), and that it's already hard to find Scala developers: therefore it's a risky bet.
From my own perspective, I still see a ton of active people on Gitter, writing and reading blogs, and Scala open-source projects are still growing at a good pace.
Scala thrives in the "Big Data" world. Major software such as Spark and Kafka is written, at least in part, in Scala. In a world where data is a first-class citizen, they are ubiquitous. Huge companies are using Scala: LinkedIn, Twitter, Netflix, Criteo. It's not random: they know it makes them more robust and productive.
Developers are empowered
The more abstract, the more powerful and "short" syntax you get
We are still discovering better abstractions (thanks to the work in cats, monix, scalaz, zio, but also less prominent libraries) and better ways of doing things. It means we are not yet in a "stable" phase and are still iterating on the "how-to".
Scala 3 is incoming. This will clearly improve Scala's feature set while simplifying the language. We'll gain new patterns to work with, new ways of doing things.
I'm always happy when I find out that a French company is now doing Scala. The more, the merrier. It's sad that it's a noticeable event and not as widespread as Java; I can just hope the time will come.
I think Scala raises the bar of code quality, equips developers with powerful principles (FP), and helps them open their minds to better code abstractions and a better organization (separation of concerns, Category Theory).
What about the JVM?
In Scala, we don't need more than JRE 8 to use awesome features Java lacks: type inference, higher-kinded types, pattern matching. Scala is compiled to JVM bytecode; we don't need any migration path to JRE 9+.
It's like TypeScript and JavaScript: TypeScript provides tons of features and types on top of JavaScript, where they don't exist. It's not a problem, because we don't work directly with JavaScript and all its quirks. The TypeScript compiler ensures the program is valid TypeScript first and compiles it to JavaScript. Thanks to static typing, this drastically reduces the number of bugs you would get by working with plain JavaScript.
Note that TypeScript could be translated to a target other than JS (like WebAssembly). It's just a language on top of another. Scala is the same kind of abstraction on top of JVM bytecode, and the JVM is not its only target:
- ScalaJS, which compiles to JavaScript. There are ReactJS bindings, VueJS bindings, and much more. This is very powerful because ScalaJS relies on the Scala type system. You have all its features at your disposal, and many frameworks are compatible (i.e. can be compiled to JS).
- Scala Native, which compiles to native code. It's still experimental but growing slowly.
Also, because Scala runs on the JVM, we use the same existing tools to monitor and debug Scala applications as we do with Java applications: jconsole, visualvm, Java Mission Control, etc. A Java developer can continue using their tools to debug a Scala program; it does not matter. Only the data structures used internally will be different and specific to the Scala libraries.
"Scala is slower than Java"
I'm always sceptical when a Java developer declares that.
It may be true depending on your code. But often, you just don't need the highest performance ever. We can run Scala servers with complex logic inside and still easily handle thousands of QPS (speaking from experience).
Yes, there is often an overhead when using Scala because we prefer immutability: more garbage objects are generated and the GC cleans up more than in traditional mutable Java programs. But GCs are extremely fast at this (it mostly concerns the young generation of objects).
You have to make a tradeoff between a more robust, properly typed program in Scala that pressures the GC a bit more, and a less maintainable Java program that may be faster and consume less RAM. Which do you prefer?
I would even argue that in Scala, it's easier (faster and bug-free) to refactor code to improve performance than it is in Java, all thanks to the way we generally code in Scala. Code is generally more abstract and reusable. Just changing some typeclass implementations can boost performance without compromising the whole program (in a micro-benchmark, I got a 3x boost with ZIO, for instance).
Talking about performance, in Scala we also use jmh, which is the de facto Java standard for writing micro-benchmarks. It's often used to improve a critical piece of code and ensure performance won't degrade over time and across developers. Coupled with sbt-jmh, which provides sbt commands, it's a wonderful tool to use.
Should we move further than Scala?
Further than Scala, there is Haskell or Eta (a Haskell implementation on the JVM). Most people think Haskell is the natural "evolution" for seasoned Scala developers. I'm not sure that's the case, but it clearly helps to know Scala well.
The Scala ecosystem took a lot of ideas from Haskell (scalaz, cats) to deal with Category Theory: a bunch of abstract things that compose. Googling for answers, we often stumble upon the same concept but with Haskell code. It's more concise than Scala and has an even more powerful type system (better inference, kind polymorphism (which we'll have in Dotty!)).
But this looks like a niche (way smaller than Scala). I've not heard of any Haskell company in my region, only a few in Paris. But that's only me.
Conclusion
I think going to Scala is the best step Java developers can take. They won't be lost because the existing Java ecosystem is available, but there's more: the Scala ecosystem is also available (and preferred).
It takes time and experience to go from coding Scala as "a better Java" to the idiomatic Scala FP way of doing things. Sharing with other experienced developers definitely helps to understand what this "way" is.
If you are uncertain about transitioning directly to Scala, maybe using vavr is a good first step so as not to disrupt the developers too much. But why take a baby step when you can take a man-step? In both cases, there is a learning curve, so better to climb only one and focus directly on Scala.
Using Scala will naturally constrain developers (through the APIs) and guide them toward the Functional Programming paradigm: this will improve the code quality and expand their mindset. Even if they come back to Java or any other language later, they won't code the same way: they will have evolved.
So, do you think I'm going to sell it?
Originally published on medium.com