Java 8: Lambdas, Part 2
by Ted Neward
Learn how to use lambda expressions to your advantage.
The release of Java SE 8 swiftly approaches. With it come not only the new linguistic lambda expressions (also called closures or anonymous methods)—along with some supporting language features—but also API and library enhancements that will make parts of the traditional Java core libraries easier to use. Many of these enhancements and additions are on the Collections API, and because the Collections API is pretty ubiquitous across applications, it makes the most sense to spend the majority of this article on it.
Originally published in the Sep/Oct 2013 issue of Java Magazine.
However, it’s likely that most Java developers will be unfamiliar with the concepts behind lambdas and with how designs incorporating lambdas look and behave. So, it’s best to examine why these designs look the way they do before showing off the final stage. Thus, we’ll look at some before-and-after examples to see how a problem is solved pre-lambda and post-lambda.
Note: This article was written against the b92 (May 30, 2013) build of Java SE 8, and the APIs, syntax, or semantics might have changed by the time you read this or by the time Java SE 8 is released. However, the concepts behind these APIs, and the approach taken by the Oracle engineers, should be close to what we see here.
Collections and Algorithms
The Collections API has been with us since JDK 1.2, but not all parts of it have received equal attention or love from the developer community. Algorithms, a more functional-centric way of interacting with collections, have been a part of the Collections API since its initial release, but they often get little attention, despite their usefulness. For example, the Collections class sports a dozen or so methods all designed to take a collection as a parameter and perform some operation against the collection or its contents.
Consider, for example, the Person class shown in Listing 1, which in turn is used by a List that holds a handful of Person objects, as shown in Listing 2.
Listing 1
public class Person {
    // Fields backing the accessors below
    private final String firstName;
    private final String lastName;
    private final int age;

    public Person(String fn, String ln, int a) {
        this.firstName = fn; this.lastName = ln; this.age = a;
    }
    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
}
Listing 2
List<Person> people = Arrays.asList(
    new Person("Ted", "Neward", 42),
    new Person("Charlotte", "Neward", 39),
    new Person("Michael", "Neward", 19),
    new Person("Matthew", "Neward", 13),
    new Person("Neal", "Ford", 45),
    new Person("Candy", "Ford", 39),
    new Person("Jeff", "Brown", 43),
    new Person("Betsy", "Brown", 39)
);
Now, assuming we want to examine or sort this list by last name and then by age, a naive approach is to write a for loop (in other words, implement the sort by hand each time we need to sort). The problem with this, of course, is that it violates DRY (the Don’t Repeat Yourself principle) and, worse, we have to reimplement it each time, because for loops are not reusable.
The Collections API has a better approach: the Collections class sports a sort method that will sort the contents of the List. However, using this requires either that the Person class implement the Comparable interface (which is called a natural ordering, and defines a default ordering for all Person types) or that you pass in a Comparator instance to define how Person objects should be sorted.
So, if we want to sort first by last name and then by age (in the event the last names are the same), the code will look something like Listing 3. But that’s a lot of work to do something as simple as sort by last name and then by age. This is exactly where the new closures feature will be of help, making it easier to write the Comparator (see Listing 4).
Listing 3
Collections.sort(people, new Comparator<Person>() {
    public int compare(Person lhs, Person rhs) {
        if (lhs.getLastName().equals(rhs.getLastName())) {
            return lhs.getAge() - rhs.getAge();
        }
        else
            return lhs.getLastName().compareTo(rhs.getLastName());
    }
});
Listing 4
Collections.sort(people, (lhs, rhs) -> {
    if (lhs.getLastName().equals(rhs.getLastName()))
        return lhs.getAge() - rhs.getAge();
    else
        return lhs.getLastName().compareTo(rhs.getLastName());
});
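For readers who want to run the lambda version end to end, here is a minimal, self-contained sketch. The nested Person class, the LambdaSortSketch name, and the sample data are stand-ins chosen for illustration, and the sketch uses Integer.compare rather than the subtraction shown in the listings, which is a deliberate substitution to sidestep overflow on extreme values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LambdaSortSketch {
    // Minimal stand-in for the article's Person class (assumed shape).
    static class Person {
        final String firstName, lastName;
        final int age;
        Person(String fn, String ln, int a) { firstName = fn; lastName = ln; age = a; }
    }

    // Sorts a small sample by last name, then age, and returns the first names in order.
    static List<String> sortedFirstNames() {
        List<Person> people = Arrays.asList(
            new Person("Ted", "Neward", 42),
            new Person("Charlotte", "Neward", 39),
            new Person("Neal", "Ford", 45),
            new Person("Candy", "Ford", 39),
            new Person("Jeff", "Brown", 43));

        // The lambda supplies the body of Comparator.compare(Person, Person).
        Collections.sort(people, (lhs, rhs) -> {
            int byLast = lhs.lastName.compareTo(rhs.lastName);
            return byLast != 0 ? byLast : Integer.compare(lhs.age, rhs.age);
        });

        List<String> names = new ArrayList<>();
        for (Person p : people) {
            names.add(p.firstName);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedFirstNames());
    }
}
```

The parameter types of lhs and rhs are never written out; the compiler infers them from the element type of the list being sorted.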
The Comparator is a prime example of the need for lambdas in the language: it’s one of the dozens of places where a one-off anonymous method is useful. (Bear in mind, this is probably the easiest, and weakest, benefit of lambdas. We’re essentially trading one syntax for another, admittedly terser, syntax, but even if you put this article down and walk away right now, a significant amount of code will be saved just from that terseness.)
If this particular comparison is something that we use over time, we can always capture the lambda as a Comparator instance, because that is the signature of the method that the lambda fits (in this case, "int compare(Person, Person)"), and store it on the Person class directly, making the implementation of the lambda easier (see Listing 5) and its use even more readable (see Listing 6).
Listing 5
public class Person {
    // . . .
    public static final Comparator<Person> BY_LAST_AND_AGE =
        (lhs, rhs) -> {
            if (lhs.lastName.equals(rhs.lastName))
                return lhs.age - rhs.age;
            else
                return lhs.lastName.compareTo(rhs.lastName);
        };
}
Listing 6
Collections.sort(people, Person.BY_LAST_AND_AGE);
Storing a Comparator<Person> instance on the Person class is a bit odd, though. It would make more sense to define a method that does the comparison, and use that instead of a Comparator instance. Fortunately, Java will allow any method to be used that satisfies the same signature as the method on Comparator, so it’s equally possible to write the BY_LAST_AND_AGE Comparator as a standard instance or static method on Person (see Listing 7) and use it instead (see Listing 8).
Listing 7
public static int compareLastAndAge(Person lhs, Person rhs) {
    if (lhs.lastName.equals(rhs.lastName))
        return lhs.age - rhs.age;
    else
        return lhs.lastName.compareTo(rhs.lastName);
}
Listing 8
Collections.sort(people, Person::compareLastAndAge);
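The "instance or static method" remark can be made concrete with a small sketch. It assumes a cut-down Person with only a last name and an age; the instance method name compareLastAndAgeWith is hypothetical, added here to show that a one-argument instance method also fits the Comparator shape, with the receiver playing the role of the left-hand side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class MethodRefSketch {
    // Cut-down stand-in for Person (assumed shape).
    static class Person {
        final String lastName;
        final int age;
        Person(String ln, int a) { lastName = ln; age = a; }

        // Static, two arguments: same shape as Comparator.compare.
        static int compareLastAndAge(Person lhs, Person rhs) {
            int byLast = lhs.lastName.compareTo(rhs.lastName);
            return byLast != 0 ? byLast : Integer.compare(lhs.age, rhs.age);
        }

        // Instance, one argument: "this" stands in for lhs (hypothetical name).
        int compareLastAndAgeWith(Person other) {
            return compareLastAndAge(this, other);
        }
    }

    // Sorts the same sample with either kind of method reference.
    static List<String> sorted(boolean useStatic) {
        List<Person> people = Arrays.asList(
            new Person("Neward", 42), new Person("Brown", 43),
            new Person("Ford", 39), new Person("Brown", 39));

        Comparator<Person> order;
        if (useStatic) {
            order = Person::compareLastAndAge;
        } else {
            order = Person::compareLastAndAgeWith;
        }
        Collections.sort(people, order);

        List<String> out = new ArrayList<>();
        for (Person p : people) {
            out.add(p.lastName + " " + p.age);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sorted(true));
        System.out.println(sorted(false));
    }
}
```

Both references produce the same ordering, so the choice between them is mostly a matter of where the comparison logic reads most naturally.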
Thus, even without any changes to the Collections API, lambdas are already helpful and useful. Again, if you walk away from this article right here, things are pretty good. But they’re about to get a lot better.
Changes in the Collections API
With some additional APIs on the Collection classes themselves, a variety of new and more powerful approaches and techniques open up, most often leveraging techniques drawn from the world of functional programming. No knowledge of functional programming is necessary to use them, fortunately, as long as you can open your mind to the idea that functions are just as valuable to manipulate and reuse as are classes and objects.
Comparisons. One of the drawbacks to the Comparator approach shown earlier is hidden inside the Comparator implementation. The code is actually doing two comparisons, one as a “dominant” comparison over the other, meaning that last names are compared first, and age is compared only if the last names are identical. If project requirements later demand that sorting be done by age first and by last names second, a new Comparator must be written, because no parts of compareLastAndAge can be reused.
This is where taking a more functional approach can add some powerful benefits. If we look at that comparison as entirely separate Comparator instances, we can combine them to create the precise kind of comparison needed (see Listing 9).
Listing 9
public static final Comparator<Person> BY_FIRST =
    (lhs, rhs) -> lhs.firstName.compareTo(rhs.firstName);
public static final Comparator<Person> BY_LAST =
    (lhs, rhs) -> lhs.lastName.compareTo(rhs.lastName);
public static final Comparator<Person> BY_AGE =
    (lhs, rhs) -> lhs.age - rhs.age;
Historically, writing the combination by hand has been less productive, because by the time you write the code to do the combination, it would be just as fast (if not faster) to write the multistage comparison by hand.
As a matter of fact, this “I want to compare these two X things by comparing values returned to me by a method on each X” approach is such a common thing, the platform gave us that functionality out of the box. On the Comparator interface, a static comparing method takes a function (a lambda) that extracts a comparison key out of the object and returns a Comparator that sorts based on that. This means that Listing 9 could be rewritten even more easily as shown in Listing 10.
Listing 10
public static final Comparator<Person> BY_FIRST =
    Comparator.comparing(Person::getFirstName);
public static final Comparator<Person> BY_LAST =
    Comparator.comparing(Person::getLastName);
public static final Comparator<Person> BY_AGE =
    Comparator.comparing(Person::getAge);
Think for a moment about what this is doing: the Person is no longer about sorting, but just about extracting the key by which the sort should be done. This is a good thing: Person shouldn’t have to think about how to sort; Person should just focus on being a Person.
It gets better, though, particularly when we want to compare based on two or more of those values.
Composition. As of Java 8, the Comparator interface comes with several methods to combine Comparator instances in various ways by stringing them together. For example, the Comparator.thenComparing() method takes a Comparator to use for comparison after the first one compares. So, re-creating the “last name then age” comparison can now be written in terms of the two Comparator instances BY_LAST and BY_AGE, as shown in Listing 11. Or, if you prefer to use methods rather than Comparator instances, use the code in Listing 12.
Listing 11
Collections.sort(people, Person.BY_LAST
    .thenComparing(Person.BY_AGE));
Listing 12
Collections.sort(people,
    Comparator.comparing(Person::getLastName)
        .thenComparing(Person::getAge));
By the way, for those who didn’t grow up using Collections.sort(), there’s now a sort() method directly on List. This is one of the neat things about the introduction of interface default methods: where we used to have to put that kind of noninheritance-based reusable behavior in static methods, now it can be hoisted up into interfaces. (See the previous article in this series for more details.)
Similarly, if the code needs to sort the collection of Person objects by last name and then by first name, no new Comparator needs to be written, because this comparison can, again, be made of the two particular atomic comparisons shown in Listing 13.
Listing 13
Collections.sort(people,
    Comparator.comparing(Person::getLastName)
        .thenComparing(Person::getFirstName));
This combinatory “connection” of methods, known as functional composition, is common in functional programming and at the heart of why functional programming is as powerful as it is.
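To see why composing small comparators pays off, the sketch below builds two atomic comparators once and combines them in both priority orders. It assumes the released Java SE 8 API, where comparing and comparingInt are static methods on Comparator; the ComposeSketch class, the nested Person, and the sample data are illustrative stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ComposeSketch {
    // Minimal stand-in for the article's Person class (assumed shape).
    static class Person {
        private final String firstName, lastName;
        private final int age;
        Person(String fn, String ln, int a) { firstName = fn; lastName = ln; age = a; }
        String getFirstName() { return firstName; }
        String getLastName() { return lastName; }
        int getAge() { return age; }
    }

    // Atomic comparators: each one only knows how to extract a key.
    static final Comparator<Person> BY_LAST = Comparator.comparing(Person::getLastName);
    static final Comparator<Person> BY_AGE = Comparator.comparingInt(Person::getAge);

    // Sorts a fresh copy of the sample with the given ordering.
    static List<String> firstNames(Comparator<Person> order) {
        List<Person> people = new ArrayList<>(Arrays.asList(
            new Person("Ted", "Neward", 42),
            new Person("Charlotte", "Neward", 39),
            new Person("Neal", "Ford", 45),
            new Person("Candy", "Ford", 39),
            new Person("Jeff", "Brown", 43)));
        people.sort(order); // List.sort is the default method mentioned in the text
        List<String> names = new ArrayList<>();
        for (Person p : people) {
            names.add(p.getFirstName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(firstNames(BY_LAST.thenComparing(BY_AGE)));
        System.out.println(firstNames(BY_AGE.thenComparing(BY_LAST)));
    }
}
```

Swapping the priority from "last name, then age" to "age, then last name" needs no new comparison code, only a different composition of the same two pieces.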
It’s important to understand that the real benefit here isn’t just in the APIs that enable us to do comparisons, but the ability to pass bits of executable code (and then combine them in new and interesting ways) to create opportunities for reuse and design. Comparator is just the tip of the iceberg. Lots of things can be made more flexible and powerful, particularly when combining and composing them.
Iteration. As another example of how lambdas and functional approaches change the approach to code, consider one of the fundamental operations done with collections: that of iterating over them. Java 8 will bring to collections a change via the forEach() default method defined on the Iterable interface. (Iterator gains a similar forEachRemaining() method.) Using it to print each of the items in the collection, for example, requires passing a lambda to the forEach method on the collection, as shown in Listing 14.
Listing 14
people.forEach((it) -> System.out.println("Person: " + it));
Officially, the type of lambda being passed in is a Consumer instance, defined in the java.util.function package. Unlike traditional Java interfaces, however, Consumer is one of the new functional interfaces, meaning that direct implementations will likely never happen. Instead, the new way to think about it is solely in terms of its single, important method, accept, which is the method the lambda provides. The rest (such as andThen) are utility methods defined in terms of the important method, and they are designed to support it.
For example, andThen() chains two Consumer instances together into a single Consumer, so the first one is called first and the second is called immediately after. This provides useful composition techniques that are a little outside the scope of this article.
Many of the use cases involved in walking through a collection have the purpose of finding items that fit a particular criterion, for example, determining which of the Person objects in the collection are of drinking age, because the automated code system needs to send everyone in that collection a beer. This “act upon a thing coming from a group of things” is actually far more widespread than just operating upon a collection. Think about operating on each line in a file, each row from a result set, each value generated by a random-number generator, and so on. Java SE 8 generalized this concept one step further, outside collections, by lifting it into its own interface: Stream.
Stream. Like several other interfaces in the JDK, the Stream interface is a fundamental interface that is intended for use in a variety of scenarios, including the Collections API. It represents a stream of objects, and on the surface of things, it feels similar to how Iterator gives us access one object at a time through a collection.
However, unlike collections, Stream does not guarantee that the collection of objects is finite. Thus, it is a viable candidate for pulling strings from a file, for example, or other kinds of on-demand operations, particularly because it is designed not only to allow for composition of functions, but also to permit parallelization “under the hood.”
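A stream with no end is easy to demonstrate. The sketch below, under the assumption that the released Java SE 8 Stream.iterate() and limit() methods are available, generates the integers 1, 2, 3, and so on without bound and only stops because limit() asks for a fixed number of results. The InfiniteStreamSketch class and firstEvenSquares method are illustrative names.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamSketch {
    // Returns the first n even perfect squares, drawn from an unbounded source.
    static List<Integer> firstEvenSquares(int n) {
        return Stream.iterate(1, i -> i + 1) // 1, 2, 3, ... with no natural end
            .map(i -> i * i)                 // square each value as it is pulled
            .filter(sq -> sq % 2 == 0)       // keep only the even squares
            .limit(n)                        // stop pulling after n results
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3));
    }
}
```

Without limit() (or another short-circuiting operation), collecting this stream would never finish, which is exactly the property a finite collection cannot have.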
Consider the earlier requirement: the code needs to filter out any Person object that is not at least 21 years of age. Once a Collection converts to a Stream (via the stream() method defined on the Collection interface), the filter method can be used to produce a new Stream through which only the filtered objects come (see Listing 15).
Listing 15
people
.stream()
.filter(it -> it.getAge() >= 21)
The parameter to filter is a Predicate, an interface defined as taking one genericized parameter and returning a boolean. The intent of the Predicate is to determine whether the parameter object is included as part of the returned set.
The return from filter() is another Stream, which means that the filtered Stream is also available for further manipulation, such as to forEach() through each of the elements that come through the Stream, in this case to display the results (see Listing 16).
Listing 16
people.stream()
.filter((it) -> it.getAge() >= 21)
.forEach((it) ->
System.out.println("Have a beer, " + it.getFirstName()));
This neatly demonstrates the composability of streams: we can take streams and run them through a variety of atomic operations, each of which does one (and only one) thing to the stream. Additionally, it’s important to note that filter() is lazy: it will filter only as it needs to, on demand, rather than going through the entire collection of Person objects and filtering ahead of time (which is what we’re used to with the Collections API).
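The laziness of filter() can be observed by counting how often the predicate actually runs. This is a small sketch using plain ages instead of Person objects; the LazyFilterSketch class and the counter are illustrative, and the behavior shown is for a sequential stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyFilterSketch {
    // Returns {predicate calls before the terminal operation, calls after it, first match}.
    static int[] demo() {
        List<Integer> ages = Arrays.asList(42, 39, 19, 13, 45);
        AtomicInteger calls = new AtomicInteger();

        // Building the pipeline does not run the predicate at all.
        Stream<Integer> adults = ages.stream().filter(a -> {
            calls.incrementAndGet();
            return a >= 21;
        });
        int before = calls.get();

        // findFirst pulls elements only until one passes the filter.
        Optional<Integer> first = adults.findFirst();
        int after = calls.get();

        return new int[] { before, after, first.get() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

Because the first element already satisfies the predicate, the remaining four ages are never examined.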
Predicates. It might seem odd at first that the filter() method takes only a single Predicate. After all, if a goal was to find all the Person objects whose age is greater than 21 and whose last name is Neward, it would seem that filter() could or should take a pair of Predicate instances. Of course, this opens a Pandora’s box of possibilities. What if the goal is to find all Person objects with an age greater than 21 and less than 65, and with a first name of at least four characters? Infinite possibilities suddenly open up, and the filter() API would need to somehow approach all of these.
Unless, of course, a mechanism were available to somehow coalesce all of these possibilities down into a single Predicate. Fortunately, it’s fairly easy to see that any combination of Predicate instances can itself be a single Predicate. In other words, if a given filter needs to have condition A be true and condition B be true before an object can be included in the filtered stream, that is itself a Predicate (A and B), and we can combine those two together into a single Predicate by writing a Predicate that takes any two Predicate instances and returns true only if both A and B yield true.
This “and”ing Predicate is completely generic and can be written well ahead of time, by virtue of the fact that it knows only about the two Predicate instances that it needs to call (and nothing about the parameters being passed in to each of those).
If the Predicate closures are stored in Predicate references (similar to how Comparator references were used earlier, as members on Person), they can be strung together using the and() method on them, as shown in Listing 17.
Listing 17
Predicate<Person> drinkingAge = (it) -> it.getAge() >= 21;
Predicate<Person> brown = (it) -> it.getLastName().equals("Brown");
people.stream()
    .filter(drinkingAge.and(brown))
    .forEach((it) ->
        System.out.println("Have a beer, " +
            it.getFirstName()));
As might be expected, and(), or(), and negate() are all available. Make sure to check the Javadoc for a full introduction to all the possibilities.
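The following sketch exercises the Predicate combinators that ship with the released Java SE 8 API (and(), or(), and negate()) against a small sample. The PredicateSketch class, the nested Person, and the sample data are stand-ins for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateSketch {
    // Minimal stand-in for the article's Person class (assumed shape).
    static class Person {
        final String firstName, lastName;
        final int age;
        Person(String fn, String ln, int a) { firstName = fn; lastName = ln; age = a; }
    }

    static final Predicate<Person> DRINKING_AGE = p -> p.age >= 21;
    static final Predicate<Person> BROWN = p -> p.lastName.equals("Brown");

    // Returns the first names of everyone in the sample who passes the test.
    static List<String> names(Predicate<Person> test) {
        List<Person> people = Arrays.asList(
            new Person("Ted", "Neward", 42),
            new Person("Michael", "Neward", 19),
            new Person("Jeff", "Brown", 43),
            new Person("Betsy", "Brown", 39),
            new Person("Matthew", "Neward", 13));
        return people.stream()
            .filter(test)
            .map(p -> p.firstName)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(names(DRINKING_AGE.and(BROWN)));         // both must hold
        System.out.println(names(DRINKING_AGE.negate()));           // under 21
        System.out.println(names(BROWN.or(DRINKING_AGE.negate()))); // either holds
    }
}
```

Each combinator returns a new Predicate, so the results can be chained further or stored under a descriptive name for reuse.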
map() and reduce(). Other common Stream operations include map(), which applies a function across each element present within a Stream to produce a result out of each element. So, for example, we can obtain the age of each Person in the collection by applying a simple function to retrieve the age out of each Person, as shown in Listing 18.
Listing 18
IntStream ages =
people.stream()
.mapToInt((it) -> it.getAge());
For all practical purposes, IntStream (and its cousins LongStream and DoubleStream) is a specialization of the Stream<T> interface (meaning that it creates custom versions of that interface) for those primitive types.
This, then, produces a Stream of integers out of a Collection of Person instances. This is also sometimes known as a transformation operation, because the code is transforming or projecting a Person into an int.
Similarly, reduce() is an operation that takes a stream of values and, through some kind of operation, reduces them into a single value. Reduction is an operation already familiar to developers, though they might not recognize it at first: the COUNT() operator from SQL is one such operation (reducing from a collection of rows to a single integer), as are the SUM(), MAX(), and MIN() operators. Each of these takes a stream of values (rows) and produces a single value (the integer) by applying some operation (for example, increment a counter, add the value to a running total, select the highest, or select the lowest) to each of the values in the stream.
So, for example, you could sum the values prior to dividing by the number of elements in the stream to obtain an average age. Given the new APIs, it’s easiest to just use the built-in methods, as shown in Listing 19.
Listing 19
int sum = people.stream()
.mapToInt(Person::getAge)
.sum();
But doing this bypasses an interesting opportunity to explore one of the more powerful features of the new Java API, that of doing a reduction: coalescing a collection of values down into a single one through some custom operation. So, let’s rewrite the summation part of this using the new reduce() method:
.reduce(0, (l, r) -> l + r);
This reduction, also known in functional circles as a fold, starts with a seed value (0, in this case), and applies the closure to the seed and the first element in the stream, taking the result and storing it as the accumulated value that will be used as the seed for the next element in the stream.
In other words, in a list of integers such as 1, 2, 3, 4, and 5, the seed 0 is added to 1 and the result (1) is stored as the accumulated value, which then serves as the left-hand value in the addition with the next number in the stream (1+2). The result (3) is stored as the accumulated value and used in the next addition (3+3). The result (6) is stored and used in the next addition (6+4), and the result is used in the final addition (10+5), yielding the final result 15. And, sure enough, if we run the code in Listing 20, we get that result.
Listing 20
List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
int sum = values.stream().reduce(0, (l,r) -> l+r);
System.out.println(sum);
Note that in Listing 20 the stream is a Stream<Integer>, so the closure accepted as the second argument to reduce is a BinaryOperator<Integer>. When the same reduction runs on an IntStream, the argument is an IntBinaryOperator, defined as taking two int values and returning an int result. IntBinaryOperator and ToIntBiFunction are examples of specialized functional interfaces (with sibling versions for double and long) that take two parameters (of one or two different types) and return a primitive result. These specialized versions were created mostly to ease the work required for using the common primitive types.
IntStream also has a couple of helper methods, including the average(), min(), and max() methods, that do some of the more common integer operations. Additionally, binary operations (such as summing two numbers) are also often defined on the primitive wrapper classes for that type (Integer::sum, Long::max, and so on).
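The fold walked through above can be written three ways: as an explicit loop, as a reduce() call, and (for the averaging case mentioned earlier) with the IntStream helper. The sketch below is illustrative only; ReduceSketch and its method names are not from the article.

```java
import java.util.Arrays;
import java.util.List;

class ReduceSketch {
    // The fold spelled out: acc starts at the seed and absorbs one value per step.
    static int foldByHand(List<Integer> values) {
        int acc = 0;
        for (int v : values) {
            acc = acc + v; // same shape as (l, r) -> l + r
        }
        return acc;
    }

    // The same fold expressed with reduce(), using Integer::sum as the operation.
    static int foldWithReduce(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // average() returns an OptionalDouble because an empty stream has no average.
    static double average(List<Integer> values) {
        return values.stream()
            .mapToInt(Integer::intValue)
            .average()
            .orElse(0.0);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(foldByHand(values));
        System.out.println(foldWithReduce(values));
        System.out.println(average(values));
    }
}
```

The loop and the reduce() call compute the same running total; the difference is that reduce() leaves the iteration strategy to the stream.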
More maps and reduction. Maps and reduction are useful in a variety of situations beyond just simple math. After all, in any case where a collection of objects can be transformed into a different object (or value) and then collected into a single value, map and reduction operations work.
The map operation, for example, can be useful as an extraction or projection operation to take an object and extract portions of it, such as extracting the last name out of a Person object:
Stream<String> lastNames = people.stream().map(Person::getLastName);
Once the last names have been retrieved from the Person stream, the reduction can concatenate strings together, such as transforming the last name into a data representation for XML. See Listing 21.
Listing 21
String xml =
"<people data='lastname'>" +
people.stream()
.map(it -> "<person>" + it.getLastName() + "</person>")
.reduce("", String::concat)
+ "</people>";
System.out.println(xml);
And, naturally, if different XML formats are required, different operations can be used to control the contents of each format, supplied either ad hoc, as in Listing 21, or from methods defined on other classes, such as from the Person class itself, as shown in Listing 22, which can then be used as part of the map() operation to transform the stream of Person objects into a JSON array of object elements, as shown in Listing 23.
Listing 22
public class Person {
    // . . .
    public static String toJSON(Person p) {
        // JSON requires member names to be double-quoted strings
        return
            "{" +
                "\"firstName\": \"" + p.firstName + "\", " +
                "\"lastName\": \"" + p.lastName + "\", " +
                "\"age\": " + p.age + " " +
            "}";
    }
}
Listing 23
String json =
people.stream()
.map(Person::toJSON)
.reduce("[", (l, r) -> l + (l.equals("[") ? "" : ",") + r)
+ "]";
System.out.println(json);
The ternary operation in the middle of the reduce operation is there to avoid putting a comma in front of the first Person serialized to JSON. Some JSON parsers might accept this format, but that is not guaranteed, and it looks ugly to have it there.
It is ugly enough, in fact, to fix. The code is actually a lot easier to write if we use the built-in Collector interface and its partner Collectors, which specifically do this kind of mutable-reduction operation (see Listing 24). This has the added benefit of being much faster than the versions using the explicit reduce and String::concat from the earlier examples, so it’s generally a better bet.
Listing 24
String joined = people.stream()
    .map(Person::toJSON)
    .collect(Collectors.joining(", "));
System.out.println("[" + joined + "]");
Oh, and lest we forget our old friend Comparator, note that Stream also has an operation to sort a stream in-flight, so the sorted JSON representation of the Person list looks like Listing 25.
Listing 25
String json = people.stream()
    .sorted(Person.BY_LAST)
    .map(Person::toJSON)
    .collect(Collectors.joining(", ", "[", "]"));
System.out.println(json);
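To compare the hand-written reduction with the collector-based version side by side, here is a small sketch that uses plain strings in place of serialized Person objects. The JoiningSketch class and its method names are illustrative; Collectors.joining(delimiter, prefix, suffix) is the three-argument overload in the released Java SE 8 API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoiningSketch {
    // Hand-rolled reduction: the ternary suppresses the separator before the first item.
    static String viaReduce(List<String> items) {
        return items.stream()
            .reduce("[", (l, r) -> l + (l.equals("[") ? "" : ", ") + r)
            + "]";
    }

    // Collector-based version: delimiter, prefix, and suffix are handled for us.
    static String viaJoining(List<String> items) {
        return items.stream()
            .collect(Collectors.joining(", ", "[", "]"));
    }

    // Sorting in-flight before joining, using natural String ordering.
    static String sortedJoining(List<String> items) {
        return items.stream()
            .sorted()
            .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("b", "a", "c");
        System.out.println(viaReduce(items));
        System.out.println(viaJoining(items));
        System.out.println(sortedJoining(items));
    }
}
```

The two unsorted versions produce identical text for the same input, but the collector version has no special case to get wrong and does not rebuild the whole string on every step.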
This is powerful stuff.
Parallelization. What’s even more powerful is that these operations are entirely independent of the logic necessary to pull each object through the Stream and act on each one. That matters because the traditional for loop breaks down when we try to iterate, map, or reduce a large collection by breaking it into segments that will each be processed by a separate thread.
The Stream API, however, already has that covered, and the XML or JSON map() and reduce() operations shown earlier need only a slight change: instead of calling stream() to obtain a Stream from the collection, use parallelStream(), as demonstrated in Listing 26.
Listing 26
people.parallelStream()
    .filter((it) -> it.getAge() >= 21)
    .forEach((it) ->
        System.out.println("Have a beer " + it.getFirstName() +
            " " + Thread.currentThread()));
For a collection of at least a dozen items, at least on my laptop, two threads are used to process the collection: the thread named main, which is the traditional one used to invoke the main() method of a Java class, and another thread named ForkJoinPool.commonPool-worker-1, which is obviously not of our creation.
Obviously, for a collection of a dozen items, this would be hideously unnecessary, but for several hundred or more, this would be the difference between “good enough” and “needs to go faster.” Without these new methods and approaches, you would be staring at some significant code and algorithmic study. With them, you can write parallelized code literally by adding eight keystrokes (nine if you count the Shift key required to capitalize the s in stream) to the previously sequential processing.
And, where necessary, a parallel Stream can be brought back to a sequential one by calling (you can probably guess) sequential() on it.
The important thing to note is that regardless of whether the processing is better done sequentially or in parallel, the same Stream interface is used for both. The sequential or parallel implementation becomes entirely an implementation detail, which is exactly where we want it to be when working on code that focuses on business needs (and value); we don’t want to focus on the low-level details of firing up threads in thread pools and synchronizing across them.
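The point that sequential and parallel processing share one interface can be sketched as follows. The pipeline code is written once against Stream, and only the call that obtains the stream differs. The ParallelSketch class and its method names are illustrative; isParallel() is used here just to report which mode a stream is in.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ParallelSketch {
    // The pipeline is identical either way; only the stream's origin changes.
    static long countAdults(List<Integer> ages, boolean parallel) {
        Stream<Integer> source = parallel ? ages.parallelStream() : ages.stream();
        return source.filter(a -> a >= 21).count();
    }

    // Reports whether a parallel stream switched back with sequential() is still parallel.
    static boolean stillParallelAfterSequential(List<Integer> ages) {
        return ages.parallelStream().sequential().isParallel();
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(42, 39, 19, 13, 45, 39, 43, 39);
        System.out.println(countAdults(ages, false));
        System.out.println(countAdults(ages, true));
        System.out.println(stillParallelAfterSequential(ages));
    }
}
```

A stateless filter followed by count() gives the same answer in both modes; operations with side effects or order dependence need more care before switching to a parallel stream.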
Conclusion
Lambdas will bring a lot of change to Java, both in terms of how Java code will be written and how it will be designed. Some of these changes are already taking place within the Java SE libraries, and they will slowly make their way through many other libraries—both those owned by the Java platform and those out in “the wilds” of open source—as developers grow more comfortable with the abilities (and drawbacks) of lambdas.
Numerous other changes are present within the Java SE 8 release. But if you understand how lambdas on collections work, you will have a strong advantage when thinking about how to leverage lambdas within your own designs and code, and you can create better-decoupled code for years to come.
Ted Neward (@tedneward) is an architectural consultant for Neudesic. He has served on several Expert Groups; authored many books, including Effective Enterprise Java (Addison-Wesley Professional, 2004) and Professional F# 2.0 (Wrox, 2010); written hundreds of articles on Java, Scala, and other technologies; and spoken at hundreds of conferences.