This repository provides the necessary context to understand what Java 21 Virtual Threads are, how they work, and what makes them unique. It also provides a few examples to understand them more in depth.
In the context of programming, performing more than one task at a time is known as concurrent programming. Probably the most popular concurrent program example is an HTTP server. HTTP servers await incoming requests and concurrently process them in order to handle more than one request at a time.
Since its initial release, Java has provided the ability to write concurrent programs via the Thread class.
Historically, Thread instances have been expensive objects to create and maintain since they have a 1:1 mapping with OS (operating system) threads.
As a consequence of this 1:1 mapping, each Thread instance takes a significant amount of memory and requires the OS scheduler to perform a context switch in order for it to execute.
To optimize this cost, Java HTTP servers have traditionally processed HTTP requests using thread pools.
Thread pools are a collection of reusable Thread objects that avoid the cost associated with creating a Thread object for each processing task.
Thread pools can have a static or dynamic size, but in general they specify an upper bound to avoid depleting compute and memory resources.
Thus, in traditional Java HTTP servers, the maximum number of requests that can be processed concurrently is bounded by the maximum size of the associated thread pool.
Any incoming requests that exceed this limit either have to be buffered until a Thread becomes available or be rejected altogether.
The following example shows two states that an HTTP server can be in:
State 1
| HTTP client |                   HTTP server                   |
|=============|=================================================|
|             |      HTTP requests       | Thread Pool (max: 3) |
|             |==========================|======================|
|             | Buffer (max: 2) | Active |   Idle   |  Active   |
|             |==========================|==========|===========|
|      o------|-----------------|-->r1<--|----------|----t1     |
|      o------|-----------------|-->r2<--|----------|----t2     |
|             |                 |        |    t3    |           |
- The request buffer is empty.
- There are two active requests: r1 and r2.
- There are two active threads: t1 and t2.
- There is one idle thread: t3.
Then, some time after state 1...
State 2
| HTTP client |                   HTTP server                   |
|=============|=================================================|
|             |      HTTP requests       | Thread Pool (max: 3) |
|             |==========================|======================|
|             | Buffer (max: 2) | Active |   Idle   |  Active   |
|             |==========================|==========|===========|
|      o------|-----------------|-->r3<--|----------|----t1     |
|      o------|-----------------|-->r2<--|----------|----t2     |
|      o------|-----------------|-->r4<--|----------|----t3     |
|      o------|------>r5        |        |          |           |
|      o------|------>r6        |        |          |           |
|      o-----X|                 |        |          |           |
- There are two buffered requests: r5 and r6.
- There are three active requests: r2, r3 and r4.
- There are three active threads: t1, t2 and t3.
- There are no idle threads.
- The HTTP server rejects an incoming request because the buffer is full.
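The two states above can be reproduced with a plain ThreadPoolExecutor. The following is a minimal sketch, not how any particular HTTP server is implemented: the class name BoundedPoolSketch and the method submitAndCountRejections are hypothetical, and a CountDownLatch stands in for requests that are still being processed.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolSketch {

    /** Submits the given number of long-running tasks and returns how many were rejected. */
    static int submitAndCountRejections(int requests) {
        // Pool of exactly 3 threads with a buffer of 2, mirroring the diagrams above.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                3, 3, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable request = () -> {
            try {
                release.await(); // stand-in for a request that is still being processed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(request);
            } catch (RejectedExecutionException e) {
                rejected++; // all threads are busy and the buffer is full
            }
        }
        release.countDown(); // let the accepted requests finish
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + submitAndCountRejections(6));
    }
}
```

With six requests, three occupy the threads, two wait in the buffer and the sixth is rejected, which matches state 2.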
Most services exposed via HTTP spend most of the request processing time waiting for other things to complete.
This is known as IO-bound work (the opposite of CPU-bound work).
For example, while processing a request, a financial service may internally send an HTTP request to another service in order to obtain the exchange rate between two currencies and then wait for a response.
Historically, this waiting BLOCKED the Thread and prevented it from performing any other work (like processing a different HTTP request) until the wait was over and the Thread finished processing the HTTP request.
This programming model, known as synchronous programming, is easy to understand but suffers from this major drawback.
To mitigate this problem, Java provides support for the asynchronous programming model. Asynchronous programming allows a task to be executed without blocking the calling thread and provides a combination of the following:
- A mechanism to check the asynchronous execution status and obtain its result when complete. Java 5 introduced the Future interface to support this.
- A mechanism to provide a callback (i.e. a method) that will be invoked with the result of the asynchronous execution. Java 8 introduced the CompletableFuture class to support this.
While this solves the Thread blocking problem, it comes at the expense of a more complex programming model that's harder to read, write, test, debug and reason about.
It's worth noting that asynchronous tasks are not exempt from the Thread exhaustion problem as they're also executed by Thread instances.
For example, CompletableFuture async methods run by default in ForkJoinPool.commonPool() (a special kind of thread pool).
This pool also has an upper bound, so it's critical that any waiting is performed in a non-blocking manner.
For example, this means using CompletableFuture.delayedExecutor() instead of Thread.sleep(), Java 11's HttpClient instead of HttpURLConnection, etc.
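A non-blocking delay could look roughly like the sketch below. The class NonBlockingDelaySketch and the delayed method are hypothetical names; CompletableFuture.delayedExecutor() itself is available since Java 9.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

public class NonBlockingDelaySketch {

    /**
     * Returns a future that completes with the given value after roughly the given delay.
     * No pool thread sits in Thread.sleep() while the delay elapses.
     */
    static CompletableFuture<String> delayed(String value, long millis) {
        // The returned Executor hands the task to the default executor once the delay has passed.
        Executor afterDelay = CompletableFuture.delayedExecutor(millis, TimeUnit.MILLISECONDS);
        return CompletableFuture.supplyAsync(() -> value, afterDelay);
    }

    public static void main(String[] args) {
        delayed("done", 200).thenAccept(System.out::println).join();
    }
}
```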
Java 21 introduced the concept of Virtual Threads which provide the best of both worlds: the simplicity of the synchronous programming model with the performance benefits of the asynchronous programming model. Starting with Java 21 there are now two kinds of threads:
- Platform Threads: These are the pre-Java 21 "classic" Thread instances that have a 1:1 mapping with OS threads.
- Virtual Threads: These are lightweight Thread instances that only exist in the JVM (Java Virtual Machine).
Since Virtual Threads are also instances of the Thread class, they can be distinguished by invoking Thread.isVirtual().
Virtual Threads can be created in various ways including: Thread.startVirtualThread(), Thread.ofVirtual(), and Executors.newVirtualThreadPerTaskExecutor().
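The three creation styles can be sketched side by side as follows. This is an illustrative snippet that requires Java 21 or later; the class CreateVirtualThreadsSketch, the method runAllStyles and the thread name "sketch-vt" are made up for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class CreateVirtualThreadsSketch {

    /** Runs one task per creation style and returns how many ran on a virtual thread. */
    static int runAllStyles() {
        AtomicInteger virtualCount = new AtomicInteger();
        Runnable task = () -> {
            if (Thread.currentThread().isVirtual()) {
                virtualCount.incrementAndGet();
            }
        };
        try {
            // 1. Create and start a virtual thread in a single call.
            Thread first = Thread.startVirtualThread(task);
            // 2. Builder API, which also allows naming the thread.
            Thread second = Thread.ofVirtual().name("sketch-vt").start(task);
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // 3. Executor that starts a new virtual thread for every submitted task.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            executor.execute(task);
        } // close() waits for the submitted task to finish
        return virtualCount.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks run on virtual threads: " + runAllStyles());
    }
}
```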
Virtual Threads cannot execute on their own. Instead, they must be "mounted" by a Platform Thread.
The Platform Thread that executes a mounted Virtual Thread becomes its "Carrier Thread".
Whenever a Virtual Thread invokes a blocking method like Thread.sleep(), the Virtual Thread becomes BLOCKED (used loosely here for any waiting state; Thread.getState() reports TIMED_WAITING during a sleep) and is unmounted from its Carrier Thread.
As a result, the Platform Thread is "released" and becomes available for mounting any other Virtual Thread in the RUNNABLE state which has not been mounted by another Platform Thread.
Once the BLOCKED Virtual Thread is done waiting and becomes RUNNABLE again, it can be mounted by the same or a different Platform Thread to resume execution.
However, for all intents and purposes Virtual Threads are still Thread instances, which means that things like ThreadLocal will work as expected even if the underlying Platform Thread changes.
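A small sketch of the ThreadLocal behavior, assuming Java 21 or later: the value set before a blocking call is still there afterwards, regardless of which Platform Thread resumes the Virtual Thread. The class ThreadLocalSketch, the REQUEST_ID variable and the roundTrip method are hypothetical.

```java
public class ThreadLocalSketch {

    private static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    /** Sets a ThreadLocal inside a virtual thread, blocks, then reads the value back. */
    static String roundTrip(String id) {
        String[] seen = new String[1];
        Thread virtualThread = Thread.ofVirtual().start(() -> {
            REQUEST_ID.set(id);
            try {
                Thread.sleep(20); // the virtual thread may be unmounted while it waits
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // May now be running on a different carrier thread; the value is unaffected.
            seen[0] = REQUEST_ID.get();
        });
        try {
            virtualThread.join(); // join() also makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("req-42"));
    }
}
```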
Virtual Threads are executed by an implicitly created thread pool of Platform Threads akin to, but separate from, ForkJoinPool.commonPool().
Developers have little control over this pool other than the following system properties:
- jdk.virtualThreadScheduler.parallelism: The number of platform threads available for scheduling virtual threads. It defaults to the number of available processors.
- jdk.virtualThreadScheduler.maxPoolSize: The maximum number of platform threads available to the scheduler. It defaults to 256.
There is one major caveat when using Virtual Threads known as thread pinning. When a Virtual Thread performs one of the following actions:
- Invoke a native method or a foreign function.
- Invoke a blocking method while inside a synchronized block.
Then the Virtual Thread is "pinned" to its Carrier Thread: it stays mounted while BLOCKED, which prevents the Carrier Thread from mounting any other Virtual Thread.
In other words, the underlying Platform Thread also becomes BLOCKED.
Therefore, in order to safeguard the throughput of Virtual Threads, it's critical that these actions are avoided in Virtual Threads.
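One common way to avoid the synchronized case is to guard the blocking call with a java.util.concurrent.locks.ReentrantLock instead. The sketch below illustrates that idea and is not taken from this repository's examples; the class AvoidPinningSketch and its members are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class AvoidPinningSketch {

    private static final ReentrantLock LOCK = new ReentrantLock();
    private static int counter; // guarded by LOCK

    /** Blocks while holding LOCK; a waiting virtual thread can still be unmounted. */
    static void guardedWork() {
        LOCK.lock(); // used instead of a synchronized block
        try {
            Thread.sleep(5); // blocking call inside the critical section
            counter++;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            LOCK.unlock();
        }
    }

    /** Runs the given number of virtual threads and returns the final counter value. */
    static int run(int tasks) {
        LOCK.lock();
        try {
            counter = 0;
        } finally {
            LOCK.unlock();
        }
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(AvoidPinningSketch::guardedWork);
            }
        } // close() waits for all tasks to finish
        LOCK.lock();
        try {
            return counter;
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("completed: " + run(10));
    }
}
```

The lock still serializes the critical section, so this does not add parallelism; it only keeps Carrier Threads free for other Virtual Threads while one of them waits.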
This repository contains the following examples.
- VirtualThreadBasics.java: Contains the basics on creating and starting Virtual Threads.
- VirtualThreadsVsOtherApproaches.java: Compares Virtual Threads with other concurrent programming approaches.
- VirtualThreadsAreCheap.java: Showcases how creating Virtual Threads is a cheap operation.
- VirtualThreadPinning.java: Demonstrates how thread pinning negatively affects throughput.
- VirtualThreadLocal.java: Displays how ThreadLocal works in Virtual Threads even when the Carrier Thread changes.