Virtual Threads in Java 21

Before Java 21, every thread in the JVM mapped directly to an operating system thread. That put a hard ceiling on the number of concurrent tasks. Virtual threads remove that limit — and let you write plain blocking code where a reactive style used to be required.

The problem: platform threads are expensive

A platform thread (Thread in the classic sense) is a wrapper around an OS thread. Each such thread consumes:

roughly 1–2 MB of stack by default,
a system handle from the OS,
time spent on context switching.

A typical server can hold a few thousand platform threads — and that is already the limit. If every request is waiting for a database response, all threads block for nothing: the CPU sits idle, yet there is nobody to accept new tasks.

Historically this was solved in two ways:

Thread pools (ExecutorService, ForkJoinPool): threads are reused, but the limit remains.
Reactive style (CompletableFuture, Project Reactor, RxJava): the task is split into callbacks so that no thread blocks while waiting. It works, but the code becomes hard to read and debug.

Virtual threads: thread-per-task is cheap again

A virtual thread is a thread managed by the JVM, not by the OS. You can create hundreds of thousands of them: a virtual thread does not reserve megabytes of stack up front and is not permanently bound to an OS thread.

In short: a virtual thread is a lightweight task that the JVM schedules on top of a small pool of carrier threads, which are ordinary platform threads (by default, one per available processor core).

When a virtual thread enters a blocking call (reading from the network, a database query, Thread.sleep), the JVM removes it from the carrier thread and frees the carrier for another virtual thread. When the block is released, the virtual thread is put back into the queue.

The result: thousands of concurrent blocking operations on a handful of real OS threads.
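A minimal sketch of that effect, assuming Java 21 or later. The class and method names are hypothetical, Thread.sleep stands in for real blocking I/O, and the executor used here is the one the article introduces in the section on creating virtual threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class BlockingFanOut {
    /** Runs taskCount blocking tasks, one virtual thread each; returns how many finished. */
    static int runSleepingTasks(int taskCount, long sleepMillis) {
        AtomicInteger finished = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.execute(() -> {
                    try {
                        Thread.sleep(sleepMillis); // blocking call: the virtual thread unmounts here
                        finished.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } // close() waits for every submitted task
        return finished.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int finished = runSleepingTasks(10_000, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(finished + " tasks finished in " + elapsedMs + " ms");
    }
}
```

If the tasks ran one after another, 10,000 sleeps of 100 ms would take more than a quarter of an hour. Because the waits overlap, the elapsed time printed by main should be far smaller; the exact figure depends on the machine.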

How it works inside: continuation and the scheduler

Under the hood, a virtual thread consists of two parts:

a continuation — a captured execution state (the call stack) that can be paused and resumed;
a scheduler, which decides on which carrier thread to resume the continuation.

When a virtual thread reaches a blocking call instrumented by the JDK (network I/O, Thread.sleep, BlockingQueue.take, and so on), an unmount happens: the JVM folds up the virtual thread's stack and moves it into the heap as small fragment objects (stack chunks), and the carrier thread is freed. When the operation is ready to continue, the scheduler performs a mount — it copies the stack back onto some free carrier thread (not necessarily the same one as before) and resumes execution.
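One way to watch mounting from the outside is to print the virtual thread before and after a blocking call. The sketch below assumes Java 21 or later and relies on an implementation detail: in current JDK builds, Thread.toString() on a mounted virtual thread includes the carrier's name after an '@'. That format is not a documented contract, and the class name is hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CarrierPeek {
    /** Returns Thread.toString() of one virtual thread, sampled before and after a sleep. */
    static List<String> sampleAroundSleep(long sleepMillis) {
        List<String> samples = new CopyOnWriteArrayList<>();
        Thread vt = Thread.ofVirtual().name("peek").unstarted(() -> {
            samples.add(Thread.currentThread().toString()); // mounted on some carrier
            try {
                Thread.sleep(sleepMillis);                   // unmount, later mount again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            samples.add(Thread.currentThread().toString()); // possibly a different carrier
        });
        vt.start();
        try {
            vt.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.copyOf(samples);
    }

    public static void main(String[] args) {
        sampleAroundSleep(50).forEach(System.out::println);
    }
}
```

With a single virtual thread the two samples will often name the same carrier. Under load, with many virtual threads competing, the carrier name after the sleep is more likely to differ.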

That is exactly why a virtual thread's stack is cheap: it does not reserve 1–2 MB in OS memory up front, but lives in the heap and grows in portions as call depth increases. The initial cost is hundreds of bytes, not megabytes.

The default scheduler is a special ForkJoinPool in FIFO mode. The number of carrier threads (the degree of parallelism) defaults to the number of available cores. It can be tuned with system properties at JVM startup:

-Djdk.virtualThreadScheduler.parallelism=8   # how many carriers run in parallel
-Djdk.virtualThreadScheduler.maxPoolSize=256 # carrier ceiling (to compensate on pinning/blocking)
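To see which parallelism a given JVM is likely to use, the documented default can be mirrored in a few lines. This is only a sketch: it reads the same system property and falls back to the processor count, but it does not query the scheduler itself, and the class name is hypothetical.

```java
class SchedulerSettings {
    /** The parallelism property if set, otherwise the number of available processors. */
    static int expectedParallelism() {
        return Integer.getInteger(
                "jdk.virtualThreadScheduler.parallelism",
                Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        System.out.println("expected carrier parallelism: " + expectedParallelism());
    }
}
```

Running it with and without -Djdk.virtualThreadScheduler.parallelism=8 shows the value the property would override.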

An important consequence: scheduling is cooperative. A virtual thread yields the carrier only at blocking points that the JDK knows about. More on this below, in the section on CPU-intensive code.

How to create a virtual thread

The simplest way is the factory method Thread.ofVirtual():

Thread vt = Thread.ofVirtual()
    .name("worker-1")
    .start(() -> {
        // plain blocking code — here that is fine
        String result = fetchFromDatabase();
        System.out.println(result);
    });
vt.join();
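Besides the builder call above, the Thread API in Java 21 offers a few other entry points. The sketch below shows three of them side by side; the class name, thread names, and the empty task are placeholders.

```java
import java.util.List;
import java.util.concurrent.ThreadFactory;

class VirtualThreadCreation {
    /** Starts three virtual threads in three different ways and returns them after they finish. */
    static List<Thread> startThreeWays() {
        Runnable task = () -> { }; // placeholder for real work

        // 1. Create and start in one call (the thread has no name).
        Thread a = Thread.startVirtualThread(task);

        // 2. Build now, start later.
        Thread b = Thread.ofVirtual().name("worker-2").unstarted(task);
        b.start();

        // 3. A ThreadFactory for APIs that accept one; names count up from worker-3.
        ThreadFactory factory = Thread.ofVirtual().name("worker-", 3).factory();
        Thread c = factory.newThread(task);
        c.start();

        for (Thread t : List.of(a, b, c)) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return List.of(a, b, c);
    }
}
```

Thread.isVirtual() returns true for all three, which is a convenient check when a library hands you a thread of unknown origin.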

For a pool of tasks — Executors.newVirtualThreadPerTaskExecutor(). It creates a new virtual thread for each task:

try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 0; i < 10_000; i++) {
        executor.submit(() -> handleRequest());
    }
} // here the executor is closed and waits for all tasks to finish

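Launching 10,000 tasks this way is fine. With one platform thread per task, the same load would reserve 10–20 GB of stack address space at 1–2 MB per thread and, depending on OS and container limits, could fail with OutOfMemoryError (unable to create native thread).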
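When the tasks return values, submit() hands back a Future per task. The following sketch collects the results in submission order; lookup() is a hypothetical stand-in for a blocking call such as a database query, and the class name is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CollectResults {
    // Hypothetical blocking call: sleeps briefly, then returns a value derived from the id.
    static int lookup(int id) throws InterruptedException {
        Thread.sleep(10);
        return id * 2;
    }

    /** Runs lookup(0..count-1), one virtual thread per call, and returns results in order. */
    static List<Integer> lookupAll(int count) {
        List<Integer> results = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int id = i;
                futures.add(executor.submit(() -> lookup(id))); // Callable: checked exceptions allowed
            }
            for (Future<Integer> future : futures) {
                results.add(future.get()); // blocks the caller until that task is done
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
        return results;
    }
}
```

All lookups run concurrently, so the total time is close to the duration of a single call rather than the sum of all of them.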

Pinning: when a virtual thread gets stuck to the carrier

There is a pitfall — pinning. This is a situation where a virtual thread cannot unmount from the carrier thread during a block: the carrier stays busy for the entire wait, as if it were an ordinary platform thread. If there is a lot of such code, the number of actually working carriers hits the ceiling and the benefit of virtual threads is lost.

Versions matter — this area has changed noticeably:

Java 21–23. Pinning occurs when a virtual thread blocks inside a synchronized block or method, or while a native method or foreign function is on its stack. The reason for synchronized: the JVM tracked monitor ownership at the level of the carrier (platform) thread, so the virtual thread could not be detached while it held a monitor. The recommended workaround is to replace synchronized with ReentrantLock in sections that contain I/O.

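// Java 21: synchronized + blocking call = pinning
// (reader is assumed to be a BufferedReader over the socket's input stream)
synchronized (lock) {
    String data = reader.readLine(); // blocking read: the carrier is not freed
}

// Java 21: ReentrantLock does not cause pinning
// The lock must be shared by every thread that uses the resource (typically a field)
ReentrantLock lock = new ReentrantLock();
lock.lock();
try {
    String data = reader.readLine(); // the virtual thread unmounts correctly
} finally {
    lock.unlock();
}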

Java 24+ (JEP 491). The JVM learned to track monitor ownership at the level of the virtual thread itself, so synchronized no longer causes pinning: a virtual thread unmounts normally even when it blocks inside a synchronized block. Rewriting code to ReentrantLock for this reason is no longer needed. Pinning remains only in a few cases, chiefly when the virtual thread blocks with a native frame on its stack (a native method or foreign function) or while it is executing a class initializer.

In short: on Java 21, avoid blocking calls under synchronized; on Java 24+ this problem is largely solved by the JVM itself.
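For code that must run on Java 21–23, the ReentrantLock pattern can be sketched as a small self-contained class. The names are hypothetical, and Thread.sleep stands in for blocking I/O performed while the lock is held.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

class GuardedCounter {
    private final ReentrantLock lock = new ReentrantLock(); // one lock shared by all callers
    private int value;

    /** Performs a simulated blocking call while holding the lock, then increments. */
    void slowIncrement() {
        lock.lock();
        try {
            Thread.sleep(1); // stand-in for blocking I/O; the virtual thread can unmount here
            value++;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
    }

    int value() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    /** Runs slowIncrement() from `tasks` virtual threads and returns the final count. */
    static int runWith(int tasks) {
        GuardedCounter counter = new GuardedCounter();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(counter::slowIncrement);
            }
        }
        return counter.value();
    }
}
```

The lock still serializes the critical section, so throughput inside it does not improve. What changes is that threads waiting for the lock, or sleeping while holding it, no longer tie up a carrier.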

Diagnosing pinning

The way to diagnose it also depends on the version:

Java 21–23: the system property -Djdk.tracePinnedThreads=full (or short) prints a stack trace when a virtual thread blocks while pinned to its carrier. Note: JDK 24 removed this property as part of JEP 491, so it only helps on older versions.
The modern way is JFR. The JVM emits a jdk.VirtualThreadPinned event when a virtual thread stays pinned longer than a threshold (20 ms by default). The event is captured by Flight Recorder and visible in the .jfr recording; this is the recommended path and does not depend on the removed flag. The events jdk.VirtualThreadStart, jdk.VirtualThreadEnd, and jdk.VirtualThreadSubmitFailed are useful alongside it (the last one signals that the scheduler failed to accept a task).
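The JFR event can also be captured programmatically through the jdk.jfr API. The sketch below is an illustration under assumptions: the class name and the 1 ms threshold are arbitrary, and the synchronized block with a sleep is there only to provoke pinning. On Java 21–23 it is expected to record at least one event; on Java 24 and later it will normally record none, because synchronized no longer pins.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordingFile;

class PinnedEventProbe {
    private static final Object MONITOR = new Object();

    /** Records while a virtual thread sleeps inside synchronized; returns the pinned-event count. */
    static long countPinnedEvents() {
        try {
            Path file = Files.createTempFile("pinning", ".jfr");
            try (Recording recording = new Recording()) {
                recording.enable("jdk.VirtualThreadPinned").withThreshold(Duration.ofMillis(1));
                recording.start();

                Thread vt = Thread.ofVirtual().start(() -> {
                    synchronized (MONITOR) {
                        try {
                            Thread.sleep(50); // blocks while holding a monitor
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
                vt.join();

                recording.stop();
                recording.dump(file); // write the recorded events to the temp file
            }
            long count = RecordingFile.readAllEvents(file).stream()
                    .filter(e -> e.getEventType().getName().equals("jdk.VirtualThreadPinned"))
                    .count();
            Files.deleteIfExists(file);
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while recording", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("pinned events: " + countPinnedEvents());
    }
}
```

In production the same event is usually collected by starting the JVM with a flight recording and inspecting the file afterwards; the in-process variant is mainly handy in tests.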

CPU-intensive code and cooperative scheduling

Virtual thread scheduling is cooperative, with no timer-based preemption. A virtual thread gives up the carrier only at blocking points the JDK knows about. If a task spins in a long computation loop without a single blocking call, it will not yield the carrier on its own — and will hold it until it finishes.

// Anti-example: a pure CPU loop in a virtual thread
Thread.ofVirtual().start(() -> {
    long acc = 0;
    for (long i = 0; i < 100_000_000_000L; i++) acc += i; // not a single unmount point
});

A few such tasks will occupy all carriers, and the remaining virtual threads will wait, even those ready to do useful work. Virtual threads are designed for code that waits a lot, not for code that computes a lot. For heavy computations use platform threads or a ForkJoinPool. If you really need to let other virtual threads run from inside a long loop, Thread.yield() can help, but that only treats the symptom: a CPU-bound loop is a poor fit for virtual threads in the first place.
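One possible arrangement, sketched under assumptions, is to keep request handling on virtual threads and hand the CPU-bound part to a fixed pool of platform threads sized to the core count. The class and method names are hypothetical, and a real application would create the pool once and share it rather than build it per call.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CpuOffload {
    // CPU-bound work with no blocking points: the sum of 1..n.
    static long sumUpTo(long n) {
        long acc = 0;
        for (long i = 1; i <= n; i++) {
            acc += i;
        }
        return acc;
    }

    /** A virtual thread delegates the loop to a platform-thread pool and waits for the result. */
    static long computeOnPlatformPool(long n) {
        int cores = Runtime.getRuntime().availableProcessors();
        try (ExecutorService cpuPool = Executors.newFixedThreadPool(cores);
             ExecutorService virtual = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<Long> result = virtual.submit(() -> {
                // Only the wait happens here; Future.get() lets the virtual thread unmount.
                return cpuPool.submit(() -> sumUpTo(n)).get();
            });
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("computation failed", e.getCause());
        }
    }
}
```

The fixed pool caps the number of computations running at once at the core count, while the carriers stay free for virtual threads that are doing I/O.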

Do not create a pool of virtual threads

Virtual threads should not be pooled and reused: pooling defeats the model. They are cheap enough to create one per task, which is exactly what Executors.newVirtualThreadPerTaskExecutor() does. A fixed pool of N virtual threads is an anti-pattern: it artificially reintroduces the very limit that virtual threads were invented to remove.

When you do need to limit concurrency — for example, to avoid overloading the database connection pool or an external service — limit access to the specific resource with a Semaphore, not the size of the thread pool:

Semaphore dbPermits = new Semaphore(20); // no more than 20 concurrent DB requests

try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (var task : tasks) {
        executor.submit(() -> {
            dbPermits.acquire();
            try {
                return queryDatabase(task); // guarded by the semaphore
            } finally {
                dbPermits.release();
            }
        });
    }
}

This way there can be a hundred thousand tasks, yet at most 20 hit the database at once — the rest of the virtual threads wait cheaply on the semaphore without occupying carriers.
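To check that the semaphore really caps concurrency, the pattern can be instrumented with a counter. This is a self-contained sketch with hypothetical names; Thread.sleep stands in for the guarded database call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreLimit {
    /** Runs `tasks` virtual threads through a semaphore; returns the peak number inside it. */
    static int maxConcurrent(int tasks, int permits) {
        Semaphore gate = new Semaphore(permits);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    try {
                        gate.acquire(); // waiting here does not occupy a carrier
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return; // no permit was taken, so nothing to release
                    }
                    try {
                        int now = inFlight.incrementAndGet();
                        maxSeen.accumulateAndGet(now, Math::max);
                        Thread.sleep(5); // stand-in for the guarded call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        gate.release();
                    }
                });
            }
        }
        return maxSeen.get();
    }
}
```

Calling maxConcurrent(1000, 20) starts a thousand virtual threads, yet the returned peak never exceeds 20, because a task increments the counter only after it has acquired a permit.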

ThreadLocal and ScopedValue

Virtual threads support ThreadLocal, but at scale it is a trap. An application used to run a few hundred platform threads, so per-thread values in ThreadLocal cost little. Virtual threads can number in the hundreds of thousands; if each holds a hefty value in ThreadLocal, memory use balloons. In addition, the familiar trick of caching an expensive object in a ThreadLocal so that pooled threads can reuse it loses its point under the thread-per-task model, because threads are never reused.

The replacement for passing context (user id, trace id) is ScopedValue: an immutable value bound to a bounded execution scope, with no risk of leaks once the scope exits. Subtasks forked through StructuredTaskScope inherit the binding automatically. ScopedValue is finalized in Java 25 (in Java 21–24 it is a preview feature).

private static final ScopedValue<String> USER_ID = ScopedValue.newInstance();

ScopedValue.where(USER_ID, "user-42").run(() -> {
    // inside this scope USER_ID.get() == "user-42";
    // subtasks forked via StructuredTaskScope see it too; outside the scope it is unbound
    handleRequest();
});

Structured concurrency

When one task spawns several subtasks (query two services in parallel and combine the result), it is convenient to manage their lifecycle as a single unit: if one fails, cancel the rest; wait for all of them in one place. StructuredTaskScope exists for this.

// Java 21–24 preview API (compile and run with --enable-preview);
// Java 25 replaces these constructors with StructuredTaskScope.open(...)
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    var user  = scope.fork(() -> fetchUser(id));      // each fork is a separate virtual thread
    var order = scope.fork(() -> fetchOrder(id));
    scope.join();            // wait for both, or until the first failure cancels the other
    scope.throwIfFailed();   // rethrow the first failure, if any
    return combine(user.get(), order.get());
}

This eliminates subtask "leaks" (a hung subtask with nobody to cancel it) and makes error handling predictable. At the time of writing, StructuredTaskScope is still a preview feature — the API changed between versions, so check the signatures for your JDK version.

Observability

A plain jstack is of little use here: it does not list virtual threads. A full dump that includes virtual threads is taken with jcmd:

jcmd <pid> Thread.dump_to_file -format=json threads.json

For continuous monitoring, use the JFR events jdk.VirtualThreadStart / jdk.VirtualThreadEnd (lifecycle), jdk.VirtualThreadPinned (pinning longer than a threshold), and jdk.VirtualThreadSubmitFailed (the scheduler did not accept a task).
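The same kind of dump can be requested from inside the process through HotSpotDiagnosticMXBean.dumpThreads, available since Java 21. The sketch below makes several assumptions: the class name and file location are arbitrary, and it starts one parked virtual thread named parked-demo, which the dump is expected to list. The target path must be absolute and must not exist yet.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CountDownLatch;

class ThreadDumpToFile {
    /** Writes a JSON thread dump while one virtual thread is parked; returns the file content. */
    static String dumpWithParkedVirtualThread() {
        CountDownLatch release = new CountDownLatch(1);
        Thread.ofVirtual().name("parked-demo").start(() -> {
            try {
                release.await(); // stays parked until the dump has been written
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            Path dir = Files.createTempDirectory("vt-dump");
            Path out = dir.resolve("threads.json"); // fresh file inside a new directory
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpThreads(out.toAbsolutePath().toString(), ThreadDumpFormat.JSON);
            return Files.readString(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            release.countDown(); // let the parked thread finish
        }
    }

    public static void main(String[] args) {
        System.out.println(dumpWithParkedVirtualThread());
    }
}
```

This can be wired to an admin endpoint or a signal handler when attaching jcmd to the process is not practical.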

When virtual threads, when platform threads

Virtual threads are a good fit for:

HTTP servers with thousands of concurrent requests,
services that work heavily with a database or external APIs,
any IO-intensive code where the thread spends most of its time waiting.

Platform threads remain preferable for:

long CPU-intensive computations (for example, image processing, encryption),
code with strict requirements on the OS scheduler.

Virtual threads do not speed up compute-bound tasks — they remove the limit on the number of concurrent IO waits.

Compatibility with existing code

Virtual threads are instances of the same java.lang.Thread class and expose the same API. Most standard libraries, JDBC drivers, and frameworks work with them unchanged.

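Spring Boot 3.2+ has built-in virtual thread support: on Java 21 or later, setting spring.threads.virtual.enabled=true in the application properties is enough. To wire Tomcat explicitly instead (the approach used before 3.2), register a customizer bean: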

@Bean
public TomcatProtocolHandlerCustomizer<?> virtualThreadsCustomizer() {
    return handler -> handler.setExecutor(
        Executors.newVirtualThreadPerTaskExecutor()
    );
}

After that, each incoming HTTP request is handled in its own virtual thread — no reactive style, no callbacks.

In short

A platform thread = an OS thread: expensive, limited to thousands.
A virtual thread is managed by the JVM and is cheap: you can create hundreds of thousands.
Created via Thread.ofVirtual() or Executors.newVirtualThreadPerTaskExecutor().
Inside — a continuation (the stack lives in the heap, grows in portions) + a scheduler based on ForkJoinPool (carriers matching the core count, tunable via system properties).
On a block an unmount happens: the stack moves to the heap, the carrier is free; on unblock — a mount onto any free carrier.
Pinning: on Java 21–23 it is caused by synchronized + a blocking call (workaround — ReentrantLock); on Java 24+ (JEP 491) synchronized no longer pins, only native frames remain. Diagnosis — the JFR event jdk.VirtualThreadPinned.
Scheduling is cooperative: a pure CPU loop does not yield the carrier. The benefit is for IO-intensive code; CPU tasks gain nothing.
Do not pool virtual threads (one per task); limit resource access with a Semaphore, not the pool size.
Context is better passed via ScopedValue (final in Java 25), not ThreadLocal; for managed fan-out — StructuredTaskScope (still preview).