Concurrency & Multithreading

Production outages from concurrency rarely look exotic—they look like a counter that is almost right, a pool that accepted work until the JVM ran out of memory, or a thread that never noticed shutdown. This chapter goes deep on threads, the Java Memory Model, java.util.concurrent, executors, CompletableFuture, and virtual threads—the material senior interviews and on-call expect.

mid senior

Thread fundamentals

A thread is a lightweight unit of execution within a process—shared heap, separate stack. The OS schedules many threads onto fewer CPU cores; Java maps platform threads 1:1 to OS threads (until virtual threads).

Thread lifecycle

StateTypical cause
NEWConstructed, not started
RUNNABLEEligible to run (includes actually running or ready)
BLOCKEDWaiting to enter synchronized monitor
WAITINGwait(), join(), LockSupport.park()
TIMED_WAITINGsleep(ms), timed wait, join(timeout)
TERMINATEDCompleted or died with exception

Creating threads

Subclassing Thread overrides run() on the Thread object itself—works but couples task to thread type. Implementing Runnable (or lambda) and passing to Thread or an executor separates concerns and is the standard style. Callable adds a return value and checked exceptions—always submit to an ExecutorService to obtain a Future.

ApproachWhen
Subclass ThreadRare—couples task to Thread type; prefer Runnable
Runnablevoid task—no result, no checked exception in signature
Callable<V>Returns value, throws checked exceptions—use with ExecutorService
Executor / virtual thread factoryProduction default—pooling and lifecycle managed
Java
Thread t = Thread.ofPlatform().name("worker-1").start(() -> {
    System.out.println("running on " + Thread.currentThread().getName());
});

Callable<Integer> task = () -> 42;
ExecutorService pool = Executors.newSingleThreadExecutor();
Future<Integer> future = pool.submit(task);
Integer result = future.get();  // blocks until done
pool.shutdown();

Key thread methods

MethodPurpose
start()Begin new thread—calls run() on new stack; never call run() directly for concurrency
join() / join(ms)Wait for thread termination
sleep(ms)TIMED_WAITING—does not release locks held
interrupt()Sets interrupt flag—cooperative cancellation
isInterrupted()Check flag without clearing
Thread.interrupted()Static—check and clear current thread flag
Java
void pollUntilDone(Runnable work) throws InterruptedException {
    Thread worker = Thread.startVirtualThread(work);
    while (worker.isAlive()) {
        try {
            worker.join(500);
        } catch (InterruptedException e) {
            worker.interrupt();  // propagate shutdown
            throw e;
        }
    }
}

start() vs run()

Calling start() asks the JVM to schedule a new call stack that enters run(). Calling run() directly on a Thread object executes on the current thread—no parallelism. This mistake appears in tutorials and still shows up in code review.

Cooperative interruption

interrupt() sets a boolean flag on the target thread. It does not forcibly stop execution (deprecated stop() was unsafe). Blocking methods like sleep, wait, join, and BlockingQueue.take respond by throwing InterruptedException and clearing the interrupt flag—catch blocks should usually call Thread.currentThread().interrupt() again to preserve the signal for upstream shutdown logic.

Daemon threads

Daemon threads do not keep the JVM alive—when all user (non-daemon) threads end, the VM exits. Use for background housekeeping (metrics flush), not for work that must complete on shutdown unless you register a Runtime.addShutdownHook that flushes buffers and closes pools. Main business work should run on non-daemon threads (default).

Java
Thread daemon = Thread.ofPlatform().daemon().start(() -> {
    while (!Thread.currentThread().isInterrupted()) {
        // background tick
    }
});

Synchronization

Correctness requires mutual exclusion and visibility—two threads can corrupt data without ever holding the same lock if they only read/write a shared field without happens-before edges.

Race condition — broken counter

Java
class BrokenCounter {
    private int count = 0;

    void increment() {
        count++;  // read-modify-write — NOT atomic
    }
}

// Two threads each increment 1_000_000 times — result often < 2_000_000

Data race (JMM): two accesses to same variable, at least one write, not ordered by happens-before—undefined behavior for non-volatile non-atomic fields.

synchronized — intrinsic monitor lock

Every object has an associated monitor. One thread holds the lock; others block in BLOCKED.

Java
class SafeCounter {
    private int count = 0;

    synchronized void increment() {  // lock = this
        count++;
    }

    int get() {
        synchronized (this) {        // block-level — finer control
            return count;
        }
    }
}

// Static synchronized — lock = Class object
static synchronized void global() { }

Method-level locks the whole instance—can reduce parallelism. Block-level lets separate locks for independent fields (dedicated final Object lock = new Object() guards are clearer than locking on this when external code might also synchronize on your instance).

Reentrancy: the same thread may re-enter a synchronized method on the same monitor (recursive calls or synchronized methods calling each other on the same object)—the JVM tracks entry count and only releases when the outermost scope exits.

volatile — visibility, not atomicity

A volatile read or write participates in the JMM: a write to a volatile field happens-before any subsequent read of that field by another thread. Compilers and CPUs cannot cache volatile reads in registers across potential concurrent writes in ways that break visibility. It does not make compound operations atomic—count++ is still read-modify-write with a race between threads.

Valid uses: one-shot shutdown flags, double-checked locking when combined with correct synchronization (prefer holder idiom or static init), status fields where only visibility matters. For counters, use atomics or locks.

Java
class Status {
    private volatile boolean shutdown = false;

    void shutdown() { shutdown = true; }  // visible to poller immediately

    void loop() {
        while (!shutdown) { /* work */ }   // OK flag pattern
    }
}

volatile int counter;
void bad() { counter++; }  // still racy — use AtomicInteger or synchronized

Java Memory Model & liveness failures

The Java Memory Model (JMM) defines when writes by one thread are visible to another. Without happens-before edges, the CPU and compiler may reorder or cache reads/writes in ways that break your mental sequential model—even if every read “looks” like it sees latest memory.

Happens-before rules (selected)

If action A happens-before B, then A’s effects are visible to B and ordered before B for readers that observe the relationship. Key sources:

  • Unlock on monitor happens-before subsequent lock on same monitor
  • volatile write happens-before subsequent read of that volatile
  • Thread.start() happens-before any action in started thread
  • Actions in thread happens-before join() returning
  • java.util.concurrent utilities document their happens-before (e.g. ConcurrentHashMap, CountDownLatch)

Why “just use a lock” is not enough alone

Locks provide both mutual exclusion and release/acquire visibility. A bug pattern is: thread A writes under lock, releases; thread B reads the same field without acquiring the lock—B may see a stale value. Every read that must see published writes needs a matching synchronization edge (same lock, volatile read after volatile write, or concurrent utility with documented ordering).

Deadlock — Coffman conditions (all four required)

Deadlock is a state where a set of threads each wait for a resource held by another in the set—no thread can proceed. All four Coffman conditions must hold simultaneously; breaking any one prevents deadlock.

ConditionMeaning
Mutual exclusionResource held exclusively
Hold and waitHold one lock while waiting for another
No preemptionCannot force release
Circular waitCycle in wait graph
Java
Object lockA = new Object();
Object lockB = new Object();

Thread t1 = Thread.startVirtualThread(() -> {
    synchronized (lockA) {
        synchronized (lockB) { work(); }
    }
});
Thread t2 = Thread.startVirtualThread(() -> {
    synchronized (lockB) {  // opposite order → circular wait possible
        synchronized (lockA) { work(); }
    }
});

Prevention & detection strategies

  • Lock ordering — assign global order to resources; always acquire in ascending ID order
  • Try-lock with timeout — back off, log, retry with jitter instead of waiting forever
  • Shrink critical sections — fewer nested locks, fewer resources per transaction
  • Detectionjcmd <pid> Thread.print / jstack reports Java-level deadlocks; APM thread pool metrics stuck at max

Livelock & starvation

Livelock — threads are not blocked on locks but keep reacting to each other without forward progress (two people swapping sides in a hallway forever). Fix by introducing backoff, randomness, or passing priority.

Starvation — a thread never gains the lock because others always win (unfair lock, high arrival rate). new ReentrantLock(true) enables FIFO-ish fairness; throughput drops because the JVM cannot reorder lock grants for performance.

Locks, conditions & StampedLock

synchronized is built into every object and is easy to reason about for simple critical sections. The java.util.concurrent.locks package adds explicit locks with timeouts, interruptible waiting, fairness, multiple condition queues, and read/write separation—useful when intrinsic monitors are not flexible enough.

synchronizedReentrantLock
Automatic release on exit (even on exception)Must unlock() in finally
No timeoutstryLock(timeout)
Not interruptible while waitinglockInterruptibly()
Single wait set per monitorMultiple Condition objects per lock
Non-fair by defaultOptional fair ordering

ReentrantLock

A reentrant lock: the thread that already holds it may acquire again (recursion, or nested methods on the same object). An internal hold count tracks depth; each unlock() decrements until the lock is fully released. Implementation uses CAS + park/unpark when contended—similar goals to synchronized but with a richer API surface.

  • lock() — blocks until acquired; does not respond to interrupt while waiting (use lockInterruptibly instead if you need that)
  • tryLock() — returns immediately with success/failure—good for avoiding deadlock (try both locks, release and retry if fail)
  • tryLock(timeout, unit) — bounded wait; useful for SLA-bound work (“wait at most 200ms for shard lock”)
  • lockInterruptibly() — if interrupted while waiting, throws InterruptedException and does not acquire
  • Fair locknew ReentrantLock(true) grants in roughly arrival order; reduces starvation, lowers throughput
Java
ReentrantLock lock = new ReentrantLock();

lock.lock();
try {
    // critical section
} finally {
    lock.unlock();  // always in finally
}

if (lock.tryLock(100, TimeUnit.MILLISECONDS)) {
    try { /* work */ } finally { lock.unlock(); }
}

lock.lockInterruptibly();  // responds to interrupt while waiting
💡 Pro Tip

Always pair lock() with try/finally/unlock. Never call lock() twice in one path without two unlocks—easy to break with early returns. For virtual-thread-heavy code, profile for pinning: long synchronized or native blocks on carrier; consider ReentrantLock in hot paths if pinning appears.

ReadWriteLock

Splits access into a read lock (shared, many threads) and a write lock (exclusive). ReentrantReadWriteLock is ideal when reads dominate (configuration caches, feature flags, in-memory indexes) and writes are rare. Readers block only when a writer holds the write lock; writers block all readers and other writers.

Downgrade is possible in advanced patterns (hold write lock, release to read lock) but easy to misuse—most apps stick to simple lock/unlock per method. Stale reads are still possible if you return mutable objects without copying—locking does not replace defensive copies.

Java
ReadWriteLock rw = new ReentrantReadWriteLock();
Map<String, Config> cache = new HashMap<>();

Config get(String key) {
    rw.readLock().lock();
    try {
        return cache.get(key);
    } finally {
        rw.readLock().unlock();
    }
}

StampedLock Java 8+ — optimistic reads

StampedLock is not a replacement for every ReadWriteLock—it trades API complexity for throughput on read-heavy workloads. Three modes: write lock, read lock, and optimistic read (no lock acquired).

  1. tryOptimisticRead() returns a stamp (version token)
  2. Read data without holding a lock
  3. validate(stamp) — if no write happened since step 1, read was valid
  4. If validation fails, acquire read lock (or write lock) and retry

Optimistic paths win when writes are very rare; under heavy write contention, validation fails often and performance can fall below a plain read lock. StampedLock is not reentrant and does not support Condition—design carefully.

Java
StampedLock sl = new StampedLock();
double x;

long stamp = sl.tryOptimisticRead();
x = this.x;  // read fields
if (!sl.validate(stamp)) {
    stamp = sl.readLock();
    try {
        x = this.x;
    } finally {
        sl.unlockRead(stamp);
    }
}

Condition — wait/notify replacement

Object.wait/notify ties one wait set to the intrinsic monitor. Condition (from ReentrantLock.newCondition()) gives multiple wait sets per lock—e.g. “not empty” and “not full” on the same bounded buffer. await() atomically releases the lock and parks; signal() wakes one waiter; signalAll() wakes all. Always use await in a while (predicate) { await(); } loop, not if—spurious wakeups and barging require re-checking the condition.

Java
ReentrantLock lock = new ReentrantLock();
Condition notEmpty = lock.newCondition();
Queue<Task> queue = new ArrayDeque<>();

void put(Task t) {
    lock.lock();
    try {
        queue.add(t);
        notEmpty.signal();  // or signalAll()
    } finally {
        lock.unlock();
    }
}

Task take() throws InterruptedException {
    lock.lock();
    try {
        while (queue.isEmpty()) {
            notEmpty.await();  // releases lock, waits for signal
        }
        return queue.remove();
    } finally {
        lock.unlock();
    }
}

Coordination utilities

These classes encode common “wait for N things” and “only M at a time” patterns with correct memory visibility— better than hand-rolled wait/notify on a private lock. They live in java.util.concurrent and document happens-before between their methods and your threads.

Semaphore

Maintains a set of permits. acquire() blocks until a permit is available; release() returns one. A binary semaphore (1 permit) behaves like a mutex, but semaphores are not reentrant and can be released by a thread other than the acquirer (unlike locks)—usually you still release in the same thread that acquired.

Use case: cap concurrent outbound HTTP calls to 50 so you do not overwhelm a partner API; cap DB sessions to match pool size when work is submitted from many platform threads.

CountDownLatch

One-shot counter initialized to N. Threads call countDown() to decrement; await() blocks until the count hits zero. The latch cannot be reset—create a new instance for another batch.

Use case: integration test waits for embedded Kafka + DB + app to bind ports; ETL driver waits for N worker shards to finish before merging results.

CyclicBarrier

Reusable barrier: N parties await(); when the Nth arrives, optional barrier action runs and all parties proceed. If a party times out or is interrupted, the barrier breaks and others get BrokenBarrierException.

Use case: parallel matrix multiply phases—each row worker finishes step k, barriers, then starts step k+1; game simulation tick where all agents finish “think” before “act”.

Phaser

Flexible phases: parties can register/deregister dynamically; arriveAndAwaitAdvance() moves to next phase when all registered parties arrive. Supports hierarchical phasers for tree-structured parallelism.

Use case: variable number of forked subtasks per stage in a pipeline where worker count changes between phases.

Exchanger

Two-thread rendezvous: each exchange(v) hands its object to the other. Blocks until the partner arrives.

Use case: double-buffered logging—producer fills buffer A while consumer drains B, then they swap references at the exchanger.

Java
Semaphore dbSlots = new Semaphore(20);
void query() throws InterruptedException {
    dbSlots.acquire();
    try { runQuery(); } finally { dbSlots.release(); }
}

CountDownLatch ready = new CountDownLatch(3);
// each service calls ready.countDown() when up
ready.await();  // main continues when 3 reached zero

CyclicBarrier barrier = new CyclicBarrier(4, () -> mergePartialResults());
// 4 worker threads call barrier.await() per phase

Atomic classes & CAS

When contention is low and you only need a single variable updated atomically, locks are heavier than necessary. java.util.concurrent.atomic exposes CAS-based types the JVM can implement with a few CPU instructions— foundation of lock-free counters, concurrent queue nodes, and ConcurrentHashMap bin updates.

Compare-And-Swap (CAS)

Hardware primitive: read value V; if V == expected, write newValue; else fail. Java loops retry on failure (compare-and-swap spin). No mutex, but under high contention many threads spin—CPU burn. Progress guarantees differ from lock-based “lock-free” vs “wait-free” definitions used in textbooks.

ABA problem: value changes A→B→A; CAS thinks nothing happened. Rare for integers; addressed in AtomicStampedReference with version stamps.

ClassTypical use
AtomicInteger / AtomicLongCounters, sequence IDs, stats
AtomicReference<V>Atomic swap of immutable config snapshot
AtomicIntegerArrayPer-slot atomics without boxing
LongAdder / LongAccumulatorHigh-write metrics—striped cells, sum at end
AtomicReferenceFieldUpdaterCAS on volatile field inside node—less allocation
Java
AtomicInteger counter = new AtomicInteger();
counter.incrementAndGet();
counter.compareAndSet(old, old + 1);

AtomicReference<State> ref = new AtomicReference<>(State.INIT);
ref.compareAndSet(State.INIT, State.RUNNING);

// High-contention metrics — prefer LongAdder over AtomicLong
LongAdder requests = new LongAdder();
requests.increment();
long total = requests.sum();

LongAdder maintains per-thread cells that accumulate under contention; sum() merges them—much faster than hammering one AtomicLong on a hot metrics path. Trade-off: sum() is not a linearizable snapshot under concurrent updates unless updates quiesce.

AtomicReferenceFieldUpdater — reflection-based CAS on a named volatile field inside a class—used in ConcurrentLinkedQueue-style node structures to avoid an extra AtomicReference object per node.

🔬 Under the Hood

HotSpot lowers CAS to CPU instructions (cmpxchg on x86). VarHandle (Java 9+) generalizes atomic access with fences—foundation for future APIs and off-heap work.

Executor framework

Prefer submitting tasks to an ExecutorService over raw new Thread—centralized queuing, reuse, naming, metrics, and shutdown policies. The framework separates task definition (Runnable/Callable) from execution policy (how many threads, which queue, what to do when saturated).

Runnable vs Callable vs Future

TypeReturnsExceptions
RunnablevoidUnchecked only (in signature)
Callable<V>VMay throw checked exceptions—propagate to Future.get()
Future<V>Handle to async resultget() blocks; wraps execution exception

submit(Callable) returns Future; execute(Runnable) is fire-and-forget. Cancel with future.cancel(true)—interrupts running task if started; may not stop non-interruptible blocking I/O immediately.

Graceful shutdown

shutdown() stops accepting new tasks; previously submitted work continues. awaitTermination waits for completion. shutdownNow() attempts to cancel queued tasks and interrupt workers—use on timeout during deploy. Do not let pools leak on redeploy in long-lived containers.

Java
ExecutorService pool = Executors.newFixedThreadPool(8);

Future<Report> future = pool.submit(() -> generateReport());
Report r = future.get(30, TimeUnit.SECONDS);

pool.shutdown();
if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
    pool.shutdownNow();
}

ThreadPoolExecutor internals

ThreadPoolExecutor is the workhorse implementation behind most Executors factory methods. Understanding the interaction of core size, max size, queue type, and rejection policy is how you prevent “the service accepted traffic until it died” incidents.

ParameterRole
corePoolSizeBaseline threads kept alive
maximumPoolSizeCap when queue is full (bounded queue)
keepAliveTimeIdle non-core thread TTL
workQueueLinkedBlockingQueue, ArrayBlockingQueue, SynchronousQueue
RejectedExecutionHandlerPolicy when saturated

Work queue choice

QueueBehavior
LinkedBlockingQueue (unbounded)Tasks never rejected until OOM—latency grows under load
ArrayBlockingQueue (bounded)Backpressure when full—pairs with max pool + rejection policy
SynchronousQueueZero capacity—handoff to thread or new thread; used by cached pool

RejectedExecutionHandler policies

  • AbortPolicy (default) — throw RejectedExecutionException—fail fast, caller handles
  • CallerRunsPolicy — run task on submitting thread—slows producers, natural backpressure
  • DiscardPolicy — silent drop—only for non-critical telemetry
  • DiscardOldestPolicy — drop head of queue—rare, can lose oldest work unpredictably

Executors factories — production cautions

Factory methods hide the queue and bounds. Treat them as shortcuts for prototypes, not production defaults without reading the JDK source behavior.

FactoryRisk
newFixedThreadPoolUnbounded LinkedBlockingQueue—tasks pile up, memory pressure
newCachedThreadPoolUnbounded threads + SynchronousQueue—under spike creates thousands of threads → OOM, context switch storm
newSingleThreadExecutorSerialized execution—OK for ordered jobs; unbounded queue
newScheduledThreadPoolTimers/cron-like tasks—size the pool for overlap
Java
ThreadPoolExecutor pool = new ThreadPoolExecutor(
    4, 16,
    60, TimeUnit.SECONDS,
    new ArrayBlockingQueue<>(1000),
    Thread.ofPlatform().name("api-", 0).factory(),
    new ThreadPoolExecutor.CallerRunsPolicy());  // backpressure on caller
⚠️ Pitfall

Never use Executors.newCachedThreadPool() for bursty server request handling without limits. Define explicit pool sizes, bounded queues, and rejection metrics—Spring Boot exposes task.execution pool tuning.

CompletableFuture Java 8+

Composable async pipelines without nested Future.get() blocking. A CompletableFuture<T> is both a Future and a completion stage you can chain. Default async stages use ForkJoinPool.commonPool()—fine for CPU-light transforms; dangerous for blocking JDBC/HTTP unless you pass an explicit executor.

Creating and transforming

MethodRole
supplyAsync(Supplier)Start async computation with result
runAsync(Runnable)Async void task
thenApplyMap result T → U (sync on completion thread unless *Async)
thenComposeFlat-map when next step returns another CF (avoid nested CF)
thenCombineMerge two CFs when both complete
allOf / anyOfWait for all or first completion

Error handling stages

exceptionally(ex -> fallback) runs only if the stage completed exceptionally—like catch. handle((result, ex) -> ...) sees both success and failure in one function. whenComplete is for side effects (logging) without changing the value—exceptions from the consumer can still complete the next stage exceptionally.

Unchecked exceptions in supplyAsync complete the CF exceptionally; checked exceptions cannot be thrown from Supplier without wrapping.

Java
ExecutorService ioPool = Executors.newFixedThreadPool(32);

CompletableFuture<User> userFuture = CompletableFuture
    .supplyAsync(() -> http.fetchUser(id), ioPool);

CompletableFuture<Orders> ordersFuture = userFuture.thenCompose(u ->
    CompletableFuture.supplyAsync(() -> http.fetchOrders(u.id()), ioPool));

CompletableFuture<String> summary = userFuture.thenCombine(ordersFuture,
    (u, o) -> u.name() + " has " + o.count() + " orders");

// Error handling
CompletableFuture<User> safe = userFuture
    .exceptionally(ex -> User.guest())
    .handle((u, ex) -> ex == null ? u : User.guest());

userFuture.whenComplete((u, ex) -> {
    if (ex != null) log.error("failed", ex);
});

CompletableFuture<Void> all = CompletableFuture.allOf(userFuture, ordersFuture);
CompletableFuture<Object> first = CompletableFuture.anyOf(userFuture, ordersFuture);

Blocking at the edge

Inside service code, chain freely. At the boundary (HTTP controller, main), use cf.get(timeout), join(), or convert to reactive type with explicit timeout. join() on failed CF throws unchecked CompletionException wrapping the cause—unwrap getCause() for the real failure.

Chaining patterns & anti-patterns

GoodAvoid
thenCompose when next step returns CF (flatMap)thenApply returning CF → nested CompletableFuture<CompletableFuture<T>>>
Dedicated executor for blocking JDBC/HTTPBlocking call inside supplyAsync on common pool
orTimeout / completeOnTimeout 9+Infinite get() without timeout
One exceptionally at pipeline endLog in every stage and rethrow
📦 Real World

Microservice orchestration: fetch user, then orders, then recommendations with thenCompose and a shared bounded IO executor. Set client timeouts with orTimeout on each stage. Aggregate with allOf for parallel fan-out to downstream APIs.

ForkJoinPool & divide-and-conquer

ForkJoinPool implements work-stealing: each worker thread maintains a deque of tasks. When idle, a worker steals from the tail of another worker’s deque—balances CPU-bound fork/join workloads without a central queue bottleneck. Parallel streams default to the common pool (ForkJoinPool.commonPool()).

Fork / join / compute

  • fork() — schedule subtask asynchronously on the pool
  • compute() — run subtask on current thread (or join stolen work)
  • join() — wait for forked subtask result

Rule of thumb: fork only when the split is large enough to amortize scheduling overhead—often thousands of elements for array work. RecursiveTask<V> returns a value; RecursiveAction is void (side-effect only if you manage merging carefully).

Do not block inside ForkJoin tasks on I/O—blocks a worker and reduces steal efficiency; use an Executor for I/O instead.

Java
class SumTask extends RecursiveTask<Long> {
    private final int[] arr;
    private final int lo, hi;

    @Override
    protected Long compute() {
        if (hi - lo <= 10_000) {
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += arr[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(arr, lo, mid);
        SumTask right = new SumTask(arr, mid, hi);
        left.fork();
        long rightVal = right.compute();
        return left.join() + rightVal;
    }
}

ForkJoinPool pool = new ForkJoinPool();
long total = pool.invoke(new SumTask(array, 0, array.length));
pool.shutdown();

Virtual threads Java 21 — Project Loom

Platform threads = OS threads (expensive, ~MB stack). Virtual threads = JVM-managed, cheap to create millions—mounted on a small pool of carrier platform threads. When virtual thread blocks on I/O, it unmounts so the carrier runs another virtual thread.

Java
Thread v = Thread.ofVirtual().name("req-42").start(() -> {
    String body = httpClient.sendBlocking(request);  // blocks virtual thread cheaply
});

try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
    IntStream.range(0, 10_000).forEach(i ->
        exec.submit(() -> handleRequest(i)));
}

Pinning virtual threads

When a virtual thread blocks in native code or on a monitor while holding a carrier, it may pin the carrier thread—reducing scalability. JDK flight recorder event jdk.VirtualThreadPinned helps diagnose. Mitigations: reduce synchronized in hot paths, prefer ReentrantLock, update drivers/libraries that block carriers, avoid long CPU loops on virtual threads.

Structured concurrency Java 21+

StructuredTaskScope models child tasks as nested within a parent scope—if the parent ends (success, failure, or cancel), children are cancelled together. This fixes “fire off async tasks and lose track” bugs in imperative code. Shutdown policies like ShutdownOnFailure fail fast when any child fails; ShutdownOnSuccess supports racing alternatives. API stabilized across JDK 21–23—check your target JDK javadoc for exact class names and preview flags.

Java
// Conceptual shape — API evolved across JDK versions; check your JDK javadoc
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    Subtask<User> user = scope.fork(() -> fetchUser(id));
    Subtask<Orders> orders = scope.fork(() -> fetchOrders(id));
    scope.join();
    scope.throwIfFailed();
    return combine(user.get(), orders.get());
}

When virtual threads shine vs when they do not

Good fitPoor fit
I/O-bound: HTTP, JDBC, messagingCPU-bound compute (use platform pool or ForkJoin)
High fan-out request-per-thread styleHeavy synchronized on hot paths (pins carrier—"pinning")
Simpler than reactive for many teamsNative code / some blocking in libraries being updated

Spring Boot 3.2+ & reactive vs imperative

Spring Boot 3.2+ can enable virtual threads for request handling (spring.threads.virtual.enabled=true). Embedded servers dispatch each request on a virtual thread; blocking JDBC, JPA, and RestClient calls release the carrier while waiting on the network or DB— so thread count is no longer the limiting factor for I/O concurrency in many apps.

Limits remain: connection pool size, downstream rate limits, heap, and CPU for serialization still cap throughput. WebFlux/reactive stacks still win when you need end-to-end reactive streams, backpressure, and non-blocking drivers throughout. Virtual threads are the pragmatic middle path: keep imperative code, gain I/O concurrency—without claiming they help CPU-bound thread pools.

🎯 Interview Tip

Contrast: platform thread pool size ≈ CPU cores for CPU work; virtual threads ≈ number of concurrent blocking operations. Explain pinning, why synchronized can hurt virtual threads (use ReentrantLock in hot paths when profiling shows pinning). Draw happens-before for volatile + lock.

📦 Real World

Instrument pool queue depth, rejection counts, and thread dump on deadlock. Use timeouts on all external calls inside async chains. For Kubernetes, align pool sizes with CPU limits; virtual threads reduce thread count metrics but JDBC pool size still caps DB concurrency.