How the HotSpot JVM Works: Explained Simply

Have you ever noticed that a Java application feels a little slow right after it starts, but gets faster after a few minutes of running? That is not a bug. That is the JVM doing its job.

This post explains why that happens, in plain language.

Start with a story

Imagine you just started working at a restaurant as a chef.

On your first day, you follow every recipe step by step. You open the book, read the first step, do it, read the second step, do it. It works, but it is slow. You are spending as much time reading as you are cooking.

After a few weeks, you have made the top ten most popular dishes hundreds of times. You no longer need the book for those. You just cook from memory. It is much faster.

And over time, you notice shortcuts. The pasta dish always uses the same sauce, so you make a big batch at the start of the shift. The salad never needs the croutons because your regulars always skip them. You stop doing steps that never matter.

But one day, a new customer comes in and orders the pasta without garlic. That breaks your shortcut. You go back to the recipe book, adjust your notes, and eventually memorize the new version.

The JVM works exactly like this. Every part of that story maps to something real inside the JVM:

Story	JVM
Recipe book	Your compiled `.class` files (bytecode)
Reading step by step	The interpreter
Memorizing the recipe	JIT compilation
Shortcuts you develop	Optimizations (inlining, escape analysis, etc.)
Customer breaks a shortcut	Deoptimization
First hour of service	The warmup period

Keep this story in mind. Everything in this post is just a deeper look at one part of it.

What even is bytecode?

When you write Java code and compile it, you do not get native machine code (the low-level instructions your CPU understands directly). You get bytecode.

Bytecode is a simplified set of instructions designed to be easy for any JVM to read, on any operating system, on any CPU. This is why the same .jar file runs on Windows, macOS, and Linux. The JVM reads the bytecode and figures out how to run it on whatever machine it is on.

Think of bytecode like sheet music. A musician from any country can read the same sheet music and play it, even if the instruments are different. The sheet music is not tied to one specific piano. The JVM is the musician. Bytecode is the sheet music.

Part 1: The Interpreter

When your Java program starts, the JVM does not immediately turn your bytecode into fast native code. That would take too long before running anything.

Instead, it starts in interpreter mode: it reads one bytecode instruction at a time and executes it immediately. This is exactly like the chef reading the recipe line by line.

This is slow. But it starts instantly. And it has an important side effect: while the interpreter is running your code, it is also watching your code. It counts things.

Specifically, it tracks:

How many times each method is called
How many times each loop runs

It keeps these counts quietly in the background, waiting to pass a threshold.

Part 2: The JIT Compiler

Once a method has been called enough times, the JVM says: “this method is hot. It is worth investing in.” The exact number depends on the JVM's settings. The classic figure is 10,000 calls, which is the default when tiered compilation is turned off. With the tiered mode that modern JVMs use by default, the first compilation can happen after as few as a couple of hundred calls.

It hands that method to the JIT compiler. JIT stands for Just-In-Time. It compiles the hot method into native machine code, the real instructions your CPU understands directly. From that point on, calls to that method skip the interpreter entirely and run the native version.

This is the chef memorizing the recipe. The early calls are slow (reading the book). Once the method is compiled, every later call is fast (cooking from memory).

// This method will be JIT compiled after enough calls
static long sum(int[] arr) {
    long total = 0;
    for (int v : arr) total += v;
    return total;
}

You can actually watch this happen. Run your program with -XX:+PrintCompilation and you will see lines like:

138   27   3   MyApp::sum (18 bytes)

That line means: 138 milliseconds after the JVM started, it compiled sum (18 bytes of bytecode) to native code at tier 3. The 27 is just a sequence number for the compilation. (Tiers are explained next.)
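To try this yourself, a small driver along the following lines should be enough. The class name SumWarmup and the figure of 50,000 calls are arbitrary choices for illustration, not anything the JVM requires.

```java
// Compile and run with:
//   javac SumWarmup.java
//   java -XX:+PrintCompilation SumWarmup
class SumWarmup {

    // Same method as above; it becomes hot because main() calls it many times.
    static long sum(int[] arr) {
        long total = 0;
        for (int v : arr) total += v;
        return total;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000];
        for (int i = 0; i < data.length; i++) data[i] = i;

        long checksum = 0;
        // 50,000 is an arbitrary count, chosen to be well past the default thresholds.
        for (int i = 0; i < 50_000; i++) {
            checksum += sum(data);
        }
        // Printing the result keeps the JIT from discarding the work as unused.
        System.out.println("checksum = " + checksum);
    }
}
```

The output is noisy because the JDK's own methods get compiled too, so it helps to search it for SumWarmup::sum. The same method may show up more than once, at different tiers.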

Part 3: Two Compilers Working Together

Here is something surprising: the JVM has not one but two JIT compilers, called C1 and C2.

Why two?

Because there is a trade-off between how fast you can compile and how good the result is.

C1 compiles quickly. The result is decent native code. It gets you off the slow interpreter fast.
C2 takes longer to compile. But the result is much better. It applies serious optimizations using everything it learned by watching your code run.

They work together like this:

Slow interpreter
      ↓
C1 compiled  (fast to compile, decent speed)
      ↓
C2 compiled  (slow to compile, much faster to run)

A method starts interpreted. Once it is hot, C1 compiles it quickly. Once it is very hot and C2 has enough data, C2 replaces the C1 version with a better one.

This is the difference between quickly memorizing a recipe (C1) and truly mastering it with your own improvements after making it a thousand times (C2).
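If you are curious where the line between "hot" and "very hot" sits on your own JVM, you can ask it. The sketch below assumes a HotSpot-based JVM, because it uses the HotSpot-specific HotSpotDiagnosticMXBean; the class name TierThresholds is made up for this example, and the comments on each flag are rough summaries rather than exact definitions.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class TierThresholds {

    // Returns the current value of a HotSpot flag, or "n/a" if this JVM does not have it.
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown flag, or not a HotSpot JVM.
            return "n/a";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "TieredCompilation",          // whether C1 and C2 cooperate as described above
            "Tier3InvocationThreshold",   // roughly: calls before C1 compiles with profiling
            "Tier4InvocationThreshold",   // roughly: calls before C2 is considered
            "CompileThreshold"            // the classic threshold when tiered mode is off
        };
        for (String name : names) {
            System.out.println(name + " = " + flag(name));
        }
    }
}
```

The real decision also takes loop counts and compiler queue load into account, so treat the printed numbers as a guide rather than a precise trigger.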

Part 4: The Shortcuts (Optimizations)

Once C2 takes over, it does not just translate bytecode to native code. It actively looks for shortcuts using everything it learned while watching your code run.

Here are the most important ones, explained simply.

Shortcut 1: Inlining

The idea: If a small method is called a million times, stop calling it as a separate method. Copy its body directly into the place that calls it.

The chef version: Your pasta recipe says “add sauce (see page 47).” After memorizing it, you do not flip to page 47 anymore. You just know what to do. The sub-recipe becomes part of the main recipe in your head.

Why it matters:

Every method call in Java has overhead: the JVM has to set up a new stack frame, jump to the method, and jump back. For a tiny getter like this:

public String getName() {
    return this.name;
}

The overhead of calling it can be larger than the work it does. After inlining, it disappears:

// Before inlining: a method call
String name = user.getName();

// After inlining: the JIT replaces the call with the body
String name = user.name;

The method call is gone. The JVM goes directly to the field. This also unlocks other optimizations, because now C2 can see what was inside that method.

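The catch: Very large methods do not get inlined. The JIT has a size limit, measured in bytes of bytecode. On typical 64-bit HotSpot builds the defaults are 35 bytes for ordinary call sites and 325 bytes for hot ones (the MaxInlineSize and FreqInlineSize flags). This is one reason keeping methods small is not just about readability.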
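You can ask HotSpot to report its inlining decisions. The sketch below is one way to set that up; InliningDemo and its User class are invented for this example, and the loop count is arbitrary.

```java
// Compile and run with:
//   javac InliningDemo.java
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InliningDemo
class InliningDemo {

    static final class User {
        private final String name;
        User(String name) { this.name = name; }
        // Tiny getter: a typical inlining candidate.
        String getName() { return name; }
    }

    // Calls the getter in a loop so the call site becomes hot.
    static long totalNameLength(User[] users, int rounds) {
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            for (User u : users) total += u.getName().length();
        }
        return total;
    }

    public static void main(String[] args) {
        User[] users = { new User("Ada"), new User("Linus"), new User("Grace") };
        // Printing the result keeps the work from being optimized away.
        System.out.println(totalNameLength(users, 1_000_000));
    }
}
```

In the output, look for lines that mention InliningDemo$User::getName. The exact wording of the annotations varies between JDK versions.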

Shortcut 2: Escape Analysis

The idea: If an object is only used inside one method and never leaves it, do not put it on the heap. Keep its contents on the stack (or in CPU registers) instead.

First, what is the heap and stack?

The heap is the large shared memory area where most Java objects live. The garbage collector manages it. Objects on the heap survive across method calls.
The stack is a small, fast memory area used for the current method call. When the method returns, the stack is automatically cleaned up, instantly, no garbage collector needed.

The chef version: If you only need a small bowl for one step of one recipe, you do not walk to the storage room to get it and bring it back later. You grab one from the counter, use it, and put it right back. Much faster.

Why it matters:

for (Order order : orders) {
    // This object is only used inside this loop iteration
    ValidationContext ctx = new ValidationContext(order);
    if (!ctx.isValid()) errors++;
}

Without escape analysis, every ValidationContext goes to the heap. With 10,000 orders, that is 10,000 heap objects the garbage collector has to clean up later.

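With escape analysis, the JIT sees that ctx never leaves the loop body. Instead of allocating a heap object, it keeps the object's fields in registers or stack slots (HotSpot calls this scalar replacement). When the iteration ends, they are instantly gone. No garbage collector involved.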

The catch: If the JIT cannot prove the object stays inside the method (for example, if it is passed to a method that is not inlined), the optimization does not apply.
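One rough way to see the effect is to measure how many bytes a thread allocates while running a loop like this. The sketch below assumes a HotSpot-based JVM, since it relies on the com.sun.management.ThreadMXBean extension; EscapeDemo and Point are hypothetical names, and the iteration counts are arbitrary.

```java
import java.lang.management.ManagementFactory;

// Compare:  java EscapeDemo   versus   java -XX:-DoEscapeAnalysis EscapeDemo
class EscapeDemo {

    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        long lengthSquared() { return (long) x * x + (long) y * y; }
    }

    // Each Point is created, used, and dropped within one iteration: it never escapes.
    static long work(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += p.lengthSquared();
        }
        return total;
    }

    // Bytes allocated so far by the current thread (HotSpot-specific API, an assumption).
    static long allocatedBytes() {
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        return bean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        long sink = 0;
        // Warm up so the JIT has a chance to compile work() before measuring.
        for (int i = 0; i < 20_000; i++) sink += work(100);

        long before = allocatedBytes();
        sink += work(1_000_000);
        long after = allocatedBytes();

        System.out.println("sink = " + sink);
        System.out.println("bytes allocated during measured call = " + (after - before));
    }
}
```

If escape analysis is doing its job, the allocated-bytes figure should be far smaller with the default settings than with -XX:-DoEscapeAnalysis, though the exact numbers depend on the JDK version and on when compilation finishes.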

Shortcut 3: Monomorphic Dispatch

The idea: If a method call always uses the same concrete type, stop looking it up every time. Just call it directly.

The chef version: You have a regular customer, Alice, who orders the same thing every day. After a while, when you see Alice walk in, you do not ask what she wants. You just start making her usual. But you keep a quick eye on her in case she orders something different today.

Why it matters:

In Java, when you write code against an interface:

List<Order> orders = getOrders();
orders.add(newOrder);

The JVM does not know which add() to call at compile time. orders could be an ArrayList, a LinkedList, or anything else. So at runtime, it has to look up which add() to use every single time. That lookup takes time.

But the JIT watches. If it sees that orders is always an ArrayList in practice, it generates:

// What the JIT does behind the scenes
if (orders is an ArrayList) {
    call ArrayList.add() directly  ← fast, can also be inlined
} else {
    do the slow lookup             ← almost never happens
}

The check is a single cheap type comparison. The fast path can be fully inlined. It is as if you wrote ArrayList directly.

The catch: If three or more different types appear at the same call site (a so-called megamorphic call site), the JIT stops trying to predict the type and falls back to the slow lookup on every call. Mixing many different types through the same interface reference in a hot loop is one of the most common causes of unexpectedly slow Java code.
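Here is a small experiment you can run to compare the two situations. It is a sketch with invented names (DispatchDemo, Shape, and so on) and a crude timer, not a proper benchmark. Each mode has to run in its own JVM process, because both would otherwise share the same call site and the same profile.

```java
// Run once as:  java DispatchDemo         (one concrete type at the call site)
// and once as:  java DispatchDemo mixed   (three concrete types at the call site)
class DispatchDemo {

    interface Shape { long area(); }

    static final class Square implements Shape {
        final long side;
        Square(long side) { this.side = side; }
        public long area() { return side * side; }
    }

    static final class Rect implements Shape {
        final long w, h;
        Rect(long w, long h) { this.w = w; this.h = h; }
        public long area() { return w * h; }
    }

    static final class Strip implements Shape {
        final long length;
        Strip(long length) { this.length = length; }
        public long area() { return length; }   // a strip that is one unit wide
    }

    // s.area() is the single call site whose receiver types the JIT profiles.
    static long totalArea(Shape[] shapes) {
        long total = 0;
        for (Shape s : shapes) total += s.area();
        return total;
    }

    public static void main(String[] args) {
        boolean mixed = args.length > 0 && args[0].equals("mixed");
        Shape[] shapes = new Shape[3_000];
        for (int i = 0; i < shapes.length; i++) {
            if (!mixed || i % 3 == 0) shapes[i] = new Square(2);
            else if (i % 3 == 1)      shapes[i] = new Rect(2, 2);
            else                      shapes[i] = new Strip(4);
        }

        long sink = 0;
        long start = System.nanoTime();
        for (int round = 0; round < 20_000; round++) sink += totalArea(shapes);
        long millis = (System.nanoTime() - start) / 1_000_000;

        System.out.println((mixed ? "three types: " : "one type: ")
                + millis + " ms (sink " + sink + ")");
    }
}
```

The timing includes warmup and is only meant to show a trend. For numbers you intend to rely on, use JMH.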

Shortcut 4: Dead Code Elimination

The idea: If a branch of code is never taken in practice, remove it from the compiled version.

The chef version: The menu says every pasta dish comes with a bread basket. But after six months, you notice that 100% of customers skip the bread. You stop preparing it. The menu still says it is included, but you know it never happens. (And if someone actually wants bread, you go back and make it.)

Why it matters:

public Response handle(Request req) {
    if (featureFlags.isEnabled("old_checkout")) {
        return oldCheckout.process(req);   // this flag has been false for months
    }
    return newCheckout.process(req);
}

After seeing millions of requests where old_checkout is always false, C2 compiles the method as if the old branch did not exist. The call to oldCheckout.process() is gone. What remains is the flag check followed by the call to the new checkout.

If the flag is ever turned back on, the compiled code hits a small trap left where the removed branch used to be, and the JVM handles it. More on that next.

Part 5: When a Shortcut Breaks (Deoptimization)

All of the shortcuts in Part 4 are bets. The JIT is betting that what was true during observation will stay true. Usually it is right. Sometimes it is not.

When a bet turns out to be wrong, the JVM does something called deoptimization. Think of it as the JIT saying: “I was wrong. Let me tear this up and start over.”

Here is what happens step by step:

The JIT compiled a method with an assumption (“this is always an ArrayList”)
At runtime, a LinkedList shows up for the first time
The assumption is wrong
The JVM throws away the compiled native code for that method
It falls back to the interpreter for that method
It watches again, collects new data
It recompiles with the correct understanding
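The sequence above can be provoked on purpose. The sketch below shows only ArrayList to a method for a while and then switches to LinkedList. DeoptDemo is an invented name and 200,000 is an arbitrary count.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Compile and run with:
//   javac DeoptDemo.java
//   java -XX:+PrintCompilation DeoptDemo
class DeoptDemo {

    // The list.get() and list.size() call sites are profiled by the JIT.
    static int firstPlusSize(List<Integer> list) {
        return list.get(0) + list.size();
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(List.of(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(List.of(1, 2, 3));
        long sink = 0;

        // Phase 1: the JIT only ever sees ArrayList here.
        for (int i = 0; i < 200_000; i++) sink += firstPlusSize(arrayList);

        System.out.println("--- switching to LinkedList ---");

        // Phase 2: a type the compiled code did not plan for.
        for (int i = 0; i < 200_000; i++) sink += firstPlusSize(linkedList);

        // Printing the result keeps the work from being optimized away.
        System.out.println("sink = " + sink);
    }
}
```

Shortly after the marker line you may see PrintCompilation entries for DeoptDemo flagged as "made not entrant", which is HotSpot's way of saying that a compiled version has been retired. New compilations of the same methods should follow soon after.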

A single deoptimization is harmless. The method just runs a bit slower until it is recompiled.

The dangerous case is when many methods deoptimize at the same time. This is called a deoptimization storm. It can happen when a new plugin or class is loaded at runtime, because a newly loaded class can invalidate assumptions the JIT made about which types exist. For a stretch that can last 30 to 60 seconds, the application feels slow as the JIT recompiles the affected methods. After that, it recovers.

Part 6: The Warmup Period

Now you understand why Java applications feel slow at first.

When your application just started:

All methods are running in the interpreter
The JIT has not compiled anything yet
None of the shortcuts exist yet

Over the next few minutes:

Hot methods get compiled by C1 (moderate speed boost)
Very hot methods get compiled by C2 (large speed boost)
Shortcuts start kicking in

After a few minutes:

Everything that matters has been compiled and optimized
The application reaches its full speed

This window is called the warmup period.

Why this matters in production

Here is a real problem that many teams run into.

You deploy a new version of your service to Kubernetes. The container starts. After two seconds, the health check passes. Kubernetes thinks the service is ready. It starts sending live traffic.

But the JVM has only been running for two seconds. It is still in the middle of the interpreter phase. Every user request for the next two minutes is slower than normal.

The fix: do not let the health check pass until after warmup is done.

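import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.stereotype.Component;

@Component
public class WarmupRunner implements ApplicationRunner {

    private static final Logger log = LoggerFactory.getLogger(WarmupRunner.class);

    // OrderService, Order, and ReadinessProbe are this application's own classes
    private final OrderService orderService;
    private final ReadinessProbe probe;

    // Constructor injection: Spring supplies both beans
    public WarmupRunner(OrderService orderService, ReadinessProbe probe) {
        this.orderService = orderService;
        this.probe = probe;
    }

    @Override
    public void run(ApplicationArguments args) {
        // Run the hot methods enough times to trigger JIT compilation
        Order fake = Order.synthetic();
        for (int i = 0; i < 15_000; i++) {
            orderService.validate(fake);
            orderService.calculateTotal(fake);
        }

        // Now we are ready for real traffic
        probe.setReady(true);
        log.info("Warmup complete, accepting traffic");
    }
}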

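With HotSpot's default settings, 15,000 iterations is normally enough to push the hot methods past C2’s threshold. Compilation happens on background threads, though, so the optimized code may land slightly after the loop finishes. A single synthetic order also only teaches the JIT about the code paths that order exercises, so make it as realistic as you can. The health check stays failing until the warmup completes, so users reach a JVM that is already warm.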
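The ReadinessProbe used in that snippet is not defined in this post. As a rough idea of what it could look like, here is a framework-free sketch: a thread-safe flag plus a helper that runs a task repeatedly and then flips the flag. Every name in it (WarmupSketch, ReadinessProbe, warmUp) is an assumption made for illustration, and a real service would expose isReady() through its health endpoint.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntToLongFunction;

class WarmupSketch {

    // Hypothetical stand-in for the ReadinessProbe above: a flag that a
    // health endpoint could read from another thread.
    static final class ReadinessProbe {
        private final AtomicBoolean ready = new AtomicBoolean(false);
        void setReady(boolean value) { ready.set(value); }
        boolean isReady() { return ready.get(); }
    }

    // Runs the task the given number of times, then marks the probe ready.
    // The accumulated result is returned so the JIT cannot discard the work.
    static long warmUp(IntToLongFunction task, int iterations, ReadinessProbe probe) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += task.applyAsLong(i);
        }
        probe.setReady(true);
        return sink;
    }

    public static void main(String[] args) {
        ReadinessProbe probe = new ReadinessProbe();
        System.out.println("ready before warmup: " + probe.isReady());

        // Stand-in workload; a real service would call its own hot methods here.
        long sink = warmUp(i -> (long) i * i, 15_000, probe);

        System.out.println("ready after warmup: " + probe.isReady() + " (sink " + sink + ")");
    }
}
```

Keeping the flag in an AtomicBoolean matters because the warmup thread writes it while the health check thread reads it.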

Part 7: When Should You Care About All This?

You do not need to think about HotSpot every day. But here are the situations where this knowledge actually changes what you do:

Your app is slow for the first few minutes after deploy. This is the warmup period. Add a warmup routine like the one above and keep the readiness probe failing until it completes.

You are writing a performance benchmark. Never measure Java performance without a warmup phase. If you run a method 100 times and report the average, you are measuring the interpreter and C1, not C2. Use JMH (Java Microbenchmark Harness), which handles warmup automatically.

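@State(Scope.Benchmark)
public class SumBenchmark {
    int[] data = new int[10_000];   // benchmark input, held as JMH state

    @Benchmark
    @Warmup(iterations = 5, time = 1)        // 5 warmup iterations of 1 second each
    @Measurement(iterations = 10, time = 1)  // then 10 measured iterations of 1 second each
    public long myBenchmark() {
        return sum(data);   // assumes the sum() method from Part 2 is in scope
    }
}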

You are building a short-lived tool (a CLI, a Lambda function, a one-off script). The process will often exit before HotSpot has finished warming up. Consider GraalVM Native Image instead: it compiles your Java code ahead of time to a native binary that starts in milliseconds.

You have unexplained slowness in a hot loop. Check whether you are mixing multiple different types through the same interface reference. If you pass three or more concrete types through the same call site, the JIT gives up on the most valuable optimization (monomorphic dispatch + inlining) for that site.

You see a mysterious 30 to 60 second slowness in production. One likely cause is a deoptimization storm. Use JDK Flight Recorder to check: java -XX:StartFlightRecording=filename=recording.jfr YourApp. Open the recording in JDK Mission Control and look through the compiler-related events for a spike in deoptimizations.
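If you would rather check a recording from code than from a GUI, the JDK ships a small API for reading .jfr files. The sketch below counts events by name. DeoptCounter is an invented class name, and the event name jdk.Deoptimization is an assumption about your JDK: recent releases emit it, older ones may not, in which case the count is simply zero.

```java
import java.io.IOException;
import java.nio.file.Path;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Usage: java DeoptCounter recording.jfr
class DeoptCounter {

    // Counts the events in a recording whose type name matches eventName.
    static long countEvents(Path recording, String eventName) throws IOException {
        long count = 0;
        try (RecordingFile file = new RecordingFile(recording)) {
            while (file.hasMoreEvents()) {
                RecordedEvent event = file.readEvent();
                if (event.getEventType().getName().equals(eventName)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path path = Path.of(args.length > 0 ? args[0] : "recording.jfr");
        // "jdk.Deoptimization" is assumed to be the event name on your JDK.
        long deopts = countEvents(path, "jdk.Deoptimization");
        System.out.println("Deoptimization events: " + deopts);
    }
}
```

A handful of these events is normal. What you are looking for is a dense burst that lines up with the slow period.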

Summary

Let us bring the chef story all the way home.

The JVM starts like a chef on their first day: slow, following the recipe book step by step. This is the interpreter. It is slow but it starts immediately.

After seeing which dishes are ordered all the time, the chef memorizes them. This is JIT compilation: the JVM compiles hot methods to native code. C1 is the quick memorization. C2 is the deep, polished version that comes after months of practice.

Over time the chef develops shortcuts: skipping steps that never apply, knowing what regulars want, doing multiple things at once. These are the optimizations: inlining, escape analysis, monomorphic dispatch, dead code elimination.

Sometimes a new order breaks a shortcut. The chef goes back to the book, updates the notes, and re-memorizes. This is deoptimization.

The first hour of service, when the chef is still learning, is the warmup period. It is slower. You would not seat all your VIP customers in the first five minutes of a chef’s first day.

That is HotSpot.