Kotlin Performance Optimization
Overview
This guide covers performance optimization techniques specific to Kotlin, including understanding inline functions, optimizing collections with sequences, minimizing allocations, avoiding common performance pitfalls, and profiling Kotlin code. While premature optimization should be avoided, understanding performance characteristics helps you write efficient code from the start and know where to focus optimization efforts when needed.
Core Principles
- Measure First: Profile before optimizing - intuition is often wrong
- Inline Functions: Eliminate lambda overhead for higher-order functions
- Sequences: Lazy evaluation for large collection operations
- Avoid Allocations: Minimize object creation in hot paths
- Immutability Trade-offs: Balance safety and performance
- Data Classes: Understand generated code overhead
- Coroutines: Prefer coroutines over threads for scalability
- Smart Casts: Leverage type inference to avoid unnecessary casts
- Collection Choice: Use appropriate collection types
- Benchmarking: Use proper microbenchmarking tools
Inline Functions
Inline functions eliminate the overhead of lambda objects and function calls by copying function code directly to the call site. This is particularly valuable for higher-order functions (functions that take other functions as parameters) used frequently in hot code paths.
Understanding Inline Functions
When you pass a lambda to a regular higher-order function, Kotlin creates an object to hold the lambda and invokes it polymorphically. This incurs allocation overhead and virtual call overhead. For small functions called frequently, this overhead can be significant.
Marking a function as inline instructs the compiler to copy both the function body and the lambda argument to the call site, eliminating both allocations and call overhead. This is why many standard library functions like let, apply, run, forEach, and filter are inline.
When to inline: Inline functions with lambda parameters used in performance-critical code. Don't inline large functions - you'll increase bytecode size. Don't inline functions that store lambda references (the lambda must be callable at the inlined call site). For general function design, see our Kotlin general guide.
When to Use Inline
// GOOD: Inline for frequently-used utility functions
inline fun <T> measureTime(block: () -> T): T {
    val start = System.nanoTime()
    val result = block()
    val elapsed = System.nanoTime() - start
    println("Execution took ${elapsed}ns")
    return result
}
// GOOD: Inline with reified type parameters
inline fun <reified T> isInstance(value: Any): Boolean {
    return value is T // Reified type available at runtime
}
// Usage - no lambda allocation
val result = measureTime {
    // Some computation
    complexCalculation()
}
// BAD: Inlining large function - increases bytecode size
inline fun largeFunction() {
    // 100+ lines of code
    // Inlining copies all this to every call site
}
// BAD: Cannot be declared inline - stores a lambda reference
val savedBlocks = mutableListOf<() -> Unit>()
fun storeAndCallLater(block: () -> Unit) {
    savedBlocks.add(block) // Lambda object must exist
    block()
}
Inline functions enable reified type parameters - generic types that are available at runtime. Normally, generics are erased, but inlined functions can preserve type information because the generic type is known at the call site.
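For instance, a reified type parameter lets a helper filter a heterogeneous list by element type without passing a Class object. A minimal sketch (filterByType is a hypothetical helper mirroring the stdlib's filterIsInstance):

```kotlin
// Hypothetical helper: T survives erasure because the body is inlined at the
// call site, where the concrete type argument is known
inline fun <reified T> List<Any>.filterByType(): List<T> =
    this.mapNotNull { it as? T } // 'as? T' is only legal because T is reified

fun main() {
    val mixed: List<Any> = listOf(1, "two", 3.0, "four")
    val strings = mixed.filterByType<String>()
    println(strings) // [two, four]
}
```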
Noinline and Crossinline
Sometimes you want to inline a function but not inline all its lambda parameters. Use noinline for lambda parameters that can't be inlined (e.g., stored or passed to non-inline functions). Use crossinline for lambdas that must not contain non-local returns.
// GOOD: Mix of inline and noinline parameters
inline fun processData(
    data: List<Int>,
    filter: (Int) -> Boolean, // Inlined
    noinline logger: (String) -> Unit // Not inlined - might be stored
) {
    data.filter(filter).forEach { value ->
        logger("Processing $value")
    }
}
// GOOD: Crossinline prevents non-local returns
inline fun runSafely(crossinline block: () -> Unit) {
    try {
        block() // Cannot contain 'return' that would exit runSafely's caller
    } catch (e: Exception) {
        println("Error: ${e.message}")
    }
}
// BAD: Non-local return in lambda (without crossinline)
inline fun forEach(list: List<Int>, action: (Int) -> Unit) {
    for (item in list) {
        action(item) // If action contains 'return', it exits forEach's caller
    }
}
fun processItems(items: List<Int>) {
    forEach(items) { item ->
        if (item < 0) return // Returns from processItems, not just the lambda!
        println(item)
    }
    println("Done") // Might not print if negative item found
}
Sequences for Lazy Evaluation
Sequences provide lazy evaluation for collection operations. Unlike regular collection functions (which are eager), sequences defer computation until a terminal operation is called. This eliminates intermediate collections and improves performance for large data sets with multiple operations.
When to Use Sequences
Collection functions like map, filter, and flatMap are eager - each operation creates a new intermediate collection. For a chain like list.map {...}.filter {...}.take(10), map creates a full list, then filter creates another list, then take creates a final list. If the list has 1 million elements but you only need 10, you've wasted significant work.
Sequences evaluate elements one-by-one through the entire pipeline. Each element passes through map, then filter, then take. No intermediate collections are created. Evaluation stops once take has collected 10 elements, so the vast majority of a million-element list is never touched (exactly how many elements are pulled depends on how many pass the filter).
Use sequences for large collections with multiple transformations, or when you might not need all results (e.g., take, first, any). Don't use sequences for small collections or single operations - the lazy evaluation overhead isn't worth it. For collection operations, see our Kotlin collections guide.
Sequences vs Collections
// BAD: Eager evaluation - creates intermediate lists
fun processLargeList(numbers: List<Int>): List<Int> {
    return numbers
        .map { it * 2 } // Creates List<Int> with 1M elements
        .filter { it > 100 } // Creates another List<Int>
        .take(10) // Creates final List<Int> with 10 elements
    // Processed 1M elements through map, 1M through filter
}
// GOOD: Lazy evaluation with sequence
fun processLargeListEfficiently(numbers: List<Int>): List<Int> {
    return numbers
        .asSequence() // Convert to sequence - no cost
        .map { it * 2 } // No intermediate list
        .filter { it > 100 } // No intermediate list
        .take(10) // Stops after finding 10 elements
        .toList() // Terminal operation - creates single result list
    // Only processes elements until 10 results found
}
// GOOD: Sequence with short-circuiting
fun findFirstEligible(users: List<User>): User? {
    return users
        .asSequence()
        .filter { it.isActive }
        .map { it.toDetailedProfile() } // Expensive operation
        .firstOrNull { it.meetsRequirements() }
    // Stops as soon as first match is found
}
// BAD: Using sequence for small collections
fun processSmallList(numbers: List<Int>): List<Int> {
    return numbers // Only 10 elements
        .asSequence()
        .map { it * 2 }
        .toList()
    // Overhead of sequence isn't worth it
}
Sequences shine when combined with short-circuiting operations (first, take, any, all). The lazy evaluation stops processing once the condition is met. For small collections (< 100 elements), the sequence overhead likely exceeds any benefit.
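The laziness is easy to observe directly: count how many times the mapping lambda actually runs in each style. A minimal sketch:

```kotlin
// Count map-lambda invocations for the eager (List) pipeline
fun eagerMapCalls(numbers: List<Int>, wanted: Int): Int {
    var calls = 0
    numbers.map { calls++; it * 2 }.take(wanted) // map touches every element
    return calls
}

// Count map-lambda invocations for the lazy (Sequence) pipeline
fun lazyMapCalls(numbers: List<Int>, wanted: Int): Int {
    var calls = 0
    numbers.asSequence().map { calls++; it * 2 }.take(wanted).toList() // map runs on demand
    return calls
}

fun main() {
    val numbers = (1..1_000_000).toList()
    println(eagerMapCalls(numbers, 5)) // 1000000 - the whole list was mapped
    println(lazyMapCalls(numbers, 5))  // 5 - only the elements take() pulled
}
```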
generateSequence for Infinite Sequences
generateSequence creates sequences from a generator function, enabling infinite or computed sequences. This is powerful for iterative algorithms, pagination, or any scenario where you generate values on demand.
// GOOD: Fibonacci sequence generator
val fibonacci = generateSequence(Pair(0, 1)) { (a, b) ->
    Pair(b, a + b)
}.map { it.first }
// Take first 10 Fibonacci numbers
val first10 = fibonacci.take(10).toList()
// [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
// GOOD: Pagination with sequence (fetchPage is a blocking call here -
// the sequence builder cannot call arbitrary suspend functions; use Flow for those)
fun fetchAllPages(): Sequence<Page> = sequence {
    var nextToken: String? = null
    do {
        val response = api.fetchPage(nextToken)
        yield(response.page)
        nextToken = response.nextToken
    } while (nextToken != null)
}
// Usage - can stop early
val firstThreePages = fetchAllPages().take(3).toList()
// GOOD: Tree traversal with sequence
fun Node.traverse(): Sequence<Node> = sequence {
    yield(this@traverse)
    for (child in children) {
        yieldAll(child.traverse())
    }
}
// Find first node matching condition without traversing entire tree
val foundNode = rootNode.traverse().firstOrNull { it.matches(criteria) }
Memory and Allocation Optimization
Excessive object allocation triggers garbage collection, which pauses application execution. Minimizing allocations in hot paths improves performance and reduces GC pressure.
Avoiding Unnecessary Allocations
// BAD: Creates new list on every call
fun getActiveUsers(users: List<User>): List<User> {
    return users.filter { it.isActive } // New list allocated
}
// GOOD: Return sequence to defer allocation
fun getActiveUsers(users: List<User>): Sequence<User> {
    return users.asSequence().filter { it.isActive } // No allocation until consumed
}
// BAD: Allocates string repeatedly in loop
fun formatNumbers(numbers: List<Int>): String {
    var result = ""
    for (num in numbers) {
        result += "$num, " // Creates new string each iteration
    }
    return result
}
// GOOD: StringBuilder for efficient string building
fun formatNumbers(numbers: List<Int>): String {
    return buildString {
        numbers.forEach { append("$it, ") }
    } // StringBuilder grows in place - no per-iteration string allocation
}
// BAD: Boxing primitives in collection
val numbers: List<Int> = listOf(1, 2, 3, 4, 5) // Each Int is boxed (heap object)
// GOOD: Use primitive array when performance matters
val numbers = intArrayOf(1, 2, 3, 4, 5) // Primitive array, no boxing
Every object allocation requires heap space and eventual GC. In hot loops processing thousands of elements, small allocations add up. Use primitives instead of boxed types, reuse objects when safe, and prefer sequences or inline functions to eliminate lambda allocations.
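The boxing difference is visible even in a trivial hot loop: iterating a List<Int> unboxes an Integer object per element, while an IntArray works on plain ints. A sketch:

```kotlin
// Summing a boxed list: each element is an Integer object on the heap
fun sumBoxed(values: List<Int>): Long {
    var total = 0L
    for (v in values) total += v // unboxing on every iteration
    return total
}

// Summing a primitive array: plain ints, no per-element objects
fun sumPrimitive(values: IntArray): Long {
    var total = 0L
    for (v in values) total += v // no boxing, cache-friendly layout
    return total
}

fun main() {
    val boxed = List(1000) { it }
    val primitive = IntArray(1000) { it }
    println(sumBoxed(boxed) == sumPrimitive(primitive)) // true - same result, different cost
}
```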
Data Class Overhead
Data classes generate equals, hashCode, toString, copy, and componentN functions. These are convenient but have performance implications. equals and hashCode iterate through all properties. For data classes with many properties or used in large collections, this can be expensive.
// GOOD: Data class for DTOs - convenience worth the cost
data class User(
    val id: String,
    val name: String,
    val email: String,
    val age: Int
)
// BAD: Large data class used as map key
data class ComplexKey(
    val field1: String,
    val field2: String,
    // ... 20 more fields
)
val cache = mutableMapOf<ComplexKey, Value>()
cache[complexKey] // hashCode iterates all 22 fields!
// GOOD: Custom hashCode focusing on key fields
data class OptimizedKey(
    val id: String, // Primary identifier
    val type: String, // Secondary identifier
    val field3: String,
    // ... more fields
) {
    // Only hash the key fields
    override fun hashCode(): Int = id.hashCode() * 31 + type.hashCode()
    override fun equals(other: Any?): Boolean {
        if (this === other) return true
        if (other !is OptimizedKey) return false
        return id == other.id && type == other.type
    }
}
For data classes used as map keys or in large sets, consider custom equals/hashCode implementations that focus on key fields. For data class patterns, see our Kotlin general guide.
Collection Optimization
Choosing the right collection type impacts both performance and memory usage. Kotlin provides various collection types optimized for different access patterns.
Collection Type Selection
// GOOD: ArrayList for sequential access and dynamic sizing
val items = ArrayList<Int>(1000) // Pass initial capacity if known (named arguments don't work with Java constructors)
items.add(1)
items.add(2)
// GOOD: LinkedList for frequent insertions/removals at both ends
val queue = LinkedList<Task>()
queue.addFirst(urgentTask)
queue.addLast(normalTask)
// GOOD: HashSet for fast lookup
val uniqueIds = HashSet<String>()
if (id in uniqueIds) { ... } // O(1) lookup
// GOOD: HashMap for key-value lookup
val userCache = HashMap<String, User>(10000)
val user = userCache[userId] // O(1) lookup
// BAD: List for membership testing
val allowedIds = listOf("id1", "id2", "id3", /* ... hundreds more */)
if (userId in allowedIds) { ... } // O(n) linear search!
// GOOD: Set for membership testing
val allowedIds = setOf("id1", "id2", "id3", /* ... hundreds more */)
if (userId in allowedIds) { ... } // O(1) hash lookup
// GOOD: Specify initial capacity to avoid resizing
val largeMap = HashMap<String, User>(10000)
// Prevents multiple resize operations as map grows
- ArrayList: Constant-time random access, amortized constant-time append. Use for indexed access and iteration.
- LinkedList: Constant-time insertions/removals at ends. Use for queues or frequent modifications.
- HashSet/HashMap: Constant-time lookup, insertion, removal. Use when you need fast membership testing or key-value lookup.
- TreeSet/TreeMap: Sorted, log-time operations. Use when you need ordering.
For collection patterns, see our Kotlin general guide.
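One caveat on the LinkedList recommendation: in pure Kotlin code, kotlin.collections.ArrayDeque (available since Kotlin 1.4) is usually the better double-ended queue thanks to contiguous storage and cache locality. A small sketch:

```kotlin
fun main() {
    val deque = ArrayDeque<String>() // kotlin.collections.ArrayDeque, not java.util
    deque.addFirst("urgent")         // amortized O(1) at either end
    deque.addLast("normal")
    println(deque.removeFirst()) // urgent
    println(deque.removeLast())  // normal
}
```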
Immutable vs Mutable Collections
Kotlin distinguishes read-only (List, Set, Map) and mutable (MutableList, MutableSet, MutableMap) interfaces. Read-only collections don't guarantee immutability - they just prevent modification through that reference. For performance, prefer mutable collections internally and expose read-only views.
// GOOD: Mutable internally, read-only externally
class UserRepository {
    private val users = mutableListOf<User>() // Mutable for internal updates
    fun addUser(user: User) {
        users.add(user)
    }
    fun getUsers(): List<User> = users // Read-only view
}
// BAD: Copying on every read
class UserRepository {
    private val users = mutableListOf<User>()
    fun getUsers(): List<User> = users.toList() // Creates new list every call!
}
// GOOD: Persistent collections for true immutability (kotlinx.collections.immutable)
import kotlinx.collections.immutable.*
val list1 = persistentListOf(1, 2, 3)
val list2 = list1.add(4) // Returns new list, list1 unchanged
// Structural sharing - efficient updates without full copying
Read-only collections provide API safety without copying overhead. For truly immutable collections with efficient updates, use kotlinx.collections.immutable, which provides persistent data structures with structural sharing.
Coroutine Performance
Coroutines are more efficient than threads for concurrent tasks. A thread requires significant memory (typically 1MB stack) and OS-level context switching. Thousands of active threads cause performance issues. Coroutines are lightweight - you can run hundreds of thousands concurrently.
Coroutines vs Threads
// BAD: Creating threads for many concurrent tasks
fun processWithThreads(items: List<Item>) {
    val threads = items.map { item ->
        Thread {
            processItem(item)
        }.apply { start() }
    }
    threads.forEach { it.join() }
    // If items has 10,000 elements, creates 10,000 threads - will crash or be very slow
}
// GOOD: Using coroutines for many concurrent tasks
suspend fun processWithCoroutines(items: List<Item>) = coroutineScope {
    items.map { item ->
        launch {
            processItem(item)
        }
    }.joinAll() // coroutineScope would also wait for all children implicitly
    // 10,000 coroutines - no problem, they share a small thread pool
}
// GOOD: Limit concurrency with Semaphore (kotlinx.coroutines.sync)
suspend fun processWithLimitedConcurrency(items: List<Item>) {
    val semaphore = Semaphore(100) // Max 100 concurrent operations
    coroutineScope {
        items.map { item ->
            launch {
                semaphore.withPermit {
                    processItem(item)
                }
            }
        }.joinAll()
    }
}
Each coroutine requires only a small object (typically < 1KB). They're scheduled on a shared thread pool (sized to CPU cores for CPU-bound work). Context switching is fast because it's user-space, not kernel-level. For coroutine fundamentals, see our Kotlin coroutines guide.
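The lightweight claim is easy to verify: launching 100,000 coroutines that each suspend briefly finishes in roughly the time of a single delay, while 100,000 threads would exhaust memory. A sketch using kotlinx.coroutines:

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Launch `count` coroutines that each suspend briefly; returns how many completed
suspend fun runMany(count: Int): Int = coroutineScope {
    val completed = AtomicInteger(0)
    List(count) {
        launch {
            delay(10) // suspends without blocking a thread
            completed.incrementAndGet()
        }
    }.joinAll()
    completed.get()
}

fun main() = runBlocking {
    println("Completed: ${runMany(100_000)}") // Completed: 100000
}
```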
Dispatcher Selection
Choose the right dispatcher for your workload. Dispatchers.Default for CPU-bound work uses a thread pool sized to CPU cores. Dispatchers.IO for I/O operations uses a larger pool (default 64 threads) optimized for blocking I/O. Using the wrong dispatcher wastes resources or limits throughput.
// GOOD: IO dispatcher for network calls
suspend fun fetchUsers(): List<User> = withContext(Dispatchers.IO) {
    api.getUsers() // Blocking I/O
}
// GOOD: Default dispatcher for CPU-intensive work
suspend fun processData(data: ByteArray): ProcessedData = withContext(Dispatchers.Default) {
    // Heavy computation
    complexAlgorithm(data)
}
// BAD: Default dispatcher for blocking I/O
suspend fun fetchUsers(): List<User> = withContext(Dispatchers.Default) {
    api.getUsers() // Blocks a thread from CPU pool!
}
For dispatcher selection strategies, see our Kotlin concurrency guide.
Profiling and Benchmarking
Always measure before optimizing. Use profilers to identify actual bottlenecks. Use proper benchmarking tools to verify optimizations actually help.
Profiling Tools
Android Profiler (Android Studio): CPU, memory, network profiling for Android apps. Identifies hot methods and memory leaks.
IntelliJ IDEA Profiler: CPU and memory profiling for JVM applications. Flame graphs show where time is spent.
Java Flight Recorder (JFR): Low-overhead profiling for production JVM apps. Captures events over time for offline analysis.
VisualVM: Free standalone profiler. CPU, memory, and thread profiling.
// GOOD: Simple timing for quick checks
// (named measureMillis to avoid shadowing kotlin.system.measureTimeMillis)
inline fun <T> measureMillis(block: () -> T): Pair<T, Long> {
    val start = System.currentTimeMillis()
    val result = block()
    val elapsed = System.currentTimeMillis() - start
    return result to elapsed
}
val (result, time) = measureMillis {
    processLargeDataset()
}
println("Took ${time}ms")
// For accurate benchmarking, use kotlinx.benchmark or JMH
Microbenchmarking with kotlinx.benchmark
For accurate microbenchmarks, use kotlinx.benchmark (multiplatform) or JMH (JVM only). These handle warmup, JIT compilation, and statistical analysis.
// GOOD: Proper microbenchmark with kotlinx.benchmark
@State(Scope.Benchmark)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
class CollectionBenchmark {
    private val data = (1..10000).toList()

    @Benchmark
    fun filterWithList(): List<Int> {
        return data.filter { it % 2 == 0 }
    }

    @Benchmark
    fun filterWithSequence(): List<Int> {
        return data.asSequence().filter { it % 2 == 0 }.toList()
    }
}
Microbenchmarks are tricky - JIT compilation, GC, and caching affect results. Always warm up the JVM, run multiple iterations, and use statistical analysis. Don't trust single-run timing.
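For intuition, the idea behind those tools can be sketched by hand: warm up first, time many runs, and report a robust statistic. This is not a substitute for kotlinx.benchmark or JMH, just an illustration of the underlying approach:

```kotlin
import kotlin.system.measureNanoTime

// Warm up, run many iterations, return the median time in nanoseconds.
// The median is more robust than the mean against GC pauses and JIT spikes.
fun benchmark(warmup: Int = 5, runs: Int = 20, block: () -> Unit): Long {
    repeat(warmup) { block() } // let the JIT compile the hot path first
    val timings = LongArray(runs) { measureNanoTime(block) }
    timings.sort()
    return timings[runs / 2]
}

fun main() {
    val data = (1..10_000).toList()
    val medianNs = benchmark { data.filter { it % 2 == 0 } }
    println("median: ${medianNs}ns")
}
```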
Common Performance Pitfalls
Avoid String Concatenation in Loops
// BAD: Creates new string each iteration
// (named concatenate to avoid shadowing the stdlib buildString function)
fun concatenate(items: List<String>): String {
    var result = ""
    for (item in items) {
        result += item // Allocates new string
    }
    return result
}
// GOOD: Use StringBuilder via buildString
fun concatenate(items: List<String>): String = buildString {
    items.forEach { append(it) }
}
Minimize Lambda Allocations in Hot Paths
// BAD: Capturing lambda allocated on every iteration
fun processItems(items: List<Item>) {
    for (item in items) {
        item.process { result -> // New lambda object if it captures state
            handleResult(result) // (non-capturing lambdas are cached as singletons)
        }
    }
}
// GOOD: Extract lambda to reuse or use inline function
val handler: (Result) -> Unit = { result -> handleResult(result) }
fun processItems(items: List<Item>) {
    for (item in items) {
        item.process(handler) // Reuse same lambda
    }
}
Cache Expensive Computations
// BAD: Recomputes on every access
class UserProfile(val user: User) {
    val displayName: String
        get() = "${user.firstName} ${user.lastName}".trim() // Recomputed each time
}
// GOOD: Compute once
class UserProfile(val user: User) {
    val displayName: String = "${user.firstName} ${user.lastName}".trim()
}
// GOOD: Lazy initialization for expensive computation
class UserProfile(val user: User) {
    val detailedInfo: String by lazy {
        // Expensive computation
        computeDetailedInfo(user)
    }
}
Further Reading
General Performance Concepts
- Performance Overview - Performance strategy and principles
- Performance Optimization - Cross-language optimization techniques
- Performance Testing - Load testing strategies
Internal Documentation
- Kotlin General - Kotlin language features
- Kotlin Concurrency - Concurrency patterns
- Java Performance - JVM performance considerations
- Android Performance - Android-specific optimization
Summary
Key Takeaways
- Measure First - Profile before optimizing
- Inline Functions - Eliminate lambda overhead in hot paths
- Sequences - Lazy evaluation for large collections with multiple operations
- Avoid Allocations - Minimize object creation in performance-critical code
- Collection Choice - Use HashSet/HashMap for lookup, ArrayList for iteration
- Coroutines - More scalable than threads for concurrent work
- Dispatcher Selection - IO for I/O, Default for CPU work
- String Building - Use StringBuilder or buildString, not concatenation
- Cache Computations - Use lazy or compute once for expensive operations
- Microbenchmarking - Use proper tools like kotlinx.benchmark or JMH
Next Steps: Review Kotlin Testing for testing performance optimizations and Android Performance for Android-specific optimization techniques.