Kotlin Performance Optimization
Overview
This guide covers performance optimization techniques specific to Kotlin, including understanding inline functions, optimizing collections with sequences, minimizing allocations, avoiding common performance pitfalls, and profiling Kotlin code. While premature optimization should be avoided, understanding performance characteristics helps you write efficient code from the start and know where to focus optimization efforts when needed.
Core Principles
- Measure First: Profile before optimizing - intuition is often wrong
- Inline Functions: Eliminate lambda overhead for higher-order functions
- Sequences: Lazy evaluation for large collection operations
- Avoid Allocations: Minimize object creation in hot paths
- Immutability Trade-offs: Balance safety and performance
- Data Classes: Understand generated code overhead
- Coroutines: Prefer coroutines over threads for scalability
- Smart Casts: Leverage type inference to avoid unnecessary casts
- Collection Choice: Use appropriate collection types
- Benchmarking: Use proper microbenchmarking tools
Inline Functions
Inline functions eliminate the overhead of lambda objects and function calls by copying function code directly to the call site. This is particularly valuable for higher-order functions (functions that take other functions as parameters) used frequently in hot code paths.
Understanding Inline Functions
When you pass a lambda to a regular higher-order function, Kotlin creates an object to hold the lambda and invokes it polymorphically. This incurs allocation overhead and virtual call overhead. For small functions called frequently, this overhead can be significant.
Marking a function as inline instructs the compiler to copy both the function body and the lambda argument to the call site, eliminating both allocations and call overhead. This is why many standard library functions like let, apply, run, forEach, and filter are inline.
When to inline: Inline functions with lambda parameters used in performance-critical code. Don't inline large functions - you'll increase bytecode size. Don't inline functions that store lambda references (the lambda must be callable at the inlined call site). For general function design, see our Kotlin general guide.
When to Use Inline
// GOOD: Inline for frequently-used utility functions
inline fun <T> measureTime(block: () -> T): T {
    val start = System.nanoTime()
    val result = block()
    val elapsed = System.nanoTime() - start
    println("Execution took ${elapsed}ns")
    return result
}
// GOOD: Inline with reified type parameters
inline fun <reified T> isInstance(value: Any): Boolean {
    return value is T // Reified type available at runtime
}
// Usage - no lambda allocation
val result = measureTime {
    // Some computation
    complexCalculation()
}
// BAD: Inlining large function - increases bytecode size
inline fun largeFunction() {
    // 100+ lines of code
    // Inlining copies all this to every call site
}
// BAD: Cannot be declared inline - stores a lambda reference
val savedBlocks = mutableListOf<() -> Unit>()
fun storeAndCallLater(block: () -> Unit) {
    savedBlocks.add(block) // Lambda object must exist
    block()
}
Inline functions enable reified type parameters - generic types that are available at runtime. Normally, generics are erased, but inlined functions can preserve type information because the generic type is known at the call site.
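For instance, a reified type parameter lets a helper filter a heterogeneous list by element type without passing a Class object. A minimal sketch (filterByType is a hypothetical helper mirroring the stdlib's filterIsInstance):

```kotlin
// Hypothetical helper: T survives erasure because the body is inlined at the
// call site, where the concrete type argument is known
inline fun <reified T> List<Any>.filterByType(): List<T> =
    this.mapNotNull { it as? T } // 'as? T' is only legal because T is reified

fun main() {
    val mixed: List<Any> = listOf(1, "two", 3.0, "four")
    val strings = mixed.filterByType<String>()
    println(strings) // [two, four]
}
```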
Noinline and Crossinline
Sometimes you want to inline a function but not inline all its lambda parameters. Use noinline for lambda parameters that can't be inlined (e.g., stored or passed to non-inline functions). Use crossinline for lambdas that must not contain non-local returns.
// GOOD: Mix of inline and noinline parameters
inline fun processData(
    data: List<Int>,
    filter: (Int) -> Boolean, // Inlined
    noinline logger: (String) -> Unit // Not inlined - might be stored
) {
    data.filter(filter).forEach { value ->
        logger("Processing $value")
    }
}
// GOOD: Crossinline prevents non-local returns
inline fun runSafely(crossinline block: () -> Unit) {
    try {
        block() // Cannot contain 'return' that would exit runSafely's caller
    } catch (e: Exception) {
        println("Error: ${e.message}")
    }
}
// BAD: Non-local return in lambda (without crossinline)
inline fun forEach(list: List<Int>, action: (Int) -> Unit) {
    for (item in list) {
        action(item) // If action contains 'return', it exits forEach's caller
    }
}
fun processItems(items: List<Int>) {
    forEach(items) { item ->
        if (item < 0) return // Returns from processItems, not just the lambda!
        println(item)
    }
    println("Done") // Might not print if negative item found
}
Sequences for Lazy Evaluation
Sequences provide lazy evaluation for collection operations. Unlike regular collection functions (which are eager), sequences defer computation until a terminal operation is called. This eliminates intermediate collections and improves performance for large data sets with multiple operations.
When to Use Sequences
Collection functions like map, filter, and flatMap are eager - each operation creates a new intermediate collection. For a chain like list.map {...}.filter {...}.take(10), map creates a full list, then filter creates another list, then take creates a final list. If the list has 1 million elements but you only need 10, you've wasted significant work.
Sequences evaluate elements one-by-one through the entire pipeline. Each element passes through map, then filter, then take. No intermediate collections are created. Evaluation stops once take has collected 10 elements, so the vast majority of a million-element list is never touched (exactly how many elements are pulled depends on how many pass the filter).
Use sequences for large collections with multiple transformations, or when you might not need all results (e.g., take, first, any). Don't use sequences for small collections or single operations - the lazy evaluation overhead isn't worth it. For collection operations, see our Kotlin collections guide.
Sequences vs Collections
// BAD: Eager evaluation - creates intermediate lists
fun processLargeList(numbers: List<Int>): List<Int> {
    return numbers
        .map { it * 2 } // Creates List<Int> with 1M elements
        .filter { it > 100 } // Creates another List<Int>
        .take(10) // Creates final List<Int> with 10 elements
    // Processed 1M elements through map, 1M through filter
}
// GOOD: Lazy evaluation with sequence
fun processLargeListEfficiently(numbers: List<Int>): List<Int> {
    return numbers
        .asSequence() // Convert to sequence - no cost
        .map { it * 2 } // No intermediate list
        .filter { it > 100 } // No intermediate list
        .take(10) // Stops after finding 10 elements
        .toList() // Terminal operation - creates single result list
    // Only processes elements until 10 results found
}
// GOOD: Sequence with short-circuiting
fun findFirstEligible(users: List<User>): User? {
    return users
        .asSequence()
        .filter { it.isActive }
        .map { it.toDetailedProfile() } // Expensive operation
        .firstOrNull { it.meetsRequirements() }
    // Stops as soon as first match is found
}
// BAD: Using sequence for small collections
fun processSmallList(numbers: List<Int>): List<Int> {
    return numbers // Only 10 elements
        .asSequence()
        .map { it * 2 }
        .toList()
    // Overhead of sequence isn't worth it
}
Sequences shine when combined with short-circuiting operations (first, take, any, all). The lazy evaluation stops processing once the condition is met. For small collections (< 100 elements), the sequence overhead likely exceeds any benefit.
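The laziness is easy to observe directly: count how many times the mapping lambda actually runs in each style. A minimal sketch:

```kotlin
// Count map-lambda invocations for the eager (List) pipeline
fun eagerMapCalls(numbers: List<Int>, wanted: Int): Int {
    var calls = 0
    numbers.map { calls++; it * 2 }.take(wanted) // map touches every element
    return calls
}

// Count map-lambda invocations for the lazy (Sequence) pipeline
fun lazyMapCalls(numbers: List<Int>, wanted: Int): Int {
    var calls = 0
    numbers.asSequence().map { calls++; it * 2 }.take(wanted).toList() // map runs on demand
    return calls
}

fun main() {
    val numbers = (1..1_000_000).toList()
    println(eagerMapCalls(numbers, 5)) // 1000000 - the whole list was mapped
    println(lazyMapCalls(numbers, 5))  // 5 - only the elements take() pulled
}
```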
generateSequence for Infinite Sequences
generateSequence creates sequences from a generator function, enabling infinite or computed sequences. This is powerful for iterative algorithms, pagination, or any scenario where you generate values on demand.
// GOOD: Fibonacci sequence generator
val fibonacci = generateSequence(Pair(0, 1)) { (a, b) ->
    Pair(b, a + b)
}.map { it.first }
// Take first 10 Fibonacci numbers
val first10 = fibonacci.take(10).toList()
// [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
// GOOD: Pagination with sequence (fetchPage is a blocking call here -
// the sequence builder cannot call arbitrary suspend functions; use Flow for those)
fun fetchAllPages(): Sequence<Page> = sequence {
    var nextToken: String? = null
    do {
        val response = api.fetchPage(nextToken)
        yield(response.page)
        nextToken = response.nextToken
    } while (nextToken != null)
}
// Usage - can stop early
val firstThreePages = fetchAllPages().take(3).toList()
// GOOD: Tree traversal with sequence
fun Node.traverse(): Sequence<Node> = sequence {
    yield(this@traverse)
    for (child in children) {
        yieldAll(child.traverse())
    }
}
// Find first node matching condition without traversing entire tree
val foundNode = rootNode.traverse().firstOrNull { it.matches(criteria) }
Memory and Allocation Optimization
Excessive object allocation triggers garbage collection, which pauses application execution. Minimizing allocations in hot paths improves performance and reduces GC pressure.
Avoiding Unnecessary Allocations
// BAD: Creates new list on every call
fun getActiveUsers(users: List<User>): List<User> {
    return users.filter { it.isActive } // New list allocated
}
// GOOD: Return sequence to defer allocation
fun getActiveUsers(users: List<User>): Sequence<User> {
    return users.asSequence().filter { it.isActive } // No allocation until consumed
}
// BAD: Allocates string repeatedly in loop
fun formatNumbers(numbers: List<Int>): String {
    var result = ""
    for (num in numbers) {
        result += "$num, " // Creates new string each iteration
    }
    return result
}
// GOOD: StringBuilder for efficient string building
fun formatNumbers(numbers: List<Int>): String {
    return buildString {
        numbers.forEach { append("$it, ") }
    } // StringBuilder grows in place - no per-iteration string allocation
}
// BAD: Boxing primitives in collection
val numbers: List<Int> = listOf(1, 2, 3, 4, 5) // Each Int is boxed (heap object)
// GOOD: Use primitive array when performance matters
val numbers = intArrayOf(1, 2, 3, 4, 5) // Primitive array, no boxing
Every object allocation requires heap space and eventual GC. In hot loops processing thousands of elements, small allocations add up. Use primitives instead of boxed types, reuse objects when safe, and prefer sequences or inline functions to eliminate lambda allocations.
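The boxing difference is visible even in a trivial hot loop: iterating a List<Int> unboxes an Integer object per element, while an IntArray works on plain ints. A sketch:

```kotlin
// Summing a boxed list: each element is an Integer object on the heap
fun sumBoxed(values: List<Int>): Long {
    var total = 0L
    for (v in values) total += v // unboxing on every iteration
    return total
}

// Summing a primitive array: plain ints, no per-element objects
fun sumPrimitive(values: IntArray): Long {
    var total = 0L
    for (v in values) total += v // no boxing, cache-friendly layout
    return total
}

fun main() {
    val boxed = List(1000) { it }
    val primitive = IntArray(1000) { it }
    println(sumBoxed(boxed) == sumPrimitive(primitive)) // true - same result, different cost
}
```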
Data Class Overhead
Data classes generate equals, hashCode, toString, copy, and componentN functions. These are convenient but have performance implications. equals and hashCode iterate through all properties. For data classes with many properties or used in large collections, this can be expensive.
// GOOD: Data class for DTOs - convenience worth the cost
data class User(
    val id: String,
    val name: String,
    val email: String,
    val age: Int
)
// BAD: Large data class used as map key
data class ComplexKey(
    val field1: String,
    val field2: String,
    // ... 20 more fields
)
val cache = mutableMapOf<ComplexKey, Value>()
cache[complexKey] // hashCode iterates all 22 fields!
// GOOD: Custom hashCode focusing on key fields
data class OptimizedKey(
    val id: String, // Primary identifier
    val type: String, // Secondary identifier
    val field3: String,
    // ... more fields
) {
    // Only hash the key fields
    override fun hashCode(): Int = id.hashCode() * 31 + type.hashCode()
    override fun equals(other: Any?): Boolean {
        if (this === other) return true
        if (other !is OptimizedKey) return false
        return id == other.id && type == other.type
    }
}
For data classes used as map keys or in large sets, consider custom equals/hashCode implementations that focus on key fields. For data class patterns, see our Kotlin general guide.
Collection Optimization
Choosing the right collection type impacts both performance and memory usage. Kotlin provides various collection types optimized for different access patterns.
Collection Type Selection
// GOOD: ArrayList for sequential access and dynamic sizing
val items = ArrayList<Int>(1000) // Pass initial capacity if known (named arguments don't work with Java constructors)
items.add(1)
items.add(2)
// GOOD: LinkedList for frequent insertions/removals at both ends
val queue = LinkedList<Task>()
queue.addFirst(urgentTask)
queue.addLast(normalTask)
// GOOD: HashSet for fast lookup
val uniqueIds = HashSet<String>()
if (id in uniqueIds) { ... } // O(1) lookup
// GOOD: HashMap for key-value lookup
val userCache = HashMap<String, User>(10000)
val user = userCache[userId] // O(1) lookup
// BAD: List for membership testing
val allowedIds = listOf("id1", "id2", "id3", /* ... hundreds more */)
if (userId in allowedIds) { ... } // O(n) linear search!
// GOOD: Set for membership testing
val allowedIds = setOf("id1", "id2", "id3", /* ... hundreds more */)
if (userId in allowedIds) { ... } // O(1) hash lookup
// GOOD: Specify initial capacity to avoid resizing
val largeMap = HashMap<String, User>(10000)
// Prevents multiple resize operations as map grows
- ArrayList: Constant-time random access, amortized constant-time append. Use for indexed access and iteration.
- LinkedList: Constant-time insertions/removals at ends. Use for queues or frequent modifications.
- HashSet/HashMap: Constant-time lookup, insertion, removal. Use when you need fast membership testing or key-value lookup.
- TreeSet/TreeMap: Sorted, log-time operations. Use when you need ordering.
For collection patterns, see our Kotlin general guide.
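One caveat on the LinkedList recommendation: in pure Kotlin code, kotlin.collections.ArrayDeque (available since Kotlin 1.4) is usually the better double-ended queue thanks to contiguous storage and cache locality. A small sketch:

```kotlin
fun main() {
    val deque = ArrayDeque<String>() // kotlin.collections.ArrayDeque, not java.util
    deque.addFirst("urgent")         // amortized O(1) at either end
    deque.addLast("normal")
    println(deque.removeFirst()) // urgent
    println(deque.removeLast())  // normal
}
```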
Immutable vs Mutable Collections
Kotlin distinguishes read-only (List, Set, Map) and mutable (MutableList, MutableSet, MutableMap) interfaces. Read-only collections don't guarantee immutability - they just prevent modification through that reference. For performance, prefer mutable collections internally and expose read-only views.
// GOOD: Mutable internally, read-only externally
class UserRepository {
    private val users = mutableListOf<User>() // Mutable for internal updates
    fun addUser(user: User) {
        users.add(user)
    }
    fun getUsers(): List<User> = users // Read-only view
}
// BAD: Copying on every read
class UserRepository {
    private val users = mutableListOf<User>()
    fun getUsers(): List<User> = users.toList() // Creates new list every call!
}
// GOOD: Persistent collections for true immutability (kotlinx.collections.immutable)
import kotlinx.collections.immutable.*
val list1 = persistentListOf(1, 2, 3)
val list2 = list1.add(4) // Returns new list, list1 unchanged
// Structural sharing - efficient updates without full copying
Read-only collections provide API safety without copying overhead. For truly immutable collections with efficient updates, use kotlinx.collections.immutable, which provides persistent data structures with structural sharing.
Coroutine Performance
Coroutines are more efficient than threads for concurrent tasks. A thread requires significant memory (typically 1MB stack) and OS-level context switching. Thousands of active threads cause performance issues. Coroutines are lightweight - you can run hundreds of thousands concurrently.
Coroutines vs Threads
// BAD: Creating threads for many concurrent tasks
fun processWithThreads(items: List<Item>) {
    val threads = items.map { item ->
        Thread {
            processItem(item)
        }.apply { start() }
    }
    threads.forEach { it.join() }
    // If items has 10,000 elements, creates 10,000 threads - will crash or be very slow
}
// GOOD: Using coroutines for many concurrent tasks
suspend fun processWithCoroutines(items: List<Item>) = coroutineScope {
    items.map { item ->
        launch {
            processItem(item)
        }
    }.joinAll() // coroutineScope would also wait for all children implicitly
    // 10,000 coroutines - no problem, they share a small thread pool
}
// GOOD: Limit concurrency with Semaphore (kotlinx.coroutines.sync)
suspend fun processWithLimitedConcurrency(items: List<Item>) {
    val semaphore = Semaphore(100) // Max 100 concurrent operations
    coroutineScope {
        items.map { item ->
            launch {
                semaphore.withPermit {
                    processItem(item)
                }
            }
        }.joinAll()
    }
}
Each coroutine requires only a small object (typically < 1KB). They're scheduled on a shared thread pool (sized to CPU cores for CPU-bound work). Context switching is fast because it's user-space, not kernel-level. For coroutine fundamentals, see our Kotlin coroutines guide.
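The lightweight claim is easy to verify: launching 100,000 coroutines that each suspend briefly finishes in roughly the time of a single delay, while 100,000 threads would exhaust memory. A sketch using kotlinx.coroutines:

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Launch `count` coroutines that each suspend briefly; returns how many completed
suspend fun runMany(count: Int): Int = coroutineScope {
    val completed = AtomicInteger(0)
    List(count) {
        launch {
            delay(10) // suspends without blocking a thread
            completed.incrementAndGet()
        }
    }.joinAll()
    completed.get()
}

fun main() = runBlocking {
    println("Completed: ${runMany(100_000)}") // Completed: 100000
}
```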
Dispatcher Selection
Choose the right dispatcher for your workload. Dispatchers.Default for CPU-bound work uses a thread pool sized to CPU cores. Dispatchers.IO for I/O operations uses a larger pool (default 64 threads) optimized for blocking I/O. Using the wrong dispatcher wastes resources or limits throughput.
// GOOD: IO dispatcher for network calls
suspend fun fetchUsers(): List<User> = withContext(Dispatchers.IO) {
    api.getUsers() // Blocking I/O
}
// GOOD: Default dispatcher for CPU-intensive work
suspend fun processData(data: ByteArray): ProcessedData = withContext(Dispatchers.Default) {
    // Heavy computation
    complexAlgorithm(data)
}
// BAD: Default dispatcher for blocking I/O
suspend fun fetchUsers(): List<User> = withContext(Dispatchers.Default) {
    api.getUsers() // Blocks a thread from CPU pool!
}
For dispatcher selection strategies, see our Kotlin concurrency guide.
Profiling and Benchmarking
Always measure before optimizing. Use profilers to identify actual bottlenecks. Use proper benchmarking tools to verify optimizations actually help.
Profiling Tools
Android Profiler (Android Studio): CPU, memory, network profiling for Android apps. Identifies hot methods and memory leaks.
IntelliJ IDEA Profiler: CPU and memory profiling for JVM applications. Flame graphs show where time is spent.
Java Flight Recorder (JFR): Low-overhead profiling for production JVM apps. Captures events over time for offline analysis.
VisualVM: Free standalone profiler. CPU, memory, and thread profiling.
// GOOD: Simple timing for quick checks
// (named measureMillis to avoid shadowing kotlin.system.measureTimeMillis)
inline fun <T> measureMillis(block: () -> T): Pair<T, Long> {
    val start = System.currentTimeMillis()
    val result = block()
    val elapsed = System.currentTimeMillis() - start
    return result to elapsed
}
val (result, time) = measureMillis {
    processLargeDataset()
}
println("Took ${time}ms")
// For accurate benchmarking, use kotlinx.benchmark or JMH
Microbenchmarking with kotlinx.benchmark
For accurate microbenchmarks, use kotlinx.benchmark (multiplatform) or JMH (JVM only). These handle warmup, JIT compilation, and statistical analysis.
// GOOD: Proper microbenchmark with kotlinx.benchmark
@State(Scope.Benchmark)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
class CollectionBenchmark {
    private val data = (1..10000).toList()

    @Benchmark
    fun filterWithList(): List<Int> {
        return data.filter { it % 2 == 0 }
    }

    @Benchmark
    fun filterWithSequence(): List<Int> {
        return data.asSequence().filter { it % 2 == 0 }.toList()
    }
}
Microbenchmarks are tricky - JIT compilation, GC, and caching affect results. Always warm up the JVM, run multiple iterations, and use statistical analysis. Don't trust single-run timing.
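For intuition, the idea behind those tools can be sketched by hand: warm up first, time many runs, and report a robust statistic. This is not a substitute for kotlinx.benchmark or JMH, just an illustration of the underlying approach:

```kotlin
import kotlin.system.measureNanoTime

// Warm up, run many iterations, return the median time in nanoseconds.
// The median is more robust than the mean against GC pauses and JIT spikes.
fun benchmark(warmup: Int = 5, runs: Int = 20, block: () -> Unit): Long {
    repeat(warmup) { block() } // let the JIT compile the hot path first
    val timings = LongArray(runs) { measureNanoTime(block) }
    timings.sort()
    return timings[runs / 2]
}

fun main() {
    val data = (1..10_000).toList()
    val medianNs = benchmark { data.filter { it % 2 == 0 } }
    println("median: ${medianNs}ns")
}
```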
Common Performance Pitfalls
Avoid String Concatenation in Loops
// BAD: Creates new string each iteration
// (named concatenate to avoid shadowing the stdlib buildString function)
fun concatenate(items: List<String>): String {
    var result = ""
    for (item in items) {
        result += item // Allocates new string
    }
    return result
}
// GOOD: Use StringBuilder via buildString
fun concatenate(items: List<String>): String = buildString {
    items.forEach { append(it) }
}
Minimize Lambda Allocations in Hot Paths
// BAD: Capturing lambda allocated on every iteration
fun processItems(items: List<Item>) {
    for (item in items) {
        item.process { result -> // New lambda object if it captures state
            handleResult(result) // (non-capturing lambdas are cached as singletons)
        }
    }
}
// GOOD: Extract lambda to reuse or use inline function
val handler: (Result) -> Unit = { result -> handleResult(result) }
fun processItems(items: List<Item>) {
    for (item in items) {
        item.process(handler) // Reuse same lambda
    }
}
Cache Expensive Computations
// BAD: Recomputes on every access
class UserProfile(val user: User) {
    val displayName: String
        get() = "${user.firstName} ${user.lastName}".trim() // Recomputed each time
}
// GOOD: Compute once
class UserProfile(val user: User) {
    val displayName: String = "${user.firstName} ${user.lastName}".trim()
}
// GOOD: Lazy initialization for expensive computation
class UserProfile(val user: User) {
    val detailedInfo: String by lazy {
        // Expensive computation
        computeDetailedInfo(user)
    }
}
Further Reading
General Performance Concepts
- Performance Overview - Performance strategy and principles
- Performance Optimization - Cross-language optimization techniques
- Performance Testing - Load testing strategies
Internal Documentation
- Kotlin General - Kotlin language features
- Kotlin Concurrency - Concurrency patterns
- Java Performance - JVM performance considerations
- Android Performance - Android-specific optimization
Summary
Key Takeaways
- Measure First - Profile before optimizing
- Inline Functions - Eliminate lambda overhead in hot paths
- Sequences - Lazy evaluation for large collections with multiple operations
- Avoid Allocations - Minimize object creation in performance-critical code
- Collection Choice - Use HashSet/HashMap for lookup, ArrayList for iteration
- Coroutines - More scalable than threads for concurrent work
- Dispatcher Selection - IO for I/O, Default for CPU work
- String Building - Use StringBuilder or buildString, not concatenation
- Cache Computations - Use lazy or compute once for expensive operations
- Microbenchmarking - Use proper tools like kotlinx.benchmark or JMH
Next Steps: Review Kotlin Testing for testing performance optimizations and Android Performance for Android-specific optimization techniques.