MongoDB Performance Tuning (WiredTiger, Indexing)
With WiredTiger as the storage engine, MongoDB by default sizes its internal cache at the larger of 50% of (RAM - 1 GB) or 256 MB. On a server with 32 GB of RAM that is about 15.5 GB. That sounds generous, but the devil is in the details: inefficient or missing indexes force collection scans that pull every document through the cache, evict the hot working set, and fall back to reading from disk page by page.
WiredTiger Cache: Configuration
# /etc/mongod.conf
storage:
  wiredTiger:
    engineConfig:
      # Explicit cache size (default: max(50% of (RAM - 1 GB), 256 MB))
      cacheSizeGB: 12 # On a 32 GB server this leaves ~20 GB for the OS, filesystem cache, and other processes
      # Journal compression
      journalCompressor: snappy # snappy is faster than zlib, with slightly lower compression
    collectionConfig:
      # Data compression algorithm
      blockCompressor: snappy # snappy balances speed and compression ratio
    indexConfig:
      # Index prefix compression
      prefixCompression: true # saves space in memory and on disk
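As a quick check of the default mentioned in the config comment, the documented rule (the larger of 50% of (RAM - 1 GB) and 256 MB) can be worked out in a few lines. This is only a sketch; `defaultCacheGB` is a hypothetical helper name, not a MongoDB API.

```javascript
// Hypothetical helper: default WiredTiger cache size in GB, following the
// documented rule max(50% of (RAM - 1 GB), 256 MB).
function defaultCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

console.log(defaultCacheGB(32)); // 15.5
console.log(defaultCacheGB(1));  // 0.25 (the 256 MB floor applies)
```

Comparing this value with an explicit `cacheSizeGB` shows how much memory a manual setting gives back to the OS and filesystem cache.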
Monitor cache status:
const cacheStats = db.serverStatus().wiredTiger.cache
printjson({
  "cache_size_MB": Math.round(cacheStats["maximum bytes configured"] / 1024 / 1024),
  "currently_in_cache_MB": Math.round(cacheStats["bytes currently in the cache"] / 1024 / 1024),
  "dirty_bytes_MB": Math.round(cacheStats["tracked dirty bytes in the cache"] / 1024 / 1024),
  "pages_read_from_disk": cacheStats["pages read into cache"],
  "pages_evicted": cacheStats["pages evicted by application threads"],
  // If this counter keeps growing, application threads are doing eviction work themselves:
  // the cache is under pressure and you may need to increase cacheSizeGB or shrink the working set
})
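The raw counters are easier to read as ratios. The sketch below turns a stats object of the same shape as `db.serverStatus().wiredTiger.cache` into fill and dirty ratios and compares them with WiredTiger's default eviction thresholds (80%/95% fill, 5%/20% dirty, as I understand the defaults; verify them for your version). `cachePressure`, the level names, and the sample numbers are assumptions for illustration.

```javascript
// Default WiredTiger eviction thresholds as fractions of the configured cache
// (assumed defaults; check your version's documentation).
const THRESHOLDS = { fillTarget: 0.80, fillTrigger: 0.95, dirtyTarget: 0.05, dirtyTrigger: 0.20 };

// `cache` has the shape of db.serverStatus().wiredTiger.cache
function cachePressure(cache) {
  const max = cache["maximum bytes configured"];
  const fill = cache["bytes currently in the cache"] / max;
  const dirty = cache["tracked dirty bytes in the cache"] / max;
  let level = "ok";
  // Above the targets, background eviction threads start working
  if (fill > THRESHOLDS.fillTarget || dirty > THRESHOLDS.dirtyTarget) level = "evicting";
  // At the triggers, application threads are pulled into eviction and latency rises
  if (fill >= THRESHOLDS.fillTrigger || dirty >= THRESHOLDS.dirtyTrigger) level = "stalling";
  return { fill, dirty, level };
}

// Made-up sample values, not real measurements
const sample = {
  "maximum bytes configured": 1000,
  "bytes currently in the cache": 850,
  "tracked dirty bytes in the cache": 30,
};
console.log(cachePressure(sample).level); // "evicting"
```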
Indexing Strategy
Indexes are the main performance tool. Without a suitable index the query planner falls back to COLLSCAN, a full scan of every document in the collection.
ESR Rule (Equality, Sort, Range) for compound indexes:
// Query: find a user's active orders, sorted by date
db.orders.find({
  user_id: ObjectId("..."), // Equality
  status: "active"          // Equality (second)
}).sort({ created_at: -1 }) // Sort
// Correct index: equality fields first, then the sort field
db.orders.createIndex(
  { user_id: 1, status: 1, created_at: -1 },
  { name: "idx_user_status_date" } // the "background" option is ignored since MongoDB 4.2
)
// WRONG order: sort field ahead of an equality field
// { user_id: 1, created_at: -1, status: 1 }: the sort can still use the index,
// but the scan walks the keys of every status for that user instead of only "active"
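To make the ESR ordering concrete, here is a minimal sketch of a checker that takes the index fields in order plus the role each field plays in the query. `followsESR` and the role labels are hypothetical names for illustration, not part of MongoDB.

```javascript
// Rank of each role in the ESR rule: Equality, then Sort, then Range
const RANK = { equality: 0, sort: 1, range: 2 };

// Returns true when the index field order never goes "backwards" in ESR terms
function followsESR(indexFields, roles) {
  let prev = 0;
  for (const field of indexFields) {
    const rank = RANK[roles[field]];
    if (rank === undefined) throw new Error(`no role for field ${field}`);
    if (rank < prev) return false;
    prev = rank;
  }
  return true;
}

const roles = { user_id: "equality", status: "equality", created_at: "sort" };
console.log(followsESR(["user_id", "status", "created_at"], roles)); // true
console.log(followsESR(["user_id", "created_at", "status"], roles)); // false
```

ESR is a guideline rather than a hard rule; `explain()` on the real query remains the final check.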
Indexes for Specific Patterns
// Multikey index for array tags
db.articles.createIndex({ tags: 1 })
// Document: { tags: ["nodejs", "mongodb", "backend"] }
// Query: db.articles.find({ tags: "mongodb" }) uses the index
// Text index for full-text search
db.articles.createIndex(
  { title: "text", body: "text" },
  { weights: { title: 10, body: 1 }, default_language: "english" }
)
db.articles.find({ $text: { $search: "indexing MongoDB" } })
// Sparse index: only documents that contain the field are indexed (saves space)
db.users.createIndex(
  { phone: 1 },
  { sparse: true, unique: true }
)
// 2dsphere for geolocation; GeoJSON coordinates are [longitude, latitude]
db.stores.createIndex({ location: "2dsphere" })
db.stores.find({
  location: {
    $near: {
      $geometry: { type: "Point", coordinates: [37.618, 55.752] },
      $maxDistance: 5000 // meters (5 km)
    }
  }
})
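A common mistake with `2dsphere` queries is swapping latitude and longitude, since GeoJSON puts longitude first. The sketch below builds the `$near` filter from named arguments so the order cannot be mixed up. `nearQuery` is a hypothetical helper, and the `location` field name is taken from the example above.

```javascript
// Builds a $near filter for a 2dsphere-indexed "location" field.
// GeoJSON order is [longitude, latitude]; $maxDistance is in meters.
function nearQuery(lat, lng, maxMeters) {
  if (lat < -90 || lat > 90) throw new RangeError("latitude must be in [-90, 90]");
  if (lng < -180 || lng > 180) throw new RangeError("longitude must be in [-180, 180]");
  return {
    location: {
      $near: {
        $geometry: { type: "Point", coordinates: [lng, lat] },
        $maxDistance: maxMeters,
      },
    },
  };
}

const filter = nearQuery(55.752, 37.618, 5000);
console.log(JSON.stringify(filter.location.$near.$geometry.coordinates)); // [37.618,55.752]
```

In mongosh the result could be passed straight to `db.stores.find(filter)`.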
Aggregation Pipeline: Optimization
// Slow: $match after $lookup
db.orders.aggregate([
  { $lookup: { from: "users", localField: "user_id", foreignField: "_id", as: "user" } },
  { $match: { "user.country": "US" } } // BAD: the lookup runs for every order before anything is filtered
])
// Fast: $match as early as possible in the pipeline
db.orders.aggregate([
  { $match: { status: "completed", created_at: { $gte: ISODate("2025-01-01") } } },
  // ^ Uses an index and filters BEFORE the lookup
  { $lookup: {
    from: "users",
    localField: "user_id",
    foreignField: "_id",
    as: "user",
    pipeline: [ // combining localField/foreignField with pipeline requires MongoDB 5.0+
      { $match: { country: "US" } },      // Filter inside the lookup
      { $project: { name: 1, email: 1 } } // Only the needed fields
    ]
  }},
  { $match: { user: { $ne: [] } } }, // Drop orders whose user did not pass the inner filter
  { $project: { _id: 1, total: 1, "user.name": 1 } }
])
// allowDiskUse for heavy aggregations (when a stage exceeds the 100 MB memory limit)
db.orders.aggregate([...], { allowDiskUse: true })
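The "filter before join" rule can be checked mechanically on a pipeline definition. The sketch below counts `$lookup` stages that appear before the first `$match`. It is a rough heuristic only (the server's optimizer can reorder some stages on its own), and `lookupsBeforeFirstMatch` is a hypothetical helper name.

```javascript
// Counts $lookup stages that run before the first $match in a pipeline array
function lookupsBeforeFirstMatch(pipeline) {
  const firstMatch = pipeline.findIndex((stage) => "$match" in stage);
  const end = firstMatch === -1 ? pipeline.length : firstMatch;
  return pipeline.slice(0, end).filter((stage) => "$lookup" in stage).length;
}

const slow = [
  { $lookup: { from: "users", localField: "user_id", foreignField: "_id", as: "user" } },
  { $match: { "user.country": "US" } },
];
const fast = [
  { $match: { status: "completed" } },
  { $lookup: { from: "users", localField: "user_id", foreignField: "_id", as: "user" } },
];
console.log(lookupsBeforeFirstMatch(slow)); // 1
console.log(lookupsBeforeFirstMatch(fast)); // 0
```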
Profiler and Query Analysis
// Enable the profiler for slow operations
db.setProfilingLevel(1, { slowms: 50 })
// Analyze results
db.system.profile.aggregate([
  { $match: { millis: { $gt: 50 } } },
  { $group: {
    _id: "$command.filter",
    count: { $sum: 1 },
    avg_ms: { $avg: "$millis" },
    max_ms: { $max: "$millis" }
  }},
  { $sort: { avg_ms: -1 } },
  { $limit: 10 }
])
// Run explain on a suspicious query
db.orders.find({ status: "pending" }).explain("allPlansExecution")
// "winningPlan": the chosen plan
// "rejectedPlans": the alternatives considered
// "executionStats.totalDocsExamined" vs "nReturned": index selectivity (a ratio near 1 is ideal)
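The explain fields listed above can be condensed into a short summary. The sketch below walks the winning plan, flags a COLLSCAN, and computes documents examined per document returned. It assumes the classic explain shape with `inputStage`/`inputStages` nesting (newer engine versions may nest the plan differently); `summarizeExplain` and the sample object are made up for illustration.

```javascript
// Collects stage names from a plan tree (classic explain shape assumed)
function collectStages(plan, out = []) {
  if (!plan) return out;
  out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

function summarizeExplain(explain) {
  const stats = explain.executionStats;
  const stages = collectStages(explain.queryPlanner.winningPlan);
  return {
    stages,
    collScan: stages.includes("COLLSCAN"),
    // Documents examined per returned document; close to 1 means a selective index
    docsPerResult: stats.nReturned > 0
      ? stats.totalDocsExamined / stats.nReturned
      : stats.totalDocsExamined,
  };
}

// Hand-written sample in the shape of explain("executionStats") output
const sampleExplain = {
  queryPlanner: { winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } } },
  executionStats: { nReturned: 20, totalDocsExamined: 2000 },
};
console.log(summarizeExplain(sampleExplain).docsPerResult); // 100
```

A ratio of 100 with an IXSCAN, as in the sample, usually points to an index that matches the query only partially.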
Aggregation: $facet for Parallel Pipelines
// Instead of multiple separate queries, use one $facet
db.products.aggregate([
  { $match: { category: "electronics", price: { $gte: 1000 } } },
  { $facet: {
    "total_count": [{ $count: "count" }],
    "by_brand": [{ $group: { _id: "$brand", count: { $sum: 1 } } }, { $sort: { count: -1 } }],
    "price_stats": [{ $group: { _id: null, min: { $min: "$price" }, max: { $max: "$price" }, avg: { $avg: "$price" } } }],
    "paginated": [{ $sort: { price: 1 } }, { $skip: 0 }, { $limit: 20 }]
  }}
])
// One round-trip instead of four queries; the whole $facet output must fit in a single 16 MB document
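`$facet` always emits exactly one document, and sub-pipelines such as `$count` or `$group` produce an empty array when nothing matched. A small unpacking step avoids `undefined` errors in application code. The sketch uses the facet names from the pipeline above; `unpackFacet` is a hypothetical helper.

```javascript
// Unpacks the single document produced by the $facet stage above
function unpackFacet(results) {
  const [facets] = results;
  return {
    // $count emits no document on empty input, so fall back to 0
    total: facets.total_count.length ? facets.total_count[0].count : 0,
    byBrand: facets.by_brand,
    // $group with _id: null also emits nothing on empty input
    priceStats: facets.price_stats[0] ?? null,
    items: facets.paginated,
  };
}

// Shape of the result when the $match stage found nothing
const empty = [{ total_count: [], by_brand: [], price_stats: [], paginated: [] }];
console.log(unpackFacet(empty).total); // 0
```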
Read Preference on Replica Set
// Analytical queries go to a secondary to take load off the primary (Node.js driver)
const { MongoClient, ReadPreference } = require("mongodb")
const client = new MongoClient(uri, {
  readPreference: ReadPreference.SECONDARY_PREFERRED
})
// Or for a specific query (mongosh); secondary reads may return slightly stale data
db.orders.find({ status: "completed" })
  .readPref("secondaryPreferred")
  .hint({ status: 1, created_at: -1 })
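Reads from secondaries can lag behind the primary. One way to bound that lag is the `maxStalenessSeconds` connection string option, which has a minimum of 90 seconds. The sketch below appends both options to a base URI; `analyticsUri`, the host names, and the 120-second default are assumptions for illustration.

```javascript
// Appends read preference options to a MongoDB connection string.
// The base URI is expected to include the "/" before any "?" options.
function analyticsUri(baseUri, maxStalenessSeconds = 120) {
  if (maxStalenessSeconds < 90) {
    throw new RangeError("maxStalenessSeconds must be at least 90");
  }
  const sep = baseUri.includes("?") ? "&" : "?";
  return `${baseUri}${sep}readPreference=secondaryPreferred&maxStalenessSeconds=${maxStalenessSeconds}`;
}

// Hypothetical hosts and database name
console.log(analyticsUri("mongodb://db1.example.com,db2.example.com/mydb?replicaSet=rs0"));
```

Secondaries that lag more than the given bound are skipped during server selection, so stale analytics reads become an explicit, tunable trade-off.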
Sharding: When and How
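Sharding adds operational complexity: you need mongos routers, config servers, and a carefully chosen shard key. As a rough guide, it makes sense when:
- Data grows beyond roughly 100–200 GB and no longer fits on one server
- Writes exceed about 10,000 per second, more than one primary can handle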
sh.enableSharding("mydb")
// Shard key: aim for high cardinality; a monotonically increasing key is bad
// Good: hashed user_id (even distribution)
sh.shardCollection("mydb.events", { user_id: "hashed" })
// Bad: monotonically increasing _id (all new writes go to one shard)
// sh.shardCollection("mydb.events", { _id: 1 }) // DON'T DO THIS
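The difference between a ranged key on a monotonically increasing value and a hashed key can be shown with a toy simulation. This is a sketch only: FNV-1a stands in for MongoDB's real hash function, the chunk boundaries are made up, and chunk splitting and balancing are ignored.

```javascript
// FNV-1a 32-bit hash, used purely as a stand-in for MongoDB's hashed-key function
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Ranged sharding: boundaries are the upper bounds of all chunks but the last
function rangedShard(key, boundaries) {
  const idx = boundaries.findIndex((b) => key < b);
  return idx === -1 ? boundaries.length : idx;
}

function hashedShard(key, shardCount) {
  return fnv1a(String(key)) % shardCount;
}

// Counts how many keys each shard receives
function distribute(keys, pick, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const k of keys) counts[pick(k)]++;
  return counts;
}

// 1000 new, monotonically increasing keys, all above the existing chunk ranges
const newKeys = Array.from({ length: 1000 }, (_, i) => 1000 + i);
const ranged = distribute(newKeys, (k) => rangedShard(k, [250, 500, 750]), 4);
const hashed = distribute(newKeys, (k) => hashedShard(k, 4), 4);
console.log(ranged); // every insert lands on the last shard
console.log(hashed); // inserts are spread over all four shards
```

The trade-off is that a hashed key spreads writes but turns range queries on that field into scatter-gather operations across shards.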







