Best practices for performance

Use this information to apply best practices to your Databases for MongoDB deployment running on IBM Cloud.

Performance troubleshooting flowchart

Use the flowchart to decide how to troubleshoot a performance issue and which steps to take next.

┌─────────────────────────────────┐
│   Performance issue detected    │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│  Check IBM Cloud Monitoring     │
│  - CPU > 80%?                   │
│  - Memory > 80%?                │
│  - Disk latency high?           │
└────────────┬────────────────────┘
             │
        ┌────┴────┐
        │  YES    │
        ▼         │
┌──────────────┐  │
│ Scale        │  │
│ resources    │  │
└──────────────┘  │
                  │ NO
                  ▼
        ┌─────────────────────┐
        │ Check slow queries  │
        │ db.system.profile   │
        └─────────┬───────────┘
                  │
             ┌────┴────┐
             │  Found? │
             ▼         │
        ┌───────────┐  │
        │ Optimize  │  │
        │ queries   │  │
        │ & indexes │  │
        └───────────┘  │
                       │ NO
                       ▼
              ┌────────────────┐
              │ Check locks    │
              │ currentOp()    │
              └────────┬───────┘
                       │
                  ┌────┴────┐
                  │ Locked? │
                  ▼         │
             ┌─────────┐    │
             │ Kill or │    │
             │ optimize│    │
             └─────────┘    │
                            │ NO
                            ▼
                   ┌────────────────┐
                   │ Check cache    │
                   │ hit ratio      │
                   └────────┬───────┘
                            │
                       ┌────┴────┐
                       │ < 95%?  │
                       ▼         │
                  ┌─────────┐    │
                  │ Scale   │    │
                  │ memory  │    │
                  └─────────┘    │
                                 │ NO
                                 ▼
                        ┌────────────────┐
                        │ Check          │
                        │ replication    │
                        └────────┬───────┘
                                 │
                            ┌────┴────┐
                            │ Lagging?│
                            ▼         │
                       ┌─────────┐    │
                       │ Scale   │    │
                       │ or fix  │    │
                       └─────────┘    │
                                      │ NO
                                      ▼
                             ┌────────────────┐
                             │ Contact IBM    │
                             │ Support        │
                             └────────────────┘
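
The decision order in the flowchart can be sketched as a small function. This is an illustrative sketch, not IBM tooling: the field names on the `m` object (`cpuPercent`, `memoryPercent`, `diskLatencyHigh`, and so on) are hypothetical and would be filled from your own monitoring data. The cache hit ratio helper assumes the WiredTiger counters `pages requested from the cache` and `pages read into cache` that `db.serverStatus()` reports under `wiredTiger.cache`; whether your database user is permitted to run that command depends on its role.

```javascript
// Cache hit ratio from the WiredTiger cache counters in db.serverStatus().
// Returns a fraction between 0 and 1, or null when no pages were requested.
function cacheHitRatio(cacheStats) {
  const requested = Number(cacheStats["pages requested from the cache"]);
  const readIn = Number(cacheStats["pages read into cache"]);
  if (!requested) return null;
  return 1 - readIn / requested;
}

// Walks the flowchart top to bottom and returns the first action that applies.
// The shape of `m` is hypothetical; populate it from your monitoring data.
function nextStep(m) {
  if (m.cpuPercent > 80 || m.memoryPercent > 80 || m.diskLatencyHigh) {
    return "Scale resources";
  }
  if (m.slowQueriesFound) return "Optimize queries and indexes";
  if (m.lockedOperations) return "Kill or optimize blocking operations";
  if (m.cacheHitRatio !== null && m.cacheHitRatio < 0.95) return "Scale memory";
  if (m.replicationLagging) return "Scale or fix replication";
  return "Contact IBM Support";
}
```

Returning only the first matching action mirrors the flowchart, which resolves resource saturation before looking at queries, locks, cache, and replication.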

Common anti-patterns

Avoid these common mistakes that lead to performance issues.

Query anti-patterns

1. Missing indexes

Problem:

// No index on 'email' field
db.users.find({ email: "user@example.com" })

Solution:

// Create index
db.users.createIndex({ email: 1 })
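
To confirm that a query stopped scanning the whole collection, you can inspect its `explain()` output. The helper below is a sketch that walks a winning plan and looks for a `COLLSCAN` stage; the two sample plans are hand-written stand-ins, not captured output. On server versions that use the slot-based execution engine, the plan may be nested under `winningPlan.queryPlan`, so adjust the starting point accordingly.

```javascript
// Returns true if any stage in an explain() winning plan is a collection scan.
function hasCollectionScan(stage) {
  if (!stage) return false;
  if (stage.stage === "COLLSCAN") return true;
  // Plans nest through inputStage (single child) or inputStages (several children).
  const children = [stage.inputStage, ...(stage.inputStages || [])];
  return children.some(hasCollectionScan);
}

// Hand-written sample plans (illustrative only).
const planBefore = { stage: "COLLSCAN", filter: { email: { $eq: "user@example.com" } } };
const planAfter = { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "email_1" } };

console.log(hasCollectionScan(planBefore), hasCollectionScan(planAfter)); // true false
```

In mongosh, the plan to pass in would come from `db.users.find({ email: "user@example.com" }).explain().queryPlanner.winningPlan`.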

2. Inefficient regex queries

Problem:

// Unanchored, case-insensitive regex: cannot use an index efficiently
db.users.find({ name: /john/i })

Solution:

// Use a text index (matches whole words, not substrings) or an exact match
db.users.createIndex({ name: "text" })
db.users.find({ $text: { $search: "john" } })
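
When you need "starts with" matching rather than word search, a case-sensitive regex anchored at the start of the string can use a regular index on the field. The sketch below builds such a filter and escapes user input so that it is matched literally. The helper names are hypothetical, and the example assumes an index such as `{ name: 1 }` exists.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a case-sensitive, left-anchored filter for one field.
function prefixFilter(field, prefix) {
  return { [field]: new RegExp("^" + escapeRegex(prefix)) };
}

const nameFilter = prefixFilter("name", "John");
console.log(nameFilter.name.source); // ^John
```

The resulting object can be passed directly to `find()`, for example `db.users.find(prefixFilter("name", "John"))`.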

3. Large skip() operations

Problem:

// Skipping thousands of documents
db.collection.find().skip(10000).limit(10)

Solution:

// Use a range query on an indexed field, sorted so that pages come back in a stable order
db.collection.find({ _id: { $gt: lastSeenId } }).sort({ _id: 1 }).limit(10)
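
The range-based approach is often called keyset pagination: each request carries the last `_id` it saw instead of an offset. The sketch below builds the query arguments and demonstrates the paging behavior against an in-memory array, so `runQuery` is only a stand-in for a real collection and both function names are hypothetical.

```javascript
// Build the filter, sort, and limit for the page that follows lastSeenId.
// Pass null for the first page.
function pageQuery(lastSeenId, pageSize) {
  return {
    filter: lastSeenId == null ? {} : { _id: { $gt: lastSeenId } },
    sort: { _id: 1 },
    limit: pageSize,
  };
}

// In-memory stand-in for find(filter).sort(sort).limit(limit).
function runQuery(docs, q) {
  const min = q.filter._id ? q.filter._id.$gt : -Infinity;
  return docs
    .filter(d => d._id > min)
    .sort((a, b) => a._id - b._id)
    .slice(0, q.limit);
}

const sampleDocs = [5, 1, 4, 2, 3].map(n => ({ _id: n }));
const page1 = runQuery(sampleDocs, pageQuery(null, 2));
const page2 = runQuery(sampleDocs, pageQuery(page1[page1.length - 1]._id, 2));
// page1 holds _id 1 and 2; page2 holds _id 3 and 4.
```

Because every page starts from an indexed lower bound, the cost of fetching a page does not grow with how far into the result set the reader is.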

4. Selecting unnecessary fields

Problem:

// Fetching entire documents
db.users.find({ status: "active" })

Solution:

// Use projection
db.users.find({ status: "active" }, { name: 1, email: 1 })

5. Inefficient aggregation pipelines

Problem:

// $match after $lookup
db.orders.aggregate([
  { $lookup: { ... } },
  { $match: { status: "completed" } }
])

Solution:

// $match first to reduce documents
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $lookup: { ... } }
])
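
A simple lint-style check can flag pipelines that filter only after a join. The sketch below is an assumption-laden illustration: the `customers` collection and the `customerId` field are hypothetical, and a `$match` that filters on the joined fields legitimately has to come after the `$lookup`, so treat a positive result as a prompt to review rather than as an error.

```javascript
// Hypothetical pipeline; collection and field names are illustrative.
const goodPipeline = [
  { $match: { status: "completed" } },
  {
    $lookup: {
      from: "customers",
      localField: "customerId",
      foreignField: "_id",
      as: "customer",
    },
  },
];

// Returns true if a $match stage appears after the first $lookup stage.
function hasMatchAfterLookup(stages) {
  const firstLookup = stages.findIndex(s => "$lookup" in s);
  if (firstLookup === -1) return false;
  return stages.slice(firstLookup + 1).some(s => "$match" in s);
}

console.log(hasMatchAfterLookup(goodPipeline)); // false
```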

Schema design issues

1. Unbounded arrays

Problem:

// Array grows indefinitely
{
  userId: 123,
  activities: [/* thousands of items */]
}

Solution:

// Use separate collection or bucketing
{
  userId: 123,
  month: "2024-01",
  activities: [/* limited items */]
}
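
With bucketing, each write targets the document for the current period and creates it if it does not exist yet. The sketch below builds the arguments for an `updateOne()` call with `upsert: true`. The `at` timestamp field and the helper names are hypothetical, and the month is computed in UTC.

```javascript
// Month bucket key in "YYYY-MM" form, computed in UTC.
function monthBucket(date) {
  return date.toISOString().slice(0, 7);
}

// Arguments for updateOne(filter, update, options) that append an activity
// to the user's bucket for the month in which the activity happened.
function bucketedPush(userId, activity) {
  return {
    filter: { userId, month: monthBucket(activity.at) },
    update: { $push: { activities: activity } },
    options: { upsert: true },
  };
}

const pushOp = bucketedPush(123, { type: "login", at: new Date("2024-01-15T10:00:00Z") });
console.log(pushOp.filter); // { userId: 123, month: '2024-01' }
```

If a single month can still hold too many items, a smaller period or an additional sequence number in the bucket key keeps each document bounded.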

2. Excessive embedding

Problem:

// Deeply nested documents
{
  user: {
    profile: {
      settings: {
        preferences: {
          // many levels deep
        }
      }
    }
  }
}

Solution:

// Flatten or use references
{
  userId: 123,
  profileId: 456
}

3. Large documents

Problem:

// Documents approaching 16MB limit
{
  data: "very large string...",
  attachments: [/* large binary data */]
}

Solution:

// Store large data separately (GridFS or object storage)
{
  dataRef: "s3://bucket/key",
  attachments: [{ ref: "gridfs://id" }]
}

Connection management mistakes

1. Not using connection pooling

Problem:

// Creating new connection per request
app.get('/api/users', async (req, res) => {
  const client = await MongoClient.connect(uri);
  // ...
  await client.close();
});

Solution:

// Reuse connection pool
const client = new MongoClient(uri, { maxPoolSize: 50 });
await client.connect();

app.get('/api/users', async (req, res) => {
  const db = client.db();
  // ...
});
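
One way to make sure every request handler shares the same pool is to hide client creation behind a function that connects once and caches the result. The sketch below injects the factory so that it runs without a database; `sharedClient` is a hypothetical helper, not part of the MongoDB driver.

```javascript
// Returns a getClient() function that creates the client on first use
// and hands back the same promise on every later call.
function sharedClient(createClient) {
  let clientPromise = null;
  return function getClient() {
    if (!clientPromise) {
      clientPromise = Promise.resolve(createClient()).catch(err => {
        clientPromise = null; // allow a retry on the next call
        throw err;
      });
    }
    return clientPromise;
  };
}

// Stand-in factory that counts how many clients were created.
let created = 0;
const getClient = sharedClient(() => ({ id: ++created }));
getClient();
getClient();
console.log(created); // 1
```

With the Node.js driver, the factory would be something like `() => new MongoClient(uri, { maxPoolSize: 50 }).connect()`, assuming `MongoClient` is imported from the `mongodb` package and `uri` holds your connection string.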

2. Not closing cursors

Problem:

// Cursor left open
const cursor = db.collection.find();
// Never closed

Solution:

// Always close cursors
const cursor = db.collection.find();
try {
  await cursor.forEach(doc => { /* process */ });
} finally {
  await cursor.close();
}

3. Too many connections

Problem:

// One connection per user session
const connections = new Map();
users.forEach(user => {
  connections.set(user.id, new MongoClient(uri));
});

Solution:

// Share connection pool across application
const client = new MongoClient(uri);
// All users share the same pool

Indexing pitfalls

1. Too many indexes

Problem:

// Index on every field
db.collection.createIndex({ field1: 1 })
db.collection.createIndex({ field2: 1 })
db.collection.createIndex({ field3: 1 })
// ... 20+ indexes

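Impact: Every index must be updated on each insert, update, and delete, which slows down writes and increases storage and memory use.

Solution: Keep only the indexes that your queries need, and replace overlapping single-field indexes with compound indexes where possible.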
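
To find removal candidates, you can look at the usage counters that the `$indexStats` aggregation stage reports for each index. The sketch below filters that output for indexes with no recorded accesses. The sample array is hand-written, and the helper name is hypothetical. The counters reset when a node restarts and are tracked per node, so check them over a representative period and on each member before dropping anything.

```javascript
// `stats` is the array produced by an aggregation with the { $indexStats: {} } stage.
// Returns the names of indexes with zero recorded accesses, ignoring the _id index.
function unusedIndexes(stats) {
  return stats
    .filter(s => s.name !== "_id_" && Number(s.accesses.ops) === 0)
    .map(s => s.name);
}

// Hand-written sample in the shape of $indexStats output (illustrative only).
const sampleStats = [
  { name: "_id_", accesses: { ops: 0 } },
  { name: "email_1", accesses: { ops: 1523 } },
  { name: "field3_1", accesses: { ops: 0 } },
];

console.log(unusedIndexes(sampleStats)); // [ 'field3_1' ]
```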

2. Wrong index order in compound indexes

Problem:

// Query: { status: "active", createdAt: { $gt: date } }
// Index: { createdAt: 1, status: 1 }  // Wrong order

Solution:

// Correct order: equality first, range second
db.collection.createIndex({ status: 1, createdAt: 1 })
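
The "equality first, range second" rule can be applied mechanically to a simple query shape. The sketch below is a deliberately narrow illustration: it looks only at top-level fields, treats `$gt`, `$gte`, `$lt`, and `$lte` as range operators, and ignores sort order, so it is a starting point for discussion rather than an index advisor. The function name is hypothetical.

```javascript
// Orders index keys so that equality-matched fields come before range-matched fields.
function suggestIndexKeys(query) {
  const rangeOps = ["$gt", "$gte", "$lt", "$lte"];
  const isRange = v =>
    v !== null &&
    typeof v === "object" &&
    !(v instanceof Date) &&
    Object.keys(v).some(k => rangeOps.includes(k));
  const fields = Object.keys(query);
  const ordered = [
    ...fields.filter(f => !isRange(query[f])),
    ...fields.filter(f => isRange(query[f])),
  ];
  return Object.fromEntries(ordered.map(f => [f, 1]));
}

const suggested = suggestIndexKeys({
  createdAt: { $gt: new Date("2024-01-01") },
  status: "active",
});
console.log(JSON.stringify(suggested)); // {"status":1,"createdAt":1}
```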

3. Not using covered queries

Problem:

// Index exists but query not covered
db.users.createIndex({ email: 1 })
db.users.find({ email: "user@example.com" }, { name: 1, email: 1 })
// Still fetches documents

Solution:

// Include all projected fields in index
db.users.createIndex({ email: 1, name: 1 })
db.users.find({ email: "user@example.com" }, { name: 1, email: 1, _id: 0 })
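
Whether a query can be covered depends on three things: every filtered field is in the index, every returned field is in the index, and `_id` is either excluded or indexed. The sketch below checks those conditions for simple top-level fields. It is a hypothetical helper that ignores array fields, embedded documents, and operators such as `$or`. To verify a real query, check that `totalDocsExamined` is 0 in `explain("executionStats")`.

```javascript
// True if an index with `indexKeys` could answer the query without fetching documents.
function isCoverable(indexKeys, filter, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  const needed = [
    ...Object.keys(filter),
    ...Object.keys(projection).filter(f => projection[f]),
  ];
  // _id is returned by default, so it must be excluded or be part of the index.
  if (projection._id !== 0 && projection._id !== false) needed.push("_id");
  return needed.every(f => indexed.has(f));
}

const emailFilter = { email: "user@example.com" };
console.log(isCoverable({ email: 1 }, emailFilter, { name: 1, email: 1 })); // false
console.log(isCoverable({ email: 1, name: 1 }, emailFilter, { name: 1, email: 1, _id: 0 })); // true
```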

Appendix: metrics thresholds

Recommended thresholds for key performance metrics.

Metrics thresholds
| Metric | Warning threshold | Critical threshold | Recommended action |
|---|---|---|---|
| CPU utilization | > 75% | > 90% | Scale CPU cores |
| Memory utilization | > 80% | > 95% | Scale memory allocation |
| Disk utilization | > 80% | > 90% | Scale disk space |
| Disk IOPS | > 80% of limit | > 95% of limit | Increase disk size for more IOPS |
| Active connections | > 80% of limit | > 95% of limit | Scale plan or optimize connection pooling |
| Replication lag | > 5 seconds | > 30 seconds | Investigate and scale if needed |
| Cache hit ratio | < 95% | < 90% | Scale memory or optimize queries |
| Query execution time | > 100 ms (avg) | > 1000 ms (avg) | Optimize queries and indexes |
| Lock wait time | > 100 ms | > 1000 ms | Optimize operations and kill long-running queries |
| Page faults | > 100/sec | > 1000/sec | Scale memory |
| Network latency | > 10 ms | > 50 ms | Check network configuration |
| Backup duration | > 1 hour | > 4 hours | Consider scaling or optimization |
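
The thresholds can be encoded as data so that a script or dashboard classifies readings consistently. The sketch below covers a subset of the table. The metric keys (`cpuPercent`, `cacheHitPercent`, and so on) are hypothetical names, and the `higherIsWorse` flag handles the cache hit ratio, where lower values are the problem.

```javascript
// Subset of the thresholds table; the keys are hypothetical metric names.
const THRESHOLDS = {
  cpuPercent:        { warning: 75, critical: 90, higherIsWorse: true },
  memoryPercent:     { warning: 80, critical: 95, higherIsWorse: true },
  replicationLagSec: { warning: 5,  critical: 30, higherIsWorse: true },
  cacheHitPercent:   { warning: 95, critical: 90, higherIsWorse: false },
};

// Returns "critical", "warning", or "ok" for one reading.
function classify(metric, value) {
  const t = THRESHOLDS[metric];
  if (!t) throw new Error(`Unknown metric: ${metric}`);
  const beyond = limit => (t.higherIsWorse ? value > limit : value < limit);
  if (beyond(t.critical)) return "critical";
  if (beyond(t.warning)) return "warning";
  return "ok";
}

console.log(classify("cpuPercent", 80), classify("cacheHitPercent", 85)); // warning critical
```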

Monitoring frequency recommendations

Monitoring frequency
| Metric category | Check frequency | Retention period |
|---|---|---|
| Resource utilization | Every 1 minute | 30 days |
| Query performance | Every 5 minutes | 14 days |
| Replication status | Every 1 minute | 30 days |
| Connection statistics | Every 5 minutes | 14 days |
| Backup status | Every 1 hour | 90 days |
| Disk growth | Every 1 hour | 90 days |

Alert configuration examples

CPU alert

Condition: CPU > 80% for 10 consecutive minutes
Action: Send notification to ops team
Escalation: Page on-call if > 90% for 15 minutes

Memory alert

Condition: Memory > 85% for 15 consecutive minutes
Action: Send notification to ops team
Escalation: Auto-scale if > 95% for 10 minutes

Replication lag alert

Condition: Lag > 10 seconds
Action: Send notification immediately
Escalation: Page on-call if > 60 seconds

Disk space alert

Condition: Disk > 80%
Action: Send notification to ops team
Escalation: Create incident if > 90%
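
Conditions such as "for 10 consecutive minutes" mean that a single spike should not fire the alert. The sketch below shows one way to evaluate such a condition over per-minute samples, using the CPU alert values from this section. It assumes exactly one reading per minute, oldest first, and the function names are hypothetical. In practice, the alerting service that you configure usually evaluates this for you.

```javascript
// `samples` holds one reading per minute, oldest first.
// Returns true if the most recent `minutes` readings are all above `threshold`.
function sustainedAbove(samples, threshold, minutes) {
  if (samples.length < minutes) return false;
  return samples.slice(-minutes).every(v => v > threshold);
}

// CPU alert: notify at > 80% for 10 minutes, page at > 90% for 15 minutes.
function cpuAlertLevel(samples) {
  if (sustainedAbove(samples, 90, 15)) return "page";
  if (sustainedAbove(samples, 80, 10)) return "notify";
  return "none";
}

console.log(cpuAlertLevel(Array(10).fill(85))); // notify
```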

Best practices summary

Best practices
| Area | Recommendation |
|---|---|
| Indexing | Regularly review and remove unused indexes |
| Monitoring | Configure alerts for CPU, memory, disk, and replication lag |
| Capacity planning | Keep disk usage below 80% and scale proactively |
| Query design | Use explain plans during development |
| Scaling | Scale proactively before saturation |
| Connection pooling | Use connection pools and avoid per-request connections |
| Read preferences | Read from secondaries for read-heavy workloads that can tolerate slightly stale data |
| Write concern | Balance durability with performance needs |
| Schema design | Avoid unbounded arrays and excessive embedding |
| Backup planning | Schedule during low-traffic periods |
| Network | Use private endpoints for IBM Cloud workloads |
| Security | Rotate credentials regularly and use IP allowlisting |
| Documentation | Document baseline metrics and normal patterns |
| Testing | Test performance changes in non-production first |
| Support | Gather diagnostics before contacting support |
