Best practices for performance

Use this information to apply best practices to your Databases for MongoDB deployment running on IBM Cloud.

Performance troubleshooting flowchart

Use the following flowchart to narrow down the cause of a performance problem and decide what to do next. Work through the checks in order and stop at the first one that applies.

┌─────────────────────────────────┐
│   Performance issue detected    │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│  Check IBM Cloud Monitoring     │
│  - CPU > 80%?                   │
│  - Memory > 80%?                │
│  - Disk latency high?           │
└────────────┬────────────────────┘
             │
        ┌────┴────┐
        │  YES    │
        ▼         │
┌──────────────┐  │
│ Scale        │  │
│ resources    │  │
└──────────────┘  │
                  │ NO
                  ▼
        ┌─────────────────────┐
        │ Check slow queries  │
        │ db.system.profile   │
        └─────────┬───────────┘
                  │
             ┌────┴────┐
             │  Found? │
             ▼         │
        ┌──────────┐   │
        │ Optimize │   │
        │ queries  │   │
        │ & indexes│   │
        └──────────┘   │
                       │ NO
                       ▼
              ┌────────────────┐
              │ Check locks    │
              │ currentOp()    │
              └────────┬───────┘
                       │
                  ┌────┴────┐
                  │ Locked? │
                  ▼         │
             ┌─────────┐    │
             │ Kill or │    │
             │ optimize│    │
             └─────────┘    │
                            │ NO
                            ▼
                   ┌────────────────┐
                   │ Check cache    │
                   │ hit ratio      │
                   └────────┬───────┘
                            │
                       ┌────┴────┐
                       │ < 95%?  │
                       ▼         │
                  ┌─────────┐    │
                  │ Scale   │    │
                  │ memory  │    │
                  └─────────┘    │
                                 │ NO
                                 ▼
                        ┌────────────────┐
                        │ Check          │
                        │ replication    │
                        └────────┬───────┘
                                 │
                            ┌────┴────┐
                            │ Lagging?│
                            ▼         │
                       ┌─────────┐    │
                       │ Scale   │    │
                       │ or fix  │    │
                       └─────────┘    │
                                      │ NO
                                      ▼
                             ┌────────────────┐
                             │ Contact IBM    │
                             │ Support        │
                             └────────────────┘
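
The order of checks in the flowchart can be sketched as a small function. This is only an illustration: the `nextStep` function and the field names on its input object are hypothetical, the 80% and 95% values are taken from the flowchart, and what counts as high disk latency, lock contention, or replication lag is left to the caller.

```javascript
// Hypothetical helper: walks the flowchart checks in order and
// returns the first action that applies.
function nextStep(m) {
  // Step 1: resource saturation reported by IBM Cloud Monitoring.
  if (m.cpuPercent > 80 || m.memoryPercent > 80 || m.diskLatencyHigh) {
    return "Scale resources";
  }
  // Step 2: slow queries found in the profiler output.
  if (m.slowQueriesFound) return "Optimize queries and indexes";
  // Step 3: operations blocked on locks, as seen in currentOp().
  if (m.lockContention) return "Kill or optimize the blocking operation";
  // Step 4: cache hit ratio below the 95% mark used in the flowchart.
  if (m.cacheHitPercent < 95) return "Scale memory";
  // Step 5: secondaries falling behind the primary.
  if (m.replicationLagging) return "Scale or fix replication";
  // Nothing matched: gather diagnostics and open a case.
  return "Contact IBM Support";
}
```

For example, `nextStep({ cpuPercent: 40, memoryPercent: 55, slowQueriesFound: true })` returns `"Optimize queries and indexes"`.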

Common anti-patterns

Avoid these common mistakes that lead to performance issues.

Query anti-patterns

1. Missing indexes

Problem:

// No index on 'email': the query must scan every document in the collection
db.users.find({ email: "user@example.com" })

Solution:

// Create index
db.users.createIndex({ email: 1 })
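
One way to confirm whether a query is missing an index is to inspect its explain output for a `COLLSCAN` stage. The following is a minimal sketch, assuming the output has the usual `queryPlanner.winningPlan` shape; the `usesCollectionScan` helper is hypothetical and not part of MongoDB.

```javascript
// Hypothetical helper: returns true if any stage of the winning plan
// is a full collection scan (COLLSCAN).
function usesCollectionScan(explainOutput) {
  const stack = [explainOutput.queryPlanner.winningPlan];
  while (stack.length > 0) {
    const stage = stack.pop();
    if (!stage) continue;
    if (stage.stage === "COLLSCAN") return true;
    // Newer server versions may nest the plan tree under queryPlan.
    if (stage.queryPlan) stack.push(stage.queryPlan);
    if (stage.inputStage) stack.push(stage.inputStage);
    if (stage.inputStages) stack.push(...stage.inputStages);
  }
  return false;
}

// Usage in mongosh:
// usesCollectionScan(db.users.find({ email: "user@example.com" }).explain())
```

If the helper returns `true` for a frequent query, that query is a candidate for a new index.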

2. Inefficient regex queries

Problem:

// Unanchored, case-insensitive regex: cannot use an index efficiently
db.users.find({ name: /john/i })

Solution:

// Use a text index (whole-word, case-insensitive matching) or an exact match on an indexed field
db.users.createIndex({ name: "text" })
db.users.find({ $text: { $search: "john" } })

3. Large skip() operations

Problem:

// Skipping thousands of documents
db.collection.find().skip(10000).limit(10)

Solution:

// Use a range query on an indexed field, sorted on that field so pages are stable
db.collection.find({ _id: { $gt: lastSeenId } }).sort({ _id: 1 }).limit(10)
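
The range-based approach can be wrapped in a small helper that builds the arguments for each page. This is a sketch under stated assumptions: `pageQuery` is a hypothetical name, and `_id` is used as the paging key because it is always indexed.

```javascript
// Hypothetical helper: builds the filter, sort, and limit for one page
// of a range-based (keyset) paginated query.
function pageQuery(lastSeenId, pageSize) {
  return {
    // The first page has no lower bound; later pages start after the last _id seen.
    filter: lastSeenId == null ? {} : { _id: { $gt: lastSeenId } },
    // Sorting on the paging key keeps the page order deterministic.
    sort: { _id: 1 },
    limit: pageSize,
  };
}

// Usage with a collection handle:
// const q = pageQuery(lastSeenId, 10);
// db.collection.find(q.filter).sort(q.sort).limit(q.limit)
```

After each page, the caller passes the `_id` of the last returned document as `lastSeenId` for the next call.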

4. Selecting unnecessary fields

Problem:

// Fetching entire documents
db.users.find({ status: "active" })

Solution:

// Use projection
db.users.find({ status: "active" }, { name: 1, email: 1 })

5. Inefficient aggregation pipelines

Problem:

// $match after $lookup
db.orders.aggregate([
  { $lookup: { ... } },
  { $match: { status: "completed" } }
])

Solution:

// $match first to reduce documents
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $lookup: { ... } }
])

Schema design issues

1. Unbounded arrays

Problem:

// Array grows indefinitely
{
  userId: 123,
  activities: [/* thousands of items */]
}

Solution:

// Use separate collection or bucketing
{
  userId: 123,
  month: "2024-01",
  activities: [/* limited items */]
}
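
A write into a monthly bucket like the one above could be built as follows. This is a minimal sketch: `bucketUpdate` is a hypothetical helper, the `"YYYY-MM"` key is derived in UTC, and the result is meant to be passed to `updateOne`.

```javascript
// Hypothetical helper: builds an upsert that appends one activity
// to the user's bucket for the month of `when`.
function bucketUpdate(userId, activity, when) {
  const month = when.toISOString().slice(0, 7); // "YYYY-MM" in UTC
  return {
    filter: { userId, month },
    update: { $push: { activities: activity } },
    // upsert creates the bucket document the first time a month is seen.
    options: { upsert: true },
  };
}

// Usage with a collection handle:
// const u = bucketUpdate(123, { type: "login" }, new Date());
// db.collection.updateOne(u.filter, u.update, u.options)
```

Each bucket then holds at most one month of activities per user, so no single array grows without bound.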

2. Excessive embedding

Problem:

// Deeply nested documents
{
  user: {
    profile: {
      settings: {
        preferences: {
          // many levels deep
        }
      }
    }
  }
}

Solution:

// Flatten or use references
{
  userId: 123,
  profileId: 456
}

3. Large documents

Problem:

// Documents approaching the 16 MB BSON document size limit
{
  data: "very large string...",
  attachments: [/* large binary data */]
}

Solution:

// Store large data separately (GridFS or object storage)
{
  dataRef: "s3://bucket/key",
  attachments: [{ ref: "gridfs://id" }]
}

Connection management mistakes

1. Not using connection pooling

Problem:

// Creating new connection per request
app.get('/api/users', async (req, res) => {
  const client = await MongoClient.connect(uri);
  // ...
  await client.close();
});

Solution:

// Reuse connection pool
const client = new MongoClient(uri, { maxPoolSize: 50 });
await client.connect();

app.get('/api/users', async (req, res) => {
  const db = client.db();
  // ...
});

2. Not closing cursors

Problem:

// Cursor left open
const cursor = db.collection.find();
// Never closed

Solution:

// Always close cursors, even if processing throws
const cursor = db.collection.find();
try {
  for await (const doc of cursor) { /* process */ }
} finally {
  await cursor.close();
}

3. Too many connections

Problem:

// One connection per user session
const connections = new Map();
users.forEach(user => {
  connections.set(user.id, new MongoClient(uri));
});

Solution:

// Share connection pool across application
const client = new MongoClient(uri);
// All users share the same pool

Indexing pitfalls

1. Too many indexes

Problem:

// Index on every field
db.collection.createIndex({ field1: 1 })
db.collection.createIndex({ field2: 1 })
db.collection.createIndex({ field3: 1 })
// ... 20+ indexes

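Impact: Every index must be updated on each insert, update, and delete, so unnecessary indexes slow down writes and increase storage and memory use.

Solution: Keep only the indexes that your queries actually use, and replace overlapping single-field indexes with compound indexes where possible.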
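
To find candidates for removal, the output of the `$indexStats` aggregation stage can be filtered for indexes with no recorded accesses. The sketch below assumes each entry has the `name` and `accesses.ops` fields that `$indexStats` reports, and that `ops` is a plain number or converts to one; `unusedIndexes` is a hypothetical helper.

```javascript
// Hypothetical helper: returns the names of indexes with zero recorded
// accesses, skipping the mandatory _id index.
function unusedIndexes(indexStats) {
  return indexStats
    .filter((ix) => ix.name !== "_id_" && Number(ix.accesses.ops) === 0)
    .map((ix) => ix.name);
}

// Usage in mongosh:
// unusedIndexes(db.collection.aggregate([{ $indexStats: {} }]).toArray())
```

Access counters are kept per member and start again when the process restarts, so check the `accesses.since` timestamp and each replica set member before dropping an index.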

2. Wrong index order in compound indexes

Problem:

// Query: { status: "active", createdAt: { $gt: date } }
// Index: { createdAt: 1, status: 1 }  // Wrong order

Solution:

// Correct order: equality first, range second
db.collection.createIndex({ status: 1, createdAt: 1 })

3. Not using covered queries

Problem:

// An index exists, but the query is not covered by it
db.users.createIndex({ email: 1 })
db.users.find({ email: "user@example.com" }, { name: 1, email: 1 })
// 'name' is not in the index and _id is returned by default, so documents are still fetched

Solution:

// Include all projected fields in the index and exclude _id from the projection
db.users.createIndex({ email: 1, name: 1 })
db.users.find({ email: "user@example.com" }, { name: 1, email: 1, _id: 0 })
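
Whether a query is covered can be checked from its `executionStats` explain output: a covered query examines index keys but no documents. The following is a minimal sketch; `isCovered` is a hypothetical helper that reads the `totalDocsExamined` and `totalKeysExamined` fields.

```javascript
// Hypothetical helper: true when the query was answered from index keys
// alone, without fetching any documents.
function isCovered(explainOutput) {
  const stats = explainOutput.executionStats;
  return stats.totalDocsExamined === 0 && stats.totalKeysExamined > 0;
}

// Usage in mongosh:
// isCovered(
//   db.users.find({ email: "user@example.com" }, { name: 1, email: 1, _id: 0 })
//     .explain("executionStats")
// )
```

A query that matches nothing may examine no keys at all, so the helper returns `false` in that case; test with a filter that is known to match at least one document.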

Appendix: metrics thresholds

Recommended thresholds for key performance metrics.

Metrics thresholds
| Metric | Warning threshold | Critical threshold | Recommended action |
|---|---|---|---|
| CPU utilization | > 75% | > 90% | Scale CPU cores |
| Memory utilization | > 80% | > 95% | Scale memory allocation |
| Disk utilization | > 80% | > 90% | Scale disk space |
| Disk IOPS | > 80% of limit | > 95% of limit | Increase disk size for more IOPS |
| Active connections | > 80% of limit | > 95% of limit | Scale plan or optimize connection pooling |
| Replication lag | > 5 seconds | > 30 seconds | Investigate and scale if needed |
| Cache hit ratio | < 95% | < 90% | Scale memory or optimize queries |
| Query execution time | > 100 ms (avg) | > 1000 ms (avg) | Optimize queries and indexes |
| Lock wait time | > 100 ms | > 1000 ms | Optimize operations and kill long-running queries |
| Page faults | > 100/sec | > 1000/sec | Scale memory |
| Network latency | > 10 ms | > 50 ms | Check network configuration |
| Backup duration | > 1 hour | > 4 hours | Consider scaling or optimization |
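
The source does not define how the cache hit ratio is calculated. One common approximation, shown here as an assumption, uses two WiredTiger counters from `db.serverStatus().wiredTiger.cache`: the share of requested pages that did not have to be read into the cache from disk. The `cacheHitPercent` helper is hypothetical, and the calculation assumes your user is permitted to run `serverStatus`.

```javascript
// Hypothetical helper: approximates the cache hit ratio (in percent)
// from the WiredTiger cache section of serverStatus.
function cacheHitPercent(cacheStats) {
  const requested = cacheStats["pages requested from the cache"];
  const readIn = cacheStats["pages read into cache"];
  // No requests recorded yet: the ratio is undefined.
  if (!requested) return null;
  return (1 - readIn / requested) * 100;
}

// Usage in mongosh:
// cacheHitPercent(db.serverStatus().wiredTiger.cache)
```

Both counters are cumulative since startup, so comparing two readings taken some minutes apart reflects current behavior better than a single reading.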

Monitoring frequency recommendations

Monitoring frequency
| Metric category | Check frequency | Retention period |
|---|---|---|
| Resource utilization | Every 1 minute | 30 days |
| Query performance | Every 5 minutes | 14 days |
| Replication status | Every 1 minute | 30 days |
| Connection statistics | Every 5 minutes | 14 days |
| Backup status | Every 1 hour | 90 days |
| Disk growth | Every 1 hour | 90 days |

Alert configuration examples

CPU alert

Condition: CPU > 80% for 10 consecutive minutes
Action: Send notification to ops team
Escalation: Page on-call if > 90% for 15 minutes

Memory alert

Condition: Memory > 85% for 15 consecutive minutes
Action: Send notification to ops team
Escalation: Auto-scale if > 95% for 10 minutes

Replication lag alert

Condition: Lag > 10 seconds
Action: Send notification immediately
Escalation: Page on-call if > 60 seconds

Disk space alert

Condition: Disk > 80%
Action: Send notification to ops team
Escalation: Create incident if > 90%
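
A condition such as "CPU > 80% for 10 consecutive minutes" can be evaluated over one-minute samples as sketched below. This only illustrates the logic and is not how IBM Cloud Monitoring implements alerts; `sustainedAbove` is a hypothetical helper, and `samples` is assumed to hold one reading per minute, oldest first.

```javascript
// Hypothetical helper: true if each of the last `minutes` one-minute
// samples exceeds `threshold`.
function sustainedAbove(samples, threshold, minutes) {
  // Not enough history yet to say the condition held for the full window.
  if (samples.length < minutes) return false;
  return samples.slice(-minutes).every((value) => value > threshold);
}
```

With this helper, the CPU alert corresponds to `sustainedAbove(cpuSamples, 80, 10)` and its escalation to `sustainedAbove(cpuSamples, 90, 15)`.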

Best practices summary

Best practices
| Area | Recommendation |
|---|---|
| Indexing | Regularly review and remove unused indexes |
| Monitoring | Configure alerts for CPU, memory, disk, and replication lag |
| Capacity planning | Keep disk usage below 80% and scale proactively |
| Query design | Use explain plans during development |
| Scaling | Scale proactively before saturation |
| Connection pooling | Use connection pools and avoid per-request connections |
| Read preferences | Use secondaries for read-heavy workloads |
| Write concern | Balance durability with performance needs |
| Schema design | Avoid unbounded arrays and excessive embedding |
| Backup planning | Schedule during low-traffic periods |
| Network | Use private endpoints for IBM Cloud workloads |
| Security | Rotate credentials regularly and use IP allowlisting |
| Documentation | Document baseline metrics and normal patterns |
| Testing | Test performance changes in non-production first |
| Support | Gather diagnostics before contacting support |
