MongoDB Database Design: Best Practices untuk Developer

MongoDB adalah document database yang memberikan fleksibilitas tinggi dalam schema design. Namun, fleksibilitas ini juga membutuhkan pemahaman yang baik tentang best practices untuk membangun aplikasi yang performant dan scalable.

Mengapa MongoDB?

Keuntungan MongoDB

Schema Flexibility: Dynamic schema yang mudah diubah
Document Model: Natural mapping ke object-oriented programming
Horizontal Scaling: Built-in sharding support
Rich Query Language: Powerful aggregation framework
High Performance: Optimized untuk read-heavy workloads

Kapan Menggunakan MongoDB

Rapid Development: Ketika schema requirements sering berubah
Complex Data Structures: Nested objects dan arrays
Real-time Analytics: Time-series data dan event logging
Content Management: Flexible content types
IoT Applications: High-volume, varied data structures

Document Structure Design

1. Embedding vs Referencing

Embedding (Denormalization)

// User dengan embedded posts
{
  _id: ObjectId("..."),
  name: "John Doe",
  email: "john@example.com",
  posts: [
    {
      _id: ObjectId("..."),
      title: "My First Post",
      content: "This is my first post...",
      createdAt: ISODate("2024-01-15")
    },
    {
      _id: ObjectId("..."),
      title: "Another Post",
      content: "This is another post...",
      createdAt: ISODate("2024-01-20")
    }
  ],
  profile: {
    bio: "Software developer",
    location: "Jakarta",
    website: "https://johndoe.com"
  }
}

Kapan menggunakan Embedding:

One-to-few relationships
Data yang sering diakses bersamaan
Data yang jarang berubah
Atomic updates diperlukan

Referencing (Normalization)

// User collection
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  name: "John Doe",
  email: "john@example.com",
  profile: {
    bio: "Software developer",
    location: "Jakarta",
    website: "https://johndoe.com"
  }
}

// Posts collection
{
  _id: ObjectId("507f1f77bcf86cd799439012"),
  title: "My First Post",
  content: "This is my first post...",
  author: ObjectId("507f1f77bcf86cd799439011"), // Reference to user
  tags: ["mongodb", "database", "nosql"],
  createdAt: ISODate("2024-01-15"),
  updatedAt: ISODate("2024-01-15")
}

Kapan menggunakan Referencing:

One-to-many atau many-to-many relationships
Data yang sering berubah secara independen
Large documents (>16MB limit)
Data yang diakses secara terpisah

2. Schema Design Patterns

Subset Pattern

// Product collection (frequently accessed data)
{
  _id: ObjectId("..."),
  name: "MacBook Pro",
  price: 25000000,
  category: "Laptop",
  brand: "Apple",
  rating: 4.8,
  reviewCount: 150,
  inStock: true,
  // Subset of most recent reviews
  recentReviews: [
    {
      _id: ObjectId("..."),
      rating: 5,
      comment: "Excellent laptop!",
      author: "John Doe",
      date: ISODate("2024-02-01")
    }
  ]
}

// Reviews collection (complete data)
{
  _id: ObjectId("..."),
  productId: ObjectId("..."),
  rating: 5,
  comment: "Excellent laptop! Great performance and battery life.",
  author: {
    _id: ObjectId("..."),
    name: "John Doe",
    verified: true
  },
  helpful: 25,
  date: ISODate("2024-02-01")
}

Extended Reference Pattern

// Order collection
{
  _id: ObjectId("..."),
  orderNumber: "ORD-2024-001",
  customer: {
    _id: ObjectId("..."),
    name: "John Doe", // Denormalized for quick access
    email: "john@example.com"
  },
  items: [
    {
      productId: ObjectId("..."),
      name: "MacBook Pro", // Denormalized
      price: 25000000, // Price at time of order
      quantity: 1
    }
  ],
  total: 25000000,
  status: "shipped",
  createdAt: ISODate("2024-02-01")
}

Bucket Pattern (Time Series)

// IoT sensor data bucketed by hour
{
  _id: ObjectId("..."),
  sensorId: "TEMP_001",
  timestamp: ISODate("2024-02-01T10:00:00Z"), // Start of hour
  measurements: [
    {
      timestamp: ISODate("2024-02-01T10:00:00Z"),
      temperature: 25.5,
      humidity: 60.2
    },
    {
      timestamp: ISODate("2024-02-01T10:01:00Z"),
      temperature: 25.7,
      humidity: 60.1
    }
    // ... up to 60 measurements per hour
  ],
  count: 60,
  avgTemperature: 25.6,
  maxTemperature: 26.1,
  minTemperature: 25.1
}

Indexing Strategies

1. Single Field Indexes

// Create index on email field
db.users.createIndex({ email: 1 });

// Create index on createdAt field (descending)
db.posts.createIndex({ createdAt: -1 });

// Create unique index
db.users.createIndex({ email: 1 }, { unique: true });

// Create sparse index (only documents with the field)
db.users.createIndex({ phoneNumber: 1 }, { sparse: true });

2. Compound Indexes

// Compound index for queries on status and createdAt
db.orders.createIndex({ status: 1, createdAt: -1 });

// Index for user's posts sorted by date
db.posts.createIndex({ author: 1, createdAt: -1 });

// Index for geospatial queries
db.stores.createIndex({ location: "2dsphere", category: 1 });

3. Text Indexes

// Create text index for search
db.articles.createIndex(
  {
    title: "text",
    content: "text",
    tags: "text",
  },
  {
    weights: {
      title: 10,
      content: 5,
      tags: 1,
    },
    name: "article_text_index",
  }
);

// Search query
db.articles
  .find({
    $text: {
      $search: "mongodb database design",
      $caseSensitive: false,
    },
  })
  .sort({ score: { $meta: "textScore" } });

4. Partial Indexes

// Index only active users
db.users.createIndex(
  { email: 1 },
  {
    partialFilterExpression: { isActive: true },
    name: "active_users_email_index",
  }
);

// Index only published posts
db.posts.createIndex(
  { createdAt: -1 },
  {
    partialFilterExpression: { status: "published" },
    name: "published_posts_date_index",
  }
);

Query Optimization

1. Efficient Query Patterns

// Good: Use indexes effectively
db.posts
  .find({ author: ObjectId("..."), status: "published" })
  .sort({ createdAt: -1 })
  .limit(10);

// Bad: Inefficient regex
db.users.find({ email: /.*@gmail.com$/ });

// Good: Use prefix regex with index
db.users.find({ email: /^john/ });

// Good: Use $in for multiple values
db.posts.find({ status: { $in: ["published", "featured"] } });

// Good: Use projection to limit returned fields
db.users.find({ isActive: true }, { name: 1, email: 1, _id: 0 });

2. Aggregation Pipeline Optimization

// Optimized aggregation pipeline
db.orders.aggregate([
  // 1. Match early to reduce documents
  {
    $match: {
      createdAt: {
        $gte: ISODate("2024-01-01"),
        $lt: ISODate("2024-02-01"),
      },
      status: "completed",
    },
  },
  // 2. Project only needed fields
  {
    $project: {
      customerId: 1,
      total: 1,
      createdAt: 1,
    },
  },
  // 3. Group and calculate
  {
    $group: {
      _id: "$customerId",
      totalSpent: { $sum: "$total" },
      orderCount: { $sum: 1 },
      lastOrder: { $max: "$createdAt" },
    },
  },
  // 4. Sort by total spent
  {
    $sort: { totalSpent: -1 },
  },
  // 5. Limit results
  {
    $limit: 100,
  },
]);

3. Using Explain

// Analyze query performance
db.posts
  .find({ author: ObjectId("...") })
  .sort({ createdAt: -1 })
  .explain("executionStats");

// Check if index is used
db.posts.find({ title: /mongodb/i }).explain("queryPlanner");

// Analyze aggregation performance
db.orders
  .aggregate([
    { $match: { status: "completed" } },
    { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  ])
  .explain("executionStats");

Data Modeling Examples

1. E-commerce Schema

// Users Collection
{
  _id: ObjectId("..."),
  email: "john@example.com",
  password: "hashed_password",
  profile: {
    firstName: "John",
    lastName: "Doe",
    phone: "+62812345678",
    dateOfBirth: ISODate("1990-01-15")
  },
  addresses: [
    {
      _id: ObjectId("..."),
      type: "home",
      street: "Jl. Sudirman No. 1",
      city: "Jakarta",
      postalCode: "10110",
      isDefault: true
    }
  ],
  preferences: {
    newsletter: true,
    notifications: {
      email: true,
      sms: false
    }
  },
  createdAt: ISODate("2024-01-01"),
  lastLogin: ISODate("2024-02-10")
}

// Products Collection
{
  _id: ObjectId("..."),
  sku: "LAPTOP-001",
  name: "MacBook Pro 14-inch",
  description: "Apple MacBook Pro with M3 chip",
  category: {
    _id: ObjectId("..."),
    name: "Laptops",
    path: "Electronics > Computers > Laptops"
  },
  brand: "Apple",
  price: {
    amount: 25000000,
    currency: "IDR"
  },
  inventory: {
    quantity: 50,
    reserved: 5,
    available: 45
  },
  specifications: {
    processor: "Apple M3",
    memory: "16GB",
    storage: "512GB SSD",
    display: "14-inch Liquid Retina XDR"
  },
  images: [
    {
      url: "https://example.com/images/macbook-1.jpg",
      alt: "MacBook Pro front view",
      isPrimary: true
    }
  ],
  seo: {
    slug: "macbook-pro-14-inch-m3",
    metaTitle: "MacBook Pro 14-inch with M3 Chip",
    metaDescription: "Powerful MacBook Pro with M3 chip..."
  },
  isActive: true,
  createdAt: ISODate("2024-01-01")
}

// Orders Collection
{
  _id: ObjectId("..."),
  orderNumber: "ORD-2024-001",
  customer: {
    _id: ObjectId("..."),
    email: "john@example.com",
    name: "John Doe"
  },
  items: [
    {
      productId: ObjectId("..."),
      sku: "LAPTOP-001",
      name: "MacBook Pro 14-inch",
      price: 25000000,
      quantity: 1,
      subtotal: 25000000
    }
  ],
  shipping: {
    address: {
      street: "Jl. Sudirman No. 1",
      city: "Jakarta",
      postalCode: "10110"
    },
    method: "express",
    cost: 50000,
    trackingNumber: "TRK123456789"
  },
  payment: {
    method: "credit_card",
    status: "paid",
    transactionId: "TXN123456789",
    paidAt: ISODate("2024-02-01T10:30:00Z")
  },
  totals: {
    subtotal: 25000000,
    shipping: 50000,
    tax: 2505000,
    total: 27555000
  },
  status: "shipped",
  statusHistory: [
    {
      status: "pending",
      timestamp: ISODate("2024-02-01T10:00:00Z")
    },
    {
      status: "paid",
      timestamp: ISODate("2024-02-01T10:30:00Z")
    },
    {
      status: "shipped",
      timestamp: ISODate("2024-02-02T09:00:00Z")
    }
  ],
  createdAt: ISODate("2024-02-01T10:00:00Z")
}

2. Social Media Schema

// Users Collection
{
  _id: ObjectId("..."),
  username: "johndoe",
  email: "john@example.com",
  profile: {
    displayName: "John Doe",
    bio: "Software developer passionate about technology",
    avatar: "https://example.com/avatars/john.jpg",
    website: "https://johndoe.com",
    location: "Jakarta, Indonesia"
  },
  stats: {
    postsCount: 150,
    followersCount: 1250,
    followingCount: 300
  },
  settings: {
    isPrivate: false,
    allowMessages: true,
    showEmail: false
  },
  createdAt: ISODate("2023-01-01")
}

// Posts Collection
{
  _id: ObjectId("..."),
  author: {
    _id: ObjectId("..."),
    username: "johndoe",
    displayName: "John Doe",
    avatar: "https://example.com/avatars/john.jpg"
  },
  content: {
    text: "Just finished building an amazing MongoDB schema!",
    media: [
      {
        type: "image",
        url: "https://example.com/images/post1.jpg",
        alt: "MongoDB schema diagram"
      }
    ]
  },
  hashtags: ["mongodb", "database", "nosql"],
  mentions: [
    {
      _id: ObjectId("..."),
      username: "mongodb"
    }
  ],
  stats: {
    likesCount: 45,
    commentsCount: 12,
    sharesCount: 8,
    viewsCount: 320
  },
  // Recent interactions for quick access
  recentLikes: [
    {
      userId: ObjectId("..."),
      username: "alice",
      timestamp: ISODate("2024-02-10T15:30:00Z")
    }
  ],
  isPublic: true,
  createdAt: ISODate("2024-02-10T14:00:00Z")
}

// Follows Collection (for scalability)
{
  _id: ObjectId("..."),
  follower: ObjectId("..."), // User who follows
  following: ObjectId("..."), // User being followed
  createdAt: ISODate("2024-01-15")
}

// Comments Collection
{
  _id: ObjectId("..."),
  postId: ObjectId("..."),
  author: {
    _id: ObjectId("..."),
    username: "alice",
    displayName: "Alice Smith",
    avatar: "https://example.com/avatars/alice.jpg"
  },
  content: "Great explanation of MongoDB patterns!",
  parentId: null, // For nested comments
  likesCount: 5,
  createdAt: ISODate("2024-02-10T15:00:00Z")
}

Performance Optimization

1. Connection Pooling

// MongoDB connection with proper pooling
const { MongoClient } = require("mongodb");

const client = new MongoClient(uri, {
  maxPoolSize: 10, // Maximum connections in pool
  minPoolSize: 2, // Minimum connections in pool
  maxIdleTimeMS: 30000, // Close connections after 30s of inactivity
  serverSelectionTimeoutMS: 5000, // How long to try selecting a server
  socketTimeoutMS: 45000, // How long a send or receive on a socket can take
  bufferMaxEntries: 0, // Disable mongoose buffering
});

2. Read Preferences

// Read from secondary for analytics queries
db.orders
  .find({ status: "completed" })
  .readPref("secondary")
  .hint({ createdAt: -1 });

// Read from primary for critical data
db.users.findOne({ _id: ObjectId("...") }).readPref("primary");

3. Write Concerns

// High durability for critical operations
db.orders.insertOne(
  {
    /* order data */
  },
  { writeConcern: { w: "majority", j: true } }
);

// Fast writes for logging
db.logs.insertOne(
  {
    /* log data */
  },
  { writeConcern: { w: 1, j: false } }
);

4. Bulk Operations

// Bulk insert for better performance
const bulk = db.products.initializeUnorderedBulkOp();

products.forEach((product) => {
  bulk.insert(product);
});

bulk.execute();

// Bulk update operations
const bulkOps = [
  {
    updateOne: {
      filter: { _id: ObjectId("...") },
      update: { $set: { status: "active" } },
    },
  },
  {
    updateMany: {
      filter: { category: "electronics" },
      update: { $inc: { views: 1 } },
    },
  },
];

db.products.bulkWrite(bulkOps);

Monitoring dan Maintenance

1. Database Profiling

// Enable profiling for slow operations
db.setProfilingLevel(2, { slowms: 100 });

// Query profiler collection
db.system.profile
  .find({
    ts: {
      $gte: new Date(Date.now() - 1000 * 60 * 60), // Last hour
    },
  })
  .sort({ ts: -1 });

// Find slow queries
db.system.profile
  .find({
    millis: { $gt: 1000 },
  })
  .sort({ ts: -1 });

2. Index Usage Analysis

// Check index usage statistics
db.posts.aggregate([{ $indexStats: {} }]);

// Find unused indexes
db.runCommand({ collStats: "posts", indexDetails: true });

3. Database Statistics

// Collection statistics
db.posts.stats();

// Database statistics
db.stats();

// Server status
db.serverStatus();

Best Practices Summary

1. Schema Design

Embed untuk one-to-few relationships dan data yang sering diakses bersamaan
Reference untuk one-to-many dan data yang sering berubah independen
Gunakan schema patterns yang sesuai dengan use case
Pertimbangkan document size limit (16MB)

2. Indexing

Buat index berdasarkan query patterns
Gunakan compound indexes untuk multi-field queries
Implementasikan partial indexes untuk subset data
Monitor index usage dan hapus yang tidak terpakai

3. Queries

Gunakan projection untuk membatasi field yang dikembalikan
Implementasikan pagination untuk large result sets
Optimalkan aggregation pipelines dengan match early
Gunakan explain() untuk analisis performance

4. Performance

Konfigurasi connection pooling yang tepat
Gunakan bulk operations untuk multiple writes
Implementasikan read preferences sesuai kebutuhan
Monitor dan profile slow operations

5. Security

Implementasikan authentication dan authorization
Gunakan field-level encryption untuk sensitive data
Konfigurasi network security dengan proper firewall
Regular backup dan disaster recovery planning

Kesimpulan

MongoDB memberikan fleksibilitas yang besar dalam database design, namun membutuhkan pemahaman yang mendalam tentang data access patterns dan performance characteristics. Dengan menerapkan best practices yang telah dibahas, Anda dapat membangun aplikasi yang scalable dan performant.

Kunci sukses adalah memahami trade-offs antara embedding vs referencing, membuat index yang tepat, dan selalu monitor performance aplikasi di production environment.

MongoDB Database Design: Best Practices untuk Developer

MongoDB Database Design: Best Practices untuk Developer

Mengapa MongoDB?

Keuntungan MongoDB

Kapan Menggunakan MongoDB

Document Structure Design

1. Embedding vs Referencing

Embedding (Denormalization)

Referencing (Normalization)

2. Schema Design Patterns

Subset Pattern

Extended Reference Pattern

Bucket Pattern (Time Series)

Indexing Strategies

1. Single Field Indexes

2. Compound Indexes

3. Text Indexes

4. Partial Indexes

Query Optimization

1. Efficient Query Patterns

2. Aggregation Pipeline Optimization

3. Using Explain

Data Modeling Examples

1. E-commerce Schema

2. Social Media Schema

Performance Optimization

1. Connection Pooling

2. Read Preferences

3. Write Concerns

4. Bulk Operations

Monitoring dan Maintenance

1. Database Profiling

2. Index Usage Analysis

3. Database Statistics

Best Practices Summary

1. Schema Design

2. Indexing

3. Queries

4. Performance

5. Security

Kesimpulan

Artikel Terkait

Web Performance Optimization: Panduan Lengkap untuk Website yang Cepat

GraphQL API Development: Panduan Lengkap dari Basic hingga Advanced

Getting Started with Next.js - Panduan Lengkap untuk Pemula

Mari Mulai Kolaborasi.