MongoDB Database Design: Best Practices untuk Developer
MongoDB adalah document database yang memberikan fleksibilitas tinggi dalam schema design. Namun, fleksibilitas ini juga membutuhkan pemahaman yang baik tentang best practices untuk membangun aplikasi yang performant dan scalable.
Mengapa MongoDB?
Keuntungan MongoDB
- Schema Flexibility: Dynamic schema yang mudah diubah
- Document Model: Natural mapping ke object-oriented programming
- Horizontal Scaling: Built-in sharding support
- Rich Query Language: Powerful aggregation framework
- High Performance: Optimized untuk read-heavy workloads
Kapan Menggunakan MongoDB
- Rapid Development: Ketika schema requirements sering berubah
- Complex Data Structures: Nested objects dan arrays
- Real-time Analytics: Time-series data dan event logging
- Content Management: Flexible content types
- IoT Applications: High-volume, varied data structures
Document Structure Design
1. Embedding vs Referencing
Embedding (Denormalization)
// User dengan embedded posts
{
_id: ObjectId("..."),
name: "John Doe",
email: "john@example.com",
posts: [
{
_id: ObjectId("..."),
title: "My First Post",
content: "This is my first post...",
createdAt: ISODate("2024-01-15")
},
{
_id: ObjectId("..."),
title: "Another Post",
content: "This is another post...",
createdAt: ISODate("2024-01-20")
}
],
profile: {
bio: "Software developer",
location: "Jakarta",
website: "https://johndoe.com"
}
}
Kapan menggunakan Embedding:
- One-to-few relationships
- Data yang sering diakses bersamaan
- Data yang jarang berubah
- Atomic updates diperlukan
Referencing (Normalization)
// User collection
{
_id: ObjectId("507f1f77bcf86cd799439011"),
name: "John Doe",
email: "john@example.com",
profile: {
bio: "Software developer",
location: "Jakarta",
website: "https://johndoe.com"
}
}
// Posts collection
{
_id: ObjectId("507f1f77bcf86cd799439012"),
title: "My First Post",
content: "This is my first post...",
author: ObjectId("507f1f77bcf86cd799439011"), // Reference to user
tags: ["mongodb", "database", "nosql"],
createdAt: ISODate("2024-01-15"),
updatedAt: ISODate("2024-01-15")
}
Kapan menggunakan Referencing:
- One-to-many atau many-to-many relationships
- Data yang sering berubah secara independen
- Large documents (>16MB limit)
- Data yang diakses secara terpisah
2. Schema Design Patterns
Subset Pattern
// Product collection (frequently accessed data)
{
_id: ObjectId("..."),
name: "MacBook Pro",
price: 25000000,
category: "Laptop",
brand: "Apple",
rating: 4.8,
reviewCount: 150,
inStock: true,
// Subset of most recent reviews
recentReviews: [
{
_id: ObjectId("..."),
rating: 5,
comment: "Excellent laptop!",
author: "John Doe",
date: ISODate("2024-02-01")
}
]
}
// Reviews collection (complete data)
{
_id: ObjectId("..."),
productId: ObjectId("..."),
rating: 5,
comment: "Excellent laptop! Great performance and battery life.",
author: {
_id: ObjectId("..."),
name: "John Doe",
verified: true
},
helpful: 25,
date: ISODate("2024-02-01")
}
Extended Reference Pattern
// Order collection
{
_id: ObjectId("..."),
orderNumber: "ORD-2024-001",
customer: {
_id: ObjectId("..."),
name: "John Doe", // Denormalized for quick access
email: "john@example.com"
},
items: [
{
productId: ObjectId("..."),
name: "MacBook Pro", // Denormalized
price: 25000000, // Price at time of order
quantity: 1
}
],
total: 25000000,
status: "shipped",
createdAt: ISODate("2024-02-01")
}
Bucket Pattern (Time Series)
// IoT sensor data bucketed by hour
{
_id: ObjectId("..."),
sensorId: "TEMP_001",
timestamp: ISODate("2024-02-01T10:00:00Z"), // Start of hour
measurements: [
{
timestamp: ISODate("2024-02-01T10:00:00Z"),
temperature: 25.5,
humidity: 60.2
},
{
timestamp: ISODate("2024-02-01T10:01:00Z"),
temperature: 25.7,
humidity: 60.1
}
// ... up to 60 measurements per hour
],
count: 60,
avgTemperature: 25.6,
maxTemperature: 26.1,
minTemperature: 25.1
}
Indexing Strategies
1. Single Field Indexes
// Create index on email field
db.users.createIndex({ email: 1 });
// Create index on createdAt field (descending)
db.posts.createIndex({ createdAt: -1 });
// Create unique index
db.users.createIndex({ email: 1 }, { unique: true });
// Create sparse index (only documents with the field)
db.users.createIndex({ phoneNumber: 1 }, { sparse: true });
2. Compound Indexes
// Compound index for queries on status and createdAt
db.orders.createIndex({ status: 1, createdAt: -1 });
// Index for user's posts sorted by date
db.posts.createIndex({ author: 1, createdAt: -1 });
// Index for geospatial queries
db.stores.createIndex({ location: "2dsphere", category: 1 });
3. Text Indexes
// Create text index for search
db.articles.createIndex(
{
title: "text",
content: "text",
tags: "text",
},
{
weights: {
title: 10,
content: 5,
tags: 1,
},
name: "article_text_index",
}
);
// Search query
db.articles
.find({
$text: {
$search: "mongodb database design",
$caseSensitive: false,
},
})
.sort({ score: { $meta: "textScore" } });
4. Partial Indexes
// Index only active users
db.users.createIndex(
{ email: 1 },
{
partialFilterExpression: { isActive: true },
name: "active_users_email_index",
}
);
// Index only published posts
db.posts.createIndex(
{ createdAt: -1 },
{
partialFilterExpression: { status: "published" },
name: "published_posts_date_index",
}
);
Query Optimization
1. Efficient Query Patterns
// Good: Use indexes effectively
db.posts
.find({ author: ObjectId("..."), status: "published" })
.sort({ createdAt: -1 })
.limit(10);
// Bad: Inefficient regex
db.users.find({ email: /.*@gmail.com$/ });
// Good: Use prefix regex with index
db.users.find({ email: /^john/ });
// Good: Use $in for multiple values
db.posts.find({ status: { $in: ["published", "featured"] } });
// Good: Use projection to limit returned fields
db.users.find({ isActive: true }, { name: 1, email: 1, _id: 0 });
2. Aggregation Pipeline Optimization
// Optimized aggregation pipeline
db.orders.aggregate([
// 1. Match early to reduce documents
{
$match: {
createdAt: {
$gte: ISODate("2024-01-01"),
$lt: ISODate("2024-02-01"),
},
status: "completed",
},
},
// 2. Project only needed fields
{
$project: {
customerId: 1,
total: 1,
createdAt: 1,
},
},
// 3. Group and calculate
{
$group: {
_id: "$customerId",
totalSpent: { $sum: "$total" },
orderCount: { $sum: 1 },
lastOrder: { $max: "$createdAt" },
},
},
// 4. Sort by total spent
{
$sort: { totalSpent: -1 },
},
// 5. Limit results
{
$limit: 100,
},
]);
3. Using Explain
// Analyze query performance
db.posts
.find({ author: ObjectId("...") })
.sort({ createdAt: -1 })
.explain("executionStats");
// Check if index is used
db.posts.find({ title: /mongodb/i }).explain("queryPlanner");
// Analyze aggregation performance
db.orders
.aggregate([
{ $match: { status: "completed" } },
{ $group: { _id: "$customerId", total: { $sum: "$amount" } } },
])
.explain("executionStats");
Data Modeling Examples
1. E-commerce Schema
// Users Collection
{
_id: ObjectId("..."),
email: "john@example.com",
password: "hashed_password",
profile: {
firstName: "John",
lastName: "Doe",
phone: "+62812345678",
dateOfBirth: ISODate("1990-01-15")
},
addresses: [
{
_id: ObjectId("..."),
type: "home",
street: "Jl. Sudirman No. 1",
city: "Jakarta",
postalCode: "10110",
isDefault: true
}
],
preferences: {
newsletter: true,
notifications: {
email: true,
sms: false
}
},
createdAt: ISODate("2024-01-01"),
lastLogin: ISODate("2024-02-10")
}
// Products Collection
{
_id: ObjectId("..."),
sku: "LAPTOP-001",
name: "MacBook Pro 14-inch",
description: "Apple MacBook Pro with M3 chip",
category: {
_id: ObjectId("..."),
name: "Laptops",
path: "Electronics > Computers > Laptops"
},
brand: "Apple",
price: {
amount: 25000000,
currency: "IDR"
},
inventory: {
quantity: 50,
reserved: 5,
available: 45
},
specifications: {
processor: "Apple M3",
memory: "16GB",
storage: "512GB SSD",
display: "14-inch Liquid Retina XDR"
},
images: [
{
url: "https://example.com/images/macbook-1.jpg",
alt: "MacBook Pro front view",
isPrimary: true
}
],
seo: {
slug: "macbook-pro-14-inch-m3",
metaTitle: "MacBook Pro 14-inch with M3 Chip",
metaDescription: "Powerful MacBook Pro with M3 chip..."
},
isActive: true,
createdAt: ISODate("2024-01-01")
}
// Orders Collection
{
_id: ObjectId("..."),
orderNumber: "ORD-2024-001",
customer: {
_id: ObjectId("..."),
email: "john@example.com",
name: "John Doe"
},
items: [
{
productId: ObjectId("..."),
sku: "LAPTOP-001",
name: "MacBook Pro 14-inch",
price: 25000000,
quantity: 1,
subtotal: 25000000
}
],
shipping: {
address: {
street: "Jl. Sudirman No. 1",
city: "Jakarta",
postalCode: "10110"
},
method: "express",
cost: 50000,
trackingNumber: "TRK123456789"
},
payment: {
method: "credit_card",
status: "paid",
transactionId: "TXN123456789",
paidAt: ISODate("2024-02-01T10:30:00Z")
},
totals: {
subtotal: 25000000,
shipping: 50000,
tax: 2505000,
total: 27555000
},
status: "shipped",
statusHistory: [
{
status: "pending",
timestamp: ISODate("2024-02-01T10:00:00Z")
},
{
status: "paid",
timestamp: ISODate("2024-02-01T10:30:00Z")
},
{
status: "shipped",
timestamp: ISODate("2024-02-02T09:00:00Z")
}
],
createdAt: ISODate("2024-02-01T10:00:00Z")
}
2. Social Media Schema
// Users Collection
{
_id: ObjectId("..."),
username: "johndoe",
email: "john@example.com",
profile: {
displayName: "John Doe",
bio: "Software developer passionate about technology",
avatar: "https://example.com/avatars/john.jpg",
website: "https://johndoe.com",
location: "Jakarta, Indonesia"
},
stats: {
postsCount: 150,
followersCount: 1250,
followingCount: 300
},
settings: {
isPrivate: false,
allowMessages: true,
showEmail: false
},
createdAt: ISODate("2023-01-01")
}
// Posts Collection
{
_id: ObjectId("..."),
author: {
_id: ObjectId("..."),
username: "johndoe",
displayName: "John Doe",
avatar: "https://example.com/avatars/john.jpg"
},
content: {
text: "Just finished building an amazing MongoDB schema!",
media: [
{
type: "image",
url: "https://example.com/images/post1.jpg",
alt: "MongoDB schema diagram"
}
]
},
hashtags: ["mongodb", "database", "nosql"],
mentions: [
{
_id: ObjectId("..."),
username: "mongodb"
}
],
stats: {
likesCount: 45,
commentsCount: 12,
sharesCount: 8,
viewsCount: 320
},
// Recent interactions for quick access
recentLikes: [
{
userId: ObjectId("..."),
username: "alice",
timestamp: ISODate("2024-02-10T15:30:00Z")
}
],
isPublic: true,
createdAt: ISODate("2024-02-10T14:00:00Z")
}
// Follows Collection (for scalability)
{
_id: ObjectId("..."),
follower: ObjectId("..."), // User who follows
following: ObjectId("..."), // User being followed
createdAt: ISODate("2024-01-15")
}
// Comments Collection
{
_id: ObjectId("..."),
postId: ObjectId("..."),
author: {
_id: ObjectId("..."),
username: "alice",
displayName: "Alice Smith",
avatar: "https://example.com/avatars/alice.jpg"
},
content: "Great explanation of MongoDB patterns!",
parentId: null, // For nested comments
likesCount: 5,
createdAt: ISODate("2024-02-10T15:00:00Z")
}
Performance Optimization
1. Connection Pooling
// MongoDB connection with proper pooling
const { MongoClient } = require("mongodb");
const client = new MongoClient(uri, {
maxPoolSize: 10, // Maximum connections in pool
minPoolSize: 2, // Minimum connections in pool
maxIdleTimeMS: 30000, // Close connections after 30s of inactivity
serverSelectionTimeoutMS: 5000, // How long to try selecting a server
socketTimeoutMS: 45000, // How long a send or receive on a socket can take
bufferMaxEntries: 0, // Disable mongoose buffering
});
2. Read Preferences
// Read from secondary for analytics queries
db.orders
.find({ status: "completed" })
.readPref("secondary")
.hint({ createdAt: -1 });
// Read from primary for critical data
db.users.findOne({ _id: ObjectId("...") }).readPref("primary");
3. Write Concerns
// High durability for critical operations
db.orders.insertOne(
{
/* order data */
},
{ writeConcern: { w: "majority", j: true } }
);
// Fast writes for logging
db.logs.insertOne(
{
/* log data */
},
{ writeConcern: { w: 1, j: false } }
);
4. Bulk Operations
// Bulk insert for better performance
const bulk = db.products.initializeUnorderedBulkOp();
products.forEach((product) => {
bulk.insert(product);
});
bulk.execute();
// Bulk update operations
const bulkOps = [
{
updateOne: {
filter: { _id: ObjectId("...") },
update: { $set: { status: "active" } },
},
},
{
updateMany: {
filter: { category: "electronics" },
update: { $inc: { views: 1 } },
},
},
];
db.products.bulkWrite(bulkOps);
Monitoring dan Maintenance
1. Database Profiling
// Enable profiling for slow operations
db.setProfilingLevel(2, { slowms: 100 });
// Query profiler collection
db.system.profile
.find({
ts: {
$gte: new Date(Date.now() - 1000 * 60 * 60), // Last hour
},
})
.sort({ ts: -1 });
// Find slow queries
db.system.profile
.find({
millis: { $gt: 1000 },
})
.sort({ ts: -1 });
2. Index Usage Analysis
// Check index usage statistics
db.posts.aggregate([{ $indexStats: {} }]);
// Find unused indexes
db.runCommand({ collStats: "posts", indexDetails: true });
3. Database Statistics
// Collection statistics
db.posts.stats();
// Database statistics
db.stats();
// Server status
db.serverStatus();
Best Practices Summary
1. Schema Design
- Embed untuk one-to-few relationships dan data yang sering diakses bersamaan
- Reference untuk one-to-many dan data yang sering berubah independen
- Gunakan schema patterns yang sesuai dengan use case
- Pertimbangkan document size limit (16MB)
2. Indexing
- Buat index berdasarkan query patterns
- Gunakan compound indexes untuk multi-field queries
- Implementasikan partial indexes untuk subset data
- Monitor index usage dan hapus yang tidak terpakai
3. Queries
- Gunakan projection untuk membatasi field yang dikembalikan
- Implementasikan pagination untuk large result sets
- Optimalkan aggregation pipelines dengan match early
- Gunakan explain() untuk analisis performance
4. Performance
- Konfigurasi connection pooling yang tepat
- Gunakan bulk operations untuk multiple writes
- Implementasikan read preferences sesuai kebutuhan
- Monitor dan profile slow operations
5. Security
- Implementasikan authentication dan authorization
- Gunakan field-level encryption untuk sensitive data
- Konfigurasi network security dengan proper firewall
- Regular backup dan disaster recovery planning
Kesimpulan
MongoDB memberikan fleksibilitas yang besar dalam database design, namun membutuhkan pemahaman yang mendalam tentang data access patterns dan performance characteristics. Dengan menerapkan best practices yang telah dibahas, Anda dapat membangun aplikasi yang scalable dan performant.
Kunci sukses adalah memahami trade-offs antara embedding vs referencing, membuat index yang tepat, dan selalu monitor performance aplikasi di production environment.



