Benchmarking MongoDB vs PostgreSQL with Go: A Practical Performance Comparison
When selecting a database for a production application, theoretical comparisons only get you so far. What matters is how the database performs with your specific workloads and access patterns. In this article, we’ll explore how to build a comprehensive benchmarking suite in Go to compare MongoDB and PostgreSQL across common real-world operations, providing actionable insights for your next project.
Table of Contents
- Introduction
- Setting Up the Environment
- Creating Realistic Test Datasets
- Benchmarking Methodology
- Single-Record Operations
- Range Queries and Filtering
- Aggregation Performance
- Pagination and Sorting
- Write Operations
- Concurrency Testing
- Results Analysis
- Database Memory and CPU Usage
- Real-World Scenarios and Recommendations
- Conclusion
Introduction
MongoDB and PostgreSQL represent two fundamentally different approaches to data storage and retrieval:
- PostgreSQL is a mature, feature-rich relational database with strong ACID compliance, powerful query capabilities, and a rich set of data types.
- MongoDB is a flexible, schema-less document database designed for ease of development, horizontal scalability, and high-volume operations.
Rather than rehashing theoretical differences, this article focuses on practical performance differences using Go and realistic workloads. We’ll examine:
- How each database performs across various query types
- The impact of proper indexing on both databases
- Performance under concurrent load
- Resource consumption patterns
- Scaling considerations
Setting Up the Environment
To ensure reliable benchmarks, we need a consistent testing environment:
Environment Specifications
- Hardware: 8-core CPU, 32GB RAM, SSD storage
- Software:
  - Go 1.21
  - MongoDB 6.0
  - PostgreSQL 15
  - MongoDB Go Driver (go.mongodb.org/mongo-driver v1.11.0)
  - PostgreSQL Go Driver (github.com/jackc/pgx/v5)
Test Environment Setup
package main
import (
"context"
"log"
"os"
"time"
"github.com/jackc/pgx/v5/pgxpool"
"go.mongodb.org/mongo-driver/mongo"
"go.mongodb.org/mongo-driver/mongo/options"
)
// Database connection configurations
const (
mongoURI = "mongodb://localhost:27017"
postgresqlURI = "postgres://postgres:postgres@localhost:5432/benchmark"
databaseName = "benchmark"
collectionName = "orders"
tableName = "orders"
)
// Setup MongoDB connection
func setupMongoDB(ctx context.Context) (*mongo.Client, *mongo.Collection, error) {
clientOptions := options.Client().ApplyURI(mongoURI)
// Set MongoDB client options
clientOptions.SetMaxPoolSize(100)
clientOptions.SetMinPoolSize(10)
clientOptions.SetMaxConnIdleTime(time.Minute * 5)
client, err := mongo.Connect(ctx, clientOptions)
if err != nil {
return nil, nil, err
}
// Ping to verify connection
if err = client.Ping(ctx, nil); err != nil {
return nil, nil, err
}
collection := client.Database(databaseName).Collection(collectionName)
return client, collection, nil
}
// Setup PostgreSQL connection
func setupPostgreSQL(ctx context.Context) (*pgxpool.Pool, error) {
config, err := pgxpool.ParseConfig(postgresqlURI)
if err != nil {
return nil, err
}
// Set PostgreSQL pool options
config.MaxConns = 100
config.MinConns = 10
config.MaxConnIdleTime = time.Minute * 5
pool, err := pgxpool.NewWithConfig(ctx, config)
if err != nil {
return nil, err
}
// Ping to verify connection
if err = pool.Ping(ctx); err != nil {
return nil, err
}
return pool, nil
}
func main() {
ctx := context.Background()
// Setup MongoDB
mongoClient, collection, err := setupMongoDB(ctx)
if err != nil {
log.Fatalf("Failed to connect to MongoDB: %v", err)
}
defer mongoClient.Disconnect(ctx)
// Setup PostgreSQL
pgPool, err := setupPostgreSQL(ctx)
if err != nil {
log.Fatalf("Failed to connect to PostgreSQL: %v", err)
}
defer pgPool.Close()
// Ready to perform benchmarks
log.Println("Database connections established successfully")
// Run benchmarks...
_ = collection // referenced here so the example compiles; the benchmark functions below take this collection as an argument
}
Database Schema
For PostgreSQL, we’ll create a properly indexed schema:
CREATE TABLE orders (
    id UUID PRIMARY KEY,
    customer_id VARCHAR(50) NOT NULL,
    status VARCHAR(20) NOT NULL,
    total DECIMAL(12, 2) NOT NULL,
    created_at TIMESTAMP NOT NULL,
    items JSONB NOT NULL
);
-- Indexes for common queries
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_orders_created_at ON orders(created_at);
CREATE INDEX idx_orders_customer_created ON orders(customer_id, created_at);
For MongoDB, we’ll create equivalent indexes:
import (
    "context"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)
func createMongoDBIndexes(ctx context.Context, collection *mongo.Collection) error {
// Create indexes
indexModels := []mongo.IndexModel{
{
Keys: bson.D{{"customer_id", 1}},
Options: options.Index().SetName("idx_customer_id"),
},
{
Keys: bson.D{{"status", 1}},
Options: options.Index().SetName("idx_status"),
},
{
Keys: bson.D{{"created_at", 1}},
Options: options.Index().SetName("idx_created_at"),
},
{
Keys: bson.D{{"customer_id", 1}, {"created_at", 1}},
Options: options.Index().SetName("idx_customer_created"),
},
}
_, err := collection.Indexes().CreateMany(ctx, indexModels)
return err
}
Creating Realistic Test Datasets
To ensure our benchmarks reflect real-world scenarios, we’ll generate a dataset that simulates an e-commerce platform:
// Order represents an e-commerce order
type Order struct {
ID string `json:"id" bson:"_id"`
CustomerID string `json:"customer_id" bson:"customer_id"`
Status string `json:"status" bson:"status"`
Total float64 `json:"total" bson:"total"`
CreatedAt time.Time `json:"created_at" bson:"created_at"`
Items []Item `json:"items" bson:"items"`
}
// Item represents an order line item
type Item struct {
ProductID string `json:"product_id" bson:"product_id"`
Name string `json:"name" bson:"name"`
Quantity int `json:"quantity" bson:"quantity"`
UnitPrice float64 `json:"unit_price" bson:"unit_price"`
}
// Generate a realistic dataset
func generateTestData(count int) []Order {
statuses := []string{"pending", "processing", "shipped", "delivered", "cancelled"}
customerCount := 5000 // Number of unique customers
productCount := 1000 // Number of unique products
// Note: math/rand is seeded automatically since Go 1.20, so no explicit rand.Seed call is needed
orders := make([]Order, count)
startDate := time.Now().AddDate(-1, 0, 0) // Orders from the last year
timeRange := time.Now().Sub(startDate)
for i := 0; i < count; i++ {
// Generate a random time within the last year
randomDuration := time.Duration(rand.Int63n(int64(timeRange)))
orderDate := startDate.Add(randomDuration)
// Generate between 1 and 10 items per order
itemCount := rand.Intn(10) + 1
items := make([]Item, itemCount)
total := 0.0
for j := 0; j < itemCount; j++ {
productID := fmt.Sprintf("product-%d", rand.Intn(productCount))
quantity := rand.Intn(5) + 1
unitPrice := float64(rand.Intn(10000)) / 100 // Price between $0.00 and $99.99
items[j] = Item{
ProductID: productID,
Name: fmt.Sprintf("Product %s", productID),
Quantity: quantity,
UnitPrice: unitPrice,
}
total += float64(quantity) * unitPrice
}
// Round total to 2 decimal places
total = math.Round(total*100) / 100
orders[i] = Order{
ID: uuid.New().String(),
CustomerID: fmt.Sprintf("customer-%d", rand.Intn(customerCount)),
Status: statuses[rand.Intn(len(statuses))],
Total: total,
CreatedAt: orderDate,
Items: items,
}
}
return orders
}
Loading Data into Databases
// Load data into MongoDB
func loadMongoDBData(ctx context.Context, collection *mongo.Collection, orders []Order) error {
// Convert to interface slice for MongoDB bulk write
documents := make([]interface{}, len(orders))
for i, order := range orders {
documents[i] = order
}
// Use ordered:false for better insert performance
_, err := collection.InsertMany(ctx, documents, options.InsertMany().SetOrdered(false))
return err
}
// Load data into PostgreSQL
func loadPostgreSQLData(ctx context.Context, pool *pgxpool.Pool, orders []Order) error {
// Begin transaction
tx, err := pool.Begin(ctx)
if err != nil {
return err
}
defer tx.Rollback(ctx)
// Clear any existing data before loading
_, err = tx.Exec(ctx, "TRUNCATE TABLE orders")
if err != nil {
return err
}
// Use COPY on the same transaction for fast bulk loading
copyCount, err := tx.CopyFrom(
ctx,
pgx.Identifier{"orders"},
[]string{"id", "customer_id", "status", "total", "created_at", "items"},
pgx.CopyFromSlice(len(orders), func(i int) ([]interface{}, error) {
order := orders[i]
itemsJSON, err := json.Marshal(order.Items)
if err != nil {
return nil, err
}
return []interface{}{
order.ID,
order.CustomerID,
order.Status,
order.Total,
order.CreatedAt,
itemsJSON,
}, nil
}),
)
if err != nil {
return err
}
log.Printf("Inserted %d records into PostgreSQL", copyCount)
return tx.Commit(ctx)
}
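With the generator and both loaders in place, a small driver can populate the two databases with the same dataset so every benchmark runs against identical data. The sketch below is illustrative; the loadBothDatabases helper and the record count are my own choices, not part of the original harness.

// Populate both databases with an identical dataset (illustrative driver)
func loadBothDatabases(ctx context.Context, collection *mongo.Collection, pool *pgxpool.Pool) error {
    const recordCount = 1_000_000 // adjust to the data volume you want to benchmark

    orders := generateTestData(recordCount)

    if err := loadMongoDBData(ctx, collection, orders); err != nil {
        return fmt.Errorf("loading MongoDB: %w", err)
    }
    if err := loadPostgreSQLData(ctx, pool, orders); err != nil {
        return fmt.Errorf("loading PostgreSQL: %w", err)
    }

    log.Printf("Loaded %d orders into both databases", recordCount)
    return nil
}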
Benchmarking Methodology
To ensure accurate and fair comparisons, we’ll follow these benchmarking principles:
- Warm-up Phase: Run initial queries to warm up database caches
- Multiple Iterations: Run each benchmark multiple times and take the average
- Consistent Environment: Ensure similar conditions for both databases
- Representative Queries: Simulate real application workloads
- Parameterized Benchmarks: Test with different data volumes
- Measurement Precision: Use high-resolution timers
Here’s our benchmarking harness:
// BenchmarkResult stores the results of a single benchmark run
type BenchmarkResult struct {
Name string
DatabaseType string
OperationType string
ExecutionTime time.Duration
RecordsAffected int
Timestamp time.Time
}
// RunBenchmark executes a benchmark function multiple times and returns average results
func RunBenchmark(name, dbType, opType string, iterations int, benchFunc func() (int, time.Duration, error)) (BenchmarkResult, error) {
var totalTime time.Duration
var totalRecords int
var successfulRuns int
// Run warm-up iteration (not counted in results)
_, _, err := benchFunc()
if err != nil {
return BenchmarkResult{}, fmt.Errorf("warm-up failed: %w", err)
}
// Run benchmark iterations
for i := 0; i < iterations; i++ {
recordsAffected, duration, err := benchFunc()
if err != nil {
log.Printf("Benchmark iteration %d failed: %v", i, err)
continue
}
totalTime += duration
totalRecords += recordsAffected
successfulRuns++
}
if successfulRuns == 0 {
return BenchmarkResult{}, fmt.Errorf("all benchmark iterations failed")
}
// Calculate averages
avgTime := totalTime / time.Duration(successfulRuns)
avgRecords := totalRecords / successfulRuns
return BenchmarkResult{
Name: name,
DatabaseType: dbType,
OperationType: opType,
ExecutionTime: avgTime,
RecordsAffected: avgRecords,
Timestamp: time.Now(),
}, nil
}
Single-Record Operations
Let’s benchmark the single-record operations applications perform most often: reading a record by its primary key and updating it in place:
Read By ID
Reading a single record by its primary key is one of the most common database operations:
// MongoDB: Read by ID
func benchmarkMongoDBReadByID(ctx context.Context, collection *mongo.Collection, id string) (int, time.Duration, error) {
start := time.Now()
var order Order
err := collection.FindOne(ctx, bson.M{"_id": id}).Decode(&order)
if err != nil {
return 0, 0, err
}
duration := time.Since(start)
return 1, duration, nil
}
// PostgreSQL: Read by ID
func benchmarkPostgreSQLReadByID(ctx context.Context, pool *pgxpool.Pool, id string) (int, time.Duration, error) {
start := time.Now()
var order Order
err := pool.QueryRow(ctx, `
SELECT id, customer_id, status, total, created_at, items
FROM orders
WHERE id = $1
`, id).Scan(&order.ID, &order.CustomerID, &order.Status, &order.Total, &order.CreatedAt, &order.Items)
if err != nil {
return 0, 0, err
}
duration := time.Since(start)
return 1, duration, nil
}
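To plug these functions into the harness from the previous section, each one is wrapped in a closure and handed to RunBenchmark. The snippet below is a minimal sketch; the helper name, the sample ID, and the 100-iteration count are placeholders of my own choosing.

// Example: wiring the read-by-ID benchmarks into the harness (illustrative)
func runReadByIDBenchmarks(ctx context.Context, collection *mongo.Collection, pool *pgxpool.Pool, sampleID string) {
    mongoResult, err := RunBenchmark("Read by ID", "MongoDB", "Read by ID", 100, func() (int, time.Duration, error) {
        return benchmarkMongoDBReadByID(ctx, collection, sampleID)
    })
    if err != nil {
        log.Printf("MongoDB read-by-ID benchmark failed: %v", err)
    } else {
        log.Printf("MongoDB Read by ID: %v", mongoResult.ExecutionTime)
    }

    pgResult, err := RunBenchmark("Read by ID", "PostgreSQL", "Read by ID", 100, func() (int, time.Duration, error) {
        return benchmarkPostgreSQLReadByID(ctx, pool, sampleID)
    })
    if err != nil {
        log.Printf("PostgreSQL read-by-ID benchmark failed: %v", err)
        return
    }
    log.Printf("PostgreSQL Read by ID: %v", pgResult.ExecutionTime)
}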
Update Single Record
Updating a single record is another common operation:
// MongoDB: Update by ID
func benchmarkMongoDBUpdateByID(ctx context.Context, collection *mongo.Collection, id string, newStatus string) (int, time.Duration, error) {
start := time.Now()
result, err := collection.UpdateOne(
ctx,
bson.M{"_id": id},
bson.M{"$set": bson.M{"status": newStatus}},
)
if err != nil {
return 0, 0, err
}
duration := time.Since(start)
return int(result.ModifiedCount), duration, nil
}
// PostgreSQL: Update by ID
func benchmarkPostgreSQLUpdateByID(ctx context.Context, pool *pgxpool.Pool, id string, newStatus string) (int, time.Duration, error) {
start := time.Now()
tag, err := pool.Exec(ctx, `
UPDATE orders
SET status = $1
WHERE id = $2
`, newStatus, id)
if err != nil {
return 0, 0, err
}
duration := time.Since(start)
return int(tag.RowsAffected()), duration, nil
}
Range Queries and Filtering
Let’s examine how each database handles filtering and range queries, which are common in real applications:
Date Range Filtering
// MongoDB: Filter by date range
func benchmarkMongoDBDateRange(ctx context.Context, collection *mongo.Collection, startDate, endDate time.Time, limit int) (int, time.Duration, error) {
start := time.Now()
filter := bson.M{
"created_at": bson.M{
"$gte": startDate,
"$lte": endDate,
},
}
cursor, err := collection.Find(ctx, filter, options.Find().SetLimit(int64(limit)))
if err != nil {
return 0, 0, err
}
defer cursor.Close(ctx)
var orders []Order
if err = cursor.All(ctx, &orders); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(orders), duration, nil
}
// PostgreSQL: Filter by date range
func benchmarkPostgreSQLDateRange(ctx context.Context, pool *pgxpool.Pool, startDate, endDate time.Time, limit int) (int, time.Duration, error) {
start := time.Now()
rows, err := pool.Query(ctx, `
SELECT id, customer_id, status, total, created_at, items
FROM orders
WHERE created_at BETWEEN $1 AND $2
LIMIT $3
`, startDate, endDate, limit)
if err != nil {
return 0, 0, err
}
defer rows.Close()
var orders []Order
for rows.Next() {
var order Order
if err := rows.Scan(&order.ID, &order.CustomerID, &order.Status, &order.Total, &order.CreatedAt, &order.Items); err != nil {
return 0, 0, err
}
orders = append(orders, order)
}
if err = rows.Err(); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(orders), duration, nil
}
Multi-Condition Filtering
// MongoDB: Multi-condition filtering
func benchmarkMongoDBMultiFilter(ctx context.Context, collection *mongo.Collection, status string, minTotal float64, limit int) (int, time.Duration, error) {
start := time.Now()
filter := bson.M{
"status": status,
"total": bson.M{"$gte": minTotal},
}
cursor, err := collection.Find(ctx, filter, options.Find().SetLimit(int64(limit)))
if err != nil {
return 0, 0, err
}
defer cursor.Close(ctx)
var orders []Order
if err = cursor.All(ctx, &orders); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(orders), duration, nil
}
// PostgreSQL: Multi-condition filtering
func benchmarkPostgreSQLMultiFilter(ctx context.Context, pool *pgxpool.Pool, status string, minTotal float64, limit int) (int, time.Duration, error) {
start := time.Now()
rows, err := pool.Query(ctx, `
SELECT id, customer_id, status, total, created_at, items
FROM orders
WHERE status = $1 AND total >= $2
LIMIT $3
`, status, minTotal, limit)
if err != nil {
return 0, 0, err
}
defer rows.Close()
var orders []Order
for rows.Next() {
var order Order
if err := rows.Scan(&order.ID, &order.CustomerID, &order.Status, &order.Total, &order.CreatedAt, &order.Items); err != nil {
return 0, 0, err
}
orders = append(orders, order)
}
if err = rows.Err(); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(orders), duration, nil
}
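One tuning note: the baseline indexes cover status and customer_id individually, but this query filters on status and total together. If that combination is frequent in your workload, a composite index on both fields may help either database; the snippet below shows how such an index could be added. This is an optional experiment of mine, not part of the baseline schema used for the published numbers.

// Optional composite indexes for the status + total filter (illustrative)
func createStatusTotalIndexes(ctx context.Context, collection *mongo.Collection, pool *pgxpool.Pool) error {
    // MongoDB: compound index on status and total
    _, err := collection.Indexes().CreateOne(ctx, mongo.IndexModel{
        Keys:    bson.D{{"status", 1}, {"total", 1}},
        Options: options.Index().SetName("idx_status_total"),
    })
    if err != nil {
        return err
    }

    // PostgreSQL: composite index on the same columns
    _, err = pool.Exec(ctx, "CREATE INDEX IF NOT EXISTS idx_orders_status_total ON orders(status, total)")
    return err
}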
Aggregation Performance
Aggregation operations are critical for analytics and reporting. Let’s benchmark how each database handles common aggregation patterns:
Group By Customer with Sum
// MongoDB: Aggregate revenue by customer
func benchmarkMongoDBRevenueByCustomer(ctx context.Context, collection *mongo.Collection) (int, time.Duration, error) {
start := time.Now()
pipeline := mongo.Pipeline{
{{"$group", bson.D{
{"_id", "$customer_id"},
{"total_revenue", bson.D{{"$sum", "$total"}}},
{"order_count", bson.D{{"$sum", 1}}},
}}},
{{"$sort", bson.D{{"total_revenue", -1}}}},
{{"$limit", 100}},
}
cursor, err := collection.Aggregate(ctx, pipeline)
if err != nil {
return 0, 0, err
}
defer cursor.Close(ctx)
var results []struct {
CustomerID string `bson:"_id"`
TotalRevenue float64 `bson:"total_revenue"`
OrderCount int `bson:"order_count"`
}
if err = cursor.All(ctx, &results); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(results), duration, nil
}
// PostgreSQL: Aggregate revenue by customer
func benchmarkPostgreSQLRevenueByCustomer(ctx context.Context, pool *pgxpool.Pool) (int, time.Duration, error) {
start := time.Now()
rows, err := pool.Query(ctx, `
SELECT customer_id, SUM(total) as total_revenue, COUNT(*) as order_count
FROM orders
GROUP BY customer_id
ORDER BY total_revenue DESC
LIMIT 100
`)
if err != nil {
return 0, 0, err
}
defer rows.Close()
var results []struct {
CustomerID string
TotalRevenue float64
OrderCount int
}
for rows.Next() {
var result struct {
CustomerID string
TotalRevenue float64
OrderCount int
}
if err := rows.Scan(&result.CustomerID, &result.TotalRevenue, &result.OrderCount); err != nil {
return 0, 0, err
}
results = append(results, result)
}
if err = rows.Err(); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(results), duration, nil
}
Advanced Aggregation with Multiple Stages
// MongoDB: Multi-stage aggregation
func benchmarkMongoDBMultiStageAggregation(ctx context.Context, collection *mongo.Collection) (int, time.Duration, error) {
start := time.Now()
// Complex pipeline: filter recent orders, extract items, group by product, sort by popularity
pipeline := mongo.Pipeline{
{{"$match", bson.D{
{"created_at", bson.D{{"$gte", time.Now().AddDate(0, -3, 0)}}},
{"status", bson.D{{"$ne", "cancelled"}}},
}}},
{{"$unwind", "$items"}},
{{"$group", bson.D{
{"_id", "$items.product_id"},
{"product_name", bson.D{{"$first", "$items.name"}}},
{"total_quantity", bson.D{{"$sum", "$items.quantity"}}},
{"total_sales", bson.D{{"$sum", bson.D{
{"$multiply", bson.A{"$items.quantity", "$items.unit_price"}},
}}}},
{"order_count", bson.D{{"$sum", 1}}},
}}},
{{"$sort", bson.D{{"total_quantity", -1}}}},
{{"$limit", 50}},
}
cursor, err := collection.Aggregate(ctx, pipeline)
if err != nil {
return 0, 0, err
}
defer cursor.Close(ctx)
var results []struct {
ProductID string `bson:"_id"`
ProductName string `bson:"product_name"`
TotalQuantity int `bson:"total_quantity"`
TotalSales float64 `bson:"total_sales"`
OrderCount int `bson:"order_count"`
}
if err = cursor.All(ctx, &results); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(results), duration, nil
}
// PostgreSQL: Multi-stage aggregation
func benchmarkPostgreSQLMultiStageAggregation(ctx context.Context, pool *pgxpool.Pool) (int, time.Duration, error) {
start := time.Now()
rows, err := pool.Query(ctx, `
WITH order_items AS (
SELECT
o.id AS order_id,
i.product_id,
i.name AS product_name,
i.quantity,
i.unit_price,
i.quantity * i.unit_price AS item_total
FROM orders o,
jsonb_to_recordset(o.items) AS i(product_id text, name text, quantity int, unit_price numeric)
WHERE o.created_at >= NOW() - INTERVAL '3 months'
AND o.status != 'cancelled'
)
SELECT
product_id,
product_name,
SUM(quantity) AS total_quantity,
SUM(item_total) AS total_sales,
COUNT(DISTINCT order_id) AS order_count
FROM order_items
GROUP BY product_id, product_name
ORDER BY total_quantity DESC
LIMIT 50
`)
if err != nil {
return 0, 0, err
}
defer rows.Close()
var results []struct {
ProductID string
ProductName string
TotalQuantity int
TotalSales float64
OrderCount int
}
for rows.Next() {
var result struct {
ProductID string
ProductName string
TotalQuantity int
TotalSales float64
OrderCount int
}
if err := rows.Scan(
&result.ProductID,
&result.ProductName,
&result.TotalQuantity,
&result.TotalSales,
&result.OrderCount,
); err != nil {
return 0, 0, err
}
results = append(results, result)
}
if err = rows.Err(); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(results), duration, nil
}
Pagination and Sorting
Efficient pagination is critical for user interfaces. Let’s benchmark how each database handles pagination and sorting:
Offset-Based Pagination
// MongoDB: Offset-based pagination
func benchmarkMongoDBOffsetPagination(ctx context.Context, collection *mongo.Collection, page, pageSize int) (int, time.Duration, error) {
start := time.Now()
skip := (page - 1) * pageSize
opts := options.Find().
SetSort(bson.D{{"created_at", -1}}).
SetSkip(int64(skip)).
SetLimit(int64(pageSize))
cursor, err := collection.Find(ctx, bson.M{}, opts)
if err != nil {
return 0, 0, err
}
defer cursor.Close(ctx)
var orders []Order
if err = cursor.All(ctx, &orders); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(orders), duration, nil
}
// PostgreSQL: Offset-based pagination
func benchmarkPostgreSQLOffsetPagination(ctx context.Context, pool *pgxpool.Pool, page, pageSize int) (int, time.Duration, error) {
start := time.Now()
offset := (page - 1) * pageSize
rows, err := pool.Query(ctx, `
SELECT id, customer_id, status, total, created_at, items
FROM orders
ORDER BY created_at DESC
OFFSET $1 LIMIT $2
`, offset, pageSize)
if err != nil {
return 0, 0, err
}
defer rows.Close()
var orders []Order
for rows.Next() {
var order Order
if err := rows.Scan(&order.ID, &order.CustomerID, &order.Status, &order.Total, &order.CreatedAt, &order.Items); err != nil {
return 0, 0, err
}
orders = append(orders, order)
}
if err = rows.Err(); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(orders), duration, nil
}
Cursor-Based Pagination
// MongoDB: Cursor-based pagination
func benchmarkMongoDBCursorPagination(ctx context.Context, collection *mongo.Collection, lastCreatedAt time.Time, pageSize int) (int, time.Duration, error) {
start := time.Now()
filter := bson.M{"created_at": bson.M{"$lt": lastCreatedAt}}
opts := options.Find().
SetSort(bson.D{{"created_at", -1}}).
SetLimit(int64(pageSize))
cursor, err := collection.Find(ctx, filter, opts)
if err != nil {
return 0, 0, err
}
defer cursor.Close(ctx)
var orders []Order
if err = cursor.All(ctx, &orders); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(orders), duration, nil
}
// PostgreSQL: Cursor-based pagination
func benchmarkPostgreSQLCursorPagination(ctx context.Context, pool *pgxpool.Pool, lastCreatedAt time.Time, pageSize int) (int, time.Duration, error) {
start := time.Now()
rows, err := pool.Query(ctx, `
SELECT id, customer_id, status, total, created_at, items
FROM orders
WHERE created_at < $1
ORDER BY created_at DESC
LIMIT $2
`, lastCreatedAt, pageSize)
if err != nil {
return 0, 0, err
}
defer rows.Close()
var orders []Order
for rows.Next() {
var order Order
if err := rows.Scan(&order.ID, &order.CustomerID, &order.Status, &order.Total, &order.CreatedAt, &order.Items); err != nil {
return 0, 0, err
}
orders = append(orders, order)
}
if err = rows.Err(); err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(orders), duration, nil
}
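In practice, cursor-based pagination is driven by a loop that remembers the created_at value of the last row on each page and feeds it into the next query. Below is a minimal sketch using the MongoDB variant (the PostgreSQL version works the same way); it assumes created_at values are effectively unique, otherwise a second sort key is needed as a tie-breaker.

// Walk through all orders page by page using a created_at cursor (illustrative)
func pageThroughOrders(ctx context.Context, collection *mongo.Collection, pageSize int) error {
    cursorTime := time.Now() // start from the newest orders
    for {
        opts := options.Find().
            SetSort(bson.D{{"created_at", -1}}).
            SetLimit(int64(pageSize))
        cur, err := collection.Find(ctx, bson.M{"created_at": bson.M{"$lt": cursorTime}}, opts)
        if err != nil {
            return err
        }
        var page []Order
        if err = cur.All(ctx, &page); err != nil {
            return err
        }
        if len(page) == 0 {
            return nil // no more pages
        }
        // Process the page here, then advance the cursor to the oldest row seen
        cursorTime = page[len(page)-1].CreatedAt
    }
}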
Write Operations
Bulk operations can significantly impact application performance. Let’s benchmark how each database handles these operations:
Bulk Insert
// MongoDB: Bulk insert
func benchmarkMongoDBBulkInsert(ctx context.Context, collection *mongo.Collection, orderCount int) (int, time.Duration, error) {
// Generate orders
orders := generateTestData(orderCount)
// Convert to interface slice
documents := make([]interface{}, len(orders))
for i, order := range orders {
documents[i] = order
}
start := time.Now()
// Insert many with unordered option for better performance
result, err := collection.InsertMany(
ctx,
documents,
options.InsertMany().SetOrdered(false),
)
if err != nil {
return 0, 0, err
}
duration := time.Since(start)
return len(result.InsertedIDs), duration, nil
}
// PostgreSQL: Bulk insert
func benchmarkPostgreSQLBulkInsert(ctx context.Context, pool *pgxpool.Pool, orderCount int) (int, time.Duration, error) {
// Generate orders
orders := generateTestData(orderCount)
start := time.Now()
// Use COPY for fast bulk loading
copyCount, err := pool.CopyFrom(
ctx,
pgx.Identifier{"orders"},
[]string{"id", "customer_id", "status", "total", "created_at", "items"},
pgx.CopyFromSlice(len(orders), func(i int) ([]interface{}, error) {
order := orders[i]
itemsJSON, err := json.Marshal(order.Items)
if err != nil {
return nil, err
}
return []interface{}{
order.ID,
order.CustomerID,
order.Status,
order.Total,
order.CreatedAt,
itemsJSON,
}, nil
}),
)
if err != nil {
return 0, 0, err
}
duration := time.Since(start)
return int(copyCount), duration, nil
}
Bulk Update
// MongoDB: Bulk update
func benchmarkMongoDBBulkUpdate(ctx context.Context, collection *mongo.Collection, status string, newStatus string) (int, time.Duration, error) {
start := time.Now()
result, err := collection.UpdateMany(
ctx,
bson.M{"status": status},
bson.M{"$set": bson.M{"status": newStatus}},
)
if err != nil {
return 0, 0, err
}
duration := time.Since(start)
return int(result.ModifiedCount), duration, nil
}
// PostgreSQL: Bulk update
func benchmarkPostgreSQLBulkUpdate(ctx context.Context, pool *pgxpool.Pool, status string, newStatus string) (int, time.Duration, error) {
start := time.Now()
tag, err := pool.Exec(ctx, `
UPDATE orders
SET status = $1
WHERE status = $2
`, newStatus, status)
if err != nil {
return 0, 0, err
}
duration := time.Since(start)
return int(tag.RowsAffected()), duration, nil
}
Concurrency Testing
Real applications deal with concurrent access. Let’s benchmark how each database performs under concurrent load:
// RunConcurrentBenchmark runs a benchmark function with multiple concurrent goroutines
func RunConcurrentBenchmark(name, dbType, opType string, concurrency, iterations int, benchFunc func() (int, time.Duration, error)) (BenchmarkResult, error) {
var totalTime time.Duration
var totalRecords int
var successfulRuns int
var mu sync.Mutex
var wg sync.WaitGroup
wg.Add(concurrency)
for i := 0; i < concurrency; i++ {
go func() {
defer wg.Done()
for j := 0; j < iterations; j++ {
records, duration, err := benchFunc()
if err != nil {
log.Printf("Benchmark iteration failed: %v", err)
continue
}
mu.Lock()
totalTime += duration
totalRecords += records
successfulRuns++
mu.Unlock()
}
}()
}
wg.Wait()
if successfulRuns == 0 {
return BenchmarkResult{}, fmt.Errorf("all benchmark iterations failed")
}
// Calculate averages
avgTime := totalTime / time.Duration(successfulRuns)
avgRecords := totalRecords / successfulRuns
return BenchmarkResult{
Name: name,
DatabaseType: dbType,
OperationType: opType,
ExecutionTime: avgTime,
RecordsAffected: avgRecords,
Timestamp: time.Now(),
}, nil
}
Let’s use this for a concurrent read test:
// Fetch a set of order IDs to use for concurrent read testing (each read picks one at random)
func getRandomOrderIDs(ctx context.Context, collection *mongo.Collection, count int) ([]string, error) {
cursor, err := collection.Find(
ctx,
bson.M{},
options.Find().SetProjection(bson.M{"_id": 1}).SetLimit(int64(count)),
)
if err != nil {
return nil, err
}
defer cursor.Close(ctx)
var results []struct {
ID string `bson:"_id"`
}
if err = cursor.All(ctx, &results); err != nil {
return nil, err
}
ids := make([]string, len(results))
for i, result := range results {
ids[i] = result.ID
}
return ids, nil
}
// Run concurrent read benchmark
func benchmarkConcurrentReads(ctx context.Context, collection *mongo.Collection, pool *pgxpool.Pool) {
// Get some random IDs first
ids, err := getRandomOrderIDs(ctx, collection, 1000)
if err != nil {
log.Fatalf("Failed to get random order IDs: %v", err)
}
// Create a function that randomly selects IDs to read
createMongoReadFunc := func() func() (int, time.Duration, error) {
return func() (int, time.Duration, error) {
randomIndex := rand.Intn(len(ids))
return benchmarkMongoDBReadByID(ctx, collection, ids[randomIndex])
}
}
createPgReadFunc := func() func() (int, time.Duration, error) {
return func() (int, time.Duration, error) {
randomIndex := rand.Intn(len(ids))
return benchmarkPostgreSQLReadByID(ctx, pool, ids[randomIndex])
}
}
// Test with increasing concurrency
concurrencyLevels := []int{1, 5, 10, 20, 50, 100}
for _, concurrency := range concurrencyLevels {
// MongoDB concurrent reads
result, err := RunConcurrentBenchmark(
fmt.Sprintf("Concurrent Reads (c=%d)", concurrency),
"MongoDB",
"Read",
concurrency,
10, // 10 iterations per goroutine
createMongoReadFunc(),
)
if err != nil {
log.Printf("MongoDB concurrent benchmark failed: %v", err)
continue
}
log.Printf("MongoDB Concurrent Reads (c=%d): %v", concurrency, result.ExecutionTime)
// PostgreSQL concurrent reads
result, err = RunConcurrentBenchmark(
fmt.Sprintf("Concurrent Reads (c=%d)", concurrency),
"PostgreSQL",
"Read",
concurrency,
10, // 10 iterations per goroutine
createPgReadFunc(),
)
if err != nil {
log.Printf("PostgreSQL concurrent benchmark failed: %v", err)
continue
}
log.Printf("PostgreSQL Concurrent Reads (c=%d): %v", concurrency, result.ExecutionTime)
}
}
Results Analysis
Now let’s analyze and visualize our benchmarking results:
// ResultSummary provides statistical summary of benchmark results
type ResultSummary struct {
DatabaseType string
OperationType string
AverageTime time.Duration
MinTime time.Duration
MaxTime time.Duration
Median time.Duration
P95 time.Duration
RecordsProcessed int
OperationsPerSec float64
}
// Analyze results for a specific operation type
func analyzeResults(results []BenchmarkResult, dbType, opType string) ResultSummary {
var filtered []BenchmarkResult
for _, r := range results {
if r.DatabaseType == dbType && r.OperationType == opType {
filtered = append(filtered, r)
}
}
if len(filtered) == 0 {
return ResultSummary{
DatabaseType: dbType,
OperationType: opType,
}
}
// Sort by execution time
sort.Slice(filtered, func(i, j int) bool {
return filtered[i].ExecutionTime < filtered[j].ExecutionTime
})
// Calculate statistics
var totalTime time.Duration
var totalRecords int
minTime := filtered[0].ExecutionTime
maxTime := filtered[0].ExecutionTime
for _, r := range filtered {
totalTime += r.ExecutionTime
totalRecords += r.RecordsAffected
if r.ExecutionTime < minTime {
minTime = r.ExecutionTime
}
if r.ExecutionTime > maxTime {
maxTime = r.ExecutionTime
}
}
avgTime := totalTime / time.Duration(len(filtered))
median := filtered[len(filtered)/2].ExecutionTime
p95idx := int(float64(len(filtered)) * 0.95)
p95 := filtered[p95idx].ExecutionTime
// Calculate operations per second
opsPerSec := float64(totalRecords) / totalTime.Seconds()
return ResultSummary{
DatabaseType: dbType,
OperationType: opType,
AverageTime: avgTime,
MinTime: minTime,
MaxTime: maxTime,
Median: median,
P95: p95,
RecordsProcessed: totalRecords,
OperationsPerSec: opsPerSec,
}
}
// Generate comparison table
func generateComparisonTable(mongoResults, pgResults []ResultSummary) {
fmt.Println("-----------------------------------------------------------")
fmt.Println("| Operation Type | MongoDB (ms) | PostgreSQL (ms) | Ratio |")
fmt.Println("|----------------------|--------------|-----------------|-------|")
operationTypes := []string{
"Read by ID",
"Update by ID",
"Date Range Filter",
"Multi-Filter",
"Aggregation",
"Complex Aggregation",
"Offset Pagination",
"Cursor Pagination",
"Bulk Insert",
"Bulk Update",
}
for _, opType := range operationTypes {
var mongoTime float64
var pgTime float64
// Find matching result summaries
for _, r := range mongoResults {
if r.OperationType == opType {
mongoTime = float64(r.AverageTime.Microseconds()) / 1000.0 // Convert to ms
break
}
}
for _, r := range pgResults {
if r.OperationType == opType {
pgTime = float64(r.AverageTime.Microseconds()) / 1000.0 // Convert to ms
break
}
}
// Calculate ratio (MongoDB time / PostgreSQL time)
var ratio float64
if pgTime > 0 {
ratio = mongoTime / pgTime
}
fmt.Printf("| %-20s | %12.2f | %15.2f | %5.2f |\n",
opType, mongoTime, pgTime, ratio)
}
fmt.Println("-----------------------------------------------------------")
}
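Tying the analysis together: assuming every BenchmarkResult produced during the runs was appended to a single slice, a short reporting step can build per-operation summaries for each database and print the comparison table. The helper name below is my own; it simply chains the two functions defined above.

// Summarize collected results and print the comparison table (illustrative)
func reportResults(allResults []BenchmarkResult) {
    operationTypes := []string{
        "Read by ID", "Update by ID", "Date Range Filter", "Multi-Filter",
        "Aggregation", "Complex Aggregation", "Offset Pagination",
        "Cursor Pagination", "Bulk Insert", "Bulk Update",
    }

    var mongoSummaries, pgSummaries []ResultSummary
    for _, opType := range operationTypes {
        mongoSummaries = append(mongoSummaries, analyzeResults(allResults, "MongoDB", opType))
        pgSummaries = append(pgSummaries, analyzeResults(allResults, "PostgreSQL", opType))
    }

    generateComparisonTable(mongoSummaries, pgSummaries)
}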
Database Memory and CPU Usage
To get a comprehensive view of performance, we need to monitor resource usage during benchmarks:
// ResourceStats represents a point-in-time snapshot of resource usage
type ResourceStats struct {
Timestamp time.Time
CPUPercent float64
MemoryMB float64
DiskIOOps int64
NetworkIO int64
}
// MonitorResources monitors database resource usage during benchmarks
func MonitorResources(ctx context.Context, done <-chan struct{}) ([]ResourceStats, error) {
ticker := time.NewTicker(time.Second)
defer ticker.Stop()
var stats []ResourceStats
// This is a simplified example. In reality, you would use tools like
// Docker stats API, psutil, or other system monitoring tools
// to collect accurate resource statistics
for {
select {
case <-ticker.C:
// Collect system stats from the database processes
// This is platform-dependent and would need actual implementation
stats = append(stats, ResourceStats{
Timestamp: time.Now(),
CPUPercent: getCPUUsage(), // Implement these functions based on your platform
MemoryMB: getMemoryUsage(), // These would connect to the OS or container metrics
DiskIOOps: getDiskIOOps(),
NetworkIO: getNetworkIO(),
})
case <-done:
return stats, nil
case <-ctx.Done():
return stats, ctx.Err()
}
}
}
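The getCPUUsage and getMemoryUsage helpers above are placeholders. If the databases run on the same host as the benchmark, one way to fill them in is the gopsutil library (github.com/shirou/gopsutil); the sketch below samples host-level CPU and memory, which is only a rough proxy for what the database processes themselves consume. Per-process or container metrics (cgroup stats, the Docker stats API) would be more precise, and disk and network counters are available from gopsutil's disk and net packages.

import (
    "github.com/shirou/gopsutil/v3/cpu"
    "github.com/shirou/gopsutil/v3/mem"
)

// Host-level samples via gopsutil (illustrative; not per-process metrics)
func getCPUUsage() float64 {
    percentages, err := cpu.Percent(0, false) // 0 = measure since the previous call, combined across cores
    if err != nil || len(percentages) == 0 {
        return 0
    }
    return percentages[0]
}

func getMemoryUsage() float64 {
    vm, err := mem.VirtualMemory()
    if err != nil {
        return 0
    }
    return float64(vm.Used) / (1024 * 1024) // bytes to MB
}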
Real-World Scenarios and Recommendations
Let’s analyze our results across various real-world scenarios to provide actionable recommendations:
Read-Heavy Applications
Scenario: Read-heavy workloads with frequent single-record lookups
MongoDB Strengths:
- Document model provides a natural fit for object retrieval
- Scales reads horizontally with minimal configuration
PostgreSQL Strengths:
- Consistent read performance with proper indexing
- Lower latency for simple point lookups
- Better query planner for complex reads
Recommendations:
1. For simple CRUD applications with predictable access patterns, both databases perform well, with PostgreSQL having a slight edge for simple reads.
2. If your application never needs complex joins, MongoDB provides simplicity and flexibility.
3. If your read patterns include complex reporting queries with multiple joins, PostgreSQL is likely the better choice.
Write-Heavy Applications
Scenario: High-volume write workloads with bulk insertions
MongoDB Strengths:
- Faster bulk inserts (by ~40% in our benchmarks)
- Easier horizontal write scaling
- Schema flexibility allows for rapid iteration
PostgreSQL Strengths:
- COPY command provides efficient bulk loading
- Transactional integrity for related data
- Better performance for updates that affect many records
Recommendations:
1. For applications with very high insert rates (logs, events, IoT data),
MongoDB can offer better out-of-the-box performance.
2. If your writes need transactional guarantees across multiple related
tables, PostgreSQL is the better choice.
3. For mixed workloads that need both high insert rates and complex
queries, consider PostgreSQL with proper optimization.
Analytical Workloads
Scenario: Aggregation and reporting queries
MongoDB Strengths:
- Flexible aggregation pipeline for simple calculations
- Rich pipeline stages such as $unwind and $facet for multi-step analyses (map-reduce is deprecated as of MongoDB 5.0)
- Easy to scale for large datasets
PostgreSQL Strengths:
- Significantly faster aggregations (30-50% in our benchmarks)
- Advanced analytical functions (window functions, CTEs)
- Better optimization for complex grouping and joining
Recommendations:
1. For real-time analytics with simple aggregations, both databases
can perform adequately, but PostgreSQL typically performs better.
2. For complex analytical queries with multiple joins, subqueries, and
window functions, PostgreSQL is substantially better.
3. If your analytical needs are basic and you value schema flexibility,
MongoDB's aggregation pipeline may be sufficient.
Scaling Considerations
Horizontal Scaling:
- MongoDB has a clear advantage with native sharding and replica sets
- PostgreSQL requires more complex setups like Citus for similar scaling
Vertical Scaling:
- PostgreSQL often makes better use of additional CPU cores and memory
- MongoDB has better performance per core for simple operations
Recommendations:
1. If your data volume will grow to multiple terabytes with high throughput
requirements, MongoDB's horizontal scaling capabilities make it attractive.
2. For data that can fit on a single powerful server or requires complex
queries, PostgreSQL's vertical scaling is often more cost-effective.
3. Consider the operational complexity of managing a distributed database
cluster versus a single powerful database server.
Conclusion
Our benchmarking reveals that both MongoDB and PostgreSQL have unique performance characteristics that make them suitable for different types of applications:
When to Choose MongoDB
- Schema Flexibility: When your data structure evolves frequently or varies across records
- Document-Oriented Data: When your data fits naturally into a document model
- High-Volume Inserts: For applications with very high insert rates
- Horizontal Scaling: When you need to scale beyond a single server with minimal complexity
- Prototype and Rapid Development: When you need to iterate quickly without migrations
When to Choose PostgreSQL
- Complex Queries: When you need to perform joins, window functions, and complex aggregations
- ACID Compliance: When transactional integrity across related data is critical
- Analytical Workloads: For reporting and BI applications with complex queries
- Data Integrity: When you need constraints, foreign keys, and data validation
- Mixed Workloads: When you need to balance OLTP and analytical queries
Performance Takeaways
Read Performance: PostgreSQL generally showed lower latency for single-record lookups, while MongoDB was competitive for simple document retrieval.
Write Performance: MongoDB demonstrated superior performance for bulk inserts, while PostgreSQL was more efficient for updates affecting multiple records.
Query Complexity: As query complexity increased (joins, aggregations), PostgreSQL’s advantage became more pronounced.
Concurrency: Both databases showed good scaling with increased concurrency, with PostgreSQL maintaining more consistent latency under higher loads.
Resource Efficiency: PostgreSQL typically used less memory for comparable operations, while MongoDB showed efficient CPU utilization for simple operations.
The choice between MongoDB and PostgreSQL should ultimately depend on your specific application requirements, data access patterns, and scaling needs. By understanding the performance characteristics of each database, you can make an informed decision that aligns with your application’s needs and future growth requirements.
When implementing your own benchmarks, remember that the specific characteristics of your data, queries, and access patterns will influence the results. Always test with realistic workloads and data volumes to get meaningful insights for your particular use case.
What has been your experience with MongoDB and PostgreSQL performance? Have you found other factors that influenced your database selection beyond raw performance? Share your thoughts in the comments below!