In-depth analysis and engineering practice of MySQL InnoDB MVCC mechanism

1. MVCC Architecture Design and Implementation Principles

1.1 Storage Engine Layer Architecture

InnoDB uses a layered storage architecture to implement the MVCC mechanism:

  1. Memory structure:

    • Buffer Pool: Data page cache pool (default 128MB)
    • Undo Log Buffer: transaction rollback log cache (default 16MB)
    • Change Buffer: Non-unique index update buffer
  2. Disk structure:

    • Clustered index B+ tree (primary key index)
    • Secondary index B+ tree
    • Undo Log segment (rollback segment)
    • Redo Log File Group

InnoDB architecture diagram

1.2 Multi-version data storage

Each data page (16KB) contains a version chain head pointer, and the historical versions are linked in series through DB_ROLL_PTR. Undo Log is managed in segment storage, and each rollback segment contains 1024 undo slots.

Undo Log Type:

-- Insert operation log 
UNDO_INSERT = { 
    type: 'INSERT', 
    table_id: 0x0012, 
    old_data: null 
} 

-- Update operation log 
UNDO_UPDATE = {
     type: 'UPDATE', 
     table_id: 0x0012, 
     old_data: {
        id:1, 
        balance:100
    },
    index_updates: [ {
        index_id: 0x01, 
        old_value: 'Alice'
    } ] 
}

1.3 Transaction ID Management

The transaction system maintains a global transaction ID allocator:

  • The trx_sys structure manages the list of active transactions
  • Transaction ID (trx_id) allocation rules:
    • Read-only transaction: trx_id=0
    • Read-write transaction: Get the incremented value from the global counter
  • Transaction visibility judgment formula:
    visible = (trx_id < view_up_limit_id) && 
           (trx_id == creator_trx_id || 
           trx_id not in active_trx_set)

2. Version chain construction and traversal mechanism

2.1 Line Record Format Analysis

Take the Compact row format as an example:

+--------+---------+---------+---------+-----+
| Header | Col1    | Col2    | TRX_ID  | PTR |
+--------+---------+---------+---------+-----+
| 0x80   | 'Alice' | 100.00  | 0x00001 | 0x7F|
+--------+---------+---------+---------+-----+
  • Header: contains deletion mark and column number information
  • TRX_ID: 6-byte transaction ID (maximum 2^48 transactions)
  • PTR: 7-byte rollback pointer (pointing to UNDO_LOG address)

2.2 Version chain construction process

Transaction update operation process:

  1. Acquire row exclusive lock (X Lock)
  2. Copy the current version to the Undo Log
  3. Modify data and update TRX_ID
  4. Point the old version PTR to the new Undo Log
// InnoDB kernel code snippet(storage/innobase/row/row0upd.cc)
void row_upd_clust_step() {
    undo = trx_undo_report_begin(); // Start recording undo 
    old_rec = btr_copy_record();    // Copy old record 
    row_upd_rec(update, rec);       // Update record 
    trx_undo_report_end(undo);      // Submit undo record
}

2.3 Version Chain Traversal Algorithm

Version selection process during query:

  1. Start traversing from the current record
  2. Check TRX_ID visibility for each version
  3. Find the first visible version or reach the end of the chain

Traversal pseudo code:

def find_visible_version(row, read_view):
    current = row
    while current:
        if is_visible(current.trx_id, read_view):
            return current
        current = current.roll_ptr
    return None  # No visible version (deleted)

3. Read View Implementation Details

3.1 Snapshot Generation Mechanism

Read View to get the system status when it is created:

  • low_limit_id = trx_sys->max_trx_id
  • up_limit_id = trx_sys->min_active_trx_id
  • active_trx_ids = List of currently active transaction IDs

Memory data structure:

struct read_view_t {
    trx_id_t    low_limit_id;
    trx_id_t    up_limit_id;
    trx_id_t    creator_trx_id;
    ids_t       active_trx_ids; 
    ulint       n_trx_ids;
};

3.2 Visibility judgment optimization

InnoDB uses binary search to optimize active transaction judgment:

bool read_view_sees_trx_id(
    const read_view_t* view, 
    trx_id_t trx_id) 
{
    if (trx_id up_limit_id) return true;
    if (trx_id >= view->low_limit_id) return false;
    return !bsearch(&trx_id, view->active_trx_ids, 
                   view->n_trx_ids, sizeof(trx_id_t), 
                   trx_id_compare);
}

4. In-depth analysis of differences in isolation level implementation

4.1 READ COMMITTED Implementation

Key Features:

  • Create a new Read View each time you SELECT
  • Using statement-level snapshots
  • Phantom reads may occur

Locking mechanism cooperation:

UPDATE accounts SET balance = balance - 100 
WHERE user_id = 123; -- Hold X lock until the transaction ends

4.2 REPEATABLE READ Implementation

Special treatment:

  • Using transaction-level snapshots
  • Gap locks prevent phantom reads
  • Unique index optimization: Do not use gap locks for unique index equality queries

Lock scope example:

Table structure: id INT PRIMARY KEY, name VARCHAR(20) 
Query: SELECT * FROM t WHERE id > 100 FOR UPDATE; 
Lock range: (100, +∞)

5. Production Environment Optimization Practice

5.1 Parameter Tuning Suggestions

Key parameter configuration:

# Control Undo Log cleanup speed 
innodb_purge_threads = 4 
innodb_purge_batch_size = 300 

# Long transaction monitoring threshold 
innodb_rollback_segments = 128 
innodb_max_undo_log_size = 1G 

# Version chain traversal optimization 
innodb_read_ahead_threshold = 56

5.2 Monitoring and Diagnosis

Common monitoring commands:

SELECT * FROM information_schema.innodb_trx 
WHERE TIMEDIFF(now(), trx_started) > '00:00:10'; 

-- Analyze Undo space usage 
SELECT 
    TABLESPACE_NAME,
    FILE_SIZE/1024/1024 as size_mb, 
    ALLOCATED_SIZE/1024/1024 as alloc_mb 
FROM information_schema.FILES 
WHERE FILE_TYPE = 'UNDO LOG'; 

-- Detect version chain length 
SELECT 
    TABLE_NAME, 
    INDEX_NAME, 
    AVG(`NUMBER_RECORDS`) AS avg_versions
FROM information_schema.INNODB_BUFFER_PAGE 
WHERE PAGE_TYPE = 'INDEX'
GROUP BY TABLE_NAME, INDEX_NAME;

VI. Solutions to Typical Problems

6.1 Version chain is too long

Symptoms:

  • Query performance gradually degrades
  • Undo tablespace continues to grow
  • Snapshot too old error occurs

Solution:

  1. Optimize transaction size and reduce the amount of modification per transaction
  2. Clean up historical snapshots regularly
    SET GLOBAL innodb_purge_batch_size = 1000;
    SET GLOBAL innodb_max_purge_lag = 1000000;

6.2 Snapshot Isolation and Lock Conflicts

Typical deadlock scenario:

Transaction A: SELECT ... FOR UPDATE (holds gap lock) 
Transaction B: INSERT INTO ... (requests insert intention lock)

Solution:

  1. Enable deadlock detection
    innodb_deadlock_detect = ON innodb_lock_wait_timeout = 30
  2. Business layer retry mechanism

7. New hardware environment optimization

7.1 NVMe SSD Optimization

Configuration suggestions:

innodb_io_capacity = 20000 innodb_io_capacity_max = 40000 innodb_flush_neighbors = 0

7.2 Persistent Memory Applications

Use PMEM as Undo Log storage:

innodb_undo_directory = /mnt/pmem innodb_undo_log_encrypt = OFF

8. Version evolution and future direction

8.1 MySQL 8.0 Improvements

Important Update:

  1. Atomic DDL
  2. Enhanced data dictionary
  3. Histogram Statistics

8.2 Cloud Native Architecture Adaptation

Changes:

  • Distributed transaction support (XA enhancement)
  • Computing and storage separation architecture
  • Snapshot export/import optimization

IX. Engineering Practice Suggestions

  1. Index design principles

    • The primary key field should not be too large
    • Avoid using frequently updated columns as secondary indexes
    • The combined index follows the leftmost matching principle
  2. Transaction Writing Specifications

    • Keep transaction code blocks as short as possible
    • Avoid making network calls within transactions
    • Explicitly set the isolation level
  3. Version upgrade strategy

    • First upgrade the slave library to observe
    • Using the mysql_upgrade Tool
    • Verify the compatibility of MVCC related parameters

10. Issues that should be considered and solved in practice

  1. How to design linearizable distributed MVCC?
  2. How to balance the version requirements of OLTP and OLAP in the HTAP scenario?
  3. When the length of the version chain exceeds the capacity of the undo tablespace, how does the system ensure transaction security?

Through the above content, this article systematically explains the implementation details, optimization methods and engineering practice points of InnoDB MVCC. It is recommended that readers refer to the configuration parameters and monitoring methods provided in this article in combination with actual work scenarios to establish a complete database performance optimization system. For key business systems, MVCC health checks should be performed regularly to ensure the efficient operation of the version chain mechanism. Finally, we should think about the last three questions in the process of practice.

No Comments

Send Comment Edit Comment

|´・ω・)ノ
ヾ(≧∇≦*)ゝ
(☆ω☆)
(╯‵□′)╯︵┴─┴
 ̄﹃ ̄
(/ω\)
∠(ᐛ 」∠)_
(๑•̀ㅁ•́ฅ)
→_→
୧(๑•̀⌄•́๑)૭
٩(ˊᗜˋ*)و
(ノ°ο°)ノ
(´இ皿இ`)
⌇●﹏●⌇
(ฅ´ω`ฅ)
(╯°A°)╯︵○○○
φ( ̄∇ ̄o)
ヾ(´・ ・`。)ノ"
( ง ᵒ̌ᵒ̌)ง⁼³₌₃
(ó﹏ò。)
Σ(っ°Д °;)っ
( ,,´・ω・)ノ"(´っω・`。)
╮(╯▽╰)╭
o(*////▽////*)q
>﹏<
( ๑´•ω•) "(ㆆᴗㆆ)
😂
😀
😅
😊
🙂
🙃
😌
😍
😘
😜
😝
😏
😒
🙄
😳
😡
😔
😫
😱
😭
💩
👻
🙌
🖕
👍
👫
👬
👭
🌚
🌝
🙈
💊
😶
🙏
🍦
🍉
😣
Source: github.com/k4yt3x/flowerhd
Emoticons
Emoji
Little Dinosaur
flower!
Previous
Next