Protocol Buffers: Introduction and Basic Syntax

Introduction

In modern software development, exchanging and storing data efficiently matters, and the choice of serialization protocol has a direct effect on performance. Protocol Buffers (commonly known as Protobuf) is a language-neutral, platform-neutral, extensible mechanism for serializing structured data, developed by Google. This article introduces the basic concepts of Protocol Buffers, how it differs from other serialization formats, and its basic syntax, and provides sample code to help you get started quickly.

1. What are Protocol Buffers?

Protocol Buffers is a lightweight format for serializing and deserializing structured data. Developers define data structures once in a .proto file, and the protoc compiler generates code for reading and writing that data in each supported programming language, so programs written in different languages can exchange it.

1.1 Features

Efficient: Protobuf uses a compact binary format that is typically smaller and faster to store, transmit, and parse than text formats such as JSON and XML.
Language neutral: Supports multiple programming languages, including C++, Java, Python, Go, Ruby, etc.
Backward compatibility: Lets you evolve the data structure without breaking existing services. Fields can be added or removed as long as the numbers of existing fields are never changed and the numbers of removed fields are never reused; readers skip field numbers they do not recognize.
Easy to use: A concise syntax defines the data structures, and the corresponding code is generated automatically.
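The backward-compatibility point can be made concrete with a small sketch. It assumes the Protobuf binary layout, in which every field is preceded by a key holding its field number and a wire type, and it uses only the Go standard library rather than the official Protobuf runtime. The function readName and the sample bytes are made up for illustration, and only two wire types are handled.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// readName scans a wire-format message and returns field 1 (a string),
// skipping every other field number, including ones it has never heard of.
func readName(b []byte) (string, error) {
	name := ""
	for len(b) > 0 {
		key, n := binary.Uvarint(b)
		if n <= 0 {
			return "", errors.New("bad key")
		}
		b = b[n:]
		fieldNum, wireType := key>>3, key&7
		switch wireType {
		case 0: // varint: read the value and move on
			_, n := binary.Uvarint(b)
			if n <= 0 {
				return "", errors.New("bad varint")
			}
			b = b[n:]
		case 2: // length-delimited: a length followed by that many bytes
			l, n := binary.Uvarint(b)
			if n <= 0 || uint64(len(b)-n) < l {
				return "", errors.New("bad length")
			}
			if fieldNum == 1 {
				name = string(b[n : n+int(l)])
			}
			b = b[n+int(l):]
		default:
			return "", fmt.Errorf("wire type %d not handled in this sketch", wireType)
		}
	}
	return name, nil
}

func main() {
	// Field 1 = "bob", followed by field 4 (varint 7) added by a newer writer.
	msg := []byte{0x0a, 0x03, 'b', 'o', 'b', 0x20, 0x07}
	name, err := readName(msg)
	fmt.Println(name, err) // bob <nil>
}
```

The reader never needs to know what field 4 means. The wire type alone tells it how many bytes to skip, which is why an old reader can consume messages from a newer writer.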

2. Differences between Protocol Buffers and other serialization protocols

When choosing a serialization protocol, it is important to understand the pros and cons of different protocols. Here is a comparison of Protocol Buffers with other common serialization protocols:

2.1 Comparison with JSON

Format: JSON is a text format that is easy to read and debug, while Protobuf is a binary format that is more compact but not human-readable without tooling.
Performance: Protobuf generally outperforms JSON in serialization and deserialization, especially when dealing with large amounts of data.
Data type support: JSON has a small set of types (string, number, boolean, null, array, object) and a single number type, while Protobuf distinguishes integer widths and signedness, floating-point types, bytes, and enumerations.
Schema: Protobuf requires the data structure (schema) to be defined up front, while JSON itself is schema-less.

2.2 Comparison with XML

Format: XML is a verbose text format with a more complex structure; Protobuf uses a binary format, so the data is more compact.
Parsing speed: Protobuf is usually parsed faster than XML because an XML parser has to process textual tags and attributes.
Type safety: Protobuf fields have declared types that the generated code enforces, while XML content is plain text unless a separate schema (such as XSD) is applied.
Usage scenarios: XML is better suited to document-style data, while Protobuf is better suited to efficient network communication and storage.

3. Basic syntax of Protocol Buffers

3.1 Defining Data Structure

Using Protobuf requires writing .proto files that describe the data structures and their field types. The following is a simple .proto file:

syntax = "proto3";
package example;

// Define a message
message Person {
    string name = 1;   // name
    int32 id = 2;      // ID
    string email = 3;  // email address
}

3.2 Syntax Parsing

Syntax version: syntax = "proto3"; specifies the Protobuf syntax version to use. This article uses proto3.
Package name: package example; defines the package to which the message belongs, which prevents name clashes between message types.
Message type: The message keyword defines a message, similar to a class or struct definition.
Fields: Each field has a type, a name, and a numeric identifier (field number) that is unique within the message. The field number, not the name, identifies the field in the serialized data.
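As a sketch of how the numeric identifier is used during serialization: in the binary format each field starts with a key computed as (field number << 3) | wire type. The helper splitKey below is a hypothetical name. It uses only the Go standard library to take such keys apart.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// splitKey decodes the varint key that precedes a field on the wire and
// returns the field number, the wire type, and the number of bytes consumed.
func splitKey(b []byte) (uint64, uint64, int) {
	key, n := binary.Uvarint(b)
	return key >> 3, key & 0x7, n
}

func main() {
	// Keys of a message with a string as field 1, an int32 as field 2,
	// and a string as field 3.
	for _, k := range []byte{0x0a, 0x10, 0x1a} {
		num, wireType, _ := splitKey([]byte{k})
		fmt.Printf("key 0x%02x -> field %d, wire type %d\n", k, num, wireType)
	}
	// key 0x0a -> field 1, wire type 2
	// key 0x10 -> field 2, wire type 0
	// key 0x1a -> field 3, wire type 2
}
```

Because only the number is written, a field can be renamed freely, but changing its number changes what existing data means.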

3.3 Basic Data Types

Protobuf supports a variety of basic data types, including:

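int32: Signed 32-bit integer (negative values are inefficient to encode; see sint32)
int64: Signed 64-bit integer
uint32: Unsigned 32-bit integer
uint64: Unsigned 64-bit integer
sint32: Signed 32-bit integer, using ZigZag encoding so that negative values stay compact
sint64: Signed 64-bit integer, using ZigZag encoding
fixed32, fixed64: Unsigned integers that always occupy 4 or 8 bytes
float, double: 32-bit and 64-bit floating-point numbers
bool: Boolean
string: UTF-8 text
bytes: Arbitrary byte sequence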

3.4 Nested Messages and Enumerations

Protobuf also supports nested messages and enumeration types. Here are some examples:

message Company { 
      string name = 1; 
      repeated Person employees = 2;    // Multiple employees
} 

enum Role { 
      UNKNOWN = 0; 
      DEVELOPER = 1;
      MANAGER = 2; 
}

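Message-typed fields: A field can use another message as its type, as employees does with Person. The repeated keyword indicates that a field can hold multiple values (like an array).
Enumerations: enum defines a finite set of named values, which is convenient for representing states or roles. In proto3 the first value must be 0, and it serves as the default.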
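In Go, these definitions map onto familiar constructs. The types below are a hand-written approximation of what the Go plugin produces, not actual generated code. The Role field on Person is a hypothetical addition for this example only, and the company name is made up.

```go
package main

import "fmt"

// Role approximates the Go type generated for `enum Role`.
type Role int32

const (
	Role_UNKNOWN   Role = 0
	Role_DEVELOPER Role = 1
	Role_MANAGER   Role = 2
)

// Person is simplified; Role is a hypothetical field added for illustration.
type Person struct {
	Name string
	Role Role
}

// Company shows how `repeated Person employees = 2` becomes a slice.
type Company struct {
	Name      string
	Employees []*Person
}

func main() {
	c := Company{Name: "Acme"}
	c.Employees = append(c.Employees,
		&Person{Name: "ivanzhang", Role: Role_DEVELOPER},
		&Person{Name: "guest"}, // Role left unset
	)
	for _, e := range c.Employees {
		fmt.Printf("%s unset=%t\n", e.Name, e.Role == Role_UNKNOWN)
	}
	// ivanzhang unset=false
	// guest unset=true
}
```

An enum field that was never set reads back as the zero value. This is why the 0 entry is conventionally reserved for an "unknown" or "unspecified" meaning.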

4. Examples of using Protocol Buffers

4.1 Installing Protocol Buffers

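First, make sure you have the Protocol Buffers compiler, protoc, and the plugin for your target language installed. protoc looks for the plugin binary (protoc-gen-go for Go) on your PATH, so the Go bin directory must be on it. Take Go as an example (the apt command assumes a Debian or Ubuntu system):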

# Install protoc  
sudo apt install -y protobuf-compiler

# Install Go plugin 
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest

4.2 Create a sample project

Create a simple Protobuf sample project structure:

protobuf-example/
    ├── proto/
    │   └── example.proto
    └── main.go

4.3 Defining Data Structure

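In proto/example.proto, define the data structure. The go_package option tells protoc-gen-go which Go import path the generated code belongs to; the value below assumes the Go module is named protobuf-example:

syntax = "proto3";

package example;

option go_package = "protobuf-example/proto";

message Person {
    string name = 1;
    int32 id = 2;
    string email = 3;
}

4.4 Generating Code

Generate Go code from the project directory. This example defines no gRPC services, so only the Go message plugin is needed; paths=source_relative writes example.pb.go next to example.proto:

protoc --go_out=. --go_opt=paths=source_relative proto/example.proto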

4.5 Implementation Code

In main.go, implement serialization and deserialization:

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"

	pb "protobuf-example/proto"
)

func main() {
	// Create a Person instance.
	person := &pb.Person{
		Name:  "ivanzhang",
		Id:    101,
		Email: "ivanzhang@ztoutup.com",
	}

	// Serialize to the Protobuf binary wire format.
	data, err := proto.Marshal(person)
	if err != nil {
		log.Fatalf("failed to marshal: %v", err)
	}

	// Deserialize into a fresh message.
	newPerson := &pb.Person{}
	if err := proto.Unmarshal(data, newPerson); err != nil {
		log.Fatalf("failed to unmarshal: %v", err)
	}

	// Print the result; the generated getters are nil-safe.
	fmt.Printf("Name: %s, ID: %d, Email: %s\n", newPerson.GetName(), newPerson.GetId(), newPerson.GetEmail())
}

4.6 Run the example

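Run the code in the project directory. If the project does not have a go.mod file yet, create one first; the module name must match the import path used in main.go:

go mod init protobuf-example
go mod tidy
go run main.go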

Output:

Name: ivanzhang, ID: 101, Email: ivanzhang@ztoutup.com

5. Conclusion

Protocol Buffers is an efficient and flexible serialization format well suited to data transmission in modern applications. This article covered its basic concepts, how it compares with JSON and XML, its basic syntax, and a working Go example. By mastering Protobuf, you can process structured data more efficiently and improve the performance and reliability of network communication.

If you have any questions or suggestions, please leave a comment below. Thanks for reading! I hope this article can help you better understand and use Protocol Buffers.
