Protocol Buffers

Data serialization format
Developer(s): Google
Initial release: Early 2001 (internal);[1] July 7, 2008 (public)
Stable release: 26.0 / 13 March 2024[2]
Repository: github.com/protocolbuffers/protobuf
Written in: C++, C#, Java, Python, JavaScript, Ruby, Go, PHP, Dart
Operating system: Any
Platform: Cross-platform
Type: Serialization format and library, IDL compiler
License: BSD
Website: protobuf.dev

Protocol Buffers (Protobuf) is a free and open-source cross-platform data format used to serialize structured data. It is useful in developing programs that communicate with each other over a network or for storing data. The method involves an interface description language that describes the structure of some data and a program that generates source code from that description for generating or parsing a stream of bytes that represents the structured data.

Overview

Google developed Protocol Buffers for internal use and provided a code generator for multiple languages under an open-source license (see below).

The design goals for Protocol Buffers emphasized simplicity and performance. In particular, it was designed to be smaller and faster than XML.[3]

Protocol Buffers are widely used at Google for storing and interchanging all kinds of structured information. The method serves as a basis for a custom remote procedure call (RPC) system that is used for nearly all inter-machine communication at Google.[4]

Protocol Buffers are similar to the Apache Thrift, Ion, and Microsoft Bond protocols, and offer a concrete RPC protocol stack for defined services, called gRPC.[5]

Data structure schemas (called messages) and services are described in a proto definition file (.proto) and compiled with protoc. This compilation generates code that can be invoked by a sender or recipient of these data structures. For example, example.pb.cc and example.pb.h are generated from example.proto. They define C++ classes for each message and service in example.proto.

Canonically, messages are serialized into a binary wire format which is compact, forward- and backward-compatible, but not self-describing (that is, there is no way to tell the names, meaning, or full datatypes of fields without an external specification). There is no defined way to include or refer to such an external specification (schema) within a Protocol Buffers file. The officially supported implementation includes an ASCII serialization format,[6] but this format—though self-describing—loses the forward- and backward-compatibility behavior, and is thus not a good choice for applications other than human editing and debugging.[7]
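
To make "not self-describing" concrete, the following is a minimal sketch, not the library's implementation, of what a reader can recover from a binary message without the schema. It assumes the documented wire layout, in which each field starts with a varint key equal to (field_number << 3) | wire_type. The names Bytes, read_varint and list_fields are hypothetical helpers introduced only for illustration, and the deprecated group wire types are rejected rather than handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Reads one base-128 varint starting at pos and advances pos past it.
std::uint64_t read_varint(const Bytes& in, std::size_t& pos) {
  std::uint64_t result = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size()) throw std::runtime_error("truncated varint");
    std::uint8_t byte = in[pos++];
    result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return result;  // high bit clear: last byte
  }
  throw std::runtime_error("varint too long");
}

// Lists (field number, wire type) for every top-level field and skips the
// payloads. Field names and exact datatypes are not present in the bytes.
std::vector<std::pair<std::uint32_t, std::uint32_t>> list_fields(
    const Bytes& in) {
  std::vector<std::pair<std::uint32_t, std::uint32_t>> fields;
  std::size_t pos = 0;
  while (pos < in.size()) {
    std::uint64_t key = read_varint(in, pos);
    std::uint32_t wire_type = static_cast<std::uint32_t>(key & 0x7);
    std::uint32_t field_number = static_cast<std::uint32_t>(key >> 3);
    std::size_t skip = 0;
    switch (wire_type) {
      case 0: read_varint(in, pos); break;  // varint payload
      case 1: skip = 8; break;              // fixed 64-bit payload
      case 2:                               // length-delimited payload
        skip = static_cast<std::size_t>(read_varint(in, pos));
        break;
      case 5: skip = 4; break;              // fixed 32-bit payload
      default: throw std::runtime_error("unsupported wire type");
    }
    if (skip > in.size() - pos) throw std::runtime_error("truncated field");
    pos += skip;
    assert(pos <= in.size());  // invariant: never read past the buffer
    fields.emplace_back(field_number, wire_type);
  }
  return fields;
}
```

Because unknown payloads can be skipped using only the wire type, a reader built against an older schema can step over fields it does not know, which is the basis of the compatibility behavior described above.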

Though the primary purpose of Protocol Buffers is to facilitate network communication, its simplicity and speed make Protocol Buffers an alternative to data-centric C++ classes and structs, especially where interoperability with other languages or systems might be needed in the future.

Limitations

Protobufs have no single specification.[8] The format is best suited to small chunks of data that do not exceed a few megabytes and can be loaded into memory or sent in one piece; it is therefore not a streamable format.[9] The library does not provide compression out of the box. The format is also not well supported in non-object-oriented languages (e.g. Fortran).[10]
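
One common workaround for the lack of streaming support, offered here as an assumption rather than something the sources above describe, is to write each serialized message with a length prefix so that a reader can split a byte stream back into whole messages. The sketch below uses a varint length prefix; append_framed and split_framed are hypothetical names, the messages are treated as opaque byte vectors, and malformed input is only guarded by assertions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends a varint length prefix followed by an already-serialized message.
void append_framed(Bytes& stream, const Bytes& message) {
  std::uint64_t n = message.size();
  while (n >= 0x80) {
    stream.push_back(static_cast<std::uint8_t>((n & 0x7F) | 0x80));
    n >>= 7;
  }
  stream.push_back(static_cast<std::uint8_t>(n));
  stream.insert(stream.end(), message.begin(), message.end());
}

// Splits a stream written by append_framed back into individual messages.
std::vector<Bytes> split_framed(const Bytes& stream) {
  std::vector<Bytes> messages;
  std::size_t pos = 0;
  while (pos < stream.size()) {
    // Decode the varint length prefix.
    std::uint64_t length = 0;
    int shift = 0;
    while (true) {
      assert(pos < stream.size() && shift < 64);  // sketch: trusts its input
      std::uint8_t byte = stream[pos++];
      length |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
      if ((byte & 0x80) == 0) break;
      shift += 7;
    }
    assert(length <= stream.size() - pos);
    // Copy out exactly `length` bytes as one message.
    auto first = stream.begin() + static_cast<std::ptrdiff_t>(pos);
    messages.emplace_back(first, first + static_cast<std::ptrdiff_t>(length));
    pos += static_cast<std::size_t>(length);
  }
  return messages;
}
```

Each message still has to fit in memory on its own; the framing only lets many small messages share one file or connection.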

Example

A schema for a particular use of protocol buffers associates data types with field names, using integers to identify each field. (The protocol buffer data contains only the numbers, not the field names, providing some bandwidth/storage savings compared with systems that include the field names in the data.)

// polyline.proto
syntax = "proto2";

message Point {
  required int32 x = 1;
  required int32 y = 2;
  optional string label = 3;
}

message Line {
  required Point start = 1;
  required Point end = 2;
  optional string label = 3;
}

message Polyline {
  repeated Point point = 1;
  optional string label = 2;
}

The "Point" message defines two mandatory data items, x and y. The data item label is optional. Each data item has a tag (its field number), which is defined after the equals sign. For example, x has the tag 1.
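
As a rough illustration of how the tags end up in the serialized data, the sketch below hand-encodes a Point without using the generated classes. It assumes the documented wire layout: each field is preceded by a varint key equal to (tag << 3) | wire_type, with wire type 0 for int32 values and wire type 2 for strings. The helpers put_varint, put_int32_field, put_string_field and encode_point are hypothetical names, not part of the Protocol Buffers library.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends a base-128 varint, least-significant 7-bit group first.
void put_varint(Bytes& out, std::uint64_t v) {
  while (v >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(v));
}

// int32 fields use wire type 0; negative values are sign-extended to 64 bits.
void put_int32_field(Bytes& out, std::uint32_t tag, std::int32_t value) {
  assert(tag >= 1);
  put_varint(out, (static_cast<std::uint64_t>(tag) << 3) | 0);
  put_varint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(value)));
}

// string fields use wire type 2: key, byte length, then the raw bytes.
void put_string_field(Bytes& out, std::uint32_t tag, const std::string& s) {
  assert(tag >= 1);
  put_varint(out, (static_cast<std::uint64_t>(tag) << 3) | 2);
  put_varint(out, s.size());
  out.insert(out.end(), s.begin(), s.end());
}

// Encodes the Point message from polyline.proto; label may be omitted.
Bytes encode_point(std::int32_t x, std::int32_t y,
                   const std::string* label = nullptr) {
  Bytes out;
  put_int32_field(out, 1, x);                   // x = 1
  put_int32_field(out, 2, y);                   // y = 2
  if (label) put_string_field(out, 3, *label);  // label = 3
  return out;
}
```

Under these assumptions, encode_point(10, 20) yields the four bytes 08 0A 10 14: the keys 08 and 10 carry tags 1 and 2, and the names x and y appear nowhere in the output.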

The "Line" and "Polyline" messages, which both use Point, demonstrate how composition works in Protocol Buffers. Polyline has a repeated field, and thus behaves like an ordered list of points (of unspecified number).
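
Composition can also be seen at the byte level. The following sketch, which is illustrative only and does not use the generated classes, assumes the documented wire layout in which an embedded message is written as a length-delimited field (wire type 2) and a repeated field simply repeats the same key once per element. PointValue, put_varint, put_key, encode_point and encode_polyline are hypothetical names; the optional labels are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical plain struct standing in for the generated Point class.
struct PointValue {
  std::int32_t x;
  std::int32_t y;
};

// Appends a base-128 varint, least-significant 7-bit group first.
void put_varint(Bytes& out, std::uint64_t v) {
  while (v >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(v));
}

// Appends a field key: (tag << 3) | wire_type.
void put_key(Bytes& out, std::uint32_t tag, std::uint32_t wire_type) {
  assert(tag >= 1 && wire_type <= 7);
  put_varint(out, (static_cast<std::uint64_t>(tag) << 3) | wire_type);
}

// Encodes a Point: x = 1 and y = 2, both as varints (wire type 0).
Bytes encode_point(const PointValue& p) {
  Bytes out;
  put_key(out, 1, 0);
  put_varint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(p.x)));
  put_key(out, 2, 0);
  put_varint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(p.y)));
  return out;
}

// Encodes a Polyline: each point becomes one length-delimited field 1,
// written in list order.
Bytes encode_polyline(const std::vector<PointValue>& points) {
  Bytes out;
  for (const PointValue& p : points) {
    Bytes body = encode_point(p);
    put_key(out, 1, 2);            // point = 1, wire type 2
    put_varint(out, body.size());  // length of the embedded Point
    out.insert(out.end(), body.begin(), body.end());
  }
  return out;
}
```

For the points (10, 10) and (20, 20) this produces 0A 04 08 0A 10 0A 0A 04 08 14 10 14: the key 0A appears twice, once per element, and an empty polyline encodes to zero bytes.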

This schema can subsequently be compiled for use by one or more programming languages. Google provides a compiler called protoc which can produce output for C++, Java or Python. Other schema compilers are available from other sources to create language-dependent output for over 20 other languages.[11]

For example, after a C++ version of the protocol buffer schema above is produced, a C++ source code file, polyline.cpp, can use the message objects as follows:

// polyline.cpp
#include <string>

#include "polyline.pb.h"  // generated by calling "protoc --cpp_out=. polyline.proto"

Line* createNewLine(const std::string& name) {
  // create a line from (10, 20) to (30, 40)
  Line* line = new Line;
  line->mutable_start()->set_x(10);
  line->mutable_start()->set_y(20);
  line->mutable_end()->set_x(30);
  line->mutable_end()->set_y(40);
  line->set_label(name);
  return line;
}

Polyline* createNewPolyline() {
  // create a polyline with points at (10,10) and (20,20)
  Polyline* polyline = new Polyline;
  Point* point1 = polyline->add_point();
  point1->set_x(10);
  point1->set_y(10);
  Point* point2 = polyline->add_point();
  point2->set_x(20);
  point2->set_y(20);
  return polyline;
}

Language support

Protobuf 2.0 provides a code generator for C++, Java, C#,[12] and Python.[13]

Protobuf 3.0 provides a code generator for C++, Java (including JavaNano, a dialect intended for low-resource environments), Python, Go, Ruby, Objective-C, and C#.[14] JavaScript has also been supported since 3.0.0-beta-2.[15]

Third-party implementations are also available for Ballerina,[16] C,[17][18] C++,[19] Dart, Elixir,[20][21] Erlang,[22] Haskell,[23] JavaScript,[24] Julia,[25] Nim,[26] Perl, PHP, Prolog,[27][28] R,[29] Rust,[30][31][32] Scala,[33] and Swift.[34]

References

  1. ^ "Frequently Asked Questions | Protocol Buffers". Google Developers. Retrieved 2 October 2016.
  2. ^ "Releases - google/protobuf" – via GitHub.
  3. ^ Eishay Smith. "jvm-serializers Benchmarks". GitHub. Retrieved 2010-07-12.
  4. ^ Kenton Varda. "A response to Steve Vinoski". Retrieved 2008-07-14.
  5. ^ "grpc". grpc.io. Retrieved 2 October 2016.
  6. ^ "text_format.h - Protocol Buffers - Google Code". Retrieved 2012-03-02.
  7. ^ "Proto Best Practices | Protocol Buffers Documentation". Retrieved 2023-05-26.
  8. ^ "Overview". protobuf.dev. Retrieved 2023-05-28.
  9. ^ "Overview". protobuf.dev. Retrieved 2023-05-28.
  10. ^ "Overview". protobuf.dev. Retrieved 2023-05-28.
  11. ^ "ThirdPartyAddOns - protobuf - Links to third-party add-ons. - Protocol Buffers - Google's data interchange format - Google Project Hosting". Code.google.com. Retrieved 2013-09-18.
  12. ^ "Protocol Buffers in C#". Code Blockage. Retrieved 2017-05-12.
  13. ^ "Protocol Buffers Language Guide". Google Developers. Retrieved 2016-04-21.
  14. ^ "Language Guide (proto3) | Protocol Buffers". Google Developers. Retrieved 2020-08-09.
  15. ^ "Release Protocol Buffers v3.0.0-beta-2 · protocolbuffers/protobuf". GitHub. Retrieved 2020-08-09.
  16. ^ "Ballerina - GRPC". Archived from the original on 2021-11-15. Retrieved 2021-03-24.
  17. ^ "Nanopb - protocol buffers with small code size". Retrieved 2017-12-12.
  18. ^ "Protocol Buffers implementation in C". GitHub. Retrieved 2017-12-12.
  19. ^ "Embedded Proto - Protobuf for microcontrollers". Retrieved 2021-08-15.
  20. ^ "Protox". GitHub. 25 October 2021.
  21. ^ "Protobuf-elixir". GitHub. 26 October 2021.
  22. ^ "Tomas-abrahamsson/GPB". GitHub. 19 October 2021.
  23. ^ "Proto-lens". GitHub. 16 October 2021.
  24. ^ "Protocol Buffers for JavaScript". github.com. Retrieved 2016-05-14.
  25. ^ "ThirdPartyAddOns - protobuf - Links to third-party add-ons. - Protocol Buffers - Google's data interchange format - Google Project Hosting". Retrieved 2012-11-07.
  26. ^ "Protobuf implementation in pure Nim that leverages the power of the macro system to not depend on any external tools". GitHub. 21 October 2021.
  27. ^ "SWI-Prolog: Google's Protocol Buffers Library".
  28. ^ "SWI-Prolog / contrib-protobufs". GitHub. Retrieved 2022-04-21.
  29. ^ "RProtoBuf". GitHub.
  30. ^ "Rust-protobuf". GitHub. 26 October 2021.
  31. ^ "PROST!". GitHub. 21 August 2021.
  32. ^ "Quick-protobuf". GitHub. 12 October 2021.
  33. ^ "ScalaPB". GitHub. Retrieved 27 September 2022.
  34. ^ "Swift Protobuf". GitHub. 26 October 2021.

External links

  • Official documentation at developers.google.com
  • protobuf on GitHub