source: rtems/c/src/librpc/src/rpc/PSD.doc/xdr.rfc.ms @ df49c60

Last change on this file since df49c60 was df49c60, checked in by Joel Sherrill <joel.sherrill@…>, on 06/12/00 at 15:00:15

Merged from 4.5.0-beta3a

1.\"
2.\"  Must use -- tbl -- with this one
3.\"
4.\" @(#)xdr.rfc.ms      2.2 88/08/05 4.0 RPCSRC
5.de BT
6.if \\n%=1 .tl ''- % -''
7..
8.ND
9.\" prevent excess underlining in nroff
10.if n .fp 2 R
11.OH 'External Data Representation Standard''Page %'
12.EH 'Page %''External Data Representation Standard'
13.IX "External Data Representation"
14.if \\n%=1 .bp
15.SH
16\&External Data Representation Standard: Protocol Specification
17.IX XDR RFC
18.IX XDR "protocol specification"
19.LP
20.NH 0
21\&Status of this Standard
22.nr OF 1
23.IX XDR "RFC status"
24.LP
25Note: This chapter specifies a protocol that Sun Microsystems, Inc., and
26others are using.  It has been designated RFC1014 by the ARPA Network
27Information Center.
28.NH 1
29Introduction
30\&
31.LP
32XDR is a standard for the description and encoding of data.  It is
33useful for transferring data between different computer
34architectures, and has been used to communicate data between such
35diverse machines as the Sun Workstation, VAX, IBM-PC, and Cray.
36XDR fits into the ISO presentation layer, and is roughly analogous in
37purpose to X.409, ISO Abstract Syntax Notation.  The major difference
38between these two is that XDR uses implicit typing, while X.409 uses
39explicit typing.
40.LP
XDR uses a language to describe data formats.  The language can be
used only to describe data; it is not a programming language.
This language allows one to describe intricate data formats in a
concise manner. The alternative of using graphical representations
(itself an informal language) quickly becomes incomprehensible when
faced with complexity.  The XDR language itself is similar to the C
language [1], just as Courier [4] is similar to Mesa. Protocols such
as Sun RPC (Remote Procedure Call) and the NFS (Network File System)
use XDR to describe the format of their data.
50.LP
51The XDR standard makes the following assumption: that bytes (or
52octets) are portable, where a byte is defined to be 8 bits of data.
53A given hardware device should encode the bytes onto the various
54media in such a way that other hardware devices may decode the bytes
55without loss of meaning.  For example, the Ethernet standard
56suggests that bytes be encoded in "little-endian" style [2], or least
57significant bit first.
58.NH 2
59\&Basic Block Size
60.IX XDR "basic block size"
61.IX XDR "block size"
62.LP
63The representation of all items requires a multiple of four bytes (or
6432 bits) of data.  The bytes are numbered 0 through n-1.  The bytes
65are read or written to some byte stream such that byte m always
66precedes byte m+1.  If the n bytes needed to contain the data are not
67a multiple of four, then the n bytes are followed by enough (0 to 3)
68residual zero bytes, r, to make the total byte count a multiple of 4.
69.LP
70We include the familiar graphic box notation for illustration and
71comparison.  In most illustrations, each box (delimited by a plus
72sign at the 4 corners and vertical bars and dashes) depicts a byte.
73Ellipses (...) between boxes show zero or more additional bytes where
74required.
.ie t .DS
.el .DS L
\fIA Block\fP

\f(CW+--------+--------+...+--------+--------+...+--------+
| byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
+--------+--------+...+--------+--------+...+--------+
|<-----------n bytes---------->|<------r bytes------>|
|<-----------n+r (where (n+r) mod 4 = 0)------------>|\fP

.DE
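The padding rule above can be sketched in a few lines of Python. The helper names pad_len and pad_block are hypothetical and are not part of the standard; they only illustrate how r is derived from n.

```python
def pad_len(n: int) -> int:
    """Number of residual zero bytes r (0 to 3) such that (n + r) % 4 == 0."""
    return (4 - n % 4) % 4


def pad_block(data: bytes) -> bytes:
    """Return the n data bytes followed by r zero bytes."""
    return data + b"\x00" * pad_len(len(data))


# Five data bytes need three zero bytes to reach a multiple of four.
print(pad_block(b"abcde").hex())
```

The double modulo keeps r at 0 when n is already a multiple of four.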
86.NH 1
87\&XDR Data Types
88.IX XDR "data types"
89.IX "XDR data types"
90.LP
91Each of the sections that follow describes a data type defined in the
92XDR standard, shows how it is declared in the language, and includes
93a graphic illustration of its encoding.
94.LP
95For each data type in the language we show a general paradigm
96declaration.  Note that angle brackets (< and >) denote
97variable length sequences of data and square brackets ([ and ]) denote
98fixed-length sequences of data.  "n", "m" and "r" denote integers.
99For the full language specification and more formal definitions of
100terms such as "identifier" and "declaration", refer to
101.I "The XDR Language Specification" ,
102below.
103.LP
104For some data types, more specific examples are included. 
105A more extensive example of a data description is in
106.I "An Example of an XDR Data Description"
107below.
108.NH 2
109\&Integer
110.IX XDR integer
111.LP
An XDR signed integer is a 32-bit datum that encodes an integer in
the range [-2147483648,2147483647].  The integer is represented in
two's complement notation.  The most and least significant bytes are
0 and 3, respectively.  Integers are declared with the "int" type
specifier and are encoded as follows:
116.ie t .DS
117.el .DS L
118\fIInteger\fP
119
120\f(CW(MSB)                   (LSB)
121+-------+-------+-------+-------+
122|byte 0 |byte 1 |byte 2 |byte 3 |
123+-------+-------+-------+-------+
124<------------32 bits------------>\fP
125.DE
126.NH 2
127\&Unsigned Integer
128.IX XDR "unsigned integer"
129.IX XDR "integer, unsigned"
130.LP
An XDR unsigned integer is a 32-bit datum that encodes a nonnegative
integer in the range [0,4294967295].  It is represented by an
unsigned binary number whose most and least significant bytes are 0
and 3, respectively.  An unsigned integer is declared with the
"unsigned int" type specifier and is encoded as follows:
135.ie t .DS
136.el .DS L
137\fIUnsigned Integer\fP
138
139\f(CW(MSB)                   (LSB)
140+-------+-------+-------+-------+
141|byte 0 |byte 1 |byte 2 |byte 3 |
142+-------+-------+-------+-------+
143<------------32 bits------------>\fP
144.DE
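As an illustration of the integer and unsigned integer encodings, the following sketch uses Python's standard struct module. The function names are hypothetical. The ">" prefix selects big-endian order, so byte 0 is the most significant byte, as in the diagrams above.

```python
import struct


def encode_int(value: int) -> bytes:
    """Encode a signed 32-bit integer in two's complement, MSB first."""
    return struct.pack(">i", value)


def encode_uint(value: int) -> bytes:
    """Encode an unsigned 32-bit integer, MSB first."""
    return struct.pack(">I", value)


print(encode_int(-2).hex())           # two's complement of -2
print(encode_uint(4294967295).hex())  # largest unsigned value
```

struct.pack raises struct.error for values outside the 32-bit range, which matches the ranges stated above.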
145.NH 2
146\&Enumeration
147.IX XDR enumeration
148.LP
149Enumerations have the same representation as signed integers.
150Enumerations are handy for describing subsets of the integers.
151Enumerated data is declared as follows:
152.ft CW
153.DS
154enum { name-identifier = constant, ... } identifier;
155.DE
156For example, the three colors red, yellow, and blue could be
157described by an enumerated type:
158.DS
159.ft CW
160enum { RED = 2, YELLOW = 3, BLUE = 5 } colors;
161.DE
It is an error to encode as an enum any integer other than those that
have been given assignments in the enum declaration.
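A minimal sketch of enum encoding, assuming the "colors" example above. The COLORS table and encode_color function are hypothetical names; the check simply rejects integers that have no assignment in the declaration.

```python
import struct

# Values taken from the example declaration of "colors".
COLORS = {"RED": 2, "YELLOW": 3, "BLUE": 5}


def encode_color(value: int) -> bytes:
    """Encode an enum member exactly like a signed integer."""
    if value not in COLORS.values():
        raise ValueError(f"{value} has no assignment in the enum declaration")
    return struct.pack(">i", value)


print(encode_color(COLORS["BLUE"]).hex())
```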
164.NH 2
165\&Boolean
166.IX XDR boolean
167.LP
168Booleans are important enough and occur frequently enough to warrant
169their own explicit type in the standard.  Booleans are declared as
170follows:
171.DS
172.ft CW
173bool identifier;
174.DE
175This is equivalent to:
176.DS
177.ft CW
178enum { FALSE = 0, TRUE = 1 } identifier;
179.DE
180.NH 2
181\&Hyper Integer and Unsigned Hyper Integer
182.IX XDR "hyper integer"
183.IX XDR "integer, hyper"
184.LP
The standard also defines 64-bit (8-byte) numbers called hyper
integer and unsigned hyper integer.  Their representations are the
obvious extensions of integer and unsigned integer defined above:
a hyper integer is represented in two's complement notation, and an
unsigned hyper integer as an unsigned binary number.  The most and
least significant bytes are 0 and 7, respectively.  They are declared
with the "hyper" and "unsigned hyper" type specifiers and are encoded
as follows:
191.ie t .DS
192.el .DS L
193\fIHyper Integer\fP
194\fIUnsigned Hyper Integer\fP
195
196\f(CW(MSB)                                                   (LSB)
197+-------+-------+-------+-------+-------+-------+-------+-------+
198|byte 0 |byte 1 |byte 2 |byte 3 |byte 4 |byte 5 |byte 6 |byte 7 |
199+-------+-------+-------+-------+-------+-------+-------+-------+
200<----------------------------64 bits---------------------------->\fP
201.DE
202.NH 2
203\&Floating-point
204.IX XDR "integer, floating point"
205.IX XDR "floating-point integer"
206.LP
207The standard defines the floating-point data type "float" (32 bits or
2084 bytes).  The encoding used is the IEEE standard for normalized
209single-precision floating-point numbers [3].  The following three
210fields describe the single-precision floating-point number:
211.RS
.IP \fBS\fP:
The sign of the number.  Values 0 and 1 represent positive and
negative, respectively.  One bit.
.IP \fBE\fP:
The exponent of the number, base 2.  8 bits are devoted to this
field.  The exponent is biased by 127.
.IP \fBF\fP:
The fractional part of the number's mantissa, base 2.  23 bits
are devoted to this field.
221.RE
222.LP
223Therefore, the floating-point number is described by:
224.DS
225(-1)**S * 2**(E-Bias) * 1.F
226.DE
It is declared with the "float" type specifier and is encoded as follows:
228.ie t .DS
229.el .DS L
230\fISingle-Precision Floating-Point\fP
231
232\f(CW+-------+-------+-------+-------+
233|byte 0 |byte 1 |byte 2 |byte 3 |
234S|   E   |           F          |
235+-------+-------+-------+-------+
2361|<- 8 ->|<-------23 bits------>|
237<------------32 bits------------>\fP
238.DE
239Just as the most and least significant bytes of a number are 0 and 3,
240the most and least significant bits of a single-precision floating-
241point number are 0 and 31.  The beginning bit (and most significant
242bit) offsets of S, E, and F are 0, 1, and 9, respectively.  Note that
243these numbers refer to the mathematical positions of the bits, and
244NOT to their actual physical locations (which vary from medium to
245medium).
246.LP
247The IEEE specifications should be consulted concerning the encoding
248for signed zero, signed infinity (overflow), and denormalized numbers
249(underflow) [3].  According to IEEE specifications, the "NaN" (not a
250number) is system dependent and should not be used externally.
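The S, E and F fields can be inspected with a short sketch. It assumes Python's struct module, whose ">f" format produces the IEEE single-precision encoding in big-endian order. The function names are hypothetical, and from_fields applies the formula above for normalized numbers only.

```python
import struct


def float_fields(x: float):
    """Split the 32-bit encoding of x into the fields (S, E, F)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31            # 1 bit, starting at bit offset 0
    e = (bits >> 23) & 0xFF   # 8 bits, starting at bit offset 1
    f = bits & 0x7FFFFF       # 23 bits, starting at bit offset 9
    return s, e, f


def from_fields(s: int, e: int, f: int) -> float:
    """(-1)**S * 2**(E-Bias) * 1.F with Bias = 127 (normalized numbers)."""
    return (-1) ** s * 2.0 ** (e - 127) * (1 + f / 2 ** 23)


print(float_fields(-2.5))
print(from_fields(*float_fields(-2.5)))
```

Signed zero, infinities, denormalized numbers and NaN fall outside this formula and are not handled here.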
251.NH 2
252\&Double-precision Floating-point
253.IX XDR "integer, double-precision floating point"
254.IX XDR "double-precision floating-point integer"
255.LP
256The standard defines the encoding for the double-precision floating-
257point data type "double" (64 bits or 8 bytes).  The encoding used is
258the IEEE standard for normalized double-precision floating-point
259numbers [3].  The standard encodes the following three fields, which
260describe the double-precision floating-point number:
261.RS
.IP \fBS\fP:
The sign of the number.  Values 0 and 1 represent positive and
negative, respectively.  One bit.
.IP \fBE\fP:
The exponent of the number, base 2.  11 bits are devoted to this
field.  The exponent is biased by 1023.
.IP \fBF\fP:
The fractional part of the number's mantissa, base 2.  52 bits
are devoted to this field.
271.RE
272.LP
273Therefore, the floating-point number is described by:
274.DS
275(-1)**S * 2**(E-Bias) * 1.F
276.DE
It is declared with the "double" type specifier and is encoded as follows:
278.ie t .DS
279.el .DS L
280\fIDouble-Precision Floating-Point\fP
281
282\f(CW+------+------+------+------+------+------+------+------+
283|byte 0|byte 1|byte 2|byte 3|byte 4|byte 5|byte 6|byte 7|
284S|    E   |                    F                        |
285+------+------+------+------+------+------+------+------+
2861|<--11-->|<-----------------52 bits------------------->|
287<-----------------------64 bits------------------------->\fP
288.DE
289Just as the most and least significant bytes of a number are 0 and 3,
290the most and least significant bits of a double-precision floating-
291point number are 0 and 63.  The beginning bit (and most significant
292bit) offsets of S, E , and F are 0, 1, and 12, respectively.  Note
293that these numbers refer to the mathematical positions of the bits,
294and NOT to their actual physical locations (which vary from medium to
295medium).
296.LP
297The IEEE specifications should be consulted concerning the encoding
298for signed zero, signed infinity (overflow), and denormalized numbers
299(underflow) [3].  According to IEEE specifications, the "NaN" (not a
300number) is system dependent and should not be used externally.
301.NH 2
302\&Fixed-length Opaque Data
303.IX XDR "fixed-length opaque data"
304.IX XDR "opaque data, fixed length"
305.LP
306At times, fixed-length uninterpreted data needs to be passed among
307machines.  This data is called "opaque" and is declared as follows:
308.DS
309.ft CW
310opaque identifier[n];
311.DE
312where the constant n is the (static) number of bytes necessary to
313contain the opaque data.  If n is not a multiple of four, then the n
314bytes are followed by enough (0 to 3) residual zero bytes, r, to make
315the total byte count of the opaque object a multiple of four.
316.ie t .DS
317.el .DS L
318\fIFixed-Length Opaque\fP
319
320\f(CW0        1     ...
321+--------+--------+...+--------+--------+...+--------+
322| byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
323+--------+--------+...+--------+--------+...+--------+
324|<-----------n bytes---------->|<------r bytes------>|
325|<-----------n+r (where (n+r) mod 4 = 0)------------>|\fP
326.DE
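A sketch of fixed-length opaque encoding, under the assumption that the caller supplies exactly n bytes. The function name encode_fixed_opaque is hypothetical. No length is transmitted, because n is static and known to both sides.

```python
def encode_fixed_opaque(data: bytes, n: int) -> bytes:
    """Encode opaque identifier[n]: n data bytes plus 0 to 3 zero bytes."""
    if len(data) != n:
        raise ValueError(f"expected exactly {n} bytes, got {len(data)}")
    r = (4 - n % 4) % 4   # residual zero bytes
    return data + b"\x00" * r


print(encode_fixed_opaque(b"\x01\x02\x03\x04\x05", 5).hex())
```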
327.NH 2
328\&Variable-length Opaque Data
329.IX XDR "variable-length opaque data"
330.IX XDR "opaque data, variable length"
331.LP
The standard also provides for variable-length (counted) opaque data,
defined as a sequence of n (numbered 0 through n-1) arbitrary bytes
to be the number n encoded as an unsigned integer (as described
above), and followed by the n bytes of the sequence.
.LP
Byte m of the sequence always precedes byte m+1 of the sequence, and
byte 0 of the sequence always follows the sequence's length (count).
If n is not a multiple of four, then the n bytes are followed by
enough (0 to 3) residual zero bytes, r, to make the total byte count
a multiple of four.  Variable-length opaque data is declared in the
following way:
342.DS
343.ft CW
344opaque identifier<m>;
345.DE
346or
347.DS
348.ft CW
349opaque identifier<>;
350.DE
351The constant m denotes an upper bound of the number of bytes that the
352sequence may contain.  If m is not specified, as in the second
353declaration, it is assumed to be (2**32) - 1, the maximum length.
354The constant m would normally be found in a protocol specification.
355For example, a filing protocol may state that the maximum data
356transfer size is 8192 bytes, as follows:
357.DS
358.ft CW
359opaque filedata<8192>;
360.DE
361This can be illustrated as follows:
362.ie t .DS
363.el .DS L
364\fIVariable-Length Opaque\fP
365
366\f(CW0     1     2     3     4     5   ...
367+-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
368|        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
369+-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
370|<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
371|<----n+r (where (n+r) mod 4 = 0)---->|\fP
372.DE
373.LP
It is an error to encode a length greater than the maximum
described in the specification.
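The counted opaque encoding can be sketched as follows. The name encode_opaque and the parameter m are hypothetical; m plays the role of the upper bound in the declaration and defaults to (2**32) - 1.

```python
import struct

MAX_LEN = 2 ** 32 - 1   # default bound when m is not specified


def encode_opaque(data: bytes, m: int = MAX_LEN) -> bytes:
    """Encode opaque identifier<m>: length n, the n bytes, then zero padding."""
    n = len(data)
    if n > m:
        raise ValueError(f"length {n} exceeds the maximum {m}")
    r = (4 - n % 4) % 4
    return struct.pack(">I", n) + data + b"\x00" * r


# With the filedata<8192> example, any payload up to 8192 bytes is accepted.
print(encode_opaque(b"abc", 8192).hex())
```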
376.NH 2
377\&String
378.IX XDR string
379.LP
380The standard defines a string of n (numbered 0 through n-1) ASCII
381bytes to be the number n encoded as an unsigned integer (as described
382above), and followed by the n bytes of the string.  Byte m of the
383string always precedes byte m+1 of the string, and byte 0 of the
384string always follows the string's length.  If n is not a multiple of
385four, then the n bytes are followed by enough (0 to 3) residual zero
386bytes, r, to make the total byte count a multiple of four.  Counted
387byte strings are declared as follows:
388.DS
389.ft CW
390string object<m>;
391.DE
392or
393.DS
394.ft CW
395string object<>;
396.DE
397The constant m denotes an upper bound of the number of bytes that a
398string may contain.  If m is not specified, as in the second
399declaration, it is assumed to be (2**32) - 1, the maximum length.
400The constant m would normally be found in a protocol specification.
401For example, a filing protocol may state that a file name can be no
402longer than 255 bytes, as follows:
403.DS
404.ft CW
405string filename<255>;
406.DE
407Which can be illustrated as:
408.ie t .DS
409.el .DS L
410\fIA String\fP
411
412\f(CW0     1     2     3     4     5   ...
413+-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
414|        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
415+-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
416|<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
417|<----n+r (where (n+r) mod 4 = 0)---->|\fP
418.DE
419.LP
It is an error to encode a length greater than the maximum
described in the specification.
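Decoding a string reverses the steps above: read the length, take n bytes, then skip the padding. This is a sketch only; decode_string is a hypothetical name, and it returns the number of bytes consumed so a caller can continue with the next item.

```python
import struct


def decode_string(buf: bytes, m: int = 2 ** 32 - 1):
    """Decode string object<m> from the start of buf.

    Returns (text, consumed), where consumed includes length and padding.
    """
    (n,) = struct.unpack_from(">I", buf, 0)
    if n > m:
        raise ValueError(f"length {n} exceeds the maximum {m}")
    end = 4 + n
    padded_end = end + (4 - n % 4) % 4
    if len(buf) < padded_end:
        raise ValueError("buffer is shorter than the encoded string")
    return buf[4:end].decode("ascii"), padded_end


# A file name under the filename<255> example bound.
print(decode_string(b"\x00\x00\x00\x05hello\x00\x00\x00", 255))
```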
422.NH 2
423\&Fixed-length Array
424.IX XDR "fixed-length array"
425.IX XDR "array, fixed length"
426.LP
427Declarations for fixed-length arrays of homogeneous elements are in
428the following form:
429.DS
430.ft CW
431type-name identifier[n];
432.DE
433Fixed-length arrays of elements numbered 0 through n-1 are encoded by
434individually encoding the elements of the array in their natural
435order, 0 through n-1.  Each element's size is a multiple of four
436bytes. Though all elements are of the same type, the elements may
437have different sizes.  For example, in a fixed-length array of
438strings, all elements are of type "string", yet each element will
439vary in its length.
440.ie t .DS
441.el .DS L
442\fIFixed-Length Array\fP
443
444\f(CW+---+---+---+---+---+---+---+---+...+---+---+---+---+
445|   element 0   |   element 1   |...|  element n-1  |
446+---+---+---+---+---+---+---+---+...+---+---+---+---+
447|<--------------------n elements------------------->|\fP
448.DE
449.NH 2
450\&Variable-length Array
451.IX XDR "variable-length array"
452.IX XDR "array, variable length"
453.LP
Counted arrays provide the ability to encode variable-length arrays
of homogeneous elements.  The array is encoded as the element count n
(an unsigned integer) followed by the encoding of each of the array's
elements, starting with element 0 and progressing through element
n-1.  The declaration for variable-length arrays follows this form:
459.DS
460.ft CW
461type-name identifier<m>;
462.DE
463or
464.DS
465.ft CW
466type-name identifier<>;
467.DE
The constant m specifies the maximum acceptable element count of an
array; if m is not specified, as in the second declaration, it is
assumed to be (2**32) - 1.
471.ie t .DS
472.el .DS L
473\fICounted Array\fP
474
475\f(CW0  1  2  3
476+--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
477|     n     | element 0 | element 1 |...|element n-1|
478+--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
479|<-4 bytes->|<--------------n elements------------->|\fP
480.DE
It is an error to encode a value of n that is greater than the
maximum described in the specification.
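For a counted array whose elements are integers, the encoding might be sketched as below. The name encode_int_array is hypothetical, and the element type "int" is chosen only for illustration.

```python
import struct


def encode_int_array(values, m: int = 2 ** 32 - 1) -> bytes:
    """Encode int identifier<m>: the count n, then elements 0 through n-1."""
    n = len(values)
    if n > m:
        raise ValueError(f"element count {n} exceeds the maximum {m}")
    head = struct.pack(">I", n)
    body = b"".join(struct.pack(">i", v) for v in values)
    return head + body


print(encode_int_array([1, -1]).hex())
```

Note that n counts elements, not bytes, which is the difference from counted opaque data and strings.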
483.NH 2
484\&Structure
485.IX XDR structure
486.LP
487Structures are declared as follows:
488.DS
489.ft CW
490struct {
491        component-declaration-A;
492        component-declaration-B;
493        \&...
494} identifier;
495.DE
496The components of the structure are encoded in the order of their
497declaration in the structure.  Each component's size is a multiple of
498four bytes, though the components may be different sizes.
499.ie t .DS
500.el .DS L
501\fIStructure\fP
502
503\f(CW+-------------+-------------+...
504| component A | component B |...
505+-------------+-------------+...\fP
506.DE
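As a sketch, consider a hypothetical structure with two components, "int id;" followed by "string name<>;". Neither the structure nor the function names come from the standard. The components are simply concatenated in declaration order.

```python
import struct


def encode_string(text: str) -> bytes:
    """Length, ASCII bytes, then zero padding to a multiple of four."""
    data = text.encode("ascii")
    r = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * r


def encode_record(ident: int, name: str) -> bytes:
    """Encode struct { int id; string name<>; } component by component."""
    return struct.pack(">i", ident) + encode_string(name)


print(encode_record(7, "abc").hex())
```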
507.NH 2
508\&Discriminated Union
509.IX XDR "discriminated union"
510.IX XDR union discriminated
511.LP
512A discriminated union is a type composed of a discriminant followed
513by a type selected from a set of prearranged types according to the
514value of the discriminant.  The type of discriminant is either "int",
515"unsigned int", or an enumerated type, such as "bool".  The component
516types are called "arms" of the union, and are preceded by the value
517of the discriminant which implies their encoding.  Discriminated
518unions are declared as follows:
519.DS
520.ft CW
521union switch (discriminant-declaration) {
522        case discriminant-value-A:
523        arm-declaration-A;
524        case discriminant-value-B:
525        arm-declaration-B;
526        \&...
527        default: default-declaration;
528} identifier;
529.DE
530Each "case" keyword is followed by a legal value of the discriminant.
531The default arm is optional.  If it is not specified, then a valid
532encoding of the union cannot take on unspecified discriminant values.
533The size of the implied arm is always a multiple of four bytes.
534.LP
535The discriminated union is encoded as its discriminant followed by
536the encoding of the implied arm.
537.ie t .DS
538.el .DS L
539\fIDiscriminated Union\fP
540
541\f(CW0   1   2   3
542+---+---+---+---+---+---+---+---+
543|  discriminant |  implied arm  |
544+---+---+---+---+---+---+---+---+
545|<---4 bytes--->|\fP
546.DE
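The following sketch encodes a hypothetical union with an "int" discriminant and three arms: case 0 is void, case 1 is "int number;" and case 2 is "string text<>;". The union and the function names are assumptions made for illustration. Because no default arm is declared, other discriminant values are rejected.

```python
import struct


def encode_string(text: str) -> bytes:
    """Length, ASCII bytes, then zero padding to a multiple of four."""
    data = text.encode("ascii")
    r = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * r


def encode_value(kind: int, payload=None) -> bytes:
    """Encode the discriminant followed by the arm it implies."""
    head = struct.pack(">i", kind)
    if kind == 0:                      # case 0: void;
        return head
    if kind == 1:                      # case 1: int number;
        return head + struct.pack(">i", payload)
    if kind == 2:                      # case 2: string text<>;
        return head + encode_string(payload)
    raise ValueError(f"no arm is declared for discriminant {kind}")


print(encode_value(1, 42).hex())
```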
547.NH 2
548\&Void
549.IX XDR void
550.LP
551An XDR void is a 0-byte quantity.  Voids are useful for describing
552operations that take no data as input or no data as output. They are
553also useful in unions, where some arms may contain data and others do
554not.  The declaration is simply as follows:
555.DS
556.ft CW
557void;
558.DE
559Voids are illustrated as follows:
560.ie t .DS
561.el .DS L
562\fIVoid\fP
563
564\f(CW  ++
565  ||
566  ++
567--><-- 0 bytes\fP
568.DE
569.NH 2
570\&Constant
571.IX XDR constant
572.LP
573The data declaration for a constant follows this form:
574.DS
575.ft CW
576const name-identifier = n;
577.DE
578"const" is used to define a symbolic name for a constant; it does not
579declare any data.  The symbolic constant may be used anywhere a
580regular constant may be used.  For example, the following defines a
581symbolic constant DOZEN, equal to 12.
582.DS
583.ft CW
584const DOZEN = 12;
585.DE
586.NH 2
587\&Typedef
588.IX XDR typedef
589.LP
590"typedef" does not declare any data either, but serves to define new
591identifiers for declaring data. The syntax is:
592.DS
593.ft CW
594typedef declaration;
595.DE
596The new type name is actually the variable name in the declaration
597part of the typedef.  For example, the following defines a new type
598called "eggbox" using an existing type called "egg":
599.DS
600.ft CW
601typedef egg eggbox[DOZEN];
602.DE
Variables declared using the new type name have the same type as the
new type name would have in the typedef, if it were considered a
variable.  For example, the following two declarations are equivalent
in declaring the variable "fresheggs":
607.DS
608.ft CW
609eggbox  fresheggs;
610egg     fresheggs[DOZEN];
611.DE
612When a typedef involves a struct, enum, or union definition, there is
613another (preferred) syntax that may be used to define the same type.
614In general, a typedef of the following form:
615.DS
616.ft CW
617typedef <<struct, union, or enum definition>> identifier;
618.DE
619may be converted to the alternative form by removing the "typedef"
620part and placing the identifier after the "struct", "union", or
621"enum" keyword, instead of at the end.  For example, here are the two
622ways to define the type "bool":
623.DS
624.ft CW
625typedef enum {    /* \fIusing typedef\fP */
626        FALSE = 0,
627        TRUE = 1
628        } bool;
629
630enum bool {       /* \fIpreferred alternative\fP */
631        FALSE = 0,
632        TRUE = 1
633        };
634.DE
The reason this syntax is preferred is that one does not have to wait
until the end of a declaration to figure out the name of the new
type.
638.NH 2
639\&Optional-data
640.IX XDR "optional data"
641.IX XDR "data, optional"
642.LP
Optional-data is one kind of union that occurs so frequently that we
give it a special declaration syntax of its own.  It is declared
as follows:
646.DS
647.ft CW
648type-name *identifier;
649.DE
650This is equivalent to the following union:
651.DS
652.ft CW
653union switch (bool opted) {
654        case TRUE:
655        type-name element;
656        case FALSE:
657        void;
658} identifier;
659.DE
660It is also equivalent to the following variable-length array
661declaration, since the boolean "opted" can be interpreted as the
662length of the array:
663.DS
664.ft CW
665type-name identifier<1>;
666.DE
667Optional-data is not so interesting in itself, but it is very useful
668for describing recursive data-structures such as linked-lists and
669trees.  For example, the following defines a type "stringlist" that
670encodes lists of arbitrary length strings:
671.DS
672.ft CW
673struct *stringlist {
674        string item<>;
675        stringlist next;
676};
677.DE
678It could have been equivalently declared as the following union:
679.DS
680.ft CW
681union stringlist switch (bool opted) {
682        case TRUE:
683                struct {
684                        string item<>;
685                        stringlist next;
686                } element;
687        case FALSE:
688                void;
689};
690.DE
691or as a variable-length array:
692.DS
693.ft CW
694struct stringlist<1> {
695        string item<>;
696        stringlist next;
697};
698.DE
699Both of these declarations obscure the intention of the stringlist
700type, so the optional-data declaration is preferred over both of
701them.  The optional-data type also has a close correlation to how
702recursive data structures are represented in high-level languages
703such as Pascal or C by use of pointers. In fact, the syntax is the
704same as that of the C language for pointers.
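The "stringlist" type can be encoded by following the union form: each node is a TRUE boolean followed by the item, and the list ends with a FALSE boolean. The sketch below assumes a Python list as the in-memory form; the function names are hypothetical.

```python
import struct


def encode_string(text: str) -> bytes:
    """Length, ASCII bytes, then zero padding to a multiple of four."""
    data = text.encode("ascii")
    r = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * r


def encode_stringlist(items) -> bytes:
    """Encode a stringlist: (TRUE, item) per node, then FALSE at the end."""
    out = b""
    for item in items:
        out += struct.pack(">i", 1)    # opted = TRUE, an element follows
        out += encode_string(item)     # string item<>;
    return out + struct.pack(">i", 0)  # opted = FALSE, the void arm


print(encode_stringlist(["a", "bc"]).hex())
```

An empty list is therefore four zero bytes, the encoding of FALSE alone.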
705.NH 2
706\&Areas for Future Enhancement
707.IX XDR futures
708.LP
709The XDR standard lacks representations for bit fields and bitmaps,
710since the standard is based on bytes.  Also missing are packed (or
711binary-coded) decimals.
712.LP
713The intent of the XDR standard was not to describe every kind of data
714that people have ever sent or will ever want to send from machine to
715machine. Rather, it only describes the most commonly used data-types
716of high-level languages such as Pascal or C so that applications
717written in these languages will be able to communicate easily over
718some medium.
719.LP
One could imagine extensions to XDR that would let it describe almost
any existing protocol, such as TCP.  The minimum necessary for this
is support for different block sizes and byte-orders.  The XDR
discussed here could then be considered the 4-byte big-endian member
of a larger XDR family.
725.NH 1
726\&Discussion
727.sp 2
728.NH 2
729\&Why a Language for Describing Data?
730.IX XDR language
731.LP
There are many advantages in using a data-description language such
as XDR versus using diagrams.  Languages are more formal than
diagrams and lead to less ambiguous descriptions of data.
Languages are also easier to understand and allow one to think of
other issues instead of the low-level details of bit-encoding.
Also, there is a close analogy between the types of XDR and a
high-level language such as C or Pascal.  This makes the
implementation of XDR encoding and decoding modules an easier task.
Finally, the language specification itself is an ASCII string that
can be passed from machine to machine to perform on-the-fly data
interpretation.
743.NH 2
744\&Why Only one Byte-Order for an XDR Unit?
745.IX XDR "byte order"
746.LP
747Supporting two byte-orderings requires a higher level protocol for
748determining in which byte-order the data is encoded.  Since XDR is
749not a protocol, this can't be done.  The advantage of this, though,
750is that data in XDR format can be written to a magnetic tape, for
751example, and any machine will be able to interpret it, since no
752higher level protocol is necessary for determining the byte-order.
753.NH 2
754\&Why does XDR use Big-Endian Byte-Order?
755.LP
756Yes, it is unfair, but having only one byte-order means you have to
757be unfair to somebody.  Many architectures, such as the Motorola
75868000 and IBM 370, support the big-endian byte-order.
759.NH 2
760\&Why is the XDR Unit Four Bytes Wide?
761.LP
762There is a tradeoff in choosing the XDR unit size.  Choosing a small
763size such as two makes the encoded data small, but causes alignment
764problems for machines that aren't aligned on these boundaries.  A
765large size such as eight means the data will be aligned on virtually
766every machine, but causes the encoded data to grow too big.  We chose
767four as a compromise.  Four is big enough to support most
768architectures efficiently, except for rare machines such as the
769eight-byte aligned Cray.  Four is also small enough to keep the
770encoded data restricted to a reasonable size.
771.NH 2
772\&Why must Variable-Length Data be Padded with Zeros?
773.IX XDR "variable-length data"
774.LP
775It is desirable that the same data encode into the same thing on all
776machines, so that encoded data can be meaningfully compared or
777checksummed.  Forcing the padded bytes to be zero ensures this.
778.NH 2
779\&Why is there No Explicit Data-Typing?
780.LP
781Data-typing has a relatively high cost for what small advantages it
782may have.  One cost is the expansion of data due to the inserted type
783fields.  Another is the added cost of interpreting these type fields
784and acting accordingly.  And most protocols already know what type
785they expect, so data-typing supplies only redundant information.
786However, one can still get the benefits of data-typing using XDR. One
787way is to encode two things: first a string which is the XDR data
788description of the encoded data, and then the encoded data itself.
789Another way is to assign a value to all the types in XDR, and then
790define a universal type which takes this value as its discriminant
791and for each value, describes the corresponding data type.
792.NH 1
793\&The XDR Language Specification
794.IX XDR language
795.sp 1
796.NH 2
797\&Notational Conventions
798.IX "XDR language" notation
799.LP
This specification uses an extended Backus-Naur Form notation for
describing the XDR language.  Here is a brief description of the
notation:
803.IP  1.
804The characters
805.I | ,
806.I ( ,
807.I ) ,
808.I [ ,
809.I ] ,
810.I " ,
811and
812.I *
813are special.
814.IP  2.
Terminal symbols are strings of any characters surrounded by
double quotes.
817.IP  3.
818Non-terminal symbols are strings of non-special characters.
819.IP  4.
820Alternative items are separated by a vertical bar ("\fI|\fP").
821.IP  5.
822Optional items are enclosed in brackets.
823.IP  6.
824Items are grouped together by enclosing them in parentheses.
825.IP  7.
826A
827.I *
following an item means 0 or more occurrences of that item.
829.LP
For example, consider the following pattern:
.DS L
"a " "very" (", " "very")* [" cold " "and"] " rainy " ("day" | "night")
.DE
.LP
An infinite number of strings match this pattern. A few of them
are:
.DS
"a very rainy day"
"a very, very rainy day"
"a very cold and rainy day"
"a very, very, very cold and rainy night"
.DE
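The example pattern corresponds closely to a regular expression. The sketch below assumes the terminals are concatenated literally with single separating spaces. PATTERN and matches are hypothetical names.

```python
import re

# "*" maps to a repeated group, "[...]" to an optional group,
# and "|" to alternation.
PATTERN = re.compile(r"a very(, very)*( cold and)? rainy (day|night)")


def matches(text: str) -> bool:
    """True if the whole string is generated by the example pattern."""
    return PATTERN.fullmatch(text) is not None


for candidate in ("a very rainy day", "a rainy day"):
    print(candidate, matches(candidate))
```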
843.NH 2
844\&Lexical Notes
845.IP  1.
846Comments begin with '/*' and terminate with '*/'.
847.IP  2.
848White space serves to separate items and is otherwise ignored.
849.IP  3.
An identifier is a letter followed by an optional sequence of
letters, digits or underbar ('_').  The case of identifiers is
not ignored.
853.IP  4.
A constant is a sequence of one or more decimal digits,
optionally preceded by a minus-sign ('-').
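The identifier and constant rules above can be checked with two regular expressions. This is a sketch that assumes "letter" means an ASCII letter; the names are hypothetical.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")  # a letter, then letters, digits or '_'
CONSTANT = re.compile(r"-?[0-9]+")                 # optional '-', then decimal digits


def is_identifier(token: str) -> bool:
    return IDENTIFIER.fullmatch(token) is not None


def is_constant(token: str) -> bool:
    return CONSTANT.fullmatch(token) is not None


print(is_identifier("file_name1"), is_constant("-12"))
```

A full lexer would also reject keywords as identifiers, which this sketch does not attempt.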
.NH 2
\&Syntax Information
.IX "XDR language" syntax
.DS
.ft CW
declaration:
        type-specifier identifier
        | type-specifier identifier "[" value "]"
        | type-specifier identifier "<" [ value ] ">"
        | "opaque" identifier "[" value "]"
        | "opaque" identifier "<" [ value ] ">"
        | "string" identifier "<" [ value ] ">"
        | type-specifier "*" identifier
        | "void"
.DE
.DS
.ft CW
value:
        constant
        | identifier

type-specifier:
          [ "unsigned" ] "int"
        | [ "unsigned" ] "hyper"
        | "float"
        | "double"
        | "bool"
        | enum-type-spec
        | struct-type-spec
        | union-type-spec
        | identifier
.DE
.DS
.ft CW
enum-type-spec:
        "enum" enum-body

enum-body:
        "{"
        ( identifier "=" value )
        ( "," identifier "=" value )*
        "}"
.DE
.DS
.ft CW
struct-type-spec:
        "struct" struct-body

struct-body:
        "{"
        ( declaration ";" )
        ( declaration ";" )*
        "}"
.DE
.DS
.ft CW
union-type-spec:
        "union" union-body

union-body:
        "switch" "(" declaration ")" "{"
        ( "case" value ":" declaration ";" )
        ( "case" value ":" declaration ";" )*
        [ "default" ":" declaration ";" ]
        "}"

constant-def:
        "const" identifier "=" constant ";"
.DE
.DS
.ft CW
type-def:
        "typedef" declaration ";"
        | "enum" identifier enum-body ";"
        | "struct" identifier struct-body ";"
        | "union" identifier union-body ";"

definition:
        type-def
        | constant-def

specification:
        definition *
.DE
.NH 3
\&Syntax Notes
.IX "XDR language" syntax
.LP
.IP  1.
The following are keywords and cannot be used as identifiers:
"bool", "case", "const", "default", "double", "enum", "float",
"hyper", "int", "opaque", "string", "struct", "switch", "typedef",
"union", "unsigned" and "void".
.IP  2.
Only unsigned constants may be used as size specifications for
arrays.  If an identifier is used, it must have been declared
previously as an unsigned constant in a "const" definition.
.IP  3.
Constant and type identifiers within the scope of a specification
are in the same name space and must be declared uniquely within this
scope.
.IP  4.
Similarly, variable names must be unique within the scope of
struct and union declarations. Nested struct and union declarations
create new scopes.
.IP  5.
The discriminant of a union must be of a type that evaluates to
an integer. That is, "int", "unsigned int", "bool", an enumerated
type or any typedefed type that evaluates to one of these is legal.
Also, each case value must be one of the legal values of the
discriminant.  Finally, a case value may not be specified more than
once within the scope of a union declaration.
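.LP
The two constraints on case values in note 5 lend themselves to a
mechanical check. The following Python sketch is an assumption-laden
illustration rather than part of the standard: the function name
check_union_cases and its message strings are invented here, and the
caller is expected to supply the set (or range) of values that are
legal for the discriminant type.

```python
def check_union_cases(legal_values, case_values):
    """Return a list of problems found in the case values of a union.

    legal_values: a container of the values the discriminant may take,
                  e.g. the numeric values of an enumerated type.
    case_values:  the values given after "case", in declaration order.
    """
    problems = []
    seen = set()
    for value in case_values:
        # Each case value must be a legal value of the discriminant.
        if value not in legal_values:
            problems.append("illegal case value: %r" % (value,))
        # A case value may not be specified more than once.
        if value in seen:
            problems.append("duplicate case value: %r" % (value,))
        seen.add(value)
    return problems


# Legal values when the discriminant is a 32-bit "int"; membership tests
# on a range object do not enumerate its elements.
INT_VALUES = range(-2**31, 2**31)
```

For an enumerated discriminant whose values are 0, 1 and 2, the call
check_union_cases({0, 1, 2}, [0, 1, 2]) returns an empty list, while a
repeated or out-of-range case value produces one message each.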
.NH 1
\&An Example of an XDR Data Description
.LP
Here is a short XDR data description of a thing called a "file",
which might be used to transfer files from one machine to another.
.ie t .DS
.el .DS L
.ft CW

const MAXUSERNAME = 32;     /*\fI max length of a user name \fP*/
const MAXFILELEN = 65535;   /*\fI max length of a file      \fP*/
const MAXNAMELEN = 255;     /*\fI max length of a file name \fP*/

.ft I
/*
 * Types of files:
 */
.ft CW

enum filekind {
        TEXT = 0,       /*\fI ascii data \fP*/
        DATA = 1,       /*\fI raw data   \fP*/
        EXEC = 2        /*\fI executable \fP*/
};

.ft I
/*
 * File information, per kind of file:
 */
.ft CW

union filetype switch (filekind kind) {
        case TEXT:
                void;                           /*\fI no extra information \fP*/
        case DATA:
                string creator<MAXNAMELEN>;     /*\fI data creator         \fP*/
        case EXEC:
                string interpretor<MAXNAMELEN>; /*\fI program interpretor  \fP*/
};

.ft I
/*
 * A complete file:
 */
.ft CW

struct file {
        string filename<MAXNAMELEN>; /*\fI name of file \fP*/
        filetype type;               /*\fI info about file \fP*/
        string owner<MAXUSERNAME>;   /*\fI owner of file   \fP*/
        opaque data<MAXFILELEN>;     /*\fI file data       \fP*/
};
.DE
.LP
Suppose now that there is a user named "john" who wants to store
his lisp program "sillyprog" that contains just the data "(quit)".
His file would be encoded as follows:
.TS
box tab (&) ;
lfI lfI lfI lfI
rfL rfL rfL l .
Offset&Hex Bytes&ASCII&Description
_
0&00 00 00 09&....&Length of filename = 9
4&73 69 6c 6c&sill&Filename characters
8&79 70 72 6f&ypro& ... and more characters ...
12&67 00 00 00&g...& ... and 3 zero-bytes of fill
16&00 00 00 02&....&Filekind is EXEC = 2
20&00 00 00 04&....&Length of interpretor = 4
24&6c 69 73 70&lisp&Interpretor characters
28&00 00 00 04&....&Length of owner = 4
32&6a 6f 68 6e&john&Owner characters
36&00 00 00 06&....&Length of file data = 6
40&28 71 75 69&(qui&File data bytes ...
44&74 29 00 00&t)..& ... and 2 zero-bytes of fill
.TE
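.LP
The byte layout in the table can be reproduced with a few lines of
Python. This is a sketch under stated assumptions, not the library
interface described elsewhere: the helper names pack_uint, pack_opaque,
pack_string and encode_file are invented for this example, the maximum
lengths from the data description are not enforced, and only the
standard struct module is used. Integers are written as four big-endian
bytes, and variable-length data is preceded by its length and padded
with zero bytes to a multiple of four.

```python
import struct

TEXT, DATA, EXEC = 0, 1, 2  # values of the "filekind" enumeration


def pack_uint(value):
    """Encode an unsigned integer as four big-endian bytes."""
    return struct.pack(">I", value)


def pack_opaque(data):
    """Encode variable-length opaque data: length, bytes, zero fill."""
    fill = (4 - len(data) % 4) % 4
    return pack_uint(len(data)) + data + bytes(fill)


def pack_string(text):
    """Encode an ASCII string the same way as opaque data."""
    return pack_opaque(text.encode("ascii"))


def encode_file(filename, kind, extra, owner, data):
    """Encode a "file" structure; extra is None for the TEXT arm."""
    out = pack_string(filename)
    out += struct.pack(">i", kind)      # discriminant of "filetype"
    if kind != TEXT:
        out += pack_string(extra)       # creator or interpretor
    out += pack_string(owner)
    out += pack_opaque(data)
    return out


encoded = encode_file("sillyprog", EXEC, "lisp", "john", b"(quit)")
```

The result is 48 bytes long, and encoded.hex() yields the same byte
sequence as the Hex Bytes column of the table, read top to bottom.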
.NH 1
\&References
.LP
[1]  Brian W. Kernighan & Dennis M. Ritchie, "The C Programming
Language", Bell Laboratories, Murray Hill, New Jersey, 1978.
.LP
[2]  Danny Cohen, "On Holy Wars and a Plea for Peace", IEEE Computer,
October 1981.
.LP
[3]  "IEEE Standard for Binary Floating-Point Arithmetic", ANSI/IEEE
Standard 754-1985, Institute of Electrical and Electronics
Engineers, August 1985.
.LP
[4]  "Courier: The Remote Procedure Call Protocol", XEROX
Corporation, XSIS 038112, December 1981.