Reading COBOL Layouts

This tutorial on how to read a COBOL layout was written specifically for our customers who have had a conversion performed at Disc Interchange and have received a COBOL layout with the data. It is intended to give you enough information to read most simple layouts. It does not cover all topics or everything you would find in a complex layout, and it is intended to explain COBOL layouts only so you can use your converted data, not so you can write COBOL programs.

This article begins at Reading COBOL Layouts, where you will also find a topic index.

Part 4: Numeric Fields

This section describes several numeric data types and the handling of signs and decimal points.

Contents of this section:

   The "Usage Is" Clause
   Usage is Display
   Signed Fields
   Sign is Separate
   Computational and Binary Fields
   Real Decimal
   Implied Decimal
   Synchronization and Alignment

COBOL has several types of numeric fields. These data types include a "DISPLAY" field, which is composed of characters (the EBCDIC or ASCII characters for 0 - 9), binary fields, packed fields, and floating-point fields. There are also options for a separate + or - sign or a sign overpunch, and for real or implied decimal. The data type is specified by the "USAGE IS" clause.

The "USAGE IS" Clause

There is actually more to the picture statement than we've previously described. There is a "USAGE IS" clause that specifies the type of storage of a numeric field -- "display", binary, or computational. The full syntax, via an example, is:

   05  ACCOUNT-BALANCE    PIC S9(6)V99 USAGE IS COMPUTATIONAL-3.

This says to store the field in the computational-3 format. The "usage is" part is optional and generally left off, and "computational" can be abbreviated "COMP", so you will more commonly see this written

   05  ACCOUNT-BALANCE    PIC S9(6)V99 COMP-3.

The types of numeric fields you will commonly see in COBOL layouts are:

Display (including Signed fields)
Binary
Computational, or comp
Comp-1
Comp-2
Comp-3

Display, including "Signed" or "Zoned" fields, is the most common, and comp-3 is the second most common type of numeric field. Some compilers may also have comp-4 and comp-5 data types, usually to emulate comp on another compiler.

Usage is Display

Display format is the default for numbers in COBOL. If no "usage is" clause is specified, the default is "usage is display", which means the value is stored as characters (EBCDIC digits on an IBM mainframe, ASCII digits on most other systems), as opposed to binary. The value may or may not have a decimal point -- implied or real -- and may be unsigned or have an embedded or separate sign, which can be either leading or trailing. The default "signed display" format field contains an embedded trailing sign, and is commonly called a "Signed", "IBM Signed", or "Zoned" field. This data type is described below.

Signed Fields

There is a common numeric data type used in COBOL on IBM mainframes called "Signed" (also called "IBM Signed", or "Zoned"). COBOL represents this type of field by an "S" in the picture clause of a display format field, e.g. PIC S9(6). A Signed field is composed of regular EBCDIC numeric characters, one character per byte, for all digits except the one that holds the sign. That digit is either the most-significant (sign leading) or the least-significant (sign trailing) digit -- usually the least-significant. The sign of the number is combined with, or "overpunched" onto, that digit, which saves the one byte that a separate sign would otherwise occupy. The digit's value is stored in binary in the low four bits of the byte and is OR'd with the sign code in the high four bits: D0 hex for negative values, C0 hex for positive values, and F0 hex for "unsigned" values.

Because of the overpunch, the digit that holds the sign will not appear as a number when the field is viewed in EBCDIC character mode. If you have the field

   05  ACCOUNT-BALANCE         PIC S9(6)V99.

and view a value of 1.23 with an EBCDIC editor, it will read "0000012C".

ASCII COBOL compilers also use a Signed data type with an overpunch, but the sign bits are different and not standardized between compilers. See our Tech-Talk brief Signed Fields for further details on both EBCDIC and ASCII Signed fields.
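To make the overpunch concrete, here is a minimal Python sketch of how an EBCDIC Signed field with a trailing sign could be decoded. The function name decode_zoned and its scale parameter (the number of digits after the "V") are our own illustrative choices, not part of any library, and the sketch handles only the D0, C0, and F0 sign codes described above.

```python
from decimal import Decimal

def decode_zoned(raw: bytes, scale: int = 0) -> Decimal:
    """Decode an EBCDIC Signed (zoned) field with a trailing overpunched sign.

    Hypothetical helper for illustration; 'scale' is the number of digits
    after the implied decimal point (the V in the PIC).
    """
    if not raw:
        raise ValueError("empty field")
    digits = []
    for byte in raw:
        low = byte & 0x0F          # the low four bits hold the digit value
        if low > 9:
            raise ValueError(f"invalid digit in byte {byte:#04x}")
        digits.append(str(low))
    zone = raw[-1] & 0xF0          # the high four bits of the last byte hold the sign
    if zone == 0xD0:
        negative = True
    elif zone in (0xC0, 0xF0):     # C0 = positive, F0 = unsigned
        negative = False
    else:
        raise ValueError(f"unexpected sign code {zone:#04x}")
    value = int("".join(digits))
    if negative:
        value = -value
    return Decimal(value).scaleb(-scale)

# PIC S9(6)V99 holding 1.23: seven plain digits, then 3 OR'd with C0.
field = bytes.fromhex("F0F0F0F0F0F1F2C3")
print(decode_zoned(field, scale=2))
```

The bytes in the example are the same ones an EBCDIC editor would display as "0000012C". A leading-sign field would need the zone taken from the first byte instead, and ASCII Signed fields use different, compiler-specific codes that this sketch does not attempt to cover.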

Sign is Separate

COBOL signed fields embed the sign in the value by default (see signed fields above). But there is a provision in COBOL for a separate sign, and it can be either leading or trailing. The statement for this is

       SIGN IS SEPARATE  (or SIGN SEPARATE)

This may be combined with the leading or trailing clause:

       SIGN IS LEADING SEPARATE  (or SIGN LEADING SEPARATE)

   or  SIGN IS TRAILING SEPARATE

This statement can be applied to an elementary item (field) or to the entire record.
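A field with a separate sign is one byte longer than its PIC digits suggest, because the "+" or "-" occupies its own character. The following Python sketch shows one way such a field might be read once the data has been converted to ASCII text. The name decode_sign_separate and its parameters are hypothetical, chosen only for this illustration.

```python
from decimal import Decimal

def decode_sign_separate(text: str, scale: int = 0, leading: bool = True) -> Decimal:
    """Decode a display field declared with SIGN IS LEADING/TRAILING SEPARATE.

    Hypothetical helper: 'text' is the field already converted to ASCII,
    'scale' is the number of digits after the implied decimal point.
    """
    if len(text) < 2:
        raise ValueError("field must hold a sign and at least one digit")
    sign_char = text[0] if leading else text[-1]
    digits = text[1:] if leading else text[:-1]
    if sign_char not in ("+", "-"):
        raise ValueError(f"unexpected sign character {sign_char!r}")
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"non-numeric digits {digits!r}")
    value = int(digits)
    if sign_char == "-":
        value = -value
    return Decimal(value).scaleb(-scale)

# PIC S9(6)V99 SIGN IS LEADING SEPARATE occupies 9 bytes: the sign plus 8 digits.
print(decode_sign_separate("-00000123", scale=2))
# The same value with SIGN IS TRAILING SEPARATE.
print(decode_sign_separate("00000123-", scale=2, leading=False))
```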

Computational and Binary Fields

Because computers perform computations with binary numbers, it is more efficient to store those values in the file in their native binary form than to store them in human readable base ten. If the number is stored in its native binary format it can be input from the file and used directly. If it's stored in a base ten format it needs to be converted to binary before performing calculations on it, then converted back to base ten for storage.

COBOL defines several binary data types. We will list a brief summary here, and you can find more detail in COBOL Computational Fields and in COBOL Comp-3 Packed Fields. Before we start, there is one important point to understand: the COBOL standard leaves the actual implementation of most data types up to the vendor who wrote the COBOL compiler. The reason is that different computers -- CPUs -- use different binary representations internally, and function best with their own type of binary numbers. This approach results in better and faster compilers, but also causes confusion, because a "comp" data type on one machine is not necessarily the same as "comp" on another machine. The table below lists the common uses; not all compilers will follow these types. For more details on word order and signs, see COBOL Computational Fields.

Which data type a field uses for storage is determined by the "usage is" clause in the field definition, and in most cases the number of bytes of storage is determined by the number of digits in the PIC. Floating point numbers follow standard binary formats and as such their sizes are not determined by a PIC, and no PIC is used in the field definition.
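As a rough illustration of how the digit count in the PIC can determine storage size, the Python sketch below encodes one widely used convention, the one found on IBM mainframe compilers: binary fields of 1-4 digits take 2 bytes, 5-9 digits take 4 bytes, and 10-18 digits take 8 bytes, while a comp-3 field takes one nibble per digit plus one nibble for the sign. These rules are an assumption for the example, and the function names are our own; check your compiler's documentation before relying on them.

```python
def binary_storage_bytes(digits: int) -> int:
    """Bytes used by a binary/comp field, assuming the IBM mainframe convention."""
    if 1 <= digits <= 4:
        return 2      # halfword
    if 5 <= digits <= 9:
        return 4      # fullword
    if 10 <= digits <= 18:
        return 8      # doubleword
    raise ValueError("digit count must be between 1 and 18")

def comp3_storage_bytes(digits: int) -> int:
    """Bytes used by a comp-3 (packed) field: two digits per byte plus a sign nibble."""
    if digits < 1:
        raise ValueError("digit count must be at least 1")
    return digits // 2 + 1

# PIC S9(6)V99 has 8 digits in total (the V takes no storage).
print(binary_storage_bytes(8))   # size if declared COMP under the assumed convention
print(comp3_storage_bytes(8))    # size if declared COMP-3
```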

   Data Type        Description of how this data type is stored

   Binary           A pure binary number, usually in 2's-complement, and usually either 2, 4, or 8 bytes.
   Comp             The COBOL standard intends that the comp data type be implemented using the most efficient data type for a particular machine. The compiler vendor will choose the best type for the CPU, probably binary.
   Comp-1           Generally a single precision floating point data type.
   Comp-2           Generally a double precision floating point data type.
   Comp-3           Very common, and its format is nearly universal across platforms. It "packs" two digits into each byte. See COBOL Comp-3 Packed Fields for a full description of this data type.
   Packed Decimal   Usually implemented as comp-3. See comp-3.

When reading a binary or comp field specification, the size listed in the PIC is the number of decimal digits after the number is converted from binary to base ten. In the case of a packed field, it's the size after unpacking.
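The Python sketch below shows what "converting from binary" and "unpacking" can look like in practice. It rests on assumptions that are common but not universal: the binary field is big-endian 2's-complement (as on IBM mainframes), and the packed field keeps its sign in the last nibble, with C for positive, D for negative, and F for unsigned. Other sign nibbles and other byte orders exist and are not handled here. The function names are hypothetical.

```python
from decimal import Decimal

def decode_binary(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a binary/comp field, assuming big-endian 2's-complement storage."""
    if len(raw) not in (2, 4, 8):
        raise ValueError("expected a 2, 4, or 8 byte field")
    value = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(value).scaleb(-scale)

def decode_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Unpack a comp-3 field: two digits per byte, sign in the final nibble."""
    if not raw:
        raise ValueError("empty field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)      # high nibble
        nibbles.append(byte & 0x0F)    # low nibble
    sign = nibbles.pop()               # the last nibble is the sign, not a digit
    if sign not in (0xC, 0xD, 0xF):
        raise ValueError(f"unexpected sign nibble {sign:#x}")
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit nibble")
    value = int("".join(str(n) for n in nibbles))
    if sign == 0xD:
        value = -value
    return Decimal(value).scaleb(-scale)

# PIC S9(6)V99 COMP-3 holding 1.23: 9 digit nibbles plus the sign nibble, 5 bytes.
print(decode_comp3(bytes.fromhex("000000123C"), scale=2))
# The value -123 as a 4-byte binary field under the assumed byte order.
print(decode_binary(bytes.fromhex("FFFFFF85")))
```

Note that both functions must work on the original bytes. If a file containing these fields is run through an EBCDIC-to-ASCII character translation first, the binary and packed bytes are altered and can no longer be decoded this way.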

Real Decimal

Most PC programmers tend to think in terms of "real decimal" in numeric values. On a PC, if you have a dollars and cents field for, say, invoice total, in the amount of $123.45, the file will contain the six bytes "123.45" (and probably a sign). In other words, there is a real decimal point in the file. COBOL can do this, too, via the following:

   05  INVOICE-TOTAL           PIC 999.99.

OR:

   05  INVOICE-TOTAL           PIC 9(3).9(2).

The presence of the "." in the PIC causes a real decimal in the file. Implied decimal, however, is much more common in COBOL.

Implied Decimal

Implied decimal simply means there is a decimal point implied at a specified location in a field, but not actually stored in the file. The location of the implied decimal is indicated by a "V" in the PIC. Using implied decimal saves space in the file. Implied decimal can apply to any kind of numeric field, including a packed, or comp-3 field.

For example,

   05  ACCOUNT-BALANCE     PIC 9(6)V99.

is an implied decimal field. There are 6 digits, then an implied decimal point -- the V -- and 2 more digits, for a total of 8 digits. The field is 8 bytes in size; there is no "." in the file -- the location of the decimal point is implied to be between the 9(6) and the 99. If the field contains "00000123" then the account balance is $1.23, because there is a decimal point implied between the dollars and cents.
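Once the data has been converted to text, restoring an implied decimal point is simply a matter of shifting the digits by the number of 9s after the "V". A minimal Python sketch follows; the name apply_implied_decimal is our own, and it uses Decimal rather than floating point so that cents are preserved exactly.

```python
from decimal import Decimal

def apply_implied_decimal(digits: str, scale: int) -> Decimal:
    """Place the implied decimal point 'scale' digits from the right.

    Hypothetical helper: 'digits' is the unsigned field as ASCII text,
    'scale' is the number of digits after the V in the PIC.
    """
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"non-numeric field {digits!r}")
    return Decimal(int(digits)).scaleb(-scale)

# PIC 9(6)V99 containing "00000123" is an account balance of 1.23.
print(apply_implied_decimal("00000123", scale=2))
# A real-decimal field such as PIC 999.99 needs no shifting at all.
print(Decimal("123.45"))
```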

Synchronization and Alignment

This topic is a bit involved for this tutorial, but you should be aware of it. When using binary storage (binary and comp), some compilers on some machines may require that a numeric field start on a particular boundary. For example, on a 32-bit machine, the compiler may require that a comp field start on a 32-bit boundary.

If you specify a comp field in the middle of a record, and it doesn't happen to begin on a 32-bit (4-byte) boundary, the compiler will "align" it to a 32-bit boundary to "synchronize" it, inserting unused padding bytes in front of the field. As a result, what's actually stored in the file may not be the same as the PICs on the layout indicate. This is not a very common problem, partly because binary and comp fields are not very common in files, but you should be aware of it.
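To see how alignment can shift field positions, the Python sketch below computes record offsets when some fields must start on a boundary. The field names, sizes, and boundaries are invented for the example, and the rule used (round the offset up to the next multiple of the boundary) is a simplification; real compilers have their own rules for which fields are aligned and to what.

```python
def aligned_offset(offset: int, boundary: int) -> int:
    """Round 'offset' up to the next multiple of 'boundary' (1 means no alignment)."""
    return (offset + boundary - 1) // boundary * boundary

def layout_offsets(fields):
    """Return {name: starting offset} for a list of (name, size, boundary) tuples."""
    offsets = {}
    position = 0
    for name, size, boundary in fields:
        position = aligned_offset(position, boundary)  # skip any padding bytes
        offsets[name] = position
        position += size
    return offsets

# A hypothetical record: a 3-byte display field, a 4-byte comp field that
# must start on a 4-byte boundary, then a 10-byte display field.
record = [
    ("REGION-CODE", 3, 1),
    ("ITEM-COUNT", 4, 4),
    ("ITEM-NAME", 10, 1),
]
for name, start in layout_offsets(record).items():
    print(name, start)
```

In this example the comp field starts at offset 4 rather than 3, so there is one padding byte in the file that does not appear anywhere in the PICs.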

Next: Part 5 Tables and Occurs

Additional Information

For more articles on data conversion, see our TechTalk Index.

Our COBOL Conversion Services
Disc Interchange Service Company can convert most numeric data types, including all the IBM mainframe EBCDIC data types, and most ASCII data types from PC and UNIX systems. Our library of conversion routines permits us to handle those difficult jobs that standard COBOL compilers can't convert.

Mainframe & AS/400 Conversion to PC

With 32 years of experience, we are the experts at transferring mainframe data to PCs.

Disc Interchange Service Company, Inc.
Media Conversion Specialists
15 Stony Brook Road
Westford, MA 01886
