Understanding Little and Big Endian Architecture

To understand the terms little endian and big endian, it helps to know how memory is laid out. Computers do not access individual bits; they work with blocks of eight bits, called bytes. To a program, memory looks like a very large array of bytes, and every byte of memory is identified by a unique number known as its address. In addition to bytes, every computer has a word size, which indicates the nominal size of integer and pointer data. The most important system parameter determined by the word size is the maximum size of the virtual address space. That is, for a machine with an n-bit word size, virtual addresses can range from 0 to 2^n - 1, giving the program access to at most 2^n bytes. For example, a computer with a 32-bit word size can support a virtual address space of up to 4 GB (2^32 bytes).

Little and Big Endian

Regardless of endianness (whether it is big or little endian), two common rules apply to data objects that occupy two or more bytes in memory. First, the data object is allocated a contiguous sequence of bytes. Second, the address assigned to the data object is the smallest address among the bytes allocated.

For example, suppose a data object num of type int takes four bytes in memory at addresses 0x2100, 0x2101, 0x2102, and 0x2103. Then the address assigned to num is 0x2100, and the expression &num yields 0x2100.

Another important point is that endianness plays no role for single-byte data. Computers from different vendors agree on some basic standards, including the definition of a byte. A byte is a sequence of eight bits. One bit can store one of two possible values, either zero or one. Every bit in a byte is given a sequence number. The rightmost bit of a byte is given sequence number 0 (zero) and is called the least significant bit, while the leftmost bit is given sequence number 7 (seven) and is called the most significant bit. Every bit carries a weight, which is its contribution to the value of the byte. The weight of a bit is calculated as 2^(sequence number) * the value the bit stores.
For example, if the bit with sequence number 3 stores 1, its weight is 2^3 * 1 = 8. If it stores 0, its weight is 2^3 * 0 = 0.

So do not get confused when handling single-byte data. The problem starts when computers store multibyte data, and the question arises: which byte should be stored at the smallest address? Computers address this problem in one of two ways:

Big endian and little endian are two formats for storing multibyte data in a computer's memory. Big endian is also called network byte order, because Internet protocols transmit multibyte values in that order. Host byte order is whatever order the local machine uses: little endian on many common processors (x86, for example), big endian on others. In a multibyte data type such as int or long, the rightmost byte is called the least significant byte and the leftmost byte is called the most significant byte. In big endian format the most significant byte is stored first, so it is placed at the smallest address, while in little endian format the least significant byte is stored first.

As an example, if x, a four-byte integer, contains the hex value 0x76543210 ('0x' stands for hex), the least significant byte contains 0x10 and the most significant byte contains 0x76. Now if you take a pointer c of type char and assign x's address to it by casting &x to a char pointer, then printing *c gives 0x10 on a little endian architecture and 0x76 on a big endian architecture. This is how you can find out the endianness of your machine.

int x = 0x76543210;
char *c = (char*) &x;

Big endian format:
------------------
Byte address  | 0x01 | 0x02 | 0x03 | 0x04 | 
              +++++++++++++++++++++++++++++
Byte content  | 0x76 | 0x54 | 0x32 | 0x10 |
			 
Little endian format:
---------------------
Byte address  | 0x01 | 0x02 | 0x03 | 0x04 | 
              +++++++++++++++++++++++++++++
Byte content  | 0x10 | 0x32 | 0x54 | 0x76 |

In Big-Endian, the Most-Significant-Byte (MSB) is stored at the lowest address, while in Little-Endian the Least-Significant-Byte (LSB) is stored at the lowest address. (Here MSB and LSB refer to bytes, not bits.)

Intel gives the following definition of Endianness:

"Endianness is the format to how multi-byte data is stored in computer memory. It describes the location of the most significant byte (MSB) and least significant byte (LSB) of an address in memory. Endianness is dictated by the CPU architecture implementation of the system. The operating system does not dictate the endian model implemented, but rather the endian model of the CPU architecture dictates how the operating system is implemented.
Representing these two storage formats are two types of Endianness-architecture, Big-Endian and Little-Endian. There are benefits to both of these endian architectures. Big-Endian stores the MSB at the lowest memory address. Little-Endian stores the LSB at the lowest memory address. The lowest memory address of multi-byte data is considered the starting address of the data."

C program to detect little and big endian architecture

/*
   Write a C program to find out if the underlying
   architecture is little endian or big endian.
 */

#include <stdio.h>

int main (void)
{
  unsigned int x = 0x76543210;
  /* unsigned char avoids sign surprises when inspecting raw bytes */
  unsigned char *c = (unsigned char *) &x;

  /* *c is the byte stored at the lowest address of x */
  printf ("*c is: 0x%x\n", (unsigned int) *c);
  if (*c == 0x10)
  {
    printf ("Underlying architecture is little endian. \n");
  }
  else
  {
    printf ("Underlying architecture is big endian. \n");
  }

  return 0;
}

Advantages of Endianness

Both formats, big endian and little endian, have their own advantages and disadvantages.

In "Little-Endian" form, assembly language instructions for picking up a 1, 2, 4, or longer byte number proceed in exactly the same way for all formats: first pick up the lowest order byte at offset 0. Also, because of the 1:1 relationship between address offset and byte number (offset 0 is byte 0), multiple precision math routines are correspondingly easy to write.

In "Big-Endian" form, by having the high-order byte come first, you can always test whether the number is positive or negative by looking at the byte at offset zero. You don't have to know how long the number is, nor do you have to skip over any bytes to find the byte containing the sign information. The numbers are also stored in the order in which they are printed out, so binary to decimal routines are particularly efficient.

How Relevant Is Endian Order?

Endian order matters any time a computer accesses a stream (a network tap, a local file, or an audio, video or multimedia stream): the computer has to know the byte order in which the data was written. For example, if you write out a graphics file (such as a .BMP file, which is Little-Endian format) on a Big-Endian machine, you must first reverse the byte order of each integer you write, or another "standard" program will not be able to read the file.

Last Word

In this tutorial we discussed little and big endian architecture, network byte order and host byte order, the advantages of each endianness, and the relevance of little endian and big endian architecture. We hope you have enjoyed reading this tutorial. Please write to us if you have any suggestions or comments, or if you come across any error on this page. Thanks for reading!

References

  1. Where is Byte 0?
  2. On holy wars and a plea for peace
  3. Randal E. Bryant, David R. O'Hallaron, Computer Systems: A Programmer's Perspective


About the Author

is the founder and main contributor for cs-fundamentals.com. He is a software professional (post graduated from BITS-Pilani) and loves writing technical articles on programming and data structures.