CS 202 - Notes 2016-03-02

Integer sizes

Values in the computer have to be stored in some fixed size location

As programmers, our control over this storage comes when we declare variable types

In Java, the various integer types are well defined

Name     # of bytes
byte     1 byte
short    2 bytes
int      4 bytes
long     8 bytes

This is because Java is designed to run on a virtual machine, so the developers of the language have control over the underlying storage.

In Python, we have dynamic typing and we don’t declare our variables, so everything is fairly hidden. The most common flavor of Python (CPython) is written in C and, in Python 2, uses C longs to store integers. However, it has some tricky data structures under the hood that allow integers to be essentially unbounded once they grow beyond the capacity of the long. (Python 3 uses that unbounded representation for every integer.)

In C, things are a little less straightforward

Name         # of bytes
char         1 byte
short        at least 2 bytes
int          at least as big as the short (so at least 2 bytes)
long         at least 4 bytes, and at least as big as the int
long long    at least 8 bytes

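On most current 64-bit Linux and macOS desktops, we will find that the int is 4 bytes and the long is 8 (making long long redundant). On 64-bit Windows, the long is still 4 bytes, so long long is the only 8-byte type there.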

Check out 07-sizes.c for the program to test your particular setup.

In C, the signed versions of all of those types are stored as two’s-complement numbers on essentially every current machine. We can add unsigned to the start of the type name to store unsigned values instead.

Note that char is an integer type. In C, we can freely switch between characters (a letter in single quotes) and the ASCII values that the system actually stores. We can also use char variables to hold small numbers that will never be interpreted as characters since we don’t have a byte type.

Endianness

Another consideration for data storage is endianness. This has to do with the order in which bytes are laid out in memory. This is mostly hidden from us until we have to work with data produced on a machine with a different endianness.

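Big Endian means that integers are laid out as we write them, with the most significant or “left” byte coming first. The 4-byte value 0x12345678 is stored as the bytes 12 34 56 78.

Little Endian means that integers are laid out with the least significant byte coming first. The same value is stored as the bytes 78 56 34 12.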

The example in 08-endian.c shows how to use a union to trick the computer into revealing the byte order in memory.

Boolean operators

Basic Boolean operations: AND, OR, NOT, (and XOR)

Logical operators

Your exposure to these operations so far has been through the logical operators we use to work with Boolean values. In C (as in Java), these operators are && (AND), || (OR), and ! (NOT).

We use these in conditional statements to combine conditional expressions. For example, (x>3) && (x<10) is true only when x is between 3 and 10.

Note that C has traditionally had no Boolean type and no True/False values. (C99 added bool, true, and false through <stdbool.h>, but they are just small integers underneath.) In C, 0 is treated as false and anything else is true. If you print out the value of an expression that evaluates to true or false, you will get either 1 or 0.

Bitwise operators

C also provides bitwise operators. These work at the bit level, allowing us to work with the underlying representation of values.

Our operators look much like the logical operators: & (bitwise AND), | (bitwise OR), ~ (bitwise NOT), and ^ (bitwise XOR).

Example: 2^4 = 6

To understand this, we need to think at the binary level: 010 ^ 100 = 110. We are taking each pair of corresponding bits and applying the operation to that pair individually. (Note that ^ is XOR in C, not exponentiation.)

A common use for this is masking, when we are interested in just part of a number.

Example: 0x1AB4 & 0xFF

In bits, this is 0001 1010 1011 0100 & 0000 0000 1111 1111

When we AND these together at the bit level, we are left with 0000 0000 1011 0100, or 0xB4. So, in essence, we have “extracted” the least significant byte of the number.