Lecture 21 - Memory

Goals

  • Learn some more about arrays
  • Learn about strings
  • Learn about structs
  • Learn a little more about how memory is organized

Last time we looked at an array in assembly. Here is the program that generated it.

arrays

int main(int argc, char * argv[]){
	int a[5];

	for (int i = 0; i < 5; i++){
		a[i] = i;
	}
}

We don't really have an array variable represented in the assembly because of how it is used -- the compiler is just keeping track of it internally

We have also learned how arrays are represented in memory -- as contiguous blocks of data, just as we have been thinking about them

Important notes about arrays in C:

  • as with arrays you have seen in other languages, they use square bracket notation.
  • we don't have array literals (there is a special case for array initialization which can only be used to define an array)
  • C isn't object oriented, so arrays only contain data. They have no methods (though there are some functions written to manipulate them)
  • arrays also don't have any properties. Most importantly, arrays don't know how long they are (we can see this in the assembly -- there are no indicators of length or of where the bounds of the array are). In our particular case here, the compiler still knows, but as soon as we start passing arrays to functions, all bets are off. You just need to keep track of the length in a separate variable.

strings

C does have strings, but they aren't a special type. They are just char arrays

We do have string literals, which are indicated with double quotes

Unlike Python and Java, single and double quotes are not interchangeable

Single quotes are for character literals, which have the type int. Generally you will use them to talk about a single character. Multi-character literals can be written, but their value is implementation-defined

Also, since strings are just arrays, everything we said about arrays is true here as well

Strings are also (usually) mutable, which is something to look out for. This is very handy if you want to change letters in the middle of a string, but you can't change their length (or rather you can, but weird things will happen if you walk off the end of the space reserved for it)

struct

C has another way to group data together called a struct

struct point {
	int x,y;
};

This, in essence, creates a new type. We can declare variables with it

struct point p1;

and we can access its fields using dot notation

p1.x = 5;
p1.y = 7;

If we look at the assembly:

mov     r3, #5
str     r3, [fp, #-12]
mov     r3, #7
str     r3, [fp, #-8]

We see that these two are right next to each other in memory. Interestingly, there is no difference from two variables declared separately. Like the array, this is functionality that is supplied by the compiler

Structs are basically as close to objects as we get in C. A number of C libraries end up being virtually OO in that they define a new type and then provide a collection of functions that operate on it -- of course there is no data protection, inheritance or polymorphism... kind of. C does have one trick up its sleeve that you really won't see an analog of in other languages

union

Unions look like structs

union combo {
	int x;
	float f;
};

We can treat them the same

int main(int argc, char * argv[]){
	union combo u;
	u.x = 42;
	printf("%d\n", u.x); // prints 42
	u.f = 42.0;
	printf("%f\n", u.f); // prints 42.000000
}

here is where things get interesting, however

printf("%d\n", u.x); // prints 1109917696 ?

Wait, what?

The explanation lies here:

printf("%zu\n", sizeof(u)); // prints 4

That is only enough room for one of our values. A union reserves enough memory for its largest member, and whenever we read or write any of the members we are always accessing the same piece of memory -- just interpreting the bits there differently.

Exploring memory

I could demonstrate this by looking at the assembly code, but it is actually easier to just get the program to tell us where things are stored.

We need two new things:

  • & - the & operator can be placed in front of a variable, and it evaluates to the address of that variable in memory
  • %p - this is the format specifier we need to print out addresses (it expects a void *, hence the casts below)

printf("u.x is stored at %p\nu.f is stored at %p\n", (void *)&u.x, (void *)&u.f);

Running this:

u.x is stored at 0x7ffefde1034c
u.f is stored at 0x7ffefde1034c

Note that the number will change if you run this again

Let's try with a collection of variables:

#include <stdio.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
    int x;
    char c;
    uint16_t i;
    int32_t y;

    printf("x is stored at %p\n", (void *)&x);
    printf("c is stored at %p\n", (void *)&c);
    printf("i is stored at %p\n", (void *)&i);
    printf("y is stored at %p\n", (void *)&y);
}

$ gcc locations.c -o locations
$ ./locations
x is stored at 0x7fffa12f3c9c
c is stored at 0x7fffa12f3c9b
i is stored at 0x7fffa12f3c98
y is stored at 0x7fffa12f3c94

If we map that out, the memory layout looks like this:

address           +0     +1     +2     +3
0x7fffa12f3c94    y      y      y      y
0x7fffa12f3c98    i      i      (gap)  c
0x7fffa12f3c9c    x      x      x      x

There are two things to notice here

  • the ordering is going from bottom to top: x, declared first, has the highest address and y, declared last, has the lowest (we have seen this before and we will learn the reason shortly)
  • there is a gap in there

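Why is there a gap? We want to make sure that our values are aligned: a value's address should be a multiple of its alignment requirement, which for these types is the same as its size. The 2-byte i therefore has to start at an even address. The next free byte below c is 0x7fffa12f3c9a, but an i ending there would start at the odd address 0x7fffa12f3c99, so the compiler places i at 0x7fffa12f3c98 and leaves 0x7fffa12f3c9a unused.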

Mechanical level

vocabulary

Skills

