Lecture 28 - Pointers II
Goals
- Learn some more about pointers
- Learn how to manually allocate memory on the heap with malloc and calloc
Pointers and arrays
The ability to write functions like this is nice, but the real reason to use pointers is to be able to manipulate more complex data structures (like arrays).
As we saw earlier, when we pass an array variable to a function, we are actually passing the location of the array, rather than the entire thing. It actually goes deeper than that. In almost every expression, an array variable resolves to the address of its first element, so it behaves much like a pointer (the main exceptions are sizeof and the & operator, which still see the whole array).
int a[3];
int * p;
p = a;
p[0] = 1; // updates a[0] as well (they are the same array)
This also works:
*(p + 1) = 2;
We can do math with pointers, and the compiler figures out how big a step to make. For our integers, adding one to the pointer actually adds sizeof(int) to the address, which is four bytes on most current systems.
stupid pointer trick
int a[4];
a[0] = 4;
a[1] = 3;
a[2] = 2;
a[3] = 1;
printf("%d\n", a[2]); // prints 2
printf("%d\n", *(a+2)); // prints 2
printf("%d\n", 2[a]); // prints 2
Square bracket notation is just shorthand for pointer arithmetic: a[2] means *(a + 2).
Addition is commutative, so 2[a] is the same as a[2]. Don’t ever write this in your code, however.
arguments to main
Now that you know something about arrays and pointers, we can understand what is going on with the arguments to main.
int main(int argc, char * argv[])
argv is an array of strings: the arguments the program receives from the command line. What is argc then? That is the number of arguments (the Count). We need it because argv, like any array passed to a function, arrives as a pointer with no length attached.
Returning arrays
What about returning arrays?
#include <stdio.h>
int[] f()
{
int a[2];
a[0] = 0x42;
a[1] = 0x24;
for (int i = 0; i < 2; i++)
{
printf("%x\n", a[i]);
}
return a;
}
int main(int argc, char *argv[])
{
int a[];
a = f();
for (int i = 0; i < 2; i++)
{
printf("%x\n", a[i]);
}
}
This code doesn’t compile because of a collection of issues:
return_array.c:3:4: error: expected identifier or ‘(’ before ‘[’ token
3 | int[] f()
| ^
return_array.c: In function ‘main’:
return_array.c:19:8: error: array size missing in ‘a’
19 | int a[];
| ^
return_array.c:20:7: error: implicit declaration of function ‘f’ [-Wimplicit-function-declaration]
20 | a = f();
| ^
return_array.c:20:5: error: assignment to expression with array type
20 | a = f();
| ^
- we have the wrong return type
- we can’t reassign an array variable
Building on what we learned above, we can probably guess that we are returning the address of the array’s values in memory so we need to update the return type to be a pointer.
We specify we want a pointer by adding a * to the declaration
int * f(){
The function now returns a pointer (we don’t have to do anything to the return statement, since a already resolves to an address).
We do the same thing to the variable we are going to store the result in
int * a;
a = f();
Unfortunately, this gives us a warning
func6.c: In function ‘f’:
func6.c:14:10: warning: function returns address of local variable [-Wreturn-local-addr]
14 | return a;
| ^
Why is returning the address of a local variable a bad thing?
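The array is stored in a stack frame that we are about to dispose of. Once f returns, that memory can be reused by the next function call, so the pointer we hand back no longer refers to valid data.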
Creating arrays
Okay, let’s return to the problem we were trying to resolve – how can we return a new array from a function?
We need some memory that is not on the stack so that it is still available after the function call is done. So we need some new place in memory that will persist until we are done with it.
We are going to make a request to the operating system to ask it to find us a safe place for our data. We have two functions for doing this
- malloc – allocate a single block of memory
- calloc – allocate a sequence of contiguous memory blocks
These do nearly the same thing, and the main difference is in the arguments. With malloc, we ask for one chunk of memory of a given size in bytes, and it returns a pointer to it. The calloc function works very similarly; it takes an extra argument that lets us specify how many of these little blocks we want (which makes it ideal for allocating arrays). calloc also sets the memory it hands back to zero, while malloc leaves the contents uninitialized. They both return void * because they don’t know what we are hoping to store there. Specifying that is left to us.
The sizeof operator will play a vital role in this process, helping us to determine how much space we need.
The location for these extra pieces of memory is a different part of main memory called the heap. In the traditional memory layout, the heap lives low in memory and grows up towards the stack.
Allocating memory is extremely useful, but not without attached problems. When we ask for a chunk of memory, we are given a pointer to the allocated space. Keeping track of it is our problem.
This probably doesn’t sound that bad. However, it isn’t that difficult to forget the address by letting a variable fall out of scope. We now have an orphaned block of memory. Our code can’t use it because we have forgotten where it is. The system also doesn’t know where it is – it just knows that it was allocated. This little piece of memory will hang out, unusable, until the program exits. The longer we run the program, the more of these will clog up the works.
We call this process of gradually losing blocks of accessible memory to lost references a memory leak, and such leaks are surprisingly common.
Mechanical level
vocabulary
- memory leak
Skills