CSC270 October 28 Tutorial: C pointers


[ King 11, 12, 17 ]




C pointers
[ King 11.1, 11.2, 11.3 ]

Each byte of memory has a different address.

Each variable is stored in some number of contiguous bytes of memory.
The address of a variable is the address of the first byte in which
it's stored.

    0100
    0101 xxx      <-- An integer might take 4 bytes.  This integer's
    0102 xxx          address is 0101
    0103 xxx
    0104 xxx
    0105
    0106
    0107 ccc      <-- A character only takes one byte.  This character's
    0108              address is 0107
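
As a rough illustration of the picture above, the following sketch
prints the address and size of an integer and a character.  The
function name `show_addresses' is made up for this example.  The
addresses printed will differ from machine to machine, and the size of
an int is not fixed by the language (4 bytes is common but not
guaranteed).

```c
#include <stdio.h>

/* Hypothetical helper: print where two local variables live and how
   many bytes each one occupies. */
void show_addresses( void )
{
    int n = 5;
    char c = 'x';

    /* %p prints an address and expects a (void *) argument;
       %zu prints the result of sizeof. */
    printf( "n is at %p and takes %zu bytes\n", (void *) &n, sizeof n );
    printf( "c is at %p and takes %zu byte\n", (void *) &c, sizeof c );
}
```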

In C, a pointer is a variable that stores an address.  If a
pointer `p' contains the address of a variable `x', we say p points
to x.  This is typically drawn with an arrow from p to x:

     p           x
   +---+       +---+
   | O |   --> | 5 |
   +-+-+  /    +---+
     |    |      
     +----+

Each pointer can point to many different variables at different times
during the program execution, but each variable must be of the same
type.

Declarations

The following declares integers `a' and `b' and a "pointer to integer"
`p'.  The asterisk before `p' shows that it is a pointer.  There can
be a space between the asterisk and `p' if you wish.

    int a, b;
    int *p;

``Address of''

To get `p' pointing to `a', we assign `p' the address of `a':

    p = &a;

The ampersand, when appearing in front of a variable, gives that
variable's address.  The above statement is read `p is assigned the
address of a'.

Dereferencing

To assign a value to the variable pointed to by `p', we use the
asterisk.

    *p = 55;
    printf( "%d %d\n", a, *p );

    output ---> 55 55

The assignment above is read `the integer pointed to by p is assigned
55'.  This operation ``dereferences'' p.  In other words, p is a
``reference'' to something and *p is the actual something.

To get the value, we also use the asterisk:

    b = *p;
    printf( "%d\n", b );
 
    output ---> 55
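
Putting declaration, ``address of'' and dereferencing together, here
is a small sketch of one pointer being aimed at two different integers
in turn.  The function name `retarget_demo' is invented for this
example; it is not from King.

```c
/* Hypothetical example: one pointer, two targets, both of type int. */
int retarget_demo( void )
{
    int a = 1, b = 2;
    int *p;

    p = &a;        /* p points to a */
    *p = 10;       /* changes a to 10 */

    p = &b;        /* the same pointer now points to b */
    *p = 20;       /* changes b to 20; a is still 10 */

    return a + b;  /* 10 + 20 */
}
```

Note that `p = &b' changes where p points, while `*p = 20' changes the
integer that p currently points to.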

Layout in memory

Suppose things are laid out in memory as follows:
    
    0100 aaa      <-- integer a
    0101 aaa
    0102 aaa
    0103 aaa
    0104 bbb      <-- integer b
    0105 bbb
    0106 bbb
    0107 bbb
    0108
    0109
    010A
    010B
    010C ppp      <-- pointer p
    010D ppp
    010E ppp
    010F ppp

Then after the following statements, memory would look as shown below
(the hyphens are placeholders for the remainder of the variable).

    a = 55;
    p = &a;
    b = *p + 22;

    0100  55      <-- integer a
    0101  -
    0102  -
    0103  -
    0104  77      <-- integer b
    0105  -
    0106  -
    0107  -
    0108
    0109
    010A
    010B
    010C 0100     <-- pointer p
    010D  -
    010E  -
    010F  -
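
To see something like this on a real machine, the three statements can
be wrapped in a function that prints the addresses involved.  This is
only a sketch: `layout_demo' is a made-up name, and the actual
addresses (and the order of the variables in memory) are chosen by the
compiler, so they will not match the diagram.

```c
#include <stdio.h>

/* Hypothetical helper: run the three statements from the text and
   show that the value stored in p is the address of a.  Returns b. */
int layout_demo( void )
{
    int a, b;
    int *p;

    a = 55;
    p = &a;
    b = *p + 22;

    printf( "a = %d, stored at %p\n", a, (void *) &a );
    printf( "b = %d, stored at %p\n", b, (void *) &b );
    printf( "p holds %p, and p itself is stored at %p\n",
            (void *) p, (void *) &p );

    return b;
}
```

The address printed for `a' and the value printed for `p' should be
the same.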

Pointers as Arguments
[ King 11.4 ]

C functions use "call-by-value".  The arguments of the function call
are evaluated, their values are copied into the function's
parameters, and any modifications the function makes to its parameters
are not visible to the caller.

To modify a variable that is one of the function arguments, you must
pass a POINTER to that variable:

  #include <stdio.h>

  /* Increment the integer that p points to. */
  void f( int *p )

  {
    *p = *p + 1;
  }

  int main( void )

  {
    int i, j;

    i = 0;
    j = 1;

    f( &i );
    f( &j );

    printf( "i = %d, j = %d\n", i, j );      /* --> i = 1, j = 2 */
    return 0;
  }

Above, f takes a pointer to an integer.  It increments the integer
pointed to by `p'.  The call to `f' must pass in a pointer to an
integer: &i is the address of `i' (in other words, a pointer to `i').
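
Another common use of the same idea is a function that exchanges two
integers.  The sketch below is an illustration only (the name `swap'
is our choice).  A version taking plain `int' parameters would only
exchange its own copies and leave the caller's variables untouched.

```c
/* Exchange the two integers that x and y point to. */
void swap( int *x, int *y )
{
    int tmp = *x;   /* remember the first value */

    *x = *y;
    *y = tmp;
}
```

With `int i = 1, j = 2;', the call `swap( &i, &j );' leaves i equal to
2 and j equal to 1.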

Pointers and structs
[ King 17.3, 17.4, 17.5 ]

Pointers are often used with structures in C.  For example, a
linked-list node looks like

    struct ll_node {
      int data;
      struct ll_node *next;
    };

Such a node contains data and a pointer `next' to the next node on the
list.  Here, `struct ll_node' is the data type.  Let's define an
LL_NODE type:

    typedef struct ll_node {
      int data;
      struct ll_node *next;
    } LL_NODE;

Note that it would not work to use "LL_NODE *" inside the structure
definition, because the name LL_NODE is not declared until the typedef
is complete.  Inside the definition, the pointer must be written in
the "struct ll_node *" form.

A linked list has a pointer to its first node, which is initialized to
NULL to indicate an empty list.  `NULL' is defined in several standard
headers, including stddef.h, stdio.h and stdlib.h.

    LL_NODE *head;

    head = NULL;

A new node is created with malloc, which is declared in stdlib.h.

    LL_NODE *p;

    p = (LL_NODE *) malloc( sizeof( LL_NODE ) );

The fields of a structure variable `s' would normally be referenced
with the `dot' operator, as in s.data and s.next.  BUT ... since p is
a POINTER to the structure, we use another operator, the `arrow'.  The
following initializes the new structure and adds it to the head of the
(empty) list.

    p->data = 0;
    p->next = NULL;

    head = p;

The following creates a list of nodes storing 5,4,3,2,1,0 in that
order.  Nodes are successively added to the *head* of the list as the
index i increases.

    LL_NODE *head, *p;
    int i;

    head = NULL;

    for (i=0; i < 6; i++) {
      p = (LL_NODE *) malloc( sizeof( LL_NODE ) );
      p->data = i;
      p->next = head;
      head = p;
    }

Trace it.
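
One way to check your trace is to walk the list and print it.  The
sketch below wraps the loop above in a function and adds a traversal.
The names `build_list' and `print_list' are invented here, and the
NULL test after malloc is an addition that the loop in the text does
not have.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct ll_node {
    int data;
    struct ll_node *next;
} LL_NODE;

/* Build a list by adding nodes holding 0, 1, ..., n-1 at the head,
   so the finished list reads n-1, ..., 1, 0.  If malloc fails, the
   list built so far is returned. */
LL_NODE *build_list( int n )
{
    LL_NODE *head = NULL, *p;
    int i;

    for (i = 0; i < n; i++) {
        p = (LL_NODE *) malloc( sizeof( LL_NODE ) );
        if (p == NULL)
            break;              /* out of memory: stop early */
        p->data = i;
        p->next = head;         /* new node points at the old head */
        head = p;               /* new node becomes the head */
    }
    return head;
}

/* Follow the next pointers from head until NULL, printing each value.
   Returns the number of nodes visited. */
int print_list( const LL_NODE *head )
{
    const LL_NODE *p;
    int count = 0;

    for (p = head; p != NULL; p = p->next) {
        printf( "%d ", p->data );
        count++;
    }
    printf( "\n" );
    return count;
}
```

Calling `print_list( build_list( 6 ) )' should print the values in the
order 5 4 3 2 1 0, assuming every malloc succeeds.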

When you no longer need something that you allocated with `malloc',
BE SURE to release the memory so that it can be reused:

   free( p );

Above, p is a pointer that was returned from some call to malloc.
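
For a linked list, every node was allocated separately, so every node
must be freed separately.  The following is a minimal sketch (the name
`free_list' is ours).  The one subtle point is that the `next' pointer
must be saved before the node is freed, because a freed node must not
be read.

```c
#include <stdlib.h>

typedef struct ll_node {
    int data;
    struct ll_node *next;
} LL_NODE;

/* Free every node on the list.  Returns how many nodes were freed. */
int free_list( LL_NODE *head )
{
    LL_NODE *next;
    int count = 0;

    while (head != NULL) {
        next = head->next;   /* save the link before the node goes away */
        free( head );
        head = next;
        count++;
    }
    return count;
}
```

After the call, the caller's own head pointer still holds the old
address, so it is a good habit to set it to NULL.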

Pointer Notation Tricks

If `s' is a structure and `p' is a pointer to it, there are several
ways to reference the fields of the structure.  The following
references to `data' are all equivalent.

   LL_NODE s, *p = &s;      /* p points to s */

   s.data = 5;
   p->data = 5;
   (*p).data = 5;

The last is interesting.  (*p) is the thing pointed to by `p'.  Since
this thing IS a structure, we use the dot notation to reference a
field.

Pointers and Arrays
[ King 12.1, 12.2, 12.3 ]

In C, an array name used in an expression is almost always converted
to a pointer to the first element of the array.  (The main exceptions
are when it is the operand of sizeof or of &.)

This means that you can pass an array into a function without incurring
the cost of copying the whole array.  For example, suppose f() takes
an array of integers as an argument.  The following works because
the array argument is passed as a pointer to its first element.

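    #include <stdio.h>

    /* Print the `size' integers starting at address `a'. */
    void f( int *a, int size )

    {
      int i;

      for (i=0; i < size; i++)
        printf( "a[%d] = %d\n", i, a[i] );
    }

    int main( void )

    {
      int x[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

      f( x, 10 );
      return 0;
    }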

In passing `x' in to the function, we're really passing a pointer to the
first element of the array.  We could just as well have done

    f( &(x[0]), 10 );

since this also passes in the address of the first element.  Or, we could
have passed in only the middle four elements of the array:

    f( &(x[3]), 4 );

This passes in a pointer to element x[3] and tells f() that there are
four elements in the array that starts at that address:

               &(x[3])
                  |
                  v
    +---+---+---+---+---+---+---+---+---+---+
  x |   |   |   |   |   |   |   |   |   |   |
    +---+---+---+---+---+---+---+---+---+---+
      0   1   2   3   4   5   6   7   8   9

f() thinks it has the following array:

                +---+---+---+---+
             a  |   |   |   |   |
                +---+---+---+---+
                  0   1   2   3  
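
The same trick works for any function that takes a pointer and a
length.  As a small sketch (the function `sum' is made up for this
example), here is a function that adds up whatever run of integers it
is handed:

```c
/* Add up `size' integers starting at address a.  The function cannot
   tell whether a is the start of an array or somewhere in the middle. */
int sum( const int *a, int size )
{
    int i, total = 0;

    for (i = 0; i < size; i++)
        total += a[i];
    return total;
}
```

If an array x of ten integers holds the values 0 through 9, then
`sum( x, 10 )' gives 45 and `sum( &(x[3]), 4 )' gives 3+4+5+6 = 18.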

Dynamic Array Allocation
[ King 17.3 ]

Since array elements can be accessed through a pointer, we can
allocate arrays dynamically.  The following allocates an array of 100
integers and sets all entries to zero.

    int *a, i;

    a = (int *) malloc( 100 * sizeof( int ) );

    for (i=0; i < 100; i++)
      a[i] = 0;
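
In a real program malloc can fail, in which case it returns NULL.  A
slightly more careful version of the same idea might look like the
sketch below.  The function name `make_zeroed' and the choice to
return NULL for a non-positive size are assumptions of this example.

```c
#include <stdlib.h>

/* Allocate n ints and set them all to zero.
   Returns NULL if n <= 0 or if the allocation fails. */
int *make_zeroed( int n )
{
    int *a, i;

    if (n <= 0)
        return NULL;

    a = (int *) malloc( n * sizeof( int ) );
    if (a == NULL)
        return NULL;            /* caller must check for this */

    for (i = 0; i < n; i++)
        a[i] = 0;
    return a;
}
```

The caller owns the array and should pass it to `free' when done.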

We could just as well have used a pointer to move through the array:

    int *a, *p, i;

    a = (int *) malloc( 100 * sizeof( int ) );

    p = a;
    for (i=99; i>=0; i--) {
      *p = 0;
      p++;
    }

The statement `p++' means `increment p'.  When applied to pointers,
this means `increment the address in p by the size of the thing p
points to'.  In other words (in this case), `point p to the next
integer following it in memory'.
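
The following sketch measures that step directly and also shows that
indexing is just pointer arithmetic in disguise.  Both function names
are invented for the example.

```c
#include <stddef.h>

/* Returns how many bytes `p++' moves a pointer to int. */
size_t int_step( void )
{
    int a[2] = { 0, 0 };
    int *p = a;
    char *before = (char *) p;   /* char pointers count in bytes */

    p++;                         /* p now points to a[1] */
    return (size_t) ((char *) p - before);
}

/* Returns 1 if a[i] and *(a + i) name the same element. */
int same_element( const int *a, int i )
{
    return &a[i] == a + i;
}
```

`int_step()' returns sizeof( int ), whatever that is on your machine,
and `same_element' returns 1 for any valid index.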

Multidimensional Arrays

Pointers and multidimensional arrays are slightly more involved.  See
[ King 12.4 ], although that doesn't go into very much detail.
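
As a small taste only: a two-dimensional array is stored row by row,
and when it is passed to a function the function receives a pointer to
the first ROW.  The sketch below assumes a table with exactly 3
columns; the name `sum_table' and the fixed column count are choices
made for this example.

```c
/* Sum a table with 3 columns.  The parameter m is a pointer to rows
   of 3 ints, so m[i] is row i and m[i][j] is one element of it. */
int sum_table( int (*m)[3], int rows )
{
    int i, j, total = 0;

    for (i = 0; i < rows; i++)
        for (j = 0; j < 3; j++)
            total += m[i][j];
    return total;
}
```

Given `int t[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };', the call
`sum_table( t, 2 )' gives 21.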