
Simple Java Sentences

Previously we mentioned that there are similarities between natural languages such as English or French and computer languages such as Java. One similarity is that both have an alphabet, and combinations of elements from the alphabet give words. Other symbols, like the ';' symbol in Java, are similar to punctuation found in English. This particular symbol denotes the end of certain Java 'phrases' or 'sentences' the way a period might denote the end of an English sentence.

In both languages some combinations of symbols have meaning and some don't. 'jmup' has no meaning in English while 'jump' does, in the same way that 'itn' has no meaning in Java while 'int' does. Words in Java that have a special meaning are called keywords. In both languages, words are combined into longer structures; in English they're called sentences, while in Java they're called statements.

The purpose of writing a Java statement is to give instructions to the computer (processor); of course the compiler must convert the statement into some sort of machine language first. What we need to know now is: What types of statements can we make? What does the computer do when presented with a compiled statement? This of course depends on the compiler, so we'll talk a little bit about the compiler in the following section.

A basic ability of a computer is the ability to write to and read from its memory. In addition, if we want to be sure that something we have written to memory is still there later, we have to make sure that the memory to which we've written is reserved in some way. We certainly wouldn't want to write a valuable piece of information to a memory location and then find later that it had been overwritten (and therefore lost, since a memory location can hold only one value at a time). Many statements involve reserving, reading and writing memory.

Here is an example of a Java statement:

int age = 25;

What is this telling the computer to do? This statement is interpreted as 'reserve some memory and write the value 25 in it'. From this point on in the program, you'll be able to refer to the value at that memory location as age. This is because the compiler makes a logical connection for you between the name age and whatever the memory location is so that every time you refer to age the compiler knows where to look for the value of age. Furthermore, this location is reserved just for age: nothing else but age will be written there. This isn't to say that the value can never change, just that it can only change if we modify the value of age; modifying values of anything else will not change the value of age. Because the value of age can change when we modify it, age is called a variable.
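As a minimal sketch of this idea, the program below declares age alongside a second variable and shows that changing the second one leaves age alone. The class name AgeVariable, the method name and the variable height are our own inventions for illustration; they are not part of the Java language.

```java
class AgeVariable {
    // Returns the value of age after an unrelated variable has been changed.
    static int ageAfterChangingHeight() {
        int age = 25;      // reserve memory for age and write 25 into it
        int height = 170;  // a second (hypothetical) variable gets its own memory
        height = 180;      // overwriting height does not touch age's memory
        return age;        // read the value stored for age
    }

    public static void main(String[] args) {
        System.out.println(ageAfterChangingHeight()); // prints 25
    }
}
```

The only way to change what is stored for age is to write to age itself.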

Of course, we could have used any word we liked: myage, TheAgeIAmNow, or even banana. In fact, given that we can use any word, we should choose the word that makes the most sense to us in the context of the program. This is because we may write a program and not look at it for a while, for example. When we come to look at it again, we'll wonder what the variable aqqpdos means and what it does. Better to call it something that means something to us.

On the other hand the word int is part of the Java language; it is called a keyword, as we've mentioned before. When choosing words for variable names, we must avoid the keywords of the language, because the compiler wouldn't be able to tell whether we meant the keyword or the variable otherwise!

Also notice that TheAgeIAmNow is different from theAgeIAmNow; both are different from theageiamnow. Case (i.e. whether characters are capitalized) matters! A language in which case matters is called case-sensitive.
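Here is a small sketch showing that the three spellings really are three separate variables, each with its own memory. The class name CaseSensitivity, the method name and the values 25, 30 and 35 are made up for this example.

```java
class CaseSensitivity {
    // Adds up three variables whose names differ only in capitalization.
    static int sumOfThree() {
        int TheAgeIAmNow = 25;  // first variable
        int theAgeIAmNow = 30;  // a different variable: lowercase 't'
        int theageiamnow = 35;  // yet another variable: all lowercase
        // int int = 40;        // would NOT compile: int is a keyword
        return TheAgeIAmNow + theAgeIAmNow + theageiamnow;
    }

    public static void main(String[] args) {
        System.out.println(sumOfThree()); // prints 90
    }
}
```

In practice, names that differ only in case are easy to mix up, so it is usually wiser to avoid them.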

Notice the use of the '=' sign. This is a symbol which is used differently here than in mathematics. In mathematics '=' means that what is on the left side is the same as what is on the right side, such as in

5 + 1 = 4 + 2

Clearly we can see that what is on the left side is the same as what is on the right side. In Java however, the '=' sign means something different: it means 'take what is on the right side and store it in what is on the left side'. So '=' means 'write this to memory'. In the case of our Java statement above, we are saying 'write the value 25 to the memory location that is associated with age'. Here is an example that makes this a little more clear:

age = age + 1;

What does this mean? It's actually a little complicated. Of course, if we were to interpret it in the mathematical sense, it would be false because age = age + 1 implies that 0 = 1 (subtract age from both sides). However, the compiler's interpretation of '=' is not the mathematical one. When the compiler sees this statement, it issues the following instructions to the processor:

  1. read the value stored at the memory location associated with the word age
  2. add 1 to that value
  3. write the resulting value back to the memory location associated with the word age

Note that the word age means different things depending on which side of the '=' sign it is found! When it is on the right side, it means 'read the value from the memory location' and when it is on the left side of the '=' sign, it indicates the memory location where something should be written. This kind of construction can be confusing if you're new to programming, but I hope that it's clearer now.
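The three steps can be spelled out one at a time in Java itself. This is only a sketch of the idea: the class name IncrementAge and the helper variables current and result are hypothetical, and the real instructions the compiler generates are not written in Java.

```java
class IncrementAge {
    // Performs age = age + 1 as three separate steps.
    static int increment(int age) {
        int current = age;         // step 1: read the value stored for age
        int result = current + 1;  // step 2: add 1 to that value
        age = result;              // step 3: write the result back to age
        return age;
    }

    public static void main(String[] args) {
        int age = 25;
        age = age + 1;                      // the one-line form from the text
        System.out.println(age);            // prints 26
        System.out.println(increment(25));  // prints 26 as well
    }
}
```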

Now, we said that the int age=25; statement reserves memory, but how does the compiler know how much memory to reserve? Well, it turns out that the compiler has some built-in information that tells it to reserve 4 bytes. In addition, the compiler has a program which allows it to encode or convert '25' to some format that it and the hardware recognize. This format is expressed in binary, which is a number system using only the characters 0 and 1. (We normally use decimal, which uses the characters 0 through 9.) A character in binary (either a 0 or a 1) is called a bit (Binary Digit) while a group of 8 bits is called a byte. A byte allows us to represent up to 2^8 = 256 different items; these items could be interpreted as numbers, for example. Because the compiler 'knows' about the amount of memory needed to store an int and the way it is encoded, an int is called a built-in or primitive data type.
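We can ask Java itself about some of these numbers. The sketch below uses the standard constants Integer.BYTES and Integer.SIZE (available from Java 8 on) and the standard method Integer.toBinaryString; the class name IntEncoding and the helper binaryOf are our own.

```java
class IntEncoding {
    // Returns the binary digits of value, without leading zeros.
    static String binaryOf(int value) {
        return Integer.toBinaryString(value);
    }

    public static void main(String[] args) {
        System.out.println(Integer.BYTES); // prints 4: bytes in an int
        System.out.println(Integer.SIZE);  // prints 32: bits in an int (4 * 8)
        System.out.println(binaryOf(25));  // prints 11001: 16 + 8 + 0 + 0 + 1
        System.out.println(1 << 8);        // prints 256: items one byte can represent
    }
}
```

Note that toBinaryString drops the leading zeros; in memory the int still occupies all 32 bits.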

Primitive data types include the following (this is not an exhaustive list):

Name      Range of Values                                        Amount of Memory Required
int       -2 147 483 648 to 2 147 483 647                        4 bytes
double    4.94065645841246544e-324 to 1.79769313486231570e+308   8 bytes
char      0 to 65 535                                            2 bytes
boolean   true, false                                            1 bit of information (actual storage size depends on the JVM)

Of course doubles can take on positive or negative values; only the positive values are listed here. The char type is meant to represent characters such as letters.
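These limits do not have to be memorized, because the standard Java library provides them as named constants. The following sketch prints a few of them; the class name PrimitiveRanges and the helper intRangeSize are invented for this example.

```java
class PrimitiveRanges {
    // Counts how many different values an int can hold.
    // The arithmetic is done with long because the count does not fit in an int.
    static long intRangeSize() {
        return (long) Integer.MAX_VALUE - (long) Integer.MIN_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MIN_VALUE);         // prints -2147483648
        System.out.println(Integer.MAX_VALUE);         // prints 2147483647
        System.out.println(Double.MIN_VALUE);          // prints 4.9E-324 (smallest positive double)
        System.out.println(Double.MAX_VALUE);          // prints 1.7976931348623157E308
        System.out.println((int) Character.MAX_VALUE); // prints 65535
        System.out.println(intRangeSize());            // prints 4294967296, which is 2^32
    }
}
```

The last line ties the range back to the memory size: 4 bytes are 32 bits, and 32 bits can represent 2^32 different values.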

So the compiler 'knows' how much memory is required to store each of the primitive data types and 'knows' the encodings used (different for each type). In addition, the compiler/hardware 'know' how to perform certain operations on these types. For example there are sequences of instructions that increment an integer, decrement an integer, increment a double, decrement a double, add two integers, add two doubles, and so on. Notice that I've listed the instruction sequences for integers and doubles separately, even though it seems as though the same operation is being performed. That's because although the operation has the same effect, the instructions that carry it out must be different because both the size and the encoding of each type are different.
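A short sketch can make the difference between the types visible. The class name IntVersusDouble and the two method names are hypothetical. The division lines go a little beyond the operations listed above; they are included only because division is a case where the int and double versions of the 'same' operation give visibly different answers.

```java
class IntVersusDouble {
    // Adds two ints: the compiler uses integer addition.
    static int addInts(int a, int b) {
        return a + b;
    }

    // Adds two doubles: the compiler uses floating-point addition.
    static double addDoubles(double a, double b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(addInts(2, 3));         // prints 5
        System.out.println(addDoubles(2.5, 3.25)); // prints 5.75
        System.out.println(7 / 2);                 // prints 3: int division drops the fraction
        System.out.println(7.0 / 2.0);             // prints 3.5: double division keeps it
    }
}
```

The '+' symbol looks the same in both methods, but the type of the operands decides which instructions are actually carried out.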

So we say that each of these (int, double, char, boolean) is a type, where we define type as the structure of the memory storage (how much memory it takes and what the encoding is) and the associated operations. Since the compiler 'knows' all this information for these particular types, we call them built-in.

Going back to our original statement int age=25; we can identify the word int as the type (meaning that the compiler is to reserve 4 bytes for it and convert 25 to the proper encoding for an int), age as the variable name, and 25 as the value, that is, the interpretation of the bytes that are actually stored in memory.


Chris Trendall
Copyright ©Chris Trendall, 2001. All rights reserved.

2001-12-09