ISBN : 1565922719
Sample Chapter From Exploring Java
Copyright © Patrick Niemeyer & Joshua Peck
4. The Java Language
In this chapter, we'll introduce the framework of the Java language and some of its fundamental tools. I'm not going to try to provide a full language reference here. Instead, I'll lay out the basic structures of Java with special attention to how it differs from other languages. For example, we'll take a close look at arrays in Java, because they are significantly different from those in some other languages. We won't, on the other hand, spend much time explaining basic language constructs like loops and control structures. We won't talk much about Java's object-oriented features here either, as those are covered in Chapter 5, Objects in Java.
As always, we'll try to provide meaningful examples to illustrate how to use Java in everyday programming tasks.
4.1 Text Encoding
Java is a language for the Internet. Since the people of the Net speak and write in many different human languages, Java must be able to handle a number of languages as well. One of the ways in which Java supports international access is through Unicode character encoding. Unicode uses a 16-bit character encoding; it's a worldwide standard that supports the scripts (character sets) of most languages.
Java source code can be written using the Unicode character encoding and stored either in its full form or with ASCII-encoded Unicode character values. This makes Java a friendly language for non-English-speaking programmers, as they can use their native alphabet for class, method, and variable names in Java code.
The Java char type and String objects also support Unicode. But if you're concerned about having to labor with two-byte characters, you can relax. The String API makes the character encoding transparent to you. Unicode is also ASCII-friendly; the first 256 characters are identical to the first 256 characters in the ISO 8859-1 (Latin-1) encoding, and if you stick with these values, there's really no distinction between the two.
Most platforms can't display all currently defined Unicode characters. As a result, Java programs can be written with special Unicode escape sequences. A Unicode character can be represented with the escape sequence:
\uxxxx
xxxx is a sequence of four hexadecimal digits. The escape sequence indicates an ASCII-encoded Unicode character. This is also the form Java uses to output a Unicode character in an environment that doesn't otherwise support it.
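As a minimal sketch of how such escapes behave, the following class compares an escaped character with its plain ASCII form and embeds an escape in a string literal. The class name UnicodeEscapeDemo and its method names are hypothetical, chosen only for illustration.

```java
public class UnicodeEscapeDemo {
    // The escape for code point U+0041 denotes the same character as the literal 'A'.
    static char letterA() {
        return '\u0041';
    }

    // Escapes may also appear inside string literals; U+00E9 is "e" with an acute accent.
    static String cafe() {
        return "caf\u00e9";
    }

    public static void main(String[] args) {
        System.out.println(letterA() == 'A');        // prints true
        System.out.println(cafe().length());         // prints 4
        System.out.println((int) cafe().charAt(3));  // prints 233
    }
}
```

The accented character counts as one char, so the String API reports a length of 4 regardless of how the source file spelled it.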
Java stores and manipulates characters and strings internally as Unicode values. Java also comes with classes to read and write Unicode-formatted character streams, as you'll see in Chapter 8, Input/Output Facilities.
Java supports both C-style block comments delimited by /* and */ and C++-style line comments indicated by //:
/* This is a block comment. */
// This is a line comment.
As in C, block comments can't be nested. Single-line comments are delimited by the end of a line; extra // indicators inside a single line have no effect. Line comments are useful for short comments within methods because you can still wrap block comments around large chunks of code during development.
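The point about wrapping block comments around code that already contains line comments can be sketched as follows. The class CommentDemo and its sum method are hypothetical names used only for this illustration.

```java
public class CommentDemo {
    static int sum(int a, int b) {
        int total = a + b; // a line comment runs to the end of the line
        /* Disabled during development. The line comment inside this
           block is simply part of the block comment.
        total = total * 2; // double the result
        */
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(2, 3)); // prints 5
    }
}
```

Had the disabled statement carried a block comment of its own, its closing delimiter would have ended the outer comment early, which is why line comments are the safer choice inside method bodies.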
By convention, a block comment beginning with /** indicates a special "doc comment." A doc comment is commentary that is extracted by automated documentation generators, such as Sun's javadoc program that comes with the Java Development Kit. A doc comment is terminated by the next */, just as with a regular block comment. Leading spacing up to a * on each line is ignored; lines beginning with @ are interpreted as special tags for the documentation generator:
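A minimal sketch of that layout might look like the following. The class Greeter, its greet method, and the author name are hypothetical placeholders rather than anything from the original text; the tags shown are standard javadoc tags.

```java
/**
 * A small class used only to illustrate doc comment layout.
 *
 * @author  Jane Doe
 * @version 1.0
 * @see     java.lang.Object
 */
public class Greeter {
    /**
     * Builds a greeting for the given name.
     *
     * @param name the name to greet
     * @return the greeting text
     * @see java.lang.String
     */
    public static String greet(String name) {
        return "Hello, " + name;
    }

    public static void main(String[] args) {
        System.out.println(greet("Java")); // prints Hello, Java
    }
}
```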
javadoc creates HTML class documentation by reading the source code and the embedded comments. The author and version information is presented in the output and the @see tags make hypertext links to the appropriate class documentation. The compiler also looks at the doc comments; in particular, it is interested in the @deprecated tag, which means that the method has been declared obsolete and should be avoided in new programs. The compiler generates a warning message whenever it sees you use a deprecated feature in your code.
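To illustrate the @deprecated tag, here is a hedged sketch with a hypothetical class LegacyMath. The @Deprecated annotation on the method is an assumption drawn from later Java releases, which pair it with the doc tag; the text above describes only the tag itself.

```java
public class LegacyMath {
    /**
     * Squares a value.
     *
     * @deprecated replaced by {@link #square(int)}; kept only for old callers.
     */
    @Deprecated // annotation form from later Java releases, not covered by the text
    public static int oldSquare(int x) {
        return square(x);
    }

    /** Squares a value. */
    public static int square(int x) {
        return x * x;
    }

    public static void main(String[] args) {
        System.out.println(square(7)); // prints 49
    }
}
```

Code in another class that calls oldSquare should draw a deprecation warning from the compiler, while calls to square compile silently.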
Doc comments can appear above class, method, and variable definitions, but some tags may not be applicable to all. For example, a variable declaration can contain only a @see tag. Table 4.1 summarizes the tags used in doc comments.
The type system of a programming language describes how its data elements (variables and constants) are associated with actual storage. In a statically typed language, such as C or C++, the type of a data element is a simple, unchanging attribute that often corresponds directly to some underlying hardware phenomenon, like a register value or a pointer indirection. In a more dynamic language like Smalltalk or Lisp, variables can be assigned arbitrary elements and can effectively change their type throughout their lifetime. A considerable amount of overhead goes into validating what happens in these languages at run-time. Scripting languages like Tcl and awk achieve ease of use by providing drastically simplified type systems in which only certain data elements can be stored in variables, and values are unified into a common representation such as strings.
As I described in Chapter 1, Yet Another Language?, Java combines the best features of both statically and dynamically typed languages. As in a statically typed language, every variable and programming element in Java has a type that is known at compile-time, so the interpreter doesn't normally have to check the type validity of assignments while the code is executing. Unlike C or C++ though, Java also maintains run-time information about objects and uses this to allow safe run-time polymorphism.
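The combination of compile-time types and run-time type information can be sketched like this. The class RuntimeTypeDemo and its method names are hypothetical; getClass and ClassCastException are standard parts of java.lang.

```java
public class RuntimeTypeDemo {
    // The declared type is Object, yet the object's actual class is known at run-time.
    static String runtimeClassName(Object value) {
        return value.getClass().getName();
    }

    // A cast is checked at run-time; an invalid one raises ClassCastException
    // instead of silently reinterpreting memory as it could in C.
    static boolean castToStringSucceeds(Object value) {
        try {
            String s = (String) value;
            return s != null;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Object text = "hello";
        Object number = Integer.valueOf(7);
        System.out.println(runtimeClassName(text));       // prints java.lang.String
        System.out.println(castToStringSucceeds(text));   // prints true
        System.out.println(castToStringSucceeds(number)); // prints false
    }
}
```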
Java data types fall into two categories. Primitive types represent simple values that have built-in functionality in the language; they are fixed elements like literal constants and numeric expressions. Reference types (or class types) include objects and arrays; they are called reference types because they are passed "by reference" as I'll explain shortly.
Numbers, characters, and boolean values are fundamental elements in Java. Unlike some other (perhaps more pure) object-oriented languages, they are not objects. For those situations where it's desirable to treat a primitive value as an object, Java provides "wrapper" classes (see Chapter 7, Basic Utility Classes). One major advantage of treating primitive values as such is that the Java compiler can more readily optimize their usage.
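As a brief sketch of the wrapper idea, the following wraps an int in an Integer so it can be handled as an Object, then extracts the primitive again. The class WrapperDemo and its helper methods are hypothetical names for illustration.

```java
public class WrapperDemo {
    // Wrap a primitive int so it can travel as an Object.
    static Object box(int value) {
        return Integer.valueOf(value);
    }

    // Recover the primitive value from the wrapper.
    static int unbox(Object boxed) {
        return ((Integer) boxed).intValue();
    }

    public static void main(String[] args) {
        Object boxed = box(42);
        System.out.println(boxed instanceof Integer); // prints true
        System.out.println(unbox(boxed) + 1);         // prints 43
    }
}
```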
Another advantage of working with the Java virtual-machine architecture is that primitive types are precisely defined. For example, you never have to worry about the size of an int on a particular platform; it's always a 32-bit, signed, two's complement number. Table 4.2 summarizes Java's primitive types.
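The fixed sizes can be checked from code. This sketch assumes the SIZE constants on the wrapper classes, which were added in releases later than the one this chapter describes; the class name PrimitiveSizes is hypothetical.

```java
public class PrimitiveSizes {
    // Two's complement arithmetic: adding 1 to the largest int wraps to the smallest.
    static int wrapAround() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(Integer.SIZE);   // prints 32
        System.out.println(Long.SIZE);      // prints 64
        System.out.println(Character.SIZE); // prints 16
        System.out.println(wrapAround() == Integer.MIN_VALUE); // prints true
    }
}
```

These results are the same on every platform, which is the guarantee the paragraph above describes.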
If you think the primitive types look like an idealization of C scalar types on a byte-oriented 32-bit machine, you're absolutely right. That's how they're supposed to look. The 16-bit characters were forced by Unicode, and generic pointers were deleted for other reasons we'll touch on later, but in general the syntax and semantics of Java primitive types are meant to fit a C programmer's mental habits. If you're like most of this book's readers, you'll probably find this saves you a lot of mental effort in learning the language.
Declaration and initialization
Variables are declared inside of methods or classes in C style. For example:
int foo;
Variables can optionally be initialized with an appropriate expression when they are declared:
int foo = 42;
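A short sketch combining both forms inside a method follows. The class DeclarationDemo and the variable names are hypothetical, chosen for illustration.

```java
public class DeclarationDemo {
    static int area() {
        int width;               // declared here, assigned later
        int height = 4;          // declared and initialized together
        double scale = 2 * 1.5;  // the initializer may be any suitable expression
        width = 3;
        return (int) (width * height * scale);
    }

    public static void main(String[] args) {
        System.out.println(area()); // prints 36
    }
}
```

Note that a local variable such as width must be assigned before it is read; the compiler rejects code that uses it uninitialized.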
Variables that are declared as instance variables in a class are set to default values if they are not initialized. In this case, they act much like static variables in C or C++. Numeric types default to the appropriate flavor of zero, characters are set to the null character "