Chapter 2: Data Types

Return to all notes

In short, anything in scientific computing will ultimately be represented by a number and it is important to have a grasp of numbers and number types that are stored and processed by computers. At the beginning, floating-point numbers and integers are the two basic numbers types that are most important. The beginning of this chapter discusses the basic built-in types and then goes into further detail if needed.

Strings

Although strings are not that important in Scientific Computing, we will use them in general computing tasks. In Julia, a string is surrounded by “” (double quotes). For examples

str ="This is a string"

and if you enter typeof(str) then you should see String. The individual parts of the string are called characters, which have type Char and are by default Unicode Characters (which will we see are super helpful). A few other helpful things about strings aren

The length of a string is found using the length command.
To access the first element of the string, type first(str), the last is found by last(str) and the 3 character for example is str[3].
To turn other data types into string, use string. For example string(3.0) returns the string “3.0”.
Other important functions related to strings can be found at Julia’s documentation on strings

Numeric Types

One of the first things that is important is a knowledge of how computers store numbers and other mathematical objects. You should review details of integers and floating point numbers in your Introduction to Computer Science text or the following wikipedia pages for further information:

Julia’s standard numbers and mathematical objects

Since we are primarily using Julia in this class, there are a few extensions of the standard types of mathematical numbers. Specifically, rational numbers and complex numbers.
Julia has built-in number and other mathematical data types including those listed above in various sizes as well as rational numbers, matrices, and complex numbers. The following are links to the Julia documentation for

Size of numbers

Storing a single number on a computer takes certain amount of memory. There are a lot of specifics in the links above or in your Computer Science text, but in short there are different types of integer and floating point numbers. In Julia, there are 8, 16, 32, 64, 128 bit versions of integer (both signed and unsigned) and 16, 32 and 64 bit floating point numbers. For example, to set the number 1 as a 16-bit signed integer, type

Int16(1)

and to determine the type of this (or anything), type typeof(ans) and it returns Int16.

If you just enter x=5 (to assign the variable x the value of 5), and then typeof(x) you should get Int64 (or maybe Int32 if you’re using a 32-bit machine). The default integer type is generally 64 bit.

If you type typeof(3.0) you will probably get Float64, but possibly Float32, since also, this is the default floating point type.

Scientific Notation

Recall that any number written in decimal form with only a finite number of digits can be written in scientific notation that is in the form:
$$a \times 10^{b}$$

where $1<|a|<10$ and $b$ is an integer. For example $4003.23$ can be written as $4.00323 \times 10^{3}$, so $a=4.00323$ and $b=3$.

In this form the number $a$ is often called the significand or mantissa and the number $b$ is exponent. This example has the base 10, however other bases are common (generally base 2).

One major advantage to using numbers in this form is the simple multiplication and division. Consider multiplying $x=3.4 \times 10^{2}$ and $y=-4.7 \times 10^{-3}$. Using properties of exponentials we get
$$xy = (3.4)(-4.7) \times 10^{2-3} = -15.98 \times 10^{-1}$$
and typically we would like to put this back into scientific notation by shifting the exponent so $xy=-1.598 \times 10^{0}$.

Division can be done in a similar manner and perhaps surprisingly, addition and subtraction are more difficult due to the fact that the exponents of the two numbers need to be equal before adding and subtracting.

Binary Representation of integers

Since ultimately all numbers are stored as binary, it may be helpful to know how that number is stored. The bits function will do this. If x=Int8(3), and typing bits(x) returns "00000011", where 1) notice that there are in fact 8 bits used and 2) this is 3 in binary.

Floating Point Numbers of a given size

The reason for using floating point numbers in calculations is twofold. First, there is a finite size of storage for a number and secondly, routines for performing operations on floating-point numbers are fast and usually encoded on a computer chip.

Consider a floating point of a given size, say 64 bits generally called a double precision floating point number. The first bit is generally used for the sign, the next 11 are the exponent and the final 52 bits store the mantissa. A floating point number has two limitations and that is the precision (how many digits that can be stored) and the magnitude (the largest number). Double precision numbers are store in binary and converted to decimal with the form:
$$(-1)^{s} 2^{c-1023}(1+f)$$
where $s$ is the sign $c$ is the exponent and $f$ stores the mantissa. For example, consider the following number:
$$0\;10000000101\;0111011010000000000000000000000000000000000000000000$$
where spaces separate out $s$, $c$ and $f$. Converting $c$ to decimal:
$$c = 1 \cdot 2^{10} + 0 \cdot 2^{9} + \cdots + 1 \cdot 2^{2} + 0 \cdot 2^{1} + 1 \cdot 2^{0} = 1029$$
The mantissa is calculated in the following way

$$
\begin{array}{rcl}
f & = & 0 \cdot \biggl(\dfrac{1}{2}\biggr)^{1} + 1 \cdot \biggl(\dfrac{1}{2}\biggr)^{2} + 1 \cdot \biggl(\dfrac{1}{2}\biggr)^{3} + 1 \cdot \biggl(\dfrac{1}{2}\biggr)^{4} + 0 \cdot \biggl(\dfrac{1}{2}\biggr)^{5} + 1 \cdot \biggl(\dfrac{1}{2}\biggr)^{6} \newline
& & \qquad + 1 \cdot \biggl(\dfrac{1}{2}\biggr)^{7} + 0 \cdot \biggl(\dfrac{1}{2}\biggr)^{8} + 1 \cdot \biggl(\dfrac{1}{2}\biggr)^{9} = \frac{237}{512}
\end{array}
$$
and thus the floating point number is:
$$(-1)^{0} 2^{1029-1023} \left(1+\dfrac{237}{512}\right) = 93.625$$

The double precision number system falls into a class of number systems that we can commonly call floating-point number systems.

We can use the bitstring function in julia to find the binary representation. Notice that

bitstring(93.625)

returns a string that is the binary representation above.

Exercise

Play with other integers in 8-bit and see the representation.
How are negative numbers stored?
Examine other sizes of integers.
Examine the Unsigned versions of the integers.
Play with some floats of size 16, 32 or 64.

BigInts and BigFloats

As we will see in this book, sometimes you’ll run out of numbers (on the integer side) or precision when using a float. This is where BigInts and BigFloats come in handy. A BigInt will grow as needed. Here’s an example. The following will produce an array of powers of 10 up to $10^{20}$:

[10^i for i=10:20]

And notice that up to $10^{18}$, you get a reasonable number. Above that, what happens? From 18 to 19 there’s a overflow error, however Julia does not report it as such by default. The following will report the overflow error and nearly does the same thing:

let
  local i,x=10
  for i=1:20
    println(x)
    x=Base.checked_mul(x,10)
  end
end

Now if we do the following using BigInts:

[big(10)^i for i=1:20]

then it gives what we expect. And try:

typeof(big(10)^20)

and see what the resulting type is?

BigFloats

Sometimes you may want more precision than a Float64 gives. For example, what if you want to calculate $\pi$ to 100 digits.¹ Although I won’t go through that, but we can generate a big float with value 1.0 using x=big(1)/big(3) enter this in and you will get:

3.333333333333333333333333333333333333333333333333333333333333333333333333333348e-01

To determine the number of digits of accuracy, you can count (painfully) or try length(string(x)) which will return 84, but the last 4 are not digits, so it’s about 80 digits of accuracy, which is about 5 times the accuracy of Float64. Technically if you type precision(x) and this returns 256, which is the number of bits and it has 4 times the binary precision of Float64.

Note: the set of nested commands turns the number into a string and then gives the length of the string.

You can get more precision using the setprecision function. For example:

setprecision(2^10)

returns 1024 (or each number is stored in 128 bytes of memory). Then typing x2=big(1)/big(3) returns:

3.333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333343e-01

And again, to determine the number of digits,

length(string(big(1.0)))

which returns 315 so about 300 decimal digits.

Using BigFloats and BigInts

There’s a few things to notes about using BigInts and BigFloats.

You may need to make sure that all calculations are done using the number of bits desired. For example:
```
big(10^20)
```
returns 7766279631452241920 which makes absolutely no sense. However,
```
big(10)^20
```
returns what is expected. Why? Note: look at the difference between the two commands and think about order of operations.
It is generally slower to use BigInts and BigFloats than the 64-bit integers. So only use BigInts and BigFloats if you need the extra precision. We’ll discuss the speed of code and reasons for various speed throughout this course.

Rational numbers

A rational number is the ratio of integers and more commonly called a fraction. Julia (unlike many other general computing languages) has rational numbers built-in. To put in a ratio, enter a // to separate the numerator and denominator. For example:

2//3
4//7
178//11
-1//2

are examples of rational numbers. One advantage that they have is that the numerator and denominator are stored as integers (64-bit by default) and are not subject to round-off errors that floating points are. The standard operations $+,-,\cdot,\div$ between rationals results in a rational and as we will see in this course, there are advantages to using rationals instead of floating points.

Exercise

Perform the following operations involving rationals in Julia:

$\frac{1}{2} + \frac{2}{3}$
$\frac{1}{2} - \frac{2}{3}$
$\frac{2}{3} \cdot \frac{3}{5}$
$\frac{2}{3} \div \frac{3}{5}$

The Rational Type

If you enter type(1//2), note that julia returns Rational{Int64} and this is called a Parametric Composite Type, which will be talked about later. In this particular case, this is a rational type, but inside it (the numerator and denominator), they are type Int64. For example, to make a different type of rational you need to declare a different integer type inside.

Int16(1)//Int16(2)

and if you check the type of this, you will see it is Rational{Int16}.

Exercise

Build the rational $\frac{1}{2}$ with BigInts within it. Check that in fact it is stored as you expect.

Other Operations with Rationals

As long as you stay in the basic operations, $+,-,\cdot, \div$, the result will be a rational. However, may other operations are not. For example sin(1//2) will return a floating-point number. (2//3)^3 will return a rational but (2//3)^(1//2) will return a float (the square root of 2/3).

Complex Numbers

The other type of number listed above is a complex number, which julia has built-in. This is discussed in detail in Chapter 17

Abstract and Concrete Number Types

See a bare-bones description of all of Julia’s standard number types.

Abstract Number types

Number: a type that can be any number type
Integer: any type of integer
Signed: any type of signed integer
AbstractFloat: any type of floating point number

Concrete Number Types

The numbers shown above are concrete number types like:
* Float16, Float32, Float64, BigFloat which are all subtypes of AbstractFloat
* UInt8, UInt16,UInt32,UInt64,UInt128: which are all subtypes of Unsigned
* Int8, Int16, Int32, Int64, Int128, BigInt: which are all subtypes of Signed

To test if something is a subtype of another use the <: operation. For example

UInt8 <: Integer

returns true, but

Float16 <: Signed

returns false.

Converting numbers

float(x) converts any type of number to a floating point, like

float(1//3)

converts the Rational to Float64

parse(Type,str) parses str of type String into a number of type Type. There is also a base option.

parse(Int,"1234")

returns the integer 1234

and

parse(AbstractFloat,"1234")

returns the floating point 1234.0

Converting to Integers

If you have a floating point number or rational and you want to convert to an integer, typically use the ceil, floor, round functions. For example

ceil(Int,3.5)

returns the integer 4, since ceiling rounds up to the next integer.

And in fact, I calculated $\pi$ to about 100,000 digits for a talk in Spring 2016 using Julia↩