
Chapter 21 The Typing System in Julia

This chapter covers the types of variables in Julia in more detail.

Section 21.1 The Type System in Julia

Chapter 3 covered the various number types in Julia, but did not go into depth on the type system itself. We cover this in more detail in this section. Recall that there are various integer types in Julia; we discussed signed and unsigned integers and listed the types. We will see how these fit together, and throughout this chapter we will show how the type system can be leveraged to reduce code.
Most of the number types that we saw in Chapter 3 are types for which we can actually create values. Examples include Float64, UInt8 and Int128. These and the related ones are all examples of concrete data types, which are types that values can have. Although not specifically discussed as types in Chapter 3, Signed and Unsigned are integer types that are collections of other types. Such types are called abstract data types. Every number type in this section is one of these two kinds, and the methods isconcretetype and isabstracttype can be used to determine which. For example, isabstracttype(Signed) returns true whereas isconcretetype(Signed) returns false, but isconcretetype(Int64) returns true.
All concrete data types belong to some abstract data type. We can use the method supertype to find this type. For example, supertype(Int64) returns Signed. The related method subtypes lists all subtypes of a given abstract type. For example, subtypes(Signed) returns:
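The checks described above can be tried directly. The following is a minimal sketch using only isconcretetype, isabstracttype and typeof; the variable name x is just for illustration.

```julia
# Concrete types are the types that values actually have
println(isconcretetype(Int64))      # true
println(isabstracttype(Int64))      # false

# Abstract types only group other types together
println(isconcretetype(Signed))     # false
println(isabstracttype(Signed))     # true

# typeof applied to a value always reports a concrete type
x = UInt8(18)
println(typeof(x))                  # UInt8
println(isconcretetype(typeof(x)))  # true
```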
6-element Vector{Any}:
 BigInt
 Int128
 Int16
 Int32
 Int64
 Int8
and we talked about all of these types in Chapter 3. As another example, subtypes(Unsigned) returns the 5-element Vector{Any} containing UInt128, UInt16, UInt32, UInt64 and UInt8. The type system can be thought of as a tree in the graph-theory sense of nodes and edges. The leaves of the tree are the concrete data types. Both supertype(Signed) and supertype(Unsigned) return Integer, indicating that this is the common supertype of these types. And subtypes(Integer) returns the array [Bool, Signed, Unsigned], which are all subtypes of Integer.
We can draw the part of the tree consisting of all the Integer types as follows:
Figure 21.1. The type tree of integers in Julia
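The same tree can also be printed by walking subtypes recursively. This is only a sketch: print_tree is a hypothetical helper name, not a function provided by Julia, and the exact listing depends on which packages are loaded, since packages may add their own Integer subtypes.

```julia
using InteractiveUtils  # provides subtypes; loaded automatically in the REPL

# print_tree is a hypothetical helper: it prints T, then each of its
# subtypes indented one level deeper. Concrete types have no subtypes,
# so the recursion stops at the leaves of the tree.
function print_tree(T::Type, depth::Int = 0)
    println("  " ^ depth, T)
    for S in subtypes(T)
        print_tree(S, depth + 1)
    end
end

print_tree(Integer)
```

In a fresh session this should list Bool, Signed and Unsigned under Integer, with the concrete integer types indented below Signed and Unsigned.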

Section 21.2 Details of Integer Types

An integer is stored in binary with the given number of bits. For example, a 32-bit integer, or Int32, takes 4 bytes (32 bits) of memory.
In Julia, we can use the bitstring function to give the binary representation of integers and floating-point numbers. For example,
bitstring(UInt8(18))
returns "00010010". Notice that bitstring(UInt8(255)) returns "11111111", which is the largest value that a UInt8 can hold.
Similarly, the unsigned integers with more bits work the same way, but with a larger range of integers. For example, bitstring(UInt64(100000)) returns
"0000000000000000000000000000000000000000000000011000011010100000"
which is a string of length 64.
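As a quick check of the relationship between a type's size and its bit string, here is a short sketch. The variable name b8 is just for illustration; sizeof and parse with the base keyword are standard Julia functions.

```julia
# bitstring returns a String with one character per bit
b8 = bitstring(UInt8(18))
println(b8)                                  # 00010010
println(length(b8))                          # 8

# The length of the string matches the storage size of the type
println(length(bitstring(UInt64(100000))))   # 64
println(sizeof(Int32) * 8)                   # 32 (sizeof is in bytes)

# Parsing the string in base 2 recovers the original value
println(Int(parse(UInt8, b8; base = 2)))     # 18
```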

Section 21.3 Abstract and Concrete Number Types

The numerical data types we have seen in this chapter are examples of concrete data types in that we can create data (usually numbers) with those types. These include the integer types Int8, Int16, Int32, Int64, Int128, BigInt and the floating-point types Float16, Float32, Float64, BigFloat. The rational and complex types are composite; however, once the type of their internal parts is fixed (for example, Rational{Int64}), they are concrete as well.
Julia is a bit different from many other languages in that there are also abstract data types, which 1) cannot be the exact type of any value and 2) are collections of other types.

Subsection 21.3.1 Abstract Number types

For example, Integer is the abstract type (also called a supertype) of all integer types. The abstract number types are:
  • Signed: supertype of all signed integers like Int32, BigInt.
  • Unsigned: supertype of all unsigned integers like UInt32, UInt128.
  • Integer: supertype of all signed and unsigned integers, as well as Bool.
  • AbstractFloat: supertype of all floating-point numbers.
  • AbstractIrrational: supertype of irrational numbers.
  • Real: supertype of all floating-point, rational, irrational and integer numbers.
  • Number: supertype of all numbers.
See a bare-bones description of all of Julia’s standard number types.

Subsection 21.3.2 Concrete Number Types

The number types listed at the start of this section are concrete, and each is a subtype of one of the abstract types above:
  • Float16, Float32, Float64, BigFloat: all subtypes of AbstractFloat.
  • UInt8, UInt16, UInt32, UInt64, UInt128: all subtypes of Unsigned.
  • Int8, Int16, Int32, Int64, Int128, BigInt: all subtypes of Signed.
  • Rational types are subtypes of Real.
  • Complex types are subtypes of Number.
To test if one type is a subtype of another, use the <: operator. For example,
UInt8 <: Integer
returns true, but
Float16 <: Signed
returns false.
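A few more subtype checks, together with a small sketch that follows supertype up to the root of the tree. The helper supertype_chain is a hypothetical name introduced here for illustration; it is not part of Julia.

```julia
println(UInt8 <: Integer)             # true
println(Float16 <: Signed)            # false
println(Float16 <: AbstractFloat)     # true
println(Rational{Int64} <: Real)      # true
println(Complex{Float64} <: Number)   # true
println(Complex{Float64} <: Real)     # false

# supertype_chain is a hypothetical helper: starting from T, it applies
# supertype repeatedly until it reaches Any, the root of the type tree.
function supertype_chain(T::Type)
    chain = Type[T]
    while chain[end] !== Any
        push!(chain, supertype(chain[end]))
    end
    return chain
end

# For Int64 the chain is Int64, Signed, Integer, Real, Number, Any
println(supertype_chain(Int64))
```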

Section 21.4 Union Types

There are cases when it would be nice if a variable could take on more than one type of data. We have already seen above that if a variable can hold any number type, we can use a supertype for its values; that is, Real can handle any integer or floating-point number. But let's say the types are more disparate than this. One way to handle this is to use the Any type, but as we have seen earlier in the text, this results in both coding errors and slower code. The Union type takes care of this situation.
Perhaps we have loaded in some data that either came in as a real number or a string. Julia has a Union type that takes a list of the valid types. So if we declare
x::Union{Real,String} = 4
then this means that the values that can be stored in x are either a Real or a String. For example, if we reassign with x = "4.5", then there are no errors. However, if we assign a value of another type to x, like a tuple, we will get an error. For example,
x = (3,4)
results in
MethodError: Cannot `convert` an object of type 
  Tuple{Int64, Int64} to an object of type 
  Union{Real, String}
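The same behavior can be reproduced inside a function, where the type annotation applies to a local variable. This is a sketch only: assign_demo is a hypothetical function name used to wrap the assignment so that the error can be caught.

```julia
# assign_demo is a hypothetical helper: x is a local variable that may
# only hold a Real or a String, so any other value fails to convert.
function assign_demo(value)
    x::Union{Real,String} = 4
    x = value          # succeeds for Real or String, throws otherwise
    return x
end

println(assign_demo("4.5"))    # 4.5 (stored as a String)
println(assign_demo(2.5))      # 2.5

# isa checks whether a value belongs to the union
println(4 isa Union{Real,String})         # true
println((3, 4) isa Union{Real,String})    # false

# Assigning a tuple raises the MethodError from convert
try
    assign_demo((3, 4))
catch err
    println(typeof(err))       # MethodError
end
```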
Another union type arises with missing data, which we will see in the section below and again with data frames in Chapter 31. As an example, let's say that
A = [1, missing, 3, 4, 5]
which will return
5-element Vector{Union{Missing, Int64}}:
 1
  missing
 3
 4
 5
and note that this is a vector of type Union{Missing, Int64}, which again means that the types inside of the vector can either be of type Missing or Int64.
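The element type can be inspected directly with eltype. The following sketch assumes a 64-bit machine, where integer literals are Int64; the name B is introduced only for comparison.

```julia
A = [1, missing, 3, 4, 5]
println(eltype(A))              # Union{Missing, Int64}

# Without the missing entry, Julia picks the plain integer type
B = [1, 3, 4, 5]
println(eltype(B))              # Int64

# Each element is either a Missing or an Int64
println(count(ismissing, A))    # 1
println(A[1] isa Int64)         # true
println(A[2] isa Missing)       # true
```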

Subsection 21.4.1 Code Efficiency of Union Types

Let’s finish off this section with some code evaluation using the BenchmarkTools package. You can imagine that using a Union type will have some effect on code speed and we’ll see how much here. We’ll start by adding up a million random integers and timing this with the @benchmark macro, specifically @benchmark sum(rand(1:100,1_000_000)), which returns:
BenchmarkTools.Trial: 1187 samples with 1 evaluation.
Range (min … max):  3.413 ms …  26.194 ms  ┊ GC (min … max): 0.00% … 31.25%
Time  (median):     4.083 ms               ┊ GC (median):    0.00%
Time  (mean ± σ):   4.207 ms ± 821.981 μs  ┊ GC (mean ± σ):  8.52% ±  9.11%

        ▅█▅▁▃         ▁▂▁                                     
 ▂▁▁▁▁▁▅██████▅▄▄▂▃▂▅▆███▇▆▅▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▂▂▂▄▃▃▃▃▂▂▂▂▂▂ ▃
 3.41 ms         Histogram: frequency by time        5.67 ms <

Memory estimate: 7.63 MiB, allocs estimate: 3.
and although there is a lot of information here, the median time is 4.083 ms.
Alternatively, we will create a variable arr that is an array of a union type with arr::Vector{Union{Real,String}} = collect(1:1_000_000). The first thing to note is that even though collect(1:1_000_000) returns a Vector{Int64}, typeof(arr) returns Vector{Union{Real, String}}, indicating that this is a vector of union types. Now if we sum these with @benchmark sum(arr), then we get the result:
BenchmarkTools.Trial: 168 samples with 1 evaluation.
 Range (min … max):  28.256 ms … 65.188 ms  ┊ GC (min … max): 0.00% … 0.88%
 Time  (median):     29.271 ms              ┊ GC (median):    1.90%
 Time  (mean ± σ):   29.788 ms ±  3.049 ms  ┊ GC (mean ± σ):  2.45% ± 1.48%

      ▃█▃▃ ▅▂▁                                                 
  ▆▄▃▆████▇███▄▃▃▄▃▃▄▁▃▁▃▁▃▁▁▃▁▃▁▃▁▁▁▁▁▁▁▁▃▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▃ ▃
  28.3 ms         Histogram: frequency by time        36.1 ms <

 Memory estimate: 15.26 MiB, allocs estimate: 999969.
and again, note that the median for this is 29.271 ms, or about 7 times longer than the previous one, even though the first timing also included generating the random integers.
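For a comparison on identical data without installing a package, the following sketch uses the @elapsed macro from Base in place of @benchmark. The names n, plain and mixed are introduced here for illustration, and a single @elapsed measurement is much noisier than a full benchmark, so treat the numbers as a rough indication only.

```julia
n = 1_000_000
plain = collect(1:n)                        # Vector{Int64}
mixed = Vector{Union{Real,String}}(plain)   # same values, union element type

# Run each sum once first so that compilation time is not measured
sum(plain)
sum(mixed)

t_plain = @elapsed sum(plain)
t_mixed = @elapsed sum(mixed)

println(sum(plain) == sum(mixed))   # true: both hold the same values
println("Vector{Int64}:               ", t_plain, " s")
println("Vector{Union{Real,String}}:  ", t_mixed, " s")
```

Because both vectors hold the same numbers, any difference in time comes from the element type rather than from the data.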

Section 21.5 The Missing type

The one type that we will see later that has not been covered is the Missing type. This will be seen in Chapter 31 related to Data Analysis. The Missing type has only a single value, missing, and is usually used with data in vectors and data frames.
The value missing has the property that nearly any arithmetic operation with it results in missing. For example, missing+1 will return missing. Try other operations with missing.
You may ask yourself, "how helpful is a type like Missing?" and that’s a great question. As we will see in Chapter 31, datasets often have missing data that needs to be dealt with, and we will see many examples in that chapter.
Let’s say that we have an array with a missing value like A=[1,missing,3,4,5] and try to find the mean of it with mean(A) (you’ll need to do using Statistics for this). Hopefully you’re not surprised to see that the result is missing.
When entering the vector A above, you should notice that the type is Vector{Union{Missing, Int64}}.
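The propagation of missing can be checked with a short sketch. The last line uses skipmissing, a standard Julia function that is not covered in the text above; it is included only as one possible way to ignore the missing entries.

```julia
using Statistics   # provides mean

A = [1, missing, 3, 4, 5]

println(missing + 1)            # missing
println(ismissing(mean(A)))     # true: one missing entry makes the mean missing

# skipmissing (not discussed above) iterates over the non-missing values,
# here 1, 3, 4 and 5
println(mean(skipmissing(A)))   # 3.25
```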