Chapter 3 covered the various number types in Julia but did not go into depth on the type system, which we cover in more detail in this section. Recall that there are various integer types in Julia: we discussed signed and unsigned integers and listed the types. We will see how these fit together, and throughout this chapter we will show how the type system can be leveraged to reduce code.
Most of the number types that we saw in Chapter 3 are types for which we can actually create values. Examples include Float64, UInt8 and Int128. These and the related ones are all examples of concrete data types, which are types that values can have. Although not specifically discussed as types in Chapter 3, Signed and Unsigned are integer types that are collections of other types. Such types are called abstract data types. Every number type we have seen is one of these two kinds, and the functions isconcretetype and isabstracttype can be used to determine which. For example, isabstracttype(Signed) returns true whereas isconcretetype(Signed) returns false, but isconcretetype(Int64) returns true.
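The two predicates can be tried on several types at once. The following is a minimal sketch; the helper name kind is hypothetical and only used for illustration.

```julia
# Hypothetical helper: label a type using the two built-in predicates.
kind(T::Type) = isabstracttype(T) ? "abstract" : isconcretetype(T) ? "concrete" : "neither"

# Print the label for a few of the number types discussed above.
for T in (Float64, UInt8, Int128, Signed, Unsigned, Integer)
    println(rpad(string(T), 10), kind(T))
end
```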
All concrete data types belong to some abstract data type. We can use the function supertype to find this type. For example, supertype(Int64) returns Signed. The related function subtypes lists all subtypes of a given abstract type. For example, subtypes(Signed) returns the 6-element Vector{Any} [BigInt, Int128, Int16, Int32, Int64, Int8], and we talked about all of these types in Chapter 3.
As another example, subtypes(Unsigned) returns the 5-element Vector{Any} [UInt128, UInt16, UInt32, UInt64, UInt8]. The type system can be thought of as a tree in the graph-theory sense of nodes and edges. The leaves of the tree are the concrete data types. Both supertype(Signed) and supertype(Unsigned) return Integer, indicating that this is their common supertype. And subtypes(Integer) returns the array [Bool, Signed, Unsigned], all of which are subtypes of Integer.
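The tree structure can be explored by calling subtypes recursively. The sketch below is one way to do this; the function names print_type_tree and concrete_leaves are hypothetical, and the exact list printed may include extra types if packages that define new integer types are loaded.

```julia
using InteractiveUtils  # standard library; makes subtypes available outside the REPL

# Print T and, indented beneath it, every type below it in the tree.
function print_type_tree(T::Type, depth::Integer = 0)
    println("    "^depth, T)
    for S in subtypes(T)
        print_type_tree(S, depth + 1)
    end
end

# Collect the leaves of the tree below T, that is, the concrete types.
function concrete_leaves(T::Type)
    isconcretetype(T) && return Type[T]
    leaves = Type[]
    for S in subtypes(T)
        append!(leaves, concrete_leaves(S))
    end
    return leaves
end

print_type_tree(Integer)
println(concrete_leaves(Unsigned))
```

Calling print_type_tree(Real) or print_type_tree(Number) in the same way shows where the floating-point, rational and complex types sit relative to the integers.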
An integer is stored in binary with a given number of bits. For example, a 32-bit integer, or Int32, occupies 4 bytes (32 bits) of memory.
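The storage size can be checked with the built-in functions sizeof, which reports bytes, and bitstring, which shows the binary representation. A short sketch:

```julia
# Report the storage size of each fixed-width signed integer type.
for T in (Int8, Int16, Int32, Int64, Int128)
    println(T, ": ", sizeof(T), " bytes, ", 8 * sizeof(T), " bits")
end

# The binary representation of the Int32 value 5 has one character per bit.
bits = bitstring(Int32(5))
println(bits)
println(length(bits))   # 32
```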
The numerical data types we have seen in this chapter are examples of concrete data types in that we can create data (usually numbers) with those types. These include the integer types Int8, Int16, Int32, Int64, Int128, BigInt and the floating-point types Float16, Float32, Float64, BigFloat. The rational and complex types are composite; however, their internal parts are of a concrete type.
Julia is a bit different from other languages in that there are also abstract data types, which 1) cannot be the type of any piece of data and 2) are collections of other types.
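Both properties can be seen with isa and with a function argument declared as an abstract type. This is a minimal sketch; the function double is a hypothetical example, not something defined elsewhere in the text.

```julia
x = 3
println(typeof(x))                   # the concrete type, Int64 on a 64-bit machine
println(isconcretetype(typeof(x)))   # true: the type of a value is always concrete
println(x isa Signed)                # true: the value also belongs to the abstract types above it
println(x isa Integer)               # true
println(x isa Real)                  # true

# One method written for the abstract type Real accepts every subtype of Real.
double(v::Real) = 2v

println(double(3))
println(double(2.5))
println(double(1//2))
```

Writing the method once for Real, rather than once per concrete type, is the kind of code reduction the type system makes possible.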
There are cases when it would be nice if a variable could take on more than one type of data. We have already seen above that if a variable can hold any number type, we can use a supertype for its values; for example, Real can handle any integer or floating-point number. But let’s say the types are more disparate than this. One way to handle this is to use the Any type, but as we have seen earlier in the text, this can result in both coding errors and slower code. The Union type takes care of this situation.
Perhaps we have loaded in some data that came in as either real numbers or strings. Julia has a Union type that takes a list of the valid types. So if we declare a variable x with the type Union{Real, String}, then the values that can be stored in x are either a Real or a String. For example, if we reassign with x = "4.5", then there are no errors. However, if we assign a value of another type to x, like a tuple, we will get an error.
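A minimal sketch of this behavior follows. The helper try_assign and the initial value 3 are assumptions made for illustration. The typed variable is declared inside a function so that the example also runs on Julia versions that do not allow type annotations on global variables.

```julia
# Hypothetical helper: declare x as Union{Real, String}, then try to store `value` in it.
# Returns the type of the stored value, or the type of the error if the assignment fails.
function try_assign(value)
    x::Union{Real,String} = 3
    try
        x = value
        return typeof(x)
    catch err
        return typeof(err)
    end
end

println(try_assign("4.5"))    # String
println(try_assign(2.5))      # Float64
println(try_assign((1, 2)))   # MethodError: a tuple cannot be converted to Union{Real, String}
```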
Another union type arises with missing data, which we will see in the section below and in more depth with data frames in Chapter 31. As an example, a vector that contains both integers and the value missing is of type Vector{Union{Missing, Int64}}, which again means that the elements of the vector can be either of type Missing or of type Int64.
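Such a vector can be built directly from a literal. The values below are hypothetical and chosen only for illustration.

```julia
# A vector literal mixing integers and missing gets a Union element type automatically.
v = [1, missing, 3]

println(typeof(v))        # Vector{Union{Missing, Int64}} on a 64-bit machine
println(eltype(v))
println(ismissing(v[2]))  # true
```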
Let’s finish off this section with some code evaluation using the BenchmarkTools package. You can imagine that using a Union type will have some effect on code speed, and we’ll see how much here. We’ll start by adding up a million random integers and timing the operation with the @benchmark macro, specifically @benchmark sum(rand(1:100,1_000_000)), which returns:
Alternatively, we create a variable arr that is an array of a union type with arr::Vector{Union{Real,String}} = collect(1:1_000_000). The first thing to note is that even though collect returns a Vector{Int64}, typeof(arr) returns Vector{Union{Real, String}}, indicating that this is a vector of union types. Now if we sum these with @benchmark sum(arr), then we get the result:
BenchmarkTools.Trial: 168 samples with 1 evaluation.
Range (min … max): 28.256 ms … 65.188 ms ┊ GC (min … max): 0.00% … 0.88%
Time (median): 29.271 ms ┊ GC (median): 1.90%
Time (mean ± σ): 29.788 ms ± 3.049 ms ┊ GC (mean ± σ): 2.45% ± 1.48%
▃█▃▃ ▅▂▁
▆▄▃▆████▇███▄▃▃▄▃▃▄▁▃▁▃▁▃▁▁▃▁▃▁▃▁▁▁▁▁▁▁▁▃▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▃ ▃
28.3 ms Histogram: frequency by time 36.1 ms <
Memory estimate: 15.26 MiB, allocs estimate: 999969.
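For a rough comparison that does not require BenchmarkTools, the built-in @elapsed macro can time the same sum over a plain Vector{Int64} and over a union-typed copy of it. This is only a sketch: a single @elapsed measurement is far less reliable than @benchmark, the variable names are hypothetical, and the timings will differ from machine to machine.

```julia
ints = collect(1:1_000_000)                         # Vector{Int64}
mixed = convert(Vector{Union{Real,String}}, ints)   # same values, union element type

# Run each sum once first so that compilation time is not included in the timings.
sum(ints)
sum(mixed)

t_ints = @elapsed sum(ints)
t_mixed = @elapsed sum(mixed)

println("Vector{Int64}:               ", t_ints, " seconds")
println("Vector{Union{Real, String}}: ", t_mixed, " seconds")
println("Same total: ", sum(ints) == sum(mixed))
```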
The one type that we will see later that has not been covered is the Missing type, which appears in Chapter 31 on data analysis. The Missing type has only a single value, missing, and is usually used for data in vectors and data frames.
The value missing has the property that nearly any operation with it, arithmetic in particular, results in missing. For example, missing+1 returns missing. Try other operations with missing.
You may ask yourself, "How helpful is a type like Missing?" and that’s a great question. As we will see in Chapter 31, datasets often have missing data that needs to be dealt with, and we will see many examples in that chapter.
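A few operations to try are sketched below, using only functions from Base. The vector data is a hypothetical example.

```julia
println(missing + 1)                 # missing
println(missing * 2.5)               # missing
println(missing == 1)                # missing: even a comparison propagates
println(ismissing(missing + 1))      # true: use ismissing to test for the value
println(isequal(missing, missing))   # true: isequal does not propagate

data = [1, missing, 3]
println(sum(data))                   # missing, since one element is missing
println(sum(skipmissing(data)))      # 4, the sum of the values that are present
```

The functions ismissing and skipmissing are the usual ways to detect and skip over missing values.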