Section 22.1 Abstract Collection and Iteration
A collection is a set of items that have a few common properties including determining how many things are in the collection, whether or not the collection is empty and a way to iterate through all of the items. The concrete objects we have seen so far in this text include arrays, tuples and ranges. Any collection has the following functions applied to it
-
isempty
A boolean function that determines if a collection is empty.
isempty([1,2,3]) returns
false and
isempty(3:2) returns true because
3:2 is a range and since the second number is smaller than the first, there are no elements in it.
-
empty!
This function takes a collection and removes all elements of the collection. If
A=[1,2,3,4] then applying
empty!(A) results in
A being the empty array
[].
-
length
-
A function that returns the number of elements in the collection. As we’ve seen before
length([1,2,3]) returns 3, but note also
(1,2,3,4) returns
4.
length(1:2:11) returns 6.
Note that a 2D array (Matrix) is also a collection and if
C=[1,2; 3, 4; 5, 6] is the 3 by 2 matrix, then
length(C) returns
6, the total number of elements in the Matrix. This applies also to higher dimensional arrays.
-
in or ∈ and the negations
-
These function are boolean and determine if a single item is in or not in the collection. For example
1 ∈ [1,2,3,4] or
1 in [1,2,3,4] returns
true and
3 ∈ 1:3:11 returns false.
The functions
notin or the single character
∉ is the negation of the previous one. The expression
1 ∉ [1,2,3,4] returns
false.
One can access all of the elements of a collection by iterating over the elements. One simple way of doing this is with a for loop and we have seen this in
Chapter 5. For example
will print out (show) all of the elements in the range
1:4. Recall also, we can use a for loop over an array like
for i in [5, 9, 10, 2]
@show i
end
and finally, the other abstract collection we have see so far is a tuple (either named or unnamed). We can iterate over those like
for i in (x=3, y=4.5, z=7)
@show i
end
Although this seems quite simple, we will see that what is going on under the hood is that anything that is a collection has a method to go through the elements one at a time in order. This is called an
iterator. In fact the for loop with the range (first example above), can be translated to
range = 1:4
next = iterate(range)
while next !== nothing
(i, state) = next
@show i
next = iterate(range, state)
end
and as we disccused in
Chapter 5 a for loop can be written as a while loop. The important part of this is the
iterate method that is called before the loop and within the loop. The
iterate function takes an iterable collection (range, array, tuple) and a state variable that in this case is just the value in the range until the iteration is finished when it returns
nothing.
All collections need to define the function
iterate and in
Section 22.4 we show an example of a user-defined collection that defines
iterate.
Section 22.2 Sets
A mathematical
set is a collect of things without an order. For example, mathematically, a set is generally written as
\(\{1,2,3\}\) and there are important functions like union and intersection that is used. If
\(A=\{1,2,3\}\) and
\(B=\{3,4,5\}\text{,}\) then
\(A \cup B = \{1,2,3,4,5\}\) and
\(A \cap B = \{3\}\text{.}\) Also the sets
\(\{1,2,3\}\) and
\(\{3,2,1\}\) are the same in that order of the elements is not important.
Julia has a
Set is a collection that behaves the same way. If
A=Set([1,2,3])
B=Set([3,2,1])
then first notice that the order of the element may be switched around on output, but if we check for equality with
A==B, then the result is
true
Julia also has the functions
union and
intersection that work on
Sets, but also any iterators.
union(Set([1,2,3]),Set([3,4,5])) returns
Set([1,2,3,4,5]), but
union([3,4,8],3:7) returns
[3,4,8,5,6,7] and notice that in this case the order is kept.
Sets don’t need to be sets of integers. A set of a string produces the characters. For example,
Set("hello") returns
Set{Char} with 4 elements:
'h'
'l'
'e'
'o'
and notice that is it the set of characters in the work "hello" and "l" is not repeated.
Another function is the boolean version of
in or
ϵ that will return
true if a element is in the set,
false otherwise. For example if
A=Set([1,3,5,7,9]), then
7 ϵ A returns
true. There is also a not-in operator,
∉ which can be enters as
\notin TAB and if we do
4 ∉ A, the result is
true.
There are other functions that can be used with sets include set difference and testing for subsets. See
the Julia documentation on sets for more information.
Section 22.3 Dictionaries
In Julia, a
Dictionary is more general version of a named tuple and are sets of key/value pairs. The main difference is that a
Dict is mutable and there are many methods that update a
Dict. To create a
Dict, pass key/value pairs using "fat arrow" notation like:
d = Dict("a"=>1, "b"=>2, "c" => 3)
or alternatively, an array of tuples as key/value pairs:
s = Dict([("bart", 10), ("lisa", 8), ("maggie", 1)])
The getter/accessor is similar to that of an array.
d["a"] will return
1, the value of the key corresponding to
"a". And one can set a value in the same way.
d["a"] = 10 will update the value associated with the key "a" to be 10. Note that this is one way that a Dictionary differs from a named tuple. The Dictionary can be updated/mutated and a named tuple cannot.
push! and
pop! work with
Dicts in a similar manner to those as arrays. For example,
now returns the Dictionary with this entry added. Unlike an array,
pop! requires both the
Dict and a key which will be returns and removed.
returns
1 the value that corresponds to the
"maggie" entry and if
s is entered, then it no longer contains this entry.
The method
keys returns an array of the keys in the Dictionary. Since
"maggie" is no longer there
keys(s) returns
["lisa", "homer", "bart"]. The order of the keys may differ from what you see here. It returns an iterator, but probably better thought of as a set.
The method
values returns an array of the values in the Dictionary in the same order as that of the keys. When running
values(s), we get
[8,45,10].
As mentioned above, a dictionary is a more flexible version of a named tuple. Another example is that Dictionaries can be nested. Consider creating a dictionary which stores contact information for a person.
homer = Dict(
"first_name" => "Homer",
"last_name" => "Simpson",
"phones" => [
"home" => "987-555-1234",
"cell" => "987-555-1212"
],
"home_address" => Dict(
"street" => "742 Evergreen Terrace",
"city" => "Springfield"
),
"work_address" => Dict(
"street" => "10 Power Plant Lane",
"city" => "Springfield"
)
)
and as you can see, the keys are all strings, but the corresponding values differ. There are two strings, an array (of strings) and two
Dicts. Although this is more complex, we can still access things in a similar way. For example
homer["work_address"]["street"] will return
"10 Power Plant Lane".
There are other methods that we can use with Dictionaries and
the Julia documentation on dictionaries is a good source. Additionally in
Chapter 49, we will examine how to interact with a webserver that returns JSON strings which are parsed as dictionaries. This is a common format on the internet and we will see how to handle such things in Julia.
Section 22.4 User-created collections
Although we have seen many built-in collections and there are plenty of other examples in Julia packages, it should be no surprise to note that we can create a collection. Mainly, there are some key functions that need to be implemented on the collection. Let’s just show this with an example.
In
Chapter 7, we looked at Fibonacci numbers as examples of recursive functions and how to implement them efficiently. We will create a type
Fibonacci using a
struct like in
Chapter 12 in the following manner:
struct Fibonacci
n::Integer
function Fibonacci(n)
n >= 0 || throw(ArgumentError("This must be called with a nonnegative integer"))
new(n)
end
end
where a constructor has been used to check that only nonnegative integers are used. As is, this is just a struct with a single field called
n. To make this an iterator we need to implement some of the functions associated with an iterator. We will first implement the functions
isempty and
length and define them as
Base.isempty(f::Fibonacci) = f.n == 0
Base.length(f::Fibonacci) = f.n
and as can be seen that if we create a Fibonacci with 0 elements that it is empty and has length 0. It can be tested with
isempty(Fibonacci(0)) which returns
true and
length(Fibonacci(10)) returns 10.
So far this doesn’t see to do anything related to the fibonacci numbers, but by defining the
iterator function we can get the functionality that we desire. If
function Base.iterate(f::Fibonacci, state = 1)
if state > f.n
return nothing
end
local (x,y) = (1,1)
for i = 1:state-1
x,y = (y, x+y)
end
(x, state + 1)
end
where the arguments of the
iterate function is a object of the type in question and a variable
state which acts like an index. The function should return
nothing if the iteration is done and otherwise, return a tuple with the next value and increment the index/state. Note in our case above, we have used the faster (non recursive) version of fibonacci in
Subsection 8.8.1.
There are many ways that
iterate can be used. For example in a for loop like
for i in Fibonacci(10)
@show i
end
will produce the first 10 fibonacci numbers. Also,
collect(Fibonacci(10)) will create an array of length 10 with the Fibonacci numbers or
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55].