Sequence Types
Sequence types have the general concept of a first element, a second element, and so on: the items are ordered and indexed by the natural numbers. In Python (and many other languages) the starting index is 0, not 1. So the first item has index 0, the second item has index 1, and so on.
Python has built-in mutable and immutable sequence types.
Strings and tuples are immutable: we can access but not modify the content of the sequence:
In:
t = (1, 2, 3)
t[0]
Out:
1
In:
t[0] = 100
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-155b9e8fb284> in <module>()
----> 1 t[0] = 100
TypeError: 'tuple' object does not support item assignment
But of course, if the sequence contains mutable objects, then although we cannot modify the sequence of elements (cannot replace, delete or insert elements), we certainly can change the contents of the mutable objects:
In:
t = ([1, 2], 3, 4)
t is immutable, but its first element is a mutable object:
In:
t[0][0] = 100
t
Out:
([100, 2], 3, 4)
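One way to see why this is allowed: the tuple stores references, and mutating the list in place never changes which object the tuple refers to. A minimal sketch (the variable names here are only illustrative):

```python
# The tuple holds a reference to the list; mutating the list in place
# does not rebind any slot of the tuple.
t = ([1, 2], 3, 4)
id_before = id(t[0])

t[0][0] = 100      # change an item of the list
t[0].append(3)     # grow the list

id_after = id(t[0])
same_object = id_before == id_after

print(t)            # ([100, 2, 3], 3, 4)
print(same_object)  # True
```

The tuple still contains exactly the same three objects; only the state of the first one has changed.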
Iterables
An iterable is just something that can be iterated over, for example using a for loop:
In:
t = (10, 'a', 1+3j)
s = {10, 'a', 1+3j}
for c in t:
    print(c)
Out:
10
a
(1+3j)
In:
for c in s:
    print(c)
Out:
a
10
(1+3j)
Note how we could iterate over both the tuple and the set. Iterating over the tuple preserved the order of its elements, but iterating over the set did not. Sets do not have an ordering of elements: they are iterable, but they are not sequences.
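The distinction can also be checked programmatically. The sketch below uses the abstract base classes in collections.abc, which is one reasonable way to test this, not the only one:

```python
from collections.abc import Iterable, Sequence

t = (10, 'a', 1+3j)
s = {10, 'a', 1+3j}

# (is it iterable?, is it a sequence?)
tuple_checks = (isinstance(t, Iterable), isinstance(t, Sequence))
set_checks = (isinstance(s, Iterable), isinstance(s, Sequence))

print(tuple_checks)  # (True, True)
print(set_checks)    # (True, False)

# A set has no positions, so indexing it raises a TypeError.
try:
    s[0]
except TypeError as ex:
    print('TypeError:', ex)
```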
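Most sequence types support the in and not in operations. Ranges do too, and for integers they do so efficiently: since Python 3.2 the test is computed in constant time from the start, stop, and step values, whereas lists and tuples have to be scanned item by item.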
In:
'a' in ['a', 'b', 100]
Out:
True
In:
100 in range(200)
Out:
True
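For integer values, a range can answer a membership test from its start, stop, and step alone, without walking through the elements. Here is a sketch of that arithmetic, using a hypothetical helper named in_range (it is not part of the standard library and is only meant to illustrate the idea):

```python
def in_range(value, r):
    """Hypothetical helper: test whether an int is in range r using arithmetic only."""
    if r.step > 0:
        in_bounds = r.start <= value < r.stop
    else:
        in_bounds = r.stop < value <= r.start
    # The value must also land exactly on one of the steps.
    return in_bounds and (value - r.start) % r.step == 0


r = range(10, 200, 5)
results = (in_range(100, r), 100 in r, in_range(101, r), 101 in r)
print(results)  # (True, True, False, False)
```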
Min, Max, and Length
Sequences also generally support the len function to obtain the number of items in the collection. Some iterables may also support it.
In:
len('python'), len([1, 2, 3]), len({10, 20, 30}), len({'a': 1, 'b': 2})
Out:
(6, 3, 3, 2)
Sequences (and even some iterables) may support max and min as long as the data types in the collection can be ordered in some sense (< or >).
In:
a = [100, 300, 200]
min(a), max(a)
Out:
(100, 300)
In:
s = 'python'
min(s), max(s)
Out:
('h', 'y')
In:
s = {'p', 'y', 't', 'h', 'o', 'n'}
min(s), max(s)
Out:
('h', 'y')
But if the elements do not have an ordering defined:
In:
a = [1+1j, 2+2j, 3+3j]
min(a)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-b0cd79e53377> in <module>()
1 a = [1+1j, 2+2j, 3+3j]
----> 2 min(a)
TypeError: '<' not supported between instances of 'complex' and 'complex'
min and max will work for heterogeneous types as long as the elements are pairwise comparable (< or > is defined).
For example:
In:
from decimal import Decimal
t = 10, 20.5, Decimal('30.5')
min(t), max(t)
Out:
(10, Decimal('30.5'))
In:
t = ['a', 10, 1000]
min(t)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-21-983eac063887> in <module>()
1 t = ['a', 10, 1000]
----> 2 min(t)
TypeError: '<' not supported between instances of 'int' and 'str'
Even range objects support min and max:
In:
r = range(10, 200)
min(r), max(r)
Out:
(10, 199)
Concatenation
We can concatenate sequences using the +
operator:
In:
[1, 2, 3] + [4, 5, 6]
Out:
[1, 2, 3, 4, 5, 6]
In:
(1, 2, 3) + (4, 5, 6)
Out:
(1, 2, 3, 4, 5, 6)
Note that the type of the concatenated result is the same as the type of the sequences being concatenated, so concatenating sequences of varying types will not work:
In:
(1, 2, 3) + [4, 5, 6]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-25-67a9e2ed8695> in <module>()
----> 1 (1, 2, 3) + [4, 5, 6]
TypeError: can only concatenate tuple (not "list") to tuple
In:
'abc' + ['d', 'e', 'f']
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-26-8cbdd441adc1> in <module>()
----> 1 'abc' + ['d', 'e', 'f']
TypeError: must be str, not list
Note: if you really want to concatenate varying types you’ll have to transform them to a common type first:
In:
(1, 2, 3) + tuple([4, 5, 6])
Out:
(1, 2, 3, 4, 5, 6)
In:
tuple('abc') + ('d', 'e', 'f')
Out:
('a', 'b', 'c', 'd', 'e', 'f')
In:
''.join(tuple('abc') + ('d', 'e', 'f'))
Out:
'abcdef'
Repetition
Most sequence types also support repetition, which is essentially concatenating the same sequence an integer number of times:
In:
'abc' * 5
Out:
'abcabcabcabcabc'
In:
[1, 2, 3] * 5
Out:
[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
We’ll come back to some caveats of concatenation and repetition in a bit.
Finding things in Sequences
We can find the index of the first occurrence of an element in a sequence, optionally starting the search at a given index:
In:
s = "gnu's not unix"
s.index('n')
Out:
1
In:
s.index('n', 1), s.index('n', 2), s.index('n', 8)
Out:
(1, 6, 11)
An exception is raised if the element is not found, so you’ll want to catch it if you don’t want your app to crash:
In:
s.index('n', 13)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-35-d038ca109973> in <module>()
----> 1 s.index('n', 13)
ValueError: substring not found
In:
try:
    idx = s.index('n', 13)
except ValueError:
    print('not found')
Out:
not found
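Combining the start argument with the exception handling gives a simple way to collect every occurrence. The function name find_all below is hypothetical, chosen only for this sketch:

```python
def find_all(seq, item):
    """Hypothetical helper: return every index at which item occurs in seq."""
    indices = []
    start = 0
    while True:
        try:
            idx = seq.index(item, start)
        except ValueError:
            # No further occurrence: stop searching.
            break
        indices.append(idx)
        start = idx + 1  # resume just past the last match
    return indices


s = "gnu's not unix"
print(find_all(s, 'n'))             # [1, 6, 11]
print(find_all([1, 2, 1, 3], 1))    # [0, 2]
print(find_all((1, 2, 3), 'x'))     # []
```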
Note that these methods of finding objects in sequences do not assume that the objects in the sequence are ordered in any way. They are basically searches that iterate over the sequence until they find (or fail to find) the requested element.
If you have a sorted sequence, then other search techniques are available — such as binary searches. I’ll cover some of these topics in the extras section of this course.
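As a small preview, here is one possible binary search built on the standard library's bisect module. The helper name index_sorted is an assumption made for this sketch, and it only gives correct answers when the sequence is already sorted:

```python
from bisect import bisect_left


def index_sorted(sorted_seq, item):
    """Hypothetical helper: index of item in an already sorted sequence."""
    pos = bisect_left(sorted_seq, item)  # leftmost position where item could go
    if pos < len(sorted_seq) and sorted_seq[pos] == item:
        return pos
    raise ValueError(f'{item!r} is not in sequence')


data = [2, 3, 5, 7, 11, 13]
print(index_sorted(data, 7))  # 3

try:
    index_sorted(data, 4)
except ValueError as ex:
    print('not found:', ex)
```

Like the index method, this raises a ValueError when the item is missing, but it halves the search interval at each step instead of scanning every element.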
Slicing
We’ll come back to slicing in a later lecture, but sequence types generally support slicing, even ranges (as of Python 3.2). Just like concatenation, slices will return the same type as the sequence being sliced:
In:
s = 'python'
l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
s[0:3], s[4:6]
Out:
('pyt', 'on')
In:
l[0:3], l[4:6]
Out:
([1, 2, 3], [5, 6])
It’s ok for slice bounds to extend past the end of the sequence:
In:
s[4:1000]
Out:
'on'
If your first argument in the slice is 0, you can even omit it. Omitting the second argument means it will include all the remaining elements:
In:
s[0:3], s[:3]
Out:
('pyt', 'pyt')
In:
s[3:1000], s[3:], s[:]
Out:
('hon', 'hon', 'python')
We can even have extended slicing, which provides a start, stop, and a step:
In:
s, s[0:5], s[0:5:2]
Out:
('python', 'pytho', 'pto')
In:
s, s[::2]
Out:
('python', 'pto')
Technically we can also use negative values in slices, including extended slices (more on that later):
In:
s, s[-3:-1], s[::-1]
Out:
('python', 'ho', 'nohtyp')
In:
r = range(11)  # numbers from 0 to 10 (inclusive)
print(r)
print(list(r))
Out:
range(0, 11)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
In:
print(r[:5])
Out:
range(0, 5)
In:
print(list(r[:5]))
Out:
[0, 1, 2, 3, 4]
As you can see, slicing a range returns a range object as well, as expected.
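The type-preserving behavior of slices can be verified across several built-in sequence types at once. This is just a quick check, with sample values chosen for illustration:

```python
samples = ['python', [1, 2, 3, 4], (1, 2, 3, 4), range(10)]

# Each slice has the same type as the sequence it came from.
same_type = [type(seq[1:3]) is type(seq) for seq in samples]
print(same_type)  # [True, True, True, True]

# Slicing a range with a step yields a new range, not a list of values.
evens = range(0, 20, 2)   # 0, 2, 4, ..., 18
piece = evens[2:8:3]      # the items at indexes 2 and 5
print(type(piece).__name__, list(piece))  # range [4, 10]
```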
Hashing
Immutable sequences generally support the hash function, which we'll discuss in detail in the section on mapping types:
In:
l = (1, 2, 3)
hash(l)
Out:
2528502973977326415
In:
s = '123'
hash(s)
Out:
-1892188276802162953
In:
r = range(10)
hash(r)
Out:
-6299899980521991026
But mutable sequences (and mutable types in general) do not:
In:
l = [1, 2, 3]
hash(l)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-54-22dd9c98a095> in <module>()
----> 1 hash(l)
TypeError: unhashable type: 'list'
Note also that an otherwise hashable sequence is no longer hashable if one (or more) of its elements is not hashable:
In:
t = (1, 2, [10, 20])
hash(t)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-55-30cac1c4a226> in <module>()
1 t = (1, 2, [10, 20])
----> 2 hash(t)
TypeError: unhashable type: 'list'
But this would work:
In:
t = ('python', (1, 2, 3))
hash(t)
Out:
-8790163410081325536
In general, immutable types are likely hashable, while mutable types are not. So numbers, strings, tuples, etc. are hashable, but lists and sets are not:
In:
from decimal import Decimal
d = Decimal(10.5)
hash(d)
Out:
1152921504606846986
Sets are not hashable:
In:
s = {1, 2, 3}
hash(s)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-58-2216931a6bc4> in <module>()
1 s = {1, 2, 3}
----> 2 hash(s)
TypeError: unhashable type: 'set'
But frozen sets, an immutable variant of the set, are:
In:
s = frozenset({1, 2, 3})
hash(s)
Out:
-7699079583225461316
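The cases above can be summarized with a small test that simply tries to call hash and reports whether it worked. The helper is_hashable is a hypothetical name used only for this sketch:

```python
def is_hashable(obj):
    """Hypothetical helper: report whether hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


candidates = [
    (1, 2, 3), 'abc', range(10), frozenset({1, 2, 3}),   # immutable
    [1, 2, 3], {1, 2, 3},                                # mutable
    (1, 2, [10, 20]),                                    # tuple holding a list
]
results = [is_hashable(obj) for obj in candidates]
print(results)  # [True, True, True, True, False, False, False]
```

Actually calling hash is used here rather than an isinstance check against collections.abc.Hashable, because that check only looks at the type and would report a tuple as hashable even when it contains a list.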
Caveats with Concatenation and Repetition
Consider this:
In:
x = [2000]
id(x[0])
Out:
2177520743920
In:
l = x + x
l
Out:
[2000, 2000]
In:
id(l[0]), id(l[1])
Out:
(2177520743920, 2177520743920)
As expected, the objects in l[0] and l[1] are the same.
We could also check this with the is operator:
In:
l[0] is l[1]
Out:
True
This is not a big deal if the objects being concatenated are immutable. But if they are mutable:
In:
x = [[0, 0]]
l = x + x
l
Out:
[[0, 0], [0, 0]]
In:
l[0] is l[1]
Out:
True
And then we have the following:
In:
l[0][0] = 100
l[0]
Out:
[100, 0]
In:
l
Out:
[[100, 0], [100, 0]]
Notice how changing the 1st item of the 1st element also changed the 1st item of the second element.
While this seems fairly obvious when concatenating using the + operator as we have just done, the same actually happens with repetition and may not seem so obvious:
In:
x = [[0, 0]]
m = x * 3
m
Out:
[[0, 0], [0, 0], [0, 0]]
In:
m[0][0] = 100
m
Out:
[[100, 0], [100, 0], [100, 0]]
And in fact, even x changed:
In:
x
Out:
[[100, 0]]
If you really want these repeated objects to be different objects, you’ll have to copy them somehow. A simple list comprehension would work well here:
In:
x = [[0, 0]]
m = [e.copy() for e in x*3]
m
Out:
[[0, 0], [0, 0], [0, 0]]
In:
m[0][0] = 100
m
Out:
[[100, 0], [0, 0], [0, 0]]
In:
x
Out:
[[0, 0]]
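This caveat shows up most often when building a grid of nested lists. The comparison below is a minimal sketch; the names rows, cols, shared, and independent are only illustrative:

```python
rows, cols = 3, 2

# Repetition copies the reference: every row is the same list object.
shared = [[0] * cols] * rows

# The comprehension evaluates [0] * cols once per row, so each row is new.
independent = [[0] * cols for _ in range(rows)]

shared[0][0] = 100
independent[0][0] = 100

print(shared)       # [[100, 0], [100, 0], [100, 0]]
print(independent)  # [[100, 0], [0, 0], [0, 0]]
print(shared[0] is shared[1], independent[0] is independent[1])  # True False
```

The inner [0] * cols is safe because integers are immutable; it is only the repetition of the mutable inner list that causes the sharing.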