Understand Common Sequence Data Types in Python – String, Tuple, and List

String, tuple, and list are three common built-in ordered collection data types in Python. These sequence data types share some common operations.

Common Sequence Operations in Python
reference: Operations on Any Sequence in Python (interactivepython.org), 5.6 Sequence Types
Name Operator Example
indexing [n]
data = [1,2,3,4,5]
data[3] # return 4
concatenation +
data = [1,2,3,4,5]
data + [9] 
# return [1,2,3,4,5,9]
repetition *
data = [1,2,3,4,5]
data * 2 
# return [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
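membership in
data = [9,2,4,4,6,2,8]
4 in data # return True
7 in data # return False
iteration for ... in
data = [9,2,4,4,6,2,8]
for val in data: print(val), # Python 2: the trailing comma keeps the output on one line
# print 9 2 4 4 6 2 8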
length len()
data = [1,2,3,4,5]
len(data) # return 5
slicing with step k [i:j:k]
data = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
data[0:10:2] # return [1, 3, 5, 7, 9]
slicing [i:j]
data = [1,2,3,4,5]
data[1:3] # return [2,3]
Minimum min(data)
data = [1,2,3,4,5]
min(data) # return 1
Maximum max(data)
data = [1,2,3,4,5]
max(data) # return 5
Index data.index(sub[, start[, end]])
data = [1,2,3,4,5]
data.index(3) # return 2
Count data.count(i)
data = [1,2,3,4,5,3]
data.count(3) # return 2

Of those three sequence data types, string and tuple are immutable, while list is mutable. The common operations above can be used on both mutable and immutable data types.

my_list = [1,2,3,4,5,6,7,8]
my_string = 'My name is Eva'
my_tuple = (1,2,3,4,'A','B','C')
my_list[0:2] # return [1, 2]

my_string[0:2] # return 'My'

my_tuple[0:2] # return (1, 2)
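To make the mutable/immutable distinction concrete, here is a minimal sketch. It reuses the three variables above; the helper name `try_assign` is hypothetical and exists only for this illustration. Item assignment succeeds on the list but raises `TypeError` on the string and the tuple.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8]
my_string = 'My name is Eva'
my_tuple = (1, 2, 3, 4, 'A', 'B', 'C')

def try_assign(seq):
    """Try to overwrite the first item; report whether the sequence allowed it."""
    try:
        seq[0] = 'X'
        return True
    except TypeError:
        # Immutable sequences (str, tuple) reject item assignment
        return False

results = [try_assign(my_list), try_assign(my_string), try_assign(my_tuple)]
# results is [True, False, False]; only my_list was changed in place
```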

Tuple

Before we talk about the special methods for each data type, I'd like to talk about tuple first.

In Python, both list and tuple can hold heterogeneous items (although a list is intended to be a homogeneous sequence); however, tuple has no methods beyond the common ones above, because it is immutable and therefore offers nothing that modifies it in place.

So, why use a tuple?

Greg Wilson suggested that tuples should be one of the things Python 3000 could leave out, but Phillip Eby pointed out that tuples are not just constant lists but heterogeneous data structures.

Tuples are not constant lists -- this is a common misconception. Lists are intended to be homogeneous sequences, while tuples are heterogeneous data structures.

-- from Python Tuples are Not Just Constant Lists (jtauber.com)

If you treat a tuple as a constant list, you will probably find it confusing; but if you think of a tuple as a data structure, a record with a fixed set of fields much like a JSON object, its purpose becomes much easier to understand.

data = ('Eva', 20, 'Front-End Software Engineer', 'F') # tuple packing

(name, age, job_title, gender) = data # tuple unpacking

name # return 'Eva'

age # return 20

job_title # return 'Front-End Software Engineer'

gender # return 'F'

A tuple is very useful when you want to store a fixed record of data. In CPython it also takes less memory than a list holding the same items, in part because an immutable tuple never needs spare room to grow.

Memory usage by data type, measured in bytes using Python 2.5 on 64-bit Ubuntu Linux
source: Python Memory Usage: What values are taking up so much memory?

data type | bytes
int | 24
float | 24
tuple | 63
list | 101
dict | 298
old-style class | 345
new-style class | 336
subclassed tuple | 79
Record | 79
Record with old class mixin | 79
Record with new class mixin | 79
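
These figures come from one specific setup, so they will differ on other Python versions and platforms. A rough way to compare on your own interpreter is `sys.getsizeof()`. The sketch below assumes CPython, and the variable names are only illustrative.

```python
import sys

items_as_tuple = (1, 2, 3, 4, 5)
items_as_list = [1, 2, 3, 4, 5]

# getsizeof reports the container's own size in bytes,
# not the size of the objects it refers to
tuple_size = sys.getsizeof(items_as_tuple)
list_size = sys.getsizeof(items_as_list)

print('tuple: %d bytes, list: %d bytes' % (tuple_size, list_size))
```

The exact byte counts are deliberately not shown here because they vary between builds; on CPython the tuple should come out smaller than the list.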

If you want to know more about tuples, here are some related articles.

Related Articles:

List

We already mentioned list a little in the tuple section. A list is an ordered collection that can hold heterogeneous items, and its indexes start at 0. Here are some common methods for list:

Methods for list
Name Operator Example
Delete item
del data[i:j]
data = [0,1,2,3,4,5]
del data[1:2]
data # return [0, 2, 3, 4, 5]
Delete item with k step
del data[i:j:k]
data = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
del data[0:10:2]
data # return [1, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15]
Append
data.append(item)
data = [0,1,2,3,4,5]
data.append('ABC')
data # return [0, 1, 2, 3, 4, 5, 'ABC']


data = [0,1,2,3,4,5]
data.append(['a', 'b', 'c'])
data # return [0, 1, 2, 3, 4, 5, ['a', 'b', 'c']]
Extend
data.extend(iterable)
data = [0,1,2,3,4,5]
data.extend('ABC')
data # return [0, 1, 2, 3, 4, 5, 'A', 'B', 'C']


data = [0,1,2,3,4,5]
data.extend(['a', 'b', 'c'])
data # return [0, 1, 2, 3, 4, 5, 'a', 'b', 'c']
Insert
data.insert(i,item)
data = [0,1,2,3,4,5]
data.insert(2, 'ABC')
data # return [0, 1, 'ABC', 2, 3, 4, 5]
Pop
data.pop([i])
data = [0,1,2,3,4,5]
data.pop() # return 5

data # return [0, 1, 2, 3, 4]
You can also specify the index of the item you want to pop
data = [0,1,2,3,4,5]
data.pop(2) # return 2

data # return [0, 1, 3, 4, 5]
Reverse
data.reverse()
data = [0,1,2,3,4,5]
data.reverse()
data # return [5, 4, 3, 2, 1, 0]
Remove Item
data.remove(item)
data = [0,1,2,3,4,5]
data.remove(3) # 3 is the item value, not the item position

data # return [0, 1, 2, 4, 5]
Sorting
data.sort([cmp, key, reverse])
data = [4,5,3,2,6,8,1,0]
data.sort()
data # return [0, 1, 2, 3, 4, 5, 6, 8]
For more examples, see the sort() method section below.

sort() method

In Python 2, both sort() and sorted() accept three optional arguments: cmp, key, and reverse. (Python 3 removed cmp; only key and reverse remain.) Using key and reverse is preferred because they are much faster than cmp: when Python sorts a list, cmp is called many times for each element, whereas the key function is called only once per element. (please refer to this document)

Instead of using data.sort(), you can also use sorted(data). The difference is that data.sort() modifies the original list in place, whereas sorted(data) leaves it untouched and returns a new sorted list.

data = [4,5,3,2,6,8,1,0]
data.sort()
data # return [0, 1, 2, 3, 4, 5, 6, 8]


data = [4,5,3,2,6,8,1,0]
sorted(data) # return [0, 1, 2, 3, 4, 5, 6, 8]

data # return [4, 5, 3, 2, 6, 8, 1, 0]
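
Two consequences of this difference are easy to trip over, so here is a short sketch (variable names are illustrative). First, `sort()` returns `None`, so assigning its result loses the list. Second, `sorted()` accepts any iterable, including the immutable string and tuple types, and always returns a list.

```python
data = [4, 5, 3, 2, 6, 8, 1, 0]

result = data.sort()  # sorts in place and returns None
# result is None
# data is now [0, 1, 2, 3, 4, 5, 6, 8]

letters = sorted('python')   # a string goes in, a list comes out
numbers = sorted((3, 1, 2))  # same for a tuple
# letters is ['h', 'n', 'o', 'p', 't', 'y']
# numbers is [1, 2, 3]
```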

cmp specifies a comparison function of two arguments. The function compares the first argument with the second and returns a negative number, zero, or a positive number depending on whether the first argument is smaller than, equal to, or larger than the second.

Here is the simplest example of using cmp, which shows how cmp sorts based on the returned result.

data = [12,3,5,16,9,7,2,11,14]
data.sort(cmp=lambda x, y: x - y)
data # return [2, 3, 5, 7, 9, 11, 12, 14, 16]


data = [12,3,5,16,9,7,2,11,14]
data.sort(cmp=lambda x, y: y - x)
data # return [16, 14, 12, 11, 9, 7, 5, 3, 2]
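
The `cmp` argument exists only in Python 2. If you need the same comparison-function style on Python 3 (or Python 2.7), one option is to wrap the function with `functools.cmp_to_key()` and pass it as `key`. A minimal sketch, where the function name `ascending` is just an illustrative choice:

```python
from functools import cmp_to_key

def ascending(x, y):
    """Old-style comparison: negative, zero, or positive."""
    return x - y

data = [12, 3, 5, 16, 9, 7, 2, 11, 14]
data.sort(key=cmp_to_key(ascending))
# data is [2, 3, 5, 7, 9, 11, 12, 14, 16]

# The same wrapper works with a lambda and with sorted()
descending_data = sorted(data, key=cmp_to_key(lambda x, y: y - x))
# descending_data is [16, 14, 12, 11, 9, 7, 5, 3, 2]
```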

key specifies a function of one argument that is called on each element to produce the value used for comparison. Its default value is None, which means the elements are compared directly.

data = ['bbbb', 'aa', 'ccc', 'eeeee', 'f']
data.sort()
data # return ['aa', 'bbbb', 'ccc', 'eeeee', 'f']


data = ['bbbb', 'aa', 'ccc', 'eeeee', 'f']
data.sort(key=len)
data # return ['f', 'aa', 'ccc', 'bbbb', 'eeeee']
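
The key function does not have to be a built-in like `len`. As a sketch with made-up sample data, a lambda can pick one field out of each tuple, and `str.lower` gives a case-insensitive sort:

```python
# Hypothetical records: (name, age)
people = [('Eva', 20), ('Sam', 31), ('Ann', 25)]

# Sort by the second field of each tuple
by_age = sorted(people, key=lambda person: person[1])
# by_age is [('Eva', 20), ('Ann', 25), ('Sam', 31)]

words = ['banana', 'apple', 'Cherry']

case_sensitive = sorted(words)
# case_sensitive is ['Cherry', 'apple', 'banana'] because uppercase letters sort first

case_insensitive = sorted(words, key=str.lower)
# case_insensitive is ['apple', 'banana', 'Cherry']
```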

reverse is a Boolean value that tells sort() whether to reverse the result. It can be combined with cmp or key.

data = [12,3,5,16,9,7,2,11,14]
data.sort(cmp=lambda x, y: x - y, reverse=True)
data # return [16, 14, 12, 11, 9, 7, 5, 3, 2]
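
Since `reverse` also combines with `key`, here is a small sketch that runs on both Python 2 and Python 3. It also shows one subtle point: `reverse=True` keeps elements with equal keys in their original order, which is not the same as sorting ascending and then flipping the list.

```python
data = ['bbbb', 'aa', 'ccc', 'dd', 'f']

longest_first = sorted(data, key=len, reverse=True)
# longest_first is ['bbbb', 'ccc', 'aa', 'dd', 'f']
# 'aa' still comes before 'dd', as in the original list

flipped = sorted(data, key=len)[::-1]
# flipped is ['bbbb', 'ccc', 'dd', 'aa', 'f']
# flipping the ascending result swaps the order of 'aa' and 'dd'
```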

list and reference

When you operate on lists, you need to be very careful about references.

data = [1,2,3]
A = data * 3
A # return [1, 2, 3, 1, 2, 3, 1, 2, 3]

data[0] = 'Castiel'
A # still returns [1, 2, 3, 1, 2, 3, 1, 2, 3]

But if you do something like this:

data = [1,2,3]
A = [data] * 3
A # return [[1, 2, 3], [1, 2, 3], [1, 2, 3]]

data[0] = 'Castiel'
A # will return [['Castiel', 2, 3], ['Castiel', 2, 3], ['Castiel', 2, 3]]

If you put your list into another list, Python stores a reference to it rather than a copy, so all three entries of A point to the same list as data. If you don't want A to share the list with data, you can use [:] to copy it.

data = [1,2,3]
A = [data[:]] * 3
A # return [[1, 2, 3], [1, 2, 3], [1, 2, 3]]

data[0] = 'Castiel'
A # still return [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
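
One caveat: `[data[:]] * 3` makes a single copy and then repeats the reference to that copy three times, so the three inner lists are still the same object as each other. If you want three independent inner lists, one option is a list comprehension. A minimal sketch with illustrative variable names:

```python
data = [1, 2, 3]

shared = [data[:]] * 3                     # one copy of data, referenced three times
independent = [data[:] for _ in range(3)]  # three separate copies

shared[0][0] = 'Castiel'
independent[0][0] = 'Castiel'

# shared      is [['Castiel', 2, 3], ['Castiel', 2, 3], ['Castiel', 2, 3]]
# independent is [['Castiel', 2, 3], [1, 2, 3], [1, 2, 3]]
# data        is still [1, 2, 3]

same_object = shared[0] is shared[1]                 # True
separate_objects = independent[0] is independent[1]  # False
```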

However, [:] makes only a shallow copy: it copies the first level of elements and does not recursively duplicate the elements nested inside them. Let's reuse the above example and see what happens with a nested list.

data = [['a','b'],2,3]
A = [data[:]] * 2
A # return [[['a', 'b'], 2, 3], [['a', 'b'], 2, 3]]

data[1] = 'Castiel'
A # still return [[['a', 'b'], 2, 3], [['a', 'b'], 2, 3]]

data[0][1] = 'Castiel'
A # return [[['a', 'Castiel'], 2, 3], [['a', 'Castiel'], 2, 3]]
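
The standard library's `copy.deepcopy()` avoids this by duplicating nested elements recursively. Here is a minimal sketch comparing it with a shallow copy; the names `shallow` and `deep` are illustrative.

```python
import copy

data = [['a', 'b'], 2, 3]

shallow = data[:]           # new outer list, same inner list
deep = copy.deepcopy(data)  # new outer list and new inner list

data[0][1] = 'Castiel'

# shallow is [['a', 'Castiel'], 2, 3]  (the inner list is shared with data)
# deep    is [['a', 'b'], 2, 3]        (the inner list was duplicated)
```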

To copy a nested list without sharing any references, you will need to use copy.deepcopy(). Here are some related articles about copying lists in Python:

Related Articles:

String

Like tuple, string is an immutable sequence data type. Beyond the common sequence operations we just introduced, string has many more methods that make text easy to work with, but I am not going to list all of them. If you want to know more about string methods, please check the official Python documentation.

Common String Methods in Python
source: 5.6.1. String Methods
Name Method Example
Capitalize
str.capitalize()
data = 'this is my string'
data.capitalize() 
# return 'This is my string'
Center
str.center(width[, fillchar])
data = 'In Center'
data.center(15) 
# return '   In Center   '

data.center(15,'*') 
# return '***In Center***'
Ends With
str.endswith(suffix[, start[, end]])
data = 'this is my string'
data.endswith('ing') # return True

data.endswith('ing', 0, 8) # return False
Expand Tabs to Spaces
str.expandtabs([tabsize])
data = '01\t02\t03'
data.expandtabs(4) # return '01  02  03'
Find
str.find(sub[, start[, end]])
str.rfind(sub[, start[, end]])
data = 'this is my string'
data.find('is') # return 2

data.find('is', 4) # return 5

data.find('is', 7, 10) # return -1

data.rfind('is') # return 5
Format
str.format(*args, **kwargs)
data = "Here we are! {0}, {1}, and {2}."
data.format('string', 'tuple', 'list')
# return 'Here we are! string, tuple, and list.'
Justified
str.ljust(width[, fillchar])
str.rjust(width[, fillchar])
data = 'my string'
data.ljust(15, '-') 
# return 'my string------'

data.rjust(15) 
# return '      my string'
Join
str.join(iterable)
'-'.join(['python', 'in', 'here']) 
# return 'python-in-here'
Partition
str.partition(sep)
data = 'Me: This is a long article'
data.partition(':') 
# return ('Me', ':', ' This is a long article')
Replace
str.replace(old, new[, count])
data='this is python'
data.replace(' ', '-') 
# return 'this-is-python'

data.replace(' ', ', ', 1) 
# return 'this, is python'
Strip
str.strip()
str.lstrip()
str.rstrip()
data = ' How are you~~ '
data.strip() 
# return 'How are you~~'

data.lstrip() 
# return 'How are you~~ '

data.rstrip() 
# return ' How are you~~'
Split
str.split([sep[, maxsplit]])
str.rsplit([sep[, maxsplit]])
str.splitlines([keepends])
data = 'my name is eva'
data.split(' ') 
# return ['my', 'name', 'is', 'eva']

data.split(' ', 2) 
# return ['my', 'name', 'is eva']

data.rsplit(' ', 2) 
# return ['my name', 'is', 'eva']


data = """Where
are
you"""
data.splitlines()
# return ['Where', 'are', 'you']
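
Because strings are immutable, every method above returns a new string and leaves the original untouched, which also means the calls can be chained. A short sketch using only methods from the table (variable names are illustrative):

```python
data = '  my name is eva  '

# Each call returns a new string, so the calls can be chained
cleaned = data.strip().replace(' ', '-')
# cleaned is 'my-name-is-eva'
# data is still '  my name is eva  '

# split() and join() are inverses of each other
words = cleaned.split('-')
rebuilt = ' '.join(word.capitalize() for word in words)
# words is ['my', 'name', 'is', 'eva']
# rebuilt is 'My Name Is Eva'
```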

Well...that's it! I think this article is long enough.