Python - Data Structures

The most common Python data structures are:

  1. Lists
  2. Tuples
  3. Dictionaries and sets

Python has a built-in list type named list.

List literals are written within square brackets [ ]. Lists work much like strings: use the len() function to get the length and square brackets [ ] to access elements, with the first element at index 0.

>>> fruit = ['orange', 'blueberry', 'grapes', 'apple']
>>> print(fruit[0])
orange
>>> print(fruit[2])
grapes
>>> print(len(fruit))
4
>>> print(fruit)
['orange', 'blueberry', 'grapes', 'apple']

Assignment with an equals sign does not copy a list. Instead, assignment makes both variables point to the same list in memory.

>>> new_list = fruit   ## does not copy the list

Both fruit and new_list now refer to the same list, so a change made through one name is visible through the other.
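
To make the aliasing concrete, here is a minimal sketch. The names alias, copy_a and copy_b are only illustrative, and list(...) and the [:] slice are two common ways to get a shallow copy when one is actually wanted.

```python
fruit = ['orange', 'blueberry', 'grapes', 'apple']

# Plain assignment: both names refer to the same list object.
alias = fruit
alias.append('kiwi')
print(fruit)            # the change is visible through the original name
print(alias is fruit)   # True

# Two common ways to make a shallow copy instead.
copy_a = list(fruit)
copy_b = fruit[:]
copy_a.append('mango')
print(fruit)            # unchanged by the append on copy_a
print(copy_a is fruit)  # False
```

Note that both copies are shallow: the new list is a separate object, but it holds references to the same elements.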

The empty list is just an empty pair of brackets [ ].

The '+' operator concatenates two lists, so [1, 2] + [3, 4] yields [1, 2, 3, 4] (the same as with strings).
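
A short sketch of the empty list and the '+' operator together. The variable names are illustrative only.

```python
a = [1, 2]
b = [3, 4]

combined = a + b      # builds a new list; a and b are left unchanged
print(combined)       # [1, 2, 3, 4]
print(a)              # [1, 2]

empty = []
print(len(empty))     # 0
print(empty + a)      # [1, 2]
```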


The range(n) function yields the numbers 0, 1, ... n-1, and range(a, b) yields a, a+1, ... b-1; that is, up to but not including the last number. The combination of the for loop and the range() function allows you to build a traditional numeric for loop:

## print the numbers from 0 through 9
>>> for i in range(10):
...     print(i)

In Python 2, the variant xrange() avoids the cost of building the whole list in performance-sensitive cases. In Python 3, range() is itself lazy and xrange() no longer exists.
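
The two-argument form and a typical index loop can be sketched as follows. This assumes Python 3 print() behavior; first_five and the fruit list are just example names.

```python
# range(a, b) stops just before b.
for i in range(3, 6):
    print(i)                    # prints 3, 4, 5 on separate lines

# Wrapping range() in list() gives a concrete list.
first_five = list(range(5))
print(first_five)               # [0, 1, 2, 3, 4]

# A common use: loop over the indexes of a list.
fruit = ['orange', 'blueberry', 'grapes', 'apple']
for i in range(len(fruit)):
    print(i, fruit[i])
```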

List Methods

Here are some common list methods.

list.append(elem)
adds a single element to the end of the list. Common error: does not return the new list, just modifies the original.

list.insert(index, elem)
inserts the element at the given index, shifting elements to the right.

list.extend(list2)
adds the elements in list2 to the end of the list. Using + or += on a list is similar to using extend().

list.index(elem)
searches for the given element from the start of the list and returns its index. Throws a ValueError if the element does not appear (use "in" to check without a ValueError).

list.remove(elem)
searches for the first instance of the given element and removes it (throws ValueError if not present).

list.sort()
sorts the list in place (does not return it). (The built-in sorted() function, which returns a new sorted list, is usually preferred.)

list.reverse()
reverses the list in place (does not return it).

list.pop(index)
removes and returns the element at the given index. Returns the rightmost element if index is omitted (roughly the opposite of append()).

Notice that these are methods on a list object:

>>> contact_list = ['Albert', 'Connie', 'Mona', 'Bertie', 'Dylan']
>>> contact_list.append('Marko')              ## append element at end
>>> contact_list.insert(0, 'Carrie')          ## insert element at index 0
>>> contact_list.extend(['Susan', 'Brenda'])  ## add list of elements at end
>>> print(contact_list)
['Carrie', 'Albert', 'Connie', 'Mona', 'Bertie', 'Dylan', 'Marko', 'Susan', 'Brenda']
>>> print(contact_list.index('Connie'))
2

>>> contact_list.remove('Bertie')   ## search list and remove that element
>>> contact_list.pop(1)             ## removes and returns 'Albert'
'Albert'
>>> print(contact_list)
['Carrie', 'Connie', 'Mona', 'Dylan', 'Marko', 'Susan', 'Brenda']
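
The sorting, reversing and membership operations described earlier can be sketched the same way. The names list here is a made-up example, and position is only an illustrative variable.

```python
names = ['Mona', 'Albert', 'Dylan', 'Connie']

# sorted() returns a new list and leaves the original untouched.
in_order = sorted(names)
print(in_order)      # ['Albert', 'Connie', 'Dylan', 'Mona']
print(names)         # ['Mona', 'Albert', 'Dylan', 'Connie']

# sort() and reverse() modify the list in place and return None.
names.sort()
names.reverse()
print(names)         # ['Mona', 'Dylan', 'Connie', 'Albert']

# Use "in" to test membership before calling index(), avoiding a ValueError.
if 'Bertie' in names:
    position = names.index('Bertie')
else:
    position = None
print(position)      # None
```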

A common error: the methods above do not return the modified list; they just modify the original list.

>>> my_list = [1, 2, 3]
>>> print(my_list.append(4))   ## prints None: append() returns None, not the list
None

## Correct pattern:
>>> my_list = [1, 2, 3]
>>> my_list.append(4)
>>> print(my_list)
[1, 2, 3, 4]

List Build Up

One common pattern is to start with the empty list [], then use append() or extend() to add elements to it:

>>> words = []               ## Start as the empty list (avoid naming it "list", which shadows the built-in type)
>>> words.append('abra')     ## Use append() to add elements
>>> words.append('cadabra')
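
The same build-up pattern is often combined with a loop. The following is a small sketch with made-up names (squares, letters), not a prescribed recipe.

```python
squares = []                 # start as the empty list
for n in range(1, 6):
    squares.append(n * n)    # append() adds one element at a time
print(squares)               # [1, 4, 9, 16, 25]

letters = []
letters.extend(['a', 'b'])   # extend() adds every element of another list
letters.extend(['c'])
print(letters)               # ['a', 'b', 'c']
```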

List Slices

Slices work on lists just as with strings, and can also be used to change sub-parts of the list.

>>> new_list = ['a', 'b', 'c', 'd']
>>> print(new_list[1:-1])
['b', 'c']
>>> new_list[0:2] = ['z']    ## replace ['a', 'b'] with ['z']
>>> print(new_list)
['z', 'c', 'd']
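
A few more slice forms, as a sketch. The letters list is only an example. Note that slice assignment may change the length of the list.

```python
letters = ['a', 'b', 'c', 'd', 'e']

print(letters[1:3])     # ['b', 'c']
print(letters[:2])      # ['a', 'b']
print(letters[-2:])     # ['d', 'e']

# Slice assignment can replace a sub-part with a list of a different length.
letters[1:4] = ['X', 'Y']
print(letters)          # ['a', 'X', 'Y', 'e']

# Assigning an empty list to a slice deletes those elements.
letters[0:1] = []
print(letters)          # ['X', 'Y', 'e']
```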

Lists and strings have many common properties, such as indexing and slicing operations.

They are two examples of sequence data types.

Since Python is an evolving language, other sequence data types may be added. There is also another standard sequence data type: the tuple.

A tuple consists of a number of values separated by commas:

>>> r = 1234, 5432, 'Pythonic!'
>>> r
(1234, 5432, 'Pythonic!')
>>> r[0]
1234
>>> # You can also nest tuples
>>> g = r, (1, 1, 2, 3, 5, 8, 13, 21)
>>> g
((1234, 5432, 'Pythonic!'), (1, 1, 2, 3, 5, 8, 13, 21))
>>> # Tuples are immutable
>>> g[0] = 8080
Traceback (most recent call last):  
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment  
>>> # but tuples can contain mutable objects:
>>> v = ([1, 2, 3], [3, 2, 1])
>>> v
([1, 2, 3], [3, 2, 1])
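
To illustrate the last point, here is a brief sketch: the tuple's slots cannot be reassigned, but a list stored in a slot can still be modified in place.

```python
v = ([1, 2, 3], [3, 2, 1])

# The tuple itself cannot be changed, but the lists inside it can.
v[0].append(4)
print(v)                 # ([1, 2, 3, 4], [3, 2, 1])
print(len(v))            # 2

try:
    v[0] = [9, 9, 9]     # rebinding a slot of the tuple is not allowed
except TypeError as err:
    print('TypeError:', err)
```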

Craig Derington

Secular Humanist, Libertarian, FOSS Evangelist building Cloud Apps developed on Red Hat Enterprise Linux and Ubuntu Server. My toolset includes Python, Celery, Flask, Django, MySQL, MongoDB and Git.
