6. Data Structures

Note

This chapter is for beginners. If you have some Python programming experience, you may skip it.

List, set, tuple, and dictionary are the most common and basic data structures in Python. This chapter covers some basic Python commands for working with these data structures.

These data structures differ in their mutability and ordering, as shown in the image below:

_images/data_structures.png
  • You can use curly braces to define a set like this: {1, 2, 3}. However, if you leave the curly braces empty like this: {}, Python will instead create an empty dictionary. So to create an empty set, use set().

  • A dictionary itself is mutable, but each of its keys must be hashable, which in practice means immutable (for example a string, a number, or a tuple of immutable items). You can find out why here.

Reference: Data Structures- Lists, Tuples, Dictionaries, and Sets in Python: https://medium.com/@aitarurachel/data-structures-with-lists-tuples-dictionaries-and-sets-in-python-612245a712af
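The two notes above can be checked directly in an interpreter. The following is a minimal sketch; the variable names (empty_braces, locations, and so on) are illustrative and do not come from the original chapter.

```python
# {} is an empty dict; set() is an empty set
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__, type(empty_set).__name__)  # dict set

# A set keeps only unique items
unique = {1, 2, 2, 3}
print(len(unique))  # 3

# Hashable (immutable) objects such as tuples can be dictionary keys
locations = {(40.7, -74.0): 'New York'}

# A mutable list is not hashable, so using it as a key raises TypeError
try:
    locations[[40.7, -74.0]] = 'New York'
    list_key_ok = True
except TypeError:
    list_key_ok = False
print(list_key_ok)  # False
```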

6.1. List

list is one of the data structures I use most heavily in my daily work.

6.1.1. Create empty list

An empty list is typically used to initialize a list that will be filled later.

:: Python Code:

# list can be defined with square brackets.
my_list = []
# create empty list with list() constructor
# when no parameters are passed
my_list = list()
type(my_list)

:: Output:

list

I use an empty list to initialize my silhouette score list when searching for the optimal number of clusters.

:: Example:

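import numpy as np

min_cluster = 3
max_cluster = 8

# silhouette_score (random placeholder values stand in for real scores)
scores = []

# range() excludes max_cluster, so this loop covers 3 to 7 clusters
for i in range(min_cluster, max_cluster):
    # float() keeps the printed list free of NumPy scalar wrappers
    score = float(np.round(np.random.random_sample(), 2))
    scores.append(score)

print(scores)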

:: Output:

[0.16, 0.2, 0.3, 0.87, 0.59]

6.1.2. Unpack list

:: Example:

num = [1,2,3,4,5,6,7,8,9,10]
print(*num)

:: Output:

1 2 3 4 5 6 7 8 9 10
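The star operator used above passes each item of the list as a separate argument to print(). As a small sketch of two related uses (the names first, middle, and last are illustrative), the star can also collect the remaining items in an assignment:

```python
num = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# print(*num) is equivalent to print(1, 2, ..., 10); sep changes the separator
print(*num, sep=', ')

# In an assignment, a starred name collects "the rest" as a list
first, *middle, last = num
print(first, last)   # 1 10
print(middle)        # [2, 3, 4, 5, 6, 7, 8, 9]
```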

6.1.3. Methods of list objects

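Methods of list objects:

=====================================  ================================================================================
Name                                   Description
=====================================  ================================================================================
list.append(x)                         Add an item to the end of the list
list.extend(iterable)                  Extend the list by appending all the items from the iterable
list.insert(i, x)                      Insert an item at a given position
list.remove(x)                         Remove the first item from the list whose value is equal to x
list.pop([i])                          Remove the item at the given position and return it (the last item if i is omitted)
list.clear()                           Remove all items from the list
list.index(x[, start[, end]])          Return zero-based index in the list of the first item whose value is equal to x
list.count(x)                          Return the number of times x appears in the list
list.sort(*, key=None, reverse=False)  Sort the items of the list in place
list.reverse()                         Reverse the elements of the list in place
list.copy()                            Return a shallow copy (see footnote 1) of the list
=====================================  ================================================================================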
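To make the methods listed above concrete, here is a short sketch that applies each of them in turn to one small list. The list contents and variable names are illustrative only; the comments show the state of the list after each call.

```python
letters = ['b', 'a']
letters.append('c')            # ['b', 'a', 'c']
letters.extend(['a', 'd'])     # ['b', 'a', 'c', 'a', 'd']
letters.insert(0, 'z')         # ['z', 'b', 'a', 'c', 'a', 'd']
letters.remove('a')            # only the first 'a' goes: ['z', 'b', 'c', 'a', 'd']
last = letters.pop()           # returns 'd'; list is ['z', 'b', 'c', 'a']
position = letters.index('c')  # 2
n_a = letters.count('a')       # 1
letters.sort(reverse=True)     # ['z', 'c', 'b', 'a']
letters.reverse()              # ['a', 'b', 'c', 'z']
snapshot = letters.copy()      # independent outer list with the same items
letters.clear()                # []

print(last, position, n_a, snapshot, letters)
```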

6.1.4. list.append(x) vs. list.extend(iterable)

The difference between list.append(x) and list.extend(iterable) is easy to see from the examples below: append adds its argument as a single item, while extend adds each item of the iterable.

:: Example:

list1 = ['A','B','C']
list2 = ['D','E','F']
list1.append(list2)
print(list1)

:: Output:

['A', 'B', 'C', ['D', 'E', 'F']]

:: Example:

list1 = ['A','B','C']
list2 = ['D','E','F']
list1.extend(list2)
print(list1)

:: Output:

['A', 'B', 'C', 'D', 'E', 'F']
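Because extend() accepts any iterable, not only a list, the same contrast shows up with strings and tuples. The following is a small sketch with an illustrative list named chars:

```python
chars = ['A']
chars.extend('BC')    # a string is iterated character by character
chars.append('BC')    # the whole string is added as one item
print(chars)          # ['A', 'B', 'C', 'BC']

# += on a list behaves like extend()
chars += ('D', 'E')
print(chars)          # ['A', 'B', 'C', 'BC', 'D', 'E']
```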

Footnotes

1

Shallow Copy vs Deep Copy Reference: https://stackoverflow.com/posts/184780/revisions

Shallow copy:

_images/shal.png

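Initially the variables A and B refer to different areas of memory. When B is assigned to A by a shallow copy, the two variables refer to the same contents, so later modifications to those shared contents through either variable are instantly reflected in the other. For a Python list, list.copy() creates a new outer list, but its items still refer to the same objects as the items of the original.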

Deep Copy:

_images/deep.png

Initially the variables A and B refer to different areas of memory. When B is assigned to A by a deep copy, the values in the memory area that A points to are copied into the memory area that B points to. Later modifications to the contents of either remain unique to A or B; the contents are not shared.
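The difference is easiest to see with a nested list. The sketch below uses the standard library copy module; the names original, alias, shallow, and deep are illustrative.

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                # plain assignment: the same list object
shallow = original.copy()       # new outer list, shared inner lists
deep = copy.deepcopy(original)  # new outer list and new inner lists

original[0].append(99)          # mutate a nested list
original.append([5, 6])         # mutate the outer list

print(alias)    # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

The shallow copy sees the change to the shared inner list but not the item appended to the outer list; the deep copy sees neither.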

6.2. Tuple

A tuple is a sequence of values separated by commas, which makes it similar to a Python list. The fundamental difference is that a tuple is “immutable”: once created, its items cannot be reassigned, added, or removed.
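As a brief sketch of what immutability means in practice (the names point, moved, and record are illustrative), item assignment fails, a "change" requires building a new tuple, and a mutable object stored inside a tuple can still be modified:

```python
point = (1, 2)

# Item assignment on a tuple raises TypeError
try:
    point[0] = 5
    assigned = True
except TypeError:
    assigned = False
print(assigned)  # False

# "Changing" a tuple means building a new one
moved = (5,) + point[1:]
print(moved)     # (5, 2)

# Immutability is shallow: a list held inside a tuple can still change
record = ('Michael', ['Honda'])
record[1].append('Tesla')
print(record)    # ('Michael', ['Honda', 'Tesla'])
```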

6.2.1. Create Tuple

A tuple is defined in the same way as a list, except that the elements are enclosed in parentheses instead of square brackets. A tuple with one item requires a trailing comma after the item. Without the comma, Python treats the parentheses as ordinary grouping and does not create a tuple.

:: Python Code:

# initialize an empty tuple by using the tuple function
my_tuple = tuple()

# tuple with one value must include trailing comma
my_tuple = ('A', )
print(type(my_tuple))

# string type if no trailing comma
my_str = ('A')
print(type(my_str))

# convert list to tuple
my_list = ['A','B','C']
my_tuple = tuple(my_list)
print(type(my_tuple))

:: Output:

<class 'tuple'>
<class 'str'>
<class 'tuple'>

6.2.2. Assigning Multiple Values At Once with Tuple

A cool way to use a tuple is to assign multiple values at once.

:: Example:

(x, y, z) = ('A','B','C')
(MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY) = range(7)
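The same mechanism supports a few other common patterns. The sketch below is illustrative: min_max is a hypothetical helper written for this example, not a built-in function.

```python
# Parentheses are optional on both sides of the assignment
x, y, z = 'A', 'B', 'C'

# Swap two variables without a temporary variable
x, y = y, x
print(x, y, z)    # B A C


def min_max(values):
    """Hypothetical helper: return the smallest and largest value as a tuple."""
    return min(values), max(values)


# Unpack a tuple returned from a function
low, high = min_max([3, 8, 5])
print(low, high)  # 3 8
```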

6.3. Dictionary

dict is another data structure that I use heavily in my daily work. I applied dict extensively in my PyAudit package; more details can be found at PyAudit.

6.3.1. Create dict from lists

:: Example:

import pandas as pd

col_names = ['name', 'Age', 'Sex', 'Car']
col_values = ['Michael', '30', 'Male', ['Honda', 'Tesla']]
# pair each column name with its value
d = {key: value for key, value in zip(col_names, col_values)}
print(d)
# scalar values are repeated to match the length of the 'Car' list
df = pd.DataFrame(d)
print(df)

:: Output:

{'name': 'Michael', 'Age': '30', 'Sex': 'Male', 'Car': ['Honda', 'Tesla']}
      name Age   Sex    Car
0  Michael  30  Male  Honda
1  Michael  30  Male  Tesla
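The dictionary comprehension above can also be written with the dict() constructor, which accepts an iterable of key-value pairs. A minimal sketch (d_comprehension, d_zip, and short are illustrative names):

```python
col_names = ['name', 'Age', 'Sex', 'Car']
col_values = ['Michael', '30', 'Male', ['Honda', 'Tesla']]

d_comprehension = {key: value for key, value in zip(col_names, col_values)}
d_zip = dict(zip(col_names, col_values))
print(d_zip == d_comprehension)  # True

# zip() stops at the shorter input, so an unmatched extra name is silently dropped
short = dict(zip(col_names + ['Extra'], col_values))
print(len(short))                # 4
```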

6.3.2. dict.get()

When get() is called, Python checks whether the specified key exists in the dict. If it does, get() returns the value for that key. If it does not, get() returns the value given as the second argument, or None when no second argument is given. A good application of get() can be found in Update Keys in Dict.

:: Example:

data1 = d.get("name", "best")
data2 = d.get("names", "George")
print(data1)  # Michael
print(data2)  # George

:: Output:

Michael
George
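For comparison, the sketch below contrasts get() with square-bracket lookup and shows a common counting idiom. It uses a small illustrative dictionary rather than the full d from above.

```python
d = {'name': 'Michael', 'Age': '30'}

# Without a second argument, get() returns None for a missing key
print(d.get('names'))   # None

# Square-bracket lookup raises KeyError instead
try:
    d['names']
    found = True
except KeyError:
    found = False
print(found)            # False

# get() never inserts the missing key
print('names' in d)     # False

# Counting with get(): a missing key starts from 0
counts = {}
for ch in 'banana':
    counts[ch] = counts.get(ch, 0) + 1
print(counts)           # {'b': 1, 'a': 3, 'n': 2}
```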

6.3.3. Looping Techniques

:: Example:

print([(key, val) for key, val in d.items()])

:: Output:

[('name', 'Michael'), ('Age', '30'), ('Sex', 'Male'), ('Car', ['Honda', 'Tesla'])]
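Besides items(), a dictionary can be looped over by key or by value. The following sketch uses a shortened illustrative copy of d and collects the results in lists so they are easy to inspect:

```python
d = {'name': 'Michael', 'Age': '30', 'Sex': 'Male'}

# Iterating over a dict directly yields its keys
keys = [key for key in d]

# values() yields only the values
values = [val for val in d.values()]

# enumerate() adds a running index to each (key, value) pair
indexed = [(i, key, val) for i, (key, val) in enumerate(d.items())]

# list(d.items()) gives the same result as the comprehension over items()
pairs = list(d.items())

print(keys)
print(values)
print(indexed)
```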

6.3.4. Update Values in Dict

  1. Replace values in dict

    :: Example:

    replace = {'Car': ['Tesla S', 'Tesla X']}
    print(d)
    d.update(replace)
    print(d)
    

    :: Output:

    {'name': 'Michael', 'Age': '30', 'Sex': 'Male', 'Car': ['Honda', 'Tesla']}
    {'name': 'Michael', 'Age': '30', 'Sex': 'Male', 'Car': ['Tesla S', 'Tesla X']}
    
  2. Add key and values in dict

    :: Example:

    # add key and values in dict
    added = {'Kid': ['Tom', 'Jim']}
    print(d)
    d.update(added)
    print(d)
    

    :: Output:

    {'name': 'Michael', 'Age': '30', 'Sex': 'Male', 'Car': ['Tesla S', 'Tesla X']}
    {'name': 'Michael', 'Age': '30', 'Sex': 'Male', 'Car': ['Tesla S', 'Tesla X'], 'Kid': ['Tom', 'Jim']}
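update() is convenient when several keys change at once. For a single key, plain item assignment does the same job. The sketch below uses a shortened illustrative dictionary and also shows one way to merge into a new dict without modifying the original:

```python
d = {'name': 'Michael', 'Car': ['Honda', 'Tesla']}

d['Car'] = ['Tesla S', 'Tesla X']  # replace the value of an existing key
d['Kid'] = ['Tom', 'Jim']          # add a new key
d.update(Age='30', Sex='Male')     # update() also accepts keyword arguments

# Build a merged copy; d itself is left untouched
merged = {**d, 'Sex': 'M'}
print(d['Sex'], merged['Sex'])     # Male M
```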
    

6.3.5. Update Keys in Dict

:: Example:

# update keys in dict
mapping = {'Car': 'Cars', 'Kid': 'Kids'}
# keys found in mapping are renamed; all other keys are kept unchanged
print(d)
print({mapping.get(key, key): val for key, val in d.items()})

:: Output:

{'name': 'Michael', 'Age': '30', 'Sex': 'Male', 'Car': ['Tesla S', 'Tesla X'], 'Kid': ['Tom', 'Jim']}
{'name': 'Michael', 'Age': '30', 'Sex': 'Male', 'Cars': ['Tesla S', 'Tesla X'], 'Kids': ['Tom', 'Jim']}
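Note that the comprehension builds a new dictionary and leaves d unchanged. If the renaming is needed in several places, it could be wrapped in a small function; rename_keys below is a hypothetical helper written for this sketch, not part of Python or PyAudit.

```python
def rename_keys(data, mapping):
    """Hypothetical helper: return a new dict with keys renamed per mapping."""
    return {mapping.get(key, key): val for key, val in data.items()}


d = {'name': 'Michael', 'Car': ['Tesla S'], 'Kid': ['Tom']}
renamed = rename_keys(d, {'Car': 'Cars', 'Kid': 'Kids'})

print(renamed)     # {'name': 'Michael', 'Cars': ['Tesla S'], 'Kids': ['Tom']}
print('Car' in d)  # True: the original dict is unchanged
```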

6.4. One line if-else statement

6.4.1. With filter

:: Syntax:

[ RESULT for x in seq if COND ]

:: Python Code:

num = [1,2,3,4,5,6,7,8,9,10]

[x for x in num if x % 2 == 0]

:: Output:

[2, 4, 6, 8, 10]
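A filtering comprehension is shorthand for a loop with an if statement. The sketch below writes out the equivalent loop and then varies the RESULT part of the syntax; the names evens_loop, evens_comp, and squares_of_evens are illustrative.

```python
num = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Equivalent explicit loop
evens_loop = []
for x in num:
    if x % 2 == 0:
        evens_loop.append(x)

# Comprehension form: the trailing if keeps only matching items
evens_comp = [x for x in num if x % 2 == 0]
print(evens_loop == evens_comp)  # True

# RESULT can be any expression of x
squares_of_evens = [x ** 2 for x in num if x % 2 == 0]
print(squares_of_evens)          # [4, 16, 36, 64, 100]
```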

6.4.2. Without filter

:: Syntax:

[ RESULT1 if COND1  else RESULT2 if COND2 else RESULT3 for x in seq]

:: Python Code:

num = [1,2,3,4,5,6,7,8,9,10]

['Low' if 1 <= x <= 3 else 'Median' if 3 < x < 8 else 'High' for x in num]

:: Output:

['Low',
 'Low',
 'Low',
 'Median',
 'Median',
 'Median',
 'Median',
 'High',
 'High',
 'High']
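Here the if-else is a conditional expression that chooses a value for every item, so nothing is filtered out. When the chain of conditions grows, moving it into a function may be easier to read. The function label below is a hypothetical helper for this sketch and is meant to be equivalent to the chained expression above.

```python
def label(x):
    """Hypothetical helper equivalent to the chained conditional expression."""
    if 1 <= x <= 3:
        return 'Low'
    elif 3 < x < 8:
        return 'Median'
    else:
        return 'High'


num = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

labels_func = [label(x) for x in num]
labels_expr = ['Low' if 1 <= x <= 3 else 'Median' if 3 < x < 8 else 'High' for x in num]
print(labels_func == labels_expr)  # True

# A conditional expression and a filter can be combined in one comprehension
parity_above_5 = ['even' if x % 2 == 0 else 'odd' for x in num if x > 5]
print(parity_above_5)              # ['even', 'odd', 'even', 'odd', 'even']
```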

[VanderPlas2016] [McKinney2013]