## Python List Sorting Techniques

1. How do I sort a list ?

You can sort a list by using the built-in list.sort() method that modifies the list in-place, and it’s stable. The other way to sort a list is via built-in function sorted() that builds a new sorted list. Let me give you two examples.

Example using list.sort():

import random
# a list with 10 integers
nums = [num for num in range(10)]
print(nums)
# shuffle nums to see the effect of sorting, note that this is a in-place operation
random.shuffle(nums)
print(nums)
# sort it
nums.sort()
print(nums)


The above piece of code will have an output as following:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 7, 5, 3, 6, 8, 4, 1, 9, 2]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


Note: you may have different output per run because of the shuffle operation.

Example using sorted():

import random
# a list with 10 integers
nums = [num for num in range(10)]
print(nums)
# shuffle nums to see the effect of sorting, note that this is a in-place operation
random.shuffle(nums)
print(nums)
# sort it using sorted
nums_sorted = sorted(nums)  # a new list returned
print(nums_sorted)


Here, we’ll have a similar output as in the previous example:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[4, 5, 3, 7, 6, 2, 9, 8, 1, 0]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

1. How do I sort a list with multiple keys?

Suppose you have a list of records with each record representing a student. A student consists of a name, a gender, an age, etc. For simplicity, we can assume the record just consists of the three attributes aforementioned, thus the list looks like the following:

students = [('john', 'male', 23), ('alice', 'female', 18)]


In this case, you would like to sort the students first by name, then by the age. How can you do it? The answer is you can either use list.sort() or sorted(), but this time you need to supply a comparison function or specify the key to sort. Before we go ahead, let’s check the signatures:

The signature of list.sort() is sort(*, key=None, reverse=None) while the signature of sorted() is sorted(iterable[,key][,reverse].

Now let’s sort a list with multiple keys, see following for an example:

Here I only give an example using list.sort().

Example 1: specify key using a lambda

from pprint import pprint

# create a list to sort
students = [('dave', 'male', 10), ('jane', 'female', 12), ('john', 'male', 15)]
# specify key using a lambda
students.sort(key=lambda student: (student[0], student[2]))
pprint(students)


The above code gives us following output:

[('dave', 'male', 10), ('jane', 'female', 12), ('john', 'male', 15)]


You can see that it just works as we expected. Let’s check another example:

Example 2: specify key using itemgetter

from operator import itemgetter
from pprint import pprint

# create a list to sort
students = [('dave', 'male', 10), ('jane', 'female', 12), ('john', 'male', 15)]
# specify key using a lambda
students.sort(key=itemgetter(0, 2))
pprint(students)


This example has a same output as in example 1. There is a third way to do it, we can explicitly sort the students first by name, then by the age. Let’s have a look at it:

Example 3: explicitly do the sorting

from operator import itemgetter
from pprint import pprint

# create a list to sort
students = [('dave', 'male', 10), ('jane', 'female', 12), ('john', 'male', 15)]
students.sort(key=itemgetter(0))  # sort by name
students.sort(key=itemgetter(2), reverse=True) # sort by age in descending order
pprint(students)


The output of above code:

[('john', 'male', 15), ('jane', 'female', 12), ('dave', 'male', 10)]


This time we have a different output from previous two examples, we first sort by name, then sort by age in descending order. When we want to have fine controls over the order to sort, we use this method.

## Batch vs Stochastic Gradient Descent

At this point, it seems that we’ve seen two ways of doing linear regression.

• By applying the squared (or absolute) trick at every point in our data one by one, and repeating this process many times.
• By applying the squared (or absolute) trick at every point in our data all at the same time, and repeating this process many times.

More specifically, the squared (or absolute) trick, when applied to a point, gives us some values to add to the weights of the model. We can add these values, update our weights, and then apply the squared (or absolute) trick on the next point. Or we can calculate these values for all the points, add them, and then update the weights with the sum of these values.

The latter is called batch gradient descent. The former is called stochastic gradient descent.

The question is, which one is used in practice?

Actually, in most cases, neither. Think about this: If your data is huge, both are a bit slow, computationally. The best way to do linear regression, is to split your data into many small batches. Each batch, with roughly the same number of points. Then, use each batch to update your weights. This is still called mini-batch gradient descent.

## Mean vs Total Squared (or Absolute) Error

A potential confusion is the following: How do we know if we should use the mean or the total squared (or absolute) error?

The total squared error is the sum of errors at each point, given by the following equation:

$M = \sum_{i=1}^m \frac{1}{2} (y - \hat{y})^2$,

whereas the mean squared error is the average of these errors, given by the equation, where $m$ is the number of points:

$T = \sum_{i=1}^m \frac{1}{2m}(y - \hat{y})^2$.

The good news is, it doesn’t really matter. As we can see, the total squared error is just a multiple of the mean squared error, since

$M = mT$.

Therefore, since derivatives are linear functions, the gradient of $T$ is also mm times the gradient of $M$.

However, the gradient descent step consists of subtracting the gradient of the error times the learning rate $\alpha$. Therefore, choosing between the mean squared error and the total squared error really just amounts to picking a different learning rate.

In real life, we’ll have algorithms that will help us determine a good learning rate to work with. Therefore, if we use the mean error or the total error, the algorithm will just end up picking a different learning rate.

## Build an Environment for Deep Learning

Recently I bought a Nvidia GPU card, GTX 1080 Ti, it’s a very powerful card. I’m very happy to have an my own card for deep learning purpose. In order to make it work I bought an used HP Z420 workstation to work with it. Though z420 is used, it’s very powerful to work with GTX 1080 Ti. Buying an used product also saves me a lot of money because of my limited budget. The time I got the card, I’m eager to make everything combined and functional. Now I already have an ready environment for deep learning purpose, here I want to share the process to make everything work and hope this will benefit others, good luck!

My operating system is the recent released Ubuntu LTS version, namely Ubuntu 18.04. So following operations are carried out in this setting.

## Install nvidia driver

There are three methods to install nvidia driver for your card, I will introduce them later. Now an important work to do is disabling nouveau.

1. Disable nouveau

To make your card to work properly, one important thing to do is disabling the open source driver supplied by nouveau, you can do this by editing the grub config file (/boot/grub/grub.cfg), searching for line containing quiet splash and add acpi_osi=linux nomodeset to end of the line.

You can check the type of your card using following method:

 ubuntu-drivers devices


The above command shows all devices which need drivers, and which packages apply to them. The output is as follows on my computer:

== /sys/devices/pci0000:00/0000:00:02.0/0000:05:00.0 ==
modalias : pci:v000010DEd00001B06sv00001458sd0000376Abc03sc00i00
vendor   : NVIDIA Corporation
model    : GP102 [GeForce GTX 1080 Ti]
manual_install: True
driver   : nvidia-driver-390 - distro non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin


We can conclude from above outputs that the machine has a gpu card manufactured by the vendor NVIDIA Corporation, its model is GP102 (GeForce GTX 1080 Ti) and a driver nvidia-driver-390 is recommended to install.

3. Install the driver

To install the driver, one easy way is using the command ubuntu-drivers autoinstall which installs drivers that are appropriate for automatic installation. A second way is adding a ppa repository and installing from it. You can find related resources about that, here I mainly describe the third method I use to install the driver, installing from the executable run file downloaded from nvidia official website. By running the installer file provided officially you can install the most recent driver and get better performances.

Finally,I got a file named NVIDIA-Linux-x86_64-410.73.run in my Downloads folder.

Step 2: install the driver

To install the driver, change to the Downloads directory and run the following commands in the terminal:

sudo telinit 3
sudo sh NVIDIA-Linux-x86_64-410.73.run


The first command disables the gui and takes you to an terminal login, all you to do is login, run the second command and continue with the instructions.

Note: You might need to run sudo apt install build-essentials before you carry out the command to install the driver, because some requirements need to meet, like cc etc.

When the installation completes, reboot your machine to make the driver work. Check the installation with the following commands:

• Run command lspci | grep -i nvidia to make sure the output is correct.
• An application named NVDIA X Server Settings is installed, you can open it and check the settings.

4. Install cuda and cudnn

In this part, we install cuda toolkit and cudnn library. An important step is choosing the right cuda version, because I want to install tensorflow-gpu that only supports cuda 9.0, so this version is the only choice I can take.

ps: I try to use cuda 10.0 and cuda 9.2, neither works with tensorflow-gpu. Knowing this will save you a lot of time.

Choose Legacy Releases

After you clicked Legacy Releases, you are taken to the cuda toolkit archive site, here you need to select Cuda Toolkit 9.0. Remained steps are choosing the operating system, architecture, distribution, version and installer type, in my case these are Linux, x86_64, Ubuntu, 16.04 and runfile(local). To install this version, you need to install additional four patches, download and install them.

As in previous, check following picture easing your choosing:

After all necessary files are downloaded, just issue the command sudo sh runfile, the order is main runfile, then patch 1 runfile, patch 2 runfile, patch 3 runfile, finally patch 4 runfile. When all is done, you have cuda toolkit 9.0 installed.

Note: Add /usr/local/cuda-9.0/bin to your PATH and /usr/local/cuda-9.0/lib64 to your LD_LIBRARY_PATH.

To install cudnn library, you need a nvidia deveploper account to download it. When the download completes, you got an archive file, decompress it and you got an folder named cuda, all you need to do is copy files in this folder to previous installed cuda toolkit related location, see following for details:

sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*


Congrats! When you are here, you got an cuda toolkit 9.0 and cudnn library instaled.

5. Install tensorflow-gpu

To install tensorflow, please refer to Install Tensorflow with pip. Following the guide, I created a python virtual environment and installed tensorflow-gpu with pip.

6. Test tensorflow

Now we have tensorflow installed, let’s check it by a simple tutorial.

import tensorflow as tf
mnist = tf.keras.datasets.mnist

x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation=tf.nn.relu),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)


Let’s check the output by running above code:

(venv) duliqiang@hp-z420-workstation:~/ml\$ python tf_minist.py
Epoch 1/5
2018-11-03 14:39:13.246295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.721
pciBusID: 0000:05:00.0
totalMemory: 10.92GiB freeMemory: 10.17GiB
2018-11-03 14:39:13.246620: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2018-11-03 14:39:17.931728: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-03 14:39:17.931773: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]      0
2018-11-03 14:39:17.931782: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0:   N
2018-11-03 14:39:17.932021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9827 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1)
60000/60000 [==============================] - 14s 237us/step - loss: 0.2017 - acc: 0.9402
Epoch 2/5
60000/60000 [==============================] - 7s 116us/step - loss: 0.0813 - acc: 0.9747
Epoch 3/5
60000/60000 [==============================] - 7s 120us/step - loss: 0.0532 - acc: 0.9836
Epoch 4/5
60000/60000 [==============================] - 7s 119us/step - loss: 0.0370 - acc: 0.9880
Epoch 5/5
60000/60000 [==============================] - 7s 117us/step - loss: 0.0271 - acc: 0.9917
10000/10000 [==============================] - 1s 60us/step


Nice work! The gpu card is working as expected.