The purpose of this session is to point briefly to some further topics that you may find interesting, and that will doubtless be useful when writing your own programs.

1. Scripts

We have been using Jupyter Notebooks, installed using Anaconda, to write and run Python. However, this is only one way of working with Python. Python is just a language and, with Anaconda installed, your computer can read and run files written in this language. To help the computer out, we save these files with the extension .py.

For example, save the following bit of Python in a file on your computer called hello-world.py (you can do this using any text editor).

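# Use the name 'total' rather than 'sum', so that Python's built-in sum function is not overwritten
total = 0
for num in range(10):
    total += num
print('Hello world! My sum is ' + str(total))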

Then open your command line or command prompt. Navigate to the place where you saved the file, and then type:

python hello-world.py

This tells the computer to read and run the Python written in the file hello-world.py.

Running scripts is the usual way to run large Python programs, code that takes a long time to run, or code that would otherwise be inappropriate for the Jupyter Notebook environment.
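
As a script grows, it is common to put the work inside functions and to start the program from a single place at the bottom of the file. The following is only a minimal sketch of that layout: the file name sum_up_to.py and the function name sum_up_to are illustrative assumptions, not part of the course material.

```python
# sum_up_to.py (hypothetical file name, chosen for illustration)

def sum_up_to(limit):
    """Return the sum of the whole numbers from 0 up to, but not including, limit."""
    total = 0
    for num in range(limit):
        total += num
    return total


if __name__ == "__main__":
    # This block runs when the file is run as a script,
    # for example with: python sum_up_to.py
    # It does not run when the file is imported from another Python file.
    print('Hello world! My sum is ' + str(sum_up_to(10)))
```

Run from the command line in the same way as hello-world.py, this should print the same message as the earlier example.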

2. Readable code

You may have noticed throughout this course that nearly all Python commands look like English, and that some Python code, read aloud, is understandable as English. This is deliberate: it makes the code easier to read.

When you have the freedom to name things yourself, such as variables, functions and objects, it is good practice to choose meaningful English names, so that the code as a whole stays readable.

For example, consider the following code:

>>> def f(l):
...     n = len(l)
...     s = sum(l)
...     return s / n

>>> a = [1.76, 1.73, 1.68, 1.75, 1.76, 1.82, 1.77, 1.73, 1.66, 1.68]
>>> f(a)
1.734

With some thinking we can work out what this is doing. But with proper variable names this task would be much easier:

>>> def mean(data):
...     number_of_observations = len(data)
...     sum_of_observations = sum(data)
...     return sum_of_observations / number_of_observations

>>> heights = [1.76, 1.73, 1.68, 1.75, 1.76, 1.82, 1.77, 1.73, 1.66, 1.68]
>>> mean(heights)
1.734
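
One further step, offered here as a suggestion rather than as part of the course material, is to add a short description (a docstring) as the first line inside the function. The sketch below reuses the mean function from above; the wording of the docstring is an assumption made for illustration.

```python
def mean(data):
    """Return the arithmetic mean of a non-empty list of numbers."""
    number_of_observations = len(data)
    sum_of_observations = sum(data)
    return sum_of_observations / number_of_observations


heights = [1.76, 1.73, 1.68, 1.75, 1.76, 1.82, 1.77, 1.73, 1.66, 1.68]

# Round to three decimal places for display
print(round(mean(heights), 3))

# The docstring is stored with the function, and is also what help(mean) displays
print(mean.__doc__)
```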

3. Libraries

A great strength of Python is that a large amount of high-quality, pre-written code is available to download for free. These collections of code are called libraries, and they offer functions and objects to carry out common and specialised tasks. You have already used some of these, and the final set of exercises takes you through the use of some libraries that will be particularly relevant to data science, operational research, and applied statistics.

To show the diversity and range of libraries available, and as a reference for your future studies, below is a large list of libraries and their main uses:

  • Astropy: For commonly used astronomy tools.
  • BeautifulSoup: For pulling data from HTML and XML.
  • Ciw: For discrete event simulations of queueing systems.
  • Dask: For parallelising analytics.
  • datetime: For manipulating dates and times.
  • Django: For building web apps.
  • fastText: For text classification and representation.
  • Flask: For building web apps.
  • FuzzyWuzzy: For fuzzy string matching.
  • Gambit: For game theory.
  • GeoPandas: For manipulating and plotting geospatial data and maps.
  • itertools: For combinatorics and iterators.
  • Keras: For neural networks and deep learning.
  • kivy: For app development.
  • math: For mathematical functions and constants.
  • matplotlib: For 2D plotting.
  • Nashpy: For game theory.
  • NetworkX: For graph theory and networks.
  • NumPy: For efficient linear algebra.
  • pandas: For data frames, data manipulation, and data analysis.
  • PuLP: For optimisation and linear programming.
  • PyFlux: For time series analysis and prediction.
  • pygame: For making games.
  • PyTorch: For machine learning.
  • random: For generating random numbers and sampling from distributions.
  • requests: For sending HTTP requests and using APIs.
  • scikit-learn: For machine learning.
  • SciPy: For scientific computations, optimisation, and statistical functions.
  • SimPy: For process-based discrete event simulation.
  • SymPy: For symbolic mathematics and computer algebra.
  • TensorFlow: For neural networks and deep learning.
  • tqdm: For progress bars.
  • turtle: For turtle graphics.
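
To give a flavour of how libraries are used, here is a small sketch using three entries from the list: math, itertools and datetime. These are part of Python's standard library, so nothing extra should need to be installed. The numbers, letters and dates are made up purely for illustration.

```python
import datetime
import itertools
import math

# math: the area of a circle of radius 2
circle_area = math.pi * 2 ** 2

# itertools: every way of choosing 2 items from a small collection
pairs = list(itertools.combinations(['a', 'b', 'c'], 2))

# datetime: the gap between two dates
gap = datetime.date(2024, 3, 1) - datetime.date(2024, 2, 1)

print(circle_area)
print(pairs)     # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(gap.days)  # 29, as 2024 is a leap year
```

Each library is brought in with an import statement, and its functions and objects are then reached through the library's name, as in math.pi or itertools.combinations.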
