Introduction to the Python Programming Language


Why Python

Python is an open-source, high-level, general-purpose programming language that was originally developed by Guido van Rossum. Open source means that anyone can view (and optionally contribute to) the language implementation. High level means that the language simplifies many tasks for the programmer, especially when compared to languages like C or Java. General purpose means that Python programs can be developed to tackle a wide range of problems, including network communication, scientific calculations, data processing and archiving tasks, and graphical tool development. In addition, the syntax of Python is designed to allow programs to be developed quickly, which reduces development time and cost, and to be easy to read, which reduces maintenance costs.


A large number of Python modules, which extend the base programming language, have been developed and are now widely used. These modules, which can simplify the development of new Python programs, can be broadly classified into three types. The first type consists of commonly used modules that are part of the official Python distribution, known as the standard library, such as the math, os.path, pickle, sqlite3, bz2, and csv modules. The second type consists of modules that are also commonly used but are not (yet) part of the standard library, such as numpy, pandas, matplotlib, and scipy. The third type consists of modules that communities have developed for a special purpose and that are currently less commonly used (although this may change over time), such as seaborn, statsmodels, and nltk. Of course, any developer can write a Python module, which offers a wide array of possible add-on functionality. The use of these modules, however, should be carefully balanced against the importance of minimizing software dependencies, since fewer dependencies reduce development and maintenance issues for software engineers.


Optional: Python History

Python is maintained by the Python Software Foundation and currently comes in two versions: Python 2 and Python 3, the latter of which is intentionally backwards incompatible with the former. Sometime around the turn of the millennium, Python developers began to consider improvements to the Python language that might produce incompatibilities with the existing Python language. One such change, introduced with Python 3, is that the object model is now consistent: the old-style classes of Python 2 were removed, so every class in a Python program or script now derives from object. These changes were considered necessary to enable the language to continue to grow and develop. The development of this new version, originally called Python 3000 or Python 3K and now shortened to just Python 3, took a number of years, both because new ideas were carefully developed and tested and to give the existing community sufficient time to participate in the progression of the Python language.

Now, over 15 years later, Python 3 is an improved version of the original Python language and offers a number of important advances. Python 2 is now used primarily to maintain legacy Python code that is too difficult or expensive to port to the newer version. In this class, we will exclusively use Python 3 since it represents the future, and all libraries we will use have already been successfully ported to the new version. As you browse different websites, IPython notebooks, or other resources, you should keep this language split in mind and focus primarily on Python 3 material to minimize confusion arising from these language differences.


The following code block, when executed, displays what might be called the Python Zen.


In [1]:
import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Basic Concepts

While Python is a relatively easy language to learn, there are a few basic concepts that need to be reviewed before we begin to discuss the Python programming language. A fundamental concept to remember is that good Python code should be easy to read. To help programmers adhere to this principle, Python has the following guidelines:

  1. white space is important,
  2. names should be descriptive,
  3. code blocks are indented four spaces (not hard tabs) and follow a colon,
  4. lines of code should be limited to fewer than 80 characters, and
  5. good code should be thoroughly documented both with comments and descriptive documentation strings.

If lines need to be longer than 80 characters, the recommended practice is to use parentheses to group operations and to use suitable indentation to maintain readability. In the event this is insufficient, a line continuation character, the backslash \, can be placed at the end of a line to allow code to extend over as many lines as necessary.
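
As a minimal sketch of the two line-continuation styles described above, the following block writes the same expression once with parentheses and once with backslashes. The variable names and values are hypothetical and chosen only to make the expression long enough to wrap.

```python
# Hypothetical values, used only to build a long expression.
unit_price = 19.99
quantity = 3
shipping_cost = 4.50
tax_rate = 0.08

# Preferred: wrap the expression in parentheses and break it across lines.
total_with_parentheses = (
    unit_price * quantity
    + shipping_cost
    + unit_price * quantity * tax_rate
)

# Fallback: end each continued line with a backslash.
total_with_backslash = unit_price * quantity \
    + shipping_cost \
    + unit_price * quantity * tax_rate

# Both forms are the same expression, so the results are identical.
print(total_with_parentheses == total_with_backslash)
```

The parenthesized form is usually easier to edit, because a stray space after a backslash is a syntax error.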

Python supports a REPL, which is an abbreviation for Read-Eval-Print Loop, allowing a developer to write, run, and test code iteratively, which aids in quickly developing new programs. Python is also Unicode compliant, so a character encoding (for example, UTF-8) can be specified at the start of a program, and Unicode characters can be used in string literals, allowing a wider range of characters to be used to write descriptive text.

Python Identifiers

A Python identifier is a name composed of a sequence of letters, digits, and underscore characters that adheres to the following rules and conventions:

  1. The first character must be a letter or an underscore character.
  2. Variable and function names traditionally start with a lowercase letter.
  3. Class names traditionally start with an uppercase letter.
  4. The identifier cannot be one of the reserved Python keywords, listed in the code block below.

While not explicitly prevented, it is also recommended to avoid reusing the names of objects from common Python libraries, such as string, list, or tuple, to minimize name collisions and any resultant confusion.


In [2]:
# We are in a Python3 Kernel
help('keywords')
Here is a list of the Python keywords.  Enter any keyword to get more help.

False               def                 if                  raise
None                del                 import              return
True                elif                in                  try
and                 else                is                  while
as                  except              lambda              with
assert              finally             nonlocal            yield
break               for                 not                 
class               from                or                  
continue            global              pass                


A Python identifier can be used as the name of a variable, function, class, or module. Python identifiers are case sensitive, so mylist is different from myList. Writing descriptive identifiers can be beneficial for code readability and subsequent maintenance, so we often write multi-word identifiers. When combining words, one can use camel-case format, where each new word after the first is capitalized, like myFileList. Alternatively, one can separate words with underscores, like my_filename_list. While both approaches are legal, it is best to be as consistent as possible.
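
The identifier rules can also be checked programmatically. The sketch below assumes a helper named is_usable_identifier, which is hypothetical and not part of Python; it combines the string method isidentifier with the iskeyword function from the standard keyword module. The candidate names are likewise made up for illustration.

```python
import keyword

def is_usable_identifier(name):
    """Return True if name is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Hypothetical candidate names, chosen to exercise each rule.
candidates = ["my_filename_list", "myFileList", "_private",
              "2fast", "class", "my-list"]

for name in candidates:
    print(f"{name!r:20} {is_usable_identifier(name)}")

# Identifiers are case sensitive: these are two distinct variables.
mylist = [1, 2, 3]
myList = ["a", "b"]
print(mylist is myList)
```

Only the first three candidates pass: 2fast starts with a digit, class is a keyword, and my-list contains a hyphen.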

The Python Enhancement Proposal PEP 8 provides a complete discussion of recommended best practices when writing Python code. An additional perspective is available on how to be a Pythonista.

Documentation

The primary mechanism for documenting Python code is to use comments. Python supports two types of comment strings. The first type is a single-line comment, which begins with the hash or pound character # and continues until the end of the line. The # character can appear anywhere on the line (outside of a string literal). You can create large comment blocks by placing single-line comments on adjacent lines in a Python program. Here are a couple of examples of single-line comments; the first comment occupies the entire line, and the second comment follows a statement and extends to the end of that line.

# Calculate the hypotenuse of a triangle

c = math.sqrt(a**2 + b**2) # Assuming Euclidean Geometry

The second type of comment is a multi-line comment, which is written as a string literal that begins and ends with either three single-quote characters, ''' comment text ''', or three double-quote characters, """ comment text """. Such a string can easily extend over multiple lines and is therefore used in a Python program to provide documentation: when it is the first statement in a module, function, or class, it becomes that object's docstring. Here is an example of a multi-line comment string:

'''
This multi-line comment can provide useful information
for a function, class, or module.

This also allows whitespace to be used to help
write more clearly.
'''
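
To show both comment types in context, here is a small sketch built around the hypotenuse calculation used earlier. The function name hypotenuse is hypothetical and chosen for illustration. Because the triple-quoted string is the first statement in the function body, Python stores it in the function's __doc__ attribute.

```python
import math

def hypotenuse(a, b):
    """Return the length of the hypotenuse of a right triangle.

    The arguments a and b are the lengths of the two shorter sides.
    """
    # Assuming Euclidean geometry
    return math.sqrt(a**2 + b**2)

print(hypotenuse(3, 4))      # 5.0
print(hypotenuse.__doc__)    # displays the docstring text
```

The single-line comment is ignored by the interpreter, whereas the docstring remains attached to the function at run time.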

The built-in help function can be used to view docstring comments for different functions, classes, or other Python features, as shown in the following code block. As the adjacent comment suggests, you should execute this function and change the argument to the help function to view documentation for other Python language components like int, complex, math, or list.

Another built-in function that you will frequently use is the print function, which, if necessary, converts its arguments to a string and displays the resulting string to STDOUT, which is generally the display.


In [3]:
help(print) # Try changing print to something different like int, complex, str, or list.
Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
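
The keyword arguments listed in this help text can be tried directly. The following sketch uses arbitrary example values; the io.StringIO buffer is used here only as a convenient file-like object to demonstrate the file argument.

```python
import io

print("a", "b", "c")             # default separator is a space: a b c
print("a", "b", "c", sep="-")    # custom separator: a-b-c
print("no newline yet", end="")  # suppress the trailing newline
print(" ... continued")
print(1, 2.5, True)              # non-string values are converted: 1 2.5 True

# Send the output to a file-like object instead of sys.stdout.
buffer = io.StringIO()
print("x", "y", sep=", ", end="!\n", file=buffer)
captured = buffer.getvalue()
print(repr(captured))            # 'x, y!\n'
```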


Student Exercise

In the empty Code cell below, write a simple Python comment and a multi-line comment, and use the help function to display the built-in documentation for the dict and str classes.


In [ ]:
 

Python Operators

Python supports the basic mathematical operators, and by importing the math library, the Python interpreter can mimic a scientific calculator. The following table presents the basic mathematical operators in order of precedence, from highest to lowest (operators in the same group, between horizontal rules, have the same precedence):

Operator   Description                Example
---------  -------------------------  ------------
()         Parentheses for grouping   (2 + 3)
---------  -------------------------  ------------
**         Exponentiation             2**3
---------  -------------------------  ------------
*          Multiplication             2*3.1
/          Division                   3.2 / 1.2
//         Integer (floor) division   5//2
%          Remainder                  5%2
---------  -------------------------  ------------
+          Addition                   1.45 + 3.14
-          Subtraction                5.3 - 2.125
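
The effect of operator precedence is easiest to see with a few small expressions. The values below are arbitrary and used only for illustration.

```python
# Exponentiation binds tighter than multiplication, which binds
# tighter than addition.
result_default = 2 + 3 * 4 ** 2     # 2 + (3 * 16) = 50

# Parentheses override the default order.
result_grouped = (2 + 3) * 4 ** 2   # 5 * 16 = 80

# Operators in the same group are evaluated from left to right.
same_level = 8 / 4 * 2              # (8 / 4) * 2 = 4.0

# The three division-related operators give different results.
quotient = 7 / 2                    # 3.5
floor_quotient = 7 // 2             # 3
remainder = 7 % 2                   # 1

print(result_default, result_grouped, same_level)
print(quotient, floor_quotient, remainder)
```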

When computing a quantity, you will often want to assign the value to a variable. This is done by using the assignment operator, =. On the other hand, you may want to test whether two values are the same, which is done with the equality operator, ==. In addition, Python provides augmented assignment operators that combine a basic mathematical operator (+, -, *, /, **, //, or %) with the assignment operator: +=, -=, *=, /=, **=, //=, and %=. These operators can shorten, and thus clarify, some expressions. As a simple example of an augmented assignment operator, the following two Python statements are equivalent:

a = a + 1
a += 1
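
As a slightly longer sketch, the block below applies each augmented assignment operator in turn to a hypothetical variable named count and then uses the == operator to test the final value.

```python
count = 10
count += 5     # 15
count -= 3     # 12
count *= 2     # 24
count //= 5    # 4
count **= 3    # 64
count %= 10    # 4

ratio = count
ratio /= 8     # 0.5; true division always produces a float

# = assigns a value, whereas == compares two values.
is_four = (count == 4)
print(count, ratio, is_four)
```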

The Python language also provides a number of built-in functions, which generally convert a number either to a different datatype or different precision. These built-in functions are always available. The following table presents some of the more useful built-in functions for use in mathematical expressions.

Built-in Function   Description
-----------------   ---------------------------------------------------------------
abs(x)              Returns the absolute value of x
divmod(x, y)        Returns both the quotient and remainder of x/y when using integer division
float(x)            Returns x as a floating-point value
int(x)              Returns x as an integer value
pow(x, y)           Returns x raised to the power y (x**y)
round(x, n)         Returns x rounded to n digits after the decimal point; if n is omitted, returns the nearest integer
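
The following sketch calls each of the built-in functions in the table with arbitrary example values. Note that int truncates toward zero rather than rounding.

```python
absolute = abs(-7.5)                     # 7.5
quotient, remainder = divmod(17, 5)      # (3, 2)
as_float = float(3)                      # 3.0
truncated_positive = int(3.9)            # 3
truncated_negative = int(-3.9)           # -3
power = pow(2, 10)                       # 1024
rounded_two_digits = round(3.14159, 2)   # 3.14
rounded_whole = round(2.7)               # 3, returned as an int

print(absolute, quotient, remainder, as_float)
print(truncated_positive, truncated_negative, power)
print(rounded_two_digits, rounded_whole)
```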

In addition to these operators and built-in functions, a number of mathematical functions are available in the Python math library, including constants, numerical testing functions, trigonometric functions, logarithmic functions, and several special functions like the error function. The math library is part of the Python standard library and is thus always available, but it must be imported before it can be used. The best approach is to add an import math statement at the top of a Python script or program. The following example code demonstrates how to use functions in the math library:

import math

x = 2.3
y = math.sqrt(x/2.3) - math.sin(x/math.pi) + math.log(x**2)

In this example code, we have introduced the use of variables to hold the result of a calculation. Python is a dynamically typed language; thus, we do not need to declare a variable and its type before using it. If a variable is reused and assigned a value of a different type, the variable takes on that new type. Python has a built-in type function that can always be used to ascertain the underlying data type of a variable, or of any other legal Python construct, as shown in the following code blocks.


In [4]:
x = 1
type(x)
Out[4]:
int
In [5]:
x = 3.2
type(x)
Out[5]:
float
In [6]:
type(print) # Try changing print to something different like 'Hello World!' or math
Out[6]:
builtin_function_or_method

Student Exercise

In the empty Code cell below, write a simple Python script to calculate the approximate number of seconds in one week. Use a variable to hold the calculation and print the answer within this notebook.


In [ ]:
 

Note that Python supports other operators that are used when working with Boolean data or to perform bit-wise operations. For conciseness, we do not discuss these operators in this notebook.

In the next set of code blocks, we present several examples that demonstrate how to use the basic Python mathematical operators and functions to compute different expressions. These blocks are meant to be executed, modified, and re-executed. One last point is that the Python interpreter provides a shorthand, a single underscore character _, which holds the result of the last expression displayed by the interpreter and can be used in a new expression. Note that this is not necessarily the result of the previous IPython Notebook cell; it is the last value that was displayed, which matters when a cell computes multiple values or displays none. The IPython Notebook extends this shorthand with repeated underscore characters that refer to earlier results: two underscores, __, refer to the second most recent displayed result, and three underscores, ___, refer to the third most recent. The following example shows the single underscore in the standard interpreter:

>>> a = 123
>>> a
123
>>> _ + 1
124

In [7]:
5 // 2
Out[7]:
2
In [8]:
5 % 2
Out[8]:
1
In [9]:
2.5 * 4.3 / 1.2 *(2 + 3 + 4 + 5) - 2.1**1.01
Out[9]:
123.3010280397591
In [10]:
2 * _  # Use the result from the previous calculation
Out[10]:
246.6020560795182
In [11]:
_ + __ + 1 # We now refer to the previous two calculations
Out[11]:
370.9030841192773
In [12]:
import math

math.sqrt(4)
Out[12]:
2.0
In [13]:
math.exp(0) + math.cos(0) # 1 + 1 = 2
Out[13]:
2.0

Best Practices: Version Management

A good practice to employ in notebooks is to display the version of all software tools that might be used within the notebook. Often this can easily be placed at the end of a notebook, where it will not distract from the primary intent of the material contained within the notebook. To display the version of Python used by the current notebook, we simply need to execute the following two lines. Note that we also can display the specific version of Python libraries (e.g., numpy, scipy, or matplotlib) that we use in a Notebook, which will be demonstrated later in this course.


In [14]:
import sys

print(sys.version)
3.6.7 | packaged by conda-forge | (default, Nov 21 2018, 02:32:25) 
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]
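
The full version string can be verbose. As a minimal sketch of two alternatives from the standard library, platform.python_version returns just the version number, and sys.version_info exposes the individual components so they can be compared in code.

```python
import platform
import sys

# Short version string, such as '3.6.7'.
print(platform.python_version())

# Individual components, useful for checks inside a program.
major, minor = sys.version_info[:2]
print(f"Python {major}.{minor}")

# Confirm that the notebook is running under Python 3 or later.
is_python3 = sys.version_info >= (3,)
print(is_python3)
```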

Ancillary information

The following links are to additional documentation that you might find helpful in learning this material. Reading these web-accessible documents is completely optional.

  1. The official Python3 Tutorial
  2. An official guide to Python for Beginners
  3. The introductory book A Byte of Python
  4. The book Think Python for Python3 provides a comprehensive view of Python for data science.
  5. The book Dive into Python is presented from a different, sometimes more advanced, viewpoint.

© 2017: Robert J. Brunner at the University of Illinois.

This notebook is released under the Creative Commons license CC BY-NC-SA 4.0. Any reproduction, adaptation, distribution, dissemination or making available of this notebook for commercial use is not allowed unless authorized in writing by the copyright holder.