2.7 Python functions with vector arguments and modules

For many numerical problems variables are most conveniently expressed by arrays containing many numbers (i.e. vectors) rather than single numbers (i.e. scalars). The function cd_sphere above takes a scalar as an argument and returns a scalar value too. For computationally intensive algorithms where variables are stored in arrays this is inconvenient and time consuming, as each of the array elements must be sent to the function independently. In the following, we will therefore show how to implement functions with vector arguments that also return vectors. This may be done in various ways. Some possibilities are presented in the following, and, as we shall see, some are more time consuming than others. We will also demonstrate how the time consumption (or efficiency) may be tested.

A simple extension of the single-valued function cd_sphere is as follows:

def cd_sphere_py_vector(ReNrs):
    CD = zeros_like(ReNrs)
    counter = 0

    for Re in ReNrs:
        CD[counter] = cd_sphere(Re)
        counter += 1
    return CD

The new function cd_sphere_py_vector takes in an array ReNrs and calculates the drag coefficient for each element using the previous function cd_sphere. This does the job, but is not very efficient.

A second version is implemented in the function cd_sphere_vector. This function takes in the array Re and calculates the drag coefficient of all elements by multiple calls of the function numpy.where; one call for each condition, similarly as each if statement in the function cd_sphere. The function is shown here:

def cd_sphere_vector(Re):
    "Computes the drag coefficient of a sphere as a function of the Reynolds number Re."
    # Curve fitted after fig . A -56 in Evett and Liu: "Fluid Mechanics and Hydraulics"

    from numpy import log10, array, polyval, where, zeros_like
    CD = zeros_like(Re)
   
    CD = where(Re < 0, 0.0, CD)     # condition 1
    
    CD = where((Re > 0.0) & (Re <=0.5), 24/Re, CD) # condition 2

    p = array([4.22, -14.05, 34.87, 0.658])
    CD = where((Re > 0.5) & (Re <=100.0), polyval(p, 1.0/Re), CD) #condition 3

    p = array([-30.41, 43.72, -17.08, 2.41])
    CD = where((Re > 100.0)  & (Re <= 1.0e4), polyval(p, 1.0/log10(Re)), CD) #condition 4

    p = array([-0.1584, 2.031, -8.472, 11.932])
    CD = where((Re > 1.0e4)  &  (Re <= 3.35e5), polyval(p, log10(Re)), CD) #condition 5

    CD = where((Re > 3.35e5) & (Re <= 5.0e5), 91.08*(log10(Re/4.5e5))**4 + 0.0764, CD) #condition 6

    p  = array([-0.06338, 1.1905, -7.332, 14.93])
    CD = where((Re > 5.05e5)  &  (Re <= 8.0e6), polyval(p, log10(Re)), CD) #condition 7
    
    CD = where(Re > 8.0e6, 0.2, CD)  # condition 8
    return CD

A third approach we will try is using boolean type variables. The 8 variables condition1 through condition8 in the function cd_sphere_vector_bool are boolean variables of the same size and shape as Re. The elements of the boolean variables evaluate to either True or False, depending on if the corresponding element in Re satisfy the condition the variable is assigned.

def cd_sphere_vector_bool(Re):
    "Computes the drag coefficient of a sphere as a function of the Reynolds number Re."
    # Curve fitted after fig . A -56 in Evett and Liu: "Fluid Mechanics and Hydraulics"
    
    from numpy import log10, array, polyval, zeros_like
       
    condition1 = Re < 0
    condition2 = logical_and(0 < Re, Re <= 0.5)
    condition3 = logical_and(0.5 < Re, Re <= 100.0)
    condition4 = logical_and(100.0 < Re, Re <= 1.0e4)
    condition5 = logical_and(1.0e4 < Re, Re <= 3.35e5)
    condition6 = logical_and(3.35e5 < Re, Re <= 5.0e5)
    condition7 = logical_and(5.0e5 < Re, Re <= 8.0e6)
    condition8 = Re > 8.0e6
    
    CD = zeros_like(Re)
    CD[condition1] = 0.0
    
    CD[condition2] = 24/Re[condition2]
    
    p = array([4.22,-14.05,34.87,0.658])
    CD[condition3] = polyval(p,1.0/Re[condition3]) 
    
    p = array([-30.41,43.72,-17.08,2.41])
    CD[condition4] = polyval(p,1.0/log10(Re[condition4]))
    
    p = array([-0.1584,2.031,-8.472,11.932])
    CD[condition5] = polyval(p,log10(Re[condition5]))
    
    CD[condition6] = 91.08*(log10(Re[condition6]/4.5e5))**4 + 0.0764
    
    p  = array([-0.06338,1.1905,-7.332,14.93])
    CD[condition7] = polyval(p,log10(Re[condition7]))
    
    CD[condition8] = 0.2
    
    return CD

Lastly, the built-in function vectorize is used to automatically generate a vector-version of the function cd_sphere, as follows:

cd_sphere_auto_vector = vectorize(cd_sphere)

To provide a convenient and practical means to compare the various implementations of the drag function, we have collected them all in a file DragCoefficientGeneric.py. This file constitutes a Python module which is a concept we will discuss in the section 2.8 How to make a Python-module and some useful programming features.

2.8 How to make a Python-module and some useful programming features

Python modules A module is a file containing Python definitions and statements and represents a convenient way of collecting useful and related functions, classes or Python code in a single file. A motivation to implement the drag coefficient function was that we should be able to import it in other programs to solve e.g. the problems outlined in (2.6.4 Example: Sphere in free fall). In general, a file containing Python-code may be executed either as a main program (script), typically with python filename.py or imported in another script/module with import filename.

A module file should not execute a main program, but rather just define functions, import other modules, and define global variables. Inside modules, the standard practice is to only have functions and not any statements outside functions. The reason is that all statements in the module file are executed from top to bottom during import in another module/script [4], and thus a desirable behavior is no output to avoid confusion. However, in many situations it is also desirable to allow for tests or demonstration of usage inside the module file, and for such situations the need for a main program arises. To meet these demands Python allows for a fortunate construction to let a file act both as a module with function definitions only (i.e. no main program) and as an ordinary program we can run, with functions and a main program. The latter is possible by letting the main program follow an if test of the form:

if __name__ =='__main__':
    <main program statements>

The __name__ variable is automatically defined in any module and equals the module name if the module is imported in another program/script, but when the module file is executed as a program, __name__ equals the string '__main__'. Consequently, the if test above will only be true whenever the module file is executed as a program and allow for the execution of the <main program statements>. The <main program statements> is normally referred to as the test block of a module.

The module name is the file name without the suffix .py [4], i.e. the module contained in the module file filename.py has the module name filename. Note that a module can contain executable statements as well as function definitions. These statements are intended to initialize the module and are executed only the first time the module name is encountered in an import statement. They are also run if the file is executed as a script.

Below we have listed the content of the file DragCoefficientGeneric.py to illustrate a specific implementation of the module DragCoefficientGeneric and some other useful programming features in Python. The functions in the module are the various implementations of the drag coefficient functions from the previous section.

Python lists and dictionaries

Lists hold a list of values and are initialized with empty brackets, e.g. fncnames = []. The values of the list are accessed with an index, starting from zero. The first value of fncnames is fncnames[0], the second value of fncnames is fncnames[1] and so on. You can remove values from the list, and add new values to the end by fncnames. Example: fncnames.append(name) will append name as the last value of the list fncnames. In case it was empty prior to the append-operation, name will be the only element in the list.
Dictionaries are similar to what their name suggests - a dictionary - and an empty dictionary is initialized with empty braces, e.g. CD = {}. In a dictionary, you have an 'index' of words or keys and the values accessed by their 'key'. Thus, the values in a dictionary aren't numbered or ordered, they are only accessed by the key. You can add, remove, and modify the values in dictionaries. For example with the statement CD[name] = func(ReNrs) the results of func(ReNrs) are stored in the list CD with key name.

To illustrate a very powerful feature of Python data structures allowing for lists of e.g. function objects we put all the function names in a list with the statement:

    funcs = [cd_sphere_py_vector, cd_sphere_vector, cd_sphere_vector_bool, \ 
             cd_sphere_auto_vector]  # list of functions to test

which allows for convenient looping over all of the functions with the following construction:

    for func in funcs:

Exception handling Python has a very convenient construction for testing of potential errors with try-except blocks:

try:
    <statements>
except ExceptionType1:
    <remedy for ExceptionType1 errors>
except ExceptionType2:
    <remedy for ExceptionType1 errors>
except:
    <remedy for any other errors>

In the DragCoefficientGeneric module, this feature is used to handle the function name for a function which has been vectorized automatically. For such a function func.func_name has no value and will return an error, and the name may be found by the statements in the exception block.

Efficiency and benchmarking The function clock in the module time, return a time expressed in seconds for the current statement and is frequently used for benchmarking in Python or timing of functions. By subtracting the time t0 recored immediately before a function call from the time immediately after the function call, an estimate of the elapsed cpu-time is obtained. In our module DragCoefficientGeneric the efficiency is implemented with the codelines:

        t0 = time.time()
        CD[name] = func(ReNrs) 
        exec_times[name] = time.time() - t0

Sorting of dictionaries The computed execution times are for convenience stored in the dictionary exec_time to allow for pairing of the names of the functions and their execution time. The dictionary may be sorted on the values and the corresponding keys sorted are returned by:

    exec_keys_sorted = sorted(exec_times, key=exec_times.get)

Afterwards the results may be printed with name and execution time, ordered by the latter, with the most efficient function at the top:

    for name_key in exec_keys_sorted:
        print(name_key, '\t execution time = ', '%6.6f' % exec_times[name_key])

By running the module DragCoefficientGeneric as a script, and with 500 elements in the ReNrs array we got the following output:

cd_sphere_vector_bool   execution time = 0.000312
cd_sphere_vector        execution time = 0.000641
cd_sphere_auto_vector   execution time = 0.009497
cd_sphere_py_vector     execution time = 0.010144

Clearly, the function with the boolean variables was fastest, the straight forward vectorized version cd_sphere_py_vector was slowest and the built-in function vectorize was nearly as inefficient.

The complete module DragCoefficientGeneric is listed below.

# src-ch1/DragCoefficientGeneric.py

from numpy import linspace,array,append,logspace,zeros_like,where,vectorize,\
    logical_and
import numpy as np  
from matplotlib.pyplot import loglog,xlabel,ylabel,grid,savefig,show,rc,\
    legend, setp

from numpy.core.multiarray import scalar

# single-valued function
def cd_sphere(Re):
    "Computes the drag coefficient of a sphere as a function of the Reynolds number Re."
    # Curve fitted after fig . A -56 in Evett and Liu: "Fluid Mechanics and Hydraulics"
    
    from numpy import log10,array,polyval    
    
    if Re <= 0.0:
        CD = 0.0
    elif Re > 8.0e6:
        CD = 0.2
    elif Re > 0.0 and Re <= 0.5:
        CD = 24.0/Re
    elif Re > 0.5 and Re <= 100.0:
        p = array([4.22,-14.05,34.87,0.658])
        CD = polyval(p,1.0/Re) 
    elif Re > 100.0 and Re <= 1.0e4:
        p = array([-30.41,43.72,-17.08,2.41])
        CD = polyval(p,1.0/log10(Re))
    elif Re > 1.0e4 and Re <= 3.35e5:
        p = array([-0.1584,2.031,-8.472,11.932])
        CD = polyval(p,log10(Re))
    elif Re > 3.35e5 and Re <= 5.0e5:
        x1 = log10(Re/4.5e5)
        CD = 91.08*x1**4 + 0.0764
    else:
        p = array([-0.06338,1.1905,-7.332,14.93])
        CD = polyval(p,log10(Re))
    return CD

# simple extension cd_sphere
def cd_sphere_py_vector(ReNrs):
    CD = zeros_like(ReNrs)
    counter = 0

    for Re in ReNrs:
        CD[counter] = cd_sphere(Re)
        counter += 1
    return CD

# vectorized function
def cd_sphere_vector(Re):
    "Computes the drag coefficient of a sphere as a function of the Reynolds number Re."
    # Curve fitted after fig . A -56 in Evett and Liu: "Fluid Mechanics and Hydraulics"

    from numpy import log10, array, polyval, where, zeros_like
    CD = zeros_like(Re)
   
    CD = where(Re < 0, 0.0, CD)     # condition 1
    
    CD = where((Re > 0.0) & (Re <=0.5), 24/Re, CD) # condition 2

    p = array([4.22, -14.05, 34.87, 0.658])
    CD = where((Re > 0.5) & (Re <=100.0), polyval(p, 1.0/Re), CD) #condition 3

    p = array([-30.41, 43.72, -17.08, 2.41])
    CD = where((Re > 100.0)  & (Re <= 1.0e4), polyval(p, 1.0/log10(Re)), CD) #condition 4

    p = array([-0.1584, 2.031, -8.472, 11.932])
    CD = where((Re > 1.0e4)  &  (Re <= 3.35e5), polyval(p, log10(Re)), CD) #condition 5

    CD = where((Re > 3.35e5) & (Re <= 5.0e5), 91.08*(log10(Re/4.5e5))**4 + 0.0764, CD) #condition 6

    p  = array([-0.06338, 1.1905, -7.332, 14.93])
    CD = where((Re > 5.05e5)  &  (Re <= 8.0e6), polyval(p, log10(Re)), CD) #condition 7
    
    CD = where(Re > 8.0e6, 0.2, CD)  # condition 8
    return CD

# vectorized boolean
def cd_sphere_vector_bool(Re):
    "Computes the drag coefficient of a sphere as a function of the Reynolds number Re."
    # Curve fitted after fig . A -56 in Evett and Liu: "Fluid Mechanics and Hydraulics"
    
    from numpy import log10, array, polyval, zeros_like
       
    condition1 = Re < 0
    condition2 = logical_and(0 < Re, Re <= 0.5)
    condition3 = logical_and(0.5 < Re, Re <= 100.0)
    condition4 = logical_and(100.0 < Re, Re <= 1.0e4)
    condition5 = logical_and(1.0e4 < Re, Re <= 3.35e5)
    condition6 = logical_and(3.35e5 < Re, Re <= 5.0e5)
    condition7 = logical_and(5.0e5 < Re, Re <= 8.0e6)
    condition8 = Re > 8.0e6
    
    CD = zeros_like(Re)
    CD[condition1] = 0.0
    
    CD[condition2] = 24/Re[condition2]
    
    p = array([4.22,-14.05,34.87,0.658])
    CD[condition3] = polyval(p,1.0/Re[condition3]) 
    
    p = array([-30.41,43.72,-17.08,2.41])
    CD[condition4] = polyval(p,1.0/log10(Re[condition4]))
    
    p = array([-0.1584,2.031,-8.472,11.932])
    CD[condition5] = polyval(p,log10(Re[condition5]))
    
    CD[condition6] = 91.08*(log10(Re[condition6]/4.5e5))**4 + 0.0764
    
    p  = array([-0.06338,1.1905,-7.332,14.93])
    CD[condition7] = polyval(p,log10(Re[condition7]))
    
    CD[condition8] = 0.2
    
    return CD
    

if __name__ == '__main__':              
#Check whether this file is executed (name==main) or imported as a module
    
    import time
    from numpy import mean
    
    CD = {} # Empty list for all CD computations
    

    ReNrs = logspace(-2,7,num=500)
    
    # make a vectorized version of the function automatically
    cd_sphere_auto_vector = vectorize(cd_sphere) 
    
    # make a list of all function objects
    funcs = [cd_sphere_py_vector, cd_sphere_vector, cd_sphere_vector_bool, \
             cd_sphere_auto_vector]  # list of functions to test   
    
    
    # Put all exec_times in a dictionary and fncnames in a list 
    exec_times = {}
    fncnames = []
    for func in funcs:
        try:
            name = func.__name__
        except: 
            scalarname = func.__getattribute__('pyfunc')
            name = scalarname.__name__+'_auto_vector'
                      
        fncnames.append(name)
                                      
        # benchmark 
        t0 = time.time()
        CD[name] = func(ReNrs) 
        exec_times[name] = time.time() - t0
    
    # sort the dictionary exec_times on values and return a list of the corresponding keys   
    exec_keys_sorted = sorted(exec_times, key=exec_times.get)
    
    # print the exec_times by ascending values
    for name_key in exec_keys_sorted:
        print(name_key, '\t execution time = ', '%6.6f' % exec_times[name_key])

        
    # set fontsize prms 
    fnSz = 16; font = {'size'   : fnSz}; rc('font',**font)          
    
    # set line styles
    style = ['v-', '8-', '*-', 'o-']
    mrkevry = [30, 35, 40, 45]
    
    # plot the result for all functions    
    i=0
    for name in fncnames: 
        loglog(ReNrs, CD[name], style[i], markersize=10, markevery=mrkevry[i])
#        hold('on')
        i+=1 
    
    # use fncnames as plot legend
    leg = legend(fncnames) 
    leg.get_frame().set_alpha(0.)
    xlabel('$Re$')
    ylabel('$C_D$')
    grid('on', 'both', 'both')
#    # savefig('example_sphere_generic.png', transparent=True) # save plot if needed
    show()