Python

Helpful Python notes and errors

Python is a programming language

General

  • Operators

    # boolean operations
    x or y
    x and y
    not x
    
    # comparisons
    1 < 2       # less than
    1 <= 2      # less than or equal
    2 > 1       # greater than
    2 >= 1      # greater than or equal
    2 == 2      # equal
    2 != 1      # not equal
    a is b      # object identity, true if variables point to same object
    a is not b  # negated object identity, true if variables don't point to same object
    
    # bitwise operations (on integer types)
    x | y   # bitwise or
    x ^ y   # bitwise xor
    x & y   # bitwise and
    x << n  # x bitshifted left by n bits
    x >> n  # x bitshifted right by n bits
    ~x      # bits of x are inverted
    
    # membership operators
    a in b      # bool indicating if sequence b has an element with value given by a
    a not in b  # negation of the above statement
  • Standard Data Types

    # (broad) principal built-in types: numerics, sequences, mappings, 
    # classes, instances, and exceptions
    # (specific) five standard data types: numbers, strings, list, tuple, 
    # dictionary
    
    # numbers -- 3 types: int, float, complex
    # (names like int, str, list are avoided here since they would shadow the built-in types)
    an_int = 100
    a_float = 32.5e10
    a_complex = 3.14j
    
    # strings
    a_str = 'Hello world'
    
    # lists -- mutable sequences
    a_list = ['a','b','c']
    
    # tuples -- immutable sequences
    a_tuple = ('a','b','c')
    
    # dictionaries
    a_dict = {'a':1, 'b':2, 'c':3}
  • Built-in Functions

    # enumerate
    
    # sorted :: returns new sorted array, doesn't affect original
    arr = [1,5,3,2]
    sarr = sorted(arr)
    >> sarr = [1,2,3,5]
    
    # sort :: sorts an array in-place
    arr.sort()
    >> arr = [1,2,3,5]
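
The `# enumerate` entry above has no example yet. A minimal sketch of how the built-in is typically used follows; the list `letters` is just a made-up example value.

```python
letters = ['a', 'b', 'c']

# enumerate yields (index, item) pairs, so no manual counter is needed
pairs = list(enumerate(letters))

# the optional start argument changes the first index
pairs_from_one = list(enumerate(letters, start=1))

# typical use: unpack the pair directly in a for loop
for i, letter in enumerate(letters):
    print(i, letter)
```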

Data Structures

  • Sequences

    • Common Sequence Operations

      # for all sequences (strings, lists, tuples, etc)
      s = ['a','b','c']
      
      # element in sequence
      'a' in s
      >> True
      'e' not in s
      >> True
      
      # concatenation, repetition
      s+s
      >> ['a','b','c','a','b','c']
      s*3
      >> ['a','b','c','a','b','c','a','b','c']
      
      # single indexing
      s[0]
      >> 'a'
      
      # slice of sequence s[i:j]
      s[1:3]
      >> ['b','c']
      
      # slice of sequence with step k s[i:j:k]
      s[::2] 
      >> ['a','c']
      
      # length of sequence
      len(s)
      >> 3
      
      # min and max element of sequence
      min(s), max(s)
      >> ('a','c')
      
      # index of first occurrence of element in sequence (optionally at/after index i and before index j)
      s.index('b')
      >> 1
      
      # number of occurrences of element in sequence
      s.count('a')
      >> 1
    • Lists

      • General Methods

        # create list
        fruits = ['orange', 'apple', 'pear', 'banana', 'apple']
        
        # count occurrences of item in list
        fruits.count('apple')
        >> 2
        
        # get index of given item
        fruits.index('orange')
        >> 0
        
        # restrict index search with list.index(x[, start[, end]])
        fruits.index('apple', 2)
        >> 4
        
        # reverse the list in place i.e. no new assignment needed
        fruits.reverse()
        >> ['apple', 'banana', 'pear', 'apple', 'orange']
        
        # append an item to end of list
        fruits.append('grape')
        >> ['apple', 'banana', 'pear', 'apple', 'orange', 'grape']
        
        # insert an item x at (before) a specific index i list.insert(i, x)
        fruits.insert(2, 'kiwi')
        >> ['apple', 'banana', 'kiwi', 'pear', 'apple', 'orange', 'grape']
        
        # sort list in place
        fruits.sort()
        >> ['apple', 'apple', 'banana', 'grape', 'kiwi', 'orange', 'pear']
        
        # remove and return item at end of list or at given index list.pop([i])
        fruits.pop()
        >> ['apple', 'apple', 'banana', 'grape', 'kiwi', 'orange']
        
        # remove first item with value equal to x list.remove(x)
        fruits.remove('apple')
        >> ['apple', 'banana', 'grape', 'kiwi', 'orange']
      • Lists as Stacks

        # stacks are 'last in, first out'
        stack = [3, 4, 5]
        stack.append(6)
        >> [3, 4, 5, 6]
        
        # pop off most recently appended item
        stack.pop()
        >> 6
        
        stack.pop()
        >> 5
        
        stack
        >> [3, 4]
      • Lists as Queues

        from collections import deque
        
        # queues are first in, first out
        queue = deque(['Eric', 'John', 'Michael'])
        
        # append item to end of queue
        queue.append('Terry')
        >> deque(['Eric', 'John', 'Michael', 'Terry'])
        
        # remove 'first-in' items
        queue.popleft()
        >> 'Eric'
        
        queue.popleft()
        >> 'John'
        
        queue
        >> deque(['Michael', 'Terry'])
      • List Comprehensions

        # list comps are concise ways of creating lists
        squares = [x**2 for x in range(5)]
        >> [0, 1, 4, 9, 16]
        
        # double for loops and conditions
        lc = [(x,y) for x in [1,2,3] for y in [4,2,5] if x != y]
        >> [(1, 4), (1, 2), (1, 5), (2, 4), (2, 5), (3, 4), (3, 2), (3,5)]
        
        # the above lc is equivalent to
        lc = []
        for x in [1,2,3]:
            for y in [4,2,5]:
                if x != y:
                    lc.append((x, y))
        >> [(1, 4), (1, 2), (1, 5), (2, 4), (2, 5), (3, 4), (3, 2), (3,5)]
        
        # nested list comps -- matrix transpose example
        matrix = [
            [1,2,3,4],
            [5,6,7,8],
            [9,10,11,12],
        ]
        
        # transpose operation as list comp
        tr = [[row[i] for row in matrix] for i in range(4)]
        >> [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
        
        # alternatively, we could use the built-in zip(*iterables)
        # asterisk expands list as multiple args to zip()
        # note: zip yields tuples, so the rows come back as tuples
        list(zip(*matrix))
        >> [(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)]
    • Strings

      • Common Methods

          words = 'some words'
        
          # replace first n occurrences of given substring
          words.replace('words', 'things', 1)
          >> 'some things'
        
          # capitalize first letter of string
          words.capitalize()
          >> "Some words"
        
          # count num of occurrences of substring in range [start, end), i.e. end is excluded
          words.count('so',3,6)
          >> 0
        
          # find lowest index where substring is found within [start, end); returns -1 if not found
          words.find('wo')
          >> 5
        
          # formatting
          'sum of 1+2 is {0}'.format(1+2)
          >> "sum of 1+2 is 3"
        
          # find index of substring, same as find but raises error w/o finding
          words.index('or')
          >> 6
        
          # join strings in passed iterable with separator b/w strings
          ' '.join(['some','words'])
          >> 'some words'
        
          # split string into list by separator
          words.split(' ')
          >> ['some', 'words']
        
          # strip any of the specified characters (a set, not a substring) from the head and tail of the string
          words.strip('so')
          >> 'me word'
        
          # make string titlecased ie uppercase first letter of all words
          words.title()
          >> 'Some Words'
        
          str.isalnum()   # returns true if all characters are alphanumeric
          str.isalpha()   # returns true if all characters are alphabetic
          str.isascii()   # returns true if all characters are ascii
          str.isdecimal() # returns true if all characters are decimal
          str.isdigit()   # returns true if all characters are digits
          str.islower()   # returns true if all characters are lowercase
          str.isupper()   # returns true if all characters are uppercase
          str.isnumeric() # returns true if all characters are numeric
  • Generators

    By definition, a function that uses the yield keyword instead of return returns a generator object when called. This generator object has a __next__() method, normally invoked through the built-in next(), that can be called until the generator is exhausted, at which point StopIteration is raised. Each yield in a generator function serves as a stopping point where the state of local variables and the execution context is frozen and the yielded value is returned. Upon the next call to next(), the generator picks up where it left off, resuming computation with the same context as when the last yield was executed. Generators are a subset of iterators, and generator functions are generally just an easy way of creating iterators. Otherwise, to create an iterator, you have to define a class with its own iterator-specific methods (as explained below).

    It is handy to use “generator comprehensions” (generator expressions), which are just like list comprehensions but use parentheses () instead of square brackets [].
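
A minimal sketch of the pause-and-resume behaviour described above. The function name `countdown` and the values used are made up for illustration.

```python
def countdown(n):
    """Generator function: yields n, n-1, ..., 1."""
    while n > 0:
        yield n  # execution pauses here until the next call to next()
        n -= 1   # resumes here, with n preserved from the previous call

gen = countdown(3)
first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes after the yield and runs to the next one
rest = list(gen)    # consumes whatever is left

# calling next() on an exhausted generator raises StopIteration
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

# generator comprehension: like a list comprehension, but evaluated lazily
squares = (x**2 for x in range(5))
total = sum(squares)
```

Because values are produced one at a time, the generator comprehension never builds the full list of squares in memory.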

  • Iterators

    By definition, iterators are objects with an __iter__ and __next__ method.
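
For comparison with generator functions, here is a sketch of the same kind of countdown written as an explicit iterator class. `Countdown` is a hypothetical class, not something from the standard library.

```python
class Countdown:
    """Iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # an iterator returns itself from __iter__
        return self

    def __next__(self):
        # signal exhaustion the same way a finished generator does
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

values = list(Countdown(3))

# for loops call iter() and then next() behind the scenes
for v in Countdown(2):
    print(v)
```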

Modules

  • Project structure

    /Project
      /project
        __init__.py
        source.py
      README.md
      requirements.txt

  • Module: a single Python file, takes on the name of the file (without the .py extension)

  • Package: a collection of modules, a directory of Python files with an __init__.py file placed in it.

    • Python will treat a package as a module if imported directly. This imported module, however, only provides the functions/objects defined in the __init__.py file, including any names that file has imported from submodules. This means that I can structure files/classes however I like as a directory, and control what Python exposes at the package level by choosing what to make available in __init__.py. Another way to think about it: __init__.py defines the namespace of the package it belongs to. Think of the __init__.py file as a normal Python file; every name it defines or imports is registered and accessible under the package name when the package is used elsewhere.

    • The only things available at the top level of a package are functions or objects available directly (i.e. within an __init__.py file). Submodules/subpackages are not available directly from the top level import, and must be imported explicitly. For example, if I have the package structure

        a_package/
           __init__.py
           module_a.py
           a_sub_package/
             __init__.py
             module_b.py

      and I run import a_package, things like a_package.module_a or a_package.a_sub_package.module_b will not exist (unless __init__.py imports them itself). The only things accessible under the name a_package are the objects included in the top-level __init__.py file; for example, if the file imported/defined a_name, then a_package.a_name will be accessible. Note that when importing a module, the name used to import will have access to all objects within the module. To access a submodule like module_a, you must import it directly: import a_package.module_a, which will then allow you to access objects in module_a via a_package.module_a.<object_name>. Note that when using import a.b.c, the last name used (i.e. c) must be a package/module, and not an object. To bring the object directly into the namespace, you can instead use from package.module import object. Finally, to access an object in module_b, you could do

      1. import a_package.a_sub_package.module_b, and access the object with a_package.a_sub_package.module_b.<object_name>
      2. from a_package.a_sub_package.module_b import <object_name>, and access the object directly with <object_name>

      This logic applies regardless of the level of module/package nesting.
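
The import behaviour described for the a_package layout can be checked with a throwaway copy of the package. This is only a sketch: it builds the files in a temporary directory, and the contents written to each file (a_name, VALUE_A, VALUE_B) are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# build the a_package layout from the notes in a temporary directory
root = tempfile.mkdtemp()
sub = os.path.join(root, 'a_package', 'a_sub_package')
os.makedirs(sub)

files = {
    os.path.join(root, 'a_package', '__init__.py'): "a_name = 'defined in __init__'\n",
    os.path.join(root, 'a_package', 'module_a.py'): "VALUE_A = 1\n",
    os.path.join(sub, '__init__.py'): "",
    os.path.join(sub, 'module_b.py'): "VALUE_B = 2\n",
}
for path, content in files.items():
    with open(path, 'w') as f:
        f.write(content)

# make the temporary directory importable
sys.path.insert(0, root)
importlib.invalidate_caches()

import a_package
top_level_name = a_package.a_name                     # defined in __init__.py
has_module_a_before = hasattr(a_package, 'module_a')  # submodule not loaded yet

import a_package.module_a
has_module_a_after = hasattr(a_package, 'module_a')   # now bound on the package
value_a = a_package.module_a.VALUE_A

# bring a single object from a nested module straight into the namespace
from a_package.a_sub_package.module_b import VALUE_B
```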

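    • Note: imports from other modules depend on how and where the script is run. When a file is run directly (python path/to/script.py), the directory containing that script is placed at the front of the Python search path (sys.path); when it is run with -m, the current working directory is used instead. So if a script needs to import from a module in a parent directory, it can do so as long as that parent directory ends up on the search path, for example by running from the parent directory with -m. Any modules imported from there have access to the same root directory. More on this here.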

    • Another note: to run a script from within a package where you want the root directory context, use the -m option: python -m path.to.module. This will run the Python file at path/to/module.py as __main__. Note that path will be considered the “top-level package” when you execute the command. This means any relative imports used by the package will work, so long as the package itself is included as the first word in path.to.module. Otherwise Python will end up discarding the true package context and your relative imports will likely break since you’re starting from a different location.

    • On Relative Imports: relative imports are only relative within a package. Use dots to indicate where you want to import surrounding modules/objects from while within a package, even when within the same folder (i.e. use . in that case). Relative imports make sense to use when developing a package, and are the method by which I will be developing packages.
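
A small experiment showing why the -m option matters for relative imports. This is a sketch under assumptions: the package name pkg and its files helper.py and main.py are hypothetical and are created in a temporary directory, and the interpreter is invoked through subprocess.

```python
import os
import subprocess
import sys
import tempfile

# create a tiny package: pkg/__init__.py, pkg/helper.py, pkg/main.py
root = tempfile.mkdtemp()
pkg = os.path.join(root, 'pkg')
os.makedirs(pkg)

sources = {
    '__init__.py': "",
    'helper.py': "VALUE = 42\n",
    # the leading dot means "import from the package this file lives in"
    'main.py': "from .helper import VALUE\nprint(VALUE)\n",
}
for name, content in sources.items():
    with open(os.path.join(pkg, name), 'w') as f:
        f.write(content)

# run as a module from the directory containing the package:
# pkg is the top-level package, so the relative import resolves
as_module = subprocess.run(
    [sys.executable, '-m', 'pkg.main'],
    cwd=root, capture_output=True, text=True,
)

# run the same file directly as a script: there is no package context,
# so the relative import fails with an ImportError
as_script = subprocess.run(
    [sys.executable, os.path.join(pkg, 'main.py')],
    cwd=root, capture_output=True, text=True,
)

print(as_module.stdout.strip())
print(as_script.returncode != 0)
```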

    • A possible “one stop shop” to all import questions: https://stackoverflow.com/questions/14132789/relative-imports-for-the-billionth-time

    • A question related to importing subpackages and issues: https://stackoverflow.com/questions/12229580/python-importing-a-sub-package-or-sub-module

  • Installing locally: I recently wanted to install my library (problib) locally for use within an app (probapi). This would allow me to treat my package like a true package installed from PyPI or similar. Local installation can be achieved by running pip install -e /path/to/lib (preferably in a venv). The -e flag installs in editable mode, so changes to the source take effect without reinstalling.

  • Packaging Python: When attempting local installation as described above, I immediately ran into some errors. My local module was not properly packaged and ready to be installed. Here is the official Python packaging tutorial, along with a fantastic additional resource for structuring packages and more.

  • Automatic Documentation w/ Sphinx: Sphinx is a program (among others) that generates documentation for your Python package. See the problib docs folder and config file for the basics. Right now, generating documentation automatically requires the following commands:

      sphinx-apidoc -f -o source/ ../problib
      make html

Exception Handling

Input & Output

Reading and Writing Files

# read (print) all file contents
with open('/path/to/file.txt', 'r') as f:
    print(f.read())

# read (print) file contents line by line
with open('/path/to/file.txt', 'r') as f:
    for line in f:
        print(line, end='')

# truncate (empty) file and open for writing
with open('/path/to/file.txt', 'w') as f:
    f.write('some text')
  • open(file, mode='r') :: opens specified file for given mode
    • r :: open for reading
    • w :: open for writing, creating the file if needed and first deleting all existing contents
    • x :: open for exclusive creation (and allow writing). That is, if the file exists raise FileExistsError, else create it and allow writing.
    • a :: open for writing, appending to the end of the file if it exists and creating it otherwise
    • + :: open for reading and writing. This means you can write to the file, and then use seek to navigate forwards and backwards in the file to read at different places.
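
The modes above can be tried out safely on a scratch file. A minimal sketch, assuming a temporary directory and a made-up file name notes.txt:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

# 'x' creates the file, and fails if it already exists
with open(path, 'x') as f:
    f.write('first\n')

try:
    open(path, 'x')
    x_failed = False
except FileExistsError:
    x_failed = True

# 'a' appends to the existing contents
with open(path, 'a') as f:
    f.write('second\n')

with open(path, 'r') as f:
    after_append = f.read()

# 'w' truncates the file before writing
with open(path, 'w') as f:
    f.write('replaced\n')

# 'r+' reads and writes without truncating; seek moves the position
with open(path, 'r+') as f:
    f.seek(0, os.SEEK_END)  # jump to the end before writing
    f.write('more\n')
    f.seek(0)               # jump back to the start to read everything
    final = f.read()
```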

Classes

Basic Structure Example

# parent class definition
class Animal():
    '''documentation string, accessible via Animal.__doc__'''
    
    # class variables
    species = 20

    # constructor
    def __init__(self, name, age):
        # instance variables
        self.name = name
        self.age = age

    # instance method
    def get_name(self):
        return self.name

    # class method, receives the class (cls) and can only modify class state
    @classmethod
    def add_species(cls):
        cls.species += 1

    # static method, independent of class
    @staticmethod
    def animal_count(animals):
        return len(animals)

Instantiation

# create instance of Animal
animal = Animal('tom', 10)

# call instance method
animal.get_name()

# call class method
Animal.add_species()

Inheritance

# child class inheriting from Animal
class Dog(Animal):

    # overriding parent method
    def get_name(self):
        return 'Dog' + self.name

Note that in this example, the derived constructor is not defined. Here the base class constructor is called, as one might expect. Should the derived class Dog want its own constructor, we can simply define one. However, this constructor will not by default call the base class constructor, meaning it will not be initialized in the same fashion (i.e. create instance variables) as the base class. If we desire the initialization process from the base class, we can make use of super().__init__():

class Dog(Animal):

    def __init__(self, breed, name, age):
        # initialize derived like base
        super().__init__(name, age)
        self.breed = breed

Default class attributes

  • __doc__ :: class documentation string, if it exists
  • __dict__ :: dictionary containing the class's namespace
  • __name__ :: class name
  • __module__ :: module name in which the class is defined
  • __bases__ :: tuple containing the base classes

Default class methods

  • __init__(self[, args]) :: constructor, initializes a newly created instance
  • __del__(self) :: finalizer, called when the object is about to be destroyed
  • __repr__(self) :: evaluatable string representation
  • __str__(self) :: printable string representation
  • __cmp__(self, other) :: object comparison (Python 2 only; Python 3 uses __eq__, __lt__, __le__, __gt__, __ge__ instead)

Operator overloading

  • __add__(self, other) :: addition operator overload
  • __sub__(self, other) :: subtraction operator overload
  • __mul__(self, other) :: multiplication operator overload
  • __truediv__(self, other) :: division operator overload (/); this was __div__ in Python 2. Floor division (//) uses __floordiv__
  • __pow__(self, other) :: exponentiation operator overload
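
A short sketch of how these special methods fit together. `Vector` is a hypothetical 2D vector class written for this example, using the Python 3 method names.

```python
class Vector:
    """Hypothetical 2D vector used to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Vector({self.x}, {self.y})'

    def __eq__(self, other):
        # returning NotImplemented lets Python try the other operand
        if not isinstance(other, Vector):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __sub__(self, other):
        return Vector(self.x - other.x, self.y - other.y)

    def __mul__(self, scalar):
        return Vector(self.x * scalar, self.y * scalar)

    def __truediv__(self, scalar):
        return Vector(self.x / scalar, self.y / scalar)

a = Vector(1, 2)
b = Vector(3, 4)
total = a + b   # calls a.__add__(b)
diff = b - a    # calls b.__sub__(a)
scaled = a * 3  # calls a.__mul__(3)
halved = b / 2  # calls b.__truediv__(2)
```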

Object Destruction (garbage collection)

Objects whose reference count reaches zero are deleted automatically (in CPython; a separate cyclic garbage collector handles reference cycles).

Attribute Decorators

  • @property :: a method decorator interface for the property object descriptor. Decorates object attribute getting, setting, and deleting. This decorator is pretty much useless outside of the case where you decide you want to define some getters and setters on top of an object that doesn’t already have them. A quote from here: “The great thing about properties is not that they replace getters and setters, its that you don’t have to write them to future-proof your code.”

      class Animal():
          def __init__(self, name):
              self.name = name
              self._age = None  # backing attribute used by the age property
    
          @property
          def age(self):
              '''
              Assume an animal's age was related to its name. Here we can 
              define age to be a property of the class, similar to name, but
              wrap our getters and setters with the @property decorator. The
              vanilla "@property" defines a getter. This can now be accessed
              as A.age, where A is an instance of the Animal class.
              '''
              if self._age is None:
                  return len(self.name)
              return self._age
    
          @age.setter
          def age(self, age):
              '''
              The @<property name>.setter decorator (here @age.setter) provides
              a setter wrapper for the previously defined property under the
              same function name. The value is stored in _age: assigning to
              self.age here would call this setter again and recurse forever.
              '''
              self._age = age

Common Gotchas

Great article here on commonly missed issues and how to resolve them

Mutable Default Parameters

When recently trying to use an empty list as a default parameter, I was observing very odd behavior creating objects. Somehow the parameter was being populated without me specifying anything! Turns out Python only evaluates default arguments once when the function is declared, so it was creating a single list in memory and I was effectively passing that list (and any modifications that have since been made) into the function every time it was called. This is easy to fix; just be more explicit and use None.
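
A minimal reproduction of the problem and the None fix described above. The function names append_bad and append_good are made up for this sketch.

```python
def append_bad(item, bucket=[]):
    # the default list is created once, when the function is defined,
    # and the same list object is reused on every call
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # None acts as a sentinel; a fresh list is created on each call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

bad_first = list(append_bad(1))   # copied so later calls don't change it
bad_second = list(append_bad(2))  # still contains the 1 from the first call
good_first = append_good(1)
good_second = append_good(2)

print(bad_first, bad_second)
print(good_first, good_second)
```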

Web

  • Requests Module

      import requests
    
      url = 'https://smgr.io'
    
      # simple get request
      r = requests.get(url)
    
      # simple post request with json data
      r = requests.post(url, json={'key':'value'})
    
      # passing parameters in url
      payload = {'key1':'value1', 'key2':'value2'}
      r = requests.get(url, params=payload)
      print(r.url)
      >> "https://smgr.io/?key1=value1&key2=value2"
    
      ## response attributes
      # status code of request
      r.status_code
      >> 200
    
      # auto decoded text response
      r.text
      >> '{"res": "response string"}'
    
      # use JSON decoder to turn to dicts & lists
      r.json()
      >> {'res': 'response string'}
  • Server side rendering vs client side rendering

    Frameworks like React and Angular are examples of client side rendering libraries. These libraries use templating on the client side and can react to changing data quickly, reloading the dynamic aspects of the template when the change occurs. An example of this would be a to-do list app, where there is a React template that dynamically renders a list based on the available to-do items. When a user deletes an item, this HTML list will update immediately on the client side as that part of the template is re-rendered according to the new data. A request could also be made to an external server to update this change in a database.

    How does this look without the JavaScript framework? Using a backend like Flask, one can write HTML templates (using Jinja) that are populated by data, then sent to the client for the browser to display. So all “dynamic” functionality is stored inside of the server template. When something like a to-do list item gets deleted in this case, the clicked JS button would perhaps make a call to a Flask route, the function at the route would handle removal of the item from the database, and a “reloaded” template (i.e. the same template as before, now being populated with different data) is sent back to the client. Unless this HTML is fetched with AJAX, the page itself has to reload, meaning the entire page is re-rendered to show the single updated data point.

Module Specific Info

  • Async blocks

    Asyncio is not multi-threading. That is, it uses a single thread but runs other code while a coroutine is suspended at an await on something we know will take time. For example, on reaching await asyncio.sleep(5), the coroutine is suspended and the event loop moves on to other work during the 5 second sleep period (still in a single thread); once the 5 seconds are up, the coroutine resumes and the code beneath the sleep statement runs as usual. Asyncio makes use of the points in coroutines that we know ahead of time will require waiting, and weaves in work from one function while the other is on a wait break.
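
A small sketch of the interleaving described above. The coroutine name worker, the labels, and the delays are made up; the log records the order in which the single thread actually ran each piece.

```python
import asyncio
import time

async def worker(name, delay, log):
    log.append(f'{name} start')
    await asyncio.sleep(delay)  # control returns to the event loop here
    log.append(f'{name} done')
    return name

async def main():
    log = []
    start = time.perf_counter()
    # both coroutines are scheduled on the same thread and event loop
    results = await asyncio.gather(
        worker('slow', 0.2, log),
        worker('fast', 0.1, log),
    )
    elapsed = time.perf_counter() - start
    return results, log, elapsed

results, log, elapsed = asyncio.run(main())
print(log)
print(f'{elapsed:.2f}s')
```

The total time should be close to the longest delay rather than the sum of both, since the second worker runs while the first one is waiting.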

  • Psycopg2

    • Transactions and connections: To interact with a database, a connection (or database session) must be opened. From this connection, a transaction is initiated and cursors can make executions. Transactions are tied to the connection itself; if you have multiple cursors under the same connection, after a cursor issues the first execution, all other cursors and executions belong to the same transaction until a commit or rollback is called. After one of these commands, a new transaction begins on the next execution from any cursor under the connection.
    • Read here on some basic usage and safety measures when using the module.
    • There are occasionally issues installing this module via pip. For example, despite having it work with Python 3.6, I faced the same old errors trying to get it installed with 3.7. Basically you just need to have a few Python version specific dev tools installed: sudo apt install libpq-dev python3-dev should fix things up.
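
The point about cursors sharing one transaction per connection can be tried without a Postgres server. The sketch below deliberately swaps in the standard library's sqlite3 module, not psycopg2, so it only illustrates the idea; psycopg2's defaults and API details may differ. The table name todo and its contents are made up.

```python
import sqlite3

# in-memory database as a stand-in for a real Postgres connection
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE todo (item TEXT)')
conn.commit()

# two cursors created from the same connection
cur_a = conn.cursor()
cur_b = conn.cursor()

# the first INSERT opens a transaction; the second cursor joins it
cur_a.execute("INSERT INTO todo VALUES ('from cursor a')")
cur_b.execute("INSERT INTO todo VALUES ('from cursor b')")
in_txn = conn.in_transaction

# rollback is issued on the connection and undoes the work of both cursors
conn.rollback()
rows_after_rollback = conn.execute('SELECT COUNT(*) FROM todo').fetchone()[0]

# the next execution starts a new transaction, ended here by commit
cur_a.execute("INSERT INTO todo VALUES ('kept')")
conn.commit()
rows_after_commit = conn.execute('SELECT COUNT(*) FROM todo').fetchone()[0]

conn.close()
```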
  • Flask and WSGI

    Apache’s mod_wsgi can be configured to communicate with a Flask application. The mod_wsgi module creates a WSGI server on top of Apache, and we tell Apache where to find our python application. The standard configuration within a VirtualHost block looks as follows

      WSGIDaemonProcess app python-home=/home/sites/smgr.io/flask_app/venv
      WSGIScriptAlias /api /home/sites/smgr.io/flask_app/app.wsgi
    
      WSGIProcessGroup app
      WSGIApplicationGroup %{GLOBAL}
      WSGIScriptReloading On
    
      <Directory /home/sites/smgr.io/flask_app>
          Require all granted
      </Directory>

    Here we specify a new WSGI daemon process, name it, and tell it where to find the Python interpreter and associated modules from the venv. The alias tells Apache under what URL to serve the app. The WSGIProcessGroup, WSGIApplicationGroup, and WSGIScriptReloading directives set some options for the WSGI server, and the Directory block gives Apache access to the files in the flask_app directory. Note that while we’ve told Apache where our Python venv is, where the WSGI file is, and given access to our flask_app directory, we have not yet placed our flask_app directory in the Python path. This means Apache can’t run our Flask app or any other associated files because it doesn’t know where they are. We can fix this in one of two ways:

    1. We can add a python-path option to the WSGIDaemonProcess directive. This tells the WSGI server where our files are by adding them to the Python path. Our WSGIDaemonProcess would now look like

       WSGIDaemonProcess app python-home=/home/sites/smgr.io/flask_app/venv python-path=/home/sites/smgr.io/flask_app
    2. We can append the directory with our files to the python path in the WSGI file itself. This would look as follows

       import sys
       sys.path.append('/home/sites/smgr.io/flask_app')
      
       from app import app as application

    In either case, the Python path is modified to include the files within our flask_app directory, and thus the WSGI server has access to our application. This is essentially the entirety of setting up mod_wsgi for communication to a Flask app.

    ### WSGI Setup for Apache2 and Python 3.6

    SO: https://stackoverflow.com/questions/44914961/install-mod-wsgi-on-ubuntu-with-python-3-6-apache-2-4-and-django-1-11?rq=1. Requires that Python be installed with shared libraries enabled, i.e. PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.6.3

    1. Uninstall current WSGI module: sudo apt-get remove libapache2-mod-wsgi-py3
    2. Install dev packages: sudo apt-get install libpq-dev python3.6-dev apache2-dev
    3. Install mod_wsgi via the Python version you want: pip3.6 install mod_wsgi
    4. Run mod_wsgi-express module-config and copy its output to Apache config

    ### Setting up venv Python3.6 WSGI

    1. Create venv: python3.6 -m venv env
    2. Activate env: source env/bin/activate

    ## Flask Streaming

    HTTP streaming is a standard technique, essentially long-polling with chunking, that allows the connection between server and client to remain open for an extended period of time. Flask can implement streaming by passing a Python generator to a Flask Response object, and on the JS client reading the streaming endpoint with a ReadableStream object. One of the main drawbacks to streaming is that only raw bytes are sent through. There’s no particular protocol for reconstructing your data on the other side, which can be quite difficult if you’re trying to send complex objects. In some cases it can be straightforward; if just characters are being sent through, you only need to worry about UTF-8 decoding the bytes you receive. Otherwise, you might need to factor in the response length, keep track of how many bytes were sent and when, and then match that byte structure up against a local copy of some meaningful data structure you wish to reconstruct (think Powershades, this is exactly what I was tasked to do when parsing a byte-level response over the network). A great SO response describing the difference between websockets, streaming, long-polling, etc here, and an explanatory gist here.
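
A sketch of the "just characters" case using only the standard library. The generator stream_chunks is a hypothetical stand-in for what would be passed to a Flask Response, and decode_stream plays the role of the receiving side; neither is Flask API. One wrinkle worth showing: a multi-byte UTF-8 character can be split across chunks, so decoding each chunk on its own can fail, while an incremental decoder handles it.

```python
import codecs

def stream_chunks(text, chunk_size):
    """Yield the UTF-8 bytes of text in fixed-size chunks."""
    data = text.encode('utf-8')
    for i in range(0, len(data), chunk_size):
        yield data[i:i + chunk_size]

def decode_stream(chunks):
    """Decode chunks incrementally so split characters survive."""
    decoder = codecs.getincrementaldecoder('utf-8')()
    for chunk in chunks:
        piece = decoder.decode(chunk)  # buffers incomplete byte sequences
        if piece:
            yield piece
    tail = decoder.decode(b'', final=True)  # flush, raising on leftovers
    if tail:
        yield tail

message = 'naïve café ☕'

# decoding the first 3-byte chunk alone fails: it ends mid-character
try:
    next(stream_chunks(message, 3)).decode('utf-8')
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False

received = ''.join(decode_stream(stream_chunks(message, 3)))
```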

    ## Flask Setup and Local Dev

    • To run local development server, simply run the script that calls app.run(), where app is the Flask app. That is, use

        if __name__ == '__main__':
            app.run()

      This method is no longer the recommended way of starting a Flask development server due to issues with the reload mechanism. The recommended way is to start Flask via the command line:

        $ export FLASK_APP=my_application
        $ export FLASK_ENV=development
        $ flask run

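      which will enable an interactive debugger and reloader, and start the development server. Note that FLASK_ENV was deprecated in Flask 2.2 and removed in 2.3; on recent versions the equivalent is flask --app my_application run --debug.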

  • Common pip errors

    • When creating a venv natively through Python, if facing ensurepip or --default-pip errors of some sort, try installing python3.x-venv. This has resolved most of the issues in a number of scenarios.

Type Checking

  • isinstance(instance, class) is the recommended way of checking if an instance is a certain type. It lends itself to inheritance, returning True if the instance is a type that inherits from the (base) class we’re checking against.
  • Can also use type(instance) is Class to check if the instance is exactly the class given; this ignores inheritance, returning False even when the instance’s type is one that inherits from the given class.
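
A quick sketch of the difference between the two checks. The classes Animal and Dog here are minimal stand-ins defined just for this example.

```python
class Animal:
    pass

class Dog(Animal):
    pass

dog = Dog()

# isinstance respects inheritance: a Dog is also an Animal
is_animal = isinstance(dog, Animal)

# type(...) is ... checks the exact class only
exact_animal = type(dog) is Animal
exact_dog = type(dog) is Dog

# isinstance also accepts a tuple of classes to check against
is_int_or_str = isinstance(3, (int, str))

print(is_animal, exact_animal, exact_dog, is_int_or_str)
```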