Software Development Best Practices
Here are some software development tips that I've learned over the years for rapidly creating high-quality software. They have served me well, but they are by no means exhaustive or authoritative, so please feel free to add and edit.
Some code is self-documenting, but for all else, my personal rule of thumb is as follows:
- Each new file or module gets a high-level comment describing how it's intended to be used (unless it's self-explanatory).
- Each new public-facing method or function should have, at a minimum:
  - A high-level description of what the function does.
  - A description of each argument, type-annotated.
  - A description of each keyword argument, type-annotated.
  - A description of what the function returns.
  - (optional, recommended) A description of what exceptions it can raise.
For docstring style, I recommend the NumPy docstring convention.
For example, if we have a module called dna.py that contains DNA-manipulation code, it might look something like this:
"""
=================
nanoporter.dna.py
=================

This module contains functionality for manipulating DNA.
"""
from typing import List, Protocol

from foo import Enzyme


class DNA(Protocol):
    """Represents a chain of nucleotide bases.

    The DNA class is used to store information about biological data.
    """

    pass


def polymerase_chain_reaction(dna: DNA, enzyme: Enzyme, cycles: int = 10) -> List[DNA]:
    """Performs a polymerase chain reaction (PCR) on some DNA, copying it many times.

    Parameters
    ----------
    dna : DNA
        The DNA to be copied.
    enzyme : Enzyme
        The specific polymerase to use in the PCR.
    cycles : int, optional
        The number of PCR cycles to run, which will result in 2**cycles copies of DNA, by default 10.

    Returns
    -------
    List[DNA]
        A list of DNA copies synthesized from the original strand.
    """
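The example above leaves out the optional "Raises" section. As a minimal sketch of how that section might look in the same NumPy style, here is a small helper; the name `expected_copies` and its validation rule are assumptions made purely for illustration and are not part of the module above:

```python
def expected_copies(cycles: int = 10) -> int:
    """Computes the ideal number of DNA copies after some PCR cycles.

    Parameters
    ----------
    cycles : int, optional
        The number of PCR cycles to run, by default 10.

    Returns
    -------
    int
        The ideal copy count, 2**cycles.

    Raises
    ------
    ValueError
        If `cycles` is negative.
    """
    # Hypothetical validation rule, chosen only to give the docstring
    # something concrete to document under "Raises".
    if cycles < 0:
        raise ValueError(f"cycles must be non-negative, got {cycles}")
    return 2 ** cycles
```

Listing the exception in the docstring tells callers what to catch without having to read the function body.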
On my previous teams, one rule that brought us great success was "For every TODO you write, create a corresponding GitHub Issue, and link it in the TODO".
Or, in a quasi-rhyming form:
For every TODO, file an Iss-ue
These TODOs take the form: "TODO: ${high-level description of what needs to be achieved}: ${URL to GitHub issue}"
For example, instead of:
def polymerase_chain_reaction(dna: DNA, enzyme: Enzyme, cycles: int = 10) -> List[DNA]:
    dispense(enzyme)
    dna.unwind()  # TODO: Improve DNA unwind
    dna.do_something_else()
It's preferred to have something like:
def polymerase_chain_reaction(dna: DNA, enzyme: Enzyme, cycles: int = 10) -> List[DNA]:
    dispense(enzyme)
    dna.unwind()  # TODO: Move the DNA unwind logic to the top of the function: https://github.com/uwmisl/NanoporeTERs/issues/1
    dna.do_something_else()
This approach provides a number of advantages:
- Progress-tracking: TODOs alone don't provide any sense of progress or priority, but GitHub issues do 😃
- Details: One can elaborate more in a GitHub issue than in a code comment. Tracking details in Issues also preserves history in a way that code TODOs alone cannot.
- Collaboration: Issues make it easier for collaborators and people new to the project to understand what needs to change.
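A rule like this is easier to keep when it can be checked automatically. The following is a rough sketch, not an existing tool: the function name `find_unlinked_todos` and both regular expressions are assumptions, and it only treats a TODO as linked when a GitHub issue URL appears on the same line.

```python
import re
from typing import List, Tuple

# Assumption: TODOs are written as "# TODO ..." comments.
TODO_PATTERN = re.compile(r"#\s*TODO\b")
# Assumption: a linked TODO carries a GitHub issue URL on the same line.
ISSUE_URL_PATTERN = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+/issues/\d+")


def find_unlinked_todos(source: str) -> List[Tuple[int, str]]:
    """Finds TODO comments that do not link to a GitHub issue.

    Parameters
    ----------
    source : str
        The full text of a source file.

    Returns
    -------
    List[Tuple[int, str]]
        (line number, stripped line) pairs for each unlinked TODO,
        with line numbers starting at 1.
    """
    unlinked = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        if TODO_PATTERN.search(line) and not ISSUE_URL_PATTERN.search(line):
            unlinked.append((line_number, line.strip()))
    return unlinked


if __name__ == "__main__":
    example = (
        "dna.unwind()  # TODO: Improve DNA unwind\n"
        "dna.unwind()  # TODO: Move the DNA unwind logic: "
        "https://github.com/uwmisl/NanoporeTERs/issues/1\n"
    )
    # Only the first line is reported, because it has no issue link.
    for line_number, text in find_unlinked_todos(example):
        print(f"line {line_number}: {text}")
```

A check along these lines could run in a pre-commit hook or in CI, so that unlinked TODOs are caught before they are merged.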