

Overview and Rationale

Most programming languages provide a great deal of flexibility in the syntax they allow. Different programmers have different styles, and this freedom allows each person to program in a style that they are comfortable with. However, without the careful application of a consistent set of guidelines, the clarity of a program may suffer. Consider this example from the Wikipedia entry on obfuscation:

The following program is a fairly straightforward implementation (in a C-like language) of an algorithm that prints all of the prime numbers up to some specified number (the cap).

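    #include <stdio.h>

    /* Print every prime number below cap, separated by tabs. */
    void primes(int cap) {
      int i, j, composite;
      for(i = 2; i < cap; i++) {
        /* Count the divisors of i in the range [2, i). */
        composite = 0;
        for(j = 2; j < i; j++) 
          composite += !(i % j);
        /* No divisors were found, so i is prime. */
        if(!composite)
          printf("%d\t", i);
      }
    }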
    

However, the rules of the language allow the same code to be written as:

    _(__,___,____){___/__<=1?_(__,___+1,____):!(___%__)?_(__,___+1,0):___%__==___/
    __&&!____?(printf("%d\t",___/__),_(__,___+1,0)):___%__>1&&___%__<___/__?_(__,1+
    ___,____+!(___/__%(___%__))):___<__*__?_(__,___+1,____):0;}
    

This is clearly an extreme example, where the programmer deliberately made the code as unreadable as possible. Even in more moderate cases, however, a piece of code may be very difficult for anyone but the original programmer to follow. After a few months or years have passed, even the code's originator may no longer be able to understand and update it. To help coders produce clear, readable code, several widely accepted coding practices have been developed. For example, check out:

Fully fleshed-out guidelines, such as the Java example above, are beyond the scope of this course. However, you are still expected to follow a limited subset of fundamental practices, as described below. Additional entries may be added throughout the semester to address recurrent problems.

Specific Expectations for 131/231 Students

Variable Names

Meaning

All non-looping variable names must be meaningful. Instead of naming your variables with one or a few letters, use descriptive names that suggest the purpose of the variable. For example, if you are writing a program that manipulates a DNA string, you might name that variable sequence instead of s. The only exception is loop control variables, for which it is considered good style to use i, j, k, etc.
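As a rough illustration of this rule, here is a minimal Perl sketch. The method name gcFraction and all variable names are hypothetical, chosen only to show descriptive naming alongside an acceptable single-letter loop variable.

```perl
use strict;
use warnings;

# Returns the fraction of bases in a DNA string that are G or C.
# (gcFraction is a hypothetical example, not part of any assignment.)
sub gcFraction {
    my ($sequence) = @_;
    my $sequenceLength = length $sequence;
    return 0 if $sequenceLength == 0;

    # tr/// with an empty replacement only counts the matching characters.
    my $gcCount = ($sequence =~ tr/GCgc//);

    # GC fraction = (G + C) / total bases
    return $gcCount / $sequenceLength;
}

# i is acceptable here because it is only a loop control variable.
my @sequences = ("ATGC", "GGCC", "ATAT");
for (my $i = 0; $i < @sequences; $i++) {
    printf "Sequence %d: GC fraction %.2f\n", $i + 1, gcFraction($sequences[$i]);
}
```

Compare this with a version that calls the same values s, l, and c: the logic would be identical, but a reader would have to work out what each letter stands for.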

Capitalization

In general, how to capitalize variable names differs from language to language and person to person. The only concrete requirement for this course is that you choose a consistent pattern of capitalization. If this is your first programming experience, the following rules are recommended:

  1. Most variables should be written using "camel case". In this style, the first word is lower case, and the first letter of all following words is upper case. For example, variable, thisIsMyVariable, and whenDoesTheCapitalizationStop are all written in camel case.
  2. Constants, however, should be written in upper case, with underscores separating words. For example, PI and NUMBER_OF_TABLES are constant names.
  3. Methods should follow the same naming convention as normal variables: camel case.

For a more complete description of one suggested pattern of capitalization, see the Naming Conventions section of this Perl style guide.
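The three recommended rules might look like this in practice. This is only a sketch: circleArea, tableRadius, and totalTableArea are hypothetical names, and the constants are declared with Perl's standard constant pragma.

```perl
use strict;
use warnings;

# Constants: upper case, with underscores separating words.
use constant PI               => 3.14159265358979;
use constant NUMBER_OF_TABLES => 4;

# Method name in camel case. (circleArea is a hypothetical example.)
sub circleArea {
    my ($radius) = @_;
    # Area = PI * r^2
    return PI * $radius ** 2;
}

# Ordinary variables in camel case.
my $tableRadius    = 1.5;
my $totalTableArea = NUMBER_OF_TABLES * circleArea($tableRadius);
printf "Total area of %d tables: %.2f\n", NUMBER_OF_TABLES, $totalTableArea;
```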

Indentation

Each line of code at an equivalent depth should be indented the same amount (1 tab per depth). For example,

    while (<>) 
    {
        @numbers = split;
        $total = 0;
        foreach $number (@numbers) 
        {
            $total += $number;
            $grand_total += $number;
        }
        print "Total for line $.: $total.\n";
    }
    print "The grand total of all lines is $grand_total.\n";
    

(Code borrowed from http://perl.plover.com/yak/handson/examples/sumlines.pl)

You may place the opening brace either on the same line as the keyword (while, for, if, etc.), as below, or on the following line, as above.

    while (<>) {
        @numbers = split;
        $total = 0;
        foreach $number (@numbers) {
            $total += $number;
            $grand_total += $number;
        }
        print "Total for line $.: $total.\n";
    }
    print "The grand total of all lines is $grand_total.\n";
    

Comments

Comments will be expected in six places in your code:

1. At the beginning of any program, describing (in brief) what the program does. If the program implements a particular algorithm, please state that in the initial comments. For example, you may write the comment:

# This program implements Gillespie's algorithm to analyze the stochastic behavior of the HIV viral life cycle in humans.

2. Above any method with a non-obvious function, describing what that method does. For example, above the method calculateTotals you may write

# Sums up the energy scores of all components, and outputs the results in tabular format

3. Above stretches of code within a method that accomplish a particular task. As a rule of thumb, every 10-20 lines of code (that isn't repetitive or otherwise obvious) should have a comment describing what that code does.

4. Above any line that implements a formula, describing what that formula is symbolically. For example, if you have a line of code that calculates the area of a circle, you should have the following comment above it:

# Area = PI * r^2

5. Above any non-trivial regular expressions, explaining what that expression should match. If you have a line of code like

    if ($phone =~ /\d{3}-\d{4}/)
    

you should comment above it:

# Recognizes phone numbers in standard XXX-XXXX format

6. Above anything that seems particularly tricky to understand. For example:

# Converts the number to binary, takes the 2's complement, and adds to the previous result.
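Put together, several of these comment types might appear in one short program as sketched below. The program, the method isPhoneNumber, and the sample data are all hypothetical. Note also that this sketch anchors the pattern with \A and \z so that the whole string must match, which is stricter than the unanchored pattern shown above.

```perl
# This program checks a list of strings and reports how many of them
# are phone numbers in XXX-XXXX format. (Hypothetical example program.)
use strict;
use warnings;

# Returns 1 if the whole string is a phone number in XXX-XXXX format,
# and 0 otherwise.
sub isPhoneNumber {
    my ($phone) = @_;
    # Matches exactly three digits, a hyphen, then four digits
    return $phone =~ /\A\d{3}-\d{4}\z/ ? 1 : 0;
}

# Count how many of the candidate strings are valid phone numbers.
my @candidates = ("555-1234", "5551234", "555-12345");
my $validCount = 0;
foreach my $candidate (@candidates) {
    $validCount++ if isPhoneNumber($candidate);
}

# Fraction valid = valid / total
my $fractionValid = $validCount / scalar(@candidates);
printf "%d of %d valid (%.2f)\n", $validCount, scalar(@candidates), $fractionValid;
```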

Non-redundancy

Part of producing clear code is avoiding the same bit of code in several places. If you find yourself copying and pasting code, you should probably either be using a loop or moving that code into a separate method. For example, if you write code that sums the numbers 1 through 10 in one location, and very similar code that sums 3 through 29 in another location, you should instead write a single method that sums from any number m to any higher number n. You can then call that method from the other two places. This has the added advantage of making your code easier to maintain: if you discover that you made a mistake in implementing the summing loop, you only have to fix it in one place instead of two.
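A minimal sketch of the summing example, assuming integer arguments with the first no larger than the second. The name sumRange is hypothetical.

```perl
use strict;
use warnings;

# Returns the sum of the integers from $start to $end, inclusive.
# Assumes $start <= $end.
sub sumRange {
    my ($start, $end) = @_;
    my $total = 0;
    for (my $i = $start; $i <= $end; $i++) {
        $total += $i;
    }
    return $total;
}

# Both sums now share one implementation instead of two copied loops.
my $sumOneToTen          = sumRange(1, 10);
my $sumThreeToTwentyNine = sumRange(3, 29);
print "1 through 10 sums to $sumOneToTen\n";
print "3 through 29 sums to $sumThreeToTwentyNine\n";
```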

Reasonable Algorithmic Complexity

Although algorithms are not the primary focus of this class, you are expected to develop a rough understanding of how computationally complex your code is. For assignments that require algorithm implementation and/or development, solutions that are excessively wasteful (polynomial time several orders higher than needed, or unnecessarily exponential) will be graded down.
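To give a feel for the difference, here is a hypothetical pair of methods that both answer the same question: does a list of sequences contain a duplicate? The first compares every pair, which takes on the order of n^2 comparisons for n sequences. The second uses a hash and needs only about n lookups on average. Neither is taken from a course assignment; they are a sketch of the kind of gap to watch for.

```perl
use strict;
use warnings;

# Quadratic version: compares every pair of sequences, O(n^2) comparisons.
sub hasDuplicateSlow {
    my @sequences = @_;
    for (my $i = 0; $i < @sequences; $i++) {
        for (my $j = $i + 1; $j < @sequences; $j++) {
            return 1 if $sequences[$i] eq $sequences[$j];
        }
    }
    return 0;
}

# Linear version: records each sequence in a hash, O(n) lookups on average.
sub hasDuplicateFast {
    my @sequences = @_;
    my %seen;
    foreach my $sequence (@sequences) {
        # The post-increment returns the old count, which is true
        # only if this sequence has been seen before.
        return 1 if $seen{$sequence}++;
    }
    return 0;
}
```

For a handful of sequences the two behave the same, but for 100,000 sequences the first may need billions of comparisons while the second needs roughly 100,000 hash lookups.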

-- JoshKittleson - 29 Aug 2007
