Command Line Arguments, File I/O, Modules

BCH 519
Spring 2017

Andrew E. Bruno
aebruno2@buffalo.edu

Topics Covered

  • Command line arguments
  • File I/O
  • Modules, PyPI
  • Debugging

Command Line Arguments

Passing arguments to your python script

  • Your python script is run from the command line
  • Recall the basic form of a unix command:
      command options arguments
  • You can pass "arguments" to your python script just like any other unix command
  • Allows you to pass input or "data" into your programs
  • Example: ./myscript.py arg1 arg2 arg3 ..

The special list: sys.argv

  • Python sys module has a number of special pre-defined variables
  • sys.argv is a special list containing the command line arguments to your script
  • For more info: http://docs.python.org/2/library/sys.html
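
As a rough illustration of how this list is laid out, the sketch below splits an argument list into the script name and the remaining arguments. The helper name split_argv is hypothetical, and a hard-coded list stands in for sys.argv so the result is predictable. print is called with parentheses so the code runs under both Python 2 and Python 3.

```python
def split_argv(argv):
    """Split an argv-style list into (script name, list of arguments)."""
    # Index 0 is the script name; everything after it is an argument.
    return argv[0], argv[1:]


# Hard-coded stand-in for what sys.argv would hold after running:
#   ./myscript.py arg1 arg2 arg3
example_argv = ["./myscript.py", "arg1", "arg2", "arg3"]
script, args = split_argv(example_argv)

print("script: {}".format(script))
print("number of arguments: {}".format(len(args)))
print("arguments: {}".format(args))
```

In a real script you would import sys and pass sys.argv instead of the hard-coded list.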

Exercise 1: echo in python

# sys.argv is a list of command line arguments passed to your script. 
# Here, we simply print or "echo" the first argument that was passed 
# to our script.
#
# **Note: sys.argv[0] is the name of the script. 
# The first argument starts at: sys.argv[1]
import sys

print sys.argv[1]

run in the terminal:


$ python ex1.py Hello
Hello
$ python ex1.py Hello World
Hello
$ python ex1.py "Hello World"
Hello World
    

Exercise 1: What we learned

  1. sys.argv is a list of command line arguments passed to your script
  2. Arguments start at index 1 (sys.argv[1]); index 0 is the name of the script
  3. What happens when you run ex1.py with no arguments? How can you fix this?

Exercise 2: add two numbers

import sys

# Good to always validate any input to your program
if len(sys.argv) != 3:
    print "Usage: python ex2.py [num1] [num2]"
    sys.exit(1)

num1 = sys.argv[1]
num2 = sys.argv[2]
total = int(num1) + int(num2)

print "Sum: {}".format(total)

run in the terminal:


$ python ex2.py 2 15
Sum: 17
    

Exercise 2: What we learned

  1. It is good practice to always validate any input to your programs
  2. The sys.exit function exits the program immediately
  3. What happens when you run ex2.py with the following input? How can you fix this?
    
    $ python ex2.py bob bill
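
One possible fix, sketched under the assumption that non-numeric input should produce a usage message rather than a traceback: int() raises ValueError when given text such as "bob", and a try/except block can catch it. The helper name add_args is hypothetical, and hard-coded lists stand in for sys.argv.

```python
def add_args(argv):
    """Return the sum of two integer arguments, or None if the input is invalid."""
    if len(argv) != 3:
        return None
    try:
        return int(argv[1]) + int(argv[2])
    except ValueError:
        # int() raises ValueError for text such as "bob"
        return None


# Hard-coded lists stand in for sys.argv
for argv in (["ex2.py", "2", "15"], ["ex2.py", "bob", "bill"]):
    total = add_args(argv)
    if total is None:
        print("Usage: python ex2.py [num1] [num2]  (both must be integers)")
    else:
        print("Sum: {}".format(total))
```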
        

File I/O

Files

  • Data is typically stored in files
  • We can write programs to process and manipulate data stored in files
  • Three basic file operations: read, write, append

File Objects

sys.stdout = Standard Output

  • We've actually been working with this file object all along
  • Standard output refers to the output of a program (what gets printed to the screen)
  • Python has a file object called sys.stdout
  • The print statement "writes" to a file object, by default sys.stdout
import sys

print "Hello World"

# Is the same as 

print >>sys.stdout, "Hello World"

Opening and closing files

  • File objects are created in a "mode" using the open function:
    • 'r' = read
    • 'w' = write
    • 'a' = append
  • When you're done, close your file using file.close:
# Open file for reading
fin = open('input.txt', 'r')

# Open a different file for writing ('w' erases any existing contents)
fout = open('output.txt', 'w')

fin.close()
fout.close()
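
To make the difference between the three modes concrete, here is a minimal sketch that writes, appends to, and then reads back one file. The file name notes.txt and the temporary directory are assumptions made only so the example does not touch real files. print is called with parentheses so the code runs under both Python 2 and Python 3.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# 'w' creates the file, or erases its contents if it already exists
fout = open(path, "w")
fout.write("first line\n")
fout.close()

# 'a' keeps the existing contents and adds to the end
fout = open(path, "a")
fout.write("second line\n")
fout.close()

# 'r' opens the file for reading only
fin = open(path, "r")
contents = fin.read()
fin.close()

print(contents)
```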

Writing to Files

  • Open a file object in write mode: 'w'
  • Use the file.write function to write to the file object
fout = open('output.txt', 'w')
fout.write("Hello World\n")
fout.close()

Reading Files

  • Open a file object in read mode: 'r'
  • Good practice: use the with keyword, which properly closes the file for you
  • File objects in python are iterators
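with open('input.txt', 'r') as fin:
    # Each read below continues from the current file position, so
    # fin.seek(0) rewinds to the start before trying the next approach.

    # Read entire file into one string
    contents = fin.read()

    # Read all lines into a list
    fin.seek(0)
    lines = fin.readlines()

    # Read a single line
    fin.seek(0)
    line = fin.readline()

    # Loop through lines of file
    fin.seek(0)
    for line in fin:
        print line,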

sys.stdin = Standard Input

  • Standard input refers to the data going into a program (data provided as input)
  • Python uses a file object called sys.stdin
    import sys
    
    # Read line from STDIN
    line = sys.stdin.readline()
    print line,
  • Example program execution, '|' pipe will send the output of the echo command as input into our python script:
    
    $ echo "Hello World" | python test.py
    Hello World
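
Because sys.stdin is a file object, it can be looped over line by line just like a file opened with open. The sketch below assumes a hypothetical helper, number_lines, that accepts any file object; an in-memory io.StringIO object stands in for sys.stdin so the example runs without piped input.

```python
import io


def number_lines(stream):
    """Return the lines of a file object, each prefixed with its line number."""
    numbered = []
    for count, line in enumerate(stream, 1):
        numbered.append("{}: {}".format(count, line))
    return numbered


# An in-memory file object stands in for sys.stdin here
fake_stdin = io.StringIO(u"Hello World\nGoodbye\n")
for line in number_lines(fake_stdin):
    print(line.rstrip("\n"))
```

In a script, calling number_lines(sys.stdin) after import sys would process piped input the same way.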
        

Typical I/O Scenario

  1. Get command line arguments
  2. Open files for reading and/or writing
  3. Read data and process
  4. Write output
  5. Close files
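
The five steps above can be sketched as one small program. This is only an illustration: the function name copy_upper, the upper-casing step, and the sample data are assumptions, and temporary files replace the command line arguments so the example runs on its own.

```python
import os
import tempfile


def copy_upper(in_path, out_path):
    """Write an upper-case copy of in_path to out_path; return the line count."""
    count = 0
    # Step 2: open files; the with statement also handles step 5 (closing)
    with open(in_path, "r") as fin:
        with open(out_path, "w") as fout:
            for line in fin:               # Step 3: read and process
                fout.write(line.upper())   # Step 4: write output
                count += 1
    return count


# Step 1 would normally take these two paths from sys.argv[1] and sys.argv[2];
# temporary files are used here so the sketch runs on its own.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "input.txt")
out_path = os.path.join(workdir, "output.txt")

with open(in_path, "w") as fout:
    fout.write("acgt\nttga\n")

print("lines processed: {}".format(copy_upper(in_path, out_path)))
```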

Exercise 3: cat in Python

import sys

if len(sys.argv) != 2:
    print "Usage: python ex3.py [filename]"
    sys.exit(1)

filename = sys.argv[1]
with open(filename, 'r') as fin:
    for line in fin:
        print line,

run in the terminal:


$ echo "Hello World" > test-file.txt
$ cat test-file.txt
Hello World
$ python ex3.py test-file.txt
Hello World
    

Exercise 3: What we learned

  1. sys.argv contains the command line arguments
  2. with closes the open file object automatically
  3. In print line, the trailing comma suppresses the extra newline that print normally adds

Modules

What is a Module?

  • A set of related functions in a "library" file
  • Designed to be reusable by other modules or programs

What does a Python module look like?

File: mathfunc.py
greeting = 'Hello World'
salutation = 'Goodbye'

def power(num, exponent):
    return num ** exponent

How to use Python modules?

File: square.py
import mathfunc

sq = mathfunc.power(8, 2)
print mathfunc.greeting
print "8 squared = {}".format(sq)
print mathfunc.salutation
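
The import statement has a few common forms. The sketch below uses the standard library math module as a stand-in for mathfunc, so it runs without any extra files; the variable names a, b, and c are for illustration only.

```python
# Form 1: import the module, then use the module.name prefix
import math
a = math.sqrt(16)

# Form 2: import a single name directly into your script
from math import sqrt
b = sqrt(16)

# Form 3: import the module under a shorter alias
import math as m
c = m.sqrt(16)

print("{} {} {}".format(a, b, c))
```

Form 1 keeps it obvious where each function comes from, which is why the square.py example uses it.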

PyPI

Example: Biopython

  • http://biopython.org
  • Set of freely available tools for biological computation written in Python by an international team of developers
  • Collection of many Python modules and related documentation

Parsing FASTA files

  • Sequencing data is often stored in a simple text based format called FASTA
  • Begins with a single line description, followed by lines of sequencing data
  • Instead of re-inventing the wheel and writing our own FASTA parser, let's use a module from Biopython
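
To show what the format looks like, here is a minimal hand-written sketch of reading FASTA text, where each description line starts with ">". It is meant only to illustrate the layout, not to replace Biopython: the function name read_fasta is hypothetical, the records are made up, and real files have edge cases this sketch ignores.

```python
def read_fasta(lines):
    """Return a list of (description, sequence) pairs from FASTA-formatted lines."""
    records = []
    description = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new description line ends the previous record, if any
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:]
            chunks = []
        else:
            chunks.append(line)
    # Save the final record once the input runs out
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Made-up records, for illustration only
example = [
    ">seq1 first example record",
    "ACGTACGT",
    "TTGA",
    ">seq2 second example record",
    "GGCC",
]
for description, sequence in read_fasta(example):
    print("{} {}".format(description, len(sequence)))
```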

Exercise 4: Parsing FASTA files with Biopython

from Bio import SeqIO

for rec in SeqIO.parse("mirna-targets.fasta", "fasta"):
    print rec.id
    print "{} {}".format(len(rec), rec.seq)

Debugging - pprint

  • Excellent python module for debugging your code
  • "Data pretty printer"
  • Allows you to "dump" the contents of a variable
  • Example: dump the contents of an entire dictionary

Exercise 5: pprint

import pprint

data = [
    {
        'id': 124,
        'name': 'miR-245',
        'chrom': 'chr10'
    },
    {
        'id': 234,
        'name': 'miR-201',
        'chrom': 'chr11'
    },
]

pprint.pprint(data)
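
Besides pprint.pprint, the module provides pprint.pformat, which returns the formatted text as a string, and both accept a width argument. The sketch below shows the two together; the record and its 'targets' values are made up for illustration.

```python
import pprint

record = {
    'id': 124,
    'name': 'miR-245',
    'targets': ['geneA', 'geneB', 'geneC'],   # made-up values
}

# pformat returns the formatted text as a string instead of printing it
text = pprint.pformat(record)
print(text)

# A small width forces the output onto multiple lines,
# which can make nested data easier to scan
narrow = pprint.pformat(record, width=20)
print(narrow)
```

Having the text as a string is handy when the dump should go to a log file rather than the screen.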

Homework #3

  • Due: 2017-02-28 09:00:00