from Scientific.IO.TextFile import TextFile
from string import split
lines = 0
words = 0
for line in TextFile('some_file'):
    lines = lines + 1
    words = words + len(split(line))
print lines, " lines, ", words, " words."
This program can be tested by running the Unix utility wc for
comparison.
from Scientific.IO.TextFile import TextFile
from Scientific.IO.FortranFormat import FortranFormat, FortranLine
from string import strip
generic_format = FortranFormat('A6')
atom_format = FortranFormat('A6,I5,1X,A4,A1,A3,1X,A1,I4,A1,' +
                            '3X,3F8.3,2F6.2,7X,A4,2A2')
carbons = 0
for line in TextFile('protein.pdb'):
    record_type = FortranLine(line, generic_format)[0]
    if record_type == 'ATOM  ' or record_type == 'HETATM':
        data = FortranLine(line, atom_format)
        atom_name = strip(data[2])
	if atom_name[0] == 'C':
	    carbons = carbons + 1
print carbons, " carbon atoms"
Note the use of strip to remove the spaces around the atom
name that are present in a properly formatted PDB file (but not in
the PDB files produced by most programs).
Testing this program is less straightforward. One possibility is to apply a slight change to make it count only C-alpha atoms. That must give the number of residues in the file.
from Scientific.IO.TextFile import TextFile
from Scientific.IO.FortranFormat import FortranFormat, FortranLine
from string import join, strip
generic_format = FortranFormat('A6')
atom_format = FortranFormat('A6,I5,1X,A4,A1,A3,1X,A1,I4,A1,' +
                            '3X,3F8.3,2F6.2,7X,A4,2A2')
atom_lines = []
for line in TextFile('protein.pdb'):
    record_type = FortranLine(line, generic_format)[0]
    if record_type == 'ATOM  ' or record_type == 'HETATM':
        data = FortranLine(line, atom_format)
        atom_name = strip(data[2])
	element = atom_name[0]
	atom_lines.append(join([element, str(data[8]),
				str(data[9]), str(data[10])]) + '\n')
output_file = TextFile('protein.xyz', 'w')
output_file.write(`len(atom_lines)` + '\n\n')
output_file.write(join(atom_lines, ''))
output_file.close()
There are several ways to avoid reading the file twice for obtaining
the number of atom, but the general idea is to collect information
into a list and then writing it out in a separate step. In the example
above the list contains the text lines of the output files, but it
could also be used to store the input lines or some intermediate
data representation.
Using the output lines in the list has the advantage of not requiring a second loop to generate the output file; a single write operation plus a call to join is sufficient. In fact, this combination occurs so frequently that a shorthand was introduced which is even more efficient: instead of output_file.write(join(atom_lines, '')) you can write output_file.writelines(atom_lines).