What is this cProfile result telling me I need to fix?
I would like to improve the performance of a Python script and have been using
I opened this
Question: What can I do about
If it is relevant, here is the full source code to the script in question:
If I comment out the
Is writing to
I find it helpful to sort the stats on
The four methods mentioned by @Bernd Petersohn take up only 3.7 seconds out of a total execution time of 13.541 seconds. Before worrying too much about those, modularise your script into functions, run cProfile again, and sort the stats by
Update after question revised with changed script:
"""Question: What can I do about join, split and write operations to reduce the apparent impact they have on the performance of this script?"""
Huh? Those 3 together take 2.6 seconds out of the total of 13.8. Your parseJarchLine function is taking 8.5 seconds (which doesn’t include time taken by functions/methods that it calls).
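To make the own-time versus time-in-callees distinction concrete, here is a minimal sketch. The functions `helper` and `caller` are made up for illustration and have nothing to do with your script. In the report, the `tottime` column excludes time spent in called functions, while `cumtime` includes it.

```python
import cProfile
import io
import pstats


def helper(n):
    # The loop runs inside helper, so the cost shows up as helper's own time.
    total = 0
    for i in range(n):
        total += i * i
    return total


def caller(n):
    # caller does little work itself; nearly all of its cumulative time
    # is spent inside helper.
    results = []
    for _ in range(50):
        results.append(helper(n))
    return results


profiler = cProfile.Profile()
profiler.enable()
caller(2000)
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# Sorting by "tottime" ranks functions by time spent in their own code.
stats.sort_stats("tottime").print_stats()
report = buffer.getvalue()
print(report)
```

In the output, `caller` should show a small `tottime` but a `cumtime` close to that of `helper`, which is the pattern to look for when deciding which function body deserves attention.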
Bernd has already pointed you at what you might consider doing with those. You are needlessly splitting the line completely only to join it up again when writing it out. You need to inspect only the first element. Instead of
Now let’s dive into the body of parseJarchLine. The number of uses in the source and manner of the uses of the
Why do you need
Here is the parseJarchLine function with removing-waste changes marked  and changing-to-int changes marked . Good idea: make changes in small steps, re-test, re-profile.
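As a sketch of the small-steps routine, applied to the earlier point about inspecting only the first element of a line: check that the old and new versions agree, then time each one in isolation. The sample line and both function names are hypothetical, not taken from your script.

```python
import timeit

# Hypothetical sample line: a leading keyword followed by many fields.
SAMPLE = "chr1 " + " ".join(str(i) for i in range(200)) + "\n"


def first_field_full_split(line):
    # Splits every field even though only the first one is inspected.
    elements = line.split()
    return elements[0]


def first_field_single_split(line):
    # maxsplit=1 stops after the first run of whitespace; the remainder
    # of the line stays as one unsplit string.
    return line.split(None, 1)[0]


# Step 1: re-test. Both versions must agree before timing matters.
assert first_field_full_split(SAMPLE) == first_field_single_split(SAMPLE)

# Step 2: re-measure each version on its own.
t_full = timeit.timeit(lambda: first_field_full_split(SAMPLE), number=20000)
t_single = timeit.timeit(lambda: first_field_single_split(SAMPLE), number=20000)
print("full split:   %.4f s" % t_full)
print("single split: %.4f s" % t_single)
```

The absolute numbers depend on your machine and Python version; what matters is that each change is verified and measured separately before moving on to the next one.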
Update after question about
If the statement that you commented out was anything like the original one:
Then your question is … interesting. Try this:
Now comment out the
By the way, someone mentioned in a comment about breaking this into more than one write … have you considered this? How many bytes on average in elements[1:] ? In chromosome?
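If you want to try the more-than-one-write idea, a rough sketch follows. Your actual write statement is not reproduced here, so the tab separator, the contents of `elements` and the value of `chromosome` are all assumptions; substitute the real ones. It checks that both variants emit the same bytes, reports the size of the joined `elements[1:]`, and times each variant against an in-memory buffer.

```python
import io
import timeit

# Hypothetical stand-ins; the real values come from the script's input.
chromosome = "chr1"
elements = ["header"] + [str(i) for i in range(100)]


def write_joined(out):
    # Build one big string first, then hand it to a single write call.
    out.write(chromosome + "\t" + "\t".join(elements[1:]) + "\n")


def write_pieces(out):
    # Skip the extra concatenations and issue several smaller writes.
    out.write(chromosome)
    out.write("\t")
    out.write("\t".join(elements[1:]))
    out.write("\n")


# Both variants must produce identical output.
a, b = io.StringIO(), io.StringIO()
write_joined(a)
write_pieces(b)
assert a.getvalue() == b.getvalue()

# Size of the joined payload, the quantity asked about above.
payload_bytes = len("\t".join(elements[1:]))
print("bytes in joined elements[1:]:", payload_bytes)

for fn in (write_joined, write_pieces):
    elapsed = timeit.timeit(lambda: fn(io.StringIO()), number=20000)
    print("%s: %.4f s" % (fn.__name__, elapsed))
```

An in-memory buffer leaves out disk and OS buffering effects, so treat the result as a hint and confirm it by profiling the real script.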
=== change of topic: It worries me that you initialise
Now I’m similarly worried about the two global variables
BTW, the script tests for Python 2.5 or better; have you tried profiling on 2.5 and 2.6?