Thesis analysis

April 15, 2009 at 03:10AM     Research

When I was writing my PhD thesis, I kept it in CVS. For my non-technical readers, that's a form of revision control software, which lets you keep track of your document as it changes over time. It lets you go back to a previous version if you realize you need to undo something, it lets you create diffs (like Microsoft Word's track changes) between any two versions of your document, and it also means you can analyze the history of your document.

[Graph: thesis analysis]
Inspired by something I once saw on another blog, I wrote some scripts to compute statistics about my thesis over its lifetime. In the graph at left (click it for a larger version), the green area represents the total number of pages over time, and the blue bars represent the number of lines of code changed each day. In other words, the green represents the overall picture and the blue is the amount of work done each day.
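For the curious, here is a rough idea of how the blue bars could be computed. This is a minimal sketch, not the scripts I actually used: it assumes you have saved the output of `cvs log` to a string, and that each revision line carries a `lines: +N -M` field (the very first revision of a file has none, so it is skipped). The names `lines_changed_per_day`, `active_days` and `SAMPLE_LOG` are made up for illustration, and the sample log is invented data. The page counts (the green area) are not covered, since they would need each revision to be checked out and built.

```python
import re
from collections import defaultdict

# Matches the per-revision metadata line printed by `cvs log`, e.g.
#   date: 2009/01/20 14:32:10;  author: me;  state: Exp;  lines: +12 -3
# Both the older 2009/01/20 and the newer 2009-01-20 date styles are accepted.
REVISION_LINE = re.compile(
    r"^date: (\d{4})[/-](\d{2})[/-](\d{2}) [^;]*;.*?lines: \+(\d+) -(\d+)"
)


def lines_changed_per_day(log_text):
    """Sum added plus removed lines for each calendar day in a `cvs log` dump."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        match = REVISION_LINE.match(line)
        if match is None:
            # Not a revision line, or an initial revision with no "lines:" field.
            continue
        year, month, day, added, removed = match.groups()
        totals[f"{year}-{month}-{day}"] += int(added) + int(removed)
    return dict(totals)


def active_days(per_day):
    """Count the days on which at least one line changed."""
    return sum(1 for changed in per_day.values() if changed > 0)


# Invented sample data, only to show the expected input shape.
SAMPLE_LOG = """\
revision 1.3
date: 2009/01/20 14:32:10;  author: me;  state: Exp;  lines: +12 -3
Tighten the conclusion.
----------------------------
revision 1.2
date: 2009/01/20 09:05:44;  author: me;  state: Exp;  lines: +40 -0
Add related work.
----------------------------
revision 1.1
date: 2008/08/01 11:00:00;  author: me;  state: Exp;
Initial import.
"""

if __name__ == "__main__":
    per_day = lines_changed_per_day(SAMPLE_LOG)
    for day in sorted(per_day):
        print(day, per_day[day])
    print("active days:", active_days(per_day))
```

Counting the days with a non-zero total is one way to arrive at a "days actually spent editing" figure, though under this sketch a day that only contains a file's initial import would not be counted.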

I was most interested to discover that my thesis work happened in chunks with long breaks in between. As you can see, I worked on my thesis over a period of six months before submission on January 22, 2009, but in that six-month period only about 55 days were actually spent editing my thesis. I wasn't slacking off the rest of the time: I was working on turning the research into papers that eventually turned into thesis chapters. So a lot of work would happen outside the thesis for a few weeks, and once that was done there would be a flurry of activity to bring it all into the thesis.