Tag Archives: networks

A thesis relived: using text analytics to map a PhD journey

 

Your thesis has been deposited.

Is this how four years of toil was supposed to end? Not with a bang, but with a weird sentence from my university’s electronic submission system? In any case, this confirmation message gave me a chuckle and taught me one new thing that could be done to a thesis. A PhD is full of surprises, right till the end.

But to speak of the end could be premature, because more than two months after submission, one thing that my thesis hasn’t been yet is examined. Or if it has been, the examination reports are yet to be deposited back into the collective consciousness of my grad school.

The lack of any news about my thesis is hardly keeping me up at night, but it does make what I am about to do in this post a little awkward. Following Socrates, some people would argue that an unexamined thesis is not worth reliving. At the very least, Socrates might have cautioned against saying too much about a PhD experience that might not yet be over. Well, too bad: I’m throwing that caution to the wind, because what follows is a detailed retrospective of my PhD candidature.

Before anyone starts salivating at the prospect of reading sordid details about existential crises, cruel supervisors or laboratory disasters, let me be clear that what follows is not a psychodrama or a cautionary tale. Rather, I plan to retrace the scholastic journey that I took through my PhD candidature, primarily by examining what I read, and when.

I know, I know: that sounds really boring. But bear with me, because this post is anything but a literature review. This is a data-driven, animated-GIF-laden deep dive into the PhD Experience.

The Who dimension

My last post focussed on my progress in making sense of the Where dimension of the public discourse on coal seam gas, including how the Where intersects with the What. This post is about the Who. Somehow, I’ve managed to say almost nothing on this blog so far about the Who dimension of my data. Nearly all of what I’ve written has been about the What, Where and When. It’s time to rebalance this equation.

Until recently, the Who dimension of my data was represented only by a pool of Australian news organisations (at more than 300 sources, it was admittedly a rather large pool), as I was working just with the data I retrieved from the Factiva news database. Now that I have incorporated additional data that I scraped from the websites of community, government and industry stakeholders (as discussed in my last post), the Who dimension has become a little bit richer. Before I start exploring questions about specific stakeholders and news organisations, or make decisions about which sources I might want to exclude altogether, I want to survey the full breadth of sources in my data. I want the bird’s-eye view. But how to get it?

Who × When ÷ Where = Wha…?

In the previous post, I listed all of my stakeholder sources in colourful tables showing the production of content over time. Initially I thought that doing the same thing with 300 news sources would be ridiculous, but then I figured it might just be ridiculous enough to work. Through a creative deployment of Excel’s conditional formatting feature, I managed to make what you see in Figure 1. Each horizontal band is an individual news source, and the darkness of the band corresponds with the number of articles produced by that source per quarter. Within each state, the sources are grouped by region, although I haven’t indicated where these groupings begin and end (maybe next time!).

Figure 1. The temporal coverage of all news sources in my corpus. Each horizontal band represents a news source, while the shading indicates the number of articles published per quarter.
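For anyone who would rather build this kind of grid in code than in Excel, here is a minimal Python sketch of the counting step behind it. It is not how Figure 1 was actually made (that was conditional formatting), and the sample records and function names are invented for illustration. It tallies articles per source per quarter and lays the tallies out as one row per source, ready to be shaded by whatever plotting tool you prefer.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records: (source name, publication date).
articles = [
    ("The Australian", date(2000, 2, 14)),
    ("The Australian", date(2000, 3, 1)),
    ("Courier-Mail", date(2000, 5, 20)),
    ("The Australian", date(2000, 8, 9)),
]

def quarter_label(d):
    """Return a label such as '2000-Q1' for a date."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"

def counts_by_source_and_quarter(records):
    """Count articles for each (source, quarter) pair."""
    return Counter((source, quarter_label(d)) for source, d in records)

def to_grid(counts):
    """Arrange counts as one row per source and one column per quarter."""
    sources = sorted({s for s, _ in counts})
    quarters = sorted({q for _, q in counts})
    grid = [[counts.get((s, q), 0) for q in quarters] for s in sources]
    return sources, quarters, grid

counts = counts_by_source_and_quarter(articles)
sources, quarters, grid = to_grid(counts)
for name, row in zip(sources, grid):
    print(f"{name:<16}", row)
```

Each row of `grid` corresponds to one horizontal band in the figure, and each number to the darkness of one cell. Note that quarters with no articles from any source are simply absent from the columns in this sketch; a fuller version would fill in the whole date range.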

For an experiment that I didn’t take very seriously, this viz actually isn’t too bad. It highlights several features of the data that are useful to know. Firstly, it shows that very few publications have been reporting on coal seam gas continuously since 2000. Nationally, there are The Australian, The Financial Review, Australian Associated Press, and Reuters News (these are not labelled on the graph, so you’ll have to take my word for it). In Queensland, there are the Courier-Mail, the Gold Coast Bulletin, and (to a lesser extent) the Townsville Bulletin. In New South Wales, there has been more-or-less continuous coverage from the Sydney Morning Herald, and somewhat patchier coverage from the Newcastle Herald. The long horizontal lines in the Victorian part of the chart represent the Herald Sun and The Age.

Adventures in harmonic space

Long, long ago, I studied music. In fact, when I finished high school, music was all I wanted to study. To be sure, I didn’t just want to study it: I wanted to compose it as well. 1 But I soon discovered that music theory was something worthy of study in itself, quite apart from the grounding it provided for composition. Music theory, especially the analysis of harmonies and harmonic progressions, provided a way to pop the hood on a piece of music (or even a whole genre) and learn what makes it tick. As if that weren’t exciting enough, I sensed that there were more profound truths waiting to be teased out of these harmonic structures. For if they offered clues about what makes music tick, then surely they said something about what makes us tick as well.

I never did pursue my vision of a grand unified theory of tonal harmony and psychoacoustics. I soon found that there were also other things worth studying, many of which came with the bonus incentive of career prospects. One thing led to another, and for better or worse, I ended up working for the government. And not as a music theorist. But to this day, I can’t help hearing a piece of music and thinking about what makes it tick. The theorist within me is always plugging away, even while the rest of me is just enjoying the tune.

Unsurprisingly then, when I started playing with network graphs about 18 months ago, among the first things I asked myself was what application they might have for music theory. The beauty of network graphs is that they can be used to represent just about anything. Any system or community of inter-related parts can be turned into a network of nodes and connections. So far on this blog I’ve used network graphs to explore the linkages among websites related to coal seam gas, and to identify clusters of documents containing duplicated text. On my other blog, I used network graphs to see how the names of different people and places featured across a collection of my posts.

In this post, I will use network graphs to visualise the relationships among chords within a piece of music. You could examine melodies in much the same way, by breaking them down to their individual notes and tracking which notes pair up and cluster together most often. But I suspect that there is more to be gained from visualising the harmonic relationships.
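To make the idea of chords as a network a little more concrete, here is a rough Python sketch. It assumes one plausible way of building such a graph, which is to treat each chord as a node and each move from one chord to the next as a weighted, directed connection; the excerpt above does not spell out the method, so take this as an illustration only. The chord progression is made up.

```python
from collections import Counter

# Hypothetical progression, for illustration only.
progression = ["C", "Am", "F", "G", "C", "F", "G", "C"]

def transition_counts(chords):
    """Count each consecutive (from, to) chord pair."""
    return Counter(zip(chords, chords[1:]))

# Each key is a directed edge between two chords; each value is its weight.
edges = transition_counts(progression)
for (src, dst), weight in edges.most_common():
    print(f"{src} -> {dst}: {weight}")
```

The resulting edge list could be loaded into any graph tool, with the weights controlling the thickness of each connection. The same function would work on a sequence of melody notes instead of chords.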

Notes:

  1. Eventually, years later, I did get around to writing some music. And I have finally published some of the results on YouTube.