Post by Minna Krejci
The average person might not be familiar with many of the different methods that exist for visualizing data. But chances are, you’ve seen word clouds.
As more and more information becomes readily available and people have less time and attention to give to any one item (an article, a website, a blog post), it becomes more important to make information easy to take in at a glance. A word cloud is a visual tool that lets you quickly get a sense of a collection of words, such as the text of a website, an article, a speech, or a document. Usually, the words are sized according to how often they appear (i.e., the words that are used the most are the biggest).
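For the curious, the sizing idea is simple enough to sketch in a few lines of Python. This is only a rough illustration, not how any particular word cloud site works: the function name `word_sizes`, the tiny stop-word list, the point sizes, and the straight-line scaling from count to font size are all my own assumptions.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; real tools use much longer ones.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def word_sizes(text, min_size=10, max_size=48, top_n=20):
    """Map the most frequent words in `text` to font sizes (in points)."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)

    top = counts.most_common(top_n)
    if not top:
        return {}

    highest = top[0][1]   # count of the most frequent word
    lowest = top[-1][1]   # count of the least frequent word we keep
    span = highest - lowest

    sizes = {}
    for word, count in top:
        # Assumed linear scaling: most frequent word gets max_size,
        # least frequent gets min_size. If all counts tie, use max_size.
        scale = (count - lowest) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


print(word_sizes("coffee coffee coffee strong strong bitter"))
# {'coffee': 48, 'strong': 29, 'bitter': 10}
```

A real generator still has to lay the words out on the page, which is the hard part; this only covers the "bigger means more frequent" rule.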
Let’s play a little game. Can you guess what the following word clouds represent? (Keep reading to see the answers and the sources.)
Answers in order (click on the links for the source and more information):
Beatles UK hit singles 1962-1970
Words used to describe coffee
Words from a company’s customer satisfaction survey
The US Constitution
Terms related to trigonometry
Terms related to Web 2.0
The lyrics of John Lennon’s “Imagine”
Now that you know the answers, go back and take a look at the word clouds again. Some of these were definitely harder than others, but I think it’s pretty clear that you can gather a lot of information about the subject rather quickly.
This method of representing data can be quite powerful. Its use has exploded recently in politics, as a way to quickly represent the platforms of different candidates or to summarize a speech. Personally, I think it runs the risk of overuse in this context: can we really learn everything we need to know about a person’s viewpoint just by looking at how often he or she uses certain words? It’s useful, sure, but it seems like the arrangement of the words should matter as well… isn’t that why we learned grammar?
Just for fun, let’s take a look at our own MASI blog and see what we get. I made this cloud at www.tocloud.com, using the first two pages of posts on our blog: