
#MLA14: A First Look (IV)

The Story So Far

We have been looking at an archive of tweets tagged with #MLA14, the hashtag for the 2014 MLA (Modern Language Association) Annual Convention. The convention was held in Chicago from Thursday 9 to Sunday 12 January 2014. You can still browse or search 2014 sessions in the online Program.

The archive we studied comprises a dataset of 27,491 unique tweets, collected between Sunday September 01 2013 at 20:35:07 and Wednesday January 15 2014 at 16:16:41 Central Time.

The dataset studied in this series of posts was collected and cleaned by Chris Zarate and myself.

After deduplication we were down to 27,491 tweets. A subset containing only the tweets posted during the actual convention days totals 21,915 tweets.
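As a rough illustration of the two steps just described, the following Python sketch deduplicates tweets by ID and then keeps those posted on the convention days. The record layout (an `id` field and a `created_at` datetime already expressed in Central Time) and the function names are assumptions made for the example; they do not describe the actual archive or the process we followed.

```python
from datetime import date, datetime

# Convention days given in the post: 9 to 12 January 2014.
CONVENTION_START = date(2014, 1, 9)
CONVENTION_END = date(2014, 1, 12)


def deduplicate(tweets):
    """Keep the first occurrence of each tweet ID (hypothetical 'id' field)."""
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet["id"] not in seen:
            seen.add(tweet["id"])
            unique.append(tweet)
    return unique


def convention_subset(tweets):
    """Keep tweets whose date (assumed Central Time) is a convention day."""
    return [
        t for t in tweets
        if CONVENTION_START <= t["created_at"].date() <= CONVENTION_END
    ]


# Tiny made-up sample, not real data from the archive.
sample = [
    {"id": 1, "created_at": datetime(2014, 1, 8, 23, 59)},
    {"id": 2, "created_at": datetime(2014, 1, 9, 8, 30)},
    {"id": 2, "created_at": datetime(2014, 1, 9, 8, 30)},  # duplicate
    {"id": 3, "created_at": datetime(2014, 1, 12, 17, 0)},
]
unique = deduplicate(sample)
during = convention_subset(unique)
print(len(unique), len(during))
```

Deduplicating on the tweet ID rather than on the text keeps retweets and identically worded tweets as separate records, which matters if RTs are meant to be counted.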

We have been offering some key figures and some basic visualisations of the data.

For the first part of this series, click here.

For the second part of this series, click here.

For the third part of this series, click here.

Text Analysis

We used Voyant Tools (previously the unfortunately named Voyeur), a web-based reading and analysis environment for digital texts developed by Stéfan Sinclair and Geoffrey Rockwell, to obtain the most frequent words in the text of all the tweets (including RTs and replies) posted with #MLA14 on each day of the convention.

Below we share some word clouds to visualise this. As most people now know, word clouds are visual presentations of keywords extracted from a text, differentiated visually according to their position and frequency of use in that text. Voyant uses Cirrus, which is a “visualization tool that displays a word cloud relating to the frequency of words appearing in one or more documents. […] The larger the word, the more frequent the term.”

In this case we are sharing static image files exported from Voyant itself. We are also including the top 5 most frequent words in each set of tweets. In all cases we applied globally an English (“Taporware”) stop word list, customised to include words like #mla14, MLA, RT, panel, session, http, t.co, etc.

Numbered hashtags corresponding to sessions were not included in the stop word list, as one of our intentions was to reveal which sessions were mentioned most frequently each day. (To find out which session corresponds to each numbered hashtag, check the online Program.)
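For readers who want to try a similar count outside Voyant, here is a minimal Python sketch of a frequency count with a stop word list that leaves numbered session hashtags in. The tokenisation rule, the short stop word set and the sample tweets are all assumptions made for the example; Voyant's own tokeniser and the full list we used will differ, so the figures would not necessarily match.

```python
import re
from collections import Counter

# Illustrative stop words only; the customised list we used was longer.
STOP_WORDS = {
    "#mla14", "mla", "rt", "panel", "session", "http", "t.co",
    "the", "a", "of", "and", "to", "in", "on", "is",
}

# Tokens: hashtags, @mentions and words (dots allowed inside, e.g. t.co).
TOKEN_RE = re.compile(r"[#@]?\w+(?:\.\w+)*")


def top_words(texts, n=5):
    """Return the n most frequent tokens that are not stop words."""
    counts = Counter()
    for text in texts:
        for token in TOKEN_RE.findall(text.lower()):
            if token not in STOP_WORDS:
                counts[token] += 1
    return counts.most_common(n)


# Made-up tweets, not taken from the archive.
sample_texts = [
    "RT @someone: Great #s80 panel on digital humanities #MLA14",
    "#s80 digital humanities and data http://t.co/abc",
    "More on #s80 and data",
]
print(top_words(sample_texts, 1))
```

Because session hashtags such as #s80 are not in the stop word set, they surface in the ranking alongside ordinary words, which is the behaviour described above.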

Limitations and Fair Warning

After running the four corpora through Voyant more than once, we discovered that the tool did not reproduce the same results from run to run, particularly for the word and unique word counts. The top 5 most frequent words changed only minimally between runs, which suggests that the results we share in that regard are reasonably reliable, though not exact.

We were understandably disappointed at the failure to ensure reproducibility using the same corpora and the same tool (we don’t consider any of the corpora too large for reliable text analysis). We will keep looking into it, will keep aiming to reproduce the results with different tools, and will post any findings here.

Here we present, as a research progress update only, the figures and clouds obtained in the fourth trial, after clearing caches and checking that the corpora were complete.
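One way to check reproducibility independently of any particular tool is to compute the counts with a fixed, explicit tokenisation rule and to record a checksum of the input, so that two runs can be shown to have used identical corpora. The sketch below is only an assumption about how such a check could look; it is not how Voyant counts words, and its numbers would not be directly comparable with the figures reported in this post.

```python
import hashlib
import re

# A deliberately simple, fixed tokenisation rule (an assumption, not Voyant's).
TOKEN_RE = re.compile(r"[#@]?\w+")


def corpus_stats(texts):
    """Return (checksum, total words, unique words) for a list of tweet texts."""
    digest = hashlib.sha256()
    total = 0
    vocabulary = set()
    for text in texts:
        # Hash the raw text so any change in the corpus changes the checksum.
        digest.update(text.encode("utf-8"))
        digest.update(b"\n")
        tokens = TOKEN_RE.findall(text.lower())
        total += len(tokens)
        vocabulary.update(tokens)
    return digest.hexdigest(), total, len(vocabulary)


# Made-up corpus, not taken from the archive.
corpus = ["Digital humanities at #s80", "More digital work at #s80"]
first = corpus_stats(corpus)
second = corpus_stats(corpus)
print(first == second, first[1], first[2])
```

If the checksums of two trials match but the counts differ, the variation comes from the tool rather than from the corpus; if the checksums differ, the corpora were not the same to begin with.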

Thursday 9 January 2014

Total number of tweets: 4,558

Total number of words: 71,630

Total number of unique words: 9,142

Top 5 most frequent words in the corpus: #s80 (271), #s66 (199), humanities (188), #s130 (156), #s173 (150).

#MLA14 Thursday 9 January Cirrus Word Cloud. Retrieved January 22, 2014 from http://voyeurtools.org/tool/Cirrus/

Friday 10 January 2014

Total number of tweets: 7,417

Total number of words: 131,500

Total number of unique words: 13,367

Top 5 most frequent words in the corpus: data (381), #s299 (378), students (354), #s339 (342), reading (342).

#MLA14 Friday 10 January Cirrus Word Cloud. Retrieved January 22, 2014 from http://voyeurtools.org/tool/Cirrus/

Saturday 11 January 2014

Total number of tweets: 6,265

Total number of words: 112,482

Total number of unique words: 11,954

Top 5 most frequent words in the corpus: #s577 (562), digital (543), work (413), humanities (340), #medievaltwitter (283).

#MLA14 Saturday 11 January Cirrus Word Cloud. Retrieved January 22, 2014 from http://voyeurtools.org/tool/Cirrus/

Sunday 12 January 2014

Total number of tweets: 3,675

Total number of words: 66,426

Total number of unique words: 8,206

Top 5 most frequent words in the corpus: #s679 (626), digital (266), #s738 (212), @adelinekoh (174), #s708 (173).

#MLA14 Sunday 12 January Cirrus Word Cloud. Retrieved January 22, 2014 from http://voyeurtools.org/tool/Cirrus/
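The daily tweet totals reported above can be cross-checked against the size of the convention-days subset. The short Python sketch below does that with the figures from this post; the variable names are ours, and the words-per-tweet ratio is only a derived illustration that inherits the reproducibility caveats already described.

```python
# Daily figures as reported in this post: (tweets, words, unique words).
DAILY = {
    "Thursday 9 January": (4558, 71630, 9142),
    "Friday 10 January": (7417, 131500, 13367),
    "Saturday 11 January": (6265, 112482, 11954),
    "Sunday 12 January": (3675, 66426, 8206),
}

# The four daily tweet counts should add up to the convention-days subset.
total_tweets = sum(tweets for tweets, _, _ in DAILY.values())
print(total_tweets)  # 21915

# Derived ratio, for illustration only.
for day, (tweets, words, _) in DAILY.items():
    print(f"{day}: {words / tweets:.1f} words per tweet")
```

Unique word counts are deliberately not summed, since the same word can appear on several days and would be counted more than once.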

 

Tool Citation

Sinclair, S. and G. Rockwell (2014). Voyant Tools: Reveal Your Texts. Voyant. Retrieved January 22, 2014 from http://voyeurtools.org/


#MLA14: A First Look (II)

[For the first part of this series, click here.

For the third part of this series, click here.]

As we said in the previous post, the dataset we have includes 27,491 unique tweets, collected between Sunday September 01 2013 at 20:35:07 and Wednesday January 15 2014 at 16:16:41 Central Time.

(Needless to say, Twitter activity with #MLA14 has continued, but Wednesday January 15 at 16:16:41 is when the archive we are focusing on ends.)

Another Finding: How Many Unique Twitter Usernames

There are 3,545 unique usernames in the dataset. Naturally, not all users tweet as much or as often as each other.

This number does not mean that 3,545 unique “real” people tweeted with the hashtag. Some Twitter users participated in the backchannel with more than one username or account (for example, a personal one and an organisational or institutional one), and this is not always easy to identify. It is also possible that more than one “real” person manages a single account.
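Counting unique usernames is simple to sketch in Python. The field name `from_user` and the sample records below are hypothetical and do not reflect the actual structure of our archive. Handles are lowercased before counting because Twitter usernames are not case-sensitive. As noted above, a count like this measures accounts, not people.

```python
from collections import Counter


def username_counts(tweets):
    """Count tweets per username, ignoring case (hypothetical 'from_user' field)."""
    return Counter(t["from_user"].lower() for t in tweets)


# Made-up records, not taken from the archive.
sample_tweets = [
    {"from_user": "ExampleUser"},
    {"from_user": "exampleuser"},
    {"from_user": "another_user"},
]
counts = username_counts(sample_tweets)
print(len(counts))            # number of unique usernames
print(counts.most_common(1))  # most active username and its tweet count
```

The same counter also gives the distribution of activity per account, which is one way to see that not all users tweet with the same frequency.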

The following chart compares the number of Twitter usernames that tweeted with #MLA14 during the period of collection described above with the official number of registered participants in the program* and an approximate number of paid attendees.

#MLA14 Comparative Participants Chart, CC-BY Chris Zarate and Ernesto Priego

Many questions arise about the relationships between those attending the convention, those registered in the program (who are a subset of the former) and those participating via the backchannel.

Determining nuanced relationships between the groups might shed some light on the role of tweeting within the context of the convention and of live-tweeting from the convention itself. Is the backchannel a significant method of “amplification” beyond the convention’s venue? Can current data answer this question and help lay out trends for the future?

There are of course many other questions arising from the data. We’ll be looking at them gradually, some here and hopefully with more detail in a future publication.

*Chris Zarate released program data on GitHub in XML and JSON format: https://github.com/mlaa/mla14.org

Please check John Mulligan’s blog for some very interesting visualisations of scholarly networks including #MLA14.