Sunday, July 14, 2013

Proto-analysis of Boston Globe Traffic on Facebook

Update on 7/18/2013: In this post, you'll find a fair amount of explanation about statistics and key metrics. If you're already familiar with them, please refer to a neat summary published by the Nieman Journalism Lab.

Last week, I gave a little talk at the Boston Globe, presenting my preliminary analysis of how Boston Globe articles were perceived through its Facebook Page. Through my analysis, I hoped to answer two questions. What types of stories are shared by the Boston Globe staff on the social media platform? In turn, how do different types of shared stories affect Facebook users’ reading and sharing? By answering these two questions, I aimed to find out how well the staff’s intentions were aligned with readers’ interest as measured through three metrics offered by Facebook, and whether there were gaps between the intentions and perceptions that would signal room for improvement.


Highlights of the Study

  • I examined 215 stories shared in two weeks on the Facebook Page of the Boston Globe.
  • I found several attributes correlated with attention:
    • Image size (none, thumbnail, single-column, and double-column)
    • With or without a “breaking” label in the caption
    • Time of sharing (hour and weekday)
    • News topic defined by editors (business, metro, sports, etc.)
    • Related to the Boston Marathon bombing or not
  • There were gaps between staff’s efforts and Facebook users’ reading and sharing.


Facebook Insights and its Metrics


I exported the data through Facebook Insights, a built-in feature for Page administrators, to a spreadsheet file and later analyzed them in R, an open-source statistical tool. I kept the dataset fairly small to save time, especially since I cleaned data and labeled some of the variables manually, as automation was infeasible for them. In total, I examined 215 stories shared from May 7 to 21 this year.

My analysis was completely dependent on the three metrics Facebook Insights features: reach, engaged users, and talking about this. According to Facebook, reach is defined as “the number of unique people who have seen your post”; engaged users as “the number of unique people who have clicked on your post”; and talking about this as “the number of unique people who have created a story from your Page post. Stories are created when someone likes, comments on or shares your posts; answers a question you posted; or responds to your event”. These metrics are counted as absolute numbers of unique visitors in various ways and reflect user behavior from passive reading to proactive sharing.

The next section discusses statistical details that may not appear familiar to some people. Please click here to jump directly to the section on findings and implications.


Independent Variables, Dependent Variables, and Negative Binomial Regression

The statistical tool I used for this analysis is negative binomial regression, and I want to explain the two terms, regression and negative binomial, to justify my choice of research method. Regression is a statistical process employed to estimate the relationships among variables. Variables serve different functions in an analysis: some are labeled independent variables and others dependent variables. Dependent variables measure the outcomes we expect to increase or decrease, such as life expectancy, happiness, and crime rate. Independent variables measure the factors that affect, predict, or are associated with those outcomes, such as educational level, blood pressure, and police numbers. Independent and dependent variables are by no means predetermined; they are assigned freely to suit the research question. For instance, we can estimate a graduate’s income from her educational level, or estimate how likely someone is to hold a master’s degree given her income.

In my analysis of the Facebook data, I chose the three key metrics, namely reach, engaged users and talking about this, as dependent variables. The independent variables are different aspects of shared posts that possibly affect these outcomes. The aspects I included are news section, image size, “breaking” label, publication hour and weekday. In addition, I created a binary independent variable that marked stories as relevant or irrelevant to the Boston Marathon bombing, because this topic has been a beat followed closely by the Globe staff.

I chose regression because it allows for assessing the association of each independent variable with the dependent variable separately, while holding the other variables constant. This is very important for the analysis. For example, more black women were reported to die of breast cancer than white women. Could we then assume that, biologically, black women face a higher risk of the disease? Maybe not. If we include women’s occupation, education and income in the analysis, we could find that black and white women are not significantly different in developing breast cancer if they are at the same socioeconomic status (SES).

Taking the analysis of news stories as another example, we may observe that story A is read by more people than story B. Can we claim that story A is more interesting than story B? Again, maybe not. We may find story A was shared at 8 am, when people tend to check Facebook on their commute to work, whereas story B was shared at 11 am, when people are often busy working. Also, story A covers sports and story B covers international relations, and sports news is generally more popular than international news. Therefore, to control for the various aspects of news stories, I need to run a regression for more robust and reliable results.

On the question of which type of regression is most appropriate, a quick answer is Poisson regression, because it handles count data, such as how many times a week people watch TV, how many tornadoes break out in the US in a year, and how many people are waiting in front of you at a checkout. However, the data I collected violated an assumption of Poisson regression (equal mean and variance): the variance was much larger than the mean, a condition known as overdispersion. I therefore chose an alternative approach called negative binomial regression, which is designed to handle overdispersed count data. For those interested in a description of these and other analysis methods, UCLA shares a lot of tutorials on statistical analysis, including negative binomial regression.
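As a rough illustration of what overdispersion looks like, the sketch below compares the mean and variance of a handful of counts. The numbers are made up for illustration and are not the Globe data, and a raw variance-to-mean ratio is only a quick first check, not a substitute for a formal test on a fitted model.

```python
from statistics import mean, variance

# Made-up reach counts for illustration only; NOT the Boston Globe data.
reach = [1200, 950, 15400, 3100, 870, 22000, 1800, 2600]

m = mean(reach)
v = variance(reach)  # sample variance
dispersion_ratio = v / m

print(f"mean = {m:.1f}, variance = {v:.1f}, variance/mean = {dispersion_ratio:.1f}")

# Poisson regression assumes the variance is about equal to the mean.
# A ratio far above 1 hints at overdispersion, which is the situation
# negative binomial regression is meant to handle.
overdispersed = dispersion_ratio > 1
print("Looks overdispersed:", overdispersed)
```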

Coefficients generated by negative binomial regression are log ratios. To make the findings more comprehensible, in the following section, I present the ratios using the exponentiated coefficients.
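The conversion from a log ratio to a ratio and a percentage change can be sketched in a few lines. The function names and the example coefficient below are assumptions chosen for illustration; they are not values from my regression output.

```python
import math

def ratio_from_coef(coef):
    """Turn a log-ratio coefficient into a multiplicative ratio."""
    return math.exp(coef)

def percent_change(coef):
    """Express the same coefficient as a percentage increase or decrease."""
    return (math.exp(coef) - 1.0) * 100.0

# Hypothetical coefficient, chosen so that the ratio is exactly 1.6.
coef = math.log(1.6)

print(f"coefficient    = {coef:.3f}")
print(f"ratio          = {ratio_from_coef(coef):.2f}")
print(f"percent change = {percent_change(coef):.0f}%")
```

A coefficient of 0 gives a ratio of 1 (no difference), positive coefficients give ratios above 1, and negative coefficients give ratios below 1.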

Findings and Implications

This study was inspired by Facebook’s report on good practices for media companies. Facebook collected a sample of news institutes using Facebook Pages and reached the conclusions based on various practices of them. By contrast, my study was only focused on the Boston Globe and my findings were not always consistent with the suggestions given by Facebook.

“Breaking” Label

Facebook found that ‘posts that included “breaking” or “breaking news” received a 57% higher engagement over posts that were not identified as breaking news.’ In contrast, I did not find any significant difference in engaging users or going viral. The only difference I found was a significant increase in reach, by 60%. From this, we could infer that the “breaking” label increased reach without hurting engaged users or talking about this.

Image Size

In terms of illustrative images, four sizes can be observed in the posts on Facebook Pages: no image, thumbnail images, single-column images and double-column images. Double-column images cannot be seen in users’ news feeds and are only available on Facebook Pages. For research purposes, I retained “double-column” as an image size. From the following chart, you can see how image size affected the amount of attention drawn from Facebook users. The ratios are exponentiated coefficients.
  • Quite obviously, illustrating a story with an image was better than with no image.
  • A thumbnail image did not appear to make a significant difference compared with no image.
  • The larger an image was, the more popular a shared story was likely to be.

Marathon Bombing

The stories about the Boston Marathon bombing attracted significantly more attention on Facebook. Across the three key metrics, reach, engaged users, and talking about this, these stories increased the metrics by 31%, 97%, and 64%, respectively. However, when I looked at how users engaged through likes, comments and shares, I realized people didn’t necessarily “like” bombing-related stories. It’s not surprising, because “liking” a horrible story may create a cognitive conflict for some people and therefore they don’t feel comfortable “liking” it. Regarding comments and shares, bombing-related stories saw increases of 90% and 80%, respectively. Again, the ratios here are exponentiated coefficients.

Sharing Hour and Weekday

Because the data set spanned only two weeks, I don’t consider correlations with sharing weekday to be reliable. However, it’s large enough to compare the 24 hours of a day. The following chart shows how the stories were shared by the staff and perceived by Facebook users. From it, we can see:
  • More stories were shared during business hours.
  • However, across the three metrics, the performance was not great during business hours.
  • The traffic seemed to peak around 8 am and between 11 pm and 2 am EST.
    • West coasters may contribute to the after-midnight peak.

I talked with Joel Abrams at the Boston Globe about why peaks appeared in the early morning and late night. We came up with two theories for the phenomenon. First, people check Facebook more frequently before and after work, for instance, on their commute or in bed. Second, newsrooms share fewer stories during those “idling” hours because social media editors are not at work either. As such, those hours may see a shortage of new posts and therefore less competition for readers’ attention. In the future, we could experiment with sharing stories in the early morning and late night to see if we could boost traffic.

News Sections

There are in total 12 news sections predetermined by the Boston Globe staff: art, business, ideas, lifestyle, magazine, metro, news, opinion, slides, specials, sports, and upgrade. (Upgrade posts are advertising that invites people to upgrade their membership to subscribers.) The following chart shows how many stories the staff shared across topics and how different topics were associated with reach, engaged users and talking about this. Between the staff’s shares and the readers’ attention, there were in fact some gaps.

The regression analysis assessed with higher precision how different news sections affected stories' performance on Facebook. Art news was taken as the baseline and the other news sections were compared to it. The results are shown as ratios (e.g., 20% means only one fifth as good as art news, and 300% means three times as good as art news). Please note that the confidence intervals were exponentiated from regression estimates, which is why the upper part of each interval is wider than the lower part. Now we can rank news sections by their impact on performance:
  • Sorted by the amount shared by staff, high to low are:
    • Metro, sports, news, lifestyle, art, business, opinion, slides/magazine/upgrade, ideas, and specials.
  • Sorted by reach, top ones are:
    • Opinion, slides, lifestyle, and business
  • Sorted by engaged users, top ones are:
    • Opinion, metro, lifestyle, and business
  • Sorted by talking about this, top ones are:
    • Slides, opinion, sports, and metro.
  • The misalignment between staff’s shares and readers’ perception may be a starting point for adjustments.
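To see why an exponentiated confidence interval is lopsided, here is a minimal sketch. It assumes a standard Wald-style interval (estimate plus or minus 1.96 standard errors) on the log scale; the coefficient and standard error are hypothetical numbers, not taken from my results.

```python
import math

def exponentiated_interval(coef, std_err, z=1.96):
    """Build a symmetric interval on the log scale, then exponentiate it.

    Returns (lower, point, upper) on the ratio scale.
    """
    lower = math.exp(coef - z * std_err)
    point = math.exp(coef)
    upper = math.exp(coef + z * std_err)
    return lower, point, upper

# Hypothetical coefficient and standard error, for illustration only.
lower, point, upper = exponentiated_interval(coef=0.5, std_err=0.3)

print(f"ratio = {point:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
print(f"distance below the estimate: {point - lower:.2f}")
print(f"distance above the estimate: {upper - point:.2f}")
```

The interval is symmetric around the coefficient before exponentiation, but the exponential function stretches larger values more than smaller ones, so the upper side ends up wider on the ratio scale.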

To compare the two dimensions (staff’s posts and readers’ attention), I scatter-plotted them together on one chart. In this chart, the horizontal axis represents how many stories were shared by the staff, and the vertical axis denotes how the stories were perceived by Facebook readers, in terms of reach, engaged users, and talking about this. The data were log transformed so that the data points could be squeezed together for a more sensible view. The units didn’t matter here, because what we hope to see is the ratio of outcome to effort, or efficiency. To indicate the efficiency of the readers’ responses relative to the staff’s efforts, I roughly grouped the news topics into high, medium and low and colored the background yellow, grey and white. It appeared that, given the same number of posts, opinion engaged more users and photo slides tended to go more viral. Meanwhile, we could see that shared posts of opinion and photo slides were fairly scarce. There is a gap between the number of articles shared by section and the traffic they capture, and this could be a fruitful point of analysis for future adjustments in which articles to share. Specifically, this study suggests that more readers would be engaged if there were more posts of opinion, photo slides, business, and lifestyle.
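The idea of efficiency on a log-log chart can be sketched as follows. The section names come from the post, but the post counts and reach totals are invented for illustration, and treating the vertical axis as a per-section total is my assumption here.

```python
import math

# Invented (section, posts shared, total reach) rows; NOT the Globe data.
sections = [
    ("metro", 60, 300000),
    ("opinion", 8, 90000),
    ("slides", 5, 70000),
]

def efficiency(posts, outcome):
    """Outcome per shared post."""
    return outcome / posts

def log_point(posts, outcome):
    """Coordinates of a section on a log10-log10 scatter plot."""
    return math.log10(posts), math.log10(outcome)

# Rank sections from most to least efficient.
ranked = sorted(sections, key=lambda row: efficiency(row[1], row[2]), reverse=True)

for name, posts, outcome in ranked:
    x, y = log_point(posts, outcome)
    # On log axes, y - x equals log10(efficiency), so sections with the
    # same efficiency fall on the same diagonal line.
    print(f"{name:8s} efficiency = {efficiency(posts, outcome):8.0f}  y - x = {y - x:.2f}")
```

Because the vertical distance from the diagonal equals the log of the efficiency, bands of similar efficiency show up as diagonal stripes, which is what the background coloring approximates.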

Virality or Conversation Rate

The following chart shows a trend: when stories reached a larger number of readers, more readers engaged in more activities around the stories. Each dot represents one shared story. The relationship among reach, engaged users, and talking about this appears roughly linear. Meanwhile, we can easily discern some dots dangling beneath the trend lines, marked by the red circles. So why did those stories generate fewer activities?

The virality extent, or so-called conversation rate, helps to discover these underperforming stories. This metric is calculated as the ratio of talking about this to reach. I’ll list the least as well as the most conversational stories and give a quick summary of the observed patterns in the content.
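The conversation rate calculation can be sketched as below. The post identifiers and counts are hypothetical and only show the arithmetic; the field names are my own shorthand, not column names from the Facebook Insights export.

```python
def conversation_rate(talking_about_this, reach):
    """Ratio of 'talking about this' to 'reach' for one post."""
    if reach <= 0:
        return 0.0  # avoid dividing by zero for posts nobody saw
    return talking_about_this / reach

# Hypothetical posts for illustration only.
posts = [
    {"id": "post_a", "reach": 10000, "talking_about_this": 400},
    {"id": "post_b", "reach": 5000, "talking_about_this": 30},
    {"id": "post_c", "reach": 1500, "talking_about_this": 90},
]

# Sort from most to least conversational.
ranked = sorted(
    posts,
    key=lambda p: conversation_rate(p["talking_about_this"], p["reach"]),
    reverse=True,
)

for p in ranked:
    rate = conversation_rate(p["talking_about_this"], p["reach"])
    print(f'{p["id"]}: {rate:.1%}')
```

Note that a post with modest reach (post_c here) can still rank first, because the rate measures how much conversation each viewer generates rather than the raw volume.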

Most conversational stories

  1. Oklahoma City Thunder star Kevin Durant today pledged $1 million to recovery efforts after yesterday's devastating tornado.
  2. Romeo and Juliet, the swans who reside at the Boston Public Garden during the summer (and at Franklin Park Zoo during the winter), returned there today in a sign that the spring season is truly here. See photos:
  3. The lilacs are in full bloom at Arnold Arboretum.  This photo was taken yesterday, known officially as Lilac Sunday at the Arboretum.  Stop by if you have a chance.    Globe staff photo / Yoon S. Byun
  4. Say hello to the CapeFlyer. It had its inaugural run today and is scheduled to have its official debut next weekend, the first time in about 25 years service from Boston to Cape Cod will be offered.  Would you ride it?
  5. The Marathon bombing sheared off the right leg of Marc Fucarile (pictured, with his fiancee Jen Regan) in a millisecond. It spared the left, but not by much. Now, he and his family are in a painful waiting game to see if his “good” leg can be saved.
  6. A child was pulled from the rubble of Plaza Towers Elementary School in Moore, Okla., after an EF-4 tornado struck. The tornado, with winds up to 200 mph, was up to a mile wide and left behind large areas of devastation.
  7. “It was one of the greatest moments in Boston sports history,” writes the Globe’s Dan Shaughnessy about the Bruins’ thrilling win over the Maple Leafs. “And then came a miracle… the Bruins scored and scored and scored.”
  8. The Boston Athletic Association is inviting all runners who failed to finish 2013 Boston Marathon to run in next year's race.  This affects 5,633 runners.
  9. Brad Marchand scored the Bruins' game-winning goal over the Rangers at 15:40 of overtime. Story:    (Photo credit: AP)
  10. After learning she had an 87% chance of developing breast cancer, actress Angelina Jolie underwent a preventative double mastectomy.  Jolie shares her story in a powerful The New York Times op-ed today:     EPA photo

Least conversational stories

  1. Keith Reddin’s thriller “Almost Blue” at the Charlestown Working Theater, isn’t so much blue as noir
  2. #Recipe for paella-stuffed peppers
  3. New: Matthew Gilbert's Buzzsaw column. As the cult favorite, "Arrested Development," returns with a season-sized “episode dump,” Globe critic Matthew Gilbert asks, does giving viewers too much leave them with nothing to talk about?
  4. Make mom feel even more special with these stylish Mother’s Day gifts.
  5. The Phoenix Suns named 33-year-old Ryan McDonough, formerly of the Boston Celtics, as their new general manager.
  6. Album review: The soundtrack for Baz Luhrmann's film adaptation of "The Great Gatsby," curated by Jay-Z, is a fantastical reimagining of that era, putting ‘20s jazz in the modern context of pop and hip-hop. Oddly enough, the one thing the soundtrack is missing is heart.
  7. Creative restlessness and a sense of adventure are at the heart of Iron & Wine’s latest album, “Ghost on Ghost,” which Sam Beam will celebrate with a show at Berklee Performance Center tonight.
  8. Book review: The beloved author of “The Kite Runner,” Khaled Hosseini, returns to the rugged landscape of his home country, Afghanistan with "And the Mountains Echoed."
  9. Jon Lester gave up six runs in six innings in Chicago as the White Sox defeated the Red Sox, 6-4.
  10. Yahoo is buying Tumblr for $1.1 billion. Do you think this will help rejuvenate the Yahoo brand? Is Tumblr a good investment?

Here is my quick summary of patterns related to the conversational potential of stories.
  • Beautiful and pleasant stuff was the most conversational, such as photo slides.
  • Also highly conversational: there’s a problem, but there has been (or would be) a solution:
    • Tie but broken by miracle win in sports
    • Failed to finish marathon but were invited back to do it
    • Marathon bombing victims but were given medical care
    • Natural disaster but children were saved
    • Chance of cancer but intervention minimized it
  • The least conversational:
    • Arts related (music, movies, books, etc.)
    • Factual information (sports scores, settled business deals, etc.)
  • The high and low engagement is consistent with prior research that higher emotional reaction leads to more frequent expression.


Limitations and Future Research

  • Limitations
    • The data set is fairly small (n = 215)
    • Hence, more sampling errors and biases in results
    • I also did not examine how the frequency of shares would affect readers’ perceptions (the more shared stories the better, or vice versa, or doesn’t matter?)
  • Future research
    • Time-series data
    • Demographics (gender, age range, location, etc.)
    • Devices (web vs. mobile, platform types, etc.)
