Approaches to analyzing Twitter data

 


This topic describes how you might choose to analyze and explore a dataset containing social media data from Twitter.

For information on how to collect and import Twitter data, refer to Import from Twitter.

Explore Twitter data in Detail View

When you open the dataset in Detail View, you can visually explore it. Click the tabs in Detail View to get a different perspective on your data:

  • Form  View the data one record at a time, laid out as a form.

  • Chart  Display a chart of your Twitter data—refer to Visualize Twitter data as a chart for more information.

  • Cluster Analysis  Display a diagram that helps you see patterns in the data, for example, which Twitter users used similar words. For more information, refer to Visualize Twitter data with cluster analysis.

  • Map (NVivo 10 for Windows Service Pack 2 or later)  Geovisualize the data—for example, to see the geographic spread of social media commentators. For more information, refer to Geovisualize your social media data.

You can also run queries to find and code at themes in your data.

Gather Twitter data over time

Each time you capture Twitter data, a new NCapture file is created. When you import NCapture files into your project, by default, any matching social media datasets are merged together.

The only time you can merge matching social media datasets is when you import from NCapture. If you choose not to merge matching social media datasets during import, then you will not be able to merge them later in NVivo.

Matching datasets do not need to have the same names. To be considered matching, the social media properties of the datasets need to be the same, for example, both based on the same hashtag search in Twitter.

Matching datasets captured at different times may include some of the same content. When matching Twitter datasets are merged, any duplicate content is removed.

For example, imagine that you capture Twitter data for the hashtag #climate on Monday and import the NCapture file into your project. Then, on Tuesday and again on Wednesday you also capture Tweets based on the hashtag #climate. When you import these NCapture files into your project, by default, the Tweets from Tuesday and Wednesday are merged together with the dataset from Monday to create a single dataset. You can also view a timeline to see trends over time—refer to Display a chart of Tweets over time.
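The merge behavior in this example can be pictured with a small Python sketch. It is not NVivo's implementation: the Tweet and Dataset classes, the import_capture function, and the idea that duplicates are detected by a tweet ID are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tweet:
    tweet_id: str  # hypothetical unique ID, assumed here to identify duplicates
    username: str
    text: str


@dataclass
class Dataset:
    search: str  # stands in for the dataset's "social media properties"
    tweets: list = field(default_factory=list)


def import_capture(project: dict, capture: Dataset, merge: bool = True) -> None:
    """Add a capture to the project, merging into a matching dataset if asked."""
    existing = project.get(capture.search)
    if merge and existing:
        target = existing[0]
        seen = {t.tweet_id for t in target.tweets}
        for tweet in capture.tweets:
            if tweet.tweet_id not in seen:  # skip content already imported
                target.tweets.append(tweet)
                seen.add(tweet.tweet_id)
    else:
        # No matching dataset, or merging switched off: keep a separate dataset.
        project.setdefault(capture.search, []).append(capture)


project = {}
monday = Dataset("#climate", [Tweet("1", "Person1", "a"), Tweet("2", "Person2", "b")])
tuesday = Dataset("#climate", [Tweet("2", "Person2", "b"), Tweet("3", "Person2", "c")])
import_capture(project, monday)
import_capture(project, tuesday)
merged_ids = [t.tweet_id for t in project["#climate"][0].tweets]
print(merged_ids)
```

In this sketch the two captures end up as one dataset holding Tweets 1, 2 and 3, with the repeated Tweet 2 stored once. Calling import_capture with merge=False would instead leave two separate #climate datasets.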

If you want to merge matching datasets, make sure the Merge matching social media datasets (including previously imported) check box is selected in the Import from NCapture dialog box. Otherwise, new datasets are created when you import subsequent NCapture files.

Exclude biographical information when you import the data

By default, when you import data from Twitter, biographical information (location and web address) about the users is imported together with their Tweets.

If you do not want to bring this information into your project—for example, if it is not relevant to your research—you can set your preferences for importing biographical information on the Social Media Datasets tab in the Project Properties dialog box.

For example, you might want to bring in the location whenever you import Twitter data and exclude the other biographical fields.

Refer to Set project properties for more information.

Visualize Twitter data as a chart

Click the Chart tab in Detail View to display a chart of your Twitter data. You can make changes to the chart, for example, to display Tweets over time or to compare Twitter users by number of followers/following.

Display a chart of Tweets over time

You can chart Tweets in a timeline to see trends over time—for example, if there is an increase in the number of Tweets on a specific day, you may want to investigate further.

For Twitter datasets containing a User Stream (posts from a specific Twitter user):

  • Click the Chart tab in Detail View. A chart with a timeline on the X-axis is displayed.

For other Twitter datasets (based on a search, favorites, or a list):

  1. Click the Chart tab in Detail View.

  2. On the Chart tab, in the Options group, click Select Data.

The Chart Options dialog box opens.

  3. Under X-axis, choose a timeline option.

  4. Click OK.

NOTE  Double-click a data point (for example, a bar or column) to see the underlying data. The resulting data is displayed in Detail View.

For more information on customizing charts, refer to Change the appearance or content of a chart.
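To make the idea of a timeline chart concrete, here is a minimal Python sketch that counts Tweets per day, which is roughly what a chart with a timeline on the X-axis summarizes. The timestamps and their format are made up for illustration, and tweets_per_day is a hypothetical helper, not part of NVivo.

```python
from collections import Counter
from datetime import datetime

# Hypothetical (timestamp, text) rows; real exports may use another format.
tweets = [
    ("2013-03-04 09:15", "Study: rising sea levels threaten island communities."),
    ("2013-03-05 08:02", "Record high temperatures recorded in the Arctic."),
    ("2013-03-05 11:47", "We need to act now."),
    ("2013-03-05 17:30", "More on today's report."),
    ("2013-03-06 10:05", "Follow-up discussion."),
]


def tweets_per_day(rows):
    """Return a dict mapping each date to its Tweet count, in date order."""
    counts = Counter(
        datetime.strptime(stamp, "%Y-%m-%d %H:%M").date() for stamp, _ in rows
    )
    return dict(sorted(counts.items()))


daily = tweets_per_day(tweets)
for day, count in daily.items():
    # A crude text "bar chart": one # per Tweet on that day.
    print(day.isoformat(), "#" * count)
```

A day whose bar is noticeably longer than its neighbors is the kind of spike you might want to investigate further.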

Compare Twitter users by number of followers/following

You can view a chart that compares Twitter users by the number of followers and number following.

  1. Click the Chart tab in Detail View to display a chart of the dataset.

  2. On the Chart tab, in the Options group, click Select Data.

The Chart Options dialog box opens.

  3. Under X-axis, ensure that User names is selected.

  4. Under Y-axis, select Number of followers/following.

  5. Click OK.

Visualize Twitter data with cluster analysis

You can click the Cluster Analysis tab to see a diagram that can help you to see patterns in the data. For example, you can see which Twitter users used similar words.

You can also display other items on the cluster analysis diagram.

Display other items on the cluster analysis diagram

Cluster analysis enables you to see patterns in your Twitter data. By default, usernames are compared by similarity of the words in their Tweets. You can also display other items on the cluster analysis diagram, for example, to answer questions such as "Which other hashtags are similar to #climate?"

To display hashtags on the cluster analysis diagram:

  1. Click the Cluster Analysis tab in Detail View.

  2. On the Cluster Analysis tab, in the Options group, click Select Data.

The Cluster Analysis Options dialog box opens.

  3. In the Display items list, select Hashtags.

  4. Click OK.
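The notion of comparing items "by similarity of words" can be sketched in Python as follows. This topic does not say which similarity measure NVivo applies, so the Jaccard coefficient below is an assumption chosen for illustration, as are the sample Tweets and the helper names words_by_user and jaccard.

```python
import re
from itertools import combinations

# Hypothetical (username, Tweet text) rows.
tweets = [
    ("Person1", "Rising sea levels threaten island communities"),
    ("Person2", "Sea levels are rising faster than expected"),
    ("Person3", "Great coffee this morning"),
]


def words_by_user(rows):
    """Collect the set of lower-cased words each user has tweeted."""
    vocab = {}
    for user, text in rows:
        vocab.setdefault(user, set()).update(re.findall(r"[a-z']+", text.lower()))
    return vocab


def jaccard(a, b):
    """Shared words divided by all distinct words; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


vocab = words_by_user(tweets)
for first, second in combinations(sorted(vocab), 2):
    print(first, second, round(jaccard(vocab[first], vocab[second]), 2))
```

Pairs with a higher score share more vocabulary and would sit closer together on a cluster diagram. The same comparison works for hashtags if the rows are keyed by hashtag instead of username.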

Gather Tweets by Username, Hashtag or other predefined columns

Do you want to gather Tweets from a particular user or hashtag? You can use auto coding to gather Tweets from predefined columns, for example, user or hashtag.

The table below is a simplified example of a dataset containing Twitter data.

The Username and Hashtags columns are classifying fields and the Tweet column is a codable field. Whether a column is codable or classifying is predetermined and cannot be changed.

Username | Tweet | Hashtags
Person1 | Study: rising sea levels threaten island communities. #climate bit.lyxfgn6B | climate
Person2 | Record high temperatures recorded in #arctic due to #climate change. | arctic, climate
Person2 | We need to act now to slow the effects of #climate change. | climate

Gather Tweets for each user into a node

If you auto coded this dataset by Username, you would create the following node hierarchy:

  • Twitter

      • Username

          • Person1

          • Person2

The case nodes (Person1 and Person2) are classified as 'Twitter User', and information from the user's profile (for example, Bio and Number of Followers) is stored as attribute values.

Gather Tweets for each hashtag into a node

If you auto coded this dataset by Hashtag, you would create the following node hierarchy:

  • Twitter

      • Hashtag

          • climate

          • arctic

NOTE You can choose to code based on other predefined columns—for example, Location or Tweet Type (Tweet/Retweet).
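The grouping that auto coding performs can be pictured with a short Python sketch. It only mimics the resulting hierarchy and is not how NVivo is implemented: auto_code is a hypothetical function, and the rows are an abridged version of the sample table above.

```python
from collections import defaultdict

# Abridged rows from the sample table; Hashtags holds one or more values.
rows = [
    {"Username": "Person1",
     "Tweet": "Study: rising sea levels threaten island communities. #climate",
     "Hashtags": ["climate"]},
    {"Username": "Person2",
     "Tweet": "Record high temperatures recorded in #arctic due to #climate change.",
     "Hashtags": ["arctic", "climate"]},
    {"Username": "Person2",
     "Tweet": "We need to act now to slow the effects of #climate change.",
     "Hashtags": ["climate"]},
]


def auto_code(rows, column):
    """Group Tweet text under Twitter > column > value, like a node hierarchy."""
    nodes = defaultdict(list)
    for row in rows:
        values = row[column]
        if not isinstance(values, list):
            values = [values]  # single-valued column such as Username
        for value in values:
            nodes[value].append(row["Tweet"])
    return {"Twitter": {column: dict(nodes)}}


by_user = auto_code(rows, "Username")
by_hashtag = auto_code(rows, "Hashtags")
print(sorted(by_user["Twitter"]["Username"]))
print({tag: len(found) for tag, found in by_hashtag["Twitter"]["Hashtags"].items()})
```

A Tweet with several hashtags is gathered under each of them, so the second sample Tweet appears under both climate and arctic. Passing a different column name groups the Tweets by that column instead.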

NVivo provides a Wizard to guide you through the process of auto coding. Refer to Automatic coding in dataset sources for more information.
