Approaches to analyzing Twitter data
This topic describes how you might choose to analyze and explore a dataset containing social media data from Twitter.
For information on how to collect and import Twitter data, refer to Import from Twitter.
What do you want to do?
- Explore Twitter data in Detail View
- Gather Twitter data over time
- Exclude biographical information when you import the data
- Visualize Twitter data as a chart
- Display a chart of Tweets over time
- Compare Twitter users by number of followers/following
- Visualize Twitter data with cluster analysis
- Display other items on the cluster analysis diagram
- Gather Tweets by Username, Hashtag or other predefined columns
Explore Twitter data in Detail View
When you open the dataset in Detail View, you can visually explore it. You can also:
- Use the sort or filter functions to see patterns in your data. For example, you can filter Tweets to show only those made by a specific user or during a specific date range.
- Hide columns to limit the amount of data you are looking at. For example, you could hide the Tweet ID and Location columns.
- Reorder columns, for example, to move the Number of Followers column next to the Username column.
- Adjust the column width, for example, to expand the column containing the Tweets.
- Manually code Twitter data at nodes representing themes. Refer to Basic Coding in dataset sources for more information.
- Use automatic coding techniques to perform broad-brush coding of the Tweets. Refer to Gather Tweets by Username, Hashtag or other predefined columns.
You can click the tabs in Detail View to get a different perspective of your data.
- Form: View the data one record at a time, laid out as a form.
- Chart: Display a chart of your Twitter data. Refer to Visualize Twitter data as a chart for more information.
- Cluster Analysis: Display a diagram that can help you see patterns in the data, for example, which Twitter users used similar words. For more information, refer to Visualize Twitter data with cluster analysis.
- Map (NVivo 10 for Windows Service Pack 2 or later): Geovisualize the data, for example, to see the geographic spread of social media commentators. For more information, refer to Geovisualize your social media data.
You can also run queries to find and code at themes in your data:
- Run a Word Frequency query to identify common themes.
- Run a Text Search query to find all instances of a particular word or phrase.
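To make the idea behind these two queries concrete, the following is a minimal Python sketch that works outside NVivo. It is not how NVivo implements its queries. The sample Tweets, the stop-word list, and the function names `word_frequency` and `text_search` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical sample Tweets; in practice these would come from the Tweet column.
tweets = [
    "Study: rising sea levels threaten island communities. #climate",
    "Record high temperatures recorded in #arctic due to #climate change.",
    "We need to act now to slow the effects of #climate change.",
]

# A small illustrative stop-word list (an assumption, not NVivo's list).
STOP_WORDS = {"the", "to", "of", "in", "we", "due", "now"}


def word_frequency(texts, min_length=3):
    """Count how often each word appears across all texts."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if len(word) >= min_length and word not in STOP_WORDS:
                counts[word] += 1
    return counts


def text_search(texts, phrase):
    """Return the texts that contain the phrase, ignoring case."""
    needle = phrase.lower()
    return [text for text in texts if needle in text.lower()]


frequencies = word_frequency(tweets)
print(frequencies.most_common(2))
print(text_search(tweets, "climate change"))
```

The most frequent words hint at candidate themes, and the search results show which Tweets you might code at a node for that theme.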
Gather Twitter data over time
Each time you capture Twitter data, a new NCapture file is created. When you import NCapture files into your project, by default, any matching social media datasets are merged together.
The only time you can merge matching social media datasets is when you import from NCapture. If you choose not to merge matching social media datasets during import, then you will not be able to merge them later in NVivo.
Matching datasets do not need to have the same names. To be considered matching, the social media properties of the datasets need to be the same, for example, both datasets are based on the same hashtag search in Twitter.
Matching datasets captured at different times may include some of the same content. When matching Twitter datasets are merged, any duplicate content is removed.
For example, imagine that you capture Twitter data for the hashtag #climate on Monday and import the NCapture file into your project. Then, on Tuesday and again on Wednesday you also capture Tweets based on the hashtag #climate. When you import these NCapture files into your project, by default, the Tweets from Tuesday and Wednesday are merged together with the dataset from Monday to create a single dataset. You can also view a timeline to see trends over time—refer to Display a chart of Tweets over time.
If you want to merge matching datasets, make sure the Merge matching social media datasets (including previously imported) check box is selected on the Import from NCapture dialog box. Otherwise, new datasets will be created when you import subsequent NCapture files.
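The effect of merging matching captures and removing duplicate content can be pictured with a short Python sketch. It assumes that duplicates are identified by Tweet ID, which is an assumption for illustration only, because this topic does not describe the rule NVivo applies. The record layout and the function name `merge_captures` are hypothetical.

```python
def merge_captures(*captures):
    """Merge lists of Tweet records, keeping one copy of each Tweet ID."""
    merged = {}
    for capture in captures:
        for tweet in capture:
            # setdefault keeps the first copy seen and ignores later duplicates.
            merged.setdefault(tweet["tweet_id"], tweet)
    return [merged[tweet_id] for tweet_id in sorted(merged)]


# Hypothetical captures of the same #climate search on three days.
monday = [
    {"tweet_id": 101, "text": "Sea levels are rising. #climate"},
    {"tweet_id": 102, "text": "Record heat again. #climate"},
]
tuesday = [
    {"tweet_id": 102, "text": "Record heat again. #climate"},
    {"tweet_id": 103, "text": "Act now. #climate"},
]
wednesday = [
    {"tweet_id": 103, "text": "Act now. #climate"},
    {"tweet_id": 104, "text": "New study released. #climate"},
]

dataset = merge_captures(monday, tuesday, wednesday)
print([tweet["tweet_id"] for tweet in dataset])
```

Six captured records collapse into four unique Tweets, which mirrors the single merged dataset described in the example above.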
Exclude biographical information when you import the data
By default, when you import data from Twitter, biographical information (location and web address) about the users is imported together with their Tweets.
If you do not want to bring this information into your project—for example, if it is not relevant to your research—you can set your preferences for importing biographical information on the Social Media Datasets tab in the Project Properties dialog box.
For example, you might want to bring in the location whenever you import Twitter data and exclude the other biographical fields.
Refer to Set project properties for more information.
Visualize Twitter data as a chart
Click the Chart tab in Detail View to display a chart of your Twitter data. You can make changes to the chart—for example, you can:
- Compare the number of references (Tweets) by hashtag or location. Refer to Change the appearance or content of a chart.
- Select a different chart type, for example, a pie chart. Refer to Change the appearance or content of a chart.
Display a chart of Tweets over time
You can chart Tweets in a timeline to see trends over time—for example, if there is an increase in the number of Tweets on a specific day, you may want to investigate further.
For Twitter datasets containing a User Stream (posts from a specific Twitter user):
- Click the Chart tab in Detail View. A chart with a timeline on the X-axis is displayed.
For other Twitter datasets (based on a search, favorites, or a list):
- Click the Chart tab in Detail View.
- On the Chart tab, in the Options group, click Select Data. The Chart Options dialog box opens.
- Under X-axis, choose a timeline option.
- Click OK.
NOTE: Double-click a data point (for example, a bar or column) to see the underlying data. The resulting data is displayed in Detail View.
For more information on customizing charts, refer to Change the appearance or content of a chart.
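If you want to check a timeline outside NVivo, the counting behind a chart of Tweets over time can be sketched in a few lines of Python. The timestamps, their format, and the function name `tweets_per_day` are assumptions for illustration. They do not reflect how NVivo stores or charts the data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical Tweet timestamps, as they might appear in a time column.
timestamps = [
    "2014-03-03 09:15",
    "2014-03-03 17:40",
    "2014-03-04 08:05",
    "2014-03-05 12:30",
    "2014-03-05 12:45",
    "2014-03-05 19:10",
]


def tweets_per_day(stamps, fmt="%Y-%m-%d %H:%M"):
    """Count Tweets for each calendar day."""
    return Counter(datetime.strptime(stamp, fmt).date() for stamp in stamps)


counts = tweets_per_day(timestamps)

# Print a simple text timeline, one row per day.
for day in sorted(counts):
    print(day.isoformat(), "#" * counts[day])
```

A day with a noticeably longer bar is the kind of spike you may want to investigate further.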
Compare Twitter users by number of followers/following
You can view a chart that compares Twitter users by the number of followers and number following.
- Click the Chart tab in Detail View to display a chart of the dataset.
- On the Chart tab, in the Options group, click Select Data. The Chart Options dialog box opens.
- Under X-axis, ensure that User names is selected.
- Under Y-axis, select Number of followers/following.
- Click OK.
Visualize Twitter data with cluster analysis
You can click the Cluster Analysis tab to see a diagram that can help you to see patterns in the data. For example, you can see which Twitter users used similar words.
You can also:
- Double-click a data point, for example, a Username, to see the Tweets for that user. The resulting data is displayed in Detail View.
- Change the appearance of the cluster analysis diagram, for example, to see the data as a 2D or 3D Cluster Map.
- Display other items on the cluster analysis diagram, for example, to compare similarity by hashtag.
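To give a feel for what "used similar words" can mean, here is a minimal Python sketch that scores pairs of users by the overlap of the words in their Tweets. It uses the Jaccard coefficient as the similarity measure, which is an assumption chosen for simplicity. This topic does not state which measure NVivo applies. Person3 and the function names are hypothetical.

```python
import re
from itertools import combinations

# Hypothetical Tweets grouped by user; Person3 is invented for this example.
tweets_by_user = {
    "Person1": ["Study: rising sea levels threaten island communities. #climate"],
    "Person2": [
        "Record high temperatures recorded in #arctic due to #climate change.",
        "We need to act now to slow the effects of #climate change.",
    ],
    "Person3": ["Rising sea levels are a #climate threat to island nations."],
}


def word_set(texts):
    """Collect the distinct lower-case words used across a user's Tweets."""
    words = set()
    for text in texts:
        words.update(re.findall(r"[a-z]+", text.lower()))
    return words


def jaccard(a, b):
    """Shared words divided by all words used by either user."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


vocab = {user: word_set(texts) for user, texts in tweets_by_user.items()}
similarity = {
    (u1, u2): jaccard(vocab[u1], vocab[u2])
    for u1, u2 in combinations(sorted(vocab), 2)
}

# List the pairs from most to least similar.
for pair, score in sorted(similarity.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(score, 2))
```

Pairs with higher scores would sit closer together on a cluster diagram. A real analysis would usually also remove common stop words before comparing.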
Display other items on the cluster analysis diagram
Cluster analysis enables you to see patterns in your Twitter data. By default, usernames are compared by similarity of words in the Tweets. You can also display other items on the cluster analysis diagram, for example, to answer questions like "What other hashtags are similar to #climate?"
To display hashtags on the cluster analysis diagram:
- Click the Cluster Analysis tab in Detail View.
- On the Cluster Analysis tab, in the Options group, click Select Data. The Cluster Analysis Options dialog box opens.
- In the Display items list, select Hashtags.
- Click OK.
Gather Tweets by Username, Hashtag or other predefined columns
Do you want to gather Tweets from a particular user or hashtag? You can use auto coding to gather Tweets from predefined columns, for example, user or hashtag.
The table below is a simplified example of a dataset containing Twitter data.
The columns containing Username and Hashtags are classifying fields and the Tweet column is a codable field. Whether a column is codable or classifying is predetermined and cannot be changed.
Username | Tweet | Hashtags |
Person1 | Study: rising sea levels threaten island communities. #climate bit.lyxfgn6B | climate |
Person2 | Record high temperatures recorded in #arctic due to #climate change. | arctic, climate |
Person2 | We need to act now to slow the effects of #climate change. | climate |
Gather Tweets for each user into a node
If you auto coded this dataset by Username, you would create the following node hierarchy:
- Twitter
  - Username
    - Person1
    - Person2
The case nodes (Person1 and Person2) are classified as 'Twitter User', and information from the user's profile (for example, Bio and Number of Followers) is stored as attribute values.
Gather Tweets for each hashtag into a node
If you auto coded this dataset by Hashtag, you would create the following node hierarchy:
- Twitter
  - Hashtag
    - climate
    - arctic
NOTE: You can choose to code based on other predefined columns, for example, Location or Tweet Type (Tweet/Retweet).
NVivo provides a Wizard to guide you through the process of auto coding. Refer to Automatic coding in dataset sources for more information.
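The grouping that auto coding performs on the simplified example table can be pictured as a short Python sketch. It only illustrates the resulting node hierarchies and is not NVivo's implementation. The row layout and the function names `code_by_username` and `code_by_hashtag` are assumptions, and the Tweet text is shortened from the table.

```python
from collections import defaultdict

# Rows modeled on the simplified example table (Tweet text shortened).
rows = [
    {"username": "Person1",
     "tweet": "Study: rising sea levels threaten island communities. #climate",
     "hashtags": ["climate"]},
    {"username": "Person2",
     "tweet": "Record high temperatures recorded in #arctic due to #climate change.",
     "hashtags": ["arctic", "climate"]},
    {"username": "Person2",
     "tweet": "We need to act now to slow the effects of #climate change.",
     "hashtags": ["climate"]},
]


def code_by_username(rows):
    """Gather each user's Tweets under a node named after the user."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[row["username"]].append(row["tweet"])
    return dict(nodes)


def code_by_hashtag(rows):
    """Gather Tweets under one node per hashtag; a Tweet can land in several."""
    nodes = defaultdict(list)
    for row in rows:
        for tag in row["hashtags"]:
            nodes[tag].append(row["tweet"])
    return dict(nodes)


hierarchy = {
    "Twitter": {
        "Username": code_by_username(rows),
        "Hashtag": code_by_hashtag(rows),
    }
}

for group, nodes in hierarchy["Twitter"].items():
    for name, coded in nodes.items():
        print(f"Twitter > {group} > {name}: {len(coded)} Tweet(s)")
```

Note that a Tweet with two hashtags is gathered at both hashtag nodes, so the hashtag counts can add up to more than the number of Tweets.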