Approaches to analyzing LinkedIn data
This topic describes how you might choose to analyze and explore a dataset containing social media data from LinkedIn.
For information on how to collect and import LinkedIn data, refer to Import from LinkedIn.
What do you want to do?
- Explore LinkedIn data in Detail View
- Gather LinkedIn data over time
- Exclude biographical information when you import the data
- Visualize LinkedIn data with cluster analysis
- Gather LinkedIn discussions by Username or other predefined column
Explore LinkedIn data in Detail View
When you open the dataset in Detail View, you can visually explore it. You can also:
- Use the sort or filter functions to see patterns in your data. For example, you can filter the posts or comments to only show those made by a specific user or during a specific date range.
- Hide columns to limit the amount of data you are looking at. For example, you could hide the columns containing Post ID or Headline.
- Reorder columns, for example, to move the Comment Text column next to the Post column.
- Adjust the column width, for example, to expand the column containing the posts.
- Manually code LinkedIn data at nodes representing themes. Refer to Basic Coding in dataset sources for more information.
- Use automatic coding techniques to perform broad-brush coding of the data. Refer to Gather LinkedIn discussions by Username or other predefined column.
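The filtering described above is done through the Detail View interface, but the underlying idea can be sketched in a few lines of Python. This is only an illustration, not how NVivo works internally: the rows, the column names (`username`, `posted`, `text`), the dates and the `filter_rows` helper are all hypothetical.

```python
from datetime import date

# Hypothetical rows standing in for a LinkedIn dataset.
# Column names and dates are assumptions made for this sketch.
rows = [
    {"username": "Mary Smith", "posted": date(2013, 3, 1),
     "text": "I'm looking for ideas on how to promote sustainability at our school."},
    {"username": "Mike Jones", "posted": date(2013, 3, 2),
     "text": "Have you thought of starting a vegetable garden?"},
    {"username": "Lee Black", "posted": date(2013, 3, 5),
     "text": "We have a no-packaging policy for lunches. This reduces litter!"},
    {"username": "Mike Jones", "posted": date(2013, 3, 9),
     "text": "What are you currently doing about recycling?"},
]


def filter_rows(rows, username=None, start=None, end=None):
    """Keep rows by the given user that fall inside [start, end] (inclusive)."""
    kept = []
    for row in rows:
        if username is not None and row["username"] != username:
            continue
        if start is not None and row["posted"] < start:
            continue
        if end is not None and row["posted"] > end:
            continue
        kept.append(row)
    return kept


# Only rows made by one user.
mike_rows = filter_rows(rows, username="Mike Jones")

# Only rows made during a specific date range.
early_rows = filter_rows(rows, start=date(2013, 3, 1), end=date(2013, 3, 5))
```

Combining both criteria in one call (a user and a date range) narrows the view further, which mirrors applying more than one filter in Detail View.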
You can click the tabs in Detail View to get a different perspective of your data.
- Form: View the data one record at a time, laid out as a form.
- Cluster Analysis: Display a diagram that can help you to see patterns in the data, for example, which LinkedIn users used similar words. For more information, refer to Visualize LinkedIn data with cluster analysis.
- Map (NVivo 10 for Windows Service Pack 2 or later): Geovisualize the data, for example, to see the geographic spread of social media commentators. For more information, refer to Geovisualize your social media data.
You can also run queries to find and code at themes in your data:
- Run a Word Frequency query to identify common themes.
- Run a Text Search query to find all instances of a particular word or phrase.
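To make the two query types above more concrete, here is a minimal Python sketch of what counting word frequencies and searching for a phrase might look like on a handful of comments. It is not NVivo's query engine: the `word_frequency` and `text_search` helpers, the minimum word length and the tiny stop-word list are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Sample texts taken from the simplified example used in this topic.
texts = [
    "I'm looking for ideas on how to promote sustainability at our school.",
    "Have you thought of starting a vegetable garden?",
    "We have a no-packaging policy for lunches. This reduces litter!",
    "What are you currently doing about recycling?",
    "I'm presenting a paper on Environmental Education next week.",
]

# A deliberately tiny, hypothetical stop-word list.
STOP_WORDS = {"have", "what", "this", "about", "with", "your"}


def word_frequency(texts, min_length=4):
    """Count words of at least min_length letters, ignoring stop words."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if len(word) >= min_length and word not in STOP_WORDS:
                counts[word] += 1
    return counts


def text_search(texts, phrase):
    """Return every text that contains the phrase, ignoring case."""
    needle = phrase.lower()
    return [text for text in texts if needle in text.lower()]


freq = word_frequency(texts)
garden_hits = text_search(texts, "vegetable garden")
```

Words that score highly in `freq` are candidates for themes, and the texts returned by `text_search` are the passages you might then code at a node for that theme.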
Gather LinkedIn data over time
Each time you capture social media data from LinkedIn, a new NCapture file is created. When you import NCapture files into your project, by default, any matching LinkedIn datasets are merged together.
The only time you can merge matching social media datasets is when you import from NCapture. If you choose not to merge matching social media datasets during import, then you will not be able to merge them later in NVivo.
Matching datasets do not need to have the same names. To be considered matching, the social media properties of the datasets need to be the same. For example, they need to be based on the same group or discussion.
Matching datasets captured at different times may include some of the same content. When matching LinkedIn datasets are merged, any duplicate content is removed.
If you want to merge matching datasets, make sure the Merge matching social media datasets (including previously imported) check box is selected on the Import from NCapture dialog box. Otherwise, new datasets will be created when you import subsequent NCapture files.
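The merge behavior described in this section can be pictured with a short Python sketch. It is a conceptual illustration only, not NVivo's implementation. It assumes that each row carries a unique `id` and that a dataset's social media properties are stored in a `properties` dictionary; these field names, the group name and the `merge_datasets` helper are all hypothetical.

```python
def datasets_match(a, b):
    """Datasets match when their social media properties are equal.

    The dataset names are deliberately ignored.
    """
    return a["properties"] == b["properties"]


def merge_datasets(existing, incoming):
    """Merge two matching datasets, dropping rows already present."""
    if not datasets_match(existing, incoming):
        raise ValueError("datasets do not match, so they stay separate")
    # Index rows by id; dicts preserve insertion order.
    merged = {row["id"]: row for row in existing["rows"]}
    for row in incoming["rows"]:
        # Assumption: the first copy seen is kept when ids collide.
        merged.setdefault(row["id"], row)
    return {
        "name": existing["name"],
        "properties": existing["properties"],
        "rows": list(merged.values()),
    }


# Two captures of the same (hypothetical) group, taken at different times.
first_capture = {
    "name": "Capture 1",
    "properties": {"type": "group", "group": "Sustainable Schools"},
    "rows": [
        {"id": "p1", "text": "I'm looking for ideas on how to promote sustainability at our school."},
        {"id": "c1", "text": "Have you thought of starting a vegetable garden?"},
    ],
}
second_capture = {
    "name": "Capture 2",
    "properties": {"type": "group", "group": "Sustainable Schools"},
    "rows": [
        {"id": "c1", "text": "Have you thought of starting a vegetable garden?"},
        {"id": "c2", "text": "What are you currently doing about recycling?"},
    ],
}

merged = merge_datasets(first_capture, second_capture)
```

The two captures have different names but the same properties, so they merge, and the comment that appears in both (`c1`) is kept only once.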
Exclude biographical information when you import the data
By default, when you import data from LinkedIn, biographical information (such as Location and Headline) is imported together with posts or comments.
If you do not want to bring this information into your project—for example, if it is not relevant to your research—you can set your preferences for importing biographical information on the Social Media Datasets tab in the Project Properties dialog box.
For example, you might want to bring in the location whenever you import LinkedIn data and exclude the other fields.
Refer to Set project properties for more information.
Visualize LinkedIn data with cluster analysis
Click the Cluster Analysis tab in Detail View to see a diagram that can help you to see patterns in the data. For example, you can see which LinkedIn users used similar words.
You can also:
- Double-click on a data point (for example, a Username) to see the underlying data in Detail View.
- Change the appearance of the cluster analysis diagram, for example, to see the data as a 2D or 3D Cluster Map.
- Display other items on the cluster analysis diagram, for example, industries.
To display industries on the cluster analysis diagram:
- Click the Cluster Analysis tab in Detail View.
- On the Cluster Analysis tab, in the Options group, click Select Data. The Cluster Analysis Options dialog box opens.
- In the Display items list, select Industries.
- Click OK.
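To give a feel for what "users who used similar words" can mean, the following Python sketch scores each pair of users by how much their vocabularies overlap. It uses the Jaccard coefficient purely as an illustrative assumption; this topic does not state which similarity measure or clustering method NVivo applies, and the `words_by_user` and `jaccard` helpers are hypothetical.

```python
import re
from itertools import combinations

# (username, text) pairs taken from the simplified example in this topic.
comments = [
    ("Mary Smith", "I'm looking for ideas on how to promote sustainability at our school."),
    ("Mike Jones", "Have you thought of starting a vegetable garden?"),
    ("Lee Black", "We have a no-packaging policy for lunches. This reduces litter!"),
    ("Mike Jones", "What are you currently doing about recycling?"),
]


def words_by_user(rows):
    """Collect the set of distinct words used by each user."""
    vocab = {}
    for user, text in rows:
        vocab.setdefault(user, set()).update(re.findall(r"[a-z]+", text.lower()))
    return vocab


def jaccard(a, b):
    """Shared words divided by all words used by either user."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


vocab = words_by_user(comments)

# One similarity score per unordered pair of users.
similarity = {
    (u, v): jaccard(vocab[u], vocab[v])
    for u, v in combinations(sorted(vocab), 2)
}
```

Pairs with higher scores would sit closer together on a diagram of this kind. A real analysis would usually remove common stop words first, since words such as "a" and "have" inflate the overlap.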
Gather LinkedIn discussions by Username or other predefined column
Do you want to see all the posts and comments for a particular user together? Or would you prefer to gather all the posts and comments for each discussion? You can use auto coding to create nodes based on Username, Discussion (a post and its comments) or other predefined columns.
Here is a simplified example of a dataset containing discussions from LinkedIn. The first row contains a post and the next three rows are comments on that post. The last row includes a post that has no comments.
The columns containing Posted by Username and Commenter Username are classifying fields. The columns containing Post and Comment Text are codable fields. Whether the columns are codable or classifying is predetermined and cannot be changed.
| Posted by Username | Post | Commenter Username | Comment Text |
| Mary Smith | I'm looking for ideas on how to promote sustainability at our school. | | |
| | | Mike Jones | Have you thought of starting a vegetable garden? |
| | | Lee Black | We have a no-packaging policy for lunches. This reduces litter! |
| | | Mike Jones | What are you currently doing about recycling? |
| Mike Jones | I'm presenting a paper on Environmental Education next week. | | |
You can gather all the posts and comments for each user into a node that represents them. For example, if you auto coded this dataset by Username, you would create the following node hierarchy:
- LinkedIn
  - Username
    - Mary Smith
    - Mike Jones
    - Lee Black
The case nodes (Mary Smith, Mike Jones, Lee Black) are classified as 'LinkedIn User' and information from the user's profile—for example, Industry and Number of Connections—is stored as attribute values.
NVivo provides a wizard to guide you through the process of auto coding—refer to Automatic coding in dataset sources for more information.
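As a rough illustration of what auto coding by Username produces, the Python sketch below groups the posts and comments from the simplified example under one entry per user. It is not the wizard's actual logic: the row layout (blank post columns on comment rows) is an assumption about the example table, and the `auto_code_by_username` helper is hypothetical.

```python
# Rows modeled on the simplified example table. Blank strings stand for
# empty cells; this layout is an assumption made for the sketch.
rows = [
    {"Posted by Username": "Mary Smith",
     "Post": "I'm looking for ideas on how to promote sustainability at our school.",
     "Commenter Username": "", "Comment Text": ""},
    {"Posted by Username": "", "Post": "",
     "Commenter Username": "Mike Jones",
     "Comment Text": "Have you thought of starting a vegetable garden?"},
    {"Posted by Username": "", "Post": "",
     "Commenter Username": "Lee Black",
     "Comment Text": "We have a no-packaging policy for lunches. This reduces litter!"},
    {"Posted by Username": "", "Post": "",
     "Commenter Username": "Mike Jones",
     "Comment Text": "What are you currently doing about recycling?"},
    {"Posted by Username": "Mike Jones",
     "Post": "I'm presenting a paper on Environmental Education next week.",
     "Commenter Username": "", "Comment Text": ""},
]


def auto_code_by_username(rows):
    """Gather each user's posts and comments under a node named after them."""
    nodes = {}
    for row in rows:
        # Pair each classifying column with the codable column it describes.
        for user_col, text_col in (("Posted by Username", "Post"),
                                   ("Commenter Username", "Comment Text")):
            user, text = row[user_col], row[text_col]
            if user and text:
                nodes.setdefault(user, []).append(text)
    # Mirror the LinkedIn > Username > (user) hierarchy.
    return {"LinkedIn": {"Username": nodes}}


hierarchy = auto_code_by_username(rows)
user_nodes = hierarchy["LinkedIn"]["Username"]
```

Here `user_nodes` has one entry each for Mary Smith, Mike Jones and Lee Black, and the entry for Mike Jones holds both of his comments and his post.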