Approaches to analyzing LinkedIn data

 


This topic describes how you might choose to analyze and explore a dataset containing social media data from LinkedIn.

For information on how to collect and import LinkedIn data, refer to Import from LinkedIn.

Explore LinkedIn data in Detail View

When you open the dataset in Detail View, you can visually explore it. Click the tabs in Detail View to get a different perspective of your data:

  • Form  View the data one record at a time, laid out as a form.

  • Cluster Analysis  Display a diagram that can help you to see patterns in the data—for example, which LinkedIn users used similar words. For more information, refer to Visualize LinkedIn data with cluster analysis.

  • Map (NVivo 10 for Windows Service Pack 2 or later)  Geovisualize the data—for example, to see the geographic spread of social media commentators. For more information, refer to Geovisualize your social media data.

You can also run queries to find and code themes in your data.


Gather LinkedIn data over time

Each time you capture social media data from LinkedIn, a new NCapture file is created. When you import NCapture files into your project, by default, any matching LinkedIn datasets are merged together.

The only time you can merge matching social media datasets is when you import from NCapture. If you choose not to merge matching social media datasets during import, then you will not be able to merge them later in NVivo.

Matching datasets do not need to have the same names. To be considered matching, the social media properties of the datasets need to be the same, for example, based on the same group or discussion.

Matching datasets captured at different times may include some of the same content. When matching LinkedIn datasets are merged, any duplicate content is removed.

If you want to merge matching datasets, make sure the Merge matching social media datasets (including previously imported) check box is selected in the Import from NCapture dialog box. Otherwise, new datasets are created when you import subsequent NCapture files.
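The merge behavior described above (matching datasets combined, duplicate content removed) can be pictured with a small sketch. This is a conceptual illustration in Python, not NVivo's implementation: the Row class, the merge_matching function and the sample rows are all hypothetical, and the sketch assumes two rows are duplicates when their discussion, username and text are identical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One hypothetical dataset row: a post or a comment."""
    discussion: str
    username: str
    text: str


def merge_matching(first, second):
    """Merge two captures of the same discussion, dropping repeated rows."""
    seen = set()
    merged = []
    for row in [*first, *second]:
        if row not in seen:  # keep only the first copy of each row
            seen.add(row)
            merged.append(row)
    return merged


# Two hypothetical captures of one discussion, taken at different times.
first_capture = [
    Row("Sustainability", "Mary Smith", "I'm looking for ideas."),
    Row("Sustainability", "Mike Jones", "Have you thought of a garden?"),
]
second_capture = [
    Row("Sustainability", "Mike Jones", "Have you thought of a garden?"),
    Row("Sustainability", "Lee Black", "We have a no-packaging policy."),
]

merged = merge_matching(first_capture, second_capture)
print(len(merged))  # 3: the repeated comment appears once
```

The second capture overlaps the first by one comment, so the merged result holds three rows rather than four.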


Exclude biographical information when you import the data

By default, when you import data from LinkedIn, biographical information (such as Location and Headline) is imported together with posts or comments.

If you do not want to bring this information into your project—for example, if it is not relevant to your research—you can set your preferences for importing biographical information on the Social Media Datasets tab in the Project Properties dialog box.

For example, you might want to bring in the location whenever you import LinkedIn data and exclude the other fields.

Refer to Set project properties for more information.
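The effect of these preferences can be pictured as filtering fields out of each record before it is stored. The following Python sketch is illustrative only: the drop_biography function, the set of field names and the sample values are assumptions made for this example, and in NVivo you set the preference in the Project Properties dialog box rather than in code.

```python
# Fields treated as biographical in this sketch (an assumption for illustration).
BIOGRAPHICAL_FIELDS = {"Location", "Headline", "Industry", "Number of Connections"}


def drop_biography(record, keep=()):
    """Return a copy of record without biographical fields, except those in keep."""
    keep = set(keep)
    return {
        field: value
        for field, value in record.items()
        if field not in BIOGRAPHICAL_FIELDS or field in keep
    }


# Hypothetical record; the values are made up.
record = {
    "Username": "Mary Smith",
    "Post": "I'm looking for ideas.",
    "Location": "Example City",
    "Headline": "Example headline",
}

filtered = drop_biography(record, keep=["Location"])
print(sorted(filtered))  # ['Location', 'Post', 'Username']
```

Here the location is kept and the headline is dropped, which mirrors the example of importing the location while excluding the other fields.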


Visualize LinkedIn data with cluster analysis

Click the Cluster Analysis tab in Detail View to see a diagram that can help you to see patterns in the data. For example, you can see which LinkedIn users used similar words.

You can also:

  • Double-click on a data point—for example, a Username—to see the underlying data in Detail View.

  • Change the appearance of the cluster analysis diagram, for example, to see the data as a 2D or 3D Cluster Map.

  • Display other items on the cluster analysis diagram—for example, industries.

To display industries on the cluster analysis diagram:

  1. Click the Cluster Analysis tab in Detail View.

  2. On the Cluster Analysis tab, in the Options group, click Select Data.

The Cluster Analysis Options dialog box opens.

  3. In the Display items list, select Industries.

  4. Click OK.
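To give a sense of what "users who used similar words" can mean, here is a minimal Python sketch that scores each pair of users by the overlap of their vocabularies. It uses Jaccard similarity as one common word-overlap measure. This is an assumption chosen for illustration, not a description of how NVivo builds its diagram, and the function names and sample texts are hypothetical.

```python
import re
from itertools import combinations


def words(text):
    """Split text into a set of lowercase words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all words used by either user."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(texts_by_user):
    """Score every pair of users by how much their vocabularies overlap."""
    vocab = {user: words(" ".join(texts)) for user, texts in texts_by_user.items()}
    return {
        (u, v): jaccard(vocab[u], vocab[v])
        for u, v in combinations(sorted(vocab), 2)
    }


# Hypothetical input: everything each user wrote, keyed by username.
texts_by_user = {
    "Mary Smith": ["I'm looking for ideas on how to promote sustainability."],
    "Mike Jones": ["Have you thought of starting a vegetable garden?"],
    "Lee Black": ["We have a no-packaging policy for lunches."],
}

similarity = pairwise_similarity(texts_by_user)
for pair, score in sorted(similarity.items()):
    print(pair, round(score, 2))
```

A score closer to 1 means two users share more of their vocabulary. A clustering diagram would place such users nearer to each other.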


Gather LinkedIn discussions by Username or other predefined column

Do you want to see all the posts and comments for a particular user together? Or would you prefer to gather all the posts and comments for each discussion? You can use auto coding to create nodes based on Username, Discussion (a post and its comments), or other predefined columns.

Here is a simplified example of a dataset containing discussions from LinkedIn. The first row contains a post and the next three rows are comments on that post. The last row includes a post that has no comments.

The columns containing Posted by Username and Commenter Username are classifying fields. The columns containing Post and Comment Text are codable fields. Whether the columns are codable or classifying is predetermined and cannot be changed.

| Posted by Username | Post | Commenter Username | Comment Text |
|---|---|---|---|
| Mary Smith | I'm looking for ideas on how to promote sustainability at our school. | | |
| | | Mike Jones | Have you thought of starting a vegetable garden? |
| | | Lee Black | We have a no-packaging policy for lunches. This reduces litter! |
| | | Mike Jones | What are you currently doing about recycling? |
| Mike Jones | I'm presenting a paper on Environmental Education next week. | | |

You can gather all the posts and comments for each user into a node that represents them. For example, if you auto coded this dataset by Username, you would create the following node hierarchy:

  • LinkedIn

      • Username

          • Mary Smith

          • Mike Jones

          • Lee Black

The case nodes (Mary Smith, Mike Jones, Lee Black) are classified as 'LinkedIn User' and information from the user's profile—for example, Industry and Number of Connections—is stored as attribute values.
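The grouping that auto coding by Username performs can be sketched in a few lines of Python. This is a conceptual illustration rather than NVivo's implementation: the code_by_username function is hypothetical, and the sketch assumes that comment rows leave the post columns empty and post rows leave the comment columns empty.

```python
from collections import defaultdict

# Rows mirror the simplified example: (posted by, post, commenter, comment text).
ROWS = [
    ("Mary Smith", "I'm looking for ideas on how to promote sustainability at our school.", "", ""),
    ("", "", "Mike Jones", "Have you thought of starting a vegetable garden?"),
    ("", "", "Lee Black", "We have a no-packaging policy for lunches. This reduces litter!"),
    ("", "", "Mike Jones", "What are you currently doing about recycling?"),
    ("Mike Jones", "I'm presenting a paper on Environmental Education next week.", "", ""),
]


def code_by_username(rows):
    """Gather each user's posts and comments under a node named after them."""
    nodes = defaultdict(list)
    for poster, post, commenter, comment in rows:
        if poster and post:
            nodes[poster].append(post)  # codable Post text
        if commenter and comment:
            nodes[commenter].append(comment)  # codable Comment Text
    return {"LinkedIn": {"Username": dict(nodes)}}


hierarchy = code_by_username(ROWS)
for name, texts in hierarchy["LinkedIn"]["Username"].items():
    print(name, len(texts))
```

With the sample rows, the Mary Smith node holds one post, the Mike Jones node holds two comments and one post, and the Lee Black node holds one comment.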

NVivo provides a wizard to guide you through the process of auto coding. Refer to Automatic coding in dataset sources for more information.
