Approaches to analyzing survey results

This topic describes how you can analyze and explore a dataset containing survey results, starting with the simplest approach and then moving to more complex methods.

If you want to know how to import survey results into NVivo, refer to the following topics:

In this topic

Explore your survey data in Detail View
Gather responses to each question
Gather responses of each survey respondent
Grouping demographic values into ranges
Gather responses based on demographic values
Auto code survey responses based on existing coding patterns

Explore your survey data in Detail View

When you open the dataset in Detail View, you can explore it visually. While working in Detail View, you can:

Hide columns to limit the amount of data you are looking at—for example, if you want to see the first column in your dataset next to the fifth column, you can hide the intervening columns.
Use the sort or filter functions to see patterns in your data. For example, if your dataset contains survey responses and includes a classifying field for sex, you can sort or filter to view only the responses from males or only those from females.
Manually code survey responses at nodes representing the themes in your data—refer to Basic Coding in dataset sources for more information.

You can also run queries to find and code at themes in your data:

Run a Word Frequency query to identify common themes in the survey responses.
Run a Text Search query to find all instances of a particular word or phrase.

Gather responses to each question

Do you want to see how all respondents replied to a question? Gathering responses to each survey question at a node allows you to group the data into broad themes.

Using the example dataset below, you could create a node Question 1 and code the entire column at that node. You could create another node to contain all responses to Question 2.

Respondent	Age	Sex	Question 1	Question 2
Anna	29	Female	I think there should be more car-free zones	Electric buses and taxis would help reduce pollution in the inner city
Jack	31	Male	Pedestrians need to feel safe. There should be better lighting and more police	We should create more green spaces
Maria	52	Female	Safety barriers at busy intersections	I don't think they should tax car parks
Peter	47	Male	Better education in schools about road safety	More street trees

You can code the column manually or automatically:

You can select the entire column and manually code it at a new node called Question 1.
You can use the Auto Code Wizard—select Auto code based on structure or style (Step 1 of the Wizard), and then select Code at nodes for selected columns (Step 2 of the Wizard). This is useful when you have many columns containing responses to different survey questions.

NOTE When each respondent is represented by multiple rows (a row per survey question), you can still use the Auto Code Wizard to gather the responses to a single question at a node—refer to Gather survey responses from multiple rows for more information.

Whichever method you use, you will create and code at the following nodes:

Question 1
Question 2
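
The result of gathering responses by question can be pictured outside NVivo as one list of responses per question column. The following Python sketch is only an illustration of that data shape, not an NVivo feature or API; the function name gather_by_question and the dictionary layout are assumptions made for this example.

```python
# Rows copied from the example dataset in this topic.
ROWS = [
    {"Respondent": "Anna", "Age": 29, "Sex": "Female",
     "Question 1": "I think there should be more car-free zones",
     "Question 2": "Electric buses and taxis would help reduce pollution in the inner city"},
    {"Respondent": "Jack", "Age": 31, "Sex": "Male",
     "Question 1": "Pedestrians need to feel safe. There should be better lighting and more police",
     "Question 2": "We should create more green spaces"},
    {"Respondent": "Maria", "Age": 52, "Sex": "Female",
     "Question 1": "Safety barriers at busy intersections",
     "Question 2": "I don't think they should tax car parks"},
    {"Respondent": "Peter", "Age": 47, "Sex": "Male",
     "Question 1": "Better education in schools about road safety",
     "Question 2": "More street trees"},
]


def gather_by_question(rows, question_columns):
    """Return one list of responses per question column (hypothetical helper)."""
    return {column: [row[column] for row in rows] for column in question_columns}


question_nodes = gather_by_question(ROWS, ["Question 1", "Question 2"])
for name, responses in question_nodes.items():
    print(name, "holds", len(responses), "responses")
```

Each key plays the role of a question node, and each list holds the content of the whole column, which is what coding the entire column at that node achieves.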

Once you have grouped all responses to a question at a single node, you can analyze them with NVivo's analysis tools. For example, you can:

Open the node and visually explore content coded at the node. From here you could 'code on' to more granular thematic groupings. For example, you could gather all answers that mention car-free zones.
Run a Word Frequency query (using the node in the scope of the query) to find common words or concepts in responses to Question 1.
Run a Text Search query looking for particular words or concepts, using the node in the scope of the query. For example, you could search for education and code all the results at a new node.
Generate a cluster analysis diagram. For example, you can explore the similarity between the responses to Question 1 and responses to other survey questions.
Open the node and manually code a portion of the responses to a group of thematic nodes (car-free zones, lighting, safety barriers). Then use pattern-based coding to auto code the node to the specific thematic nodes that relate to that question. Refer to Automatic coding using existing coding patterns for more information.
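
To make the idea of scoping a query to one question node concrete, here is a rough Python sketch of a word count and a text search limited to the Question 1 responses. It is an assumption-laden illustration only: NVivo's Word Frequency and Text Search queries have their own options (such as stop words and stemming) that this sketch does not reproduce, and the names word_frequency and text_search are hypothetical.

```python
import re
from collections import Counter

# Question 1 responses copied from the example dataset in this topic.
QUESTION_1 = [
    "I think there should be more car-free zones",
    "Pedestrians need to feel safe. There should be better lighting and more police",
    "Safety barriers at busy intersections",
    "Better education in schools about road safety",
]


def word_frequency(responses, min_length=4):
    """Count lower-cased words that have at least min_length characters."""
    counts = Counter()
    for text in responses:
        words = re.findall(r"[a-z'-]+", text.lower())
        counts.update(word for word in words if len(word) >= min_length)
    return counts


def text_search(responses, term):
    """Return the responses that contain term, ignoring case."""
    return [text for text in responses if term.lower() in text.lower()]


freq = word_frequency(QUESTION_1)
matches = text_search(QUESTION_1, "education")
print(freq.most_common(5))
print(matches)
```

Because only the Question 1 list is passed in, answers to other questions cannot affect the counts, which is the point of using the node as the query scope.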

Gather responses of each survey respondent

If your data contains classifying fields that describe your survey participants—for example, the name, age and sex of the participant—you can use these fields to create nodes that represent your survey participants. You can code everything a participant said in response to survey questions at the node that represents them.

Using the data below, for each respondent you would create one node, of the classification 'person', with attributes for Age and Sex. Responses to both Question 1 and Question 2 would be coded at this node.

Respondent	Age	Sex	Question 1	Question 2
Anna	29	Female	I think there should be more car-free zones	Electric buses and taxis would help reduce pollution in the inner city
Jack	31	Male	Pedestrians need to feel safe. There should be better lighting and more police	We should create more green spaces
Maria	52	Female	Safety barriers at busy intersections	I don't think they should tax car parks
Peter	47	Male	Better education in schools about road safety	More street trees

Using NVivo's automated tools, you can do this in two steps:

Code content at nodes for each person.

Use the Auto Code Wizard—select Auto code based on structure or style (Step 1 of the Wizard), and then select Code at nodes for each value in a column (Step 2 of the Wizard), to code the Question 1 and Question 2 responses at nodes representing the values from the 'Respondent' column. This creates the following nodes:

Anna
Jack
Maria
Peter

At this point, the nodes have coding, but are not classified, and have no attribute values.

Use the Classify Nodes from Dataset Wizard to add the demographic information (age and sex) of each participant to their node.
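
The two steps above can be sketched in Python as follows. This is an illustration of the resulting data only, not of how NVivo implements its wizards; the RespondentNode class and both function names are hypothetical.

```python
from dataclasses import dataclass, field

# Rows copied from the example dataset in this topic.
ROWS = [
    {"Respondent": "Anna", "Age": 29, "Sex": "Female",
     "Question 1": "I think there should be more car-free zones",
     "Question 2": "Electric buses and taxis would help reduce pollution in the inner city"},
    {"Respondent": "Jack", "Age": 31, "Sex": "Male",
     "Question 1": "Pedestrians need to feel safe. There should be better lighting and more police",
     "Question 2": "We should create more green spaces"},
    {"Respondent": "Maria", "Age": 52, "Sex": "Female",
     "Question 1": "Safety barriers at busy intersections",
     "Question 2": "I don't think they should tax car parks"},
    {"Respondent": "Peter", "Age": 47, "Sex": "Male",
     "Question 1": "Better education in schools about road safety",
     "Question 2": "More street trees"},
]


@dataclass
class RespondentNode:
    """Hypothetical stand-in for a node that represents one respondent."""
    name: str
    attributes: dict = field(default_factory=dict)
    coded: list = field(default_factory=list)


def code_by_respondent(rows, question_columns):
    """Step 1: create a node per respondent and code their responses at it."""
    nodes = {}
    for row in rows:
        node = nodes.setdefault(row["Respondent"], RespondentNode(row["Respondent"]))
        node.coded.extend(row[column] for column in question_columns)
    return nodes


def classify(nodes, rows, attribute_columns):
    """Step 2: copy the demographic columns onto each node as attribute values."""
    for row in rows:
        values = {column: row[column] for column in attribute_columns}
        nodes[row["Respondent"]].attributes.update(values)


nodes = code_by_respondent(ROWS, ["Question 1", "Question 2"])
unclassified = all(not node.attributes for node in nodes.values())  # True after step 1
classify(nodes, ROWS, ["Age", "Sex"])
print(nodes["Maria"])
```

After step 1 the nodes hold coded content but no attribute values, which mirrors the intermediate state described above; step 2 fills in Age and Sex.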

Once you have created and coded responses at nodes for each respondent, you can use analysis tools which compare their attribute values. You can:

Create charts to compare the demographic attributes of your respondents. For example, are your respondents mostly males under 30 years old?
Generate a cluster analysis diagram that compares the attribute values of your respondents. Are there clusters of respondents with similar characteristics? Are there any 'outliers', that is, respondents with demographic characteristics that are very different from the others?
Run a Word Frequency query (using the node in the scope of the query) to find common words or concepts in responses to Question 1. You could code the results at new nodes to further refine your analysis.
Run a Text Search query looking for particular words or concepts, using the node in the scope of the query. For example, you could search for education and code all the results at a new node.

When you have gathered responses both at question nodes (Question 1, Question 2) and at respondent nodes (Anna, Jack, Maria, Peter), you can analyze what respondents in different demographic groups are saying in response to particular questions:

Use a Coding query to view all the responses of males under 30 years to Question 1.
Use a Word Frequency query to find the most commonly occurring words or ideas that females mention when responding to Question 2.
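
Conceptually, these queries intersect a question with a demographic condition. The Python sketch below shows that intersection on the example data; it is not NVivo's Coding query, and the helper name responses_for_group is an assumption made for illustration.

```python
# (Respondent, Age, Sex, Question 1, Question 2) copied from the example dataset.
ROWS = [
    ("Anna", 29, "Female", "I think there should be more car-free zones",
     "Electric buses and taxis would help reduce pollution in the inner city"),
    ("Jack", 31, "Male",
     "Pedestrians need to feel safe. There should be better lighting and more police",
     "We should create more green spaces"),
    ("Maria", 52, "Female", "Safety barriers at busy intersections",
     "I don't think they should tax car parks"),
    ("Peter", 47, "Male", "Better education in schools about road safety",
     "More street trees"),
]
COLUMNS = {"Question 1": 3, "Question 2": 4}


def responses_for_group(rows, question, condition):
    """Return answers to one question from respondents who satisfy condition.

    condition is a function of (age, sex) that returns True or False.
    """
    index = COLUMNS[question]
    return [row[index] for row in rows if condition(row[1], row[2])]


females_q2 = responses_for_group(ROWS, "Question 2", lambda age, sex: sex == "Female")
males_under_30_q1 = responses_for_group(
    ROWS, "Question 1", lambda age, sex: sex == "Male" and age < 30
)
print(females_q2)
print(males_under_30_q1)
```

With the four example rows, the second result is empty because both male respondents (Jack, 31, and Peter, 47) are over 30; a larger dataset would be needed for that group to return content.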

NOTE

If each respondent is represented by multiple rows (a row per survey question), you can still use the Auto Code Dataset Wizard to gather each person's responses at a node—refer to Gather survey responses from multiple rows for more information.
If you have demographic information about your respondents stored separately from your survey data, you may need to set the attribute values by another method. For example, you can import node attribute values from a spreadsheet or by importing from another NVivo project—refer to Import (or export) classification sheets for more information.

Grouping demographic values into ranges

When you use demographic information in the dataset to set attribute values for nodes, you can optionally group values into ranges.

For example, if your dataset contains the age of your respondents, it may be more useful to know that an individual participant is within the 21-29 age range, than to know their precise age.

The Classify Nodes from Dataset Wizard allows you to group the values. Refer to Classify nodes (set attribute values to record information) for more information.
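
As a rough illustration of what grouping into ranges means, the Python sketch below maps each exact age to a range label. The range boundaries (18, 30, 40, 50, 60) and the function name age_range are assumptions chosen for this example; in NVivo you choose the grouping in the wizard.

```python
def age_range(age, bounds=(18, 30, 40, 50, 60)):
    """Return a label such as '18-29' for the range that contains age.

    bounds lists the lower edge of each range plus a final upper limit.
    Ages outside every range are labeled 'Unassigned'.
    """
    for lower, upper in zip(bounds, bounds[1:]):
        if lower <= age < upper:
            return f"{lower}-{upper - 1}"
    return "Unassigned"


# Ages copied from the example dataset in this topic.
ages = {"Anna": 29, "Jack": 31, "Maria": 52, "Peter": 47}
ranges = {name: age_range(age) for name, age in ages.items()}
print(ranges)
```

Anna (29) and Jack (31) are only two years apart but fall into different ranges, so it is worth choosing boundaries that suit the comparisons you plan to make.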

Gather responses based on demographic values

Most commonly, you will use the classifying fields in your dataset to set the attribute values on your respondent nodes. However, it is also possible to create a node structure that reflects the demographic characteristics of your respondents—this provides another way of looking at your data.

Respondent	Age	Sex	Question 1	Question 2
Anna	29	Female	I think there should be more car-free zones	Electric buses and taxis would help reduce pollution in the inner city
Jack	31	Male	Pedestrians need to feel safe. There should be better lighting and more police	We should create more green spaces
Maria	52	Female	Safety barriers at busy intersections	I don't think they should tax car parks
Peter	47	Male	Better education in schools about road safety	More street trees

Using the dataset above, you could use the Auto Code Wizard—select Auto code based on structure or style (Step 1 of the Wizard), and then select Code at nodes for each row (Step 2 of the Wizard)—to create nodes to represent the male and female respondents, and then child nodes for each age range (by grouping the age values into ranges). Responses to survey questions are coded at the appropriate age-range node.

Your resulting node structure (depending on how you group ages into ranges) might be:

Female
    18-29
    30-39
    40-49
    50-52
Male
    18-29
    30-39
    40-49
    50-52

This can be a quick way to gather responses by demographic groupings. You can see what males aged 30-39 are saying. If you want to see what all males are saying, turn on aggregation at the parent node (Male).
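
The parent and child structure, and the effect of aggregation, can be pictured with nested dictionaries. The Python sketch below is an assumption-based illustration rather than NVivo behavior in detail: it groups ages by decade (a simplification of the ranges shown above), and the names build_hierarchy and aggregate are hypothetical.

```python
from collections import defaultdict

# (Respondent, Age, Sex, Question 1, Question 2) copied from the example dataset.
ROWS = [
    ("Anna", 29, "Female", "I think there should be more car-free zones",
     "Electric buses and taxis would help reduce pollution in the inner city"),
    ("Jack", 31, "Male",
     "Pedestrians need to feel safe. There should be better lighting and more police",
     "We should create more green spaces"),
    ("Maria", 52, "Female", "Safety barriers at busy intersections",
     "I don't think they should tax car parks"),
    ("Peter", 47, "Male", "Better education in schools about road safety",
     "More street trees"),
]


def decade_range(age):
    """Assumed grouping for this sketch: 31 -> '30-39', 47 -> '40-49'."""
    lower = age // 10 * 10
    return f"{lower}-{lower + 9}"


def build_hierarchy(rows):
    """Code each row's responses at a child node (age range) under a parent (sex)."""
    tree = defaultdict(lambda: defaultdict(list))
    for _name, age, sex, answer_1, answer_2 in rows:
        tree[sex][decade_range(age)].extend([answer_1, answer_2])
    return tree


def aggregate(tree, parent):
    """Collect everything coded at the children of parent, as aggregation would."""
    return [response for child in tree[parent].values() for response in child]


tree = build_hierarchy(ROWS)
print(tree["Male"]["30-39"])        # what males aged 30-39 are saying
print(len(aggregate(tree, "Male")))  # everything said by males
```

In this picture the parent node holds no content of its own; its content is the union of its children's content, which is why aggregation needs to be turned on at the parent to see what all males are saying.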

NOTE

Because the demographic attributes are reflected in the node hierarchy rather than as attribute values on the node, this method is not appropriate when you want to use analysis tools that compare node attribute values, for example, to generate a cluster analysis diagram that shows the demographic spread of your respondents.
If you anticipate gathering more source materials, and particularly if these source materials are not datasets, this method may not be appropriate. Manually coding additional material within this structure may prove difficult—it might be preferable to store demographic information as attribute values on nodes representing individual participants, as it will be easier to add further coding.

Auto code survey responses based on existing coding patterns

You can use pattern-based auto coding to speed up the process of coding survey responses. Before you use pattern-based coding, you need to start with manual 'pilot' coding of the responses—for example, code 5-10% of the responses manually.

If your dataset contains responses to questions on a range of topics or issues, you may get better results with pattern-based coding if you auto code the responses to one question at a time using specific thematic nodes that relate to that question.

To auto code survey responses based on existing coding patterns:

First, gather a subset of responses for each question and perform manual 'pilot' coding as follows:

Filter the rows in your dataset to show only the responses you want to use for your pilot coding. For example, you could show the rows prior to a particular response date.
Auto code the dataset using source structure to gather the responses into a node for each question (for example, Question 1, Question 2, Question 3). In step 3 of the Wizard, choose to code Filtered rows only.
Open each question node and manually 'code on' to a group of thematic nodes specific to that question.

Next, auto code the rest of the responses using existing coding patterns as follows:

Change the filtering in your dataset to hide the responses that were already coded—for example, you could hide rows prior to a particular response date. The reason for filtering the data is to ensure that pattern coding doesn’t re-code responses that you’ve already coded manually.
Again, auto code the dataset using source structure to gather the responses into a new node for each question. In step 3 of the Wizard, choose to code Filtered rows only. In step 5 of the Wizard, create the new nodes in a location Under a New Node (so that they are in a separate node hierarchy from the question nodes you created for pilot coding).
For each question node in the new node hierarchy, use pattern-based coding to auto code the responses to the specific thematic nodes for that question.
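
The filtering in the two phases above amounts to splitting the rows into two sets that do not overlap. The Python sketch below illustrates this with a cut-off date. The response dates are hypothetical (the example dataset in this topic has no date column), and the sketch does not attempt to reproduce pattern-based coding itself.

```python
from datetime import date

# Hypothetical response dates, added only for this illustration.
ROWS = [
    {"Respondent": "Anna", "Response date": date(2024, 3, 1)},
    {"Respondent": "Jack", "Response date": date(2024, 3, 4)},
    {"Respondent": "Maria", "Response date": date(2024, 3, 9)},
    {"Respondent": "Peter", "Response date": date(2024, 3, 12)},
]


def split_for_pilot(rows, cutoff):
    """Split rows into pilot rows (before cutoff) and remaining rows (on or after).

    Every row lands in exactly one of the two lists, so the rows coded
    manually in the pilot are never handed to the automatic step.
    """
    pilot = [row for row in rows if row["Response date"] < cutoff]
    remaining = [row for row in rows if row["Response date"] >= cutoff]
    return pilot, remaining


pilot_rows, remaining_rows = split_for_pilot(ROWS, date(2024, 3, 5))
print([row["Respondent"] for row in pilot_rows])      # coded manually
print([row["Respondent"] for row in remaining_rows])  # left for pattern-based coding
```

With only four rows the pilot set here is half of the data; in a real survey the cut-off would be chosen so that the pilot covers a small share of the responses, such as the 5-10% suggested above.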

If you plan to import multiple times from the same SurveyMonkey survey—for example, by periodically gathering completed responses to an open survey—you can import and manually code the initial responses. Then later, you can import additional responses and use pattern-based auto coding on the new data.

Pattern-based auto coding is an experimental feature that you can try out. It is designed to speed up the coding process for large volumes of textual content. Pattern-based coding was introduced in NVivo 10 for Windows Service Pack 4 and updated in Service Pack 5.

Refer to Automatic coding using existing coding patterns for more information.
