Basic analysis of survey data

More than just the basic survey analysis techniques

Most survey software programs can do the basic types of analyses, but they often don't do them well, and they're limited in the available layouts and statistics. Professional researchers need a survey software package that goes beyond the basics. Concepts like weighting, margin of error, percentage base, and significance testing are built into StatPac survey software.


The basic survey analysis techniques you're likely to use are frequencies (FR), descriptive statistics (DE), crosstabs and banner tables (BA), and listing open-ended text (LI). There are others, but these are the techniques used on nearly all surveys. The two letters in parentheses are the abbreviation you can type to specify each command.


Most people find us while looking for crosstabs software, and frankly, the crosstab and banner tables generated by our survey software are extraordinary. In fact, they're so good that we've dedicated an entire page to just crosstabs and banner tables. But we also do a great job on frequencies and descriptive statistics, and you'll find examples of those on this page.

Use "options" to control the survey analyses & statistical reports

When performing a statistical analysis of survey data, there are many "behind the scenes" decisions about how the analysis will be performed and how the results will be displayed. Most survey software makes those decisions for you. Sometimes they're right and sometimes they're not. Our survey software uses an "Options" command to give you full control of the analysis and reporting process. You can simply type the options or select them from a menu.

Options are used to control the survey analysis and format of the reports

How to create statistical & survey analyses reports

Analyses are requested by preparing a procedure file of commands

A "procedure file" is used to generate reports. It's just a set of commands that says "first do this", "then do this", ... and "finally do this". That's what we mean by "batch processing". You can tell StatPac to do one analysis or a hundred different analyses all at once. That's a big time-saver! You're not stuck doing one analysis at a time.


I'll give you the bad news first: you'll have to learn a "programming language" to become proficient with the software. If you're now running for the hills, then StatPac survey software is probably not your best choice.


The good news is really good news and there's a lot of it: The programming language is very simple. (Here's a one-minute lesson in how to design analyses). For a typical survey, you'll probably use fewer than a dozen commands. There's clear online help to show you how to use the commands. Changes are easy to make and mistakes (yes, they happen) are easy to fix. You can even ask StatPac to write the "bare-bones" procedure file for you.


Analysis design screen will automatically create a topline procedure file


Here's what the analysis design screen looks like. The left side of the screen is where you type the commands and the right side of the screen shows the question numbers and names. At first glance, the commands might appear complex, but if you look closely, you'll see that many of them are used over and over again. Once you've learned them, they're easy.


An analysis design screen with many statistical procedures


Frequencies (counts & percents)

A frequency analysis displays counts and percents for questions that have categorical response choices. Everyone understands frequencies, but many people don't realize how many options are available when creating a frequency table.

Basic frequencies analysis shows counts and percents


Here is a picture of how you preview the results of an analysis. This is a frequency analysis with cumulative percents. As you can see, the interface is similar to a word processor, with a toolbar, menu bar, and ruler.


Preview the results of an analysis


When you click on the toolbar graphics icon, you can select a vertical bar chart:


Vertical bar graph


Most people view a percent as some kind of "set in stone" value. That is, if 30% of the sample selected the second response choice, they assume that's the percentage in the population. They forget the concept of "margin of error". When doing a survey on a sample of the population (as opposed to a census), there's a margin of error associated with every statistic. A confidence interval tells you the plus-or-minus range around a statistic at a chosen level of confidence. Every percentage in a frequency analysis report has a confidence interval associated with it, although very few survey software programs tell you what it is. StatPac gives you the choice to show the confidence interval for the percentages.
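To make the margin of error concrete, here is a minimal sketch in Python. It is not StatPac's own calculation (this page doesn't say which formula the software uses); it assumes a simple random sample and the common normal-approximation interval for a proportion. The function name and the sample of 400 respondents are hypothetical.

```python
import math

def proportion_confidence_interval(count, n, z=1.96):
    """Return (low, high) for a sample proportion.

    Uses the normal approximation, which assumes a simple random
    sample that is reasonably large. z=1.96 corresponds to a 95%
    confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = count / n
    margin = z * math.sqrt(p * (1 - p) / n)
    # A proportion can never fall outside 0..1, so clip the interval.
    return max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical example: 120 of 400 respondents (30%) chose a response.
low, high = proportion_confidence_interval(120, 400)
print(f"30.0% (95% CI: {low * 100:.1f}% to {high * 100:.1f}%)")
```

With a smaller sample the interval widens, which is why the same 30% can mean quite different things in two different surveys.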


Confidence intervals tell you the level of certainty you can have for each percent


Click on the toolbar graphics icon to select a pie chart:


Pie charts are good when there are a small number of response categories


Percentages are typically calculated by dividing the number of people who selected a response by the number of people who were asked the question. This can be misleading when a substantial number of people did not answer the question. In those cases, the denominator for calculating the percentage (referred to as the percentage base) should be the number of people who answered the question. StatPac lets you set the percentage base to either one.
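The effect of the percentage base is easy to see with a few lines of Python. This is only an illustrative sketch for a single-response question; the function name and the counts are made up, and it says nothing about how StatPac stores or computes its tables.

```python
def percents_by_base(counts, asked):
    """Compute percents for a single-response question on two bases.

    counts: dict mapping each response choice to the number of people
            who selected it (people who skipped the question are absent).
    asked:  number of people who were asked the question.
    """
    answered = sum(counts.values())
    if asked <= 0 or answered <= 0:
        raise ValueError("both percentage bases must be positive")
    by_asked = {choice: 100 * n / asked for choice, n in counts.items()}
    by_answered = {choice: 100 * n / answered for choice, n in counts.items()}
    return by_asked, by_answered

# Hypothetical data: 200 people were asked, but only 150 answered.
by_asked, by_answered = percents_by_base({"Yes": 90, "No": 60}, asked=200)
print(by_asked)     # percents based on everyone who was asked
print(by_answered)  # percents based on those who answered
```

In this made-up example "Yes" is 45% of the people asked but 60% of the people who answered, so a report should always make clear which base was used.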


Some questions allow multiple answers (i.e., a respondent can make more than one choice). The percentage base is especially important for multiple response questions because the reported percentages can be very different depending on the base. With multiple response, there are three possible choices for the percentage base: the number of people, the number of responses, and the number of people who selected at least one choice.
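The three percentage bases for a multiple response question can be sketched as follows. This is an illustration in Python under assumed data (each respondent is represented as a set of chosen items, with an empty set for someone who selected nothing); the names are hypothetical and are not StatPac options.

```python
from collections import Counter

def multiple_response_percents(selections):
    """Return percents for each choice on three different bases.

    selections: one set of chosen items per respondent; an empty set
                means the respondent selected nothing.
    """
    counts = Counter(choice for chosen in selections for choice in chosen)
    bases = {
        "people": len(selections),
        "responses": sum(counts.values()),
        "people_with_answer": sum(1 for chosen in selections if chosen),
    }
    result = {}
    for name, base in bases.items():
        if base == 0:
            raise ValueError(f"percentage base '{name}' is zero")
        result[name] = {c: 100 * n / base for c, n in counts.items()}
    return result

# Hypothetical data: five respondents, one of whom selected nothing.
data = [{"A", "B"}, {"A"}, {"B", "C"}, set(), {"A"}]
table = multiple_response_percents(data)
for base_name, percents in table.items():
    print(base_name, {c: round(p, 1) for c, p in sorted(percents.items())})
```

Here choice "A" is 60% of people, 50% of responses, and 75% of people who selected at least one choice. Only the responses base is guaranteed to sum to 100%.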


Multiple response is a common type of survey question


Click on the toolbar graphics icon and select a horizontal bar chart:

Horizontal bar charts are good when there are many response choices


When a group of questions has the same response choices (e.g., a series of Likert scale questions), it is often desirable to show them in one table, maybe even sorted by level of agreement. With a single option, StatPac lets you group questions into a matrix layout. Another option displays them sorted by level of agreement.
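As a rough illustration of sorting a matrix of Likert items by level of agreement, here is a short Python sketch. It assumes "level of agreement" means the percent of answering respondents who chose "Agree" or "Strongly agree"; that definition, the item names, and the data are all assumptions made for the example, not a description of StatPac's option.

```python
AGREE = {"Agree", "Strongly agree"}  # assumed definition of "agreement"

def agreement_table(responses_by_item):
    """Return (item, percent agreeing) rows, highest agreement first.

    responses_by_item: dict mapping an item label to a list of answers,
                       where None marks a respondent who skipped the item.
    """
    rows = []
    for item, answers in responses_by_item.items():
        answered = [a for a in answers if a is not None]
        if answered:
            pct = 100 * sum(a in AGREE for a in answered) / len(answered)
        else:
            pct = 0.0  # nobody answered this item
        rows.append((item, pct))
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Hypothetical survey items and answers.
survey = {
    "Service": ["Agree", "Disagree", "Strongly agree", "Neutral"],
    "Price": ["Disagree", "Neutral", "Agree", "Disagree"],
    "Quality": ["Strongly agree", "Agree", "Agree", None],
}
for item, pct in agreement_table(survey):
    print(f"{item:<10}{pct:6.1f}%")
```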


Show many frequency analyses on one page in a matrix format


Sometimes, unlabeled choices (e.g., zip codes) are best shown in a highly compressed table, like this:


Some data is best displayed in a compressed format


List (open-ended verbatim responses)

Open-ended questions give an opportunity for verbatim responses

Open-ended questions are extremely valuable. They are used extensively for "Other (please specify)" questions, and are also used for questions where the researcher has insufficient pre-survey knowledge to create a set of close-ended response categories. StatPac can list the open-ended responses verbatim like this:


Listing open-ended verbatim responses


Coding open-ended responses (content analysis)

Reading open-ended comments often reveals patterns and themes, but they are difficult to mentally aggregate. Even with just a few dozen responses it becomes difficult to get a feel for how many people said what. StatPac open-ended response coding lets you group respondents' answers into response categories (themes), making it easier to make sense of the answers. The coding of open-ended responses (often referred to as "content analysis") is nearly automatic and can save many hours of time on a large survey.

Content analysis of verbatim responses


StatPac's automated response coding works spectacularly well on some questions and poorly on others. Questions that have a relatively limited set of responses will produce nearly perfect coding. Examples are: "What city do you live in?", "Who's your favorite professional football team?", "Who should be employee of the month?", and "What is the one thing our department could do to improve the most?". Each of these questions is specific and would yield a limited set of unique responses, so their coding would be nearly automatic. Soundex matching is used so that most spelling differences don't matter.
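For readers who haven't met Soundex before, here is a minimal Python sketch of the standard American Soundex code, which maps similar-sounding words to the same four-character key. It only illustrates the general technique; how StatPac applies Soundex internally is not described on this page, and the function name is our own.

```python
# Letter-to-digit table for standard American Soundex.
GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

def soundex(word):
    """Return the four-character Soundex key for a word ('' if no letters)."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    key = letters[0]                    # the first letter is kept as-is
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = CODES.get(ch, "")        # vowels (and Y) have no code
        if code and code != previous:
            key += code
        previous = code                 # a vowel resets the "previous" code
    return (key + "000")[:4]            # pad or truncate to four characters

# Two spellings of the same city produce the same key.
print(soundex("Pittsburgh"), soundex("Pitsburg"))
```

Because both spellings reduce to the same key, they can be counted as one response category without anyone fixing the typos by hand.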


Very general questions that elicit a large number of unique responses generally do not code well automatically. Examples are: "How do you feel about life?" and "Describe the event that changed your opinion." StatPac might do well at automatically coding these questions and it might not. It just depends on what people said. At a minimum, it would do the initial coding to give you a head start.


Here is one of the open-ended response coding screens:


Automatically code open-ended responses into response categories


Descriptive statistics (measures of central tendency & dispersion)

Descriptive statistics include measures of central tendency (mean, median, and mode) and measures of dispersion (variance, standard error, and standard deviation). StatPac additionally includes measures of non-normality (skewness and kurtosis), as well as confidence intervals and quartiles (or any other "iles").

Descriptive statistics include measures of central tendency, dispersion, and non-normality
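The statistics named above can be sketched with Python's standard library. This is an illustration, not StatPac's implementation: it assumes sample (n - 1) variance, simple moment-based skewness and excess kurtosis, and the default quartile method of `statistics.quantiles`. Other packages use slightly different conventions, so small differences in skewness, kurtosis, and quartiles are normal. The ratings are made-up data.

```python
import math
import statistics

def describe(values):
    """Return common descriptive statistics for a list of numbers.

    Assumes at least two values that are not all identical.
    """
    n = len(values)
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)          # sample standard deviation
    # Central moments (divided by n) for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance
        "std_dev": std_dev,
        "std_error": std_dev / math.sqrt(n),      # standard error of the mean
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,      # 0 for a normal distribution
        "quartiles": statistics.quantiles(values, n=4),
    }

# Hypothetical ratings on a 1-5 scale.
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5]
for name, value in describe(ratings).items():
    print(f"{name:<16}{value}")
```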


Use options to select which statistics are reported. Here's an example of descriptive statistics for a single variable.


Sample descriptive statistics printout


Click on the toolbar graphics icon to get a distribution plot:


Distribution plot to see if the data resembles a normal distribution


Descriptive statistics can also be performed on a group of related variables. Select which statistics will be included in the table.


Descriptive statistics for several variables in matrix format


Click on the toolbar graphics icon to select a horizontal bar graph of means:


Horizontal bar graph showing means


(read more about the features in our survey software)