= Survey (human research) =

In research of human subjects, a survey is a list of questions aimed at extracting specific data from a particular group of people. Surveys may be conducted by phone, by mail, via the internet, or in person in public spaces. Surveys are used to gather knowledge in fields such as social research and demography.

Survey research is often used to assess thoughts, opinions and feelings. Surveys can be specific and limited, or they can have more global, widespread goals. Psychologists and sociologists often use surveys to analyze behavior, while surveys are also used to meet the more pragmatic needs of the media (for example, in evaluating political candidates), public health officials, professional organizations, and advertising and marketing directors. Survey research has also been employed in various medical and surgical fields to gather information about healthcare personnel's practice patterns and professional attitudes toward various clinical problems and diseases. Healthcare professionals who may be enrolled in survey studies include physicians, nurses, and physical therapists, among others.

A survey consists of a predetermined set of questions that is given to a sample. With a representative sample, that is, one that reflects the larger population of interest, one can describe the attitudes of the population from which the sample was drawn. Further, one can compare the attitudes of different populations as well as look for changes in attitudes over time. Good sample selection is key, as it allows one to generalize the findings from the sample to the population, which is the whole purpose of survey research. It is also important to ensure that survey questions are not biased, for example through suggestive wording, because biased questions produce inaccurate results.

Survey data collection methods gather information from a sample of individuals in a systematic way. First there was the change from traditional paper-and-pencil interviewing (PAPI) to computer-assisted interviewing (CAI). Now, face-to-face surveys (CAPI), telephone surveys (CATI), and mail surveys (CASI, CSAQ) are increasingly being replaced by web surveys. In addition, remote interviewers could possibly keep the respondent engaged while reducing cost compared with in-person interviewers.

==Types==

===Census===

A census is the procedure of systematically acquiring and recording information about the members of a specific given population. It is a regularly occurring and official count of a particular population. The term is used mostly in connection with national population and housing censuses; other common censuses include agriculture, business, and traffic censuses. The United Nations defines the essential features of population and housing censuses as "individual enumeration, universality within a defined territory, simultaneity and defined periodicity", and recommends that population censuses be taken at least every 10 years.

===Other household surveys===

Surveys other than the census may explore characteristics of households, such as fertility, family structure, and demographics.

Household surveys with at least 10,000 participants include:
- General Household Survey, conducted in private households in Great Britain. It is a repeated cross-sectional study, conducted annually, which used a sample of 9,731 households in the 2006 survey.
- Generations and Gender Survey, conducted in several countries in Europe as well as Australia and Japan. The programme has collected at least one wave of surveys in 19 countries, with an average of 9,000 respondents per country.
- Household, Income and Labour Dynamics in Australia Survey, where the wave 1 panel consisted of 7,682 households and 19,914 individuals.
- Integrated Household Survey, a survey made up of multiple other surveys in the UK. It includes about 340,000 respondents, making it the largest collection of social data in the UK after the census.
- National Survey of Family Growth, conducted in the United States by the National Center for Health Statistics division of the Centers for Disease Control and Prevention to understand trends related to fertility, family structure, and demographics in the United States. The 2006-2010 NSFG comprised 22,682 interviews.
- Panel Study of Income Dynamics in the United States, wherein data have been collected from the same families and their descendants since 1968. The study started with over 18,000 nationally representative individuals. It involved more than 9,000 individuals as of 2009.
- Socio-Economic Panel, a longitudinal panel dataset of the population in Germany. It is a household-based study that started in 1984 and which reinterviews adult household members annually. In 2007, the study involved about 12,000 households, with more than 20,000 adult persons sampled.
- UK households: a longitudinal study, now known as Understanding Society. Its sample size is 40,000 households from the United Kingdom or approx. 100,000 individuals.

===Opinion poll===

An opinion poll is a survey of public opinion from a particular sample. Opinion polls are usually designed to represent the opinions of a population by asking a series of questions and then extrapolating generalities in ratio or within confidence intervals.
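
As an illustration of extrapolating within confidence intervals, the following minimal sketch computes a normal-approximation (Wald) interval for a polled proportion. It assumes a simple random sample and a 95% level (`z = 1.96`); the function name and the poll figures are hypothetical and are not drawn from any particular poll.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a sample proportion.

    Assumes a simple random sample; z=1.96 corresponds to roughly 95%.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    # Standard error of a proportion under simple random sampling.
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clamp to [0, 1] because a proportion cannot fall outside that range.
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical poll: 520 of 1,000 respondents favour a proposal.
p_hat, low, high = proportion_confidence_interval(520, 1000)
print(f"estimate={p_hat:.3f}, interval=({low:.3f}, {high:.3f})")
```

With a complex sample design (clustering, weighting), the standard error would generally need a design-based estimate rather than this simple formula.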

===Healthcare surveys===
Medical or health-related survey research is particularly concerned with uncovering knowledge-practice gaps, that is, inconsistencies between established international guidelines and actual medical practice regarding a certain disease or clinical problem. In other words, some medical surveys aim to explore the difference between proper practice and the actual practice reported by healthcare professionals. Medical survey research has also been used to collect information from patients, caregivers and even the public on relevant health issues. In turn, the information gathered from survey results can be used to improve the professional performance of healthcare personnel, including physicians; raise the quality of healthcare delivered to patients; and remedy existing deficiencies in the healthcare delivery system and in professional health education. Furthermore, the results of survey research can inform the public health domain, help in conducting health awareness campaigns in vulnerable populations, and guide healthcare policy-makers. This is especially true when survey research deals with a widespread disease that constitutes a nationwide or global health challenge.

==Methodology==

A single survey is made up of at least a sample (or the full population in the case of a census), a method of data collection (e.g., a questionnaire) and individual questions or items that become data that can be analyzed statistically. A single survey may focus on different types of topics such as preferences (e.g., for a presidential candidate), opinions (e.g., should abortion be legal?), behavior (e.g., smoking and alcohol use), or factual information (e.g., income), depending on its purpose. Since survey research is almost always based on a sample of the population, the success of the research depends on the representativeness of the sample with respect to a target population of interest to the researcher. That target population can range from the general population of a given country to specific groups of people within that country, to a membership list of a professional organization, or a list of students enrolled in a school system (see also sampling (statistics) and survey sampling).
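
Drawing a sample from a list such as a membership roll can be sketched as follows. This is a minimal illustration of simple random sampling without replacement, which is only one of several sampling designs; the frame contents, function name and seed are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from a sampling frame, each equally likely."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)  # a seeded generator makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: a membership list of 500 people.
frame = [f"member_{i:04d}" for i in range(1, 501)]
sample = simple_random_sample(frame, 50, seed=42)
print(len(sample), "units drawn from a frame of", len(frame))
```

Each member of the frame has the same inclusion probability (here 50/500), which is what allows findings from the sample to be generalized to the frame.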

The choice between administration modes is influenced by several factors, including 1) costs, 2) coverage of the target population (including group-specific preferences for certain modes), 3) flexibility of asking questions, 4) respondents' willingness to participate and 5) response accuracy. Different methods create mode effects that change how respondents answer. The most common modes of administration are listed under the following headings.

=== Mobile surveys ===
Mobile data collection, or mobile surveying, is an increasingly popular method of data collection. Over 50% of surveys today are opened on mobile devices. The survey, form, app or collection tool is on a mobile device such as a smartphone or a tablet. These devices offer innovative ways to gather data, and eliminate the laborious "data entry" (of paper form data into a computer), which delays data analysis and understanding. By eliminating paper, mobile data collection can also dramatically reduce costs: one World Bank study in Guatemala found a 71% decrease in cost while using mobile data collection, compared to the previous paper-based approach.

Apart from high mobile phone penetration, further advantages are quicker response times and the possibility of reaching previously hard-to-reach target groups. In this way, mobile technology allows marketers, researchers and employers to create real and meaningful mobile engagement in environments different from the traditional one in front of a desktop computer. However, even when using mobile devices to answer web surveys, most respondents still answer from home.

=== SMS/IM surveys ===
SMS surveys can reach any handset, in any language and in any country. As they are not dependent on internet access and the answers can be sent when it is convenient, they are a suitable mobile survey data collection channel for many situations that require fast, high-volume responses. As a result, SMS surveys can deliver 80% of responses in less than 2 hours, often at much lower cost than face-to-face surveys because travel and personnel costs are eliminated. IM is similar to SMS, except that a mobile number is not required. IM functions are available in standalone software, such as Skype, or embedded on websites such as Facebook and Google.

===Online surveys===
Online (Internet) surveys are becoming an essential research tool for a variety of research fields, including marketing, social and official statistics research. According to ESOMAR, online survey research accounted for 20% of global data-collection expenditure in 2006. They offer capabilities beyond those available for any other type of self-administered questionnaire. Online consumer panels are also used extensively for carrying out surveys, but their quality is considered inferior because the panelists are regular contributors and tend to be fatigued. However, when estimating measurement quality (defined as the product of reliability and validity) using a multitrait-multimethod (MTMM) approach, some studies found quite reasonable quality, and even found that the quality of a series of questions in an online opt-in panel (Netquest) was very similar to the measurement quality for the same questions asked in the European Social Survey (ESS), which is a face-to-face survey.

Some studies have compared the quality of face-to-face surveys and/or telephone surveys with that of online surveys, for single questions, but also for more complex concepts measured with more than one question (also called composite scores or indexes). Focusing only on probability-based surveys (including the online ones), they found overall that face-to-face surveys (using show-cards) and web surveys have quite similar levels of measurement quality, whereas telephone surveys performed worse. Other studies comparing paper-and-pencil questionnaires with web-based questionnaires showed that employees preferred online survey approaches to the paper-and-pencil format. There are also concerns about what has been called "ballot stuffing", in which employees make repeated responses to the same survey. Some employees are also concerned about privacy. Even if they do not provide their names when responding to a company survey, can they be certain that their anonymity is protected? Such fears prevent some employees from expressing an opinion.

====Advantages of online surveys====
- Web surveys are faster, simpler, and cheaper. However, lower costs are not so straightforward in practice, as they are strongly interconnected with errors. Because response rate comparisons with other survey modes are usually not favourable for online surveys, efforts to achieve a higher response rate (e.g., with traditional solicitation methods) may substantially increase costs.
- The entire data collection period is significantly shortened, as all data can be collected and processed in little more than a month.
- Interaction between the respondent and the questionnaire is more dynamic compared to e-mail or paper surveys. Online surveys are also less intrusive, and they suffer less from social desirability effects.
- Complex skip patterns can be implemented in ways that are mostly invisible to the respondent.
- Pop-up instructions can be provided for individual questions to provide help with questions exactly where assistance is required.
- Questions with long lists of answer choices can be used to provide immediate coding of answers to certain questions that are usually asked in an open-ended fashion in paper questionnaires.
- Online surveys can be tailored to the situation (e.g., respondents may be allowed to save a partially completed form, the questionnaire may be preloaded with already available information, etc.).
- Online questionnaires may be improved by applying usability testing, where usability is measured with reference to the speed with which a task can be performed, the frequency of errors and user satisfaction with the interface.
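
The skip patterns mentioned in the list above can be sketched as a small routing table in which each question decides, from the answer just given, which question to show next. This is a minimal illustration only; the question identifiers, wording and routing rules are hypothetical and do not reflect any particular survey platform.

```python
# Hypothetical questionnaire: each entry holds the question text and a
# routing function that maps the answer to the next question id
# (None means the questionnaire ends).
QUESTIONS = {
    "smokes": {
        "text": "Do you currently smoke?",
        "next": lambda answer: "cigs_per_day" if answer == "yes" else "age",
    },
    "cigs_per_day": {
        "text": "How many cigarettes do you smoke per day?",
        "next": lambda answer: "age",
    },
    "age": {
        "text": "What is your age?",
        "next": lambda answer: None,
    },
}

def questions_shown(answers, start="smokes"):
    """Return the ordered question ids a respondent sees for given answers."""
    shown = []
    current = start
    while current is not None:
        shown.append(current)
        current = QUESTIONS[current]["next"](answers.get(current))
    return shown

# A non-smoker never sees the follow-up about cigarettes per day.
print(questions_shown({"smokes": "no", "age": "40"}))
print(questions_shown({"smokes": "yes", "cigs_per_day": "10", "age": "40"}))
```

Because the routing happens in software, the respondent only ever sees the questions that apply, which is why such patterns are mostly invisible to them.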

====Key methodological issues of online surveys====
- Sampling. The difference between probability samples (where the inclusion probabilities for all units of the target population are known in advance) and non-probability samples (which often require less time and effort but generally do not support statistical inference) is crucial. Probability samples are highly affected by problems of non-coverage (not all members of the general population have Internet access) and frame problems (online survey invitations are most conveniently distributed using e-mail, but there are no e-mail directories of the general population that might be used as a sampling frame). Because coverage and frame problems can significantly impact data quality, they should be adequately reported when disseminating the research results.
- Invitations to online surveys. Due to the lack of sampling frames, many online survey invitations are published in the form of a URL link on web sites or in other media, which leads to sample selection bias that is outside the researcher's control and to non-probability samples. Traditional solicitation modes, such as telephone or mail invitations to web surveys, can help overcome probability sampling issues in online surveys. However, such approaches face problems of dramatically higher costs and questionable effectiveness.
- Non-response. Online survey response rates are generally low and also vary widely, from less than 1% in enterprise surveys with e-mail invitations to almost 100% in specific membership surveys. In addition to refusing participation, terminating the survey partway through or not answering certain questions, several other non-response patterns can be observed in online surveys, such as lurking respondents and a combination of partial and item non-response. Response rates can be increased by offering monetary or some other type of incentive to the respondents, by contacting respondents several times (follow-up), and by keeping the questionnaire difficulty as low as possible. There are drawbacks to using an incentive to garner a response: whether the resulting responses are unbiased can be called into question. The most concrete way to gain feedback is to publicize what is done with the results. Taking concrete actions based on feedback, and showing them to the customer base, strongly motivates customers to continue making their voices heard.
- Acquiescence bias. Many people have acquiescent personalities and are more likely to agree with statements than to disagree, regardless of the content. Often, those people see the question-asker as an expert in their field, which makes them more likely to react positively to the question asked.
- Platform Issues. Lack of familiarity with the platform used can cause participants and clients confusion, or limit who may be willing and able to navigate surveys on digital platforms.
- Questionnaire design. While modern web questionnaires offer a range of design features (different question types, images, multimedia), the use of such elements should be limited to the extent necessary for respondents to understand questions or to stimulate the response. It should not affect their responses, because that would mean lower validity and reliability of data. Appropriate questionnaire design can help lower the measurement error that can also arise from the respondents or the survey mode itself (respondent's motivation, computer literacy, abilities, privacy concerns, etc.).
- Post-survey adjustments. Various robust procedures have been developed for situations where sampling deviates from probability selection, or where non-coverage and non-response problems arise. The standard statistical inference procedures (e.g. confidence interval calculations and hypothesis testing) still require a probability sample. Actual survey practice, particularly in marketing research and in public opinion polling, often neglects the principles of probability sampling, and increasingly requires the statistical profession to specify the conditions under which non-probability samples may work.
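
One commonly used post-survey adjustment is post-stratification weighting, sketched below. The text above does not name a specific procedure, so this choice is an illustrative assumption; the group labels, population shares and responses are hypothetical. Each respondent is weighted by the ratio of their group's population share to its share of the sample.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight per group = population share / sample share."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return {
        group: population_shares[group] / (count / n)
        for group, count in counts.items()
    }

def weighted_mean(values, groups, weights):
    """Mean of values where each value carries its group's weight."""
    total_weight = sum(weights[g] for g in groups)
    return sum(v * weights[g] for v, g in zip(values, groups)) / total_weight

# Hypothetical web sample that over-represents younger respondents.
groups = ["young"] * 8 + ["old"] * 2
answers = [1] * 8 + [0] * 2            # 1 = agrees with a statement
population = {"young": 0.5, "old": 0.5}  # assumed known, e.g. from a census

weights = poststratification_weights(groups, population)
print("unweighted:", sum(answers) / len(answers))
print("weighted:  ", weighted_mean(answers, groups, weights))
```

Weighting of this kind corrects only for imbalance on the variables used to form the groups; it does not by itself turn a non-probability sample into a probability sample.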

These issues, and potential remedies, are discussed in a number of sources.

===Telephone===

Telephone surveys use interviewers to encourage the sample persons to respond, which leads to higher response rates. There is some potential for interviewer bias (e.g., some people may be more willing to discuss a sensitive issue with a female interviewer than with a male one). Depending on local call charge structure and coverage, this method can be cost-efficient and may be appropriate for large national (or international) sampling frames using traditional phones or computer-assisted telephone interviewing (CATI). Because it is audio-based, this mode cannot be used for non-audio information such as graphics, demonstrations, or taste/smell samples.

===Mail===
Depending on local bulk mail postage, mail surveys may be relatively low in cost compared to other modes. The field period tends to be longer, often several months, before the surveys are returned and statistical analysis can begin. The questionnaire may be handed to the respondents or mailed to them, but in all cases it is returned to the researcher via mail. Because no interviewer is present, the mail mode is not suitable for issues that may require clarification. However, there is no interviewer bias and respondents can answer at their own convenience (allowing them to break up long surveys; also useful if they need to check records to answer a question). To correct for nonresponse bias, extrapolation across waves can be used. Response rates can be improved by using mail panels (members of the panel must agree to participate) and prepaid monetary incentives, but response rates are affected by the class of mail through which the survey was sent. Panels can be used in longitudinal designs where the same respondents are surveyed several times.

Visual presentation of survey questions makes a difference in how respondents answer them. There are four primary design elements: words (meaning), numbers (sequencing), symbols (e.g. arrows), and graphics (e.g. text boxes). In translated surveys, writing practice (e.g. Spanish words are longer and require more printing space) and text orientation (e.g. Arabic is read from right to left) must be considered in questionnaire visual design to minimize missing data.

===Face-to-face===
The face-to-face mode is suitable for locations where telephone or mail services are not well developed. As with the telephone mode, the interviewer's presence carries the risk of interviewer bias.

===Video interviewing===
Video interviewing is similar to face-to-face interviewing except that the interviewer and respondent are not physically in the same location, but are communicating via video conferencing such as Zoom or Teams.

===Virtual worlds===
Virtual-world interviews take place online in a space created for virtual interaction with other users or players, such as Second Life. Both the respondent and interviewer choose avatars to represent themselves and interact by a chat feature or by real voice audio.

===Chatbots===
Chatbots are used regularly in marketing and sales to gather experience feedback. When used for collecting survey responses, chatbot surveys should be kept short, be trained to speak in a friendly human tone, and use an easy-to-navigate interface with more advanced artificial intelligence.

===Mixed-mode surveys===
Researchers can combine several of the above methods for data collection. For example, researchers can invite shoppers at malls, and send willing participants questionnaires by email. With the introduction of computers to the survey process, survey mode now includes combinations of different approaches or mixed-mode designs. Some of the most common methods are:
- Computer-assisted personal interviewing (CAPI): The computer displays the questions on screen, the interviewer reads them to the respondent, and then enters the respondent's answers.
- Audio computer-assisted self-interviewing (audio CASI): The respondent operates the computer, which displays the questions on the screen and plays recordings of them to the respondent, who then enters their answers.
- Computer-assisted telephone interviewing (CATI)
- Interactive voice response (IVR): The computer plays recordings of the questions to respondents over the telephone, who then respond by using the keypad of the telephone or speaking their answers aloud.
- Web surveys: The computer administers the questions online. See computer-assisted web interviewing (CAWI).

==Interpretation==

===Correlation and causality===

When two variables are related, or correlated, one can be used to predict the other. However, this does not establish causality: from a correlation alone it is not possible to determine a causal relationship between the two variables. Correlational evidence is nevertheless valuable because it can help identify potential causes of behavior. Path analysis is a statistical technique that can be used with correlational data. It involves the identification of mediator and moderator variables. A mediator variable is used to explain the correlation between two variables. A moderator variable affects the direction or strength of the correlation between two variables. A spurious relationship is a relationship in which the relation between two variables can be explained by a third variable.
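
A spurious relationship can be illustrated with a partial correlation, which measures the association between two variables after the linear influence of a third has been removed. The sketch below is a minimal illustration using made-up numbers in which `x` and `y` both depend on `z`; the data and function names are hypothetical, and partial correlation is only one of several ways to examine third-variable explanations.

```python
import math
from statistics import mean

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def partial_correlation(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Made-up data: x and y are each z plus unrelated noise.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 4, 3, 6, 5, 8, 7]
y = [2, 3, 2, 3, 6, 7, 6, 7]

print("r(x, y)     =", round(pearson_r(x, y), 3))
print("r(x, y | z) =", round(partial_correlation(x, y, z), 3))
```

In this example the raw correlation between `x` and `y` is strong, but it largely disappears once `z` is controlled for, which is the pattern expected when a third variable explains the relationship.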

Moreover, in survey research, correlation coefficients between two variables might be affected by measurement error, which can lead to wrongly estimated coefficients and biased substantive conclusions. Therefore, when using survey data, correlation coefficients need to be corrected for measurement error.
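
One classical way to make such a correction is the correction for attenuation, which divides the observed correlation by the square root of the product of the two measures' reliabilities. The text above does not specify a method, so this formula is used here as an illustrative assumption, and the reliability values are hypothetical.

```python
import math

def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Disattenuated correlation: r_observed / sqrt(rel_x * rel_y).

    Reliabilities must lie in (0, 1]; lower reliability means more
    measurement error and therefore a larger upward correction.
    """
    for rel in (reliability_x, reliability_y):
        if not 0 < rel <= 1:
            raise ValueError("reliabilities must be in the interval (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)

# Hypothetical values: observed r = 0.30, reliabilities 0.8 and 0.7.
corrected = correct_for_attenuation(0.30, 0.8, 0.7)
print(f"observed r = 0.30, corrected r = {corrected:.3f}")
```

The corrected coefficient is only as good as the reliability estimates fed into it, so poorly estimated reliabilities can over- or under-correct.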

===Reported behavior versus actual behavior===
The value of collected data depends entirely on how truthful respondents are in their answers on questionnaires. In general, survey researchers accept respondents' answers as true. Survey researchers avoid reactive measurement by examining the accuracy of verbal reports, and by directly observing respondents' behavior in comparison with their verbal reports to determine what behaviors they really engage in or what attitudes they really hold. Studies examining the association between self-reports (attitudes, intentions) and actual behavior show that the link between them, though positive, is not always strong, so caution is needed when extrapolating self-reports to actual behaviors. Dishonesty is pronounced in some sex-related queries, with men often overstating their number of sex partners, while women tend to understate theirs.

==History==
The Statistical Society of London pioneered the questionnaire in 1838. "Among the earliest acts of the Statistical Society of London ... was the appointment of committees to enquire into industrial and social conditions. One of these committees, in 1838, used the first written questionnaire of which I have any record. The committee-men prepared and printed a list of questions 'designed to elicit the complete and impartial history of strikes.'"

The most famous public survey in the United States of America is the national census. Held every ten years since 1790, the census attempts to count all persons, and also to obtain demographic data about factors such as age, ethnicity, and relationships within households.

With the application of probability sampling in the 1930s, surveys became a standard tool for empirical research in social sciences, marketing, and official statistics.

Nielsen ratings (carried out since 1947) provide another example of public surveys in the United States. Nielsen ratings track media-viewing habits (radio, television, internet, print), the results of which are used to make commissioning decisions. Some Nielsen ratings localize the data points to give marketing firms more specific information with which to target customers. Demographic data is also used to understand what influences work best to market consumer products, political campaigns, etc.

Following the invention of the telephone survey (used at least as early as the 1940s), the development of the Internet in the late 20th century fostered online surveys and web surveys.

==See also==

- Assessment
- Audience measurement
- Comparison of survey software
- Data collection system
- Opinion poll
- Statistical survey
- Questionnaire
- Wiki survey
