1) Psychosocial refers to how one positions or views oneself in the world, such as the identities one assumes, one's level of confidence, one's sense of well-being, and one's interactions or relationships with others or one's environment.

2) In education research, instruments are any tools used to collect data. This can include tests, questionnaires, surveys, and interview or focus group protocols. Typically, instruments are intended to measure latent constructs. Latent constructs are abstract variables that are not directly visible, such as confidence in one's ability to do science, one's knowledge about the structure and function of DNA, or one's experimental design skills. In this paper, the construct of interest is "project ownership," and the paper outlines the development of a survey that can be used to measure project ownership.

3) In general, items are questions or statements on a test or questionnaire. Scaled items are questions or statements for which respondents select a number or point on a scale, such as a rating of 2 (agree) on a scale of 1-5 (strongly agree, agree, neither agree nor disagree, disagree, strongly disagree).

4) In social sciences, theories are logical explanations of social phenomena. Theories explain relationships between relevant variables, which can then be tested empirically.

5) Dimensionality refers to the components of the instrument or construct that make up the whole. Take for example a chocolate chip cookie. Its "dimensions" are chips, dough, and some amount of baking. Without the dough, it would just be chips. Without the chips or baking, it would just be dough. All three dimensions together make up a chocolate chip cookie. Some scales include multiple dimensions if multiple ideas make up the whole construct, as in this paper. Other scales are thought to measure a single, unidimensional construct, such as research self-efficacy or confidence in one's ability to do research. For more on dimensionality, see: http://www.socialresearchmethods.net/kb/scalgen.php

6) Reliability is the quality of the instrument - in other words, how reliably or consistently it can measure the intended construct. Reliability is judged in multiple ways, including consistency over time (same person responds the same way at multiple time points) or consistency across respondents (different people respond similarly, for example, when rating the quality of students' responses to an essay question). For more on reliability, see: http://www.socialresearchmethods.net/kb/reliable.php

7) Validity is the idea that the instrument is actually measuring what you think it is measuring. For this paper, validity refers to the idea that students' responses to the survey are indicators of students' actual sense of project ownership - in other words, that it is reasonable and fair to draw conclusions about their sense of ownership based on how they respond to the survey. It is typical to require multiple forms of validity evidence to argue that an instrument is a valid or trustworthy measure of a construct. It is also important to note that validity is "contextual" - meaning that what is valid in one context or for a particular purpose may not be valid in other contexts or for other purposes. Imagine for example you were developing a test to measure 2nd graders' reading abilities. This is unlikely to be a valid measure of reading abilities of high school students. For more on validity, see: http://www.socialresearchmethods.net/kb/constval.php

8) Coefficient alpha, also called Cronbach's alpha, is an indicator of "internal reliability" of an instrument or scale. It is a function of the number of items, the average covariance among pairs of items, and the variance of the total score. The idea is that responses to items purported to measure a single construct should all correlate at a high level. This statistic alone is not a compelling indicator of reliability since its value can be inflated by increasing the total number of items. Rules of thumb suggest a minimum acceptable alpha value of 0.7 or 0.8 (values range from 0.0 to 1.0). For more details, see: http://www.socialresearchmethods.net/kb/reltypes.php
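
As an illustration, here is a minimal sketch of how coefficient alpha can be computed from a matrix of item responses (rows are respondents, columns are items). The data and variable names are hypothetical, not from the paper:

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Coefficient alpha: (k/(k-1)) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(item_scores, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 1-5 Likert responses from six respondents to four items
responses = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 2, 3, 3],
])
print(round(cronbach_alpha(responses), 2))
```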

9) Known groups validity refers to the idea that two groups can reasonably be expected to differ with respect to the construct being examined. Thus, differences in their responses can be used to confirm that an instrument is useful for discriminating between them. Once this is confirmed, the instrument itself can be used to discriminate among groups.

1) Sometimes because of our experience as instructors, we assume our experiences reflect well known and well accepted ideas. This is often not the case. In this Introduction, note how each claim or logical step in the core argument of the paper is supported by one or more citations. Because biology education research is interdisciplinary, it can be particularly helpful to search for articles with a cross-disciplinary tool such as Google Scholar https://scholar.google.com/.

2) Cognition relates to thinking or learning. Cognitive gains relate to improvements in one’s ability to think and reflect upon a particular lesson or topic.

3) Affect relates to emotions, attitudes, or dispositions. Affective gains relate to positive changes in enjoyment, satisfaction, beliefs, or values.

4) Behavior relates to observable or measurable actions. Behavioral outcomes are changes in actions by individuals as a result of an experience, such as the act of asking questions or choosing to take certain courses.

5) Note the careful language here. The authors do not state that students achieved particular outcomes, but rather that students reported achieving particular outcomes. This is an important distinction because students may be able to report reliably about some of their outcomes, such as gains in confidence, but not be able to reliably report on changes in their knowledge or skills. It is well established in cognitive science and psychology that novices cannot reliably gauge their knowledge and skills and tend to overestimate them; this phenomenon is called the "Dunning-Kruger effect." Experts tend to be better able to gauge their knowledge and skills, but have a tendency to underestimate because they are aware of all they don't know. For more on student self-assessment, see: http://rer.sagepub.com/content/59/4/395.short

6) Likert scales are multipoint ranges of response options for questions or statements, commonly used in survey research. For example, the points on a 1-5 scale might be strongly agree, agree, neither agree nor disagree, disagree, and strongly disagree.

7) The term qualitative can mean many things in education research. In this instance, qualitative means that the study made use of qualitative (not quantitative) data in the form of interviews and qualitative methods, specifically assigning meaning to quotes in interviews (see coding below) and then grouping meanings into overarching thematic ideas. For more on qualitative research, see: http://www.socialresearchmethods.net/kb/qual.php

8) Construct validity refers to the overarching idea of validity - that the instrument is measuring what it is intended to measure. For more on construct validity, see: http://www.socialresearchmethods.net/kb/constval.php

9) The field of computational linguistics involves using computer science and computational techniques to analyze language and speech.

10) Content analysis refers to the process of chunking language, either written or oral, into units (e.g., quotes), assigning meaning to the units, and then making inferences about the meaning or patterns of meaning.

11) This entire paragraph aims to build an argument about what ownership is and why it is important.

12) In social sciences, constructs are abstractions that are not directly observable but can be inferred through observable phenomena, such as emotions, attitudes, knowledge, or skills. They can also be thought of as latent variables, which are variables that are inferred based on observable variables such as responses to test or survey items, observation protocols, or interview or focus group questions.

13) Coding is the process of assigning meaning to chunks of language, either written or oral. For more on qualitative coding and analysis, see: https://researchrundowns.com/qual/qualitative-coding-analysis/

14) When defining a construct, it is important to consider not only what counts as part of the construct (indicators) but also what doesn't count (counterindicators). Clear delineation of the boundaries of a construct will result in more accurate measurement and improved interpretation of measurement data.

15) This careful selection and description of the study sample, meaning who the participants are and what their range of experiences is, is important for helping to delineate what counts as ownership and what does not.

15b) The fact that the authors articulate how the sample was selected makes it easier for readers to evaluate the results.

16) These five categories can be considered dimensions of the construct of ownership - elements that make up the idea of ownership. Think again of a chocolate chip cookie as the construct, with chips, dough, and baking as its dimensions: a chocolate chip cookie would not be a chocolate chip cookie without all three. Similarly, if others wanted to use the ownership scale to measure the sense of ownership students develop from a particular educational experience, all of the dimensions that make up ownership would need to be considered.

17) Because so much of social science research is exploring complex phenomena that are not directly observable, researchers must articulate how they operationalize their constructs of interest. In other words, what is the researcher observing that is an indicator of the construct? For example, students' responses to a test question indicate some knowledge about evolution. The "assessment triangle" provides a useful way to think about this (http://www.nap.edu/read/10019/chapter/4?term=%22assessment+triangle%22#44), specifically that there is a construct (e.g., student cognition), an observation that is indicative of the construct (e.g., responses to test questions), and an interpretation of the observation that results in an inference about the construct (e.g., the way the student answered the question indicates something about how they think).

18) This paragraph places this new study in the context of a prior work, Hanauer et al. (2012).

19) The authors are presenting some of the problems with undergraduate research experiences, now commonly called CUREs (course-based undergraduate research experiences, see https://curenet.cns.utexas.edu/), in terms of resources and student outcomes. In addition, they point out the need to develop an instrument that can measure project ownership in the context of course-based research experiences.

20) This description of the CURE survey, the URSSA, and the SURE survey provides background on the state of the field prior to this work, stating the features of these instruments, their limitations, and the need for other instruments.

21) The authors describe prior work that this report builds upon.

1) Standard methods sections for an education study include participants (including their personal characteristics and context or environment, such as their institution type, course context, etc.), the process (instruction, instrument development, etc.), data collection methods, and data analysis methods.

2) Minimum sample sizes should be chosen ahead of data collection in order to avoid "p-hacking," in which researchers collect data until they reach a desired p value. For quantitative research, a power analysis or some other evidence-based rationale should be used to decide minimum sample sizes before data collection begins.
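
As a sketch of what such a rationale might look like, the following computes the per-group sample size needed to detect a hypothesized medium effect in a two-group comparison, assuming the statsmodels package; the effect size, power, and alpha values here are illustrative choices, not values from this study:

```python
from statsmodels.stats.power import TTestIndPower

# Sample size per group needed to detect a medium effect (d = 0.5)
# with 80% power at alpha = 0.05, decided before any data are collected.
n_per_group = TTestIndPower().solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(round(n_per_group))  # roughly 64 per group
```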

3) Knowing who the participants are helps readers determine how the instrument described here could be useful. For example, the authors tested the instrument with undergraduate students in the United States. Middle school students or practicing scientists might also develop a sense of ownership of their research projects, but the instrument would need to be tested and validated with these groups to make sure they interpret the items as they are intended and that the instrument is useful for discriminating among individuals in other populations.

4) The more varied the settings from which data are collected, the more likely the results can be generalized, because the participants are more likely to reflect a broader population. In addition, depending on the nature of the instrument, it may be more salient to measure the construct at the course or institution level rather than the student level. Because ownership is specific to a student's interaction with their educational experience, the authors were most interested in measuring at the student level.

5) By stating where the sample population is from, the authors acknowledge the limitations of their study.

6) One issue that is not addressed here is whether students are likely to respond in ways that are influenced by their institutions or courses. In other words, students who are enrolled in the same course are likely to respond more similarly than students in different courses. When this happens, it may be necessary to use statistical models that account for the nesting of the data (i.e., students within courses, within institutions). If, however, students vary more in their responses within a course or institution than across courses or institutions, then nested models may not be necessary. Nesting is not possible with this dataset because there are not likely to be enough responses within each institution to model variance at the student and course or institution levels.

7) It is helpful to provide information about how students were recruited or incentivized to participate in order to evaluate any potential for bias in the sample and also so others can replicate the methods.

8) It is important to include the number of participants who were invited to participate (if known), started to participate, and completed participation. This allows readers to judge how representative the sample might be of a larger population, and thus how generalizable the results might be.

9) For this study, only data from fully completed surveys were used. This helps avoid making assumptions about missing data, which could be missing at random or missing for a reason that results in bias in the results.

10) Manuscripts reporting on studies involving human subjects must include explicit assurance that the research reported was approved or determined to be exempt from review by a local Institutional Review Board (IRB).

11) Providing demographic characteristics for the sample helps readers determine how representative their responses are likely to be of the entire undergraduate student population in the U.S.

12) Pilot testing is an important step in instrument development. Typically, a small group of individuals representing the target population are asked to respond to the items, explain their responses, describe any points of confusion or ambiguity, and possibly suggest improvements to the wording. Their explanations are useful for ensuring the items are being interpreted the way they were intended and are as clear as possible.

13) These authors use a measurement development process espoused by Netemeyer and colleagues (2003), which draws at least in part on the widely used and highly respected framework for validity proposed by Samuel Messick. For more on validity theory, see explanations on YouTube from John Hathcoat at James Madison University. 5-minute introduction at: https://www.youtube.com/watch?v=rYc-coraFNk , and 40-minute lecture at: https://www.youtube.com/watch?v=_HS4lxsoR4Q

14) Clearly defining the construct of interest

15) Writing the items themselves. At this stage, it is common to seek input from experts in the construct(s) being assessed to make sure the items represent the construct. Representation of the construct includes having items that represent all dimensions of the construct as well as the full range of difficulty or agreeableness of the construct (e.g., what a respondent might agree to if they have the very minimum level of ownership and what a respondent would agree to only if they reach a full level of ownership, as well as points in between).

16) Rewording items based on pilot responses to eliminate confusion and ambiguity.

17) Collecting data from a larger sample of respondents and conducting analyses that reveal information about the quality of the instrument.

18) If the items are measuring a single construct, then a given respondent should respond similarly to all of the items.

19) Factor analysis is a statistical method that describes variability among many observed, correlated variables (e.g., responses to individual items on a survey) in terms of a smaller number of unobserved variables or factors (e.g., latent variables or constructs). For example, in this study, responses to the items related to emotion are more correlated with one another than they are with the items related to ownership. So emotion and ownership are thought to be two factors.

20) Exploratory factor analysis, or EFA, makes no a priori assumptions about how many factors there are, or which items represent which factors.
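
A minimal sketch of an EFA in Python, assuming the third-party factor_analyzer package and a hypothetical data file of item responses (the authors do not describe their software; this mirrors only the analytic choices reported later in the paper: maximum likelihood extraction with oblimin rotation):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Hypothetical respondents-x-items data file
df = pd.read_csv("pos_responses.csv")

# Maximum likelihood extraction with oblimin rotation,
# which allows the extracted factors to correlate.
fa = FactorAnalyzer(n_factors=3, method="ml", rotation="oblimin")
fa.fit(df)

# Pattern matrix of item loadings, analogous to Table 3 below
loadings = pd.DataFrame(fa.loadings_, index=df.columns)
print(loadings.round(2))
```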

21) Confirmatory factor analysis should be used when arguments can be made regarding how many factors are represented in an instrument and which items relate to which factors.

22) Cronbach's alpha is used as a measure of internal consistency, or how closely related a set of items are to one another. Ideally, separate items intended to measure the same construct should be correlated. Although it is widely used, there are also widespread concerns about the limitations of this metric as an indicator of reliability. For more on this, consult with a psychometrician or quantitative methodologist.

23) Internal consistency reflects the relationship between items purported to measure the same construct or dimension.

24) A known-groups comparative study is a comparison of results from groups that can reasonably be expected to differ. This type of evidence can be used to make an argument about the validity of a measure.

1) Validity, reliability, and dimensionality are the psychometric properties of a measure (i.e., how an instrument performs in a particular setting with a particular population).

2) For more on acceptable values for Cronbach's alpha and its use as a measure of reliability and internal consistency, see: http://shell.cas.usf.edu/~pspector/ORM/LanceOrm-06.pdf

3) This refers to an option within the Cronbach's alpha procedure that allows the researcher to evaluate whether the alpha value changes if items are deleted. In other words, what would happen to the internal reliability of the scale if an item were deleted? These four items reduced the reliability of the scale up front and thus were taken out of the analysis.
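
A minimal sketch of this "alpha if item deleted" check, recomputing coefficient alpha with each item removed in turn (the input is a hypothetical respondents-x-items array):

```python
import numpy as np

def cronbach_alpha(items):
    k = items.shape[1]
    return (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum()
                            / items.sum(axis=1).var(ddof=1))

def alpha_if_deleted(items):
    """Alpha with each item removed; items whose removal raises alpha hurt reliability."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]
```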

4) This reflects two errors in the paper itself - 15 items are presented even though it says 18 here. The final survey contains 16 items; one is missing from this list. The total items tested are listed in Table 3 and the final version of the survey is presented in Table 4.

5) This is the correlation between scores on the specific item and the total score on the survey. Generally, responses to a given item should correlate with responses to the entire survey if they are measuring the same or related constructs.
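
A sketch of computing item-total correlations; the "corrected" variant shown here excludes the item itself from the total so an item does not correlate with itself (variable names are hypothetical):

```python
import numpy as np

def item_total_correlations(item_scores):
    """Corrected item-total correlation for each item in a respondents-x-items array."""
    items = np.asarray(item_scores, dtype=float)
    corrs = []
    for j in range(items.shape[1]):
        rest = np.delete(items, j, axis=1).sum(axis=1)  # total score without item j
        corrs.append(np.corrcoef(items[:, j], rest)[0, 1])
    return corrs
```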

6) This statistic is a measure of the proportion of variance among variables (in this case the items) that might be common variance (in this case, ownership or dimensions of it). A rule of thumb is that KMO values between 0.8 and 1 indicate sampling is adequate, and values less than 0.6 indicate sampling is not adequate although some authors set this cut-off at 0.5 as was done here.
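
A sketch of the KMO calculation, again assuming the factor_analyzer package and a hypothetical data file:

```python
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_kmo

df = pd.read_csv("pos_responses.csv")      # hypothetical respondents-x-items data
kmo_per_item, kmo_overall = calculate_kmo(df)
print(f"Overall KMO: {kmo_overall:.3f}")   # the paper reports 0.653, above the 0.5 cutoff used here
```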

7) Certain statistics assume normal distribution of one-dimensional or univariate data. Thus, data must be examined to determine whether this is a fair assumption before using these statistics. Normality can also be examined for multiple variables.

8) Maximum likelihood estimation is a method for estimating some unknown parameter (e.g., ownership) based on the distribution of observed values from a sample (i.e., participants’ responses to the items). The idea is to estimate a value for the unknown parameter that maximizes the probability of getting the data that were observed.
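
A toy illustration of the idea, assuming normally distributed data: the parameters are chosen to minimize the negative log-likelihood, i.e., to maximize the probability of the observed values (the data here are made up):

```python
import numpy as np
from scipy import stats, optimize

data = np.array([2.0, 3.0, 3.0, 4.0, 5.0, 3.0, 4.0])  # hypothetical observations

def neg_log_likelihood(params):
    mu, sigma = params
    return -np.sum(stats.norm.logpdf(data, loc=mu, scale=sigma))

# Find the mean and SD that make the observed data most probable
result = optimize.minimize(neg_log_likelihood, x0=[0.0, 1.0],
                           bounds=[(None, None), (1e-6, None)])
mu_hat, sigma_hat = result.x
print(f"mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}")
```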

9) Rotation is a technique used to make the results of factor analysis easier to interpret. In essence, the item responses (variables) are graphed and the axes of the graph are rotated so that clustering among variables is more obvious. Oblimin rotation (versus varimax rotation) was chosen here because it allows the factors to be non-orthogonal (i.e., correlated with one another). This makes sense because the items are likely to represent different factors or dimensions, which together represent ownership. Kaiser normalization involves normalizing the factor loadings, or correlations with factors, prior to rotating and then denormalizing them after rotation.

10) A scree plot is a visual depiction of the eigenvalues. By examining the slope of the line, you can see that most of the variance is explained by three factors. The slope doesn't change much with additional factors.

11) Eigenvalues are scalars associated with a linear transformation: each eigenvector of the transformation is scaled by its eigenvalue. They are helpful in this case because they reveal how many factors are likely to be explaining most of the variance in the data. If an eigenvalue is low (below 1 is the rule of thumb, although this is a low bar), then the corresponding factor is not explaining much of the variance in the data.
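
A minimal sketch of producing a scree plot from the eigenvalues of the items' correlation matrix, assuming numpy and matplotlib and a hypothetical data file:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("pos_responses.csv")                  # hypothetical respondents-x-items data
corr = np.corrcoef(df.to_numpy(), rowvar=False)        # item correlation matrix
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]  # largest first

plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
plt.axhline(1.0, linestyle="--")   # the eigenvalue > 1 rule of thumb
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```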

12) Loadings are the regression coefficients for each item with the factor. Different researchers advocate for different rules of thumb regarding what are acceptable thresholds for factor loadings. Loadings over 0.6 are generally considered good, while smaller loadings can be acceptable with a sufficiently large sample. For more on this, see: http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/thresholds

13) The entire instrument is included as it was administered so that readers can evaluate it for themselves.

14) This "course type" factor would be better described as "degrees of agency", as it is described in the text.

15) Factor loadings can be both positive and negative. This item has a negative wording, so it is not surprising that it negatively correlates with the other items in this factor. The important feature of factor loadings is their absolute value. This is also important for thinking about the construct itself - negative values indicate something about what the construct is not, while positive values indicate something about what the construct is.

16) Factor loadings can be both positive and negative. This item has opposite wording to the items that have positive factor loadings, so it is not surprising that it negatively correlates with those items. Given these values, a better name for this construct might be "lack of agency."

17) Maximum likelihood extraction is a technique that is part of exploratory factor analysis (EFA); another common EFA extraction method is principal axis factoring. A contrasting technique would be principal components analysis (PCA). This is fairly technical, but the main point is that EFA should be used (versus PCA) if there is a theoretical reason to connect the items - in other words, if the items represent some latent construct. In this case there is a good reason: the qualitative work that preceded this work. For more on the difference between EFA and PCA, see: http://www2.sas.com/proceedings/sugi30/203-30.pdf

18) Multiple sources of evidence are used to make judgments about whether items should be removed. It is important to remember that if items represent particular theoretical aspects of the construct of interest, it may be important to keep them and consider why the empirical evidence is not consistent with what was theorized.

19) The final, recommended version of the instrument is included, with the items that represent each dimension grouped together, to make it easier for readers to use. Grouping items related to a single factor together helps respondents stay focused on the particular idea or phenomenon and reduces their cognitive load.

20) Each point on the scale is labeled so that respondents have a better sense of what each point means. Responses to Likert scale items like this are considered ordinal variables rather than continuous variables because the differences between adjacent points on the scale are not necessarily equal.

21) This next component of the work strengthens the argument that the Project Ownership Survey is a valid measure of undergraduate life science students' sense of ownership of their lab learning experiences by comparing levels of ownership among students who would reasonably be expected to differ in their ownership.

22) This is one piece of evidence that the students differed in the experience they were responding about.

23) This is a second piece of evidence that the students' experiences differed between the two course types.

24) One-way ANOVA is used to test whether there is a statistically significant difference in the means of two or more independent groups.

25) It is more typical to calculate ANOVAs with sum scores (sums of responses to all of the items), because the items are thought to represent a latent construct rather than to have stand-alone meaning. ANOVAs for each of the items were calculated here because this was the first description of the survey, and this level of analysis can yield insight into how each item is behaving. Moving forward, this kind of calculation should be done with sum scores for each subscale (project ownership and emotion), since each represents a different dimension, or for the entire scale.

26) P values of <0.05 are generally interpreted as statistically significant. Yet, many concerns have been raised about the limited meaning of the p value, the arbitrary cut-off of 0.05 for determining statistical significance, and the possibility of Type I errors when multiple comparisons are done (as was done here). Reporting effect sizes (as was done here) and using corrections for multiple comparisons, such as the Bonferroni correction, can help to address these concerns.

27) Cohen's d is an indicator of the size of the standardized difference between two means. For more on interpreting Cohen's d, see: http://rpsychologist.com/d3/cohend/
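
A sketch of a known-groups comparison on subscale sum scores, assuming scipy; the two arrays of scores are made up for illustration, and the Cohen's d calculation uses the pooled standard deviation:

```python
import numpy as np
from scipy import stats

# Hypothetical ownership sum scores for two known groups
traditional = np.array([18, 20, 22, 19, 21, 17, 23])
research_based = np.array([25, 27, 24, 28, 26, 29, 25])

# One-way ANOVA (with exactly two groups, equivalent to an independent t test)
f_stat, p_value = stats.f_oneway(traditional, research_based)

# Cohen's d: difference between means in pooled-SD units
n1, n2 = len(traditional), len(research_based)
pooled_sd = np.sqrt(((n1 - 1) * traditional.var(ddof=1) +
                     (n2 - 1) * research_based.var(ddof=1)) / (n1 + n2 - 2))
d = (research_based.mean() - traditional.mean()) / pooled_sd

print(f"F = {f_stat:.2f}, p = {p_value:.4f}, d = {d:.2f}")
# With many comparisons, divide the 0.05 threshold by the number of tests
# (Bonferroni correction) before judging significance.
```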

1) The conclusions are not simply a restatement of the results. Rather, the results are briefly summarized and the authors focus on describing unanticipated findings, study limitations, and applications and implications of the work.

LSE authors have the liberty to assign appropriate names to each section of their paper. Using common section titles helps the reader navigate the work. In this case, the authors chose to call the section that interprets their result Conclusions instead of Discussion.

2) The key here is that multiple sources of evidence were used to demonstrate the POS is a valid measure of students' ownership of their lab learning experiences. This is consistent with current thinking about validity as a holistic idea as described by Samuel Messick. For more on validity theory, see explanations on YouTube from John Hathcoat at James Madison University. 5-minute introduction at: https://www.youtube.com/watch?v=rYc-coraFNk, and 40-minute lecture at: https://www.youtube.com/watch?v=_HS4lxsoR4Q

3) Cronbach's alpha and results of the factor analysis are the psychometric properties described here.

4) This is an important point because the authors are not "throwing out" the idea that student agency is important and perhaps unique to research courses. Rather, they will likely go back to the drawing board in designing items to better capture the idea of agency or what unique agency students experience in research courses versus traditional lab courses.

5) All studies have limitations. The authors are sometimes (although not always) best positioned to see the limitations of their study. Pointing out limitations makes it clear the authors have thought through their results and are not overstating their claims. Some papers even have a separate Limitations section.

6) This is a good recommendation given features of the study design and the evidence presented here.

7) The psychosocial aspects of undergraduate research experiences refer to how students view or position themselves while doing research and as a result of doing research. Students' psychosocial development, such as shifts in their identities as scientists or their confidence in their ability to do science research, is important to measure because it is likely to influence the choices students make in the future, such as whether they choose to engage in other research experiences, complete a science major, or pursue a graduate degree or a science-research related career.

8) It is reasonable to expect that students will learn to think in different ways as a result of doing research compared with other types of learning experiences. Nonetheless, these outcomes are important to measure because research experiences are likely to vary widely, and we cannot understand how to design effective research experiences without collecting and analyzing data to determine what students learn or how they develop.

Developing an Instrument

Annotated by Erin L. Dolan, Rebecca M. Price and Clark R. Coffman

Annotation published February 8, 2018

Hanauer and Dolan describe the development and testing of a research instrument for assessing how much ownership undergraduate biology students feel toward their scientific projects. Project ownership cannot be measured directly, but its degree can be inferred from responses to a carefully designed series of questions. Instruments like this are frequently used in education research. The National Research Council and the American Psychological Association have useful guides for developing this kind of assessment instrument.

We include this article in Anatomy of an Education Study because developing instruments—valid and reliable ways to measure latent variables (i.e., things in social science we cannot see, such as intelligence or sense of belonging)—is extremely challenging to do and important to do well because these tools will be used in future research. First drafts of the questions in an instrument often do not accurately measure the intended variables, and many iterations are necessary to refine questions until they produce trustworthy results. Hanauer and Dolan describe the process they used to develop the Project Ownership Survey, from initial design through pilot testing, analysis, refinement, and revision. Finally, they demonstrate that their instrument does measure project ownership for a sample of the intended population, undergraduate students in US colleges and universities.

CBE Life Sci Educ vol. 13 no. 1 149-158 doi: 10.1187/cbe.13-06-0123

The Project Ownership Survey: Measuring Differences in Scientific Inquiry Experiences

  1. David I. Hanauer*, and
  2. Erin L. Dolan

Affiliations

  1. *Indiana University of Pennsylvania, Indiana, PA 15705
  2. PHIRE Program, Hatfull Laboratory, University of Pittsburgh, University of Pittsburgh, Pittsburgh, PA 15260
  3. Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA 30602
  1. Michelle Smith, Monitoring Editor
  • Submitted July 8, 2013.
  • Revised November 6, 2013.
  • Accepted November 6, 2013.

Abstract

A growing body of research documents the positive outcomes of research experiences for undergraduates, including increased persistence in science. Study of undergraduate lab learning experiences has demonstrated that the design of the experience influences the extent to which students report ownership of the project and that project ownership is one of the psychosocial factors involved in student retention in the sciences. To date, methods for measuring project ownership have not been suitable for the collection of larger data sets. The current study aims to rectify this by developing, presenting, and evaluating a new instrument for measuring project ownership. Eighteen scaled items were generated based on prior research and theory related to project ownership and combined with 30 items shown to measure respondents’ emotions about an experience, resulting in the Project Ownership survey (POS). The POS was analyzed to determine its dimensionality, reliability, and validity. The POS had a coefficient alpha of 0.92 and thus has high internal consistency. Known-groups validity was analyzed through the ability of the instrument to differentiate between students who studied in traditional versus research-based laboratory courses. The POS scales differentiated between the groups, and findings paralleled previous results in relation to the characteristics of project ownership.

INTRODUCTION

Numerous calls for reform in undergraduate biology education emphasize the value of undergraduate research (e.g., American Association for the Advancement of Science, 2011). These calls are based on a growing body of research documenting how students benefit from research experiences (Kremer and Bringle, 1990; Kardash, 2000; Rauckhorst et al., 2001; Hathaway et al., 2002; Bauer and Bennett, 2003; Lopatto, 2004, 2007, 2010; Seymour et al., 2004; Hunter et al., 2007; Russell et al., 2007; Laursen et al., 2010; Thiry and Laursen, 2011). Undergraduates report cognitive gains, such as learning to think and work like a scientist; affective gains, such as finding research enjoyable and exciting; and behavioral outcomes, such as intentions to pursue further education or careers in science. Studies of undergraduate research experiences have focused primarily on internship-style research, in which individual undergraduates participate in research as apprentices to graduate, postdoctoral, or faculty mentors.

Many colleges and universities lack the research infrastructure to involve undergraduates in internship-style research experiences or simply cannot accommodate large undergraduate populations in internships (Wood, 2003; Desai et al., 2008). As a result, an increasing number of faculty members are employing “alternatives to the apprenticeship model” (Wei and Woodin, 2011)—scalable ways of involving students in research. Course-based undergraduate research experiences, or CUREs, involve whole classes of students in research projects that build on current science knowledge and involve students in the range of scientific practices, from asking questions to collecting, analyzing, and interpreting data, to building models and communicating their findings. In many of these projects, such as the Genomics Education Partnership or the Partnership for Research and Education in Plants (Dolan et al., 2008; Shaffer et al., 2010), students’ findings have been published or have the potential to be published. Such courses can also engage introductory students or others who have chosen not to pursue internship-style research. Thus, CUREs may influence students’ academic and career paths more than summer research experiences, which typically serve to confirm students’ prior academic or career choices (Lopatto, 2004, 2007; Seymour et al., 2004; Hunter et al., 2007). Students who participate in CUREs report many of the same gains as students who participate in more intensive but less scalable lab- or field-based internships (Goodner et al., 2003; Hatfull et al., 2006; Drew and Triplett, 2008; Lopatto et al., 2008; Caruso et al., 2009; Shaffer et al., 2010).

To date, most assessment of CUREs has made use of the Classroom Undergraduate Research Experiences (CURE) survey (Lopatto, 2010). The CURE survey comprises three elements: 1) an instructor report of the extent to which the learning experience resembles the practice of science research (e.g., outcomes of the research are unknown, students have some input into the focus or design of the research), 2) a student report of learning gains, and 3) a student report of attitudes toward science. The student portions of the CURE survey include series of Likert-type items about students’ attitudes toward science and their educational and career interests, as well as students’ perceptions of the learning experience, the nature of science, their own learning styles, and the science-related skills they developed as a result of participating in a CURE. Use of the CURE survey is an important first step in understanding the impacts of these educational experiences, but information concerning this instrument's dimensionality, validity, or reliability is not readily available. Another instrument used to measure the outcomes of research experiences is the Undergraduate Research Student Self-Assessment (URSSA; Hunter et al., 2009; Laursen et al., 2010). A large qualitative study provided the empirical evidence for URSSA development, serving as the basis for its construct validity. The URSSA and the CURE survey, as well as the CURE survey's sister instrument, the SURE (Survey of Undergraduate Research Experiences; Lopatto, 2004, 2007), aim to document outcomes of research experiences, rather than measure or discriminate between elements of CUREs that lead to particular student outcomes. In fact, for both CUREs and undergraduate research experiences in general, the design elements that lead to such desired outcomes as persistence in the sciences have yet to be elucidated (Sadler et al., 2010; Adedokun et al., 2013).

Project ownership has been proposed as one of these elements (Lopatto, 2003). Previous research on project ownership (Kennedy, 1994; Chung et al., 1998; Downie and Moore, 1998; Mason et al., 2004; Nail, 2007; Wiley, 2009; Hanauer et al., 2012) has explored the concept as part of educators’ increasing interest in understanding and measuring how social interactions influence students’ psychological development. The most developed of these studies, from a measurement perspective, was Hanauer et al. (2012), which utilized a computational linguistic and content analysis approach to measure project ownership, differentiate among undergraduate research experiences, and show connections to student retention. The aim of the present study is to evaluate the reliability and validity of a new instrument for assessing project ownership in undergraduate research experiences. The instrument is called the Project Ownership survey (POS), and it was developed from the linguistic and content categories presented in Hanauer et al. (2012).

Ownership in education has been related to student choices, engagement, emotional involvement, and personal connectivity (Kennedy, 1994; Chung et al., 1998; Downie and Moore, 1998; Mason et al., 2004; Nail, 2007; Wiley, 2009). Ownership as a concept integrates personal responsibility with commitment to and identification with the work conducted in the educational setting (Wiley, 2009). As such, measuring project ownership offers the potential to capture a particular orientation toward work conducted within the sciences. Hatfull (2010), in a discussion of his CURE dedicated to bacteriophage isolation and genomic description, makes the connection explicit by relating scientific discovery, positive emotion, a sense of accomplishment, motivation, and ownership. This approach emphasizes the relationship between the design components of the experience (such as facilitating the option of discovering a novel organism) and student emotive responses. Milner-Bolotin (2001) has also specified that the development of ownership is sensitive to the specific components manifested in an education program.

Hanauer and colleagues (2012) specifically defined the construct of project ownership and its relationship to educational experiences. This study involved interviewing students who had participated in different undergraduate research experiences and then carefully coding their expressed experiences for indicators or counterindicators of ownership. Three undergraduate research experiences were explored: a research-based field and laboratory course (Scott Strobel's Rainforest Expedition and Laboratory [REAL] program), internship-style independent research experiences, and more traditional laboratory courses in biochemistry and chemistry. The result of this analysis was a set of content statements that characterize these different experiences in terms of aspects of project ownership. The following five categories of project ownership statement were found to differentiate between the three educational experiences:

  1. Constructing connections between personal history and scientific inquiry: This category includes both statements and narratives that describe significant moments in a student's life that have shaped, influenced, and created the scientific work done in laboratory and fieldwork. These statements and narratives involve a student bringing past experience in the form of personal stories and past educational experience into current and future research.

  2. Agency combined with mentorship: This category identifies moments when the student actively sought advice, assistance, or direction from professors, teachers, and other students in order to overcome an issue or fulfill an aim in the student's research project. These moments represent cocontributions and knowledge building between students and with educators.

  3. Expressions of excitement toward scientific inquiry: This category finds moments of genuine excitement for the process of scientific inquiry. This code reveals when students show real emotional connections to the work they are performing. These statements express positive emotional interaction relating to involvement in science.

  4. Overcoming challenging moments in science: This category identifies statements that address strategies for overcoming frustrating moments or problems encountered in research. The students discussed how they approached problems by adjusting their work or predicted how they could develop an entirely different approach to work around the problem.

  5. Expressions of a sense of personal scientific achievement: This category describes a positive emotional expression upon achieving a specific goal. This category captures a specific moment in student work. Students reference a specific finding or discovery, and how the finding resulted in their pride, happiness, or satisfaction.

For each of these categories, there were increased frequencies of usage for students who partook in the research-based course rather than in the independent study or the traditional laboratory. Furthermore, students from the independent group had higher frequencies of usage for these categories than the traditional laboratory group but lower frequencies than the research-based course group. In this sense, these categories operationalized the concept of project ownership in a way that allowed differences in expressed educational experience to emerge.

An additional finding of the Hanauer et al. (2012) study was the presence of increased levels of emotive language for students who studied in the research-based laboratory course. This suggested that project ownership included not only personal connectivity, agency, problem solving, social interaction, and a sense of personal achievement, but also increased emotional valence for the educational experience. The study also presented data on the long-term outcomes of these educational experiences and suggested that increased project ownership resulted in long-term persistence in the sciences.

A limitation of the work conducted so far on project ownership is that, methodologically, it has been inappropriate for larger-scale research. Although theoretical description, qualitative observation, interview, and content and linguistic analysis have all been important for developing an understanding of the concept of project ownership, they are limited by the time-consuming and personnel-intensive nature of data collection and analysis. However, the findings from this work are sufficient to formulate a hypothesis that enhanced project ownership is positively related to longer-term retention in science careers and to offer an operational definition of project ownership. The task now is to develop an instrument based on existing research that is suitable for larger-scale implementation. Broadly, the aim of this paper is to present and evaluate a survey-based instrument that measures project ownership, with the hope that this tool will facilitate larger-scale studies of project ownership. To facilitate this process, the current study addressed the following research questions:

  1. What components of project ownership can be measured in a valid and reliable way?

  2. Are there differences in degrees of project ownership for students who studied in traditional and research-based educational laboratory experiences?

METHODS

Participants

The participants in this study were 114 undergraduate students enrolled in 24 different laboratory courses at 21 different institutions of higher education across the United States. The average number of participating students at each institution was 2.29 (range of 1–8 students). Student participation in the survey was requested through course instructors who were members of the Course-based Undergraduate Research Experiences Network (CUREnet; www.curenet.franklin.uga.edu). Students were not offered any incentive to participate in the survey. Of the 114 participants, only 68 completed the full survey. Analysis was only conducted on full surveys. Demographic information for the students is provided in Table 1. The request to participate in the survey and the Web-based informed consent process were conducted in accordance with Indiana University of Pennsylvania's IRB approval (log no. 13–185). The request to complete the survey was sent in the last 2 wk of classes, and all responses were collected by the end of that semester.

Table 1.

Demographic characteristics of participants (n = 68)

Characteristic n %
Gender
 Female 34 50
 Male 34 50
Class
 First year 7 10
 Sophomore 20 29
 Junior 15 22
 Senior 24 35
 Prefer not to respond 2 3
Race/ethnic identification
 Asian 5 7
 African American 11 16
 Hispanic or Latino 11 16
 Native Hawaiian or Other Pacific Islander 1 1
 White 39 57
 Other 3 4
 Prefer not to respond 4 6
Institution type
 Research university 22 32
 Master’s-granting institution 1 1
 Four-year school 43 63
 Community college 2 3

Development of the Survey Instrument

The POS was designed to have three main components: specification of the undergraduate research experience, assessment of degrees of project ownership, and emotive scales. The specification of the educational experience included several scales dealing with degrees of autonomy and a written description of the specific course. The project ownership component was developed as an extension of the qualitative and quantitative findings of the Hanauer et al. (2012) study presented above. In the development of the survey, the content categories of project ownership were rewritten as statement prompts in a five-point Likert scale (strongly agree–strongly disagree) format. For the emotional scales, a standardized set of self-reported, discrete emotion scales was used (Izard, 1993). The complete survey was piloted with a small group of 10 undergraduate students to evaluate that all questions and scales were comprehensible and elicited data relevant to the intent of the survey. Revisions were made in wording and organization. Finally, the survey was migrated to a Web-based interface for ease of data collection.

Data Analysis for Survey Instrument Validation

Data analysis was conducted in accordance with procedures utilized when validating a new survey instrument. The process of scale development has been described as involving four stages of development: 1) construct definition through theoretical and literature review; 2) generating measurement items; 3) refining scales through field-testing; and 4) finalizing the scale (Netemeyer et al., 2003). Both stages three and four involve field testing and, in a sense, include a repetition of the same set of procedures designed to explore the dimensionality, reliability, and validity of the new scale. Dimensionality in relation to survey instruments refers to the homogeneity of the items on the scale and evaluates the presence of underpinning constructs (Netemeyer et al., 2003). The core assumption is that specific items that vary in a systematic correlational manner underpin and together offer an operational definition of the construct being measured. An evaluation of the dimensionality of scale is important, in that it facilitates an understanding of the ways in which the different items on the scale relate and of the number of underpinning factors present within the scale. Dimensionality is usually analyzed through the statistical procedure of factor analysis (Rietveld and Van Hout, 1993). Both stages three and four of scale development involve factor analysis; however, in stage three, an exploratory factor analysis is conducted to refine the scale and gain understanding of the relationship between the items in relation to underpinning factors. In stage four, a confirmatory factor analysis is conducted to confirm the structure of the scale.

Both reliability and validity are well-established concepts for any usage of a research tool. In relation to scale development, reliability is established through the evaluation of the interrelatedness of the items on the scale and the stability of scores across administrations (Bachman, 2004). Cronbach's alpha is the most widely used reliability coefficient for establishing levels of acceptable internal consistency (Netemeyer et al., 2003). For validity, both theoretical and empirical measures need to be taken to ensure the construct validity of the scale. These can include close evaluation of the sources of the scale, its relationship to existing literature, qualitative analyses of participant responses, and empirical comparisons of the new scale with existing measures. The aim, as with any evaluation of validity, is to make sure that the new scale measures what it is intended to measure.

For assessing the dimensionality, reliability, and validity of the POS, the following analytical procedures were performed and are reported in the next section. An exploratory factor analysis was conducted to assess dimensionality and establish relationships between items. Cronbach's alpha was calculated to establish internal consistency of the scales. A known-groups comparative study was conducted to assess the validity of the revised instrument.

RESULTS

The Psychometric Properties of the POS

Statistical analyses were conducted to investigate the psychometric properties of the POS. The internal consistency of the whole instrument was evaluated using Cronbach's alpha with the result α = 0.91, which indicates high levels of consistency for the tool. However, it is important to note that 30 of the 48 items on the survey came from a standardized and established tool for evaluating emotional profiles (Izard, 1993), and were therefore already highly consistent. Accordingly, it was decided to evaluate the internal consistency of the 18 new items without the emotional scales. Cronbach's alpha for the new items was α = 0.74, which is an acceptable level of consistency. To further understand the internal consistency of these 18 scales, we computed item-total correlations by correlating each item with the sum of the items (total score) to identify particular items that might be reducing reliability (Guilford, 1953). In this analysis, four items were found not to contribute to the instrument's reliability and were thus identified for potential removal from the scale, depending on the outcomes of the factor analysis. The four items were: “I designed my own research project,” “I found my research experience to be frustrating,” “My research experience was boring,” and “I did not care about the findings of my research project.” Table 2 presents the means and SDs for all scales and item-total correlations for the 18 new scales.

Table 2.

Item descriptions, means, and SDs for all scales and item-total correlations for 18 POS scales

Item Mean SD Item-total correlation
1. I designed my own research project. 3.48 1.38 0.08
2. I was responsible for the outcomes of my research. 2.02 1.11 0.45
3. I was in control of my research project. 2.48 1.37 0.32
4. The research question I worked on was important to me. 2.57 1.05 0.60
5. I had a personal reason for choosing the research project I worked on. 3.23 1.17 0.46
6. In conducting my research project, I actively sought advice and assistance. 1.95 0.91 0.66
7. My research project was exciting. 2.28 0.99 0.57
8. I faced challenges that I managed to overcome in completing my research project. 2.17 0.91 0.59
9. The findings of my research project gave me a sense of personal achievement. 2.08 0.93 0.67
10. My findings were important to the scientific community. 2.57 0.98 0.56
11. My research will help to solve a problem in the world. 2.77 1.05 0.55
12. I found my research experience to be frustrating. 3.15 1.07 0.30
13. In conducting my research I faced unexpected difficulties. 2.35 0.94 0.27
14. In conducting my research I was told what results to expect. 3.32 1.16 −0.55
15. In conducting my research I was told what procedures to follow. 2.35 1.16 0.24

Because the POS is a new instrument, an exploratory factor analysis was conducted to establish the internal structure of the tool and the dimensions of the project ownership construct (Thompson, 2004). As reported above, only 68 participants completed the full survey. Because this is usually considered too small a sample for a factor analysis, a Kaiser–Meyer–Olkin measure of sampling adequacy was calculated with the result of 0.653. This result, which is above the 0.5 benchmark, indicates an adequate sample, and a full exploratory factor analysis was therefore conducted. Descriptive statistics for each of the measures to be used in the factor analysis were calculated to make sure that the assumption of multivariate normality was not violated. A maximum likelihood factor analysis with oblimin with Kaiser normalization rotation was conducted to determine the internal structure of the survey and the dimensions of project ownership. For determining the number of factors to enter into the analysis, a scree plot of eigenvalues was graphed (see Figure 1). Based on this plot, a three-factor solution was specified.

Figure 1. Scree plot of eigenvalues.
Three components were extracted, accounting for 51.56% of the total variance of the observed variables. Table 3 presents the factor pattern matrix and the regression coefficients of each variable on each of the factors. The first factor, which accounted for 28% of the total variance, was constructed of emotional items with a negative orientation. Accordingly this factor was labeled “negative emotion.” The second factor, which accounted for 18.2% of the total variance, was constructed from the items dealing with aspects of project ownership and positive emotion categories. Accordingly, this factor was labeled “project ownership.” Three items in this factor had low pattern loadings, suggesting they should be considered for removal from the instrument: “To what extent does the word alert describe your experience of the laboratory course?” “To what extent does the word concentrating describe your experience of the laboratory course?” and “In conducting my research I faced unexpected difficulties.” The third factor, which accounted for 5.3% of the total variance, was constructed from items that dealt with student control over the research and its outcomes. Accordingly, this factor was labeled “degrees of agency.” Interestingly, the emotional category item of being worried factored with this group of items. It should be noted, though, that the pattern loadings for this factor were low. Broadly, the factor analysis parallels the original design of the POS and differentiates between the sections that deal directly with project ownership, emotion, and course design.

Table 3.

Pattern matrix and regression coefficients for the POS

Factor
1 (emotion) 2 (project ownership) 3 (course type)
To what extent does the word sad describe your experience of the laboratory course? 0.90
To what extent does the phrase feeling of distaste describe your experience of the laboratory course? 0.89
To what extent does the word disgust describe your experience of the laboratory course? 0.89
To what extent does the phrase feeling of revulsion describe your experience of the laboratory course? 0.87
To what extent does the word disdainful describe your experience of the laboratory course? 0.86
To what extent does the word mad describe your experience of the laboratory course? 0.85
To what extent does the word angry describe your experience of the laboratory course? 0.84
To what extent does the word downhearted describe your experience of the laboratory course? 0.81
To what extent does the word enraged describe your experience of the laboratory course? 0.80
To what extent does the word blameworthy describe your experience of the laboratory course? 0.78
To what extent does the word guilty describe your experience of the laboratory course? 0.76
To what extent does the word bashful describe your experience of the laboratory course? 0.75
To what extent does the word contemptuous describe your experience of the laboratory course? 0.72
To what extent does the word scared describe your experience of the laboratory course? 0.71
To what extent does the word scornful describe your experience of the laboratory course? 0.71
To what extent does the word discouraged describe your experience of the laboratory course? 0.68
To what extent does the word fearful describe your experience of the laboratory course? 0.64
To what extent does the word sheepish describe your experience of the laboratory course? 0.61
To what extent does the word shy describe your experience of the laboratory course? 0.59
To what extent does the word repentant describe your experience of the laboratory course? 0.58
I found my research experience to be frustrating. 0.53
To what extent does the word tentative describe your experience of the laboratory course? 0.41
Factor 2 (project ownership)
My research project was interesting. 0.88
My research project was exciting. 0.83
The findings of my research project gave me a sense of personal achievement. 0.78
To what extent does the word delighted describe your experience of the laboratory course? 0.76
The research question I worked on was important to me. 0.75
To what extent does the word happy describe your experience of the laboratory course? 0.74
To what extent does the word joyful describe your experience of the laboratory course? 0.72
In conducting my research project, I actively sought advice and assistance. 0.69
To what extent does the word astonished describe your experience of the laboratory course? 0.68
To what extent does the word surprised describe your experience of the laboratory course? 0.65
To what extent does the word amazed describe your experience of the laboratory course? 0.65
My research experience was boring. −0.60
I faced challenges that I managed to overcome in completing my research project. 0.57
My findings were important to the scientific community. 0.55
My research will help to solve a problem in the world. 0.52
I did not care about the findings of my research project. −0.52
I was responsible for the outcomes of my research. 0.49
I had a personal reason for choosing the research project I worked on. 0.41
To what extent does the word alert describe your experience of the laboratory course? 0.39
To what extent does the word concentrating describe your experience of the laboratory course? 0.30
In conducting my research I faced unexpected difficulties. 0.18
Factor 3 (degrees of agency)
To what extent does the word worried describe your experience of the laboratory course? 0.57
I was in control of my research project. −0.46
I designed my own research project. −0.37
In conducting my research I was told what procedures to follow. 0.36
In conducting my research I was told what results to expect. 0.30
  • ^a Exploratory factor analysis with maximum likelihood extraction and oblimin rotation with Kaiser normalization.

Based on the pattern loadings, item-total correlations, and reliability analysis, a revised version of the POS could be proposed. For the project ownership scale, factor 2 was modified by removing the items with low reliability (“My research experience was boring” and “I did not care about the findings of my research project”). Cronbach's alpha was calculated for the remaining items on this factor, with a result of α = 0.92, which indicates a very high level of reliability. Cronbach's alpha was also calculated for the factor 3 items dealing with degrees of agency. The resultant alpha value (α = 0.27) was very low; given this, these items were deemed unreliable and removed from the survey. Table 4 presents the final version of the survey, which consists of 16 items dealing with project ownership and positive emotion.
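Coefficient alpha itself is straightforward to compute from its definition as a function of the number of items, the item variances, and the variance of the total score. The short Python sketch below illustrates the calculation; the array name in the usage comment is hypothetical.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient (Cronbach's) alpha for an (n_respondents, n_items) matrix.

    alpha = k/(k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of summed scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical usage: `factor2_items` holds responses to the 16 retained
# items; this study reports alpha = 0.92 for that set.
# print(round(cronbach_alpha(factor2_items), 2))
```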

Table 4.

Final version of the POS

Strongly agree Agree Neither agree nor disagree Disagree Strongly disagree
My research will help to solve a problem in the world.
My findings were important to the scientific community.
I faced challenges that I managed to overcome in completing my research project.
I was responsible for the outcomes of my research.
The findings of my research project gave me a sense of personal achievement.
I had a personal reason for choosing the research project I worked on.
The research question I worked on was important to me.
In conducting my research project, I actively sought advice and assistance.
My research project was interesting.
My research project was exciting.
Very strongly Considerably Moderately Slightly Very slightly
To what extent does the word delighted describe your experience of the laboratory course?
To what extent does the word happy describe your experience of the laboratory course?
To what extent does the word joyful describe your experience of the laboratory course?
To what extent does the word astonished describe your experience of the laboratory course?
To what extent does the word surprised describe your experience of the laboratory course?
To what extent does the word amazed describe your experience of the laboratory course?

Differences between Educational Experiences on the POS

Previous research on the concept of project ownership demonstrated differences in the expression of project ownership between types of scientific inquiry educational experience (Hanauer et al., 2012). As reported above, the scales of the current survey were developed on the basis of this research, and as such, the survey has some degree of construct validity. However, to further assess the validity of the tool, a known-groups validity approach was taken. Because the original study used project ownership categories to differentiate between educational experiences, a basic measure of the validity of the POS is whether the new tool also systematically differentiates between educational experiences. In completing the survey, participants were asked to specify whether the laboratory course they were responding about was a traditional laboratory course or a research-based laboratory experience, and they were also required to describe the course in their own words. Students’ answers specifying course type were compared with their verbal descriptions of the courses they had taken in order to establish clear groupings. This analysis established two types of educational research experience that could be compared: traditional laboratory courses and research-based laboratory courses. The traditional laboratory course was characteristically described by students as being a “required introductory course” involving “various small experiments; not real research.” The research-based course was characteristically described by students in terms of the scientific question they explored, such as “Our current work focuses on the molecular basis of the unique colonizing ability of the D-genotype strains. Genomic sequence of L51-96 will reveal unique features of the special relationship between these strains and wheat.”

Thirty-two students reported taking a traditional laboratory course (12 different courses), and 33 reported taking a research-based laboratory course (12 different courses). These reports were corroborated through a content analysis of the students’ verbal descriptions. Based on previous research (Hanauer et al., 2012), a comparison between these two groups on the POS scales should provide some insight into the ability of this tool to differentiate between types of educational experience. The basic hypothesis was that undergraduate research-based courses would elicit systematically more positive ratings on the individual project ownership rating scales than traditional undergraduate laboratory courses.

Table 5 presents the means and SDs for each of the items in the revised POS. As can be seen from the descriptive statistics, for every item, the research-based group gave more positive ratings (indicated by lower means on the five-point Likert scales, with 1 = strongly agree and 5 = strongly disagree) than the traditional laboratory group. To further evaluate the hypothesis of a difference between educational experiences on the POS, we conducted a one-way analysis of variance (ANOVA) for each item, with the item rating as the dependent variable and group (traditional; research-based) as the independent variable. As seen in Table 5, nine of the 16 items showed significant differences between the traditional and research-based laboratory courses at the 0.05 or 0.01 levels. Cohen's d as a measure of effect size for the items with significant differences ranged from 0.51 to 0.90, indicating medium-to-large effect sizes (Cohen, 1992). Two additional items showed a trend toward difference, with p levels at 0.10 and Cohen's d values in the small-to-medium range (0.43 and 0.45). The remaining five items did not differ significantly.
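As a sketch of how the per-item comparisons in Table 5 could be reproduced, the following Python fragment runs a one-way ANOVA and computes a pooled-SD Cohen's d for a single item; with only two groups, the ANOVA F statistic is simply the square of the independent-samples t statistic. The function and array names are hypothetical, and the exact effect-size convention used by the authors is assumed rather than documented.

```python
import numpy as np
from scipy import stats

def compare_item(traditional: np.ndarray, research: np.ndarray):
    """One-way ANOVA and pooled-SD Cohen's d for one POS item's ratings."""
    f_stat, p_value = stats.f_oneway(traditional, research)
    n1, n2 = len(traditional), len(research)
    # Pooled standard deviation across the two groups.
    pooled_sd = np.sqrt(
        ((n1 - 1) * traditional.var(ddof=1) + (n2 - 1) * research.var(ddof=1))
        / (n1 + n2 - 2)
    )
    d = abs(traditional.mean() - research.mean()) / pooled_sd
    return f_stat, p_value, d

# Hypothetical usage for one item (32 traditional, 33 research-based ratings):
# f, p, d = compare_item(trad_ratings, res_ratings)
```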

Table 5.

Means, SDs, and one-way ANOVA comparison for traditional laboratory and research laboratory groups

POS item Traditional lab, mean (SD) Research lab, mean (SD) F(1,63) Cohen's d
My research project was interesting. 2.41 (0.87) 1.94 (0.83) 4.89** 0.55
My research project was exciting. 2.47 (0.84) 2.06 (1.05) 2.97* 0.43
The findings of my research project gave me a sense of personal achievement. 2.47 (1.02) 1.70 (0.68) 12.98*** 0.90
To what extent does the word delighted describe your experience of the laboratory course? 2.94 (1.10) 2.27 (1.09) 5.92*** 0.61
The research question I worked on was important to me. 2.72 (1.02) 2.33 (1.10) 2.12 0.36
To what extent does the word happy describe your experience of the laboratory course? 2.75 (1.14) 2.03 (0.85) 8.41*** 0.72
To what extent does the word joyful describe your experience of the laboratory course? 3.06 (1.16) 2.33 (1.11) 6.70*** 0.64
In conducting my research project, I actively sought advice and assistance. 2.28 (1.02) 1.61 (0.70) 9.65** 0.77
To what extent does the word astonished describe your experience of the laboratory course? 3.63 (1.26) 3.58 (1.09) 0.28 0.04
To what extent does the word surprised describe your experience of the laboratory course? 3.28 (1.22) 3.15 (1.00) 0.22 0.11
To what extent does the word amazed describe your experience of the laboratory course? 2.94 (1.24) 2.64 (1.02) 1.14 0.26
I faced challenges that I managed to overcome in completing my research project. 2.41 (0.95) 1.94 (0.86) 4.32** 0.51
My findings were important to the scientific community. 2.88 (1.07) 2.21 (0.74) 8.48*** 0.74
My research will help to solve a problem in the world. 3.03 (1.10) 2.42 (0.91) 5.98*** 0.60
I was responsible for the outcomes of my research. 2.25 (1.27) 1.76 (0.87) 3.35* 0.45
I had a personal reason for choosing the research project I worked on. 3.34 (1.18) 3.18 (1.23) 0.29 0.13
  • Five-point Likert scale: 1 = strongly agree to 5 = strongly disagree.

    *p < 0.10.
    **p < 0.05.
    ***p < 0.01.

Five project ownership items (“The findings of my research project gave me a sense of personal achievement,” “In conducting my research project, I actively sought advice and assistance,” “I faced challenges that I managed to overcome,” “My findings were important to the scientific community,” and “My research will help to solve a problem in the world”), the item “My research project was interesting,” and the positive emotion items of delight, happiness, and joy systematically differentiated between the groups, with the research-based laboratory group giving more positive ratings on each of these scales. Interestingly, three of the five nonsignificant items came from the same emotional domain (astonishment, surprise, and amazement), suggesting that, while this domain might measure some aspects of project ownership, it does not differentiate between the groups and might be considered for removal in a future iteration of the POS. Overall, based on the findings of the tests of difference, the POS appears to operate in the manner expected of a tool of this type.

CONCLUSIONS

The aim of this study was to present and evaluate a new instrument for measuring project ownership that is appropriate for the collection of larger-scale data. The revised 16-item POS was found to be highly reliable (α = 0.92). Construct validity was addressed by developing the items based on a 2-yr study that defined and operationalized project ownership in the context of undergraduate laboratory learning experiences. To bolster the evaluation of validity, we utilized a known-groups validity approach based on the prediction of group differences between types of undergraduate research experience. The findings revealed that the tool significantly differentiated between the groups on a majority of the items of the POS. Accordingly, it seems that the psychometric properties of the instrument allow it to be used for larger-scale data collection on project ownership.

Previous research had specified five categories of student statement that signified the presence of project ownership (Hanauer et al., 2012). These consisted of connections between personal history and scientific inquiry, agency and mentorship, expressions of excitement, overcoming challenging moments, and expressions of personal scientific achievement. The findings presented here on the POS items that significantly differentiate between traditional and research-based laboratory courses replicate some but not all of the original findings. Interestingly, the items dealing with the category of personal connections to scientific inquiry (“I had a personal reason for choosing the research project I worked on” and “The research question I worked on was important to me”) did not differ significantly between the groups. It is possible that this reflects the realities of most laboratory courses: even in research-based courses, research questions and projects are more often assigned than student defined. The Strobel REAL project examined in the previous study may be quite different in this respect from the research-based courses in the current study, which may account for this difference. All the other categories from the original study were supported by significant differences in the items of the POS. The use of discrete emotion scales provided clear evidence of the emotional valence of different undergraduate research experiences. The items dealing with interest, delight, happiness, and joy were all significantly more positive for the research-based course, and the excitement item showed a trend in the same direction.

In the process of tool validation, several items of the original survey were removed. Importantly, the items under the heading of degrees of agency, involving ratings of the degree of control over and design of the research project, were not found to have psychometric properties that would allow them to be used in the current version of the POS. However, from a theoretical perspective, these types of items would seem to have some significance; as such, they may need to be reformulated and psychometrically validated in a future version of the POS.

As with any instrument development process, the current study has some limitations. The main limitation concerns the size and type of the sample. The sample was small and not random. Students came from 21 different institutions across the United States but decided independently whether to complete the survey. In accordance with IRB criteria, participation was voluntary, and this may have produced a self-selected group of participants (e.g., those with an interest in ownership). However, the adequacy of the sample size was statistically evaluated and found to be sufficient for conducting a factor analysis. As with the development of any new instrument, we suggest that future use of the revised POS include an additional iteration of the analysis of the tool's psychometric properties.

Underpinning the research on project ownership is the idea that it is important to understand, facilitate, and measure both the psychosocial and cognitive aspects of undergraduate research experiences. Project ownership is one of several potential measures that could be used to further explore the elements of undergraduate research experiences that influence a student's decision to stay in the sciences and become an active researcher (Adedokun et al., 2013; Eagan et al., 2013). Project ownership may be particularly important, in that it is tied directly to the research project and educational experience of the student. The evidence presented here demonstrates that the POS is a useful tool for measuring project ownership. At this stage, it is important for the instrument to be used by a broad group of researchers and evaluators of undergraduate research and laboratory experiences to yield further insights into its validity and reliability as a measure of project ownership. The POS should also be used to characterize a broad range of research experiences in order to elucidate the relationship between students’ sense of ownership and the gains they make from participating in research. Results of this work will be useful for identifying design features of undergraduate research experiences that enhance project ownership. Ultimately, the aim is to understand the factors that enrich student research experiences and help more students stay in the sciences for the very best reason—the joy of being a researcher.

ACKNOWLEDGMENTS

D.I.H. was funded through a subaward (IUP RI log no. 0910-028) from a Howard Hughes Medical Institute professorship award to Graham Hatfull. CUREnet is supported by the National Science Foundation under grant no. 1554681.

Footnotes

Address correspondence to: David I. Hanauer (hanauer@iup.edu).

“ASCB®” and “The American Society for Cell Biology®” are registered trademarks of The American Society for Cell Biology.

REFERENCES

  1.