`vignettes/form_prereg2D_v1.Rmd`

This vignette shows the Preregistration Template for Secondary Data Analysis form. It can be initialized as follows:

```
# Initialize an empty preregistration object based on the prereg2D_v1 form
initialized_prereg2D_v1 <-
  preregr::prereg_initialize(
    "prereg2D_v1"
  );
```

After this, content can be specified with `preregr::prereg_specify()` or `preregr::prereg_justify()`. To check the next field(s) for which content still has to be specified, use `preregr::prereg_next_item()`.

The form’s metadata is:

| field | content |
|---|---|
| title | Preregistration of secondary data analysis: A template and tutorial |
| author | Olmo R. van den Akker, Sara J. Weston, Lorne Campbell, William J. Chopik, Rodica Ioana Damian, Pamela E. Davis-Kean, Andrew N. Hall, Jessica E. Kosie, Elliott Kruse, Jerome Olsen, Stuart J. Ritchie, K. D. Valentine, Anna E. van ’t Veer, Marjan Bakker |
| version | 1.0 |
| comments | Please cite the associated paper when using this preregistration template (see https://doi.org/10.15626/MP.2020.2625) |

The form is defined as follows (use preregr::form_show() to show the form in the console, instead):

```
preregr::form_knit(
  "prereg2D_v1"
);
```

Here we present a preregistration template for the analysis of secondary data and provide guidance for its effective use. We are aware that the number of questions (25) in the template may be overwhelming but it is important to note that not every question is relevant for every preregistration. Our aim was to be inclusive and cover all bases in light of the diversity of secondary data analyses. Even though none of the questions are mandatory, we do believe that an elaborate preregistration is preferable over a concise preregistration simply because it restricts more researcher degrees of freedom. We therefore recommend that authors answer as many questions in as much detail as possible. And, if questions are not applicable, it would be good practice to also specify why this is the case so that readers can assess your reasoning.

Effectively preregistering a study is challenging and can take a lot of time but, like Nosek et al. (2019) and many others, we believe it can improve the interpretability, verifiability and rigor of your studies and is therefore more than worth it if you want both yourself and others to have more confidence in your research findings.

The current template is merely one building block toward a more effective preregistration infrastructure and, given the ongoing developments in this area, will be a work in progress for the foreseeable future. Any feedback is therefore greatly appreciated. Please send any feedback to the corresponding author, Olmo van den Akker (ovdakker@gmail.com).

Title

title

Provide the working title of your study.

*Example*: Do religious people follow the golden rule? Assessing the link between religiosity and prosocial behavior using data from the Wisconsin Longitudinal Study.

Authors

authors

Name the authors of this preregistration.

*Example*: Josiah Carberry (JC) – ORCID iD: https://orcid.org/0000-0002-1825-0097; Pomona Sprout (PS) – Personal webpage: https://en.wikipedia.org/wiki/Hogwarts_staff#Pomona_Sprout

Research questions

research_questions

List each research question included in this study.

*Example*: RQ1 = Are more religious people more prosocial than less religious people? RQ2 = Does the relationship between religiosity and prosociality differ for people with different religious affiliations?

Hypotheses

hypotheses

Please provide the hypotheses of your secondary data analysis. Make sure they are specific and testable, and make it clear what your statistical framework is (e.g., Bayesian inference, NHST). In case your hypothesis is directional, do not forget to state the direction. Please also provide a rationale for each hypothesis.

*Example*: “Do to others as you would have them do to you” (Luke 6:31). This golden rule is taught by all major religions, in one way or another, to promote prosociality (Parliament of the World’s Religions, 1993). Religious prosociality is the idea that religions facilitate behavior that is beneficial for others at a personal cost (Norenzayan & Shariff, 2008). The encouragement of prosocial behavior by religious teachings appears to be fruitful: a considerable amount of research shows that religion is positively related to prosocial behavior (e.g., Friedrichs, 1960; Koenig, McGue, Krueger, & Bouchard, 2007; Morgan, 1983). For instance, religious people have been found to give more money to, and volunteer more frequently for, charitable causes than their non-religious counterparts (e.g., Grønbjerg & Never, 2004; Lazerwitz, 1962; Pharoah & Tanner, 1997). Also, the more important people viewed their religion, the more likely they were to do volunteer work (Youniss, McLellan, & Yates, 1999). Based on the above we expect that religiosity is associated with prosocial behavior in our sample as well. To assess this prediction, we will test the following hypotheses using a null hypothesis significance testing framework:

H0(1) = In men and women who graduated from Wisconsin high schools in 1957, there is no association between religiosity and prosociality.

H1(1) = In men and women who graduated from Wisconsin high schools in 1957, there is a positive association between religiosity and prosociality.

Dataset

dataset

Name and describe the dataset(s), and if applicable, the subset(s) of the data you plan to use. Useful information to include here is the type of data (e.g., cross-sectional or longitudinal), the general content of the questions, and some details about the respondents. In the case of longitudinal data, information about the survey’s waves is useful as well.

*Example*: To answer our research questions we will use a dataset from the Wisconsin Longitudinal Study (WLS; Herd, Carr, & Roan, 2014). The WLS provides long-term data on a random sample of all the men and women who graduated from Wisconsin high schools in 1957. The WLS involves twelve waves of data. Six waves were collected from the original participants or their parents (1957, 1964, 1975, 1992, 2004, and 2011), four were collected from a selected sibling (1977, 1994, 2005, and 2011), one from the spouse of the original participant (2004), and one from the spouse of the selected sibling (2006). The questions vary across waves and are related to domains as diverse as socio-economic background, physical and mental health, and psychological makeup. We will use the subset consisting of the 1957 graduates who completed the follow-up 2003-2005 wave of the WLS dataset because it includes specific modules on religiosity and volunteering.

Openness of data

dataset_open

Specify the extent to which the dataset is open or publicly available. Make note of any barriers to accessing the data, even if it is publicly available.

*Example*: The dataset we will use is publicly available, but you need to formally agree to acknowledge the funding source for the Wisconsin Longitudinal Study, to cite the data release in any manuscripts, working papers, or published articles using these data, and to inform WLS about any published papers for use in the WLS bibliography and for reporting purposes. To do this you need to submit some information about yourself on the website (https://www.ssc.wisc.edu/wlsresearch/data/downloads/). You will then receive an email with a download link.

Access to data

data_access

How can the data be accessed? Provide a persistent identifier or link if the data are available online, or give a description of how you obtained the dataset.

*Example*: The data can be accessed by going to the following link and searching for the variables that are specified in Q12 of this preregistration: https://www.ssc.wisc.edu/wlsresearch/documentation/browse/?label=&variable=&wave_108=on&searchButton=Search

Date(s) data were accessed

data_date

Specify the date of download and/or access for each author.

*Example*: PS: Downloaded 12 February 2019; Accessed 12 February 2019. JC: Downloaded 3 January 2019 (estimated); Accessed 12 February 2019. We will use the data accessed by JC on 12 February 2019 for our statistical analyses.

Data collection

data_collection

If the data collection procedure is well documented, provide a link to that information. If the data collection procedure is not well documented, describe, to the best of your ability, how data were collected.

*Example*: The WLS data was and is being collected by the University of Wisconsin Survey Center for use by the research community. The origins of the WLS can be traced back to a state-sponsored questionnaire administered during the spring of 1957 at all Wisconsin high schools to students in their final year. Therefore, the dataset constitutes a specific sample not necessarily representative of the United States as a whole. Most panel members were born in 1939, and the sample is broadly representative of white, non-Hispanic American men and women who completed at least a high school education. A flowchart for the data collection can be found here: https://www.ssc.wisc.edu/wlsresearch/about/flowchart/cor459d7.pdf

Data codebook

data_codebook

Some studies offer codebooks to describe their data. If such a codebook is publicly available, link to it here or upload the document. If not, provide other available documentation. Also provide guidance on what parts of the codebook or other documentation are most relevant.

*Example*: The codebook for the dataset we use can be found here: https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k. We will mainly use questions from the mail survey about religion and spirituality, and the phone survey on volunteering, but will also use some questions from other modules (see the answer to Q12).

Manipulated variable(s)

var_manipulated

If you are going to use any manipulated variables, identify them here. Describe the variables and the levels or treatment arms of each variable (note that this is not applicable for observational studies and meta-analyses). If you are collapsing groups across variables, this should be explicitly stated, including the relevant formula. If your further analysis is contingent on a manipulation check, describe your decision rules here.

*Example*: Not applicable.

Measured variable(s)

var_measured

If you are going to use measured variables, identify them here. Describe both outcome measures as well as predictors and covariates and label them accordingly. If you are using a scale or an index, state the construct the scale/index represents, which items the scale/index will consist of, how these items will be aggregated, and whether this aggregation is based on a recommendation from the study codebook or validation research. When the aggregation of the items is based on exploratory factor analysis (EFA) or confirmatory factor analysis (CFA), also specify the relevant details (EFA: rotation, how the number of factors will be determined, how best fit will be selected; CFA: how loadings will be specified, how fit will be assessed, which residual variance terms will be correlated). If you are using any categorical variables, state how you will code them in the statistical analyses.

*Example*: Religiosity (IV): Religiosity is measured using a newly created scale with a subset of items from the Religion and Spirituality module of the 2004 mail survey (described here: https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gmail_religion). The scale includes general questions about how religious/spiritual the individual is and how important religion/spirituality is to them. Importantly, the questions are not specific to a particular denomination and are on the same response scale. The specific variables are as follows:

1. il001rer: How religious are you?
2. il002rer: How spiritual are you?
3. il003rer: How important is religion in your life?
4. il004rer: How important is spirituality in your life?
5. il005rer: How important was it, or would it have been if you had children, to send your children for religious or spiritual instruction?
6. il006rer: How closely do you identify with being a member of a religious group?
7. il007rer: How important is it for you to be with other people who are the same religion as you?
8. il008rer: How important do you think it is for people of your religion to marry other people who are the same religion?
9. il009rer: How strongly do you believe that one should stick to a particular faith?
10. il010rer: How important was religion in your home when you were growing up?
11. il011rer: When you have important decisions to make in your life, how much do you rely on your religious or spiritual beliefs?
12. il012rer: How much would your spiritual or religious beliefs influence your medical decisions if you were to become gravely ill?

The levels of all of these variables are indicated by a Likert scale with the following options: (1) Not at all; (2) Not very; (3) Somewhat; (4) Very; (5) Extremely, as well as ‘System Missing’ (the participant did not provide an answer) and ‘Refused’ (the participant refused to answer the question). Variables il006rer, il008rer, and il012rer additionally include the option ‘Don’t know’ (the participant stated that they did not know how to answer the question). We will use the average score (after omitting non-numeric and ‘Don’t know’ responses) on the twelve variables as a measure of religiosity. This average score is constructed by ourselves and was not already part of the dataset.

Prosociality (DV): In line with previous research (Konrath, Fuhrel-Forbis, Lou, & Brown, 2012), we will use three measures of prosociality that measure three aspects of engagement in other-oriented activities (see Brookfield, Parry, & Bolton, 2018 for the link between prosociality and volunteering). The prosociality variables come from the Volunteering module of the 2004 phone survey. The codebook of that module can be found here: https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gvol. The three measures of prosociality we will use are:

1. gv103re: Did the graduate do volunteer work in the last 12 months? This dichotomous variable assesses whether or not the participant has engaged in any volunteering activities in the last 12 months. The levels of this variable are yes/no. Yes will be coded as ‘1’, no will be coded as ‘0’.
2. gv109re: Number of graduate’s other volunteer activities in the past 12 months. This variable is a summary index providing a quantitative measure of the participant’s volunteering activities. Scores on this variable range from 1 to 5 and reflect the number of the previous five questions to which the participant answered YES. The previous five questions assess whether or not the participant volunteered at any of the following organization types: (1) religious organizations; (2) school or educational organization; (3) political group or labor union; (4) senior citizen group or related organization; (5) other national or local organizations. For each of these questions the answer ‘yes’ is coded as 1 and the answer ‘no’ is coded as 0.
3. gv111re: How many hours did the graduate volunteer during a typical month in the last 12 months? This is a numerical variable that provides information on how many hours per month, on average, the participant volunteered.

The three variables will be treated as separate measures in the dataset and do not require manual aggregation.

Number of Siblings (Covariate): We will include the participant’s number of siblings as a control variable because many religious families are large (Pew Research Center, 2015) and it can be argued that cooperation and trust arise more naturally in larger families because of the larger number of social interactions in those families. To measure participants’ number of siblings we will use the variable gk067ss (the total number of siblings ever born) from the 2004 phone survey Siblings module (see https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gsib). This is a numerical variable with the possibility for the participant to state “I don’t know”. At the interview participants were instructed to include “siblings born alive but no longer living, as well as those alive now and to include step-brothers and step-sisters and children adopted by their parents.”

Agreeableness (Covariate): We will include the summary score for agreeableness (ih009rec, see https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gmail_values) in the analysis as a control variable because a previous study (on the same dataset, see the answer to Q18) we were involved in showed a positive association between agreeableness and prosociality. Because previous research also indicates a positive association between agreeableness and religiosity (Saroglou, 2002) we need to include agreeableness as a control variable to disentangle the influence of religiosity on prosociality and the influence of agreeableness on prosociality. The variable ih009rec is a sum score of the variables ih003rer-ih008rer (To what extent do you agree that you see yourself as someone who is talkative / is reserved [reverse coded] / is full of energy / tends to be quiet [reverse coded] / who is sometimes shy or inhibited [reverse coded] / who generates a lot of enthusiasm). All of these were scored from 1 to 6 (1 = “agree strongly”, 2 = “agree moderately”, 3 = “agree slightly”, 4 = “disagree slightly”, 5 = “disagree moderately”, 6 = “disagree strongly”), while participants could also refuse to answer the question. If a participant refused to answer one of the questions, that participant’s score was not included in the sum score variable ih009rec.
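The religiosity score described in this example (the mean of the item scores after dropping non-numeric and ‘Don’t know’ responses) could be computed along the following lines. This is only a sketch: the data frame, the three items shown, and the way missing codes are stored as text are assumptions for illustration, and the actual coding in the WLS files may differ.

```r
# Hypothetical responses for three participants on three of the twelve items
# (made-up values; the real WLS coding of missing responses may differ).
responses <- data.frame(
  il001rer = c("4", "Refused", "2"),
  il002rer = c("5", "3", "System Missing"),
  il006rer = c("Don't know", "3", "1"),
  stringsAsFactors = FALSE
)

# Keep only the Likert codes 1 to 5; every other response becomes NA.
to_likert <- function(x) {
  out <- suppressWarnings(as.integer(x))
  out[!(out %in% 1:5)] <- NA_integer_
  out
}

items <- as.data.frame(lapply(responses, to_likert))

# Mean of the available items per participant, ignoring NA responses.
religiosity <- rowMeans(items, na.rm = TRUE)
print(religiosity)
```

With `na.rm = TRUE`, a participant's score is the mean of whichever items they answered, so it is worth stating in the preregistration how many answered items are required for a score to count.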

Inclusion and exclusion criteria

inclusion

Which units of analysis (respondents, cases, etc.) will be included or excluded in your study? Taking these inclusion/exclusion criteria into account, indicate the (expected) sample size of the data you’ll be using for your statistical analyses to the best of your knowledge. In the next few questions, you will be asked to refine this sample size estimation based on your judgments about missing data and outliers.

*Example*: Initially, the WLS consisted of 10,317 participants. As we are not interested in a specific group of Wisconsin people, we will not exclude any participants from our analyses. However, only 7,265 participants filled out the questions on prosociality and the number of siblings in the phone survey and only 6,845 filled out the religiosity items in the mail survey (Herd et al., 2014). This corresponds to a response rate of 73% and 69% respectively. Because we do not know whether the participants that did the mail survey also did the phone survey, our minimum expected sample size is 10,317 * 0.73 * 0.69 = 5,197.

Missing data

missing

What do you know about missing data in the dataset (i.e., overall missingness rate, information about differential dropout)? How will you deal with incomplete or missing data? Based on this information, provide a new expected sample size.

*Example*: The WLS provides a documented set of missing codes. In Table 1 (see https://doi.org/10.15626/MP.2020.2625) you can find missingness information for every variable we will include in the statistical analyses. ‘System missing’ refers to the number of participants that did not or could not complete the questionnaire. ‘Partial interview’ refers to the number of participants that did not get that particular question because they were only partially interviewed. The rest of the codes are self-explanatory. Importantly, some respondents refused to answer the religiosity questions. These respondents apparently felt strongly about these questions, which could indicate that they are either very religious or very anti-religious. If that is the case, the respondent’s propensity to respond is directly associated with their level of religiosity and the data are missing not at random (MNAR). Because it is not possible to test the stringent assumptions of the modern techniques for handling MNAR data, we will resort to simple listwise deletion. It must be noted that this may bias our data as we may lose respondents who are very religious or anti-religious. However, we believe this bias to be relatively harmless given that our sample still includes many respondents that provided extreme responses to the items about the importance of the different facets of religion (see https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gmail_religion). Moreover, because our initial sample size is very large, statistical power is not substantially compromised by omitting these respondents. That being said, we will extensively discuss any potential biases resulting from missing data in the limitations section of our paper. 
Employing listwise deletion leads to an expected minimum number of 10,317 * 0.30 * 0.70 * 0.64 = 1,387 participants for the binary logistic regression, and an expected minimum number of 10,317 * 0.24 * 0.70 * 0.64 = 1,109 (gv109re) and 10,317 * 0.23 * 0.70 * 0.64 = 1,063 (gv111re) for the linear regressions.
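The expected sample sizes in this example are products of the initial sample size and the proportions quoted in the text. A minimal sketch of that arithmetic, using only those figures (the object names are illustrative and not part of the template):

```r
# Initial WLS sample size quoted in the example.
n_initial <- 10317

# First proportion in each product, one per outcome variable.
p_outcome <- c(gv103re = 0.30, gv109re = 0.24, gv111re = 0.23)

# The two remaining proportions shared by all three products.
p_shared <- c(0.70, 0.64)

# Expected minimum sample size per analysis, rounded to whole participants.
n_expected <- round(n_initial * p_outcome * prod(p_shared))
print(n_expected)
```

Keeping the proportions in named objects makes it easy to update the expected sample sizes if the missingness information changes.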

Outliers

outliers

If you plan to remove outliers, how will you define what a statistical outlier is in your data? Please also provide a new expected sample size. Note that this will be the definitive expected sample size for your study and you will use this number to do any power analyses.

*Example*: The dataset probably does not involve any invalid data since the dataset has been previously ‘cleaned’ by the WLS data controllers and any clearly unreasonably low or high values have been removed from the dataset. However, to be sure we will create a box and whisker plot for all continuous variables (the dependent variables gv109re and gv111re, the covariate gk067ss, and the scale for religiosity) and remove any data point that appears to be more than 1.5 times the IQR away from the 25th and 75th percentile. Based on normally distributed data, we expect that 2.1% of the data points will be removed this way, leaving 1,358 out of 1,387 participants for the binary regression with gv103re as the outcome variable and 1,086 out of 1,109 participants, and 1,041 out of 1,063 participants for the linear regressions with gv109re and gv111re as the outcome variables, respectively.
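The outlier rule in this example (more than 1.5 times the IQR beyond the 25th or 75th percentile) and the resulting sample sizes can be sketched as below. The function name `flag_outliers` and the `hours` vector are made up for illustration. The percentiles use R's default `quantile()` method, which is an assumption, since the example does not say how they are computed.

```r
# Flag values more than k * IQR below the 25th or above the 75th percentile.
flag_outliers <- function(x, k = 1.5) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  x < q[1] - k * iqr | x > q[2] + k * iqr
}

# Hypothetical monthly volunteering hours (made-up values, not WLS data).
hours <- c(2, 4, 4, 5, 6, 6, 8, 10, 60)
is_outlier <- flag_outliers(hours)
hours_clean <- hours[!is_outlier]

# Expected sample sizes after removing the 2.1% of data points assumed
# in the example.
n_before <- c(gv103re = 1387, gv109re = 1109, gv111re = 1063)
n_after <- round(n_before * (1 - 0.021))
print(n_after)
```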

Sampling weights

sampling_weights

Are there sampling weights available with this dataset? If so, are you using them or are you using your own sampling weights?

*Example*: The WLS dataset does not include sampling weights and we will not use our own sampling weights as we do not seek to make any claims that are generalizable to the national population.

Previous work

previous_work

List the publications, working papers (in preparation, unpublished, preprints), and conference presentations (talks, posters) you have worked on that are based on the dataset you will use. For each work, list the variables you analyzed, but limit yourself to variables that are relevant to the proposed analysis. If the dataset is longitudinal, also state which wave of the dataset you analyzed. Importantly, some of your team members may have used this dataset, and others may not have. It is therefore important to specify the previous works for every co-author separately. Also mention relevant work on this dataset by researchers you are affiliated with, as their knowledge of the data may have spilled over to you. When the provider of the data also has an overview of all the work that has been done using the dataset, link to that overview.

*Example*: Both authors (PS and JC) have previously used the Graduates 2003-2005 wave to assess the link between Big Five personality traits and prosociality. The variables we used to measure the Big Five personality traits were ih001rei (extraversion), ih009rei (agreeableness), ih017rei (conscientiousness), ih025rei (neuroticism), and ih032rei (openness). The variables we used to measure prosociality were ih013rer (“To what extent do you agree that you see yourself as someone who is generally trusting?”), ih015rer (“To what extent do you agree that you see yourself as someone who is considerate to almost everyone?”), and ih016rer (“To what extent do you agree that you see yourself as someone who likes to cooperate with others?”). We presented the results at the ARP conference in St. Louis in 2013 and we are currently finalizing a manuscript based on these results. Additionally, a senior graduate student in JC’s lab used the Graduates 2011 wave for exploratory analyses on depression. She linked depression to alcohol use and general health indicators. She did not look at variables related to religiosity or prosociality. Her results have not yet been submitted anywhere. An overview of all publications based on the WLS data can be found here: https://www.ssc.wisc.edu/wlsresearch/publications/pubs.php?topic=ALL.

Prior knowledge

prior_knowledge

What prior knowledge do you have about the dataset that may be relevant for the proposed analysis? Your prior knowledge could stem from working with the data first-hand, from reading previously published research, or from codebooks. Also provide any relevant knowledge of subsets of the data you will not be using. Provide prior knowledge for every author separately.

*Example*: In a previous study (mentioned in Q17) we used three prosociality variables (ih013rer, ih015rer, and ih016rer) that may be related to the prosociality variables we use in this study. We found that ih013rer, ih015rer, and ih016rer are positively associated with agreeableness (ih009rec). Because previous research (on other datasets) shows a positive association between agreeableness and religiosity (Saroglou, 2002) agreeableness may act as a confounding variable. To account for this we will include agreeableness in our analysis as a control variable. We did not find any associations between prosociality and the other Big Five variables.

Statistical model

model

For each hypothesis, describe the statistical model you will use to test the hypothesis. Include the type of model (e.g., ANOVA, multiple regression, SEM) and the specification of the model. Specify any interactions and post-hoc analyses and remember that any test not included here must be labeled as an exploratory test in the final paper.

*Example*: Our first hypothesis will be tested using three analyses since we use three variables to measure prosociality. For each, we will run a directional null hypothesis significance test to see whether a positive effect exists of religiosity on prosociality. For the first outcome (gv103re: Did the graduate do volunteer work in the last 12 months?) we will run a logistic regression with religiosity, the number of siblings, and agreeableness as predictors. For the second and third outcomes (gv109re: Number of graduate’s other volunteer activities in the past 12 months; gv111re: How many hours did the graduate volunteer during a typical month in the last 12 months?) we will run two separate linear regressions with religiosity, the number of siblings, and agreeableness as predictors. The code we will use for all these analyses can be found at https://osf.io/e3htr.
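The three models in this example could be specified in R roughly as follows. This is not the authors' analysis code (that is at the OSF link above) but a sketch on simulated stand-in data. The column names, the simulated values, and the helper `one_sided_p` are all assumptions for illustration.

```r
set.seed(123)
n <- 300

# Simulated stand-in data (hypothetical names and values; not WLS data).
dat <- data.frame(
  religiosity   = runif(n, min = 1, max = 5),
  siblings      = rpois(n, lambda = 3),
  agreeableness = rnorm(n, mean = 25, sd = 5)
)
dat$volunteered  <- rbinom(n, size = 1, prob = plogis(-1 + 0.3 * dat$religiosity))
dat$n_activities <- rpois(n, lambda = 1 + 0.1 * dat$religiosity)
dat$hours        <- 2 + 0.5 * dat$religiosity + rnorm(n, sd = 3)

# Logistic regression for the binary outcome (stand-in for gv103re).
fit_volunteered <- glm(volunteered ~ religiosity + siblings + agreeableness,
                       family = binomial, data = dat)

# Two separate linear regressions (stand-ins for gv109re and gv111re).
fit_activities <- lm(n_activities ~ religiosity + siblings + agreeableness, data = dat)
fit_hours      <- lm(hours ~ religiosity + siblings + agreeableness, data = dat)

# Directional test: convert the two-sided p-value reported by summary()
# into a one-sided p-value for a positive effect.
one_sided_p <- function(fit, term = "religiosity") {
  row <- summary(fit)$coefficients[term, ]
  p_two_sided <- row[[4]]
  if (row[["Estimate"]] > 0) p_two_sided / 2 else 1 - p_two_sided / 2
}

p_values <- c(
  volunteered  = one_sided_p(fit_volunteered),
  n_activities = one_sided_p(fit_activities),
  hours        = one_sided_p(fit_hours)
)
print(p_values)
```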

Effect size

effect_size

If applicable, specify a predicted effect size or a minimum effect size of interest for all the effects tested in your statistical analyses.

*Example*: For the logistic regression with ‘Did the graduate do volunteer work in the last 12 months?’ as the outcome variable, our minimum effect size of interest is an odds ratio of 1.05. This means that a one-unit increase on the religiosity scale would be associated with a 1.05 factor change in the odds of having done volunteering work in the last 12 months versus not having done so. For the linear regressions with ‘The number of graduate’s volunteer activities in the last 12 months’ and ‘How many hours did the graduate volunteer during a typical month in the last 12 months?’ as the outcome variables, the minimum regression coefficients of interest of the religiosity variables are 0.05 and 0.5, respectively. This means that a one-unit increase in the religiosity scale would be associated with 0.05 extra volunteering activities in the last 12 months and with 0.5 more hours of volunteering work during a typical month in the last 12 months. All of these smallest effect sizes of interest are based on our own intuition. To make comparisons possible between the effects in our study and similar effects in other studies, the unstandardized linear regression coefficients will be transformed into standardized regression coefficients using the following formula: $\beta_i = B_i \, (s_i / s_y)$, where $B_i$ is the unstandardized regression coefficient of independent variable $i$, and $s_i$ and $s_y$ are the standard deviations of the independent and dependent variable respectively.

*Comments*: A predicted effect size is ideally based on a representative preliminary study or meta-analytical result. If those are not available, it is also possible to use your own intuition. For advice on setting a minimum effect size of interest, see Lakens, Scheel, & Isager (2018) and Funder and Ozer (2019).
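The standardization in this example multiplies each unstandardized coefficient by the ratio of the predictor's standard deviation to the outcome's standard deviation. The sketch below applies that to simulated data (hypothetical values, not WLS data) and cross-checks the result against a regression on z-scored variables. The helper `z` is introduced only for this illustration.

```r
set.seed(1)
n <- 200

# Simulated data (hypothetical; not WLS data).
religiosity <- rnorm(n, mean = 3, sd = 1)
siblings    <- rpois(n, lambda = 3)
hours       <- 5 + 0.5 * religiosity + 0.2 * siblings + rnorm(n, sd = 4)

# Unstandardized coefficient B_i for religiosity.
fit <- lm(hours ~ religiosity + siblings)
B <- unname(coef(fit)["religiosity"])

# Standardized coefficient: beta_i = B_i * (s_i / s_y).
beta <- B * sd(religiosity) / sd(hours)

# Cross-check: refit the same model on z-scored variables.
z <- function(v) (v - mean(v)) / sd(v)
fit_z <- lm(z(hours) ~ z(religiosity) + z(siblings))
beta_check <- unname(coef(fit_z)[2])

print(c(beta = beta, beta_check = beta_check))
```

The two values agree up to floating-point error, which is a quick way to confirm that the formula was applied to the intended variables.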

Power

power

Present the statistical power available to detect the predicted effect size(s) or the smallest effect size(s) of interest, OR present the accuracy that will be obtained for estimation. Use the sample size after updating for missing data and outliers, and justify the assumptions and parameters used (e.g., give an explanation of why anything smaller than the smallest effect size of interest would be theoretically or practically unimportant).

*Example*: The sample size after updating for missing data and outliers is 1,358 for the logistic regression with gv103re as the outcome variable, and 1,086 and 1,041 for the linear regressions with gv109re and gv111re as the outcome variables, respectively. For all three analyses this corresponds to a statistical power of approximately 1.00 when assuming our minimum effect sizes of interest. For the linear regressions we additionally assumed the variance explained by the predictor to be 0.2 and the residual variance to be 1.0 (see figure below for the full power analysis of the regression with the lowest sample size). For the logistic regression we assumed an intercept of -1.56 corresponding to a situation where half of the participants have done volunteer work in the last year (see the R-code for the full power analysis at https://osf.io/f96rn).

Inference criteria

inference_criteria

What criteria will you use to make inferences? Describe the information you will use (e.g. specify the p-values, effect sizes, confidence intervals, Bayes factors, specific model fit indices), as well as cut-off criteria, where appropriate. Will you be using one- or two-tailed tests for each of your analyses? If you are comparing multiple conditions or testing multiple hypotheses, will you account for this, and if so, how?

*Example*: We will make inferences about the association between religiosity and prosociality based on the p-values and the size of the regression coefficients of the religiosity variable in the three main regressions. We will conclude that a regression analysis supports our hypothesis if both the p-value is smaller than .01 and the regression coefficient is larger than our minimum effect size of interest. We chose an alpha of .01 to account for the fact that we do a test for each of the three regressions (0.05/3, rounded down). If the conditions above hold for all three regressions, we will conclude that our hypothesis is fully supported; if they hold for one or two of the regressions, we will conclude that our hypothesis is partially supported; and if they hold for none of the regressions, we will conclude that our hypothesis is not supported.
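The decision rule in this example can be written down explicitly, which helps to check that it is unambiguous before the data are analyzed. The sketch below uses made-up p-values and estimates, and the function and column names are illustrative. The p-values are assumed to be one-sided already, and the logistic regression estimate is assumed to be expressed as an odds ratio.

```r
# Corrected alpha: 0.05 / 3 is about 0.0167, rounded down to two decimals.
alpha <- floor(0.05 / 3 * 100) / 100

# A regression supports the hypothesis if p < alpha and the estimate
# exceeds the minimum effect size of interest.
supports_hypothesis <- function(p_value, estimate, min_effect, alpha) {
  p_value < alpha & estimate > min_effect
}

# Hypothetical results for the three regressions (made-up numbers).
results <- data.frame(
  outcome    = c("gv103re", "gv109re", "gv111re"),
  p_value    = c(0.002, 0.030, 0.004),
  estimate   = c(1.10, 0.08, 0.70),
  min_effect = c(1.05, 0.05, 0.50)
)
results$supported <- supports_hypothesis(
  results$p_value, results$estimate, results$min_effect, alpha
)

# Overall conclusion: all three, one or two, or none of the regressions.
conclusion <- switch(
  as.character(sum(results$supported)),
  "0" = "not supported",
  "3" = "fully supported",
  "partially supported"
)
print(conclusion)
```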

Assumptions

assumptions

What will you do should your data violate assumptions, your model not converge, or some other analytic problem arise?

*Example*: When (a) the distribution of the number of volunteering hours (gv111re) is significantly non-normal according to the Kolmogorov-Smirnov test (Massey, 1951), and/or (b) the linearity assumption is violated (i.e., the points are asymmetrically distributed around the diagonal line when plotting observed versus the predicted values), we will log-transform the variable.
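The normality check and transformation in this example could look roughly like this. The data are simulated and strictly positive so that the logarithm is defined. The 0.05 threshold is an assumption, since the example does not state the significance level for this check. Because the normal distribution's mean and standard deviation are estimated from the same sample, the resulting p-value is only approximate.

```r
set.seed(42)

# Simulated, right-skewed volunteering hours (hypothetical; not WLS data).
hours <- rlnorm(500, meanlog = 1.5, sdlog = 1)

# Kolmogorov-Smirnov test against a normal distribution with the sample's
# own mean and standard deviation.
ks_raw <- stats::ks.test(hours, "pnorm", mean = mean(hours), sd = sd(hours))

# Log-transform only if the test indicates non-normality
# (assumed threshold: 0.05).
hours_model <- if (ks_raw$p.value < 0.05) log(hours) else hours

print(ks_raw$p.value)
```

If the real variable contains zeros, `log()` returns `-Inf`, so the preregistration should state how such values will be handled.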

Sensitivity

sensitivity

Provide a series of decisions about evaluating the strength, reliability, or robustness of your focal hypothesis test. This may include within-study replication attempts, additional covariates, cross-validation efforts (out-of-sample replication, split/hold-out sample), applying weights, selectively applying constraints in an SEM context (e.g., comparing model fit statistics), overfitting adjustment techniques used (e.g., regularization approaches such as ridge regression), or some other simulation/sampling/bootstrapping method.

*Example*: To assess the sensitivity of our results to our selection criterion for outliers, we will run an additional analysis without removing any outliers.

Exploratory

exploratory

If you plan to explore your dataset to look for unexpected differences or relationships, describe those tests here, or add them to the final paper under a heading that clearly differentiates this exploratory part of your study from the confirmatory part.

*Example*: As an exploratory analysis, we will test the relationship between scores on the religiosity scale and prosociality after adjusting for a variety of social, educational, and cognitive covariates that are available in the dataset. We have no specific hypotheses about which covariates will attenuate the religiosity-prosociality relation most substantially, but we will use this exploratory analysis to generate hypotheses to test in other, independent datasets.

*Comments*: Whereas it is not presently the norm to preregister exploratory analyses, it is often good to be clear about which variables will be explored (if any), for example, to differentiate these from the variables for which you have specific predictions or to plan ahead about how to compute these variables.

Integrity statement

integrity_statement

The authors of this preregistration state that they filled out this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and dataset.