
This analysis script accompanies the manuscript Chen, Chavez, Ong, & Gunderson (invited revision).

Results

Regression analyses showed no differences between conditions in students’ high school GPAs (Study 1: p = 0.939; Study 2: p = 0.393) or in their college GPAs before the intervention (Study 1: p = 0.577; Study 2: p = 0.557).

Across both cohorts, there were also no significant differences between conditions in students’ desired grades on each of their two exams (all p-values > 0.160), their motivation to achieve their desired grades (all p-values > 0.267), the personal importance of these desired grades (all p-values > 0.181), or their confidence in attaining their desired grades (all p-values > 0.161).

Treatment Effects

We conducted our analyses using multiple approaches: first, we conducted an intent-to-treat analysis (Gupta, 2011; Wertz, 1995) by comparing the performance of all students based on their randomly assigned condition, regardless of how many surveys they took. This avoids the self-selection bias potentially introduced by only analyzing students who finished the full treatment or full control. Second, we compared the performance of students in the treatment and control conditions who took our surveys before both of their exams (i.e. the full treatment versus the same number of control surveys). Third, we considered whether treatment dosage among those who were assigned to receive the intervention resulted in differential benefits among those treated. Performance differences between conditions in all three analyses replicated when we analyzed students’ exam and final course performance excluding the homework extra credit points that they attained for participating in these surveys.

In both studies, our intent-to-treat analyses found that students in the treatment condition outperformed those in the control condition on their final course grades by an average of one-third of a letter grade. In Study 1, students in the treatment condition performed an average of 3.64% [0.28%, 7.00%] higher on their final course grades than students in the control condition ( MT = 83.90% vs. MC = 80.26%, Cohen’s d = 0.33, Welch t(162) = 2.14, p = 0.034 ). This performance advantage replicated in Study 2, where students in the treatment condition scored an average of 4.21% [0.97%, 7.44%] higher in the class than those in the control condition ( MT = 83.44% vs. MC = 79.23%, d = 0.37, t(183) = 2.56, p = 0.011 ). Performance differences between conditions were evident on every exam that the students took in both cohorts, although the difference was trending but not statistically significant on Exam 1 in Study 1 (Fig. 1).
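The intent-to-treat comparison can be sketched in R roughly as follows. The data are simulated, and the column names (`condition`, `final_grade`) are assumptions rather than the variables in the original script. Cohen's d is computed here with the pooled standard deviation, which is one common definition and may not match the one the authors used.

```r
# Simulated data; all names and values are hypothetical.
set.seed(1)
d <- data.frame(
  condition   = rep(c("control", "treatment"), each = 90),
  final_grade = c(rnorm(90, mean = 80, sd = 11), rnorm(90, mean = 84, sd = 11))
)

# Order the levels so that estimates read as treatment minus control.
d$condition <- factor(d$condition, levels = c("treatment", "control"))

# t.test() runs Welch's unequal-variance test by default (var.equal = FALSE).
itt <- t.test(final_grade ~ condition, data = d)

# Cohen's d using the pooled standard deviation of the two groups.
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

trt <- d$final_grade[d$condition == "treatment"]
ctl <- d$final_grade[d$condition == "control"]
mean_diff <- mean(trt) - mean(ctl)
effect <- cohens_d(trt, ctl)

print(itt$conf.int)  # 95% CI of the mean difference (treatment - control)
print(itt$p.value)
print(effect)
```

Because every student is kept in the condition they were randomly assigned to, the same call applies regardless of how many surveys each student completed.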

Fig. 1. Average student performance (% score) on Exam 1, Exam 2, and final course grades by condition in our intent-to-treat analyses. Error bars represent 95% confidence intervals of the means for each condition.

We replicated these findings when we compared the performances of the majority of students who took the full intervention dosage (i.e. once before each of two exams) against students in the control condition who received the same number of control exam reminders. In both studies, the average class performance difference between these two groups of students was one-third of a letter grade. Students who took the intervention twice scored an average of 3.45% [0.26%, 6.65%] higher on their final course grades in Study 1 ( MT = 86.35% vs. MC = 82.90%, d = 0.38, t(127) = 2.14, p = 0.034 ), and 4.65% [1.45%, 7.85%] higher in Study 2 ( MT = 85.77% vs. MC = 81.12%, d = 0.47, t(139) = 2.87, p = 0.0047 ), relative to those in the control condition. Significant performance differences were also observed on students’ exams, with the exception of Exam 1 in Study 1, where our results were in the predicted direction but not statistically significant (Fig. S1).

We found a treatment dosage effect among those who had received the intervention. The majority of treatment condition students in each study took the treatment twice rather than once (Study 1: 75.9% twice versus 24.1% once; Study 2: 70.5% twice versus 29.5% once). Students in the treatment condition who took the intervention twice (as opposed to once) scored significantly higher on their final course grades (Study 1: Mdiff = 10.16% [5.30%, 15.03%], d = 1.11, t(32) = 4.26, p = 0.00017; Study 2: Mdiff = 7.90% [2.77%, 13.04%], d = 0.81, t(38) = 3.12, p = 0.0035). These results, however, might be explained by self-selection. For instance, more motivated students might be more likely to take both pre-exam surveys. We did find some evidence of this: class performance also differed significantly among control condition students who took different numbers of surveys (Study 1: Mdiff = 11.08% [3.52%, 18.63%], d = 1.00, t(24) = 3.03, p = 0.0059; Study 2: Mdiff = 9.43% [2.01%, 16.86%], d = 0.81, t(23) = 2.63, p = 0.015).

Nevertheless, differences in self-selection by motivation level do not completely explain the observed performance disparities between treatment condition students who had taken our surveys twice versus just once. The treatment dosage effect was still significant when we controlled for students’ GPA at the beginning of the class, which is often indicative of how motivated students are towards their academic learning ( Study 1: bdosage = 6.24 [1.85, 10.63], se = 2.21, t(81) = 2.83, p = 0.0059 ; Study 2: bdosage = 6.17 [2.35, 9.99], se = 1.92, t(90) = 3.21, p = 0.0019 ).
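A covariate-adjusted dosage model of this kind might look like the sketch below. The data are simulated and the names `dosage`, `prior_gpa`, and `final_grade` are hypothetical. The coding of dosage as 0 (once) versus 1 (twice) is also an assumption.

```r
# Simulated treatment-condition students; all names and values are hypothetical.
set.seed(2)
n <- 90
trt <- data.frame(
  prior_gpa = round(rnorm(n, mean = 3.1, sd = 0.5), 2),
  dosage    = rbinom(n, size = 1, prob = 0.7)  # 1 = took the intervention twice, 0 = once
)
trt$final_grade <- 50 + 9 * trt$prior_gpa + 6 * trt$dosage + rnorm(n, sd = 8)

# Dosage effect on final grade, holding pre-intervention GPA constant.
fit <- lm(final_grade ~ dosage + prior_gpa, data = trt)

print(coef(summary(fit))["dosage", ])  # b, se, t, p for dosage
print(confint(fit, "dosage"))          # 95% CI for the dosage coefficient
```

The coefficient on `dosage` is then the estimated grade difference between students who took the intervention twice and those who took it once, at equal prior GPA.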

In summary, we can conclude that students benefit from doing the intervention exercise over getting a regular exam reminder, and that more exposure to the intervention is associated with higher class performance.

Treatment Homogeneity

We found that the Strategic Resource Use for Learning intervention was academically advantageous for different types of college students across the demographic and performance variables that we collected (gender, race, class standing, and pre-intervention performance levels). Moderation analyses showed that it similarly impacted males and females, students of different racial groups, students of different class standings, and low and high-performing students in both cohorts (all interaction p-values > 0.188). Model comparisons further reinforced these results: Pooling across both studies, we compared one model specifying all the interactions between condition and individual difference variables (gender, race, class standing, pre-intervention GPA, and cohort) to another model without the interactions (only the main effects). The two models were not statistically different from one another (p = 0.369), implying that the more parsimonious model without interactions is sufficient to explain the data. These results support our inference that the intervention benefitted students to similar extents across these individual differences.
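The nested model comparison described here could be run along the following lines. This is a sketch on simulated data that uses only a subset of the moderators (gender, cohort, prior GPA), and the F-test from `anova()` is an assumption, since the text does not say which comparison test was used.

```r
# Simulated pooled data; all names and values are hypothetical.
set.seed(3)
n <- 300
pooled <- data.frame(
  condition = factor(sample(c("control", "treatment"), n, replace = TRUE)),
  gender    = factor(sample(c("female", "male"), n, replace = TRUE)),
  cohort    = factor(sample(c("study1", "study2"), n, replace = TRUE)),
  prior_gpa = rnorm(n, mean = 3.1, sd = 0.5)
)
pooled$final_grade <- 55 + 4 * (pooled$condition == "treatment") +
  8 * pooled$prior_gpa + rnorm(n, sd = 9)

# Main effects only.
main_only <- lm(final_grade ~ condition + gender + cohort + prior_gpa, data = pooled)

# Main effects plus every condition-by-moderator interaction.
with_interactions <- lm(final_grade ~ condition * (gender + cohort + prior_gpa),
                        data = pooled)

# F-test of whether the interaction terms jointly improve fit.
cmp <- anova(main_only, with_interactions)
print(cmp)
```

A non-significant F-test means the interaction terms do not reliably improve fit, which is the basis for preferring the more parsimonious main-effects model.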

Causal Process

We tested our prediction that the intervention would affect students’ performance through greater self-reflection on their learning and more effective resource use behaviors, in that order. Aggregating across all available data in our two studies, we first ran regression analyses to test for the predicted relationships among our variables. Students who had been randomly assigned to the treatment reported practicing significantly more self-reflection on their learning in class (b = 0.21 [0.03, 0.38], se = 0.09, t(163) = 2.38, p = 0.019). The more students thought strategically about how to effectively approach their learning, the more useful they found the resources they had used for studying (b = 0.22 [0.06, 0.37], se = 0.08, t(120) = 2.80, p = 0.0059). This perceived usefulness, in turn, marginally predicted how well they performed in the class (b = 2.12 [-0.18, 4.42], se = 1.17, t(223) = 1.82, p = 0.071). There was no direct effect of condition on students’ resource use behaviors (p = 0.261). There was also no significant difference in the number of resources that students used for learning. If anything, students in the treatment condition used fewer learning resources on average in the class than students in the control condition (MT = 11.91 vs. MC = 13.03, Mdiff = 1.12 [-0.19, 2.42], d = 0.22, t(223) = 1.69, p = 0.093). This result suggests that the intervention made students use their resources more effectively rather than just getting them to use more resources.

(Serial Mediation in Mplus)

Exam-Focused Resource Selection and Follow-Through in the Treatment Condition

For students in the treatment condition, we could test the performance benefits of using resources that they had strategically selected and planned ahead of time versus those that they had not selected in advance but ended up using. We could also examine the degree to which following through with the strategic plans they had made affected their performance on each exam. We aggregated across all exams in both studies and used mixed effects models with exam number, individual student, and cohort included as random effects.

Importance of strategic forethought in resource use.

We tested the contribution of strategic forethought to students’ grades by comparing how well treatment group students’ exam performance was explained by the number of resources that they had strategically selected and used versus the number of resources they had not selected a priori but ended up using. Both of these variables were added as fixed effects predictors in our mixed effects model. Only the number of resources that students had strategically selected in advance and used positively related to their exam performance (b = 0.77 [0.33, 1.21], se = 0.22, t(241) = 3.48, p = 0.0006); the number of resources that they used but had not selected in the intervention ahead of time did not significantly predict their exam performance (p = 0.382).
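A mixed effects model with this structure might be specified as in the sketch below, using `lme4` (listed in the session info at the end of this document). The data are simulated, the variable names are hypothetical, and the random-intercept-only specification is an assumption. The model is fitted only if `lme4` is installed.

```r
# Simulated long-format data: one row per student per exam (hypothetical names).
set.seed(4)
n_students <- 60
long <- expand.grid(
  student = factor(seq_len(n_students)),
  exam    = factor(c("exam1", "exam2"))
)
long$cohort <- factor(ifelse(as.integer(long$student) <= 30, "study1", "study2"))
long$n_planned_used   <- rpois(nrow(long), lambda = 5)  # selected in advance and used
long$n_unplanned_used <- rpois(nrow(long), lambda = 4)  # used without prior selection

student_effect <- rnorm(n_students, sd = 5)
long$exam_score <- 70 + 0.8 * long$n_planned_used +
  student_effect[as.integer(long$student)] + rnorm(nrow(long), sd = 6)

if (requireNamespace("lme4", quietly = TRUE)) {
  # Two fixed-effect predictors; random intercepts for student, exam, and cohort.
  fit <- lme4::lmer(
    exam_score ~ n_planned_used + n_unplanned_used +
      (1 | student) + (1 | exam) + (1 | cohort),
    data = long
  )
  print(lme4::fixef(fit))
}
```

With only two exams and two cohorts, the variance of those random intercepts is estimated from very few levels, so a singular-fit message would not be surprising on data like these. `lme4` on its own does not report p-values for the fixed effects; the `lmerTest` package, also listed in the session info, adds them.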

Follow-through with plans.

To test how much acting on one’s strategically selected resources contributed to students’ performance, beyond the mere motion of selecting and planning, we matched the resources that every student had chosen before their exams to their post-exam resource use responses. We ran a mixed effects model predicting students’ exam performance with two fixed effects predictors: the number of resources that treatment group students had planned and used, and the number of resources that they had planned but not used. Results showed that actually using the resources selected was crucial to performance. Out of students’ strategically planned resources, only those that they had actually utilized contributed positively to their exam grades (b = 0.75 [0.31, 1.19], se = 0.22, t(230) = 3.38, p = 0.00084). In the same model, the number of resources that students had planned but not used was unrelated to their exam performance (p = 0.946). Therefore, strategically selecting and planning out what resources will be useful does not automatically boost students’ grades; it also requires putting these strategic plans into practice.
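The matching step that produces these predictors can be illustrated with set operations. The students, resource names, and list-based data layout below are all hypothetical; the original script may store and match responses differently.

```r
# Resources each student selected before the exam (hypothetical data).
planned <- list(
  s1 = c("lecture notes", "practice exams", "office hours"),
  s2 = c("textbook", "practice exams")
)

# Resources each student reported using after the exam (hypothetical data).
used <- list(
  s1 = c("lecture notes", "practice exams", "textbook"),
  s2 = c("textbook", "practice exams", "study group")
)

count_by <- function(f) mapply(function(p, u) length(f(p, u)), planned, used)

follow_through <- data.frame(
  student            = names(planned),
  n_planned_used     = count_by(intersect),                     # planned and used
  n_planned_not_used = count_by(setdiff),                       # planned, never used
  n_unplanned_used   = count_by(function(p, u) setdiff(u, p)),  # used, never planned
  row.names = NULL
)
print(follow_through)
```

The first two counts correspond to the two fixed effects predictors in the follow-through model, and the third to the unplanned-use predictor in the forethought model.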

To summarize, the resources that students had strategically selected through our intervention exercise were especially predictive of performance, relative to those which they used but without such forethought. Furthermore, carrying out one’s plans was key to translating this psychology into performance benefits—a result consistent with a long tradition of research about the importance of implementing one’s intentions (Gollwitzer, 1999).

Additional Emotional and Motivational Benefits

We examined additional potential consequences of the intervention, including its effects on students’ pre-exam negative affect and perceived control over their performance, how much pre-exam planning they did, and how well they followed through with their plans. We aggregated across all exams in both studies and used mixed effects models to test for differences on each of these variables by condition, including exam number, individual student, and cohort as random effects. Relative to the control condition, students in the treatment condition experienced significantly lower negative affect towards their upcoming exams (b = -0.43 [-0.73, -0.14], se = 0.15, t(353) = 2.88, p = 0.0043) and perceived marginally greater control over their own learning in the class (b = 0.16 [-0.01, 0.33], se = 0.09, t(347) = 1.86, p = 0.064). Neither students’ subjective degree of prior planning (p = 0.492) nor how much they felt that they had followed through with their plans (p = 0.381) differed between conditions.

Intervention Components

To understand which elements of the intervention predicted students’ class performance, we coded and analyzed students’ open-ended responses to why each resource they had chosen would be useful to them (“why useful” responses) and the descriptions of their exam preparation plans (planning). Examples of students’ open-ended responses are provided in the SOM Appendix C.

Students’ “why useful” responses were coded into five main strategies that they mentioned which are consistent with self-regulation theory: (1) explicit consideration of the exam format, (2) leveraging multiple resources in a synergistic manner, (3) a focus on promoting learning and understanding of the class material, (4) illustrating an understanding of personal strengths and weaknesses, and (5) recognizing that learning is a social process (as opposed to an individual’s isolated endeavor). Two independent coders categorized students’ open-ended responses into these five categories if students mentioned any of the categories in their responses (inter-rater κ ranged from 0.88 to 1.00), and any disagreements were resolved through discussion. Students’ plans were similarly coded into the following three thematic categories: when, where, and how the resources were going to be used (inter-rater κ ranged from 0.94 to 1.00). For each thematic category, we created a measure of how much students engaged in it across their two exams (“0” if they did not mention it at all, “1” if they only wrote about it before one exam, and “2” if they wrote about it before both exams).
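The two quantities described in this paragraph, inter-rater agreement and the 0/1/2 engagement measure, can be sketched in base R. The codes below are made up for illustration, and the kappa function is a plain implementation of unweighted Cohen's kappa rather than the routine used in the original script.

```r
# Unweighted Cohen's kappa for two coders rating the same responses.
cohen_kappa <- function(r1, r2) {
  lv  <- sort(unique(c(r1, r2)))
  tab <- table(factor(r1, levels = lv), factor(r2, levels = lv))
  p_obs <- sum(diag(tab)) / sum(tab)                     # observed agreement
  p_exp <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2 # agreement expected by chance
  (p_obs - p_exp) / (1 - p_exp)
}

# Hypothetical codes for one category (1 = mentioned, 0 = not mentioned).
coder_a <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0)
coder_b <- c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0)
kappa_category <- cohen_kappa(coder_a, coder_b)
print(kappa_category)

# Engagement per student: number of exams (0, 1, or 2) before which
# the category was mentioned, after disagreements are resolved.
mentions <- data.frame(
  exam1 = c(0, 1, 1, 0),
  exam2 = c(0, 0, 1, 1)
)
engagement <- rowSums(mentions)
print(engagement)
```

In this toy example the coders disagree on one of ten responses, so observed agreement is 0.9 against a chance agreement of 0.5, giving a kappa of 0.8.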

We regressed treatment condition students’ final course grades on this measure of engagement separately for each of these 8 categories (see Table 1 for results). Four elements of the intervention significantly and consistently related to students’ class performance across the two studies: explicitly tailoring their choice of resources to the exam questions expected, focusing their resource use on learning and understanding the material, and planning out when and how they would use their resources (Table 1). For example, students in the treatment condition who were more engaged in reflecting on what was expected of them on their exams as they chose their resources tended to perform better in the class. These results emphasize that both strategic self-reflection and planning are valuable components of the intervention.
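Fitting one simple regression per category and collecting the results might look like the following sketch. Only three of the eight categories are simulated, and all column names and values are hypothetical.

```r
# Simulated engagement scores (0, 1, or 2) for three hypothetical categories.
set.seed(5)
n <- 80
codes <- data.frame(
  exam_format = sample(0:2, n, replace = TRUE),
  learning    = sample(0:2, n, replace = TRUE),
  plan_when   = sample(0:2, n, replace = TRUE)
)
codes$final_grade <- 75 + 3 * codes$exam_format + 5 * codes$learning + rnorm(n, sd = 9)

categories <- setdiff(names(codes), "final_grade")

# One separate regression of final grade on each category's engagement score.
results <- do.call(rbind, lapply(categories, function(v) {
  fit <- lm(reformulate(v, response = "final_grade"), data = codes)
  est <- coef(summary(fit))[v, ]
  ci  <- confint(fit, v)
  data.frame(
    category = v,
    b        = est[["Estimate"]],
    ci_low   = ci[1],
    ci_high  = ci[2],
    t        = est[["t value"]],
    p        = est[["Pr(>|t|)"]],
    row.names = NULL
  )
}))
print(results)
```

Each row of `results` has the same layout as a row of Table 1: an unstandardized coefficient, its 95% confidence interval, and the t and p values.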

Table 1

| Categories | Unstandardized b [95% CI] | t | p |
|---|---|---|---|
| “Why Useful” Self-regulatory Elements | | | |
| Study 1 | | | |
| Consideration of Exam Format | 3.76 [1.07, 6.45] | 2.78 | 0.01 |
| Synergistic use with other resources | 1.36 [-1.57, 4.29] | 0.92 | 0.36 |
| Learning and understanding of the class material | 4.98 [1.04, 8.92] | 2.51 | 0.01 |
| Understanding of personal strengths | 1.38 [-1.55, 4.30] | 0.94 | 0.35 |
| Learning as a social process | 0.16 [-2.08, 2.41] | 0.14 | 0.89 |
| Study 2 | | | |
| Consideration of Exam Format | 3.22 [0.62, 5.83] | 2.46 | 0.02 |
| Synergistic use with other resources | 1.79 [-1.03, 4.61] | 1.26 | 0.21 |
| Learning and understanding of the class material | 8.18 [4.85, 11.50] | 4.88 | 0.00 |
| Understanding of personal strengths | 2.44 [-1.52, 6.40] | 1.22 | 0.22 |
| Learning as a social process | -0.41 [-2.57, 1.74] | -0.38 | 0.71 |
| Elements of Planning | | | |
| Study 1 | | | |
| When | 4.33 [0.79, 7.87] | 2.43 | 0.02 |
| Where | 3.16 [0.31, 6.00] | 2.20 | 0.03 |
| How | 5.66 [2.50, 8.83] | 3.56 | 0.00 |
| Study 2 | | | |
| When | 4.97 [1.66, 8.29] | 2.98 | 0.00 |
| Where | -0.19 [-3.40, 3.01] | -0.12 | 0.91 |
| How | 6.71 [3.53, 9.89] | 4.19 | 0.00 |

Supplementary Materials

Table S1

Percentage of students in the class who participated in each survey

| % Participation | Study 1 | Study 2 |
|---|---|---|
| Pre-Exam 1 Survey | 78.7 | 84.5 |
| Pre-Exam 2 Survey | 90.4 | 76.3 |
| Both Pre-Exam Surveys | 73.0 | 69.1 |
| Post-Exam 1 Survey | 80.9 | 78.3 |
| Post-Exam 2 Survey | 78.1 | 83.6 |
| Both Post-Exam Surveys | 68.0 | 71.0 |
| All Four Surveys | 58.4 | 58.9 |

Note. There were 178 students who enrolled in and obtained a final grade in the class in Study 1, and 207 students in Study 2.

Table S2

Students’ pre-intervention measures by condition

| | High School GPA | College GPA | Goal grade | Motivation | Importance | Confidence |
|---|---|---|---|---|---|---|
| Study 1 | | | | | | |
| Control | 3.80 (0.33) [3.72, 3.88] | 3.08 (0.43) [2.98, 3.18] | | | | |
| Treatment | 3.80 (0.39) [3.70, 3.90] | 3.13 (0.54) [3.01, 3.24] | | | | |
| Study 1 Exam 1 | | | | | | |
| Control | | | 2.39 (1.19) [2.11, 2.68] | 5.74 (1.20) [5.45, 6.03] | 5.88 (1.42) [5.53, 6.22] | 4.57 (1.30) [4.26, 4.89] |
| Treatment | | | 2.18 (1.22) [1.89, 2.47] | 5.87 (0.90) [5.66, 6.08] | 6.14 (0.89) [5.93, 6.35] | 4.86 (1.06) [4.61, 5.11] |
| Study 1 Exam 2 | | | | | | |
| Control | | | 2.90 (1.62) [2.54, 3.26] | 5.98 (0.97) [5.76, 6.20] | 6.12 (1.00) [5.89, 6.34] | 4.63 (1.13) [4.38, 4.89] |
| Treatment | | | 2.56 (1.41) [2.25, 2.87] | 6.00 (1.16) [5.75, 6.26] | 6.23 (0.91) [6.03, 6.43] | 4.89 (1.19) [4.63, 5.15] |
| Study 2 | | | | | | |
| Control | 3.75 (0.33) [3.68, 3.81] | 3.19 (0.49) [3.09, 3.30] | | | | |
| Treatment | 3.79 (0.31) [3.72, 3.85] | 3.24 (0.50) [3.13, 3.34] | | | | |
| Study 2 Exam 1 | | | | | | |
| Control | | | 2.36 (1.45) [2.05, 2.67] | 5.84 (1.21) [5.58, 6.10] | 6.17 (0.99) [5.96, 6.38] | 4.83 (1.13) [4.59, 5.07] |
| Treatment | | | 2.16 (1.34) [1.88, 2.45] | 5.87 (1.03) [5.65, 6.09] | 6.07 (1.03) [5.85, 6.29] | 4.72 (1.04) [4.50, 4.94] |
| Study 2 Exam 2 | | | | | | |
| Control | | | 2.62 (1.43) [2.31, 2.94] | 6.28 (0.99) [6.06, 6.49] | 6.29 (0.90) [6.09, 6.49] | 5.00 (1.22) [4.73, 5.27] |
| Treatment | | | 2.36 (1.73) [1.96, 2.76] | 6.11 (0.92) [5.89, 6.32] | 6.21 (0.90) [6.01, 6.42] | 4.77 (1.28) [4.48, 5.07] |

Note. Descriptive statistics of the motivation, importance, and confidence measures reflect 7-point scales. No significant differences between conditions were found on any of these pre-intervention measures.

Table S3

Goodness-of-fit statistics for mediation models tested.

TODO TABLE S3

R Session Info; to aid in reproducibility

# Details on the R versions and package versions used
devtools::session_info()
## Session info --------------------------------------------------------------
##  setting  value                       
##  version  R version 3.3.1 (2016-06-21)
##  system   x86_64, darwin13.4.0        
##  ui       X11                         
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  tz       America/Los_Angeles         
##  date     2017-01-12
## Packages ------------------------------------------------------------------
##  package      * version date       source        
##  acepack        1.3-3.3 2014-11-24 CRAN (R 3.3.0)
##  assertthat     0.1     2013-12-06 CRAN (R 3.3.0)
##  chron          2.3-47  2015-06-24 CRAN (R 3.3.0)
##  cluster        2.0.4   2016-04-18 CRAN (R 3.3.1)
##  colorspace     1.2-6   2015-03-11 CRAN (R 3.3.0)
##  corrplot     * 0.77    2016-04-21 CRAN (R 3.3.0)
##  data.table     1.9.6   2015-09-19 CRAN (R 3.3.0)
##  devtools     * 1.12.0  2016-06-24 CRAN (R 3.3.0)
##  digest         0.6.10  2016-08-02 CRAN (R 3.3.0)
##  evaluate       0.9     2016-04-29 CRAN (R 3.3.0)
##  foreign        0.8-66  2015-08-19 CRAN (R 3.3.1)
##  formatR        1.4     2016-05-09 CRAN (R 3.3.0)
##  Formula      * 1.2-1   2015-04-07 CRAN (R 3.3.0)
##  ggplot2      * 2.2.1   2016-12-30 CRAN (R 3.3.2)
##  gridExtra      2.2.1   2016-02-29 CRAN (R 3.3.0)
##  gtable         0.2.0   2016-02-26 CRAN (R 3.3.0)
##  Hmisc        * 3.17-4  2016-05-02 CRAN (R 3.3.0)
##  htmltools      0.3.5   2016-03-21 CRAN (R 3.3.0)
##  knitr          1.14    2016-08-13 CRAN (R 3.3.0)
##  labeling       0.3     2014-08-23 CRAN (R 3.3.0)
##  lattice      * 0.20-33 2015-07-14 CRAN (R 3.3.1)
##  latticeExtra   0.6-28  2016-02-09 CRAN (R 3.3.0)
##  lazyeval       0.2.0   2016-06-12 CRAN (R 3.3.0)
##  lme4         * 1.1-12  2016-04-16 CRAN (R 3.3.0)
##  lmerTest     * 2.0-32  2016-06-23 CRAN (R 3.3.0)
##  lsr          * 0.5     2015-03-02 CRAN (R 3.3.0)
##  magrittr       1.5     2014-11-22 CRAN (R 3.3.0)
##  MASS           7.3-45  2016-04-21 CRAN (R 3.3.1)
##  Matrix       * 1.2-6   2016-05-02 CRAN (R 3.3.1)
##  memoise        1.0.0   2016-01-29 CRAN (R 3.3.0)
##  minqa          1.2.4   2014-10-09 CRAN (R 3.3.0)
##  mnormt         1.5-4   2016-03-09 CRAN (R 3.3.0)
##  munsell        0.4.3   2016-02-13 CRAN (R 3.3.0)
##  nlme         * 3.1-128 2016-05-10 CRAN (R 3.3.0)
##  nloptr         1.0.4   2014-08-04 CRAN (R 3.3.0)
##  nnet           7.3-12  2016-02-02 CRAN (R 3.3.1)
##  plyr         * 1.8.4   2016-06-08 CRAN (R 3.3.0)
##  psy          * 1.1     2012-06-21 CRAN (R 3.3.0)
##  psych        * 1.6.6   2016-06-28 CRAN (R 3.3.0)
##  RColorBrewer   1.1-2   2014-12-07 CRAN (R 3.3.0)
##  Rcpp           0.12.6  2016-07-19 CRAN (R 3.3.0)
##  reshape      * 0.8.5   2014-04-23 CRAN (R 3.3.0)
##  reshape2     * 1.4.1   2014-12-06 CRAN (R 3.3.0)
##  rmarkdown      1.0     2016-07-08 CRAN (R 3.3.1)
##  rpart          4.1-10  2015-06-29 CRAN (R 3.3.1)
##  scales       * 0.4.1   2016-11-09 CRAN (R 3.3.2)
##  stringi        1.1.1   2016-05-27 CRAN (R 3.3.0)
##  stringr        1.1.0   2016-08-19 CRAN (R 3.3.0)
##  survival     * 2.39-4  2016-05-11 CRAN (R 3.3.1)
##  tibble         1.2     2016-08-26 CRAN (R 3.3.0)
##  withr          1.0.2   2016-06-20 CRAN (R 3.3.0)
##  yaml           2.1.13  2014-06-12 CRAN (R 3.3.0)