This R Markdown file reproduces all of the analyses, tables, and figures reported in:
Chen, P., Ong, D. C., Ng, J., & Coppola, B. (in press). Explore, Exploit, and Prune in the Classroom: Strategic Resource Management Behaviors Predict Performance. AERA Open.
Response Rates | Year 1 | Year 2 | Year 3 | Year 4 |
---|---|---|---|---|
Enrolled in class | 1172 | 1336 | 1438 | 1392 |
Exam 1 | 1136 (96.93%) | 992 (74.25%) | 1265 (87.97%) | 1064 (76.44%) |
Exam 2 | 1119 (95.48%) | 994 (74.40%) | 1300 (90.40%) | 1097 (78.81%) |
Exam 3 | 1123 (95.82%) | 907 (67.89%) | 1287 (89.50%) | 1105 (79.38%) |
At least one exam survey | 1170 (99.83%) | 1071 (80.16%) | 1347 (93.67%) | 1201 (86.28%) |
All three exam surveys | 1057 (90.19%) | 853 (63.85%) | 1194 (83.03%) | 940 (67.53%) |
Note that Table 2 (Descriptive Frequencies of Resource Use on the Prior Exam (Exams 1 and 2) and Percentages of Resources Explored, Exploited, and Pruned Out of Those Possible (on the Subsequent Exams 2 and 3, Respectively), Aggregating Across Cohorts) is found later in the file due to the “flow” of the code.
To first analyze how students’ resource use changed over time, we fit a linear mixed-effects model predicting the number of resources used at a given time point, assuming a linear coefficient on time, an integer-valued variable (t = 1, 2, 3). We applied the same model to analyze how students’ reported mean usefulness ratings changed over time. Both models included random intercepts for individual students nested within cohort (year) and for cohort:
\(\text{NumResourcesUsed}_{i,t,y} = b_0 + b_1 t + u_{i,y} + u_y + \epsilon_{i,t,y}\)
\(\text{MeanUsefulness}_{i,t,y} = b_0 + b_1 t + u_{i,y} + u_y + \epsilon_{i,t,y}\)
where \(u_{i,y}\) is the student-specific random intercept nested within year, \(u_y\) is a random intercept for year, and \(\epsilon_{i,t,y}\) is the residual error term. In R syntax, the models are:
lmer(numResUsed ~ examNum + (1|ID:Year) + (1|Year))
lmer(meanUsefulness ~ examNum + (1|ID:Year) + (1|Year))
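A minimal sketch of how these models can be fit and the reported trend estimates extracted. The data frame name `dat`, and the use of lmerTest for p values, are assumptions; the variable names follow the calls above:

```r
# Sketch: fit Equations 2a and 2b and pull out the linear trend on examNum.
# `dat` is assumed to be long-format, one row per student-exam, with columns
# numResUsed, meanUsefulness, examNum (1, 2, 3), ID, and Year.
library(lmerTest)  # wraps lme4::lmer and adds p values for fixed effects

m_use <- lmer(numResUsed ~ examNum + (1 | ID:Year) + (1 | Year), data = dat)
summary(m_use)$coefficients           # b (Estimate), df, and p for examNum
confint(m_use, parm = "examNum")      # 95% profile CI for the linear trend

m_useful <- lmer(meanUsefulness ~ examNum + (1 | ID:Year) + (1 | Year), data = dat)
summary(m_useful)$coefficients
```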
We begin by describing how students in our study interacted with their resources. Students showed a decreasing linear trend in the number of resources that they used to study for each of their three exams (Equation 2a). Across all four cohorts, students started off using an average of 7.9 (standard deviation, SD = 1.7) resources to study for Exam 1, 7.4 (SD = 1.7) resources for Exam 2, and 7.0 (SD = 1.7) resources for Exam 3, linear trend b = -.44, 95% CI = [-.47, -.42], p < .001. We illustrate students’ use and usefulness ratings for the 12 kinds of resources in Figure 1, aggregated across cohorts and categorized by exam. On average, some resources (e.g., attending the lecture and using the coursepack) were used more than others (e.g., lecture podcasts, the Science Learning Center); not surprisingly, these tended also to be the resources that a large proportion of students rated as “extremely useful” for their learning.
While this decreasing trend in the number of resources used could be interpreted as decreasing motivation, our evidence suggests otherwise: Over the same period, students’ mean ratings of how useful their resources had been exhibited a positive linear trend, increasing from 4.09 (SD = 0.47) on Exam 1, to 4.17 (SD = 0.53) on Exam 2, and 4.21 (SD = 0.58) on Exam 3, linear trend b = .06, 95% CI = [.05, .07], p < .001 (Equation 2b). A score of 4 on our 5-point scale corresponds to “useful”. We inferred that, rather than necessarily being less invested, students were, on average, possibly becoming more focused and effective in their resource use over time. Evidence from our cognitive interviews with a randomly selected sample of students in the class who were not part of this study (see SOM “Survey Validation” for details) supports the idea that some students were strategically changing their resource use over the course of the class. For example, one student shared: “By exam 3 I wasn’t using [textbook problems]. I used those for the first exam, but found them not as helpful, so I stopped using those.” Next, we turn to how students increased their resource-use effectiveness by managing their resource use wisely from exam to exam.
Ultimately, we were interested in whether students’ resource management behaviors between exams were associated with their exam performance. In a mixed-effects linear model, we regressed students’ current exam performance on their reported exploration, exploitation, and pruning behaviors, controlling for students’ performance on the prior exam as a fixed effect. We added random intercepts for student nested within year, for time point nested within year, and for year. Exam scores, the dependent variable, were converted into percentage scores out of 100 for all exams, so effect sizes (unstandardized b coefficients) can be interpreted in units of percentage points. Means and standard deviations of the three class exam scores are presented in Table S3. Thus, for student \(i\) at time \(t\) in year \(y\), we estimate the following model:
\(\text{Exam}_{i,t,y} = b_0 + b_1 \text{NumExplore}_{i,t,y} + b_2 \text{NumExploit}_{i,t,y} + b_3 \text{NumPrune}_{i,t,y} + b_4 \text{Exam}_{i,t-1,y} + u_{i,y} + u_{t,y} + u_y + \epsilon_{i,t,y}\)
where \(\text{Exam}_{i,t,y}\) denotes the exam score of student \(i\) at time \(t\) in year \(y\), \(u_{i,y}\) gives the random intercept of student nested within year, \(u_{t,y}\) gives the random intercept of exam nested within year, and \(u_y\) gives the random intercept by year. Note that controlling for exam performance at the previous time point is conservative, and allows us to test whether the resource regulatory behaviors (enacted between exams) explain exam performance over and above prior performance. In R syntax, this model is:
lmer(currentScorePercent ~ sum_explore + sum_exploit + sum_prune + pastScorePercent + (1|ID:Year) + (1|examNum:Year) + (1|Year))
(Note: the figure above is not in the paper, as the information already appears in Table 3.)
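For reference, a minimal sketch of how the three behavior counts could be derived from resource-level survey responses, consistent with the operationalizations above: explored = used now but not on the preceding exam; exploited = kept a resource rated useful (>3) on the preceding exam; pruned = dropped a resource rated useless (<3) on the preceding exam. The data frame `res` and its column names are hypothetical:

```r
# Sketch: derive sum_explore, sum_exploit, and sum_prune per student-exam.
# `res` is assumed to have one row per student-resource-exam, with columns:
# ID, Year, examNum, resource, used (0/1), usefulness (1-5; NA if not used).
library(dplyr)

behavior_counts <- res %>%
  arrange(ID, Year, resource, examNum) %>%
  group_by(ID, Year, resource) %>%
  mutate(used_prev   = lag(used),        # use on the preceding exam
         rating_prev = lag(usefulness)) %>%  # rating on the preceding exam
  ungroup() %>%
  filter(examNum > 1) %>%                # behaviors only defined from Exam 2 on
  group_by(ID, Year, examNum) %>%
  summarise(
    sum_explore = sum(used == 1 & used_prev == 0, na.rm = TRUE),
    sum_exploit = sum(used == 1 & used_prev == 1 & rating_prev > 3, na.rm = TRUE),
    sum_prune   = sum(used == 0 & used_prev == 1 & rating_prev < 3, na.rm = TRUE),
    .groups = "drop"
  )
```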
We analyzed the effect of exploring, exploiting, and pruning behaviors on exam performance (Equation 3). Consistent with our hypotheses, the extent to which students engaged in each of these resource management behaviors positively predicted their exam performance. This was true for exploration (b = 0.85, [0.52, 1.18], p < .001), exploitation (b = 0.91, [0.74, 1.08], p < .001), and pruning (b = 0.75, [0.03, 1.48], p = .042). Table 3 presents the full statistical results. Following recent recommendations in psychological science and statistics to move away from null hypothesis significance testing (e.g., Wasserstein, Schirm, & Lazar, 2019) and toward an “estimation” framework (e.g., Cumming, 2014), we provide p values for completeness but focus on interpreting the effect sizes, which are directly interpretable in terms of exam performance. Exploring one new resource was associated with an average increase of 0.85 percentage points in students’ performance on the current exam; exploiting one additional resource that was considered useful on the previous exam was associated with an average increase of 0.91 percentage points; and pruning one additional resource that was found to be useless on the previous exam was associated with an average increase of 0.75 percentage points.
Figure 2 visually illustrates how empirically observed combinations of exploration, exploitation, and pruning related to changes in students’ exam performance. We observed that greater resource management was associated with larger changes in students’ performance on subsequent exams: Starting from the origin and moving out along each of the three axes, as learners report practicing more exploration, exploitation, and pruning, their exam performance improves. Our findings underscore the adaptive, strategic nature of learners’ decisions to explore new resources, exploit previously useful resources, and prune previously useless resources from one exam to the next.
In addition, we replicated our results when controlling for student engagement with the course resources (i.e., the total number of resources they used). Controlling for the number of resources students reported using at the beginning of the course (before Exam 1) as a proxy for their course engagement, as well as prior performance (i.e., adding the total number of resources initially used as an additional covariate to Equation 3), we find that greater exploration (b = 0.93 [0.59, 1.28], p < .001), exploitation (b = 0.85 [0.65, 1.05], p < .001), and pruning (b = 0.68 [-0.06, 1.43], p = .072) between exams still predicted students’ subsequent exam performance. The effect sizes in this analysis were similar in magnitude, although the coefficient on pruning was no longer statistically significant at the .05 level. This replication suggests that the strategic resource management behaviors of exploring, exploiting, and pruning offer predictive value above and beyond a proxy for students’ sheer use of more course resources.
lmer(currentScorePercent ~ sum_explore + sum_exploit + sum_prune + exam1_sumres + pastScorePercent + (1|ID:Year) + (1|examNum:Year) + (1|Year))
(i.e., we added the sum of resources used on Exam 1 as an additional covariate)
Finally, our definition of exploration only required that students had not used a resource on the preceding exam; it did not consider whether they had used it on any earlier exam. We therefore repeated all our analyses using a stricter definition of exploration: trying a resource that students had not used on any previous exam, rather than just the preceding one. Our results replicated, and the effect size for exploration was, in fact, stronger (b = 1.18 [0.78, 1.57], p < .001). However, we chose to retain our current (conservative) operationalization, using only the previous exam’s (non-)use, to be consistent with how we operationalized exploiting and pruning. We note that exploitation and pruning are theoretically well defined even using only the previous exam.
lmer(currentScorePercent ~ sum_explore_STRICT + sum_exploit + sum_prune + pastScorePercent + (1|ID:Year) + (1|examNum:Year) + (1|Year))
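A sketch of how the stricter exploration count could be computed, using the same hypothetical `res` data frame as above: a resource counts as explored on exam t only if it was not used on any earlier exam.

```r
# Sketch: strict exploration = first-ever use of a resource on the current exam.
# cummax(used) is 0 until a resource's first use and 1 afterward; lagging it
# tells us whether the resource had been used on ANY earlier exam.
library(dplyr)

strict_counts <- res %>%
  arrange(ID, Year, resource, examNum) %>%
  group_by(ID, Year, resource) %>%
  mutate(ever_used_before = lag(cummax(used), default = 0)) %>%
  ungroup() %>%
  filter(examNum > 1) %>%
  group_by(ID, Year, examNum) %>%
  summarise(
    sum_explore_STRICT = sum(used == 1 & ever_used_before == 0, na.rm = TRUE),
    .groups = "drop"
  )
```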
Demographic | Year 1 | Year 2 | Year 3 | Year 4 |
---|---|---|---|---|
Male | 583 | 499 | 637 | 582 |
Female | 587 | 572 | 708 | 616 |
Gender: Not Reported | 0 | 0 | 2 | 3 |
Asian | 220 | 212 | 258 | 236 |
Black | 64 | 39 | 63 | 52 |
Hispanic | 43 | 38 | 58 | 36 |
Native American | 16 | 8 | 8 | 11 |
White | 676 | 651 | 791 | 726 |
Race: Not Reported | 151 | 123 | 169 | 140 |
 | Current: Usefulness 5 | Current: Usefulness 4 | Current: Usefulness 3 | Current: Usefulness 2 | Current: Usefulness 1 | Current: Didn’t use | Current: All |
---|---|---|---|---|---|---|---|
Past: Usefulness 5 | 18053 | 5172 | 706 | 123 | 339 | 1467 | 25860 |
Past: Usefulness 4 | 5834 | 12362 | 3032 | 432 | 195 | 3713 | 25568 |
Past: Usefulness 3 | 812 | 2828 | 2885 | 547 | 117 | 3217 | 10406 |
Past: Usefulness 2 | 122 | 343 | 474 | 327 | 97 | 803 | 2166 |
Past: Usefulness 1 | 255 | 139 | 83 | 67 | 106 | 291 | 941 |
Past: Didn’t use | 1550 | 2367 | 1387 | 267 | 155 | 31319 | 37045 |
Past: All | 26626 | 23211 | 8567 | 1763 | 1009 | 40810 | 101986 |
Table 2. Descriptive Frequencies of Resource Use on the Prior Exam (Exams 1 and 2) and Percentages of Resources Explored, Exploited, and Pruned Out of Those Possible (on the Subsequent Exams 2 and 3, Respectively), Aggregating Across Cohorts

 | Exam 2 | Exam 3 | Both |
---|---|---|---|
Average number of resources that were not used on the prior exam | 4.13 | 4.62 | 4.39 |
Of these, number explored on current exam | 0.706 | 0.644 | 0.686 |
Percentage Explored | 16.9 | 14 | 16 |
Average number of resources that were rated useful (>3) on prior exam | 6.17 | 6 | 6.07 |
Of these, number exploited on current exam | 5.54 | 5.41 | 5.46 |
Percentage Exploited | 90.1 | 90.4 | 90.1 |
Average number of resources that were rated useless (<3) on prior exam | 0.395 | 0.34 | 0.37 |
Of these, number pruned on current exam | 0.15 | 0.108 | 0.129 |
Percentage Pruned | 40.8 | 37.9 | 42.3 |

Note. The numbers of resources reflect the mean numbers per student, averaged across all students per exam.
Table S3. Means (standard deviations) of percentage exam scores, by exam and cohort.

Exam | Year 1 | Year 2 | Year 3 | Year 4 |
---|---|---|---|---|
Exam 1 | 65.4 (15.75) | 67.3 (18.83) | 67.14 (15.24) | 76.76 (14.5) |
Exam 2 | 70.6 (16.74) | 73.06 (18.2) | 74.71 (14.6) | 68.49 (17.42) |
Exam 3 | 62.36 (21.53) | 55.41 (23.83) | 59.38 (22.06) | 51.58 (20.17) |
## R version 4.0.1 (2020-06-06)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] plyr_1.8.6 plotly_4.9.2.1 doBy_4.6.6 MuMIn_1.43.17 pander_0.6.3
## [6] lme4_1.1-23 Matrix_1.2-18 reshape2_1.4.4 corrplot_0.84 ggplot2_3.3.1
## [11] tidyr_1.1.0 dplyr_1.0.0
##
## loaded via a namespace (and not attached):
## [1] statmod_1.4.34 tidyselect_1.1.0 xfun_0.14
## [4] purrr_0.3.4 splines_4.0.1 lmerTest_3.1-2
## [7] lattice_0.20-41 colorspace_1.4-1 vctrs_0.3.1
## [10] generics_0.0.2 htmltools_0.4.0 stats4_4.0.1
## [13] viridisLite_0.3.0 yaml_2.2.1 rlang_0.4.6
## [16] pillar_1.4.4 nloptr_1.2.2.1 glue_1.4.1
## [19] withr_2.2.0 lifecycle_0.2.0 stringr_1.4.0
## [22] munsell_0.5.0 gtable_0.3.0 htmlwidgets_1.5.1
## [25] evaluate_0.14 labeling_0.3 knitr_1.28
## [28] crosstalk_1.1.0.1 broom_0.5.6 Rcpp_1.0.4.6
## [31] scales_1.1.1 backports_1.1.7 jsonlite_1.6.1
## [34] farver_2.0.3 Deriv_4.0 digest_0.6.25
## [37] stringi_1.4.6 numDeriv_2016.8-1.1 grid_4.0.1
## [40] tools_4.0.1 magrittr_1.5 lazyeval_0.2.2
## [43] tibble_3.0.1 crayon_1.3.4 pkgconfig_2.0.3
## [46] ellipsis_0.3.1 MASS_7.3-51.6 data.table_1.12.8
## [49] minqa_1.2.4 rmarkdown_2.2 httr_1.4.1
## [52] R6_2.4.1 boot_1.3-25 nlme_3.1-148
## [55] compiler_4.0.1