Introduction

This R Markdown file reproduces all the analyses, tables and figures produced in:

Chen, P., Ong, D. C., Ng, J., & Coppola, B. (in press). Explore, Exploit, and Prune in the Classroom: Strategic Resource Management Behaviors Predict Performance. AERA Open.

Methods

Table 1: Numbers and Percentages of Students who Participated Across the 4 Consecutive Cohorts

Response rates Year1 Year2 Year3 Year4
Enrolled in class 1172 1336 1438 1392
Exam 1 1136 (96.93%) 992 (74.25%) 1265 (87.97%) 1064 (76.44%)
Exam 2 1119 (95.48%) 994 (74.4%) 1300 (90.4%) 1097 (78.81%)
Exam 3 1123 (95.82%) 907 (67.89%) 1287 (89.5%) 1105 (79.38%)
At least one exam survey 1170 (99.83%) 1071 (80.16%) 1347 (93.67%) 1201 (86.28%)
All three exam surveys 1057 (90.19%) 853 (63.85%) 1194 (83.03%) 940 (67.53%)

Note that Table 2 (Descriptive Frequencies of Resource Use on the Prior Exam (Exams 1 and 2) and Percentages of Resources Explored, Exploited, and Pruned Out of Those Possible (on the Subsequent Exams 2 and 3, Respectively), Aggregating Across Cohorts) is found later in the file due to the “flow” of the code.

Interleaved Methods and Results

Resource use over time.

To first analyze how students’ resource use changed over time, we fit a simple linear model predicting the number of resources used at a given time point, with a linear coefficient on time, treated as an integer-valued variable (t = 1, 2, 3). We applied the same model to analyze how students’ reported mean usefulness ratings changed over time. Both models were estimated with random intercepts for individual students nested within cohort (year) and for cohort (year):

\(\text{NumResourcesUsed}_{i,t,y} = b_0 + b_1 t + u_{i,y} + u_y + \epsilon_{i,t,y}\)

\(\text{MeanUsefulness}_{i,t,y} = b_0 + b_1 t + u_{i,y} + u_y + \epsilon_{i,t,y}\)

where \(u_{i,y}\) is the student-specific random intercept nested within year, \(u_y\) is a random intercept for year, and \(\epsilon_{i,t,y}\) is the residual error term. In R syntax, the models are:

lmer(numResUsed ~ examNum + (1|ID:Year) + (1|Year))

lmer(meanUsefulness ~ examNum + (1|ID:Year) + (1|Year))
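As a runnable illustration (a hedged sketch, not the authors’ exact chunk), these two models could be fit with lme4/lmerTest as follows; the data frame name dat and the assumption of one row per student per exam are ours:

# Hedged sketch: fit the two linear-trend models described above.
# Assumes a hypothetical long data frame `dat` with one row per student per exam
# and columns numResUsed, meanUsefulness, examNum (1-3), ID, and Year.
library(lme4)
library(lmerTest)  # adds Satterthwaite df and p values to lmer summaries

m_res  <- lmer(numResUsed     ~ examNum + (1 | ID:Year) + (1 | Year), data = dat)
m_usef <- lmer(meanUsefulness ~ examNum + (1 | ID:Year) + (1 | Year), data = dat)

summary(m_res)                   # fixed effect of examNum = linear trend in resource use
confint(m_res, method = "Wald")  # approximate 95% CIs for the fixed effects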

Results: Aggregate Patterns of Resource Use and Usefulness

We begin by describing how students in our study interacted with their resources. Students showed a decreasing linear trend in the number of resources that they used to study for each of their three exams (Equation 2a). Across all four cohorts, students started off using an average of 7.9 (standard deviation, SD = 1.7) resources to study for Exam 1, 7.4 (SD = 1.7) resources for Exam 2, and 7.0 (SD = 1.7) resources for Exam 3, linear trend b = -.44, 95% CI = [-.47, -.42], p < .001. We illustrate students’ use and usefulness ratings for the 12 kinds of resources in Figure 1, aggregated across cohorts and categorized by exam. On average, some resources (e.g., attending the lecture and using the coursepack) were used more than others (e.g., lecture podcasts, the Science Learning Center); not surprisingly, these tended to also be the resources that a large proportion of students rated as “extremely useful” for their learning.

While this decreasing trend in the number of resources used could be interpreted as decreasing motivation, our evidence suggests otherwise: Over the same period, students’ mean ratings of how useful their resources had been exhibited a positive linear trend, increasing from 4.09 (SD = 0.473) on Exam 1, to 4.17 (SD = 0.529) on Exam 2, and 4.21 (SD = 0.58) on Exam 3, linear trend b = .06, [.05, .07], p < .001 (Equation 2b). A score of 4 on our 5-point scale corresponds to “useful”. We inferred that, rather than necessarily being less invested, students were, on average, possibly becoming more focused and effective in their resource use over time. Evidence from our cognitive interviews with a randomly selected sample of students in the class who were not part of this study (see SOM “Survey Validation” for details) supports the idea that some students were strategically changing their resource use over the course of the class. For example, one student shared that, “By exam 3 I wasn’t using [textbook problems]. I used those for the first exam, but found them not as helpful, so I stopped using those.” Next, we turn to how students increased their resource-use effectiveness by managing their resource use wisely from exam to exam.

[Summary statistics and model estimates:]

  • Resource usage across all 4 years:
    • usage at examNum 1: M=7.86, SD=1.72,
    • usage at examNum 2: M=7.37, SD=1.69,
    • usage at examNum 3: M=6.99, SD=1.71,
    • b = -0.437 [-0.505, -0.37], t(3) = -12.7, p = 0.00112;
  • Mean usefulness across all 4 years
    • usefulness at examNum 1: M=4.09, SD=0.473,
    • usefulness at examNum 2: M=4.17, SD=0.529,
    • usefulness at examNum 3: M=4.21, SD=0.58,
    • b = 0.0599 [0.0507, 0.069], t(8842) = 12.8, p<.001;
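The per-exam means and standard deviations above could be tallied along the following lines (a hedged sketch reusing the hypothetical dat from the previous chunk):

library(dplyr)

# Mean and SD of resource use and usefulness ratings at each exam
dat %>%
  group_by(examNum) %>%
  summarise(M_used  = mean(numResUsed,     na.rm = TRUE),
            SD_used = sd(numResUsed,       na.rm = TRUE),
            M_usef  = mean(meanUsefulness, na.rm = TRUE),
            SD_usef = sd(meanUsefulness,   na.rm = TRUE))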

Predicting Exam Performance

Ultimately, we were interested in whether students’ resource management behaviors between exams were associated with their exam performance. In a mixed-effects linear model, we regressed students’ current exam performance on their reported exploration, exploitation, and pruning behaviors, controlling for performance on the prior exam; these predictors entered as fixed effects. We added random intercepts for student nested within year, for time point (exam) nested within year, and for year. Exam scores, the dependent variable, were converted into percentage scores out of 100 for all exams, so effect sizes (unstandardized b coefficients) can be interpreted in units of percentage points. Means and standard deviations of the three class exam scores are presented in Table S3. Thus, for student \(i\) at time \(t\) in year \(y\), we estimate the following model:

\(\text{Exam}_{i,t,y} = b_0 + b_1 \text{NumExplore}_{i,t,y} + b_2 \text{NumExploit}_{i,t,y} + b_3 \text{NumPrune}_{i,t,y} + b_4 \text{Exam}_{i,t-1,y} + u_{i,y} + u_{t,y} + u_y + \epsilon_{i,t,y}\)

where \(\text{Exam}_{i,t,y}\) denotes the exam score of student \(i\) at time \(t\) in year \(y\), \(u_{i,y}\) is the random intercept of student nested within year, \(u_{t,y}\) is the random intercept of exam nested within year, and \(u_y\) is the random intercept by year. Note that controlling for exam performance at the previous time point is conservative and allows us to test whether the resource regulatory behaviors (performed between exams) explain exam performance over and above prior performance. In R syntax, this model is:

lmer(currentScorePercent ~ sum_explore + sum_exploit + sum_prune + pastScorePercent + (1|ID:Year) + (1|examNum:Year) + (1|Year))

(Note: the figure produced by this chunk is not included in the paper, as the information already appears in Table 3.)
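As a hedged, runnable sketch (not the authors’ exact chunk), this model could be fit and summarized as follows, assuming the hypothetical dat also contains the per-student, per-exam counts sum_explore, sum_exploit, and sum_prune, plus currentScorePercent and pastScorePercent on a 0-100 scale:

library(lmerTest)

# Regress current exam score on the three resource management counts,
# controlling for prior exam score, with the three random intercepts above.
m_perf <- lmer(currentScorePercent ~ sum_explore + sum_exploit + sum_prune +
                 pastScorePercent +
                 (1 | ID:Year) + (1 | examNum:Year) + (1 | Year),
               data = dat)

summary(m_perf)                   # b coefficients are in percentage points
confint(m_perf, method = "Wald")  # approximate 95% CIs for the fixed effects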

Results: Predicting Exam Performance

We analyzed the effect of exploring, exploiting, and pruning behaviors on exam performance (Equation 3). Consistent with our hypotheses, the extent to which students engaged in each of these resource management behaviors positively predicted their exam performance. This was true for exploration (b = 0.85, [0.52, 1.18], p < .001), exploitation (b = 0.91, [0.74, 1.08], p < .001), and pruning (b = 0.75, [0.03, 1.48], p = .042). Table 3 presents the full statistical results. Following recent recommendations in psychological science and statistics to move away from Null Hypothesis Significance Testing (e.g., Wasserstein, Schirm, & Lazar, 2019) and towards an “estimation” framework (e.g., Cumming, 2014), we provide p values for completeness but focus on the effect sizes, which can be interpreted directly in terms of exam performance. Exploring one new resource was associated with an average increase of 0.85 percentage points in students’ performance on the current exam; exploiting one additional resource that was considered useful on the previous exam was associated with an average increase of 0.91 percentage points; and pruning one additional resource that was found to be useless on the previous exam was associated with an average increase of 0.75 percentage points.

Figure 2 visually illustrates how empirically observed combinations of exploration, exploitation, and pruning related to changes in students’ exam performance. We observed that greater resource management was associated with larger changes in students’ performance on subsequent exams: Starting from the origin and moving out along each of the three axes, as learners report practicing more exploration, exploitation, and pruning, we see that their exam performance improves. Our findings underscore the adaptive, strategic nature of learners’ decisions to explore new resources, exploit previously useful resources, and prune previously useless resources from one exam to the next.

Table 3

  • Explore: b = 0.848 [0.515, 1.18], t(8486) = 4.99, p<.001;
  • Exploit: b = 0.911 [0.741, 1.08], t(8486) = 10.5, p<.001;
  • Prune: b = 0.752 [0.0271, 1.48], t(8486) = 2.03, p = 0.042;
  • pastScorePercent: b = 0.824 [0.806, 0.843], t(8488) = 86.6, p<.001;
  • Conditional R^2 after Nakagawa & Schielzeth (2013): 0.6011472
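The conditional R^2 follows Nakagawa & Schielzeth (2013); with the MuMIn package (among the attached packages listed in the session information), it could be reproduced roughly as follows, reusing the hypothetical m_perf object from the sketch above:

library(MuMIn)

# Returns marginal R^2 (fixed effects only) and conditional R^2
# (fixed + random effects) for the fitted mixed model.
r.squaredGLMM(m_perf)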

Robustness Check 1: Controlling for initial sum of resources used, as a proxy for engagement

In addition, our results replicated when controlling for students’ engagement with the course resources (i.e., the total number of resources they used). Controlling for the number of resources students reported using at the beginning of the course (before Exam 1) as a proxy for course engagement, in addition to prior performance (i.e., adding the total number of resources initially used as an additional covariate to Equation 3), we find that greater exploration (b = 0.93 [0.59, 1.28], p < .001), exploitation (b = 0.85 [0.65, 1.05], p < .001), and pruning (b = 0.68 [-0.06, 1.43], p = .072) between exams still predicted students’ subsequent exam performance. The effect sizes were similar in magnitude to those in the main analysis, although the coefficient on pruning was no longer statistically significant at the .05 level. This suggests that the strategic resource management behaviors of exploring, exploiting, and pruning offer predictive value above and beyond a proxy for students’ sheer use of more course resources.

lmer(currentScorePercent ~ sum_explore + sum_exploit + sum_prune + exam1_sumres + pastScorePercent + (1|ID:Year) + (1|examNum:Year) + (1|Year))

(i.e., we added each student’s total number of resources used for Exam 1 as an additional covariate)

  • Explore: b = 0.934 [0.592, 1.28], t(8258) = 5.35, p<.001;
  • Exploit: b = 0.85 [0.652, 1.05], t(8258) = 8.42, p<.001;
  • Prune: b = 0.684 [-0.0599, 1.43], t(8258) = 1.8, p = 0.0716;
  • pastScorePercent: b = 0.82 [0.801, 0.838], t(8260) = 84.7, p<.001;
  • exam1_sumres: b = 0.158 [-0.0481, 0.364], t(8258) = 1.5, p = 0.133;
  • Conditional R^2 after Nakagawa & Schielzeth (2013): 0.5985226

Robustness Check 2: Analyses using strict explore

Finally, our definition of exploration only required that students had not used a resource on the immediately preceding exam, not that they had never used it on any previous exam. We repeated all our analyses using a stricter definition of exploration: trying a resource that students had not used on any previous exam, rather than just the preceding one. Our results replicated, and the effect size on explore was, in fact, stronger (b = 1.18 [0.78, 1.57], p < .001). However, we choose to retain our current (conservative) operationalization, using only the previous exam’s (non-)use, to be consistent with how we operationalized exploiting and pruning. We note that exploitation and pruning are theoretically well defined even when using only the previous exam.

lmer(currentScorePercent ~ sum_explore_STRICT + sum_exploit + sum_prune + pastScorePercent + (1|ID:Year) + (1|examNum:Year) + (1|Year))

  • Explore (strict): b = 1.18 [0.784, 1.57], t(8488) = 5.86, p<.001;
  • Exploit: b = 0.911 [0.742, 1.08], t(8486) = 10.6, p<.001;
  • Prune: b = 0.75 [0.0265, 1.47], t(8486) = 2.03, p = 0.0422;
  • pastScorePercent: b = 0.824 [0.806, 0.843], t(8488) = 86.9, p<.001;
  • Conditional R^2 after Nakagawa & Schielzeth (2013): 0.5977108
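How a strict explore count might be constructed is sketched below; the long data frame resLong (one row per student, year, resource, and exam, with a logical used column) is an assumed layout, not necessarily the authors’ actual data structure:

library(dplyr)

resLong %>%
  arrange(ID, Year, resource, examNum) %>%
  group_by(ID, Year, resource) %>%
  mutate(used_any_before = lag(cumsum(used), default = 0) > 0,  # used on ANY earlier exam?
         used_prev       = lag(used, default = FALSE)) %>%      # used on the preceding exam?
  ungroup() %>%
  filter(examNum > 1) %>%
  mutate(explore_strict  = used & !used_any_before,   # strict: never used on any prior exam
         explore_lenient = used & !used_prev) %>%     # lenient: not used on the prior exam
  group_by(ID, Year, examNum) %>%
  summarise(sum_explore_STRICT = sum(explore_strict),
            sum_explore        = sum(explore_lenient),
            .groups = "drop")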

Figures

Figure 1: Students’ Use and Usefulness Ratings for the 12 Kinds of Resources, Aggregated Across Cohorts and Categorized by Exam

Figure 2: Graphical Representation of the Frequency of Exploring, Exploiting and Pruning in Relation to Changes in Performance

Supplemental Figures and Tables

Table S1: Demographics of Students Who Participated in At Least One of Our Surveys Across the 4 Cohorts.

  Year1 Year2 Year3 Year4
Male 583 499 637 582
Female 587 572 708 616
Gender: Not Reported 0 0 2 3
Asian 220 212 258 236
Black 64 39 63 52
Hispanic 43 38 58 36
Native American 16 8 8 11
White 676 651 791 726
Race: Not Reported 151 123 169 140

Table S2: Breakdown of resource usefulness ratings, comparing past (prior-exam) usefulness and current usefulness. These numbers are collapsed across all 4 cohorts and across the two past-to-current exam transitions in the class.

Past rating Current: 5 Current: 4 Current: 3 Current: 2 Current: 1 Current: Didn’t use All
Past: Usefulness 5 18053 5172 706 123 339 1467 25860
Past: Usefulness 4 5834 12362 3032 432 195 3713 25568
Past: Usefulness 3 812 2828 2885 547 117 3217 10406
Past: Usefulness 2 122 343 474 327 97 803 2166
Past: Usefulness 1 255 139 83 67 106 291 941
Past: Didn’t use 1550 2367 1387 267 155 31319 37045
Past: All 26626 23211 8567 1763 1009 40810 101986
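A cross-tabulation like Table S2 could be produced along these lines (a hedged sketch; the column prev_usefulness, with 0 coded as “didn’t use”, is our assumption, not a variable documented in the paper):

# Counts of current usefulness ratings (columns) by prior-exam rating (rows),
# with row and column totals added.
addmargins(table(Past = resLong$prev_usefulness, Current = resLong$usefulness))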

Table 2: Descriptive frequencies of resource use on the prior exam (Exams 1 and 2) and the percentages of those resources explored, exploited, and pruned out of those possible on the subsequent Exams 2 and 3, respectively, collapsing across cohorts.

Note. The numbers of resources reflect the mean numbers per student, averaged across all students per exam.

  Exam 2 Exam 3 Both
Average number of resources that were not used on the prior exam 4.13 4.62 4.39
Of these, number explored on current exam 0.706 0.644 0.686
Percentage Explored 16.9 14 16
Average number of resources that were rated useful (>3) on prior exam 6.17 6 6.07
Of these, number exploited on current exam 5.54 5.41 5.46
Percentage Exploited 90.1 90.4 90.1
Average number of resources that were rated useless (<3) on prior exam 0.395 0.34 0.37
Of these, number pruned on current exam 0.15 0.108 0.129
Percentage Pruned 40.8 37.9 42.3
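The counts in Table 2 reflect the paper’s operational definitions of exploring (using a resource not used on the prior exam), exploiting (continuing to use a resource rated useful, > 3, on the prior exam), and pruning (dropping a resource rated useless, < 3, on the prior exam). A hedged sketch of how the three per-student counts might be derived, again assuming the hypothetical resLong layout with a usefulness column (1-5, NA when a resource was not used):

library(dplyr)

resLong %>%
  arrange(ID, Year, resource, examNum) %>%
  group_by(ID, Year, resource) %>%
  mutate(prev_used   = lag(used),
         prev_useful = lag(usefulness)) %>%
  ungroup() %>%
  filter(examNum > 1) %>%   # behaviors are defined relative to the prior exam
  mutate(explore = used  & !prev_used,                       # try a resource not used last exam
         exploit = used  & prev_used & prev_useful > 3,      # keep a previously useful resource
         prune   = !used & prev_used & prev_useful < 3) %>%  # drop a previously useless resource
  group_by(ID, Year, examNum) %>%
  summarise(sum_explore = sum(explore, na.rm = TRUE),
            sum_exploit = sum(exploit, na.rm = TRUE),
            sum_prune   = sum(prune,   na.rm = TRUE),
            .groups = "drop")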

Table S3: Means (and Standard Deviations) of Exam Scores

Exam Year1 Year2 Year3 Year4
Exam 1 65.4 (15.75) 67.3 (18.83) 67.14 (15.24) 76.76 (14.5)
Exam 2 70.6 (16.74) 73.06 (18.2) 74.71 (14.6) 68.49 (17.42)
Exam 3 62.36 (21.53) 55.41 (23.83) 59.38 (22.06) 51.58 (20.17)

Session Information

## R version 4.0.1 (2020-06-06)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] plyr_1.8.6     plotly_4.9.2.1 doBy_4.6.6     MuMIn_1.43.17  pander_0.6.3  
##  [6] lme4_1.1-23    Matrix_1.2-18  reshape2_1.4.4 corrplot_0.84  ggplot2_3.3.1 
## [11] tidyr_1.1.0    dplyr_1.0.0   
## 
## loaded via a namespace (and not attached):
##  [1] statmod_1.4.34      tidyselect_1.1.0    xfun_0.14          
##  [4] purrr_0.3.4         splines_4.0.1       lmerTest_3.1-2     
##  [7] lattice_0.20-41     colorspace_1.4-1    vctrs_0.3.1        
## [10] generics_0.0.2      htmltools_0.4.0     stats4_4.0.1       
## [13] viridisLite_0.3.0   yaml_2.2.1          rlang_0.4.6        
## [16] pillar_1.4.4        nloptr_1.2.2.1      glue_1.4.1         
## [19] withr_2.2.0         lifecycle_0.2.0     stringr_1.4.0      
## [22] munsell_0.5.0       gtable_0.3.0        htmlwidgets_1.5.1  
## [25] evaluate_0.14       labeling_0.3        knitr_1.28         
## [28] crosstalk_1.1.0.1   broom_0.5.6         Rcpp_1.0.4.6       
## [31] scales_1.1.1        backports_1.1.7     jsonlite_1.6.1     
## [34] farver_2.0.3        Deriv_4.0           digest_0.6.25      
## [37] stringi_1.4.6       numDeriv_2016.8-1.1 grid_4.0.1         
## [40] tools_4.0.1         magrittr_1.5        lazyeval_0.2.2     
## [43] tibble_3.0.1        crayon_1.3.4        pkgconfig_2.0.3    
## [46] ellipsis_0.3.1      MASS_7.3-51.6       data.table_1.12.8  
## [49] minqa_1.2.4         rmarkdown_2.2       httr_1.4.1         
## [52] R6_2.4.1            boot_1.3-25         nlme_3.1-148       
## [55] compiler_4.0.1