Literature Review

Measuring with Tests

The focus of this research project is how we can effectively measure the success of a project-based middle school. The research will emphasize including student voice in determining what metrics belong in an accountability framework. For the last fifteen years, accountability systems have focused on using standardized tests to measure school success. As Diane Ravitch stated, “Faced with this lack of consensus, policy makers define good education as higher test scores” (Lindgren, 2009 found in Ritchhart, 2012). Fueled by accountability measures in No Child Left Behind, by 2010 all fifty states were using standards-referenced tests to measure student achievement (Reeves, 2010, p. 305). As Farrington (2012) explains:

Education policy makers have attempted to ensure students’ qualifications for college
by ratcheting up academic demands through more rigorous high school graduation
requirements, increasing participation in advanced coursework, and raising standards
within courses. Test-based accountability measures have been enacted with the
intention of holding schools accountable for reaching these higher standards (p. 3).

In order to hold schools accountable for preparing students for success in higher education, policy makers have relied on tests of standards vertically aligned to reach that goal. The recent Common Core State Standards, though a departure in many ways from the previous set of standards, are based on that same framework. For example, the introduction to the CCSS literacy standards explains that the goal of the standards is to “document and define general, cross-disciplinary literacy expectations that must be met for students to be prepared to enter college and workforce training programs ready to succeed” (Common Core, 2014). Yet despite the energy behind the standards and testing approach to education, Farrington (2012) contends that “there is little to no rigorous evidence that efforts to increase standards and require higher levels of coursework – in and of themselves – are likely to lead many more students to complete high school and attain college degrees” (p. 3).

The claim that raising standards, and adding the tests that accompany them, does not lead to greater success in college and career is contentious. Many thought leaders subscribe to the theory articulated by Porter-Magee and Borgioli in their 2013 article critiquing anti-test backlash. Porter-Magee and Borgioli (2013) argue that standardized tests are an effective tool for evaluating schools. The authors assert that within the goals for education “there is real content that students need to master; there are questions that have right and wrong answers; and there are many skills that can be evaluated using well-crafted standardized tests, including even the multiple-choice kind” (Porter-Magee & Borgioli, 2013). In addition, they point to evidence that states which set rigorous standards and strong accountability measures, such as Massachusetts, have seen the greatest gains for all students (Porter-Magee & Borgioli, 2013).

In fact, the movement towards more tests has not yielded changes in some of the most reliable measures of student achievement. Though some metrics of student performance have shown growth over the years, on the National Assessment of Educational Progress “8th- and 12th-grade scores have been largely flat” (Darling-Hammond et al., 2014, p. 2) and on the Programme for International Student Assessment “U.S. performance has declined in math, reading, and science between 2000 and 2012, both absolutely and in relation to other countries” (Darling-Hammond et al., 2014, p. 2). Tucker (2014) contends that there is no evidence that standardized testing has led to improved student learning, even for the low-income and minority students it sought to serve. Conversely, Feng (2010) argues that accountability systems based on testing “tend to improve the outcomes of low‐performing students” (Chakrabarti 2007; Chiang 2009; Figlio and Rouse 2006; Jacob 2005; Rouse et al. 2007; West and Peterson 2006 found in Feng 2010). However, Chakrabarti (2007) argues that some of those gains in standardized test scores may be the result of schools engaging in what she describes as “strategic behavior with questionable educational benefit” (p. 2), such as reclassifying low-achieving groups of students as learning disabled or even suspending students on testing days to reduce the impact of their scores on the school’s results. In addition, Figlio and Loeb’s (2011) meta-analysis of research on student learning and standardized test-based school accountability found that those systems did not result in higher student achievement as measured by metrics not related to the tests themselves. Student scores did increase, however, on the standardized tests themselves.

Though there is much conflicting evidence, Figlio and Loeb’s (2011) overall conclusion was that school accountability systems “might not generate higher achievement” (p. 397). This conclusion is particularly striking because of its ambivalence. Figlio and Loeb synthesized dozens of studies on the impact of standardized tests and could not establish that they generated higher achievement. Still, students and teachers spend their hours and taxpayers spend their money implementing these accountability systems across the country. In addition, this result highlights the difficulty of measuring the complex goals of education as well as the limitations of doing so via standardized tests.

Though overall there are mixed reviews of the relationship between standardized tests and student learning, one clearly defined limitation of these tests is their inability to accurately predict future college and career outcomes. Farrington (2012) suggests that standardized tests are not indicators of eventual outcomes in college, career, or civic participation. This misalignment is exacerbated by the fact that the tools students need for eventual success are dramatically changing. Though there is debate regarding to what extent the 21st century requires a new set of skills for students, most researchers agree with Rotherham and Willingham’s (2009) claim that changes in our economy and the world mean that collective and individual success depends on having a different set of skills and competencies than in past generations.

This reality means that schools need to adapt their goals for students to meet the demands of the 21st century. This call for a restructuring of schools to align with the demands of the 21st century echoes the findings of the New Commission on the Skills of the American Workforce, a group of business leaders, governors, school chancellors, and former secretaries of Labor and Education. The Commission released a report in 2006 entitled Tough Choices or Tough Times that stated, “It is a world in which comfort with ideas and abstractions is the passport to a good job, in which creativity and innovation are the key to the good life, in which high levels of education—a very different kind of education than most of us have had—are going to be the only security there is” (found in Silva, 2008, p. 8). Though there are many different frameworks for what skills and competencies students will require in the 21st century, Voogt and Roblin’s (2012) recent review of those frameworks found that a common set of competencies was present. These competencies included collaboration, communication, literacy, and social/cultural competencies including citizenship. In addition, most frameworks also emphasized creativity, critical thinking, productivity, and problem-solving (p. 315). Considering these frameworks for what matters in the 21st century, education thought leader Tony Wagner contends, “we need ‘Accountability 2.0’” (Friedman, 2013).

In addition, a significant branch of research contends that standardized tests are not only poor predictors of eventual student outcomes in the 21st century but also have negative impacts on schools. Gawlik (2012) found that “charter school teachers find high-stakes testing extremely stressful and believe it negatively affects their students” (p. 217). Gawlik (2012) attributes those negative effects to reduced teacher autonomy that leads to decreased motivation and a curriculum that is narrowed to align with tested goals (p. 217). Newell (2002) summarizes this stance by stating, “For students tests usually measure only a limited part of a subject area, do not cover a broad range of abilities, rely too heavily on memorized facts and procedures, and fail to emphasize thinking and application of knowledge” (p. 1). Newell’s (2002) analysis again highlights the disconnect between standardized tests and the competencies necessary for success in the 21st century. This misalignment acutely affects educators who, as Gawlik’s (2012) research demonstrates, feel these tests have a negative effect on teaching and learning.

Some educators dismiss these concerns, stating that as tests evolve with new technology or in response to new initiatives like the Common Core State Standards, they will become better indicators of student success. This could certainly be a valid claim. The next generation of assessments, such as PARCC and Smarter Balanced, have only recently piloted their exams and will begin full implementation this year. Indeed, a great deal of effort has been put forth by both these consortia to ensure their assessments are aligned with the skills necessary for college and career success. In addition, as Silva (2009) outlines, there are several innovative models of testing, such as the College and Work Readiness Assessment (CWRA), River City, Powersource, and the International Baccalaureate’s multiple performance assessments, that harness the power of adaptive and virtual technology as well as the open-ended response to assess competencies such as critical thinking and complex decision-making. For example, in a 2008 report Silva describes how the CWRA requires students to show proficiency in “more sophisticated skills like evaluating and analyzing information and thinking creatively about how to apply information to real-world problems” (p. 7). Silva’s analysis of new models of testing challenges the assumption that advanced skills aligned with the demands of the 21st century cannot be measured by standardized tests (Silva, 2008, p. 12). This same sentiment is echoed by Pellegrino (2006) in his prediction that technology will allow for assessments that are truly aligned with the cognitive skills and competencies demanded by the 21st century.

New tests, however, do not offer a simple solution to the question of how schools should measure their success. As Rotherham and Willingham (2009) suggest, these tests are still in their infancy. Even Silva, a proponent of these assessments, concedes new benchmarks are tied up with problems of cost and reliability. These issues make scaling new types of assessments difficult (Silva, 2009). In addition, there are still those who argue that it will not be possible to truly measure 21st century competencies using standardized tests. As Reeves (2010) explains:

Developing better tests of student learning in the 21st century is as futile as attempting to find a faster horse and buggy would have been in the 20th century. No amount of training or discipline would make the horse competitive with the automobile, airplane, or space shuttle. The nature of the horse makes such competition impossible. Similarly, the nature of testing - with its standardized conditions, secrecy, and individual results - is antithetical to the understanding, exploration, and creativity that are hallmarks of a new framework for assessment (p. 306).

Given the difficulty of aligning tests with 21st century demands in a scalable, valid, and timely way, it does not seem wise to continue down the path of using new forms of standardized tests. A more prudent course of action would be to subject the outcomes of these assessments to rigorous research to assess their alignment with our overall goals for schools. Though there is much debate about how we can revise assessments to be more aligned with our goals for schools, at this moment it is clear, as Pellegrino (2006) reports, that “our assessment system is seriously flawed and broken. Given the amount that we currently spend on the large-scale assessment of academic achievement, we get very little in the way of positive return on investment” (p. 2). Pellegrino (2006) outlines four failings of our current assessment system. First, he asserts that the system does not effectively measure the complex knowledge and skills we hope education will help students develop. This assertion parallels other researchers’ concerns about the disconnect between standardized tests and 21st century competencies. In addition, Pellegrino (2006) argues these assessments have limited utility in improving teaching and learning as they only capture a snapshot of students rather than assessing their growth over time. Finally, these assessments raise questions about equity and fairness as they might, despite efforts to be unbiased, measure elements not directly aligned with the competencies they seek to assess. This bias could result in inequitable outcomes for different groups (Pellegrino, 2006, pp. 7-8). Clearly, standardized tests are not an adequate answer to the question of accountability in secondary schools.

Concerns about the efficacy of using standardized tests as the primary metric for accountability have begun to translate into the policy realm. Recently, California passed Senate Bill 1458, which will shift measurements of school performance away from a near-exclusive reliance on state test scores. Starting in 2016, the law will limit the influence of standardized test results to sixty percent of a school’s Academic Performance Index (API), the primary vehicle of individual school accountability within the state of California. The other forty percent of each school’s API will be determined by the State Board of Education and the Superintendent of Public Instruction and may include graduation rates, Advanced Placement scores, and even college readiness as determined by need for remedial courses in English and Math. State Senator Darrell Steinberg, the bill’s lead proponent, stated that the legislation “will prove to be one of the most significant education reform bills of the decade” (Fensterwald, 2012). These reforms emphasize the importance of developing an alternative method for measuring schools.

Reimagining Accountability

Standardized tests are limited in their ability to provide useful information about the effectiveness of schools in preparing students for college and career in the 21st century. It is necessary, therefore, to reimagine school accountability. The imperative to reimagine school accountability is intricately linked to the need to reimagine schools themselves. Returning to the goals of education (college, career, and civic readiness), there is a clear need to change several of the fundamental assumptions about school. As Wagner states:

We teach and test things most students have no interest in and will never need, and facts that they can Google and will forget as soon as the test is over. Because of this, the longer kids are in school, the less motivated they become. Gallup’s recent survey showed student engagement going from 80 percent in fifth grade to 40 percent in high school. More than a century ago, we ‘reinvented’ the one-room schoolhouse and created factory schools for the industrial economy. Reimagining schools for the 21st-century must be our highest priority (Friedman, 2013).

What would this new system of accountability look like? What guiding principles are important to follow as the system undergoes drastic change? In a recent paper that synthesizes much of the best thinking about accountability reform, Linda Darling-Hammond, Gene Wilhoit, and Linda Pittenger (2014) suggest that, to re-invent our education system to meet the demands of the 21st century, we need:

more aligned systems of assessment and accountability that support genuinely
higher and deeper levels of learning for all students, and more flexible designs for
schools so that their graduates can meet the challenges of a world in which both
knowledge and tools for learning are changing rapidly (p. 1).

Darling-Hammond and her co-authors (2014) assert that this new approach to accountability needs a focus on meaningful learning. This focus requires innovative approaches to accountability, as deep learning is notoriously difficult to capture in standardized tests or other traditional metrics. Ron Berger (2003) conveys this point beautifully when he describes the necessity of sharing actual student work and even student presentations of their work to showcase student learning. In addition, Darling-Hammond and her co-authors (2014) assert that revised accountability systems must be reciprocal and comprehensive, build capacity of educators, and draw from multiple measures. These criteria, along with the principle of measuring meaningful learning, provide a framework for designing reimagined accountability structures. Within the scope of this paper, which focuses on a single middle school, the principles of measuring meaningful learning, building capacity, and using multiple measures are particularly relevant and important.

Measuring Meaningful Learning

To address the first criterion, accountability must seek to measure meaningful learning. Though Darling-Hammond offers Conley’s framework of college and career readiness as the description of meaningful learning, there are several other frameworks available that are more appropriate in a project-based setting, which is the focus of this paper. For example, research funded by the Hewlett Foundation has developed a framework for “deeper learning” (Vander Ark & Schneider). The components of that framework include mastery of core academic content, critical thinking, effective communication, ability to work collaboratively, learning how to learn, and academic mindsets. There are many similarities to Conley’s framework, which includes four elements: key content knowledge; cognitive strategies such as critical thinking and problem-solving; learning skills and techniques; and college knowledge, the cultural, personal and practical knowledge necessary to succeed in the particular context of higher education. In addition to these approaches to defining what meaningful learning is, there are also several frameworks for 21st century skills, as was mentioned earlier. Voogt and Roblin (2012) synthesized these frameworks to develop a list of common elements that includes collaboration, communication, literacy, and social/cultural competencies including citizenship. In addition, most frameworks also emphasized creativity, critical thinking, productivity, and problem-solving (p. 315). Again, though there are many similarities between these frameworks and the ones proposed by Conley and Hewlett, there are nuanced differences.

In addition to different views on what matters for meaningful learning, there are also a myriad of ways to measure those elements. For example, researchers have proposed that schools use locally created performance assessments (Darling-Hammond et al., 2014), portfolio presentations (Berger, 2014; Darling-Hammond et al., 2014), and student-led conferences (Berger, 2014) to measure meaningful learning. There are a variety of additional metrics related to learning that include graduation rates, college enrollment, college persistence, equity in achievement in each of those categories, and of course grades. Any one of these elements, though seemingly straightforward, comes laden with complexity and nuance.

Grades seem like a useful tool for evaluating school success. Grades are already in quantitative form and thus easy to translate into accountability structures. In addition, unlike standardized tests, there is a large body of research suggesting a strong connection between students’ secondary school grades and the outcomes of college and career success. Farrington’s (2012) survey of research found that students’ grades, GPA, and class rank are “vastly better predictors of high school and college performance and graduation, as well as a host of longer-term life outcomes, than their standardized test scores or the course work students take in school” (p. 3). In fact, Geiser and Santelices (2007) found that “high-school grades in college-preparatory subjects are consistently the best indicator of how students are likely to perform in college” (p. 24). If the goal of measuring school success is to determine how well schools are preparing secondary students for higher education, it appears that grades would be the most reliable tool to do so.

Despite their seemingly smooth application to accountability frameworks, grades present many challenges when used to measure school success. First and foremost, consider Campbell’s Law, which states, “The more any quantitative social indicator or even some qualitative indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor” (Nichols, 2008, p. 672). Grades are subject to variation and include different information from school to school and often even from classroom to classroom within the same school. As Guskey and Jung (2012) explain, there is “tremendous variation that exists in grading practices, even among teachers of the same courses in the same department in the same school” (p. 27). Because grades can vary from teacher to teacher, even though current research shows that they are predictive of student outcomes, using them to measure school success may create pressure that results in grade distortion. The same level of complexity and challenge is inherent in any particular measure of students’ academic performance.

Though creativity is very different from grades, measuring it is equally complex. It is widely accepted that creativity is an important capacity for students entering the 21st century workplace, but how we measure creativity is still hotly debated. Sir Ken Robinson, a renowned supporter of education that promotes creativity, summarized the debate in a recent interview with ASCD. Robinson explained:

Whether there would be an individual grade for creativity, that's a larger question. Certainly giving people credit for originality, encouraging it, and giving kids some way of reflecting on whether these new ideas are more effective than existing ideas is a powerful part of pedagogy. But you can't reduce everything to a number in the end, and I don't think we should. That's part of the problem. (Azzam, 2009).

Despite the difficulty of measuring creativity, several states and organizations have begun to create tools to do so. In 2010, Massachusetts passed a bill calling for the development of a creativity index, now called the Creativity Challenge Index. Massachusetts State Senator Stan Rosenberg claimed the index was a critical component of transforming schools to meet the needs of employers. Rosenberg claimed, “We're tapping into a very clear need, as expressed particularly by employers, to re-incorporate into the curriculum and school experience many opportunities for young people to develop creativity-oriented skills” (Robelen, 2012). Several supporters of creativity, however, echoed Robinson’s statement that perhaps not everything should be measured. Robert J. Sternberg, the provost and a professor of psychology and education at Oklahoma State University, who is an expert in intelligence-testing and has studied creativity extensively, cautioned, “We don't want an index that trivializes creativity, such as by counting numbers of activities that, on their surface, sound creative rather than exploring what is actually done in the activities to encourage creativity” (Robelen, 2012). Yet, despite these concerns, legislation similar to the Massachusetts law has also been proposed in California and Oklahoma, and there are campaigns beginning in Connecticut, Maine, New York, Rhode Island, Colorado and Wisconsin (“What is the creative challenge index?”). In addition, the Buck Institute for Education, an organization focused on project-based learning, has also created a Creativity and Innovation Rubric for grades 6-12 which is aligned with the Common Core State Standards (“6-12 Creativity & Innovation Rubric”).

Creativity is clearly an important goal for schools in the 21st century and, though it is difficult to measure, educators have made progress on how we can assess it (Robinson, 2009). The question remains, however, whether these tools should be used as part of how we measure school success. Though creativity is very different from grades, reviewing research on assessing it highlights the complexity and challenge of measuring meaningful learning. There are challenges inherent both in defining meaningful learning and in determining how best to measure it. The difficulties of using grades or creativity rubrics to assess meaningful learning highlight the complexities involved in attempting to measure any element of that learning. Meaningful learning is both incredibly important to include in accountability metrics and incredibly difficult to quantify.

Multiple Measures

Though student academic performance is a key element in measuring school success, Darling-Hammond’s (2014) criterion of multiple measures emphasizes the idea that accountability should not be limited to metrics within that domain. A recent study from the Canadian organization People for Education identified six areas to measure to determine a school’s success. These areas are: physical and mental health, social-emotional development, creativity and innovation, and citizenship and democracy (People for Education, 2013, p. 3). This parallels Darling-Hammond’s (2014) claim that in addition to measures of meaningful learning, school accountability should also include measures of students’ social-emotional competence, responsibility, and citizenship. Expeditionary Learning Schools use different tools to measure students’ character (Berger, 2014), and Envision Schools seek to measure students’ ability to reflect on their learning (Lenz, 2013). Wagner suggests that two of the most important elements schools can develop are students’ sense of purpose and driving passion. He explains that his research shows that, “We need to focus more on teaching the skill and will to learn and to make a difference and bring the three most powerful ingredients of intrinsic motivation into the classroom: play, passion and purpose” (Friedman, 2013).

An important subset of these factors, critical to academic performance but not a direct metric of it, is what Camille Farrington terms academic mindsets. Non-cognitive elements like academic mindsets are rarely included in accountability measures. This is because, though the People for Education study outlines several ways of assessing these goals, they are all relatively nascent and often complex to implement. In discussing the difficulty in transitioning to a system of accountability that includes these factors, the report quotes educational economist Hank Levin, who explains:

The specific non-cognitive or personality attributes required for successful adulthood
are more diffuse and more contested and have not yielded to the straightforward
measurement methods used for standardized tests. There is simply no global agreement on what is of consequence beyond student achievement and how it should be measured. For these reasons, and perhaps others, discussions of world-class education and educational systems have been limited to student achievement (People for Education, 2013, p. 12).

Despite the difficulty of measuring academic mindsets, they are critical ingredients to students’ eventual college and career success. One reason that grades are more predictive of future success than standardized test scores is that by design they take into account more than just strict content knowledge and academic skills. Grades also measure what Farrington (2012) calls noncognitive factors. These include a range of academic behaviors, attitudes and strategies including study skills, attendance, work habits, time management, help-seeking behaviors, metacognitive strategies, and social and academic problem-solving skills (Farrington, 2012, p. 3). Farrington (2012) cites a substantial body of research that shows the importance of students developing these behaviors and attitudes and the correlation between them and students’ eventual academic outcomes. Along with her co-authors, Farrington creates a framework for understanding these elements which divides noncognitive factors into five categories: academic behaviors, academic perseverance, academic mindsets, learning strategies, and social skills. Academic behaviors are of course most closely tied to changes in academic performance, but these behaviors are influenced by academic perseverance. Simply put, how long students are able to maintain academic behaviors like going to class and doing homework is a function of their ability to persevere. Similarly, Farrington and her colleagues link the ability to persevere to academic mindsets. They outline four academic mindsets: belonging, growth mindset, self-efficacy, and belief in the value of the work. Each of these mindsets has been shown both to be integral to student performance, and thus to ultimate success, and to be open to influence by education (Farrington, 2012). Given these two factors, it is worth considering including academic mindsets as one of the multiple measures of a school’s success.

Another important category of metrics that is not a direct measurement of student achievement is educator inputs. In addition to measuring the outcomes that are both directly related to the goals of school and that serve as proxies or indicators for those goals, it is also possible and often valuable to measure the inputs from teachers and school leaders. For example, both Darling-Hammond (2014) and People for Education (2013) suggest measuring school climate. It is also possible to measure elements such as the availability of qualified teachers, access to adequate facilities and materials, and opportunities for parental involvement. A group of districts in California known as CORE districts were given flexibility in their accountability metrics and included those items along with measurement of student academics and social/emotional outcomes. This parallels the accountability systems in New Mexico, which includes student self-reported opportunities to learn, and in Oklahoma, which includes metrics related to parent and community engagement. In addition, several states and countries use a form of School Quality Review in their accountability frameworks. This means that expert educators visit the school to assess teaching and other school practices (Darling-Hammond et al., 2014). Similarly, both Expeditionary Learning Schools and Envision Schools use such a system to measure schools’ fidelity to the principles of their respective school models (R. Berger, personal communication, November 5, 2014; B. Lenz, personal communication, November 7, 2014). The assumption inherent in measuring inputs is that there is a clear correlation between these practices and the desired outcomes.

Measuring inputs is particularly important in schools with a unique model. This paper focuses on a project-based school, and therefore it would be possible to design a metric that measures how aligned the school’s inputs are with the principles of project-based learning. A leader in project-based learning research, the Buck Institute for Education outlines eight essential elements for project-based learning: significant content is involved, students are engaged in the project and feel as if they need to know the material they are studying, there is a central driving question, students develop 21st century competencies, students engage in in-depth inquiry, students have voice and choice, work is critiqued and revised, and finally there is a public audience for the work (Larmer & Mergendoller, 2010). It would be possible and potentially valuable to measure to what extent the school’s inputs are aligned with these elements or other design principles.

Building Capacity

School accountability is complex and requires thoughtful analysis of both what to measure and how to measure those elements. As mentioned above, schools often seek to accomplish many goals. In addition to teaching students content and skills, schools shape students’ character, capacity for continuous learning, passion for learning, and even their overall purpose. All of these elements are part of preparing students for college and career in the 21st century. Standardized tests, even those adapted to measure higher order thinking skills, only measure a subset of the goals society has for schools. Given the breadth of goals for public education and the wide variety of possibilities for measuring those goals, someone must make a decision about what to measure as a proxy for school success. For accountability to be effective, however, it must be shaped, at least in part, by local actors. As Darling-Hammond and her co-authors (2014) contend, genuine accountability “must involve communities, along with professional educators and governments, in establishing goals and contributing to their attainment” (p. 4). Figlio and Loebe (2011) support that claim by documenting a record of “far greater positive effects of accountability in states with greater local autonomy” (p. 411).

Since the possibilities for accountability structures are so varied and the need for tailoring those structures to local needs is so great, accountability frameworks should be co-created, at least in part, with key stakeholders within the local context. This means including parents, teachers, and school leaders in the design of accountability matrices. Student voices are also important in creating a system of accountability. This may seem like a radical idea, but it is connected to several well-supported constructs about teaching and learning.

The first construct is that if the goal of accountability is improvement, then investing students in co-creating the metrics for accountability is invaluable. Berger (2014) argues at the classroom level for student-engaged assessment, which puts students “in the driver’s seat” (p. 7). This type of assessment is motivating for students and brings a depth of understanding to assessment that is not possible without students’ own reflection. The same concept applies when students have a voice in decisions that affect their school. Mitra (2008) explains:

Student voice initiatives have been shown to increase youth agency, to create greater
attachment to schools, and to build a range of skills and competencies including getting
along with others, planning complex projects, and public speaking (p. 2).

Including students in the process of determining how we measure schools could ultimately help prepare them for college, career, and civic participation.

Students also have valuable insights when it comes to evaluating school. Ferguson (2012) contends that “well constructed classroom-level student surveys are a low burden and high-potential mechanism for incorporating students’ voices in massive numbers into our efforts to improve teaching and learning” (p. 28). Ferguson’s research shows that student perception of teacher efficacy is one of the most accurate tools available to researchers. In addition, a range of research shows that including student input can improve schools in many areas. As Mitra (2012) writes:

Student voice activities also can serve as a catalyst for positive changes in schools, such as improvements in instruction, curriculum, teacher-student relationships (Rudduck, 2007), teacher preparation (Cook-Sather, 2002), assessment systems (Colatos and Morrell, 2003; Fielding, 2001), and visioning and strategic planning (Eccles and Gootman, 2002; Zeldin, 2004) (p. 104).

Students certainly have a unique perspective on schools, and therefore their ideas about how we determine what makes a school effective could help push adult thinking beyond traditional metrics and towards more meaningful accountability plans.

Several organizations have also found that student voice in evaluating different elements of school is beneficial. Groups such as YouthTruth and Panorama Education have built and implemented sophisticated surveys that gather student input on a variety of factors about their schools. Rhode Island’s education system is at the forefront of this area of accountability with its School Accountability for Learning and Teaching (SALT) Surveys. These surveys measure elements including teacher availability, school safety, parent involvement, and class/ethnic diversity (Darling-Hammond et al., 2014).

Yet the idea of including students in the process of determining what to measure and how to measure it is an area with limited research and few documented attempts. In addition, several researchers raise concerns about incorporating student voice into school reform and decision-making. Mitra (2012) warns:

[t]he power and status distinctions in school settings especially provide a dramatic form of asymmetry due to institutional norms of deference to adult authority and the separation of adult and youth roles in schools (p. 109).

It is difficult to ensure that students are true partners in schools. Given the hierarchical structure of school, students are often not used to having their voice be an important factor in adult decision-making. Overcoming those established beliefs and norms requires a critical mass of adults willing to be open to a “rupture of the ordinary” (Fielding, 2004 in Mitra, 2014, p. 109). In addition, if adults ask for students’ input and then do not act on it, we risk sending the message that students’ voice is not important and disempowering them. Given this possibility, as McQuillan (2005) cautions, adults must be thoughtful about how to structure student input so that it is clear what students are able to influence and what they are not.

Despite these potential challenges, the benefits of including student voice in determining accountability frameworks could be immense. Given the complexity of creating useful metrics and the necessity of doing so in the context of local needs and models, it seems useful to bring students, teachers, school leaders, and parents into that process. In particular, including student voice in designing accountability structures could have great value in navigating the complexity of choosing what to measure to determine school success.
