At Metis we have worked for the last 30 years to help public school systems, in New York City and beyond, integrate. We keenly read the work of other researchers and journalists on the latest stages of this struggle, and we especially enjoyed the Research Alliance’s deep dive into different ways of measuring integration. With our decades of experience evaluating magnet programs that explicitly aim to promote school integration and decades more experience measuring the racially disproportionate impacts of juvenile detention policies, we have some thoughts about how to tackle these methodological questions that we would love to share.

In the Research Alliance post, the authors considered the relative merits of three different approaches to measuring integration in school enrollment. In this post, we consider how the measurement question fits into the school integration conversation, then we go on to propose a different set of units to measure and measurements to take.

### 1. How does the narrow question of metric selection fit into the school integration conversation?

We appreciate how Research Alliance closed the piece with four critical questions. The first three form a natural, logical chain: What factors explain segregation? How do we measure progress toward integration? And how do we measure the impact of segregation? The fourth question—what diversity does our classification system miss—is also excellent. We won’t get to that one in this response, but we encourage those interested in this topic to read NALEO Education Fund’s 2017 policy brief on the “Combined Question.”

As we tackle question number two around measuring progress toward integration, we don’t want to lose sight of question number one about the factors that explain segregation. It is beyond the scope of this essay to definitively name every factor, but let’s acknowledge that the assumptions that we make about those factors have implications for how we go on to measure progress. For example, if you believe that the main factors are white supremacy and anti-black racism, you might choose to highlight measures of the segregation of white students from black students, rather than considering all of the racial groups as parallel classifications (as done by the Mayor’s School Diversity Advisory Group definitions).

IntegrateNYC has done great work to get more specific about how segregation is maintained despite decades of desegregation plans. Both the School Diversity Advisory Group and the NYC Department of Education have adopted their framework. Mechanisms that have maintained segregation include the disproportionate share of teachers that are white women in contrast to the demographics of students, the lack of culturally relevant and sustaining practices in schools, and the carceral nature of classroom management and school discipline. In this piece we will use the terms *desegregation* and *integration in enrollment* to refer to changes in the racial composition of students in a school. *Full integration* will be reserved for efforts that address all of the above-mentioned factors, at minimum. Policy makers will need to measure progress for each of these factors in addition to any measures of integration in enrollment to be effective at using data to drive decision making.

Before we delve into measuring progress toward integration in enrollment, we also want to nod to question number three about the impacts of segregation and integration, especially as they relate to resources. Recently one of us attended a Community Education Council meeting in Harlem’s Community School District 5. A joint presentation from Capital Planning and the School Construction Authority advertised that the latest capital plan has $9 billion to spend on new school construction. Most of the new construction will be in neighborhoods that are getting new condominiums, such as Long Island City. The presenters also acknowledged that the plan’s budgets of $750, $284, and $50 million for accessibility, air conditioning, and bathroom upgrades, respectively, would be insufficient to meet the needs of existing schools in places like Harlem. It sounded like what they were saying was that when a need arises in a neighborhood with few black and Latinx students, the city has billions of dollars to meet that need; but when a need arises in a neighborhood with more black and Latinx families, billions are not available. (According to the Department of City Planning, Long Island City—as represented by the neighborhood tabulation areas comprising Queensbridge, Ravenswood, Long Island City, Hunters Point, Sunnyside, and West Maspeth—and Harlem—as represented by the area comprising Central Harlem North and Polo Grounds—each has about 40,000 housing units. In Long Island City about 4,000 units are new since 2010, and about 41% of the population is black or Latinx. In Harlem, there are about a third as many new units and twice as many black and Latinx residents.)

### 2. Effective ways of measuring integration in school enrollment

Now, let’s consider the question in the middle: what is the best way to measure progress toward integration in enrollment, and can we do it in a way that informs decision-making? Research Alliance considers three intriguing approaches: the percentage of schools that have between 50% and 90% of their students identifying as black or Latinx; the percentage of schools that have roughly the same percentage of the four main racial groups as their community school district (within 10 percentage points); and the percentage of schools that have roughly the same percentage of the four main racial groups as their borough. Although these definitions are each different and yield different results, all three are limited by using the *percentage of schools* as their primary unit of analysis.

In this article we will show how using schools as the unit of analysis is limiting on three fronts: it saddles you with a lagging indicator of the impact of changes to enrollment policies; it doesn’t account for school size; and it creates discontinuities (little jumps where the movement of a single student could have a big effect on the measure), making the results seem capricious.

We submit that using indices of dissimilarity, a commonly used measure of segregation, for grade-level cohorts would resolve all three of these concerns.

*A different set of units to measure*

First, we can use grade-level cohorts instead of whole-school data to get timely indicators, rather than lagging indicators. From our years of experience assessing the impact of policy on equity, a key lesson has been to frame data around policies, and not the people impacted by policies (e.g., at what rates do courts send children to detention, not at what rates are children winding up in detention). When Nicole Mader and colleagues wanted to examine the impact of school choice policies on desegregation (spoiler: their impact is segregative), they looked at just kindergarten classes, not whole schools. While a careful evaluation of school integration efforts will need other measures for other assignment mechanisms (e.g., mid-year transfers), the key leverage points of Pre-K, kindergarten, middle school, and high school admissions should be the main focus. If we dramatically change kindergarten enrollment procedures, we will see the impact on the kindergarten demographics right away, but it will take years before it impacts whole-school demographics. (In this article we will continue to use whole-school demographics, as that is what is publicly available for most schools, but our examples will use middle schools, whose populations turn over in only three years.)

*A different set of measurements to take*

Next, let’s talk about the benefits of an index of dissimilarity. This index represents how evenly two groups are distributed across components of a larger whole on a scale from 0 to 1. If the groups are evenly distributed, then the index is 0. If the groups are completely segregated, then the index is 1. With indices of dissimilarity the unit of analysis is the larger whole, but it is also possible to see how many points each school contributes to the total. The appendix shows the formula for calculating indices. If you would like to learn more about how to apply this formula in your work, please email us.
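
As a minimal sketch of the definition above (not the spreadsheet we used), the index can be computed as half the sum, across schools, of the absolute difference between each school's share of one group and its share of the other. The function name `dissimilarity_index` and the toy enrollment counts are hypothetical and chosen only for illustration.

```python
def dissimilarity_index(group_a, group_b):
    """Return (index, per-school contributions) for two lists of counts.

    group_a[i] and group_b[i] are the numbers of students from each
    group enrolled in school i.
    """
    total_a = sum(group_a)
    total_b = sum(group_b)
    # Each school contributes half the gap between its share of group A
    # and its share of group B.
    contributions = [
        0.5 * abs(a / total_a - b / total_b)
        for a, b in zip(group_a, group_b)
    ]
    return sum(contributions), contributions


# Hypothetical two-school examples.
even_index, _ = dissimilarity_index([20, 40], [10, 20])    # same proportions
split_index, _ = dissimilarity_index([50, 0], [0, 50])     # fully separated
mixed_index, mixed_parts = dissimilarity_index([30, 10], [10, 30])

print(even_index, split_index, mixed_index, mixed_parts)
```

In the toy data, the evenly distributed case gives 0, the fully separated case gives 1, and the mixed case gives 0.5, with each school contributing 0.25 points to the total.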

In the following paragraphs we will illustrate how indices of dissimilarity better account for school size and avoid discontinuities in measurement by contrasting an index of dissimilarity for young men and young women with a measurement of segregation that looks at the percentage of schools that are not gender-representative (PSNR) of the whole set of schools. Our examples draw on empirical data from three Queens middle schools within a three-mile radius of Jamaica Station that serve grades six to eight exclusively. These stand-alone middle schools combined have 1,753 students, 46% of whom are female. One of the schools, Russell Sage, is much larger than the other two. The number of male students and the share of female students at each school are:

- JHS 190 Russell Sage: 582 male students, 46% female;
- Catherine & Count Basie MS 72: 174 male students, 48% female; and
- Redwood MS: 187 male students, 44% female.

Since children of each gender are fairly evenly distributed across these schools, the index of dissimilarity is 0.02. Similarly, the PSNR is 0%. (We use the 10-percentage-point threshold rule adopted by the SDAG, so to be gender representative a school could be anywhere between 36% and 56% female.) In this empirical example, these two different measurements are similar, but as we measure change over time we may get different results depending on the measurement we use.

To understand how the index of dissimilarity operates when there is a mix of small and large schools, imagine if three years from now MS 72 had 113 fewer boys, while Russell Sage had 113 more, and all the other figures stayed constant. Then the PSNR would go up to 33% because one of the three schools—MS 72 with 73% female students—would no longer be gender representative, and the index of dissimilarity would go up to 0.14. Both measurements would reflect how the genders became more segregated.

Now imagine that the enrollment shift is reversed and the smaller school gets more boys, while the larger school has fewer. The fact that the genders have become more segregated is still reflected in the index of dissimilarity, which would move up to 0.12. But the PSNR would miss a significant shift in enrollment since all the schools would still be between 36% and 56% female. This shows how the impact on the index of dissimilarity is more closely related to the number of students moved, while the PSNR is more impacted by the size of the schools they are moving from or to. These figures are presented as a table in the appendix. We would also be happy to share our spreadsheet with the calculations for all the examples in this document. Just send us an email.

Next, let’s see how the index of dissimilarity avoids the discontinuities (jumps) we may find in the PSNR. Remember that we are imagining that over three years there are no changes in the overall numbers of students of each gender across the three schools. Since 46.2% of the students are female, any school that is at least 36.2% but not more than 56.2% female would be gender representative. Imagine that three years from now MS 72 had 114 more boys (rather than 113). With 163 girls and now 288 boys, the percentage of students who are female would dip down to 36.1%. In this instance, the index of dissimilarity would still be 0.12, while the PSNR would shoot up to 33% on account of that one student. This kind of disproportionate sensitivity cuts against both the validity and reliability of the PSNR.
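
The scenarios above can be replayed in a short sketch. It reads the listed figures of 582, 174, and 187 as counts of boys, which is consistent with MS 72 having 288 boys after gaining 114, and it takes MS 72's 163 girls from the text. The girl counts of 500 for Russell Sage and 147 for Redwood are assumptions, back-calculated from the rounded percentages and the 1,753 total; the function names are likewise hypothetical.

```python
def dissimilarity_index(girls, boys):
    """Half the summed gap between each school's share of girls and of boys."""
    total_girls, total_boys = sum(girls), sum(boys)
    return 0.5 * sum(
        abs(g / total_girls - b / total_boys) for g, b in zip(girls, boys)
    )


def psnr(girls, boys, threshold=0.10):
    """Share of schools whose percent female is more than `threshold`
    (in proportion points) away from the overall percent female."""
    overall = sum(girls) / (sum(girls) + sum(boys))
    outside = [abs(g / (g + b) - overall) > threshold for g, b in zip(girls, boys)]
    return sum(outside) / len(outside)


# School order: Russell Sage, MS 72, Redwood.
GIRLS = [500, 163, 147]  # 163 is from the text; 500 and 147 are ASSUMED
BOYS = [582, 174, 187]   # listed figures, read here as counts of boys


def move_boys(n):
    """Move n boys from Russell Sage to MS 72 (negative n reverses the move)."""
    sage, ms72, redwood = BOYS
    return [sage - n, ms72 + n, redwood]


scenarios = {
    "baseline": move_boys(0),
    "113 boys, MS 72 -> Russell Sage": move_boys(-113),
    "113 boys, Russell Sage -> MS 72": move_boys(113),
    "114 boys, Russell Sage -> MS 72": move_boys(114),
}

results = {
    name: (round(dissimilarity_index(GIRLS, boys), 2), round(psnr(GIRLS, boys), 2))
    for name, boys in scenarios.items()
}

for name, (index, share_not_representative) in results.items():
    print(f"{name}: index={index}, PSNR={share_not_representative}")
```

Under these assumed counts the rounded results line up with the figures in the text: 0.02 and 0% at baseline, 0.14 and 33% in the first scenario, 0.12 and 0% in the second, and 0.12 and 33% once the 114th boy moves.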

*When to use indices of dissimilarity*

The index of dissimilarity is a clear and robust measure of segregation, but one index will never be enough to fully convey the complexity of segregation in New York City. As we pointed out above, the choices about which indices to report should reflect an understanding of the factors that explain segregation. Each index can measure the evenness of distribution of only two groups at a time, and suffice it to say that there are more than two meaningful race groups in New York City. In this post we feature examples that emphasize the segregation of white students from other groups because it does seem like the tendency of white people to attend schools with other white people is a factor that contributes to segregation, but indices of dissimilarity can be used for other groups to the extent that reliable data have been collected.

In their 2014 paper, *New York State’s Extreme School Segregation: Inequality, Inaction, and a Damaged Future*, John Kucsera and Gary Orfield cautioned that it might not be good to use indices of dissimilarity to measure segregation if the mechanisms of segregation are operating outside the geography in question. Take a neighborhood like Harlem that is well known for having few white students. In 2018-19 there were three stand-alone middle schools in Harlem’s District 5 with a total of nine white students between them. If we calculated an index of dissimilarity for white and non-white students in this case, the movement of just seven students could be the difference between maximal and minimal dissimilarity. In this case minimal dissimilarity would not represent the integration of white students in enrollment.
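
To make the caution concrete, here is a small sketch with hypothetical numbers. It assumes three schools with 200 non-white students each, which are not the actual District 5 enrollments, and compares nine white students concentrated in one school with the same nine spread evenly. How many students must move in the real data depends on the actual school sizes.

```python
def dissimilarity_index(group_a, group_b):
    """Half the summed gap between each school's share of the two groups."""
    total_a, total_b = sum(group_a), sum(group_b)
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# HYPOTHETICAL enrollments: three equally sized schools.
non_white = [200, 200, 200]
concentrated = [9, 0, 0]  # all nine white students in one school
spread = [3, 3, 3]        # the same nine students spread evenly

d_concentrated = dissimilarity_index(concentrated, non_white)
d_spread = dissimilarity_index(spread, non_white)

# Students who change schools between the two arrangements.
students_moved = sum(abs(c - s) for c, s in zip(concentrated, spread)) // 2

print(round(d_concentrated, 2), round(d_spread, 2), students_moved)
```

With these made-up figures, moving six students takes the index from about 0.67 to 0, even though the schools' overall racial composition has barely changed.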

We accept this caution but counter that the diversity of New York City, or lack thereof, is not the central question. The central question is whether race determines which school a child goes to, and therefore we can measure progress toward integration in enrollment using indices of dissimilarity. Even when a group is in the minority, we can bring down an index related to that group by making them more evenly distributed. And even in the case of Harlem we can use indices of dissimilarity to measure the segregation of the groups that are in those schools and discuss whether that level is acceptable. And let’s be clear about what it means to accept that the problem is not a lack of diversity in New York City. In New York City white children are in the minority (15% of all NYC public and charter school students and no more than 50% in any community school district). Every single one of the definitions under consideration orients us toward the goal of having no schools where white students are in the majority.

Let’s take a look at one example of how an index of dissimilarity compares to the PSNR to help us understand racial integration at the community district level. The case of stand-alone middle schools in Community School District 1 on the Lower East Side of Manhattan is interesting because it also shows how we can use the measurement of how many points each school contributes to the index. In this case the PSNR for Asian, black, Latinx, or white students is only 25%: only one school, Tompkins Square, is not representative, and for only one group, white students. The index of dissimilarity for white students compared with all other students combined in stand-alone middle schools in 2018-19 (including one charter school) was 0.40. Of these 0.40 points, 0.20 come from that one school, Tompkins Square. This is because, out of four stand-alone middle schools, 70% of the white students attend just that one school. However, with a white population of 21%, Tompkins Square was only just barely outside the threshold of 20% to be racially representative. If there were just 17 more Latinx students at Tompkins Square and 17 fewer at School for Global Leaders, then every school would be racially representative for all groups. But in that scenario 70% of the white students would still all attend just one of the four schools, and the index of dissimilarity would fall only 0.02 points from 0.40 to 0.38.

### 3. Where to go from here

Research Alliance, we would love to see a future post where you show trends in indices of dissimilarity for the entry grades at the city, borough, and district levels. The next steps would be to look at the impacts of policies on the index and to set goals framed around the index. Where changes in policies happen at the district level, we would look to the district-level indices to measure change. Where changes in policies happen at the borough level we would look to the borough indices to measure change. And where changes in policy happen at the city-wide level we would look to the city-wide indices.

### Appendix A. How to calculate an index of dissimilarity

Indices of dissimilarity measure how evenly two groups (A and B) are distributed across N components of a larger whole on a scale from 0 (most even) to 1 (least even) as follows:
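
A standard way to write the index is given here, with symbol names chosen for illustration: $a_i$ and $b_i$ are the numbers of students from groups A and B in component $i$, and $A_{\text{total}}$ and $B_{\text{total}}$ are the totals for each group across all $N$ components.

$$
D = \frac{1}{2} \sum_{i=1}^{N} \left| \frac{a_i}{A_{\text{total}}} - \frac{b_i}{B_{\text{total}}} \right|
$$

Each term of the sum, multiplied by one half, is the number of points that component $i$ contributes to the index.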

In Section 2 of this article we consider an empirical example: stand-alone middle schools in City Council Districts 28 and 29. The index of dissimilarity for young men and young women among these three schools in 2018-19 was approximately 0.02.

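The table below lays out the hypothetical future scenarios we considered in the article. To request a copy of the original spreadsheet with all of our examples, please contact us.

| Scenario | PSNR | Index of dissimilarity |
| --- | --- | --- |
| 2018-19 enrollment | 0% | 0.02 |
| MS 72 has 113 fewer boys, Russell Sage has 113 more | 33% | 0.14 |
| MS 72 has 113 more boys, Russell Sage has 113 fewer | 0% | 0.12 |
| MS 72 has 114 more boys | 33% | 0.12 |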

Please note that for the citywide data presented in the graph at the beginning of the article, the 2012-13 data are from the spreadsheet downloaded in 2016 and may be missing schools that closed in that period. We estimate that could be as many as 20 schools or about 7%.