Part One: AERA... Huberman, CPS in violation of every guideline on testing by every professional organization

A review of the guidelines on testing from every professional organization shows that Chicago has been in violation of the professional guidelines on the appropriate use of tests since the beginning of the 21st century. Starting this week, Substance will publish the guidelines from the major organizations of professionals in the field. As the following shows, Arne Duncan and the U.S. Department of Education, since the promulgation of Race To The Top, are also in violation not only of the basic ethical norms about dealing with children, but also of the professional organizations' explicit guidelines.

Ron Huberman's "Chief Performance Officer," Sarah Kremsner (above in grey suit), took office in 2009 with no experience, training or certification in education and quickly expanded the so-called "Office of Performance Management" into one of the largest and most controversial departments in the history of Chicago's public schools. Chicago's media ignored the fact that Kremsner's version of "Performance Management" at the Chicago Transit Authority, where she worked under Ron Huberman prior to 2009, was dumped as soon as Huberman and Kremsner were gone because it was largely irrelevant to the public transportation needs of Chicago's people. Substance photo taken in July 2010 at Chicago Vocational High School.

For nearly ten years, the chief testing people at Chicago Public Schools have refused to participate in professional conferences and debates about the testing they oversee. I didn't even notice this until six years ago, when AERA met in Chicago and I asked Dan Bugler (then CPS accountability and research chief) whether he was going to present to AERA. He acted as if that were a lewd proposition, and Chicago, then under Arne Duncan, completely ignored the AERA meeting, even though it was held less than a mile from CPS headquarters. Bugler, whose training was in something called "public policy," was not the only CPS research and accountability chief to avoid the profession. In fact, he was part of a crowd.

From Dan Bugler to Ginger Reynolds through Sarah Kremsner (and every one of Chicago's "Chief Area Officers" who in 2010 are pushing additional tests on the schools, area by area), they have all avoided any professional accountability for the policies they oversee (or have overseen). Since 2004, however, they have all overseen the destruction of schools, principals and teachers (through "Renaissance 2010") in ways that violate every professional standard. But since the professional organizations have no way of enforcing their codes, and Chicago makes sure it doesn't appoint professionals to these jobs, the stories remain "off the radar."

In 2010, Sarah Kremsner, a long-time crony of Ron Huberman, is the chief of "Performance Management" at CPS. Like the city's earlier test chiefs (going back to Phil Hansen, a special ed teacher who was appointed by Paul Vallas in 1995 to be the city's first "accountability" chief), Kremsner has avoided professional peer-reviewed scrutiny of the programs she has overseen. Her degrees and training have little to do with professional use of tests and other instruments in education. Like Ron Huberman, she has enormous power without any professional training, expertise, experience, or qualifications.

On February 11, 2009, Ginger Reynolds (above facing left, wearing glasses) testified that CPS had to close Fulton Elementary School because of "academic failure." Like dozens of other schools during the purges of "Renaissance 2010," Fulton was closed and subjected to the controversial "turnaround" program. Like her successor, Sarah Kremsner, and her predecessor, Dan Bugler, Reynolds served as the Chicago Public Schools chief of testing and accountability despite the fact that she had no professional record in psychometrics, testing, and measurement. Seated beside Reynolds above is Board attorney Joe Moriarity, who regularly provided the show trials aimed at closing schools with the legalese necessary for the actions. Substance photo by George N. Schmidt.

Only because of the complete collapse of Chicago's daily newspaper reporting is Kremsner's absurd version of "Performance Management" given free rein to destroy children's lives, communities' stability, and the professional careers of teachers and principals. Were there any real accountability in Chicago, Kremsner and her boss would never have been given the power they are exercising at such great expense and human cost today in Chicago. Nothing she has done would survive even the most cursory peer review.

In this series for Substance, we'll review the guidelines of the professional organizations that Chicago's Board of Education has ignored for the past decade and a half. We have to begin with AERA (and its companion, the National Council on Measurement in Education).

The following is the current position of the American Educational Research Association (AERA). Further information about AERA can be found on the AERA website.

For those who want to explore AERA more completely, you can get to the AERA position below from the following address

AERA Position Statement on High-Stakes Testing in Pre-K – 12 Education

Adopted July 2000

The American Educational Research Association (AERA) is the nation's largest professional organization devoted to the scientific study of education. The AERA seeks to promote educational policies and practices that credible scientific research has shown to be beneficial, and to discourage those found to have negative effects. From time to time, the AERA issues statements setting forth its research-based position on educational issues of public concern. One such current issue is the increasing use of high-stakes tests as instruments of educational policy.

This position statement on high-stakes testing is based on the 1999 Standards for Educational and Psychological Testing. The Standards represent a professional consensus concerning sound and appropriate test use in education and psychology. They are sponsored and endorsed by the AERA together with the American Psychological Association (APA) and the National Council on Measurement in Education (NCME). This statement is intended as a guide and a caution to policy makers, testing professionals, and test users involved in high-stakes testing programs. However, the Standards remain the most comprehensive and authoritative statement by the AERA concerning appropriate test use and interpretation.

By the time he left CPS in July 2007 (above), Dan Bugler had served as chief of "Research Evaluation and Accountability" and provided the texts for the destruction of the first two dozen Chicago public schools under the Board's "Renaissance 2010" program. Bugler, like those who served as the system's "accountability" chiefs before and after 2007, had no experience or record in testing and measurement. When asked in 2007 whether anyone from CPS would be presenting to the annual meeting of the American Educational Research Association (AERA), which was meeting in Chicago, Bugler gave Substance a smirk. No one from the CPS "accountability" office participated in AERA in Chicago that year, because the Chicago versions of "accountability" (as well as research and evaluation by that time) had become so corrupt that they were a national scandal. Substance photo by George N. Schmidt.

Many states and school districts mandate testing programs to gather data about student achievement over time and to hold schools and students accountable. Certain uses of achievement test results are termed "high stakes" if they carry serious consequences for students or for educators. Schools may be judged according to the school-wide average scores of their students. High school-wide scores may bring public praise or financial rewards; low scores may bring public embarrassment or heavy sanctions. For individual students, high scores may bring a special diploma attesting to exceptional academic accomplishment; low scores may result in students being held back in grade or denied a high school diploma.

These various high-stakes testing applications are enacted by policy makers with the intention of improving education. For example, it is hoped that setting high standards of achievement will inspire greater effort on the part of students, teachers, and educational administrators. Reporting of test results may also be beneficial in directing public attention to gross achievement disparities among schools or among student groups. However, if high-stakes testing programs are implemented in circumstances where educational resources are inadequate or where tests lack sufficient reliability and validity for their intended purposes, there is potential for serious harm. Policy makers and the public may be misled by spurious test score increases unrelated to any fundamental educational improvement; students may be placed at increased risk of educational failure and dropping out; teachers may be blamed or punished for inequitable resources over which they have no control; and curriculum and instruction may be severely distorted if high test scores per se, rather than learning, become the overriding goal of classroom instruction.

This statement sets forth a set of conditions essential to sound implementation of high-stakes educational testing programs. It is the position of the AERA that every high-stakes achievement testing program in education should meet all of the following conditions:

Protection Against High-Stakes Decisions Based on a Single Test

Decisions that affect individual students' life chances or educational opportunities should not be made on the basis of test scores alone. Other relevant information should be taken into account to enhance the overall validity of such decisions. As a minimum assurance of fairness, when tests are used as part of making high-stakes decisions for individual students such as promotion to the next grade or high school graduation, students must be afforded multiple opportunities to pass the test. More importantly, when there is credible evidence that a test score may not adequately reflect a student's true proficiency, alternative acceptable means should be provided by which to demonstrate attainment of the tested standards.

Adequate Resources and Opportunity to Learn

When content standards and associated tests are introduced as a reform to change and thereby improve current practice, opportunities to access appropriate materials and retraining consistent with the intended changes should be provided before schools, teachers, or students are sanctioned for failing to meet the new standards. In particular, when testing is used for individual student accountability or certification, students must have had a meaningful opportunity to learn the tested content and cognitive processes. Thus, it must be shown that the tested content has been incorporated into the curriculum, materials, and instruction students are provided before high-stakes consequences are imposed for failing examination.

Validation for Each Separate Intended Use

Tests valid for one use may be invalid for another. Each separate use of a high-stakes test, for individual certification, for school evaluation, for curricular improvement, for increasing student motivation, or for other uses requires a separate evaluation of the strengths and limitations of both the testing program and the test itself.

Full Disclosure of Likely Negative Consequences of High-Stakes Testing Programs

Where credible scientific evidence suggests that a given type of testing program is likely to have negative side effects, test developers and users should make a serious effort to explain these possible effects to policy makers.

Arne Duncan praised Dan Bugler effusively at the July 2007 meeting of the Chicago Board of Education, when Bugler left the Board. Duncan seconded Bugler's comment that both were disappointed that CPS had never won a "Broad" (the Broad Foundation's award for school systems). Neither Duncan nor Bugler had any experience or peer-reviewed vetting in testing and evaluation, and both used "accountability" as a club to justify the privatization of dozens of Chicago public schools and the purges of hundreds of teachers and principals. Substance photo July 2007 by George N. Schmidt.

Alignment Between the Test and the Curriculum

Both the content of the test and the cognitive processes engaged in taking the test should adequately represent the curriculum. High-stakes tests should not be limited to that portion of the relevant curriculum that is easiest to measure. When testing is for school accountability or to influence the curriculum, the test should be aligned with the curriculum as set forth in standards documents representing intended goals of instruction. Because high-stakes testing inevitably creates incentives for inappropriate methods of test preparation, multiple test forms should be used or new test forms should be introduced on a regular basis, to avoid a narrowing of the curriculum toward just the content sampled on a particular form.

Validity of Passing Scores and Achievement Levels

When testing programs use specific scores to determine "passing" or to define reporting categories like "proficient," the validity of these specific scores must be established in addition to demonstrating the representativeness of the test content. To begin with, the purpose and meaning of passing scores or achievement levels must be clearly stated. There is often confusion, for example, among minimum competency levels (traditionally required for grade-to-grade promotion), grade level (traditionally defined as a range of scores around the national average on standardized tests), and "world-class" standards (set at the top of the distribution, anywhere from the 70th to the 99th percentile). Once the purpose is clearly established, sound and appropriate procedures must be followed in setting passing scores or proficiency levels. Finally, validity evidence must be gathered and reported, consistent with the stated purpose.

Opportunities for Meaningful Remediation for Examinees Who Fail High-Stakes Tests

Examinees who fail a high-stakes test should be provided meaningful opportunities for remediation. Remediation should focus on the knowledge and skills the test is intended to address, not just the test performance itself. There should be sufficient time before retaking the test to assure that students have time to remedy any weaknesses discovered.

Appropriate Attention to Language Differences Among Examinees

If a student lacks mastery of the language in which a test is given, then that test becomes, in part, a test of language proficiency. Unless a primary purpose of a test is to evaluate language proficiency, it should not be used with students who cannot understand the instructions or the language of the test itself. If English language learners are tested in English, their performance should be interpreted in the light of their language proficiency. Special accommodations for English language learners may be necessary to obtain valid scores.

Appropriate Attention to Students with Disabilities

In testing individuals with disabilities, steps should be taken to ensure that the test score inferences accurately reflect the intended construct rather than any disabilities and their associated characteristics extraneous to the intent of the measurement.

Careful Adherence to Explicit Rules for Determining Which Students Are to be Tested

When schools, districts, or other administrative units are compared to one another or when changes in scores are tracked over time, there must be explicit policies specifying which students are to be tested and under what circumstances students may be exempted from testing. Such policies must be uniformly enforced to assure the validity of score comparisons. In addition, reporting of test score results should accurately portray the percentage of students exempted.

Sufficient Reliability for Each Intended Use

Reliability refers to the accuracy or precision of test scores. It must be shown that scores reported for individuals or for schools are sufficiently accurate to support each intended interpretation. Accuracy should be examined for the scores actually used. For example, information about the reliability of raw scores may not adequately describe the accuracy of percentiles; information about the reliability of school means may be insufficient if scores for subgroups are also used in reaching decisions about schools.

Ongoing Evaluation of Intended and Unintended Effects of High-Stakes Testing

With any high-stakes testing program, ongoing evaluation of both intended and unintended consequences is essential. In most cases, the governmental body that mandates the test should also provide resources for a continuing program of research and for dissemination of research findings concerning both the positive and the negative effects of the testing program.


November 2, 2010 at 10:17 AM

By: They allowed cheating too!

Duncan Huberman

They include the schools with the 'wrong' scores in the overall numbers so the scores increase. Very little is done about test cheating in CPS. It's all about just getting scores up.

November 3, 2010 at 5:19 PM

By: Isabell Scott

Violating Testing Guidelines

Thanks for publishing this, George.

November 5, 2010 at 7:14 PM

By: Sarah Chambers


Very informative article.

November 14, 2010 at 1:59 PM

By: John Whitfield

"When they all don't speak English"

"Appropriate Attention to Language Differences Among Examinees. If a student lacks mastery of the language in which a test is given, then that test becomes, in part, a test of language proficiency. Unless a primary purpose of a test is to evaluate language proficiency, it should not be used with students who cannot understand the instructions or the language of the test itself. If English language learners are tested in English, their performance should be interpreted in the light of their language proficiency."

Instead of pointing out low test scores on these ethnocentric instruments to our PEP (Potentially English Proficient) students, thereby thrashing their self-esteem, we should be celebrating diversity more (as some institutions already do). One of the most beautiful things about living in Chicago, in spite of our segregated neighborhoods and schools (not that all are), is the diverse populace of the "city of big shoulders".

"Where credible scientific evidence suggests that a given type of testing program is likely to have negative side effects, test developers and users should make a serious effort to explain these possible effects to policy makers."

Please keep in mind that the PEP students are taking these instruments in their second language, a fact too easily overlooked and oft forgotten. And then, to add insult to injury, PEP students, or ELLs (English Language Learners), are too often put in the same light as the learning disabled, or inappropriately placed into Special Education programs, prematurely exiting the bilingual education program.

Or they are put into Special Education because no continual language support, like sheltered English, is available or utilized.

It was a better Public School System when there was an elected School Board, and activists were able to get advocates like Maria Vargas and Juan Cruz elected to the Board of Ed. to look out for and monitor the needs of the PEP students and others vulnerable to the likes of non-educators running a school system, who have been appointed, not elected, to the school board.

Miguel Del Valle, I must admit, would be a great advocate for empowering our students, as he had been in Springfield for decades, often having told his personal story at State Bilingual Conferences: how he was simply placed in the back of the room as a small child in school, and ignored for the most part, when he spoke only Spanish, before the advent of the Bilingual Education Act and before having learned English, the hard way.

The Bilingual Education Act, as you know, came out of the "Lau vs. Nichols" Supreme Court case, when a Chinese student, Lau, had the Board of Education and superintendent Nichols in his city sued, because he had managed to graduate from High School without having learned English.

What do you think?

November 14, 2010 at 4:32 PM

By: What has not been violated in CPS?

Elected CPS Leaders

In my opinion, all educational leaders at CPS, including CEOs should be elected by the most important people in the education system. Teachers, parents, and other union members should be the ones to elect their leaders.

Otherwise business people will be placed in these positions, and you will continue to observe violations in all areas of the education system in Chicago.

It is also imperative to elect the best candidate for mayor. I don't know who I am going to vote for; however, I am leaning towards Danny Davis or Miguel Del Valle. I do believe Danny Davis has a better chance of winning. I believe there should be only one good minority candidate because otherwise African Americans and Hispanics are going to split votes, giving Emanuel a better chance of winning. It would be nice if there were only one great leader or consensus candidate, no matter his/her ethnic group.
