Developing Longitudinal Data Systems
In Chapter 4, we described the various indicators that can be used to estimate dropout and graduation rates. Some indicators, such as status rates and event rates, can be derived using cross-sectional data. Other indicators, such as cohort rates based on individual data, require longitudinal, student-level data. Calculating this type of cohort rate requires a data system that can track students over time, at least from their entrance to grade 9 through the subsequent years until they leave high school. Initially as a consequence of the National Governors Association (NGA) Compact in 2005 (see Chapter 2) and more recently because of new rules for complying with the No Child Left Behind (NCLB) Act, states are now expected to be able to report individually based cohort graduation rates.
During the workshop, participants explored issues related to the development of state longitudinal databases to support calculation of these rates, and this chapter documents the information we obtained with regard to best practices for creating data systems. The chapter begins with a description of the essential elements of longitudinal data systems, particularly as they relate to calculating cohort graduation and dropout rates. At the workshop, we heard from several state data system administrators, and this chapter provides a synopsis of the work each has done. The chapter draws extensively from information provided by Lavan Dukes, with the Florida Department of Education. The chapter closes with a discussion of the advice that the data administrators would offer to states as they develop their data systems.
At the time of the workshop in October 2008, some states were well along in the development of such data systems, and others were just beginning. We
think that states are likely to have made considerable progress in their systems over the past two years while this report was in production, in part due to requirements for competing for Race to the Top grant funding from the U.S. Department of Education. As a result, some of the specific details in this chapter may be out of date. Nevertheless, the basic components of a high-quality data system, as discussed in this chapter, are still relevant, and the recommendations we offer are still valid.
STATUS OF STATE DATA SYSTEMS
The status of longitudinal data systems across the states is uneven. While some states are just beginning, others have long had data warehouses capable of providing longitudinal data about their K-12 school systems. For instance, Delaware, Florida, Louisiana, and Texas have had systems in place since the 1980s. These data systems were developed to meet a variety of state needs and have served as the building blocks for currently existing longitudinal data systems. They contain information about school facilities, school personnel, school finances, instructional programs, and students. However, these data systems were not necessarily created to yield information about individual students' educational achievement and progress.
The passage of the Educational Technical Assistance Act of 2002 refocused data system design efforts on producing longitudinal systems capable of answering questions about student achievement. The Statewide Longitudinal Data System (SLDS) Grant Program, as authorized by the act, was designed to (http://nces.ed.gov/programs/slds/):
aid state education agencies in developing and implementing longitudinal data systems. These systems are intended to enhance the ability of states to efficiently and accurately manage, analyze, and use education data, including individual student records. The data systems developed with funds from these grants should help states, districts, schools, and teachers make data-driven decisions to improve student learning, as well as facilitate research to increase student achievement and close achievement gaps.
In the first year of the program (2005-2006), grants were awarded to 13 states. Another 13 states received grants in 2006-2007, and 16 states received grants in 2007-2008. Thus, at the time of the workshop in October 2008, work on state data systems was well under way, with a total of 41 states and the District of Columbia having received SLDS grants (see http://nces.ed.gov/Programs/SLDS/index.asp).
To assist states with their data systems, the Gates Foundation developed the Data Quality Campaign (DQC).[1] Formed in 2005, the DQC is a national,
[Box 6-1. 10 Essential Elements of a Robust Longitudinal Data System. SOURCE: Reprinted with permission from the Data Quality Campaign, copyright 2007.]
collaborative effort to encourage and support state policy makers to improve the availability and use of high-quality education data to improve student achievement. The goal of the DQC is to help states design data systems that contain the necessary information to answer research questions about the correlates of student achievement and educational progress. The DQC has worked with policy makers to define their questions and to identify the required data.
The DQC has focused its efforts on helping states build high-quality data systems that can effectively and accurately answer questions that cannot be answered with cross-sectional data. To this end, the organization has identified 10 essential elements of a robust longitudinal data system, which are shown in Box 6-1.
DQC conducts annual surveys of states to gather information about the status of their data system development. When the campaign began, no state had a data system that incorporated all of the elements. By 2008, significant progress had been made, with 4 states (Arkansas, Delaware, Florida, and Utah)[2]
[1] Funding is now provided by the Casey Family Program, the Lumina Foundation, and the Michael and Susan Dell Foundation for Education.
[2] As of July 2009, this increased to 6 states, adding Georgia and Louisiana to the list of states with all 10 elements in place. See http://www.dataqualitycampaign.org/ for the most recent status of states' progress on implementing these elements in their data systems.
indicating that they had systems that incorporated all of these elements, and 12 states (Alabama, Georgia, Kentucky, Louisiana, Massachusetts, Mississippi, Nevada, Ohio, Tennessee, Texas, Washington, and West Virginia)[3] having data systems with at least eight of the elements.[4] It is important to note that this information is based on self-reported responses from the states and may not be entirely accurate. For instance, California reported to the DQC that it has student-level graduation and dropout data, yet the California Department of Education website reports otherwise: "Since student level data are not collected, cohort graduation rates that account for incoming and outgoing transfers cannot be calculated."[5]
Although all of these elements are important for addressing questions about students' educational progress, element 8 is fundamental for calculating cohort-based dropout and graduation rates. When the DQC began in 2005, only 40 states maintained student-level graduation and dropout data. By 2008, 50 states (of 51) had systems that included student-level graduation and dropout data. However, element 8 alone is not sufficient to calculate the NGA graduation rate (see Equation 2.1), now required for NCLB. The NGA rate requires that states have the facility to track students over time, as they enter grade 9 for the first time, as they progress from grade to grade, as they transfer to and from schools (both public and private) and to and from states, and as they leave and possibly return to school. Thus, the NGA rate requires that students be assigned a unique identifier, that their transitions be coded, and that their final exit status be recorded. According to the DQC survey, by 2008, only 18 states had the capability of producing the NGA rate. Another 27 states reported that they were on track to being able to produce the NGA rate, with 7 expected to have that capability by 2009, 10 more by 2010, 8 more by 2011, and 2 more by 2012 or later.[6] According to Nancy Smith, with the Data Quality Campaign, all 50 states and the District of Columbia can identify dropouts, and 49 can identify transfers.
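In outline, producing such a cohort rate from student-level records is straightforward once the tracking elements are in place. The sketch below is illustrative only: the record layout, field names, and status labels are invented, and a real state system must handle many more transition types (transfers in, deceased students, students who leave and return).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Student:
    student_id: str            # unique statewide identifier
    grade9_entry_year: int     # year of first-time entry into grade 9
    exit_status: str           # final exit status, e.g. "graduated",
                               # "transferred_out", "dropped_out"
    exit_year: Optional[int] = None

def four_year_cohort_rate(students, cohort_year):
    """On-time graduates divided by the adjusted cohort.

    In the spirit of the NGA formula: start with first-time ninth
    graders for the cohort year and subtract verified transfers out.
    (Transfers in would join the cohort of their original entry year.)
    """
    cohort = [s for s in students if s.grade9_entry_year == cohort_year]
    adjusted = [s for s in cohort if s.exit_status != "transferred_out"]
    on_time_grads = [
        s for s in adjusted
        if s.exit_status == "graduated"
        and s.exit_year is not None
        and s.exit_year <= cohort_year + 4
    ]
    return 100.0 * len(on_time_grads) / len(adjusted) if adjusted else 0.0
```

Note that under this bookkeeping a student who graduates in a fifth year counts against the four-year rate, and a leaver whose transfer cannot be verified stays in the denominator as a nongraduate, which is exactly why the exit-status coding discussed next matters so much.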
The aspects of the NGA graduation rate that pose problems are identifying other kinds of school leaving and coding the final exit status. States must be able to distinguish correctly between departing students who drop out or obtain a GED and students who transfer to another school. At the time of the
[3] As of July 2009, this increased to 18, adding Colorado, Minnesota, Missouri, New Jersey, New Mexico, North Carolina, Oklahoma, and South Carolina to the list of states with at least eight elements in place; see http://www.dataqualitycampaign.org/ for the most recent status of states' progress on implementing these elements in their data systems.
[4] These survey results reflect the status of state data systems prior to the implementation of the Race to the Top initiative.
[6] The number does not add to 51 because, at the time of the survey, some states indicated that they were not yet planning to report the NGA rate.
workshop, only 35 states could identify students who left school to enroll in a GED program, and 39 could identify students who were incarcerated.
ADDRESSING DIFFERENCES ACROSS STATE SYSTEMS
In the 1980s, schools were beginning to develop their data systems. These were typically single-purpose systems maintained in specific offices within the school system. For instance, high schools might have had automated systems for scheduling purposes. Exceptional student education and vocational education offices might have created data systems to collect the information the U.S. Department of Education required in return for the federal funding they received. But the systems differed in ways that made it impossible to merge and match data across systems. For instance, data elements were similarly named across the systems, but the definitions of the elements differed, as did the coding conventions.
At the time data system development began, much of the data was collected and maintained on paper, and as a student progressed through the school system, paper moved from one point to the other (e.g., paper transcript, paper report card). As states moved toward developing comprehensive statewide data systems, one of the overarching objectives was to facilitate more efficient and rapid exchange of information within and between levels of state education systems. States developed their systems to ease the transfer of student records, to serve as the day-to-day information provider for staff, to serve as the data source for federal reporting requirements, and to meet other state- and school-specific needs.
For the most part, states developed their data systems independently of each other. That is, there was no federal body that guided system development in a coordinated way that would produce 50 state systems with common characteristics. Thus, data systems vary greatly across states. States maintain data elements and program their systems to perform functions that meet their state-specific needs. These uses and needs differ among states.
There have been two significant efforts to attempt to reduce the variability among the states. First, as described in Chapter 1, was the NGA Compact, which laid out a common metric for calculating and reporting graduation rates, presumably to enable comparisons across states and districts. Second was a task force created by the National Forum on Education Statistics to standardize the exit codes that states use to classify students’ enrollment status.
The National Forum on Education Statistics is a cooperative of state, local, and federal education agencies working to improve the quality of education data gathered for use by policy makers. The forum established the Exit Code Task Force, charged with constructing a taxonomy that could “account, at any point in time, for all students enrolled (or previously enrolled) in a particular school or district” (National Forum on Education Statistics, 2006, p. 2). The
group developed six broad classifications that met two important criteria: they are mutually exclusive, and they cover every possible situation. The six broad categories are
still enrolled in the district,
not enrolled, eligible to return, and
exited (neither completed nor dropped out).
For these broad categories, 23 subcategories of exit codes were identified. The task force developed these categories by examining the exit codes used in all of the states. They proposed the system of codes as a standard that could be cross-walked with state systems without losing their integrity (National Forum on Education Statistics, 2006).
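A crosswalk of the kind the task force proposed can be sketched as a simple mapping from state-specific exit codes to the forum's broad categories. The state code values below are invented for illustration, and only the broad categories quoted in the text are shown.

```python
# Hypothetical crosswalk from state-specific exit codes to the
# forum's broad categories; the state code values are invented.
CROSSWALK = {
    "E01": "still enrolled in the district",
    "W15": "not enrolled, eligible to return",
    "W24": "exited (neither completed nor dropped out)",
}

def broad_category(state_exit_code: str) -> str:
    # Unmapped codes are flagged for review rather than guessed at,
    # so the broad categories stay mutually exclusive and exhaustive.
    return CROSSWALK.get(state_exit_code, "unmapped: review required")
```

Because the broad categories are mutually exclusive and cover every situation, every state code should map to exactly one category; any code that does not is a gap in the crosswalk, not a judgment call for the data-entry clerk.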
Despite these efforts to standardize the way that graduates and dropouts are identified and rates estimated, differences among states' data systems and state laws can still affect the estimates. Examples from three states help to illustrate these differences.
Florida has had a longitudinal data system in place since 2004. Prior to that time, the PK-12, community college, and the university system sectors each had separate and independent data systems based on unit-record data. The current data system is comprehensive and covers all major data areas, including staff, students, finance, facilities, and programs. The system allows for linkages across all subsystems in the state and local education agencies. It also facilitates data exchanges across all levels of public and, in some cases, private education. The system can track students from the time they first enter school until they leave it and into the labor force and higher education. Lavan Dukes, with the Florida Department of Education, provided some examples of the capabilities of this state system.
For instance, Dukes showed that Florida’s system can identify the number of grade 3 students in a given school year who are not in the same school in which they were enrolled in the prior two years. In the 2007-08 school year, the state recorded approximately 228,500 grade 3 students. Of those students, only about 128,000 were enrolled in the same public school in which they were
[Footnote] This description reflects the characteristics of the system at the time of the workshop in October 2008. For further information, see http://www.fldoe.org/eias/dataweb/database_0809/appenda.pdf.
enrolled in the 2005-2006 and 2006-2007 school years. The data system allows the analyst to identify the remaining 100,500 students who changed schools at least once in the prior two years and connect them to their standardized test scores to evaluate any changes in academic performance associated with mobility. This type of analysis can be conducted for the total group of students as well as for students grouped by race/ethnicity or by economic, disability, or English language learner (ELL) status.
With its comprehensive student data collection system already in place, Florida was immediately able to calculate the rate mandated by the federal government in October 2008 (i.e., the number of students who graduate in four years with a regular high school diploma divided by the number of students who entered high school four years earlier). The state could compare its existing graduation rate calculation with the U.S. Department of Education's mandated calculation to identify differences. The switch required only an adjustment to which exit codes were included in the calculation, creating a smooth transition from the existing method to the new one.
Dukes also demonstrated the type of outcome information available for students after they left the public school system (see Box 6-2 and Box 6-3). The boxes show, for instance, that 65 percent of the 2006 graduates (see Box 6-2) were enrolled in higher education, while only 4 percent of the dropouts (see Box 6-3) were enrolled in higher education. And 3 percent or fewer of the graduates were receiving public assistance in the form of Temporary Assistance for Needy Families (TANF) or food stamps, while 17 percent of the dropouts were receiving this type of aid.
The state has worked hard to develop a set of attendance and record-keeping codes that allow it to accurately track entries, withdrawals, and re-entries and classify the outcomes for each student enrolled in the school system. As shown in Box 6-4, the state uses 12 codes for graduates, with the codes designed to be mutually exclusive. For instance, two codes are used to indicate standard diploma recipients and to distinguish between students who followed a college preparatory curriculum (WFA) and a career preparatory curriculum (WFB). Other codes distinguish between students who earned a standard diploma but took an alternate to the state graduation test (WFT) or were allowed to waive the Florida Comprehensive Assessment Test (WFW). The state has three codes for GED earners: those who pass the GED, pass the state graduation test, and receive a standard diploma (W10); those who pass the GED, pass an alternate graduation test, and earn a standard diploma (WGA); and those who pass the GED, fail the state graduation test, and receive a State of Florida diploma (WGD). A separate code is also used for students who graduate by demonstrating mastery of the Sunshine State Standards for Special Diploma (W07).
Three codes are used for students who receive a certificate of completion. Code W08 indicates students who earned the minimum number of credits but did not pass the graduation test (or an alternate) and did not achieve the
[Box 6-2. Outcome Findings for Florida Public High School Graduates. SOURCE: Data from http://www.fldoe.org/fetpip/. Reproduced by Lavan Dukes in Data Reporting Infrastructure Necessary for Accurately Tracking Dropout and Graduation Rates, presentation for the Workshop on Improved Measurement of High School Dropout and Completion Rates, 2008.]
required GPA. Code W8A is used for students who met all the requirements for a standard diploma but did not pass the graduation test; these students are eligible to take the college placement test in order to enter a state community college. Finally, code W09 is used for special education students who met local requirements but not the state minimum requirements.
A set of codes is also used to designate and distinguish among students who drop out. The codes indicate both that the student has withdrawn from school and provide the reason for the withdrawal. For instance, certain codes identify students who withdraw to enter the adult education program (W26), for medical reasons (W18), or because they were expelled (W21).
[Box 6-3. Outcome Findings for Florida Public High School Dropouts. SOURCE: Data from http://www.fldoe.org/fetpip/. Reproduced by Lavan Dukes in Data Reporting Infrastructure Necessary for Accurately Tracking Dropout and Graduation Rates, presentation for the Workshop on Improved Measurement of High School Dropout and Completion Rates, 2008.]
In Indiana, the use of individual student identifiers was illegal until 2001, and thus its work on developing a longitudinal data system with student-level data did not begin until after this time. The state has specified a method for calculating the cohort graduation rate that is articulated in state law and, as Wesley Bruce, with the Indiana Department of Education, explained at the workshop, changing the formula means that the state legislature must change the law. The
[Box 6-4. Florida's Attendance Record-Keeping Exit Codes. SOURCE: Data from http://www.fldoe.org/eias/dataweb/database_0708/appenda.pdf. Reproduced by Lavan Dukes in Data Reporting Infrastructure Necessary for Accurately Tracking Dropout and Graduation Rates, presentation for the Workshop on Improved Measurement of High School Dropout and Completion Rates, 2008.]
law has gone through several iterations. At present, state law specifies a formula that is slightly different from the NGA formula, specifically:
[T]he department shall determine and report a statewide graduation rate that is consistent with guidelines developed by the [NGA]. If the guidelines are unclear or allow flexibility in determination, the requirements of this chapter apply to the determination of a statewide graduation rate. However, cohort members who leave after less than one (1) year of attendance in an Indiana school and whose location cannot be determined may not be subtracted in the calculation of a statewide graduation rate.
This change in the way that transfer students are treated in the denominator produces state rates in Indiana that differ slightly from the NGA rates. At the workshop, Bruce demonstrated the impact of these differences. For a sample cohort of 648 students, 472 of whom were graduates, the state formula yielded a graduation rate of 76.5 percent, whereas the NGA formula yielded a rate of 75.7 percent.
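The sensitivity of the rate to the denominator rule can be made concrete with a small calculation. Only the cohort size (648) and graduate count (472) below come from Bruce's example; the count of unverified leavers is hypothetical, so the resulting rates are illustrative rather than the ones he reported.

```python
def cohort_rate(graduates: int, entering: int, subtracted_leavers: int) -> float:
    """Graduates divided by the adjusted cohort: entering students
    minus those leavers the formula permits to be subtracted."""
    return 100.0 * graduates / (entering - subtracted_leavers)

# Suppose 40 students (a hypothetical count) left with destinations
# that could not be verified. Whether the formula allows them to be
# subtracted from the denominator moves the rate by several points.
rate_if_subtracted = cohort_rate(472, 648, 40)   # about 77.6 percent
rate_if_retained = cohort_rate(472, 648, 0)      # about 72.8 percent
```

This is why a seemingly minor clause in state law about who "may not be subtracted" produces a state rate that does not match the NGA rate.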
To classify students' enrollment status, the state uses 30 exit codes (see Box 6-5), which are also defined by law or by state board rule.[9] Codes 1 through
[9] See http://www.doe.in.gov/stn/pdf/2007-DM.pdf for additional details about Indiana's coding system.
18 are used for dropouts, and codes 19 through 30 are used to designate transfers. This coding scheme creates some complexities because the codes are not mutually exclusive. For instance, codes 1 through 18 are actually reasons for dropping out. Although a student may actually have multiple reasons for leaving, only one code can be designated for a student per year. Bruce commented that, in an ideal system, one code would be used to indicate that the student dropped out and subcategories would be used to document the reason for leaving. However, changing the coding system would require a change in state law.
The state has three codes for missing students (14, 17, and 26). These cover students who were enrolled at some point but are not currently enrolled and not verified as transfers. In Indiana the original school is held responsible for these students unless their status as a transfer can be verified.
Code 28 (leaving school for religious beliefs) is a state-specific code. In Indiana, this code is primarily used for Amish students who are expected to leave school after grade 8 in order to work. The state has an agreement with the Amish bishops that Amish students may leave school after grade 8. Amish students are not considered in the dropout or graduation rates.
Massachusetts first began assigning unique statewide student identifiers in 2000 and conducted its first statewide collection of data during the 2001-02 school year. At the time, the state collected 35 elements, primarily focused on the demographic information it was required to report to the federal government. The state did not use any of the information for assessment, funding, or accountability purposes for the first two years, which allowed time to create, test, and clean the data and the system. After the first few years, the state added data elements in order to include the programmatic information needed to assess what was happening in schools and how to improve instruction.
Prior to 2006, state law required schools to produce a graduation rate for NCLB. However, the rate Massachusetts actually produced was the percentage of enrolled twelfth graders who passed the state exit exam, referred to as a "competency determination rate." Rob Curtin, with the Massachusetts Department of Education, pointed out that the state was not able to produce the cohort rate before 2006 because the data were not available; that is, the system began in the 2001-02 school year, and it takes five years to accumulate the needed data. Massachusetts now uses a formula that is similar to the NGA formula, with minor adjustments. One adjustment is that the span of time is "within four years" rather than "in four years," in order to include three-year graduates. The cohort is defined as first-time ninth graders,
and they are followed through the summer of the fourth year. This practice of including summer graduates is not uniform across the country. For instance, Jeanine Hildreth, with the Baltimore City Schools, indicated that summer graduates are attributed to the next class in Maryland, and Mel Riddile, with the National Association of Secondary School Principals, said that when he was a principal, students who fulfilled graduation requirements in the summer following completion of grade 12 were treated as dropouts/nongraduates in calculating the graduation rates.
The Massachusetts policy also specifies how subgroup performance is reported. Students classified as ELL, low income, or special education are included with the subgroup for the cohort graduation rate if they were reported in the subgroup for any one of the four years of high school. The rationale behind this policy is to give districts credit for keeping those students through the four years.
Massachusetts uses a set of 21 codes to indicate students' enrollment status (see Box 6-6). The state has a series of codes to designate transfers. Code 20 indicates that the student has transferred to another public school in the state. When students are given this status, officials must search the state records and confirm their location; students who cannot be located are recoded as dropouts. Curtin noted that this requirement is relatively new to the system, and when it was implemented the dropout rate increased by nearly half a percentage point. School officials are also expected to verify out-of-state transfers by confirming that they have received a request for records from the receiving state. Curtin said that although this requirement is difficult to enforce, compliance is confirmed through state audits of local school systems.
The state maintains a GED database that permits tracking students who leave to confirm that they are enrolled in a diploma-granting adult education course. Students who obtain a GED by October 1 of the following year are removed from the state’s dropout count. GED recipients are not considered graduates, however; they are treated as a distinct group from dropouts and graduates.
One difference from Indiana is that Massachusetts does not have a code for missing students. Curtin indicated that if a student is reported in one data collection, the student must be tracked down and reported on in the next data collection.
Code 9 is reserved for special education students who have reached the maximum age of 21 and are released from school even though they have not received a diploma or a certificate of attainment. Those students are reported in a separate category and are not included in either dropout rates or graduation rates. They are nongraduates, but not dropouts. They are not included in the denominator for the cohort.
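Massachusetts' bookkeeping rules, as described above, can be sketched as a classification over the exit codes in Box 6-6. The code values follow the box, but the grouping names and treatment labels are invented, and details such as the October 1 GED deadline and the within-four-years window are reduced to comments.

```python
# Exit codes from Box 6-6, grouped by how they are treated in the
# cohort graduation rate (grouping and label names are invented).
GRADUATE_CODES = {"04"}                           # within four years,
                                                  # including the summer
TRANSFER_CODES = {"20", "21", "22", "23", "24"}   # verified transfers
DROPOUT_CODES = {"30", "31", "32", "33", "34", "35", "36"}
EXCLUDED_CODES = {"09"}   # aged-out special education students: neither
                          # graduates nor dropouts, removed from cohort

def cohort_treatment(exit_code: str) -> str:
    if exit_code in GRADUATE_CODES:
        return "count as graduate"
    if exit_code in TRANSFER_CODES:
        return "remove from denominator"
    if exit_code in DROPOUT_CODES:
        # Students who earn a GED by October 1 of the following year
        # are removed from the dropout count but are not graduates.
        return "count as dropout"
    if exit_code in EXCLUDED_CODES:
        return "exclude from numerator and denominator"
    return "review"
```

The value of a scheme like this is that every exit code carries an unambiguous consequence for the rate, so two districts coding the same student the same way produce the same cohort arithmetic.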
Box 6-6. Massachusetts Attendance Record-Keeping Exit Codes
04 Graduate with a competency determination
05 Permanent exclusion (expulsion)
09 Reached maximum age, did not graduate or receive a Certificate of Attainment
10 Certificate of Attainment
11 Completed grade 12 and district-approved program (district does not offer a Certificate of Attainment)
20 Transferred — In state public
21 Transferred — In state private
22 Transferred — Out-of-state (public or private)
23 Transferred — Home-school
24 Transferred — Adult diploma program, leading to MA diploma
30 Dropout — Enrolled in a nondiploma-granting adult education program
31 Dropout — Entered Job Corps
32 Dropout — Entered the military
33 Dropout — Incarcerated, district no longer providing educational services
34 Dropout — Left due to employment
35 Dropout — Confirmed dropout, plans unknown
36 Dropout — Student status/location unknown
40 Not enrolled but receiving special education services only
41 Transferred — No longer receiving special education services only
SOURCE: Data from http://www.doemass.org/infoservices/data/sims/enroll_validations.pdf. Reproduced by Robert Curtin in A Long Road to a Longitudinal Data System, presentation for the Workshop on Improved Measurement of High School Dropout and Completion Rates, 2008.
CHARACTERISTICS OF AN EFFECTIVE DATA SYSTEM
The representatives of state departments of education who participated in the workshop are in charge of some of the most comprehensive data systems in the country. These panelists offered a number of suggestions for states as they work to create their longitudinal data systems. In addition, publications available through the DQC provide advice to states about data system development (see Data Quality Campaign, 2006a, 2006b, 2006c, 2007). The suggestions drawn from these sources are discussed below.
Suggestions for States
One key issue is identifying the intended functions of the system so that they can guide the development project. Given the current focus on accountability, some of the functions that systems should be able to perform include:
Following students’ academic progress and examining progress for critical subgroups, such as by race/ethnicity, gender, social/economic status, English proficiency, and disability status.
Determining the effectiveness of specific schools and programs.
Identifying high-performing schools and districts so that educators and policy makers can determine the factors that contribute to high academic performance.
Evaluating the effect of teacher preparation and training programs on student achievement by connecting students to test scores and teachers.
Evaluating the extent to which school systems are preparing students for success in rigorous high school courses, college, and challenging jobs by connecting students and their performance to higher education and workforce data.
In order to do this, comprehensive data systems need to include a variety of information about students, teachers, and instructional programs. The workshop participants identified the following as important data elements that systems should include:
Student demographics, such as race, gender, grade level, ELL status, disability status, migrant status, and socioeconomic status (e.g., education of parents).
Student coursework data, such as the names of courses taken, credits attempted, credits earned, and overall GPA.
Student attendance records, such as entry date, withdrawal date, days present, days absent.
Student assessment data, such as tests taken (form, date, publication year), test results (score, performance level).
Discipline information, such as type, context, and results.
Teacher information, such as type of degree, years of experience, college attended, certification status, in/out-of-field status, advanced certification status (i.e., certification by the National Board for Professional Teaching Standards), highly qualified teacher status.
Financial information capable of being linked to the school information.
Staff/personnel information capable of being linked to school information.
The presenters made a number of suggestions regarding data elements. They emphasized that data quality begins at the local school district with the person who enters the data; accuracy depends on the quality of data that are entered. They advised states to refer to Building the Culture of Data Quality (National Forum on Education Statistics, 2005), which provides a number of
important guidelines regarding data quality. They also advised states to develop clearly defined, carefully articulated coding systems that everyone understands. It is important to think about the codes and processes in terms of the possibilities for "gaming the system": developers should anticipate ways that schools might interpret the rules other than as intended and try to prevent these misinterpretations. To the extent possible, the system should begin with the most granular data available; granular data can always be aggregated up, but aggregate data cannot be disaggregated down. If the goal is to compare across years, it is important that the data and algorithms remain consistent; one small change can result in inaccurate and inappropriate comparisons. Annual written documentation of processes, procedures, and results will help maintain consistency and quality over time. It is also critical to institute a process for adding elements or making changes to the data system: new data elements should be clearly defined, their coding should be understood, and they should adhere to the established protocols for the system.
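The point about granularity can be made concrete: student-level exit records can always be rolled up into school-level counts, but published school-level counts cannot be split back into student records. A minimal sketch follows; the record layout and values are invented.

```python
from collections import Counter

# Granular, student-level exit records (invented examples):
# (school, final exit outcome), one tuple per student.
exit_records = [
    ("school_a", "graduated"), ("school_a", "dropped_out"),
    ("school_a", "graduated"), ("school_b", "graduated"),
]

# Aggregating up is trivial: count records per (school, outcome) pair.
per_school = Counter(exit_records)

# The reverse is impossible: from a published aggregate such as
# "school_a enrolled 3 leavers" there is no way to recover which
# students graduated and which dropped out.
```

A system that stores only the aggregates has permanently discarded the information needed for any finer-grained question asked later.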
The presenters also emphasized that it is important to take the time to get it right. Although states are under time pressure to produce the rates, it is important that the rates are not reported until the data are of sufficient quality to yield accurate estimates. They noted that there are several ways to improve the quality of data received from the schools. One way is to publicly report it. As one workshop participant commented, “accuracy always improves when someone is embarrassed by a public report.” Another way is to conduct regular audits of the school systems to ensure that reporting of student enrollment status is accurate and that adequate documentation is obtained to verify the status of transfer students. In addition, extensive and ongoing staff training for the collection, storage, analysis, and use of the data at the state, district, and school levels will also help ensure data quality. Workshop participants advised that local school districts must be willing and able to comply with the requirements for creating the system and ensure high-quality data input from their end of the system.
The presenters pointed out that schools and districts will be more likely to provide quality data if they see the benefit of the information. Having the ability to access the data and make data-based decisions is key to using the data to improve instruction and educational outcomes. For instance, Robin Taylor, with the Delaware Department of Education, demonstrated how she is able to access her system on an as-needed basis to answer specific questions. In preparation for the workshop, she used her system to inquire about the annual by-grade dropout rates in her state for the 2007-08 school year. In a matter of minutes, she was able to determine that 41 percent of the students who dropped out in Delaware did so in grade 9, 32 percent in grade 10, 19 percent in grade 11, and 8 percent in grade 12.
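The kind of by-grade query Taylor described can be sketched in a few lines. The record counts below are hypothetical, chosen only so that the resulting shares match the Delaware figures quoted above; a real system would derive the grade of each dropout event from student-level enrollment records.

```python
from collections import Counter

# Hypothetical student-level dropout records, one entry per dropout event,
# keyed by the grade in which the student dropped out. In a longitudinal
# system each record would also carry a unique student identifier.
dropout_grades = [9] * 41 + [10] * 32 + [11] * 19 + [12] * 8

counts = Counter(dropout_grades)
total = len(dropout_grades)

# Share of all dropouts occurring in each grade, as a percentage.
shares = {grade: 100 * n / total for grade, n in sorted(counts.items())}
print(shares)
```

The point of the demonstration is not the arithmetic, which is trivial, but that a well-designed system lets an administrator answer such a question in minutes rather than commissioning a special data collection.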
The presenters also noted that sophisticated and complex systems are
not always needed. The key is that the system be flexible, integrated, understandable, and able to grow. However, the state education agency must have the technological infrastructure, including hardware and software, to design, develop, and implement the system. The political will must also exist to create and implement a system that produces valid and useful results.
Standards for Data Systems
Lavan Dukes discussed a set of standards created by the Governmental Accounting Standards Board (GASB). Although these standards were originally developed in the context of financial information systems, their characteristics can be applied to any longitudinal data collection system:
Understandability. Information should be simple but not oversimplified. Explanations and interpretations should be included when necessary.
Reliability. Information should be accurate, verifiable, and free from bias. It should be comprehensive; nothing should be omitted that is necessary to represent events and conditions, nor should anything be included that would cause the information to be misleading.
Relevance. There must be a close logical relationship between the information provided and the purpose for which it is needed.
Timeliness. Information from the system should be available soon enough after the reported events to affect decisions.
Consistency. Once a principle or a method is adopted, it should be used for all similar events and conditions. The same calculation, performed by anyone, should yield the same result.
Comparability. Procedures and practices should remain the same across time and reports. If differences occur, they should be due to substantive differences in the events and conditions reported rather than arbitrarily implemented practices or procedures for data collection.
In addition, data systems that store individually identifiable information should be designed and maintained to ensure that proper privacy and confidentiality protections are implemented. There are a number of potential dilemmas posed by such data systems, and states should clearly define and implement systems to control who has access to the information, what information they have access to, and what uses can be made of the information. For instance, in addition to implementing appropriate computer security protections, the state should determine who outside of school officials should have access to the data (e.g., law enforcement officials, social services, attorneys, parents, researchers) and should define acceptable uses. Privacy and confidentiality rules are addressed in documents available from the Data Quality Campaign (see http://dataqualitycampaign.org/resources/980).
Dropout and completion rates cannot be calculated without data. The accuracy of the rates depends on the accuracy and the completeness of the data used for their calculation. State and local education agencies play the leading role in collecting the data that are used to produce cohort rates, the rates that are ultimately used for accountability purposes. In this chapter, we discussed the essential elements of a longitudinal data system identified by the Data Quality Campaign. We think that these components are critical for ensuring that data systems are able to track students accurately, calculate dropout and completion rates, monitor students’ progress, identify students at risk of dropping out, and conduct research to evaluate the effectiveness of their programs. We encourage all states to incorporate these components into their systems and therefore recommend:
RECOMMENDATION 6-1: All states should develop data systems that include the 10 essential elements identified by the Data Quality Campaign as critical for calculating the National Governors Association graduation rate. These elements include a unique student identifier; student-level information (data on enrollment, demographics, program participation, test scores, courses taken, grades, and college readiness test scores); the ability to match students to their teachers and to the postsecondary system; the ability to calculate graduation and dropout rates; and a method for auditing the system.
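As a rough illustration of why student-level tracking is required, the adjusted-cohort logic underlying such a graduation rate can be sketched as follows. This is a simplified sketch, not the official regulatory definition, and all counts are hypothetical; the key point is that transfers in and out can only be counted correctly if individual students can be followed across schools.

```python
def cohort_graduation_rate(first_time_9th, transfers_in, transfers_out, graduates):
    """Simplified adjusted-cohort graduation rate.

    The cohort starts with first-time 9th graders in a given year, adds
    students who later transfer into the cohort, and removes students who
    verifiably transfer out. On-time graduates are divided by this
    adjusted cohort. Identifying each category requires student-level,
    longitudinal records with unique identifiers.
    """
    adjusted_cohort = first_time_9th + transfers_in - transfers_out
    return graduates / adjusted_cohort

# Hypothetical counts for one entering class:
rate = cohort_graduation_rate(first_time_9th=1000, transfers_in=50,
                              transfers_out=80, graduates=800)
print(f"{rate:.1%}")
```

Without documented transfer verification, the same 800 graduates could be divided by anywhere from 970 to 1,050 students, which is precisely why the audit and documentation practices discussed above matter for the accuracy of reported rates.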
States and local education agencies can take a number of steps to ensure the quality of their data systems and the data that are incorporated into them. Specifically, data systems should be developed so that the information contained in them is understandable, reliable, relevant for the intended purpose, available in a timely manner, and handled in a consistent and comparable way over time. Annual written documentation of processes, procedures, and results will help maintain consistency and quality over time. It is particularly important to maintain historical documentation of the processes, procedures, and information used to calculate the rates that are reported from the data system so that rates can be reproduced if needed and the data can be handled in a comparable way in subsequent years. It is also critical to institute a process for adding elements or making changes to the data system. Likewise, mechanisms for data retrieval should be incorporated into system designs so that usable data sets can be easily produced. New data elements should be clearly defined, the coding should be documented, and the new elements should adhere to established protocol for the system. If the goal is to make comparisons across years, it is important that the data and algorithms remain consistent. One small change in method may result in inaccurate and inappropriate comparisons. We therefore recommend:
RECOMMENDATION 6-2: All states and local education agencies should maintain written documentation of their processes, procedures, and results. The documentation should be updated annually and should include a process for adding elements or making changes to the system. When data systems or recording procedures or codes are revised, old and new systems should be used in parallel for a period of time to determine consistency.
The quality of the data begins at the point when data are collected and entered into the system. It is therefore important that training be provided for those who carry out these tasks. Extensive and ongoing staff training should cover the collection, storage, analysis, and use of the data at the state, district, and school levels. To this end, system developers should develop clearly defined, carefully articulated coding systems that all contributors to and users of the system can understand. As they do this, system developers should think about ways that those entering the data might interpret the rules in ways other than what was intended and try to prevent these misinterpretations. On this point, we recommend:
RECOMMENDATION 6-3: All states and local education agencies should implement a system of extensive and ongoing training for staff that addresses appropriate procedures for collection, storage, analysis, and use of the data at the state, district, and school levels.
An important mechanism for verifying the accuracy of data that are incorporated into the system is to conduct regular audits of the school systems. Audits can help to ensure that local education agencies are following the intended procedures, that reporting of student enrollment status is accurate, and that adequate documentation is obtained to verify the status of transfer students and students coded as dropouts. Audits can also help to identify procedures or processes that are posing problems and can be used to improve the instructions provided to school systems. We therefore recommend:
RECOMMENDATION 6-4: All states and local education agencies should conduct regular audits of data systems to ensure that reporting of student enrollment status is accurate and that adequate documentation is obtained to verify the status of transfer students and students who drop out.