This is the author's personal reflection on the characteristics of a good test. It can help teachers recognize some important points that need to be considered when constructing tests for their students.

If the curriculum is not tested, it is difficult to know if any of it works. Without standardized tests, reliably gauging student progress becomes problematic for anyone outside the classroom. One must accept whatever each teacher says, and without standardized tests, points of comparison for different classrooms become progressively rarer. Without either common standards or high-stakes standardized tests, there may be no effective way to monitor systemwide performance at all. Some U.S. teachers may be doing a wonderful job in their totally customized classes, but some may be doing an awful job. How is one to know or tell which? One must hope that teachers will face down their own natural inclinations as well as those of students, parents, and schools to avoid accountability and hold themselves and their students to high standards of performance regardless. One must also hope that teachers will know how.

Some previous studies have shown that teacher-made tests are good and satisfactory. However, the majority of teachers do not validate their tests before administering them to students. This study describes teachers’ perspectives on their own teacher-made tests, in order to determine the extent of their agreement regarding their attitudes toward the tests, the quality of the tests, and the use of the tests. Five English teachers participated in this research. Their views on the tests they had made were analyzed and described. The results showed that (1) the teachers agreed about the appropriateness of the tests they administered; (2) the teachers believed that the data obtained during the research were useful and meaningful; and (3) the teachers used the tests to identify and evaluate their learning objectives, students’ learning needs, students’ learning difficulties, and school evaluation.

Assessment is a very important part of the learning process. To conduct an assessment, teachers have to design a test. To ensure the quality of a test, a physics teacher needs to carry out item analysis. There are several forms of item analysis, including analysis of the difficulty index, the discrimination index, and validity and reliability. This research is a descriptive study that aims to provide systematic and accurate information on the actual situation, in this case the difficulty index, discrimination index, validity, and reliability of the items. The variable analyzed was the quality of teacher-made physics tests. The results of this study indicate that teacher-made physics tests have low validity, moderate reliability, a high difficulty index, and a poor discrimination index.
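
The difficulty and discrimination indices named in this abstract can be illustrated with a minimal sketch. The abstract does not say which formulas the study used, so the code below assumes one common classical convention: the difficulty index is the proportion of examinees answering an item correctly (so a higher value means an easier item), and the discrimination index is the difference between that proportion in the top-scoring and bottom-scoring groups, with each group taken as 27% of the examinees. The function names and the response data are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple


def difficulty_index(item_scores: List[int]) -> float:
    """Proportion of examinees who answered the item correctly (scores are 0 or 1)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores: List[int],
                         total_scores: List[int],
                         fraction: float = 0.27) -> float:
    """Upper-lower discrimination: p(correct | top group) - p(correct | bottom group).

    The 27% group size is an assumed convention, not taken from the study.
    """
    n = len(item_scores)
    k = max(1, round(n * fraction))  # examinees per group
    # Rank examinees from highest to lowest total score (ties keep input order).
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


def item_analysis(responses: List[List[int]],
                  fraction: float = 0.27) -> List[Tuple[float, float]]:
    """Return (difficulty, discrimination) for every item in a 0/1 response matrix."""
    totals = [sum(row) for row in responses]
    results = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        results.append((difficulty_index(item),
                        discrimination_index(item, totals, fraction)))
    return results


# Hypothetical data: 6 examinees (rows) by 4 items (columns), 1 = correct.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]

for number, (p, d) in enumerate(item_analysis(responses), start=1):
    print(f"item {number}: difficulty={p:.2f}, discrimination={d:.2f}")
```

Under this convention, an item with a discrimination value near zero or below does little to separate stronger from weaker examinees, which is the sense in which an item can be said to discriminate poorly.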

Although most students never tolerate tests, quizzes are, by contrast, somehow acceptable owing to their two most important characteristics: 1. Duration: they are short, concise tests that take no more than 15 minutes. 2. Limited scope: the students are tested on only a limited range of lessons.

The U.S. public has consistently favored standardized testing in the schools, preferably with consequences (or “stakes”) riding on the results, ever since the first polls taken on the topic several decades ago. Depending on how the question is framed, those in favor of high-stakes standardized testing outnumber those opposed at ratios as high as twelve to one. Parents are stronger supporters of high-stakes testing than are nonparents, and that support does not budge when they consider the possibility of their own progeny failing. Results from different polls approaching the topic in different ways suggest that nearly all Americans would like to see high-stakes tests administered at least once at every grade level. In twelve years of elementary and secondary school, however, the typical U.S. school district offers just one or two standardized tests with high stakes for students. With only a few exceptions, U.S. educational testing programs fall short of what the public wants, and short of what most industrialized countries have.

Assessment is a crucial aspect of the teaching and learning processes. According to Nunan (1999), “assessment refers to the tools, techniques, and procedures for collecting and interpreting information about what learners can and cannot do”. Based on this statement, this paper focuses on tests. They are useful tools for evaluating students, provided that they are implemented appropriately; that is why their analysis is a critical factor in the assessment process, given that it informs teachers about their instruction, students’ performance, and the curriculum, among other things. Bailey (1998) states that “when we talk about a test being “appropriate,” the issue is partly whether the test provides us with the information we need to gain about the students we serve.” It is paramount to identify the goals we want our students to reach, where they stand in relation to those goals, and the tools we need to provide them with in order to help them achieve more. Test analysis helps us to identify those aspects and to adjust our teaching practices so that students succeed. Accordingly, the effectiveness of three subtests will be analyzed, and their validity, reliability, practicality, and positive washback (Bailey, 1998, p. 3) will be determined as well. These subtests are part of a single language test that was administered to eleventh graders at a public school. It combines three different constructs, taking into account that students at this grade need to be prepared for a standardized test called the Saber Test, which evaluates learners in a similar way. It assesses learners’ English skills, such as their pragmatic, lexical, communicative, and grammatical knowledge, and their reading comprehension (ICFES, 2014). The test discussed in this report is a progress test whose stimulus material is based on mythology.
This topic was studied in depth during the second term of the academic year, and learners have received complementary knowledge in other classes, such as Social Studies and Spanish. My purpose is to discover whether the test I designed fulfills my students’ needs, expectations, and English level or whether, on the contrary, it needs to be modified in order to obtain reliable information about my pupils’ learning process. Additionally, I will analyze how well the test measures the constructs and how they relate to each other; as Brown (2005) puts it, validity “is the degree to which a test measures what it claims, or purports, to be measuring”.

This study aimed to investigate the quality of a multiple-choice test created by mathematics teachers on the topic of mathematical logic for tenth graders at Pangeran Antasari Senior High School in Medan. The test consisted of 40 multiple-choice items, and a classical test theory analysis was carried out. The results showed that 22 of the 40 items were valid and 18 were not. The reliability coefficient was 0.88196, meaning that the test had high reliability and consistency over time. Furthermore, the item-difficulty analysis revealed that 17 items were easy, 4 items were moderate, and 1 item was difficult. The distractor analysis indicated that 21 items were good and 1 item was very good. Among the 22 valid items, there were both good and tricky items. It was reasonable to conclude that 22 of the 40 items created by the mathematics teachers could be categorized as good. Preliminary. Bruce, Weil and Calhoun (Sumiati and Asra: 2008) state that learning is essentially a complex process, but one with the same purpose of providing learning experiences to students in accordance with the objectives. The objective is in fact the reference point for implementing the learning process. To determine whether learning objectives have been achieved, it is necessary to evaluate learning outcomes. According to Tyler (Rashid and Mansour: 2008), evaluation is the process of determining the extent to which educational goals have been achieved. A broader definition is put forward by two other experts, Cronbach and Stufflebeam (2000), who add that evaluation does not simply measure the extent to which the objectives are achieved but is also used to make decisions. One way to evaluate learning outcomes is to use the results of learning tests.
For the results of a learning test to serve their function of measuring the achievement of learning objectives, one of the teacher's tasks is to evaluate the test instruments that have been made, for example through test item analysis, to determine their quality. In reality, however, not many teachers do so. Analyzing test items is an activity that teachers must carry out to improve the quality of the tests they have written. Darwyan Shah et al. (Arifin: 2009) define test item analysis as an investigation or study of each part of the whole set of questions that learners must answer. Nana Sudjana (2009) defines test item analysis, or item analysis, as the assessment of test questions in order to obtain a set of questions of adequate quality. From the definitions above, it can be concluded that item analysis is a process carried out to investigate, research, and review test questions in order to obtain a set of questions of adequate quality. There are several reasons why the analysis of test items is required. According to Asmawi Zainul et al. (1997), these reasons include: a. To identify the strengths and weaknesses of test items, so that items can be selected and revised. b. To provide complete information on the specifications of the items, making it easier for test developers to compile sets of questions that meet the needs of a particular field and level.
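
The abstract above reports a reliability coefficient but does not say which formula produced it. As a rough illustration only, the sketch below assumes the Kuder-Richardson 20 (KR-20) coefficient, which is often used in classical analysis of right/wrong multiple-choice items, and it assumes the population variance of total scores (some textbooks use the sample variance instead). The function name and the response data are hypothetical and are not taken from the study.

```python
from statistics import pvariance
from typing import List


def kr20(responses: List[List[int]]) -> float:
    """KR-20 reliability for a 0/1 response matrix (rows = examinees, columns = items).

    KR-20 = (k / (k - 1)) * (1 - sum(p_j * q_j) / variance of total scores),
    where k is the number of items, p_j the proportion correct on item j,
    and q_j = 1 - p_j. Population variance is an assumed choice here.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)

    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


# Hypothetical data: 6 examinees (rows) by 4 items (columns), 1 = correct.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]

print(f"KR-20 = {kr20(responses):.4f}")
```

A coefficient closer to 1 indicates more internally consistent items. A short test with few examinees, as in this toy data, will usually give a much lower value than the figure reported for the 40-item test.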

