LAI 534 Measurement & Evaluation of Science Education

Course Registration Number: 096537
Tuesday, 4:10 - 6:50 pm
216 Baldy Hall
Instructors:  R. L. Doran
Office: 562 Baldy Hall
Office Hours:  Tuesday and Thursday 1:00 - 4:00 pm
Asst. Instructors:  Carol Lenhardt (589 Baldy) & Joe Zawicki (304 Baldy)
(Texts available through the instructors and the UB Bookstore)

Course Overview

Electronic Resources for Curriculum and Assessment – New York State


Course Overview

                          '80     '98
1             1/16     1               Introduction, Overview, Requirements

2             1/23     1     1-2     Instructional Objectives, Domains, Taxonomies, Test Grids

3             1/30     2               Item Formats, Above Recall, RCT, NORC

4             2/6       2               Constructed Response, Alternative Formats, SED

5             2/13     5                Item and Test Analysis, Validity & Reliability

6             2/20     3                Schemas & Test Grids - Affective

7             2/27     3                Affective Formats & Inventories

8             3/6    UB BREAK

9             3/13     4     3         Performance-based Assessment, Rubrics

10           3/20     4     5-8      Laboratory & Inquiry Skills

11           3/27     6     4         Grading, Portfolios, Journals

12           4/3       4                TIMSS, SED, Item Pools

13           4/10                       Revisit Cognitive

14           4/17                       School Break

15           4/24                       Revisit Affective

16           5/1                         Revisit Inquiry

17           5/8                         Student Presentations of Portfolios

Course Requirements:

This course will focus on improving your measurement and evaluation skills and interpretation of assessment data.  You will be expected to do specific readings on the above topics, participate in class discussions, and involve yourself in the following activities:

  1. a. Sets of four items (each of a different format: true-false, multiple-choice, completion, matching, short answer) for three important outcomes in your science area.  The three outcomes should be one from each of the three levels: knowing, using, and integrating.   Due 2/6
     b. Table of Specifications (test grid) for a science unit or course with estimated emphasis of content topics and cognitive outcomes.  A short rationale should explain the details of the table.  Due 2/6
     c. A unit test composed of 25 multiple-choice items for a content unit taught early in the spring semester.  Items can be selected from various existing sources: Regents exams, commercial tests, textbooks/curriculum project tests, etc.  Please indicate source and test grid.  The correct answers or scoring key is an important part of each item.  Due 2/1
     d. A set of 50 test items (various formats) written to fit a test grid for a science unit or course.  Each item should be labelled as to the content topic and cognitive outcomes from your test grid.  The correct answers or scoring key is an important part of each item.  At least 50% of the items must be above the recall/knowing level.  Due 3/13
  2. Using student responses (preferably over 30 students), conduct an item and test analysis (available through the UB Computing Center) of a 25-item test (activity 1c).  Critique the items using the data obtained from the analyses and your knowledge of the outcomes, the items and the students.  Write specific revisions of 10 weaker items based on your analysis.  Due 4/3
  3. a. Develop a description of affective outcomes for a science course or unit (may be uni- or multi-dimensional).  Due 4/3
     b. Create a 40-item inventory to assess these outcomes, using at least three different item formats.  Due 4/3
  4. a. Administer two or three performance assessment tasks to a group of students.  Score and evaluate student responses.  Write a short descriptive report.  Due 4/10
     b. Develop a description of laboratory/performance/skill outcomes (list or test grid) for a science course or unit.  Due 4/10
     c. Prepare a set of 20 items (at least 3 tasks) designed to assess these outcomes, using at least three formats.  A rubric for scoring must be included.  Due 4/10
  5. Prepare a rationale for an evaluation/grading plan for students in a science course or unit.  Statement must include two parts:
     a. The "mechanics" - detailed procedures for collecting and organizing data and determining grades, e.g., calculating weighted scores (if used).  Due 4/2
     b. The "philosophy" - description of why this plan is relevant for the course, student, and instructor.  Due 4/24
 Student grades for LAI 534 will be based on the quality of the above activities.  A weighting system will be discussed by the class and instructor at the first class meeting.  All material submitted to the instructor will be returned to the students as quickly as possible.  Any student who cannot complete all assignments by May 11 will receive an Incomplete. Detailed plans for finishing the work should include completion before enrolling in further courses.

 The following "weighting system" will be used unless students submit a personalized system.  Constraints:
   1)  Total of 100 points,
   2)  Weights are multiples of 5,
          between 5 and 20.
1.     a      5
        b    10
        c    10
        d    20

2.     10

3.     a     5
        b   10

4.     a   10
        b    5
        c   10

5.     5
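
For students who want to propose a personalized weighting system, the constraints above can be checked mechanically. The following is a minimal sketch, not an official course tool: the function names (check_weights, course_score) and the idea of recording each activity as a fraction of credit earned are assumptions made for illustration. The default weights are copied from the list above.

```python
# Default weights from the syllabus, keyed by activity label.
DEFAULT_WEIGHTS = {
    "1a": 5, "1b": 10, "1c": 10, "1d": 20,
    "2": 10,
    "3a": 5, "3b": 10,
    "4a": 10, "4b": 5, "4c": 10,
    "5": 5,
}


def check_weights(weights):
    """Return a list of constraint violations (empty if the system is valid)."""
    problems = []
    total = sum(weights.values())
    if total != 100:
        problems.append(f"total is {total}, not 100")
    for activity, w in weights.items():
        if w % 5 != 0:
            problems.append(f"{activity}: {w} is not a multiple of 5")
        if not 5 <= w <= 20:
            problems.append(f"{activity}: {w} is outside the range 5 to 20")
    return problems


def course_score(weights, fractions):
    """Weighted total out of 100.

    fractions[activity] is the share of credit earned on that activity
    (0.0 to 1.0); this representation is an assumption, not course policy.
    """
    return sum(w * fractions.get(activity, 0.0) for activity, w in weights.items())


if __name__ == "__main__":
    print(check_weights(DEFAULT_WEIGHTS))  # an empty list means all constraints hold
```

A personalized system can be passed to check_weights in the same dictionary form before it is submitted.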


Electronic Resources for Curriculum and Assessment – New York State

Many documents are in portable document format (*.pdf); these documents can be read using Adobe Acrobat Reader.  The program is available free through the Adobe site:

The State Education Department homepage:

The website contains a pull-down menu linked to approximately 80 different topics.  The links for assessment and curriculum are expanded below; you may wish to explore additional links from the main website.

The Office of Assessment homepage:

The NYS Assessment homepage addresses the following science related information:

General Information

Examination Schedules

New and Revised Assessments

Elementary and Intermediate Level Examinations

Secondary Level Examinations

Curriculum Resource Guides

Publications Catalog

Mathematics, Science and Technology

Science -- Resource Guide with Core Curriculum

Technology Education

Exam Design (High School & Science 8)

Part A
Content-based, multiple-choice questions (approximately 30% of examination)
Part B
Content and skills-based, multiple-choice and constructed-response questions (approximately 30% of examination, partial credit possible)
Part C
Content and real-world application, extended constructed-response (approximately 25% of examination, partial credit possible)
Part D
Laboratory performance tasks, administered prior to on-demand portion of examination (score will comprise 15% of examination grade)

Trends in Measurement and Evaluation of Science Instruction

Purposes of Evaluation
Society – literacy (science/technology)

Educators – articulation (3 levels)

Parents – vocation

Students – personal growth

Types of Evaluation
Diagnostic -- pretest, remediation

Formative -- feedback, reinforcement

Summative – grading, achievement

Foci of Evaluation




Cognitive – norm or criterion referenced

Affective – interests, values

Psychomotor – lab skills

Verbal behavior

Instructional strategies




Logical structure

Cognitive level

Process orientation

Methods of Evaluation

Paper-and-pencil: M-C, T-F, Essay


Lab Performance Test

Open Book

Take Home

Methods of Evaluation (Cont'd)
Observations – checklist



Rating scale

Product analysis



Self/peer evaluation -- conferences

Situation 1
In photosynthesis, the function of chlorophyll is that of:

A. an enzyme in digestion

B. carbon dioxide in respiration

C. bile in the digestion of fat

D. glucose in respiration

Situation II
The following statements are to help you describe yourself in science. Please respond to them as if you were describing yourself to yourself. Do not omit any item. Read each statement carefully; then select one of the five responses listed below.
Completely   Mostly   Partly   Mostly   Completely
False        False    T/F      True     True
1            2        3        4        5

Remember, respond to the statements as if you were describing yourself to yourself in science.

I am satisfied with my ability to make predictions.

I do well on number problems in class.

I wish I could make better conclusions based on what I have seen in class.

I am a person who works well with numbers.

I can compare things.

I give up when I have to classify things.

Situation III
Alarmed by reports of plummeting scores on student achievement tests in science, the school board of Technotown – in response to many appeals from its citizens – has commissioned a study of the district’s secondary science program in an effort to identify its weaknesses. The study team – composed of outside consultants as well as teachers and administrators within the system – will use NSTA’s Guidelines for Self-Assessment package in its work.
The titles of its modules are:

Our School’s Science Curriculum

Our School’s Science Teachers

Science Student/Teacher Interactions

Science Facilities and Teaching Conditions

Assessment, Measurement, Testing and Evaluation

Assessment – collection of information via various formats and modes (qualitative, quantitative) and for various purposes

Measurement – a form of assessment which uses written forms of data collection, to include tests, checklists, and inventories

Testing – specialized mode of assessment which is usually timed, consists of discrete items or questions and is focused on a specified set of objectives

Evaluation – making decisions and judgments based on information collected and assumed or established criteria

Predicted Trends in Measurement and Evaluation of Science Instruction


Group-administered tests

Pencil-and-paper tests

End-of-course summative assessment

In the Future:

Variety of formats: large & small group, individual

Variety of formats: pictorial, laboratory performance

Variety of pretest, diagnostic and formative types of measurements


Measurement of low-level cognitive outcomes

Norm-referenced achievement testing

In the Future:

The inclusion of higher level cognitive outcomes (analysis, evaluation, critical thinking) as well as affective (attitudes, interests, and values) and psychomotor outcomes

The inclusion of more criterion-referenced assessment, mastery testing, and self and peer evaluation


Measurement of facts and principles of science

Measurement of student achievement

In the Future:

The inclusion of objectives related to the processes of science, the nature of science, and the interrelationship of science, technology and society

The inclusion of measuring the effects of programs, curricula and teaching techniques


Teacher-made tests

Concern with total test scores

In the Future:

The combined use of teacher-made tests, standardized tests, research instruments, and items from collections assembled by teachers, projects and other sources

Interest in sub-test performance, item difficulty and discrimination, all aided by mechanical and computerized facilities


One-dimensional format of evaluation (e.g., a numerical or letter grade)

In the Future:

A multidimensional system of reporting student progress with respect to such variables as concepts, processes, laboratory procedures, classroom discussion, and problem-solving skills
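
The item difficulty and discrimination statistics mentioned among the trends above can be computed from a table of right/wrong responses. The sketch below is one common classroom approach, assuming difficulty is the proportion of students answering correctly and discrimination is the difference in that proportion between the top and bottom 27% of students ranked by total score. The function name and the 27% default are illustrative assumptions; the analysis offered through the UB Computing Center may report different statistics.

```python
def item_analysis(responses, group_fraction=0.27):
    """Compute (difficulty, discrimination) for each item.

    responses: one list per student, holding 1 (correct) or 0 (incorrect)
    for each item, with items in the same order for every student.
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)

    # Size of the upper and lower comparison groups (at least one student).
    k = max(1, round(n_students * group_fraction))
    upper, lower = ranked[:k], ranked[-k:]

    results = []
    for i in range(n_items):
        # Difficulty: proportion of all students answering the item correctly.
        difficulty = sum(r[i] for r in responses) / n_students
        # Discrimination: upper-group proportion minus lower-group proportion.
        p_upper = sum(r[i] for r in upper) / k
        p_lower = sum(r[i] for r in lower) / k
        results.append((difficulty, p_upper - p_lower))
    return results
```

Items with a discrimination value near zero or below zero are natural candidates for the kind of revision described in the course requirements.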

Dimensions: Content vs. Behavior

Item Pool Mechanics


Index cards

Electronic databases

Commercial programs


Levels of assessment (k, u, i)

Item Analysis Report:
NYSED 1975

(Electronic Database Example)

Table of Specifications
Biological Science Curriculum Study (BSCS)

Ability to: recall and organize materials learned, apply knowledge to new concrete situations, use skills involved in understanding scientific problems, show relationships between bodies of knowledge
(Doran, p. 23-24)

Comparison of Essay and Objective Items

Abilities measured


Incentive to pupils

Ease of preparation


(Doran, p. 27)
Item Pools

SISS (5, 9)

TIMSS (4, 8, 12)




Projects (Rochester)


State Education Departments (NY, California, …)





Test Grids

Table 4 (test grid with "Ideal %" columns)

Table 7


Using the Core Documents to Write Constructed Response Questions for Exams and Labs with Examples from the Chemistry Core

Tom Shiland
Saratoga Springs Senior H.S.

Has the old chemistry syllabus been "dumbed down" or is it now possible to test and teach for real understanding?  Productive ways to think about the chemistry core and the new assessments.


    A number of teachers feel the new chemistry core document represents a "dumbing down" of the old chemistry syllabus.  In particular, the reduction of the mathematics in chemistry section and the reduction in the quantum mechanics section have been criticized.  Does a student who recalls that the d sublevel has 5 orbitals demonstrate a real understanding of chemistry?  Higher level questions can be constructed which probe students' knowledge of much simpler models.
    The core will indeed have been "dumbed down" if there is not an increase in the depth of the assessments, assessments which now involve fewer concepts.  As you are probably aware, a significant part of the new exams scheduled to go on line in June of 2001 with the earth science and biology exams consists of questions which require students to write out a response, i.e. "constructed response" questions.  Few examples of these questions are actually available and there is little information on how to write them.
    The purpose of this workbook is: a) to provide general guidelines for writing constructed response questions and b) to illustrate these guidelines with examples from the chemistry core.  Each of these areas will be addressed, showing that assessments can be constructed which test a deeper understanding of the fewer concepts which are now in the core area.  The workbook is divided into three sections: general guidelines, content examples and process examples.

General guidelines:

1) Start from the latest version of the core documents in writing questions.  Read the core documents carefully and make sure your test items stay within the content or skill of the core.
2) Break your thinking about writing the question into parts:
a. what part of the core document does it address
b. what do you want students to do (explain, describe, predict, graph)
c. what is the setting for the question (the lab, the classroom, home)
d. what do you want them to use in their response (concepts, principles, theories)
e. what should their product look like (labeled drawing, paragraph explanation)
3) Describe exactly in the question what you want in your answer.  If you want them to use a particular concept or theory in their answer, say so.  If you want a labeled diagram and complete sentences say so.  "Guess what is on my mind" is not a higher level question.  On the other hand, limit extraneous material in the stem.  If "taking a walk on the beach" has nothing to do with the problem at hand, then drop it.  The use of pictures and data with these questions is fine, but make sure it is necessary to analyze them to arrive at an answer.
4) "Decookbooking" your labs opens up a whole range of testable items, such as constructing data tables and graphs and writing simple procedures, which fall under Standard 1 of the MST.
5) "Real-world" scenarios suggested in Part C appear to be the most difficult to write because they assume all students have had the same experience which allows them to interpret the scenario.  It may be better to begin with writing questions where you know that all students have had the same experience, as in your class observing a demonstration or performing a lab.
6) Make sure the information you give is scientifically accurate and reasonable.  For example, in chemistry - does the reaction actually occur, are boiling points and melting points accurate, does the compound exist?  Reference any sources used for data, e.g. the Chemical Rubber Company handbook or textbooks.
7) Make sure your question could not be just as well addressed as a multiple choice question.
8) "Cue" the answer and make it easier to score by creating widely spaced lines for writing, and boxes for diagrams if they are required on your answer sheet.  These suggest appropriate sizes for writing and diagrams.  Create constant expectations for the answers, e.g. always use complete sentences, always label diagrams.
9) Construct your scoring guide ahead of time and keep it simple.  Make the point totals low on these questions initially until you are confident that the item is sound.  Break more elaborate questions into parts, and keep each part low in points, perhaps 2 points for a correct answer, 1 point for partial credit.  Think ahead of time what will count as partial credit.
10) Give students practice on constructed response questions before placing them on an exam.  These questions are often related to lab situations, so placing them in a pre- or post-lab section of a lab is an ideal way to give students some writing practice and gently raise your expectations.  Whatever practice you give them, you must go over the scoring guide you would use for the questions.
11) Use your department meetings as a forum to discuss these questions.  There is nothing that can beat examining a question from the multiple perspectives of a group of people.  Teachers from other disciplines bring a student's point of view to the discussion.  Invite each member of your department to bring a proposed question with its scoring guide on a transparency to the next meeting.  Suggest a goal in your department that no test this year be entirely multiple choice.
12) Analyze your student answers after the exam or lab.  Was the stem of the question stated precisely enough so that students used the scientific knowledge and skills that you expected?  Did misconceptions surface that you were unaware of, e.g. particles of a liquid are always farther apart than those of a solid?

Content (Standard 4 of the MST) examples
1) A particular recipe calls for 2.5 cups of sugar and 4 cups of flour to make 24 cookies.  How much of each ingredient would be required to make 30 cookies?  Show all calculations and explain your reasoning.


Scoring guide: Calculation is set up showing units (1) and gives the correct answer.

Explanation describes the proportional reasoning involved (2).

(Core Reference: 3.3 c A balanced chemical equation represents conservation of atoms and mole ratios of reactants and products. (Note: for the purpose of the examination, calculations will be limited to mole-mole problems.))
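
The proportional reasoning in the cookie question is the same reasoning used in mole-mole problems: every quantity is multiplied by the ratio of the target amount to the original amount. A minimal sketch of the calculation follows; the function name scale_recipe and the dictionary layout are illustrative assumptions, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def scale_recipe(amounts, original_yield, target_yield):
    """Scale every ingredient by target_yield / original_yield."""
    factor = Fraction(target_yield, original_yield)
    return {name: qty * factor for name, qty in amounts.items()}


# Quantities from the question: 2.5 cups sugar and 4 cups flour make 24 cookies.
recipe = {"sugar (cups)": Fraction(5, 2), "flour (cups)": Fraction(4)}
scaled = scale_recipe(recipe, 24, 30)

if __name__ == "__main__":
    for name, qty in scaled.items():
        print(f"{name}: {float(qty)}")
```

The scale factor here is 30/24 = 5/4, so each ingredient increases by one quarter.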

2) Solid paraffin sinks in liquid paraffin while solid water (ice) floats on liquid water.  Propose an explanation and use labeled diagrams of particles. (4)




solid paraffin in liquid paraffin                             ice in water


 Scoring guide:
Diagram (2)
2-Shows particles of solid paraffin closer together than liquid paraffin.  Shows ice particles farther apart than water particles.  Shows solids having a regular pattern and liquids having a random arrangement.
1-Shows particles of one substance represented correctly
0-Shows no substance represented correctly

Explanation: Describes the correct diagrams accurately (2).

(Core reference: 3.1hh The three phases of matter, i.e. solids, liquids, and gases, have different properties.)

3) Explain why placing a drop of boiling water on your hand does not burn your hand, while placing your hand in boiling water would certainly result in a severe burn.  Use the concepts of heat, temperature and calories in your explanation. (4)


(Core References: 4.2a Heat is a form of energy, which is the total amount of kinetic energy of the particles in a sample of water; 4.2b Temperature is a measurement of the average kinetic energy of the particles in a sample of material.  Temperature is not a form of energy.)

4) Explain how hydrogen and oxygen can make a completely new substance with different properties, without gaining or losing any atoms, but water boiling does not.  Use labeled particle diagrams and a written explanation.




Hydrogen and oxygen                                              water boiling


Scoring guide:
Particle diagrams of hydrogen and oxygen drawn correctly, showing a new particle of water being formed (2).
Water boiling shown with no changes in particles but only spacing (2).
Explanation correctly describing diagrams (2)

(Core Reference: 3.2a A physical change results in the rearrangement of existing particles in a substance.  A chemical change results in the formation of different particles with changed properties.)

5) Describe the contents of each box in terms of elements, compounds and mixtures.

Description of A:

Description of B:

Description of C:

Scoring guide:

3- Box A consists of a diatomic element.  Box B consists of a monatomic element.  Box C consists of a mixture of a compound and a monatomic element.

0-2 Each correct box description is worth one point.
(Core references: elements; 3.1aa compounds; 3.111 mixtures)
6) Crushing a sugar cube will make it dissolve faster in water, as will heating the water that it dissolves in.  Explain how each of these processes works on a particle level.

Explanation for crushing:

Explanation for heating:

Scoring guide:
2- Crushing a sugar cube increases the surface area of the sugar crystal, allowing more collisions between the water molecules and the sugar molecules.  Heating the water makes the water molecules move faster, increasing their collisions with the sugar crystal.

(Core reference: 3.49- The rate of a chemical reaction depends on several factors: temperature, concentration, nature of reactants, surface area, and the presence of a catalyst.)

Process Skills (MST Standard 1) questions

1) A fellow student tells you they have determined the density of zinc using the same equipment in our lab as 7.304 g/mL.  You tell them this is impossible.  Show a sample calculation and explain why this result would not be likely. (4)
Sample calculation:

(Core reference: Skill under Standard 1, Mathematical analysis- Analyze data utilizing the concepts of measurement precision and uncertainty as related to significant figures used in calculations).

2) A student collects the following data using equipment in our lab.
mass of solid 36.2 grams

volume of graduated cylinder before 5.60 mL
volume of graduated cylinder after adding solid 8.9 mL
a) Analyze the data collected (1)

b) Given the data, calculate the density. (2)

c) Given an accepted value of 7.1 g/mL, find the percent error. (1)
(Core reference as above)
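
The arithmetic behind parts b) and c) of question 2 can be sketched as follows. This is an illustration of the calculation only, not the official scoring key; the function names are assumptions. Because the final volume (8.9 mL) is read only to the tenths place, the volume difference carries two significant figures, so the density would normally be reported to two significant figures.

```python
def density(mass_g, volume_before_ml, volume_after_ml):
    """Density in g/mL from mass and volume by water displacement."""
    return mass_g / (volume_after_ml - volume_before_ml)


def percent_error(measured, accepted):
    """Absolute percent error relative to the accepted value."""
    return abs(measured - accepted) / accepted * 100


# Data from question 2: 36.2 g of solid, cylinder reads 5.60 mL, then 8.9 mL.
d = density(36.2, 5.60, 8.9)
error = percent_error(d, 7.1)

if __name__ == "__main__":
    # Unrounded values are printed; rounding to significant figures is left
    # to the reader, since that judgment is part of what the question assesses.
    print(f"density = {d:.4f} g/mL")
    print(f"percent error = {error:.2f} %")
```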

3) Describe a step by step procedure to separate iron filings and sugar into separate pure substances identifying all equipment with its proper name. (not necessarily 7 steps)
Step 1
Step 2
Step 3
Step 4
Step 5
Step 6
Step 7

Scoring guide:

2- All necessary steps listed in proper order with equipment.
1- Steps missing or order wrong.

 (Core reference: MST Standard 1, Key Idea 2)
4) For an experiment to determine whether distilling water (evaporating and condensing it) has an effect on its density:
a) Give the independent and dependent variables.(2)

b) Design the data table.(5)

Scoring Guide:
a) independent variable is the type of water, either distilled or not distilled
dependent variable- density of the water

b) 4 actual equipment readings (4) + two columns (sets of readings) (1)
distilled water              not distilled water
mass of empty flask (g)
buret reading start (mL)
buret reading finish (mL)
mass of flask + water (g)

(Core reference: MST Standard 1, Key Idea 3)

5) Class data for the reaction rate lab was as follows:
Temp. (degrees C)    Time (sec)
     10         50.3
     15         51.5
     20         36.6
     25         42.8
     30         36.0
     35         39.0

Graph the data, draw a best fit curve and estimate the time at 40 °C.

Scoring guide:

5- graph has independent and dependent variables on the proper axes, labels on the axes and a title, best fit curve drawn correctly, line extended to 40 °C correctly.
0-4 Each item above is worth one point.

(Core reference: MST Standard 1, Key idea 1)
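
A teacher preparing the scoring guide for question 5 may want a numerical check on the range of acceptable estimates. The sketch below fits a straight line to the class data by least squares and extrapolates it, reading the target temperature as 40 °C. Two assumptions should be noted: a straight line is only one possible "best fit curve" for such scattered data, and the function name linear_fit is illustrative. A hand-drawn curve will give a somewhat different value.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Class data from question 5.
temps_c = [10, 15, 20, 25, 30, 35]
times_s = [50.3, 51.5, 36.6, 42.8, 36.0, 39.0]

slope, intercept = linear_fit(temps_c, times_s)
estimate_40 = slope * 40 + intercept  # extrapolated time at 40 degrees C

if __name__ == "__main__":
    print(f"slope = {slope:.3f} s per degree C, intercept = {intercept:.1f} s")
    print(f"estimated time at 40 degrees C = {estimate_40:.1f} s")
```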

6) Describe an experiment to determine the effect of temperature on the rate at which an Alka-Seltzer tablet dissolves.  Specify a hypothesis, along with the basis for the hypothesis; give the independent and dependent variables, procedure and data table with units.

5- Each of the above given.

0-4 Each item worth one point

(Core reference: MST Standard 1, Key Idea 2)

 The University of the State of New York
Office of State Assessment




1. Use constructed response items to measure objectives that cannot be measured as well with multiple choice items.  Good constructed response items ask students to demonstrate an understanding or an appreciation of a skill.

2. Indicate clearly the type and length, or depth, of answer required.  Write precise, accurate, readable, and complete student directions. (p a r c)

3. Make questions or subtasks within a task independent of each other.  Items need to be scorable independently of each other. (e.g. If students cannot construct a graph, they cannot describe a relationship supported by the graph.)

4. Develop the rubric or scoring guide at the same time the item is written.  Provide model answers and a range of acceptable answers. (This step will enhance steps 2 and 3.)

5. Verify the scientific accuracy of all stimulus materials (weather maps, cross-sections, data tables, diagrams etc.) and provide references.

6. Remember the four "R's" of item writing: review, reflect, revise, revise again!

SWM 12/99

Guidelines for Developing Extended-Response Items


The extended-response (essay) item is used to measure higher-level learning outcomes.  It requires students to apply thinking and problem-solving skills, to demonstrate understanding of scientific concepts, and to demonstrate the ability to produce, organize, and express ideas, and to integrate learning from different areas.

The extended-response item is useful for asking students to perform such tasks as:
• comparing or contrasting two or more things
• describing similarities and differences
• describing relationships
• describing applications of principles
• identifying and explaining cause-effect relationships
• giving examples of principles, concepts, or events
• analyzing a series of events
• classifying, sorting, or categorizing
• explaining or interpreting a passage
• presenting relevant arguments
• stating necessary assumptions
• applying principles in novel situations, extrapolating beyond known information
• deciding or recommending for or against something
• formulating tenable hypotheses
• formulating valid conclusions

Steps in Developing Extended-Response Items

1. Identify the content you want to test.  Your item should test important knowledge and skills and be based on content and behaviors contained in the core curriculum guide.  Review your assignment and carefully read through those portions of the guide that pertain to your assignment.  You may need to locate and review reference materials that relate to your assignment before you begin to write the item.

2. Identify the higher-order process you want the student to demonstrate. (See Bloom's Taxonomy/Cognitive Activity).  The behaviors tested should be those that would be expected of a student in intermediate-level science.

3. Determine that the extended-response format is the best type of item to use.

4. Write a general statement of your idea for the item, incorporating the content and the cognitive process.

Terminology Used in Writing Items to Test Complex Learning Outcomes

Good answers to constructed-response and extended-response items depend in part upon the directive words that indicate the way in which students should respond.  The chart below lists some of the words that can be used to encourage students to demonstrate higher-level cognitive skills.

ANALYZE         Break down a complex whole into its component parts so as to discover its true
                          nature or inner relationships
                          For example: Analyze the various strategies used to study

COMPARE        Bring out points of similarity and points of difference.
                          For example: Compare the function of carbohydrates and fats.

CONTRAST       Bring out the points of difference.
                          For example: Contrast the different functions of the large and small intestines.

CRITIQUE         Review the merits of an item or issue; criticism may approve or disapprove.
                          For example: Critique the methodology of the author's proposal.

DEFINE             Give the meaning of a word or concept; place it in the class to which it
                          belongs and set it off from other items in the same class.
                          For example: Define the "whole language" approach to teaching literacy in
                          elementary schools.

DESCRIBE        Give an account of; tell about; give a word picture of.
                          For example: Describe how you and your partner collaborated on the project.

DISCUSS           Consider from various points of view; present the different sides of. (Item should
                          provide a focus of discussion for the student.)
                          For example: Discuss the advantages and disadvantages of .

EVALUATE       Give the good points and the bad ones; appraise; give an opinion regarding the value of,
                          compare the advantages and limitations of.
                          For example: Evaluate the usefulness of computers in the classroom.

EXPLAIN          Make clear; interpret; make plain; tell the meaning of, tell how to do something.
                         For example: Explain how hail is formed.

ILLUSTRATE   Use a word picture, diagram, chart, or concrete example to clarify or explain a point.
                         For example: Draw a diagram to illustrate the rain cycle.

INTERPRET     Give the meaning of; give your thoughts about; translate.
                         For example: Interpret the findings on the graph that follows.

JUSTIFY          Show good reasons for; give evidence or facts to support your position.
                        For example: Justify your answer by citing relevant examples that have occurred in the
                        past year.

SUMMARIZE  Sum up; give the main points briefly.
                        For example: Summarize three ways to set up this experiment.

TRACE            Follow the course of; describe the progress of.
                        For example: Trace the development of a human embryo from conception to birth.
