
An Assessment of Reliability of Dialogue-Annotation Instructions
Disclosure Number: IPCOM000128647D
Original Publication Date: 1977-Dec-31
Included in the Prior Art Database: 2005-Sep-16

Publishing Venue

Software Patent Institute

Related People

William C. Mann: AUTHOR


This report is part of an ongoing research effort on man-machine communication that seeks to transform knowledge of how human communication works into improvements in the man-machine communication of existing and planned computer systems. This research has developed methods for finding certain kinds of recurring features in transcripts of human communication. These methods involve having a trained person, called an Observer, annotate the transcript in a prescribed way. One issue in evaluating this methodology is the potential reliability of the Observer's work. This report describes a test of Observer reliability. It was necessary to design a special kind of test, including some novel scoring methods. The test was performed using the developers of the instructions as Observers. The test showed that very high Observer reliability could be achieved. This indicates that the observation methods are capable of deriving information that reflects widely shared perceptions about communication, and that is therefore the right kind of data for developing human communication theory. It confirms the appropriateness and potential effectiveness of using this kind of observation in the dialogue-modeling methodology of which it is a part. It is also of particular interest as an approach to the study of human communication based on text, since content-related text-annotation methods have a reputation for low reliability.

This is the abbreviated version, containing approximately 4% of the total text.



An Assessment of Reliability of Dialogue-Annotation Instructions

William C. Mann, James H. Carlisle, James A. Moore, James A. Levin

ARPA ORDER NO. 2930 NR 154-374

ISI/RR-77-54 January 1977


UNIVERSITY OF SOUTHERN CALIFORNIA
4676 Admiralty Way / Marina del Rey / California 90291
(213) 822-1511

Preparation of this paper was supported by the Office of Naval Research, Personnel and Training Research Programs, Code 458, under Contract N00014-75-C-0710, NR 154-374, under terms of ARPA Order Number 2930. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Office of Naval Research, the Defense Advanced Research Projects Agency, or the U.S. Government. This document is approved for public release and sale; distribution is unlimited.


List of Tables and Figures
Abstract
Acknowledgments
I. Overview and Research Context
II. Reliability in Systematic Observational Techniques
III. The Dialogue Annotation Instructions
IV. The Methodology for Reliability Assessment
    A. Design Issues in Reliability of Content Analysis
    B. The Agreement Assessment Algorithm
        1. Event Collapsing
        2. Agreement on Event Identification (Level One)
        3. Agreement on Event Dependent Annotation (Level Two)
        4. Combining Reliability Scores
        5. Sources of Possible Bias
        6. Mathematical Properties of the Reliability Computation Method
        7. Rejection of Other Algorithms for Reliability Computation
    C. Summary of the Methodology for Reliability Assessment
V. Reliability Coding Rules by DAI Category
    A. Event Collapse Rules for Segments
    B. Requests
    C. Repeated References
    D. Topic Structure
    E. Expressions of Comprehension
    F. Similar Expressions Generated Out of Context
VI. A Study of Four Dialogues - Application of the Methodology
    A. Subjects
    B. Dialogue Selection
    C. Similar Expression Generation
    D. Annotation of Dialogues
VII. Test Results and Interpretation
    A. Overview of the Results
    B. Requests Test Results
    C. Repeated Reference Test Results
    D. Expression of Comprehension
    E. Topic
    F. Similar Expressions Test Results
VIII. Conclusions
    A. Summary
    B. Interpretation of the Results
Appendices:
    A. Dialogues Used in this Test
    B. Sample Similar Expressions
    C. Observer Checklist
    D. Procedural Expression of the Reliability Computation Algorithm
References



University of Southern California Page 1 Dec 31, 1977



TABLES

1. Apparent Reliabilities for Various Numbers of Observers
2. Comparison Points for Reliability Score Interpretation
3. Estimated Reliability Under Random Observation
4. Reliability Computations

FIGURES

1. An Example of Pairwise Comparison
2. An Example of Pairwi...