History

Although Aristotle was Plato’s most distinguished student, he did not lead Plato’s Academy after Plato, but rather founded his own school, the Lyceum. This was probably because of the great dispute between the teacher and his student. Both Plato and Aristotle agreed that knowledge, episteme, was perfect, unchanging and categorical. Both also agreed that the world of experience -- the world in which we all live -- is in constant flux. The great pre-Socratic philosopher Heraclitus said that you can’t step in the same river twice, meaning that every aspect of the river changes between the first step and the second. He believed that the only reality was change.

Heraclitus’ great rival, Parmenides, believed the opposite. The Greeks embraced two philosophical principles very strongly: the first is the principle of non-contradiction, which says that an entity can’t both be and not be the same thing at the same time. A tomato might be green, or it might be red, but it can’t be red and green at the same time. The other is the principle of causality, which says that nothing can come from nothing, nor can anything pass into nothing.

Based on these principles, Parmenides argued that change was impossible, and that the flux apparent in our experiences must be an illusion. If a green tomato ripened, for example, Parmenides asked where the red tomato came from and where the green tomato went. Green can’t be the cause of red, because the effect has to be like its cause. The green tomato can’t be annihilated, because nothing can pass into nothing, and the red tomato can’t come from nothing, so this change can’t actually happen -- it must be an illusion.

Plato tended to agree with Parmenides. (In Plato’s dialogues, Socrates wins in every case but one -- his dialogue with Parmenides. But Plato opines that Socrates was young at the time.) He, too, regarded the world of experience as illusion, a source of error to be avoided. The true source of knowledge was not experience but the World of Ideas, and the proper road to true knowledge was to lead it out (educere) of one’s own mind, where it lies, forgotten, from the soul’s original sojourn in that world.

Aristotle disagreed with Plato about this. He suggested that the world of experience was made up of two principles: primary matter and substantial form. Primary matter is the basis of an entity’s existence, but it is its substantial form that makes it what it is. Change is simply the process by which restless primary matter casts off and takes on forms. It is the forms which we come to know by abstraction from our experience. Thus Aristotle disagrees with Plato on the most important point: he believes knowledge comes from experience, and advocates examining the world of experience to gain it.

But Aristotle has not yet solved Parmenides’ dilemma. He needs to account for the succession of forms. Where do the forms come from, and where do they go? According to the view that Aristotle, Plato, Parmenides and the rest accepted, the forms must be caused by something, since nothing can come to be from nothing, and the cause must contain the form in it in some way, since a cause must be like its effect.

Aristotle solves this problem by asserting that the form or essence of any entity contains within it from the first moment of its existence all the forms through which it will pass during its existence. They are there potentially (in potency), and come to be actually there (in act) when they are realized. An acorn contains leaves, bark and the rest of the attributes of an oak tree in potency, but the oak tree has them actually (in act). The form (essence) of a tomato contains both the forms “green” and “red”, first red in potency and then red in act.

This solves the problems of causality and non-contradiction for any given entity, but does not solve the problem entirely. Aristotle also needs to solve the problem of the succession of things themselves. If the child was caused by its parents, and its parents by their parents, what does the entire sequence depend on? Aristotle solves this problem by positing an “uncaused cause,” an entity existing for all time which contains all the forms that will ever be. This preexisting set of forms is what Aristotle calls “final causes,” the preexisting ideas of what a thing is to become.

Aristotle’s concept of motion follows this model exactly. When an object is at rest, it is in a place, but it is potentially in another place; it is in one place “in act” and another place “in potency.” Once it has moved it is in the new place “in act.” When it is moving, however, it is not in any place. For Aristotle, motion is “the act of a being in potency insofar as it is in potency.” The object contained its final resting place “in potency” as part of its substantial form from the first moment of its existence, and intended to go there. Heavy objects “belong” at the center of the earth and try to go there; light objects, such as fire, want to be at the periphery of the world, and try to go there. The impetus or cause of their motion lies within them. An Aristotelian explanation of motion consists of understanding the state of the object before and after the motion, but does not focus on the moving itself.

Is it possible that the inadequacy of the Aristotelian concept of motion made it impossible for analysts to see motion precisely? Despite Galileo’s extraordinary ingenuity, precise description of complicated motions had to await the development of Newton and Leibniz’s new language, calculus, which made it possible to describe motions as precisely as desired, if not perfectly. In advocating careful and extensive study of the world of experience, Aristotle strongly supports the development of science, but the categorical, absolute character of his view of knowledge is fundamentally different from knowledge as science understands it. Science rejected Aristotle’s idea of the “final cause” in describing motion, and explosive progress followed quickly.

The Social Sciences

Not all the sciences rejected Aristotelian thinking. Just as Aristotle believed a rock needed to know where it wanted to go before it went, he believed that humans needed to intend to do whatever they did. If I go to the store, it’s because I intended to go to the store before I left; if I eat a banana, the intention to eat the banana must preexist my act. For Aristotle, behavior or action is “the act of a being in potency insofar as it is in potency,” and is not analyzed further. The focus of inquiry is not the action itself, but the state of the person before and after the action. Action or behavior is a series of indistinct blurs between states of being. While it is amazing that Aristotle’s theory of motion was held by very intelligent and thoughtful scholars for 2000 years in the face of overwhelming evidence against it, it is perhaps even more amazing that Aristotle’s theory of human behavior is still held by most social scientists 2400 years later.

Rejecting Aristotle is not easy. In the first half of the 20th century, Alfred Korzybski developed a theory strongly critical of Aristotle’s methods which became very popular and led to the foundation of the discipline of general semantics. Korzybski’s work focuses on the discrepancy between the flux of our experiences and the ideal, static, categorical structure of language, and warns that failure to realize what a crude approximation language provides for experience blinds one to seeing and understanding. Korzybski’s work has led to diverse followings, including academic scientists like S. I. Hayakawa, cult-like movements such as noology, and literary works, particularly those by A. E. van Vogt.

As with most debate cases, the need is strong but the plan is weak. Korzybski and his followers do a convincing job of showing that the continuous, flowing world of experience is inadequately represented by the categorical Aristotelian language we use to describe it, but they provide no systematic procedure for overcoming the problem. Even though Gilbert Gosseyn, the hero of A. E. van Vogt’s provocative novels, calls on the power of non-aristotelian philosophy and methods, his success is attributable much more to his extra brain, which gives him the ability to teleport across galactic distances, and to a stash of extra bodies which come to life in succession as each old one is killed.

The Categorical Character of Social Science

While the physical sciences were making giant strides after abandoning the categorical model of Aristotle for the comparative model of Galileo, the social scientists remained steadfastly categorical in their thinking. In psychology, the basic model of human behavior remained categorical and intentional, with prior states of mind, such as attitudes, wishes, motives, needs, or other psychological states providing an impetus which led to a behavior, which then resulted in an end state. Many psychologists -- perhaps a great majority early in the 20th century -- believed these motives were built in genetically from the first moment of a person’s existence.

In sociology and anthropology, the most common general theory held that society had a structure consisting of statuses, which were distinct “locations” in the society (“status” is Latin for “place”). All the statuses were arranged in a hierarchy. Karl Marx recognized three distinct levels of status: the rulers, the working class, and a middle class that was doomed to be driven down into the working class. Max Weber opined that there were three parallel hierarchies, with stratification based on wealth, status and power.

Anthropologists differed among themselves as to how many classes there were. Some recognized three: lower, middle and upper; others added a “working class” to make a four-tier system. Lloyd Warner identified six. In any event, each of the classes was thought of as a discrete, categorical thing. Mobility meant the ability of an individual to move from the class in which s/he was born to another. In an open society, the boundaries between classes were thought to be porous, and mobility occurred frequently; in a closed society, boundaries were rigid and mobility rare.

Some sociologists, however, conceived of status as a continuous, comparative, quantitative variable. Archie O. Haller, a University of Wisconsin sociologist with an engineering background, suggested a modern comparative model of stratification. Haller did not conceive of social mobility as the discrete change of an individual from one status to another, but rather as a lifetime trajectory -- a point moving in a continuous stratification space. Along with William Sewell and Alejandro Portes, he published a model of the status attainment process which suggested that the trajectory of individuals through the status hierarchy was determined by their continuously changing aspirations, which were themselves influenced by the expectations of their significant others. Their findings indicated support for the model, but their results were attenuated by the poor-quality categorical measurements in the secondary data available to them. Haller designed a new study to develop superior instrumentation, which resulted in the Wisconsin Significant Other Battery (WISOB), a set of questionnaires which identified the most significant others of adolescent children and measured the aspirations of the children and the expectations of their significant others.

The WISOB was itself a categorical device, in which adolescent children were asked to name the people who communicated most with them in each of four categories. The measurement of educational and occupational aspirations and expectations, however, had some comparative characteristics. Levels of educational aspiration and expectation were measured by asking students how far they planned to go in school and by asking significant others how far they expected the adolescent to go. Although the answers were recorded categorically (e.g., some high school, finish high school, etc.), they corresponded roughly to years of schooling, which is comparative.

Levels of occupational aspiration and expectation were calculated by asking students what specific jobs they expected to be able to get, and by asking significant others what specific jobs they expected the child to be able to get. The occupational prestige of each of these jobs was recorded from the NORC Occupational Prestige scale, a quasi-comparative scale with approximately a 90-point range. The scores of all the jobs named for a single adolescent were averaged to provide an estimate of the level of occupational aspiration or expectation.

An important part of the model to be tested hypothesized that the aspirations of the adolescent respondents would be strongly influenced by the expectations of their significant others, following Mead, Sullivan and others. Since the significant others were identified by the adolescents rather than preselected by the investigators, respondents differed in the number of significant others they reported. This produced a difficult analytic situation, since no traditional multivariate analysis method allows a different number of variables per case. After much study and consultation, the investigators (at this point Haller, Joseph Woelfel and Edward L. Fink) decided to calculate the average expectation of all significant others for each respondent, and to use this average as an indicator of the expectations. This turned out to be a very good predictor of the respondents’ aspirations -- by far the best in the literature.

No one at the Significant Other Project ("other" than what, you might ask?) had any theoretical justification for choosing the mean; it was chosen solely as a heuristic to overcome the problem of different numbers of variables per case. After the fact, however, the results seemed very reasonable. If each individual significant other’s expectation could be thought of as a force acting on the individual’s aspiration, then the mean of all those forces would represent a balance point where the net force was zero.
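
In computational terms, the aggregation is nothing more than a per-respondent mean taken over lists of unequal length, and the “balance point” interpretation is easy to check: the deviations of the individual expectations from their mean sum to zero. A minimal sketch in Python, using invented data rather than anything from the Wisconsin study:

    # Hypothetical data: each respondent names a different number of
    # significant others, so a per-respondent mean sidesteps the problem
    # of a different number of variables per case.
    expectations = {
        "respondent_1": [12, 16, 14],          # e.g., expected years of schooling
        "respondent_2": [16, 16, 12, 18],
        "respondent_3": [10, 12],
    }

    for person, values in expectations.items():
        mean = sum(values) / len(values)
        # "Balance point": deviations from the mean sum to (numerically) zero,
        # i.e., the net "force" of the significant others' expectations vanishes.
        net_force = sum(v - mean for v in values)
        print(person, round(mean, 2), round(net_force, 10))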

Now at the University of Illinois, Woelfel, along with John Saltiel, Donald Hernandez, and Curtis Mettlin (with some help from Ken Southwood), worked out the algebra of the force model for the one-dimensional case, and Hunter, Danes, & Woelfel provided experimental evidence that this model fit observations better than alternative plausible models. This work, generally referred to as a theory of linear force aggregation, resulted in a series of publications showing that attitudes of respondents tended to lie near the weighted average of the expectations of significant others, controlling for important social structural factors.
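
The algebra of the one-dimensional model amounts to a weighted average of the positions advocated by the messages a person receives. The sketch below is a generic weighted-mean predictor with invented values and weights; it illustrates the form of the aggregation rather than the exact published formulation:

    import numpy as np

    def predicted_attitude(messages, weights=None):
        """Weighted average of the attitude positions advocated by messages.
        messages: positions advocated by significant others (1-D sequence).
        weights:  optional salience weights (assumed here for illustration)."""
        messages = np.asarray(messages, dtype=float)
        weights = np.ones_like(messages) if weights is None else np.asarray(weights, dtype=float)
        return float(np.sum(weights * messages) / np.sum(weights))

    # Three significant others advocate positions 2, 4 and 9 on some
    # comparative scale; the last message carries double weight.
    print(predicted_attitude([2.0, 4.0, 9.0], weights=[1, 1, 2]))   # 6.0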

The space of occupations

Despite the strong support for the averaging model, a major problem remained: it only applied to attitudes that could be measured on a comparative scale. It was possible to take the average of the occupational prestige of several occupations, or the average number of years of education, or the average number of radical activities, or the average number of marijuana cigarettes smoked per day, but the averaging model could not be used for discrete choices: if your mother expects you to be a doctor and your father expects you to be an airline pilot, just what is the average of Doctor and Airline Pilot? This problem could be solved if each discrete object, such as an occupation, could be represented as a point in space, close to other objects that are like it, and far from objects which are different.
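
Once each occupation is located as a point in a common space, the “average of Doctor and Airline Pilot” becomes perfectly well defined: it is simply the centroid of the two points. The coordinates below are invented purely for illustration:

    import numpy as np

    # Hypothetical coordinates of two occupations in a shared (here 3-D) space.
    occupations = {
        "doctor":        np.array([10.0, 4.0, -2.0]),
        "airline pilot": np.array([ 8.0, 9.0,  1.0]),
    }

    # The "average" of two discrete choices is the centroid of their points.
    centroid = np.mean(list(occupations.values()), axis=0)
    print(centroid)   # a point lying midway between the two occupations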

Several spatial representations of social and psychological data were known. L. L. Thurstone conceived of psychological content in spatial terms. He conceived of attitudes as “positions” in a mathematical space, and his scaling procedure involved starting with large pools of such positions and sorting them into piles until a final, reduced set of positions lying at approximately equal intervals remained to form the scale. In his study of human intelligence, he believed that the measured values on intelligence tests were a function of a smaller set of “factors” that represented aspects of mental ability. These “factors” he thought of as a bundle of intercorrelated vectors in a vector space of relatively low dimensionality, and he developed procedures for identifying them by extracting the eigenvectors of the matrix of intercorrelations among test items. It’s important to understand that the central goal of factor analysis was to find a vector space of considerably lower dimensionality than the order of the data: any procedure that did not reduce the dimensionality of the data would be a failure for Thurstone’s purposes.
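
A minimal sketch of that extraction -- eigenvectors of a small, assumed correlation matrix, rescaled to give loadings -- might look like this in Python:

    import numpy as np

    # A small, assumed correlation matrix among four test items.
    R = np.array([
        [1.0, 0.6, 0.5, 0.1],
        [0.6, 1.0, 0.4, 0.2],
        [0.5, 0.4, 1.0, 0.1],
        [0.1, 0.2, 0.1, 1.0],
    ])

    eigvals, eigvecs = np.linalg.eigh(R)            # returned in ascending order
    order = np.argsort(eigvals)[::-1]               # re-sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    k = 2                                           # keep the k largest "factors"
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    print(np.round(loadings, 3))                    # items x factors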

Thurstone’s factor space was a vector space within which various “factors” underlying mental ability were arrayed as generally correlated vectors, but it had some difficulties: most important was the standardization of the data in the form of correlation coefficients, which made the factor space into a unit hypersphere. Moreover, at the time Thurstone developed factor analysis, the computer had not yet been invented, and factor analyses had to be done by hand, a laborious procedure involving dozens of graduate students laboring for weeks. Because of this labor-intensive procedure, Thurstone developed rules of thumb for determining when “enough” factors had been extracted, which led to the common practice of presenting a lower-dimensional solution that did not completely represent the data.

Subsequent practice, in which workers were interested only in the items with the highest numerical coordinates (“factor loadings”), led to the common practice of deleting any coordinates whose absolute value fell below .4. The result was a “factor space” that did not actually represent the raw data well; in fact, the original correlation matrix could not be regenerated from the matrix of factor loadings, nor could the original scores be reproduced from the correlation matrix. Thus began the curious practice, common to 20th century psychometrics, of compromising the data to fit preconceived notions of what the resulting space “ought” to look like. This practice can be attributed to the Platonic notion that true or correct ideas must have a specific, perfect form, while the world of experience could only be a source of distorted and erroneous perceptions.
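
The information loss is easy to demonstrate: a complete set of loadings reproduces the correlation matrix exactly, but loadings cut to a few factors and thresholded at an absolute value of .4 do not. A sketch, reusing the same assumed correlation matrix as above:

    import numpy as np

    R = np.array([                      # assumed correlation matrix, as above
        [1.0, 0.6, 0.5, 0.1],
        [0.6, 1.0, 0.4, 0.2],
        [0.5, 0.4, 1.0, 0.1],
        [0.1, 0.2, 0.1, 1.0],
    ])

    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    full = eigvecs * np.sqrt(eigvals)               # all factors retained
    trunc = full[:, :2].copy()                      # keep only two factors...
    trunc[np.abs(trunc) < 0.4] = 0.0                # ...and drop "small" loadings

    print(np.round(full @ full.T - R, 6))           # ~0: full loadings reproduce R
    print(np.round(trunc @ trunc.T - R, 3))         # sizable residuals remain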

Not only was the dimensionality of the space expected to be small, but the dimensions were also supposed to represent some latent factor or trait. Osgood’s semantic differential space, which was popular for a brief period in the early second half of the 20th century, was also a unit sphere restricted to three named dimensions, always expected to be the three orthogonal attributes good-bad, active-passive, and strong-weak. Its bipolar measurement system posed problems, however: research using semantic differential methods showed that many attributes could not be made to fit into the three-dimensional unit sphere that was the semantic differential space, and these were set aside on a list of “degenerate” attributes which were proscribed from use.

In 1938, Young and Householder identified an exact solution to the problem of defining a spatial coordinate system from a matrix of interpoint distances. This solution, slightly modified by Warren Torgerson, was presented to psychology in his 1958 textbook, Theory and Methods of Scaling, under the name “multidimensional scaling,” but quickly ran into problems. When given high-quality paired comparison data from actual empirical measurements, the results of the Young-Householder-Torgerson method were usually both high dimensional and non-euclidean. The high dimensionality was indicated by a large number of eigenvectors of substantial length, and the non-euclidean character was revealed by the fact that several of these eigenvectors were imaginary, with corresponding negative eigenvalues.
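
The exact solution is compact: double-center the matrix of squared distances and take its eigenvectors; negative eigenvalues signal that no euclidean configuration can reproduce the distances. A sketch with a small assumed distance matrix that, like real judgment data, violates the triangle inequality:

    import numpy as np

    # Assumed pairwise "distances" among four concepts (note 9 > 3 + 2,
    # so no euclidean configuration can reproduce them exactly).
    D = np.array([
        [0.0, 3.0, 4.0, 9.0],
        [3.0, 0.0, 5.0, 2.0],
        [4.0, 5.0, 0.0, 6.0],
        [9.0, 2.0, 6.0, 0.0],
    ])

    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n             # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                     # double-centered scalar products

    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    print(np.round(eigvals, 3))                     # negative roots => non-euclidean

    # Real coordinates come only from the positive roots; coordinates on
    # negative roots would be imaginary.
    pos = eigvals > 1e-9
    coords = eigvecs[:, pos] * np.sqrt(eigvals[pos])
    print(np.round(coords, 3))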

20th century psychometricians were alarmed by these two characteristics (although no reasons were ever given for why the space of human cognition ought to be euclidean and low dimensional), and sought ways to “correct” the solution. Once again, in the Platonic tradition, psychometricians assumed that the measurements themselves were inherently untrustworthy, and that the high dimensionality and non-euclidean character of the space were the result of measurement error. The belief that human measurements were inherently very crude also led to the belief that the only useful result of developing a space of cognition was to produce two-dimensional pictorial maps that would give investigators an intuitive picture of the overall structure of the space. The idea of the space as an inertial reference frame within which cognitive processes might be precisely represented was not present in the psychometric literature of the time.

Attneave (1954) suggested that the paired comparison scales might not be trusted to have a true zero point, and proposed finding the smallest constant (the “additive constant”) which could be added to every measurement to make the space euclidean. This procedure, however, still left the dimensionality of the space high, which most psychometricians found uncomfortable. Another procedure known at the time was adding the magnitude of the smallest (largest absolute value) negative eigenroot to every eigenroot, which would leave all eigenroots non-negative, then renormalizing the eigenvectors to their new eigenroots. This eliminated all traces of non-euclideanism from the space, but still left a high-dimensional solution. What’s more, the original measurements differed from the values regenerated from the newly scaled coordinates by large margins.
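
The eigenroot-shifting “correction” is easy to express, and so is its cost. The sketch below (using the same assumed distance matrix as above) shifts every eigenroot by the magnitude of the most negative one, rescales the eigenvectors, and then regenerates the interpoint distances, which no longer agree with the original measurements:

    import numpy as np

    D = np.array([                                  # assumed distances, as above
        [0.0, 3.0, 4.0, 9.0],
        [3.0, 0.0, 5.0, 2.0],
        [4.0, 5.0, 0.0, 6.0],
        [9.0, 2.0, 6.0, 0.0],
    ])
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)

    shift = -eigvals.min()                          # magnitude of most negative root
    coords = eigvecs * np.sqrt(eigvals + shift)     # all roots now non-negative

    diff = coords[:, None, :] - coords[None, :, :]  # regenerate the distances
    regenerated = np.sqrt((diff ** 2).sum(axis=-1))
    print(np.round(regenerated - D, 2))             # systematic discrepancies remain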

Roger Shepard and Joseph Kruskal independently developed a “solution” to the problem of high-dimensional, non-euclidean spaces. Since, following Plato, the data provided by measurements would be grossly distorted, or, following Aristotle, the data provided by human measurements would be of a much lower order of precision than that provided by physical experience, the data themselves should be of secondary importance, and the more important component of understanding must come from the philosophical appeal of certain absolute, beautiful forms. Therefore, one may feel free to modify the measured data until they conform to the ideal space, which should be of low (two or three) dimensions and euclidean. Curiously, these writers assign one aspect of experience an inviolable certainty: they all assume that the ordinality of the measurements is trustworthy and must not be violated. Given this stipulation, investigators are free to adjust the measured values in any way and by any amount until they fit into a euclidean space of pre-specified dimensionality (usually 2 or 3), as long as the order of the original measurements is not violated. A number, usually Kruskal’s stress or a variant thereof, is then calculated to assess the degree to which the final solution violates the ordinality of the original measurements. This new method of “non-metric multidimensional scaling” had a major impact on the field of psychometrics and, for a long time, almost completely eclipsed the use of the classical Young-Householder-Torgerson methods.
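
Kruskal’s stress-1 is the number usually reported. A sketch of the statistic itself follows; the “disparities” would ordinarily come from a monotone regression on the original rank order, and here are simply assumed:

    import numpy as np

    def kruskal_stress1(config_distances, disparities):
        """Stress-1: how badly the fitted configuration departs from the
        monotone (ordinal) transformation of the original judgments."""
        d = np.asarray(config_distances, dtype=float)
        dhat = np.asarray(disparities, dtype=float)
        return float(np.sqrt(np.sum((d - dhat) ** 2) / np.sum(d ** 2)))

    # Hypothetical values for six concept pairs.
    d    = [1.2, 2.0, 2.1, 3.5, 3.9, 5.0]   # distances in the fitted space
    dhat = [1.0, 2.2, 2.2, 3.4, 4.0, 4.8]   # disparities from monotone regression
    print(round(kruskal_stress1(d, dhat), 3))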

Why, in the confusing world of Plato’s sense experiences, or Aristotle’s world in which human data can only be perceived to much broader tolerances than physical things, should these psychometricians find the ordinal relations of perception to be reliable data? This is probably due to S. S. Stevens’ fourfold classification of measurement as nominal, ordinal, interval and ratio, a taxonomy accepted as an article of faith by virtually every textbook in the social sciences.
Within this taxonomy, the lowest form of measurement is nominal, in which objects of perception can be named. At the second, ordinal level, the objects can be placed in rank order in terms of some attribute, but the exact intervals among them cannot be ascertained. At the third level, perceptual objects can be placed in an order, and the exact intervals among them can be established, but the location of a true zero, that is, a point at which none of the attribute exists, is unknown. Finally, at the highest level of measurement, the exact intervals among perceptual objects can be established, and their distances from a true zero point are known.

Stevens’ taxonomy is not derived from principles, nor based on data, but relies entirely on intuition or common sense. Like Aristotle’s law of falling bodies, however, Stevens’ classification is incorrect, and won’t stand up under very simple scrutiny. Of course, measurements can be classified into four categories, since any set of perceptions can be classified into any number of categories arbitrarily, but whether these categories themselves are an ordinal scale is open to question. Is an interval scale “higher” than an ordinal scale?

The psychometric literature makes it seem as if this is too obvious to require proof, either formal or empirical, but in fact the ordinal properties of a distribution of values are more robust than its metric values only for one special kind of distribution: one in which the values are sparse and widely spaced. When this is the case -- and only when this is the case -- large changes in the values of elements in the distribution will leave the rank order unchanged. In other kinds of distributions, including dense distributions with many elements close to each other in value, or data with extensive symmetry, such as the distances among the features of a human face, very slight changes in the values of the elements will produce very large changes in the rank order.

These are not rare cases, but probably typical of the most common kinds of data. Deciding whether you prefer chocolate to vanilla may be simple, and perhaps easier than deciding how much you prefer one to the other, but placing a long list of flavors into rank order is much more difficult -- even more difficult than assigning each flavor a numerical favorability rating.
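
The point can be checked numerically: small random perturbations essentially never change the rank order of a sparse, widely spaced set of values, but scramble the rank order of a dense set almost every time. A sketch with arbitrary values:

    import numpy as np

    rng = np.random.default_rng(0)

    sparse = np.array([1.0, 10.0, 100.0, 1000.0, 10000.0])
    dense  = np.array([5.00, 5.01, 5.02, 5.03, 5.04])

    def rank_changes(values, noise_sd=0.05, trials=1000):
        """Fraction of trials in which tiny noise alters the rank order."""
        changed = 0
        for _ in range(trials):
            perturbed = values + rng.normal(0.0, noise_sd, size=values.shape)
            changed += np.any(np.argsort(perturbed) != np.argsort(values))
        return changed / trials

    print("sparse:", rank_changes(sparse))   # essentially never changes
    print("dense: ", rank_changes(dense))    # changes in nearly every trial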

Not surprisingly, non-metric multidimensional scaling, which once almost completely eliminated classical procedures from the field, has already run into serious trouble, and even the leaders of the non-metric movement now suggest that the classical Young-Householder-Torgerson procedure is often -- perhaps even usually -- better.

The Galileo Group at the University of Illinois


The Galileo Group at Michigan State

The Galileo Group at Albany

The Galileo Group at Buffalo

The Galileo Group at The East West Center

But what good is it?

It’s important to remember that the original goal of factor analysis, the semantic differential, and multidimensional scaling was to find a space of low dimensionality spanned by a small set of vectors that represented meaningful psychological attributes. The Galileo developers, however, had no interest in this. Their goal was specifically to establish a coordinate reference frame that could be used as a mathematical aid for describing cognitive processes such as attitude and belief changes over time. The fact that strong evidence indicated that the space of cognitive processes, when measured with ratio-level paired comparisons and scaled with exact rather than approximate algorithms, was high dimensional and non-euclidean was of no consequence. The philosophy behind Galileo is straightforward and consistent: 1) measure as precisely as possible, 2) introduce no distortion into the analysis at any point, and 3) accept the results as they are. Following this philosophy rigorously generally produces high dimensional, non-euclidean spaces. But what are they good for, and why would anyone want to make one?

The usefulness of Galileo space is that events in the space correspond to events of interest in experience. Each of the points in a Galileo space represents a social object, following Mead, and such objects can be, as Blumer notes, “...anything that can be designated or referred to.” The self is an object in this system, and can be positioned in a Galileo space. Behaviors are also objects, and can be arrayed in the same space. Wisan’s dissertation supported the hypothesis that behaviors that are performed frequently (e.g., walking, sitting) lie closer to the self point in a Galileo space than do behaviors that are performed infrequently or rarely (e.g., marrying, fighting). In fact, between repeated administrations of the behavior questionnaire, National Guard forces were sent onto the University of Illinois campus in response to protests over the US invasion of Cambodia and the killing of four students at Kent State University. At the next administration, fighting had moved considerably closer to the self point in the map, while revolution had moved even farther away from the self.
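
Operationally, “closeness to the self point” is nothing more than the distance between the coordinates of the self and the coordinates of each behavior in the space. A trivial sketch with invented coordinates:

    import numpy as np

    # Hypothetical Galileo coordinates (one row of numbers per concept).
    coords = {
        "self":     np.array([0.0,  0.0,  0.0]),
        "walking":  np.array([1.1,  0.4, -0.2]),
        "marrying": np.array([6.3, -2.8,  1.5]),
        "fighting": np.array([9.7,  4.1, -3.0]),
    }

    self_point = coords["self"]
    for concept, point in coords.items():
        if concept != "self":
            print(concept, round(float(np.linalg.norm(point - self_point)), 2))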

Galileo lends itself well to time series measurement and experimental research, because it makes it possible to project measurements made at different times onto the same coordinates, and because the algorithm behaves identically every time. The non-metric scaling algorithms are seldom seen in time series or experimental research, because their iterative approximations interact non-linearly with the data and thus treat the data differently in each session, and because the merely ordinal character of the data is not strong enough to show changes over time with meaningful precision.
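
Projecting successive measurements onto the same coordinates requires rotating (and translating) the later configuration onto the earlier one, so that apparent motion is not an artifact of the arbitrary orientation of each solution. Galileo has its own rotation routines; the sketch below only illustrates the general idea, with an ordinary least-squares orthogonal rotation computed from a singular value decomposition on assumed coordinates:

    import numpy as np

    def rotate_to_target(X, Y):
        """Translate and rigidly rotate Y to best match X (least squares).
        X, Y: concept-by-dimension coordinate matrices for time 1 and time 2."""
        Xc = X - X.mean(axis=0)
        Yc = Y - Y.mean(axis=0)
        U, _, Vt = np.linalg.svd(Yc.T @ Xc)
        R = U @ Vt                         # orthogonal rotation (may reflect)
        return Yc @ R + X.mean(axis=0)

    # Hypothetical time-1 coordinates for four concepts in two dimensions.
    X = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]])

    # Time-2 data: the same configuration rotated 30 degrees, with one
    # concept genuinely moved.
    theta = np.radians(30.0)
    Rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    Y = X @ Rot.T
    Y[2] += [0.5, 0.5]

    aligned = rotate_to_target(X, Y)
    print(np.round(aligned - X, 2))        # most residual motion falls on concept 3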

Ideal Point

 
 
updated May 2, 2015
 
 