The Usefulness of Quantitative Methods in Political Science: The Case of Scaling and the United States Supreme Court

A thesis presented by Jonathan Robert Pool to the Department of Government in partial fulfillment of the requirements for the degree with honors of Bachelor of Arts, Harvard College, March 1964

IV. Scaling

Quantitative methods have been used to study many aspects of the Supreme Court since C. Herman Pritchett pioneered in their use in 1948.81 The summary decision-making of the Court has been quantitatively analyzed; the relation of Justices' behavior to their previous backgrounds and affiliations has been studied; the content of opinions has been mathematically analyzed; the statics and dynamics of intra-Court politics have been the object of much attention; and this is but a partial list.82 The attitudes of the Justices have been treated by quantitative methods perhaps more than has any other topic. Three principal methods have been used for this purpose: content analysis, Guttman scaling, and multivariate analysis. Content analysis has been employed comparatively rarely, mostly by one scholar, and it does not represent a very radical departure in theory from non-quantitative methods, so we shall exclude it from consideration. For reasons to be explained below, we shall also exclude multivariate analysis. The remaining method, Guttman scaling, also known as cumulative scaling, is by far the most widely used and well-established quantitative method for the investigation of Justices' attitudes, but, because of its theoretical and practical difficulties, its use for this purpose is often attacked. Scaling thus provides an ideal subject for our attention.
As we indicated above, we shall study first the theory of scaling and then its operation in practice. The theory of scaling is rooted not in an esoteric mode of thought but in common sense. The fact that the judicial process is based on the adversary system itself suggests that judges might think about their cases as problems of choice--each alternative being to favor one of two parties over the other. When precedent and the belief in rule by law rather than by men are firm, the choices involve deciding what precedents, laws, and constructions of law should apply to the facts of the case, as well as what facts should be taken into consideration. It is not unreasonable, then, to believe that judges translate the inter-party conflicts that they face into conflicts between values, in an effort to assuage the impingements of their own arbitrariness on the course of justice. Thus it is quite conceivable that the Justices of the Supreme Court regard many of their cases as contests between two values. Obvious examples of such conflicting pairs include individual freedom versus national security, protection for workers versus economic freedom for management, and decision on Constitutional dilemmas by the politically semi-representative and semi-accountable Court versus decision by the popularly elected Congress. Suppose, then, that some of the cases before the Court are viewed by all of the Justices as involving only, or predominantly, conflicts between pairs of values. If we can isolate a group of these cases involving conflicts between the same two values, then new possibilities for analysis should be opened. On the one hand, it should be possible to rank the Justices according to the intensity of their favoritism toward one value over the other. If in a certain case one Justice votes for value A and another for value B, we immediately know who is more willing to favor A over B.
By making several such comparisons we can order all nine Justices, from the most fervent partisan of A over B to the most avid supporter of B over A. In precisely the same fashion we can rank the cases as well. It is common to conceptualize that one of the values in a case makes a claim against the other. Thus, if a certain Justice supports value A over value B in one case, but gives his vote to B in another case, we can conclude that for this Justice, with his particular attitude toward A and B, A's claim on B in the first case was mild enough to earn the Justice's support, but A's claim became more severe in the second case--so severe, in fact, that the Justice was no longer willing to support it. A series of comparisons can be made between cases, if we consider that a claim supported by a Justice is less severe than one denied by the same Justice. In this way we can rank the cases according to the severity of the claim of A on B. It follows from what we have said that a Justice with a strong partiality toward value A will vote for A in more cases than will a Justice with cooler feelings toward A in comparison with B. And in a case in which A's claim on B is mild, A will get more votes than in a case where the claim is severe. It also follows that, given a group of cases conforming to our specifications*, the clearest way to express our conclusions is to construct a diagram, in which the Justices, ranked, lie on one axis, and the cases, also ranked, lie on the other, and in which, at each point of intersection between a Justice and a case, the Justice's vote in the case is recorded. Such a diagram is named a "scalogram". A model of a perfect scalogram is shown in Table 1.

Table 1. Model of a Perfect Scalogram

*That is, all cases viewed by all of the Justices as two-value conflicts, involving the same two values for every Justice and every case.
In this scalogram, the cases have been ranked from the one with the least severe claim on the left, C1, to the one with the most severe claim on the right, C10. The Justices have been ranked from the one most sympathetic to the claim at the top, J1, to the one least sympathetic to it at the bottom, J9. As is seen, for each Justice and for each case there is a "breaking point". Each Justice has voted for the claim in all cases to the left of his breaking point, and against the claim in all cases to the right. In each case, all the Justices above its breaking point have voted for the claim, and all those below the breaking point have voted against the claim. The breaking point of a Justice represents the most severe claim that he will support. We have shown that, if the Justices vote on a group of cases in accordance with their attitudes toward the conflict between two certain values, we can construct from the voting data a perfect scalogram. In practice, however, we do not begin with the Justices' motives and deduce from them a scalogram. It is the voting data that we are given, and from which we construct a scalogram, and from this we seek to discover the attitudes of the Justices. However, it does not necessarily follow that, if we can construct a perfect scalogram, we can also draw firm conclusions about the Justices' attitudes. The reason is that there are other possible causes of the existence of a perfect scalogram than the correctness of our assumption about the Justices' voting. Even by pure chance the votes might form a pattern that could be perfectly scaled, as we shall see later. Therefore, in investigating attitudes by means of scaling, we must rename as our hypothesis what has heretofore been our basic assumption about the way the Justices decide how to vote. Thus, the existence of a perfect scalogram may tend to confirm our hypothesis for the particular group of cases that is scaled.
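The model just described lends itself to a simple mechanical statement. The sketch below is illustrative code, not part of the thesis or of any scaler's published procedure: it builds a perfect scalogram from assumed breaking points and verifies the defining property that no Justice casts a negative vote to the left of a positive one.

```python
# A minimal sketch of the perfect-scalogram model. Justices are rows,
# cases are columns ordered from mildest claim (left) to most severe
# (right); 1 = vote for the claim, 0 = vote against. The breaking
# points below are hypothetical, not actual Court data.

def perfect_scalogram(breaking_points, n_cases):
    """Each Justice votes for the claim in every case left of his
    breaking point and against it in every case to the right."""
    return [[1 if c < bp else 0 for c in range(n_cases)]
            for bp in breaking_points]

def is_perfect(matrix):
    """A scalogram is perfect iff every row is a run of positive votes
    followed by a run of negatives (no 0 immediately before a 1).
    When the Justices are ranked by breaking point, the corresponding
    column property follows automatically."""
    return all(all(not (row[i] == 0 and row[i + 1] == 1)
                   for i in range(len(row) - 1))
               for row in matrix)

# Nine Justices, ranked from most to least sympathetic; ten cases.
votes = perfect_scalogram([10, 9, 8, 7, 5, 4, 3, 2, 1], 10)
print(is_perfect(votes))   # → True
```

A single stray vote (a positive vote to the right of a Justice's breaking point) is enough to destroy the property, which is the situation the next paragraphs take up.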
It is rare, however, that a perfect scalogram, such as that in Table 1, can be produced from the real voting data of the Court, because of several characteristics of these data. It may be that two or more Justices have the same voting record, or that in two or more cases the same Justices voted for the claim and the same Justices against it. In such a situation it is impossible to rank those Justices with the same voting record or those cases which elicited the same response from the Court. We are not justified in saying that they are of the same rank: their breaking points may indeed differ. We know about each breaking point only that it lies somewhere in the region between the last positive and the first negative vote. Depending on the particular cases and Justices to whom these votes belong, the region of indeterminacy may be large or small, i.e. include a large or small variation in severity of claims and sympathy of Justices. If we cannot rank two cases, we shall never be able to rank them, for no new data will ever appear about the votes in those cases. If we cannot rank two Justices, however, there is still hope that in the future a case will arise whose claim splits the two, revealing who is more sympathetic to the claim. The most serious reason for the frequent impossibility of constructing a perfect scalogram is that the votes cannot be arranged in such a way that every Justice has a breaking point which divides all his positive votes from all his negative votes. Some Justices have positive votes surrounded by negative ones, or vice versa; these are called by the scalers "inconsistencies". In order to eliminate the inconsistencies of one Justice, it would be necessary to re-arrange the cases in such a way that inconsistencies would arise for other Justices.
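The notion of an inconsistency can be stated mechanically as well. In the hypothetical sketch below (the function name and the sample rows are illustrative), a Justice's inconsistencies under a given case order are the fewest votes that would have to be reversed to give him a clean breaking point:

```python
# Hypothetical helper: count a Justice's inconsistencies under a given
# case order. Cases run from mildest claim (left) to most severe
# (right); 1 = vote for the claim, 0 = vote against.

def inconsistencies(row):
    """Fewest votes to reverse so the row becomes all positive votes
    followed by all negative ones, i.e. a clean breaking point."""
    n = len(row)
    return min(
        sum(1 for v in row[:bp] if v == 0) +   # negatives left of bp
        sum(1 for v in row[bp:] if v == 1)     # positives right of bp
        for bp in range(n + 1)
    )

print(inconsistencies([1, 1, 1, 0, 0]))  # clean breaking point → 0
print(inconsistencies([1, 0, 1, 1, 0]))  # one stray negative vote → 1
```

Re-ordering the cases changes each Justice's count, which is why removing one Justice's inconsistencies can create inconsistencies for others.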
The presence of even one inconsistency forces us to re-examine our hypothesis that all of the Justices voted in all of the cases in accordance with their attitudes toward the conflict between one pair of values. Inconsistencies require the modification of this proposition in one or more ways. One modification is to hypothesize that a Justice with inconsistencies in a scalogram changed his attitude toward the values at some time. If such a change is suspected, one can divide the group of cases into those decided before and those decided after the date of the alleged change, and then scale the two groups separately. If this operation does not work, however, and it becomes necessary to assume several changes of attitude in order to eliminate the inconsistencies, it then becomes reasonable to believe that the values about which the Justices' attitudes are being considered were not the only values affecting the votes. We are then led to conclude that there was more than one influential question in the minds of some of the Justices, making perfect scaling impossible. Another possible modification of the proposition tested by scaling is the addition of subjectivity. It is easily conceivable that all the Justices could consider that a certain group of cases raised only one significant question and could all have a fixed attitude on that question, yet not vote in a scalable manner. All that is necessary is a difference between the severity of the claim perceived by one Justice and that perceived by another. If, for a particular group of cases, the scalogram turns out to contain a large number of inconsistencies, we must conclude that the hypothesis is simply disproved for this group of cases. If, however, the number of inconsistencies is small, i.e. the scalogram is almost perfect, we can make it compatible with our hypothesis if we modify the hypothesis in one of the ways described above.
We can also use a comprehensive, less specific modification, by making our hypothesis: the Justices in this group of cases voted generally, but not invariably, according to their attitudes toward the conflict between one pair of values. As long as there are only a few inconsistencies, it is reasonable to entertain this hypothesis, because the probability of a near-perfect scalogram occurring by chance is exceedingly small. In order to have some rule for deciding how imperfect a scalogram may be and still be considered helpful as an almost perfect scalogram, scalers have established arbitrary criteria to decide whether a group of cases is scalable. The most sophisticated of these criteria is the coefficient of scalability, abbreviated "S", which is equal to the fraction of the potentially inconsistent votes that is consistent. The commonly accepted minimum value of S for scalability is 0.60 or 0.65. In other words, if at least 60% or 65% of all the votes which might have been inconsistent (when nine Justices vote there is a maximum of four possible inconsistencies in each case) are consistent, the pattern is considered scalable. Another criterion is the coefficient of reproducibility (CR). This is the fraction of all votes cast that is consistent. If at least 90% of the votes recorded in the scalogram are consistent, the CR is at least 0.90 and the votes are considered scalable. The coefficient of scalability, the coefficient of reproducibility, and all other formulae for the measurement of scalability contain biases which become important when the given group of cases is characterized by one or another kind of voting pattern, such as many 5-4 votes. But our main concern is with the validity of the scaling process, and for our purposes the issue of the precision of the indicators of scalability can be neglected. Until this point we have discussed only one type of scalogram: that in which the same Justices voted in all of the cases in the group being scaled.
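The two criteria can be sketched in code. The definitions below follow the verbal ones above: CR is the consistent fraction of all votes, and S is the consistent fraction of the potentially inconsistent votes, with at most min(#pro, #con) potentially inconsistent votes in each case. The error count and the sample matrix are illustrative, not drawn from actual Court data.

```python
# Hedged sketch of the coefficient of reproducibility (CR) and the
# coefficient of scalability (S) for a filled scalogram: rows are
# Justices, columns are cases in scale order, 1 = for, 0 = against.

def row_errors(row):
    """Inconsistent votes in a row: fewest flips needed to turn it
    into positive votes followed by negative ones."""
    n = len(row)
    return min(sum(1 for v in row[:bp] if v == 0) +
               sum(1 for v in row[bp:] if v == 1)
               for bp in range(n + 1))

def reproducibility(matrix):
    """CR: fraction of all votes that is consistent (conventional
    threshold: at least 0.90)."""
    total = sum(len(r) for r in matrix)
    return 1 - sum(row_errors(r) for r in matrix) / total

def scalability(matrix):
    """S: fraction of the *potentially* inconsistent votes that is
    consistent; each case contributes min(#pro, #con) potential
    inconsistencies (conventional threshold: 0.60 or 0.65)."""
    cols = list(zip(*matrix))
    potential = sum(min(sum(c), len(c) - sum(c)) for c in cols)
    errors = sum(row_errors(r) for r in matrix)
    return 1 - errors / potential if potential else 1.0

m = [[1, 1, 1, 0],   # consistent
     [1, 1, 0, 0],   # consistent
     [1, 0, 1, 0]]   # one inconsistent vote
print(round(reproducibility(m), 3))  # → 0.917
print(scalability(m))                # → 0.5
```

The example shows why S is the stricter criterion: one inconsistency out of twelve votes leaves CR above 0.90, but the same inconsistency consumes half of the two potential inconsistencies, leaving S at 0.5, below the conventional minimum.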
Barring nonparticipations, such a scalogram contains a record of a vote at every available position in the diagram. Such completely filled scalograms are exceptional, however. Most groups of cases selected for scaling stretch over a period of time encompassing changes in the membership of the Court, or include cases in which not every Justice cast a vote. The result is a scalogram in which there is no indication of a vote at several positions, namely the intersections of each Justice with the cases decided when he was not on the Court or not participating. We may group these two situations together and call the result an unfilled scalogram. Unfilled scalograms have some characteristics different from those of filled ones and are worthy of separate consideration. In discussing filled scalograms, we hypothesized that, according to scaling theory, a case presenting a more severe claim should attract fewer votes in support of the claim than a case in which the claim is milder. This principle is not valid for unfilled scalograms, however. The claim in one case may receive more votes than the claim in another for the simple reason that one or more Justices sympathetic to such claims occupied positions on the Court during the one case, and less sympathetic Justices filled those positions during the other case. In fact, it may very well happen that a claim that receives many votes at one time is more severe than one receiving few votes at another time. Another principle used in constructing filled scalograms was that a Justice who gave more votes to a claim was more sympathetic to the claim than one who gave it fewer votes. This principle does not hold for unfilled scalograms, because Justices belonging to the Court at different times can be expected to vote on questions involving claims of different severities.
Therefore the breaking point of a Justice is not related to the proportion of his votes that he gives to the claimant, but only, in accordance with the original definition, to the most severe claim that he supports and the mildest claim that he rejects. It is not even possible to locate exactly every Justice's breaking point in an unfilled scalogram, because the most severe claim voted for and the mildest one rejected by a Justice may be separated by several cases with intermediate claims, on which the Justice did not vote at all. Since there is no way to determine how he would have voted in those cases, his breaking point cannot be defined, except by saying that it lies somewhere in the region between the most severe claim supported and the mildest one denied. Unfilled scalograms not only have these special characteristics, but they also have a special value not possessed by filled ones. The value lies in the fact that unfilled scalograms bring together for comparison cases decided by different Justices, and Justices who decided different cases. Thus an unfilled scalogram can be used to support the hypothesis that one case raised a more severe claim than another case, even though an entire turnover in Court membership occurred between the two. Likewise, one Justice may be said to be more sympathetic to a certain kind of claim than another Justice, even if the two never shared a day on the Court. How such comparisons may be made is illustrated by the hypothetical scalogram in Table 2. For the sake of brevity, only a five-man Court has been assumed in Table 2, but the case of a nine-man Court is perfectly analogous.

Table 2. Hypothetical Unfilled Scalogram

The cases are lettered in chronological order of decision, and the Justices are numbered in order of date of retirement from the Court. Among the many possible comparisons, it appears that case J raised a more severe claim than case A.
We can say this in spite of the fact that they were decided by two entirely different Courts. The scalability of the voting pattern gives support to our general hypothesis. According to the hypothesis, case J presented a more severe claim than case D, because Justice J8 was willing to grant the claim in D but not in J. And case D involved a more severe claim than did case A, since Justice J4 voted for the claim in A but denied it in D. If the claim in J was more severe than in D, and the claim in D more severe than in A, the claim in J was obviously more severe than the claim in A. Similarly, we can conclude that Justice J10 was apparently more sympathetic to the claim in question than was Justice J5, even though there was not one case on which they both voted, and even though J5 gave the claim 57% of his relevant votes and J10 gave it only 50% of his. The possibility of drawing such conclusions shows the special analytical power of unfilled as opposed to filled scalograms. With this sketch of the theory of scaling as relevant to the Supreme Court, we turn to the ways in which this theory has been applied. We must pay the most attention to the work of Glendon Schubert. He has been the leading scaler of the Court, and many of the achievements of scaling, as well as most criticisms of it, are due to him. One of Schubert's favorite techniques is to scale several groups of cases dealing with different variations of the same subject. For example, he scaled all the cases involving aliens between 1950 and 1957, then scaled the ones between 1950 and 1953 (the Vinson Court) separately from those between 1954 and 1957 (the Warren Court), and finally separately scaled, over the entire period, those cases involving Communist and those involving non-Communist aliens. Each time, the resulting scalogram had an acceptably low number of inconsistencies by the conventional criteria.
Naturally, the information contained in one scalogram could be translated into many paragraphs of verbal description, so the scaler must selectively interpret his scalograms for the reader. One thing noticed and called to the reader's attention by Schubert was that the Vinson Court was highly polarized, with four of the Justices voting for a large majority of the aliens and five Justices voting against the aliens almost every time. The scalogram for the Warren Court shows no such sharp division: the Justices were apparently rather evenly distributed over the continuum of attitudes toward the claims of aliens against the government. Regarding the possible differential treatment of aliens who were Communists, the scalograms showed no substantial difference, but Schubert observed that

it is entirely possible that part of the differences noted in comparing the Vinson and Warren Court periods may be attributable to the fact that a majority of the alien cases decided by the Vinson Court, but only four of the twenty decisions of the Warren Court, involved Communists. It would be necessary to make an independent scalogram analysis of all cases involving alleged Communists, decided by the Supreme Court during the period of the 1949-1956 Terms, in order to be more confident as to which of the two alternative interpretations should be preferred.83

The rank of Frankfurter in attitude toward aliens is specifically noted by Schubert. Frankfurter, in both Courts and toward both Communist and non-Communist aliens, was the third most sympathetic to their claims, below only Douglas and Black.84 Another topic subjected to scale analysis by Schubert was the right to counsel.85 He found that right-to-counsel cases decided between 1940 and 1957 were scalable.
He also managed to scale separately the sub-group of cases involving capital punishment and the cases in which lesser punishments were given, in order to test whether there was a difference in the Justices' attitudes toward the right to counsel in these two kinds of cases. Examining cases involving allegedly unconstitutional search and seizure,86 Schubert scaled all of the cases between 1937 and 1957 and found enough inconsistencies to make the scalogram only barely acceptable according to the coefficient of reproducibility. Then he separated the cases involving acts by Federal authorities from the cases of search and seizure under state authority. When these groups were scaled separately, scalability was improved. Frankfurter was found to be the strongest supporter on the Court of protection against improper Federal search and seizure,87 but to be relatively less enthusiastic about voting against alleged improper state search and seizure.88 Schubert tried to scale "federal tax cases involving government-taxpayer conflicts decided during the 1953-1958 terms", but the resulting scalogram was unacceptable. He succeeded, however, with the subset including only the cases involving criminal charges.89 When a selected group of cases proves to be scalable, the scaler usually concludes that the Justices voted on the cases according to their attitudes toward the claim which the scaler used as a criterion for choosing the group. Thus S. Sidney Ulmer's studies of civil liberties cases in the 1956 and 1959 terms of the Court have concluded that the members of the Court voted on these cases according to "one dominant operating variable...
: deprivation of a claimed civil liberty."90 Schubert has indicated that a high degree of scalability for the alien cases would be "persuasive evidence" for the consideration by the Justices of alien status itself as an important claim.91 Schubert has summed up the conclusions drawn from scaling as follows:

The research done thus far in cumulative scaling indicates that there is a high degree of consistency in the attitudes of Supreme Court Justices toward the recurrent issues of public policy that characterize their work load. This consistency of response in individual judicial voting in such an area of public policy as civil liberties claims appears to provide a much better general explanation of how and why the Court makes its policy choices than does the alternative traditional theory of stare decisis--that consistency in the manipulation of precedential legal rules and principles is a function of legal craftsmanship.92

Such conclusions are typical of scalers, and they contrast with what non-scalers usually conclude. Those who use non-quantitative methods, whether unsophisticated or sophisticated, generally hold that a Justice's attitudes toward a few legal principles will go a long way in explaining his votes. Some, however, like Rodell, believe that the forces motivating the Justices are not few, but infinite, and not legal, but of diverse kinds. It might be anticipated that the users of quantitative methods, supposedly on the frontier of political research, would represent an extension of the "realism" typified by Rodell. Such is not the case. The scalers view judicial behavior as explainable by few, not many, principles, and only some of these are non-legal, such as attitude toward aliens, while others are legal, e.g. attitude toward state versus Federal search and seizure. Thus, in their conclusions, the scalers tend to embody some of both Rodell and Mason.
The process of scaling and drawing conclusions from scalograms used by Schubert, Ulmer, and others is fraught with fallacies. We shall display this process in more detail as we expose the errors that it contains. One of the major faults lies in the manner in which scalograms are constructed from raw voting data. Scalers refuse to recognize the important differences between filled and unfilled scalograms, and use one set of rules for constructing both kinds. The only published comprehensive set of rules for scaling Supreme Court decisions93 is too long and complicated to be described here. What is important is that these rules, in an apparent effort to preserve objectivity by eliminating any possible differences in the treatment of the same material from one investigator to another, elaborately specify how to arrange the cases and Justices in order, given any distribution of votes. In order to accomplish this arrangement, many arbitrary rules are employed. These have the effect of ordering cases according to the number of votes received by their claims, a principle which we have shown to violate scaling theory itself when applied to the construction of unfilled scalograms. The rules also redefine the breaking point of a Justice to be the most severe claim supported, i.e. the left end of the region of indeterminacy mentioned above, and the Justices are ordered accordingly. This procedure, too, violates scaling theory, by extracting from the voting data information more precise than the theory permits. When these rules are applied to the construction of filled scalograms, the result usually (but by no means always) is the arrangement of cases and Justices which produces the least possible number of inconsistencies. Scaling theory offers no reason why a scaler should punish himself by making any arrangement that increases the number of inconsistencies, and yet such an increase is often the result of the construction of unfilled scalograms with Schubert's rules.
Thus there are two major objections to the use of these rules in the construction of unfilled scalograms: more precision is claimed than the data warrant, and unnecessary inconsistencies are forced into the resulting scalograms. In illustration of the first objection, consider the following scalogram, constructed by Schubert for a group of Federal search and seizure cases from 1937 to 1948.94

Table 3. Unfilled Scalogram Constructed by Schubert

This scalogram shows only one inconsistency, and by inspection we can see that it would be impossible to reconstruct the scalogram without at least one inconsistency. The data on which the scalogram is based, however, are not complete enough to justify this arrangement of cases and Justices as the only possible one. There are many ways in which we could re-order them without increasing the number of inconsistencies, and the resulting new scalogram would be just as plausible, in terms of scaling theory, as the one in Table 3. An example of how the same data could be rescaled is shown in Table 4. A comparison of Tables 3 and 4 will show that the number of inconsistencies has remained constant, but the order of cases and Justices has changed substantially. Case E, for example, is the eighth case in Table 3 and the eleventh in Table 4; case I moved from twelfth to eighth position; G went from tenth to thirteenth. Only two cases retained their positions. The change is more pronounced in the order of the Justices. Among the many changes of position, Brandeis moved from a low fourteenth place to a middle ninth, and, most spectacularly, Black dropped from an upper-middle seventh position to nearly the bottom of the list, at sixteenth place.

Table 4. Schubert's Scalogram (Table 3) Reconstructed without His Rules

As can be seen in the tables, either Black's vote in case J or his vote in case M must be considered an inconsistency in scaling terms.
In Schubert's scalogram, the J vote is arbitrarily chosen as the inconsistency, and, in our revision, the M vote is chosen instead. These two alternatives make it possible to place Black almost anywhere between seventh and sixteenth positions. Thus indeterminacy may be an important characteristic of unfilled scalograms, but scalers tend to avoid acknowledging it by injecting an artificial determinacy through their scaling rules. Our second objection, that scalers include unnecessary inconsistencies in unfilled scalograms, is illustrated by the following pair of scalograms.

Table 5. Unfilled Scalogram Constructed by Schubert

Table 5 is Schubert's interpretation of a group of Federal search and seizure cases from 1949 to 1957.95 This scalogram contains seven inconsistencies, but one of them is due to the fact that Schubert disobeyed his own rules, for an unexplained reason, in calling Douglas's breaking point seven instead of ten. Even if this error is corrected, however, the number of inconsistencies, now six, can be decreased by rearranging the cases and Justices, as in Table 6, for example.

Table 6. Schubert's Scalogram (Table 5) Reconstructed without His Rules

Here the number of inconsistencies has been reduced from six to four. A few substantial changes of position have also taken place, such as those of case U and Justices Harlan and Jackson. The two illustrations above attempted to show that the potential value of unfilled scalograms has been largely negated by the substitution of arbitrary rules for reasonableness in the scaling process. Until this point, we have discussed some reasons why the model of a perfect scalogram in Table 1 cannot usually be reproduced in practice, and some failings in the ways in which scalers cope with the resulting imperfections and uncertainties.
Even if all of these practical difficulties in constructing scalograms were eliminated, however, and every scalogram in practice turned out to be a perfect one, the conclusions drawn from them would still be open to doubt. In describing the theory of scaling, we noted one basic implication: if the Justices all vote on each of a group of cases in accordance with their fixed attitudes toward one question (the proper extent of the claim of one value on another), then a scalogram without inconsistencies can be constructed for that group of cases. In describing the conclusions drawn from scaling, however, we have seen that the scalers assume that the reverse implication, too, is valid. In fact, as we noted earlier, it is not. The scalability of a group of cases in no way implies that the cases were voted on with one and the same question in the minds of the Justices. If votes for each Justice are assigned at random to the cases in a group, there is a certain chance that the group will be scalable without inconsistencies, and there is a greater probability that it will be scalable with few enough inconsistencies to meet the criteria for scalability used by most scalers. If the number of cases is large enough and/or the scalogram is sufficiently filled, the probability of a scalable voting pattern arising by pure chance is indeed small, but it is still there. Moreover, if the voting of the Court on a certain group of cases is scalable, there is a greater probability that the votes of any single Justice would fit into the scale pattern, even if those votes were assigned to him at random, because of the small number of votes involved. The probability is especially great if the Justice was on the Court for only part of the group of cases. In Table 6, for example, three of the Justices cast votes in only two cases each, and only one out of four possible combinations of votes, i.e.
a negative vote left of a positive vote, could have produced an inconsistency for any of these three Justices. We are not suggesting, of course, that any Justice votes by tossing a coin. But whatever the probability that random voting would have had of making a group of cases scalable, voting on the basis of more than a single question would have at least the same probability of leading to an acceptable scalogram. Therefore, from the scalability of a group of cases we can conclude that they were voted on with one question in mind, but we must always attach the reservation that there is a certain statistically calculable probability that other considerations entered into the voting, perhaps making the scalogram more perfect or less perfect, and perhaps not. Scalers seem to regard the perfectness or imperfectness of a scalogram as an indication of success or failure. This attitude obscures the fact that we can draw more certain conclusions from non-scalability than from scalability. Although we must make an allowance for chance, i.e. additional influences on the voting, when we conclude the dominance of a single question from the existence of an acceptable scalogram, we can say with absolute certainty, when a group of cases fails to scale, that the Justices' attitudes toward one question did not determine all of their votes. There may be groups of cases on which we would expect the Justices to vote on the basis of one question, and then such a negative conclusion would be quite important. But if our conclusion is that one question did dominate in a group of cases, our difficulties have not ended when we have made the necessary reservation about chance. We must still ask ourselves what the dominant question was. The answer is by no means obvious, but scalers seem to think it is.
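The role of chance described above can be made concrete with a small simulation. The sketch below is a modern illustration, not part of the original study: it counts Guttman errors by the minimum-error (Goodenough-Edwards) convention, treats a coefficient of reproducibility of .90 as the acceptability threshold, and estimates how often purely random voting would nevertheless produce an "acceptable" scalogram. The function names and the choice of threshold are assumptions for illustration only.

```python
import random

def min_row_errors(row):
    """Fewest Guttman errors for one Justice's votes, with cases ordered
    from most- to least-supported: choose the breaking point k (vote 1
    predicted on the first k cases, 0 after) minimizing disagreements."""
    n = len(row)
    return min(
        sum((j < k) != (v == 1) for j, v in enumerate(row))
        for k in range(n + 1)
    )

def reproducibility(matrix):
    """Coefficient of reproducibility: 1 - errors / total votes,
    after ordering cases (columns) by number of positive votes."""
    n_cases = len(matrix[0])
    order = sorted(range(n_cases), key=lambda j: -sum(r[j] for r in matrix))
    errors = sum(min_row_errors([r[j] for j in order]) for r in matrix)
    return 1 - errors / (len(matrix) * n_cases)

def chance_of_scalability(n_justices, n_cases, trials=2000, threshold=0.9):
    """Fraction of purely random vote matrices that would be 'acceptable'."""
    hits = 0
    for _ in range(trials):
        m = [[random.randint(0, 1) for _ in range(n_cases)]
             for _ in range(n_justices)]
        if reproducibility(m) >= threshold:
            hits += 1
    return hits / trials

# A perfect scalogram reproduces exactly:
perfect = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
assert reproducibility(perfect) == 1.0
```

Running `chance_of_scalability` with few Justices and few cases yields a noticeably nonzero fraction, which is the quantitative content of the reservation the text insists on: small or sparsely filled scalograms can meet the conventional criteria by chance alone.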
Usually the group of cases selected for investigation by scaling is not picked by lot from all of the cases within a certain period of time, but is composed of those cases which, in the view of the investigator, have something important in common. Examples are civil rights cases, freedom of speech cases, and cases testing state regulation of business. If an investigator picks a group of cases according to what seems to him to be a common important characteristic, and if he finds that the votes on the cases in this group are scalable, he naturally tends to conclude that the single question which probably dominated the voting turned upon the criterion by which he originally chose the cases. The conclusion is not valid, because of two possibilities. First, it may be true that the investigator has chosen all of the cases, and only the cases, which share a certain characteristic, but the salient characteristic may be one other than that by which he made the choice. The second, and more likely, possibility is that the cases selected are part of a larger group, distinguished by a different characteristic. By looking at any perfect scalogram, it is easy to see that the elimination of any number of cases from consideration would not have any effect on the scalability of the remaining group. Hence the important principle that, if a group of cases is scalable, so is any subgroup thereof. If the group is only imperfectly scalable, but within the conventionally acceptable range, the removal from consideration of cases not responsible for any inconsistencies may leave a scalogram with a larger fraction of its votes inconsistent, and therefore with a poorer coefficient of scalability. In general, however, a subgroup is not likely to be much more or less scalable than the entire group from which it is taken, if that group scales well.
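The subgroup principle just stated can be checked mechanically. The sketch below is a modern illustration under the usual minimum-error Guttman counting (nothing in it is taken from the original text): it deletes every possible subset of cases from a perfect scalogram and confirms that reproducibility remains perfect.

```python
from itertools import combinations

def reproducibility(matrix):
    """Guttman coefficient of reproducibility: order cases by popularity,
    then count, per Justice, the fewest errors over all breaking points."""
    cols = sorted(range(len(matrix[0])), key=lambda j: -sum(r[j] for r in matrix))
    def row_errors(row):
        return min(sum((j < k) != (v == 1) for j, v in enumerate(row))
                   for k in range(len(row) + 1))
    errors = sum(row_errors([r[j] for j in cols]) for r in matrix)
    return 1 - errors / (len(matrix) * len(matrix[0]))

# A perfect 5-Justice, 4-case scalogram (rows = Justices, columns = cases):
perfect = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]

# Every nonempty subset of cases remains perfectly scalable:
for size in range(1, 5):
    for subset in combinations(range(4), size):
        sub = [[row[j] for j in subset] for row in perfect]
        assert reproducibility(sub) == 1.0
```

The reason is simple: each Justice's row is a "step" of positive votes followed by negative ones, and deleting columns cannot disturb that step shape. This is why, as the text argues, the scalability of a hand-picked subgroup adds almost nothing once the larger group is known to scale.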
If a group of capital punishment cases proves to be scalable, it may be that the Justices voted according to their attitudes toward the death penalty. But perhaps the larger group of cases involving all criminal convictions would also produce an acceptable scalogram. If this group were originally chosen to be scaled, the scaler would most likely conclude that the Justices' votes on all the cases, including the capital punishment ones, were based on their attitudes toward criminal defendants versus the state. As another example, Wallace Mendelson notes Ulmer's conclusion that the Justices voted on a group of cases according to their attitudes toward civil rights. But scaling, says Mendelson, can prove several contradictory things: if particular subgroups of the original group are scaled, it can be shown, in the same manner in which Ulmer "proved" his assertion, that the Justices voted on the 25 cases in one subgroup in accordance with their attitudes on Communism, and on the 15 cases in another subgroup according to their attitudes toward homicide.96 Our objection to the validity of the conclusions derived from scaling is becoming serious indeed. We have seen that, if a group of cases is scalable, the scalability of a subgroup tells us almost nothing new, since it could be predicted from the fact that the full group is scalable. What must we then conclude if we find that all of the cases decided in one term produce an acceptable scalogram? The scaler would conclude that the Justices voted on all the cases according to some one dominant question; indeed, this very conclusion is implied when we characterize a Justice as being to a certain degree liberal or conservative. Liberality-conservatism is conceived as a one-dimensional variable encompassing all, or almost all, of the particular policy issues which arise.
According to this conception, the Justices can be ordered according to a scale of liberality, and this ordering implies an ordering of the cases as well, so that one case presenting a more severe liberal claim than another will receive fewer votes than the other, those voting for the claim being more liberal than those voting against it. The conception just described is what a scaler would conclude if he found all cases scalable together. And, for at least two terms (the 1936 and 1961), the group comprising all the cases decided within a term has been found to be scalable. Pritchett described the entire period of 1931-6 as marked by "almost watertight" blocs in the Court, and by a fixed pattern of agreement and disagreement. "Locating the Justices along a single attitude scale in terms of relative liberalism or conservatism would adequately account for the judicial disagreements manifested during that period."97 Schubert confirmed this statement by scaling the entire 1931 term.98 To discover whether the adequacy of liberal-conservative ratings would apply to a recent term, we scaled the entire 1961 term, which is presented in Table 7. According to both of the standard criteria of scalability, this scalogram is acceptable, although not to as high a degree as the scalogram of the 1934 term. This finding gives us reason to be very wary of bias on the part of the scaler. Whatever group of cases he picks out to scale, his results add little to our knowledge if the group composed of all cases is scalable. Clearly a reasonable procedure would be to begin by trying to scale entire chronological series of cases, and, if these did not scale, to investigate how they might be divided into groups that would produce acceptable scalograms. Scalers, however, begin at once with small groups.
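The "standard criteria of scalability" invoked here are conventionally the coefficient of reproducibility (commonly required to reach .90) and a coefficient of scalability (commonly .60). The sketch below is a modern illustration, not drawn from the thesis: it computes reproducibility exactly, and uses one common formulation of the second criterion, improvement over the minimal marginal reproducibility (MMR); both thresholds and the MMR-based formula are assumed conventions for illustration.

```python
def scalogram_criteria(matrix):
    """Return (CR, S): coefficient of reproducibility, and scalability as
    improvement over minimal marginal reproducibility, S = (CR-MMR)/(1-MMR).
    Rows are Justices, columns are cases, entries are 1 (pro) or 0 (con)."""
    n_j, n_c = len(matrix), len(matrix[0])
    cols = sorted(range(n_c), key=lambda j: -sum(r[j] for r in matrix))
    def row_errors(row):
        return min(sum((j < k) != (v == 1) for j, v in enumerate(row))
                   for k in range(len(row) + 1))
    cr = 1 - sum(row_errors([r[j] for j in cols]) for r in matrix) / (n_j * n_c)
    # MMR: proportion of votes reproduced by always guessing each case's
    # modal outcome, ignoring the scale ordering entirely.
    mmr = sum(max(s, n_j - s) for s in (sum(r[j] for r in matrix)
                                        for j in range(n_c))) / (n_j * n_c)
    s = (cr - mmr) / (1 - mmr) if mmr < 1 else 1.0
    return cr, s

# One inconsistent vote in an otherwise perfect 6-Justice, 4-case scalogram:
votes = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0],
         [1, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
cr, s = scalogram_criteria(votes)  # cr = 23/24, above the .90 convention
```

On this formulation, a scalogram is "acceptable" only when it reproduces the votes much better than the case-by-case marginals alone would, which is the comparison against chance that the surrounding argument demands.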
They do not select, for scaling, groups comprised of those cases involving litigants one of whose middle names begins with "R", nor groups consisting of those cases arrived at by one or another manipulation of a random number table, for there is no reason even to suspect that such groups might be scalable. Scalers select, instead, groups of cases which have in common an issue which the scalers think might well be the dominant issue in the minds of the voting Justices. When the groups scale, it is this issue which is assumed to be the dominant one, and the scalers do not consider the possibility that where the one issue occurred there coincided also another, to which the Justices paid more attention. Thus the scalers' conclusions depend closely on their presuppositions. One allegation as to what these presuppositions might be is made by Mendelson. Cases involving civil liberties, he says, involve also "the distinction between constitutional and statutory construction; between stare decisis in relation to constitutional, as against statutory, decisions; between judicial review of procedure and judicial review of substance--to mention only the obvious." "[O]nly an activist is inclined to ignore" such differences and to lump all civil liberties cases together without further distinction. Thereby "neo-behavioralism" (a term which subsumes scaling) "tends to reflect the dedicated libertarianism of most of its practitioners.
Thus it generally applies to the judicial process an essentially political test; namely, the libertarian party line."99 For Mendelson, the principal "weakness" of judicial behavioralism is that "it assumes that every vote in a case that has some connection with civil liberty, for example, is necessarily a vote for or against that liberty."100 Here Mendelson points to the existence of cases in which several Justices voted against the person claiming a civil liberty for procedural reasons, but simultaneously voted for a principle destined to have a wide pro-civil-liberties effect in the future. There is evidence to support Mendelson's charge. On the trivial side, there is the fact that scalers use a language which seems to reflect, or which could easily lead to, the mode of thought that is alleged. Votes for the civil liberties claim or the civil rights claim or the individual's claim against the state or a business are invariably denoted as positive, and opposite votes as negative. And the term "inconsistency" is a very infelicitous one, which, if not understood from a purely technical standpoint, implies the correctness of the proposition which scaling is designed to test, even before the results of the testing are known. It also implies that, if a Justice votes on a different basis from his fellow Justices, he does not vote according to any principle at all. The crux of the question lies in the way in which scalers draw their conclusions. Mendelson is wrong in saying that scalers assume the irrelevance of jurisprudential, as opposed to political, principles. Such an assumption is not logically inherent in the scalers' work, nor, as we have seen, do scalers even conclude that legal principles are completely extraneous.
Rather, the scalers combine an inattention to the implications of their own theory with a lack of rigor in their procedure, with the result that they are able to achieve almost any conclusions they might want, and they usually conclude, to a certain extent, that which Mendelson says they have assumed all along. In order not to exceed the bounds of scaling theory, scalers must choose one of two alternatives. They may select for investigation only those groups of cases which they wish, according to criteria of their own choice; then, when these groups prove scalable, the scalers may conclude, with a margin for the effect of probability, that the votes were based on a single issue, but the scalers have absolutely no basis for any speculation as to what that issue was. The second alternative is to scale every conceivable group of cases, including groups randomly compiled. Then, if the scaler concludes from the scalability of, for example, a group of alien cases that the votes were based on relative sympathy for aliens, he must draw with equal certainty the conclusion that the Justices' attitudes toward some single question also decided their votes in any other group of cases that is scalable, whether the question is visible to the scaler or not. As stated, of course, this second alternative is impossible to follow, requiring as it does the construction of an infinite number of scalograms. In practice, what is called for is the open-minded, impartial scaling not only of groups of cases that are expected to be scalable, but also of groups expected not to be scalable, such as randomly selected groups and entire terms. Scalograms of the latter are necessary because what is important is not how scalable the cases involving, let us say, Communists are, but how much more scalable they are than any group of cases selected by lot. The scalers have followed neither of the alternatives outlined above.
They have chosen groups of cases with the discretion of the first alternative, and drawn conclusions with the breadth allowed by the second. Basically, the error of the scalers seems to have arisen from a misconception about the function of the scaling process. Scalers assume that a successful scalogram is a proof of something. It is really, however, nothing more than the expression in orderly form of an observed regularity. The name of this regularity is scalability. When we find that the votes in a certain collection of cases are characterized by this regularity, we may be inspired to formulate a hypothesis that would explain the regularity. Then it is our task to conduct further investigations, in such a way that our hypothesis is either proved or disproved. Scalers have fallen into the error of neglecting this last step. Instead of observing the regularity, making a guess about its cause, calling this guess a hypothesis, and proceeding to test their hypothesis, the scalers observe the regularity, guess as to its cause, but call their guess a definite explanation, and see no need to subject this explanation to any tests. Not all scalers, of course, adhere to the procedure described above: some use more care at times, and some use less. Schubert notes, for example, that in one article101 Ulmer constructed a scalogram in which changes in Court membership were handled by putting new Justices in the attitudinal positions of their predecessors.102 In the terminology of this paper, what should have been an unfilled scalogram was converted to a filled one by pretending that outgoing Justices and their successors were one and the same person. Clearly Ulmer was injecting a complicating question into the object of inquiry: do Justices tend to vote according to the same attitudes toward the same questions as their predecessors?
At a time when scalers are still struggling to prove that Justices vote on the basis of attitudes toward single questions in certain types of cases, the addition of another proposition to the burden of proof of scaling is premature. On the side of greater prudence, some scholars of the behavioralist school do not always commit the fault noted above of taking a scalogram for a final explanation of the votes involved. Instead they use scaling as an indicator, to discover cases and particular votes worthy of more detailed examination. In one article,103 Ulmer first scaled a group of civil rights cases and then looked non-quantitatively at those cases which evoked inconsistent votes and those which provided the breaking points for the Justices. Similarly, Harold Spaeth scaled a group of cases and examined the contents and the written opinions of those cases which elicited inconsistent votes.104 To at least two studies Mendelson's accusation is absolutely inapplicable, because scaling itself is used to test whether the principle of judicial self-restraint, believed by Mendelson to be supremely relevant, is really at work. In one article,105 Spaeth isolated, by a content-analytical process, 52 cases appearing to have been decided principally on the basis of Supreme Court activism versus self-restraint. He scaled the entire group, divided it into five subgroups, and scaled them. Although all of the scalograms were acceptable, scaling by subgroups reduced the total number of inconsistencies, suggesting that attitudes toward self-restraint differed with the kind of case, even if self-restraint was apparently the principal issue in all the cases. Spaeth tried to explain remaining inconsistencies by hypothesizing an attitudinal change by one Justice at a certain time and effects of other, intruding issues. Spaeth then, however, considered cases of the same time period which included, but were not dominated by, the issue of self-restraint.
In cases focusing on state regulation of business, for example, Douglas, Black, Warren, and Brennan, the activists in pure restraint cases, exercised deference to state power. Spaeth's conclusion was that, depending on the kind of case, the issue of self-restraint was treated as a first-, second-, or third-order consideration relative to the substantive issues involved. Joel Grossman has taken fellow scalers to task for their superficial treatment of Frankfurter's voting on civil liberties, and has offered, through a more thorough investigation by scaling, his own interpretation.106 The scalograms, said Grossman, that led other scalers to decide that Frankfurter was an anti-libertarian were inconclusive. It was necessary to consider Frankfurter's own explanation of his voting behavior, but other scalers had cast this explanation aside at the outset. Grossman presented and analyzed Frankfurter's stated reasons and, on this basis, defined a new value, "denial of judicial responsibility", or DJR. To discover whether this value was relevant to Frankfurter's voting, Grossman scaled all the cases in the 1958 and 1959 terms in which DJR, as defined, was explicitly raised as an issue. Frankfurter was the only Justice with a perfect pro-DJR voting record. Then Grossman scaled all the civil liberties cases in those two terms, in two groups: those in which DJR appeared as an issue, and those in which it did not. In the group involving DJR, Frankfurter voted against the civil liberties claim every time, that claim being in every case incompatible with the claim for DJR. In the civil liberties cases which did not involve DJR, however, Frankfurter voted sometimes for and sometimes against the civil liberties claim. In the scalograms for these cases, Frankfurter appears more sympathetic to civil liberties than Harlan and Clark in the 1958 term, and than Harlan, Clark, and Whittaker in the 1959 term, with 22% and 19%, respectively, of his votes cast for the claim.
Grossman warned, in his conclusion, that Frankfurter could have raised the issue of DJR in some cases where he did not do so, but it seemed that he voted according to DJR where that issue was raised. If so, decided Grossman, Frankfurter indeed was more, but only slightly more, libertarian than his votes would suggest if the issue of judicial self-restraint were ignored by the analyst. Schubert, too, has at times used an advanced procedure.107 His analysis then treats scaling more or less as a generator of hypotheses. A group of cases which is thought to be scalable is scaled, and certain inconsistencies arise. The Justices and cases responsible for the inconsistencies are studied non-quantitatively, in an effort to find a second issue which might have been operating, along with the issue originally presumed dominant, to determine the votes. Thus scaling ceases to be used as a test of the proposition that only one issue is significant. It is assumed, rather, that several issues enter into the votes, and a scalogram becomes merely a representation of the first approximation to reality, based on the most important of the issues. A second issue is hypothesized, and a second, more exact approximation is made. There is no necessary end to the series thus begun. Schubert has successively introduced up to five issues. At each stage, the cases and Justices are ranked on the new issue, in such a way as to help explain the hitherto inconsistent votes. There are two ways in which this kind of analysis, multivariate as opposed to simple Guttman scaling, can be performed. One way is to cease reliance on the scaling process after the first issue has been studied. When this method is used, the investigator reverts to traditional, non-quantitative examination. Heavy stress is laid on written opinions as a source of information on the subsidiary issues which have given rise to votes that do not fit in the scalogram.
This is the technique used by Schubert in a study of cases involving the question of civilian versus military control.108 When he scales the cases on the basis of this question and finds inconsistencies, he examines the opinions and concludes that a second issue was influential: stare decisis. Such an analysis, of course, is subject both to the criticisms of non-quantitative methods and to those of scaling, but to a modified degree in each case. Opinions are not accepted at face value, but are weighed along with votes, and the scalogram is not treated as a final explanation, but is supplemented by investigation of the opinions. The second kind of multivariate analysis uses a different approach from anything so far described. Instead of beginning with a Guttman scalogram, and adding issues one by one, the investigator assumes a certain number of issues (most commonly three) and, by a different process, scales the cases with respect to all of these issues simultaneously. When the number of issues assumed is three, the results are represented as a cube of space, and the cases and Justices, instead of being ordered along one line, are arrayed at various points in the space. This type of analysis, mainly employed by Schubert,109 involves many complexities and problems, and to treat it would require as much space as is here devoted to Guttman scaling. The method is very new in its application to the Supreme Court, the entire literature consisting of two recent articles by Schubert, and it seems reasonable to let such a new method mature somewhat before subjecting it to critical analysis. We have now reached the point at which an over-all evaluation of cumulative scaling as a method of analyzing Supreme Court Justices' attitudes is possible. For this purpose we must turn to those questions which we originally formulated as guides in this evaluation. We established four criteria for the usefulness of quantitative methods.
The first was that the methods must improve knowledge or understanding. On its face, scaling has done so to a great degree. By its use, long-accepted notions about the composition of the Justices' attitudes, and the kinds of attitudes which influence their votes, such as support of judicial self-restraint, stare decisis, and other legal principles, have been challenged, and the Supreme Court's voting patterns have been given quite new interpretations. The scalogram itself is an enlightening form for the presentation of information, even if we disregard the conclusions drawn from it. Some of the results of scaling, if they are accepted, constitute revolutionary additions to, and alterations in, our knowledge of the Supreme Court, while other scaling studies confirm the results of non-quantitative research. We must question, however, the validity of these results. What is the relation between facts and values in the work done with scaling? Certainly there is nothing so blatant as statements about how Justices should have voted or what attitudes they should have had. Such normative expressions have been effectively excluded. Only more subtle biases, such as the libertarian activism alleged by Mendelson, might be present. We can not be sure, however, that such a bias is active, because the failure to distinguish among legally technical differences and the consequent collection of all civil liberties cases under one heading, while perhaps due to a bias of libertarianism, might also be a result of other things, such as the belief, not that the Court ought to utilize legal technicalities merely as a tactic to advance ends dictated by values, but that it does so in fact. In other words, scalers may have drawn their conclusions partly because they investigated the facts in ways conducive to the achievement of the results that they expected. If this is so, we have a negative answer to the question whether scaling has eliminated the influence of subjectivity.
Furthermore, scalers rarely acknowledge the subjectivity of their findings: it must be uncovered by the critic of the scalers' methodology. As a further step in deciding whether scaling has added to knowledge or understanding, we must ask whether the method has led to a restriction of the problems or facts considered, or to the neglect of other methods. The most striking part of the answer to this question is the fact, already mentioned, that scaling makes use only of data on votes cast, and finds no room for opinions, nor for the insights offered by conditions surrounding the Court and the cases and by the ensuing effects of particular decisions. Indeed, scaling feeds not even on all votes, as has been shown, but only on those in nonunanimous cases. This severe limitation is prima facie evidence that scaling is not a self-sufficient tool for the investigation of the Justices' attitudes. We should expect scaling to be one among several methods. Scalers, however, often fail to supplement their scalograms with a substantial amount of any other kind of research before announcing their conclusions. If these conclusions were expressed as hypothetical conjectures to be tested by reference to further scaling and to data inaccessible by scaling, we could only acclaim the scaling process as one fruitful first step in the acquisition of knowledge and understanding of the Court. Only when scalers make use jointly of scaling and of non-quantitative methods--and they often do not--does scaling approach what seems its proper role. There is no apparent quality in scaling which could be blamed for the fact that scalers often ignore the necessity of supplementing scaling with other methods. Not the method, but its users, over-enthusiastic about its value, must be blamed. We must also ask ourselves whether the presentation of the results of scaling has been obscured by jargon, as critics charge. The potentially confusing word "inconsistency" has already been mentioned.
This and all other technical words used in scaling have, however, been clearly defined by their users, and the careful lay reader should have no difficulty with the terminology. Also, the results of scaling are generally presented in a clear, non-technical manner unlikely to give trouble to any reader. Whatever one's doubt as to the validity of the scalers' conclusions, there need be no uncertainty about what those conclusions are. Thus it is possible to say that scaling has indeed improved our knowledge and understanding of the attitudes of the Justices, but the method has often been mishandled, and as a result only a small part of the potential value of scaling has been realized. Our second criterion for the usefulness of a method was that it must represent some improvement over other methods. How does scaling compare with non-quantitative methods with respect to the questions just discussed? Non-quantitative studies are in general not free from the influence of the scholar's values. Traditionally, scholars of the Supreme Court have not been satisfied with pure description and explanation of the Court's behavior; "the dogma of some behavioral scientists, that value judgments are outside the pure stream of research, would be accepted by few legal scholars who had ever stopped to think about what they do and why they do it."110 McCloskey calls Douglas and Black dogmatists because he is interested in seeing the development of legal doctrine. There is more warmth in Rodell's description of Black as a man motivated by human situations than in his portrayal of Frankfurter as one guided by fear of an imprudent, offensive extension of the Court's concern, perhaps because Rodell believes that Black behaves as a Supreme Court Justice ought to behave. For every influence by values that we can detect, there are no doubt many that are too deeply hidden for our notice. Scholars almost never explicitly warn readers about the values which underlie their work.
The reader must do his best to find where facts and values have been confused, and, more often, where values have influenced the way in which the scholar perceived and presented his facts. The role of subjectivity in fact perception is necessarily great in non-quantitative analysis. In the typical pattern, the belief that votes are cast according to legal principles, the belief that written opinions represent true opinions, and the belief that opinions are consistent are influential in the conclusions drawn about Justices' attitudes. The vexing characteristic of non-quantitative studies is that it is impossible to determine the extent or the manner of the influence of the investigator's subjectivity, just as the influence of his values can never be clearly known. The writer states certain of his values and biases in perception and leaves others unmentioned. In his presentation he is forced to extract a few of the many relevant facts and to emphasize certain ones of those that he mentions. In the end, the reader can not trace the subjective sources of the author's conclusions, but must merely judge their plausibility. Here lies a very important difference between non-quantitative methods and scaling. Scaling is a thoroughly defined process based on a limited fund of information. Every step in the process can be checked by the reader, and any fallacies can be detected. Furthermore, the reader can put the method to his own use: after deciding how much can be expected from scaling, he can apply its rules to any problem of his own choice. By contrast, no such explicit rules are available for the use of non-quantitative methods. Therefore the common accusation that quantitative findings are made incomprehensible to the layman might well, in the case of Justices' attitudes, be turned back upon the users of traditional methods themselves.
It is precisely because the process by which scalers reach their results is clear that we have been able to detect the sources of the invalidity of those results and to recommend reforms in the process. No such recommendations could be made for the non-quantitative studies, because their processes are actually the occult thought processes of their authors, rather than an explicit set of rules. It is possible merely to conjecture about the sources of error in these studies. As has been suggested, non-quantitative methods seem characterized by a susceptibility to the intrusion of values and subjectivity, and the process of information gathering and reasoning, being implicit and obscure, is not likely to be immune from carelessness. Neither non-quantitative methods nor scaling is perfect, either in theory or in practice. Non-quantitative methods are plagued by normative and observational bias; this can be reduced by more careful formulation of procedures and more explicit description of chains of reasoning, but the essence of non-quantitative study is the freedom which the scholar has to let his expert mind roam over a huge range of information, fix on certain facts as important, and ignore others as trivial. Subjectivity cannot be entirely abandoned without destroying this essence. Scaling is beset by subjectivity also, but it is a product not of the intrinsic nature of the scaling process, but of the abuse of that process. If scaling theory is respected and rigor is added to the scalers' work, scaling can become a very valuable tool, because it will embody the objectivity that is necessarily lacking in non-quantitative work. On the other side of the coin, scaling relies on a small fraction of the information relevant to the Justices' attitudes, and non-quantitative methods are needed to make use of the vast majority of the information. Two criteria remain to be considered.
The third criterion is that the method give us information about interesting or important matters. In other comparisons of quantitative and non-quantitative methods, this criterion might be applicable. Here, however, we have chosen to study a quantitative method that has been used on a subject already of great interest to non-quantitative scholars. Scaling has therefore drawn attention neither toward nor away from the subject of the attitudes 102 of the Justices. The fourth criterion is that the method should not be harmful. It is of course impossible to measure the effects of scaling on people's attitudes or on the Supreme Court. No doubt scaling has convinced a certain number of persons that She members of the Court adhere to legal principles far less than they profess. (This conclusion has not been proven by the scalers, as has been demonstrated, but it may nevertheless be true.) We can decide as we wish whether the growth of the opinion that the Court is a political body rather than an applier of the law is a beneficial or a harmful trend. The yardstick theory may give the Court a shield of sanctity against the attacks of those dissatisfied with its decisions, and scaling may be rendering that shield less effective.* As yet, however, scaling is employed by so *"Let us suppose ... that in studying the political process's relationship to the Supreme Court, political science eventually demonstrates..." empirically that "the judge is In fact nothing more than a glorious rational izer of his own personal values. ...But what then? ...Is it not possible that the 'myth' of the 'objectivity' of the Court and courts is quite functional for stability and/or equilibrium? ... 
If Charles Black is correct in his opinion that the real genius of American politics has ben [sic] in the establishment of the legitimizing and checking functions of the Supreme Court, then such a scientific discovery made, verified, and popularized by political scientists could be positively disastrous."111 few scholars and is under such heavy attack that it seems unlikely that its effects on public opinion are at all significant. Likewise, there is no evidence that scaling has influenced the behavior of the Court itself. The Court continues to justify its decisions in legal terms, and seems to have made no effort to conform to the behavioralists' notion of consistency as opposed to the lawyers' conception.112 Thus the effects of scaling outside the academic world have probably been minuscule up to now. As for the future, there is no guarantee that scalers will continue to arrive at the same conclusions about Justices' attitudes, particularly if scalers take heed of current criticisms of their methods and if scalers and non-scalers alike come to recognize the value of utilizing both methods cooperatively. Two conclusions, in the way of advice to scalers and non-scalers, emerge. First, we have seen how extremely limited is the information put into the scaling process; what comes out is also necessarily limited. A non-quantitative scholar, given enough persistence, can discover all the patterns and trends that scaling might reveal (although perhaps not as efficiently). But scaling has one redeeming quality that sets it off sharply from non-quantitative methods: objectivity. This quality is in part inalienable from the process: biases injected into scaling are discoverable, as they are not when operative in non-quantitative scholarship. But, for the rest, only the user of scaling can retain the objectivity of which the method is capable. 
If he does, he will make scaling valuable; if he does not, he has destroyed its raison d'être and made the value of its use dubious indeed. Finally, we have described non-quantitative studies as falling into various strata of methodological sophistication. The reader has perhaps inferred from our description of scaling that studies using this method, too, are sophisticated to different degrees in their methods. In each case, the more sophisticated work is characterized by a wider range of source materials and methods of analysis. Above the three levels of sophistication heretofore presented, there ought to be a fourth, but we have been unable to find any examples of it. A study at the highly sophisticated level would make use of both sophisticated scaling and sophisticated non-quantitative methods. The powerful interpretive skill of the expert mind, working intuitively, and the vast area of information that mark non-quantitative scholarship at its best require that non-quantitative methods remain the core of study of the Justices' attitudes. The objective presentation of all votes by scaling, the possibility of formulating with complete precision the scaling procedure before it is applied, the convenience and clarity of scalograms, and the almost automatic tendency of scalograms to generate hypotheses--all these qualities should make scaling an almost indispensable aid in any thorough study in the area of Justices' attitudes. Non-quantitative methods and scaling possess nearly complementary features; their use together should produce more advanced knowledge of the Justices' attitudes than has so far been attained. In spite of its limitations and the faults in its application, scaling has been profitably used on the Supreme Court. Some studies, such as those of Grossman and Spaeth, have contributed to our understanding of Justices' attitudes in ways that would have been far more difficult without scaling. 
Other work with scaling has drawn conclusions that cannot be accepted as having been rigorously reached, but are at least made plausible enough to warrant further investigation. And one of the major contributions of scaling to the field has been to stimulate reaction on the part of non-scalers, in the form of more detailed re-examination of their own beliefs and the careful defense of those beliefs against the attacks of the scalers. Rashness, unsystematic procedure, and methodological solipsism have destroyed a large part of the potential fruit of the application of scaling to the Supreme Court, but we cannot therefore call scaling useless in this field of study. Scaling has been used with some profit, even if much of the profit is the byproduct of the resulting intra-disciplinary debate. But the greatest contribution to Supreme Court scholarship of the scaling done so far will certainly be that it has laid a foundation for the more mature use of scaling in the future, closely allied to non-quantitative methods and freed from the abuses that have marked its use in the past.

Notes

1. For discussions of what constitutes scientific methods, see: Heinz Eulau, "Segments of Political Science Most Susceptible to Behavioristic Treatment". In James C. Charlesworth (ed.), The Limits of Behavioralism in Political Science (Philadelphia, 1962), p. 32. Arthur S. Goldberg, "Political Science as Science". In Nelson W. Polsby et al., Politics and Social Life (Boston, 1963), pp. 30-3. Samuel J. Eldersveld et al., "Research in Political Behavior". In S. Sidney Ulmer (ed.), Introductory Readings in Political Behavior (Chicago, 1961), pp. 8 &. Richard C. Snyder, "Experimental Techniques and Political Analysis". In Charlesworth, op. cit., pp. 115-22. Polsby, op. cit., pp. 4-8. Robert A. Dahl, "The Behavioral Approach in Political Science". In Polsby, op. cit., pp. 18-21. David Easton, "The Current Meaning of 'Behavioralism' in Political Science". In Charlesworth, op. cit., pp. 
7-8.
2. Polsby, op. cit., pp. 8-14.
3. Ibid. See also Paul F. Lazarsfeld, "The American Soldier". In Polsby, op. cit., pp. 37-8.
4. Ibid.
5. Karl W. Deutsch, "The Limits of Common Sense". In Polsby, op. cit., p. 52.
6. Bertrand de Jouvenel, "On the Nature of Political Science". In American Political Science Review, vol. LV, no. 4 (Dec. 1961), pp. 773-4.
7. Mulford Q. Sibley, "The Limitations of Behavioralism". In Charlesworth, op. cit., pp. 70-1.
8. Leo Strauss, "An Epilogue". In Herbert J. Storing (ed.), Essays on the Scientific Study of Politics (New York, 1962), pp. 311-27. See also: Walter Berns, "The Behavioral Sciences and the Study of Political Things". In American Political Science Review, vol. LV, no. 3 (Sept. 1961), pp. 558-9. David Easton, The Political System (New York, 1963), ch. 9.
9. Goldberg, op. cit., pp. 27-8.
10. F.A. Hayek, The Counter-Revolution of Science (Glencoe, 1952), pp. 25-35.
11. Goldberg, op. cit., pp. 27-8.
12. Hayek, op. cit., pp. 44-51.
13. Sibley, op. cit., pp. 71-8.
14. Ibid., p. 70.
15. Russell Kirk, "Is Social Science Scientific?" In Polsby, op. cit., pp. 60-4.
16. Strauss, op. cit., p. 312.
17. Ibid., p. 316.
18. Ibid., pp. 311-37.
19. Berns, op. cit., p. 556.
20. Kirk, loc. cit.
21. Strauss, op. cit., pp. 320 & 322-7.
22. Hayek, op. cit., pp. 50-1.
23. Kirk, loc. cit.
24. Wallace Mendelson, "The Neo-Behavioral Approach to the Judicial Process: a Critique". In American Political Science Review, vol. LVII, no. 3 (Sept. 1963), pp. 593-603.
25. Strauss, op. cit., pp. 312-3.
26. Hayek, op. cit., pp. 53-6.
27. Eulau, op. cit., p. 31. Roger Hilsman, "The Foreign Policy Consensus". In Polsby, op. cit., p. 393. H. Douglas Price, "Are Southern Democrats Different?" In Polsby, op. cit., pp. 740-4.
28. Thomas Morgan, "The People Machine". In Harper's Magazine.
29. Sibley, op. cit., pp. 89-92.
30. Hayek, op. cit., pp. 94-102.
31. Strauss, op. cit., pp. 318-9.
32. de Jouvenel, op. cit., pp. 773-4.
33. Ibid., p. 776.
34. 
Alpheus Thomas Mason, The Supreme Court: Palladium of Freedom (Ann Arbor, 1962), p. 175.
35. Ibid., pp. 175-6.
36. C. Herman Pritchett, The Roosevelt Court (New York, 1948), pp. 15-6.
37. Ibid., p. 19.
38. Ibid., p. 16.
39. For example: David Fellman, "Constitutional Law in 1957-1958". In American Political Science Review, vol. LIII, no. 1 (March 1959), pp. 138-80. Fellman, "Constitutional Law in 1959-1960". In American Political Science Review, vol. LV, no. 1 (March 1961), pp. 112-35. "The Supreme Court: 1958 Term". In Harvard Law Review, vol. 73, no. 1, pp. 840-940.
40. Francis P. McQuade et al., "Mr. Justice Brennan and His Legal Philosophy". In Notre Dame Lawyer, vol. XXXIII, no. 3, pp. 321-49.
41. Ibid., p. 326.
42. Mason, op. cit.
43. Ibid., p. 169.
44. Ibid., p. 173.
45. Ibid., p. 169.
46. Helen Shirley Thomas, Felix Frankfurter: Scholar on the Bench (Baltimore, 1960).
47. Ibid., pp. ix-x.
48. Ibid., pp. 105-9.
49. Ibid., p. 218.
50. Ibid., pp. 256-8.
51. Ibid.
52. Ibid., p. 115.
53. Ibid., pp. 19-21.
54. Daniel M. Berman, "Mr. Justice Brennan: a Preliminary Appraisal". In The Catholic University of America Law Review, vol. VII, no. 1 (Jan. 1958), pp. 1-15.
55. Ibid., p. 13.
56. Ibid., p. 14.
57. Ibid., p. 15.
58. Marlin M. Volz, "Mr. Justice Whittaker". In Notre Dame Lawyer, vol. XXXIII, no. 2 (March 1958), pp. 159-77.
59. Ibid., p. 176.
60. John M. Harlan, "The Frankfurter Imprint as Seen by a Colleague". In Harvard Law Review, vol. 76, no. 1 (Nov. 1962), pp. 1-2.
61. Ibid., p. 2.
62. Robert G. McCloskey, "Deeds without Doctrines: Civil Rights in the 1960 Term of the Supreme Court". In American Political Science Review, vol. LVI, no. 1 (March 1962), pp. 71-89.
63. 367 U.S. 203 (1961).
64. Clyde E. Jacobs, Justice Frankfurter and Civil Liberties (Berkeley, 1961).
65. Ibid., p. 14.
66. Ibid., pp. 35-6.
67. 341 U.S. 494 (1951).
68. Jacobs, op. cit., p. 127.
69. Ibid., pp. 120-1.
70. 354 U.S. 234 (1957).
71. Jacobs, op. cit., p. 127.
72. Ibid., p. 154.
73. 
Fred Rodell, "For Every Justice, Judicial Deference Is a Sometime Thing". In Georgetown Law Journal, vol. 50, no. 4 (summer 1962), pp. 700-8.
74.
75. Rodell, op. cit., pp. 706-7.
76. Ibid., pp. 702-3.
77. Ibid., pp. 703-6.
78. Ibid., pp. 707-8.
79. Ibid., p. 708.
80. Wallace Mendelson, Justices Black and Frankfurter: Conflict on the Court.
81. Pritchett, op. cit.
82. For a survey of the work that has been done, see Glendon Schubert, "Behavioral Research in Public Law". In American Political Science Review, vol. LVII, no. 2 (June 1963), pp. 433-45.
83. Glendon A. Schubert, Quantitative Analysis of Judicial Behavior (Glencoe, 1959; hereafter cited as Q), p. 305.
84. Ibid., p. 301.
85. Ibid., pp. 322-38.
86. Ibid., pp. 338-63.
87. Ibid., pp. 352-3.
88. Ibid., p. 354.
89. Harold J. Spaeth, "An Approach to the Study of Attitudinal Differences as an Aspect of Judicial Behavior". In Midwest Journal of Political Science, vol. V, no. 2 (May 1961), pp. 165-80.
90. S. Sidney Ulmer, "Supreme Court Behavior and Civil Rights". In Western Political Quarterly, vol. 13, no. 2 (June 1960), pp. 288-311.
91. Schubert, Q, p. 298.
92. Schubert, "From Public Law to Judicial Behavior". In Schubert (ed.), Judicial Decision-Making (New York, 1963), pp. 1-10.
93. Schubert, Q, pp. 280-90.
94. Ibid., p. 352.
95. Ibid., p. 353.
96. Mendelson, "The Neo-Behavioral...", p. 597.
97. Pritchett, op. cit., p. 33.
98. Schubert, Q, p. 314.
99. Mendelson, "The Neo-Behavioral...", p. 598.
100. Ibid., p. 596.
101. Ulmer, "A Note on Attitudinal Consistency in the United States Supreme Court". In Indian Journal of Political Science, vol. 22 (1961), pp. 195-204.
102. Schubert, "Behavioral...".
103. Ulmer, "Supreme Court...".
104. Spaeth, "An Approach...".
105. Spaeth, "Judicial Power as a Variable Motivating Supreme Court Behavior". In Midwest Journal of Political Science, vol. VI, no. 1 (Feb. 1962), pp. 54-82.
106. Joel B. Grossman, "Role-Playing and the Analysis of Judicial Behavior: the Case of Mr. 
Justice Frankfurter". In Journal of Public Law, vol. 11, no. 2 (1962), pp. 285-309.
107. See, for example, Schubert, "Civilian Control and Stare Decisis in the Warren Court". In Schubert, Judicial Decision-Making.
108. Ibid.
109. See Schubert, "Judicial Attitudes and Voting Behavior: the 1961 Term of the United States Supreme Court". In Law and Contemporary Problems, vol. XXVIII, no. 1 (winter 1963), pp. 100-42.
110. Ralph S. Brown, Jr., "Legal Research: The Resource Base and Traditional Approaches". In The American Behavioral Scientist, vol. VII, no. 4 (Dec. 1963), pp. 3-7.
111. Theodore L. Becker, "On Science, Political Science, and Law". In American Behavioral Scientist, vol. VII, no. 4 (Dec. 1963), pp. 11-5.
112. Hans W. Baade, Foreword to Jurimetrics, i.e. Law and Contemporary Problems, vol. XXVIII, no. 1 (winter 1963), pp. 3-4.

Selected Bibliography

Scientific Methods in the Study of Politics:
Charlesworth, James C. (ed.) The Limits of Behavioralism in Political Science, Philadelphia, 1962.
Hayek, F.A. The Counter-Revolution of Science, Glencoe, 1952.
de Jouvenel, Bertrand. "On the Nature of Political Science". In American Political Science Review (Dec. 1961).
Polsby, Nelson W., et al. Politics and Social Life, Boston, 1963.
Storing, Herbert J. (ed.) Essays on the Scientific Study of Politics, New York, 1962.
Ulmer, S. Sidney (ed.) Introductory Readings in Political Behavior, Chicago, 1961.

Non-quantitative Studies of the Supreme Court:
Jacobs, Clyde E. Justice Frankfurter and Civil Liberties, Berkeley, 1961.
McCloskey, Robert G. "Deeds without Doctrines: Civil Rights in the 1960 Term of the Supreme Court". In American Political Science Review, vol. LVI, no. 1 (March 1962), pp. 71-89.
Rodell, Fred. "For Every Justice, Judicial Deference Is a Sometime Thing". In Georgetown Law Journal, vol. 50, no. 4 (summer 1962), pp. 700-8.
Thomas, Helen Shirley. Felix Frankfurter: Scholar on the Bench, Baltimore, 1960. 
Quantitative Studies of the Supreme Court:
Grossman, Joel B. "Role-Playing and the Analysis of Judicial Behavior: The Case of Mr. Justice Frankfurter". In Journal of Public Law, vol. 11, no. 2 (1962), pp. 285-309.
Schubert, Glendon A. (ed.) Judicial Decision-Making, New York, 1963.
Schubert, Glendon A. Quantitative Analysis of Judicial Behavior, Glencoe, 1959.
Spaeth, Harold J. "Judicial Power as a Variable Motivating Supreme Court Behavior". In Midwest Journal of Political Science, vol. VI, no. 1 (Feb. 1962), pp. 54-82.

Miscellaneous:
Jurimetrics, i.e. Law and Contemporary Problems, vol. XXVIII, no. 1 (winter 1963).
Mendelson, Wallace. "The Neo-Behavioral Approach to the Judicial Process: A Critique". In American Political Science Review, vol. LVII, no. 3 (Sept. 1963), pp. 593-603.
Schubert, Glendon. "Behavioral Research in Public Law" (bibliographical essay). In American Political Science Review (June 1963), pp. 433-45.