COMPUTERS and AUTOMATION for October, 1960
The Appetite For Instant News
Election Predictions
Stephen E. Wright
Applied Data Research
Princeton, N. J.
(Based on a report given at the meeting of the Association for Computing Machinery, Milwaukee, Wisc., August 26,
1960)
As a new presidential campaign begins to gather steam, the country will soon be caught up in a rising wave
of excitement. The horses are making the final turn, the stretch run will soon begin. As in recent years, the culmination
of the race on Election Eve will be watched by that new phenomenon, the electronic tout. All 3 major networks will
employ computers in a desperate race to report the news before it happens.
Predicting elections on computers must rank high in the list of useless activities, Professor Jackson's Bridge-Playing
Program notwithstanding. Outside the gambling profession, there is no possible benefit in making a good guess 6
or 8 hours before the vote is known with certainty. Besides, why should people in their right minds stick their
necks out before millions of kibitzers when events may prove them wrong before the night is out?
The answer lies in our appetite for instant news, magnified by the power of the broadcasting industry. A public
that waited in agony for news about Princess Margaret's wedding gown cannot be allowed to learn the identity of
our next President from the morning papers.
And so it was that in November, 1952, UNIVAC marched, or stumbled, onto the political stage. Its advent was made
propitious by the extreme caution that its predecessors, the pollsters, displayed that year. That election took
place, you will recall, during the final months of the administration of President Dewey.
Besides, Gallup and Roper are only human, whereas the electronic computer is endowed with big magic. The admiration
of the public for the giant brain is mixed with the sly hope that it will fall on its face and thus confirm the
ultimate dominance of man over machine. I must also mention the undeniable fact that statistics somehow smacks
of un-Americanism. Obviously, the ability to predict the future action of people is a negation of the democratic
process, leading to thought control and socialism.
Needless to say, these factors increased the news value of the computer's performance to the television networks.
This interest of the networks was matched by that of Remington Rand who undertook the election project in 1952
in the hope of publicizing their then new prodigy. In this they succeeded brilliantly; overnight "IBM's UNIVAC"
became a household word. It is unfortunate that the household market was not ready for this product.
Now let us dismiss for a moment the frivolous nature of this activity and see how such predictions are actually
made.
Our historical model is the old-time wardheeler (nowadays even poll-watchers are statesmen), who watched the votes
come in from the doubtful precincts, those with real live voters. He might say "We got 200 more votes in Ward
16 than this time last year. With that kind of lead we usually pile up a majority of 2,000 in the district. They
have a lead of 3,000 in the suburbs, but our vote always comes in late there. I better go pay my respects to the
new Mayor."
In national elections, the volume of data is too great and the vote comes in too randomly to make evaluations at
the ward and precinct level. But some basic ideas behind this analysis are used: the extrapolation of the current
vote in time, the comparison with past results, and the extrapolation in space. And of course, the computer can
make a quantitative analysis, both of past data and current returns.
It thrives on digesting large chunks of data and emitting simple-minded summaries. It does so quickly and accurately.
Whether it also does so correctly depends on the statistical model used, the skill of the programmers, and, I must
confess, luck.
Probability being what it is, a perfectly sound prediction may be wrong, and in making a flat prediction between
two candidates, to be wrong is to be 100% wrong, to the delight of newspaper editors, political pundits, and broken
down horseplayers.
In the previous elections, we were lucky; but, with the exception of the contests for the 1954 Congress, the races
were not too close. (I find myself developing a callousness toward politics, to the point where I hope for a landslide
victory, regardless of who wins. And I have become a fervent advocate of the two party system; I shudder at the
thought of predicting 3 way races!)
The statistical model we used in predicting elections on UNIVAC was developed by Dr. Max Woodbury of New York Univ.
Since Dr. Woodbury is currently on a good will tour of Europe and can't fight back, I must warn you that my knowledge
of statistics is not only negligible but strongly biased. As a defrocked electronic engineer, I have always suspected
that there is something fishy about probability theory. I had many battles over the years with Dr. Woodbury in
programming his statistical models, battles that I fought for common sense versus statistics; it is galling to
admit that he was always right, so far.
The kind of model we choose is restricted by the nature of the available data. It is desirable to get final votes
from a number of properly chosen key precincts or districts, politically stable and uniformly distributed among
the major regions of the nation. As this kind of data is available only after the last viewer has gone to bed,
we must make do with anything that we can get, as early as possible. This means that we must rely mainly on the
two major wire services, supplemented by special phone and teletype reports.
The wire services report national races primarily in the form of totals for each candidate in a given state, and
the number of precincts reporting.
A typical return from the last presidential election is: Vermont 243 precincts out of 720, Eisenhower 5120, Stevenson
30. In states with large metropolitan areas, the vote may be broken down by city and upstate.
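As an illustration only, such a return might be pictured as a small record; the notation below is Python, and the field names are our own invention, not anything used by the wire services or by the election programs:

# A hypothetical record for one wire-service return, in the form described above.
vermont = {
    "state": "Vermont",
    "precincts_reporting": 243,
    "precincts_total": 720,
    "dem_vote": 30,       # Stevenson
    "rep_vote": 5120,     # Eisenhower
}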
Information about finer subdivisions is theoretically available, but it is usually excluded: there are limitations in feeding huge volumes of data, all manually transcribed, to the computer; difficulties in analyzing past data; etc.
Hence the state is our basic unit in handling predictions in Presidential and Senatorial Elections.
Within each state, we have two possibilities:
- If some reports from this state are available, we ask: knowing the 2-party vote at this time, what will be the distribution of the final vote?
- If no reports from this state are available, we ask: knowing what the trend is in the reporting states, and knowing how this state voted in the past, what will be the distribution of the final vote?
That is, within each reporting state, we (1) extrapolate the current votes of each party to the final vote; (2) compute the predicted Democratic percent and compare it with past averages in this state; (3) compute a difference; (4) summarize these differences to produce a national trend; and (5) apply this trend to non-reporting states to predict their Democratic percents. This description is of course a greatly simplified version of the model actually used. In practice, these curves have a scatter that introduces uncertainty, and this leads us to estimate the precision of the extrapolation, a precision that increases as more precincts are counted.
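To make steps (1) through (5) concrete, here is a rough sketch in Python, using one record per reporting state of the kind described above; every name is invented for illustration, and the weighting, extrapolation curves, and precision estimates of the model actually used are far more elaborate:

# Hypothetical sketch of the state-by-state model; not the actual election program.

def predict_nation(returns, past_dem_pct):
    """returns:      list of records for reporting states, like the example above
       past_dem_pct: {state: average Democratic percent in past elections}"""
    predictions, diffs, weights = {}, [], []

    for r in returns:
        two_party = r["dem_vote"] + r["rep_vote"]
        if two_party == 0:
            continue
        # (1)-(2) extrapolate to the final vote and take the Democratic percent
        # (a simple proportional extrapolation leaves the percent unchanged)
        dem_pct = 100.0 * r["dem_vote"] / two_party
        predictions[r["state"]] = dem_pct
        # (3) the difference from this state's past average ...
        diffs.append(dem_pct - past_dem_pct[r["state"]])
        # ... weighted by how complete the count is (the "precision" above)
        weights.append(r["precincts_reporting"] / r["precincts_total"])

    # (4) summarize the weighted differences into a single national trend
    trend = sum(d * w for d, w in zip(diffs, weights)) / sum(weights) if weights else 0.0

    # (5) apply the trend to the states that have not yet reported
    for state, pct in past_dem_pct.items():
        predictions.setdefault(state, pct + trend)

    return predictions, trend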
A state prediction is not all black and white: even when some votes have been reported from a state, we also consider the trend in the other states. Many states are broken down into city and upstate votes, and we combine the predictions.
We thoroughly analyze past data. We have information back to 1944 for states and back to 1952 for some metropolitan
areas. We make use of this information for assigning weights. Other objective factors besides past history could
be taken into account, such as incumbency. Some subjective factors have been suggested: stands of candidates on
the farm issue, labor, local issues, ... but no quantitative measure exists. Time limitations have prevented
the inclusion of these factors.
Gathering past data is an enormous job, but there are some references: Scammon's "America Votes", the
Clerk of the House, etc. Data analysis must be repeated in each election, since the preceding elections furnish
the most reliably correlated data. Analysis of past data is backbreaking work even with computers. A mathematician
like Dr. Woodbury is never satisfied with past programs, but always seeking "minor changes" to "improve"
the program. Until you've programmed computers, you'll never know how major a minor change can be!
The program for the computer deals with checking the correctness of the data, especially mistakes in the teletype and telephone returns: mistakes in votes, in races, in areas. If there is an extra digit in the vote count, either
the vote count is corrected or it is excluded; perhaps the parties have been reversed; the wrong area may have
been reported; or the figures may have been invented.
We have to rely on human beings to transcribe the information into a form that the computer can accept. Therefore
we have the information typed 3 times, and the computer accepts the information on a best 2 out of 3 basis. It
is hoped that by 1962 computers capable of direct input will be available.
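A minimal sketch of such a 2-out-of-3 acceptance rule, again in Python and with invented names, might read:

# Hypothetical sketch of the "best 2 out of 3" rule for hand-typed returns.

def reconcile(copy_a, copy_b, copy_c):
    """Each argument is one typist's transcription of the same report,
       e.g. ("VT", 243, 5120, 30). A field is accepted when at least two
       of the three copies agree on it; otherwise it is flagged for re-entry."""
    accepted, disputed = [], []
    for i, (a, b, c) in enumerate(zip(copy_a, copy_b, copy_c)):
        if a == b or a == c:
            accepted.append(a)
        elif b == c:
            accepted.append(b)
        else:
            accepted.append(None)   # no two copies agree on this field
            disputed.append(i)
    return accepted, disputed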
The computer checks that the total vote reported in the area this time is not less than the total vote reported
last time. It also checks:
that the number of precincts reporting is greater than last time;
that the total number of precincts reporting is less than the total number of precincts in the area;
that the total vote is reasonable;
that the total Democratic percent is reasonable.
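For illustration, those checks might be sketched as follows; the thresholds and field names are ours, not the actual program's:

# Hypothetical sketch of the consistency checks listed above.

def return_is_plausible(new, old, precincts_total, max_total_vote,
                        dem_pct_low=20.0, dem_pct_high=80.0):
    """new, old: the current and previously accepted reports for the same area,
       each a dict with 'dem_vote', 'rep_vote', and 'precincts_reporting'."""
    total_new = new["dem_vote"] + new["rep_vote"]
    total_old = old["dem_vote"] + old["rep_vote"]
    dem_pct = 100.0 * new["dem_vote"] / total_new if total_new else 0.0
    return (total_new >= total_old                                        # vote never decreases
            and new["precincts_reporting"] > old["precincts_reporting"]   # more precincts than last time
            and new["precincts_reporting"] <= precincts_total             # no more precincts than exist
            and total_new <= max_total_vote                               # total vote is reasonable
            and dem_pct_low <= dem_pct <= dem_pct_high)                   # Democratic percent is reasonable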
This year not only the major networks but three of the major computer manufacturers are caught up in the prediction
rat race. In fact, election prediction is now a required part of the news coverage. So far have we come in eight
years!