
Printing Chemical Structures Electronically: Encoded Compounds Searched Generically with IBM-702
Pages 711-730



From page 711...
... M DE BACKER Objective In a paper entitled "Routine Report Writing by Computer" by Waldo, Gordon, and Porter published recently in American Documentation, announcement was made of an industrial research report writing system where a large-scale computer functioned as its core.
From page 712...
... The output may be a two-dimensional printed structure easily recognizable by trained chemists. Both of these advantages were specified requirements when the project was undertaken.
From page 713...
... To avoid these blackouts in our technical communication system, we believe that any new system that may be invented has no hope for success unless it begins with the bench worker, passes through all the necessary steps, and concludes with a future user of the recorded facts. Thus to invent a code for chemical compounds without having in mind at the time the chemist at the laboratory bench doing research, the report writer, the documentalist, the current report reader, and the ultimate retrospective searcher is a mistake.
From page 714...
... We believe that this paper describes an entire chemical documentation system from the generation of the information to its ultimate use at a later date. Of course, the key is the storage and retrieval of chemical structures in a manner that meets these criteria.
From page 715...
... TABLE 1
1 = single bond
2 = double bond
3 = triple bond
7 = ionic bond, designating salts, complexes, etc.
8 = the point of attachment bond, where there is an indeterminate number of C atoms in a chain
9 = the point of attachment is in doubt where there exists indeterminate geometrical isomerism
' = a special bond symbol, used in rare cases to indicate single bonds only.
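The bond codes of Table 1 can be pictured as a simple lookup table. The following is only a minimal sketch in a modern language, not the original IBM 702 representation; the names `BOND_CODES` and `describe_bond` are hypothetical and are introduced here purely for illustration.

```python
# Hypothetical lookup table mirroring Table 1. The codes and their meanings
# come from the table; the dictionary layout and names are assumptions.
BOND_CODES = {
    "1": "single bond",
    "2": "double bond",
    "3": "triple bond",
    "7": "ionic bond (salts, complexes, etc.)",
    "8": "point of attachment bond, indeterminate number of C atoms in a chain",
    "9": "point of attachment in doubt (indeterminate geometrical isomerism)",
    "'": "special bond symbol, single bonds only (rare cases)",
}


def describe_bond(code: str) -> str:
    """Return the Table 1 meaning of a bond code, or raise ValueError."""
    try:
        return BOND_CODES[code]
    except KeyError:
        raise ValueError(f"unknown bond code: {code!r}") from None
```

A code that does not appear in Table 1, such as "4", raises a `ValueError` rather than being silently accepted.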
From page 716...
... Boxing and punctuating make it a simple matter for the keypunch operator to enter the structure onto a punchcard. All structures are handled in an analogous manner.
From page 717...
... WALDO AND DE BACKER, Generic Search of Encoded Compounds ... becomes -C2 C .
From page 718...
... Furthermore, we have limited the number of rows arbitrarily to thirteen to accommodate our report-writing program (5). To make the search program somewhat easier we have ruled further that ...
From page 719...
... All data cards to be added to the structure file are entered in the computer along with this program load deck. Each group of cards comprising a complete structure is transcribed through the computer to magnetic tape into the form of a variable length record.
From page 720...
... The rules established for coding structures are integrated in the program so that the computer is able to take a fairly sophisticated look at the chemist's coding and the keypunch operator's work. It will not allow any atom to have too many or too few bonds, nor is a "7" bond code permissible with atoms for which ionic bonds are not "legal." Improper atom and bond codes and misplaced characters are recognized by the computer, as are various other types of errors.
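The kinds of checks described above can be illustrated with a short sketch. This is not the 702 program: the valence table, the set of atoms allowed an ionic bond, the decision to count a "7" bond as one bond, and the names `VALENCE`, `IONIC_OK`, and `check_structure` are all assumptions made for illustration, and every atom and bond is assumed to be written out explicitly.

```python
# Assumed valences for a few elements (not taken from the paper).
VALENCE = {"H": 1, "C": 4, "O": 2, "Na": 1, "Cl": 1}
# Assumed set of atoms for which an ionic ("7") bond is "legal".
IONIC_OK = {"Na", "Cl", "O"}
# Bond codes handled in this sketch; counting "7" as one bond is an assumption.
BOND_ORDER = {"1": 1, "2": 2, "3": 3, "7": 1}


def check_structure(atoms, bonds):
    """Return a list of error messages; an empty list means no error found.

    atoms: list of element symbols, e.g. ["H", "O", "H"]
    bonds: list of (i, j, code) tuples joining atoms[i] and atoms[j]
    """
    errors = []
    totals = [0] * len(atoms)
    for i, j, code in bonds:
        if code not in BOND_ORDER:
            errors.append(f"improper bond code {code!r} between atoms {i} and {j}")
            continue
        if code == "7":
            for k in (i, j):
                if atoms[k] not in IONIC_OK:
                    errors.append(f"ionic bond not legal for {atoms[k]} (atom {k})")
        totals[i] += BOND_ORDER[code]
        totals[j] += BOND_ORDER[code]
    for k, (symbol, total) in enumerate(zip(atoms, totals)):
        if symbol not in VALENCE:
            errors.append(f"improper atom code {symbol!r} (atom {k})")
        elif total > VALENCE[symbol]:
            errors.append(f"too many bonds on {symbol} (atom {k})")
        elif total < VALENCE[symbol]:
            errors.append(f"too few bonds on {symbol} (atom {k})")
    return errors
```

For example, `check_structure(["H", "O", "H"], [(0, 1, "1"), (1, 2, "1")])` reports no errors, while an oxygen carrying three single bonds is flagged as having too many.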
From page 721...
... Likewise oxygen has only two covalent bonds. However, nitrogen, phosphorus, and sulfur are not so easily handled, and a set of rather sophisticated rules is written into the molecular formula program so that the computer can recognize amines from azides, sulfites from sulfonates, and phosphites from phosphonates, etc.
From page 722...
... Thus, a single file, consisting of only a few reels, contains all information necessary for searching and printing many thousands of chemical structures.
[Figure labels: Controls; Structure data]
Preparing the search question
A chemist proposing a search of the chemical structure files for a specific substructure or chemical moiety must state precisely the elements, bonds, connections, and so forth, he desires to locate.
From page 723...
... Little education is needed for a human being to code the chemical structures for input to the computer, but this is counterbalanced by the fact that much education is necessary for the computer to locate substructures in the coded form. The advantage is, of course, that the chemist's time is available for more important duties that computers cannot perform; and our educated computer must be put to work.
From page 724...
... Depending on the nature of the search question, and control information given the computer, the search continues until a match is found, or until the whole structure has been scanned, no equality found, and the structure rejected. Figure 3 is the basic flow of the manner in which the 702 attempts to find what it wants.
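The match-or-reject behavior described above can be sketched as a backtracking search over atoms and bonds. This is a generic graph-matching illustration and not the row-by-row scan performed by the 702, whose details are not given in this excerpt; the function name `find_substructure` and the atom and bond list format are assumptions.

```python
def find_substructure(q_atoms, q_bonds, s_atoms, s_bonds):
    """Try to locate the query inside the structure.

    Atoms are lists of element symbols; bonds are (i, j, code) tuples.
    Returns a list in which position q holds the structure atom matched to
    query atom q, or None if the whole structure is scanned without a match.
    """
    def bond_table(bonds):
        # Store each bond under both orientations for easy lookup.
        table = {}
        for i, j, code in bonds:
            table[(i, j)] = code
            table[(j, i)] = code
        return table

    q_tab = bond_table(q_bonds)
    s_tab = bond_table(s_bonds)

    def extend(mapping):
        q = len(mapping)              # next query atom to place
        if q == len(q_atoms):
            return list(mapping)      # every query atom matched
        for s in range(len(s_atoms)):
            if s in mapping or s_atoms[s] != q_atoms[q]:
                continue
            # Every query bond back to an already placed atom must exist
            # in the structure with the same bond code.
            consistent = all(
                s_tab.get((mapping[p], s)) == q_tab[(p, q)]
                for p in range(q)
                if (p, q) in q_tab
            )
            if consistent:
                mapping.append(s)
                found = extend(mapping)
                if found is not None:
                    return found
                mapping.pop()         # dead end: back up and try another atom
        return None                   # no equality found: reject

    return extend([])
```

With a structure skeleton `["C", "C", "O", "O"]` bonded as `[(0, 1, "1"), (1, 2, "2"), (1, 3, "1")]`, a query for a carbon doubly bonded to oxygen succeeds, whereas a query for a carbon-carbon triple bond returns `None` and the structure is rejected.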
From page 725...
... Flow diagram. Searching and printing chemical structures.
From page 726...
... they feel that the price of changing the conventions from the classical to the computer-printed form is a small price to pay for the ability to make generic searches among thousands of compounds and to have the resulting structures immediately available for study. In accord with our stated policy at Monsanto of making this research work in documentation pay for itself in each step along the way, we have been successful in using the computer to write one-page laboratory reports from data supplied to the keypunch operators from the laboratory notebooks of several scientists evaluating uses for compounds as they are synthesized.
From page 727...
... cards accurately reproduced by a machine. However, probably the most important advantage in making repetitive use of test data stored on punched cards is the potential of releasing thousands of man-hours of highly skilled scientific personnel from the drudgery part of report writing, that of copying reams of data over again.
From page 728...
... This program will answer a further bonus question. The question was to examine all the data for a given test during 1957 and perform a minor calculation on every piece of data recorded in this test and then prepare a table of compound number and name with these calculated results on an offset master properly paginated so that the sheets, when duplicated, can simply be punched and inserted into the scientist's final report.
From page 729...
... Finding erroneous data stored by accident, through human error in recording or punching or through the very infrequent machine error, is left entirely to chance. We have found these errors by tedious proofreading of the output, but this can only be done on a small scale.
From page 730...
... There may be other applications of the topological method we have employed here for the storage of chemical structures. It seems to us to be applicable to any series of simple systematic diagrams, such as electric circuits, or the diagrams used in architecture, anatomy, botany, and geology.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.