An Assembly Language for Reprogramming
Marvin Lowell Graham and Peter Zilahy Ingerman*
Westinghouse Electric Corporation,† Baltimore, Maryland
Complete reprogramming of compiler language programs is seldom necessary. It is assembly language programs which present the greatest difficulty. Assembly languages generally provide a one-for-one translation from a symbolic to a numeric version of a program, that is, from assembly language to machine language. The meta-language presented here can be used to specify the mapping of any language which conforms to a canonical list form into an arbitrary stream of bits. This bit stream may be treated as a machine language program, a character stream, or whatever else the user might desire. Thus, this meta-language can be used to map from one assembly language into another or from the assembly language for one machine into the machine language of another.
Introduction
Halpern [1] has implemented an assembly meta-language which expects virtually every line of a program to be either a macro call or a contribution to the definition of a macro. In Halpern's system, the parameterization within the macros is performed in terms of symbolic qualities. Ferguson [2], on the other hand, has implemented an assembly meta-language in which the parameterization within the macros operates in terms of the execution time values of the symbols used in the macro call. This paper specifies an assembly meta-language based on the work of Ferguson, but in which the parameterization can at any point be either by symbol or by value. This system is currently being implemented at the Westinghouse Defense and Space Center.

The extension which is probably the most important in reprogramming applications is the ability to inspect the lines preceding and following any line during the processing of that line. Consequently, translation from an assembly or machine language program for a multiple address machine is feasible, and vice versa. Disassembly is also possible, since the bit stream of the input machine language program can be inspected and translation back to a suitable symbolic assembly language performed. It is imperative to recognize that the translation would necessarily be performed uniformly and without bias on the entire machine language bit stream, both the "instructions" and the "data". The symbolic result of this translation will be nothing more than a reasonable facsimile of what the program would have looked like had the original programming been done in the language to which the translation was made. Since each word must be translated as an instruction, those words which are used entirely as data would nonetheless be translated into instructions.

Presented at the ACM Reprogramming Conference, Princeton, New Jersey, June 1965, co-sponsored by the Association for Computing Machinery and Applied Data Research, Inc. The work reported in this paper has been sponsored in part by the Air Force Office of Scientific Research of the Office of Aerospace Research, under Contract No. AF 49(638)-1452.
* Present address: RCA EDP Division, Cherry Hill, New Jersey.
† Defense and Space Center.
1. Control Characters
An assembly language program consists of a sequence of lines. Each line begins with the control character "CR" and extends to the next such character. Each line consists of a sequence of lists, separated from each other by the control character "LS". "LS" is an infix operator so that the configuration "CR LS" implies two lists, the first of which is null. Finally, each list consists of a sequence of items, separated from each other by the control character "IS", which is also an infix operator. Hence, the configuration "CR IS" implies that the first item of the first list is null. An example of a line is shown in Figure 1. The control characters can be thought of as:

CR    analogous to a carriage return on a typewriter;
IS    analogous to the comma of SLEUTH II;
LS    analogous to the space which must appear between SLEUTH II lists.

    CR item IS item LS item IS item LS LS IS item IS item

    LIST: each run "item IS item"; VOID LIST: between "LS LS"; VOID ITEM: between "LS IS"

    FIG. 1

It is significant to note that even though the control characters may have to be represented in some implementation as actual characters, they must not be regarded or manipulated as objects in the input string, but merely as delimiters. Hence, the above analogies to the space and comma of SLEUTH II hold only where these characters serve as delimiters and not as objects.
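The line structure described above can be illustrated with a minimal Python sketch. The concrete encodings chosen here for "CR", "LS" and "IS" (newline, tab and comma) are assumptions made for the illustration, as are the function name and the choice to represent a null list as an empty list and a null item as an empty string.

```python
# Assumed encodings of the three control characters (illustrative only).
CR, LS, IS = "\n", "\t", ","

def parse_program(text):
    """Split a program into lines, each line into lists, each list into items.

    The control characters act purely as delimiters: they never appear
    in the resulting items.
    """
    lines = []
    # Every line begins with CR, so anything before the first CR is ignored.
    for raw in text.split(CR)[1:]:
        lists = []
        for lst in raw.split(LS):          # LS is an infix list separator
            # A null list is kept as []; a null item is kept as "".
            lists.append(lst.split(IS) if lst else [])
        lines.append(lists)
    return lines

# The line of Fig. 1, with a, b, c, d, e, f standing for the items.
fig1 = parse_program("\na,b\tc,d\t\t,e,f")
print(fig1)
```

In this encoding the void list of Fig. 1 appears as an empty list and the void item as an empty string at the head of the last list.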
2. Structure of a Line
The first two lists of a line have a fixed interpretation. The first list serves the function of the location identification in conventional assembly systems. The first item of this list is generally construed to be a symbolic designation for the address of the line, and is called the label of the line. The second item of the list is taken to be the name (symbolic designation) of the location counter which is to control the allocation of the line (and all successive lines with a null location counter specification). The value of the third item of the list is the amount by which the location counter is to be incremented after processing the line. This item will often be null; in this case, the contents of the appropriate location counter is modified in a manner determined by the nature of the remainder of the line. The value of the fourth item specifies the maximum depth in the program tree (see 4.2), relative to the current depth, at which the symbolic value and numeric value of the label are to be available for processing. The value of the fifth item specifies the depth in the program tree, relative to the current depth, at which the symbol table entry is to be retained.

The second list serves as an operator whose operands are the rest of the lists on the line. The first item of the second list identifies the operation. Several primitives (basic operations) are inherent to the assembler and function as defined below. Any other operations must be defined by the programmer as required, in terms of the primitives.

Communications of the ACM, Volume 8, Number 12, December 1965
3. Additional Concepts
Items. The term item as used here refers to a symbolic designation, a RADIX evocation (see 4.5), a FUNCTION evocation (see 4.6.5) or any combination of these elementary items with suitable arithmetic, Boolean, and/or relational operators including parentheses. The allowed operators (since they are essentially conventional) are not enumerated here for the sake of brevity. The ends of an item are always delimited by control characters.

Literals. Any item enclosed entirely in parentheses will be treated as a symbolic immediate address, sometimes called a literal. Using this technique, one may write on any line what he is referring to rather than where it is stored. The information enclosed within the parentheses will be treated as a complete line with the left parenthesis serving as a pseudo "CR". The value of a numeric reference to a literal is the value of the location counter specified within the literal which was assigned to the first output line generated by the literal. A literal whose location counter specification is void will be allocated under control of the same location counter as the last previous literal. All literals under the control of the same location counter are pooled; that is, duplicates are eliminated.

Line Counter. The character "S", when used as an elementary item, represents the value of the appropriate location counter for the first output item generated for the current line. Hence, "S" will be regarded as a symbol which is inherently unredefinable by the programmer but continuously redefined by the system. If it appears on a line whose operator is a primitive, its value is determined by assuming that it occurred on the last previous nonprimitive line. (In this context, the operator of a DO line is the operator of the pseudo-line included in it.)

In addition, the value of the expression "S + n", for example, will be the value of the location counter which controls the allocation of the first output item generated by the nth following nonprimitive line when that item is generated. Hence, "S ± n" can be thought of as a present position relative address whose unit of measure is lines rather than locations.
Symbols. When any symbol occurs as item 1 of list 1 of a line whose operator is a nonprimitive, it takes the current value of the location counter controlling that line, and is entered in the symbol table with that value.

A symbol is subscripted when it is followed by a list enclosed in parentheses. A subscripted symbol S(i_1 IS i_2 ... IS i_n) has the a priori value equal to the number of subscripted symbols S(i_1 IS i_2 ... IS i_n IS i_{n+1}) for which the values of i_j for 1 ≤ j ≤ n are identical. The a priori numeric value of the symbol "S" will be zero only when "S" followed by a subscript list (either null or nonnull) appears somewhere in the program. Otherwise "S" has a priori numeric value null. The symbols at the end of the tree have either value zero (not null!) or the value assigned by their having occurred in a GENERATOR reference (see 4.6.2) or item 1 of list 1 of a line. All nonsubscripted symbols have an a priori null value.
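The counting rule for the a priori value of a subscripted symbol can be sketched as follows. This is an illustration only: the representation of symbol occurrences as (name, subscript tuple) pairs and the function name are assumptions, and the distinction between a zero value and a null value is not modeled.

```python
def a_priori_value(occurrences, name, subscripts):
    """Count the distinct symbols name(i_1, ..., i_n, i_{n+1}) that share
    the first n subscripts with name(i_1, ..., i_n)."""
    prefix = tuple(subscripts)
    n = len(prefix)
    extensions = {
        subs[n]                             # the extra subscript i_{n+1}
        for (sym, subs) in occurrences
        if sym == name and len(subs) == n + 1 and subs[:n] == prefix
    }
    return len(extensions)

# Hypothetical set of subscripted symbols appearing in a program.
seen = {("S", (1,)), ("S", (2,)), ("S", (1, 1)), ("S", (1, 2)), ("S", (2, 1))}
print(a_priori_value(seen, "S", (1,)))    # S(1) has the extensions S(1,1) and S(1,2)
print(a_priori_value(seen, "S", (1, 1)))  # a symbol at the end of the tree
```

Under this reading, a symbol at the end of the tree has no extensions and so receives the value zero, in agreement with the text.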
4. Primitives
4.1 Line. This primitive defines the length of an output line. The format of a source line using this primitive is

    CR name LS LINE LS i_1 IS i_2 IS i_3 IS i_4

The value of i_1 is the number of bits per output character under control of the LINE designation "name". The value of i_2 is the maximum number of such items in an output line. The value of i_3 is the representation of the "null character" to be used in filling fields which are not filled by the value of the item for the field. The value of i_4 is limited to the values specified below:

    0    Right justify the item values.
    1    Left justify the item values.
4.2 Form. This primitive defines the format of an output line under the control of the LINE designation specified by "LINE-name". The format of a line using this primitive is

    CR name LS FORM LS LINE-name LS i_1 IS i_2 IS ... IS i_j LS R

It is evoked by writing "name" as the operator of a line together with an operand list with j items. When it is evoked, the values of the j items are written as output, in the output file determined by the value of R, each with i_n (1 ≤ n ≤ j) characters per item, justified and truncated in the number of bits designated by the specified LINE definition. The output medium for each output file is installation dependent. Evocation of a FORM for which a null LINE-name specification was given will result in the use of the LINE specification used for the last previously evoked FORM.

The power and flexibility of this type of assembly process is based on the fundamental concept that there is exactly one primitive whose evocation causes output to be produced. The entire assembly process consists of a repeated evocation of a FORM definition with varying sets of parameters. A FORM can be evoked directly by writing its label as the operator of a line. However, a nesting process is available to the programmer (see 4.6) which allows him to evoke a complete subprogram for assembly with one line. Nested evocations may be placed within nested evocations ad infinitum. This nesting facility imposes an implicit tree structure on each line of the program. Each step of the nesting process increases the depth of the tree by one level. The subtrees for each line are connected by virtue of their being in the same program. Hence, the entire program can be regarded as a tree structure of parameterized evocations of FORM definitions.
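How a LINE definition and a FORM evocation might cooperate to produce output bits can be sketched as below. This is one possible reading, not the authors' implementation: the class and function names are hypothetical, item values are assumed to be nonnegative integers, field widths are assumed to be positive, and truncation is assumed to keep the low-order characters when right justifying and the leading characters when left justifying.

```python
from dataclasses import dataclass

@dataclass
class Line:
    bits_per_char: int    # i_1 of the LINE primitive
    max_chars: int        # i_2
    null_char: int        # i_3, fill value for unfilled positions
    left_justify: bool    # i_4: False = right justify, True = left justify

def to_chars(value, bits):
    """Split a nonnegative integer into characters of `bits` bits each,
    most significant first. Zero yields no characters."""
    chars = []
    while value > 0:
        chars.append(value & ((1 << bits) - 1))
        value >>= bits
    return chars[::-1]

def evoke_form(line, widths, values):
    """Return the output of one FORM evocation as a string of bits."""
    if sum(widths) > line.max_chars:
        raise ValueError("FORM is longer than the LINE allows")
    out = []
    for width, value in zip(widths, values):
        chars = to_chars(value, line.bits_per_char)
        fill = [line.null_char] * width
        if line.left_justify:
            chars = (chars + fill)[:width]     # pad right, keep leading chars
        else:
            chars = (fill + chars)[-width:]    # pad left, keep trailing chars
        out.extend(chars)
    return "".join(format(c, f"0{line.bits_per_char}b") for c in out)

# A 36-character line of 1-bit characters with fields of 3, 15, 3 and 15 bits.
word = evoke_form(Line(1, 36, 0, False), [3, 15, 3, 15], [0, 0o053400, 1, 0o100])
print(word, oct(int(word, 2)))
```

The operand values in the usage line are illustrative; they were chosen so that the four fields fill one 36-bit word.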
4.3 EQU. This primitive causes item 1 of list 1 to be placed in the symbol table with the value given in list 3. The format of a line using this primitive is

    CR name LS EQU LS item

If a symbol table entry has previously been made with "name" as its symbol, the new value will replace the old one. If, however, the old value was unredefinable (that is, it occurred as item 1 of list 1 on a nonprimitive line), and the new value differs from the old one, a multiple definition indication will be given, no values will be changed, and no further processing of the EQU line will be done.
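The redefinition rule for EQU can be sketched as a small symbol table. The class, method and exception names are hypothetical, and the case of an unredefinable symbol given the same value again is assumed here to leave the entry unchanged.

```python
class MultipleDefinition(Exception):
    """Stands in for the multiple definition indication."""

class SymbolTable:
    def __init__(self):
        self._entries = {}                 # name -> (value, redefinable)

    def define_label(self, name, value):
        """Label on a nonprimitive line: entered as unredefinable."""
        self._entries[name] = (value, False)

    def equ(self, name, value):
        """Apply the EQU rule: replace freely unless the old value is
        unredefinable and differs from the new one."""
        old = self._entries.get(name)
        if old is not None and not old[1]:
            if old[0] != value:
                raise MultipleDefinition(name)   # nothing is changed
            return                               # same value: keep the entry
        self._entries[name] = (value, True)

    def value(self, name):
        return self._entries[name][0]

table = SymbolTable()
table.equ("A", 5)
table.equ("A", 6)             # an EQU value may be replaced
table.define_label("LOOP", 100)
try:
    table.equ("LOOP", 101)    # conflicts with an unredefinable label
except MultipleDefinition as err:
    print("multiple definition:", err)
```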
4.4 DO. This primitive provides the means for repetitive processing of a given pseudo-line which begins with item 1 of list 4 of the DO line. The value of item 1 of list 3 specifies the number of repetitions to be performed; a zero value causes the pseudo-line to be disregarded. Item 2 of list 3 is the symbolic designation of the counter to be used for the duration of the repetition process; the contents of this counter can be accessed from the pseudo-line by this name. Its value begins at 1 and increases by 1 for each repetition of the pseudo-line. The repetition process stops when the counter contents becomes equal to the value of item 1 of list 3. The LS between lists 3 and 4 serves as a pseudo CR. The configuration "LS LS" implies a null first list for the pseudo-line. The format of a line using this primitive is

    CR name LS DO LS count IS counter LS l_1 LS l_2 LS l_3 ...
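The repetition performed by DO can be sketched as a simple expansion. The function name is hypothetical, the pseudo-line is assumed to be given as a list of lists of item strings, and the counter is substituted only where it appears as a whole item; a real implementation would evaluate it inside expressions as well.

```python
def expand_do(count, counter, pseudo_line):
    """Return one copy of the pseudo-line per repetition, with the
    counter name replaced by its current value."""
    expanded = []
    for k in range(1, count + 1):          # counter runs from 1 up to count
        copy = [
            [str(k) if item == counter else item for item in lst]
            for lst in pseudo_line
        ]
        expanded.append(copy)
    return expanded

# Pseudo-line with a null first list, operator I and operands K, 7.
for line in expand_do(3, "K", [[""], ["I"], ["K", "7"]]):
    print(line)
```

A count of zero yields no copies, which corresponds to the pseudo-line being disregarded.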
4.5 RADIX. The format of a line using this primitive is

    CR name LS RADIX LS i_0 IS i_1 IS ... IS i_{j-1}

The j items of list 3 are the symbols which will represent the digits of an integer with radix j; item i_k has value k. The symbol "name" must prefix the digit symbols; if "name" is the same as any of the items of list 3, it is to be regarded as a digit of the integer as well as a RADIX evocation.
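The evaluation of a RADIX evocation can be sketched as follows. The function name is hypothetical and digit symbols are assumed to be single characters, which the text does not require.

```python
def radix_value(name, digits, text):
    """Evaluate `text` as an integer under the RADIX definition `name`.

    digits[k] is the symbol with value k, so the radix is len(digits).
    If `name` is itself one of the digit symbols it counts as a digit;
    otherwise it is only a prefix.
    """
    if not text.startswith(name):
        raise ValueError("evocation must begin with the RADIX name")
    body = text if name in digits else text[len(name):]
    value = 0
    for symbol in body:
        value = value * len(digits) + digits.index(symbol)
    return value

octal = list("01234567")
decimal = list("0123456789")
print(radix_value("0", octal, "0534"))   # name "0" is also a digit
print(radix_value("1", decimal, "15"))
```

With ten such definitions, one named for each leading digit, a number beginning with one symbol can be read in one radix and numbers beginning with the other symbols in another, without any separate prefix.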
4.6 GENERATOR, ENTRY and FUNCTION
4.6.1. These primitives are more simply discussed together. The formats of lines using GENERATOR as a primitive are

    CR name1 LS GENERATOR LS ...
    CR name2 LS ENTRY LS ...
    CR name3 LS ENTRY LS ...
    CR LS END

These lines define an operator whose name is "name1". The GENERATOR is evoked by a line whose item 1, list 2 is one of the names of an ENTRY line. The body of the GENERATOR consists of lines which evoke primitives and other GENERATORs. Reference can be made within the GENERATOR to either the numeric or symbolic values of items on the evoking line and its neighbors. Reference to a numeric value is made by following the name of the GENERATOR, here "name1", with a list of items, called subscripts, enclosed in parentheses. Reference to a symbolic value is made by following "name1" with a list of items, called designators, enclosed in brackets. References to numeric values will be called numeric references; references to symbolic values will be called symbolic references. The following conventions hold.
4.6.2. Generally, there are three items in the subscript list. The value of item 1 designates the line; line 0 is the evoking line, line 1 the following, and line -1 the preceding, etc. The value of item 2 designates the list required on the referenced line. List 0 is list 3 of the relevant entry line of the GENERATOR for line 0, and is a null list for all other lines. The value of item 3 designates an item of the specified list. Any numeric reference with fewer than three subscripts takes the a priori value normally assigned to a subscripted label. Hence, the value of a numeric reference with two subscripts will be the number of items in the referenced list, since these items are all referenced with triply subscripted numeric references whose first two subscripts are identical to those of the doubly subscripted numeric reference.

4.6.3. Subscript 3 can take a special form which uses the control character "IJ", item juxtaposer. Its form is then

    Item A IJ item B.

"IJ" is an infix operator. The value of item A designates an item of the specified list. The value of item B designates which of several possible special questions is being asked about the designated item. The admissible item B values together with the information they request are

    Value    Information Returned
    0        Number of characters in the designated item.
    1        One if the designated value references a defined RADIX; zero otherwise.
    2        One if the designated item is redefinable; zero otherwise.

4.6.4. Generally, there are four items in the designator list. The value of item 1 designates the line with respect to the evoking line. The value of item 2 designates the list on the referenced line. The value of item 3 specifies an item of the referenced list. The value of item 4 designates a character of the referenced item. The value of a symbolic reference is determined by the number of designators it contains. The values are listed below.

    Number of designators    Value
    1                        Character string of evoking line.
    2                        Character string of the designated list.
    3                        Character string of the designated item.
    4                        The designated character.
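The conventions for numeric and symbolic references can be sketched as lookups into a parsed program. This is a partial illustration under stated assumptions: the function names are hypothetical, a program is assumed to be a list of lines, each a list of lists of item strings, lists and items are numbered from 1, item values are obtained with a caller-supplied `value_of` function (plain `int` by default), and only the two- and three-subscript numeric forms and the three- and four-designator symbolic forms are covered.

```python
def get_list(program, evoking, line_offset, list_no, entry_list):
    """Fetch the designated list. List 0 is the entry line's operand list
    for the evoking line and a null list for every other line."""
    if list_no == 0:
        return list(entry_list) if line_offset == 0 else []
    index = evoking + line_offset
    if not 0 <= index < len(program):
        raise IndexError("referenced line is outside the program")
    line = program[index]
    return line[list_no - 1] if list_no <= len(line) else []

def numeric_ref(program, evoking, subs, entry_list=(), value_of=int):
    """Two subscripts: number of items in the list. Three: the item value."""
    lst = get_list(program, evoking, subs[0], subs[1], entry_list)
    if len(subs) == 2:
        return len(lst)
    return value_of(lst[subs[2] - 1])

def symbolic_ref(program, evoking, des, entry_list=()):
    """Three designators: the item string. Four: one character of it."""
    lst = get_list(program, evoking, des[0], des[1], entry_list)
    item = lst[des[2] - 1]
    return item if len(des) == 3 else item[des[3] - 1]

# Hypothetical three-line program; line 1 is taken as the evoking line.
program = [
    [["A"], ["LXA"], ["7", "2"]],
    [["B"], ["BT"], ["10", "200", "3"]],
    [["C"], ["TIX"], ["5"]],
]
print(numeric_ref(program, 1, (0, 3, 1)))    # item 1 of list 3 on the evoking line
print(numeric_ref(program, 1, (0, 3)))       # number of items in that list
print(symbolic_ref(program, 1, (0, 2, 1)))   # operator of the evoking line
```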
4.6.5. The other multi-line structure is the FUNCTION. The general format of several lines using this primitive is

    CR name1 LS FUNCTION LS ...
    CR name2 LS ENTRY LS ...
    CR name3 LS ENTRY LS ...
    CR LS END LS item

This structure is evoked by the occurrence of name2, name3, etc. in any item. Operands may be presented to the FUNCTION by placing them in lists in parentheses after the name evocation. The values of these operands are referenced in the body of the FUNCTION by subscripting name1 with the desired list and item number analogous to the reference to parameters in the body of a GENERATOR. The difference between the GENERATOR and FUNCTION primitives is that the former generally produces output lines whereas the latter always produces a value for use in further processing. The value returned by the FUNCTION is the value of "item" given on its END line evaluated with the values of its parameters at the time the END line is reached.

4.7 GO. This primitive alters the sequence of interpretation of the source lines. The format of a source line using the primitive is

    CR name LS GO LS label

"Label" specifies the label of the line from which assembly is to proceed, subject to the following restriction: Entry to a GENERATOR which was not being processed prior to the occurrence of the GO line must be made through an ENTRY line of the GENERATOR.

4.8 END. This primitive has three distinct uses:
(1) An END line must be given for each GENERATOR line. This use of END signals the conclusion of the GENERATOR body.
(2) An END line must be given for each FUNCTION line. This use of END must have a nonvoid third list as described under FUNCTION, and signals the conclusion of the FUNCTION body.
(3) An END line must occur as the last line of every program.
Hence, an END line must conclude every program and subprogram.
5. Conclusions
The meta-language presented here can be used to specify the mapping of any language which conforms to the canonical list form into an arbitrary stream of bits. This output bit stream may be treated as a machine language program, a character stream, or whatever else the user might desire. For example, the input could be written in some arbitrary assembly language and the output be a machine language program for a particular machine. Conversely, the input might be written in a particular assembly language and the output be a machine language program for an arbitrary machine. (Of course, there is no way to provide for the case in which the programmer squares an instruction and executes the middle bits of the product, or any of the gruesome analogs to this example.)

The problem of reprogramming for one machine the assembly-language versions of programs written for another can be facilitated by the use of the techniques suggested by this paper. Given as input the assembly language for the old machine, either the machine code, the assembly language or both, for the new machine can be generated. However, significant difficulties can arise when the source program includes sequences of code in which time delays are significant. This situation arises most frequently in sequences which perform input or output without library routines. These sequences can be traced at translation time using the GO primitive and the line scanning facility of the meta-language. Timing estimates can be constructed for all possibly critical sequences and diagnostic messages inserted where appropriate. At worst, questionable sequences can be detected and indicated.
APPENDIX
The appendix presents an example of the use of the language. The control characters used in the example are:

    CR    Beginning of the line
    LS    Tab (sequence of spaces)
    IS    Comma

List one of all the example lines is intentionally incomplete for the sake of clarity; otherwise, the example is complete. The example shows how a UNIVAC 1107 (U1107) Block Transfer (BT) instruction could be mapped to IBM 7094 (I7094) code assuming a particular mapping of the registers of the UNIVAC 1107 to IBM 7094 core memory. The 1107 block transfer instruction is executed in repeat mode. In the SLEUTH II language for the 1107, the block transfer instruction is written:
BT DESTINATION INDEX, ADDRESS, *SOURCE INDEX
The asterisk which precedes the source index specification indicates that the index registers are to be incremented for each transmission operation.
    0      RADIX        0, 1, 2, 3, 4, 5, 6, 7
    1      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    2      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    3      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    4      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    5      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    6      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    7      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    8      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    9      RADIX        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    L      LINE         1, 36, 0, 0
    I      FORM         L      3, 15, 3, 15      CODE
    CX1    GENERATOR
    LXA    ENTRY        0, 053400
    LAC    ENTRY        0, 053500
           I            CX1 (0,0,1), CX1 (0,0,2), CX1 (0,3,2), CX1 (0,3,1)
           END

    [Remainder of the listing not recoverable from this copy: definitions of the generators CX2, CT, CX3, G and BT, with bodies that use the ENTRY, DO, EQU and END primitives.]
REFERENCES
1. HALPERN, M. I. XPOP: A metalanguage without metaphysics. Proc. 1964 Fall Joint Comput. Conf., Vol. 26, Spartan Books, Washington, D.C., 1964.
2. FERGUSON, D. E. A meta-assembly language. Programmatics, Los Angeles,
3. INGERMAN, P. Z. The parameterization of the translation process. Paper presented at IFIP Working Conference on Formal Language Specification Languages, Baden, Austria, Sept. 1964.
4. Reference Manual IBM 7094 Data Processing System, Form A22-6703, IBM, Customer Manuals, Department 298, P.O. Box 390, Poughkeepsie, N.Y.
5. Central Computer Manual, UP-2463 Rev. 2, Systems Programming Library Services, UNIVAC Engineering Center, Plant 2, Box 999, Blue Bell, Pa.
1401 Compatibility Feature on the IBM System/360 Model 30
M. A. McCormack, T. T. Schansman and K. K. Womack
IBM Corporation,* Endicott, New York
The "second generation" of stored-program computers, of which the IBM 1400 series was a part, brought EDP into the mass market for the first time on a large scale. As this era unfolded, rapid changes in technology led to rapid obsolescence of data processing equipment. Programs written for a particular system required tedious conversion as incompatible new machines came into use. The IBM System/360 has been designed with the conversion problem specifically in mind. One of the conversion aids available on the Model 30 is the 1401 compatibility feature. This feature, in conjunction with other aids, permits a smooth and inexpensive transition to optimum use of the new system.

Introduction
In the past it has not generally been economically feasible to implement two dissimilar machine languages within a single processor. Today, the Read Only Storage Controls used in IBM System/360 make it economically feasible to implement the languages of current systems within System/360. To give as complete a picture of this new implementation technique as possible, remarks will be restricted to implementation of the 1401 compatibility feature on System/360 Model 30.

The 1401 Compatibility Feature
Two principal conversion methods have evolved from second generation technology: program translation and simulation. In considering a method of conversion for 1400 series programs to the Model 30, both of these conversion methods were considered, but were rejected as being either too slow or requiring too much manual intervention. The objective was to provide a means of running 1400 series programs on the Model 30 without change. The introduction of the 1401 compatibility feature on the IBM 1410 had shown historically that such an objective could be achieved in a machine of similar internal organization. Ease of use and speed were demonstrated to be the primary advantages of having a compatibility feature. In the past, it was considered impossible to implement two completely different machine organizations in one processor, without incurring exceptional costs and intolerable inefficiency. However, in the case of the Model 30, it seemed that Read Only Storage Controls make manipu

Presented at the ACM Reprogramming Conference, Princeton, N. J., June 1965, co-sponsored by the Association for Computing Machinery and Applied Data Research, Inc.
* Systems Development Division.