Little man computer
The Little Man Computer (LMC) is an instructional model of a computer, created by Dr. Stuart Madnick in 1965. The LMC is generally used to teach students, because it models a simple von Neumann architecture computer - which has all of the basic features of a modern computer. It can be programmed in machine code (albeit in decimal rather than binary) or assembly code.
The LMC model is based on the concept of a little man shut in a closed mail room (analogous to a computer in this scenario). At one end of the room, there are 100 mailboxes (memory), numbered 0 to 99, that can each contain a 3 digit instruction or data (ranging from 000 to 999). Furthermore, there are two mailboxes at the other end labeled INBOX and OUTBOX which are used for receiving and outputting data. In the center of the room, there is a work area containing a simple two function (addition and subtraction) calculator known as the Accumulator and a resettable counter known as the Program Counter. The Program Counter holds the address of the next instruction the Little Man will carry out. This Program Counter is normally incremented by 1 after each instruction is executed, allowing the Little Man to work through a program sequentially. Branch instructions allow iteration (loops) and conditional programming structures to be incorporated into a program. The latter is achieved by setting the Program Counter to a non-sequential memory address if a particular condition is met (typically the value stored in the accumulator being zero or positive).
As specified by the von Neumann architecture, each mailbox (signifying a unique memory location) contains both instructions and data. Care therefore needs to be taken to stop the Program Counter from reaching a memory address containing data - or the Little Man will attempt to treat it as an instruction. One can take advantage of this by writing instructions into mailboxes that are meant to be interpreted as code, to create self-modifying code. To use the LMC, the user loads data into the mailboxes and then signals the Little Man to begin execution, starting with the instruction stored at memory address zero. Resetting the Program Counter to zero effectively restarts the program, albeit in a potentially different state.
To execute a program, the little man performs these steps:
- Check the Program Counter for the mailbox number that contains a program instruction (i.e. zero at the start of the program)
- Fetch the instruction from the mailbox with that number. Each instruction contains two fields: An opcode (indicating the operation to perform) and the address field (indicating where to find the data to perform the operation on).
- Increment the Program Counter (so that it contains the mailbox number of the next instruction)
- Decode the instruction. If the instruction uses data stored in another mailbox, use the address field to find the mailbox number of the data it will work on (e.g. 'get data from mailbox 42')
- Fetch the data (from the input, accumulator, or mailbox with the address determined in the decode step)
- Execute the instruction based on the opcode given
- Branch or store the result (in the output, accumulator, or mailbox with the address determined in the decode step)
- Return to the Program Counter to repeat the cycle or halt
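The cycle above can be sketched as a small Python loop. This is a minimal illustration rather than a reference implementation: the function name `run`, the `negative` flag, and the wrap-around of results outside 000 to 999 are assumptions, since the model leaves overflow and underflow behaviour undefined. The numeric codes are those of the typical instruction set.

```python
def run(program, inputs):
    """Run an LMC program given as a list of 3-digit numbers; return the OUTBOX values."""
    mailboxes = list(program) + [0] * (100 - len(program))  # 100 mailboxes, 00 to 99
    inbox = iter(inputs)
    outbox = []
    accumulator = 0
    negative = False      # assumed flag, set when SUBTRACT underflows
    program_counter = 0   # execution starts at mailbox 00

    while True:
        # Check the Program Counter and fetch the instruction from that mailbox.
        instruction = mailboxes[program_counter]
        # Increment the Program Counter so it points at the next instruction.
        program_counter += 1
        # Decode: first digit is the opcode, last two digits are the address field.
        opcode, address = divmod(instruction, 100)

        # Fetch any data, execute, then branch or store the result.
        if instruction == 0:                     # HLT
            return outbox
        elif instruction == 901:                 # INP
            accumulator, negative = next(inbox), False
        elif instruction == 902:                 # OUT
            outbox.append(accumulator)
        elif opcode == 1:                        # ADD (wrap-around is an assumption)
            accumulator = (accumulator + mailboxes[address]) % 1000
            negative = False
        elif opcode == 2:                        # SUB (flag and wrap-around are assumptions)
            negative = accumulator < mailboxes[address]
            accumulator = (accumulator - mailboxes[address]) % 1000
        elif opcode == 3:                        # STA
            mailboxes[address] = accumulator
        elif opcode == 5:                        # LDA
            accumulator, negative = mailboxes[address], False
        elif opcode == 6:                        # BRA
            program_counter = address
        elif opcode == 7:                        # BRZ
            if accumulator == 0 and not negative:
                program_counter = address
        elif opcode == 8:                        # BRP
            if not negative:
                program_counter = address
        else:
            raise ValueError(f"unknown instruction {instruction:03d}")


# Read two numbers, add them, output the sum, halt. Mailbox 06 is scratch storage.
print(run([901, 306, 901, 106, 902, 0], [2, 3]))  # prints [5]
```

Fetching and incrementing happen before the opcode is examined, so a branch simply overwrites the already incremented Program Counter.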
While the LMC does reflect the actual workings of binary processors, the simplicity of decimal numbers was chosen to minimize the complexity for students who may not be comfortable working in binary/hexadecimal.
Some LMC simulators are programmed directly using 3-digit numeric instructions and some use 3-letter mnemonic codes and labels. In either case, the instruction set is deliberately very limited (typically about ten instructions) to simplify understanding. If the LMC uses mnemonic codes and labels then these are converted into 3-digit numeric instructions when the program is assembled. The first digit of a numeric instruction is the opcode, and the last two digits are the address field.
The table below shows a typical numeric instruction set and the equivalent mnemonic codes.
|Numeric code||Mnemonic code||Instruction||Description|
|1xx||ADD||ADD||Add the value stored in mailbox xx to whatever value is currently on the accumulator (calculator).|
|2xx||SUB||SUBTRACT||Subtract the value stored in mailbox xx from whatever value is currently on the accumulator (calculator).|
|3xx||STA||STORE||Store the contents of the accumulator in mailbox xx (destructive).|
|5xx||LDA||LOAD||Load the value from mailbox xx (non-destructive) and enter it in the accumulator (destructive).|
|6xx||BRA||BRANCH (unconditional)||Set the program counter to the given address (value xx). That is, value xx will be the next instruction executed.|
|7xx||BRZ||BRANCH IF ZERO (conditional)||If the accumulator (calculator) contains the value 000, set the program counter to the value xx. Otherwise, do nothing. Whether the negative flag is taken into account is undefined. When a SUBTRACT underflows the accumulator, this flag is set, after which the accumulator is undefined, potentially zero, causing behavior of BRZ to be undefined on underflow. Suggested behavior would be to branch if accumulator is zero and negative flag is not set.|
|8xx||BRP||BRANCH IF POSITIVE (conditional)||If the accumulator (calculator) is 0 or positive, set the program counter to the value xx. Otherwise, do nothing. As LMC memory cells can only hold values between 0 and 999, this instruction depends solely on the negative flag set by an underflow on SUBTRACT and potentially on an overflow on ADD (undefined).|
|901||INP||INPUT||Go to the INBOX, fetch the value from the user, and put it in the accumulator (calculator).|
|902||OUT||OUTPUT||Copy the value from the accumulator (calculator) to the OUTBOX.|
|000||HLT/COB||HALT/COFFEE BREAK||Stop working/end the program.|
| ||DAT||DATA||This is an assembler instruction which simply loads the value into the next available mailbox. DAT can also be used in conjunction with labels to declare variables. For example, DAT 984 will store the value 984 into a mailbox at the address of the DAT instruction.|
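As a rough illustration of how the table maps numeric codes to mnemonics, the Python sketch below splits a 3-digit instruction into a mnemonic and an address. The names `MNEMONICS` and `decode` are hypothetical, and treating any code outside the table as an error is an assumption.

```python
# Opcodes whose last two digits are a mailbox address.
MNEMONICS = {1: "ADD", 2: "SUB", 3: "STA", 5: "LDA", 6: "BRA", 7: "BRZ", 8: "BRP"}


def decode(instruction):
    """Split a 3-digit numeric instruction into (mnemonic, address)."""
    # INP, OUT and HLT are complete codes with no address field.
    if instruction == 0:
        return ("HLT", None)
    if instruction == 901:
        return ("INP", None)
    if instruction == 902:
        return ("OUT", None)
    opcode, address = divmod(instruction, 100)
    if opcode not in MNEMONICS:
        raise ValueError(f"{instruction:03d} is not in the instruction table")
    return (MNEMONICS[opcode], address)


for code in (901, 308, 209, 600, 0):
    print(f"{code:03d} -> {decode(code)}")
```

For example, 308 decodes to STA with address 8, while 901 is matched as a whole because the digits 01 select the INBOX rather than a mailbox.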
Using Numeric Instruction Codes
This program (instruction 901 to instruction 000) is written just using numeric codes. The program takes two numbers as input and outputs the difference. Notice that execution starts at Mailbox 00 and finishes at Mailbox 07. The disadvantages of programming the LMC using numeric instruction codes are discussed below.
|00||901||INBOX --> ACCUMULATOR||INPUT the first number, enter into calculator (erasing whatever was there)|
|01||308||ACCUMULATOR --> MEMORY||STORE the calculator's current value (to prepare for the next step...)|
|02||901||INBOX --> ACCUMULATOR||INPUT the second number, enter into calculator (erasing whatever was there)|
|03||309||ACCUMULATOR --> MEMORY||STORE the calculator's current value (again, to prepare for the next step...)|
|04||508||MEMORY --> ACCUMULATOR||(Now that both INPUT values are STORED in Mailboxes 08 and 09...) LOAD the first value back into the calculator (erasing whatever was there)|
|05||209||ACCUMULATOR = ACCUMULATOR - MEMORY||SUBTRACT the second number from the calculator's current value (which was just set to the first number)|
|06||902||ACCUMULATOR --> OUTBOX||OUTPUT the calculator's result to the OUTBOX|
|07||000||(no operation performed)||HALT the LMC|
Using Mnemonics and Labels
Assembly language is a low-level programming language that uses mnemonics and labels instead of numeric instruction codes. Although the LMC only uses a limited set of mnemonics, the convenience of using a mnemonic for each instruction is made apparent from the assembly language of the same program shown below - the programmer is no longer required to memorize a set of anonymous numeric codes and can now program with a set of more memorable mnemonic codes. If the mnemonic is an instruction that involves a memory address (either a branch instruction or loading/saving data) then a label is used to name the memory address.
- This example program can be assembled and run on the LMC simulator available on the website of York University (Toronto, Canada) or on the desktop application written by Mike Coley. Both simulators include full instructions and sample programs, an assembler to convert the assembly code into machine code, control interfaces to execute and monitor programs, and a step-by-step detailed description of each LMC instruction.
        INP
        STA FIRST
        INP
        STA SECOND
        LDA FIRST
        SUB SECOND
        OUT
        HLT
FIRST   DAT
SECOND  DAT
Without labels the programmer is required to manually calculate mailbox (memory) addresses. In the numeric code example, if a new instruction was to be inserted before the final HLT instruction then that HLT instruction would move from address 07 to address 08 (address labelling starts at address location 00). Suppose the user entered 600 as the first input. The instruction 308 would mean that this value would be stored at address location 08 and overwrite the 000 (HLT) instruction. Since 600 means "branch to mailbox address 00" the program, instead of halting, would get stuck in an endless loop.
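The failure described above can be reproduced with a small Python sketch. The simulator `run`, its `max_steps` cap, and the choice of a second 902 (OUT) as the inserted instruction are assumptions made for illustration; the wrap-around arithmetic is likewise an assumption, since the model leaves it undefined.

```python
def run(program, inputs, max_steps=1000):
    """Run LMC machine code; return (outputs, halted)."""
    mem = list(program) + [0] * (100 - len(program))
    inbox, outputs = iter(inputs), []
    acc, negative, pc = 0, False, 0
    for _ in range(max_steps):
        instruction = mem[pc]
        pc += 1
        opcode, address = divmod(instruction, 100)
        if instruction == 0:                 # HLT
            return outputs, True
        elif instruction == 901:             # INP
            acc, negative = next(inbox), False
        elif instruction == 902:             # OUT
            outputs.append(acc)
        elif opcode == 1:                    # ADD (wrap-around assumed)
            acc, negative = (acc + mem[address]) % 1000, False
        elif opcode == 2:                    # SUB (flag and wrap-around assumed)
            negative = acc < mem[address]
            acc = (acc - mem[address]) % 1000
        elif opcode == 3:                    # STA
            mem[address] = acc
        elif opcode == 5:                    # LDA
            acc, negative = mem[address], False
        elif opcode == 6:                    # BRA
            pc = address
        elif opcode == 7:                    # BRZ
            if acc == 0 and not negative:
                pc = address
        elif opcode == 8:                    # BRP
            if not negative:
                pc = address
        else:
            raise ValueError(f"unknown instruction {instruction:03d}")
    return outputs, False                    # step cap reached without halting


original = [901, 308, 901, 309, 508, 209, 902, 0]
# Extra OUT inserted before HLT, but addresses 08 and 09 left unchanged:
shifted = [901, 308, 901, 309, 508, 209, 902, 902, 0]

print(run(original, [600, 1]))                        # ([599], True)
outputs, halted = run(shifted, [600, 1] * 10, max_steps=50)
print(halted)                                         # False: HLT was overwritten by 600
```

In the shifted program the first STORE writes 600 over the HLT in mailbox 08, so execution reaches a "branch to 00" instead of stopping and the program keeps asking for input.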
To work around this difficulty, most assembly languages (including the LMC) combine the mnemonics with labels. A label is simply a word that is used to either name a memory address where an instruction or data is stored, or to refer to that address in an instruction.
When a program is assembled:
- A label to the left of an instruction mnemonic is converted to the memory address the instruction or data is stored at, e.g. loopstart INP
- A label to the right of an instruction mnemonic takes on the value of the memory address referred to above, e.g. BRA loopstart
- A label combined with a DAT statement works as a variable: it labels the memory address that the data is stored at, e.g. one DAT 1 or number1 DAT
In the assembly language example which uses mnemonics and labels, if a new instruction was inserted before the final HLT instruction then the address location labelled FIRST would now be at memory location 09 rather than 08 and the STA FIRST instruction would be converted to 309 (STA 09) rather than 308 (STA 08) when the program was assembled.
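A two-pass assembler is one common way to resolve labels like this. The Python sketch below is an assumption about how such an assembler might work, not a description of any particular simulator: the names `OPCODES`, `assemble`, and `SUBTRACT_PROGRAM` are hypothetical, every line is assumed to contain a mnemonic, and labels are assumed not to clash with mnemonics.

```python
OPCODES = {"ADD": 100, "SUB": 200, "STA": 300, "LDA": 500,
           "BRA": 600, "BRZ": 700, "BRP": 800,
           "INP": 901, "OUT": 902, "HLT": 0, "COB": 0, "DAT": 0}


def assemble(source):
    """Translate LMC assembly text into a list of numeric instruction codes."""
    # Pass 1: record the address of every label to the left of a mnemonic.
    lines, labels = [], {}
    for raw in source.splitlines():
        tokens = raw.split("//")[0].split()   # drop comments, split on whitespace
        if not tokens:
            continue
        if tokens[0].upper() not in OPCODES:  # first word is a label
            labels[tokens[0]] = len(lines)    # address = position in the program
            tokens = tokens[1:]
        operand = tokens[1] if len(tokens) > 1 else None
        lines.append((tokens[0].upper(), operand))

    # Pass 2: replace labels to the right of a mnemonic with their addresses.
    machine_code = []
    for mnemonic, operand in lines:
        if operand is None:
            value = 0                         # e.g. INP, HLT, or DAT with no data
        elif operand in labels:
            value = labels[operand]
        else:
            value = int(operand)              # literal address or DAT value
        machine_code.append(OPCODES[mnemonic] + value)
    return machine_code


SUBTRACT_PROGRAM = """
        INP
        STA FIRST
        INP
        STA SECOND
        LDA FIRST
        SUB SECOND
        OUT
        HLT
FIRST   DAT
SECOND  DAT
"""

print(assemble(SUBTRACT_PROGRAM))
# [901, 308, 901, 309, 508, 209, 902, 0, 0, 0]
```

Because addresses are computed from line positions in the first pass, inserting an instruction before HLT moves FIRST to address 09 and STA FIRST assembles to 309 without any manual renumbering.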
Labels are therefore used to:
- identify a particular instruction as a target for a BRANCH instruction.
- identify a memory location as a named variable (using DAT) and optionally load data into the program at assembly time for use by the program. This use is not obvious until one considers that the LMC has no instruction for adding a constant such as 1 to a counter: the value must already be in a mailbox. One could ask the user to input 1 at the beginning, but it is better to have it loaded at assembly time using one DAT 1.
This program will take a user input, and count down to zero.
        INP
        OUT        // Initialize output
LOOP    BRZ QUIT   // Label this memory address as LOOP. If the accumulator value is 0, jump to the memory address labeled QUIT
        SUB ONE    // Subtract the value stored at address ONE from the accumulator
        OUT
        BRA LOOP   // Jump (unconditionally) to the memory address labeled LOOP
QUIT    HLT        // Label this memory address as QUIT
ONE     DAT 1      // Store the value 1 in this memory address, and label it ONE (variable declaration)
This program will take a user input, square it, output the answer and then repeat. Entering a zero will end the program.
(Note: an input that results in an output greater than 999 will cause an error due to the LMC 3 digit number limit).
START   LDA ZERO     // Initialize for multiple program run
        STA RESULT
        STA COUNT
        INP          // User provided input
        BRZ END      // Branch to program END if input = 0
        STA VALUE    // Store input as VALUE
LOOP    LDA RESULT   // Load the RESULT
        ADD VALUE    // Add VALUE, the user provided input, to RESULT
        STA RESULT   // Store the new RESULT
        LDA COUNT    // Load the COUNT
        ADD ONE      // Add ONE to the COUNT
        STA COUNT    // Store the new COUNT
        SUB VALUE    // Subtract the user provided input VALUE from COUNT
        BRZ ENDLOOP  // If zero (VALUE has been added to RESULT by VALUE times), branch to ENDLOOP
        BRA LOOP     // Branch to LOOP to continue adding VALUE to RESULT
ENDLOOP LDA RESULT   // Load RESULT
        OUT          // Output RESULT
        BRA START    // Branch to the START to initialize and get another input VALUE
END     HLT          // HALT - a zero was entered so done!
RESULT  DAT          // Computed result (defaults to 0)
COUNT   DAT          // Counter (defaults to 0)
ONE     DAT 1        // Constant, value of 1
VALUE   DAT          // User provided input, the value to be squared (defaults to 0)
ZERO    DAT          // Constant, value of 0 (defaults to 0)
Note: If there is no data after a DAT statement then the default value 0 is stored in the memory address.
In the example above, [BRZ ENDLOOP] depends on undefined behaviour, as COUNT-VALUE can be negative, after which the ACCUMULATOR value is undefined, resulting in BRZ either branching or not (ACCUMULATOR may be zero, or wrapped around). To make the code compatible with the specification, replace:
        ...
        LDA COUNT    // Load the COUNT
        ADD ONE      // Add ONE to the COUNT
        STA COUNT    // Store the new COUNT
        SUB VALUE    // Subtract the user provided input VALUE from COUNT
        BRZ ENDLOOP  // If zero (VALUE has been added to RESULT by VALUE times), branch to ENDLOOP
        ...
with the following version, which does VALUE-COUNT instead of COUNT-VALUE, making sure the accumulator never underflows:
        ...
        LDA COUNT    // Load the COUNT
        ADD ONE      // Add ONE to the COUNT
        STA COUNT    // Store the new COUNT
        LDA VALUE    // Load the VALUE
        SUB COUNT    // Subtract COUNT from the user provided input VALUE
        BRZ ENDLOOP  // If zero (VALUE has been added to RESULT by VALUE times), branch to ENDLOOP
        ...
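The difference between the two orderings can be checked with plain arithmetic. The short Python sketch below (the helper name `loop_differences` is hypothetical) lists both subtractions for each pass through the loop, with COUNT running from 1 up to VALUE.

```python
def loop_differences(value):
    """Pair COUNT - VALUE with VALUE - COUNT for each pass, COUNT = 1..value."""
    return [(count - value, value - count) for count in range(1, value + 1)]


for count_minus_value, value_minus_count in loop_differences(3):
    print(count_minus_value, value_minus_count)
# -2 2
# -1 1
# 0 0
```

COUNT - VALUE is negative on every pass except the last, which is where the accumulator would underflow, while VALUE - COUNT stays at or above zero and reaches zero on the same final pass.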
Another example is a quine, which prints its own machine code (printing the source is impossible because letters cannot be output):
        DAT 515   // 00: Load BRA 5 instruction to store at 20
        DAT 320   // 01: Store instruction at position 20
        DAT 518   // 02: Load current LOAD instruction to store at 19
        DAT 319   // 03: Store instruction at position 19
        DAT 619   // 04: Branch to code created, that will go to next instruction with the pointed number loaded in accumulator
        DAT 902   // 05: Output pointed number.
        DAT 217   // 06: Subtract end marker from number.
        DAT 712   // 07: If zero, output modifiable data (the modifiable LOAD instruction) and halt (position 12)
        DAT 518   // 08: Otherwise, increment pointer
        DAT 117   // 09: The increment
        DAT 318   // 10: Here the incremented pointer is stored
        DAT 602   // 11: Branch to start of loop (2)
        DAT 516   // 12: Now, output position 18's original value (it was modified throughout running of quine)
        DAT 902   // 13: Actual output
        DAT 0     // 14: Halt
        DAT 605   // 15: BRA 5 instruction to be loaded at position 20
        DAT 500   // 16: LOAD instruction original form
        DAT 1     // 17: End marker for the quine, increment for pointer
        DAT 500   // 18: LOAD instruction that will be loaded at position 19
This quine works using self-modifying code. It reserves positions 19 and 20 for encoding. It stores a pointer at location 18.
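The quine can be checked with a small Python sketch. The simulator `run` below is an assumption, not a standard implementation; in particular it assumes that a SUBTRACT underflow sets a negative flag and wraps the accumulator, and that BRZ does not branch while the flag is set. The quine relies on behaviour of this kind when it outputs the 0 in mailbox 14 and then subtracts 1.

```python
def run(program, inputs=(), max_steps=10000):
    """Run LMC machine code; return the list of OUTBOX values."""
    mem = list(program) + [0] * (100 - len(program))
    inbox, outputs = iter(inputs), []
    acc, negative, pc = 0, False, 0
    for _ in range(max_steps):
        instruction = mem[pc]
        pc += 1
        opcode, address = divmod(instruction, 100)
        if instruction == 0:                 # HLT
            return outputs
        elif instruction == 901:             # INP
            acc, negative = next(inbox), False
        elif instruction == 902:             # OUT
            outputs.append(acc)
        elif opcode == 1:                    # ADD (wrap-around assumed)
            acc, negative = (acc + mem[address]) % 1000, False
        elif opcode == 2:                    # SUB (flag and wrap-around assumed)
            negative = acc < mem[address]
            acc = (acc - mem[address]) % 1000
        elif opcode == 3:                    # STA
            mem[address] = acc
        elif opcode == 5:                    # LDA
            acc, negative = mem[address], False
        elif opcode == 6:                    # BRA
            pc = address
        elif opcode == 7:                    # BRZ
            if acc == 0 and not negative:
                pc = address
        elif opcode == 8:                    # BRP
            if not negative:
                pc = address
        else:
            raise ValueError(f"unknown instruction {instruction:03d}")
    raise RuntimeError("step limit reached without HLT")


# Mailboxes 00 to 18 of the quine, in order.
QUINE = [515, 320, 518, 319, 619, 902, 217, 712, 518, 117,
         318, 602, 516, 902, 0, 605, 500, 1, 500]

print(run(QUINE) == QUINE)  # True: the output equals the program's own machine code
```

The loop outputs mailboxes 00 to 17 one at a time through the rewritten LOAD instruction at position 19, stops when it meets the end marker 1, and then outputs the saved original value 500 for mailbox 18.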
- CARDboard Illustrative Aid to Computation (another instructional model)