Language and Numerical Structures
by
James Redin



  Foreword

Back in the 60's, during my days in engineering school, I already had a passion for calculating devices. First it was the slide rule, and later, at the beginning of the 70's, it shifted to pocket calculators.

Yes, it all started in the 70's. ARISTO was always my favorite slide-rule brand, so I couldn't resist the temptation of buying this ARISTO M36 pocket calculator from a friend who came from Germany. Although it had only the four basic arithmetic functions, it is indeed the best calculator I have ever had.

Its power and flexibility came from two special keys: a Memory key and a Swap key. The Memory key was used in combination with the arithmetic functions and the Equal key. The Swap key was also used in combination with other keys. For example: [Memory][Swap] would swap the memory contents with the display value. Can you do that with your standard non-scientific calculator?

It didn't have scientific functions, not even the square root, but because of its flexibility it was easy (and fun) to compute a square root by applying to the number the Newton-Raphson iteration a few times.

I used it while studying for my Master's degree in Industrial and Management Engineering at Columbia University, and later for several years during my professional career, until one day somebody stole it from my office. When I tried to get another one, it was too late: ARISTO had already moved to LCD display technology, and their calculators looked more like the standard ones. I was disappointed to discover that the Memory key had disappeared and a Square Root key had taken the place of the Swap key. A few years later, ARISTO succumbed to the competition.

Then the microcomputers arrived, and MS-DOS, with the unbelievable power of TSR memory-resident applications, started inundating the market and my attention. Of course, my natural reaction was to use all this wonderful power to write an emulator of my beloved ARISTO. It was 1992, and since this was just my after-hours hobby, it took much more time than I thought. To start with, I had to learn some C and the usage of the TSR memory management functions.

The project was running fine until I had to write the routines to perform the data entry. Here I decided to include most of the capabilities I had found in other calculators during my quest for ARISTO-like functionality. One of these capabilities was the usage of a Hundred key [00] (to replace pressing the Zero key twice while entering cents or numbers with many zeroes). So, since the magic of the microprocessor was so easy to handle, I said to myself: why not a Thousand key [000]?

Soon I discovered why the calculator manufacturers never included a Thousand key and very few included the Hundred key. It was easier to press the Zero key two or three times as required. But it was also at this point that I realized that not only the Hundred key, but also the Thousand key and even a Million key could prove to be very helpful if they were used in a mnemonic mode to enter a number in the same way as we think of the number.

Think of this number, for example: 2001. I'm sure that you read it mentally as "Two Thousand One." Your mind didn't read the number as "Two Zero Zero One."

Shouldn't it be easier to enter a number on a keyboard in the same way as our mind thinks of the number? For example, how about having a calculator equipped with three additional keys: a Hundred key [00], a Thousand key [000], and a Million key [0x6]? Entering the number 2001 would require pressing the following sequence: [2][000][1] (for Two-Thousand-One).

Soon I realized that there were some unexpected results. First, it was no longer required to convert mentally the number into its sequence of digits prior to entering the number (this can be challenging with certain numbers), but even better, the number of keystrokes required to enter a number was in most cases less than the number of keystrokes required by the conventional procedure.

I checked the idea with my daughter-advisor Marisol (then 14) and she liked it. "Great, so let's write some functions that would do that," I said to myself. It sounded simple and easy to implement. Well, it was not. Marisol would be able to help me now that she is going for her MS at MIT, but not in those days. So it took me a couple of years to develop my concept of the Verbal Numerals and the algorithms required to do the conversion.

I filed a patent application on February 11, 1992, and finally, after four years of browsing through the pages of the book "Patent It Yourself" by David Pressman and writing countless pages answering observations posted by the Patent Office, I got my patent 5623433 issued on April 22, 1997.

It has been a long and interesting journey but, of course, hopefully this is just the beginning...

James Redin

April 22, 1997.


  Background and Objectives

When a number is expressed orally, several rules, which are sometimes language dependent, must be applied in order to determine the proper way to express the number.

For example, the number 100,000 cannot be expressed as "One thousand hundred" (although it can actually be considered as made up of a thousand hundreds); instead, it must be expressed as "One hundred thousand." Another interesting example is the way the number 1,200 can be expressed in English; note that in this case the forms "Twelve hundred" and "One thousand two hundred" are equally acceptable. This does not hold true in other languages such as Spanish, where only the equivalent of the second form is accepted: "Mil doscientos."

The purpose of this document is to analyze the rules that govern the expression of numbers in the English language, as well as their similarities and main differences with other Western languages such as Spanish, French and Dutch. This may prove to be useful in designing a new method to input numbers in numerical devices such as computers and electronic calculators.

 

  Small Numbers

In every language, special names and/or special naming conventions are applied to numbers under 100. For the purpose of this analysis these numbers will be referred to as "small numbers." The smaller the number, the more specific the name. In English, numbers ranging from 0 to 12 have proper single names which do not follow any rule at all; each name is unique and shows no relationship with the others. Numbers ranging from 13 to 19 also have single names, but this time the name is formed by combining a root taken from the names assigned to numbers 3 to 9 with the suffix "teen." A similar approach is used to name the multiples of ten from 20 to 90, by using the suffix "ty." The remaining numbers from 21 to 99, which are not included in the latter set, have a composite name made up of the name of the immediately lower multiple of 10 plus the unique name assigned to the remaining number of units; for example, the number 37 is expressed as "Thirty-seven."

In Spanish a similar scheme is applied, except that numbers from 0 to 15 and the multiples of ten from 20 to 90 have been assigned single names, while every other number in the range has a composite name constructed in a way similar to the one described above for English; for example, the number 17 is expressed as "Diez y siete." Many numbers like the one in this example have been assigned special concatenated single names; the proper way to write the name of the number 17 in Spanish is "Diecisiete" instead of "Diez y siete."

It is interesting to notice the way some small numbers are constructed in French. For example, the numbers 70 and 80, instead of being assigned single names like their counterparts in other languages and the other multiples of ten in the same language, are expressed with the composite names "Soixante-dix" and "Quatre-vingts," which translated literally into English would mean "Sixty Ten" and "Four Twenties."

It follows from the discussion above, that small numbers have no general naming conventions, and the way they are expressed greatly depends on the language applied.

 

  Unit Structures

In the case of numbers greater than 99, special single names have been assigned to some powers of ten. The most common are the names assigned to 100, 1,000 and 1,000,000, which in English are "Hundred," "Thousand" and "Million." Equivalent names are also used in other languages, for example: "Cien," "Mil," and "Millón" in Spanish; "Cent," "Mille," and "Million" in French.

Larger powers of ten have also been assigned single names, but these do not always have consistent meanings. The most typical case is the name "Billion," which in the United States means one thousand millions (1,000,000,000) while in England it has traditionally meant one million millions (1,000,000,000,000). By the same token, the name "Trillion" in the United States represents a unit followed by twelve zeroes, while in England it has traditionally represented a unit followed by eighteen zeroes. French notation avoids this inconsistency by assigning different names to the U.S. Billion and the English Billion: "Milliard" is used for 1,000,000,000, and "Billion" for 1,000,000,000,000.

For the purpose of this analysis, all the powers of ten greater or equal to 100 which have been assigned single names will be referred to as "unit structures."

The following table shows the different names assigned in several languages to the main unit structures:

  Unit         English    English
  Structure    (US)       (UK)       Spanish    French
  ---------    --------   --------   --------   --------
  10**2        Hundred    Hundred    Cien       Cent
  10**3        Thousand   Thousand   Mil        Mille
  10**6        Million    Million    Millón     Million
  10**9        Billion    -          -          Milliard
  10**12       Trillion   Billion    Billón     Billion
  10**18       -          Trillion   Trillón    Trillion
  ------------------------------------------------------

It follows from these observations, that the numbers "Hundred," "Thousand," and "Million" are the only unit structures which remain consistent across all the Western languages.

  Large Numbers and Order Parameters

Consistent with the definition given above for small numbers, a "large number" will be defined as any integer greater than 99; therefore, every large number is greater than or equal to at least one unit structure. For the purpose of this analysis, the largest available unit structure that is not greater than the number will be defined as the "Order" of the number. The Order of the number is a parameter of the number and can be used to represent the number by using the following expression:

number = int(number/Order) x Order + rem(number/Order)

where int(number/Order) represents the result of applying an integer division of the number by its Order, and rem(number/Order) represents the remainder of the same operation. For the purpose of this analysis, the values of int(number/Order) and rem(number/Order) will be defined as the "Factor" of the number and the "Module" of the number, respectively. Therefore, the above expression can be written as follows:

number = Factor x Order + Module

The above expression may be generalized to all numbers if the Order of small numbers is defined as 1. Notice that in this case the Factor is equal to the value of the small number and the Module is zero.

The following table shows some examples of the concepts defined above:

  number       Factor   Order       Module
  ---------    ------   ---------   -------
  100,000      100      1,000       0
  350          3        100         50
  99           99       1           0
  0            0        1           0
  2,457,128    2        1,000,000   457,128
  457,128      457      1,000       128
  ------------------------------------------
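
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names UNIT_STRUCTURES and decompose are hypothetical, and the set of unit structures is assumed to be limited to Hundred, Thousand and Million.

```python
# Assumed set of unit structures: Million, Thousand, Hundred (largest first).
UNIT_STRUCTURES = (1_000_000, 1_000, 100)


def decompose(number):
    """Return (Factor, Order, Module) for a non-negative integer."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # Order: the largest unit structure not greater than the number,
    # or 1 when the number is a small number.
    order = next((u for u in UNIT_STRUCTURES if u <= number), 1)
    factor, module = divmod(number, order)
    return factor, order, module


for n in (100_000, 350, 99, 0, 2_457_128, 457_128):
    factor, order, module = decompose(n)
    # Every row satisfies number = Factor x Order + Module.
    assert n == factor * order + module
    print(n, factor, order, module)
```

The loop runs over the same six numbers used in the table, so the printed values can be compared with it row by row.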

  Verbal Numerical Expressions

As shown in the examples of the previous section, in some cases the Factor and/or the Module can be large numbers. In these cases, the original expression can be expanded recursively until all the Factors and Modules of the expression are small numbers as described in the following algorithm:

(1) find the Order, Factor and Module of the number;

(2) if the Factor is a large number, apply recursively steps (1) to (5) to obtain the Factor expression and then enclose the factor expression within parentheses, otherwise use the Factor as the Factor expression;

(3) if the Module is a large number, apply recursively steps (1) to (5) to obtain the Module expression, otherwise use the Module as the Module expression;

(4) if the Order is greater than 1, append "Factor expression x Order" to the number expression; otherwise, append "Factor expression" to the number expression;

(5) if the module is greater than 0, append "+ Module expression" to the number expression.

Figure A1 shows a flow diagram for the algorithm described above. Notice that the recursive procedure is applied twice, however, due to the similarity of steps (2) and (3), the procedure can be simplified to use only one single call to the recursive procedure by subtracting the product Factor x Order from the number and then repeating the procedures until the result is zero. The simplified procedure is shown in Figure A2.

As an illustration, applying the above algorithm to the number 457,128 yields the following expressions:
  457 x 1,000 + 128
  (4 x 100 + 57) x 1,000 + 128
  (4 x 100 + 57) x 1,000 + 1 x 100 + 28.
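
A minimal Python sketch of steps (1) to (5) follows. It is not a transcription of Figure A1 or A2; the function name verbal_expression is hypothetical, and the unit structures are assumed to be Hundred, Thousand and Million only.

```python
UNIT_STRUCTURES = (1_000_000, 1_000, 100)  # assumed set, largest first


def verbal_expression(number):
    """Build the 'Factor x Order + Module' expression of a non-negative integer."""
    # Step (1): Order, Factor and Module.
    order = next((u for u in UNIT_STRUCTURES if u <= number), 1)
    factor, module = divmod(number, order)
    if order == 1:
        return str(factor)              # small numbers stand for themselves
    # Step (2): expand a large Factor recursively and parenthesize it.
    factor_expr = verbal_expression(factor)
    if factor > 99:
        factor_expr = f"({factor_expr})"
    # Step (4): append "Factor expression x Order".
    expression = f"{factor_expr} x {order:,}"
    # Steps (3) and (5): expand and append a non-zero Module.
    if module > 0:
        expression += f" + {verbal_expression(module)}"
    return expression


print(verbal_expression(457_128))
# (4 x 100 + 57) x 1,000 + 1 x 100 + 28
```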

Now, it is interesting to realize that the expressions obtained by the application of above procedure have a total consistency with the verbal expression of the number. In fact, notice that the name of the number can be easily obtained just by arranging the names of the numbers and unit structures of the expression in the same order as they appear in the expression without paying attention to the arithmetic symbols used in the expression. For example, the final expression can be used to obtain the name of the number 457,128 by using exclusively the names of the small numbers and unit structures as follows:

  (4 x 100 + 57) x 1,000 + 1 x 100 + 28

four hundred fifty-seven thousand one hundred twenty-eight.

The same procedure with minor adaptations can be used in languages other than English.

Because of this correlation with the verbal expression of a number, expressions obtained by applying the algorithm described above will be referred to as "Verbal Numerical Expressions," and the algorithm will be called "Verbal Expression Algorithm."

Notice that all the components of a verbal numerical expression must follow the order "Factor x Order + Module"; if any component of the expression is permuted, the new expression, according to the commutative law of numbers, will still represent the same number, but it will no longer be consistent with its verbal notation. For example, the following expression:

  1,000 x (4 x 100 + 57) + 1 x 100 + 28

still represents the number of the example, however, it is no longer consistent with its verbal notation. Notice that "thousand four hundred fifty-seven one hundred twenty-eight" is not even the name of a valid number.

  Properties of Verbal Numerical Expressions

By observing the nature of the verbal numerical expressions, the following properties can be found:

a) the Module is always smaller than the Order;

b) whenever the Order is smaller than the largest unit structure available in the set of unit structures used to construct a verbal numerical expression, the Factor is smaller than the Order;

c) except for the representation of zero, Factor and Module components are always greater than zero; and

d) Order components are always greater than 1.

Notice that consistent with properties (c) and (d), small numbers are never expressed as "Factor x Order + Module" in a verbal numerical expression.

Above properties may be used to determine if a given expression is a valid verbal numerical expression.
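
As a rough sketch, properties (a) to (d) can be checked for a single "Factor x Order + Module" term as shown below. The function name is_valid_term is hypothetical, a Module of zero is taken to mean that no Module component is written, and the unit structures are assumed to be Hundred, Thousand and Million.

```python
UNIT_STRUCTURES = (100, 1_000, 1_000_000)  # assumed set of unit structures


def is_valid_term(factor, order, module=0):
    """Check one 'Factor x Order + Module' term against properties (a) to (d)."""
    if order not in UNIT_STRUCTURES:      # (d) the Order is a unit structure > 1
        return False
    if factor <= 0 or module < 0:         # (c) written components are positive
        return False
    if module >= order:                   # (a) Module smaller than Order
        return False
    if order < max(UNIT_STRUCTURES) and factor >= order:
        return False                      # (b) Factor smaller than Order
    return True


print(is_valid_term(100, 1_000))    # "one hundred thousand"  -> True
print(is_valid_term(1_000, 100))    # "one thousand hundred"  -> False
print(is_valid_term(12, 100))       # "twelve hundred"        -> True
```

The check only covers one term; a full validation would also have to apply it recursively to a large Factor or Module.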

  Structural Verbal Notation and Verbal Numerals

In the previous section it was shown that the name of a number is actually the representation of a verbal numerical expression. Another way to represent a verbal numerical expression is by assigning symbols to the unit structures (e.g., "H," "T" and "M" for Hundred, Thousand and Million) and combining them with the small numbers used in the verbal numerical expression in a mode similar to the way the name of the number is constructed orally.

As an illustration, applying the verbal expression algorithm to the number 35,178,971 yields the following intermediate and final verbal numerical expressions:
  35 x 1,000,000 + 178,971
  35 x 1,000,000 + 178 x 1,000 + 971
  35 x 1,000,000 + (1 x 100 + 78) x 1,000 + 971
  35 x 1,000,000 + (1 x 100 + 78) x 1,000 + 9 x 100 + 71.

These expressions can also be expressed symbolically with sequences of numbers and unit structure symbols as follows:
  35M178971
  35M178T971
  35M1H78T971
  35M1H78T9H71.

Some languages, like Spanish, omit the pronunciation of the Factor when it is equal to 1. For example, the number "One thousand one hundred" (1T1H) is expressed in Spanish as "Mil cien" (TH). This seems to be a convenient feature to include in a verbal numeral because it reduces the number of components required by some verbal numerals; for example, the number illustrated above could also be represented as follows:

   35MH78T971
   35MH78T9H71.

Since a verbal numerical expression represents a number, the symbolic representation of a verbal numerical expression will also represent a number. A numeral is the symbolic representation of a number, therefore all the symbolic representations of verbal numerical expressions are numerals.

For the purpose of this analysis, the symbolic representation of a verbal numerical expression will be called "Verbal Numeral."

  Conversion of numbers into verbal numerals

Since a verbal numeral is the symbolic representation of a verbal numerical expression, the procedure used to obtain the verbal numeral of a number should be similar to the verbal expression algorithm.

The following is an algorithm that can be used to convert a number into a verbal numeral:

(1) find the Order, Factor and Module of the number;

(2) if the Factor is a large number, apply recursively steps (1) to (4) to obtain the verbal numeral of the Factor, otherwise use the decimal representation of the Factor as the verbal numeral of the Factor;

(3) if the Module is a large number, apply recursively steps (1) to (4) to obtain the verbal numeral of the Module, otherwise use the decimal representation of the Module as the verbal numeral of the Module;

(4) obtain the verbal numeral of the number by appending to the verbal numeral of the Factor the symbol of the Order and the verbal numeral of the Module, in that order.

Figure A3 shows a flow-diagram with the simplified version of this algorithm.
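
A minimal Python sketch of steps (1) to (4) is given below. It is not the simplified procedure of Figure A3; the names to_verbal_numeral and omit_unit_factor are hypothetical, the symbols H, T and M are the ones introduced earlier, and a Module of zero is assumed to be left out of the numeral, as in the examples of this document.

```python
# Assumed unit structures and their symbols, largest first.
UNIT_STRUCTURES = ((1_000_000, "M"), (1_000, "T"), (100, "H"))


def to_verbal_numeral(number, omit_unit_factor=False):
    """Convert a non-negative integer into a verbal numeral string."""
    for value, symbol in UNIT_STRUCTURES:
        if value <= number:                       # step (1): this is the Order
            factor, module = divmod(number, value)
            if factor == 1 and omit_unit_factor:
                head = ""                         # "Mil cien" style: drop a Factor of 1
            else:
                head = to_verbal_numeral(factor, omit_unit_factor)   # step (2)
            tail = to_verbal_numeral(module, omit_unit_factor) if module else ""  # step (3)
            return head + symbol + tail           # step (4)
    return str(number)                            # small number: decimal digits


print(to_verbal_numeral(35_178_971))                          # 35M1H78T9H71
print(to_verbal_numeral(35_178_971, omit_unit_factor=True))   # 35MH78T9H71
```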

  Constructing the names of large numbers

There are many applications where it is desirable to build the textual name of a number. Examples of such applications are programs and routines used to print the textual dollar amount on a check. Notice that the algorithm described in the previous section can easily be modified to build up the name of a number by replacing the symbols of the unit structures and the decimal representations of the small numbers with their corresponding names.

Figure A4 shows a flow diagram with an algorithm to find the English name of a number. It may be easily adapted to other languages; for example, when it is implemented in Spanish, the explicit expression of the number "one" preceding a unit structure must be suppressed where the language omits it, as in "Mil cien."
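
The following Python sketch illustrates the idea for English. It is not the algorithm of Figure A4; the names ONES, TENS, UNIT_NAMES, small_name and number_name are hypothetical, and only Hundred, Thousand and Million are assumed as unit structures.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "- - twenty thirty forty fifty sixty seventy eighty ninety".split()
UNIT_NAMES = ((1_000_000, "million"), (1_000, "thousand"), (100, "hundred"))


def small_name(number):
    """Name of a small number (0 to 99)."""
    if number < 20:
        return ONES[number]
    tens, units = divmod(number, 10)
    return TENS[tens] + (f"-{ONES[units]}" if units else "")


def number_name(number):
    """English name of a non-negative integer, built as Factor, Order, Module."""
    for value, name in UNIT_NAMES:
        if value <= number:
            factor, module = divmod(number, value)
            words = f"{number_name(factor)} {name}"
            if module:
                words += " " + number_name(module)
            return words
    return small_name(number)


print(number_name(457_128))
# four hundred fifty-seven thousand one hundred twenty-eight
```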

  Conversion of verbal numerals into numbers

The verbal expression algorithm can also be used to develop an algorithm to obtain the value of the number represented by a verbal numeral. Basically, this procedure should be equal to the verbal expression algorithm, except that this time the expression components are actually computed and added to the number, and the Order is extracted directly from the verbal numeral rather than computed from the number. The algorithm may be summarized as follows:

(1) find the Order symbol of the verbal numeral by locating the right-most occurrence of the symbol of the largest structure contained in the verbal numeral; if found, the Order is the value represented by the Order symbol; otherwise the Order is 1;

(2) get the value of the Factor by using the segment of the verbal numeral located at the left side of the Order symbol; if no Order symbol is available, use the whole verbal numeral as the segment. If no segment is available, the value of the Factor is 1. If the segment contains at least one structure symbol, apply recursively steps (1) to (4) on this segment to obtain the value of the Factor; otherwise use the value represented by the digits in the segment as the value of the Factor;

(3) get the value of the Module by using the segment of the verbal numeral located at the right side of the Order symbol; if no Order symbol or no segment is available, the value of the Module is zero. If the segment contains at least one structure symbol, apply recursively steps (1) to (4) on the segment to obtain the value of the Module; otherwise use the value represented by the digits in the segment as the value of the Module;

(4) determine the value of the number by using the following expression: Factor x Order + Module.

Figure A5 shows a simplified version of the above algorithm with a provision to check the validity of the verbal numeral.

The algorithm developed in this section can be implemented in the logic of a numerical data-entry device such as a calculator or computer to accept numbers entered in structured mode by using keys with structure symbols. The same procedure can also be adapted to enter numbers in voice recognition devices.
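
Steps (1) to (4) can be sketched in Python as follows. This is not the procedure of Figure A5: no validity check of the verbal numeral is attempted, the names SYMBOL_VALUES and from_verbal_numeral are hypothetical, and an empty numeral is assumed to stand for zero.

```python
# Assumed structure symbols and their values.
SYMBOL_VALUES = {"H": 100, "T": 1_000, "M": 1_000_000}


def from_verbal_numeral(numeral):
    """Return the integer represented by a verbal numeral such as '35MH78T971'."""
    symbols = [ch for ch in numeral if ch in SYMBOL_VALUES]
    if not symbols:
        # No Order symbol: the Order is 1 and the digits are the value.
        return int(numeral) if numeral else 0
    # Step (1): right-most occurrence of the largest structure symbol.
    largest = max(symbols, key=SYMBOL_VALUES.get)
    position = numeral.rfind(largest)
    left, right = numeral[:position], numeral[position + 1:]
    # Step (2): Factor from the left segment, 1 when the segment is missing.
    factor = from_verbal_numeral(left) if left else 1
    # Step (3): Module from the right segment, 0 when the segment is missing.
    module = from_verbal_numeral(right) if right else 0
    # Step (4): Factor x Order + Module.
    return factor * SYMBOL_VALUES[largest] + module


print(from_verbal_numeral("35M1H78T9H71"))   # 35178971
print(from_verbal_numeral("TMH"))            # 1000000100
```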

  Advantages of Verbal Numerals

One obvious advantage of the Structural Verbal Notation is its consistency with the way the number is expressed orally. This makes it possible to build a valid verbal numeral without applying the procedures described before, just by replacing the unit structures used to express the name of the number with the corresponding symbols and writing down the remaining numbers in the corresponding sequence.

In many instances verbal numerals require fewer symbols than decimal numerals. Here are several examples:
  3,000,005         3M5
  245,000,072       245M72
  350,000           3H50T or 350T
  2,000,305         2M3H5 or 2M305
  1,001,000         1M1T or MT
  1,000,000,000     1TM or TM
  1,000,000,100     1TM1H or TMH

A large number can usually be represented by several verbal numerals; this provides a flexibility not available with decimal numerals.

Another advantage of verbal numerals (which is not so straightforward) is their capability to grow gradually as the number is being pronounced orally. This property does not exist in the corresponding decimal numerals, as shown in the following example:

In the construction of the number "Five Million Three Hundred Thousand Six" the following intermediate numerals are involved:

  Numeral       Decimal     Verbal
  Name          Numeral     Numeral
  -----------   ---------   --------
  Five...       5           5
  million...    5,000,000   5M
  three...      5,000,003   5M3
  hundred...    5,000,300   5M3H
  thousand...   5,300,000   5M3HT
  six           5,300,006   5M3HT6
  ----------------------------------

Notice that while the decimal numeral changes substantially in each intermediate step of the number, the verbal numeral does not change except for the addition of the new component to the previous numeral.
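
The gradual growth of the verbal numeral can be reproduced with a short Python sketch that decodes every prefix of the key sequence 5, M, 3, H, T, 6. The decoder value_of is a hypothetical helper written for this illustration (without validity checks), assuming the symbols H, T and M.

```python
SYMBOL_VALUES = {"H": 100, "T": 1_000, "M": 1_000_000}  # assumed symbols


def value_of(numeral):
    """Decode a verbal numeral; an empty numeral is taken as zero."""
    symbols = [ch for ch in numeral if ch in SYMBOL_VALUES]
    if not symbols:
        return int(numeral) if numeral else 0
    largest = max(symbols, key=SYMBOL_VALUES.get)
    position = numeral.rfind(largest)
    factor = value_of(numeral[:position]) if position > 0 else 1
    return factor * SYMBOL_VALUES[largest] + value_of(numeral[position + 1:])


steps = []
numeral = ""
for key in ["5", "M", "3", "H", "T", "6"]:
    numeral += key                      # the verbal numeral only grows
    steps.append((numeral, value_of(numeral)))

for verbal, decimal in steps:
    print(f"{verbal:<8} {decimal:,}")
```

Each new keystroke is appended to the previous verbal numeral, while the decoded decimal value is recomputed from scratch at every step.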

The above advantages can be used to streamline and simplify data entry in numerical devices such as calculators and computers.

  Extended Numerical Keyboards

All numerical keyboards have two categories of keys:

a) Data-entry keys.
b) Function keys.

Data-entry keys are the ones used to input a number into the device read buffer. In any numerical keyboard, data-entry keys can be clearly differentiated from function keys by the fact that pressing a data-entry key does not end the number entry procedure.

On the other hand, function keys are the ones that, whenever pressed, end the number entry procedure and start another type of procedure associated with the nature of the function key pressed.

The most common data-entry keys are the digits 0 to 9 and the decimal point (dot key). Some scientific calculators also have an exponent key which allows the entry of numbers in scientific notation. Some business calculators have a "00" and a "000" key (multi-zero keys) to avoid pressing the zero digit several consecutive times when appropriate (e.g., to enter zero cents in a currency amount). Another example of a data-entry key is the delete key, which deletes the last digit entered.

The most common function keys are the following:
  +  to add the number to the value in accumulator.
  -  to subtract the number from the value in accumulator.
  x  to multiply the number by the value in accumulator.
  /  to divide the value in accumulator by the number.
  =  to display value in accumulator.
  CE to clear entry in read buffer.
  C  to clear entry in read buffer and value in accumulator.
  M+ to add number to value in memory.
  M- to subtract number from value in memory.
  MC to clear value in memory.
  MR to copy value in memory to value in accumulator.

Scientific and business calculators also include specialized function keys used to manipulate the numbers in the memory and/or accumulators. In some cases, these functions can be programmed by the user to perform special manipulations.

Figure A6 shows the main processing routine applied by an electronic calculator. The procedure basically consists of three main steps:

1) Initialization - Occurs when the calculator is turned on, all its registers and display are cleared and initialized.

2) Number Input Operation - while the user enters a consecutive sequence of data-entry keys, the calculator reads the number into an input number register and displays the contents of this register after every keystroke; this process ends when a function key is entered, then, the control flow advances to step (3).

3) Function Processing - the calculator performs the function corresponding to the function key entered in step (2) on the number stored in the input number register, the result is stored in an accumulator register and displayed in the calculator display. The input number register is cleared and the control flow returns to step (2).

Figure A7 shows in more detail the logic involved in the number input operation of a conventional calculator. This logic basically reads a data-entry digit key and then updates the number stored in the input number register by performing an operation equivalent to multiplying such number by ten and adding the value of the digit entered, the updated number is displayed and the process is repeated after every keystroke until a function key is entered; provisions are taken to avoid overflow of the number in the input number register, and to track the position of the decimal point. In summary, this procedure accepts a sequence of digits corresponding to the decimal numeral entered by the user.
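
The conventional number input operation described above can be sketched in Python as follows. This is not a transcription of Figure A7; the function enter_decimal and its max_digits parameter are hypothetical, keys are modeled as single characters, and any key other than a digit or the decimal point is treated as a function key.

```python
def enter_decimal(keys, max_digits=8):
    """Accumulate a decimal numeral key by key and return the entered value."""
    number = 0          # input number register, kept as an integer
    digits = 0          # digits accepted so far
    decimals = None     # None until the decimal point is pressed
    for key in keys:
        if key == ".":
            if decimals is None:
                decimals = 0            # start tracking the decimal position
        elif key.isdigit():
            if digits >= max_digits:
                continue                # overflow guard: ignore extra digits
            if key == "0" and number == 0 and decimals is None:
                continue                # leading zeros do not use up capacity
            number = number * 10 + int(key)   # multiply by ten, add the digit
            digits += 1
            if decimals is not None:
                decimals += 1
        else:
            break                       # a function key ends the number input
    return number / 10 ** decimals if decimals else number


print(enter_decimal("2001"))      # 2001
print(enter_decimal("12.5+3"))    # 12.5
```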

As described in the previous section, verbal numerals have several advantages over decimal numerals and therefore can be used with advantage during a number input operation. For this purpose, an extended numerical keyboard can be defined with three additional data-entry keys: one structure key for each unit structure in the set "H, T, M". This arrangement of keys would allow the data-entry of either decimal or verbal numerals in structured mode. To provide additional flexibility, a Swap key can be also included to allow the user to change the structured mode into conventional mode or vice versa during an input number operation.

Of course, since verbal numerals have no practical computational value, the number displayed by the device must be the decimal representation of the number even if the number is entered as a verbal numeral. In order to accomplish this objective, the device must store all the verbal numeral components (digits and/or structures) entered during the same input number operation, and use them as input to convert the verbal numeral into a decimal numeral every time a new component is added to the stored verbal numeral.

Figure A8 shows an example of a logical procedure that could be applied to implement the input number operation in a device able to accept decimal numerals and/or verbal numerals. According to the implementation shown in Figure A8, the input number operation always starts in structured mode, and therefore the sequence of digits and/or structures is stored in a memory buffer that will be referred to as the "verbal numeral buffer." The maximum size required for the verbal numeral buffer may be determined by converting the largest number that can be displayed by the device into a verbal numeral and counting the number of components in the resulting verbal numeral. For example, if the maximum capacity of the display is 8 digits, then the largest number will be 99,999,999, which corresponds to 99M9H99T9H99, and the capacity of the buffer should be 12 key code cells.
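
The buffer size calculation can be checked with a short Python sketch. The helper names to_verbal_numeral and buffer_cells are hypothetical, and the conversion assumes the H, T and M symbols with an explicit Factor of 1 and an omitted zero Module.

```python
UNIT_STRUCTURES = ((1_000_000, "M"), (1_000, "T"), (100, "H"))  # assumed set


def to_verbal_numeral(number):
    """Convert a non-negative integer into a verbal numeral string."""
    for value, symbol in UNIT_STRUCTURES:
        if value <= number:
            factor, module = divmod(number, value)
            tail = to_verbal_numeral(module) if module else ""
            return to_verbal_numeral(factor) + symbol + tail
    return str(number)


def buffer_cells(display_digits):
    """Components in the verbal numeral of the largest displayable number."""
    largest = 10 ** display_digits - 1
    return len(to_verbal_numeral(largest))


print(to_verbal_numeral(99_999_999), buffer_cells(8))   # 99M9H99T9H99 12
```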

Once a digit or structure has been accepted and stored in the verbal numeral buffer, the algorithm shown in Figure A5 may be used to convert the new sequence of verbal numeral components stored in the verbal numeral buffer into a number; the number obtained by this conversion replaces the previous contents of the input number register; then, the new number is displayed and the device waits for a new keystroke.

If during the number input operation the user depresses the swap key, the data-entry mode changes from structured mode to conventional mode. In conventional mode the verbal numeral buffer is cleared because it is no longer required, and the code of each data-entry key depressed is used directly in updating the number with a procedure similar to the one used by conventional calculators (see Figure A7); the only difference is that if a structure key is depressed while the device is in conventional mode, the keystroke is treated as if the zero key had been pressed multiple times. Therefore, in conventional data-entry mode structure keys behave as conventional multi-zero keys.

Depressing the swap key during the course of a number input operation, while the data-entry mode is set to conventional mode and no decimal point has been entered, will change the data-entry mode to structured mode. The contents of the input number register are used to convert the number into a verbal numeral by using the algorithm described in Figure A3. The result of this conversion is stored in the verbal numeral buffer.

Entering a decimal point automatically changes the data-entry mode to conventional mode. This is because although the decimal portion of a number could in theory be entered in structured mode, it is not a practical way because users are not always familiar with the verbal expressions of a decimal fraction and these verbal expressions may vary widely from language to language.
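
The interplay of the two data-entry modes can be illustrated with the Python sketch below. It is a simplified model, not the procedure of Figure A8: the class NumberEntry, the key code "S" for the swap key and the helper functions are hypothetical, and decimal points, overflow and validity checks are left out.

```python
SYMBOL_VALUES = {"M": 1_000_000, "T": 1_000, "H": 100}  # assumed, largest first


def from_verbal_numeral(numeral):
    """Decode a verbal numeral; an empty numeral is taken as zero."""
    symbols = [ch for ch in numeral if ch in SYMBOL_VALUES]
    if not symbols:
        return int(numeral) if numeral else 0
    largest = max(symbols, key=SYMBOL_VALUES.get)
    position = numeral.rfind(largest)
    factor = from_verbal_numeral(numeral[:position]) if position > 0 else 1
    module = from_verbal_numeral(numeral[position + 1:])
    return factor * SYMBOL_VALUES[largest] + module


def to_verbal_numeral(number):
    """Encode a non-negative integer as a verbal numeral."""
    for symbol, value in SYMBOL_VALUES.items():
        if value <= number:
            factor, module = divmod(number, value)
            tail = to_verbal_numeral(module) if module else ""
            return to_verbal_numeral(factor) + symbol + tail
    return str(number)


class NumberEntry:
    """Input number register with a structured and a conventional mode."""

    def __init__(self):
        self.structured = True   # the input operation starts in structured mode
        self.buffer = ""         # verbal numeral buffer
        self.value = 0           # input number register

    def press(self, key):
        if key == "S":                                   # swap key
            self.structured = not self.structured
            if self.structured and self.value:
                self.buffer = to_verbal_numeral(self.value)
            else:
                self.buffer = ""                         # not needed in conventional mode
        elif self.structured:
            self.buffer += key                           # store the new component
            self.value = from_verbal_numeral(self.buffer)
        elif key in SYMBOL_VALUES:
            self.value *= SYMBOL_VALUES[key]             # acts as a multi-zero key
        else:
            self.value = self.value * 10 + int(key)      # conventional digit entry
        return self.value                                # number to be displayed


entry = NumberEntry()
print([entry.press(key) for key in "2T1"])   # [2, 2000, 2001]
```

In this model the value returned after every keystroke plays the role of the number shown on the display.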

Another feature included in the procedure described in Figure A8 is the possibility of using the swap key as another function key if it is depressed before a new input number operation starts (while the input number register is set to zero). This hybrid nature of the swap key can provide additional possibilities to the nature of the functions used in the calculator; for example, "Swap" followed by a "M=" key could swap the contents of the memory with the contents of the accumulator without losing the value stored in the accumulator, as happens with most conventional calculators. Just think of the possibilities...

James H. Redin


Copyright © James H. Redin. All rights reserved (1996-2007)