Software Spotlight: Compiling Word for Windows from OS/2 1.2
He's also got it posted on his blog. Please give him a round of applause!
After a bit of hiatus, I'm back with more retrocomputing madness. Several years ago, Microsoft donated the source code of Word for Windows 1.1a to the Computer History Museum. After a bit of initial news flurry back in 2014, the topic mostly sat untouched. Microsoft had been kind enough to also include a full set of tools to actually compile Winword from source from both DOS and OS/2. A few people, including neozeed from Virtually Fun managed to compile the code from source from DOS. However, no one has really investigated too deeply into the bowels of this codebase. Furthermore, I could find no documented evidence that anyone had attempted an OS/2 build.
When I decided to dig into Word for Windows's (or Winword for short) source code, I found more than a few surprises. It quickly became clear that this was a topic that needed a deeper dive. It also gave me the good excuse to finally talk a bit about OS/2 which was something on my list of topics for awhile. As it turn out, I would fall fairly far down the rabbit hole before coming back up for daylight. As usual, my findings are summarized in YouTube video form, and it would mean a lot of you took a look.
NOTE: There is errata for this video posted below.
Word for Windows
I'm really not sure how much introduction Word actually needs. It's one of the most popular word processors in existence, and I can't help but think that almost everyone has used it at one point or another. That being said, Word itself started from rather modest beginnings. Although initially released in 1983, the early versions struggled to find traction against WordPerfect and WordStar. It was ironically the Macintosh versions that began to establish Microsoft as a player in this field.
Word 5.0 for DOS (source: WinworldPC)
On the PC side of things, Word only began to establish dominance with the rise of Windows, and the difficulties in bringing a Windows version of WordPerfect to market. With it's largest competitor flailing, Word, combined with Excel would form the cornerstone of the upcoming Microsoft Office. By the end of the 90s, Microsoft would have this entire market segment on lockdown.
However, the early versions of Word, and especially Word for Windows, did have some compelling improvements over its market incumbents. To demonstrate, let's get the retail release of Word 1.1a installed, and take a look. This version of Word required Windows 2.1 at a minimum, although the then upcoming Windows 3.0 was recommended. Installation is handled through a DOS application, which is typical of early Windows 1 and 2 applications. Afterward, we can jump into Windows, and take a closer look.
WYSIWYG Example of Word under Windows 2.1
By modern standards, Word 1.1a is nothing to write home about. However, Winword is one of the earliest examples of a consumer WYSIWYG or "What You See is What You Get" text editors, and had fully integrated support for mouse control. In contrast, its main competition in WordPerfect 5.1 used colored based highlighting to show formatting, and only provided a limited Print Preview
WYSIWYG Example of Word under Windows 2.1
Word also provided support for integrated graphics, and the beginnings of multimedia documents through Object Linking and Embedding, although this would not be fully realized until later. In short, Word did provide real world
advantages over its competition. While Microsoft was accused of underhanded tactics, leading to Microsoft v. Novell, it's impossible to deny that Word itself was a solid competitor on its own merits.
As I would find though, a lot more of Microsoft's success in this field would be revealed as I dived in deeper. However, first I needed to get Winword compiled.
The What and Why of OS/2
IBM Operating System/2
I would not be surprised if quite a few people are wondering on why I chose OS/2 for this project. Well, there's a few reasons. The first is just compiling ontop of DOS is actually quite painful. While Microsoft included a full set of build tools for both DOS and OS/2, the DOS tools require oodles of conventional memory, and a fairly specific setup to actually get the build off the ground.
First, the tools are hardcoded to use the 386MAX memory extender. Microsoft helpfully included that in the box, and it works just fine on both DOS 3.3 and 6.2. However, because this extender is so old, DOS itself cannot relocate to the UMA area, which decreases the total amount of conventional memory on the system. Attempting to use DOS=HIGH just leads to a system crash.
386MAX Memory Extender
While there are unofficial hacks to use HX Memory Extender, I preferred to stay with period specific tools if possible. Even after sorting the memory manager, I still kept getting build failures due to running out of memory. A lot of this is due to the fact that that for this era, the concept of a DOS Extender was still fairly new, and ontop of that, Microsoft was still expecting OS/2 to be the future of computing.
Out of Memory Error from csl
At this point in history (1989), Microsoft was still married to IBM in bringing the protected mode successor to DOS: OS/2. Originally announced in 1987, OS/2 was intended to bring proper multitasking and GUI based applications to the masses. While GUIs (and most notably Windows) had already existed at this point, they were still fairly niche. OS/2 was intended to bring a revolution to both home and office users. However, by 1989, OS/2 was still struggling to find any significant marketshare, and the vast majority of users still continuing to use DOS as their day to day operating system.
It was not until the unexpected success of Windows 3.0 in 1990, followed by the IBM-Microsoft divorce did Microsoft truly abandon OS/2 as a platform, something we will see later on. Given the time period though, I expected that Microsoft used OS/2 fairly extensively for both software development, and day-to-day usage. It is well documented that nearly all Microsoft products of that time period had OS/2 support, or a specific OS/2 version.
Microsoft's own develoment tools also supported OS/2. In truth, while OS/2 was somewhat of a flop with general consumers, it did have a niche for use in software development. This is primarily because OS/2 could multitask, use virtual memory, and bad code would not necessarily crash the system. It also was possible to do "in-place" debugging, instead of a multi-machine setup as required by DOS and Windows of the era. Speaking from personal usage, OS/2 is also quite nice for using as the basis for retrocomputing projects instead of developing directly ontop of Windows or DOS.
To that end, I decided to go with OS/2 1.2, as it would have been relatively close to the time period in which Winword 1.1 was released. This would however lead to much pain and suffering.
Half an Operating System - OS/2 1.2
So, in theory, installing OS/2 is simple. Insert the Install disk, boot the system, and follow the prompts. In practice, I was greeted with the following ...
OS/2 TRAP Error on Boot
In practice, OS/2 actually is notoriously difficult to either emulate or virtualize due to its use of features like x86 rings, and segmented protected mode. The original 16-bit versions of OS/2 (1.0-1.3) compound this issue by self-destructing on anything that is not specifically what they want. In practice, OS/2 1.2 will not install on anyone system faster than a 33Mhz 386DX. The OS/2 Museum has actually debugged this issue, and written unofficial patches to fix it. However, I decided to simply stick with slower emulation in the name of "authenticity". This would end up being a rather painful mistake. Still, with proper machine settings set, I managed to get to the boot splash screen.
OS/2 1.2 Installation Welcome Screen
Installation is pretty straight forward, but there are a few things to note. First, IBM's High Performance File System is not an option yet. HPFS was the intended replacement for FAT, but it would only appear near the end of OS/2 1.x's lifespan, and then as an optional component. Secondly is the truly abysmal hardware support. There are four graphic driver in total.
OS/2 1.2 Graphic Driver List
While people often cite a lack of native OS/2 software as the major con against the system, a lack of hardware support was as bad if not worse. The Microsoft branded versions of OS/2 are slightly better in this regard, but OS/2's poor hardware support would be a defining trait. It is also the reason why I was stuck in 640x480 16-color mode through this entire project.
Lack of hardware support aside, the installation finished without much fanfare, and we end up at what can be described as the single blandest desktop known to mankind.
OS/2 1.2 Desktop Manager
As I stated in the YouTube video, OS/2 did not come with much. The only real program of note is the system text editor, known as E. It is safe to say that "batteries aren't included" was in fact a thing that IBM did. OS/2 2.x would be much improved in that regard, but in short, if you did not have native OS/2 versions, you either had to hope you could get Presentation Manager versions of your applications, or roll the dice with the DOS box.
The DOS Window
As the intended successor to DOS, OS/2 does have some support for running DOS application. However, OS/2's reputation for near perfect backwards compatibility would not come until 2.0. In truth, the DOS box is somewhat of a joke. Since OS/2 1.x was designed around the 286 processor, it does not take advantage of Virtual 8086 Mode, or any sort of sandboxing. Instead, the processor is physically reset back to real mode which means DOS programs have to run with massive performance penalities. On top of that DOS applications can also crash or hang the system, which negates one of the primary advantages of OS/2. Furthermore, only one DOS application can be run at a time, and only in full screen mode, although applications can be backgrounded, and one can return to Presentation Manager.
I did not spend a lot of time in DOS mode, but it's rather notable that even a well-behaving, and popular program like WordPerfect has compatibility issues. I can't imagine a world where one would attempt to use DOS mode seriously as shipped from IBM.
At some point, I would like to go into the flaws and failures of OS/2 more in-depth as that would be a rather long and detailed set of articles and videos. Still, let's remember why we are here: to compile Word for Windows.
The CHM Distribution
With OS/2 installed, it was time to punt the code over to OS/2. Since OS/2 still uses standard DOS FAT filesystem, it was fairly trivial to simply mount the virtual disk, copy the files over, and then start exploring. The first thing I need to talk about though is the license. I am going to apologize in advance, because this is something of a rant.
Microsoft Research License
I brought up during the video how the Winword license only allows non-commercial use, but I did not really elaborate. While I am grateful that the Word for Windows source is legally available, the restrictive license on it really makes it something as a poison pill. The ugly truth of the world is a lot of content, research, and even media production is commercial in nature, and that audience can not touch Winword at all. Although my channel as of writing is still small enough that it does not meet YouTube's monetization requirements, I would have still done this project because I believe of the importance of documenting these sorts of things, but I know I am in a minority in that regard.
Even disregarding the non-commercial restriction, the inability to publish any source, binaries, or modified works is really pushing this. I can get that Microsoft intended this as a museum piece and in that context, maybe such restrictions are understandable, but I do feel like it doesn't help anyone to be so draconian. I had to do a lot of tap-dancing to avoid showing code snippets and a lot of "tell, not show" to work with this content at all.
It is easier to comply with the license in text, but I do feel a lot of this is unnecessary. I suppose I should not complain; after all, Word didn't have to get it's source code published at all, but it still grinds my gears.
Anyway, with that rant out of the way, let's get back to the CHM distribution. The zip file has three folders, with the bulk of the source code being in the "Opus" directory.
CHM Word Unpacked
Opus, as one might gather, was the code name for Winword. The source code is fairly well organized, and there are decent comments within which give a pretty detailed picture of how much there actually is to Word. In truth, I've only gone through a fraction of it. cloc gives a pretty good look at how large this codebase actually is!
mcasadevall@absolution:~/src/word/Word 1.1a CHM Distribution$ cloc .
666 text files.
662 unique files.
140 files ignored.
github.com/AlDanial/cloc v 1.82 T=1.73 s (303.6 files/s, 267620.6 lines/s)
Language files blank comment code
C 242 46442 37729 221843
Assembly 59 8472 25106 53187
C/C++ Header 144 4975 4058 24630
C++ 20 5040 4434 21065
DOS Batch 31 590 81 2036
Pascal 3 37 0 1451
SWIG 1 334 0 1225
sed 13 23 0 311
Windows Resource File 2 19 16 118
make 3 63 70 110
Windows Module Definition 3 16 2 87
awk 1 23 16 78
INI 4 3 0 32
SUM: 526 66037 71512 326173
NOTE: cloc's language detection is a little hit or miss, but it's still pretty easy to see there's a lot to unpack here!
As I previously mentioned, Microsoft was kind enough to bundle both the compiler and header files needed to build Winword, but there was a surprise to be found here.
Contrary to expectations, Winword did not ship with Microsoft C. Instead, its C compiler is known as CSL and it's a truly unique beast.
Unexpected Findings (And Some Corrections)
I should explain my surprise here. In addition to Windows and OS/2, one of Microsoft's major revenue streams was providing compilers and tools. While Microsoft's tools were more expensive than say Borland or Watcom C, they were still both commonly used and frequent fixtures through this time period. In addition to their DOS based tools, Microsoft's tools were fundamental to Windows and OS/2 development, and also provided the basis of Xenix's C compiler. As such, it was a relatively reasonable expectation that a Microsoft product would use a Microsoft compiler to build it.
csl however is not your run of the mill compiler. Instead of compiling to standard x86 assembly language, it compiles to p-code. In fact, the entirety of Word for Windows is built into p-code, and run on top of an interpreter. During the video, I noted that this appeared to be a special environment called "EL". Since then, I have further dug into the code, and discovered that EL appears to actually be part of "Embedded BASIC" which is the macro language shipped with Winword.
While I am annoyed I did not find this mistake until this write-up, I do think it is rather understandable. The Winword source code is very sparse of comments, and there is virtually no written documentation about it. I will attach a comment to the video, and address it directly when I create the follow up. That being said, Microsoft's use of p-code doesn't appear to be unique to Winword. If nothing else, csl reports a copyright date of 1982, so it is not hard to believe that Microsoft used p-code for other tools and products.
The use of p-code is not the only surprise though. Normal Windows programs use what are known as resource files to define menus, dialog boxes, and more. Once again, Microsoft has abstracted this. A special compiler for these resources known as 'de' is built. It even has a character based interface for laying out dialog elements!
DE Editing the About Box
In truth, aside from MASM, almost every tool in Opus appears either internal to Microsoft, or a pre-release or a port of another tool. For example, a native version of MARK is provided, which we saw back in compiling Hello World for Windows 1.0. Here thought, it is provided as native OS/2 binary!
It does make me wonder where and how Microsoft had stored the Winword 1.1a source. It is possible that this release was stored with Microsoft's legal department: Raymond Chen, a Microsoft employee behind The Old New Thing notes that the legal department would often store and collect older versions of Microsoft products, and it would not surprise me of this source was also used during the discovery process of Microsoft v. Novell. While I doubt that I will ever know for sure, it is interesting to think about how Microsoft might have preserved this and presented it to the CHM.
Surprises aside, if I wanted to actually get anywhere, I still needed to get with compiling.
Missing Tools, and Compile Errors
Fortunately, I was not entirely running blind here. Instructions for the DOS version of the build were relatively easy to find, and it stood to reason that the OS/2 version of the tools were vastly similar. In short, I needed to add
C:\OPUS\TOOLS\OS2 to the
PATH, and create a
TEMP directory. Afterwords, I could use the makeopus command to kickstart the build. Makeopus is similar to a Makefile generator or Microsoft's own take of a configure script. It reads a special configuration file for flags and options, and generates makefiles for NMAKE based on those options.
makeopus Configuration File (prod.ini)
Unfortunately, my first attempt blew up before I could even get past the starting line. The cause was relatively easy to spot though, makeopus was trying to use the DOS version of the compile tools. At least under OS/2 1.2, DOS utilities could only be used within the DOS window, so this wasn't going to fly. A closer examination of the help however reveals that there is a specific "
MAKEOS2" option for makeopus that's for 'Protected Mode' builds. Many Microsoft tools such as Microsoft C often referred to OS/2 as a Protected Mode environment, so this is consistent with Microsoft's terminology of the time. Adding
MAKEOS2 to our build configuration file got the build going, but it was not too long until we hit another build failure.
Error Code SYS0197
In this case, it is the 'de' utility I talked about earlier dying. OS/2 has one feature I do appreciate, clear error messages, and an extensive built-in help. By typing help sys1097, we can see what failed, and how to fix it. In this case, it is the fact that 'de' needs a feature called IOPL segments to be enabled. Enabling IOPL is pretty trivial; it is a one line change to CONFIG.SYS, although amusingly the HELP message is accurately inaccurate, you need to use IOPL=true, and not IOPL=yes
Error Code SYS1097
The need for IOPL is frankly mystifying for me. IOPL controls the ability to use x86 rings, specifically, for code to load into ring 2 and touch IO ports directly.
x86 Processor Rings
For the life of me, I can't figure out why a compiler needs this functionality; its possible it is trying to write some sort of diagnostic messages through to the serial port, and that functionality was copy and pasted from DOS, but from what I have read about IOPL segment programming in OS/2 programming guides makes this seem unlikely. The source code to 'de' is not available, so short of sitting several hours trying to decompile the tool, this will likely remain an unsolved mystery. Still, it was a problem with an easy solution, so it did not take too long to back on track.
With that hurdle passed, the build dragged along. Unfortunately, as easy as it was to get back on track, it was not long until I got to another build failure though. Annoyingly, the command output had been hidden by makeopus, so I couldn't see what had failed. Working backwards through the makeopus utility, I found that the entire build process is written out as a shell script called 'mo2.cmd'. From there, some trial and error brought me to the failing command 'egrep'.
Build Failure Due To egrep
egrep Erroring Out
Running egrep by itself shows the same error in the build folder. However, running 'egrep' in the DOS window succeeds. Obviously, we've got a problem here. Many of the binaries shipped in the TOOLS folder are what's known as family mode binaries, meaning that they contain DOS and OS/2 versions in one single version. For example, we can see this with the makeopus tool.
mcasadevall@absolution:~/src/word/Word 1.1a CHM Distribution/Opus/tools$ file makeopus.exe
makeopus.exe: MS-DOS executable, NE for OS/2 1.x (EXE)
fgrep and grep are on the other hand shipped with two different versions, explaining on why they work:
mcasadevall@absolution:~/src/word/Word 1.1a CHM Distribution/Opus/tools$ find . -name \*grep\*
egrep however doesn't fall into either category. Despite being in the common tools directory, it is not a family mode executable.
mcasadevall@absolution:~/src/word/Word 1.1a CHM Distribution/Opus/tools$ file egrep.exe
egrep.exe: MS-DOS executable
Needless to say, this was a problem. Without a functional copy of egrep, I couldn't build Winword without some rather ugly brain surgery. That being said, despite the setback, I was fairly optimistic I could get around this problem. egrep is a fairly standard utility, and an examination of the binary with strings showed that Microsoft had almost certainly ported grep and friends from Xenix. Some testing also suggested that there was nothing really unique about the egrep binary as shipped from Opus. Given I did have a set of normal compilers (Microsoft C), compiling a replacement egrep was entirely practical.
At this point though, I found someone had already done the hard part for me. Back in the early 90s, the GNUish project had ported several GNU utilities to early DOS and OS/2, including GNU grep. A member of the Virtually Fun discord group pointed this out to me, and that the FreeDOS project had archived the lot. Some searching later, and I did indeed find a copy of egrep compiled for OS/2.
egrep Erroring Out
Some testing confirmed that the Opus DOS egrep, and the GNU egrep command functioned identically!
Testing GNU egrep
A few renamed files later, and we're back in business!
It took literal hours for the next attempt to process. At 33Mhz, the system was straining to compile the monster that was Word for Windows. According to the raw video footage, I have over three and half hours of footage from this alone. However, perseverance is a virtue, as the build did in fact succeed!
The question is now, would it run? As far as I know, I am the first to actually attempt building Winword on OS/2 outside of Microsoft. Furthermore, the end result is a Windows binary, and we are on OS/2 so there's implied pain the ass to deal with, but I think is now a good time to segue to the implied elephant in the room.
Who Needs Win-OS/2?
Win-OS/2 (OS/2 2.0)
For those familiar with the later versions of OS/2, you might be wondering why I haven't brought up Win-OS/2. Well, the answer is simple: it did not exist yet. IBM eventually marketed OS/2 as a better Windows than Windows, and a better DOS than DOS. While there was some truth to those claims, that did not come until after the IBM-Microsoft divorce. At the time, OS/2 being the future was still the party line at both Microsoft and IBM, with Windows only being a stopgap until then.
To that end, to a 1989 IBM, there would have been no need to license and include Windows with OS/2. Given how hamstrung DOS support already is, this is probably a good thing. That being said, as it turns out, you can add Windows support if you really want it.
As it turns out, Windows 3.0 (and ONLY 3.0) officially supported OS/2 as a host environment. Simply insert the first disk, and run SETUP as usual from the DOS box, and follow the prompts.
Windows 3.0 SETUP
It should be noted that this is actually an officially supported feature of Windows 3.0, and directly mentioned in the README file!
Running Windows from the OS/2 version 1.2 DOS Compatibility Box
* Do not allow Windows Setup to make changes to your
AUTOEXEC.BAT file or CONFIG.SYS file. Make the appropriate
* To print, you must set the printer driver port to one with
an .OS2 extension as follows:
If your printer is physically connected to LPT1 or to LPT2,
when you configure your printer, make sure you set the
printer-driver port to LPT1.OS2 or LPT2.OS2.
If the printer is physically connected to LPT3 or LPT4, you
must create a line for LPT3.OS2 or LPT4.OS2 in the [ports]
section of your WIN.INI file and then set the printer
driver to the appropriate .OS2 port when you configure it.
For more information about WIN.INI settings, see the
on-line document called WININI.TXT.
In truth, this is not really that useful. Windows is stuck running in Real Mode meaning most Windows 3+ software simply work. In addition, the command line and DOS applications are locked out with a unique error message.
Windows running in Real Mode
Windows can't run non-Windows Applications
That being said, the retail release of Winword 1.1 does work in real mode, and this environment is enough to run our freshly compiled one as well!
Compiled WinWord Starting
Compiled WinWord About
So mission accomplished! We managed to build Word for Windows from OS/2, but there's a fair bit more to say on the topic.
PMWORD and Friends ...
In a broader context though, it helps to talk about Winword in the larger context of the world. As previously mentioned, Word was available for DOS, and there was also a version of Word for OS/2, which is also known as PMWORD. Let's look at Word for OS/2 before we wrap this up.
Installing PMWORD is almost identical to regular Word for Windows. After installation, we end up with a new program group in OS/2 with a single application. Running it though has a bigger surprise.
PMWORD Editing Text
Is it just me, or does this look hauntingly familiar? I can say there's little to no specific code for OS/2 in the Winword source, but I do actually have a reasonably good idea of what's going on. We're going to dig into that, and Word's surprisingly complicated family tree next time, as well as clear up some more of the lingering mysteries.
What I thought would be a relatively simple and straight forward project has ended with more than a few questions. For example, why was Word compiled to p-code, and what was the relationship across multiple versions. Furthermore, I'm betting a lot of you want to know if it runs on Windows 10 or not. After all, I proved 35 years of backwards compatibility was in fact a thing.
That being said, this article is already extremely long, so we'll have to pick this up at a later point. I'd love to hear your feedback, and hope you will like and subscriber to my channel! It really makes a difference.
~ 73 de NCommander
My vague understanding about the use of p=code is that it goes back to the original Microsoft Word for DOS as an attempt to make it more portable between DOS, Xenix, and other platforms.