
Y2K, the bug we mostly fixed

Computer History - This article is part of a series.

Two decades later it’s easy to laugh at Y2K. The survival kits, the news segments about elevators plummeting into shafts, the bunker preppers. Midnight on January 1, 2000 came and went, the planes didn’t fall, and inside a week the public had decided the whole thing was a hoax.

It wasn’t a hoax. It was the largest successful preventative engineering project in computing history, a roughly $300 billion global remediation effort (US share around $100 billion, per the IDC numbers collected in the Wikipedia summary) that kept financial, utility, and transport systems from rolling over into 1900. Nothing visible broke because a large number of programmers spent the back half of the 1990s making sure nothing did. For anyone doing security work today, Y2K is one of the cleanest case studies you’ll find in what happens when assumptions baked into a data format decades earlier stop holding.

Why two digits
#

Bytes cost real money
#

In 1965, magnetic core memory ran somewhere in the range of $400,000 to $1 million per megabyte (the jcmit.net price series puts it at $400/KB to $1,000/KB depending on the system). Every byte you stored, you paid for many times over: in memory, in tape, in punch cards, in disk. A date stored as MMDDYYYY is 8 bytes. As MMDDYY it’s 6. Across a million records, that’s 2 MB saved, which translated to real hardware budget at the time.

Storing two digits for the year wasn’t laziness. It was a defensible engineering tradeoff against the cost structure of the hardware then in front of the programmer. The implicit assumption was that the code wouldn’t still be running in 2000. That assumption was wrong. COBOL written in 1970 was still keeping the books at the IRS, Visa, and the Federal Reserve in 1999.

Bob Bemer (one of the architects of ASCII) is often credited with raising the alarm earliest, dating his concern internally at IBM to the late 1950s. The first widely cited public warning was his 1971 piece in the Honeywell Computer Journal. It took another two decades for the industry to take it seriously.

The logic error
#

The arithmetic, when the rollover hits, is unforgiving. To calculate a customer’s age:

SUBTRACT BIRTH-YEAR FROM CURRENT-YEAR GIVING AGE.
  • 1999: 99 - 50 = 49. Fine.
  • 2000: 00 - 50 = -50. Not fine.

For anything that sorts on the year field, 00 comes before 99, so a transaction posted in 2000 sorts as if it happened a century ago. Downstream you get negative interest, collection notices that claim a bill is a century overdue, and certificates apparently issued in 1900. The post-1995 remediation work was mostly about stopping that.
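
A minimal Python sketch of the same two failures, assuming years are held as two-digit values the way the COBOL fields above hold them. The function name and the sample transactions are illustrative, not taken from any real system.

```python
def age_two_digit(birth_yy: int, current_yy: int) -> int:
    """Mimic SUBTRACT BIRTH-YEAR FROM CURRENT-YEAR GIVING AGE on two-digit years."""
    return current_yy - birth_yy

# Hypothetical transactions keyed by a two-digit year string.
transactions = [("99", "posted Dec 1999"), ("00", "posted Jan 2000")]

age_1999 = age_two_digit(50, 99)  # 49
age_2000 = age_two_digit(50, 0)   # -50

# Sorting on the year field puts "00" ahead of "99",
# so the newest transaction sorts as the oldest.
sorted_tx = sorted(transactions)
```

Nothing in the code is wrong in isolation. The failure is entirely in what the two-digit field can no longer express.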

The fixes
#

There was no find-and-replace solution. The work involved auditing billions of lines of code, much of it undocumented, much of it written by people who had retired or died. I had the privilege of working alongside several of the ones who hadn’t. They were COBOL programmers who’d come out of retirement to help with a Navy remediation contract I was on, putting in long days on legacy systems they’d written decades earlier when they should have been on a porch somewhere enjoying their retirement. I think about them when people call Y2K a hoax.

Date expansion
#

The clean fix: change the schema from YY to YYYY everywhere. Correct, permanent, and ruinously expensive on a mainframe, because widening a date field by two bytes meant recompiling every program that touched the affected record, which could be hundreds of subroutines per data file. Big shops with budgets did this. Smaller shops looked at the project plan and reached for something cheaper.

Windowing (the pivot trick)
#

The hacker’s fix. Keep two-digit storage, but interpret the year against a pivot point:

IF YEAR < 50
   ADD 2000 TO YEAR
ELSE
   ADD 1900 TO YEAR.

Cheap, fast, and it works until it doesn’t. Most windowing fixes picked a pivot 20 to 50 years out. Code that pivots on 50 has been silently resolving a stored year of 50 to 1950 rather than 2050 for a while now. Pen testers occasionally find these in older insurance and pension systems where dates of birth in the 50s and 60s get interpreted as the wrong century.
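
The pivot logic can be sketched in a few lines of Python. This assumes the same pivot of 50 as the COBOL fragment above; `expand_year` is a hypothetical helper name, not something from a real remediation toolkit.

```python
PIVOT = 50  # assumed pivot, matching the COBOL fragment above

def expand_year(yy: int, pivot: int = PIVOT) -> int:
    """Interpret a two-digit year against a fixed pivot (windowing)."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    # Below the pivot is read as 20xx, everything else as 19xx.
    return 2000 + yy if yy < pivot else 1900 + yy

window = [expand_year(yy) for yy in (0, 49, 50, 99)]
# window == [2000, 2049, 1950, 1999]
```

With this pivot the representable range is 1950 through 2049. Any real date outside that range is not rejected, it is quietly mapped to a different century, which is the failure mode to look for.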

Encapsulation / bridging
#

A middleware layer between the legacy database and the application: expand dates on read, shrink them on write. The legacy storage stays untouched, the modern application logic gets 4-digit years. Most banks ran something like this through the early 2000s as a stopgap on the way to (eventually) doing real schema expansion.
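
A rough sketch of what such a bridge does, in Python. It assumes the stored field is MMDDYY, that the application wants MMDDYYYY, and that the read path resolves the century with a pivot of 50. The function names are hypothetical, and a real bridge sat in middleware rather than in application code.

```python
PIVOT = 50  # assumed window pivot; two-digit years below it are read as 20xx

def expand_on_read(stored: str) -> str:
    """Expand a legacy MMDDYY field to MMDDYYYY for the application."""
    if len(stored) != 6 or not stored.isdigit():
        raise ValueError(f"expected MMDDYY, got {stored!r}")
    yy = int(stored[4:])
    century = "20" if yy < PIVOT else "19"
    return stored[:4] + century + stored[4:]

def shrink_on_write(full: str) -> str:
    """Shrink MMDDYYYY back to MMDDYY, refusing years the window cannot hold."""
    if len(full) != 8 or not full.isdigit():
        raise ValueError(f"expected MMDDYYYY, got {full!r}")
    year = int(full[4:])
    if not 1900 + PIVOT <= year <= 1999 + PIVOT:
        raise ValueError(f"year {year} is outside the storable window")
    return full[:4] + full[6:]
```

Rejecting out-of-window years on write is a design choice in this sketch. A bridge that silently truncates instead will round-trip 2050 back as 1950.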

The embedded problem
#

The mainframe COBOL work got most of the press, but the harder problem was embedded systems where you couldn’t recompile and often couldn’t read the source.

Industrial control systems were the live concern. SCADA controllers running pipeline valves used date arithmetic to track maintenance cycles, and the failsafe behavior when a controller decided maintenance hadn’t been performed in 100 years was sometimes to shut the line down. HVAC and building access systems often used hardcoded day-of-week tables. January 1, 1900 was a Monday; January 1, 2000 was a Saturday. A controller that woke up thinking it was Monday would happily unlock office doors at midnight on what was supposed to be a holiday weekend.
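
The day-of-week mismatch is easy to check with Python's standard library. The `controller_weekday` function below is a hypothetical stand-in for a controller that prefixes every two-digit year with 19; it is not modeled on any specific device.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def jan1_weekday(year: int) -> str:
    """Day of the week for January 1 of a four-digit year."""
    return DAY_NAMES[date(year, 1, 1).weekday()]

def controller_weekday(yy: int) -> str:
    """Hypothetical controller that assumes every two-digit year is 19xx."""
    return jan1_weekday(1900 + yy)

believed = controller_weekday(0)  # what the controller thinks: Monday
actual = jan1_weekday(2000)       # what the calendar says: Saturday
```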

Most of these were tested or replaced before rollover. The ones that weren’t are why the actual list of January 1 incidents is short rather than empty.

What actually broke
#

The remediation work mostly held, but the public incident list is real and worth knowing:

  1. US Naval Observatory displayed the date as “1 Jan 19100” on its public time site for about 45 minutes after rollover. JavaScript string concatenation: "19" + year, where year (as returned by the legacy getYear(), which counts from 1900) had become 100. The official timekeeper of the United States got bitten by + overloading. Computerworld covered it as a black eye at the time.
  2. Shika Nuclear Power Plant (Ishikawa Prefecture, Japan): radiation monitoring equipment glitched at midnight. No radiation was released; the alarms were spurious. CNN ran the story on January 3, 2000.
  3. Delaware slot machines: around 800 of 3,600 WMS-built slot machines at Dover Downs, Delaware Park, and Harrington stopped working on December 29, 1999 because the date-rollover logic misread the upcoming Saturday as January 1, 1900.
  4. NRO Talent Keyhole: The National Reconnaissance Office’s imagery ground station at Fort Belvoir lost about three hours of processing at rollover, with degraded operation on five satellites (three KH-11 optical and two radar) for the following two days. Roughly the most secret thing on the public Y2K incident list.
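
The “19100” failure in the first item is simple to reproduce. The sketch below is Python, emulating the legacy JavaScript `Date.getYear()` behavior of returning the year minus 1900. The function names are illustrative and this is not the Observatory's actual code.

```python
def get_year(full_year: int) -> int:
    """Emulate JavaScript's legacy Date.getYear(): the year minus 1900."""
    return full_year - 1900

def buggy_display(full_year: int) -> str:
    # String concatenation: fine while get_year() returns two digits.
    return "19" + str(get_year(full_year))

def fixed_display(full_year: int) -> str:
    # Numeric addition first, formatting second.
    return str(1900 + get_year(full_year))

before = buggy_display(1999)  # "1999"
after = buggy_display(2000)   # "19100"
```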

Nothing on the list killed anyone. That’s not a coincidence and it’s not because the bug was imaginary. It’s because a lot of people spent the back half of the 1990s on it, most of them doing tedious code review that they never got public credit for. The remediation got dismissed as overreaction approximately one news cycle after it succeeded.

The legacy
#

How Indian IT scaled up
#

The West needed a lot of COBOL labor in 1995 and didn’t have it. India did, and was already setting up technical services firms. Infosys, Wipro, and TCS all scaled up substantially on Y2K remediation contracts, and once they had the operational footprint they kept doing the work for everything that came after: application maintenance, then development, then full-stack engagements. The modern global IT outsourcing model basically dates from that build-out.

Year 2038 (Y2K38)
#

The sequel is already on the schedule. Unix time_t, when stored as a signed 32-bit integer, hits its maximum at 03:14:07 UTC on January 19, 2038. The next second wraps to 20:45:52 UTC on December 13, 1901.

This will hit:

  • Embedded Linux on routers, IoT devices, industrial controllers, and anything else that won’t get a firmware update.
  • Older 32-bit servers still in service, mostly because nobody knows what runs on them.
  • File systems and databases that hardcode 32-bit timestamps in their on-disk format (ext3 inode timestamps in some configurations, older Berkeley DB formats, ext4 in default configurations until kernel 5.4 added 64-bit support for new filesystems).

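64-bit Linux has used 64-bit time_t for years. On 32-bit Linux, the mainline kernel gained 64-bit time system calls in 5.1 (2019) and finished the remaining kernel-side work in 5.6 (2020), but that fixes the kernel only. Applications compiled against the old ABI still see the 32-bit value. Glibc’s transition has been gradual: 2.34 (2021) added an opt-in 64-bit time_t for 32-bit targets via _TIME_BITS=64.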

If you’re still doing pen testing in the 2030s, look at how the target handles timestamps that go past the 2038 boundary. There will be embedded gear out there with no upstream maintainer and a date field that quietly wraps to 1901, and the validation layer above it probably won’t be expecting either condition.
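
The wrap itself can be demonstrated without a 32-bit machine. This Python sketch reinterprets a second count as a signed 32-bit integer and converts it to a UTC date; `as_int32` and `to_utc` are hypothetical helper names. The conversion uses timedelta arithmetic from the epoch, which avoids platform limits on negative timestamps.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1

def as_int32(seconds: int) -> int:
    """Reinterpret the low 32 bits of a value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", seconds & 0xFFFFFFFF))[0]

def to_utc(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)

last_good = to_utc(as_int32(INT32_MAX))    # 2038-01-19 03:14:07 UTC
wrapped = to_utc(as_int32(INT32_MAX + 1))  # 1901-12-13 20:45:52 UTC
```

Feeding a target timestamps on either side of `INT32_MAX` is a cheap way to find out which layer, if any, notices.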

Technical trivia
#

  • 2000 was a leap year. 1900 wasn’t. The Gregorian rule: divisible by 4 means a leap year, except that divisible by 100 means not a leap year, unless also divisible by 400, in which case it is. A number of quick-fix Y2K patches treated 2000 as non-leap and crashed on February 29.
  • Microsoft Excel’s leap year bug. Excel intentionally treats 1900 as a leap year, even though it isn’t. The reason is Lotus 1-2-3 compatibility. Lotus got it wrong, Excel matched the bug for spreadsheet interop, and the bug has been preserved ever since. Microsoft documents this as “wrongly assumes 1900 is a leap year” in their KB.
  • Peter de Jager, a Canadian consultant, published the canonical early-warning piece “Doomsday 2000” in Computerworld in 1993. He was treated as a doomsayer in the mid-90s; by 1999 he was being flown around the world to consult. He is probably the single person most responsible for the alarm being raised early enough that the remediation work could happen.
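
The leap year rule from the first item, written out in Python next to the kind of shortcut that gets 2000 wrong. `naive_is_leap` is a hypothetical reconstruction of such a shortcut, not code from any specific patch.

```python
import calendar

def is_leap(year: int) -> bool:
    """Full Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def naive_is_leap(year: int) -> bool:
    """Hypothetical shortcut that stops at the century exception."""
    return year % 4 == 0 and year % 100 != 0

# Cross-check the full rule against the standard library.
agrees = all(is_leap(y) == calendar.isleap(y) for y in range(1600, 2401))
```

The shortcut is right for 1900 and wrong for 2000, which is the one century year the code was ever going to meet.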

Why this still matters
#

Y2K is one of the textbook cases of code outliving its authors. Data-format assumptions made by some programmer in 1972 ended up, by 1998, being the structural load on systems that nobody at the bank or the airline or the utility actually understood end to end. The pattern hasn’t gone away. On a Red Team engagement you will, sooner or later, find a 30-year-old COBOL routine deep in a bank’s clearing pipeline that has been load-bearing since before you were born and that almost certainly contains input handling nobody has looked at since the first Bush administration. That is a reasonable place to spend an afternoon.

Author: UncleSp1d3r
As a computer security professional, I’m passionate about building secure systems and exploring new technologies to enhance threat detection and response capabilities. My experience with Rails development has enabled me to create efficient and scalable web applications. At the same time, my passion for learning Rust has allowed me to develop more secure and high-performance software. I’m also interested in Nim and love creating custom security tools.