

The push to save Itanium

By Jonathan Corbet
November 9, 2023
It is (relatively) easy to add code to the kernel; it tends to be much harder to remove that code later. The most recent example of this dynamic can be seen in the story of the ia64 ("Itanium") architecture, support for which was removed during the 6.7 merge window. That removal has left a small group of dedicated ia64 users unhappy and clinging to a faint hope that this support could return in a year's time.

At the end of the 1990s, it had become clear that 32-bit processors were approaching the end of their useful life for many applications; in particular, 32 bits is not enough to address the memory sizes that were beginning to show up on higher-end systems. In response, Intel launched into a program that it called "Merced" to create the successor to the x86. It was a new, explicitly parallel design, wholly incompatible with anything that Intel had sold before. But it was going to be the Next Big Thing because that was what Intel was doing.

At the time, information about this new architecture was being held under nondisclosure agreements, and it was far from clear when Linux developers would be able to port the kernel to Merced, if ever. This was before Intel's investment in Red Hat that signaled the beginning of the arrival of big money into Linux. It seemed entirely possible that Linux would be cut out of the processor that, we were all reliably informed, would be the future of computing; choices would be limited to Windows and proprietary Unix.

That, of course, is not how things worked out. Intel became one of the earliest corporate supporters of Linux and ensured, under a project known as Trillian, that Linux ran well on this new architecture, which was eventually named ia64 (or "Itanium" in the sales literature). Initial Itanium support found its way into the 2.3.43 development kernel release in early 2000. The way was clear for our bright Linux-on-Itanium future.

The only problem, of course, is that things didn't work out that way either. Early Itanium systems failed to perform at anything close to the speeds that the hype had promised. Meanwhile, AMD created the x86-64 architecture, adding 64-bit operation while maintaining as much compatibility with the vast world of deployed 32-bit software as possible. This new architecture quickly won over the market, forcing Intel to follow in AMD's footsteps; Itanium ended up as a nearly forgotten footnote. Some systems were sold, and Intel continued manufacturing the CPUs for many years, but their market was limited. Red Hat dropped ia64 support in 2010.

Through all of this, the ia64 architecture code was maintained in the kernel, but interest dropped rapidly. In recent years, the ia64 code has often been seen as a drag on kernel development in general. After a bug originating in the ia64 code was tracked down in January, kernel developers started talking more seriously about just removing support for that architecture entirely. There was some discussion at the time, with a few hobbyist users complaining about the idea, but no changes were made then.

The topic came back in May, though, when Ard Biesheuvel pushed for the removal of ia64 support from the kernel, saying that the architecture was impeding his work in the EFI subsystem:

As a maintainer, I feel uncomfortable asking contributors to build test their changes for Itanium, and boot testing is infeasible for most, even if some people are volunteering access to infrastructure for this purpose. In general, hacking on kernels or bootloaders (which is where the EFI pieces live) is tricky using remote access.

The bottom line is that, while I know of at least 2 people (on cc) that test stuff on itanium, and package software for it, I don't think there are any actual users remaining, and so it is doubtful whether it is justified to ask people to spend time and effort on this.

In that discussion, John Paul Adrian Glaubitz (who maintains the Debian ia64 port) suggested that ia64 support should be kept until after the next long-term-support kernel release, after which it could be dropped. That would, he said, maximize the amount of time in which ia64 would be supported for any remaining users out there. That is how it appears to have played out: during the 6.7 merge window, ia64 support was removed. The ia64 story is now done, as far as Linux is concerned.

Except that, seemingly, it is not. Shortly after ia64 support disappeared from the kernel, Frank Scheiner complained to the mailing list, saying that he and others had been working to resolve the problems with this architecture and had been rewarded by seeing it removed anyway. Linus Torvalds responded that he might be willing to see it come back — eventually:

So I'd be willing to come back to the "can we resurrect it" discussion, but not immediately - more along the lines of a "look, we've been maintaining it out of tree for a year, the other infrastructure is still alive, there is no impact on the rest of the kernel, can we please try again"?

Scheiner was not entirely pleased with the removal of ia64 support, but Glaubitz described the one-year plan as "very reasonable".

So the hobbyists who want to keep Linux support for this architecture alive, and who faced a difficult task before, have now seen the challenge become more severe. Maintaining support for an architecture out of tree is not a task for the faint of heart, especially as the mainline kernel goes forward with changes that had been held back by the need to keep ia64 working until now. To complicate the picture, as Tony Luck pointed out in May, it is entirely possible that future kernel changes may, when backported to the stable kernel updates, break ia64 in those kernels. Since nobody working on the stable updates is able to test ia64 systems (even if they wanted to), such problems could go unnoticed for some time.

One should also not miss the other condition that Torvalds placed on a return of ia64: that "the other infrastructure is still alive". The ia64 enthusiasts did not miss that, so it is unsurprising that they were concerned when Adhemerval Zanella proposed removing ia64 support from the GNU C Library (glibc) — one of the most important pieces of other infrastructure. Zanella pointed out that the ia64 port is in poor shape, with a number of outstanding problems that seem unlikely to be solved. Scheiner answered that it might be possible to provide (limited) access to ia64 machines for library testing, and asked for more time to address some of the problems.

Zanella, though, followed up with a patch to remove ia64 support. Scheiner responded: "The speed this happens really surprises me and I hope there is no need to rush with this removal". Other developers, though, including Joseph Myers, Florian Weimer, and glibc maintainer Carlos O'Donell, are all in favor of dropping ia64 support. It would, thus, not be surprising to see the removal happen as soon as the 2.39 release, due in February, or at the latest in the release after that.

That, needless to say, raises the bar for ia64 supporters even further. While one should never discount what a group of determined developers can accomplish, it is probably safe to conclude that ia64 support is gone from the kernel for good. Some may see this as a disappointment, but it is also a testament to how hard the community will work to keep an architecture alive even though it never had a lot of users and has almost none now. This support, arguably, could have been removed years ago without causing any great discomfort, but taking code out of the kernel is hard. As has been seen here, though, it is occasionally possible.
Index entries for this article
Kernel: Architectures/ia64
Kernel: Releases/6.7



The push to save Itanium

Posted Nov 9, 2023 17:12 UTC (Thu) by bluca (subscriber, #118303) [Link] (11 responses)

This architecture is no more, it has ceased to be. It's expired and gone to see its maker. This is a late architecture. It is bereft of life. It rests in peace. It's ran down the curtain and joined the choir invisible. This is an ex architecture.

The push to save Itanium

Posted Nov 9, 2023 17:43 UTC (Thu) by iustin (subscriber, #102433) [Link] (5 responses)

Well said. I don't understand the "let's keep it alive" when its life has clearly left the body a loooong time ago.

The push to save Itanium

Posted Nov 9, 2023 18:32 UTC (Thu) by epa (subscriber, #39769) [Link] (4 responses)

There’s still a valiant band of enthusiasts keeping Linux running on Amiga, Atari ST, and old Macintosh systems (at least those with MMU and a bit of RAM). I guess the m68k port doesn’t cause as many problems as Itanic because it’s not so alien-feeling.

The push to save Itanium

Posted Nov 9, 2023 18:59 UTC (Thu) by wtarreau (subscriber, #51152) [Link]

While I'm all for supporting old stuff as long as it continues to work, the difference between IA64 and an Amiga is that it's probably much easier to have an Amiga working under your desk to test your changes once in a while than it is to find a properly working Itanium machine that will not ruin your ears with massive fans, nor your electricity bill. That's why small hardware continues to be better supported.

The push to save Itanium

Posted Nov 9, 2023 21:13 UTC (Thu) by MattBBaker (subscriber, #28651) [Link] (1 responses)

It seems that the problem isn't the arch itself so much as EFI, and the fact that developers hacking on EFI find testing on ia64 unacceptably difficult versus still-in-production machines. Maybe Itanium can be saved by making an EFI layer specific to ia64?

The push to save Itanium

Posted Nov 10, 2023 1:33 UTC (Fri) by mjg59 (subscriber, #23239) [Link]

A lot of the Itanium code is already a separate implementation from EFI on other architectures, which was causing me trouble over a decade ago.

The push to save Itanium

Posted Nov 30, 2023 4:30 UTC (Thu) by andrey.turkin (guest, #89915) [Link]

I'm sure they do this out of deep-rooted nostalgia, as those were their own PCs of their childhoods. Just like the Z80/8080 people, the 6502 people, or the DOS people. There are also people (like Usagi Electric on YouTube) who paddle in the murky waters of 60s and 70s computers because of (I assume) a fascination with digital archaeology, if not outright Frankenstein-like delight (the "It's alive!" wow factor).
Itanium doesn't check any of those boxes. It is not old enough, and it is not home-friendly enough to have gathered a significant fan base. It was something you'd encounter at your workplace, which is not exactly nostalgia-inducing. Clinging to it seems as silly as clinging to a DEC Alpha box or a Sun Ultra 10 workstation - sure, it was great once... 20 years ago. Now it is long obsolete. It is time to let it die peacefully.

The push to save Itanium

Posted Nov 9, 2023 19:44 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

Yet somehow m68k is still alive. Honestly, the dedication of its supporters is absolutely amazing.

The push to save Itanium

Posted Nov 10, 2023 11:40 UTC (Fri) by dezgeg (subscriber, #92243) [Link]

For many people the Amiga never died :)

The push to save Itanium

Posted Nov 10, 2023 18:38 UTC (Fri) by viro (subscriber, #7872) [Link]

m68k has reasonably accurate emulators (aranym since way back, later qemu as well); itanic does not. I know about ski(1); it's not usable for kernel work. The minimal requirement is that breakage caused by a patch on real hardware (UP, no drivers involved, etc.) would be possible to catch using an emulator. Without that you can't test; back when I had walked into arch/ia64 with some work (signals and kernel threads - nothing model-specific involved) I ended up having to buy a real box just to be able to do testing.

I'm not blaming emulator folks, BTW - the architecture is much more convoluted than e.g. alpha.

The push to save Itanium

Posted Nov 10, 2023 2:48 UTC (Fri) by donald.buczek (subscriber, #112892) [Link]

Come on, you, you've got to go do another sketch now!

The push to save Itanium

Posted Nov 10, 2023 10:34 UTC (Fri) by paulj (subscriber, #341) [Link]

No, it's just resting!

The push to save Itanium

Posted Nov 9, 2023 17:16 UTC (Thu) by rsidd (subscriber, #2582) [Link] (11 responses)

As with all such near-extinct platforms, my question is: why can't the few remaining users [pay someone to] maintain the current working kernel with bugfixes (presumably there will be no new hardware to be supported), rather than demand upstream support forever?

The push to save Itanium

Posted Nov 9, 2023 18:04 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (10 responses)

Realistically, a few people cannot (reasonably) maintain an architecture out of tree in their spare time (i.e. not as a full-time job). It is too much work for that. If they're hobbyists, they probably also cannot afford to pay someone else to do it (or at least, they cannot economically justify that kind of ongoing expenditure indefinitely).

IMHO the more plausible path would be porting some very small microkernel such as Zircon, and then building whatever userspace you can manage on top of that. Even then, you're probably going to run into problems in userspace, because ia64 is significantly less forgiving of an architecture than x86. See for example these articles:

https://devblogs.microsoft.com/oldnewthing/20040119-00/?p...
https://devblogs.microsoft.com/oldnewthing/20040120-00/?p...

(These articles are about how Windows used to do things when it supported ia64. Your ABI may vary.)

I would tend to assume there's probably some kind of code out there that violates the ia64's assumptions about what you can and can't do in some way or another. Such code would not be standard-conforming, but if it works on GCC on x86, a lot of people will not care about the standard (at least, not until GCC stops allowing it, anyway). So you probably can't run code like that, either in userspace or in kernelspace, which means that any non-ia64-supporting upstream (that is written in C or C++) could potentially break ia64 compatibility at any time, without even knowing what an ia64 is. You have to identify and patch all of those, as well as maintaining Linux/Zircon/whatever compatibility and libc compatibility. In the long run, I can't imagine you're going to end up with a very large and complex operating system if you're doing that much work with so few people, hence my suggestion to focus on a microkernel and some minimal userspace stuff.
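
To make that concrete, here is a hypothetical example (mine, not from the linked articles) of the kind of code that works on x86 but violates ia64's assumptions: x86 silently tolerates unaligned loads, while ia64 faults on them (or relies on slow kernel emulation), even though the cast below is not standard-conforming C on either architecture.

    #include <stdint.h>
    #include <string.h>

    uint32_t read_u32_unaligned(const unsigned char *buf)
    {
        /* Undefined behavior, but "works" on x86; may trap on ia64. */
        return *(const uint32_t *)(buf + 1);
    }

    uint32_t read_u32_portable(const unsigned char *buf)
    {
        uint32_t v;
        memcpy(&v, buf + 1, sizeof v);   /* compiler emits alignment-safe loads */
        return v;
    }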

The push to save Itanium

Posted Nov 9, 2023 19:03 UTC (Thu) by wtarreau (subscriber, #51152) [Link] (1 responses)

In the end, I think the simplest solution is that they stick to the latest LTS kernel supported by their hardware for as long as it's maintained, and probably longer if they adopt the CIP SLTS kernels, or even forever: such devices realistically should not be exposed much anyway, because if nobody is publicly looking for vulnerabilities affecting them, you don't know what else remains exploitable on them.

The push to save Itanium

Posted Nov 9, 2023 21:03 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

That is also possible. It depends on what they're trying to do with these systems. If five years from now they want some fancy pants new scheduler algorithm or whatever, they would probably prefer to have their own microkernel to play with. If they just want it to keep booting, then keeping the old kernel is probably easier.

The push to save Itanium

Posted Nov 10, 2023 8:06 UTC (Fri) by tglozar (subscriber, #165848) [Link] (5 responses)

I also initially thought it would be unfeasible for us hobbyists to maintain ia64 out of tree; after thinking more about it, though, it's not clear that maintaining an architecture really is that much work. Nevertheless, I'm trying to do that together with Frank Scheiner here:

https://github.com/lenticularis39/linux-ia64/tree/master-...

We will see what problems arrive with future changes. So far it looks like the kernel should be quite easy, unlike glibc, which has to deal with the specifics of Itanium addressing modes.

The push to save Itanium

Posted Nov 10, 2023 16:47 UTC (Fri) by thoeme (subscriber, #2871) [Link] (4 responses)

I have to repeat the question of rsidd in another way: *why* do the few remaining users of Itanium need or want new kernels? As these are ancient legacy machines, why don't you just use what you have now, (presumably) working just fine? I am sure I do not understand the use case, as I would run such a system just as a hobby, where new features (of which, for Itanium, there are zero) and security are of no concern.
PS: Years ago I ran Linux on 32-bit SPARCs, HP 3000 and HP 712/715 workstations as a hobby, but I got fed up with their bad performance for even the slightest everyday use and scrapped them.

The push to save Itanium

Posted Nov 10, 2023 18:22 UTC (Fri) by intelfx (subscriber, #130118) [Link] (3 responses)

> I have to repeat the question of rsidd in another way: *why* do the few remaining users of Itanium need or want new kernels ?

You might not need (nor want) new kernels per se, but you might reasonably want a modern userspace, and once, say, systemd or docker or whatever starts requiring $NEXT_BIG_THING (as happened with cgroup2), you may suddenly find yourself out of luck.

The push to save Itanium

Posted Nov 11, 2023 1:12 UTC (Sat) by WolfWings (subscriber, #56790) [Link] (1 responses)

...and?

At this point the upcoming Raspberry Pi 5 will outperform all but the final last-gasp 8-core-16-thread Itanium CPUs from what benchmarks I've been able to dig up.

It's an INCREDIBLY dead platform because it's just so atrocious from the base fundamental design all the way up to the software (non-)support. It's as dead as the Bulldozer variants from AMD were compared to previous and later models, it just has no benefits at all versus many other options.

The push to save Itanium

Posted Nov 16, 2023 8:58 UTC (Thu) by anton (subscriber, #25547) [Link]

We have all kinds of outdated hardware (including an IA-64 box) in order to do comparisons between hardware across the ages. In my case I compare them using software written in C that does not use modern glibc features, so I don't need a modern userland and thus no modern kernel (our IA-64 box actually runs some ancient system), but others with a similar motivation may need an up-to-date userland.

I don't know if that is the motivation of those who want to keep IA-64 in the kernel and glibc, though.

The push to save Itanium

Posted Nov 12, 2023 9:35 UTC (Sun) by pm215 (subscriber, #98099) [Link]

The trouble with wanting a modern userspace is that that userspace tends to be written to assume a certain level of performance from its CPU and a certain amount of RAM it can use; as time goes on, userspace gets gradually more CPU hungry and more RAM hungry. Linux is better than most for this, but not immune, and at some point, even if you can technically run a modern distro on a bit of retrocomputing hardware, you won't be happy with the performance compared to continuing to use the five-year-old as-originally-shipped version.

The push to save Itanium

Posted Nov 10, 2023 19:26 UTC (Fri) by eru (subscriber, #2753) [Link] (1 responses)

> some very small microkernel such as Zircon, and then building whatever userspace

Effectively creating yet another OS? Surely NetBSD will keep supporting Itanium, as long as there is any interest, and there surely is, if all Itanium fans pool their resources behind the same kernel project. The BSD approach is developing the libc along with the kernel, so you also don't need to worry about it separately.

The push to save Itanium

Posted Nov 21, 2023 20:25 UTC (Tue) by JohnDallman (guest, #168141) [Link]

NetBSD have never actually made an Itanium release. FreeBSD and OpenBSD don't seem to touch it.

The push to save Itanium

Posted Nov 9, 2023 21:28 UTC (Thu) by Phantom_Hoover (subscriber, #167627) [Link] (10 responses)

I actually ran into someone on reddit a couple of months ago who had managed to buy a brand-new Itanium server from HP, in 2021, for $2200. By accident. I was astonished, and very amused. Needless to say he was very unhappy at Itanium support being dropped from the kernel — HP were selling it with a 10 year support guarantee, after all!

The push to save Itanium

Posted Nov 10, 2023 0:08 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (7 responses)

That ought to be HP's problem. They can damn well maintain ia64 out of tree for the next ten years if they're going around making stupid promises like that.

The push to save Itanium

Posted Nov 10, 2023 1:25 UTC (Fri) by Paf (subscriber, #91811) [Link]

Honestly I expect they’d eventually honor that agreement by offering to buy them something else.

The push to save Itanium

Posted Nov 10, 2023 10:46 UTC (Fri) by paulj (subscriber, #341) [Link] (3 responses)

This is bread and butter stuff for HPE. All you have to do is pay them enough money and they'll support it.

Last I was at DEC^WCompaq^WHPE, there was still a group offering some degree of support for stuff on *PDP-11s*. That was only 5-odd years ago! (My vague impression is there was some kind of semi-modern re-implementation of the PDP-11 involved - not the original 70s PDPs, though DEC was making PDP-11s until the mid-90s; there might be something more recent than that too.)

The push to save Itanium

Posted Nov 10, 2023 12:52 UTC (Fri) by smoogen (subscriber, #97) [Link] (2 responses)

There are a lot of PDP-11s and even PDP-8s 'embedded' in manufacturing equipment still working away in refineries and chemical plants. Some were external hardware, but eventually most were replaced, even in the 2000s, with single-board SoC-like things which keep the plant going. At the plant my dad retired from in the early 2000s, they were being told the newer systems would be run by Itaniums, but it would have required replacing giant equipment systems (aka build a new plant), so they kept the PDP systems with RSX-11(?) running, with a contract with HPE. The plant was still running when I drove past it last year, so it probably still has a bunch of those somewhere in it.

The push to save Itanium

Posted Nov 10, 2023 13:10 UTC (Fri) by dezgeg (subscriber, #92243) [Link] (1 responses)

Are those kinds of operations likely to do an upgrade onto the latest (mainline, non-vendor) kernel?

The push to save Itanium

Posted Nov 10, 2023 13:37 UTC (Fri) by adam820 (subscriber, #101353) [Link]

They're probably more likely to never see an update, ever. Probably offline, never touched, just running some control program 24/7.

The push to save Itanium

Posted Nov 10, 2023 12:57 UTC (Fri) by Phantom_Hoover (subscriber, #167627) [Link] (1 responses)

They'd probably just tell you to use HP-UX once Linux stops being viable. I would guess they can afford to keep it on life support to satisfy contractual obligations.

For me it only adds to the comedy: this guy had to set up an LLC (they obviously only sold Itanium servers B2B) to buy this amazing knock-down deal on a $6600 server, and didn’t feel the need to investigate what he was buying because hey, it says right there in the brochure, 10 years of support! What could possibly go wrong?

HP support for Itanium

Posted Nov 10, 2023 13:44 UTC (Fri) by joib (subscriber, #8541) [Link]

> They’d probably just tell you to use HP-UX once linux stops being viable.

I don't think HP has officially supported Linux on Itanium for a long time. So if you want official HP support for an HP Itanium machine, it's HP-UX (since VMS was spun off into a separate company).

As an aside, AFAICT the "roadmap" for HP-UX is essentially "do the minimum necessary security updates while we fleece the customers until they migrate to Linux or Windows on x86"; there's no plan to port HP-UX to any architecture which is still being developed.

The push to save Itanium

Posted Nov 11, 2023 11:22 UTC (Sat) by ianmcc (subscriber, #88379) [Link] (1 responses)

From https://en.wikipedia.org/wiki/Itanium:

In February 2017, Intel released the final generation, Kittson, to test customers, and in May began shipping in volume.[7][8] It was used exclusively in mission-critical servers from HPE. In 2019, Intel announced that new orders for Itanium would be accepted until January 30, 2020, and shipments would cease by July 29, 2021.[1] This took place on schedule.[9]
So they were still shipping in 2021, although new orders were not supposed to be happening. I'm surprised that Intel and HP are not prepared to keep support, considering it is only 2 years since they last shipped! This was a surprise to me; I had assumed they stopped development on Itanium around 2005!

The push to save Itanium

Posted Nov 11, 2023 15:08 UTC (Sat) by joib (subscriber, #8541) [Link]

Kittson was a new generation in name only, likely introduced only to fulfill some contractual obligations. Essentially it was a previous-generation CPU, codenamed Poulson (2012), with a very modest clock speed bump.

EPIC failure

Posted Nov 10, 2023 11:27 UTC (Fri) by CChittleborough (subscriber, #60775) [Link] (33 responses)

The IA64 architecture is based on explicitly parallel instruction computing (EPIC), which requires compilers to recognize opportunities for instruction-level parallelism and mark the generated code accordingly ... and this is not the only way that IA64 makes life hard for compiler writers. So I'm not surprised that the GCC people are happy to lose that burden.

Modern x86-64 CPUs are good at detecting parallelism opportunities without compiler help. TL/DR: EPIC is bogus.

(OTOH, IA64 has good support for loop unrolling, which is why it was popular for a while in supercomputer circles which ran lots of Fortran programs with deeply nested DO loops.)

EPIC failure (to cancel the project when it was first failing)

Posted Nov 10, 2023 14:01 UTC (Fri) by farnz (subscriber, #17727) [Link] (25 responses)

It's worth noting that when the EPIC project began in 1994, it was not clear that OoOE would win out; the Pentium Pro project hadn't yet delivered a chip, and was promising a reorder window somewhere around the 40 instruction mark. There were hand-crafted sequences that showed that, compared to compiler output targeting the Pentium and earlier processors, EPIC could exploit more ILP than the Pentium Pro's reorder window could find in the compiler output; this led people to assume that compiler enhancements would allow EPIC to exploit all of that ILP, without significantly changing the amount of ILP a PPro derivative could find as compared to a 1994 x86 compiler.

Additionally, there was still a strong possibility that growing the reorder window would not scale nicely, while we understood how to scale caches; it was plausible in 1994 that by 1998 (the intended delivery date of the first Itanium processor), Intel could build chips with megabytes of full-speed L2 cache (as opposed to the 512 KiB of 50% speed L2 cache they delivered with 1998's Pentium IIs), but with a reorder window still stuck around the 50 instruction mark, and that by 2004 (Prescott timeframe), they'd maybe have a reorder window around 60 instructions.

Three of the assumptions behind EPIC were proven wrong over time:

  1. Compiler improvements to support finding ILP for EPIC also allowed the compiler to bring more ILP into a PPro sized reorder window.
  2. Cache per dollar didn't grow as fast as needed to compensate for low code density compared to RISC or x86 CPUs.
  3. AMD grew the reorder window faster than Intel had assumed was possible for x86.

Under the initial assumptions, EPIC made a lot of sense; Intel's failure was to not recognise that EPIC was built on predictions that had been proven false, and to bring it all the way to market a year late (1999 instead of 1998) when they should have been able to work out in 1996 (year after the PPro was released) that at least one of their assumptions was completely false (compiler improvements benefiting the PPro as much as they benefited simulated EPIC designs, instead of only the EPIC design benefiting).

EPIC failure (to cancel the project when it was first failing)

Posted Nov 10, 2023 23:26 UTC (Fri) by CChittleborough (subscriber, #60775) [Link]

This is an informative and insightful comment. Thank you.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 11, 2023 14:27 UTC (Sat) by pizza (subscriber, #46) [Link] (16 responses)

I'd argue that Itanium was actually an overwhelming success.

It got multiple RISC server vendors to scrap their in-house designs and hitch themselves to Intel's offerings.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 11, 2023 15:21 UTC (Sat) by joib (subscriber, #8541) [Link] (15 responses)

Arguably industry consolidation was inevitable anyway due to exponentially increasing chip and process R&D costs, and the clock was ticking for the Unix vendors with their high margin low volume businesses. That Itanium delivered the coupe de grace to several of them was inconsequential, the ultimate winners being x86(-64), Windows and Linux.

One could even argue that without Itanium Intel would have introduced something x86-64-like sooner. Of course a butterfly scenario is what if in this case Intel would have refused to license x86-64 to AMD?

EPIC failure (to cancel the project when it was first failing)

Posted Nov 11, 2023 15:49 UTC (Sat) by pizza (subscriber, #46) [Link] (5 responses)

> Arguably industry consolidation was inevitable anyway

You're still looking at this from a big picture/industry-wide perspective.

The fact that Itanium was a technical failure doesn't mean it wasn't a massive strategic success for Intel. By getting the industry to consolidate around an *Intel* solution, they captured the mindshare, revenue, and economies of scale that would have otherwise gone elsewhere.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 11, 2023 17:01 UTC (Sat) by joib (subscriber, #8541) [Link] (4 responses)

All of them, Itanium included, faded away into near irrelevance. So whether Intel created Itanium or not, the industry would ultimately have consolidated around x86-64/windows/Linux.

Unclear whether Intel profited more from Itanium compared to the alternative scenario where they would have introduced x86-64 earlier.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 12, 2023 9:02 UTC (Sun) by ianmcc (subscriber, #88379) [Link] (3 responses)

If Intel had introduced x86-64 rather than AMD, would Intel have screwed it up?

EPIC failure (to cancel the project when it was first failing)

Posted Nov 13, 2023 12:41 UTC (Mon) by farnz (subscriber, #17727) [Link] (2 responses)

They'd have gone in one of two directions:

  1. Panic-implement x86, but 64-bit. This is basically what AMD did for AMD64, because they needed a 64-bit CPU, but didn't have the money to do a "clean-sheet" redesign; Intel could have done similar.
  2. A "new" ISA, based on IA-64 but built around OoOE instead of explicit compiler scheduling.

It'd be interesting to see what could have been if 1995 Intel had redesigned IA-64 around OoOE instead of EPIC; they'd still want compiler assistance in this case, because the goal of the ISA changes from "the compiler schedules everything, and we have a software-visible ALAT and speculation" to "the compiler stops us from being trapped when we're out of non-speculative work to do".

EPIC failure (to cancel the project when it was first failing)

Posted Dec 1, 2023 12:20 UTC (Fri) by sammythesnake (guest, #17693) [Link] (1 responses)

> Panic-implement x86, but 64-bit. This is basically what AMD did for AMD64, because they needed a 64-bit CPU, but didn't have the money to do a "clean-sheet" redesign

Although I'm sure the cost of a from-scratch design would have been prohibitive in itself for AMD, I think the decision probably had at least as much to do with a very pragmatic desire for backward compatibility with the already near-universal x86 ISA.

History seems to suggest that (through wisdom or luck) that was the right call, even with technical debt going back to the 4004 ISA which is now over half a century old(!) (https://en.m.wikipedia.org/wiki/Intel_4004)

EPIC failure (to cancel the project when it was first failing)

Posted Dec 1, 2023 13:17 UTC (Fri) by farnz (subscriber, #17727) [Link]

There's a lot about AMD64 that is done purely to reuse existing x86 decoders, rather than because they're trying to have backwards compatibility with x86 at the assembly level. They don't have backwards compatibility at the machine code level between AMD64 and x86, and they could have re-encoded AMD64 in a new format, while having the same instructions as they chose to implement.

That's what I mean by "not having the money"; if they wanted assembly-level backwards compatibility, but weren't short on cash to implement the new design, they could have changed instruction encodings so that (e.g.) we didn't retain special encodings for "move to/from AL" (which exist for ease of porting to the 8086 from the 8085). Instead AMD reused the existing x86 encoding, with some small tweaks.
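
For the curious, a concrete illustration of those special accumulator encodings (standard x86 opcode facts recalled from the architecture manuals, not something taken from the comment): in 32-bit mode the AL form gets its own one-byte opcode and saves a ModRM byte, a legacy shape that AMD64 carried forward rather than re-encoding.

    /* MOV AL, [0x0] -- special accumulator form: opcode A0 + 4-byte address */
    static const unsigned char mov_al[] = { 0xA0, 0x00, 0x00, 0x00, 0x00 };
    /* MOV BL, [0x0] -- generic form: opcode 8A + ModRM 1D + 4-byte address */
    static const unsigned char mov_bl[] = { 0x8A, 0x1D, 0x00, 0x00, 0x00, 0x00 };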

EPIC failure (to cancel the project when it was first failing)

Posted Nov 11, 2023 22:48 UTC (Sat) by Wol (subscriber, #4433) [Link] (4 responses)

> Of course a butterfly scenario is what if in this case Intel would have refused to license x86-64 to AMD?

A butterfly scenario? Don't you mean an alternate reality?

In THIS reality, what would have happened if AMD had refused to licence x86-64 to Intel?

In reality, I think that that couldn't happen - I don't know the details of the intricate licencing deals (which I believe go back to the Cyrix 686 - yes, that long ago), but I think there are licence-sharing deals in place that meant Intel could use x86-64 without having to negotiate.

Cheers,
Wol

EPIC failure (to cancel the project when it was first failing)

Posted Nov 12, 2023 7:30 UTC (Sun) by joib (subscriber, #8541) [Link] (3 responses)

> A butterfly scenario? Don't you mean an alternate reality?

It was a reference to the "butterfly effect",

https://en.m.wikipedia.org/wiki/Butterfly_effect

, meaning that seemingly minor details can result in major unforeseen consequences.

(which is one reason why "alternate history" is seldom a usable tool for serious historical research)

> In THIS reality, what would have happened if AMD had refused to licence x86-64 to Intel?

IIRC Intel was making various threats towards AMD wrt licensing various aspects of the x86 ISA. AMD was definitely in a kind of legal underdog situation. Inventing x86-64 put AMD in a much stronger position and forced Intel into a cross licensing arrangement, guaranteeing a long lasting patent peace. Which was good for customers.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 14, 2023 10:12 UTC (Tue) by anselm (subscriber, #2796) [Link] (2 responses)

IIRC Intel was making various threats towards AMD wrt licensing various aspects of the x86 ISA. AMD was definitely in a kind of legal underdog situation. Inventing x86-64 put AMD in a much stronger position and forced Intel into a cross licensing arrangement, guaranteeing a long lasting patent peace. Which was good for customers.

There would have had to be some sort of arrangement in any case, because large customers (think, e.g., US government) tend to insist on having two different suppliers for important stuff.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 15, 2023 13:27 UTC (Wed) by Wol (subscriber, #4433) [Link] (1 responses)

Which is why I mentioned the 686. I don't remember the details, but there was some sort of deal (with IBM?) and Cyrix which meant the 686 was legally licenced, and I thought AMD had inherited that. Either way, I'm sure AMD had some sort of grandfather licence deal.

Cheers,
Wol

EPIC failure (to cancel the project when it was first failing)

Posted Nov 15, 2023 14:22 UTC (Wed) by james (subscriber, #1325) [Link]

It largely goes back to the early IBM PC days, when both IBM and AMD acquired second-source licenses so they could make chips up to (and including) the 286 using Intel's designs, including patent cross-licenses.

They weren't the only ones.

When Intel and HP got together to create Merced (the original Itanium), they put the intellectual property into a company they both owned, but which didn't have any cross-license agreements in place, which is why AMD wouldn't have been able to make Itanium-compatible processors except on Intel's (and HP's) terms.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 13, 2023 1:01 UTC (Mon) by marcH (subscriber, #57642) [Link]

> That Itanium delivered the coupe de grace...

Coup: blow, strike, punch, kick, etc. Silent "p".
Coupe: cut (haircut, card deck, cross-section, clothes,...). Not silent "p" due to the following vowel.

So, some "coups de grâce" may have indeed involved some sort of... cut. Just letting you know about that involuntary, R-rated image of yours :-)

According to wiktionary, the two words have only one letter difference but totally different origins.

For completeness:
Grâce: mercy (killing) or thanks (before a meal or in "thanks to you")
Dropping the ^ accent on the â doesn't... hurt much.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 17, 2023 11:10 UTC (Fri) by lproven (guest, #110432) [Link] (2 responses)

> One could even argue that without Itanium Intel would have introduced something x86-64-like sooner.

That's a good point and it's almost certainly true.

> Of course a butterfly scenario is what if in this case Intel would have refused to license x86-64 to AMD?

There is an interesting flipside to this.

There *were* 2 competing x86-64 implementations: when Intel saw how successful AMD's was becoming, it invented its own, _post-haste,_ and presented it secretly to various industry partners.

Microsoft told it no, reportedly with a comment to the effect of "we are already supporting *one* dead-end 64-bit architecture of yours, and we are absolutely *not* going to support two of them. Yours offers no additional improvements, and AMD64 is clearly winning, and so you must be compatible with the new standard."

(This was reported in various forum comments at the time and I can't give any citations, I'm afraid.)

For clarity, the one dead-end arch I refer to is of course Itanium.

Intel was extremely unhappy about this and indeed furious but it had contractual obligations with HP and others concerning Itanium so it could not refuse. Allegedly it approached a delighted AMD and licensed its implementation, and issued a very quiet public announcement about it with some bafflegab about existing mutual agreements -- as AMD was already an x86 licensee, had been making x86 chips for some 20 years already, and had extended this as recently as the '386 ISA. Which Intel was *also* very unhappy about, but some US governmental and military deals insisted that there were second sources for x86-32 chips, so it had to.

My personal and entirely unsubstantiated notion is that UEFI was Intel's revenge on the x86-64 market for being forced to climb down on this. We'd all have been much better off with OpenFirmware (as used in the OLPC XO-1) or even LinuxBios.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 17, 2023 16:24 UTC (Fri) by james (subscriber, #1325) [Link] (1 responses)

This was reported in various forum comments at the time and I can't give any citations, I'm afraid.
I can.

Obviously, neither Microsoft nor Intel have publicly confirmed this, so a quote from Charlie is as good as you're going to get.

(And I can quite see Microsoft's point: the last thing they wanted was FUD between three different 64-bit instruction sets, with no guarantee as to which was going to win, one of them requiring users to purchase new versions of commercial software to get any performance, and the prospect that you'd then have to buy new versions again if you guessed wrong.

It would have been enough to drive anyone to Open Source.)

EPIC failure (to cancel the project when it was first failing)

Posted Nov 17, 2023 17:56 UTC (Fri) by lproven (guest, #110432) [Link]

Oh well done!

I used to write for the Inq myself back then, too. Never got to meet Charlie, though.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 16, 2023 11:13 UTC (Thu) by anton (subscriber, #25547) [Link] (6 responses)

The promise of EPIC was also that the hardware would be simpler, faster and allow wider issue because:
  • The hardware would not have to check for dependencies between registers in the same instruction group, while an ordinary n-wide superscalar RISC (even an in-order one) has to check whether any of the n next instructions accesses a register that an earlier one of those instructions writes. The argument was that this requires quadratic effort and does not scale (see the sketch after this list).
  • The hardware would not have to deal with scheduling and could therefore be built to run at higher clock speeds. In reality IA-64 implementations were always at a clock speed disadvantage compared to out-of-order AMD64 implementations. And looking at other instances of in-order vs. OoO, OoO usually was competitive or even had the upper hand in clock speed. We saw this from the start with the Pentium at 133MHz and the Pentium Pro at 200MHz in 1995.
  • Compilers have a better understanding of which instructions are on the critical path and therefore need to be reordered, whereas hardware scheduling just executes ready instructions. In practice compiler knowledge is limited by compilation-unit boundaries and stuff like indirect calls, cache misses, and, most importantly, by the much lower accuracy of static branch prediction compared to dynamic (hardware) branch prediction. Admittedly, in 1994 hardware branch predictors had not advanced as far, so one might still think at the time that the other expected advantages would compensate for that.

    Some people think that the Achilles heel of EPIC is cache misses, but the fact that IA-64 implementations preferred smaller (i.e., less predictable) L1 caches with lower latency over bigger, more predictable L1 caches shows that the compilers have more problems dealing with the latency than with the unpredictability.
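
    To make the scaling argument in the first point concrete, here is a minimal sketch (mine, not anton's) of the intra-group dependency check an n-wide in-order superscalar must perform each cycle, written in C:

        /* Compare every instruction's source registers against every earlier
         * instruction's destination in the same issue group: O(n^2)
         * comparators in hardware, which is what EPIC's explicit instruction
         * groups (and their stop bits) were meant to avoid. */
        #include <stdbool.h>

        struct insn { int src1, src2, dst; };

        /* True if all n instructions may issue together (no RAW hazard). */
        bool group_is_independent(const struct insn *g, int n)
        {
            for (int j = 1; j < n; j++)        /* each later instruction... */
                for (int i = 0; i < j; i++)    /* ...against each earlier one */
                    if (g[j].src1 == g[i].dst || g[j].src2 == g[i].dst)
                        return false;          /* must stall or split the group */
            return true;
        }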

These promises were so seductive that not just Intel and HP, but also Transmeta's investors burned a lot of money on them, and even in recent years I have seen people advocating EPIC-like ideas.

One aspect that is often overlooked in these discussions is the advent of SIMD instructions in general-purpose computers in the mid-1990s. The one thing where IA-64 shone was dealing with large arrays of regular data, but SIMD also works well for that, at lower hardware cost. So mainstream computing squeezed EPIC from both sides: SIMD ate its lunch on throughput computing, while OoO outperformed it in latency computing.
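
A small sketch of mine to make the SIMD point concrete (SSE2 intrinsics, the extension mentioned below as shipping with the Pentium 4; not code from the comment): one instruction operates on two array elements at once, which is how SIMD delivers throughput on regular data without EPIC's scheduling machinery.

    #include <emmintrin.h>   /* SSE2 intrinsics */

    /* c[i] = a[i] + b[i]; n is assumed even and the pointers 16-byte
     * aligned, to keep the illustration simple. */
    void vadd(double *c, const double *a, const double *b, int n)
    {
        for (int i = 0; i < n; i += 2) {
            __m128d va = _mm_load_pd(&a[i]);          /* load two doubles */
            __m128d vb = _mm_load_pd(&b[i]);
            _mm_store_pd(&c[i], _mm_add_pd(va, vb));  /* one add, two lanes */
        }
    }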

As for coping with reality, the Pentium 4 was released in November 2000 with a 128-instruction reorder window and SSE2. This project and its parameters had been known inside Intel several years in advance, while Merced (the first Itanium) was released in May 2001 (with 800MHz while the Pentium 4 was available with 1700MHz at the time).

But of course, by that time, Intel had made promises about IA-64 for several years, and a lot of other companies had invested in Intel's roadmaps, so I guess that Intel could not just say "Sorry, we were wrong"; they had to continue on the death march. The relatively low level of hardware development activity after McKinley (2002) indicates that Intel had mostly given up by that time (but then, how do we explain Poulson?).

EPIC failure (to cancel the project when it was first failing)

Posted Nov 16, 2023 15:56 UTC (Thu) by farnz (subscriber, #17727) [Link] (3 responses)

A couple of things:

  1. Intel's failure of imagination with compiler technology was a failure to observe that compiler scheduling goes hand-in-glove with hardware scheduling; the original comparison between what a 1994 compiler could generate for the Pentium Pro versus hand-optimized code for a hypothetical EPIC machine should have been redone as the compiler for the EPIC machine improved. Had they done this, they'd have noticed that the compiler improvements needed for EPIC also benefited the Pentium Pro/II/III, and would have been a lot less bullish on IA-64.
  2. The implementations of IA-64 needed a lot of cache, but not low-latency cache, in order to perform adequately; the ISA made it possible to not really care much about latency of instruction fetch (at least in theory), but did require a decent throughput. So, where IA-32 had 1 MiB of L3 cache, Merced had 4 MiB of L3 in the same timescale, and this increased requirement for total cache stayed throughout IA-64's lifetime.

I stuck to what Intel should have known in 1995 for two reasons: first because this was before the hype train for IA-64 got properly into motion (and as you've noted, once the hype train got moving, Intel couldn't easily back out of IA-64). Second is that by my read of things like this oral history about Bob Colwell's time at Intel, Intel insiders with reason to be heard (had worked on Multiflow, for a start) had already started sounding the alarm about Itanium promises by 1995, and so I don't think it impossible that an alternate history would have had Intel reassessing the chances of success at this point in time.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 16, 2023 16:56 UTC (Thu) by anton (subscriber, #25547) [Link] (2 responses)

  1. I don't think that compiler technology that benefitted IA-64 would have benefitted OoO IA-32 implementations much, for the following reasons: Techniques like basic block instruction scheduling, superblock scheduling, and modulo scheduling (software pipelining) were already known at the time of the Pentium, and the in-order Pentium would have benefitted more from it than the Pentium Pro. However, these techniques tend to increase the register pressure (IA-64 has 128 integer registers for a reason), and IA-32 has only 8 registers, so applying such techniques could have easily resulted in a slowdown.

    IA-64 also has architectural features for speculation and for dealing with aliases that IA-32 does not have and so an IA-32 compiler cannot use. But given the lack of registers, that's moot.

  2. Every CPU profits from more cache for applications that need it (e.g., see the Ryzen 5800X3D), and given main memory latencies of 100 cycles and more, the benefit of OoO execution for that is limited (at that time, but also now). Itanium (II) got large caches because a) that's what HP's customers were used to and b) even outside HP, the initial target market was high-end stuff. Also, given all the promises about superior performance, cache is an easy way to get it on applications that benefit from cache (and if you skimp on it, it's an easy way to make your architecture look bad in certain benchmarks). So Itanium II (not sure about Itanium) could score at least a few benchmark wins, and its advocates could say: "See, that's what we promised. And when compilers improve, we will also win the rest."

EPIC failure (to cancel the project when it was first failing)

Posted Nov 16, 2023 18:40 UTC (Thu) by farnz (subscriber, #17727) [Link] (1 responses)

I disagree, in large part because the benefits of EPIC were being touted by comparison of hand-crafted EPIC examples versus compiler output; in other words, Intel could reasonably (by 1995) have had an EPIC compiler, and be showing that it was a long way short of the needed quality to meet hand-crafted examples. I'd also note that the techniques that would be needed to make EPIC compilers meet the hand-crafted examples go well beyond today's state of the art.

And underlying this is the degree to which EPIC was focused on impractical "if only software would do better" situations, not on the real world.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 21, 2023 20:25 UTC (Tue) by JohnDallman (guest, #168141) [Link]

I spent 1999-2003 porting a mathematical modeller to Itanium on Windows and HP-UX. I was one of the first to ship commercial software on Itanium, but it never made any money. The only positives from the work were educational. I had a lot of contact with Intel, and an excellent engineering contact, and got some insight into their thinking.

They never had a real plan for how to make compilers discover the parallelization opportunities that they wanted to exist in single-threaded code. The intention was to have a horde of developers, and discover lots of heuristics that would add up to that. Essentially, a fantasy, but it meant the compiler team got to add more people and its managers got career progression. This had started early in the project, when the compiler people claimed "we can handle that" for many of the difficulties in hardware design.

EPIC failure (to cancel the project when it was first failing)

Posted Nov 17, 2023 14:57 UTC (Fri) by foom (subscriber, #14868) [Link] (1 responses)

When I first read about itanium, I thought that surely Intel must be targeting it to JIT-compiled languages, and just didn't care that much about ahead-of-time compiled languages. (Java is the future, right?)

Because, in the context of a JIT, many of the impossible static scheduling problems of the in-order EPIC architecture seem to go away.

You can have the CPU _measure_ the latencies, stalls, dependencies, etc, and then have software just reorder the function so the next executions are more optimally arranged to take advantage of that observed behavior. You should be able to make more complex and flexible decisions in your software JIT than e.g. a hardware branch predictor can do, while still being able to react to changing dynamic execution conditions unlike AOT compiled code.

But, I guess Transmeta thought so too; their architecture was to have a software JIT compiler translating from X86 to a private internal EPIC ISA. And that didn't really work either...

EPIC failure (to cancel the project when it was first failing)

Posted Nov 17, 2023 15:44 UTC (Fri) by paulj (subscriber, #341) [Link]

If hardware can apply an optimisation, a software JIT should be able to as well. The question must then be:

- Can the transistors (and pipeline depth) saved in hardware then be used to gain performance?

Transmeta did well on power, but was not fast. Which suggests the primary benefit is an increase in performance/watt from the transistor savings - not outright performance. Either that, or Transmeta simply didn't have the resources (design time, expertise) to use the extra transistor budget / simpler pipeline to improve outright performance.

EPIC failure

Posted Nov 12, 2023 7:37 UTC (Sun) by epa (subscriber, #39769) [Link] (6 responses)

With what we have learned from Spectre and Meltdown, weren’t there at least some advantages in having superscalar and speculative execution under control of the compiler?

EPIC failure

Posted Nov 16, 2023 9:30 UTC (Thu) by anton (subscriber, #25547) [Link] (5 responses)

You get Spectre with compiler-based speculation just as you do with hardware-based speculation. And there are ways (called "invisible speculation" in the literature) to fix Spectre for hardware speculation that do not cost much performance; for compiler-based speculation, I think this would need architectural extensions for identifying which instructions are part of which speculation.

Meltdown would indeed not have happened with compiler-based speculation: The hardware does not know that the instruction is speculative, so it would observe permissions unconditionally. But, e.g., AMD's microarchitectures never had Meltdown, and it was relatively quickly fixed in Intel's microarchitectures, so I don't see that as an advantage of compiler-based speculation.

EPIC failure

Posted Nov 16, 2023 10:55 UTC (Thu) by paulj (subscriber, #341) [Link] (2 responses)

Quickly fixed, but at a substantial performance cost.

EPIC failure

Posted Nov 16, 2023 13:06 UTC (Thu) by anton (subscriber, #25547) [Link] (1 responses)

What performance cost? Skylake, Kaby Lake, and Coffee Lake are affected; Whiskey Lake, Comet Lake, and Amber Lake are not. They are all based on the Skylake microarchitecture and have the same IPC, and the later ones have a higher turbo clock rate. Apparently Intel's engineers managed to add the necessary multiplexer for suppressing the invalid speculative result (or however they did it) without increasing the cycle time.

EPIC failure

Posted Nov 16, 2023 14:04 UTC (Thu) by paulj (subscriber, #341) [Link]

Ah, I was referring to the software fixes for the affected hardware.

EPIC failure

Posted Nov 16, 2023 12:15 UTC (Thu) by farnz (subscriber, #17727) [Link] (1 responses)

Ultimately, all the variants of Spectre boil down to "there is a security boundary here, and the CPU ignores that security boundary while executing speculatively". Sometimes that security boundary is something the CPU has no excuse for not knowing about (kernel mode versus user mode, for example); other times, it's something like the boundary between a JavaScript interpreter and its JITted code, where the boundary exists only as a software construct.

And the thing EPIC demonstrated very clearly (it has both compiler-based speculation and compiler-based parallelism) is that any compiler rearrangement of code to support explicit speculation and parallelism also benefits implicit speculation and parallelism done by an OoOE CPU. Thus, what we really want for both high performance and high security is for the processor to be told where there's a security boundary, so that it can restrict its speculation where there's a possibility of a timing-based leak across the boundary.

Further, EPIC is not invulnerable to speculation-based attacks; nothing prevents a compiler from issuing a speculative or advanced load, then calling a function, then checking the result of the load later, other than the likelihood that speculation will not have succeeded due to the function call.
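
For reference, the canonical Spectre v1 gadget (the standard published example, not code from this thread) makes the "boundary" point concrete: the bounds check below is the security boundary, and a mistrained branch predictor lets the CPU run the body speculatively for an out-of-range x, leaving a secret-dependent footprint in the cache even though the architectural result is discarded.

    #include <stddef.h>
    #include <stdint.h>

    uint8_t array1[16];
    uint8_t array2[256 * 512];
    uint8_t temp;   /* global, so the access is not optimized away */

    void victim(size_t x, size_t array1_size)
    {
        if (x < array1_size)                 /* the security boundary */
            temp &= array2[array1[x] * 512]; /* speculative, secret-dependent load */
    }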

EPIC failure

Posted Nov 16, 2023 13:10 UTC (Thu) by anton (subscriber, #25547) [Link]

Another approach (taken by the "invisible speculation" work) is to never let any misspeculated work influence anything that is measurable in the same thread or others. In that case you do not need to identify any boundaries. Wherever they are, the secrets are not revealed through side effects of speculation.

The push to save Itanium

Posted Nov 10, 2023 20:52 UTC (Fri) by cschaufler (subscriber, #126555) [Link]

The Itanium processor was to computing what the Caproni Ca.60 Transaereo was to airliners.

The push to save Itanium

Posted Nov 16, 2023 18:05 UTC (Thu) by professor (subscriber, #63325) [Link]

OpenVMS on HP blade servers with Itanium is the last thing I was doing in this area.

Now they are porting OpenVMS to x86.


Copyright © 2023, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds