Actually, I was thinking of the Transmeta Crusoe.
Here's the thing.
HPC is all about numerical computing.
Now, in theory, porting most of the code just means re-compiling it with whatever FORTRAN compiler you have on your architecture.
But too cute a re-ordering and your end user's carefully crafted high-level numerical algorithm turns to rubbish. Quickly produced rubbish results are still rubbish.
TBH I think a lot of this is bo***cks.
Furber knew when they designed the ARM that it was all about DRAM latency, and it still is.
You're talking to the on-chip cache. What happens when that misses? What happens when (if you have one) the L2 lookup misses too? Because it will happen.
I'd say a big part of serious HPC design is an n-way interleaved memory system: lining up the data and code within those rows and keeping the row switches as infrequent (and as overlapped) as possible. And even then, what's the current standard? 8ns flat out, when the processor could be clocking at 0.5ns?
As for power consumption, we've known for years that the clock drivers are among the biggest power sinks on any processor, and going clockless is the way to eliminate most of them.
Given that a single modern DRAM row could hold the entire address space of an 8-bit processor (laid out as a bit stream on one chip), yes, code density is still a thing.
We know that compiler technology has improved in theory over the last few decades, but how much of that has actually made it into production use?
It's not like there aren't alternatives to x86 already available, like the open-source SPARC ISA.
This has a huge mountain to climb.