Comment on China solves 'century-old problem' with new analog chip that is 1,000 times faster than high-end Nvidia GPUs

Melobol@lemmy.ml 2 days ago

I asked ChatGPT to explain the paper - here is what it said - so you don’t have to:

Many computing tasks (especially in things like signal processing, wireless communications, scientific computing, and AI) boil down to solving equations of the form A x = b, where a known matrix A times an unknown vector x equals a known vector b.
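For concreteness, here is the kind of digital baseline the analogue chip is being compared against; this tiny numpy example is my own illustration, not from the paper:

```python
import numpy as np

# The problem the chip accelerates: solve A x = b for x.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])

# Conventional digital route: floating-point solver, roughly O(n^3) for dense A.
x = np.linalg.solve(A, b)
print(x)          # the solution vector
print(A @ x - b)  # residual, ~0 up to float64 rounding error
```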

Traditionally these are solved on digital computers with floating-point arithmetic, and for large problems this can be slow and energy-intensive.

An alternative is analogue computing, where operations are carried out directly in hardware (for example, using resistive memory devices) rather than converting everything to the digital domain. This can potentially be much faster and more energy-efficient.

But analogue computing has historically had two big problems: precision (how accurate the answers are) and scalability (how large a problem it can handle). This paper addresses both.

What they did

They used resistive random-access memory (RRAM) chips: memory devices where each cell’s conductance (how easily it lets current through) acts as one number in a matrix.
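The basic physics trick, as I understand it (an idealized sketch assuming perfect devices, not the paper's circuit): conductances store the matrix, applied voltages encode the vector, and Ohm's and Kirchhoff's laws do the multiply-and-sum in parallel.

```python
import numpy as np

# Idealized RRAM crossbar doing a matrix-vector product "for free":
# each cell's conductance G[i, j] stores A[i, j]; applying voltages V
# (the vector x) makes each cell pass current G*V (Ohm's law), and the
# currents summed along each row (Kirchhoff's current law) give A @ x.
G = np.array([[1e-6, 2e-6],
              [3e-6, 4e-6]])   # conductances (siemens) encoding the matrix
V = np.array([0.1, 0.2])        # input voltages encoding the vector

row_currents = G @ V            # what the analogue array computes physically
print(row_currents)             # proportional to A @ x
```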

They built an analogue system that does two key steps:

A low-precision analogue matrix inversion (LP-INV) step.

A high-precision analogue matrix-vector multiplication (HP-MVM) step, using bit slicing (splitting each number into several lower-precision slices) to boost precision (see the sketch just below).
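Here is roughly how bit slicing can recover precision from low-precision cells; this is a simplified digital illustration of the idea (split each entry into small slices, multiply each slice, recombine with binary weights), not the paper's exact scheme.

```python
import numpy as np

def bit_sliced_mvm(A_int, x, bits_per_slice=3, n_slices=3):
    """Matrix-vector product using several low-precision slices of A,
    mimicking how limited-precision analogue cells can be combined.
    Entries of A_int must fit in bits_per_slice * n_slices bits."""
    base = 1 << bits_per_slice                 # e.g. 8 levels for 3-bit cells
    result = np.zeros_like(x, dtype=float)
    for s in range(n_slices):
        slice_s = (A_int >> (s * bits_per_slice)) % base   # one 3-bit slice
        result += (base ** s) * (slice_s @ x)              # weight and accumulate
    return result

A_int = np.array([[300, 17], [5, 450]])   # 9-bit entries
x = np.array([1.0, 2.0])
print(bit_sliced_mvm(A_int, x))           # [334. 905.]
print(A_int @ x)                          # matches the full-precision product
```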

They also developed a method called “BlockAMC” (Block Analog Matrix Computing), which partitions a large matrix into blocks so that the analogue method can scale to larger sizes.
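For intuition on how block partitioning lets small arrays handle bigger matrices, the standard 2×2-block (Schur complement) inversion identity below only ever inverts blocks of the small size; I'm assuming BlockAMC exploits something along these lines, though the paper's actual recursion and analogue mapping differ in detail.

```python
import numpy as np

def block_inverse(M, k):
    """Invert M via the classic 2x2-block Schur-complement identity;
    only k x k inversions are needed, so each one could in principle
    be handed to a small analogue inversion array."""
    A, B = M[:k, :k], M[:k, k:]
    C, D = M[k:, :k], M[k:, k:]
    Ainv = np.linalg.inv(A)             # small, array-sized inversion
    S = D - C @ Ainv @ B                # Schur complement
    Sinv = np.linalg.inv(S)             # second small inversion
    top = np.hstack([Ainv + Ainv @ B @ Sinv @ C @ Ainv, -Ainv @ B @ Sinv])
    bottom = np.hstack([-Sinv @ C @ Ainv, Sinv])
    return np.vstack([top, bottom])

M = np.random.rand(16, 16) + 16 * np.eye(16)   # well-conditioned test matrix
print(np.allclose(block_inverse(M, 8), np.linalg.inv(M)))   # True
```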

They built the hardware: RRAM chips fabricated in a foundry 40-nm CMOS process with a one-transistor, one-resistor (1T1R) cell configuration, supporting 3-bit multilevel conductance (eight states per cell).

Using their analogue system, they experimentally performed a 16×16 real-valued matrix inversion with ~24-bit fixed-point precision (comparable to 32-bit floating point).
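One way a rough inverse plus accurate multiplications can yield a high-precision answer is classical iterative refinement; I'm assuming the LP-INV/HP-MVM pairing plays an analogous role (the paper's exact scheme may differ), and the sketch below just shows why that kind of combination works.

```python
import numpy as np

def refine(A, b, A_inv_rough, iters=10):
    """Iterative refinement: a rough inverse (stand-in for LP-INV) is
    corrected with accurate residuals (stand-in for HP-MVM), so the final
    solution is far more precise than the inverse used to obtain it."""
    x = A_inv_rough @ b                      # coarse initial solve
    for _ in range(iters):
        r = b - A @ x                        # high-precision residual
        x = x + A_inv_rough @ r              # low-precision correction
    return x

A = np.random.rand(16, 16) + 16 * np.eye(16)
b = np.random.rand(16)
A_inv_rough = np.linalg.inv(A).astype(np.float16).astype(float)  # ~3 decimal digits
print(np.linalg.norm(A @ refine(A, b, A_inv_rough) - b))         # tiny residual
```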

They also demonstrated a real-world application: signal detection in a massive MIMO wireless-communication system (16×4 and 128×8 antenna configurations) with high-order modulation (256-QAM). Their analogue solver matched the performance of a digital processor within two to three cycles.
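For context on why MIMO detection is a natural fit: a linear detector ends up solving exactly an A x = b problem. The toy zero-forcing example below is my own illustration (with simplified 16-QAM-style symbols rather than 256-QAM, and the paper's detector may use a different linear formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy uplink: 16 receive antennas, 4 transmit antennas, complex channel H.
H = (rng.normal(size=(16, 4)) + 1j * rng.normal(size=(16, 4))) / np.sqrt(2)
x_sent = rng.choice([-3, -1, 1, 3], size=4) + 1j * rng.choice([-3, -1, 1, 3], size=4)
y = H @ x_sent + 0.01 * (rng.normal(size=16) + 1j * rng.normal(size=16))

# Zero-forcing detection: solve the normal equations (H^H H) x = H^H y,
# i.e. precisely the "A x = b" problem the analogue solver targets.
A = H.conj().T @ H
b = H.conj().T @ y
x_hat = np.linalg.solve(A, b)
print(np.round(x_hat.real) + 1j * np.round(x_hat.imag))   # recovers x_sent
```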

They measured the speed (the analogue inversion circuit converged in ~120 ns for a 4×4 matrix) and estimated that their approach could offer ~1,000× higher throughput and ~100× better energy efficiency than state-of-the-art digital processors at the same precision.

Why it matters

If you can solve matrix equations much faster and with much less energy, that opens up possibilities for things like base stations in wireless networks (which have many antennas), real-time signal processing, AI training, and scientific simulation.

Using analogue hardware like RRAM arrays helps overcome the “von Neumann bottleneck” (the slowdown and energy cost of moving data between memory and processor) because the memory is the compute.

The fact that they reached high precision (comparable to digital float32) matters because a major criticism of analogue computing has been that it is too noisy and low-precision for serious tasks. This shows it can be done.

The scalability (via their BlockAMC approach) means this isn’t just a toy 2×2 demonstration; they show matrices up to 16×16 and hint at larger sizes.

Important caveats & challenges

The arrays currently demonstrated for LP-INV are small (8×8), and scaling to much larger arrays still faces engineering challenges (device reliability, wiring resistance, noise, etc.).

The BlockAMC algorithm introduces overhead as problems grow: the complexity isn’t strictly constant for arbitrarily large matrices, so there is some cost to scaling.

While the big energy and throughput gains are estimates, real-world integration (with all the peripheral circuitry: DACs, ADCs, control logic) will still need refinement.

Applications: they demonstrate wireless signal detection (MIMO), which is a good fit, but other domains (scientific computing, general AI) may have different requirements (matrix size, sparsity, conditioning).

Analogue computing still has to deal with device variability, drift, calibration, faults in memory cells, etc. The paper mentions some of these (e.g., stuck-at faults) and how to mitigate them.

In everyday terms

Imagine you have a huge table of numbers (a matrix) and you need to find a vector x so that when the matrix multiplies x you get some result b; this is just solving a system of linear equations. Normally a computer does this step by step in digital form, which takes time and energy, especially for large tables.

What these researchers did is build a physical piece of hardware where the table of numbers is literally encoded in a memory chip (via conductances) and the solving is done by analogue electrical flows. Because the currents flow in parallel and settle almost instantly (relative to clocked digital logic), this can be much faster and more efficient.

They also built in ways to make the answers very accurate (not just approximate) and to scale the method up to realistic sizes. In short: they revived the old “analogue computing” idea, but with modern memory chips, and showed it can match digital precision while running faster and at lower power.

source