

The interested reader can learn about some of these factors in section 2.

What Every Programmer Should Know About Memory

By the way, Wikipedia already had a good article about QPI. Usually this abbreviation means megabytes, but the text implied that it was megabits. It does not matter if the GPU is only used while the system is initially set up.

To be specific, this is an "auto-refresh"; "self-refresh" is a low-power mode. A third problem is that charging and draining a capacitor is not instantaneous.
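The point about charging and draining not being instantaneous can be illustrated with the standard RC-circuit relation Q(t) = Q0 (1 - e^(-t/RC)). The following is only a minimal sketch: the function names are made up for illustration, and the resistance and capacitance values are hypothetical numbers, not measurements of any real DRAM cell.

```python
import math


def charge_fraction(t, r_ohms, c_farads):
    """Fraction of full charge reached after charging for t seconds."""
    return 1.0 - math.exp(-t / (r_ohms * c_farads))


def discharge_fraction(t, r_ohms, c_farads):
    """Fraction of the initial charge remaining after draining for t seconds."""
    return math.exp(-t / (r_ohms * c_farads))


# Hypothetical values, chosen only to make the numbers readable.
R = 10_000.0   # resistance in ohms (assumption)
C = 30e-15     # capacitance in farads (assumption)
TAU = R * C    # time constant of the RC circuit

if __name__ == "__main__":
    for k in (1, 2, 3, 5):
        frac = charge_fraction(k * TAU, R, C)
        print(f"after {k} time constant(s): {frac:.3f} of full charge")
```

Whatever values are plugged in, the shape is the same: the capacitor approaches full charge only asymptotically, so a read or write has to wait some multiple of the time constant before the stored level can be trusted.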

But the programmer usually has to do her share.

A fully charged capacitor holds a few tens of thousands of electrons. Note that these technical details tend to change rapidly, so the reader is advised to take the date of this writing into account. This cuts down on the transfer time but does not change the latency.


Ulrich Drepper

The core of this cell is formed by the four transistors M1 to M4, which form two cross-coupled inverters. If I am aware of the circuit requirements of memory refresh, I can design code that explicitly leaves time for the refresh, while giving good bandwidth and latency when the memory is actually accessed.

This huge difference in complexity of course means that it functions very differently than static RAM. I'll admit to having a Google alert on his homepage specifically so that I can read everything he writes.

A secondary scalability problem is that having 30 address lines connected to every RAM chip is not feasible either. Ulrich goes into quite a bit of detail about how an address is broken into a "row" and a "column" component, but there is a third component, the bank, internal to the DRAM.
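The row, column, and bank decomposition can be sketched with plain bit manipulation. The field widths and their ordering below are assumptions chosen for illustration (they add up to 27 bits, not the 30 mentioned in the text); real memory controllers use their own, often undocumented, mappings, and the names `DramAddress`, `split_address`, and `join_address` are hypothetical.

```python
from typing import NamedTuple

# Hypothetical field widths; real hardware mappings differ.
COLUMN_BITS = 10
BANK_BITS = 3
ROW_BITS = 14
TOTAL_BITS = ROW_BITS + BANK_BITS + COLUMN_BITS


class DramAddress(NamedTuple):
    row: int
    bank: int
    column: int


def split_address(addr: int) -> DramAddress:
    """Split a flat address into row, bank, and column fields."""
    if not 0 <= addr < (1 << TOTAL_BITS):
        raise ValueError("address out of range for the assumed layout")
    column = addr & ((1 << COLUMN_BITS) - 1)
    bank = (addr >> COLUMN_BITS) & ((1 << BANK_BITS) - 1)
    row = addr >> (COLUMN_BITS + BANK_BITS)
    return DramAddress(row, bank, column)


def join_address(parts: DramAddress) -> int:
    """Inverse of split_address: rebuild the flat address."""
    return (
        (parts.row << (COLUMN_BITS + BANK_BITS))
        | (parts.bank << COLUMN_BITS)
        | parts.column
    )


if __name__ == "__main__":
    example = (5 << (COLUMN_BITS + BANK_BITS)) | (2 << COLUMN_BITS) | 7
    print(split_address(example))
```

Because the row and column are sent over the same pins at different times, the chip needs only as many external address lines as the wider of the two fields, rather than one line per address bit.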

“What every programmer should know about memory” – the PDF version

There are variants with four transistors but they have disadvantages. Given the number of pins per channel, a single Northbridge cannot reasonably drive more than two channels. Unfortunately, neither the structure nor the cost of using the memory subsystem of a computer or the caches on CPUs is well understood by most programmers.

Using multiple external memory controllers is not the only way to increase memory bandwidth. The connection between the nodes can be very expensive, though. This content is nice to know but not absolutely critical to be able to understand the later sections.


Goldberg’s paper is still not widely known, although it should be a prerequisite for anybody daring to touch a keyboard for serious programming.

If the quality of the RAM module is high it might be possible to reduce one or the other latency without affecting the stability of the computer.

A demultiplexer for 30 address lines needs a whole lot of chip real estate, in addition to the complexity (size and time) of the demultiplexer.

Over the years the personal computers and smaller servers standardized on a chipset with two parts: the Northbridge and Southbridge.

So we will drop the differentiation from now on. Section 7 introduces tools which can help the programmer do a better job. This delay severely limits how fast DRAM can be.

This makes the state of the cell immediately available for reading on BL and its complement line. This is pretty obscure; usually, instead of intentionally using remote memory when you could have used local, just divide threads between NUMA nodes and have them use local memory.
