Behaviour of uninitialised RAM in an ASIC

Rocketmagnet2

This is a question for anyone with experience designing or with a deep knowledge of volatile memory in an ASIC. E.g. chip designers or silicon process engineers.

We are using the ET1200 EtherCAT ASIC (datasheet) in one of our projects. First some background:

The ET1200 chip:
The ET1200 is an industrial Ethernet-based networking chip that essentially gives a host PC and a microcontroller shared memory. The PC (over Ethernet) and microcontroller (over SPI) can both read and write into the small amount of dual-ported memory inside the ET1200.

Our project:
Out of 50 PCBs assembled, about 25 showed symptoms of a strange bug. On some boards the bug was not seen at first, but then appeared increasingly often, until eventually it was seen very often. Meanwhile, some boards never exhibited the bug.

What was the bug? Eventually, we tracked the bug down to a microcontroller reading uninitialised memory from the ET1200. I.e. after power up, and before the PC had written to any memory in the ET1200, the microcontroller would attempt to read that memory, but would obviously be reading uninitialised memory from the chip.

This firmware bug has now been fixed, but I'm still curious about the behaviour of the uninitialised memory inside the ET1200 ASIC.

Thoughts:
Presumably the memory was not being zeroed after the ASIC reset. Sometimes the memory would read mostly zeros (that doesn't cause a bug) and sometimes it would read lots of random numbers. Sometimes those random numbers would survive a power cycle. I don't know exactly what kind of RAM is in this chip.

Questions:
  • What factors might affect the probability that an uninitialised bit in the ET1200's memory will read as 0 or 1?
  • Is it to be expected that some chips will typically read all zeros, while others read random numbers?
  • What are some actual, physical mechanisms that could cause the behaviour of uninitialised RAM to change over time? I.e. start out reading mostly zeros, but eventually start reading lots of non-zeros. It's almost as if the memory is 'burning in'. (Warming or cooling the boards did not seem to help.)

What kind of answers am I interested in?
I would like some insight into how volatile memory cells are constructed in a typical ASIC, what mechanisms affect their reset value, and whether there is a mechanism which changes their reset value over time. I am of course aware that the behaviour of uninitialised memory is undefined.
 
Leakage currents can vary across chips due to variations in manufacturing, which can affect the initial state of uninitialized memory.
 
SRAM is not zeroed at power-up. SRAM macros generated by a memory compiler have no reset functionality. SRAM does have a bit of data retention across a power off-on cycle, but that is likely not relevant to your question.

The uninitialized value of an SRAM is undetermined. There is a whole line of research on using the state of the memory before initialization or reset as a fingerprint. It is unique to each chip, which is a good thing. This is called a Physically Unclonable Function (PUF).

The chance of getting a 1 or a 0 depends on process variation. The bitcell inside the SRAM has two cross-coupled inverters. They actively fight each other when you power up the SRAM. One of them will always be at least slightly stronger, because process variation means no two transistors are identical. If one side is stronger, you get a 1. Otherwise, you get a 0.
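
As a rough illustration of the mismatch argument, here is a toy model in C. It is not a circuit simulation. It assumes each cell has a fixed offset set by process variation, that every power-up adds a fresh noise term, and that the sign of the sum decides the bit. The names `toy_noise`, `power_up_bit` and `count_ones`, and the noise amplitude, are all made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Tiny linear congruential generator so the sketch is deterministic. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Hypothetical noise source: integer in [-amplitude, +amplitude]. */
static int32_t toy_noise(uint32_t *state, int32_t amplitude)
{
    assert(amplitude >= 0);
    uint32_t span = (uint32_t)(2 * amplitude + 1);
    return (int32_t)((lcg_next(state) >> 8) % span) - amplitude;
}

/* One power-up of one cell: fixed mismatch plus fresh noise, sign decides. */
static int power_up_bit(int32_t mismatch, uint32_t *state, int32_t amplitude)
{
    return (mismatch + toy_noise(state, amplitude)) > 0 ? 1 : 0;
}

/* Count how many of `cycles` simulated power-ups read as 1. */
static int count_ones(int32_t mismatch, int cycles, uint32_t seed,
                      int32_t amplitude)
{
    uint32_t state = seed;
    int ones = 0;
    for (int i = 0; i < cycles; i++)
        ones += power_up_bit(mismatch, &state, amplitude);
    return ones;
}
```

In this model a cell whose mismatch is larger than the noise amplitude reads the same value on every power-up, while a nearly balanced cell flips between cycles. That is the mix of stable and unstable bits the PUF literature relies on.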
 
Thank you for your reply.
Please could you expand a little bit on the physics behind this, or the process variation? Why might we see chips change their behaviour over time in this respect?

The reason I'm asking this question is that a couple of engineers here are still suspicious of the boards that read non-zero values in uninitialised memory. I would like to reassure them (with the backup of some solid information about the physics of the SRAM cell) that it's perfectly normal that chips might change over time in the values seen in uninitialised memory.
 
You always have a low level mismatch between any FETs.
You could consider the SRAM bit cell at power-up, to be
similar to the simple clocked CMOS comparator with both
inputs at null, deciding the value with the supply as the
clock. Natural offset and positive feedback.

Of course there can be gross defects which make this
worse, and there can be design influences in the leaf
cell which make a "0" or a "1" more likely: any
asymmetry in routing, parasitic C from supply / ground,
attached read / write net loads, leakage and so on.
Asymmetries in the access switches could also couple
those large nets in asymmetrically.

An SRAM "could" be designed to be self-zeroing, but
to do that at least one of the memory cell devices
would have to be larger than minimum, which lowers
array density (the prime selling attribute). Or some added
control and POR circuitry could blast the array flat,
a row at a time, at the cost of some peripheral elaborateness.

But "wait for it..." in uC software is probably the patch that
costs least (presuming you can touch that).
 
There are aging effects on SRAM, but those have timescales measured in years. Probably what you are witnessing is data retention in the memory, the memory controller, or any buffers along the way.

You can run the following experiment. Pick one of the "suspicious" boards. Turn it off for 10 s. Turn it on. Read a few addresses until you have read 1000 or so bits. Turn it off, wait 10 s, turn it on, and read the same 1000 bits. Compare with the initial reading. You should almost immediately recognize the PUF behavior: most of the bits should remain stable across readings. Do it again. Maybe you have 900 bits that remain stable. Repeat for 10 power cycles. Most likely the majority of bits will remain stable.

You can plot the readings. Every bit is a pixel. Every time you read a 1, make the pixel darker. Every time you read a zero, make it lighter. Something like 0-10 scale. The image you will produce in the end should have some gray, some true black, and some true white pixels.
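
The bookkeeping for this experiment can be sketched in a few lines of C. This is only an assumed layout: each power-up reading is taken to be a byte buffer already captured from the chip, and the function names `tally_reading` and `count_stable_bits` are invented for the example. The per-bit counter doubles as the 0-10 grey level for the plot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add one power-up reading to a per-bit tally.
 * reading: n_bytes bytes read from the uninitialised RAM
 * ones:    n_bytes * 8 counters, one per bit (bit 0 of byte 0 first) */
static void tally_reading(const uint8_t *reading, size_t n_bytes,
                          unsigned *ones)
{
    assert(reading != NULL && ones != NULL);
    for (size_t i = 0; i < n_bytes; i++)
        for (unsigned b = 0; b < 8; b++)
            ones[i * 8 + b] += (reading[i] >> b) & 1u;
}

/* Count bits that read the same value in every one of `cycles` readings. */
static size_t count_stable_bits(const unsigned *ones, size_t n_bits,
                                unsigned cycles)
{
    size_t stable = 0;
    for (size_t i = 0; i < n_bits; i++)
        if (ones[i] == 0 || ones[i] == cycles)
            stable++;
    return stable;
}
```

After 10 power cycles, a counter of 0 is a true white pixel, 10 is true black, and anything in between is a grey, unstable bit.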
 
If I understand your report correctly, there's no actual problem left. Original problem was missing initialization of ET1200 RAM.

The question about mechanisms behind initial RAM content is interesting but - as I fear - leads to nothing.
 
You are correct. The reason I'm asking my question is that some of the engineers here are still concerned that the non-zero values we are reading from the chips are a sign of some other problem. I am trying to collect evidence that this is not a problem at all, and it's quite reasonable to expect various kinds of surprising nonsense to be read from uninitialised RAM. The evidence I have collected so far seems to be convincing them. Thank you everyone.
 
Perhaps you are going to need to do some testing then. Try changing your firmware, even if just for a test, so that it blanks out the SRAM first by writing zeroes everywhere. Then run your regular firmware. If there is some issue with the board, CPU, or SRAMs, you will still see random errors popping up while the program runs. If no issue shows up, you nailed it.
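
A minimal sketch of such a zero-fill pass in C, assuming the firmware already has some routine for writing a block of bytes to the memory in question. The `ram_write_fn` hook, `zero_fill_ram`, and the in-memory `fake_ram` stand-in are all hypothetical. They are not part of any ET1200 driver, and the address range to clear has to come from the datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical driver hook: write `len` bytes at device address `addr`.
 * Returns 0 on success. A real project would wrap its own SPI routine. */
typedef int (*ram_write_fn)(uint16_t addr, const uint8_t *data, size_t len);

/* Write zeros over [start, start + length) in small chunks.
 * Returns 0 on success, or the first error code from the hook. */
static int zero_fill_ram(ram_write_fn write, uint16_t start, size_t length)
{
    static const uint8_t zeros[32] = {0};
    assert(write != NULL);
    assert((size_t)start + length <= 0x10000u); /* 16-bit address space */

    size_t done = 0;
    while (done < length) {
        size_t chunk = length - done;
        if (chunk > sizeof zeros)
            chunk = sizeof zeros;
        int rc = write((uint16_t)(start + done), zeros, chunk);
        if (rc != 0)
            return rc;
        done += chunk;
    }
    return 0;
}

/* Host-side stand-in for the device, only for trying the routine out. */
static uint8_t fake_ram[256];

static int fake_write(uint16_t addr, const uint8_t *data, size_t len)
{
    if ((size_t)addr + len > sizeof fake_ram)
        return -1; /* out of range */
    memcpy(&fake_ram[addr], data, len);
    return 0;
}
```

On the target, the call would run once at the start of the test firmware, before any code reads the shared memory.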
 
ASIC RAM contents after power up are random, but there is some correlation between different power cycles.
It is normal to have non-zero values in the RAM after power-up.
The software must not assume anything about the RAM contents. If it needs the RAM to be zero at startup, it must write the zeros in the software startup routine. If the software is written in C, uninitialized global and static variables are not a problem, since the C standard says that their initial value is zero (uninitialized local variables get no such guarantee). The C startup code copies the values of initialized (non-constant) variables to RAM (often called the data segment), and clears uninitialized variables (often called the bss segment). main() is called when the startup code is done.
The "problem" discussed in this thread should not be about uninitialized C variables, because then the bug is in the C startup code.
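
For readers who have not seen C startup code, the copy-and-clear step can be sketched roughly as below. In a real toolchain the section boundaries come from linker-script symbols whose names vary by vendor. Here they are plain parameters, and the function name `init_ram_sections` is invented, so that the routine can be tried on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch of what C startup code does before calling main():
 * copy initial values for .data from flash, then zero .bss. */
static void init_ram_sections(const unsigned char *data_load, /* in flash */
                              unsigned char *data_start,      /* .data    */
                              unsigned char *data_end,
                              unsigned char *bss_start,       /* .bss     */
                              unsigned char *bss_end)
{
    assert(data_end >= data_start && bss_end >= bss_start);
    memcpy(data_start, data_load, (size_t)(data_end - data_start));
    memset(bss_start, 0, (size_t)(bss_end - bss_start));
}
```

This only covers the microcontroller's own RAM. Memory inside an external chip such as the ET1200 is not touched by the startup code and still has to be initialised explicitly.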
 
