Jan Waclawek
12/15/11 00:35
Modified:
  12/15/11 00:47

Bratislava
Slovakia


 
#185093 - jump cache miss penalty
Responding to: Erik Malund's previous message
Erik Malund said:
so, if you have a jump to some different place it takes 4 cycles to get to the first byte.

if that is the first byte of the 4 loaded all is well.

if it is the 2nd or 3rd - figure it out by the below

if it is the last byte of the 4 there will be a 3 clock delay to fetch the next word to the cache.

Thanks, Erik.

This is roughly how I understand it, too, except that I would expect exactly this behaviour only if all of the instructions at the jump target were single-word, single-cycle (NOP-like). As most instructions (even the single-word ones) take longer than that, I would expect a slightly smaller penalty on average.

So, in effect, a jump to a non-cached position may incur a delay of 4-7 cycles (the 1-3 extra cycles would "happen" at a later instruction and only under the circumstances outlined above, but I think this is adequate as a worst-case figure for the table). A jump to a cached position may incur a delay of 1-3 cycles if the next word is not cached yet. Correct?
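
To make the arithmetic concrete, here is a minimal sketch of that model in Python. It is only my reading of the description above, not something taken from the datasheet: the function name `jump_penalty`, the 4-cycle word read and the assumption that the fetch of the next word overlaps with execution of the rest of the target word are all illustrative assumptions.

```python
WORD_BYTES = 4     # bytes per FLASH word, as described in this thread
FETCH_CYCLES = 4   # assumed cycles to read one FLASH word (depends on device settings)


def jump_penalty(target_addr, target_cached=False, cycles_in_rest_of_word=None):
    """Estimate the stall cycles caused by a jump under the simplified model.

    Assumes the word following the target is not cached yet.
    """
    offset = target_addr % WORD_BYTES
    bytes_left = WORD_BYTES - offset
    # NOP-like default: one cycle per remaining byte of the target word.
    if cycles_in_rest_of_word is None:
        cycles_in_rest_of_word = bytes_left
    # Assumption: the next word is fetched while the rest of the target
    # word executes, so only the part not hidden by execution stalls the core.
    next_word_stall = max(0, FETCH_CYCLES - cycles_in_rest_of_word)
    initial_fetch = 0 if target_cached else FETCH_CYCLES
    return initial_fetch + next_word_stall


if __name__ == "__main__":
    for offset in range(WORD_BYTES):
        addr = 0x100 + offset
        print(offset, jump_penalty(addr), jump_penalty(addr, target_cached=True))
```

With NOP-like code the uncached case gives 4 to 7 cycles depending on the byte offset of the target. Passing a larger `cycles_in_rest_of_word` shows how slower instructions hide part of the next fetch, which is why the average penalty should be somewhat lower.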


It would also be interesting to investigate the corner case of a multi-byte instruction that is the target of the jump and lies across the boundary between two non-cached four-byte words. Would the core stall until all the bytes of the instruction are fetched (i.e. 4 + 4 cycles), or would it partially execute the instruction while the remaining bytes in the next word are being fetched? If the core does stall, the "jump penalty" would be 1-8 cycles rather than 1-7.
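
A small sketch of this corner case in Python, under the pessimistic hypothesis that the core waits for every word the target instruction touches and that the two word reads do not overlap. The helper names and the 4-cycle word read are assumptions for illustration only; whether the real core behaves this way is exactly the open question.

```python
WORD_BYTES = 4     # bytes per FLASH word, as described in this thread
FETCH_CYCLES = 4   # assumed cycles to read one FLASH word


def straddles_word(addr, length):
    """True if an instruction of `length` bytes at `addr` crosses a word boundary."""
    return addr // WORD_BYTES != (addr + length - 1) // WORD_BYTES


def stall_until_complete(addr, length):
    """Cycles before the target instruction is fully fetched.

    Hypothesis only: both words are uncached and are read back to back,
    with no execution until the last byte has arrived.
    """
    first_word = addr // WORD_BYTES
    last_word = (addr + length - 1) // WORD_BYTES
    return (last_word - first_word + 1) * FETCH_CYCLES


if __name__ == "__main__":
    # A 3-byte instruction (e.g. LJMP) at byte offsets 0..3 of a word.
    for offset in range(WORD_BYTES):
        addr = 0x100 + offset
        print(offset, straddles_word(addr, 3), stall_until_complete(addr, 3))
```

Under this hypothesis a 3-byte instruction starting at offset 2 or 3 costs 8 cycles, which is where the upper bound of 8 would come from.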

I am also curious whether this behaviour can be suppressed if the 100MHz-er runs at a lower speed, and also how this mechanism works with the 50MHz-ers. [*]


Thanks again,

Jan

PS. [*] I found it. I had read only the "branch cache" chapter, and this is hidden elsewhere: the exact number of cycles for a FLASH word read is given by the FLRT bits in the FLSCL register.

