Disabling Level 1 Data Cache on the TI TMS320C6713


Randy Yates
07-15-2008, 08:40 PM
Does anyone know if this is possible? I know you can resize the L2
cache to 0 so it is effectively disabled, but I don't see a way to
disable the L1D cache.

I have an algorithm that, by its nature, performs mostly random (i.e.,
non-sequential) accesses to large memory arrays, so I'm thinking that
the requirement to fetch an entire cache line every time a cache miss
occurs, when the processor really only needs one 32-bit word, is degrading
performance.

Any thoughts or suggestions would be very welcome.
--
% Randy Yates % "With time with what you've learned,
%% Fuquay-Varina, NC % they'll kiss the ground you walk
%%% 919-577-9882 % upon."
%%%% <[email protected]> % '21st Century Man', *Time*, ELO
http://www.digitalsignallabs.com

Piergiorgio Sartor
07-15-2008, 09:03 PM
Randy Yates wrote:

> I have an algorithm that, by its nature, performs mostly random (i.e.,
> non-sequential) accesses to large memory arrays, so I'm thinking that
> the requirement to fetch an entire cache line every time a cache miss
> occurs when the processor really only needs one 32-bit word is degrading
> performance.
>
> Any thoughts or suggestions would be very welcome.

It could be that the memory controller burst-reads a whole
cache line anyway, regardless of anything else, so disabling
the L1 would not change the performance.

This kind of algorithm is also a headache.

It would be good to confirm that the memory bandwidth
is under pressure due to this access-one-byte-get-many
constraint.

bye,

--

piergiorgio
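
One rough way to check this is to time the same amount of work with sequential and with scattered indices. A minimal sketch in C, assuming a host-style clock() timer (on the DSP a hardware timer or the profiler would take its place); timed_sum and scattered_order are hypothetical helper names, not part of any TI library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Sum data[] in the order given by idx[]. Returns the sum and stores the
   elapsed clock() ticks in *ticks. */
uint32_t timed_sum(const uint32_t *data, const size_t *idx, size_t n,
                   clock_t *ticks)
{
    assert(data != NULL && idx != NULL && ticks != NULL);
    clock_t start = clock();
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[idx[i]];
    *ticks = clock() - start;
    return sum;
}

/* Fill idx[] with (i * stride) mod n. This visits every element exactly
   once only when stride and n share no common factor. */
void scattered_order(size_t *idx, size_t n, size_t stride)
{
    assert(idx != NULL && n > 0);
    for (size_t i = 0; i < n; i++)
        idx[i] = (i * stride) % n;
}
```

Calling scattered_order with stride 1 gives the sequential order; a large odd stride on a power-of-two array size gives a scattered one. Comparing the two tick counts on arrays the size of the real data gives a first hint of how much the access pattern itself costs.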

Randy Yates
07-16-2008, 12:31 AM
Piergiorgio Sartor <[email protected]> writes:

> Randy Yates wrote:
>
>> I have an algorithm that, by its nature, performs mostly random (i.e.,
>> non-sequential) accesses to large memory arrays, so I'm thinking that
>> the requirement to fetch an entire cache line every time a cache miss
>> occurs when the processor really only needs one 32-bit word is degrading
>> performance.
>>
>> Any thoughts or suggestions would be very welcome.
>
> It could be that the memory controller burst-reads a whole
> cache line anyway, regardless of anything else, so disabling
> the L1 would not change the performance.

I can see how that would be possible. I have no knowledge
of how this controller works (or would work).

> This kind of algorithm is also a headache.

Tell me about it!

> It would be good to confirm that the memory bandwidth
> is under pressure due to this access-one-byte-get-many
> constraint.

Yes. I suppose I could get a logic analyzer on the control
and address lines, but that takes work!
--
% Randy Yates % "Maybe one day I'll feel her cold embrace,
%% Fuquay-Varina, NC % and kiss her interface,
%%% 919-577-9882 % til then, I'll leave her alone."
%%%% <[email protected]> % 'Yours Truly, 2095', *Time*, ELO
http://www.digitalsignallabs.com

Piergiorgio Sartor
07-16-2008, 07:45 PM
Randy Yates wrote:

> Yes. I suppose I could get a logic analyzer on the control
> and address lines, but that takes work!

Well, I had the same problem with a different CPU.

Basically I had to read small blocks of data from
around the memory.

Implementing the basic approach, i.e. reading exactly
what was needed, ran 500% beyond real-time,
so it was clearly not feasible.

The solution, though I guess it is specific to that
CPU, was to make the best use of the DMA and to abuse
the internal memory of the device.

This worked so well that the complete processing now
requires a bit more than 50% of the CPU for real-time.

The point is that this really took time, since I had
to test several different DMA/memory usage combinations.
And it took time to decide whether things did not work due
to hardware limitations or software bugs...

My advice would be to try to estimate whether the device can,
under standard conditions, perform as you want.
If not, then depending on what is possible, try all the
possible (usable) memory access patterns and see if one
meets the performance target.

I think the C6713 has a DMA engine and some internal
SRAM, so maybe a clever usage of those could help.

bye,

--

piergiorgio
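
The gather-into-internal-memory idea described above can be sketched in C. This is only an illustration under stated assumptions: a plain copy loop (fetch_block) stands in for the DMA transfer, the staging buffer stands in for internal SRAM, and BLOCK_WORDS and all function names are hypothetical. It does not use any real DMA API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_WORDS 8u   /* hypothetical size of each small block, in words */

/* Stand-in for a DMA transfer: copy one block from "external" memory into
   the "internal" staging buffer. On hardware this would be a DMA request. */
void fetch_block(uint32_t *internal, const uint32_t *external, size_t offset)
{
    for (size_t i = 0; i < BLOCK_WORDS; i++)
        internal[i] = external[offset + i];
}

/* Gather n scattered blocks one at a time and process each from the
   staging buffer. The processing here is just a sum. */
uint32_t gather_and_sum(const uint32_t *external, size_t ext_words,
                        const size_t *offsets, size_t n, uint32_t *staging)
{
    uint32_t sum = 0;
    for (size_t b = 0; b < n; b++) {
        assert(offsets[b] + BLOCK_WORDS <= ext_words);  /* stay in bounds */
        fetch_block(staging, external, offsets[b]);
        for (size_t i = 0; i < BLOCK_WORDS; i++)
            sum += staging[i];   /* processing reads only the staging buffer */
    }
    return sum;
}
```

With a real DMA engine the gain would likely come from fetching the next block into a second staging buffer while the CPU processes the current one; this sketch leaves that overlap out to stay short.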

rajesh
07-17-2008, 05:17 AM
For ADI's Blackfin processors, the L1D cache (there is no L2, by the way)
can be disabled simply by going to 'project options'
or by suitable modifications to the linker file and startup code. I don't
see why there shouldn't be a way for the TI processors.

Randy Yates wrote:
> Does anyone know if this is possible? I know you can resize the L2
> cache to 0 so it is effectively disabled, but I don't see a way to
> disable the L1D cache.
>
> I have an algorithm that, by its nature, performs mostly random (i.e.,
> non-sequential) accesses to large memory arrays, so I'm thinking that
> the requirement to fetch an entire cache line every time a cache miss
> occurs when the processor really only needs one 32-bit word is degrading
> performance.
>
> Any thoughts or suggestions would be very welcome.

Vladimir Vassilevsky
07-17-2008, 03:15 PM
rajesh wrote:

> For ADI's Blackfin processors, the L1D cache (there is no L2, by the way)
> can be disabled simply by going to 'project options'
> or by suitable modifications to the linker file and startup code. I don't
> see why there shouldn't be a way for the TI processors.

My dear BlackFin talker,

"The electricity is generated by the power outlet on the wall. All you
have to do to make the electricity is turn the switch on"

For ADI BlackFin processors, the cache is controlled by the IMEM/DMEM and
CPLB register settings. There are a number of different options there.

The cache configuration details are very specific to the particular CPU,
and there is absolutely no connection between BlackFin and 6713 with
respect to it.

I completely understand the problem that Randy is talking about, but I
can't give meaningful advice since I am not familiar with the fine
details of the configuration of the 6713.

I can only guess that the penalty for filling the cache line by
a burst read from SDRAM is small compared to the time required for the
SDRAM access, especially if there is page/bank switching; so there is
really not too much of a loss if the access to SDRAM is random. However, it
all depends on the particular details of the application and cache/RAM
operation.



Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Randy Yates
07-17-2008, 04:01 PM
Vladimir Vassilevsky <[email protected]> writes:
> [...]
> I can only guess that the penalty for filling the cache line by
> a burst read from SDRAM is small compared to the time required for
> the SDRAM access, especially if there is page/bank switching; so
> there is really not too much of a loss if the access to SDRAM is
> random. However, it all depends on the particular details of the
> application and cache/RAM operation.

Good point, Vlad. It probably takes on the order of a dozen cycles to do
refresh cycles, page addressing, etc., so an extra 3 to read useless data
isn't going to make a lot of difference.
--
% Randy Yates % "I met someone who looks alot like you,
%% Fuquay-Varina, NC % she does the things you do,
%%% 919-577-9882 % but she is an IBM."
%%%% <[email protected]> % 'Yours Truly, 2095', *Time*, ELO
http://www.digitalsignallabs.com
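
Putting rough numbers on that estimate: with about 12 cycles of fixed overhead per random access and 3 extra cycles for the unused words, and assuming one cycle per word transferred (an assumption for illustration, not a measured figure), the line fill costs 16 cycles instead of 13, i.e. about 23% more. A small C sketch of the arithmetic, with hypothetical function names:

```c
#include <assert.h>

/* Cycles for one random access: fixed SDRAM overhead plus one cycle per
   word transferred. Both inputs are rough guesses, not measurements. */
unsigned access_cycles(unsigned overhead_cycles, unsigned words_transferred)
{
    assert(words_transferred >= 1);
    return overhead_cycles + words_transferred;
}

/* Extra cost of fetching extra_words unused words, relative to reading
   a single word, in percent (rounded down). */
unsigned line_fill_penalty_percent(unsigned overhead_cycles,
                                   unsigned extra_words)
{
    unsigned single = access_cycles(overhead_cycles, 1);
    unsigned line   = access_cycles(overhead_cycles, 1 + extra_words);
    return (100u * (line - single)) / single;
}
```

The penalty grows with the number of unused words per line and shrinks as the fixed overhead grows, so the real figure depends on the actual line size and SDRAM timing.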

Brad Griffis
07-19-2008, 04:28 AM
Randy Yates wrote:
> Does anyone know if this is possible? I know you can resize the L2
> cache to 0 so it is effectively disabled, but I don't see a way to
> disable the L1D cache.
>
> I have an algorithm that, by its nature, performs mostly random (i.e.,
> non-sequential) accesses to large memory arrays, so I'm thinking that
> the requirement to fetch an entire cache line every time a cache miss
> occurs when the processor really only needs one 32-bit word is degrading
> performance.
>
> Any thoughts or suggestions would be very welcome.

Randy,

Are your arrays in external memory? If so, you should make sure the
corresponding MAR bit is cleared. That will make that range of external
memory non-cacheable such that it doesn't get cached anywhere (i.e. not
in L1D or L2). There is no way on the 6713 to disable the L1D.

Brad
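
A minimal sketch of what clearing the MAR bit might look like in C. The register layout used here (MAR registers at 0x01848000 + 4 * (address >> 24), one per 16 MB block of external memory, cacheability flag in bit 0) is an assumption that should be checked against TI's cache documentation for the 6713, and the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout, to be verified against the TI documentation:
   one MAR register per 16 MB of external address space, located at
   MAR_BASE + 4 * (address >> 24), with the cache-enable flag in bit 0. */
#define MAR_BASE   0x01848000u
#define MAR_CE_BIT 0x00000001u

/* Address of the MAR register covering the 16 MB block with ext_addr. */
uint32_t mar_reg_addr(uint32_t ext_addr)
{
    assert(ext_addr >= 0x80000000u);   /* external (EMIF) space only */
    return MAR_BASE + ((ext_addr >> 24) << 2);
}

/* Register value with the cache-enable flag cleared, other bits kept. */
uint32_t mar_clear_ce(uint32_t mar_value)
{
    return mar_value & ~MAR_CE_BIT;
}

/* Target only: read-modify-write the register. Do not call on a host. */
void make_noncacheable(uint32_t ext_addr)
{
    volatile uint32_t *mar =
        (volatile uint32_t *)(uintptr_t)mar_reg_addr(ext_addr);
    *mar = mar_clear_ce(*mar);
}
```

Under these assumptions, an array at 0x80000000 would be covered by the register at 0x01848200. Lines from that range that are already in the cache may need to be written back or invalidated after the change; the TI documentation is the place to confirm the exact procedure.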

Randy Yates
07-19-2008, 12:56 PM
Brad Griffis <[email protected]> writes:

> Randy Yates wrote:
>> Does anyone know if this is possible? I know you can resize the L2
>> cache to 0 so it is effectively disabled, but I don't see a way to
>> disable the L1D cache.
>>
>> I have an algorithm that, by its nature, performs mostly random (i.e.,
>> non-sequential) accesses to large memory arrays, so I'm thinking that
>> the requirement to fetch an entire cache line every time a cache miss
>> occurs when the processor really only needs one 32-bit word is degrading
>> performance.
>>
>> Any thoughts or suggestions would be very welcome.
>
> Randy,
>
> Are your arrays in external memory? If so, you should make sure the
> corresponding MAR bit is cleared. That will make that range of
> external memory non-cacheable such that it doesn't get cached anywhere
> (i.e. not in L1D or L2). There is no way on the 6713 to disable the L1D.

Thanks, Brad. I'll give that a try. As Vlad pointed out, though, I don't
hold out hope for a lot of improvement.

By the way, I meant "disabling level 1 data cache for external memory
access" and not _totally_ disabling it.
--
% Randy Yates % "Maybe one day I'll feel her cold embrace,
%% Fuquay-Varina, NC % and kiss her interface,
%%% 919-577-9882 % til then, I'll leave her alone."
%%%% <[email protected]> % 'Yours Truly, 2095', *Time*, ELO
http://www.digitalsignallabs.com