I am using DSP67x.lib on the OMAP-L137. I am comparing the library's single-precision dot product (multiply-add) against the plain C equivalent, and there is not much difference between the two. For example, a 1000-point multiply-add should take about 1000/2 * 10 ns = 5 us, but it is taking about 120 us on the OMAP-L137's 300 MHz C674x DSP. It is as if the instruction cache is not enabled. When I execute the GEL command to enable the instruction cache, I get:
Enable_Instruction_Cache() cannot be evaluated.
identifier not found: CPSR
at CPSR=0x400000d3 [dskda830_dsp.gel:637]
at Enable_Instruction_Cache()
Any ideas?
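For reference, the comparison described above can be sketched roughly as follows. This is only a minimal illustration, not the actual benchmark code: `dotp_ref` and `expected_us` are hypothetical names, and the estimate simply reproduces the arithmetic from the post (two multiply-adds per 10 ns step), which is itself an assumption about the achievable rate.

```c
#include <stddef.h>

/* Plain C single-precision dot product: the kind of baseline the
 * library routine is being compared against (hypothetical name). */
float dotp_ref(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    size_t i;

    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];   /* one multiply-add per element */
    }
    return sum;
}

/* Rough expected run time in microseconds, assuming macs_per_step
 * multiply-adds complete every ns_per_step nanoseconds.
 * Both parameters are assumptions taken from the estimate in the post. */
double expected_us(size_t n, unsigned macs_per_step, double ns_per_step)
{
    return ((double)n / macs_per_step) * ns_per_step / 1000.0;
}
```

With the numbers from the post, `expected_us(1000, 2, 10.0)` gives the 5 us estimate, so the measured 120 us is about 24 times slower than expected.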