Compiler/TMS320C6678: Optimized compilation and performance with/without RTSC/OMP

Tool/software: TI C/C++ Compiler

Hi there,

I'm having performance issues with a function that runs fast without RTSC and slowly with RTSC.

In the first example, I link with the following C6678.cmd:

MEMORY
{
    SHRAM:           o = 0x0C000000 l = 0x00400000   /* 4MB Multicore shared Memory */
  
    CORE0_L2_SRAM:   o = 0x10800000 l = 0x00080000   /* 512kB CORE0 L2/SRAM */
    CORE0_L1P_SRAM:  o = 0x10E00000 l = 0x00008000   /* 32kB CORE0 L1P/SRAM */
    CORE0_L1D_SRAM:  o = 0x10F00000 l = 0x00008000   /* 32kB CORE0 L1D/SRAM */
    // goes on with CORE1-CORE7
}
SECTIONS
{
#ifdef CORE0
    .myfastsection > CORE0_L2_SRAM
    .text:optimized: 	load >> CORE0_L2_SRAM
    // goes on with other sections, all of them placed in L2SRAM
}

The corresponding functions are placed in .text:optimized using #pragma CODE_SECTION, and the arrays are placed in .myfastsection using #pragma DATA_SECTION and double-word aligned using #pragma DATA_ALIGN(., 2). The performance is very satisfying and, looking at the generated assembly code, the compiler seems to pipeline well.
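A minimal sketch of the placement described above, not the actual code from this post: the function `scale_add`, the arrays `in_buf` and `out_buf`, and the length `N` are hypothetical names chosen for illustration. It also assumes that `DATA_ALIGN` takes its constant in bytes, so 8 is used for double-word alignment. The TI pragmas are guarded so the file also builds with a host compiler.

```c
#include <stddef.h>

#define N 256  /* hypothetical array length */

/* TI-specific pragmas; they must precede the declarations they refer to. */
#ifdef __TI_COMPILER_VERSION__
#pragma DATA_SECTION(in_buf, ".myfastsection")
#pragma DATA_ALIGN(in_buf, 8)    /* assumption: constant is in bytes */
#pragma DATA_SECTION(out_buf, ".myfastsection")
#pragma DATA_ALIGN(out_buf, 8)
#pragma CODE_SECTION(scale_add, ".text:optimized")
#endif

float in_buf[N];
float out_buf[N];

/* Hypothetical kernel: dst[i] = gain * src[i] + offset for i in [0, n). */
void scale_add(const float *src, float *dst, size_t n, float gain, float offset)
{
    size_t i;
    for (i = 0; i < n; i++) {
        dst[i] = gain * src[i] + offset;
    }
}
```

With this layout, the linker map file should show `scale_add` in `.text:optimized` and both arrays in `.myfastsection`, which is a quick way to compare the two builds.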

In the second example, I'm adding RTSC because I plan to use OMP in another part of the code (unrelated to the function above). However, with the same compiler optimization options, the performance of the function above deteriorates greatly (half the speed, measured with both TSCL and omp_get_wtime). The generated assembly code for the function is identical. My first guess was that I'm doing something wrong with the memory sections. In my modified cfg file I added:

program.sectMap[".text:optimized"] = new Program.SectionSpec();
program.sectMap[".myfastsection"] = new Program.SectionSpec();
program.sectMap[".text:optimized"].loadSegment = "L2SRAM";
program.sectMap[".myfastsection"].loadSegment = "L2SRAM";

Shouldn't that be equivalent to the linker.cmd above? Is it also possible (and necessary) to partition the L2SRAM for the different cores as above? If I don't use any OMP in my code (even though I'm compiling with the RTSC components), the performance is fine. However, as soon as I use OMP in a different function, called after my initial function, the performance is halved. The initial function is called after omp_set_num_threads().

My second guess was that OMP introduces some overhead. However, I do not understand why since the initial function is totally unrelated to OMP. It would be helpful to get some additional insights here because in some cases it would be really useful to actually use OMP - but the performance degradation is not acceptable in our case.

NB: In the first case, code is loaded onto core0 only. In the second case (compilation with RTSC, no use of OMP in the code) and in the third case (compilation with RTSC, use of OMP in a different function), code is loaded onto all cores. The same optimizer flags are used in all cases. Arrays are double-word aligned and placed in L2SRAM in all cases. The functions are called 4 times in a row in all cases.
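One way to make the three cases directly comparable is to time each of the four back-to-back calls separately. The sketch below is only an illustration under stated assumptions: `read_ticks`, `time_runs`, and `dummy_work` are hypothetical helpers, not code from this post, and `dummy_work` stands in for the real function. On the C66x it reads TSCL via `c6x.h`; on a host compiler it falls back to `clock()`.

```c
#include <time.h>

#ifdef __TI_COMPILER_VERSION__
#include <c6x.h>   /* declares the TSCL control register */
#endif

#define RUNS 4     /* the function is called 4 times in a row */

static volatile unsigned int sink;  /* keeps the loop from being optimized away */

/* Hypothetical stand-in for the function under test. */
static void dummy_work(void)
{
    unsigned int i;
    for (i = 0; i < 1000u; i++) {
        sink += i;
    }
}

/* Return a free-running tick count: TSCL on the C66x, clock() elsewhere. */
static unsigned long read_ticks(void)
{
#ifdef __TI_COMPILER_VERSION__
    return (unsigned long)TSCL;
#else
    return (unsigned long)clock();
#endif
}

/* Time each of RUNS consecutive calls of fn and store the tick deltas. */
static void time_runs(void (*fn)(void), unsigned long ticks[RUNS])
{
    int i;
#ifdef __TI_COMPILER_VERSION__
    TSCL = 0;  /* a write to TSCL starts the counter */
#endif
    for (i = 0; i < RUNS; i++) {
        unsigned long start = read_ticks();
        fn();
        ticks[i] = read_ticks() - start;
    }
}
```

Calling `time_runs(dummy_work, ticks)` in each configuration and comparing `ticks[0]` through `ticks[3]` would show whether the slowdown affects every call or mainly the first one.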

Please let me know if you need additional information. Thank you very much in advance.

Best wishes,

Idris
