## Implementation note

The whole algorithm is implemented for the Synergistic Processing Unit (SPU). The part for the POWER Processing Unit (PPU) contains a single public function, which starts the computation on the SPU. Its only parameter is a NeuralGas_InputData structure that specifies the algorithm parameters. It returns a speid_t that identifies the program running on the SPU.

## Main idea of the implementation

We implemented a set of functions that make it easy for the SPU to process arrays located in main memory. These functions move sections of main memory into the local memory of the SPU using DMA transfers. A user-defined function is then called for each section of the local copy of the array being processed. The user-defined function receives the number of elements that are currently held in local memory. It may also change the content of the array elements, in which case the cached data must be moved back to main memory (again using DMA). The headers of these supporting functions are as follows (see the code documentation for detailed descriptions):

`void ReadInputs(InternalData_SPU* internal, void (handler(InternalData_SPU*, int)));`

`void ReadAndWriteNeurons(InternalData_SPU* internal, int overlap, void (handler(InternalData_SPU*, int)));`
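The chunked processing pattern described above can be illustrated with a minimal, portable sketch. It is not the project's code: `LocalCtx`, `CHUNK_ELEMS`, `process_in_chunks` and `double_elements` are hypothetical names, `memcpy` stands in for the DMA transfers, and the `overlap` parameter of `ReadAndWriteNeurons` is not modelled. The `write_back` flag only mimics the distinction between a read-only pass and a pass that copies modified data back to main memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_ELEMS 64  /* hypothetical capacity of the local buffer, in elements */

/* Hypothetical stand-in for InternalData_SPU: holds only the local buffer. */
typedef struct {
    float local[CHUNK_ELEMS];  /* plays the role of SPU local memory */
} LocalCtx;

/* Handler shape modelled on the prototypes above: context plus element count. */
typedef void (*ChunkHandler)(LocalCtx *ctx, int count);

/* Walk `total` floats of `main_mem` one chunk at a time.
 * memcpy simulates the DMA "get" and, if write_back is nonzero, the DMA "put". */
void process_in_chunks(float *main_mem, size_t total, int write_back,
                       ChunkHandler handler)
{
    LocalCtx ctx;
    assert(handler != NULL);
    for (size_t off = 0; off < total; off += CHUNK_ELEMS) {
        size_t n = (total - off < CHUNK_ELEMS) ? (total - off) : CHUNK_ELEMS;
        memcpy(ctx.local, main_mem + off, n * sizeof(float));      /* "DMA get" */
        handler(&ctx, (int)n);          /* handler sees only the local copy */
        if (write_back)
            memcpy(main_mem + off, ctx.local, n * sizeof(float));  /* "DMA put" */
    }
}

/* Example handler: doubles every element currently held in the local buffer. */
void double_elements(LocalCtx *ctx, int count)
{
    for (int i = 0; i < count; ++i)
        ctx->local[i] *= 2.0f;
}
```

Calling `process_in_chunks(data, n, 1, double_elements)` doubles every element of `data`, while passing `0` for `write_back` leaves main memory untouched. The handler never needs to know where its chunk sits in main memory, which is the point of the pattern.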

The biggest difference between the PC (PPU) and SPU implementations concerns the for-loops at the core of the Neural Gas algorithm. In the PPU implementation they appear directly in the algorithm code. In the SPU implementation they are replaced by calls to the functions described above, and the loops themselves live inside those functions. The second major difference is that the SPU implementation uses SIMD instructions for all vector-based operations.
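As an illustration of what a SIMD version of a vector-based operation can look like, the sketch below computes a squared Euclidean distance four floats at a time. It is an assumption-laden example rather than the project's code: it uses the GCC/Clang `vector_size` extension in place of SPU intrinsics so that it compiles on an ordinary PC, the name `squared_distance` is hypothetical, and the choice of a distance computation (for instance between an input and a neuron's weight vector) is only a plausible example of such an operation.

```c
#include <assert.h>
#include <string.h>

/* Four packed floats; a GCC/Clang vector extension standing in for a
 * 128-bit SIMD register. */
typedef float v4f __attribute__((vector_size(16)));

/* Squared Euclidean distance between x and w.
 * Assumes dim is a multiple of 4 so that no scalar tail loop is needed. */
float squared_distance(const float *x, const float *w, int dim)
{
    v4f acc = {0.0f, 0.0f, 0.0f, 0.0f};
    assert(dim % 4 == 0);
    for (int i = 0; i < dim; i += 4) {
        v4f vx, vw;
        memcpy(&vx, x + i, sizeof vx);  /* load 4 floats, alignment-safe */
        memcpy(&vw, w + i, sizeof vw);
        v4f d = vx - vw;                /* four subtractions in one operation */
        acc += d * d;                   /* four multiplies and adds in one step */
    }
    /* Horizontal sum of the four partial results. */
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

The loop body handles four components per iteration, so a vector of dimension `dim` takes `dim / 4` iterations instead of `dim`.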