Fast ARMv6M memcpy - Part 5 - Test function
This is the 5th part of "Fast ARMv6M memcpy"; see the other parts you may have missed:
- Part 1 - ASM code
- Part 2 - Options & Results
- Part 3 - Replace SDK memcpy
- Part 4 - Automated case generation
- Part 5 - Test function
- Part 6 - Benchmark function
Full code is available on GitHub (test project, only fast ARMv6M memcpy).
Memcpy test function
Continuing with the list of steps to automate the tests:
- Automated generation of files needed to test all cases.
- Create a test function to ensure that all memcpy cases work correctly.
- Create a benchmark function to evaluate the speed of each implementation.
- Create a good compare system.
To cover the second point (the test function) we need:
- A simple reference and known-to-be-good implementation of memcpy.
- A function to test each function against the reference implementation of memcpy.
Reference memcpy
This is the simplest implementation I can think of for memcpy, to be used as the reference:
static void * memcpy_known_good(void *dst, const void *src, size_t length)
{
    uint8_t *d = dst;
    const uint8_t *s = src;
    // Plain byte-by-byte copy; typed pointers avoid arithmetic on void *
    while (length--)
        *d++ = *s++;
    return dst;
}
Test function description
The implementation of the test is "test_memcpy":
void test_memcpy(MemcpyFunction memcpyGood, MemcpyFunction memcpyTest);
It checks the function under test "memcpyTest" against "memcpyGood". If "memcpyGood" is NULL, the default "memcpy_known_good" is used as reference.
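The "MemcpyFunction" type is not shown in this part. A minimal sketch of how the NULL fallback could work, assuming the type is a plain function pointer with the same signature as memcpy; "reference_copy", "select_reference" and "check_small_copy" are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed definition: pointer to any function with memcpy's signature. */
typedef void *(*MemcpyFunction)(void *dst, const void *src, size_t length);

/* Byte-by-byte copy standing in for memcpy_known_good. */
static void *reference_copy(void *dst, const void *src, size_t length)
{
    uint8_t *d = dst;
    const uint8_t *s = src;
    while (length--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical helper: use the caller's reference, or the default when NULL. */
static MemcpyFunction select_reference(MemcpyFunction memcpyGood)
{
    return (memcpyGood != NULL) ? memcpyGood : &reference_copy;
}

/* Hypothetical smoke check: the function must return dst and copy the bytes. */
static void check_small_copy(MemcpyFunction fn)
{
    const uint8_t src[4] = {1, 2, 3, 4};
    uint8_t dst[4] = {0};
    assert(fn(dst, src, sizeof src) == dst);
    assert(memcmp(dst, src, sizeof src) == 0);
}
```

Calling check_small_copy(select_reference(NULL)) exercises the default reference; passing any other function with the same signature exercises that function instead.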
If any test fails, "test_memcpy" triggers an "assertion failed" error. If all tests succeed, it prints something like this:
Testing implementation: memcpy_armv6m_test_miwo_0_mssp_0_opxip_0_opsz_0_msup_0
Memory space: RAM
16512 tests Ok
Memory space: FLASH
33024 tests Ok
Memory space: FLASH - NO CACHE
49536 tests Ok
Memory space: ROM
66048 tests Ok
One "tests Ok" line is printed for each memory space tested, showing the cumulative number of tests done so far.
In a nutshell, the function uses 5 nested loops to perform the tests with all combinations of:
- Memory space: RAM, FLASH, FLASH_NO_CACHE, ROM.
- Size: 0 up to TEST_MEMCPY_MAX_TEST_SIZE (128).
- Source & destination offset: 0 to TEST_MEMCPY_MAX_TEST_OFFSET (7).
- Unused value: 0xFF and 0x00.
The "unused value" is the byte used to fill the destination buffers before each copy; it makes it possible to check whether memcpy writes outside the memory range given by its parameters.
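The counts in the sample output follow from these loops: 129 sizes x 8 source offsets x 8 destination offsets x 2 unused values = 16512 tests per memory space. A small sketch that reproduces this figure, using the limits quoted above ("count_tests_per_memory_space" is a hypothetical helper, not part of the project):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TEST_MEMCPY_MAX_TEST_SIZE   128
#define TEST_MEMCPY_MAX_TEST_OFFSET 7

/* Count the combinations visited by the four inner loops for one memory space. */
static uint32_t count_tests_per_memory_space(void)
{
    uint32_t count = 0;
    for (size_t size = 0; size <= TEST_MEMCPY_MAX_TEST_SIZE; size++)
        for (size_t srcOffset = 0; srcOffset <= TEST_MEMCPY_MAX_TEST_OFFSET; srcOffset++)
            for (size_t dstOffset = 0; dstOffset <= TEST_MEMCPY_MAX_TEST_OFFSET; dstOffset++)
                /* uint8_t wraps from 0xFF to 0x00, so this visits 0xFF, then 0x00 */
                for (uint8_t unusedValue = 0xFF; unusedValue != 1; unusedValue++)
                    count++;
    assert(count > 0);
    return count;
}
```

With four memory spaces, the running total reaches 4 x 16512 = 66048, which matches the last line of the sample output.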
Test function implementation
The items and order of the memory spaces to test can be customized in one array:
static const MemorySpaceSourceInfo TEST_MEM_SPACES[] = {
    MEMORY_SPACE_RAM,
    MEMORY_SPACE_FLASH,
    MEMORY_SPACE_FLASH_NO_CACHE,
    MEMORY_SPACE_ROM};
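The definition of "MemorySpaceSourceInfo" is not shown here. The sketch below is one possible shape, assuming only the two fields that the test function actually uses ("name" and "memPointer"). All "demo"/"DEMO" names are hypothetical, and ordinary host buffers stand in for the real memory regions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: only the fields read by test_memcpy are sketched. */
typedef struct
{
    const char *name;          /* Printed in the "Memory space: ..." line */
    const uint8_t *memPointer; /* Start of the source memory */
} MemorySpaceSourceInfo;

#define DEMO_BUFFER_SIZE 16
static uint8_t demoRamSource[DEMO_BUFFER_SIZE];                     /* Stand-in for the RAM source */
static const uint8_t demoConstSource[DEMO_BUFFER_SIZE] = {1, 2, 3}; /* Stand-in for FLASH data */

/* Hypothetical initializer macros, one per memory space. */
#define DEMO_SPACE_RAM   {"RAM", demoRamSource}
#define DEMO_SPACE_FLASH {"FLASH", demoConstSource}

static const MemorySpaceSourceInfo DEMO_MEM_SPACES[] = {
    DEMO_SPACE_RAM,
    DEMO_SPACE_FLASH};

/* Element count derived from the array, so reordering or removing
   entries needs no other change. */
#define DEMO_MEM_SPACES_COUNT (sizeof(DEMO_MEM_SPACES) / sizeof(DEMO_MEM_SPACES[0]))

/* Every configured space must provide a valid source pointer. */
static void demo_check_spaces(void)
{
    for (size_t i = 0; i < DEMO_MEM_SPACES_COUNT; i++)
        assert(DEMO_MEM_SPACES[i].memPointer != NULL);
}
```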
"test_memcpy" loops:
void test_memcpy(MemcpyFunction memcpyGood, MemcpyFunction memcpyTest)
{
    uint32_t testNumber = 0;
    if (memcpyGood == NULL) // Set default good implementation if none provided
        memcpyGood = &memcpy_known_good;
    for (int mem = 0; mem < TEST_MEM_SPACES_COUNT; mem++)
    {
        MemorySpaceSourceInfo ms = TEST_MEM_SPACES[mem];
        printf("Memory space: %s\n", ms.name);
        // Iterate all sizes, offsets & unused values
        for (size_t size = 0; size <= TEST_MEMCPY_MAX_TEST_SIZE; size++)
        {
            for (size_t srcOffset = 0; srcOffset <= TEST_MEMCPY_MAX_TEST_OFFSET; srcOffset++)
            {
                for (size_t dstOffset = 0; dstOffset <= TEST_MEMCPY_MAX_TEST_OFFSET; dstOffset++)
                {
                    // uint8_t wraps from 0xFF to 0x00, so only 0xFF and 0x00 are tested
                    for (uint8_t unusedValue = 0xFF; unusedValue != 1; unusedValue++)
                    {
                        .....
For each test, the destination memory is filled with the "unused value":
memset(ramDstBufferGood, unusedValue, TEST_MEMCPY_BUFFER_SIZE);
memset(ramDstBufferTest, unusedValue, TEST_MEMCPY_BUFFER_SIZE);
Then, the code gets the pointer to the source memory, and, if it points to RAM, it fills it with test data:
const uint8_t * srcMemory = ms.memPointer; // Get pointer to source memory
if (srcMemory == ramSrcMemoryBuffer)
{
    // Fill source buffer with the complement of unused value
    memset(ramSrcMemoryBuffer, ~unusedValue, TEST_MEMCPY_BUFFER_SIZE);
    // Fill memory with some data
    uint8_t * src = &ramSrcMemoryBuffer[srcOffset];
    for (size_t i = 0; i < size; i++)
        src[i] = rnd8();
}
- For RAM, the whole buffer is first filled with the complement of "unusedValue"; the range used for the copy is then filled by a "random" function (in practice just increasing numbers) that generates values that are neither "unusedValue" nor "~unusedValue".
- For ROM, the pointer simply points to ROM address 4 (non-zero, and within the 16k of ROM space).
- For FLASH, the pointer points to a constant buffer filled with increasing numbers at compile time with a macro.
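Neither "rnd8" nor the compile-time macro is shown in this part, so the following is only a sketch of how the two source fills described above could be done. "demo_rnd8" and the "DEMO_FILL_*" macros are hypothetical; the only properties taken from the text are that the RAM values increase and never equal 0x00 or 0xFF, and that the constant buffer holds increasing numbers generated at compile time:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rnd8(): increasing values that cycle through
   0x01..0xFE, so they never equal 0x00 or 0xFF (the two fill values). */
static uint8_t demoNextValue = 1;

static uint8_t demo_rnd8(void)
{
    uint8_t value = demoNextValue;
    demoNextValue = (uint8_t)((demoNextValue >= 0xFE) ? 1 : demoNextValue + 1);
    assert(value != 0x00 && value != 0xFF);
    return value;
}

/* Hypothetical macros: each level expands to four copies of the level
   below, so the table is produced entirely by the preprocessor. */
#define DEMO_FILL_1(n)  (uint8_t)(n)
#define DEMO_FILL_4(n)  DEMO_FILL_1(n), DEMO_FILL_1((n) + 1), DEMO_FILL_1((n) + 2), DEMO_FILL_1((n) + 3)
#define DEMO_FILL_16(n) DEMO_FILL_4(n), DEMO_FILL_4((n) + 4), DEMO_FILL_4((n) + 8), DEMO_FILL_4((n) + 12)
#define DEMO_FILL_64(n) DEMO_FILL_16(n), DEMO_FILL_16((n) + 16), DEMO_FILL_16((n) + 32), DEMO_FILL_16((n) + 48)

/* Constant table holding 0, 1, 2, ... 255; on a microcontroller build,
   const data like this is typically placed in FLASH (assumption). */
static const uint8_t demoFlashSource[256] = {
    DEMO_FILL_64(0), DEMO_FILL_64(64), DEMO_FILL_64(128), DEMO_FILL_64(192)};
```

Keeping the source values different from both fill values means that a copied byte can never be mistaken for an untouched one.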
Then the tests:
assert(srcMemory != NULL); // Source must be valid
// Do copy with "good" memcpy & check results against source
const uint8_t * src = &srcMemory[srcOffset];
uint8_t * dst = &ramDstBufferGood[dstOffset];
uint8_t * dstRetGood = (uint8_t *)memcpyGood(dst, src, size);
assert(dstRetGood == dst);
assert(memcmp(dstRetGood, src, size) == 0);
// Do copy with tested memcpy & check results against source
dst = &ramDstBufferTest[dstOffset];
uint8_t * dstRetTest = (uint8_t *)memcpyTest(dst, src, size);
assert(dstRetTest == dst);
assert(memcmp(dstRetTest, src, size) == 0);
// Compare destination buffers using the whole size of buffer,
// this should detect out of buffer writes.
assert(memcmp(ramDstBufferGood, ramDstBufferTest, TEST_MEMCPY_BUFFER_SIZE) == 0);
Assertions:
- Source memory is not NULL.
- The return value of the reference implementation is dst.
- The destination memory copied by the reference implementation matches the source (comparing only the copied range).
- The same two checks for the tested implementation.
- Both destination buffers (tested and reference) are identical over the full buffer size (2 x TEST_MEMCPY_MAX_TEST_SIZE), to detect writes outside the copied area.
It could be more complete, but it worked just fine and detected several bugs in some test cases.
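To illustrate how the full-buffer comparison catches stray writes, here is a small host-side sketch. The "MemcpyFunction" typedef is assumed, and "copy_bytes", "memcpy_overrun" and "guard_bytes_intact" are hypothetical names. It also hints at one possible benefit of testing two fill values: a stray byte that happens to equal the fill value is invisible with that value, but shows up with the other one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BUFFER_SIZE 32

/* Assumed definition: pointer to any function with memcpy's signature. */
typedef void *(*MemcpyFunction)(void *dst, const void *src, size_t length);

/* Correct byte-by-byte copy, used as the reference. */
static void *copy_bytes(void *dst, const void *src, size_t length)
{
    uint8_t *d = dst;
    const uint8_t *s = src;
    while (length--)
        *d++ = *s++;
    return dst;
}

/* Deliberately broken copy: writes one extra 0xFF byte past the end. */
static void *memcpy_overrun(void *dst, const void *src, size_t length)
{
    copy_bytes(dst, src, length);
    ((uint8_t *)dst)[length] = 0xFF; /* out-of-range write */
    return dst;
}

/* Returns true when memcpyTest leaves every byte outside the copied
   range identical to what the reference copy leaves there. */
static bool guard_bytes_intact(MemcpyFunction memcpyTest, size_t dstOffset,
                               size_t size, uint8_t unusedValue)
{
    static const uint8_t src[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint8_t good[DEMO_BUFFER_SIZE];
    uint8_t test[DEMO_BUFFER_SIZE];

    /* Leave room for the one-byte overrun so the demo stays in bounds. */
    assert(size <= sizeof src && dstOffset + size < DEMO_BUFFER_SIZE);
    memset(good, unusedValue, sizeof good);
    memset(test, unusedValue, sizeof test);
    copy_bytes(&good[dstOffset], src, size);
    memcpyTest(&test[dstOffset], src, size);
    return memcmp(good, test, sizeof good) == 0;
}
```

With this sketch, guard_bytes_intact(&memcpy_overrun, 3, 8, 0xFF) returns true (the stray 0xFF is hidden by the 0xFF fill), while the same call with 0x00 as the fill value returns false and exposes the bug.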
Testing time
Each call to the test function takes about 3.3s.
Testing all 45 implementation cases therefore takes about 45 x 3.3 s ≈ 149 s ≈ 2.5 min.