Kogna debug suggestions

Moderators: TomKerekes, dynomotion

SJHardy
Posts: 46
Joined: Thu Oct 03, 2019 12:36 am

Kogna debug suggestions

Post by SJHardy » Sat Mar 02, 2024 7:21 pm

Porting some code over from kflop to kogna, I am experiencing a sudden crash on the latter, even though it has worked fine for years on the kflop. It loses the TCP connection and won't respond to pings. Using firmware version 5.3.2.

My question is, if there are a reasonable number of printf()s, how many will be lost if the kogna hard crashes? Currently, the only way I know to debug is to keep adding prints until I can bracket the point of crash. But that's not going to work if the last N prints get lost in the buffer.

Is it possible to divert prints to UDP etc. and have the kogna wait until they are sent on the Ethernet before continuing? I know that would slow things down a bit, but it would be useful for debugging. Could use wireshark etc. to see the prints.

Any other debugging tips? It would be nice to have trap handling so I could locate a line of code more directly. For example, I had a problem with NaNs appearing, and it would have saved many hours if there was a trap when a NaN was first used in any expression.

Talking about NaNs, is there a standard isnan() function for the kogna a-la math.h?

TomKerekes
Posts: 2676
Joined: Mon Dec 04, 2017 1:49 am

Re: Kogna debug suggestions

Post by TomKerekes » Sat Mar 02, 2024 8:46 pm

Hi SJH,
My question is, if there are a reasonable number of printf()s, how many will be lost if the kogna hard crashes? Currently, the only way I know to debug is to keep adding prints until I can bracket the point of crash. But that's not going to work if the last N prints get lost in the buffer.
Up to 256 lines can be buffered before printf() will block.

These circular index variables tell you how many lines are queued:

Code: Select all

volatile int NextAvailIn=0;  // when these are equal the queue is empty
volatile int NextAvailOut=0;
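
A minimal sketch of turning the two indices into a queue depth. It assumes the indices wrap at 256, which is inferred from the 256-line limit above and not confirmed against the firmware. The names queued_lines and PRINT_QUEUE_LINES are made up for illustration; on the board you would pass in the current values of NextAvailIn and NextAvailOut.

```c
#define PRINT_QUEUE_LINES 256  /* assumed capacity, taken from "up to 256 lines" */

/* Number of lines waiting between the two circular indices.
   Assumes both indices stay in the range 0..PRINT_QUEUE_LINES-1. */
int queued_lines(int next_in, int next_out)
{
    return (next_in - next_out + PRINT_QUEUE_LINES) % PRINT_QUEUE_LINES;
}
```

A result of 0 means the queue is empty, matching the "when these are equal" comment above.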

Is it possible to divert prints to UDP etc. and have the kogna wait until they are sent on the Ethernet before continuing? I know that would slow things down a bit, but it would be useful for debugging. Could use wireshark etc. to see the prints.
We don't currently support that as the Ethernet code is not Thread safe.

You might use the DSP UART to the virtual COM port on the PC. There shouldn't be any buffering. For example:

Code: Select all

#include "KMotionDef.h"

int uartprintf(const char *format, ...);

void main()
{
	uartprintf("test %d\n", 2+2);
}

It would be nice to have trap handling so I could locate a line of code more directly. For example, I had a problem with NaNs appearing, and it would have saved many hours if there was a trap when a NaN was first used in any expression.
No we don't have any such trap.


Talking about NaNs, is there a standard isnan() function for the kogna a-la math.h?
When all 11 exponent bits are 1, the value is +inf, -inf, or nan. So this function should tell you if a double is invalid:

Code: Select all

#include "KMotionDef.h"

int IsInvalid(double d)  // return 1 for +inf, -inf, nan
{
	double ld = d;  // TCC67 can't take the address of a register variable, so copy to a local
	int *i = (int *)&ld;
	int exp_mask = 0x7ff << (52-32);
	return (i[1] & exp_mask) == exp_mask;
}


void main()
{
	double d;
	int *i = (int *)&d;
	
	i[1] = 0x7ff << (52-32);
	i[0] = 0;
	printf("+inf = %g %d\n", d, IsInvalid(d));

	i[1] = (0x7ff << (52-32)) + (1<<31);
	i[0] = 0;
	printf("-inf = %g %d\n", d, IsInvalid(d));

	i[1] = 0x7ff << (52-32);
	i[0] = 1;
	printf("nan = %g %d\n", d, IsInvalid(d));

	d = 123.456e300;
	printf("normal = %g %d\n", d, IsInvalid(d));
}
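
If NaN needs to be told apart from infinity, or a poor man's trap is wanted that reports where a bad value first shows up, something along these lines may help. This is a sketch in plain C for a host build: classify_double, check_value, CHECK_FP and nan_hits are illustrative names, not firmware functions, and the availability of <stdint.h> with a 64-bit integer on the Kogna tool chain is an assumption to verify. On the board, the printf() call could be swapped for uartprintf().

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Classify a double from its IEEE 754 bit pattern:
   0 = finite, 1 = +inf or -inf, 2 = NaN. */
int classify_double(double d)
{
    uint64_t bits, exp, frac;

    memcpy(&bits, &d, sizeof bits);            /* copy the raw bits out */
    exp  = (bits >> 52) & 0x7ffu;              /* 11 exponent bits */
    frac = bits & (((uint64_t)1 << 52) - 1);   /* low 52 fraction bits */

    if (exp != 0x7ffu)
        return 0;
    return frac == 0 ? 1 : 2;
}

static int nan_hits = 0;   /* hypothetical counter of bad values seen */

/* Report the source location if d is inf or NaN, then pass d through. */
double check_value(double d, const char *file, int line)
{
    if (classify_double(d) != 0) {
        nan_hits++;
        printf("invalid double at %s:%d\n", file, line);
    }
    return d;
}

#define CHECK_FP(expr) check_value((expr), __FILE__, __LINE__)
```

Wrapping a suspect expression, for example x = CHECK_FP(a / b);, leaves the value unchanged and prints the file and line the first time something invalid passes through.
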
Regards,

Tom Kerekes
Dynomotion, Inc.

SJHardy
Posts: 46
Joined: Thu Oct 03, 2019 12:36 am

Re: Kogna debug suggestions

Post by SJHardy » Sun Mar 03, 2024 4:09 am

Hi Tom,

Thanks for the response. Regarding the printf buffering, yes I knew that there was a limit in the buffer size, but I'm really wondering about the potential for losing buffer contents in the case of a crash. For example, if my code branches off to a nonsense location which completely trashes the execution state, then how many prior printfs can I expect to have lost (i.e. didn't get back to the host server via TCP)?

Whatever my code is doing, it's resulting in a break of the TCP connection, requiring a power cycle, so if I am to find the point where things go haywire, I will need to have some confidence in the amount of printf trace which can potentially go missing. Maybe I'll have to add a WaitNextTimeSlice after each print, which is ok if necessary, but I'll defer to your advice.

I guess what I'm really asking is what do I have to do to *guarantee* a printf gets to the host.

Some of it will depend on whether you internally use the Nagle algorithm on the TCP connection, or whether it always sends any new data.

BTW, I'm using the TI compiler in a Linux environment.

I'm ok with sending chars over a serial port, if it truly blocks until transmitted. On the kogna, can I use the micro USB port which allows setting the IP address etc.? Is that the DSP uart? Otherwise, it will be a bit of an ordeal to hook up another serial port to the header pins.

Regards,
SJH

TomKerekes
Posts: 2676
Joined: Mon Dec 04, 2017 1:49 am

Re: Kogna debug suggestions

Post by TomKerekes » Sun Mar 03, 2024 6:41 pm

Hi SJH,
Regarding the printf buffering, yes I knew that there was a limit in the buffer size, but I'm really wondering about the potential for losing buffer contents in the case of a crash. For example, if my code branches off to a nonsense location which completely trashes the execution state, then how many prior printfs can I expect to have lost (i.e. didn't get back to the host server via TCP)?

Whatever my code is doing, it's resulting in a break of the TCP connection, requiring a power cycle, so if I am to find the point where things go haywire, I will need to have some confidence in the amount of printf trace which can potentially go missing. Maybe I'll have to add a WaitNextTimeSlice after each print, which is ok if necessary, but I'll defer to your advice.
I thought I explained that up to 256 lines could be lost. I think it would be hard to predict how many, as it depends on many things. The messages are returned to the PC when the PC asks for status, so if the PC is busy for 1 second the entire 256-line buffer might fill. It also depends on how fast the messages are generated. In normal cases it may be far fewer.

A WaitNextTimeSlice is basically a 180us delay and would not be sufficient to guarantee a message is sent.


I guess what I'm really asking is what do I have to do to *guarantee* a printf gets to the host.
If you add a while loop:

Code: Select all

volatile int NextAvailIn=0;  // when these are equal the queue is empty
volatile int NextAvailOut=0;
.
.
.
printf("hello\n");
while (NextAvailIn != NextAvailOut) ;  // wait for no messages in queue
it should guarantee the message is in a TCPIP buffer to be transmitted. Not sure if there is some possibility it isn't actually sent.
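
One caveat with an unbounded wait is that the thread hangs if the host stops polling. A bounded variant might look like the sketch below. The function name wait_queue_empty and the spin limit are hypothetical, and the indices are passed as pointers only so the function can be tried on a host; on the Kogna they would be the addresses of NextAvailIn and NextAvailOut.

```c
/* Poll until *in == *out, or give up after max_spins polls.
   Returns 1 if the queue drained, 0 on timeout. */
int wait_queue_empty(const volatile int *in, const volatile int *out, long max_spins)
{
    long spins;

    for (spins = 0; spins < max_spins; spins++) {
        if (*in == *out)
            return 1;
    }
    return 0;
}
```

A return of 0 tells the caller the message may still be sitting in the queue, which is itself a useful clue when bracketing a crash.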

Some of it will depend on whether you internally use the Nagle algorithm on the TCP connection, or whether it always sends any new data.
I believe the Nagle algorithm is disabled as the PC might often ask for small responses. But I don't think that really matters for this.

I'm ok with sending chars over a serial port, if it truly blocks until transmitted. On the kogna, can I use the micro USB port which allows setting the IP address etc.? Is that the DSP uart?
Yes. Note uartprintf() is not Thread Safe so you will need to be careful about that. I suppose we could add a Mutex to make it so.

HTH
Regards,

Tom Kerekes
Dynomotion, Inc.

SJHardy
Posts: 46
Joined: Thu Oct 03, 2019 12:36 am

Re: Kogna debug suggestions

Post by SJHardy » Sun Mar 03, 2024 7:41 pm

Yes that does help, as always.

I wasn't thinking too clearly about the print buffering, but it does make sense that a burst of 256 could be lost. That would be if I did 256 prints within the interval my server polls for console messages (about 20ms). I might try writing a blocking printf which waits for the buffer to be empty, then waits another time slice to be sure. Presumably the TCP library services each socket over the 180us period, or should it wait a bit more? Is there a library function I can call to check the number of pending chars in the socket tx buffer? Then, if NextAvailIn == NextAvailOut *and* the socket tx buffer is empty, that would pretty much guarantee the host had it all.

The UART seems to be a good solution, since it doesn't interact with the main TCP channel. I don't think thread safety is necessary at present, since I think my problem is confined to one thread. It would be useful to be able to poll the UART for when the Tx buffer is empty, though.

If I WaitNextTimeSlice before sending a single Tx char, would that be effectively thread safe?

Yes, Nagle would not make much difference on a LAN since the ACK time will be almost instant.

Regards,
SJH

TomKerekes
Posts: 2676
Joined: Mon Dec 04, 2017 1:49 am

Re: Kogna debug suggestions

Post by TomKerekes » Sun Mar 03, 2024 10:01 pm

Hi SJH,
The UART seems to be a good solution, since it doesn't interact with the main TCP channel. I don't think thread safety is necessary at present, since I think my problem is confined to one thread. It would be useful to be able to poll the UART for when the Tx buffer is empty, though.
The code waits until the Tx Register is empty before placing the next character, so I don't think there is any possibility of anything being lost.
Regards,

Tom Kerekes
Dynomotion, Inc.

SJHardy
Posts: 46
Joined: Thu Oct 03, 2019 12:36 am

Re: Kogna debug suggestions

Post by SJHardy » Wed Mar 06, 2024 5:56 am

Debugging the issue we have porting kflop code to kogna:

I can #define printf uartprintf, selectively. That seems to work, and I get a lot more messages than I was getting previously. Not sure, but the homing thread seems to complete, then the supervisor thread notices some updated state, does something different, and gets sent to oblivion.
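
A sketch of one way to make that redirection selective without redefining printf itself. The switch DEBUG_TO_UART, the alias dbg_printf and the function homing_step are hypothetical names. The uartprintf() body here is only a host-side stand-in so the example compiles off the board; on the Kogna it would be removed and the firmware's uartprintf() used, with the prototype shown earlier in the thread.

```c
#include <stdarg.h>
#include <stdio.h>

static int uart_calls = 0;   /* counts calls, for demonstration only */

/* Host-side stand-in for the firmware's uartprintf(). */
int uartprintf(const char *format, ...)
{
    va_list ap;
    int n;

    va_start(ap, format);
    n = vprintf(format, ap);
    va_end(ap);
    uart_calls++;
    return n;
}

#define DEBUG_TO_UART 1      /* hypothetical switch: 0 goes back to printf */

#if DEBUG_TO_UART
#define dbg_printf uartprintf
#else
#define dbg_printf printf
#endif

/* Only the code under suspicion uses dbg_printf; everything else keeps printf. */
void homing_step(int axis)
{
    dbg_printf("homing axis %d\n", axis);
}
```

Because the alias is a plain name rather than a variadic macro, it does not depend on the --gcc extensions.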

When facing this sort of problem before, I've used the TI compiler function hook to basically log every function entry and exit, so at least I can identify the function which copped it. It ends up being a vast amount of data, but I can store it in the gather buffer. The tricky bit is when rebooting and somehow reading out the captured data.
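
A minimal sketch of such an entry/exit log as a ring buffer. I believe the TI compiler options involved are --entry_hook and --exit_hook (with --entry_parm/--exit_parm selecting what is passed), but check the compiler guide for the exact spelling on 7.4.x. All names below (trace_rec, trace_enter, trace_exit, trace_recent, TRACE_SLOTS) are illustrative, and the static array stands in for whatever region of the gather buffer would be used on the board.

```c
#include <stdint.h>

#define TRACE_SLOTS 8   /* tiny for illustration; a real log would be far larger */

typedef struct {
    uint32_t addr;      /* function address, or any identifying tag */
    uint32_t is_exit;   /* 0 = entry, 1 = exit */
} trace_rec;

static trace_rec trace_buf[TRACE_SLOTS];   /* stand-in for gather buffer space */
static uint32_t  trace_count = 0;          /* total records ever written */

static void trace_put(uint32_t addr, uint32_t is_exit)
{
    trace_rec *r = &trace_buf[trace_count % TRACE_SLOTS];

    r->addr = addr;
    r->is_exit = is_exit;
    trace_count++;
}

void trace_enter(uint32_t addr) { trace_put(addr, 0); }
void trace_exit(uint32_t addr)  { trace_put(addr, 1); }

/* Post mortem: fetch the n-th most recent record (0 = newest).
   Returns 0 if that record was overwritten or never existed. */
int trace_recent(uint32_t n, trace_rec *out)
{
    if (n >= trace_count || n >= TRACE_SLOTS)
        return 0;
    *out = trace_buf[(trace_count - 1 - n) % TRACE_SLOTS];
    return 1;
}
```

Keeping only the newest records bounds the data volume; the last few entries without a matching exit point at the function that was running when things went wrong. This version is not safe against preemption between threads.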

Does the uart (firmware) have any sort of debug executive, or is it possible to write one? It just would be nice to have a facility to use the serial terminal to dump memory data etc. Even better would be to implement the same script commands as documented. Previously I have written special programs to print the captured data to console, but it's very tedious to do that when it's not even known what to look for. Yes, I can do this using getgather command etc., but that currently disrupts the normal host connection (I think, or maybe it's my server?) Also it can't access arbitrary memory. I suppose I could write a debug executive in my main app using "magic" MDI commands etc.
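
As a starting point for a home-made dump facility, here is a small hex dump routine that writes through whatever printf-style function it is given. The names dump_memory and print_fn are illustrative. Passing uartprintf as the printer on the board is an assumption based on the prototype shown earlier in the thread; on a host, printf works.

```c
#include <stddef.h>
#include <stdio.h>

typedef int (*print_fn)(const char *format, ...);

/* Dump len bytes starting at base, 16 per line, through the given printer.
   Returns the number of lines emitted. */
int dump_memory(const void *base, size_t len, print_fn out)
{
    const unsigned char *p = (const unsigned char *)base;
    size_t i;
    int lines = 0;

    for (i = 0; i < len; i++) {
        if (i % 16 == 0)
            out("%p:", (const void *)(p + i));   /* address at start of each line */
        out(" %02x", p[i]);
        if (i % 16 == 15 || i + 1 == len) {
            out("\n");
            lines++;
        }
    }
    return lines;
}
```

A command parser reading an address and length from the serial terminal could sit on top of this, but that part depends on how UART input is read and is left out here.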

The only reason I can think that kflop code fails on the kogna is that there was an uninitialized variable etc. that happened to be harmless (and hidden for years over hundreds of updates) but is exposed on the kogna. Can you think of anything else to look out for?

Is there any diff in the way kogna might be handling things like div by 0? Was the kflop ignoring FP errors like that but the kogna abends?

Regards,
SJH

TomKerekes
Posts: 2676
Joined: Mon Dec 04, 2017 1:49 am

Re: Kogna debug suggestions

Post by TomKerekes » Wed Mar 06, 2024 6:49 pm

Hi SJH,
Does the uart (firmware) have any sort of debug executive, or is it possible to write one? It just would be nice to have a facility to use the serial terminal to dump memory data etc.
Not currently. You could use a User Program to dump to it.
I can do this using getgather command etc., but that currently disrupts the normal host connection (I think, or maybe it's my server?)
It shouldn't disrupt the normal host connection, except that when the board is locked for an upload nothing else will be allowed access until the upload is complete. Uploading at a few megabytes/s shouldn't take very long. If it does, then you might upload in pieces.
Is there any diff in the way kogna might be handling things like div by 0? Was the kflop ignoring FP errors like that but the kogna abends?
I can't think of anything different, or how an FP error would cause a code crash.

One issue might be that the C6748 has an enhanced instruction set over KFLOP's C6722. Certain things such as pipelined loops might not be fully supported/saved/restored when interrupted to do a context switch. What compiler options are you using to compile your code? Specifically, are you including the -mv6710 option?

Here is the command line KMotion uses to compile/link:

C:\KMotionSrcKogna\DSP_KOGNA\ccsv7\tools\compiler\c6000_7.4.21\bin\cl6x.exe --abi=eabi -k -q -as --diag_suppress=163 --mem_model:data=far -i "C:\KMotionSrcKogna\DSP_KOGNA" -i "C:\KMotionSrcKogna\C Programs" -i "C:\KMotionSrcKogna\C Programs" -mu -ml3 -mv6710 -o0 "C:\KMotionSrcKogna\C Programs\print.c" --obj_directory="C:\KMotionSrcKogna\C Programs" --asm_directory="C:\KMotionSrcKogna\C Programs"

And to link:

C:\KMotionSrcKogna\DSP_KOGNA\ccsv7\tools\compiler\c6000_7.4.21\bin\cl6x.exe -z "C:\KMotionSrcKogna\DSP_KOGNA\Link.cmd"

with the attached Link.cmd

Are you linking with any TI library?

Have you tried compiling under KMotion? I suppose you have multiple C files not supported by KMotion?
Attachments
Link.cmd.txt
Regards,

Tom Kerekes
Dynomotion, Inc.

SJHardy
Posts: 46
Joined: Thu Oct 03, 2019 12:36 am

Re: Kogna debug suggestions

Post by SJHardy » Wed Mar 06, 2024 7:59 pm

Here is a trace for an example compile+link (first step is just generating makefile dependencies)...

Code: Select all

Running command /home/steve/c6000_7.4.14/bin/cl6x "/home/steve/DM6/Source/DM6-SCL-capture.c" -i"/home/steve/DM6/Projects/MA360E/Include" -i"/home/steve/DM6/Boards/MA360-Rev01/Include" -i"/home/steve/DM6/Machines/Mill5Axis/Include/MA360E" -i"/home/steve/DM6/Source" -i"/home/steve/DM6/Firmware/5.3.2" -DDM6FIRMWARE="\"5.3.2\"" -DDM6MAJOR=17 -DDM6BOARDREV=5 -DDM6PROJECT="\"MA360E\"" --preproc_includes --output_file="/home/steve/DM6/Source/Build/MA360E/DM6-SCL-capture.c.d" --gcc

Compiling DM6-SCL-capture.c
Running command /home/steve/c6000_7.4.14/bin/cl6x "/home/steve/DM6/Source/DM6-SCL-capture.c" -i"/home/steve/DM6/Projects/MA360E/Include" -i"/home/steve/DM6/Boards/MA360-Rev01/Include" -i"/home/steve/DM6/Machines/Mill5Axis/Include/MA360E" -i"/home/steve/DM6/Source" -i"/home/steve/DM6/Firmware/5.3.2" -DDM6FIRMWARE="\"5.3.2\"" -DDM6MAJOR=17 -DDM6BOARDREV=5 -DDM6PROJECT="\"MA360E\"" --output_file="/tmp/DM6-SCL-capture.c.o" --abi=eabi -k -q -as --diag_suppress=163 --mem_model:data=far -mv6710 -ml3 -mu -O2 --opt_for_space --gcc

Running command /home/steve/c6000_7.4.14/bin/nm6x "/tmp/DM6-SCL-capture.c.o" > "/tmp/DM6-SCL-capture.c.nm"

Running command /home/steve/c6000_7.4.14/bin/cl6x --run_linker "/tmp/DM6-SCL-capture(1).out.cmd"
We use the --gcc flag to get variadic macros and some other GNU extensions, but I don't think those are the issue.

Here is the linker script:

Code: Select all

/* Link template for TI Linker. */

-q -c -x -e ___user_init
"/tmp/DM6-SCL-capture.c.o"
"/home/steve/DM6/Firmware/5.3.2/user_boot.obj"
--output_file=/home/steve/DM6/Projects/MA360E/Build/DM6-SCL-capture(1).out
-m/home/steve/DM6/Projects/MA360E/Build/DM6-SCL-capture(1).out.map

-heap  0x100
-stack 0x100 
--diag_suppress=16002
--diag_suppress=10063
--cinit_compression=off
--ram_model // this eliminates the need for c initialization tables
"/home/steve/DM6/Firmware/5.3.2/DSPKOGNA-5.3.2.sym"  /* include KFLOP Symbols */

MEMORY
{
	IRAM:     o = 0x11832780	l = 0x0000d880	/*  for FAST User C Programs use small leftover toward end of IRAM */                           
	USER_SA:  o = 0xc0080000	l = 0x00040000	/*  THREAD1 Starts at 0x80050000 */                        
}

SECTIONS
{
	GROUP
	{
		/* MUST PUT A NON ZERO WORD IN NORMAL FIRST WORD OF THREAD SPACE OR KFLOP WILL REFUSE TO RUN THREAD */
		.mytext: START(__start_.text), palign(8), fill = 0xAAAAAAAA {. += 4;}
		.text:  END(__stop_.text)
		.stack
		.bss: START( __start_.bss) END(__stop_.bss)
		.const: START( __start_.data)  
		.data: 
		.far 
		.fardata
		.switch 
		.cio
		.cinit: END(__stop_.data)
	}  >  USER_SA
	
	GROUP
	{
		.IRAM
		.IRAMP
		.IRAMD	
	}  >  IRAM
}                        
We're not using any TI libraries. It's a fairly complex system, so I doubt it would compile under the Windows environment.

I don't think it would be anything to do with context switching. The system *mostly* works for us. For example, we can boot it up holding down an MPG button and it goes into a "test mode" that does almost everything except it just enables axes without the normal homing procedure. We can run G-code etc.

At present I am working on using the gather buffer to be able to log progress, then have another program run a post mortem. But I do need a way to temporarily disable interrupts just so I can reliably allocate the next log buffer. Not sure if I'm clever enough to do it using just the atomic functions, but I'll give it a try.
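
The slot allocation itself can be done without disabling interrupts if an atomic fetch-and-add is available. The sketch below uses C11 <stdatomic.h> so it can be tried on a host; that header is probably not available with the 7.4.x TI compiler, so on the Kogna the fetch-and-add would have to come from the firmware's atomic functions instead. LOG_SLOTS, log_next and log_reserve are hypothetical names.

```c
#include <stdatomic.h>

#define LOG_SLOTS 1024u   /* hypothetical log size, a power of two */

static atomic_uint log_next = 0;   /* running count of slots handed out */

/* Reserve one slot. The increment is atomic, so two threads can never be
   handed the same running count, even if one is preempted mid-call. */
unsigned log_reserve(void)
{
    return atomic_fetch_add(&log_next, 1u) % LOG_SLOTS;
}
```

Each thread then fills in its own slot at leisure; only the index bump needs to be atomic. Once the count wraps, old slots are reused, so a slow writer can still be overtaken on a very busy log.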

Regards,
SJH

TomKerekes
Posts: 2676
Joined: Mon Dec 04, 2017 1:49 am

Re: Kogna debug suggestions

Post by TomKerekes » Wed Mar 06, 2024 8:14 pm

If you call WaitNextTimeSlice() there will be about 50us of time without any possibility of interruption.
Regards,

Tom Kerekes
Dynomotion, Inc.
