Kogna debug suggestions
- TomKerekes
- Posts: 2676
- Joined: Mon Dec 04, 2017 1:49 am
Re: Kogna debug suggestions
I'm not sure what you mean. It's OK to use the beginning of the gather buffer for your application as long as you don't expect it to be preserved after a reboot, as your crash post-mortem code might.
Regards,
Tom Kerekes
Dynomotion, Inc.
Re: Kogna debug suggestions
Well that's the problem, because I do want to verify that that struct is not trashed. For example, it contains a lot of function pointers, which should be completely untouched after init. But I'll worry about that once I get the basic ability to reset without power cycling.
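One way to check that a struct of function pointers has not been trashed is to checksum it once after init and compare again later. The following is only a sketch: `app_hooks_t`, `handler_fn`, `region_checksum` and `hooks_intact` are hypothetical names, not anything from the Kogna firmware, and the checksum is an Adler-32 style sum chosen for brevity. The saved checksum would itself have to live somewhere that survives the reset, which, per Tom's note, the gather buffer may not.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real application struct (not shown in this thread). */
typedef void (*handler_fn)(void);

typedef struct {
    handler_fn on_start;
    handler_fn on_stop;
    handler_fn on_fault;
} app_hooks_t;

/* Adler-32 style checksum over the raw bytes of a memory region. */
static uint32_t region_checksum(const void *p, size_t len)
{
    const unsigned char *b = (const unsigned char *)p;
    uint32_t lo = 1u, hi = 0u;
    size_t i;

    for (i = 0; i < len; i++) {
        lo = (lo + b[i]) % 65521u;
        hi = (hi + lo) % 65521u;
    }
    return (hi << 16) | lo;
}

/* Returns 1 if the struct still matches the checksum taken right after init. */
static int hooks_intact(const app_hooks_t *h, uint32_t sealed)
{
    return region_checksum(h, sizeof *h) == sealed;
}
```

Usage would be to call `region_checksum(&hooks, sizeof hooks)` once at the end of init, store the result, and call `hooks_intact()` from the diagnostic code that runs after the reset.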
Re: Kogna debug suggestions
Here's the output on the terminal, from the start of a power-up:
Code:
Primary Boot KOGNA 5.0.0 Build 09:52:44 Mar 26 2022
Recovery Delay before Reading Boot Jumper
Boot Jumper Installed Attempting Secondary Boot
Flash Reading from Block 20...
Jump to Secondary Boot
KOGNA 5.3.2 Build 18:10:31 Jan 31 2024
Flash Reading from Block 16...
Board Serial Number/Domain Kogna_SN10
Requesing IP Addr 192.168.1.16
KOGNA IP Address Assigned 192.168.1.16
Then, after pressing the reset switch (pull JP4 RESET# to GND):
Code:
No Ethernet Link
So obviously the reset is not equivalent to POR, but is only affecting the Ethernet link. Indeed, I cannot connect over Ethernet after using that reset.
- TomKerekes
Re: Kogna debug suggestions
Ah yes. In that case you might move the structure down in memory.
For Kogna, RESET# is a peripheral reset output only. We don't offer a DSP Reset on the connectors. Shorting U17 Pin 2 to GND briefly will reset the DSP and everything else. U17 is a 3-pin device. Pin 1 is marked with a 'o'. Pin 2 is the other pin on the same side. Shorting through a 100 ohm resistor should also work and be safer.
Another option would be to use CCS and the XDS110 to reset the board.
Regards,
Tom Kerekes
Dynomotion, Inc.
Re: Kogna debug suggestions
I implemented the reset as you described, and it works. I can reset and run diagnostic code after a crash.
Unfortunately, I'm no closer to solving this issue. It's like a hand grenade: the pin is pulled in one thread, and some time later everything goes kablooey. No amount of trace and function hook logging has been fruitful.
So do you think it is time to get CCS and a JTAG device? If so, how is that to be used with the Kogna, and what debug features will be enabled? How is it even going to handle the dynamic execution of programs (e.g. this problem arises when thread 1 launches thread 2, all while running a big "supervisor" code in thread 7)? I guess I need a bit of convincing that it will even help. It's more the learning curve and time than the money which is a problem.
- TomKerekes
Re: Kogna debug suggestions
Regarding CCS: as long as different programs are not overlaid in the same Thread, you should be able to set breakpoints and watch points in your code. It supports a number of "hardware" breakpoints. If you load the symbols for the program (even before it is loaded) and set a hardware breakpoint, then the processor will break if the breakpoint location is executed. "Software" breakpoints need to be placed after the code has been loaded. How the code is "launched" doesn't really matter.
A hardware breakpoint at location 0 might catch a jump using a null pointer. If that occurs you might be able to examine the stack to see where the call was from. I suppose you could also break on a read of location 0 (read from null pointer). Address range 0 is a Processor Boot ROM.
There are also "hardware watch points" that will break if an address is read or written. Unfortunately I don't see an option to watch a block of addresses. Other Processors/Emulators support an address mask but I can't find it for this.
But I don't see that it guarantees a way to track down the problem. Just increases the chances of seeing something.
CCS is based on Eclipse. I use Version 11.
Regards,
Tom Kerekes
Dynomotion, Inc.
Re: Kogna debug suggestions
Is J6 the JTAG port on the Kogna? The online documentation refers to the Kflop, but the Kflop uses JP2 for JTAG. As the only 14-pin header on the board, I'm guessing it is, but would like confirmation before smoke testing.
- TomKerekes
Re: Kogna debug suggestions
Hi Steve,
That is correct. Rev 1.2 Kogna changes the connector label to JP2.
Attached is the Target Configuration we use for CCS 11.
The basic process is to "launch" this target configuration.
Note the JTAG connects to the DSP and daisy chains also to the other 2 PRU processors in the chip that we do not currently use. So right click on the first processor (C674X) and Connect Target.
At this point the Disassembly Window should show where the PC is in memory and display memory contents, which confirms that it is connected and viewing into the DSP.
You can then Run | Load Program | DSPKOGNA.out
You might also load User Programs or Symbols for User Programs.
If the Boot Jumper is removed then if you "Run" the code it should execute what is in memory without boot loading anything from Flash.
Apps should then be able to communicate with Kogna.
HTH
- Attachments
- XDS100V3 Kogna New.ccxml.txt (1.89 KiB)
Regards,
Tom Kerekes
Dynomotion, Inc.
Re: Kogna debug suggestions
I think CCS is basically going. For anyone else out there using CCS on Linux (Ubuntu), Version 11 works on Ubuntu 22.04, but fails badly on 16.04. Version 12 doesn't properly install on 16.04 because it requires a slightly up-level libc for a logging utility. Haven't tried CCS 12 on Ubuntu 22.04. I'm using CCS 11 because that's what Tom's using.
Q. for Tom: do you know what's at address 0x00712148? It's an IDLE instruction and it seems to be stuck there. Even though I load symbols for the firmware and my code, it doesn't know that location, although it looks like legit code in the disassembler. I guess it could be my code from other threads, but it's beyond my patience and understanding to load all our code one by one to find it. But that address doesn't look like the normal thread space (?)
Background: I recompiled all our code with -g. That sometimes made it *not* crash, but it usually still crashes, and it now ends up at a different location from where it ended up without -g. In that case, it was executing in some of my data, and just happened to hit an instruction sequence which hard-looped back on itself.
IER is all zero (except the NMI bit). Would that explain why the firmware doesn't get back control?
- TomKerekes
Re: Kogna debug suggestions
Hi Steve,
That's great you have CCS working. And thanks for the Linux notes.
0x00700000 - 0x007FFFFF is the DSP's internal Boot ROM, so 0x00712148 is inside it. If it is stuck there then the code most likely crashed and fell into the Boot ROM.
When -g debug code is enabled, much of the compiler optimization is disabled. The TI Compiler is very powerful and aggressive about reordering things, running operations in parallel, and whatnot. So for example if a loop is waiting on a variable to be set by another Thread, and the variable is not declared volatile, then the optimizing compiler might load the variable into a register once and test only the register in the loop, and so get stuck in the loop forever.
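As a minimal sketch of the volatile point, assuming a flag shared between two Threads: `data_ready` and `wait_for_flag` are hypothetical names, not part of the Kogna firmware. The `volatile` qualifier forces a fresh read of the flag on every pass through the loop, and the spin limit is only there so the wait cannot hang forever.

```c
/* Hypothetical flag written by one Thread and polled by another. */
static volatile int data_ready = 0;

/*
 * Poll *flag until it becomes non-zero or max_spins reads have been made.
 * Because the pointee is volatile, the compiler must re-read memory on
 * every iteration instead of caching the value in a register.
 * Returns 1 if the flag was seen set, 0 if the spin limit was reached.
 */
static int wait_for_flag(const volatile int *flag, unsigned long max_spins)
{
    unsigned long n;

    for (n = 0; n < max_spins; n++) {
        if (*flag)
            return 1;
    }
    return 0;
}
```

Without `volatile` on the flag, an optimized build is allowed to hoist the read out of the loop, which is the stuck-forever behavior described above.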
An IER that is all zero apart from the NMI bit would be a problem. Normally bit 6 is also set to allow the 90us Timer interrupt to do servo samples and switch Threads. If that was ever turned off, multi-tasking wouldn't work.
Turning global interrupts on and off is very tricky on this (as well as KFLOP's) processor. Because of pipelining an interrupt can occur while turning off interrupts. If a Thread is pre-empted with interrupts off the result will be that the new Thread will execute with interrupts off and without any possibility of being pre-empted (interrupted). We are careful to never disable interrupts. Some TI Libraries disable interrupts. But you said you don't link in any TI libraries so that should not be an issue.
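The two checks discussed in this post (is the PC inside the Boot ROM range, and is IER bit 6 still set) can be written down as small helpers for a post-mortem routine. This is a sketch only: the function and macro names are hypothetical, and the PC and IER values are assumed to be supplied by the caller, for example copied from the CCS register view. The address range and the bit number are the ones given above.

```c
#include <stdint.h>

/* Boot ROM range and timer interrupt bit as stated in this thread. */
#define BOOT_ROM_START 0x00700000u
#define BOOT_ROM_END   0x007FFFFFu
#define IER_TIMER_BIT  6u

/* Returns 1 if the program counter value lies inside the DSP Boot ROM. */
static int pc_in_boot_rom(uint32_t pc)
{
    return pc >= BOOT_ROM_START && pc <= BOOT_ROM_END;
}

/* Returns 1 if IER bit 6 (the 90us Timer interrupt enable) is set. */
static int timer_irq_enabled(uint32_t ier)
{
    return (int)((ier >> IER_TIMER_BIT) & 1u);
}
```

For the crash described here, `pc_in_boot_rom(0x00712148u)` returns 1, and `timer_irq_enabled()` returns 0 for any IER value that has bit 6 clear.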
Regards,
Tom Kerekes
Dynomotion, Inc.