Hi Kevin,
It would be possible, except the FPGA is full. We could offer custom FPGA configurations, but I'm afraid that would be a maintenance/documentation nightmare.
Maybe an optimized software routine would give you a significant speedup. Can you post the code you are currently using? What throughput are you getting?
Regards TK
Group: DynoMotion
Message: 7187
From: fireup_kev
Date: 4/5/2013
Subject: Re: SPI
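Here's my routine. I haven't tested the throughput.

int SPI_OUT(int data)
{
    int i;
    int dataIn;

    dataIn = 0;
    ClearBit(pSS);                  // assert slave select (active low)
    for (i = 0; i < 8; i++)
    {
        if (data & 0x80)            // present the next bit on MOSI, MSB first
            SetBit(pMOSI);
        else
            ClearBit(pMOSI);
        data = data << 1;

        dataIn = dataIn << 1;       // sample MISO before the clock pulse
        if (ReadBit(pMISO))
            dataIn |= 0x01;

        SetBit(pSCK);               // one clock pulse per bit
        ClearBit(pSCK);
    }
    SetBit(pSS);                    // release slave select

    return dataIn;
}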
> ________________________________
> From: fireup_kev <kliboon@...>
> To: DynoMotion@yahoogroups.com
> Sent: Friday, April 5, 2013 10:44 AM
> Subject: [DynoMotion] SPI
>
> Tom,
>
> Is it possible to add hardware SPI to the FPGA of the KFLOP (master mode)? I'm using software bit-bang SPI right now and it's a bit slow.
>
> Kevin
>
Group: DynoMotion
Message: 7188
From: Tom Kerekes
Date: 4/5/2013
Subject: Re: SPI
Hi Kevin,
I measure about 18us to send a byte (which is a ~450 kHz clock rate). See attached benchmark. Results:
Times 25.360000 17.800000 17.960000 41.745960 us
The first one takes a bit longer because it has to fill the cache, the next two take about 18us, and the average of 1000 takes longer because, with time slicing, the thread only gets ~1/3 of the CPU.
It looks like with some tricky code it could be reduced to ~10us.
Can you say what you are using it for?
Regards TK
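The attached benchmark is not reproduced in the thread. As a rough sketch of the measurement described above (three individual calls, then the average of many calls), the following host-side C harness may help. It is not Tom's code: `spi_out_stub`, `now_sec`, `benchmark`, and `print_times` are hypothetical names, the stub only stands in for the real `SPI_OUT`, and the timer is the host's wall clock rather than anything on the KFLOP.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the routine under test; on the real board this
   would be the SPI_OUT() posted earlier in the thread. */
volatile int sink;

int spi_out_stub(int data)
{
    int i;
    for (i = 0; i < 8; i++)
        sink = (data >> i) & 1;     /* pretend to touch one I/O bit */
    return data & 0xFF;
}

/* Wall-clock seconds from the host (an assumption; the target would use
   its own timer). */
double now_sec(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Time three single calls and the average of n calls, in microseconds.
   results[0..2] = individual calls, results[3] = average over n calls. */
void benchmark(int (*fn)(int), int n, double results[4])
{
    double t0, t1;
    int i;

    for (i = 0; i < 3; i++) {
        t0 = now_sec();
        fn(0xA5);
        t1 = now_sec();
        results[i] = (t1 - t0) * 1e6;
    }

    t0 = now_sec();
    for (i = 0; i < n; i++)
        fn(0xA5);
    t1 = now_sec();
    results[3] = (t1 - t0) * 1e6 / n;
}

/* Print in the same layout as the "Times ..." lines in the thread. */
void print_times(const double r[4])
{
    printf("Times %f %f %f %f us\n", r[0], r[1], r[2], r[3]);
}
```

Because the averaged figure is wall-clock time divided by the call count, it includes any time the thread spends switched out, which is why it comes out higher than the single-call figures when the thread only gets part of the CPU.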
Group: DynoMotion
Message: 7190
From: Craig
Date: 4/6/2013
Subject: Re: SPI
Kevin,
Just a thought:
How about using the serial IO from the KFLOP to one of the low cost Arduino boards, then SPI from there? That is what I am going to do for some of my system.
Craig
Group: DynoMotion
Message: 7192
From: Tom Kerekes
Date: 4/6/2013
Subject: Re: SPI
But that would actually give a much lower data rate: 56K baud vs 450K bits/sec.
Regards TK
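To put the two figures on the same footing, the small sketch below converts each to payload bytes per second. It assumes 8N1 framing on the serial link (10 bit times per byte) and back-to-back 8-bit SPI transfers; neither assumption is stated in the thread, and the function names are only illustrative.

```c
#include <stdio.h>

/* Payload bytes per second for an asynchronous serial link, assuming 8N1
   framing (1 start + 8 data + 1 stop = 10 bit times per byte). */
double uart_bytes_per_sec(double baud)
{
    return baud / 10.0;
}

/* Payload bytes per second for SPI at a given clock, assuming back-to-back
   8-bit transfers with no gaps between bytes. */
double spi_bytes_per_sec(double clock_hz)
{
    return clock_hz / 8.0;
}

void print_comparison(void)
{
    printf("UART 56K baud : %.0f bytes/s\n", uart_bytes_per_sec(56000.0));
    printf("SPI 450 kbit/s: %.0f bytes/s\n", spi_bytes_per_sec(450000.0));
}
```

Under those assumptions the serial link carries about 5600 bytes/s against about 56250 bytes/s for the bit-banged SPI, so the gap is roughly a factor of ten.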
Group: DynoMotion
Message: 7195
From: fireup_kev
Date: 4/7/2013
Subject: Re: SPI
I'm using it for many things. One is to expand the I/O and the other is to read external memory. The chips can run at 20 MHz or more and I would like to take advantage of that if I can. 450 kHz is decent for now, but it will make me rethink whether I should add other stuff to the bus.
Thanks for the benchmark routine. I'll try unrolling the loop to see if there's any improvement.
Thanks,
Kevin
Group: DynoMotion |
Message: 7196 |
From: Tom Kerekes |
Date: 4/7/2013 |
Subject: Re: SPI |
Hi Kevin,
Heh heh, I tried unrolling the loop and it made it slightly worse. I guess it doesn't fit in cache or something.
Writing directly to hard coded FPGA addresses to set/clear the bits, placing the code in internal DSP memory, using an optimizing compiler reduced it to the 10us. But it would be hard to write this in a general way.
There is also a way in KFLOP to set or clear multiple bits at the same time with one IO operation. Potentially something like the data set and clock set might be done together, but that wouldn't help if the data was low.
External IO from the DSP is relatively slow. It surprises me that the time to read an IO bit is about the same time as theoretically to do 50+ 32-bit floating point operations.
Regards TK
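The idea of raising data and clock in one write can be illustrated with a simulated port. The sketch below is only a model: the real FPGA register addresses and layout are not given in the thread, so `sim_port`, `port_set`, `port_clear`, and the two masks are hypothetical, and the routine is write-only (it does not sample MISO). It counts I/O operations to show where the saving comes from.

```c
#include <stdint.h>

/* Hypothetical bit positions within one simulated output port. */
#define MOSI_MASK 0x01u
#define SCK_MASK  0x02u

/* Simulated port and an I/O operation counter; stand-ins for real FPGA
   registers, which are not described in the thread. */
uint32_t sim_port;
unsigned sim_io_ops;

void port_set(uint32_t mask)   { sim_port |= mask;  sim_io_ops++; }
void port_clear(uint32_t mask) { sim_port &= ~mask; sim_io_ops++; }

/* Shift out one byte, MSB first. A 1 bit raises MOSI and SCK in a single
   write (2 operations for the bit); a 0 bit still needs a separate MOSI
   clear before the clock pulse (3 operations). */
void spi_write_byte_combined(uint8_t data)
{
    int i;
    for (i = 0; i < 8; i++) {
        if (data & 0x80u) {
            port_set(MOSI_MASK | SCK_MASK);
        } else {
            port_clear(MOSI_MASK);
            port_set(SCK_MASK);
        }
        port_clear(SCK_MASK);
        data = (uint8_t)(data << 1);
    }
}
```

In this model an all-ones byte costs 16 operations and an all-zeros byte costs 24, which matches the remark that the trick does not help when the data is low. Changing data and clock in the same write also leaves the slave no setup time before the rising edge, so whether a particular chip tolerates it would need to be checked against its datasheet.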
Group: DynoMotion
Message: 7199
From: fireup_kev
Date: 4/7/2013
Subject: Re: SPI
I unrolled the loop and removed all the shifting, and it went down to about 14us.
Times 43.079999 13.919999 13.979999 32.542440 us
I'm also using SPI in a callback, so 18us does have an impact on the 90us time slice.
Kevin
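The unrolled code itself was not posted. One way such a routine might look is sketched below, using a fixed mask per bit so that neither `data` nor `dataIn` is shifted at run time. The `SetBit`/`ClearBit`/`ReadBit` definitions here are host-side stand-ins so the sketch compiles off-target (with MISO looped back to MOSI), the pin numbers are hypothetical, and `SPI_BIT` and `SPI_OUT_unrolled` are illustrative names.

```c
/* Host-side stand-ins so the sketch compiles off-target; on KFLOP the real
   SetBit/ClearBit/ReadBit would be used instead. Pin numbers are
   hypothetical. MISO is looped back to MOSI for testing. */
enum { pSS = 0, pMOSI = 1, pMISO = 2, pSCK = 3 };
static int pins[4];

void SetBit(int bit)   { pins[bit] = 1; if (bit == pMOSI) pins[pMISO] = 1; }
void ClearBit(int bit) { pins[bit] = 0; if (bit == pMOSI) pins[pMISO] = 0; }
int  ReadBit(int bit)  { return pins[bit]; }

/* One bit of the transfer with a compile-time mask: the same mask selects
   the outgoing bit and places the incoming bit, so nothing is shifted. */
#define SPI_BIT(mask)                           \
    do {                                        \
        if (data & (mask)) SetBit(pMOSI);       \
        else               ClearBit(pMOSI);     \
        if (ReadBit(pMISO)) dataIn |= (mask);   \
        SetBit(pSCK);                           \
        ClearBit(pSCK);                         \
    } while (0)

int SPI_OUT_unrolled(int data)
{
    int dataIn = 0;

    ClearBit(pSS);      /* assert slave select (active low) */
    SPI_BIT(0x80);
    SPI_BIT(0x40);
    SPI_BIT(0x20);
    SPI_BIT(0x10);
    SPI_BIT(0x08);
    SPI_BIT(0x04);
    SPI_BIT(0x02);
    SPI_BIT(0x01);
    SetBit(pSS);        /* release slave select */

    return dataIn;
}
```

The pin sequence per bit is the same as in the looped version; only the loop counter and the two shifts are gone. Whether this is faster on a given build depends on how the larger code fits in cache, as the differing results in the thread suggest.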
Group: DynoMotion
Message: 7201
From: Tom Kerekes
Date: 4/7/2013
Subject: Re: SPI
Hi Kevin,
Attached is the version with the compiled-in FPGA register writes, which executes in about 10.5us.
Times 26.680000 10.460000 10.820000 26.634480 us
Seems to me writing from a Thread would be better because with a 90us callback only one byte per 90us is transferred.
Regards TK
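The callback-versus-thread point can be checked with simple arithmetic. The sketch below assumes exactly one byte is moved per period and uses the averaged thread figure from the timing line above; `bytes_per_sec` is an illustrative name.

```c
/* Bytes per second when one byte is moved every period_us microseconds. */
double bytes_per_sec(double period_us)
{
    return 1e6 / period_us;
}
```

With one byte per 90us callback, `bytes_per_sec(90.0)` gives about 11,100 bytes/s, while the thread's averaged 26.63448us per byte gives roughly 37,500 bytes/s, a bit over three times as much.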