Optimizing for max sample rate out of XTRX + gnuradio

GW:

  • firmware version 5
  • gateware version: 3
  • gateware revision 5

LimeSuiteNG: latest develop branch for gnuradio

If I do the simplest of tests - the GNU Radio FM demod example (FM Demod - GNU Radio)

I cannot get the XTRX to stop reporting overflow or loss (I am still confused by both of these happening on receive; I would only expect overflow to occur on a receive channel).

This happens for complex processing chains too; the XTRX seems to consistently fail to deliver anything beyond 1-5 MSPS. Any guidance on optimizing XTRX usage for maximum sample rate would be appreciated.

Another data point: with a USRP I can comfortably push 25 MHz through a flowgraph I have, but on an XTRX I can barely push 1 MHz.

It all depends on the flow graph, which block is the limiting factor, and the CPU performance.
For example let’s take that FM demod flow graph.
I can run it using the XTRX at 60 Msps without problems, when the graph is set up correctly.
What I mean is: if you just take that flow graph and crank up the Source sampling rate without adjusting the other blocks, there is going to be a problem:
In the end the Audio sink is set to consume 48ksps. Taking into account the Rational resampler decimation by 5 and the FM demod decimation by 8, that makes the expected sample consumption rate of the graph to be 1.92Msps.
If the Source is simply set to produce 60Msps, the graph still consumes only 1.92Msps. What to do with the extra 58Msps? The Source has to overrun, because nobody is consuming that much data.
So to fix that, the other blocks' sampling rates/decimations need to be increased. Let's say we change the Rational resampler decimation from 5 to 50; now the Source can run at 19.2 Msps. Increase the decimation to 100, and the Source can run at 38.4 Msps without problems.
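The sample-rate budget above can be sketched as a simple calculation. The block names and decimation values mirror the FM demod example discussed here; they are assumptions about that graph, not values read from an actual .grc file.

```python
# Steady-state sample-rate budget for a decimating flow graph: the source
# rate the graph can absorb is the final sink rate multiplied by every
# decimation factor along the chain.

AUDIO_RATE = 48_000  # Audio sink consumption rate in samples/s

def max_source_rate(decimations):
    """Highest source rate the graph consumes without overrunning."""
    rate = AUDIO_RATE
    for d in decimations:
        rate *= d
    return rate

# Original graph: FM demod decimates by 8, Rational resampler by 5.
print(max_source_rate([8, 5]))    # 1.92 Msps

# Resampler decimation raised to 100: the Source can now run at 38.4 Msps.
print(max_source_rate([8, 100]))
```

Setting the Source above the value this returns guarantees overruns, regardless of how fast the hardware or CPU is.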

Now, say the graph is running fine with the Source at 38.4 Msps, no overruns. Let's add a FIR block between the Source and the Rational resampler. Suddenly the Source starts to overrun again, even though no sampling rates have changed and the graph's expected sample consumption rate also stays the same.
Looking at the CPU load, I see that a single CPU core is loaded to 100% just by the FIR block's work processing; that block cannot keep up with real time at such a sampling rate. It therefore becomes a bottleneck for the entire graph: the Source ends up producing data faster than the FIR block can consume it, again resulting in Source overruns. So in this case, even though the graph is correct, the problem is CPU performance.
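A back-of-envelope estimate shows why a single FIR block can saturate one core. The tap count here is a hypothetical example, not taken from the flow graph above.

```python
# Rough compute cost of a naive (non-decimating) FIR filter: one complex
# multiply-accumulate per tap per input sample, all on one core in GNU
# Radio's default per-block threading model.

def fir_macs_per_second(sample_rate, num_taps):
    """Complex multiply-accumulates per second for a naive FIR."""
    return sample_rate * num_taps

rate = 38_400_000  # Msps reaching the FIR in the example above
taps = 128         # hypothetical filter length

macs = fir_macs_per_second(rate, taps)
print(f"{macs / 1e9:.1f} G complex MACs/s")  # ~4.9 G complex MACs/s
```

Even with SIMD-optimized kernels, a load in this range can exceed what one core sustains in real time, which is exactly the bottleneck described above.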

In essence, nothing special needs to be done with the XTRX Source block; the overruns are a result of the graph not consuming data fast enough.

Overflow and loss both mean data has been dropped, but at different places. There is a background thread that is constantly processing the received data packets, which are timestamped.

overflow - basically means the data has been received from the device, parsed, and ready, but the application did not take it fast enough, so data had to be dropped intentionally.

loss - means data from the device never arrived, or was not observed by the parsing thread. How that can happen depends on the device type (PCIe/USB…).
PCIe:

  • Data can be lost if the device tries to transfer more than the PCIe link is capable of, e.g. PCIe Gen2 x1 lane has a theoretical bandwidth of 500 MB/s, but the device is trying to push >500 MB/s.
  • The CPU load is so high that even the data receive thread cannot keep up with parsing packets. The thread detects a timestamp jump into the future, because the device's DMA has overrun the ring buffer by a whole lap.
  • General data corruption/misalignment issues within gateware.
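The overflow/loss distinction above can be sketched as a small classifier. The packet layout, function names, and reporting mechanism here are all hypothetical illustrations; the real LimeSuiteNG receive thread is more involved.

```python
# Sketch of how a receive thread could tell "loss" from "overflow" using
# packet timestamps, per the description above:
#   - a timestamp jumping ahead of what we expected means samples were
#     produced by the device but never seen by us -> loss
#   - an intact packet dropped because the application buffer is full
#     is a deliberate discard -> overflow

def classify(expected_ts, packet_ts, app_buffer_full):
    """Classify one incoming timestamped packet; returns (event, gap)."""
    if packet_ts > expected_ts:
        # Gap in the timestamp stream: DMA ring buffer was lapped,
        # or the link dropped data before it reached us.
        return "loss", packet_ts - expected_ts
    if app_buffer_full:
        # Data arrived and was parsed, but the application is not
        # reading fast enough; drop it intentionally.
        return "overflow", 0
    return "ok", 0

print(classify(1000, 1000, False))  # ('ok', 0)
print(classify(1000, 5096, False))  # ('loss', 4096)
print(classify(1000, 1000, True))   # ('overflow', 0)
```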