Reading from multiple LimeSDR-Mini v2 in same app

Hi,

I am trying to read from two LimeSDR Mini v2 devices in the same app, but I observe instability in the received samples. I am trying to figure out what is causing it: the software, the SDR, or the USB communication.

For this purpose I extended the “singleRX” example in LimeSuite to support multiple devices and to measure the execution time of LMS_RecvStream(). I ran it with 2, 3, and even 5 SDRs connected to my laptop and to another machine, and in every case I observe the following:

1166. Radio 0 1D90F53077C892: Read 7680 samples for 19.00us. {On=1 FIFO= 120360/ 7679580(1.5673%) URun=0 ORun=0 Dropped=0 Rate=30.8675MB/s timestamp=9087180}
1166. Radio 1 1D90F5FD139436: Read 7680 samples for 19.00us. {On=1 FIFO= 87720/ 7679580(1.1422%) URun=0 ORun=0 Dropped=0 Rate=30.8675MB/s timestamp=9053520}
1166. Radio 2 1D90EE19E2E8F7: Read 7680 samples for 19.00us. {On=1 FIFO= 63240/ 7679580(0.8235%) URun=0 ORun=0 Dropped=0 Rate=30.8675MB/s timestamp=9027000}
1166. Radio 3 1D90F687323423: Read 7680 samples for 39.00us. {On=1 FIFO= 30600/ 7679580(0.3985%) URun=0 ORun=0 Dropped=0 Rate=30.8675MB/s timestamp=8996400}
1166. Radio 4 1D90F62803335C: Read 7680 samples for 935.00us. {On=1 FIFO= 6120/ 7679580(0.0797%) URun=0 ORun=0 Dropped=0 Rate=30.8675MB/s timestamp=8969880}

One of the SDRs, usually the last one, executes the call to LMS_RecvStream() much slower than the rest. Depending on the sample rate, it seems to be between 10x and 50x slower. In the output quoted above, the first 4 SDRs need 19 or 39us, while the last one needs 935us.

Have you observed such behavior? What could be causing it and what can I do to solve it?

My setup:
Laptop: Intel(R) Core™ i7-9850H CPU @ 2.60GHz-4.60GHz 6 cores, 12 threads, RAM:16GB, Ubuntu 20.04
PC: AMD Ryzen 9 7950X @ 3.00GHz-5.879GHz, 16 cores, 32 threads, RAM:64GB, Ubuntu 22.04
LimeSuite: based on master branch, Parent: 38efe9602a7945717a371a1bb9e19d075284f445 (Fix RFE panel checkbox for Rx DC corrector)

Since I can’t find a way to upload the code I use for the test, I will paste it here. To build it, simply replace the existing singleRX.cpp example with the code below. Once compiled, you can run it like this; just replace the list of SDR serial numbers (-r option):

./singleRX -f 1300000000 -g 40 -a LNAH -s 7680000 -n 7680 -t 3 -o 1 -r "1D90F53077C892,1D90F5FD139436" -p 0
/**
    @file   singleRX.cpp
    @author Lime Microsystems (www.limemicro.com)
    @brief  RX example
 */
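#include "lime/LimeSuite.h"
#include <iostream>
#include <cstdio>    // printf, snprintf
#include <cstdint>   // uint32_t, int32_t
#include <cstdlib>
#include <string>
#include <unistd.h>  // getopt, usleep
#include <chrono>
#include <utility>   // std::pair
#include <vector>
#include <sstream>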
#ifdef USE_GNU_PLOT
#include "gnuPlotPipe.h"
#endif

using namespace std;

double                   freq              = 2630e6;
float                    rf_gain           = -1.0;
std::vector<std::string> radios            = {};
std::string              antenna           = "auto";
double                   srate             = 30.72e6;
uint32_t                 samples_per_frame = 30.72e6/1000;
int32_t                  exec_time         = 5;          /* in seconds*/
int32_t                  oversample        = 1;
int32_t                  pstatus           = 1000;

void usage(char* prog)
{
  printf("Usage: %s [fgasntorph]\n", prog);
  printf("\t-f Carrier frequency in Hz [Default %.2f]\n", freq);
  printf("\t-g RF gain [Default AGC(%.2f)]\n", rf_gain);
  printf("\t-a Antenna to use [Default %s]\n", antenna.c_str());
  printf("\t-s Sampling rate [Default %.2f]\n", srate);
  printf("\t-n Samples per frame [Default %d]\n", samples_per_frame);
  printf("\t-t Execution time [Default %d]\n", exec_time);
  printf("\t-o Oversample factor [Default %d]\n", oversample);
  printf("\t-r Comma separated list of serial numbers to use [Default All detected]\n");
  printf("\t-p Print stream status interval us [Default %d]\n", pstatus);
  printf("\t-h show this message\n");
}

void parse_args(int argc, char** argv)
{
  int opt;
  // Each option except -h takes an argument, so declare it with ':' and read optarg
  while ((opt = getopt(argc, argv, "f:g:a:s:n:t:o:r:p:h")) != -1) {
    switch (opt) {
      case 'f':
        freq = strtod(optarg, NULL);   // strtod keeps full double precision for Hz values
        break;
      case 'g':
        rf_gain = strtof(optarg, NULL);
        break;
      case 'a':
        antenna = optarg;
        break;
      case 's':
        srate = strtod(optarg, NULL);
        break;
      case 'n':
        samples_per_frame = (uint32_t)strtod(optarg, NULL);
        break;
      case 't':
        exec_time = strtol(optarg, NULL, 10);
        break;
      case 'p':
        pstatus = strtol(optarg, NULL, 10);
        break;
      case 'o':
        oversample = strtol(optarg, NULL, 10);
        break;
      case 'r':
      {
        // Split the comma separated list of serial numbers
        std::stringstream ss(optarg);
        std::string token;
        while (getline(ss, token, ','))
          radios.push_back(token);
      }
        break;
      case 'h':
      default:
        usage(argv[0]);
        exit(-1);
    }
  }
}

int error(lms_device_t* device = nullptr)
{
    if (device != NULL)
        LMS_Close(device);
    exit(-1);
}

int setupRX(lms_device_t** device, lms_info_str_t sdev)
{
  int r = 0;
  
  //Open the device
  if (LMS_Open(device, sdev, NULL))
      error(*device);

  //Initialize device with default configuration
  //Do not use if you want to keep existing configuration
  //Use LMS_LoadConfig(device, "/path/to/file.ini") to load config from INI
  if (LMS_Init(*device) != 0)
      error(*device);
  
  if (LMS_EnableChannel(*device, LMS_CH_TX, 0, true)!=0)	// Fix for v2
      error(*device);

  //Enable RX channel
  //Channels are numbered starting at 0
  if (LMS_EnableChannel(*device, LMS_CH_RX, 0, true) != 0)
      error(*device);

  //Set center frequency to the value given with -f
  if (LMS_SetLOFrequency(*device, LMS_CH_RX, 0, freq) != 0)
      error(*device);

  //print currently set center frequency
  float_type actual_freq; //separate name so the global freq is not shadowed
  if (LMS_GetLOFrequency(*device, LMS_CH_RX, 0, &actual_freq) != 0)
      error(*device);
  cout << "\nCenter frequency: " << actual_freq / 1e6 << " MHz\n";

  
  int n = 0;
  //select antenna port
  lms_name_t antenna_list[10];    //large enough list for antenna names.
  //Alternatively, NULL can be passed to LMS_GetAntennaList() to obtain number of antennae
  if ((n = LMS_GetAntennaList(*device, LMS_CH_RX, 0, antenna_list)) < 0)
    error(*device);
  cout << "Available antennas:\n";            //print available antennae names
  for (int i = 0; i < n; i++)
    cout << i << ": " << antenna_list[i] << endl;
  if ("auto" == antenna) {
    if ((n = LMS_GetAntenna(*device, LMS_CH_RX, 0)) < 0) //get currently selected antenna index
      error(*device);
    //print antenna index and name
    //cout << "Automatically selected antenna: " << n << ": " << antenna_list[n] << endl;
    if (LMS_SetAntenna(*device, LMS_CH_RX, 0, LMS_PATH_LNAW) != 0) // manually select antenna
      error(*device);
  }
  else {
    for (int i = 0; i < n; i++) {
      if (antenna_list[i] == antenna) {
        if (LMS_SetAntenna(*device, LMS_CH_RX, 0, i) != 0) // manually select antenna
          error(*device);
        break;
      }
    }
  }
  if ((n = LMS_GetAntenna(*device, LMS_CH_RX, 0)) < 0) //get currently selected antenna index
      error(*device);
 //print antenna index and name
  cout << "Selected antenna: " << n << " " << antenna_list[n] << endl;
  
  
  //This set sampling rate for all channels
  if (LMS_SetSampleRate(*device, srate, oversample) != 0)
      error(*device);
  
  //print resulting sampling rates (interface to host , and ADC)
  float_type rate, rf_rate;
  if (LMS_GetSampleRate(*device, LMS_CH_RX, 0, &rate, &rf_rate) != 0)  //NULL can be passed
      error(*device);
  cout << "\nHost interface sample rate: " << rate / 1e6 << " MHz\nRF ADC sample rate: " << rf_rate / 1e6 << "MHz\n\n";

  //Example of getting allowed parameter value range
  //There are also functions to get other parameter ranges (check LimeSuite.h)

  //Get allowed LPF bandwidth range
  lms_range_t range;
  if (LMS_GetLPFBWRange(*device,LMS_CH_RX,&range)!=0)
      error(*device);

  cout << "RX LPF bandwidth range: " << range.min / 1e6 << " - " << range.max / 1e6 << " MHz\n\n";

  //Configure LPF, bandwidth equal to the sampling rate
  if (LMS_SetLPFBW(*device, LMS_CH_RX, 0, srate) != 0)
      error(*device);

  //Set RX gain
  if (0 > rf_gain) {
    if (LMS_SetNormalizedGain(*device, LMS_CH_RX, 0, 0.7) != 0)
      error(*device);
    //Print RX gain
    float_type gain; //normalized gain
    if (LMS_GetNormalizedGain(*device, LMS_CH_RX, 0, &gain) != 0)
      error(*device);
    cout << "Normalized RX Gain: " << gain << endl;
  }
  else {
    if (LMS_SetGaindB(*device, LMS_CH_RX, 0, rf_gain) != 0)
      error(*device);
  }

  unsigned int gaindB; //gain in dB
  if (LMS_GetGaindB(*device, LMS_CH_RX, 0, &gaindB) != 0)
      error(*device);
  cout << "RX Gain: " << gaindB << " dB" << endl;

  //Perform automatic calibration
  if (LMS_Calibrate(*device, LMS_CH_RX, 0, srate, 0) != 0)
      error(*device);

  return r;
}

int setupRXStream(lms_device_t* device, lms_stream_t* streamId, bool run = false)
{
  //Enable test signal generation
  //To receive data from RF, remove this line or change signal to LMS_TESTSIG_NONE
  //if (LMS_SetTestSignal(device, LMS_CH_RX, 0, LMS_TESTSIG_NCODIV8, 0, 0) != 0)
  //    error(device);

  //Streaming Setup - Initialize stream
  streamId->channel = 0; //channel number
  //streamId->fifoSize = 1024 * 1024; //fifo size in samples
  streamId->fifoSize = srate; //set fifo to contain ~1 second of data
  streamId->throughputVsLatency = 1.0; //optimize for max throughput
  streamId->isTx = false; //RX channel
  streamId->dataFmt = lms_stream_t::LMS_FMT_F32; //32-bit floats
  
  int r = LMS_SetupStream(device, streamId);
  
  if (!r && run)
    r = LMS_StartStream(streamId);
  
  return r;
}

int setupRXStream_Start(lms_stream_t* streamId)
{
  return LMS_StartStream(streamId);
}

int main(int argc, char** argv)
{
    int r = 0;
    parse_args(argc, argv);
    
    //Find devices
    //First we find number of devices, then allocate large enough list,  and then populate the list
    int n;
    if ((n = LMS_GetDeviceList(NULL)) < 0)//Pass NULL to only obtain number of devices
        error();
    cout << "Devices found: " << n << endl;
    if (n < 1)
        return -1;

    lms_info_str_t* list = new lms_info_str_t[n];   //allocate device list

    if (LMS_GetDeviceList(list) < 0)                //Populate device list
        error();

    for (int i = 0; i < n; i++)                     //print device list
        cout << i << ": " << list[i] << endl;
    cout << endl;
    
    
    
    //Device structures, should be initialized to NULL
    std::vector<lms_device_t*> devices(radios.size(), nullptr);
    
    std::vector<lms_stream_t> streamIds(radios.size());
    
    //Data buffers
    //const int bufersize = frame_ms*sr/1000; //complex samples per buffer - holds frame_ms of data
    //float buffer[bufersize * 2]; //must hold I+Q values of each sample
    std::vector<std::vector<float>> buffers(radios.size());
    for (auto&& buffer : buffers)
      buffer.resize(samples_per_frame*2);
    
    for (size_t i=0; i<radios.size() && !r; i++) 
    {
      bool found = false;
      int j = 0;
      //find the desired sdr in the list
      for (j=0; j<n && !found; j++) {
        std::string available(list[j]);
        if ((found = string::npos != available.find(radios[i]))) {
          cout << "Looking for " << radios[i] << " Found: " << available << " Match:" << found << endl;
          break;
        }
      }
      
      if (!found) {
        cout << "Device: " << radios[i] << " not found!" << endl;
        exit(-1);
      }
      
      cout << "Constructing device " << i << " "<< radios[i] <<" List[" << j << "]"  << list[j] << endl;
      
      r = setupRX(&devices[i], list[j]);
      
      cout << "Created device: " << devices[i] << " List:" << list[j] << endl;
    }
    if (r)
      cout << "Failed to construct LMS Devices!" << endl;
    
    for (size_t i=0; i<radios.size() && !r; i++) 
    {
      cout << "Constructing stream " << i << " "<< radios[i] << " streamId:" << streamIds[i].handle << endl;
      
      r = setupRXStream(devices[i], &streamIds[i], false);
      
      cout << "Created stream: " << devices[i] << " streamId:" << streamIds[i].handle << endl;
    }
    if (r)
      cout << "Failed to construct LMS Read Stream!" << endl;
    
    usleep(200000); //it seems we need some time to settle
    
    for (size_t i=0; i<radios.size() && !r; i++) 
      r = setupRXStream_Start(&streamIds[i]);
    if (r)
      cout << "Failed to start LMS Read Stream!" << endl;
    
    std::vector<std::pair<int32_t, double>> read_samples(radios.size(), {0,0.0});
    
    auto t1 = chrono::high_resolution_clock::now();
    auto t2 = t1;
    int loop = 0;
    while (!r && chrono::high_resolution_clock::now() - t1 < chrono::seconds(int(exec_time))) //run for exec_time seconds
    {
        for (size_t i=0; i<radios.size() && !r; i++) 
        {
          auto beg = chrono::high_resolution_clock::now();

          //Receive samples
          r = LMS_RecvStream(&streamIds[i], buffers[i].data(), samples_per_frame, NULL, 3000);
          
          if (0 <= r) {
            read_samples[i].first += r;
            r = 0;
            auto end = chrono::high_resolution_clock::now();
            std::chrono::duration< double > fs = end- beg;
            read_samples[i].second += std::chrono::duration_cast< std::chrono::microseconds >( fs ).count();
          }
        }
        
        if (!r && (!pstatus || chrono::high_resolution_clock::now() - t2 > chrono::microseconds(pstatus)))
        {
          t2 = chrono::high_resolution_clock::now();
          for (size_t i=0; i<radios.size() && !r; i++) 
          {
            lms_stream_status_t status;
            //Get stream status
            r = LMS_GetStreamStatus(&streamIds[i], &status);
            
            //Format into a fixed-size buffer; specifiers and casts match the argument types
            char s[512];

            snprintf(s, sizeof(s), "%d. Radio %zu %s: Read %d samples for %.2fus. {On=%d FIFO=%8u/%8u(%4.4f%%) URun=%u ORun=%u Dropped=%u Rate=%4.4fMB/s timestamp=%llu}",
                    loop, i, radios[i].c_str(), read_samples[i].first, read_samples[i].second, 
                    (int)status.active, status.fifoFilledCount, status.fifoSize, 100 * (float)status.fifoFilledCount / (float)status.fifoSize,
                    status.underrun, status.overrun, status.droppedPackets, status.linkRate/1e6, (unsigned long long)status.timestamp);
            
            cout << s << endl;
            
            read_samples[i].first = 0;
            read_samples[i].second = 0;
          }
        }
        loop++;
    }

    for (size_t i=0; i<radios.size(); i++) {
      //Stop streaming
      LMS_StopStream(&streamIds[i]); //stream is stopped but can be started again with LMS_StartStream()
      LMS_DestroyStream(devices[i], &streamIds[i]); //stream is deallocated and can no longer be used
      //Close device
      LMS_Close(devices[i]);
    }

    return 0;
}

You are running 5 independent devices. Even though you start the streams in a loop

for (size_t i=0; i<radios.size() && !r; i++)
r = setupRXStream_Start(&streamIds[i]);

The start is not instant. It actually does a lot of work, such as creating individual threads to run the sample acquisition, so in reality there is going to be a time offset between the moments the radios actually start getting samples. In your example you’re always going to wait longest on the last started radio, because when it is just getting started, all the radios before it have already buffered some samples.
That said, LMS_RecvStream blocks until the requested number of samples is available in the FIFO. In your example you are running at 7680000 samples per second and reading 7680 samples at a time, so an LMS_RecvStream call may need to wait up to 1ms (7680 / 7680000 s) to collect that many samples before returning, which is close to what you’re getting.

Radio 4 1D90F62803335C: Read 7680 samples for 935.00us.

While you’re waiting on Radio 4, all the other radios are also preparing samples. So when you’re done with Radio 4, the other radios already have enough samples ready for the next read; LMS_RecvStream does not need to block and just copies the samples over, which is why it takes only ~19us. The LMS_RecvStream calls for the first 4 radios collectively use up to 100us, then you get back to reading from Radio 4, but it only produces the requested number of samples every 1000us, so you are left with about (1000-100)=900us to wait for the next batch of data.
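The arithmetic above can be written out as a small sketch. The helper names framePeriodUs and expectedWaitUs are made up for illustration and are not part of LimeSuite; the numbers plugged in are the ones from this thread.

```cpp
#include <cassert>
#include <cstdint>

// Time needed to accumulate one frame of samples, in microseconds.
double framePeriodUs(uint32_t samples_per_frame, double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    return 1e6 * samples_per_frame / sample_rate_hz;
}

// Rough blocking time for the radio that lags behind: one frame period minus
// the time already spent in the (non-blocking) reads of the other radios.
double expectedWaitUs(double frame_period_us, double other_reads_us)
{
    double wait = frame_period_us - other_reads_us;
    return wait > 0.0 ? wait : 0.0;
}
```

With 7680 samples at 7680000 samples per second, framePeriodUs gives 1000us. Taking the four fast reads from the log (19 + 19 + 19 + 39 = 96us), expectedWaitUs(1000, 96) gives 904us, which is in the same range as the 935us measured for Radio 4.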

Hi ricardas,

I was expecting this answer. So the rule of thumb is that if I need 1ms of data, I should wait for about 1ms.

Yet the delay between the samples is significant. Also, once all SDRs are up and sampling and I call LMS_RecvStream(), I will receive data from different times in the past. Taking the example with 5 SDRs: Radio 0 will give me samples from, let’s say, 4ms ago, Radio 1 from 3ms ago, Radio 2 from 2ms ago, Radio 3 from 1ms ago, and Radio 4 is still sampling.

And I hoped to be able to get data from within the last 1ms from all SDRs.

Are there any guidelines for such synchronizations?

I was reading the myriadrf topics and found “Synchronize two LimeSDR”, which sounds like the ultimate solution. But I don’t know what its current status is or whether it is applicable to the LimeSDR-Mini v2.2. Also, I have never reprogrammed FPGAs before. Do you know anything about that? Btw, I ran the gpio_example, but it does not seem to work properly: the pin status is read, but the pins cannot be toggled.

I also started a more detailed investigation of the LimeSuite code. If I get the picture right, we have:

  1. A threadRx, calling ReceivePacketsLoop(), which reads rf samples from the USB and stores them in a FIFO.
  2. Then LMS_RecvStream() call will get the samples from this FIFO.

Then what will happen if the threadRx of each SDR waits for some trigger before entering the while loop in ReceivePacketsLoop()? I mean, will that harm the FPGA processing somehow, or anything else in the data transfer chain?

I will try to implement such a thread sync, but I am still inspecting the code.
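For reference, the host-side part of such a trigger could look like the sketch below. It is plain standard C++ and does not touch LimeSuite: StartGate and releaseAll are hypothetical names, and the worker lambda only stands in for a receive thread that would wait before entering its loop. Whether delaying the receive thread is safe for the FPGA and the USB transfer chain is not something this sketch answers.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// One-shot gate: threads block in wait() until some thread calls open().
class StartGate {
public:
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return open_; });
    }
    void open() {
        {
            std::lock_guard<std::mutex> lock(m_);
            open_ = true;
        }
        cv_.notify_all();  // release every waiting thread at once
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool open_ = false;
};

// Demo helper: starts n workers that wait at the gate, opens the gate,
// and returns how many workers got through.
int releaseAll(int n)
{
    assert(n >= 0);
    StartGate gate;
    std::atomic<int> passed{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i)
        workers.emplace_back([&gate, &passed] {
            gate.wait();   // a receive thread would block here
            ++passed;      // ...and enter its read loop here
        });
    gate.open();
    for (auto& w : workers)
        w.join();
    return passed.load();
}
```

The gate only lines up the moment the host threads are released; the OS scheduler still decides when each one actually runs, so some residual offset between radios is to be expected.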

Thank you!

To make this type of thing work in a real program, you will have to create two separate threads for each SDR. On the receive side, each thread sits and waits for the data and double-buffers it when it comes. On the transmit side, each thread sits and waits until it has data to transmit. Without the separate threads you can sometimes get it to sort of work, but to make it reliable I have always needed to use threads.

Well, yes: either wait, or do your data processing in the meantime.

As I said, LMS_StartStream takes a long time because it allocates memory and creates threads, so it can take several milliseconds. The API wasn’t designed for synchronizing independent devices, as that is usually done at the hardware level.
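A minimal sketch of the receive side of that idea, assuming one reader thread per SDR and two buffers that are swapped after every complete frame. RadioReader and ReadFn are hypothetical names; ReadFn stands in for a blocking call such as LMS_RecvStream() so that the sketch compiles without LimeSuite.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Stand-in for a blocking read: fills `out` and returns the number of
// samples read, or a negative value on error.
using ReadFn = std::function<int(std::vector<float>& out)>;

class RadioReader {
public:
    RadioReader(ReadFn read, std::size_t floats_per_frame)
        : read_(std::move(read)), back_(floats_per_frame), front_(floats_per_frame)
    {
        assert(floats_per_frame > 0);
        worker_ = std::thread([this] { run(); });
    }
    ~RadioReader()
    {
        stop_ = true;      // the worker exits after its current read returns
        worker_.join();
    }
    // Copies the most recent complete frame into `out` and returns its
    // sequence number (0 means no frame has arrived yet).
    unsigned long latest(std::vector<float>& out)
    {
        std::lock_guard<std::mutex> lock(m_);
        out = front_;
        return seq_;
    }
private:
    void run()
    {
        while (!stop_) {
            if (read_(back_) < 0)   // blocking read into the back buffer
                break;
            std::lock_guard<std::mutex> lock(m_);
            back_.swap(front_);     // publish the finished frame
            ++seq_;
        }
    }
    ReadFn read_;
    std::vector<float> back_, front_;
    std::mutex m_;
    std::atomic<bool> stop_{false};
    unsigned long seq_ = 0;
    std::thread worker_;
};
```

In the program from this thread, the callable would wrap LMS_RecvStream(&streamIds[i], out.data(), samples_per_frame, NULL, 3000) and floats_per_frame would be samples_per_frame*2. The read must have a timeout, as it does there, otherwise the destructor could block forever in join().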

I think you should be able to achieve this with some code modification.
The FPGA data streaming is enabled by writing the RX_EN register.

So what you can try:

  1. Comment out the stream start here: fpga->StartStreaming();
  2. In your program, after all radios have had their LMS_StartStream calls, manually write the RX_EN registers using LMS_WriteFPGAReg for all radios, so that they actually start sampling as close as possible to each other in time.
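The second step could be sketched roughly as below. Two things here are assumptions and must be checked against the LimeSuite source and the FPGA gateware for your board: that RX_EN is bit 0 of FPGA register 0x000A, and that setting it is all fpga->StartStreaming() would otherwise have done. RegIo and enableRxOnAll are hypothetical names; the register access is passed in as callables so the sketch builds without LimeSuite.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// ASSUMPTION: address and bit position of RX_EN, verify before use.
constexpr uint32_t kCtrlRegAddr = 0x000A;
constexpr uint16_t kRxEnBit     = 0x0001;

// Register access for one radio; both callables return 0 on success.
struct RegIo {
    std::function<int(uint32_t addr, uint16_t* val)> read;
    std::function<int(uint32_t addr, uint16_t val)>  write;
};

// Sets RX_EN on every radio. All reads are done first so that the writes,
// which actually start the sampling, happen back to back.
int enableRxOnAll(std::vector<RegIo>& radios)
{
    std::vector<uint16_t> values(radios.size());
    for (std::size_t i = 0; i < radios.size(); ++i) {
        assert(radios[i].read && radios[i].write);
        if (radios[i].read(kCtrlRegAddr, &values[i]) != 0)
            return -1;
        values[i] |= kRxEnBit;   // keep the other bits of the register intact
    }
    for (std::size_t i = 0; i < radios.size(); ++i)
        if (radios[i].write(kCtrlRegAddr, values[i]) != 0)
            return -1;
    return 0;
}
```

In a real program the callables would forward to LMS_ReadFPGAReg(device, addr, val) and LMS_WriteFPGAReg(device, addr, val) for each opened device. Each write is still a separate USB transaction, so the radios start close together in time but not on the same sample.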

Hi RightHalfPlane and ricardas,

I tested both suggestions: using threads and the manual trigger to the FPGAs. Both work, which should not be a surprise.

But I checked how the timestamps of the streams vary in both solutions, and I can conclude that the threaded approach behaves much better.

The images show the timestamp deltas per frame between the SDRs for 10 sec. On the left is the threaded solution and on the right the manual trigger.
Apparently I should go with the threaded option.

Thank you for the help!