New LimeSDR Mini - loopback test failed

Does anyone know what’s the official story around the loopback test failures on the LimeSDR Mini? There are at least half a dozen recent posts here describing new devices that fail the first of the two loopback tests that LimeQuickTest performs.

Some people have described consistent failures, but more common seems to be intermittent failures that may be related to temperature. My board arrived this afternoon, and I’ve been getting intermittent failures all evening.

Is this a defect on the PCB, like a cold solder joint or a short somewhere? That would suggest that there’s an external component responsible for connecting TX to RX. I’ve only skimmed the schematic, but my impression was that the switchable loopback was internal to the LMS7002. If that’s how it works, could it be a defect with the Lime IC? That seems unlikely… I guess it could be a marginal ground connection?

Maybe it’s a firmware issue or a bug in the design of the test?

Is there a bad batch, or does this happen on all of them?

Other than this intermittent loopback failure, the board appears to work. Previous commenters have suggested asking CrowdSupply for an RMA, but I don’t want to make them jump through hoops if it’s not going to fix the problem. Is it even a problem? What’s going on here?

ohazi@woodstock:~$ LimeQuickTest 
[ TESTING STARTED ]
->Start time: Fri May  8 00:20:27 2020

->Device: LimeSDR Mini, media=USB 3.0, module=FT601, addr=24607:1027, serial=1D588E348F8FF6
  Serial Number: 1D588E348F8FF6

[ Clock Network Test ]
->REF clock test
  Test results: 52661; 322; 13519 - PASSED
->VCTCXO test
  Results : 6711067 (min); 6711220 (max) - PASSED
->Clock Network Test PASSED

[ FPGA EEPROM Test ]
->Read EEPROM
->Read data: 13 0A 16 13 0A 16 02
->FPGA EEPROM Test PASSED

[ LMS7002M Test ]
->Perform Registers Test
->External Reset line test
  Reg 0x20: Write value 0xFFFD, Read value 0xFFFD
  Reg 0x20: value after reset 0x0FFFF
->LMS7002M Test PASSED

[ RF Loopback Test ]
->Configure LMS
->Run Tests (TX_2 -> LNA_W):
  CH0 (SXR=1000.0MHz, SXT=1005.0MHz): Result:(-48.3 dBFS, 1.36 MHz) - FAILED
->Run Tests (TX_1 -> LNA_H):
  CH0 (SXR=2100.0MHz, SXT=2105.0MHz): Result:(-15.0 dBFS, 5.00 MHz) - PASSED
->RF Loopback Test FAILED

=> Board tests FAILED <=

Elapsed time: 2.48 seconds

ohazi@woodstock:~$ LimeQuickTest 
[ TESTING STARTED ]
->Start time: Fri May  8 00:21:09 2020

->Device: LimeSDR Mini, media=USB 3.0, module=FT601, addr=24607:1027, serial=1D588E348F8FF6
  Serial Number: 1D588E348F8FF6

[ Clock Network Test ]
->REF clock test
  Test results: 34206; 47403; 60600 - PASSED
->VCTCXO test
  Results : 6711047 (min); 6711200 (max) - PASSED
->Clock Network Test PASSED

[ FPGA EEPROM Test ]
->Read EEPROM
->Read data: 13 0A 16 13 0A 16 02
->FPGA EEPROM Test PASSED

[ LMS7002M Test ]
->Perform Registers Test
->External Reset line test
  Reg 0x20: Write value 0xFFFD, Read value 0xFFFD
  Reg 0x20: value after reset 0x0FFFF
->LMS7002M Test PASSED

[ RF Loopback Test ]
->Configure LMS
->Run Tests (TX_2 -> LNA_W):
  CH0 (SXR=1000.0MHz, SXT=1005.0MHz): Result:(-13.6 dBFS, 5.00 MHz) - PASSED
->Run Tests (TX_1 -> LNA_H):
  CH0 (SXR=2100.0MHz, SXT=2105.0MHz): Result:(-14.7 dBFS, 5.00 MHz) - PASSED
->RF Loopback Test PASSED

=> Board tests PASSED <=

Elapsed time: 2.46 seconds

Something doesn’t add up here. Most people note that their boards “appear to work,” yet LimeQuickTest is intermittently reporting failures. These failures are usually on the same channel/frequency combo, e.g. CH0 (SXR=1000.0MHz, SXT=1005.0MHz) on the LimeSDR Mini, or CH1 (SXR=1800.0MHz, SXT=1805.0MHz) on the LimeSDR.
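One way to check whether the failures really cluster on a single channel/frequency combination is to save the output of many LimeQuickTest runs and tally the results. Here is a minimal sketch that parses result lines in the format shown in the logs above. The function names are made up for illustration and nothing here is part of LimeQuickTest itself.

```python
import re

# Matches loopback result lines as they appear in the pasted logs, e.g.
#   CH0 (SXR=1000.0MHz, SXT=1005.0MHz): Result:(-48.3 dBFS, 1.36 MHz) - FAILED
RESULT_RE = re.compile(
    r"(CH\d+) \(SXR=([\d.]+)MHz, SXT=([\d.]+)MHz\): "
    r"Result:\((-?[\d.]+) dBFS, (-?[\d.]+) MHz\) - (PASSED|FAILED)"
)


def parse_loopback_results(log_text):
    """Return one dict per loopback result line found in log_text."""
    results = []
    for m in RESULT_RE.finditer(log_text):
        channel, sxr, sxt, level, peak, verdict = m.groups()
        results.append({
            "channel": channel,
            "sxr_mhz": float(sxr),
            "sxt_mhz": float(sxt),
            "level_dbfs": float(level),
            "peak_mhz": float(peak),
            "passed": verdict == "PASSED",
        })
    return results


def failure_counts(results):
    """Map (channel, SXR, SXT) to a (failures, total runs) tuple."""
    counts = {}
    for r in results:
        key = (r["channel"], r["sxr_mhz"], r["sxt_mhz"])
        fails, total = counts.get(key, (0, 0))
        counts[key] = (fails + (0 if r["passed"] else 1), total + 1)
    return counts
```

Feeding it the concatenated output of a few dozen runs would show whether the 1000/1005 MHz combination fails more often than the 2100/2105 MHz one on a given board.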

It doesn’t make sense that one particular transmitter or LNA channel at one particular frequency would be more prone to damage across multiple boards. If this is a manufacturing failure, I would expect other channel combinations to fail more often, but this doesn’t seem to be happening.

If there is actual damage, I would expect the tests to consistently fail. Intermittent failure is something I would expect from a cold solder joint, not a damaged part of an integrated circuit.

There are claims that the loopback tests in LimeQuickTest are supposed to be done on a cold board.
This makes no sense to me. A loopback is a switch, not an amplifier or a filter with temperature sensitive analog components. How could you possibly have a loopback test that is designed to only work on a cold board? Either you have a loopback that works at all reasonable temperatures, or you don’t have a loopback. The LMS7002 datasheet also mentions a loopback amplifier, but if this is the temperature sensitive component, you’d see the output changing by a few dB at most, not -14 dBFS at 5 MHz to -45 dBFS at a completely different frequency.

I’m starting to suspect that there may be a flaw in the way the loopback test is performed by the LimeQuickTest utility. Perhaps the delay between consecutive tests run by LimeQuickTest is marginally too short. If the board isn’t finished configuring itself by the time the next test is run, it’s conceivable that it could return garbage more frequently for one particular test than for any of the others.

Here are other threads documenting similar loopback test failures on the LimeSDR Mini:

Here are some LimeSDR (non-mini) loopback test failures that look similar:

There are many more, but you get the point.

Indeed, albeit it should fail consistently only under the same set of operating parameters, i.e. a known temperature and nothing connected to the RF ports. With random temperatures and things attached, all bets are off.

Because the pass/fail is based on a defined level of performance, and this will degrade as temperature increases. It shouldn’t be by too much, but the point is to have a benchmark. So you run the test with the board cold and know whether it’s up to spec or may have suffered damage. This is not always a binary matter of working or not working; a board can also work with degraded performance. Say, too strong a signal has been input to an LNA and caused partial damage, or similarly a bad load or bad input levels on a Tx port.

So if TX_2 -> LNA_W is passing sometimes and failing at other times with results similar to those shown above, this would suggest to me that something is wrong. @Zack, would appreciate your thoughts on this.

Yeah, I would expect some continuous degradation with temperature, but I don’t think that’s what we’re seeing with these intermittent failures.

->Run Tests (TX_2 -> LNA_W):
  CH0 (SXR=1000.0MHz, SXT=1005.0MHz): Result:(-46.1 dBFS, -1.38 MHz) - FAILED

->Run Tests (TX_2 -> LNA_W):
  CH0 (SXR=1000.0MHz, SXT=1005.0MHz): Result:(-14.1 dBFS, 5.00 MHz) - PASSED

I’m planning on looking at the tests in more detail soon, but it looks like it’s transmitting a tone/shape at one frequency, connecting the loopback, receiving at a 5 MHz offset, and then taking an FFT and looking for a peak at 5 MHz.
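To make that guess concrete, here is a rough sketch of what such a check could look like on complex baseband samples: take a DFT, find the strongest bin, and compare its frequency and level against what is expected. This is not LimeQuickTest's actual code. The sample rate, tolerance and -20 dBFS threshold are assumptions picked for illustration, and a naive DFT is used only to avoid dependencies (an FFT library would be the normal choice).

```python
import cmath
import math


def dft_peak(samples, sample_rate_hz):
    """Return (peak_freq_hz, peak_level_dbfs) for complex baseband samples.

    Full scale is taken to be an amplitude of 1.0.
    """
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(n):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        mag = abs(acc) / n
        if mag > best_mag:
            best_bin, best_mag = k, mag
    # Bins at or above n/2 represent negative frequencies.
    signed_bin = best_bin if best_bin < n / 2 else best_bin - n
    freq = signed_bin * sample_rate_hz / n
    level = 20 * math.log10(best_mag) if best_mag > 0 else float("-inf")
    return freq, level


def loopback_check(samples, sample_rate_hz,
                   expected_hz=5e6, tol_hz=0.1e6, min_dbfs=-20.0):
    """Pass only if the peak is near expected_hz and strong enough.

    tol_hz and min_dbfs are hypothetical values, not the real test limits.
    """
    freq, level = dft_peak(samples, sample_rate_hz)
    return abs(freq - expected_hz) <= tol_hz and level >= min_dbfs


def tone(freq_hz, amplitude, sample_rate_hz, n):
    """Generate n samples of a complex tone, standing in for received data."""
    return [amplitude * cmath.exp(2j * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n)]
```

With SXT 5 MHz above SXR, a working loopback should put the transmitted carrier at +5 MHz in the receiver's baseband, which is what the passing results report. A result like (-46.1 dBFS, -1.38 MHz) would fail both conditions in this sketch.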

I do see some continuous degradation with temperature when the test is passing: the peak varies from something like -12 dBFS to -15 dBFS, always at 5 MHz. But when the test fails, it doesn’t look like the receiver is getting anything sensible at all. -45 dBFS might as well be noise, and it’s never at 5 MHz; it’s at some other wacky frequency.

Someone in another thread mentioned something about oscillators failing to start or PLLs failing to lock. This hadn’t occurred to me earlier, but it actually makes sense, and would explain tests that always fail at particular temperatures and frequencies. You need a good LO signal going into the mixers for both the transmitter and receiver, otherwise you aren’t going to be able to upconvert / downconvert. But if that’s what’s happening, then this probably isn’t a manufacturing defect, but a design flaw in the LMS7002.

On my unit I’ve noticed that sometimes the entire band turns to noise when I’m tuning the receiver. It’s also intermittent, possibly related to temperature, and definitely related to frequency. I’ll be tuning across some region, e.g. 2.4 GHz, and sometimes I’ll see blips that look obviously like wifi or bluetooth, and then for some small window (~few MHz), I’ll just see a higher noise floor and a vague outline of the bandpass filter, and then as soon as I’m out of that window, the expected 2.4 GHz signals come back. I can then immediately tune backwards over the same window and see the effect again, but 20 minutes later it might be fine. This seems like it would be consistent with a flaky LO.

I have 12 of the LimeSDR Mini boards and they all have frequencies where the TX has dead spots (verified with a spectrum analyzer) or where RX won’t tune (error from API).

Simple self-calibrate test of 56MHz to 1000MHz using the GNU Radio environment:
https://download.silicondust.com/tmp/test.py

  1. Launch a GNU Radio Command Prompt
  2. Run: python test.py
  3. Look for “error” on any line as it scrolls through

no errors for me
Edit: I re-ran your test and I do get an error:

INFO: device_handler::calibrate(): INFO: device_handler::set_rf_freq(): RF frequency set [TX]: 686 MHz.
Tx calibration finished
Selected TX path: Band 2
INFO: device_handler::calibrate(): INFO: device_handler::set_rf_freq(): RF frequency set [TX]: 687 MHz.
Tx calibration finished
Selected TX path: Band 2
INFO: device_handler::calibrate(): INFO: device_handler::set_rf_freq(): RF frequency set [TX]: 688 MHz.
Tx Calibration: MCU error 3 (SXR tune failed)
Selected TX path: Band 2
INFO: device_handler::calibrate(): INFO: device_handler::set_rf_freq(): RF frequency set [TX]: 689 MHz.
Tx calibration finished
Selected TX path: Band 2
INFO: device_handler::calibrate(): INFO: device_handler::set_rf_freq(): RF frequency set [TX]: 690 MHz.

Yep - that is the problem. I have 2 boards that will sometimes pass, sometimes fail at 1 or 2 frequencies. Most boards fail every time at a few frequencies. Some fail at a large number of frequencies.

The problem frequencies are close but not exactly the same between boards.
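For comparing the problem frequencies between boards, it may be easier to capture the test output to a file and pull out the failing frequencies rather than watching it scroll by. Here is a minimal sketch, assuming the log looks like the excerpt above: a "RF frequency set [TX]: N MHz" line followed by either "Tx calibration finished" or a line containing "MCU error". The function name is hypothetical.

```python
import re

# Frequency announcement as it appears in the pasted log excerpt.
FREQ_RE = re.compile(r"RF frequency set \[TX\]: ([\d.]+) MHz")


def failing_frequencies(log_text):
    """Return (freq_mhz, error_line) pairs for calibrations that reported an MCU error."""
    failures = []
    current = None  # most recently announced TX frequency
    for line in log_text.splitlines():
        m = FREQ_RE.search(line)
        if m:
            current = float(m.group(1))
        elif "MCU error" in line and current is not None:
            failures.append((current, line.strip()))
    return failures
```

Running this on the logs from two boards and comparing the resulting lists would show how close their dead spots are.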

Well that’s extremely disappointing. I wish I had done some more homework before buying this thing… undocumented dead-zones are kind of a non-starter. If 12/12 of @jafa’s LimeSDR Minis exhibit this, then it seems pretty clear that there’s some sort of design flaw in either the board layout of the mini or in the LMS7002 front-end, and that a replacement isn’t likely to fix anything.

If I can get CrowdSupply to respond (ticket opened 3 days ago), I’m going to ask them for a full return and then pay a little more for something with an Analog Devices front-end. :-/