Stay ahead of the curve with trusted IoT expertise

How Nordic Used Memfault to Optimize Reliability of the Thingy:91 X Before Launch


If you are reading this, you probably already know that Nordic recently released its much-anticipated Thingy:91 X cellular IoT prototyping platform. What you may not know is that the Nordic team developing it used Memfault to resolve complicated faults quickly and identify performance improvements, ensuring the Thingy:91 X would provide a reliable platform for customers to prototype on from day one.

The Nordic Thingy series has been a staple prototyping platform, helping IoT companies significantly accelerate product development with great features and capabilities right out of the box. Given the critical role the Thingy:91 X will play in many companies’ product development processes, it had to be not only loaded with powerful, easy-to-use features but also highly reliable. Reliability is harder to guarantee because the Thingy:91 X will be used in a diverse set of conditions and environments as it supports the development of a huge range of products. This is where Memfault comes in.

The Nordic team used Memfault to efficiently resolve elusive bugs and performance problems during the later stages of the Thingy:91 X development process. Memfault gave the team the ability to collect detailed performance and reliability data from their devices without requiring physical access to them. They could test the device in real-world scenarios, such as locations with poor cellular reception, and still be confident of collecting the data they needed to solve problems efficiently. With more traditional debugging tools, faults become much more difficult to diagnose the minute the devices leave the desk. Memfault solves this problem.

Nordic’s engineering team highlighted a few specific examples where Memfault helped them solve real-world problems that were difficult to identify in testing, root-cause complicated issues much faster, and collaborate across teams on bugs that affected multiple systems.

 

Efficiently fix memory leaks related to location searches with Memfault’s heap analysis.

 

The Thingy:91 X is designed for prototyping cellular IoT applications, so the typical use case involves devices deployed without easy physical access and communicating over limited-bandwidth, metered networks. Traditional embedded debugging tools just don’t work for identifying and resolving problems that happen in these environments.

Memfault’s remote, automatic, and highly efficient data collection gives embedded teams the same level of visibility into performance and problems for devices in the field as developers are used to when the device is on their desk. This makes identifying and resolving hard-to-reproduce issues much easier and much more efficient.

Memfault automatically collects heap utilization on the system. The team used this data to identify the root cause of unexpected reboots.

A visualization in Memfault of free heap decreasing over time until a crash occurs.

 

“We were having reboots in the field related to location searches but we didn’t know where to look. When using Memfault we could see that the Heap size was reducing until the memory was used up and it crashed. Using Memfault we could see where it was allocated.” The team was then able to pinpoint the issue and provide a fix. In their own words, “debugging this without Memfault would have taken waaaay longer.”
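
To make the idea concrete, here is a minimal sketch of the kind of trend check a steadily falling free-heap graph invites. It is not Memfault's implementation or API: the function names and sample values are hypothetical, and it assumes free heap is sampled periodically as (hours since boot, free bytes) pairs. It fits a straight line through the samples and, if the line slopes down, estimates how long until free heap would reach zero.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (hours since boot, free heap in bytes)


def heap_trend(samples: Sequence[Sample]) -> Tuple[float, float]:
    """Least-squares line through the samples.

    Returns (slope in bytes per hour, intercept in bytes).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (f - mean_f) for t, f in samples)
    slope = cov / var_t
    return slope, mean_f - slope * mean_t


def hours_until_exhausted(samples: Sequence[Sample]) -> Optional[float]:
    """Hours after the last sample until the fitted line hits zero.

    Returns None when free heap is flat or growing (no leak-like trend).
    """
    slope, intercept = heap_trend(samples)
    if slope >= 0:
        return None
    last_t = samples[-1][0]
    return -intercept / slope - last_t


# Hypothetical readings: free heap drops by about 1 kB per hour.
readings = [(0.0, 8000.0), (1.0, 7000.0), (2.0, 6000.0), (3.0, 5000.0)]
print(hours_until_exhausted(readings))
```

A straight-line fit is a deliberate simplification. A real leak may only grow when a specific operation runs (here, location searches), so the trend is a hint about where to look rather than a diagnosis.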

 

Identifying and investigating modem crashes using Memfault’s built-in log collection and analytics.

Another critical consideration for embedded devices, particularly those communicating over limited-bandwidth or metered networks, is efficiency of communication. It’s neither possible nor practical to capture everything happening on a device. Memfault is designed for these scenarios, and it allowed the Nordic team to identify a modem issue happening in the field without initially having to capture modem traces, which can be very large.

Memfault’s efficient log capture gave the team enough information to identify an issue happening in the modem and then decide to temporarily enable modem trace capture, so they could get the data needed to root-cause it. The log search functionality allowed them to quickly find relevant information across all of the logs collected from their fleet and accelerate their investigation.

Searching for “crash” using Memfault’s log search and identifying modem crashes.

“Often when the modem crashes you don’t have much to go on. The first clue was we could see Memfault collecting modem crash logs. This pointed us towards the modem and then we could enable sending modem traces to flash.” Once the R&D team had access to the modem traces, they could pass them to the modem team, showing the issue and the conditions in which it occurred, which allowed the modem team to resolve it.

 

Replace streaming of modem traces via UART with Memfault’s custom data recordings and coredump capture.

Capturing modem traces and coredumps remotely is core Memfault functionality. It gives engineering teams, including the team at Nordic, detailed information on devices in the field in a way that’s practical and efficient.

In particular, modem traces can be used to collect critical cellular data such as IP-level traffic, details on the MVNOs in use, and the AT commands being sent to the modem. To send the traces to Memfault, the Nordic team used Memfault’s custom data recordings to package up this detailed log information so that it could be uploaded to the cloud with minimal bandwidth usage. The Nordic team could then download it from the Memfault app and decode it locally using the Cellular Monitor application in nRF Connect for Desktop. Additionally, when there is a modem crash, the modem can send a coredump to the host controller, giving detailed information on exactly which line in the modem firmware triggered the fault.

 

A modem trace custom data recording presented on the device timeline in Memfault.

This out-of-the-box Memfault functionality saved the team significant time and provided efficient access to data from devices deployed in real-world scenarios during testing. This setup also demonstrates the level of insight available with Nordic and Memfault for customers using the Thingy:91 X (or any other Nordic-based device).

 

Optimizing stack allocation with real-world usage data using Memfault’s stack analyzer.

The Nordic team also used Memfault’s built-in stack analysis data to optimize their stack utilization based on data collected from devices being tested in the real world. Traditionally, doing these optimizations locally can be tedious and can involve some guesswork. With Memfault running on their test fleet, the Nordic team could make significant optimizations with confidence.

The Nordic team said, “Doing that (optimizations) across a fleet running for days, testing out different scenarios, and then you can tweak the stack based on the (Memfault) UI. It’s amazing, really, very helpful. We did that for a couple of rounds and reduced the stack quite a bit.”
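
As an illustration of the tuning loop described in that quote, here is a minimal sketch of turning fleet-wide stack high-water marks into a suggested stack size. It is an assumption-laden example rather than Nordic's or Memfault's actual procedure: the thread names, byte counts, 25% safety margin, and 8-byte alignment are all hypothetical choices made for the illustration.

```python
from typing import Dict, Iterable, List


def suggest_stack_size(high_water_marks: Iterable[int],
                       margin_percent: int = 25,
                       align: int = 8) -> int:
    """Suggest a stack size from observed peak usage.

    Takes the worst case seen across the fleet, adds a safety margin,
    and rounds up to the given alignment. Integer math only, so the
    result is exact.
    """
    peak = max(high_water_marks)
    # Ceiling division: peak * (100 + margin) / 100, rounded up.
    padded = -(-peak * (100 + margin_percent) // 100)
    # Round up to the next multiple of `align`.
    return -(-padded // align) * align


# Hypothetical peak stack usage (bytes) reported by three test devices.
observed: Dict[str, List[int]] = {
    "location_thread": [612, 700, 655],
    "cloud_thread": [1480, 1512, 1390],
}
# Hypothetical sizes currently configured in firmware.
configured = {"location_thread": 2048, "cloud_thread": 2048}

for name, marks in observed.items():
    new_size = suggest_stack_size(marks)
    print(f"{name}: {configured[name]} -> {new_size} bytes")
```

The margin is what keeps this safe: a fleet running for days covers more code paths than a desk test, but it still may not hit the true worst case, so each round of tightening should be followed by another round of field data.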

 

Wrapping Up

The team here at Memfault is super excited about the Thingy:91 X (in fact, we have a number of them in the office already), and we loved learning how the Nordic team used Memfault to accelerate the development process and ensure their customers got the most reliable prototyping platform possible, right from launch.

If you want to dive into more of the technical details of how Nordic uses Memfault, you can do so on Nordic’s DevZone blog.
