BLE high frequency of disconnects #761
Comments
Can you tell us more about your hardware configuration please? Which specific SparkFun hardware are you using? Facet? Facet LB? Surveyor? Express? Breakout board? Are you using two SparkFun units in Base/Rover, or are you using one and getting corrections from a network source? By "transmitting NTRIP", do you mean you are using IP to transmit your own RTCM from your base to your rover, or are you getting RTCM via NTRIP from a network provider and transmitting it to your rover? (The problem could have been in the network or at the RTCM provider.) Can you describe what you mean by "disconnect"? |
Great stack of questions! You can find my detailed writeup here, but I'll summarize: I'm using a SparkFun GPS-RTK-SMA Breakout – ZED-F9P as the base, transmitting via RTK Base to RTK2GO. I receive the corrections in SW Maps and pass them to my SparkFun RTK Facet from SW Maps. By disconnect I mean there seems to be some Bluetooth disconnect happening, so the Facet drops out of SW Maps. It did this once after reconnecting, even before I had reconnected to NTRIP. Usually it reconnects with no issue when I connect back to it in SW Maps, but every once in a while it would lag and refuse to connect again until rebooted. RTK performance was fine, and it would always be in RTK Fix right up until a connection drop. NTRIP performance is also fine, with both the base and my phone on stable LTE connections. I'm assuming it's BLE bandwidth related, because turning down the update frequency cut way back on the disconnections. This is in contrast to 3.10, where I would only hit occasional disconnects with the default update frequency. |
We've seen a variety of issues with BLE on v2.x of the ESP32 Arduino core. Without a way to replicate your exact issue/setup, I cannot explain why v3.x of the RTK Firmware was better. We've had a lot more success on RTK Everywhere, where the products have more PSRAM, so we are able to move to v3.0.x of the ESP32 core. To clarify, are you running on your own hardware? Not an RTK product? If you're on your own hardware and up for some hacking, you could run RTK Everywhere on an ESP32 dev board with PSRAM. If you can give us a way to replicate, I'm happy to take a look. |
I'm running y'all's RTK Facet, I just dumped the reported hardware for reference.
I don't know that I can give you a way to replicate, which is why I suggested a logging build: if you can make me one, I can run it and give you the results if it's still flaky. |
Ah! I saw "Using a SparkFun GPS-RTK-SMA Breakout" and was confused. Are you running any settings that are of note? Did you turn on any additional messages? |
Other than being in BLE mode, tweaking the frequency down, and transmitting NTRIP back to device, no settings of note. |
I just acquired an RTK Express+ since I need the ZED-F9R. I also have a Facet w/ ZED-F9P which I obtained about a year ago. I have v4.1 running on both, and I'm using the latest SW Maps as of this post date with an iPhone 13. I am seeing these BLE disconnects on the Express+ but not on the Facet. The disconnects are unacceptably frequent (typically less than 60 sec to disconnect after associating from SW Maps). The disconnect issue seems more prevalent once I start streaming RTCM corrections from the SW Maps client. So the topology I was planning to use for my rover is now essentially unusable with the Express+. I am unsure if there is some kind of issue with the ZED-F9R vs. ZED-F9P. At this point, I am trying to determine if the Express+ has hardware issues and if I need to exchange it for a new unit. I do not currently have time to trace this BLE disconnect issue with my Ellisys Vanguard; maybe in a few weeks. Please advise if SparkFun plans to investigate and resolve this issue with a FW update. Otherwise I will need to create my own corrections stream path and simply deem the Express+ BLE link unreliable. I also do not have time to solve the ESP32 FW or HW issue.
"We've seen a variety of issues with BLE on the v2.x of the ESP32 Arduino core." ... please elaborate.
FYI: I had to reposition the OLED on the Express+ since the cutout registration on the Hammond enclosure was way off (> 2 mm), so the viewing window in the graphic overlay was not aligned and obscured some of the OLED pixels. The hot glue attachment should also be changed to screws for these rather expensive units. I modified my Express+ unit to use self-tapping screws. |
Don't have anything to add other than that, running an out-of-the-box Express+ on 4.0, BLE to iOS/SW Maps is functionally useless, as it doesn't stay connected long enough to do anything useful. |
Here's what I can replicate: With an RTK Express, running v4.1, connected to iOS and SW Maps, using factory defaults, we can very reliably connect over BLE. Once NTRIP is turned on (data is now being sent from the phone back to RTK device) the NTRIP Client will disconnect after 15 to 30 seconds. After lots of testing, it appears the BLE connection is getting saturated and the RTK device does not read the incoming BLE bytes from the phone. When this happens, iOS will disconnect the BLE RX service (because it's not being read from), and the NTRIP Client disconnects. Reducing the RTK device's measurement rate from the default of 4Hz to 1Hz, and disabling the GSV NMEA message increases the NTRIP Connection time to more than 45 minutes (we're still running longer tests). If you are experiencing problems over BLE, try the following:
While I can't replicate the original "BLE disconnects", I believe that the users' other base setups (which we can't easily replicate) and the NTRIP Client disconnects we're seeing may all be related. I would love to get additional feedback or ways we can replicate this issue. Please keep the feedback coming. Above, the GNSS receiver measurement rate is set to 1 Hz. Above, UBX_NMEA_GSV is set to 0 (disabled). |
I'm still on 3.10 (I hadn't used my Express for a while and didn't update when I started again a week or two ago), and it is stable (Android, QField getting NMEA over BLE, and RTCM obtained over Wi-Fi from MaCORS). No issues, other than poor accuracy when really close to the north side of a building with bad sky view (no surprise). I am thinking about upgrading, and scared of 4.x. Are there any reports of BLE instability with Android, or does this feel like a dropped byte causes iOS to be twitchy and give up? You said "saturated", and I wonder if the problem is the ESP32 keeping up with reads, or the data transmission being more than fits in the channel and the transmit buffers filling up. Since it's a shared physical channel, it makes sense that high-rate NMEA from the F9P would cause RF congestion and thus cause the RTCM stream to back up, and the iOS device to get upset about that. Is that what you think is happening? I will read all the release notes and plan testing. I might also try to configure BluetoothGNSS and mock GPS to get NMEA into QField, which would result in RTCM and NMEA both on BLE. |
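As a rough illustration of why lowering the measurement rate and disabling GSV reduces the BLE load so sharply, here is a back-of-the-envelope sketch. It is not taken from the RTK Firmware. The message mix and the count of 12 GSV sentences per epoch are hypothetical, and every sentence is assumed to be the nominal NMEA 0183 maximum of 82 bytes, so the results are upper bounds rather than measurements.

```python
# Rough upper-bound estimate of NMEA bytes per second sent from the
# RTK device to the phone. All inputs below are illustrative assumptions.
NMEA_MAX_SENTENCE_BYTES = 82  # nominal NMEA 0183 maximum, including CR/LF


def nmea_bytes_per_second(rate_hz, sentences_per_epoch):
    """Upper bound: every sentence is assumed to be maximum length."""
    total_sentences = sum(sentences_per_epoch.values())
    return rate_hz * total_sentences * NMEA_MAX_SENTENCE_BYTES


# Hypothetical mix: one GGA, GSA, GST and RMC per epoch, plus an assumed
# 12 GSV sentences (GSV is repeated for each group of satellites in view).
default_mix = {"GGA": 1, "GSA": 1, "GST": 1, "RMC": 1, "GSV": 12}
reduced_mix = {k: v for k, v in default_mix.items() if k != "GSV"}

before = nmea_bytes_per_second(4, default_mix)  # 4 Hz, GSV enabled
after = nmea_bytes_per_second(1, reduced_mix)   # 1 Hz, GSV disabled

print(f"4 Hz with GSV:    {before} B/s")
print(f"1 Hz without GSV: {after} B/s")
```

Under these assumptions, GSV dominates the per-epoch byte count, which is consistent with it being the first message suggested for disabling.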
No, I believe this is predominantly an iOS issue.
I have removed the RTK Firmware from the equation and am not concerned about GNSS receivers at this point. I believe it has to do with the ESP32's BLE stack and how iOS deals with devices that do not empty out the iOS phone's buffer in a timely manner. Our testing is done with nothing but a stripped-down BLE serial transmission and reception test. In one direction (TX from RTK device to iOS phone), everything is fine up to ~4 kB/s. The BLE TX service remains intact even as buffers begin to fill. Once you add in data coming from the phone, trouble starts when data flowing in both directions reaches ~1.5 kB/s: if a packet from the phone doesn't get through because it is blocked by a TX from the RTK device, the phone will stop trying to transmit and the BLE RX service is closed, while the BLE TX service remains intact. |
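The two figures in the comment above can be turned into a quick sanity check for a given setup. This is only a sketch: the thresholds are the rough observations quoted from that stripped-down test, not specifications, the function name is hypothetical, and treating ~1.5 kB/s as the combined rate of both directions is an assumption about how to read that figure.

```python
# Thresholds quoted in the comment above; rough observations, not specs.
ONE_WAY_LIMIT_BPS = 4000        # device -> phone only: "fine up to ~4kB/s"
BIDIRECTIONAL_LIMIT_BPS = 1500  # both directions active: trouble at ~1.5 kB/s


def ble_link_at_risk(device_to_phone_bps, phone_to_device_bps):
    """Return True if the rates fall in the range where drops were observed.

    Hypothetical helper. Assumes the ~1.5 kB/s figure refers to the
    combined rate of both directions.
    """
    if phone_to_device_bps == 0:
        # NMEA out only, no corrections coming back from the phone.
        return device_to_phone_bps > ONE_WAY_LIMIT_BPS
    combined = device_to_phone_bps + phone_to_device_bps
    return combined >= BIDIRECTIONAL_LIMIT_BPS


# Example rates in bytes per second; the 500 B/s RTCM rate is illustrative.
print(ble_link_at_risk(3000, 0))    # NMEA only, under the one-way limit
print(ble_link_at_risk(1200, 500))  # NMEA plus RTCM, over the combined limit
print(ble_link_at_risk(300, 500))   # reduced NMEA plus RTCM
```

Read this way, the one-way budget is much larger than the bidirectional one, which would explain why a setup can look stable until the NTRIP Client starts sending corrections back to the device.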
We're getting corroboration from other users that the data rate overwhelming the iOS interface is the culprit. Reducing the reported sentences and/or measurement rate will reduce and eliminate disconnects. |
Subject of the issue
Over 8 hours yesterday, I ran a survey with the new 4.0 firmware. I've had occasional disconnects before, but the new version seems particularly unstable, with well over 50 disconnects over the course of my day. After turning the update rate down (from one update every 0.25 sec to one every 0.33 sec), it seemed to stabilize some, but still had issues.
I tried noticing a pattern, but there was none apparent, with disconnects happening because of:
I guess maybe as a first step, if y'all could either generate a build to log whatever data you might need to diagnose this, or tell me what to turn on to log it to SD card, we can start there. I will likely have another long survey run this next weekend, so it would be a good time to do this.
Your workbench