ESP-IDF wifi event loop keeps receiving SYSTEM_EVENT_STA_WPS_ER_PIN even after code rollback - embedded

I have been working on a project using the ESP32 with the ESP-IDF that checks its NVS memory for wifi credentials before starting the network stack. If it has the credentials, it connects to the wifi network in STA mode; if it lacks them, it launches as its own AP to allow the user to send it the credentials over HTTP.
After manually putting my test credentials into NVS, I started working on the AP code. Once all the AP code and logic was complete, I manually wiped the flash memory with esptool to force the board to launch in that mode. Doing so worked fine, and I was able to send it the updated credentials over HTTP.
At this point, the board attempted to connect as STA upon reset; however, the SYSTEM_EVENT_STA_WPS_ER_PIN event kept being caught by the wifi event loop. Since then, the board has only received this event and has been completely unable to connect to wifi. To make matters stranger, the problem persists even after rolling back to a previous version with git.
main.c
void app_main() {
// Start NVS
initNVS();
// Init Wifi Controller
initWifiController();
// Get Credentials to send to wifi
Creds creds = getCreds();
// Start wifi in STA mode with gathered creds
beginWifi(creds);
initializePins();
initializeTimers();
}
wifiController.c
void initWifiController(){
// * NVS must be initialized before wifi work can be done
// Handle when connected to the network
connectionSemaphore = xSemaphoreCreateBinary();
// Begin network stack
ESP_ERROR_CHECK(esp_netif_init());
// Create event loop for handling callbacks
ESP_ERROR_CHECK(esp_event_loop_create_default());
}
void beginWifi(Creds creds){
if(creds.status == ESP_OK){
ESP_LOGI(TAG, "Connection credentials have been found, connecting to network");
connectSTA(creds);
}
else if(creds.status == ESP_ERR_NVS_NOT_FOUND){
ESP_LOGW(TAG, "Missing credentials, starting as AP");
connectAP();
}
else{
ESP_LOGE(TAG, "ESP failed with error %s, not starting wifi", esp_err_to_name(creds.status));
}
}
void connectSTA(Creds creds){
ESP_LOGI(TAG, "Attempting to connect to wifi with following creds: %s | %s", creds.ssid, creds.pass);
// Set netif to sta
esp_netif_create_default_wifi_sta();
// Prepare and initialize wifi_init_config_t
wifi_init_config_t wifi_init_config = WIFI_INIT_CONFIG_DEFAULT();
ESP_ERROR_CHECK(esp_wifi_init(&wifi_init_config));
// Register tracked events for event_handler
ESP_ERROR_CHECK(esp_event_handler_register(WIFI_EVENT, ESP_EVENT_ANY_ID, event_handler, NULL));
ESP_ERROR_CHECK(esp_event_handler_register(IP_EVENT, IP_EVENT_STA_GOT_IP, event_handler, NULL));
// TODO: Check if this can be used to avoid having to use NVS manually
esp_wifi_set_storage(WIFI_STORAGE_RAM);
// Config struct for wifi details
wifi_config_t wifi_config = {};
// Copy casted info into wifi_config
// * https://www.esp32.com/viewtopic.php?f=13&t=14611
// * See above link for details on this
strcpy((char *)wifi_config.sta.ssid, creds.ssid);
strcpy((char *)wifi_config.sta.password, creds.pass);
esp_wifi_set_config(ESP_IF_WIFI_STA, &wifi_config);
ESP_ERROR_CHECK(esp_wifi_start());
// ? Is this required to avoid a memory leak?
free(creds.pass);
free(creds.ssid);
}
void connectAP(){
// ? How important is it that these be called in this order?
ESP_LOGI(TAG, "Starting in AP Mode");
esp_netif_create_default_wifi_ap();
// TODO: maybe move this creation to initWifiController to avoid making it twice
wifi_init_config_t wifi_init_config = WIFI_INIT_CONFIG_DEFAULT();
ESP_ERROR_CHECK(esp_wifi_init(&wifi_init_config));
// TODO: Consider moving this to init Wifi Controller to avoid running it twice as well
ESP_ERROR_CHECK(esp_event_handler_register(WIFI_EVENT, ESP_EVENT_ANY_ID, &event_handler, NULL));
// Configuration for AP
wifi_config_t wifi_config = {
.ap = {
.ssid = "Grow System",
.password = "Password",
.ssid_len = strlen("Grow System"),
.max_connection = 4,
.authmode = WIFI_AUTH_WPA_WPA2_PSK
}
};
// TODO: Enable password support on AP configuration
ESP_ERROR_CHECK(esp_wifi_set_mode(WIFI_MODE_AP));
ESP_ERROR_CHECK(esp_wifi_set_config(WIFI_IF_AP, &wifi_config));
ESP_ERROR_CHECK(esp_wifi_start());
ESP_LOGI(TAG, "Wifi connectAP finished");
registerEndPoints();
}
The code essentially checks NVS for the credentials, if it finds them both it returns a struct containing both of them as well as ESP_OK. If one of them was not found, the struct instead contains ESP_ERR_NVS_NOT_FOUND. wifiController.c then receives this creds struct and calls beginWifi() which then either calls connectSTA() or connectAP() based on the status of the struct. When the credentials are present, connectSTA() is called, but for unknown reasons the event loop consistently only receives the SYSTEM_EVENT_STA_WPS_ER_PIN event. As I mentioned earlier, even after rolling my code back to a version that did not have the connectAP() function, this behavior persists.
Because of this, I have a hunch that the issue may be related to when I wiped the flash manually, as opposed to the code.
The header file contains the following line in its typedef to define this event.
SYSTEM_EVENT_STA_WPS_ER_PIN, /*!< ESP32 station wps pin code in enrollee mode */
I do not know what that means, as I have not intentionally included anything regarding wps in this project.
So far my research hasn't returned anything particularly useful, so if anyone has any idea what SYSTEM_EVENT_STA_WPS_ER_PIN is or what could be causing my issue, I would be very appreciative. If you think that any more detail would be useful to solve this issue, please let me know and I will be more than happy to provide it.

Useful to solve the problem.
I'm in the process of learning to use the ESP32 wifi. I read your message and checked the ESP-IDF documentation and the ESP32 technical manual, but there is not much information there. I did find this URI:
https://www.wi-fi.org/downloads-public/Wi-Fi_Protected_Setup_Best_Practices_v2.0.2.pdf/8188
Kind regards
Update: check the sdkconfig file in your project folder against the one in the example folder for wifi config settings.

I have been digging into this issue for several days now. I have not found the root cause, but I have managed to discover a workaround and some details about the issue. The SYSTEM_EVENT_STA_WPS_ER_PIN event that kept arising is caused by the chip trying to use WPS Enrollee PIN mode to connect to the router. Supposedly this generates an eight-digit PIN that can be entered into your router, which should allow it to connect. I really do not know why it was doing this, but I believe it may have had something to do with how I had AP mode set up. My primary confusion now is why rolling back the code did not fix it.
For the "workaround", I found that flashing Espressif's example STA wifi connection code managed to solve it. The chip properly connected to my network once it was flashed, and I was able to reflash my own code without any issues arising. This seems extremely strange to me, so part of me thinks that maybe the chip just couldn't connect to my network for some reason and an edge case in the code caused it to go into enrollee mode.
Here is a link to the code that I used to "fix" the issue: https://github.com/espressif/esp-idf/blob/master/examples/wifi/getting_started/station/main/station_example_main.c
I will mark this answer as correct and continue looking for the root problem. In the event that someone else knows what actually could have caused this, feel free to post and I will update the correct answer. In the event that I find what the root problem is, I will update this answer as well.
EDIT: After continuing to dig, I believe that the problem was actually due to a multitude of errors in my code. In particular, I was calling esp_netif_create_default_wifi_sta() and then not setting the Wi-Fi mode. I needed to add esp_wifi_set_mode(WIFI_MODE_STA), which was absent from my program. I believe setting the network stack to STA without changing the wifi mode was what caused my issue.
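A rough sketch of the corrected call order described in the EDIT, under stated assumptions: esp_wifi_set_mode(), esp_wifi_set_config(), and esp_wifi_start() are the ESP-IDF calls involved, but the type definitions and function bodies at the top are host-side stand-ins that only record the call order so the snippet can be compiled off-target. start_station() is a hypothetical helper, not code from the original project. It also rejects credentials that would not fit the fixed-size ssid and password fields, which the strcpy() calls in connectSTA() do not check.

```c
#include <string.h>

/* ---- Host-side stand-ins (assumptions, not the real ESP-IDF headers) ----
 * They only record the call order so the sketch compiles off-target.
 * On an ESP32 build, delete this section and #include "esp_wifi.h". */
typedef int esp_err_t;
#define ESP_OK 0
#define ESP_ERR_INVALID_ARG 0x102
typedef enum { WIFI_MODE_NULL = 0, WIFI_MODE_STA, WIFI_MODE_AP } wifi_mode_t;
typedef enum { WIFI_IF_STA = 0, WIFI_IF_AP } wifi_interface_t;
typedef struct { unsigned char ssid[32]; unsigned char password[64]; } wifi_sta_config_t;
typedef union { wifi_sta_config_t sta; } wifi_config_t;

static char call_log[64]; /* order in which the stand-ins were called */

static esp_err_t esp_wifi_set_mode(wifi_mode_t mode)
{
    (void)mode;
    strcat(call_log, "set_mode,");
    return ESP_OK;
}

static esp_err_t esp_wifi_set_config(wifi_interface_t interface, wifi_config_t *conf)
{
    (void)interface;
    (void)conf;
    strcat(call_log, "set_config,");
    return ESP_OK;
}

static esp_err_t esp_wifi_start(void)
{
    strcat(call_log, "start");
    return ESP_OK;
}
/* ---- End of stand-ins ---- */

/* Hypothetical helper: set the mode first, then the config, then start. */
static esp_err_t start_station(const char *ssid, const char *pass)
{
    wifi_config_t wifi_config;
    esp_err_t err;

    /* Reject credentials that would overflow the fixed-size fields. */
    if (strlen(ssid) > sizeof(wifi_config.sta.ssid) ||
        strlen(pass) > sizeof(wifi_config.sta.password)) {
        return ESP_ERR_INVALID_ARG;
    }
    memset(&wifi_config, 0, sizeof(wifi_config));
    memcpy(wifi_config.sta.ssid, ssid, strlen(ssid));
    memcpy(wifi_config.sta.password, pass, strlen(pass));

    if ((err = esp_wifi_set_mode(WIFI_MODE_STA)) != ESP_OK) return err;
    if ((err = esp_wifi_set_config(WIFI_IF_STA, &wifi_config)) != ESP_OK) return err;
    return esp_wifi_start();
}
```

On a real target the same three calls would sit between esp_wifi_init() and the event-driven connect, typically wrapped in ESP_ERROR_CHECK().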

Related

Sending ethernet frames from kernel module

I have a kernel module that would like to send pre-fabricated ethernet frames from user space such as custom ARP, and other protocols (I'm trying to bypass tcp/ip stack on linux and create custom one for my needs). Frames are valid and complete with all necessary things. The only part that remains is to send them somehow to the queue on eth0 interface. What is the best solution to do this?
For snatching incoming packets I am using netfilter API with the earliest hook possible. I can not use raw sockets from user space due to the need of sudo and also due to my custom requirements.
Edit: I was able to achieve my goals with dev_queue_xmit(). However, I am still wondering if there is another solution that accesses the driver directly.
static void SendFrame(void)
{
struct sk_buff* skb = dev_alloc_skb(1518);
skb->dev = __dev_get_by_name(&init_net, "eth0");
skb_reserve(skb, NET_IP_ALIGN);
skb->data = skb_put(skb, ethFrameBytes);
memcpy(skb->data, pEthFrame, ethFrameBytes);
if (dev_queue_xmit(skb) != NET_XMIT_SUCCESS)
{
printk(KERN_ERR "Error: unable to send the frame\n");
}
}

STM32 UART error does not clear flag

I am programming an STM32F446 microcontroller that communicates with an ESP8266 (startByte-command-size-dataArray-crc1-crc2). However, I have a problem: whenever the ESP8266 resets, it emits serial debug output (which I cannot turn off) at 74880 baud (which I also cannot change). This causes an error in the STM32 microcontroller, as it should, since I programmed them to communicate at 9600.
The problem is that whenever that error occurs in the STM32 microcontroller, the error never stops, since it cannot clear the error flag. In order to clear the error flag you just need to read the status register (HAL_UART_GetError function), but my code is unable to do it when running. By that I mean that no matter how often I read the register, it never changes, UNLESS I pause debugging and then resume.
void HAL_UART_ErrorCallback(UART_HandleTypeDef *huart) {
errorCounter++;
if(HAL_IS_BIT_CLR(huart->Instance->CR1, 1)) {
SET_BIT(huart->Instance->CR1, USART_CR1_RXNEIE | USART_CR1_PEIE);
SET_BIT(huart->Instance->CR3, USART_CR3_EIE);
if(HAL_IS_BIT_CLR(huart->Instance->CR3, USART_CR3_DMAR)) {
SET_BIT(huart->Instance->CR3, USART_CR3_DMAR);
}
}
while(huart->Instance->SR != 0x80) {
huart->Instance->SR
HAL_UART_GetError(huart);
HAL_UART_GetState(huart);
huart->Instance->SR = 0;
}
}
The while loop is there because I wanted to see if I could force my code to read the same register over and over until it cleared it, but it didn't matter.
I have also tried disabling UART (__HAL_UART_DISABLE) forcefully but still, the same problem, it only clears the flag whenever I pause debugging.
I have searched everywhere and I cannot find any way to make this work. I even disabled optimization, but the same thing kept happening.
EDIT:
Found a way to make it work. It worked when I paused debugging because, as already stated in the answer, the debugger was reading the DR register, thus clearing it, and when I read the SR register it actually cleared it (it wasn't clearing because there was something that still needed to be read).
Solution: read DR register and then read SR register
First of all your code is a total mess. It will not even compile and most of it does not make too much sense. You can't clear error flags by writing 0 to SR register. You must read the SR and then read the data register.
The debugger probably reads the DR register and that is the reason why the flags are being cleared when you break the execution of the program.
My advice - read the RM very carefully.
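A minimal sketch of the clearing sequence described above, assuming an STM32F4-style USART where the error flags are cleared by a read of SR followed by a read of DR. The usart_regs_t struct and usart_clear_errors() are hypothetical stand-ins so that the snippet compiles off-target. On hardware the two reads would go to huart->Instance->SR and huart->Instance->DR, and the plain struct here does not emulate the read-to-clear behavior of the peripheral.

```c
#include <stdint.h>

/* Error flag bit positions in USART_SR on the STM32F4 family. */
#define SR_PE  (1u << 0)  /* parity error */
#define SR_FE  (1u << 1)  /* framing error */
#define SR_NF  (1u << 2)  /* noise detected */
#define SR_ORE (1u << 3)  /* overrun error */
#define SR_ERROR_MASK (SR_PE | SR_FE | SR_NF | SR_ORE)

/* Hypothetical stand-in for the two registers involved. */
typedef struct {
    volatile uint32_t SR;
    volatile uint32_t DR;
} usart_regs_t;

/* Reads SR, then DR, and returns the error flags that were pending. */
static uint32_t usart_clear_errors(usart_regs_t *usart)
{
    uint32_t sr = usart->SR;  /* step 1: read the status register */
    uint32_t dr = usart->DR;  /* step 2: read the data register */
    (void)dr;                 /* the received byte is discarded here */
    return sr & SR_ERROR_MASK;
}
```

Returning the pending flags lets the caller count or log the error type before re-arming reception.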

Socket.io Rooms in a Hostile Network Environment?

I have a very frustrating problem with a client's network environment, and I'm hoping someone can lend a hand in helping me figure this out...
They have an app that for now is written entirely inside of VBA for Excel. (No laughing.)
Part of my helping them improve their product and user experience involved converting their UI from VBA form elements to a single WebBrowser element that houses a rich web app which communicates between Excel and their servers. It does this primarily via a socket.io server/connection.
When the user logs in, a connection is made to a room on the socket server.
Initial "owner" called:
socket.on('create', function (roomName, userName) {
socket.username = userName;
socket.join(roomName);
});
Followup "participant" called:
socket.on('adduser', function (userName, roomName){
socket.username = userName;
socket.join(roomName);
servletparam = roomName;
var request = require('request');
request(baseURL + servletparam, function (error, response, body) {
io.sockets.to(roomName).emit('messages', body);
});
servletparam = roomName + '|' + userName;
request( baseURL + servletparam, function (error, response, body) {
io.sockets.to(roomName).emit('participantList', body);
});
});
This all worked beautifully well until we got to the point where their VBA code would lock everything up, causing the socket connection to get lost. When the client surfaces from its forced, VBA-induced pause (which lasts anywhere from 20 seconds to 3 minutes), I try to join the room again by passing an onclick to an HTML element that triggers a script to rejoin. Oddly, that doesn't work. However, if I wait a few seconds and click the object by hand, it does rejoin the room. Yes, the click is getting received from the Excel file... we see the message to the socket server, but it doesn't allow that call to rejoin the room.
Here's what makes this really hard to debug. There's no ability to see a console in VBA's WebBrowser object, so I use weinre as a remote debugger, but a) it seems to not output logs and errors to the console unless I'm triggering them to happen in the console, and b) it loses its connection when socket.io does, and I'm dead in the water.
Now, for completeness, if I remove the .join() calls and the .to() calls, it all works like we'd expect it to minus all messages being written into a big non-private room. So it's an issue with rejoining rooms.
As a long-time user of StackOverflow, I know that a long question with very little code is frowned upon, but there is absolutely nothing special about this setup (which is likely part of the problem). It's just simple emits and broadcasts (from the client). I'm happy to fill anything in based on followup questions.
To anyone that might run across this in the future...
The answer is to manage your room reconnection on the server side of things. If your client can't make reliable connections, or is getting disconnected a lot, the trick is to keep track of the rooms on the server side and rejoin them when the client reconnects.
The other piece of this that was a stumper was that the chat server and the web UI weren't on the same domain, so I couldn't share cookies to know who was connecting. In their case there wasn't a need to have them hosted in two different places, so I merged them, had Express serve the UI, and then when the client surfaced after a forced disconnect, I'd look at their user ID cookie, match them to the rooms they were in that I kept track of on the server, and rejoined them.

WinRT HttpClient blocks splashscreen

I do asynchronous requests in LoadState method of a certain Page. I use HttpClient to make a request and I expect the splashscreen to go away while I await the result.
If I am not connected to any networks, the splashscreen immediately goes away and I get a blank page because the request obviously didn't happen.
But if I am connected to a network but have connectivity issues (for example, I set a wrong IP address) it seems to start a request and just block.
My expectation was that the HttpClient would realize that it cannot send a request and either throw an exception or just return something.
I managed to solve the issue of blocking by setting a timeout of around 800 milliseconds, but now it doesn't work properly when the Internet connection is ok. Is this the best solution, should I be setting the timeout at all? What is the timeout that's appropriate which would enable me to differentiate between an indefinitely blocking call and a proper call that's just on a slower network?
I could perhaps check for Internet connectivity before each request, but that sounds like an unpredictable solution...
EDIT: Now, it's really interesting. I have tried again, and it blocks at this point:
var rd = await httpClient.SendAsync(requestMsg);
If I use Task.Run() as suggested in the comments and get a new Thread, then it's always fine.
BUT it's also fine without Task.Run() if there is no Internet access but the network access is not "Limited" (it says that the IPv4 connectivity is "Internet access" although I cannot open a single website in a browser and no data is returned from the web service. It just throws System.Net.Http.HttpRequestException, which was something I was expecting in the first place). It only blocks when the network connection is Limited.
What if instead of setting a timeout, you checked the connection status using
public static bool IsConnected
{
get
{
return NetworkInformation.GetInternetConnectionProfile() != null;
}
}
This way if IsConnected, then you make the call; otherwise, ignore it.
I'm not sure if you are running this in App.xaml.cs? I've found requests made in that class can be fickle and it may be best to move the functionality to an extended splash screen to ensure the application makes it all the way through the activation process.
http://msdn.microsoft.com/en-us/library/windows/apps/xaml/Hh868191(v=win.10).aspx

OpenNETCF.Net.Ftp Behaving Flaky

I tried posting on their boards (authors of this library), however it literally takes months for them to reply when it comes to the free software (can't blame them).
But anyways
I have found that this library is behaving weirdly - for instance, a major problem with my application is when someone is trying to sign in (through FTP), they provide a correct login and mistype the password, no reply is received from FTP server.
I tried doing the same from command window just to verify that it's not the FTP server's fault; and FTP commands were received instantaneously.
It almost looks as though this library eats the commands. The same actions often times will yield different results.
Can anyone recommend a stable, reliable library to use with Compact framework? Or shed some light on this issue...?
I modified the source code inside ConnectThread() as follows:
// if a PWD is required, send it
if( response.ID == 331 )
{
response = SendCommand("PASS " + m_pwd, false);
//ADDED THIS - try again.
if (response.ID == 0)
{
response = SendCommand("PASS " + m_pwd, false);
}
//end of my addition
if( !((response.ID == 202) || (response.ID == 230)) )
{
m_cmdsocket.Close();
m_cmdsocket=null;
Disconnect();
m_connected = false;
return;
}
}
This solved the issue for a while, but now it has started happening again. The culprit seems to be that when 0 comes back as a response from the FTP server, the connection just stalls. I am not sure whether it is a socket issue or some other obscure problem, but I think I am going to give up at this point.
Which FTP set are you using, the stream-based classes in the SDF, or the separate one in the Forums? If you're using the one from the forums (which is the one I actually recommend), then you've got the source. I wrote that one from the ground up by looking at nothing but the RFC. It's really, really simple, and if it's "eating" responses, it's likely a timeout issue, though it should be easy to put in a breakpoint and see where it's coming apart.
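To illustrate how a response ID of 0 might arise, here is a minimal sketch of RFC 959 reply-line parsing. ftp_reply_code() is a hypothetical helper, not the library's code, and treating an empty or partial read (for example after a timeout) as 0 is an assumption about how such a value could be produced.

```c
#include <ctype.h>
#include <string.h>

/* Returns the three-digit code of a final reply line such as
 * "230 Logged in". Returns 0 when the line is empty, malformed, or a
 * multi-line continuation such as "230-Welcome". */
static int ftp_reply_code(const char *line)
{
    if (line == NULL || strlen(line) < 3) {
        return 0;  /* nothing usable was read, e.g. after a timeout */
    }
    for (int i = 0; i < 3; i++) {
        if (!isdigit((unsigned char)line[i])) {
            return 0;
        }
    }
    /* A hyphen after the code marks a continuation line, not the final one. */
    if (line[3] == '-') {
        return 0;
    }
    if (line[3] != ' ' && line[3] != '\0' && line[3] != '\r' && line[3] != '\n') {
        return 0;
    }
    return (line[0] - '0') * 100 + (line[1] - '0') * 10 + (line[2] - '0');
}
```

Under that assumption, a 0 after PASS would mean "no complete reply yet" rather than a server verdict, so waiting longer for the final line may be more reliable than resending the command.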