Debugging an intermittently unresponsive USB device - usb

My app communicates with a simple USB device as follows:
The app sends commands (2 or 3 bytes each) to the USB device by using WriteFile (kernel32.dll).
For each command that is send, the USB device sends a short response, which the PC receives using ReadFile (kernel32.dll).
Reading and writing is done asynchronously, using GetOverlappedResult to check the status of an operation.
Testing on 2 out of 3 PCs, the app and device function perfectly: all responses are received 100% reliably.
Under identical tests on the third PC, approximately 50% of the ReadFile requests do not return any data - the status remains as pending (ERROR_IO_INCOMPLETE) forever.
In other words, approximately for every 2 commands sent, one response is received.
Because the device functions perfectly with the other PCs, it lead me to believe that the problem might be occuring inside Windows, in the underlying code which is called by ReadFile (I presume some lower level USB driver code).
Question:
Please could you advise what debugging tool is most useful to investigate this? With my current knowledge, the internal workings of ReadFile are quite opaque.
The PC which is experiencing the issue is running Windows 8.0

You could try DebugView. Run as admin.
Go to "Capture", enable "Capture Kernel", enable "Enable Verbose Kernel Output".
This might help to investigate errors on Kernel level, if any occured.

Related

Can I poll my USB HID device without first sending a command

I was able to make a working HID USB stack on my "StartUSB for PIC" board for the 18F2550 microcontroller. I based it on one of the MLA libraries, which was made for the 18F45K50 (MLA 2018_11_26, hid_custom, picdem_fs_usb_k50.x), but I converted it to work with the 18F2550 (there might have been easier ways, but only learned to work with PIC about 1 month ago). On the host side, I'm using LibUsbDotNet (also here, there might be easier ways - the documentation on this library really sucks) on a Windows 10 machine.
I'm using the HID class, full speed, and all seems to work. Although, I get some random errors on the host PC (see below), but doing one close/re-open cycle on the host side when getting the error is kind of solving it. Dirty, but it works. So I kind of ignore this now.
Win32Error:Win32Error:GetOverlappedResult Ep 0x01
995:The I/O operation has been aborted because of either a thread exit or an application request.
I'm not an expert on USB (yet). But all examples I'm seeing are based on 1) you send first a command to the device and 2) then you retrieve the answer from the device. I did some performance tests, and see that this indeed shows that I can do about 500 cycles/second. I think that is correct, because each cycle, sending command and retrieving answer, each takes 1 msec.
But do I really need to send a command? Can't I just keep reading endlessly, and when the device has somthing to say, it does send the data in an IN transaction, and when not it ignores which creates a timeout on the host side. That would mean that I can poll at 1000 cycles/second? Unfortunately, I have tried it by changing my implementation on the PIC, but I get very weird results. I think I have issues with suspend mode. That brings me to another question - how can I make the device get out of suspend mode (means that not the host, but the device should be triggering this event). I have searched the MLA library for command such as "wakeup", "resume", ... but couldn't find anything.
So, to summarize, 2 questions:
Conceptual: Can I send data from device to host without being requested for it by a command from the host?
For PIC experts: How can I have a device trigger for a wakeup from suspend mode?
And indeed, the answer is Yes on the first question.
In the meantime, I found another link on the web that contains a Visual Studio C# implementation of a USB library including all the source files.
If you're interested, this is the link
This C# host implementation works as a charm. Without sending a command to the device, I get notified immediately if a button is pressed. Great!
It also proofs that my earlier device implementation based on the original MicroChip MLA, is 100% correct. I stress tested the implementation by sending a "toggle LED command" as fast as I could, and I reach 1000 commands/second. Again great!
I think that LibUsbDotNet isn't that perfect after all. As I wrote above, I get rather unstable communication (Win32Error). But with this implementation, I don't get a single error, even after running for half an hour # 1000 commands/second.
So for me, case closed.

Issues communicating with devices over usb hub

I'm facing issues when communicating with devices over USB hub. When enumerating devices directly to host port, it does work, some devices over usb hub have issues.
Setup: STM32F103C8 - MAX3421E - LUFA (usb stack) (ported to MAX3421E (host) and STM32F103C8T6 (device)) - USB Full-Speed setup
Scenario:
When I attach device directly to host, I don't experience any issues enumerating almost all (some devices seems to be faulty and have weird/nonstandard behavior) devices. But when I try to enumerate over usb hub, devices starts to behave very strangely. I'm receiving much more NAK's from devices than when connected directly to host. Some devices are able to return Device Descriptor, but retrieving Configuration Descriptor fail. Some devices return Toggle Error after several NAK's, this could be remedied so far by delaying retry IN token. Also there is different behavior of devices when connected over different hubs. I.e. one device has no problems when connected to HUB1, but have issues when connected to HUB2. Then I have HUB3 (7 port) which internally acts as HUB in HUB. On this HUB3 device working fine on port behind secondary internal hub, but not on primary ports exposed over "root" hub.
I'm in suspicion that hub's TT could be somehow interfering with usb communication, but according to information I have found, TT should not be enabled under Full-Speed setup.
I have checked (many times) that I'm setting correct device address assigned during SetAddress phase (which is proved by returning Device Descriptor). When I step debug it seems that I can get Configuration Descriptor also, but while in normal system run, it isn't retrieved, but only over hub.
Does anyone has any ideas, what to look after? I've run out of ideas here after week of trying to find a root cause.
Thanks
so...
- as usual after searching for root cause, solution after days of trying comes naturally after asking on somewhere (this is hapenning to me always, but I do try prior asking always)
- when using hubs, make sure you don't suspend SOF generation during control transfers. LUFA just resume bus inside control transfer routines, so make sure you don"t stops and reenable SOF within (my fault as I'm using ported version to MAX)
- if you have tight main loop make sure you don"t reinitialize usb transfer without completion of previous try, but if you do so, check you don't owerwrite data which haven"t been processed yet fully (especially when using interrupt-driven transfer complete processing) [things seems to work when you have quite some debug output, as it delay that time critical transfers]
Enumeration over hub isuues are now second to none. Small glitches are subject for tweaking.
Unfortunately as I was in question for electrical issues, I had to unsolder usb host shield and soldered another one, which in light of new information seems unneeded. Nevermind, I have trained my soldering skills.

USB CDC device stalling

I'm writing a simple virtual serial port device to report an older serial port. By this point I'm able to enumerate the device and send/receive characters.
After a varying number of bulk-out transmissions from the host to the device the endpoint appears to give up and stop transferring data. On the PC side I receive a write error, and judging from a USBlyzer trace the music stops on a stall (USBD_STATUS_STALL_PID). However my code never intentionally issues a STALL condition on that endpoint and the status flag for having generated one never gets set though.
Given the short amount of time elapsed (<300 µs) between issuing the request and the STALL it would appear to be an invalid response of some sort, and not a time-out. On the device side the output endpoint is ready to go, with data in the buffer and proper DATA0/1 synchronization, but nothing further ever happens.
Note that the device appears to work fine even for long periods of time until I start sending "large" quantities of data. As near as I can tell the device enumeration/configuration also appears to complete successfully. Oh, and the bulk-in endpoint continues to work just fine after this.
For the record I'm using the standard Windows usbser.sys driver and an XMega128A4U µP. I'm also seeing the same behaviour across multiple Windows Vista and 7 machines.
Any ideas what I'm doing wrong or what further tests I might run to narrow things down?
USBlyzer log,
USB CDC stack,
test project
For the record this eventually turned out to be an oscillator problem. (Apparently the FLL's reference is always 1,024 Hz even when the 1,000 Hz USB frames are chosen. The slight clock error meant that a packet occasionally got rejected if it happened to contain one too many 1-bits in a row.)
I guess the moral of the story is to check the basics before assuming you've got a problem with the higher-level protocol. Also in retrospect a hardware USB analyzer would have been a worthwhile investment, the software alternatives mostly seems to spit out a generic error code or nothing at all when something goes awry.
Stalling the out-endpoint may happen on an overflow of the output buffer on the host side. Are you sure that the device does fetch the data it receives via out-endpoint - and if so does it fetch the data at least as fast as data is sent to the device?
Note that the device appears to work fine even for long periods of
time until I start sending "large" quantities of data.
This seems to be a hint for an overflow of the output-buffer.

Using WinPCap for UDP receiving

I would like to use WinPCap library for "reliable" UDP receiving in my C++ application. All examples that I found, using this library for capturing and then proceding. Is there any way (example) how to configure PCap for streaming mode and receive UDP only and on uder defined port or how to solve this. In this time I have reliable UDP server able to receiving 0.5Gb/s. But on slower PC I have a packet lose I can see packets in ethereal but not in application.
thanks
vsm
I assume that you have already tried all of the more standard methods of increasing the number of datagrams that you are able to process? Things like increasing the recv buffer size, speeding up the processing that you do per datagram and using IOCP to allow you to bring more threads to bear on the problem or using RIO if you can target Windows 8?
If so then using WinPCap might work but it sounds like a bit of an extreme solution.
What you need to do is create a filter so that you only capture the datagrams that you are interested in... The docs include examples which use filters.
I have server from here: http://www.gamedev.net/topic/533159-article-using-udp-with-iocp/. This code working with IOCP. Its working fine on WIndows XP. There is no problem with receiving 0.5Gb/s. But now on Win7 is little unreliable. Sometimes there are packets positions error. (my device generating udp packets and in its payload there is PacketNumber - number increasing with each packet. When error occured i write all packet numbers into file. I can see for exmaple: 10,11,290,13,14... ). Is there any known differences in WinXP and Win7 for IOCP and multi threading? Or do you konw any free UDP server with IOCP processing?
In procedding loop I only adding packets into buffer and checking their numbers.

Is there way to tell terminal wait before send more data?

I have embedded firmware which have terminal over serial transmission. I am doing command from terminal which waits data (text file) which it should save to flash chip. However, writing flash is much slower than terminal transmission.
Text file may be pretty big (many kB), so in small embedded environment I cannot simply dump it to RAM. I though if it possible to communicate with standard terminal emulator (which have drag/dop support for files) to pause transmission every time when write buffer is full and tell continue again after write is done? I haven't find anything which may help me throught this.
Well, offcourse I can make PC frontend which understands this trick, but in basic level it should be nice if all function can be used through normal terminal if needed.
For a basic serial connection you could see if the hardware supports flow control. This would be the CTS, RTS lines (clear to send, request to send).
http://en.wikipedia.org/wiki/RS-232_RTS/CTS#RTS.2FCTS_handshaking
However many simple embedded systems do not implement this type of flow control.
If the hardware does not support flow control, then you will have to look at using some form of software flow control. You maybe able to implement the Xon/Xoff flow control ( http://en.wikipedia.org/wiki/XON/XOFF ) or could implement a simple file transfer protocol, like XMODEM, or ZMODEM, or even tftp. This depends on what your terminal can support.
I always use XMODEM when programming data into FLASH via a serial link from a PC. When using XMODEM it only sends one data packet at a time and waits for you to acknowledge the packet before sending the next one.
This means we control the flow via software on the receiving side:
Get packet ->
Write packet ->
Ack packet ->
Repeat util done...
XMODEM can be implemented on the smallest of devices (less than 1K RAM) and the code is very simple. All serial terminals support XMODEM (upto windows XP ship with an XMODEM capable terminal). XMODEM requires no special hardware.
Here is the spec.
Here is an example implementation.