Is there a way to record multi-threaded the mixed result of JavaFX AudioClip sounds to disk? - java-audio

My program launchs an arbitrary number of sounds streamed from FreeSound.org and plays them using javafx.scene.media AudioClip instances.
I was trying to figure out whether it could be possible to capture the generated output to disk, from within the same program? Any pointers?

Instead of using AudioClip you could use SourceDataLine for playback. This class allows you to progressively read the audio data, exposing it for handling. You would have to decode from bytes to PCM for each incoming line, then add the PCM from all the lines to be merged, and recode that back to bytes for your output.
I suspect with a little minor tweaking you could get the library I wrote AudioCue to work for you. It has an optional mixer that will handle multiple cue inputs. The inputs make use of SourceDataLine and the mixer uses the logic I described. You would have to tweak the code to output to disk. I could probably help with that if this project is still live for you.

Related

How can I write raw data on SD card without filesystem by using DSP?

I'm new in embedded electronic/programming but I have a project. For this project, I want to write raw data provided by sensor in 512b buffers on a SD card (SPI mode) with no filesystem.
I'm trying to do it by using the low level functions of the FATFS library. I've heard that I have to create my own format and rewrite functions to do it, but I'm lost in all those lines code... I suppose that I have first to initialize SD by using a command sequence (CMD0, CMD8, CMD55, ACMD41...).
I'm not sure for the next steps, if I have to open a file with the fopen function and then use the fwrite function...
If somebody can explain me how it work for a non filesystem SD card and guide me in the steps to follow I would be very grateful.
Keep in mind that cards with a memory capacity > 2Tb are not supported in SPI mode
(Physical Layer Simplified Specification.
If you don't want to use a file system, then fopen, and fwrite are irrelevant. You simply use the the low-level block driver. If you are referring to http://www.elm-chan.org/fsw/ff/00index_e.html, then the API subset you need is :
disk_status - Get device status
disk_initialize - Initialize device
disk_read - Read data
disk_write - Write data
disk_ioctl - Control device dependent functions
However, since you then have to manage the blocks somehow so you know which blocks are available and where to write next (like one large file), you will essentially be writing your own filesystem (albeit a very simple and limited one). So it begs the question of why you would not just use an existing filesystem?
There are many reasons for not using FAT in an embedded system but few of them will be resolved by implementing your own "raw" filesystem without you doing a lot of work reinventing the filesystem! Consider something more robust and designed for resource constrained embedded systems with potentially unreliable power source, such as littlefs.

WasapiLoopbackCapture to WaveOut

I'm using WasapiLoopbackCapture to capture sound coming from my speakers and then using onDataAvailable to send it to another device and I'm attempting to play the data sent using the WaveOut class and a BufferedWaveProvider and just adding a sample everytime data is sent from my client using the onDataAvailable. I'm having problems sending sound. The most functioning I've managed to get it is:
Not syncing the Wave format of the client and the server, just sending data and adding it to the sample. Problem is this is stutters very much even though I checked the buffer stored size and it has 51 seconds. I even have to increase the buffer size which eventually overflows anyway.
I tried syncing the Wave format and I just get clicks but have no problem with buffer size. I also tried making sure that at least a second was stored in the buffer but that had zero effect.
If anyone could point me in the right direction that would be great.
Uncompressed audio takes up a lot of space on a network. On my machine the WasapiLoopbackCapture object produces 32-bit (IeeeFloat) stereo samples at 44100 samples per second, for around 2.7Mbit/sec total raw bandwidth. Once you factor in TCP packet overheads and so on, that's quite a lot of data you're transferring.
The first thing I would suggest though is that you plug in some profiling code at each step in the process to get an idea of where your bottlenecks are happening. How fast is data arriving from the capture device? How big are your packets? How long does it take to service each call to your OnDataAvailable event handler? How much data are you sending per second across the network? How fast is the data arriving at the client? Figure out where the bottlenecks are and you get a much better idea of what the bottlenecks are.
Try building a simulated server that reads data from a wave file in various WaveFormats (channels, bits per sample and sample rate) and simulates sending that data across the network to the client. You might find that the problem goes away at lower bandwidth. And if bandwidth is the issue, compression might be the solution.
If you're using a single-threaded model, and servicing each OnDataAvailable event takes longer than the recording frequency (ie: number of expected calls to OnDataAvailable per second) then there's going to be a data loss issue. Multiple threads can help with this - one to get the data from the audio system, another to process and send the data. But you can end up in the same position: losing data because you're not dealing with it quickly enough. When that happens it's handy to know about it, because it indicates a problem in the program. Find out when and where it happens - overflow in input, processing or output buffers all have different potential reasons and need different attention.

Get samples from a wav file while audio is being played with NAudio

I've been looking at the NAudio demo application "Audio file playback". What I'm missing from this demo is a way to get hold of the samples while the audio file is being played.
I figured that it would somehow be possible to fill a BufferedWaveProvider with samples using a callback whenever new samples are needed, but I can't figure out how.
My other (non-preferred) idea is to make a special version of e.g. DirectSoundOut where I can get hold of the samples before they are written to the sound card.
Any ideas?
With audio file playback in NAudio you construct an audio pipeline, starting with your audio file and going through various transformations (e.g. changing volume) along the way before ending up at your output device. The NAudioDemo does in fact show how the samples can be accessed along the way by drawing a waveform (pre-volume adjustment) and by showing a volume meter (post-volume adjustment).
You could, for example, create an implementer of IWaveProvider or ISampleProvider and insert it into the pipeline. Then, in the Read method, you read from your source, and then you can process or examine or write to disk the samples before passing them on to the next stage in the pipeline. Look at the AudioPlaybackPanel.CreateInputStream to see how this is done in the demo.

Using Cocoa to detect when a running application plays audio

I'm looking into writing an app that runs as a background process and detects when an app (say, Safari) is playing audio. I can use NSWorkspace to get the process ID's of the currently running applications but I'm at a loss when it comes to detecting what those processes are doing. I assume that there is a way to listen in on a process and detect what public messages the objects are sending. I apologize for my ignorance on the subject.
Has anyone attempted anything like this or are aware of any resources that can help?
I don't think that your "answer" is an answer at all...
and there IS an answer (which is not "42")
your best bet for doing this would be to write a pass-through audio output device. Much like soundflower, actually. so your audio output device would then load the actual (physical) audio output device and pass the audio data along to it directly (after first having a look at the audio stream, of course!). then you only need to convince your users to configure your audio device as the default audio output device so that the majority of applications which play sound will use it automatically. and voila...
your audio processing function will probably just do a quick RMS on the buffer before passing it along to the actual output device. and when the audio power crosses a certain threshold (probably something like -54dB with apple audio hardware), then you know that some app is making sound.
|K<
SoundFlower is an open-source project that allows Mac OS X applications to pass audio to each other. It almost certainly does something similar to what you describe.
I've been informed on another thread that while this is possible, it is an extremely advanced technique and not recommended. It would involve using Application Enhancer (APE) and is considered a not 'nice' thing to do. Looks like that app idea is destined for the big recycling bin in the sky :)

Protocols used to talk between an embedded CPU and a PC

I am building a small device with its own CPU (AVR Mega8) that is supposed to connect to a PC. Assuming that the physical connection and passing of bytes has been accomplished, what would be the best protocol to use on top of those bytes? The computer needs to be able to set certain voltages on the device, and read back certain other voltages.
At the moment, I am thinking a completely host-driven synchronous protocol: computer send requests, the embedded CPU answers. Any other ideas?
Modbus might be what you are looking for. It was designed for exactly the type of problem you have. There is lots of code/tools out there and adherence to a standard could mean easy reuse later. It also support human readable ASCII so it is still easy to understand/test.
See FreeModBus for windows and embedded source.
There's a lot to be said for client-server architecture and synchronous protocols. Simplicity and robustness, to start. If speed isn't an issue, you might consider a compact, human-readable protocol to help with debugging. I'm thinking along the lines of modem AT commands: a "wakeup" sequence followed by a set/get command, followed by a terminator.
Host --> [V02?] // Request voltage #2
AVR --> [V02=2.34] // Reply with voltage #2
Host --> [V06=3.12] // Set voltage #6
AVR --> [V06=3.15] // Reply with voltage #6
Each side might time out if it doesn't see the closing bracket, and they'd re-synchronize on the next open bracket, which cannot appear within the message itself.
Depending on speed and reliability requirements, you might encode the commands into one or two bytes and add a checksum.
It's always a good idea to reply with the actual voltage, rather than simply echoing the command, as it saves a subsequent read operation.
Also helpful to define error messages, in case you need to debug.
My vote is for the human readable.
But if you go binary, try to put a header byte at the beginning to mark the beginning of a packet. I've always had bad luck with serial protocols getting out of sync. The header byte allows the embedded system to re-sync with the PC. Also, add a checksum at the end.
I've done stuff like this with a simple binary format
struct PacketHdr
{
char syncByte1;
char syncByte2;
char packetType;
char bytesToFollow; //-or- totalPacketSize
};
struct VoltageSet
{
struct PacketHdr;
int16 channelId;
int16 voltageLevel;
uint16 crc;
};
struct VoltageResponse
{
struct PacketHdr;
int16 data[N]; //Num channels are fixed
uint16 crc;
}
The sync bytes are less critical in a synchronous protocol than in an asynchronous one, but they still help, especially when the embedded system is first powering up, and you don't know if the first byte it gets is the middle of a message or not.
The type should be an enum that tells how to intepret the packet. Size could be inferred from type, but if you send it explicitly, then the reciever can handle unknown types without choking. You can use 'total packet size', or 'bytes to follow'; the latter can make the reciever code a little cleaner.
The CRC at the end adds more assurance that you have valid data. Sometimes I've seen the CRC in the header, which makes declaring structures easier, but putting it at the end lets you avoid an extra pass over the data when sending the message.
The sender and reciever should both have timeouts starting after the first byte of a packet is recieved, in case a byte is dropped. The PC side also needs a timeout to handle the case when the embedded system is not connected and there is no response at all.
If you are sure that both platforms use IEEE-754 floats (PC's do) and have the same endianness, then you can use floats as the data type. Otherwise it's safer to use integers, either raw A/D bits, or a preset scale (i.e. 1 bit = .001V gives a +/-32.267 V range)
Adam Liss makes a lot of great points. Simplicity and robustness should be the focus. Human readable ASCII transfers help a LOT while debugging. Great suggestions.
They may be overkill for your needs, but HDLC and/or PPP add in the concept of a data link layer, and all the benefits (and costs) that come with a data link layer. Link management, framing, checksums, sequence numbers, re-transmissions, etc... all help ensure robust communications, but add complexity, processing and code size, and may not be necessary for your particular application.
USB bus will answer all your requirements. It might be very simple usb device with only control pipe to send request to your device or you can add an interrupt pipe that will allow you to notify host about changes in your device.
There is a number of simple usb controllers that can be used, for example Cypress or Microchip.
Protocol on top of the transfer is really about your requirements. From your description it seems that simple synchronous protocol is definitely enough. What make you wander and look for additional approach? Share your doubts and we will try to help :).
If I wasn't expecting to need to do efficient binary transfers, I'd go for the terminal-style interface already suggested.
If I do want to do a binary packet format, I tend to use something loosely based on the PPP byte-asnc HDLC format, which is extremely simple and easy to send receive, basically:
Packets start and end with 0x7e
You escape a char by prefixing it with 0x7d and toggling bit 5 (i.e. xor with 0x20)
So 0x7e becomes 0x7d 0x5e
and 0x7d becomes 0x7d 0x5d
Every time you see an 0x7e then if you've got any data stored, you can process it.
I usually do host-driven synchronous stuff unless I have a very good reason to do otherwise. It's a technique which extends from simple point-point RS232 to multidrop RS422/485 without hassle - often a bonus.
As you may have already determined from all the responses not directly directing you to a protocol, that a roll your own approach to be your best choice.
So, this got me thinking and well, here are a few of my thoughts --
Given that this chip has 6 ADC channels, most likely you are using Rs-232 serial comm (a guess from your question), and of course the limited code space, defining a simple command structure will help, as Adam points out -- You may wish to keep the input processing to a minimum at the chip, so binary sounds attractive but the trade off is in ease of development AND servicing (you may have to trouble shoot a dead input 6 months from now) -- hyperterminal is a powerful debug tool -- so, that got me thinking of how to implement a simple command structure with good reliability.
A few general considerations --
keep commands the same size -- makes decoding easier.
Framing the commands and optional check sum, as Adam points out can be easily wrapped around your commands. (with small commands, a simple XOR/ADD checksum is quick and painless)
I would recommend a start up announcement to the host with the firmware version at reset - e.g., "HELLO; Firmware Version 1.00z" -- would tell the host that the target just started and what's running.
If you are primarily monitoring, you may wish to consider a "free run" mode where the target would simply cycle through the analog and digital readings -- of course, this doesn't have to be continuous, it can be spaced at 1, 5, 10 seconds, or just on command. Your micro is always listening so sending an updated value is an independent task.
Terminating each output line with a CR (or other character) makes synchronization at the host straight forward.
for example your micro could simply output the strings;
V0=3.20
V1=3.21
V2= ...
D1=0
D2=1
D3=...
and then start over --
Also, commands could be really simple --
? - Read all values -- there's not that many of them, so get them all.
X=12.34 - To set a value, the first byte is the port, then the voltage and I would recommend keeping the "=" and the "." as framing to ensure a valid packet if you forgo the checksum.
Another possibility, if your outputs are within a set range, you could prescale them. For example, if the output doesn't have to be exact, you could send something like
5=0
6=9
2=5
which would set port 5 off, port 6 to full on, and port 2 to half value -- With this approach, ascii and binary data are just about on the same footing in regards to computing/decoding resources at the micro. Or for more precision, make the output 2 bytes, e.g., 2=54 -- OR, add an xref table and the values don't even have to be linear where the data byte is an index into a look-up table ...
As I like to say; simple is usually better, unless it's not.
Hope this helps a bit.
Had another thought while re-reading; adding a "*" command could request the data wrapped with html tags and now your host app could simply redirect the output from your micro to a browser and wala, browser ready --
:)