Removing a trailing white space when copying a text file to another text file in C - printf

I am trying to copy characters from one text file directly to another using fscanf/fprintf w/ %c but I am being left with an extra white space character at the very end of the file.
Is there a way I can trim the last whitespace character from the output file or tell my function to ignore the very last white space?
Also my constraints dictate that I am not allowed to use an array to parse any read characters. My program needs to read and copy directly from the input file to the output file.
Here is my code below.
void copyFile(FILE* inputStream, FILE* outputStream){
    while(!feof(inputStream)) {
        int readCorrectly;
        char readChar;
        readCorrectly = fscanf(inputStream, "%c", &readChar);
        if (readCorrectly) {
            fprintf(outputStream, "%c", readChar);
        }
    }
}

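The problem here is that the end-of-file indicator is only set when you attempt to read past the end of the file, not when you read the last character. So after the last character has been copied, feof still returns false and the loop runs one more time. That final fscanf fails and returns EOF, which is a negative value and therefore counts as true in if (readCorrectly), so whatever is left in readChar (typically the previous character) is written out again.
To resolve this, you need to add a check after the fscanf.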
E.g.
if (feof(inputStream)) {
    break;
}
In doing so, you could also replace your while condition with an infinite loop, i.e. while (1).
There are more explanations about how feof works here:
How feof() works in C
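Another way to sidestep the problem, sketched here under the assumption that you are free to swap fscanf/fprintf for fgetc/fputc, is to let the read call itself drive the loop. The name copy_stream is made up for this example. It still reads one character at a time into a single variable, so no array is involved.

```c
#include <assert.h>
#include <stdio.h>

/* Copies every byte from `in` to `out`.
   Returns 0 on success, -1 on a read or write error. */
int copy_stream(FILE *in, FILE *out)
{
    int c;  /* int, not char, so EOF can be told apart from any byte value */

    assert(in != NULL && out != NULL);

    /* The loop ends as soon as fgetc reports EOF, so nothing is
       written after the last real character. */
    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Because the result of the read is tested before anything is written, there is no extra pass through the loop and no need to call feof at all.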

Related

removing unconventional field separators (^#^#^#) in a text file [duplicate]

I have a text file containing unwanted null characters (ASCII NUL, \0). When I try to view it in vi I see ^@ symbols, interleaved in normal text. How can I:
Identify which lines in the file contain null characters? I have tried grepping for \0 and \x0, but this did not work.
Remove the null characters? Running strings on the file cleaned it up, but I'm just wondering if this is the best way?
I’d use tr:
tr < file-with-nulls -d '\000' > file-without-nulls
If you are wondering if input redirection in the middle of the command arguments works, it does. Most shells will recognize and deal with I/O redirection (<, >, …) anywhere in the command line, actually.
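If tr isn't an option, or you're already working in C like the question at the top of this page, the same filtering is only a few lines. This is just a sketch: strip_nuls is a made-up name, and it assumes you have already read the data into a buffer rather than working on the file directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: removes every NUL byte from buf[0..len) in place
   and returns the new length. */
size_t strip_nuls(unsigned char *buf, size_t len)
{
    size_t kept = 0;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            buf[kept++] = buf[i];  /* keep everything except NUL */
    }
    return kept;
}
```

The bytes after the returned length are left untouched, so the caller should write out only the first `kept` bytes.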
Use the following sed command for removing the null characters in a file.
sed -i 's/\x0//g' null.txt
This solution edits the file in place, which is important if the file is still being used. Passing -i'ext' creates a backup of the original file with the 'ext' suffix added.
A large number of unwanted NUL characters, say one every other byte, indicates that the file is encoded in UTF-16 and that you should use iconv to convert it to UTF-8.
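A rough way to check for that pattern before reaching for iconv is sketched below. It is only a heuristic for mostly-ASCII text: the function name and the 90% threshold are assumptions made for this example, not an established test, and a byte order mark at the start of the file would throw it off unless you skip it first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heuristic: returns 1 if NUL bytes sit almost exclusively
   on one side of each 2-byte pair, as they do for ASCII text stored as
   UTF-16, and 0 otherwise. */
int looks_like_utf16(const unsigned char *buf, size_t len)
{
    size_t even_nuls = 0, odd_nuls = 0;
    size_t pairs = len / 2;

    assert(buf != NULL || len == 0);
    if (pairs == 0)
        return 0;

    for (size_t i = 0; i + 1 < len; i += 2) {
        if (buf[i] == 0)
            even_nuls++;   /* high byte first: big-endian layout */
        if (buf[i + 1] == 0)
            odd_nuls++;    /* low byte first: little-endian layout */
    }

    /* Assumed threshold: at least 90% of pairs have a NUL on one side
       and none have a NUL on the other side. */
    return (odd_nuls * 10 >= pairs * 9 && even_nuls == 0) ||
           (even_nuls * 10 >= pairs * 9 && odd_nuls == 0);
}
```

If this returns 1, converting the encoding is probably a better fix than deleting the NULs.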
I discovered the following, which prints out which lines, if any, have null characters:
perl -ne '/\000/ and print;' file-with-nulls
Also, an octal dump can tell you if there are nulls:
od file-with-nulls | grep ' 000'
If the lines in the file end with \r\n\000 then what works is to delete the \n\000 then replace the \r with \n.
tr -d '\n\000' <infile | tr '\r' '\n' >outfile
Here is an example of how to remove NUL characters using ex (in-place):
ex -s +"%s/\%x00//g" -cwq nulls.txt
and for multiple files:
ex -s +'bufdo!%s/\%x00//g' -cxa *.txt
For recursivity, you may use globbing option **/*.txt (if it is supported by your shell).
This is useful for scripting, since the -i parameter of sed is a non-standard BSD extension.
See also: How to check if the file is a binary file and read all the files which are not?
I used:
recode UTF-16..UTF-8 <filename>
to get rid of zeroes in file.
I faced the same error with:
import codecs as cd
f=cd.open(filePath,'r','ISO-8859-1')
I solved the problem by changing the encoding to utf-16
f=cd.open(filePath,'r','utf-16')
Remove a trailing null character at the end of a PDF file using PHP. This is independent of the OS.
This script uses PHP to remove a trailing NULL value at the end of a binary file, solving a crashing issue that was triggered by the NULL value. You can edit this script to remove all NULL characters, but seeing it done once will help you understand how this works.
Backstory
We were receiving PDFs from a 3rd party that we needed to upload to our system using a PDF library. In the files being sent to us, a null value was sometimes appended to the PDF file. When our system processed these files, the ones with the trailing NULL value caused the system to crash.
Originally we were using sed, but sed behaves differently on Macs and Linux machines. We needed a platform-independent method to remove the trailing null value. PHP was the best option. Also, it was a PHP application so it made sense :)
This script performs the following operation:
Take the binary file, convert it to hex (binary files don't like being exploded on newlines or carriage returns), explode the string using the hex form of CRLF (0d0a) as the delimiter, pop the last member of the array if its value is 00 (null), implode the array using the same delimiter, then process the file.
//In this case we are getting the file as a string from another application.
// We use this line to get a sample bad file.
$fd = file_get_contents($filename);
//We trim leading and trailing whitespace and convert the string into hex
$bin2hex = trim(bin2hex($fd));
//We create an array using CRLF (0d0a in hex) as the delimiter
$bin2hex_ex = explode('0d0a', $bin2hex);
//Look at the last element. If the last element is equal to 00 we pop it off
$end = end($bin2hex_ex);
if($end === '00') {
    array_pop($bin2hex_ex);
}
//We implode the array using CRLF (0d0a in hex) as the glue
$bin2hex = implode('0d0a', $bin2hex_ex);
//The new string no longer has the null character at the EOF
$fd = hex2bin($bin2hex);

How to add asterisk to a list of filenames and then make it a line using Notepad++

I have a list of file names (about 4000).
For example:
A-67569
H-67985
J-87657
K-85897
...
I need to put an asterisk before and after each file name. And then make it a line format.
Example:
*A-67569* *H-67985* *J-87657* *K-85897* so on...
Note that there is a space between filenames.
Forgot to mention, I'm trying to do this with Notepad++
How can I do it?
Please advise.
Thanks
C# example for list to string plus edits
List<string> list = new List<string> { "A-67569", "H-67985", "J-87657", "K-85897" };
string outString = "";
foreach(string item in list)
{
    outString += "*" + item + "* ";
}
content of outString: *A-67569* *H-67985* *J-87657* *K-85897*
1. Use the Replace dialog of Notepad++ (Search > Replace..)
2. Select Extended (\n \r \t \0 \x...) at the bottom of the Replace window
3. In the Find what field write \r\n and in the Replace with field write * *
4. Replace all
Note that you have to place the single asterisk before the first word and after the last word manually.
If this doesn't work, in step 3 try using only \n or \r instead of \r\n.
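If you would rather script it than do it in the editor, here is a small sketch of the same transformation in C. It is only an illustration: wrap_lines is a made-up name, it assumes the whole list is already in memory as one string, and it skips empty lines. Unlike the search and replace steps, it also adds the first and last asterisk for you.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: wraps each non-empty line of `in` in asterisks and
   joins the results with single spaces. Returns the length written
   (excluding the terminating NUL), or (size_t)-1 if `cap` is too small. */
size_t wrap_lines(const char *in, char *out, size_t cap)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);
    if (cap == 0)
        return (size_t)-1;

    while (*in != '\0') {
        size_t len = strcspn(in, "\r\n");   /* length of the current line */
        if (len > 0) {
            /* two asterisks, plus a separating space after the first item */
            size_t need = len + 2 + (used > 0 ? 1 : 0);
            if (used + need + 1 > cap)
                return (size_t)-1;
            if (used > 0)
                out[used++] = ' ';
            out[used++] = '*';
            memcpy(out + used, in, len);
            used += len;
            out[used++] = '*';
        }
        in += len;
        in += strspn(in, "\r\n");           /* skip the line terminator(s) */
    }
    out[used] = '\0';
    return used;
}
```

It accepts \r\n, \n or \r line endings, so it does not matter which of them the file uses.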
You can use Regular expression as the Search Mode.
Find what:
(\S+)(\R|$)
Replace with:
*$1 
Note the space after the 1.
For the file
A-67569
H-67985
J-87657
K-85897
Output:
*A-67569 *H-67985 *J-87657 *K-85897
To also get the closing asterisk the question asks for, put an asterisk before that space, i.e. replace with *$1* followed by a space.
Explanation of the regex:
(\S+) means find one or more characters that are not whitespace.
(\R|$) means find any end of line or the end of the file.
(\S+)(\R|$) means find any group of non-whitespace characters that ends with an end of line or the end of the file.
Explanation of Replace with:
When you use the $ symbol, you are referring to one of the matched groups; $1 is the first group, in this case (\S+).

How to read elements from a line in VHDL?

I'm trying to use VHDL to read from a file that can have different formats. I know you're supposed to use the following two lines of code to read a line at a time, then read individual elements in that line.
readline(file, aline);
read(aline, element);
However my question is what will read(aline, element) return into element? What will it return if the line is empty? What will it return if I've used it let's say 5 times and my line only has 4 characters?
The reason I want to know is that if I am reading a file with an arbitrary number of spaces between valid data, how do I parse this valid data?
The file contains ASCII characters separated by arbitrary amounts of white space (any number of spaces, tabs, or new lines). If the line starts with a # that line is a comment and should be ignored.
Outside of these comments, the first part of the file contains characters that are only letters or numbers in combinations of variable size. In other words this:
123 ABC 12ABB3
However, the majority of the file (after a certain number of read words) will be purely numbers of arbitrary length, separated by an arbitrary amount of white space. In other words, the second part of the file is this:
255 0 2245 625 430
2222 33 111111
and I must be able to parse these numbers (and interpret them as such) individually.
As mentioned in the comments, all the read procedures in std.textio and ieee.std_logic_textio skip over leading spaces apart from the character and string versions (because a space is as much a character as any other).
You can test whether a line variable (the buffer) is empty like this:
if L'length > 0 then
where L is your line variable. There is also a set of overloaded read procedures with an extra status output:
procedure read (L : inout LINE;
VALUE: out <type> ;
GOOD : out BOOLEAN);
The extra output - GOOD - is true if the read was successful and false if it wasn't. The advantage of these is that if the read is unsuccessful, the simulation does not stop (as it does with the regular procedures). Also, with the versions in std.textio, if the read is unsuccessful, the read is non-destructive (i.e. whatever you were trying to read remains in the buffer). This is not the case with the versions in ieee.std_logic_textio, however.
If you really do not know what format you are trying to read, you could read the entire line into a string, like this:
variable S : string(1 to <some big number>);
...
readline(F, L);
assert L'length < S'length; -- make sure S is big enough
S := (others => ' '); -- make sure that the previous line is overwritten
if L'length > 0 then
read(L, S(1 to L'length));
end if;
The line L is now in the string S. You can then write some code to parse it. You may find the type attribute 'value useful. This converts a string to some type, eg
variable I : integer;
...
I := integer'value(S(12 to 14));
would set integer I to the value contained in elements 12 to 14 of string S.
Another approach, as suggested by user1155120 below, is to peek at the values in the buffer, eg
if L'length > 0 then -- check that L isn't empty, otherwise the next line blows up
    if L.all(1) = '#' then
        -- the first character of the line is a '#' so the line must be a comment
    end if;
end if;

Write multiple lines to text file with '\n'

I have a program that iterates over all lines of a text file, adds spaces between the characters, and writes the output to the same file. However, if there are multiple lines in the input, I want the output to have separate lines as well. I tried:
let text = format!(r"{}\n", line); // Add newline character to each line (while iterating)
file.write_all(text.as_bytes()); // Write each line + newline
Here is an example input text file:
foo
bar
baz
And its output:
f o o\n b a r\n b a z
It seems that Rust treats "\n" as an escaped n character, but using r"\n" treats it as a string. How can I have Rust treat \n as a newline character to write multiple lines to a text file?
Note: I can include the rest of my code if you need it, let me know.
Edit: I am on Windows 7 64 bit
The problem is the 'r' in front of your string. Remove it and your program will print newlines instead of '\n'.
Also note that '\n' on its own is the newline convention on most Unices only. Windows uses "\r\n".

Removing blank line at end of string before writing to text file?

Been searching around for this for a couple hours, can't find anything which will do this correctly. When writing a string to a text file, a blank line is outputted at the end.
writeString = New StreamWriter(path, False)
writeString.WriteLine("Hello World")
writeString.Flush()
writeString.Close()
This will write the following to file:
Hello World
(Blank Line)
I've tried removing last character of string (both as regular string with varString.Substring(0, varString.Length - 1) and also as a list of string with varList.RemoveAt(varList.Count - 1)) but it just removes the literal last character.
I've also tried using Replace(vbCrLf, "") and many variations of it but again, they only remove literal new lines created in the string, not the new line at the end that is magically created.
Preferably, I'm seeking a method which will be able to remove that magical newline before the string is ever written to the file. I found methods which read from the file and then write back to it which would require Write > Read > Write, but in all cases the magical new line still appeared. :(
If it's important to note: The file will contain a string which may contain actual new lines (it's 'Song Artist - Song Title', though it can contain other information and new lines can be added if the user wishes). That text file is then read by other applications (such as mIRC etc.), which output the contents by various means depending on the application.
Eg. If an application were to read it and output it into a textbox.. the new line will additionally output to that textbox.. which is a problem! I have no control of the applications which will read the file as input considering it's the client which decides the application, so the removal of the new line needs to be done when outputted.
Help is appreciated~!
Use the Write method instead of WriteLine. The WriteLine method is the one adding a blank 0 length line to the file because it is terminating the "Hello World" string with a newline.
writeString.Write("Hello World")