Delete files whose MD5 hashes are listed in a text file - VB.net

I have a list of MD5 file hashes stored in a text file, and I want to delete every file on the system (or under a given path) whose hash appears in that list. I have tried to code it, but my attempt only scans for one file from the list, which is not what I need. Is there a way to find and delete all files under a path whose MD5 hashes are listed? Thanks.

Pidgin pseudocode:
put the MD5s in an array
cycle through the filesystem
for each file, read it into a variable and compute the MD5 hash of that variable
if the MD5 hash is in the array, delete the file
You should probably skip swap files and system folders.
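A minimal sketch of the pseudocode above, written in Python rather than VB.NET purely for illustration. The function names (load_hashes, md5_of_file, delete_matching), the one-hash-per-line list format, and the dry_run flag are assumptions, not part of the question.

```python
import hashlib
import os


def load_hashes(list_path):
    """Read one MD5 hex digest per line into a set (lowercased, blanks skipped)."""
    with open(list_path, encoding="utf-8") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def delete_matching(root, hashes, dry_run=True):
    """Walk root and collect every file whose MD5 is in hashes.

    Files are only deleted when dry_run is False. Returns the matched paths.
    """
    matched = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                if md5_of_file(path) in hashes:
                    matched.append(path)
                    if not dry_run:
                        os.remove(path)
            except OSError:
                # Locked, unreadable, or vanished files (e.g. swap files) are skipped.
                continue
    return matched
```

Keeping the hashes in a set makes each lookup cheap, and running once with dry_run=True shows what would be deleted before anything is removed.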

Related

how to read multiple text files into a dataframe in pyspark

I have a few txt files in a directory (I only have the path, not the file names) that contain JSON data, and I need to read all of them into a dataframe.
I tried this:
df=sc.wholeTextFiles("path/*")
but I can't even display the data, and my main goal is to perform queries on the data in different ways.
Instead of wholeTextFiles (which gives key-value pairs with the file name as key and the file content as value),
try read.json and give it your directory name; Spark will read all the files in the directory into a dataframe.
df=spark.read.json("<directory_path>/*")
df.show()
From docs:
wholeTextFiles(path, minPartitions=None, use_unicode=True)
Read a directory of text files from HDFS, a local file system
(available on all nodes), or any Hadoop-supported file system URI.
Each file is read as a single record and returned in a key-value pair,
where the key is the path of each file, the value is the content of
each file.
Note: Small files are preferred, as each file will be loaded fully in
memory.

Visual Basic read text file and delete files from that file

I know how to make my program read a file, but I don't know how to use that information to delete the files listed in that text file.
Example:
I have a text file called ban.txt; inside it there are two lines with the text abc.exe and cba.exe.
I want my program to read the content of ban.txt and then delete those specified files.
Assuming that you know how to read the file and find the file names, then just add this statement to a for-each:
My.Computer.FileSystem.DeleteFile(strFilename)
There are also options for displaying error messages and sending the file to the recycle bin.
My.Computer.FileSystem.DeleteFile("C:\Test.txt", FileIO.UIOption.AllDialogs, FileIO.RecycleOption.SendToRecycleBin)
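For comparison, here is a rough sketch of the whole read-then-delete loop in Python rather than Visual Basic. The function name delete_listed_files and the base_dir parameter are hypothetical, and it assumes the list holds one file name per line, as ban.txt does in the question.

```python
from pathlib import Path


def delete_listed_files(list_file, base_dir):
    """Delete each file named in list_file (one name per line) under base_dir.

    Returns (deleted, missing) as two lists of names.
    """
    base = Path(base_dir)
    deleted, missing = [], []
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if not name:
            continue  # ignore blank lines
        target = base / name
        if target.is_file():
            target.unlink()
            deleted.append(name)
        else:
            # Record names that do not exist instead of raising an error.
            missing.append(name)
    return deleted, missing
```

Returning the missing names separately lets the caller decide whether a listed file that does not exist is an error.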

WinSCP Session::RemoveFiles - Delete specified files in sub directories

[Question] Does Session::RemoveFiles() remove files in subdirectories of the source directory? If not, how can I implement this?
(Please do not ask me why I have the remote directory as /C/testTransfer/. The code just for testing purpose.)
I have an SFTP program using the WinSCP .NET assembly, written in C++/CLI. It opens a work file that contains many lines of FTP instructions.
One type of instruction I have to handle is to transfer *.txt files from a source directory. The source directory may contain subdirectories, which may contain .txt files as well. Once the transfer succeeds, the source files should be deleted.
I use Session::GetFiles() for the transfer. It correctly transfers all .txt files in the source (/C/testTransfer/*.txt), even those in subdirectories (/C/testTransfer/sub/*.txt), to the destination.
transferOptions->FileMask = "*.txt";
session->GetFiles("/C/testTransfer", "C:\\temp\\win", false, transferOptions);
To remove them, I use session->RemoveFiles("/C/testTransfer/*.txt"). Only the *.txt files directly in the source directory (/C/testTransfer/*.txt) are removed; those in the subdirectory (/C/testTransfer/sub/*.txt) are not.
In general, Session::RemoveFiles can remove files in subdirectories too. But not this way, with a wildcard, because WinSCP will not descend into subdirectories that do not match the wildcard (*.txt). Also note that even if you did not need the wildcard, Session::RemoveFiles would remove the subdirectories themselves as well, which I'm not sure you want.
You have other (and better, meaning safer) options, though:
Use the remove parameter of the Session::GetFiles method to instruct it to remove source file after successful transfer.
If you need to delete source files transactionally (that is, only after all files have downloaded successfully), iterate the TransferOperationResult::Transfers returned by Session::GetFiles and call Session::RemoveFiles for each transfer whose TransferEventArgs::Error is null.
Use the TransferEventArgs::FileName to get a file path to pass to the Session::RemoveFiles. Use the RemotePath::EscapeFileMask to escape the file name before passing it to the Session::RemoveFiles.
There's a similar full example available for Moving local files to different location after successful upload.
To recursively delete files matching a wildcard in a standalone operation (not after downloading the same files), use the Session::EnumerateRemoteFiles. Pass your wildcard to its mask argument. Use the EnumerationOptions.AllDirectories option for recursion.
Call the Session::RemoveFiles for each returned file. Use the RemotePath::EscapeFileMask to escape the file name before passing it to the Session::RemoveFiles.
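The enumerate-then-remove idea can be illustrated on a local directory tree. This Python sketch is only an analogy and does not use the WinSCP API; the function name remove_matching_files is hypothetical. It applies the mask to file names only, so every subdirectory is still visited and no directory is deleted.

```python
import fnmatch
import os


def remove_matching_files(root, mask):
    """Recursively delete files under root whose names match mask.

    Directories are never removed, only the matching files inside them.
    Returns the list of deleted paths.
    """
    removed = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            # The mask is tested against file names only, so a subdirectory
            # is walked even though its own name does not match the mask.
            if fnmatch.fnmatch(name, mask):
                path = os.path.join(folder, name)
                os.remove(path)
                removed.append(path)
    return removed
```

Each file is removed by its exact path, not by passing a wildcard to the delete call. That is the same reason the answer suggests escaping each file name before passing it to Session::RemoveFiles.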

CFSCRIPT - How to check the length of a filename before uploading

I ran into this problem when uploading a file with a super long name - my database field was only set to 50 characters. Since then, I have increased my database field length, but I'd like to have a way to check the length of the filename before uploading. Below is my code. The validation returns '85' as the character length. And it returns the same count for every different file I upload (none of which have a file name length of 85).
<cfscript>
missing_info = "<p>There was a slight problem with your submission. The following are required or invalid:</p><ul>";
// Check the length of the file name for our database field
if ( len(Form["ResumeFile1"]) gt 100 )
{
missing_info = missing_info & "<li>'Resume File 1' is invalid. Character length must be less than 100. Current count is " & len(Form["ResumeFile1"]) & ".</li>";
validation_error = true;
ResumeFileInvalidMarker = true;
}
</cfscript>
Anyone see anything wrong with this?
Thanks!
http://www.cfquickdocs.com/cf9/#cffile.upload
After you upload the file, the variable "clientFileName" will give you the name of the uploaded file, without a file extension.
The only way to read the filename before you upload it would be to use JavaScript to read and parse the value (file path) in the file field.
A quick clarification on the wording of your question: by the time your code executes, the file upload has already happened. The file resides in a temporary directory on the ColdFusion server, and the form field related to the file upload contains the temporary filename for that file (which is why the length you see has nothing to do with the original name). Aside from checking whether a file has been specified, do not do anything directly with that file or you'll be circumventing some built-in security.
You want to use the cffile tag with the upload action (or an equivalent UDF) to move the temp file into a folder of your choosing. At that point you get access to a structure containing lots of information. I usually "upload" into a temporary directory for the application, which should be outside of the webroot for security.
At this point you'll want to run any validation against the file, such as filename length, file type, file size, etc., and delete the file if it fails any check. If it passes all checks, you then move it to its final destination, which may be inside the webroot.
In your case you'll want to check the cffile structure element clientFile which is the original filename including extension (which you'll need to check, since an extension doesn't need to be present and can be any length).
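The check itself is simple once you have the original name. Here is a minimal sketch in Python (not CFScript), assuming client_file holds the original file name including its extension, as cffile reports in clientFile. The function name and the default limit of 100 (taken from the check in the question) are illustrative.

```python
import os


def check_client_filename(client_file, max_length=100):
    """Return a list of problems with the original (client-side) file name.

    client_file is expected to be the name including its extension,
    not the temporary path the server assigns to the upload.
    """
    problems = []
    stem, ext = os.path.splitext(client_file)
    if not stem:
        problems.append("file name is empty")
    if not ext:
        problems.append("file name has no extension")
    if len(client_file) > max_length:
        problems.append(
            f"file name is {len(client_file)} characters; the limit is {max_length}"
        )
    return problems
```

An empty list means the name passed. Any entries can be appended to the missing_info message before the uploaded file is deleted.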

Why .RAR file contains different files with the same name

I got a .RAR file which contains different files with the same name.
For example,
index.txt 40 Text Document 04/01/2010 4:40PM
index.txt 22 Text Document 04/01/2010 4:42PM
index.txt 10 Text Document 04/01/2010 4:45PM
index.txt 13 Text Document 04/01/2010 4:50PM
Why?
As already said, the files could be in separate paths, but as I'll show further down, this isn't always the case.
If you use WinRAR to list the file contents and your options are set as the following, then it only appears you have files with the same name, but they are in different paths.
Options -> File list -> Flat folders view (ctrl+h)
Options -> File list -> Details
After the column CRC32, there is one called Path. If this is different, extraction shouldn't be a problem if:
Extract -> Extraction path and options -> Advanced -> Extract relative paths is set.
If it is Do not extract paths, WinRAR will need to ask you to rename them because of file system limitations.
I assume command line unrar won't be a problem in this case because you need to specify additional parameters to change its default behavior.
It is possible for a RAR archive to have multiple files with the same name in the same directory. If you use Windows, use "C:\Program Files\WinRAR\Rar.exe" instead of rar on the command line in the following examples.
Create a new file and add it to a RAR archive. You can also check the changes by listing its contents.
rar a rarfile.rar testfile.txt
rar l rarfile.rar
If you try to re-add this file, rar will replace the already added file with the same name.
rar a rarfile.rar testfile.txt
Updating archive rarfile.rar
Updating testfile.txt OK
Done
Create another file, or rename the first one, and add it to the RAR file.
move testfile.txt second.txt (new file)
rar a rarfile.rar second.txt (add it)
rar lb rarfile.rar (list archive, bare info)
Rename the second file to the first one's name.
rar rn rarfile.rar second.txt testfile.txt
This is how you create a RAR file with multiple files of the same name in the same path. These steps will be similar in WinRAR. If you try to rename the file again, the file name of all files in that directory will change too.
Why would someone want to do this?
The only explanation I can think of is that the person that created this archive wanted to imitate a version control/backup system. But if you want to extract only one specific version and it isn't the first one, WinRAR extracts the wrong file. It seems I've found a very obscure WinRAR bug :-)
Edit: seems a bad explanation after finding this in the RAR documentation:
-ver[n] File version control
Forces RAR to keep previous file versions when updating
files in the already existing archive. Old versions are
renamed to 'filename;n', where 'n' is the version number.
By default, when unpacking an archive without the switch
-ver, RAR extracts only the last added file version, the name
of which does not include a numeric suffix. But if you specify
a file name exactly, including a version, it will be also
unpacked. For example, 'rar x arcname' will unpack only
last versions, when 'rar x arcname file.txt;5' will unpack
'file.txt;5', if it is present in the archive.
If you specify -ver switch without a parameter when unpacking,
RAR will extract all versions of all files that match
the entered file mask. In this case a version number is
not removed from unpacked file names. You may also extract
a concrete file version specifying its number as -ver parameter.
It will tell RAR to unpack only this version and remove
a version number from file names. For example,
'rar x -ver5 arcname' will unpack only 5th file versions.
If you specify 'n' parameter when archiving, it will limit
the maximum number of file versions stored in the archive.
Old file versions exceeding this threshold will be removed.
They are in different paths, most likely.
Try outputting the full path, or see what happens when you extract them.
You'll probably see something like:
index.txt
path1/index.txt
path2/index.txt
etc etc
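Once you have the full paths from a listing, telling the two cases apart is mechanical. This is a small sketch in Python, assuming you already have the entry paths as strings with forward slashes; it does not read RAR files itself, and the function name classify_same_names is hypothetical.

```python
from collections import Counter
import posixpath


def classify_same_names(entry_paths):
    """Split repeated base names into two groups.

    Returns (in_different_folders, true_duplicates), each a sorted list of
    base names. A true duplicate means the identical full path occurs more
    than once; such a name is reported only in the second list.
    """
    full_counts = Counter(entry_paths)
    name_counts = Counter(posixpath.basename(p) for p in entry_paths)
    true_duplicates = sorted(
        {posixpath.basename(p) for p, n in full_counts.items() if n > 1}
    )
    in_different_folders = sorted(
        name
        for name, n in name_counts.items()
        if n > 1 and name not in true_duplicates
    )
    return in_different_folders, true_duplicates
```

For the listing above (index.txt, path1/index.txt, path2/index.txt), index.txt lands in the first group. An archive built with the rename trick described earlier would put it in the second.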