Mule 4: SFTP List files that contain a variable - mule

I have an SFTP directory that contains several files in this format
19328D_T001045863113302101909_20220721_103898.txt
1932A8_T001045863113302101909_20220721_103802.txt
I have saved the part starting with T as a dynamic variable, vars.transaction (e.g. vars.transaction == "T001045863113302101909"). I want to check whether this directory contains any files with my vars.transaction in the filename.
I think I need to use the SFTP List connector, edit inline, and use a filename pattern. But since there are numbers before and after the transaction part, I am not sure what to put in the filename pattern. Something like [#vars.transaction]?
Thanks in advance

You can use the wildcard * along with your variable, like *#[vars.transaction]*. That will match all files that have the vars.transaction value in their name.
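To illustrate how surrounding a value with * behaves, here is a minimal sketch in Python using the standard fnmatch module. It is not the Mule matcher itself, only an analogous glob match; the function name files_for_transaction and the third sample file name are made up for illustration.

```python
from fnmatch import fnmatchcase


def files_for_transaction(filenames, transaction):
    """Return the names that contain the transaction id anywhere in the name."""
    # "*" on both sides allows any characters before and after the id.
    pattern = f"*{transaction}*"
    return [name for name in filenames if fnmatchcase(name, pattern)]


listing = [
    "19328D_T001045863113302101909_20220721_103898.txt",
    "1932A8_T001045863113302101909_20220721_103802.txt",
    "19328D_T999999999999999999999_20220721_103898.txt",  # hypothetical non-match
]
matches = files_for_transaction(listing, "T001045863113302101909")
```

An empty result would mean no file for that transaction exists in the listing.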

Related

Not able to filter files using pathGlobFilter

We are trying to read files from a directory in Azure Blob Storage based on a pattern. We are using the
pathGlobFilter option to select files. The directory contains the following files:
Sales_51820_14529409_T_7a3cc7d1d17261fd17e7e1fabd3.csv
Sales_51820_14529409_7a3cc7d1d17261fd17e7e1fabd3.csv
Sales_61820_17529409_7a3cc7d1d17261fd17e7e1fabd3.csv
Sales_61820_17529409_T_7a3cc7d1d17261fd17e7e1fabd3.csv
We need to process only those files that do not have "T" in the file name, i.e. only these two files:
Sales_51820_14529409_7a3cc7d1d17261fd17e7e1fabd3.csv
Sales_61820_17529409_7a3cc7d1d17261fd17e7e1fabd3.csv
But we are not able to read only these two files.
Here is the code,
df = spark.read.format("csv").schema(structSchema).options(header=False,inferSchema=True,sep='|',pathGlobFilter= "Sales_\d{5} _ \d{8}_[a-z0-9]+.csv$").load("wasbs://abc#xxxxx.blob.core.windows.net/abc/2022/02/11/")
Regards,
Rajib
Glob is not regular expression syntax; there are differences between the two.
For example, glob has no quantifiers for repetition counts, such as {5}.
For details, see here.
Back to this question: below is a relatively crude way to do it. I look forward to a better solution from someone more expert.
pathGlobFilter="Sales_[0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[a-z0-9]*.csv"
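The effect of that glob can be checked locally with Python's fnmatch module. This is only a sketch: fnmatch approximates glob matching and is not the matcher Spark uses, so treat it as a sanity check of the pattern's logic. The names FILES, PATTERN and select are illustrative.

```python
from fnmatch import fnmatchcase

# File names copied from the question.
FILES = [
    "Sales_51820_14529409_T_7a3cc7d1d17261fd17e7e1fabd3.csv",
    "Sales_51820_14529409_7a3cc7d1d17261fd17e7e1fabd3.csv",
    "Sales_61820_17529409_7a3cc7d1d17261fd17e7e1fabd3.csv",
    "Sales_61820_17529409_T_7a3cc7d1d17261fd17e7e1fabd3.csv",
]

# Same glob as above: one [0-9] class per digit replaces \d{5} and \d{8}.
PATTERN = "Sales_" + "[0-9]" * 5 + "_" + "[0-9]" * 8 + "_[a-z0-9]*.csv"


def select(names, pattern=PATTERN):
    """Keep only the names that match the glob (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


selected = select(FILES)
```

The "T" files drop out because the character right after the second number block must be a lowercase letter or digit, and the match is case-sensitive.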

Regexpression for getting a file

I have to get a file through PDI based on its filename, and I want to select files whose names match the pattern eligible_for_push, which has to be at the end. The file can be .txt or .csv.
Please Help
Thanks
There are two parts to your query:
1. Finding all files ending with "eligible_for_push":
You cannot use regex to find this sort of pattern (at least I am not aware of a way). As an alternative, do the following:
List all the files in the path using the "Get File Names" step. Use a "Modified Java Script Value" step to find the files ending with the above pattern. Check the JS file below.
2. Files can be ".txt" or ".csv":
You can use the regex/wildcard below to select either .txt or .csv:
.*\.txt|.*\.csv
Note: Use this regex once you have filtered out the files ending with "eligible_for_push". The JS above ignores all the file patterns. After that, use the second step to pick out the .txt or .csv files.
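The two-stage filter described above can be sketched in Python as follows. This is not PDI code; it only mirrors the logic (first the base name must end with the marker, then the extension regex is applied). The function name eligible and the sample file names are hypothetical.

```python
import re

# Extension regex from step 2 of the answer.
EXT_RE = re.compile(r".*\.txt|.*\.csv")


def eligible(names, suffix="eligible_for_push"):
    """Return names whose base name ends with `suffix` and whose extension is .txt or .csv."""
    selected = []
    for name in names:
        stem, dot, _ext = name.rpartition(".")
        # Stage 1: the part before the last dot must end with the marker.
        # Stage 2: the whole name must match the extension regex.
        if dot and stem.endswith(suffix) and EXT_RE.fullmatch(name):
            selected.append(name)
    return selected


sample = [
    "orders_eligible_for_push.txt",   # hypothetical names
    "orders_eligible_for_push.csv",
    "eligible_for_push_orders.txt",
    "orders_eligible_for_push.xml",
]
picked = eligible(sample)
```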
Hope it helps :)

WinSCP Session::RemoveFiles - Delete specified files in sub directories

[Question] Does Session::RemoveFiles() remove files in subdirectories of the source directory? If not, how can I implement this?
(Please do not ask me why I have the remote directory as /C/testTransfer/. The code just for testing purpose.)
I have an SFTP program using the WinSCP .NET assembly. The programming language is C++/CLI. It opens a work file that contains many lines of FTP instructions.
One type of instruction I have to handle is to transfer *.txt from a source directory. The source directory may contain subdirectories, which may contain .txt files as well. Once the transfer is successful, the source files must be deleted.
I use Session::GetFiles() for the transfer. It correctly transfers all .txt files in the source (/C/testTransfer/*.txt), even those in subdirectories (/C/testTransfer/sub/*.txt), to the destination.
transferOptions->FileMask = "*.txt";
session->GetFiles("/C/testTransfer", "C:\\temp\\win", false, transferOptions);
Now, to remove them, I use session->RemoveFiles("/C/testTransfer/*.txt"). Only the *.txt files in the source directory (/C/testTransfer/*.txt) are removed, not those in the subdirectory (/C/testTransfer/sub/*.txt).
In general, Session::RemoveFiles can remove files in subdirectories too. But not this way with a wildcard, because WinSCP will not descend into subdirectories that do not match the wildcard (*.txt). Also note that even if you did not need the wildcard, Session::RemoveFiles would remove the subdirectories themselves as well, which I'm not sure you want.
Though you have other (and better = safer) options:
Use the remove parameter of the Session::GetFiles method to instruct it to remove source file after successful transfer.
If you need to delete source files transactionally (= only after the download of all files succeeds), iterate the TransferOperationResult::Transfers returned by Session::GetFiles and call the Session::RemoveFiles for each (unless the TransferEventArgs::Error is not null).
Use the TransferEventArgs::FileName to get a file path to pass to the Session::RemoveFiles. Use the RemotePath::EscapeFileMask to escape the file name before passing it to the Session::RemoveFiles.
There's a similar full example available for Moving local files to different location after successful upload.
To recursively delete files matching a wildcard in a standalone operation (not after downloading the same files), use the Session::EnumerateRemoteFiles. Pass your wildcard to its mask argument. Use the EnumerationOptions.AllDirectories option for recursion.
Call the Session::RemoveFiles for each returned file. Use the RemotePath::EscapeFileMask to escape the file name before passing it to the Session::RemoveFiles.
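The enumerate-then-delete idea can be illustrated with a local-filesystem analog in Python. This sketch does not use the WinSCP API at all: os.walk stands in for the recursive enumeration and os.remove for the per-file removal, and the names find_matching and remove_matching are made up. It only shows why this approach reaches files in subdirectories while leaving the directories themselves in place.

```python
import fnmatch
import os


def find_matching(root, mask):
    """Recursively collect files under `root` whose name matches `mask`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # The mask is applied to file names only, so directories are
            # always descended into, whatever they are called.
            if fnmatch.fnmatch(name, mask):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def remove_matching(root, mask):
    """Delete every matching file one by one; directories are left untouched."""
    removed = find_matching(root, mask)
    for path in removed:
        os.remove(path)
    return removed
```

Calling remove_matching(some_dir, "*.txt") would delete .txt files at every depth below some_dir and return the list of deleted paths.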

CFSCRIPT - How to check the length of a filename before uploading

I ran into this problem when uploading a file with a super long name - my database field was only set to 50 characters. Since then, I have increased my database field length, but I'd like to have a way to check the length of the filename before uploading. Below is my code. The validation returns '85' as the character length. And it returns the same count for every different file I upload (none of which have a file name length of 85).
<cfscript>
missing_info = "<p>There was a slight problem with your submission. The following are required or invalid:</p><ul>";
// Check the length of the file name for our database field
if ( len(Form["ResumeFile1"]) gt 100 )
{
missing_info = missing_info & "<li>'Resume File 1' is invalid. Character length must be less than 100. Current count is " & len(Form["ResumeFile1"]) & ".</li>";
validation_error = true;
ResumeFileInvalidMarker = true;
}
</cfscript>
Anyone see anything wrong with this?
Thanks!
http://www.cfquickdocs.com/cf9/#cffile.upload
After you upload the file, the variable "clientFileName" will give you the name of the uploaded file, without a file extension.
The only way to read the filename before you upload it would be to use JavaScript to read and parse the value (file path) in the file field.
A quick clarification on the wording of your question. By the time your code executes, the file upload has already happened. The file resides in a temporary directory on the ColdFusion server, and the form field related to the file upload contains the temporary filename for that file. Aside from checking to see if a file has been specified, do not do anything directly with that file or you'll be circumventing some built-in security.
You want to use the cffile tag with the upload action (or equivalent udf) to move the temp file into a folder of your choosing. At that point you get access to a structure containing lots of information. Usually I "upload" into a temporary directory for the application, which should be outside of the webroot for security.
At this point you'll then want to do any validation against the file, such as filename length, file type, file size, etc., and delete the file if it fails any checks. If it passes all checks, then you move it into its final destination, which may be inside the webroot.
In your case you'll want to check the cffile structure element clientFile which is the original filename including extension (which you'll need to check, since an extension doesn't need to be present and can be any length).
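As a language-neutral illustration of that last check, here is a small Python sketch (not CFML) that validates the original file name once it is known. The function name check_name_length is hypothetical, and the default limit of 100 is taken from the code in the question.

```python
import os


def check_name_length(client_file, max_len=100):
    """Split the original file name and check its total length, extension included."""
    # The extension may be missing or of any length, so measure the full name.
    name, extension = os.path.splitext(client_file)
    return {
        "name": name,
        "extension": extension,
        "length": len(client_file),
        "ok": len(client_file) <= max_len,
    }


result = check_name_length("resume.pdf")
```

If "ok" is false, the uploaded file would be deleted from the temporary directory and the error reported to the user.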

Parse M3U file locations to fully qualified paths

I would like to parse the file location information in an M3U playlist into fully qualified paths. The possible formats in M3U files seem to be:
c:\mydir\songs\tune.mp3
\songs\tune.mp3
..\songs\tune.mp3
For the first example, just leave it alone. For the second, add the directory that the playlist resides in, so it would become c:\playlists\songs\tune.mp3, and the same for the third case, so it would also become c:\playlists\songs\tune.mp3.
I'm using VB under VS2008 and I can't find a way to recognise each of the potential location formats in the M3U file. System.IO.Path offers no solution that I can find. I've searched extensively for terms like "convert relative path to absolute" but had no luck.
Any advice appreciated.
Thanks.
Write a batch script that reads the m3u file line by line, parses each line looking for ":" and "..", and edits the string as needed. You can then write the converted strings to another file.
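The line-by-line approach could look roughly like this in Python, using the standard ntpath module so that Windows path rules apply on any platform. This is a sketch under assumptions: the function name resolve_entry is made up, a leading backslash is treated as "relative to the playlist folder" as the question requests, and ".." is resolved with standard semantics (one level up from the playlist folder), which may differ from the result described in the question depending on where the playlist lives.

```python
import ntpath


def resolve_entry(entry, playlist_dir):
    """Turn one M3U location into a fully qualified Windows path."""
    drive, _rest = ntpath.splitdrive(entry)
    if drive:
        # Already fully qualified (e.g. c:\mydir\songs\tune.mp3): leave alone.
        return entry
    # Drop a leading slash so the entry is joined onto the playlist folder
    # instead of the drive root (assumption taken from the question).
    relative = entry.lstrip("\\/")
    # normpath collapses any ".." segments after joining.
    return ntpath.normpath(ntpath.join(playlist_dir, relative))


resolved = [
    resolve_entry(r"c:\mydir\songs\tune.mp3", r"c:\playlists"),
    resolve_entry(r"\songs\tune.mp3", r"c:\playlists"),
    resolve_entry(r"..\songs\tune.mp3", r"c:\playlists\rock"),
]
```

When reading a real playlist, lines starting with "#" are comments or directives and would be skipped before calling resolve_entry.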