RAR a folder without persisting the full path - rar

1) I have a folder called CCBuilds containing couple of files in this path: E:\Testing\Builds\CCBuilds.
2) I have written C# code (Process.Start) to Rar this folder and save it in E:\Testing\Builds\CCBuilds.rar using the following command
"C:\program files\winrar\rar.exe a E:\Testing\Builds\CCBuilds.rar E:\Testing\Builds\CCBuilds"
3) The problem is that, though the rar file gets created properly, when I unrar the file to CCBuilds2 folder (both through code using rar.exe x command or using Extract in context menu), the unrared folder contains the full path, ie. extracting E:\Testing\Builds\CCBuilds.rar ->
E:\Testing\Builds\CCBuilds2\Testing\Builds\CCBuilds\<<my files>>
Whereas I want it to be something like this: E:\Testing\Builds\CCBuilds2\CCBuilds\<<my files>>
How can I avoid this full path persistence while adding to rar / extracting back from it. Any help is appreciated.

Use the -ep1 switch.
More info:
-ep = Files are added to an archive without including the path information. Could result in multiple files existing in the archive
with same name.
-ep1 = Do not store the path entered at the command line in archive. Exclude base folder from names.
-ep2 = Expand paths to full. Store full file paths (except drive letter and leading backslash) when archiving.
(source: http://www.qa.downappz.com/questions/winrar-command-line-to-add-files-with-relative-path-only.html)

Just in case this helps: I am currently working on an MS Access Database project (customer relations management for a small company), and one of the tasks there is to zip docx-files to be sent to customers, with a certain password encryption used.
In the VBA procedure that triggers the zip-packaging of the docx-files, I call WinRAR as follows:
c:\Programme\WinRAR\winrar.exe a -afzip -ep -pThisIsThePassword "OutputFullName" "InputFullName"
-afzip says: "Create a zip file (as opposed to a rar file)
-ep says: Do not include the paths of the source file, i.e. put the file directly into the zip folder
A full list of such switches is available in the WinRAR Help, section "Command line".

x extracts it as E:\Testing\Builds\CCBuilds2\Testing\Builds\CCBuilds\, because you're using full path when declaring the source. Either use -ep1 or set the default working dir to E:\Testing\Builds.
Use of -ep1 is needed but it's a bit tricky.
If you use:
Winrar.exe a output.rar inputpath
Winrar.exe a E:\Testing\Builds\CCBuilds.rar E:\Testing\Builds\CCBuilds
it will include the input path declared:
E:\Testing\Builds\CCBuilds -> E:\Testing\Builds\CCBuilds.rar:
Testing\Builds\CCBuilds\file1
Testing\Builds\CCBuilds\file2
Testing\Builds\CCBuilds\folder1\file3
...
which will end up unpacked as you've mentioned:
E:\Testing\Builds\CCBuilds2\Testing\Builds\CCBuilds\
There are two ways of using -ep1.
If you want the simple path:
E:\Testing\Builds\CCBuilds\
to be extracted as:
E:\Testing\Builds\CCBuilds2\CCBuilds\file1
E:\Testing\Builds\CCBuilds2\CCBuilds\file2
E:\Testing\Builds\CCBuilds2\CCBuilds\path1\file3
...
use
Winrar.exe a -ep1 E:\Testing\Builds\CCBuilds.rar E:\Testing\Builds\CCBuilds
the files inside the archive will look like:
CCBuilds\file1
CCBuilds\file2
CCBuilds\folder1\file3
...
or you could use ep1 to just add the files and folder structure sans the base folder with the help of recursion and defining the base path as the inner path of the structure:
Winrar.exe a -ep1 -r E:\Testing\Builds\CCBuilds.rar E:\Testing\Builds\CCBuilds\*
The files:
E:\Testing\Builds\CCBuilds\file1
E:\Testing\Builds\CCBuilds\file2
E:\Testing\Builds\CCBuilds\folder1\file3
...
inside the archive will look like:
file1
file2
folder1\file3
...
when extracted will look like:
E:\Testing\Builds\CCBuilds2\file1
E:\Testing\Builds\CCBuilds2\file2
E:\Testing\Builds\CCBuilds2\folder1\file3
...
Anyway, these are two ways -ep1 can be used to exclude base path with or without the folder containing the files (the base folder / or base path).

Related

How to get all files and folders recursively using GetMetadata activity

I need to delete many folders with an exempt list which has a list of folders and files that I should not delete. So I tried delete activity and tried to use if function of Add dynamic content to check whether the file or folder name is the same of the specified ones. But I do not have what should be the parameters #if(). In other words, to use these functions, how do we get the file name or folder name?
It is difficult to get all files and folders by using GetMetadata activity.
As workarounds, you can try these ways:
1.create a Delete activity and select List of files option. Then create a file filled with those path of files and folders need to be deleted.(relative path to the path configured in the dataset).
2.using blob SDK to do this thing.

BigQuery loading batch folders error

I'm trying to load group of folders files in one time with when
i set
sourceURI = 'gs://ybbi/bi_landing_zone/files_to_load/app/reports/app_network_analytics_report/201409011*'
all the folders that i'm want to load start with 20140911
but i get the error:
ERROR: Invalid path: gs://ybbi/bi_landing_zone/files_to_load/apn/reports/appnexus_network_analytics_report/20140901191111_3bab8ec0_092a_43de_a157_db35d1555ea0/
20140901191111_3bab8ec0_092a_43de_a157_db35d1555ea0 is one of these folders(don't know why it's print the all folder name of this specific folder)
in some other folder tree cases it's works, but in this specific folder tree it's return the same error .
i know that cloud storage don't have real folders and it's part of the name of the object, but you understand what i mean.
is it bug?
Without more information, what it looks like is that you have a object file called gs://ybbi/bi_landing_zone/files_to_load/apn/reports/appnexus_network_analytics_report/20140901191111_3bab8ec0_092a_43de_a157_db35d1555ea0/ that is not a csv/json file. Some tools may create these dummy files in order to simulate directories. BigQuery requires all objects that match the input glob path to be importable files.
One solution would be to change the glob path to include a narrower set of files. You can pass multiple paths if that makes things easier. For example, you could pass
gs://ybbi/bi_landing_zone/files_to_load/apn/reports/appnexus_network_analytics_report/20140901191111_3bab8ec0_092a_43de_a157_db35d1555ea0/*
and
gs://ybbi/bi_landing_zone/files_to_load/apn/reports/appnexus_network_analytics_report/20140901191111_some_other_path/*

Need help on Pig

I am executing a Pig script, which reads files from a directory, performs some operation and stores to some output directory. In output directory I'm getting one or more "part" files, one _SUCCESS file and one _logs directory. My questions are:
Is there any way to control the name of files generated (upon execution of STORE command) in output directory. To be specific, I don't want the names to be "part-.......". I want Pig to generate files according to the file name pattern I specify.
Is there any way to suppress the _SUCCESS file and the _log directory? Basically I don't want the _SUCCESS and _logs to be generated in the output directory.
Regards
Biswajit
See this post.
To remove _SUCCESS, use SET mapreduce.fileoutputcommitter.marksuccessfuljobs false;. I'm not 100% sure how to remove _logs but you could try SET pig.streaming.log.persist false;.

WinSCP Session::RemoveFiles - Delete specified files in sub directories

[Question] Does Session::RemoveFiles() remove files in sub directory of source directory? If not, how to implement this ability?
(Please do not ask me why I have the remote directory as /C/testTransfer/. The code just for testing purpose.)
I have a SFTP program using WinSCP .Net assembly. Program language is C++/CLI. It opens up a work file. The file contains many lines of FTP instructions.
One type of instruction I have to handle is to transfer *.txt from source directory. The source directory may contain sub directories which may contain .txt as well. Once transfer is successful, delete the source files.
I use Session::GetFiles() for the transfer. It correctly transfer all .txt files (/C/testTransfer/*.txt), even those in sub directories (/C/testTransfer/sub/*.txt), in the source to the destination.
transferOptions->FileMask = "*.txt";
session->GetFiles("/C/testTransfer", "C:\\temp\\win", false, transferOption);
Now to remove, I use session->RemoveFiles("/C/testTransfer/*.txt"). I only see *.txt in the source (/C/testTransfer/*.txt), but not in the sub directory (/C/testTransfer/sub/*.txt), are removed.
The Session::RemoveFiles can remove even files in subdirectories in general. But not this way with wildcard, because WinSCP will not descend to subdirectories that do not match the wildcard (*.txt). Also note that even if you do not need the wildcard, the Session::RemoveFiles would remove even the subdirectories themselves, what I'm not sure you want it to.
Though you have other (and better = more safe) options:
Use the remove parameter of the Session::GetFiles method to instruct it to remove source file after successful transfer.
If you need to delete source files transactionally (=only after download of all files succeed), iterate the TransferOperationResult::Transfers returned by Session::GetFiles and call the Session::RemoveFiles for each (unless the TransferEventArgs::Error is not null).
Use the TransferEventArgs::FileName to get a file path to pass to the Session::RemoveFiles. Use the RemotePath::EscapeFileMask to escape the file name before passing it to the Session::RemoveFiles.
There's a similar full example available for Moving local files to different location after successful upload.
To recursively delete files matching a wildcard in a standalone operation (not after downloading the same files), use the Session::EnumerateRemoteFiles. Pass your wildcard to its mask argument. Use the EnumerationOptions.AllDirectories option for recursion.
Call the Session::RemoveFiles for each returned file. Use the RemotePath::EscapeFileMask to escape the file name before passing it to the Session::RemoveFiles.

Why .RAR file contains different files with the same name

I got a .RAR file which contains different files with the same name.
For example,
index.txt 40 Text Document 04/01/2010 4:40PM
index.txt 22 Text Document 04/01/2010 4:42PM
index.txt 10 Text Document 04/01/2010 4:45PM
index.txt 13 Text Document 04/01/2010 4:50PM
Why?
Like said before, the files could be in separate paths, but as I'll show further, this isn't always the case.
If you use WinRAR to list the file contents and your options are set as the following, then it only appears you have files with the same name, but they are in different paths.
Options -> File list -> Flat folders view (ctrl+h)
Options -> File list -> Details
After the column CRC32, there is one called Path. If this is different, extraction shouldn't be a problem if:
Extract -> Extraction path and options -> Advanced -> Extract relative paths is set.
If it is Do not extract paths, WinRAR will need to ask you to rename them because of file system limitations.
I assume command line unrar won't be a problem in this case because you need to specify additional parameters to change its default behavior.
It is possible for a RAR archive to have multiple files with the same name in the same directory. If you use Windows, use "C:\Program Files\WinRAR\Rar.exe"
instead of rar on the command line in the following examples.
Create a new file and add it to a RAR archive. You can also check the changes by listing its contents.
rar a rarfile.rar testfile.txt
rar l rarfile.rar
rar a rarfile.rar testfile.txt
If you try to re-add this file, rar will replace the already added file with the same name.
Updating archive rarfile.rar
Updating testfile.txt OK
Done
Create an other file or rename the first one and add it to the RAR file.
move testfile.txt second.txt (new file)
rar a rarfile.rar second.txt (add it)
rar lb rarfile.rar (list archive, bare info)
Rename the second file to the first one's name.
rar rn rarfile.rar second.txt testfile.txt
This is how you create a RAR file with multiple files of the same name in the same path. These steps will be similar in WinRAR. If you try to rename the file again, the file name of all files in that directory will change too.
Why would someone want to do this?
The only explanation I can think of is that the person that created this archive wanted to imitate a version control/backup system. But if you want to extract only one specific version and it isn't the first one, WinRAR extracts the wrong file. It seems I've found a very obscure WinRAR bug :-)
Edit: seems a bad explanation after finding this in the RAR documentation:
-ver[n] File version control
Forces RAR to keep previous file versions when updating
files in the already existing archive. Old versions are
renamed to 'filename;n', where 'n' is the version number.
By default, when unpacking an archive without the switch
-ver, RAR extracts only the last added file version, the name
of which does not include a numeric suffix. But if you specify
a file name exactly, including a version, it will be also
unpacked. For example, 'rar x arcname' will unpack only
last versions, when 'rar x arcname file.txt;5' will unpack
'file.txt;5', if it is present in the archive.
If you specify -ver switch without a parameter when unpacking,
RAR will extract all versions of all files that match
the entered file mask. In this case a version number is
not removed from unpacked file names. You may also extract
a concrete file version specifying its number as -ver parameter.
It will tell RAR to unpack only this version and remove
a version number from file names. For example,
'rar x -ver5 arcname' will unpack only 5th file versions.
If you specify 'n' parameter when archiving, it will limit
the maximum number of file versions stored in the archive.
Old file versions exceeding this threshold will be removed.
they are in different paths, most likely.
try outputting the full path. or see what happens when you extract them.
you'll probably see something like:
index.txt
path1/index.txt
path2/index.txt
etc etc