Download Multiple Attachments from Salesforce using Jitterbit

I am able to create a query for attachments and download 1 individual file like this:
SOQL:
SELECT Body, Id FROM Attachment WHERE Id = '00P4M00000q8ChI'
Code on Body:
<trans>$content = root$transaction.response$body$queryResponse$result$records.Attachment$Body$;
$decoded_content=Base64Decode($content);
WriteFile("<TAG>Targets/Files/FMLA _Extract</TAG>",$decoded_content);
</trans>
But when multiple attachments are pulled, it creates one large file. This large file sometimes shows the first page, but most of the time Adobe is not able to read it. Instead, I would like to have a separate file for each attachment in my target directory.
Thank you in advance for your help!
Target file:
FMLA_Extract

What does your file target look like (Targets/File/FMLA_Extract)? I'm guessing it's configured to append to existing files and you're not changing the file name, so every attachment gets written into the same file, one on top of another.
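The underlying idea, one output file per attachment with the record Id in the file name, can be sketched outside Jitterbit. This is only an illustration in Python, not Jitterbit script: the function name write_attachments, the record layout (dicts with "Id" and "Body" keys, mirroring the SOQL fields above), and the FMLA_Extract_ prefix are assumptions.

```python
import base64
from pathlib import Path


def write_attachments(records, target_dir):
    """Decode each record's Base64 Body and write it to its own file.

    `records` is assumed to be an iterable of dicts with "Id" and "Body"
    keys, matching the fields selected in the SOQL query.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for record in records:
        # The Id makes the name unique, so no file is appended to or overwritten.
        out_path = target / f"FMLA_Extract_{record['Id']}"
        out_path.write_bytes(base64.b64decode(record["Body"]))
        written.append(out_path)
    return written
```

In Jitterbit itself the equivalent change would be to make the target file name vary per record instead of using a fixed name.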

Get File Structure from Get Metadata in ADF

I want to get the column names for a parquet file. I have a Get Metadata module in my pipeline and it is using a parquet dataset with only the root folder provided. Because only the folder is provided ADF is not letting me get the file structure that contains the column names. The file name is not provided because that can change. Can anyone provide some advice on how to approach this?
You will need 2 Get Metadata activities and a ForEach activity to get the file structure if your file name is not the same every time.
Source dataset:
Parameterize the file name as the name changes frequently.
Get Metadata1:
In the first Get Metadata activity, get the file name dynamically.
To match a specific pattern, add an expression to the file name; use an asterisk (*) if there is no specific pattern or if more than one file in the folder needs to be processed.
Set the field list to Child items to get the files in the folder.
Output of Get Metadata1: Get the file name from the folder.
ForEach activity:
Using the ForEach activity, you can get the item's name listed inside the Get Metadata activity output array.
Get Metadata2:
Add a Get Metadata activity inside the ForEach activity to get the file structure (column list) of the current file from the folder. The loop runs once per item in the folder (one or more).
You can parameterize the file name in the dataset, use one Get Metadata activity to get the list of files in the folder, and then use a second Get Metadata activity to get the list of columns for each of those files.

Is there a way to list directories using PySpark in a notebook?

I'm trying to see every file in a certain directory, but since each file in the directory is very large, I can't use sc.wholeTextFiles or sc.textFile. I want to just get the file names, and then pull a file if needed in a different cell. I can access the files just fine using Cyberduck, and it shows the names there.
Ex: I have the link for one set of data at "name:///mainfolder/date/sectionsofdate/indiviual_files.gz", and it works. But I want to see the names of the files in "/mainfolder/date" and in "/mainfolder/date/sectionsofdate" without having to load them all via sc.textFile or sc.wholeTextFiles. Both of those functions work, so I know my keys are correct, but loading the files takes too long.
Since the list of files can be retrieved by a single node, you can just list the files in the directory. Look at this response.
wholeTextFiles returns tuples of (path, content), but I don't know whether the content is loaded lazily, so that you could read only the first element of each tuple.
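One way to list names on the driver without reading any file contents is to go through the Hadoop FileSystem API that Spark already uses. The sketch below assumes an existing SparkContext named sc; it relies on sc._jvm and sc._jsc, which are internal PySpark attributes rather than public API, and the function name list_paths is made up for illustration.

```python
def list_paths(sc, directory):
    """Return the paths directly under `directory` without reading file contents.

    Assumes `sc` is a live SparkContext. `sc._jvm` and `sc._jsc` are
    internal attributes, so this may break between Spark versions.
    """
    hadoop_conf = sc._jsc.hadoopConfiguration()
    path = sc._jvm.org.apache.hadoop.fs.Path(directory)
    # Resolve the file system from the path's scheme and the cluster config.
    fs = path.getFileSystem(hadoop_conf)
    # listStatus returns one entry per file or subdirectory (not recursive).
    return [status.getPath().toString() for status in fs.listStatus(path)]
```

In a notebook this would be called as list_paths(sc, "name:///mainfolder/date"), and again on any subdirectory of interest, before loading a single file with sc.textFile in a later cell.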

osquery - How can I retrieve a file origin using osquery?

I'm using osquery on Windows and I need help: I want to retrieve the origin of a specific file. For example, if I download a file from http://example.com, I'm looking for an osquery query that shows me that I downloaded that specific file from http://example.com (or something like this). I thought I could derive this information by comparing timestamps between the file table and the routes table, but routes has no timestamp column. How can I do that?
I don't see a table for this on Windows, although the information is available on the system through ADS, i.e. alternate data streams (see this answer). I would open an issue for this on the osquery repo; it would be a valuable table to have.
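Until such a table exists, the stream can be read outside osquery. On NTFS, downloaded files usually carry a Zone.Identifier alternate data stream in an INI-like format with a [ZoneTransfer] section. The sketch below only parses that text; the function name parse_zone_identifier is made up for illustration, and which keys are present (for example ZoneId or HostUrl) depends on the browser that wrote the stream.

```python
import configparser


def parse_zone_identifier(text):
    """Parse the text of a Zone.Identifier stream into a dict.

    Returns an empty dict if there is no [ZoneTransfer] section.
    """
    # Disable interpolation so '%' in URLs is kept literally, and split
    # only on '=' so ':' inside URLs is not treated as a delimiter.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key case, e.g. "HostUrl"
    parser.read_string(text)
    if not parser.has_section("ZoneTransfer"):
        return {}
    return dict(parser["ZoneTransfer"])
```

On Windows, the stream text can typically be obtained by opening the file path with ":Zone.Identifier" appended, assuming the file lives on an NTFS volume and the stream was not stripped.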
You can use the extended_attributes table. For example:
osquery> select path, key, value, base64 from extended_attributes where path ='/Users/victor/Downloads/osqueryi.zip';
path = /Users/victor/Downloads/osqueryi.zip
key = com.apple.lastuseddate#PS
value = eynzWgAAAAAbZEQgAAAAAA==
base64 = 1
path = /Users/victor/Downloads/osqueryi.zip
key = where_from
value = https://files.slack.com/files-pri/T04QVKUQG-FALAL3WP2/download/osqueryi.zip
base64 = 0
osquery>
+1 on what @groob mentioned, this would be a nice table to have and I think we've wanted it for some time. I thought we already had an issue cut for this, but I went ahead and made a new one, as simple searches weren't turning anything up. Thanks for the question :)
https://github.com/facebook/osquery/issues/5250

SharePoint REST API getFolderByServerRelativeUrl Returns Nothing

I would like to drill into the library and then into a specified folder, but I am having problems getting 'getFolderByServerRelativeUrl' to grab anything for me.
This http://_base/_api/web/getFolderByServerRelativeUrl('LibName')/files returns zero results. But if I use http://_base/_api/web/lists/getbytitle('LibName')/items it returns multiple items.
The /_api/web/lists/getbytitle('<list title>')/items endpoint returns all list items within a library, whereas /_api/web/getFolderByServerRelativeUrl('<url>')/files returns only the files located directly under the specified folder (one level down only).
Example
Assume the following Documents library structure:
Documents (library)
|
Guides (folder)
|
SharePoint User Guide.docx (file)
Then, the following request:
/_api/web/lists/getbytitle('Documents')/items will return 2 items:
list item associated with Guides folder
list item associated with file
At the same time, the request /_api/web/getFolderByServerRelativeUrl('Documents')/files
returns 0 files, since there are no files directly in the root folder,
but the request with the folder URL provided,
/_api/web/getFolderByServerRelativeUrl('Documents/Guides')/files
returns the SharePoint User Guide.docx file.
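As a small illustration of the folder-scoped request, the helper below builds the files URL for a given folder path. It is a sketch only: the function name folder_files_url is hypothetical, it does not send the request or handle authentication, and it does not escape apostrophes inside folder names.

```python
from urllib.parse import quote


def folder_files_url(site_url, folder_path):
    """Build the REST URL that lists files directly inside `folder_path`."""
    # Percent-encode spaces and similar characters, but keep '/' separators.
    encoded = quote(folder_path, safe="/")
    base = site_url.rstrip("/")
    return f"{base}/_api/web/getFolderByServerRelativeUrl('{encoded}')/files"
```

For the example library above, folder_files_url("http://_base", "Documents/Guides") yields the request that returns the document, while passing "Documents" yields the request that returns no files.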
You're using the wrong URL. Try appending it to the other URL.

Single file versioning best practices?

The user selects rather hefty single XML files via an NSOpenPanel. The application makes moderate changes to the file, so I'd like to include an option to create a backup in a subfolder of the directory the original file was selected from. Creating the new subfolder is no problem, but does anybody have a good way to create a backup of said foo.xml? Is there an established practice for this, or is it as simple as creating a duplicate and renaming it foo.back01.xml?
I'm not sure how well this approach fits your requirement, but this is what I was doing:
-- Keep a directory in the system's temporary folder, on the assumption that all of these files are deleted once the application is closed.
-- To keep file names unique, generate them with the following pattern, using a function such as [+(NSString *) generateFileNameForExtension:(NSString *)extension Create:(bool)bCreate]
Given .xml and false as input, it might return a file name like this:
AppName128908765445.xml, i.e. [AppName][UTCTimeStamp].[FileExtension]
-- Once you are done with a file, call a function such as [self addToDeleteList:(NSString *)fileName], which adds the file to a delete list.
-- A separate function starts a timer that fires every minute, reads all the files that have been added to the delete list, and deletes them.
I can share the code with you if needed...
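For the backup-in-a-subfolder variant asked about in the question, the timestamp naming idea can be sketched as follows. This is Python rather than Cocoa, purely to show the logic; the function name backup_file, the "Backups" subfolder name, and the timestamp format are all assumptions.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def backup_file(original, subfolder="Backups"):
    """Copy `original` into a subfolder next to it, under a timestamped name."""
    source = Path(original)
    backup_dir = source.parent / subfolder
    backup_dir.mkdir(exist_ok=True)
    # A UTC timestamp with microseconds keeps successive backups distinct.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    destination = backup_dir / f"{source.stem}.{stamp}{source.suffix}"
    # copy2 preserves modification times along with the contents.
    shutil.copy2(source, destination)
    return destination
```

Compared with a counter such as foo.back01.xml, a timestamp avoids scanning the folder for the next free number, and the names still sort chronologically.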