How to create an HTTP request that contains multiple FileHeaders? - file-upload

I am trying to test an upload service that supports uploading multiple files, and I found this:
golang POST data using the Content-Type multipart/form-data
which explains how to create a request to upload a single file, but I need to upload multiple files. Is there a simple way to create this kind of request?
update:
Please check lines 38 and 39 in the linked post, which support HTML5 multiple-file uploading:
line 38 files := m.File["myfiles"]
line 39 for i, _ := range files {
It seems that a single name needs to be set for multiple file headers to simulate HTML5 multiple-file uploading.

For each file, call CreateFormFile to create the header for the file. Call Write on the writer returned from CreateFormFile one or more times to write data to the file. When done with all files, close the multipart writer.
The top answer in the linked question uploads two files, one named "image" and one named "key". The data for the "image" is copied from a file. The data for "key" is simply the bytes "KEY".
The field name is the first argument to CreateFormFile. If you want to upload multiple files with the same name, use the same name each time you call CreateFormFile.
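For illustration, here is a minimal sketch; the endpoint URL and the file names are placeholders, and the field name "myfiles" is taken from the question:

package main

import (
	"bytes"
	"io"
	"log"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"
)

// newMultiFileRequest builds a POST request whose body carries every file
// in paths under the same field name, mirroring an HTML5
// <input type="file" name="myfiles" multiple> submission.
func newMultiFileRequest(url, fieldName string, paths []string) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	for _, p := range paths {
		f, err := os.Open(p)
		if err != nil {
			return nil, err
		}
		// Reusing the same field name yields multiple FileHeaders under
		// one key on the server side (m.File["myfiles"]).
		part, err := w.CreateFormFile(fieldName, filepath.Base(p))
		if err == nil {
			_, err = io.Copy(part, f)
		}
		f.Close()
		if err != nil {
			return nil, err
		}
	}
	// Close writes the trailing boundary; call it before sending.
	if err := w.Close(); err != nil {
		return nil, err
	}
	req, err := http.NewRequest("POST", url, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newMultiFileRequest("http://example.com/upload", "myfiles", []string{"a.txt", "b.txt"})
	if err != nil {
		log.Fatal(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
}

On the server, r.ParseMultipartForm followed by r.MultipartForm.File["myfiles"] then yields one *multipart.FileHeader per uploaded file, which is exactly what an HTML5 multiple-file form produces.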

Related

Scrapy upload files to dynamically created directories in S3 based on field

I've been experimenting with Scrapy for some time now and recently have been trying to upload files (data and images) to an S3 bucket. If the directory is static, it is pretty straightforward and I didn't hit any roadblocks. But what I want to achieve is to dynamically create directories based on a certain field from the extracted data and place the data & media in those directories. The template path, if you will, is below:
s3://<bucket-name>/crawl_data/<account_id>/<media_type>/<file_name>
For example if the account_id is 123, then the images should be placed in the following directory:
s3://<bucket-name>/crawl_data/123/images/file_name.jpeg
and the data file should be placed in the following directory:
s3://<bucket-name>/crawl_data/123/data/file_name.json
I have been able to achieve this for the media downloads (a somewhat crude way to segregate media types, as of now) with the following custom files pipeline:
import os
from urllib.parse import urlparse

from itemadapter import ItemAdapter
from scrapy.pipelines.files import FilesPipeline


class CustomFilepathPipeline(FilesPipeline):
    def file_path(self, request, response=None, info=None, *, item=None):
        adapter = ItemAdapter(item)
        account_id = adapter["account_id"]
        file_name = os.path.basename(urlparse(request.url).path)
        if ".mp4" in file_name:
            media_type = "video"
        else:
            media_type = "image"
        file_path = f"crawl_data/{account_id}/{media_type}/{file_name}"
        return file_path
The following settings have been configured at the spider level with custom_settings:
custom_settings = {
    'FILES_STORE': 's3://<my_s3_bucket_name>/',
    'FILES_RESULT_FIELD': 's3_media_url',
    'DOWNLOAD_WARNSIZE': 0,
    'AWS_ACCESS_KEY_ID': <my_access_key>,
    'AWS_SECRET_ACCESS_KEY': <my_secret_key>,
}
So, the media part works flawlessly and I have been able to download the images and videos into their separate directories in the S3 bucket, based on the account_id. My question is:
Is there a way to achieve the same results with the data files as well? Maybe another custom pipeline?
I have tried to experiment with the first example on the Item Exporters page but couldn't make any headway. One thing that I thought might help is to use boto3 to establish a connection and then upload the files, but that would probably require me to segregate the files locally and upload them together, using a combination of pipelines (to split the data) and signals (to upload the files to S3 once the spider is closed).
Any thoughts and/or guidance on this or a better approach would be greatly appreciated.
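For what it's worth, the pipelines-plus-signals idea described above can be sketched as an ordinary item pipeline. This is only an illustration: the bucket placeholder and the account_id field come from the question, while the pipeline name, key layout, and credential handling are assumptions. It buffers items in memory and uploads with boto3 when the spider closes:

import json
from collections import defaultdict

import boto3


class AccountDataExportPipeline:
    """Buffer items per account_id, then push one JSON file per account
    to S3 when the spider closes (close_spider plays the role of the
    spider_closed signal here)."""

    def open_spider(self, spider):
        self.buffers = defaultdict(list)
        # Assumes AWS credentials are available in the environment.
        self.s3 = boto3.client("s3")

    def process_item(self, item, spider):
        self.buffers[item["account_id"]].append(dict(item))
        return item

    def close_spider(self, spider):
        for account_id, rows in self.buffers.items():
            self.s3.put_object(
                Bucket="<my_s3_bucket_name>",  # placeholder, as in the question
                Key=f"crawl_data/{account_id}/data/items.json",
                Body=json.dumps(rows).encode("utf-8"),
            )

Since put_object uploads straight from memory, nothing has to be segregated locally first; for very large crawls you would still want to spill the buffers to disk instead.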

Repast: how to add and set a new parameter directly from the code instead of GUI

I want to create a parameter that contains a list of strings (a list of hub codes). This list of strings is created by reading an external CSV file (the list could contain different codes depending on the hub codes in the CSV file).
What I want is to find an easy, automatic way to perform batch runs for each hub code in the list.
So my questions are:
1) How to add and set a new parameter directly from the code (during initialization, when reading the CSV) instead of from the GUI parameter panel?
2) How to avoid manual configuration of the hub list in the batch-run configuration?
Something like this for adding the parameters should work in your ContextBuilder:
Parameters params = RunEnvironment.getInstance().getParameters();
((DefaultParameters)params).addParameter("foo", "Big Foo", Integer.class, 3, false);
You would read the csv file to get the parameter name and value.
I'm not sure I completely understand the batch-run configuration question, but each batch run has a run number associated with it:
RunState.getInstance().getRunInfo().getRunNumber()
If you can associate line numbers in your csv parameter file with run number (e.g. run number 1 should use line 1, and so on), then each batch run would use a different parameter line.
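Putting the two pieces together, a rough sketch that you could call from your ContextBuilder might look like the following; the file name hubCodes.csv, its one-code-per-line layout, and the hubCode parameter name are all assumptions:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

import repast.simphony.engine.environment.RunEnvironment;
import repast.simphony.engine.environment.RunState;
import repast.simphony.parameter.DefaultParameters;
import repast.simphony.parameter.Parameters;

public final class HubCodeParameterHelper {

    // Select this batch run's hub code by run number and register it
    // as a parameter, so no manual GUI configuration is needed.
    public static void addHubCodeParameter() {
        int runNumber = RunState.getInstance().getRunInfo().getRunNumber();
        List<String> hubCodes;
        try {
            hubCodes = Files.readAllLines(Paths.get("hubCodes.csv"));
        } catch (IOException e) {
            throw new RuntimeException("could not read hub codes CSV", e);
        }
        // Run number 1 uses line 1, run number 2 uses line 2, and so on.
        String hubCode = hubCodes.get((runNumber - 1) % hubCodes.size());
        Parameters params = RunEnvironment.getInstance().getParameters();
        ((DefaultParameters) params).addParameter("hubCode", "Hub Code", String.class, hubCode, false);
    }
}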

Fetch images which are stored in filestore odoo 11

How can I fetch images which are stored in the filestore in Odoo 11?
I am trying to fetch the product.template image, which is stored in ir_attachment in the format 39/39abfeca081b17a6b93fbeaeead3e34025a39f9c.
This is not binary data. I tried using this value in a URL, but it didn't give any image. Later I understood that it is a path in the filestore: when you download a database in zip format and extract it, you will see the filestore, where "39" is a folder name and 39abfeca081b17a6b93fbeaeead3e34025a39f9c is an image file name.
My requirement is that the product image will be fetched from another application. How can I store it in the database as binary data so that other applications can fetch that binary data and get the image?
Thanks in Advance.
The files stored in an Odoo filestore are regular files: they can be opened by OS programs and read by any other application as bytes of data, like any other file on your computer. If you want the value of the file in base64 format, you can build the URL for that file from the id of the stored attachment, make a call to the Odoo instance, and encode the content you get back.
The URL format is like:
http://example.com/web/content/5
where 5, at the end of the URL, is the id of the attachment.
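As a minimal sketch, assuming the attachment is readable without a login session (otherwise you would authenticate first), fetching attachment id 5 and converting it to the base64 form Odoo uses for binary fields could look like:

import base64

import requests

# /web/content/<id> serves the raw bytes of the stored attachment.
resp = requests.get("http://example.com/web/content/5")
resp.raise_for_status()

# Re-encode the bytes as base64, the format Odoo keeps in binary fields,
# so another application can store or compare the value directly.
image_b64 = base64.b64encode(resp.content).decode("ascii")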

How to handle file inputs with changing schemas in Talend

Question: How do I continue to process files that differ substantially from a base schema and that trigger tSchemaComplianceCheck errors?
Background
Suppose I have a folder with Customer xls files called file1, file2, ..., file1000. Assume I have imported the file schema into the Talend repository, called it 6Columns, and configured the Talend job to iterate through each of the files and process them:
1-tFileInput -> 2-tSchemaCompliance-6Columns -> 3-tMap -> 4-FurtherProcessing
Read each excel file
Compare it to the schema 6Columns
Format the output (rename columns)
Take the collection of Customer data and process it further
While processing, I notice that the schema compliance check is generating errors (errorCode 16) pointing to a number of files (200) with a different schema, 13Columns, but there isn't a way to identify those files in advance to filter them into a subjob.
How do I amend my processing to correctly integrate the files with the 13Columns schema into the process (what's the recommended way of handling this), and how do I design it in case other schema changes occur?
1-tFileInput -> 2-tSchemaCompliance-6Columns -> 3-tMap -> 4-FurtherProcessing
                       |
                       | Reject flow (errorCode 16)
                       | Schema: 13Columns
                       |
                       |-> ??
Current thinking when errorCode 16 is detected:
Option 1 (parallel): take the file path of the current file and process it against 13Columns using a new file input, before merging the two flows back into one.
Option 2 (serial): collect the list of files that triggered the error and process them after I've finished with the compliant files?
You could try something like this:
tFileList: read your input repository.
tFileInput "schema6" - tSchemaComplianceCheck: read files with the 6-column schema.
tMap_1: further processing.
In the reject part:
tMap after the reject link: add a new column containing the file path that has been rejected (see the expression sketch after this list).
tFlowToIterate: used to get an iterate link, an acceptable input for the tFileInputDelimited that follows.
tFileInput: read data with the 13-column schema. The following components are the same as in part 1.
After that, you can push your data to tHashOutput, in order to read it back in another subjob.
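For the "add a new column" step in that tMap, a typical expression is the current file path that tFileList publishes to globalMap; the component name tFileList_1 is an assumption and should match your job:

// Expression for the new file-path output column in the tMap after the reject link
((String) globalMap.get("tFileList_1_CURRENT_FILEPATH"))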

Rackspace Cloud Files PHP get_objects at the "Root level"

I have been trying to figure out how to get the files that are at the root level, meaning all the files that don't have a path prefix in their object name.
I have a container that looks like this:
image.png image/png
ui application/directory
ui/css application/directory
ui/css/test.css text/css
ui/image2.jpg image/jpg
I'm using the call
Container->get_objects(0, null, null, 'ui/');
which returns 2 CF_Objects:
ui/css
ui/image2.jpg
This is the desired output, but if I request the files at the "root level" with
Container->get_objects(0, null, null, '/');
it returns an empty array, and
Container->get_objects(0, null, null, '');
returns all the files in the container.
Ideally it would return two CF_Objects: image.png and ui.
Is there a way to do this?
Thank you!
The Cloud Files Developer Guide of Nov 15, 2011, page 20, says:
You can also use a delimiter parameter to represent a nested directory hierarchy without the need for the directory marker objects. You can use any single character as a delimiter. The listings can return virtual directories - they are virtual in that they don't actually represent real objects. Like the directory markers, though, they will have a content-type of application/directory and be in a subdir section of JSON and XML results.
If you have the objects photos/photo1, photos/photo2, movieobject, and videos/movieobj4 in a container, your delimiter parameter query using slash (/) would give you photos, movieobject, and videos.
The parameter "delimiter" is not supported by the get_objects in the PHP SDK, and using it seems to be the only way to get the base directory files.
There is currently a merge request on GitHub [this request has since been approved] adding this particular parameter to the get_objects method.
Other users of the Rackspace Cloud Files API PHP SDK have also added support for this parameter.
See if the original php-cloudfiles repo gets updated, or create a fork of the original and add your own code. If you don't feel comfortable adding your own changes, clone a fork that has already added the delimiter parameter, like
https://github.com/michealmorgan/php-cloudfiles
or
https://github.com/onema/php-cloudfiles
The merge request referenced in the answer was approved on May 09, 2012: an optional $delimiter parameter was added to get_objects ...
However, an error was introduced into the code at some other point which falsely reports that the container name is not set if one tries to use any of the optional parameters.
A request has been put in to correct this error.
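With a fork that includes the change, the root-level listing from the question could then look roughly like this; it is only a sketch that assumes $delimiter was added as a fifth argument to get_objects, so check your fork's actual signature:

<?php
// No prefix/path, '/' as the delimiter: nested names collapse into
// virtual directories, so this should return image.png and ui.
$objects = $container->get_objects(0, null, null, null, '/');
foreach ($objects as $obj) {
    echo $obj->name . "\n";
}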