Silverstripe 4 large files in UploadField - file-upload

When uploading a large file with UploadField I get the error:
"Server responded with an error.
Expected a value of type "Int" but received: 4008021167"
To set the allowed file size I used $upload->getValidator()->setAllowedMaxFileSize(6291456000);
$upload is an UploadField.
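A minimal sketch of my setup (the field name here is just illustrative):

use SilverStripe\AssetAdmin\Forms\UploadField;

$upload = UploadField::create('Attachment', 'Attachment');
// allow uploads up to 6291456000 bytes (roughly 5.9 GiB)
$upload->getValidator()->setAllowedMaxFileSize(6291456000);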
Every file larger than 2 GB gets this error; smaller files are uploaded without any error.
Where can I adjust things so that I can upload bigger files?
I remember that there has been a 2 GB limit in the past, but I don't know where to adjust it.
Thanks for your answers,
Klaus

The regular file upload limits don't seem to be the issue if you are already at 2 GB. This might be the memory limit of the process itself. I would recommend looking into chunked uploads - this allows you to process larger files.

I know this answer is late, but the problem is rooted in the GraphQL type definition of the File type: the size field is declared as Int, a 32-bit signed integer that overflows above 2,147,483,647 bytes (about 2 GiB), which is exactly the border you are seeing. I've submitted a pull request to the upstream repository. Here is a sed one-liner to patch it in the meantime:
sed -i 's/size\: Int/size\: Float/g' vendor/silverstripe/asset-admin/_graphql/types/File.yml
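In effect, the patch changes one line in vendor/silverstripe/asset-admin/_graphql/types/File.yml:

- size: Int
+ size: Float

You will likely need to flush the cache afterwards (for example /dev/build?flush=1) so the changed schema is picked up, and keep in mind that a patch to a vendor file will be lost on the next composer update until the upstream fix is released.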

Related

File type not allowed - pdf upload - HippoCMS

While uploading .pdf files bigger than 1 MB through assets in Hippo CMS, it gives the error "File type not allowed".
I have already checked the MySQL configuration and the /hippo:configuration/hippo:frontend/cms/cms-services/assetValidationService node in the Hippo console, where the default value is 10M.
So the specific question is:
How do you fix the error so that you are able to upload .pdf files bigger than 1 MB in Hippo CMS?
Check out:
http://www.onehippo.org/library/concepts/editor-interface/image-and-asset-upload-validation.html
Here you can see how to set the file size limit. Note that there is also possibly a Wicket setting you have to be aware of. Details are in the page.
Though I wouldn't expect it to return "file type not allowed" if the problem was the size of the file. Perhaps the file is not validating as a PDF?
The problem was actually in the nginx configuration on the server. The server was rejecting all files bigger than 1 MB, and after a long check of the logs the setting was changed to an appropriate size.
I also gave a vote to Jasper, since that can also be a solution and it affects the same problem.
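For reference, the nginx directive that limits request body size is client_max_body_size, which defaults to 1 MB; the value below is only illustrative:

# nginx.conf - can go in the http, server or location block
client_max_body_size 20m;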

Checksum exception when reading from or copying to HDFS in Apache Hadoop

I am trying to implement a parallelized algorithm using Apache Hadoop, however I am facing some issues when trying to transfer a file from the local file system to HDFS. A checksum exception is being thrown when trying to read from or transfer a file.
The strange thing is that some files are being successfully copied while others are not (I tried with 2 files, one slightly bigger than the other, though both are small in size). Another observation I have made is that the Java FileSystem.getFileChecksum method is returning null in all cases.
A slight background on what I am trying to achieve: I am trying to write a file to HDFS, to be able to use it as a distributed cache for the MapReduce job that I have written.
I have also tried the hadoop fs -copyFromLocal command from the terminal, and the result is exactly the same behaviour as when it is done through the Java code.
I have looked all over the web, including other questions here on Stack Overflow, but I haven't managed to solve the issue. Please be aware that I am still quite new to Hadoop, so any help is greatly appreciated.
I am attaching the stack trace below which shows the exceptions being thrown. (In this case I have posted the stack trace resulting from the hadoop fs -copyFromLocal command from terminal)
name#ubuntu:~/Desktop/hadoop2$ bin/hadoop fs -copyFromLocal ~/Desktop/dtlScaleData/attr.txt /tmp/hadoop-name/dfs/data/attr2.txt
13/03/15 15:02:51 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/03/15 15:02:51 INFO fs.FSInputChecker: Found checksum error: b[0, 0]=
org.apache.hadoop.fs.ChecksumException: Checksum error: /home/name/Desktop/dtlScaleData/attr.txt at 0
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:219)
at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:237)
at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:68)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:100)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:230)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:176)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1183)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:130)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1762)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895)
copyFromLocal: Checksum error: /home/name/Desktop/dtlScaleData/attr.txt at 0
You are probably hitting the bug described in HADOOP-7199. What happens is that when you download a file with copyToLocal, it also copies a .crc file into the same directory, so if you modify your file and then try to do copyFromLocal, it will compute a checksum of your new file, compare it to your local .crc file, and fail with a non-descriptive error message.
To fix it, check whether you have this .crc file; if you do, just remove it and try again.
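For example, with the paths from the stack trace above, the stale checksum file would typically be a hidden .attr.txt.crc sitting next to the source file:

ls -a ~/Desktop/dtlScaleData/            # look for a hidden .attr.txt.crc
rm ~/Desktop/dtlScaleData/.attr.txt.crc
bin/hadoop fs -copyFromLocal ~/Desktop/dtlScaleData/attr.txt /tmp/hadoop-name/dfs/data/attr2.txt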
I faced the same problem and solved it by removing the .crc files.
Ok so I managed to solve this issue and I'm writing the answer here just in case someone else encounters the same problem.
What I did was simply create a new file and copied all the contents from the problematic file.
From what I can presume, it looks like some .crc file is being created and attached to that particular file, so by trying with another file, another CRC check is carried out. Another reason could be that I named the file attr.txt, which could conflict with the name of some other resource. Maybe someone could expand on my answer further, since I am not 100% sure of the technical details and these are just my observations.
The CRC file holds the checksum for a particular block of data. The entire data set is split into blocks, and each block stores its metadata along with the CRC file inside the /hdfs/data/dfs/data folder. If someone modifies the CRC files, the actual and recorded CRC values will mismatch, and that causes the error. The best practice to fix this error is to overwrite the metadata file along with the CRC file.
I got the exact same problem and didn't find any solution. Since this was my first Hadoop experience, I could not follow some of the instructions on the internet. I solved this problem by formatting my namenode:
hadoop namenode -format

aws s3 - s3cmd: "WARNING: MD5 signatures do not match:" - what to do?

When I use s3cmd to pull down files (of not unreasonable size - less than 100 megabytes) I occasionally see this error:
WARNING: MD5 signatures do not match: computed=BLAH, received="NOT-BLAH"
Googling suggests that this may be caused by the way S3 segments files. Others have said to ignore it.
Does anybody know why this happens and what the right thing to do is?
Thank you for your time,
-- Henry
Looking into this more deeply, it seems as though s3cmd is reading the wrong MD5 sum from Amazon. It looks as though s3cmd is getting its sum from the ETag field. Comparing the actual data of the object that was PUT with the object that was GET'ed, the contents are identical, so this error can be safely ignored.
The ETag of a file in S3 will not match the MD5 if the file was uploaded as "multipart". When a file is uploaded as multipart, AWS will hash each part, concatenate the results, and then hash that value.
If the file does not actually have multiple parts, the result will be a hash of a hash with "-1" appended to the end. Try disabling multipart in the tool you use to upload files to S3. For s3cmd, the option is --disable-multipart.
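For example (the bucket and file names are placeholders):

s3cmd put --disable-multipart largefile.dat s3://my-bucket/largefile.dat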
ETags with a '-' in them are expected, if the file was uploaded using the S3 Multipart Upload feature (typically used for files >15MB or files read from stdin). s3cmd 1.5.2 knows this and ignores such ETags. If your s3cmd is older than 1.5.2, please upgrade.
This is a bigger problem if you are using s3cmd sync, because it causes it to re-download previously synced files. To solve this, add the --no-check-md5 option, which causes s3cmd to only check file sizes to determine changed files (this is good for my purposes, but probably not for everyone, depending on the application).
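For example (the bucket and paths are placeholders):

s3cmd sync --no-check-md5 s3://my-bucket/backups/ /local/backups/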
I saw reports about an hour ago that S3 is currently having exactly this problem, e.g. this tweet:
RT #drags: #ylastic S3 returning incorrect md5s to s3cmd as well. Never seen an md5 with a '-' in it, until AWS. #AWS #S3
Though the AWS Status Page reports no issue, I expect this is a transient problem. Try again soon :-)

What do these Liferay config params actually mean?

In my portal-ext.properties file, I found these parameters. I don't remember why I put them into the config file; I think I simply copied them from some other web page where someone said they would help.
There are comments explaining what the parameters do, but I still don't understand the underlying issues.
How can uploaded data be serialized extraneously?
Why are files > 10 MB considered excessively large, and why do they have to be cached?
#Set the threshold size to prevent extraneous serialization of uploaded data.
com.liferay.portal.upload.LiferayFileItem.threshold.size=262144
#Set the threshold size to prevent out of memory exceptions caused by caching excessively
#large uploaded data. Default is 1024 * 1024 * 10.
com.liferay.portal.upload.LiferayInputStream.threshold.size=10485760
These properties come into play when you have file upload functionality in your portal.
When you upload a larger file, it needs to be written to a temporary file on the disk.
Since part of the file upload process is to hold the file in memory before writing it to the disk/database, keeping larger files entirely in memory must be avoided, and this threshold prevents out-of-memory exceptions.
If you want to know more details on this, please go through this link.
Liferay's Document Library uses other properties to restrict the file size, such as:
dl.file.max.size=3072000
These properties are connected with the maximum file size for upload (e.g. for the Document Library); 3072000 bytes is roughly 3 MB. However, these seem to be the default values.

PHP $_POST / $_FILES empty when upload larger than POST_MAX_SIZE [duplicate]

Possible Duplicate:
How to detect if a user uploaded a file larger than post_max_size?
I'm writing a script that handles file uploads from a web application. I've got a set limit on the size of files that may be uploaded to my application (storage space limitations). I'm currently trying to put some validation code in that will check to make sure the user actually uploaded a file, so that I can display a nice error message to them. But I'd also like to be able to display an error message to the user if they've uploaded a file that's too big. I can use Javascript for this, but I'd like a PHP check as well in case they don't have Javascript enabled.
I've set my POST_MAX_SIZE var in PHP.ini to be the maximum file upload size, but this has produced an unexpected issue. If someone tries to upload a file larger than the POST_MAX_SIZE, the binary data just gets truncated at the max size, and the $_FILES array doesn't contain an entry for that file. This is the same behavior that would occur if the user didn't submit a file at all.
This makes it difficult to tell why the $_FILES array doesn't contain a file, i.e. whether it wasn't ever uploaded, or whether it was too big to send completely.
Is there a way to distinguish between these two cases? In other words, is there a way to tell whether POST data was sent for a file, but was truncated prematurely before the entire file was sent?
Odd as it may seem, this is intentional behavior. POST_MAX_SIZE is a low-level, last-resort failsafe: to protect you and prevent DoS attacks, there's no way the server can do anything but discard all POST data when it realizes, mid-stream, that it's receiving more data than it can safely handle. You can raise this value if you know you need to receive more data than this at once (but be sure your server can handle the increased load this will put on it), but I'd suggest looking into other ways of handling your use case; hitting up against POST_MAX_SIZE suggests to me that there might be more robust solutions than one massive HTTP POST, such as splitting it up into multiple AJAX calls, for instance.
Separate from POST_MAX_SIZE there is UPLOAD_MAX_FILESIZE, which is the php.ini setting for a single-file limit, and which is what I assumed you were talking about initially. It limits the size of any one uploaded file, and if a file exceeds this value, it will set $_FILES['file']['error'] to 1 (UPLOAD_ERR_INI_SIZE). Generally speaking, you want to have your site set up like this:
The <form> MAX_FILE_SIZE hidden field should be set to the maximum you actually want to accept for this form. While any user attempting to exploit your site can get around this, it's nice for users actually using your site, as the browser will (actually, could) prevent them from wasting bandwidth attempting to upload it. This should always be smaller than your server-side settings.
UPLOAD_MAX_FILESIZE is the maximum size the server will accept, discarding anything larger and reporting the error to the $_FILES array. This should be larger than the largest file you want to actually accept throughout your site.
POST_MAX_SIZE is the maximum amount of data your server is willing to accept in a single POST request. This must be bigger than UPLOAD_MAX_FILESIZE in order for large uploads to succeed, and must be much bigger to allow more than one file upload at a time. I might suggest a value of UPLOAD_MAX_FILESIZE * 4.1 - this will allow four large files at a time, along with a little extra data. YMMV of course, and you should ensure your server can properly handle whatever values you decide to set.
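A minimal php.ini sketch of the sizing advice above (the values are placeholders, not recommendations):

; allow single files up to 100 MB
upload_max_filesize = 100M
; roughly upload_max_filesize * 4.1, leaving room for several files plus form data
post_max_size = 410M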
To your specific question of how to tell, the PHP documentation on POST_MAX_SIZE that I linked to suggests setting a GET variable in the form action, i.e.
<form action="edit.php?processed=1" method="post" enctype="multipart/form-data">
However like I said above, if you're running into this issue, you may want to explore alternative upload methods.
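A minimal sketch of that check, assuming the form posts to edit.php?processed=1 as above:

<?php
// edit.php (hypothetical handler for the form above)
if (isset($_GET['processed']) && empty($_POST) && empty($_FILES)) {
    // The form was submitted (the query-string flag arrived), but PHP discarded
    // the request body, so the POST exceeded post_max_size.
    echo 'Upload failed: the file was larger than the server allows.';
}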
Something like this:
if (isset($_SERVER['CONTENT_LENGTH']) && $_SERVER['CONTENT_LENGTH'] > 0 && !$_FILES && !$_POST) {
    // data was sent, but PHP discarded it - the upload exceeded post_max_size
}
Untested, so play around with the various scenarios to see what combination works. Not sure if it works with multiple file uploads at the same time.
You may need to inspect $_SERVER['CONTENT_LENGTH'] and compare it to the sum of files received if dealing with multiple uploads.