JasperReports taking a long time on the server, but not locally - pdf

I am trying to generate PDF files with DynamicReports, which uses JasperReports in the background. I tried generating 50 PDF files on my local machine. It worked fine; it took around 35 seconds to generate the files and upload them to my AWS S3 client.
After uploading the JAR to the server, the server takes 50 seconds to generate each PDF file, so in total it takes 2500 seconds for the 50 files to be generated and uploaded to S3. Each file is around 3 or 4 KB.
Update:
I tried to debug further on the server. As far as I could tell, the toPdf function takes a lot of time. I am not sure whether this is because of toPdf itself or because of storing the file stream on the server.
Code Snippet:
try {
    String directory = "/home/ubuntu/generatedFiles";
    for (Invoice eachInvoice : invoiceList) {
        JasperReportBuilder report = design.build(eachInvoice);
        String fileName = "invoice_" + eachInvoice.getId() + ".pdf";
        String filePath = directory + "/" + fileName;
        try (FileOutputStream stream = new FileOutputStream(filePath)) {
            report.toPdf(stream);
        }
        client.uploadDocument(filePath, fileName);
    }
    // Delete all the files in the generatedFiles folder
    File file = new File(directory);
    String[] invoiceFiles;
    if (file.isDirectory()) {
        // Delete the files from the machine
    }
}
catch (DRException | IOException e) {
    e.printStackTrace();
}
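One way to tell where the time goes would be to render the PDF into memory first and time the two phases separately. A rough sketch of the loop body (reusing report, client, filePath and fileName from above; ByteArrayOutputStream, Files and Paths are standard JDK imports, and IOException handling is omitted):

long t0 = System.nanoTime();
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
report.toPdf(buffer);                                    // phase 1: render the PDF in memory only
long renderMs = (System.nanoTime() - t0) / 1_000_000;

t0 = System.nanoTime();
Files.write(Paths.get(filePath), buffer.toByteArray());  // phase 2: write the bytes to disk
client.uploadDocument(filePath, fileName);               // phase 3: upload to S3
long ioMs = (System.nanoTime() - t0) / 1_000_000;

System.out.println(fileName + ": render=" + renderMs + " ms, write+upload=" + ioMs + " ms");

If the render phase dominates on the server but not locally, the problem would be in report generation itself rather than in disk or S3 I/O.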
If anyone can help me through this, it would be awesome!

Related

How to reduce the execution time of uploading many files to Dropbox in Python?

I have 100 files (.csv) inside a folder and I want to upload these files to Dropbox. I have successfully done this with the following code, but the problem is that the execution takes a long time. So my question is: how can I reduce the execution time of uploading these files to Dropbox in Python?
With my thanks and appreciation
import os
import dropbox

dbx = dropbox.Dropbox('Access token')
filename = os.listdir('path')
for fn in filename:
    with open(fn, 'rb') as f:
        dbx.files_upload(f.read(), '/sendTOcloud/' + fn + '.csv', mute=True)
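The usual remedy when many small, independent uploads take too long is to run several of them concurrently instead of strictly one after another. As a rough, language-agnostic sketch of that pattern (written in Java here; uploadOne is a hypothetical placeholder for whatever SDK call performs a single upload, such as files_upload above):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ParallelUpload {

    // Hypothetical stand-in for a single upload call (files_upload in the question).
    static void uploadOne(Path file) {
        // e.g. read Files.readAllBytes(file) and send it to the remote service
    }

    public static void main(String[] args) throws Exception {
        List<Path> files;
        try (Stream<Path> paths = Files.list(Paths.get(args[0]))) {
            files = paths.collect(Collectors.toList());
        }

        // A small thread pool uploads several files at the same time.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (Path f : files) {
            pool.submit(() -> uploadOne(f));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS); // wait for all uploads to finish
    }
}

The right pool size depends on bandwidth and the service's rate limits, so 8 is only an illustrative value.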

Flink on EMR - no output, either to console or to file

I'm trying to deploy my Flink job on AWS EMR (version 5.15 with Flink 1.4.2). However, I could not get any output from my stream.
I tried to create a simple job:
object StreamingJob1 {
  def main(args: Array[String]) {
    val path = args(0)
    val file_input_format = new TextInputFormat(
      new org.apache.flink.core.fs.Path(path))
    file_input_format.setFilesFilter(FilePathFilter.createDefaultFilter())
    file_input_format.setNestedFileEnumeration(true)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val myStream: DataStream[String] =
      env.readFile(file_input_format,
                   path,
                   FileProcessingMode.PROCESS_CONTINUOUSLY,
                   1000L)
        .map(s => s.split(",").toString)

    myStream.print()

    // execute program
    env.execute("Flink Streaming Scala")
  }
}
And I executed it using the following command:
HADOOP_CONF_DIR=/etc/hadoop/conf; flink run -m yarn-cluster -yn 4 -c my.pkg.StreamingJob1 /home/hadoop/flink-test-0.1.jar hdfs:///user/hadoop/data/
There was no error, but there was also no output on the screen except Flink's INFO logs.
I tried to output to a Kinesis stream and to an S3 file; nothing was recorded.
myStream.addSink(new BucketingSink[String](output_path))
I also tried to write to an HDFS file. In this case a file was created, but with size = 0.
I am sure that the input file has been processed, based on a simple check:
myStream.map(s => {"abc".toInt})
which generated an exception.
What am I missing here?
It looks like stream.print() doesn't work on EMR.
Output to file: HDFS is used, and sometimes (or most of the time) I need to wait for the file to be updated (see the sketch below).
Output to Kinesis: I had a typo in my stream name. I don't know why I didn't get any exception for the non-existent stream; however, after correcting the name, I got my expected messages.
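On the HDFS delay: the BucketingSink buffers records into in-progress part files and only rolls a file once its batch-size threshold is reached (384 MB by default), so small test streams can appear empty for a long time. A minimal sketch of forcing faster rolls, written against Flink's Java API with a placeholder output path (with checkpointing disabled, rolled files keep a .pending suffix but already contain the data):

// assumed import: org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink
BucketingSink<String> sink = new BucketingSink<>("hdfs:///user/hadoop/out"); // placeholder path
sink.setBatchSize(1024L); // roll a part file after ~1 KB instead of the default 384 MB
myStream.addSink(sink);   // same attachment point as the addSink call in the question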

How to use the taildir source in Flume to append only newest lines of a .txt file?

I recently asked the question Apache Flume - send only new file contents
I am rephrasing the question in order to learn more and provide more benefit to future users of Flume.
Setup: Two servers, one with a .txt file that gets lines appended to it regularly.
Goal: Use flume TAILDIR source to append the most recently written line to a file on the other server.
Issue: Whenever the source file has a new line of data added, the current configuration appends everything in the file on server 1 to the file on server 2. This results in duplicate lines in file 2 and does not properly recreate the file from server 1.
Configuration on server 1:
#configure the agent
agent.sources=r1
agent.channels=k1
agent.sinks=c1
#using memory channel to hold up to 1000 events
agent.channels.k1.type=memory
agent.channels.k1.capacity=1000
agent.channels.k1.transactionCapacity=100
#connect source, channel,sink
agent.sources.r1.channels=k1
agent.sinks.c1.channel=k1
#define source
agent.sources.r1.type=TAILDIR
agent.sources.r1.channels=k1
agent.sources.r1.filegroups=f1
agent.sources.r1.filegroups.f1=/home/tail_test_dir/test.txt
agent.sources.r1.maxBackoffSleep=1000
#connect to another box using avro and send the data
agent.sinks.c1.type=avro
agent.sinks.c1.hostname=10.10.10.4
agent.sinks.c1.port=4545
Configuration on server 2:
#configure the agent
agent.sources=r1
agent.channels=k1
agent.sinks=c1
#using memory channel to hold up to 1000 events
agent.channels.k1.type=memory
agent.channels.k1.capacity=1000
agent.channels.k1.transactionCapacity=100
#connect source, channel, sink
agent.sources.r1.channels=k1
agent.sinks.c1.channel=k1
#here source is listening at the specified port using AVRO for data
agent.sources.r1.type=avro
agent.sources.r1.bind=0.0.0.0
agent.sources.r1.port=4545
#use file_roll and write file at specified directory
agent.sinks.c1.type=file_roll
agent.sinks.c1.sink.directory=/home/Flume_dump
You have to set a position file (JSON). The TAILDIR source then checks the recorded positions and writes only newly added lines to the sink.
e.g. agent.sources.r1.positionFile = /var/log/flume/tail_position.json (r1 matching the source name in your configuration)

Cannot append to file when some other process writes to it on *nix systems

I have a very simple piece of code that just writes a small amount of data to a file at some regular interval. Once my program has created the file and appended some data, if I open this file in vim (or any other editor for that matter) and edit it, my process cannot seem to update the file anymore. I do not see any errors being returned from the syscall. I tried tracing the system calls and did not observe anything weird, even while the file is NOT being updated.
Since each process gets its own file table entry, which has the current offset, all I was expecting was an output file with data interspersed from the two non-cooperating processes (possibly garbled, too). But what I am observing is that my program cannot update the file anymore once any other editor writes to the file.
A couple of other interesting observations:
1) When I cat something to the output file, my program can continue to update it, no problem.
2) When multiple instances of my own program are writing to the same file, everything is fine again.
I understand that there's mandatory locking to prevent multiple writes, but I am trying to understand what's happening underneath. Also, this kind of scenario behaves normally for some loggers (like the system log, Apache logs, etc.).
Any ideas to explain this behavior? Also, any hints on how I can debug this further?
My code is pretty simple:
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char** argv)
{
    const char* buf;
    if (argc < 2)
        buf = "test->";
    else
        buf = argv[1];

    int fd;
    if ((fd = open("test.log", O_CREAT|O_WRONLY|O_APPEND, 0644)) == -1) {
        perror("Cannot open test.log");
        exit(1);
    }

    int num_bytes = strlen(buf), num_bytes_written = -1;

    while (1) {
        if ((num_bytes_written = write(fd, buf, num_bytes)) == -1) {
            perror("Could not write to fd");
        }
        fsync(fd);
        sleep(5);
    }
}
When the vim(1) editor exits, it is likely replacing the original file with the edited version. Your process is holding the original file open, but that file no longer exists in the sense that its directory entry has been replaced, so no process that doesn't already have the file open can access it. Your process is now appending to a file that can't be accessed by any other process. Once your process closes the file, it will be gone for good (unless you run a partition recovery program).
Your vim editor works on a cached version of your file. It modifies this cache while your other program appends to the original file. When you save with vim, you overwrite the original file with the updated cached version and lose all the appended log data.
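A quick way to check whether the file really was replaced is to compare its identity before and after saving in the editor, for example with ls -i or stat in the shell: if the inode number changes, the editor wrote a new file and renamed it over the old one. The same check in code, as a small Java sketch (on Linux, fileKey() reports the device and inode backing a path):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

public class FileKeyCheck {
    public static void main(String[] args) throws Exception {
        // Print an OS-level identity for the path (device + inode on Linux).
        // Run it before and after saving the file in vim: a changed key means
        // the editor replaced the file rather than modifying it in place.
        Path p = Paths.get(args.length > 0 ? args[0] : "test.log");
        Object key = Files.readAttributes(p, BasicFileAttributes.class).fileKey();
        System.out.println(p + " -> " + key);
    }
}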

PHP File Upload greater than upload_max_filesize and error

How can I catch an error on a file uploaded to my web server that is greater than php upload_max_filesize?
My question is similar to so/large-file-upload-errors-with-php, although my memory limit is set to 512M, so the resolution used in that question will not help me.
I am trying to upload a 6.9 MB file, for example, and my upload_max_filesize = 6M. My script basically just stops executing and I cannot tell where or why. Also, I have error reporting turned on.
I should also note that files under 6 MB upload and are processed correctly with the following code:
if(isset($_FILES['file']['name']) and !empty($_FILES['file']['name'])){
    $info = pathinfo($_FILES['file']['name']);
    $ext = $info['extension'];
    //verify file is of allowed types
    if(Mimetypes::isAllowed($ext)){
        if(filesize($_FILES['file']['tmp_name']) <= AttachmentUploader::$maxFilesize){
            $a = new AttachmentUploader(); //for file uploading
            if($a->uploadFile($_FILES['file'], 'incident', $_POST['sys_id'])){
                header("location: ".$links['status']."?item=incident&action=update&status=1&place=".urlencode($links['record']."id=".$_POST['sys_id']));
            }else{
                header("location: ".$links['status']."?item=incident&action=update&status=-1&place=".urlencode($links['record']."id=".$_POST['sys_id']));
            }
        }else{
            $errors[] = 'The file you attempted to upload is too large. 0.5MB is the maximum allowed size for a file. If you are trying to upload an image, it may need to be scaled down.';
        }
    }else{
        $errors[] = 'The file you attempted to upload is not allowed. Acceptable extensions: jpg, gif, bmp, png, xls, doc, docx, txt, pdf';
    }
}else{
    $errors[] = 'Please attach a file.';
}
My php.ini settings:
;;;;;;;;;;;;;;;;;;;
; Resource Limits ;
;;;;;;;;;;;;;;;;;;;
max_execution_time = 7200 ; Maximum execution time of each script, in seconds
memory_limit = 512M ; Maximum amount of memory a script may consume (8MB)
;;;;;;;;;;;;;;;;
; File Uploads ;
;;;;;;;;;;;;;;;;
; Whether to allow HTTP file uploads.
file_uploads = On
; Temporary directory for HTTP uploaded files (will use system default if not
; specified).
upload_tmp_dir = /tmp
; Maximum allowed size for uploaded files.
upload_max_filesize = 6M
The error is in $_FILES['userfile']['error'] ($_FILES['file']['error'] for the form field in your code). You just have to check whether this value is UPLOAD_ERR_INI_SIZE to detect whether the file is bigger than the max size defined in your php.ini.
Resources:
php.net - File upload, Error Messages Explained