Restrict number of pages on a PDF shown in browser Node server - express

I am developing a platform where users can preview a PDF before paying to
download it. My problem is how to restrict the number of pages of the PDF that are shown to the user.
For example, I use pdf.js to render the PDF on a canvas, but when I look at the network
requests the full file is still downloadable.
I tried using HummusJS to split the PDF on the server and send only the preview to the client. My confusion is, firstly,
that I store the PDFs on the file system and only keep their paths in a SQL database, so the response is JSON;
how do I handle that? Secondly, how do I make it
dynamic, since HummusJS needs an input and an output file name and I am working with a lot of PDFs?
There are related questions on Stack Overflow, but they are all for PHP.
Here is my attempt: I read a PDF uploaded by a user, write the first page out with HummusJS, and send that to the browser, but this is not scalable for many files.
const express = require('express');
const fs = require('fs');
const hum = require('hummus');

const app = express();

app.get('/', (req, res) => {
  // HummusJS: copy only the first page of the source PDF into a separate preview file
  const readPdf = hum.createReader('./uploads/quote.pdf-1675405130171.pdf');
  const pageCount = readPdf.getPagesCount(); // total pages in the source PDF

  const writePdf = hum.createWriter('preview.pdf');
  writePdf
    .createPDFCopyingContext(readPdf)
    .appendPDFPageFromPDF(0); // append page 0 (the first page) only
  writePdf.end();

  // Stream the generated preview back to the browser
  const path = './preview.pdf';
  if (fs.existsSync(path)) {
    res.contentType('application/pdf');
    fs.createReadStream(path).pipe(res);
  } else {
    console.log('File not found');
    res.status(500).send('File not found');
  }
});
To restate the requirement: users must not be able to download the full file in any way (not via Inspect, Print, or the Network tab); they should only ever receive the preview file, and have to pay to get the full document.
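To address the scalability concern, here is a minimal sketch of how the HummusJS approach above could be made dynamic, generating a per-file preview on demand. It assumes a hypothetical lookup function getPdfPathById() that resolves the stored path from the SQL database, and it writes the output to a unique temp file; this is an illustration under those assumptions, not a drop-in implementation.

const os = require('os');
const path = require('path');
const express = require('express');
const fs = require('fs');
const hum = require('hummus');

const app = express();

// GET /preview/:id -> streams a one-page preview of the stored PDF
app.get('/preview/:id', async (req, res) => {
  // Hypothetical helper: looks up the file-system path for this id in the SQL DB
  const sourcePath = await getPdfPathById(req.params.id);
  if (!sourcePath || !fs.existsSync(sourcePath)) {
    return res.status(404).send('File not found');
  }

  // Write the preview to a unique temp file so concurrent requests don't clash
  const previewPath = path.join(os.tmpdir(), `preview-${req.params.id}-${Date.now()}.pdf`);

  const reader = hum.createReader(sourcePath);
  const writer = hum.createWriter(previewPath);
  const copyCtx = writer.createPDFCopyingContext(reader);
  const pagesToShow = Math.min(1, reader.getPagesCount()); // preview only the first page
  for (let i = 0; i < pagesToShow; i += 1) {
    copyCtx.appendPDFPageFromPDF(i);
  }
  writer.end();

  res.contentType('application/pdf');
  fs.createReadStream(previewPath)
    .on('close', () => fs.unlink(previewPath, () => {})) // clean up the temp file afterwards
    .pipe(res);
});

Since the preview file only ever contains the allowed pages, the browser's Network tab never sees the full document; only the server-side original does.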

Related

How to work around maximum execution time when uploading to S3 Bucket?

I am using the S3-for-Google-Apps-Script library to export full attachments from Gmail to an S3 bucket. I changed the S3 code to upload the actual content of the attachment rather than an encoded string, as detailed in this post.
However, when attempting to upload an attachment larger than roughly 5 MB, Apps Script throws the following error: "Maximum Execution Time Exceeded". I used timestamps to measure the elapsed time and confirm that the slowdown occurs in the s3.putObject(bucket, objectKey, file) call.
It might also be helpful to note that a file barely over the limit still gets uploaded to my S3 bucket, but Apps Script reports to the user that the execution time (30 seconds) has been exceeded, disrupting the user flow.
Reproducible Example
This is basically a simple button that scrapes the current email for all attachments; if they are PDFs, it calls the export function, which exports those attachments to our S3 instance. The problem is that when a file is larger than 5 MB, it throws the error:
"exportHandler exceeded execution time"
If you're trying to reproduce this, be aware that you need to copy an instance of S3-for-Google-Apps-Script and initialize it as a separate library in Apps Script, with the changes made here.
To link the libraries, go to File > Libraries in the Google Apps Script console and add the respective library ID, version, and development mode. You'll also need to save your AWS access key and secret key in your PropertiesService cache, as detailed in the library documentation.
An initial button that triggers an export of a single attachment on the current Gmail thread:
export default function testButton() {
  const Card = CardService.newCardBuilder();
  const exportButtonSection = CardService.newCardSection();
  const exportWidget = CardService.newTextButton()
    .setText('Export File')
    .setOnClickAction(CardService.newAction().setFunctionName('exportHandler'));
  exportButtonSection.addWidget(exportWidget);
  Card.addSection(exportButtonSection);
  return Card.build();
}
Export an attachment to a specified S3 bucket. Note that S3Modified is an instance of S3-for-Google-Apps-Script modified in accordance with the post outlined above; it's a separate Apps Script file. s3.putObject is where it takes a long time to process an attachment (this is where the error occurs, I think).
The credentials initialize your S3 awsAccessKey and awsBucket, and can be stored in PropertiesService.
function exportAttachment(attachment) {
  const fileName = attachment.getName();
  const timestamp = Date.now();
  // AWS credentials and bucket name are stored in the script properties
  const credentials = PropertiesService.getScriptProperties().getProperties();
  const s3 = S3Modified.getInstance(credentials.awsAccessKeyId, credentials.awsSecretAccessKey);
  // This is the slow call: the whole attachment is uploaded synchronously
  s3.putObject(credentials.awsBucket, fileName, attachment, { logRequests: true });
  const timestamp2 = Date.now();
  Logger.log('difference: ' + (timestamp2 - timestamp));
}
This gets all the attachments in the current email message that are PDFs. The function is pretty much the same as the one on the Apps Script site for handling Gmail attachments; it just looks specifically for PDFs (not a requirement of the code):
function getAttachments(event) {
  const gmailAccessToken = event.gmail.accessToken;
  const messageIdVal = event.gmail.messageId;
  GmailApp.setCurrentMessageAccessToken(gmailAccessToken);
  const mailMessage = GmailApp.getMessageById(messageIdVal);
  const thread = mailMessage.getThread();
  const messages = thread.getMessages();
  const filteredAttachments = [];
  for (let i = 0; i < messages.length; i += 1) {
    const allAttachments = messages[i].getAttachments();
    for (let j = 0; j < allAttachments.length; j += 1) {
      if (allAttachments[j].getContentType() === 'application/pdf') {
        filteredAttachments.push(allAttachments[j]);
      }
    }
  }
  return filteredAttachments;
}
The global handler that gets the attachments and exports them to the S3 bucket when the button is clicked:
function exportHandler(event) {
  const currAttachment = getAttachments(event).flat()[0];
  exportAttachment(currAttachment);
}
global.export = exportHandler;
To be absolutely clear, the bulk of the time is spent in the second code sample (exportAttachment), since that is where the object is put into S3.
The timestamps log how much time that function takes: with a 300 KB file you get about 2 seconds, with 4 MB about 20 seconds, and above 5 MB roughly 30 seconds. This part contributes the most to the maximum execution time.
So this is what leads me to my question: why do I get the maximum execution time exceeded error, and how can I fix it? Here are my two thoughts on potential solutions:
Why does the execution limit occur? The quotas say that the runtime limit for a custom function is 30 seconds, and the runtime limit for the script is 6 minutes.
After some research, I only found custom function mentions in the context of AddOns in Google Sheets, but the function where I'm getting the error is a global function (so that it can be recognized by a callback) in my script. Is there a way to change it to not be recognized as a custom function so that I'm not limited to the 30-second execution limit?
Now, how can I work around this execution limit? Is this an issue with the recommendation to modify the S3 library in this post? Essentially, the modification suggests exporting the actual bytes of the attachment rather than an encoded string.
This definitely increases the load that Apps Script has to handle, which is why it increases the required execution time. How can I work around this issue? Is there a way to change the S3 library to improve processing speed?
Regarding the first question
From https://developers.google.com/gsuite/add-ons/concepts/actions#callback_functions
Warning: The Apps Script Card service limits callback functions to a maximum of 30 seconds of execution time. If the execution takes longer than that, your add-on UI may not update its card display properly in response to the Action.
Regarding the second question
In the answer to Google Apps Script Async function execution on Server side, a "hack" is suggested: use an "open link" action to call something that can run the long-running task asynchronously.
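A rough sketch of that idea, assuming the project is also deployed as a web app so the heavy work runs under the regular 6-minute script limit instead of the 30-second card callback limit; the deployment URL and the query parameter name are placeholders, not part of the original library:

// In the add-on: instead of uploading inside the callback, hand the work off
// to a deployed web app and open it, so the 30-second card callback limit no longer applies.
function exportHandler(event) {
  const messageId = event.gmail.messageId;
  // Placeholder URL of a web app deployment of this same project
  const webAppUrl = 'https://script.google.com/macros/s/DEPLOYMENT_ID/exec?messageId=' + messageId;
  return CardService.newActionResponseBuilder()
    .setOpenLink(CardService.newOpenLink()
      .setUrl(webAppUrl)
      .setOpenAs(CardService.OpenAs.OVERLAY))
    .build();
}

// In the web app: doGet runs under the 6-minute script limit, not the 30-second card limit.
function doGet(e) {
  const message = GmailApp.getMessageById(e.parameter.messageId);
  const pdfs = message.getAttachments().filter(function (a) {
    return a.getContentType() === 'application/pdf';
  });
  pdfs.forEach(exportAttachment); // reuse the existing upload function
  return ContentService.createTextOutput('Export finished');
}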
Related
How to use HtmlService in Gmail add-on using App Script
Handling Gmail Addon Timeouts
Can't serve HTML on the Google Apps Script callback page in a GMail add-on
Answer to rev 1.
Regarding the first question
In Google Apps Script, a custom function is a function to be used in a Google Sheets formula. There is no way to extend this limit. Reference: https://developers.google.com/apps-script/guides/sheets/functions
The onOpen and onEdit simple triggers also have a 30-second execution time limit. Reference: https://developers.google.com/apps-script/guides/triggers
Functions executed from the Google Apps Script editor, a custom menu, an image with the function assigned to it, installable triggers, client-side code, or the Google Apps Script API have an execution time limit of 6 minutes for regular Google accounts (those with a @gmail.com address); G Suite accounts, on the other hand, have a 30-minute limit.

Sails Skipper: how to read and validate a csv file and exclude the invalid file types during upload?

I'm trying to write a controller that uploads a file to an S3 location. However, before uploading I need to validate whether the incoming file is a CSV or not, and then I need to read the file to check for header columns, etc. I got the type of the file as per the snippet below:
req.file('foo')._files[0].stream
But how do I read the entire file stream and check for headers and data? There are other similar questions (like Sails.js Skipper: How to read the uploaded file stream during upload?), but the solution mentioned there is to use the skipper-csv adapter, which I cannot use since I already use skipper-s3 to upload to S3.
Can someone please post an example of how to read the upstream and perform validations before the upload?
Here is how my problem got solved: I make a copy of the stream before the actual upload, run my validations on the original stream, and once they pass, upload the copied stream to the desired location.
For reading the CSV stream I found an npm package, csv-parser (https://github.com/mafintosh/csv-parser), which I found easy to use for handling events like headers and data.
For creating the copy of the stream, I used the following logic:
const { PassThrough } = require('stream');
const _ = require('lodash');

const upstream = req.file('file');
const fileStreamMap = {};
const fileStreamMapCopy = {};

_.each(upstream._files, (file) => {
  const fileName = file.stream.filename; // skipper exposes the original name on the file stream
  const stream = new PassThrough();
  const streamCopy = new PassThrough();
  // Tee the incoming file stream: one copy for validation, one for the upload
  file.stream.pipe(stream);
  file.stream.pipe(streamCopy);
  fileStreamMap[fileName] = stream;
  fileStreamMapCopy[fileName] = streamCopy;
});

// Validate, and upload the files to S3 if valid
validateAndUploadFile(fileStreamMap, fileStreamMapCopy);
validateAndUploadFile() contains my custom validation logic for my CSV upload.
Also, we can use the aws-sdk (https://www.npmjs.com/package/aws-sdk) for the S3 upload.
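For illustration, a hedged sketch of what such a validateAndUploadFile() could look like, using csv-parser on the validation stream and aws-sdk on the untouched copy; the expected header list and bucket name are made-up placeholders, and it reuses lodash's _ and Sails' logger from the controller context:

const csv = require('csv-parser');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();
const EXPECTED_HEADERS = ['id', 'name', 'email']; // placeholder header list

function validateAndUploadFile(fileStreamMap, fileStreamMapCopy) {
  _.each(fileStreamMap, (stream, fileName) => {
    stream
      .pipe(csv())
      .on('headers', (headers) => {
        const valid = EXPECTED_HEADERS.every((h) => headers.includes(h));
        if (!valid) {
          fileStreamMapCopy[fileName].destroy(); // drop the copy, nothing to upload
          return;
        }
        // Headers look good: upload the untouched copy of the stream
        s3.upload(
          { Bucket: 'my-bucket', Key: fileName, Body: fileStreamMapCopy[fileName] },
          (err, data) => {
            if (err) { sails.log.error('Upload failed', err); }
            else { sails.log.info('Uploaded to', data.Location); }
          }
        );
      })
      .on('error', (err) => sails.log.error('Invalid CSV', err));
  });
}

The key point is that the copied stream is never consumed until the header check on the first stream has passed, so an invalid file never reaches S3.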
Hope this helps someone.

Downloading a publicly-shared file from OneDrive

When I create a share link in the UI with the "Anyone with this link can view this item" option, I get a URL that looks like https://onedrive.live.com/redir?resid=XXX!YYYY&authkey=!ZZZZZ&ithint=<contentType>. What I can't figure out is how to use this URL from code to download the content of the file. Hitting the link gives HTML for a page to show the file.
How can I construct a call to download the file? Also, is there a way to construct a call to get some (XML/JSON) metadata about the file, and maybe even a preview or something? I want to be able to do this all without prompting a user for credentials, and all the API docs are about how to make authenticated calls. I want to make anonymous calls to get publicly-shared files.
Have a read over https://dev.onedrive.com - it documents how you can make a query to our service to get the metadata for an item, along with URLs that can be used to directly download the content.
Update with more details
Sorry, the documentation you need for your specific scenario is still in process (along with the associated SDK changes) so I'll give you an overview of how to do it.
There's a sibling to the /drives path called /shares which accepts a sharing URL (such as the one you have above) in an encoded format and allows you to get metadata for the item it represents. This does not require authentication provided the sharing URL has a valid authkey.
The encoding scheme for the id is u!<UrlSafeBase64EncodedUrl>, where <UrlSafeBase64EncodedUrl> follows the guidelines outlined here (trim the = characters from the end).
Here's a snippet that should give you an idea of the whole process:
string originalUrl = "https://onedrive.live.com/redir?resid=XXX!YYYY&authkey=!foo";
byte[] urlAsUtf8Bytes = Encoding.UTF8.GetBytes(originalUrl);
string utf8BytesAsBase64String = Convert.ToBase64String(urlAsUtf8Bytes);
string encodedUrl = "u!" + utf8BytesAsBase64String.TrimEnd('=').Replace('/', '_').Replace('+', '-');
string metadataUrl = "https://api.onedrive.com/v1.0/shares/" + encodedUrl + "/root";
From there you can append /content if you want to get the contents of the file, or you can start navigating through if the URL represents a folder (e.g. /children/childfile.txt)
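For a Node.js caller, the same encoding scheme and metadata call described above might look like this sketch (the share URL is the example from the question; error handling is omitted, and the global fetch assumes Node 18+):

// Encode the sharing URL as u!<url-safe base64> and query the /shares endpoint anonymously
const shareUrl = 'https://onedrive.live.com/redir?resid=XXX!YYYY&authkey=!foo';
const encoded = 'u!' + Buffer.from(shareUrl, 'utf8')
  .toString('base64')
  .replace(/=+$/, '')   // trim the padding
  .replace(/\//g, '_')  // URL-safe substitutions
  .replace(/\+/g, '-');

const metadataUrl = 'https://api.onedrive.com/v1.0/shares/' + encoded + '/root';

// Metadata for the shared item (JSON); append /content to download the file itself
fetch(metadataUrl)
  .then((res) => res.json())
  .then((item) => console.log(item.name, item.size));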

Google Apps Script login to website with HTTP request

I have a spreadsheet on my Google Drive and I want to download a CSV from another website and put it into my spreadsheet. The problem is that I have to login to the website first, so I need to use some HTTP request to do that.
I have found this site and this. If either of these sites has the answer on it, then I clearly don't understand them enough to figure it out. Could someone help me figure this out? I feel that the second site is especially close to what I need, but I don't understand what it is doing.
To clarify again: I want to log in with an HTTP request and then make a second call to a different URL on the same website, which is the call that returns the CSV file.
I have done a lot of this in the past month, so I should be able to help you. We are trying to emulate the browser's behaviour here, so first use Chrome's developer tools (or something similar) and note down exactly what the browser does: the form values posted, the URL that is called, and so on. The following example shows the general technique to be used.
The first step is to log in to the website and get the session cookie:
var payload = {
  "user_session[email]" : "username",
  "user_session[password]" : "password"
}; // The actual names of the post variables (like user_session[email]) depend on the site, so you need to get them either from the HTML of the login page or using the developer tools mentioned above.

var options = {
  "method" : "post",
  "payload" : payload,
  "followRedirects" : false
};

var login = UrlFetchApp.fetch("https://www.website.com/login", options);
var sessionDetails = login.getAllHeaders()['Set-Cookie'];
We have now logged into the website (to confirm, just log sessionDetails and match it with the cookies set by Chrome). The next step is purely dependent on the website, so I will give you a general example:
var downloadPayload = {
  "__EVENTTARGET" : 'ctl00$ActionsPlaceHolder$exportDownloadLink1'
}; // This is just an example; it may or may not be needed. If it is, you need to trace the values from the developer tools.

var downloadCsv = UrlFetchApp.fetch("https://www.website.com/", {
  "headers" : {"Cookie" : sessionDetails},
  "method" : "post",
  "payload" : downloadPayload
});
Logger.log(downloadCsv.getContentText());
The file contents should now be logged. You can then parse the CSV using the built-in Utilities.parseCsv() function and dump the data into the spreadsheet.
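For example, continuing from downloadCsv above, writing the parsed rows to the active sheet could look roughly like this:

// Parse the CSV text into a 2D array and write it to the active sheet
var csvData = Utilities.parseCsv(downloadCsv.getContentText());
var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
sheet.getRange(1, 1, csvData.length, csvData[0].length).setValues(csvData);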
A few points to note:
I have assumed that all form post values are static and can be hardcoded. If this is not true, let me know and I will give you a function that can extract values from the HTML.
Some websites require the browser to send a token value (the value will be present in the HTML) along with the credentials. In that case you need to extract the token and post it as well, as in the sketch below.
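A hedged sketch of that token extraction, assuming the token sits in a hidden input; the field name authenticity_token is just an example and will differ per site:

// Fetch the login page and pull the hidden token out of the HTML with a regex
var loginPage = UrlFetchApp.fetch("https://www.website.com/login").getContentText();
var match = loginPage.match(/name="authenticity_token"\s+value="([^"]+)"/);
var token = match ? match[1] : null;

var payload = {
  "user_session[email]" : "username",
  "user_session[password]" : "password",
  "authenticity_token" : token // send the extracted token along with the credentials
};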

loading response data into web view Titanium

I get response data from a web service, which is base64Binary data.
I want to load this base64Binary data into a web view in Titanium Alloy [version 3.1.0.2].
The base64Binary data is a PDF file.
Ti.API.info('Status is :: ' + xhrDocument.status);
var ResponseData = xhrDocument.getResponseXML().getElementsByTagName('GetDocResult').item(0).text;
if (xhrDocument.status == 200) {
  var file = Titanium.Filesystem.getFile(Titanium.Filesystem.applicationDataDirectory, 'filename2.pdf');
  // Writes the base64 text exactly as received, without decoding it
  file.write(ResponseData);
  Titanium.API.info('file write');
  Titanium.API.info(file.size);
}
The above code created filename2.pdf in my Documents directory. When I open the file using Adobe Reader, it says Adobe Reader could not open filename2.pdf because it is either not a valid file or has been damaged (for example, it was sent as an email attachment and wasn't correctly decoded).
Is the web service call returning ONLY the document, or is there additional data included in the response?
We have had success using a simpler method. If the service is simply returning the document, try changing line two to something more like this:
var ResponseData = xhrDocument.responseText;
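To complete the picture, a hedged sketch of writing that response to a file and showing it in a web view; if the body really is base64-encoded (as the question suggests), a Ti.Utils.base64decode step would be needed before writing, otherwise the raw responseText can be written directly:

if (xhrDocument.status == 200) {
  var responseData = xhrDocument.responseText; // raw body, per the suggestion above

  // Assumption based on the question: the body is base64, so decode it to a binary blob first
  var pdfBlob = Ti.Utils.base64decode(responseData);

  var file = Titanium.Filesystem.getFile(Titanium.Filesystem.applicationDataDirectory, 'filename2.pdf');
  file.write(pdfBlob);

  // Point a web view at the saved file to display the PDF
  var webView = Ti.UI.createWebView({ url: file.nativePath });
  var win = Ti.UI.createWindow();
  win.add(webView);
  win.open();
}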