Do you have any idea how to read PDF files whose MIME type is text/html?
I have tried the snippet below, but OCR doesn't work and fails with this error: "API call to drive.files.insert failed with error: OCR is not supported for files of type text/html"
function extractTextFromPDF(pdfID) {
// PDF File URL
// You can also pull PDFs from Google Drive
var url = "https://drive.google.com/file/d/"+pdfID
var blob = UrlFetchApp.fetch(url).getBlob();
var resource = {
title: blob.getName(),
mimeType: blob.getContentType(),
};
// Enable the Advanced Drive API Service
var file = Drive.Files.insert(resource, blob, { ocr: true, ocrLanguage: 'en' });
// Extract Text from PDF file
var doc = DocumentApp.openById(file.id);
var text = doc.getBody().getText();
return text;
}
Also, I have tried to convert the files to other formats such as .csv, .css, or plain text, but the result is a long, unreadable block of HTML, and I think the content is encrypted. I considered splitting the data out of the extracted HTML, but unfortunately the content is not there or is encrypted somehow.
What I want to do is print the text from this weird PDF so I can later write it to Google Sheets. Do you have any idea how I can read this file?
File
I am attaching a pdf here, so you can see what I am fighting with.
https://drive.google.com/file/d/1HXQk6PU9hzBb26EwoFQ0840W6ZihDUIX/view?usp=sharing
I used your sample file, see how I did it below:
function myFunction() {
var pdfFile = DriveApp.getFilesByName("222-1522118.pdf").next();
var blob = pdfFile.getBlob();
// Get the text from pdf
var filetext = pdfToText( blob, {keepTextfile: false} );
console.log(filetext)
}
I used Mogsdad's library pdfToText
Reference: Get text from PDF in Google
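A likely explanation for the original text/html error, offered as an assumption rather than something confirmed in this thread: fetching the Drive sharing URL with UrlFetchApp returns the viewer's HTML page instead of the PDF bytes, whereas reading the file through DriveApp (as above) returns the file itself. A minimal sketch of a sanity check that could run before attempting OCR; looksLikePdf is a hypothetical helper, and it only inspects the leading bytes for the %PDF- signature:

```javascript
// Hypothetical helper (not part of any library): returns true when the
// byte array starts with the PDF signature "%PDF-".
function looksLikePdf(bytes) {
  var signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  if (!bytes || bytes.length < signature.length) {
    return false;
  }
  for (var i = 0; i < signature.length; i++) {
    if (bytes[i] !== signature[i]) {
      return false;
    }
  }
  return true;
}
```

In Apps Script this could be called as looksLikePdf(blob.getBytes()); if it returns false, the blob is probably an HTML page and not the PDF.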
I was creating a pdf generator following guidelines from this website "https://www.wix.com/velo/forum/coding-with-velo/pdf-generator-api-npm-demo"
I copied all the code from the guide, but when I click the submit button, the error [data:application/pdf;base64,undefined] occurs, and no record is saved in the PDF generator.
Check the link of the website here: website link
I believe it should be something wrong with the code below
//code start
import { pdf } from 'backend/pdf.jsw';
$w.onReady(function () {
var base64;
$w('#btnSubmit').onClick(() => {
let name = $w('#inName').value;
let detail = $w('#inDetail').value;
pdf(name, detail).then( e => {
base64 = e;
base64 = 'data:application/pdf;base64,' + base64
let msg = {
"conv": true,
"dataUrl": base64
}
$w('#html1').postMessage(msg);
});
});
});
// Code end
I searched the internet, and some people mentioned that opening a data URL directly has not been supported since 2020 (see website). I am new to programming, and I would be grateful if someone could provide the adjusted code.
Sorry for the trouble
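The string data:application/pdf;base64,undefined suggests that the promise returned by pdf(name, detail) resolved with undefined, so the prefix was concatenated with a missing value. One possible cause, stated here as an assumption, is that the backend function does not return its result. A minimal sketch of a guard that makes this failure visible instead of posting a broken URL; toPdfDataUrl is a hypothetical helper name:

```javascript
// Hypothetical guard: build the data URL only from a non-empty base64 string,
// and fail loudly when the backend returned nothing.
function toPdfDataUrl(base64) {
  if (typeof base64 !== 'string' || base64.length === 0) {
    throw new Error('pdf() did not return base64 data; check the backend function');
  }
  return 'data:application/pdf;base64,' + base64;
}
```

Inside the click handler, the concatenation could then be replaced with base64 = toPdfDataUrl(e), and a .catch() on the promise chain would report the error.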
I currently have a file uploader that accepts a single CSV file. Then with axios I POST that file to the server and everything works just fine. What I haven't been able to achieve is uploading another CSV that gets added to the list of CSVs already uploaded. I'm not talking about uploading various files at once; I'm talking about uploading different files at different points in time.
This is the method that is used to select a CSV file in the .vue file.
staticCampaignCSVSelected: function (file) {
console.log('campaign-detail.vue#staticCampaignCSVSelected', file)
let vc = this
vc.selectedHeuristicId = -1
Campaign.uploadStaticCSV(vc.campaign, file[0])
.then(
function (data) {
alert('CSV cargado con exito')
}
)
.catch(
function (err, data) {
console.log("campaign-detail#staticCampaignCSVSelected - catch", err.response)
alert(err.response.data.error)
}
)
},
This is the function that I have in some other JS file to POST to the API:
function uploadStaticCSV (campaign, csv) {
console.log('Campaign#uploadStaticCSV', campaign, csv)
//long list of assertions
let formData = new FormData()
formData.append('csv', csv)
return axios.post(API.campaignUploadStaticCSV(campaign.id), formData)
}
And this is the function I have in my endpoints.js file:
campaignUploadStaticCSV: function (id) { return this.campaign(id) + '' + '/csv' },
I haven't found a way to properly pass a [file] array as a parameter to the functions, which is what I believe I need to do somehow.
Any help would be appreciated :)
As far as I understand your question, you need a way to pass a file from the browser interface to your staticCampaignCSVSelected(file) method. If so, why not use an input model, a simple event, or a watcher? E.g.
<input type="file" @input="staticCampaignCSVSelected($event.target.files)" />
I also see a possible mistake in your code: the .then().catch() callbacks should be attached to axios.post() itself rather than to the Campaign.uploadStaticCSV() method.
And
return axios.post()
does not return the server response itself; it returns a promise. You have to handle the response in the
axios.post().then(response => {})
callback.
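To keep a running list of CSVs uploaded at different times, one option is to record each file once its upload promise resolves. This is a minimal sketch under stated assumptions: createCsvUploader is a hypothetical helper, and upload stands for any function that takes a file and returns a promise (for example, a wrapper around axios.post).

```javascript
// Hypothetical tracker: `upload` is any function that takes a file and
// returns a promise; the names here are illustrative only.
function createCsvUploader(upload) {
  var uploaded = []; // names of files whose upload has succeeded so far
  return {
    uploaded: uploaded,
    add: function (file) {
      // Chain on the returned promise and record the file on success
      return upload(file).then(function (response) {
        uploaded.push(file.name);
        return response;
      });
    }
  };
}
```

Each later call to add(file) appends to the same uploaded array, so files selected at different points in time accumulate in one list.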
I am trying to upload an Excel file (named test1.xls) located on my computer at C:\test\test1.xls to a Google Site (https://sites.google.com/xyz). To do so, I am using the Google script editor and the code shown below. The issue is that I am not able to pass the contents of the file test1.txt at C:\test\test1.txt instead of the text ("Here is some data") in the line var blob = Utilities.newBlob("Here is some data", "text/plain", "test1.xls");. Also, what should be given instead of "text/plain", since it is an Excel file? Please let me know how to do that with code, as I am new to Google scripting/API coding. Many thanks in advance.
CODE->
function doPost(e) {
var site = SitesApp.getSiteByUrl("https://sites.google.com/xyz");
Logger.log(site.getName());
var page = site.getChildren()[0];
// Create a new blob and attach it. Many useful functions also return
// blobs file uploads, URLFetch
var blob = Utilities.newBlob("Here is some data", "text/plain", "test1.xls");
try {
// Note that the filename must be unique or this call will fail
page.addHostedAttachment(blob);
}
catch(e){
Logger.log('Hosted attachment error msg:' +e.message);
}
}
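On the content-type part of the question: the registered MIME type for legacy .xls files is application/vnd.ms-excel. Note also that a script running on Google's servers cannot read C:\ directly, so the file's bytes would have to reach the script some other way (for example, through an upload form or Google Drive). A minimal sketch of picking the content type from the file name; mimeTypeForFileName and the lookup table are hypothetical helpers, not part of Apps Script:

```javascript
// Hypothetical lookup table covering only a few common extensions
var MIME_BY_EXTENSION = {
  xls: 'application/vnd.ms-excel',
  xlsx: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  csv: 'text/csv',
  txt: 'text/plain'
};

// Returns the MIME type for a file name, falling back to a generic binary type
function mimeTypeForFileName(fileName) {
  var dot = fileName.lastIndexOf('.');
  var extension = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
  return Object.prototype.hasOwnProperty.call(MIME_BY_EXTENSION, extension)
    ? MIME_BY_EXTENSION[extension]
    : 'application/octet-stream';
}
```

With the file's bytes available in a variable (called bytes here as an assumption), the blob could then be created as Utilities.newBlob(bytes, mimeTypeForFileName("test1.xls"), "test1.xls").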
I am having trouble with blob URLs.
I was searching for src of a video tag on YouTube and I found that the video src was like:
src="blob:https://video_url"
I opened the blob URL that was in src of the video, but it gave an error. I can't open the link, but it was working with the src tag. How is this possible?
I have a few questions:
What is a blob URL?
Why is it used?
Can I make my own blob URL on a server?
Any additional details about blob URLs would be helpful as well.
Blob URLs (ref W3C, official name) or Object-URLs (ref. MDN and method name) are used with a Blob or a File object.
src="blob:https://crap.crap" I opened the blob url that was in src of
video it gave a error and i can't open but was working with the src
tag how it is possible?
Blob URLs can only be generated internally by the browser. URL.createObjectURL() will create a special reference to the Blob or File object which later can be released using URL.revokeObjectURL(). These URLs can only be used locally in the single instance of the browser and in the same session (ie. the life of the page/document).
What is blob url?
Why it is used?
Blob URL/Object URL is a pseudo protocol to allow Blob and File objects to be used as URL source for things like images, download links for binary data and so forth.
For example, you cannot hand an Image object raw byte data, as it would not know what to do with it. It requires images (which are binary data) to be loaded via URLs. This applies to anything that requires a URL as its source. Instead of uploading the binary data and then serving it back via a URL, it is better to use an extra local step to access the data directly without going through a server.
It is also a better alternative to Data-URIs, which are strings encoded as Base-64. The problem with a Data-URI is that each char takes two bytes in JavaScript. On top of that, the Base-64 encoding adds about 33% to the size. Blobs are pure binary byte arrays which do not have the significant overhead that Data-URIs do, which makes them faster and smaller to handle.
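The size overhead can be worked out directly: Base-64 emits 4 characters for every 3 input bytes (padded up to a multiple of 4), and the two-bytes-per-char figure above then doubles the in-memory size. A small arithmetic sketch; the function names are hypothetical:

```javascript
// Number of Base-64 characters needed to encode `byteCount` bytes (with padding)
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Approximate in-memory size of that string, assuming two bytes per char
function base64StringBytes(byteCount) {
  return 2 * base64Length(byteCount);
}

console.log(base64Length(3000000));      // 4000000 chars for 3,000,000 bytes
console.log(base64StringBytes(3000000)); // 8000000 bytes at two bytes per char
```

Under these assumptions, a 3,000,000-byte image held as a Data-URI string occupies roughly 8,000,000 bytes, while the same data in a Blob stays at its original size.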
Can i make my own blob url on a server?
No, Blob URLs/Object URLs can only be made internally in the browser. You can create Blobs and obtain File objects via the File API, although BLOB just means Binary Large OBject and is stored as a byte array. A client can request the data to be sent as either an ArrayBuffer or a Blob. The server should send the data as pure binary data. Databases often use Blob to describe binary objects as well, and in essence we are basically talking about byte arrays.
if you have then Additional detail
You need to encapsulate the binary data as a BLOB object, then use URL.createObjectURL() to generate a local URL for it:
var blob = new Blob([arrayBufferWithPNG], {type: "image/png"}),
url = URL.createObjectURL(blob),
img = new Image();
img.onload = function() {
URL.revokeObjectURL(this.src); // clean-up memory
document.body.appendChild(this); // add image to DOM
}
img.src = url; // can now "stream" the bytes
This JavaScript function shows the difference between the Blob File API and the Data API when downloading a JSON file in the client browser:
/**
 * Save text as a file using a temporary HTML <a> element and a Blob
 * @author Loreto Parisi
 */
var saveAsFile = function(fileName, fileContents) {
  var downloadLink = document.createElement("a");
  downloadLink.download = fileName;
  downloadLink.style.display = "none";
  if (typeof Blob != 'undefined') { // Alternative 1: using Blob
    var textFileAsBlob = new Blob([fileContents], {type: 'text/plain'});
    // Fall back to the prefixed webkitURL on older WebKit browsers
    downloadLink.href = (window.URL || window.webkitURL).createObjectURL(textFileAsBlob);
  } else { // Alternative 2: using Data
    downloadLink.href = 'data:text/plain;charset=utf-8,' +
      encodeURIComponent(fileContents);
  }
  // The link must be in the DOM for click() to work in some browsers
  document.body.appendChild(downloadLink);
  downloadLink.click();
  document.body.removeChild(downloadLink); // remove the temporary link
} // saveAsFile
/* Example */
var jsonObject = {"name": "John", "age": 30, "car": null};
saveAsFile('out.json', JSON.stringify(jsonObject, null, 2));
The function is called like saveAsFile('out.json', jsonString). It creates a byte stream that the browser recognizes immediately and downloads as the generated file, using the File API's URL.createObjectURL.
In the else branch, the same result is obtained via the href attribute plus the Data API, but this has several limitations that the Blob API does not have.
I have modified a working solution to handle both cases: when a video is uploaded and when an image is uploaded. I hope it helps someone.
HTML
<input type="file" id="fileInput">
<div> duration: <span id='sp'></span></div>
Javascript
var fileEl = document.querySelector("input");
fileEl.onchange = function(e) {
var file = e.target.files[0]; // selected file
if (!file) {
console.log("nothing here");
return;
}
console.log(file);
console.log('file.size-' + file.size);
console.log('file.type-' + file.type);
console.log('file.actualName-' + file.name);
let start = performance.now();
var mime = file.type, // store mime for later
rd = new FileReader(); // create a FileReader
if (/video/.test(mime)) {
rd.onload = function(e) { // when file has read:
var blob = new Blob([e.target.result], {
type: mime
}), // create a blob of buffer
url = (URL || webkitURL).createObjectURL(blob), // create o-URL of blob
video = document.createElement("video"); // create video element
//console.log(blob);
video.preload = "metadata"; // preload setting
video.addEventListener("loadedmetadata", function() { // when enough data loads
console.log('video.duration-' + video.duration);
console.log('video.videoHeight-' + video.videoHeight);
console.log('video.videoWidth-' + video.videoWidth);
//document.querySelector("div")
// .innerHTML = "Duration: " + video.duration + "s" + " <br>Height: " + video.videoHeight; // show duration
(URL || webkitURL).revokeObjectURL(url); // clean up
console.log(performance.now() - start); // elapsed time in ms
// ... continue from here ...
});
video.src = url; // start video load
};
} else if (/image/.test(mime)) {
rd.onload = function(e) {
var blob = new Blob([e.target.result], {
type: mime
}),
url = URL.createObjectURL(blob),
img = new Image();
img.onload = function() {
console.log('image');
console.dir('this.height-' + this.height);
console.dir('this.width-' + this.width);
URL.revokeObjectURL(this.src); // clean-up memory
console.log(performance.now() - start); // elapsed time in ms
}
img.src = url;
};
}
var chunk = file.slice(0, 1024 * 1024 * 10); // first 10MB
rd.readAsArrayBuffer(chunk); // read file object
};
jsFiddle Url
https://jsfiddle.net/PratapDessai/0sp3b159/
The OP asks:
What is blob URL? Why is it used?
A Blob is just a sequence of bytes. Browsers treat Blobs as byte streams, and a blob URL is used to read that byte stream from its source.
According to Mozilla's documentation
A Blob object represents a file-like object of immutable, raw data. Blobs represent data that isn't necessarily in a JavaScript-native format. The File interface is based on Blob, inheriting blob functionality and expanding it to support files on the user's system.
The OP asks:
Can i make my own blob url on a server?
Yes, you can; there are several ways to do so. For example, try http://php.net/manual/en/function.ibase-blob-echo.php
Read more here:
https://developer.mozilla.org/en-US/docs/Web/API/Blob
http://www.w3.org/TR/FileAPI/#dfn-Blob
https://url.spec.whatwg.org/#urls
Blob URLs are used for showing files that the user uploaded, but there are many other purposes. For example, they can be used to make files harder to copy, which is why it is a little difficult to get a YouTube video as a video file without installing an extension. There are probably more uses than these. My research is mostly just me using Inspect to try to get a YouTube video, plus an online article.
Another use case of blob urls is to load resources from the server, apply hacks and then tell the browser to interpret them.
One such example would be to load template files or even scss files.
Here is the scss example:
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/sass.js/0.11.1/sass.sync.min.js"></script>
function loadCSS(text) {
const head = document.getElementsByTagName('head')[0]
const style = document.createElement('link')
const css = new Blob([text], {type: 'text/css'})
style.href = window.URL.createObjectURL(css)
style.type = 'text/css'
style.rel = 'stylesheet'
head.append(style)
}
fetch('/style.scss').then(res => res.text()).then(sass => {
Sass.compile(sass, ({text}) => loadCSS(text))
})
Now you could swap out Sass.compile for any kind of transformation function you like.
Blob URLs keep your DOM structure clean this way.
I'm sure by now you have your answers, so this is just one more thing you can do with it.