Search text from PDF files stored locally or in SQLite blob - react-native

I need to store PDF documents in either a local storage or SQLite (React-native iOS/Android). The app needs to work offline, so documents should be indexed and searchable for offline viewing.
Any ideas on how I can implement this? If I store a document in SQLite as a blob, is there any plugin I can use to search it? Or is there a search engine library for react-native that can index the document as soon as the PDF is uploaded?
Thanks in advance for your help/suggestions.

Related

How to upload and download media files using GUNDB?

I'm trying to use GUN to create a File sharing platform. I read the tutorial and API but I couldn't find a general way to upload/download a file.
I hear that GUN has a 5 MB localStorage limit, so if I want to upload a large file I have to slice it and then store the pieces in GUN. But right now I can't find a way to store a file in GUN.
I read the question from Retric and I know how to store an image in GUN, but can I store other types of files, such as .zip or .doc? Is there a general API for file storage?
I wrote a quick little app in 35 lines of HTML that demonstrates file sharing for images, videos, sound, etc.
https://github.com/amark/gun/blob/master/examples/basic/upload.html
I've sent 20MB files thru it, tho yeah, I'm sure there is a better way of splitting it up into 2MB chunks - that is currently not automatic, you'd have to code it.
We'll have a feature in the future that will automatically split up video files. Do you want to help with this?
I think on the download side, all you have to do is make sure you have the whole file (stitch it back together if you do write a splitter-upper), and add it to some <a href="..."> target. Actually, I'm not sure exactly how, but I know browsers have supported a download attribute on links for a few years now, so you can create a download link even for an in-memory file... but you'll have to search online for how. Then please write a tutorial and share it with the community!!
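The splitting and stitching mentioned above could be sketched roughly like this. The helper names are hypothetical and nothing here is part of GUN's API; it only shows cutting a file's bytes into 2 MB pieces and putting them back together in order.

```typescript
// Hypothetical helpers, not part of GUN.
const CHUNK_SIZE = 2 * 1024 * 1024; // 2 MB, the figure mentioned in the answer

// Split a file's bytes into fixed-size chunks (the last one may be shorter).
function splitIntoChunks(data: Uint8Array, chunkSize: number = CHUNK_SIZE): Uint8Array[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // slice() copies the range, so each chunk owns its own bytes
    chunks.push(data.slice(offset, offset + chunkSize));
  }
  return chunks;
}

// Stitch the chunks back together, in the order given.
function joinChunks(chunks: Uint8Array[]): Uint8Array {
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

In a browser, the joined bytes can then be wrapped in a Blob and exposed through URL.createObjectURL on a link that carries the download attribute.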
I would recommend using IPFS for file storage and GUN to store the links to those files. GUN isn't meant for file storage, I believe; it's primarily for user/graph data. Hence the 5 MB limitation.

Save pdf file loaded in iFrame to database after edit Oracle APEX

I am trying to save a PDF file that is loaded in an iFrame after signing it. I am using PSPDFKit (standalone) in Oracle APEX version 190200.
I need to save it in the database instead of downloading the file.
How can I get the file and save it in the database through an AJAX callback?
You can use instance.exportPDF() to get the PDF as an ArrayBuffer. Then you can convert the ArrayBuffer to Blob and send it to the server. Hopefully, this should solve your issue.
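As a hedged sketch of the "send it to the server" step: if the AJAX callback reads text parameters rather than a binary body (an assumption about the receiving side, not something stated above), the exported bytes can be base64-encoded and cut into pieces instead of being wrapped in a Blob. The function names and the 30000-character piece size are made up for this example.

```typescript
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode raw bytes as standard base64 (with "=" padding).
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += BASE64_ALPHABET.charAt(b0 >> 2);
    out += BASE64_ALPHABET.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    out += i + 1 < bytes.length ? BASE64_ALPHABET.charAt(((b1 & 0x0f) << 2) | (b2 >> 6)) : "=";
    out += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(b2 & 0x3f) : "=";
  }
  return out;
}

// Cut a long string into pieces of at most `size` characters.
function splitText(text: string, size: number): string[] {
  if (size <= 0) {
    throw new Error("size must be positive");
  }
  const parts: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    parts.push(text.substring(i, i + size));
  }
  return parts;
}

// Hypothetical helper: PDF bytes in, base64 text pieces out.
// The default piece size is an assumption, chosen only to keep each piece small.
function pdfToTextChunks(bytes: Uint8Array, pieceSize: number = 30000): string[] {
  return splitText(toBase64(bytes), pieceSize);
}
```

With the viewer, this would be called roughly as instance.exportPDF().then((buffer) => pdfToTextChunks(new Uint8Array(buffer))), and the resulting array handed to whatever AJAX call the page uses. The server side then has to concatenate the pieces and decode them back into a BLOB.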
I would suggest reaching out to our support directly. We offer fast assistance, and questions are handled directly by the Web team: https://pspdfkit.com/support/request/.

Best way to save json data and files in react native?

I am developing a react-native application. I want it to get a json file online and print it as a list, showing the image and by clicking, showing the pdf. Here is my json :
[
{
"id":"1",
"nb_edit":0,
"nom":"Catalog 1",
"description":"test",
"apercu":"img\/1.png",
"pdf":"pdf\/1.pdf"
},
{
"id":"2",
"nb_edit":"0",
"nom":"Hi",
"description":"yes",
"apercu":"img\/2.png",
"pdf":"pdf\/2.pdf"
}
]
It actually works fine. But now, I want to add an offline mode. For this, I want to basically set a version variable in the application to 0, and check if the version on the website (version.txt) is the same. If yes, just load the json file saved in the phone (the png and pdf are saved too). If not, download the json, update the json saved locally, update the version variable, and then load the files from the phone.
Do you have any idea how I could do this? I thought about using redux-persist for the version, but will it work for the json, the images and the pdfs, and how?
Thanks for your help.
You can use :
Option1:
AsyncStorage is an unencrypted, asynchronous, persistent, key-value
storage system that is global to the app. It should be used instead of
LocalStorage.
https://reactnative.dev/docs/asyncstorage.html
Option2:
Local database
SQLite is an open-source SQL database that stores data in a single file
on the device. It supports most standard relational database features. In
order to access this database, we don’t need to establish any kind of
connections for it like JDBC, ODBC.
react-native-sqlite-storage
https://www.npmjs.com/package/react-native-sqlite-storage
I would not try to save the file to the device. Instead, I would save the data to the device's storage using AsyncStorage (if it is not that big). If you go this way, do not forget to stringify the object before saving it to the storage.
You may use react-native-fs for saving files such as pdf, images.
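The version check described in the question, combined with the stringify advice above, might look roughly like this. It is a sketch under assumptions: the store interface mirrors only the getItem/setItem calls of AsyncStorage, fetchText is a hypothetical downloader passed in by the caller, and the file name catalog.json is made up. Downloading the images and PDFs themselves is left out; that part would go through a file library such as react-native-fs.

```typescript
// Same shape as the two AsyncStorage calls used here (both return promises).
interface KeyValueStore {
  getItem(key: string): Promise<string | null>;
  setItem(key: string, value: string): Promise<void>;
}

// Hypothetical downloader: resolves with the body of a URL as text.
type FetchText = (url: string) => Promise<string>;

interface CatalogEntry {
  id: string;
  nom: string;
  description: string;
  apercu: string;
  pdf: string;
}

async function loadCatalog(
  store: KeyValueStore,
  fetchText: FetchText,
  baseUrl: string
): Promise<CatalogEntry[]> {
  const localVersion = (await store.getItem("version")) ?? "0";

  let remoteVersion: string;
  try {
    remoteVersion = (await fetchText(baseUrl + "/version.txt")).trim();
  } catch {
    // Offline: behave as if nothing changed and fall back to the saved copy.
    remoteVersion = localVersion;
  }

  const cached = await store.getItem("catalog");
  if (remoteVersion === localVersion && cached !== null) {
    return JSON.parse(cached) as CatalogEntry[];
  }

  // Version changed (or nothing saved yet): download, then persist as a string.
  // This throws if the device is offline and nothing was saved before.
  const fresh = JSON.parse(await fetchText(baseUrl + "/catalog.json")) as CatalogEntry[];
  await store.setItem("catalog", JSON.stringify(fresh));
  await store.setItem("version", remoteVersion);
  return fresh;
}
```

Passing the store and the downloader in as parameters keeps the logic testable without a device; in the app they would be AsyncStorage and a thin wrapper around fetch.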

Genexus 15, save PDFs, GIFs, JPGs, WORD documents

I need to save PDFs, GIFs, TIFs, JPGs, etc. How can I do this in GeneXus 15 compiling in C#?
After this I have to show the saved documents in a form.
Thank you..
PS: I'm new to GeneXus...
This question seems too broad to answer... Will those files be stored in the database or the file system? Or perhaps in an external storage like Amazon's S3?
Will the application store different file types in the same database column, or will there be a field storing images, another one for PDFs, etc.?
Anyway, here are some documents that may be of some help:
Blob data type for storing any file in the database (or see BlobFile data type if using GeneXus Tero, in pre-beta at this moment...)
File data type for storing files in the file system
Image data type for storing image files in the database (there is also Audio and Video which work exactly the same way)
External Storage for Multimedia explains how to store multimedia files in an external service.
Hope this helps...

Using ElasticSearch and/or Solr as a datastore for MS Office and PDF documents

I'm currently designing a full text search system where users perform text queries against MS Office and PDF documents, and the result will return a list of documents that best match the query. The user will then be able to select any document returned and view that document within MS Word, Excel, or a PDF viewer.
Can I use ElasticSearch or Solr to import the raw binary documents (i.e. .docx, .xlsx, .pdf files) into its "data store", and then export the document to the user's device on command for viewing?
Previously, I used MongoDB 2.6.6 to import the raw files into GridFS and the extracted text into a separate collection (the collection contained a text index) and that worked fine. However, MongoDB full text searching is quite basic and therefore I'm now looking at either Solr or ElasticSearch to perform more complex text searching.
Nick
Both Solr and Elasticsearch will index the content of the document. Solr has that built-in, Elasticsearch needs a plugin. Easy either way and both use Tika under the covers.
Neither of them will store the document itself. You can try making them do it, but they are not designed for it and you will suffer.
Additionally, neither Solr nor Elasticsearch is currently recommended as primary storage. They can do it, but it is not as mission-critical for them as, say, for a filesystem implementation.
So, I would recommend having the files somewhere else and using Solr/Elasticsearch for searching only. That's where they shine.
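To illustrate the split recommended above (files live somewhere else, the search side holds only the extracted text and a pointer back to the file), here is a toy in-memory index. It is a stand-in for the idea, not Solr or Elasticsearch, and the class and field names are made up.

```typescript
interface IndexedDoc {
  id: string;
  path: string; // where the real file lives (file system, object store, ...)
}

// Lowercase the text and split it on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((term) => term.length > 0);
}

class TinyIndex {
  private postings = new Map<string, Set<string>>(); // term -> ids of docs containing it
  private docs = new Map<string, IndexedDoc>();

  // Index the extracted text, but keep only the path, never the file itself.
  add(id: string, path: string, extractedText: string): void {
    this.docs.set(id, { id, path });
    for (const term of tokenize(extractedText)) {
      let ids = this.postings.get(term);
      if (ids === undefined) {
        ids = new Set<string>();
        this.postings.set(term, ids);
      }
      ids.add(id);
    }
  }

  // Return the docs that contain every term of the query.
  search(query: string): IndexedDoc[] {
    const terms = tokenize(query);
    const results: IndexedDoc[] = [];
    if (terms.length === 0) {
      return results;
    }
    this.docs.forEach((doc, id) => {
      const matchesAll = terms.every((term) => {
        const ids = this.postings.get(term);
        return ids !== undefined && ids.has(id);
      });
      if (matchesAll) {
        results.push(doc);
      }
    });
    return results;
  }
}
```

A search hit gives back a path, and the application then opens the file from wherever it is actually stored.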
I would try the Elasticsearch attachment plugin. Details can be found here:
https://www.elastic.co/guide/en/elasticsearch/plugins/2.2/mapper-attachments.html
https://github.com/elasticsearch/elasticsearch-mapper-attachments
It's built on top of Apache Tika:
http://tika.apache.org/1.7/formats.html
Attachment Type
The attachment type allows to index different "attachment" type field
(encoded as base64), for example, Microsoft Office formats, open
document formats, ePub, HTML, and so on (full list can be found here).
The attachment type is provided as a plugin extension. The plugin is a
simple zip file that can be downloaded and placed under
$ES_HOME/plugins location. It will be automatically detected and the
attachment type will be added.
Supported Document Formats
HyperText Markup Language
XML and derived formats
Microsoft Office document formats
OpenDocument Format
iWorks document formats
Portable Document Format
Electronic Publication Format
Rich Text Format
Compression and packaging formats
Text formats
Feed and Syndication formats
Help formats
Audio formats
Image formats
Video formats
Java class files and archives
Source code
Mail formats
CAD formats
Font formats
Scientific formats
Executable programs and libraries
Crypto formats
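As a rough illustration of the attachment type quoted above, the request bodies might be built like this. The type name "document" and the field name "file" are made up for the example; the "type": "attachment" mapping and the base64 field value follow my reading of the 2.x plugin README, so check them against the version you actually run.

```typescript
// Hypothetical type and field names, chosen for this example only.
const DOC_TYPE = "document";
const FILE_FIELD = "file";

// Body for a mapping request that declares FILE_FIELD as an attachment.
function attachmentMapping(): Record<string, unknown> {
  return {
    [DOC_TYPE]: {
      properties: {
        [FILE_FIELD]: { type: "attachment" },
      },
    },
  };
}

// Body for an index request: the file content goes in as a base64 string.
function attachmentDocument(base64Content: string): Record<string, unknown> {
  return { [FILE_FIELD]: base64Content };
}
```

The two objects would be serialized with JSON.stringify and sent as the bodies of the mapping request and the index request respectively.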
A bit late to the party but this may help someone :)
I had a similar problem and some research led me to fscrawler. Description:
This crawler helps to index binary documents such as PDF, Open Office, MS Office.
Main features:
Local file system (or a mounted drive) crawling: indexes new files, updates existing ones and removes old ones.
Remote file system over SSH crawling.
REST interface to let you "upload" your binary documents to elasticsearch.
Regarding solr:
If the docs only need to be returned on metadata searches, Solr features a BinaryField fieldtype, to which you can send binary data base64 encoded. Keep in mind that in general people recommend against doing this, as it may increase your index size (and with it RAM requirements and query times), and if possible a set-up where you store the files externally (and the path to the file in Solr) might be a better choice.
If you want Solr to automatically index the text inside the pdf/doc -- that's possible with the ExtractingRequestHandler: https://wiki.apache.org/solr/ExtractingRequestHandler
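For the extracting handler, the request usually goes to an /update/extract path, with extra field values passed as literal.<field> parameters and the file itself sent as the POST body. A minimal sketch of building that URL follows; the function name is made up, the base URL is whatever your Solr core or collection lives under, and any literal field other than id (such as path_s in the usage line) is an assumption about your schema.

```typescript
// Build the URL for posting a file to Solr's extracting handler.
// `literals` become literal.<field>=<value> parameters.
function extractUrl(
  solrBase: string,
  literals: { [field: string]: string },
  commit: boolean = true
): string {
  const params: string[] = [];
  for (const field of Object.keys(literals)) {
    params.push("literal." + encodeURIComponent(field) + "=" + encodeURIComponent(literals[field]));
  }
  if (commit) {
    params.push("commit=true");
  }
  // Drop trailing slashes so the path is not doubled up.
  return solrBase.replace(/\/+$/, "") + "/update/extract?" + params.join("&");
}
```

For example, extractUrl("http://localhost:8983/solr", { id: "doc1", path_s: "/files/a.pdf" }) stores the external path alongside the extracted text, in line with keeping the file itself outside Solr.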
Elasticsearch does store documents (.pdf and .doc files, for instance) in the _source field. It can be used as a NoSQL datastore (similar to MongoDB).