Firestore document with umlaut, two different "ö" - react-native

My problem: when I create a new Firestore document whose name includes the umlaut "ö", the name is stored differently from what I expect. Can you compare both documents and tell me what the difference between these two "ö" characters is? In the first picture the "ö" is rendered larger than in the second picture. Because of that, my other functions (for example a search function that looks up the document by name) do not work for document names containing an umlaut. I can't figure out the cause, and I hope you can show me the right way to handle this. I don't want to replace the umlauts.
Should I decode the variable that I pass as the document name in my setup function?
First image:
Second image:
Update:
I will explain my goal in a bit more detail. I have an index.html upload form that uploads multiple images to Firebase Storage and writes the image URL and other information to Firestore. When I upload my image folder, I read the path of each image from my system and split it so that only the folder name remains. I use this name as the document name in Firestore (this works for folders without an umlaut in the name). But when I create a document with the same name through the Firebase console, or replace the name with a variable such as text = "my string for foldername", the two do not match. My guess is that the retrieved folder name uses a different encoding, for example for the letter "ö".
var relpath = files[i].webkitRelativePath;
folder = relpath.split("/");
var foldername= "";
//foldername = unescape(encodeURIComponent(folder[0]));
foldername = folder[0];
var storage = firebase.storage().ref().child('kitaDE/duesseldorf/'+foldername+'/'+files[i].name);
//upload file
var upload = storage.put(files[i]); // added webkitRelativePath
//update progress bar
upload.on(
"state_changed",
function progress(snapshot) {
var percentage =
(snapshot.bytesTransferred / snapshot.totalBytes) * 100;
document.getElementById("progress").value = percentage;
},
function error() {
alert("error uploading file");
},
function complete() {
document.getElementById(
"uploading"
).innerHTML += `${files[i].name} uploaded <br />`;
},
);
db.collection("kitaDE").doc(foldername).set({
image: [],
id: "",
active: true,
title: "",
street: "",
zipcode: "",
location: "",
})
Update 2
I copied and pasted both the folder name and the name I typed directly into the Firebase console.
Foldername copied:
Am Köhnen
Name entered in the Firebase console with my keyboard:
Am Köhnen
To me they look the same. I ran my JavaScript code and logged the following values to the console.
var relpath = files[i].webkitRelativePath;
folder = relpath.split("/");
var foldername= "";
foldername = folder[0];
var foldername2 = "Am Köhnen";
var foldername3 = decodeURIComponent(escape(foldername2))
My result is the following screenshot.
Console.log Output
You can see that the first name seems right, but the first and the third output names do not match. They look the same, but they are not; see the two pictures at the beginning of my post. Firestore handles the names differently.
To get a hex dump, I ran this command in the parent directory of the problematic one:
bash$ printf '%s\n' Am\ K*hnen | xxd
00000000: 416d 204b 6fcc 8868 6e65 6e0a Am Ko..hnen.

There are multiple sequences that result in an ö character being displayed. One of them uses a single Unicode codepoint to represent the character (U+00F6), but the other actually uses a separate codepoint for the o and then another one for the umlaut (U+006F U+0308).
Also see:
The wikipedia page on combining characters
The wikipedia list of unicode characters
My first idea is that the two titles in your documents are written with different Unicode sequences.
I thought that Firestore would equate these two ways of writing, but I can't find anything in the documentation about that now. If it doesn't, then that would explain why a query that matches one of the codepoint combinations for ö doesn't match the other combination.
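If Firestore does not equate the two forms, one option is to normalize names yourself before writing and before querying. The sketch below uses the standard String.prototype.normalize method with the NFC form, which composes "o" + U+0308 into U+00F6. The helper name toDocumentId is hypothetical, and the sketch assumes the folder name arrives in decomposed form, as the hex dump above shows.

```javascript
// Two visually identical names built from different code point sequences.
const decomposed = "Am Ko\u0308hnen";  // "o" followed by U+0308 (matches the hex dump: 6f cc 88)
const precomposed = "Am K\u00F6hnen";  // single code point U+00F6

// Hypothetical helper: normalize a name to NFC before using it as a document ID.
function toDocumentId(name) {
  return name.normalize("NFC");
}

console.log(decomposed === precomposed);                             // false
console.log(toDocumentId(decomposed) === toDocumentId(precomposed)); // true
console.log(decomposed.length, toDocumentId(decomposed).length);     // 10 9
```

In the upload code this would mean calling db.collection("kitaDE").doc(toDocumentId(foldername)) and applying the same helper to any search term, so that both sides of the comparison use the same form.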

Related

How do I print the output of google sheets script back to google sheet?

I am trying to use an API to verify phone numbers and emails for a database I want to create in Google Sheets. I wrote the following in the sheet's Apps Script editor.
function phone(phno) {
var response = UrlFetchApp.fetch("https://neutrinoapi.net/phone-validate?user-id=USERNAME_HERE&api-key=API_KEY_HERE&number="+phno+"&country-code=IN")
// Parse the JSON reply
var json = response.getContentText();
var data = JSON.parse(json);
Logger.log(data["valid"]);
Logger.log(data["prefix-network"]);
Logger.log(data["type"]);
}
The JSON reply contains several fields: valid = true/false (whether the number is valid); prefix-network = the service provider; type = mobile/landline, etc.
The idea is that if I call =phone(D2) in cell E2, for example, where D2 stores a phone number, E2 should show the validity as a boolean true/false. I would also like the next two columns to show prefix-network and type, but those are desirable rather than essential.
The API works fine, and I can see the correct output in the Apps Script execution logs for the few test values I tried. However, I cannot find a way to display the result back in the sheet in cell E2. The code has to be generic, so that if I drag the formula down, each row gets the output for its own input cell.
I hope the question is clear and the code is sufficient to explain it.
Looking forward to the replies.
Thanks in advance for the help!!
Something like this:
function phone(phno) {
var response = UrlFetchApp.fetch("https://neutrinoapi.net/phone-validate?user-id=USERNAME_HERE&api-key=API_KEY_HERE&number="+phno+"&country-code=IN")
// Parse the JSON reply
var json = response.getContentText();
var data = JSON.parse(json);
Logger.log(data["valid"]);
Logger.log(data["prefix-network"]);
Logger.log(data["type"]);
const sh = SpreadsheetApp.getActiveSheet();
sh.getRange(1,1,1,3).setValues([[data["valid"],data["prefix-network"],data["type"]]]);
}
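One caveat: when phone is called from a cell as a custom function (=phone(D2)), Apps Script does not let it write to other cells with setValues. A custom function instead returns its result, and a returned two-dimensional array spills into the adjacent cells. The following is a minimal sketch of that variant; the helper name toRow is hypothetical, and the URL and field names are taken from the question.

```javascript
// Hypothetical helper: turn the parsed API reply into one spreadsheet row
// (a 2D array with a single row and three columns).
function toRow(data) {
  return [[data["valid"], data["prefix-network"], data["type"]]];
}

/**
 * Custom function for use in a cell, e.g. =phone(D2).
 * The returned row fills the formula cell and the two cells to its right.
 */
function phone(phno) {
  var url = "https://neutrinoapi.net/phone-validate?user-id=USERNAME_HERE&api-key=API_KEY_HERE"
    + "&number=" + encodeURIComponent(phno) + "&country-code=IN";
  var response = UrlFetchApp.fetch(url);            // available only inside Apps Script
  var data = JSON.parse(response.getContentText()); // parse the JSON reply
  return toRow(data);
}
```

With =phone(D2) in E2, the validity lands in E2, prefix-network in F2, and type in G2; dragging the formula down repeats this for each row.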

Need t2.gstatic URL parameters for Web Scraping

I am checking whether I can use gstatic to scrape favicons from websites. The URL below fetches a website's favicon:
https://t2.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://yahoo.com&size=64
I understand that the URL parameters might not be for general use, but just checking if anyone knows where this might be documented?
UPDATE: I have just started building an app on Google Apps Script. I need to list website names along with their favicons and metadata like site description, etc. Currently my only approach is to read the webpage, parse it with BeautifulSoup, and then locate the favicon. I came across the above link, which gives me the favicon directly! But I want to understand it better and am trying to find more information on the URL parameters for gstatic.
I am also open to alternative ways to scrape a web site from Google App Script...
Thanks
I believe your goal is as follows.
You want to retrieve the favicon from the websites.
You want to use the following sample URL.
https://t2.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://yahoo.com&size=64
From "I need to list website names along with their favicons and metadata like site description, etc.", I understand that you want to retrieve the favicon, title, and description of the site using Google Apps Script.
Sample script 1:
When your URL of https://t2.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://yahoo.com&size=64 is used, how about the following sample script? Please copy and paste the following script into the script editor of Google Apps Script, and run sample1 in the script editor.
function sample1() {
const uri = 'https://t2.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://yahoo.com&size=64';
const blob = UrlFetchApp.fetch(encodeURI(uri)).getBlob();
DriveApp.createFile(blob);
}
When this script is run, the favicon is retrieved and that is saved as a file to the root folder of Google Drive.
When I saw the URL, it seems that the favicon is retrieved as the image data.
Sample script 2:
When the favicon, title, and description of the site are retrieved, how about the following sample script?
function sample2() {
const uri = 'https://yahoo.com'; // Please set the URL.
const obj = { title: "", description: "", faviconUrl: "" };
const res = UrlFetchApp.fetch(encodeURI(uri));
const html = res.getContentText();
const title = html.match(/<title>(.+?)<\/title>/i);
if (title && title.length > 1) {
obj.title = title[1];
}
const description = html.match(/<meta.+name\="description".+>/i);
if (description) {
const d = description[0].match(/content\="(.+)"/i);
if (d && d.length > 1) {
obj.description = d[1];
}
}
const faviconUrl = html.match(/rel="icon".+?href\="(.+?)"/i);
if (faviconUrl && faviconUrl.length > 1) {
obj.faviconUrl = faviconUrl[1];
}
console.log(obj);
}
When this script is run, you can see the following value in the log.
{
"title":"Yahoo | Mail, Weather, Search, Politics, News, Finance, Sports & Videos",
"description":"Latest news coverage, email, free stock quotes, live scores and video are just the beginning. Discover more every day at Yahoo!",
"faviconUrl":"https://s.yimg.com/cv/apiv2/default/icons/favicon_y19_32x32_custom.svg"
}
Reference:
fetch(url)

AngularJS: Take a single item from an array and add to scope

I have a ctrl that pulls a json array from an API. In my code I have an ng-repeat that loops through results.
This is for a PhoneGap mobile app and I'd like to take a single element from the array so that I can use it for the page title.
So... I'm wanting to use 'tool_type' outside of my ng-repeat.
Thanks in advance - I'm just not sure where to start on this one.
Example json data
[{ "entry_id":"241",
"title":"70041",
"url_title":"event-70041",
"status":"open",
"images_url":"http://DOMAIN.com/uploads/event_images/241/70041__small.jpg",
"application_details":"Cobalt tool bits are designed for machining work hardening alloys and other tough materials. They have increased water resistance and tool life. This improves performance and retention of the cutting edge.",
"product_sku":"70041",
"tool_type": "Toolbits",
"sort_group": "HSCo Toolbits",
"material":"HSCo8",
"pack_details":"Need Checking",
"discount_category":"102",
"finish":"P0 Bright Finish",
"series_description":"HSS CO FLAT TOOLBIT DIN4964"},
..... MORE .....
Ctrl to call API
// Factory to get products by category
app.factory("api_get_channel_entries_products", function ($resource) {
var catID = $.url().attr('relative').replace(/\D/g,'');
return $resource(
"http://DOMAIN.com/feeds/app_productlist/:cat_id",
{
cat_id: catID
}
);
});
// Get the list from the factory and put data into $scope.categories so it can be repeated
function productList ($scope, api_get_channel_entries_products, $compile) {
$scope.products_list = [];
// Get the current URL and then regex out everything except numbers - ie the entry id
$.url().attr('anchor').replace(/\D/g,'');
$scope.products_list = api_get_channel_entries_products.query();
}
Angular works as follows:
Forgiving: expression evaluation is forgiving to undefined and null, unlike in JavaScript, where trying to evaluate undefined properties can generate ReferenceError or TypeError.
http://code.angularjs.org/1.2.9/docs/guide/expression
so you only need to write:
<title>{{products_list[0].tool_type}}</title>
If the array has an element at index zero, the title will be its tool_type; if not, the title stays empty.
Assuming you want to select a random object from the list, something like this should work:
$scope.product_tool_type = $scope.products_list[Math.floor(Math.random()*$scope.products_list.length)].tool_type
Then to display the result just use
<h1>{{product_tool_type}}</h1>
Or alternatively (Angular expressions cannot reach globals such as Math, so this needs $scope.Math = Math in the controller):
<h1>{{products_list[Math.floor(Math.random()*products_list.length)].tool_type}}</h1>
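Because query() returns an array that is filled in only once the request completes, code in the controller body that indexes into the list may run while it is still empty. A minimal sketch of one way around that is to set the title inside the success callback that $resource actions accept. The names pickToolType and page_title are hypothetical; the factory and field names come from the question.

```javascript
// Hypothetical helper: tool_type of the first product, or "" for an empty list.
function pickToolType(products) {
  return products.length > 0 ? products[0].tool_type : "";
}

// Controller sketch: same factory as in the question, plus a success callback.
function productList($scope, api_get_channel_entries_products) {
  $scope.page_title = "";
  $scope.products_list = api_get_channel_entries_products.query(function (products) {
    // Runs after the data has arrived, so the list is populated here.
    $scope.page_title = pickToolType(products);
  });
}
```

The view can then bind to the scope value with {{page_title}} outside the ng-repeat.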

how do we read a csv file and display the same in dojo?

I want to write Dojo code that, on a button click, reads a .csv file and displays its contents in a data grid. Can anyone please help me with this?
Your best try is to retrieve the data using the dojo/request/xhr module, with it you can retrieve an external CSV file. For example:
require(["dojo/request/xhr"], function(xhr) {
xhr.get("myfile.csv", {
handleAs: "text"
}).then(function(data) {
// Use data
});
});
Now you have the data as a string in your data parameter, and the next thing to do is parse that string. The easiest way is to split the string on each line break, for example:
var lines = data.split(/\r?\n/);
The \r is optional (it depends on the system you're using), but now you have each line separated into an array element in lines. The next step is to retrieve each separate value, for example by doing:
require(["dojo/_base/array"], function(array) {
/** Rest of code */
array.forEach(lines, function(line) {
var cells = line.split(',');
});
});
Then you have your data split on each comma. The next step is to change it to a format that the dojox/grid/DataGrid can read. This means you will probably convert the first line of your CSV content to your headers (if the file contains headers) and the rest of the data to objects (instead of arrays of strings).
You can get the first line with:
var headers = lines[0].split(',');
And the rest of the data with:
var otherData = lines.splice(1);
Now you should carefully read the documentation @MiBrock gave you, and with it you can transform the simple array of strings to the correct format.
The headers should become an array of objects like this:
[{ name: "display name", field: "field name" }, { /** other objects */ }]
I did that by doing:
array.map(headers, function(header) {
return {
field: header,
name: header
};
});
This will actually convert [ "a", "b" ] to:
[{ field: "a", name: "a" }, { field: "b", name: "b" }]
Now you need to convert all other lines to objects containing a field with name a and a field with name b, you can do that using:
array.map(otherData, function(line) {
var cells = line.split(','), obj = {};
array.forEach(cells, function(cell, idx) {
obj[headers[idx]] = cell;
});
return obj;
});
This will actually retrieve the fieldname by retrieving the corresponding header value and output it as a single object.
Now you can put it in a grid; look at @MiBrock's answer for more details about how to do that. The final result would look a bit like this JSFiddle.
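The steps above can be combined into one function. This is only a sketch: the name csvToGrid is hypothetical, it uses plain JavaScript array methods instead of dojo/_base/array so it runs outside Dojo, and like the snippets above it splits naively on commas, so quoted fields that contain commas are not handled.

```javascript
// Hypothetical helper: convert CSV text into a grid structure plus row objects.
function csvToGrid(text) {
  // Split into lines and drop empty ones (e.g. a trailing newline).
  var lines = text.split(/\r?\n/).filter(function (line) {
    return line.length > 0;
  });
  if (lines.length === 0) {
    return { structure: [], items: [] };
  }
  var headers = lines[0].split(",");
  // One column definition per header, in the { field, name } shape shown above.
  var structure = headers.map(function (header) {
    return { field: header, name: header };
  });
  // One object per remaining line, keyed by the header names.
  var items = lines.slice(1).map(function (line) {
    var cells = line.split(",");
    var obj = {};
    headers.forEach(function (header, idx) {
      obj[header] = cells[idx];
    });
    return obj;
  });
  return { structure: structure, items: items };
}

var result = csvToGrid("a,b\n1,2\n3,4\n");
console.log(JSON.stringify(result.structure)); // [{"field":"a","name":"a"},{"field":"b","name":"b"}]
console.log(JSON.stringify(result.items));     // [{"a":"1","b":"2"},{"a":"3","b":"4"}]
```

The structure array is meant for the grid's column layout and the items array for the store behind it.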
Note: Next time you encounter a problem you can't solve, handle the parts you actually can solve first and then ask a question about the other parts. I mean, it's hard to believe that you can actually not solve any of this by yourself. You should try and learn it by yourself first.
The dojo-smore project includes a CSV object store that loads data from CSV into an in-memory store which can then be used with any component that supports object stores, like dgrid. You shouldn’t try to do it yourself, like Dimitri M suggested, and you shouldn’t use the dojox grids, which are deprecated.
You can use dojox.data.CsvStore to read a CSV store and use it in a data grid.
Have a look here : http://dojotoolkit.org/reference-guide/1.8/dojox/grid/index.html#dojox-grid-index
and here: http://dojotoolkit.org/documentation/tutorials/1.9/populating_datagrid/
This should help you get started with your programming.
If you need further help, show us what you have tried so far by posting your code, and we'll be glad to help you.
Regards, Miriam

Can I read PDF or Word Docs with Node.js?

I can't find any packages to do this. I know PHP has a ton of libraries for PDFs (like http://www.fpdf.org/) but anything for Node?
textract is a great lib that supports PDFs, Doc, Docx, etc.
Looks like there are a few for PDF, but I didn't find any for Word.
CPU bound processing like that isn't really Node's strong point anyway (i.e. you get no additional benefits using node to do it over any other language). A pragmatic approach would be to find a good tool and utilise it from Node.
I have heard good things around the office about docsplit http://documentcloud.github.com/docsplit/
While it's not Node, you could easily invoke it from Node with http://nodejs.org/docs/latest/api/all.html#child_process.exec
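A minimal sketch of calling an external tool from Node with the child_process module follows. The helper name runTool is hypothetical, and the synchronous execFileSync is used only to keep the example short; in a server you would more likely use the asynchronous execFile or exec. The demo invokes the Node binary itself so that it runs anywhere; for docsplit you would pass its executable name and whatever arguments its documentation specifies.

```javascript
const { execFileSync } = require("child_process");

// Hypothetical helper: run an external command and return its stdout as a string.
// execFileSync throws if the command is missing or exits with a non-zero status.
function runTool(command, args) {
  return execFileSync(command, args, { encoding: "utf8" });
}

// Demo: spawn the current Node binary as a stand-in for a conversion tool.
const output = runTool(process.execPath, ["-e", "console.log('converted')"]);
console.log(output.trim()); // converted
```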
You can easily convert one into another, or use for example a .doc template to generate a .pdf file, but you will probably want to use an existing web service for this task.
This can be done using the services of Livedocx for example
To use this service from node, see node-livedocx (Disclaimer: I am the author of this node module)
I would suggest looking into unoconv for your initial conversion. It uses LibreOffice or OpenOffice for the actual conversion, which adds some overhead.
I'd set up a few workers with everything they need installed, and use a request/response queue for handling the conversion... (you may want to look into kue or zmq)
In general this is a CPU bound and heavy task that should be offloaded... Pandoc and others specifically mention .docx, not .doc so they may or may not be options as well.
Note: I know this question is old, just wanted to provide a current answer for others coming across this.
You can use pdf-text for PDF files. It will extract text from a PDF into an array of text 'chunks', which is useful for doing fuzzy parsing on structured PDF text.
var pdfText = require('pdf-text')
var pathToPdf = __dirname + "/info.pdf"
pdfText(pathToPdf, function(err, chunks) {
//chunks is an array of strings
//loosely corresponding to text objects within the pdf
//for a more concrete example, view the test file in this repo
})
var fs = require('fs')
var buffer = fs.readFileSync(pathToPdf)
pdfText(buffer, function(err, chunks) {
console.log(chunks)
})
For .docx files you can use mammoth, which extracts text from .docx files.
var mammoth = require("mammoth");
mammoth.extractRawText({path: "./doc.docx"})
.then(function(result){
var text = result.value; // The raw text
console.log(text);
var messages = result.messages;
})
.done();
I hope this will help.
For parsing PDF files you can use the pdf2json node module.
It lets you convert a PDF file to JSON as well as to raw text data.
Another good option if you only need to convert from Word documents is Mammoth.js.
Mammoth is designed to convert .docx documents, such as those created
by Microsoft Word, and convert them to HTML. Mammoth aims to produce
simple and clean HTML by using semantic information in the document,
and ignoring other details. For instance, Mammoth converts any
paragraph with the style Heading 1 to h1 elements, rather than
attempting to exactly copy the styling (font, text size, colour, etc.)
of the heading.
There's a large mismatch between the structure used by .docx and the
structure of HTML, meaning that the conversion is unlikely to be
perfect for more complicated documents. Mammoth works best if you only
use styles to semantically mark up your document.
Here is an example showing how to download and extract text from a PDF using PDF.js:
import _ from 'lodash';
import superagent from 'superagent';
import pdf from 'pdfjs-dist';
const url = 'http://unec.edu.az/application/uploads/2014/12/pdf-sample.pdf';
const main = async () => {
const response = await superagent.get(url).buffer();
const data = response.body;
const doc = await pdf.getDocument({ data });
for (const i of _.range(doc.numPages)) {
const page = await doc.getPage(i + 1);
const content = await page.getTextContent();
for (const { str } of content.items) {
console.log(str);
}
}
};
main().catch(error => console.error(error));
You can use the Aspose.Words Cloud SDK for Node.js to extract text from DOC/DOCX, OpenOffice, and PDF files. It's a paid API, but the free plan provides 150 free API calls per month.
P.S.: I'm a developer evangelist at Aspose.
const { WordsApi, ConvertDocumentRequest } = require("asposewordscloud");
const fs = require('fs');
// Get Customer ID and Customer Key from https://dashboard.aspose.cloud/
const wordsApi = new WordsApi("xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx", "xxxxxxxxxxxxxxxxxxxx");
const request = new ConvertDocumentRequest({
format: "txt",
document: fs.createReadStream("C:/Temp/02_pages.pdf"),
});
const outputFile = "C:/Temp/ConvertPDFtotxt.txt";
wordsApi.convertDocument(request).then((result) => {
console.log(result.response.statusCode);
console.log(result.body.byteLength);
fs.writeFileSync(outputFile, result.body);
}).catch(function(err) {
// Deal with an error
console.log(err);
});