BigQuery - extract JSON field which contains emoji?

In my BQ database table I have a column called payload which contains raw Facebook webhook JSON payloads as strings. One of them contains text with an emoji, like Sample 🏦. In BigQuery it looks like this:
{"object":"page","entry":[{"id":"xxxx","time":1602757469275,"messaging":[{"sender":{"id":"xxxx"},"recipient":{"id":"xxxx"},"timestamp":1602757469062,"message":{"mid":"m_xxxx","text":"Sample \ud83c\udfe6","quick_reply":{"payload":"{\"key\": \"value\"}"},"tags":{"source":"source"}}}]}]}
I would like to create a view with a column text holding the value of the text field extracted from the raw JSON. I wrote this SQL:
SELECT
JSON_EXTRACT_SCALAR(payload, '$.entry[0].messaging[0].message.text') as text,
FROM `my_table.facebook.webhook_received`
Sadly, the result I get looks like this: Sample ��
Does anyone know how to make BigQuery decode the emoji properly, or at least not change it to those � signs?

I believe your issue is that the characters embedded in your payload are not the ones for the bank icon.
Run the following in BQ and it returns the desired emoji:
select " Sample \U0001f3e6"
Ref: https://emojipedia.org/bank/
The two you have provided each seem to resolve to '?', an invalid character:
http://unicode.scarfboy.com/?s=U%2Bdfe6
Edit: whatever is handling the message may be producing the encoding you're seeing, which may be the actual problem.

If you are using the BigQuery Python client and its load_table_from_json method, earlier versions had a Unicode bug (especially for characters whose code point is above 0xFFFF, like 🏦). I submitted a fix for it, which is included in this release: https://github.com/googleapis/python-bigquery/releases/tag/v2.24.0. By the way, you should use \U0001F3E6, not \ud83c\udfe6 (the UTF-16 surrogate-pair form), to represent 🏦 in your Python code with BigQuery.
Unicode Character 'BANK': https://www.fileformat.info/info/unicode/char/1f3e6/index.htm
https://charbase.com/1f3e6-unicode-bank
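To see how the two notations relate, here is a small Python sketch (not BigQuery code). It assumes a cut-down payload with only the text field; the helper combine_surrogates is a made-up name for illustration. It shows that the pair \ud83c\udfe6 is the UTF-16 encoding of the single code point U+1F3E6, and that a JSON parser which understands surrogate pairs returns the emoji intact.

```python
import json

# Simplified stand-in for the stored payload (assumption: only the text
# field is kept). The emoji is written as two \uXXXX escapes.
raw = '{"message": {"text": "Sample \\ud83c\\udfe6"}}'

# Python's json module joins a high/low surrogate pair into one code point.
text = json.loads(raw)["message"]["text"]


def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = combine_surrogates(0xD83C, 0xDFE6)

# Going the other way: encoding the emoji as UTF-16 yields the same pair.
utf16_hex = "\U0001F3E6".encode("utf-16-be").hex()
```

Here text is "Sample 🏦", code_point is 0x1F3E6, and utf16_hex is "d83cdfe6". A tool that treats each escape as a separate character, instead of joining the pair, would show two invalid characters like the ones in the question.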

Related

PostgreSQL: How to extract text from a particular letter?

I'm practicing exercises with SQL and I've got a problem I couldn't resolve yet.
I have a table with a column named **'email'** and I want to extract just the domain of each address. My idea was to extract everything from the '@' onward.
I don't know how to do it, though. I tried SUBSTRING, but that didn't work because it is position-based and each address has a different length.
I attach a screenshot of the table's structure (it does not contain real information). Thank you so much :)
I tried with the SUBSTRING method but that didn't work.
Example email: example_email@outlook.com
Expected output: @outlook.com
We can use SPLIT_PART to fetch everything after the @ and then prepend the @:
SELECT CONCAT('@', SPLIT_PART(email, '@', 2)) AS mailDomain
FROM people_practice;
Here is the documentation about this and other useful string functions.
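For checking expected results outside the database, the same logic can be sketched in Python. This is only an approximation of SPLIT_PART for positive field numbers; split_part and mail_domain are illustrative names, and the separator is a parameter so you can pass whichever character delimits the domain in your data.

```python
def split_part(text: str, sep: str, n: int) -> str:
    """Return the n-th field (1-based) of text split on sep, or '' if absent."""
    parts = text.split(sep)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


def mail_domain(email: str, sep: str) -> str:
    """Mirror CONCAT(sep, SPLIT_PART(email, sep, 2))."""
    return sep + split_part(email, sep, 2)


example = mail_domain("example_email@outlook.com", "@")
```

With the sample address, example is "@outlook.com". Like the SQL version, an address without the separator yields just the separator character.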

@DbLookup and formatting on web

I have been developing a web application using Domino, in which I use a DbLookup to fetch a field from the Notes client. This works fine, but the formatting of the value is lost on the web.
For example, in the Lotus Notes client the field value is displayed as below:
I am one, I am two, I am one , I am two, labbblallalalalalalalalalalalalalalalalalalaallllal
Labbbaalalalallalalalalalaalallaal
Hello there, labblalalallalalalllaalalalalalalalalalalalalalalalalalalalalalalala
Now when I retrieve the value of the field on the web, the second value follows immediately after the first, and so forth. I was expecting a line feed between them, which is not happening.
The field above is a multi-valued field. On the web I have used computed text which does the DbLookup from the Notes client.
Please help me with what else I could do, or an alternative solution, for this case.
Thanks
HD
Your multi-valued field has display options associated with it, and the Notes client honors those. Evidently, your options are set up to display entries separated by newlines.
The computed text that you are using for the web does not have options like that, and the field options are irrelevant because you aren't displaying the field. Your code has to insert the @NewLines. That's pretty easy because @DbLookup returns a list, and if you concatenate a list and a scalar, the scalar will be appended to each element of the list. (Look at the third example under "concatenation, pairwise" here to see what I mean.)
The way you've worded your question is a little unclear to me, but what you need in your computed text formula is either something like this:
list := @DbLookup(etc., etc.);
list + @NewLine;
Or something like this:
multiValueFieldContainingListWithDbLookupResult + @NewLine;
I used @Implode(Dblookupreturnedvalue;"");
thanks All :)
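As a rough analogy only (this is Python, not formula language), the list-plus-scalar concatenation described in the answer behaves like appending a suffix to every element of a list. The values in lookup_result are made up, and append_to_each is a hypothetical helper name.

```python
def append_to_each(values: list[str], suffix: str) -> list[str]:
    """Append the scalar suffix to every element, like list + scalar."""
    return [value + suffix for value in values]


# Hypothetical lookup result: one entry per value of the multi-valued field.
lookup_result = ["I am one", "I am two", "Hello there"]

with_newlines = append_to_each(lookup_result, "\n")

# Joining the entries with a separator gives one string with a line break
# between entries (and none after the last one).
joined = "\n".join(lookup_result)
```

The first form leaves a trailing newline after the last entry; the join form does not.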

Start index error while doing Code First Migration

I am trying to add fields from a VB.NET class file to a SQL database. While running "Add-Migration", it shows "startIndex cannot be larger than length of string."
I had the same error message when I tried to make a migration. The cause in my case was an empty value for MigrationId for a particular migration in the __MigrationHistory table.
This field must have a value in the same format as the string parameter of the attribute [Migration("YYYYMMDDHHMMSS_SeedData")], which is described in the other answer.
Most likely you have some class in your data project with a Migration attribute (maybe for seeding data or something similar) whose name is not in the expected format, like this:
[Migration("YYYYMMDDHHMMSS_SeedData")]
Adjust the migration name to the YYYYMMDDHHMMSS_Description format to fix the error "startIndex cannot be larger than length of string."

What's wrong with my Neo4j 2.0 query?

I am trying to understand why data is not showing up in my query, and I was wondering if there is any way to troubleshoot what's going on.
Here is the current issue:
I populated some data from an existing test database to check performance, with a relationship like this: (e:Event)-[:FOR_USER]->(u:User). When I fetch all the users and look at the property, I can see the data, but when I query the users using the same data, it says 0 records found.
The image below shows the two queries:
Can someone please help me understand how to debug such an issue in Neo4j?
EDIT
The issue is that the browser is somehow collapsing multiple spaces in the result. In this case "User-May<space>1 2013 1:18AM" was displayed in both webadmin and the new browser, but in reality it should have been "User-May<space><space>1 2013<space><space>1:18AM".
So no matter what I do, I can't query the value, as it looks like the duplicate spaces are truncated somewhere.
The tabular data Micheal suggested is below:
{"id":"75307","labels":["User"],"properties":{"Name":"User-May 1 2013 1:18AM"}}
and what we are seeing is User-May 1 2013 1:18AM
Regards
Kiran
Use the following Cypher syntax in the browser:
MATCH (user:User { Name: "User-May 1 2013 1:18AM" })
RETURN user.Name as Name
As for the rendering of multiple spaces being trimmed, that is browser-specific behavior. See the screenshot below for an example:
The text itself is preserved as it is returned from the Neo4j server. As you can see when I inspect the HTML element in the browser using Firebug, the redundant spaces are indeed there.
So again, this doesn't seem to be a bug in Neo4j; it's how the browser you are using renders the text. The browser expects redundant spaces to be HTML-encoded as non-breaking space entities, so "Testing  testing" would need to be written as Testing&nbsp;&nbsp;testing.
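The effect can be reproduced outside the browser with a short Python sketch. The stored value below is an assumption reconstructed from the asker's description (two spaces before "1" and before "1:18AM"), and both function names are made up for illustration.

```python
import re

# Assumed stored value, with the double spaces the asker describes.
stored = "User-May  1 2013  1:18AM"


def collapse_like_html(text: str) -> str:
    """Approximate default HTML rendering: runs of whitespace become one space."""
    return re.sub(r"\s+", " ", text)


def preserve_spaces_for_html(text: str) -> str:
    """Encode every space that follows another space as &nbsp;."""
    return re.sub(r"(?<= ) ", "&nbsp;", text)


displayed = collapse_like_html(stored)
encoded = preserve_spaces_for_html(stored)

# repr() is a quick way to see the exact stored characters when debugging.
debug_view = repr(stored)
```

Here displayed is "User-May 1 2013 1:18AM", which no longer equals the stored value, so a query built by copying the displayed text matches nothing. Copying the value from the raw response rather than from the rendered page avoids the problem.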

Preventing YQL from URL encoding a key

I am wondering if it is possible to prevent YQL from URL encoding a key for a datatable?
Example:
The current guardian API works with IDs like this:
item_id = "environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy"
The problem with these IDs is that they contain slashes (/) and these characters should not be URL encoded in the API call but instead stay as they are.
So if I now have this query
SELECT * FROM guardian.content.item WHERE item_id='environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy'
while using the following url definition in my datatable
<url>http://content.guardianapis.com/{item_id}</url>
then this results in this API call
http://content.guardianapis.com/environment%2F2010%2Foct%2F29%2Fbiodiversity-talks-ministers-nagoya-strategy?format=xml&order-by=newest&show-fields=all
Instead the guardian API expects the call to look like this:
http://content.guardianapis.com/environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy?format=xml&order-by=newest&show-fields=all
So the problem is really just that the / characters get encoded as %2F, which I don't want to happen in this case.
Any ideas on how this can be achieved?
You can also check the full datatable I am using:
http://github.com/spier/yql-tables/blob/master/guardian/guardian.content.item.xml
The URI-template expansions in YQL (e.g. {item_id}) only follow the version 3 spec. With version 4 it would be possible to simply (only slightly) change the expansion to do what you want, but alas not currently with YQL.
So, a solution. You could bring a very, very basic <execute> block into play: one which adds the item_id value to the path as needed.
<execute><![CDATA[
response.object = request.path(item_id).get().response;
]]></execute>
Finally, see the diff against your table (with a few other, minor tweaks to allow the above to work).
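The difference between the two URLs in the question can be illustrated in Python (this is not YQL). The sketch uses the standard library's urllib.parse.quote, whose safe parameter controls which reserved characters are left unencoded; BASE is simply the host from the question.

```python
from urllib.parse import quote

BASE = "http://content.guardianapis.com/"
item_id = "environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy"

# Treating the id as a single opaque value: '/' is percent-encoded as %2F,
# which matches the call that the URI template produces.
as_single_value = BASE + quote(item_id, safe="")

# Treating the id as path segments: '/' is kept as-is,
# which matches the call the API expects.
as_path = BASE + quote(item_id, safe="/")
```

The first result contains environment%2F2010%2Foct%2F29%2F..., the second keeps environment/2010/oct/29/... unchanged. The execute block above sidesteps the template expansion in the same spirit by adding the value to the path directly.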