What is the format of a Helium API "B58" address? - helium-api

The Helium API includes several requests that specify an address in "B58" format (examples).
What is the "B58" format, and what API will return a B58 address given a Helium node name?

What is the "B58" format
It is similar to Base64 encoding, but some easily confused characters have been removed from the alphabet. From Wikipedia:
Similar to Base64, but modified to avoid both non-alphanumeric
characters (+ and /) and letters that might look ambiguous when
printed (0 – zero, I – capital i, O – capital o and l – lower-case L).
Base58 is used to represent bitcoin addresses.[2] Some messaging and
social media systems break lines on non-alphanumeric strings. This is
avoided by not using URI reserved characters such as +. For segwit it
was replaced by Bech32, see below.
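To make the alphabet concrete, here is a minimal Base58 encoder sketch in Python. This is illustrative only: real address formats (Bitcoin's Base58Check, Helium's B58 addresses) also prepend a version byte and append a checksum, which this omits.

```python
# Bitcoin/Helium Base58 alphabet: no 0, O, I, or l, and no '+' or '/'.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58encode(data: bytes) -> str:
    # Interpret the bytes as one big-endian integer and emit base-58 digits.
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        n, rem = divmod(n, 58)
        digits.append(B58_ALPHABET[rem])
    # Leading zero bytes are conventionally encoded as leading '1' characters.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + "".join(reversed(digits))

print(b58encode(b"hello"))  # Cn8eVZg
```

Decoding is the reverse: map each character back to its alphabet index and accumulate base-58 digits into an integer.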
What API will return a B58 address given a Helium node name?
You want: https://api.helium.io/v1/hotspots/name/:name
From here: https://docs.helium.com/api/blockchain/hotspots/#hotspots-for-name
I think it was added sort of recently, so it likely didn't exist when you asked this question.

Related

What manipulations can be done to user emails to prevent duplicates

I am working on email-based authentication that checks the database for existing users by their email and decides whether to create a new account or use an existing one.
An issue I came across is that users sometimes use different capitalisation in their emails, append things like +1 in the middle, etc.
To combat some of these I am now (1) stripping whitespace from the emails and (2) always lowercasing them.
I would like to take this further, but am not sure what else I am allowed to do without breaking some emails i.e.
(3) Can I remove everything after + and before @ signs?
(4) Can I remove other symbols like . from the emails?
Email domains are case-insensitive, and although the standard technically permits a case-sensitive local part, providers universally treat it as case-insensitive, so changing all upper case to lower case is fine. Digits (0-9) are also valid in emails.
However, you should not remove any of the following characters from an email address:
!#$%&'*+-/=?^_`{|}~.
Control characters, white space and other specials are invalid.
If you discover characters not in the list of 20 characters above, they would represent an invalid email. How those are handled is undefined in the standard.
Why removing the + is an issue:
It is used by some mail providers to separate (file) inbound email into folders for a user. So jack+finance@email.com would go to a finance folder in Jack's email. Other mail providers consider it part of the email address, so jack+bauer@email.com can be a different account than jack+sparrow@email.com.
So removing the + (along with the characters after it) could conflate different email accounts into one.
Can I remove everything after + and before @ signs? Can I remove other symbols like . from the emails?
Sure, you can - but should you?
If you don't care about standards and want to block valid email addresses, then block any characters you like.
RFC 822 - Standard for ARPA Internet Text Messages and RFC 2822 - Internet Message Format clearly specify the valid characters for email addresses.
+ is no different to x, ! or $
The local-part (before the @) can contain:
uppercase and lowercase Latin letters (A-Z, a-z)
numeric values (0-9)
special characters, such as # ! % $ ' &
+ * - \ = ? ^ _ . { | } ~ `
...and you can block x, ! or $ or indeed any of them - but again - should you?
See: https://mozilla.fandom.com/wiki/User:Me_at_work/plushaters
No. Any manipulation along these lines is speculative at best, and harmful at worst. Some providers regard some characters as insignificant (so, for example, Gmail will famously ignore any dots in the localpart) but there is no safe generalization.
The only sane and safe way to validate an email address remains to send a message to it, and discard the address if the recipient does not respond, e.g. by clicking a link in the message or replying to it within a reasonable time frame (say, 48 hours). And if you don't have any previous relationship with the owner of this mailbox, don't send one at all; otherwise you're a spammer.
You can treat Gmail separately. (This is what some banks do today.)
If the address is Gmail, you do your items (3) and (4) (removing the plus part and ignoring the dots before the '@' sign). It is a good idea to warn the user at registration before removing them.
For other email providers, since it is impossible to keep track of how each one behaves, it is better to accept both the dot and the plus.
Considering Gmail addresses are the most frequently used for subscriptions, this should cover most cases.
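As an illustration of this approach, here is a Python sketch. The domain list and the dot/plus rules are Gmail-specific heuristics discussed above, not a standard; they must not be applied to arbitrary domains.

```python
def normalize_email(email: str) -> str:
    """Sketch of the normalization discussed above.

    Lowercasing and whitespace stripping are safe in practice; the dot and
    plus rules are Gmail-only heuristics.
    """
    email = email.strip().lower()
    local, _, domain = email.rpartition("@")
    if domain in ("gmail.com", "googlemail.com"):
        local = local.split("+", 1)[0]   # drop the +suffix
        local = local.replace(".", "")   # Gmail ignores dots in the local part
    return f"{local}@{domain}"

print(normalize_email("  Jack.Bauer+news@Gmail.com "))  # jackbauer@gmail.com
print(normalize_email("Jack.Bauer+news@example.com"))   # jack.bauer+news@example.com
```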

Regular expression to find usernames in NSString Objective C [duplicate]

Could you provide a regex that matches Twitter usernames?
Extra bonus if a Python example is provided.
(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9-_]+)
I've used this as it disregards emails.
Here is a sample tweet:
@Hello how are @you doing @my_friend, email @000 me @ whats.up@example.com @shahmirj
Matches:
@Hello
@you
@my_friend
@shahmirj
It will also work for hashtags; I use the same expression with the @ changed to #.
If you're talking about the @username thing they use on twitter, then you can use this:
import re
twitter_username_re = re.compile(r'@([A-Za-z0-9_]+)')
To make every instance an HTML link, you could do something like this:
my_html_str = twitter_username_re.sub(lambda m: '<a href="https://twitter.com/%s">%s</a>' % (m.group(1), m.group(0)), my_tweet)
The regex I use, and that has been tested in multiple contexts:
/(^|[^@\w])@(\w{1,15})\b/
This is the cleanest way I've found to test and replace Twitter username in strings.
#!/usr/bin/python
import re

text = "@RayFranco is answering to @jjconti, this is a real '@username83' but this is an@email.com, and this is a @probablyfaketwitterusername"
ftext = re.sub(r'(^|[^@\w])@(\w{1,15})\b', r'\1\2', text)
print(ftext)
This returns, as expected:
RayFranco is answering to jjconti, this is a real 'username83' but this is an@email.com, and this is a @probablyfaketwitterusername
Based on the Twitter specs:
Your username cannot be longer than 15 characters. Your real name can be longer (20 characters), but usernames are kept shorter for the sake of ease.
A username can only contain alphanumeric characters (letters A-Z, numbers 0-9) with the exception of underscores, as noted above. Check to make sure your desired username doesn't contain any symbols, dashes, or spaces.
Twitter recently released to open source in various languages including Java, Ruby (gem) and Javascript implementations of the code they use for finding user names, hash tags, lists and urls.
It is very regular expression oriented.
The only characters accepted are A-Z, 0-9, and underscore. Usernames are not case-sensitive, though, so you could use r'@(?i)[a-z0-9_]+' to match them regardless of case.
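One caveat: Python 3.11+ raises an error for inline flags such as (?i) that appear anywhere other than the start of the pattern, so it is safer to pass re.IGNORECASE explicitly. A sketch:

```python
import re

# Equivalent to the pattern above, with the case-insensitivity flag passed
# explicitly instead of embedded mid-pattern.
mention_re = re.compile(r'@([a-z0-9_]+)', re.IGNORECASE)

print(mention_re.findall("@Alice and @bob_99"))  # ['Alice', 'bob_99']
```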
This is a method I have used in a project that takes the text attribute of a tweet object and returns the text with both the hashtags and user_mentions linked to their appropriate pages on twitter, complying with the most recent twitter display guidelines
def link_tweet(tweet):
    """
    This method takes the text attribute from a tweet object and returns it with
    user_mentions and hashtags linked
    """
    tweet = re.sub(r'(\A|\s)#(\w+)', r'\1<a href="https://twitter.com/hashtag/\2">#\2</a>', str(tweet))
    return re.sub(r'(\A|\s)@(\w+)', r'\1<a href="https://twitter.com/\2">@\2</a>', str(tweet))
Once you call this method you can pass in the param my_tweet[x].text. Hope this is helpful.
Shorter, /@([\w]+)/ works fine.
This regex seems to solve Twitter usernames:
^@[A-Za-z0-9_]{1,15}$
Max 15 characters, allows underscores directly after the @ (which Twitter does), and allows all underscores (which, after a quick search, I found that Twitter apparently also does). Excludes email addresses.
I have used the existing answers and modified them for my use case (the username must be longer than 4 characters).
^[A-Za-z0-9_]{5,15}$
Rules:
Your username must be longer than 4 characters.
Your username must be shorter than 15 characters.
Your username can only contain letters, numbers and '_'.
Source: https://help.twitter.com/en/managing-your-account/twitter-username-rules
In case you need to match all of the handle, @handle and twitter.com/handle formats, this is a variation:
import re
match = re.search(r'^(?:.*twitter\.com/|@?)(\w{1,15})(?:$|/.*$)', text)
handle = match.group(1)
Explanation, examples and working regex here:
https://regex101.com/r/7KbhqA/3
Matched
myhandle
@myhandle
@my_handle_2
twitter.com/myhandle
https://twitter.com/myhandle
https://twitter.com/myhandle/randomstuff
Not matched
mysuperhandleistoolong
@mysuperhandleistoolong
https://twitter.com/mysuperhandleistoolong
You can use the following regex: ^@[A-Za-z0-9_]{1,15}$
In python:
import re
pattern = re.compile('^@[A-Za-z0-9_]{1,15}$')
pattern.match('@Your_handle')
This will check if the string exactly matches the regex.
In a 'practical' setting, you could use it as follows:
pattern = re.compile('^@[A-Za-z0-9_]{1,15}$')
if pattern.match('@Your_handle'):
    print('Match')
else:
    print('No Match')

Is there standard format for representing date/time in URIs?

I am building an API endpoint that accepts DateTime as a parameter.
It is recommended not to use the : character as part of the URI, so I can't simply use the ISO 8601 format.
So far I have considered two formats:
A) Exclamation mark as minute delimiter:
http://api.example.com/resource/2013-08-29T12!15
Looks unnatural and even with clear documentation, API consumers are bound to make mistakes.
B) URI segment per DateTime part:
http://api.example.com/resource/2013/08/29/12/15
Looks unreadable. Also, once I add further numeric parameters - it will become incomprehensible!
Is there a standard/convention for representing date/time in URIs?
I'd use the data interchange standard format.
Check this: http://en.wikipedia.org/wiki/ISO_8601
You can use : in URI paths.
The colon is a reserved character, but it has no delimiting role in the path segment. So the following should apply:
If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.
There is only one exception for relative-path references:
A path segment that contains a colon character (e.g., "this:that") cannot be used as the first segment of a relative-path reference, as it would be mistaken for a scheme name. Such a segment must be preceded by a dot-segment (e.g., "./this:that") to make a relative-path reference.
But note that some encoding libraries might percent-encode the colon anyway.
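Percent-encoding is the other standards-compliant option: keep plain ISO 8601 and encode the colons when building the URI. A sketch using Python's standard library:

```python
from urllib.parse import quote, unquote

timestamp = "2013-08-29T12:15:00Z"  # ISO 8601, as in the question

# quote() leaves ':' alone by default (safe="/"); pass safe="" to encode it.
encoded = quote(timestamp, safe="")
print(encoded)           # 2013-08-29T12%3A15%3A00Z
print(unquote(encoded))  # 2013-08-29T12:15:00Z
```

Clients then get a readable, unambiguous timestamp, and servers that decode path segments (as virtually all frameworks do) see the original ISO 8601 value.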

How Do I Convert a Byte Stream to a Text String?

I'm working on a licensing system for my application. I'd like to put all licensing information (licensee name, expiration date, and enabled features) into an object, encrypt that object with a private key, then represent the encrypted data as a single text string which I can send via email to my customers.
I've managed to get the encrypted data into a byte stream, but I don't know how to convert that byte stream into a text value -- something that contains no control characters or whitespace. Can anyone offer advice on how to do that? I've been researching the Encoding class, but I can't find a text-only encoding.
I'm using Net 2.0 -- mostly VB, but I can do C# also.
Use Base64 encoding (Convert.ToBase64String in .NET) to convert it to a text string that can later be decoded with Convert.FromBase64String. It is great for representing arbitrary binary data in a text-friendly manner, using only upper- and lower-case letters, the digits 0-9, plus '+', '/' and '=' padding.
BinHex is an example of one way to do that. It may not be exactly what you want -- for example, you might want to encode your data such that it's impossible to inadvertently spell words in your string, and you may or may not care about maximizing the density of information. But it's an example that may help you come up with your own encoding.
I've found Base32 useful for license keys before. There are some C# implementations linked from this answer. My own license code is based on this implementation, which avoids ambiguous characters to make it easier to retype the keys.
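The question is about .NET, but the idea is the same in any language; here is a Python sketch for illustration (Convert.ToBase64String and a Base32 library play these roles in .NET). The sample bytes stand in for the encrypted licence data.

```python
import base64

data = bytes(range(8))  # stand-in for the encrypted licence bytes

b64 = base64.b64encode(data).decode("ascii")
b32 = base64.b32encode(data).decode("ascii")
print(b64)  # AAECAwQFBgc=
print(b32)  # Base32 is longer but avoids mixed case and ambiguous characters

assert base64.b64decode(b64) == data
```

Base32 output is about 20% longer than Base64, but its single-case alphabet is easier for customers to retype from an email, which is why it is popular for licence keys.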

Objective-C How to get unicode character

I want to get the Unicode code point for a given Unicode character in Objective-C. The NSString documentation says it uses UTF-16 encoding internally, and states:
The NSString class has two primitive methods—length and characterAtIndex:—that provide the basis for all other methods in its interface. The length method returns the total number of Unicode characters in the string. characterAtIndex: gives access to each character in the string by index, with index values starting at 0.
That seems to assume the characterAtIndex: method is Unicode aware. However, it returns unichar, which is a 16-bit unsigned integer type.
- (unichar)characterAtIndex:(NSUInteger)index
The questions are:
Q1: How does it represent Unicode code points above U+FFFF?
Q2: If Q1 makes sense, is there a method to get the Unicode code point for a given Unicode character in Objective-C?
Thx.
The short answer to Q1 (how code points above U+FFFF are represented) is: you need to be UTF-16 aware and correctly handle Surrogate Code Points. The info and links below should give you pointers and example code that allow you to do this.
The NSString documentation is correct. However, while NSString is often described as using UTF-16 internally, it's more accurate to say that the public / abstract interface for NSString is UTF-16 based. The difference is that this leaves the internal representation of a string a private implementation detail, but the public methods such as characterAtIndex: and length are always in terms of UTF-16.
The reason for this is that it strikes a good balance between older ASCII-centric strings and Unicode awareness, largely because Unicode is a strict superset of ASCII (ASCII uses 7 bits, for 128 characters, which are mapped to the first 128 Unicode Code Points).
To represent Unicode Code Points that are > U+FFFF, which obviously exceeds what can be represented in a single UTF16 Code Unit, UTF16 uses special Surrogate Code Points to form a Surrogate Pair, which when combined together form a Unicode Code Point > U+FFFF. You can find details about this at:
Unicode UTF FAQ - What are surrogates?
Unicode UTF FAQ - What’s the algorithm to convert from UTF-16 to character codes?
Although the official Unicode UTF FAQ - How do I write a UTF converter? now recommends the use of International Components for Unicode, it used to recommend some code officially sanctioned and maintained by Unicode. Although no longer directly available from Unicode.org, you can still find copies of the "no longer official" example code in various open-source projects: ConvertUTF.c and ConvertUTF.h. If you need to roll your own, I'd strongly recommend examining this code first, as it is well tested.
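The surrogate-pair arithmetic described in those FAQs is short enough to sketch directly (shown here in Python for illustration; the same computation applies to the unichar values returned by characterAtIndex:):

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 high/low surrogate pair."""
    assert code_point > 0xFFFF
    v = code_point - 0x10000           # 20 bits remain after the offset
    high = 0xD800 + (v >> 10)          # top 10 bits -> high surrogate
    low = 0xDC00 + (v & 0x3FF)         # bottom 10 bits -> low surrogate
    return high, low

# U+1F600 (grinning face emoji) becomes the pair D83D DE00.
print([hex(u) for u in to_surrogate_pair(0x1F600)])  # ['0xd83d', '0xde00']
```

Decoding reverses this: given a high surrogate H and low surrogate L, the code point is (H - 0xD800) * 0x400 + (L - 0xDC00) + 0x10000.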
From the documentation of length:
The number returned includes the individual characters of composed character sequences, so you cannot use this method to determine if a string will be visible when printed or how long it will appear.
From this, I would infer that any characters above U+FFFF would be counted as two characters and would be encoded as a Surrogate Pair (see the relevant entry at http://unicode.org/glossary/).
If you have a UTF-32 encoded string with the character you wish to convert, you could create a new NSString with initWithBytesNoCopy:length:encoding:freeWhenDone: and use the result of that to determine how the character is encoded in UTF-16, but if you're going to be doing much heavy Unicode processing, your best bet is probably to get familiar with ICU (http://site.icu-project.org/).