Best Practices in Building and Processing Macros - objective-c

I'm building an iOS application and would like to let users use macros for different aspects of the system.
For example, I might have a simple macro like this:
{include name="some name" pre="some bit of html" post="some other bit of html"}
That would include the contents of the item named "some name" in the body of the document the user is working on.
Or I might have something more complex like this:
{notesForTag name={ListAllTags pre="some bit of html" post="some other bit of html"} pre="..." post="..."}
This would list all the documents in the system grouped by tag. The ability to add data (such as HTML) at the beginning and end of each tag returned would let the user, for example, format the response as a table or apply particular styling.
Conceptually, I know how I want this to work, but I'm wondering whether there are any best practices for constructing and processing macros that would help me on my way. Anything geared towards Objective-C / iOS would be most helpful.
Edit: To add some clarity here, what I'm looking to discover is an efficient and accurate way to parse something like this. Having parsed things, I think the rest will be fairly straightforward.
Thank you.

NSScanner would probably work well for parsing something like this, perhaps with a recursive function to allow nested macros like the second one. You may also want to consider using XML to represent your macros, which would allow you to use NSXMLParser to parse it, so you only have to worry about the content and not the structure.
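As a rough, language-neutral illustration of the recursive approach (written in Python for brevity rather than Objective-C with NSScanner), a small recursive-descent parser could handle both example macros. The `Macro` class, the `MacroParser` name, and the rule that a value is either a double-quoted string without escape sequences or a nested macro are assumptions made for this sketch, not something stated in the question.

```python
from dataclasses import dataclass, field


@dataclass
class Macro:
    name: str
    attrs: dict = field(default_factory=dict)  # values are str or nested Macro


class MacroParser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def parse(self):
        macro = self._macro()
        self._skip_ws()
        if self.pos != len(self.text):
            raise ValueError(f"unexpected text at offset {self.pos}")
        return macro

    def _skip_ws(self):
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1

    def _expect(self, ch):
        if self.pos >= len(self.text) or self.text[self.pos] != ch:
            raise ValueError(f"expected {ch!r} at offset {self.pos}")
        self.pos += 1

    def _identifier(self):
        start = self.pos
        while self.pos < len(self.text) and (
            self.text[self.pos].isalnum() or self.text[self.pos] == "_"
        ):
            self.pos += 1
        if start == self.pos:
            raise ValueError(f"expected a name at offset {start}")
        return self.text[start:self.pos]

    def _macro(self):
        # {name key=value key=value ...}
        self._expect("{")
        self._skip_ws()
        macro = Macro(self._identifier())
        while True:
            self._skip_ws()
            if self.pos < len(self.text) and self.text[self.pos] == "}":
                self.pos += 1
                return macro
            key = self._identifier()
            self._expect("=")
            macro.attrs[key] = self._value()

    def _value(self):
        # A value is either a nested macro (recursion) or a quoted string.
        if self.pos < len(self.text) and self.text[self.pos] == "{":
            return self._macro()
        self._expect('"')
        end = self.text.find('"', self.pos)
        if end == -1:
            raise ValueError("unterminated string")
        value = self.text[self.pos:end]
        self.pos = end + 1
        return value
```

Parsing the second example from the question with `MacroParser(text).parse()` would give a `Macro` named `notesForTag` whose `name` attribute is itself a `Macro` named `ListAllTags`; evaluating the tree is then a separate, simpler step.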

Related

How to style parts of i18n messages when using thymeleaf

I'm not sure this is the right place to ask this. I would like to know how best to style parts of messages from l10n properties files. For example, my client wants this message and formatting in a help window:
This is a self-assessment and comparison application.
The simplest solution would be to include the HTML tags in the messages.properties entry for this label. The problem is that the 40 translators who will process messages.properties are bound to make mistakes such as deleting a <, translating the attributes or styles of the HTML markup, etc. It also makes maintaining the markup and styling difficult for the devs.
Any better way to do this?
The solution I've typically seen just uses th:utext with HTML tags in the .properties files. I would say it does create a maintenance hassle, as you mention, and should be kept to a minimum.
One workaround is to create separate strings in some cases, like:
<span th:text="#{thisIsA}">This is a </span><strong><span th:text="#{selfAssessment}">self-assessment</span></strong>
However, this is error-prone since certain languages may change the order of the words. So that's not a great option.
If the HTML tags specifically are an issue, another way, albeit somewhat ugly, could be:
thisIsASelfAssessment=This is a {0}self-assessment{1}.
Or even
thisIsA=This is a {0}.
selfAssessment=self-assessment
But that might be confusing for the next developer reading it and may introduce the same issue you have with the 40 translators looking at it since you have curly braces. It also all becomes very tedious and generates more lines.
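For reference, the {0}/{1} placeholder variant could be rendered roughly as follows. This is a Python sketch of the idea only, not Thymeleaf or Java code; `render_message` is a made-up helper name, and the `<strong>` tags are assumed to be supplied by the developer rather than the translator.

```python
from html import escape


def render_message(template, open_tag, close_tag):
    # Escape the translated text first, so a stray "<" or "&" typed by a
    # translator cannot turn into markup. Braces are left untouched.
    safe = escape(template, quote=False)
    # Only now substitute the developer-controlled tags for {0} and {1}.
    return safe.format(open_tag, close_tag)


message = render_message("This is a {0}self-assessment{1}.", "<strong>", "</strong>")
```

The markup stays in the code or template, and translators only need to keep the two placeholders in the right place around the translated phrase.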
So in the end, you're likely best going with the simplest solution of utext.
Project-wise, you could have the initial translation done without the markup and add the markup in after they are done with a first pass at translating it. The issue may arise in the future when you need to change strings, but doing this would minimize some headache. It could make sense to keep these strings in a separate block in the .properties file so you can target them later.
Good question as I've had this issue myself.

General strategy for designing Flexible Language application using ANTLR4

Requirement:
I am trying to develop a language application using antlr4. The language in question is not important. The important thing is that the grammar is very large (easily >2000 rules!). I want to do a number of operations:
Extract a bunch of information. This can be call graphs, variable names, constant expressions, etc.
Any number of transformations:
if a loop can be expanded, we go ahead and expand it
If we can eliminate dead code we might choose to do that
we might choose to rename all variable names to conform to some norms.
Each of these operations can be applied independently of the others. After applying these steps, I want to rewrite the input as close as possible to the original input.
e.g. So we might want to eliminate loops and rename the variable and then output the result in the original language format.
Questions:
I see a need to build a custom Tree (read AST) for this. So that I can modify the tree with each of the transformations. However when I want to generate the output, I lose the nice abilities of the TokenStreamRewriter. I have to specify how to write each of the nodes of the tree and I lose the original input formatting for the places I didn't do any transformations. Does antlr4 provide a good way to get around this problem?
Is an AST the best way to go? Or do I build my own object representation? If so, how do I create that object efficiently? Creating an object representation is a very big pain for such a vast language, but it may be better in the long run. Again, how do I get back the original formatting?
Is it possible to work just on the parse tree?
Are there similar language applications which do the same thing? If so what strategy do they use?
Any input is welcome.
Thanks in advance.
In general, what you want is called a Program Transformation System (PTS).
PTSs generally have parsers, build ASTs, can prettyprint the ASTs to recover compilable source text. More importantly, they have standard ways to navigate/inspect/modify the ASTs so that you can change them programmatically.
Many offer these capabilities in the form of pattern-matching code fragments written in the surface syntax of the language being transformed; this avoids forever having to know excruciatingly fine details about which nodes are in your AST and how they are related to their children. This is incredibly useful when you have big complex grammars, as most of our modern (and legacy) languages seem to have.
More sophisticated PTSs (very few) provide additional facilities for teasing out the semantics of the source code. It is pretty hard to analyze/transform most code without knowing what scopes individual symbols belong to, or their type, and many other details such as data flow. Full disclosure: I build one of these.
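To make the formatting concern from the question concrete: one general way to keep untouched regions byte-for-byte identical is to record edits against spans of the original text and splice them in at output time. The sketch below is Python and only illustrates that idea; it is not ANTLR's TokenStreamRewriter API, and `SpanRewriter` and `rename_identifier` are hypothetical names. The regex stands in for spans that a real tool would take from parse tree or AST nodes.

```python
import re


class SpanRewriter:
    """Collects replacements on character spans of the original source."""

    def __init__(self, source):
        self.source = source
        self.edits = []  # list of (start, end, replacement)

    def replace(self, start, end, text):
        self.edits.append((start, end, text))

    def get_text(self):
        out, cursor = [], 0
        for start, end, text in sorted(self.edits):
            if start < cursor:
                raise ValueError("overlapping edits")
            out.append(self.source[cursor:start])  # untouched text, verbatim
            out.append(text)
            cursor = end
        out.append(self.source[cursor:])
        return "".join(out)


def rename_identifier(source, old, new):
    # Hypothetical stand-in: spans would normally come from tree nodes,
    # which also avoids renaming text inside strings or comments.
    rewriter = SpanRewriter(source)
    for match in re.finditer(r"\b" + re.escape(old) + r"\b", source):
        rewriter.replace(match.start(), match.end(), new)
    return rewriter.get_text()
```

Whitespace and comments outside the edited spans survive unchanged, which is the property the question is worried about losing when printing from a custom tree.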

Efficient way to translate an application

So I have developed an application in VB.NET, but recently I came across the requirement of supporting multiple languages. I don't know if there is any 'common' way of doing this kind of thing, but my approach is the following:
First, I'll search the code for components, error messages and everything else displayed in the application's GUI that needs to be translated.
Second, I'll create a class that stores an in-memory dictionary of everything that will be translated.
After that, I'll replace each piece of text to be translated with an entry of the dictionary.
Then, when the application starts, I'll load the dictionary.
Later on, I'll replace the static dictionary and load it into memory from the database.
So for example, my dictionary class:
Dim dictionary As New Dictionary(Of String, String)
dictionary.Add("00011", "hello there!")
Somewhere in my code i'll replace:
mylabel.text = "hello there!"
With:
mylabel.text = dictionary.item("00011")
Later on, instead of having a static dictionary, I will build that dictionary from a database table like this (and load it at the start of the application):
_______________________________________
word_code ### word_EN ### word_FR
_______________________________________
00011 ### hello there ### bonjour il
I will load the dictionary considering which language is selected.
I'm not very comfortable with this approach and I have no idea if this is the right thing to do, but if so I have a couple of questions:
Is a dictionary the best data structure for this?
Will this be memory-heavy, considering I'll have 1,000 entries, 1M entries or 10M entries?
Is there a more logical and faster way of accomplishing the same thing?
Thank you so much in advance,
J
It's a common way of doing it: having a system name alongside a language code that is used to look up a translated value. However, generally speaking, I'd only advise you to do this for something like system texts and smaller text segments.
The reason is that in, for example, CMS/ecommerce systems, pages with lots of text will likely need a data model that supports translation to begin with, and then you already have the language division.
So in that situation, you're better off making a page structure with a translated data model, where the detail is specific to each language of your website.
For example, you'll have a product -> product_detail where detail keeps the translated values for said product. Similar for article -> article_detail and so on.
But for general translations and system texts that need to be displayed, it's a common way to do it.
And as you suggest yourself, a structure like a dictionary is a good fit: it makes lookups fast and can be cached in the system, so you do not need to retrieve the translations all the time.
One way to expand on it is to subdivide your translations into groups; say you have an order page and a product page. Then you can have translations assigned to "product" and to "order", with a "common" group as well.
This also makes it easier to build smaller cache objects, extract less data from your data storage, etc., so a page that only revolves around orders doesn't need to worry about product translations.
It will require memory, but unless you put entire CMS systems into the translations, it should be "minor".
I would, however, question the need for 10 million translation entries and wonder whether your system actually requires that many. If it does, consider an alternate approach, such as making multiple versions of the "page" to eliminate the need for translations.
I would also advise you not to use "00011" as a system code to begin with, and to go for a more readable version (like "hello") to ease the readability and maintainability of your code. Then, if you later want a 'system value' like "00011", it's easy to do a search/replace.
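The suggestions above (a cached dictionary, translation groups, readable keys) can be sketched as follows. This is Python rather than VB.NET, and the row layout `(group, key, language, text)`, the function names and the sample texts are all made up for illustration; in practice the rows would come from your database.

```python
def load_translations(rows, language, groups):
    """Build a lookup table for one language and only the groups a page needs."""
    wanted = set(groups) | {"common"}  # the shared group is always included
    table = {}
    for group, key, lang, text in rows:
        if lang == language and group in wanted:
            table[key] = text
    return table


def translate(table, key):
    # Fall back to the key itself, so a missing entry is visible but not fatal.
    return table.get(key, key)


# Hypothetical sample data standing in for database rows.
ROWS = [
    ("common", "hello", "EN", "hello there!"),
    ("common", "hello", "FR", "bonjour"),
    ("order", "order.total", "FR", "Total"),
    ("product", "product.price", "FR", "Prix"),
]

order_page = load_translations(ROWS, "FR", ["order"])
```

An order page built this way never loads the product texts, and a readable key such as "order.total" makes both the code and the missing-entry fallback easier to understand than "00011".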

Distinct Row Templates for NSPredicateEditor/NSRuleEditor rows

I'm trying to figure out if it's possible to use different row templates for specific rows in an NSPredicateEditor (or, if need be, an NSRuleEditor). I've got a screenshot that I think helps me explain this more clearly.
In this contrived example, I only want people to generate a filter that looks for a specific path above a certain size. So, in Section A (the Any block), users can only specify path rules (and the users can add additional paths). In section B, I only want the Size option to be available.
Nothing's jumping out at me from the docs (or, the stuff that does jump out at me ends up being something else), but it seems like this is the sort of thing that might come in handy, which makes me think it might be possible.
From what I understand about NSPredicateEditor, this is not possible. You might be able to swing it if you do everything yourself with an NSRuleEditor, but I haven't played with that class as much.
So in a nutshell: if you implement it yourself, it's possible. With the built-in stuff, I'm 99.9% certain that it's not a configurable behavior.

Code related web searches

Is there a way to search the web which does NOT remove punctuation? For example, I want to search for window.window->window (Yes, I actually do, this is a structure in mozilla plugins). I figure that this HAS to be a fairly rare string.
Unfortunately, Google, Bing, AltaVista, Yahoo, and Excite all strip the punctuation and just show anything with the word "window" in it. And according to Google, on their site, at least, there is NO WAY AROUND IT.
In general, searching for chunks of code must be hard for this reason... anyone have any hints?
Google Code Search ("window.window->window"), though it doesn't seem to return any relevant results for this query.
There are similar tools all over the internet, like Codase or Koders, but I'm not sure they let you search for exactly this string. Anyway, they might be useful to you, so I think they're worth mentioning.
edit: It is very unlikely you'll find a general purpose search engine which will allow you to search for something like "window.window->window" because most search engines will do some processing on the document before storing it. For instance they might represent it internally as vectors of words (a vector space model) and use that to do the search, not the actual original string. And creating such a vector involves first cutting the document according to punctuation and other critters. This is a very complex and interesting subject which I can't tell you much more about. My bad memory did a pretty good job since I studied it at school!
BTW, they might do the same kind of processing on your query too. You might want to read about tf-idf, which is probably light years from what Google and its peers are doing but can give you a hint about what happens to your query.
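A toy version of that processing, in Python, shows why the punctuation disappears. This is a minimal sketch of word tokenization plus a textbook tf-idf weighting, assuming the simplest possible tokenizer; it does not describe how any real search engine works, and `tokenize` and `tf_idf` are names chosen for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Keep only runs of letters and digits; ".", "->" and friends are dropped.
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


query_terms = tokenize("window.window->window")
```

After tokenization the query is just the word "window" three times, so it is indistinguishable from a search for "window window window", and any page mentioning windows becomes a candidate.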
There is no way to do that by itself in the main Google engine, as you discovered. However, if you are looking for information about Mozilla, then the best bet would be to structure your query more like this:
"window.window->window" +Mozilla
OR +XUL
+ Another search string related to what you are
trying to do.
SymbolHound is a web search that does not remove punctuation from queries. There is an option to search source code repositories (like the now-discontinued Google Code Search), but it also has the option to search the Internet for special characters (primarily programming-related sites such as StackOverflow).
Try it here: http://www.symbolhound.com
-Tom (co-founder)