How can I skip includes using libclang? - objective-c

I am using libclang to parse an Objective-C source file. The following code finds all Objective-C instance method declarations, but it also finds declarations in the includes:
enum CXCursorKind curKind = clang_getCursorKind(cursor);
CXString curKindName = clang_getCursorKindSpelling(curKind);
const char *funcDecl="ObjCInstanceMethodDecl";
How can I skip everything that comes from header includes? I am only interested in my own Objective-C instance method declarations in the source file, not in any of the includes.
For example, the following should not be included:
Location: /System/Library/Frameworks/Foundation.framework/Headers/NSObject.h:15:9:315
TypeKind: Invalid
CursorKind: ObjCInstanceMethodDecl

Answering this question because I couldn't believe that hard-coding path comparisons was the only solution. Indeed, there is a clang_Location_isFromMainFile function that does exactly what you want, so you can filter unwanted results in the visitor, like this:
if (clang_Location_isFromMainFile (clang_getCursorLocation (cursor)) == 0) {
    return CXChildVisit_Continue;
}

The only way I know would be to skip unwanted paths during the AST visit. You can for example put something like the following in your visitor function. Returning CXChildVisit_Continue avoids visiting the entire file.
CXFile file;
unsigned int line, column, offset;
CXString fileName;
char * canonicalPath = NULL;
clang_getExpansionLocation (clang_getCursorLocation (cursor),
                            &file, &line, &column, &offset);
fileName = clang_getFileName (file);
if (clang_getCString (fileName)) {
    /* realpath() allocates the result when its second argument is NULL */
    canonicalPath = realpath (clang_getCString (fileName), NULL);
}
clang_disposeString (fileName);
/* canonicalPath stays NULL for cursors without a file; skip those too */
if (canonicalPath == NULL ||
    strcmp(canonicalPath, "/canonical/path/to/your/source/file") != 0) {
    free (canonicalPath);
    return CXChildVisit_Continue;
}
free (canonicalPath);
Also, why compare CursorKindSpelling instead of the CursorKind directly?


Building a linked list in yacc with left recursive Grammar

I want to build a linked list of data in yacc.
My Grammar reads like this:
list: item
| list ',' item
I have put the appropriate structures in place in the declarations section. But I am not able to figure out a way to get a linked list out of this data. I have to store the recursively obtained data and then redirect it for other purposes.
Basically I am looking for a solution like this one:
But this solution is for right recursion and doesn't work with left.
It depends heavily on how you implement your linked list, but once you have that, it is straightforward. Something like:
struct list_node {
    struct list_node *next;
    value_t value;
};
struct list {
    /* tail points at the pointer that the next node must be stored in */
    struct list_node *head, **tail;
};
struct list *new_list() {
    struct list *rv = malloc(sizeof(struct list));
    rv->head = 0;
    rv->tail = &rv->head;
    return rv; }
void push_back(struct list *list, value_t value) {
    struct list_node *node = malloc(sizeof(struct list_node));
    node->next = 0;
    node->value = value;
    *list->tail = node;
    list->tail = &node->next; }
This allows you to write your yacc code as:
list: item { push_back($$ = new_list(), $1); }
| list ',' item { push_back($$ = $1, $3); }
Of course, you should probably add checks for running out of memory, and exit gracefully in that case.
If you use a left recursive rule, then you need to push the new item at the end of the list rather than the beginning.
If your linked list implementation doesn't support push_back, then push the successive items at the front and reverse the list when it's finished.
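A minimal sketch of the push-at-front-then-reverse approach, written in C++ for brevity. The Node type, the int payload, and the function names are hypothetical stand-ins for whatever your grammar uses; nothing here is part of yacc itself.

```cpp
#include <cassert>

// Hypothetical node type; int stands in for the grammar's value type.
struct Node {
    int value;
    Node* next;
};

// Prepend in O(1). Called once per item, this builds the list
// in reverse input order.
Node* push_front(Node* head, int value) {
    return new Node{value, head};
}

// Reverse the list in place and return the new head.
Node* reverse(Node* head) {
    Node* prev = nullptr;
    while (head != nullptr) {
        Node* next = head->next;  // remember the rest of the list
        head->next = prev;        // point this node backwards
        prev = head;
        head = next;
    }
    return prev;
}

// Release every node.
void free_list(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}

// Self-check: items arrive as 1, 2, 3 and come out in that order.
void reverse_demo() {
    Node* list = nullptr;
    for (int v = 1; v <= 3; ++v) {
        list = push_front(list, v);
    }
    list = reverse(list);
    assert(list->value == 1);
    assert(list->next->value == 2);
    assert(list->next->next->value == 3);
    free_list(list);
}
```

In a grammar, each list action would call push_front, and whichever rule consumes the completed list would call reverse once.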
Very simple.
list: item          { $$ = new MyList<SomeType>(); $$->add($1); }
    | list ',' item { $$ = $1; $$->add($3); }
    ;
assuming you are using C++, which you didn't state, and assuming you have some MyList<T> class with an add(T) method.

Is it valid to rebind a variable in a while loop?

Is it valid to rebind a mutable variable in a while loop? I am having trouble getting the following trivial parser code to work. My intention is to replace the newslice binding with a progressively shorter slice as I copy characters out of the front of the array.
/// Test if a char is an ASCII digit
fn is_digit(c:u8) -> bool {
    match c {
        30|31|32|33|34|35|36|37|38|39 => true,
        _ => false
    }
}
/// Parse an integer from the front of an ascii string,
/// and return it along with the remainder of the string
fn parse_int(s:&[u8]) -> (u32, &[u8]) {
    use std::str;
    let mut newslice = s; // bytecopy of the fat pointer?
    let mut n:Vec<u8> = vec![];
    // Pull the leading digits into a separate array
    while newslice.len()>0 && is_digit(newslice[0])
    {
        newslice = newslice.slice(1,newslice.len()-1);
        //newslice = newslice[1..];
    }
    match from_str::<u32>(str::from_utf8(newslice).unwrap()) {
        Some(i) => (i,newslice),
        None => panic!("Could not convert string to int. Corrupted pgm file?"),
    }
}
fn main(){
    let s:&[u8] = b"12345";
    let (i,newslice) = parse_int(s);
    println!("length of returned slice: {}",newslice.len());
}
parse_int is failing to return a slice that is smaller than the one I passed in:
length of returned slice: 5
task '<main>' panicked at 'assertion failed: newslice.len() == 0', <anon>:37
playpen: application terminated with error code 101
As Chris Morgan mentioned, your call to slice passes the wrong value for the end parameter. newslice.slice_from(1) yields the correct slice.
is_digit tests for the wrong byte values. You meant to write 0x30, etc. instead of 30.
You call str::from_utf8 on the wrong value. You meant to call it on n.as_slice() rather than newslice.
Rebinding variables like that is perfectly fine. The general rule is simple: if the compiler doesn’t complain, it’s OK.
It’s a very simple error that you’ve made: your slice end point is incorrect.
slice produces the interval [start, end)—a half-open range, not closed. Therefore when you wish to just remove the first character, you should be writing newslice.slice(1, newslice.len()), not newslice.slice(1, newslice.len() - 1). You could also write newslice.slice_from(1).

How to refer to the iterator within the find_if predicate?

I want to make sure a string has at least one alpha. Simple:
if ( find_if(field_name.begin(), field_name.end(), isalpha) == field_name.end() )
But I want to use a locale. I know I can easily write a separate function but I'd prefer to use it within the find_if. I.e.,
#include <locale>
std::locale loc;
if ( find_if(field_name.begin(), field_name.end(), isalpha(*this_iterator,loc)) == field_name.end() )
Question: Is there some way to make this_iterator refer to the then-current iterator?
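In C++11 you can do this with a lambda as Timo suggests, or with std::bind(), as in std::bind(std::isalpha<char>, std::placeholders::_1, loc). The locale-aware std::isalpha from <locale> is a function template, so it has to be named with its template argument.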
Pre-C++11, you can use std::bind2nd() instead. This gets a bit complicated though, as it requires a unary_function or binary_function as an argument, instead of any old function object. We can create one using std::ptr_fun(), although for some reason we need to explicitly tell it what the template parameters are. And we need to use std::isalpha() instead of isalpha() in order to get the locale-enabled version. So the full expression looks like
std::bind2nd(std::ptr_fun<char, const std::locale&, bool>(std::isalpha), loc)
Needless to say, the C++11 version is vastly simpler.
BTW if you are using C++11 then you can use std::any_of(...) instead of std::find_if(...) == foo.end(). It should behave the same, but be slightly more readable.
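As a rough illustration of the std::any_of variant, here is a small C++11 sketch. The helper name has_alpha and its default argument are assumptions made for this example, not something from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>

// Hypothetical helper: true if s contains at least one character
// that is alphabetic under loc.
bool has_alpha(const std::string& s, const std::locale& loc = std::locale()) {
    return std::any_of(s.begin(), s.end(),
                       [&loc](char c) { return std::isalpha(c, loc); });
}

// Self-check using the classic "C" locale so the result does not
// depend on the environment.
void has_alpha_demo() {
    const std::locale& c = std::locale::classic();
    assert(has_alpha("field_1", c));
    assert(!has_alpha("12345", c));
    assert(!has_alpha("", c));
}
```

The original check then becomes if (!has_alpha(field_name, loc)), which avoids the comparison against end().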
In C++11, you can use a lambda:
if (std::find_if(field_name.begin(), field_name.end(),
                 [&loc](char c)
                 {
                     return isalpha(c, loc);
                 }) == field_name.end())
In pre-C++11 you probably have to use something like boost::bind or boost::lambda to achieve the same functionality.
In pre-C++11, you can wrap isalpha() in an object that overloads the () operator, then use it as a predicate, if you don't want to use std::bind...() or boost, e.g.:
#include <locale>
struct isalphaloc
{
    const std::locale &_loc;
    isalphaloc(const std::locale &loc) : _loc(loc) {}
    bool operator()(const char c) const
    {
        return std::isalpha(c, _loc);
    }
};
std::locale loc;
if ( find_if(field_name.begin(), field_name.end(), isalphaloc(loc)) == field_name.end() )

Structure of a block declaration

When declaring a block what's the rationale behind using this syntax (i.e. surrounding brackets and caret on the left)?
For example:
int (^myBlock)(int) = ^(int num) {
    return num * multiplier;
};
C BLOCKS: Syntax and Usage
Variables pointing to blocks take on the exact same syntax as variables pointing to functions, except that ^ is substituted for *. For example, this is a function pointer to a function taking an int and returning a float:
float (*myfuncptr)(int);
and this is a block pointer to a block taking an int and returning a float:
float (^myblockptr)(int);
As with function pointers, you'll likely want to typedef those types, as it can get relatively hairy otherwise. For example, a pointer to a block returning a block taking a block would be something like void (^(^myblockptr)(void (^)()))();, which is nigh impossible to read. A simple typedef later, and it's much simpler:
typedef void (^Block)();
Block (^myblockptr)(Block);
Declaring blocks themselves is where we get into the unknown, as it doesn't really look like C, although they resemble function declarations. Let's start with the basics:
myvar1 = ^ returntype (type arg1, type arg2, and so on) {
    block contents;
    like in a function;
    return returnvalue;
};
This defines a block literal (from after = to and including }), explicitly mentions its return type, an argument list, the block body, a return statement, and assigns this literal to the variable myvar1.
A literal is a value that can be built at compile-time. An integer literal (The 3 in int a = 3;) and a string literal (The "foobar" in const char *b = "foobar";) are other examples of literals. The fact that a block declaration is a literal is important later when we get into memory management.
Finding a return statement in a block like this is vexing to some. Does it return from the enclosing function, you may ask? No, it returns a value that can be used by the caller of the block. See 'Calling blocks'. Note: If the block has multiple return statements, they must return the same type.
Finally, some parts of a block declaration are optional. These are:
The argument list. If the block takes no arguments, the argument list can be skipped entirely.
myblock1 = ^ int (void) { return 3; }; // may be written as:
myblock2 = ^ int { return 3; };
The return type. If the block has no return statement, void is assumed. If the block has a return statement, the return type is inferred from it. This means you can almost always just skip the return type from the declaration, except in cases where it might be ambiguous.
myblock3 = ^ void { printf("Hello.\n"); }; // may be written as:
myblock4 = ^ { printf("Hello.\n"); };
// Both succeed ONLY if myblock5 and myblock6 are of type int(^)(void)
myblock5 = ^ int { return 3; }; // can be written as:
myblock6 = ^ { return 3; };
I think the rationale is that it looks like a function pointer:
void (*foo)(int);
Which should be familiar to any C programmer.

libxml : xpath not found?

I'm completely new to XML, libxml and XPath, so I wanted to parse a very simple XML expression:
<active type="integer">1</active>
I want to retrieve the value 1. I wrote the following code that uses libxml (this code contains Objective-C code):
const unsigned char* xPathExp = (const unsigned char*) "count/active/text()";
xmlXPathContextPtr xpathCtx = xmlXPathNewContext(doc);
xmlXPathObjectPtr xpathObj = xmlXPathEvalExpression( xPathExp, xpathCtx);
xmlNodeSetPtr nodeSetPtr = xpathObj->nodesetval;
if( nodeSetPtr != 0 ) {
    xmlNode* node = nodeSetPtr->nodeMax > 0 ? nodeSetPtr->nodeTab[0] : 0;
    if( node != 0 ) {
        NSLog(@"Active Projects = %@", [self stringFromCString:node->name] );
    } else {
        NSLog(@"Node for xpath Exp: count/active/text() does not exist.");
    }
} else
    NSLog(@"nodeSetPtr is null");
It turns out that the output of this code is
nodeSetPtr is null
So that confuses me. I ran an online XPath evaluator on the XML above, and the XPath expression "count/active/text()" returns 1.
What am I doing wrong here?
It may be as simple as your document setup and what the root element is. First, try //count/active/text(). If that works, then try /count/active/text(). In general, try not to use // unless you really want to match anywhere (it is great for debugging this particular problem, though). /count should find the count element against the root.