Address of an element within a structure from an ELF executable

Is it possible to obtain the address of an element within a structure from an ELF executable that was not compiled for debugging?
For example, given the following code:
typedef struct {
    int tokyo;
    int paris;
    int london;
} cities;

cities places;
Both nm and readelf give the start address of the variable 'places', and readelf also gives its size:
Num: Value Size Type Bind Vis Ndx Name
1994983: d0003ae8 12 OBJECT GLOBAL DEFAULT 23 cities
However, what I need is the address of each element within the structure. So from the above, what I want is:
d0003ae8 cities.tokyo
d0003aec cities.paris
d0003af0 cities.london
My only route at present is to compile with DWARF2 debug info, use readelf -wliao to dump out the .debug_info section, and then parse the type tree from a DW_TAG_variable, adding up base-type sizes. Example readelf output:
<1><e00b>: Abbrev Number: 5 (DW_TAG_structure_type)
DW_AT_byte_size : 12
DW_AT_decl_file : 3
DW_AT_decl_line : 25
<2><e013>: Abbrev Number: 6 (DW_TAG_member)
DW_AT_name : tokyo
DW_AT_decl_file : 3
DW_AT_decl_line : 15
DW_AT_type : <df04>
<2><e02e>: Abbrev Number: 6 (DW_TAG_member)
DW_AT_name : paris
DW_AT_decl_file : 3
DW_AT_decl_line : 16
DW_AT_type : <df04>
<2><e02e>: Abbrev Number: 6 (DW_TAG_member)
DW_AT_name : london
DW_AT_decl_file : 3
DW_AT_decl_line : 16
DW_AT_type : <df04>
I need to find a way of doing this without access to the source code, and with debug info turned off.
Any help or pointers appreciated.
Thanks,
Chris

No, there is no way to do this. ELF alone does not describe types or offsets.
If you know the types of the fields of the structure and the ABI of the architecture for which the object is built, you can recreate the layout.
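To illustrate, here is a minimal sketch of that manual reconstruction in Python, using the field list and base address from the question. The size table and the natural-alignment rule are assumptions; check them against the ABI of your target (here a 32-bit target with 4-byte int):

# Sketch: recreate a struct layout from known field types plus ABI rules.
# Assumption: "natural" alignment (each scalar is aligned to its own size),
# which most C ABIs use; verify against your target's ABI document.

FIELD_SIZES = {"char": 1, "short": 2, "int": 4, "long long": 8}

def member_addresses(base_addr, fields):
    """fields: list of (name, c_type) pairs in declaration order."""
    offset = 0
    members = []
    for name, c_type in fields:
        size = FIELD_SIZES[c_type]
        offset = (offset + size - 1) & ~(size - 1)  # round up to alignment
        members.append((base_addr + offset, name))
        offset += size
    return members

# The 'places' variable from the question; base address taken from nm/readelf.
for addr, name in member_addresses(0xd0003ae8,
        [("tokyo", "int"), ("paris", "int"), ("london", "int")]):
    print("%08x cities.%s" % (addr, name))

For the three int fields this prints exactly the addresses listed in the question (d0003ae8, d0003aec, d0003af0); for mixed field types, the alignment step is what reproduces the padding the compiler would insert.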

Related

How to use multi-line input to create my game?

So, I have one method that should create all of the instance variables used by my other methods that create the game. I know how to do each part separately, but figuring out how to do it all from one method is really hard.
I need to be reading data from a String where each line must be treated separately.
I am using Pharo.
Class Game: everything is within one Game class, with instance variables 'row col', using instance methods.
readFrom: 'Board 3 4
Dice 2 1 1 1
Players 1'
board
[my actual code that creates a board]
row for loop[
Transcript show: 'creating board'.
col for loop[
Transcript show: 'creating board'.
]
]
dice
[dice code..]
players
[players code]
Your model is not clearly defined yet. However, by helping you with some code, I will try to give you some insight into how to fill the remaining gaps.
So, let's say you have a class Game. This class defines (at least) 4 instance variables: rows, columns, dice and players.
Now you want to create an instance of Game by reading some String that conforms to a certain format, as in:
'Board 3 4
Dice 2 1 1 1
Players 1'
To do this, create a class-side method in Game along the lines of:
readFrom: aString
    ^self new readFrom: aString
and then an instance method
readFrom: aString
    aString lines do: [:line | | data key |
        data := line substrings.
        key := data at: 1.
        key = 'Board'
            ifTrue: [
                rows := (data at: 2) asInteger.
                columns := (data at: 3) asInteger].
        key = 'Dice'
            ifTrue: [
                dice := data allButFirst collect: [:s | s asInteger]].
        key = 'Players'
            ifTrue: [
                players := (data at: 2) asInteger]]
Again, this won't solve all problems, but should help you get started. Otherwise, ask again.

How to create words within a Forth definition

I'm using Gforth, and I want to create a word in a definition. At the Gforth command line I can type:
create foo
ok
Or more specifically, I defined an array function that expects a size on the stack and creates a word with the address to that array:
: array ( n -- ) ( i -- addr )
  create cells allot
  does> cells + ;
So if I type 10 array foo I can then use foo later.
But if I were to write 10 array foo within another definition, it gives me a compilation error. I've tried replacing foo with s" foo", which compiles, but it blows up at run time, saying:
Attempt to use zero-length string as a name
Is there a way to do this?
One way to do it in gforth:
: bar 10 s" foo" ['] array execute-parsing ;
Other implementations do it differently, e.g. http://pfe.sourceforge.net/words/w-header-015.html
It's not easy to do in Standard Forth, but this may be good enough:
: bar 10 s" array foo" evaluate ;
I guess most of what you want to do can be done by defining words, i.e. using create ... does> ... This allows you to define a word with specialized behaviour.
E.g.:
: 2const create , , does> 2@ ;
can be used to create double constants like 2 3 2const a-double (that stashes 2 and 3 away in a-double) and then a-double pushes two values (2 3).

Why don't my bison rules work?

Every time I run my parser, it reports "syntax error in line 1 near <>" (the message comes from the subroutine yyerror(char *s)). I think that's because there is something wrong with my rules in bison.
The file (c17.isc) I want to parse:
*c17 iscas example (to test conversion program only)
*---------------------------------------------------
*
*
* total number of lines in the netlist .............. 17
* simplistically reduced equivalent fault set size = 22
* lines from primary input gates ....... 5
* lines from primary output gates ....... 2
* lines from interior gate outputs ...... 4
* lines from ** 3 ** fanout stems ... 6
*
* avg_fanin = 2.00, max_fanin = 2
* avg_fanout = 2.00, max_fanout = 2
*
*
*
*
*
1 1gat inpt 1 0 >sa1
2 2gat inpt 1 0 >sa1
3 3gat inpt 2 0 >sa0 >sa1
8 8fan from 3gat >sa1
9 9fan from 3gat >sa1
6 6gat inpt 1 0 >sa1
7 7gat inpt 1 0 >sa1
10 10gat nand 1 2 >sa1
1 8
11 11gat nand 2 2 >sa0 >sa1
9 6
14 14fan from 11gat >sa1
15 15fan from 11gat >sa1
16 16gat nand 2 2 >sa0 >sa1
2 14
20 20fan from 16gat >sa1
21 21fan from 16gat >sa1
19 19gat nand 1 2 >sa1
15 7
22 22gat nand 0 2 >sa0 >sa1
10 20
23 23gat nand 0 2 >sa0 >sa1
21 19
My flex file is as follows, and I believe it is right. You can find some information about how my scanner works in my earlier question:
Error in the output of my flex file
declare.h
# include <stdio.h>
# include <string.h>
# include <stdlib.h>
# define INPT 1
# define NOR 2
# define NAND 3
# define NOT 4
# define XOR 5
# define AND 6
# define BUFF 7
# define FROM 8
The flex file is:
%{
# include "declare.h"
# include "parse.tab.h"
/*gi=1,it's input;gi=8,it's fanout;otherwise,it's gate*/
static int gi=-1;
static int inum=0;
struct{
char *symbol;
int val;
} symtab[]={
{"inpt", INPT},
{"nor", NOR},
{"nand", NAND},
{"not", NOT},
{"xor", XOR},
{"and", AND},
{"buff", BUFF},
{"from",FROM},
{"0",0}
};
extern FILE *yyin;
extern int yylval;
int lookup(const char *s);
%}
%start A B C D E
DIGITS [0-9]+
BLANK [ \t\n\r\f\v\b]+
ALPHA [a-z]+
%%
"*".*\n {BEGIN A; return(COMMENT);}
<A>{DIGITS} {yylval=atoi(yytext); BEGIN B; return(NUM);}
<B>{DIGITS}{ALPHA} {yylval=atoi(yytext); BEGIN C; return(GNAME);}
<C>{DIGITS} {yylval=atoi(yytext); BEGIN D; return(OPNUM);}
<C>{DIGITS}{ALPHA} {yylval=atoi(yytext); BEGIN A; return(FR);}
<D>{DIGITS} {inum=atoi(yytext);
yylval=inum;
if(gi==1)
{BEGIN A;}
if(gi!=1)
{BEGIN E;}
return(IPNUM);
}
<E>{DIGITS} {inum--;
yylval=atoi(yytext);
if(inum<0)
{BEGIN B; return(NUM);}
else
{BEGIN E; return(ILIST);}
}
{ALPHA} {yylval=lookup(yytext);
return(GTYPE);
}
">sa"[0-1] {yylval=atoi(&yytext[yyleng-1]);return(FAULT);}
{BLANK} ;
. ;
%%
int lookup(const char *s)
{
int i;
for (i = 0; symtab[i].val != 0; i++)
{
if (strcmp(symtab[i].symbol, s) == 0)
break;
}
return(symtab[i].val);
}
The relevant rules in my bison file are as follows:
parto:
| parto COMMENT
| parto parti
;
parti: NUM
{...}
GNAME
{...}
GTYPE
{...}
| parti partii
| parti partiii
;
partii:OPNUM
{...}
IPNUM
{...}
partiv
partv
;
partiii: FR
{...}
partiv
;
partiv:
| partiv FAULT
{...}
;
partv:
| partv ILIST
{...}
;
Transferring the key comments into an answer.
The first version of the code had a couple of problems. In the scanner code, there were lines like this:
<A>{DIGITS} { yylval=atoi(yytext); return(NUM); BEGIN B; }
You should be getting warnings about unreachable code from the BEGIN operations appearing after return. The BEGIN operations have to be executed. They aren't being executed, so you're not switching into your start states.
Michael commented:
There is no warning. I've modified it as you say and edited my code in the question. Now I put return after BEGIN. Still, "syntax error in line 1 near <>".
This probably means you aren't compiling the C code with enough warnings. Assuming you're using GCC, add -Wall to the compilation options for starters. There's a chance the warning requires optimization too.
Have you printed the tokens as they're returned (in the Flex scanner)? Have you compiled the Bison grammar with -DYYDEBUG? You also need to turn the debug on: yydebug = 1; in the main() program. You're probably not getting the tokens you expect when you expect them. I've not tried compiling this code yet. Tracking the tokens is key (in my experience) to getting grammars to work. Otherwise, you're running blind.
The other problem (closely related) is that you need to generate the symbolic names for FAULT etc from the grammar (bison -d grammar.y generates grammar.tab.h). You'll find that COMMENT is assigned the value 258, for example. Your scanner, though, is returning other numbers altogether because they're in declare.h. You'll have to fix this mismatch. One option is to #include "grammar.tab.h" in your scanner; this is more or less normal.
In retrospect, I think this is probably the most important observation; things seemed to revert to normal C debugging after this was resolved.
(People often include 'grammar.h' and only update 'grammar.h' if the content of 'grammar.tab.h' changes, so you don't recompile the scanner all the time).
The significance of this is that the set of tokens used by a grammar tends to be fairly stable while the actions associated with the rules change all the time as the implementation of the grammar evolves. So, if it takes enough time to be worth worrying about, you can create file grammar.h that is a copy of grammar.tab.h, but only update grammar.h when the content of grammar.tab.h changes.
cmp -s grammar.tab.h grammar.h || cp grammar.tab.h grammar.h
You'd include this in the makefile rule that converts that grammar into a C file (or an object file).
If the scanner is small enough and your machine fast enough, it may be simpler not to bother with this refinement; it mattered more in the days of 50 MHz machines with a few MiB of RAM than it does in these days of multiple cores running at 2+ GHz with a few GiB of RAM.

COBOL level 88 data type

Very basic question here.
I have to write out a data glossary for a COBOL program. This data glossary includes the following details about every variable:
Name
Data type
Range of values (if applicable)
Line numbers
Fuller name
I have several variables that include level 88 switches. My question is this: Are these level 88 switches counted as variables, and should I include them in the data glossary? Or, judging by the data glossary structure I have to work with, should they be ignored in this context?
And while I'm here, another simple question. Should fillers be included in data glossaries? This program in particular contains a LOT of filler variables, most being simple "PIC X" variables.
Assuming I understand the question being asked.
It would help if you could give an example with a COBOL layout and a data glossary entry, one with and one without an 88 entry. However, I'll do my best to answer the question.
No, 88 level entries are not variables, and they do not increase or decrease the length of the record. They simply allow you to create a conditional statement.
With that being said, should your data glossary only include variables that contribute to the length of the record?
If yes, then there shouldn't be a separate data glossary entry per 88 item. However, it might help to explain a given variable's values (item 3, and maybe item 5, or even an extra line for expected values).
01 record-store.
   02 location pic 9(4).
      88 dist-center value 100, 101, 102.
   02 amount pic 9(6).
   02 paid pic X(1).
      88 paid-yes value 'Y', 'y'.
      88 paid-no value 'N', 'n'.
Your data glossary would/could be:
location
Name: location
Data Types: integer
Range of Value: 0-9999
Line Numbers: 20
Fuller name: location of the data
Expected Values:
100, 101, 102 for distribution centers
1-99 for customers
103-9999 invalid
Now, knowing your expected values, you might go back and change your 88 values:
...
02 location pic 9(4).
   88 dist-center value 100, 101, 102.
   88 customers value 1 thru 99.
   88 invalid value 0, 103 thru 9999.
...
If no, then you could have a separate data glossary entry per 88 level entry.
Your data glossary would/could be:
location
Name: location
Data Types: integer
Range of Value: 0000-9999
Line Numbers: 20
Fuller Name: The location of the data
dist-center
Name: dist-center
Data Types: boolean
Range of Value: 100, 101, 102
Line Numbers: 5
Fuller Name: Is location a distribution center
customer
Name: customer
Data Types: boolean
Range of Value: 1-99
Line Numbers: 5
Fuller Name: Is location a customer
invalid
Name: invalid
Data Types: boolean
Range of Value: 0, 103-9999
Line Numbers: 5
Fuller Name: Is location an invalid value
As usual, it depends. :-)
The level 88 values seem to belong under part 3 "Range of values", especially if they document the only values allowed for some variable.
The FILLER fields are of course important if the documentation is used to reconstruct the records. If you just want to document the usage of the other fields, they are not very interesting.
The 'PIC X' FILLER variables are probably flags in working storage with 88 levels, and therefore quite important.
For instance, we use this type of construct a lot:
01 FILLER PIC X.
   88 OPTION-IS-ON VALUE 'Y', FALSE 'N'.
   88 OPTION-IS-OFF VALUE 'N'.
This defines a flag which we only reference using its conditions. For example we might use it like this:
SET OPTION-IS-ON TO TRUE. | This puts a 'Y' in the PIC X
.
.
.
IF OPTION-IS-ON
do something
END-IF
In this case we never need to refer to the actual flag value itself, and hence you do not need to give it a name.
The 'FALSE' in the 88 level just allows you to specify what is stored when you use the statement:
SET OPTION-IS-ON TO FALSE | This puts an 'N' in the PIC X
which of course is the same as saying:
SET OPTION-IS-OFF TO TRUE | This also puts an 'N' in the PIC X
It all depends what is more readable at the time.

How to find the "lexical file" in Wordnet?

If you look at the original Wordnet search and select "Display options: Show Lexical File Info", you'll see an extremely useful classification of words called the lexical file. E.g. for "filling" we have:
<noun.substance>S: (n) filling, fill (any material that fills a space or container)
<noun.process>S: (n) filling (flow into something (as a container))
<noun.food>S: (n) filling (a food mixture used to fill pastry or sandwiches etc.)
<noun.artifact>S: (n) woof, weft, filling, pick (the yarn woven across the warp yarn in weaving)
<noun.artifact>S: (n) filling ((dentistry) a dental appliance consisting of ...)
<noun.act>S: (n) filling (the act of filling something)
The first thing in brackets is the "lexical file". Unfortunately, I have not been able to find a SPARQL endpoint that provides this info.
The latest RDF translation of Wordnet 3.0 points to two things:
Talis SPARQL endpoint. Use e.g. this query to check that there's no such info:
DESCRIBE <http://purl.org/vocabularies/princeton/wn30/synset-chair-noun-1>
W3C's mapping description. Appendix D "Conversion details" describes something useful: wn:classifiedByTopic.
But it's not the same as the lexical file, and it is quite incomplete. E.g. "chair" has nothing, while one of the senses of "completion" is in the topic "American Football":
DESCRIBE <http://purl.org/vocabularies/princeton/wn30/synset-completion-noun-1> ->
<j.1:classifiedByTopic rdf:resource="http://purl.org/vocabularies/princeton/wn30/synset-American_football-noun-1"/>
The question: is there a public Wordnet query API, or a database, that provides the lexical file information?
Using the Python NLTK interface:
from nltk.corpus import wordnet as wn

for synset in wn.synsets('can'):
    print(synset.lexname())  # lexname() is a method in NLTK 3; older NLTK exposed it as an attribute
I don't think you can find it in the RDF/OWL Representation of WordNet. It's in the WordNet distribution though: dict/lexnames. Here is the content of the file as of WordNet 3.0:
00 adj.all 3
01 adj.pert 3
02 adv.all 4
03 noun.Tops 1
04 noun.act 1
05 noun.animal 1
06 noun.artifact 1
07 noun.attribute 1
08 noun.body 1
09 noun.cognition 1
10 noun.communication 1
11 noun.event 1
12 noun.feeling 1
13 noun.food 1
14 noun.group 1
15 noun.location 1
16 noun.motive 1
17 noun.object 1
18 noun.person 1
19 noun.phenomenon 1
20 noun.plant 1
21 noun.possession 1
22 noun.process 1
23 noun.quantity 1
24 noun.relation 1
25 noun.shape 1
26 noun.state 1
27 noun.substance 1
28 noun.time 1
29 verb.body 2
30 verb.change 2
31 verb.cognition 2
32 verb.communication 2
33 verb.competition 2
34 verb.consumption 2
35 verb.contact 2
36 verb.creation 2
37 verb.emotion 2
38 verb.motion 2
39 verb.perception 2
40 verb.possession 2
41 verb.social 2
42 verb.stative 2
43 verb.weather 2
44 adj.ppl 3
For each entry in dict/data.*, the second field is the lexical file number. For example, this filling entry contains the number 13, which is noun.food:
07883031 13 n 01 filling 0 002 # 07882497 n 0000 ~ 07883156 n 0000 | a food mixture used to fill pastry or sandwiches etc.
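To automate that lookup, here is a minimal Python sketch that reads dict/lexnames into a table and decodes the second field of a data entry. The installation path is a placeholder assumption:

# Build a lexical-file-number -> lexname table from WordNet's dict/lexnames.
# Assumption: WNDICT points at the dict directory of a local WordNet 3.0 install.
WNDICT = "/usr/local/WordNet-3.0/dict"

lexnames = {}
with open(WNDICT + "/lexnames") as f:
    for line in f:
        num, name, _pos_class = line.split()
        lexnames[int(num)] = name

# The second field of a dict/data.* entry is the lexical file number.
entry = ("07883031 13 n 01 filling 0 002 # 07882497 n 0000 "
         "~ 07883156 n 0000 | a food mixture used to fill pastry")
print(lexnames[int(entry.split()[1])])  # -> noun.food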
It can be done through MIT JWI (MIT Java Wordnet Interface), a Java API to query Wordnet. There's a topic in this link showing how to implement a Java class to access lexicographic file information.
This is what worked for me:
Synset[] synsets = database.getSynsets(wordStr);
ReferenceSynset referenceSynset = (ReferenceSynset) synsets[i];
int lexicalCode = referenceSynset.getLexicalFileNumber();
Then use the table above to deduce the lexname, e.g. noun.time.
If you're on Windows, chances are it is in your AppData directory. To get there, open your file browser, go to the address bar at the top, and type in %appdata%
Next, click on Roaming, and then find the nltk_data directory. In there, you will have your corpora directory. The full path is something like:
C:\Users\yourname\AppData\Roaming\nltk_data\corpora
and lexnames will be present under
C:\Users\yourname\AppData\Roaming\nltk_data\corpora\wordnet.
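If you are using NLTK, you can also locate the corpus directory programmatically instead of hard-coding the path; a small sketch, assuming nltk.download('wordnet') has been run at least once:

import nltk

# Resolve the installed wordnet corpus (this also works for the zipped
# download); the lexnames file lives at the top level of this directory.
print(nltk.data.find('corpora/wordnet'))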