How to use the m5ops in gem5, such as m5_exit and m5_dump_stats, in SE mode - gem5

I know this is a trivial question, but I am having difficulty running the m5ops in gem5.
Take for example the m5-exit.c file that gem5 provides in its test programs: how would I compile it and link it against m5op_x86.S?
Currently this is how I am compiling and linking it:
gcc m5-exit.c -I ~/Desktop/gem5_86/gem5/include -o test ~/Desktop/gem5_86/gem5/util/m5/m5op_x86.S
The error I get:
/tmp/ccXsGX3d.o: relocation R_X86_64_16 against undefined symbol `M5OP_ARM' can not be used when making a PIE object; recompile with -fPIC
The directory I am in is:
gem5/tests/test-progs/m5-exit/src
The code for m5-exit.c is from the gem5 directory found here

This is a copy of my answer to: How to use m5 in gem5-20 linking it with my own C++ program? My previous link-only (DRY) answer here was deleted, and the follow-up attempt to close this question as a duplicate did not succeed (it was correct, but not enough users cared).
On gem5 046645a4db646ec30cc36b0f5433114e8777dc44 I can do:
scons -C util/m5 build/x86/out/m5
gcc -static -I include -o main.out main.c util/m5/build/x86/out/libm5.a
with:
main.c
#include <gem5/m5ops.h>

int main(void) {
    m5_exit(0);
}
Or for ARM:
sudo apt install gcc-aarch64-linux-gnu g++-aarch64-linux-gnu
scons -C util/m5 build/aarch64/out/m5
aarch64-linux-gnu-gcc -static -I include -o main.out main.c \
util/m5/build/aarch64/out/libm5.a
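Since the question also asks about m5_dump_stats, here is a minimal sketch of bracketing a region of interest with m5_reset_stats and m5_dump_stats from gem5/m5ops.h. The USE_M5OPS macro, the no-op fallbacks, and the function names sum_below and run_region_of_interest are assumptions made for illustration, so that the same file also builds natively without gem5. The two zero arguments are taken here to mean no delay and no repetition.

```c
#include <stdint.h>

#ifdef USE_M5OPS   /* hypothetical switch: pass -DUSE_M5OPS when linking libm5.a */
#include <gem5/m5ops.h>
#else
/* Assumed no-op stand-ins so the program also builds and runs outside gem5. */
static inline void m5_reset_stats(uint64_t ns_delay, uint64_t ns_period)
{
    (void)ns_delay;
    (void)ns_period;
}
static inline void m5_dump_stats(uint64_t ns_delay, uint64_t ns_period)
{
    (void)ns_delay;
    (void)ns_period;
}
#endif

/* Hypothetical workload: sum of the integers 0 .. n-1. */
static uint64_t sum_below(uint64_t n)
{
    uint64_t total = 0;
    for (uint64_t i = 0; i < n; i++)
        total += i;
    return total;
}

/* Reset the statistics, run the workload, then dump the statistics. */
uint64_t run_region_of_interest(uint64_t n)
{
    m5_reset_stats(0, 0);
    uint64_t result = sum_below(n);
    m5_dump_stats(0, 0);
    return result;
}
```

Built with -DUSE_M5OPS and linked against libm5.a as above, a call such as run_region_of_interest(1000) from main should make gem5 dump statistics covering only the loop. Without the macro the two calls do nothing.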
But in practice, I often just don't have the patience for this business, so I just misbehave and add raw assembly directly as shown here muahahaha e.g.:
#if defined(__x86_64__)
#define LKMC_M5OPS_CHECKPOINT __asm__ __volatile__ (".word 0x040F; .word 0x0043;" : : "D" (0), "S" (0) :)
#define LKMC_M5OPS_DUMPSTATS __asm__ __volatile__ (".word 0x040F; .word 0x0041;" : : "D" (0), "S" (0) :)
#elif defined(__aarch64__)
#define LKMC_M5OPS_CHECKPOINT __asm__ __volatile__ ("mov x0, 0; mov x1, 0; .inst 0xFF000110 | (0x43 << 16);" : : : "x0", "x1")
#define LKMC_M5OPS_DUMPSTATS __asm__ __volatile__ ("mov x0, 0; mov x1, 0; .inst 0xFF000110 | (0x41 << 16);" : : : "x0", "x1")
#endif
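To make the magic numbers in those macros easier to read, the sketch below simply recomputes them. The function names are hypothetical and nothing here executes an m5op. It only reproduces the constants that appear in the macros, assuming the operation number (0x43 for checkpoint, 0x41 for dump stats) is the only part that varies.

```c
#include <stdint.h>

/* AArch64: the macros emit .inst 0xFF000110 | (func << 16). */
uint32_t m5op_aarch64_inst(uint8_t func)
{
    return UINT32_C(0xFF000110) | ((uint32_t)func << 16);
}

/* x86-64: the macros emit .word 0x040F followed by .word func,
 * which on a little-endian machine is the byte sequence 0F 04 func 00. */
void m5op_x86_bytes(uint8_t func, uint8_t out[4])
{
    out[0] = 0x0F;
    out[1] = 0x04;
    out[2] = func;
    out[3] = 0x00;
}
```

For example, m5op_aarch64_inst(0x41) yields 0xFF410110, the word that the AArch64 dump-stats macro places in the instruction stream.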
More general m5op information can also be found at: What are pseudo-instructions for in gem5?
Tested on Ubuntu 20.04.

Related

Prevent CUDA-enabled MPI from checking for CUDA devices

The OpenMPI 4.0.5 on our cluster is built with CUDA support, but I want to benchmark pnetcdf without needing CUDA. I want to do a number of test runs that I can start on about a quarter of a node, and my tests won't use the GPUs, so is there a way to suppress the MPI check for CUDA devices? When I simply obtain a SLURM allocation without GPUs, I get lots of errors from that alone.
These errors come from hwloc and can be suppressed with HWLOC_HIDE_ERRORS=1, but I'd like to know if there is a more specific method.
Steps to reproduce:
frontend$ salloc -n 16 -t 8:00:00 -A k20200
node$ exec bash -l
node$ module load gcc openmpi
node$ mpicc -o /tmp/hello ~/usr/src/helloworld_mpi.c
node$ srun -n 1 /tmp/hello
CUDA: Failed to get number of devices with cudaGetDeviceCount(): no CUDA-capable device is detected
Hello world!, I'm rank 0 of 1!
node$ HWLOC_HIDE_ERRORS=1 srun -n 1 /tmp/hello
Hello world!, I'm rank 0 of 1!
node$ logout
The example code used above is the following, but any program that does not use CUDA would serve equally well for this exercise:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define xmpi(rc) \
    do { \
        int err = (rc); \
        if (err != MPI_SUCCESS) { \
            char msg[MPI_MAX_ERROR_STRING + 1]; \
            int msg_len; \
            \
            if (MPI_Error_string(err, msg, &msg_len) \
                == MPI_SUCCESS) { \
                msg[msg_len] = '\0'; \
                \
                fprintf(stderr, \
                        "Problem in MPI call: %d = %s\n", \
                        err, msg); \
                MPI_Abort(MPI_COMM_WORLD, 1); \
            } \
        } \
    } while (0)

int
main(int argc, char *argv[])
{
    xmpi(MPI_Init(&argc, &argv));
    int rank, size;
    xmpi(MPI_Comm_rank(MPI_COMM_WORLD, &rank));
    xmpi(MPI_Comm_size(MPI_COMM_WORLD, &size));
    printf("Hello world!, I'm rank %d of %d!\n", rank, size);
    xmpi(MPI_Finalize());
    return EXIT_SUCCESS;
}
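The HWLOC_HIDE_ERRORS=1 workaround mentioned above can also be applied from inside the program instead of on the srun command line. This is only a sketch: the helper name hide_hwloc_errors is hypothetical, and it assumes hwloc reads the variable no earlier than MPI_Init, which has not been verified for this OpenMPI build. It is no more specific than the environment variable itself.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict -std=c99 */
#endif
#include <stdlib.h>

/* Hypothetical helper: set HWLOC_HIDE_ERRORS=1 for the current process.
 * Returns 0 on success and -1 on failure, like setenv(). */
int hide_hwloc_errors(void)
{
    return setenv("HWLOC_HIDE_ERRORS", "1", 1 /* overwrite */);
}
```

It would be called as the first statement of main, before xmpi(MPI_Init(&argc, &argv)). If hwloc is initialized earlier than that, for example from a library constructor, this approach would have no effect.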

How to use m5 in gem5-20 linking it with my own C++ program?

In gem5-20, I can build the m5 utility using:
scons build/<arch>/out/m5
But I don't know how to link m5 into my C++ code.
Some of the necessary steps are mentioned at the end of this document, but I am hoping for more specific guidance:
http://www.gem5.org/documentation/general_docs/m5ops/
Has anyone done this? Please help.
Thanks!
Best wishes!
On gem5 046645a4db646ec30cc36b0f5433114e8777dc44 I can do:
scons -C util/m5 build/x86/out/m5
gcc -static -I include -o main.out main.c util/m5/build/x86/out/libm5.a
with:
main.c
#include <gem5/m5ops.h>

int main(void) {
    m5_exit(0);
}
Or for ARM:
sudo apt install gcc-aarch64-linux-gnu g++-aarch64-linux-gnu
scons -C util/m5 build/aarch64/out/m5
aarch64-linux-gnu-gcc -static -I include -o main.out main.c \
util/m5/build/aarch64/out/libm5.a
An official example can also be found at: gem5-resources/src/simplem5_exit.c with instructions in the README.
And here is one using the m5_exit_addr variant, which uses the memory version of the m5op instead of the instruction version, and which can also be used from KVM, for example: https://gem5-review.googlesource.com/c/public/gem5/+/31219/7
But in practice, I often just don't have the patience for this business, so I just misbehave and add raw assembly directly as shown here muahahaha e.g.:
#if defined(__x86_64__)
#define LKMC_M5OPS_CHECKPOINT __asm__ __volatile__ (".word 0x040F; .word 0x0043;" : : "D" (0), "S" (0) :)
#define LKMC_M5OPS_DUMPSTATS __asm__ __volatile__ (".word 0x040F; .word 0x0041;" : : "D" (0), "S" (0) :)
#elif defined(__aarch64__)
#define LKMC_M5OPS_CHECKPOINT __asm__ __volatile__ ("mov x0, 0; mov x1, 0; .inst 0xFF000110 | (0x43 << 16);" : : : "x0", "x1")
#define LKMC_M5OPS_DUMPSTATS __asm__ __volatile__ ("mov x0, 0; mov x1, 0; .inst 0xFF000110 | (0x41 << 16);" : : : "x0", "x1")
#endif
More general m5op information can also be found at: What are pseudo-instructions for in gem5?
Related:
How to use m5 in gem5-20 linking it with my own C++ program?
Tested on Ubuntu 20.04.

How to improve the C++ code to get similar efficiency as in the Fortran code with arrays

Below are two programs, one in Fortran and one in C++, that perform the same simple array manipulation. The C++ version is tried with several different array declarations. How can I improve the C++ code to get efficiency similar to the Fortran code? The run times in seconds are summarized below.
The Fortran program:
! fort.f90
PARAMETER ( N=1000000, M=10000 )
REAL*8, ALLOCATABLE :: D(:)
ALLOCATE(D(N))
A=1.0
DO J=1,M
  DO I=1,N
    D(I)=A+I+J
  ENDDO
ENDDO
END
The Cpp program:
// main.cpp
#include <cstddef>
#include <valarray>
#include <vector>
#include <blitz/array.h> // needed for namespace blitz and the Blitz++ variant

using namespace std;
using namespace blitz;

int main(int argc, char* argv[]){
    int N=1000000, M=10000;
    double* D=new double[N];
    //Array<double,1> D(N); // for Blitz C++
    //vector<double>D(N);
    //valarray<double>D(N);
    const double a=1.0;
    size_t i,j;
    for (j=0; j<M; j++)
        for (i=0; i<N; i++)
            D[i]=a+i+j;
    return 0;
}
g++ main.cpp -o main && time main
g++ main.cpp -o main -Ofast && time main
f95 fort.f90 -o fort && time fort
f95 fort.f90 -o fort -Ofast && time fort
Here are the timings:
1) double* D=new double[N]; g++ : 58,1s, g++ -Ofast : 16,413s
2) valarray<double> D(N); ~ the same
3) vector<double> D(N); ~ the same
4) Blitz C++; g++ : 3m19,017s, g++ -Ofast : 15,142s
5) ALLOCATE(D(N)); f95 : 42,092s, f95 -Ofast : 0,002s
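One plausible reading of the 0,002s Fortran figure is that the optimizer removed the loops entirely, because D is never read after being written. That is an assumption, not something the timings prove. A sketch of the same kernel in C that keeps the result observable, so that the comparison measures the loops in both languages, could look like this (fill and checksum are hypothetical names, and the equivalent change would be needed in the Fortran program):

```c
#include <stddef.h>

/* Same kernel as above: after the last pass, d[i] == a + i + (m - 1). */
void fill(double *d, size_t n, size_t m, double a)
{
    for (size_t j = 0; j < m; j++)
        for (size_t i = 0; i < n; i++)
            d[i] = a + (double)i + (double)j;
}

/* Summing the array and then printing or returning the sum
 * makes the stores observable to the rest of the program. */
double checksum(const double *d, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += d[i];
    return sum;
}
```

Printing checksum(D, N) after the loops gives the compiler a reason to keep the array contents. Even then, an optimizer may still legally keep only the final pass over j, since earlier passes are overwritten, so the timings should be read with that in mind.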

G++ doesn't find CoInitializeEx (and several other functions)?

I'm trying to compile the following code
#include <iostream>
#include <windows.h>
#include <objbase.h>

int main (int argc, char** argv) {
    HRESULT hr;
    hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
    if (SUCCEEDED(hr)) {
        std::cout << "Initialized" << std::endl;
    } else {
        std::cout << "Failed" << std::endl;
    }
    CoUninitialize();
    return 0;
}
but
g++ -o test -L"<dir>" -lOle32 <file>.cpp
# <dir> contains Ole32.Lib
always tells me that __imp_CoInitializeEx and __imp_CoUninitialize are undefined, and -print-file-name=Ole32.Lib just returns Ole32.Lib. If g++ doesn't find Ole32.Lib, maybe
g++ -c -o test.o <file>.cpp
ld -L"<dir>" -lOle32 -o test test.o
works. Now g++/ld actually finds CoInitializeEx and CoUninitialize, but the standard library seems to be missing and adding -static-libstdc++ or -lstdc++ or -llibstdc++ doesn't help either. So what am I missing? Why is g++ unable to find CoInitializeEx and CoUninitialize?
EDIT: I can definitely say that there is nothing wrong with my code, my header files, or my library files, because I can compile the code using Visual Studio's compiler:
cl /c /EHsc ^
/I"<...>\Microsoft Visual Studio 14.0\VC\include" ^
/I"<...>\Windows Kits\10\Include\<version>\ucrt" ^
/I"<...>\Windows Kits\10\Include\<version>\shared" ^
/I"<...>\Windows Kits\10\Include\<version>\um" ^
/Fotest.obj ^
main.cpp
link /nologo /machine:x64 /subsystem:console ^
/libpath:"<...>\Microsoft Visual Studio 14.0\VC\lib\amd64" ^
/libpath:"<...>\Windows Kits\10\Lib\<version>\ucrt\x64" ^
/libpath:"<...>\Windows Kits\10\Lib\<version>\um\x64" ^
/out:test.exe ^
test.obj Ole32.Lib

How to Compile Programs Built With Yacc and Lex?

My Yacc source is in pos.yacc and my Lex source is in pos1.lex, as shown.
pos1.lex
%{
#include "y.tab.h"
int yylval;
%}
DIGIT [0-9]+
%%
{DIGIT} {yylval=atoi(yytext);return DIGIT;}
[\n ] {}
. {return *yytext;}
%%
pos.yacc
%token DIGIT
%%
s:e {printf("%d\n",$1);}
e:DIGIT {$$=$1;}
|e e "+" {$$=$1+$2;}
|e e "*" {$$=$1*$2;}
|e e "-" {$$=$1-$2;}
|e e "/" {$$=$1/$2;}
;
%%
main() {
yyparse();
}
yyerror() {
printf("Error");
}
Compilation errors
While compiling I am getting errors like:
malathy#malathy:~$ cc lex.yy.c y.tab.c -ly -ll
pos.y: In function ‘yyerror’:
pos.y:16: warning: incompatible implicit declaration of built-in function ‘printf’
pos.y: In function ‘yyparse’:
pos.y:4: warning: incompatible implicit declaration of built-in function ‘printf’
What causes those errors?
How am I supposed to compile Lex and Yacc source code?
printf() is defined in stdio.h so just include it above y.tab.h in pos1.lex:
%{
#include <stdio.h>
/* Add ^^^^^^^^^^^ this line */
#include "y.tab.h"
int yylval;
%}
DIGIT [0-9]+
%%
{DIGIT} {yylval=atoi(yytext);return DIGIT;}
[\n ] {}
. {return *yytext;}
%%
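Note that the warnings in the question are reported against pos.y, so the yacc source most likely needs the declaration as well: stdio.h would be included in a %{ ... %} block at the top of pos.yacc, and the functions in its user-code section given explicit types. A minimal sketch of such an error handler follows. The message format and the int return type are choices made here, not something the generated parser is known to require.

```c
#include <stdio.h>

/* Error callback invoked by the generated parser.
 * Prints the message to stderr and returns 0. */
int yyerror(const char *msg)
{
    fprintf(stderr, "Error: %s\n", msg);
    return 0;
}
```

In the same spirit, main in pos.yacc would be written as int main(void) { return yyparse(); } rather than relying on implicit int.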
You have the direct answer to your question from trojanfoe - you need to include <stdio.h> to declare the function printf(). This is true in any source code presented to the C compiler.
However, you should also note that the conventional suffix for Yacc source is .y (rather than .yacc), and for Lex source is .l (rather than .lex). In particular, using those suffixes means that make will know what to do with your source, so you do not have to code the compilation rules by hand.
Given files lex.l and yacc.y, make compiles them to object code using:
$ make lex.o yacc.o
rm -f lex.c
lex -t lex.l > lex.c
cc -O -std=c99 -Wall -Wextra -c -o lex.o lex.c
yacc yacc.y
mv -f y.tab.c yacc.c
cc -O -std=c99 -Wall -Wextra -c -o yacc.o yacc.c
rm lex.c yacc.c
$
This is in a directory with a makefile that sets CFLAGS = -O -std=c99 -Wall -Wextra. (This was on MacOS X 10.6.6.) You will sometimes see other similar rules used; in particular, lex generates a file lex.yy.c by default (at least on MacOS X), and you'll often see a rule such as:
lex lex.l
mv lex.yy.c lex.c
cc -O -std=c99 -Wall -Wextra -c -o lex.o lex.c
Or even:
lex lex.l
cc -O -std=c99 -Wall -Wextra -c -o lex.o lex.yy.c
The alternatives are legion; use make and it gets it right.
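Include the header file stdio.h.
To compile, open a terminal, change to the directory containing both files, and type:
lex pos1.l
yacc -d pos.y
cc lex.yy.c y.tab.c -ll
./a.out
The -d option makes yacc write the y.tab.h header that pos1.l includes, and the C compiler is given y.tab.c, not y.tab.h.
You just need to save the sources as pos1.l and pos.y.
I hope it works.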