OpenNebula - Bridge VM NIC with Host NIC - take IP from LAN DHCP - KVM

I hope you are doing well.
I have just started using OpenNebula. I deployed a basic setup: one OpenNebula frontend on CentOS 8 and another server as the OpenNebula node.
I downloaded a CentOS image from the marketplace, then created a network under Network >> Virtual Network and bridged it with ens33 (ens33 is the physical interface of my node) in order to give the VM access to the LAN.
Here is my node's network configuration:
[centos@host1 ~]$ ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.0.60 netmask 255.255.255.0 broadcast 192.168.0.255
        ether 00:0c:29:68:26:2b txqueuelen 1000 (Ethernet)
        RX packets 679155 bytes 994474147 (948.4 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 41914 bytes 3220552 (3.0 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1000 (Local Loopback)
        RX packets 6 bytes 672 (672.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 6 bytes 672 (672.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

virbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        inet 192.168.122.1 netmask 255.255.255.0 broadcast 192.168.122.255
        ether 52:54:00:89:84:b1 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Once I create a VM and attach it to the bridged network I created, it ends up in the Failed state with the log below:
Sat May 1 03:50:25 2021 [Z0][VM][I]: New state is ACTIVE
Sat May 1 03:50:25 2021 [Z0][VM][I]: New LCM state is PROLOG
Sat May 1 03:50:38 2021 [Z0][VM][I]: New LCM state is BOOT
Sat May 1 03:50:38 2021 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/14/deployment.0
Sat May 1 03:50:39 2021 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Sat May 1 03:50:40 2021 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vnm/bridge/pre
Sat May 1 03:50:40 2021 [Z0][VMM][E]: pre: Command "sudo ip link add name ens33 type bridge " failed.
Sat May 1 03:50:40 2021 [Z0][VMM][E]: pre: RTNETLINK answers: File exists
Sat May 1 03:50:40 2021 [Z0][VMM][E]: RTNETLINK answers: File exists
Sat May 1 03:50:40 2021 [Z0][VMM][E]:
Sat May 1 03:50:40 2021 [Z0][VMM][I]: ExitCode: 2
Sat May 1 03:50:40 2021 [Z0][VMM][I]: Failed to execute network driver operation: pre.
Sat May 1 03:50:40 2021 [Z0][VMM][E]: Error deploying virtual machine: bridge: RTNETLINK answers: File exists
Sat May 1 03:50:40 2021 [Z0][VM][I]: New LCM state is BOOT_FAILURE
Can anyone please explain what is wrong here? I am familiar with vSphere (ESXi/vCenter). I just want to create a VM network, attach it to the node's physical NIC, and then attach the VM to that network to give it LAN access. On the VMware side this is simple, but I am not sure how it works with OpenNebula.
Thank you

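The problem here is that you set the network's bridge to a physical interface (ens33) instead of a bridge. The log shows the driver running "ip link add name ens33 type bridge", which fails with "RTNETLINK answers: File exists" because an interface named ens33 already exists. If you would like to use bridged networking, you need to create a bridge yourself or let OpenNebula create it for you, for example by giving the bridge a name that is not already in use and setting ens33 as the network's physical device.
Let me know if this answers your question. If not, feel free to post your query on the OpenNebula Forum: https://forum.opennebula.io/ :)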
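The name clash described above can be checked with a short script. This is only a sketch: the helper names `requested_bridge` and `conflicts_with_existing` are hypothetical and not part of OpenNebula, and the interface list is copied by hand from the ifconfig output in the question.

```python
import re

# Hypothetical helpers for illustration; not part of OpenNebula.
# Matches the command that the bridge driver logged in the question.
BRIDGE_CMD = re.compile(r"ip link add name (\S+) type bridge")


def requested_bridge(log_line):
    """Return the bridge name the driver tried to create, or None if absent."""
    match = BRIDGE_CMD.search(log_line)
    return match.group(1) if match else None


def conflicts_with_existing(bridge, existing_interfaces):
    """True when the requested bridge name is already used by an interface."""
    return bridge in existing_interfaces


log_line = 'pre: Command "sudo ip link add name ens33 type bridge " failed.'
existing = ["ens33", "lo", "virbr0"]  # names taken from the ifconfig output

bridge = requested_bridge(log_line)
print(bridge, conflicts_with_existing(bridge, existing))
```

A bridge name that is not in the list (for instance a hypothetical "br0") would not conflict, which is why choosing an unused bridge name avoids the "File exists" error.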

Running ARM full-system on gem5

I am trying to run an ARM full-system simulation on gem5. When I enter this command:
root@farideh-S551LN:/home/farideh/gem5# ./build/ARM/gem5.opt configs/example/arm/starter_fs.py --cpu="minor" --num-cores=1 --disk-image=/home/farideh/Downloads/fullsystem/disks/linaro-minimal-aarch64.img --kernel=/home/farideh/Downloads/fullsystem/binaries/vmlinux.vexpress_gem5_v1_64.20170616
the terminal shows:
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 compiled May 1 2019 08:46:14
gem5 started Jun 2 2019 11:49:47
gem5 executing on farideh-S551LN, pid 7162
command line: ./build/ARM/gem5.opt configs/example/arm/starter_fs.py --cpu=minor --num-cores=1 --disk-image=/home/farideh/Downloads/fullsystem/disks/linaro-minimal-aarch64.img --kernel=/home/farideh/Downloads/fullsystem/binaries/vmlinux.vexpress_gem5_v1_64.20170616
Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (2048 Mbytes)
info: kernel located at: /home/farideh/Downloads/fullsystem/binaries/vmlinux.vexpress_gem5_v1_64.20170616
warn: Bootloader entry point 0x10 overriding reset address 0
warn: Highest ARM exception-level set to AArch32 but bootloader is for AArch64. Assuming you wanted these to match.
system.vncserver: Listening for connections on port 5900
system.terminal: Listening for connections on port 3456
0: system.remote_gdb: listening for remote gdb on port 7000
info: Using bootloader at address 0x10
info: Using kernel entry physical address at 0x80080000
info: Loading DTB file: /home/farideh/gem5/m5out/system.dtb at address 0x88000000
warn: Existing EnergyCtrl, but no enabled DVFSHandler found.
info: Entering event queue @ 0. Starting simulation...
warn: ClockedObject: Already in the requested power state, request ignored
warn: SCReg: Access to unknown device dcc0:site0:pos0:fn7:dev0
Then I enter telnet 127.0.0.1 3456 in another terminal and it shows:
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
==== m5 slave terminal: Terminal 0 ====
After some time, telnet reports this error: Connection closed by foreign host. In the other terminal, I see this:
root@farideh-S551LN:/home/farideh/gem5# ./build/ARM/gem5.opt configs/example/arm/starter_fs.py --cpu="minor" --num-cores=1 --disk-image=/home/farideh/Downloads/fullsystem/disks/linaro-minimal-aarch64.img --kernel=/home/farideh/Downloads/fullsystem/binaries/vmlinux.vexpress_gem5_v1_64.20170616
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 compiled May 1 2019 08:46:14
gem5 started Jun 2 2019 11:49:47
gem5 executing on farideh-S551LN, pid 7162
command line: ./build/ARM/gem5.opt configs/example/arm/starter_fs.py --cpu=minor --num-cores=1 --disk-image=/home/farideh/Downloads/fullsystem/disks/linaro-minimal-aarch64.img --kernel=/home/farideh/Downloads/fullsystem/binaries/vmlinux.vexpress_gem5_v1_64.20170616
Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (2048 Mbytes)
info: kernel located at: /home/farideh/Downloads/fullsystem/binaries/vmlinux.vexpress_gem5_v1_64.20170616
warn: Bootloader entry point 0x10 overriding reset address 0
warn: Highest ARM exception-level set to AArch32 but bootloader is for AArch64. Assuming you wanted these to match.
system.vncserver: Listening for connections on port 5900
system.terminal: Listening for connections on port 3456
0: system.remote_gdb: listening for remote gdb on port 7000
info: Using bootloader at address 0x10
info: Using kernel entry physical address at 0x80080000
info: Loading DTB file: /home/farideh/gem5/m5out/system.dtb at address 0x88000000
warn: Existing EnergyCtrl, but no enabled DVFSHandler found.
info: Entering event queue @ 0. Starting simulation...
warn: ClockedObject: Already in the requested power state, request ignored
warn: SCReg: Access to unknown device dcc0:site0:pos0:fn7:dev0
33342160250: system.terminal: attach terminal 0
49510353000: system.terminal: detach terminal 0
62922704750: system.terminal: attach terminal 0
warn: Tried to read RealView I/O at offset 0x60 that doesn't exist
warn: Tried to read RealView I/O at offset 0x48 that doesn't exist
warn: Tried to write RVIO at offset 0xa8 (data 0) that doesn't exist
simulate() limit reached @ 18446744073709551615
root@farideh-S551LN:/home/farideh/gem5#
I do not know whether the ARM full-system simulation is actually running. Why does telnet show Connection closed by foreign host?

HyperV Gen2 VM not booting over PXE

I have two VMs in HyperV, both on the same virtual switch (internal) and on the same subnet. I am trying to set up one as a DHCP and TFTP server for PXE boot. With a Gen1 machine, it works fine with pxelinux. Unfortunately, Gen2 with UEFI does not work.
DHCP & TFTP Server
IP 192.168.1.2
VLAN identification is disabled
DHCP - ISC DHCP Server running in a docker container with "host" network type with the following configuration:
set vendorclass = option vendor-class-identifier;
option pxe-system-type code 93 = unsigned integer 16;
set pxetype = option pxe-system-type;
authoritative;
default-lease-time 7200;
max-lease-time 7200;
option tftp-server-name "192.168.1.2";
option bootfile-name "efi/core.efi";
subnet 192.168.1.0 netmask 255.255.255.0 {
    interface "eth0:0";
    option routers 192.168.1.1;
    option subnet-mask 255.255.255.0;
    range 192.168.1.100 192.168.1.150;
    option broadcast-address 192.168.1.255;
    option domain-name-servers 8.8.8.8, 8.8.4.4;
    option domain-name "ad.lholota.net";
    option domain-search "ad.lholota.net";
    if substring(vendorclass, 0, 9)="PXEClient" {
        if pxetype=00:06 or pxetype=00:07 {
            filename "efi/core.efi";
        } else {
            filename "pxelinux/pxelinux.0";
        }
    }
    next-server 192.168.1.2;
}
TFTP - tftp-hpa running in a docker container on a "host" type network. I can download the efi files manually through a standard tftp client.
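The boot file selection in the DHCP configuration above can be restated as a small function, which makes it easier to see what a Gen2 (UEFI) client should receive. This is a sketch only: `choose_bootfile` is a hypothetical name, it models just the `filename` if/else and not the rest of the server's behaviour, and the BIOS vendor class string in the example is an assumed value, not taken from a capture.

```python
def choose_bootfile(vendor_class, pxe_system_type):
    """Mirror the filename if/else from the dhcpd configuration.

    vendor_class: DHCP option 60 as a string.
    pxe_system_type: DHCP option 93 as an integer (for example 0x0007).
    Returns the boot file name, or None when the client is not a PXE client.
    """
    # substring(vendorclass, 0, 9) = "PXEClient"
    if not vendor_class.startswith("PXEClient"):
        return None
    # pxetype = 00:06 or pxetype = 00:07 selects the EFI loader
    if pxe_system_type in (0x0006, 0x0007):
        return "efi/core.efi"
    return "pxelinux/pxelinux.0"


# Values reported by the Gen2 client in the DHCPDISCOVER shown in the question
gen2 = choose_bootfile("PXEClient:Arch:00007:UNDI:003000", 0x0007)

# Assumed values for a BIOS (Gen1) client, for comparison only
gen1 = choose_bootfile("PXEClient:Arch:00000:UNDI:002001", 0x0000)

print(gen2, gen1)
```

With the Gen2 values the function returns efi/core.efi, which agrees with the FNAME field in the DHCPOFFER, so the selection logic itself appears to behave as configured.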
Booting machine
HyperV Gen2
No virtual HDD or DVD
Firmware tab has only one item in the boot sequence - network
Secure boot is disabled
VLAN identification is disabled
Network adapter pointing into the same internal switch as the first VM
Enable virtual machine queue - checked
Enable IPsec task offloading - checked, maximum number: 512
MAC Address dynamic
Enable DHCP guard - NOT checked
Enable router advertisement guard - NOT checked
Protected network - NOT checked
Mirroring mode - None
Enable device naming - NOT checked
The trouble is that the machine doesn't even get to the TFTP server because it doesn't finish the DHCP Discover-Offer-Request-Ack flow. It gets stuck on offer as shown in the dhcpdump below. The booting machine never sends the request message. Funny enough, BIOS based Gen1 HyperV machine boots without any issue so the DHCP flow works there.
Can you please give me a hint of what might be wrong?
TIME: 2018-07-11 19:49:37.641
IP: 0.0.0.0 (0:15:5d:0:50:d0) > 255.255.255.255 (ff:ff:ff:ff:ff:ff)
OP: 1 (BOOTPREQUEST)
HTYPE: 1 (Ethernet)
HLEN: 6
HOPS: 0
XID: 8bf1c250
SECS: 0
FLAGS: 7f80
CIADDR: 0.0.0.0
YIADDR: 0.0.0.0
SIADDR: 0.0.0.0
GIADDR: 0.0.0.0
CHADDR: 00:15:5d:00:50:d0:00:00:00:00:00:00:00:00:00:00
SNAME: .
FNAME: .
OPTION: 53 ( 1) DHCP message type 1 (DHCPDISCOVER)
OPTION: 57 ( 2) Maximum DHCP message size 1472
OPTION: 55 ( 35) Parameter Request List 1 (Subnet mask)
2 (Time offset)
3 (Routers)
4 (Time server)
5 (Name server)
6 (DNS server)
12 (Host name)
13 (Boot file size)
15 (Domainname)
17 (Root path)
18 (Extensions path)
22 (Maximum datagram reassembly size)
23 (Default IP TTL)
28 (Broadcast address)
40 (NIS domain)
41 (NIS servers)
42 (NTP servers)
43 (Vendor specific info)
50 (Request IP address)
51 (IP address leasetime)
54 (Server identifier)
58 (T1)
59 (T2)
60 (Vendor class identifier)
66 (TFTP server name)
67 (Bootfile name)
97 (UUID/GUID)
128 (???)
129 (???)
130 (???)
131 (???)
132 (???)
133 (???)
134 (???)
135 (???)
OPTION: 97 ( 17) UUID/GUID 008c0c7ab81331a0 ...z..1.
4297445b2e41610e B.D[.Aa.
a8 .
OPTION: 94 ( 3) Client NDI 010300 ...
OPTION: 93 ( 2) Client System 0007 ..
OPTION: 60 ( 32) Vendor class identifier PXEClient:Arch:00007:UNDI:003000
---------------------------------------------------------------------------
TIME: 2018-07-11 19:49:37.641
IP: 0.0.0.0 (0:15:5d:0:50:12) > 255.255.255.255 (ff:ff:ff:ff:ff:ff)
OP: 2 (BOOTPREPLY)
HTYPE: 1 (Ethernet)
HLEN: 6
HOPS: 0
XID: 8bf1c250
SECS: 0
FLAGS: 7f80
CIADDR: 0.0.0.0
YIADDR: 192.168.1.105
SIADDR: 192.168.1.2
GIADDR: 0.0.0.0
CHADDR: 00:15:5d:00:50:d0:00:00:00:00:00:00:00:00:00:00
SNAME: .
FNAME: efi/core.efi.
OPTION: 53 ( 1) DHCP message type 2 (DHCPOFFER)
OPTION: 51 ( 4) IP address leasetime 7200 (2h)
OPTION: 1 ( 4) Subnet mask 255.255.255.0
OPTION: 3 ( 4) Routers 192.168.1.1
OPTION: 6 ( 8) DNS server 8.8.8.8,8.8.4.4
OPTION: 15 ( 14) Domainname ad.lholota.net
OPTION: 28 ( 4) Broadcast address 192.168.1.255
I have had what I believe is the same issue when booting HyperV virtual machines on Windows 10 2004 (19041.685): Gen 1 works, Gen 2 times out without ever asking for the boot file.
I strongly suspect this is an issue with the Gen 2 UEFI PXE implementation, because as soon as there are at least two entries to choose from in the PXE boot menu, it requests files and downloads them as expected.
I run dnsmasq for TFTP and DHCP, and my config file below works if and only if at least one of the last two rows is uncommented (pxe-service=x86-64_EFI and pxe-service=7 are equivalent).
config context: https://linuxconfig.org/how-to-configure-a-raspberry-pi-as-a-pxe-boot-server
# /etc/dnsmasq.d/03-tftpboot.conf
enable-tftp
tftp-lowercase
tftp-root=/mnt/data/netboot
pxe-prompt="Choose:"
pxe-service=x86PC,"PXELINUX (BIOS)",bios/pxelinux.0
pxe-service=x86PC,"WinPE (BIOS)",boot/pxeboot.n12
pxe-service=x86-64_EFI,"PXELINUX (EFI)",efi64/syslinux.efi
pxe-service=x86-64_EFI,"winpe (EFI)",boot/wdsmgfw.efi
#pxe-service=7,"PXELINUX (EFI-7)",efi64/syslinux.efi
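To check a dnsmasq configuration against the workaround described in this answer, one could count the active pxe-service entries per client architecture. The sketch below is an illustration only: the function names are hypothetical, the parsing is deliberately naive (one directive per line, no include files), and the alias table only encodes the equivalence of 7 and x86-64_EFI stated above.

```python
from collections import Counter

# Per the answer, pxe-service=7 and pxe-service=x86-64_EFI are equivalent.
ARCH_ALIASES = {"7": "x86-64_EFI"}


def pxe_entries_per_arch(config_text):
    """Count uncommented pxe-service lines for each client architecture."""
    counts = Counter()
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or not line.startswith("pxe-service="):
            continue
        arch = line[len("pxe-service="):].split(",", 1)[0]
        counts[ARCH_ALIASES.get(arch, arch)] += 1
    return counts


def single_entry_archs(config_text):
    """Architectures with fewer than two menu entries."""
    counts = pxe_entries_per_arch(config_text)
    return sorted(arch for arch, n in counts.items() if n < 2)


config = """\
enable-tftp
pxe-service=x86PC,"PXELINUX (BIOS)",bios/pxelinux.0
pxe-service=x86PC,"WinPE (BIOS)",boot/pxeboot.n12
pxe-service=x86-64_EFI,"PXELINUX (EFI)",efi64/syslinux.efi
pxe-service=x86-64_EFI,"winpe (EFI)",boot/wdsmgfw.efi
#pxe-service=7,"PXELINUX (EFI-7)",efi64/syslinux.efi
"""

counts = pxe_entries_per_arch(config)
print(dict(counts), single_entry_archs(config))
```

For the configuration as posted, both architectures have two entries, so nothing is flagged. Commenting out the "winpe (EFI)" row as well would leave x86-64_EFI with a single entry, which is the case the answer reports as failing.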
I think I am experiencing the same problem when using the Digital Rebar provisioner. It works great on Gen 1 but not on Gen 2. I have followed the same configuration as well.
Looking at the digital rebar code it seems like it should work but does not: https://github.com/digitalrebar/provision/blob/8269e1c7ff12a82854c19eccd114d064e2278211/midlayer/pxe.go#L252
I think this could be related:
https://wiki.fogproject.org/wiki/index.php/BIOS_and_UEFI_Co-Existence
https://serverfault.com/questions/739138/hyper-v-2016-gen2-vm-pxe-dhcp-timeout-wireshark-dhcp-discover-offer

CumulocityLongPollingTransport - canceling the long poll request because of inactivity

I am using the Cumulocity java agent (7.38.0) and it apparently lost communication with the server somehow and never recovered. The admin interface says:
LAST COMMUNICATION
November 22, 2016 2:25 AM
and the last Cumulocity record in the device syslog was:
Nov 22 01:25:47 localhost root: 01:25:47.166 [CumulocityLongPollingTransport-scheduler-2] WARN c.c.s.c.n.ConnectionHeartBeatWatcher - canceling the long poll request because of inactivity
(There was a 1-hour time difference due to a device configuration problem.)
The process still appears to be running:
ps -ef | grep -i c8y
root 1341 1257 0 Nov19 ? 00:00:00 /bin/sh ./c8y-agent.sh
root 1342 1341 0 Nov19 ? 00:00:00 /bin/sh ./c8y-agent.sh
root 1344 1342 0 Nov19 ? 00:25:39 java -cp cfg/*:lib/* -Dlogback.configurationFile=cfg/logback.xml c8y.lx.agent.Agent
Has anyone seen this problem before?
We had it once or twice when people were connecting to Cumulocity through a firewall or VPN. The result was exactly as you described: the polling gets stuck after some time, as if connections were blocked. In other words, I would suspect that a proxy is blocking the reconnect.

OpenShift Origin: Node not ready

I appear to have some problem with my installation of OpenShift Origin.
When I get endpoints for the router, I get the following:
oc get endpoints --namespace=default --selector=router
NAME ENDPOINTS AGE
router-west <none> 21m
Obviously the router should have at least one endpoint.
I'm trying to follow the troubleshooting guide at https://docs.openshift.com/enterprise/3.1/admin_guide/sdn_troubleshooting.html#debugging-the-router; however, it does not provide assistance for the situation where the router has no endpoints.
When I get my list of nodes, I get:
oc get nodes
NAME LABELS STATUS AGE
openshift.hughestech.space kubernetes.io/hostname=openshift.mydomain.com NotReady 38d
When I describe the node, I get the following:
oc describe node openshift.mydomain.com
Name: openshift.mydomain.com
Labels: kubernetes.io/hostname=openshift.mydomain.com
CreationTimestamp: Sat, 06 Feb 2016 21:44:23 +0100
Phase:
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
──── ────── ───────────────── ────────────────── ────── ───────
Ready Unknown Fri, 04 Mar 2016 18:50:39 +0100 Fri, 04 Mar 2016 18:51:21 +0100 NodeStatusUnknown Kubelet stopped posting node status.
Addresses: 88.198.37.183,88.198.37.183
Capacity:
memory: 24515560Ki
pods: 40
cpu: 8
System Info:
Machine ID: bafaea4f3c4c4cf6a632047c1d14db1a
System UUID: 00000000-0000-0000-0000-002421DDE3D7
Boot ID: f9febe14-ec61-41d5-b7c3-db2e42f9b452
Kernel Version: 3.10.0-327.4.5.el7.x86_64
OS Image: Red Hat Enterprise Linux
Container Runtime Version: docker://1.8.2-el7
Kubelet Version: v1.1.0-origin-1107-g4c8e6f4
Kube-Proxy Version: v1.1.0-origin-1107-g4c8e6f4
ExternalID: openshift.mydomain.com
Non-terminated Pods: (0 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
───────── ──── ──────────── ────────── ─────────────── ─────────────
Allocated resources:
(Total limits may be over 100%, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
CPU Requests CPU Limits Memory Requests Memory Limits
──────────── ────────── ─────────────── ─────────────
0 (0%) 0 (0%) 0 (0%) 0 (0%)
No events.
Where have I gone wrong? What do I need to do?
Thanks
Restart the node service and see if that makes a difference in oc get nodes output.
systemctl restart origin-node
Unless your node is running, you cannot have a running router pod, which is why there are no endpoints.
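After restarting the node service, the check suggested above can be scripted by scanning the text output of oc get nodes. This is a minimal sketch: `not_ready_nodes` is a hypothetical helper, and it assumes the four-column layout (NAME LABELS STATUS AGE) shown in the question, which may differ in other versions.

```python
def not_ready_nodes(oc_get_nodes_output):
    """Return the names of nodes whose STATUS is anything other than 'Ready'."""
    names = []
    lines = oc_get_nodes_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # not a node row
        # Assumed column order: NAME LABELS STATUS AGE
        name, status = fields[0], fields[-2]
        if status != "Ready":
            names.append(name)
    return names


# Output copied from the question
sample = """\
NAME LABELS STATUS AGE
openshift.hughestech.space kubernetes.io/hostname=openshift.mydomain.com NotReady 38d
"""

print(not_ready_nodes(sample))
```

An empty list after the restart would suggest the node has started posting status again, at which point the router pod can be scheduled.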

Apache 2.4.10 hangs AH00485: scoreboard is full, not at MaxRequestWorkers

The Apache server stays up for a random amount of time, usually days, but eventually enters a hung state. When it is hung, the CPU load on the machine gradually climbs and the server stops responding to new requests.
Error logs typically contain lots of these:
[Wed Jan 28 16:06:58.667188 2015] [mpm_event:error] [pid 25336:tid 1] AH00485: scoreboard is full, not at MaxRequestWorkers
Environment:
LDOM (VM) SunOS myhostname 5.10 Generic_118833-36 sun4v sparc SUNW,Sun-Fire-T200
http Conf:
StartServers 8
MinSpareServers Not set
MaxSpareServers Not set
ServerLimit 256
MaxRequestWorkers 100
MaxConnectionsPerChild 1000
KeepAlive On
TimeOut 3000
MaxKeepAliveRequests 50
KeepAliveTimeout 2
Current non-hung Score Board:
Server Version: Apache/2.4.10 (Unix)
Server MPM: event
Server Built: Oct 30 2014 16:29:03
Current Time: Wednesday, 28-Jan-2015 10:59:39 PST
Restart Time: Wednesday, 28-Jan-2015 09:49:21 PST
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 1 hour 10 minutes 17 seconds
Server load: 0.60 0.46 0.41
Total accesses: 1134 - Total Traffic: 2.2 GB
CPU Usage: u9.07 s16.94 cu609.51 cs69.31 - 16.7% CPU load
.269 requests/sec - 0.5 MB/second - 2.0 MB/request
1 requests currently being processed, 99 idle workers
PID    Connections        Threads      Async connections
       total  accepting   busy  idle   writing  keep-alive  closing
25337  0      yes         1     24     0        0           0
25338  1      yes         0     25     1        0           0
25339  1      yes         0     25     0        0           1
25340  1      yes         0     25     0        0           1
Sum    3                  1     99     1        0           2
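As a sanity check on the numbers above, the expected count of worker processes follows from MaxRequestWorkers divided by the threads per process. The sketch below is plain arithmetic rather than anything Apache provides: `worker_processes` is a hypothetical helper, and the value of 25 threads per child is an assumption inferred from the busy plus idle thread counts per PID in the status output.

```python
def worker_processes(max_request_workers, threads_per_child):
    """Processes needed to reach MaxRequestWorkers (ceiling division)."""
    return -(-max_request_workers // threads_per_child)


max_request_workers = 100  # from the configuration above
threads_per_child = 25     # assumption: busy + idle threads per PID in the status output

processes = worker_processes(max_request_workers, threads_per_child)
print(processes)
```

The result of 4 agrees with the four PIDs listed in the non-hung status output, and 4 x 25 threads accounts for the 1 busy plus 99 idle workers reported.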
Any thoughts on httpd configuration tuning, OS patches, or Apache bug fixes would be appreciated.
Yes I have seen the open ASF bugzilla for the same error message.
This is a production server, so you can imagine, having it go down at random times (usually when I am asleep) is not fun!