Bug #4682
cloud-config crashes OpenNebula
| Status: | Closed | Start date: | 07/25/2016 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 0% |
| Category: | - | | |
| Target version: | - | | |
| Resolution: | worksforme | Pull request: | |
| Affected Versions: | OpenNebula 5.0 | | |
Description
Environment:
root@tetra-oned:~# uname -a
Linux tetra-oned 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2+deb8u3 (2016-07-02) x86_64 GNU/Linux
root@tetra-oned:~# oned -v
Copyright 2002-2016, OpenNebula Project, OpenNebula Systems
OpenNebula 5.0.0 is distributed and licensed for use under the terms of the Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0).
Steps to reproduce:
1. Build an actually usable CoreOS image for OpenNebula (https://github.com/zllovesuki/coreos-opennebula-image)
2. Use this template to provision:
CONTEXT = [
  USER_DATA = "$USER_DATA",
  NETWORK = "YES",
  SET_HOSTNAME = "$NAME",
  SSH_PUBLIC_KEY = "$USER[SSH_PUBLIC_KEY]" ]
CPU = "2"
DISK = [
  IMAGE = "CoreOS Stable 1068.8.0" ]
FEATURES = [
  ACPI = "yes",
  APIC = "yes",
  HYPERV = "yes",
  PAE = "yes" ]
GRAPHICS = [
  LISTEN = "0.0.0.0",
  TYPE = "VNC" ]
HYPERVISOR = "kvm"
MEMORY = "8192"
NIC = [
  NETWORK = "NAT" ]
NIC = [
  NETWORK = "External" ]
NIC_DEFAULT = [
  MODEL = "virtio" ]
OS = [
  ARCH = "x86_64" ]
USER_INPUTS = [
  USER_DATA = "M|text|cloud-config" ]
VCPU = "4"
Problems:
Attempting to provision a VM with the cloud-config below crashes OpenNebula. The kernel log shows:
Jul 25 05:27:12 tetra-oned kernel: [63985.419425] oned[5217]: segfault at 20 ip 0000000000421f8e sp 00007f23ffffc430 error 4 in oned[400000+2a4000]
#cloud-config
coreos:
  units:
    - name: etcd2.service
      runtime: true
      drop-ins:
        - name: 10-oem.conf
          content: |
            [Service]
            Environment=ETCD_ELECTION_TIMEOUT=1200
    - name: reload.service
      command: start
      content: |
        [Unit]
        Description=reload systemd
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/systemctl daemon-reload
    - name: start.service
      command: start
      content: |
        [Unit]
        Description=start etcd2
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/systemctl start etcd2
  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new?size=3
    discovery: "https://discovery.etcd.io/aeaa1d21e3ea14bca24071f8a7a029f4"
    advertise-client-urls: "http://$NIC[IP, NETWORK=\"External\"]:2379"
    initial-advertise-peer-urls: "http://$NIC[IP, NETWORK=\"NAT\"]:2380"
    listen-client-urls: "http://0.0.0.0:2379,http://0.0.0.0:4001"
    listen-peer-urls: "http://$NIC[IP, NETWORK=\"NAT\"]:2380,http://$NIC[IP, NETWORK=\"NAT\"]:7001"
I suspect that OpenNebula tries to substitute the $NIC variables when creating the context ISO but fails for some reason.
History
#1
Updated by Ruben S. Montero almost 5 years ago
Hi Jerry
I cannot reproduce this in 5.0.2. Could you try with that version? That said, I don't think anything has changed in context generation since 5.0.0.
This is what I'm getting:
onevm show 2
VIRTUAL MACHINE 2 INFORMATION
ID : 2
NAME : test-2
USER : ruben
GROUP : oneadmin
STATE : PENDING
LCM_STATE : LCM_INIT
RESCHED : No
START TIME : 08/04 10:52:24
END TIME : -
DEPLOY ID : -
VIRTUAL MACHINE MONITORING
PERMISSIONS
OWNER : um-
GROUP : ---
OTHER : ---
VM DISKS
ID DATASTORE TARGET IMAGE SIZE TYPE SAVE
0 - hda CONTEXT -/- - -
VM NICS
ID NETWORK BRIDGE IP MAC PCI_ID
0 NAT br1 172.16.0.10 02:00:ac:10:00:0a
1 External br0 10.0.0.1 02:00:0a:00:00:01
SECURITY
NIC_ID NETWORK SECURITY_GROUPS
0 NAT 0
1 External 0
SECURITY GROUP TYPE PROTOCOL NETWORK RANGE
ID NAME VNET START SIZE
0 default OUTBOUND ALL
0 default INBOUND ALL
USER TEMPLATE
HYPERVISOR="kvm"
SCHED_MESSAGE="Thu Aug 4 10:52:27 2016 : No hosts enabled to run VMs"
USER_DATA="#cloud-config
coreos:
  units:
    - name: etcd2.service
      runtime: true
      drop-ins:
        - name: 10-oem.conf
          content: |
            [Service]
            Environment=ETCD_ELECTION_TIMEOUT=1200
    - name: reload.service
      command: start
      content: |
        [Unit]
        Description=reload systemd
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/systemctl daemon-reload
    - name: start.service
      command: start
      content: |
        [Unit]
        Description=start etcd2
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/systemctl start etcd2
  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new?size=3
    discovery: \"https://discovery.etcd.io/aeaa1d21e3ea14bca24071f8a7a029f4\"
    advertise-client-urls: \"http://$NIC[IP, NETWORK=\"External\"]:2379\"
    initial-advertise-peer-urls: \"http://$NIC[IP, NETWORK=\"NAT\"]:2380\"
    listen-client-urls: \"http://0.0.0.0:2379,http://0.0.0.0:4001\"
    listen-peer-urls: \"http://$NIC[IP, NETWORK=\"NAT\"]:2380,http://$NIC[IP, NETWORK=\"NAT\"]:7001\" "
USER_INPUTS=[
USER_DATA="M|text|cloud-config" ]
VIRTUAL MACHINE TEMPLATE
AUTOMATIC_DS_REQUIREMENTS="\"CLUSTERS/ID\" @> 0"
AUTOMATIC_REQUIREMENTS="(CLUSTER_ID = 0) & !(PUBLIC_CLOUD = YES)"
CONTEXT=[
DISK_ID="0",
ETH0_CONTEXT_FORCE_IPV4="",
ETH0_DNS="",
ETH0_GATEWAY="",
ETH0_GATEWAY6="",
ETH0_IP="172.16.0.10",
ETH0_IP6="",
ETH0_IP6_ULA="",
ETH0_MAC="02:00:ac:10:00:0a",
ETH0_MASK="",
ETH0_MTU="",
ETH0_NETWORK="",
ETH0_SEARCH_DOMAIN="",
ETH0_VLAN_ID="",
ETH0_VROUTER_IP="",
ETH0_VROUTER_IP6="",
ETH0_VROUTER_MANAGEMENT="",
ETH1_CONTEXT_FORCE_IPV4="",
ETH1_DNS="",
ETH1_GATEWAY="",
ETH1_GATEWAY6="",
ETH1_IP="10.0.0.1",
ETH1_IP6="",
ETH1_IP6_ULA="",
ETH1_MAC="02:00:0a:00:00:01",
ETH1_MASK="",
ETH1_MTU="",
ETH1_NETWORK="",
ETH1_SEARCH_DOMAIN="",
ETH1_VLAN_ID="",
ETH1_VROUTER_IP="",
ETH1_VROUTER_IP6="",
ETH1_VROUTER_MANAGEMENT="",
NETWORK="YES",
SET_HOSTNAME="test-2",
SSH_PUBLIC_KEY="",
TARGET="hda",
USER_DATA="#cloud-config
coreos:
  units:
    - name: etcd2.service
      runtime: true
      drop-ins:
        - name: 10-oem.conf
          content: |
            [Service]
            Environment=ETCD_ELECTION_TIMEOUT=1200
    - name: reload.service
      command: start
      content: |
        [Unit]
        Description=reload systemd
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/systemctl daemon-reload
    - name: start.service
      command: start
      content: |
        [Unit]
        Description=start etcd2
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/systemctl start etcd2
  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new?size=3
    discovery: \"https://discovery.etcd.io/aeaa1d21e3ea14bca24071f8a7a029f4\"
    advertise-client-urls: \"http://$NIC[IP, NETWORK=\"External\"]:2379\"
    initial-advertise-peer-urls: \"http://$NIC[IP, NETWORK=\"NAT\"]:2380\"
    listen-client-urls: \"http://0.0.0.0:2379,http://0.0.0.0:4001\"
    listen-peer-urls: \"http://$NIC[IP, NETWORK=\"NAT\"]:2380,http://$NIC[IP, NETWORK=\"NAT\"]:7001\" " ]
CPU="2"
FEATURES=[
ACPI="yes",
APIC="yes",
HYPERV="yes",
PAE="yes" ]
GRAPHICS=[
LISTEN="0.0.0.0",
TYPE="VNC" ]
MEMORY="8192"
OS=[
ARCH="x86_64" ]
TEMPLATE_ID="0"
VCPU="4"
VMID="2"
Note that there is no double substitution: once USER_DATA=$USER_DATA is resolved, the result is not resolved again, so the $NIC[...] references inside the cloud-config stay literal.
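The single-pass behaviour described here can be illustrated with a small Python sketch. This is a toy model, not OpenNebula's template parser: `resolve_once`, `VAR`, and the sample values are hypothetical, and it only handles plain `$NAME` references. The point is just that text inserted by a substitution is not scanned for variables again.

```python
import re

# Hypothetical, simplified variable syntax: $NAME only.
VAR = re.compile(r"\$([A-Z_][A-Z0-9_]*)")


def resolve_once(text, values):
    """Replace each $NAME found in `text` exactly once.

    re.sub does not rescan the replacement text, so any $VAR that
    appears inside a substituted value is left as-is.
    Unknown names are left untouched.
    """
    return VAR.sub(lambda m: values.get(m.group(1), m.group(0)), text)


# Sample values; USER_DATA itself contains variable-looking text.
values = {
    "NAME": "test-2",
    "USER_DATA": r'hostname: $NAME' "\n"
                 r'peer: "http://$NIC[IP, NETWORK=\"NAT\"]:2380"',
}

# Context attributes as written in the VM template.
context = {
    "SET_HOSTNAME": "$NAME",
    "USER_DATA": "$USER_DATA",
}

resolved = {key: resolve_once(val, values) for key, val in context.items()}

if __name__ == "__main__":
    for key, val in resolved.items():
        print(f"{key} = {val!r}")
```

In this sketch `resolved["USER_DATA"]` is the user-supplied text verbatim, including the literal `$NAME` and `$NIC[...]`, which matches the CONTEXT section of the `onevm show` output above. Only a second, explicit call to `resolve_once` on that result would touch the inner `$NAME`.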
#2
Updated by Ruben S. Montero almost 5 years ago
Maybe you can get the backtrace from gdb, using coredumpctl or similar?
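Until a backtrace is available, the kernel log line from the description can already be decoded. The following is a minimal Python sketch and not part of OpenNebula: `decode_segfault` and its field names are made up for illustration. It assumes the usual `segfault at ... ip ... sp ... error ... in obj[base+size]` message layout with hexadecimal fields, and it interprets the error value using the standard x86 page-fault error code bits.

```python
import re

# Kernel log line copied from the report (timestamp prefix dropped).
LOG_LINE = (
    "oned[5217]: segfault at 20 ip 0000000000421f8e "
    "sp 00007f23ffffc430 error 4 in oned[400000+2a4000]"
)

# All numeric fields in this message are treated as hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Extract the faulting address, code offset and fault type."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("line does not look like a kernel segfault message")

    addr = int(match.group("addr"), 16)
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    err = int(match.group("err"), 16)

    return {
        "object": match.group("obj"),
        "fault_address": addr,
        # Offset of the faulting instruction inside the mapped object.
        "offset_in_object": ip - base,
        # x86 page-fault error code: bit 0 = protection violation,
        # bit 1 = write access, bit 2 = fault raised in user mode.
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


if __name__ == "__main__":
    for field, value in decode_segfault(LOG_LINE).items():
        print(field, value if isinstance(value, str) else hex(value))
```

For the line in this report, the sketch reports a user-mode read of an unmapped page at address 0x20, from an instruction at offset 0x21f8e into the oned mapping. A fault address that small is consistent with reading a field through a null pointer, though only a real backtrace from a build with debug symbols would show where.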
#3
Updated by Rachel Chen almost 5 years ago
Interesting. I could not reproduce the same result on my home lab OpenNebula. Maybe it is an isolated incident where there is a bug somewhere in my system. I will close this instead.
#4
Updated by Rachel Chen almost 5 years ago
Well, could you please close this?
#5
Updated by Ruben S. Montero almost 5 years ago
- Status changed from Pending to Closed
- Resolution set to worksforme
Thanks!!!