Troubleshooting: Password Injection Not Taking Effect for Ubuntu Instances on OpenStack

Problem Description

An instance was created from an Ubuntu cloud image, with the following script supplied at creation time to change the password:

#!/bin/bash
# Change the password of the ubuntu user non-interactively:
# passwd reads the new password and its confirmation from the here-document
passwd ubuntu<<EOF
ubuntu
ubuntu
EOF
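
A script like this is typically handed to the instance as user-data at creation time, and cloud-init runs it on first boot. A minimal sketch, with placeholder image, flavor, network, and file names:

# pass the password-setting script as user-data when creating the instance
openstack server create --image ubuntu-cloud --flavor m1.small \
    --network private --user-data set-password.sh ubuntu-test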

After the system booted, the password change had not taken effect.

Troubleshooting Approach

Check the instance boot log ➡️ check the status of the OpenStack neutron agents ➡️ check the metadata service log

step by step

  • Check the instance boot log
[  731.435163] cloud-init[845]: 2018-09-11 01:42:24,574 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [119/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fa26f1eb6d8>: Failed to establish a new connection: [Errno 101] Network is unreachable',))]
[ 738.460169] cloud-init[845]: 2018-09-11 01:42:31,597 - DataSourceEc2.py[CRITICAL]: Giving up on md from ['http://169.254.169.254/2009-04-04/meta-data/instance-id'] after 126 seconds
[ 738.470572] cloud-init[845]: 2018-09-11 01:42:31,608 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceCloudStack.DataSourceCloudStack'> failed

The log above shows that cloud-init failed to fetch meta-data from the metadata server at 169.254.169.254.
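
This can be reproduced by hand from inside the guest; the request goes to the link-local metadata address and should return the instance id:

# from inside the instance: ask the metadata service for the instance id
curl -v http://169.254.169.254/2009-04-04/meta-data/instance-id
# confirm that a route to 169.254.169.254 exists at all
ip route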

  • Check the neutron agent status
root@pxea4badb217413:~# neutron agent-list
+--------------------------------------+--------------------+-----------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host            | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+-----------------+-------------------+-------+----------------+---------------------------+
| 459740a7-b60c-4db7-9ee0-3b45d6c3b2de | Linux bridge agent | pxe1418773526e5 |                   | :-)   | True           | neutron-linuxbridge-agent |
| 4ec74c6a-c392-4216-bddd-789ec5aa3d86 | DHCP agent         | pxea4badb217413 | nova              | :-)   | True           | neutron-dhcp-agent        |
| 6d992669-fbe6-412c-a717-5c2b61a2b901 | Metadata agent     | pxea4badb217413 |                   | :-)   | True           | neutron-metadata-agent    |
| 77dc04a5-0e43-4907-be3a-6f165e77807c | Linux bridge agent | pxe74867aee16bc |                   | :-)   | True           | neutron-linuxbridge-agent |
| c54c7593-5973-41ed-86c8-7e22ac1e95ec | Linux bridge agent | pxea4badb217413 |                   | :-)   | True           | neutron-linuxbridge-agent |
| e882e27d-45de-4cb3-9284-7e9101e2b39b | L3 agent           | pxea4badb217413 | nova              | :-)   | True           | neutron-l3-agent          |
+--------------------------------------+--------------------+-----------------+-------------------+-------+----------------+---------------------------+
root@pxea4badb217413:~#
  • Check the neutron-metadata agent log
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent [-] Failed reporting state!
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent Traceback (most recent call last):
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 266, in _report_state
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent use_call=self.agent_state.get('start_flag'))
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/neutron/agent/rpc.py", line 87, in report_state
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent return method(context, 'report_state', **kwargs)
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent retry=self.retry)
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent timeout=timeout, retry=retry)
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 470, in send
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent retry=retry)
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 459, in _send
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent result = self._waiter.wait(msg_id, timeout)
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 342, in wait
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent message = self.waiters.get(msg_id, timeout=timeout)
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 244, in get
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent 'to message ID %s' % msg_id)
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent MessagingTimeout: Timed out waiting for a reply to message ID 5b145001c0a34d37a68337821e64908f
2018-09-05 19:07:06.404 10617 ERROR neutron.agent.metadata.agent
2018-09-05 19:07:06.405 10617 WARNING oslo.service.loopingcall [-] Function 'neutron.agent.metadata.agent.UnixDomainMetadataProxy._report_state' run outlasted interval by 30.00 s

The same log also shows repeated failures to reach the AMQP server:

2018-09-06 16:23:48.658 1238 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2018-09-06 16:23:49.752 1238 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 2 seconds.
2018-09-06 16:23:51.766 1238 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 4 seconds.
2018-09-06 16:23:55.779 1238 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 6 seconds.

The above shows that the AMQP server (RabbitMQ) on controller:5672 cannot be reached.
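
A quick way to confirm this from the node running the metadata agent is to test the TCP connection to the broker; the host name here should match whatever the agent's RabbitMQ settings point at:

# can we open a TCP connection to the AMQP port on the controller?
nc -zv controller 5672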

  • Check the metadata agent configuration
 /etc/neutron/metadata_agent.ini

# IP address used by Nova metadata server. (string value)
#nova_metadata_ip = 127.0.0.1
nova_metadata_ip = controller
# TCP Port used by Nova metadata server. (port value)
# Minimum value: 0
# Maximum value: 65535
#nova_metadata_port = 8775
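
Since nova_metadata_ip points at the host name controller, it is also worth confirming that the name resolves and that the nova metadata API (port 8775 by default) answers from this node; a quick sanity check:

# does the name resolve, and does the nova metadata API respond?
getent hosts controller
curl -s -o /dev/null -w '%{http_code}\n' http://controller:8775/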
  • Check which ports are listening on the controller
root@pxea4badb217413:~# netstat -lnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1149/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2316/exim4
tcp        0      0 127.0.0.1:6010          0.0.0.0:*               LISTEN      18284/12
tcp        0      0 127.0.0.1:6011          0.0.0.0:*               LISTEN      18861/20
tcp        0      0 0.0.0.0:9696            0.0.0.0:*               LISTEN      1243/python
tcp        0      0 0.0.0.0:6080            0.0.0.0:*               LISTEN      1247/python
tcp        0      0 0.0.0.0:8774            0.0.0.0:*               LISTEN      1237/python
tcp        0      0 0.0.0.0:8775            0.0.0.0:*               LISTEN      1237/python
tcp        0      0 0.0.0.0:9191            0.0.0.0:*               LISTEN      1242/python
tcp        0      0 0.0.0.0:25672           0.0.0.0:*               LISTEN      2922/beam.smp
tcp        0      0 0.0.0.0:8776            0.0.0.0:*               LISTEN      1235/python
tcp        0      0 10.160.17.196:3306      0.0.0.0:*               LISTEN      1769/mysqld
tcp        0      0 10.160.17.196:11211     0.0.0.0:*               LISTEN      2443/memcached
tcp        0      0 0.0.0.0:9292            0.0.0.0:*               LISTEN      1246/python
tcp6       0      0 :::21                   :::*                    LISTEN      833/vsftpd
tcp6       0      0 :::22                   :::*                    LISTEN      1149/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      2316/exim4
tcp6       0      0 ::1:6010                :::*                    LISTEN      18284/12
tcp6       0      0 ::1:6011                :::*                    LISTEN      18861/20
tcp6       0      0 :::35357                :::*                    LISTEN      3904/apache2
tcp6       0      0 :::5000                 :::*                    LISTEN      3904/apache2
tcp6       0      0 :::80                   :::*                    LISTEN      3904/apache2
tcp6       0      0 :::8081                 :::*                    LISTEN      4749/cma
tcp6       0      0 :::4369                 :::*                    LISTEN      2586/epmd
udp        0      0 0.0.0.0:123             0.0.0.0:*                           3218/chronyd
udp        0      0 0.0.0.0:161             0.0.0.0:*                           3189/snmpd
udp        0      0 0.0.0.0:323             0.0.0.0:*                           3218/chronyd
udp6       0      0 :::123                  :::*                                3218/chronyd
udp6       0      0 :::323                  :::*                                3218/chronyd

The output above shows that nothing is listening on port 5672, the RabbitMQ (AMQP) port.
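
To check just this port (and to re-check after any fix), the listener query can be narrowed down:

# show only sockets listening on the AMQP port
netstat -lnp | grep 5672
# or equivalently
ss -lntp | grep 5672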

  • Try starting rabbitmq-server
service rabbitmq-server start
* Starting RabbitMQ Messaging Server rabbitmq-server * RabbitMQ Messaging Server already running
[ OK ]
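
Oddly, the init script reports the broker as already running even though nothing was listening on 5672, so it is worth asking RabbitMQ itself; its status output includes the active listeners, and the broker can be restarted if the AMQP listener is missing:

# show node status, including the listeners section
rabbitmqctl status
# restart the broker if the amqp listener on 5672 is absent
service rabbitmq-server restart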
  • Check the listening ports again
tcp6       0      0 :::5672                 :::*                    LISTEN      2922/beam.smp
  • After restarting the metadata agent service, its log no longer reports errors
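
For reference, restarting the agent and tailing its log looks like this (the log path below is the usual Ubuntu default and may differ):

# restart the metadata agent and watch for further RPC errors
service neutron-metadata-agent restart
tail -f /var/log/neutron/neutron-metadata-agent.log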

However, this turned out not to be the root cause; the problem was eventually traced to a network issue.