2. OpenStack Error Summary

Error: Horizon Access - Unable to create a new session key. It is likely that the cache is unavailable

After installing OpenStack with DevStack and rebooting the machine, the OpenStack web console could no longer be logged into; the error said a session key could not be created.

Solution: memcached was not running. Start it and enable it at boot:

systemctl enable memcached
systemctl start memcached
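
To confirm memcached is actually reachable afterwards (a quick check, assuming the default port 11211 and that nc is available):

# verify memcached is listening and responding
ss -tlnp | grep 11211
echo stats | nc -w1 127.0.0.1 11211 | head -5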

Reference: https://stackoverflow.com/questions/63452218/horizon-access-unable-to-create-a-new-session-key-it-is-likely-that-the-cache

Error: After installing OpenStack with DevStack, the local machine can open the OpenStack web console (http://xxxxxxx/dashboard), but other hosts cannot reach it from their browsers

Solution: 1. First checked the httpd service: it was running. Restarting httpd did not help, and the httpd logs showed no traffic from the remote host.

2. Started Python's built-in simple HTTP server and found that it could not be reached either, even on a non-80 port (a quick remote probe is shown after the snippet).

# start Python's built-in simple HTTP server (listens on port 8000 by default)
python -m http.server
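
To rule out the HTTP service itself, the port can also be probed from the remote host with curl (assuming port 8000 and the Linux host's IP 192.168.123.150 from below):

# run on the Windows host (or any remote machine)
curl -v http://192.168.123.150:8000/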

3. Captured packets with tcpdump on the Linux host running OpenStack and with Wireshark on the Windows browser host. The Linux host did receive the TCP requests from Windows, while the Windows capture showed SYN packets going out with no ACK coming back, so the packets were likely being dropped on the Linux host.

# run on the Linux host
sudo tcpdump src 192.168.123.160 # .160 is the Windows host's IP
# on the Windows host
ip.addr == 192.168.123.150 # Wireshark display filter
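
A slightly more targeted capture that also makes the missing SYN-ACK easy to spot (a sketch, assuming the web port is 80):

# show only port-80 traffic between the two hosts, with TCP flags visible
sudo tcpdump -nn 'tcp port 80 and host 192.168.123.160'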

4. firewalld on the Linux host was stopped, but iptables showed the INPUT chain only accepting SSH-related TCP ports:

sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

5. Open TCP port 80 in iptables for all source IPs:

iptables -I INPUT -p tcp --dport 80 -j ACCEPT
service iptables save
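
Note that service iptables save relies on the iptables-services package on RHEL-family systems; if it is not available, the rules can be persisted manually instead (assuming the stock rules file path):

# equivalent manual persistence
iptables-save > /etc/sysconfig/iptables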

6. With the port open, the remote Windows host could log into the OpenStack web console.

References:

# iptables commands

https://zhuanlan.zhihu.com/p/370613338

# packet-capture analysis

https://blog.51cto.com/fengyuzaitu/2669066

# TCP protocol

https://www.xiaolincoding.com/network/3_tcp/tcp_interview.html#tcp-%E4%B8%89%E6%AC%A1%E6%8F%A1%E6%89%8B%E8%BF%87%E7%A8%8B%E6%98%AF%E6%80%8E%E6%A0%B7%E7%9A%84

Error: The OpenStack console won't open; the console page shows a connection timeout

Solution: iptables was not letting the traffic through; open the port as follows:

# allow all TCP connections to port 6080
iptables -I INPUT -p tcp --dport 6080 -j ACCEPT
service iptables save
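
Before blaming the firewall, it is worth double-checking that the noVNC proxy is actually listening (a quick check, using the port 6080 from above):

# confirm something is bound to 6080
ss -tlnp | grep 6080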

Most solutions found online change the [vnc] section of /etc/nova/nova.conf, but a DevStack deployment's configuration differs somewhat from a manually installed one, and those changes did not fix this problem.

Also, the DevStack-installed OpenStack did not add any iptables rules automatically, so every network port had to be opened by hand. I'm not sure whether this is a DevStack bug or a mistake in my installation.

Error: ERROR nova.virt.driver [req-4c7902c7-9b0d-4776-a451-ff97e63acd23 - - - - -] Compute driver option required, but not specified

Solution: Add the following to the config file /etc/nova/nova.conf:

compute_driver=libvirt.LibvirtDriver
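
For context, compute_driver is a [DEFAULT]-section option, so the file should end up looking like this:

[DEFAULT]
compute_driver = libvirt.LibvirtDriver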

Reference: https://bugs.launchpad.net/nova/+bug/1139684

If, after adding the above, the following error appears:

FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/python3.6/site-packages/instances'

add the following to the [DEFAULT] section of /etc/nova/nova.conf:

instances_path=/var/lib/nova/instances
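
If that directory does not exist yet, it may also need to be created and handed to the service account (an assumption: the service runs as the nova user):

# create the instances directory and give nova ownership
mkdir -p /var/lib/nova/instances
chown nova:nova /var/lib/nova/instances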

References: https://bugs.launchpad.net/nova/+bug/1955633

https://bugs.launchpad.net/nova/+bug/1163112

https://www.edureka.co/community/65463/openstack-threadgroup-directory-python2-packages-instances

Error: Could not add the parameter --listen to open tcp socket

Solution: Comment out the --listen parameter in /etc/libvirt/libvirtd.conf.

Reference: https://stackoverflow.com/questions/65663825/could-not-add-the-parameter-listen-to-open-tcp-socket

Error: AH01630: client denied by server configuration: /usr/share/openstack-dashboard/openstack_dashboard/wsgi

Solution:

  1. In /etc/httpd/conf.d/openstack-dashboard.conf, change "WSGIScriptAlias /dashboard /usr/share/openstack-dashboard/openstack_dashboard/wsgi/django.wsgi" to "WSGIScriptAlias /dashboard /usr/share/openstack-dashboard/openstack_dashboard/wsgi.py", because the server has no django.wsgi, only wsgi.py.
  2. In /etc/httpd/conf.d/openstack-dashboard.conf, change "<Directory /usr/share/openstack-dashboard/openstack_dashboard/wsgi>" to "<Directory /usr/share/openstack-dashboard/openstack_dashboard>", for the same reason as in 1.
  3. In /etc/openstack-dashboard/local_settings, add "WEBROOT = '/dashboard/'".
  4. In /etc/openstack-dashboard/local_settings, change "http://%s/identity/v3" to "http://%s:5000/identity/v3" (adjust to match your own endpoint).

Reference: https://bugs.launchpad.net/horizon/+bug/1947295

Error: [ERROR] WSREP: rsync SST method requires wsrep_cluster_address to be configured on startup.

Background: While installing OpenStack with Packstack on Rocky Linux 8.8, the run failed to start mariadb; the above error was found in MariaDB's logs.

Solution: Don't let Packstack install MariaDB. Install it manually, and set CONFIG_MARIADB_INSTALL=n in the answer file (see the sketch after the snippet):

# remove the already-installed mariadb
yum remove mariadb
# add the official repository
curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash
# install and start
dnf install MariaDB-server
systemctl start mariadb
# log in as root and create an admin user
mariadb -uroot
# in the MariaDB shell:
CREATE USER sjzadmin@'%' IDENTIFIED BY 'P_sjz123';
GRANT ALL PRIVILEGES ON *.* TO 'sjzadmin'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
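
For the answer-file side, the flag can be flipped with sed (a sketch; answer.txt stands in for whatever answer file was generated):

# tell packstack to skip MariaDB installation
sed -i 's/CONFIG_MARIADB_INSTALL=y/CONFIG_MARIADB_INSTALL=n/' answer.txt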

Reference: https://www.xjx100.cn/news/2320945.html

Error: ERROR 1396 (HY000): Operation CREATE USER failed for 'keystone'@'%'

Background: After removing the user with delete from mysql.user where user="keystone", it still could not be re-created; most likely FLUSH PRIVILEGES had been forgotten.

Solution:

# drop the user first
DROP USER keystone@localhost;
DROP USER keystone@'%';
FLUSH PRIVILEGES;
# then re-create it
CREATE USER 'keystone'@'localhost' IDENTIFIED BY 'xxxxxxxxxx';
CREATE USER 'keystone'@'%' IDENTIFIED BY 'xxxxxxxxxx';
FLUSH PRIVILEGES;
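
Once the user exists again, the usual grants from the installation guide follow (assuming the database is named keystone):

GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'localhost';
GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%';
FLUSH PRIVILEGES;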

Reference: mysql - ERROR 1396 (HY000): Operation CREATE USER failed for 'jack'@'localhost' - Stack Overflow

Error: neutron-linuxbridge-agent.service won't start: neutron-linuxbridge-agent.service: Start request repeated too quickly.

Solution: Checking the log with tail -40 /var/log/neutron/linuxbridge-agent.log revealed the following error:

#2023-10-12 11:11:15.906 41050 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Tunneling cannot be enabled without the local_ip bound to an interface on the host. Please configure local_ip 192.168.122.150 on the host interface to be used for tunneling and restart the agent.

The local_ip above (192.168.122.150) was misconfigured; it should be the controller's IP, 192.168.123.150. After correcting it, the service started successfully.
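
For reference, local_ip lives in the [vxlan] section of the agent config (a sketch, assuming the stock /etc/neutron/plugins/ml2/linuxbridge_agent.ini layout):

[vxlan]
enable_vxlan = true
local_ip = 192.168.123.150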

Error: memcached[44210]: failed to listen on TCP port 11211: Address already in use

Background: On a single-node OpenStack, after installing memcached, the listen address was changed with the command below, which triggered the error:

cd /etc/sysconfig/ && cp /etc/sysconfig/memcached /etc/sysconfig/memcached.source && sed -i 's/OPTIONS="-l 127.0.0.1,::1"/OPTIONS="-l 127.0.0.1,::1,CA-S2101"/g' /etc/sysconfig/memcached

Solution: Remove CA-S2101. On a single node, the hostname CA-S2101 resolves to the same address as localhost/127.0.0.1, so memcached was asked to bind the same address twice and failed:

memcached was able to successfully bind to the first address, but because 127.0.0.1 duplicates localhost, memcached gave the Address already in use error when it tried to bind to the second address.
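
The duplicate resolution can be confirmed with getent (using the hostname CA-S2101 from the example):

# if this prints 127.0.0.1, the hostname duplicates the loopback entry
getent hosts CA-S2101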

Reference: Memcached problems "failed to listen on TCP port 11211" - Server Fault

Error: 40887 ERROR nova.api.openstack.wsgi keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct.

Solution: The auth address was incorrect; change the auth settings in nova.conf to the following:

[service_user]
auth_url = https://192.168.123.150:5000/identity
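
A quick sanity check is to request the URL directly; a healthy Keystone returns a version document (assuming the host and port above; the path may be /identity or /v3 depending on how Keystone is fronted):

# -k skips certificate verification for self-signed setups
curl -sk https://192.168.123.150:5000/identity | python3 -m json.tool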

I'm not sure whether the official OpenStack documentation is wrong here (its keystone address doesn't match the other settings) or whether the docs simply don't match my own later customizations.

Error: nova.exception.ResourceProviderRetrievalFailed: Failed to get resource provider with UUID

Solution 1: Re-checking the configuration revealed an operator error: the endpoint port had been set to 9229 instead of 8778. Re-create the endpoints:

openstack endpoint create --region RegionOne placement public http://192.168.123.150:8778
openstack endpoint create --region RegionOne placement internal http://192.168.123.150:8778
openstack endpoint create --region RegionOne placement admin http://192.168.123.150:8778
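
If the mis-set endpoints are still registered, they also need to be removed; listing them shows the IDs to delete (a sketch; <endpoint-id> is a placeholder):

# find and remove the wrong placement endpoints
openstack endpoint list --service placement
openstack endpoint delete <endpoint-id>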

Tip:

When the endpoint port is wrong, the service never receives the requests; the symptom is that placement-api.log produces no output at all.

Solution 2: Missing the following block in /etc/httpd/conf.d/00-placement-api.conf may also cause this error, though I can't say for certain:

# add below the line ErrorLog /var/log/placement/placement-api.log:
<Directory /usr/bin>
<IfVersion >= 2.4>
Require all granted
</IfVersion>
<IfVersion < 2.4>
Order allow,deny
Allow from all
</IfVersion>
</Directory>

Solution 3: placement.conf itself may be at fault; double-check that file.

Solution 4: In my case nova.conf or placement.conf did contain a mistake; after correcting it, the problem was solved.

Error: start lvm2etad, unit file lvm2-lvmetad.service not found

Background: This error appeared while installing OpenStack on Rocky Linux 8.

Solution: It can be ignored as long as the cinder services come up; lvm2-lvmetad.service was dropped in Rocky Linux 8. The reference links below have the details.

References: centos 8 start lvm2etad, unit file lvm2-lvmetad.service not found - CentOS

Support Policies for RHEL High Availability Clusters - LVM in a Cluster - Red Hat Customer Portal

Error: pvcreate: Can't use /dev/sda: device is partitioned

Solution 1: Wipe the partition table with parted:

# with parted (interactive session):
parted /dev/sda
(parted) mktable msdos
(parted) quit

# after which you should be able to create your physical volume:
pvcreate /dev/sda

Solution 2: Simply pass the -ff option:

pvcreate -ff /dev/sda

Reference: partition - pvcreate: Can't use /dev/sda: device is partitioned - Unix & Linux Stack Exchange

Error: Failed to delete volume with name or ID '298de56d-a83f-4528-aa72-e231b18e5df1': Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating. (A stuck "zombie" cinder volume that cannot be deleted.)

Solution:

# list volumes
openstack volume list
# a direct delete fails
openstack volume delete 298de56d-a83f-4528-aa72-e231b18e5df1
# force the volume status to available
cinder reset-state 298de56d-a83f-4528-aa72-e231b18e5df1 --state available
# then try deleting again
# some stuck volumes can be removed this way; for the ones that still refuse, fall back to editing the database
# delete via the database
mysql -uroot -p
use cinder;
select * from volumes where id = '<volume ID>';
update volumes set deleted=1, status='deleted' where id='<volume ID>';
# afterwards lsblk no longer shows the block device; I'm not sure whether this causes problems later
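
If a delete fails because the volume still looks attached, the cinder CLI can also reset the attach status; this is worth trying before resorting to database surgery:

# clear a stale attachment flag, then retry the delete
cinder reset-state --attach-status detached 298de56d-a83f-4528-aa72-e231b18e5df1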

References:

# this approach is simple

https://blog.51cto.com/u_15057823/3435672

# another approach, somewhat more involved; I gave up halfway through, so I can't confirm whether it works:

https://blog.csdn.net/qq_37242520/article/details/106421759

Error: openstack-nova-scheduler.service: Start request repeated too quickly.

Background: After a server reboot, openstack-nova-scheduler.service and openstack-nova-conductor.service both failed to auto-start, each with the error above.

Solution: Add a restart delay (RestartSec) to the unit files; after that, the services started normally on reboot.

vim openstack-nova-scheduler.service
# in the unit file:
[Service]
Type=notify
NotifyAccess=all
TimeoutStartSec=0
Restart=always
RestartSec=1s # wait 1s before restarting after a failure

# reload systemd units
systemctl daemon-reload
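
Rather than editing the packaged unit file in place, a drop-in override survives package updates (an alternative sketch):

# opens an editor for a drop-in; add the [Service] lines above, then save
systemctl edit openstack-nova-scheduler.service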

References: https://stackoverflow.com/questions/35452591/start-request-repeated-too-quickly

https://learnku.com/articles/72684

Error: Could not load 's3': No module named 'boto3': ModuleNotFoundError: No module named 'boto3'

Solution: Install boto3 manually:

dnf install python3-boto3
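
A quick check that the module is now importable (assuming the system python3):

python3 -c 'import boto3; print(boto3.__version__)'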

Error: Failed to run task cinder.scheduler.flows.create_volume.ScheduleCreateVolumeTask;volume:create: No valid backend was found

Background: After a server reboot, one of the cinder volume service backends stayed down, the HDD device letter the backend was bound to had changed, and the block devices were gone from lsblk. The cinder configuration had been verified before the reboot, and volume creation had worked normally.

Solution: The cause was a portable hard drive that had been temporarily plugged into the server; device letters shifted after the reboot, and can shift again on every reboot.

Unplug the portable drive, then point the device filter in /etc/lvm/lvm.conf at the right devices:

devices {
filter = [ "a/dev/nvme0n1/", "a/dev/nvme1n1/", "a/dev/sdb/", "r/.*/" ] # list the appropriate devices
}
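
Afterwards, the physical and volume groups should be visible again (a quick check):

# confirm the PVs and VGs have reappeared
sudo pvs
sudo vgs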

Error: Uploading an image through the OpenStack web UI fails with 413 Request Entity Too Large

Solution: Edit the Apache configuration file (usually httpd.conf or apache2.conf), find the "LimitRequestBody" (or "LimitRequestLine") directive, and raise its value enough to hold the request body, adding the directive if it is not there. Save the file and restart Apache.

<Directory />
AllowOverride none
Require all denied
LimitRequestBody 10485760000 # raise to ~10 GB
</Directory>

The image-upload API appears to be http://xxxxxxxx/dashboard/api/glance/images/, so it might be possible to set LimitRequestBody on just that location; I haven't tried it.
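
Scoped to that path, it might look like this (an untested sketch, using the URL observed above):

<Location /dashboard/api/glance/images/>
LimitRequestBody 10485760000
</Location>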

Error: cinder.exception.MetadataCopyFailure: Failed to copy metadata to volume: Glance metadata cannot be updated, key signature_verified exists

Solution: Delete the image's signature_verified metadata entry. Below is the explanation given by an upstream developer, with the corresponding link:

  1. Create an image from a volume
  2. The signature_verified=False property gets added to the image
  3. Now you can't create volumes from this image
  4. openstack image unset --property signature_verified <image ID>
  5. Now you can create volumes from this image

Reference: https://bugs.launchpad.net/cinder/+bug/1823445

Error: An instance gets stuck at "Powering On" when starting, or at "Powering Off" when shutting down

Solution: Restart the nova-compute service:

systemctl restart openstack-nova-compute.service

References: https://stackoverflow.com/questions/67284824/openstack-instance-task-state-stuck-in-powering-off-how-reset-it-to-none

https://bugs.launchpad.net/nova/+bug/1593186

Error: An instance gets stuck at Rebooting

Solution: Reset the instance state, then restart the instance:

nova list # find the VM's ID
nova reset-state <id> # reset the VM's state
nova stop <id> # stop the VM
nova start <id> # start the VM
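
The same flow with the unified openstack client, for newer releases where the nova CLI is deprecated (a sketch; <server-id> is a placeholder):

openstack server list
openstack server set --state active <server-id>
openstack server stop <server-id>
openstack server start <server-id>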

Reference: https://blog.csdn.net/qiqi_521/article/details/121398616