Consider the following VMware cluster environment:
1 x Cisco UCS 5108 blade server chassis
6 x Cisco UCS B200 M3 (vSphere 6.5 Enterprise Plus)
1 x VMware vCenter Standard 6.5
We had a problem with one of the servers, probably caused by a faulty storage HBA. Unfortunately, the vCenter VM was hosted on the faulty server at the time of the incident. We went through all sorts of trouble to manually move all the VMs to the other servers, and in the end everything worked out except for the vCenter VM. We suspect something in it was corrupted, because some services will not start, in particular vmware-sps, and this has caused problems with Veeam backups and other functionality. When we try to start vmware-sps manually, we get the following:
Command> service-control --all --status
Running:
applmgmt lwsmd pschealth vmafdd vmcad vmcam vmdird vmdnsd vmonapi vmware-cis-license vmware-cm vmware-content-library vmware-eam vmware-mbcs vmware-perfcharts vmware-psc-client vmware-rhttpproxy vmware-sca vmware-statsmonitor vmware-sts-idmd vmware-stsd vmware-updatemgr vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui
Stopped:
vmware-imagebuilder vmware-netdumper vmware-rbd-watchdog vmware-sps vmware-vcha
Command> service-control --start vmware-sps
Perform start operation. vmon_profile=None, svc_names=['vmware-sps'], include_coreossvcs=False, include_leafossvcs=False
2024-08-07T12:21:32.331Z Service sps state STOPPED
Error executing start on service sps. Details {
"resolution": null,
"detail": [
{
"args": [
"sps"
],
"id": "install.ciscommon.service.failstart",
"localized": "An error occurred while starting service 'sps'",
"translatable": "An error occurred while starting service '%(0)s'"
}
],
"componentKey": null,
"problemId": null
}
Service-control failed. Error {
"resolution": null,
"detail": [
{
"args": [
"sps"
],
"id": "install.ciscommon.service.failstart",
"localized": "An error occurred while starting service 'sps'",
"translatable": "An error occurred while starting service '%(0)s'"
}
],
"componentKey": null,
"problemId": null
}
/var/log/vmware/vmware-sps/sps.log
2024-08-07T09:21:36.969-03:00 [main] ERROR opId=sps-Main-33292-112 com.vmware.sps.StorageMain - Exception when running SPS service
org.springframework.beans.factory.BeanDefinitionStoreException: Invalid bean definition with name 'httpServerEndpoint' defined in class path resource [../conf/pbm-spring-config.xml]: Could not resolve placeholder 'pbm.http.port' in value "${pbm.http.port}"; nested exception is java.lang.IllegalArgumentException: Could not resolve placeholder 'pbm.http.port' in value "${pbm.http.port}"
at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:223)
at org.springframework.beans.factory.config.PropertyPlaceholderConfigurer.processProperties(PropertyPlaceholderConfigurer.java:222)
at org.springframework.beans.factory.config.PropertyResourceConfigurer.postProcessBeanFactory(PropertyResourceConfigurer.java:86)
at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(PostProcessorRegistrationDelegate.java:281)
at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(PostProcessorRegistrationDelegate.java:161)
at org.springframework.context.support.AbstractApplicationContext.invokeBeanFactoryPostProcessors(AbstractApplicationContext.java:687)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:525)
at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:139)
at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:105)
at com.vmware.vim.storage.common.app.CommonSpringService.createSpringAppContext(CommonSpringService.java:38)
at com.vmware.pbm.app.PbmLocalService.initialize(PbmLocalService.java:135)
at com.vmware.pbm.app.PbmLocalService.<init>(PbmLocalService.java:117)
at com.vmware.pbm.app.PbmLocalService.initInstance(PbmLocalService.java:195)
at com.vmware.sps.StorageMain.loadPbmService(StorageMain.java:169)
at com.vmware.sps.StorageMain.main(StorageMain.java:38)
Caused by: java.lang.IllegalArgumentException: Could not resolve placeholder 'pbm.http.port' in value "${pbm.http.port}"
at org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174)
at org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126)
at org.springframework.beans.factory.config.PropertyPlaceholderConfigurer$PlaceholderResolvingStringValueResolver.resolveStringValue(PropertyPlaceholderConfigurer.java:258)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveStringValue(BeanDefinitionVisitor.java:282)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveValue(BeanDefinitionVisitor.java:204)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitGenericArgumentValues(BeanDefinitionVisitor.java:159)
at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitBeanDefinition(BeanDefinitionVisitor.java:85)
at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:220)
... 14 more
Our searches on Google and in the official documentation turned up nothing. We do not have an active VMware support contract (I know, I know, but it's not my fault...). Does anyone have an idea how to troubleshoot and repair this? Thanks in advance!
In the same directory as the pbm.properties file (/usr/lib/vmware-vpx/sps/conf), we found a very old file called pbm.properties.upgrade. This file contained all the parameters that were missing from pbm.properties, including the pbm.http.port placeholder named in the stack trace. I have no idea why those parameters were removed from pbm.properties, but as soon as we copied the entire contents of pbm.properties.upgrade into pbm.properties, we were able to start all the services, including vmware-sps, and so far everything appears to be back to normal.
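
For anyone hitting the same error, here is a minimal sketch of how one might compare the two files before copying anything. It assumes both files use plain key=value lines. The function names are hypothetical helpers and not part of any VMware tooling, the sample values are placeholders rather than real settings, and the parser deliberately ignores Java properties features such as line continuations and ':' separators.

```python
import re

# Matches Spring-style placeholders such as ${pbm.http.port}.
# Simplification: a default written as ${key:default} is not split out.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(current_text, reference_text):
    """Return keys defined in the reference file but absent from the current one."""
    current = parse_properties(current_text)
    reference = parse_properties(reference_text)
    return sorted(set(reference) - set(current))


def unresolved_placeholders(config_text, props):
    """Return placeholder names used in a config file that have no property."""
    names = set(PLACEHOLDER.findall(config_text))
    return sorted(name for name in names if name not in props)


if __name__ == "__main__":
    # Hypothetical file contents for illustration only; on the appliance you
    # would read pbm.properties and pbm.properties.upgrade from
    # /usr/lib/vmware-vpx/sps/conf instead.
    damaged = "# pbm.properties after the incident\n"
    reference = "# pbm.properties.upgrade\npbm.http.port=0\n"  # 0 is a dummy value
    spring_xml = '<constructor-arg value="${pbm.http.port}"/>'

    print("Missing from pbm.properties:", missing_keys(damaged, reference))
    print("Unresolved placeholders:",
          unresolved_placeholders(spring_xml, parse_properties(damaged)))
```

It is worth keeping a backup copy of the existing pbm.properties before overwriting it, and then retrying `service-control --start vmware-sps` as shown in the question.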