Discussion:
[vdsm] Problems with vdsm deleted Storages Domains
David Andino
2014-05-25 22:37:35 UTC
Hello everyone,

We are in serious trouble after something that happened recently. Here is our story.

We have 12 nodes and 1 engine server. Our infrastructure is based on iSCSI and NFS (for the Export and ISO domains). Recently my partner followed some bad practices on our NFS server. The first thing we noticed in the engine was that all of our nodes were contending for SPM and never agreed on which one would take it. The logs pointed to metadata corruption in the NFS domains. After hours of this contention we tried to stabilize the cluster.

To stabilize the cluster, we shut down our NFS server, but nothing changed: the cluster was still contending for SPM. After many attempts (putting the nodes in maintenance mode, rebooting, shutting down the whole cluster, etc.), the last thing we did was destroy the Export and ISO domains from the configuration (using the web interface), leaving the Data Domain intact, in the hope of bringing our guests back online. That was (I think) our big mistake: our Data Domain now will not come online, because vdsm on every node is still looking for the NFS domains we deleted.

We used vdsClient to query the information vdsm is getting, and it reports 1 Storage Pool and 3 Storage Domains: our iSCSI domain and the 2 NFS domains that were deleted.

We left only one node active, with the rest in maintenance, and the message is that it is still contending for SPM. The vdsm.log says the node cannot find the NFS storage domains. We tried to delete these domains using vdsClient, but it says the node has to hold SPM, and the node cannot take SPM because it cannot find the NFS domains, so we are going in a circle.

So the question is: is there any way to delete these domains from the vdsm configuration? What do I have to do to break this circle, so that the node stops contending for SPM and our Data Domain can be activated again?

I appreciate any comments and help you can share with me.

Regards

David
Federico Simoncelli
2014-05-26 07:54:37 UTC
Can you paste the exact traceback that you see in the vdsm log?
Also, what vdsm version is it?

Thanks,
--
Federico
