LXSHARE cluster

LXSHARE cluster: first set:

check the machine status:

there are currently more than 200 nodes available

here is the recent list of nodes

All nodes for this ATLAS test are still running LSF, but with all queues closed.

They are still part of lxbatch, with all the monitoring tools running, so if a node shows problems the sysadmin team looks into it, and the node may be rebooted if the problems are serious.
In urgent cases the operators in the center can be called (75011) for an intervention (reboot).

The command bhosts -w -R atlas (run e.g. on lxplus) takes this into account: it marks the nodes as closed_FULL, and nodes which are down are marked unavailable.
With this command the order of nodes in the listing is not always the same.

If you want to get an ordered list, for example of the available machines, type:

bhosts -w -R atlas | egrep lx | egrep -v unavai | awk '{print $1}' | sort
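To reuse the ordered list (for example to save it to a file or to count the nodes), the same filter can be wrapped in a small shell function. This is only a sketch: list_available is a hypothetical helper name, not an LSF command, and it assumes, as the one-liner above does, that the host name is in the first column and that the test nodes contain "lx" in their name.

```shell
#!/bin/sh
# list_available (hypothetical helper): reads bhosts-style output on stdin
# and prints the sorted host names of all lines that contain "lx" and are
# not marked unavailable. Same filter as the one-liner above.
list_available() {
    grep lx | grep -v unavai | awk '{print $1}' | sort
}

# Intended use on lxplus (assumes bhosts is in the PATH; the output file
# name is only an example):
#   bhosts -w -R atlas | list_available > atlas_nodes.txt
#   wc -l < atlas_nodes.txt    # number of available nodes
```

Reading from stdin keeps the function independent of LSF, so it can also be tried on a saved copy of the bhosts output.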

LXSHARE cluster: second set:

The 124 machines with gigabit connection:
> The installation went well on 121 of those and there is now online-00-21-01,
> DF-00-07-00 and the jdk on all those 121 machines with the patches 01 to 04
> for the online.
> The valid machines are in:
> /afs/cern.ch/user/a/atlonl/public/lst04/gb_nodes.list.220304-1836
> The nodes which failed are:
> tbed0008
> tbed0013
> tbed0041

The list of valid machines has since been moved to:
/afs/cern.ch/user/a/atlonl/public/lst04/gen_info/gb_nodes.list.220304-1836
There is also the full list (including the nodes that are not working) at:
/afs/cern.ch/user/a/atlonl/public/lst04/gen_info/gb_nodes.list.orig
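
The failed nodes can in principle be recovered by comparing the two lists. The sketch below assumes that both files contain exactly one host name per line (the file format is not described here), and failed_nodes is a hypothetical helper name.

```shell
#!/bin/sh
# failed_nodes (hypothetical helper): prints the host names that appear in
# the full list ($1) but not in the list of valid nodes ($2).
# Assumes one host name per line in both files.
failed_nodes() {
    # -F: fixed strings, -x: match whole lines, -v: invert the match,
    # -f: read the patterns (the valid host names) from the file $2
    grep -v -x -F -f "$2" "$1" | sort
}

# Intended use with the two lists mentioned above:
#   cd /afs/cern.ch/user/a/atlonl/public/lst04/gen_info
#   failed_nodes gb_nodes.list.orig gb_nodes.list.220304-1836
```

If the assumption about the file format holds, the output should correspond to the failed nodes quoted above.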