The role of the Scientific Computing team is to provide technical support and expertise in the areas of compute, storage and networking, in a manner suited to the unique needs and requirements of the various BII scientific divisions. The team's main focus is on providing highly customized IT resources on demand, at short notice, while seeking innovative and elegant architectures and solutions. Our areas of specialization include:
a) High throughput Linux clusters
b) Large scale, general purpose file systems
c) High volume IP networks for large data transfers
d) Data backup, replication and archival
e) Design, implementation and operation of corporate services
Deployment of Gen4 Infrastructure
The deployment of a new set of enterprise services and end-user networks was carried out over the past year. This new infrastructure was put in place partly as an overall upgrade of the Institute's infrastructure but, more importantly, to provide a network infrastructure following the departure of NCS in Q3 2009. In addition to the new network, a new suite of servers was deployed to replace the aging servers that provide email, DNS, LDAP and other supporting services to the Institute.
The new network brings Gigabit Ethernet to each scientist and provides a 10 Gbit backbone spanning the BII equipment in the Data Centre and the users on Level 7. The network's design is guided by open standards, with an emphasis on high-volume packet routing and switching. The new network also includes a physically independent backup infrastructure, which provides connectivity in the event of a failure of any single component or link on the primary network.
Gen4's connection to the outside world is throttled such that each service (e.g. web, email, SSH) is guaranteed a minimum amount of bandwidth while being capped at a certain maximum. The total traffic going in and out of the Institute is also throttled to ensure fair usage of the WAN links shared with other institutes in Biopolis. Packet schedulers handle certain high-volume traffic (e.g. web) to ensure fairness among BII end users.
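This kind of guaranteed-minimum, capped-maximum shaping can be sketched with the Linux tc hierarchical token bucket (HTB) qdisc. The interface name, rates and port numbers below are illustrative assumptions, not the actual Gen4 configuration:

```shell
# Illustrative sketch only: device name and rates are assumptions.
# Root HTB qdisc; unclassified traffic falls into class 1:30.
tc qdisc add dev eth0 root handle 1: htb default 30

# Parent class: total cap for the shared WAN link (assumed 1 Gbit).
tc class add dev eth0 parent 1: classid 1:1 htb rate 1gbit ceil 1gbit

# Web: guaranteed 200 Mbit, may borrow spare bandwidth up to 500 Mbit.
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 200mbit ceil 500mbit
# Email: guaranteed 50 Mbit, capped at 100 Mbit.
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 50mbit ceil 100mbit
# Default class for everything else.
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 100mbit ceil 1gbit

# Classify by destination port (80 = web, 25 = SMTP).
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip dport 80 0xffff flowid 1:10
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip dport 25 0xffff flowid 1:20
```

Under HTB, each class is guaranteed its `rate` but may borrow unused bandwidth from siblings up to its `ceil`, which matches the guaranteed-minimum/capped-maximum behaviour described above.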
This new server infrastructure was built up step by step between May and December 2008. During this period, all services were completely migrated from the old hardware to new servers and disk arrays. Central to the new server infrastructure is the Andrew File System (AFS). This global filesystem serves everything from user home directories, to binaries for corporate services, to shared storage for the various research divisions in BII. AFS itself is backed up by automated nightly procedures which copy data onto a pair of Sun storage servers. The backup system allows us to retrieve any user file, from any day, up to 3 months back.
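A nightly AFS backup of this kind is typically built on the standard `vos backupsys` and `vos dump` commands. The sketch below is hypothetical: the fileserver name, volume prefix, destination path and 90-day retention sweep are assumptions standing in for the actual BII procedure:

```shell
#!/bin/sh
# Hypothetical nightly AFS backup sketch; names and paths are assumptions.
DATE=$(date +%Y%m%d)
DEST=/backup/afs/$DATE     # assumed mount point on the Sun storage servers
mkdir -p "$DEST"

# Create .backup snapshot volumes for all user volumes.
vos backupsys -prefix user -localauth

# Dump each snapshot volume to a file on the backup store.
for vol in $(vos listvol fileserver1 -localauth | awk '/\.backup/ {print $1}'); do
    vos dump -id "$vol" -file "$DEST/$vol.dump" -localauth
done

# Expire dumps older than 90 days, matching the 3-month retention window.
find /backup/afs -type f -mtime +90 -delete
```

Because `.backup` volumes are cheap copy-on-write snapshots, the dumps can run while users keep working, and any single file can later be restored from the dump of the relevant day.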
All critical networks and services on the Gen4 infrastructure are self-monitored. Each service comes with its own automated self-diagnostic procedures, as well as periodic runtime backups and log rotation. Critical faults trigger SMS alerts to systems engineers, while non-critical faults automatically generate emails.
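The severity-based alert routing can be illustrated with a minimal shell sketch. The function names, thresholds and delivery mechanisms are hypothetical; a production version would post to an SMS gateway and a mail server rather than echo:

```shell
#!/bin/sh
# Minimal sketch of severity-based alert routing (names are hypothetical).

route_alert() {
    severity=$1    # "critical" or "warning"
    message=$2
    if [ "$severity" = "critical" ]; then
        echo "SMS: $message"    # production: send via an SMS gateway
    else
        echo "MAIL: $message"   # production: mail -s "$message" the on-call list
    fi
}

# Example self-diagnostic: alert on root filesystem usage.
disk_check() {
    used=$1   # percent used, e.g. from: df -P / | awk 'NR==2 {print $5+0}'
    if [ "$used" -ge 95 ]; then
        route_alert critical "disk usage at ${used}%"
    elif [ "$used" -ge 85 ]; then
        route_alert warning "disk usage at ${used}%"
    fi
}
```

Each service's diagnostic script would call `route_alert` with the appropriate severity, so the SMS-versus-email decision lives in one place.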
The Institute currently owns and operates two clusters: the Christmas Cluster and the Easter Cluster. The Easter Cluster consists of 31 compute nodes (248 cores) and 20 TB of hard disk space. It is a general-purpose cluster intended mainly for single-CPU serial jobs; specialized tools such as Bioscope and CLCGenomics also run on this cluster.
The Christmas Cluster comprises 90 compute nodes, each fitted with 8 cores and 16 or 32 GB of memory, sharing 30 TB of storage. This cluster mainly runs molecular dynamics simulations (Amber, NAMD, GROMACS) over MPI on an Infiniband interconnect. There is also a separate queue with a total of 16 Nvidia GPUs in the cluster.
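A typical MPI molecular dynamics job on such a cluster is submitted through a batch scheduler. The sketch below assumes a PBS/Torque scheduler; the queue name, resource request and NAMD input files are hypothetical, not the actual Christmas Cluster configuration:

```shell
#!/bin/sh
#PBS -N namd_md             # job name
#PBS -l nodes=4:ppn=8       # 4 nodes x 8 cores each = 32 MPI ranks
#PBS -l walltime=24:00:00   # maximum run time
#PBS -q mpi                 # queue name is an assumption

cd "$PBS_O_WORKDIR"

# Launch NAMD across the allocated nodes; MPI uses the Infiniband
# interconnect for inter-node communication.
mpirun -np 32 -machinefile "$PBS_NODEFILE" namd2 equil.conf > equil.log
```

The scheduler places the ranks on the requested nodes and writes the hostnames into `$PBS_NODEFILE`, which `mpirun` uses to start one process per allocated core.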
By the end of FY2012, we will increase the number of GPUs (Nvidia Kepler) to three times the current count.