Proposal for the RHIC Computing Center in Japan

Draft version: August 12, 1998
Takashi Ichihara

Index
  1. Scope of the PHENIX CC-J
  2. Relation between PHENIX and the PHENIX CC-J
  3. Hardware and software requirements
  4. Construction plan
  5. System component

1. Scope of the PHENIX CC-J

The PHENIX CC-J is a regional center for PHENIX computing in Japan. We currently assume that the size of the PHENIX CC-J will be about 1/4 of the RCF. The PHENIX CC-J will serve data analysis and simulation for the PHENIX experiment. We currently expect that about half of the resources of the CC-J will be used for PHENIX simulation, coordinated by the PHENIX off-line computing group. The other half of the resources will be used for data analysis, including data mining and physics analysis, coordinated by the PHENIX CC-J Program Advisory Committee.

2. Relation between PHENIX and the PHENIX CC-J

The following figure shows our view of the PHENIX CC-J within the PHENIX computing organization:

    PHENIX Computing (Barbara Jacak)
      Liaisons:
        RCF Liaison (Dave Morrison)
        CC-J Liaison (Takashi Ichihara)
      Subgroups:
        Core Software (Dave)
        Simulation (Chrary)
        Central Tracking (Jeff)
        Muon Arm (Marinda)
        Raw Data (Martin)

The PHENIX experiment at RHIC will produce raw data at 20 MB/s, which amounts to 230 TB/year. This raw data will be converted into Data Summary Tapes (DST) of about 175 TB/year at the RHIC Computing Facility (RCF). Data analysis for the PHENIX experiment starts from these DSTs. In order to promote data analysis for the RHIC spin experiment, it is desirable to have a dedicated data analysis facility. In addition, since the PHENIX project has many collaborators in Asia (e.g. Japan, China, India and Korea), it has also been desired to establish a regional computing center in Japan for the PHENIX experiment. The RHIC Computing Center in Japan (RHIC-CCJ) aims to satisfy these requests, promoting data analysis for the RHIC experiments and serving as a regional computing center for the Asian collaborators of the PHENIX experiment, in order to strengthen the research capability for RHIC physics.
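As a sanity check on the raw-data figures quoted above, the short sketch below converts the 20 MB/s rate into an annual volume. The implied effective data-taking time of roughly 1.2 x 10^7 seconds per year is our reading of the quoted numbers, not a figure stated in this proposal.

    # Sanity check of the quoted raw-data volume (Python).
    # The 20 MB/s rate and 230 TB/year are quoted above; the effective
    # data-taking time is derived here and is therefore only implied.

    RAW_RATE_MB_S = 20          # raw data rate from the PHENIX DAQ
    RAW_VOLUME_TB = 230         # quoted annual raw-data volume

    seconds = RAW_VOLUME_TB * 1e6 / RAW_RATE_MB_S   # TB -> MB, then divide by rate
    days = seconds / 86400

    print(f"implied data taking: {seconds:.1e} s (~{days:.0f} days/year)")
    # -> about 1.2e7 s, i.e. roughly 130 days of continuous recording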

3. Hardware and software requirements

At BNL, the RCF will be used as the common computing environment for the RHIC project: STAR, PHENIX, PHOBOS, BRAHMS, etc. About 1/3 of the RCF resources are assumed to be used for PHENIX, so the size of the RHIC-CCJ is comparable to the PHENIX portion of the RCF, i.e. 1/3 of the full RCF size.

3.1 Data storage requirement

3.1.1 Capacity

The following table shows the current estimate of the annual data volume for the PHENIX experiment in the steady-state stage.

                                          All PHENIX      CC-J
    Raw data                              300 TB            0
    Data Summary Tape (DST)               150 TB          150 TB
    micro-DST                             100 TB          100 TB
    Simulated data                        150 TB           50 TB
    Theoretical models                     10 TB           10 TB
    ---------------------------------------------------------------
    Total annual data volume              710 TB          310 TB

To handle a data volume of 310 TB/year, a hierarchical storage management (HSM) system with a tape robot and a disk system is required. We assume the use of the High Performance Storage System (HPSS), developed at the National Storage Laboratory under a DOE project. HPSS has also been adopted at the BNL RCF as well as at many high-energy accelerator laboratories in the world. We assume that 200 TB of online tape-robot storage and about 15 TB of disk are required to handle this data.

3.1.2 I/O throughput for the storage

(a) Tape drive I/O bandwidth. Assuming that accessing a whole year of DST (175 TB) should take less than 30 days, the required total tape-drive bandwidth is about 70 MB/s. If we use Redwood tape drives, more than 6 drives are required to satisfy this.

(b) Disk I/O bandwidth. Assuming that access to the whole micro-DST sample kept on disk (10 TB) can be done in a day, the required disk read bandwidth is more than 115 MB/s.

3.2 CPU requirement

The following table shows an estimate of the CPU requirement for the PHENIX experiment. Since the CC-J does not handle the raw data, the event reconstruction part of the CPU is not necessary. We assume that 1/4 of the required simulation is carried out at the CC-J. This simple estimate gives a CPU requirement for the CC-J of about 4880 SPECint95, which corresponds to about 400 Pentium II processors running at 300 MHz.

                                              All PHENIX      CC-J
    Event reconstruction                          6084           0
    Data analysis of real data                    1700        1700
    Theoretical models                             800         800
    Simulation                                    7991        2000
    Event reconstruction of simulated data        1300         330
    Data analysis of simulated data                170          50
    ----------------------------------------------------------------------
    Total CPU requirement (SPECint95)            18045        4880
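The bandwidth and CPU numbers above follow from simple arithmetic, reproduced in the sketch below. The inputs are the figures quoted in this section; the per-CPU rating of about 12 SPECint95 for a 300 MHz Pentium II is an assumption implied by 4880 SPECint95 corresponding to roughly 400 CPUs, and the Redwood drive rate of about 11 MB/s is likewise an assumption, not a figure from this proposal.

    # Reproduction of the sizing estimates in sections 3.1.2 and 3.2 (Python).

    DAY = 86400  # seconds

    # (a) Tape-drive bandwidth: read one year of DST (175 TB) within 30 days.
    tape_bw_mb_s = 175 * 1e6 / (30 * DAY)          # ~68 MB/s
    redwood_mb_s = 11                              # assumed Redwood native rate
    n_tape_drives = tape_bw_mb_s / redwood_mb_s    # ~6 drives

    # (b) Disk bandwidth: read the 10 TB of micro-DST kept on disk in one day.
    disk_bw_mb_s = 10 * 1e6 / DAY                  # ~116 MB/s

    # (c) CPU farm size: CC-J requirement divided by an assumed per-CPU rating.
    n_cpus = 4880 / 12.2                           # ~400 Pentium II 300 MHz CPUs

    print(f"tape: ~{tape_bw_mb_s:.0f} MB/s, i.e. more than {n_tape_drives:.0f} Redwood drives")
    print(f"disk: ~{disk_bw_mb_s:.0f} MB/s")
    print(f"CPU : ~{n_cpus:.0f} Pentium II 300 MHz processors")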
3.3 Software requirement

Software developed at the BNL RHIC Computing Facility is expected to be used at the CC-J. The software developed by the Grand Challenge project for the RHIC experiments will also be used, as will software specially designed and developed for the PHENIX experiment (PISA, STAF, etc.). To implement the hierarchical storage system, the High Performance Storage System (HPSS) will be installed. To implement the object-oriented database, Objectivity/DB will be used. In addition, user-written software for the data analysis of the spin physics experiment will also be used at the CC-J.

3.4 Human resource requirement

In order to design, re-design and maintain the HPSS system, a system engineer with sophisticated knowledge and experience of HPSS is required on a full-time basis at the CC-J. Likewise, in order to implement and maintain Objectivity/DB, a system engineer with sophisticated knowledge and experience of Objectivity/DB is required on a full-time basis. To import the DSTs of 175 TB/year and inject these tapes into the tape robot, a full-time operator for tape handling is required. For the daily management of the whole CC-J system, a full-time system manager is required at the CC-J.

    Item                                 Requirement
    ------------------------------------------------------------
    System manager                       1 full-time equivalent
    System engineer (HPSS)               1 full-time equivalent
    System engineer (Objectivity/DB)     1 full-time equivalent
    Operator (tape handling)             1 full-time equivalent

4. Construction plan

The main part of the CC-J will be constructed over three years, starting in JFY 1999 and finishing in JFY 2001. By the end of the first fiscal year, about 1/3 of the capacity of the CC-J should be operational. The tape robot system, with a capacity of 100-200 TB, will be installed in January 1999 as part of the RIKEN supercomputer system. We assume that about 3/4 of this tape robot system can be used for the CC-J. The HPSS software and hardware will also be installed in JFY 1998 under a supplementary budget. The detailed budget plan is described in a separate paper. In JFY 1998, R&D is planned for:

  1. a prototype of the data duplication facility, and
  2. a prototype of the simulation and data analysis hardware.

A brief description of this R&D is available separately.

5. System component

1. Storage system
   1. Tape robot
      Capacity: 100-200 TB
      HPSS-supported tape drives, I/O throughput: 100 MB/s
   2. Disk subsystem
      Capacity: 15 TB
      I/O throughput: 150 MB/s
   3. Data server
      SUN Enterprise E4500 (12 CPUs), 3 units
   4. HPSS server
      IBM SP2, 16 nodes

2. CPU unit
   PC boxes (Pentium II 300 MHz, 128 MB memory, 2 GB disk), 400 units

3. Network switch
   Switching speed: 16 Gbit/s
   Upstream ports: 8 x 1000Base-TX
   Downstream ports: 500 x 100Base-TX

4. Software components
   Commercial software, as used at the RCF:
      HPSS
      Objectivity/DB
      Fortran, C, C++
   Software developed at the RHIC RCF, PHENIX, etc.:
      Grand Challenge software
      PISA
      STAF
      ROOT
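As a rough consistency check, the sketch below compares the proposed component capacities with the requirements estimated in section 3. All numbers are taken from this proposal except the assumed rating of about 12.2 SPECint95 per 300 MHz Pentium II; the comparison is illustrative only.

    # Illustrative cross-check of proposed components vs. section 3 requirements.

    requirements = {
        "tape throughput (MB/s)":   70,    # 175 TB of DST within 30 days
        "disk throughput (MB/s)":   115,   # 10 TB of micro-DST per day
        "CPU capacity (SPECint95)": 4880,  # CC-J total from section 3.2
    }

    proposed = {
        "tape throughput (MB/s)":   100,         # HPSS-supported tape drives
        "disk throughput (MB/s)":   150,         # disk subsystem
        "CPU capacity (SPECint95)": 400 * 12.2,  # 400 PC boxes, assumed rating
    }

    for item, need in requirements.items():
        have = proposed[item]
        flag = "ok" if have >= need else "short"
        print(f"{item:26s} required {need:7.0f}   proposed {have:7.0f}   [{flag}]")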