
Policies and Data Management

Policies and Procedures

The Hewlett Packard Enterprise Data Science Institute (HPE DSI) strives to support the research of the University of Houston (UH) community. The following policies and procedures aim to safeguard against actions or activities that might impact the reliability of our systems or prevent users of HPE DSI's resources from effectively conducting their research.

Use of HPE DSI’s clusters signifies compliance with and acceptance of all the policies and procedures described below as well as those of UIT (https://uh.edu/infotech/policies/index).

HPC Clusters

The HPE DSI currently hosts three compute clusters: Opuntia, Sabine, and Carya. Opuntia is older and smaller than Sabine, and Sabine is older and smaller than Carya. Details about the clusters can be found on the Research Computing Data Core website.

The allocations, accounts, software, storage and management of each cluster are independent of each other. Popular software/applications have been installed in all the clusters (see Software below), but there are some differences in software versions from cluster to cluster.

Eligibility

All UHS faculty are eligible to be Principal Investigators (PIs), including tenured, tenure-track, emeritus, research, and clinical faculty.

UH faculty need an active cougarnet account to access HPE DSI's systems. UHS faculty employed at the UH Downtown, Clear Lake, or Victoria campuses may request that HPE DSI sponsor cougarnet credentials for them by emailing datascience@uh.edu. Note that these credentials must be renewed annually.

Other UH employees may be eligible for an allocation. These requests will be evaluated on a case-by-case basis. 

Costs 

All HPE DSI allocations are free of charge. 

Single vs. Multiple 

Eligible PIs whose research requires the use of more than one of the HPE DSI's clusters may submit an Allocation Request for more than one cluster. HPE DSI does not recommend that PIs hold multiple allocations unless each is actively used, because idle allocations generate unnecessary overhead. Allocations for each cluster are evaluated and managed independently.

Duration/Compute Cycle 

Allocations run on yearly cycles from September to August.  

Renewals 

During July, PIs must submit an Allocation Request with the option 'Renew Existing Project', no later than August 1st. This allows research groups to continue using the clusters without interruption. It is important that PIs assess the compute and storage resources their group will need for the coming year.

If no such renewal is received by HPE DSI, access to that allocation will be locked by the first week of October of the new compute cycle. HPE DSI will send reminders of these important dates.  

Any account that remains locked for 12 months will be deleted. This includes 100% of the data and files in the PI's project directory and all its subdirectories, as well as the home directories of the group members with accounts in that specific allocation.

It is the responsibility of the PIs to inform HPE DSI in writing if they have a written agreement binding them to keep access to their research data in HPE DSI systems for a year or more. HPE DSI reserves the right to delete such data, should PIs fail to communicate this information. 

Request 

Eligible PIs may submit an Allocation Request at any point in time.   

PIs must first decide which cluster they want to utilize: Opuntia, Sabine or Carya. 

Once a specific cluster has been identified, eligible PIs must submit an Allocation Request using the Request Allocation Form.

Note that allocation requests from students, postdocs, or other collaborators will not be approved; they must be submitted by eligible PIs. PIs must provide the number of hours, number of cores, number of GPUs, amount of memory, amount of storage, and software to be utilized, covering the period from the date of filing the request to the following August 31st.

The Allocations Committee (below) will review all submitted requests and decide on the award within one week, based on:

  • the current capacity of the resources

  • the existing allocation commitments

  • the computational needs described by the requesting PIs

  • the PIs' historical utilization of HPE DSI's resources

Request types 

New Project. This is for PIs who wish to utilize one of HPE DSI’s clusters for the very first time (independently of any other clusters’ allocations). 

Additional Compute Resources. This is for PIs who already have an active allocation, and their group has utilized most of the awarded compute time and wish to request more compute time. 

Additional Storage Resources. This is for PIs who already have an active allocation, and their group has utilized most of the awarded storage and wish to request more storage.  

Renew Existing Project (yearly, by August 1st). This is for PIs who already have an active allocation and wish to continue using the corresponding cluster during the next compute cycle. Allocations for each cluster are evaluated and managed independently. 

Allocations Committee 

The committee is formed by the Directors and Managers of HPE DSI and its Research Computing Data Core (RCDC). It evaluates each allocation request on a case-by-case basis, considering the requesting PI's computational needs in terms of the number of requested hours, GPUs (if needed), memory, storage, and software to be utilized. In addition, the committee reviews the PI's historical utilization of HPE DSI's resources before deciding on any allocation award.

Allocations Size/Use Guidelines

          Opuntia [SUs, TB]   Sabine [SUs, TB]    Carya [SUs, TB]
  Small   50k - 150k, 5       50k - 150k, 5       0, 0
  Medium  250k - 999k, 5      250k - 999k, 10     0, 0
  Large   0, 0                500k - 1.5M, 20     250k - 1M, 20
  Huge    0, 0                1M - 3M, 20         1M - 3M, 20*

*The Allocations Committee may review time-limited proposals for special cases, for 'Texas' projects that genuinely need more than 20 TB.

Eligibility 

Accounts are created for specific active Allocations only.  

All UHS faculty members who are eligible PIs will automatically receive an account as part of their allocation.

PIs who have an active allocation may sponsor other eligible PIs, as well as UHS undergraduate and/or graduate students, postdocs, and technical staff who have active cougarnet credentials.

Collaborators from other UHS campuses must first ask the allocation's PI to sponsor an ePOI for them in order to obtain temporary cougarnet credentials.

Non-UH collaborators of the PIs may be eligible too, subject to review by the Allocations Committee.

Non-UHS or Foreign collaborators 

HPE DSI does not have the compute capacity to provide for non-UH researchers. However, allocation eligible PIs may sponsor accounts for collaborators from institutions other than UHS. 

Both the PI and any potential non-UH users must submit a clear, detailed project description/proposal to the Allocations Committee, using <datascience@uh.edu>, to evaluate whether UH resources can be allocated to external collaborators sponsored by UHS PIs. 

These non-UHS accounts are to be used: 

  • With active participation of the UHS-based part of the research group 

  • As a complement to the external collaborators' home institutions' resources, but not for the bulk of the external collaborators' work.

It is the responsibility of the PIs to evaluate any of their foreign collaborators to ensure that no federal regulations, including export control, are violated in the collaboration. The Allocations Committee will ask for this information before deciding on the creation of any account in question.

Request 

Eligible allocation members must submit an Account Request using the Request Account Form, providing the sponsoring PI's name and the cluster's name.

Duration 

Accounts are attached to active allocations. Active accounts will be renewed automatically with their corresponding allocation's renewal (above), except for accounts of group members who are registered at the UH Downtown, Clear Lake, or Victoria campuses, or at other non-UH institutions; these require annual sponsorship of cougarnet credentials.

HPE DSI's resources are shared. This places limits on the compute time that may be awarded to each allocation per year.

Compute time is measured in 'service units', or SUs. One SU is 'burned' by using one CPU core for one hour; ten SUs are burned by using one CPU core for 10 hours, or 10 CPU cores for 1 hour. For GPUs (regardless of model), 10 SUs are burned by using one GPU card for one hour.
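Under these rates, expected SU consumption is simple arithmetic. As an illustration (the core, GPU, and hour counts below are hypothetical, not recommended request sizes):

```shell
# Estimate SU consumption for a hypothetical job mix (all values illustrative)
CPU_SU_PER_CORE_HOUR=1    # 1 SU per CPU core per hour
GPU_SU_PER_CARD_HOUR=10   # 10 SUs per GPU card per hour

cores=16; gpus=2; hours=24
sus=$(( cores * hours * CPU_SU_PER_CORE_HOUR + gpus * hours * GPU_SU_PER_CARD_HOUR ))
echo "$sus SUs"   # 16*24*1 + 2*24*10 = 864 SUs
```

Scaling such an estimate over the jobs a group expects to run in a year gives a reasonable starting point for the SU figure in an Allocation Request.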

PIs must provide an SU amount representative of their group's research needs when submitting an Allocation Request.

Should a research group utilize most of their awarded SUs before the end of the current compute cycle, the PI leading the Allocation may submit an allocation request with the option Additional Compute Resources, at any point in time. 

HPE DSI's resources are shared. This places limits on the storage amounts that may be awarded to each allocation.

HPE DSI’s clusters have been architected to support projects with a balanced compute-to-storage ratio, as is the case in most HPC centers. 

HPE DSI does not support archival storage.  

Each active allocation will have the following data directories: 

/home/USERNAME/ 

One for each account. Each home directory has a 10 GB limit and is meant for basic data maintenance, such as copying files, organizing data, compiling code, installing local applications, etc. Home directories are not intended for large-scale read/write operations for research.

/project/ALLOCATION-ID/ 

One for each allocation. Its capacity is much larger than the home directories', and it is meant to store large data sets (starting from 250 GB) and to serve as the main read/write area for research.
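Before requesting additional storage, usage against these limits can be checked with standard Linux tools (the ALLOCATION-ID path below is illustrative; substitute your own allocation's directory):

```shell
# Check current usage against the quotas described above (paths illustrative)
du -sh "$HOME"                  # home directory usage (10 GB limit)
du -sh /project/ALLOCATION-ID   # project directory usage for your allocation
df -h /project                  # free space on the shared project file system
```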

It is the responsibility of the leading PI to manage and maintain access and permissions for all of their allocations' members.

Should a research group utilize most of their awarded storage (in the /project/ALLOCATION-ID directory) before the end of the current compute cycle, the leading PI of the allocation may submit an allocation request with the option Additional Storage Resources at any point in time.

Backups 

HPE DSI does not perform any backups on any of its clusters' file systems. If, for example, a student on an active allocation deletes their research files, HPE DSI does not have the capacity to restore them.

It is the responsibility of each allocation's PI to keep backups, on servers, workstations, or systems of their own and independent of HPE DSI, of any important data or files processed or generated on any of HPE DSI's clusters.

HPE DSI’s clusters run on Linux, and do not support any Windows/Mac OS packages.  

HPE DSI’s clusters are traditional HPC systems and are not designed to run any virtual machine environments. 

Popular applications, software, and libraries, such as C, C++, and Fortran compilers, FFTW, HDF5, Python, RStudio, ML libraries, and many more, have been installed on all HPE DSI's clusters. There are some differences in software versions from cluster to cluster.

Allocation PIs are responsible for identifying and purchasing the correct licenses attached to any software that their groups would like to utilize in any HPE DSI’s clusters.  

Allocation users may request the installation of some packages needed for their research via our ticketing system.

Allocation users have permissions to do local installations in their home directories. This is meant for very specific software that is not relevant for other groups in the cluster, or applications that require routine/frequent compilations.
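As a sketch of what such a home-directory installation typically looks like on a Linux cluster (the package name and prefix are illustrative, not a prescribed workflow):

```shell
# Install software under the home directory instead of system locations
PREFIX="$HOME/.local"

# Python packages: the --user flag installs under ~/.local by default, e.g.
#   pip install --user numpy

# Building from source with a home-directory prefix (Autotools-style), e.g.
#   ./configure --prefix="$PREFIX" && make && make install

# Make user-local binaries visible to the shell
export PATH="$PREFIX/bin:$PATH"
```

Keeping such installations under a single prefix makes them easy to track against the 10 GB home-directory limit.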

HPE DSI’s RCDC provides all hardware support and maintenance for the clusters. 

HPE DSI and its RCDC provide support to their active allocation users via our ticketing system.

We provide these types of support:  

  • Allocation/Account related requests: Details can be found above. 
  • General help: This is meant to support Allocations’ active members with issues/clarification related to their Allocations, which may take on a few hours of HPE DSI’s Staff time. This includes software installations.  
  • Consultation: These are handled on a case-by-case basis and are meant to study the problem and develop a plan to support UH research groups that would like to use HPE DSI's clusters and staff for data science, machine learning, deep learning, HPC, or other fields that may require a significant computational effort.

All queued and running applications on HPE DSI's clusters are managed by the automated Slurm scheduler. The scheduler has been set up based on best practices for university-based HPC systems, as well as RCDC's data on the clusters' utilization and traffic. This ensures the fairest share of resources for allocations of all sizes.
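Work is submitted to Slurm as a batch script. The sketch below is a minimal, hypothetical example (the job name, core count, time limit, and module name are illustrative; consult the RCDC documentation for each cluster's actual partitions and modules):

```shell
#!/bin/bash
#SBATCH -J my_job            # job name (hypothetical)
#SBATCH -o my_job.%j.out     # stdout file; %j expands to the job ID
#SBATCH -N 1                 # one node
#SBATCH -n 8                 # eight CPU cores (burns 8 SUs per hour)
#SBATCH -t 04:00:00          # wall-clock limit of four hours

# Load software via environment modules, then run the application
module load python           # module names vary from cluster to cluster
python my_analysis.py
```

Such a script would be submitted with `sbatch my_job.slurm`, and `squeue -u $USER` shows its status in the queue. Running work through the scheduler this way, rather than on the login node, is what keeps resource sharing fair.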

It is the responsibility of active allocations' PIs to supervise all of their allocation users' activity and correct utilization. Abuse of any of HPE DSI's resources may result in the termination of problematic queued or running applications, warnings, investigations, and, in extreme cases, possibly the termination of accounts or allocations.

Incorrect usage examples 

  • Running applications on the login node of any cluster
  • Running applications not meant for HPC clusters
  • Running applications that claim large parts of the clusters, in terms of SUs, submitted jobs, storage, or write operations
  • Running interactive sessions that are mostly idle
  • Non-UH or foreign collaborators who use substantial resources, staff time included
  • Requesting allocations without a genuine research need for them
  • Requesting allocations without a genuine compute need for them, while planning to use them for archival storage

HPE DSI has been working with UH Main Campus faculty to purchase compute nodes that may be used to expand HPE DSI's clusters, based on a model similar to the 'condo model'. Please contact us about this via datascience@uh.edu.

Contact Us

Our office is located in Suite 205 on the second floor of the
Durga D. and Sushila Agrawal Engineering Research Building.

Phone: 713.743.9922
Email: datascience@uh.edu

Physical address:
4718 Calhoun Rd.
Houston, TX 77204
