Policies and Data Management
Policies and Procedures
The Hewlett Packard Enterprise Data Science Institute (HPE DSI) strives to support the research of the University of Houston (UH) community. The following policies and procedures aim to safeguard against actions or activities that might impact the reliability of our systems or prevent users of HPE DSI’s resources from effectively conducting their research.
Use of HPE DSI’s clusters signifies compliance with and acceptance of all the policies and procedures described below as well as those of UIT (https://uh.edu/infotech/policies/index).
Protected and Sensitive Data
All HPE DSI users agree that they will NOT use any of the HPE DSI systems to store data that are protected by the following laws and regulations.
Health Insurance Portability and Accountability Act (HIPAA)
Federal Information Security Management Act (FISMA)
Family Educational Rights and Privacy Act of 1974 (FERPA)
International Traffic in Arms Regulations (ITAR)
Export Administration Regulations (EAR)
Violation of this policy will result in the immediate closure of all associated user accounts and the removal of data that are in violation of this policy. It is the responsibility of the HPE DSI user to ensure compliance with this policy and to report any violation of this policy as dictated by governing data use agreements.
HPC Clusters
The HPE DSI currently hosts three computer clusters: Opuntia, Sabine, and Carya. Opuntia is older and smaller than Sabine, and Sabine is older and smaller than Carya. Details about the clusters can be found on the Research Computing Data Core website.
The allocations, accounts, software, storage, and management of each cluster are independent of each other. Popular software/applications have been installed in all the clusters (see Software below), but there are some differences in software versions from cluster to cluster.
Eligibility
All UHS faculty are eligible to be Principal Investigators (PIs). This includes tenured, tenure-track, emeritus, research, and clinical faculty.
UH faculty need an active cougarnet account to access HPE DSI’s systems. UHS faculty who are employed on the UH Downtown, Clear Lake, or Victoria Campuses may request DSI to sponsor cougarnet credentials directly at datascience@uh.edu. Note that these credentials are renewed annually.
Other UH employees may be eligible for an allocation. These requests will be evaluated on a case-by-case basis.
Costs
All HPE DSI allocations are free of charge.
Single vs. Multiple
Eligible PIs whose research requires more than one of HPE DSI’s clusters may submit an Allocation Request for more than one cluster. HPE DSI recommends against holding multiple allocations if any of them are not actively used, because this generates unnecessary overhead. Allocations for each cluster are evaluated and managed independently.
Duration/Compute Cycle
Allocations run on yearly cycles from September to August.
Renewals
PIs must submit an Allocation Request with the option ‘Renew Existing Project’ during July, and no later than August 1st. This allows research groups to continue using the clusters without interruption. It is important that PIs assess the compute and storage resources that their group will need for one year.
If no such renewal is received by HPE DSI, access to that allocation will be locked by the first week of October of the new compute cycle. HPE DSI will send reminders of these important dates.
Any accounts that remain locked for 12 months will be deleted. This includes 100% of the data and files in the PI’s project directory and all its subdirectories, as well as the home directories of all group members with accounts in that specific allocation.
It is the responsibility of the PIs to inform HPE DSI in writing if they have a written agreement binding them to keep access to their research data in HPE DSI systems for a year or more. HPE DSI reserves the right to delete such data, should PIs fail to communicate this information.
Request
Eligible PIs may submit an Allocation Request at any point in time.
PIs must first decide which cluster they want to utilize: Opuntia, Sabine or Carya.
Once a specific cluster has been identified, eligible PIs must submit an Allocation Request using the Request Allocation Form.
Note that allocation requests from students, postdocs, or other collaborators will not be approved; they must be submitted by eligible PIs. PIs must provide the number of hours, number of cores, number of GPUs, amount of memory, amount of storage, and software they plan to utilize from the date of filing the request to the following August 31st.
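As an illustration of the request period described above, the following minimal sketch computes the August 31st that closes the compute cycle for a given filing date. The function name `cycle_end` is hypothetical, and treating a request filed on August 31st itself as ending that same day is an assumption; the policy does not address that edge case.

```python
from datetime import date


def cycle_end(filed: date) -> date:
    """Return the August 31st that closes the compute cycle containing `filed`.

    Assumption: a request filed on August 31st ends on that same day.
    """
    end_this_year = date(filed.year, 8, 31)
    if filed <= end_this_year:
        return end_this_year
    # Filed in September or later: the cycle ends the following calendar year.
    return date(filed.year + 1, 8, 31)


if __name__ == "__main__":
    filed = date(2025, 3, 15)
    end = cycle_end(filed)
    print(f"Filed {filed}, request period ends {end} ({(end - filed).days} days)")
```

A helper like this can help a PI scale a yearly estimate down to the number of days actually left in the cycle when filing mid-year.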
The Allocations Committee (below) will review all submitted requests and decide on the award within one week, based on:
- the current capacity of the resources
- the existing allocations’ commitments
- the information provided by the requesting PIs in terms of computational needs
- the PIs’ historical utilization of HPE DSI’s resources
Request types
New Project. This is for PIs who wish to utilize one of HPE DSI’s clusters for the very first time (independently of any other clusters’ allocations).
Additional Compute Resources. This is for PIs who already have an active allocation, whose group has utilized most of the awarded compute time, and who wish to request more compute time.
Additional Storage Resources. This is for PIs who already have an active allocation, whose group has utilized most of the awarded storage, and who wish to request more storage.
Renew Existing Project (yearly, by August 1st). This is for PIs who already have an active allocation and wish to continue using the corresponding cluster during the next compute cycle. Allocations for each cluster are evaluated and managed independently.
Allocations Committee
The Committee is formed by the Directors and Managers of HPE DSI and its Research Computing Data Core (RCDC). They evaluate each allocation request on a case-by-case basis, considering the requesting PI’s computational needs in terms of number of requested hours, GPUs (if needed), memory, storage, and software to utilize. In addition, the Committee reviews the PI’s historical utilization of HPE DSI’s resources before deciding on any allocation awards.
Allocations Size/Use Guidelines
Size | Opuntia [SUs, TB] | Sabine [SUs, TB] | Carya [SUs, TB] |
---|---|---|---|
Small | 50k - 150k, 5 | 50k - 150k, 5 | 0, 0 |
Medium | 250k - 999k, 5 | 250k - 999k, 10 | 0, 0 |
Large | 0, 0 | 500k - 1.5M, 20 | 250k - 1M, 20 |
Huge | 0, 0 | 1M - 3M, 20 | 1M - 3M, 20* |
*The Allocations Committee may review time-limited proposals for special cases for 'Texas' projects that actually need more than 20 TB.
Eligibility
Accounts are created for specific active Allocations only.
All UHS faculty members (PIs) who are eligible for an Allocation will receive an account automatically as part of their Allocation.
PIs who have an active Allocation may sponsor other eligible PIs, UHS undergraduate and/or graduate students, UHS post-docs and/or technical staff who have active cougarnet credentials.
Collaborators from other UHS Campuses must first ask the allocation's PI to sponsor an ePOI for them in order to obtain temporary cougarnet credentials.
Non-UH collaborators of the PIs may be eligible too, depending on the review of the Allocations Committee.
Non-UHS or Foreign collaborators
HPE DSI does not have the compute capacity to provide for non-UH researchers. However, allocation-eligible PIs may sponsor accounts for collaborators from institutions other than UHS.
Both the PI and any potential non-UH users must submit a clear, detailed project description/proposal to the Allocations Committee via datascience@uh.edu, so that the Committee can evaluate whether UH resources can be allocated to external collaborators sponsored by UHS PIs.
These non-UHS accounts are to be used:
With active participation of the UHS-based part of the research group
As a complement to the resources of the external collaborator’s home institution, but not for the bulk of the external collaborator’s work.
It is the responsibility of the PIs to evaluate any of their foreign collaborators to ensure that no federal regulations, including export control, are violated in the collaboration. The Allocations Committee will ask for this information before deciding on the creation of any account in question.
Request
Eligible Allocation Account members must submit an Accounts request using the Request Account Form.
Use the sponsoring PI’s name and the cluster’s name.
Duration
Accounts are attached to active Allocations. Active Accounts will be automatically renewed with their corresponding Allocations’ renewals (above), except for Accounts of groups’ members who are registered in the UH Downtown, Clear Lake, Victoria Campuses, or other non-UH institutions; these require annual sponsorship of cougarnet credentials.
HPE DSI will delete all accounts not associated with an active Cougarnet ID.
HPE DSI’s resources are shared. This sets limits on the allocations’ compute times that may be awarded per year.
Compute time is measured in ‘service units’, or SUs. One SU is ‘burned’ by using one CPU core for one hour. Ten SUs are burned by using one CPU core for 10 hours or 10 CPU cores for 1 hour. For GPUs (regardless of model), 10 SUs are burned by using 1 GPU card for 1 hour.
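The SU arithmetic above can be sketched as a small helper for estimating what a job will cost. The function and constant names are hypothetical, and the sketch assumes that CPU and GPU charges simply add up for a job that uses both; the policy states the two rates but does not say how mixed jobs are billed.

```python
CPU_SU_PER_CORE_HOUR = 1   # from the policy: 1 SU per CPU core per hour
GPU_SU_PER_GPU_HOUR = 10   # from the policy: 10 SUs per GPU card per hour


def service_units(cpu_cores: int, gpus: int, hours: float) -> float:
    """Estimate the SUs burned by one job.

    Assumption: CPU and GPU charges are additive for jobs that use both.
    """
    if cpu_cores < 0 or gpus < 0 or hours < 0:
        raise ValueError("cores, GPUs, and hours must be non-negative")
    hourly_rate = CPU_SU_PER_CORE_HOUR * cpu_cores + GPU_SU_PER_GPU_HOUR * gpus
    return hourly_rate * hours


if __name__ == "__main__":
    # The examples from the policy text.
    print(service_units(cpu_cores=1, gpus=0, hours=10))   # one core for 10 hours
    print(service_units(cpu_cores=10, gpus=0, hours=1))   # ten cores for 1 hour
    print(service_units(cpu_cores=0, gpus=1, hours=1))    # one GPU for 1 hour
```

Summing this estimate over the jobs a group expects to run in a year gives a starting point for the SU figure in an Allocation Request.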
PIs must provide an amount of SUs representative of their research needs/group while submitting an Allocation Request.
Should a research group utilize most of their awarded SUs before the end of the current compute cycle, the PI leading the Allocation may submit an allocation request with the option “Additional Compute Resources”, at any point in time.
HPE DSI’s resources are shared. This sets limits on the allocations’ storage amounts.
HPE DSI’s clusters have been architected to support projects with a balanced compute-to-storage ratio, as is the case in most HPC centers.
HPE DSI does not support archival storage.
Each active allocation will have the following data directories:
/home/USERNAME/
One for each account. Each home directory has a 10 GB limit, and it is meant for basic data maintenance, such as copying files, organizing data, compiling code, installing local applications, etc. Home directories are not for large input/output or read/write operations for research.
/project/ALLOCATION-ID/
One for each allocation. The capacity is much larger than that of the home directories, and it is meant to store large data sets (starting from 250 GB) and to be used as the main input/output, read/write drive for research.
It is the responsibility of the leading PI to manage and maintain access and permissions for all her/his allocations’ members.
Should a research group utilize most of their awarded storage (in the /project/ALLOCATION-ID directory) before the end of the current compute cycle, the leading PI of the Allocation may submit an allocation request with the option “Additional Storage Resources” at any point in time.
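To see how close a directory is to its limit before requesting more storage, a user could total the file sizes under it. The following is a minimal sketch, not an HPE DSI tool: the function names are hypothetical, the 10 GB home limit is taken from the text above, and interpreting it as decimal gigabytes is an assumption. It sums apparent file sizes, which may differ from what the file system's own quota accounting reports.

```python
import os

GB = 10**9                   # assumption: decimal gigabytes
HOME_LIMIT_BYTES = 10 * GB   # home directory limit stated in the policy


def directory_size_bytes(path: str) -> int:
    """Sum the apparent sizes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # skip symbolic links so targets are not counted
            try:
                total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def fraction_of_limit(path: str, limit_bytes: int = HOME_LIMIT_BYTES) -> float:
    """Return the directory's size as a fraction of `limit_bytes`."""
    return directory_size_bytes(path) / limit_bytes


if __name__ == "__main__":
    home = os.path.expanduser("~")
    print(f"{home}: {fraction_of_limit(home):.1%} of the 10 GB home limit")
```

Walking a very large project directory this way can itself generate heavy file system traffic, so it is better suited to home directories or occasional checks.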
Backups
HPE DSI does not perform any backups on any of its clusters’ file systems. If, for example, a student in an active allocation deletes her/his research files, HPE DSI does not have the capacity to restore them.
It is the responsibility of each Allocation PI to keep backups of any important data or files processed or generated on any of HPE DSI’s clusters. These backups must be kept on servers, workstations, or systems of their own, independent of HPE DSI.
HPE DSI’s clusters run on Linux, and do not support any Windows/Mac OS packages.
HPE DSI’s clusters are traditional HPC systems and are not designed to run any virtual machine environments.
Popular applications, software, and libraries, such as C, C++, and Fortran compilers, FFTW, HDF5, Python, RStudio, ML libraries, and many more, have been installed on all of HPE DSI’s clusters. There are some differences in software versions from cluster to cluster.
Allocation PIs are responsible for identifying and purchasing the correct licenses for any software that their groups would like to utilize on any of HPE DSI’s clusters.
Allocation users may request the installation of some packages needed for their research via our ticketing system.
Allocation users have permissions to do local installations in their home directories. This is meant for very specific software that is not relevant for other groups in the cluster, or applications that require routine/frequent compilations.
HPE DSI’s RCDC provides all hardware support and maintenance for the clusters.
HPE DSI and its RCDC provide support to their active Allocation users via our ticketing system.
We provide these types of support:
- Allocation/Account related requests: Details can be found above.
- General help: This is meant to support Allocations’ active members with issues or clarifications related to their Allocations, which may take up a few hours of HPE DSI’s Staff time. This includes software installations.
- Consultation: These are approached on a case-by-case basis, and are meant to study and develop a plan to support UH research groups that would like to use HPE DSI’s clusters/Staff for data science, machine learning, deep learning, HPC, or other fields that may require a significant computational effort.
All queued and running applications on HPE DSI’s clusters are managed by the Slurm scheduler. The scheduler has been set up based on best practices for university-based HPC systems, as well as RCDC’s data on the utilization and traffic of HPE DSI’s clusters. This is intended to ensure the fairest share of resources for Allocations of all sizes.
It is the responsibility of active Allocations’ PIs to supervise the activity of all their Allocation users and to ensure correct utilization. Abuse of any of HPE DSI’s resources may result in the termination of problematic queued or running applications, warnings, investigations, and, in extreme cases, possibly the termination of Accounts or Allocations.
Incorrect usage examples
- Running applications on the login node of any cluster
- Running applications not meant for HPC clusters
- Running applications that request large parts of the clusters, in terms of SUs, submitted jobs, storage, or write operations
- Running interactive sessions that are mostly idle
- Non-UH or foreign collaborators who use a lot of resources, Staff time included
- Requesting Allocations without a genuine research need for them
- Requesting Allocations without a genuine compute need for them, planning instead to use them as archival storage
HPE DSI has been working with UH Main Campus Faculty to purchase compute nodes which may be used to expand HPE DSI’s clusters, based on a model similar to the ‘Condo model’. Please contact us about this via datascience@uh.edu.
The HPE DSI high-performance computing clusters foremost support the research of UH faculty, students, and staff. Access for academic courses can be requested by the instructor of record and is subject to the availability of computational resources. The HPE DSI will evaluate requests and determine availability and resources.
Application for support should be sent via the HPE DSI Academic Support Request Form at least two weeks before the beginning of the session. The request must include the following items:
- Syllabus of the course demonstrating the need for high-performance computing resources and a detailed explanation for the total amount of resources requested.
- A listing of requested software.
If the request is approved, a class roster should be submitted, and class accounts will be created for the length of the requested session and deleted one week after the closing date for the session. 1 TB of project space will be provided in support of an academic class, and students will have 10 GB of space in their home directories for storage. Project directory space will be named after the class, e.g., /project/cosc6365.
The instructor of record must provide an updated class roster at the end of the second week of classes.
The instructor of record must ensure that students will only utilize the class account for class work.
Support for class assignments is not provided by the HPE DSI. Students should request support from the faculty member, TA, or other academic support member(s). The HPE DSI will redirect all support requests from class accounts to the faculty member.
An introductory training session on cluster usage is available but must be requested and scheduled with the HPE DSI before the start of the semester.
Contact Us
Our office is located in Suite 205 on the second floor of the
Durga D. and Sushila Agrawal Engineering Research Building.
Phone: 713.743.9922
Email: datascience@uh.edu