Data Classification in Higher Ed


Colleges and universities were traditionally built on the idea of open communication and collaboration of ideas among students and faculty, as well as with people outside of the campus environment. This collaborative environment is often in direct conflict with the need to properly secure data and protect confidential information, so finding the balance between information security and academic openness can be a challenge.

To make things more difficult, educational institutions hold massive amounts of information. Student applications contain Social Security numbers and other personally identifiable information collected as part of the admissions process. Tuition, housing, and meal costs are paid with student or parent payment cards. Students visiting the campus health center have personal health information stored in medical records. Research campuses hold sensitive research data that may also involve outside partners or government entities. Each type of data comes with its own security requirements under local, state, federal, or contractual obligations (e.g., PCI DSS, FERPA, HIPAA, GDPR, DFARS, and state privacy laws).

Because data is often managed by individuals across multiple systems or applications outside of a standardized records management system, it can be difficult to keep track of the type and amount of sensitive data held by different parties. Unfortunately, we often find sensitive information like SSNs being shared over Google Docs or social networking sites. In many cases, sensitive data can be lurking throughout campus without your knowledge and without adequate protection.

We continue to see data breaches at colleges and universities in the headlines. In 2012, the University of Rhode Island discovered that unencrypted files containing the personal records (including names, SSNs, birth dates, and employment records) of 1,000 current and former faculty members and students were stored on a public server meant for faculty to store and share course information. Because the server was designed to distribute information freely, the university had not implemented appropriate security controls.

How can institutions address this growing challenge? As much as possible, segment and partition campus networks so sensitive information can be adequately protected and other networks can remain open to support collaboration on educational or research projects. But knowing what sensitive data exists and where it is located is easier said than done. It can be difficult to know and monitor where all sensitive data is being stored when so many different individuals and departments are managing their own data.

A thorough data classification process (described below) will allow you to separate sensitive data that may present a higher risk for exposure or breach from less valuable, public information. By determining exactly what information needs to be protected, you can then allocate resources to properly secure those information systems from unauthorized access.

Risk Assessment

Conduct a campus-wide risk assessment to determine what types of information exist on campus. Interview key stakeholders so you can gain a thorough understanding of all the different ways they use data in their daily roles. Document the flow of data into and out of the organization. How does your organization store and share data internally and externally? Do your employees use cloud-based services like Dropbox, Box, or OneDrive? What about mobile devices used by faculty or staff?

You can also deploy software tools to help you automate the discovery process and search for things like payment card information or SSNs. An inventory of all data assets is critical to the due diligence process. Implementing appropriate information governance and security controls for different groups of data is only possible if your institution knows what information needs to be managed in the first place.
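As a rough illustration of what automated discovery involves, the sketch below scans a block of text for strings that look like SSNs or payment card numbers. It is a minimal example under stated assumptions, not a substitute for a commercial discovery tool: the function names and patterns are hypothetical, the SSN check only rejects number ranges that are never issued, and the card check relies on the Luhn checksum, so false positives and misses are both possible.

```python
import re

# Hypothetical patterns for illustration; real discovery tools use far more context.
SSN_PATTERN = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def plausible_ssn(area: str, group: str, serial: str) -> bool:
    """Reject number ranges that are never issued as SSNs."""
    return (
        area not in ("000", "666")
        and area[0] != "9"
        and group != "00"
        and serial != "0000"
    )


def scan_text(text: str) -> list:
    """Return (kind, value) pairs for each suspected SSN or card number."""
    findings = []
    for match in SSN_PATTERN.finditer(text):
        if plausible_ssn(*match.groups()):
            findings.append(("ssn", match.group(0)))
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group(0))
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            findings.append(("card", digits))
    return findings
```

In practice a function like `scan_text` would be run over file shares, mailboxes, and database exports, and every hit would be reviewed by a person before the data is classified or moved.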

Through this process, you should also identify campus systems that present the greatest risk. Typically, these will include enterprise/administrative systems that are hosting large amounts of student and employee information.

Formal Data Classification Policy

Once you know where your data is stored, divide the information into predefined groups that share a common level of risk. Establish a framework for classifying data based on its level of sensitivity, value, and criticality to the organization. Consider the impact to the institution should that data be disclosed, altered, or destroyed without authorization. You may want to refer to Federal Information Processing Standards (FIPS) Publication 199, published by the National Institute of Standards and Technology, to help determine the appropriate categorization of information. Using this risk-based approach, you can determine your data classification categories; three or four categories are typical. Once the most sensitive data is identified, it can be given the highest levels of protection. Lower-priority information can also be safeguarded appropriately.

Some example data categories might include:

  • Restricted Data:
    Data should be classified as Restricted when the unauthorized disclosure, alteration, or destruction of that data could cause a significant level of financial or legal risk, or harm to individuals and/or the institution. Restricted Data includes data protected by state or federal privacy regulations, such as payment card information, Social Security numbers, personal health information, non-directory student records, and employee banking information. The highest level of security controls should be applied to Restricted Data.
  • Confidential Data:
    Confidential data would include sensitive data that, if compromised, could impair operations or expose the institution to potential criminal or civil liability. This might include information that is tied to contracts with vendors or confidentiality agreements, research information, financial information, employee reviews, physical security plans, etc. Data should be classified as Confidential when the unauthorized disclosure, alteration, or destruction of that data could result in a moderate level of risk to the institution or its affiliates. A moderate level of security controls should be applied to Confidential Data.
  • Data for Internal Use:
    This is information that is not meant for public disclosure but does not present significant risk. It is intended to be accessed only by eligible employees for the purposes of institutional business. Examples include organizational charts and faculty tenure recommendations. Reasonable security controls should be applied to Internal Data to ensure that only eligible staff have access to it.
  • Public Data:
    Data should be classified as Public when it can be freely disclosed to the public and the unauthorized disclosure, alteration, or destruction of that data would result in little or no risk to the organization. Examples of Public Data include press releases, marketing materials, course information, and contact information. While few or no controls are required to protect the confidentiality of Public Data, some level of control may be warranted to prevent its unauthorized modification or destruction.
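The example categories above can be tied to the FIPS 199 approach, which rates the potential impact of a loss of confidentiality, integrity, or availability as low, moderate, or high. The sketch below is one possible way to do that, not a prescribed method: it assumes the category follows the highest of the three ratings, and the `DataAsset` fields, the `approved_for_public_release` flag, and the mapping from impact level to category are all hypothetical choices an institution would replace with its own.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    """Potential impact levels named in FIPS 199."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class DataAsset:
    name: str
    confidentiality: Impact
    integrity: Impact
    availability: Impact
    approved_for_public_release: bool = False  # hypothetical flag set by the data owner


def classify(asset: DataAsset) -> str:
    """Map the worst-case impact to one of the example categories (assumed mapping)."""
    worst = max(asset.confidentiality, asset.integrity, asset.availability)
    if worst == Impact.HIGH:
        return "Restricted"
    if worst == Impact.MODERATE:
        return "Confidential"
    if asset.approved_for_public_release:
        return "Public"
    return "Internal Use"
```

Under these assumptions, a file of student SSNs rated high for confidentiality would come out as Restricted, while a low-impact press release flagged for public release would come out as Public.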

Enable Security Controls

Once each data group has been identified, you will determine the corresponding security controls required to protect each group. The level of protection will be driven by various legal, regulatory, academic, and operational requirements, as well as the risk levels associated with the data.

Establish baseline cybersecurity measures (using the PCI DSS or a cybersecurity framework such as NIST SP 800-171 or the CIS Top 20), and define the required controls for each data group to ensure the appropriate security solutions are in place. High-risk data requires more advanced levels of protection, while lower-risk data requires less. Because of your previous efforts to identify where sensitive information is located and to segment those systems from the rest, you (with support from your IT department) should be able to apply appropriate controls where necessary.
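One way to record which controls each data group requires is a simple layered table, where every category inherits the controls of the less sensitive categories beneath it. The sketch below assumes the four example categories from this article; the control names are illustrative placeholders and are not drawn from PCI DSS, NIST SP 800-171, or the CIS controls.

```python
# Categories ordered from least to most sensitive (names taken from the examples above).
LEVELS = ["Public", "Internal Use", "Confidential", "Restricted"]

# Hypothetical controls introduced at each level; higher levels inherit lower ones.
CONTROLS_ADDED_AT_LEVEL = {
    "Public": ["change control on published content", "regular backups"],
    "Internal Use": ["authenticated access"],
    "Confidential": ["role-based access", "encryption in transit"],
    "Restricted": [
        "encryption at rest",
        "multi-factor authentication",
        "access logging and review",
    ],
}


def required_controls(category: str) -> list:
    """Return every control required for a category, including inherited ones."""
    if category not in LEVELS:
        raise ValueError(f"Unknown data category: {category}")
    controls = []
    for level in LEVELS[: LEVELS.index(category) + 1]:
        controls.extend(CONTROLS_ADDED_AT_LEVEL[level])
    return controls


def missing_controls(category: str, implemented: set) -> list:
    """Return the required controls that a system has not yet implemented."""
    return [c for c in required_controls(category) if c not in implemented]
```

Comparing a system's implemented controls against `required_controls` for the most sensitive category it holds gives a short gap list to hand to the IT department.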

Information Security Policy

Your overarching information security policy and related procedures should be well-defined, aligned with your specified data types, and easily interpreted by employees. Clearly outline and define the different data categories and provide real-world examples of what information falls where. Procedures should dictate how each category of data can be used (e.g. Restricted and Confidential Data cannot be shared on consumer cloud platforms). Implement appropriate technical controls and educate users about current threats, best practices, and requirements for keeping data safe.
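A usage rule such as "Restricted and Confidential Data cannot be shared on consumer cloud platforms" can also be written down as a lookup table that tools and staff can consult. The sketch below is a hypothetical illustration: the location names are invented for the example, and the table denies anything it does not explicitly allow, including data that has not been classified yet.

```python
# Hypothetical storage locations and rules; an institution would substitute its own.
ALLOWED_LOCATIONS = {
    "Restricted": {"records_management_system"},
    "Confidential": {"records_management_system", "institution_managed_storage"},
    "Internal Use": {"records_management_system", "institution_managed_storage"},
    "Public": {
        "records_management_system",
        "institution_managed_storage",
        "consumer_cloud",
        "public_website",
    },
}


def sharing_allowed(category: str, location: str) -> bool:
    """Return True only if the policy table explicitly permits the location."""
    return location in ALLOWED_LOCATIONS.get(category, set())
```

The default-deny behavior is a design choice: an unlabeled file is treated as not yet approved for any location until someone classifies it.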

All sensitive data should have an owner, or data steward, who is responsible for appropriately classifying data and protecting the information. They will oversee the data’s lifecycle, monitoring it while it is in use and ensuring proper disposal when it expires. There should also be a well-defined protocol for granting and revoking individual access to sensitive data. All users of this information are then responsible for complying with the defined requirements. Your acceptable use policy must be well communicated, and employees should be trained on the policy before they are allowed access to any sensitive information.
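The grant-and-revoke protocol described above can be sketched in a few lines. This is a simplified model under stated assumptions rather than a real access management system: the `DataSet` record, the `AccessError` exception, and the rule that only the named steward may approve access for users who have completed training are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    name: str
    steward: str  # person accountable for this data
    authorized_users: set = field(default_factory=set)


class AccessError(Exception):
    """Raised when a grant request does not meet the policy."""


def grant_access(dataset: DataSet, user: str, approved_by: str, trained_users: set) -> None:
    """Grant access only with steward approval and completed policy training."""
    if approved_by != dataset.steward:
        raise AccessError(f"{approved_by} is not the steward of {dataset.name}")
    if user not in trained_users:
        raise AccessError(f"{user} has not completed policy training")
    dataset.authorized_users.add(user)


def revoke_access(dataset: DataSet, user: str) -> None:
    """Remove access; revoking a user who has no access is a no-op."""
    dataset.authorized_users.discard(user)
```

A real deployment would also log each grant and revocation so that periodic audits can confirm who had access and when.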

Monitor and Update

Monitor and maintain the organization’s data classification policy. This policy should be dynamic so you can make updates as necessary to meet the changing needs of your organization and staff. Establish a process for conducting periodic audits to determine what data exists and why. Keep the users involved informed of any changes to ensure continued awareness, encourage adoption of best practices for protecting information, and promote a culture of security that prevents the potential compromise of sensitive data.

As we saw last year with the release of the new European General Data Protection Regulation (GDPR) and California’s new data protection requirement, it is becoming more and more critical for organizations to understand what types of information they have and to be able to confirm they are protecting this data. By focusing on your data classification efforts and determining what data is being collected, where sensitive information is being handled, why it is collected, and who has access, you will be better equipped to comply with these new regulations, rather than scrambling to implement new processes and procedures each time a new requirement comes out.

Some additional guidance from the CampusGuard Security Advisor team:

[King]: Data classification allows an organization to adopt formal guidelines for proper use of data based on the amount of risk involved and provides a mechanism to promote use of the guidelines and educate data users across the entire organization. Without a formal system, staff must make decisions on proper handling of data based on their own knowledge. Formalizing data classification provides staff critical guidance and training in protecting data appropriately and consistently across the organization.

[Hobby]: Not all data is created equal. Put simply, in order to protect your sensitive information, you need to know what data you’re trying to protect and where it is. If you know your data protection needs, you can effectively allocate resources to address those needs. If you don’t know those needs, you can’t appropriately address them; if you don’t understand your data, you might not adequately protect your sensitive data or may inappropriately treat all data as highly confidential. Your data classification program is the tool that helps identify your data protection needs, and it positions your organization to focus on protecting higher-risk data assets and to find easier paths to information security and privacy compliance.


About the Author
Katie Johnson

Manager, Operations Support

As the manager of Operations Support, Katie leads the team responsible for supporting and delivering CampusGuard services including online training, vulnerability scanning, and the CampusGuard Central® portal. With over 15 years of experience in information security awareness training, Katie is also the Product Lead for CampusGuard’s online training services. As a Senior Customer Relationship Manager for a limited number of customers, Katie assists organizations with their information security and compliance programs and is responsible for coordinating the various teams involved.