
VulnTool

Discover, Triage, and Remediate CVE Vulnerabilities Across Your Infrastructure

Project Type

End-to-end product design for a Meta internal tool

My Role

Product Designer

Target Users

350 Quarterly Active People

Duration

Ongoing (March 2020 - Present)

Contribution

  • User research

  • Stakeholder interviews

  • Product management

  • Hosted Ideation Sessions

  • High-fidelity prototypes

  • Design

  • Usability Testing

Cover_VT.png

Impact

Remediated 105% more vuln issues

From 117,905 remediated vuln issues in H2 2020 to 242,476 in H1 2021.

Reduced the average remediation time by 57.7%

From 71 days to 30 days.

Median time to triage vulns decreased by 75%

From 51 days in June 2021 to 13 days in Jan 2021.

Overview

Background

Meta’s code isn’t perfect. Without regular patches and updates, our systems are likely to contain vulnerabilities. For example, an application running on an outdated OS version could be breached if it isn’t upgraded to a version that fixes the identified vulnerability.

At Meta, any leaked data damages our brand integrity, so closing these gaps is a high priority.

The “EE Vuln Org Overview” dashboard highlights the large number of vulnerabilities across pillars, business units, and teams.

What is a Vulnerability?

  • CVE, short for Common Vulnerabilities and Exposures, is a list of publicly disclosed computer security flaws.

  • When someone refers to a CVE, they mean a security flaw that has been assigned a CVE ID number.

  • CVE IDs carry CVSS scores that rank how urgently they need to be fixed.

  • Each CVE has a generic remediation solution, usually in the form of a patch upgrade.

  • When a scanned asset is matched with a CVE ID, that match is a vuln.
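The definitions above can be sketched as a small data model. This is only an illustration, not VulnTool's actual schema: the class names, field names, `match_scan_results`, and the placeholder CVE IDs are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CVE:
    cve_id: str            # placeholder IDs below, not real CVEs
    cvss_score: float      # severity score, 0.0 to 10.0
    generic_solution: str  # generic fix, usually a patch upgrade


@dataclass(frozen=True)
class Vuln:
    hostname: str  # the scanned asset
    cve: CVE       # the CVE the asset was matched with


def match_scan_results(scan_results, cve_catalog):
    """Turn (hostname, cve_id) scan hits into Vulns, most severe first."""
    vulns = [
        Vuln(hostname=hostname, cve=cve_catalog[cve_id])
        for hostname, cve_id in scan_results
        if cve_id in cve_catalog  # skip hits with no known CVE entry
    ]
    return sorted(vulns, key=lambda v: v.cve.cvss_score, reverse=True)


# Hypothetical catalog and scan hits, for illustration only.
catalog = {
    "CVE-0000-0001": CVE("CVE-0000-0001", 9.8, "Upgrade the affected package"),
    "CVE-0000-0002": CVE("CVE-0000-0002", 5.3, "Apply the vendor patch"),
}
hits = [
    ("host-a", "CVE-0000-0002"),
    ("host-b", "CVE-0000-0001"),
    ("host-c", "CVE-9999-9999"),  # not in the catalog, so no vuln is created
]
vulns = match_scan_results(hits, catalog)
```

Sorting by CVSS score mirrors the idea that the score ranks which vulns to fix first.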

The Problem with Legacy Vuln Workflow

  • Inefficient Auto Remediation Task (Left)

    • Didn’t match CVEs to hostnames

    • Had to google CVE info and solutions

    • Task Tool statuses are too ambiguous

  • Tracking & Visibility

    • No way to quickly confirm remediation

    • No display of all scanned assets & CVE IDs

    • No lifecycle statuses, e.g. “Fix Applied”

    • Lack of useful vuln-health & remediation charts

  • Navigation

    • No sorting & searching capabilities

    • No groupings of vulns that make sense, e.g. vulns that share the same solution

Auto Remediation Task before VulnTool

Pre-VulnTool Workflow

VulnTool UI When I Started Working at Meta

When I joined the team in March 2020, I was the first product designer in my org, Enterprise Engineering. Until then, a lead engineer had done all the UX/UI work himself, so there was plenty of room for improvement as I integrated my UX practice into our team’s product development process.

Early UX Work and Designs

XFN Ideation Session (Quip Board)

Usability Testing Script

Engineer User Journey

VulnTool - Mid 2020 UI

Mid2020_VulnsTable.png

Vulns Table

1. Status Quick Filters

Users had trouble finding the vulns they needed. We added quick filters here based on lifecycle states so they could drill down more quickly.

2. Entity Groups

Since engineers don’t attack vulns on individual hostnames, we added a column with a selector as header to identify the affected entity groups.

3. Collapsed Navbar

Users complained there wasn’t enough screen space to view the vulns. We collapsed the navbar to give the vulns table more screen real estate.

4. Saved Searches

Because it took users a while to find what they were looking for, we added saved searches that can be applied throughout the tool.

5. Split Views

To limit the number of search results and make vulns more discoverable, we created separate views for in-progress and remediated vulns. We also included a notice, with a view button, when matching vulns exist in the other category.

6. Customized Columns

Since various users requested more metadata in their own order of preference, we added a column

7. Last Scan Date

Users had trouble identifying false positives. We added the last scan date here so users could easily compare it to the last seen at date to make that decision.
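The date comparison described in item 7 can be sketched as a simple check. The rule below is an assumption for illustration (a vuln that the latest scan did not observe again is a candidate for false-positive review), and the function and parameter names are hypothetical rather than taken from VulnTool.

```python
from datetime import date


def needs_false_positive_review(last_seen_at: date, last_scan: date) -> bool:
    """Flag a vuln that the most recent scan did not observe again."""
    return last_seen_at < last_scan


# Hypothetical dates, for illustration only.
stale = needs_false_positive_review(date(2020, 5, 1), date(2020, 6, 1))
fresh = needs_false_positive_review(date(2020, 6, 1), date(2020, 6, 1))
```

Here `stale` is flagged because the scanner ran after the vuln was last seen, while `fresh` was observed in the latest scan and is not flagged.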

Mid2020_Homepage.png

Homepage Dashboard

1. Team Toggle

When entering the tool, users often missed the search bar token for team ownership and unknowingly looked at the wrong vulns. We added this selector to make team ownership clearer.

2. Active CVEs

We found that the vulns table made it difficult to discover all the CVEs and to determine the impact each one had on our infrastructure. We added this widget to communicate the vuln count and surface all CVEs in order of severity.

3. Recent Remediations

Users found it difficult to track remediation progress via the auto remediation tasks outside of VulnTool. We added this widget to surface that information better.

4. Open Tasks


5. Success & Productivity Tracking

We found the original progress charts and graphs were neither readable nor useful. I redesigned them on their own page, emphasizing the severity breakdown of the vulnerabilities.

But Users Were Still Unhappy...

Because I entered this project without much technical knowledge of the space, I had treated many assumptions as facts. Only later did I discover the many pain points our users faced.

I crafted a large UX research plan to dig deeper. I had to be careful and organized in my documentation and presentation to convince the engineer project lead that major changes were needed.

"How can we disable auto-task creation for Vulns? The auto task creation is simply adding addition noise."

"This is not the first time I've wasted hours of my life because you are not using information that you could easily collect."

"How can I group by CVE's and the host it effects? I would rather fix a CVE that goes to every host (that I care about) than fixing CVE's by host."

Time to Redefine the Problem!

Research Planning

I defined a research plan and set out on a mission to reveal why VulnTool was creating more headaches than productivity.

Research Goals

  • Dig deeper into why users aren’t satisfied with VulnTool

  • Learn the needs of various user personas, teams, and dev platforms

  • Uncover the ideal remediation workflow to prevent errors and speed up the process

  • Define an optimized information architecture

Who

18 participants, mix of teams and personas

What

1:1 60 min interviews

When

Oct - Dec 2021

How

Remote Sessions through VC

Research Results

Research Data Points Whiteboard

Portion of Single Interview Synthesis

Research Themes

Reliability

Nexpose Vuln Data

Very generic. Doesn’t consider FB-specific or OS-specific steps.

“The nexpose solution is pointless at best and probably harmful because the manually installed package would not get any further updates.”

False & Unknown Vulns

The tool can’t identify backports & has too many false/unknown CVSS vulns.

“95% of frustration dealing with backports & manually marking false positive.”

Task/Vuln Tool Relationship

It’s difficult to monitor remediation success via tasks due to poor data & complex patch schedules.

“I waste a lot of time when tasks don’t automatically close due to overlaps & gaps in scan cycles of general scanner vs when OS auto-patches.”

Discoverability

Solution Grouping Hides CVE Info

The task tool link view hides vulns & doesn’t communicate the highest priority ones.

“I don’t like the solution view. I want to view the highest CVSS & which hosts the vuln affects. I don’t see the information I need here.”

Active Vulns too Broad

The active vulns view has too many statuses that don’t reflect the true lifecycle of a vuln.

“I want to auto focus on vulns that need attention & in progress, rather than manually filtering out all the false positives.”

Scattered Metadata

Vuln data to make decisions and act on them is indirect & dispersed all over the views/task.

“The task itself doesn’t provide any helpful info. It just says you have this vulnerability.”

Simplicity

Absent Vuln Timeline

There is no aggregated single vuln history that includes vuln published date, status changes, & scanner updates.

“The last scan date is misleading, because it makes it seem like physical host was scanned but it is actually a nexpose run date.”

Labeling of Vulns

Application versus OS issues (and the OS type) aren’t communicated in the tool.

“Engineers attack vulns by OS type & applications. It makes sense to split up by them like that.”

Notifications & Noise

Unimportant vuln tasks and misleading critical CVE references create distress & waste time.

“I am alarmed by the number of active vulns in home dashboard. In reality only a few are validated & in progress.”

Flexibility

Ownership of Entity Groups

Tasks and CVEs are matched to single hostnames that often are not assigned to the right person.

“Tasks are made on oncall rotations; it doesn’t consider vulns that need to be attacked on upstream dependencies.”

Overrides

There is no way to insert or override the correct versions and workarounds in the tool.

“I want to override solutions in bulk to easily communicate proper remediation steps to my team.”

Custom Columns & Saved Searches

The tool needs to be flexible to accommodate different types of users & edge cases.

“It’s too many clicks & filters to find the info I need to see. I want columns for the metadata in the extra vuln and host info.”

Ideation

After the large UX research study, I presented the research deck to a group of users, XFN partners, and core team members, and we ideated on solutions using design thinking techniques.

I also worked closely with the product manager on an “Opportunity -> Solution” tree.

Shift of Focus to Triaging

Triaging = The process of first assessing the validity of a vulnerability, and then assigning it to the right person / team to fix.

Production team's legacy “Triage Decision” table before we integrated them into VulnTool.

We studied their workflow and created a plan to implement a triage process in our tool that caters to both Production & Corp environments.

Engineers need security analysts to complete a triage before they receive a remediation task. That way they can spend their time implementing the fix rather than investigating, which increases both the speed and the number of vulnerabilities we remediate.

Triaging solves...

  1. Decisions are made at scale instead of on individual vulns

  2. Remediation tasks contain only Meta-verified vuln data

  3. Remediation owners & affected entities are correct

  4. There is no reliance on faulty auto remediation task logic

  5. False positives generate fewer notifications and less noise

Triaging Flow Diagram

Triager User Persona

Role

  • Discovers the most critical vulnerabilities

  • Groups vulns that can be remediated together

  • Manually creates remediation tasks for engineers


Frustrations

  • Can’t override vuln data in VulnTool, so auto remediation tasks could have the wrong solution

  • Decentralized communication about remediation across different tools

  • Hard to apply vuln statuses in bulk on right groups

  • Hard to track remediation progress and success for a group of vulns


Goals

  • Facilitate efficient and quick remediation of vulns

  • Catch false positives and exceptions as early as possible

  • Provide engineers with clear and descriptive remediation steps


Needs

  • Triage multiple CVEs in bulk

  • Rescore a CVSS score based on the environmental vector as well

  • Triage only a subset based on entity group type, e.g. OS vulns

  • Create remediation tasks for engineers directly from a triage

Triage Designs

Triage Form Annotations

1. Triage Metadata

Manually entered CVE data that overrides our provider’s generic data. Engineers now get specific, accurate remediation steps for the Facebook context.

2. CVE List

Users are more interested in attacking vulns on a CVE level instead of individual hostnames so we listed them here- providing greater visibility of affected vulns.

3. Triage Statuses

Analysts can now enter a CVE status to mark all affected vulns at scale instead of one hostname at a time. False positives are marked sooner so engineers no longer receive unneeded remediation tasks.

4. Triage By Entity Groups

We added the ability to triage by specific affected entity groups for a single CVE, since different affected hosts could need different solutions, e.g. Windows vs. Linux.

5. CVEs in Triage

Users can group CVEs that belong in the same triage, which makes it easy to create correct remediation data at scale. There is also a “CVE Info” dropdown so users can view CVE data while filling out the form.
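The bulk triage behavior in the annotations above (one decision applied to every affected vuln, optionally limited to certain entity groups) can be sketched as follows. This is a minimal sketch under stated assumptions, not the tool's implementation: the `Vuln` fields, the status strings, and `apply_triage` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vuln:
    hostname: str
    cve_id: str
    entity_group: str               # e.g. "linux" or "windows" (assumed labels)
    status: str = "Needs Triage"    # assumed default lifecycle status
    solution: Optional[str] = None  # verified steps entered by the triager


def apply_triage(vulns, cve_ids, status, solution=None, entity_groups=None):
    """Apply one triage decision to every matching vuln; return how many changed."""
    updated = 0
    for vuln in vulns:
        if vuln.cve_id not in cve_ids:
            continue
        # When entity groups are given, triage only that subset (a sub triage).
        if entity_groups is not None and vuln.entity_group not in entity_groups:
            continue
        vuln.status = status
        if solution is not None:
            vuln.solution = solution  # overrides the generic provider solution
        updated += 1
    return updated


# Hypothetical data, for illustration only.
vulns = [
    Vuln("web-01", "CVE-0000-0001", "linux"),
    Vuln("web-02", "CVE-0000-0001", "linux"),
    Vuln("desk-01", "CVE-0000-0001", "windows"),
    Vuln("db-01", "CVE-0000-0002", "linux"),
]
count = apply_triage(
    vulns,
    cve_ids={"CVE-0000-0001"},
    status="Validated",
    solution="Upgrade the package through the Linux patch channel",
    entity_groups={"linux"},
)
```

In this example only the two Linux hosts are updated. The Windows host keeps its status until it gets its own sub triage, which reflects the point that different entity groups can need different solutions.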

CVE View with Triage Cards Annotations

CVE triage cards annotation.png

1. CVE Cards

Since users were most concerned about vulns at the CVE level, we designed this view so that selecting a CVE in the list dynamically shows its data on the right. Tabs were added to display non-triage CVE data, tasks, and hostnames.

If there is a triage, this section is collapsed by default to highlight the verified triage data below.

2. Affected Entity Groups Sidebar

Because vulns live on entity groups and not individual hostnames, we added this button to display the affected entity groups for each CVE. We kept it as a sidebar so users can also look at it while filling out the triage form.

3. Sub Triage Icon

If a CVE has multiple sub triages (as opposed to a single triage for the whole CVE), we included this split icon for easy recognition. Triage cards that are not a sub triage don’t have this icon.

4. Triage Cards

Each triage receives its own card under the CVE data to distinguish it as verified data. We added triage-specific fields such as the CVSS score, triager, and number of entities. We also added collapsed states so the full list of sub triages can be seen at a glance.

5. Triage Activity

Because vulns often generate a lot of discussion, we added a section for users to add comments within the triage activity, where remediation tasks are also tracked.

6. Other Entities Card

If users wish to triage more entities than those already selected in a sub triage, they can click here.

7. Create a Task Button

We added the ability for triagers to create remediation tasks for engineers directly from the triage card. Now they don’t have to create tasks manually on their own, and all data in the task will be valid.

Triaging Was a Win!

Remediated 105% more vuln issues

From 117,905 remediated vuln issues in H2 2020 to 242,476 in H1 2021.

Reduced the average remediation time by 57.7%

From 71 days to 30 days.

Median time to triage vulns decreased by 75%

From 51 days in June 2021 to 13 days in Jan 2021.

EE Vuln oncall detected 171% more false positives

VulnTool "Group By" Redesign

The Problem

Previously, the tool lacked dynamic groupings that would make it easier to discover and triage vulns belonging in the same remediation task. The active vulns table only displayed individual scan results, and a single triage could include hundreds of them. This resulted in endless scrolling, which was a huge pain point.

Without dynamic groupings that reflect true remediation efforts, like grouping vulns by operating system, OS platform, and asset type, it was nearly impossible to track remediation progress.
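As a rough illustration of what dynamic grouping means here, the sketch below collapses individual scan results into groups and reports remediation progress per group. The record fields, the status strings, and both function names are assumptions for the example, not VulnTool's data model.

```python
from collections import defaultdict


def group_vulns(vulns, keys):
    """Group vuln records by the given fields, e.g. ("cve_id", "os")."""
    groups = defaultdict(list)
    for vuln in vulns:
        groups[tuple(vuln[key] for key in keys)].append(vuln)
    return dict(groups)


def remediation_progress(group):
    """Fraction of vulns in a group that are already remediated."""
    fixed = sum(1 for vuln in group if vuln["status"] == "Remediated")
    return fixed / len(group)


# Hypothetical scan results, for illustration only.
vulns = [
    {"hostname": "host-a", "cve_id": "CVE-0000-0001", "os": "linux", "status": "Remediated"},
    {"hostname": "host-b", "cve_id": "CVE-0000-0001", "os": "linux", "status": "In Progress"},
    {"hostname": "host-c", "cve_id": "CVE-0000-0001", "os": "windows", "status": "In Progress"},
    {"hostname": "host-d", "cve_id": "CVE-0000-0002", "os": "linux", "status": "Remediated"},
]
groups = group_vulns(vulns, ("cve_id", "os"))
progress = {key: remediation_progress(group) for key, group in groups.items()}
```

Four line items become three groups, each with its own progress figure. Changing `keys` regroups the same data by another dimension, such as asset type.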

Active Vulns Table With Too Many Line Items

User Stories, Design Criteria, & Inspiration Audit

Ideation Session

Based on the problem statements the PM and I identified through previous research, and the user personas I defined, I hosted an ideation session with the core team.

We generated many ideas and then categorized them by feasibility into this MVP, next-year improvements, and long-term vision.

Low-Fi Prototypes & A/B Testing

Option 1

Option 2

Final Designs
