Creating content types
A primary driver of the new CDC.gov was the need for a streamlined, standardized site. CDC has a wide breadth of content for very different audiences, and until this work began, programs had complete control over their content, including its web presentation. Standardizing content meant exerting more strategic control through content types.
Dividing by audience | Visualizing trends | Defining the types
Dividing by audience
Simply put, CDC.gov content was all over the place. The decentralized nature of content publishing meant even somewhat similar content across topics was presented in entirely different ways. Very little consistency existed across site structures, URLs, page names, etc.
Separating content into three major audience-based buckets let me focus on the general public subset of content.
General public (GP) - The smallest portion of CDC content but the most widely accessible. This content includes basic education on a topic, how to prevent it, and how to care for yourself and others.
Healthcare professionals (HCP) - This content is geared towards doctors, nurses, and lab techs. It’s much more technical than general public content, but it often mirrors it.
Ex: Clinical Symptoms of COVID-19 (HCP) vs Symptoms of COVID-19 (GP)
Public health professionals (PHP) - This content addresses public health departments, health organizations, media, politicians, and data analysts, among many others.
GP and HCP symptoms pages for Tuberculosis.
Visualizing trends
My first goal was to quickly identify large trends. Looking at the most visited general public sites, I analyzed page names, URLs, and HTML tagging for the presence of topic words. This Yes/No validation let me quickly group pages that addressed specific topics.
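The Yes/No check described above can be sketched in a few lines of Python. This is only an illustration, not the original tooling (that work was done in R): the page records, field names, and the helpers `topic_flags` and `pages_mentioning` are all hypothetical.

```python
import re

# Hypothetical page records; real data would come from a crawl or analytics export.
PAGES = [
    {"url": "/topic-a/symptoms/", "title": "Symptoms of Topic A", "h1": "Signs and Symptoms"},
    {"url": "/topic-b/about/", "title": "About Topic B", "h1": "About Topic B"},
    {"url": "/topic-c/signs-symptoms/", "title": "Topic C", "h1": "Topic C"},
]

FIELDS = ("url", "title", "h1")


def topic_flags(page, topic_word):
    """Return a Yes/No flag per field showing whether the topic word appears in it."""
    pattern = re.compile(re.escape(topic_word), re.IGNORECASE)
    return {field: "Yes" if pattern.search(page[field]) else "No" for field in FIELDS}


def pages_mentioning(pages, topic_word):
    """Return the URLs of pages where at least one field is flagged Yes."""
    return [p["url"] for p in pages if "Yes" in topic_flags(p, topic_word).values()]


if __name__ == "__main__":
    for page in PAGES:
        print(page["url"], topic_flags(page, "symptom"))
```

Keeping one flag per field, rather than a single flag per page, makes it easy to see where a topic is mentioned, for example in the URL but not in the H1.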
I looked for trends in where content was mentioned. For example, looking at use of ‘symptom’ in H1 page titles revealed a few trends:
Dedicated pages
Symptoms
Signs and symptoms
Multi-topic pages
Symptoms, diagnosis, and treatment
Symptoms, risk, and recovery
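A rough way to separate dedicated pages from multi-topic pages is to split each H1 on commas and "and", then count the distinct topics. The sketch below assumes that "signs" belongs to the symptoms topic; the function names and that rule are illustrative choices, not the method used in the original analysis.

```python
import re


def normalize_topic(part):
    """Map a title fragment to a topic label (assumed rule: 'signs' counts as symptoms)."""
    word = part.strip().lower()
    if "symptom" in word or word == "signs":
        return "symptoms"
    return word


def topics_in_title(title):
    """Split a title on commas and the word 'and', returning distinct topics in order."""
    topics = []
    for part in re.split(r",|\band\b", title, flags=re.IGNORECASE):
        topic = normalize_topic(part)
        if topic and topic not in topics:
            topics.append(topic)
    return topics


def classify_title(title):
    """Label a title as 'dedicated', 'multi-topic', or 'no symptom topic'."""
    topics = topics_in_title(title)
    if "symptoms" not in topics:
        return "no symptom topic"
    return "dedicated" if len(topics) == 1 else "multi-topic"


if __name__ == "__main__":
    for h1 in ["Symptoms", "Signs and symptoms",
               "Symptoms, diagnosis, and treatment", "Symptoms, risk, and recovery"]:
        print(f"{h1!r}: {classify_title(h1)}")
```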
The automated analysis was an excellent start but limited due to the varying quality and volume of content. Page titles and headers did not always accurately represent what was on the page. Because of this, I validated the Yes/No insights with manual page reviews.
I collected screenshots and started color blocking topics. This let me see other topics addressed on the page, such as complications. Most importantly, though, this gave me insight into how symptoms were being discussed. Most topics had general symptoms sections, but many also needed to discuss them within a specific context. For example, Rabies emphasizes the stages and timelines of symptoms, whereas Hand, Foot, and Mouth Disease calls out symptoms by body part.
Excerpt from report showing trends for page titles including the word “symptom.”
Defining the types
I turned my visual representations into spreadsheets where I started building content types (CT). Due to business and technical requirements, content types were defined at the page level. For each content type page, I defined and ordered the H2 sections; the Symptoms CT is one example. Ideally, I would have created more granular content types that addressed symptoms down to the timeline or body part.
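A page-level content type with ordered H2 sections can be modeled as a small data structure, which also makes it possible to check a page against it. This is a minimal sketch: only the complications section is confirmed as part of the Symptoms CT, and the other section names, `ContentType`, and `check_page` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentType:
    """A page-level content type: a name plus its H2 sections in the required order."""
    name: str
    h2_sections: tuple


# Hypothetical section list; the real Symptoms CT lived in a spreadsheet.
SYMPTOMS_CT = ContentType(
    name="Symptoms",
    h2_sections=("Signs and symptoms", "Complications", "When to seek care"),
)


def check_page(content_type, page_h2s):
    """Report H2s the content type does not define and whether known H2s are in order."""
    allowed = content_type.h2_sections
    unexpected = [h for h in page_h2s if h not in allowed]
    known = [h for h in page_h2s if h in allowed]
    in_order = known == sorted(known, key=allowed.index)
    return {"unexpected": unexpected, "in_order": in_order}


if __name__ == "__main__":
    print(check_page(SYMPTOMS_CT, ["Complications", "Signs and symptoms", "History"]))
```

Sections are treated as optional here: a page may omit a section, but the ones it includes must follow the defined order.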
Multi-topic pages
My initial research showed instances where topics were combined into a single page, such as Symptoms, Diagnosis, and Treatment. Across these pages, the amount of content per topic varied. A page could be primarily diagnosis content with only a few sentences about symptoms. I developed two strategies to account for these situations:
Disease Basics content type
This CT served as an intro or a one-stop shop for information, depending on the quantity of content.
For large sites, this page provided overviews and key information for each topic. Each section would link to a more in-depth page.
For small sites, this was potentially the only page a site needed. If one or two topics had enough content to warrant their own page, the Disease Basics section would summarize and link out to those pages.
Crosslink sections
In some cases, a CT needed to include a section that had its own CT. The Symptoms CT included a complications section. Minimal complication information could remain on Symptoms. If there was more information, it would have its own page and Symptoms would crosslink to it.
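The crosslink rule amounts to a simple decision: keep a short section inline, or move a long one to its own page and link to it. The sketch below uses a word-count cutoff purely as an assumption; the original work does not state how "minimal" was measured, and `CROSSLINK_WORD_THRESHOLD` and `place_section` are hypothetical names.

```python
# Assumed cutoff for illustration only; the real decision was an editorial judgment.
CROSSLINK_WORD_THRESHOLD = 150


def place_section(section_name, text, threshold=CROSSLINK_WORD_THRESHOLD):
    """Decide whether a section stays on the parent page or gets its own page."""
    word_count = len(text.split())
    if word_count <= threshold:
        placement = "inline"
    else:
        placement = "own page with crosslink"
    return {"section": section_name, "words": word_count, "placement": placement}


if __name__ == "__main__":
    short_text = "Complications are rare. " * 5
    long_text = "Complications can include several conditions. " * 60
    print(place_section("Complications", short_text))
    print(place_section("Complications", long_text))
```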
First draft of a Symptoms content type.
Hopes and dreams
If I were able to do it again, I would use AI to do a more in-depth analysis of existing content. I worked with a developer to automate things through R when possible, but I still had to do a lot of manual analysis. Instead of analyzing a subset of content, I could have looked at more content and identified some of the smaller, more nuanced trends. I also would have liked to use AI to run a competitive analysis on sites such as WebMD or Cleveland Clinic.