Crafting Analytical Summaries with Chat GPT

Introduction
I would venture a guess that every analyst has at some point in their career received the task of creating a weekly, monthly, or quarterly recurring report that summarizes performance. These reports tend to have very similar topline summaries (e.g. “traffic rose x% month over month, video views fell y% month over month”) which can be tedious to write and can be error-prone given the number of data queries involved. When writing the summary you may get Slacked with an urgent question or request as you’re typing your summary statement. When you come back to finish your report, it is easy to input the wrong figure or percentage change which can be hard to catch before sending the report out. As someone who has fallen victim to these kinds of errors in the past, I set out to avoid such a fate with the help of spreadsheets, ChatGPT, and lots of coffee.
With the hope of automating a somewhat tedious summary-writing process (in this case, one summarizing traffic to HuffPost.com), I dove into the anatomy of the report which consisted of:
- Topline summaries (overall movements and source breakdowns)
- Highlights on the biggest reasons for change (good or bad)
- Deep dives into each data source commenting on what drove changes in traffic using third-party data where available
- A call out if the most recent month was one of the highest over the past 6 months
Scoping the Ask
Initially, I tried to automate just the topline summaries by pulling data from our warehouse in BigQuery, putting the results into a spreadsheet, and using formulas to generate readable, labeled text. For example, for the topline summary, I used the following formula:
="HuffPost had "&TEXT(B6/1000000,"#.#")&"M cross-platform PVS in "&TEXT(B3,"MMMM")&". Total cross-platform PVs across O&O, Apple, Yahoo and MSN were "&IF(H6>0," up "," down ")&TEXT($H$6,"##.#%")&" compared to the previous 3MA."
I used similar formulas to “write out” topline summaries and topline referrer data (e.g., “Social PVs grew to XX.XM (+Y%) driven by Facebook (+Z%), Twitter (+W%) and Reddit (+Q%), coming in at +X% below the 3MA”) where I had each value pre-calculated in a table in the spreadsheet to easily reference.
After this formula build-out was complete, I was satisfied with the results! I no longer needed to do the tedious task of typing out how much traffic went up or down for various cuts. That being said, I still had to dig into why we saw the traffic metrics shift, by identifying traffic correlations from additional datasets. This was a step in the right direction.
ChatGPT to the Rescue
After building out the spreadsheet that performed the topline summaries, I joined forces with colleagues on BuzzFeed’s Analytics squad (shout out Anna McGrady and John Zhou🙂) during a hackathon. We took on the task of building traffic summaries using ChatGPT by feeding it the data table that I sourced my spreadsheet summary logic values from (like the one below) while describing what each of the columns were and asking it to produce a written summary for each row in the prompt.

The summary result was a success; however, the chatbot tended to format the summary in whatever way it saw fit. As such, getting a reproducible and reliable summary given the time constraints of the hackathon wasn’t doable. As a proof of concept, this was absolutely a success.
After the hackathon was over, I was encouraged to carry on the project with the goal of being able to get a usable working traffic summary. I decided that the topline summaries I had put together using the spreadsheet formulas were good to use and that I could use ChatGPT to write the deep dive summaries for each platform (Facebook, Instagram, Google) by giving it key data points and having it do some minor analysis. Ultimately I could use ChatGPT to “weave together” the generated deep-dive summaries and my spreadsheet-generated topline summaries into one cohesive report.
To centralize this project I built a script in Python to pull the structured data and summaries I had in the spreadsheet, make new calls to BiqQuery for deep-dive data, and call ChatGPT’s API to summarize the above, then finally write the output to a CSV file. First, I pulled in all of the summaries from the spreadsheet that were written with formulas. Next, I built SQL queries to pull what I had decided were key data points that could be used to inform the “deep dive” analysis for each platform. Lastly, I crafted a ChatGPT prompt that would produce an analysis of these datasets. For search and social metrics, I pulled impressions, clicks, and traffic data along with the top 20 best articles month-over-month (MoM). For any platform where first-party data was scant, I pulled in average article views MoM and top 20 article data. Additionally, I used a formula in the spreadsheet to rank each platform’s monthly traffic to call out if that month was in the top 3 highest performing for traffic over the past 6 months.
Deep Dive on Prompt Engineering
The initial prompts I made were very broad. I was essentially asking ChatGPT to write the entire summary with a single prompt fed to it without any context, a system role, or an explanation of the dataset I would give it. ChatGPT was then left to decide on its own how to fit the information into a summary like so:
USE THE BELOW FORMAT for each brand EXACTLY TO SUMMARIZE the DATA. Column MoM means month-over-month change and Mo3MA means month over the previous 3-month averages. ONLY PROVIDE ANALYSIS FOR THE MOST RECENT PERIOD OF TIME:
Based on the provided data, here is the monthly website traffic summary for BuzzFeed Inc.:
BuzzFeed:
- In June 2023, BuzzFeed had 500,000,000 pageviews, which is an increase of 6.0% compared to May 2023.
- Apple News brought 300,000,000 pageviews in June 2023, which was up +2.2% compared to May 2023.
- 17% of O&O PVs in June 2023 came from the homepage, which was up +0.7% compared to May 2023.
- O&O UVs and O&O PVs were down 4% and 6% compared to May 2023 and the 3-month average, and PVs per UV was down -1.1% compared to May 2023.
After noticing that the previous output was not continuously producing reliable and repeatable results and that the summarization was too broad, I decided to break down the summarization by traffic source. ChatGPT was then given this system role for each traffic source dataset: “You are an analyst providing a BRIEF summary of [insert traffic source] traffic movement for the most recent month provided.” Next, each relevant dataset was added. A final prompt was constructed and looked like the following example:
“Please provide a summary EXACTLY LIKE the following example only focusing on the most recent month over month trend for the most recent month provided: ‘Instagram traffic in September grew/fell by INSERT ‘Month Over Month # Difference’ visits FROM THE TOPLINE MONTH OVER MONTH SUMMARY (INSERT MoM growth %).[INSERT statement that calls out if this is one of the best months of traffic if not ‘EXCLUDE’]. Potentially driven by the higher/lower total engagement/number of posts with ‘Link in Bio’ in (INSERT METRIC OR METRICS THAT MOST NUMERICALLY FOLLOW THE TREND IN VISITS FROM MOST RECENT MONTH) v (INSERT METRIC OR METRICS THAT MOST NUMERICALLY FOLLOW THE TREND IN VISITS FROM PAST MONTH).”
I found that being very precise about the format and structure of the response yielded the best and most repeatable results. Previously, I had asked ChatGPT to both summarize several data points and explain where to look for the most recent period: this was too much for ChatGPT to consistently analyze. In the new iteration, I was also very careful to only send the data it absolutely needed, since previous attempts showed that giving too many months of data or too much information yielded unpredictable results.
For the third-party platforms with more granular data, the approach to the system role was exactly the same as previously described, but the final prompt for each was tailored to each platform’s respective data points that typically would correlate best to growth/decline. The following is a Facebook prompt example:
Please provide a summary EXACTLY LIKE the following example only focusing on the most recent month over month trend for the most recent month provided: ‘Facebook traffic in September grew/fell by INSERT ‘Month Over Month # Difference’ visits FROM THE TOPLINE MONTH OVER MONTH SUMMARY (INSERT MoM growth %). Potentially driven by the higher/lower total link clicks/impressions/number of posts with links in (INSERT METRIC OR METRICS THAT MOST NUMERICALLY FOLLOW THE TREND IN VISITS FROM MOST RECENT MONTH) v (INSERT METRIC OR METRICS THAT MOST NUMERICALLY FOLLOW THE TREND IN VISITS FROM PAST MONTH).
All Together Now!
Now that we have all of our topline summaries from our spreadsheet formulas and our deep-dive summaries for each of the respective platforms, we are ready to “weave” together our final summary. For this step, I gave chatGPT this system role:
“You are an analyst taking and using EXACTLY the summaries provided and structuring them in a VERY SPECIFIC FORMAT that will be provided.”
Each overall summary statement is provided along with the deep dives stating specifically what they are (e.g., “This is the search topline summary”, “This is the SmartNews (syndicated platform) deep-dive statement”, etc.).
Finally, I fed it a prompt with the final structure I wanted:
Summary
- [INSERT TOPLINE SUMMARY STATEMENT]
- [INSERT THE STATEMENT SUMMARIZING THE TOP TWO SOURCES]
- [INSERT THE FULL DEEP DIVE SUMMARIES FOR THE FIRST SOURCE INCLUDED IN THE TOP TWO SOURCES THAT DROVE THE INCREASE OR DECLINE]
- [INSERT THE FULL DEEP DIVE SUMMARIES FOR THE SECOND SOURCE INCLUDED IN THE TOP TWO SOURCES THAT DROVE THE INCREASE OR DECLINE]
- [INSERT ARTICLE OUTPUT CHANGE SUMMARY]
Referrer Data
- [INSERT APPLE NEWS TOPLINE SUMMARY]
- [INSERT THE FULL APPLE NEWS DEEPDIVE SUMMARY UNLESS IT IS ONE OF THE TOP SOURCES ABOVE THEN EXCLUDE]
- [INSERT SOCIAL TOPLINE SUMMARY]
- [INSERT THE FULL DEEPDIVE SUMMARIES FOR FACEBOOK, INSTAGRAM AND TWITTER UNLESS SOCIAL IS ONE OF THE TOP SOURCES ABOVE THEN EXCLUDE]
- [INSERT SEARCH TOPLINE SUMMARY]
- [INSERT THE FULL GOOGLE SEARCH DEEPDIVE UNLESS SEARCH IS ONE OF THE TOP SOURCES ABOVE THEN EXCLUDE]
- [INSERT SYNDICATED PLATFORM TOPLINE SUMMARY]
- [INSERT THE FULL DEEPDIVES FOR SMARTNEWS AND NEWSBREAK]
- [INSERT HOMEPAGE TOPLINE SUMMARY]
Based on previous experience creating individual website, and third-party platform summaries, the specificity of what I wanted the output to look like provided the acceptable format I wanted for the report. Here’s a sample of the output (without actual figures):
Topline Summary:
- HuffPost had 330M cross-platform PVS in January. Total cross-platform PVs across O&O, Apple, Yahoo, and MSN were up .x% compared to the previous 3MA.
Source Deep-dive Examples
- Twitter traffic in January 2024 grew by x visits from December 2023, a y% month over month growth. This growth could potentially be driven by the higher total impressions and number of posts with links in January (x impressions from y posts) compared to December (z impressions from a posts).
- Google Search traffic in January 2024 grew by x visits from December 2023, a growth of y%. This was the best month for the past 12 months for Google Search. This could be potentially driven by the higher total impressions and number of clicks in ‘web’ search type which showed an increase from y impressions and z clicks in December 2023 to a impressions and b clicks in January 2024.
Conclusions and Learnings
While the above results certainly aren’t perfect, they are a huge step in the right direction in terms of saving time and preventing errors.
A few learnings from this process:
- Automating summarization does not replace an analyst’s work. The output above should be QAed, but now the majority of an analyst’s time can be spent on analysis, rather than a manual laborious task. Our analysts can take the output of this automated script, and spend time discussing reasons why a trend exists with editorial stakeholders or dig deeper into a trend in a way that this automation may not consider.
- Determine the needs of the final consumer to inform how you build your prompt. Before trying to automate any summary, it is important to manually produce the summaries with the stakeholders’ needs in mind. Take the time to meet and chat with the end users about how they use the report and what they want out of it. Their insight will be invaluable to having the best possible outcome. Once inclusions are decided, it’s like putting together a puzzle with the final prompt.
- Specificity helps immensely! To get predictable results, it is important to limit ChatGPT strictly to the information it needs to complete a given task. With too much information it makes mistakes and broad assumptions about what you want
- Break up sections into distinct statements to figure out what inputs are required for each. Instead of having a single long summary, break it down into sections: a topline summary, summaries by source, and deep dives for every traffic source. This increases the chance of a predictable result.
- Giving ChatGPT a system role is essential! ChatGPT System Role prompts (“you are an analyst”) are key to reinforcing and guiding ChatGPT to the result you ultimately want back. They act as guidelines and give ChatGPT a purpose that it acts on when you ultimately give it your final summary prompt.