Analytics Frameworks Every Data Scientist Should Know
Why I believe my experience at McKinsey made me a better data scientist
Unlike a lot of data scientists in tech, I started my data science career in consulting, and I think it’s the best career move I have made. Say what you will about consulting culture and the hours, but I learned so much in the two years I was at McKinsey, and I still benefit from it every day.
As a manager, part of my job is to coach data scientists on the team when it comes to projects and career growth in general. I realized what junior data scientists struggle with the most is usually not the technical/execution part of the job — that’s the easy to teach/easy to learn part.
It’s usually the more abstract/soft-skill-related part of the job that most people don’t know how to navigate — things like how to break down an abstract business problem into smaller, clearly defined analyses that can eventually lead to concrete business impact.
These are the things I got to practice day in, day out as a consultant, and I think the learnings carry over to data science very well.
To help my fellow data scientists, I want to summarize my learnings from my consulting days so you can benefit from them without going through the grind.
In this article, I will:
Show you why consulting-style training can immensely benefit data scientists of any level
Walk you in detail through the most valuable frameworks I learned at McKinsey that you can apply in your day-to-day work
Why I think consulting experience can benefit every data scientist
Reason 1: It grows your ability to learn about an area and make impact fast.
Because consulting projects are always strapped for time (consultants are billed by the hour), consultants don’t have the luxury of spending months learning all the background, context and client-specific subject matter in depth before being pressured to deliver solutions.
So consultants are trained to learn about an area efficiently and make impact along the way. There are a lot of skills involved in this process:
Asking the right questions to the right people; what is the information you really need to know to solve the problem?
Discovering gaps and figuring out short-term solutions to plug the gap
Turning short-term solutions into long-term ones and identifying the right stakeholders to help push things forward.
These skills are essential because while “great things take time”, people still expect you to deliver incremental “good things” on the way to delivering great things.
Delivering the (albeit not perfect) MVP in a short timeframe that makes a big (enough) impact is the goal. This forces consultants to constantly deploy the 80/20 rule:
With 20% of the effort you can get 80% of the impact.
There are two reasons for this:
Every activity quickly hits diminishing returns. For example, building the first version of a dashboard when the team previously had no way of looking at data has a huge impact. As you keep refining the dashboard, you quickly get to the point where a new filter or toggle only adds a small benefit for individual users.
You don’t need perfect accuracy to make most decisions. For example, to make a “Go” or “No-Go” decision, you often just need to know whether a number will land in a certain ballpark range (e.g. $150M–$200M), not what the exact number will be.
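The second point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name go_no_go and the $100M threshold are made up for the example, and only the $150M–$200M range comes from the text above.

```python
def go_no_go(low: float, high: float, threshold: float) -> str:
    """Decide from a ballpark range whether more precision is even needed.

    low/high: rough bounds of the estimate; threshold: hypothetical minimum
    value that justifies a "Go". All figures share one unit (here $M).
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if low >= threshold:
        return "Go"  # even the pessimistic end clears the bar
    if high < threshold:
        return "No-Go"  # even the optimistic end falls short
    return "Refine estimate"  # the range straddles the bar, so precision matters


# Range from the text ($150M to $200M) against a hypothetical $100M bar.
print(go_no_go(150, 200, threshold=100))
```

Only when the range straddles the threshold is it worth spending more effort on a tighter estimate.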
This will be completely out of the comfort zone of a lot of engineers and data scientists. But remember that at the end of the day we are trying to run a business, not write a white paper; it’s crucial to build this mental flexibility.
Reason 2: It teaches you to be a full-stack data scientist when needed.
Most consulting teams are not staffed with data engineers + data scientists + ML engineers + PMs + biz ops.
So if you need raw data cleaned and built into pipelines to run an analysis, you don’t have data engineers to count on; you have to pick it up yourself. If you need to estimate the impact of an initiative, you put on your biz ops hat and do the back-of-the-envelope calculation yourself.
Most data scientists in consulting are full-stack data scientists; they can wear different hats when needed. For one project, we built an end-to-end A/B test analytics solution in a month for an e-commerce company; for another project, we built a prediction model for movies’ box office performance in a few weeks so the team could make fast decisions about which screenplays to fund.
Are these solutions as good as a proper SaaS product from a big tech company? Definitely not.
But are they good enough to solve a business problem in a timely manner? Almost certainly.
And most importantly, I learned a ton about how to deliver solutions end-to-end wearing different hats. This makes my collaboration nowadays with partner teams like ML engineers, data engineers and PMs that much smoother because I know a little about their jobs.
Reason 3: It “forces” you to see the big picture and communicate in a succinct but clear way (both written and verbal communication).
While in bigger tech companies you won’t get a lot of visibility in front of executives as a junior IC, consultants can usually bypass the hierarchy and get the opportunity to work directly with C-level executives.
These executives usually have a million things on their plates and they are constantly context switching — so communication with them needs to be high level and effective.
Junior ICs in the analytics realm usually struggle with this type of communication because the analytics work itself requires you to be deep in the weeds, and it’s hard to then zoom out when communicating your findings.
Consultants spend hours honing every client-facing deck and document to nail this skill. Every piece of material goes through peer review, manager review and review from partners so people at different levels of familiarity with the project can help judge whether the material is easy to digest.
Compared to junior ICs in the big tech companies who may only get a handful of opportunities per year to practice this type of communication, consultants’ frequent communication with execs forces them to practice and master the ability to not only see but also communicate the big picture.
McKinsey’s “Secret Sauce”: The most useful consulting frameworks and how to apply them to analytics
But not everyone will get the chance to work in consulting, or be interested in doing so, during their career.
So let me introduce you to some of the most important frameworks that can help you get the essence of the consulting way of thinking without going through the grind.
MECE framework
If there’s one thing you take away from this article, it should be this framework.
This is the first framework new consultants are introduced to and it’s a framework that can help you think through just about any problem in life.
MECE stands for mutually exclusive and collectively exhaustive.
It’s a way to break down something big or abstract into smaller buckets to make the problem more approachable while making sure that there’s no duplicated effort (“mutually exclusive”) or missed area (“collectively exhaustive”) in the process.
Let me give an example: Let’s say you want to segment the member base of Instagram by age. It’s not MECE to have groups like “underage (<18)” and “teenagers (13–19)” because they overlap: anyone aged 13 to 17 falls into both.
Similarly, segmenting the data into groups like “underage (<18)” and “age 18–60” is not MECE; while there is no overlap, it doesn’t cover the whole universe of possibilities. What about our friendly senior citizens above 60?
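Both failure modes can be checked mechanically. Below is a minimal Python sketch, assuming age segments are written as half-open integer ranges; the function name check_mece, the 0 to 120 age universe and the “over 60” label are assumptions for illustration, not part of any library.

```python
from typing import List, Tuple

# (label, min_age inclusive, max_age exclusive)
Segment = Tuple[str, int, int]


def check_mece(segments: List[Segment], universe: Tuple[int, int]) -> Tuple[bool, bool]:
    """Return (mutually_exclusive, collectively_exhaustive) for age ranges."""
    lo, hi = universe
    ordered = sorted(segments, key=lambda s: s[1])

    # Mutually exclusive: no segment starts before the previous one ends.
    exclusive = all(prev[2] <= cur[1] for prev, cur in zip(ordered, ordered[1:]))

    # Collectively exhaustive: walking left to right leaves no age uncovered.
    covered_up_to = lo
    exhaustive = True
    for _, start, end in ordered:
        if start > covered_up_to:
            exhaustive = False  # gap between covered_up_to and start
            break
        covered_up_to = max(covered_up_to, end)
    exhaustive = exhaustive and covered_up_to >= hi
    return exclusive, exhaustive


AGES = (0, 120)  # hypothetical universe of possible ages

overlapping = [("underage", 0, 18), ("teenagers", 13, 20)]
with_gap = [("underage", 0, 18), ("age 18-60", 18, 61)]
mece = [("underage", 0, 18), ("age 18-60", 18, 61), ("over 60", 61, 120)]

print(check_mece(overlapping, AGES))  # overlap, and ages 20+ are missing
print(check_mece(with_gap, AGES))     # no overlap, but ages 61+ are missing
print(check_mece(mece, AGES))         # no overlap, full coverage
```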
Not all problems can be neatly broken down in a MECE way; but if you make a concerted effort to solve problems with this framework, your approach will be a lot more comprehensive.
MECE is the foundation of most other frameworks; in other words, any good framework that involves any segmentation or categorization should be mutually exclusive and collectively exhaustive.
Issue Trees
An issue tree is the best embodiment of the MECE framework. It is commonly used to decompose a complicated problem and show how different factors contribute to the whole.
For example, the question “How can an e-commerce company increase its profit?” seems like a daunting one without a framework. A lot of people start to brainstorm without any structure:
“Let’s do more marketing on social media”, or
“Let’s move our factories to lower-cost countries”
These might be solid ideas, but where did they come from and how can you make sure you explored all the levers at your disposal? The truth is if you brainstorm like this, you will very likely only explore a couple of options and miss many others.
But if you can break down the problem more systematically, you can make sure you explore all avenues and it can even help you delegate since each sub-tree is a smaller-scale problem that can be solved individually.
Let me show you how:
The beauty of an issue tree is that it breaks down a huge problem into smaller ones that are easier to digest: thinking about ways to “increase product variety” is a lot more concrete than thinking about ways to “increase profit” directly.
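An issue tree also maps naturally onto a small tree data structure. The Python sketch below is illustrative only: the Node class and the specific branches are assumptions made up for the example (apart from “increase product variety”, which comes from the text), built on the identity that profit equals revenue minus costs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node, path: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return the full path from the root to every leaf of the tree."""
    path = path + (node.label,)
    if not node.children:
        return [path]
    result = []
    for child in node.children:
        result.extend(leaves(child, path))
    return result


# Hypothetical breakdown: profit = revenue - costs.
tree = Node("Increase profit", [
    Node("Increase revenue", [
        Node("Increase number of orders", [
            Node("Increase product variety"),
        ]),
        Node("Increase average order value"),
    ]),
    Node("Reduce costs", [
        Node("Reduce fixed costs"),
        Node("Reduce variable costs"),
    ]),
])

# Each leaf is a smaller problem that can be delegated and solved on its own.
for leaf_path in leaves(tree):
    print(" > ".join(leaf_path))
```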
Hypothesis Tree
A typical DS interview question is “XX metric is down, how would you go about investigating what’s causing it?” A lot of candidates again start grabbing hypotheses out of thin air. What interviewers (or your stakeholders and managers when it comes to your day-to-day work) want to see is that you can generate and test hypotheses in a structured way.
Hypothesis trees are a variant of issue trees, and building them is another great way for data scientists to apply the MECE framework to this type of question. Let’s say we work at or interview with a food delivery marketplace and want to investigate “Why did the # of deliveries in NY go down by 10% in the last week?”
We can start generating hypotheses in a MECE way like below:
Once you have this framework, you can quickly verify or reject the hypotheses one by one without that nagging feeling of “have I missed anything?”
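Working through hypotheses one by one can be tracked with very little code. The Python sketch below is hypothetical throughout: the branches assume that deliveries equal orders placed times the share of placed orders that get completed, and the “rejected” verdict is invented to show the mechanics, not a real finding.

```python
# Hypothetical tree; None marks a leaf hypothesis that can be tested directly.
TREE = {
    "Fewer orders were placed": {
        "Fewer active customers": None,
        "Fewer orders per active customer": None,
    },
    "A smaller share of placed orders was completed": {
        "More orders cancelled by customers": None,
        "More orders not fulfilled by restaurants or couriers": None,
    },
}


def leaf_hypotheses(tree, prefix=()):
    """Yield the path to every testable (leaf) hypothesis in the tree."""
    for label, subtree in tree.items():
        path = prefix + (label,)
        if subtree is None:
            yield path
        else:
            yield from leaf_hypotheses(subtree, path)


def remaining(status):
    """List the hypotheses that have not been verified or rejected yet."""
    return [" > ".join(path) for path, verdict in status.items() if verdict == "untested"]


status = {path: "untested" for path in leaf_hypotheses(TREE)}

# Suppose a quick check shows the number of active customers was flat (made up).
status[("Fewer orders were placed", "Fewer active customers")] = "rejected"

for line in remaining(status):
    print(line)
```

When the list of remaining hypotheses is empty, every branch has been checked.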
Of course there are shortcuts you can take once you have accumulated some domain knowledge on the job — for example, if you know that every year during Chinese New Year delivery numbers go down (maybe because the majority of the customers are Chinese and they prefer to have home-cooked meals during Chinese New Year), then you can quickly check that hypothesis first without having to go through an entire framework.
But if you deal with a new problem, hypothesis trees can save the day.
2x2 Matrix
The two-by-two matrix might be the most well-known consulting framework of all.
It’s sometimes frowned upon because it’s extremely simple; but in my experience, this simplicity is often helpful because it forces you to cut through all the (sometimes unnecessary) complexity and distill the issue down to its core.
A two-by-two matrix helps you categorize something into four categories across two dimensions. Let’s look at a simple example; let’s say you are evaluating analytics projects for the next quarter.
Instead of assessing projects across many different factors, which risks dragging out the planning process, you could just focus on two key factors:
Does the proposed project support any of the key company priorities?
What’s the expected business impact?
Each combination requires a different approach. Laying the problem out in this simple grid can help facilitate discussions and move things forward, which is often more valuable than 100% accuracy.
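The grid itself is just a lookup on two yes/no answers. Here is a minimal Python sketch; the project names and the recommended action for each quadrant are assumptions made up for illustration, so substitute whatever your team agrees on.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Project(NamedTuple):
    name: str
    supports_priority: bool  # supports a key company priority?
    high_impact: bool        # expected business impact is high?


# Hypothetical action per quadrant: (supports_priority, high_impact) -> action.
ACTIONS = {
    (True, True): "Do first",
    (True, False): "Do if capacity allows",
    (False, True): "Discuss with stakeholders",
    (False, False): "Deprioritize",
}


def sort_into_grid(projects: List[Project]) -> Dict[str, List[str]]:
    """Group project names by the action attached to their quadrant."""
    grid = defaultdict(list)
    for project in projects:
        key = (project.supports_priority, project.high_impact)
        grid[ACTIONS[key]].append(project.name)
    return dict(grid)


# Made-up projects for illustration.
projects = [
    Project("Churn dashboard", supports_priority=True, high_impact=True),
    Project("Pricing experiment readout", supports_priority=False, high_impact=True),
    Project("Legacy report cleanup", supports_priority=False, high_impact=False),
]

for action, names in sort_into_grid(projects).items():
    print(f"{action}: {', '.join(names)}")
```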
Minto Pyramid
I have mentioned this one countless times in my articles already, so you know how important I think it is. It’s THE framework anyone should adopt when it comes to communication.
Adopting it takes practice because it requires you to communicate in almost the opposite order compared to how we actually do the work. Most analytics work is open-ended and explorative in nature; so it’s not until the later stages of a project that we have some findings and eventually a conclusion.
But when it comes to communication, it’s crucial that we focus on the conclusion FIRST, then the supporting arguments, and lastly the supporting evidence. This requires you to structure your thoughts in advance rather than just walking through a recap of your analysis.
I have drawn the pyramid itself in previous articles so I will not repeat it here, but it might be helpful to demonstrate the framework using a more concrete example:
In contrast to many data scientists’ default storytelling mode (“we did A, then we discovered B … and here’s the conclusion”), the pyramid framework captures the audience’s attention a lot better; and if they don’t have time to read the whole email, they won’t miss the conclusion.
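One way to build the habit is to force the ordering into a template. The Python sketch below is an assumption-laden illustration, not an official format: the class names are invented and the message content consists of placeholders rather than real findings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Argument:
    claim: str
    evidence: List[str] = field(default_factory=list)


@dataclass
class PyramidMessage:
    conclusion: str
    arguments: List[Argument]

    def render(self) -> str:
        """Render conclusion first, then each argument with its evidence."""
        lines = [f"Conclusion: {self.conclusion}"]
        for number, argument in enumerate(self.arguments, start=1):
            lines.append(f"{number}. {argument.claim}")
            lines.extend(f"   - {item}" for item in argument.evidence)
        return "\n".join(lines)


# Placeholder content only; fill in your own recommendation and findings.
message = PyramidMessage(
    conclusion="<your recommendation>",
    arguments=[
        Argument("<supporting argument 1>", ["<evidence 1a>", "<evidence 1b>"]),
        Argument("<supporting argument 2>", ["<evidence 2a>"]),
    ],
)

print(message.render())
```

A reader who stops after the first line still leaves with the conclusion.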
Conclusion: To have a framework at all
The best framework in my opinion is the one that works for you. Having a framework at all is always better than jumping in with no structure.
The frameworks above are not supposed to be the “be-all and end-all”, as there are so many more frameworks out there developed to solve different problems; they are just supposed to get you started in building this muscle.
The most important thing is to develop a structured way of thinking. Without a framework, you are essentially letting your intuition guide you and hoping it will lead to the right solution by chance. With a structured framework, problem solving becomes a repeatable and scalable skill.