The problem isn't bad data; it's brands not having all their data in 1 place. So, their data is everywhere, making good data hard to come by.
In recent years, modern retail has evolved at a staggering pace. And this has had many 2nd-order effects that go somewhat unnoticed.
There are, of course, the more obvious changes, like the fact that marketing is no longer just Facebook ads and a good email strategy. Similarly, growth has expanded to multiple channels, and even the most ardent DTC brands pursue wholesale sooner.
Perhaps less noticeable, DTC teams have started recruiting someone with "operations," "demand planner," "supply chain," or "logistics" in their title earlier.
And with the rise of sophisticated 3PL providers (ShipBob's last capital raise valued them at $1B+ and Shopify recently acquired Deliverr for $2B+), very few brands handle their fulfillment in-house anymore.
However, none of this should be news because many of these things have become common (or even best) practices in recent years.
What people aren't talking about, however, is how these changes have significantly impacted the broader tech stack that powers most DTC brands. Today, there's seemingly a top-class solution for every part of the business.
Because of this, the average brand uses way more tools than it did 5 years ago. (Companies used an average of 89 different apps in 2021, up from 58 in 2015, per Bloomberg.) But that larger tech stack is where the understated 2nd-order effect becomes a problem.
What do brands consider their source of Truth?
Alongside this DTC evolution, there has been a remarkable reimagining of what brands consider their source of Truth (sometimes called their "system of record"). And what that really means is that most brands simply don't have one.
At Cogsy, this has become a common topic in conversations with top retailers. But it wasn't until Archie Durfee, Associate Director of Supply Chain at Ro, joined us on The Checkout that I had this "AHA!" moment.
When asked about what a DTC source of Truth needs, he said:
"I think a lot of it actually has to do with data and its structure. The data structure is never clean. And to just hand that over to a system… Well, a system is only as good as the data that's being put in."
I spent the next 2 weeks thinking about that answer. My qualm with it, I realized, was that it assumes that no system solves for data integrity. Put bad data in, get bad data out.
But the problem isn't as much about bad data; it's about brands not having all their data in any 1 system. In other words, the data is everywhere, so good data is hard to come by. And that's why so many brands are still stuck in spreadsheets.
When you can't programmatically pull data from a single source you can trust, you're left with data gaps. And with spreadsheets, it's "easier" to plug these gaps with whatever number you can finagle – even if it means a questionable output.
But how did DTC brands get to the point where they have no source of Truth? And more importantly, why isn't this something they're actively trying to solve?
To find these answers, you must first understand how ecommerce and its source of Truth (or lack thereof) evolved. And why settling for spreadsheets is a band-aid solution to the larger problem.
The evolution of DTC’s source of Truth
At a high level, DTC’s source of Truth had a 3-stage evolution. First, the platform was your source of Truth, then it was a connector, and now, it’s… complicated.
Phase 1: The platform is your source of Truth
Let's go back to the early days of ecommerce platforms when Shopify, WooCommerce, and Magento were pretty much the only tools brands adopted. The original versions of these platforms had 2 goals:
- Easily add a shopping cart to a website
- Seamlessly process online payments
At the time, these ecommerce platforms served as the brand's order management system (OMS), warehouse management system (WMS), and inventory management system (IMS).
Admittedly, these first iterations' features were simplistic and supported only those initial 2 goals. Still, they consolidated all your ecommerce data into a single source of Truth.
That is until these platforms opened up, allowing other tools to integrate and build on top of them. When this happened, the data these platforms once consolidated started moving between tools.
Initially, these integrations were predominantly with marketing and growth tools. For instance, whichever ecommerce platform you used could now push conversion data to Google Analytics. This would enrich your GA data and provide insights into how the brand was growing.
Or you could integrate your email service provider (ESP) to help simplify post-purchase marketing. The platform would capture the order. Then, the ESP would automatically send the necessary post-purchase sequence via the API connection.
Most of those initial integrations were 1-way. Meaning, the platform was the source of Truth and merely passed data to the external tool so that brands could take action in that tool.
Going back to the ESP integration example, once this data was transferred, the email service provider might become the source of Truth for your email marketing. But it does not replace, say, WooCommerce as the source of Truth for your broader operations.
But then these tools started gathering their own data, which would need to be re-consolidated back into the original platform. Enter 2-way data syncing.
Inventory management systems were the first external integrations to leverage 2-way data syncing.
Say the brand had specific or more sophisticated needs for its inventory management. It could now adopt a specialized solution to meet them.
That specialized IMS solution would replicate some of the data in the underlying platform (like sales orders). Then, add new information (like stock take) and send the updated data back to the original platform. This way, both platforms (theoretically) had the same information, maintaining the brand's single source of Truth.
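To make the 2-way sync concrete, here is a minimal sketch of the idea, not how any particular platform or IMS implements it. The `System` class, the field names, and the `two_way_sync` helper are all hypothetical. It assumes each field has exactly one owner, so the two sides never disagree about a field they both hold.

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """A hypothetical tool that stores records keyed by SKU."""
    name: str
    records: dict = field(default_factory=dict)


def two_way_sync(platform: System, ims: System) -> None:
    """Copy the fields each side is missing so both hold the same records.

    Assumption: every field has a single owner. If both sides hold the
    same field with different values, the IMS value silently wins here.
    """
    for sku in set(platform.records) | set(ims.records):
        merged = {**platform.records.get(sku, {}), **ims.records.get(sku, {})}
        platform.records[sku] = dict(merged)
        ims.records[sku] = dict(merged)


# The platform knows about sales; the IMS adds the stock take.
platform = System("platform", {"SKU-1": {"units_sold": 12}})
ims = System("ims", {"SKU-1": {"stock_take": 88}})

two_way_sync(platform, ims)
print(platform.records == ims.records)  # True
```

The single-owner assumption is what keeps the two copies identical. Once both tools can edit the same field, a merge like this has to pick a winner, and the copies can drift apart without anyone noticing.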
But in practice, this didn't work out. Instead, the data ecommerce brands relied on became more and more scattered as new tools were introduced.
That's not to say these 2-way integrations were a horrible mistake (they empowered brands to create more specialized solutions with their tech stack). But they also created vulnerabilities that, for years, went unnoticed.
Phase 2: The platform becomes the connector
In the 2nd phase, we saw an explosion of tools built on top of the ecommerce platforms.
Brands got more sophisticated, and the number of tools within their tech stack increased exponentially. This is what spawned the original marketplaces and ecosystems for app developers. Brands had evolving needs if they were to meet consumer demand, requiring that new solutions be built.
But the constant for all of those new solutions is that they heavily relied on the original ecommerce platform to be both the source of Truth and a connector between various tools.
For example, brands could now run an ad campaign on Facebook. The ecommerce platform would capture orders generated by that campaign, providing richer data about where those orders came from.
This meant when the platform passed that information off to the ESP or another reporting tool, the data in those tools were suddenly better too.
But it also meant that the platform no longer held all the data. Because of this, it was only the source of Truth for some information, whereas other tools or systems became a better source of Truth for other information.
This fragmentation scattered brands’ data across several tools, often creating multiple, slightly different versions of the same data sets. It also compromised each tool's integrity (including the original ecommerce platform, which no longer held all the "Truths").
As a result, brands made themselves vulnerable to incomplete and inaccurate data. But lacking a single source of Truth, there was no easy way to know these vulnerabilities existed until they became unignorable.
As awareness of data fragmentation grew, 2 practices became widely accepted to combat it:
- Brands started utilizing reporting tools that connected more of these data sources. At the most rudimentary level, this was (and to some extent, still is) how most brands used Google Analytics. While GA also captures its own data, it offers a good representation of most things happening on your ecommerce website. It doesn't see all the data sources or report on everything. But it is directionally accurate enough to help brands unlock growth in an austere environment.
- To augment that, brands started relying on siloed and specialist analytics within several reporting tools. If you need to improve your Facebook ad spending, you use the data and reporting available within the Meta platform. Or, if you need to do a deep dive on your email marketing, you do that within your ESP. Then, you need to try and connect the dots between these disparate systems.
This setup has continued to evolve, and as the tools became more sophisticated, the best-in-class solutions now also integrate with each other. But these complex integrations only exacerbate the underlying problem: Ecommerce brands have no single source of Truth.
A customer's loyalty rank (or reward points) doesn't just live within your loyalty solution. It also passes through your UGC or reviews tool because customers get rewarded for leaving a review.
Email subscribers with a certain loyalty rank receive different email offers, so that ranking also needs to be stored in your ESP. And those customers are tagged accordingly in your ecommerce platform for segmentation purposes.
But, like the game of telephone, not all the information makes it from one end to the other intact. The ecommerce platform ends up with a lot of the information you need, but not all of it. For instance, it definitely doesn't have the depth of data that the specialist tools keep in silos.
At the same time, the siloed data in the specialist tools are helpful, but only for decisions related to each tool's primary function. The connection, context, and relevance to the bigger picture are not always (or even usually) obvious.
Phase 3: The state of Truth today
Today, ecommerce brands accept that the sum of these siloed and specialist tools collectively represents the closest thing to Truth.
Where these tools overlap (meaning, 1 data point in 1 system clearly correlates to another data point in another system), operators can assume that the overlapping data is true. But data points that fail to find overlaps prove problematic.
This is perhaps most obvious when looking at your marketing tech stack. Say you rely on the following tools:
- Elevar as your data layer
- Triple Whale for attribution
- Lifetimely for predictive analytics
- Daasity for reporting
How you integrate these tools (such as whether Elevar links directly to Daasity or its data reaches Daasity via Lifetimely) can change the delivered outputs. But however these integrations are configured, the advantage is that you can at least attempt to triangulate the closest representation of Truth based on the available overlaps.
Of course, this still doesn't solve the outlying edges (the ones that fail to find overlaps) because none of these systems (not even Daasity) integrates all the marketing tools. Because of this, brands continue to rely on the ecommerce platform as the connector, with no system acting as the single source.
If one tool did integrate all these other tools, it could identify what's True based on what all the data sets agree on; plus, what is presumably true based on what most data sets agree on.
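As a rough illustration of that rule, the sketch below labels a single data point by how many tools agree on it. The `classify` function, the labels, and the tool names are hypothetical, and it assumes at least one tool reports a value.

```python
from collections import Counter


def classify(values_by_tool: dict) -> tuple:
    """Return (value, label) for one data point reported by several tools.

    Labels are illustrative: "true" when every tool agrees, "presumed"
    when a strict majority agrees, and "unresolved" otherwise.
    """
    counts = Counter(values_by_tool.values())
    value, votes = counts.most_common(1)[0]
    if votes == len(values_by_tool):
        return value, "true"
    if votes > len(values_by_tool) / 2:
        return value, "presumed"
    return None, "unresolved"


# Two of three hypothetical tools agree on yesterday's order count.
orders_yesterday = {"tool_a": 120, "tool_b": 120, "tool_c": 118}
print(classify(orders_yesterday))  # (120, 'presumed')
```

A majority vote like this only works for data points that several tools actually report. The outlying edges, where just one tool holds the value, stay unverified.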
This tool could then act as your marketing source of Truth. But it still wouldn't be your source of ultimate Truth. Not unless you can integrate this tool with your entire tech stack so you can programmatically pull data for any aspect of your operations (not just marketing).
But let's face it: This would require a somewhat standardized tech stack or a source system that is powerful enough to integrate with any tool, no matter how obscure.
And other than a few obvious winners (like Klaviyo, which serves 100k+ ecommerce brands), most brands choose tools that meet their unique needs (and biases). Therefore, they're unlikely to agree on a standardized tech stack to achieve this source of Truth.
Instead, brands set up 1 of 2 operational setups: consolidated or decentralized.
Consolidated is the more traditional setup. With it, an IMS or ERP acts as the middleware, connecting all of the most important data sources. As such, the IMS or ERP becomes the closest thing to a system of record.
However, with a proliferation in sales channels and the evolution of new tools (like Flexport and Anvyl), only brands that invest in developing custom integrations and fill the remaining gaps with spreadsheets get 100% of their data into their ERP. This is, of course, not a viable pursuit for most brands.
Alternatively, the decentralized operational setup offers nothing close to a single source of Truth. Some individual tools and data sources are connected, but the data never all flows into a single system.
Still, this decentralized setup is the one most DTC brands rely on. And it's why so many were slow to respond to supply chain issues during the pandemic and now face over- or understocking. Worst of all, even with its obvious shortcomings, this setup fails to remove the need for spreadsheets. In fact, the opposite is true.
A common example is submitting purchase orders to your suppliers as PDFs. The only way to reflect this information in your demand planning and reporting is to manually type it into a spreadsheet, which further decentralizes your tech stack.
Similarly, marketing data remains absent from your inventory plans. Why? Because few operational tools sync with marketing tools, even though increasing demand via a marketing promotion directly affects your inventory levels. And marketing data is seldom reflected in ops-related spreadsheets. This only further exacerbates the effects of other missing data on your stock levels.
What is the solution, then?
This is a question that Feat Clothing COO, Nate Poulin, and I tried to dive into on a recent episode of The Checkout.
We agreed that in principle, brands need a universal and agnostic data layer. Think something similar to what Segment has built for software companies.
Here, Segment has no opinion. It is merely a data layer that connects any number of data sources and destinations bi-directionally.
Segment itself could (and already does) solve part of the source of Truth problem for many brands (especially the massive ones) by creating a sort of data warehouse.
But it's not a tool focused exclusively on ecommerce retail. So, it remains a heavy development lift to correlate individual data points to each other in a meaningful fashion before passing them on to a reporting tool (as a destination).
At Cogsy, we're currently building the missing source of Truth for Shopify merchants and Amazon sellers, regardless of whether they rely on a consolidated or decentralized tech stack. Meaning, with the Cogsy tool, you can:
- Integrate the tools you already rely on to centralize the data contained in your tech stack
- Sync product metadata and sales orders from Shopify and Amazon to understand current demand trends
- Connect ShipBob to get real-time inventory levels
- Grab purchase order information from Anvyl to understand incoming and in-production inventory
- Automatically connect your spreadsheets to ensure all your operational data is represented and up-to-date
- Factor marketing events into your operational plans to ensure you have enough supply to meet demand and unlock new revenue
The results? Brands that use Cogsy (like Caraway) generate 40% more revenue and save 20+ hours every week on inventory management.
But we'll admit, Cogsy is still not a perfect solution to ecommerce's missing source of Truth (it’s still a work in progress). So, as we hack away, here are a few recommendations to improve your source of Truth (even if it's not Cogsy):
Be aware of those outlying edges where you lack visibility
The anecdote I often share is how my dad (an accountant) destroyed the concept of materiality for me. He would say that when your Trial Balance is off by $1, you can't write that off as immaterial because you could be off $1,000,000 on the debit side and $999,999 on the credit side.
The same is true for your data. You don't know what you don't know, and some outlying edges might materially impact your business without you realizing it.
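The arithmetic behind that anecdote is worth spelling out. Using the same figures, the variable names below are just for illustration:

```python
debit_error = 1_000_000   # error hiding on the debit side
credit_error = 999_999    # error hiding on the credit side

# What the Trial Balance shows: the two errors almost cancel out.
net_difference = debit_error - credit_error

# What is actually wrong in the books: the errors add up.
gross_error = debit_error + credit_error

print(net_difference)  # 1
print(gross_error)     # 1999999
```

A small net difference says nothing about the size of the errors underneath it, which is why a data gap that looks minor still deserves a closer look.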
Marry your cross-functional data sources
Some of the silos preventing your brand from having a single source of Truth will be ones your team created (even if unintentionally). And when left operating in isolation, these silos only hinder growth.
Preventing this pitfall comes down to marketing understanding inventory availability, operations understanding what marketing is planning, and CX understanding returns.
To do this, find ways to at least put these different data sets side-by-side, forcing them into the same conversation. This will naturally ensure more holistic decision-making and prevent these silos from continuing.
Do a health check on your (raw) data
We've previously discussed how bad data leads to inventory planning mistakes (and how you can avoid those common pitfalls). But there seems to be this underlying assumption that you can't fix bad data. This simply isn't true.
You might be unable to fix inadequate historical information, but getting your data sources in a good place is a great start. And the best time to do that is today.
You can fix your data by applying a "trust, but verify" mindset to all your data sources. Meaning, use the information you have but actively work to double-check that it's reliable (such as by cross-checking data points across systems).
Just make sure you know which tools are data conduits and which actually transform data. This will also help you identify the true source for individual data points, so you can confirm which sources are reliable when conflicting data pops up.
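One way to put "trust, but verify" into practice is to write down which tool originates each data point, then compare every other copy against it. The sketch below is a minimal version of that check. The `SOURCE_OF_RECORD` mapping, the tool names, and the `cross_check` helper are hypothetical, and it assumes the readings are numeric.

```python
# Hypothetical mapping: the tool that originates each field (its true source).
SOURCE_OF_RECORD = {
    "on_hand_units": "3pl",
    "units_sold": "ecommerce_platform",
}


def cross_check(field_name: str, readings: dict, tolerance: float = 0.0) -> dict:
    """Compare every tool's reading of a field against its true source."""
    owner = SOURCE_OF_RECORD[field_name]
    trusted = readings[owner]
    # Any other tool whose copy differs by more than the tolerance is flagged.
    mismatches = {
        tool: value
        for tool, value in readings.items()
        if tool != owner and abs(value - trusted) > tolerance
    }
    return {"owner": owner, "trusted_value": trusted, "mismatches": mismatches}


report = cross_check(
    "on_hand_units",
    {"3pl": 480, "ecommerce_platform": 480, "spreadsheet": 455},
)
print(report["mismatches"])  # {'spreadsheet': 455}
```

When conflicting numbers pop up, the mapping tells you which copy to believe, and the mismatches tell you which conduit or spreadsheet needs fixing.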