
The Curse of Conway and the Data Space | by Jack Vanlightly | Oct, 2024

October 26, 2024

How modern trends can be traced back to Conway’s Law

Jack Vanlightly

Towards Data Science


Image by the author. (Generated by Midjourney, touched up with Krita)

This article was originally posted on my blog https://jack-vanlightly.com.

The article was triggered by and riffs on the “Beware of silo specialisation” section of Bernd Wessely’s post Data Architecture: Lessons Learned. It brings together a few trends I am seeing plus my own opinions after twenty years’ experience working on both sides of the software / data team divide.

Conway’s Law:

“Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.” — Melvin Conway

This is playing out worldwide across hundreds of thousands of organizations, and it is nowhere more evident than in the split between software development and data analytics teams. These two groups usually have different reporting structures, right up to, or immediately below, the executive team.

This is a problem now and is only growing.

Jay Kreps remarked five years ago that organizations are becoming software:

“It isn’t just that businesses use more software, but that, increasingly, a business is defined in software. That is, the core processes a business executes — from how it produces a product, to how it interacts with customers, to how it delivers services — are increasingly specified, monitored, and executed in software.” — Jay Kreps

The effectiveness of this software is directly tied to the organization’s success. If the software is dysfunctional, the organization is dysfunctional. The same can play out in reverse, as organizational structure dysfunction plays out in the software. All this means that a company that wants to win in its category can end up executing poorly compared to its competitors and being too slow to respond to market conditions. This kind of thing has been said umpteen times, but it is a fundamental truth.

When “software engineering” teams and the “data” teams operate in their own bubbles within their own reporting structures, a kind of tragic comedy ensues where the biggest loser is the business as a whole.


There are more and more signs that point to a change in attitudes to the current status quo of “us and them”, of software and data teams working at cross purposes or completely oblivious to each other’s needs, incentives, and contributions to the business’s success. There are three key trends that have emerged over the last two years in the data analytics space that have the potential to make real improvements. Each is still quite nascent but gaining momentum:

  • Data engineering is a discipline of software engineering.
  • Data contracts and data products.
  • Shift Left.

After reading this article, I think you’ll agree that all three are tightly interwoven.

Data engineering has evolved as a separate discipline from that of software engineering for numerous reasons:

  • Data analytics / BI, where data engineering is practiced, has historically been a separate business function from software development. This has caused a cultural divergence where the two sides don’t listen to or learn from each other.
  • Data engineering solves a different set of problems from traditional software development and thus has different tools.
  • Data engineering has changed dramatically over the last 25 years. Many new problems arose that required rethinking the technologies from the ground up, which resulted in a long, chaotic period of experimentation and innovation.

The dust has largely settled, though technologies are still evolving. We’ve had time to consolidate and take stock of where we are. The data community is starting to realize that many of the current problems are not actually so different from the problems of the software development side. Data teams are writing software and interacting with software systems just as software engineers do.

The types of software can look different, but many of the practices from software engineering apply to data and analytics engineering as well:

  • Testing.
  • Good stable APIs.
  • Observability/monitoring.
  • Modularity and reuse.
  • Fixing bugs late in the development process is more costly than addressing them early on.

It’s time for data and analytics engineers to identify as software engineers and regularly apply the practices of the wider software engineering discipline to their own sub-discipline.

Data contracts exploded onto the data scene in 2022/2023 as a response to the frustration of the constant break-fix work of broken pipelines and underperforming data teams. The idea went viral and everyone was talking about data contracts, though concrete details of how one would implement them were scarce. But the objective was clear: fix the broken pipelines problem.

Pipelines were breaking for many reasons:

  • Software engineers had no idea what data engineers were building on top of their application databases, so they provided no guarantees around table schema changes and gave no warning of impending changes that would break the pipelines.
  • Data engineers had been largely unable (due to organizational dysfunction or organizational isolation) to develop healthy peer relationships with the software teams they depend on. Or if relationships could be built, there wasn’t buy-in from software team leaders to help data teams get the data they needed beyond giving them database credentials. The result was to just reach in and grab the data at the source, breaking the age-old software engineering practice of encapsulation in the process (and suffering the results).

I recently listened to Super Data Science E825 with Chad Sanderson, a big proponent of data contracts. I loved how he defined the term:

My definition of data quality is a bit different from other people’s. In the software world, people think about quality as, it’s very deterministic. So I am writing a feature, I am building an application, I have a set of requirements for that application and if the software no longer meets those requirements that is known as a bug, it’s a quality issue. But in the data space you might have a producer of data that is emitting data or collecting data in some way, that makes a change which is totally sensible for their use case. As an example, maybe I have a column called timestamp that is being recorded in local time, but I decide to change that to UTC format. Totally fine, makes complete sense, probably exactly what you should do. But if there’s someone downstream of me that’s expecting local time, they’re going to experience a data quality issue. So my perspective is that data quality is actually a result of mismanaged expectations between the data producers and data consumers, and that is the function of the data contract. It’s to help these two sides actually collaborate better with each other. — Chad Sanderson
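The timestamp example in the quote can be made concrete with a small sketch. It is only an illustration under stated assumptions: `TimestampContract` and `violations` are hypothetical names, and the UTC-05:00 offset is an arbitrary stand-in for "local time". It shows how an expectation that is written down lets a consumer detect the moment a producer switches to UTC, instead of silently reading wrong values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TimestampContract:
    """Hypothetical contract clause: the expectation consumers rely on."""
    column: str
    utc_offset: timedelta  # the offset every value is expected to carry


def violations(contract: TimestampContract, rows: list[dict]) -> list[int]:
    """Return the indexes of rows whose timestamp breaks the contract."""
    bad = []
    for i, row in enumerate(rows):
        value = row.get(contract.column)
        # A naive datetime has utcoffset() == None, so it never matches.
        if not isinstance(value, datetime) or value.utcoffset() != contract.utc_offset:
            bad.append(i)
    return bad


# Assumption for illustration: consumers were promised local time at UTC-05:00.
contract = TimestampContract(column="timestamp", utc_offset=timedelta(hours=-5))
local = timezone(timedelta(hours=-5))

rows = [
    {"timestamp": datetime(2024, 10, 1, 9, 0, tzinfo=local)},          # as agreed
    {"timestamp": datetime(2024, 10, 1, 14, 0, tzinfo=timezone.utc)},  # producer switched to UTC
    {"timestamp": datetime(2024, 10, 1, 9, 0)},                        # naive: offset unknown
]

print(violations(contract, rows))  # [1, 2]
```

Neither the local-time rows nor the UTC rows are "wrong" in isolation. The violation only exists relative to the agreed expectation, which is the point Chad makes.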

What constitutes a data contract is still somewhat open to interpretation and implementation regarding actual concrete technology and patterns. Schema management is a central theme, though only one part of the solution. A data contract is not only about specifying the shape of the data (its schema); it’s also about trust and dependability, and we can look to the REST API community to understand this point:

  • REST APIs are regularly documented via OpenAPI, a REST API specification tool. This is essentially the schema of the request and the response, as well as the security schemes.
  • REST APIs are versioned, and great care is taken to version them without making breaking changes. When breaking changes do occur, the API releases a new major version. The topic of API versioning is deep, with a long history of debate about which options are best. But the point is that the software engineering community has thought long and hard about how to evolve APIs.
  • A REST API that is constantly changing and releasing new major versions due to breaking changes is a poor API. Organizations that publish APIs for their customers must ensure that not only do they create a well-modeled and specified API, but a stable one that does not change too frequently.
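The versioning discipline described above can be sketched for a data schema. This is a simplified illustration, not a standard: `next_version` is a hypothetical helper, schemas are plain name-to-type dictionaries, and the rule it applies (removing or retyping a field is breaking, adding a field is not) is an assumed rule of thumb. Real compatibility policies are usually more nuanced.

```python
def next_version(
    version: tuple[int, int], old: dict[str, str], new: dict[str, str]
) -> tuple[int, int]:
    """Bump (major, minor) according to how the schema changed.

    Assumed rule of thumb: removing a field or changing its type breaks
    consumers (major bump); adding a field does not (minor bump).
    """
    major, minor = version
    removed = old.keys() - new.keys()
    retyped = {name for name in old.keys() & new.keys() if old[name] != new[name]}
    if removed or retyped:
        return (major + 1, 0)
    if new.keys() - old.keys():
        return (major, minor + 1)
    return version


v1 = {"order_id": "string", "amount": "decimal", "timestamp": "local_datetime"}

# Adding a field: existing consumers keep working.
added = {**v1, "currency": "string"}
print(next_version((1, 0), v1, added))    # (1, 1)

# Changing what "timestamp" means is a breaking change.
retyped = {**v1, "timestamp": "utc_datetime"}
print(next_version((1, 0), v1, retyped))  # (2, 0)
```

A check like this could run in the producer's build, so that a breaking change becomes a deliberate decision rather than a surprise for downstream teams.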

In software engineering, when Service B needs the data of Service A, what Service B absolutely doesn’t do is just access the private database of Service A. What happens is the following:

  1. The engineering leaders/teams of the two services open a line of communication, likely a face-to-face conversation to begin with.
  2. The team of Service A arranges for a well-designed interface for Service B that doesn’t break the encapsulation of Service A. This may result in a REST API, or perhaps an event stream or queue that Service B can consume.
  3. The team of Service A commits to maintaining this API/stream/queue going forward. This involves the discipline of evolving it over time, providing a stable and predictable interface for Service B to use. Some of this maintenance can fall on a platform team whose responsibility is to provide building block infrastructure for development teams to use.
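The steps above can be sketched in a few lines. Everything here is hypothetical and in-memory: `ServiceA`, `OrderV1`, and `service_b_report` are illustrative names, and a method call stands in for what would really be a REST API, event stream, or queue. The point is only that Service B depends on a published shape, never on Service A’s private storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderV1:
    """Public, stable shape that the team of Service A commits to maintaining."""
    order_id: str
    total_cents: int


class ServiceA:
    def __init__(self) -> None:
        # Private storage: column names and layout may change at any time.
        self._rows = {"o-1": {"id": "o-1", "net": 1000, "tax": 200}}

    def get_order(self, order_id: str) -> OrderV1:
        """The only supported way for other teams to read order data."""
        row = self._rows[order_id]
        return OrderV1(order_id=row["id"], total_cents=row["net"] + row["tax"])


def service_b_report(api: ServiceA, order_id: str) -> str:
    # Service B depends on OrderV1 only, never on ServiceA._rows.
    order = api.get_order(order_id)
    return f"{order.order_id}: {order.total_cents / 100:.2f}"


print(service_b_report(ServiceA(), "o-1"))  # o-1: 12.00
```

If the team of Service A later splits `net` and `tax` into separate tables, only `get_order` changes and Service B keeps working.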

Why does the team of Service A do this for the team of Service B? Is it out of altruism? No. They collaborate because it is valuable for the business for them to do so. A well-run organization is run with the mantra of #OneTeam, and the organization does what is necessary to operate efficiently and effectively. That means that team Service A sometimes has to do work for the benefit of another team. It happens because of alignment of incentives going up the management chain.

It is also well known in software engineering that fixing bugs late in the development cycle, or worse, in production, is significantly more expensive than addressing them early on. It is disruptive to the software process to go back to previous work from a week or a month before, and bugs in production can lead to all manner of ills. A little upfront work on producing well-modeled, stable APIs makes life easier for everyone. There is a saying for this: an ounce of prevention is worth a pound of cure.

These APIs are contracts. They are established by opening communication between software teams and implemented when it is clear that the ROI makes it worth it. It really comes down to that. It generally works like this inside a software engineering department due to the aligned incentives of software leadership.

Data products

The term API (or Application Programming Interface) doesn’t quite fit “data”. Because the product is the data itself, rather than an interface over some business logic, the term “data product” fits better. The word product also implies that there is some kind of quality attached, some level of professionalism and dependability. That is why data contracts are intimately related to data products, with data products being a materialization of the more abstract data contract.

Data products are very similar to the REST APIs on the software side. It comes down to the opening up of communication channels between teams, the rigorous specification of the shape of the data (including the time zone from Chad’s words earlier), careful evolution as inevitable changes occur, and the commitment of the data producers to maintain stable data APIs for the consumers. The difference is that a data product will typically be a table or a stream (the data itself), rather than an HTTP REST API, which typically drives some logic or retrieves a single entity per call.

Another key insight is that just as APIs make services reusable in a predictable way, data products make data processing work more reusable. In the software world, once the Orders API has been released, all downstream services that need to interact with the orders sub-system do so via that API. There aren’t a handful of single-use interfaces set up for each downstream use case. Yet that is exactly what we often see in data engineering, with single-use pipelines and multiple copies of the source data for different use cases.

Simply put, software engineering promotes reusability in software through modularity (be it actual software modules or APIs). Data products do the same for data.

Shift Left came out of the cybersecurity space. Security has historically been another silo, where software and security teams operate under different reporting structures, use different tools, have different incentives, and share little common vocabulary. The result has been a growing security crisis that we’ve become so used to that the next multi-million record breach barely gets reported. We’re so used to it that we might not even consider it a crisis, but when you look at the trail of destruction left by ransomware gangs, information stealers, and extortionists, it’s hard to argue that this should be business as usual.

The idea of Shift Left is to shift the security focus left to where software is being developed, rather than being applied after the fact, by a separate team with little knowledge of the software being developed, modified, and deployed. Not only is it about integrating security earlier in the development process, it’s also about improving the quality of cyber telemetry. The heterogeneity and general “messiness” of cyber telemetry drive this movement of shifting processing, clean up, and contextualization to the left where the data production is. Reasoning about this data becomes so challenging once provenance is lost. While cyber data is unusually challenging, the lessons learned in this space are generalizable to other domains, such as data analytics.

The similarity of the silos of cybersecurity and data analytics is striking. Silos assume that the silo function can operate as a discrete unit, separated from other business functions. However, both cybersecurity and data analytics are cross-functional and must interact with many different areas of a business. Cross-functional teams can’t operate to the side, behind the scenes, or after the fact. Silos don’t work, and shift-left is about toppling the silos and replacing them with something less centralized and more embedded in the process of software development.

Bernd Wessely wrote a fantastic article on Towards Data Science about the silo problem. In it he argues that the data analytics silo can be so ingrained that current practices go unquestioned, and that the silo’s ingest-then-process paradigm is “only a workaround for inappropriate data management. A workaround necessary because of the completely inadequate way of dealing with data in the enterprise today.”

The sad thing is that none of this is new. I’ve been reading articles about breaking silos all my career, and yet here we are in 2024, still talking about the need to break them! But break them we must!

If the data silo is the centralized monolith, separated from the rest of an organization’s software, then shifting left is about integrating the data infrastructure into where the software lives, is developed, and operated.

Recall the earlier example: Service B didn’t just reach into the private internals of Service A; instead, an interface was created that allowed Service B to get data from Service A without violating encapsulation. This interface, an API, queue, or stream, became a stable method of data consumption that didn’t break every time Service A needed to change its hidden internals. The burden of providing that interface was placed on the team of Service A because it was the right solution, but there was also a business case to do so. The same applies with Shift Left; instead of placing the ownership of making data available on the person who wants to use the data, you place that ownership upstream, where the data is produced and maintained.

At the center of this shift to the left is the data product. The data product, be it an event stream or an Iceberg table, is often best managed by the team that owns the underlying data. This way, we avoid the kludges: the rushed, jury-rigged solutions that bypass good practices.
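As a rough sketch of what producer-side ownership could look like, the class below validates records where they are produced, before any consumer sees them. It is an assumption-laden illustration: `OrdersDataProduct` and `REQUIRED_FIELDS` are hypothetical, and in-memory lists stand in for the event stream or Iceberg table a real data product would use.

```python
# Assumed contract for illustration: field name -> expected Python type.
REQUIRED_FIELDS = {"order_id": str, "total_cents": int}


class OrdersDataProduct:
    """Hypothetical data product owned by the team that produces orders."""

    def __init__(self) -> None:
        self.published: list[dict] = []  # stand-in for a stream or table
        self.rejected: list[dict] = []   # kept so the producing team can investigate

    def publish(self, record: dict) -> bool:
        """Validate at the source so bad records never reach consumers."""
        ok = all(
            isinstance(record.get(name), expected)
            for name, expected in REQUIRED_FIELDS.items()
        )
        (self.published if ok else self.rejected).append(record)
        return ok


product = OrdersDataProduct()
product.publish({"order_id": "o-1", "total_cents": 1200})
product.publish({"order_id": "o-2", "total_cents": "12.00"})  # wrong type, caught upstream

print(len(product.published), len(product.rejected))  # 1 1
```

The design choice is where the check runs: the team that owns the data sees the failure first, rather than a downstream pipeline discovering it days later.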

To make this a reality, we need the following:

  • Communication and alignment between the parties involved. It takes a level of business maturity to get there, but until we do, we’ll still be talking about breaking the silos in ten or twenty years’ time, or until AI replaces us all.
  • Technological solutions to make it easier to produce, maintain, and support data products.

We see a lot happening in this space, from catalogs and governance tooling to table formats such as Apache Iceberg and a wealth of event streaming options. There is a lot of open source here but also a large number of vendors. The technologies and practices for building data products are still early in their evolution, but expect this space to develop rapidly.

You’d think that the majority of data platform engineering is solving tech problems at large scale. Unfortunately it’s once again the people problem that’s all-consuming. — Birdy

Organizations are becoming software, and software is organized according to the communication structure of the business; ergo, if we want to fix the software/data/security silo problem, then the solution is in the communication structure.

The most effective way to make data analytics more impactful in the enterprise is to fix the Conway’s Law problem. It has led to both a cultural and technological separation of data teams from the wider software engineering discipline, as well as weak communication structures and a lack of common understanding.

The result has been:

  1. Poor cooperation and coordination between the two sides, leading to:
    – Kludgey integrations between the operational plane (the software services) and the data analytics plane.
    – Constant break-fix work in the analytics plane in response to changes made in the operational plane.
  2. Neglect of the huge number of great practices that software engineers use to make software development less costly and more reliable.

The barriers to achieving the vision of a more integrated software and data analytics world are the continued isolation of data teams and the misalignment of incentives that impedes cooperation between software and data teams. I believe that organizations that embrace #OneTeam and get these two sides talking, collaborating, and perhaps even merging to some extent will see the greatest ROI. Some organizations may already have done so, but it is by no means widespread.

Things are changing; attitudes are changing. The recognition that data engineering is software engineering, the rise of data contracts and data products, and the emergence of Shift Left are all leading indicators.
