Putting big data to work

Organizational Network Analysis helps reveal insights “hiding in plain sight” in untapped email data


Posted by Greg Szwartz and Nikola Andric on November 13, 2018.

We recently had the opportunity to work with a leading global Life Sciences company to leverage some of its “dark” email-based metadata. This is data that companies generally don’t tap into, let alone turn into valuable information. What we found yielded surprising insights into how the organization and its people work and interact. This knowledge can now be applied to fuel an insights-driven High-Impact HR operating model—with a more systematic and quantified perspective on ways to boost new employee success, reduce turnover, and lift the overall productivity of the entire organization. Here’s how it happened.

Organizational Network Analysis (ONA) looks beyond standard organizational charts to reveal the way organizations really work—how information flows, how teams and individuals interact, who initiates, who responds, how often, and so on. It is useful in understanding organizational dynamics, identifying formal and informal leaders, and revealing breakdowns as well as opportunities to increase collaboration, productivity, and performance. Organizational networks have been shown to be key to organizational success,1 but identifying, activating, and monitoring them has been difficult or impossible without large investments in surveys and systems that can inherently bias the analysis by having people self-report who they are connected to.

A rich untapped data pool—but is it useful?
We had already worked with our Life Sciences client to assist in establishing the company’s Data Center of Excellence (COE) that was tasked with finding ways to generate a greater return on data available across the enterprise. With the COE up and running, several pilot projects were identified to begin to mine enterprise data—including an ONA project to examine a large volume of email and HR data that had been collected into a data lake comprising 30+GB of data in approximately 200 million emails for over 5,000 US-based employees.

The pilot aimed to resolve two unknowns: (1) whether email metadata could be used to accurately define the organizational network (i.e., would a “passive,” non-survey technique give a reasonable representation of what the org network actually looks like?) and, if so, (2) whether the email-based organizational network would produce actionable insights that the business would commit to using in future decisions (e.g., training, office space, and succession planning).

An analytical approach
The project team applied advanced analytical methods to unlock the value in this data, including large-network predictive modeling of employee, function, and working community performance. To start, email data was combined with the US employees’ HR attributes, including organizational level and unit/department; job category, area, and title; start date and tenure in current role; total length of service; performance ratings; and geographic data. The purpose was to develop organizational network views, understand patterns in collaboration, and test a series of hypotheses around how connectivity, email, and HR factors contribute to (or, conversely, limit) greater productivity and performance within the organization.
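
As a rough illustration of the first step described above, the sketch below turns sender/recipient metadata into weighted edges and joins them to HR attributes to see how much email flows between departments. It is a minimal sketch, not the project team's actual pipeline: the record layout, the names, the departments, and the helper functions `build_edges` and `cross_department_volume` are all hypothetical.

```python
from collections import Counter

# Hypothetical metadata records: (sender, recipient) only, no subject or body.
emails = [
    ("ana", "ben"), ("ana", "ben"), ("ben", "ana"),
    ("ana", "cho"), ("cho", "dee"),
]

# Hypothetical HR attributes keyed by the same employee identifier.
hr = {
    "ana": {"department": "R&D", "tenure_years": 6},
    "ben": {"department": "R&D", "tenure_years": 2},
    "cho": {"department": "Legal", "tenure_years": 4},
    "dee": {"department": "Sales", "tenure_years": 1},
}


def build_edges(records):
    """Count emails per (sender, recipient) pair to get weighted directed edges."""
    return Counter((s, r) for s, r in records if s != r)


def cross_department_volume(edges, attributes):
    """Sum email counts for each unordered pair of departments."""
    volume = Counter()
    for (sender, recipient), count in edges.items():
        pair = tuple(sorted((attributes[sender]["department"],
                             attributes[recipient]["department"])))
        volume[pair] += count
    return volume


edges = build_edges(emails)
dept_volume = cross_department_volume(edges, hr)
print(edges[("ana", "ben")])          # emails sent from ana to ben
print(dept_volume[("Legal", "R&D")])  # emails between Legal and R&D
```

A department pair that should be collaborating but shows little or no volume in a table like `dept_volume` would be a candidate for a closer look.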

For example, some of the hypotheses we tested included:

  • Do different working communities exist, and are they aligned with existing job functions, regions, or other non-network constructs?
  • Are departments that should be collaborating communicating?
  • Are managers at the center of working communities?
  • Did a recent company reorganization impact network structure? (If so, how?)
  • Is length of service driving connectivity and/or centrality?
  • And, maybe most importantly, does connectedness and/or centrality predict performance?
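
To make the last hypothesis concrete, the sketch below computes a simple degree centrality (the share of other employees a person is in contact with) and compares its average across performance groups. The contact pairs and ratings are invented purely for illustration and are not project results, and degree centrality is only one of several possible measures. A real test would also need a proper statistical comparison on the full network.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical undirected contact pairs derived from email metadata.
contacts = [
    ("ana", "ben"), ("ana", "cho"), ("ana", "dee"),
    ("ben", "cho"), ("dee", "eli"),
]

# Hypothetical performance labels taken from HR data.
rating = {
    "ana": "high", "ben": "high",
    "cho": "standard", "dee": "standard", "eli": "standard",
}


def degree_centrality(pairs, people):
    """Fraction of the other employees each person exchanges email with."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(people)
    return {p: len(neighbors[p]) / (n - 1) for p in people}


centrality = degree_centrality(contacts, rating)

# Group centrality scores by performance label and average each group.
by_rating = defaultdict(list)
for person, score in centrality.items():
    by_rating[rating[person]].append(score)
mean_centrality = {label: mean(scores) for label, scores in by_rating.items()}

print(mean_centrality)
```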

High-impact findings
The results, delivered through business-friendly metrics, dashboards, and interactive network visualizations, gave the organization a new perspective on how people are working together.
The results also raised new questions. That is an expected and desirable outcome for any data science project: new insights generate positive business outcomes and also prompt new hypotheses, and the deeper insight that follows compounds the potential of the “insights-driven organization.”

For example, the analysis determined that:

  • Within the organization, 10 sub-organizations or working communities exist, both within and across functions. Support functions (like IT and Legal) were well connected to some working communities, but not all of them—creating an immediate action for outreach.
  • There were also new questions about whether 10 working communities were too many or too few, given the type of organization. We could use the data to assess how support functions were covering the communities, but it was unclear from the first analysis whether “front-line” functions (like sales and marketing) should be integrated differently.


The org network: In this sample network view, colors represent sub-networks or communities wherein people communicate more often with each other compared to others in the larger organization, independent of their formal role/department alignment.
Source: Deloitte Consulting LLP
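
One simple way to find communities of the kind shown in the network view is label propagation: each person repeatedly adopts the label that carries the most email volume among their contacts, until the labels stop changing. The sketch below is an assumption for illustration only. The post does not say which community detection method was used, and the edge weights and the `label_propagation` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical undirected email volumes between pairs of employees.
weights = {
    ("a", "b"): 10, ("b", "c"): 10, ("a", "c"): 10,
    ("d", "e"): 10, ("e", "f"): 10, ("d", "f"): 10,
    ("c", "d"): 1,  # weak bridge between the two groups
}


def label_propagation(edge_weights, max_rounds=20):
    """Assign each node the label with the most edge weight among its neighbors."""
    neighbors = defaultdict(dict)
    for (u, v), w in edge_weights.items():
        neighbors[u][v] = w
        neighbors[v][u] = w

    labels = {node: node for node in neighbors}  # start with one label per node
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):  # fixed order keeps the run deterministic
            totals = defaultdict(int)
            for other, w in neighbors[node].items():
                totals[labels[other]] += w
            # Heaviest label wins; ties go to the alphabetically smallest label.
            best = min(totals, key=lambda lab: (-totals[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


labels = label_propagation(weights)
communities = defaultdict(set)
for node, label in labels.items():
    communities[label].add(node)
print([sorted(members) for members in communities.values()])
```

On this toy network the two tightly connected triangles end up as separate communities despite the weak link between them. The `max_rounds` cap is there because label propagation is not guaranteed to settle on every network.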

  • There were clear and statistically significant insights on how the organizational network is related to performance. High-performers are more closely connected to one another and tend to be at the center of the various organizational networks.
  • High-performers also tend to be “sources” of emails, sending more emails than they receive, rather than “sinks,” who receive many more emails than they send. Email sinks also tended to be low-performers.
  • Isolation was predictive of turnover: People who weren’t active community participants tended to leave.
  • In some cases, collaboration was weaker than it should be between departments that are expected to work closely together.
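
The “source,” “sink,” and isolation findings above rest on simple counts of emails sent and received per person. The sketch below shows one way such labels could be derived. The records, the employee list, and the strict greater-than/less-than rule are assumptions for illustration; the post does not state the thresholds actually used, and a real analysis would likely require a larger imbalance before calling someone a sink.

```python
from collections import Counter

# Hypothetical (sender, recipient) metadata records.
emails = [
    ("ana", "ben"), ("ana", "cho"), ("ana", "dee"),
    ("ben", "ana"), ("cho", "dee"), ("ben", "dee"),
]

# Hypothetical employee roster; "eli" never appears in the email records.
employees = ["ana", "ben", "cho", "dee", "eli"]


def email_roles(records, people):
    """Label each person by comparing emails sent with emails received."""
    sent = Counter(s for s, _ in records)
    received = Counter(r for _, r in records)
    roles = {}
    for p in people:
        if sent[p] + received[p] == 0:
            roles[p] = "isolated"   # no email activity at all
        elif sent[p] > received[p]:
            roles[p] = "source"     # sends more than they receive
        elif sent[p] < received[p]:
            roles[p] = "sink"       # receives more than they send
        else:
            roles[p] = "balanced"
    return roles


roles = email_roles(emails, employees)
print(roles)
```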

Putting insights to work
For industries such as Life Sciences that are largely IP-driven and often live or die by the strength and quality of their human capital assets, being able to attract, develop, and retain talent is a business-critical priority. The insights gained from this analysis can now be used to help the organization in a number of ways related to workforce effectiveness, employee experience, diversity & inclusion management, and other talent issues.

Using just the data gathered in the initial project, the company is pursuing follow-on pilots designed to:

  • Improve new-hire onboarding to foster the right connections for new hires.
  • Increase salesforce effectiveness by improving the level, frequency, and quality of interactions between the field team and internal support teams.
  • Optimize space planning and design the work environment to better support people in doing their jobs.

Pending further data gathering and analysis, the company is also considering a half-dozen additional pilots related to succession planning, manager effectiveness optimization, global Organizational Network Analysis, and more.

Future interventions can be monitored “passively” to measure their effectiveness, without influencing the outcome by requiring employees to self-report connections on a regular basis. Instead, connections are revealed or reinforced by recent and plentiful email metadata, and results are easy to refresh.

Interested in conducting a similar project?
If you are interested in tapping the potential of the data in your organization, we have a few lessons learned and leading practices to consider.

  • Mitigating risk and protecting privacy. In the Deloitte 2018 Global Human Capital Trends report, people analytics was the second-highest-rated trend in terms of importance. However, the trend also notes the potential risks and the need to mitigate them to protect the organization and its people.2 In this case, our client proceeded with an abundance of caution to protect privacy and security: The pilot project was limited to metadata regarding US employees and emails, and email subject lines were not included in the analysis. The inclusion of HR data was carefully considered and vetted through the organization’s HR and legal channels. All information was managed on an encrypted server with limited (and auditable) access; we often configure the server with no opportunity for data download.
  • Involving the right functions. It’s important to have a business sponsor for the project in addition to IT, not only to confirm the project serves a specific business purpose but also to bring in the domain expertise of that function. In this project, HR sponsored along with IT, given the focus on ONA, the nature of the data studied, and the potential talent-related findings. Different functions—R&D, marketing, sales—might be appropriate cosponsors depending on the data pool and the project’s purpose.
  • Cultivating a sprint mentality (via “fail fast” or “test and learn” thinking). Although the dataset was massive, the pilot itself took only 10 weeks from data extraction to dashboard prototyping. We developed a long list of potential hypotheses to test, but then focused on a select few to show value quickly—iterating our way to the first deliverable as opposed to working on a larger set of activities in a linear way. We held early and frequent business stakeholder check-ins with draft insights and recommendations that we could craft with the business, as opposed to working in isolation and dropping the results on their desks in the end.
  • Getting started…somewhere! When we started this project, we didn’t know if there were insights hidden in email data that could be useful to the business; it took data science and a willingness on the organization’s part to experiment and test hypotheses to ultimately unlock value. Think about untapped data your organization may have and how you might begin to analyze it. What might a data analytics project like this enable your organization to uncover about its inner workings? How might you use the data to prepare for the future of work? You don’t have to start big, but you should start. Bersin™, Deloitte Consulting LLP research3 reveals that 69 percent of organizations are already building integrated systems to analyze worker-related data.
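
On the privacy point in the first lesson above, one common precaution is to keep only the metadata fields the analysis needs and to replace email addresses with keyed pseudonyms before analysts see the data. The sketch below illustrates that idea only. The post says subject lines were excluded, but it does not say pseudonymization was used, so the key, the field names, and the helper functions `pseudonymize` and `strip_to_metadata` are all assumptions.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would live in a managed secret store.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(address, key=SECRET_KEY):
    """Map an email address to a stable keyed pseudonym (HMAC-SHA256, truncated)."""
    digest = hmac.new(key, address.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def strip_to_metadata(message):
    """Keep only sender, recipient, and timestamp; drop subject and anything else."""
    return {
        "sender": pseudonymize(message["sender"]),
        "recipient": pseudonymize(message["recipient"]),
        "sent_at": message["sent_at"],
    }


# Hypothetical raw record as it might arrive from a mail system export.
raw = {
    "sender": "Ana@example.com",
    "recipient": "ben@example.com",
    "sent_at": "2018-03-01T09:15:00",
    "subject": "Q1 plan",
}
clean = strip_to_metadata(raw)
print(sorted(clean))
```

Because the same address always maps to the same pseudonym under a given key, the network structure is preserved while names stay out of the analysis dataset.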

This organization’s data analysis has given it the foundation to forge a more insight-driven, High-Impact HR operating model, along with a range of follow-on applications to consider as it strengthens its ability to manage talent effectively. We’re excited to continue to work with our client to explore the insights this project uncovered, and hope to share more results as our work progresses.

Greg Szwartz is a managing director who leads the Life Science and Health Care Data Science practice of Deloitte Consulting LLP.
Nikola Andric is a manager with Deloitte Consulting LLP in the Strategy & Analytics practice, specializing in the application of data science solutions in the Life Sciences industry.


1 Neha Shah, Daniel Z. Levin, Rob Cross, “Secondhand social capital: Boundary spanning, secondhand closure, and individual performance,” Social Networks 52, May 2017.
2 Dimple Agarwal, Josh Bersin, Gaurav Lahiri, Jeff Schwartz, Erica Volini, “People data: How far is too far?” 2018 Deloitte Global Human Capital Trends: The rise of the social enterprise. https://hctrendsapp.deloitte.com/reports/2018/people-data.html
3 Bersin, Deloitte Consulting LLP, High-impact people analytics research, 2017
