When Jonathan Goldman arrived for work in June 2006 at LinkedIn, the business networking site, the place still felt like a startup. The company had just under 8 million accounts, but users weren’t seeking out connections with the people who were already on the site at the rate executives had expected. Something was apparently missing in the social experience.
Goldman was intrigued by the linking he did see and by the richness of the user profiles. It all made for messy data and unwieldy analysis, but as he began exploring people’s connections, he started to see possibilities. He began forming theories, testing hunches and finding patterns that allowed him to predict whose networks a given profile would land in. He could imagine that new features capitalising on the heuristics he was developing might provide value to users.
Goldman started to test what would happen if you presented users with names of people they hadn’t yet connected with but seemed likely to know – for example, people who had shared their tenures at schools and workplaces.
It didn’t take long for LinkedIn’s top managers to recognise a good idea and make it a standard feature. “People You May Know” ads achieved a click-through rate 30% higher than the rate obtained by other prompts to visit more pages on the site. They generated millions of new page views, and LinkedIn’s growth trajectory shifted significantly upward.
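The kind of heuristic Goldman tested, suggesting unconnected people who shared a tenure at a school or workplace, can be illustrated with a minimal sketch. This is not LinkedIn's actual method: the `Tenure` record, the `shared_tenures` and `suggest` functions, and the sample profiles are all hypothetical, and the ranking rule (count of overlapping places) is an assumption chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenure:
    place: str  # school or employer name (hypothetical sample data)
    start: int  # first year at the place
    end: int    # last year at the place, inclusive


def shared_tenures(a, b):
    """Return the places where two people's tenures overlap in time."""
    return sorted({
        x.place
        for x in a
        for y in b
        if x.place == y.place and x.start <= y.end and y.start <= x.end
    })


def suggest(user, profiles, connections):
    """Rank people the user is not yet connected to by shared tenures."""
    already = connections.get(user, set())
    scored = []
    for other, tenures in profiles.items():
        if other == user or other in already:
            continue  # skip the user and existing connections
        shared = shared_tenures(profiles[user], tenures)
        if shared:
            scored.append((other, shared))
    # Most shared places first; ties broken alphabetically by name.
    scored.sort(key=lambda item: (-len(item[1]), item[0]))
    return scored


profiles = {
    "ana": [Tenure("State University", 1998, 2002), Tenure("Acme Corp", 2003, 2006)],
    "ben": [Tenure("State University", 2000, 2004), Tenure("Acme Corp", 2005, 2008)],
    "cho": [Tenure("Acme Corp", 2004, 2005)],
    "dev": [Tenure("State University", 2005, 2009)],
    "eve": [Tenure("Acme Corp", 2003, 2004)],
}
connections = {"ana": {"eve"}}  # ana already knows eve

suggestions = suggest("ana", profiles, connections)
for name, places in suggestions:
    print(name, places)
```

In this toy data, "dev" attended the same university as "ana" but in non-overlapping years, so the sketch leaves that profile out, while "eve" is excluded as an existing connection. A production system would presumably weigh many more signals than shared tenures.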
Goldman is a good example of a new key player in organisations: the data scientist, a high-ranking professional with the training and curiosity to make discoveries in the world of big data. The sudden appearance of data scientists on the business scene reflects the fact that companies are now wrestling with information that comes in varieties and volumes never encountered before. If your organisation stores multiple petabytes of data, if the information most critical to your business resides in forms other than rows and columns of numbers, or if answering your biggest question would involve a “mashup” of several analytical efforts, you’ve got a big data opportunity.
Who are they?
If capitalising on big data depends on hiring scarce data scientists, then the challenge for managers is to learn how to identify that talent, attract it to an enterprise and make it productive. None of those tasks is as straightforward as it is with other, established organisational roles. To begin with, there are no university programmes offering degrees in data science. There is also little consensus on where the role fits in an organisation, how data scientists can add the most value and how their performance should be measured.
The first step in filling the need for data scientists, therefore, is to understand what they do in businesses.
Data scientists make discoveries while swimming in data. At ease in the digital realm, they are able to bring structure to large quantities of formless data and make analysis possible. They identify rich data sources, join them with other, potentially incomplete data sources and clean the resulting set. In a competitive landscape where challenges keep changing and data never stop flowing, data scientists help decision-makers shift from ad hoc analysis to an ongoing conversation with data.
Data scientists realise that they face technical limitations, but they don’t allow that to bog down their search for novel solutions. As they make discoveries, they communicate what they’ve learned and suggest its implications for new business directions. Often they are creative in displaying information visually and making the patterns they find clear and compelling. They advise executives and product managers on the implications of the data for products, processes and decisions.