Why should library professionals learn about data?
Dogfooding and Data Science
“Eating your own dogfood,” or “dogfooding,” is the act of using the product you are designing. When the designer of a product actually uses the product in their everyday life, they hit the pain points that a user would experience, before the product ever goes to market. A good designer will keep improving the product until it meets her or his own standards, and only then is it fit for consumption.
“Data Science” is a phrase you may be more familiar with. Like “Big Data,” data science is a buzzword that’s reached critical mass. Suddenly, everyone is aware of it even if they aren’t exactly sure what “it” is. There are lots of rival definitions, but for the sake of simplicity let’s just say that the main tasks of a data scientist are finding data and using it to learn new things, usually with nifty visualizations that make their findings easier for others to understand.
A huge part of our work as librarians is in finding information, organizing it and making it more accessible to others. Whether we do that through collection development, tech services, reference, scholarly communication or digital scholarship, we are all tied to that core work with data and information.
While some may take this as an argument that librarians are actually a kind of data scientist, I think our job is more important than that. Good data science depends on good data, and the role of libraries is rapidly shifting towards a new goal: becoming producers of top-quality web data. Think about it: libraries have always been at the epicenter of the art of describing things. We even standardized the way we describe things so other libraries could read our descriptions. The entire history of libraries has been leading up to the semantic web, where things are described so consistently that even machines can read and make sense of these descriptions. This is no trivial task. But while Web 3.0 is still far off on the horizon, libraries are working with the rest of the world to lay the foundation of the semantic web with technology like BIBFRAME and linked data.
What does this have to do with dogfooding and data science? While it’s not crucial for most librarians to learn data science, we do our work in a world of data, information, and metadata. A grounding in the concepts and methods of data science could be incredibly helpful to many librarians engaged in producing quality data for the web. Learning about data science expands the way we think about data and its uses by patrons and consumers, which in turn expands the way we think about our own library data. The librarian who can put on her data scientist hat and actually use the data she is producing is the ultimate dogfooder.
How Do I Get Involved?
If you would like to learn more about data science, there has never been a better time. Take a look at the data science specialization at Coursera, a free 9-class curriculum aimed at beginners that repeats monthly. You can also join the Data Science Study Group on Google Groups. We’re a friendly bunch of librarians discussing data science and how it applies to our work. We’re just starting the course on R now, but you can take any of the data science courses at any time and chime in with your questions and comments.
The possibilities are endless for learning about data science and applying it to your work as a librarian (data literacy! assessment! charts that show you deserve a raise!). As our world becomes more and more data driven, data skills will only increase in value, for us and for everyone we work with.
[Ed. note: for more thoughts on work and useful apps/tools, see Bryan’s recent profile on the ACRL TechConnect blog.]