15- Microsoft Academic Search: An Overview and Future Directions
Lee Dirks1
Microsoft Research Connections
I would like to brief you on what we have been doing lately with the Microsoft Academic Search service. It started as a research project that has been conducted at our Beijing lab for almost eight years now. Over the course of the last eighteen months, our team in Redmond has gotten very involved in providing strategic guidance and input. Currently, we are in the process of transitioning it from a research project into an operational service that Microsoft Research will provide to the community. It will be a free academic search engine for tracking academic papers, citation links, and all the various characteristics that can be extracted from papers.
Over the last six to nine months, we have been working directly with open access repositories and publishers around the world to sign content agreements so that we can get access to their papers. This is all about facilitating access to the papers. At present, we have 27 million papers across 14 domains, and we have another 100 million papers across more than 20 domains in the queue, pending indexing. We are going to expand our content about every three months, and we are already actively evolving the site.
The content agreements I mentioned earlier, signed with the various open access repositories and publishers, are there to make sure that content providers are aware that we are making their data available for free. We are very interested in having the community use this service as widely as possible.
I also would like to stress that we are being as transparent as possible in talking about the number of publications and authors that we have. As soon as possible, we are going to post a list of the publishers and all the sources of this material. We are also waiting for ORCID to come online, at which point we intend to leverage their work and use their identifiers to help in the name disambiguation process.
Through the Academic Search service, people will be able to look at citations or publications on a cumulative or an annual basis. The service also has some powerful visualization abilities. For example, we will be able to show a single author in connection with all the people that they have worked with in the past (e.g., co-authors).
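As a rough illustration of the two views mentioned above, the following minimal sketch derives annual and cumulative citation counts from a list of citation years and collects an author's co-authors from a list of author lists. It is not the service's actual implementation; the function names, the input shapes, and the sample data are all assumptions made for this example.

```python
from collections import Counter
from itertools import accumulate


def annual_counts(citation_years):
    """Return {year: citations received that year}, ordered by year."""
    return dict(sorted(Counter(citation_years).items()))


def cumulative_counts(annual):
    """Turn ordered per-year counts into running totals."""
    return dict(zip(annual, accumulate(annual.values())))


def coauthors(author, papers):
    """Collect everyone who shares at least one paper with `author`."""
    found = set()
    for authors in papers:
        if author in authors:
            found.update(a for a in authors if a != author)
    return found


# Hypothetical sample data, for illustration only.
cited_in = [2008, 2009, 2009, 2011]
sample_papers = [
    ["A. Author", "B. Writer"],
    ["A. Author", "C. Scholar", "B. Writer"],
    ["D. Other", "E. Person"],
]

per_year = annual_counts(cited_in)
running = cumulative_counts(per_year)
network = coauthors("A. Author", sample_papers)
```

Years with no citations are simply absent from both views in this sketch; a display layer could fill those gaps with zeros if a continuous timeline is needed.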
Another thing I would like to highlight is the system’s ability to drill down into fields and subfields. For computer science, for example, you can look at the top authors, top publications, top conferences, journals, organizations, and other characteristics. (Note that this ranking is based solely on citation counts that we have calculated.) You can also drill down into a sub-domain of computer science and visualize, for example, publication activity using what we call the Domain Trend. We believe that Domain Trend is a very useful tool for helping researchers find co-authors, principal investigators, and even awards and people to invite to conferences. It is also possible to rank across institutions and across countries.
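To make the idea of a ranking based solely on citation counts concrete, here is a minimal sketch that ranks authors within one field. It is only an illustration under stated assumptions: the record layout (`field`, `authors`, `citations`) is hypothetical, and crediting each co-author with a paper's full citation count is a choice made for this example, not something the talk specifies.

```python
from collections import defaultdict


def top_authors(papers, field, n=3):
    """Rank authors in `field` by total citations (ties broken by name)."""
    totals = defaultdict(int)
    for paper in papers:
        if paper["field"] != field:
            continue
        for author in paper["authors"]:
            # Assumption: every co-author gets the paper's full count.
            totals[author] += paper["citations"]
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical records, for illustration only.
ranking_sample = [
    {"field": "computer science", "authors": ["A", "B"], "citations": 10},
    {"field": "computer science", "authors": ["B"], "citations": 5},
    {"field": "physics", "authors": ["C"], "citations": 99},
]

cs_ranking = top_authors(ranking_sample, "computer science")
```

The same aggregation could be keyed on institution or country instead of author to produce the cross-institution and cross-country rankings described above.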
______________________
1 Presentation slides are available at http://sites.nationalacademies.org/PGA/brdi/PGA_064019.
Again, all of that information is free. We have been getting some good coverage lately, especially about some of the new functionalities of the system. Here is a recent quote from Nature2:
“…Meanwhile, Microsoft Academic Search (MAS), which launched in 2009 and has a tool similar to Google Scholar, has over the past few months added a suite of nifty new tools based on its citation metrics (so.nature.com/u1ouut). These include visualizations of citation networks (see ‘Mapping the structure of science’); publication trends; and rankings of the leading researchers in a field.”
I would like to stress that the work we are doing here is for researchers and by researchers. That is something we will always keep in mind as we grow and make this a more sustainable service. We are also very interested in changing our interface and in extending citation analysis beyond papers to, eventually, data. We are very interested in conducting research projects with the community. From our perspective, Microsoft Academic Search is an open platform, and we are going to be as transparent as we can about our work. We want to make sure that this service accurately represents how science and academia work.
We are going to make our domain coverage more extensive, and we are working on more partnerships. For example, we are an associate member of DataCite and a founding sponsor of ORCID. Finally, we are tracking these and other activities to see when and how we can integrate them into our service.
______________________
2 Butler, D. 2011. Computing giants launch free science metrics. Nature 476, 18 (4 August 2011). doi:10.1038/476018a.