
UOA 37 - Library and Information Management

University of Wolverhampton

RA5a: Research environment and esteem

Introduction

The Statistical Cybermetrics Research Group (SCRG), founded by Thelwall in December 2000, is a small but growing specialist library and information science (LIS) research group. During 2001-2007, the SCRG aimed to make a range of fundamental contributions to the new research area of webometrics. To this end, we aimed to conduct our research according to the standards required by the best journals in the field (using, for example, the rankings of the U.S. LIS Deans [College & Research Libraries, 66(4), 341-377]). The primary evidence for our success is the large number of publications in high quality international journals 2001-2007, including 19 articles in the Journal of the American Society for Information Science and Technology (JASIST - ranked 1st by the U.S. LIS Deans), 7 articles in Information Processing & Management (ranked 3rd), and 6 in Journal of Documentation (ranked 5th). In addition, we attracted £0.3M in external funding, mainly from Europe.

To create and maintain the capacity to exploit new webometric opportunities, one of the objectives of the SCRG has been to use Thelwall's expertise as the foundation for developing researchers from other cognate areas into information scientists. The core group now consists of four researchers, none of whom was submitted to RAE 2001. The wider group also encompasses one additional faculty member, Musgrove, who participates in PhD supervision and joint papers whilst maintaining his own computer science research and PhD students (not included in our RAE statistics). The full group includes research assistants, postdoctoral researchers working on externally funded projects, full-time and part-time PhD students, and a range of honorary positions. Two of these honorary positions have been awarded to the head and deputy head of the Virtual Knowledge Studio for the Humanities and Social Sciences, Amsterdam, in recognition of the long-standing collaboration between our two groups.

 

Research Strategy

2001-2004

The research area of webometrics is concerned with the quantitative analysis of web data. It is closely related to cybermetrics, which is the quantitative analysis of all internet data, including the web and email. Webometrics was in its infancy when the SCRG started. At that time several important researchers, including Borgman, Cronin, Ingwersen, and van Raan, had written about the potential for conducting citation analysis studies using web pages. The first empirical evidence that this was possible was Thelwall’s 'Extracting macroscopic information from web links' (2001, JASIST, with 66 ISI citations). This found a statistically significant correlation between web links to UK university web sites and RAE-based research productivity scores, opening the door to web-based link analysis.

A period of software and method development followed in order to produce improved link-based metrics to measure various aspects of web-based scholarly communication. This led to the development of the Alternative Document Model (ADM) counting mechanisms, based upon the premise that counting individual links gave results that could be influenced by design considerations. In particular, links in web site navigation bars may be replicated to thousands of pages, producing thousands of links. ADM counting, in contrast, can total the sites that link rather than the individual pages, giving more accurate results (and more statistically significant correlations with research productivity, up to r=0.94).
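The difference between page-level and site-level counting can be illustrated with a minimal sketch. It is not the SCRG's software: the function names, the example URLs, and the choice to treat a URL's hostname as its "site" are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def page_level_inlinks(links, target_site):
    """Count every individual link that points at target_site."""
    return sum(1 for _, dst in links if urlparse(dst).hostname == target_site)


def site_level_inlinks(links, target_site):
    """Count the distinct source sites that link to target_site.

    Assumption: a "site" is identified by the hostname of the source URL.
    """
    return len({
        urlparse(src).hostname
        for src, dst in links
        if urlparse(dst).hostname == target_site
    })


# Hypothetical data: a navigation-bar link replicated on three pages of one
# site, plus a single link from a second site.
example_links = [
    ("http://www.a.example/index.html", "http://www.b.example/"),
    ("http://www.a.example/staff.html", "http://www.b.example/"),
    ("http://www.a.example/news.html", "http://www.b.example/"),
    ("http://www.c.example/links.html", "http://www.b.example/research/"),
]

print(page_level_inlinks(example_links, "www.b.example"))
print(site_level_inlinks(example_links, "www.b.example"))
```

In this example the replicated navigation-bar link inflates the page-level count, whereas the site-level count registers the first site only once.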

Following this research, the SCRG developed and enhanced methods for calculating and interpreting link-based metrics, applying them in a range of contexts, including different countries, scales of study (national, international), organisational sizes (university, department, research group, individuals) and types of link (commercial, academic, personal). A consequence of the necessity to explore new potential applications and variations, and the need to research quickly in the fast-moving web environment, was a high publication volume.

The culmination of this research was the publication of the first link analysis book, Link Analysis: An Information Science Approach (Academic Press, 2004), and the SocSciBot range of free software for link analysis, in order to engage wider social science researchers in link analysis research. A portion of this research was funded as part of the EU Web Indicators for Science, Technology and Innovation Research (WISER) project in conjunction with SPRU in the UK and two European partners. Subsequent EU funding was obtained for applied joint bibliometrics-webometrics projects, using web links as part of a wider strategy to understand EU researcher international mobility.

In parallel with theory building and software construction, the years 2001-2004 were spent developing researchers who were formerly mathematicians. This included training in the new techniques and the formation of a collegial working ethos, producing a tightly-knit group with complementary skills but a common understanding of key tasks and methods.

Our early theoretical and experimental contributions have helped webometrics to emerge as a recognised important new LIS research area. It has appeared prominently in many published bibliometric characterisations of the field, for example: 'One research area that has grown considerably, to the extent that it has come to dominate LIS research over the last 5 years, is webometrics.' (JASIST 58, 947-957).

 

2004-2007

During 2004, the research group started to develop other internet-based research directions with the analysis of blogs in order to take advantage of the sudden availability of large volumes of public commentary online. This was triggered by engagement in another EU project (CREEN, see “Research Income” below), investigating the complexity of evolving networks to see if blog analysis could lead to early warning of key science-policy related public debates. This gave an opportunity to adapt our existing techniques and to transfer them to other areas. For instance, we demonstrated that an effective way of assessing the volume of debate around a topic was to count the daily number of bloggers mentioning the topic. This is superior to counting blog postings, for example, because the same content is often replicated across multiple postings on the same day. Our unique contribution was the development of large-scale automated blog analysis for social science research goals.
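The blogger-counting idea can be sketched as follows. This is an illustrative sketch only: the record layout (day, blogger, text), the simple case-insensitive substring match, and the sample data are assumptions, not a description of the group's actual blog analysis software.

```python
from collections import defaultdict


def daily_topic_counts(posts, topic):
    """Return {day: (number_of_postings, number_of_distinct_bloggers)}.

    posts is an iterable of (day, blogger, text) tuples (hypothetical layout).
    A posting mentions the topic if the topic string occurs in its text,
    ignoring case.
    """
    postings = defaultdict(int)
    bloggers = defaultdict(set)
    needle = topic.lower()
    for day, blogger, text in posts:
        if needle in text.lower():
            postings[day] += 1
            bloggers[day].add(blogger)
    return {day: (postings[day], len(bloggers[day])) for day in postings}


# Hypothetical data: one blogger posts the same content three times in a day.
example_posts = [
    ("2005-03-01", "blogger_a", "Stem cell debate heats up"),
    ("2005-03-01", "blogger_a", "Stem cell debate heats up"),
    ("2005-03-01", "blogger_a", "Stem cell debate heats up"),
    ("2005-03-01", "blogger_b", "Thoughts on the stem cell vote"),
    ("2005-03-02", "blogger_c", "Nothing relevant today"),
]

print(daily_topic_counts(example_posts, "stem cell"))
```

Here the posting count for the first day is inflated by the repeated content, while the blogger count reflects how many different people raised the topic.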

In parallel with the blog analysis research, we developed other types of internet-based text analysis and a new type of issue analysis. For example, in a Dutch government funded project (see “Research Income” below), we tracked online evidence for the spread of the United Nations Integrated Water Resource Management initiative by identifying and analysing relevant online documents, showing the importance of commercial considerations for this environmental issue.

In 2007 we again adapted our methods, this time for the hyperlink and text analysis of social network sites like MySpace, in order to gain insights into this important new phenomenon. We also began to engage through conferences, workshops and edited books with the wider e-social science community in the UK and Europe.

The SCRG’s research has been highly focussed since its inception, with the primary research direction being the exploitation of the recent explosion in new online information sources - such as social networks and blogs - for diverse social science research goals. We aim to be part of an emerging e-Social Science research agenda that addresses social issues in new and revolutionary ways. For example, we have recently recruited PhD students to the new areas of bibliometrics and internet social marketing; we are researching MySpace for large-scale gender differences in communication and language use; and we are investigating large-scale cultural and gender differences in the online reposting of jokes (with Oxford).

 

Publication strategy

Although the publications listed in RA2 are primarily in high-ranking LIS research journals, our strategy has been to use a broad range of publication types and outlets. For example, a paper in the Communications of the ACM gave a popular introduction to a new research initiative; the link analysis book was designed to convey our methods to a wide audience of social science researchers; and many applied studies have been published in journals such as Scientometrics and Online Information Review to demonstrate the usefulness of webometric techniques. The SCRG has also published book chapters, including general social science books, and encyclopaedia articles. Our free data collection and processing software (LexiURL, SocSciBot, Mozdeh) and databases are also important research outputs.

 

Mechanisms to develop and renew strategy

As a small group, we have a simple collegial research approach, concentrating flexible researchers on the most recent promising research directions, but supporting project-based researchers (PhD students and contract research postdocs) to continue developing other research areas. Our approach to strategy depends on the ability to react quickly to new developments and information sources and is summarised by our theory of “information-centred research”, which we published in an article co-authored with the head of the Amsterdam Virtual Knowledge Studio for the Humanities and Social Sciences.

In brief, the information-centred research idea is that new information sources (e.g., web links) often initially seem promising for a number of different types of research but experience with the data rules out some avenues. Information science, however, can explore the data with the objective of identifying which areas of social science may benefit from the data and then help them to analyse it. This gives information scientists a new role in research as a kind of proactive “data librarian”.

 

Main objectives and planned activities for next five years

We plan to continue our strategy of (a) fast flexible reaction to exploit new internet-based information sources, (b) continuing to enhance previously developed methods, and (c) collaborating with other researchers to help our methods to be used outside an information science context, particularly in wider social sciences. In the nature of these tasks, specific milestones cannot be preset. Nevertheless, we aim to attract £100,000 of external funding and to support one new PhD student per year, based upon our recent track record. Our long-term aim is to become known in Europe as one of the first ports of call for social science internet-based quantitative research methods. Our secondary aim is to develop avenues to commercially exploit our techniques. This might take the form of developing direct marketing of information “products” to consumers but our main goal is to develop partnerships with large organisations with significant online information needs (e.g., the BBC and other media sources) to develop useful methods in cooperation with end users.

 

Research Students and Research Studentships

The research group has hosted few PhD students because of its youth. In the context of limited research funds in Wolverhampton and in accordance with the University’s Code of Practice for the supervision of PhD students, it needed to demonstrate a good research track record before being allocated funding for PhDs. Its first PhD was awarded in 2006 (taking just over 4 years). Two further students are set to complete in late 2007 (both in under 4 years) and there are three further PhDs at earlier stages but on target to complete on time. We use income from externally-funded projects to finance new PhD student bursaries, although we are now also beginning to attract self-financing PhDs.

As the group's founder, Thelwall has taken a hands-on approach to PhD supervision and is the lead supervisor in all PhD projects to date. However, in keeping with the strong developmental ethos of the group, members have been encouraged to develop their supervisory skills through active involvement in PhD projects, progressing to formal co-supervision (e.g., Wilkinson), and ultimately, to lead supervisor. New supervisors also receive support from the University's Research Supervisors Development Programme. Members have also supported researchers outside of Wolverhampton. This has included acting as second supervisors for overseas PhD students in Denmark, Iran and Finland; giving courses to PhD students (Aberystwyth, Tallinn, Oslo, plus lectures to the Oxford Internet Institute PhD programme); and providing ongoing informal distance support to PhD students in Germany and India.

SCRG PhD students are co-located and participate in the university postgraduate training as well as a monthly Cybermetrics forum for PhD students. In addition we mix with the computational linguistics PhD students and researchers – a group with a similar type of skill set – in order to overcome small group isolation. The students are also encouraged and financed to travel to conferences and meetings where they can meet PhD students or other researchers in the field.

Formal student management is conducted by a committee run in conjunction with the School of Computing and the Computational Linguistics Research Group, ensuring wide-ranging expertise.

 

Research Income

The group has participated in the projects listed below. Completed projects have all been judged successful and have delivered interesting new research. The research funds have primarily financed postdoctoral researchers.

Funding has been difficult to obtain, given the generally low level of research funding in the area in which we operate. However, because of our internet data collection and analysis expertise in addition to general social science and information science problem-solving capabilities, we have been invited to join several high profile European research teams (reported in RA4):

  • 2002-2005 EU funded project WISER (led by the Royal Netherlands Academy of Arts and Sciences): Web indicators for science, technology and innovation research.
  • 2005-2008 EU funded project CREEN (led by the Institute of Physics, Warsaw University of Technology): Critical events in evolving networks.
  • 2005-2006 EU research contract NetReAct (led by empirica GmbH): The role of networking in EU life sciences research.
  • 2006-2007 EU research contract RESCAR (led by empirica GmbH): The role of networking in EU engineering and social sciences research.

In addition, the following projects are not reported in RA4, being either HEFCE-funded or finance spent through partners.

  • 2005-2006 JISC funded project: “User Needs and Potential Users of Public Repositories: An Integrated Analysis”. Wolverhampton’s share £43,000, as the lead institution. With Loughborough.
  • 2003-2006 Canadian Social Sciences and Humanities Research Council project: “Mining the Web for Business Intelligence through Link Structure Analysis”, led by Professor Liwen Vaughan of the University of Western Ontario. Award: $165,479 Canadian dollars (co-applicant).
  • 2005-2007 Dutch government project: “In the web and on the ground: global circulation and local achievements of a prospective shift in governance”. Led by Katie Vann of Amsterdam. Total award: £100,000.
  • 2005 Italian government project led by the Università Cattolica del Sacro Cuore, Milan. (£30,000).

 

Research Structure and Environment

The SCRG is part of the Information and Language Processing Research Institute (ILP) in association with the Computational Linguistics Research Group. This initiative exploits an increasing overlap in interests, for example in the processing of web and blog data to extract trends and for opinion mining, and has led to three joint research bids.  This collaborative venture gives wider support to the postgraduates and postdoctoral researchers in both teams. 

A speaker programme is maintained to stimulate the research culture, drawing on external speakers mostly from overseas, including Canada (2), Denmark (2), and Australia (2).

The group is located within a new School of Computing building, co-located with computing researchers and computer scientists working on external funding projects. Its location ensures that it has ample support for its extensive computing needs in terms of computer power, technicians, and bandwidth.

 

Staffing policy

For the first few years, the main focus of the group was training its members in information science research. This initially took the form of apprenticeship through collaborative research projects led by Thelwall, with tasks allocated to other members, and all contributing to the final write-up. Researchers completing one or more projects then had confidence to develop their own projects and specialisms. The large number of projects attracted by the group has supported a rapid learning curve, so that all core members of the team have not only developed as independent researchers but have also developed their own areas of expertise: Wilkinson specialises in the classification of hyperlinks and their creation motivations; Price focuses on quantitative analyses of the text in web sites; and Fairclough specialises in web-based issue analysis.  

The research group has capitalised on the following strengths in order to develop its researchers effectively:

  • Aptitude of staff, all of whom are mathematically trained.
  • Time being made available, albeit accidentally, by falling mathematics student numbers.
  • Development of unique and powerful software suites for internet data gathering and analysis.

Workload allocations are adjusted to ensure that research-active staff continue to have time to pursue their own research, to supervise PhD students, and to develop opportunities for collaborative research. 

Esteem Indicators

As a small, closely-knit group, a number of esteem indicators, both quantitative and qualitative, reflect the achievements of the group as a whole:

Quantitative Indicators

The key quantitative indicator is the number of articles published in highly ranked journals. Although the group published many articles in the period (103), quality has not been sacrificed. For the 4 submitted researchers, the 2001-2007 output included the following, with journal rankings by U.S. Library and Information Science Deans included as evidence of prestige. In addition to journals of widely accepted quality, such as the top 5 in the table below, we have published in field-specific journals such as Scientometrics, Information Research, and Online Information Review, as well as some journals from other fields, such as the Journal of Computer Mediated Communication, New Media & Society, Communications of the ACM, and Social Science Computer Review; we have also published in UK LIS journals including ASLIB Proceedings and Journal of Information Science (all ISI-indexed).

 

Journal | Number of articles 2001-7 | U.S. LIS Deans’ Ranking*
Journal of the American Society for Information Science and Technology | 17 (+3 in press, +2 by non-submitted researchers) | 1
Library Quarterly | 0 (+1 in press) | 2
Information Processing & Management | 7 | 3
Library and Information Science Research | 2 (+2 in press) | 4
Journal of Documentation | 6 (+2 in press) | 5
ARIST | 1 | 6
Scientometrics | 14 (+2 in press) | 7
Information Research | 2 (+1 in press) | 14
Journal of Information Science | 10 (+1 in press) | 27
Communications of the ACM | 1 | NA

* “Journal prestige in terms of value for tenure and promotion” [Table 2 in: College & Research Libraries, 66(4), 341-377]

The most highly cited paper ('Extracting macroscopic…') received 66 ISI Web of Science citations and 102 Google Scholar citations (as of September 2007) - probably one of the most highly cited UK information science papers of the current RAE period [N.B. This was superseded by another paper and is not in RA2]. We have also received five Institute for Scientific Information (U.S.) social science research impact awards, including “3rd hottest social science paper” and “new hot paper” (see cybermetrics.wlv.ac.uk).

 

Qualitative Indicators

The group’s international reputation has led to a high demand for its members to join international collaborations. In addition to the international collaborative externally funded projects mentioned above, 19 overseas academics from 17 universities (including SUNY Albany, Penn State, University of Wellington, Royal School of LIS in Copenhagen, University of Western Ontario, Canada) in 15 countries have produced at least one co-authored refereed journal article with one or more cybermetrics group members in 2001-2007. Within the UK we have co-authored papers with Oppenheim (Loughborough) and Fry (Oxford).

 

As indicated in 'Staffing Policy', the group is also keen to develop the individuals within it by encouraging them to take up opportunities that reflect their growing knowledge and expertise. Esteem indicators are reported below for each researcher, including h-indexes (a researcher's h-index is the largest number h such that at least h of their publications have at least h citations each).
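The h-index definition can be made concrete with a short sketch. The function name and the citation counts are hypothetical and do not correspond to any researcher's record.

```python
def h_index(citations):
    """Largest h such that at least h publications have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for five publications.
print(h_index([10, 8, 5, 4, 3]))
```

For these counts the result is 4: four publications have at least 4 citations each, but there are not five with at least 5.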

Professor Mike Thelwall (1 book and 96 refereed journal articles in 2001-2007, h-index 15 for 2001-2007 publications according to the Web of Science; h-index 24 in Google Scholar). Thelwall was credited as the third most successful UK information scientist of all time, using the h-index measure, even though his score was based almost exclusively on publications in the current RAE period (Oppenheim, JASIST 2007, 58, 297-301). Honorary Research Fellow of the Virtual Knowledge Studio for the Humanities and Social Sciences, Amsterdam (similar to the UK’s National Centre for e-Social Science). Docent at Åbo Akademi University, Finland. Research Associate of the Oxford Internet Institute, Oxford University. Editorial board member for 7 journals including: Journal of the American Society for Information Science & Technology; Journal of Information Science; Library and Information Science Research; Scientometrics; Online Information Review. Member of the editorial boards of the book series: International Yearbook of Library and Information Management (Facet); Information Science and Knowledge Management (Kluwer). Guest editor for the Journal of the American Society for Information Science and Technology webometrics special issue 55(14), 2004. Programme committee member for four conferences: 5th International Conference on Conceptions of Library and Information Science; ISSI 2005 and ISSI 2007 Conferences on Scientometrics and Informetrics; and the Third International Workshop on Adversarial Information Retrieval on the Web 2007. Reviews completed for 25 journals and 13 conferences. Proposals reviewed for: AHRC and ESRC (UK); Social Sciences and Humanities Research Council (Canada).

Plenary speaker at COLLNET 2006; Keynote speaker at 11th Nordic Workshop on Bibliometrics, Informetrics and Research Policy 2006; many guest talks, including Oxford University; Åbo Akademi University, Finland; and CINDOC, Spain.

Thelwall is in several published international maps of library and information science or information scientists, including Persson’s maps of influential information science authors, and Åström’s and Small’s maps of hot areas in information science.

He has also received the following recognition:

  • ‘Highly Commended’ for papers published in ASLIB/Emerald publications in 2002 for ‘Evidence for the existence of geographic trends in university web site interlinking’.
  • Online Information Review ‘Outstanding Paper’ award for ‘Can the Web give useful information about commercial uses of scientific research?’.

Dr David Wilkinson (12 journal articles in 2001-2007, h-index 5 for 2001-2007 publications according to the Web of Science). Editorial board member for the journal Cybermetrics. Referee for three journals, including the Journal of the American Society for Information Science & Technology, and one conference. Member of the Heads of Departments of Mathematical Sciences.

Liz Price (8 journal articles in 2001-2007, h-index 5 for 2001-2007 publications according to the Web of Science). Referee for one journal and one conference. Chair of a maths education conference and committee member of a UK maths education organisation.

Ruth Fairclough (6 journal articles in 2003-2007, h-index 2 for 2003-2007 publications according to the Web of Science).