Chapter 10: Citation, bibliometrics and quality: assessing impact and usage – Academic and Professional Publishing

Adam Finch

Abstract:

This chapter details the various methods of evaluating the impact of published research, with a particular focus on citations. The chapter gives an overview of the difficulties of measuring research impact and the solutions and controversies of citation analysis, then goes on to look at the indices that record citations between articles and the metrics that these data feed into. Also discussed are publishers’ approaches to improving journal impact, various recent developments in citation analysis, such as the influences of early view and open access, and author metrics.

Key words

Research performance

Web of Science

Scopus

Google Scholar

Impact Factor

EigenFactor

SJR

SNIP

Open Access

early view

h-index

strategic journal development

Introduction

One of the challenges facing organisations manufacturing or using a product is measuring its quality and comparing it with that of competitors in a standardised way; although it is different from normal commercial products in many respects, the same is true of research. In most cases, the primary initial output of research takes the form of journal articles, conference proceedings and books; these publications then become the focus of evaluation. Authors may need to prove the importance of their work when applying for a new post; institutions might wish to demonstrate and publicise their strengths; funding agencies seek to support the strongest research and evaluate the impact of the funding; and Editors can build reputations by improving their journal. Indeed, for research, the importance of indicators of quality is twofold; not only are authors producing research, they are incorporating the research of others into their work and so need a way to identify crucial islands of work in the sea of information. As it is a priority for their key customers and clients, publishers too are increasingly focused on the best way to demonstrate the usefulness, and therefore the value, of the research they publish.

Fortunately, there are measurable elements associated with publications that have been used to analyse their quality. The best and longest established example is counting citations (Gross and Gross, 1927). When one publication includes another in its reference list, this is a citation. As research tends to build incrementally, each new conclusion based on and extending the established body of knowledge, these citations are taken as recognition of value in the cited work. On this view, the more a work is cited by other publications, the more valuable it is taken to be.

With the online revolution, usage too became a viable option. When the online version of a work is accessed, this can be counted like the hits measured for any page on the web. Less significant research may attract chance usage based on the relevance of the title or an author involved, but more significant work is likely to be more popular and accessed a great many times. It has been argued that usage data are in some ways more useful than citation data as they provide a more current picture of research importance (Bollen et al., 2009).

Analysis of the content of a book or journal can also be a useful third approach. If articles or chapters dealing with a specific topic or from a certain author or institution are becoming more common, that may indicate a rise in the importance or acceptance of an idea or researcher. The study of the measurable elements of publications – citations, usage and content – is referred to as bibliometrics, although the term is most commonly used in publishing to refer to citation analysis.

There have been moves to use other data to evaluate research but these are not yet as widely accepted. Counting patents, for example, has been proposed, but as different countries and regions have different patent laws and databases, establishing counts and coverage is difficult. Features tracking references to research in social media and blogs have been introduced by some publishers (PLoS One, 2009) but this approach is too recent to have become mainstream and remains more open to manipulation than other methods. This said, citations and usage are not without their problems either.

Quality, impact and popularity

It is tempting to use terms such as quality, impact and popularity more or less interchangeably; however, there are subtle but important differences. Citations, for example, can only demonstrate the impact of a work on a subject area, not the rather more subjective property of quality; a citation can be made to a work that is poorly written or presented, or recognising value in only part of the research presented. Moreover, citations may be made in contention with the cited work. Usually, it could be argued, this ultimately helps to establish consensus on a certain topic and so is still a valuable contribution to knowledge. In all cases, however, impact rather than quality is being measured.

Likewise, usage shows popularity rather than quality. A certain proportion of all usage will be based on the title, author name or the source in which the work appears, and just because a reader looked at an article does not mean they found it useful or important. Thus one might take impact and popularity as measuring elements of quality but not the whole.

Both citations and usage also have limitations and drawbacks as the foundations of metrics. Citations are inherently retrospective; it may take an author several years to read a work, incorporate it into their research, write an article addressing the topic and have this article published. Furthermore, the databases that index these citations tend to cover more journals in the US and Europe than elsewhere (Harzing and van der Wal, 2008), and coverage of the literature across subject areas varies wildly. The average number of citations in each subject area also differs, being more numerous in medicine, for example, than in mathematics. Self-citation, where authors or journals preferentially cite their own work to inflate their performance, may also be an issue (Van Raan, 2008). Many citations contain bibliographic errors and some are made without the cited work being read (Simkin and Roychowdhury, 2003). Some authors even question the underlying assumption that citations can be used to fairly measure impact (Todd and Ladle, 2008).

Usage data are more problematic still. Unlike citations, no central or neutral organisation counts usage for articles that are published. Although the COUNTER project has provided an immensely useful standard for what is counted when a user accesses an article (Shepherd, 2002), different publishers vary in their treatment of usage aggregated through third-party hosts such as Ovid. Usage counts can also be affected by changes in the indexing policies of Google and other search engines, making fair comparison over time impossible. Usage looks only at the electronic version of an article, ignoring print usage. It is also easier to manipulate than citations because all that is required is a click of a button rather than the publication of a citing article. Last but not least, publishers are also reluctant to yield their usage data, as they could provide business intelligence to competitors. Although work on a Usage Factor is underway (Usage Factors, 2010), no journal usage metrics have yet been produced for a significant proportion of titles across publishers.

In any case, neither usage nor citations provide us with a complete picture of research impact. The former is an indicator of where useful research was sought, while the latter only shows the subset of instances where a researcher’s use of the research was finally published. The situation was perhaps best elucidated by Carl Bergstrom: ‘Usage data tell us where the net was cast; citation data tell us where the fish were caught’ (Butler, 2009). Both approaches are utilised far more frequently with journal articles than with books. At the time of writing, only selected book series and collections of conference proceedings are included in either of the main citation indices and usage counts are most commonly provided by publishers for their journal products.

It is quite possible to undertake useful citation and usage analysis of book titles where the data exist, but for the aforementioned reasons, the commonly used metrics in publishing apply to journals and count citations rather than usage. It is these metrics on which this chapter will focus; but before metrics can be calculated, publications and citations must be recorded in a citation index.

Citation indices

There are currently three major, international citation indices available, each with different coverage of published research.

Web of Science

The longest-running citation index is the Web of Science (WoS), now owned by Thomson Reuters. WoS, which has existed since 1960 and indexes around 12 000 journals (Thomson Reuters, 2011c), is a part of the larger database, Web of Knowledge, which indexes some 23 000 titles with backfiles going back more than a century (Thomson Reuters, 2011b). However, full bibliographic information and a breakdown of the citations received each year since publication are available for approximately 8300 Science titles, 2900 Social Science titles and 1600 Arts and Humanities titles (with some overlap) along with a selection of conference proceedings in the Sciences and Social Sciences. Journals apply for coverage and are admitted to the index if they can demonstrate that their articles are scholarly, peer reviewed, attract a basic level of citation activity and are international in authorship or editorial coverage. Foreign language titles can be indexed as long as the abstract is available in English.

The data available include author names and addresses, keywords, volume, issue and page numbers, date of publication and article DOI (Digital Object Identifier – a unique code for each article, book chapter or book). Standard web access to WoS limits the downloading of records to batches of 500, which can make data acquisition for large analyses time consuming. The main bibliographic information for each article and the breakdown of citations per year are acquired on two different screens with different outputs, which do not share a common identifier; this means that some initial work is often required pairing the metadata for an article with its citations.

It is possible to search the index by author, article or journal title, address, country, year and document type, as well as less frequently used fields such as grant number and funding agency; however, it is not possible to search by subject area or publisher, so publishers must take alternative routes to gathering these useful data.

In the latest version of WoS, version 5, institutional addresses are aggregated from the form in which they appeared on the original work. ‘Lemmatisation’ has also been introduced, meaning that, for example, US and UK variant spellings of the same word need no longer be searched separately (Thomson Reuters, 2011a). It has historically been tricky to search for authors, many of whom share the same surname and initials (Deis and Goodman, 2007), but WoS has been enhanced to construct unique author sets to help disambiguate researchers and to include a separate author search page. It is also now possible to search by Researcher ID with links for each ID back to the full author profile on the Researcher ID site (Thomson Reuters, 2011a). Additionally, Thomson Reuters plans to address author ambiguity further as part of their Research in View product, by incorporating the ORCID (Open Researcher & Contributor ID) unique author identifier system (Haak, 2011).

A significant development in the WoS product is the Book Citation Index (BCI), scheduled for release in late 2011. Planned to initially contain approximately 25 000 volumes, the BCI will include much of the data currently available for journals and will be accessible through the same web interface. This is crucial for social sciences, arts and humanities titles, where a far higher proportion of the key literature appears in book form. It may also inaugurate a new era of citation analysis for books.

Scopus

Scopus is Elsevier’s competitor to WoS, launched in 2004. Covering nearly 18 000 peer-reviewed journal titles (Elsevier, 2011) as well as book series and conference proceedings, it covers more material than WoS (Falagas et al., 2008), although there is some evidence that the additional material is less well cited (Vieira and Gomes, 2009). The fields of data available via the Scopus web interface are very similar to those in WoS; the bibliographic data and year-on-year citation data are again stored on different screens and retrieved in different downloads. Records can be downloaded in batches of 2000, but consecutive batches cannot currently be easily selected, making the acquisition of larger data sets reliant on complicated filtering of results and thus more difficult and time consuming than on WoS.

Like WoS, Scopus attempts to resolve the issue of author ambiguity through the construction of unique author sets, although again these are sometimes incomplete (Deis and Goodman, 2007). Scopus also provides an institution search, aggregating the articles published by schools, departments and subsidiaries of an institution into one entry. A series of breakdowns by subject area and document type are also available for each institution. Searching is possible by publisher, although the entries sometimes give publisher imprints or out-of-date information, meaning these results are not entirely reliable. Searching by journal title may be less reliable than in WoS, however, as there are some variations in coverage across years, with some journal issues absent from the index (Deis and Goodman, 2007).

Google Scholar

Publishers and institutions must pay for access to WoS and Scopus, but a free alternative exists: Google Scholar (GS) eschews article data provided by publishers in favour of that garnered by automatic crawlers and parsers. The result is a service that theoretically trawls the entire world of online scholarly materials, providing records for journal articles not covered by the other citation indices and creating citation counts for all published researchers.

However, there is a strong case that you get what you pay for with this free service, as problems with data quality have been identified (Falagas et al., 2008). If it lacks a master record for a certain publication, an entry is created from citations to that publication; if these citations appear in different formats or with inconsistent details, multiple entries can be created for the same target article (Jacsó, 2009b). This can vastly inflate citation counts for authors and journals. The parser can erroneously create author names from text on a web page (such as ‘Please Login’ or ‘Payment Options’) or can ignore actual author names. Other investigations have shown that pseudo-article generators designed to test journal peer review can fool Google Scholar’s crawlers, introducing a new array of inflated article and citation counts (Labbe, 2010).

These problems may affect a relatively small proportion of the GS records, but without any idea of the number of bad records or the total number of GS records available, it is impossible to have confidence in this index. A set of free, independent analysis tools for GS citations called Publish or Perish is available, although these are for personal use only and not commercial use by publishers.

Other indices

In addition to the three main indices, there are a number of other services, including the Chinese Social Science Index, the Asian Science Citation Index and the Indian Citation Index. The creators of these indices often cite low levels of coverage in WoS and Scopus as the reason for their existence. Certainly North American and European articles tend to receive more citations in many subject areas, but whether this is because of a bias against research from the rest of the world or because North American and European research is genuinely and objectively superior has not yet been proven. In any case, the journal metrics currently established or gaining popularity rely on WoS or Scopus.

Journal impact metrics

With the increasing importance of citations in evaluating the impact of research, it was natural that metrics would be devised to measure the citation performance of different journal titles. A great number of metrics of varying degrees of sophistication have now been proposed, with their merits and demerits debated extensively; however, only relatively few have been calculated for a significant proportion of journals and made available over a number of consecutive years. These are the most useful from a publishing perspective as they can be used both for comparative analysis and publicity.

Impact Factor

The first metric to gain currency was the Impact Factor, created by Eugene Garfield (Garfield, 1955), founder of the Institute for Scientific Information (ISI) that created the WoS. Unsurprisingly, this metric is based on WoS data, meaning that only citations from indexed journals are counted. Impact Factors are generated for all of the titles in the Thomson Reuters Science and Social Science indices but not the Arts & Humanities titles. Impact Factors for a given year are released in June of the following year as part of a subscription to the Journal Citation Reports (JCR).

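The Impact Factor for a given year is calculated as follows:

\[
\mathrm{IF}_{Y} = \frac{\text{citations in year } Y \text{ to all items published in years } Y-1 \text{ and } Y-2}{\text{citable items published in years } Y-1 \text{ and } Y-2}
\]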

This number, which is given to three decimal places, effectively measures average citations per article for recent publications. There is a disparity between what is counted on the two halves of the ratio. When WoS indexes a journal, the articles in that journal are reclassified according to an internal document type system (which may be entirely different from that used by the journal) as either a citable item, such as an Article, Review or Proceedings Paper, or one of a range of non-citable items, such as Letter, Book Review, Meeting Abstract or Editorial. Citations from any article type to any article type are counted on the numerator, but only ‘citable items’ are counted on the denominator (McVeigh and Mann, 2009).

A variation on the Impact Factor was recently incorporated into the JCR; the Five Year Impact Factor uses the same calculation, but instead of looking at citations in a given year to articles published in the previous two years, it studies those published in the previous five, with a denominator looking at the same period. This was introduced partly to serve the needs of some subject categories, particularly in the Social Sciences, where a two-year citation window was insufficient to include the core period of citation in the calculation. The JCR gives additional metrics for each title, such as the Immediacy Index (which measures how rapidly a title is cited) and the Cited Half Life (which measures how long after publication issues of a title are being cited).
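As a minimal sketch of the arithmetic described above, the following Python function counts citations to every document type in the numerator but only citable items in the denominator, and takes the window length as a parameter so that the same code covers the two-year and five-year variants. The record layout (`year`, `type`, `cites`) and the sample figures are invented for illustration and do not correspond to any WoS export format.

```python
# Document types treated as citable items (from the description in the text).
CITABLE_TYPES = {"Article", "Review", "Proceedings Paper"}


def impact_factor(items, census_year, window=2):
    """Average citations per citable item over the preceding `window` years.

    items: list of dicts with hypothetical keys
        "year"  - publication year of the item
        "type"  - document type assigned by the index
        "cites" - citations received by the item during `census_year`
    """
    years = range(census_year - window, census_year)
    in_window = [item for item in items if item["year"] in years]

    # Numerator: citations to every document type, citable or not.
    citations = sum(item["cites"] for item in in_window)
    # Denominator: citable items only.
    citable = sum(1 for item in in_window if item["type"] in CITABLE_TYPES)

    if citable == 0:
        raise ValueError("no citable items in the window")
    return round(citations / citable, 3)


# Invented sample data for a single journal, census year 2010.
sample_items = [
    {"year": 2009, "type": "Article", "cites": 4},
    {"year": 2009, "type": "Review", "cites": 10},
    {"year": 2009, "type": "Editorial", "cites": 2},
    {"year": 2008, "type": "Article", "cites": 3},
    {"year": 2008, "type": "Letter", "cites": 1},
    {"year": 2006, "type": "Article", "cites": 6},
    {"year": 2005, "type": "Article", "cites": 2},
]

two_year = impact_factor(sample_items, 2010)             # 20 citations / 3 citable items
five_year = impact_factor(sample_items, 2010, window=5)  # 28 citations / 5 citable items
```

In the sample, the Editorial and the Letter add three citations to the numerator without adding to the denominator, which is the disparity between the two halves of the ratio.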

The Impact Factor is the most widely recognised of the journal metrics (Hoeffel, 1998). It is the metric given as standard on publishers’ journal home pages and even low scores are often publicised. As a metric, it has several advantages. Measuring average citations-per-article is easy to understand as a concept and relatively easy to explain as a calculation; furthermore, it can be duplicated, predicted and simulated, rather than being an opaque and inscrutable ‘black box’.

However, there are some issues with the metric. Articles only contribute to Impact Factors in the two years following the year of publication; this is too short a window for many subject areas, which take longer to start accruing significant numbers of citations (Jacsó, 2009a). The fact that different subject areas are cited with widely varying frequency and speed also means that the Impact Factors of two similar journals in subtly different fields cannot be fairly compared (Taylor et al., 2008). Veterinary Science provides an excellent example of this; journals within the same subject category, but dealing with different species, have quite different average citation rates.

Furthermore, the metric suffers from ‘inflation’; with more journals being published each year, more articles per issue and more references per article, the pool of citations from which the Impact Factors are calculated is continually growing (Althouse et al., 2009). This means there has been inflation in the metric and that an Impact Factor of 5 now is ‘worth’ less than it was a decade ago.

Some Editors might be tempted to pressure authors to insert citations to their journal into submitted articles before these are accepted for publication, or to publish editorials citing all of the previous year’s articles, in an attempt to artificially inflate citation counts (Brumback, 2009). ISI does exclude journals where the proportion of such self-citations contributing to an Impact Factor is beyond a certain threshold, but it is unclear what this threshold is.

Scimago Journal Rank

The Scimago Journal Rank (SJR) is one alternative to the Impact Factor. SJRs are released biannually, based on half- and then full-year figures for the previous year, by the Scimago Research Group (Elsevier, 2010). This group consists of academics based in Spain who were closely involved with the creation of Scopus but are not formally affiliated with Elsevier.

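A detailed version of the equation is available (Scimago Research Group, 2007), but the calculation outlined on the Scimago Journal Rank homepage for the full-year SJR of a given year is:

\[
\mathrm{SJR}_{Y} = \frac{\text{weighted citations in year } Y \text{ to items published in years } Y-1,\ Y-2 \text{ and } Y-3}{\text{citable items published in years } Y-1,\ Y-2 \text{ and } Y-3}
\]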

There are a few key differences between this metric and the Impact Factor. Citable items are still defined as articles, reviews or proceedings papers, although the classification system employed by Scopus is not the same as that used by ISI. Any journal self-citation above 33 per cent of the citations received during the census period is excluded and the calculation window is extended from looking at the previous two years to the previous three (Anegon, no date).
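The self-citation limit can be illustrated with a short Python sketch. It rests on one reading of the rule, namely that self-citations count only up to 33 per cent of all citations received in the census period and anything above that is discarded; the function name and the figures are hypothetical, and the actual SJR procedure may apply the limit differently.

```python
def counted_citations(total_citations, self_citations, cap_percent=33):
    """Citations that count once the self-citation limit is applied.

    Assumption: self-citations are allowed up to `cap_percent` of all
    citations received; any excess is simply dropped.
    """
    external = total_citations - self_citations
    allowed_self = min(self_citations, total_citations * cap_percent / 100)
    return external + allowed_self


# A journal with 200 citations, half of them self-citations:
# 100 external citations plus at most 66 self-citations are counted.
heavy_self_citer = counted_citations(200, 100)

# A journal with 200 citations, 20 of them self-citations: nothing is dropped.
light_self_citer = counted_citations(200, 20)
```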

More significant still is the introduction of weighted citations, where a citation from a high-impact title is worth more than a citation from a low-impact title. The performance of a journal is modified by the performance of those titles that cite it in an iterative process, with the high-impact journals accumulating more of the total prestige with each iteration, until the variation between iterations falls below a certain threshold. The weighting approach is very similar to the PageRank algorithm used by Google to rank pages in search results, although it is still under debate whether this increases the usefulness of the metric.
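The iterative process can be sketched in Python as a simplified, PageRank-style calculation. This is not the published SJR equation: the damping factor, the even starting values, the stopping rule and the three-journal citation matrix are all assumptions chosen to keep the example small. It shows only the general idea that each journal passes on its current prestige in proportion to where its citations go, and that the values are recomputed until they stop changing.

```python
def iterate_prestige(citations, damping=0.85, tol=1e-10, max_iter=1000):
    """Iteratively share prestige between journals along citation links.

    citations[a][b] is the number of citations from journal a to journal b.
    Every cited journal is assumed to appear as a key of `citations`.
    """
    journals = sorted(citations)
    n = len(journals)
    prestige = {j: 1.0 / n for j in journals}  # start with equal prestige

    for _ in range(max_iter):
        # Each journal keeps a small baseline share (the damping assumption).
        updated = {j: (1.0 - damping) / n for j in journals}
        for citing in journals:
            outgoing = citations[citing]
            total_out = sum(outgoing.values())
            if total_out == 0:
                # A journal that cites nothing spreads its prestige evenly.
                for j in journals:
                    updated[j] += damping * prestige[citing] / n
                continue
            for cited, count in outgoing.items():
                # A citation from a high-prestige journal transfers more.
                updated[cited] += damping * prestige[citing] * count / total_out

        change = sum(abs(updated[j] - prestige[j]) for j in journals)
        prestige = updated
        if change < tol:  # variation between iterations is below the threshold
            break
    return prestige


# Invented citation counts between three journals.
example = {
    "A": {"B": 10, "C": 10},
    "B": {"A": 30},
    "C": {"A": 10, "B": 10},
}
scores = iterate_prestige(example)
```

In this example journal A ends with the largest share because it receives all of B's outgoing citations and half of C's; the scores always sum to one, so one journal's gain is another's loss.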

The metric has strengths and weaknesses. On the one hand, the self-citation limit reduces the chance that the rare unscrupulous journal Editor would seek to manipulate their result. The SJR is entirely free to access and is based on more journals than the Impact Factor, theoretically representing a larger proportion of the scientific community. It also looks at a three-year rather than two-year window, covering more of the core period of article citation. SJRs are also recalculated annually, meaning that they are not affected by citation inflation.

However, some of the innovative solutions applied in the SJR bring new problems. Because the impact of a journal is based on an iterative weighting process, it is impossible to calculate a journal’s SJR without knowing the impact of all the other journals in the set. These weighting data for the 18 000 titles in Scopus are not available, meaning the SJR cannot be checked, predicted or replicated for non-indexed titles. This is not ideal from the point of view of publishers wishing to confirm or improve their titles’ performance; nor is the metric’s annual recalculation, which means that a journal’s historical performance may change over time. Finally, like the Impact Factor, the SJR is very much dependent on subject area, meaning the values of two titles in different fields cannot be meaningfully compared.

EigenFactor

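Proposed by Carl Bergstrom (Bergstrom, 2007), the EigenFactor (EF) first entered currency in 2007 when it and Article Influence, a related measure, were incorporated into the JCR and released annually along with the Impact Factor. Following the same notation as the two previous examples, it would be represented as:

\[
\mathrm{EF}_{Y} = \text{weighted citations in year } Y \text{ to items published in years } Y-1 \text{ to } Y-5, \text{ excluding journal self-citations}
\]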

The most notable difference between this metric and the previous two is that it has no denominator; it does not represent an average number of citations but rather the proportion of all available citations that a given journal attracts. So if two journals were to have the same Impact Factor, the larger of them would have a higher EigenFactor because it would attract a higher proportion of all cites. The EigenFactor is also based on a five-year target period rather than two for the Impact Factor or three for the SJR. It shares some similar properties with the SJR in that it ignores self-citations and weights citations received according to the impact of the citing journal, using a similar methodology.

The EigenFactor is complemented by the Article Influence measure. This is more similar to the Impact Factor, in that it takes account of the size of a journal, but it is still not an average cites-per-article. To calculate the Article Influence, the EigenFactor is normalised by the proportion of all articles in the JCR that appear in the journal under study. So if two journals had the same EigenFactor, the smaller of them would have the higher Article Influence.
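The relationship between the two measures can be sketched in Python. To keep the example short it uses unweighted citation counts, so it illustrates only the share-of-citations and size-normalisation steps described above, not the citation weighting; any scaling constants used in the published scores are also omitted. All journal names and figures are invented.

```python
def citation_share(citations_received):
    """Each journal's proportion of all citations in the set."""
    total = sum(citations_received.values())
    return {j: c / total for j, c in citations_received.items()}


def size_normalised(share, article_counts):
    """Divide each citation share by the journal's share of all articles."""
    total_articles = sum(article_counts.values())
    return {j: share[j] / (article_counts[j] / total_articles) for j in share}


# Invented data: citations received and articles published per journal.
citations_received = {"Large": 1000, "Small": 200, "Niche": 200, "Rest": 600}
article_counts = {"Large": 500, "Small": 100, "Niche": 50, "Rest": 350}

share = citation_share(citations_received)
influence = size_normalised(share, article_counts)
```

Here Large and Small have the same citations per article, yet Large takes a bigger share of all citations; Small and Niche have the same share, but Niche publishes fewer articles and so has the higher size-normalised value.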

The EigenFactor has advantages. Looking at a five-year window, it certainly includes the core citation period for most subject areas in the Science and Social Science citation indices, and again eliminates any possible manipulation through self-citation. Like the Impact Factor, it is also fixed once released, rather than being recalculated in later years. Moreover, because they represent a proportion of citations rather than an average citation count, EigenFactors are not subject to citation inflation, making them comparable across years.

Unlike the SJR, it is theoretically possible to replicate the EigenFactor calculations (West and Bergstrom, 2008), although this requires the CD-ROM version of the JCR and would be computationally intensive. An EigenFactor can be simulated for a non-indexed title too, although clearly citations from such a title could not be included in the citation matrix used to determine the EigenFactors of indexed titles, resulting in an inaccurate simulation. The EigenFactor is therefore far more of a closed box than the Impact Factor. Like the IF and SJR, the EigenFactor and Article Influence are strongly influenced by the subject area of a journal. Additionally, it has been suggested that the weighting of citations does not provide a substantially different result from citation counts without weighting (Davis, 2008a).

Source Normalised Impact per Paper

The Source Normalised Impact per Paper (SNIP) was created by Henk Moed (Moed, 2010), is based on data from Scopus and is released along with the SJR on JournalMetrics.com. It is the first metric to be calculated for the whole journal list that seeks to take account of the varying frequency and speed of citation between different subject areas.

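Like the SJR and EigenFactor, it is not easily reduced to a simple equation, but can be expressed as follows:

\[
\mathrm{SNIP} = \frac{\text{raw impact per paper (average citations per citable item over a three-year window)}}{\text{relative database citation potential}}
\]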

It is therefore a two-stage process. First, an average citations-per-paper is calculated for a journal, looking only at citations to and from citable items (articles, reviews and proceedings papers) appearing in journals only, from a three-year target window. This is then normalised by the relative database citation potential, which measures how likely it is that the journal should be cited, given how many citations are made by articles in the journals that cite it. Effectively, every journal has its own subject area, made up only of the journals from which it receives citations. Its raw average citations-per-article is adjusted for the average citations it would be expected to receive.
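The two stages can be sketched in Python as follows. Two assumptions go beyond the description above and are labelled in the code: the citation potential of a journal is approximated as the mean number of references per citing article, and it is made 'relative' by dividing by the median citation potential across the database. The figures for the two journals are invented.

```python
from statistics import median


def raw_impact_per_paper(citations_to_citable_items, citable_items):
    """Stage one: average citations per citable item in the target window."""
    return citations_to_citable_items / citable_items


def citation_potential(references_per_citing_article):
    """Assumed proxy: mean number of references made by the citing articles."""
    return sum(references_per_citing_article) / len(references_per_citing_article)


def snip(rip, journal_potential, all_potentials):
    """Stage two: adjust the raw average for the journal's citation potential.

    Assumption: the relative potential is the journal's own potential divided
    by the median potential of all journals in the database.
    """
    relative_potential = journal_potential / median(all_potentials)
    return rip / relative_potential


# Invented figures: a mathematics title in a low-citing environment and a
# medical title in a high-citing one.
potentials = [2.0, 4.0, 8.0]  # one value per journal in a three-journal database
maths = snip(raw_impact_per_paper(100, 100), 2.0, potentials)
medicine = snip(raw_impact_per_paper(400, 100), 8.0, potentials)
```

With these figures the medical title has four times the raw citations per paper, but both titles receive the same normalised score once the citation habits of their citing journals are taken into account.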

The major advantage of the SNIP is that it appears to eliminate citation differences between subject areas. Metrics have been theorised that normalise citation performance by subject area, but these are often based on grouping journals by field – this causes difficulties, as noted, for clinical or applied journals, or subject areas with internal variation, such as Veterinary Sciences. By defining subject areas uniquely for each journal, the SNIP avoids this. It also has the strength of ignoring citations to and from non-citable items, making manipulation through game-playing with document type classification far less likely.

It does suffer from drawbacks, however. Clearly, it is complicated to calculate even with all the required data. Getting the required data is no easy task either, because one must know how many times every citing article in the dataset has cited each journal in the dataset. Although it is possible to calculate the Database Citation Potential (DCP) for a single title, one would also need to know the DCP for all titles in a dataset. At present, the SNIP is therefore as much of a ‘black box’ as the SJR or EigenFactor and cannot be checked, predicted or simulated for non-indexed titles. It is still a very new metric and further debate will probably establish the degree of its usefulness, although it has been suggested that the SNIP methodology does not account for differences in citation between fields (Leydesdorff and Opthof, 2010).

Backlash against citation metrics

As the application of journal citation metrics has grown, so too has opposition to the practice. It is held by some that, far from providing a useful guide, the chasing of high impact has begun to damage research itself.

In some cases, objections are made to specific attributes of the metrics, which make them unsuitable for measuring science. For example, journals publishing a large number of non-citable items, or a small number of very long citable items, may have an unrepresentatively high numerator and a low denominator in metrics that look at average citations (Campbell, 2008). Journals in fields with lower average numbers of citations, such as mathematics or social sciences, can be unfairly discriminated against if funding decisions are based on one of the majority of metrics that do not account for variations in subject area.

Some criticism has been levelled at editors and publishers; for example, it has been suggested that focusing on big-name authors to attract citation creates a barrier to up-and-coming researchers (Tsikliras, 2008). Another criticism argues that clinical articles and case studies, which tend to receive fewer citations (Oh and Lim, 2009), might be avoided despite their importance to a journal’s community.

Other issues have been raised with the application of the metrics. Chinese science authors have been instructed to publish articles only in journals appearing in the Science Citation Index, and which therefore receive an Impact Factor (Al-Awqati, 2007). Authors from Imperial College London have had the Impact Factor of the journals in which they publish incorporated into a Publication Score (Colquhoun, 2007). In Brazil, Impact Factors are used to evaluate graduate programmes (de Albuquerque, 2010). There are reports of the Impact Factor being used in a similar fashion in Italy, Japan and Spain (Cameron, 2005).

This is particularly problematic; the impact of a journal cannot be used to generalise about the impact of the authors publishing in it because the distribution of citation strength within a journal is highly skewed. In some samples, the most cited 15 per cent of the articles account for 50 per cent of the citations, and the most cited 50 per cent of the articles account for 90 per cent of the citations (Seglen, 1997); in some cases, the skew is even greater (Campbell, 2008).
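The degree of skew in a journal can be checked with a few lines of Python. The function below is a minimal sketch that ranks articles by citation count and reports the share of all citations attracted by the top fraction; the twenty citation counts are invented and are not taken from any of the studies cited.

```python
def share_of_citations(citation_counts, top_fraction):
    """Proportion of all citations received by the most cited articles."""
    ranked = sorted(citation_counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    # Number of articles in the top fraction (at least one).
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / total


# Invented citation counts for the 20 articles of a hypothetical journal.
counts = [40, 30, 20, 12, 10, 9, 8, 7, 6, 5, 4, 3, 3, 2, 2, 1, 1, 1, 0, 0]

top_15_percent = share_of_citations(counts, 0.15)  # the three most cited articles
top_50_percent = share_of_citations(counts, 0.50)  # the ten most cited articles
mean_citations = sum(counts) / len(counts)
```

In this sample the three most cited articles take more than half of all citations, and most articles sit below the journal mean of 8.2, which is why a journal-level average says little about any individual article or author.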

These concerns have been part of the driving force behind innovations in article-level metrics. Some publishers have begun giving the usage and citation counts for individual articles, allowing authors to report these rather than focusing on journals. Although usage data are not comparable between publishers, owing to variations in what is counted and how, the citation data supplied by CrossRef allow meaningful comparison of citation impact at an article level.

Strategic journal development

From a publisher’s perspective, it is crucial to maximise a journal’s impact, whichever metric is used to measure it. Authors will often consider Impact Factors when deciding where to publish their work and citation performance may well be one of the elements in a librarian’s decision over whether to subscribe to a publication. When working with society-owned titles, retention of contracts may depend heavily on maintaining a strong citation performance. Publishers are therefore as compelled as authors and institutions to maximise the impact of their publications.

Journal Editors may, of course, have other priorities for a title than maximising its Impact Factor, such as service to the subject community. Moreover, as already noted, some of the tactics used to maximise impact can end up unduly influencing the format or focus of research. Publishers should be sensitive to these considerations and seek to increase impact through citation analysis only when their efforts do not debase the knowledge they seek to make available.

In the future, the metrics that weight citations by the impact of the citing journal may become more popular, in which case it will be important both to attract a high volume of citations and to ensure that these citations come from high-impact journals. Currently, however, the Impact Factor is the most widely recognised and prioritised measure, so the key aim of publisher citation analysis is to attract more citations per article for the journal.

Before this is possible, there are some technical constraints to overcome. As previously noted, WoS and Scopus both provide two sets of data, one with bibliographic information and the other with citation counts by year. It may therefore be necessary to join the two sets of data using a database or a program like Microsoft Excel. Address data are also usually combined into one field, so it may be worthwhile separating out the different lines to enable study of institutions, countries and world regions. Some publishers have created tools to automate these processes, which saves a great deal of manual work.
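The joining and address-splitting steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (`id`, `addresses`, one column per citing year), the `;` separator between addresses and the convention that the country is the last comma-separated part are all hypothetical, and real WoS or Scopus exports are laid out differently.

```python
import csv
import io

# Hypothetical export 1: bibliographic information (layout assumed for illustration).
bibliographic_csv = """id,title,year,addresses
A1,Paper one,2010,"Univ Leeds, Leeds, England; Univ Tokyo, Tokyo, Japan"
A2,Paper two,2010,"Univ Sao Paulo, Sao Paulo, Brazil"
"""

# Hypothetical export 2: citation counts by year, keyed on the same id.
citations_csv = """id,2010,2011
A1,2,7
A2,0,3
"""


def load(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_records(bib_rows, cite_rows, key="id"):
    """Attach yearly citation counts to each bibliographic record and split addresses."""
    cites_by_id = {row[key]: row for row in cite_rows}
    joined = []
    for bib in bib_rows:
        cites = cites_by_id.get(bib[key], {})
        by_year = {int(k): int(v) for k, v in cites.items() if k != key}
        record = dict(bib)
        record["citations_by_year"] = by_year
        record["total_citations"] = sum(by_year.values())
        # Assumed layout: addresses separated by ';', institution first, country last.
        addresses = [a.strip() for a in bib["addresses"].split(";") if a.strip()]
        record["institutions"] = [a.split(",")[0].strip() for a in addresses]
        record["countries"] = [a.split(",")[-1].strip() for a in addresses]
        joined.append(record)
    return joined


records = join_records(load(bibliographic_csv), load(citations_csv))
for r in records:
    print(r["id"], r["total_citations"], r["countries"])
```

Records with no match in the citation file are kept with a total of zero, so that uncited articles still count towards any per-article average.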

When studying the performance of a journal and working to improve it, it is often best to download several years of articles and their associated citation data, so that trends over time can be established. It is also advisable to acquire records for similar or competitor titles, so that strengths, weaknesses, opportunities and threats can be identified and results can be viewed in proper context.

There are numerous ways to break down and study article data. While a top article list may allow someone familiar with the subject area to spot patterns in what is best cited, more transparent approaches are often effective. Aggregating the articles by author can allow identification of the top names in a field. Articles or reviews could then be commissioned from these individuals or they could be recruited to a journal Editorial Board, to attract papers from their networks. The same can be done for author affiliation, identifying institutions with strong publishing records in certain areas. It should be noted, however, that there will usually be only a few articles for each author or institution, and care should be taken to avoid reaching strong conclusions based on a small sample size.
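As a rough illustration of aggregating by author while keeping the small-sample caveat in view, the sketch below groups invented article records by author and flags any author with fewer than a chosen number of articles. The record layout and the `min_articles` threshold are assumptions made for the example, not a recommended cut-off.

```python
from collections import defaultdict
from statistics import mean


def cites_by_author(articles, min_articles=3):
    """Mean citations per article for each author, flagging small samples."""
    per_author = defaultdict(list)
    for article in articles:
        for author in article["authors"]:
            per_author[author].append(article["citations"])
    summary = [
        {
            "author": author,
            "articles": len(counts),
            "mean_citations": mean(counts),
            # Averages over very few articles should not drive decisions.
            "small_sample": len(counts) < min_articles,
        }
        for author, counts in per_author.items()
    ]
    return sorted(summary, key=lambda row: row["mean_citations"], reverse=True)


# Invented records; names and counts are purely illustrative.
articles = [
    {"authors": ["Author A", "Author B"], "citations": 30},
    {"authors": ["Author A"], "citations": 12},
    {"authors": ["Author A", "Author C"], "citations": 9},
    {"authors": ["Author C"], "citations": 40},
]

for row in cites_by_author(articles):
    print(row)
```

The same grouping works for institutions or countries by swapping the `authors` field for the relevant list.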

If country and region data can be extracted from the bibliographic records, these can often provide useful guidance. While not all research from a certain country in a field will be of a similar level of citation strength, analysis by country can allow the identification of areas where publication frequency is growing or where citation strength is trending upwards. Extending a journal’s scope or Editorial Board to draw in more papers from such countries can increase the chances that a high-impact article will be submitted to the journal.

More simply, journal strengths and opportunities can be identified through a breakdown of the document types published. It may become apparent that a journal’s articles are well cited but its proceedings papers are not; or that a great many citations are attracted by editorials and relatively few by reviews, in comparison with competitor titles. The decision could then be taken whether to publish fewer items of the weaker document types, or to try to improve them by looking at what makes a certain document type successful for other journals. This approach can also work when comparing special issues with normal issues; because each citable item contributes to the denominator of an Impact Factor, it is important to ensure that special issues, if published, are attracting as many citations per article as normal issues.

One of the more useful but commensurately difficult analyses often requested is a study of which topics within a field are more or less cited. Although some subject areas, such as chemistry, have an existing taxonomy of subjects with the tagging of indexed articles to allow easier analysis, this is rare. It is tempting simply to look up the citations for a single keyword, but the citation performance of a single keyword is unlikely to be representative of a topic; it is more reliable to aggregate keywords describing topics into clusters and then study the average citation performance of these clusters. Without degree-level experience of the journal topic, however, the assistance of a journal’s Editorial Board may be required in establishing topic keyword lists.
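The cluster approach might look like the following sketch, in which each topic is defined by a list of keywords and an article counts towards a topic if it carries any of them. The cluster definitions and article records are invented; in practice the keyword lists would come from the Editorial Board or another subject expert.

```python
from statistics import mean


def cluster_performance(articles, clusters):
    """Article count and mean citations for each topic cluster of keywords."""
    results = {}
    for topic, keywords in clusters.items():
        wanted = {k.lower() for k in keywords}
        # An article belongs to the topic if it shares at least one keyword.
        matched = [
            a["citations"]
            for a in articles
            if wanted & {k.lower() for k in a["keywords"]}
        ]
        results[topic] = {
            "articles": len(matched),
            "mean_citations": mean(matched) if matched else None,
        }
    return results


# Invented article records and hypothetical keyword clusters.
articles = [
    {"keywords": ["catalysis", "zeolite"], "citations": 12},
    {"keywords": ["Catalyst design"], "citations": 8},
    {"keywords": ["polymer"], "citations": 3},
    {"keywords": ["copolymer", "catalysis"], "citations": 5},
]
clusters = {
    "catalysis": ["catalysis", "catalyst design", "zeolite"],
    "polymers": ["polymer", "copolymer"],
}

performance = cluster_performance(articles, clusters)
for topic, stats in performance.items():
    print(topic, stats)
```

An article can fall into more than one cluster here, which seems reasonable for interdisciplinary work but means the cluster counts will not sum to the number of articles.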

There are some simpler tricks to improving Impact Factor performance that do not involve debasing the research presented. Citations count towards metrics like the Impact Factor on a calendar year basis, so that a 2012 article would contribute to the 2013 and 2014 Impact Factors regardless of publication month. However, publishing an issue in January 2012 rather than December 2012 will mean its articles have 36 rather than 24 months before the end of this ‘citation window’. These extra 12 months can increase average citation levels significantly. For this reason, publishers often load the early issues of a year with more papers, or fill them with articles written by well-respected authors who are more likely to be cited. However, the most honest, the most effective and the least damaging way to increase an Impact Factor is to identify and publish better research with a higher potential for citation.

WoS additionally allows users to search for and download the citations themselves, along with the data of the articles that made them. This can allow analysis of which authors, institutions and countries cite a particular journal most frequently, which can feed into marketing and sales efforts. More usefully still, it is possible to use these data to check, predict and simulate Impact Factors ahead of their formal release. In the rare case that citations have been missed from the JCR calculations, it is possible to alert Thomson Reuters to the issue and request correction in the September re-release of the JCR. Impact Factor simulation for non-indexed titles also provides a useful guide when applying for coverage in WoS, as a basic level of citation activity is required for admission.
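A simulation of this kind can be sketched as below, using the standard two-year definition: citations made in the JCR year to items published in the two preceding years, divided by the citable items published in those two years. The record layout, the journal name and all counts are invented, and a real simulation would also need to handle journal title variants and the definition of a citable item.

```python
def simulate_impact_factor(citing_refs, citable_items, journal, jcr_year):
    """Two-year Impact Factor estimate from downloaded citing references.

    citing_refs:   iterable of dicts with 'citing_year', 'cited_journal', 'cited_year'
    citable_items: dict mapping publication year to number of citable items
    """
    window = {jcr_year - 1, jcr_year - 2}
    # Only the cited journal title and cited year are matched, as in the text.
    numerator = sum(
        1
        for ref in citing_refs
        if ref["citing_year"] == jcr_year
        and ref["cited_journal"] == journal
        and ref["cited_year"] in window
    )
    denominator = sum(citable_items.get(year, 0) for year in window)
    if denominator == 0:
        raise ValueError("no citable items in the two preceding years")
    return numerator / denominator


def ref(citing_year, cited_journal, cited_year):
    """Helper to build one invented citing-reference record."""
    return {"citing_year": citing_year, "cited_journal": cited_journal,
            "cited_year": cited_year}


# Invented references: six fall inside the 2013 window, three do not.
refs = (
    [ref(2013, "J Example", 2011)] * 2
    + [ref(2013, "J Example", 2012)] * 4
    + [ref(2013, "J Example", 2010),   # cited item too old
       ref(2012, "J Example", 2011),   # made in the wrong citing year
       ref(2013, "Other J", 2012)]     # different journal
)
items = {2011: 2, 2012: 2}

print(simulate_impact_factor(refs, items, "J Example", 2013))
```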

The ‘early view’ effect

Most major publishers now make copies of their articles available online before they are incorporated into an issue and given page numbers. Certain subjects also have preprint online repositories where the text of an article accepted for publication (but prior to formatting by a publisher) can be hosted; one example of this is the arXiv.org repository for Physics, Mathematics and Computing. Making a version of an article available online ahead of print publication in this way may increase the citations received by those papers before a certain deadline; they are online earlier, more likely to be read and therefore more likely to be cited. The citations are, in essence, brought forward and made sooner (Moed, 2007); this is the early view effect.

Citations from these papers are only counted by WoS when the final paginated version is indexed, but citations to these papers made before that date can theoretically be recorded and added to the article record when it appears in an index. For such a citation to be counted on WoS, it must contain the DOI for the cited work.

However, even if the CrossRef DOI is not given and the citation is consequently not attached to an article’s entry on WoS, the citation may still be counted as part of an Impact Factor calculation. This is because the only pieces of information sought when identifying citations for the Impact Factor are the cited journal title and the year of the cited article. No attempt is made to verify that the target article has been published in the given year. This means that the effect of making an article available online early is, in principle, to extend the window of citation for that article by the duration of the lag between the early view version becoming available and the paginated version being published.

For example, citations to an article published in December 2011 will be counted towards the 2012 and 2013 Impact Factors. That means a citing article must be written and published within 24 months of the target article becoming available in December or the citation will not contribute to an Impact Factor. In many subject areas, this is too short a window and citations are ‘lost’. However, if the target article was made available for early view in December 2010, the ‘window’ during which citations will count is extended to 36 months, theoretically increasing the likelihood that the target article will be cited.
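The arithmetic in this example can be written out as a small function. This is a sketch only: the function name is invented, and it follows the convention used above of counting whole months after the month in which the article first becomes available.

```python
def citation_window_months(available_year, available_month, issue_year):
    """Months between first availability and the end of the last Impact Factor
    year that counts citations to the article (issue_year + 2).

    Convention assumed here: the month of availability itself is not counted.
    """
    if not 1 <= available_month <= 12:
        raise ValueError("available_month must be between 1 and 12")
    if available_year > issue_year:
        raise ValueError("an article cannot be available after its issue year")
    last_counting_year = issue_year + 2
    return (last_counting_year - available_year) * 12 + (12 - available_month)


# Paginated in December 2011 with no early view.
print(citation_window_months(2011, 12, issue_year=2011))
# Same article made available for early view in December 2010.
print(citation_window_months(2010, 12, issue_year=2011))
```

The difference between the two results is simply the lag between early view and paginated publication, which is the extension described above.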

There are arguments against this theory. It relies on taking an article in isolation; if the target article was available 12 months earlier, there may have been other articles available then that citing authors would have chosen in preference to it. Citations exist as part of a network of knowledge and altering one element may change the whole. It is also possible that citations to the article will be given as ‘In Press’ or that authors will continue to cite the version under the early view date (2010). Moreover, it is very difficult to test the early view effect, because it is impossible to know how many citations an article would have received if it had not been available for early view. Work on arXiv has found both an early view effect and a strong quality bias in what was submitted to the repository (Moed, 2007). Only through a controlled trial, with articles randomly made available online early or not, would it be possible to test scientifically whether early view increases citation counts; we can, however, be reasonably confident that it does not lower them.

Open Access (OA) and citations

One hotly debated issue in citation analysis is whether making articles freely available increases the numbers of citations made to them. The theory goes that if an article is available to any reader online, the pool of potential citing authors will be larger and that a proportion of these additional citations will be indexed, benefiting authors and the journals in which they publish. Indeed, early studies (Antelman, 2004; Hajjem et al., 2005) showed a correlation between articles being freely available and higher levels of citation; the authors of some of these studies held that there was a causal link between an article being free and receiving more citations (Harnad and Brody, 2004).

Subsequent examinations of the subject found methodological problems with the early studies, such as imbalances in the size of OA and non-OA samples and the use of inconsistent windows for citation counting (Craig et al., 2007). Moreover, higher levels of citation could have been caused by other factors, such as the aforementioned early view effect or selection bias – that authors would only make their best work freely available online or that better authors tended to self-archive (Moed, 2007). Another study, by Davis (2008b), sought to eliminate selection bias by randomly selecting which articles were made available via OA and then checking citation counts over time to see if there was a difference between the two groups; this study found that there was no OA advantage after one year.

Further research by proponents of OA citation advantage compared articles that were made available online by choice of the author with those that were mandatorily made free, for example as part of requirements by a funding agency, but did not randomly select which articles were made OA (Gargouri et al., 2010). Criticism had also been levelled at the Davis study that one year was too short a time for OA advantage to become apparent (Harnad, 2008).

Davis has recently updated his results to show that there is still no OA advantage three years after publication (Davis, 2010), but this has not been accepted by the proponents of OA, whose results suggest there is an OA advantage for both mandated and self-selected OA articles (Harnad, 2010). The two conclusions seem contradictory, although there is a difference between institution- or funding body-mandated OA and random selection; it could still be argued that articles published as a result of funding body grants are more likely to contain significant and citable conclusions. While neutral parties seem to be satisfied with Davis’ approach, it has not been suggested that his results can be generalised to the entire journal population. It would take a wide range of large samples for one to be confident that any result could be generalised to all journals. However, the approach of randomly assigning OA status gives a view of the data that is the most free from confounding variables so far. While OA citation advantage cannot be considered to have been comprehensively disproved, it has now been called seriously into doubt.

Author metrics

Although publishers are primarily concerned with metrics relating to journals, many key stakeholders are also authors. As journal metrics have grown in importance over recent years, so too have author metrics. The first metric to gain significant currency was the H-Index, proposed by Jorge Hirsch (Hirsch, 2007). An author has an H-Index of 3 if, from all the articles they have published, there are 3 that have at least 3 citations each; they have an H-Index of 5 if there are 5 articles with at least 5 citations each; 10 if there are 10 articles with at least 10 citations each, and so on (Figure 10.1). The H-Index is therefore a measurement of both quantity of publications and their quality.

Figure 10.1 Illustration of an author’s H-Index calculation
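The definition translates directly into a short calculation: rank the articles by citations and find the last rank at which the citation count is still at least as large as the rank. The sketch below uses invented citation counts.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Invented citation counts for three hypothetical authors.
print(h_index([10, 8, 5, 4, 3]))   # steady output
print(h_index([25, 8, 5, 3, 3]))   # one highly cited paper adds little
print(h_index([500, 300]))         # very few, very important papers
```

The last case shows why the measure favours a long list of moderately cited articles over a handful of landmark papers: an author with only two articles cannot exceed an H-Index of 2, however well cited they are.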

H-Indices have not been calculated and released in the same way that Impact Factors are because they may change every time a citation is indexed. This is, however, something of a weakness, as there is no definitive version of an author’s H-Index. The value will vary depending on whether it is calculated from WoS, Scopus or Google Scholar data. Even the automatic calculations on WoS and Scopus may give inaccurate results if based on incomplete unique author article sets. But there are problems with the metric beyond calculation (Bornmann and Daniel, 2007). It is, like the Impact Factor, strongly affected by differences in citation frequency between different subject areas. It is also insensitive to time, in that it can never fall; long-dead researchers may well have higher H-Indices than their living counterparts. Researchers who co-author in large groups tend to receive more citations and so have higher H-Indices, a pattern leading to increasing levels of co-authorship (Ioannidis, 2008). The metric can also be manipulated by publishing a large number of weaker articles that cite each other, and it fails to rank properly those authors who publish very few but critically important articles.

A number of variations on the basic H-Index have been proposed (Bornmann et al., 2008). Alternatives include the AR-Index (Jin et al., 2007), which takes into account the age of the citations, or the Individual H-Index (Batista et al., 2006), which aims to eliminate the effect of co-authorship. None of these deals satisfactorily with the gulfs in citation levels between different subject areas, or with the skew that results from which index is used.

The future of research performance metrics

The next several years will probably see the emergence of some new metrics and the consolidation of some existing ones. At a journal level, the Impact Factor, misused as it sometimes is, will most likely remain the most accepted metric. It took decades for the Impact Factor to reach this position, so we should not expect any of the competitor metrics to replace it imminently, particularly given the complexity of their calculation. It is more likely that we will see the emergence of new approaches to measuring research performance at a journal level. As previously mentioned, the Usage Factor is still in development, and although it may be as prone to manipulation as the Impact Factor, if not more so, it will add a new dimension to the tools we have available. Time will tell whether it is accepted.

Work continues on developing new approaches to research performance evaluation. Grant information could theoretically be used to evaluate research, but the funding a project receives and its eventual usefulness are disconnected. The RAND Corporation recently reported on novel efforts to evaluate research using systematically collected ‘atoms’ of data (Wooding, 2011) but these would seem to be too subject-specific to apply to the entire gamut of journal fields. Given the difficulties encountered in fairly comparing journals from different subject areas, it is even conceivable that field-specific metrics will evolve, but there has been little movement in this direction so far.

The continued development of metrics at other levels is a certainty. The major publishers are investing heavily in usage, citation and social media metrics at an article level, and the sophistication of these is only likely to increase. With the increased adoption of unique researcher identification, the use of author-level metrics should become easier and more reliable; all that remains is to find a strong metric to which author data can be applied (Finch, 2010).

Governments are also making the measurement of research performance a priority at an institutional and subject level. In the US, the National Institutes of Health, National Science Foundation and White House Office of Science and Technology Policy have launched STAR METRICS, the first phase of which has looked at job creation resulting from research funding. The second phase plans to look at publications and citations but also social and environmental outcomes, patent grants and company start-ups based on the application of research innovations. In Australia, the ERA (Excellence in Research for Australia) initiative continues to develop its bibliometric indicators as one element of research performance evaluation.

The path and destination of the road ahead are quite uncertain; if only one thing is for sure, it is that publishers’ bibliometricians will need to know about far more than just the Impact Factor.

References

Al-Awqati, Q. Impact Factors and prestige. Kidney International. 2007; 71:183–185.

de Albuquerque, U. P. The tyranny of the impact factor: why do we still want to be subjugated? Rodriguésia. 2010; 61(3):353–358.

Althouse, B. M., West, J. D., Bergstrom, T., Bergstrom, C. T. Differences in Impact Factor across fields and over time. Journal of the American Society for Information Science and Technology. 2009; 60:27–34.

Anegon, F. d. (no date). Auto-citacao - distribuicao global. Retrieved 29 June 2011, from Portal Eventos BVS: http://www.slideshare.net/fkersten/scopus-journal-metrics-snip-sjr.

Antelman, K. Do Open Access articles have a greater citation impact? College & Research Libraries. 2004; 65:372–382.

Batista, P. D., Campiteli, M. G., Kinouchi, O., Martinez, A. S. Is it possible to compare researchers with different scientific interests? Scientometrics. 2006; 68(1):179–189.

Bergstrom, C. T. Eigenfactor: measuring the value and prestige of scholarly journals. College & Research Libraries News. 2007; 68(5).

Bollen, J., Van de Sompel, H., Hagberg, A., et al. Clickstream data yields high-resolution maps of science. PLoS One. 2009; 4(3):e4803.

Bornmann, L., Daniel, H. D. What do we know about the H-Index? Journal of the American Society for Information Science and Technology. 2007; 58:1381–1385.

Bornmann, L., Mutz, R., Daniel, H. -D. Are there better indices for evaluation purposes than the h index? A comparison of nine different variants of the h index using data from biomedicine. Journal of the American Society for Information Science and Technology. 2008; 59(5):830–837.

Brumback, R. A. Impact Factor Wars: Episode V - The Empire Strikes Back. Journal of Child Neurology. 2009; 24(3):260–262.

Butler, D. Web usage data outline map of knowledge. Retrieved 21 June 2011, from Nature News: http://www.nature.com/news/2009/090309/full/458135a.html, 2009.

Cameron, B. D. Trends in the usage of ISI bibliometric data: uses, abuses, and implications, 2005. [Librarian and Staff, Paper 3].

Campbell, P. Escape from the Impact Factor. Ethics in Science and Environmental Politics. 2008; 8:5–7.

Colquhoun, D. How to get good science. Retrieved 20 June 2011, from DCScience.net: http://www.dcscience.net/goodscience.pdf, 2007.

Craig, I. D., Plume, A. M., McVeigh, M. E., Pringle, J., Amin, M. Do open access articles have greater citation impact? A critical review of the literature. Journal of Informetrics. 2007; 1(3):239–248.

Davis, P. M. Eigenfactor: does the principle of repeated improvement result in better journal impact estimates than raw citation counts? Journal of the American Society for Information Science and Technology. 2008a; 59(13):2186–2188.

Davis, P. M. Open access publishing, article downloads, and citations: randomised controlled trial. British Medical Journal. 2008b.

Davis, P. M. Does Open Access lead to increased readership and citations? A randomised controlled trial of articles published in the APS journals. The Physiologist. 2010; 53:197–201.

Deis, L. F., Goodman, D. Update on Scopus and Web of Science. The Charleston Advisor. 8(3), 2007. [15-15(1)].

Elsevier. Journal Ranking Metrics - SNIP & SJR: A New Perspective in Journal Performance Management. Retrieved 29 June 2011, from SlideShare.net: http://www.slideshare.net/fkersten/scopus-journal-metrics-snip-sjr, 2010.

Elsevier. About Scopus. Retrieved 29 June 2011, from SciVerse: http://www.info.sciverse.com/scopus/about, 2011.

Falagas, M. E., Pitsouni, E. I., Malietzis, G. A., Pappas, G. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. The FASEB Journal. 2008; 22(2):338–342.

Finch, A. T. Can we do better than existing author citation metrics? BioEssays. 2010; 32(9):744–747.

Gargouri, Y., Hajjem, C., Lariviere, V., et al. Self-selected or mandated, open access increases citation impact for higher quality research. PLoS One. 2010; 5(10):e13636.

Garfield, E. Citation indexes to science: a new dimension in documentation through association of ideas. Science. 1955; 122:108–111.

Gross, P. L., Gross, E. M. College libraries and chemical education. Science. 1927; 66:385–389.

Hajjem, C., Harnad, S., Gingras, Y. Ten-year cross-disciplinary comparison of the growth of open access and how it increases research citation impact. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering. 2005; 28:39–47.

Haak, D. L. Perspectives on Metrics-Based Research Evaluation. Retrieved 27 June 2011, from University of Queensland: http://www.library.uq.edu.au/metrics2011/presentations/Haak%2017%20May%20am.pdf, 2011.

Harnad, S. Davis et al’s 1-year Study of Self-Selection Bias: No Self-Archiving Control, No OA Effect, No Conclusion. Retrieved 21 June 2011, from Open Access Archivangelism: http://openaccess.eprints.org/index.php?/archives/441-guid.html, 2008.

Harnad, S. Correlation, Causation, and the Weight of Evidence. Retrieved 21 June 2011, from Open Access Archivangelism: http://openaccess.eprints.org/index.php?/archives/772-Correlation,-Causation,-and-the-Weight-of-Evidence.html, 2010.

Harnad, S., Brody, T. Comparing the impact of Open Access (OA) vs. non-OA articles in the same journals. D-Lib Magazine. 2004; 10.

Harzing, A., van der Wal, R. Google Scholar as a new source for citation analysis. Ethics in Science and Environmental Politics. 2008; 8:61–73.

Hirsch, J. E. An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences. 2007; 102:16569–16572.

Hoeffel, C. Journal impact factors. Allergy. 1998; 53:1225.

Ioannidis, J. P. Measuring co-authorship and networking-adjusted scientific impact. PLoS One. 2008; 3(7):e2778.

Jacsó, P. Five-year impact factor data in the Journal Citation Reports. Online Information Review. 2009; 33(3):603–614.

Jacsó, P. Newswire Analysis: Google Scholar’s Ghost Authors, Lost Authors, and Other Problems. Retrieved 21 June 2011, from LibraryJournal.com: http://www.libraryjournal.com/article/CA6698580.html?nid=2673&rid=528369845&source=title, 2009.

Jin, B., Liang, L. M., Rousseau, R., Egghe, L. The R- and AR- Indices: complementing the H Index. Chinese Science Bulletin. 2007; 52:855–863.

Labbe, C. Ike Antkare, one of the great stars in the scientific firmament. ISSI Newsletter. 2010; 6(2):48–52.

Leydesdorff, L., Opthof, T. Scopus’s Source Normalized Impact per Paper (SNIP) versus a journal impact factor based on fractional counting of citations. Journal of the American Society for Information Science and Technology. 2010; 61(11):2365–2369.

McVeigh, M. E., Mann, S. J. The journal impact factor denominator: defining citable (counted) items. Journal of the American Medical Association. 2009; 302(10):1107–1109.

Moed, H. The effect of ‘open access’ on citation impact: an analysis of ArXiv’s condensed matter section. Journal of the American Society for Information Science and Technology. 2007; 58(13):2047–2054.

Moed, H. F. Measuring contextual citation impact of scientific journals. Journal of Informetrics. 2010; 4(3):265–277.

Oh, H. C., Lim, J. F. Is the journal impact factor a valid indicator of scientific value? Singapore Medical Journal. 2009; 50:749–751.

PLoS One. Article-Level Metrics Information. Retrieved 18 July 2011, from PLoS One: http://www.plosone.org/static/almInfo.action, 2009.

Scimago Research Group. Description of SCImago Journal Rank Indicator. Retrieved 21 June 2011, from SJR - SCImago Journal & Country Rank: http://www.scimagojr.com/SCImagoJournalRank.pdf, 2007.

Seglen, P. O. Why the impact factor of journals should not be used for evaluating research. British Medical Journal. 1997; 314(7079):498–502.

Shepherd, P. T. COUNTER Code of Practice. Retrieved 21 June 2010, from COUNTER (Counting Online Usage of NeTworked Electronic Resources): http://www.projectcounter.org/code_practice_r1.html, 2002.

Simkin, M. V., Roychowdhury, V. P. Read before you cite! Complex Systems. 2003; 14:269–274.

Taylor, M., Perakakis, P., Trachana, V. The siege of science. Ethics in Science and Environmental Politics. 2008; 8:17–40.

Thomson Reuters. Discovery Starts Here. Retrieved 29 June 2011, from Web of Knowledge: http://isiwebofknowledge.com/about/newwok/, 2011.

Thomson Reuters. Quick Facts. Retrieved 29 June 2011, from Web of Knowledge: http://wokinfo.com/about/facts/, 2011.

Thomson Reuters. Quick Reference Guide. Retrieved 29 June 2011, from Web of Science: http://thomsonreuters.com/content/science/pdf/ssr/training/wok5_wos_qrc_en.pdf, 2011.

Todd, P. A., Ladle, R. J. Hidden dangers of a ‘citation culture’. Ethics in Science and Environmental Politics. 2008; 8:13–16.

Tsikliras, A. C. Chasing after the high impact. Ethics in Science and Environmental Politics. 2008; 8:45–47.

Usage Factors. Retrieved 21 June 2011, from UKSG Website: http://www.uksg.org/usagefactors, 2010.

Van Raan, A. F. Self-citation as an impact-reinforcing mechanism in the science system. Journal of the American Society for Information Science and Technology. 2008; 59(10):1631–1643.

Vieira, E. S., Gomes, J. A. A comparison of Scopus and Web of Science for a typical university. Scientometrics. 2009; 81(2):587–600.

West, J., Bergstrom, C. T. Pseudocode for calculating Eigenfactor score and Article influence score using data from Thomson-Reuters Journal Citation Reports. Retrieved 21 June 2011, from Eigenfactor.org: http://www.eigenfactor.org/EF_pseudocode.pdf, 2008.

Wooding, S. Surveying the scene – the RAISS tool for mapping the impact of research portfolios; Perspectives on Metrics-Based Research Evaluation, 2011. [Brisbane, 16 May 2011].