Baselines

Baseline

A baseline is the average performance of a global set of publications with the same subject area, document type and year.

For example, a global set might consist of all articles in the field of chemistry published in 2006. Baselines and subject schemes create useful reference points for comparison, and they are the basis of normalization to overcome subject bias.

Baselines are calculated using a whole counting method: every paper in a subject area counts in full towards that area's baseline, regardless of whether the paper is also assigned to other subject areas.

Baseline Calculation Example

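ArticleID | Times Cited | Subject Areas                           | Document Type | Year
----------|-------------|-----------------------------------------|---------------|-----
A         | 0           | Chemistry, Organic                      | Article       | 2010
B         | 12          | Chemistry, Organic & Chemistry, Physical | Article       | 2010
C         | 5           | Chemistry, Physical                     | Article       | 2010
D         | 8           | Chemistry, Organic                      | Review        | 2010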

This table shows sample publications A-D that fall in different subject areas and have different document types. To keep the demonstration simple, all papers share the same year; in practice, baselines are also calculated separately for each year. The citation impact (average citations per paper) baseline for each combination of subject, year and document type is calculated as the arithmetic mean:

$$e_{ftd} = \frac{\sum c_{ftd}}{p_{ftd}}$$

Where: e = the expected citation rate or baseline, c = Times Cited, p = the number of papers, f = the field or subject area, t = year and d = document type. For articles in the field Chemistry, Organic published in 2010 (A & B) it would be:

$$e = \frac{0 + 12}{2} = 6$$

For articles in Chemistry, Physical published in 2010 (B & C) it would be:

$$e = \frac{12 + 5}{2} = 8.5$$

For reviews in Chemistry, Organic published in 2010 (D) it would be:

$$e = \frac{8}{1} = 8$$
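The whole-counting calculation for the sample publications A-D can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name compute_baselines and the tuple layout are hypothetical and are not part of any product API. Paper B is counted in full in both of its subject areas, which is what whole counting means here.

```python
from collections import defaultdict

# Sample publications A-D from the table above:
# (article_id, times_cited, subject_areas, document_type, year)
PUBLICATIONS = [
    ("A", 0, ["Chemistry, Organic"], "Article", 2010),
    ("B", 12, ["Chemistry, Organic", "Chemistry, Physical"], "Article", 2010),
    ("C", 5, ["Chemistry, Physical"], "Article", 2010),
    ("D", 8, ["Chemistry, Organic"], "Review", 2010),
]


def compute_baselines(publications):
    """Return {(subject, year, doc_type): mean citations} using whole counting.

    Hypothetical helper for illustration only.
    """
    citation_sum = defaultdict(int)
    paper_count = defaultdict(int)
    for _article_id, times_cited, subjects, doc_type, year in publications:
        # Whole counting: a paper contributes fully to every subject it is in.
        for subject in subjects:
            key = (subject, year, doc_type)
            citation_sum[key] += times_cited
            paper_count[key] += 1
    return {key: citation_sum[key] / paper_count[key] for key in paper_count}


baselines = compute_baselines(PUBLICATIONS)
for key, value in sorted(baselines.items()):
    print(key, value)
```

Each distinct combination of subject area, year and document type gets its own baseline, so the four sample papers yield three baselines.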

Note: The citation distribution for any set of publications is typically skewed: a small number of papers are highly cited and a large number of papers receive relatively few citations. Because baselines are based on the mean of a set of papers, and the mean is pulled upward by highly cited papers, the mean is usually considerably higher than the median. As a result, more than half of the publications in a set typically fall below the baseline.
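The effect of skew on the mean can be seen with a small numeric sketch. The citation counts below are hypothetical values invented for illustration, not real data; they simply mimic a set with one highly cited paper and many rarely cited ones.

```python
from statistics import mean, median

# Hypothetical citation counts for ten papers (illustrative only):
# nine rarely cited papers and one highly cited paper.
citations = [0, 0, 1, 1, 2, 2, 3, 4, 5, 82]

baseline = mean(citations)    # mean-based baseline, pulled up by the outlier
mid = median(citations)       # median of the same set

# Share of papers whose citation count is below the mean-based baseline.
below = sum(1 for c in citations if c < baseline)
share_below = below / len(citations)

print("mean:", baseline)
print("median:", mid)
print("share below mean:", share_below)
```

In this made-up set the single highly cited paper lifts the mean well above the median, so most papers sit below the baseline even though none of them is unusual for the set.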

The following chart shows the differences between the Citation Impact of various subject categories. Mathematics has a lower Citation Impact than biochemistry & molecular biology. Recent publications exhibit a lower Citation Impact because older papers have had more time to accrue citations and therefore have a higher average citation count. Because Citation Impact varies significantly across disciplines and time periods, it cannot be used effectively to compare entities in different subjects or years. In these cases it is preferable to use some form of normalization to allow for the differences between fields and time periods (see Normalized Citation Impact, % Documents in Top 1%, % Documents in Top 10% and Average Percentile).

Five-Year Baseline

Five-year baselines are used in the five-year trend graph.

Each document is assigned five new baselines, one for each five-year time period it appears in. Each five-year time period is treated like the single-year baseline described above: it is normalized for document type and category/journal, and for documents appearing in multiple categories the average of the category baselines is used.
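As a rough sketch of the bookkeeping involved, the snippet below lists the five consecutive five-year periods that contain a given publication year and averages a document's per-category baselines. Both function names are hypothetical, and the snippet assumes that every window is available in the data; it does not show how the baseline for each window is itself computed. The values 6.0 and 8.5 are the single-year baselines for paper B's two categories from the example above, reused here purely as illustrative inputs.

```python
def five_year_windows(pub_year, span=5):
    """Return the (start, end) windows of length `span` that contain pub_year.

    Hypothetical helper; assumes all such windows exist in the data.
    """
    return [
        (start, start + span - 1)
        for start in range(pub_year - span + 1, pub_year + 1)
    ]


def average_baseline(category_baselines):
    """Average the baselines of every category a document appears in."""
    return sum(category_baselines) / len(category_baselines)


windows = five_year_windows(2010)
for start, end in windows:
    print(f"{start}-{end}")

# Illustrative inputs: paper B's two category baselines from the example above.
combined = average_baseline([6.0, 8.5])
print("averaged baseline:", combined)
```

A paper published in 2010 therefore falls into the windows 2006-2010 through 2010-2014, which matches the statement that each document receives five baselines.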