A research front is a cluster of highly cited papers from a five-year period, referred to as "core papers," in a specialized topic defined by a cluster analysis. Research fronts offer an alternative classification scheme for highly cited papers, since the assignment of papers to a research front is not based on the research fields used in Essential Science Indicators. Identifying research fronts involves grouping together the co-cited papers that are strongly related. Before embarking on this process, a threshold is set on the integer co-citation frequencies to eliminate very low values, and the remaining frequencies are converted to a normalized form using the following formula:
Normalized co-citation of A and B = (integer co-citation frequency of A and B) / sqrt((citation frequency of A) * (citation frequency of B))
In other words, we divide the co-citation frequency by the square root of the product of the citation frequencies of the two papers. A second threshold is set on these normalized values. In the most recent data run for Essential Science Indicators, the integer threshold was set to accept co-citation frequencies of 2 or greater, and the normalized threshold was set at 0.1.
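The normalization and the two thresholds can be illustrated with a minimal Python sketch. The function names and the example counts are hypothetical, and treating the thresholds as inclusive ("at least 0.1") is an assumption; only the formula and the threshold values 2 and 0.1 come from the text.

```python
import math

INTEGER_THRESHOLD = 2       # minimum raw co-citation frequency (from the text)
NORMALIZED_THRESHOLD = 0.1  # minimum normalized co-citation (from the text)


def normalized_cocitation(cocitations, citations_a, citations_b):
    """Co-citation frequency divided by the square root of the
    product of the two papers' citation frequencies."""
    return cocitations / math.sqrt(citations_a * citations_b)


def passes_thresholds(cocitations, citations_a, citations_b):
    """Apply the integer threshold first, then the normalized one.
    Inclusive comparisons are an assumption of this sketch."""
    if cocitations < INTEGER_THRESHOLD:
        return False
    value = normalized_cocitation(cocitations, citations_a, citations_b)
    return value >= NORMALIZED_THRESHOLD


# Hypothetical pair: co-cited 30 times, cited 100 and 225 times.
# 30 / sqrt(100 * 225) = 30 / 150 = 0.2
print(normalized_cocitation(30, 100, 225))
print(passes_thresholds(30, 100, 225))
```

A pair co-cited only twice can still pass if both papers have modest citation counts, while the same raw count between two very heavily cited papers falls below the normalized threshold.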
Starting with a co-cited pair that meets the thresholds, the grouping procedure finds other qualifying pairs that share a paper with it. The gathering process continues until no further pairs of papers can be added to the set. This process is commonly known as single-link clustering. The resulting clusters vary in size from a minimum of two papers to some maximum size.
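One way to sketch this grouping step is as connected components over the qualifying pairs, computed here with a union-find structure. This is an illustrative implementation of single-link clustering at a fixed threshold, not a description of the production procedure; the function name and the paper identifiers are hypothetical.

```python
def single_link_clusters(pairs):
    """Group papers so that any two papers joined by a chain of
    qualifying co-cited pairs end up in the same cluster."""
    parent = {}

    def find(paper):
        # Register unseen papers as their own root, then walk to the root.
        parent.setdefault(paper, paper)
        while parent[paper] != paper:
            parent[paper] = parent[parent[paper]]  # shorten the path
            paper = parent[paper]
        return paper

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    clusters = {}
    for paper in parent:
        clusters.setdefault(find(paper), set()).add(paper)
    # Sort for a stable, readable result.
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


# Hypothetical pairs that already passed both thresholds.
qualifying_pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
print(single_link_clusters(qualifying_pairs))
```

Because P2 appears in two pairs, P1, P2 and P3 fall into one cluster even though P1 and P3 are never paired directly; that chaining behaviour is characteristic of single-link clustering.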
The numeric attributes of fronts can help determine the significance of the areas and their stage of development. The number of core papers in the front and the total citations received give indications of the size of the area. The number of citations per core paper gives an indication of the focus or concentration of effort. The average publication year and distribution of core papers by year give an indication of currency or "hotness," that is, how quickly research is changing and whether there are new developments. An analysis of frequently occurring keywords or phrases in the titles of the papers, as given by the front name, can give an indication of the subject content and thematic focus of the area.
Research front analysis will not identify all research areas or all the papers in an area. However, it can assist in identifying areas where important work is being done and where the scientific community is focusing its attention.
A measure of association between highly cited papers is used to form the clusters. That measure is the number of times pairs of papers have been co-cited, that is, the number of later papers that have cited both of them. Clusters are formed by selecting all papers that can be linked together by a specified co-citation threshold.
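The co-citation count itself can be sketched as follows, assuming each later paper is available as a list of the references it cites. The function name, the input layout, and the identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_cocitations(reference_lists, highly_cited):
    """Count, for each pair of highly cited papers, how many later
    papers cite both of them."""
    counts = Counter()
    for references in reference_lists:
        # Keep only highly cited papers; sort so each pair has one key.
        cited = sorted(set(references) & highly_cited)
        for pair in combinations(cited, 2):
            counts[pair] += 1
    return counts


# Hypothetical citing papers and their reference lists.
reference_lists = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C", "X"],  # X is not highly cited, so it is ignored
]
counts = count_cocitations(reference_lists, {"A", "B", "C"})
print(counts[("A", "B")], counts[("A", "C")], counts[("B", "C")])
```

Each citing paper contributes at most one count per pair, so the result matches the definition above: the number of later papers that cite both members of the pair.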
The clusters are named using a semi-automatic process based on frequently occurring title words and phrases. Statistical characteristics of each cluster are also determined, including the number of highly cited papers, the sum of their citation frequencies, the citations per paper, and the mean year of papers in the front. The number of highly cited papers gives an indication of the size of the foundation literature; the sum of citation frequencies reflects the size of the research front; the citations per paper indicate the degree of concentration; and the mean year of papers indicates the currency, or "hotness," of the cluster.
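As a rough illustration, the four statistics could be computed as below. The representation of a front as a list of (publication year, citation count) tuples, the function name, and the numbers are all assumptions made for this sketch.

```python
from statistics import mean


def front_statistics(core_papers):
    """Summarize a front given (publication_year, citation_count)
    tuples for its highly cited papers."""
    paper_count = len(core_papers)
    total_citations = sum(citations for _, citations in core_papers)
    return {
        "core_papers": paper_count,                          # size of foundation literature
        "total_citations": total_citations,                  # size of the front
        "citations_per_paper": total_citations / paper_count,  # concentration
        "mean_year": mean([year for year, _ in core_papers]),  # currency
    }


# Hypothetical front with three core papers.
stats = front_statistics([(2019, 120), (2020, 80), (2021, 40)])
print(stats)
```

For this hypothetical front the sketch reports 3 core papers, 240 total citations, 80 citations per paper, and a mean year of 2020.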
Research fronts are assigned to the 22 broad fields based on the field of the most frequently occurring journal in the front.
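A minimal sketch of this assignment rule, assuming a lookup table from journal to field is available. The journal names and the mapping are hypothetical, and the text does not say how ties between equally frequent journals are resolved; this sketch simply takes the journal encountered first.

```python
from collections import Counter


def assign_field(journals, journal_field):
    """Return the field of the journal that occurs most often among
    a front's papers. Ties go to the journal seen first (an
    assumption; the tie-breaking rule is not stated in the text)."""
    most_common_journal, _ = Counter(journals).most_common(1)[0]
    return journal_field[most_common_journal]


# Hypothetical journals for the papers in one front.
journal_field = {"Journal X": "Chemistry", "Journal Y": "Physics"}
front_journals = ["Journal X", "Journal Y", "Journal X"]
print(assign_field(front_journals, journal_field))
```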
Only those fronts that meet a minimum size threshold and have high average currency are included in Essential Science Indicators. Currency is measured as the mean publication year of the highly cited papers in the front.