Posts H-Index
Post
Cancel

H-Index

PROBLEM DESCRIPTION

Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper, return the researcher’s h-index.

According to the definition of h-index on Wikipedia: The h-index is defined as the maximum value of h such that the given researcher has published at least h papers that have each been cited at least h times.

leetcode

SOLUTION

Since we are looking for the maximum citation, it would make sense to first sort the list. Also, the h-index should be equal to the number of papers which has been cited, so we would keep to keep a count of how many papers we have checked. Even if the number of citations for current paper is more that it, it does not make a difference because it should match with the number of papers that has been cited.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
class Solution {

    public int hIndex(int[] citations) {
        
        int n = citations.length;

        int maxH = 0;

        //number of papers
        int currentH = 0;

        Arrays.sort(citations); 

        //loop from right so that we start from the largest number        
        for(int i=n-1; i>=0; i--){
            
            //increase the number of papers
            currentH++;

            //if the current number of citations is more, we can be sure that anything ahead of it was also more and this can be the largest value
            if(citations[i] >= currentH){
                maxH = currentH;
            }

        }

        return maxH;

    }

}

OPTIMIZATION (TIME-COMPLEXITY)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
class Solution {
    
    public int hIndex(int[] citations) {

        //example: [1 3 2 3 100]

        int n = citations.length;

        //max possible H-index is n -> max possible answer
        //if any paper has more than n citation, that cannot be the answer (n is upper limit)
        //so we can replace them with n
        for(int i=0; i<n; i++)
            if(citations[i] > n) citations[i] = n;

        //updated: [1 3 2 3 5]

        //the possible number of citations for any papers are now in the range [0,n]
        //let us count the number of papers which has citations 0,1,2,3,4...n
        //the range is [0,n], so need an array of size n+1
        int[] countOfPapersWithCitationX = new int[n+1];

        //go to each paper
        //get its citation, say X
        //increase the count at index X in countOfPapersWithCitationX
        //so now we know how many papers are there with citations ranging from [0,n]
        for(int i=0; i<n; i++){
            int citation = citations[i];
            countOfPapersWithCitationX[citation] = countOfPapersWithCitationX[citation] + 1;
        }

        //papers: [1 3 2 3 5]
        //count of papers with 0 citations: 0
        //count of papers with 1 citations: 1
        //count of papers with 2 citations: 1
        //count of papers with 3 citations: 2
        //count of papers with 4 citations: 0
        //count of papers with 5 citations: 1

        //countOfPapersWithCitationX: [0 1 1 2 0 1] -> n+1 elements

        //[0 1 1 2 0 1] 
        //now the important part (go from back, in cumulative way)
        //count of papers with 5 or more citations: 1
        //count of papers with 4 or more citations: 1
        //count of papers with 3 or more citations: 3
        //count of papers with 2 or more citations: 4
        //count of papers with 1 or more citations: 5
        //count of papers with 0 or more citations: 5
        
        int[] countOfPapersWithCitationXOrMore = new int[n+1];
        countOfPapersWithCitationXOrMore[n] = countOfPapersWithCitationX[n];
        for(int i=n-1; i>=0; i--){
            countOfPapersWithCitationXOrMore[i] = countOfPapersWithCitationX[i] + countOfPapersWithCitationXOrMore[i+1];
        }
        //countOfPapersWithCitationXOrMore: 
        //[0 1 2 3 4 5] -> index
        //[5 5 4 3 1 1] -> count of papers with citatations "i" or more, where i is the index
        
        //the above table is useful because we can derive the following info:
        //there is 1 paper with 5 or more citations
        //there is 1 paper with 4 or more citations
        //there are 3 papers with 3 or more citations
        //there are 4 papers with 2 or more citations
        //there are 5 papers with 1 or more citations
        //there are 5 papers with 0 or more citations
        
        //H-index:
        //maximum value of h such that the given researcher has published at least h papers that have each been cited at least h times.
        //loop from highest possible answer which is citation = n 
        for(int i=n; i>=0; i--){

            //number of papers with at least i citations -> this must be the H-index
            if(countOfPapersWithCitationXOrMore[i] >= i){
                return i;
            }

        }

        return 0;

    }

}
This post is licensed under CC BY 4.0 by the author.