I have a dataframe with authors, the papers they've published, and a citation count for each paper (as well as 71 other columns). I want to find the most cited authors. The problem is that some papers have multiple authors, so the author column holds several names in a single string. I can separate out the authors easily enough, but I can't figure out how to aggregate the citations for each of them. Can anyone help?

Here's the dataframe:

    year   citation  author              paper_title
    2018       33    author1; author2    paper1
    2018       89    author2; author3    paper2
    2017       10    author4             paper3 
    2013       10    author2             paper4
    2014        9    author3             paper5
    2011        1    author5             paper7
  • Not clear about the expected output. library(tidyverse); df1 %>% separate_rows(author) %>% group_by(author) %>% summarise(citation = sum(citation)) Commented Sep 10, 2019 at 17:56
  • Thanks for this. Still not getting the result I want but I think this general approach might be the way to go. The expected output is number of citations for each author. Commented Sep 10, 2019 at 18:23

1 Answer

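library(dplyr)
library(tidyr)

# Two-row sample; multiple authors are separated by "; "
df <- data.frame(year = c(2018, 2017),
                 citation = c(33, 89),
                 author = c('author1; author2', 'author2; author3'),
                 paper_title = c('paper1', 'paper2'), stringsAsFactors = FALSE)

# Split authors into a list column, expand to one row per author, then sum citations
df <- df %>% mutate(author = strsplit(author, "; ")) %>%
  unnest(author) %>% group_by(author) %>% summarise(n_cit = sum(citation))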
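
For comparison, here is a minimal base R sketch of the same split-then-sum idea, applied to all six sample rows from the question and sorted so the most cited author comes first. It is not part of the original answer. The names `authors`, `long` and `totals` are made up for illustration, the 71 other columns are left out, and it assumes every co-author gets full credit for a paper's citations.

```r
# Sample data copied from the question (the 71 other columns are omitted).
df <- data.frame(
  year = c(2018, 2018, 2017, 2013, 2014, 2011),
  citation = c(33, 89, 10, 10, 9, 1),
  author = c("author1; author2", "author2; author3", "author4",
             "author2", "author3", "author5"),
  paper_title = c("paper1", "paper2", "paper3", "paper4", "paper5", "paper7"),
  stringsAsFactors = FALSE
)

# Split each author string on ";" (plus any following whitespace).
authors <- strsplit(df$author, ";\\s*")

# One row per (author, paper) pair: repeat each paper's citation count
# once for every co-author (assumption: full credit to each co-author).
long <- data.frame(
  author = trimws(unlist(authors)),
  citation = rep(df$citation, lengths(authors)),
  stringsAsFactors = FALSE
)

# Sum citations per author, then sort from most to least cited.
totals <- aggregate(citation ~ author, data = long, FUN = sum)
totals <- totals[order(-totals$citation), ]
print(totals)
```

On the sample data this puts author2 at the top with 132 citations (33 + 89 + 10). If shared papers should count fractionally instead, dividing `citation` by `lengths(authors)` before the `rep()` call would be one way to do it.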

3 Comments

Thank you! This gets me exactly what I want!
@NewtoRcode happy to help and would be even happier to see an upvote or a check mark!
I upvoted, but it said it wouldn't register because I have less than 15 points or something.
