How to parse link header from github API

Question

The GitHub API sends the pagination data for JSON results in the HTTP Link header:

Link: <https://api.github.com/repos?page=3&per_page=100>; rel="next",
<https://api.github.com/repos?page=50&per_page=100>; rel="last"

Since the GitHub API is not the only API using this method (I think), I wanted to ask whether someone has a useful little snippet that parses the Link header (and converts it to an array, for example) so that I can use it in my JS app.

I googled around but found nothing useful on how to parse pagination from JSON APIs.

Charles Duffy · Accepted Answer · 2015-04-30 21:12:50Z

28

The parse-link-header NPM module exists for this purpose; its source can be found on GitHub under an MIT license (free for commercial use).

Installation is as simple as:

npm install parse-link-header

Usage looks like the following:

var parse = require('parse-link-header');
var parsed = parse('<https://api.github.com/repos?page=3&per_page=100>; rel="next", <https://api.github.com/repos?page=50&per_page=100>; rel="last"')

...after which one has parsed.next, parsed.last, etc:

{ next:
   { page: '3',
     per_page: '100',
     rel: 'next',
     url: 'https://api.github.com/repos?page=3&per_page=100' },
  last:
   { page: '50',
     per_page: '100',
     rel: 'last',
     url: 'https://api.github.com/repos?page=50&per_page=100' } }

edited Apr 30, 2015 at 21:12

Charles Duffy


answered Apr 30, 2015 at 17:36

Cosmin


9 Comments

Cosmin Over a year ago

This is an npm module, guys. Does it make sense to pick it apart?

Cosmin Over a year ago

Yeah, sounds like the rules need to be revised. For the time being, you can let the community know that my incentive for contributing has diminished after this. Thanks for your edit. However, if the links get deleted (the node module gets deleted), the info you edited into my message would be just as useless. I am having a hard time understanding the common sense of making rules to downvote/delete the posts of someone who was trying to help and in fact did nothing wrong (it's a node module, maybe some people don't understand what that is...).

Cosmin Over a year ago

Also keep in mind I'm not paid by Stack Overflow. I'm paid by my employer, and I got back to that work after spending the minimum time possible to help others, after the Stack Overflow question on it proved to provide improper info. Feel free to delete my post next time if you consider that the accurate information I provided is not worthwhile. Also feel free to make me an offer if you consider that next time I should spend more time on my posts, and I'll budget time and make a professional post after hours. Stack Overflow is community driven, but it IS a business.

Cosmin Over a year ago

Meanwhile, the accepted answer provides only links and nobody said anything about it for 3 years. This is simply ridiculous...

MartinJH Over a year ago

This is the best answer for javascript.


Kevin Sawicki · Accepted Answer · 2012-01-07 04:12:44Z

18

There is a PageLinks class in the GitHub Java API that shows how to parse the Link header.

answered Jan 7, 2012 at 4:12

Kevin Sawicki


2 Comments

Aseem Kishore Over a year ago

It's worth noting that while this does the trick for GitHub's usage, this isn't a fully robust parsing of any Link header. String splits aren't enough; e.g. ;= is allowed within URLs, and even , is allowed within values if the values are quoted. Horribly complex. Spec: rfc-editor.org/rfc/rfc5988.txt

Jing Ma Over a year ago

Those links are dead by now.

danriti · Accepted Answer · 2013-08-23 13:30:41Z

9

I found this Gist: Parse Github Links header in JavaScript

I tested it against the GitHub API, and it returns an object like:

var results = {
    last: "https://api.github.com/repositories/123456/issues?access_token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&state=open&since=2013-07-24T02%3A12%3A30.309Z&direction=asc&page=4",
    next: "https://api.github.com/repositories/123456/issues?access_token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&state=open&since=2013-07-24T02%3A12%3A30.309Z&direction=asc&page=2"
};

answered Aug 23, 2013 at 13:30

danriti


Atul Varma · Accepted Answer · 2013-05-26 14:05:51Z

7

I found wombleton/link-headers on GitHub. It appears to be made for the browser rather than packaged as an npm module, but it should not be hard to modify it to work in a server-side environment. It uses pegjs to generate a real RFC 5988 parser rather than relying on string splits, so it should work well for any Link header, not just GitHub's.
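To illustrate why plain string splits fall short, here is a minimal JavaScript sketch of a splitter that ignores commas and semicolons inside <...> and inside quoted values. It is not the pegjs grammar from wombleton/link-headers and not a complete RFC 5988 implementation; the names splitOutside and parseLinkHeader are hypothetical, and the example.com header in the usage note is made up.

```javascript
// Split `text` on `separator`, but only where the separator sits
// outside <...> and outside "..." (hypothetical helper).
function splitOutside(text, separator) {
  const parts = [];
  let current = '';
  let inQuotes = false;
  let inAngle = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes && ch === '\\' && i + 1 < text.length) {
      // A backslash escapes the next character inside a quoted string.
      current += ch + text[i + 1];
      i++;
      continue;
    }
    if (ch === '"' && !inAngle) inQuotes = !inQuotes;
    else if (ch === '<' && !inQuotes) inAngle = true;
    else if (ch === '>' && !inQuotes) inAngle = false;
    if (ch === separator && !inQuotes && !inAngle) {
      parts.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts.map((p) => p.trim()).filter((p) => p.length > 0);
}

// Return an array of { url, rel, ... } objects, one per link.
function parseLinkHeader(header) {
  const links = [];
  for (const linkValue of splitOutside(header || '', ',')) {
    const [target, ...params] = splitOutside(linkValue, ';');
    // Skip anything that does not start with a <URL> target.
    if (!target.startsWith('<') || !target.endsWith('>')) continue;
    const link = { url: target.slice(1, -1).trim() };
    for (const param of params) {
      const eq = param.indexOf('=');
      if (eq === -1) continue;
      const name = param.slice(0, eq).trim().toLowerCase();
      let value = param.slice(eq + 1).trim();
      if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
        // Strip the quotes and undo backslash escapes.
        value = value.slice(1, -1).replace(/\\(.)/g, '$1');
      }
      // Never let a parameter overwrite the target URL.
      if (name !== 'url') link[name] = value;
    }
    links.push(link);
  }
  return links;
}
```

For example, parseLinkHeader('<https://example.com/items?tags=a,b;c=d&page=2>; rel="next"; title="Page 2, next"') keeps the comma and semicolon in the URL and the comma in the title intact, where a split on "," would cut both apart.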

answered May 26, 2013 at 14:05

Atul Varma


1 Comment

Lubo Over a year ago

Spring HATEOAS also has such a parser. See the unit test described here: github.com/spring-projects/spring-hateoas/issues/710 The documentation only describes creating links: docs.spring.io/spring-hateoas/docs/current/reference/html/… To parse, try Link link = Link.valueOf(linkHeaderValue);.

pim · Accepted Answer · 2017-09-21 20:22:26Z

I completely understand this is "technically" a JavaScript thread. But if you're like me and arrived here by Googling "how to parse Link header", I thought I'd share my solution for my environment (C#).

using System;
using System.Linq;
using System.Text.RegularExpressions;

public class LinkHeader
{
    public string FirstLink { get; set; }
    public string PrevLink { get; set; }
    public string NextLink { get; set; }
    public string LastLink { get; set; }

    // The parameter is named linkHeaderStr so that it does not clash with the local result variable.
    public static LinkHeader FromHeader(string linkHeaderStr)
    {
        LinkHeader linkHeader = null;

        if (!string.IsNullOrWhiteSpace(linkHeaderStr))
        {
            // Split on the quote-comma pair that ends each link, so commas inside URLs are left alone.
            string[] linkStrings = linkHeaderStr.Split(new[] { "\"," }, StringSplitOptions.None);

            if (linkStrings.Any())
            {
                linkHeader = new LinkHeader();

                foreach (string linkString in linkStrings)
                {
                    // The split removes the closing quote of rel, so match up to the next quote or the end.
                    var relMatch = Regex.Match(linkString, "(?<=rel=\")[^\"]+", RegexOptions.IgnoreCase);
                    var linkMatch = Regex.Match(linkString, "(?<=<).+?(?=>)", RegexOptions.IgnoreCase);

                    if (relMatch.Success && linkMatch.Success)
                    {
                        string rel = relMatch.Value.ToUpper();
                        string link = linkMatch.Value;

                        switch (rel)
                        {
                            case "FIRST":
                                linkHeader.FirstLink = link;
                                break;
                            case "PREV":
                                linkHeader.PrevLink = link;
                                break;
                            case "NEXT":
                                linkHeader.NextLink = link;
                                break;
                            case "LAST":
                                linkHeader.LastLink = link;
                                break;
                        }
                    }
                }
            }
        }

        return linkHeader;
    }
}

Testing in a console app, using GitHub's example Link header:

void Main()
{
    string link = "<https://api.github.com/user/repos?page=3&per_page=100>; rel=\"next\", <https://api.github.com/user/repos?page=50&per_page=100>; rel=\"last\"";
    LinkHeader linkHeader = LinkHeader.FromHeader(link);
}

Sahil Chhabra · Accepted Answer · 2020-10-19 07:31:18Z

6

For anyone who ended up here searching for a Link header parser in Java, you can use javax.ws.rs.core.Link. See the example below:

import javax.ws.rs.core.Link;

String linkHeaderValue = "<https://api.github.com/repos?page=3&per_page=100>; rel='next'";
Link link = Link.valueOf(linkHeaderValue);

edited Oct 19, 2020 at 7:31

answered Oct 14, 2020 at 12:29

Sahil Chhabra


2 Comments

Yudhistira Arya Over a year ago

I can confirm that this works for those looking for a Java solution without wanting to use any third-party library.

jspetrak Over a year ago

Unfortunately, this utility class is not available separately. If you don't use Jakarta WS otherwise, you may also need jersey-common to make the code work. It also does not support multiple links in the value.

Austin Cory Bart · Accepted Answer · 2019-04-20 18:14:18Z

5

Here is a simple JavaScript function that extracts the useful info from the Link header into a plain object.

var linkParser = (linkHeader) => {
  let re = /<([^\?]+\?[a-z]+=([\d]+))>;[\s]*rel="([a-z]+)"/g;
  let arrRes = [];
  let obj = {};
  while ((arrRes = re.exec(linkHeader)) !== null) {
    obj[arrRes[3]] = {
      url: arrRes[1],
      page: arrRes[2]
    };
  }
  return obj;
}

It outputs a result like this:

{
  "next": {
    "url": "https://api.github.com/user/9919/repos?page=2",
    "page": "2"
  },
  "last": {
    "url": "https://api.github.com/user/9919/repos?page=10",
    "page": "10"
  }
}
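Note that the regex above expects the page parameter to be the first and only query parameter, so a URL such as https://api.github.com/repos?page=3&per_page=100 from the question does not match. A possible variant, sketched below, leaves query parsing to the built-in URL class. It assumes absolute URLs and that rel="..." directly follows the URL; the name parseLinks is hypothetical.

```javascript
// Variant of the regex approach that accepts any query string.
// Assumes every link looks like <absolute URL>; rel="name".
function parseLinks(linkHeader) {
  const re = /<([^>]+)>\s*;\s*rel="([^"]+)"/g;
  const result = {};
  let match;
  while ((match = re.exec(linkHeader)) !== null) {
    const url = match[1];
    const entry = {};
    // URL is built into browsers and Node.js; copy every query parameter.
    for (const [key, value] of new URL(url).searchParams) {
      entry[key] = value;
    }
    // Set url last so that a query parameter named "url" cannot replace it.
    entry.url = url;
    result[match[2]] = entry;
  }
  return result;
}
```

With the header from the question, parseLinks(header).next.page is '3' and parseLinks(header).last.page is '50'. Query values come back as strings.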

edited Apr 20, 2019 at 18:14

Austin Cory Bart


answered Feb 1, 2019 at 6:41

Harman


Community · Accepted Answer · 2021-10-07 06:06:50Z

3

If you can use Python and don't want to implement the full specification, but need something that works for the GitHub API, here it is:

import re

header_link = '<https://api.github.com/repos?page=3&per_page=100>; rel="next", <https://api.github.com/repos?page=50&per_page=100>; rel="last"'
# Capture the URL whose rel is "next"; [^<>]* keeps the match inside one <...> pair.
match = re.search(r'<([^<>]*)>; rel="next"', header_link)
if match:
    print(match.group(1))

edited Oct 7, 2021 at 6:06

CommunityBot


answered Sep 30, 2015 at 7:23

Anton Babenko


2 Comments

Kane Blueriver Over a year ago

You can get it by rsp.links if you are using requests.

Simon Charette Over a year ago

Thanks for the links tip @kxxoling, I didn't know about that!

noelbk · Accepted Answer · 2019-01-08 10:47:07Z

3

Here's a simple bash script with curl and sed to get all pages from a long query

url="https://api.github.com/repos/$GIT_USER/$GIT_REPO/issues"
while [ "$url" ]; do
      echo "$url" >&2
      curl -Ss -n "$url"
      # Header names arrive in lowercase over HTTP/2, so accept both "Link:" and "link:".
      url="$(curl -Ss -I -n "$url" | sed -n -E 's/^[Ll]ink:.*<([^>]*)>; rel="next".*/\1/p')"
done > issues.json
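For a JavaScript app, the same follow-the-next-link loop could be sketched as below. This is only a sketch under assumptions: the names nextUrl and fetchAllPages are hypothetical, every page body is assumed to be a JSON array, and fetchFn is assumed to behave like the standard fetch function (it is a parameter so that it can be stubbed). Unlike the script above, it makes one request per page and reads the Link header from that same response.

```javascript
// Extract the rel="next" URL from a Link header, or null if there is none.
function nextUrl(linkHeader) {
  const match = /<([^>]+)>\s*;\s*rel="next"/.exec(linkHeader || '');
  return match ? match[1] : null;
}

// Follow rel="next" links from startUrl and collect the items of every page.
// fetchFn defaults to the global fetch (browsers, Node.js 18+).
async function fetchAllPages(startUrl, fetchFn = fetch) {
  const items = [];
  let url = startUrl;
  while (url) {
    const response = await fetchFn(url);
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    // Assumes each page body is a JSON array.
    items.push(...(await response.json()));
    // Stop when the response carries no rel="next" link.
    url = nextUrl(response.headers.get('link'));
  }
  return items;
}
```

Authentication headers and rate-limit handling are left out; both would go into the fetchFn call.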

answered Jan 8, 2019 at 10:47

noelbk


Ray Jasson · Accepted Answer · 2022-06-03 13:11:18Z

Instead of using the original parse-link-header package, another option is @web3-storage/parse-link-header, a fork of the original NPM package. The API is the same, but it comes with advantages such as:

TypeScript support
Zero dependencies
No Node.js globals and ESM

Installation:

npm install @web3-storage/parse-link-header

Usage:

import { parseLinkHeader } from '@web3-storage/parse-link-header'

const linkHeader =
  '<https://api.github.com/user/9287/repos?page=3&per_page=100>; rel="next", ' +
  '<https://api.github.com/user/9287/repos?page=1&per_page=100>; rel="prev"; pet="cat", ' +
  '<https://api.github.com/user/9287/repos?page=5&per_page=100>; rel="last"'

const parsed = parseLinkHeader(linkHeader)
console.log(parsed)

Output:

{
   "next":{
      "page":"3",
      "per_page":"100",
      "rel":"next",
      "url":"https://api.github.com/user/9287/repos?page=3&per_page=100"
   },
   "prev":{
      "page":"1",
      "per_page":"100",
      "rel":"prev",
      "pet":"cat",
      "url":"https://api.github.com/user/9287/repos?page=1&per_page=100"
   },
   "last":{
      "page":"5",
      "per_page":"100",
      "rel":"last",
      "url":"https://api.github.com/user/9287/repos?page=5&per_page=100"
   }
}

Anvesh Reddy · Accepted Answer · 2020-08-10 21:13:04Z

1

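Here is simple code to parse the Link header from GitHub in JavaScript:

var parse = require('parse-link-header');
// res is the HTTP response to the first page. parse() returns null when there is
// no Link header, which is the case when the result fits on a single page.
var parsed = parse(res.headers.link);
var no_of_pages = parsed && parsed.last ? parsed.last.page : 1;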

answered Aug 10, 2020 at 21:13

Anvesh Reddy


1 Comment

Lubo Over a year ago

Mostly the same answer as the one below, stackoverflow.com/a/29974304/11152683, and without a link to the given library.

Akshay Som · Accepted Answer · 2021-02-09 04:30:13Z

This is a Java function that serves the purpose: it finds the link for a given parameter key and parameter value. Please note: I wrote this for personal use, and it might not be foolproof for your scenario, so review it and make changes accordingly.

https://github.com/akshaysom/LinkExtract/blob/main/LinkExtract.java

public static String getLinkFromLinkHeaderByParamAndValue(String header, String param, String value) {
    if (header == null || param == null || value == null
            || header.trim().isEmpty() || param.trim().isEmpty() || value.isEmpty()) {
        return null;
    }

    // Naive split: assumes no commas or semicolons inside URLs or quoted values.
    for (String link : header.split(",")) {
        String segmentLink = "";

        for (String segment : link.split(";")) {
            segment = segment.trim();

            if (segment.startsWith("<") && segment.endsWith(">")) {
                // The <...> segment holds the target URL.
                segmentLink = segment.substring(1, segment.length() - 1);
                continue;
            }

            String[] pair = segment.split("=");
            if (pair.length > 1) {
                String currentSegmentParam = pair[0].trim();
                // The value keeps its surrounding quotes, so pass "\"next\"" to match rel="next".
                String currentSegmentValue = pair[1].trim();

                if (param.equals(currentSegmentParam) && value.equals(currentSegmentValue)) {
                    return segmentLink;
                }
            }
        }
    }
    return null;
}

Arunprasad Rajkumar · Accepted Answer · 2020-07-30 13:12:39Z

0

Here is a Python solution to get the contributor count for any GitHub repo.

import requests
from urllib.parse import urlparse, parse_qs

rsp = requests.head('https://api.github.com/repos/fabric8-analytics/fabric8-analytics-server/contributors?per_page=1')
# parse_qs expects only the query string, so extract it from the URL first.
last_url = rsp.links['last']['url']
contributors_count = int(parse_qs(urlparse(last_url).query)['page'][0])

answered Jul 30, 2020 at 13:12

Arunprasad Rajkumar
