46

The GitHub API sends the pagination data for its JSON results in the HTTP Link header:

Link: <https://api.github.com/repos?page=3&per_page=100>; rel="next",
<https://api.github.com/repos?page=50&per_page=100>; rel="last"

Since the GitHub API is not the only API using this method (I think), I wanted to ask whether someone has a useful little snippet to parse the Link header (and convert it to an array, for example) so that I can use it in my JS app.

I googled around but found nothing useful on how to parse pagination from JSON APIs.

13 Answers

28

The parse-link-header NPM module exists for this purpose; its source can be found on GitHub under an MIT license (free for commercial use).

Installation is as simple as:

npm install parse-link-header

Usage looks like the following:

var parse = require('parse-link-header');
var parsed = parse('<https://api.github.com/repos?page=3&per_page=100>; rel="next", <https://api.github.com/repos?page=50&per_page=100>; rel="last"')

...after which one has parsed.next, parsed.last, etc:

{ next:
   { page: '3',
     per_page: '100',
     rel: 'next',
     url: 'https://api.github.com/repos?page=3&per_page=100' },
  last:
   { page: '50',
     per_page: '100',
     rel: 'last',
     url: 'https://api.github.com/repos?page=50&per_page=100' } }

9 Comments

This is an npm module, guys. Does it make sense to pick it apart?
Yeah, sounds like the rules need to be revised. For the time being, you can let the community know that my incentive for contributing has diminished after this. Thanks for your edit. However, if the links get deleted (the node module gets deleted), the info you have edited into my message would be just as useless. I am having a hard time understanding the sense of making rules to downvote/delete the posts of someone who was trying to help and in fact did nothing wrong (it's a node module, maybe some people don't understand what that is...).
Also keep in mind I'm not paid by Stack Overflow. I'm paid by my employer, and I got back to work for them after spending the minimum time possible to help others, after the Stack Overflow question on this proved to provide improper info. Feel free to delete my post next time if you consider that the accurate information I provided is not worthwhile. Also feel free to make me an offer if you think I should spend more time on my posts, and I'll budget time and write a professional post after hours. Stack Overflow is community driven, but it IS a business.
Meanwhile, the accepted answer provides only links and nobody said anything about it for 3 years. This is simply ridiculous...
This is the best answer for javascript.
18

There is a PageLinks class in the GitHub Java API that shows how to parse the Link header.

2 Comments

It's worth noting that while this does the trick for GitHub's usage, this isn't a fully robust parsing of any Link header. String splits aren't enough; e.g. ;= is allowed within URLs, and even , is allowed within values if the values are quoted. Horribly complex. Spec: rfc-editor.org/rfc/rfc5988.txt
Those links are dead by now.
9

I found this Gist: "Parse Github Links header in JavaScript".

I tested it against the GitHub API, and it returns an object like:

var results = {
    last: "https://api.github.com/repositories/123456/issues?access_token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&state=open&since=2013-07-24T02%3A12%3A30.309Z&direction=asc&page=4",
    next: "https://api.github.com/repositories/123456/issues?access_token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&state=open&since=2013-07-24T02%3A12%3A30.309Z&direction=asc&page=2"
};
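
The Gist itself is not reproduced here, so the following is only a minimal sketch of a function that returns an object of that shape (rel name mapped to URL). The name parseLinks is made up for illustration, and the sketch assumes that commas only ever separate links and that each link carries a double-quoted rel right after the URL, which holds for the GitHub examples in this thread but not for every valid Link header.

```javascript
// Minimal sketch: maps each rel value to its URL.
// Assumption: commas only separate links, and rel="..." directly follows the URL.
function parseLinks(header) {
  const links = {};
  if (!header) return links; // no header means no pagination links

  for (const part of header.split(',')) {
    // Capture the URL between <...> and the quoted rel value after it.
    const match = part.match(/<([^>]*)>\s*;\s*rel="([^"]*)"/);
    if (match) links[match[2]] = match[1];
  }
  return links;
}

const results = parseLinks(
  '<https://api.github.com/repos?page=3&per_page=100>; rel="next", ' +
  '<https://api.github.com/repos?page=50&per_page=100>; rel="last"'
);
console.log(results.next);
console.log(results.last);
```

Anything that does not look like a link with a rel is skipped silently, so a missing key (for example no "next" on the final page) simply reads as undefined.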

7

I found wombleton/link-headers on GitHub. It appears to be made for the browser, as opposed to being an npm module, but it seems like it wouldn't be hard to modify it to work in a server-side environment. It uses pegjs to generate a real RFC 5988 parser rather than string splits, so it should work well for any Link header, rather than just GitHub's.
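
To illustrate why plain string splits fall short, here is a hand-written sketch (not the pegjs grammar from that project, and not a complete RFC 5988 parser) that only splits on commas and semicolons found outside <...> and "...". The function names splitOutside and parseLinkHeader are hypothetical. It ignores extended parameters such as title* and does not split multiple space-separated rel values.

```javascript
// Hypothetical helper: splits `text` on `sep`, ignoring separators that sit
// inside <...> or "..." (inside quotes, a backslash escapes the next character).
function splitOutside(text, sep) {
  const parts = [];
  let current = '';
  let inAngle = false;
  let inQuote = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuote) {
      if (ch === '\\' && i + 1 < text.length) {
        current += ch + text[++i]; // keep the escape pair together
        continue;
      }
      if (ch === '"') inQuote = false;
    } else if (inAngle) {
      if (ch === '>') inAngle = false;
    } else if (ch === '"') {
      inQuote = true;
    } else if (ch === '<') {
      inAngle = true;
    } else if (ch === sep) {
      parts.push(current);
      current = '';
      continue;
    }
    current += ch;
  }
  parts.push(current);
  return parts;
}

// Returns an array of { url, rel, ... } objects, one per link in the header.
function parseLinkHeader(header) {
  const links = [];
  for (const rawLink of splitOutside(header || '', ',')) {
    const [target, ...params] = splitOutside(rawLink, ';').map((s) => s.trim());
    if (!target.startsWith('<') || !target.endsWith('>')) continue;
    const link = { url: target.slice(1, -1) };
    for (const param of params) {
      const eq = param.indexOf('=');
      if (eq === -1) continue; // skip parameters without a value
      const name = param.slice(0, eq).trim().toLowerCase();
      let value = param.slice(eq + 1).trim();
      if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
        value = value.slice(1, -1).replace(/\\(.)/g, '$1'); // unquote and unescape
      }
      link[name] = value;
    }
    links.push(link);
  }
  return links;
}

console.log(parseLinkHeader(
  '<https://example.com/a,b;c=d?x=1>; rel="next"; title="one, two", ' +
  '<https://example.com/z>; rel="last"'
));
```

The example header contains a comma and a semicolon inside the URL and a comma inside a quoted title, which are exactly the cases that break a naive split(',').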

1 Comment

Also spring hateoas does have such a parser. See unit test described here: github.com/spring-projects/spring-hateoas/issues/710 Documentation does describe only creating links: docs.spring.io/spring-hateoas/docs/current/reference/html/… To parse, just try to use Link link = Link.valueOf(linkHeaderValue);.
6

I completely understand this is "technically" a JavaScript thread. But if, like me, you arrived here by Googling "how to parse Link header", I thought I'd share my solution for my environment (C#).

using System.Linq;
using System.Text.RegularExpressions;

public class LinkHeader
{
    public string FirstLink { get; set; }
    public string PrevLink { get; set; }
    public string NextLink { get; set; }
    public string LastLink { get; set;}

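    public static LinkHeader FromHeader(string header)
    {
        // The parameter is named "header" so it does not clash with the local result variable.
        LinkHeader linkHeader = null;

        if (!string.IsNullOrWhiteSpace(header))
        {
            // Links are separated by a closing quote followed by a comma.
            string[] linkStrings = header.Split("\",");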

            if (linkStrings != null && linkStrings.Any())
            {
                linkHeader = new LinkHeader();

                foreach (string linkString in linkStrings)
                {
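                    // The split above removes the closing quote from every link but the last,
                    // so match up to the next quote or the end of the string.
                    var relMatch = Regex.Match(linkString, "(?<=rel=\")[^\"]+", RegexOptions.IgnoreCase);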
                    var linkMatch = Regex.Match(linkString, "(?<=<).+?(?=>)", RegexOptions.IgnoreCase);

                    if (relMatch.Success && linkMatch.Success)
                    {
                        string rel = relMatch.Value.ToUpper();
                        string link = linkMatch.Value;

                        switch (rel)
                        {
                            case "FIRST":
                                linkHeader.FirstLink = link;
                                break;
                            case "PREV":
                                linkHeader.PrevLink = link;
                                break;
                            case "NEXT":
                                linkHeader.NextLink = link;
                                break;
                            case "LAST":
                                linkHeader.LastLink = link;
                                break;
                        }
                    }
                }
            }
        }

        return linkHeader;
    }
}

Testing in a console app, using GitHub's example Link header:

void Main()
{
    string link = "<https://api.github.com/user/repos?page=3&per_page=100>; rel=\"next\",<https://api.github.com/user/repos?page=50&per_page=100>; rel=\"last\"";
    LinkHeader linkHeader = LinkHeader.FromHeader(link);
}

6

For anyone who ended up here searching for a Link header parser in Java: you can use javax.ws.rs.core.Link. See the example below:

import javax.ws.rs.core.Link;

String linkHeaderValue = "<https://api.github.com/repos?page=3&per_page=100>; rel=\"next\"";
Link link = Link.valueOf(linkHeaderValue);

2 Comments

I can confirm that this works for those looking for a Java solution without wanting to use any 3rd-party library.
Unfortunately, this utility class is not available separately. If you don't use Jakarta WS otherwise, you may also need jersey-common to make the code work. It also does not support multiple links in the value.
5

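Here is a simple JavaScript function that extracts the useful info from the Link header into a nice object notation. Note that the regex expects each URL to carry a single numeric query parameter, such as ?page=2.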

var linkParser = (linkHeader) => {
  let re = /<([^\?]+\?[a-z]+=([\d]+))>;[\s]*rel="([a-z]+)"/g;
  let arrRes = [];
  let obj = {};
  while ((arrRes = re.exec(linkHeader)) !== null) {
    obj[arrRes[3]] = {
      url: arrRes[1],
      page: arrRes[2]
    };
  }
  return obj;
}

It outputs a result like this:

{
  "next": {
    "url": "https://api.github.com/user/9919/repos?page=2",
    "page": "2"
  },
  "last": {
    "url": "https://api.github.com/user/9919/repos?page=10",
    "page": "10"
  }
}
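
For URLs with several query parameters (such as ?page=3&per_page=100 from the question), a variation on the same idea can hand the query string to the built-in URL class instead of encoding it in the regex. This is only a sketch: parseLinkPages is a made-up name, and it assumes every link target is an absolute URL, since new URL() throws on relative ones.

```javascript
// Sketch: collects url plus every query parameter for each rel="..." link.
// Assumption: link targets are absolute URLs and rel directly follows the URL.
function parseLinkPages(linkHeader) {
  const result = {};
  const re = /<([^>]+)>\s*;\s*rel="([^"]+)"/g;
  for (const [, url, rel] of (linkHeader || '').matchAll(re)) {
    // searchParams yields [name, value] pairs, e.g. ['page', '3'].
    const query = Object.fromEntries(new URL(url).searchParams);
    result[rel] = { ...query, url };
  }
  return result;
}

console.log(parseLinkPages(
  '<https://api.github.com/repos?page=3&per_page=100>; rel="next", ' +
  '<https://api.github.com/repos?page=50&per_page=100>; rel="last"'
));
```

All values stay strings, as in the outputs shown elsewhere in this thread, so convert with Number() or parseInt() before doing arithmetic on the page numbers.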

3

If you can use Python and don't want to implement the full specification, but need something that works for the GitHub API, then here we go:

import re
header_link = '<https://api.github.com/repos?page=3&per_page=100>; rel="next", <https://api.github.com/repos?page=50&per_page=100>; rel="last"'
if re.search(r'; rel="next"', header_link):
    print(re.sub(r'.*<(.*)>; rel="next".*', r'\1', header_link))

2 Comments

You can get it by rsp.links if you are using requests.
Thanks for the links tip @kxxoling, I didn't know about that!
3

Here's a simple bash script that uses curl and sed to fetch all pages of a long query:

url="https://api.github.com/repos/$GIT_USER/$GIT_REPO/issues"
while [ "$url" ]; do
      echo "$url" >&2
      curl -Ss -n "$url"
      url="$(curl -Ss -I -n "$url" | sed -n -E 's/^[Ll]ink:.*<([^>]*)>; rel="next".*/\1/p')"
done > issues.json
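
Since the question is about a JS app, the same follow-the-next-link loop can be sketched in JavaScript. The names nextUrl and fetchAllPages are made up, and fetchPage is an assumed caller-supplied async function that resolves to { items, link } for a given URL, where link is the raw Link header value or null. Injecting it keeps the loop independent of any particular HTTP client.

```javascript
// Extracts the rel="next" URL from a Link header, or null when there is none.
function nextUrl(linkHeader) {
  const match = /<([^>]*)>\s*;\s*rel="next"/.exec(linkHeader || '');
  return match ? match[1] : null;
}

// Follows rel="next" links until the server stops sending one.
// Assumption: fetchPage(url) resolves to { items: [...], link: '...' | null }.
async function fetchAllPages(firstUrl, fetchPage) {
  const all = [];
  let url = firstUrl;
  while (url) {
    const { items, link } = await fetchPage(url);
    all.push(...items);
    url = nextUrl(link);
  }
  return all;
}
```

With the standard fetch API, fetchPage could be something like async (url) => { const res = await fetch(url); return { items: await res.json(), link: res.headers.get('link') }; }. Unlike the bash version, this requests each page only once.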

3

Instead of using the original parse-link-header package, another option is @web3-storage/parse-link-header, a fork of the original NPM package. The API is the same, but it comes with advantages such as:

  • TypeScript support
  • Zero dependencies
  • No Node.js globals, and ESM support

Installation:

npm install @web3-storage/parse-link-header

Usage:

import { parseLinkHeader } from '@web3-storage/parse-link-header'

const linkHeader =
  '<https://api.github.com/user/9287/repos?page=3&per_page=100>; rel="next", ' +
  '<https://api.github.com/user/9287/repos?page=1&per_page=100>; rel="prev"; pet="cat", ' +
  '<https://api.github.com/user/9287/repos?page=5&per_page=100>; rel="last"'

const parsed = parseLinkHeader(linkHeader)
console.log(parsed)

Output:

{
   "next":{
      "page":"3",
      "per_page":"100",
      "rel":"next",
      "url":"https://api.github.com/user/9287/repos?page=3&per_page=100"
   },
   "prev":{
      "page":"1",
      "per_page":"100",
      "rel":"prev",
      "pet":"cat",
      "url":"https://api.github.com/user/9287/repos?page=1&per_page=100"
   },
   "last":{
      "page":"5",
      "per_page":"100",
      "rel":"last",
      "url":"https://api.github.com/user/9287/repos?page=5&per_page=100"
   }
}

1

Here is some simple code to parse the Link header from GitHub in JavaScript:

var parse = require('parse-link-header');
var parsed = parse(res.headers.link);
var no_of_pages = parsed.last.page;

1 Comment

Mostly the same answer as the one below (stackoverflow.com/a/29974304/11152683), and without a link to the library.
1

This is a Java function that serves the purpose: it finds the link for a given parameter key and parameter value. Please note: this is something I made for personal use and it might not be foolproof for your scenario, so review it and make changes accordingly.

https://github.com/akshaysom/LinkExtract/blob/main/LinkExtract.java

public static String getLinkFromLinkHeaderByParamAndValue(String header, String param, String value) {
    if (header != null && param != null && value != null && !"".equals(header.trim()) && !"".equals(param.trim())
            && !"".equals(value)) {

        // Each comma-separated entry is one link: <url>; param=value; ...
        String[] links = header.split(",");

        LINKS_LOOP: for (String link : links) {

            String[] segments = link.split(";");

            if (segments != null) {

                String segmentLink = "";

                SEGMENT_LOOP: for (String segment : segments) {
                    segment = segment.trim();
                    if (segment.startsWith("<") && segment.endsWith(">")) {

                        // Remember the URL and move on to its parameters.
                        segmentLink = segment.substring(1, segment.length() - 1);
                        continue SEGMENT_LOOP;

                    } else {
                        if (segment.split("=").length > 1) {

                            String currentSegmentParam = segment.split("=")[0].trim();
                            // Quoted values keep their quotes, so pass e.g. "\"next\"" as value.
                            String currentSegmentValue = segment.split("=")[1].trim();

                            if (param.equals(currentSegmentParam) && value.equals(currentSegmentValue)) {
                                return segmentLink;
                            }
                        }
                    }
                }
            }
        }
    }
    return null;
}

0

Here is a Python solution to get the contributor count for any GitHub repo.

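import requests
from urllib.parse import parse_qs, urlparse

rsp = requests.head('https://api.github.com/repos/fabric8-analytics/fabric8-analytics-server/contributors?per_page=1')
# Parse only the query string of the rel="last" URL; with per_page=1 its page number is the contributor count.
contributors_count = parse_qs(urlparse(rsp.links['last']['url']).query)['page'][0]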
