6

In the code sample below, I use regex to extract the subdomain name from a given URL. This sample works, but I don't think I've done it correctly at the point where I compile the regex, mainly where I insert the 'virtualHost' variable. Any suggestions?

package main

import (
    "fmt"
    "regexp"
)

var (
    virtualHost string
    domainRegex *regexp.Regexp
)

func extractSubdomain(host string) string {
    matches := domainRegex.FindStringSubmatch(host)
    if matches != nil && len(matches) > 1 {
        return matches[1]
    }
    return ""
}

func init() {
    // virtualHost = os.Getenv("VIRTUAL_HOST")
    virtualHost = "login.localhost:3000"

    domainRegex = regexp.MustCompile(`^(?:https?://)?([-a-z0-9]+)(?:\.` + virtualHost + `)*$`)
}

func main() {
    // host := req.Host
    host := "http://acme.login.localhost:3000"

    if result := extractSubdomain(host); result != "" {
        fmt.Printf("Subdomain detected: %s\n", result)
        return
    }

    fmt.Println("No subdomain detected")
}
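
On the specific concern about inserting virtualHost: the dots in "login.localhost:3000" are regex metacharacters, so the pattern above also accepts hosts such as "acme.loginXlocalhost:3000". A minimal sketch of one way to avoid that, using regexp.QuoteMeta from the standard library. The helper name buildSubdomainRegex is hypothetical, and this version assumes the virtual host suffix is mandatory rather than optional and repeatable as in the `(...)*` group above.

```go
package main

import (
    "fmt"
    "regexp"
)

// buildSubdomainRegex (hypothetical helper) compiles a pattern that captures
// one label directly in front of virtualHost. regexp.QuoteMeta escapes the
// dots in virtualHost so that they only match a literal ".".
func buildSubdomainRegex(virtualHost string) *regexp.Regexp {
    return regexp.MustCompile(
        `^(?:https?://)?([-a-z0-9]+)\.` + regexp.QuoteMeta(virtualHost) + `$`)
}

func main() {
    re := buildSubdomainRegex("login.localhost:3000")

    hosts := []string{
        "http://acme.login.localhost:3000", // matches, captures "acme"
        "http://acme.loginXlocalhost:3000", // no match: "X" is not a literal dot
        "login.localhost:3000",             // no match: no label before the virtual host
    }
    for _, host := range hosts {
        if m := re.FindStringSubmatch(host); m != nil {
            fmt.Printf("%s: subdomain %q\n", host, m[1])
        } else {
            fmt.Printf("%s: no subdomain\n", host)
        }
    }
}
```

Compiling once in init, as the question already does, remains reasonable; only the concatenation of the unescaped string changes.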
3
  • 5
    Don't use a regex. Use the url package to parse the URL. Then just split on period. Commented Mar 13, 2018 at 13:17
  • 2
    Just split on period.. ah yes.. please read the RFC before you suggest ..... Commented Aug 25, 2020 at 16:01
  • An alternative option without using regex is go-tld but this does not answer the question. Commented Oct 11, 2021 at 2:32

2 Answers

7

The net/url package has a function Parse that parses a URL. The parsed URL has a method Hostname, which returns the host name without the port.

package main

import (
    "fmt"
    "log"
    "net/url"
)

func main() {
    u, err := url.Parse("http://login.localhost:3000")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(u.Hostname())
}

Output:

login.localhost

See https://play.golang.com/p/3R1TPyk8qck
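
To get from the host name to the subdomain the question asks about, one option is to strip the known base host from the result of Hostname. This is only a sketch: the function name subdomainOf and its baseHost parameter are hypothetical, and it assumes the base host (here "login.localhost", without the port) is known up front and that the input URL includes a scheme.

```go
package main

import (
    "fmt"
    "log"
    "net/url"
    "strings"
)

// subdomainOf (hypothetical helper) returns whatever precedes "."+baseHost in
// the URL's host name. It returns "" when the host is baseHost itself or does
// not end in baseHost. baseHost must not include a port.
func subdomainOf(rawURL, baseHost string) (string, error) {
    u, err := url.Parse(rawURL)
    if err != nil {
        return "", err
    }
    host := strings.ToLower(u.Hostname())
    suffix := "." + strings.ToLower(baseHost)
    if !strings.HasSuffix(host, suffix) {
        return "", nil
    }
    return strings.TrimSuffix(host, suffix), nil
}

func main() {
    sub, err := subdomainOf("http://acme.login.localhost:3000", "login.localhost")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("subdomain: %q\n", sub) // subdomain: "acme"
}
```

Comparing against "."+baseHost rather than baseHost alone keeps a host such as "xlogin.localhost" from being treated as a subdomain of "login.localhost".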

Update:

My previous answer only dealt with parsing the host name. Since then I have been using the following library to parse the domain suffix from the host name. Once you have that, it is simple to strip the domain and leave only the subdomain prefix.

https://pkg.go.dev/golang.org/x/net/publicsuffix

I have found it tricky to tell the subdomain apart from the host without help from this package, which identifies common public suffixes. For instance, internally we may have a domain coming from a Kubernetes ingress:

foo.bar.host.kube.domain.com.au

Here the host is "host" and the subdomain is "foo.bar". Even with the publicsuffix library, nothing indicates that "kube" is part of the internal domain components, so you have to add some hinting of your own to match it.
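
As a rough illustration of that extra hinting, the sketch below uses only the standard library and takes the internal suffix as an explicit argument instead of deriving it from publicsuffix. The function name splitInternalHost and the suffix "kube.domain.com.au" are assumptions made for this example, not part of any library.

```go
package main

import (
    "fmt"
    "strings"
)

// splitInternalHost (hypothetical helper) strips a known internal suffix from
// name, then treats the last remaining label as the host and everything
// before it as the subdomain. ok is false when name does not end in the
// suffix or when nothing is left after stripping it.
func splitInternalHost(name, internalSuffix string) (subdomain, host string, ok bool) {
    suffix := "." + internalSuffix
    if !strings.HasSuffix(name, suffix) {
        return "", "", false
    }
    rest := strings.TrimSuffix(name, suffix)
    if rest == "" {
        return "", "", false
    }
    i := strings.LastIndex(rest, ".")
    if i < 0 {
        // Only one label remains, so there is a host but no subdomain.
        return "", rest, true
    }
    return rest[:i], rest[i+1:], true
}

func main() {
    sub, host, ok := splitInternalHost("foo.bar.host.kube.domain.com.au", "kube.domain.com.au")
    fmt.Printf("subdomain=%q host=%q ok=%v\n", sub, host, ok)
    // subdomain="foo.bar" host="host" ok=true
}
```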


1 Comment

I don't know why this is marked as the answer. It gives you the full hostname, not the subdomain the question asked for. It's not the answer.
2

This is what I've used. It assumes the site's own domain is exactly two labels (site.com), so it will not work for domains such as site.co.uk.

import (
    "net/http"
    "strings"
)

// getSubdomain returns the subdomain label of the requested host, or "" if there is none.
func getSubdomain(r *http.Request) string {
    // The host that the user queried. On server requests r.URL.Host is
    // usually empty, so read r.Host instead.
    host := strings.TrimSpace(r.Host)

    // Figure out if a subdomain exists in the host given.
    hostParts := strings.Split(host, ".")

    // Scenarios:
    // A. site.com           -> 2 parts, no subdomain
    // B. www.site.com       -> 3 parts, subdomain is the first part unless it is "www"
    // C. www.hello.site.com -> 4 parts, subdomain is the second part
    switch len(hostParts) {
    case 4: // scenario C
        return hostParts[1]
    case 3: // scenario B with a check
        if hostParts[0] == "www" {
            return ""
        }
        return hostParts[0]
    }

    return "" // scenario A
}
