Extract part of string in Golang?

Question

I'm learning Golang so I can rewrite some of my shell scripts.

I have URLs that look like this:

https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value

I want to extract the following part:

https://example-1.example.com/a/c482dfad3573acff324c/list.txt

In a shell script I would do something like this:

echo "$myString" | grep -o 'https://.*\.txt'

What is the best way to do the same thing in Go, using only the standard library?

score 14 · Accepted Answer · 2025-03-28 17:46:55Z

There are a few options:

// match regexp as in question
pat := regexp.MustCompile(`https?://.*\.txt`)
s := pat.FindString(myString)

// everything before the ?
s, _, _ := strings.Cut(myString, "?")

// parse and clear query string
u, err := url.Parse(myString)
if err != nil {
    log.Fatal(err) // or return the error to the caller
}
u.RawQuery = ""
s := u.String()

The last option is the most robust because it parses the URL according to its actual structure instead of searching the string for a pattern.
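As a minimal sketch, the three options above can be put side by side in one runnable program. The function names byRegexp, byCut and byParse are made up for this illustration, and strings.Cut assumes Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// pat mirrors the grep pattern from the question (greedy, up to the last ".txt").
var pat = regexp.MustCompile(`https?://.*\.txt`)

// byRegexp returns the first match of pat, or "" if nothing matches.
func byRegexp(s string) string {
	return pat.FindString(s)
}

// byCut returns everything before the first "?" (all of s if there is none).
func byCut(s string) string {
	before, _, _ := strings.Cut(s, "?")
	return before
}

// byParse parses s as a URL and returns it without its query string.
func byParse(s string) (string, error) {
	u, err := url.Parse(s)
	if err != nil {
		return "", err
	}
	u.RawQuery = ""
	return u.String(), nil
}

func main() {
	myString := "https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value"

	fmt.Println(byRegexp(myString))
	fmt.Println(byCut(myString))

	s, err := byParse(myString)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(s)
}
```

For the URL from the question, all three calls print the same string; they only start to differ on less regular input.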

edited Mar 28 at 17:46

answered Jul 27, 2016 at 23:19

user5728991

1 Comment

WW. Over a year ago

I'd recommend using the url.Parse since that should handle any weird edge cases which might be missed by a regex or a split. For example, URLs without a ?
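To illustrate the point about URLs without a ?, here is a small sketch. The helper name stripQuery and the sample URLs are assumptions made for this example, not part of the original answer.

```go
package main

import (
	"fmt"
	"net/url"
)

// stripQuery is a hypothetical helper that returns raw without its query
// string. It behaves the same whether or not raw contains a "?".
func stripQuery(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", raw, err)
	}
	u.RawQuery = ""
	u.ForceQuery = false // also drop a bare trailing "?"
	return u.String(), nil
}

func main() {
	inputs := []string{
		"https://example.com/a/list.txt",             // no query at all
		"https://example.com/a/list.txt?",            // empty query
		"https://example.com/a/list.txt?parm1=value", // ordinary query
	}
	for _, raw := range inputs {
		s, err := stripQuery(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s) // https://example.com/a/list.txt in all three cases
	}
}
```

By contrast, strings.IndexByte(s, '?') returns -1 for the first input, so code that slices s[:i] has to check for that case explicitly.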

score 4 · Accepted Answer · 2016-07-28 05:03:02Z

You may use strings.IndexRune, strings.IndexByte, strings.Split, strings.SplitAfter, strings.FieldsFunc, url.Parse, regexp, or your own function.

First, the simplest way:
you may use i := strings.IndexRune(s, '?') or i := strings.IndexByte(s, '?') and then s[:i], like this (with the output in a comment):

package main

import "fmt"
import "strings"

func main() {
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value`
    i := strings.IndexByte(s, '?')
    if i != -1 {
        fmt.Println(s[:i]) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
    }
}

or you may use url.Parse(s) (I'd use this):

package main

import "fmt"
import "net/url"

func main() {
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value`
    // Name the result u so it does not shadow the url package.
    u, err := url.Parse(s)
    if err == nil {
        u.RawQuery = ""
        fmt.Println(u.String()) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
    }
}

or you may use regexp.MustCompile(".*\\.txt"):

package main

import "fmt"
import "regexp"

var rgx = regexp.MustCompile(`.*\.txt`)

func main() {
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value`

    fmt.Println(rgx.FindString(s)) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
}
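One caveat with this pattern is that .* is greedy, so it runs to the last ".txt" in the string. A hedged sketch of the difference, using a made-up input whose query string also contains ".txt" and two illustrative pattern names, greedy and lazy:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// greedy matches up to the last ".txt" in the input.
	greedy = regexp.MustCompile(`^https?://.*\.txt`)
	// lazy (note the ".*?") stops at the first ".txt".
	lazy = regexp.MustCompile(`^https?://.*?\.txt`)
)

func main() {
	// Hypothetical input: the query string also ends in ".txt".
	s := "https://example.com/a/list.txt?next=other.txt"

	fmt.Println(greedy.FindString(s)) // https://example.com/a/list.txt?next=other.txt
	fmt.Println(lazy.FindString(s))   // https://example.com/a/list.txt
}
```

The non-greedy form is closer to what the question asks for, but it still assumes the wanted part ends at the first ".txt".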

or you may use splits := strings.FieldsFunc(s, func(r rune) bool { return r == '?' }) then splits[0]:

package main

import "fmt"
import "strings"

func main() {
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value`
    splits := strings.FieldsFunc(s, func(r rune) bool { return r == '?' })
    fmt.Println(splits[0]) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
}

or you may use splits := strings.Split(s, "?") then splits[0]:

package main

import "fmt"
import "strings"

func main() {
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value`
    splits := strings.Split(s, "?")
    fmt.Println(splits[0]) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
}

or you may use splits := strings.SplitAfter(s, ".txt") then splits[0]:

package main

import "fmt"
import "strings"

func main() {
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value`
    splits := strings.SplitAfter(s, ".txt")
    fmt.Println(splits[0]) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
}

or you may write your own function (no package needed for the extraction itself):

package main

import "fmt"

// left returns the part of s before the first '?'.
func left(s string) string {
    for i, r := range s {
        if r == '?' {
            return s[:i]
        }
    }
    return s // no '?' found: there is nothing to cut off
}

func main() {
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value`
    fmt.Println(left(s)) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
}

dmitris · Accepted Answer · 2016-07-28 11:48:18Z

2

If you are processing only URLs, you can use Go's net/url package https://golang.org/pkg/net/url/ to parse the URL, clear the Query and Fragment parts (Query would be parm1=value,parm2=value etc.), and extract the remaining portion scheme://host/path, as in the following example (https://play.golang.org/p/Ao0jU22NyA):

package main

import (
    "fmt"
    "net/url"
)

func main() {
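    u, err := url.Parse("https://example-1.example.com/a/b/c/list.txt?parm1=value,parm2=https%3A%2F%2Fexample.com%2Fa%3Fparm1%3Dvalue%2Cparm2%3Dvalue#somefragment")
    if err != nil {
        fmt.Println("parse error:", err)
        return
    }
    // Clear both the query string and the fragment.
    u.RawQuery, u.Fragment = "", ""
    fmt.Printf("%s\n", u)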
}

Output:

https://example-1.example.com/a/b/c/list.txt
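An alternative sketch of the same idea builds the scheme://host/path portion from the parsed fields instead of clearing the parts to drop. The helper name basePart is an assumption made for this example; note that this variant also discards any user info in the URL.

```go
package main

import (
	"fmt"
	"net/url"
)

// basePart is a hypothetical helper that keeps only scheme, host and path.
// Query, fragment and user info are all left out of the result.
func basePart(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	base := url.URL{Scheme: u.Scheme, Host: u.Host, Path: u.Path}
	return base.String(), nil
}

func main() {
	s, err := basePart("https://example-1.example.com/a/b/c/list.txt?parm1=value,parm2=https%3A%2F%2Fexample.com%2Fa%3Fparm1%3Dvalue%2Cparm2%3Dvalue#somefragment")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(s) // https://example-1.example.com/a/b/c/list.txt
}
```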

edited Jul 28, 2016 at 11:48

answered Jul 28, 2016 at 10:15

dmitris

Ganga Ram Daukiya · Accepted Answer · 2019-11-30 08:58:42Z

0

I used the regexp package to extract a string from a string.

In this example I wanted to extract the text between <PERSON> and </PERSON>. I did this with the re expression, and then removed the <PERSON> and </PERSON> tags with the re1 expression.

The for loop handles the case where there are multiple matches, and re1 is used for the replacement.

package main

import (
    "fmt"
    "regexp"
)

func main() {
    // re matches each <PERSON>...</PERSON> element (non-greedy).
    re := regexp.MustCompile(`<PERSON>(.*?)</PERSON>`)
    // re1 matches the opening and closing PERSON tags; compile it once, outside the loop.
    re1 := regexp.MustCompile(`<(.?)PERSON>`)

    input := "java -mx500m -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier classifiers/english.all.3class.distsim.crf.ser.gz -textFile PatrickYe.txt -outputFormat inlineXML 2> /dev/null I complained to <ORGANIZATION>Microsoft</ORGANIZATION> about <PERSON>Bill Gates</PERSON>.They     told me to see the mayor of <PERSON>New York</PERSON>.,"
    matches := re.FindAllString(input, -1)
    fmt.Println(matches)
    for i, m := range matches {
        // Strip the tags to leave only the enclosed text.
        name := re1.ReplaceAllLiteralString(m, "")
        fmt.Println(i, m, " : sdf : ", name)
    }
}
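A possible simplification, sketched here under the assumption that only the text inside the tags is wanted: the capture group in the pattern already holds that text, so FindAllStringSubmatch can return it without a second replace step. The names personRe and persons, and the shortened input, are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// personRe captures the text between <PERSON> and </PERSON> (non-greedy).
var personRe = regexp.MustCompile(`<PERSON>(.*?)</PERSON>`)

// persons is a hypothetical helper that returns the captured text of every
// <PERSON> element in text.
func persons(text string) []string {
	var names []string
	for _, m := range personRe.FindAllStringSubmatch(text, -1) {
		// m[0] is the whole match, m[1] is the first capture group.
		names = append(names, m[1])
	}
	return names
}

func main() {
	text := "I complained to <ORGANIZATION>Microsoft</ORGANIZATION> about <PERSON>Bill Gates</PERSON>. They told me to see the mayor of <PERSON>New York</PERSON>."
	fmt.Printf("%q\n", persons(text)) // ["Bill Gates" "New York"]
}
```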

edited Nov 30, 2019 at 8:58

answered Nov 28, 2019 at 13:27

Ganga Ram Daukiya

1 Comment

MortenSickel Over a year ago

Hi and welcome! I am sorry but it is a bit difficult to understand your question. Could you rephrase it a bit so it is easier to understand what you want to achieve?
