0

I'm trying to extract from a paragraph every string that matches the format of 4chan's quotes, e.g. >>1111111: two > characters followed by 7 digits.

>>1111000
>>1111001
Yes, I agree with those sentiments. 

Both >>1111000 and >>1111001 should be extracted from the text above; I would then reduce each one to just its digits.

4
  • 1
    Do you already have any code written? And how will you handle repeated occurrences of the same quote? You will probably need an indexed collection. Commented Nov 23, 2020 at 9:37
  • 1
    What have you tried so far to solve this on your own? Commented Nov 23, 2020 at 9:44
  • Repeated quotes will be filtered out before storing in the database, since one quotation per post is enough. Commented Nov 23, 2020 at 9:52
  • I am aware of the match() function in JS, but I do not know the most appropriate regex for the pattern. Commented Nov 23, 2020 at 9:53

4 Answers

1

You can use the following, which matches every line that consists of exactly two > characters followed by 7 digits:

const regex = /^>{2}\d{7}$/gm;

const text = `>>1234567
>>6548789
foo barr`;

const matches = text.match(regex);

console.log(matches);
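
Since the question also asks for the digits on their own, here is a minimal sketch of one way to get them, assuming a capture group plus `String.prototype.matchAll` is acceptable. The helper name `extractQuoteIds` is hypothetical and not part of the original answer, and the `(?!\d)` lookahead is an added assumption that runs of more than 7 digits should be rejected.

```javascript
// Hypothetical helper: returns only the 7-digit post numbers.
function extractQuoteIds(text) {
  // Group 1 captures the digits; (?!\d) rejects quotes with 8 or more digits.
  const pattern = />>(\d{7})(?!\d)/g;
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

const sample = `>>1234567
>>6548789
foo barr`;

console.log(extractQuoteIds(sample)); // [ '1234567', '6548789' ]
```

Because the pattern is not anchored with ^ and $, it does not matter whether each quote sits on its own line.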

2 Comments

Only seems to work on strings in a specific format. When text is refactored as '>>1234567 >>6548789 foo barr' (same line, separated by whitespace), matches return null.
If you change the regex to /[>]{2}[\d]{7}/gm it will not explicitly look for lines starting with >> and ending with 7 digits, so it should work if it's all on one line.
1

You can use this regex

/[>]{2}[0-9]{7}/g

1 Comment

How do I make it so that the regex accepts those patterns when in the same line as words but separated by whitespaces?
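
One possible way to handle quotes that share a line with other words is sketched below. It assumes that a quote only counts when it is delimited by whitespace or by the start or end of the string, and it relies on lookbehind support (ES2018). The variable names are illustrative only.

```javascript
// Assumption: a quote must be surrounded by whitespace or string boundaries.
// (?<!\S) means "not preceded by a non-space character";
// (?!\S) means "not followed by a non-space character".
const quotePattern = /(?<!\S)>>\d{7}(?!\S)/g;

const line = ">>1234567 >>6548789 foo barr";

// match() returns null when nothing matches, so fall back to an empty array.
const found = line.match(quotePattern) ?? [];

console.log(found); // [ '>>1234567', '>>6548789' ]
```

With this pattern a quote glued to other text, such as `abc>>1234567`, is not matched. Drop the two lookarounds if such cases should be accepted.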
0

There appear to be some answers already, but since this is a topic I would like to understand better, here are my two cents. In the past this answer has helped me a lot, and online regex sites are also great, such as this one.

 

      <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <meta http-equiv="X-UA-Compatible" content="ie=edge">
        <title>Parse Test</title>
    
    </head>
    <body>
        <div >
            <text id="ToParse">
    
                >>1111000 <br>
                >>1111001 <br>
                Yes, I agree with those sentiments.
    
            </text>
        </div>
    
    <script>
    
     try {
        var body = document.getElementById('ToParse').innerHTML;
        console.log(body);
    
    
    } catch (err) {
        console.log('empty let body,' + " " + err);
    }
    
    
    function parseBody(body) {

        // innerHTML returns ">" escaped as "&gt;", while a plain JS string
        // contains the literal ">>", so accept either form before the 7 digits.
        const regex = /(?:>>|&gt;&gt;)(\d{7})/g;

        let m;

        while ((m = regex.exec(body)) !== null) {
            // m[0] is the full quote, m[1] is the 7-digit post number.
            console.log(`Found quote ${m[0]}, post number ${m[1]}`);
        }

    }

    parseBody(body);
    
    
    
    
    </script>
    
    
    
    
    </body>
    </html>

0

Like @spyshiv said, you can match the string like so:

var string = '>>1111000';
var matches = string.match(/[>]{2}[0-9]{7}/);
console.log(matches);
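
The asker mentions in the comments that repeated quotes should be stored only once per post. A minimal sketch of that step, assuming a `Set` is an acceptable way to drop duplicates (the function name `uniqueQuotedPosts` is hypothetical):

```javascript
// Hypothetical helper: unique 7-digit post numbers quoted in one post.
function uniqueQuotedPosts(postText) {
  // The g flag makes match() return every occurrence; null means no matches.
  const quotes = postText.match(/>>\d{7}/g) ?? [];
  // A Set drops duplicates and keeps first-seen order.
  const unique = [...new Set(quotes)];
  // Strip the leading ">>" so only the digits remain.
  return unique.map((quote) => quote.slice(2));
}

const post = ">>1111000 agreed. >>1111001 also, and again >>1111000";

console.log(uniqueQuotedPosts(post)); // [ '1111000', '1111001' ]
```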
