I have an array of arrays, and I want to loop through it and get every word, stripped of "@", punctuation, and hashtags. However, my regular expression is removing some of the words from the array completely, and I am not sure why.

  [ [ '@AIMEEE94x',
      '@Arsenal_Geek',
      '@Charlottegshore',
      'shut',
      'it',
      'ha' ],
     [ '"You',
       'learn',
       'so',
       'much',
       '@MesutOzil1088',
       'and',
       '@Alexis_Sanchez"',
       '-',
       '@alexiwobi:' ] ]


     var regex = /\w+/g;
     var listsb = [];
     for ( i = 0 ; i < tweetsA.length; i++) {
         for(j = 0; j < tweetsA[i].length; j++){

             if (regex.test(tweetsA[i][j])== true){
                 listsb = listsb.concat(tweetsA[i][j])
             }                                                                                                 

         }
     }  
  console.log(listsb);

3 Answers

1

If you want to strip out all the other characters, then checking the word against the regex isn't enough, because test() only tells you whether a match exists. You need to extract the part of the word that actually matches the pattern. This is done with the match function of String in JavaScript:

var str = "@Alexis_Sanchez";
var regex = /\w+/g;
var match = str.match(regex); //match = ['Alexis_Sanchez']
var str2 = "@alexwobi:";
var match2 = str2.match(regex); //match2 = ['alexwobi']

The value of match should then be pushed into the list array, but only if a match exists: match returns null when the string contains no word characters.
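A minimal sketch of that guard, using a few of the words from the question. The `words` and `list` names are made up for illustration; the lone '-' produces no match and is skipped.

```javascript
var regex = /\w+/g;
var words = ['@Alexis_Sanchez"', '-', '@alexiwobi:'];
var list = [];
for (var i = 0; i < words.length; i++) {
  var match = words[i].match(regex); // array of matches, or null if none
  if (match !== null) {
    list.push(match[0]); // first run of word characters
  }
}
console.log(list); // [ 'Alexis_Sanchez', 'alexiwobi' ]
```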

The \w metacharacter is equivalent to [A-Za-z0-9_], so it won't strip underscores for you. Also, if a word has a non-\w character in the middle, you will get two elements in the match array. Both of them need to be joined together before you push the result into your list.
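As a rough sketch of the joining step, assuming you want the fragments glued back into one word: `stripWord` is a hypothetical helper name, not a built-in.

```javascript
// stripWord is a hypothetical helper, not part of any library.
function stripWord(word) {
  var parts = word.match(/\w+/g); // "don't" gives ['don', 't']
  return parts === null ? '' : parts.join('');
}

console.log(stripWord("don't"));         // dont
console.log(stripWord('#hash-tag'));     // hashtag
console.log(stripWord('@Arsenal_Geek')); // Arsenal_Geek (\w keeps the underscore)
```

Words with no word characters at all, such as '-', come back as an empty string here, so you may want to skip those before pushing.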


0

Wouldn't it be easier to just use String.match() for this? Like this:

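var regex = /\w+/g;
var listsb = [];
for (var i = 0; i < tweetsA.length; i++) {
  for (var j = 0; j < tweetsA[i].length; j++) {
    // match() returns an array of every \w+ run, or null if there is none
    var matches = tweetsA[i][j].match(regex);
    if (matches) {
      listsb.push(matches.join('')); // the word with non-\w characters stripped
    }
  }
}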

0

New answer based on the update in your comment. This version loops through all the matches found and adds them to your list.

 var regex = /\w+/g;
 var listsb = [];
 var m;
 for (var i = 0; i < tweetsA.length; i++) {
     for (var j = 0; j < tweetsA[i].length; j++) {
         // with the g flag, exec() returns the next match on each call, then null
         while ((m = regex.exec(tweetsA[i][j])) !== null) {
             listsb.push(m[0]);
         }
     }
 }
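The exec loop relies on the fact that a regex with the g flag remembers where its last match ended in lastIndex and resumes from there, resetting to 0 once no further match is found. The same state is most likely why the test() check in the question drops words: test() also advances lastIndex, so the next word is searched from partway in. A small sketch, with variable names chosen only for illustration:

```javascript
var regex = /\w+/g;
var first = regex.test('shut');   // true, the match ends at index 4
var indexAfter = regex.lastIndex; // 4
var second = regex.test('it');    // false: the search starts at index 4, past the end of 'it'
var indexReset = regex.lastIndex; // 0, a failed match resets the position
console.log(first, indexAfter, second, indexReset); // true 4 false 0
```

Dropping the g flag, or creating the regex inside the loop, avoids this when all you need is a yes/no test.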
