I have a text file that contains many emails. Each email begins with three lines of header information: From:, Subject:, and Date:. I know that the header lines follow every Ctrl-L character, hence the comparison of c with 12.

Currently my from array gets one line of text, either something like:

From: Rollen Awen <[email protected]>

or

From: [email protected]

So right now I am trying to use delimiters to keep only the email address, but I'm not sure how to go about it. I have to be able to handle either situation, whether the address is enclosed within < > or sits between two whitespace characters.

For example, I want to change the

From: Rollen Awen <[email protected]> string into only [email protected]

Or changing

From: [email protected] into [email protected]

...
FILE *emaildata = fopen(argv[1], "r");

while ((c = fgetc(emaildata)) != EOF) {
    if (c == 12) { /* 12 is Ctrl-L (form feed); the header lines follow it */
        numberemails++;
        fgets(nothing, sizeof(nothing), emaildata);
        fgets(from, sizeof(from), emaildata);
        fgets(subject, sizeof(subject), emaildata);
        fgets(date, sizeof(date), emaildata);
        //printf("%s", from);
    }
...

4 Comments

  • One solution: search the line for the '@' symbol. Once you find it, scan backward for as long as you keep getting valid email address characters, then scan forward from the '@' in the same way. But I doubt this is an efficient solution! Commented Oct 16, 2014 at 2:46
  • What do you mean by "enclosed between 2 white spaces"? Commented Oct 16, 2014 at 2:54
  • I honestly think it's a good one. I am not sure how complicated a proper email address can be, but if it's limited only to some character sets it is easy and efficient. I would also consider a regexp library if more logic is needed. Commented Oct 16, 2014 at 2:55
  • It might also be important if the input is always valid. E.g. There is at most one email address per line and it's a valid email address. Commented Oct 16, 2014 at 3:10
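
The "find the '@' and span outward" idea from the comments could be sketched roughly as below. This is only an illustration under stated assumptions: the function name extract_by_at and the set of characters treated as valid (letters, digits, and ._%+-) are made up for this sketch, and real addresses can contain more than that. It searches from the last '@' on the line, so a stray '@' in a display name is less likely to be picked up.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed character set for an address; real addresses allow more. */
static int is_addr_char(char ch)
{
    return isalnum((unsigned char)ch) ||
           (ch != '\0' && strchr("._%+-", ch) != NULL);
}

/* Hypothetical helper: copies the address around the last '@' in line
 * into out. Returns 1 on success, 0 if there is no '@', if either side
 * of the '@' is empty, or if out is too small. */
int extract_by_at(const char *line, char *out, size_t outsize)
{
    assert(line != NULL && out != NULL);

    const char *at = strrchr(line, '@');
    if (at == NULL) {
        return 0;
    }

    /* walk backward over the local part */
    const char *start = at;
    while (start > line && is_addr_char(start[-1])) {
        --start;
    }

    /* walk forward over the domain part */
    const char *end = at + 1;
    while (is_addr_char(*end)) {
        ++end;
    }

    size_t len = (size_t)(end - start);
    if (start == at || end == at + 1 || len >= outsize) {
        return 0;
    }

    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

Because the scan stops at the first character outside the assumed set, the "From:" prefix, the angle brackets, and the trailing newline left by fgets() all fall away without special handling.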

1 Answer

This requires memrchr(), which glibc provides if you #define _GNU_SOURCE before including <string.h>. If you don't have that function, I'm sure you can find a similar one or write it yourself.

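#define _GNU_SOURCE // must come before the includes so that <string.h> declares memrchr()
#include <ctype.h>
#include <string.h>

// input is either like "John Smith <[email protected]>" or "[email protected]"
// (any "From:" prefix must already have been removed by the caller)
// leading and trailing whitespace is skipped
// email is an out-param, must have room for at least strlen(input) + 1 bytes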
void parse_email_address(const char* input, char* email)
{
  // skip leading whitespace
  while (isspace((unsigned char)*input)) { // cast avoids undefined behavior for negative char values
    ++input;
  }

  size_t len = strlen(input);

  // ignore trailing whitespace
  while (len > 0 && isspace((unsigned char)input[len - 1])) {
    --len;
  }

  // parse friendly addresses like "John Smith <[email protected]>"
  // '>' must come last, and '<' must come before it
  if (len > 0 && input[len - 1] == '>') {
    const char* left = memrchr(input, '<', len);
    if (left) {
      len -= left - input + 2; // 2 for '<' and '>'
      input = left + 1;
    }
  }

  memcpy(email, input, len);
  email[len] = '\0';
}
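
If memrchr() is not available on your platform, a stand-in along these lines should work. The name my_memrchr is made up for this sketch, and unlike glibc's version it returns a const pointer, which is enough for the way the function above uses the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for memrchr(): returns a pointer to the last
 * occurrence of byte c within the first n bytes of s, or NULL if the
 * byte does not occur there. */
const void *my_memrchr(const void *s, int c, size_t n)
{
    assert(s != NULL);

    const unsigned char *bytes = s;
    while (n > 0) {
        --n;
        if (bytes[n] == (unsigned char)c) {
            return bytes + n;
        }
    }
    return NULL;
}
```

With this in place, the call in parse_email_address() would become my_memrchr(input, '<', len) and _GNU_SOURCE would no longer be needed.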

5 Comments

Ummm, but it doesn't handle address-only lines like From: [email protected], does it? I might have missed something. If I'm right, I think "find @ and span from there" is more robust.
@luk32: it assumes you've chopped off From: already. Of course that part is easy, but I should have been clearer about it. I'll add a comment.
Oh, I thought input is the line. But won't it trip when there is no < and >? EDIT: OK, I think I see it. Sorry for getting confused. It's a nice counterpart to the "find @ and span from it" idea, which is also nice IMO.
@luk32: My concern about the @ finding method was, what if someone used Bobby T@bles <[email protected]>? I think at a minimum you'd need to find the @ from the rear, and even then it seems a bit fraught.
Ha, I only exploited email addresses, well thought sir. +1.
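
Since the answer's function assumes the From: prefix has already been chopped off, here is one way that step might look. The helper name skip_prefix is hypothetical, and it matches the prefix with exact case, which is an assumption about how the headers are written in the input file.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns a pointer just past prefix if line
 * starts with it, otherwise returns line unchanged. */
const char *skip_prefix(const char *line, const char *prefix)
{
    assert(line != NULL && prefix != NULL);

    size_t n = strlen(prefix);
    return strncmp(line, prefix, n) == 0 ? line + n : line;
}

/* Possible use with the buffers from the question:
 *     parse_email_address(skip_prefix(from, "From:"), email);
 * The space after the colon and the newline left by fgets() are then
 * removed by the whitespace skipping inside parse_email_address(). */
```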
