I need to know how to extract a link from an HTML email in a Gmail account.

I am able to connect using the Python library for Gmail provided by Google, but once I use the function given as an example:

import base64
import email

from googleapiclient import errors


def GetMimeMessage(service, user_id, msg_id):
  """Get a Message and use it to create a MIME Message.

  Args:
    service: Authorized Gmail API service instance.
    user_id: User's email address. The special value "me"
    can be used to indicate the authenticated user.
    msg_id: The ID of the Message required.

  Returns:
    A MIME Message, consisting of data from Message.
  """
  try:
    message = service.users().messages().get(userId=user_id, id=msg_id,
                                             format='raw').execute()

    print('Message snippet: %s' % message['snippet'])

    # The 'raw' field is base64url-encoded; decoding it yields bytes.
    msg_bytes = base64.urlsafe_b64decode(message['raw'].encode('ASCII'))

    # Parse the bytes directly (Python 3); message_from_string expects str.
    mime_msg = email.message_from_bytes(msg_bytes)

    return mime_msg
  except errors.HttpError as error:
    print('An error occurred: %s' % error)

message = GetMimeMessage(service, 'me', '15876d11f0719f43')
#message = GetMessage(service, 'me', '15876d11f0719f43')
#message = str(message)
print (message)

The message looks like this:

    style=3D"font-family: Verdana, Geneva, sans-serif; font-siz=
e: 14px; line-height: 20px;"><a href=3D"https://e.vidaxl.com/1/4/1505/1/ZJb=
bJSDLWNxmfpHYIRQzlMiIupCb0wiKMrAxIrfXlymZQ_TK5GUcGAT6rIBJD9nfIFJ5XWG6HnYei-=
G1aQqlfnBxKnJ3yujKlOpRY2UxqroSHS51ofyXzr3kFa7OTyJH5zKbxESXzbTlcQOYxRuEnBcKF=
saVBGQXyJomUGLL6RY" target=3D"_blank" style=3D"color:blue;">SUIVEZ VOTRE CO=
MMANDE</a></td>=0A

As you can see, a "=" sign is added at the end of each line. I don't know why; is it because I print it?

My main question is how to extract a specific link from a MIME email. I tried lxml without success, first because of those added "=" signs, I think, or because it's not valid HTML or XML.

Thanks for your help

  • The example on the Google API page, developers.google.com/gmail/api/v1/reference/users/messages/get, doesn't use the "raw" format. Have you tried the default ("full")? Commented Nov 22, 2016 at 15:31
  • Hi, they are using raw, no? I also tried with full, but I got an error saying the key does not exist. Commented Nov 22, 2016 at 15:39
  • Try using httplib, the HTTP protocol client library from Python, which is meant for parsing HTTP responses, as demoed in this SO thread. Try using msg.get_payload(), which returns the current payload. Demo code is here. Commented Nov 23, 2016 at 12:02
  • I used your response to find a solution that solves my issue. Thanks again. I will edit my question to post the updated code. Commented Nov 25, 2016 at 16:15
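
Following the msg.get_payload() suggestion, here is a minimal sketch of one way this could work. It is not the solution the asker ended up with. The trailing "=" signs and the "=3D" sequences look like quoted-printable transfer encoding, and get_payload(decode=True) from the standard email package undoes that encoding. The standard html.parser module can then collect the href values. The names LinkCollector, extract_links and SAMPLE are hypothetical, and the sample message and its example.com URL are made up for illustration.

```python
import email
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it is fed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(mime_msg):
    """Return all hrefs found in the text/html parts of a MIME message."""
    collector = LinkCollector()
    for part in mime_msg.walk():
        if part.get_content_type() != "text/html":
            continue
        # decode=True reverses the Content-Transfer-Encoding, so the "="
        # soft line breaks and "=3D" escapes are gone afterwards.
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "utf-8"
        collector.feed(payload.decode(charset, errors="replace"))
    return collector.links


# Hypothetical quoted-printable message, shaped like the output in the question.
SAMPLE = (
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/html; charset=utf-8\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b'<td><a href=3D"https://example.com/track/ZJb=\n'
    b'bJSD" target=3D"_blank">SUIVEZ VOTRE CO=\n'
    b"MMANDE</a></td>=0A\n"
)

links = extract_links(email.message_from_bytes(SAMPLE))
print(links)
```

With the Gmail API, the object returned by GetMimeMessage could be passed to extract_links in place of the sample. Filtering the resulting list (for example by link host) would then pick out the specific link.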
