I am working on a Python tool to migrate emails into Gmail while preserving the original Date header. My goal is simply to build a CLI tool that copies email from Gmail account A to Gmail account B, preserving all data and metadata (including date and time).

I am using the Gmail API's users.messages.insert method, as suggested in the Google support documentation. According to that page, using internalDateSource: 'dateHeader' and deleted: true should make Gmail take the date from the email's Date header: https://support.google.com/vault/answer/9987957?hl=en

Here is a minimal code example:

from googleapiclient.discovery import build
import base64

# Initialize Gmail API client
service = build('gmail', 'v1', credentials=your_credentials)

# Raw email with a custom Date header
raw_email = """\
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
to: [email protected]
from: [email protected]
subject: Test Email
Date: Tue, 13 Aug 2024 14:00:00 +0000

This is a test email.
"""

# Encode the email
raw_email_b64 = base64.urlsafe_b64encode(raw_email.encode('utf-8')).decode('utf-8')

# Insert the email using the Gmail API
body = {
    'raw': raw_email_b64,
    'internalDateSource': 'dateHeader',
    'deleted': True
}
response = service.users().messages().insert(userId='me', body=body).execute()

# Log the response
print(response)

Problem: Despite setting internalDateSource: 'dateHeader' and deleted: true, the Date header in the inserted email is overridden by the current timestamp. The original Date header is not preserved; the time of insertion is used instead.

Question: Is this behavior expected, or am I missing something in the implementation? Are there additional steps required to enforce the Date header during email insertion? Any insights or workarounds would be greatly appreciated!

What I tried: I verified that the Date header is correctly set in the raw email before insertion, used the internalDateSource: 'dateHeader' parameter as per the Google support suggestion, and added the deleted: true parameter to the users.messages.insert call.

Observations: The Gmail API still overrides the Date header with the current timestamp. The X-Original-Date header workaround works, but I would prefer to rely on the Date header directly.

The documentation shows dateHeader as an enum value, so maybe it normally has an integer value assigned and you would have to use that integer instead of the string "dateHeader". Commented Aug 13 at 22:48

1 Answer

internalDateSource has to be passed outside body, as a keyword argument of insert() (and it works even without deleted):

body = {
    'raw': raw_email_b64,
    #'labelIds': ['INBOX'],  # Optional: put in INBOX
}

response = service.users().messages().insert(
    userId='me',
    body=body,
    internalDateSource='dateHeader',  # keyword argument, not a key in body
    #deleted=True
).execute()
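
As far as I can tell from the REST reference for users.messages.insert, internalDateSource and deleted are query parameters, while the request body is only the Message resource. That would explain why keys placed inside body have no effect. A minimal sketch of how the two parts separate (the helper name split_insert_request is made up for illustration, and the Python client builds the real request for you):

```python
import base64
from urllib.parse import urlencode

# REST endpoint for users.messages.insert (plain JSON request, no media upload)
GMAIL_INSERT_URL = "https://gmail.googleapis.com/gmail/v1/users/{user_id}/messages"


def split_insert_request(raw_email, user_id="me", use_date_header=True):
    """Hypothetical helper: return (url, json_body) roughly as laid out on the wire."""
    query = {}
    if use_date_header:
        # Query parameter, not part of the Message resource
        query["internalDateSource"] = "dateHeader"

    url = GMAIL_INSERT_URL.format(user_id=user_id)
    if query:
        url += "?" + urlencode(query)

    # The body carries only fields of the Message resource, such as raw and labelIds
    body = {
        "raw": base64.urlsafe_b64encode(raw_email.encode("utf-8")).decode("ascii"),
    }
    return url, body


url, body = split_insert_request(
    "Date: Tue, 13 Aug 2024 14:00:00 +0000\r\n\r\nThis is a test email.\r\n"
)
print(url)
print(sorted(body))
```

The printed URL ends with ?internalDateSource=dateHeader, and the body contains only raw.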

I realized this after checking the help/docstring:

help( service.users().messages().insert )

The same text is also available online (insert parameters), and its first line shows the signature:

insert(userId, body=None, deleted=None, internalDateSource=None, media_body=None, media_mime_type=None, x__xgafv=None)

I ran the code with the Date header set to Sun, 17 May 2020 15:30:00 +0000, and Gmail uses this value to sort the message (any difference in the displayed time may come from my local timezone, +0200).
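
To illustrate the timezone remark, here is a small standard-library sketch that converts the header value to a fixed +02:00 offset (the offset is only an assumption standing in for my local timezone):

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

header_value = "Sun, 17 May 2020 15:30:00 +0000"
sent_at = parsedate_to_datetime(header_value)  # timezone-aware, UTC

# Assumed local zone for illustration: a fixed +02:00 offset
local_zone = timezone(timedelta(hours=2))
shown_at = sent_at.astimezone(local_zone)

print(sent_at.isoformat())   # 2020-05-17T15:30:00+00:00
print(shown_at.isoformat())  # 2020-05-17T17:30:00+02:00
```

Both values describe the same instant, so a message list showing 17:30 is consistent with a 15:30 +0000 header.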

The Show Source view also shows this value as Created at, and as the Date header in the raw data.
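
The same check could be done in code instead of in the web UI. The sketch below compares the internalDate field of a users.messages.get response (a string of epoch milliseconds) with the Date header. The helper names and the one-second tolerance are my own choices, and the sample dictionary is hand-written rather than a real API response:

```python
from email.utils import parsedate_to_datetime


def date_header_to_epoch_ms(date_header):
    """Convert an RFC 5322 Date header value to epoch milliseconds."""
    return int(parsedate_to_datetime(date_header).timestamp() * 1000)


def internal_date_matches(message, tolerance_ms=1000):
    """Hypothetical helper: message is a messages.get response with payload headers."""
    headers = message["payload"]["headers"]
    date_header = next(h["value"] for h in headers if h["name"].lower() == "date")
    difference = abs(int(message["internalDate"]) - date_header_to_epoch_ms(date_header))
    return difference <= tolerance_ms


# Hand-written sample shaped like a format='metadata' response
sample = {
    "id": "example-id",
    "internalDate": "1589729400000",
    "payload": {
        "headers": [{"name": "Date", "value": "Sun, 17 May 2020 15:30:00 +0000"}],
    },
}
print(internal_date_matches(sample))
```

With a real service object, the message would be fetched with service.users().messages().get(userId='me', id=inserted['id'], format='metadata', metadataHeaders=['Date']).execute(). I expect this to need a read scope such as gmail.readonly in addition to gmail.insert.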


Full working code used for tests:

import base64
import datetime
import os
import pickle
import sys
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.discovery import build
from email.mime.text import MIMEText

SCOPES = ['https://www.googleapis.com/auth/gmail.insert']

def get_credentials():
    creds = None

    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)

    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            #print('refresh')
            creds.refresh(Request())  # refresh silently
        else:
            flow = InstalledAppFlow.from_client_secrets_file('credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)

        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    return creds

def create_message_1():
    # Header lines must start at column 0: leading whitespace would turn them
    # into folded continuation lines and corrupt the message.
    raw_email = """\
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
to: [email protected]
from: [email protected]
subject: Test Email
Date: Tue, 13 Aug 2024 14:00:00 +0000

This is a test email.
"""

    print(raw_email)

    raw_message = base64.urlsafe_b64encode(raw_email.encode('utf-8')).decode('utf-8')

    return raw_message

def create_message_2():
    # Use timezone-aware datetimes so that %z renders an offset
    # (datetime.timezone.utc also works on Python versions before 3.11).
    now = datetime.datetime.now(datetime.timezone.utc)
    custom_date = datetime.datetime(2020, 5, 17, 15, 30, tzinfo=datetime.timezone.utc)

    message = MIMEText(f"This email has a custom date {custom_date:%a, %d %b %Y %H:%M:%S %z} (Sent: {now:%a, %d %b %Y %H:%M:%S %z})")
    message['to'] = "[email protected]"
    message['from'] = "[email protected]"
    message['subject'] = "Email with custom date"
    message['date'] = custom_date.strftime('%a, %d %b %Y %H:%M:%S %z')  # With timezone

    print(message.as_bytes().decode())

    # as_bytes() takes no encoding; its first positional parameter is unixfrom,
    # so passing 'utf-8' would prepend a "From " envelope line.
    raw_message = base64.urlsafe_b64encode(message.as_bytes()).decode('utf-8')

    return raw_message

# ------

creds = get_credentials()

service = build('gmail', 'v1', credentials=creds)

#help(service.users().messages().insert)  # to see parameters
# https://googleapis.github.io/google-api-python-client/docs/dyn/gmail_v1.users.messages.html#insert

if len(sys.argv) == 1:
    raw_message = create_message_1()
else:
    raw_message = create_message_2()

inserted = service.users().messages().insert(
    userId="me",
    body={
        'raw': raw_message,
        'labelIds': ['INBOX'],  # Optional: put in INBOX
        #'internalDate': int(custom_date.timestamp()),
    },
    internalDateSource='dateHeader',
    #internalDateSource='receivedTime',
    #deleted=True,
).execute()

print(f"Inserted message ID: {inserted['id']}")

print(f"{inserted = }")

Sometimes inserted contains only id, but sometimes it also contains labelIds and threadId.
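
For the original goal of copying mail from account A to account B, the same call could be combined with users.messages.get using format='raw'. This is only a sketch: copy_message is a name I made up, it assumes two already-authorized service objects (the source one needs a scope that allows reading), and it reads the optional response keys with .get() because they are not always present:

```python
def copy_message(src_service, dst_service, message_id, label_ids=None):
    """Hypothetical helper: copy one message, keeping the Date header as internal date."""
    # format='raw' returns the full RFC 2822 message, already base64url-encoded
    source = src_service.users().messages().get(
        userId="me", id=message_id, format="raw"
    ).execute()

    body = {"raw": source["raw"]}
    if label_ids:
        body["labelIds"] = list(label_ids)

    inserted = dst_service.users().messages().insert(
        userId="me",
        body=body,
        internalDateSource="dateHeader",  # keyword argument, not a key in body
    ).execute()

    # threadId and labelIds may be missing from the response
    return {
        "id": inserted["id"],
        "threadId": inserted.get("threadId"),
        "labelIds": inserted.get("labelIds", []),
    }
```

Usage would look like copy_message(service_a, service_b, some_id, label_ids=['INBOX']), where service_a and service_b are built with build('gmail', 'v1', credentials=...) for the two accounts.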
