I wrote an application where I fetch a message and check its content:
public void getInhoud(Message msg) throws IOException, Exception {
    Object contt = msg.getContent();
    ...
    if (contt instanceof String) {
        handlePart((Part) msg);
    }
    ...
}

public void handlePart(Part part)
        throws MessagingException, IOException, Exception {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    String contentType = part.getContentType();
    ...
    if ((contentType.length() >= 9)
            && (contentType.toLowerCase().substring(0, 9).equals("text/html"))) {
        part.writeTo(out);
        String stringS = out.toString();
    }
    ...
}
I removed the unnecessary code. This method works for e-mails sent from Gmail, Hotmail and the Outlook desktop client, but somehow fails for e-mails sent from the Office 365 web client. For every other client the content type is text/plain; only for Office 365 mail is it text/html. The method writes the data of the Part to a ByteArrayOutputStream, which is then converted to a String. This works, or at least the String contains the content of the part, but the HTML it contains is somewhat faulty.
Here is an example: http://pastebin.com/5mEYCHxD (posted to Pastebin, it is pretty big).
Notice the = symbols which are printed at the end of almost every line. Is this something I can fix within the code, or should it be fixed somewhere in the mail client?
I thought about looping through every line of HTML and removing the = after having checked that it is not part of an HTML tag.
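For context, the trailing = signs look like quoted-printable soft line breaks, which would also explain sequences such as =3D inside tags. If that is the case, stripping only the = characters would leave the other escapes behind. Below is a minimal sketch of what decoding would involve, assuming the body really is quoted-printable and otherwise plain ASCII. The class QuotedPrintableSketch and its decode method are hypothetical names written for illustration and are not part of JavaMail. As far as I understand, part.writeTo() outputs the part in its encoded form, while part.getContent() and part.getInputStream() return decoded data, so using those may make manual decoding unnecessary.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

public class QuotedPrintableSketch {

    // Hypothetical helper (not a JavaMail API): decodes quoted-printable text.
    // Assumes the encoded input itself contains only ASCII characters.
    public static String decode(String encoded, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = encoded.length();
        int i = 0;
        while (i < n) {
            char c = encoded.charAt(i);
            if (c != '=') {
                // Ordinary character: copy it through unchanged.
                out.write((byte) c);
                i++;
            } else if (i + 1 < n && encoded.charAt(i + 1) == '\n') {
                // Soft line break "=\n": drop both characters.
                i += 2;
            } else if (i + 2 < n && encoded.charAt(i + 1) == '\r'
                    && encoded.charAt(i + 2) == '\n') {
                // Soft line break "=\r\n": drop all three characters.
                i += 3;
            } else if (i + 2 < n && isHex(encoded.charAt(i + 1))
                    && isHex(encoded.charAt(i + 2))) {
                // Escaped byte "=XX": emit the byte with hex value XX.
                out.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                // A lone '=' that matches no pattern: keep it as-is.
                out.write('=');
                i++;
            }
        }
        // Interpret the decoded bytes using the charset of the part.
        return new String(out.toByteArray(), charset);
    }

    private static boolean isHex(char c) {
        return Character.digit(c, 16) >= 0;
    }
}
```

With this sketch, calling QuotedPrintableSketch.decode(stringS, StandardCharsets.UTF_8) on the String from handlePart would remove the line-ending = signs and turn =3D back into =. The charset is an assumption here; the real one would come from the part's Content-Type header.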
Any help is very much appreciated; this has been bothering me for a few weeks now.
Thanks!