I have the following problem:

I want to convert a tab-delimited text file to a CSV file. The text file is the SentiWS dictionary, which I want to use for sentiment analysis (https://github.com/MechLabEngineering/Tatort-Analyzer-ME/tree/master/SentiWS_v1.8c).

The code I used to do this is the following:

import csv

txt_file = r"SentiWS_v1.8c_Positive.txt"
csv_file = r"NewProcessedDoc.csv"

in_txt = csv.reader(open(txt_file, "r"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'w'))

out_csv.writerows(in_txt)

This code writes everything into a single column, but I need the data in three separate columns (labelled Row1 to Row3 below), as in the original file. There is also a blank line under each line of data, and I don't know why.

I want the data to be in this form:

Row1 Row2 Row3

Word Data Words

Word Data Words

instead of

Row1

Word,Data,Words

Word,Data,Words

Can anyone help me?

1
  • What is the problem? Your script seems to work fine for me. Can you include a few lines of the actual output of your script (not just "row1 row2 row3") and then the same few lines in your desired format? Commented Mar 14, 2017 at 2:37

2 Answers

9
import pandas

This reads the tab-delimited text file into a DataFrame (note that read_csv treats the first line as column headers unless you pass header=None):

dataframe = pandas.read_csv("SentiWS_v1.8c_Positive.txt",delimiter="\t")

Then write the DataFrame to a CSV file:

dataframe.to_csv("NewProcessedDoc.csv", encoding='utf-8', index=False)

4

Try this:

import csv

txt_file = r"SentiWS_v1.8c_Positive.txt"
csv_file = r"NewProcessedDoc.csv"

with open(txt_file, "r", newline='') as in_text:
    in_reader = csv.reader(in_text, delimiter='\t')
    # newline='' is an argument of open(), not of csv.writer()
    with open(csv_file, "w", newline='') as out_csv:
        out_writer = csv.writer(out_csv)
        for row in in_reader:
            out_writer.writerow(row)

There is also a blank line under each data and I don't know why.

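You're probably running the script on Windows. The csv writer already ends each row with \r\n, and an output file opened in text mode without newline='' translates that \n into \r\n a second time, so every row ends in \r\r\n, which shows up as a blank line. According to the Python 3 csv module docs: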

If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
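
To see the effect described in that quote on any platform, here is a minimal sketch. It is only an illustration, not part of the original script: the helper write_rows and the sample rows are made up, and passing newline="\r\n" to open() is used as a stand-in for what the default text mode does on Windows.

```python
import csv
import os
import tempfile

# Made-up sample data standing in for the dictionary rows.
rows = [["Word", "Data", "Words"], ["Word", "Data", "Words"]]


def write_rows(path, newline):
    """Write the sample rows with the given newline mode and return the raw bytes."""
    # newline is an argument of open(); csv.writer() does not accept it.
    with open(path, "w", newline=newline, encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    # newline="\r\n" translates every "\n" to "\r\n" on write,
    # which is what happens by default on Windows.
    doubled = write_rows(os.path.join(tmp, "doubled.csv"), "\r\n")
    # newline="" disables translation, so the writer's own "\r\n" is kept as is.
    clean = write_rows(os.path.join(tmp, "clean.csv"), "")

print(doubled)
print(clean)
```

The first file ends each row with \r\r\n, which spreadsheet programs and many editors display as an extra blank line; the second ends each row with a single \r\n.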

1 Comment

You're welcome, @gHOsTaManTe - please upvote and mark as the accepted answer if this resolves your issue.
