I'm building a web app using Django. I uploaded a text file using

csv_file = request.FILES['file'].

I can't read the CSV into pandas. The file that I'm trying to import has text and data, but I only want the data.

I've tried the following

  1. I tried df = pd.read_csv(csv_file, sep=" ", header=None, names=["col1","col2","col3"], skiprows = 2) to skip the comments and read only the numbers.

Error: pandas will not read all 3 columns. It only reads 1 column.

  2. I tried df = pd.read_csv(csv_file, sep="\s{2}", header=None, names=["col1","col2","col3"], skiprows = 2) to skip the comments and read only the numbers.

Error: cannot use a string pattern on a bytes-like object

  3. I tried df = pd.read_csv(csv_file.read(), sep=" ", header=None, names=["col1","col2","col3"], skiprows = 2) to skip the comments and read only the numbers.

File I uploaded

% filename
% username
2.0000  117.441  -0.430
2.0100  117.499  -0.337
2.0200  117.557  -0.246
2.0300  117.615  -0.157
2.0400  117.672  -0.069

views.py

def new_measurement(request, pk):
    material = Material.objects.get(pk=pk)
    if request.method == 'POST':
        form = NewTopicForm(request.POST)
        if form.is_valid():
            topic = form.save(commit=False)
            topic.material = material
            topic.message=form.cleaned_data.get('message')
            csv_file = request.FILES['file']
            df = genDataFrame(csv_file)
            topic.data = df
            topic.created_by = request.user
            topic.save()
            return redirect('topic_detail', pk =  material.pk)
    else:
        form = NewTopicForm()
    return render(request, 'new_topic.html', {'material': material, 'form': form})


def genDataFrame(csv_file):
    df = pd.read_csv(csv_file, sep=" ", header=None, names=["col1","col2","col3"])
    # convert_objects was removed in pandas 1.0; coerce each column to numeric instead
    df = df.apply(pd.to_numeric, errors="coerce")
    df = df.dropna()
    df = df.reset_index(drop=True)
    return df

I want to get a dataframe like

col1   col2     col3
2.0000  117.441  -0.430
2.0100  117.499  -0.337
2.0200  117.557  -0.246
2.0300  117.615  -0.157
2.0400  117.672  -0.069

2 Answers

This works on the data you provided and gives you the dataframe you expect:

df = pd.read_csv(csv_filepath, sep='  ', header=None, 
                 names=['col1', 'col2', 'col3'], skiprows=2, engine='python')

Because sep is more than one character, pandas has to use the Python engine instead of the C engine. The Python engine sometimes has trouble with quotes, but your file doesn't have any, so that's fine. You don't strictly need to specify the engine: pandas will select it automatically, but you'll get a ParserWarning on stderr. Specifying engine='python' suppresses that.
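To try this without a file on disk, here is a minimal sketch that feeds the sample rows from the question through the same call. The `io.StringIO` buffer and the `SAMPLE` constant are stand-ins introduced only for illustration; in the real view the input would be the uploaded file.

```python
import io

import pandas as pd

# Sample rows copied from the question: two '%' comment lines, then three
# numeric columns separated by two spaces.
SAMPLE = (
    "% filename\n"
    "% username\n"
    "2.0000  117.441  -0.430\n"
    "2.0100  117.499  -0.337\n"
    "2.0200  117.557  -0.246\n"
)

df = pd.read_csv(
    io.StringIO(SAMPLE),   # hypothetical in-memory stand-in for the file
    sep="  ",              # two spaces: a multi-character separator
    header=None,
    names=["col1", "col2", "col3"],
    skiprows=2,            # skip the two '%' comment lines
    engine="python",       # stated explicitly so no fallback warning is emitted
)
print(df)
```

If the number of spaces between columns can vary, a regex separator such as sep=r"\s+" is likely the more robust choice.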

4 Comments

When I have sep = ' ', I get an error "cannot use a string pattern on a bytes-like object". When I have sep = '\s+', the warning is "Anomalous backslash in string: '\s'. String constant might be missing an r prefix" and it gives me an empty DataFrame
When you use a backslash in a regex, you need to use raw text, i.e. sep=r'\s+'
As for your bytes-like object error, that sounds like you haven't decoded your file into text format.
Can you tell me how to decode my file into text format? When I type csv_filepath.read(), the class type is 'bytes.'

You almost had the right approach in attempt #2 of your description. My answer just swaps a regex separator into @prooffreader's answer, which makes the statement less error-prone.

df = pd.read_csv('file_path', sep=r"\s+", header=None,
                 names=['col1', 'col2', 'col3'], skiprows=2)

3 Comments

When I have sep = ' ', I get an error "cannot use a string pattern on a bytes-like object". When I have sep = '\s+', the warning is "Anomalous backslash in string: '\s'. String constant might be missing an r prefix" and it gives me an empty DataFrame
Try using sep=r'\s+'. If that doesn't work, can you please post the complete code that you're using to read the file, and the file as well? Ideally, you should not get such an error.
thanks! it worked. I had to import re. That was the mistake
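On the "bytes-like object" error raised in the comments: an upload read with .read() yields bytes, so one way to get text is to decode it and wrap the result in io.StringIO before handing it to pandas. The following is only a sketch under assumptions: the function name gen_dataframe, the UTF-8 encoding, and the io.BytesIO object standing in for request.FILES['file'] are all hypothetical and not taken from the original code.

```python
import io

import pandas as pd


def gen_dataframe(uploaded_file, encoding="utf-8"):
    """Decode a binary upload to text, then parse the numeric columns.

    `uploaded_file` is any object whose .read() returns bytes, such as a
    Django UploadedFile. The encoding is an assumption about the file.
    """
    text = uploaded_file.read().decode(encoding)  # bytes -> str
    return pd.read_csv(
        io.StringIO(text),
        sep=r"\s+",        # raw string, so the backslash reaches the regex
        header=None,
        names=["col1", "col2", "col3"],
        skiprows=2,        # skip the two '%' comment lines
    )


# Hypothetical stand-in for request.FILES['file'], so the sketch runs
# outside of Django.
fake_upload = io.BytesIO(
    b"% filename\n"
    b"% username\n"
    b"2.0000  117.441  -0.430\n"
    b"2.0100  117.499  -0.337\n"
)
df = gen_dataframe(fake_upload)
print(df)
```

In the view this would be called as gen_dataframe(request.FILES['file']). If the file is not UTF-8, the encoding argument would need to change.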
