74

I'd like to read numbers from a file into a two-dimensional array.

File contents:

  • line containing w, h
  • h lines containing w integers separated with space

For example:

4 3
1 2 3 4
2 3 4 5
6 7 8 9
0

7 Answers 7

108

Assuming you don't have extraneous whitespace:

with open('file') as f:
    w, h = [int(x) for x in next(f).split()] # read first line
    array = []
    for line in f: # read rest of lines
        array.append([int(x) for x in line.split()])

You could condense the last for loop into a nested list comprehension:

with open('file') as f:
    w, h = [int(x) for x in next(f).split()]
    array = [[int(x) for x in line.split()] for line in f]
Sign up to request clarification or add additional context in comments.

11 Comments

aren't values stored as strings then ?
I think this is an okay solution, but I'm always hesitant to iterate and append... IMO it's usually more succinct and easier to read to work with list generators, where you can do both in a single operation and without a flow-control structure like a for loop.
I prefer list comprehensions as well, up until they become huge nested monstrosities.
+1 : This is an excellent answer and your skillful use of both readline and line in f earns my highest regards. Cheers.
@Jean-ClaudeArbaut No, you are right. In Python 3 you are allowed to freely mix next(f) and f.readline(), since next() is actually implemented using readline(), and buffering is moved to a separate class used by all mechanisms of reading from a file. Thanks for pointing this out. I now remember reading about it years ago, but it had slipped my mind when I wrote the previous comment.
|
19

To me this kind of seemingly simple problem is what Python is all about. Especially if you're coming from a language like C++, where simple text parsing can be a pain in the butt, you'll really appreciate the functionally unit-wise solution that python can give you. I'd keep it really simple with a couple of built-in functions and some generator expressions.

You'll need open(name, mode), myfile.readlines(), mystring.split(), int(myval), and then you'll probably want to use a couple of generators to put them all together in a pythonic way.

# This opens a handle to your file, in 'r' read mode
file_handle = open('mynumbers.txt', 'r')
# Read in all the lines of your file into a list of lines
lines_list = file_handle.readlines()
# Extract dimensions from first line. Cast values to integers from strings.
cols, rows = (int(val) for val in lines_list[0].split())
# Do a double-nested list comprehension to get the rest of the data into your matrix
my_data = [[int(val) for val in line.split()] for line in lines_list[1:]]

Look up generator expressions here. They can really simplify your code into discrete functional units! Imagine doing the same thing in 4 lines in C++... It would be a monster. Especially the list generators, when I was I C++ guy I always wished I had something like that, and I'd often end up building custom functions to construct each kind of array I wanted.

4 Comments

I don't think this works. cols, rows = (int(val) for val in '4 3\n') doesn't do what you want. Same for [int(val) for val in line] because line will be something like '1 2 3 4\n'
@Jason: Yeah sorry there were a couple of errors in my original code, but the gist was right. Corrected above. I guess that's what iterative development is for! :)
In the trivial case OP mentions, the C++ version, while slightly longer, would not be "a monster" as you say. You would use fscanf() or streams and vector<vector<int>> (or even int[][]). And C++ would provide much more control over memory management while reading and parsing the file.
Actually, ifstreams are simpler to handle than fscanf, which is a C function, not a C++ one. If you're simply parsing text in C++ and have anything more complex than the suggested python solution you're clearly doing something wrong.
6

Not sure why do you need w,h. If these values are actually required and mean that only specified number of rows and cols should be read than you can try the following:

output = []
with open(r'c:\file.txt', 'r') as f:
    w, h  = map(int, f.readline().split())
    tmp = []
    for i, line in enumerate(f):
        if i == h:
            break
        tmp.append(map(int, line.split()[:w]))
    output.append(tmp)

2 Comments

Interesting approach to include the header data as well, I didn't even think of that. +1 for completeness... but it's a bit lengthy / hard to read :)
Thanx) I have created extended solution that iterates line by line and creates list of list for all occurrences of w,h. However the best answer is already selected)))
2

is working with both python2(e.g. Python 2.7.10) and python3(e.g. Python 3.6.4)

with open('in.txt') as f:
  rows,cols=np.fromfile(f, dtype=int, count=2, sep=" ")
  data = np.fromfile(f, dtype=int, count=cols*rows, sep=" ").reshape((rows,cols))

another way: is working with both python2(e.g. Python 2.7.10) and python3(e.g. Python 3.6.4), as well for complex matrices see the example below (only change int to complex)

with open('in.txt') as f:
   data = []
   cols,rows=list(map(int, f.readline().split()))
   for i in range(0, rows):
      data.append(list(map(int, f.readline().split()[:cols])))
print (data)

I updated the code, this method is working for any number of matrices and any kind of matrices(int,complex,float) in the initial in.txt file.

This program yields matrix multiplication as an application. Is working with python2, in order to work with python3 make the following changes

print to print()

and

print "%7g" %a[i,j],    to     print ("%7g" %a[i,j],end="")

the script:

import numpy as np

def printMatrix(a):
   print ("Matrix["+("%d" %a.shape[0])+"]["+("%d" %a.shape[1])+"]")
   rows = a.shape[0]
   cols = a.shape[1]
   for i in range(0,rows):
      for j in range(0,cols):
         print "%7g" %a[i,j],
      print
   print      

def readMatrixFile(FileName):
   rows,cols=np.fromfile(FileName, dtype=int, count=2, sep=" ")
   a = np.fromfile(FileName, dtype=float, count=rows*cols, sep=" ").reshape((rows,cols))
   return a

def readMatrixFileComplex(FileName):
   data = []
   rows,cols=list(map(int, FileName.readline().split()))
   for i in range(0, rows):
      data.append(list(map(complex, FileName.readline().split()[:cols])))
   a = np.array(data)
   return a

f = open('in.txt')
a=readMatrixFile(f)
printMatrix(a)
b=readMatrixFile(f)
printMatrix(b)
a1=readMatrixFile(f)
printMatrix(a1)
b1=readMatrixFile(f)
printMatrix(b1)
f.close()

print ("matrix multiplication")
c = np.dot(a,b)
printMatrix(c)
c1 = np.dot(a1,b1)
printMatrix(c1)

with open('complex_in.txt') as fid:
  a2=readMatrixFileComplex(fid)
  print(a2)
  b2=readMatrixFileComplex(fid)
  print(b2)

print ("complex matrix multiplication")
c2 = np.dot(a2,b2)
print(c2)
print ("real part of complex matrix")
printMatrix(c2.real)
print ("imaginary part of complex matrix")
printMatrix(c2.imag)

as input file I take in.txt:

4 4
1 1 1 1
2 4 8 16
3 9 27 81
4 16 64 256
4 3
4.02 -3.0 4.0
-13.0 19.0 -7.0
3.0 -2.0 7.0
-1.0 1.0 -1.0
3 4
1 2 -2 0
-3 4 7 2
6 0 3 1
4 2
-1 3
0 9
1 -11
4 -5

and complex_in.txt

3 4
1+1j 2+2j -2-2j 0+0j
-3-3j 4+4j 7+7j 2+2j
6+6j 0+0j 3+3j 1+1j
4 2
-1-1j 3+3j
0+0j 9+9j
1+1j -11-11j
4+4j -5-5j

and the output look like:

Matrix[4][4]
     1      1      1      1
     2      4      8     16
     3      9     27     81
     4     16     64    256

Matrix[4][3]
  4.02     -3      4
   -13     19     -7
     3     -2      7
    -1      1     -1

Matrix[3][4]
     1      2     -2      0
    -3      4      7      2
     6      0      3      1

Matrix[4][2]
    -1      3
     0      9
     1    -11
     4     -5

matrix multiplication
Matrix[4][3]
  -6.98      15       3
 -35.96      70      20
-104.94     189      57
-255.92     420      96

Matrix[3][2]
    -3     43
    18    -60
     1    -20

[[ 1.+1.j  2.+2.j -2.-2.j  0.+0.j]
 [-3.-3.j  4.+4.j  7.+7.j  2.+2.j]
 [ 6.+6.j  0.+0.j  3.+3.j  1.+1.j]]
[[ -1. -1.j   3. +3.j]
 [  0. +0.j   9. +9.j]
 [  1. +1.j -11.-11.j]
 [  4. +4.j  -5. -5.j]]
complex matrix multiplication
[[ 0.  -6.j  0. +86.j]
 [ 0. +36.j  0.-120.j]
 [ 0.  +2.j  0. -40.j]]
real part of complex matrix
Matrix[3][2]
      0       0
      0       0
      0       0

imaginary part of complex matrix
Matrix[3][2]
     -6      86
     36    -120
      2     -40

Comments

2

To make the answer simple here is a program that reads integers from the file and sorting them

f = open("input.txt", 'r')

nums = f.readlines()
nums = [int(i) for i in nums]

After reading each line of the file converting each string to a digit

nums.sort()

Sorting the numbers

f.close()

f = open("input.txt", 'w')
for num in nums:
    f.write("%d\n" %num)

f.close()

Writing them back As easy as that, Hope this helps

Comments

1

The shortest I can think of is:

with open("file") as f:
    (w, h), data = [int(x) for x in f.readline().split()], [int(x) for x in f.read().split()]

You can seperate (w, h) and data if it looks neater.

Comments

0

I found this question searching for a Pandas solution. Here are the two methods to read this file into a dataframe. Both methods are using pandas.read_csv:

import pandas as pd

## Method 1:

# Ignore the header, read the rest of the file:
df = pd.read_csv('infile', header=None, sep=r'\s+', skiprows=1)

## Method 2:

# Read the header:
with open('infile') as infile:
    ncols, nrows = [int(x) for x in infile.readline().split()]

# Read the rest of the file, using the info from the header:
df = pd.read_csv('infile', header=None, sep=r'\s+', skiprows=1, nrows=nrows, usecols=range(ncols))

With either of the two methods, you get this dataframe:

print(df)
   0  1  2  3
0  1  2  3  4
1  2  3  4  5
2  6  7  8  9

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.