Extracting data from columns in a text file using Python

Question

I am new to file processing in Python. I have the following text file, a report on a new college campus. I want to extract the value in the "book IDs_1" column for the row whose "colleges" entry is block_ABC_top, which is 23. I also want to know whether block_ABC_top occurs anywhere else in the colleges column and, if so, get the book IDs_1 value for those rows as well. Is this possible with a text file, or do I have to convert it to CSV? How do I write the code for this?

Copyright 1986-2019, Inc. All Rights Reserved.

Design Information
-----------------------------------------------------------------------------------------------------------------
| Version : (lin64) Build 2729669 Thu Dec  5 04:48:12 MST 2019
| Date         : Wed Aug 26 00:46:08 2020
| Host         : running 64-bit Red Hat Enterprise Linux Server release 7.8 
| Command      : college report
| Design       : college
| Device       : laptop
| Design State : in construction
-----------------------------------------------------------------------------------------------------------------

Table of Contents
-----------------
1. Information by Hierarchy

1. Information by Hierarchy
---------------------------
+----------------------------------------------+--------------------------------------------+------------+------------+---------+------+-----+
|                   colleges                   |                   Module                   | Total mems | book IDs_1 | canteen | BUS  | UPS | 
+----------------------------------------------+--------------------------------------------+------------+------------+---------+------+-----+
| block_ABC_top                                |                                      (top) |         44 |         23 |       8 |    8 |   8 |   
|    (block_ABC_top_0)                         |                            block_ABC_top_0 |          5 |          5 |       5 |    2 |   9 |       
+----------------------------------------------+--------------------------------------------+------------+------------+---------+------+-----+

I have a list, data, that holds college names such as block_ABC_top, block_ABC_top_1, block_ABC_top, block_ABC_top_1, and so on. My code is below. The problem is that the check only runs for data[0]. Both data[0] and data[2] hold the same college, so I expect the check to happen twice.

with open("utility.txt", 'r') as f1:
    for line in f1:
        if data[x] in line:
            line_values = line.split('|')

            if (int(line_values[4]) == 23 or int(line_values[7]) == 8):
                filecheck = fullpath + "/" + filenames[x]
                print filecheck

                #print "check file "+ filenames[x]
            x = x + 1

Is there a specific issue? Have you tried anything or done any research? Please see How to Ask and the help center.
– AMC, Commented Aug 28, 2020 at 10:38
@AMC Yes, I tried searching the file for block_ABC_top and extracting that line, but how do I extract the value in the book ID column from that line?
– k11, Commented Aug 31, 2020 at 5:04
Please share the code you have so far, then, as well as the specific problem you encountered.
– AMC, Commented Sep 1, 2020 at 1:07

Yansh · Accepted Answer · 2020-08-28 10:35:45Z

1

rows = [x.split('|') for x in open('utility.txt') if x.count('|') >= 8]  # table rows only
print([r[1].strip() for r in rows])  # colleges column
print([r[4].strip() for r in rows])  # book IDs_1 column

Try running these.
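If the column positions might change, the columns can also be looked up by header name instead of by index. The following is only a sketch under some assumptions: the helper name read_columns is made up for illustration, SAMPLE is an abridged copy of the report from the question with the column padding shortened, and the header row is taken to be the first "|" row that contains every requested name.

```python
# Abridged copy of the report from the question (padding shortened).
SAMPLE = """\
Design Information
| Command      : college report
| Design       : college

1. Information by Hierarchy
+----------------------+-----------------+------------+------------+---------+------+-----+
|       colleges       |     Module      | Total mems | book IDs_1 | canteen | BUS  | UPS |
+----------------------+-----------------+------------+------------+---------+------+-----+
| block_ABC_top        |           (top) |         44 |         23 |       8 |    8 |   8 |
|    (block_ABC_top_0) | block_ABC_top_0 |          5 |          5 |       5 |    2 |   9 |
+----------------------+-----------------+------------+------------+---------+------+-----+
"""


def read_columns(lines, wanted):
    """Return {column name: [cell values]} for the column names in `wanted`.

    Hypothetical helper: rows before the table header are ignored.
    """
    header = None
    columns = {name: [] for name in wanted}
    for line in lines:
        if not line.startswith("|"):
            continue  # borders, blank lines, free text
        # Drop the empty pieces before the first and after the last "|".
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if header is None:
            if all(name in cells for name in wanted):
                header = cells  # this is the table header row
            continue
        if len(cells) != len(header):
            continue  # not a complete data row
        for name in wanted:
            columns[name].append(cells[header.index(name)])
    return columns


result = read_columns(SAMPLE.splitlines(), ["colleges", "book IDs_1"])
print(result["colleges"])
print(result["book IDs_1"])
```

With a real file, the same function can be given the open file object, for example read_columns(open_file, ["colleges", "book IDs_1"]) inside a with open(...) block.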

@Yansh I ran this, but I specifically want the book IDs_1 value for block_ABC_top, not the columns extracted separately. Also, the input file has some text above the table. :(
– k11, Commented over a year ago

Murilo Schünke · Accepted Answer · 2020-08-28 10:36:38Z

1

Instead of relying on the exact character position of each field, a better way is to use the split() function, since your fields are separated by a | symbol. You can loop through the lines of the file and handle them accordingly.

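with open("utility.txt") as f:
    for line in f:
        line_values = line.split("|")
        # Each row starts with "|", so index 0 is empty and index 1 is the colleges cell.
        if len(line_values) > 1:
            print(line_values[1].strip())  # e.g. block_ABC_top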
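To get the book IDs_1 value only for block_ABC_top, and to see whether it occurs more than once, the colleges cell can be compared exactly after splitting. This is a minimal sketch, not a definitive implementation: find_book_ids is a made-up name, ROWS holds the table rows from the question with the padding shortened, and the column indices 1 and 4 assume the layout shown in the question.

```python
# Table rows from the question (padding shortened).
ROWS = [
    "+----------------------+-----------------+------------+------------+---------+------+-----+",
    "|       colleges       |     Module      | Total mems | book IDs_1 | canteen | BUS  | UPS |",
    "| block_ABC_top        |           (top) |         44 |         23 |       8 |    8 |   8 |",
    "|    (block_ABC_top_0) | block_ABC_top_0 |          5 |          5 |       5 |    2 |   9 |",
]


def find_book_ids(lines, college="block_ABC_top", college_col=1, book_col=4):
    """Return the book IDs_1 value of every row whose colleges cell equals `college`.

    Hypothetical helper: the cell must match exactly, so "(block_ABC_top_0)"
    is not counted as an occurrence of "block_ABC_top".
    """
    matches = []
    for line in lines:
        parts = line.split("|")
        if len(parts) <= max(college_col, book_col):
            continue  # border, blank line, or free text
        if parts[college_col].strip() == college:
            matches.append(int(parts[book_col]))  # int() tolerates the padding
    return matches


values = find_book_ids(ROWS)
print(values, "occurrence(s):", len(values))
```

The length of the returned list is the number of times the college appears in the colleges column.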


Jay Bharadia · Accepted Answer · 2020-09-02 18:28:59Z

0

To extract the book IDs_1 column data, use the code below:

with open('report.txt') as f:
    for line in f:
        # Substring test: this also matches the (block_ABC_top_0) row.
        if 'block_ABC_top' in line:
            line_values = line.split('|')
            print(line_values[4].strip())  # book IDs_1 column: prints 23 and 5
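For the case in the question where the data list holds the same college more than once, one option is to read the table a single time into a dictionary and then loop over data, so the check runs for every entry instead of depending on where the file iterator is. The sketch below is an assumption-laden illustration: rows_by_college and files_to_check are made-up names, the file names and path in the example call are invented, and the indices 4 and 7 and the values 23 and 8 are taken from the question's code.

```python
# Table rows from the question (padding shortened).
ROWS = [
    "|       colleges       |     Module      | Total mems | book IDs_1 | canteen | BUS  | UPS |",
    "| block_ABC_top        |           (top) |         44 |         23 |       8 |    8 |   8 |",
    "|    (block_ABC_top_0) | block_ABC_top_0 |          5 |          5 |       5 |    2 |   9 |",
]


def rows_by_college(lines):
    """Map each colleges cell to the stripped cells of its row (hypothetical helper)."""
    rows = {}
    for line in lines:
        parts = [p.strip() for p in line.split("|")]
        # Keep only data rows: enough cells and numeric values where expected.
        if len(parts) >= 8 and parts[4].isdigit() and parts[7].isdigit():
            rows[parts[1]] = parts
    return rows


def files_to_check(lines, data, filenames, fullpath):
    """Return the path for every entry in `data` whose row passes the check."""
    rows = rows_by_college(lines)
    selected = []
    for name, filename in zip(data, filenames):
        parts = rows.get(name)
        if parts is None:
            continue  # this college is not in the report
        if int(parts[4]) == 23 or int(parts[7]) == 8:
            selected.append(fullpath + "/" + filename)
    return selected


# Invented example values; data[0] and data[2] name the same college.
data = ["block_ABC_top", "block_ABC_top_1", "block_ABC_top"]
filenames = ["a.txt", "b.txt", "c.txt"]
print(files_to_check(ROWS, data, filenames, "/tmp"))
```

Because the lookup is by exact name, block_ABC_top_1 is simply skipped here, and both occurrences of block_ABC_top are checked.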
