Continue to Site

Welcome to EDAboard.com

Welcome to our site! EDAboard.com is an international Electronics Discussion Forum focused on EDA software, circuits, schematics, books, theory, papers, asic, pld, 8051, DSP, Network, RF, Analog Design, PCB, Service Manuals... and a whole lot more! To participate you need to register. Registration is free. Click here to register now.

Extracting data based on specific patterns in a text file using python

Status
Not open for further replies.

dipk11

Junior Member level 2
Joined
Jul 27, 2017
Messages
23
Helped
0
Reputation
0
Reaction score
0
Trophy points
1
Activity points
245
Please help me!!
I have a huge report file with some data where i have to do some data processing on lines starting with the code "MLT-TRR" For now i have extracted all the lines in my script that start with that code and placed them in a separate file. The new file looks like this- Rules.txt.

Code:
MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\top.png  63   10   Port is not registered [Folder: 'Picture']

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\tree.png 315  10   Port is not registered [Folder: 'Picture.first_inst']

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\top.png  315  10   Port is not registered [Folder: 'Picture.second_inst']

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\tree.png 317  10   Port is not registered [Folder: 'Picture.third_inst']

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\top.png  317  10   Port is not registered [Folder: 'Picture.fourth_inst']


For each of these lines i have to extract the data that lies after "[Folder: 'Picture" If there is no data after "[Folder: 'Picture" as in the case of my first line, then skip that line and move on to the next line. I also want to extract the file names for each of those lines- top.txt, tree.txt

I couldnt think of a simpler method to do this as this involves a loop and gets messier. Is there any way out i can do this? extracting just the file paths and the ending data of each line.

Code:
import os
import sys
from os import path
import numpy as np


folder_path = os.path.dirname(os.path.abspath(__file__))
inFile1 = 'Rules.txt'
inFile2 = 'TopRules.txt'

def open_file(filename):
    try:
        with open(filename,'r') as f:
            targets = [line for line in f if "MLT-TRR" in line]
            print targets
        f.close()
        with open(inFile1, "w") as f2:
            for line in targets:
                f2.write(line + "\n")
        f2.close()
     
    except Exception,e:
        print str(e)
    exit(1)


if __name__ == '__main__':
    name = sys.argv[1]
    filename = sys.argv[1]
    open_file(filename)
 
Last edited by a moderator:

Please help me!!
I have a huge report file with some data where i have to do some data processing on lines starting with the code "MLT-TRR" For now i have extracted all the lines in my script that start with that code and placed them in a separate file. ...

I couldnt think of a simpler method to do this as this involves a loop and gets messier. Is there any way out i can do this? extracting just the file paths and the ending data of each line.

perhaps you should read the original file
when you find the "MLT-TRR" code, process that line for file paths and ending data and save that result
then read in the next line
it will save writing and re-reading

i probably have a bad attitude about these things
sometimes there is nothing for it than to slog through
elegance is not important
loops are not the enemy

the computer is doing the work
documenting the code is the most important thing.
getting it right is the most important thing
a straight-forward user interface is next most important
 

Status
Not open for further replies.

Similar threads

Part and Inventory Search

Welcome to EDABoard.com

Sponsor

Back
Top