Extracting data based on specific patterns in a text file using python

dipk11 · Aug 28, 2020

Please help me!
I have a huge report file and need to process the lines that start with the code "MLT-TRR". So far my script extracts all the lines that start with that code and writes them to a separate file, Rules.txt. The new file looks like this:

Code:

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\top.png  63   10   Port is not registered [Folder: 'Picture']

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\tree.png 315  10   Port is not registered [Folder: 'Picture.first_inst']

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\top.png  315  10   Port is not registered [Folder: 'Picture.second_inst']

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\tree.png 317  10   Port is not registered [Folder: 'Picture.third_inst']

MLT-TRR                         Warning     C:\Users\Di\Pictures\SavedPictures\top.png  317  10   Port is not registered [Folder: 'Picture.fourth_inst']

For each of these lines I have to extract the data that comes after "[Folder: 'Picture". If there is no data after "[Folder: 'Picture", as in my first line, that line should be skipped and the next line processed. I also want to extract the file name from each of those lines (top.png, tree.png).

I couldn't think of a simple way to do this, since it involves a loop and gets messy. Is there a way to extract just the file paths and the ending data of each line?

Code:

import sys

inFile1 = 'Rules.txt'


def open_file(filename):
    """Copy every line containing MLT-TRR from filename into inFile1."""
    try:
        # 'with' closes each file on exit, so no explicit close() is needed
        with open(filename, 'r') as f:
            targets = [line for line in f if "MLT-TRR" in line]
        print(targets)
        with open(inFile1, "w") as f2:
            for line in targets:
                # each line already ends with a newline
                f2.write(line)
    except Exception as e:
        print(str(e))
        sys.exit(1)  # exit with an error status only when something failed


if __name__ == '__main__':
    filename = sys.argv[1]
    open_file(filename)

wwfeldman · Aug 30, 2020

dipk11 said:
Please help me!
I have a huge report file and need to process the lines that start with the code "MLT-TRR". So far my script extracts all the lines that start with that code and writes them to a separate file. ...

I couldn't think of a simple way to do this, since it involves a loop and gets messy. Is there a way to extract just the file paths and the ending data of each line?

perhaps you should read the original file directly
when you find a line with the "MLT-TRR" code, process that line for the file path and ending data and save the result
then read the next line
that saves writing the intermediate file and re-reading it
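
A rough sketch of that single-pass loop, in case it helps. The names parse_line, extract and FOLDER_RE are made up for illustration. It assumes the path is the third whitespace-separated field and contains no spaces, and that the wanted data is whatever follows "Picture." (the leading dot is dropped). Adjust to the real report layout.

```python
import ntpath
import re

# Hypothetical pattern: captures the text after "Picture." inside [Folder: '...'].
# A line whose folder is exactly 'Picture' does not match and is skipped.
FOLDER_RE = re.compile(r"\[Folder: 'Picture\.([^']+)'\]")


def parse_line(line):
    """Return (file_name, ending_data) for a usable MLT-TRR line, else None."""
    if not line.lstrip().startswith("MLT-TRR"):
        return None
    match = FOLDER_RE.search(line)
    if match is None:
        return None  # nothing after 'Picture', so skip this line
    # Assumption: the path is the third field and has no embedded spaces.
    fields = line.split()
    # ntpath splits on backslashes regardless of the OS running the script.
    file_name = ntpath.basename(fields[2])
    return file_name, match.group(1)


def extract(report_path):
    """Read the original report once and collect the parsed results."""
    results = []
    with open(report_path) as report:
        for line in report:
            parsed = parse_line(line)
            if parsed is not None:
                results.append(parsed)
    return results
```

Calling extract(sys.argv[1]) on the report would give a list of (file name, ending data) pairs. For the five sample lines above, the first is skipped and the rest come out as pairs like ('tree.png', 'first_inst').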

i probably have a bad attitude about these things
sometimes there is nothing for it but to slog through
elegance is not important
loops are not the enemy

the computer is doing the work
documenting the code is the most important thing.
getting it right is the most important thing
a straightforward user interface is next most important
