Welcome to EDAboard.com

Machine learning algorithms for classification and identification of file formats

akerkarprashant


Can machine learning algorithms or other AI technologies classify and identify file formats?

Suppose an input dataset covering many file formats (csv, png, gif, jpg, c, cpp, py, html, txt, java, dll, exe, etc.) is fed to the machine learning program.

Can the machine learning program then classify and identify the file format, as in the examples below?

txt - ASCII text file

c - C program source code file

html - HyperText Markup Language file

gif - Graphics Interchange Format image

exe - Executable binary file


Thanks & Regards,

Prashant S Akerkar
 

BradtheRad

An AI algorithm can learn the similarities and differences between the various types of files.

Of course a human must program the algorithm to compare and observe (however that's done). A human must provide a database containing examples of file types. And a human must verify the correctness of the algorithm's decisions, at least in early development stages.
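
To make the "database of examples" idea concrete, here is a minimal sketch, assuming a very simple learner: each file is reduced to a histogram of its byte values, and a new file gets the label of the nearest class average (a nearest-centroid classifier). The labels, the tiny in-memory samples and the function names are all made up for illustration; a real training set would need many real files per format, and a human would still have to check the results.

```python
import math
from collections import Counter


def byte_histogram(data: bytes) -> list[float]:
    """Return the relative frequency of each of the 256 possible byte values."""
    counts = Counter(data)
    total = len(data) or 1  # avoid dividing by zero on empty input
    return [counts.get(b, 0) / total for b in range(256)]


def train(examples: dict[str, list[bytes]]) -> dict[str, list[float]]:
    """Average the histograms of the labelled samples into one centroid per label."""
    centroids = {}
    for label, samples in examples.items():
        hists = [byte_histogram(s) for s in samples]
        centroids[label] = [sum(col) / len(hists) for col in zip(*hists)]
    return centroids


def classify(data: bytes, centroids: dict[str, list[float]]) -> str:
    """Return the label whose centroid is closest (Euclidean distance) to the input."""
    hist = byte_histogram(data)
    return min(centroids, key=lambda label: math.dist(hist, centroids[label]))


# Hypothetical, hand-made "database" of labelled samples (assumption: far too
# small for real use, it only shows the shape of the data a human must supply).
examples = {
    "txt": [
        b"The quick brown fox jumps over the lazy dog.\n",
        b"Plain notes, written as ordinary sentences.\n",
    ],
    "c": [
        b'#include <stdio.h>\nint main(void) { printf("hi\\n"); return 0; }\n',
        b"int add(int a, int b) { return a + b; }\n",
    ],
    "binary": [
        bytes(range(256)),
        bytes(range(0, 256, 3)) * 2,
    ],
}

centroids = train(examples)
print(classify(b"int mul(int x, int y) { return x * y; }\n", centroids))
```

Byte histograms separate text from binary data fairly easily, but telling similar text formats apart (c versus cpp versus java, say) would likely need richer features such as keywords or byte n-grams.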
 

Prototyp_V1.0

Isn't this already taken care of by MIME types and content sniffing?

I know that at least Linux desktop systems and their file managers do content sniffing, since they don't rely on the file extension the way Windows does.
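
As a rough illustration of what content sniffing means, the sketch below checks the first bytes of a file against a few well-known signatures ("magic numbers") and falls back to an ASCII test. The table is deliberately tiny, the function name `sniff` is made up, and the MIME labels are illustrative choices; real sniffers use far larger rule sets.

```python
# Leading byte signatures ("magic numbers") for a few well-known formats.
# The MIME labels on the right are illustrative choices for this sketch.
MAGIC_SIGNATURES = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"MZ", "application/x-dosexec"),  # DOS/Windows executables (exe, dll)
]


def sniff(data: bytes) -> str:
    """Guess a type from content alone; the file name is never consulted."""
    for magic, mime in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return mime
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return "application/octet-stream"  # unknown binary data
    return "text/plain"
```

Calling `sniff(open(path, "rb").read(16))` is enough for these signatures. Note the limitation: a plain text file that happens to start with "MZ" would be misreported, which is one reason real tools combine several tests.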
 

betwixt

Linux has a command called "file" which you follow with the path to the file you want to investigate. It returns an analysis of the file type.
As it is open source, looking at the way "file" works should give you some ideas.

Brian.
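
For anyone who wants to use "file" from a program rather than the shell, here is a small sketch, assuming the utility is installed and on the PATH. The wrapper name `file_mime_type` is made up; `--brief` and `--mime-type` are standard options of the command.

```python
import shutil
import subprocess
from typing import Optional


def file_mime_type(path: str) -> Optional[str]:
    """Ask the system's `file` utility for a MIME type.

    Returns None when `file` is not installed or exits with an error.
    """
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        # "--" stops option parsing, so paths starting with "-" are safe.
        ["file", "--brief", "--mime-type", "--", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

The labelled output of "file" over a directory of samples could also serve as a baseline to compare a machine learning classifier against.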
 
