textvalid reads a formatted ASCII text file, identifies the type of data it contains and validates the file format. An informative message about the format and its validity is written to the screen. A report file is written describing any errors or issues that were detected in the formatting.
The first line of the output has the general format "FileName:FileType:FileFormat:(in)valid", where FileName is the name of the file, FileType is the type of content ('sequence record', 'Sequence alignment' etc.) and FileFormat is the format ('fasta-like sequence format', 'clustal alignment format' etc.). "Unknown" may be given in place of FileType or FileFormat if it cannot be ascertained. The final field is "invalid" or "valid" depending on whether formatting errors were detected.
Subsequent lines contain informative messages about formatting errors detected during parsing.
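The layout described above (one colon-separated header line followed by message lines) can be read back programmatically. The following is a minimal Python sketch, not part of textvalid itself: the names ReportHeader, parse_header and parse_report are hypothetical, and it assumes that the FileType and FileFormat fields never contain a colon. The header is split from the right so that a colon inside the file name does not shift the other fields.

```python
from typing import List, NamedTuple, Tuple


class ReportHeader(NamedTuple):
    """Hypothetical container for the four fields of the first line."""
    file_name: str
    file_type: str
    file_format: str
    is_valid: bool


def parse_header(line: str) -> ReportHeader:
    """Parse a line of the form FileName:FileType:FileFormat:(in)valid."""
    # Split from the right: the last three fields are assumed colon-free,
    # so any extra colons are left inside the file name.
    parts = line.strip().rsplit(":", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields: {line!r}")
    name, file_type, file_format, status = parts
    status = status.strip().lower()
    if status not in ("valid", "invalid"):
        raise ValueError(f"unexpected validity field: {status!r}")
    return ReportHeader(name, file_type, file_format, status == "valid")


def parse_report(text: str) -> Tuple[ReportHeader, List[str]]:
    """Return the parsed header and the list of subsequent message lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty report")
    return parse_header(lines[0]), lines[1:]
```

For example, parse_header("seqs.fasta:sequence record:fasta-like sequence format:valid") would yield a header whose is_valid field is True, while a header ending in "invalid" would be followed by one or more message lines returned by parse_report.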
Optionally, the type of content (data) of the input file may be specified. This will help textvalid to identify and validate the file format. If the content type is known it should always be specified.
The identification and validation of text file formats is a difficult problem owing to the diversity of available formats, poorly defined formats, lack of adherence to the formats and the evolution of the formats over time. It is therefore not always possible to identify, for a given file, the format or the errors in its formatting.
This application was modified by