

The Fresco files contained here are part of a project at Purdue University, conducted by Saurabh Bagchi, Carol Song, et al., to collect and analyze failure data on supercomputer clusters.

The files here are aimed at breaking Torque logs down into tables that can be imported into ELK, where they will be joined with TACC Stats.

Certain assumptions are made about the environment. The parent directory is referred to here as "Fresco", and the child directories "anon", "csv", and "log" are assumed to exist. The Torque statistics portion consists of two separate Python scripts, one for anonymization and one for CSV conversion, both located in the Fresco directory.

The anonymization script reads the Torque logs and anonymizes them by removing any data that could be used to identify a user or Purdue University. The anonymized files are written to the fresco/anon directory under the same name as the data file, with the ".anon" extension added. Gzipped files are decompressed in the process.
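As a rough illustration of the anonymization step, here is a minimal sketch, not the project's actual code. It assumes log lines carry identifying data in key=value fields, and the key names in SENSITIVE_KEYS, the salt, and both function names are hypothetical. It also substitutes a hashed pseudonym for each value rather than removing it outright, which is only one possible approach. It is written for a current Python 3 rather than the Python 2.6.6 environment.

```python
import hashlib
import re

# Hypothetical list of identifying keys; the real script may scrub other fields.
SENSITIVE_KEYS = ("user", "group", "owner")

# Matches any sensitive key followed by "=" and a run of non-space characters.
_FIELD_RE = re.compile(r"\b(%s)=(\S+)" % "|".join(SENSITIVE_KEYS))


def pseudonym(value, salt="fresco"):
    """Return a short, stable stand-in for an identifying value."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def anonymize_line(line):
    """Replace the value of every sensitive key=value field with a pseudonym."""
    return _FIELD_RE.sub(
        lambda m: "%s=%s" % (m.group(1), pseudonym(m.group(2))), line
    )
```

Because the same input always maps to the same pseudonym, records belonging to one user can still be grouped after anonymization. A salted hash of a short username can be guessed by brute force, so a real deployment would need a secret salt or a lookup table kept outside the published data.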

The output files from the anonymization script are used, in a separate process, as input to the conversion script. The conversion script analyzes the anonymized Torque logs and converts them to CSV files, each of which has a top row of column headings followed by lines of data values. Those .csv files are placed in the csv directory, along with other CSV files used to hold data from a number of Python dictionaries.
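The conversion to a headed CSV table could look roughly like the sketch below. It assumes each anonymized line has the shape timestamp;type;job id;key=value key=value ..., and the function names and the column names timestamp, type, and job_id are hypothetical. The sketch targets a current Python 3; DictWriter.writeheader is not available in Python 2.6.

```python
import csv
import io


def parse_record(line):
    """Split one accounting-style line into a dict of column name -> value."""
    timestamp, record_type, job_id, message = line.rstrip("\n").split(";", 3)
    row = {"timestamp": timestamp, "type": record_type, "job_id": job_id}
    for field in message.split():
        if "=" in field:
            key, value = field.split("=", 1)
            row[key] = value
    return row


def write_csv(rows, out_file):
    """Write rows as CSV: one header row, then one line of values per record."""
    # Collect column names in first-seen order so the header is stable.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    writer = csv.DictWriter(
        out_file, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
```

Records that lack a given key get an empty cell in that column, so every data line has the same number of fields as the header row.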

It is assumed, though it has not been tested, that those CSV files are ready to be imported into ELK.

The Python scripts were developed in Python 2.6.6, using some backports from newer versions of Python. Version 2.6.6 was chosen for development because it is available on all clusters.

Usage: ./

The parameter for the anonymization script is the path containing the Torque logs to be anonymized. Every file in that path that has no extension is treated as a log to process. Any compressed files in the .gz format are uncompressed and then processed. The anonymized output files are written to the directory anon/, where they keep the same filename with the extension .anon.
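The file-selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the script itself: the function names are hypothetical, and dropping the .gz suffix before appending .anon is a guess at how compressed inputs are named on output.

```python
import gzip
import os


def find_logs(log_dir):
    """Return paths of files in log_dir that have no extension or end in .gz."""
    selected = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        extension = os.path.splitext(name)[1]
        if os.path.isfile(path) and extension in ("", ".gz"):
            selected.append(path)
    return selected


def open_log(path):
    """Open a log for reading as text, decompressing .gz files on the fly."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def anon_path(path, anon_dir="anon"):
    """Build the output path: same file name plus .anon (assumption: .gz is dropped)."""
    name = os.path.basename(path)
    if name.endswith(".gz"):
        name = name[:-3]
    return os.path.join(anon_dir, name + ".anon")
```

Reading through gzip.open avoids writing a decompressed copy of each archive to disk, which matters when the input directory is large.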

*** Note: The Torque logs should be found in /depot/sbagchi/data.

*** Afterthought: It would probably be a good idea to change the anonymization script so that it accepts the output directory as a second parameter, since the amount of input from a depot directory could fill a normal directory.
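One way the suggested change might look is sketched below: the output directory becomes an optional second positional parameter that falls back to anon. The function name and argument names are hypothetical. Note that argparse joined the standard library in Python 2.7, so on 2.6.6 this would need a backport or plain sys.argv handling.

```python
import argparse


def parse_args(argv):
    """Parse the log directory and an optional output directory."""
    parser = argparse.ArgumentParser(description="Anonymize Torque logs (sketch).")
    parser.add_argument("log_dir", help="directory containing the Torque logs")
    parser.add_argument(
        "out_dir",
        nargs="?",
        default="anon",
        help="directory for the anonymized output (default: anon)",
    )
    return parser.parse_args(argv)
```

Keeping anon as the default means existing invocations with a single parameter would behave as before.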

Usage: ./

The parameter for the conversion script is the directory where the files previously anonymized by the anonymization script reside. By default that directory is anon. The conversion script converts the anonymized files to CSV files formatted as tables, with a first row of headings. The CSV files should be ready for import into ELK.

*** Note: The clumsy name of the conversion script was adopted after discovering that the original name caused problems for the script, which imports csv.

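*** Obvious Afterthought: The same modification suggested for the anonymization script (accepting the output directory as a second parameter) applies to the conversion script.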