Using Pilot to analyze existing data
Existing experiment data can be easily analyzed using the
analyze command of Pilot.
To analyze existing data, pass the file name to Pilot. For example, if your test data is in the file unit_test_analyze_input.csv (a sample is in the cli/test directory of the Pilot source code), with one sample per line, run:
pilot analyze unit_test_analyze_input.csv
or pipe the data in through stdin:
cat unit_test_analyze_input.csv | pilot analyze -
(The - option tells Pilot to read data from stdin.)
The output looks like this:
[2016-08-16 10:34:45] <info> Preset mode activated: quick
[2016-08-16 10:34:45] <info> Setting the limit of autocorrelation coefficient to 0.8
sample_size 48
mean 1.756458
optimal_subsession_size 1
CI 0.157416
variance 0.073474
subsession_autocorrelation_coefficient 0.636556
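If you want to post-process these results in a script, the summary can be turned into a dictionary. The following is a minimal sketch, not part of Pilot: it assumes the summary is printed as one "key value" pair per line after the timestamped log lines, as in the sample above, and the function name parse_pilot_summary is hypothetical.

```python
def parse_pilot_summary(text):
    """Collect 'key value' lines into a dict, skipping timestamped log lines."""
    summary = {}
    for line in text.splitlines():
        line = line.strip()
        # Log lines start with a bracketed timestamp; they are not results.
        if not line or line.startswith("["):
            continue
        key, _, value = line.partition(" ")
        try:
            summary[key] = float(value)
        except ValueError:
            # Keep anything non-numeric as a string rather than dropping it.
            summary[key] = value.strip()
    return summary


# Sample text copied from the output shown above (assumed line layout).
sample_output = """\
[2016-08-16 10:34:45] <info> Preset mode activated: quick
[2016-08-16 10:34:45] <info> Setting the limit of autocorrelation coefficient to 0.8
sample_size 48
mean 1.756458
optimal_subsession_size 1
CI 0.157416
variance 0.073474
subsession_autocorrelation_coefficient 0.636556
"""

summary = parse_pilot_summary(sample_output)
print(summary["mean"], summary["CI"])
```

If your version of Pilot prints the summary in a different layout, the parsing rule needs to be adjusted accordingly.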
To get help on using the
analyze command, run:
pilot analyze --help
Handling Comma Separated Value (CSV) File
Pilot can read data from a Comma Separated Value (CSV) file. A CSV file holds data in plaintext format and can be generated by many programs, such as LibreOffice and Excel. The following options can be passed to Pilot to help it parse a CSV file:
-f n to set the field (or column) to extract data from. Note that n starts at 0, so data from the (n+1)th field will be analyzed. Currently only one field can be analyzed at a time. If you need to analyze data in multiple fields, run Pilot multiple times with a different -f value each time.
-i n to ignore the first n lines of the input file. This can be useful for skipping the CSV file header.
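To make the meaning of the two options concrete, here is a rough Python sketch of the selection they describe: skip the first few lines, then take one 0-based column. This is an illustration only, not Pilot's parser; the function name extract_field and the demo data are hypothetical.

```python
import csv
import io


def extract_field(lines, field=0, ignore_lines=0):
    """Return column `field` (0-based) as floats, skipping the first `ignore_lines` lines."""
    values = []
    for line_no, row in enumerate(csv.reader(lines)):
        # Mirror the idea of -i n: drop the first n lines (for example a header).
        if line_no < ignore_lines or not row:
            continue
        # Mirror the idea of -f n: n counts from 0, so field=1 is the second column.
        values.append(float(row[field]))
    return values


# Hypothetical data: one header line followed by two columns.
demo = io.StringIO("run,latency\n1,1.5\n2,1.75\n3,2.0\n")
latencies = extract_field(demo, field=1, ignore_lines=1)
print(latencies)
```

In this sketch, field=1 with ignore_lines=1 corresponds to passing -f 1 and -i 1 for a file with a one-line header whose second column holds the samples.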
Pilot performs autocorrelation analysis to check whether the input data is i.i.d. (independent and identically distributed) and uses subsession analysis to mitigate high autocorrelation (see https://docs.ascar.io/features/autocorrelation-detection-and-mitigation.html for details). The limit for the autocorrelation coefficient (AC) can be set with
--ac n or with a preset. The three presets use the following AC limits:
Quick mode (default): AC limit of 0.8
Normal mode: AC limit of 0.2
Strict mode: AC limit of 0.1
If a precise result is critical, use a smaller AC limit, such as 0.1. The trade-off is that the confidence interval becomes wider for the same number of input samples.
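The interaction between the AC limit and the subsession size can be illustrated with a small sketch. It is not Pilot's implementation (the linked page describes the actual method). It assumes a lag-1 sample autocorrelation coefficient, subsessions formed by averaging consecutive samples, and a comparison on the absolute value of the coefficient. All function names and the sample data are hypothetical.

```python
def autocorrelation(xs):
    """Lag-1 sample autocorrelation coefficient of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    if denom == 0:
        return 0.0  # constant data: treat as uncorrelated
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return num / denom


def subsession_means(xs, size):
    """Average consecutive groups of `size` samples, dropping any leftover tail."""
    return [sum(xs[i:i + size]) / size
            for i in range(0, len(xs) - size + 1, size)]


def smallest_subsession_size(xs, ac_limit, min_subsessions=2):
    """Smallest group size whose means have |AC| <= ac_limit, or None if none fits."""
    size = 1
    while len(xs) // size >= min_subsessions:
        if abs(autocorrelation(subsession_means(xs, size))) <= ac_limit:
            return size
        size += 1
    return None


# Hypothetical, strongly correlated samples: the value changes every four readings.
samples = [1, 1, 1, 1, 5, 5, 5, 5, 1, 1, 1, 1, 5, 5, 5, 5]

print(autocorrelation(samples))
print(smallest_subsession_size(samples, ac_limit=0.8))
print(smallest_subsession_size(samples, ac_limit=0.2))
```

With this data, the looser limit accepts the raw samples, while the tighter limit requires merging samples into subsessions. Merging leaves fewer data points for the same input, which is consistent with the wider confidence interval described above.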