- description: sddscorrelate computes correlation coefficients and correlation significance
between columns of data. The correlation coefficient between columns i and j is defined as
\[ C_{ij} = \frac{\langle x_i x_j\rangle - \langle x_i\rangle\langle x_j\rangle}{\sqrt{\left(\langle x_i^2\rangle - \langle x_i\rangle^2\right)\left(\langle x_j^2\rangle - \langle x_j\rangle^2\right)}} \]
where x_i denotes the values in column i and angle brackets denote an average over the rows of a page. If C_{ij} = 1, then the variables are perfectly correlated, whereas if C_{ij} = -1, they are perfectly anticorrelated. The correlation significance is the probability that the observed correlation coefficient could happen by chance if the variables were in fact uncorrelated. Hence, a very small correlation significance means that the variables are probably correlated.

- examples: Find the correlations among beam-position-monitor x values in par.bpm:
sddscorrelate par.bpm par.cor -column='*x'

Find the correlations of these readouts with one specific readout only:

sddscorrelate par.bpm par.cor -column='*x' -withOnly=P1P1x
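
For readers who want to check a few numbers by hand, the following Python sketch mirrors what the two examples compute. It is not the sddscorrelate implementation. It assumes the coefficient is the standard linear (Pearson) coefficient, it does not compute the correlation significance, and the pair ordering, the `coefficient` key, and the column names P1P2x and P1P3x are illustrative assumptions. Only the three string column names and the Name1.Name2 form come from this page.

```python
import math
from itertools import combinations


def correlation(x, y):
    """Linear correlation coefficient of two equal-length columns."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length columns with at least 2 rows")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums equal n*(<xy> - <x><y>) and so on, but are
    # numerically better behaved than the raw-moment form.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant column has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlate_all(columns, with_only=None):
    """Correlate every pair of columns, or every column with one named column.

    `columns` maps column name -> list of values. The order of the pairs and
    which name comes first are assumptions made for this sketch.
    """
    names = list(columns)
    if with_only is None:
        pairs = combinations(names, 2)
    else:
        pairs = ((with_only, name) for name in names if name != with_only)
    return [
        {
            "Correlate1Name": a,
            "Correlate2Name": b,
            "CorrelatePair": f"{a}.{b}",
            "coefficient": correlation(columns[a], columns[b]),  # hypothetical key
        }
        for a, b in pairs
    ]


# Made-up readings; only the name P1P1x appears in the examples above.
data = {
    "P1P1x": [1.0, 2.0, 3.0, 4.0],
    "P1P2x": [2.0, 4.0, 6.0, 8.0],
    "P1P3x": [4.0, 3.0, 2.0, 1.0],
}
all_pairs = correlate_all(data)
with_one = correlate_all(data, with_only="P1P1x")
```

Here `all_pairs` plays the role of the first example (every pairing) and `with_one` the role of the second (pairings with one readout only).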

- synopsis:
sddscorrelate [-pipe=[input][,output]] [inputFile] [outputFile] [-columns=columnNames] [-excludeColumns=columnNames] [-withOnly=columnName] [-rankOrder] [-stDevOutlier[=limit=factor][,passes=integer]]

- files: inputFile is an SDDS file containing two or more columns of data. For each page of the file, outputFile contains the correlation coefficient and significance for every requested pairing of columns. outputFile also contains three string columns: Correlate1Name, Correlate2Name, and CorrelatePair. These are, respectively, the name of the first column in the analysis, the name of the second column in the analysis, and a string of the form Name1.Name2.
- switches:
- -pipe=[input][,output] — The standard SDDS Toolkit pipe option.
- -columns=columnNames — Specifies the names of columns to be included in the analysis. A comma-separated list of optionally wildcard-containing names may be given.
- -excludeColumns=columnNames — Specifies the names of columns to be excluded from the analysis. A comma-separated list of optionally wildcard-containing names may be given.
- -withOnly=columnName — Specifies that one of the variables for each correlation will be the named column.
- -rankOrder — Specifies computing rank-order correlations rather than standard correlations. Rank-order correlation is considered more robust than standard correlation.
- -stDevOutlier[=limit=factor][,passes=integer] — Specifies standard-deviation-based outlier elimination on each pair of columns prior to computation of the correlation coefficient. Any pair of values is ignored if one or both values are outliers relative to the column from which they come. The limit qualifier specifies the allowed deviation from the mean in standard deviations; the default is 1. The passes qualifier specifies how many times the outlier elimination (including recomputation of the mean and standard deviation) is performed; the default is 1.
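
The ideas behind -rankOrder and -stDevOutlier can be illustrated with a short Python sketch. This is not the sddscorrelate implementation. It assumes that the rank-order correlation is Spearman's (the linear coefficient applied to ranks, with tied values sharing an average rank), and that outlier elimination uses the sample standard deviation. This page specifies neither detail.

```python
import math
import statistics


def pearson(x, y):
    """Standard linear correlation coefficient of two columns."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_order_correlation(x, y):
    """Spearman-style coefficient: correlate the ranks, not the raw values."""
    return pearson(ranks(x), ranks(y))


def drop_outliers(x, y, limit=1.0, passes=1):
    """Drop (x, y) pairs in which either value is an outlier in its column.

    A value is treated as an outlier if it lies more than `limit` standard
    deviations from its column mean. Mean and standard deviation are
    recomputed from the surviving pairs on each pass.
    """
    pairs = list(zip(x, y))
    for _ in range(passes):
        if len(pairs) < 2:
            break
        xs, ys = zip(*pairs)
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        sx, sy = statistics.stdev(xs), statistics.stdev(ys)  # sample st. dev. (assumption)
        pairs = [
            (a, b)
            for a, b in pairs
            if abs(a - mx) <= limit * sx and abs(b - my) <= limit * sy
        ]
    if not pairs:
        return [], []
    xs, ys = zip(*pairs)
    return list(xs), list(ys)


# Made-up data: one wild reading in x distorts the standard coefficient.
x = [1, 2, 3, 4, 5, 100]
y = [2, 4, 6, 8, 10, 12]
raw = pearson(x, y)
ranked = rank_order_correlation(x, y)
clean_x, clean_y = drop_outliers(x, y, limit=2.0, passes=1)
cleaned = pearson(clean_x, clean_y)
```

In this made-up data set the wild reading pulls `raw` well below 1, while `ranked` and `cleaned` both recover the perfect monotonic relationship. That is the sense in which the two options are more robust than the standard calculation.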

- author: M. Borland, ANL/APS.