java.lang.Object
  recognizer.PersonalityRecognizer
public class PersonalityRecognizer
The program computes features described in (Mairesse et al., 2007) given a text, and it runs Weka models on the features to produce personality scores for all Big Five dimensions. The MRC Psycholinguistic database and the LIWC tool need to be installed, and the file PersonalityRecognizer.conf in the main directory needs to be modified accordingly. The PersonalityRecognizer script should be used for launching the program.
Usage: PersonalityRecognizer [-d] [-m model_number] [-o] [-c] [-t model_type] [-a arff_output_file] -i file|directory
 -c,--counts      Also outputs feature counts; -d must be disabled.
 -d,--directory   Corpus analysis mode. Input must be a directory with multiple text files; features are standardized over the corpus and the recognizer outputs a personality estimate for each text file.
 -i,--input       Input file or directory (required).
 -m,--model       Model to use for computing scores (default 4). Options:
                    1 = Linear Regression
                    2 = M5' Model Tree
                    3 = M5' Regression Tree
                    4 = Support Vector Machine with Linear Kernel (SMOreg)
 -o,--outputmod   Also outputs models.
 -t,--type        Selects the type of model to use (default 1). The appropriate model depends on the language sample (written or spoken), and whether observed personality (as perceived by external judges) or self-assessed personality (the writer/speaker's perception) needs to be estimated from the text. Options:
                    1 = Observed personality from spoken language
                    2 = Self-assessed personality from written language
 -a,--arff        In corpus analysis mode, outputs the features of each text into a Weka .arff dataset file, together with the predicted scores. New models can be trained by adding features and replacing the scores with human estimates. Each line corresponds to a text in the corpus indicated by the filename feature.
See the included readme file and the website http://www.mairesse.co.uk/personality/recognizer.html for more information. Questions can be emailed to the author (webpage: http://www.mairesse.co.uk).
Reference paper: Francois Mairesse, Marilyn Walker, Matthias Mehl and Roger Moore. Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text. Journal of Artificial Intelligence Research (JAIR), 30, pages 457-500, 2007. Available on the web in PDF format at http://www.mairesse.co.uk/papers/personality-jair07.pdf
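The corpus analysis mode (-d) described above standardizes features over the corpus. The page does not say which formula is used, so the following is only a minimal sketch under the assumption of a per-feature z-score with population standard deviation. The class name CorpusStandardizer and its method are hypothetical and not part of the recognizer API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of recognizer.PersonalityRecognizer.
class CorpusStandardizer {

    /**
     * Standardizes every feature over the corpus as z = (x - mean) / stddev.
     * Assumptions: every text provides the same feature names, the population
     * standard deviation is used, and a zero-variance feature maps to 0.
     */
    static List<Map<String, Double>> standardize(List<Map<String, Double>> corpus) {
        List<Map<String, Double>> result = new ArrayList<>();
        if (corpus.isEmpty()) {
            return result;
        }
        for (int i = 0; i < corpus.size(); i++) {
            result.add(new LinkedHashMap<>());
        }
        for (String feature : corpus.get(0).keySet()) {
            // Mean of this feature over all texts.
            double mean = 0.0;
            for (Map<String, Double> text : corpus) {
                mean += text.get(feature);
            }
            mean /= corpus.size();

            // Population variance of this feature over all texts.
            double variance = 0.0;
            for (Map<String, Double> text : corpus) {
                double diff = text.get(feature) - mean;
                variance += diff * diff;
            }
            double std = Math.sqrt(variance / corpus.size());

            // Replace each raw value by its z-score.
            for (int i = 0; i < corpus.size(); i++) {
                double x = corpus.get(i).get(feature);
                result.get(i).put(feature, std == 0.0 ? 0.0 : (x - mean) / std);
            }
        }
        return result;
    }
}
```

Because standardized values depend on the other texts in the corpus, scores obtained in this mode are relative to that corpus, which is presumably why the mode requires a directory with multiple text files.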
Field Summary | |
---|---|
static java.io.File |
DEFAULT_CONFIG_FILE
Configuration file (default is PersonalityRecognizer.conf in root application directory). |
static java.lang.String[] |
DIMENSIONS
Personality dimensions names. |
static java.lang.String |
FS
File separator. |
static java.lang.String |
LS
Line separator. |
Constructor Summary | |
---|---|
PersonalityRecognizer()
Initializes parameters based on the default configuration file (PersonalityRecognizer.conf). |
|
PersonalityRecognizer(java.io.File propFile)
Initializes parameters based on configuration file, and loads the LIWC dictionary and the MRC database in memory. |
Method Summary | |
---|---|
java.util.Map<java.io.File,java.lang.Double[]> |
computeScoresOverCorpus(java.io.File dir,
weka.classifiers.Classifier[] models,
java.io.File outputArffFile)
Runs the models of each personality trait for each file in the directory. |
java.util.Map<java.lang.String,java.lang.Double> |
getFeatureCounts(java.lang.String text,
boolean relativeOnly)
Computes the features from the input text (70 LIWC features and 14 from the MRC database). |
int |
getModelIndex()
Gets the current default model index. |
int |
getModelIndex(java.lang.String modelDir)
Gets the model index in the MODEL_NAMES array from a string representation. |
weka.classifiers.Classifier[] |
loadWekaModels(boolean selfModel,
boolean stdModels)
Loads saved Weka models in memory for all personality dimensions, using the default model type. |
weka.classifiers.Classifier[] |
loadWekaModels(int modelIndex,
boolean selfModel,
boolean stdModels)
Loads saved Weka models in memory for all personality dimensions. |
static void |
main(java.lang.String[] args)
Main method that initializes the parameters from the configuration file, counts the features from the input text(s), runs the specified Weka models on this feature set for each of the Big Five personality traits, and prints the personality score estimates to the standard output. |
void |
printOutput(weka.classifiers.Classifier[] models,
double[] scores,
int modelIndex,
boolean printModels,
boolean self,
java.io.PrintStream out)
Prints personality scores to standard output, and model details if required. |
void |
printOutput(weka.classifiers.Classifier[] models,
java.util.Map<java.io.File,java.lang.Double[]> scores,
int modelIndex,
boolean printModels,
boolean self,
java.io.PrintStream out)
Prints personality scores of multiple files to standard output, and model details if required. |
double[] |
runWekaModels(weka.classifiers.Classifier[] models,
java.util.Map<java.lang.String,java.lang.Double> counts)
Runs each Weka model on a new instance created from the input feature counts, and outputs the resulting personality score. |
void |
setModel(int modelIndex)
Sets the default Weka model to load when calling loadWekaModels(). |
void |
setModel(java.lang.String modelDir)
Sets the default Weka model to load when calling loadWekaModels(). |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final java.lang.String[] DIMENSIONS
public static final java.lang.String LS
public static final java.lang.String FS
public static final java.io.File DEFAULT_CONFIG_FILE
Constructor Detail |
---|
public PersonalityRecognizer(java.io.File propFile)
propFile - configuration file in ASCII format (VARIABLE = "VALUE" on each line).
public PersonalityRecognizer()
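The configuration format mentioned for propFile (one VARIABLE = "VALUE" pair per line) can be read with a few lines of code. This is a minimal sketch rather than the parser the recognizer actually uses: the class ConfigLineParser is hypothetical, and skipping lines that do not match the pattern is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of recognizer.PersonalityRecognizer.
class ConfigLineParser {

    // Matches a line of the form: VARIABLE = "VALUE"
    private static final Pattern LINE =
            Pattern.compile("^\\s*(\\w+)\\s*=\\s*\"(.*)\"\\s*$");

    /** Returns the variables found in the given lines, in file order. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.matches()) {
                values.put(m.group(1), m.group(2));
            }
            // Assumption: any line that does not match is silently skipped.
        }
        return values;
    }
}
```

A call such as `ConfigLineParser.parse(java.nio.file.Files.readAllLines(path))` would load a whole file. The variable names that PersonalityRecognizer.conf actually defines are not listed on this page, so none are assumed here.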
Method Detail |
---|
public static void main(java.lang.String[] args)
args - set of options and input file(s).
public void setModel(int modelIndex)
modelIndex - the index of the element in the MODEL_DIRS array corresponding to the directory of the model to load.
public void setModel(java.lang.String modelDir)
modelDir - the model subdirectory in the MODEL_DIRS array corresponding to the model to load.
public int getModelIndex()
public int getModelIndex(java.lang.String modelDir)
modelDir - the model subdirectory in the MODEL_DIRS array corresponding to the model to load.
public weka.classifiers.Classifier[] loadWekaModels(boolean selfModel, boolean stdModels)
selfModel - if set to true, loads the self-report models.
stdModels - if set to true, loads the standardized models.
public weka.classifiers.Classifier[] loadWekaModels(int modelIndex, boolean selfModel, boolean stdModels)
modelIndex - the index of the element in the MODEL_DIRS array corresponding to the directory of the model to load.
selfModel - if set to true, loads the self-report models.
stdModels - if set to true, loads the standardized models.
public double[] runWekaModels(weka.classifiers.Classifier[] models, java.util.Map<java.lang.String,java.lang.Double> counts)
models - array of Weka models (Classifier objects).
counts - mapping of feature counts (Double objects); it must provide a value for all attribute strings of the input models.
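Since counts must provide a value for every attribute of the input models, it can help to check the map before running them. The sketch below is an illustration only: FeatureCoverage is a hypothetical helper, and the list of required attribute names is assumed to be supplied by the caller, because this page does not describe how to obtain it from the models.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of recognizer.PersonalityRecognizer.
class FeatureCoverage {

    /**
     * Returns the attribute names that have no value in the counts map,
     * sorted alphabetically. An empty result means the map is complete.
     */
    static Set<String> missingFeatures(List<String> requiredAttributes,
                                       Map<String, Double> counts) {
        Set<String> missing = new TreeSet<>();
        for (String name : requiredAttributes) {
            if (counts.get(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Reporting all missing names at once, instead of stopping at the first one, makes it easier to spot a systematic mismatch between the feature extractor and the loaded models.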
public java.util.Map<java.lang.String,java.lang.Double> getFeatureCounts(java.lang.String text, boolean relativeOnly) throws java.lang.Exception
text - input text.
relativeOnly - if true, absolute count features (WC) are not returned; must be set to false if standardized features are used (corpus analysis mode).
Throws: java.lang.Exception
public void printOutput(weka.classifiers.Classifier[] models, double[] scores, int modelIndex, boolean printModels, boolean self, java.io.PrintStream out)
models - array of Weka models.
scores - array of personality scores to print.
modelIndex - index of the model used in the MODEL_NAMES array.
printModels - if true, prints out a textual representation of the models.
out - output stream.
public void printOutput(weka.classifiers.Classifier[] models, java.util.Map<java.io.File,java.lang.Double[]> scores, int modelIndex, boolean printModels, boolean self, java.io.PrintStream out)
models - array of Weka models.
scores - map associating each file to an array of personality scores to print.
modelIndex - index of the model used in the MODEL_NAMES array.
printModels - if true, prints out a textual representation of the models.
out - output stream.
public java.util.Map<java.io.File,java.lang.Double[]> computeScoresOverCorpus(java.io.File dir, weka.classifiers.Classifier[] models, java.io.File outputArffFile)
dir - input directory containing multiple text files.
models - models of each Big Five personality trait.
outputArffFile - Weka arff file to print the feature values and scores to (null=none).
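To give an idea of what an arff output with one line per text and a filename feature might look like, here is a minimal sketch that builds such a file as a string. It follows the general ARFF layout (@relation, @attribute, @data) but is not the recognizer's own writer: the class ArffSketch, the relation name, and the attribute names passed in are all hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of recognizer.PersonalityRecognizer.
class ArffSketch {

    /**
     * Builds ARFF text with a string attribute "filename" followed by the
     * given numeric attributes. Each map entry becomes one data line.
     */
    static String toArff(String relation, List<String> numericAttributes,
                         Map<String, double[]> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        sb.append("@attribute filename string\n");
        for (String attribute : numericAttributes) {
            sb.append("@attribute ").append(attribute).append(" numeric\n");
        }
        sb.append("\n@data\n");
        for (Map.Entry<String, double[]> row : rows.entrySet()) {
            if (row.getValue().length != numericAttributes.size()) {
                throw new IllegalArgumentException(
                        "wrong number of values for " + row.getKey());
            }
            // Quote the file name so that spaces do not break the line.
            sb.append("'").append(row.getKey().replace("'", "\\'")).append("'");
            for (double value : row.getValue()) {
                sb.append(",").append(value);
            }
            sb.append("\n");
        }
        return sb.toString();
    }
}
```

In a file of this shape, replacing the predicted score columns with human estimates gives a training set for new models, as the description of the -a option suggests.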