------------------------------------------------------------
Random Forest Model Objects for Pulmonary Toxicity Risk
Assessment
Jeremy M. Gernand
15 April 2013
------------------------------------------------------------
http://nanohub.org/resources/17539
------------------------------------------------------------
This download contains MATLAB TreeBagger objects (random
forest models) based on a meta-analysis of published
pulmonary nanoparticle toxicity experiments.
There are five individual model objects contained in a
single MATLAB .mat file called
"NanoToxRandomForestModels.mat".
The models were created with MATLAB 2010a and have also been
tested with MATLAB 2012a.
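As a minimal sketch of how the file might be inspected before
use: the snippet below assumes the .mat file is on the MATLAB
path, that the stored variable names match the model
designations used in this file (e.g. RF_CNT_PMN), and that
the Statistics Toolbox (which provides TreeBagger) is
installed.

```matlab
matFile = 'NanoToxRandomForestModels.mat';   % assumed to be on the MATLAB path

if exist(matFile, 'file') == 2
    % List the variables stored in the file without loading them.
    info = whos('-file', matFile);
    disp({info.name});

    % Load everything into a struct, one field per stored variable.
    models = load(matFile);

    % Assumption: a variable named RF_CNT_PMN exists in the file.
    disp(class(models.RF_CNT_PMN));          % expected to be TreeBagger
else
    warning('Could not find %s on the MATLAB path.', matFile);
end
```

Loading into a struct keeps the five model objects grouped
together instead of placing them directly in the workspace.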
The designations, input parameters, range of inputs, and
outputs are described below:
All model outputs describe the predicted measurement taken
in bronchoalveolar lavage (BAL) fluid from a rodent (rat or
mouse) exposed to the nanoparticles by inhalation,
intratracheal instillation, or aspiration. All model outputs
are in units of "fold of control", that is, the measured
response of the exposed group divided by that of the control
group.
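To illustrate the "fold of control" unit, the short
calculation below uses invented measurement values (they are
not taken from the underlying studies) and hypothetical
variable names.

```matlab
% Hypothetical BAL measurements, invented for illustration only.
controlPMN = 2.0e4;    % mean PMN count in the control group
exposedPMN = 9.0e4;    % mean PMN count in the exposed group

% Fold of control: exposed response divided by control response.
foldOfControl = exposedPMN / controlPMN;

fprintf('PMN response: %.1f fold of control\n', foldOfControl);
```

On this scale, a value of 1 means the exposed group did not
differ from the control group.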
All RF models are only valid within the specified ranges of
the input parameters. These models do not extrapolate. They
will return a prediction for inputs outside of their defined
ranges, but the predicted value will be identical to that at
the nearest boundary.
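Because out-of-range inputs still return a value, it may help
to check a query against the valid ranges before predicting.
The sketch below does this for RF_CNT_PMN using the ranges
listed in this file; the query values and the variable names
(lo, hi, x, outOfRange) are hypothetical.

```matlab
% Valid ranges for RF_CNT_PMN, in the input order given in this file:
% dose (ug/kg), recovery (days), diameter (nm), length (nm),
% cobalt dose (ug/kg), MMAD (nm).
lo = [0     1    1   320   0     1670];
hi = [6291  90   49  5900  3335  4200];

% Hypothetical query; NaN marks a missing input.
x = [7000 7 20 1000 NaN 2000];

% Comparisons with NaN are false, so missing inputs are never flagged.
outOfRange = (x < lo) | (x > hi);

if any(outOfRange)
    warning('Input column(s) %s fall outside the valid range.', ...
            mat2str(find(outOfRange)));
end
```

Here only the first input (total dose) is flagged, since
7,000 ug/kg exceeds the 6,291 ug/kg upper limit.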
------------------------------------------------------------
Input descriptions follow this format:
Input Description (units) [Minimum - Maximum]
RF_CNT_PMN
INPUTS AND VALID RANGES:
Total Dose, mass (ug/kg) [0 - 6,291]
Post Exposure, Recovery (days) [1 - 90]
Median Diameter (nm) [1 - 49]
Median Length (nm) [320 - 5,900]
Dose Cobalt (ug/kg) [0 - 3,335]
Aggregation, MMAD* (nm) [1,670 - 4,200]
OUTPUTS: PMN (fold of control) -- the multiple change in
PMN counts from the control group to the exposed group.
PMN is polymorphonuclear neutrophils. For carbon
nanotubes.
RF_CNT_LDH
INPUTS AND VALID RANGES:
Total Dose, mass (ug/kg) [0 - 8,889]
Post Exposure, Recovery (days) [1 - 90]
Median Diameter (nm) [1 - 49]
Median Length (nm) [320 - 5,900]
Dose Cobalt (ug/kg) [0 - 25,000]
Aggregation, MMAD* (nm) [1,670 - 4,200]
OUTPUTS: LDH (fold of control) -- the multiple change in
LDH concentration from the control group to the exposed
group. LDH is lactate dehydrogenase. For carbon
nanotubes.
RF_TiO2_LDH
INPUTS AND VALID RANGES:
Total Dose, mass (ug/kg) [0 - 3.87E6]
Post Exposure, Recovery (days) [0 - 2]
Avg. Primary Particle Size (nm) [3.5 - 1,000]
Aggregation, MMAD (nm) [18 - 1,400]
Purity (%) [88 - 100]
OUTPUTS: LDH (fold of control) -- the multiple change in
LDH concentration from the control group to the exposed
group. LDH is lactate dehydrogenase. For titanium
dioxide nanoparticles.
RF_TiO2_TP
INPUTS AND VALID RANGES:
Total Dose, mass (ug/kg) [0 - 3.87E6]
Post Exposure, Recovery (days) [0 - 2]
Avg. Primary Particle Size (nm) [3.5 - 1,000]
Aggregation, MMAD (nm) [18 - 1,400]
Purity (%) [88 - 100]
OUTPUTS: Total Protein (fold of control) -- the multiple
change in total protein concentration from the control
group to the exposed group. For titanium dioxide
nanoparticles.
RF_MetOx_LDH
INPUTS AND VALID RANGES:
Total Dose, mass (ug/kg) [0 - 16,543]
Post Exposure, Recovery (days) [1 - 90]
Aggregation, MMAD (nm) [2,800 - 3,300]
Purity (%) [90 - 100]
Gibbs Free Energy (kJ/mol) [-856 - -321]
Avg. Primary Particle Size (nm) [90 - 452]
OUTPUTS: LDH (fold of control) -- the multiple change in
LDH concentration from the control group to the exposed
group. LDH is lactate dehydrogenase. For metal oxide
nanoparticles including titanium dioxide, magnesium
oxide, zinc oxide, and silicon dioxide.
*MMAD is Mass Mode Aerodynamic Diameter
------------------------------------------------------------
To use these models, call the MATLAB function "predict" with
a model object and a matrix of the input parameters, as
follows (the number and order of the x1, x2, ... inputs must
exactly match those listed above for that model):
>> y = predict(RF_CNT_PMN,[x1 x2 x3 x4 x5 x6]);
To generate a prediction when any of the input variables is
missing, use "NaN" (the MATLAB designation for 'Not a
Number') in place of the missing value. If NaN is used for
all input parameters, the RF model object will return the
overall mean response for all exposed groups.
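The missing-value handling described above can be sketched
with several queries at once, one per row. The input values
are hypothetical, and the snippet assumes the .mat file is on
the MATLAB path and stores the model under the variable name
RF_CNT_PMN.

```matlab
matFile = 'NanoToxRandomForestModels.mat';   % assumed to be on the MATLAB path

% One query per row, columns in the RF_CNT_PMN input order:
% dose (ug/kg), recovery (days), diameter (nm), length (nm),
% cobalt dose (ug/kg), MMAD (nm). Values are hypothetical.
X = [1000  7  20  1000  50   2000;    % all inputs known
     1000  7  20  NaN   NaN  2000;    % length and cobalt dose unknown
     NaN(1, 6)];                      % nothing known

if exist(matFile, 'file') == 2
    S = load(matFile, 'RF_CNT_PMN');  % load only the model that is needed
    y = predict(S.RF_CNT_PMN, X);     % one prediction per row of X
    disp(y);
end
```

Per the description above, the all-NaN row should return the
overall mean response for all exposed groups.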
------------------------------------------------------------
Further information on these models, as well as results, can
be found in:
Gernand J. and Casman E. "Selecting Nanoparticle Properties
to Mitigate Risks to Workers and the Public – A Machine
Learning Modeling Framework to Compare Pulmonary Toxicity
Risks of Nanomaterials." Proc. of IMECE2013. No. 62687.
------------------------------------------------------------