Applicability Evaluation of Heat-Not-Burn Tobacco Raw Materials Based on Thermal Pyrolysis-Gas Chromatography-Mass Spectrometry and Random Forests
Abstract
The applicability of tobacco leaf raw materials for heat-not-burn cigarettes was analyzed from the perspective of chemical composition, using thermal pyrolysis-gas chromatography-mass spectrometry (Py-GC-MS) and random forests (RF). Sieved sample powder (0.90 mg) was weighed into a sample cup, and 28 different types of heat-not-burn cigarettes were analyzed by Py-GC-MS. The Py-GC-MS data were processed with MZmine software to obtain a characteristic peak table containing peak intensity information. With the characteristic peak table of the samples as the independent variables and the sensory evaluation scores as the dependent variable, an RF regression algorithm was used to build an applicability model for heat-not-burn tobacco leaf raw materials. The RF model gave a coefficient of determination of 0.93 and a root mean square error of 0.85 on the training set, and a coefficient of determination of 0.92 and a root mean square error of 0.96 on the test set. Based on qualitative identification against the NIST 2017 library, 20 chemical components with relatively high feature importance scores were screened out.
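The two figures of merit reported for the RF model can be illustrated with a minimal sketch. It assumes the conventional definitions of the coefficient of determination (1 - SS_res / SS_tot) and the root mean square error; the abstract does not state the exact formulas used. The function names and the scores below are hypothetical and are not data from the paper.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the sensory scores."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical sensory scores and model predictions (illustrative only).
observed = [80.0, 84.0, 88.0, 92.0]
predicted = [81.0, 83.0, 89.0, 91.0]

print(f"R2 = {r2_score(observed, predicted):.2f}")   # R2 = 0.95
print(f"RMSE = {rmse(observed, predicted):.2f}")     # RMSE = 1.00
```

Computing both metrics separately on the training and test sets, as the abstract reports, shows whether the model generalizes: similar values on the two sets suggest little overfitting.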
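CLC number: O657.63 DOI: 10.11973/lhjy-hx202301003
Column: Experiment and Research
Funding: Major Special Project on Tobacco Leaf Raw Material Supply for New Tobacco Products, Yunnan Provincial Company of China National Tobacco Corporation, "Research and Application of Curing Technology and Quality Evaluation of Tobacco Leaf Raw Materials for Heat-Not-Burn Cigarettes" (20205300002410040)
Received: 2022/5/9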
Note: CHEN Yi, Ph.D., mainly engaged in research on tobacco curing technology
Cite this paper: CHEN Yi, FAN Yingjie, WANG Xu, YANG Jing, ZHAO Wentao, ZHANG Zhimin. Applicability Evaluation of Heat-Not-Burn Tobacco Raw Materials Based on Thermal Pyrolysis-Gas Chromatography-Mass Spectrometry and Random Forests[J]. Physical Testing and Chemical Analysis Part B: Chemical Analysis (理化检验-化学分册), 2023, 59(1): 21-28