A Study on Quality Assessment of Human Genome Data

严维军; 赵正宜; 熊行创

doi:10.12338/j.issn.2096-9015.2023.0089

Abstract

Advances in high-throughput sequencing now allow researchers to analyze and process human genome sequencing data in depth, and the quality of those data is a decisive factor in the credibility of the results. Accurate data quality assessment is therefore essential, both to avoid unnecessary losses and to ensure that results are accurate. Academia and industry both attach great importance to it and have proposed many assessment methods and developed many tools, such as FastQC and Qualimap, along with various reference materials and standard reference data, all of which provide strong support for data quality assessment. However, few studies systematically survey the toolsets used at each stage of quality assessment or summarize their characteristics, and the assessment process itself still faces many problems and challenges. To support work on assessing human genome data, this paper analyzes strategies for addressing these problems and offers several practical recommendations for reference.
