5. Obtaining the Document Object

Last updated: afternoon of June 11, 2022

Obtaining the Document object

The XmlBeanDefinitionReader#doLoadDocument(InputSource inputSource, Resource resource) method does two things:

  • It calls #getValidationModeForResource(Resource resource) to determine the validation mode of the given resource (an XML file).
  • It calls DocumentLoader#loadDocument(InputSource inputSource, EntityResolver entityResolver, ErrorHandler errorHandler, int validationMode, boolean namespaceAware) to obtain the XML Document instance.

DocumentLoader

public interface DocumentLoader {

    /**
     * Load a Document from the supplied source.
     * @param inputSource the source of the document to load
     * @param entityResolver the resolver used to resolve entities while parsing
     * @param errorHandler handles errors raised while loading the Document
     * @param validationMode the validation mode
     * @param namespaceAware true if support for XML namespaces is required
     */
    Document loadDocument(
            InputSource inputSource, EntityResolver entityResolver,
            ErrorHandler errorHandler, int validationMode, boolean namespaceAware)
            throws Exception;
}

DefaultDocumentLoader

This class is the default implementation of DocumentLoader.

@Override
public Document loadDocument(InputSource inputSource, EntityResolver entityResolver,
        ErrorHandler errorHandler, int validationMode, boolean namespaceAware) throws Exception {
    // <1> Create the DocumentBuilderFactory
    DocumentBuilderFactory factory = createDocumentBuilderFactory(validationMode, namespaceAware);
    if (logger.isTraceEnabled()) {
        logger.trace("Using JAXP provider [" + factory.getClass().getName() + "]");
    }
    // <2> Create the DocumentBuilder
    DocumentBuilder builder = createDocumentBuilder(factory, entityResolver, errorHandler);
    // <3> Parse the XML InputSource and return the Document object
    return builder.parse(inputSource);
}

Step 1: call createDocumentBuilderFactory() to create the DocumentBuilderFactory object.

private static final String SCHEMA_LANGUAGE_ATTRIBUTE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
private static final String XSD_SCHEMA_LANGUAGE = "http://www.w3.org/2001/XMLSchema";

protected DocumentBuilderFactory createDocumentBuilderFactory(int validationMode, boolean namespaceAware)
        throws ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // Apply the requested namespace support
    factory.setNamespaceAware(namespaceAware);
    if (validationMode != XmlValidationModeDetector.VALIDATION_NONE) {
        // Enable validation
        factory.setValidating(true);
        // In XSD mode, configure additional factory attributes
        if (validationMode == XmlValidationModeDetector.VALIDATION_XSD) {
            // XSD validation needs namespace support, so force it on
            factory.setNamespaceAware(true);
            try {
                // Select XML Schema as the schema language
                factory.setAttribute(SCHEMA_LANGUAGE_ATTRIBUTE, XSD_SCHEMA_LANGUAGE);
            }
            catch (IllegalArgumentException ex) {
                ParserConfigurationException pcex = new ParserConfigurationException(
                        "Unable to validate using XSD: Your JAXP provider [" + factory +
                        "] does not support XML Schema. Are you running on Java 1.4 with Apache Crimson? " +
                        "Upgrade to Apache Xerces (or Java 1.5) for full XSD support.");
                pcex.initCause(ex);
                throw pcex;
            }
        }
    }
    return factory;
}
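The XSD branch above can be tried in isolation with nothing but the JDK's JAXP classes. The following is a minimal sketch, not Spring's code: the class name XsdFactorySketch and the method newXsdValidatingFactory are made up for illustration, and it assumes the JDK's built-in JAXP provider, which supports the schemaLanguage attribute.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class; only the two constants are taken from the listing above.
class XsdFactorySketch {

    static final String SCHEMA_LANGUAGE_ATTRIBUTE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String XSD_SCHEMA_LANGUAGE = "http://www.w3.org/2001/XMLSchema";

    /** Configures a factory the way the XSD branch does: validating, namespace-aware, XML Schema selected. */
    static DocumentBuilderFactory newXsdValidatingFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true);
        // Throws IllegalArgumentException if the JAXP provider does not recognize the attribute.
        factory.setAttribute(SCHEMA_LANGUAGE_ATTRIBUTE, XSD_SCHEMA_LANGUAGE);
        return factory;
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = newXsdValidatingFactory();
        System.out.println("validating=" + factory.isValidating()
                + ", namespaceAware=" + factory.isNamespaceAware());
    }
}
```

Calling setValidating(true) on its own only enables DTD validation in JAXP; it is the schemaLanguage attribute that switches the parser to XML Schema validation.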

Step 2: call createDocumentBuilder() to create the DocumentBuilder object.

protected DocumentBuilder createDocumentBuilder(DocumentBuilderFactory factory,
        @Nullable EntityResolver entityResolver, @Nullable ErrorHandler errorHandler)
        throws ParserConfigurationException {
    // Create the DocumentBuilder
    DocumentBuilder docBuilder = factory.newDocumentBuilder();
    if (entityResolver != null) {
        // <x> Set the EntityResolver
        docBuilder.setEntityResolver(entityResolver);
    }
    if (errorHandler != null) {
        // Set the ErrorHandler
        docBuilder.setErrorHandler(errorHandler);
    }
    return docBuilder;
}

At <x>, the EntityResolver property of the DocumentBuilder is set; see the EntityResolver section for details.

Step 3: call DocumentBuilder#parse(InputSource) to parse the InputSource and return the Document object.
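Put together, the three steps amount to ordinary JAXP usage. Here is a small sketch that parses an in-memory XML string; the class name LoadDocumentSketch and the sample XML are assumptions made for this example, and validation is left off to keep it short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class that follows the three steps with plain JAXP.
class LoadDocumentSketch {

    /** Parses an XML string into a DOM Document. */
    static Document load(String xml) throws Exception {
        // Step 1: create the factory (namespace support on, no validation).
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Step 2: create the builder; an EntityResolver and ErrorHandler could be set here.
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Step 3: parse the InputSource and return the Document.
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = load("<beans><bean id=\"demo\"/></beans>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```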

EntityResolver

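When XmlBeanDefinitionReader.doLoadDocument() obtains the Document object through DocumentLoader.loadDocument(), one of the arguments is entityResolver. That argument comes from XmlBeanDefinitionReader.getEntityResolver(). The resolver's job is to locate the validation file (DTD or XSD) so that the XML written by the user can be validated against it.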

protected EntityResolver getEntityResolver() {
    if (this.entityResolver == null) {
        // Determine default EntityResolver to use.
        ResourceLoader resourceLoader = getResourceLoader();
        if (resourceLoader != null) {
            // A ResourceLoader is available: create a ResourceEntityResolver backed by it.
            this.entityResolver = new ResourceEntityResolver(resourceLoader);
        }
        else {
            // No ResourceLoader: create a DelegatingEntityResolver, which delegates to
            // the default BeansDtdResolver and PluggableSchemaResolver.
            this.entityResolver = new DelegatingEntityResolver(getBeanClassLoader());
        }
    }
    return this.entityResolver;
}

EntityResolver implementations

The getEntityResolver() method above involves four EntityResolver implementations in total.

ResourceEntityResolver

org.springframework.beans.factory.xml.ResourceEntityResolver extends DelegatingEntityResolver and resolves entity references through a ResourceLoader.

private final ResourceLoader resourceLoader;

public ResourceEntityResolver(ResourceLoader resourceLoader) {
    // Calls the parent-class constructor, shown in the DelegatingEntityResolver section that follows
    super(resourceLoader.getClassLoader());
    this.resourceLoader = resourceLoader;
}

DelegatingEntityResolver

org.springframework.beans.factory.xml.DelegatingEntityResolver implements the EntityResolver interface and delegates to BeansDtdResolver and PluggableSchemaResolver.

public static final String DTD_SUFFIX = ".dtd";
public static final String XSD_SUFFIX = ".xsd";
private final EntityResolver dtdResolver;
private final EntityResolver schemaResolver;

// Default: delegate to BeansDtdResolver and PluggableSchemaResolver
public DelegatingEntityResolver(@Nullable ClassLoader classLoader) {
    this.dtdResolver = new BeansDtdResolver();
    this.schemaResolver = new PluggableSchemaResolver(classLoader);
}

// Custom: delegate to the resolvers supplied by the caller
public DelegatingEntityResolver(EntityResolver dtdResolver, EntityResolver schemaResolver) {
    Assert.notNull(dtdResolver, "'dtdResolver' is required");
    Assert.notNull(schemaResolver, "'schemaResolver' is required");
    this.dtdResolver = dtdResolver;
    this.schemaResolver = schemaResolver;
}

BeansDtdResolver

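org.springframework.beans.factory.xml.BeansDtdResolver implements the EntityResolver interface. It is the resolver for the Spring beans DTD and loads the DTD from the classpath or from a jar file, as the comments in the source explain.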

// Classpath resource that gets loaded: "/org/springframework/beans/factory/xml/spring-beans.dtd"
// Example of a system ID that resolves to it: "https://www.springframework.org/dtd/spring-beans-2.0.dtd"
private static final String DTD_EXTENSION = ".dtd";
private static final String DTD_NAME = "spring-beans";

PluggableSchemaResolver

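org.springframework.beans.factory.xml.PluggableSchemaResolver implements the EntityResolver interface. It reads every META-INF/spring.schemas file on the classpath into a map from namespace URI to schema file location.

Purpose

The role of EntityResolver is to let the project itself supply a way of locating the DTD declaration, that is, the program carries out the DTD lookup. For example, we can put the DTD file somewhere in the project and have the implementation read that file and return it to SAX. This avoids looking up the declaration over the network. (Quoted from 《Spring 源码深度解析》.)

The org.xml.sax.EntityResolver interface: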

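/**
 * @param publicId the public identifier of the external entity being referenced, or null if none was supplied
 * @param systemId the system identifier of the external entity being referenced
 */
public abstract InputSource resolveEntity(String publicId, String systemId)
        throws SAXException, IOException;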

The actual values of these two parameters depend on the validation mode (DTD or XSD) declared in the XML file.
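One way to see what the parser passes in is to register a resolver that merely records its arguments. The sketch below does this for a DTD-style document using only JDK classes; the class EntityIdsSketch and its SAMPLE document are assumptions made for this example. The resolver returns an empty DTD so that nothing is fetched over the network.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class: records the IDs handed to EntityResolver#resolveEntity.
class EntityIdsSketch {

    // Sample document with a DTD-style declaration (illustrative only).
    static final String SAMPLE =
            "<!DOCTYPE beans PUBLIC \"-//SPRING//DTD BEAN 2.0//EN\" "
            + "\"https://www.springframework.org/dtd/spring-beans-2.0.dtd\">"
            + "<beans/>";

    /** Parses the XML and returns "publicId | systemId" for each entity the parser asks about. */
    static List<String> recordIds(String xml) throws Exception {
        List<String> seen = new ArrayList<>();
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            seen.add(publicId + " | " + systemId);
            // Hand back an empty DTD so the parser does not try to download the real one.
            InputSource source = new InputSource(new StringReader(""));
            source.setPublicId(publicId);
            source.setSystemId(systemId);
            return source;
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        recordIds(SAMPLE).forEach(System.out::println);
    }
}
```

In DTD mode both identifiers come straight from the DOCTYPE declaration. In XSD mode there is no DOCTYPE, so the public ID is typically null and the system ID is the schema location.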

ResourceEntityResolver

How ResourceEntityResolver resolves an entity:

@Override
@Nullable
public InputSource resolveEntity(@Nullable String publicId, @Nullable String systemId)
        throws SAXException, IOException {
    // Let the parent class try to resolve the entity first
    InputSource source = super.resolveEntity(publicId, systemId);
    // The parent could not resolve it: fall back to the resourceLoader
    if (source == null && systemId != null) {
        // Work out resourcePath, the location of the Resource
        String resourcePath = null;
        try {
            String decodedSystemId = URLDecoder.decode(systemId, "UTF-8"); // decode systemId as UTF-8
            String givenUrl = new URL(decodedSystemId).toString(); // normalize to a URL string
            // URL of the system root (the current working directory)
            String systemRootUrl = new File("").toURI().toURL().toString();
            // Try relative to resource base if currently in system root.
            if (givenUrl.startsWith(systemRootUrl)) {
                resourcePath = givenUrl.substring(systemRootUrl.length());
            }
        }
        catch (Exception ex) {
            // Typically a MalformedURLException or AccessControlException.
            if (logger.isDebugEnabled()) {
                logger.debug("Could not resolve XML entity [" + systemId + "] against system root URL", ex);
            }
            // No URL (or no resolvable URL) -> try relative to resource base.
            resourcePath = systemId;
        }
        if (resourcePath != null) {
            if (logger.isTraceEnabled()) {
                logger.trace("Trying to locate XML entity [" + systemId + "] as resource [" + resourcePath + "]");
            }
            // Obtain the Resource
            Resource resource = this.resourceLoader.getResource(resourcePath);
            // Create the InputSource
            source = new InputSource(resource.getInputStream());
            // Set its publicId and systemId
            source.setPublicId(publicId);
            source.setSystemId(systemId);
            if (logger.isDebugEnabled()) {
                logger.debug("Found XML entity [" + systemId + "]: " + resource);
            }
        }
        else if (systemId.endsWith(DTD_SUFFIX) || systemId.endsWith(XSD_SUFFIX)) {
            // Look up the external DTD/XSD over https, even when it is declared with http
            String url = systemId;
            if (url.startsWith("http:")) {
                url = "https:" + url.substring(5);
            }
            try {
                source = new InputSource(new URL(url).openStream());
                source.setPublicId(publicId);
                source.setSystemId(systemId);
            }
            catch (IOException ex) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Could not resolve XML entity [" + systemId + "] through URL [" + url + "]", ex);
                }
                // Fall back to the parser's default behavior
                source = null;
            }
        }
    }
    return source;
}

  • First, the parent class's method is called to resolve the entity.
  • If that fails, the resourceLoader is used to try to read the Resource that the systemId points to.
  • If no resource path could be derived and the systemId ends with .dtd or .xsd, the URL itself is opened (with http: rewritten to https:); on an IOException the parser's default behavior applies.
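The part of this method that turns a system ID into a resource path can be isolated as a pure function. The sketch below is a simplification under stated assumptions: ResourcePathSketch and toResourcePath are hypothetical names, the system root is passed in as a parameter instead of being read from the working directory, and java.net.URI is used in place of the URL constructor, so edge cases may differ from Spring's behavior.

```java
import java.io.File;
import java.net.URI;
import java.net.URLDecoder;

// Hypothetical demo class mirroring the resourcePath computation.
class ResourcePathSketch {

    /**
     * Returns the path to hand to a resource loader, or null when the system ID is an
     * absolute URL outside the system root (the caller would then try the URL itself).
     */
    static String toResourcePath(String systemId, String systemRootUrl) {
        try {
            String decoded = URLDecoder.decode(systemId, "UTF-8");
            // Throws for relative identifiers and for strings that are not valid URLs.
            String givenUrl = new URI(decoded).toURL().toString();
            return givenUrl.startsWith(systemRootUrl)
                    ? givenUrl.substring(systemRootUrl.length())
                    : null;
        }
        catch (Exception ex) {
            // No resolvable URL: treat the system ID itself as a path relative to the resource base.
            return systemId;
        }
    }

    public static void main(String[] args) throws Exception {
        // Same expression the listing uses for the system root URL.
        String root = new File("").toURI().toURL().toString();
        System.out.println(toResourcePath(root + "conf/beans.xsd", root));
        System.out.println(toResourcePath("https://example.com/schema/custom.xsd", root));
    }
}
```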
DelegatingEntityResolver

How DelegatingEntityResolver resolves an entity:

@Override
@Nullable
public InputSource resolveEntity(@Nullable String publicId, @Nullable String systemId)
        throws SAXException, IOException {
    if (systemId != null) {
        // DTD mode: resolve with the dtdResolver (BeansDtdResolver by default)
        if (systemId.endsWith(DTD_SUFFIX)) {
            return this.dtdResolver.resolveEntity(publicId, systemId);
        }
        // XSD mode: resolve with the schemaResolver (PluggableSchemaResolver by default)
        else if (systemId.endsWith(XSD_SUFFIX)) {
            return this.schemaResolver.resolveEntity(publicId, systemId);
        }
    }
    return null;
}
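The suffix-based routing is easy to observe with stub delegates. The following sketch reimplements the dispatch with JDK types only; SuffixDispatchSketch and the labelled() helper are hypothetical and exist only to make the chosen branch visible.

```java
import java.io.IOException;
import java.io.StringReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class: routes by system ID suffix, like the method above.
class SuffixDispatchSketch implements EntityResolver {

    private final EntityResolver dtdResolver;
    private final EntityResolver schemaResolver;

    SuffixDispatchSketch(EntityResolver dtdResolver, EntityResolver schemaResolver) {
        this.dtdResolver = dtdResolver;
        this.schemaResolver = schemaResolver;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
        if (systemId != null) {
            if (systemId.endsWith(".dtd")) {
                return this.dtdResolver.resolveEntity(publicId, systemId);
            }
            if (systemId.endsWith(".xsd")) {
                return this.schemaResolver.resolveEntity(publicId, systemId);
            }
        }
        // Neither suffix matched: leave the entity to the parser's default behavior.
        return null;
    }

    /** Stub delegate that tags its result with a label (stored in the public ID). */
    static EntityResolver labelled(String label) {
        return (publicId, systemId) -> {
            InputSource source = new InputSource(new StringReader(""));
            source.setPublicId(label);
            return source;
        };
    }

    public static void main(String[] args) throws Exception {
        EntityResolver resolver = new SuffixDispatchSketch(labelled("dtd"), labelled("xsd"));
        System.out.println(resolver.resolveEntity(null, "https://www.springframework.org/dtd/spring-beans-2.0.dtd").getPublicId());
        System.out.println(resolver.resolveEntity(null, "http://www.springframework.org/schema/beans/spring-beans.xsd").getPublicId());
    }
}
```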
BeansDtdResolver

How BeansDtdResolver resolves an entity:

@Override
@Nullable
public InputSource resolveEntity(@Nullable String publicId, @Nullable String systemId) throws IOException {
    if (logger.isTraceEnabled()) {
        logger.trace("Trying to resolve XML entity with public ID [" + publicId +
                "] and system ID [" + systemId + "]");
    }
    // Only system IDs that end with .dtd are handled
    if (systemId != null && systemId.endsWith(DTD_EXTENSION)) {
        // Position of the last '/'
        int lastPathSeparator = systemId.lastIndexOf('/');
        // Position of "spring-beans", searching from the last '/' onwards
        int dtdNameStart = systemId.indexOf(DTD_NAME, lastPathSeparator);
        if (dtdNameStart != -1) {
            String dtdFile = DTD_NAME + DTD_EXTENSION;
            if (logger.isTraceEnabled()) {
                logger.trace("Trying to locate [" + dtdFile + "] in Spring jar on classpath");
            }
            try {
                // Create the ClassPathResource, relative to this class
                Resource resource = new ClassPathResource(dtdFile, getClass());
                // Create the InputSource and set its publicId and systemId
                InputSource source = new InputSource(resource.getInputStream());
                source.setPublicId(publicId);
                source.setSystemId(systemId);
                if (logger.isTraceEnabled()) {
                    logger.trace("Found beans DTD [" + systemId + "] in classpath: " + dtdFile);
                }
                return source;
            }
            catch (FileNotFoundException ex) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Could not resolve beans DTD [" + systemId + "]: not found in classpath", ex);
                }
            }
        }
    }
    // Fall back to the parser's default behavior, which downloads the DTD over the network
    return null;
}
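The matching rule used here (the system ID ends with ".dtd" and its file-name part contains "spring-beans") can be checked separately from the classpath lookup. A minimal sketch follows; BeansDtdMatchSketch and matchesBundledDtd are hypothetical names, and the example.invalid URLs are made up.

```java
// Hypothetical demo class isolating the name check from the listing above.
class BeansDtdMatchSketch {

    static final String DTD_EXTENSION = ".dtd";
    static final String DTD_NAME = "spring-beans";

    /** True when the ID ends with ".dtd" and "spring-beans" occurs at or after the last '/'. */
    static boolean matchesBundledDtd(String systemId) {
        if (systemId == null || !systemId.endsWith(DTD_EXTENSION)) {
            return false;
        }
        int lastPathSeparator = systemId.lastIndexOf('/');
        // Searching from the last '/' means directory names are ignored.
        // If there is no '/', the start index is -1, which indexOf treats as 0.
        return systemId.indexOf(DTD_NAME, lastPathSeparator) != -1;
    }

    public static void main(String[] args) {
        System.out.println(matchesBundledDtd("https://www.springframework.org/dtd/spring-beans-2.0.dtd"));
        System.out.println(matchesBundledDtd("http://example.invalid/spring-beans/other.dtd"));
    }
}
```

Because only the name is matched, any version suffix in the system ID (such as -2.0) ends up at the same bundled spring-beans.dtd.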
PluggableSchemaResolver

How PluggableSchemaResolver resolves an entity:

@Override
@Nullable
public InputSource resolveEntity(@Nullable String publicId, @Nullable String systemId) throws IOException {
    if (logger.isTraceEnabled()) {
        logger.trace("Trying to resolve XML entity with public id [" + publicId +
                "] and system id [" + systemId + "]");
    }
    if (systemId != null) {
        // Look up the classpath location mapped to this systemId
        String resourceLocation = getSchemaMappings().get(systemId);
        if (resourceLocation == null && systemId.startsWith("https:")) {
            // Retrieve canonical http schema mapping even for https declaration
            resourceLocation = getSchemaMappings().get("http:" + systemId.substring(6));
        }
        if (resourceLocation != null) {
            Resource resource = new ClassPathResource(resourceLocation, this.classLoader);
            try {
                // Create the InputSource and set its publicId and systemId
                InputSource source = new InputSource(resource.getInputStream());
                source.setPublicId(publicId);
                source.setSystemId(systemId);
                if (logger.isTraceEnabled()) {
                    logger.trace("Found XML schema [" + systemId + "] in classpath: " + resourceLocation);
                }
                return source;
            }
            catch (FileNotFoundException ex) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Could not find XML schema [" + systemId + "]: " + resource, ex);
                }
            }
        }
    }
    return null;
}

getSchemaMappings()

private Map<String, String> getSchemaMappings() {
    Map<String, String> schemaMappings = this.schemaMappings;
    // Double-checked locking: the mappings are loaded lazily and at most once
    if (schemaMappings == null) {
        synchronized (this) {
            schemaMappings = this.schemaMappings;
            if (schemaMappings == null) {
                if (logger.isTraceEnabled()) {
                    logger.trace("Loading schema mappings from [" + this.schemaMappingsLocation + "]");
                }
                try {
                    // Read every file at schemaMappingsLocation as Properties
                    Properties mappings =
                            PropertiesLoaderUtils.loadAllProperties(this.schemaMappingsLocation, this.classLoader);
                    if (logger.isTraceEnabled()) {
                        logger.trace("Loaded schema mappings: " + mappings);
                    }
                    // Copy the properties into the schemaMappings map
                    schemaMappings = new ConcurrentHashMap<>(mappings.size());
                    CollectionUtils.mergePropertiesIntoMap(mappings, schemaMappings);
                    this.schemaMappings = schemaMappings;
                }
                catch (IOException ex) {
                    throw new IllegalStateException(
                            "Unable to load schema mappings from location [" + this.schemaMappingsLocation + "]", ex);
                }
            }
        }
    }
    return schemaMappings;
}
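Loading the mappings and looking up a system ID can be sketched with the JDK alone. In the example below the properties text is an in-memory sample in the META-INF/spring.schemas format (the ':' in a key is escaped as "\:"), and SchemaMappingSketch, load and locate are hypothetical names; Spring itself reads the real files from the classpath through PropertiesLoaderUtils.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class: properties text -> map -> lookup with https-to-http fallback.
class SchemaMappingSketch {

    // One sample line in spring.schemas format (illustrative, not a full file).
    static final String SAMPLE =
            "http\\://www.springframework.org/schema/context/spring-context.xsd="
            + "org/springframework/context/config/spring-context.xsd\n";

    /** Parses properties text into a system-ID-to-classpath-location map. */
    static Map<String, String> load(String propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        Map<String, String> mappings = new ConcurrentHashMap<>();
        for (String key : props.stringPropertyNames()) {
            mappings.put(key, props.getProperty(key));
        }
        return mappings;
    }

    /** Looks up a system ID; an https: ID falls back to its http: form. */
    static String locate(Map<String, String> mappings, String systemId) {
        String location = mappings.get(systemId);
        if (location == null && systemId.startsWith("https:")) {
            location = mappings.get("http:" + systemId.substring(6));
        }
        return location;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> mappings = load(SAMPLE);
        System.out.println(locate(mappings, "https://www.springframework.org/schema/context/spring-context.xsd"));
    }
}
```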

Example: some of the mappings

"http://www.springframework.org/schema/context/spring-context-3.2.xsd"->"org/springframework/context/config/spring-context.xsd"
"http://www.springframework.org/schema/cache/spring-cache-4.3.xsd"->"org/springframework/cache/config/spring-cache.xsd"

Custom EntityResolver

If a SAX application needs custom handling of external entities, it must implement the EntityResolver interface and register an instance with the SAX driver through the #setEntityResolver(EntityResolver entityResolver) method.
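As an illustration, the sketch below registers a custom resolver on a DocumentBuilder and serves the DTD from a string held in the program, so validation runs without any network access. Everything named here (LocalDtdResolverSketch, the example.invalid system ID, the one-line DTD) is an assumption made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class: a custom EntityResolver that supplies a local DTD.
class LocalDtdResolverSketch {

    // Made-up system ID and DTD, used only for this example.
    static final String SYSTEM_ID = "http://example.invalid/dtd/note.dtd";
    static final String LOCAL_DTD = "<!ELEMENT note (#PCDATA)>";

    /** Parses and DTD-validates the XML, resolving SYSTEM_ID to the local DTD. */
    static Document parseWithLocalDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            if (SYSTEM_ID.equals(systemId)) {
                InputSource source = new InputSource(new StringReader(LOCAL_DTD));
                source.setPublicId(publicId);
                source.setSystemId(systemId);
                return source;
            }
            return null; // anything else: parser default behavior
        });
        // Turn validation errors into exceptions instead of letting the default handler print them.
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException ex) { }
            @Override public void error(SAXParseException ex) throws SAXParseException { throw ex; }
            @Override public void fatalError(SAXParseException ex) throws SAXParseException { throw ex; }
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String doctype = "<!DOCTYPE note SYSTEM \"" + SYSTEM_ID + "\">";
        Document doc = parseWithLocalDtd(doctype + "<note>hello</note>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

A document that violates the DTD, for instance one that nests an undeclared element inside note, is rejected with a SAXParseException because the error handler rethrows validation errors.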


http://www.muzili.ren/2022/06/11/获取Document对象/
Author: jievhaha
Published: June 11, 2022