Analysis

HTML analysis modules for readability, forms, tables, and metadata extraction.

6 modules

| Module | Description |
| --- | --- |
| HTML Readability | Analyze content readability |
| Extract Forms | Extract form data from HTML |
| Extract Metadata | Extract metadata from HTML |
| Extract Tables | Extract table data from HTML |
| Find Patterns | Find repeating data patterns in HTML |
| HTML Structure | Analyze the DOM structure of HTML |

Modules

HTML Readability

analysis.html.analyze_readability

Analyze content readability

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
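
The output fields of this module are not documented above, so the following is only a rough, standalone sketch of what a readability pass over HTML might compute. It uses Python's standard-library `html.parser`; the names `_TextExtractor` and `readability_stats`, and the returned keys, are hypothetical and are not the module's API.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def readability_stats(html: str) -> dict:
    """Hypothetical helper: basic word and sentence counts for the visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so counts are not affected by markup layout.
    text = " ".join(" ".join(parser.chunks).split())
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Naive sentence split on terminal punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }
```

Average sentence length is only one possible readability signal; the actual module may report different or additional metrics.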

Extract Forms

analysis.html.extract_forms

Extract form data from HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
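
To make "form data" concrete, here is a minimal sketch of form extraction with the standard-library `html.parser`. It is an illustration only: `FormCollector`, `extract_forms`, and the dictionary layout are assumptions, and the real module may return a different shape.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Records each <form> with its action, method, and field descriptors."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                # HTML treats a missing method as GET.
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            self._current["fields"].append(
                {"tag": tag, "name": attrs.get("name"), "type": attrs.get("type")}
            )

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def extract_forms(html: str) -> list:
    """Hypothetical helper: return one dict per <form> found in the markup."""
    parser = FormCollector()
    parser.feed(html)
    parser.close()
    return parser.forms
```

Fields that appear outside any `<form>` element are ignored in this sketch.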

Extract Metadata

analysis.html.extract_metadata

Extract metadata from HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
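
As a point of reference, metadata extraction from HTML usually means reading the `<title>` and the `<meta>` tags. The sketch below does just that with `html.parser`; `MetadataCollector`, `extract_metadata`, and the `title`/`meta` keys are hypothetical names chosen for illustration, not the module's documented output.

```python
from html.parser import HTMLParser


class MetadataCollector(HTMLParser):
    """Gathers the document title and name/property -> content pairs from <meta>."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Standard tags use `name`; Open Graph style tags use `property`.
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html: str) -> dict:
    """Hypothetical helper: return the title and a flat dict of meta entries."""
    parser = MetadataCollector()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "meta": parser.meta}
```

Tags without a `name` or `property` key, such as `<meta charset="utf-8">`, are skipped here.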

Extract Tables

analysis.html.extract_tables

Extract table data from HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
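
One plausible reading of "table data" is a list of tables, each a list of rows of cell text. The following sketch produces that with `html.parser`. `TableCollector` and `extract_tables` are hypothetical, and the sketch assumes non-nested tables with explicit closing tags for cells and rows.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collects each <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row being parsed
        self._cell = None  # text fragments of the cell being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join fragments and normalize whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def extract_tables(html: str) -> list:
    """Hypothetical helper: return every table as rows of cell text."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Header cells (`<th>`) and data cells (`<td>`) are treated alike, so the first row of each table typically holds the column names.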

Find Patterns

analysis.html.find_patterns

Find repeating data patterns in HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
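
How the module detects repetition is not described here. A simple heuristic, shown below purely as an assumption, is to count elements that share the same tag and class set and report those that occur at least a threshold number of times. `SignatureCounter`, `find_repeated_elements`, and `min_count` are hypothetical names.

```python
from collections import Counter
from html.parser import HTMLParser


class SignatureCounter(HTMLParser):
    """Counts elements by (tag, sorted class names) signature."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        # Sort class names so "item card" and "card item" share a signature.
        signature = (tag, " ".join(sorted(classes.split())))
        self.counts[signature] += 1


def find_repeated_elements(html: str, min_count: int = 3) -> dict:
    """Hypothetical helper: signatures that occur at least `min_count` times."""
    parser = SignatureCounter()
    parser.feed(html)
    parser.close()
    return {sig: n for sig, n in parser.counts.items() if n >= min_count}
```

Repeated signatures such as product cards or list items are often the records worth extracting, though a real detector would likely also compare the child structure of each candidate.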

HTML Structure

analysis.html.structure

Analyze the DOM structure of HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
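
A DOM structure summary could include per-tag counts and the maximum nesting depth. The sketch below computes both with `html.parser` as an illustration. `StructureCollector`, `analyze_structure`, and the result keys are assumptions rather than the module's documented output, and depth is only reliable for markup with explicit closing tags.

```python
from collections import Counter
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not increase depth.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class StructureCollector(HTMLParser):
    """Tracks tag frequencies and the deepest nesting of non-void elements."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()
        self.max_depth = 0
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        if tag not in VOID_TAGS:
            self._depth += 1
            self.max_depth = max(self.max_depth, self._depth)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: count it, leave depth unchanged.
        self.tag_counts[tag] += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._depth:
            self._depth -= 1


def analyze_structure(html: str) -> dict:
    """Hypothetical helper: tag counts plus maximum element nesting depth."""
    parser = StructureCollector()
    parser.feed(html)
    parser.close()
    return {"tag_counts": dict(parser.tag_counts), "max_depth": parser.max_depth}
```

Elements whose end tags are omitted in the source (for example `<li>` without `</li>`) would inflate the depth in this sketch; a full HTML5 parser handles those cases.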

Released under the Apache 2.0 License.