Analysis

HTML analysis modules for readability, forms, tables, and metadata extraction.

6 modules

| Module | Description |
| --- | --- |
| HTML Readability | Analyze content readability |
| Extract Forms | Extract form data from HTML |
| Extract Metadata | Extract metadata from HTML |
| Extract Tables | Extract table data from HTML |
| Find Patterns | Find repeating data patterns in HTML |
| HTML Structure | Analyze HTML DOM structure |

Modules

HTML Readability

analysis.html.analyze_readability

Analyze content readability

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
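
This page does not say which readability metrics `analysis.html.analyze_readability` computes. As a rough illustration only, the sketch below strips markup with Python's standard-library `html.parser` and derives a few basic text statistics. The function name `simple_readability` and the returned keys are hypothetical and are not the module's actual output.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def simple_readability(html: str) -> dict:
    """Hypothetical helper: basic text statistics, not the module's real metrics."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so sentence splitting is not affected by layout.
    text = " ".join(" ".join(parser.parts).split())
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }
```

Calling `simple_readability(page_html)` on a string of HTML returns a plain dictionary, which mirrors the object-typed output listed above without claiming to match its fields.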

Extract Forms

analysis.html.extract_forms

Extract form data from HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
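
To make "form data" concrete, here is a minimal sketch of what such an extraction might collect, written with Python's standard-library `html.parser`. It is not the implementation behind `analysis.html.extract_forms`. The name `extract_forms_sketch` and the `action`/`method`/`fields` keys are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class _FormCollector(HTMLParser):
    """Records each <form> with its action, method, and field elements."""

    FIELD_TAGS = {"input", "select", "textarea", "button"}

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attr.get("action") or "",
                # HTML treats a missing method as GET.
                "method": (attr.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in self.FIELD_TAGS and self._current is not None:
            self._current["fields"].append({
                "tag": tag,
                "name": attr.get("name"),
                "type": attr.get("type"),
                "value": attr.get("value"),
            })

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def extract_forms_sketch(html: str) -> list:
    """Hypothetical helper: returns one dict per <form> found in the input."""
    parser = _FormCollector()
    parser.feed(html)
    parser.close()
    return parser.forms
```

Fields that appear outside any `<form>` element are ignored in this sketch; a real extractor may also need to honor the `form` attribute that links a field to a form by id.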

Extract Metadata

analysis.html.extract_metadata

Extract metadata from HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
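
The page does not list which metadata fields `analysis.html.extract_metadata` returns. As an illustration under that caveat, the sketch below gathers the document title and `<meta>` name/property pairs with Python's standard-library `html.parser`. The name `extract_metadata_sketch` and the `title`/`meta` keys are hypothetical.

```python
from html.parser import HTMLParser


class _MetaCollector(HTMLParser):
    """Collects the <title> text and <meta> tags that carry a content value."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Standard tags use name=..., Open Graph tags use property=...
            key = attr.get("name") or attr.get("property")
            if key and attr.get("content") is not None:
                self.meta[key] = attr["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata_sketch(html: str) -> dict:
    """Hypothetical helper: returns the title plus a name-to-content mapping."""
    parser = _MetaCollector()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "meta": parser.meta}
```

Tags such as `<meta charset="utf-8">` have no `content` attribute and are skipped here; whether the real module reports them is not stated on this page.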

Extract Tables

analysis.html.extract_tables

Extract table data from HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
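
For a sense of what table extraction involves, the following sketch turns each `<table>` into a list of rows of cell text using Python's standard-library `html.parser`. It is an assumption-laden illustration rather than the code behind `analysis.html.extract_tables`: the name `extract_tables_sketch` is hypothetical, and nested tables, `rowspan`, and `colspan` are deliberately not handled.

```python
from html.parser import HTMLParser


class _TableCollector(HTMLParser):
    """Builds tables as lists of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # row currently being filled
        self._cell = None  # text fragments of the open cell

    def _flush_cell(self):
        # Store the open cell (if any) with whitespace normalized.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._flush_cell()
            self._row = []
            self.tables[-1].append(self._row)
        elif tag in ("td", "th"):
            # A new cell implicitly closes a previous one left open.
            self._flush_cell()
            if self._row is not None:
                self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag == "tr":
            self._flush_cell()
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def extract_tables_sketch(html: str) -> list:
    """Hypothetical helper: returns one list of rows per <table>."""
    parser = _TableCollector()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Header cells (`<th>`) and data cells (`<td>`) are treated alike, so the first row of each result usually holds the column names when the source table has a header row.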

Find Patterns

analysis.html.find_patterns

Find repeating data patterns in HTML

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
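
How `analysis.html.find_patterns` detects repetition is not documented here. One simple heuristic, shown below purely as an assumption, is to count how often the same tag-plus-class signature occurs and report the ones that repeat, since list items and product cards usually share a class. The name `find_repeated_signatures` and the `min_count` parameter are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class _SignatureCounter(HTMLParser):
    """Counts elements by a 'tag.class1.class2' signature."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        if classes:
            # Sort class names so that their order in the markup does not matter.
            signature = tag + "." + ".".join(sorted(classes.split()))
            self.counts[signature] += 1


def find_repeated_signatures(html: str, min_count: int = 3) -> dict:
    """Hypothetical helper: signatures that occur at least min_count times."""
    parser = _SignatureCounter()
    parser.feed(html)
    parser.close()
    return {sig: n for sig, n in parser.counts.items() if n >= min_count}
```

Elements without a `class` attribute are ignored in this sketch, so it misses repetition that is expressed only through structure; a fuller approach would compare subtree shapes as well.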

HTML Structure

analysis.html.structure

Analyze HTML DOM structure

Parameters:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | - | HTML content to analyze |

Output:

| Field | Type | Description |
| --- | --- | --- |
| type | any | object |
| properties | any | |
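
As a hedged illustration of what a DOM structure analysis might report, the sketch below counts tags and tracks the maximum nesting depth with Python's standard-library `html.parser`. The fields that `analysis.html.structure` actually returns are not listed on this page, so `analyze_structure_sketch`, `tag_counts`, and `max_depth` are assumed names.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag and so must not increase the depth.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class _StructureScanner(HTMLParser):
    """Counts start tags and tracks how deeply elements are nested."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()
        self.max_depth = 0
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        if tag in VOID_TAGS:
            # A void element sits one level below its parent but opens no scope.
            self.max_depth = max(self.max_depth, self._depth + 1)
            return
        self._depth += 1
        self.max_depth = max(self.max_depth, self._depth)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <img/>: count it without changing depth.
        self.tag_counts[tag] += 1
        self.max_depth = max(self.max_depth, self._depth + 1)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._depth > 0:
            self._depth -= 1


def analyze_structure_sketch(html: str) -> dict:
    """Hypothetical helper: tag frequencies and maximum nesting depth."""
    parser = _StructureScanner()
    parser.feed(html)
    parser.close()
    return {"tag_counts": dict(parser.tag_counts), "max_depth": parser.max_depth}
```

Because `html.parser` does not build a tree or insert implied end tags, markup that omits closing tags (for example `<p>` or `<li>`) will make this depth estimate too high; a tree-building parser would be needed for exact results.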

Released under the Apache 2.0 License.