Extract data from online news & articles. Get full metadata with content, images, authors, summary, category, keywords, topics, and more.
Automatic data extraction from articles, products, discussions, and more. This API uses advanced AI technology to retrieve clean, structured data without the need for manual rules or site-specific training.
The most advanced article extraction API, with AI/ML-generated summaries, category prediction, all images, blog logo, authors, keywords, tags, and more.
Features:
- Extracts full HTML/Text: using A.I., we extract the full HTML even from JavaScript-heavy websites.
- Consistent Categories: get auto-predicted categories to better organize your extracted content.
- Metadata: get the full metadata of the article, including images, keywords, tags, and more.
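As an illustration of organizing extracted content by category, here is a minimal Python sketch. It assumes each extracted article arrives as a JSON-style dictionary keyed by the field names listed in this description, and that `category` and `predictedCategories` hold plain strings; the listing does not confirm either detail. The helper name `group_by_category` and the sample data are hypothetical.

```python
from collections import defaultdict


def group_by_category(articles):
    """Group article titles by category.

    Uses `category` when present, otherwise the first entry of
    `predictedCategories`, otherwise the label "uncategorized".
    Assumption: categories are plain strings.
    """
    groups = defaultdict(list)
    for article in articles:
        predicted = article.get("predictedCategories") or []
        category = article.get("category") or (
            predicted[0] if predicted else "uncategorized"
        )
        groups[category].append(article["title"])
    return dict(groups)


# Hypothetical sample data, not real API output.
sample = [
    {"title": "Rate cut expected", "category": "Business",
     "predictedCategories": ["Business", "Politics", "World"]},
    {"title": "New phone launched", "category": None,
     "predictedCategories": ["Technology", "Business", "Science"]},
    {"title": "Untitled note", "category": "", "predictedCategories": []},
]

grouped = group_by_category(sample)
```

Falling back to the top predicted category keeps every article in some bucket even when the primary `category` field is empty.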
Extracted Fields:
- Date publishedAt;
- String title;
- List authors;
- String description;
- String language;
- String url;
- String mainImage;
- String html;
- String text;
- String category;
- List predictedCategories; /** Top 3 predicted categories, using an A.I. model */
- List tags;
- List keywords; /** Most important keywords in the article */
- String summary; /** A.I. generated summary of the article */
- List images; /** All images in the article */
- String blogName;
- String blogLogoUrl;
Want any more fields to be added? Send us an email at [email protected]
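The extracted fields listed above could be modelled on the client side roughly as follows. This is a sketch under stated assumptions, not the API's official schema: it assumes the response is JSON keyed by the same field names, that list fields contain strings, and that `publishedAt` is an ISO 8601 string with a numeric UTC offset. The class name `Article`, the `from_dict` helper, and the sample payload are hypothetical.

```python
import json
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class Article:
    # Field names mirror the "Extracted Fields" list; element types are assumed.
    publishedAt: Optional[datetime] = None
    title: str = ""
    authors: List[str] = field(default_factory=list)
    description: str = ""
    language: str = ""
    url: str = ""
    mainImage: str = ""
    html: str = ""
    text: str = ""
    category: str = ""
    predictedCategories: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    summary: str = ""
    images: List[str] = field(default_factory=list)
    blogName: str = ""
    blogLogoUrl: str = ""

    @classmethod
    def from_dict(cls, data: dict) -> "Article":
        """Build an Article, ignoring unknown keys and null values."""
        known = {f.name for f in fields(cls)}
        values = {k: v for k, v in data.items() if k in known and v is not None}
        # Assumption: dates arrive as ISO 8601 strings.
        if isinstance(values.get("publishedAt"), str):
            values["publishedAt"] = datetime.fromisoformat(values["publishedAt"])
        return cls(**values)


# Hypothetical payload for illustration only.
raw = json.loads("""
{
  "title": "Example headline",
  "authors": ["Jane Doe"],
  "publishedAt": "2024-01-15T09:30:00+00:00",
  "language": "en",
  "keywords": ["example"],
  "unknownField": 1
}
""")

article = Article.from_dict(raw)
```

Ignoring unknown keys means the model keeps working if more fields are added to the response later.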