Project details

Build a web scraping tool that collects articles from several news websites, stores them in a database in a structured form, and exposes APIs to retrieve and manage them.

1. Create Login and Registration APIs

2. Create a website scraping module, exposed as APIs, with the following functions:

a. GET API to list all stored websites that the system has scraped (Website name, Website link, Created at, Last scraped at, Last scraped by).

b. POST API to re-scrape a selected website (by passing its ID) and fetch its new articles.

c. GET API to return all scraped articles for a given website ID (Article title, Article description, Article DOM, Published at, Article link, Website name), ordered by "Published at" descending.

d. Log each scraping request, recording who sent it and for which website.

e. GET API to list a website's scraping history (User name, Scraping date).
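The storage side of items a to e can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names, column names, and helper functions below are hypothetical and not part of the brief, timestamps are assumed to be stored as ISO 8601 strings, and the HTTP layer and authentication are left out.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE websites (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    link TEXT NOT NULL UNIQUE,
    created_at TEXT NOT NULL
);
CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    website_id INTEGER NOT NULL REFERENCES websites(id),
    title TEXT NOT NULL,
    description TEXT,
    dom TEXT,
    published_at TEXT,
    link TEXT NOT NULL UNIQUE
);
CREATE TABLE scrape_log (
    id INTEGER PRIMARY KEY,
    website_id INTEGER NOT NULL REFERENCES websites(id),
    user_id INTEGER NOT NULL REFERENCES users(id),
    scraped_at TEXT NOT NULL
);
"""

def open_db(path=":memory:"):
    """Open a database and create the tables (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows can be read by column name
    conn.executescript(SCHEMA)
    return conn

def log_scrape(conn, website_id, user_id, scraped_at):
    """Item d: record who requested a scrape, and for which website."""
    conn.execute(
        "INSERT INTO scrape_log (website_id, user_id, scraped_at) VALUES (?, ?, ?)",
        (website_id, user_id, scraped_at),
    )
    conn.commit()

def list_websites(conn):
    """Item a: 'last scraped at/by' come from the newest log row, if any."""
    return conn.execute(
        """
        SELECT w.id, w.name, w.link, w.created_at,
               l.scraped_at AS last_scraped_at, u.name AS last_scraped_by
        FROM websites w
        LEFT JOIN scrape_log l ON l.id = (
            SELECT s.id FROM scrape_log s WHERE s.website_id = w.id
            ORDER BY s.scraped_at DESC, s.id DESC LIMIT 1)
        LEFT JOIN users u ON u.id = l.user_id
        ORDER BY w.id
        """
    ).fetchall()

def list_articles(conn, website_id):
    """Item c: ISO 8601 strings sort chronologically, so a text sort works."""
    return conn.execute(
        """
        SELECT a.title, a.description, a.dom, a.published_at, a.link,
               w.name AS website_name
        FROM articles a JOIN websites w ON w.id = a.website_id
        WHERE a.website_id = ?
        ORDER BY a.published_at DESC
        """,
        (website_id,),
    ).fetchall()

def scrape_history(conn, website_id):
    """Item e: user name and scraping date, newest first."""
    return conn.execute(
        """
        SELECT u.name AS user_name, l.scraped_at
        FROM scrape_log l JOIN users u ON u.id = l.user_id
        WHERE l.website_id = ?
        ORDER BY l.scraped_at DESC, l.id DESC
        """,
        (website_id,),
    ).fetchall()
```

Deriving "Last scraped at" and "Last scraped by" from the log table, rather than storing them on the website row, keeps item d as the single source of truth for items a and e. That is one possible design, not a requirement of the brief.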

3. Host your results on the AWS cloud (optional).

a. Use the free tier of Amazon AWS to host your result. (IMPORTANT NOTE: The AWS free tier takes up to 48 hours to activate, so register first and then develop your code while waiting for the activation.)

4. Websites to scrape

https://www.mklat.com/cat... https://www.arabmediasoci...
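Extracting articles from a listing page might be sketched as follows, using only Python's standard html.parser. The markup it expects (an article element containing a link, a p summary, and a time element with a datetime attribute) is a hypothetical layout chosen for illustration; the real pages of the sites above would need their own selectors. The class and function names are likewise assumptions, and the "Article DOM" field is not captured here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ArticleListParser(HTMLParser):
    """Collects one dict per <article> element (assumed markup, see lead-in)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.articles = []
        self._current = None  # article being built, or None outside <article>
        self._field = None    # field that currently receives text, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "article":
            self._current = {"title": "", "description": "",
                             "link": None, "published_at": None}
        elif self._current is None:
            return
        elif tag == "a" and self._current["link"] is None:
            # The first link inside the article is taken as the headline.
            self._current["link"] = attrs.get("href")
            self._field = "title"
        elif tag == "p":
            self._field = "description"
        elif tag == "time":
            self._current["published_at"] = attrs.get("datetime")

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if tag in ("a", "p"):
            self._field = None
        elif tag == "article" and self._current is not None:
            item = self._current
            item["title"] = item["title"].strip()
            item["description"] = item["description"].strip()
            if item["link"]:
                # Resolve relative links against the listing page URL.
                item["link"] = urljoin(self.base_url, item["link"])
            self.articles.append(item)
            self._current = None

def parse_articles(html, base_url):
    """Return a list of article dicts found in the given HTML string."""
    parser = ArticleListParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.articles
```

A re-scrape endpoint could fetch the listing page, pass its HTML to parse_articles, and insert only the rows whose link is not already stored, so repeated scrapes add new articles without duplicating old ones.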

Work card

Freelancer name: Omar H.