This Python project demonstrates web scraping, data extraction, and basic data analysis skills. The scraper collects the top 20 posts from Hacker News, including their titles, links, scores, number of comments, and posting time. The data is then stored in a structured CSV file for further analysis.
Key features of this project include:
Extracting post data from a live website using BeautifulSoup and Requests
Handling missing data and cleaning text fields
Saving the collected data in a CSV file for easy use
Clean and organized Python code for maintainability
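
The cleaning and CSV steps listed above could look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual code: the field names, the `clean_post` and `write_csv` helpers, and the sample values are all hypothetical, and it assumes the BeautifulSoup step has already produced one dictionary of raw strings per post.

```python
import csv
import io
import re

# Hypothetical column order; the project's real CSV header may differ.
FIELDS = ["title", "link", "score", "comments", "posted"]


def to_int(text):
    """Return the first integer found in text, or 0 if there is none."""
    match = re.search(r"\d+", text or "")
    return int(match.group()) if match else 0


def clean_post(raw):
    """Normalize one scraped post; missing fields fall back to defaults."""
    return {
        # Collapse runs of whitespace inside the title and trim the ends.
        "title": " ".join((raw.get("title") or "").split()),
        "link": (raw.get("link") or "").strip(),
        "score": to_int(raw.get("score")),
        "comments": to_int(raw.get("comments")),
        "posted": (raw.get("posted") or "").strip(),
    }


def write_csv(posts, handle):
    """Clean each post and write it as a row to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(clean_post(p) for p in posts)


# Made-up input, shaped as it might arrive from the HTML parser.
sample = [
    {"title": "  Show HN:  Example  ", "link": "https://example.com",
     "score": "128 points", "comments": "45\xa0comments", "posted": "2 hours ago"},
    {"title": "Job posting", "link": "item?id=1"},  # no score or comments
]

buffer = io.StringIO()
write_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with `open("posts.csv", "w", newline="", encoding="utf-8")`; the `newline=""` argument is what the `csv` module documentation recommends to avoid blank lines between rows on some platforms. The file name here is only an example.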
This project showcases Python scripting, web scraping, and data handling, and fits well in a data engineering or software development portfolio.