Posts

Showing posts from January, 2019

Web Scraping using Python Scrapy framework with Example

The internet holds an overwhelming amount of data, and different stakeholders use that data for different services. Saving it manually would be an enormous amount of work. Web scraping is the process of extracting data from websites automatically. Scrapy is a Python framework designed for large-scale web scraping. Scrapy's architecture is built around "Spiders", which are self-contained crawlers. Spiders are Python classes that the framework uses to extract data from one or more websites. As an example, in this post we will scrape the Hot Deals section of the popular Canadian website Redflagdeals to extract the Deal Title, Vote Count and Link for each deal. This information will be written to a .csv file, which can then be used to sort the deals by vote count. To start, we will install Scrapy and create a new project by running pip install scrapy followed by scrapy startproject rfd. Our new project will have the f
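The end goal described above, sorting the scraped deals by vote count, could look roughly like the sketch below once the .csv file exists. It uses only the standard library. The column names (title, votes, link) and the sample rows are assumptions made for illustration, not the fields or data the post's spider actually produces.

```python
import csv
import io


def sort_deals_by_votes(csv_text):
    """Return deal rows sorted by vote count, highest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Vote counts are read as strings (and may be negative), so convert before sorting.
    return sorted(reader, key=lambda row: int(row["votes"]), reverse=True)


# Hypothetical sample data standing in for the spider's .csv output.
SAMPLE_CSV = """title,votes,link
Deal A,12,https://example.com/a
Deal B,45,https://example.com/b
Deal C,-3,https://example.com/c
"""

if __name__ == "__main__":
    for deal in sort_deals_by_votes(SAMPLE_CSV):
        print(deal["votes"], deal["title"], deal["link"])
```

For a real export, the same function can be fed the file's contents, for example by reading the .csv with open(...).read().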

Serverless functions with Python & AWS Lambda

Serverless computing is a cloud computing execution model in which the cloud provider runs the servers and dynamically manages the allocation of machine resources. Pricing is based on the actual amount of resources consumed by an application. Serverless applications are event-driven, cloud-based systems where application development relies solely on a combination of third-party services, client-side logic and cloud-hosted remote procedure calls (FaaS - Functions as a Service). In this post, we will set up and create a basic serverless function using the Serverless Framework and AWS Lambda. Serverless Setup: To start, we need to install the Serverless Framework. We could write serverless functions directly in AWS Lambda, but the Serverless Framework makes setup easier, especially when you have to work with other AWS services such as API Gateway. Other frameworks are available for deploying Python serverless functions, such as Zappa.
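To make the idea of a "function as a service" concrete, here is a minimal sketch of what a Python function for AWS Lambda might look like. The function name hello and the name query parameter are hypothetical and chosen only for this example. The return shape assumes the function sits behind API Gateway's Lambda proxy integration, which expects a status code and a string body.

```python
import json


def hello(event, context):
    """Minimal handler: Lambda calls it with the event payload and a context object."""
    # `name` is a hypothetical query parameter used only for this illustration.
    params = (event or {}).get("queryStringParameters") or {}
    name = params.get("name", "world")
    # The proxy integration expects a statusCode and a string body.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test with a hand-written event; no AWS account is needed.
    print(hello({"queryStringParameters": {"name": "Lambda"}}, None))
```

Because the handler is a plain Python function, it can be exercised locally with a hand-built event before any deployment tooling is involved.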