Posts

Showing posts from April, 2022

ML/AI Image Classifier for Skin Cancer Detection

Skin cancer is one of the most common types of cancer in the present decade. Since the skin is the body’s largest organ, it is understandable that skin cancer is considered the most common type of cancer among humans. It is generally classified into two major categories: nonmelanoma and melanoma skin cancer. Melanoma, the more dangerous of the two, can only be cured if diagnosed early; otherwise, it spreads to other parts of the body and can lead to a painful death. Therefore, the critical factor in skin cancer treatment is early diagnosis. Yet diagnosis is still a visual process that relies on a long-winded procedure of clinical screening, followed by dermoscopic analysis, then a biopsy, and finally a histopathological examination. This process can easily take months, requires many medical professionals, and is still only ~77% accurate. Current methods using AI and Deep Learning to diagnose lesions show poten...

Webscraping in R - IMDb ETL Showcase

Web scraping in R can serve as an ETL pipeline that performs web data mining by reading HTML tags and converting them to a structured format that can easily be visualized using the tidyverse. Let's scrape movies from IMDb into a data frame in R using the rvest library, and then visualize the data frame using the ggplot2 package and its qplot function:

Importing the key R libraries

    library(rvest)   # scraping
    library(dplyr)   # piping
    library(ggplot2) # plotting

Specifying the URL for the desired website to be scraped

    url <- 'http://www.imdb.com/search/title?count=100&release_date=2016,2016&title_type=feature'

Reading the HTML code from the website

    webpage <- read_html(url)

Using CSS selectors to scrape the rankings section

    rank_data_html <- html_nodes(webpage, '.text-primary')

Converting the ranking data to text

    rank_data <- html_text(rank_data_html)

Let's have a look at the rankings

    head(rank_data)
    [1] "1." "2." "3." "4....
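
The rankings come back as character strings with a trailing period, so they usually need converting to numbers before plotting. Below is a minimal sketch of one way to do that in base R. The sample vector is a hypothetical stand-in for the scraped `rank_data`, and the name `rank_num` is an assumption for illustration, not part of the original post.

```r
# Hypothetical sample mirroring the kind of strings html_text() returns
# for the rankings; real values would come from the scraped page.
rank_data <- c("1.", "2.", "3.", "4.")

# Strip the trailing period, then convert the strings to integers.
rank_num <- as.integer(sub("\\.$", "", rank_data))

print(rank_num)
```

Removing the period explicitly keeps the conversion obvious to a reader; the same pattern can be reused for other scraped columns that carry stray punctuation.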