Introduction to Python and Big Data Analysis

CDIP offers a certificate course, Introduction to Python and Big Data Analysis. Data Science is currently among the most in-demand professions in the software industry. A Data Scientist needs a solid knowledge of both Big Data and Machine Learning, and Python provides mature packages and libraries for each. In this course, students will learn not only the basics of Python but also how to perform data analysis on Big Data.

Batch 6 classes started on 3rd January 2020.

Name: Sifatur Rahim

Designation: Lead BigData DevOps Engineer

Company: Telenor Health

Experience: 12 Years

LinkedIn: https://bd.linkedin.com/in/sifatur-rahim-59930a33

Name: Rakib Hasan

Designation: Sr. Software Engineer

Company: Telenor Health

Experience: 7 Years

LinkedIn: https://bd.linkedin.com/in/rakib-hasan-amiya-a7700a66

Basic Python Part 1

  • Overview of Linux (Ubuntu) and the Linux filesystem
  • Ubuntu command line (terminal) tricks
  • Environment setup, Python intro, package manager (pip)
  • Basic Syntax
  • Variable Types

Basic Python Part 2

  • Basic Operators
  • Decision Making
  • Loops
  • Numbers
  • Strings

Basic Python Part 3

  • Arrays
  • Matrix
  • Lists
  • Tuples
  • Dictionary
  • Images
  • Tables
  • Forms

Basic Python Part 4

  • Date & Time
  • Functions
  • Modules
  • Files I/O
  • Exceptions

Advanced Python Part 1

  • Classes/Objects
  • Regular Expressions
  • Database Access

Advanced Python Part 2

  • Sending Email
  • JSON Processing
  • Logging

Database

  • Why use a database
  • Postgres DB
  • Understanding the database design of a standard project
  • Standard DB operations
  • SQL tricks for better data operations

Pandas

Pandas (intro)

  • Understanding the pandas DataFrame
  • Loading a DataFrame from CSV/Excel
  • Connecting to a database and executing queries
  • Filtering a DataFrame and storing the result to CSV or Excel
  • Kaggle dataset analysis

Big Data

Big Data Concept

  • Structure concepts (Hadoop, Spark)
  • When to choose which
  • Data collection (open datasets)

Big Data Part 2 (PySpark)

Implementation

  • AWS EMR (and related AWS basics such as EC2 and S3)
  • Running an example with Hive and Hue
  • Example with PySpark (using a Kaggle dataset)

 

  • Students will gain a solid understanding of script writing in Python.
  • Students will be able to identify appropriate object types and build reusable Python modules.
  • Exception and error handling in Python.
  • Data analysis using Python scripts.
  • Students will learn about different open-source relational database management systems.
  • Big Data analysis using PySpark, NumPy and pandas.
  • Students will become proficient at deriving results through data analysis.
