The course teaches students comprehensive and specialised subjects in computer science, along with cutting-edge engineering skills for solving real-world problems using computational thinking and tools. Most of this program is case-based or project-based, so students learn by solving real-world problems end to end. The program has core courses that focus on computational thinking and problem solving from first principles. The core courses are followed by specialization courses that teach various aspects of building real-world systems. These are followed by more advanced courses that focus on research-level topics and cover state-of-the-art methods. The program ends with a capstone project, in which students can either build an end-to-end solution to a real-world problem or work on a research topic. The program also focuses on teaching students the “ability to learn” so that they can be lifelong learners who constantly upgrade their skills. Students can choose from a spectrum of courses to specialize in a specific sub-area of Computer Science, such as Artificial Intelligence and Machine Learning or Cloud and Full Stack Development.
Target Audience
- Ages 19-30, 31-65, 65+
Target Group
This course is designed for individuals who wish to enhance their knowledge of computer science and its various applications used in different fields of employment. It is designed for those who will have responsibility for planning, organizing, and directing technological operations. In all cases, the target group should be prepared to pursue substantial academic studies. Students must qualify for the course of study by entrance application. A prior computer science degree is not required; however, the course does assume technical aptitude, and it targets students with finance, engineering, or STEM training or professional experience.
Mode of attendance
Online/Blended Learning
Structure of the programme
Please note that this structure may be subject to change based on faculty expertise and evolving academic best practices. This flexibility ensures we can provide the most up-to-date and effective learning experience for our students.
The Master of Science in Computer Science combines asynchronous components (lecture videos, readings, and assignments) and synchronous meetings attended by students and a teacher during a video call. Asynchronous components support the schedule of students from diverse work-life situations, and synchronous meetings provide accountability and motivation for students. Students have direct access to their teacher and their peers at all times through the use of direct message and group chat; teachers are also able to initiate voice and video calls with students outside the regularly scheduled synchronous sessions. Modules are offered continuously on a publicly advertised schedule consisting of cohort sequences designed to accommodate adult students at different paces. Although there are few formal prerequisites identified throughout the programme, enrollment in courses depends on advisement from Woolf faculty and staff.
The degree has 3 tiers. The first tier is required for all students, who must take 15 ECTS. In the second tier, students must select 45 ECTS from elective tiers. Under the guidance of the Academic Staff at Woolf, students may either select exclusively from one specialization track (in which case they will earn that specialization), or they may mix tracks (in which case they will finish without a specialization). Tier Three may be completed in two different ways: a) by completing a 30 ECTS Advanced Applied Computer Science capstone project, or b) by completing a 10 ECTS Applied Computer Science project and 20 ECTS of electives from the program.
Grading System
Scale: 0-100 points
Components: 60% of the mark derives from the average of the assignments, and 40% of the mark derives from the cumulative examination
Passing requirement: minimum of 60% overall
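As a worked illustration of the weighting above, the overall mark can be computed as in the following minimal sketch. The function names and the sample scores are hypothetical, and the sketch assumes that the assignment average is an unweighted mean.

```python
def overall_mark(assignment_scores, exam_score):
    """Combine marks on the 0-100 scale: 60% assignment average, 40% exam."""
    assignment_avg = sum(assignment_scores) / len(assignment_scores)
    return 0.6 * assignment_avg + 0.4 * exam_score


def has_passed(mark, threshold=60.0):
    """The passing requirement is a minimum of 60% overall."""
    return mark >= threshold


# Hypothetical student: assignments average 70, cumulative examination 50.
mark = overall_mark([80, 70, 60], 50)
print(round(mark, 1), has_passed(mark))
```

In this example a weak examination result is offset by a stronger assignment average, so the student still clears the threshold.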
Dates of Next Intake
Rolling admission
Pass rates
2023 pass rates will be publicised in the next cycle, contingent upon ensuring sufficient student data for anonymization.
Identity Malta’s VISA requirement for third country nationals: https://www.identitymalta.com/unit/central-visa-unit/
This is a foundational and mandatory course which aims to build students’ ability to apply various algorithmic design methods to provide an optimal solution to computational problems. This course starts with time and space complexity analysis of divide and conquer algorithms using recursion-tree based methods and the Master theorem. Students would also learn about amortized time and space complexity analysis for randomized/probabilistic algorithms. Various algorithmic design strategies would be introduced via real world examples and problems. Students would learn when, where and how to optimally use Divide and Conquer, Dynamic programming (top-down and bottom-up), Greedy, Backtracking and Randomization strategies with examples. The module uses various practical examples from Array manipulations, Sorting, Searching, String manipulations, Tree & Graph traversals, Graph path-finding, Spanning Trees etc., to introduce the above algorithmic strategies in action. Students would implement many of the above algorithmic design methods from scratch as part of the assignments. The module also introduces how some of these popular algorithms are readily available via popular libraries in various programming languages.
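To make the difference between the two dynamic programming styles concrete, the sketch below solves one problem, the length of the longest common subsequence, both top-down and bottom-up. The problem choice and the function names are illustrative assumptions and are not taken from the course materials.

```python
from functools import lru_cache


def lcs_top_down(a: str, b: str) -> int:
    """Top-down: plain recursion, with memoization to avoid repeated work."""

    @lru_cache(maxsize=None)
    def solve(i: int, j: int) -> int:
        if i == len(a) or j == len(b):
            return 0
        if a[i] == b[j]:
            return 1 + solve(i + 1, j + 1)
        return max(solve(i + 1, j), solve(i, j + 1))

    return solve(0, 0)


def lcs_bottom_up(a: str, b: str) -> int:
    """Bottom-up: fill a table starting from the smallest subproblems."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table[0][0]


print(lcs_top_down("ABCBDAB", "BDCABA"), lcs_bottom_up("ABCBDAB", "BDCABA"))
```

Both versions run in time proportional to the product of the two string lengths. The top-down version is limited by recursion depth on long inputs, which is one of the trade-offs between the two styles.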
This course is aimed to build a strong foundational knowledge of data structures (DS) used extensively in computing. The module starts with introducing time and space complexity notations and estimation for code snippets. This helps students be able to make trade-offs between various Data Structures while solving real world computational problems. The module introduces most widely used basic data structures like Dynamic arrays, multi-dimensional arrays, Lists, Strings, Hash Tables, Binary Trees, Balanced Binary Trees, Priority Queues and Graphs. The module discusses multiple implementation variations for each of the above data-structures along with trade-offs in space and time for each implementation. In this course, students implement these data-structures from scratch to gain a solid understanding of their inner workings. Students are also introduced to how to use the built-in data-structures available in various programming languages/libraries like Python/NumPy/C++ STL/Java/JavaScript. Students solve real-world problems where they must use an optimal DS to solve a computational problem at hand.
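As a small example of using a built-in data structure rather than implementing one from scratch, the sketch below uses Python's standard `heapq` module as a priority queue. The helper name `drain_by_priority` and the sample tasks are hypothetical.

```python
import heapq


def drain_by_priority(items):
    """Return the labels of (priority, label) pairs, lowest priority number first."""
    heap = []  # a plain list that heapq maintains as a binary min-heap
    for priority, label in items:
        heapq.heappush(heap, (priority, label))  # O(log n) per insertion
    ordered = []
    while heap:
        _, label = heapq.heappop(heap)  # O(log n) per removal
        ordered.append(label)
    return ordered


print(drain_by_priority([(2, "write report"), (1, "fix outage"), (3, "refactor")]))
```

A sorted list would give the same order, but each insertion into a sorted list costs linear time, while the heap keeps insertions and removals logarithmic.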
Mathematics and computer science are closely related fields. Problems in computer science are often formalized and solved with mathematical methods. It is likely that many important problems currently facing computer scientists will be solved by researchers skilled in algebra, analysis, combinatorics, logic and/or probability theory, as well as computer science.
This course covers elementary discrete mathematics for computer science and engineering. Topics may include asymptotic notation and growth of functions; permutations and combinations; counting principles; discrete probability. Further selected topics may also be covered, such as recursive definition and structural induction; state machines and invariants; recurrences; generating functions.
Students will be able to explain and apply the basic methods of discrete (noncontinuous) mathematics in computer science. They will be able to use these methods in subsequent courses in the design and analysis of algorithms, computability theory, software engineering, and computer systems.
This course helps students translate advanced mathematical/statistical/scientific concepts into code. This is a module for writing code to solve real-world problems. It introduces programming concepts (such as control structures, recursion, classes and objects) assuming no prior programming knowledge, to make this course accessible to advanced professionals from scientific fields like Biology, Physics, Medicine, Chemistry, Civil & Mechanical Engineering etc. After building a strong foundation for converting scientific knowledge into programming concepts, the course advances to dive deeply into Object-Oriented Programming and its methodologies. We also learn when and how to use inbuilt-data structures like 1-Dimensional and 2-Dimensional Arrays. We introduce the concepts of computational complexity to help students write optimized code using appropriate data structures and algorithmic design methods.
The module can be taught to allow students to learn these concepts using a modern programming language such as Java or Python.
The course prepares students to handle advanced data structures and algorithm design methods in the separate module, ‘Data Structures’.
This is a core and foundational course which aims to equip the student with the ability to model, design, implement and query relational database systems for real-world data storage & processing needs. Students would start with diagrammatic tools (ER-diagram) to map a real world data storage problem into entities, relationships and keys. Then, they learn to translate the ER-diagram into a relational model with tables. SQL is then introduced as a de facto tool to create, modify, append, delete, query and manipulate data in a relational database. Due to SQL’s popularity, the course spends considerable time building the ability to write optimized and complex queries for various data manipulation tasks. The module exposes students to various real world SQL examples to build solid practical knowledge. Students then move on to understanding various trade-offs in modern relational databases like the ones between storage space and latency. Designing a database would need a solid understanding of normal forms to minimize data duplication, indexing for speedup and flattening tables to avoid complex joins in low-latency environments. These real-world database design strategies are discussed with practical examples from various domains. Most of this course uses the opensource MySQL database and cloud-hosted relational databases (like Amazon RDS) to help students apply the concepts learned on real databases via assignments.
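The modelling, indexing and querying steps described above can be sketched in a few lines. This example uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL so that it runs without a server; the table names, columns and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two entities and one relationship from an ER diagram become two tables
# linked by a foreign key. The index speeds up lookups by customer.
conn.executescript(
    """
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL
    );
    CREATE INDEX idx_orders_customer ON orders(customer_id);
    """
)
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 20.0), (1, 15.0), (2, 40.0)],
)

# Join the normalized tables and aggregate the spend per customer.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total DESC
    """
).fetchall()
print(rows)
```

Keeping customers and orders in separate tables avoids duplicating customer data on every order, at the cost of a join at query time. This is the kind of trade-off the course discusses.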
The ability to solve problems is a skill, and just like any other skill, the more one practices, the better one gets. So how exactly does one practice problem solving? Learning about different problem-solving strategies and when to use them will give a good start. Problem solving is a process. Most strategies provide steps that help you identify the problem and choose the best solution.
Building a toolbox of problem-solving strategies will improve problem solving skills. With practice, students will be able to recognize and choose among multiple strategies to find the most appropriate one to solve complex problems.
In this course we will introduce arrays and some of their real-world applications, such as prefix sum, carry forward, subarrays, and 2-dimensional matrices. We will also include industry-relevant problems and dive deeply into building their solutions with various approaches, recognizing the limitations of each (i.e., when to use a data structure and when not to use a data structure).
By the end of this course a student can come up with the best strategy which can optimize both time and space complexities by choosing the best data structure suitable for a given problem.
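The prefix sum technique mentioned above is a compact example of trading space for time. The following is a minimal sketch with hypothetical function names and sample values.

```python
from itertools import accumulate


def build_prefix(values):
    """prefix[i] holds the sum of the first i values, so prefix[0] is 0."""
    return [0] + list(accumulate(values))


def range_sum(prefix, left, right):
    """Sum of values[left..right], inclusive, in constant time."""
    return prefix[right + 1] - prefix[left]


values = [3, 1, 4, 1, 5]
prefix = build_prefix(values)
print(range_sum(prefix, 1, 3))
```

Building the prefix array costs linear time and linear extra space once, after which every range query is answered in constant time instead of by re-adding the subarray.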
This course aims to build the core competency of building real world end-to-end ML systems and deploy them into production for a variety of problems and scenarios. Students would learn a variety of ML systems ranging from high throughput and low latency internet scale systems to low compute power and energy constrained IoT devices like smart watches. Students will study the ML lifecycle and various components in detail. We also use real world ML platforms like Google’s KubeFlow, TensorFlow Lite, and Amazon’s SageMaker to implement real world systems and understand the engineering trade-offs and challenges. Students also learn relevant technologies and tools like Containerization (Docker) and Container Orchestration (Kubernetes) and Git which are often used extensively in real world scalable ML systems. This course is a hands-on course where we solve multiple real world cases and discuss solutions built by various companies and organizations to provide the students a comprehensive understanding of varied systems and design choices.
This is a course that focuses both on architectural design and practical hands-on learning of the most used cloud services. The module extensively uses Amazon Web services (AWS) to show real world code examples of various cloud services. It also covers the core concepts and architectures in a platform agnostic manner so that students can easily translate these learnings to other cloud platforms (like Azure, GCP etc.). The module starts with virtualization and how virtualized compute instances are created and configured. Students also learn how to auto-scale applications using load balancers and build fault tolerant applications across a geographically distributed cloud. As relational databases are widely used in most enterprises, students learn how to migrate and scale (both vertically and horizontally) these databases on the cloud while ensuring enterprise grade security. Virtual private clouds enable us to create a logically isolated virtual network of compute resources. Students learn to set up a VPC using virtualized-compute-servers on AWS. The course also covers the basics of networking while setting up a VPC. Students learn of the architecture and practical aspects of distributed object storage and how it enables low latency and high availability data storage on the cloud.
This course builds upon the introductory JavaScript course to acquaint students with popular and modern frameworks for building the front end. We focus on three very popular frameworks/libraries in use: React.js, jQuery and AngularJS. We start with React.js, one of the most popular and advanced of the three. Students learn about its components and data flow in order to architect real world front ends using React.js. This would be achieved via multiple code examples and code-walkthroughs from scratch. We would also dive into React Native, which is a cross-platform framework to build native mobile and smart-TV apps using JavaScript. This helps students to build applications for various platforms using only JavaScript. jQuery is one of the oldest and most widely used JavaScript libraries, which students cover in detail. Students specifically focus on how jQuery can simplify event handling, AJAX, HTML DOM tree manipulation and the creation of CSS animations. We also provide a hands-on introduction to AngularJS to architect model-view-controller (MVC) based dynamic web pages.
This is a foundational course on building server-side (or backend) applications using popular JavaScript runtime environments like Node.js. Students will learn event driven programming for building scalable backend for web applications. The module teaches various aspects of Node.js like setup, package manager, client-server programming and connecting to various databases and REST APIs. Most of these concepts would be covered in a hands-on manner with real world examples and applications built from scratch using Node.js on Linux servers. This course also provides an introduction to Linux server administration and scripting with special focus on web-development and networking. Students learn to use Linux monitoring tools (like Monit) to track the health of the servers. The module also provides an introduction to Express.js which is a popular light-weight framework for Node.js applications. Given the practical nature of this course, this would involve building actual website backends via assignments/projects for ecommerce, online learning and/or photo-sharing.
This course is a follow-up to Introduction to Problem-Solving Techniques: Part 1, and as part of their academic planning process with Woolf staff, students will ordinarily take that course first.
Part 2 deepens the approach to data structures by including such topics as stacks, queues, linked lists, and trees, and we will discuss in detail the real world applications of each and their comparative strengths and limitations (i.e., when to use a data structure and when not to use a data structure). This course will also include hashing techniques along with recursion and subset problems. This course will have rigorous homework and assignments as we introduce more than 4 data structures.
By the end of this course a student can come up with the best strategy which can optimize both time and space complexities by choosing the best data structure suitable for a given problem.
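As one illustration of matching a data structure to a problem, the sketch below uses a stack to check whether brackets are balanced. The problem choice and the function name are assumptions made for illustration.

```python
def is_balanced(text: str) -> bool:
    """Return True if every bracket in text is closed in the right order."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []  # a Python list used as a last-in, first-out stack
    for ch in text:
        if ch in pairs.values():
            stack.append(ch)
        elif ch in pairs:
            # A closing bracket must match the most recent opening bracket.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover opening brackets mean the text is unbalanced


print(is_balanced("{[()]}"), is_balanced("([)]"))
```

A stack fits because the most recently opened bracket must be the first one closed. A queue, which removes the oldest item first, would be the wrong choice here.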
In this module we will discuss general approaches to the construction of efficient solutions to problems.
Such methods are of interest because:
- They provide templates suited to solving a broad range of diverse problems.
- They can be translated into common control and data structures provided by most high-level languages.
- The temporal and spatial requirements of the algorithms which result can be precisely analyzed.
This course will provide a solid foundation and background to design and analysis of algorithms. In particular, upon successful completion of this course, students will be able to understand, explain and apply key algorithmic concepts and principles, which might include:
- Greedy algorithms (Activity Selection, 0-1 Knapsack Problem, Fractional Knapsack Problem)
- Dynamic programming (Longest Common Subsequence, 0-1 Knapsack Problem)
- Minimum Spanning Trees (Prim’s Algorithm, Kruskal’s Algorithm)
- Graph Algorithms (Dijkstra’s Shortest Path Algorithm, Bipartite Graphs, Minimum Vertex Cover)
Although more than one technique may be applicable to a specific problem, it is often the case that an algorithm constructed by one approach is clearly superior to equivalent solutions built using alternative techniques. This module will help students assess these choices.
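As an example of one of the graph algorithms listed above, Dijkstra's shortest path algorithm can be sketched with a binary heap. The adjacency-list format and the sample graph are hypothetical, and the sketch assumes that all edge weights are non-negative.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source; graph maps node -> [(neighbour, weight)]."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter path to node was already found
        for neighbour, weight in graph.get(node, []):
            new_d = d + weight
            if new_d < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_d
                heapq.heappush(heap, (new_d, neighbour))
    return dist


graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
print(dijkstra(graph, "A"))
```

The algorithm is greedy: it always settles the closest unsettled node next. That choice is only safe when no edge weight is negative, which is why the assumption matters.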
This course gives a detailed overview of how to approach Low Level Design problems through real-world case studies, such as designing a Pen (Mac/Windows), TicTacToe, BookMyShow (a popular event booking app that manages millions of users), an Email Campaign Management System, and a detailed design of Splitwise.
This course provides a practical understanding of popular object-oriented design patterns so that students can reuse design strategies developed for commonly occurring problems in software development. We begin the course with a revision of object-oriented programming and an overview of UML (Unified Modelling Language) diagrams to represent software design diagrammatically. We then dive into the 10-12 most popular design patterns, motivating each of them from real world scenarios. We would also showcase multiple open-source code bases which use a specific design pattern to solve a real-world design problem. This would help students gain an appreciation of how each of the theoretical patterns they learn actually translates to code. We also take up real world cases and dive into various design patterns that can be used to solve the problem. Sometimes, there could be multiple valid designs. We would dive into the pros and cons of each design decision and the trade-offs involved. Our objective is to build the problem-solving ability amongst students to recognize the appropriate design pattern to tackle a real-world problem. The module briefly discusses domain specific design patterns in their respective contexts.
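To show how a theoretical pattern translates to code, here is a minimal sketch of the Strategy pattern. The pattern is chosen for illustration only and is not confirmed as part of the course's list; the class and function names are hypothetical.

```python
from typing import Callable

# A strategy is any function that maps a base amount to a final price.
PricingStrategy = Callable[[float], float]


def no_discount(amount: float) -> float:
    return amount


def student_discount(amount: float) -> float:
    return amount * 0.8  # hypothetical 20% reduction


class Checkout:
    """Context object: delegates the pricing rule to an interchangeable strategy."""

    def __init__(self, strategy: PricingStrategy) -> None:
        self._strategy = strategy

    def total(self, amount: float) -> float:
        return self._strategy(amount)


print(Checkout(no_discount).total(100.0), Checkout(student_discount).total(100.0))
```

New pricing rules can be added without modifying `Checkout`. The alternative design, a chain of conditionals inside `total`, is simpler for two rules but harder to extend.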
Data is the fuel driving all major organisations. In this course, we help you understand how to process data at scale, from the fundamentals of distributed processing to designing data warehouses and writing ETL (Extract, Transform, Load) pipelines that process batch and streaming data. We will give you a comprehensive view of the complete Data Engineering lifecycle.
Every organisation builds products to solve the pain points of its customers. Product managers are a critical part of an organisation: they make sure that evolving customer needs and market trends are observed and converted into delightful solutions that help the business achieve its outcomes.
In this course, students will get a fundamental understanding of product management practices.
This will give them a comprehensive view of the complete product management life cycle.
This course helps students translate mathematical/statistical/scientific concepts into code. This is a foundational course for writing code to solve Data Science, ML and AI problems. It introduces basic programming concepts (like control structures, recursion, classes and objects) from scratch, assuming no prerequisites, to make this course accessible to students from non-computational scientific fields like Biology, Physics, Medicine, Chemistry, Civil & Mechanical Engineering etc. After building a strong foundation, the course advances to dive deep into core mathematical libraries like NumPy, SciPy and Pandas. Students also learn when and how to use built-in data structures like Lists, Dicts, Sets and Tuples. The module introduces the concepts of computational complexity to help students write optimized code using appropriate data structures and algorithmic design methods. The module does not dive deep into data structures and algorithm design methods; these are covered in the ‘Data Structures and Algorithms’ module. This course is valuable for all students specializing in mathematical sub-areas of CS like ML, Data Science, Scientific Computing etc.
This course introduces basic probability theory, statistical methods and computational algorithms to perform mathematically rigorous data analysis. The course starts with basic foundational concepts of random variables, histograms, and various plots (PMF, PDF and CDF). Students learn various popular discrete and continuous distributions like Bernoulli, Binomial, Poisson, Gaussian, Exponential, Pareto, log-normal etc., both mathematically and from an applicative perspective. Students learn various measures like mean, median, percentiles, quantiles, variance and interquartile-range. Students learn the pros and cons of each metric and understand when and how to use them in practice. Students will learn conditional probability and Bayes’ theorem in the applied context of real-world problems in medicine and healthcare. The module teaches the foundations of non-parametric statistics and applies them to solve problems using computational tools. Students learn various methods to determine correlations rigorously in data. This is followed by applied and mathematical understanding of the statistics underlying control-treatment (A/B) experiments and hypothesis testing. The module engages computational tools in modern statistics like Bootstrapping, Monte-Carlo methods, RANSAC etc.
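Bootstrapping, one of the computational tools named above, can be sketched with the standard library alone. This is a minimal percentile-bootstrap example; the function name, the default parameters and the sample data are hypothetical.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic of the sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(sample, k=len(sample))) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


data = [12.1, 9.8, 11.4, 10.9, 13.2, 8.7, 10.1, 11.9, 9.5, 12.6]
low, high = bootstrap_ci(data)
print(low, high)
```

The method makes no assumption about the shape of the underlying distribution, which is why it is a common companion to non-parametric statistics. Passing `stat=statistics.median` gives an interval for the median instead.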
This module focuses on representing statistical techniques in code, and may be conducted in Python, R, or another relevant language. Such languages are highly extensible and provide libraries that can handle a wide variety of statistical techniques like linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering and graphical techniques.
Learning to work in statistically-oriented programming language environments can equip you with the following skills among many others:
- An effective way of data handling (using arrays for example) and storing data in a structured manner.
- Expertise in diverse tools and libraries for Data Analysis.
- Ability to present complex data in a graphical and visual format for easy understanding of the data and further solutions.
This course teaches students how to analyse the ways users engage with a service. This method, called product analytics, helps businesses track and analyse user data. Students will learn more deeply what is required to move a product from idea to implementation, through to launch, and then on to iterative improvements. The course teaches how to measure progress, validate or update product hypotheses, and present product learnings.
Also, students will gain experience in making informed decisions, as well as how to present findings and make an analytics-informed business case to win support for a product.
This is a hands-on course on designing responsive, modern and light-weight UI for web, mobile and desktop applications using HTML5, CSS and Frameworks like Bootstrap 4. This course starts with an introduction on how web browsers, mobile apps and web servers work. We then dive into each of the nitty gritty details of HTML5 to build webpages. We would start with simple web pages and then graduate to more complex layouts and features in HTML like forms, iFrames, multimedia-playback and using web-APIs. We then go on to learn stylesheets based on CSS 4 and how browsers interpret CSS files to render web pages. Once again, we use multiple real world example web pages to learn the internals of CSS4. We learn popular good practices on writing responsive HTML and CSS code which is also interoperable on mobile browsers, apps and desktop apps. We would introduce students to building desktop apps using HTML and CSS using toolkits like Electron. We would also study popular frameworks for front end development like Bootstrap 4 which can speed up UI development significantly.
This course focuses on building basic classification and regression models and understanding these models rigorously both with a mathematical and an applicative focus. The module starts with a basic introduction to high dimensional geometry of points, distance-metrics, hyperplanes and hyperspheres. We build on top of this to introduce the mathematical formulation of logistic regression to find a separating hyperplane. Students learn to solve the optimization problem using vector calculus and gradient descent (GD) based algorithms. The module introduces computational variations of GD like mini-batch and stochastic gradient descent. Students also learn other popular classification and regression methods like k-Nearest Neighbours, Naive Bayes, Decision Trees, Linear Regression etc. Students also learn how each of these techniques behaves under various real world situations like the presence of outliers, imbalanced data, multi class classification etc. Students learn the bias and variance trade-off and various techniques to avoid overfitting and underfitting. Students also study these algorithms from a Bayesian viewpoint along with geometric intuition. This module is hands-on and students apply all these classical techniques to real world problems.
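The idea of finding a separating hyperplane with logistic regression and gradient descent can be sketched in plain Python. This is a minimal full-batch version on a tiny, hypothetical, linearly separable dataset; the function names, the learning rate and the epoch count are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(points, labels, lr=0.1, epochs=2000):
    """Full-batch gradient descent on the mean log-loss."""
    n_features = len(points[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(points, labels):
            # For log-loss, the gradient with respect to z is (prediction - label).
            error = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for k in range(n_features):
                grad_w[k] += error * x[k]
            grad_b += error
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Class 1 if the point lies on the positive side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0


points = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3), (3, 4)]
labels = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(points, labels)
print([predict(w, b, p) for p in points])
```

Replacing the inner loop over all points with a loop over one random point, or over a small random subset, gives the stochastic and mini-batch variants respectively.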
This course aims to help learners understand various techniques and algorithms to visualize, analyse and understand high dimensional data, which is very common in Data Science and ML. The module starts with linear algebraic methods like Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) for obtaining linear projections of high dimensional data. This is followed by more advanced nonlinear and state of the art techniques like t-SNE and UMAP for visualizing high dimensional data. Each of these techniques would be covered in full mathematical detail from first principles along with applying them to real world datasets in NLP, Genomics and internet-datasets. Students will also study how PCA and SVD are related to general Matrix Factorization techniques. To analyse and understand high dimensional un-labelled data, students learn clustering techniques like K-Means, Gaussian Mixture models, Hierarchical Clustering and DBSCAN. The module shows how some of the techniques are mathematically related to Matrix Factorization. Students study various outlier detection techniques based on density, proximity, factorization and cluster analysis.
This course introduces more advanced ML techniques like ensembles: bagging, boosting, cascading and stacking classifiers and regressors. It covers both the theoretical foundations and applicative details of these techniques along with popular implementations of boosting like LightGBM, CatBoost and XGBoost. Students also delve into kernel methods with specific focus on SVMs for classification and regression. Students will study state of the art model agnostic feature importance and model-interpretability techniques like LIME and SHAP. Students also study classical NLP based text encoding methods like Bag-of-words, TF-IDF etc. The module teaches various classical methods in time series analysis and forecasting like ARMA, ARIMA etc. Students also learn how to pose time series forecasting problems as regression and classification problems to leverage well studied ML techniques. This is followed by various domain and problem specific Feature engineering techniques that are often helpful in real world problem solving. Students will study methods like error analysis, ablative analysis etc., to debug and understand why and where a model is performing well and where it is not performing well. This will further help us in designing appropriate features. Students study model calibration techniques like Platt Scaling, Isotonic Regression etc. Later in this course, we cover how to build recommender systems using content-based and collaborative filtering methods. The module also teaches the detailed solution of the Netflix prize (2009) and various recent advances in RecSys.
This course provides an in-depth understanding of distributed systems for ML and Deep Learning using CPU, GPU and TPU clusters. It starts with the foundations of the MapReduce framework and the in-memory distributed and resilient data structures that form the backbone of Spark. Students will learn the architectural details of these distributed system platforms and how they can be leveraged to perform data analysis and model training on petabyte scale datasets. We cover how distributed training is achieved for popular ML algorithms on Spark by understanding the internal working of Spark MLlib. The module then focuses on understanding distributed graph processing using GraphX. Students move on to Deep-Learning algorithms and how distributed algorithms can be designed for them when we have GPU or TPU clusters at our disposal. We also dive deep into how TensorFlow achieves distributed computing for popular Deep Learning algorithms. Students will study distributed data stores and how they can be used for ML using popular datastore systems like Hive and SparkSQL. The module concludes by discussing state of the art distributed, low-latency approximate nearest neighbour algorithms along with their implementations in ElasticSearch.
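The map-reduce model that underlies these platforms can be illustrated on a single machine. The sketch below is not Spark or Hadoop code; it only mimics the map, shuffle and reduce phases in plain Python, with hypothetical function names, to show the data flow of a word count.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle(mapped))


print(word_count(["spark spark hive", "Hive graphx"]))
```

In a real cluster the map calls run on different machines and the shuffle moves data across the network, which is usually the most expensive step.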
This course provides a strong mathematical and applicative introduction to Deep Learning. The module starts with the perceptron model as an oversimplified approximation of a biological neuron. We motivate the need for a network of neurons and show how they can be connected to form a Multi-Layer Perceptron (MLP). This is followed by a rigorous treatment of the back-propagation algorithm from the 1980s and its limitations. Students study how modern deep learning took off with improved computational tools and data sets. We teach more modern activation units (like ReLU and SELU) and how they overcome problems with the more classical Sigmoid and Tanh units. Students learn weight initialization methods, regularization by dropout, batch normalization etc., to ensure that deep MLPs can be successfully trained. The module teaches variants of Gradient Descent that have been specifically designed to work well for deep learning systems, like Adam, AdaGrad, RMSProp etc. Students also learn AutoEncoders, VAEs and Word2Vec as unsupervised, encoding deep-learning architectures. We apply all of the foundational theory learned to various real-world problems using TensorFlow 2 and Keras. Students also learn how TensorFlow 2 works internally, with specific focus on computational graph processing.
This course provides a comprehensive overview of Computer Vision problems and how they can be tackled using various Convolutional Neural Networks (CNNs). Students start with classical image processing operations like edge detection, convolution, shape detectors and colour space conversions. This is followed by a foundational understanding of deep Convolutional Neural Networks and how their training and evaluation work. We introduce various CNN-specific layers like pooling layers and upsampling layers. We also introduce various Data Augmentation techniques that are very helpful for image-related problems. This is followed by a deep dive into the internals of popular CNN architectures like AlexNet, VGGNet, ResNet etc. Students also learn how to use these methods practically for transfer learning. Students will study how various computer-vision tasks like image segmentation, image generation, object detection and localization, contrastive learning etc., can be performed using state-of-the-art algorithms for each of these tasks. Most of these techniques will be studied directly from the original research papers and the open-source code provided by the authors. Students will also implement some of these algorithms from scratch in this course.
This course focuses on modelling sequences (text, music, time-series, genes) using deep-learning models. We start with a simple Recurrent Neural Network and its limitations with long sequences. Students learn LSTMs and GRUs, which can handle significantly longer sequences, to model sequence data like text, music, gene sequences and time-series data. We study variations of LSTM like bi-directional LSTMs and encoder-decoder architectures. This is followed by a detailed study of the attention mechanism and Transformer-based models, which are currently the state of the art for NLP and sequence modelling. The module teaches encoder-decoder Transformers, BERT, BERT variations, and the GPT-1, GPT-2 and GPT-3 models from architectural, mathematical and practical viewpoints. Students learn to implement many of these complex models from scratch (using TensorFlow 2 and Keras) to gain a deeper understanding of how they work internally. Students will study popular applications of deep learning in NLP like parts-of-speech tagging, question-answering systems, conversational engines (chatbots), low-latency semantic search etc. For each of these problems, students will study cutting-edge deep-learning models along with code implementations.
This course provides students with hands-on experience on deploying high velocity applications and services reliably on complex and distributed infrastructure. DevOps as a philosophy is a key driver of the modern software life cycle which prefers rapid and reliable delivery of functionality and features via code. We start with a solid introduction to Linux scripting and networking. Then, we learn popular methodologies to deploy complex and distributed software like microservices, containerization (Docker) and orchestration (Kubernetes). All of this would be introduced with real world examples from the industry. We also focus on Continuous Integration and Continuous Delivery (CI/CD) methodology and how it can be achieved using popular toolchains like Jenkins. We dive into how automated testing of software can be achieved using libraries like Selenium. This shall be followed by more advanced techniques like serverless-compute, Platform as a service model and Cloud-DevOps. Students would learn to monitor and log key data points to ensure they maintain a healthy system and adapt it as needed. Infrastructure-as-code is a key component of modern DevOps especially on cloud and containerized applications which would also be covered with real-world examples.
This is a hands-on course covering JavaScript from basics to advanced concepts in detail using multiple examples. We start with basic programming concepts like variables, control statements, loops, classes and objects. Students also learn basic data structures like Strings, Arrays and Dates. Students also learn to debug their code and handle errors gracefully. We learn popular style guides and good coding practices to build readable and reusable code which is also highly performant. We then learn how web browsers execute JavaScript code using the V8 engine as an example. We also cover concepts like JIT compilation, which helps JS code run faster. This is followed by slightly more advanced concepts like the DOM, async functions, Web APIs and AJAX, which are widely used in modern frontend development. We learn how to optimize JavaScript code to run on both mobile apps and mobile browsers, along with desktop browsers and as desktop apps via ElectronJS. Most of this course will be covered via real-world examples and by learning from the JS code of popular open-source websites and libraries.
This course provides a deep dive into more advanced concepts in server-side programming using Node.js to enable interactive, real-time and scalable web applications. We dive into threading and thread pools in Node.js and how they can be leveraged to build more responsive web apps. We learn socket programming using socket.io and Node.js for instant messaging, document collaboration, real-time analytics and streaming applications. Students also learn to use caching with distributed in-memory key-value stores (like Redis) to reduce latency while serving web apps. Students also learn how to use Node.js with popular NoSQL data stores like MongoDB for storing unstructured data. We also cover GraphQL, an open-source data query and manipulation language for APIs that has been gaining popularity recently. We learn popular protocols like OAuth to enable cross-platform logins. Students also learn the architecture and practical aspects of WebRTC to enable multimedia applications like video chat, live streaming, music streaming etc.
This course provides an in-depth architectural overview of, and hands-on experience with, building scalable data processing and distributed computing systems on various cloud platforms. We focus heavily on Spark, which is one of the most popular and powerful distributed systems for petabyte-scale data processing. We learn various components of the Spark ecosystem like HDFS, Resilient Distributed Datasets (RDDs) and programming models like MapReduce. Students also learn SparkSQL and Hive and how they can be used for querying large datastores. We focus on how various services in a cloud (like AWS) can be used together to build scalable data pipelines for both batch and near real-time processing. We show various examples of real-world systems and their architectures from various companies and organizations. We learn how GraphX can be used to process large graphs using Spark. Students use AWS Elastic Map Reduce (EMR) for cloud-based Spark clusters. We learn the design and architecture of distributed inverted indices and how they can be used for implementing search scalably. Students learn to use ElasticSearch, a very popular distributed inverted index, for implementing search functionality on websites and on unstructured data.
This core course equips the student with knowledge of database management systems, operating systems and computer networks. At the end of the course, students will have a critical understanding of the architecture of computers and networks, as well as how programs interact with these. Students begin with mapping data storage problems (as they had done in Relational Databases) to understand how data is stored in a distributed network, and related issues such as concurrency. Subsequently, students cover operating systems with an overview of process scheduling, process synchronisation and memory management techniques with disk scheduling. The module concludes with computer networks, where we discuss all of the computer network layers and their protocols in detail.
Low-Level Design & Design Patterns focuses on modularity and reusability in software design, common design vocabularies, refactoring and how to reduce the need for it, and how to incorporate design patterns into iterative development processes. The course pays significant attention to the interaction between system architecture and components, including data organisation.
The course begins with Object-Oriented Analysis (OOA), which is a problem-solving technique that includes: modelling an information design; representing behaviour; describing functions; dividing data, functional, and behavioural models to uncover detail; and moving from abstraction to implementation details. The course then turns to Object-Oriented Design (OOD), which reduces the analysis model into a modular design for software creation, with subsystems, components, and objects.
The iteration of analysis and implementation will be covered in detail with real-world industry examples.
This course aims to build a strong foundational knowledge of the data structures (DS) used extensively in computing. The module starts by introducing time and space complexity notations and how to estimate them for code snippets. This helps students make trade-offs between various data structures while solving real-world computational problems. The module introduces the most widely used basic data structures like Dynamic arrays, multi-dimensional arrays, Lists, Strings, Hash Tables, Binary Trees, Balanced Binary Trees, Priority Queues and Graphs. The module discusses multiple implementation variations for each of the above data structures along with trade-offs in space and time for each implementation. In this course, students implement these data structures from scratch to gain a solid understanding of their inner workings. Students are also introduced to the built-in data structures available in various programming languages/libraries like Python/NumPy/C++ STL/Java/JavaScript. Students solve real-world problems where they must use an optimal DS to solve the computational problem at hand.
This is a foundational and mandatory course which aims to build students' ability to apply various algorithmic design methods to provide an optimal solution to computational problems. This course starts with time and space complexity analysis of divide and conquer algorithms using recursion-tree based methods and the Master theorem. Students will also learn about amortized time and space complexity analysis for randomized/probabilistic algorithms. Various algorithmic design strategies will be introduced via real-world examples and problems. Students will learn when, where and how to optimally use Divide and Conquer, Dynamic programming (top-down and bottom-up), Greedy, Backtracking and Randomization strategies with examples. The module uses various practical examples from Array manipulations, Sorting, Searching, String manipulations, Tree & Graph traversals, Graph path-finding, Spanning Trees etc., to introduce the above algorithmic strategies in action. Students will implement many of the above algorithmic design methods from scratch as part of the assignments. The module also introduces how some of these popular algorithms are readily available via popular libraries in various programming languages.
This course builds on foundational AWS knowledge, diving deeper into the platform's sophisticated features and services. Students will explore advanced networking configurations, security and compliance measures, and serverless architectures. Emphasis will be placed on practical applications, allowing students to design and implement complex AWS architectures, automate infrastructure management with Infrastructure as Code (IaC) tools, and optimize costs for scalable solutions.
In addition to technical skills, the course covers advanced topics in containerization, orchestration with AWS services, and the development of continuous integration/continuous deployment (CI/CD) pipelines. Students will gain hands-on experience through labs and projects that simulate real-world scenarios, ensuring they can effectively deploy, manage, and scale applications on AWS. By the end of the course, students will be proficient in leveraging AWS's full potential to meet specific business requirements, ensuring security, compliance, and cost-efficiency in cloud environments.
This course provides a comprehensive overview of Amazon Web Services (AWS), focusing on core services and best practices for building and managing cloud-based infrastructure. Students will learn the fundamentals of cloud computing, explore key AWS services such as EC2, S3, RDS, and VPC, and gain hands-on experience in deploying and managing applications on the AWS platform. The course emphasizes practical skills, enabling students to design scalable, secure, and cost-effective cloud solutions.
In addition to foundational AWS services, the course covers essential topics such as identity and access management (IAM), networking and security configurations, and monitoring and logging with CloudWatch. Students will also be introduced to Infrastructure as Code (IaC) using AWS CloudFormation and gain insights into setting up continuous integration/continuous deployment (CI/CD) pipelines with AWS tools. By the end of the course, students will have a solid understanding of AWS basics, equipping them with the knowledge and skills to effectively utilize AWS services in their DevOps practices and prepare for more advanced AWS coursework.
DevOps Tools Part 1 is a comprehensive course designed for students pursuing a Master of Science in Computer Science with a specialization in DevOps. This course introduces the essential tools and methodologies that form the backbone of modern DevOps practices. Students will gain a solid foundation in version control with Git, continuous integration/continuous deployment (CI/CD) pipelines using Jenkins, and configuration management with Ansible. The course emphasizes hands-on learning, enabling students to set up, configure, and utilize these tools in real-world scenarios, ensuring they can effectively collaborate, automate workflows, and streamline the development process.
In addition to core tools, the course covers containerization with Docker and orchestration with Kubernetes, providing students with the skills to deploy and manage applications in a microservices architecture. Students will also explore monitoring and logging solutions such as Prometheus and ELK Stack to maintain system reliability and performance. By the end of the course, students will be proficient in employing a wide range of DevOps tools, laying a strong foundation for advanced DevOps practices and tools covered in subsequent courses.
This core foundational course equips students with knowledge of Database Management Systems (DBMS) and Computer Networks.
The course starts with Entity-Relationship (ER) diagrams, a visual tool for mapping real-world data storage problems. Students learn to translate ER diagrams into a relational model with tables. SQL, the standard language for relational databases, is then introduced. Students will spend significant time building proficiency in writing optimized and complex SQL queries for various data manipulation tasks. Real-world examples will be used to solidify practical knowledge.
Next, the course explores trade-offs in modern relational databases, such as storage space versus latency. Designing efficient databases requires understanding normal forms to minimize data duplication, indexing for speed improvements, and flattening tables to avoid complex joins in low-latency environments. These real-world database design strategies are discussed with practical examples.
The course utilizes open-source MySQL databases and cloud-hosted relational databases (like Amazon RDS) for assignments, allowing students to apply learned concepts on real databases.
Following the DBMS section, the course transitions to Computer Networks. Here, students will delve into foundational concepts like the OSI model, TCP/IP model, TCP/UDP protocols, subnetting, DNS (Domain Name System), Network Address Translation (NAT), private networks, Secure Sockets Layer (SSL), and network security principles.
This module is designed to deepen students' understanding of advanced algorithmic techniques and problem-solving strategies. Building on their existing knowledge of dynamic programming (DP) and graph algorithms, students will explore more sophisticated concepts and applications.
The module begins with a revision of recursion techniques, enabling students to solve complex recursive problems more efficiently. They will refine their DP skills by implementing more advanced bottom-up and top-down approaches, crucial for optimizing real-world solutions.
Students will gain expertise in mathematical algorithms, equipping them to handle intricate problems involving factorials, modular arithmetic, and large power computations. They will develop strategies to solve complex puzzles and optimization problems through backtracking.
The course will introduce the Trie data structure, empowering students to efficiently manage and manipulate large sets of data. They will enhance their ability to handle text processing tasks by mastering advanced string pattern matching algorithms.
Additionally, students will advance their skills in graph algorithms, exploring concepts such as Disjoint Set Union (DSU), graph coloring, and shortest path algorithms like Bellman-Ford and Floyd-Warshall. These advanced topics will prepare them for tackling complex algorithmic challenges often encountered in technical interviews.
Throughout the course, students will engage in practical assignments and real-world examples to solidify their understanding. By the end of the module, they will be well-equipped with the knowledge and skills to confidently approach a wide range of advanced problems, ensuring their readiness for technical interviews and advanced problem-solving scenarios.
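As a small illustration of the "large power computations" topic named in this module, the sketch below shows binary (square-and-multiply) exponentiation under a modulus. The function name mod_pow is hypothetical and is not taken from the course material; it assumes a non-negative integer exponent and a positive modulus. Python's built-in pow(base, exponent, modulus) computes the same value.

```python
def mod_pow(base, exponent, modulus):
    """Compute (base ** exponent) % modulus in O(log exponent) multiplications.

    Illustrative helper (hypothetical name); assumes exponent >= 0 and modulus > 0.
    """
    if modulus == 1:
        return 0  # every integer is congruent to 0 modulo 1
    result = 1
    base %= modulus
    while exponent > 0:
        # If the lowest bit of the exponent is set, fold the current power in.
        if exponent & 1:
            result = (result * base) % modulus
        # Square the base and move on to the next bit of the exponent.
        base = (base * base) % modulus
        exponent >>= 1
    return result


if __name__ == "__main__":
    # 2**10 = 1024, and 1024 % 1000 = 24
    print(mod_pow(2, 10, 1000))
```

Reducing after every multiplication keeps intermediate values smaller than the square of the modulus, which is why this approach stays fast even for very large exponents.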
DevOps Tools Part 2 is an advanced course designed for students building on the foundational knowledge from DevOps Tools Part 1. This course delves deeper into advanced DevOps tools and techniques that drive modern software development and operations. Students will explore sophisticated CI/CD pipelines with tools like GitLab CI/CD and Azure DevOps, and master infrastructure as code (IaC) with Terraform and AWS CloudFormation. The course emphasizes practical, hands-on experience, enabling students to automate and manage complex cloud environments effectively.
In addition to advanced CI/CD and IaC, students will learn about service mesh architectures with Istio, advanced container orchestration with Kubernetes, and continuous monitoring and observability with tools such as Grafana and Jaeger. The course also covers security practices in DevOps, including integrating security tools into the CI/CD pipeline and managing secrets with HashiCorp Vault. By the end of the course, students will have the expertise to design, implement, and manage scalable, secure, and efficient DevOps workflows, preparing them for leadership roles in the field.
This fundamental course aims to equip students with knowledge of core Linux operating system concepts. The module will cover essential operating system functionalities, including process management, process synchronization (concurrency), memory management techniques, and disk scheduling.
In process management, students will explore how Linux manages processes throughout their lifecycle, including starting, pausing, resuming, and allocating resources. The concept of concurrency will be introduced, covering multithreading and multiprocessing. Students will also learn how asynchronous processing facilitates concurrent execution.
Memory management is another key topic. The course will delve into how the operating system manages memory on both RAM and hard disk, along with concepts like virtual memory, memory pages, and page caching.
This course equips learners with the essential skills to navigate and manage Linux servers effectively. The course begins by demystifying the Linux command line interface, covering the fundamentals students need to interact with the system. As learners progress, they will explore powerful command-line utilities like grep, sed, awk, and more, allowing them to manipulate and analyze data efficiently.
Next, they will delve into the Linux file system structure, gaining a solid understanding of user permissions and access control. This knowledge is crucial for ensuring secure and organized server environments.
Finally, the course empowers students to write their own shell scripts. They will learn how to craft conditional statements (if/else) and loops (for loops) to automate repetitive tasks and streamline their workflow. By the end, they will be able to write scripts to solve real-world problems and significantly boost their Linux server administration skills.
Structured Query Language (SQL) is key to working with data in relational databases, a task at the core of data science and analytics. In this course, students will learn all the major keywords and clauses used to extract data, best practices for formatting SQL queries, and how to generate meaningful insights from the results.
The focus is at all times on real-world uses of SQL queries, syntax, and expression, to allow students to begin professional-level work as quickly as possible.
This course focuses on building basic classification and regression models and understanding these models rigorously both with a mathematical and an applicative focus. It opens with a basic introduction to high dimensional geometry of points, distance-metrics, hyperplanes and hyperspheres. Then, it introduces the mathematical formulation of logistic regression to find a separating hyperplane. Vector calculus and gradient descent (GD)-based algorithms are explored to learn to solve the optimization problem, including computational variations of GD like mini-batch and stochastic gradient descent. The course also covers other popular classification and regression methods like k-Nearest Neighbours, Naive Bayes, Decision Trees, Linear Regression etc, to show how each of these techniques performs under various real-world situations like the presence of outliers, imbalanced data, multi class classification etc. Lectures on bias and variance tradeoff and various techniques to avoid overfitting and underfitting are incorporated. Algorithms are taught from a Bayesian viewpoint along with geometric intuition. This course would be heavily hands-on where students apply all these classical techniques to real world problems.
This course aims to build a strong foundational knowledge of the Data Analytics used extensively in the Data Science field. Tableau is a powerful data visualisation tool used in the business analytics industry to process and visualise raw business data in a presentable and understandable format. Tableau is widely used by data analytics departments and data analytics companies in various fields for its ease of use and efficiency. Tableau uses relational databases, Online Analytical Processing cubes, spreadsheets and cloud databases to generate graphical visualisations. The course starts with visualisations and moves to an in-depth look at the different chart and graph functions, calculations, mapping and other functionality. Students will be taught quick table calculations, reference lines, different types of visualisations, bands and distributions, parameters, motion charts, trends and forecasting, formatting, stories, performance recording and advanced mapping.
At the end of this course, students will be prepared, if they desire, to earn industry desktop certifications as a Tableau Desktop Specialist, a Tableau Certified Associate, or a Tableau Certified Professional.
This is a project-based course, with the aim of building the required skills for creating web-based software systems. The course covers the entire lifecycle of building software projects, from requirement gathering and scope definition from a product document, to designing the architecture of the system, and all the way to delivery and maintenance of the software system.
The course covers both the frontend, which means building browser-based interfaces for users with frontend web frameworks, and the backend, which is the server that runs an API to serve information to the frontend and uses an SQL or similar database management system for storage.
All aspects of delivering a software project, including security, user authentication and authorisation, monitoring and analytics, and maintaining the project are covered. The course also covers the aspects of project maintenance, like using a version control system, setting up continuous integration and deployment pipelines and bug trackers.
Mathematics and computer science are closely related fields. Problems in computer science are often formalized and solved with mathematical methods. It is likely that many important problems currently facing computer scientists will be solved by researchers skilled in algebra, analysis, combinatorics, logic and/or probability theory, as well as computer science.
This course covers elementary discrete mathematics for computer science and engineering. Topics may include asymptotic notation and growth of functions; permutations and combinations; counting principles; discrete probability. Further selected topics may also be covered, such as recursive definition and structural induction; state machines and invariants; recurrences; generating functions.
Students will be able to explain and apply the basic methods of discrete (noncontinuous) mathematics in computer science. They will be able to use these methods in subsequent courses in the design and analysis of algorithms, computability theory, software engineering, and computer systems.
This course helps students translate advanced mathematical/statistical/scientific concepts into code. This is a module for writing code to solve real-world problems. It introduces programming concepts (such as control structures, recursion, classes and objects) assuming no prior programming knowledge, to make this course accessible to advanced professionals from scientific fields like Biology, Physics, Medicine, Chemistry, Civil & Mechanical Engineering etc. After building a strong foundation for converting scientific knowledge into programming concepts, the course advances to dive deeply into Object-Oriented Programming and its methodologies. We also learn when and how to use built-in data structures like 1-dimensional and 2-dimensional arrays. We introduce the concepts of computational complexity to help students write optimized code using appropriate data structures and algorithmic design methods.
The module can be taught to allow students to learn these concepts using a modern programming language such as Java or Python.
The course prepares students to handle advanced data structures and algorithm design methods in the separate module, ‘Data Structures’.
This is a core and foundational course which aims to equip the student with the ability to model, design, implement and query relational database systems for real-world data storage and processing needs. Students start with diagrammatic tools (ER diagrams) to map a real-world data storage problem into entities, relationships and keys. Then, they learn to translate the ER diagram into a relational model with tables. SQL is then introduced as the de facto tool to create, modify, append, delete, query and manipulate data in a relational database. Due to SQL's popularity, the course spends considerable time building the ability to write optimized and complex queries for various data manipulation tasks. The module exposes students to various real-world SQL examples to build solid practical knowledge. Students then move on to understanding various trade-offs in modern relational databases, like the one between storage space and latency. Designing a database requires a solid understanding of normal forms to minimize data duplication, indexing for speedup, and flattening tables to avoid complex joins in low-latency environments. These real-world database design strategies are discussed with practical examples from various domains. Most of this course uses the open-source MySQL database and cloud-hosted relational databases (like Amazon RDS) to help students apply the concepts learned on real databases via assignments.
The ability to solve problems is a skill, and just like any other skill, the more one practices, the better one gets. So how exactly does one practice problem solving? Learning about different problem-solving strategies and when to use them will give a good start. Problem solving is a process. Most strategies provide steps that help you identify the problem and choose the best solution.
Building a toolbox of problem-solving strategies will improve problem solving skills. With practice, students will be able to recognize and choose among multiple strategies to find the most appropriate one to solve complex problems.
In this course we will introduce arrays and some of their real-world applications, together with techniques such as prefix sums, carry forward, subarrays, and 2-dimensional matrices. We will also include industry-relevant problems and dive deeply into building their solutions with various approaches, recognizing the limitations of each (i.e., when to use a data structure and when not to).
By the end of this course a student can come up with the best strategy which can optimize both time and space complexities by choosing the best data structure suitable for a given problem.
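To make the prefix sum technique mentioned above concrete, here is a minimal sketch. The function names and the sample data are illustrative assumptions, not course material.

```python
from itertools import accumulate


def build_prefix_sums(values):
    """Return a list where prefix[i] is the sum of values[:i]."""
    return [0] + list(accumulate(values))


def range_sum(prefix, left, right):
    """Sum of values[left..right] (inclusive) in O(1) time."""
    return prefix[right + 1] - prefix[left]


# Hypothetical data: units sold on eight consecutive days.
daily_sales = [3, 1, 4, 1, 5, 9, 2, 6]
prefix = build_prefix_sums(daily_sales)

# Days 2 through 5 inclusive: 4 + 1 + 5 + 9
print(range_sum(prefix, 2, 5))
```

Building the prefix array takes O(n) time and O(n) extra space, after which every range query is O(1). This is the kind of time versus space trade-off the course asks students to weigh.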
This course aims to build the core competency of building real world end-to-end ML systems and deploy them into production for a variety of problems and scenarios. Students would learn a variety of ML systems ranging from high throughput and low latency internet scale systems to low compute power and energy constrained IoT devices like smart watches. Students will study the ML lifecycle and various components in detail. We also use real world ML platforms like Google’s KubeFlow, TensorFlow Lite, and Amazon’s SageMaker to implement real world systems and understand the engineering trade-offs and challenges. Students also learn relevant technologies and tools like Containerization (Docker) and Container Orchestration (Kubernetes) and Git which are often used extensively in real world scalable ML systems. This course is a hands-on course where we solve multiple real world cases and discuss solutions built by various companies and organizations to provide the students a comprehensive understanding of varied systems and design choices.
This is a course that focuses both on architectural design and practical hands-on learning of the most used cloud services. The module extensively uses Amazon Web services (AWS) to show real world code examples of various cloud services. It also covers the core concepts and architectures in a platform agnostic manner so that students can easily translate these learnings to other cloud platforms (like Azure, GCP etc.). The module starts with virtualization and how virtualized compute instances are created and configured. Students also learn how to auto-scale applications using load balancers and build fault tolerant applications across a geographically distributed cloud. As relational databases are widely used in most enterprises, students learn how to migrate and scale (both vertically and horizontally) these databases on the cloud while ensuring enterprise grade security. Virtual private clouds enable us to create a logically isolated virtual network of compute resources. Students learn to set up a VPC using virtualized-compute-servers on AWS. The course also covers the basics of networking while setting up a VPC. Students learn of the architecture and practical aspects of distributed object storage and how it enables low latency and high availability data storage on the cloud.
This course builds upon the introductory JavaScript course to acquaint students with popular and modern frameworks for building the front end. We focus on three very popular frameworks/libraries in use: React.js, jQuery and AngularJS. We start with React.js, one of the most popular and advanced amongst the three. Students learn about components and data flow in order to architect real-world front ends using React.js. This is achieved via multiple code examples and code walkthroughs from scratch. We also dive into React Native, which is a cross-platform framework to build native mobile and smart-TV apps using JavaScript. This helps students to build applications for various platforms using only JavaScript. jQuery is one of the oldest and most widely used JavaScript libraries, which students cover in detail. Students specifically focus on how jQuery can simplify event handling, AJAX and HTML DOM tree manipulation, and create CSS animations. We also provide a hands-on introduction to AngularJS to architect model-view-controller (MVC) based dynamic web pages.
This is a foundational course on building server-side (or backend) applications using popular JavaScript runtime environments like Node.js. Students will learn event driven programming for building scalable backend for web applications. The module teaches various aspects of Node.js like setup, package manager, client-server programming and connecting to various databases and REST APIs. Most of these concepts would be covered in a hands-on manner with real world examples and applications built from scratch using Node.js on Linux servers. This course also provides an introduction to Linux server administration and scripting with special focus on web-development and networking. Students learn to use Linux monitoring tools (like Monit) to track the health of the servers. The module also provides an introduction to Express.js which is a popular light-weight framework for Node.js applications. Given the practical nature of this course, this would involve building actual website backends via assignments/projects for ecommerce, online learning and/or photo-sharing.
This course is a follow-up to Introduction to Problem-Solving Techniques: Part 1, and as part of their academic planning process with Woolf staff, students will ordinarily take that course first.
Part 2 deepens the approach to data structures by including such topics as stacks, queues, linked lists, and trees, and we will discuss in detail real-world applications of each approach and their comparative strengths and limitations (i.e., when to use a data structure and when not to). This course will also include hashing techniques along with recursion and subset problems. This course will have rigorous homework and assignments as we introduce more than 4 data structures.
By the end of this course a student can come up with the best strategy which can optimize both time and space complexities by choosing the best data structure suitable for a given problem.
In this module we will discuss general approaches to the construction of efficient solutions to problems.
Such methods are of interest because:
They provide templates suited to solving a broad range of diverse problems.
They can be translated into common control and data structures provided by most high-level languages.
The temporal and spatial requirements of the algorithms which result can be precisely analyzed.
This course will provide a solid foundation and background to design and analysis of algorithms. In particular, upon successful completion of this course, students will be able to understand, explain and apply key algorithmic concepts and principles, which might include:
Greedy algorithms (Activity Selection, 0-1 Knapsack Problem, Fractional Knapsack Problem)
Dynamic programming (Longest Common Subsequence, 0-1 Knapsack Problem)
Minimum Spanning Trees (Prim’s Algorithm, Kruskal’s Algorithm)
Graph Algorithms (Dijkstra’s Shortest Path Algorithm, Bipartite Graphs, Minimum Vertex Cover)
Although more than one technique may be applicable to a specific problem, it is often the case that an algorithm constructed by one approach is clearly superior to equivalent solutions built using alternative techniques. This module will help students assess these choices.
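The point about one technique being clearly superior can be sketched with the knapsack problems listed above. This is an illustrative sketch only: the function names and the item values are assumptions chosen for the example.

```python
def fractional_knapsack(items, capacity):
    """Greedy: take items in order of value/weight ratio, splitting the last one."""
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)
        total += value * take / weight
        capacity -= take
    return total


def knapsack_01_dp(items, capacity):
    """Dynamic programming: best[c] is the best value achievable with capacity c."""
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


def knapsack_01_greedy(items, capacity):
    """The same ratio-based greedy rule applied to whole items (not always optimal)."""
    total = 0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= capacity:
            total += value
            capacity -= weight
    return total


# Hypothetical (value, weight) pairs and knapsack capacity.
items = [(60, 10), (100, 20), (120, 30)]
capacity = 50

print(fractional_knapsack(items, capacity))  # greedy is optimal when items can be split
print(knapsack_01_dp(items, capacity))       # DP finds the optimum for whole items
print(knapsack_01_greedy(items, capacity))   # greedy falls short for whole items
```

On this input the greedy rule is optimal for the fractional problem but returns 160 for the 0-1 problem, where dynamic programming finds 220.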
This course gives a detailed overview of how to approach Low-Level Design problems through real-world case studies such as designing a Pen (Mac/Windows), TicTacToe, BookMyShow (a widely used event booking app that manages millions of users), an Email Campaign Management System, and a detailed design of Splitwise.
This course provides a practical understanding of popular object-oriented design patterns so that students can reuse design strategies developed for commonly occurring problems in software development. We begin the course with a revision of object-oriented programming and an overview of UML (Unified Modelling Language) diagrams to represent software design diagrammatically. We then dive into the 10-12 most popular design patterns, motivating each of them from real-world scenarios. We also showcase multiple open-source code bases which use a specific design pattern to solve a real-world design problem. This helps students gain an appreciation of how each of the theoretical patterns they learn actually translates to code. We also take up real-world cases and dive into various design patterns that can be used to solve the problem. Sometimes, there could be multiple valid designs. We dive into the pros and cons of each design decision and the trade-offs involved. Our objective is to build the problem-solving ability amongst students to recognize the appropriate design pattern to tackle a real-world problem. The module briefly discusses domain-specific design patterns in their respective contexts.
Data is the fuel driving all major organisations. In this course, we help you understand how to process data at scale.
The course ranges from the fundamentals of distributed processing to designing data warehouses and writing ETL (Extract, Transform, Load) pipelines that process batch and streaming data.
We will give you a comprehensive view of the complete Data Engineering lifecycle.
Every organisation builds products to solve the pain points of its customers. Product managers are a critical part of an organisation: they make sure that evolving customer needs and market trends are observed and converted into delightful solutions which help the business achieve its outcomes.
In this course, students will get a fundamental understanding of product management practices.
This will give them a comprehensive view of the complete product management life cycle.
This course helps students translate mathematical/statistical/scientific concepts into code. This is a foundational course for writing code to solve Data Science, ML and AI problems. It introduces basic programming concepts (like control structures, recursion, classes and objects) from scratch, assuming no prerequisites, to make this course accessible to students from non-computational scientific fields like Biology, Physics, Medicine, Chemistry, Civil & Mechanical Engineering etc. After building a strong foundation, the course advances to dive deep into core mathematical libraries like NumPy, SciPy and Pandas. Students also learn when and how to use built-in data structures like Lists, Dicts, Sets and Tuples. The module introduces the concepts of computational complexity to help students write optimized code using appropriate data structures and algorithmic design methods. The module does not dive deep into data structures and algorithm design methods - that is available in the ‘Data Structures and Algorithms’ module. This course is valuable for all students specializing in mathematical sub-areas of CS like ML, Data Science, Scientific Computing etc.
This course introduces basic probability theory, statistical methods and computational algorithms to perform mathematically rigorous data analysis. The course starts with basic foundational concepts of random variables, histograms, and various plots (PMF, PDF and CDF). Students learn various popular discrete and continuous distributions like Bernoulli, Binomial, Poisson, Gaussian, Exponential, Pareto, log-normal etc., both mathematically and from an applicative perspective. Students learn various measures like mean, median, percentiles, quantiles, variance and interquartile range. Students learn the pros and cons of each metric and understand when and how to use them in practice. Students will learn conditional probability and Bayes’ theorem in the applied context of real-world problems in medicine and healthcare. The module teaches the foundations of non-parametric statistics and applies them to solve problems using computational tools. Students learn various methods to determine correlations rigorously in data. This is followed by an applied and mathematical understanding of the statistics underlying control-treatment (A/B) experiments and hypothesis testing. The module engages computational tools in modern statistics like bootstrapping, Monte Carlo methods, RANSAC etc.
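As a small illustration of Bayes' theorem in a healthcare setting, the sketch below computes the probability of disease given a positive test. The function name and all numbers (prevalence, sensitivity, specificity) are hypothetical values chosen for the example.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem.

    prevalence  = P(disease)
    sensitivity = P(positive | disease)
    specificity = P(negative | no disease)
    """
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical screening test: 1% prevalence, 95% sensitivity, 90% specificity.
posterior = posterior_given_positive(0.01, 0.95, 0.90)
print(f"P(disease | positive) = {posterior:.4f}")
```

With these assumed numbers the posterior is below 9%, even though the test is 95% sensitive, because false positives from the much larger healthy population dominate.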
This module focuses on representing statistical techniques in code, and may be conducted in Python, R, or another relevant language. Such languages are highly extensible and provide libraries that can handle a wide variety of statistical techniques like linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering and graphical techniques.
Learning to work in statistically-oriented programming language environments can equip you with the following skills among many others:
An effective way of data handling (using arrays for example) and storing data in a structured manner.
Expertise in diverse tools and libraries for Data Analysis
Ability to present complex data in a graphical and visual format for easy understanding of the data and further solutions.
This course teaches students how to analyse the ways users engage with a service. This method, called product analytics, helps businesses track and analyse user data. Students will learn more deeply what is required to move a product from idea to implementation, through to launch, and then on to iterative improvements. The course teaches how to measure progress, validate or update product hypotheses, and present product learnings.
Also, students will gain experience in making informed decisions, as well as how to present findings and make an analytics-informed business case to win support for a product.
This is a hands-on course on designing responsive, modern and light-weight UI for web, mobile and desktop applications using HTML5, CSS and Frameworks like Bootstrap 4. This course starts with an introduction on how web browsers, mobile apps and web servers work. We then dive into each of the nitty gritty details of HTML5 to build webpages. We would start with simple web pages and then graduate to more complex layouts and features in HTML like forms, iFrames, multimedia-playback and using web-APIs. We then go on to learn stylesheets based on CSS 4 and how browsers interpret CSS files to render web pages. Once again, we use multiple real world example web pages to learn the internals of CSS4. We learn popular good practices on writing responsive HTML and CSS code which is also interoperable on mobile browsers, apps and desktop apps. We would introduce students to building desktop apps using HTML and CSS using toolkits like Electron. We would also study popular frameworks for front end development like Bootstrap 4 which can speed up UI development significantly.
This course focuses on building basic classification and regression models and understanding these models rigorously both with a mathematical and an applicative focus. The module starts with a basic introduction to high dimensional geometry of points, distance metrics, hyperplanes and hyperspheres. We build on top of this to introduce the mathematical formulation of logistic regression to find a separating hyperplane. Students learn to solve the optimization problem using vector calculus and gradient descent (GD) based algorithms. The module introduces computational variations of GD like mini-batch and stochastic gradient descent. Students also learn other popular classification and regression methods like k-Nearest Neighbours, Naive Bayes, Decision Trees, Linear Regression etc. Students also learn how each of these techniques performs under various real-world situations like the presence of outliers, imbalanced data, multi-class classification etc. Students learn the bias-variance trade-off and various techniques to avoid overfitting and underfitting. Students also study these algorithms from a Bayesian viewpoint along with geometric intuition. This module is hands-on and students apply all these classical techniques to real-world problems.
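The idea of finding a separating hyperplane with logistic regression and gradient descent can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the toy data, learning rate and epoch count are hypothetical, and full-batch gradient descent is used rather than the mini-batch or stochastic variants.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(points, labels, learning_rate=0.1, epochs=1000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n_features = len(points[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(points, labels):
            # The gradient of the log loss with respect to z is (prediction - label).
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Classify by which side of the hyperplane w.x + b = 0 the point lies on."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else 0


# Hypothetical, linearly separable toy data in two dimensions.
points = [(-2.0, -1.0), (-1.0, -2.5), (-1.5, -0.5), (1.0, 2.0), (2.5, 1.0), (0.5, 1.5)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(points, labels)
print([predict(weights, bias, p) for p in points])
```

Replacing the inner loop over all points with a loop over a random subset per step would give the mini-batch variant mentioned above.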
This course aims to help learners understand various techniques and algorithms to visualize, analyse and understand high dimensional data, which is very common in Data Science and ML. The module starts with linear algebraic methods like Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) for obtaining linear projections of high dimensional data. This is followed by more advanced nonlinear and state-of-the-art techniques like t-SNE and UMAP for visualizing high dimensional data. Each of these techniques is covered in full mathematical detail from first principles along with applying them to real-world datasets in NLP, Genomics and internet datasets. Students will also study how PCA and SVD are related to general Matrix Factorization techniques. To analyse and understand high dimensional unlabelled data, students learn clustering techniques like K-Means, Gaussian Mixture Models, Hierarchical Clustering and DBSCAN. The module shows how some of these techniques are mathematically related to Matrix Factorization. Students study various outlier detection techniques based on density, proximity, factorization and cluster analysis.
This course introduces more advanced ML techniques like ensembles: bagging, boosting, cascading and stacking classifiers and regressors. It covers both the theoretical foundations and applicative details of these techniques along with popular implementations of boosting like LightGBM, CatBoost and XGBoost. Students also delve into kernel methods with specific focus on SVMs for classification and regression. Students will study state of the art model agnostic feature importance and model-interpretability techniques like LIME and SHAP. Students also study classical NLP based text encoding methods like Bag-of-words, TF-IDF etc. The module teaches various classical methods in time series analysis and forecasting like ARMA, ARIMA etc. Students also learn how to pose time series forecasting problems as regression and classification problems to leverage well studied ML techniques. This is followed by various domain and problem specific Feature engineering techniques that are often helpful in real world problem solving. Students will study methods like error analysis, ablative analysis etc., to debug and understand why and where a model is performing well and where it is not performing well. This will further help us in designing appropriate features. Students study model calibration techniques like Platt Scaling, Isotonic Regression etc. Later in this course, we cover how to build recommender systems using content-based and collaborative filtering methods. The module also teaches the detailed solution of the Netflix prize (2009) and various recent advances in RecSys.
This course provides an in-depth understanding of distributed systems for ML and Deep Learning using CPU, GPU and TPU clusters. It starts with the foundations of the MapReduce framework and the in-memory distributed and resilient data structures that form the backbone of Spark. Students will learn the architectural details of these distributed system platforms and how they can be leveraged to perform data analysis and model training on petabyte-scale datasets. We cover how distributed training is achieved for popular ML algorithms on Spark by understanding the internal working of Spark MLlib. The module then focuses on understanding distributed graph processing using GraphX. Students move on to Deep Learning algorithms and how distributed algorithms can be designed for them when we have GPU or TPU clusters at our disposal. We also dive deep into how TensorFlow achieves distributed computing for popular Deep Learning algorithms. Students will study distributed data stores and how they can be used for ML using popular datastore systems like Hive and SparkSQL. The module concludes by discussing state-of-the-art distributed, low-latency approximate nearest neighbour algorithms along with their implementations in ElasticSearch.
This course provides a strong mathematical and applicative introduction to Deep Learning. The module starts with the perceptron model as an oversimplified approximation of a biological neuron. We motivate the need for a network of neurons and show how they can be connected to form a Multi-Layer Perceptron (MLP). This is followed by a rigorous understanding of the back-propagation algorithm from the 1980s and its limitations. Students study how modern deep learning took off with improved computational tools and data sets. We teach more modern activation units (like ReLU and SELU) and how they overcome problems with the more classical Sigmoid and Tanh units. Students learn weight initialization methods, regularization by dropout, batch normalization etc., to ensure that deep MLPs can be successfully trained. The module teaches variants of Gradient Descent that have been specifically designed to work well for deep learning systems like Adam, AdaGrad, RMSProp etc. Students also learn AutoEncoders, VAEs and Word2Vec as unsupervised, encoding deep-learning architectures. We apply all of the foundational theory learned to various real-world problems using TensorFlow 2 and Keras. Students also understand how TensorFlow 2 works internally with specific focus on computational graph processing.
This course provides a comprehensive overview of Computer Vision problems and how they can be tackled using various Convolutional Neural Networks (CNNs). Students start with classical image processing operations like edge detection, convolution, shape detectors and colour space conversions. This is followed by a foundational understanding of deep Convolutional Neural Networks and how their training and evaluation works. We introduce various CNN-specific layers like pooling layers and upsampling layers. We also introduce various Data Augmentation techniques that are very helpful for image-related problems. This is followed by a deep dive into the internals of popular CNN architectures like AlexNet, VGGNet, ResNet etc. Students also learn how to use these methods practically for transfer learning. Students will study how various computer-vision tasks like image segmentation, image generation, object detection and localization, contrastive learning etc., can be performed using state-of-the-art algorithms for each of these tasks. Most of these techniques are studied directly from the original research papers and open-source code provided by the authors. Students also implement some of these algorithms from scratch in this course.
This course focuses on modelling sequences (text, music, time-series, genes) using deep-learning models. We start with a simple Recurrent Neural Network and its limitations with long sequences. Students learn LSTMs and GRUs, which can handle significantly longer sequences, to model sequence data like text, music, gene sequences and time-series data. We study variations of LSTM like bi-directional LSTMs and encoder-decoder architectures. This is followed by a detailed study of the attention mechanism and Transformer-based models, which are currently the state of the art for NLP and sequence modelling. The module teaches encoder-decoder Transformers, BERT, BERT variations, and the GPT-1, GPT-2 and GPT-3 models from the architectural, mathematical and practical viewpoints. Students learn to implement many of these complex models from scratch (using TensorFlow 2 and Keras) to gain a deeper understanding of how they work internally. Students will study popular applications of deep learning in NLP like parts-of-speech tagging, question-answering systems, conversational engines (chatbots), low-latency semantic search etc. For each of these problems, students will study cutting-edge deep-learning models along with code implementations.
This course provides students with hands-on experience on deploying high velocity applications and services reliably on complex and distributed infrastructure. DevOps as a philosophy is a key driver of the modern software life cycle which prefers rapid and reliable delivery of functionality and features via code. We start with a solid introduction to Linux scripting and networking. Then, we learn popular methodologies to deploy complex and distributed software like microservices, containerization (Docker) and orchestration (Kubernetes). All of this would be introduced with real world examples from the industry. We also focus on Continuous Integration and Continuous Delivery (CI/CD) methodology and how it can be achieved using popular toolchains like Jenkins. We dive into how automated testing of software can be achieved using libraries like Selenium. This shall be followed by more advanced techniques like serverless-compute, Platform as a service model and Cloud-DevOps. Students would learn to monitor and log key data points to ensure they maintain a healthy system and adapt it as needed. Infrastructure-as-code is a key component of modern DevOps especially on cloud and containerized applications which would also be covered with real-world examples.
This is a hands-on course covering JavaScript from basics to advanced concepts in detail using multiple examples. We start with basic programming concepts like variables, control statements, loops, classes and objects. Students also learn basic data structures like Strings, Arrays and Dates. Students also learn to debug their code and handle errors gracefully. We learn popular style guides and good coding practices to build readable and reusable code which is also highly performant. We then learn how web browsers execute JavaScript code using the V8 engine as an example. We also cover concepts like JIT compilation, which helps JS code to run faster. This is followed by slightly more advanced concepts like the DOM, async functions, Web APIs and AJAX, which are widely used in modern front-end development. We learn how to optimize JavaScript code to run on both mobile apps and mobile browsers along with desktop browsers and as desktop apps via ElectronJS. Most of this course is covered via real-world examples and by learning from the JS code of popular open-source websites and libraries.
This course provides a deep dive into more advanced concepts in server-side programming using Node.js to enable interactive, real-time and scalable web applications. We dive into threading and thread pools in Node.js and how they can be leveraged to build more responsive web apps. We learn socket programming using socket.io and Node.js for instant messaging, document collaboration, real-time analytics and streaming applications. Students also learn to use caching with distributed in-memory key-value stores (like Redis) to reduce latency while serving web apps. Students also learn how to use Node.js with popular NoSQL data stores like MongoDB for storing unstructured data. We also cover GraphQL, an open-source data query and manipulation language for APIs which has been gaining popularity. We learn popular protocols like OAuth to enable cross-platform logins. Students also learn the architecture and practical aspects of WebRTC to enable multimedia applications like video chat, live streaming, music streaming etc.
This course provides an in-depth architectural overview of, and hands-on experience with, building scalable data processing and distributed computing via various cloud systems. We focus heavily on Spark, which is one of the most popular and powerful distributed systems for petabyte-scale data processing. We learn various components of the Spark ecosystem, such as HDFS, Resilient Distributed Datasets (RDDs), and programming models like MapReduce. Students also learn SparkSQL and Hive and how they can be used for querying large datastores. We focus on how various services in a cloud (like AWS) can be used together to build scalable data pipelines for both batch and near real-time processing. We show various examples of real-world systems and their architectures from various companies and organizations. We learn how GraphX can be used to process large graphs using Spark. Students use AWS Elastic MapReduce (EMR) for cloud-based Spark clusters. We learn the design and architecture of distributed inverted indices and how they can be used for implementing search scalably. Students learn to use ElasticSearch, a very popular distributed inverted index for implementing search functionality on websites and on unstructured data.
This core course equips the student with knowledge of database management systems, operating systems and computer networks. At the end of the course, students will have a critical understanding of the architecture of computers and networks, as well as how programs interact with these. Students begin with mapping data storage problems (as they had done in Relational Databases) to understand how data is stored in a distributed network, and related issues such as concurrency. Subsequently, students cover operating systems with an overview of process scheduling, process synchronisation and memory management techniques with disk scheduling. The module concludes with computer networks, where we discuss all of the computer network layers and their protocols in detail.
Low-Level Design & Design Patterns focuses on modularity and reusability in software design, common design vocabularies, refactoring and how to reduce it, and how to incorporate design patterns into iterative development processes. The course pays significant attention to the interaction between system architecture and components, including data organisation.
The course begins with Object-Oriented Analysis (OOA), which is a problem-solving technique that includes: modelling an information design; representing behaviour; describing functions; dividing data, functional, and behavioural models to uncover detail; and moving from abstraction to implementation details. The course then turns to Object-Oriented Design (OOD), which reduces the analysis model into a modular design for software creation, with subsystems, components, and objects.
The iteration of analysis and implementation will be covered in detail with real-world industry examples.
This module is designed to deepen students' understanding of advanced algorithmic techniques and problem-solving strategies. Building on their existing knowledge of dynamic programming (DP) and graph algorithms, students will explore more sophisticated concepts and applications. The module begins with a revision of recursion techniques, enabling students to solve complex recursive problems more efficiently. They will refine their DP skills by implementing more advanced bottom-up and top-down approaches, crucial for optimizing real-world solutions. Students will gain expertise in mathematical algorithms, equipping them to handle intricate problems involving factorials, modular arithmetic, and large power computations. They will develop strategies to solve complex puzzles and optimization problems through backtracking. The course will introduce the Trie data structure, empowering students to efficiently manage and manipulate large sets of data. They will enhance their ability to handle text processing tasks by mastering advanced string pattern matching algorithms. Additionally, students will advance their skills in graph algorithms, exploring concepts such as Disjoint Set Union (DSU), graph coloring, and shortest path algorithms like Bellman-Ford and Floyd-Warshall. These advanced topics will prepare them for tackling complex algorithmic challenges often encountered in technical interviews. Throughout the course, students will engage in practical assignments and real-world examples to solidify their understanding. By the end of the module, they will be well-equipped with the knowledge and skills to confidently approach a wide range of advanced problems, ensuring their readiness for technical interviews and advanced problem-solving scenarios.
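Two of the topics named above, large power computations under a modulus and Disjoint Set Union, can be sketched briefly. The names below are illustrative assumptions, and the DSU shown uses union by size with path halving, which is one common variant among several.

```python
def mod_pow(base, exponent, modulus):
    """Compute (base ** exponent) % modulus by repeated squaring, O(log exponent)."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            result = result * base % modulus
        base = base * base % modulus
        exponent >>= 1
    return result


class DisjointSetUnion:
    """Tracks which elements belong to the same group."""

    def __init__(self, size):
        self.parent = list(range(size))
        self.size = [1] * size

    def find(self, x):
        # Path halving: point each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the groups of a and b; return False if already merged."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False
        # Union by size: attach the smaller tree under the larger one.
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]
        return True


# Hypothetical usage: five nodes, three edges.
dsu = DisjointSetUnion(5)
dsu.union(0, 1)
dsu.union(3, 4)
dsu.union(1, 3)
print(dsu.find(0) == dsu.find(4))  # 0 and 4 are connected through 1 and 3
print(dsu.find(2) == dsu.find(0))  # 2 was never joined to anything
print(mod_pow(2, 10, 1000))        # 1024 % 1000
```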
Structured Query Language (SQL) is key to working with data in relational databases, a task at the core of data science and analytics. In this course, students will learn all the major keywords and clauses used to extract data, best practices for formatting SQL queries, and how to generate meaningful insights from the results.
The focus is at all times on real-world uses of SQL queries, syntax, and expression, to allow students to begin professional-level work as quickly as possible.
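To give a flavour of the kind of query the course has in mind, here is a small sketch that runs one aggregate query against an in-memory SQLite database from Python. SQLite is used only so the example is self-contained; the `orders` table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 5.0), ("bob", 7.5)],
)

# WHERE filters rows before grouping; HAVING filters groups after aggregation.
query = """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    WHERE amount > 6
    GROUP BY customer
    HAVING SUM(amount) >= 20
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for customer, n_orders, total in rows:
    print(customer, n_orders, total)
```

The query keeps only orders above 6, totals them per customer, and lists customers whose total is at least 20, largest first.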
This course focuses on building basic classification and regression models and understanding these models rigorously both with a mathematical and an applicative focus. It opens with a basic introduction to high dimensional geometry of points, distance-metrics, hyperplanes and hyperspheres. Then, it introduces the mathematical formulation of logistic regression to find a separating hyperplane. Vector calculus and gradient descent (GD)-based algorithms are explored to learn to solve the optimization problem, including computational variations of GD like mini-batch and stochastic gradient descent. The course also covers other popular classification and regression methods like k-Nearest Neighbours, Naive Bayes, Decision Trees, Linear Regression etc, to show how each of these techniques performs under various real-world situations like the presence of outliers, imbalanced data, multi class classification etc. Lectures on bias and variance tradeoff and various techniques to avoid overfitting and underfitting are incorporated. Algorithms are taught from a Bayesian viewpoint along with geometric intuition. This course would be heavily hands-on where students apply all these classical techniques to real world problems.
This course aims to build a strong foundational knowledge of Data Analytics, which is used extensively in the Data Science field. Tableau is a powerful data visualisation tool used in the business analytics industry to process and visualise raw business data in a presentable and understandable format. Tableau is widely used by data analytics departments and by data analytics companies in various fields for its ease of use and efficiency. Tableau uses relational databases, Online Analytical Processing cubes, spreadsheets and cloud databases to generate graphical visualisations. The course starts with visualisations and moves to an in-depth look at the different chart and graph functions, calculations, mapping and other functionality. Students will be taught quick table calculations, reference lines, different types of visualisations, bands and distributions, parameters, motion charts, trends and forecasting, formatting, stories, performance recording and advanced mapping.
At the end of this course, students will be prepared, if they desire, to earn industry desktop certifications as a Tableau Desktop Specialist, a Tableau Certified Associate, or a Tableau Certified Professional.
This core course equips the student with knowledge of database management systems, operating systems and computer networks. At the end of the course, students will have a critical understanding of the architecture of computers and networks, as well as how programs interact with these. Students begin with mapping data storage problems (as they had done in Relational Databases) to understand how data is stored in a distributed network, and related issues such as concurrency. Subsequently, students cover operating systems with an overview of process scheduling, process synchronisation and memory management techniques with disk scheduling. The module concludes with computer networks, where all of the computer network layers and their protocols are discussed in detail.
Low-Level Design & Design Patterns focuses on modularity and reusability in software design, common design vocabularies, refactoring and how to reduce it, and how to incorporate design patterns into iterative development processes. The course pays significant attention to the interaction between system architecture and components, including data organisation.
The course begins with Object-Oriented Analysis (OOA), which is a problem-solving technique that includes: modelling an information design; representing behaviour; describing functions; dividing data, functional, and behavioural models to uncover detail; moving from abstraction to implementation details. The course then turns to Object-Oriented Design (OOD), which reduces the analysis model into a modular design for software creation, with subsystems, components, and objects.
The iteration of analysis and implementation will be covered in detail with real-world industry examples.
This course gives a detailed overview of how to approach Low Level Design problems, with real-world case studies such as Designing a Pen (Mac/Windows), TicTacToe, BookMyShow (a widely used event booking app that manages millions of users), an Email Campaign Management System and a detailed design of Splitwise.
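As one possible illustration of a design pattern applied to a case study of this kind, the sketch below uses the Strategy pattern to make an expense-splitting rule interchangeable. It is not the course's Splitwise design; every class name here (`SplitStrategy`, `EqualSplit`, `WeightedSplit`, `Expense`) is hypothetical, and amounts are assumed to be integer cents to avoid rounding drift.

```python
from abc import ABC, abstractmethod


class SplitStrategy(ABC):
    """Interface: decide how much each member owes for an expense."""

    @abstractmethod
    def split(self, total_cents: int, members: list[str]) -> dict[str, int]:
        ...


class EqualSplit(SplitStrategy):
    def split(self, total_cents: int, members: list[str]) -> dict[str, int]:
        share, remainder = divmod(total_cents, len(members))
        # Hand out leftover cents one by one so the shares sum to the total.
        return {m: share + (1 if i < remainder else 0) for i, m in enumerate(members)}


class WeightedSplit(SplitStrategy):
    def __init__(self, weights: dict[str, int]) -> None:
        self.weights = weights

    def split(self, total_cents: int, members: list[str]) -> dict[str, int]:
        total_weight = sum(self.weights[m] for m in members)
        shares = {m: total_cents * self.weights[m] // total_weight for m in members}
        # Give any rounding leftover to the first member.
        shares[members[0]] += total_cents - sum(shares.values())
        return shares


class Expense:
    """The expense delegates the splitting rule to its strategy object."""

    def __init__(self, total_cents: int, members: list[str], strategy: SplitStrategy) -> None:
        self.total_cents = total_cents
        self.members = members
        self.strategy = strategy

    def shares(self) -> dict[str, int]:
        return self.strategy.split(self.total_cents, self.members)
```

Because `Expense` depends only on the `SplitStrategy` interface, a new rule (for example, exact amounts per member) can be added as another subclass without changing `Expense`.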
This is a project-based course, with the aim of building the required skills for creating web-based software systems. The course covers the entire lifecycle of building software projects, from requirement gathering and scope definition from a product document, to designing the architecture of the system, and all the way to delivery and maintenance of the software system.
The course covers both frontend, which is, building browser-based interfaces for users, using frontend web frameworks, and also building the backend, which is the server running an API to serve the information to the frontend, and running on an SQL or similar database management system for storage.
All aspects of delivering a software project, including security, user authentication and authorisation, monitoring and analytics, and maintaining the project are covered. The course also covers the aspects of project maintenance, like using a version control system, setting up continuous integration and deployment pipelines and bug trackers.
This is a hands-on course covering JavaScript from basics to advanced concepts in detail using multiple examples. We start with basic programming concepts like variables, control statements, loops, classes and objects. Students also learn basic data structures like strings, arrays and dates, and learn to debug their code and handle errors gracefully. We learn popular style guides and good coding practices to build readable and reusable code which is also highly performant. We then learn how web browsers execute JavaScript code, using the V8 engine as an example. We also cover concepts like JIT compilation, which helps JS code run faster. This is followed by slightly more advanced concepts like the DOM, async functions, Web APIs and AJAX, which are widely used in modern front-end development. We learn how to optimize JavaScript code to run in mobile apps and mobile browsers as well as desktop browsers, and as desktop apps via ElectronJS. Most of this course is taught via real-world examples and by learning from the JS code of popular open-source websites and libraries.
This is a hands-on course on designing responsive, modern and light-weight UI for web, mobile and desktop applications using HTML5, CSS and Frameworks like Bootstrap 4. This course starts with an introduction on how web browsers, mobile apps and web servers work. We then dive into each of the nitty gritty details of HTML5 to build webpages. We would start with simple web pages and then graduate to more complex layouts and features in HTML like forms, iFrames, multimedia-playback and using web-APIs. We then go on to learn stylesheets based on CSS 4 and how browsers interpret CSS files to render web pages. Once again, we use multiple real world example web pages to learn the internals of CSS4. We learn popular good practices on writing responsive HTML and CSS code which is also interoperable on mobile browsers, apps and desktop apps. We would introduce students to building desktop apps using HTML and CSS using toolkits like Electron. We would also study popular frameworks for front end development like Bootstrap 4 which can speed up UI development significantly.
This course builds upon the introductory JavaScript course to acquaint students with popular and modern frameworks for building the front end. We focus on three very popular frameworks/libraries in use: React.js, jQuery and AngularJS. We start with React.js, one of the most popular and advanced of the three. Students learn various components and data flow in order to architect real-world front ends using React.js. This is achieved via multiple code examples and code walkthroughs from scratch. We also dive into React Native, a cross-platform framework for building native mobile and smart-TV apps using JavaScript. This helps students build applications for various platforms using only JavaScript. jQuery is one of the oldest and most widely used JavaScript libraries, which students cover in detail. Students specifically focus on how jQuery can simplify event handling, AJAX and HTML DOM tree manipulation, and create CSS animations. We also provide a hands-on introduction to AngularJS to architect model-view-controller (MVC) based dynamic web pages.
This course focuses on building basic classification and regression models and understanding these models rigorously, with both a mathematical and an applicative focus. The module starts with a basic introduction to the high-dimensional geometry of points, distance metrics, hyperplanes and hyperspheres. We build on top of this to introduce the mathematical formulation of logistic regression to find a separating hyperplane. Students learn to solve the optimization problem using vector calculus and gradient descent (GD) based algorithms. The module introduces computational variations of GD like mini-batch and stochastic gradient descent. Students also learn other popular classification and regression methods like k-Nearest Neighbours, Naive Bayes, Decision Trees, Linear Regression etc. Students also learn how each of these techniques performs under various real-world situations like the presence of outliers, imbalanced data, multi-class classification etc. Students learn the bias and variance trade-off and various techniques to avoid overfitting and underfitting. Students also study these algorithms from a Bayesian viewpoint along with geometric intuition. This module is hands-on and students apply all these classical techniques to real-world problems.
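To make the idea of finding a separating hyperplane with mini-batch gradient descent concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions rather than the module's own implementation: the function names, the learning rate, the batch size, and the tiny dataset are all chosen for this example, and no regularization is applied.

```python
import math
import random


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(points, labels, lr=0.1, epochs=50, batch_size=4, seed=0):
    """Fit weights w and bias b by mini-batch gradient descent on the log loss."""
    rng = random.Random(seed)
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    indices = list(range(len(points)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new random mini-batches every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * dim
            grad_b = 0.0
            for i in batch:
                z = sum(wj * xj for wj, xj in zip(w, points[i])) + b
                err = sigmoid(z) - labels[i]  # d(loss)/dz for one point
                for j in range(dim):
                    grad_w[j] += err * points[i][j]
                grad_b += err
            # Step against the average gradient of the mini-batch.
            w = [wj - lr * gj / len(batch) for wj, gj in zip(w, grad_w)]
            b -= lr * grad_b / len(batch)
    return w, b


def predict(w, b, x) -> int:
    # The hyperplane w.x + b = 0 separates the two classes.
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


points = [(-2, -1), (-1, -2), (-2, -2), (-1, -1), (1, 1), (2, 1), (1, 2), (2, 2)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(points, labels)
```

Setting `batch_size=1` turns the same loop into stochastic gradient descent, and setting it to `len(points)` gives full-batch gradient descent.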
This course provides a strong mathematical and applicative introduction to Deep Learning. The module starts with the perceptron model as an oversimplified approximation to a biological neuron. We motivate the need for a network of neurons and show how they can be connected to form a Multi-Layer Perceptron (MLP). This is followed by a rigorous understanding of the back-propagation algorithm and its limitations as known from the 1980s. Students study how modern deep learning took off with improved computational tools and data sets. We teach more modern activation units (like ReLU and SELU) and how they overcome problems with the more classical Sigmoid and Tanh units. Students learn weight initialization methods, regularization by dropout, batch normalization etc., to ensure that deep MLPs can be successfully trained. The module teaches variants of Gradient Descent that have been specifically designed to work well for deep learning systems, like Adam, AdaGrad, RMSProp etc. Students also learn AutoEncoders, VAEs and Word2Vec as unsupervised, encoding deep-learning architectures. We apply all of the foundational theory learned to various real-world problems using TensorFlow 2 and Keras. Students also learn how TensorFlow 2 works internally, with a specific focus on computational graph processing.
This course provides an in-depth understanding of distributed systems for ML and Deep Learning using CPU, GPU and TPU clusters. It starts with the foundations of the MapReduce framework and the in-memory distributed and resilient data structures that form the backbone of Spark. Students will learn the architectural details of these distributed system platforms and how they can be leveraged to perform data analysis and model training on petabyte-scale datasets. We cover how distributed training is achieved for popular ML algorithms on Spark by understanding the internal workings of Spark MLlib. The module then focuses on understanding distributed graph processing using GraphX. Students move on to Deep Learning algorithms and how distributed algorithms can be designed for them when GPU or TPU clusters are available. We also dive deep into how TensorFlow achieves distributed computing for popular Deep Learning algorithms. Students will study distributed data stores and how they can be used for ML using popular datastore systems like Hive and SparkSQL. The module concludes by discussing state-of-the-art distributed, low-latency approximate nearest neighbour algorithms along with their implementations in Elasticsearch.
This course aims to build the core competency of building real world end-to-end ML systems and deploy them into production for a variety of problems and scenarios. Students would learn a variety of ML systems ranging from high throughput and low latency internet scale systems to low compute power and energy constrained IoT devices like smart watches. Students will study the ML lifecycle and various components in detail. We also use real world ML platforms like Google’s KubeFlow, TensorFlow Lite, and Amazon’s SageMaker to implement real world systems and understand the engineering trade-offs and challenges. Students also learn relevant technologies and tools like Containerization (Docker) and Container Orchestration (Kubernetes) and Git which are often used extensively in real world scalable ML systems. This course is a hands-on course where we solve multiple real world cases and discuss solutions built by various companies and organizations to provide the students a comprehensive understanding of varied systems and design choices.
This course focuses on modelling sequences (text, music, time-series, genes) using deep-learning models. We start with a simple Recurrent Neural Network and its limitations with long sequences. Students learn LSTMs and GRUs, which can handle significantly longer sequences, to model sequence data like text, music, gene sequences and time-series data. We study variations of LSTM like bi-directional LSTMs and encoder-decoder architectures. This is followed by a detailed study of the attention mechanism and Transformer-based models, which are currently the state of the art for NLP and sequence modelling. The module teaches encoder-decoder Transformers, BERT, BERT variations, and the GPT-1, GPT-2 and GPT-3 models from the architectural, mathematical and practical viewpoints. Students learn to implement many of these complex models from scratch (using TensorFlow 2 and Keras) to gain a deeper understanding of how they work internally. Students will study popular applications of deep learning in NLP like parts-of-speech tagging, question-answering systems, conversational engines (chatbots), low-latency semantic search etc. For each of these problems, students will study cutting-edge deep-learning models along with code implementations.
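The attention mechanism mentioned above can be sketched in a few lines. The example below implements scaled dot-product attention in plain Python for readability; the course itself names TensorFlow 2 and Keras, so this is a swapped-in illustration. The function names and the toy vectors are assumptions of this example, and masking and multiple heads are left out.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(wi * v[j] for wi, v in zip(w, values)) for j in range(len(values[0]))]
        )
    return outputs, weights


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
queries = [[5.0, 0.0]]  # points in the same direction as the first key
outputs, weights = attention(queries, keys, values)
```

Because the query aligns with the first key, most of the attention weight falls on the first value vector, so the output is dominated by it.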
This course provides a comprehensive overview of Computer Vision problems and how they can be tackled using various Convolutional Neural Networks (CNNs). Students start with classical image processing operations like edge detection, convolution, shape detectors and colour space conversions. This is followed by a foundational understanding of deep Convolutional Neural Networks and how their training and evaluation work. We introduce various CNN-specific layers like pooling layers and upsampling layers. We also introduce various Data Augmentation techniques that are very helpful for image-related problems. This is followed by a deep dive into the internals of popular CNN architectures like AlexNet, VGGNet, ResNet etc. Students also learn how to use these methods practically for transfer learning. Students will study how various computer-vision tasks like image segmentation, image generation, object detection and localization, contrastive learning etc., can be performed using state-of-the-art algorithms for each of these tasks. Most of these techniques are studied directly from the original research papers and the open-source code provided by the authors. Students also implement some of these algorithms from scratch in this course.
This course introduces more advanced ML techniques like ensembles: bagging, boosting, cascading and stacking classifiers and regressors. It covers both the theoretical foundations and applicative details of these techniques along with popular implementations of boosting like LightGBM, CatBoost and XGBoost. Students also delve into kernel methods with specific focus on SVMs for classification and regression. Students will study state of the art model agnostic feature importance and model-interpretability techniques like LIME and SHAP. Students also study classical NLP based text encoding methods like Bag-of-words, TF-IDF etc. The module teaches various classical methods in time series analysis and forecasting like ARMA, ARIMA etc. Students also learn how to pose time series forecasting problems as regression and classification problems to leverage well studied ML techniques. This is followed by various domain and problem specific Feature engineering techniques that are often helpful in real world problem solving. Students will study methods like error analysis, ablative analysis etc., to debug and understand why and where a model is performing well and where it is not performing well. This will further help us in designing appropriate features. Students study model calibration techniques like Platt Scaling, Isotonic Regression etc. Later in this course, we cover how to build recommender systems using content-based and collaborative filtering methods. The module also teaches the detailed solution of the Netflix prize (2009) and various recent advances in RecSys.
Every organisation builds products to solve the pain points of its customers. Product managers are a critical part of an organisation: they make sure that evolving customer needs and market trends are observed and converted into delightful solutions that help the business achieve its outcomes.
In this course, students will get a fundamental understanding of product management practices.
This will give them a comprehensive view of the complete product management life cycle.
Data is the fuel driving all major organisations. In this course, we help you understand how to process data at scale.
The course ranges from the fundamentals of distributed processing to designing data warehouses and writing ETL (Extract, Transform, Load) pipelines that process batch and streaming data.
We will give you a comprehensive view of the complete Data Engineering lifecycle.
This is a course that focuses both on architectural design and practical hands-on learning of the most used cloud services. The module extensively uses Amazon Web services (AWS) to show real world code examples of various cloud services. It also covers the core concepts and architectures in a platform agnostic manner so that students can easily translate these learnings to other cloud platforms (like Azure, GCP etc.). The module starts with virtualization and how virtualized compute instances are created and configured. Students also learn how to auto-scale applications using load balancers and build fault tolerant applications across a geographically distributed cloud. As relational databases are widely used in most enterprises, students learn how to migrate and scale (both vertically and horizontally) these databases on the cloud while ensuring enterprise grade security. Virtual private clouds enable us to create a logically isolated virtual network of compute resources. Students learn to set up a VPC using virtualized-compute-servers on AWS. The course also covers the basics of networking while setting up a VPC. Students learn of the architecture and practical aspects of distributed object storage and how it enables low latency and high availability data storage on the cloud.
This course teaches students how to analyse the ways users engage with a service. This method, called product analytics, helps businesses track and analyse user data. Students will learn more deeply what is required to move a product from idea to implementation, through to launch, and then on to iterative improvements. The course teaches how to measure progress, validate or update product hypotheses, and present product learnings.
Also, students will gain experience in making informed decisions, as well as how to present findings and make an analytics-informed business case to win support for a product.
This module focuses on representing statistical techniques in code, and may be conducted in Python, R, or another relevant language. Such languages provide libraries that can handle a wide variety of statistical techniques, like linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering and graphical techniques, and are highly extensible.
Learning to work in statistically-oriented programming language environments can equip you with the following skills among many others:
An effective way of data handling (using arrays for example) and storing data in a structured manner.
Expertise in diverse tools and libraries for Data Analysis
Ability to present complex data in a graphical and visual format for easy understanding of the data and further solutions.
This course helps students translate mathematical/statistical/scientific concepts into code. This is a foundational course for writing code to solve Data Science, ML & AI problems. It introduces basic programming concepts (like control structures, recursion, classes and objects) from scratch, assuming no prerequisites, to make this course accessible to students from non-computational scientific fields like Biology, Physics, Medicine, Chemistry, Civil & Mechanical Engineering etc. After building a strong foundation, the course advances to dive deep into core mathematical libraries like NumPy, SciPy and Pandas. Students also learn when and how to use built-in data structures like Lists, Dicts, Sets and Tuples. The module introduces the concepts of computational complexity to help students write optimized code using appropriate data structures and algorithmic design methods. The module does not dive deep into data structures and algorithm design methods - that is available in the ‘Data Structures and Algorithms’ module. This course is valuable for all students specializing in mathematical sub-areas of CS like ML, Data Science, Scientific Computing etc.
This course introduces basic probability theory, statistical methods and computational algorithms to perform mathematically rigorous data analysis. The course starts with basic foundational concepts of random variables, histograms, and various plots (PMF, PDF and CDF). Students learn various popular discrete and continuous distributions like Bernoulli, Binomial, Poisson, Gaussian, Exponential, Pareto, log-normal etc., both mathematically and from an applicative perspective. Students learn various measures like mean, median, percentiles, quantiles, variance and interquartile range. Students learn the pros and cons of each metric and understand when and how to use them in practice. Students will learn conditional probability and Bayes' theorem in the applied context of real-world problems in medicine and healthcare. The module teaches the foundations of non-parametric statistics and applies them to solve problems using computational tools. Students learn various methods to determine correlations rigorously in data. This is followed by an applied and mathematical understanding of the statistics underlying control-treatment (A/B) experiments and hypothesis testing. The module engages computational tools in modern statistics like Bootstrapping, Monte-Carlo methods, RANSAC etc.
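As a small illustration of the bootstrapping idea mentioned above, the sketch below estimates a percentile confidence interval for a sample mean by resampling with replacement. The function name `bootstrap_mean_ci`, the number of resamples, the fixed seed, and the sample values are all assumptions made for this example.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]
lo, hi = bootstrap_mean_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The method makes no assumption about the shape of the underlying distribution, which is why it pairs naturally with the non-parametric statistics covered in the module.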
This course builds on foundational AWS knowledge, diving deeper into the platform's sophisticated features and services. Students will explore advanced networking configurations, security and compliance measures, and serverless architectures. Emphasis will be placed on practical applications, allowing students to design and implement complex AWS architectures, automate infrastructure management with Infrastructure as Code (IaC) tools, and optimize costs for scalable solutions.
In addition to technical skills, the course covers advanced topics in containerization, orchestration with AWS services, and the development of continuous integration/continuous deployment (CI/CD) pipelines. Students will gain hands-on experience through labs and projects that simulate real-world scenarios, ensuring they can effectively deploy, manage, and scale applications on AWS. By the end of the course, students will be proficient in leveraging AWS's full potential to meet specific business requirements, ensuring security, compliance, and cost-efficiency in cloud environments.
This course provides a comprehensive overview of Amazon Web Services (AWS), focusing on core services and best practices for building and managing cloud-based infrastructure. Students will learn the fundamentals of cloud computing, explore key AWS services such as EC2, S3, RDS, and VPC, and gain hands-on experience in deploying and managing applications on the AWS platform. The course emphasizes practical skills, enabling students to design scalable, secure, and cost-effective cloud solutions.
In addition to foundational AWS services, the course covers essential topics such as identity and access management (IAM), networking and security configurations, and monitoring and logging with CloudWatch. Students will also be introduced to Infrastructure as Code (IaC) using AWS CloudFormation and gain insights into setting up continuous integration/continuous deployment (CI/CD) pipelines with AWS tools. By the end of the course, students will have a solid understanding of AWS basics, equipping them with the knowledge and skills to effectively utilize AWS services in their DevOps practices and prepare for more advanced AWS coursework.
DevOps Tools Part 2 is an advanced course designed for students building on the foundational knowledge from DevOps Tools Part 1. This course delves deeper into advanced DevOps tools and techniques that drive modern software development and operations. Students will explore sophisticated CI/CD pipelines with tools like GitLab CI/CD and Azure DevOps, and master infrastructure as code (IaC) with Terraform and AWS CloudFormation. The course emphasizes practical, hands-on experience, enabling students to automate and manage complex cloud environments effectively.
In addition to advanced CI/CD and IaC, students will learn about service mesh architectures with Istio, advanced container orchestration with Kubernetes, and continuous monitoring and observability with tools such as Grafana and Jaeger. The course also covers security practices in DevOps, including integrating security tools into the CI/CD pipeline and managing secrets with HashiCorp Vault. By the end of the course, students will have the expertise to design, implement, and manage scalable, secure, and efficient DevOps workflows, preparing them for leadership roles in the field.
DevOps Tools Part 1 is a comprehensive course designed for students pursuing a Master of Science in Computer Science with a specialization in DevOps. This course introduces the essential tools and methodologies that form the backbone of modern DevOps practices. Students will gain a solid foundation in version control with Git, continuous integration/continuous deployment (CI/CD) pipelines using Jenkins, and configuration management with Ansible. The course emphasizes hands-on learning, enabling students to set up, configure, and utilize these tools in real-world scenarios, ensuring they can effectively collaborate, automate workflows, and streamline the development process.
In addition to core tools, the course covers containerization with Docker and orchestration with Kubernetes, providing students with the skills to deploy and manage applications in a microservices architecture. Students will also explore monitoring and logging solutions such as Prometheus and ELK Stack to maintain system reliability and performance. By the end of the course, students will be proficient in employing a wide range of DevOps tools, laying a strong foundation for advanced DevOps practices and tools covered in subsequent courses.
This core foundational course equips students with knowledge of Database Management Systems (DBMS) and Computer Networks.
The course starts with Entity-Relationship (ER) diagrams, a visual tool for mapping real-world data storage problems. Students learn to translate ER diagrams into a relational model with tables. SQL, the standard language for relational databases, is then introduced. Students will spend significant time building proficiency in writing optimized and complex SQL queries for various data manipulation tasks. Real-world examples will be used to solidify practical knowledge.
Next, the course explores trade-offs in modern relational databases, such as storage space versus latency. Designing efficient databases requires understanding normal forms to minimize data duplication, indexing for speed improvements, and flattening tables to avoid complex joins in low-latency environments. These real-world database design strategies are discussed with practical examples.
The course utilizes open-source MySQL databases and cloud-hosted relational databases (like Amazon RDS) for assignments, allowing students to apply learned concepts on real databases.
Following the DBMS section, the course transitions to Computer Networks. Here, students will delve into foundational concepts like the OSI model, TCP/IP model, TCP/UDP protocols, subnetting, DNS (Domain Name System), Network Address Translation (NAT), private networks, Secure Sockets Layer (SSL), and network security principles.
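Subnetting, one of the networking concepts listed above, can be explored with Python's standard `ipaddress` module. The sketch below splits a private /24 network into four /26 subnets and finds which subnet a given host belongs to; the specific addresses are arbitrary values chosen for illustration.

```python
import ipaddress

# A private /24 network: 256 addresses in total.
network = ipaddress.ip_network("192.168.10.0/24")

# Borrow two host bits to create four /26 subnets of 64 addresses each.
subnets = list(network.subnets(new_prefix=26))
for subnet in subnets:
    hosts = list(subnet.hosts())  # excludes network and broadcast addresses
    print(
        subnet,
        "usable:", hosts[0], "-", hosts[-1],
        "broadcast:", subnet.broadcast_address,
    )

# Find the subnet that contains a particular host address.
addr = ipaddress.ip_address("192.168.10.130")
owner = next(s for s in subnets if addr in s)
print(addr, "belongs to", owner)
```

Each /26 subnet has 64 addresses, of which 62 are usable by hosts once the network and broadcast addresses are set aside.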
This fundamental course aims to equip students with knowledge of core Linux operating system concepts. The module will cover essential operating system functionalities, including process management, process synchronization (concurrency), memory management techniques, and disk scheduling.
In process management, students will explore how Linux manages processes throughout their lifecycle, including starting, pausing, resuming, and allocating resources. The concept of concurrency will be introduced, covering multithreading and multiprocessing. Students will also learn how asynchronous processing facilitates concurrent execution.
Memory management is another key topic. The course will delve into how the operating system manages memory on both RAM and hard disk, along with concepts like virtual memory, memory pages, and page caching.
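The need for synchronization when several threads share data can be shown with a short sketch. The example below protects a shared counter with a lock from Python's standard `threading` module; the class `Counter` and the function `run_workers` are hypothetical names used only for this illustration.

```python
import threading


class Counter:
    """A shared counter whose updates are protected by a mutex."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write below is a critical section: only one
        # thread may execute it at a time.
        with self._lock:
            self.value += 1


def run_workers(n_threads: int = 4, n_increments: int = 10_000) -> int:
    counter = Counter()

    def work() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker to finish
    return counter.value


print(run_workers())
```

With the lock in place, the final value always equals `n_threads * n_increments`; without it, concurrent read-modify-write steps could interleave and lose updates.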
This course equips learners with the essential skills to navigate and manage Linux servers effectively. The course begins by demystifying the Linux command line interface and teaching the fundamentals students need to interact with the system. As learners progress, they will explore powerful command-line utilities like grep, sed, awk, and more, allowing them to manipulate and analyze data efficiently.
Next, they will delve into the Linux file system structure, gaining a solid understanding of user permissions and access control. This knowledge is crucial for ensuring secure and organized server environments.
Finally, the course empowers students to write their own shell scripts. They will learn how to craft conditional statements (if/else) and loops (for loops) to automate repetitive tasks and streamline their workflow. By the end, they will be able to write scripts to solve real-world problems and significantly boost their Linux server administration skills.