Grid, Cluster and Cloud Computing
The purpose of this lecture is to
provide an introduction to the theoretical aspects and basic applications of
grid, cluster and cloud computing technologies, with a focus on cloud
computing. We will study the fundamental theoretical aspects as well as
new and emerging topics related to grid operating systems, cloud-based
systems, cloud system architectures and services.
Team ID: join the class on Teams with the following code: 0cv5g2j
Labs and other communication will also happen on Teams, in the same class.
12/03/2025, 14:00-18:00 (C335) - IaaS, Private Cloud and Virtualization - Invited Speaker: Tudor Damian (Microsoft MVP and ethical hacker) - attendance is compulsory
19/03/2025, 14:00-18:00 (C335) - Azure Architecture and Automation Seminar - Invited Speaker: Florin Loghiade, Cloud Solutions Architect - attendance is compulsory
1. Introduction to cluster computing: definitions, roles, taxonomies
2. Distributed processing
3. Hardware, architectures, cluster technologies
4. Distributed file systems
5. Virtualization technologies
6. Grid and cloud processing
7. Grid and cloud system architectures
8. Implementation methods for application partitioning and planning
9. Functional and parallel programming
10. MapReduce paradigm
11. Web services and computing services
12. Microsoft Azure / Amazon AWS
13. Cloud-based data management systems
14. Final overview
1. Foster, Ian; Carl Kesselman (1999). The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann Publishers. ISBN 1-55860-475-8
2. Li, Maozhen; Mark A. Baker (2005). The Grid: Core Technologies. Wiley. ISBN 0-470-09417-6
3. G. Reese, Cloud Application Architectures: Building Applications and Infrastructure in the Cloud, O'Reilly, 2009, ISBN: 978-0-596-15636-7
4. A.S. Tanenbaum, Operating Systems: Design and Implementation (Third Edition), Prentice Hall, 2006
5. J.F. Kurose, K. Ross, Computer Networking: A Top-Down Approach (4th ed.), Addison-Wesley, 2007
6. Anil Desai, The Definitive Guide to Virtual Platform Management, CA Technologies, 2010, download: http://nexus.realtimepublishers.com/dgvpm.php
7. R. Jennings, Cloud Computing with the Windows Azure Platform (Wrox Programmer to Programmer), Wrox, 2009, ISBN: 978-0470506387
8. D. Sanderson, Programming Google App Engine: Build and Run Scalable Web Apps on Google's Infrastructure, O'Reilly, 2009, ISBN: 978-0-596-52272-8
9. Andy Oram (ed.), Peer-to-Peer: Harnessing the Power of Disruptive Technologies, O'Reilly, 2001, ISBN: 978-0596001100
10. * * *, http://code.google.com/intl/ro-RO/appengine/docs/
1. Sem 1, 2, 3 - The Hadoop infrastructure: installation, configuration, running applications
· Virtual machine installation and configuration
· Hadoop installation on the VM
· Configure master nodes and compute nodes / DFS
· Replication configuration and cluster management
· Docker deployment
An example virtual machine can be downloaded from here in the Files folder of the class on Teams. An online copy of the same
virtual machine (user: root, pass: test), for remote download, is
also available here.
The Docker variant (which students should develop further) is available here: https://github.com/asergiu/hadoop-docker
Once this lab is finished you should have the virtual machine running and, optionally, Eclipse
configured with the Hadoop plugin, so that you can connect from Eclipse to the VM
to deploy jobs, or simply build your apps and
run them on the cluster. You should also be able to
copy files to and from HDFS.
2. Sem 4 - Simple Hadoop application implementation (MapReduce "hello world" = word count)
3. Sem 5, 6 - Implementing an Azure distributed app using at least 3 services (e.g. Compute/Storage/Web Roles and Worker Roles, or something else). There is an in-class example.
4. Sem 7 - Semester project presentation - Hadoop - inverted index.
Using the already configured Hadoop environment from above (either the VM or the Docker
version), together with the development environment, write an
inverted index app for a set of books as data. You can download books from Project Gutenberg in plain text format.
To build the inverted index you need to
account for a stop word list (words that will not be indexed by the application
because they are not useful or relevant, e.g. and, or,
how, so, etc.). These stop words will be read by
the application from a text file (stopwords.txt).
An inverted index contains, for each distinct word, a list of the files
containing that word, together with its location within each file (the line number
in our case). When running the application you should have a small cluster/cloud
of at least two nodes built from VMs or Docker containers, or possibly a larger cluster built
from all your individual VMs.
Ex: word: (file#1, line#1, line#2, …) (file#4, line#1, line#2, …) …
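The inverted index described above can be sketched in plain Python before moving to Hadoop. This is only an in-memory illustration of the map and reduce steps, not Hadoop code: the function names, the sample documents and the one-word-per-line layout of stopwords.txt are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of lowercase letters or apostrophes.
WORD_RE = re.compile(r"[a-z']+")


def load_stop_words(path):
    """Read stop words from a text file, assuming one word per line."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def map_phase(filename, lines, stop_words):
    """Emit (word, (filename, line_number)) pairs, skipping stop words."""
    for line_number, line in enumerate(lines, start=1):
        for word in WORD_RE.findall(line.lower()):
            if word not in stop_words:
                yield word, (filename, line_number)


def reduce_phase(pairs):
    """Group postings by word and file, keeping sorted unique line numbers."""
    index = defaultdict(lambda: defaultdict(set))
    for word, (filename, line_number) in pairs:
        index[word][filename].add(line_number)
    return {
        word: {name: sorted(numbers) for name, numbers in sorted(files.items())}
        for word, files in sorted(index.items())
    }


def build_inverted_index(documents, stop_words):
    """Run the map step over every document, then reduce all emitted pairs."""
    pairs = []
    for filename, lines in documents.items():
        pairs.extend(map_phase(filename, lines, stop_words))
    return reduce_phase(pairs)


# Hypothetical sample data standing in for the Project Gutenberg books.
documents = {
    "book1.txt": ["The whale and the sea", "So the sea was calm"],
    "book2.txt": ["A calm whale"],
}
# In the real project this set would come from load_stop_words("stopwords.txt").
stop_words = {"the", "and", "or", "how", "so", "a", "was"}

index = build_inverted_index(documents, stop_words)
for word, postings in index.items():
    print(word, postings)
```

In a Hadoop job the same split would apply: the mapper emits one record per word occurrence, and the reducer merges all records for a given word.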
Example: Wordcount java
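For orientation, the word count pattern from Sem 4 can also be sketched without Hadoop. The snippet below imitates the map, shuffle and reduce steps in memory; the function names are illustrative and it does not use the Hadoop API.

```python
from itertools import groupby


def mapper(line):
    """Map step: emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce step: sum all counts emitted for one word."""
    return word, sum(counts)


def word_count(lines):
    # Map: collect every (word, 1) pair from every line.
    pairs = [pair for line in lines for pair in mapper(line)]
    # Shuffle: sort by key so that equal words become adjacent.
    pairs.sort(key=lambda pair: pair[0])
    # Reduce: one reducer call per distinct word.
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=lambda pair: pair[0])
    )


counts = word_count(["hello world", "hello hadoop"])
print(counts)
```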
Azure Project - any project using at least 3 services from
the Azure platform; you can come up with a problem yourself.
Example of requirements for a project (you do not need to implement this one): Implement a cloud
web application that accepts guest reviews (text) with images. As users
post reviews with images on the web page, they are displayed in reverse
chronological order (with the most recent at the top of the page). The app should
keep all posts and display them at all times,
while allowing guests to publish new posts. As guests could upload large images of
varying sizes, the guestbook app should handle image resizing
automatically, so that the image from each post is scaled down
to a standard small size (around 128 pixels) and shown as an icon. The original
image is replaced in each post by its small resized
icon, and remains reachable by clicking the icon in the
post. This ensures that the guestbook page does not take too long to download
and display in user browsers because of the large images in posts. You will probably
need to use: a table store for storing messages and links,
a blob storage for storing images and thumbnails (icons),
a web role for implementing the web page, and a worker role that reads a queue
storage whose entries are the large images to be scaled down. As the worker role
progresses through the posts, their images are transformed into thumbnails and the
web page is updated to reflect that.
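The worker side of this example can be sketched as follows. This is only an in-memory illustration under stated assumptions: Python's queue.Queue stands in for the queue storage, a plain dictionary stands in for the table store, and the sketch only computes thumbnail dimensions instead of resizing real images. Reading "around 128 pixels" as the length of the longer side is also an assumption, and none of the names below are Azure APIs.

```python
import queue

THUMBNAIL_SIZE = 128  # assumed target for the longer side, in pixels


def thumbnail_dimensions(width, height, target=THUMBNAIL_SIZE):
    """Scale (width, height) so the longer side equals target.

    The aspect ratio is preserved; images already within the target
    are returned unchanged.
    """
    longest = max(width, height)
    if longest <= target:
        return width, height
    scale = target / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def run_worker(pending, posts):
    """Drain the queue and record a thumbnail size for each queued post."""
    processed = 0
    while True:
        try:
            post_id = pending.get_nowait()
        except queue.Empty:
            return processed
        post = posts[post_id]
        post["thumbnail_size"] = thumbnail_dimensions(*post["image_size"])
        processed += 1


# Hypothetical posts; in the real app these would live in the table store.
posts = {
    "post-1": {"text": "Great stay!", "image_size": (4000, 3000), "thumbnail_size": None},
    "post-2": {"text": "Nice view", "image_size": (100, 80), "thumbnail_size": None},
}

# The web role would enqueue a message for every newly uploaded image.
pending = queue.Queue()
for post_id in posts:
    pending.put(post_id)

processed = run_worker(pending, posts)
print(processed, posts["post-1"]["thumbnail_size"])
```

Keeping the resizing in a separate worker means the web page can accept a post immediately and show the thumbnail once the worker has processed it.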
Written Exam (or Presentation) 50% + Hadoop Project 25% + Azure Project 25%
Students need:
· Min 5 on the Presentation topic / Written Exam (you choose one of the two)
· Min 5 for the projects
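As a worked example of the weighting above, the final grade can be computed as in the sketch below. It assumes a 10-point scale and that the minimum of 5 applies to each project separately; neither is stated explicitly, so treat both as assumptions.

```python
def final_grade(exam, hadoop, azure, minimum=5.0):
    """Return (weighted grade, whether all minimum thresholds are met).

    Assumption: the minimum of 5 applies to the exam/presentation and to
    each of the two projects individually.
    """
    weighted = 0.50 * exam + 0.25 * hadoop + 0.25 * azure
    passed = exam >= minimum and hadoop >= minimum and azure >= minimum
    return weighted, passed


# Example: exam 8, Hadoop project 6, Azure project 10.
print(final_grade(8, 6, 10))
```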