
Section 13: Real-World Problems

Many of the participants of this course will have built a number of software systems during their studies at University.  While these systems may have been impressive feats of programming excellence, they were most likely single person projects.  Single person projects have a wide range of advantages over collaborative projects:
  1. The sole developer can update files and never have to worry about overwriting code from other developers
  2. The sole developer never has to worry about their code changes being overwritten
  3. The sole developer has a full, comprehensive understanding of all of the code in the project
  4. The sole developer can change parts of the code and already knows how this will impact other components they have built
  5. The sole developer never really has to worry about writing documentation (although they probably should!)
  6. The sole developer never really has to worry about integration of written components
As can be seen, there are a bunch of advantages in working alone.  Unfortunately, as soon as students leave University, they will be thrust into environments where they need to work within teams of developers on shared projects.  Suddenly, all of the above advantages become problems that need to be dealt with.
  1. Unless due care is taken, developers can overwrite code written by other developers
  2. Unless due care is taken, developers may have their code overwritten by other developers
  3. Documentation is important now as information has to be shared between teams
  4. No single developer has a full, comprehensive understanding of all code in the company system
  5. The developer may not have a full understanding of how their code will impact components written by other individuals
  6. Integration is now an important issue as there are likely to be a range of separately built components by different teams
I have highlighted the first three of these issues as we are going to deal with them first.  The remaining issues, such as integration and working on shared components, will be dealt with later in this section.

Source Version Control

Most course participants have probably already done some version control of their own in some context.  For example, on your hard-drive you might have some documents, such as:
    - John_Smith_CV_Sep19.docx, John_Smith_CV_Dec19.docx, John_Smith_CV_Jan20.docx

Here we are seeing a self-imposed versioning system (by using 'Save As') where we have a "repository" of documents, each of which has been versioned.  The author can access old files, perform updates (saving with a new name), delete versions etc.

In a multi-user software project, this approach is simply not feasible (does anyone really want to see: index-version11-david.html, index-version12-john.html etc.?).  It would create major problems such as broken references, duplication of effort and really bloated and untidy software projects.

Version Control Systems (VCS) or Source Code Management (SCM) tools are software utilities used by programmers to manage their source code in a multi-user environment.  They are responsible for keeping track of multiple versions of files and updates to those files.  Every version of a file carries a timestamp and an indication of the person responsible for the change.


Modern Day Source Code Management

By using SCM, users can look forward to a number of benefits:
  • Files and directory structures are consistent and don't contain versioning in names or directory structures
  • Full audit record of every change made, by whom and when
  • Developers can make changes with confidence and if needed, the changes can be "reverted" (rolled back)
  • The SCM provides the communication mechanism for the team  (e.g. no emails with 'I have made those changes Barry')
  • SCM allows deployment of any version of code to test or production servers
  • Developer code is backed up after committing to the VCS
So what does modern-day source control look like?

Distributed version control diagram

This is what is known as 'Distributed Version Control'.  What this means is that developers don't just "check out" (e.g. download locally) the latest versions of files.  Instead, they mirror the entire repository, containing all files, all revisions, all history etc.   One of the obvious advantages of this is that if the primary version database goes offline, individual developers can continue developing as they have a complete copy of the entire repository.  If the server subsequently comes back online, the local repository changes can be pushed back up to the server to restore it.

What is Git?

Git is the most widely used version control system in use today.  It is an actively maintained open-source project originally developed in 2005 by Linus Torvalds, the creator of the Linux kernel.  As described in the previous section, Git is an example of a DVCS (Distributed Version Control System).  Git is recognised for:
  • Performance: Updating changes and comparing past versions etc. are all optimised for performance.  
  • Security: Git ensures the integrity of managed source code.  It does this by securing all objects in a repository using a cryptographically secure hashing algorithm called SHA-1 (a short illustration follows this list).  This protects code against accidental and malicious change, which is highly critical for software systems that often depend on other projects which could be compromised.
  • Flexibility: Git has been designed to support a wide range of development workflows and is efficient for both small and larger projects.  It supports "branching", "merging", "reverting" and a number of other operations that are extremely powerful.
  • Broad Adoption: Git is the most widely used version control system and has broad industrial adoption.  Numerous tools and third-party integrations for Git (e.g. Eclipse, IntelliJ, Slack etc.) are available.
  • Open Source: Git is a well supported open source project, with strong community support and a vast user base.  High quality documentation and tutorials are available. 
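As a quick illustration of the SHA-1 hashing mentioned under 'Security' above, once Git is installed you can ask it to compute the identifier it would assign to any piece of content (a minimal sketch; piping in different content produces a completely different 40-character hash):

echo "hello" | git hash-object --stdin
ce013625030ba8dba906f756967f9e9ca394464a

Every file version and commit in a repository is addressed by a hash like this, so any change to the content changes the hash.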

Installing Git 

Rather than recreate the steps here, a good guide to installing Git on Linux, Mac or Windows can be found at: https://www.linode.com/docs/development/version-control/how-to-install-git-on-linux-mac-and-windows/
It is a good idea at this point to install Git and ensure that you can type 'git' successfully at the command line in your system.
E.g. after a successful install on Windows, typing 'git --version' at the command prompt should produce output along these lines (the exact version number will vary with your installation):
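git --version
git version 2.25.0.windows.1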


What is GitHub?

GitHub is a Git repository hosting service which provides a web-based graphical interface. It also provides access control and a number of additional tools (such as project Wikis, issue tracking, project forking etc.).  For the purposes of this chapter, we are going to use GitHub as our central repository server.
Note: You can set up your own account on GitHub and create your own repositories with full read/write access.  You will typically only have read access on other people's public repositories.

After creating an account on GitHub, our first task is to create a "repository":
Git - Creating a repository
Note: With a free account, you can create any number of "public" repositories.  These can be seen by anyone with the URL (which is fine for our purposes!), but would not be suitable for a private company repository!  Private repositories typically involve a subscription charge, but there are some deals available for student accounts.

After creating our repository, we can see that it contains:
  • 2 files: the license and the readme (shown below)
  • 1 branch: this is the default branch, known as 'master'
  • 1 contributor: others can be invited as appropriate
Git - Sample almost blank repository

Git/GitHub Demonstration

So at this point, we have a code repository but one that contains no code.  Let's start using some code!  In Eclipse, create a new 'Dynamic Web Project' that we will also call 'ee417-service'.  The name does not necessarily have to match the GitHub naming, but I typically prefer to keep them matching.

Eclipse - Create Dynamic Web Project
This will create our standard web project, complete with src, build and WebContent directories.  


Working on our local repository

From the command line (and having previously installed Git), we can now initialise a local Git repository using:
git init
This creates a new hidden folder '.git' in the root directory of our software project.  By default in Windows, the command 'dir' will not show this hidden directory.  To view the directory, you can type:
dir /A

Initialising local repository

At this point, we have a local Java project together with a local Git repository.  We also have a remote GitHub server repository.  However, these are not linked in any way and the local repository currently knows nothing about the remote repository.  
So let's join them together by telling the local repository about the server repository:
git remote add origin https://github.com/davidmolloydcu/ee417-service.git

So when we now type:
git status
We can see any "untracked files".  Untracked files are any files in our working directory (C:\Java\Workspace\ee417-service) that are not currently stored in the local (or remote) repository and have not been "staged" for commit to the repository.
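For a freshly created Dynamic Web Project, the output will look something along these lines (the exact file list depends on your own project setup):

git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        .classpath
        .project
        .settings/
        WebContent/
        build/
        src/

nothing added to commit but untracked files present (use "git add" to track)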

We firstly add any changes in the working directory to "staging" to indicate that they are ready to be committed to the repository. 
git add .
This command adds all changes in files to the staging area.  It tells Git that we want to include updates to these particular files in the next commit we make.  This doesn't affect the repository in any significant way, as changes are not actually recorded until you run 'git commit':
git commit -m "Initial commit"
(The -m flag supplies the commit message; if you omit it, Git will open a text editor so that you can type one.)


Note: The repository referred to in the diagram above is the local git directory.  This is stored in your working directory in a .git folder.  Have a look at it - but don't touch any of the files!

Adding and committing files to local repository

This commit statement is synonymous with the idea of 'Saving' our files - in this case to our local repository.  Nothing has happened to our server repository at this point in time!

Pushing our Changes to the Server Repository

We are now in a position where we want to share our project with other developers on the team (assuming they were set up as collaborators in the GitHub interface).  There is a command 'git push origin master' which would try to push our current master branch on top of the server branch.  If you try to do this, then it will fail with a rather complex message.
This is because the server repository already has some initial files that we don't have in our local repository/file system (remember the LICENSE and README.md files that were created in the GitHub server repository!)

So what we need to do is to ensure that the local repository is firstly in sync with the server repository, before we will be allowed to push changes to the server repository.  We can do this by:

git pull --allow-unrelated-histories origin master

As can be seen in the screenshot, this copies down the two files already existing on the server repository.  After this command, you should see the two files included locally in the Eclipse project (refresh if needed).  
So now that our local repository is in sync with the server repository, let us finally push up our changes:

git push origin master


Now, if we look at the server repository in GitHub, we can see that our files and directories have been copied successfully up to the central server repository and that both our local repository and server repository are completely in sync.

GitHub showing server repository complete with new files

So now, we have successfully installed Git, created a Java web project, created our first local repository, created a GitHub server repository and managed to get everything in synchronisation!  Throughout this process, we have worked exclusively on one branch called 'master', which is the core master copy of code for our project.  As we were simply initialising our repositories and have been the only developer, working directly with this master branch has been fine so far.  In future, we are going to avoid working directly with 'master' and will typically develop using branches, which we introduce next!


Note re --allow-unrelated-histories:
The approach we took was a slightly unorthodox approach to creating and linking our repositories.  We created two separate repositories and then pointed our local repository at the server repository.  Git does not particularly like this, and without --allow-unrelated-histories it will refuse to synchronise the two repositories.

Normally, the standard practice is to create a repository and then to simply clone that repository (which we will show later in this chapter) and then to start adding files etc. However, I wanted to demonstrate creating a Web Project in the usual way and linking it to a server repository.  


Note re authenticating to GitHub:
The above examples do not show the process of logging in to GitHub.  For Windows users, the login details get stored in the Windows Credential Manager.  A configuration file (typically C:\Users\<WinUsername>\.gitconfig) will make reference to this:

[user]
email = <github_registered_email>
name = <github_username>
[credential]
helper = wincred

If you are trying this yourself, you will likely be prompted initially for your own username/password details.  Once saved in the credentials manager, you will not be asked again (similar to the examples above).

Git Branching

There are a number of approaches that can be taken when committing code changes.  For the purposes of this course, we are going to avoid committing code directly to the master branch.  Our master branch should only contain thoroughly tested and deploy-ready versions of code and should not be used for day-to-day updates.  This is particularly important when you start working on numbered releases, features requiring multiple commits and when you start working with other developers.

This is where 'branches' come in.  A branch represents an independent line of code development.  Typically it starts with a copy of the 'master' branch (so that everything is in sync).  The developer then works entirely in this new branch, which has its own facilities for staging, committing etc.

Let us demonstrate this with a continuation of the example above.  We are going to add a new set of functionality to our project - we will add an index.html page and a HelloWorld servlet. However, initially we are going to do this in a separate branch.

From the command line in the root of the project, we will perform two commands:
git pull origin master
git checkout -b "IndexAndHelloWorld"

Checking out a new branch

The first command retrieves the latest version of the code from the 'master' branch.  Here it indicates that our local repository is 'Already up to date'.  We would expect this since we are the only developer working on this project to date.  However, we should always do this in a multi-developer environment to ensure that our new branch starts with the absolute latest version of all of the files.  Omitting this step can cause merge conflicts at a later stage.

The second command creates a new branch called 'IndexAndHelloWorld' and switches us to this branch.  We are now working on a separate branch of development.

Coding as Normal

We won't go into detail here regarding the coding.  We will simply make an index page that links to a servlet and a servlet that says 'Hello World'.  We simply work in Eclipse in the same way we did in the earlier chapters, creating index.html and HelloWorld.java.
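A minimal sketch of what the two new files might contain is shown here (the com.ee417 package matches the build path we will see shortly; the @WebServlet annotation assumes a Servlet 3.0+ container such as a recent Tomcat):

index.html:
<html>
<body>
<a href="HelloWorld">Say Hello</a>
</body>
</html>

HelloWorld.java:
package com.ee417;

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Responds to GET requests with a simple 'Hello World' page
@WebServlet("/HelloWorld")
public class HelloWorld extends HttpServlet {
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body><h1>Hello World</h1></body></html>");
    }
}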
Output from index page and servlet

Committing our changes to the 'IndexAndHelloWorld' branch

Similar to before, we can type:
git status
to see our untracked changes not yet added to staging or committed on this branch.  
Git Status on our new branch
As expected, we can see the new index.html file and the src/ folder (which contains our package directories and our HelloWorld.java file).  However, we have an additional 'build' folder that we were not expecting.  This folder contains the compiled HelloWorld.class file (in build\classes\com\ee417\HelloWorld.class).
We don't want to add compiled build files to our source code repositories, so we need to tell Git to ignore these files.

To tell Git to ignore certain files, we need to create a file .gitignore in the root of our project.  The contents of this file can be as simple as adding a wildcard list of any files/directories we want to ignore.
build/*
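If you wanted to be a little more thorough, the file could also exclude other generated artifacts (a sketch; adjust it to your own project):

# Compiled build output - generated files should not be versioned
build/*
*.class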

So now, let's progress and do the following:
  1. Add our files to staging    (git add .)
  2. Commit our changes to the 'IndexAndHelloWorld' branch (local repository)      (git commit -m "Description of our commit")
  3. Push our changes to the 'IndexAndHelloWorld' branch (GitHub server repository)       (git push origin IndexAndHelloWorld)
Git - committing our changes to our new branch

Merging our Branch into the Master Branch

At this point our repository has two branches:
1) Master which currently looks like this:
Git - our current master branch

2) IndexAndHelloWorld which has a number of new files (the ones we just added):
Git - our IndexAndHelloWorld branch
Navigating through the directories we can find the 'index.html' file and the 'HelloWorld.java' file *only* in this latter branch.  At this point in time, 'Master' is sitting at the state before we started developing  (assuming nobody else made any changes).  'IndexAndHelloWorld' contains the contents of 'Master' plus the changes we have made.

At this point in the development process, the company is likely to have some procedures that are applied before anything is "merged" into 'master'.  We won't go into this detail here, but rather are going to progress towards merging the branches.

To do this, we create a 'Pull request' in GitHub.  Pull requests tell other team members about the changes you've pushed to a branch and typically mean you are looking to merge code into master.  This might involve testing, code reviews and discussions with other team members, and potentially further iterations of commits if problems are identified.

Git - Making a pull request
The pull request has a title, a description, the option to request code review from nominated individuals and will also show all of the changes that were made to files in the project.  Green highlighted code is newly added code, whereas removed code is typically highlighted in red.
Let's go ahead and simply create the pull request.

Git - Merging the pull request

When we try to do this, GitHub will check for "conflicts".  If it finds no conflicts, it will allow us to merge the 'IndexAndHelloWorld' branch into 'Master', applying all of the changes to Master.
We will talk about conflicts in the next section.

Git - Confirmed Merge into Master

Once this is done - our changes have been merged into Master and we can optionally delete the branch 'IndexAndHelloWorld' now that we have completed that development work.
Our code changes have been applied and will be included in the next release of our web application!  Well done!

Updating Local Branches

So let us consider something now.  On our local repository we currently have two branches:
 - IndexAndHelloWorld: which contains all files.
 - Master: which does not contain the newer files!!!

Why not!!?  If you think about the sequence of events, we did the following:
  • Wrote files to the local 'IndexAndHelloWorld' branch        (local)
  • Committed these to the remote 'IndexAndHelloWorld' branch   (remote)
  • Merged these changes into the remote 'Master' branch        (remote) 
However, our local 'Master' branch was not changed.

To demonstrate this, we can switch to this local 'Master' branch:
git checkout master

If you look closely here, you will notice that our new files have disappeared!  This is because the 'Master' branch was last copied locally before we wrote these files.
If we were to: git checkout IndexAndHelloWorld, the files would appear again.  You will see these files appear and disappear in Eclipse (you might need to refresh).
So after checking out master, what we want to do is to resynchronise it with the server 'Master' branch.

git pull origin master

Git - Updating Local Master

As can be seen, this brings 'Master' up to date including all of our changes (and any changes committed and merged by other developers).  Before creating another branch and making further changes, it is best to synchronise your local 'master' repository with the central 'master' repository.

Conflicts

What do we mean by conflicts?  Let's take a sequential example where two developers are working on a project.  T represents a nominal time-frame.

 Dev A:
   T1: Updates local master branch
   T2: Creates branch 'Dev A Branch'
   T3: Makes changes to 'index.html' and 'HelloWorld.java'
   T4: Adds, commits and pushes to 'Dev A Branch'
   T6: Pull request to merge into Master
   T7: Merge fails due to conflict!

 Dev B:
   T1: Updates local master branch
   T2: Creates branch 'Dev B Branch'
   T3: Makes changes to 'index.html' and creates new 'Home.java'
   T4: Adds, commits and pushes to 'Dev B Branch'
   T5: Pull request to merge into Master
   T6: Pull request merged without conflict

A conflict arises because both developers simultaneously worked on the same file (index.html) and are now both trying to apply their changes.  If GitHub were to automatically allow Dev A's changes to be merged (Dev A's pull request came after the merge by Dev B), then this would overwrite the work performed by Dev B.
In this scenario, there needs to be some human input.  Git provides merging tools which will highlight and help to resolve such conflicts.  
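For illustration, a conflicted index.html opened in an editor would contain markers like these (the branch name and content here are hypothetical):

<<<<<<< HEAD
<h1>Welcome to our Store</h1>
=======
<h1>Welcome to our Shop</h1>
>>>>>>> dev-a-branch

Everything between <<<<<<< and ======= is the version already in the target branch; everything between ======= and >>>>>>> is the incoming change.  The developer edits the file to keep the correct version (or a combination of both), removes the markers and commits the result.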

Sample Git Conflict Tool showing conflicts needing resolution

Resolving conflicts tends to be relatively straightforward but may involve some interaction with the developers.  
Once the conflict regarding 'index.html' is resolved, the change to 'HelloWorld.java' would also be applied, as no conflict arose for that file.

We won't go into conflicts in more detail here.  As a single developer you are unlikely to encounter them, but they do become a reality in a real-world company environment.  


Git Log

Typing:
git log

will show a record of all commits.  For each commit, it shows the SHA-1 hash, the author, the date and the commit message which explains what happened in the commit.  This provides us with complete oversight of all commit activity.
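The output takes roughly this form (placeholders are used here instead of real values):

commit <40-character SHA-1 hash> (HEAD -> master, origin/master)
Author: <author name> <author email>
Date:   <date and time of the commit>

    Description of our commit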

Git Log Output


Git Clone

The 'git clone' command is used to download an entire Git project.  For example, if you wanted to create a complete copy of the repository in a code directory on your local machine you can use:
mkdir code
cd code
git clone https://github.com/davidmolloydcu/ee417-service.git

This can then be opened in Eclipse or your preferred IDE and you can start building your own branches.  If you have permission, you can later make pull requests to make changes on the master repository.
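For example, after cloning you might move into the newly created project directory and immediately start a branch of your own (the branch name here is purely illustrative):

cd ee417-service
git checkout -b MyNewFeature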


Real-World Problems Revisited

Earlier we presented a number of "problems" with working in multi-user environments in the real-world.  We listed these as:
  1. Unless due care is taken, developers can overwrite code written by other developers
  2. Unless due care is taken, developers may have their code overwritten by other developers
  3. Documentation is important now as information has to be shared between teams
  4. No single developer has a full, comprehensive understanding of all code in the company system
  5. The developer may not have a full understanding of how their code will impact components written by other individuals
  6. Integration is now an important issue as there are likely to be a range of separately built components by different teams
In the earlier section, we saw how Source Control Management can help us with the first three of these listed issues.  
The last three issues we will address in the remainder of this section.  But first, let us talk briefly about monolithic web application architecture.


Monolithic Architecture

Traditional software development processes usually result in teams working on a single 'monolithic' web application.  By 'monolithic' we mean that the application is built as a "single unit".  
In fact, this is what we have done to date in this module and what you should have built in your own web application assignments.  The application consists of three main parts:

- A database or data-store of some sort
- A server side application handling logic, retrieval and update of data
- A client-side front end which serves out HTML content for users

It is a single application, running either on a single server or load balanced array.  When we want to make code changes to the application, the code is updated (hopefully using SCM above!) and an entirely new version of the application is deployed as a single entity.  

So let us consider the remaining problems above:
  • No single developer has a full, comprehensive understanding of all code in the company system:  as expected, as the application scales in complexity, it would be no longer possible for any individual to understand all code (there could be millions of lines!)
  • The developer may not have a full understanding of how their code will impact components written by other individuals: if a developer does not have a comprehensive understanding of all code, then they can't be sure that their new code won't break some existing code.  Test-driven development can go a long way to helping here.  However, as projects scale, so indeed does the likelihood of negative impacts.
  • Integration is now an important issue as there are likely to be a range of separately built components by different teams: There are a number of challenges related to integrating internal components.  There are additionally a number of challenges with providing interfaces for third parties to utilise functionality or share data within your monolithic application.


Microservices

Microservices solve a number of challenges of monolithic systems by being as modular as possible.  Instead of building one application, we instead build a suite of small services, each of which runs in its own process and is deployed independently.  Each of these services typically encapsulates some core business capability and is decoupled from other parts of the application.


For example, in the diagram above, if we were building a web application for an e-Commerce system, we might have modules:
  • Green Microservice: Customer management, login, profile editing etc.  Linked database containing user information.
  • Cyan Microservice: Product searching, product information etc.  Linked database containing product information
  • Orange Microservice: Placement, modification, cancellation and tracking of orders.  Linked database containing order information.
  • Red Microservice: User interface for customer management (built in HTML/Javascript, jQuery, Angular, React etc.)
  • Light Blue Microservice: Automated ordering service looking at product information and orders to determine automatic ordering from third-party wholesalers.
You likely have a number of questions with regard to how these individual modules communicate and share data, given that they might be deployed on different hardware, have different databases and can no longer use traditional session tracking etc.
We will get into some of these.  However, first let us consider our three remaining problems again and how Microservices, as a concept, can help solve these problems.

  • No single developer has a full, comprehensive understanding of all code in the company system:  This is no longer an issue as you can have single developers who now have responsibility only for certain microservices.  They only need to have a full understanding of all of the code in their own microservice.
  • The developer may not have a full understanding of how their code will impact components written by other individuals: Micro-services are generally expressed using 'business-oriented APIs' (we will talk more about these later).   The developer simply needs to provide a clean and consistent interface to their microservice that other microservices can use.  
  • Integration is now an important issue as there are likely to be a range of separately built components by different teams: Internal integration and external integration are essentially managed in the same way now.  Because we already need to build APIs for internal application communication, it is simply a small task to externalise these interfaces for third-party services to use!
As we can see, there are a number of advantages associated with micro-services, particularly for large web applications with a wide range of functionality.  
In fact, here are a few more to whet your appetite:
  • Development Stack: because all of our components are built to be entirely independent and only communicating through an API, they can be built in different languages, running on different infrastructure types.  Build your green microservice in .NET, build your orange one in Spring Boot, your cyan one in Python - it doesn't matter.  You can use the best tool for the job.
  • Code Deployment: We want to apply a new feature under the orange microservice.  That code can be updated and a new version deployed with zero downtime or effect on any of the other microservices.  Of course, there may be integration issues experienced on other microservices if they rely on this orange microservice for functionality (e.g. the API would cease to function).  However, we no longer have to bring down the entire web application to deploy a new change to a single function.
  • Improved Developer Productivity: As developers can focus on individual components with clearly defined boundaries, this typically enhances productivity in large projects. Jeff Bezos (Amazon CEO) in the early days of Amazon instituted a rule "every internal team should be small enough that it can be fed with two pizzas".  
  • Scalability:  Microservices allow us to scale individual components and provide the resources that are needed.  If our green microservice sees a big uptake in usage and needs more CPU/memory we can scale it easily without unnecessarily scaling other components which may operate at lower loads.  
We have referred to 'business-oriented APIs' and 'API-driven design' above.  To better understand these, let us first talk a little about JSON and RESTful web services.


RESTful Web Services

REST is an acronym for REpresentational State Transfer and is an architectural style for developing web services.  REST is popular as it's a pretty simple way of representing data to be transferred between different service endpoints.  Data is transferred over standard HTTP/HTTPS protocols using standard HTTP methods (GET, POST, PUT, DELETE etc.).  By using these protocols, we typically avoid issues around firewalls or having to closely link service endpoints.

In earlier years and in traditional web services, SOAP (Simple Object Access Protocol) was the most common approach that was used.  This approach used XML data.  This was described in detail in 'Web Services (RETIRED)' but this content has now been retired.  This means that this content will not be asked in an examination for this module.  It has been linked to for reference purposes only.  


JSON

The most common data format for RESTful transfer of data today is JSON (JavaScript Object Notation).  JSON is a lightweight format for storing and transporting object-oriented data.  Here is some sample JSON data:

[
  {
    "id": 1,
    "name": "Leanne Graham",
    "username": "Bret",
    "email": "Sincere@april.biz",
    "address": {
      "street": "Kulas Light",
      "suite": "Apt. 556",
      "city": "Gwenborough",
      "zipcode": "92998-3874",
      "geo": {
        "lat": "-37.3159",
        "lng": "81.1496"
      }
    },
    "phone": "1-770-736-8031 x56442",
    "website": "hildegard.org",
    "company": {
      "name": "Romaguera-Crona",
      "catchPhrase": "Multi-layered client-server neural-net",
      "bs": "harness real-time e-markets"
    }
  },
  {
    "id": 2,
    "name": "Ervin Howell",
    "username": "Antonette",
    "email": "Shanna@melissa.tv",
    "address": {
      "street": "Victor Plains",
      "suite": "Suite 879",
      "city": "Wisokyburgh",
      "zipcode": "90566-7771",
      "geo": {
        "lat": "-43.9509",
        "lng": "-34.4618"
      }
    },
    "phone": "010-692-6593 x09125",
    "website": "anastasia.net",
    "company": {
      "name": "Deckow-Crist",
      "catchPhrase": "Proactive didactic contingency",
      "bs": "synergize scalable supply-chains"
    }
  }
]

We can see here how the data represents two users, complete with a child address object and company object (and their associated properties).  It is clean, lightweight and contains both the structure of the data and the data itself. 

Here is another example that I like:
[{"id":56,"type":"programming","setup":"How do you check if a webpage is HTML5?","punchline":"Try it out on Internet Explorer"}]

While the latter example is less visually formatted, the JSON code is equally valid.  There are JSON validators available on the web if you ever want to test valid JSON data.  We won't go into the smaller details around JSON - as a general idea it should be pretty understandable.

RESTful Methods

Here is a table from a recently developed API:
 URL                Method  Description                               Sample Response Codes
 /users             GET     Get all users                             200 OK, 500 Error, 403 Forbidden
 /users             POST    Create a user                             200 OK, 500 Error, 403 Forbidden, 402 Validation
 /users/{id}        GET     Get a single user matched by the id       200 OK, 500 Error, 403 Forbidden
 /users/{id}        PUT     Update a user based on provided details   200 OK, 500 Error, 403 Forbidden, 402 Validation/Unable
 /users/{id}        DELETE  Delete a user based on provided details   200 OK, 500 Error, 403 Forbidden, 402 Validation/Unable
 /roles             GET     Get all available roles                   200 OK, 500 Error, 403 Forbidden
 ...                ...     ...                                       ...
 /users/{id}/roles  GET     Get all roles assigned to the user        200 OK, 500 Error, 403 Forbidden
 /users/{id}/roles  POST    Assign a role to the specified user       200 OK, 500 Error, 403 Forbidden, 402 Validation
 ...                ...     ...                                       ...

While the API documentation provided should ideally be more detailed than this, this example shows the range of functionality offered by the service, which URL and HTTP method should be used, and what the likely responses might be.

It provides a clear API interface for other services to use this service. This may be an external service (either public or requiring authentication) or indeed may be an internal microservice.    
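To make this concrete, a call to the '/users/{id}' row of the table might look like this on the wire (the host name and response body here are hypothetical):

GET /users/1 HTTP/1.1
Host: api.example.com
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

{"id": 1, "name": "Leanne Graham", "username": "Bret"}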

Here are some APIs that you might be interested in browsing to get a better handle on things:



API-Driven Design

Let's build a really quick web application that is going to consume some data over an API.  
To do this we will use a very cool site: https://jsonplaceholder.typicode.com/
This is a free online REST API that developers can use whenever they want to use a bit of fake data for testing.  It's great for testing new libraries, sharing some code and providing an example to a bunch of students in an EE417 module!

<html>
<head>
<title>My Page Consuming an API</title>
<script type="text/javascript">
function updatePageWithServerData(json) {
   console.log(json);
   document.getElementById("nameSpan").innerHTML = json.name;
   document.getElementById("websiteSpan").innerHTML = json.website;
}
</script>
</head>

<body>
<h1>API Example</h1>
<p><b>Name: </b><span id="nameSpan">...</span></p>
<p><b>Website: </b><span id="websiteSpan">...</span></p>

<script type="text/javascript">
console.log("Fetching...");
fetch('https://jsonplaceholder.typicode.com/users/1')
  .then(response => response.json())
  .then(json => updatePageWithServerData(json))
  .catch(error => console.log(error))
</script>
</body>
</html>
Sample Code of Simple User Interface to Retrieve JSON Server Information

Loading JSON data into our simple UI
Sample Output and Console View 

So here we can see how we can write applications that talk to third party services using JSON as the common data transfer format.  As long as the JSON format remains consistent from the providing service, we can write our own consuming services.  A few additional points here:

1) Language Independence: What programming language is the jsonplaceholder application written in?  Java/PHP/Python?  We don't know and we don't really care!  By using decoupled services through JSON, the programming languages used for each service are not important to us.  All that we care about is the provision of service and the format of the JSON integration interface.

2) Network Access: Typically this is not a problem, since we are only using HTTP/HTTPS which are standard ports (80/443) that are typically enabled on most networks.  As a result, there are typically no extra firewall/routing configurations required.

3) Reliability: Naturally, if we build applications that are reliant on multiple third party services then we need to have a strong guarantee of reliability of those third party services.  If those services were to go offline, then this could in turn reduce functionality of our own web applications.  Most third party services come with a reliability guarantee in the form of a service level agreement (SLA).  For example, Google Maps provides a 99.9% SLA (365 days x 24 hours = 8,760 hours in a year = downtime of no more than about 8.76 hours per year).  Of course, legal recourse might be limited unless you have a specific company agreement - most SLA violations simply result in proportional service credits being allocated.

4) Security: While there are a range of free API services on the web, the majority of them have some authorisation process wrapping them.  This is to prevent abuse of web services by limiting the number of requests and is also aimed at generating revenue for the company providing the services.  For example, if you want to use the Google Maps API then you need an API key (https://developers.google.com/maps/documentation/javascript/usage-and-billing).  Security is typically achieved through authorisation properties being passed with the RESTful requests, either contained in the query string (e.g. https://domain/pathToService?key=A393B300F194CE3GAO3AD) or passed in the HTTP headers of the request.  For example, a common header used is 'Authorization'.

'authorization: Basic abNlcm5hbWU6ABIzNDU2Nzg5MDEyMzQ1Njc4OTAxMjM0NTY3ODkwMTIzNDU2Nzg5XC==' 
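In the fetch-based style used earlier, such a header could be attached to a request like this (a sketch - the URL and credentials are placeholders; btoa() base64-encodes the 'username:password' pair required for HTTP Basic authentication):

fetch('https://domain/pathToService', {
  headers: {
    'Authorization': 'Basic ' + btoa('username:password')
  }
})
  .then(response => response.json())
  .then(json => console.log(json));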


Microservices: Building our own Services

To date, we have built monolithic applications.  All aspects, including the presentation, application and data interfacing have all been bundled inside the same set of application code.  Our entire application is bundled inside a WAR file and deployed on a traditional server.  We have not yet considered exposing our functionality with APIs, making our services available to third party providers to consume.  

Microservices Architecture
Microservices Architecture: (Reference HashNode)

Microservice architecture focuses on designing your application as single function components (microservices) with well-defined interfaces and operations that communicate via APIs.  The granularity of these services depends on business needs.  For example, you might decide to build a microservice dealing with user management, including roles, departments etc.  When other microservices need access to user information, they cannot simply load this information from the database (it is most likely not co-located in the same database).  Instead they will use API requests (GET for retrieval) in order to access the user information - this information is transferred over HTTP in JSON format.
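As a minimal sketch of what such a request might look like from a Java-based microservice (the URL and the response layout are illustrative assumptions):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.stream.Collectors;

public class UserClient {
    public static void main(String[] args) throws Exception {
        // GET the user's details from the user management microservice
        URL url = new URL("https://users.internal.example.com/users/42");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/json");

        // Read the JSON response body, e.g. {"id":42,"name":"..."}
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String json = in.lines().collect(Collectors.joining());
            System.out.println(json);
        }
    }
}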

When it comes to the client-side user interface, in monolithic applications we typically pushed data in a synchronous manner to the client in the form of HTML with data embedded.  We may have additionally performed some asynchronous Ajax updating.
With microservices, we instead build the user interface as a completely decoupled standalone service (or commonly a group of modular services).  All data requests are performed asynchronously using scripting-initiated API service calls to our other microservices.   

All of our services can be deployed on individual, scalable architecture, running completely independently from one another (except through common API interfaces).  They represent an effective modular approach to architectural design.

Microservices also offer advantages around integration.  For example, if we are already using JSON RESTful interfaces for our intra-application communication then we can easily use these interfaces for interfacing with third-party clients who wish to integrate with our software services.  For example, the third-party service could (with permission) perform API calls to load user and role information from your user management microservice.  

While we won't have time in this course to go into microservices in more detail, it is worth mentioning that microservices do come with some new challenges.

Microservices: Challenges

Authentication / Authorisation: In the early part of our course, we talked about session tracking as an effective method for maintaining "state" while users used our web applications.  We stored login information in our session to determine whether a user had previously authenticated successfully.  The session represented some information stored on the server for each user.  Microservices on the other hand are completely stateless!  While this has significant advantages around scaling, it means that we can no longer store information in a shared session in a JVM.  This presents some additional challenges and is typically overcome by ensuring that a secure authentication token is sent with every request.  One such approach is JSON Web Tokens (see: https://jwt.io/introduction/ ), which are digitally signed tokens for securely transmitting information between parties as a JSON object.
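In practice, this usually means that every request to a microservice carries a header of roughly this shape, where the token consists of three dot-separated, base64url-encoded parts (the structure is standard JWT; the placeholder names are ours):

Authorization: Bearer <encoded header>.<encoded payload>.<signature>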

Logging / Debugging: With monolithic applications the logs are typically stored in one location (e.g. something like <TOMCAT_HOME>/logs/localhost.log).  All log entries from the entire application are stored in the one file, making it easy to track and debug issues.  With microservices, each of our services can be running on a different server, written in a different language etc.  We no longer have a simple approach of browsing all logs at once, as they could be stored in log files across multiple servers.  This makes bug tracking much more difficult.
There are a number of solutions for this often using some sort of centralised logging stack (e.g. something like https://www.elastic.co/what-is/elk-stack )

Hosting / Server Management: With monolithic applications, our entire application was running on one server or on multiple load balanced replica servers.  With a microservices approach we could potentially have dozens or hundreds of separate services all running in a standalone capacity.  Assuming we used one scalable server per microservice, this adds a significant server management overhead onto our development team.  This is typically managed using a container-based approach, such as Docker, often together with an orchestrator such as Kubernetes.  Containers allow a developer to package up an application with all of the parts it needs, such as libraries and other dependencies, and ship it all out as one package.  These can be deployed in clusters onto servers, greatly simplifying the deployment process.  For example, a container might have the user management service running on https://mydomain.com:8000, the order management service running on https://mydomain.com:8500, the login service running on https://mydomain.com:9000 etc.  There are a number of container management services that allow the management of these containers, their memory, CPU, monitoring and auto-scaling.

Development: When building monolithic applications, the developer has the entire application running locally while in development mode.  With microservices this is more complex as they might need to start up dozens of microservices to deploy the application even during development.  Container services, such as Docker provide the solution to this challenge.  Developers can download, run and develop within a managed container.

API Versioning: Throughout this course, we have said a number of times that building web applications is an iterative process with new functionality added all of the time.  We wish to constantly make changes and there are times where we want to change the API interfaces (perhaps to pass more data, remove some properties, add new API functions etc.).  However, if we change the API we now risk breaking the interface with some other components.  This is typically handled using API versions.  Rather than change the API, we simply release a later version of the API and provide backwards compatibility with the old version.  For example, this could be handled through URL-based versioning:
Existing services could use:  
GET /v1/order/1 
Whereas, we can now launch version 2 and new services could use:
GET /v2/order/1
These APIs can each be shared with third party companies also, who can choose to use their preferred version.

Final Notes on Microservices

Microservices are fairly complex and we won't go into further detail in this course.  However, they are growing significantly in popularity and over time we will see a large drop-off in the usage of monolithic applications.  For anyone working in web development into the future, it would be useful to learn more about microservices.  
 
