7022DATSCI Big Data Analysis

Master of Sensors Data and Management
Big Data Analysis


  • You should work on the mini-projects in groups of up to 5 students.
  • Use electronic communication for organising your group work. Support will be provided online via
  • Together with your group, prepare a PowerPoint presentation of your project with 10 minutes of recorded audio. All group members will receive the same mark for the presentation.
  • In addition, everyone should hand in a one-page summary of the project.
  • In the week after the deadline (week commencing Monday, 27th April 2020) each group should meet with me via videoconference and explain the code that they have produced for the project (“code walkthrough”). Each student will receive an individual mark for the code demonstration.

Project: Big Data Analysis

The aim of the Big Data Analysis project is to apply a machine learning method in a practical setting. In each of the following projects you are asked to...

  1. Work on a practical machine learning project.
  2. Present your work in a presentation.

You will work on your projects in groups of 3-5 students. The following list contains suggestions for project topics. Additional topics might become available and you can also suggest alternative topics:

  • “3, 6, 8, 9?”—recognising hand-written digits with principal component analysis

Apply principal component analysis to recognise handwritten digits in the MNIST data set (http://yann.lecun.com/exdb/mnist/), as explained in (Lu, 2017) but without the pre-processing using Histograms of Oriented Gradients (HOG).
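As a starting point, the general idea can be sketched as follows. This is only a minimal illustration under stated assumptions, not the method of (Lu, 2017): it uses synthetic vectors as a stand-in for the MNIST images, computes the principal components with a singular value decomposition, and classifies with a nearest-centroid rule in the reduced space. The function names, the number of components and the classifier are all hypothetical choices made for this example.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the mean and the top principal directions of the rows of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(X, mean, components):
    """Project the rows of X onto the principal directions."""
    return (X - mean) @ components.T

def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Assign each test point the label of the closest class mean (an assumed classifier)."""
    labels = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in labels])
    # Squared Euclidean distances, shape (n_test, n_labels).
    dists = ((Z_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[dists.argmin(axis=1)]

# Synthetic stand-in for MNIST: three "digit" classes of 64-pixel vectors.
rng = np.random.default_rng(0)
n_per_class, n_pixels = 100, 64
prototypes = rng.normal(size=(3, n_pixels)) * 3.0
X = np.vstack([p + rng.normal(size=(n_per_class, n_pixels)) for p in prototypes])
y = np.repeat(np.arange(3), n_per_class)

# Shuffle, then split into training and test sets.
perm = rng.permutation(len(y))
X, y = X[perm], y[perm]
X_train, y_train, X_test, y_test = X[:240], y[:240], X[240:], y[240:]

mean, components = fit_pca(X_train, n_components=5)
Z_train = project(X_train, mean, components)
Z_test = project(X_test, mean, components)
accuracy = (nearest_centroid_predict(Z_train, y_train, Z_test) == y_test).mean()
print(f"test accuracy: {accuracy:.2f}")
```

For the real project, the synthetic arrays would be replaced by the flattened MNIST images and their labels; the classification step should follow the article rather than this sketch.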

  • Googling food webs—the PageRank of extinction

Implement the variant of the PageRank algorithm described in (Allesina and Pascual, 2009) and reproduce the study for some of the food webs from this article. Note that some of the food webs are available in R by installing the cheddar library.
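For orientation, the standard PageRank algorithm can be sketched with a power iteration as shown here. Note that this is plain PageRank with a damping factor, not the variant described in (Allesina and Pascual, 2009); adapting it to the article is the actual task. The toy graph, its species names and the edge direction (consumer pointing to resource) are hypothetical and chosen purely for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Standard PageRank by power iteration.

    links maps each node to the list of nodes it points to.
    Returns a dict mapping each node to its score; the scores sum to 1.
    """
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Score held by dangling nodes (no outgoing links) is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                new[v] += damping * rank[u] / len(targets)
        change = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if change < tol:
            break
    return rank

# Hypothetical toy food web; edges point from consumer to resource.
toy_web = {
    "hawk": ["snake", "mouse"],
    "snake": ["mouse"],
    "mouse": ["grass"],
    "grass": [],
}
scores = pagerank(toy_web)
for species, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{species}: {score:.3f}")
```

In this toy example the species at the base of the web collects the highest score, because every chain of links eventually leads to it.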

  • MCMC for code cracking

A highly original application of Markov chain Monte Carlo (MCMC) was presented by (Diaconis, 2009) and extended by (Chen and Rosenthal, 2012). Implement and test the approach by reproducing the example described in (Diaconis, 2009).
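The core idea can be sketched as a Metropolis sampler over substitution keys: score a candidate decryption by the likelihood of its consecutive character pairs under frequencies estimated from a reference text, propose a swap of two symbols in the key, and accept the swap with probability min(1, likelihood ratio). The sketch below is a simplified illustration under assumptions, not a reproduction of the example in (Diaconis, 2009): the alphabet, the add-one smoothing, the number of steps and the very short reference text are all hypothetical choices, and with so little reference text the decryption should not be expected to succeed.

```python
import math
import random
import string
from collections import Counter

ALPHABET = string.ascii_lowercase + " "

def bigram_log_probs(reference_text):
    """Log transition probabilities between consecutive characters (add-one smoothing)."""
    text = "".join(c for c in reference_text.lower() if c in ALPHABET)
    counts = Counter(zip(text, text[1:]))
    row_totals = Counter(text[:-1])
    k = len(ALPHABET)
    return {(a, b): math.log((counts[(a, b)] + 1) / (row_totals[a] + k))
            for a in ALPHABET for b in ALPHABET}

def log_score(text, log_probs):
    """Log-likelihood of the character transitions in text."""
    return sum(log_probs[pair] for pair in zip(text, text[1:]))

def decode(text, key):
    """Apply a substitution key (dict symbol -> symbol) to text."""
    return "".join(key[c] for c in text)

def metropolis_decrypt(ciphertext, log_probs, n_steps=5000, seed=0):
    """Search over substitution keys; return the best key seen and its score."""
    rng = random.Random(seed)
    symbols = list(ALPHABET)
    key = dict(zip(symbols, symbols))  # start from the identity key
    current = log_score(decode(ciphertext, key), log_probs)
    best_key, best = dict(key), current
    for _ in range(n_steps):
        a, b = rng.sample(symbols, 2)
        key[a], key[b] = key[b], key[a]  # propose: swap two symbols
        proposed = log_score(decode(ciphertext, key), log_probs)
        if proposed >= current or rng.random() < math.exp(proposed - current):
            current = proposed  # accept the proposal
            if current > best:
                best_key, best = dict(key), current
        else:
            key[a], key[b] = key[b], key[a]  # reject: undo the swap
    return best_key, best

# Hypothetical, deliberately tiny reference text and message.
reference = "the quick brown fox jumps over the lazy dog and then the dog chases the fox over the hill " * 20
plaintext = "the dog jumps over the fox"
shuffled = list(ALPHABET)
random.Random(1).shuffle(shuffled)
secret_key = dict(zip(ALPHABET, shuffled))
ciphertext = decode(plaintext, secret_key)  # encrypt with a random substitution

log_probs = bigram_log_probs(reference)
initial_score = log_score(ciphertext, log_probs)
best_key, best_score = metropolis_decrypt(ciphertext, log_probs)
print(decode(ciphertext, best_key))
```

A realistic experiment needs a long reference text to estimate the transition frequencies and a longer encrypted message; the choice of both is part of the project.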


Allesina, S., Pascual, M., 2009. Googling food webs: Can an eigenvector measure species’ importance for coextinctions? PLOS Computational Biology 5 (9), 1–6.

URL https://doi.org/10.1371/journal.pcbi.1000494

Chen, J., Rosenthal, J., 2012. Decrypting classical cipher text using Markov chain Monte Carlo. Statistics and Computing 22, 397–413.

URL https://doi.org/10.1007/s11222-011-9232-5

Diaconis, P., 2009. The Markov Chain Monte Carlo Revolution. Bulletin of the American Mathematical Society 46 (2), 179–205.

Lu, W., 2017. Handwritten digits recognition using PCA of histogram of oriented gradient. In: 2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM). pp. 1–5.

What you should hand in

  1. Each group: A PowerPoint presentation with 10 minutes of recorded audio (25%).
  2. Every student: A one-page summary of your mini-project (25%).
  3. Every student: A text file containing your commented source code (50%).

Important! All group members will receive the same mark for the PowerPoint presentation; the one-page summary and the code demonstration will be marked individually.

Presentation/One-page summary: criteria for partial marks

  • Brief description of your application
  • Motivation: Which challenge are you going to address?

  • What are the challenges of implementing the algorithm?
  • Explain how you implemented the method.

  • What have you found out about your data set?
  • Show how your machine learning method addresses the challenge described in the Introduction.

  • Brief summary of the analysis of the data
  • Critically reflect on how well the challenge described in the Introduction was solved by your machine learning approach.

Formal marks

  • Visual presentation
  • Delivery of the talk
  • Time keeping




Source code (submitted to Canvas and demonstration): criteria for partial marks

  • Completeness of the implementation
  • Clarity of the code
  • Quality of comments



