ENN543 Data Analytics and Optimisation Semester 2, 2019

Assignment

This document sets out the three (3) questions you are to complete for this assessment. The assignment is worth 30% of the overall subject grade. Weights for individual questions are indicated throughout the document. Students are to work in small groups (2-3), submit their answers in a single, separate document (either a PDF or Word document), and upload it to TurnItIn.

The submitted answers should be written in the style of a research report, and outline the methods investigated (and the rationale behind selecting them), how the evaluation was structured (i.e. training and testing splits, etc.), the results obtained and the conclusions.

One submission should be made per group.

Further instructions:

  1. Data required for this assessment is available on Blackboard in ENN543 Assessment 2 Data.zip, alongside this document.
  2. As part of the submission, students should provide a brief table of contributions to outline the contribution of each group member. This table should be signed by all group members to signal agreement. Note that all group members will be assigned the same mark unless one or more members of the group explicitly request that marks be moderated based on contributions.
  3. MATLAB code or scripts (or equivalent materials for other languages) may be submitted as supplementary material or appendices; however, note that this material will not be directly marked and will only be used if there are ambiguities.
  4. Figures and outputs/results that are critical to the answer should be included in the main response.
  5. Students who require an extension should lodge their extension application with HiQ (see http://external-apps.qut.edu.au/studentservices/concession/). Please note that teaching staff (including the unit coordinator) cannot grant extensions.

Problem 1. Subject Invariant Activity Recognition (30%). A common problem when performing classification and/or recognition on human behaviours, for example speech or activity recognition, is that different people will perform the same behaviour in different ways. In the case of speech this may manifest as different accents, while in the case of activity recognition differences between subjects may be driven by differences in how they move their body to perform the target action. To cope with such variations, models are typically trained on many subjects in the hope that they can capture the different possible manners in which the target behaviours are performed, such that the model can generalise to unseen subjects.

The Task

You are to investigate how feasible it is to use a single chest mounted accelerometer to recognise the type of activity being performed by a person. You have been provided with data for 15 subjects (see 1.csv, 2.csv, 3.csv, ... 15.csv in the Q1 directory of the assignment data archive). Each subject performs seven activities:

  • Working at Computer
  • Standing Up, Walking and Going up/down stairs
  • Standing
  • Walking
  • Going Up/Down Stairs
  • Walking and Talking with Someone
  • Talking while Standing

The data consists of five columns which are (in order):

  • A sequential index, which should be ignored
  • X acceleration
  • Y acceleration
  • Z acceleration
  • Activity type, which is your model target
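Individual accelerometer readings carry little information on their own, so one possible starting point is to summarise short windows of samples before classification. The sketch below is only an illustration: the function name `window_features`, the window length of 50 samples, the choice of mean and standard deviation as features, and the decision to skip windows that contain more than one activity are all assumptions, not requirements of the task.

```python
from statistics import mean, pstdev


def window_features(rows, window=50):
    """Summarise non-overlapping windows of (index, x, y, z, label) rows.

    Returns a list of (features, label) pairs, where features holds the
    mean and standard deviation of each axis. Windows that span more than
    one activity label are skipped (an assumption of this sketch).
    """
    samples = []
    for start in range(0, len(rows) - window + 1, window):
        chunk = rows[start:start + window]
        labels = {r[4] for r in chunk}
        if len(labels) != 1:
            continue  # window straddles an activity change
        feats = []
        for axis in (1, 2, 3):  # column 0 is the sequential index, ignored
            values = [r[axis] for r in chunk]
            feats.extend([mean(values), pstdev(values)])
        samples.append((feats, chunk[0][4]))
    return samples
```

Each subject's file would be processed separately so that no window mixes data from two subjects.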

Using this data, you are to develop a model to recognise the activities of an unseen subject, i.e. having trained the model on a selection of subjects, consider how well the model recognises the activities of a subject who has been held out of the training set.

In addressing this problem you are free to select your own approach (from those covered in lectures and tutorials) to recognise the actions. Your answer should explain the method you have chosen to use, and provide justification for your choice. You may optionally wish to include small scale experiments to support your decision. The training and testing splits used should be documented, and you should ensure that you evaluate your proposed approach on multiple subjects. Your answer should provide an analysis and discussion of your model's performance, should identify situations where the model fails (and if possible reasons for this failure), and should determine if performance is consistent across all subjects.
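One way to evaluate on multiple unseen subjects is to hold out each subject in turn and train on the rest. The following is a minimal sketch of that loop, not a prescribed protocol: `evaluate_across_subjects` and the caller-supplied `fit_and_score` function are hypothetical names introduced here for illustration.

```python
def evaluate_across_subjects(subject_ids, fit_and_score):
    """Hold out each subject in turn and collect one score per subject.

    fit_and_score(train_ids, test_id) is a caller-supplied (hypothetical)
    function that trains a model on the subjects in train_ids and returns
    its accuracy on the held-out subject test_id.
    """
    scores = {}
    for held_out in subject_ids:
        train_ids = [s for s in subject_ids if s != held_out]
        scores[held_out] = fit_and_score(train_ids, held_out)
    return scores


# Subject identifiers matching 1.csv ... 15.csv in the Q1 directory.
subjects = list(range(1, 16))
```

Keeping one score per held-out subject makes it straightforward to check whether performance is consistent across subjects, rather than reporting only an average.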

Problem 2. Recognising a Person’s Age from Their Face (30%). Age estimation is a widely studied task relating to facial recognition, with applications in domains such as biometrics and human-computer interaction. Estimating age from facial images suffers from many of the same challenges as face recognition, such as variations in appearance caused by pose, lighting, and facial accessories (e.g. glasses) or facial hair. Much like facial recognition, a critical pre-processing step for age estimation is to localise and align the face, such that all examples are as consistent as possible in terms of the location of major landmarks such as the eyes and nose.

The Task

You are to develop a method to estimate the age of a subject from a facial image. The file UTKFace.zip in the Q2 directory within the data archive contains the aligned and cropped face images from the UTKFace dataset. This archive contains 20,000+ colour face images, all of which have been cropped and aligned in preparation for further processing. A selection of example raw images (i.e. uncropped images) are shown in Figure 1.

Figure 1: Example raw images from UTKFace. Note that the supplied cropped and aligned images contain only the face regions.

Faces in the archive are named as follows: [age]_[gender]_[race]_[datestamp].jpg, where:

  • [age]: an integer from 0 to 116, indicating the age;
  • [gender]: either 0 (male) or 1 (female);
  • [race]: an integer from 0 to 4, denoting White, Black, Asian, Indian, and Others (like Hispanic, Latino, Middle Eastern);
  • [datestamp]: date and time stamp, in the format of yyyymmddHHMMSSFFF, corresponding to the date and time an image was collected to UTKFace.

The [age] value is to be the primary response of your model. You may use or ignore the other variables as you choose.
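Because the labels are encoded in the file names, they can be recovered with a small parser. The sketch below assumes the four fields are separated by underscores, as in the public UTKFace release; the function name `parse_utkface_name` and the example file name in the usage note are hypothetical.

```python
from pathlib import Path


def parse_utkface_name(filename):
    """Split an '[age]_[gender]_[race]_[datestamp].jpg' name into fields.

    Returns a dict, or None when the name does not have four fields or the
    first three are not integers (such files would need to be inspected or
    skipped by hand).
    """
    parts = Path(filename).stem.split("_")
    if len(parts) != 4 or not all(p.isdigit() for p in parts[:3]):
        return None
    age, gender, race = (int(p) for p in parts[:3])
    return {"age": age, "gender": gender, "race": race, "datestamp": parts[3]}
```

For a made-up name such as `25_0_2_20170116174525125.jpg`, the parser returns an age of 25, a gender of 0 and a race of 2. Returning `None` for malformed names lets the caller count and report any files that were excluded.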

In addressing this problem you are free to select your own approach (from those covered in lectures and tutorials) to determine age. Given the large size of the database, you are welcome to down-sample the images to a lower resolution, though be aware that if you are too aggressive in your down-sampling you may lose the ability to estimate age.
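As an illustration of down-sampling, the sketch below reduces an image by averaging non-overlapping blocks of pixels using NumPy. Block averaging is only one of several possible approaches, and the function name `downsample` and the `factor` parameter are assumptions made for this example.

```python
import numpy as np


def downsample(image, factor):
    """Reduce height and width by `factor` using block averaging.

    `image` is an (H, W) or (H, W, C) array; any rows or columns that do not
    fill a complete block are cropped.
    """
    h = (image.shape[0] // factor) * factor
    w = (image.shape[1] // factor) * factor
    cropped = image[:h, :w].astype(float)
    # Split each spatial axis into (blocks, pixels-per-block), then average
    # over the two pixels-per-block axes.
    blocks = cropped.reshape(h // factor, factor, w // factor, factor, -1)
    out = blocks.mean(axis=(1, 3))
    return out if image.ndim == 3 else out[..., 0]
```

Comparing model performance at two or three values of `factor` is one small scale experiment that could help justify the chosen resolution.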

Your answer should explain the method you have chosen to use, and provide justification for your choice. You may optionally wish to include small scale experiments to support your decision. Any approaches used to modify the data should be documented, as should the training and testing splits. Your answer should also provide an analysis and discussion of your model's performance, including identifying situations where the model fails and possible reasons for the failures.

Problem 3. Classifying Digits (40%). The MNIST dataset is a widely established benchmark dataset in computer vision, and recent machine learning methods can achieve almost perfect performance on the dataset. Despite this, digit recognition, and more broadly character recognition, still poses a challenge as many datasets have far greater variability than is observed in MNIST. One of the main challenges for methods stems from the within-class variability that occurs due to changing conditions. If we consider recognising numbers or characters in natural scenes, we have changes in camera pose, lighting (which impacts brightness, contrast, and the presence of shadows), camera white balance, resolution and the style of font used to render the character. The wide variety of conditions makes recognition challenging, and as such it is desirable to have methods that can generalise to unseen conditions.

Such generalisation can be achieved in a number of ways, with more complex classifiers (such as deep convolutional neural networks) and feature extraction techniques that seek to extract salient features that are common across different domain variations being popular approaches.

The Task

You are to investigate how well classifiers trained using different methods generalise across databases. In this question you will train on the MNIST database, and test on both MNIST and the Street View House Numbers (SVHN) database. Examples from the two databases are shown in Figure 2.

(a) MNIST

(b) SVHN

Figure 2: Examples from the MNIST (left) and SVHN (right) datasets.

To complete this question you have been supplied with the following data files (all in the Q3 directory of the data archive):

  • mnist train.mat: The MNIST training set. Data within this file should be used to train all models.
  • mnist test.mat: The MNIST testing set. This file should be used to evaluate the developed models on ‘in domain’ data. Data in this file should not be used for model training.
  • svhn test.mat: The SVHN testing set. This file should be used to evaluate the developed models on ‘out of domain’ data. Data in this file should not be used for model training. Note that for consistency with the MNIST data, the images in svhn test.mat have already been converted to greyscale.

Each of these data files contains two variables:

  • imgs: A 32 × 32 × N matrix, where N is the number of samples. Each 32 × 32 matrix is a single instance of an image.
  • labels: A 1 × N vector, where N is the total number of samples. Each entry is the label for the corresponding image in imgs.
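Many classifiers and dimension reduction methods expect one sample per row, so the 32 × 32 × N image stack usually needs to be rearranged first. The sketch below shows one way to do this with NumPy once the array has been loaded (for example with `scipy.io.loadmat`, assuming the files are in a format it can read); the function name `flatten_images` is hypothetical.

```python
import numpy as np


def flatten_images(imgs):
    """Turn a 32 x 32 x N stack into an N x 1024 matrix, one row per image."""
    n = imgs.shape[2]
    # Move the sample axis to the front, then flatten each 32 x 32 image.
    return np.moveaxis(imgs, 2, 0).reshape(n, -1)
```

The 1 × N label vector can be flattened in the same spirit (for example with `labels.ravel()`) so that entry i corresponds to row i of the result.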

Using this data you should explore how various classifiers and dimension reduction techniques can impact performance, and the ability of a model to generalise. In particular you should investigate:

  • SVMs and CKNN classifiers using both raw data (i.e. pixel-values) and data that has had dimensionality reduction applied;
  • Deep Convolutional Neural Network classifiers;
  • Principal Component Analysis, Linear Discriminant Analysis and Deep Auto-Encoders as dimensionality reduction methods.

Performance on both the ‘in domain’ MNIST testing data and the ‘out of domain’ SVHN testing data should be considered in your evaluation and analysis.
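When dimensionality reduction is used, the projection should be learned from the MNIST training data alone and then applied unchanged to both test sets. The sketch below illustrates this for PCA using a singular value decomposition in NumPy; it is a minimal illustration rather than the expected implementation, and the function names `fit_pca`, `apply_pca` and `accuracy` are assumptions.

```python
import numpy as np


def fit_pca(X_train, n_components):
    """Learn a PCA projection from training rows only."""
    mean = X_train.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(X, mean, components):
    """Project rows of X using a previously fitted mean and components."""
    return (X - mean) @ components.T


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

The same fitted `mean` and `components` would be passed to `apply_pca` for the MNIST test rows and the SVHN test rows, and `accuracy` reported separately for each so that the in-domain and out-of-domain results can be compared directly.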

Note that for some methods it may not be computationally feasible to train on all data. In this case, you should decimate the data to obtain a representative sample that allows you to train and evaluate the model. In such cases you should document how the dataset was reduced.
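One way to decimate the data while keeping every digit represented is to draw a fixed number of samples per class with a fixed random seed. The sketch below is an illustration under those assumptions; the function name `stratified_subsample` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, per_class, seed=0):
    """Pick up to `per_class` sample indices for each label value.

    Returns a sorted list of indices; the fixed seed makes the reduced
    set reproducible so it can be documented in the report.
    """
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    rng = random.Random(seed)
    keep = []
    for label in sorted(by_label):
        indices = by_label[label]
        keep.extend(rng.sample(indices, min(per_class, len(indices))))
    return sorted(keep)
```

Recording the values of `per_class` and `seed` alongside the results is one simple way to document how the dataset was reduced.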
