Computer Science

Applications for 2023-2024 are now closed.

Background investigation on the use of generative AI for cognitive assistance in Alzheimer's

Supervisors

Michael Witbrock

Vithya Yogarajan

Lynette Tippett

Discipline

School of Computer Science

Project code: SCI065

Project

Alzheimer's patients progressively lose memory functions that can cause behavioural changes that are difficult for them and for those they interact with. In this work, the student will investigate the nature of those losses, and explore what sort of information might be supplied, by a cognitive prosthesis based on an adaptive large language model, to the patient to mitigate the effects of the memory loss. The student will compile a report on the existing relevant literature and any available datasets.

Requirements

Familiarity with coding in Python, the basics of machine learning and AI.

Some experience with PyTorch would be ideal but not necessary.

Can large language models generate Computer-Aided Design code for 3D models?

Supervisors

Michael Witbrock

Trung Nguyen

Discipline

School of Computer Science

Project code: SCI066

Project

Large language models show an astonishing capacity for writing code in various programming languages. Nevertheless, language models are still rarely used to generate 3D models in the form of CAD commands or other programming languages supported by AutoCAD, although some CAD datasets exist. This project aims to use such datasets to investigate how well large language models can generate such code for 3D models given conversational prompts.

Requirements

Familiarity with coding in Python, the basics of machine learning and AI. 

Some experience with PyTorch would be ideal but not necessary.

Prototype of LLM-based memory prosthesis

Supervisors

Michael Witbrock

Vithya Yogarajan

Lynette Tippett

Discipline

School of Computer Science

Project code: SCI067

Project

In this work, the student will use LLM APIs to prototype a memory prosthesis that notices when short- to medium-term memory is necessary for a conversational task and provides a prompt for the use of that information. The task should be designed to resemble one with which people with progressive memory loss might have difficulty, but it does not need to be specific to this population.

Requirements

Familiarity with coding in Python, the basics of machine learning and AI.

Some experience with PyTorch would be ideal but not necessary.

Study of Adaptive Power-of-Two Quantization for Fixed-point Representation in Neural Network Models

Supervisor

Bruce Sham

Discipline

School of Computer Science

Project code: SCI068

Project

In this project, the student will study adaptive power-of-two quantization for fixed-point representation in neural network models such as MobileNet and ResNet. The objective is to find the best trade-off between computational complexity and accuracy.
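As a rough illustration of the basic idea (not the adaptive scheme the project will study), the sketch below rounds each weight to the nearest signed power of two, so that a multiplication by the weight can be replaced by a bit shift in fixed-point hardware. The function name `quantize_pow2` and the exponent range are assumptions made for this example only.

```python
import math

def quantize_pow2(w, min_exp=-8, max_exp=0):
    """Round a weight to the nearest signed power of two (nearest in the log2 domain).

    min_exp and max_exp are illustrative bounds on the exponent; note that very
    small non-zero weights are clamped to 2**min_exp rather than set to zero.
    """
    if w == 0.0:
        return 0.0
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))
    return math.copysign(2.0 ** exp, w)

weights = [0.30, -0.07, 0.9, 0.0]
quantized = [quantize_pow2(w) for w in weights]
print(quantized)
```

The accuracy side of the trade-off comes from the rounding error introduced here (for example, 0.30 becomes 0.25); an adaptive scheme would presumably choose the exponent range or rounding per layer or per weight group, which this sketch does not attempt.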

Requirement

Students who are interested in this project should be familiar with C or Python programming.

Integrated AI Training and Inference for the Edge using U250

Supervisor

Bruce Sham

Discipline

School of Computer Science

Project code: SCI070

Project

Training and inference demand massive computing resources and typically rely on expensive, power-hungry GPUs. In this project, the student is asked to develop a unique, integrated, and efficient deep learning training and inference solution for the edge using the Xilinx U250. The objective of this project is to deploy an integrated training-inference solution that retrains the model in real time, in parallel with online inference on the same device.

Requirements

Students who are interested in this project should be familiar with C/C++ and interested in learning new FPGA tools and the related development platform.

Development of Design Automation Software for the Design of Integrated Circuits

Supervisor

Bruce Sham

Discipline

School of Computer Science

Project code: SCI071

Project

In this project, the student will develop electronic design automation (EDA) software for the design of integrated circuits. Before EDA, integrated circuits were designed and laid out by hand. Today, most of those design processes are carried out automatically by EDA software.

Requirements

Students who are interested in this project should be familiar with C/C++ and have strong background knowledge in algorithms and data structures (heaps, doubly linked lists, quicksort, trees and graphs), but NO circuit knowledge is required.

An Ethical Computing Toolkit for New Zealand

Supervisors

Gillian Dobbie

Vithya Yogarajan

Discipline

School of Computer Science

Project code: SCI072

Project

Overall aim
The overall aim of this project is to compare and evaluate different ethical computing toolkits based on their suitability for addressing the ethical challenges in machine learning in the context of New Zealand.

Project description
This project will involve a comprehensive analysis and evaluation of ethical computing toolkits to determine their suitability for New Zealand. The student will conduct a thorough literature review to identify relevant frameworks and methodologies used in the field of ethical computing. Based on the literature review, a set of promising toolkits will be selected for further evaluation.

The student will develop evaluation criteria specifically tailored to New Zealand's needs, considering data sovereignty, data gathering, model transparency, explainability, fairness, accountability, and robustness, as well as potential future risks.

To assess the real-world effectiveness of the selected toolkits, a case study will be conducted using publicly available New Zealand data. The case study will involve applying the toolkits to the dataset, evaluating their performance in addressing ethical concerns, and analyzing their limitations and strengths.

Based on the findings, the student will provide recommendations and guidelines for selecting and implementing ethical computing toolkits in New Zealand, taking into account the unique requirements and challenges of the country.

Data
The student will utilise publicly available New Zealand data for the case study. The specific datasets will be determined during the project, considering their relevance to ethical computing and machine learning.

Desired output
The desired output of this project includes:

  • Literature review summarising the current state of ethical computing toolkits and their applicability to New Zealand
  • Comparative analysis report evaluating the selected toolkits based on their compatibility with New Zealand's requirements, focusing on data sovereignty, data gathering, and potential future risks
  • Case study report showcasing the application of the selected toolkits using publicly available New Zealand data, including an analysis of their performance and limitations
  • Recommendations and guidelines for selecting and implementing ethical computing toolkits in New Zealand, considering the unique context of the region

Preferred skills

We are looking for a highly motivated student with the following preferred skills:

  • Familiarity with ethical considerations in computing and machine learning
  • Understanding of machine learning approaches
  • Data analysis and Python programming skills

Predicting influenza disease burden in NZ

Supervisors

Gillian Dobbie

Steffen Albrecht

Discipline

School of Computer Science

Project code: SCI073

Figure. Influenza virus isolates reported in New Zealand all ages, week 1 to week 23 2022

Project

Background

With the opening of international borders, influenza virus circulation has recurred in New Zealand (see Figure). Anecdotal evidence from practising clinicians in several hospitals in New Zealand is that influenza and influenza-related hospitalisations have been more complicated in 2022, with significant proportions of children presenting with complicated lower respiratory infections, and with non-respiratory presentations and complications.

    Research question

    Using machine learning, we want to answer the question, “What is the epidemiology and clinical spectrum of influenza hospitalisations among children in New Zealand in 2022?”

      Data availability

      We have a number of medical data sources available to us that will help us address the question.

      Desired outputs

      Machine learning models that determine the factors that influence influenza hospitalisations amongst children in NZ.

        Preferred skills

        We are looking for a highly motivated student with the following skills:

        • Understanding of machine learning approaches
        • Data analysis and Python programming skills

        VR.net: Curating a large-scale real-world dataset for VR motion sickness research

        Supervisor

        Elliott Wen

        Discipline

        School of Computer Science

        Project code: SCI074

        Project

        VR gaming has gained widespread popularity in recent years, with the annual market revenue projected to reach $87 billion by 2023. However, up to 40% of users suffer from VR motion sickness with symptoms like fatigue, disorientation, and nausea.

        Recently, researchers have proposed using Machine Learning (ML) approaches to identify motion sickness risk factors in VR content. However, many of these studies report the need for more training datasets. These researchers demand a large-scale dataset containing many hours of VR gameplay clips and the corresponding risk factor labels. The video clips should also come from diverse real-world game genres to ensure generalisation. Building such a dataset is challenging since manual labeling would require an infeasible amount of time.

        In this project, your task is to build an automatic data collection tool that extracts labeled data from real-world VR games. The data may include gameplay video, 3D object/camera movement, and VR headset/joystick movement. You will use a software engineering technique named code instrumentation, where a piece of custom code is dynamically injected into low-level system graphics stacks to intercept any valuable data.

        You will test the data collection tool by playing various real-world VR games (e.g., Beat Saber or Epic Roller Coaster). We will provide two sets of VR goggles and game copies. You can enjoy the games as long as you like. After the gameplay, you are also encouraged to show the utility of the collected dataset by building a simple machine-learning model. For instance, given a one-second gameplay video, predict whether the camera is doing multi-axis rotation.
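The example task above needs a label for each clip. As a minimal sketch, assuming the collection tool logs the camera orientation as (yaw, pitch, roll) in degrees once per frame, a one-second window could be labeled as follows. The function name, the data layout, and the 5-degree threshold are all assumptions for illustration, and angle wrap-around at 360 degrees is ignored.

```python
def is_multi_axis_rotation(angles, min_degrees=5.0):
    """Label a window of frames as multi-axis rotation.

    angles: list of (yaw, pitch, roll) tuples in degrees, one per frame.
    An axis counts as rotating when its range over the window reaches
    min_degrees; the window is multi-axis when two or more axes rotate.
    """
    rotating_axes = 0
    for axis in range(3):
        values = [frame[axis] for frame in angles]
        if max(values) - min(values) >= min_degrees:
            rotating_axes += 1
    return rotating_axes >= 2

# Hypothetical windows: the first turns only in yaw, the second in yaw and pitch
yaw_only = [(0.0, 0.0, 0.0), (10.0, 0.5, 0.0), (20.0, 1.0, 0.0)]
yaw_and_pitch = [(0.0, 0.0, 0.0), (10.0, 6.0, 0.0), (20.0, 12.0, 0.0)]
print(is_multi_axis_rotation(yaw_only), is_multi_axis_rotation(yaw_and_pitch))
```

Labels produced this way could then be paired with the matching video frames to train the simple model mentioned above.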

          Requirements

          1) Experience with Python and C++ Programming

          2) Solid knowledge of operating systems

          3) Comfort with evaluating VR programs

          Solving Scalability challenges for a Green Computing Hub

          Supervisor

          Gerald Weber

          Discipline

          School of Computer Science

          Project code: SCI075

          Project

          This project explores how scalability challenges can be solved in a way that fits the mission of a green computing hub.

          Requirements

          Required skills include database and web application experience and an openness to exploring and evaluating new cloud computing solutions.

          AWS deployment of a mobile app for type 2 diabetes patients

          Supervisor

          Jing Sun

          Discipline

          School of Computer Science

          Project code: SCI076

          Project

          This project focuses on deploying an existing software implementation onto AWS for showcase demonstrations. The current mobile and server-side applications were developed by two previous groups of students. During the deployment process, extensions to the current code will be required. The project also requires an evaluation of the outcome.

          Requirements

          The required skill set includes mobile application development and AWS deployment experience. The duration of the project is 10 weeks.

          Solving Scalability challenges for a Green Computing Hub

          Supervisor

          Thomas Lacombe

          Discipline

          School of Computer Science

          Project code: SCI077

          Project

          In this project, you will develop comprehensive teaching materials to educate students on ethical considerations when developing and using AI tools (e.g., ChatGPT) and models.

          Your task is to create a curriculum that introduces students to the potential misuse of AI tools and models for malevolent and/or unethical purposes. The teaching materials should cover key ethical principles, explore real-world case studies, and provide practical guidelines and strategies to prevent and mitigate misuse. Deliverables may include slides which could be used in lectures, as well as demonstrations in the form of a website, Jupyter Notebooks, or any other innovative medium you think would be relevant to the task.

          You will work in collaboration with members of the Ethical Computing project within the School of Computer Science.

          Your work will empower future software developers to proactively consider the ethical implications of their work and contribute to the responsible and accountable development of AI technologies.

          It will also empower AI users with a better understanding of the ethical implications of using AI technologies.

          Theoretical foundations of machine learning

          Supervisors

          Jesse Goodman (Statistics)
          Pedram Hekmati (Mathematics)
          Simone Linz (Computer Science)

          Discipline

          School of Computer Science

          Mathematics

          Statistics

          Project code: SCI078

          Project

          Machine learning and, more broadly, artificial intelligence will continue to change everyone’s life profoundly. Mathematics, statistics, and computer science play an important role in advancing machine learning algorithms (e.g., to make algorithms more reliable), and theoretical research into machine learning can advance our understanding of why certain methods are successful or not.

          In this project, you will study some theoretical foundations of machine learning algorithms and how techniques from probability theory, geometry, and graph theory can be leveraged to aid the design of machine learning algorithms.

          The exact direction of the project will be decided at its start and depend on the interests and experience of the summer student.

          Improve functionality and performance of assignment automarker used in algorithms classes

          Supervisor

          Michael J. Dinneen

          Discipline

          School of Computer Science

          Project code: SCI079

          Project

          We need to add some new features to one of the automated marking platforms that we use in computer science algorithms courses.

          We also want to fix a few operational issues and improve performance in test/exam environments by moving from sequential to parallel processing.
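To make the sequential-to-parallel idea concrete, here is a minimal sketch in Python (chosen only for illustration; the platform itself is described as using Java and PHP). The `mark_submission` function and the submission format are hypothetical stand-ins for whatever the real automarker does per submission.

```python
from concurrent.futures import ThreadPoolExecutor

def mark_submission(submission):
    """Hypothetical marker: 1 mark if the program output matches the expected output."""
    name, output, expected = submission
    return name, 1 if output.strip() == expected.strip() else 0

def mark_all(submissions, workers=4):
    """Mark submissions concurrently instead of one after another."""
    # pool.map returns results in input order, so they line up with submissions
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(mark_submission, submissions))

submissions = [("alice", "42\n", "42"), ("bob", "41\n", "42"), ("carol", " 42", "42")]
results = mark_all(submissions)
print(results)
```

A thread pool is a reasonable fit when each marking job mostly waits on an external process, such as a sandboxed container; CPU-bound marking would more likely call for separate processes.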

          Requirements

          You will need to be fluent (or learn quickly) in Linux, Docker, Java, PHP and possibly JavaScript.

          Test/exam generator for real data/computer scientists

          Supervisor

          Michael J. Dinneen

          Discipline

          School of Computer Science

          Project code: SCI080

          Project

          Using an expandable database of algorithm questions, we want to generate either a fixed exam or a set of individual exams. Statistics from marking scripts should support ranking questions as good or bad.

          Requirements

          Knowledge of XML/SQL and LaTeX, with Python (or possibly Java) as the development language. Command-line tools are desired initially, but a GUI could be added if time permits.

          Pose Detection using Machine Learning

          Supervisor

          Sathiamoorthy Manoharan

          Discipline

          School of Computer Science

          Project code: SCI081

          Project

          Many people perform physical exercises, such as Yoga, at home. It is crucial to execute these exercises correctly. Incorrect execution can render the exercises ineffective and may even result in bodily harm. Recognizing that not everyone has a personal trainer, we aim in this project to develop a machine learning model that can automatically determine whether people are performing exercises correctly by analysing their poses during the exercise. We will use Yoga as an example exercise in this project.

          Requirements and skills gained

          Proficiency in Python programming is essential for the project. While working on this project, the student is expected to acquire knowledge in building neural networks and using existing deep learning networks, such as vision/video transformers and others.

          Scanning Teleform Sheets

          Supervisors

          Sathiamoorthy Manoharan

          Patrice Delmas

          Discipline

          School of Computer Science

          Project code: SCI082

          Project

          Our university uses Scantron Teleform sheets for examinations and tests where students answer multiple choice questions by shading a bubble in their Teleform sheet. The sheets are subsequently scanned by a machine and the answer choices are processed for grading.

          Unfortunately, the scanning centre becomes a hotspot during examinations and this becomes a bottleneck in processing and releasing grades.
          In this project, you would work on recognizing the answer choices in scanned PDF Teleform sheets (of the format that UoA uses) and provide a text file that encodes the answers in each PDF Teleform. A command-line application (CLI) is all that is required.
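The core recognition step can be sketched as measuring how dark each bubble region is. The example below is in Python purely for illustration (the project itself requires C#) and assumes the page has already been converted to a grayscale pixel grid and that the bubble coordinates are known; the function names, the box format, and the thresholds are all hypothetical.

```python
def fill_ratio(image, box, dark_below=128):
    """Fraction of pixels inside box = (top, left, bottom, right) darker than the threshold."""
    top, left, bottom, right = box
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    return sum(p < dark_below for p in pixels) / len(pixels)

def read_answer(image, bubbles, min_fill=0.5):
    """Return the label of the most-filled bubble, or None if none is shaded enough."""
    ratios = {label: fill_ratio(image, box) for label, box in bubbles.items()}
    label = max(ratios, key=ratios.get)
    return label if ratios[label] >= min_fill else None

# Tiny synthetic 4x8 "scan": 255 is white paper, 0 is pencil
image = [[255] * 8 for _ in range(4)]
for r in range(1, 3):
    for c in range(5, 7):
        image[r][c] = 0
bubbles = {"A": (1, 1, 3, 3), "B": (1, 5, 3, 7)}
answer = read_answer(image, bubbles)
print(answer)
```

A real implementation would also need to handle page alignment, skew, and students who shade more than one bubble, none of which this sketch covers.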

          A similar project catering for a Teleform sheet of a different type exists on GitHub: https://github.com/floft/freetron#readme You may wish to use this as the starting point.

          A C# implementation will be required.

          Scaling Up Sparse Neural Networks with Randomised Hashing

          Supervisor

          Ninh Pham

          Discipline

          School of Computer Science

          Project code: SCI083

          Project

          Sparse neural networks have gained considerable attention due to their potential to reduce computational complexity and energy consumption in machine learning tasks. However, achieving scalability while maintaining high performance remains a challenge. This research proposal aims to investigate the integration of randomised hashing techniques into sparse neural networks to enable effective scaling, improved performance, and increased efficiency.

          Objectives

          1. Develop a novel framework for scaling up sparse neural networks using randomized hashing techniques.
          2. Investigate the impact of randomized hashing on the performance of sparse networks, focusing on accuracy, training convergence, and computational efficiency.
          3. Optimize the training process of scaled-up sparse networks with randomized hashing to achieve competitive performance with dense networks.
          4. Evaluate the trade-offs between performance, sparsity, and computational costs in scaled-up sparse networks with GPUs.
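As background for the objectives above, one common randomised hashing technique is sign random projection (SimHash), a form of locality-sensitive hashing. The sketch below, in plain Python with illustrative names, hashes each neuron's weight vector into a bucket and then looks up only the neurons that share a bucket with the input. It is an assumption that the project would use this particular scheme.

```python
import random

def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random Gaussian hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

def simhash(vec, planes):
    """Signature with one bit per hyperplane, set when the projection is non-negative."""
    return tuple(int(sum(p * v for p, v in zip(plane, vec)) >= 0) for plane in planes)

def build_table(neuron_weights, planes):
    """Map each signature to the indices of the neurons whose weights hash to it."""
    table = {}
    for idx, w in enumerate(neuron_weights):
        table.setdefault(simhash(w, planes), []).append(idx)
    return table

def active_neurons(x, table, planes):
    """Neurons in the same bucket as input x; only these would be evaluated."""
    return table.get(simhash(x, planes), [])

planes = make_hyperplanes(dim=3, bits=8)
weights = [[1.0, 2.0, -1.0], [-1.0, -2.0, 1.0]]  # neuron 1 points the opposite way
table = build_table(weights, planes)
active = active_neurons([2.0, 4.0, -2.0], table, planes)
print(active)
```

Vectors separated by a small angle tend to share a signature, so the lookup returns neurons whose weights are likely to give a large dot product with the input. Using more bits makes buckets more selective, which is one of the sparsity versus accuracy trade-offs the objectives mention.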

          Skills

          TensorFlow, PyTorch, CUDA-GPU, locality-sensitive hashing, random projection

          Reference: Paper with Code
          1. https://github.com/zahraatashgahi/CTRE
          2. https://github.com/rdspring1/LSH_DeepLearning

          Characterisation and Analysis of Reddit Financial Communities

          Supervisor

          Aniket Mahanti

          Discipline

          School of Computer Science

          Project code: SCI084

          Project

          The surge in prices of meme stocks such as GameStop and AMC, initiated by retail investors in the WallStreetBets Reddit community, has highlighted the power of social media. This project will characterize the usage of the financial subreddit by using web scraping techniques. Basic characteristics such as the number of posts, comments, and users will be measured.

          We will also study a variety of features such as the density of author posts and the timeline of posts and comments over the subreddit's period of existence. We will also analyze the stocks that have been discussed by the community and the impact of that discussion on stock prices. Semantic analysis, NLP, and sentiment analysis may also be required. Data needs to be scraped from Reddit.
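The basic characteristics described above reduce to simple counts once the data has been scraped. A minimal sketch follows, assuming each scraped post is a record with `author`, `day`, and `num_comments` fields; these field names and the sample records are invented for illustration.

```python
from collections import Counter

# Hypothetical scraped records; real data would come from the subreddit
posts = [
    {"author": "u1", "day": "2021-01-27", "num_comments": 10},
    {"author": "u2", "day": "2021-01-27", "num_comments": 3},
    {"author": "u1", "day": "2021-01-28", "num_comments": 7},
]

total_posts = len(posts)
total_comments = sum(p["num_comments"] for p in posts)
unique_authors = len({p["author"] for p in posts})

# Posts per author (how concentrated activity is) and posts per day (the timeline)
posts_per_author = Counter(p["author"] for p in posts)
posts_per_day = Counter(p["day"] for p in posts)

print(total_posts, total_comments, unique_authors)
print(posts_per_author.most_common(1), sorted(posts_per_day.items()))
```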

          Requirements

          Abilities in Python or similar programming, basic statistical analysis, web scraping, natural language processing

          Characterising NFT Marketplaces

          Supervisor

          Aniket Mahanti

          Discipline

          School of Computer Science

          Project code: SCI085

          Project

          The aim of this measurement and characterization study is to understand the market and user dynamics in the NFT marketplace. The data will be collected from one of the larger marketplaces such as OpenSea or from niche marketplaces focusing on art or a specific collectible. The objective is to perform a longitudinal study with data spanning the past 4 years, since NFTs gained traction in the mainstream.

          You will perform a comprehensive characterization study, covering aspects of the marketplace such as user behaviour, popularity, market dynamics, economic factors, and network science. Not much academic research has been conducted on this topic, so there is a lot of scope for interesting analysis.

          Requirements

          Abilities in Python or similar programming, basic statistical analysis, web scraping, visualization, network science, ML.

          Characterizing Decentralized Social Blockchain (DeSo) and Decentralized Finance (DeFi) sites

          Supervisor

          Aniket Mahanti

          Discipline

          School of Computer Science

          Project code: SCI086

          Project

          The aim of this measurement and characterization study is to understand the emerging DeSo applications. One such application is BitClout, a crypto social network similar to Twitter. The data will be collected from the DeSo platforms to better understand the design, implementation, usage, and dynamics of the platforms. Similar analysis can then be applied to DeFi systems to better understand their salient features.

          Requirements

          Strong programming skills, a good background in statistics, background in Blockchain. Network Science (Social Networks/Graph Structures) is a plus. Excellent writing skills.

          Cybersecurity using machine learning in IoT and digital twins

          Supervisor

          Aniket Mahanti

          Discipline

          School of Computer Science

          Project code: SCI087

          Project

          We will study the use of machine learning algorithms such as SVM, decision trees, and random forests in detecting denial-of-service attacks in IoT networks. This project will focus on the software-defined networking paradigm with a central viewpoint. The analysis can be performed via a simulator such as Mininet or on existing freely available datasets. The objective is to identify features that can classify the attacks and to compare the performance of various machine learning techniques. Deep learning can also be used for detecting attacks, and its performance can be analyzed.

          Requirements

          Good programming skills, TCP/IP, Mininet or similar simulator, machine learning

          Machine learning for social good in environment: Developing an advanced warning system for predicting extreme weather events

          Supervisors

          Yun Sing Koh

          Gillian Dobbie

          Daniel Wilson

          Centre of Machine Learning for Social Good

          https://ml4sg.auckland.ac.nz/

          Discipline

          School of Computer Science

          Project code: SCI088

          Project

          Floods, the most prevalent of natural disasters, impact over 250 million people annually, leading to economic damages of approximately $10 billion. Our project aims to address this pressing issue by developing an advanced warning system that leverages multi-modality generative machine learning.

          By investigating the potential of extreme weather events, particularly flooding, and integrating valuable data on green, grey, and blue infrastructure, we strive to equip policymakers with the necessary tools to make informed decisions. Our ultimate goal is to empower organizations and individuals, enabling them to take proactive measures to mitigate damage and save lives.

          Be a part of the Centre of Machine Learning for Social Good: Join our dynamic team. We are committed to advancing fundamental knowledge in machine learning and data analytics while tackling the most challenging health, environmental, and societal problems of our time.

          As the first center in Aotearoa dedicated to utilizing machine learning for social good, we collaborate closely with domain experts, leveraging their expertise as a catalyst to address high-impact societal issues.

          By participating in this project, you will contribute to the development of a prototype for an open-sourced early event warning system. This system will revolutionize the way extreme weather events, such as floods, are predicted and managed. Your role will involve engaging in meaningful discussions with our collaborators to refine the system's design and functionality, ensuring its effectiveness and usability.

          Project Output

          Prototype of an Open-Sourced Early Event Warning System

          Requirements

          To excel in this project, proficiency in Python programming, including Keras or PyTorch, and a strong understanding of machine learning fundamentals are essential. Familiarity with large language models is considered advantageous. We are seeking individuals who possess a deep passion for creating impactful outcomes that positively influence society.

          Machine Learning for Social Good in Health: Automated Machine Learning for early prediction of acute pancreatitis severity

          Supervisors

          Gillian Dobbie

          Yun Sing Koh

          Daniel Wilson

          Centre of Machine Learning for Social Good

          https://ml4sg.auckland.ac.nz/

          Discipline

          School of Computer Science

          Project code: SCI089

          Project

          Acute pancreatitis is a complex condition with varying degrees of severity, and accurate prediction plays a crucial role in guiding timely interventions and improving patient outcomes. 

          Our project focuses on developing an automated machine learning system capable of predicting the severity of acute pancreatitis at an early stage. Early identification of severe cases can lead to timely interventions and improved patient management. Previous approaches use traditional neural networks; going beyond these, we will investigate current state-of-the-art machine learning approaches.

          Be a part of the Centre of Machine Learning for Social Good: Join our dynamic team. We are committed to advancing fundamental knowledge in machine learning and data analytics while tackling the most challenging health, environmental, and societal problems of our time. As the first center in Aotearoa dedicated to utilizing machine learning for social good, we collaborate closely with domain experts, leveraging their expertise as a catalyst to address high-impact societal issues.

          Output

          The automated prediction system has the potential to enable healthcare professionals to intervene early, allocate resources effectively, and provide personalized treatment plans to acute pancreatitis patients.

          Requirements

          • Strong programming skills, particularly in Python, and familiarity with machine learning libraries (e.g., scikit-learn, TensorFlow)
          • Basic understanding of machine learning concepts and algorithms
          • Ability to work independently and collaboratively in a research team
          • Attention to detail and analytical mindset
          • Passion for improving healthcare outcomes through technology

          Skills developed

          By participating in this project, you will have the opportunity to work alongside experts in the field, gain valuable research experience, and contribute to advancing the field of healthcare analytics. This project combines the power of machine learning algorithms, medical data, and clinical expertise to create a robust and efficient prediction model.

          Machine Learning for social good in New Zealand: Current and future

          Supervisors

          Yun Sing Koh

          Gillian Dobbie

          Daniel Wilson

          Centre of Machine Learning for Social Good

          https://ml4sg.auckland.ac.nz/

          Discipline

          School of Computer Science

          Project code: SCI090

          Project

          This project aims to explore, compare, contrast and design engagement approaches for transdisciplinary social good projects specifically tailored to the unique context of New Zealand. This project offers a valuable opportunity to delve into the field of ML for Social Good, culminating in a comprehensive study on current and future approaches for utilizing machine learning to address societal challenges.

          Our project focuses on examining how machine learning can be effectively designed and applied to address pressing social issues in New Zealand. We will explore existing machine learning for social good approaches, identify areas where tailored solutions can have the most impact, and consider how the tools built can be sustained beyond the life of the projects. The culmination of this project will be a comprehensive study that outlines current approaches, evaluates their efficacy, and provides insights into future directions.

          New Zealand Context

          By working on ML for Social Good within the unique context of New Zealand, you will have the opportunity to understand the specific challenges and needs of local communities. Through collaborations with experts and stakeholders, you will contribute to designing machine learning solutions that are relevant, culturally sensitive, and effective in addressing social issues specific to New Zealand.

          Output

          Study on ML for Social Good Approaches: The ultimate outcome of this project will be a comprehensive study that investigates and documents the current landscape of ML for Social Good approaches, with a specific focus on New Zealand. The study will delve into the effectiveness, challenges, and future prospects of these processes and approaches, providing valuable insights for researchers, policymakers, and practitioners.

          Be a part of the Centre of Machine Learning for Social Good. We are committed to advancing fundamental knowledge in machine learning and data analytics while tackling the most challenging health, environmental, and societal problems of our time. As the first center in Aotearoa dedicated to utilizing machine learning for social good, we collaborate closely with domain experts, leveraging their expertise as a catalyst to address high-impact societal issues.

          Requirements

          • Strong research and analytical skills
          • Familiarity with machine learning concepts and algorithms
          • Proficiency in data analysis and report writing
          • Ability to work independently and collaborate in a research team
          • Passion for utilizing technology for social good.

          Using AI to predict behaviour in team sports

          Supervisor

          Patrice Delmas

          Discipline

          School of Computer Science

          Project code: SCI091

          Project

          Using AI to analyse team player behaviour during games, based on video recordings from sports channels and data analytics (such as body sensors), has the potential to improve players' and teams' performance, along with other applications for viewers.

            The student will first provide a brief overview of the state of the art in both behavioural analysis in team sports and existing annotated datasets.

              Leveraging the above and using our professional sports and broadcasting partners' expertise and datasets, the student will trial the best existing machine learning techniques for individual behaviour tracking and, depending on progress, will attempt to link this behaviour to recorded game events (as provided by our sports team partner) such as fouls, scoring, and injuries. The sports studied will be one or more of the following, based on available datasets and task complexity: basketball, netball, rugby league, soccer, rugby union.

                Requirements

                Strong motivation and a willingness to learn. Some Python programming capabilities and the ability to pick up existing tools and knowledge. Computer vision knowledge is not essential, although the candidate will need to pick up relevant skills in computer vision and machine learning along the way.

                Full-body interaction with AI in a Virtual Reality installation

                Supervisors

                A/P Danielle Lottridge

                Dr Becca Weber

                Discipline

                School of Computer Science

                Dance Studies

                Project code: SCI092

                Project

                This project explores the future of interacting with AI – full-body interaction with AI agents within an immersive environment where users experience responsive audio-visual feedback based on real-time body tracking. The student will integrate AI agents into a Unity code base for a participative installation. The research goal is to understand how the AI agents and effects impact sensory perception, embodiment, and subjective experiences.

                Over the summer, we will work with dance experts to iteratively develop the AI agents and interaction. This summer research project will contribute to an installation that will be made public. It is related to a larger project that is likely to lead to topics for master's and PhD studies and to collaborations with researchers at universities abroad.

                Augmented reality stroke rehabilitation game from te whare tapa whā

                Supervisor

                A/P Danielle Lottridge

                Discipline

                School of Computer Science

                Project code: SCI093

                Project

                We reconceptualise healthtech for elders from the Māori model 'te whare tapa whā' (the four cornerstones of health; Durie, 1994): it builds on iwi involvement and concurrently supports physical, mental, whānau (family) and spiritual health with interactive activities and experiences. Our team includes a partnership with a Māori augmented reality development company, ARA, with whom we iteratively codesigned the first version of the software with Māori communities. Together, we will interact directly with kaumātua (elders) to better understand their experiences and motivation to engage in their rehabilitation. We will collaborate with researchers at Auckland Hospital to prepare a feasibility study of the prototype.

                Automatic assessment of accessibility, visual design, and interactivity of websites

                Supervisor

                A/P Danielle Lottridge

                Discipline

                School of Computer Science

                Project code: SCI094

                Project

                Web technologies are foundational and continue to be widespread, with front-end development skills in high demand. This project pursues the automatic assessment of web pages dynamically, by executing them as they would run within a typical browser, using the Selenium WebDriver framework. In this project you will write custom code to assess and interact with web components through browser-specific drivers, expanding functionality to assess visual Gestalt principles and interactivity programmatically. Selenium enables the remote control of a browser and mimics user actions in the browser, including button clicks, drag and drop, selection, checkboxes, key presses, taps, and scrolling. The tool is intended for educational use, supporting increased understanding of accessibility guidelines and visual design skills.
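As a rough illustration of the kind of check such a tool might automate, the sketch below uses Selenium's Python bindings to read each element's computed text and background colours and compares them against the WCAG 2.x minimum contrast ratio of 4.5:1 for normal text. The helper names, the CSS selector, and the choice of colour contrast as the check are assumptions made for illustration and are not taken from the project's code base. The sketch also skips elements whose own background is transparent rather than resolving the ancestor's colour.

```python
import re


def parse_css_colour(value):
    """Parse 'rgb(r, g, b)' or 'rgba(r, g, b, a)' into (r, g, b, alpha)."""
    numbers = [float(n) for n in re.findall(r"\d*\.?\d+", value)]
    if len(numbers) < 3:
        raise ValueError(f"unrecognised colour: {value!r}")
    alpha = numbers[3] if len(numbers) > 3 else 1.0
    return int(numbers[0]), int(numbers[1]), int(numbers[2]), alpha


def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an sRGB colour with 0-255 channels."""
    def linearise(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearise(ch) for ch in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio between two (r, g, b) colours, from 1 to 21."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def low_contrast_elements(driver, selector="p, a, button", minimum=4.5):
    """Return (text snippet, ratio) pairs for elements below the threshold.

    `driver` is a Selenium WebDriver that has already loaded a page.
    """
    from selenium.webdriver.common.by import By  # third-party: pip install selenium

    failures = []
    for element in driver.find_elements(By.CSS_SELECTOR, selector):
        *fg, _ = parse_css_colour(element.value_of_css_property("color"))
        *bg, bg_alpha = parse_css_colour(
            element.value_of_css_property("background-color")
        )
        if bg_alpha == 0:
            continue  # transparent: the visible background belongs to an ancestor
        ratio = contrast_ratio(fg, bg)
        if ratio < minimum:
            failures.append((element.text[:40], round(ratio, 2)))
    return failures
```

With a browser driver installed, the last function would be called on a driver such as `webdriver.Chrome()` after `driver.get(url)`. The colour helpers are pure functions and can be tested without a browser.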

                AI for Climate Change: Automated detection of urchin barren from underwater imagery

                Supervisor

                Katerina Taskova

                Co-supervisors

                Patrice Delmas, Arie Spyksma (Institute of Marine Science)

                Discipline

                School of Computer Science

                Project code: SCI095

                Project

                Kelp forests are among the most productive ecosystems on Earth, but climate-driven impacts are causing widespread kelp habitat loss. For example, the climate-driven proliferation of the longspined sea urchin is one of the most urgent threats to kelp forests in south-eastern Australia and north-eastern New Zealand.

                Assessing this threat requires collection and analysis (typically manually) of underwater imagery spanning tens to hundreds of kilometres of reef. The high contrast of sea urchins on barren reef makes this an ideal candidate for modern computer vision solutions based on machine learning (ML) algorithms to dramatically improve annotation and analysis.

                Using existing image-based monitoring data you will develop and test ML algorithms to detect the presence and the extent of urchin barren expansion in Australia/New Zealand.

                Recommended skills

                This project is suitable for students with basic skills in maths, statistics, machine learning and image analysis, and intermediate programming skills in Python. Familiarity with convolutional neural networks and programming experience in PyTorch will be beneficial, but are not necessary and can be learned while working on the project.

                Machine learning for mass spectrometry data analysis

                Supervisor

                Katerina Taskova

                Co-supervisors

                Patrice Delmas, Arie Spyksma (Institute of Marine Science)

                Discipline

                School of Computer Science

                Project code: SCI096

                Project

                As new mass spectrometry (MS) technologies are rapidly developed to cope with the complexity of biological samples emerging in environmental and biomedical sciences, standard tools for MS analysis fail to exploit the full potential of the data these technologies offer. For example, top-down tandem MS has been extremely useful in studying metal-protein interactions and is relevant to the development of anti-cancer metal-based drugs. Manual identification of the binding sites of metal-based drugs is extremely difficult and prone to error, and often only the most intense peaks get assigned. Nevertheless, it is the common approach in the absence of effective automated methods for the given problem.

                New computational methods are therefore needed, and machine learning (ML) algorithms would be particularly valuable to cope with the complexity, noise, and volume of MS data. More specifically, the overall problem resembles challenges in ML for time series analysis. You will investigate the use of time series data analysis techniques to match and identify specific peak patterns in MS data.

                Recommended skills

                Basic knowledge of machine learning, good knowledge of Python, openness to learn about mass spectrometry data and to collaborate with chemists.

                Reliable machine learning for predator identification

                Supervisor

                Katerina Taskova

                Co-supervisors

                Patrice Delmas, Arie Spyksma (Institute of Marine Science)

                Discipline

                School of Computer Science

                Project code: SCI097

                Project

                Large datasets are now routinely collected from digital cameras and other sensing technologies that need to be integrated and analyzed in an efficient and intelligent way in order to address biosecurity problems (such as predator identification) with success at operational scale. Recent advances in low-cost sensing technology, computer vision and deep learning methodology have enabled new opportunities for developing zero tolerance technology for predator monitoring and trapping in large forested and complex environments.

                While deep learning models have seen enormous success in computer vision due to their high expressiveness compared to traditional shallow models, they don't have well-motivated methods for accurately estimating their confidence in a prediction. They can be "overconfident" for images that humans would clearly rule out as not relevant to the prediction task or as not even including the object of interest.

                This project will investigate methods for quantifying uncertainty in deep learning model predictions. The goal is to develop actionable deep learning models that are safe to deploy in real-world applications that need reliable detection of predators in image-based sensing data.
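One common baseline for this kind of uncertainty estimate, shown here only as a minimal sketch, is to average the class probabilities of several independently trained models and use the entropy of the average to decide when to abstain. The function names, the class labels, the probabilities, and the entropy threshold are all hypothetical; the project itself may investigate quite different methods.

```python
import math


def mean_prediction(member_probs):
    """Average the class-probability vectors produced by each ensemble member."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[k] for p in member_probs) / n_members for k in range(n_classes)]


def predictive_entropy(member_probs):
    """Entropy (in nats) of the averaged prediction; higher means less certain."""
    mean = mean_prediction(member_probs)
    return -sum(p * math.log(p) for p in mean if p > 0)


def classify_or_abstain(member_probs, labels, max_entropy=0.5):
    """Return the most likely label, or 'uncertain' when entropy is too high."""
    if predictive_entropy(member_probs) > max_entropy:
        return "uncertain"
    mean = mean_prediction(member_probs)
    return labels[mean.index(max(mean))]


labels = ["stoat", "rat", "no predator"]  # illustrative classes only

# Three models that agree on the image, and three that contradict each other.
agreeing = [[0.90, 0.05, 0.05], [0.95, 0.03, 0.02], [0.92, 0.04, 0.04]]
disagreeing = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]

print(classify_or_abstain(agreeing, labels))
print(classify_or_abstain(disagreeing, labels))
```

Each model in the second set is individually confident, yet the ensemble abstains because the members disagree. A single softmax output cannot show that disagreement.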

                Recommended skills

                Basic knowledge of machine learning and good knowledge of Python. An understanding of deep learning networks and programming experience with PyTorch (or TensorFlow or Keras) would be beneficial.

                Auditing Artificial Intelligence with Adversarial Learning

                Supervisor

                Joerg Wicker

                Discipline

                School of Computer Science

                Project code: SCI098

                Project

                We aim to design and develop new methods to attack machine learning models and use the adversarial attacks to define a measure of reliability. Weak model performance, caused by unrepresentative data sets or flaws in the training process, is a common issue in machine learning and leads to misclassification and unfairness. We will develop a framework that identifies adversarial regions in the data space that are prone to make models fail. The framework will not only identify these regions and data, but also produce tools to improve the model and return a score that reflects the model's reliability. This score can be used to certify models without access to the training process and to estimate the applicability of models to specific use cases.

                Recommended skills

                Basic knowledge of machine learning and Python.

                Predicting Persistence of Environmental Pollutants

                Supervisor

                Joerg Wicker

                Discipline

                School of Computer Science

                Project code: SCI099

                Project

                Most chemicals that are currently produced sooner or later end up in the environment, many of them in rivers and other waters. It is essential to know their fate in terms of transformations and persistence. Harmful chemicals that degrade quickly might pose no great threat to the environment; however, persistent toxic compounds can have a lasting negative impact. We will go beyond the prediction of specific biodegradation products as done in state-of-the-art metabolic prediction systems (such as enviPath) and aim to predict reaction rates, that is, how long pollutants and their metabolites persist in the environment. We will develop and train machine learning models that use data on metabolic reactions under certain environmental conditions and aim to predict reaction rates and the half-life of compounds.
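For orientation, a reaction rate and a half-life are two views of the same quantity once a rate law is fixed. The sketch below assumes first-order kinetics, under which a rate constant k gives a half-life of ln 2 / k. That assumption is made here for illustration, and the function names are hypothetical; the project may work with other rate laws.

```python
import math


def half_life(rate_constant):
    """Half-life under first-order decay, in the inverse units of the rate constant."""
    if rate_constant <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / rate_constant


def fraction_remaining(rate_constant, time):
    """Fraction of the compound left after `time` under first-order decay."""
    return math.exp(-rate_constant * time)


k = 0.1  # illustrative rate constant, per day
print(f"half-life: {half_life(k):.2f} days")
print(f"remaining after 30 days: {fraction_remaining(k, 30):.3f}")
```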

                Recommended skills

                Basic knowledge of chemistry, machine learning, and Python.

                Design for Degradability – In-Silico Development of Sustainable Chemicals

                Supervisor

                Joerg Wicker

                Discipline

                School of Computer Science

                Project code: SCI100

                Project

                An important aspect in the development of novel chemicals is their environmental fate, that is, their ability to degrade when released into the environment. The goal is to design compounds that fulfil a certain function, for example as medication or pesticides, and at the same time degrade quickly into harmless metabolites. We will develop new algorithms that achieve this, evaluating them on large databases of existing compounds. We will use standard machine learning models for predicting degradation products and pathways (see enviPath). Our approach will be to start with existing compounds and transform them using adversarial methods and generative models (GANs) so that their degradability increases while their original function is preserved.

                Recommended skills

                Basic knowledge of chemistry, machine learning, and Python.

                Adversarial Time Series

                Supervisor

                Joerg Wicker

                Discipline

                School of Computer Science

                Project code: SCI101

                Project

                Adversarial Machine Learning is a field of Machine Learning that focuses on exploiting model vulnerabilities by making use of obtainable information from the model. Studying a model's weaknesses to adversarial attacks not only helps the researcher understand more about the model itself, but also allows them to defend against malicious attacks and prevent potentially fatal consequences after deployment. Adversarial Machine Learning was first proposed in the image classification domain, where an attack fools a model into misclassifying an image by adding carefully crafted noise that is hardly detectable by a human. Recently, adversarial methods have been introduced that target time series challenges. We will develop and evaluate new adversarial attacks on time series, targeting specific time series challenges beyond forecasting.
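To make the idea of "carefully crafted noise" concrete, the sketch below applies the fast gradient sign method, a standard attack from the image domain, to a toy logistic-regression classifier over a four-point time series. The classifier, its weights, the series, and the perturbation budget are all invented for illustration. This is not one of the new attacks the project sets out to develop.

```python
import math


def predict_proba(x, weights, bias):
    """Probability of class 1 under a logistic-regression classifier."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def sign(value):
    """Return -1, 0 or 1 according to the sign of `value`."""
    return (value > 0) - (value < 0)


def fgsm_attack(x, y, weights, bias, epsilon):
    """Shift every time step by +/- epsilon in the direction that raises the loss."""
    p = predict_proba(x, weights, bias)
    # Gradient of the cross-entropy loss with respect to the input is (p - y) * w.
    gradient = [(p - y) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


weights, bias = [1.0, -1.0, 1.0, -1.0], 0.0  # toy model, not a trained one
series, label = [0.3, 0.1, 0.2, 0.1], 1

adversarial = fgsm_attack(series, label, weights, bias, epsilon=0.1)
print("clean prediction:      ", predict_proba(series, weights, bias) > 0.5)
print("adversarial prediction:", predict_proba(adversarial, weights, bias) > 0.5)
```

No time step moves by more than the budget of 0.1, yet the predicted class flips.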

                Recommended skills

                Basic knowledge of chemistry, machine learning, and Python.

                Do Neural Networks Pay Off?

                Supervisor

                Joerg Wicker

                Discipline

                School of Computer Science

                Project code: SCI102

                Project

                For a while now, we have seen the trend that neural networks are vastly popular, and a large portion of the machine learning research is dedicated to achieving minor gains in accuracy at huge power costs. We hypothesize that, given the same love and care (in terms of nifty pre-processing strategies etc.), traditional machine learning methods have the potential to achieve a similar accuracy while consuming less power. A few questions we are interested in are the following:
                - Using the same pre-processing techniques, can traditional techniques achieve a similar performance? For which type of dataset does it work?
                - There is a trade-off between the accuracy and the number of parameters or layers (as a proxy for the power consumption), and we can expect the last bit of accuracy to be the costliest. Can we find a more sustainable way to stop at a point where we sacrifice a little accuracy to save power?
                - If we compare traditional ML methods to NNs while allowing the same number of parameters, what do we observe?
                - There is a myth that only NNs can perform well on certain types of data (such as images). Can we transfer the special tricks NNs use on this data type to traditional ML methods?

                Recommended skills

                Basic understanding of machine learning and Python.

                Image compression to support image processing

                Supervisor

                Joerg Wicker

                Discipline

                School of Computer Science

                Project code: SCI103

                Project

                This project aims to investigate the potential benefits of using our newly developed image compression technique, based on multivariate trees, to enhance image processing machine learning models. The objective is to explore whether employing this technique can lead to faster and more efficient training of these models, requiring fewer iterations, layers, and parameters. While previous research has shown improvements using Superpixels, our approach offers a substantially more lightweight and simpler solution, reducing storage requirements while maintaining performance. Through this project, the student will conduct empirical evaluations, comparing the performance of models trained on compressed images versus uncompressed ones, and analyze the impact on training time, convergence rate, and model accuracy.

                Recommended skills

                Basic understanding of machine learning and Python.

                A Machine Learning-based news recommender

                Supervisor

                Xinfeng Ye

                Discipline

                School of Computer Science

                Project code: SCI109

                Project

                Service providers aim to offer an excellent experience to their customers by prefetching data likely to be accessed by them and storing it at locations close to the users. In this project, we aim to investigate the use of deep reinforcement learning in predicting customer behaviour and aiding service providers in identifying the data relevant for prefetching.

                Requirements

                Proficiency in Python programming is essential for the project. While working on this project, the student is expected to acquire knowledge in building neural networks, understanding reinforcement learning, and using existing deep learning networks, such as transformers.

                Computing basin boundaries in Julia

                Supervisors

                Claire Postlethwaite
                Matthew Egbert

                Discipline

                School of Computer Science

                Mathematics

                Project code: SCI131

                Project

                This project will involve using the scientific computing language Julia and implementing (previously developed) software to compute basins of attraction of various attractors, such as equilibria and periodic orbits, in systems which exhibit bistability. You will further examine how the boundaries of the basins change as parameters are varied.
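The project itself uses Julia and existing software. Purely to illustrate the concept, the sketch below is written in Python, the language used elsewhere in this listing, and takes a one-dimensional bistable toy system, dx/dt = x - x^3 + h. It integrates that system with forward Euler and bisects on the initial condition to find where trajectories switch from one stable equilibrium to the other. The system, the parameter h, the step sizes, and the function names are all assumptions chosen for illustration.

```python
def final_state(x0, h, dt=0.01, steps=4000):
    """Integrate dx/dt = x - x**3 + h with forward Euler and return the end point."""
    x = x0
    for _ in range(steps):
        x += dt * (x - x**3 + h)
    return x


def basin_boundary(h, lo=-1.5, hi=1.5, iterations=20):
    """Bisect on the initial condition to find where the attractor switches.

    Assumes a start at `lo` reaches the lower equilibrium and a start at `hi`
    reaches the upper one, which holds for the small values of h used here.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if final_state(mid, h) > 0:  # ended near the upper equilibrium
            hi = mid
        else:  # ended near the lower equilibrium
            lo = mid
    return 0.5 * (lo + hi)


for h in (0.0, 0.1, 0.2):
    print(f"h = {h:.1f}: basin boundary near x = {basin_boundary(h):.4f}")
```

In this toy system the boundary coincides with the unstable equilibrium, so it sits at x = 0 for h = 0 and moves to the left as h increases. Higher-dimensional systems and periodic orbits need grid-based or more specialised methods.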

                Prerequisites: MATHS 260, and at least some programming experience.