Models for Quasi-Objective Rating of Digital Samples


Project title	Models for Quasi-Objective Rating of Digital Samples
Project type	Applied research and development
Frascati	Yes
Theme	IT \| Technology
Teaser	How can we build machine learning models that can tell good handwriting from bad handwriting?
Status	Ongoing
Owner
- Academy	Københavns Erhvervsakademi (KEA)
- Contact person	Henrik Strøm, Assistant Professor, hstr@kea.dk
Nat./Int.	International
Project period	1 June 2019 - 31 May 2023
Project description
- Project summary	This project researches and develops quasi-objective models that rate digital samples, specifically handwriting samples, in a human-like way. It covers data collection via observational studies, theory building of machine learning models, system development of prototypes, and finally a real-world application experiment. The project is currently in an initial phase of data collection.
- Background and purpose	While doing research in machine learning and trying to learn Chinese at the same time, I came up with the idea of making an app, based on machine learning, that would help me practice writing Chinese characters. It turned out to be far from trivial. The aesthetics of handwriting is an intrinsically subjective matter. However, most people would still agree on whether a particular handwriting sample is “good” or “bad,” which makes handwriting a good case study. Machine learning models that rate handwriting will at best be quasi-objective, meaning that they have high external validity because they align with the consensus of a large number of real humans. This Ph.D. project aims to establish data sets and to research quasi-objective machine learning models based on them, for rating handwriting in real time and giving users immediate feedback on their handwriting.
- Activities and actions	During this project, two data sets will be collected and analyzed: (1) an augmentation of the MNIST data set with real human ratings, to establish a ground truth for training, evaluating, and testing models; (2) a data set of handwritten Chinese characters with a temporal factor, that is, time series of the strokes. For each of these data sets, the following activities will take place: (1) analysis of how the data set should be collected and analyzed; (2) collection of the data set; (3) theory building of machine learning models; (4) development of prototype models.
- Project method	The two major activities in this project are generating data sets to establish a ground truth, and researching and developing models around these data sets. A multi-methodological approach, as described by Nunamaker et al., is therefore taken. Observation, more specifically survey studies, is used to generate the data sets. These data sets support theory building in the form of machine learning models, and systems development of prototypes (see figure). Finally, a field experiment* will implement the best model selected in an app that real users can use. App development is planned to happen in collaboration with a third party. * Jay F. Nunamaker Jr., Minder Chen, and Titus D. M. Purdin. 1990. Systems development in information systems research. 7, 3 (1990), 89–106.
- Expected results	Quasi-objective models for the rating of digital samples with high external validity, two data sets, and at least four published articles.
- Expected impact	The models can be applied in digital learning scenarios. For example, students can practice handwriting on a tablet device with a pen and get immediate feedback, which frees the teacher to do other tasks.
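
As a minimal illustration of what "quasi-objective" could mean in practice, the sketch below averages the ratings of several human raters into a consensus score per sample and then measures how well a model's ratings agree with that consensus. Everything in it is an assumption made for illustration: the 1-5 rating scale, the example numbers, the use of the mean as consensus, the use of Pearson correlation as the agreement measure, and the function names `consensus_ratings` and `pearson`. The project description does not specify any of these.

```python
from statistics import mean


def consensus_ratings(ratings_by_sample):
    """Average all human ratings for each sample (assumed consensus measure)."""
    return [mean(ratings) for ratings in ratings_by_sample]


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: four handwriting samples, each rated 1-5 by three people.
human_ratings = [[5, 4, 5], [2, 1, 2], [3, 3, 4], [1, 2, 1]]
# Hypothetical model output for the same four samples.
model_ratings = [4.5, 1.8, 3.1, 1.2]

consensus = consensus_ratings(human_ratings)
agreement = pearson(model_ratings, consensus)
print(f"Agreement with human consensus: {agreement:.3f}")
```

A value close to 1 would indicate that the model ranks and scores samples much like the pooled human raters do. A real evaluation would need far more raters and samples, and possibly a rank-based or inter-rater agreement measure instead.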
Tags	datascience \| machinelearning
Participants
- Students
- Staff	Københavns Erhvervsakademi (KEA) Henrik Strøm
- Company representatives
- Others
Partners	Aalborg Universitet
Funding
- Internal	100%
- External
Result
Evaluation
Form of dissemination
- Dissemination of the result
- Value of the results
- Target group
- Publications