Topic 3: Introduction to the analysis of data generated from learners' activities, either digital or non-digital

Data mining research

Educational data mining researchers (e.g., Baker 2011; Baker and Yacef 2009) view the following as the goals for their research:

  1. Predicting students’ future learning behavior by creating student models that incorporate such detailed information as students’ knowledge, motivation, metacognition, and attitudes;
  2. Discovering or improving domain models that characterize the content to be learned and optimal instructional sequences;
  3. Studying the effects of different kinds of pedagogical support that can be provided by learning software; and
  4. Advancing scientific knowledge about learning and learners through building computational models that incorporate models of the student, the domain, and the software’s pedagogy.

To accomplish these four goals, educational data mining research uses five categories of technical methods (Baker 2011), four of which are described below.

  1. Prediction entails developing a model that can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables). Examples of using prediction include detecting student behaviors such as gaming the system, engaging in off-task behavior, or failing to answer a question correctly despite having the relevant skill.
  2. Clustering refers to finding data points that naturally group together and can be used to split a full dataset into categories. Examples of clustering applications are grouping students based on their learning difficulties and interaction patterns, such as how and how much they use tools in a learning management system (Amershi and Conati 2009), and grouping users for purposes of recommending actions and resources to similar users.
  3. Relationship mining involves discovering relationships between variables in a dataset and encoding them as rules for later use. For example, relationship mining can identify the relationships among products purchased in online shopping (Romero and Ventura 2010).
  4. Distillation for human judgment is a technique that involves depicting data in a way that enables a human to quickly identify or classify features of the data. This area of educational data mining improves machine-learning models because humans can identify patterns in, or features of, student learning actions, student behaviors, or data involving collaboration among students. This approach overlaps with visual data analytics (described in the third part of this section).
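To make relationship mining more concrete, the sketch below counts how often pairs of learning resources are used together and keeps the pairs that pass two thresholds. It is a minimal illustration only: the `sessions` data, the function names, and the threshold values are assumptions made for this example, not part of the cited studies. "Support" is the share of students who used both resources, and "confidence" is the share of students who used the second resource among those who used the first.

```python
from itertools import combinations

# Hypothetical data: the set of resources each student used in one week.
sessions = [
    {"video", "quiz", "forum"},
    {"video", "quiz"},
    {"video", "forum"},
    {"quiz", "notes"},
    {"video", "quiz", "notes"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def mine_pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (x, y, support, confidence) for every single-item rule x -> y
    that meets both thresholds (threshold values are illustrative)."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for x, y in ((a, b), (b, a)):
            s = support({x, y}, transactions)
            if s < min_support:
                continue
            c = confidence({x}, {y}, transactions)
            if c >= min_confidence:
                rules.append((x, y, s, c))
    return rules

rules = mine_pair_rules(sessions)
for x, y, s, c in rules:
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

A rule such as "students who read the notes also took the quiz" is the kind of relationship that could then be encoded for later use, for example to recommend a resource.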

Learning analytics

Technical methods used in learning analytics are varied and draw from those used in educational data mining. Additionally, learning analytics may employ:

  • Social network analysis (e.g., analysis of student-to-student and student-to-teacher relationships and interactions to identify disconnected students, influencers, etc.) and
  • Social or “attention” metadata to determine what a user is engaged with.
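As a rough illustration of social network analysis, the following sketch counts replies sent and received in a discussion forum to spot disconnected students and the most replied-to student. The student names, the `replies` log, and the use of simple reply counts as a stand-in for "influence" are all assumptions made for this example; real systems typically use richer network measures.

```python
# Hypothetical forum-reply log: (author_of_reply, author_replied_to).
students = ["ana", "ben", "chi", "dev", "eli"]
replies = [
    ("ana", "ben"), ("chi", "ben"), ("dev", "ben"),
    ("ben", "ana"), ("chi", "ana"),
]

def degree_counts(nodes, edges):
    """Count replies received (in-degree) and sent (out-degree) per student."""
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    return indeg, outdeg

indeg, outdeg = degree_counts(students, replies)

# A student with no replies sent or received is disconnected from the network.
disconnected = [s for s in students if indeg[s] + outdeg[s] == 0]

# The student who receives the most replies is a simple proxy for an influencer.
most_replied_to = max(students, key=lambda s: indeg[s])

print("Disconnected:", disconnected)
print("Most replied-to:", most_replied_to)
```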

In summary, learning analytics systems apply models to answer such questions as:


  • When are students ready to move on to the next topic?
  • When are students falling behind in a course?
  • When is a student at risk for not completing a course?
  • What grade is a student likely to get without intervention?
  • What is the best next course for a given student?
  • Should a student be referred to a counselor for help?
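One of these questions, what grade a student is likely to get without intervention, can be sketched with a simple straight-line fit. The example below is an illustrative assumption, not a method described in the sources: the past-cohort marks, the pass mark of 50, and the choice of a single predictor (the midterm mark) are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical marks from a previous cohort (percentages).
midterm = [40, 50, 60, 70, 80]
final = [45, 52, 63, 70, 80]

slope, intercept = fit_line(midterm, final)

PASS_MARK = 50  # assumed threshold for flagging a student as at risk

def predict_final(midterm_mark):
    """Predicted final mark for a current student, given the midterm mark."""
    return intercept + slope * midterm_mark

def at_risk(midterm_mark):
    """True when the predicted final mark falls below the pass mark."""
    return predict_final(midterm_mark) < PASS_MARK

for mark in (45, 65):
    print(mark, round(predict_final(mark), 1), at_risk(mark))
```

A flag raised this way would only be a prompt for a teacher or counselor to look more closely, since a single predictor ignores most of what is known about a student.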

Visual Data Analytics

Visual interactive principal components analysis (finding the components of a dataset that reduce many variables into few) is a technique once available only to statisticians that is now commonly used to detect trends and data correlations in multidimensional data sets.
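The numerical step behind principal components analysis can be sketched as follows. This is a minimal example under stated assumptions: the three-column activity data are invented and deliberately simple, the function name is hypothetical, and power iteration is used here only because it needs no external library. It finds the single direction of greatest variance and gives each student one score along it; the interactive visualization itself is not shown.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    Uses power iteration on the sample covariance matrix. Assumes the data
    have some variance (otherwise the normalisation divides by zero).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # starting guess for the direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the direction to get one score per row.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Hypothetical per-student counts: quiz attempts, forum posts, videos watched.
activity = [
    [1, 2, 5],
    [2, 4, 5],
    [3, 6, 5],
    [4, 8, 5],
]

direction, scores = first_principal_component(activity)
print([round(x, 3) for x in direction])
print([round(s, 3) for s in scores])
```

In this toy data the second column is always twice the first and the third never changes, so the three variables reduce to one component without losing information.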

Gapminder, for example, uses this approach in its analysis of multivariate datasets over time.

Websites such as Many Eyes offer tools for any user to create visualizations (map-based, text-based clouds and networks, and charts and graphs) of personal datasets.

Early in its release, the creators of Many Eyes discovered that it was being used for visual analytics, to check for data quality, to characterize social trends, and to reveal personal and collective sentiments or advocate for a position (Viégas et al. 2008). Like Many Eyes, other online services, such as Wordle and FlowingData, accept uploaded data and allow the user to configure the output to varying degrees. To facilitate the development of this field, the National Visualization and Analytics Center was established by the U.S. Department of Homeland Security to provide strategic leadership and coordination for visual analytics technology and tools nationwide, and this has broadened into a visual analytics community.

An Analysis of Data Activities and Instructional Supports in Middle School Science Textbooks

Coding categories, operational definitions, and results by textbook type

Technology used for gathering and analysing data: Web publishing tools


An ePortfolio is a learner-created collection of digital items, ideas, evidence, reflections, and feedback that presents a selected audience with evidence of a person’s learning and/or ability. Portfolios allow a student to demonstrate development over a period of time and support real-world tasks. They can be time-consuming to mark, and clear rubrics need to be provided.


Diagrams can be used to capture processes, concepts or creative solutions to problems.


Mindmap: Popplet, Mindomo, Coggle, MindMeister

Flowchart: Lucidchart

Infographic: Piktochart

General: Google Drawings

Quizzes and Surveys

Automarking ability lends itself well to large enrolments. Multiple-choice questions are suited to assessing basic recall and knowledge. Well-structured questions take time to write and quizzes take time to set up, but time is saved through automarking. Quizzes can give question-specific feedback, although this is time-consuming to create. To minimise the potential for plagiarism, strategies such as randomisation, complete-in-one-sitting settings, question banks, and multiple test versions can be used. Quizzes and surveys can also be used to collect authentic data for students to analyse, to have students create their own questions, and to support self-reflection.
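The randomisation, question-bank, and automarking ideas can be sketched in a few lines. This is an illustration under assumptions, not a description of how any particular quiz tool works: the question bank, the `build_quiz` and `auto_mark` functions, and the seed string are all hypothetical.

```python
import random

# Hypothetical question bank: id -> (prompt, options, correct option).
bank = {
    "q1": ("Which method groups similar students?",
           ["Prediction", "Clustering", "Relationship mining"], "Clustering"),
    "q2": ("Which method infers one variable from others?",
           ["Prediction", "Clustering", "Distillation"], "Prediction"),
    "q3": ("Which method encodes relationships as rules?",
           ["Clustering", "Relationship mining", "Prediction"], "Relationship mining"),
    "q4": ("Which method depicts data for a human to classify?",
           ["Distillation", "Prediction", "Clustering"], "Distillation"),
    "q5": ("Which analysis identifies disconnected students?",
           ["Social network analysis", "Clustering", "Prediction"],
           "Social network analysis"),
}

def build_quiz(student_id, n_questions=3, seed="demo-term"):
    """Draw a per-student quiz version from the bank.

    Seeding with the student id makes the version reproducible for that
    student while varying the questions and option order between students.
    """
    rng = random.Random(f"{seed}:{student_id}")
    chosen = rng.sample(sorted(bank), n_questions)
    quiz = []
    for qid in chosen:
        prompt, options, _ = bank[qid]
        shuffled = options[:]
        rng.shuffle(shuffled)
        quiz.append((qid, prompt, shuffled))
    return quiz

def auto_mark(responses):
    """Mark a dict of {question id: chosen option}; return (score, feedback)."""
    score = 0
    feedback = {}
    for qid, answer in responses.items():
        correct = bank[qid][2]
        if answer == correct:
            score += 1
            feedback[qid] = "Correct."
        else:
            feedback[qid] = f"Incorrect; the expected answer was {correct!r}."
    return score, feedback

quiz = build_quiz("student-001")
score, feedback = auto_mark({"q1": "Clustering", "q2": "Clustering"})
print(score, feedback)
```

Writing the question-specific feedback strings is the time-consuming part in practice; the marking itself is cheap once the bank exists.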


LEO quiz tool and Qualtrics

Under this topic we reviewed technology used for gathering and analyzing data, and we examined the categories, definitions, concept measurement, coding criteria, and other aspects of the analysis of data activities and instructional supports.