mirror of
https://github.com/jackyzha0/quartz.git
synced 2025-12-24 13:24:05 -06:00
188 lines
6.0 KiB
Markdown
---
title: Evaluating designs
sr-due: 2022-04-07
sr-interval: 10
sr-ease: 210
---

tags: #review

---
#### Review questions
1. What are two of the five (PGRCW) issues to consider when evaluating designs?
    - precision and reliability
        - are your evaluations repeatable
        - are they accurate
    - generalizability
        - is your sample representative of real users
    - realism
        - would the observed behaviour also occur in the real world
    - comparison
        - better than just a "do you like it" study

3. What are the two styles of evaluation? How do they differ?
    - ~~qualitative and quantitative~~
    - field and lab studies

4. When would you use a qualitative method and when would you use a quantitative method?
    - quantitative when comparing alternative designs or testing hypotheses
    - qualitative when gauging users' overall reaction to a design and finding flaws

5. What are the two stages of design, and which type of method should you use in each?
    - design stage: qualitative
    - implementation: quantitative

6. Give a brief description of one method of evaluation.
    - feedback from experts
    - usability studies
    - observation
    - simulation/maths
    - surveys/focus groups
    - comparative experiments

---

# Evaluating-designs
Why evaluate using 'outside' people:
- how do we know if a [prototype](content/notes/prototyping.md) is good
- designers/developers are not 'fresh' -> they already have experience with the product
- designers/developers don't know what real users will do

## Issues to consider
- Reliability/precision
    - How accurate is your study?
    - Is it reproducible -> if it was repeated, would you get the same result?
- Generalizability
    - Is your sample representative?
- Realism
    - Would observed behaviour also occur in the wild?
- Comparison
    - Shows how different options were received
    - rather than a "people liked it" study
- Work involved/efficiency
    - How cost-efficient are your methods?

## Factors to consider when choosing an evaluation method
- Stage in the cycle at which the evaluation is carried out -> (design / implementation)
- Style of evaluation -> (lab / field)
- Level of subjectivity or objectivity
- Type of measurement -> (qualitative / quantitative)
- Information provided -> (high-level / low-level)
- Immediacy of response -> (real-time / recollection of events)
- Level of interference implied -> (intrusiveness)
- Resources required -> (equipment, time, money, subjects, expertise, context)

## Styles of evaluation
##### Laboratory Studies
- First step: the designer evaluates their own UI
- Specialised testing equipment is available
- Undisturbed (can be a good or bad thing)
- Allows for well-controlled experiments
- Substitute for dangerous or remote real-world locations
- Variations in manipulations possible / alternatives

##### Field Studies
- Within the actual user’s working environment
- Observe the system in action
- Disturbance / interruptions (+/-)
- Long-term studies possible
- Bias: presence of observer and equipment
- Needs support / disturbs real workflow

## Quantitative vs Qualitative methods
##### Quantitative Measures
- Usually numeric
- E.g. # of errors, time to complete a certain task, questionnaire with scales
- Can be (easily) analysed using statistical techniques
- Rather objective
- Most useful in comparing alternative designs
    - Test hypotheses
    - Confirm designs
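
As a rough illustration of how a numeric measure can be used to compare two alternative designs, the sketch below computes Welch's t statistic on task completion times. The sample data, the variable names, and the choice of Welch's test are assumptions made for illustration; they are not part of the original notes.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se_a = statistics.variance(sample_a) / len(sample_a)
    se_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(sample_a) - 1) + se_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical task completion times in seconds, one value per participant.
design_a = [41.0, 38.5, 44.2, 39.9, 42.3, 40.1]
design_b = [35.2, 33.8, 36.9, 34.1, 37.5, 32.6]

t, df = welch_t(design_a, design_b)
print(f"mean A = {statistics.mean(design_a):.1f} s, mean B = {statistics.mean(design_b):.1f} s")
print(f"Welch t = {t:.2f}, df = {df:.1f}")
```

A large absolute t value suggests the difference in mean completion time is unlikely to be noise; the statistic would still need to be checked against a t distribution (or a statistics package) to obtain a p-value.
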

##### Qualitative Measures
- Non-numeric
- E.g. survey, interview, informal observation, heuristic evaluation
- Difficult to analyse, demands interpretation
- Rather subjective
- User’s overall reaction and understanding of design
    - Generate hypotheses
    - Find flaws

## Stage in cycle
##### Design Stage
- Only a concept (even if very detailed) exists
- More experts, fewer users involved
- Greatest pay-off: early error detection saves a lot of development money
- Rather qualitative measures (exceptions: detail alternatives; fundamental questions, ...)

##### Implementation
- Artefact exists, something concrete to be tested
- More users, fewer experts involved
- Assures quality of product before or after deployment; bug detection
- Rather quantitative measures (exceptions: overall satisfaction, appeal, ...)

## Methods
### Usability studies
- Bringing people in to test the product
- Usage setting is not ecologically valid - usage in the real world can be different
- can have tester bias - testers are not the same as real users
- can't compare interfaces
- requires physical contact
### Surveys and focus groups
+ quickly get feedback from a large number of respondents
+ auto tally results
+ easy to compare different products
- responder bias
- Not an accurate representation of the real product
* e.g.,
* Focus groups
    * gathering groups of people to discuss an interface
    * group setting can help or hinder

### Feedback from experts
- [Peer critique](None)
- [Dogfooding](None)
    - Using tools yourself
- [Heuristic Evaluation](content/notes/heuristic-evaluation.md)
    - structured feedback

### Comparative experiments
- in lab, field, online
- short or long duration
- which option is better?
    - what matters most?
- can see real usage
- more actionable

### Participant observation
- observe what people do in the actual environment
- usually more long term
- find things not present in short-term studies
- [Observation](content/notes/observation.md)

### Simulation and formal models
- more mathematical / quantitative
- useful if you have a theory to test
- often used for input techniques
- can test multiple alternatives quickly
- typically simulation is used in conjunction with [monte carlo optimisation](None)
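
To make the idea of testing alternatives by simulation more concrete, here is a minimal sketch that uses Monte Carlo sampling with Fitts' law (Shannon formulation, `MT = a + b * log2(D / W + 1)`) to compare two button sizes. The constants `a` and `b`, the range of pointer distances, and the two candidate widths are all hypothetical values chosen for illustration; the notes do not specify a particular model.

```python
import math
import random


def fitts_time(distance, width, a=0.2, b=0.1):
    """Predicted movement time in seconds (Shannon form of Fitts' law).

    a and b are hypothetical constants; real values are fitted from user data.
    """
    return a + b * math.log2(distance / width + 1)


def simulate_mean_time(width, trials=10_000, seed=0):
    """Average predicted time over randomly sampled pointer distances."""
    rng = random.Random(seed)  # fixed seed so every candidate sees the same distances
    total = 0.0
    for _ in range(trials):
        distance = rng.uniform(50, 800)  # assumed distance range in pixels
        total += fitts_time(distance, width)
    return total / trials


# Hypothetical design alternatives: target width in pixels.
candidates = {"small button (24 px)": 24, "large button (48 px)": 48}
results = {name: simulate_mean_time(width) for name, width in candidates.items()}
best = min(results, key=results.get)

for name, mean_time in results.items():
    print(f"{name}: {mean_time:.3f} s")
print("fastest predicted design:", best)
```

Because no participants are needed, many alternatives can be scored this way in seconds, but the result is only as good as the model and its assumed parameters.
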

## Query techniques
- [Interviews](content/notes/interviews.md)
- questionnaires
    - less flexible
    - larger samples possible
    - designing a questionnaire is a job for experts only
    - use of standard (proven) questionnaires recommended
    - types of questions:
        - general (age, gender)
        - open-ended
        - scalar (e.g., Likert-like scales)
        - multiple choice
        - ranking
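
Scalar questions produce ordinal data, so a common way to summarise them is with the median, the mode, and the count per scale point rather than the mean. The sketch below shows one way to do that; the 5-point scale, the function name `summarise_likert`, and the responses are hypothetical.

```python
import statistics
from collections import Counter


def summarise_likert(responses, scale=(1, 2, 3, 4, 5)):
    """Summarise ordinal responses: size, median, mode, and count per scale point."""
    counts = Counter(responses)
    return {
        "n": len(responses),
        "median": statistics.median(responses),
        "mode": statistics.mode(responses),
        "distribution": {point: counts.get(point, 0) for point in scale},
    }


# Hypothetical answers to "The interface was easy to use"
# (1 = strongly disagree, 5 = strongly agree).
responses = [4, 5, 3, 4, 2, 4, 5, 3, 4, 1]
summary = summarise_likert(responses)

print("median:", summary["median"], "mode:", summary["mode"])
for point, count in summary["distribution"].items():
    print(point, "#" * count)
```
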

## Users
- users can come up with great ideas
- lead user -> needs a specific solution that does not exist -> often makes up their own solution
- extreme user -> uses an existing solution for its intended purpose to an extreme degree
- typical user ->