quartz/content/notes/evaluating-designs.md

---
title: Evaluating designs
sr-due: 2022-04-07
sr-interval: 10
sr-ease: 210
---

tags: #review

---
#### Review questions
1. what are two of the five (PGRCW) issues to consider when evaluating designs
- precision and reliability
	- are your evaluations repeatable
	- are they accurate
- generalizability
	- is your sample representative
- realism
	- do your results apply to the real world
- comparison
	- better than just "do you like it study"

3.  what are the two styles of evaluation. How do they differ
- ~~qualitative and quantitative~~
- field and lab studies

4. when would you use a qualitative method and when would you use a quantitative method
- quantitative when comparing alternative designs or testing hypotheses
- qualitative when finding flaws or gauging users' overall reaction to a design

5. what are two stages of design and in each cycle which type of method should you use
- design stage qualitative
- implementation quantitative

6. give a brief description of one method of evaluation.
- feedback from experts
- usability studies
- observation
- simulation/maths
- surveys/focus groups
- comparative experiments

---

# Evaluating-designs
Why to evaluate using 'outside' people:
- how do we know if a [prototype](content/notes/prototyping.md) is good
- designer/developers are not 'fresh' -> they already have experience with the product
- designer/developers don't know what real users will do

## Issues to consider
- Reliability/precision
	- how accurate is your study?
	- Is it reproducible -> if it was repeated, would you get the same result
- Generalizability
	- Is your sample representative
- Realism
	- Would observed behaviour also occur in the wild
- Comparison
	- Shows how different options were received
	- rather than a "people liked it" study
- work involved/efficiency
	- How cost efficient are your methods

## Factors to consider when choosing an evaluation method
- Stage in the cycle at which the evaluation is carried out -> (design / implementation)
- Style of evaluation -> (lab / field)
- Level of subjectivity or objectivity
- Type of measurement -> (qualitative / quantitative)
- Information provided -> (high-level / low-level)
- Immediacy of response -> (real-time / recollection of events)
- Level of interference implied -> (intrusiveness)
- Resources required -> (equipment, time, money, subjects, expertise, context)

## Styles of evaluation
##### Laboratory Studies
- 1st step: Designer evaluates their own UI
- Specialised equipment for testing available
- Undisturbed (can be a good or bad thing)
- Allows for well controlled experiments
- Substitute for dangerous or remote real-world locations
- Variations in manipulations possible / alternatives

##### Field Studies
- Within the actual user’s working environment
- Observe the system in action
- Disturbance / interruptions (+/-)
- Long-term studies possible
- Bias: presence of observer and equipment
- Needs support / disturbs real workflow

## Quantitative vs Qualitative methods
##### Quantitative Measures
- Usually numeric
- E.g. # of errors, time to complete a certain task, questionnaire with scales
- Can be (easily) analysed using statistical techniques
- Rather objective
- Most useful in comparing alternative designs
- Test hypotheses
- Confirm designs
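
As a rough illustration of analysing a quantitative measure, the sketch below compares task completion times for two alternative designs using Welch's t statistic. The timing data is made up for the example, and `welch_t` is a hypothetical helper, not something from the course material. Turning the statistic into a p-value would need a t-distribution table or a statistics library.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se_a = statistics.variance(sample_a) / len(sample_a)
    se_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(sample_a) - 1) + se_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical task completion times in seconds (illustrative data only).
design_a = [12.0, 14.0, 16.0, 18.0, 20.0]
design_b = [10.0, 11.0, 12.0, 13.0, 14.0]

t, df = welch_t(design_a, design_b)
print(f"mean A = {statistics.mean(design_a):.1f}s, mean B = {statistics.mean(design_b):.1f}s")
print(f"Welch t = {t:.2f}, df = {df:.1f}")
```

A larger absolute t value suggests the difference in means is less likely to be noise, which is the kind of evidence a "people liked it" study cannot give.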

##### Qualitative Measures
- Non-numeric
- E.g. survey, interview, informal observation, heuristic evaluation
- Difficult to analyse, demands interpretation
- Rather subjective
- User’s overall reaction and understanding of design
- Generate hypotheses
- Find flaws

## Stage in cycle
##### Design Stage
- Only concept (even if very detailed) exists
- More experts, fewer users involved
- Greatest pay-off: early error detection saves a lot of development money
- Rather qualitative measures (exceptions: detail alternatives; fundamental questions, ...)

##### Implementation
- Artefact exists, something concrete to be tested
- More users, fewer experts involved
- Assures quality of product before or after deployment; bug detection
- Rather quantitative measures (exceptions: overall satisfaction, appeal, ...)

## Methods
### Usability studies
- Bringing people in to test the product
	- Usage setting is not ecologically valid - usage in real world can be different
	- can have tester bias - testers are not the same as real users
	- can't compare interfaces
	- requires physical contact
### Surveys and focus groups
+ quickly get feedback from a large number of respondents
+ results can be tallied automatically
+ easy to compare different products
- responder bias
- Not accurate representation of real product
* Focus groups
	* gathering groups of people to discuss an interface
	* group setting can help or hinder

### Feedback from experts
- Peer critique
- Dogfooding
	- Using tools yourself
- [Heuristic Evaluation](content/notes/heuristic-evaluation.md)
	- structured feedback

### Comparative experiments
- in lab, field, online
- short or long duration
- which option is better?
- what matters most?
- can see real usage
- more actionable

### Participant observation
- observe what people do in the actual environment
- usually more long term
	- find things not present in short term studies
- [Observation](content/notes/observation.md)

### Simulation and formal models
- more mathematical and quantitative
- useful if you have a theory to test
- often used for input techniques
- can test multiple alternatives quickly
- typically simulation is used in conjunction with Monte Carlo optimisation
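
A minimal sketch of simulating an input technique, assuming Fitts's law (Shannon formulation, `MT = a + b * log2(D / W + 1)`) as the formal model. The constants `a` and `b`, the distance range, and the two target widths are invented for illustration. This is plain Monte Carlo sampling of random target distances, not a full optimisation loop.

```python
import math
import random


def fitts_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds (Fitts's law, Shannon formulation).

    a and b are device-dependent constants; the defaults here are assumptions.
    """
    return a + b * math.log2(distance / width + 1)


def simulate_mean_time(width, trials=10_000, seed=0):
    """Monte Carlo estimate of the mean pointing time for a given target width."""
    rng = random.Random(seed)  # fixed seed so both designs see the same distances
    total = 0.0
    for _ in range(trials):
        distance = rng.uniform(50, 800)  # assumed cursor-to-target distance in pixels
        total += fitts_time(distance, width)
    return total / trials


# Compare two hypothetical button sizes without recruiting any participants.
small = simulate_mean_time(width=20)
large = simulate_mean_time(width=60)
print(f"20px target: {small:.3f}s, 60px target: {large:.3f}s")
```

Because the model is cheap to evaluate, many alternatives can be tested quickly, but the result is only as good as the model and its assumed constants.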

## Query techniques
- [Interviews](content/notes/interviews.md)
- questionnaires
	- less flexible
	- larger samples possible
	- design of questionnaire is for expert only
	- use of standard (proven) questionnaires recommended
	- types of questions:
		- general (age, gender)
		- open ended
		- scalar (e.g., likert-like scales)
		- multiple choice
		- ranking
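
A small sketch of tallying scalar (Likert-style) questionnaire responses automatically. The 5-point labels and the response data are assumptions made for the example. The median is reported rather than the mean because Likert responses are ordinal.

```python
from collections import Counter
import statistics

# Assumed 5-point scale; real questionnaires may use different labels or ranges.
LIKERT_LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}


def summarise_likert(responses):
    """Count responses per scale point and report the median score."""
    for score in responses:
        if score not in LIKERT_LABELS:
            raise ValueError(f"invalid Likert score: {score}")
    counts = Counter(responses)
    return {
        # Include scale points nobody chose so the tally is always complete.
        "counts": {score: counts.get(score, 0) for score in LIKERT_LABELS},
        "median": statistics.median(responses),
    }


# Hypothetical answers to "The interface was easy to use".
responses = [4, 5, 3, 4, 2, 4, 5, 1, 4, 3]
summary = summarise_likert(responses)
for score, label in LIKERT_LABELS.items():
    print(f"{label:>17}: {summary['counts'][score]}")
print("median:", summary["median"])
```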

## Users
- users can come up with great ideas
	- lead user -> needs a specific solution that does not exist -> often makes up their own solution
	- extreme user -> uses an existing solution for its intended purpose to an extreme degree
	- typical user ->