quartz/content/notes/set.md
Jet Hughes eec8badee0 update
2022-04-28 11:51:17 +12:00

---
number headings: auto, first-level 1, max 6, 1.1
title: "set"
aliases: sets, Set, Sets
tags:
- cosc201
- datastructure
---
links: [java docs](https://docs.oracle.com/javase/7/docs/api/java/util/Set.html) for set interface
> A collection of items with no repetition allowed
How do we want to be able to use them?
- Add items to them
- Remove items from them
- Check whether they contain a given item
This gives us the methods:
- `Add(x)`
- `Remove(x)`
- `Contains(x)`
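As a rough illustration of these three operations, Java's standard `java.util.Set` interface exposes them as lower-case `add`, `remove` and `contains`. The sketch below uses `HashSet` purely as a convenient implementation; the class and method names `SetOperationsDemo`, `countDistinct` and `addThenRemove` are made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; only Set and HashSet come from the standard library.
class SetOperationsDemo {
    // Adds every item to a set and returns how many distinct items were seen.
    static int countDistinct(String[] items) {
        Set<String> seen = new HashSet<>();
        for (String item : items) {
            seen.add(item); // returns false and changes nothing if item is already present
        }
        return seen.size();
    }

    // Adds x, removes it again, and reports whether the set still contains it.
    static boolean addThenRemove(String x) {
        Set<String> s = new HashSet<>();
        s.add(x);
        s.remove(x);
        return s.contains(x);
    }
}
```

Calling `SetOperationsDemo.countDistinct(new String[] {"a", "b", "a"})` counts `"a"` only once, because a set does not allow repetition.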
A [binary search tree](notes/binary-search-tree.md) can be used to implement a set when there is an ordering on the underlying elements.
[Hash sets and hash maps](notes/hash-map.md) can be used when there is no ordering.
# 1 Implementations
## 1.1 Basic set
The simplest way to implement a set is with an array.
[Code for basic set](https://blackboard.otago.ac.nz/bbcswebdav/pid-2890167-dt-content-rid-18354837_1/courses/COSC201_S1DNIE_2022/BasicSet.java)
- Contains: Simple linear search
- Add: Check if it is present, if not add it to the end
- Remove: Delete the element and replace it with the last element
- This leaves empty elements at the end.
- So we keep track of the size
All three operations are $O(n)$ in the worst case, since each one starts with a linear search that may scan the entire array.
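The bullets above can be sketched roughly as follows. This is a minimal illustration for `int` elements, not the course's `BasicSet.java`; the class name `ArrayIntSet`, the initial capacity of 8 and the doubling growth policy are all assumptions made for the example.

```java
import java.util.Arrays;

// A minimal array-backed set of ints (hypothetical sketch, not the linked BasicSet.java).
class ArrayIntSet {
    private int[] items = new int[8]; // slots at index >= size are unused
    private int size = 0;             // number of elements currently stored

    // Linear search: returns the index of x, or -1 if x is absent. O(n).
    private int indexOf(int x) {
        for (int i = 0; i < size; i++) {
            if (items[i] == x) return i;
        }
        return -1;
    }

    boolean contains(int x) {
        return indexOf(x) >= 0;
    }

    // Appends x unless it is already present; returns true if the set changed.
    boolean add(int x) {
        if (indexOf(x) >= 0) return false;
        if (size == items.length) items = Arrays.copyOf(items, 2 * size); // grow when full
        items[size++] = x;
        return true;
    }

    // Overwrites x with the last element and shrinks size; returns true if x was present.
    boolean remove(int x) {
        int i = indexOf(x);
        if (i < 0) return false;
        items[i] = items[--size];
        return true;
    }

    int size() {
        return size;
    }
}
```

Moving the last element into the gap keeps the used part of the array contiguous without shifting anything, which is only acceptable because this set does not promise any particular order.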
## 1.2 Ordered set
A set with some underlying "natural" order, e.g., dictionary order for `String` objects.
We would also like to be able to do an in-order traversal of an ordered set.
If the set is static:
- Store it in a sorted array
- Use binary search to find elements --> $O(\lg n)$
- Traverse by incrementing a counter --> $O(\lg n)$ to find the starting point, then $O(1)$ per step
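For the static case, a sketch along these lines may help. It assumes the caller hands over an array that is already sorted and free of duplicates; the class name `StaticSortedSet` and the `inOrder` helper are hypothetical.

```java
// A read-only set over a sorted array of strings (hypothetical sketch).
class StaticSortedSet {
    private final String[] items; // assumed sorted in natural order, no duplicates

    StaticSortedSet(String[] sortedItems) {
        this.items = sortedItems.clone();
    }

    // Binary search: halves the candidate range [lo, hi) on every step, so O(lg n).
    boolean contains(String x) {
        int lo = 0, hi = items.length;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            int cmp = x.compareTo(items[mid]);
            if (cmp == 0) return true;
            if (cmp < 0) hi = mid; else lo = mid + 1;
        }
        return false;
    }

    // In-order traversal is a left-to-right walk with an index counter.
    String inOrder() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(items[i]);
        }
        return sb.toString();
    }
}
```

Because the array never changes after construction, nothing ever has to be shifted.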
But then:
- `Add(x)`: insert `x` if it's not already present. The search is cheap, but the insertion itself is $O(n)$ because everything after the insertion point has to shift along by one
- `Remove(x)`: find `x` if present, then move everything beyond it back over the top --> $O(n)$
This is fine if we don't expect to use the dynamic operations a lot.
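To make the cost of the dynamic operations concrete, here is a hedged sketch of a sorted-array set of `int` values. It relies on the standard `Arrays.binarySearch` and `System.arraycopy` calls; the class name `SortedArrayIntSet`, the initial capacity and the doubling policy are assumptions for illustration only.

```java
import java.util.Arrays;

// A sorted-array set of ints (hypothetical sketch): fast lookup, O(n) add and remove.
class SortedArrayIntSet {
    private int[] items = new int[8]; // items[0..size) is kept sorted
    private int size = 0;

    // Index of x if present, otherwise -(insertionPoint) - 1, as Arrays.binarySearch defines it.
    private int find(int x) {
        return Arrays.binarySearch(items, 0, size, x);
    }

    boolean contains(int x) {
        return find(x) >= 0; // O(lg n)
    }

    boolean add(int x) {
        int pos = find(x);
        if (pos >= 0) return false;          // already present
        int at = -pos - 1;                   // where x belongs
        if (size == items.length) items = Arrays.copyOf(items, 2 * size);
        System.arraycopy(items, at, items, at + 1, size - at); // shift the tail right: O(n)
        items[at] = x;
        size++;
        return true;
    }

    boolean remove(int x) {
        int pos = find(x);
        if (pos < 0) return false;           // not present
        System.arraycopy(items, pos + 1, items, pos, size - pos - 1); // shift the tail left: O(n)
        size--;
        return true;
    }

    // Copy of the stored elements in sorted order.
    int[] toArray() {
        return Arrays.copyOf(items, size);
    }
}
```

The two `System.arraycopy` calls are where the linear cost comes from: the search is logarithmic, but up to `size` elements may need to move on every update.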
Another approach might be to maintain a `main` array plus subsidiary `add` and `remove` arrays, and only periodically apply the updates to the main array. But this gets complicated very quickly.
A better way of doing this is to use a [binary search tree](notes/binary-search-tree.md).