---
title: when two macros are faster than one
draft: "false"
---

While working on my Database datapack (still WIP), I knew I'd want to find the most efficient way to dynamically access dynamically populated arrays. I had some ideas and decided to benchmark them using [Kragast's Benchmark Datapack](https://www.planetminecraft.com/data-pack/benchmark-6443027/). This process was really illuminating to me, and I hope it will be for you as well. Thanks for all the help from **PukiSilver**, **amandin**, and **Nicoder**.

# scenario

## dataset

The testing data is stored in the storage `#_macro.array`. The array is populated with a total of 500 entries, each having `id` and `string` fields.

```json
[
    {
        "id": 0,
        "string": "..."
    },
    ...
]
```

```mcfunction
data remove storage test_namespace:test_namespace temp.index
# call the function that consumes 'temp.result', then remove it
data remove storage test_namespace:test_namespace temp.result
```

# two is faster than one??

I ran benchmarks on a simple iteration-based function and the single-macro function suggested by **PukiSilver** and **amandin**. I also threw in the two-macro indexing function since I had already coded it. I assumed using one macro would be faster than two, but I was curious exactly *how* much faster it would be.

As expected, the iteration-based function was sloooooow. Both macro functions blew it out of the water. Unexpectedly, however, the `two_macro` function more than doubled the performance of the `one_macro` function. Here are the results (bigger is better):

| **function** | **benchmark** |
| ------------ | ------------- |
| iteration    | 416           |
| one_macro    | 30342         |
| two_macro    | 72450         |

The `two_macro` function is *2.4x* faster than the `one_macro` function.

What the heck is going on? How does *adding* an entire second macro function *improve* performance??

It turns out that the clever and convenient `one_macro.array[string:$(keyword)]` triggers iteration to filter the array. Since that iteration happens inside the game's NBT path resolution (the macro just substitutes the keyword into the path), it runs directly in Java code. That's still much faster than iterating in mcfunction, but the cost is O(n) in the array length. In contrast, the `two_macro` approach accesses values directly by `key` and by `index`, and those lookups are O(1). While I haven't tested it, this means that on a larger dataset the gap between `two_macro` and `one_macro` should continue to widen.
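
To make the difference concrete, here's a minimal sketch of what the two lookup styles could look like, assuming the index was built ahead of time as a compound that maps each `string` value to its position in the array. The storage name, paths, and function names below are illustrative assumptions, not the exact code used in the benchmark.

```mcfunction
# one_macro sketch: a single macro line filters the array by its 'string' field.
# The filter runs in Java, but it still has to walk the whole list: O(n).
$data modify storage test_namespace:test_namespace temp.result set from storage test_namespace:test_namespace array[{string: "$(keyword)"}]
```

```mcfunction
# two_macro sketch, part 1: read the entry's position out of the pre-built
# 'index' compound (a plain key lookup, O(1)), then hand it to part 2.
$data modify storage test_namespace:test_namespace temp.index set from storage test_namespace:test_namespace index."$(keyword)"
function test_namespace:get_by_index with storage test_namespace:test_namespace temp
```

```mcfunction
# two_macro sketch, part 2 (test_namespace:get_by_index): grab the element
# directly by its numeric index, also O(1).
$data modify storage test_namespace:test_namespace temp.result set from storage test_namespace:test_namespace array[$(index)]
```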

# takeaways

Indexing is cool. If you find yourself working with moderate-to-large arrays and are able to index the data in advance of querying it, it's absolutely worth it from a query-performance standpoint.

However, *indexing* is pretty expensive, and it also requires active preplanning when writing a datapack. When items are added, updated, or deleted, the index needs to be updated along with them. A scheduled task should probably run every so often to audit indexes and catch potential errors. Retroactively indexing fields that don't already have an index could be annoying.
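
To give a sense of that bookkeeping, here's a rough sketch of appending a new entry while keeping the index in sync. The storage, paths, macro argument names, and the `db.tmp` scratch objective are all illustrative assumptions.

```mcfunction
# sketch of an 'add entry' macro function: append the entry, then record its
# position in the index so keyword lookups stay O(1).
# macro args: $(id) and $(string); assumes the scoreboard objective 'db.tmp' exists.
$data modify storage test_namespace:test_namespace array append value {id: $(id), string: "$(string)"}
# 'data get' on a list returns its length; the new entry sits at length - 1
execute store result score #len db.tmp run data get storage test_namespace:test_namespace array
scoreboard players remove #len db.tmp 1
# write that position into the index under the entry's string key
$execute store result storage test_namespace:test_namespace index."$(string)" int 1 run scoreboard players get #len db.tmp
```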

Point being, if it's worth it, *it's worth it*; if it's not, the `one_macro` one-liner is simpler and fast enough for most applications.