From 3168fa8b9270d0db9b3e4179065e8026fdca279a Mon Sep 17 00:00:00 2001
From: themodernhakr <ehrumsey@gmail.com>
Date: Tue, 25 Mar 2025 00:31:54 -0500
Subject: [PATCH] vault backup: 2025-03-25 00:31:54

---
 .../When Two Macros are Faster than One.md    | 24 +++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/Minecraft Datapacking/When Two Macros are Faster than One.md b/Minecraft Datapacking/When Two Macros are Faster than One.md
index e862f73..53852cb 100644
--- a/Minecraft Datapacking/When Two Macros are Faster than One.md	
+++ b/Minecraft Datapacking/When Two Macros are Faster than One.md	
@@ -2,11 +2,11 @@
 title: when two macros are faster than one
 draft: "false"
 ---
-While working on my Database datapack (still WIP), I knew I'd want to find 
+While working on my Database datapack (still WIP), I knew I'd want to find the most efficient way to look up entries in dynamically populated arrays at runtime. I had some ideas and decided to benchmark them using [Kragast's Benchmark Datapack](https://www.planetminecraft.com/data-pack/benchmark-6443027/). The process was really illuminating for me, and I hope it will be for you as well. Thanks for all the help from **PukiSilver**, **amandin**, and **Nicoder**.
 
 # scenario
 ## dataset
-The data is stored in a storage `#_macro.array`. Array is populated with a total of 500 entries, each having `id` and `string` fields.
+The testing data is stored in the storage `#_macro.array`. The array is populated with a total of 500 entries, each having `id` and `string` fields.
 ```json
 [
 	{
@@ -114,4 +114,24 @@ data remove storage test_namespace:test_namespace temp.index
 '# call the function that consumes 'temp.result', then remove it
 data remove storage test_namespace:test_namespace temp.result
 ```
+# two is faster than one??
+I ran benchmarks on a simple iteration-based function and the single-macro function suggested by **PukiSilver** and **amandin**. I also threw in the two-macro indexing function since I had already coded it. I assumed using one macro would be faster than two, but I was curious exactly *how* much faster it would be.
 
+As expected, the iteration-based function was sloooooow. Both macro functions blew it out of the water. Unexpectedly, however, the `two_macro` function more than doubled the performance of the `one_macro` function. Here are the results (bigger is better):
+
+| **function** | **benchmark** |
+| ------------ | ------------- |
+| iteration    | 416           |
+| one_macro    | 30342         |
+| two_macro    | 72450         |
+
+The `two_macro` function is *2.4x* faster than the `one_macro` function.
+
+What the heck is going on? How does *adding* an entire second macro function *improve* performance??
+It turns out that the clever and convenient `one_macro.array[string:$(keyword)]` path makes the game iterate over the array to filter it. That iteration happens inside the game's Java code when the path is resolved, so it's still much faster than iterating in mcfunction, but the cost is O(n) in the length of the array. In contrast, the `two_macro` approach directly accesses values by `key` and `index`, and each of those lookups is O(1). While I haven't tested it, this means that, when run on a larger dataset, the gap between `two_macro` and `one_macro` should continue to widen.
+# takeaways
+Indexing is cool. If you find yourself in a situation where you're working with moderate-to-large arrays and are able to index in advance of querying data, it's absolutely worth it from a query performance standpoint.
+
+However, *building* an index is pretty expensive, and it also requires active preplanning when writing a datapack. When items are added, updated, or deleted, the index will also need to be updated. A scheduled task should probably be run every so often to audit indexes and identify potential errors. Adding an index to a field that was not indexed from the start could be annoying.
+
+Point being, if it's worth it, *it's worth it*; if it's not, the `one_macro` one-liner is simpler and fast enough for most applications.
\ No newline at end of file