justin․searls․co

What one must pass to includes() to include Active Storage attachments

If you're using Active Storage, eager-loading nested associations that contain attachments in order to avoid the "N + 1" query problem can quickly reach the point of absurdity.

Working on the app for Becky's strength-training business, I got curious about how large the array of hashes passed to includes() becomes whenever the server loads the overall strength-training program. (This only happens on a few pages, like the program overview page, which genuinely does contain a boatload of information and images.)

Each symbol below represents a reference from one table to another. Every one that descends from :file_attachment is a reference to one of the tables managed by Active Storage for keeping track of cloud-hosted images and videos. Those hashes were extracted from the with_all_variant_records scope that Rails provides.

I mean, look at this:

[{:overview_video=>
   {:file_attachment=>
     {:blob=>
       {:variant_records=>{:image_attachment=>:blob}, :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}},
 {:overview_thumbnail=>
   {:file_attachment=>
     {:blob=>
       {:variant_records=>{:image_attachment=>:blob}, :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}},
 {:warmup_movement=>
   {:movement_video=>
     {:file_attachment=>
       {:blob=>
         {:variant_records=>{:image_attachment=>:blob}, :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}},
    :movement_preview=>
     {:file_attachment=>
       {:blob=>
         {:variant_records=>{:image_attachment=>:blob}, :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}}},
 {:workouts=>
   {:blocks=>
     {:mobility_movement=>
       [{:primary_equipment=>
          {:equipment_image=>
            {:file_attachment=>
              {:blob=>
                {:variant_records=>{:image_attachment=>:blob},
                 :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}},
         :secondary_equipment=>
          {:equipment_image=>
            {:file_attachment=>
              {:blob=>
                {:variant_records=>{:image_attachment=>:blob},
                 :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}},
         :tertiary_equipment=>
          {:equipment_image=>
            {:file_attachment=>
              {:blob=>
                {:variant_records=>{:image_attachment=>:blob},
                 :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}},
         :movement_video=>
          {:file_attachment=>
            {:blob=>
              {:variant_records=>{:image_attachment=>:blob},
               :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}},
         :movement_preview=>
          {:file_attachment=>
            {:blob=>
              {:variant_records=>{:image_attachment=>:blob},
               :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}}],
      :exercises=>
       {:exercise_options=>
         {:movement=>
           [{:primary_equipment=>
              {:equipment_image=>
                {:file_attachment=>
                  {:blob=>
                    {:variant_records=>{:image_attachment=>:blob},
                     :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}},
             :secondary_equipment=>
              {:equipment_image=>
                {:file_attachment=>
                  {:blob=>
                    {:variant_records=>{:image_attachment=>:blob},
                     :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}},
             :tertiary_equipment=>
              {:equipment_image=>
                {:file_attachment=>
                  {:blob=>
                    {:variant_records=>{:image_attachment=>:blob},
                     :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}},
             :movement_video=>
              {:file_attachment=>
                {:blob=>
                  {:variant_records=>{:image_attachment=>:blob},
                   :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}},
             :movement_preview=>
              {:file_attachment=>
                {:blob=>
                  {:variant_records=>{:image_attachment=>:blob},
                   :preview_image_attachment=>{:blob=>{:variant_records=>{:image_attachment=>:blob}}}}}}}]}}}}}]

By my count, that's 167 relationships! Of course, in practice it's not quite this bad since the vast majority are repeated, and as a result this winds up executing "only" 50 queries or so. But that's… a lot!
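
If you'd rather not count by hand, here's a rough sketch in plain Ruby (no Rails required) that rebuilds the structure above and tallies it, treating every symbol as one relationship. The constant and method names (VARIANTS, ATTACHMENT, MOVEMENT, count_relationships, and so on) are made up for this illustration and aren't part of Rails or the app; only the association names are copied from the dump.

```ruby
# Hypothetical constants for illustration; the nesting mirrors the dump above.
VARIANTS = {variant_records: {image_attachment: :blob}}.freeze

# What every attached file needs: its variants, plus the preview image's variants.
ATTACHMENT = {
  file_attachment: {
    blob: VARIANTS.merge(preview_image_attachment: {blob: VARIANTS})
  }
}.freeze

EQUIPMENT = {equipment_image: ATTACHMENT}.freeze

MOVEMENT = {
  primary_equipment: EQUIPMENT,
  secondary_equipment: EQUIPMENT,
  tertiary_equipment: EQUIPMENT,
  movement_video: ATTACHMENT,
  movement_preview: ATTACHMENT
}.freeze

PROGRAM_INCLUDES = [
  {overview_video: ATTACHMENT},
  {overview_thumbnail: ATTACHMENT},
  {warmup_movement: {movement_video: ATTACHMENT, movement_preview: ATTACHMENT}},
  {workouts: {
    blocks: {
      mobility_movement: [MOVEMENT],
      exercises: {exercise_options: {movement: [MOVEMENT]}}
    }
  }}
].freeze

# Counts each hash key and each leaf symbol as one relationship.
def count_relationships(node)
  case node
  when Symbol then 1
  when Array then node.sum { |child| count_relationships(child) }
  when Hash then node.sum { |_key, value| 1 + count_relationships(value) }
  else 0
  end
end

puts count_relationships(ATTACHMENT)       # => 10
puts count_relationships(PROGRAM_INCLUDES) # => 167
```

Building the hash out of named pieces like this also makes the repetition obvious: nearly everything in the dump is the same ten-symbol attachment hash stamped out 14 times.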

I've run into a lot of papercuts with Active Storage since starting to work with it in January of this year. I still believe it's the best tool for the job, but qualitatively it feels like it really would benefit from some simplification and refactoring, even if that would require some breaking changes to its (mostly undocumented, thankfully) rough edges.

An example frustration: 14 of these includes hashes point to preview_image_attachment, and each of those includes four more associations, for a total of 70 out of 167 relationships. But preview_image_attachment is actually a specially-named variant record that only exists separate and apart from variant_records because of a quirk in how non-image files like videos and PDFs are processed. Videos are analyzed for their first frame of content and PDFs for their first page, which is saved as an image attachment-of-the-attachment, and it's from that second-order attachment that all other image variants (thumbnails, etc.) are derived. However (and I could be wrong about this, because my own efforts to unwind Active Storage's code have been unsuccessful), that preview image could have just been stored as a normal variant record itself (perhaps referenced as the root variant via a referential column on active_storage_attachments) rather than factored as a full-blown attachment that requires 5 additional eager-load declarations for each attachment in a tree of models.
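
To sanity-check the 70-out-of-167 arithmetic, here's a small plain-Ruby sketch that counts just the preview branch of a single attachment's hash. The names (VARIANTS, PREVIEW, ATTACHMENT, count_relationships) are invented for illustration, and the two totals (14 attachments, 167 relationships) are simply taken from the dump above rather than computed here.

```ruby
# Hypothetical constants for illustration; shapes copied from the dump above.
VARIANTS = {variant_records: {image_attachment: :blob}}.freeze
PREVIEW = {preview_image_attachment: {blob: VARIANTS}}.freeze
ATTACHMENT = {file_attachment: {blob: VARIANTS.merge(PREVIEW)}}.freeze

# Figures taken from the post, not derived in this snippet.
ATTACHMENTS_IN_DUMP = 14
TOTAL_IN_DUMP = 167

# Counts each hash key and each leaf symbol as one relationship.
def count_relationships(node)
  case node
  when Symbol then 1
  when Array then node.sum { |child| count_relationships(child) }
  when Hash then node.sum { |_key, value| 1 + count_relationships(value) }
  else 0
  end
end

per_attachment = count_relationships(ATTACHMENT) # 10 symbols per attachment
per_preview = count_relationships(PREVIEW)       # 5 of them are the preview branch
preview_total = ATTACHMENTS_IN_DUMP * per_preview

puts per_attachment                                   # => 10
puts per_preview                                      # => 5
puts preview_total                                    # => 70
puts TOTAL_IN_DUMP - preview_total                    # => 97
puts (preview_total * 100.0 / TOTAL_IN_DUMP).round(1) # => 41.9
```

So half of every attachment's eager-load hash exists only to cover the preview image, which works out to roughly 42% of the whole structure.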

Does the number of symbols in the above really matter from a performance perspective? I don't know! But if we could cut the size of the mess I just pasted above by nearly half, that would certainly feel nice.

