{"id":3696,"date":"2025-03-07T22:16:13","date_gmt":"2025-03-07T13:16:13","guid":{"rendered":"https:\/\/aireviewirush.com\/?p=3696"},"modified":"2025-03-07T22:16:13","modified_gmt":"2025-03-07T13:16:13","slug":"introducing-distill-cli-an-environment-friendly-rust-powered-instrument-for-media-summarization","status":"publish","type":"post","link":"https:\/\/aireviewirush.com\/?p=3696","title":{"rendered":"Introducing Distill CLI: An efficient, Rust-powered tool for media summarization"},"content":{"rendered":"<div id=\"\">\n<p><img decoding=\"async\" src=\"\/images\/distill-cli-header.png\" alt=\"Distill CLI summarizing The Frugal Architect\" loading=\"lazy\"\/><\/p>\n<p>A few weeks ago, I wrote about a project our team has been working on called <a href=\"http:\/\/www.allthingsdistributed.com\/2024\/05\/hacking-our-way-to-better-team-meetings.html\" target=\"_blank\" rel=\"noopener\">Distill<\/a>. A simple application that summarizes and extracts important details from our daily meetings. At the end of that post, I promised you a CLI version written in Rust. After a few code reviews from Rustaceans at Amazon and a bit of polish, today, I\u2019m ready to share the <a href=\"https:\/\/github.com\/awslabs\/distill-cli\" onclick=\"fathom.trackEvent(&quot;Distill CLI - Source&quot;)\" target=\"_blank\" rel=\"noopener\">Distill CLI<\/a>.<\/p>\n<p>After you build from source, simply pass Distill CLI a media file and select the S3 bucket where you\u2019d like to store the file. Today, Distill supports outputting summaries as Word documents, text files, and printing directly to the terminal (the default). 
You\u2019ll find that it\u2019s easily extensible \u2013 my team (OCTO) is already using it to export summaries of our team meetings directly to Slack (and working on support for Markdown).<\/p>\n
<h2 id=\"tinkering-is-a-good-way-to-learn-and-be-curious\"><span class=\"ez-toc-section\" id=\"Tinkering_is_an_efficient_strategy_to_study_and_be_curious\"><\/span>Tinkering is a good way to learn and be curious <a href=\"#tinkering-is-a-good-way-to-learn-and-be-curious\"\/><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The way we build has changed quite a bit since I started working with distributed systems. Today, if you want it, compute, storage, databases, and networking are available on demand. As builders, our focus has shifted to faster and faster innovation, and along the way tinkering at the system level has become a bit of a lost art. But tinkering is as important now as it has ever been. I vividly remember the hours spent fiddling with BSD 2.8 to make it work on PDP-11s, and it cemented my never-ending love for OS software. Tinkering gives us an opportunity to really get to know our systems. 
To experiment with new languages, frameworks, and tools. To look for efficiencies big and small. To find inspiration. And this is exactly what happened with Distill.<\/p>\n<p>We rewrote one of our Lambda functions in Rust, and observed that cold starts were 12x faster and the memory footprint decreased by 73%. Before I knew it, I began to think about other ways I could make the entire process more efficient for my use case.<\/p>\n<p>The original proof of concept stored media files, transcripts, and summaries in S3, but since I\u2019m running the CLI locally, I realized I could store the transcripts and summaries in memory and save myself a few writes to S3. I also wanted an easy way to upload media and monitor the summarization process without leaving the command line, so I cobbled together a simple UI that provides status updates and lets me know when anything fails. The original showed what was possible, it left room for tinkering, and it was the blueprint that I used to write the Distill CLI in Rust.<\/p>\n<p>I encourage you to <a href=\"https:\/\/github.com\/awslabs\/distill-cli\" onclick=\"fathom.trackEvent(&quot;Distill CLI - Source&quot;)\" target=\"_blank\" rel=\"noopener\">give it a try<\/a>, and let me know if you find any bugs or edge cases, or have ideas to improve on it.<\/p>\n<h2 id=\"builders-are-choosing-rust\"><span class=\"ez-toc-section\" id=\"Builders_are_selecting_Rust\"><\/span>Builders are choosing Rust <a href=\"#builders-are-choosing-rust\"\/><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>As technologists, we have a responsibility to build sustainably. And this is where I really see Rust\u2019s potential. With its emphasis on performance, memory safety, and concurrency, there is a real opportunity to decrease computational and maintenance costs. 
Its memory safety guarantees eliminate obscure bugs that plague C and C++ projects, reducing crashes without compromising performance. Its concurrency model enforces strict compile-time checks, preventing data races and making the most of multi-core processors. And while compilation errors can be bloody aggravating in the moment, fewer developers chasing bugs and more time focused on innovation are always good things. That\u2019s why it\u2019s become a go-to for builders who thrive on solving problems at unprecedented scale.<\/p>\n<p>Since 2018, we have increasingly leveraged Rust for critical workloads across various services like S3, EC2, DynamoDB, Lambda, Fargate, and Nitro, especially in scenarios where hardware costs are expected to dominate over time. In his guest post last year, Andy Warfield wrote a bit about ShardStore, the bottom-most layer of S3\u2019s storage stack that manages data on each individual disk. He explained that Rust was chosen to get type safety and structured language support to help identify bugs sooner, and described how they wrote libraries to extend that type safety to on-disk structures. If you haven\u2019t already, I recommend that you <a href=\"http:\/\/www.allthingsdistributed.com\/2023\/07\/building-and-operating-a-pretty-big-storage-system.html\" target=\"_blank\" rel=\"noopener\">read the post<\/a>, and the <a href=\"https:\/\/assets.amazon.science\/77\/5e\/4a7c238f4ce890efdc325df83263\/using-lightweight-formal-methods-to-validate-a-key-value-storage-node-in-amazon-s3-2.pdf\" target=\"_blank\" rel=\"noopener\">SOSP paper<\/a>.<\/p>\n<p>This trend is mirrored across the industry. 
<a href=\"https:\/\/discord.com\/blog\/why-discord-is-switching-from-go-to-rust\" onclick=\"fathom.trackEvent(&quot;Distill CLI - Discord&quot;)\" target=\"_blank\" rel=\"noopener\">Discord<\/a> moved their Read States service from Go to Rust to address large latency spikes caused by garbage collection. It\u2019s 10x faster, with their worst tail latencies reduced almost <a href=\"https:\/\/aws.amazon.com\/blogs\/opensource\/sustainability-with-rust\/\" onclick=\"fathom.trackEvent(&quot;Distill CLI - Sustainable Rust Stats&quot;)\" target=\"_blank\" rel=\"noopener\">100x<\/a>. Similarly, <a href=\"https:\/\/www.figma.com\/blog\/rust-in-production-at-figma\/\" onclick=\"fathom.trackEvent(&quot;Distill CLI - Figma&quot;)\" target=\"_blank\" rel=\"noopener\">Figma<\/a> rewrote performance-sensitive parts of their multiplayer service in Rust, and they\u2019ve seen significant server-side performance improvements, such as reducing peak average CPU utilization per machine by 6x.<\/p>\n<p>The point is that if you are serious about cost and sustainability, there is no reason not to consider Rust.<\/p>\n<h2 id=\"rust-is-hard\"><span class=\"ez-toc-section\" id=\"Rust_is_toughmldr\"><\/span>Rust is hard&hellip; <a href=\"#rust-is-hard\"\/><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Rust has a reputation for being a difficult language to learn, and I won\u2019t dispute that there is a learning curve. It will take time to get familiar with the borrow checker, and you will fight with the compiler. It\u2019s a lot like writing a PRFAQ for a new idea at Amazon. There is a lot of friction up front, which is sometimes hard when all you really want to do is jump into the IDE and start building. But once you\u2019re on the other side, there is tremendous potential to pick up velocity. 
Remember, the cost to build a system, service, or application is nothing compared to the cost of operating it, so the way you build should be continually under scrutiny.<\/p>\n<p>But you don\u2019t have to take my word for it. Earlier this year, <a href=\"https:\/\/www.theregister.com\/2024\/03\/31\/rust_google_c\/\" onclick=\"fathom.trackEvent(&quot;Distill CLI - The Register&quot;)\" target=\"_blank\" rel=\"noopener\">The Register<\/a> published findings from Google that showed their Rust teams were twice as productive as teams using C++, and that the same size team using Rust instead of Go was as productive with more correctness in their code. There are no bonus points for growing headcount to tackle avoidable problems.<\/p>\n<h2 id=\"closing-thoughts\"><span class=\"ez-toc-section\" id=\"Closing_ideas\"><\/span>Closing thoughts <a href=\"#closing-thoughts\"\/><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>I want to be crystal clear: this is not a call to rewrite everything in Rust. Just as <a href=\"http:\/\/www.allthingsdistributed.com\/2023\/05\/monoliths-are-not-dinosaurs.html\" target=\"_blank\" rel=\"noopener\">monoliths are not dinosaurs<\/a>, there is no single programming language to rule them all, and not every application will have the same business or technical requirements. It\u2019s about using the right tool for the right job. This means questioning the status quo, and continuously looking for ways to incrementally optimize your systems \u2013 to tinker with things and measure what happens. 
Something as simple as switching the library you use to serialize and deserialize <code>json<\/code> from Python\u2019s standard library to <code>orjson<\/code> might be all you need to speed up your app, reduce your memory footprint, and lower costs in the process.<\/p>\n<p>If you take nothing else away from this post, I encourage you to actively look for efficiencies in all aspects of your work. Tinker. Measure. Because everything has a cost, and cost is a pretty good proxy for a sustainable system.<\/p>\n<p>Now, go build!<\/p>\n<p><em>A special thank you to AWS Rustaceans <a href=\"https:\/\/www.linkedin.com\/in\/nicholas-matsakis-615614\/\" target=\"_blank\" rel=\"noopener\">Niko Matsakis<\/a> and <a href=\"https:\/\/www.linkedin.com\/in\/grant-gurvis\/\" target=\"_blank\" rel=\"noopener\">Grant Gurvis<\/a> for their code reviews and feedback while developing the Distill CLI.<\/em><\/p>\n<h2 id=\"recommended-posts\"><span class=\"ez-toc-section\" id=\"Beneficial_posts\"><\/span>Recommended posts <a href=\"#recommended-posts\"\/><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<\/div>\n\n","protected":false},"excerpt":{"rendered":"<p>A few weeks ago, I wrote about a project our team has been working on called Distill. A simple application that summarizes and extracts important details from our daily meetings. At the end of that post, I promised you a CLI version written in Rust. 
After just a few [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3698,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":{"0":"post-3696","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-cloud-computing"},"_links":{"self":[{"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/posts\/3696","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3696"}],"version-history":[{"count":1,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/posts\/3696\/revisions"}],"predecessor-version":[{"id":3697,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/posts\/3696\/revisions\/3697"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/media\/3698"}],"wp:attachment":[{"href":"https:\/\/aireviewirush.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3696"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3696"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3696"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}