{"id":23819,"date":"2026-03-15T18:16:26","date_gmt":"2026-03-15T09:16:26","guid":{"rendered":"https:\/\/aireviewirush.com\/?p=23819"},"modified":"2026-03-15T18:16:27","modified_gmt":"2026-03-15T09:16:27","slug":"introducing-fireworks-ai-on-microsoft-foundry-bringing-excessive-efficiency-low-latency-open-mannequin-inference-to-azure","status":"publish","type":"post","link":"https:\/\/aireviewirush.com\/?p=23819","title":{"rendered":"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure"},"content":{"rendered":"<div id=\"post-49830\">\n<p>\n\t\tWe\u2019re announcing the public preview of Fireworks AI on Microsoft Foundry, bringing high\u2011performance open model inference into Azure. This integration reflects Microsoft Foundry\u2019s broader direction: providing a single place where developers can not only run open models efficiently but also customize and operationalize them as part of a complete enterprise\u2011ready AI lifecycle.\t<\/p>\n<p class=\"wp-block-paragraph\">Across industries, organizations are increasingly standardizing on open models to gain greater control over performance, cost, customization, and the security and compliance required for enterprise deployment. Open models give teams the flexibility to choose the right architecture for each workload and avoid lock\u2011in to a single model provider as their needs evolve.<\/p>\n<p class=\"wp-block-paragraph\">As adoption grows, however, performance alone is no longer enough. Teams need a consistent way to evaluate models quickly, operate them safely in production, and improve them over time without rebuilding infrastructure or fragmenting their tooling. 
Too often, organizations are forced to assemble bespoke serving stacks, slowing innovation and making it harder to scale and compound progress.<\/p>\n<p class=\"wp-block-paragraph\">Microsoft Foundry is designed to address this challenge. It serves as a unified system of record and enterprise control plane for AI, bringing together models, agents, evaluation, deployment, and governance into a single experience. With Microsoft Foundry, teams can move from experimentation to production with confidence, using the models and frameworks that best fit their requirements, while relying on a consistent operational foundation.<\/p>\n<p class=\"wp-block-paragraph\"><strong>Today, we\u2019re announcing the public preview of Fireworks AI on <\/strong><a href=\"https:\/\/ai.azure.com\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Microsoft Foundry<\/strong><\/a><strong>, bringing high\u2011performance open model inference into Azure.<\/strong> This integration reflects Microsoft Foundry\u2019s broader direction: providing a single place where developers can not only run open models efficiently but also customize and operationalize them as part of a complete enterprise\u2011ready AI lifecycle.<\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_53 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title \" >Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\" role=\"button\"><label for=\"item-6a2bc53747e25\" ><span class=\"\"><span style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" 
xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input aria-label=\"Toggle\" aria-label=\"item-6a2bc53747e25\"  type=\"checkbox\" id=\"item-6a2bc53747e25\"><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/aireviewirush.com\/?p=23819\/#Fireworks_AI_fashions_on_Microsoft_Foundry_A_single_place_for_open_fashions\" title=\"Fireworks AI models on Microsoft Foundry: A single place for open models\">Fireworks AI models on Microsoft Foundry: A single place for open models<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/aireviewirush.com\/?p=23819\/#The_way_forward_for_Fireworks_and_AI_use_instances\" title=\"The future of Fireworks and AI use cases\">The future of Fireworks and AI use cases<\/a><ul class='ez-toc-list-level-3'><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/aireviewirush.com\/?p=23819\/#To_get_began\" title=\"To get started:\">To get started:<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" 
href=\"https:\/\/aireviewirush.com\/?p=23819\/#Study_extra_about_Fireworks_on_Microsoft_Foundry\" title=\"Learn more about Fireworks on Microsoft Foundry\">Learn more about Fireworks on Microsoft Foundry<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\" id=\"fireworks-ai-models-on-microsoft-foundry-a-single-place-for-open-models\"><span class=\"ez-toc-section\" id=\"Fireworks_AI_fashions_on_Microsoft_Foundry_A_single_place_for_open_fashions\"><\/span>Fireworks AI models on Microsoft Foundry: A single place for open models<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p class=\"wp-block-paragraph\">Fireworks AI delivers industry-leading inference for open models, and Microsoft Foundry is what makes that performance usable at enterprise scale. Accessing Fireworks AI through Microsoft Foundry gives teams a single, trusted control plane to evaluate, deploy, customize, and operate open models alongside the rest of their AI stack.<\/p>\n<p class=\"wp-block-paragraph\">As open models mature, customization increasingly extends beyond training. Teams need consistent ways to configure, deploy, optimize, govern, and iterate on models in production without fragmenting tools or infrastructure. Microsoft Foundry provides the environment where these customization and operational workflows are standardized, while Fireworks AI supplies the performance and efficiency needed to run open models at scale. This means teams can move from experimentation to production using open models without stitching together separate tools, contracts, and deployment paths.<\/p>\n<p class=\"wp-block-paragraph\">Together, Fireworks AI and Microsoft Foundry enable a more complete and sustainable approach to working with open models, combining fast, efficient inference with a platform designed to support enterprise open model operations over time. 
<\/p>\n<p class=\"wp-block-paragraph\">With Fireworks AI on Foundry, developers can <strong>get access to best-in-class inferencing for open models<\/strong>, including optimized deployments for custom weight models. Fireworks AI is a market leader in high-performance inference for open models. Its engine already runs at internet scale, processing over 13T tokens daily, sustaining about 180 thousand requests per second, and producing over 1,000 tokens per second on large models, substantiated by leading benchmark performance on <a href=\"https:\/\/artificialanalysis.ai\/providers\/fireworks\" target=\"_blank\" rel=\"noopener\"><em>Artificial Analysis<\/em><\/a>. This performance is now available on Foundry.<\/p>\n<p class=\"wp-block-paragraph\">Developers can log into Foundry and access these open models with Fireworks AI today:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">DeepSeek V3.2<\/li>\n<li class=\"wp-block-list-item\">OpenAI gpt-oss-120b<\/li>\n<li class=\"wp-block-list-item\">Kimi K2.5<\/li>\n<li class=\"wp-block-list-item\">MiniMax M2.5 (<em>new<\/em>)<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">This brings a new open model (MiniMax M2.5) to Foundry with serverless support and offers optimized inference for already popular open models. 
<\/p>\n<figure data-wp-context=\"{&quot;imageId&quot;:&quot;69b678ea90a38&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img loading=\"lazy\" decoding=\"async\" width=\"960\" height=\"540\" data-wp-class--hide=\"state.isContentHidden\" data-wp-class--show=\"state.isContentVisible\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Fireworks-Catalog-4_models_Final.gif\" alt=\"A demo of model discovery.\" class=\"wp-image-49918\"\/><\/figure>\n<p class=\"wp-block-paragraph\">With Fireworks AI in Microsoft Foundry, developers can:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><strong>Evaluate models faster with day\u2011zero access and support:<\/strong> Start building immediately with access to state-of-the-art open models from Fireworks AI through a single Azure endpoint via Foundry.<\/li>\n<li class=\"wp-block-list-item\"><strong>Optimize inference: <\/strong>Requests to open models are served by Fireworks\u2019 high\u2011throughput inference 
stack for fast performance with Azure\u2011grade governance.<\/li>\n<li class=\"wp-block-list-item\"><strong>Run the models you already trust:<\/strong> With bring-your-own-weights (BYOW), you can upload and register quantized or fine\u2011tuned weights trained elsewhere without changing the serving stack.<\/li>\n<\/ul>\n<figure data-wp-context=\"{&quot;imageId&quot;:&quot;69b678ea918ca&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img loading=\"lazy\" decoding=\"async\" width=\"960\" height=\"540\" data-wp-class--hide=\"state.isContentHidden\" data-wp-class--show=\"state.isContentVisible\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Fireworks-Custom_Model-2_Final.gif\" alt=\"A demo of custom model creation.\" class=\"wp-image-49920\"\/><\/figure>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><strong>Choose the right pricing model for your workload<\/strong>: Use serverless, pay-per\u2011token inference to experiment securely and 
quickly with <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/foundry\/foundry-models\/concepts\/deployment-types#data-zone-standard\" target=\"_blank\" rel=\"noopener\">Data Zone Standard<\/a> or choose provisioned throughput units (PTUs) for predictable, steady-state performance with base or custom models. Whether you\u2019re optimizing for agility or efficiency, you get flexibility without managing infrastructure.<\/li>\n<li class=\"wp-block-list-item\"><strong>Operate with enterprise trust and scale<\/strong>: We&#8217;re committed to enabling customers to build production-ready AI applications quickly, while maintaining the highest levels of safety and security. Foundry provides an end-to-end workspace for agent development, evaluation, and deployment, along with unified governance, observability, and agent-ready tooling.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\" id=\"the-future-of-fireworks-and-ai-use-cases\"><span class=\"ez-toc-section\" id=\"The_way_forward_for_Fireworks_and_AI_use_instances\"><\/span>The future of Fireworks and AI use cases<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p class=\"wp-block-paragraph\">Microsoft Foundry is evolving to support the full lifecycle of open models, from early evaluation through production operation and ongoing optimization. As teams scale their use of open models, having a consistent, enterprise\u2011ready foundation becomes increasingly important.<\/p>\n<p class=\"wp-block-paragraph\">By integrating Fireworks AI into Microsoft Foundry, developers gain access to high\u2011performance inference today while building on a platform designed to support deeper customization and enterprise operations over time. This approach gives teams the confidence to adopt open models not just for what they can do now, but for how they can grow, adapt, and operate reliably as their AI ambitions expand. 
We\u2019re looking forward to seeing how developers and enterprises use Fireworks AI on Microsoft Foundry to power the next generation of intelligent applications.<\/p>\n<h3 class=\"wp-block-heading\" id=\"to-get-started\"><span class=\"ez-toc-section\" id=\"To_get_began\"><\/span>To get started:<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ol class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Go to <a href=\"https:\/\/ai.azure.com\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft Foundry<\/a> models and select Fireworks AI open models in the model catalog collection.<\/li>\n<li class=\"wp-block-list-item\">Select the open model hosted by Fireworks.<\/li>\n<li class=\"wp-block-list-item\">View the model card.<\/li>\n<li class=\"wp-block-list-item\">Select your deployment option (serverless or PTU) and deploy.<\/li>\n<\/ol>\n<h2 class=\"wp-block-heading\" id=\"learn-more-about-fireworks-on-microsoft-foundry\"><span class=\"ez-toc-section\" id=\"Study_extra_about_Fireworks_on_Microsoft_Foundry\"><\/span>Learn more about Fireworks on Microsoft Foundry<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>We\u2019re announcing the public preview of Fireworks AI on Microsoft Foundry, bringing high\u2011performance open model inference into Azure. This integration reflects Microsoft Foundry\u2019s broader direction: providing a single place where developers can not only run open models efficiently but also customize and operationalize them as part of a complete enterprise\u2011ready AI lifecycle. Across [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":23821,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":["post-23819","post","type-post","status-publish","format-standard","has-post-thumbnail","category-cloud-computing"],"_links":{"self":[{"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/posts\/23819","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=23819"}],"version-history":[{"count":1,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/posts\/23819\/revisions"}],"predecessor-version":[{"id":23820,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/posts\/23819\/revisions\/23820"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=\/wp\/v2\/media\/23821"}],"wp:attachment":[{"href":"https:\/\/aireviewirush.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=23819"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href"
:"https:\/\/aireviewirush.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=23819"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aireviewirush.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=23819"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}