<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>Node.js &#8211; Wade Tregaskis</title>
	<atom:link href="https://wadetregaskis.com/tags/node-js/feed/" rel="self" type="application/rss+xml" />
	<link>https://wadetregaskis.com</link>
	<description></description>
	<lastBuildDate>Sat, 25 May 2024 15:20:27 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://wadetregaskis.com/wp-content/uploads/2016/03/Stitch-512x512-1-256x256.png</url>
	<title>Node.js &#8211; Wade Tregaskis</title>
	<link>https://wadetregaskis.com</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">226351702</site>	<item>
		<title>Swift sucks at web serving… or does it?</title>
		<link>https://wadetregaskis.com/swift-sucks-at-web-serving-or-does-it/</link>
					<comments>https://wadetregaskis.com/swift-sucks-at-web-serving-or-does-it/#comments</comments>
		
		<dc:creator><![CDATA[]]></dc:creator>
		<pubDate>Thu, 16 May 2024 01:32:06 +0000</pubDate>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[BigInt]]></category>
		<category><![CDATA[FPM]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[kevents]]></category>
		<category><![CDATA[Kotlin]]></category>
		<category><![CDATA[kqueue]]></category>
		<category><![CDATA[macOS kernel]]></category>
		<category><![CDATA[Node.js]]></category>
		<category><![CDATA[Numberick]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Swift]]></category>
		<category><![CDATA[Swift Forums]]></category>
		<category><![CDATA[SwiftNIO]]></category>
		<category><![CDATA[Vapor]]></category>
		<category><![CDATA[web server]]></category>
		<category><![CDATA[wrk]]></category>
		<guid isPermaLink="false">https://wadetregaskis.com/?p=8061</guid>

					<description><![CDATA[A few weeks ago, Axel Roest published a simple web server comparison, that turned out to not be doing what it was thought to be doing. Figuring that out was a very interesting discussion that warrants a retrospective, to look at which parts were particularly helpful and which not so much. Tangentially, I want to&#8230; <a class="read-more-link" href="https://wadetregaskis.com/swift-sucks-at-web-serving-or-does-it/" data-wpel-link="internal">Read more</a>]]></description>
										<content:encoded><![CDATA[
<p>A few weeks ago, Axel Roest published <a href="https://tech.phlux.us/Juice-Sucking-Servers/" data-wpel-link="external" target="_blank" rel="external noopener">a simple web server comparison</a> that turned out not to be doing what it was thought to be doing. Figuring that out made for a very interesting discussion, one that warrants a retrospective to look at which parts were particularly helpful and which not so much.</p>



<p>Tangentially, I want to highlight that Axel&#8217;s comparison is notable because he is interested in <em>efficiency</em>, not mere brute performance. The two are usually correlated but not always the same. He correctly noted that electricity is a major <em>and increasingly large</em> part of server costs (see <a href="https://wadetregaskis.com/the-cost-of-electrical-power-in-servers/" data-wpel-link="internal">my prior post</a> for why it&#8217;s even worse than you likely realise). That said, while he did take RAM and power measurements, his benchmark and analysis didn&#8217;t go into detail about energy efficiency.</p>



<h1 class="wp-block-heading">Benchmark method &amp; apparatus</h1>



<p>Axel wanted to see how a very simple web server performed in:</p>



<ul class="wp-block-list">
<li><a href="https://www.php.net/manual/en/install.fpm.php" data-wpel-link="external" target="_blank" rel="external noopener">FPM</a> w/ <a href="https://www.nginx.com" data-wpel-link="external" target="_blank" rel="external noopener">NGINX</a> (PHP).</li>



<li><a href="https://helidon.io" data-wpel-link="external" target="_blank" rel="external noopener">Helidon</a> (Kotlin / Java<sup data-fn="e7728698-06db-400a-a6b9-01a2ce4f3e5b" class="fn"><a href="#e7728698-06db-400a-a6b9-01a2ce4f3e5b" id="e7728698-06db-400a-a6b9-01a2ce4f3e5b-link">1</a></sup>).</li>



<li><a href="https://nodejs.org/en" data-wpel-link="external" target="_blank" rel="external noopener">Node.js</a> (JavaScript).</li>



<li><a href="https://vapor.codes" data-wpel-link="external" target="_blank" rel="external noopener">Vapor</a> (Swift).</li>
</ul>



<p>He was particularly interested in throughput &amp; latency vs RAM &amp; power usage. All are important metrics in their own right, but are most useful in light of each other.</p>



<p>He chose to use <a href="https://en.wikipedia.org/wiki/Fibonacci_sequence" data-wpel-link="external" target="_blank" rel="external noopener">Fibonacci sequence</a> calculation as the load. Choosing a load for any web server benchmark is always highly contentious, and not the focus of this post. Whether you think Fibonacci&#8217;s a good choice or not, read on to see why really it didn&#8217;t matter.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>☝️ People get hung up on how well benchmarks represent the so-called real world, but I think that&#8217;s often fruitless to argue about and also beside the point. What matters is whether the benchmark is <em>useful</em>. e.g. does it <em>inform</em> and <em>elucidate</em>?</p>
</div></div>



<p>He did use <em>very</em> old hardware, though &#8211; an Intel Core i3-550 from over a decade ago. Fortunately it didn&#8217;t turn out to materially impact the relative results nor behaviours of the benchmark, but it&#8217;s usually unwise to add unnecessary [potential] variables to your setup, like unusual hardware.</p>



<p>In my own debugging and profiling, I used my also very old 10-core iMac Pro. It&#8217;s at least a Xeon? 😅</p>



<h1 class="wp-block-heading">Benchmark results</h1>



<div class="wp-block-group is-content-justification-center is-layout-flex wp-container-core-group-is-layout-64b26803 wp-block-group-is-layout-flex">
<figure class="wp-block-image size-full is-resized"><img fetchpriority="high" decoding="async" width="800" height="586" src="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput.webp" alt="" class="wp-image-8063" style="object-fit:cover;width:400px;height:293px" srcset="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput.webp 800w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput-256x188.webp 256w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput-768x563.webp 768w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput-256x188@2x.webp 512w" sizes="(max-width: 800px) 100vw, 800px" /><figcaption class="wp-element-caption">Requests per second (Y) over concurrent requests (X).<br>From <a href="https://tech.phlux.us/Juice-Sucking-Servers/" data-wpel-link="external" target="_blank" rel="external noopener">Axel&#8217;s first post</a>.</figcaption></figure>



<figure class="wp-block-image size-full is-resized"><img decoding="async" width="800" height="617" src="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-failure-rate.webp" alt="" class="wp-image-8064" style="object-fit:cover;width:400px;height:293px" srcset="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-failure-rate.webp 800w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-failure-rate-256x197.webp 256w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-failure-rate-768x592.webp 768w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-failure-rate-256x197@2x.webp 512w" sizes="(max-width: 800px) 100vw, 800px" /><figcaption class="wp-element-caption">Success rate (Y) over concurrent requests (X).<br>From <a href="https://tech.phlux.us/Juice-Sucking-Servers/" data-wpel-link="external" target="_blank" rel="external noopener">Axel&#8217;s first post</a>.</figcaption></figure>



<figure class="wp-block-image size-large is-resized"><img decoding="async" width="800" height="577" src="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-memory-usage.webp" alt="" class="wp-image-8076" style="object-fit:cover;width:400px;height:300px" srcset="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-memory-usage.webp 800w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-memory-usage-256x185.webp 256w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-memory-usage-768x554.webp 768w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-memory-usage-256x185@2x.webp 512w" sizes="(max-width: 800px) 100vw, 800px" /><figcaption class="wp-element-caption">RAM usage (Y) over concurrent requests (X).<br>From <a href="https://tech.phlux.us/Juice-Sucking-Servers/" data-wpel-link="external" target="_blank" rel="external noopener">Axel&#8217;s first post</a>.</figcaption></figure>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" width="800" height="575" src="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-power-usage.webp" alt="" class="wp-image-8077" style="object-fit:cover;width:400px;height:300px" srcset="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-power-usage.webp 800w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-power-usage-256x184.webp 256w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-power-usage-768x552.webp 768w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-power-usage-256x184@2x.webp 512w" sizes="auto, (max-width: 800px) 100vw, 800px" /><figcaption class="wp-element-caption">Power usage (Y) over concurrent requests (X).<br>From <a href="https://tech.phlux.us/Juice-Sucking-Servers/" data-wpel-link="external" target="_blank" rel="external noopener">Axel&#8217;s first post</a>.</figcaption></figure>
</div>



<p>In words:</p>



<ul class="wp-block-list">
<li>Helidon (Kotlin / Java) had the highest throughput and lowest latency at low (and arguably more reasonable) loads, but used by far the most RAM, and the most power. Consequently it handled the most load before requests started failing (timing out).</li>



<li>Node.js (JavaScript) was qualitatively very similar to Helidon (Kotlin / Java) but less in all metrics &#8211; less throughput, less peak load capacity, but also less RAM and very slightly less power used.</li>



<li>FPM + NGINX (PHP) followed the pattern.</li>



<li>Vapor (Swift) did not &#8211; it had higher throughput than PHP yet requests started failing much sooner as load increased. It used the least RAM and least power, though, and kept on trucking irrespective of the load.</li>
</ul>



<p>Many people would have left it at that &#8211; obviously the results make sense for the first three (&#8220;everyone knows&#8221; that Kotlin / Java&#8217;s faster than JavaScript that&#8217;s faster than PHP) and Vapor / Swift apparently just isn&#8217;t fast and has weird reliability behaviours. QED, right?</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>⚠️ Going in with a specific hypothesis can be helpful, but hypotheses can also end up being just biases. Be careful not to blindly accept apparent confirmation of the hypothesis. Similarly, beware subconscious hypotheses like &#8220;Kotlin / Java is faster than JavaScript&#8221;.</p>
</div></div>



<p>To his credit, Axel wasn&#8217;t so sure &#8211; he felt that the results he was seeing were suspicious, and <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583" data-wpel-link="external" target="_blank" rel="external noopener">he sought help from the Swift Forums</a> in explaining or correcting them.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>✅ Question your results. <em>Understand</em> them. It improves the quality, correctness, and usefulness of your work. <em>Why</em> something behaves the way it does is often more interesting and important than merely how it behaves.</p>



<p>On most platforms it&#8217;s pretty easy to at least do a time profile, and most often that&#8217;s all you need to understand what&#8217;s going on. On Apple platforms you can use <a href="https://www.avanderlee.com/debugging/xcode-instruments-time-profiler/" data-wpel-link="external" target="_blank" rel="external noopener">Instruments</a>, on Windows &amp; Linux tools like <a href="https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2024-1/basic-hotspots-analysis.html" data-wpel-link="external" target="_blank" rel="external noopener">VTune</a>, among <a href="https://en.wikipedia.org/wiki/List_of_performance_analysis_tools" data-wpel-link="external" target="_blank" rel="external noopener">many other options</a>.</p>



<p>If need be, ask others for help, like Axel did.</p>
</div></div>



<p>While Axel did <em>suspect</em> something was wrong &#8211; noting the oddly small but persistent failure rate &#8211; he missed the most obvious <em>proof</em> of wrongness &#8211; logically impossible results. Doing 80,000 continuous concurrent streams of requests with ~98% of those requests completing within the two second time limit means the server must have a throughput of at least 39,000 requests per second. Yet the benchmark tool reported a mere ~8,000 requests per second.</p>
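


<p>To make that arithmetic concrete, here&#8217;s a minimal sketch in Swift. It assumes the benchmark tool behaves as a closed-loop load generator (each stream issues its next request as soon as the previous one completes or times out); the function and parameter names are purely illustrative and not part of any tool or of Axel&#8217;s benchmark:</p>



<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code><span class="line"><span style="color: #008000">// Lower bound on successful requests per second implied by the reported numbers.</span></span>
<span class="line"><span style="color: #008000">// Assumption: every stream always has exactly one request in flight, and each</span></span>
<span class="line"><span style="color: #008000">// request either succeeds within the time limit or is abandoned when it expires.</span></span>
<span class="line"><span style="color: #000000">func minimumThroughput(concurrentStreams: Double,</span></span>
<span class="line"><span style="color: #000000">                       successRate: Double,</span></span>
<span class="line"><span style="color: #000000">                       timeLimit: Double) -&gt; Double {</span></span>
<span class="line"><span style="color: #008000">  // Each stream finishes at least one request per time limit, and</span></span>
<span class="line"><span style="color: #008000">  // successRate of those finished requests are successes.</span></span>
<span class="line"><span style="color: #000000">  return concurrentStreams * successRate / timeLimit</span></span>
<span class="line"><span style="color: #000000">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #000000">let bound = minimumThroughput(concurrentStreams: 80_000, successRate: 0.98, timeLimit: 2)</span></span>
<span class="line"><span style="color: #000000">print(bound)</span></span></code></pre></div>



<p>With Axel&#8217;s figures that works out to roughly 39,200 requests per second, nearly five times the ~8,000 that the tool reported.</p>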



<p>Sadly, though <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/2" data-wpel-link="external" target="_blank" rel="external noopener">I pointed this out as the very first response to the thread</a>, it seemed to be overlooked by everyone (even myself!), even though it clearly fingered the benchmark tool itself as the problem (which is only partially correct, as we&#8217;ll see later, but in any case was the exact right place to start looking).</p>



<h1 class="wp-block-heading">Debugging the benchmark</h1>



<h2 class="wp-block-heading">Domain experts weigh in</h2>



<p>The Swift Forum post immediately attracted relevant people: folks that work on Vapor and NIO, and folks that have experience using them. However, ironically this didn&#8217;t initially help &#8211; they tended to assume the problem was in Vapor (or its networking library, <a href="https://github.com/apple/swift-nio" data-wpel-link="external" target="_blank" rel="external noopener">SwiftNIO</a>) or how Vapor was being configured. It turned out none of this was really true &#8211; there <em>was</em> <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/49" data-wpel-link="external" target="_blank" rel="external noopener">a small optimisation made to Vapor</a> as a result of all this, which did marginally improve performance (in specific circumstances), but ultimately Vapor &amp; NIO were not the problem, nor was the benchmark&#8217;s configuration and use of them.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>⚠️ It can be all too easy to assume elaborate reasons when you know a lot about something. Don&#8217;t jump to conclusions. Check the most basic and foundational things <em>first</em>.</p>



<p>I say this with humility and I guess technically hypocrisy, because even as a professional performance engineer (in the past) I&#8217;ve repeatedly made this mistake myself. We&#8217;re all particularly susceptible to this mistake.</p>
</div></div>



<p>There were some assertions that the results <em>were</em> plausible and just how Vapor performs, and that the &#8220;problem&#8221; was the choice of Vapor rather than some other web server framework (e.g. <a href="https://github.com/hummingbird-project/hummingbird" data-wpel-link="external" target="_blank" rel="external noopener">Hummingbird</a>).</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>⚠️ It&#8217;s not <em>wrong</em> to be interested in additional data, but be careful not to get distracted. Using Vapor was not in any way wrong or unhelpful &#8211; it is the most well-known and probably well-used web server framework in Swift. It might well be that other frameworks are better in some respects, but that&#8217;s a <em>different</em> comparison than what Axel performed.</p>
</div></div>



<p>Others similarly asserted that the results were plausible because Swift uses <a href="https://docs.swift.org/swift-book/documentation/the-swift-programming-language/automaticreferencecounting/" data-wpel-link="external" target="_blank" rel="external noopener">reference-counting</a> for memory management whereas PHP, JavaScript, and Kotlin / Java use garbage collection. It was presented as &#8220;common knowledge&#8221; that garbage collection has inherent benefits for some programs, like web servers, because it makes memory allocation super cheap.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>⚠️ While it can be useful to speculate a little, in a brainstorming sense, don&#8217;t presume. A <em>lot</em> of mistakes have been made over the years because of presumptions like &#8220;linked lists are faster than arrays&#8221; or &#8220;binary search is faster than linear search&#8221;, etc.</p>



<p>Remember that intuition is in large part presumptions and generalisations. That doesn&#8217;t make intuition useless, but always remember that it&#8217;s far from foolproof. Use it to generate hypotheses, not conclusions.</p>
</div></div>



<h2 class="wp-block-heading">Examining the load</h2>



<p>Even though it was clear that something was wrong with the actual measurements, a lot of the early discussion revolved around the load used (Fibonacci sequence calculation), particularly regarding whether it was:</p>



<h3 class="wp-block-heading">The &#8220;right&#8221; load</h3>



<p>A few folks asserted that the CPU-heavy nature of calculating Fibonacci numbers isn&#8217;t representative of web servers generally. Multiple people noted that &#8211; in the Swift implementation, at least &#8211; the majority of the CPU time was spent doing the Fibonacci calculation. Some felt this was therefore not a useful benchmark of Vapor itself.</p>



<p>A lot of this boiled down to <a href="https://en.wikipedia.org/wiki/No_true_Scotsman" data-wpel-link="external" target="_blank" rel="external noopener">the &#8220;no true Scotsman&#8221; problem</a>, which is very common in benchmarking, with a bit of <a href="https://en.wikipedia.org/wiki/Nirvana_fallacy#Perfect_solution_fallacy" data-wpel-link="external" target="_blank" rel="external noopener">perfect world logical fallacy</a> peppered in, trying to identify the One True Representative Benchmark. See the earlier point about fixating on such matters rather than whether the benchmark is <em>useful</em>.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>⚠️ While it&#8217;s not necessarily wrong or unwise to evaluate how well a benchmark represents real world usage (whether generally or against specific cases), it&#8217;s an exercise that suffers from diminishing returns pretty quickly. It&#8217;s usually best to not quibble too much or too long, as long as the benchmark is in the ballpark.</p>



<p>You can always develop &amp; present your own benchmark(s), if you feel there are better or additional ways to go about it. Best of all, the existence of <em>both</em> the original benchmark and your benchmark(s) will be more useful than either alone, since you can compare and contrast them.</p>
</div></div>



<h3 class="wp-block-heading">A &#8220;fair&#8221; load</h3>



<p>Accusations were made pretty quickly that the benchmark is &#8220;unfair&#8221; to Swift because Swift doesn&#8217;t &#8211; it was asserted &#8211; have a properly-optimised &#8220;<a href="https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic" data-wpel-link="external" target="_blank" rel="external noopener">BigInt</a>&#8221; implementation, unlike all the other languages tested.</p>



<p>No real evidence was given for this. Even if it were true, it doesn&#8217;t invalidate the benchmark &#8211; in fact, it just makes the benchmark <em>more</em> successful because it&#8217;s then highlighted an area where Swift is lacking.</p>



<p>The BigInt library that Axel used, <a href="https://github.com/attaswift/BigInt" data-wpel-link="external" target="_blank" rel="external noopener">attaswift/BigInt</a>, is by far the most popular available for Swift, as judged by things like GitHub stars, forks, &amp; contributor counts, ranking in web &amp; GitHub searches, etc. There are <a href="https://swiftpackageindex.com/search?query=bigint" data-wpel-link="external" target="_blank" rel="external noopener">quite a few others</a>, though.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>☝️ There are multiple ways to approach a benchmark, all equally valid because they&#8217;re all useful. Axel chose to use popular packages, in <em>all</em> the languages he tested. That&#8217;s definitely fair. It&#8217;s also useful because it represents what the typical developer will do when building real web servers.</p>



<p>It&#8217;s often also interesting and useful to search out the <em>best</em> packages (whatever that may mean in context, such as fastest). That could represent what a more heavily optimised implementation might do. It <em>might</em> also better represent what is theoretically possible (<em>if</em> optimal packages exist already). Those are interesting things to explore too, just not what Axel happened to be doing.</p>



<p>You can also see more of my thoughts on Axel&#8217;s choice <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/39" data-wpel-link="external" target="_blank" rel="external noopener">in the Swift Forums thread</a>.</p>
</div></div>



<p>It wasn&#8217;t until <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/46" data-wpel-link="external" target="_blank" rel="external noopener">actual evidence was presented</a> that the discussion made progress.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>⚠️ While it&#8217;s true that without the initial blind assertions actual data might never have been gathered, it would have been more effective and efficient to have just gathered the data at the start.</p>



<p>Data is better than supposition.</p>
</div></div>



<p>It was shown that in fact the BigInt implementation in question <em>was</em> significantly slower than it could be: JavaScript&#8217;s implementation of addition was much faster. <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/62" data-wpel-link="external" target="_blank" rel="external noopener">Some additional simple tests</a> showed even wider performance gaps regarding the other key operation: rendering to strings. It was <em>that</em> data that turned out to be critical &#8211; <a href="https://github.com/apple/swift-foundation/pull/262" data-wpel-link="external" target="_blank" rel="external noopener">I myself happened to have implemented BigInt string rendering for Apple&#8217;s new Foundation</a>, <em>and</em> then <a href="https://github.com/apple/swift-foundation/pull/306" data-wpel-link="external" target="_blank" rel="external noopener">saw it dramatically optimised by</a> <a href="https://github.com/oscbyspro" data-wpel-link="external" target="_blank" rel="external noopener">Oscar Byström Ericsson</a>, who has his own BigInt package for Swift, <a href="https://github.com/oscbyspro/Numberick" data-wpel-link="external" target="_blank" rel="external noopener">Numberick</a>. So I had a pretty darn good idea of where I might find a faster package… 😆</p>



<p>You can read more about that specific bit of serendipity <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/64" data-wpel-link="external" target="_blank" rel="external noopener">in the Swift Forums thread</a>.</p>



<p>It was trivial to do the package switch, and it quickly improved Vapor/Swift&#8217;s showing in the benchmark manyfold &#8211; in combination with some other simple and reasonable tweaks, it was <em>five times faster</em>!</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>✅ Axel&#8217;s benchmark taught a lot of people that <a href="https://github.com/oscbyspro/Numberick" data-wpel-link="external" target="_blank" rel="external noopener">Numberick</a> is much more performant than <a href="https://github.com/attaswift/BigInt" data-wpel-link="external" target="_blank" rel="external noopener">BigInt</a>, at least in some important operations (addition and string rendering). Granted, that knowledge is a little bit niche in its utility, but it&#8217;s still a good outcome.</p>



<p>It also demonstrated that modifying in place can be faster than creating a copy, <em>even if</em> it means having to do a swap. i.e.:</p>



<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code><span class="line"><span style="color: #000000">a += b</span></span>
<span class="line"><span style="color: #795E26">swap</span><span style="color: #000000">(&amp;a, &amp;b)</span></span></code></pre></div>



<p>…instead of:</p>



<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code><span class="line"><span style="color: #0000FF">let</span><span style="color: #000000"> c = a + b</span></span>
<span class="line"><span style="color: #000000">a = b</span></span>
<span class="line"><span style="color: #000000">b = c</span></span></code></pre></div>



<p>That&#8217;s a tidbit I had picked up through varied experiences, and <a href="https://wadetregaskis.com/swift-tip-the-swap-function/" data-wpel-link="internal">wrote about previously</a>. The <code><a href="https://developer.apple.com/documentation/swift/swap(_:_:)" data-wpel-link="external" target="_blank" rel="external noopener">swap</a></code> function in Swift is under-appreciated and under-utilised. This knowledge may seem esoteric but you&#8217;d be amazed how often it applies (a <em>lot</em> of programming is about combining data, after all).</p>
</div></div>
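


<p>For a fuller picture of the in-place pattern, here&#8217;s a self-contained sketch of a Fibonacci loop built on it. <code>ToyBigUInt</code> is a hypothetical, deliberately naive type invented for this illustration; it is not how attaswift/BigInt or Numberick work, and it isn&#8217;t the code from Axel&#8217;s benchmark:</p>



<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code><span class="line"><span style="color: #008000">// Toy unsigned big integer: little-endian base-2^32 limbs. Illustrative only.</span></span>
<span class="line"><span style="color: #000000">struct ToyBigUInt {</span></span>
<span class="line"><span style="color: #000000">  var limbs: [UInt32]</span></span>
<span class="line"></span>
<span class="line"><span style="color: #000000">  init(_ value: UInt32) { limbs = [value] }</span></span>
<span class="line"></span>
<span class="line"><span style="color: #008000">  // Adds rhs into lhs, mutating lhs&#39;s existing limb array rather than building a new one.</span></span>
<span class="line"><span style="color: #000000">  static func += (lhs: inout ToyBigUInt, rhs: ToyBigUInt) {</span></span>
<span class="line"><span style="color: #000000">    var carry: UInt64 = 0</span></span>
<span class="line"><span style="color: #000000">    var i = 0</span></span>
<span class="line"><span style="color: #000000">    while i &lt; max(lhs.limbs.count, rhs.limbs.count) || carry != 0 {</span></span>
<span class="line"><span style="color: #008000">      // Grow lhs by one limb whenever the sum needs more room.</span></span>
<span class="line"><span style="color: #000000">      if i == lhs.limbs.count { lhs.limbs.append(0) }</span></span>
<span class="line"><span style="color: #000000">      let r: UInt64 = i &lt; rhs.limbs.count ? UInt64(rhs.limbs[i]) : 0</span></span>
<span class="line"><span style="color: #000000">      let sum = UInt64(lhs.limbs[i]) + r + carry</span></span>
<span class="line"><span style="color: #000000">      lhs.limbs[i] = UInt32(truncatingIfNeeded: sum)</span></span>
<span class="line"><span style="color: #000000">      carry = sum &gt;&gt; 32</span></span>
<span class="line"><span style="color: #000000">      i += 1</span></span>
<span class="line"><span style="color: #000000">    }</span></span>
<span class="line"><span style="color: #000000">  }</span></span>
<span class="line"><span style="color: #000000">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #008000">// Returns F(n), where F(0) = 0 and F(1) = 1.</span></span>
<span class="line"><span style="color: #000000">func fibonacci(_ n: Int) -&gt; ToyBigUInt {</span></span>
<span class="line"><span style="color: #000000">  var a = ToyBigUInt(0)</span></span>
<span class="line"><span style="color: #000000">  var b = ToyBigUInt(1)</span></span>
<span class="line"><span style="color: #000000">  for _ in 0..&lt;n {</span></span>
<span class="line"><span style="color: #008000">    // a becomes the next number in the sequence, then the swap restores the (older, newer) order.</span></span>
<span class="line"><span style="color: #000000">    a += b</span></span>
<span class="line"><span style="color: #000000">    swap(&amp;a, &amp;b)</span></span>
<span class="line"><span style="color: #000000">  }</span></span>
<span class="line"><span style="color: #000000">  return a</span></span>
<span class="line"><span style="color: #000000">}</span></span></code></pre></div>



<p>Because <code>+=</code> mutates <code>a</code>&#8217;s existing limb array and <code>swap</code> merely exchanges the two values, the loop only needs to grow storage occasionally rather than allocate a fresh array for every sum, unlike the <code>let c = a + b</code> form. How much that matters in practice depends on the BigInt implementation, so measure rather than presume.</p>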



<p>Axel posted <a href="https://tech.phlux.us/Juice-Sucking-Servers-Part-Deux/" data-wpel-link="external" target="_blank" rel="external noopener">a follow-up with additional data</a> (with the aforementioned changes and optimisations). That showed Swift now beating out the other three frameworks / languages, with the highest throughput and lowest latency (and still the lowest RAM and power usage).</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="800" height="513" src="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput-after-fixes.webp" alt="" class="wp-image-8071" srcset="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput-after-fixes.webp 800w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput-after-fixes-256x164.webp 256w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput-after-fixes-768x492.webp 768w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput-after-fixes-256x164@2x.webp 512w" sizes="auto, (max-width: 800px) 100vw, 800px" /><figcaption class="wp-element-caption">From Axel&#8217;s follow-up post. X axis is the number of concurrent requests.</figcaption></figure>
</div>


<p>So, all done, right? Turns out, Vapor/Swift wins, yeah?</p>



<p>Well, maybe.</p>



<h3 class="wp-block-heading" id="do-these-improvements-apply-to-the-other-cases-too">Do these improvements apply to the other cases too?</h3>



<p>That is yet to be examined. Because only Swift seemed to be producing odd results, Axel only put the benchmark to the Swift community for deeper analysis. It&#8217;s quite possible that doing the same with the other web frameworks &amp; languages would similarly reveal potential improvements.</p>



<p>Still, the results are useful as they stand. Some simple and very plausible &#8211; even for a Swift beginner &#8211; optimisations made a big difference, though of course the biggest difference was simply using a different 3rd party package. There are a lot of useful lessons in that, both in the specifics as already covered and as general best practices.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>☝️ Benchmarks are rarely &#8220;done&#8221;, their results rarely &#8220;final&#8221;. At least if you permit optimisations or other changes. How do you <em>know</em> there&#8217;s not something still &#8220;unfair&#8221; about one of the cases?</p>



<p>Again, this speaks to the potential futility of trying to make &#8220;fair&#8221; benchmarks, and reiterates the practical benefit of simply trying to learn instead.</p>
</div></div>



<h2 class="wp-block-heading">…but… why is the success rate still weird?</h2>



<p>Despite the improved performance, a fundamental problem remained: <em>the numbers still didn&#8217;t make sense</em>.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="800" height="512" src="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-success-rate-after-fixes.png" alt="" class="wp-image-8073" srcset="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-success-rate-after-fixes.png 800w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-success-rate-after-fixes-256x164.png 256w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-success-rate-after-fixes-768x492.png 768w, https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-success-rate-after-fixes-256x164@2x.png 512w" sizes="auto, (max-width: 800px) 100vw, 800px" /><figcaption class="wp-element-caption">From Axel&#8217;s follow-up post. X axis is the number of concurrent requests.</figcaption></figure>
</div>


<p>The success rates are <em>slightly</em> different but not materially &#8211; as concurrent requests go up, the throughput plateaus very quickly, yet success rate remains about the same. It&#8217;s exactly the same problem as at the outset &#8211; these results cannot possibly be correct.</p>



<p>Despite all the community&#8217;s efforts, we hadn&#8217;t actually figured out the real problem. We&#8217;d merely made Swift <em>look</em> better, without actually providing confidence in the accuracy of the results.</p>



<p>In fairness to myself, I was well aware that we weren&#8217;t done; I was just struggling to understand what was really going on, <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/74" data-wpel-link="external" target="_blank" rel="external noopener">as I noted here</a>.</p>



<h2 class="wp-block-heading">Examining the benchmark tool</h2>



<p>While there&#8217;d been some tangential questions about <code><a href="https://github.com/wg/wrk" data-wpel-link="external" target="_blank" rel="external noopener">wrk</a></code>, the benchmarking tool Axel used, it had largely been ignored thus far.</p>



<p>Ironically (as you&#8217;ll soon see) Axel chose <code>wrk</code> specifically because <a href="https://tech.phlux.us/Juice-Sucking-Servers/#benchmarking-software" data-wpel-link="external" target="_blank" rel="external noopener">he didn&#8217;t like the behaviour he saw</a> with <a href="https://httpd.apache.org/docs/2.4/programs/ab.html" data-wpel-link="external" target="_blank" rel="external noopener">ApacheBench</a>. Mostly its lack of HTTP/1.1 connection reuse (a subjective but valid methodology choice on Axel&#8217;s part) but also because it sounds like he saw some inexplicable results from it too. In hindsight, that might have been a clue that something more pervasive was wrong.</p>



<p>In retrospect there were a few tangential comments in the Swift Forums thread that were on the right track, e.g.:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>…when a new connection comes in, the server needs to make a decision: It can</p>



<ul class="wp-block-list">
<li>Either accept the new connection immediately, slowing the existing connections down a little (because now there are more connections to service with the same resources as before)</li>



<li>Or it can prioritise the existing connections and slow the connection acceptance (increasing the latency of the first request in the new connection which now has to wait).</li>
</ul>
<cite><a href="https://forums.swift.org/u/johannesweiss/summary" data-wpel-link="external" target="_blank" rel="external noopener">Johannes Weiss</a>, <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/77" data-wpel-link="external" target="_blank" rel="external noopener">Swift Forums post</a></cite></blockquote>



<p>As a little spoiler, it seems apparent that the other three web frameworks all accept incoming connections virtually immediately with priority over any existing connections &amp; request handling (even though they don&#8217;t necessarily attempt to <em>serve</em> all those connections&#8217; requests simultaneously). Vapor does not.</p>



<p>Suspicions did [correctly] develop around the opening of the connections themselves, which triggered testing with longer timeouts in a somewhat blind attempt to cover up the &#8220;spurious&#8221; first moments of the test.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>❌ Trying to essentially just hide inconvenient results is unlikely to help. It may even be successful, which is the worst possible outcome because it&#8217;s basically just burying a time-bomb in the benchmark, <em>and</em> forgoing any real understanding &amp; potential knowledge to be gained from properly investigating the problem.</p>
</div></div>



<h3 class="wp-block-heading">Characterising the failure mode(s)</h3>



<p>Though admittedly I wasn&#8217;t <em>fully</em> conscious of what I was doing at the time, the next breakthrough came from simply <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/85" data-wpel-link="external" target="_blank" rel="external noopener">gathering more data and analysing it <em>qualitatively</em></a>. This helped in two key ways:</p>



<ul class="wp-block-list">
<li>It better defined and pinned down the circumstances in which things appear to go wrong with the benchmark itself.<br><br>It separated out a whole bunch of test configurations that seemingly weren&#8217;t interesting (as they behaved in line with intuition / expectations, and similarly across all four web servers).</li>
</ul>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>✅ When in doubt, try to better define the problem. Eliminate variables. Refine quantitative estimates. Make your life easier by eliminating things that don&#8217;t matter.</p>
</div></div>



<ul class="wp-block-list">
<li>It provided hints and potential insight into the nature of the problem.<br><br>It showed that there was some kind of variability (in time) in the benchmark&#8217;s behaviour, with three very distinct modes (including one which was basically the benchmark actually working as expected, the existence of which had been unknown until that point!).</li>
</ul>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>✅ There are <em>many</em> ways to approach a data set, in terms of analysis methods. It&#8217;s a good idea to always keep that in mind, and to try different analysis mindsets whenever you seem stuck (and also to further validate conclusions).</p>
</div></div>



<p>Interestingly, although ultimately only tangentially, this modality finding prompted quite a few &#8220;me too!&#8221; responses from other folks, about a variety of use-cases involving Vapor <em>or</em> NIO. I took that as affirmation that I was onto something real, but in retrospect that should have been an even better clue: the fact that some people had seen this issue <em>without</em> Vapor involved &#8211; the only common denominator was NIO. Even though it turns out NIO itself wasn&#8217;t doing anything wrong, it was on the right path to answers. <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/87" data-wpel-link="external" target="_blank" rel="external noopener">This was <em>specifically</em> pointed out to everyone</a>, even.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>☝️ Sometimes, it just comes down to needing to listen better.</p>
</div></div>



<h3 class="wp-block-heading">Overlooked clues</h3>



<p>At this point there were a bunch of discussions about <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/89" data-wpel-link="external" target="_blank" rel="external noopener">benchmark tool configuration</a>, <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/91" data-wpel-link="external" target="_blank" rel="external noopener">hardware arrangement</a>, <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/102" data-wpel-link="external" target="_blank" rel="external noopener">whether TLS should be used</a>, etc. I&#8217;m going to skim over it, because there&#8217;s not much to ultimately say about it &#8211; it turned out to not be on the right track in this case, or purely tangential, but it was entirely reasonable to investigate &amp; discuss those aspects. Such is debug life.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="600" height="600" src="https://wadetregaskis.com/wp-content/uploads/2024/05/Debug-Life-t-shirt.avif" alt="" class="wp-image-8109" style="object-fit:cover" srcset="https://wadetregaskis.com/wp-content/uploads/2024/05/Debug-Life-t-shirt.avif 600w, https://wadetregaskis.com/wp-content/uploads/2024/05/Debug-Life-t-shirt-256x256.avif 256w, https://wadetregaskis.com/wp-content/uploads/2024/05/Debug-Life-t-shirt-256x256@2x.avif 512w" sizes="auto, (max-width: 600px) 100vw, 600px" /><figcaption class="wp-element-caption">I couldn&#8217;t find evidence that anyone&#8217;s gotten the tattoo yet, but you can at least <a href="https://www.redbubble.com/i/t-shirt/Debug-Life-White-Typographic-Design-for-Thug-Programmers-by-ramiro/17317023.FB110" data-wpel-link="external" target="_blank" rel="external noopener">get the wardrobe</a>.</figcaption></figure>
</div>


<p>What&#8217;s interesting is that yet another key clue was mentioned in the Swift Forums thread, yet was overlooked because it was attributed incorrectly and the mechanics miscategorised:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Don&#8217;t test with more than 128 connections. You will get read errors. This is due to the file descriptor limit applied to each process on macOS. As&nbsp;<a href="https://forums.swift.org/u/johannesweiss" data-wpel-link="external" target="_blank" rel="external noopener">@johannesweiss</a>&nbsp;mentioned earlier the default for this is 256. You can change this but it involves disabling the System Integrity Protection.</p>
<cite><a href="https://forums.swift.org/u/adam-fowler/summary" data-wpel-link="external" target="_blank" rel="external noopener">Adam Fowler</a>, <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/99" data-wpel-link="external" target="_blank" rel="external noopener">Swift Forums thread</a></cite></blockquote>



<p>The 128 connections &amp; read errors parts were spot on, in hindsight. But the rest was incorrect (it&#8217;s not about the file descriptor ulimit) and in particular the incorrect statement about having to disable SIP perhaps further distracted readers (corrections were posted in reply, which perhaps steered the thread away from what actually mattered).</p>



<p>I&#8217;m not sure what precisely the lesson is here… if Adam had better understood the behaviour he&#8217;d seen previously (re. 128 connections being the apparent limit) he might have been able to immediately point out one of the key problems. But who can say why he didn&#8217;t quite understand that limit correctly, or whether he should have. This sort of thing happens, and <em>maybe</em> it suggests a failure to properly diagnose problems previously, but mostly I&#8217;d just point out that the discrepancy here &#8211; between 128 and 256 &#8211; <em>should</em> have been noticed, and had it been questioned it would have accelerated progress towards the root cause.</p>



<p>Speaking just for myself, I think I (erroneously) dismissed Adam&#8217;s comment because I already knew that the default file descriptor limit is <em>not</em> actually 256 (it&#8217;s 2,560 on macOS, mostly) and so I assumed the <em>whole</em> comment was wrong and irrelevant.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>⚠️ Partly wrong is not the same as completely wrong (let alone useless).</p>
</div></div>



<p>Another clue was put forth, yet again essentially by accident (without understanding its significance, at the time):</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Yes, the reason I used&nbsp;<code>wrk</code>, is that it uses pipelining. That&#8217;s why ab (apachebench) had such terrible performance: it opened a new socket for each request. And then it overloaded the system by throwing</p>



<ul class="wp-block-list">
<li><code>socket: Too many open files</code></li>



<li><code>apr_socket_recv: Connection reset by peer&nbsp;</code></li>
</ul>



<p>errors.</p>



<p>I raised the&nbsp;<code>ulimit -n</code>&nbsp;to 10240, but still&nbsp;<code>apr_socket_recv: Connection reset by peer (104)</code>&nbsp;occurred occasionally.</p>
<cite><a href="https://forums.swift.org/u/axello/summary" data-wpel-link="external" target="_blank" rel="external noopener">Axel Roest</a>, <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/122" data-wpel-link="external" target="_blank" rel="external noopener">Swift Forums thread</a></cite></blockquote>



<p>This hinted very directly at the second major problem, but it seems nobody in the forum thread realised it. I think there was still a preoccupation with the file descriptor ulimit.</p>



<p>A little logic applied at the time of Axel&#8217;s comment <em>should</em> have revealed its mistaken presumption: that opening new TCP connections for each HTTP request will inevitably cause connection failures. Sure, it will if you give it enough concurrent connection attempts, but real-world web servers operate at <em>huge</em> loads that are basically one HTTP request per connection, without any significant reliability problems. In hindsight, it&#8217;s clear that Axel dismissing this behaviour as normal was a mistake &#8211; as was everyone else in the thread going along with that dismissal.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>⚠️ If a tool isn&#8217;t working the way you expect, maybe that&#8217;s telling you something important. Just switching tools until you find one which doesn&#8217;t exhibit the problem doesn&#8217;t necessarily mean it&#8217;s not still a problem.</p>
</div></div>



<h3 class="wp-block-heading">A misunderstood workaround</h3>



<p>In parallel to all of the above discussion in the Swift Forums thread, I&#8217;d been diving into <code>wrk</code> to see what it was really doing. I discovered <em>a way</em> to eliminate the errors: by opening all the TCP connections in advance, in a way that <em>happened</em> to limit how many were attempted concurrently to the <code>wrk</code> thread count, which <em>happened</em> to be low enough in my use of <code>wrk</code> to not hit the magic 128 limit (more on that later). As you can see in <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/123" data-wpel-link="external" target="_blank" rel="external noopener">my forum post on this</a>, I initially misunderstood how <code>wrk</code> functioned and misattributed the root cause as bugs / bad design in <code>wrk</code>.</p>



<p>In my defence, <code>wrk</code> isn&#8217;t written very well, eschewing such outrageous and bourgeois software engineering practices as, you know, actually checking for errors. So it wasn&#8217;t unreasonable to believe it was ultimately just broken, given plenty of evidence that it was at least partly broken (which it was &amp; is), but it was ultimately a mistake to let that cloud my judgement of each individual behaviour.</p>



<p>Then again, if I hadn&#8217;t been so appalled by the bad code in <code>wrk</code>, and taken it upon myself to rewrite key parts of it, I might not have stumbled onto the above &#8220;fix&#8221; and therefore also not found the true cause, later.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>✅ Improving error handling &amp; reporting is practically always a good idea. And when debugging a problem it can be helpful even if it doesn&#8217;t feel guided &#8211; the whole point of absent or incorrect error reporting is that you don&#8217;t know what you&#8217;re missing, so you may well reveal an important clue &#8220;by accident&#8221;.</p>
</div></div>



<h3 class="wp-block-heading">It&#8217;s never the compiler or the kernel… except when it is</h3>



<p>At the time I did think I&#8217;d actually <em>fixed</em> <code>wrk</code>; I didn&#8217;t realise I&#8217;d merely found an imperfect workaround. I&#8217;d solved the connection errors (not really)! But, I was still curious about one thing &#8211; something pretty much everyone had kinda ignored this whole time:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Though those lingering few read/write errors still bother me. I might look into them later.</p>
<cite>Me, <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/123" data-wpel-link="external" target="_blank" rel="external noopener">Swift Forums thread</a></cite></blockquote>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>✅ Curiosity is <em>powerful</em>. Why did my chocolate bar melt in my pocket when I walked through the lab, <a href="https://www.technologyreview.com/1999/01/01/236818/melted-chocolate-to-microwave/" data-wpel-link="external" target="_blank" rel="external noopener">maybe that&#8217;s interesting</a>? Why did this contaminated Petri dish end up full of fungus instead of bacteria, <a href="https://www.healio.com/news/endocrinology/20120325/penicillin-an-accidental-discovery-changed-the-course-of-medicine" data-wpel-link="external" target="_blank" rel="external noopener">maybe that&#8217;s interesting</a>? Why&#8217;s this unused screen glowing, <a href="https://www.aps.org/publications/apsnews/200111/history.cfm" data-wpel-link="external" target="_blank" rel="external noopener">maybe that&#8217;s interesting</a>? Ow, why did this apple fall on my head… but, <a href="https://education.nationalgeographic.org/resource/isaac-newton-who-he-was-why-apples-are-falling/" data-wpel-link="external" target="_blank" rel="external noopener">maybe that&#8217;s interesting</a>? (<a href="https://www.newscientist.com/article/2170052-newtons-apple-the-real-story/" data-wpel-link="external" target="_blank" rel="external noopener">apocryphal</a>, but close enough)</p>
</div></div>



<p>Tracing those reported errors to their cause was quite a challenge. The only known way to reproduce the errors was to use a very high number of concurrent TCP connections (several thousand), which made it hard to follow any <em>single</em> connection through its lifecycle using any low-brow methods (printf debugging etc). <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/129" data-wpel-link="external" target="_blank" rel="external noopener">I eventually managed</a> using System Trace<sup data-fn="65590619-a219-4fb0-87ea-fd2c98990365" class="fn"><a href="#65590619-a219-4fb0-87ea-fd2c98990365" id="65590619-a219-4fb0-87ea-fd2c98990365-link">2</a></sup> (lamenting, the entire time I used Instruments, that it would have been <a href="https://leopard-adc.pepas.com/documentation/DeveloperTools/Conceptual/SharkUserGuide/SystemTracing/SystemTracing.html" data-wpel-link="external" target="_blank" rel="external noopener">so much easier in Shark</a>).</p>



<p>Unfortunately, what I was seeing &#8211; while in fact correct &#8211; did not make sense to me, so I was hesitant to take it on face value.</p>



<p>The lack of any error reporting on the server side, because Vapor lacks it completely, was both a known problem at the time and a problem in hindsight. Had Vapor/NIO actually reported the errors they were encountering, it would have partially validated what I was seeing in the system traces &#8211; in fact, it would probably have saved me from having to capture &amp; analyse system traces.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>❌ Ignoring errors is always a bad idea. I mean, duh, right? But apparently it has to be reiterated.</p>
</div></div>



<p>Alas I don&#8217;t actually remember now precisely what led me to the final answers and root causes. I know it involved many hours of experimenting, exploring hypotheses, and generally fiddling with everything I could think of.</p>



<p>Somehow or other, I did finally cotton on to a key configuration parameter: <code>kern.ipc.somaxconn</code>.</p>



<p>That controls how many connection requests can be pending (not formally accepted by the server) at one time. It defaults to 128 on macOS. Remember that number, 128?</p>
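<p>To make &#8220;pending&#8221; concrete, here is a minimal Python sketch (not code from <code>wrk</code> or Vapor; the function name, the backlog of 16 and the five clients are illustrative assumptions). It shows that clients complete the TCP handshake and wait in the kernel&#8217;s listen queue before the server ever calls <code>accept</code>; <code>somaxconn</code> is the cap on how long that queue may get.</p>

```python
import socket


def demo_pending_connections(backlog=16, clients=5):
    """Show that TCP handshakes complete before the server calls accept()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(backlog)          # the kernel caps this at its somaxconn limit
    address = server.getsockname()

    # Each connect returns once the kernel has queued the connection,
    # even though the server has not accepted anything yet.
    pending = [socket.create_connection(address, timeout=2) for _ in range(clients)]

    # Only now does the server drain its queue.
    server.settimeout(2)
    accepted = [server.accept()[0] for _ in range(clients)]

    for sock in pending + accepted + [server]:
        sock.close()
    return len(pending), len(accepted)


if __name__ == "__main__":
    print(demo_pending_connections())
```

<p>The sketch deliberately stays well under the backlog. What happens once the queue is full differs between operating systems, which is exactly the territory this benchmark wandered into.</p>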



<p>Once I had figured out that <code>kern.ipc.somaxconn</code> directly controlled the problematic behaviour, the rest followed pretty naturally and quickly &#8211; I realised that what I saw in the system traces was in fact accurate, and that in turn revealed that the macOS kernel contains multiple surprisingly blatant and serious bugs (or at the very least dubious design choices, and lying documentation) regarding TCP sockets in non-blocking mode. I wrote that up in some detail in the second half of <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583/132" data-wpel-link="external" target="_blank" rel="external noopener">this Swift Forums post</a>.</p>



<p>As a sidenote, that darn magic number that everyone kept ignoring &#8211; 128 &#8211; cropped up yet again, in <a href="https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/listen.2.html" data-wpel-link="external" target="_blank" rel="external noopener">the <code>listen</code> man page</a>, though by the time I saw it there it was more a confirmation of what I&#8217;d already discovered than a helpful clue. Still, perhaps there&#8217;s a lesson there: read the man page. 😆</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>❌ When documenting known bugs and limitations, explain them fully. Don&#8217;t just say e.g. &#8220;more than 128 doesn&#8217;t work&#8221;, say <em>why</em>.</p>
</div></div>



<h2 class="wp-block-heading">Conclusion</h2>



<p>All told, the major problems identified by the benchmark were (and not all of these were mentioned above, but you can find all the details in <a href="https://forums.swift.org/t/standard-vapor-website-drops-1-5-of-requests-even-at-concurrency-of-100/71583" data-wpel-link="external" target="_blank" rel="external noopener">the Swift Forums thread</a>):</p>



<ul class="wp-block-list">
<li>The particular 3rd party library used for BigInt support in Swift, <a href="https://github.com/attaswift/BigInt" data-wpel-link="external" target="_blank" rel="external noopener">attaswift/BigInt</a>, performs quite poorly.</li>



<li>Vapor would accept too few connections per cycle of its event loop (promptly fixed, in <a href="https://github.com/vapor/vapor/releases/tag/4.96.0" data-wpel-link="external" target="_blank" rel="external noopener">4.96.0</a>).</li>



<li>The benchmark tool used, <code><a href="https://github.com/wg/wrk" data-wpel-link="external" target="_blank" rel="external noopener">wrk</a></code>, has numerous bugs:
<ul class="wp-block-list">
<li>It doesn&#8217;t always use the configured number of concurrent connections.</li>



<li>It doesn&#8217;t measure latency correctly.</li>



<li>It doesn&#8217;t report errors correctly (in the sense both that it miscategorises them, e.g. connect vs read/write, and that it doesn&#8217;t provide enough detail to understand what they are, such as by including the errno).</li>
</ul>
</li>



<li>The macOS kernel (and seemingly Linux kernel likewise) has multiple bugs:
<ul class="wp-block-list">
<li>Connection errors are reported incorrectly (as <code>ECONNRESET</code> or <code>EBADF</code>, instead of <code>ECONNREFUSED</code>).</li>



<li>kqueue (kevents) behaves as if all connections are always accepted, even when they are not. Put another way, you cannot actually tell if a connection was successful when using non-blocking sockets on macOS.</li>
</ul>
</li>



<li>Key network configuration on macOS &amp; Linux is way too restrictive:
<ul class="wp-block-list">
<li>Maximum file descriptors per process is only 2,560 generally on macOS, and even less (256) in GUI apps. It may vary on Linux, but on Axel&#8217;s particular server it was 1,024.</li>



<li>Maximum number of unaccepted connection requests (the <code>kern.ipc.somaxconn</code> sysctl on macOS, <code>/proc/sys/net/core/somaxconn</code> on Linux) is only 128 on macOS. It may vary on Linux.</li>
</ul>
</li>
</ul>
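<p>If you want to check the two limits from that last item on your own machine, a small Python sketch along these lines should do (the function names are mine, and it assumes either the Linux <code>/proc</code> path or the macOS <code>sysctl</code> name mentioned above is available; anywhere else it just returns <code>None</code>).</p>

```python
import resource
import subprocess
import sys
from pathlib import Path


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def somaxconn():
    """Return the kernel's cap on the listen backlog, or None if it cannot be read."""
    linux_path = Path("/proc/sys/net/core/somaxconn")
    if linux_path.exists():
        return int(linux_path.read_text())
    if sys.platform == "darwin":
        result = subprocess.run(
            ["sysctl", "-n", "kern.ipc.somaxconn"],
            capture_output=True, text=True, check=True,
        )
        return int(result.stdout)
    return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("file descriptors: soft", soft, "hard", hard)
    print("somaxconn:", somaxconn())
```

<p>Raising them is a job for <code>ulimit -n</code> and <code>sysctl</code> respectively; what counts as &#8220;enough&#8221; depends on how many concurrent connections you intend to throw at the server.</p>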



<p>It <em>appears</em> that the kernel bugs apply to Linux as well, as the behaviour seems to be the same between macOS and Linux (although the event mechanism necessarily differs there: kqueue isn&#8217;t available on Linux, so <code>wrk</code> would have been using <code>epoll</code> or <code>select</code> instead).</p>
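<p>For reference, the conventional way to learn the outcome of a non-blocking connect is to wait for the socket to become writable and then read <code>SO_ERROR</code>. Below is a minimal Python sketch of that pattern (using <code>select</code> for portability rather than kqueue; the function name and timeout are assumptions, and this is not how <code>wrk</code> itself is written). The kernel behaviour described above means a zero from a check like this is necessary but, on macOS at least, apparently not sufficient to know the server really accepted the connection.</p>

```python
import errno
import select
import socket


def connect_status(address, timeout=2.0):
    """Attempt a non-blocking connect; return 0 on apparent success, else an errno."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        status = sock.connect_ex(address)
        if status not in (0, errno.EINPROGRESS):
            return status               # failed immediately

        # Wait until the kernel says the socket is writable, i.e. the
        # connection attempt has finished one way or the other.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return errno.ETIMEDOUT

        # SO_ERROR holds the result of the asynchronous connect.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        sock.close()


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(4)
    print("status:", connect_status(listener.getsockname()))
    listener.close()
```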



<p>With the above issues fixed or worked around, <a href="https://tech.phlux.us/Juice-Sucking-Servers-Part-Trois/" data-wpel-link="external" target="_blank" rel="external noopener">Axel&#8217;s benchmark produces more explicable results</a> (but keep in mind that the difference in connection acceptance behaviour <em>is real</em> and reflects a different design trade-off in Vapor, which <em>may</em> be a problem for real-world use if you don&#8217;t raise <code>somaxconn</code> and the listen backlog limit enough).</p>



<p>And that&#8217;s all just the <em>problems</em> Axel&#8217;s benchmark surfaced &#8211; there was a whole host of other interesting lessons taken away from all this (only a fraction of which were highlighted in this post &#8211; many more can be found in Axel&#8217;s posts and the Swift Forums thread).</p>



<p>Nominally the end result is also a benchmark that shows Vapor (Swift) out-performing other popular web frameworks in other languages. <em>Hugely</em> out-performing them, if you factor in not just throughput &amp; latency but RAM &amp; power usage. But, to reiterate <a href="#do-these-improvements-apply-to-the-other-cases-too">what I pointed out earlier</a>, take that with a grain of salt.</p>



<p>So, for a benchmark that many initially decried as unrealistic or plain poorly conceived, it turned out to be pretty darn useful, I think. And if that doesn&#8217;t make it a successful benchmark, I don&#8217;t know what does.</p>



<hr class="wp-block-separator has-alpha-channel-opacity is-style-dots"/>



<h1 class="wp-block-heading">Addendum: post title</h1>



<p>Looking at the <a href="https://news.ycombinator.com/item?id=40374946" data-wpel-link="external" target="_blank" rel="external noopener">comments about this post on HackerNews</a> etc, I feel like I have to explain the title a little. I was quite pleased with myself when I came up with it (admittedly by accident), because it&#8217;s subtle and I think kinda clever, but perhaps too subtle.</p>



<p>&#8220;Swift sucks at web serving… or does it?&#8221; is a [platonic] double entendre.</p>



<p>On face value it&#8217;s alluding to the more typical type of post that is both (a) click-baity and (b) a standard &#8220;turns out&#8221; story where actually Swift is <em>awesome</em> at web serving and haha to all those who doubted it. (where one can replace the word &#8220;Swift&#8221; with basically any programming technology, because benchmarking brings out some ugly competitiveness from the community)</p>



<p>But <em>really</em> what it means here, if you read the whole post, is that actually <em>we still don&#8217;t know</em>. It&#8217;s alluding to the oft-overlooked fact that benchmarks are rarely as conclusive as they&#8217;re presented. Which I thought was quite clever because it reiterates, at a meta level, my whole point about learning being more important than competing.</p>



<p>At least, that was the idea. 😆</p>


<ol class="wp-block-footnotes"><li id="e7728698-06db-400a-a6b9-01a2ce4f3e5b"><a href="https://github.com/helidon-io/helidon" data-wpel-link="external" target="_blank" rel="external noopener">Helidon itself is written in Java</a>, but <a href="https://gitlab.com/axello/serverbench/-/tree/main/java/src/main/kotlin?ref_type=heads" data-wpel-link="external" target="_blank" rel="external noopener">Axel used Kotlin for his little web server implementation</a> &#8211; including most crucially the Fibonacci calculations.  Both interoperate atop the <a href="https://en.wikipedia.org/wiki/Java_virtual_machine" data-wpel-link="external" target="_blank" rel="external noopener">JVM</a> and plenty of &#8220;Java&#8221; libraries are partly written in Kotlin, or have dependencies written in Kotlin &#8211; and vice versa.  A little like Objective-C and Swift interoperate such that many Mac / iDevice apps use a rich mix of both and you don&#8217;t typically need to care which language is used for any particular piece. <a href="#e7728698-06db-400a-a6b9-01a2ce4f3e5b-link" aria-label="Jump to footnote reference 1">↩︎</a></li><li id="65590619-a219-4fb0-87ea-fd2c98990365">I always endeavour to link to the things I mention, but in this case there&#8217;s nothing to link to &#8211; Apple don&#8217;t provide any actual documentation of the System Trace tool in Instruments, and there&#8217;s not even any usable 3rd party guide to it, that I can find.  It&#8217;s a sad demonstration of Apple&#8217;s general indifference to performance tools. 😔<br><br>Apple don&#8217;t even have a proper product page for Instruments itself &#8211; the closest you can find is merely <a href="https://help.apple.com/instruments/mac/10.0/#/" data-wpel-link="external" target="_blank" rel="external noopener">its Help</a>. <a href="#65590619-a219-4fb0-87ea-fd2c98990365-link" aria-label="Jump to footnote reference 2">↩︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://wadetregaskis.com/swift-sucks-at-web-serving-or-does-it/feed/</wfw:commentRss>
			<slash:comments>13</slash:comments>
		
		
			<media:content url="https://wadetregaskis.com/wp-content/uploads/2024/05/Web-server-comparison-throughput.webp" medium="image" />
<post-id xmlns="com-wordpress:feed-additions:1">8061</post-id>	</item>
	</channel>
</rss>
