<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>gettimeofday &#8211; Wade Tregaskis</title>
	<atom:link href="https://wadetregaskis.com/tags/gettimeofday/feed/" rel="self" type="application/rss+xml" />
	<link>https://wadetregaskis.com</link>
	<description></description>
	<lastBuildDate>Sun, 19 May 2024 15:24:24 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://wadetregaskis.com/wp-content/uploads/2016/03/Stitch-512x512-1-256x256.png</url>
	<title>gettimeofday &#8211; Wade Tregaskis</title>
	<link>https://wadetregaskis.com</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">226351702</site>	<item>
		<title>Swift&#8217;s native Clocks are very inefficient</title>
		<link>https://wadetregaskis.com/swifts-native-clocks-are-very-inefficient/</link>
					<comments>https://wadetregaskis.com/swifts-native-clocks-are-very-inefficient/#comments</comments>
		
		<dc:creator><![CDATA[]]></dc:creator>
		<pubDate>Fri, 03 May 2024 02:10:07 +0000</pubDate>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Benchmarked]]></category>
		<category><![CDATA[clock_gettime_nsec_np]]></category>
		<category><![CDATA[ContinuousClock]]></category>
		<category><![CDATA[gettimeofday]]></category>
		<category><![CDATA[Inefficient by design]]></category>
		<category><![CDATA[mach_absolute_time]]></category>
		<category><![CDATA[Sad]]></category>
		<category><![CDATA[SuspendingClock]]></category>
		<category><![CDATA[Swift]]></category>
		<guid isPermaLink="false">https://wadetregaskis.com/?p=7990</guid>

					<description><![CDATA[By which I mean, things like ContinuousClock and SuspendingClock. In absolute terms they don&#8217;t have much overhead &#8211; think sub-microsecond for most uses. Which makes them perfectly acceptable when they&#8217;re used sporadically (e.g. only a few times per second). However, if you need to deal with time and timing more frequently, their inefficiency can become&#8230; <a class="read-more-link" href="https://wadetregaskis.com/swifts-native-clocks-are-very-inefficient/" data-wpel-link="internal">Read more</a>]]></description>
										<content:encoded><![CDATA[
<p>By which I mean, things like <code><a href="https://developer.apple.com/documentation/swift/continuousclock" data-wpel-link="external" target="_blank" rel="external noopener">ContinuousClock</a></code> and <code><a href="https://developer.apple.com/documentation/swift/suspendingclock" data-wpel-link="external" target="_blank" rel="external noopener">SuspendingClock</a></code>.</p>



<p>In absolute terms they don&#8217;t have much overhead &#8211; think sub-microsecond for most uses. Which makes them perfectly acceptable when they&#8217;re used sporadically (e.g. only a few times per second).</p>



<p>However, if you need to deal with time and timing more frequently, their inefficiency can become a serious bottleneck.</p>



<p>I stumbled into this because of a fairly common and otherwise uninteresting pattern &#8211; throttling UI updates on an I/O operation&#8217;s progress. This might look something like:</p>



<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code><span class="line"><span style="color: #0000FF">struct</span><span style="color: #000000"> </span><span style="color: #267F99">Example</span><span style="color: #000000">: View {</span></span>
<span class="line"><span style="color: #000000">    </span><span style="color: #0000FF">let</span><span style="color: #000000"> bytes: AsyncSequence&lt;</span><span style="color: #267F99">UInt8</span><span style="color: #000000">&gt;</span></span>
<span class="line"></span>
<span class="line"><span style="color: #000000">    </span><span style="color: #0000FF">@State</span><span style="color: #000000"> </span><span style="color: #0000FF">var</span><span style="color: #000000"> byteCount = </span><span style="color: #098658">0</span></span>
<span class="line"></span>
<span class="line"><span style="color: #000000">    </span><span style="color: #0000FF">var</span><span style="color: #000000"> body: some View {</span></span>
<span class="line"><span style="color: #000000">        </span><span style="color: #795E26">Text</span><span style="color: #000000">(</span><span style="color: #A31515">&quot;Bytes so far: </span><span style="color: #0000FF">\(</span><span style="color: #000000FF">byteCount.</span><span style="color: #795E26">formatted</span><span style="color: #000000FF">(.</span><span style="color: #795E26">byteCount</span><span style="color: #000000FF">(</span><span style="color: #795E26">style</span><span style="color: #000000FF">: .</span><span style="color: #001080">binary</span><span style="color: #000000FF">))</span><span style="color: #0000FF">)</span><span style="color: #A31515">&quot;</span><span style="color: #000000">)</span></span>
<span class="line"><span style="color: #000000">            .</span><span style="color: #001080">task</span><span style="color: #000000"> {</span></span>
<span class="line"><span style="color: #000000">                </span><span style="color: #0000FF">var</span><span style="color: #000000"> unpostedByteCount = </span><span style="color: #098658">0</span></span>
<span class="line"><span style="color: #000000">                </span><span style="color: #0000FF">let</span><span style="color: #000000"> clock = </span><span style="color: #795E26">ContinuousClock</span><span style="color: #000000">()</span></span>
<span class="line"><span style="color: #000000">                </span><span style="color: #0000FF">var</span><span style="color: #000000"> lastUpdate = clock.</span><span style="color: #001080">now</span></span>
<span class="line"></span>
<span class="line"><span style="color: #000000">                </span><span style="color: #AF00DB">for</span><span style="color: #000000"> </span><span style="color: #AF00DB">try</span><span style="color: #000000"> </span><span style="color: #AF00DB">await</span><span style="color: #000000"> byte </span><span style="color: #AF00DB">in</span><span style="color: #000000"> bytes {</span></span>
<span class="line"><span style="color: #000000">                    … </span><span style="color: #008000">// Do something with the byte.</span></span>
<span class="line"></span>
<span class="line"><span style="color: #000000">                    unpostedByteCount += </span><span style="color: #098658">1</span></span>
<span class="line"></span>
<span class="line"><span style="color: #000000">                    </span><span style="color: #0000FF">let</span><span style="color: #000000"> now = clock.</span><span style="color: #001080">now</span></span>
<span class="line"><span style="color: #000000">                    </span><span style="color: #0000FF">let</span><span style="color: #000000"> delta = now - lastUpdate</span></span>
<span class="line"></span>
<span class="line"><span style="color: #000000">                    </span><span style="color: #AF00DB">if</span><span style="color: #000000"> (    delta &gt; .</span><span style="color: #795E26">seconds</span><span style="color: #000000">(</span><span style="color: #098658">1</span><span style="color: #000000">)</span></span>
<span class="line"><span style="color: #000000">                         || (    (delta &gt; .</span><span style="color: #795E26">milliseconds</span><span style="color: #000000">(</span><span style="color: #098658">100</span><span style="color: #000000">)</span></span>
<span class="line"><span style="color: #000000">                              &amp;&amp; </span><span style="color: #098658">1_000_000</span><span style="color: #000000"> &lt;= unpostedByteCount))) {</span></span>
<span class="line"><span style="color: #000000">                        byteCount += unpostedByteCount</span></span>
<span class="line"><span style="color: #000000">                        unpostedByteCount = </span><span style="color: #098658">0</span></span>
<span class="line"><span style="color: #000000">                        lastUpdate = now</span></span>
<span class="line"><span style="color: #000000">                    }</span></span>
<span class="line"><span style="color: #000000">                }</span></span>
<span class="line"><span style="color: #000000">            }</span></span>
<span class="line"><span style="color: #000000">    }</span></span>
<span class="line"><span style="color: #000000">}</span></span></code></pre></div>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>☝️ This isn&#8217;t a complete implementation, as it won&#8217;t update the byte count if the download stalls (since the lack of incoming bytes will mean no iteration on the loop, and therefore no updates even if a full second passes). But it&#8217;s sufficient for demonstration purposes here.</p>



<p>🖐️ Why didn&#8217;t I just use <code><a href="https://github.com/apple/swift-async-algorithms/blob/main/Sources/AsyncAlgorithms/AsyncAlgorithms.docc/Guides/Throttle.md" data-wpel-link="external" target="_blank" rel="external noopener">throttle</a></code> from <a href="https://github.com/apple/swift-async-algorithms" data-wpel-link="external" target="_blank" rel="external noopener">swift-async-algorithms</a>? I did, at first, and quickly discovered that its performance is <em>horrible</em>. While I do suspect I can &#8216;optimise&#8217; it to not be atrocious, I haven&#8217;t pursued that as it was easier to just write my own throttling system.</p>
</div></div>



<p>The above seems fairly straightforward, but if you run it and have any non-trivial I/O rate &#8211; even just a few hundred kilobytes per second &#8211; you&#8217;ll find that it saturates an entire CPU core, not just wasting CPU time but limiting the I/O rate severely.</p>



<p>Using a <code>SuspendingClock</code> makes no difference.</p>



<p>In a nutshell, the problem is that Swift&#8217;s <code><a href="https://developer.apple.com/documentation/swift/clock" data-wpel-link="external" target="_blank" rel="external noopener">Clock</a></code> protocol has significant overheads by design<sup data-fn="2f4a7c64-e213-44df-a3da-0e5020545aad" class="fn"><a href="#2f4a7c64-e213-44df-a3da-0e5020545aad" id="2f4a7c64-e213-44df-a3da-0e5020545aad-link">1</a></sup>. If you look at a time profile of code like this, you&#8217;ll see things like:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img fetchpriority="high" decoding="async" width="900" height="716" src="https://wadetregaskis.com/wp-content/uploads/2024/05/ContinuousClock-overhead.webp" alt="Screenshot of Instruments showing the outline view for a Time Profile, expanded to show dozens of spurious, overhead functions taking up the vast majority of the runtime." class="wp-image-7991" srcset="https://wadetregaskis.com/wp-content/uploads/2024/05/ContinuousClock-overhead.webp 900w, https://wadetregaskis.com/wp-content/uploads/2024/05/ContinuousClock-overhead-256x204.webp 256w, https://wadetregaskis.com/wp-content/uploads/2024/05/ContinuousClock-overhead-768x611.webp 768w, https://wadetregaskis.com/wp-content/uploads/2024/05/ContinuousClock-overhead@2x.webp 1800w, https://wadetregaskis.com/wp-content/uploads/2024/05/ContinuousClock-overhead-256x204@2x.webp 512w" sizes="(max-width: 900px) 100vw, 900px" /></figure>
</div>


<p>That&#8217;s a lot of time wasted in function calls and struct initialisation and type conversion and protocol witnesses and all that guff. The only part that&#8217;s <em>actually</em> retrieving the time is the <code><a href="https://github.com/apple/swift/blob/625436af05b1cf8f1904096530235489daec9dac/stdlib/public/Concurrency/Clock.cpp#L30" data-wpel-link="external" target="_blank" rel="external noopener">swift_get_time</a></code> call (which is just a wrapper over <code><a href="https://www.manpagez.com/man/3/clock_gettime/" data-wpel-link="external" target="_blank" rel="external noopener">clock_gettime</a></code>, which is just a wrapper over <code><a href="https://www.manpagez.com/man/3/clock_gettime_nsec_np/" data-wpel-link="external" target="_blank" rel="external noopener">clock_gettime_nsec_np</a>(CLOCK_UPTIME_RAW)</code>, which is just a wrapper over <code><a href="https://developer.apple.com/documentation/kernel/1462446-mach_absolute_time" data-wpel-link="external" target="_blank" rel="external noopener">mach_absolute_time</a></code>).</p>
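
<p>For illustration, calling one of those lower-level APIs directly might look like the following minimal sketch. It assumes an Apple platform (the function comes from <code>Darwin</code>), and the variable names are invented for this example.</p>

<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code>import Darwin

// Nanoseconds on the CLOCK_UPTIME_RAW clock, which does not advance while
// the system is asleep. The result is a plain UInt64.
let start = clock_gettime_nsec_np(CLOCK_UPTIME_RAW)

// (The work being timed would go here.)

let elapsedNanoseconds = clock_gettime_nsec_np(CLOCK_UPTIME_RAW) - start
print(&quot;Elapsed: \(elapsedNanoseconds) ns&quot;)</code></pre></div>

<p>Because the result is just an integer, comparing it against a threshold involves no struct initialisation, type conversion or protocol witnesses.</p>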



<p>I wrote <a href="https://github.com/wadetregaskis/Swift-Benchmarks/blob/main/Benchmarks/Clocks/Clocks.swift" data-wpel-link="external" target="_blank" rel="external noopener">some simple benchmarks of various alternative time-tracking methods</a>, with these results with Swift 5.10 (showing the median runtime of the benchmark, which is a million iterations of checking the time):</p>



<figure class="wp-block-table aligncenter"><table><thead><tr><th class="has-text-align-right" data-align="right">Method</th><th class="has-text-align-center" data-align="center">10-core iMac Pro</th><th class="has-text-align-center" data-align="center">M2 MacBook Air</th></tr></thead><tbody><tr><td class="has-text-align-right" data-align="right"><code><a href="https://developer.apple.com/documentation/swift/continuousclock" data-wpel-link="external" target="_blank" rel="external noopener">ContinuousClock</a></code></td><td class="has-text-align-center" data-align="center">429 ms</td><td class="has-text-align-center" data-align="center">258 ms</td></tr><tr><td class="has-text-align-right" data-align="right"><code><a href="https://developer.apple.com/documentation/swift/suspendingclock" data-wpel-link="external" target="_blank" rel="external noopener">SuspendingClock</a></code></td><td class="has-text-align-center" data-align="center">430 ms</td><td class="has-text-align-center" data-align="center">247 ms</td></tr><tr><td class="has-text-align-right" data-align="right"><code><a href="https://developer.apple.com/documentation/foundation/date" data-wpel-link="external" target="_blank" rel="external noopener">Date</a></code></td><td class="has-text-align-center" data-align="center">30 ms</td><td class="has-text-align-center" data-align="center">19 ms</td></tr><tr><td class="has-text-align-right" data-align="right"><code><a href="https://www.manpagez.com/man/3/clock_gettime_nsec_np/" data-wpel-link="external" target="_blank" rel="external noopener">clock_gettime_nsec_np(CLOCK_MONOTONIC_RAW)</a></code></td><td class="has-text-align-center" data-align="center">32 ms</td><td class="has-text-align-center" data-align="center">10 ms</td></tr><tr><td class="has-text-align-right" data-align="right"><code><a href="https://www.manpagez.com/man/3/clock_gettime_nsec_np/" data-wpel-link="external" target="_blank" rel="external noopener">clock_gettime_nsec_np(CLOCK_UPTIME_RAW)</a></code></td><td class="has-text-align-center" data-align="center">27 ms</td><td class="has-text-align-center" data-align="center">10 ms</td></tr><tr><td class="has-text-align-right" data-align="right"><code><a href="https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/gettimeofday.2.html" data-wpel-link="external" target="_blank" rel="external noopener">gettimeofday</a></code></td><td class="has-text-align-center" data-align="center">24 ms</td><td class="has-text-align-center" data-align="center">12 ms</td></tr><tr><td class="has-text-align-right" data-align="right"><code><a href="https://developer.apple.com/documentation/kernel/1462446-mach_absolute_time" data-wpel-link="external" target="_blank" rel="external noopener">mach_absolute_time</a></code></td><td class="has-text-align-center" data-align="center">15 ms</td><td class="has-text-align-center" data-align="center">6 ms</td></tr></tbody></table></figure>



<p>All these alternative methods are <em>well</em> over an order of magnitude faster than Swift&#8217;s native clock APIs, showing just how dreadfully inefficient the Swift <code>Clock</code> API is.</p>
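
<p>The general shape of such a comparison can be sketched roughly as follows. This is <em>not</em> the linked benchmark code: <code>measure</code> is a hypothetical helper, a single un-warmed run like this is far noisier than a proper harness reporting medians, and <code>ContinuousClock</code> needs macOS 13 or an equivalent OS release. Build with optimisations enabled, otherwise the numbers mean very little.</p>

<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code>import Foundation

// Hypothetical helper: runs `body` `iterations` times and returns the
// elapsed wall-clock time in seconds.
func measure(iterations: Int = 1_000_000, _ body: () -&gt; Void) -&gt; TimeInterval {
    let start = Date()
    for _ in 0..&lt;iterations {
        body()
    }
    return Date().timeIntervalSince(start)
}

var hits = 0  // Printed at the end so the optimiser cannot discard the checks.

// Check the time and compare against a threshold using ContinuousClock.
let clock = ContinuousClock()
let referenceInstant = clock.now
let clockSeconds = measure {
    if clock.now - referenceInstant &gt; .seconds(1) { hits += 1 }
}

// The same check using Date.
let referenceDate = Date()
let dateSeconds = measure {
    if referenceDate.timeIntervalSinceNow &lt; -1 { hits += 1 }
}

print(&quot;ContinuousClock: \(clockSeconds * 1000) ms&quot;)
print(&quot;Date: \(dateSeconds * 1000) ms&quot;)
print(&quot;Threshold hits: \(hits)&quot;)</code></pre></div>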



<h3 class="wp-block-heading">mach_absolute_time for the win</h3>



<p>Unsurprisingly, <code>mach_absolute_time</code> is the fastest. It is what all these other APIs are actually based on; it is the lowest level of the time stack.</p>



<p>The downside to calling <code>mach_absolute_time</code> <em>directly</em>, though, is that <a href="https://developer.apple.com/documentation/kernel/1462446-mach_absolute_time#discussion" data-wpel-link="external" target="_blank" rel="external noopener">it&#8217;s on Apple&#8217;s &#8220;naughty&#8221; list</a> &#8211; apparently it&#8217;s been abused for device fingerprinting, so Apple require you to beg for special permission if you want to use it (even though it&#8217;s used by all these other APIs anyway, as the basis for their implementations, and there&#8217;s nothing you can get from <code>mach_absolute_time</code> that you can&#8217;t get from them too 🤨).</p>



<h3 class="wp-block-heading"><code>Date</code> surprisingly not bad</h3>



<p>I was quite surprised to see good ol&#8217; <code><a href="https://developer.apple.com/documentation/foundation/date" data-wpel-link="external" target="_blank" rel="external noopener">Date</a></code> performing competitively with the traditional C-level APIs, at least on x86-64. Even on arm64 it&#8217;s not bad, still reaching a third to half the speed of the C APIs. This surprised me because <s>it has the overhead of at least one Objective-C message send (for <code><a href="https://developer.apple.com/documentation/foundation/date/1780473-timeintervalsincenow" data-wpel-link="external" target="_blank" rel="external noopener">timeIntervalSinceNow</a></code>), unless somehow the Swift compiler is optimising that into a static function call, or inlining it entirely…?</s></p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p><strong>Update</strong>: I later looked at the disassembly, and found no message sends, only a plain function call to <code>Foundation.Date.timeIntervalSinceNow.getter</code> (which is only 40 instructions, on arm64, over <code>clock_gettime</code> and <code>__stack_chk_fail</code> &#8211; and the former is hundreds of instructions, so it&#8217;s adding relatively little overhead to the C API).</p>



<p>This isn&#8217;t being done by the compiler, it&#8217;s because <a href="https://github.com/apple/swift-foundation/blob/main/Sources/FoundationEssentials/Date.swift" data-wpel-link="external" target="_blank" rel="external noopener">that&#8217;s <em>actually</em> how it&#8217;s implemented in Foundation</a>. I keep forgetting that Foundation from Swift is no longer just the old Objective-C Foundation, but rather mostly the <em>new</em> Foundation that&#8217;s written in native Swift. So these performance results likely don&#8217;t apply once you go back far enough in Apple OS releases (to when Swift really was calling into the Objective-C code for <code>NSDate</code>) &#8211; but it&#8217;s safe to rely on good <code>Date</code> performance now and in future.</p>
</div></div>



<p>I certainly wouldn&#8217;t be afraid to use <code>Date</code> broadly, going down to lower APIs only when truly necessary &#8211; which is pretty rarely, I&#8217;d wager; we&#8217;re talking a mere 19 to 30 <em>nanoseconds</em> to get the time elapsed since a reference date <em>and</em> compare it to a threshold. If that&#8217;s too slow, it might be an indication that there&#8217;s a bigger problem (like transferring data a single byte at a time, as in the example that started this post &#8211; but more on that in <a href="https://wadetregaskis.com/urlsession-performance-for-reading-a-byte-stream/" data-wpel-link="internal">the next post</a>).</p>
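
<p>As a minimal sketch of what that could look like for the throttling example that started this post: the <code>ProgressThrottle</code> type and its members are hypothetical names made up for illustration, and the thresholds simply mirror the ones in that example.</p>

<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code>import Foundation

// Hypothetical helper: decides when enough time (or time plus data) has
// accumulated to justify posting a progress update to the UI.
struct ProgressThrottle {
    private var lastUpdate: Date
    private var unposted = 0

    init(start: Date = Date()) {
        lastUpdate = start
    }

    /// Records `count` newly received bytes. Returns the number of bytes to
    /// post if an update is due, or nil if the update should be deferred.
    mutating func record(_ count: Int, now: Date = Date()) -&gt; Int? {
        unposted += count
        let delta = now.timeIntervalSince(lastUpdate)

        // Same rule as the original example: always after a second, or after
        // 100 ms if at least a million bytes are waiting.
        guard delta &gt; 1 || (delta &gt; 0.1 &amp;&amp; unposted &gt;= 1_000_000) else {
            return nil
        }

        let due = unposted
        unposted = 0
        lastUpdate = now
        return due
    }
}</code></pre></div>

<p>Inside the loop this would be used along the lines of <code>if let posted = throttle.record(1) { byteCount += posted }</code>. The <code>now</code> parameter exists only so that the logic can be exercised with fixed dates.</p>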



<hr class="wp-block-separator has-alpha-channel-opacity is-style-dots"/>



<h3 class="wp-block-heading">Follow-up</h3>



<p>This post <a href="https://news.ycombinator.com/item?id=40262897" data-wpel-link="external" target="_blank" rel="external noopener">got some attention on HackerNews</a>. Pleasingly, the comments there were almost all well-intentioned and interesting. It&#8217;s a bit beyond me to try to address all of them, but a few in particular raised good points that I would like to answer / clarify:</p>



<ul class="wp-block-list">
<li>A lot of folks were curious about <code>mach_absolute_time</code> being on Apple&#8217;s naughty list. I don&#8217;t know for sure why it is either, but I think it&#8217;s very likely that it&#8217;s <em>primarily</em> because it essentially provides a reference time point, that&#8217;s very precise and pretty unique between computers. It&#8217;s not the boot time necessarily &#8211; because the timer pauses whenever the system is put to sleep &#8211; but even so it provides a simple way to nearly if not exactly identify an individual machine session (between boots &amp; sleeps). It probably wouldn&#8217;t take many other fingerprinting data points to reliably pin-point a specific machine.<br><br>Secondarily, because it provides very precise timing capabilities (e.g. nanosecond-resolution on x86), it could possibly be a key component of <a href="https://en.wikipedia.org/wiki/Timing_attack" data-wpel-link="external" target="_blank" rel="external noopener">timing attacks</a> and broader device fingerprinting based on timing information (e.g. measuring how long it takes to perform an otherwise innocuous operation).<br><br>That all said, the only difference between it and some of the higher-level APIs wrapping it is their overhead. And it&#8217;s not apparent to me that merely making the &#8220;get-time&#8221; functionality 2x slower is going to magically mitigate all the above concerns, especially when we&#8217;re still talking just a few nanoseconds.</li>



<li>Admittedly my phrasing regarding Apple&#8217;s policies on <code>mach_absolute_time</code> &#8211; &#8220;beg for permission to use it&#8221; &#8211; is a little melodramatic. It&#8217;s revealing something of my personal opinions on certain Apple &#8220;security&#8221; practices. I love that Apple genuinely care about protecting everyone&#8217;s privacy, but sometimes I chafe at what feels like capricious or impractical specific policies.<br><br>In this particular case, it&#8217;s not apparent to me why this sort of protection is needed for <em>native</em> apps. In a web browser, sure, you&#8217;re running untrustworthy, essentially arbitrary code from all over the place, a <em>lot</em> of which is openly malicious (thanks, Google &amp; Facebook, for your pervasive trackers &#8211; fuck you too). But a native app &#8211; or heck, even a dodgy non-native one like an Electron app &#8211; must be explicitly installed by the end user, among other barriers like code signing.</li>



<li>A few folks looked at the example case, of iterating a single byte at a time, and were suspicious of how performant that could possibly be anyway. This is a very fair reaction &#8211; it&#8217;s my ingrained instinct as well, from years of C/C++/Objective-C &#8211; <em>but</em> it&#8217;s relying on a few outdated assumptions. <a href="https://wadetregaskis.com/urlsession-performance-for-reading-a-byte-stream/" data-wpel-link="internal">My next post</a> already covered this for the most part, but in short here:<br><br>Through inlining, that code basically optimises down to an outer loop that fetches a new <em>chunk</em> of data (a pointer &amp; length) plus an inner loop to iterate over that as direct memory access. The chunks are typically tens of kilobytes to megabytes, in my experience (depending on the source, e.g. network vs local storage, and the buffer sizes chosen by Apple&#8217;s framework code). So it actually is quite performant and essentially what you&#8217;d conventionally write in a file descriptor read loop. <em>If and when</em> it happens to optimise correctly. That&#8217;s the major caveat &#8211; sometimes the Swift compiler fails to properly optimise code like this, and then indeed the performance can really suck. But for simple cases like in this post&#8217;s example code, the optimiser has no trouble with it.</li>



<li>Similarly, a few folks questioned the need to check the clock on <em>every</em> byte, as in the example. That&#8217;s a valid critique of this sort of code in many contexts, and I concur that where possible one <em>should</em> try to be smarter about such things &#8211; i.e. use sequences of bunches of bytes, not sequences of individual bytes.  <a href="https://wadetregaskis.com/urlsession-performance-for-reading-a-byte-stream/" data-wpel-link="internal">e.g. with <code>URLSession</code> you can</a>, and indeed it is faster to do it smarter like that.  But, you <em>can</em> get acceptable real-world performance with this code, even in high-throughput cases, and it&#8217;s relatively simple and intuitive to write, so it&#8217;s not uncommon or necessarily unreasonable.<br><br>In addition, sometimes you&#8217;re at the mercy of the APIs available &#8211; e.g. sometimes you can <em>only</em> get an <code>AsyncSequence&lt;UInt8&gt;</code>. If you don&#8217;t care about complete accuracy, you can do things like only considering UI updates every N bytes. You&#8217;ll save CPU time and nobody will notice the difference for small enough N on a fast enough iteration, but if those prerequisites aren&#8217;t met you might read e.g. N-1 bytes and then hit a long pause, during which time you <em>have</em> the extra N-1 bytes in hand but your UI isn&#8217;t showing them.</li>



<li>Some folks noted that there are a <em>lot</em> of other clock APIs from Apple&#8217;s frameworks, like <code><a href="https://developer.apple.com/documentation/dispatch/dispatchtime" data-wpel-link="external" target="_blank" rel="external noopener">DispatchTime</a></code> and <code><a href="https://developer.apple.com/documentation/quartzcore/1395996-cacurrentmediatime" data-wpel-link="external" target="_blank" rel="external noopener">CACurrentMediaTime</a></code>. I didn&#8217;t include those in the benchmark because I just didn&#8217;t think of them at the time. If anyone wants to send me a pull request adding them to <a href="https://github.com/wadetregaskis/Swift-Benchmarks/blob/main/Benchmarks/Clocks/Clocks.swift" data-wpel-link="external" target="_blank" rel="external noopener">the code</a>, I&#8217;d be very happy to accept it.<br><br>I haven&#8217;t checked all those other APIs specifically, but I can pretty much guarantee they&#8217;re all built on <code>mach_absolute_time</code> too (possibly via one or more of the other C APIs already covered in this post). In fact those two examples just mentioned are explicitly documented as using <code>mach_absolute_time</code>.</li>



<li><a href="https://news.ycombinator.com/user?id=Kallikrates" data-wpel-link="external" target="_blank" rel="external noopener">Kallikrates</a> quietly pointed to a very interesting recent change in Apple&#8217;s Swift standard library code, <a href="https://github.com/apple/swift/pull/73429" data-wpel-link="external" target="_blank" rel="external noopener">Make static [milli/micro/nano]seconds members on Duration inlinable</a>. Together with <a href="https://github.com/apple/swift/pull/73419" data-wpel-link="external" target="_blank" rel="external noopener">another patch</a>, it seems very specifically aimed at eliminating some of the absurd overhead in Swift&#8217;s <code>ContinuousClock</code> &amp; <code>SuspendingClock</code> implementations. The timing is a bit interesting &#8211; I don&#8217;t know if they were prompted by this post, but it&#8217;d be an unlikely coincidence otherwise.<br><br>In any case, I suspect it is possible to eliminate the overheads &#8211; there&#8217;s no apparent reason why they can&#8217;t be at least as efficient as <code>Date</code> already is &#8211; and so I hope that is what&#8217;s happening. Hopefully I&#8217;ll be able to re-run these benchmarks in a few months, with Swift 6, and see the performance gap eliminated. 🤞</li>
</ul>
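
<p>The &#8220;only consider UI updates every N bytes&#8221; idea from the list above can be sketched as follows. To keep the example self-contained it iterates a plain synchronous <code>Sequence</code> rather than an <code>AsyncSequence</code>, and <code>checkInterval</code>, <code>countBytes</code> and <code>onUpdate</code> are hypothetical names chosen for illustration.</p>

<div class="wp-block-kevinbatdorf-code-block-pro padding-disabled" data-code-block-pro-font-family="" style="font-size:.875rem;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><pre class="shiki light-plus" style="background-color: #FFFFFF" tabindex="0"><code>import Foundation

// Hypothetical stride (the &quot;N&quot; above); tune it to the expected throughput.
let checkInterval = 4096

func countBytes&lt;S: Sequence&gt;(_ bytes: S, onUpdate: (Int) -&gt; Void) where S.Element == UInt8 {
    var total = 0
    var lastUpdate = Date()

    for _ in bytes {
        total += 1

        // Only consult the clock once every `checkInterval` bytes.
        guard total.isMultiple(of: checkInterval) else { continue }

        let now = Date()
        if now.timeIntervalSince(lastUpdate) &gt; 0.1 {
            onUpdate(total)
            lastUpdate = now
        }
    }

    onUpdate(total)  // Final flush, so a trailing partial stride is still reported.
}</code></pre></div>

<p>The trade-off is the one already described: if the source stalls part-way through a stride, up to <code>checkInterval - 1</code> received bytes go unreported until more data arrives or the sequence ends.</p>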


<ol class="wp-block-footnotes"><li id="2f4a7c64-e213-44df-a3da-0e5020545aad">One might quibble with the &#8220;by design&#8221; assertion.  What I mean is that because it uses a protocol it&#8217;s susceptible to significant overheads &#8211; as is seen in these benchmarks &#8211; and because its internal implementation (a private <code>_Int128</code> type, inside the standard library) is kept hidden, it limits the compiler&#8217;s ability to inline, which is in turn critical to eliminating what&#8217;s technically a lot of boilerplate.  In contrast, if it were simply a struct using only public types internally, it would have avoided most of these overheads and been more amenable to inlining.<br><br>It&#8217;s not an irredeemable design (I think) &#8211; and that&#8217;s what the <a href="https://github.com/apple/swift/pull/73429" data-wpel-link="external" target="_blank" rel="external noopener">recent</a> <a href="https://github.com/apple/swift/pull/73419" data-wpel-link="external" target="_blank" rel="external noopener">patches</a> seem to be banking on, by tweaking the design in order to allow inlining and thus hopefully eliminate almost all the overhead. <a href="#2f4a7c64-e213-44df-a3da-0e5020545aad-link" aria-label="Jump to footnote reference 1">↩︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://wadetregaskis.com/swifts-native-clocks-are-very-inefficient/feed/</wfw:commentRss>
			<slash:comments>13</slash:comments>
		
		
			<media:content url="https://wadetregaskis.com/wp-content/uploads/2024/05/ContinuousClock-overhead.webp" medium="image" />
<post-id xmlns="com-wordpress:feed-additions:1">7990</post-id>	</item>
	</channel>
</rss>
