<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>Intel Core Duo &#8211; Wade Tregaskis</title>
	<atom:link href="https://wadetregaskis.com/tags/intel-core-duo/feed/" rel="self" type="application/rss+xml" />
	<link>https://wadetregaskis.com</link>
	<description></description>
	<lastBuildDate>Thu, 25 Jan 2024 18:25:43 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://wadetregaskis.com/wp-content/uploads/2016/03/Stitch-512x512-1-256x256.png</url>
	<title>Intel Core Duo &#8211; Wade Tregaskis</title>
	<link>https://wadetregaskis.com</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">226351702</site>	<item>
		<title>-fomit-frame-pointer</title>
		<link>https://wadetregaskis.com/fomit-frame-pointer/</link>
					<comments>https://wadetregaskis.com/fomit-frame-pointer/#respond</comments>
		
		<dc:creator><![CDATA[]]></dc:creator>
		<pubDate>Wed, 24 Jan 2024 22:30:30 +0000</pubDate>
				<category><![CDATA[Ancient History]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Ramblings]]></category>
		<category><![CDATA[-fomit-frame-pointer]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[backtracing]]></category>
		<category><![CDATA[frame pointers]]></category>
		<category><![CDATA[i386]]></category>
		<category><![CDATA[Instruments]]></category>
		<category><![CDATA[Intel Core Duo]]></category>
		<category><![CDATA[Merom]]></category>
		<category><![CDATA[Shark]]></category>
		<category><![CDATA[Swift Forums]]></category>
		<category><![CDATA[x86-64]]></category>
		<category><![CDATA[Yonah]]></category>
		<guid isPermaLink="false">https://wadetregaskis.com/?p=7536</guid>

					<description><![CDATA[This is an elaboration of a post I made in a Swift Forums thread, SE-0419: Swift Backtracing API. The question was raised whether an official Swift backtracer should try to support code that doesn&#8217;t use frame pointers. Which immediately raised the question &#8211; in my mind &#8211; of whether anyone is still using the &#8220;optimisation&#8221;&#8230; <a class="read-more-link" href="https://wadetregaskis.com/fomit-frame-pointer/" data-wpel-link="internal">Read more</a>]]></description>
										<content:encoded><![CDATA[
<p>This is an elaboration of <a href="https://forums.swift.org/t/se-0419-swift-backtracing-api/69595/13" data-wpel-link="external" target="_blank" rel="external noopener">a post I made</a> in a Swift Forums thread, <a href="https://forums.swift.org/t/se-0419-swift-backtracing-api/69595" data-wpel-link="external" target="_blank" rel="external noopener">SE-0419: Swift Backtracing API</a>.</p>



<p>The question was raised whether an official Swift backtracer should try to support code that doesn&#8217;t use frame pointers.  Which immediately raised the question &#8211; in my mind &#8211; of whether anyone is still using the &#8220;optimisation&#8221; of omitting frame pointers, anyway.  And perhaps more importantly, whether they <em>should</em> still be omitting frame pointers.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<h4 class="wp-block-heading">What is a frame pointer?</h4>



<p>A pointer to a stack frame, <em>held in a well-known location</em>.  That location can be in the stack itself (forming a linked list of the stack frames) or in registers (e.g. the x29 register on AArch64, or the RBP register on x86-64).</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img fetchpriority="high" decoding="async" width="806" height="541" src="https://wadetregaskis.com/wp-content/uploads/2024/01/Frame-pointers-explanatory-diagram.webp" alt="Explanatory diagram of frame pointers, showing a link from the x86-64 register %rbp to the start of the current frame, which holds the prior value of %rbp that points to the top of the previous frame, and so on." class="wp-image-7549" srcset="https://wadetregaskis.com/wp-content/uploads/2024/01/Frame-pointers-explanatory-diagram.webp 806w, https://wadetregaskis.com/wp-content/uploads/2024/01/Frame-pointers-explanatory-diagram-256x172.webp 256w, https://wadetregaskis.com/wp-content/uploads/2024/01/Frame-pointers-explanatory-diagram-512x344.webp 512w, https://wadetregaskis.com/wp-content/uploads/2024/01/Frame-pointers-explanatory-diagram@2x.webp 1612w" sizes="(max-width: 806px) 100vw, 806px" /><figcaption class="wp-element-caption">Diagram <a href="https://fedoraproject.org/wiki/Changes/fno-omit-frame-pointer" data-wpel-link="external" target="_blank" rel="external noopener">courtesy of the Fedora Project</a> (specific author unknown).</figcaption></figure>
</div>


<p>The controversial part &#8211; insofar as there is any controversy &#8211; is in dedicating a CPU register to hold a frame pointer (to point to the start of the current stack frame).  It&#8217;s super convenient for a lot of things, but particularly for debuggers and profilers as it gives them a reliable and very fast way to find the top of the current callstack.  But it&#8217;s not <em>technically</em> required for the program to function.</p>



<p>No live CPU architectures, that I&#8217;m aware of, have a dedicated hardware register for frame pointers.  So you nominally have to &#8220;give up&#8221; a GPR (general-purpose register) in order to have a frame pointer.</p>
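


<p>To make the linked-list idea concrete, here is a minimal sketch of how a backtracer can walk the chain of saved frame pointers.  It is a toy model in Python, not code from any real debugger or profiler: <code>walk_frame_pointers</code>, the <code>memory</code> dictionary and every address in it are made up for illustration.  It assumes the usual x86-64 / AArch64 layout, where the frame pointer points at the saved frame pointer of the caller and the return address sits 8 bytes above it, and it assumes the outermost frame stores a zero frame pointer.</p>



<pre class="wp-block-code">
```python
# Toy model of a stack: a dict mapping an address to the 8-byte value stored there.
# Assumed layout (hypothetical, for illustration): memory[fp] holds the caller's
# frame pointer and memory[fp + 8] holds the return address into the caller.

def walk_frame_pointers(memory, frame_pointer, max_depth=64):
    """Collect return addresses by following the chain of saved frame pointers."""
    return_addresses = []
    # A zero frame pointer marks the outermost frame in this model; max_depth
    # guards against a corrupted (e.g. cyclic) chain.
    while frame_pointer != 0 and len(return_addresses) != max_depth:
        return_addresses.append(memory[frame_pointer + 8])
        frame_pointer = memory[frame_pointer]  # step to the caller's frame
    return return_addresses


# Three made-up frames: bar() called by foo() called by main().
memory = {
    0x7000: 0x0000, 0x7008: 0x1000,  # main's frame: no caller frame, returns to 0x1000
    0x6FD0: 0x7000, 0x6FD8: 0x2000,  # foo's frame: links to main's, returns to 0x2000
    0x6FA0: 0x6FD0, 0x6FA8: 0x3000,  # bar's frame: links to foo's, returns to 0x3000
}

backtrace = walk_frame_pointers(memory, 0x6FA0)
print([hex(address) for address in backtrace])  # prints ['0x3000', '0x2000', '0x1000']
```
</pre>



<p>Each step of the walk is just two memory reads, with no debug metadata involved.  That is what makes frame pointers so cheap for a sampling profiler to follow.</p>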
</div></div>



<p><a href="https://github.com/FranzBusch" data-wpel-link="external" target="_blank" rel="external noopener">Franz Busch</a> <a href="https://forums.swift.org/t/se-0419-swift-backtracing-api/69595/12" data-wpel-link="external" target="_blank" rel="external noopener">pointed out</a> that some notable software <em>still</em> ships with frame pointers omitted, e.g. apparently some major Linux distros.  I suspect it&#8217;s merely some inertia (or simply oversight) that&#8217;s delaying getting people off of that old crutch.  I&#8217;m not remotely surprised that some big Linux distros are in this bucket &#8211; they tend to be absurdly conservative and slow to change<sup data-fn="6e48e51a-6c80-46a9-b2d0-4729eb123f42" class="fn"><a href="#6e48e51a-6c80-46a9-b2d0-4729eb123f42" id="6e48e51a-6c80-46a9-b2d0-4729eb123f42-link">1</a></sup>.  And it&#8217;s mind-boggling how much vitriol restoring frame pointers generates from <a href="https://news.ycombinator.com/item?id=34632677" data-wpel-link="external" target="_blank" rel="external noopener">the peanut gallery</a>.</p>



<p>From watches to servers these days &#8211; and frankly most of the embedded space, since it&#8217;s mostly <a href="https://en.wikipedia.org/wiki/ARM_architecture_family#32-bit_architecture" data-wpel-link="external" target="_blank" rel="external noopener">ARM</a> &#8211; everything generally has an ISA with sufficiently many GPRs to negate any big benefit from omitting frame pointers.  Giving up one of 31 GPRs (for e.g. <a href="https://en.wikipedia.org/wiki/AArch64" data-wpel-link="external" target="_blank" rel="external noopener">AArch64</a>, the dominant CPU architecture family today) is pretty insignificant for the vast majority of code, because almost nothing actually uses all 31 GPRs anyway.  It only makes a significant difference<sup data-fn="188c96e5-8b0d-4e30-84f9-989822dfd065" class="fn"><a href="#188c96e5-8b0d-4e30-84f9-989822dfd065" id="188c96e5-8b0d-4e30-84f9-989822dfd065-link">2</a></sup> when the CPU design is register-starved to begin with, like <a href="https://en.wikipedia.org/wiki/IA-32" data-wpel-link="external" target="_blank" rel="external noopener">i386</a>.  And those architectures are largely dead, in museums, or restricted to <em>very</em> tiny CPUs as used in some microcontrollers (&#8220;embedded&#8221; systems).</p>



<p>Even back when i386 et al were still a concern, the proponents of <code>-fomit-frame-pointer</code> often argued not on the potential merits of the trade-off, but rather that it was a &#8220;free&#8221; performance boost, so even if it was only by a percentage point or two, why not?  They of course were either naively or deliberately overlooking the detrimental effects on debugging and profiling.</p>



<p>There may still be software for which omitting frame pointers is the right trade-off, even on modern CPUs.  But I find it hard to believe there are <em>enough</em> cases like that to warrant accommodation in standard tools.</p>



<h3 class="wp-block-heading">A brief trip back to Apple circa 2007</h3>



<p>Back in the brief window of time when i386 was a thing for the Mac (32-bit Intel, e.g. <a href="https://en.wikipedia.org/wiki/Intel_Core#Core" data-wpel-link="external" target="_blank" rel="external noopener">Core Duos</a><sup data-fn="b67d5fbb-7aa1-4c29-bc93-1341ac28771a" class="fn"><a href="#b67d5fbb-7aa1-4c29-bc93-1341ac28771a" id="b67d5fbb-7aa1-4c29-bc93-1341ac28771a-link">3</a></sup> as used in <a href="https://everymac.com/systems/apple/macbook/specs/macbook_1.83.html" data-wpel-link="external" target="_blank" rel="external noopener">the first MacBooks</a>), I was at Apple in the Performance Tools teams (<a href="https://web.archive.org/web/20100124025810/https://developer.apple.com/tools/sharkoptimize.html" data-wpel-link="external" target="_blank" rel="external noopener">Shark</a> &amp; <a href="https://help.apple.com/instruments/mac/current/#/dev7b09c84f5" data-wpel-link="external" target="_blank" rel="external noopener">Instruments</a>), and it was a frustration of ours that&nbsp;<code>-fomit-frame-pointer</code>&nbsp;<em>was</em>&nbsp;a noticeable performance-booster on the register-starved i386<sup data-fn="29afcd2a-0449-44c0-a274-0b06c9ddce8a" class="fn"><a href="#29afcd2a-0449-44c0-a274-0b06c9ddce8a" id="29afcd2a-0449-44c0-a274-0b06c9ddce8a-link">4</a></sup> architecture<sup data-fn="e0287bfa-c0ab-44b3-8fef-812983216ca6" class="fn"><a href="#e0287bfa-c0ab-44b3-8fef-812983216ca6" id="e0287bfa-c0ab-44b3-8fef-812983216ca6-link">5</a></sup>, so it was hard to just bluntly tell people not to use it… yet, by breaking the ability to profile their code, people who used it often left even&nbsp;<em>bigger</em>&nbsp;performance gains on the table (or otherwise had to invest much more labour into identifying &amp; resolving performance problems).</p>



<p>At one point there was even an Apple-internal debate about whether to abandon kernel-based profiling in favour of user-space profiling<sup data-fn="d563aa47-98d0-4761-9c0f-63194d9f7d20" class="fn"><a href="#d563aa47-98d0-4761-9c0f-63194d9f7d20" id="d563aa47-98d0-4761-9c0f-63194d9f7d20-link">6</a></sup> because <a href="https://developers.redhat.com/articles/2023/07/31/frame-pointers-untangling-unwinding#where_do_frame_pointers_fit_into_this_" data-wpel-link="external" target="_blank" rel="external noopener">implementing backtracing without frame pointers is&nbsp;<em>possible</em></a>&nbsp;but <a href="https://rwmj.wordpress.com/2023/02/14/frame-pointers-vs-dwarf-my-verdict/" data-wpel-link="external" target="_blank" rel="external noopener">very expensive</a> and requires masses of debug metadata (e.g. <a href="https://en.wikipedia.org/wiki/DWARF" data-wpel-link="external" target="_blank" rel="external noopener">DWARF</a>), making it highly unpalatable to put in the kernel. Thankfully there were too many obvious problems with user-space profiling, so that notion never really got its legs, and then x86-64 finally arrived<sup data-fn="3ad966e2-a3a1-41ee-a13a-84e58f5e8981" class="fn"><a href="#3ad966e2-a3a1-41ee-a13a-84e58f5e8981" id="3ad966e2-a3a1-41ee-a13a-84e58f5e8981-link">7</a></sup> and the whole question became moot.</p>


<ol class="wp-block-footnotes"><li id="6e48e51a-6c80-46a9-b2d0-4729eb123f42">e.g. <a href="https://wadetregaskis.com/how-to-install-imagemagick-7-for-wordpress-under-plesk-obsidian-on-ubuntu-22-04/" data-wpel-link="internal">Ubuntu <em>still</em> not officially supporting ImageMagick 7</a> even though it&#8217;s been out for nearly a decade. <a href="#6e48e51a-6c80-46a9-b2d0-4729eb123f42-link" aria-label="Jump to footnote reference 1">↩︎</a></li><li id="188c96e5-8b0d-4e30-84f9-989822dfd065">Aside from the question of register space, there <em>is</em> additional cost to implementing frame pointers, as additional instructions are required around function entry &amp; exit in order to maintain the frame pointers &#8211; to push &amp; pop them off the stack, etc.  The cost of those is usually insignificant &#8211; especially in <a href="https://en.wikipedia.org/wiki/Superscalar_processor" data-wpel-link="external" target="_blank" rel="external noopener">superscalar</a> microarchitectures, as is the norm &#8211; so that aspect is not typically the focus of the controversy. <a href="#188c96e5-8b0d-4e30-84f9-989822dfd065-link" aria-label="Jump to footnote reference 2">↩︎</a></li><li id="b67d5fbb-7aa1-4c29-bc93-1341ac28771a">Tangentially, I vaguely recall us Apple engineers kinda hating the Core Duo (Yonah), or more specifically Apple&#8217;s choice to use it.  Apple used them only for a tiny window of time, from May 2006 to about November 2006 when the Core 2 Duo (Merom) finally replaced them across the line.  I don&#8217;t recall <em>all</em> the reasons that the Core 2 Duo was superior, but they included that Core 2 Duo corrected the 32-bit regression (for Macs) and performed <em>much</em> better.  
Any time Apple releases a Mac with a dud processor in it, like those Core Duos, a lot of Apple engineers die a little inside because they know they&#8217;re going to be stuck supporting the damn things for many years even after the last cursed one rolls off the assembly line.<br><br>It&#8217;s still a mystery to me why Apple rushed the Intel transition in this regard.  They only had to wait six more months and they could have had a clean start on Intel, with no 32-bit to burden them for the next seven years. <a href="#b67d5fbb-7aa1-4c29-bc93-1341ac28771a-link" aria-label="Jump to footnote reference 3">↩︎</a></li><li id="29afcd2a-0449-44c0-a274-0b06c9ddce8a">Why do I keep calling it &#8220;i386&#8221;?  Isn&#8217;t it officially &#8220;IA-32&#8221;?  Well, yes, but that&#8217;s (a) only retroactively and (b) only ever used by Intel.  Though I guess &#8220;x86&#8221; is probably the more common name?  Yet &#8220;i386&#8221; is in my mental muscle memory.  Maybe that&#8217;s just how we used to refer to it, at Apple?  Maybe just because that&#8217;s the name used in gcc / clang arch &amp; target flags?<br><br>Incidentally, <code>clang -arch i386 -print-supported-cpus</code> on my M2 MacBook Air still lists Yonah (those damn Core Duos) as supported.  Gah!  They won&#8217;t die! 😆 <a href="#29afcd2a-0449-44c0-a274-0b06c9ddce8a-link" aria-label="Jump to footnote reference 4">↩︎</a></li><li id="e0287bfa-c0ab-44b3-8fef-812983216ca6">It&#8217;s funny how the Intel transition is now heralded as being amazing and how much better Intel Macs were than PPC Macs, but for a while there we lost a <em>lot</em> of things, like a 64-bit architecture, an excellent SIMD implementation, and the notion of more than [effectively] six GPRs. 
<img decoding="async" width="20" height="20" src="https://emoji.discourse-cdn.com/apple/stuck_out_tongue_closed_eyes.png?v=12" alt=":stuck_out_tongue_closed_eyes:"> <a href="#e0287bfa-c0ab-44b3-8fef-812983216ca6-link" aria-label="Jump to footnote reference 5">↩︎</a></li><li id="d563aa47-98d0-4761-9c0f-63194d9f7d20">There were at the time already some Apple developer tools that did user-space profiling, most notably Sampler (now a niche feature in Activity Monitor) and early versions of Instruments (in fact Instruments <em>still</em> has the Sampler plug-in which does this, although I can&#8217;t really fathom why anyone would ever intentionally use it over the Time Profiler plug-in). <a href="#d563aa47-98d0-4761-9c0f-63194d9f7d20-link" aria-label="Jump to footnote reference 6">↩︎</a></li><li id="3ad966e2-a3a1-41ee-a13a-84e58f5e8981">In the sense of <em>all</em> Macs adopting it, not just the Mac Pro.  It was easy to ignore i386 at that point because it was then all but officially a dead architecture as far as Apple were concerned. <a href="#3ad966e2-a3a1-41ee-a13a-84e58f5e8981-link" aria-label="Jump to footnote reference 7">↩︎</a></li></ol>]]></content:encoded>
					
					<wfw:commentRss>https://wadetregaskis.com/fomit-frame-pointer/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			<media:content url="https://wadetregaskis.com/wp-content/uploads/2024/01/Frame-pointers-explanatory-diagram.webp" medium="image" />
<post-id xmlns="com-wordpress:feed-additions:1">7536</post-id>	</item>
	</channel>
</rss>
