btown

Born on February 25, 2012•19200 Karma

13 hours ago

•on: Incident with Pull Requests, Issues, Git Operation...

Is the “streak” days of continuous uptime, or of days with at least one downtime incident? I think it’s the latter :]

13 hours ago

It looks like it is the number of consecutive days with no incident. If you look at 31 Dec 2025, that corresponds to an 8-day period with no incidents.

isityettime•

13 hours ago

I guess that also means this year GitHub has not yet made it a single week without an outage of some kind.

gen220•

12 hours ago

It's a streak for continuous uptime, and yeah it is fairly depressing to imagine overseeing that :/

btown•

1 day ago

•on: Cloudflare Flagship

Never underestimate the power of a zero-network-hop abstraction over f(feature_name, context).

And context can be extremely tailored to your niche: specific inventory, from a specific supplier, for a specific user of a specific B2B client of a specific business model subtype, who should or shouldn’t see certain features on that specific inventory at certain times.

When you can write your own logic, and just run this in a tight loop as easily and performantly as you can use a constant, it makes your business incredibly agile. Think some text might change for some customers? Just write the code to make it configurable, and you get tests and flags for free.

Sadly, that zero-hop setup requires a sophisticated client execution engine, which it doesn’t appear Cloudflare has implemented here. Makes sense for their memory constrained workers, less sense for traditional infrastructure.

Statsig has an approach here that I quite like:

> To be able to do this, Server SDKs hold the entire ruleset of your project in memory - a representation of each gate or experiment in JSON. On client SDKs, we evaluate all of the gates/experiments when you call initialize - on our servers.

https://docs.statsig.com/sdks/how-evaluation-works

You can also roll your own - just sync your rulesets to a few data structures every few seconds in a background thread and atomically swap the reference to them. Then you just need a CRUD interface over the applicability ruleset dimensions.
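A minimal sketch of that roll-your-own approach, assuming rules are plain Python callables keyed by feature name. `RulesetCache`, `fetch`, and `evaluate` are hypothetical names for illustration, not part of any SDK mentioned here:

```python
import threading
from typing import Any, Callable, Mapping


class RulesetCache:
    """Holds a ruleset snapshot that a background thread replaces wholesale."""

    def __init__(self, fetch: Callable[[], Mapping[str, Any]], interval_s: float = 5.0):
        self._fetch = fetch                  # hypothetical loader (DB read, HTTP call, ...)
        self._interval_s = interval_s
        self._rules: Mapping[str, Any] = dict(fetch())  # initial synchronous load
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread.is_alive():
            self._thread.join()

    def refresh(self) -> None:
        # Build the new snapshot completely, then swap the reference in a
        # single assignment. Readers see the old dict or the new one,
        # never a half-updated one.
        new_rules = dict(self._fetch())
        self._rules = new_rules

    def _run(self) -> None:
        # Event.wait returns False on timeout, True once stop() is called.
        while not self._stop.wait(self._interval_s):
            try:
                self.refresh()
            except Exception:
                pass  # keep serving the last good snapshot if a sync fails

    def evaluate(self, feature_name: str, context: Mapping[str, Any], default: Any = None) -> Any:
        rules = self._rules                  # one local reference: no lock, no network hop
        rule = rules.get(feature_name)
        if rule is None:
            return default
        return rule(context) if callable(rule) else rule
```

Call `start()` once at process startup; after that `evaluate(feature_name, context)` is just a dict lookup plus a function call, which is what makes it cheap enough to run in a tight loop.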

Just be careful to have governance on who can play with which would-be constants. Great power and great responsibility and all that!

la_fayette•

18 hours ago

When reading your comment, it just reminds me of how feature flags can be misused as application configuration/customization. An antipattern I have already observed at various organizations.

For me feature flags go along with trunk based development to enable features in QA settings, but not on PROD yet, for PO/PM testing. Trunk based development allows for fast/easy devops, without complicated branching strategies.

Application configuration is, for me, part of the application and has the business context for customizing the application accordingly. Not sure if there are specific frameworks/tools out there. But one should clearly distinguish these two.

baq•

17 hours ago

> it just reminds me of how feature flags can be misused as application configuration/customization. An antipattern I have already observed at various organizations.

feature flags are perfect for configuration and customization, why using them for this purpose is 'misuse' is beyond me and I've heard this claim from multiple people. they're literally configuration. feature with a flag to turn it on, off or give the flag a value. where's the misuse? is it a problem I'm not running experiments when switching over redis to valkey or whatever?

ZephyrBlu•

16 hours ago

Feature flags need to be treated as short-lived and experimental otherwise they end up getting abused for everything and make it very difficult to reason about your application.

If it's config/customization, it should be in code. If it's experimental it can be a flag until it solidifies, and then it needs to get moved to code.

When I was at Shopify a couple of years ago they mandated that feature flags had to be short-lived (like 2-4w lifetime tops, some had exceptions) because otherwise they would be left in the code for months, or never cleaned up at all. Hard to tell if it's genuinely a "feature flag" or actually just a normal part of the system at that point.

Feature flags being flipped in prod was also a major source of incidents, in part because people didn't treat them as experimental and with the associated risk profile of something experimental.

The only exception where having long-lived flags was useful and required was for operational killswitches (E.g. disable Apple Pay because it's having issues), but that is explicitly not application config.

WhitneyLand•

14 hours ago

Agreed.

This is the kind of design wisdom that’s both true and difficult to win an argument over.

It reminds me of arguments related to over-engineering and complexity. The principles are super important to having a codebase that scales and continues to be efficient to work in as the team grows, but they are hard to objectively measure.

Locally or in isolation something may sound like a great idea. Being able to step back and see the greater ripple effects requires some experience and intuition that can’t always be used to convince people otherwise.

baq•

16 hours ago

I disagree that just about anything you listed is a problem, except that the process of cleaning up is absolutely required.

Notably feature flags triggering incidents is expected and desired vs the alternative of shipping the code and having to roll a release back because there is no other way to remove the feature from prod.

ZephyrBlu•

16 hours ago

In a company the size of Shopify people flipping their feature flags would very often impact *other teams*, and like I said feature flags got abused, with even seemingly innocuous changes being put behind them or being left for long periods of time before being fully used.

When someone else flips a flag that impacts your team and they have no idea they even caused a problem, it becomes very difficult to resolve the issue. Usually you can check for recent deploys; instead you have to go and guess which recently flipped feature flag could possibly be affecting your code. I experienced this several times.

Also, it was actually more desirable for most of these things to go straight to production. Test it properly before shipping, then when you ship it soaks on a 5% traffic canary at which point you can monitor and cancel the deploy if you see errors. That is generally safer than a feature flag rollout unless you are doing something very high impact/risk, in large part because it gives any other team affected by your rollout the ability to respond and be able to easily find the source of errors.

In my org it was a fairly common failure mode to ship something and accidentally cause an issue for another team. Usually it was other teams/orgs shipping things that impacted us.

echelon•

16 hours ago

Runtime evaluated feature flags can always be used for control plane levers and emergency handbrakes.

You just have to label them as such and prevent other teams from fiddling with them.

This is not an antipattern, it's just semantic hand-wringing.

My team managed critical systems in the online flow of billions of dollars of daily payment volume. We also wrote the feature flag system that the rest of the company used. Not only were we completely fine with feature flags as long-lived control plane levers, we heavily used the system that way ourselves.

You just have to clearly distinguish between ephemeral rollout flags (and clean them up or expire them) and the permanent control plane levers.

It's the exact same functionality for both sets of tools. Just different practices around the two usages.
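One way to encode that distinction, as a sketch only: the same flag record type for both usages, with an expiry required for rollout flags and an owning team recorded for every flag. `FlagKind`, `FlagSpec`, `can_flip`, and `overdue` are hypothetical names, not taken from any system described in this thread:

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class FlagKind(Enum):
    ROLLOUT = "rollout"         # ephemeral: must carry an expiry date
    CONTROL_LEVER = "control"   # permanent: kill switches, emergency handbrakes


@dataclass(frozen=True)
class FlagSpec:
    name: str
    kind: FlagKind
    owner: str                  # the only team allowed to flip this flag
    expires: Optional[date] = None

    def __post_init__(self) -> None:
        # Rollout flags without an expiry are rejected at definition time.
        if self.kind is FlagKind.ROLLOUT and self.expires is None:
            raise ValueError(f"rollout flag {self.name!r} needs an expiry date")


def can_flip(flag: FlagSpec, team: str) -> bool:
    """Only the owning team may flip a flag."""
    return team == flag.owner


def overdue(flags: list[FlagSpec], today: date) -> list[str]:
    """Names of rollout flags past their expiry, i.e. cleanup candidates."""
    return [
        f.name
        for f in flags
        if f.kind is FlagKind.ROLLOUT and f.expires is not None and f.expires < today
    ]
```

Running `overdue(...)` in CI is one possible way to enforce the cleanup half of the practice; the evaluation mechanism itself stays identical for both kinds.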

ZephyrBlu•

15 hours ago

I completely agree with your distinction and that is exactly what they mandated :)

I don't think that is what most people colloquially mean by "feature flags" though. Even most teams in Shopify abused "ephemeral" flags for long periods of time.

When they rolled out the mandate it was very annoying for my team because we had a lot of operational flags like you're describing that we needed to get exemptions for.

jeremyjh•

16 hours ago

One well known issue is that when you have a lot of separate feature flags that can interact, you explode the number of test cases you have to cover. For example if you have three on/off feature flags that can interact in a module that has 100 test cases, you actually have 800 test cases (2^3 = 8 combinations) if you are going to test with each possible combination of flags. Many teams don't test them all because they "already know" that doesn't apply here, and find out in production which combination of feature flags is unworkable.
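The growth is easy to check, assuming each flag is a simple on/off boolean: every extra flag doubles the number of configurations, so three flags give 2^3 = 8 combinations per test case. The flag names below are placeholders:

```python
from itertools import product


def flag_combinations(flag_names):
    """Yield every on/off assignment for the given boolean flags."""
    for values in product([False, True], repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


flags = ["flag_a", "flag_b", "flag_c"]   # hypothetical flag names
combos = list(flag_combinations(flags))

base_cases = 100                          # test cases in the module
total_runs = base_cases * len(combos)     # full coverage across all combinations
```

A generator like this can feed a parametrized test suite, which at least makes the decision to skip combinations an explicit one.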

akoboldfrying•

5 hours ago

But you have this same issue if you store those 3 bits of state as "app config" instead of as "feature flags".

I think it's useful to distinguish between kinds of feature flags -- "traditional" feature flags for safe rollout of new features (these should be removed on a strict timeline, to preserve maintainability) vs. "config"-type flags that are designed to remain indefinitely and optionally be configurable by certain end users. But I don't yet see a reason why the actual mechanism for these two things (namely, a function with a descriptive name that can be called to quickly return a small datum that is periodically refreshed from somewhere in the background) can't be the same.

jeremyjh•

3 hours ago

Yes, people are speaking past each other to some extent here; but if we're going to talk about "Feature Flags" we aren't talking about using a flag service for "normal" configuration, whatever that means. But what, exactly, does that mean? I normally consider things like the following configuration:

API host names, client IDs, secrets, database connection strings

Other runtime settings like pool sizes, replica counts, etc.

Most of those things are secrets, or they are specific scalar values closely tied to the application runtime, that often need to be known at startup. Are people putting those in a "feature flag" service? If not, what is a good example?

My comment only applies to "configuration" that alters behavior, usually in binary off/on manner. Which is what I'd call a "Feature Flag".

chambers•

11 hours ago

Yes, feature flags are conflated with remote configs (or its more useful variety: "dynamic configs"). The difference is subtle, hence why people are talking past each other.

Feature flags are gates for whether a piece of code runs; basically, an if-condition. Remote configs are a mechanism for changing runtime values without redeploying[1].

For example:

  # Feature flag — variant gate for rollout
  flag = sdk.check_gate(user, "checkout_flow")
  if flag == 'open':
      render_new_checkout()
  elif flag == 'warning':
      render_warning_checkout()
  else:
      render_old_checkout()

  # Raw remote config pulled — structured values for tuning behavior
  config = sdk.get_config(user, "checkout_settings") # if the config changes based on user or context, this "remote" config is considered "dynamic"
  timeout_ms   = config.get("timeout_ms", 5000)
  max_items    = config.get("max_items", 50)
  allowed_tlds = config.get("allowed_tlds", [".com", ".org"])

In practice, feature flags are implemented on top of dynamic configs[2] to manage the temporary lifecycle of a feature — aka, ship a new block of code, ramp its execution up to 100%, then delete the flag. Whereas dynamic configs are a deeper primitive meant for semi-permanent/safer operations like tuning rate limits or changing text copy on a marketing website.

As I've seen it: the forcing function that separates the concepts is experimentation platforms: when human-control of feature flags is shared (via dynamic configs) with automated & randomized assignments. That's how Statsig built their system and, in part, why they could sell for a billion. Whereas companies that ignored the difference, like LaunchDarkly, struggled outside of feature flags.

[1] https://engineering.atspotify.com/2020/10/spotifys-new-exper...

[2] https://docs.statsig.com/dynamic-config/overview https://blog.x.com/engineering/en_us/topics/infrastructure/2...

tailscaler2026•

9 hours ago

I think feature flags, remote configs, and experiments are all the same thing. Semantically they differ in how you're applying the config and interpreting the outcomes.

epolanski•

17 hours ago

> it just reminds me of how feature flags can be misused as application configuration/customization

They literally are configuration.

tedk-42•

16 hours ago

Oh yeah let's make a web request per service invocation to figure out what to serve for the invocation!

Guys this is exactly the kind of banal crap that makes a simple app into a monstrous beast that won't work unless it's connected to the internet.

epolanski•

14 hours ago

There's no web request per service invocation.

Feature flags are set once at startup (or specific events like hard refresh, or new login) and then simply included in the request headers.

It's not rocket science, but I'm sure people are free to overcomplicate it.

tedk-42•

3 hours ago

That's not a feature flagging service then (config as a service! not a thing really...)

I've done both client and server side implementations of the LaunchDarkly SDK and that's how it's done to know client context.

If you're initialising the entire SDK only to load 1 set of configuration items, I'd argue you can host the config as a JSON file on a CDN and be done with it - feature flagging is overkill.

julik•

12 hours ago

Which is not hard to do (it is a modulo over a Mersenne Twister or something similar), but in my recent gigs just Flipper with an optional "state of the flags table as of now" endpoint was more than enough. That modulo+random combo required tools like LaunchDarkly to ship SDKs in several languages, and the ones I had to work with were just a plain horrible fit for their language of choice. But because the evaluation was relegated to the edge, the whole system got way more complex than desirable. In actuality, I think a refetch of the current flags table "for this customer" every so often is just fine, and way less of a nuisance.
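For readers who haven't seen it, the modulo trick can be sketched roughly like this. Note the swap: this version uses a SHA-256 hash of the flag name and actor ID instead of a Mersenne Twister, and `bucket` / `is_enabled` are illustrative names, not any vendor's API:

```python
import hashlib


def bucket(flag_name: str, actor_id: str, buckets: int = 100) -> int:
    """Map (flag, actor) to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(f"{flag_name}:{actor_id}".encode("utf-8")).digest()
    # Take the first 8 bytes as an integer, then reduce modulo the bucket count.
    return int.from_bytes(digest[:8], "big") % buckets


def is_enabled(flag_name: str, actor_id: str, rollout_percent: int) -> bool:
    """True when the actor's bucket falls under the rollout percentage."""
    return bucket(flag_name, actor_id) < rollout_percent
```

Because the bucket depends only on the two strings, every process and every language port that implements the same hash gives the same answer, and an actor enabled at 10% stays enabled when the rollout is raised to 50%. Keeping those ports byte-for-byte consistent is presumably where the multi-language SDK burden comes from.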

So glad Flipper exists and I don't have to deal with this stuff anymore.

bobthepanda•

12 hours ago

which Flipper is this?

hobofan•

20 hours ago

> Sadly, that zero-hop setup requires a sophisticated client execution engine, which it doesn’t appear Cloudflare has implemented here.

It doesn't have to be sophisticated and they don't need to implement it themselves. They piggy-back on OpenFeature where the client libraries have a simple targeting rule evaluation engine integrated.

chrisweekly•

23 hours ago

Good advice. I'll add a protip / reminder that feature flags, AB tests, and entitlements are three distinct concepts. This blog post (no affiliation) has framing I found helpful:

https://www.stigg.io/blog-posts/entitlements-untangled-the-m...

BatteryMountain•

20 hours ago

Amazing resource, thanks!

ZeWaka•

20 hours ago

Statsig has worked great at my work, really polished and rich feature set. Their tooling to identify unused flags as candidates for removal is neat.

The per-seat billing we have in our agreement is a bit rough but it's workable.

pil0u•

17 hours ago

Statsig is a half-baked product bought out by OpenAI for data harvesting. We already reported 2 documentation issues and 1 critical technical issue, and we're barely using it.

iancarroll•

11 hours ago

Well, OpenAI already sold it (but kept the team), so it’s in someone else’s hands now.

swyx•

22 hours ago

> Sadly, that zero-hop setup requires a sophisticated client execution engine, which it doesn’t appear Cloudflare has implemented here. Makes sense for their memory constrained workers, less sense for traditional infrastructure.

wait what? what kind of logic do you need to do that CF Workers can't do?

rustystump•

21 hours ago

Could you be more specific?

btown•

1 day ago

•on: Stack Overflow’s forum is dead but the company’s s...

In theory reasoning tokens should do the equivalent of this - explicitly create options outside of the quick-response probability space, so those can guide future generation.

In practice, models that do this won't be prioritized as much, because the economics of thinking tokens that stop by default at, say, one option plus a bit more planning (short of full alternatives) would be superior as long as billing is per-user instead of per-token. So we'll still need to play games with prompting!

tliltocatl•

1 day ago

Without continuous feedback from the real world, lower-probability tokens (and soon high-probability ones as well) will be complete garbage.

btown•

2 days ago

•on: 2026 HIPAA Security Rule Update

It's worth noting that cybersecurity requirements can be a mechanism of control.

As a government regime, do you want to build an effective surveillance system where health data on large numbers of suspects can be pulled into a data fusion system at the push of a button, once a judicial framework for rubber-stamping is in place? And do you want to be able to pressure vendors into not supporting certain types of research/analysis and even direct patient care that could be construed/presented as counter to the regime's goals?

Both of these are easier when smaller vendors are forced out and larger vendors are the only ones left standing. As such, regulatory capture becomes a mutually beneficial tool to dominant vendors and regulators alike.

There are few coincidences when lobbying is involved. Which is not to say that cybersecurity improvements aren't a good thing! But speed and mechanisms of required rollout need to be balanced. And with the numerous signatories of [0] opposing the rule and describing "unreasonable implementation timelines," it's hard to say that this is entirely done in the interest of patients.

[0] https://assets.ctfassets.net/opszt4tga0mx/4QrJlGP2EkCiZjgvGx... (2025)

deathanatos•

1 day ago

Your comment is essentially borderline conspiracy theory that HIPAA is somehow setting up a surveillance state.

> As a government regime, do you want to build an effective surveillance system where health data on large numbers of suspects can be pulled into a data fusion system at the push of a button, once a judicial framework for rubber-stamping is in place?

Sure, and I'm right there with you that people should protest frameworks for judicial rubber-stamping. But HIPAA is like the only privacy law in America, basically, and having it mandate that medical data is encrypted can be good on its own.

While there are standardized formats for medical data, many are so ill-adopted that building some sort of surveillance system would be a monumental task; the bulk of data I've worked with has been in poorly documented, non-standard formats.

> Both of these are easier when smaller vendors are forced out and larger vendors are the only ones left standing

Clearer regulations and standardized, interoperable data formats benefit smaller players.

btown•

2 days ago

•on: Search engines alternatives now that Google isn't ...

Surprised not to see a mention of https://tenbluelinks.org/ here.

Google still maintains a web search mode that's free of AI overviews/chat exhortations (as well as ads, if you use an ad blocker). https://www.google.com/search?q=foo&udm=14 is the format of the search URL, and tenbluelinks has instructions on how to use it as your default engine on various platforms.

That said, I've stopped using this as a founder. While I personally like the web search results more (if I wanted synthesis of results, I'd use dedicated agentic-loop-capable tools that are a hotkey away), it's far more important to understand (and empathize with) our users' experiences, good and bad, when they use Google in its full AI extravagance in practice.
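For scripting or a custom search shortcut, the `udm=14` URL format mentioned above can be built with the standard library. `web_only_search_url` is just an illustrative helper name:

```python
from urllib.parse import urlencode


def web_only_search_url(query: str) -> str:
    """Build a Google search URL that carries the udm=14 parameter."""
    return "https://www.google.com/search?" + urlencode({"q": query, "udm": "14"})
```

`urlencode` takes care of escaping, so spaces and punctuation in the query are safe to pass through.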

btown•

6 days ago

•on: Show HN: Rmux – A programmable terminal multiplexe...

Very cool! I think the hype around “agents are so good that you never actually need to see the underlying commands they are running or interact with the terminal session that they’re running” misses out on a lot of very important use cases, particularly around long running processes that may be shared across multiple agents. It will be very cool to see how best practices evolve!

shideneyu•

6 days ago

Same! TUIs are becoming more and more mainstream, so there is a need for automated multiplexing, used not by humans but by programs (and agents).

btown•

7 days ago

•on: An OpenAI model has disproved a central conjecture...

The parent poster isn't saying "advancement of knowledge" is some kind of universal goal for humanity at the cost of all else - and I would agree that it shouldn't be. They're suggesting that as an individual studying pure mathematics, the discovery of new truth is a self-consistent good.

Even taking a purely Kantian interpretation that would scale this beyond mathematicians - and that itself is a logical leap! - making a universal law out of "a discovery can be beautiful regardless of whether created by humans or AI" is much more specific than the straw extrapolation you've created.

btown•

7 days ago

•on: Goodbye Visa and Mastercard: 130M Europeans switch...

For the uninitiated: https://knowyourmeme.com/memes/poob-has-it-for-you

iamtheworstdev•

7 days ago

also relevant is the end of this SNL sketch from this last weekend - https://www.youtube.com/watch?v=nS97AzfKp3U

spking•

7 days ago

And this SNL sketch from a few decades ago:

https://youtu.be/NWIlScfHwOU?si=64xMCQf8MHtho44H

tantalor•

7 days ago

This one still makes me laugh!

shabgzer•

7 days ago

How mind-numbingly lame. Just another dime a dozen meme.

btown•

8 days ago

•on: Mini Shai-Hulud Strikes Again: 314 npm Packages Co...

At a certain point, is it better to just turn off Dependabot and freeze all NPM packages (minor/patch version and all), rather than continuously update? Particularly for frontend packages, meaningful security fixes seem less likely than supply chain attacks these days.

It's a sad state of affairs, for sure - but is there a reason we can't just switch our frontends to static BOMs, and trust that NPM at least gets their "you can't republish to an old version" bare-minimum constraint right?

Sohcahtoa82•

8 days ago

> At a certain point, is it better to just turn off Dependabot and freeze all NPM packages (minor/patch version and all), rather than continuously update?

But then the compliance team gets annoyed because some CVE with a CVSS score of 3.1 that has a patch available sits unfixed.

btown•

8 days ago

I wonder if the only thing that will solve this is an insurer or regulator saying that: "A system that automatically pulls updates for dependencies without human review, where said updates are not protected by multi-factor authentication by their respective maintainers, shall not be considered secure."

That would wake NPM up at least to the notion that it's absolutely reasonable to require OSS maintainers to press a button on their phones when releases go out, and that's a good thing not a bad thing.

DeliciousSeaCow•

8 days ago

Pretty much what the EU Cyber Resilience Act says re: OSS due diligence, actually.

lovich•

8 days ago

The core problem is companies skimping on maintenance.

They don’t want to pay engineers to do the analysis manually, and they don’t want to pay for someone to figure out a better automated system.

Would anyone be surprised at their car having problems if they cheaped out on oil changes?

tedd4u•

8 days ago

Enforce a “seasoning” period, for example don’t let any pull fetch versions newer than 30 days. Perhaps with an exception for versions that address known CVEs.
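A rough sketch of what that selection rule could look like, assuming you already have publish dates for each version and some external signal for "fixes a known CVE". The `Release` type and the `fixes_known_cve` field are hypothetical, not part of any package manager's API:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Release:
    version: str
    published: datetime
    fixes_known_cve: bool = False   # hypothetical flag, e.g. fed from an advisory database


def newest_allowed(
    releases: list[Release],
    now: datetime,
    min_age: timedelta = timedelta(days=30),
) -> Optional[Release]:
    """Pick the most recently published release that is seasoned or fixes a CVE."""
    eligible = [
        r for r in releases
        if now - r.published >= min_age or r.fixes_known_cve
    ]
    # max() with default=None covers the case where nothing is eligible yet.
    return max(eligible, key=lambda r: r.published, default=None)
```

The CVE exception is the delicate part: whatever feeds `fixes_known_cve` has to be more trustworthy than the package's own release notes, or it becomes a way around the waiting period.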

peterldowns•

8 days ago

Yes. This is partially why other ecosystems don’t see as many supply chain attacks.

zahlman•

8 days ago

> and trust that NPM at least gets their "you can't republish to an old version" bare-minimum constraint right?

... Does NPM not create full lockfiles, with hashes and pinned transitive dependencies and everything?

btown•

8 days ago

Yes, and the problem here is that most projects have automated systems that automatically update those lockfiles on every upstream release of a library, under an assumption that minor releases are either security patches or bugfixes that would immediately be useful to the consuming project.

IMO this is built on a pre-ShaiHulud, pre-AI set of assumptions, and should be evaluated from first principles with today's security situation.

zahlman•

7 days ago

My point was the "with hashes" part. You aren't in fact "trusting" NPM to ensure that old versions aren't replaced if the package installer is verifying the hash.
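To make the hash point concrete: lockfile entries carry an `integrity` string in Subresource Integrity style (`sha512-<base64 digest>`), and an installer can compare the downloaded tarball against it. The check below is a simplified sketch, not npm's actual code, and it assumes a single SHA-2 hash per string:

```python
import base64
import hashlib
import hmac


def matches_integrity(tarball: bytes, integrity: str) -> bool:
    """Check bytes against an SRI-style string such as 'sha512-<base64 digest>'."""
    algorithm, _, expected_b64 = integrity.partition("-")
    if algorithm not in {"sha256", "sha384", "sha512"} or not expected_b64:
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet.
        expected = base64.b64decode(expected_b64, validate=True)
    except ValueError:
        return False
    actual = hashlib.new(algorithm, tarball).digest()
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(actual, expected)
```

If the registry ever served different bytes for an already-locked version, this comparison would fail at install time, regardless of what the registry claims about that version.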

esafak•

8 days ago

https://docs.npmjs.com/cli/v11/configuring-npm/package-lock-...

btown•

9 days ago

•on: Stratum: System-Hardware Co-Design with 3D-Stackab...

For those confused by the headline, HN changed it from MoE to Moe.

Though, thinking about "efficient moe" I'm fondly reminded of projects like https://make.girls.moe/ (2017) - "Towards the Automatic Anime Characters Creation with Generative Adversarial Networks" [sic] https://arxiv.org/abs/1708.05509

A simpler time, for sure.

rbanffy•

9 days ago

Now that you mentioned it, I’m thinking about the Three Stooges

Or Billy Idol: “In the midnight hour, she cried Moe, Moe, Moe”