purplesyringa

"AI discourse" is a joke

July 14, 2025 Hacker News

In contemporary “AI” discourse, people often make a point that LLM output cannot be trusted, since it contains hallucinations, often doesn’t handle edge cases properly, causes vulnerabilities, and so on. This is seen as an argument to never use LLM-generated code in production. Others argue that the benefits AI grants them are worth the risk.

These groups are talking past each other. The problem was never about AI, it’s only the catalyst. To discuss what problems AI causes in software development is to completely miss the point, since those same arguments have been milked to death even before LLMs were a thing.

Hidden complexity in software development

July 2, 2025 Reddit

This is a tech phenomenon that I keep getting blindsided by no matter how much I try to anticipate it.

Physical work feels difficult. You can look at someone and realize you don’t have nearly as much stamina, and even if you did, it still feels demanding.

Research feels difficult. You’re tasked with thinking about something no one else has considered yet. That rarely happens even outside of science – try to tell a unique joke.

But non-algorithmic programming? You’re telling a machine that precisely follows instructions what you want it to do. At best, you’re a technical translator. You’re not working towards a PhD degree. You’re just wiring things together without creating anything intrinsically new. It looks simple, and so it feels easy.

lol. lmao, even.

Experience shows that it’s anything but easy, but it’s always been hard for me to pinpoint exactly why that is the case. And I think I’ve finally found a good answer.

Splitting independent variables without SSA

June 15, 2025

I’m making progress on the Java decompiler I’ve mentioned in a previous post, and I want to share the next couple of tricks I’m using to speed it up.

Java bytecode is a stack-based language, and so data flow is a bit cursed, especially when the control flow is complicated. I need to analyze data flow globally for expression inlining and some other stuff. Static single assignment produces basically everything I need as a byproduct… but it’s not very fast.

For one thing, it typically mutates the IR instead of returning data separately, and the resulting IR has imperative code mixed with functional code, which is a little unpleasant to work with. SSA has multiple implementations with very different performance characteristics and conditions, and each of them forces me to make a tradeoff I’m not positive about.

Fast limited-range conversion between ints and floats

June 7, 2025

This post is about a popular but niche technique I can never find a succinct reference for. I didn’t invent it, I just need a page I can link when giving optimization advice.

Integer $\leftrightarrow$ float casts that utilize specialized processor instructions, i.e. those that compilers use by default, typically have worse throughput and higher latency than alternatives based on applying bit tricks to the IEEE-754 format. (Please benchmark them anyway, I’ve seen them decrease performance occasionally.) Unfortunately, these bit tricks only work over a reduced range, e.g. numbers up to $2^{23}$ or $2^{52}$ as opposed to the full $2^{32}$ or $2^{64}$ range. Still, they can be very useful in specialized tasks.
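To make the idea concrete, here is a minimal sketch of the $2^{52}$ variant for 64-bit floats. The helper names `u52_to_f64` and `f64_to_u52` are made up for illustration, and Python is used only to show the bit layout: any speedup would only show up in a compiled language, where the `struct` round trip becomes a free bit cast.

```python
import struct

MAGIC_BITS = 0x4330000000000000  # bit pattern of 2.0 ** 52: exponent 1075, mantissa 0
MAGIC = 2.0 ** 52

def u52_to_f64(n: int) -> float:
    """Convert an integer in [0, 2**52) to a float without an int-to-float cast."""
    assert 0 <= n < 1 << 52
    # OR-ing n into the empty mantissa of 2**52 yields exactly 2**52 + n.
    (x,) = struct.unpack("<d", struct.pack("<Q", MAGIC_BITS | n))
    return x - MAGIC

def f64_to_u52(x: float) -> int:
    """Round a float in [0, 2**52 - 1] to the nearest integer (ties to even)."""
    assert 0.0 <= x <= MAGIC - 1
    # Floats in [2**52, 2**53) are spaced 1 apart, so the addition does the rounding
    # and leaves the integer in the mantissa bits.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x + MAGIC))
    return bits & ((1 << 52) - 1)

print(u52_to_f64(123456789), f64_to_u52(7.3))
```

Note that the float-to-int direction rounds to nearest rather than truncating, which is one more reason to check that the trick matches what the surrounding code expects.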

Recovering control flow structures without CFGs

June 6, 2025 Hacker News Reddit Lobsters

I’m working on a Java decompiler because I’m not satisfied with the performance of other solutions. I’ve always heard that decompiling JVM bytecode is a solved problem, but I’ve concluded that the decompilation methods used by CFR and Vineflower are hacky, inefficient, and sometimes don’t even work. The existing solutions are haphazard and inadequate compared to alternative approaches.

Specifically, I have beef with the control flow extraction strategies employed by most decompilers. I haven’t tackled decompilation as a whole yet, but I’ve found an approach to control flow recovery that works in isolation, is quite modular, and addresses common frustrations. I don’t claim to be the first person to think of this method, but I haven’t seen it mentioned anywhere, so this post describes it hoping that it’s useful to someone else.

Why performance optimization is hard work

April 29, 2025 Hacker News Reddit

I’m not talking about skill, knowledge, or convincing a world focused on radical acceleration that optimization is necessary. Performance optimization is hard because it’s fundamentally a brute-force task, and there’s nothing you can do about it.

This post is a bit of a rant on my frustrations with code optimization. I’ll also try to give actionable advice, which I hope enhances your experience.

Falsehoods programmers believe about null pointers

January 30, 2025 Hacker News Reddit

Null pointers look simple on the surface, and that’s why they’re so dangerous. As compiler optimizations, intuitive but incorrect simplifications, and platform-specific quirks have piled on, the odds of making a wrong assumption have increased, leading to the proliferation of bugs and vulnerabilities.

This article explores common misconceptions about null pointers held by many programmers, starting with simple fallacies and working our way up to the weirdest cases. Some of them will be news only to beginners, while others may lead experts down the path of meticulous fact-checking. Without further ado, let’s dive in.

The RAM myth

December 19, 2024 Reddit Hacker News

The RAM myth is a belief that modern computer memory resembles perfect random-access memory. Cache is seen as an optimization for small data: if it fits in L2, it’s going to be processed faster; if it doesn’t, there’s nothing we can do.

Most likely, you believe that code like this is the fastest way to shard data (I’m using Python as pseudocode; pretend I used your favorite low-level language):

groups = [[] for _ in range(n_groups)]
for element in elements:
    groups[element.group].append(element)

Indeed, it’s linear (i.e. asymptotically optimal), and we have to access random indices anyway, so cache isn’t going to help us in any case.

In reality, when the number of groups is high, this is leaving a lot of performance on the table, and certain asymptotically slower algorithms can perform sharding significantly faster. They are mostly used by on-disk databases, but, surprisingly, they are useful even for in-RAM data.

Thoughts on Rust hashing

December 12, 2024 Reddit IRLO

In languages like Python, Java, or C++, values are hashed by calling a “hash me” method on them, implemented by the type author. This fixed-size hash is then immediately used by the hash table or what have you. This design suffers from some obvious problems, like:

How do you hash an integer? If you use a no-op hasher (booo), DoS attacks on hash tables are inevitable. If you hash it thoroughly, consumers that only cache hashes to optimize equality checks lose out on performance.

Any Python program fits in 24 characters*

November 17, 2024

* If you don’t take whitespace into account.

My friend challenged me to find the shortest solution to a certain Leetcode-style problem in Python. They were generous enough to let me use whitespace for free, so that the code stays readable. So that’s exactly what we’ll abuse to encode any Python program in $24$ bytes, ignoring whitespace.

The Rust Trademark Policy is still harmful

November 10, 2024 Reddit

Four days ago, the Rust Foundation released a new draft of the Rust Language Trademark Policy. The previous draft caused division within the community several years ago, prompting its retraction with the aim of creating a new, milder version.

Well, that failed. While certain issues were addressed (thank you, we appreciate it!), the new version remains excessively restrictive and, in my opinion, will harm both the Rust community as a whole and compiler and crate developers. While I expect the stricter rules to not be enforced in practice, I don’t want to constantly feel like I’m under threat while contributing to the Rust ecosystem, and this is exactly what it would feel like if this draft is finalized.

Below are some of my core objections to the draft.

Bringing faster exceptions to Rust

November 6, 2024 Reddit Hacker News

Three months ago, I wrote about why you might want to use panics for error handling. Even though it’s a catchy title, panics are hardly suited for this goal, even if you try to hack around with macros and libraries. The real star is the unwinding mechanism, which powers panics. This post is the first in a series exploring what unwinding is, how to speed it up, and how it can benefit Rust and C++ programmers.

We built the best "Bad Apple!!" in Minecraft

October 10, 2024 Hacker News

Demoscene is the art of pushing computers to perform tasks they weren’t designed to handle. One recurring theme in demoscene is the shadow-art animation “Bad Apple!!”. We’ve played it on the Commodore 64, Vectrex (a unique game console utilizing only vector graphics), Impulse Tracker, and even exploited Super Mario Bros. to play it.

But how about Bad Apple!!.. in Minecraft?

Minecraft compares arrays in cubic time

September 14, 2024 Telegram

Collisions in games are detected by heavy algorithms. For example, try to imagine how complicated this is for just two arbitrarily rotated cubes in space. They can touch along two edges, at a vertex and a face, or in some even more complicated way.

In Minecraft, all hitbox geometry is axis-aligned, i.e. nothing is ever tilted. This greatly simplifies collision detection.

I would write this simply. Since a block’s hitbox is a union of several cuboids, it can be stored exactly like that: as a list of 6-element tuples. In the vast majority of cases, this list will be very short. For ordinary cubes its length is 1, for glass panes it can reach 2, the anvil, oh gods, consists of 3 elements, and walls can have as many as 4. To check two hitboxes for intersection, it is enough to iterate over pairs of cuboids from the two hitboxes (it seems there can be at most 16 such pairs). For axis-aligned cuboids, the problem is trivial.
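A minimal sketch of that list-of-boxes representation, again with Python as pseudocode. The `(min_x, min_y, min_z, max_x, max_y, max_z)` tuple layout and the sample coordinates are assumptions made for illustration, not Minecraft’s actual data.

```python
from itertools import product

# Assumed layout: (min_x, min_y, min_z, max_x, max_y, max_z).
Box = tuple[float, float, float, float, float, float]

def boxes_intersect(a: Box, b: Box) -> bool:
    # Axis-aligned boxes overlap iff their intervals overlap on all three axes.
    return all(a[i] < b[i + 3] and b[i] < a[i + 3] for i in range(3))

def hitboxes_intersect(first: list[Box], second: list[Box]) -> bool:
    # At most len(first) * len(second) pairs, i.e. a handful for short lists.
    return any(boxes_intersect(a, b) for a, b in product(first, second))

# Hypothetical shapes: a full cube and a two-box, pane-like hitbox next to it.
cube = [(0.0, 0.0, 0.0, 1.0, 1.0, 1.0)]
pane = [(1.0, 0.0, 0.4, 2.0, 1.0, 0.6), (1.4, 0.0, 0.0, 1.6, 1.0, 1.0)]
print(hitboxes_intersect(cube, pane))  # False: the shapes only touch at x = 1
```

The strict comparisons treat boxes that merely share a face as non-intersecting; whether that is the right convention depends on the game logic.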

But I didn’t write Minecraft JE, so its implementation is different.

WebP: The WebPage compression format

September 7, 2024 Hacker News Reddit Lobsters Russian

I want to provide a smooth experience to my site visitors, so I work on accessibility and ensure it works without JavaScript enabled. I care about page load time because some pages contain large illustrations, so I minify my HTML.

But one thing makes turning my blog light as a feather a pain in the ass.

Division is hard, but it doesn't have to be

August 24, 2024 Reddit

Developers don’t usually divide numbers all the time, but hashmaps often need to compute remainders modulo a prime. Hashmaps are really common, so fast division is useful.

For instance, rolling hashes might compute u128 % u64 with a fixed divisor. Compilers just drop the ball here:

fn modulo(n: u128) -> u64 {
    (n % 0xffffffffffffffc5) as u64
}

modulo:
    push    rax
    mov     rdx, -59
    xor     ecx, ecx
    call    qword ptr [rip + __umodti3@GOTPCREL]
    pop     rcx
    ret

__umodti3 is a generic long division implementation, and it’s slow and ugly.

I prefer my code the opposite of slow and ugly.
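For this particular divisor there is a well-known shortcut, sketched below with Python as pseudocode. It is not necessarily the approach the post goes on to describe. The divisor `0xffffffffffffffc5` equals $2^{64} - 59$, so $2^{64} \equiv 59$ modulo it, and the high half of the number can be folded into the low half with a small multiplication instead of a long division.

```python
P = 0xFFFFFFFFFFFFFFC5  # 2**64 - 59, the divisor from the Rust snippet
MASK = (1 << 64) - 1

def modulo(n: int) -> int:
    """Reduce a 128-bit n modulo P without a general-purpose division."""
    assert 0 <= n < 1 << 128
    # n = hi * 2**64 + lo and 2**64 ≡ 59 (mod P), hence n ≡ hi * 59 + lo.
    n = (n >> 64) * 59 + (n & MASK)  # now n < 60 * 2**64
    n = (n >> 64) * 59 + (n & MASK)  # now n < 2**64 + 59 * 59
    # n < 2 * P at this point, so a single conditional subtraction finishes the job.
    return n - P if n >= P else n

print(modulo((1 << 128) - 1) == ((1 << 128) - 1) % P)
```

In a low-level language each fold is one widening multiplication and one add with carry, which is why this tends to be cheaper than calling into a generic long-division routine.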

I sped up serde_json strings by 20%

August 20, 2024 Reddit Hacker News Lobsters Russian

I have recently done some performance work and realized that reading about my experience could be entertaining. Teaching to think is just as important as teaching to code, but this is seldom done; I think something I did last month is a great opportunity to draw the curtain a bit.

serde is the Rust framework for serialization and deserialization. Everyone uses it, and it’s the default among the ecosystem. serde_json is the official serde “mixin” for JSON, so when people need to parse stuff, that’s what they use instinctively. There are other libraries for JSON parsing, like simd-json, but serde_json is overwhelmingly used: it has 26916 dependents at the time of this post, compared to only 66 for simd-json.

This makes serde_json a good target ~~(not in a Jia Tan way)~~ for optimization. Chances are, many of those 26916 users would profit from switching to simd-json, but as long as they aren’t doing that, smaller optimizations are better than nothing, and such improvements are reaped across the ecosystem.

The sentinel trick

August 13, 2024 Telegram

The sentinel trick underlies a data structure with the following requirements:

Read element by index in $O(1)$,
Write element by index in $O(1)$,
Replace all elements with a given value in $O(1)$.

It is not a novel technique by any means, but it doesn’t seem to be on everyone’s lips, so some of you might find it interesting.
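To show that these three requirements can be met at all, here is a minimal sketch using per-element epoch stamps. This is a common textbook approach chosen for illustration and may differ from the sentinel trick itself. The class name `FillableArray` and its methods are hypothetical.

```python
class FillableArray:
    """Fixed-size array with O(1) read, O(1) write, and O(1) fill."""

    def __init__(self, size: int, value):
        self._values = [value] * size
        self._stamps = [0] * size  # epoch in which each slot was last written
        self._epoch = 0            # bumped on every fill
        self._fill_value = value

    def __getitem__(self, index: int):
        # A slot written before the latest fill is stale and reads as the fill value.
        if self._stamps[index] == self._epoch:
            return self._values[index]
        return self._fill_value

    def __setitem__(self, index: int, value) -> None:
        self._values[index] = value
        self._stamps[index] = self._epoch

    def fill(self, value) -> None:
        # No loop over the elements: just invalidate every stamp at once.
        self._fill_value = value
        self._epoch += 1

arr = FillableArray(4, 0)
arr[1] = 5
arr.fill(7)
arr[2] = 9
print([arr[i] for i in range(4)])  # [7, 7, 9, 7]
```

The price of this version is one extra integer per element.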

You might want to use panics for error handling

August 13, 2024

Rust’s approach to error handling comes at a cost. The Result type often doesn’t fit in CPU registers, and callers of fallible functions have to check whether the returned value is Ok or Err. That’s a stack spill, a comparison, a branch, and a lot of error handling code intertwined with the hot path that just shouldn’t be here, which inhibits inlining, the most important optimization of all.

Exceptions and panics make it easy to forget about the occasional error, but they don’t suffer from inefficiency. Throwing an exception unwinds the stack automatically, without any cooperation from the functions except the one that throws the exception and the one that catches it. Wouldn’t it be neat if a mechanism with the performance of panic! and the ergonomics of Result existed?

base64 has a fixed point

August 3, 2024 Telegram

$ </dev/urandom base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 \
     | base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 \
     | base64 | head -1
Vm0wd2QyUXlVWGxWV0d4V1YwZDRWMVl3WkRSV01WbDNXa1JTVjAxV2JETlhhMUpUVmpBeFYySkVU

$ </dev/urandom base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 \
     | base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 | base64 \
     | base64 | head -1
Vm0wd2QyUXlVWGxWV0d4V1YwZDRWMVl3WkRSV01WbDNXa1JTVjAxV2JETlhhMUpUVmpBeFYySkVU
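The same experiment can be reproduced without a shell. The sketch below is an illustration rather than part of the original post: it repeatedly base64-encodes two unrelated seeds and compares the resulting prefixes. The function name `iterate_base64`, the seeds, and the `rounds` and `keep` parameters are arbitrary choices. Truncating after each round is safe because the first $4k$ output characters depend only on the first $3k$ input bytes.

```python
import base64

def iterate_base64(seed: bytes, rounds: int = 40, keep: int = 300) -> bytes:
    """Apply base64 `rounds` times, keeping only a prefix so the data stays small."""
    data = seed
    for _ in range(rounds):
        data = base64.b64encode(data)[:keep]
    return data

a = iterate_base64(b"hello")
b = iterate_base64(b"something else entirely")

# Both runs end up with the same prefix, regardless of the seed...
print(a[:32] == b[:32])
# ...and encoding that prefix once more reproduces it, i.e. it is a fixed point.
print(base64.b64encode(a[:24]) == a[:32])
print(a[:32].decode())
```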

I thought I was smart enough to play with fire

June 20, 2024 Codeforces

blazingio cuts corners by design. It keeps the constant factor small and uses long-forgotten algorithms people used before processors supported SIMD and integer division. But another limitation made this task much harder.

Size.

Professional libraries start exceeding the Codeforces limit of 64 KiB really fast. Code minification barely helps, and neither does resorting to ugly code. So I cut a corner I don’t typically cut.

Undefined Behavior.

These two words make a seasoned programmer shudder. But sidestepping UB increases code size so much the library can hardly be used on CF. So I took a gamble. I meticulously scanned every instance of UB I used intentionally and made sure the compiler had absolutely no reason to miscompile it. I wrote excessive tests and ran them on CI on all architecture and OS combinations I could think of. I released the library without so much as a flaw. It worked like clockwork.

And then, 3 months later, I updated README, and all hell broke loose.

Recovering garbled Bitcoin addresses

April 23, 2024 Telegram

ZeroNet is a decentralized network that enables dynamic sites, such as blogs and forums, unlike popular content-addressed storage networks that came later. Sites aren’t addressed by immutable hashes; instead, site updates are signed by Bitcoin addresses.

A moot point is that Bitcoin addresses are case-sensitive, and people are used to addresses being case-insensitive. Mistakes happen, and sometimes the only trail you have is a lower-cased address, like 1lbcfr7sahtd9cgdqo3htmtkv8lk4znx71.

Losing valuable information is a bad thing when you’re an archivist. Have we really lost access to the site if we only know the lower-cased address? Can we recover the original address somehow?
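One naive way to attack the question, sketched here as an assumption rather than as the post’s actual method, relies on the fact that legacy Bitcoin addresses are Base58Check strings: the last four bytes are a double SHA-256 checksum of the rest. Trying every capitalization and keeping the candidates whose checksum matches is therefore enough in principle. The function names are hypothetical.

```python
import hashlib
from itertools import product

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check_valid(addr: str) -> bool:
    """Check the 4-byte double SHA-256 checksum of a Base58Check string."""
    num = 0
    for ch in addr:
        num = num * 58 + ALPHABET.index(ch)
    zeros = len(addr) - len(addr.lstrip("1"))  # leading '1's encode zero bytes
    raw = b"\0" * zeros + num.to_bytes((num.bit_length() + 7) // 8, "big")
    if len(raw) < 5:
        return False
    payload, checksum = raw[:-4], raw[-4:]
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4] == checksum

def recover_case(lowered: str):
    """Yield every capitalization of `lowered` that passes the checksum."""
    options = []
    for ch in lowered:
        # Base58 has no 0, O, I, or l, which prunes some choices for free:
        # 'l' must have been 'L', while 'i' and 'o' must have been lower case.
        options.append(sorted({c for c in (ch, ch.upper()) if c in ALPHABET}))
    for combo in product(*options):
        candidate = "".join(combo)
        if base58check_valid(candidate):
            yield candidate
```

For a full-length address this loop can run to tens of millions of candidates, so plain Python is too slow to be practical, and a 32-bit checksum cannot rule out false positives entirely.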