# https://iio.ie recent posts backup
older entries at @/archive.html.

2024 entries:
2025 entries:
# uses: a list of tools i use
i like the idea of https://uses.tech, might as well do such a page for myself. so this is a list of random stuff i use or recommend. i don't list everything, only stuff i deem worth mentioning. the items are in alphabetical order.
software:
hardware:
services:
notes for my future reference:
published on 2024-01-03, last modified on 2024-12-13
comment #uses.1 on 2024-01-04
check out https://usesthis.com/
comment #uses.1 response from iio.ie
interesting. i now also found https://uses.tech/. i've renamed the post to /uses to conform, thanks for bringing this to my attention.
# wingaming: windows can be used without license for some tasks just fine
a while ago i bought borderlands 3 + dlcs very cheaply in a steam sale. it's a nice game. this winter i had some free time to play it. but in order to get the max framerate out of it, i decided i'll use windows rather than running it in linux with steam's proton.
so i downloaded the official windows installer and installed windows 10 on my 10 year old gaming machine. it was hibernating in the basement for the past 5 or so years. i tried windows 11 too but that didn't work because my machine didn't have a tpm chip for secureboot or something, i think. the installer just said "your machine does not meet system requirements". (i don't get why developers cannot make error messages more useful. include the requirement not met, sigh.)
anyway, windows 10 worked fine but until you activate it, you see an "activate windows" watermark in the bottom right corner. and you cannot customize the desktop, such as changing the wallpaper, either. but otherwise it's a completely functional windows that you can use freely!
i had a valid windows license at some point but i lost the details and i'm pretty sure i wouldn't be able to use it on newer windows anyway. and it makes no sense for me to pay the full license fee just to use the system for a week and then never again. i wouldn't want to use windows as a day-to-day operating system anyway. windows is goddamn slow compared to an equivalent linux. i had to wait seconds for the menu to appear when i right click on the desktop. (i installed it on a spinning disk, not an ssd, but that shouldn't make such a simple operation this slow.)
but anyway, the "activate windows" watermark is annoying because it appears even when you run games full screen. if you can live with it then that's it, game away. but for my own future reference let me document how to get rid of it:
that's it. i've looked into other ways to get rid of the watermark, such as regedit hacks, but they didn't work. then my vacation time ran out and the machine went back to the basement. it would have made no sense to buy a license just for a few days. and if i had needed one then i would have just accepted the small framerate loss and played in linux. so this wasn't a lost sale anyway.
(also what's up with operating systems blasting ads into your face right after installation? i mean they appear in the start menu, in the default browser's starting page, on the default search engine's page, ads ads ads everywhere. and people pay for this? i'm very sad about the state of affairs of computers.)
published on 2024-01-20
# titles: omit periods and uppercase in title-like contexts
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/titles.html.
consider these contexts:
most of these (but not all) should need only one sentence. do you put a period after them? how do you decide? here are my rules:
and furthermore: if you don't need a period, you shouldn't need uppercase either! so a book title or the first line of a git commit should not start with uppercase! go error strings are like this. no periods, no capitals: https://google.github.io/styleguide/go/decisions#error-strings.
think of these as sentence fragments rather than full sentences. "topic phrase" might be the correct technical term for these, dunno.
i'm well aware that the lowercase ship has sailed a long time ago: people are used to uppercase way too much. but here's a trick for git commits and issue titles: use the "$module: title" pattern. think of "module" as a go module where the style is to use lowercase. then the lowercase style will be more natural, easier to swallow for others. e.g. you are adding a new string conversion method to the standard strconv library: "strconv: parse integers with Ki/Mi/Gi suffixes". or if ui animations in a product are distracting, you can open a github issue titled like this: "ui: allow disabling animations". look, no uppercase needed!
also notice that it's easy for me to put the sentence's ending period after the closing quote when i am quoting these titles. i know that's not the official english rule but my ocd tells me that the period must be after the closing quote. moving the trailing period out of the quotes is just extra nuisance when concatenating these strings programmatically. on the other hand i really don't like two periods like this: "some title with period.". no such problem when the titles contain no periods.
[non-text content snipped]
i didn't find much discussion about this on the internet hence i thought i'd come up with my own rules to apply in my life.
here are some discussions i found. i'll add better ones if i find them:
edit 2024-08-18: https://www.conventionalcommits.org/ mentions a similar trick as above for git commits and all their examples are lowercase. yay!
published on 2024-02-03, last modified on 2024-08-18
# numids: yearstamp numeric unique ids too
this is a followup to @/yseq but for random numeric ids.
consider the unique ids that are used in urls such as in reddit urls or the youtube video ids. these are strings of alphanumeric characters. that gives great flexibility but strings come with some performance downsides in most programming languages. an int64 id in comparison is pretty easy to use, fast, and doesn't generate pressure on the garbage collector. and if a user ever needs to enter an id manually somewhere on a keypad, digits are always easier to type than strings (example: credit card numbers or bank account ids). i have a soft spot for int64 ids and prefer using them over strings in most cases.
there's a small caveat to that: javascript doesn't have int64s but only floating point numbers. so to ensure javascript never garbles the id, it's best to keep the id value less than 2^50 or so. but that should be still good enough for most cases. and there's no need to worry about accidentally generating a naughty word with integers.
on the flipside int64 ids can have a high rate of collisions in the case of a high rate of id generation. so relying on int64 ids might be a bit risky but for posts and user ids in small forums or for issue tracker ids, it's more than enough. another downside could be that int64 ids are more "guessable" but this probably doesn't matter much for forum post or issue tracker ids.
# id length
how big should the id be?
i really love short ids. if the id is short, i can even remember it. e.g. if in my project a contentious issue has a memorable 4 digit id, i might remember it and look it up directly via id rather than always searching for it.
context: i love to type urls from memory perfectly. i never rely on autocompletion or history completion. i have relatively good memory for this. some websites handle this quite well thanks to their simple url structure. some are terrible. but if i create a website, i want it to have a simple url structure.
keep the id length short if the system doesn't generate a lot of ids. but do vary the length: some ids should be 5 digits long, some 7 digits. this way nobody can rely on a specific length. furthermore the id length can simply grow if there are many collisions during generation. this way the system handles increased id pressure gracefully.
perhaps distinguish id length for humans and robots. if an alerting system creates automated tickets, give those tickets long ids. this way robots don't eat up the short id space that humans prefer.
# yearstamping
in @/yseq i explained my love for putting some date information into ids. the same can be done here too: append the last two year digits to the end of the id. so an id like 12323 means it's an id from 2023. or use the last 3 digits if worried about the year 2100 problem, e.g. 123023 for an id from 2023.
it needs to be a suffix because the id length is variable. putting it at the end means both the generation and extraction of this piece of data remains trivial programmatically.
yearstamping also reduces the chance for collisions. a new id can only collide from other ids from this year. this can make the uniqueness check a bit faster.
it also allows administrators to operate on old ids easily. for instance they can use a glob like "*23" to select all ids from 2023 for archiving.
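as a concrete sketch, here's how the generation could look in go. it combines the variable id length idea from above with the two digit year suffix. isUnique is a made-up stand-in for whatever uniqueness check the system has:

// newID returns a random id with the last two year digits as its suffix.
// it starts with a 3 digit random part and grows it when it keeps colliding.
func newID(year int, isUnique func(int64) bool) int64 {
	for digits := 3; ; digits++ {
		lo := int64(1)
		for i := 1; i < digits; i++ {
			lo *= 10
		}
		// try a few times at this length before growing the id.
		for attempt := 0; attempt < 10; attempt++ {
			r := lo + rand.Int63n(9*lo) // a random number with exactly `digits` digits.
			id := r*100 + int64(year%100)
			if isUnique(id) {
				return id
			}
		}
	}
}

so in 2023 this starts with 5 digit ids like 12323 and only moves to longer ids when the short id space gets crowded.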
# weekstamping
in case you are doing full alphanumeric ids, you can easily weekstamp too. just use A..Za..z for the week at the beginning (starting with capitals to make it easily sortable). that character set is 52 characters long, almost the same as the number of weeks in a year. just use lettertable[min((yearday-1)/7, 51)] to sanely deal with that pesky 53rd week. you can also prepend the year number. the length of the year is no longer a problem because the weekstamp is a letter so you know where the year ends. no year 2100 problem this way. so an id like "9qdQw4w9WgXcQ" would mean an id from 2009, week 43. an id like "16XXqZsoesa55w" would mean an id from 2016, week 24. and an id like "123Cabc" would mean an id from 2123, week 3.
sidenote: you can keep 64 (or 50) bit long ids even if you present the ids as strings to the user. just format the numeric id as a base-62 number (26+26+10=62 characters) when presenting it to the user. then you can have the best of both worlds: short ids + lightweight representation in code.
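here's a sketch of both ideas in go. the week letter table comes from above; the base-62 alphabet ordering is my arbitrary choice:

const weekletters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// weekstamp returns the week letter for a 1-based day of the year.
// the pesky 53rd week folds into the last letter.
func weekstamp(yearday int) byte {
	return weekletters[min((yearday-1)/7, 51)]
}

const base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// formatID renders a non-negative numeric id as a short base-62 string.
func formatID(id int64) string {
	if id == 0 {
		return "0"
	}
	var s []byte
	for ; id > 0; id /= 62 {
		s = append([]byte{base62[id%62]}, s...)
	}
	return string(s)
}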
# comparison to yseq
the downside of @/yseq is that the id length must remain static if the users want to use it to compare events chronologically via the less-than operator over the id numbers. no such length restriction on random ids because such comparison intentionally doesn't make sense. with sequential ids users often try to farm sequential ids to grab the round or nice numbers. no such incentive with random numbers.
go with random ids unless the ids need to express a chronological relationship between them. use an int50 id if you don't expect to need many ids (e.g. less than a million per year).
# edits
published on 2024-03-01, last modified on 2024-03-22
# postreqs: make http post requests via javascript
if i have a web form such as a login page (username+password) or a comment box then i try to use the following pattern:
contrast this to the traditional approach where the redirect/reload always happens on form submit. i think the in-page approach has much better user interface properties than reloading the whole page with the result of the http post request. and i believe the above is much easier to implement than the traditional approach. the http post endpoints can remain pure api endpoints that a few lines of javascript can handle.
furthermore errors like overload are much easier to handle gracefully. on a traditional page the user must continuously retry and hope for the best. and this often results in duplicate posts. the javascript approach can automatically retry with some fancy retry algorithms. all while keeping the web page responsive and the user well informed about the status.
the downside of this approach is that it requires javascript. that's fair game nowadays if done reasonably. i think it's reasonable to avoid catering to the lowest common denominator. rather make the whole website and its content also accessible via an api so that it's easy for the users to write custom frontends. rely solely on the same api for the official frontend. this ensures that if you ever go overboard, users should be able to respond by writing a better frontend. make replacing your site easy rather than making it artificially important. that's how you can signal trust and it's a form of long term commitment (@/commitments) to be a good guardian of whatever data the users trust you with.
(speaking of responsive ui, here's a good overview of what latencies we should be targeting: https://www.nngroup.com/articles/response-times-3-important-limits/. ideally a site's response is so fast that the user doesn't even notice step 3's feedback at all.)
published on 2024-03-09
# tokengen: token generator for media access
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/tokengen.html.
don't ask.
[non-text content snipped]
published on 2024-03-16
# abnames: create new names from abbreviations
software engineers need to create new terms all the time, be it for tools, services, packages, modules, etc. the name matters a lot: everybody will be referring to the new entity using the given name in forum comments, code variable names, filenames, etc.
suppose you are creating a "production logging service" in your company. will you call it production-logging-service? that's quite long and the presence of dashes creates problems when used in CamelCase languages such as go. and you can't use dashes in variable names in non-CamelCase languages either. there you would need to use production_logging_service. you can no longer search for production-logging-service to find all related usage, you would need to do a case insensitive search for "production.?logging.?service". that would then find both ProductionLoggingService and "Production Logging Service". and it takes long to type in too. it's a hassle. avoid multiword names.
another approach is to come up with a completely new, unrelated but cool sounding letter combination such as "broxus". (i just made that up, any similarity to real entities is purely accidental.) this approach is dumb because the name is super hard to remember, especially when you have a lot of such obnoxiously named services.
a third approach is to name them after some mythological entity that did something related. e.g. "herodotus was a greek historian that logged the ancient history" so let's name the service herodotus. it's a bit better but still silly. i have very bad memory for historical figures so such associations would be very hard for me to maintain especially when you have a dozen services named after ancient greek people.
a fourth, my preferred approach is to take the reasonable sounding long explanatory name and create a short, easy-to-pronounce abbreviated name from it. so i'd name "Production LOGging Service" as "plogs". it must be easy to pronounce too. i have a very good memory for this sort of naming. my mind can very quickly remember to break that name into "p-log-s". from there it can quickly associate to "production logging service" and boom, i know what service i'm reading about once i've seen the rule. and if it's unique enough then searching for documentation about the service will become a walk in the park.
there's one requirement for this: make sure these are documented. if you have a large project, then have a glossary that explains most of the commonly used abbreviations. and don't go overboard. only name big components like this, not every function.
even if you can't come up with a good name, a simple abbreviation is often better than using the full name or an unrelated name. that's how we got html, css, sql, png, gif etc and they ended up being quite usable in terms of searchability at least.
https://news.ycombinator.com/item?id=39299974 lists some nice examples for programming languages:
i keep doing this on this blog to keep all post urls short. even in this one: "abnames" means "abbreviated names". i write most of my posts to myself as a reference of my thoughts and opinions and i do revisit some posts regularly. it's super convenient to have a short, relatively easy to remember url to type.
published on 2024-03-23
# aclsystem: encode user and group names into ids to keep acls simple
caveat emptor: this is another fantasy post where i think about how i would design a system with zero experience in such systems. usually i daydream about being a superhero but today it's about acl systems in a small/mid sized tech company.
suppose you have files in a filesystem, tables in a database, tickets in an issue management software, documents in a content management system, etc. you want to make it configurable which users can access the entities in your system and how. you could have a couple capabilities or access control lists (acls) and for each acls a list of groups or users who have that capability. examples:
suppose you own a file and you want alice to read it. all you need to do is add alice to the read capability's list. easy peasy. though note that this isn't representable in the standard posix file permissions model. i think that's a very inflexible model and the above is more powerful. these lists don't have to be unbounded: even if you bound them to 4 entries, you already have a quite flexible system.
# ids
how do you represent these acl lists? ideally each user and group in your system has an int64 associated. then each acl is just a list of int64s. that's a more compact representation than storing them as lists of strings.
how do you map a username to an int64 and vice versa? one approach is to keep a database around that contains the string<->int64 mappings. but that's overkill! there's a much simpler approach if you accept some limitations.
limit usernames to the form of "[basename]-[suffix]". basename can consist only of at most 10 letters (no digits or underscore allowed). suffix can be one of 8192 hardcoded suffixes.
you can encode one letter out of 26 in 5 bits (2^5 = 32). 10 such letters means you need 50 bits. you can encode one suffix out of 8192 in 13 bits. now we have a 63 bit long number.
there's one bit left: let's use it to mark whether we want group expansion or not. if the id is negative, then it doesn't refer to the user itself but to a group expansion that is looked up in some other system.
# id mapping example
let's encode 'a' as 00001, ..., 'z' as 11011. and to make the implementation of encoding/decoding simple, store it in reverse. so encode "alice" as "ecila".
that would be the int64 id for those users. the implementation is simple. to decode you would need something like this in go:
name := "" for ; len(name) < 10 && id&31 > 0; id >>= 5 { name += string('a' + id&31 - 1) }
encoding is similarly simple if the name already meets the limitations.
encoding names like acme-team, login-service, politics-discuss, accesslogs-readers can be done via the suffix logic. you just need a builtin constant map like this: 1-team, 2-service, 3-discuss, 4-readers, 5-group, ...
"politics" translates to 656379523568 and the suffix code for -discuss is 3 so 656379523568 + 3<<50 = 3378356100051440 is the id for politics-discuss. this could be a group that holds all the members subscribed to mailing list called politics-discuss.
to express all members of politics-discuss, use the id of -3378356100051440. note the negative sign. the member expansion would be provided via some external group expansion service.
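to make this concrete, here's a sketch of the encoding side in go (input validation omitted). it reproduces the numbers above: encode("politics") gives 656379523568 and fullID("politics", 3) gives 3378356100051440:

// encode packs a basename of at most 10 lowercase letters into the low 50 bits.
// the letters are stored in reverse so the first letter sits in the lowest 5 bits.
func encode(name string) int64 {
	var id int64
	for i := len(name) - 1; i >= 0; i-- {
		id = id<<5 | int64(name[i]-'a'+1)
	}
	return id
}

// fullID combines a basename with a suffix code,
// e.g. fullID("politics", 3) is the id of politics-discuss.
func fullID(basename string, suffix int64) int64 {
	return encode(basename) | suffix<<50
}

// groupRef refers to the expanded member list of a group: the negated id.
func groupRef(basename string, suffix int64) int64 {
	return -fullID(basename, suffix)
}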
# acl example
suppose alice has a file that she wants to share with her manager bob and the lawyers-team.
using numbers this translates to this:
checking if a user can read the file consists of two steps: is the user's id in the readers list? then it is allowed. if not, then the system needs to group expand each group reference. this is more expensive but with some caching it could be a fast enough operation.
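in go that two-step check could be sketched like this, with expandGroup standing in for the (ideally cached) external group expansion service:

// canAccess first checks direct membership, then falls back to the more
// expensive group expansion for the negative (group reference) entries.
func canAccess(user int64, acl []int64, expandGroup func(int64) []int64) bool {
	for _, entry := range acl {
		if entry == user {
			return true
		}
	}
	for _, entry := range acl {
		if entry >= 0 {
			continue
		}
		for _, member := range expandGroup(-entry) {
			if member == user {
				return true
			}
		}
	}
	return false
}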
# the suffix map
the list of suffixes would be carefully selected to express common ideas. e.g. many tools and projects want to have a mailing list to discuss it so many teams would like a group with a -discuss ending name. so it makes sense to have that as one of the possible suffixes. this map can grow over time. but each addition must be carefully vetted for usefulness. there are only 8192 possible suffixes, it can run out very quickly if you allow users to register them without oversight.
the suffix map would be embedded into each application as a constant. this means that there's some delay until a new suffix is picked up in all applications. this shouldn't be a problem because most applications only care and communicate via the int64 ids. the map is only needed when the application wants to do a conversion between the id and the human-readable name. but even if the map is not updated, it can just use the raw id as a placeholder.
so decoding 3378356100051440 into politics-3 should be reasonable enough. similarly, if a ui wants to encode politics-discuss into an id but doesn't know the id for -discuss, then the ui simply returns an error. then the user can enter politics-3 and that should work too.
# namespaces
if it makes sense, you might sacrifice one (or more) of the suffix bits for namespaces. suppose you are a web company and you have your internal employees and external users. you want to assign ids for both. use this new bit to decide whether an id is for an internal user or an external one.
if it's internal, you will have a selection of only 2¹²=4096 suffixes. if it's external, then the remaining 12 bits could be used differently than suffixes. maybe use them for 2 more letters, 12 letters in total. or have 5 bits (0..31) for suffixes in case your website allows users to form groups (-discuss, -members, -announce) or implement bots (-bot), and then the remaining 7 bits (0..127) for yearstamping with the last two year digits. so if a user registers in year 2024, they get a username like alice24. other users can immediately tell how fresh a user is and it prevents account reuse. see @/yseq for other benefits of yearstamping ids in general. the internal username decoders can then distinguish between internal and external users solely based on whether the basename part of the username has numbers or not.
# abnames
the 10 letter, no digit restriction can be quite painful. for human usernames that might be fine, nobody likes long names anyway.
for service roles and product names it might feel more limiting. but judicious use of the @/abnames philosophy can give plenty of short names. these short names don't need to be perfect. the abnames come with a glossary so the user can easily look up the full, human readable name of the product.
in fact most user interfaces should provide a popup window that explains the details of the role, including the full product name. such a feature is also useful for human usernames: to see the full name, the profile photo, responsibilities, availability, etc.
# humans vs robots
often there's a desire to distinguish between humans and robots. for example in the above hover-popup-box example a system could look up data differently for humans vs robots. for instance the popup box wouldn't need to look at calendar availability for a robot. another example would be enforcing a human-review rule: each commit must be reviewed by a human. in that case the review system would need to be able to tell if an entity is a human or not.
to make this simple, use the following rule: the empty suffix means humans. in other words if a username contains a dash, it's not a human. robots can use a -bot or -service suffix.
i'm not fully sure about the usefulness of this rule because i really like short names. and i can imagine there would be some bots where a short name would be useful. but i think the value of easily recognizing fellow humans in our complex systems is getting more and more valuable so i think it's worth it. this way you can easily tell which one is human between alice and alice-bot.
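given the bit layout from earlier, the check itself is trivial. a sketch, assuming suffix code 0 is reserved for the empty suffix:

// isHuman reports whether an id refers to a human user:
// humans are the ids with the empty suffix.
func isHuman(id int64) bool {
	if id < 0 {
		// negative ids are group expansions, not individual users.
		return false
	}
	return (id>>50)&8191 == 0
}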
# groups
i recommend keeping group membership data in version control. you could have the following configuration:
the g/ prefix in "g/acme-team" refers to expanded group. so login-service will contain alice and bob as members.
the group definitions need to be expanded recursively. so accesslog-readers would contain alice, bob, and charlie. this means the group membership lists must be acyclic.
tracking human memberships in version control for a mailing list like politics-discuss would be overkill. so track groups with high churn (such as mailing list memberships) differently, e.g. in a database, and have the users join or leave via a ui rather than editing text files.
then create a service that serves these group expansions. make it possible for clients to fetch all members for a group and then watch for updates. this means membership lookup remains local in the client and thus fast.
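the expansion itself could be sketched in go like this. defs maps a group id to its direct members where negative entries are nested group references. the seen map guards against accidental cycles even though the config is supposed to be acyclic:

// expand recursively resolves a group id into the full list of member ids.
func expand(group int64, defs map[int64][]int64, seen map[int64]bool) []int64 {
	if seen[group] {
		return nil
	}
	seen[group] = true
	var members []int64
	for _, m := range defs[group] {
		if m < 0 {
			members = append(members, expand(-m, defs, seen)...)
		} else {
			members = append(members, m)
		}
	}
	return members
}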
tip: log every time you look up a member in a group as part of making a decision on access. log it with reason, example:
func IsMember(group, user int64, reason string) bool

...

acls.IsMember(acls.Id("accesslog-readers"), acls.Id("alice"), "raw access")
log it into a central logging system where users can later look up which memberships users actually used and when was a membership last used. such information will be super useful when trying to lock down privileged groups. eventually you will need such information so it's best if the system is designed with this in mind right away.
# special groups
to make expressing some things easier, create a couple special groups:
the expansion of these groups would be handled in the lookup logic specially: no lookup would be needed.
# management
it makes sense to associate some metadata with users, roles, and groups. e.g. for roles you could configure the full description of the role, the 4 byte linux uid_t, etc. for groups you would configure whether it's a mailing list or not, whether humans can join on their own via a ui, etc.
suppose you have a version control system with per directory access control. then create a directory for every admin team wishing to manage groups and put their roles under them. then all modifications in the files have to be approved via that admin team.
example:
# plogs-admins/plogs-admins.txtpb
description: "group admin management team for plogs (Production LOGging System)."
members: [
  "alice",
  "bob",
]

# plogs-admins/plogs-discuss.txtpb
description: "mailing list for plogs (Production LOGging System) related topics. anyone can join."
group_join_mode: "self-service"
mailinglist {
  moderators: ["alice"]
  readers: ["g/all-special"]
}

# plogs-admins/plogs-backend.txtpb
description: "service for clients wishing to upload production log entries into plogs (Production LOGging System)."
vm_management {
  linux_uid: 1234
  vm_admins: ["g/plogs-admins"]
}

# plogs-admins/plogs-frontend.txtpb
description: "service for users wishing to browse the production log entries in plogs (Production LOGging System)."
vm_management {
  linux_uid: 1235
  vm_admins: ["g/plogs-admins"]
}
then create a service that serves this metadata for other systems. so when the mailserver receives an email to "plogs-discuss@example.com" it can check this service whether it's indeed a mailing list. if so it then asks the group expander service for the members and forwards the email to them.
an edit from 2024-12-07: an alternative idea is to have a per-basename file and define each suffixed group in it:
# plog.textpb
description: "Production LOGging System: service that indexes events from production systems"
groups {
  "admins": {
    description: "admins"
    static_members: [ "alice", "bob" ]
  }
  "discuss": {
    description: "mailing list for plogs (Production LOGging System) related topics. anyone can join."
    group_join_mode: "self-service"
    mailinglist {
      moderators: ["alice"]
      readers: ["g/all-special"]
    }
  }
  "dev": {
    description: "the developers who develop new features in the codebase"
    static_members: ["charlie", "dave"]
  }
  "backend": {
    description: "service for clients wishing to upload production log entries into plogs (Production LOGging System)."
    additional_admins: ["g/dev"]
    vm_management {
      linux_uid: 1234
      vm_admins: ["g/dev"]
    }
  }
  "frontend": {
    description: "service for users wishing to browse the production log entries in plogs (Production LOGging System)."
    additional_admins: ["g/dev"]
    vm_management {
      linux_uid: 1235
      vm_admins: ["g/dev"]
    }
  }
}
the "admins" group is a mandatory group with static members that describe who can approve changes related to this "family" of roles. whenever a change is made to such a file and robot could look at the difference. it would allow committing only if the commit has approvals from at least two admins. if plogs-discuss is changed then 2 approvals are needed from "alice" and "bob". but if plogs-backend is changed then 2 approvals are needed from "alice", "bob", "charlie", "dev" thanks to the role's additional_admins setting.
# disclaimer
i admit, i'm not sure i'd design a real system exactly like this. 10 letters can be quite limiting. this system doesn't scale up to millions of employees creating millions of microservices, each with a different username. the names would become very cryptic very fast. but if the company has less than a thousand users in its system, this should be a pretty simple way to manage things. i like the simplicity and compactness of this design so it could be fun to play around with in non-serious environments.
published on 2024-04-01, last modified on 2024-12-07
# statusmsg: use status messages instead of percent done indicators
in @/postreqs i linked to https://www.nngroup.com/articles/response-times-3-important-limits/. it mentions that slow user interface actions should have a percent done indicator. i disagree with that. i do agree that some form of feedback must be given, i just disagree that it should be a percent done indicator. percent done indicators have their place where the progress is very steady, such as file downloads. but for many operations (e.g. game loading screens) percentages are terribly unreliable. but even in the download case i'd just prefer that the interface tells me a detailed status instead: size of the total transfer, already transferred data, speed, and the estimated completion time.
the application should be honest and tell the user the actual operation being done at any given moment. e.g. in a game loading screen it could just print that it's loading files (+ which file), it's uncompressing, compiling shaders, etc. if users complain about slow loading, they will also report which step is slow which will simplify debugging and optimization efforts. e.g. they complain about shader compilation? then it's clear that precompiled shaders would be a nice investment. avoid silly "reticulating splines" type of joke messages. that won't be useful for anyone.
print only the current action at any moment. don't bother keeping the full status history. at least don't print the history in the user interface. it's nice to keep them in logs but the user interface should be clutter free.
this is pretty easy to implement on webpages. just have a "status" element somewhere on the page and update it like this:
<span id=status></span>

...

// send login request via http post.
status.innerText = 'logging in...'
fetch(...)

...

// redirect to the login landing page after a successful login.
status.innerText = 'login success, loading frontpage...'
window.location.href = '...'

...

// clear status message when user starts editing the form.
status.innerText = ''
it is similarly easy in command line tooling (go example for linux):
// setStatusf writes the passed-in single line status message to stderr.
// subsequent status writes update the previous status.
// use setStatusf("") to clear the status line before printing anything to the screen.
// avoid putting newlines into the status message because it breaks the clearing.
func setStatusf(format string, args ...any) {
	// extract terminal width per https://stackoverflow.com/questions/1733155/how-do-you-get-the-terminal-size-in-go.
	var winsz [4]int16
	r, _, _ := syscall.Syscall(syscall.SYS_IOCTL, uintptr(os.Stderr.Fd()), uintptr(syscall.TIOCGWINSZ), uintptr(unsafe.Pointer(&winsz)))
	width := int(winsz[1])
	if r != 0 || width < 10 {
		// not a terminal or too narrow.
		return
	}
	msg := fmt.Sprintf(format, args...)
	if len(msg) >= width {
		msg = msg[:width-6] + "..."
	}
	fmt.Fprintf(os.Stderr, "\r\033[K%s", msg)
}

func printFeed() error {
	setStatusf("looking up dns...")
	addr := dns.Lookup("example.com")
	setStatusf("fetching feed...")
	feed := rss.Fetch(addr, "/rss")
	setStatusf("parsing feed...")
	parsedFeed := rss.Parse(feed)
	setStatusf("")
	fmt.Println(parsedFeed)
	return nil
}
the "\r\033[K" terminal escape sequence combination means to go back to the beginning of the current line and clear everything from the cursor. this only works if the previous status message didn't contain any newlines, hence the warning in the doc comment.
note that this is printed only when the tool is used interactively. as a user i would be delighted to know what is happening when i'm waiting for a tool to finish. it makes debugging much easier when things go wrong.
suppose i noted that the dns lookup succeeded but then the tool got stuck in the "fetching feed..." step. at this point it will be clear to me that it's probably the website that is having problems rather than my networking setup.
this is not needed if the action or tool is very fast, only when it's normal that it can take more than a second. e.g. when there's networking involved.
also note that the above code examples are optimized for the occasional status updates. if you have a rapidly updating status (e.g. loading many files), then a polling approach is better to reduce the load on the terminal:
var status atomic.Pointer[string]

// displayStatus keeps displaying the value of status until it becomes empty.
// once empty, it writes true to done to signal that the status line was cleared.
func displayStatus(done chan<- bool) {
	const updateInterval = 500 * time.Millisecond
	defer func() { done <- true }()
	lastStatus := ""
	for {
		// extract terminal width per https://stackoverflow.com/questions/1733155/how-do-you-get-the-terminal-size-in-go.
		var winsz [4]int16
		r, _, _ := syscall.Syscall(syscall.SYS_IOCTL, uintptr(os.Stderr.Fd()), uintptr(syscall.TIOCGWINSZ), uintptr(unsafe.Pointer(&winsz)))
		width := int(winsz[1])
		if r != 0 || width < 10 {
			// not a terminal or too narrow.
			return
		}
		msg := *status.Load()
		if msg == "" {
			fmt.Fprint(os.Stderr, "\r\033[K")
			break
		}
		if msg == lastStatus {
			time.Sleep(updateInterval)
			continue
		}
		lastStatus = msg
		if len(msg) >= width {
			msg = msg[:width-6] + "..."
		}
		fmt.Fprintf(os.Stderr, "\r\033[K%s", msg)
		time.Sleep(updateInterval)
	}
}

func setStatusf(format string, args ...any) {
	s := fmt.Sprintf(format, args...)
	status.Store(&s)
}

func example() error {
	setStatusf("starting...")
	done := make(chan bool)
	go displayStatus(done)
	for i := 0; i < 3000; i++ {
		setStatusf("doing action %d...", i)
		time.Sleep(time.Millisecond)
	}
	setStatusf("")
	<-done
	fmt.Println("done")
	return nil
}
the status updater is now a background goroutine. it wakes up twice a second to look up the current status and print it. this approach avoids spending too much time in the write syscall printing status updates that the user wouldn't even have a chance of reading anyway.
there's another nice benefit of having such a global status variable even if you don't print it. you could periodically sample it and then you would get a nice profile of what your application is doing. an ordinary code profile would only tell you which code is running but this could tell you which file takes the longest to load. or if you have a crash, the status global could give you additional debug data on what was happening at the time of the crash.
anyway, now go forth and add status messages to all the slow tools and interfaces!
published on 2024-04-08
# signups: allow signing up for web services only via invite or via payment
imagine creating a discussion forum or community site like reddit or twitter from scratch. if the site is popular and allows free registration then that creates a huge amount of work for the moderators to keep spam at bay. it would be a battle that the site cannot really win.
what's a good deterrent against this? the simplest approach is to ask for some one time registration fee, like $10. if a spammer creates thousands of accounts then, well, it's raining money for the site. clearly spammers won't do this so they will avoid the site. good! it doesn't solve all spam but it limits its spread. account bans have more weight to them.
to make such payment more attractive for users, offer to send that money to charity. this would clearly signal that the site is meant to be free, the paywall is there only to reduce the spam. it also makes it clear that this is a non-refundable fee.
i do see this mechanism on some sites. chrome web store, android play console, microsoft dev account, and probably many other sites ask for a one time registration fee.
but what if the site wants to allow free accounts too? for that let users invite each other. the invite-graph could provide very useful insights for combating spam accounts. and have the invites regenerate over time, such as 1 per week up to a max of 6 in total. so if a user has 6 or more invites, they won't get further free ones until their remaining invites drop below 6. the limit can be adjusted based on the desired growth factor. limiting the free invites prevents a user from banking on their invites and then creating a lot of new accounts in a short amount of time. this is how gmail started and there are many private communities that work like this.
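the regeneration logic is simple. a sketch in go with the numbers from above baked in:

// regenInvites grants one invite per elapsed week, but free invites
// only accumulate while the user holds fewer than 6 of them.
func regenInvites(invites, weeksElapsed int) int {
	for i := 0; i < weeksElapsed; i++ {
		if invites >= 6 {
			break
		}
		invites++
	}
	return invites
}

note how a user who received extra invites keeps them: the function never takes invites away, it just stops granting free ones above the cap.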
perhaps also allow paying for other people's registration fee too. e.g. pay $100 and get 10 paid invite links.
this invite-or-pay will greatly limit the growth of the site. perhaps allow free registration initially and set up the above limitations only after the site grew to a healthy size and spam is getting out of control. not allowing unbounded growth is good anyway. small, focused communities are much healthier than free-for-all mega-communities.
creating a payment system can be tricky though. one approach would be to create a business account with paypal, stripe, revolut, paddle, lemonsqueezy, shopify, or a similar company and then use their api to add payments to your site. but that's quite an involved process given the requirements these systems have. alternatively something like ko-fi or buymeacoffee could work for the initial setup too. i haven't really used them before but their api and webhooks seem relatively easy to build upon.
# edit on 2024-05-09
i realized that in @/msgauth i had another idea for limiting signups: authenticate via whatsapp. this means users would need a unique working phone number for each registration. getting those has some barriers so it might be a good way to limit spam registrations.
# edit on 2024-08-01
note to self, maybe platforms like github sponsors, opencollective, liberapay, goteo, etc could be used as a paywall too. https://wiki.snowdrift.coop/market-research/other-crowdfunding is a comparison site of various platforms. it's a bit outdated but at least it's nice to see which platforms are still around and thus are somewhat more established.
# edit on 2024-10-07
lobsters works on an invitation system: https://lobste.rs/about#invitations. seems to be working quite well for them.
published on 2024-04-15, last modified on 2024-10-07
# limits: create and enforce limits for web services
suppose i'm offering a web service for my own users and i want to protect it against abuse. i can already limit signup via the methods mentioned in @/signups. but that's not enough: i should also ensure no single user can degrade my service on its own.
# throttling
one approach is to throttle excessive usage. internet access is often throttled. sometimes it's advertised as "unlimited traffic" at "unlimited bandwidth". what really happens (in the better cases at least) is that after a certain amount of traffic the bandwidth is throttled to slow speeds. so the mobile carrier or isp might provide the first 10 GiB in a month at 1 Gbps and then the rest at 1 Mbps. i think that's a fair way to limit services. but be honest about it: just explain the limits and don't say "unlimited" dishonestly as a marketing ploy.
but the services where throttling works well are quite limited. it works for fluid-like continuous services where delivering less of the service is also fine. e.g. take tap water as a subscription. this is usually implemented via paying for whatever amount the user used. an alternative solution could be to provide the users and homes with a fixed amount of water at full pressure. the pressure drops when that amount is exceeded. sure, people should be able to sign up for unlimited usage at full pressure but if most people don't need it, then let them safeguard their bills with limits like that.
# tokens
suppose i want to limit something more discrete: how many comments a user can post per day, how many images can the user upload per day, how many requests a client can make per hour, etc. then a token based system might work quite well.
suppose i want to limit my developers to not running the expensive integration test suite more than 4 times per day on average. then i could create a counter that tells the user the amount of runs they have in balance. if it's zero then they can no longer trigger the test. and replenish their token count every day like this:
newtokens = max(oldtokens, min(oldtokens+4, 12))
this also allows accumulating more tokens over time so they can burst if they weren't testing a lot the previous days. i think the ability to burst is important otherwise the service would be unfair to people who are not constantly online but want to use the service in a batched manner. e.g. a developer might prepare a dozen commits while disconnected from the network for a day or two and then wants to run all the tests at once. that should be supported too.
let the user queue up their usage once they are out of tokens rather than just flatly refusing to service their requests. e.g. in the integration test case the queued up tests could then run automatically at midnight when the tokens replenish. though note that excessive queuing might lead to other problems, see https://en.wikipedia.org/wiki/Bufferbloat.
but also let users buy tokens or simply bump the above limits with a regular paid subscription. so if i know one of my developers is super productive then i could let them regain 6 tokens per day up to 20.
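a sketch of the replenish rule in go, generalized so the paid bump is just a different rate/burst pair:

// replenish adds rate tokens up to burst but never takes away tokens
// the user already has (e.g. bought tokens above the burst limit).
func replenish(oldtokens, rate, burst int) int {
	newtokens := min(oldtokens+rate, burst)
	return max(newtokens, oldtokens)
}

// run once a day per user:
//   tokens = replenish(tokens, 4, 12) // the default limits.
//   tokens = replenish(tokens, 6, 20) // the productive developer's bumped limits.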
# credit
i quite like fly.io's pricing structure. it has many different services, each metered separately. i can use them however i want and at the end of the month i get a bill. but each month they credit $5 worth of usage. so if i stay below $5 worth of usage, i'm using the site for free.
furthermore they allow me to pre-pay my account. if my usage exceeds the credit available on my account, they just suspend my virtual machines. i find that pretty neat as it avoids surprise bills. i wish i could set daily limits though. i'd set the limit to $1 usage. so even if one day i get a ddos attack or i mess something up, the next day i can start over with a clean slate.
they also have monthly subscription plans. higher tiers get me more features such as access to support. and whatever monthly subscription fee i pay, i get that amount of usage for free by the same credit method described above.
i think similar approach could work for many things where the service consists of many dimensions and i want to price each dimension separately. this way i don't need to think about freebie quotas for each dimension separately, just gift certain amount of the bill for each user each billing cycle.
# probabilistic rejection
the above were methods for limiting usage from a single user. but how could i protect my service against many users trying to use it simultaneously?
suppose my server can have only 200 inflight requests at any given moment. the simplest approach is to simply reject any request that would cross the 200 inflight requests threshold. but this makes the website go down way too suddenly.
smooth this out with probabilistic rejection. accept all requests until 100. then reject incoming requests with a probability of (inflight - 100)/100. if there are 150 requests in flight, requests will be rejected with 50% probability. at 200 inflight requests, they will be rejected with 100% probability. the full formula for the probability is max(0, (u - n/2) / (n/2)), where n is the maximum amount of inflight requests and u is the current usage.
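a sketch of that formula in go:

// shouldReject implements the max(0, (u - n/2) / (n/2)) rejection probability,
// where n is the maximum number of inflight requests and u is the current usage.
func shouldReject(u, n int) bool {
	p := (float64(u) - float64(n)/2) / (float64(n) / 2)
	if p <= 0 {
		return false
	}
	return rand.Float64() < p
}

with n = 200 this accepts everything below 100 inflight requests, rejects at 50% probability at 150, and rejects everything at 200.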
if possible, add smart retry logic to the client side, similar to what i wrote about in @/postreqs. or just tell the user as it is: the site is under load, come back a bit later and try again. hopefully it will drive away just enough users to keep the server load under control but not more. this way load should be smoothed out leading to smaller peaks with slower but relatively smooth experience on the user side.
variants of this can be used for many things where i want to limit many users trying to access a limited resource: limiting new account creation, new comments in a thread, tickets for events, etc. think of it like a lottery.
# cooldown
there's also @/cooldown which i use for the completely anonymous and registration-free comments below. i think that's a pretty generic technique too.
opening up a service to the internet can be scary. but judicious use of various forms of limits can keep everything under control. this post is just a reminder for myself of the ways i can do that if i ever decide to write an online service.
published on 2024-05-06
# reactions: using limited emoji reactions for feedback can be useful
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/reactions.html.
this post was inspired by https://www.mcmillen.dev/blog/20210721-downvotes-considered-harmful.html. i like the idea in that post and here i'll just think loudly how to implement it in general.
more and more websites allow users to "emoji react" to the main content. for a nice example see a random popular github issue like https://github.com/golang/go/issues/15292. observe how most comments have some emoji reactions under them.
for a long time i didn't really like this idea. but i spent some time thinking about this and i think this can be pretty useful if done well. in this post i'll explore what features would make this a useful feature for me.
# emojis
but first let me add a disclaimer that i hate it when people communicate in pure emojis. to me it's like talking in ancient hieroglyphs. and if you encounter a hieroglyph you have never seen before then it can be pretty hard to look up what it means.
suppose you walk along a street and there's a red sign saying this:
DISALLOWED: 🥨 🐶 🛼 🪅
you can sort of figure out what the first 3 mean: no eating, no pets, no rollerskates. but what does the last one mean? and when you call your partner to ask what the 4th symbol means, how do you verbalize your question? unless you know that the author meant toys, you'll have a hard time figuring out the meaning just from the picture.
words wouldn't have this problem:
DISALLOWED: eating, pets, rollerskates, toys.
i admit, i'm heavily biased here: i think verbally. i have to mentally verbalize or write things down in order to "think" and to solve problems. i can't think in pictures.
but there are people for whom pictures are easier. in that case there is an easy compromise: just display both.
DISALLOWED: 🥨(eating), 🐶(pets), 🛼(rollerskates), 🪅(toys)
easy peasy. and on user interfaces where there's little space, let me hover over the icon and the meaning should appear in a tooltip.
and i haven't even talked about the case where an emoji has a completely different (sometimes opposite) meaning in different cultures. communicating with emojis across cultures without a reference to the meaning is very hard. or when corporations update the emoji pictures and retroactively change the meaning of past messages in subtle ways for better or worse.
tooltips were heavily used in early user interfaces such as in microsoft windows and microsoft office. i bet one wouldn't be able to figure out what each icon meant without the descriptions:
[non-text content snipped]
most emojis are just as cryptic for me. click on the picture to toggle the descriptions. in case of the above tool icons a full description was displayed in a tooltip if you hovered over the icon. and in menus you could see both the icon and the description to help build the mental association. once you familiarized yourself with a tool and its icon, you could comfortably use it from the toolbar. the toolbar was scary at first but things remained discoverable.
another nice example is this: https://github.com/erikthedeveloper/code-review-emoji-guide. here every emoji is well explained and with such a guide in mind, i think using those emojis in communication is fine. to be fair, i'm not sure i'd like to see such emojis in my reviews yet. but if it's a must then it should be done with a limited set of icons and a guide to the icons.
the other big issue i have is that they are hard to type and usually require special tooling to enter. i can't simply type them easily with a qwerty keyboard as i do words. well, some systems might allow me to type :thumbsup: and have a thumbs up emoji appear when presented to the user. if it's well accepted that emojis are always represented with english letters in the raw source, then maybe i can make peace with them. but i'm a bit sad that we are trending to revert the super useful invention of the alphabet back to cavemen-like pictographic communication. are letters that hard to use? i'm sure i'm in the minority here and i should just move on (even if it feels like going backwards).
so anyway, my point is that using pictures in communication is acceptable only as long as they are paired with a written explanation that can be easily accessed. in that github example i don't see such explanations for the various emojis. i don't know what it means when someone reacts with "rocket" to a comment. therefore i don't like that implementation. if you create an emoji reaction system, then create a guide describing how and when to use the various emojis.
# upvoting
all right but what's the case for such emoji reactions?
it's very common on mailing lists and on simple bug trackers that there's a long thread where people add a simple "+1" comment on its own. this is meant to signal that the given user also has the same problem and would like to see the issue fixed.
this is useful information. but at the same time it's very spammy and doesn't add much value to the thread itself.
i think it is efficient to have a dedicated +1 button to simply track the affected people without the spam. and then you can use this aggregated counter to determine the most important bugs to fix.
some projects explicitly call this out: https://go.dev/wiki/NoPlusOne.
"like" is similar. you can "like" a post or a video and then the website can use this information to compute the most popular posts.
so far so good.
# downvoting
+1 and like on their own are not enough because they cannot express disapproval. seeing the dislikes for an item is also a very useful signal. in the issue tracker example maybe some people consider a bug a feature and don't want the bug fixed (cue https://xkcd.com/1172). then it's only fair that people can downvote such bugs.
once you have upvotes and downvotes and they can be trusted, then i can use that information to make decisions. if i'm in the mood for some funny relaxing videos then i can avoid low quality clickbait videos by avoiding downvoted videos. or if i'm a creator myself, i can use this feedback to see if people like or don't like my content.
for an example see github issues. it allows sorting by emoji reactions, see the sort dropdown on the right. example: https://github.com/golang/go/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions-%2B1-desc. check out other emojis as well, such as thumbs down, tada, rocket, etc. unfortunately this emoji set is pretty bad but more on emoji selection later.
unfortunately nowadays there's a trend towards removing or hiding negative feedback. mostly because some people don't use such tools constructively. they use it to harass people, e.g. dislike every post a person makes regardless of content.
then the creator is left wondering why their post has so many negative votes. they have all this negative feedback with no explanation and it makes them feel bad. solution? remove the possibility to react negatively, right?
that's not the right solution. the problem is not that people feel bad but rather that content creators can't know why something was downvoted. this hints at an alternative solution: let the downvoters tell why they are downvoting something. a simple one-click "-1" or "dislike" button is not enough. make it at least two-click!
# demo
i've cobbled together some html to show what i have in mind in broad terms. you need javascript enabled to see the demo below. let's take a twitter-like post that people can emoji react to.
you can upvote and downvote a post. the score of the post is then upvotes - downvotes. it's displayed as the first thing right after the post. by default it's +13 because there are 25 upvotes and 12 downvotes. (the exact scoring algorithm doesn't matter for this discussion, it's just an example.)
next to the score is a thumbs up button. you want to simply upvote a post? go ahead and push that button. upvoting a post only needs one click. (really, go ahead, it's just a demo.)
however to downvote you need to press the 3-dot button. it presents you with a more complex form. you can still simply click "dislike". but you will get other very common reasons for disliking: "duplicate content", "inaccurate". clicking those would still count as a downvote but the creator and other users will understand better why people don't like something.
but often the predetermined categories don't express all the nuance of why someone doesn't like something. those people can add a more detailed comment in the "comment" text box. a sample of those comments is then shown in the feedback form. then the creator and also other users can have an even better understanding of why others like or don't like something. try entering something in the box after selecting a reaction to see how the form changes. (in my @/ffpoll post i advocate for a similar free form comment box for polls too.)
a similar mechanism can be used for flagging posts for moderators, see the remove row. moderators can prioritize their moderation queue more efficiently based on the signals of why something was flagged.
[non-text content snipped]
here i categorized the reactions into 3 broad categories: upvotes, downvotes, and removal (moderation) requests. i assigned 3 reactions to each category. maybe it makes sense to have 4 for each category but not more than that because then the interface can get overwhelming.
i keep the generic dislike reaction. but if people still complain about unexplained dislikes then the form can be further tweaked: replace "dislike" with "other" and require a comment for that option. then the creator can simply ignore the "other" reactions with a clear conscience if they don't contain a meaningful comment. or such meaningless comments could even be flagged for removal (see the red flag if you hover over or touch a comment).
i propose that even upvoting has multiple reaction options. suppose a disaster happens and someone makes a tweet about the event. some people feel weird "liking" such tweets. so in that case people can react with "hug" (or something similar) and still upvote the tweet to popularize it.
select the emojis for the user to choose from carefully. make sure they represent the most popular orthogonal reactions. the more different they are, the more useful the data will become. i've picked the demo's 9 emojis without much thought. in a real service this would need some research.
the comment that can be attached to the reaction is limited to 120 characters. it's meant to add a short explanation for the reaction. it's not meant for discussion. for discussion the user should be able to reply to the post properly. discussion responses also create a notification for the poster. reactions shouldn't.
# moderation
the 3 reactions for removal requests are especially handy for moderators. if multiple users mark a post as obsolete, then the post can be collapsed and greyed out but still accessible in general. it's meant to hide duplicate posts and other irrelevant but otherwise fine posts. moderators can then undo this if the action was inappropriate.
if multiple users mark a post as "inappropriate" then the system can automatically unlist the post without moderation intervention. remove the usernames in unlisted posts just to ensure people cannot go witch hunting right away. then later a moderator can make the decision to completely delete the post if it's truly inappropriate. stackoverflow uses such community based moderation. if 6 users flag a post as spam or rude, it gets locked: https://stackoverflow.com/help/privileges/flag-posts. also note how flagging requires the reporter to select why something is flagged. the idea is very similar to what i describe here.
(sidenote: in general i like stackoverflow's approach to moderation. from https://news.ycombinator.com/item?id=39425274: "the only thing that scales with the community is the community".)
if a user marks a post as sensitive, the post would be made immediately unavailable. this is meant for posts that unintentionally contained sensitive data such as phone numbers or addresses. given the grave effect of this action, this reaction wouldn't be available to everyone but only for trusted users. or users who went through some training material explaining the button. and any misuse would result in temporary bans if needed. such bans should be scary enough if signing up to the service is hard per @/signups.
# anonymity
should the reactions be anonymous or public? in the github instance i've linked above it's public, you can see the list of usernames for each reaction type if you hover over the reaction with the mouse.
i'm not fully sure about this but i think making the votes anonymous is better. it might allow for meaner comments. but at the same time the creator will see more honest feedback.
e.g. you might hold back a negative reaction to a friend to avoid souring the relationship. but if it's anonymous, you would feel more free to give an honest reaction.
and as for mean comments: users should be able to flag the individual free-form comments for moderation. and then mean users can be tempbanned to cool down a bit.
it's not a hard rule though. in some cases it makes more sense to have the names associated. e.g. in technical discussions where you might want to use such feedback to guide decisions and want accountability. but whichever way you choose, make it clear to the users who can access this data.
# update rate
avoid updating the scores in real time. some people would be obsessively reloading their post to see the feedback streaming in real time. the system should not encourage such obsessions.
update the stats only every hour or two. this also makes the system easier to implement and cache. no need to build super efficient realtime data aggregation systems.
and make sure that if i react, at least 1 hour passes before my reaction appears in the stats. so if i react at 13:58, the 14:00 update won't contain my vote, only the 15:00 one will. this avoids the edge case where someone shares a post and then 5 minutes later checks the reactions and deduces how certain people reacted even in an anonymous feedback system.
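a minimal sketch of this delay logic in go (the function name and the hourly cadence are just assumptions for illustration):

package main

import (
	"fmt"
	"time"
)

// publishtime returns the hourly stats update that will first include a
// reaction made at t: at least a full hour must pass before it appears.
// (a reaction exactly on the hour waits an extra hour, which is fine here.)
func publishtime(t time.Time) time.Time {
	return t.Add(time.Hour).Truncate(time.Hour).Add(time.Hour)
}

func main() {
	react := time.Date(2024, 5, 13, 13, 58, 0, 0, time.UTC)
	fmt.Println(publishtime(react)) // 2024-05-13 15:00:00 +0000 UTC
}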
# creativity loss
there's another subtle downside to allowing reactions. people start craving the positive feedback. so if a post doesn't generate a lot of positive reactions, the creators will revert to content that does generate a lot of reactions. this is often easier to consume, lower quality content. the creators lose their unique voice. in other words there is a loss of originality and authenticity in the process.
but this effect has to be weighed against how useful seeing such feedback on content is. i'm super annoyed that whenever i look for movie trailers on youtube i get all these "concept" fake trailers. the annoyance comes from the fact that such trailers are often not clearly labeled. e.g. the concept bit is the last word in a very long title. they are clickbait so they get a lot of views. then the channels keep churning them out which then spams the search results.
i'm not against creators creating them but they should be clearly marked as such. if not, then users could tag such videos with the "inaccurate" reaction. and then the search could allow me to filter out "inaccurate" videos. that would be nice.
overall i think the benefits outweigh the drawbacks so it's worth having this system.
# reviews
i think such a feedback system could be used for reviews too instead of the 5 or 10 scale systems that are common today. https://apenwarr.ca/log/20231204 (NPS, the good parts) is a good article explaining all the downsides of such scales.
not giving the full score to a gig worker (such as an uber driver or delivery person) in a review could result in the worker losing their job. at that point the review system loses its value because most people don't want to mess up others' lives for a small mistake. the reviews are then not fully honest.
instead just boil down the feedback into two "overall positive" and "overall negative" categories. and from those let people choose a sub-reaction that best describes their experience.
in case of videogames (because that's what i'm most familiar with) you could have this:
the reviewers then would need to decide whether their feeling about a game is overall positive or negative. and then they would need to choose the sub-category that most closely matches their feeling.
when comparing game a vs game b and you see that the first has score 7 and the latter has score 8, does that really give you good information? those scores are super subjective. but when i see that game a's review is "good gameplay" vs game b's is "good story" then i can compare games already. i might opt for the former because gameplay is what i want from games. i'd look for movies or tv shows if i want good stories anyway.
another way to approach this is to allow reviewers to pick multiple reactions, not just one. so a game could be marked as "good gameplay, good story" but also as "short, buggy". in a 5 scale rating system that would mean a 3 but in this detailed system i get a much better understanding of what to expect from this small structured piece of information.
such multi-option selection could be allowed for the emoji reactions too but i'm a bit wary of it because it might be a bit too complex to use and reason about.
# summary
to summarize my thoughts: emoji reactions (and review systems) are currently a bit fluffy and don't give much useful information for users. but with some tweaks and in exchange for a little bit of complexity these could be turned into super useful data. i hope various systems will slowly pick up such changes in the future.
published on 2024-05-13, last modified on 2024-09-23
# redir: implement shortlinking via a redirect script
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/redir.html.
[non-text content snipped]
[non-text content snipped]
[non-text content snipped]
it's nice when in markdown source i can simply write i/15292 and it auto-linkifies to my project's issue tracker. or when i can write cs/file:regexp/regexp.go and it links to a code-search of the given query in my project's repository. or when i can have goto links like go/ref#Operator_precedence.
[non-text content snipped]
what's even better? if i can type those queries into my browser's url bar and i'm navigated to the desired sites right away.
(i am using go as an example project, i am not affiliated with it.)
in this post i describe a hacky way to achieve that in a simple manner with only a minor usability drawback. i'll focus mostly on how to make this work in the url bar.
# dns search list
one very complicated approach to achieve this is to create a redirect service such as redirect.mycompany.com. then add redirect.mycompany.com to the dns search list (see man resolv.conf). then when in the browser you type i/123, the local dns resolver will try to resolve i.redirect.mycompany.com first.
i will not elaborate on this because it is hard to set up, hard to maintain, insecure because you can't do https, etc. i don't recommend this at all.
# browser extensions
another approach is to use a browser extension for this. https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webNavigation is one api with which this can be implemented.
i haven't really tested this but you would need something like this in the extension:
// this is where the shortlink -> full url logic would live.
function expandurl(url) { ... }

function navigate(ev) {
  if (ev.frameId != 0) return;
  chrome.tabs.update(ev.tabId, {url: expandurl(ev.url)});
}

chrome.webNavigation.onCommitted.addListener(navigate, {url: [{urlMatches: '^(i|cs)$'}]});
you would need only the tabs and webNavigation permissions. and i think this works even when clicking on shortlinks in a webpage. a less intrusive approach for an extension would be to install itself as a search engine. then it wouldn't work for clicking but the rewriting would still happen when you enter such a shortlink into the url bar. see the search_provider setting at https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/manifest.json/chrome_settings_overrides.
there are a couple of commercial extensions that implement this idea one way or another, for example:
these seem to only work for the go/ prefix, not any other such as i/ or cs/. maybe those are configurable too, not sure. but here i'm talking about creating all this ourselves anyway.
creating a browser extension is somewhat involved. and a lot of people are a bit uneasy about installing random extensions. it's very easy to change an extension to mine bitcoin without the userbase ever knowing about it.
but if we can accept a slightly less convenient experience, then there is another solution.
# redirect page
first create a static redirect page hosted on your site or github pages (for free). it's an html page with a bit of javascript that redirects based on the parameter after the hash (#) part in the url.
<script>
if (location.hash.length >= 2) {
  let newurl = ... // compute where to redirect based on location.hash.
  window.location.replace(newurl)
}
</script>
for demonstrational purposes this page is such a redirect page:
but you can go moar crazy and redirect based on full urls too (prefixing with https:// is also fine):
sidenote: if your static file server ignores the query part of the url then you can put the redirect part after a ?q= and have the javascript redirect based on that instead.
# urlbar keyword
once there's such a page, all that's needed is to hit that page when a shortlink expansion is desired. all solutions will require a keyword. suppose the keyword is `s` as in shortlink. then in the url bar you need to press s, then space, and then you can write the shortlink.
so to go to a github issue, you would need to type "s i/123" into the url bar and press enter.
i'll document how to set this up on a desktop. i'm not sure about mobile phones, i'm not even sure i would care enough to have this over there.
# firefox: bookmark with a keyword
in firefox you can assign keywords to bookmarks. so bookmark this site: click the star button on the right side of the url bar. then find the bookmark in the bookmark menu and click edit. append "#%s" to the url and add a keyword like this (the screenshot adds "s" as the keyword):
[non-text content snipped]
and that's it. i don't fully understand why chrome doesn't allow assigning keywords to bookmarks.
there's another quirky way to achieve the same. firefox apparently adds an "add a keyword for this search" option for most simple input elements in a form. right click on this input below and click "add a keyword for this search" to add it. pick a keyword such as "s" to be able to save it:
[non-text content snipped]
# chrome: custom search engine
follow https://superuser.com/a/1828601 to add a custom search engine in settings at chrome://settings/searchEngines. use https://iio.ie/redir#%s as the url.
i think you can make the keyword empty (or mark the search engine as default) and then it becomes the default search engine. then you won't need to type a keyword to trigger the redirection. if you do this then make sure the redirector is a passthrough for most urls.
one advantage of putting the search query after the hash (#) and then doing the translation locally is that the search query won't be sent to the server. that's because the server doesn't see the part after #. i type all sorts of sensitive garbage in the url bar so this approach reduces the risk of my garbage appearing in various server logs.
# firefox: custom search engine
in firefox the option to add custom search engines is hidden by default. you can enable the add button like this: https://superuser.com/a/1756774. then a similar approach should work as described above for chrome.
alternatively, you can set up https://developer.mozilla.org/en-US/docs/Web/OpenSearch xml for the redirector service. then the user can install the search engine relatively easily.
the site needs a <link rel=search ...> tag, then you can right click in the url bar and add the search engine from there. i have that for this page: right click in the url bar and select "add iioredir" from the context menu. then you have to manually assign a keyword to it in the firefox settings. search for "search shortcuts" in the settings to find the form for this (about:preferences#search).
this way of adding search engines is not supported in chrome because they feel it leads to clutter for users, see https://stackoverflow.com/a/75765586.
# rules
all right, the user can use the redirect page. but how to implement it? how to represent the redirect rules? how to represent the rule that transforms i/123 to https://github.com/golang/go/issues/123?
requirement: the rules should be easy to parse in both javascript and go. javascript support is needed to make the redirection work in the url bar without specialized server software. the go support is needed to support applying the redirect rules to markdown (assuming the markdown renderer is in go). so when the user writes i/123 in the .md file the generated html page will contain a link to https://github.com/golang/go/issues/123. this avoids an unnecessary hop to the redirect service and makes the link work for users who don't have any redirection set up.
(the downside of skipping the redirection service is that you cannot track how often a rule is used. if you care about that then it might make sense to rely on a redirection service. but i recommend not tracking it, it creates all sorts of wrong incentives.)
to make things easy to implement, i propose representing the rules as a text file with the following syntax:
i'd support two forms of rules: simple prefix replacement and complex substitution. the github issue redirection could be described via two simple prefix replacement rules:
rule i [a-zA-Z] https://github.com/golang/go/issues?q=
rule i .* https://github.com/golang/go/issues/
the first one leads to the search site. so typing i/regexp would search for issues about regexp. but if the user types a number, they would get to the page with that id. testcases can describe this more clearly:
test i https://github.com/golang/go/issues/
test i/123 https://github.com/golang/go/issues/123
test i/is:closed https://github.com/golang/go/issues?q=is:closed
websites can be easily added with the same syntax:
rule twitter.com .* https://nitter.poast.org/
rule x.com .* https://nitter.poast.org/
test twitter.com/carterjwm/status/849813577770778624 https://nitter.poast.org/carterjwm/status/849813577770778624
test x.com/carterjwm/status/849813577770778624 https://nitter.poast.org/carterjwm/status/849813577770778624
complex replacement would be needed whenever you want to extract bits of the shortform and convert them into a more complex url. this would trigger whenever the replacement contains a $ symbol. hypothetical example:
rule aclcheck ([a-z0-9]*)/([a-z0-9]*) myaclcheckservice.example.com/check?group=$1&member=$2
test aclcheck/employees/alice myaclcheckservice.example.com/check?group=employees&member=alice
or here's a youtube -> invidious example:
rule youtube.com ^watch.*v=([a-zA-Z0-9_-]*).* https://yewtu.be/watch?v=$1
test youtube.com/watch?v=9bZkp7q19f0 https://yewtu.be/watch?v=9bZkp7q19f0
the exact syntax for the replacement is described at https://pkg.go.dev/regexp#Regexp.Expand. javascript follows similar rules.
ensuring the same regex works both in javascript and go is important. but that's why i propose that the datafile contains tests. they can run for both the go and javascript implementation to make sure they work across platforms.
here's an example implementation in go: @/redirgo.go. and here's an example implementation in javascript: @/redir.js. look for the newruleset() and the replace() functions. the javascript one is the actual implementation that's driving the redirect rules on this page.
the main reason i have separate keyword and pattern parts in the rule definition is efficiency. the replace logic splits on the first / of the query and treats the first part as the keyword. that allows quickly filtering the rules: the implementation doesn't need to try matching every regex, which could be slow with a lot of rules.
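for illustration, here's a minimal go sketch of this replace logic (the ruleset representation and names are simplified assumptions of mine; the real implementations linked above differ):

package main

import (
	"fmt"
	"regexp"
	"strings"
)

type rule struct{ pattern, replacement string }

// ruleset maps a keyword to its rules in file order.
var ruleset = map[string][]rule{
	"i": {
		{`[a-zA-Z]`, "https://github.com/golang/go/issues?q="},
		{`.*`, "https://github.com/golang/go/issues/"},
	},
}

// replace expands a shortlink like "i/123" using the first matching rule.
func replace(query string) string {
	keyword, rest, _ := strings.Cut(query, "/")
	for _, r := range ruleset[keyword] {
		re := regexp.MustCompile(r.pattern)
		m := re.FindStringSubmatchIndex(rest)
		if m == nil {
			continue
		}
		if strings.ContainsRune(r.replacement, '$') {
			// complex substitution: expand $1, $2, ... from the pattern's groups.
			return string(re.ExpandString(nil, r.replacement, rest, m))
		}
		// simple prefix replacement.
		return r.replacement + rest
	}
	return query // no rule matched.
}

func main() {
	fmt.Println(replace("i/123"))       // https://github.com/golang/go/issues/123
	fmt.Println(replace("i/is:closed")) // https://github.com/golang/go/issues?q=is:closed
}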
# goto links
another common usecase is the "goto links". these are links in the form of "go/someid" and link to some other website. and then users can freely set up new go links. this is the idea behind https://golinks.io and https://trot.to.
(i'd use the e/ prefix for such a usecase because it's shorter and still easy to pronounce. the "e" can mean "entry link". but i'll stick to the go/ nomenclature because that's what's commonly used.)
it should be easy for users to add new go links. if you have a separate service for this then all you need is this rule:
rule go .* https://goto.mywebsite.com/
and then the users would edit such links in that service.
but if you don't then let users simply add the goto rules directly into the rules file:
rule go ^blog([/?#].*)? https://blog.go.dev$1
rule go ^book([/?#].*)? https://www.gopl.io$1
rule go ^ref([/?#].*)? https://go.dev/ref/spec$1
then go/ref#Operator_precedence would link to https://go.dev/ref/spec#Operator_precedence.
currently it looks a bit ugly with the `rule` syntax if i want to be able to append stuff after the url such as in the go/ref example. but you could add a `gorule` directive to better handle the specialness of it. then you could write something like this:
gorule blog https://blog.go.dev
gorule book https://www.gopl.io
gorule ref https://go.dev/ref/spec
perhaps you would also want some acls on these links so an intern wouldn't be able to steal popular links and link them to the rickroll video. but i won't go into that here.
# demo
for reference here's a demo that implements the above rules. you configure the data here:
[non-text content snipped]
and here are the test results (updated after each change):
[non-text content snipped]
# automatic linkification
when we have the rules, we can easily linkify text. suppose that the "replace()" function runs the substitutions. then the following snippet can markdown-linkify all simple instances of such links (warning: this is a bit simplified, doesn't handle all edge cases):
function replaceall(ruleset, text) {
  return text.replaceAll(/[a-z.]*\/\S*\b/g, s => {
    let r = replace(ruleset, s)
    if (!r.startsWith("http")) return s
    return `[${s}](${r})`
  })
}
this transforms a text like this:
[non-text content snipped]
issue i/123456789 will be hard to fix. the problem is this bug: cs/f:tar/common.go+%22could+overflow%22.
[non-text content snipped]
into this form:
[non-text content snipped]
sidenote: currently on this blog i don't do such transformation. i trigger linkification only after the @ sign (though i do linkify http:// tokens too). this lets me write i/123, u/alice, etc type of text without worrying about unexpectedly getting linkified to the wrong thing later in case i ever add shortlink support to my blog. so if i want to have i/123 linkified by my markdown renderer (assuming i have a rule for i) then i would type @i/123. it does add some visual noise to the text but in exchange i have less worries. i might change my opinion on this later though.
# reverse rules
once you have all this, create an extension or bookmarklet that can create shortlinks from long links. so when you are on https://github.com/golang/go/issues/123 and press the extension's button, it will copy i/123 to the clipboard. this way people can easily create shortlinks without needing to remember the exact rules. you can implement this in the same ruleset via a "revrule" directive.
an extension is nicer because it can add a button next to the url bar and can support hotkeys too. if a bookmarklet is desired then https://stackoverflow.com/q/24126438 could help to keep it short.
[non-text content snipped]
# keep urls simple
ensure links in common tools have a simple, memorable url structure. then people are more likely to linkify things naturally.
linking things together is what makes the web great. it allows us to dig deep into things. wikipedia is great. i don't say everything should be linkified (e.g. every word linking to thesaurus). but do give linkable references where it makes sense. and if you are creating documentation tools then make sure that linking things in it is easy.
[non-text content snipped]
published on 2024-05-20, last modified on 2024-09-02
# tlogging: sample the current activity every 30 minutes
i used to struggle with focusing on work tasks. sometimes i just felt overwhelmed and didn't know which task i should work on. i solved this problem by always picking the oldest task in my todo list whenever i started feeling overwhelmed.
but even then, i often got distracted and made little progress. i then tried techniques like pomodoro. the idea is that i select a task and then work on that for the next 30 minutes while there's a timer on my desktop reminding me to focus. then i take a 5 minute break and repeat. this didn't really work for me either. i just found it to be way too much of a hassle to be this formal about my focus.
# stochastic time tracking
i kept looking for tips and tricks and eventually stumbled across the idea of work sampling or stochastic time tracking as described at http://messymatters.com/tagtime. the idea is that a tool regularly interrupts me and asks what i am doing. i describe what i'm doing (or what i was doing the past few minutes) and then i have a logfile with samples of activities. it's like linux perf's statistical profiling but for humans. or like a reverse pomodoro technique.
i thought maybe better time tracking would help with my focus because i would better understand where my time goes and could adjust accordingly. this type of sampled logging immediately resonated with me because i did something similar when i was doing @/recording. i was experimenting with recording my screen and i wanted an up-to-date statusbar that displays what i am doing in each moment. i kept forgetting to keep it up-to-date so i created a nagging tool that asked me to periodically update the status. and whenever i did this, i had pretty sharp focus for some reason. the only difference here is that now i also record the timestamp whenever i add a status update and append the new status to a logfile.
back then i named the tool tlog as in "time logging". i quite liked the name so i kept it and use it now to refer to the whole method.
i started doing this type of tracking about a year ago. i don't do random sampling because i found that annoying. i just sample myself each ~30 minutes. this works out quite well with meetings. they are usually 30 minutes so if i add a sample just before the meeting, i get the next nag right at the end. and then i can log a quick note about the most important takeaway from the meeting.
# work log
these samples give me a very useful work log at the end of the week. i use that to fill in a short weekly summary note which is encouraged at work. the weekly summaries aren't mandatory but i do them nevertheless. i used to feel like i was doing nothing interesting at work and was very unproductive and useless. but doing this weekly review completely eliminates these dark feelings. and if management ever asks me what i was doing recently or where my time is spent, i can always easily answer that question.
whenever i'm interrupted i either log what i'm doing right now or log where the bulk of my time went in the past half hour to ensure the big items are covered. if i finished two tasks, i might add two samples. this biases the timing data but i don't really care about that. the number of samples is already enough for me to see where the bulk of my time is going.
i don't mind the interrupt either. if anything, it helps me focus. it serves me as a reminder to get back to work if i started to drift in thoughts.
if i find that i was mostly slacking, browsing, procrastinating in the past half hour, i just log it as "slacking". i feel a bit bad whenever i do that so this encourages me to achieve some useful stuff by the next sampling. writing a nice beefy update feels good. this then motivates me to work more so there's a positive reinforcement cycle to keep me working rather than procrastinating. if i procrastinate too much then i feel depressed due to feeling useless. this method eliminates a lot of procrastination for me so thanks to it i feel sad less often. i like this method from a mental health perspective too.
# tool
this is the command line tool i wrote for this: https://github.com/ypsu/cfg/blob/acdf4f5/utils/tlog.go. if i run `tlog -w` it starts watching ~/.tlog. it wakes up every minute and checks the last modification of that file. if it's older than ~30 minutes, it emits an alert to the terminal by writing the "\a" byte. my tmux is set to highlight the window with an alert and my i3 also alerts me if an xterm window has the alert bit set. that's quite non-intrusive and even if i accidentally clear the alert status, one minute later the tool will re-alert so i can't miss it for long.
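the core of that watch mode is tiny. a simplified sketch (not the real tool's code):

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

func main() {
	tlog := filepath.Join(os.Getenv("HOME"), ".tlog")
	for ; ; time.Sleep(time.Minute) {
		st, err := os.Stat(tlog)
		if err == nil && time.Since(st.ModTime()) < 30*time.Minute {
			continue // recently updated, nothing to do.
		}
		fmt.Print("\a") // the alert byte; tmux/i3 highlight the window on it.
	}
}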
the second invocation mode is `tlog [message]`. this simply appends the argument to ~/.tlog along with a human formatted timestamp in the local timezone. so i can just run "tlog slacking" or "tlog myproject: i fixed the race condition bug" in any terminal to log my work from the past half hour.
i can also run `tlog` on its own and it starts vim in insert mode for me to write the update. sometimes editing in vim is more convenient especially if my update contains all sorts of weird quotes.
to review the updates i just open ~/.tlog in a text editor and read through the log. then i summarize the more interesting updates into the work summary tool mentioned above manually.
# log format
i like to categorize my time into broad groups. so the first token of my update is usually the project name and then comes the message. so i might write `tlog project1, some achievement` or `tlog pblog, finished the tool section of the tlogging post`. i use a comma to separate the group name instead of a colon just because it's easier to type, it doesn't require pressing shift. i just write `tlog emails` if i was just reading emails.
this helps me to see where my time is spent in a broad manner. it's clear which project takes up most of my time just by eyeballing the beginning of the updates.
i also track meetings in the form of "meeting, foo sync, discussed x, i proposed y and people liked it" or "meeting, team sync, discussed status updates, nothing interesting". having such data for my meetings is nice to have in case i want to see how much time i spend in meetings and how useful they are.
# consumption log
i've seen a high level, tech-lead-like person who summarizes every document he sees for his own reference. basically every time he reads a work related design doc or roadmap, he makes a 2-3 line long note about the most important takeaways into a notes doc of his own. then if somebody references a doc or asks him about a doc he read a while ago, he doesn't need to re-read it again. i think such summarization is really helpful in remembering the things you consumed. and it also works as a work log of where his time was spent.
i also see a lot of bloggers keep similar notes for all the books they read albeit not timestamped. example: https://sive.rs/book.
i already have a lot of private notes about various things. i have a notes file where i just keep appending various notes. for every note i give a short title and timestamp. i wrote about an earlier version of this workflow in @/task.
often it's not the note itself that's most helpful but the very act of intentional summarization. the act of reconstructing the new information with my own words in my own mental world can deepen the understanding and help retain the information.
nevertheless the notes are super useful. i often end up looking up some random tidbits in that file. but i don't do it for every document, every website, every youtube video i consume. mostly because it is a lot of effort: i'd need to open the editor, add the title, timestamp, etc.
now with the tlog tool this is much less effort. and i now started doing this for some content i consume. so my .tlog file not only contains status updates but reference information too. convenience of the tool really helps me be more proactive in keeping notes.
but i don't do it for every document i read or video i watch. i'm not at the level of that guy i mentioned above. too much effort. maybe one day.
i also put short, completely random unstructured notes into it occasionally since it's more convenient than opening my proper notes file.
sidenote: i did consider the zettelkasten method for my notes at some point. but i found that a simple structure in a text file is more than enough for me, no need to overcomplicate my life.
# recommendation
doing this requires non-trivial effort. it requires me to periodically summarize my past half hour into a few words. it can be hard sometimes because i have a bad memory. so i only do this during my work hours, for work items.
so do i recommend this to others? nope because of the effort it requires. i think i only do this because i have some sort of obsessive mind. but i don't think most people care about such things or even care where their time goes.
even for me this was not the first experiment that worked (e.g. i tried pomodoro too). if a recommendation is needed then i'd say experiment with various things and stick to whatever enjoyable method works.
published on 2024-05-27
# msgxchg: exchange secret messages instead of gifts in secret santas
imagine a typical central european school for kids. you have a group of 20 kids who visit the same classroom for 9 or so years. they know each other quite well.
because it's a tightly knit group, secret santa is a very common tradition there. the kids draw names and so they get a secret assignment: give a gift to the drawn person. wikipedia: https://en.wikipedia.org/wiki/Secret_Santa.
there are other places where this is done: among friends, among employees, etc. i just mentioned the school as an example of a tight group and because that's where i encountered this.
but ugh, i hated this. i never could figure out an adequate gift. i never really appreciated any gift i received. and it just felt like a waste of money and effort, created unnecessary trash, etc etc. i thought and still think this tradition is silly. my love language is not gifts i suppose.
# alternative idea
i do like the intent behind the game though. so here's my proposal for alternative rules, one that might be more meaningful.
each person draws 2 other people randomly. this is mostly for redundancy reasons and to make the game more challenging. perhaps software could do the random assignments to avoid a person drawing themselves.
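one simple way to do that: shuffle the participants and have each person draw the next 2 people in the shuffled circle. nobody draws themselves and everybody is drawn exactly twice (assuming at least 3 participants). a minimal sketch in go:

package main

import (
	"fmt"
	"math/rand"
)

// assign gives each participant 2 targets: the next 2 people in a randomly
// shuffled circle.
func assign(names []string) map[string][2]string {
	shuffled := append([]string(nil), names...)
	rand.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	targets := map[string][2]string{}
	n := len(shuffled)
	for i, name := range shuffled {
		targets[name] = [2]string{shuffled[(i+1)%n], shuffled[(i+2)%n]}
	}
	return targets
}

func main() {
	for who, t := range assign([]string{"alice", "bob", "carol", "dave"}) {
		fmt.Printf("%s sends messages to %s and %s\n", who, t[0], t[1])
	}
}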
then rather than giving gifts to each other, give 2 messages to each target. the sender has to respond to 2 prompts:
in the second the recommended action should be something the sender truly believes will do good for the target. the target doesn't have to accept the offer but if they do, the person offering should carry it out.
# motivation for sending
both questions are tricky especially if i don't know the target person very well. if i can't think of a nice recent action then i would need to go and ask the target's friends about things the target did recently and then write about the one i liked the most. this would increase my social knowledge about my group both by talking to others i normally don't talk to and by learning something about the target. and socializing outside of my comfort zone is healthy anyway.
the second question has similar benefits too. but it also makes me think harder: i have to figure out some activity i would be happy to do with the other person. so here i need to come up with something i can do even if i don't like the other person. teaches me to find common grounds.
here are some example offers:
# motivation for receiving
i think it would be nice to receive these too. it's nice to hear someone noticing and then calling out something i did. it makes my heart warm.
the offer thing might be hit and miss. but maybe deep down i do think i need help with math but i never really asked for help. here i can give it a shot for free. or maybe i do think i should socialize more, i'm too much of a loner. an offer for a chat in such a case can be worth much more than a cheap chocolate.
these gifts have the chance to be truly meaningful for the receivers. and with two received offers the receiver has a higher chance of getting something useful.
# process
when the game begins, the participants get the assignments from the admin (e.g. the teacher in the school example). they have 2 weeks to come up with the messages.
then the admin has 1 week to review the messages. there's still a chance that some people might send mean messages or the messages don't fit the prompts. when the admin sees invalid messages, they work with the senders to improve them.
then the messages are revealed to the recipients. this is done privately, other participants cannot see what others received or from whom. and participants should not reveal these even voluntarily, to avoid embarrassing the senders and to avoid peer pressure on people who truly don't want to reveal their received messages. such a no-reveal commitment might increase the sensitivity and thus the personal value of the messages. it creates unique, secret bonds between random pairs of people. it's called /secret message/ exchange for a reason.
the admin part can be delegated to a computer program if the group is tight and the chance for a misunderstanding is low.
(the admin cannot take part in the game itself. an alternative solution to administration is peer review: each participant nominates a trusted message reviewer from their peers. the reviewer has a chance to review the incoming messages and work with the senders to improve them if needed, without leaking the contents and the identities before the reveal time. hopefully this would prevent receiving messages that are a truly bad fit. the reviewer would significantly bias the messages so i'm not sure this is a good idea but it's worth considering.)
and that's it. this can be played with online friends too, no need for physical presence or even sharing real names or addresses. not that i'll ever play this game since i'm no longer in school or part of a tight group. but something i wish i could have tried instead of the useless gifts. maybe next life.
(also if this is something you would want to try with a group and you need an app then let me know, i can whip together a simple page for this on this site. mention the variant you need. in exchange i'd like to know if people enjoyed this or not after the event.)
published on 2024-06-03
# rssfeed: add rss feeds to your blogs
an rss feed is a page that a simple blog such as this one can provide to report the latest posts on the blog in a structured format. then rss feed reader software can periodically fetch these and show the users the latest posts across many blogs. users can follow others without algorithmification and ads. pretty cool.
for an example of a feed go to my @/rss page and check out the source via the browser's source explorer.
for a feed reader i recommend feedbro because it's a free, locally running browser extension and doesn't need registration to online services. there are many others, possibly better ones, so do look around. (i haven't, explained in the next section.)
rss used to be a big thing while google reader existed. it became a bit obscure after that shut down but still many sites provide it and there are many feed readers.
in this post i want to compare rss to the alternatives and then give some implementation tips.
# addictive
disclaimer: i don't use any feed readers, i don't follow anyone with rss. in general i find most of these things too addictive. when google reader was a thing, i spent way too much time in it.
it's worse than something like a facebook or tiktok feed where you just get garbage content that you can easily quit after you spend 3 seconds thinking about it. your own feeds are personally curated, probably high quality and interesting so it makes it harder to quit.
but i did it nevertheless. now i rely on my memory to follow blogs and video channels. whenever i am in the mood for some fast-food equivalent of browsing, i just go through the blogs i remember, type their urls (i use no autocomplete) and check for new content manually like cavemen. if i forgot the exact url then i just websearch for it. if i forgot about a blog completely then good riddance, i just saved couple minutes for myself. then later if i rediscover the forgotten blog then it's always a nice present to read many new posts.
but nevertheless, rss is cool. i don't really like some of the quirks of it but at least we have a standard.
note that there's also atom. it's very similar, i don't fully understand the differences, but consider everything i say here to apply to that too. atom uses iso 8601 timestamps so it must be better. but iirc rss is the more popular term, that's why i talk about rss. i don't go into the technical details too much anyway.
# alternative: social network
one alternative to rss is to re-post the content on social networks and then the on-platform followers will get a notification about it if they are lucky. lot of people do this. if growing and reaching a large audience is the goal this is probably unavoidable.
it's a bit unreliable because as far as i am aware these megacorps treat following and subscriptions more as a hint than an actual request. so a new post might be completely hidden from the followers.
and it isn't suitable for all followers: not everyone is comfortable letting megacorps know what they are following. not everyone has accounts on social media sites.
# alternative: web's push notifications
apparently javascript allows webpages to register for push notifications. then the website can remotely wake up a service worker in the user's browser to show a notification. this works even when the user doesn't have the page open! to reduce the creepiness of this, the service worker must show a popup so then the user notices that the website's code ran in the background and can disable that. (the service worker can skip showing a notification but if it does it too often, it will lose the notification permission and thus the permission to run without the page being open.)
this is pretty anonymous so it might be good for following sites. but it requires installing service workers so if the user changes browsers or clears site data then they might lose the subscription.
followers would get a notification on their phone or desktop immediately when the new content appears. i think this is very annoying so i'd guess not many people would sign up for this anyway.
to be fair, this is quite interesting technology, i might make a separate post about this later.
# alternative: email
the content creator can create a newsletter to which the users subscribe by providing their email address. then the creator just sends out an email whenever new content is published.
this can be convenient for the user because they can use their advanced email filters to categorize their subscriptions. if it's allowed then followers could even reply to the content and increase their engagement with the creator.
and it's also nice for the creator: they can see the number of their followers. and these are most likely real, interested followers since subscribing to a newsletter is a bit harder than subscribing to a channel on social media.
there are several problems though:
so i suppose email is somewhat okay but it's messy and still might not reach all people.
# alternative: rss
rss on the other hand is very simple to set up and can be served from a static website. just fill a couple fields in the xml such as publish date and title and it's done.
most rss feeds (mine included) also put the content into the feed but it's somewhat tricky. and each feed reader displays the feed slightly differently. if this is a concern, the feed could just contain a link to the main page. its main purpose is just to notify users of new content anyway. that's what i did for a while until i beefed up my feed generator.
some people put all content into the feed resulting in huge feeds. i recommend against this, just keep a couple months worth of content in the feed to keep it short (assuming it's updated regularly). otherwise the frequent fetches by the various readers can cause undue load. a small feed should be fine because i think most people only care for the freshest content anyway. for the "i-want-to-read-everything" usecase i recommend creating separate archive pages. my blog has that too, it's the github backup link at the top of the @/frontpage.
see http://rachelbythebay.com/w/2022/03/07/get for some other tips to reduce the load. (not sure why that post doesn't consider reducing the size of the feed though.)
the downside of rss is that it requires specialized tools so it won't reach many people either. but it's the cleanest subscription mechanism for the followers because it doesn't leak much towards the site. of course an evil rss feed could do shady tricks like providing personalized rss feeds or pixel tracking, but the other alternatives can be worse.
# implementation
i don't go into generating the feed itself, there are other, better pages for that. just couple notes on what to do once the feed is ready.
add something like this to each page's html header:
<link rel=alternate type=application/rss+xml title=domainname-or-other-title href=link-to-rss>
in my case i have this:
<link rel=alternate type=application/rss+xml title=iio.ie href=rss>
this will allow rss feed extensions to automatically recognize rss feeds in the page and the user can add them via one click. usually this is why they have the "access your data for all websites" type of permissions. (not sure if that can be disabled in the extensions if that's a privacy concern.)
for the love of god, set the content disposition as inline for the rss feed. it's so aggravating when i click on someone's rss link and i get the browser's intrusive download prompt. what am i supposed to do with that? with inline disposition the browser will display the raw xml. but at least i can easily copy paste the rss link from the url bar. serve it from an "rss.txt" if your static file server determines the disposition based on the filename.
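if the feed is served by a dynamic handler then the header can be set explicitly. a minimal go sketch (the feed content itself is elided):

package main

import "net/http"

var rssfeed = []byte(`<?xml version="1.0"?>...`) // the pregenerated feed.

func main() {
	http.HandleFunc("/rss", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/rss+xml; charset=utf-8")
		// inline disposition: browsers display the xml instead of downloading it.
		w.Header().Set("Content-Disposition", "inline")
		w.Write(rssfeed)
	})
	http.ListenAndServe(":8080", nil)
}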
for bonus points add styling to that raw xml via https://en.wikipedia.org/wiki/XSLT. xslt is pretty cool. the server serves raw xml data and then xslt transforms that into a nice user interface without any javascripting. i do this on @/rss but my xslt skills are very basic so i just do the basic transformation of showing the title (@/rss.xsl).
# recommendation
if you have a blog, add an rss feed to it, because it's a relatively simple static content that only needs updating whenever new content is added. give people choice how they follow you.
btw, if you want to follow something (e.g. such as a youtube channel) as an rss feed and the main site doesn't seem to be providing them then look for rss feed generators. e.g. for youtube the invidious frontend (such as https://yewtu.be) does provide the feed in a convenient place: it's the link in the top right corner on a channel's video page. (yt provides it too, but it's somewhat hidden, see comments.) web search can find generators for other popular sites. there are even generic newsletter to rss converters such as https://kill-the-newsletter.com/. or there's https://newsblur.com/ which provides an rss like interface to popular sites.
rss is nice way to follow content you care about rather than what megacorps think you should see. (but keep in mind that it can be addictive.)
published on 2024-06-10, last modified on 2024-06-17
comment #rssfeed.1 on 2024-06-16
YouTube provides feeds, e.g. https://www.youtube.com/feeds/videos.xml?channel_id=UCK8sQmJBp8GCxrOtXWBpyEA
comment #rssfeed.1 response from iio.ie
ah, thanks, adjusted the text accordingly.
# tame: tame the inner animal
there is this unexplainable thirst for sexual intimacy in me. i cannot explain or describe it. i think a lot of people have it otherwise the porn industry wouldn't exist. it's fascinating to me, i call it my inner animal since it's driven purely by instinct rather than reason.
i struggled with it for a long time. it made me depressed because i didn't know how to quench these desires. i made lot of weird posts stemming from this when this blog was relatively young. but nowadays it's no longer a problem. i find some of those posts pretty dumb in retrospect but decided to keep them up. i think it's ok to be dumb on the internet, i find such blogs more charming.
anyway, i decided to jot down what helped me the most to tame my inner animal. such a post might end up being useful for me later in case it gets too wild again and i need a reminder. i'm not saying these things are generic and work for everyone. no, these are the things that worked for me and i'm just describing my experience.
# exposure
about five years ago i moved to switzerland. to my luck this country is more lax about sexuality.
during my teenager years i struggled with nudity. i avoided public showers, skipped showering instead, that sort of stuff. but here in switzerland a lot of saunas are naked. so we went to saunas with friends several times, all bare, everybody else bare, no shame. exposing myself and seeing others does help get over this irrational body shame stuff.
but more importantly, here in switzerland there are various courses about sexuality. tantra massage, orgasmic meditation, that sort of stuff. they are multi-day group classes: they explain the basics such as communication, consent, respect, arousal, anatomy, massage oils, various types of touches, etc. then the attendees are randomly paired, they get naked, and practice tantric massage on each other. and then teacher goes around and gives you feedback including stuff like whether you hold your partner's lingam (penis) or touch the partner's yoni (vulva) correctly. all very casually like in a cooking class.
i can highly recommend attending these courses if someone has such interests. they aren't restricted to switzerland only, they can be found in other countries as well. it might take a while to find the right web search keywords, it's not something widely advertised. sometimes they can be found by finding a few nearby independent tantra massage providers and looking for a "studies" or "certification" section on their website.
i attended a few of these and quickly normalized human bodies, touching others, etc. it helped reverse most of the bad thinking habits i had. this shit should be taught in high school. i mean as an optional practical class where interested pupils can learn and experience sexuality in a very intimate way with the help of professional models. if these things are given to teenagers in a practical but controlled manner then maybe they will make fewer mistakes and/or end up less psychologically damaged. but the world at large is probably not ready for this shift yet.
# therapy
so what if i have a deep desire to have a sexual experience with a different person than my life partner? do i spend thousands of dollars on therapy to try to suppress those emotions or medicate them away? well here's a cheaper way to address that: just have sex with another person if that's what you want. therapy doesn't get simpler than that.
the first step in this process is to open up to your partner. if you have any thought that is bothering you, you should tell that to your partner. that's what partners are for: to help each other. and chances are, assuming you have a reasonable partner that you trust, you can find some middle ground, some compromises, some experiments, etc. it might have some emotional costs but they might just be worth it if the partner doesn't want to live with a depressed, sad, lifeless person.
but this requires a rational partner. if sex is too much of an irrational taboo or monogamy has irrationally high value for them, then it can be tough. but even then, the communication channel must be established and be tread very carefully. it's hard but doable. i even experimented with stuff like @/optioning to help me bring up some topics in a slower manner.
and for communication in sexual desires i highly recommend the @/touch game. it helped me to become more assertive about what i want rather than simply hoping for the best. after a dozen sessions i sort of learned what particular activities make my inner animal the happiest. then i just ask for them from my partner and the obsessive thoughts stay at bay for much longer.
but yeah, i've had sex with professional escorts a few times, and it did calm down the desire for variety. it was a bit scary the first time but it gets easier as one gets more experienced. mature, independent escorts can give a really streamlined experience.
i didn't start with escorts right away though. i started with a few erotic massages and worked up my courage from there. this slow approach was also easier on the partner.
the point is that i don't feel depressed anymore, so i have no regrets. i'm very fortunate to live in a country that doesn't make a big deal from basic human desires and let people buy their happiness if they need it.
though note that this doesn't mean that the animal is fully gone from me. when it sees all barely clad ladies walking around in the hot summers, well, it still goes crazy. but at least i can now manage these uncontrollable emotions without going too crazy myself.
initially i thought i needed a secondary relationship and to live in some sort of polyamorous setup. but my partner pushed back against that and it's not like i could find another crazy person willing to enter a relationship with my lazy ass. then i explored the escorting aspect and a few occasions of it turned out to be enough for me. it's much less hassle than maintaining a relationship. the first idea is not always the best. relationships are too much effort anyway, one is more than enough for me. it might well be the case for others that a secondary relationship would work better than transactional sex. others might just need counseling or a psychologist. everyone is different.
# busyness
the other thing that helped taming the animal is that i learned to be busy. a few years ago i couldn't finish the simplest of projects. nowadays i can finish anything i put my mind to. i always have a little hobby project i work on every day. @/mornings describes my daily habit (albeit it's a bit dated, i have streamlined my morning routine since).
the benefit of this is that it keeps my mind busy. it busyloops about my project rather than exploring depressive thoughts and then spiraling into depression. i don't have time for depression. even if i feel like wanting to feel sad, it must come after i make progress on my current project. but the knowledge that i made progress makes me happy and then i don't feel sad anymore.
# age
and the other thing that is changing is that i'm getting older and so is my inner animal. these desires are much less intense compared to what i felt in my twenties. maybe a few more years and they will completely evaporate and i will have one problem less.
# communicate
that was my journey. if there's one generic advice i could distill then that would be this: communicate.
that's how i started. i started writing dumb blog posts as a means to explore my thoughts. the writing itself didn't solve my issues but it helped me to start talking, exploring and trying things and eventually i found what i need.
don't be shy, talk with your partner or @/stream your thoughts anonymously onto the internet. it's fun!
published on 2024-07-08
# slackday: move all internet slacking to a dedicated day
whenever i felt pressure or uncertainty at work, i often turned to aimless browsing (such as hackernews) or watching youtube videos. they give me a nice relief like smoking does for the smokers. but just like with smoking, i quickly get addicted to the distraction. i then constantly need the distraction to put my mind back into the comfort zone. the need for distraction then seeps into the afterwork hours too: then i watch youtube all night and then feel overwhelmed and barely make any progress on anything.
fortunately i have a regular @/reflecting habit that catches me spiraling into madness. this habit forces me to periodically reevaluate my life and come up with new @/habits if i feel i need adjustments. this time i came up with the idea of the weekly slackday.
the idea is simple: i commit to reading discussion boards, watching youtube, checking blogs, looking up trivia on the web, etc. strictly on friday. it's like the cheat day in diets but for the internet. if i'm itching on a different day then tough luck for me. my allowlisted itch-scratching options are: writing the itch down, freewriting, leetcoding, exercising, walking, showering, daydreaming, etc. i particularly like leetcoding. tackling an easy problem is the simplest but most satisfying distraction i can do that i don't feel too guilty about.
if i feel i want to look up something then i add its link or search query to my slackday todo entry. then on slackday i go through the queued stuff. simply writing the itch down helps calming it down.
# the effect
i feel more productive. i don't have access to distractions that are never-ending time sinks. so in the end i circle back to my todo lists. if i'm not sure what to do, i just look at my oldest assigned worktask and see if i can make some progress with it. most of the time i can do that. and then i get surprised how much stuff i can get done if i don't distract myself.
my interests also seem to be changing. because my time is limited i focus more on content that is more relevant for me. i spend more time on reading golangweekly.com articles and watch less kurzgesagt videos. the latter is too generic and i never learn much from it. but it's easy to consume so i never found a way to stop watching it. now it's easy: i simply don't have time for it on my limited friday anymore.
oh and i eat less junk food like potato crisps too. i used to eat it while watching youtube. but now less time for youtube means less time for junk food too.
in @/tlogging i mentioned i don't do consumption logging. but that's because consumption happened haphazardly, in the evenings or in small breaks, where such logging is inconvenient to do. but now that i'm spending time on consumption in a structured manner, doing consumption logging is easy. i started having short notes about various interesting blog posts that i could then later refer to when in future i try to make a post about those topics.
i'm doing this only for a little over two months now so the effect could be chalked up to https://en.wikipedia.org/wiki/Hawthorne_effect too where the increase in productivity is not due to the specific change but due to being more mindful about productivity stemming after any change. nevertheless i feel pretty confident that this has a net positive effect on me. in any case i'm also writing this post to remind myself to go back to this in case i start slipping in the future.
# 100% rule
100% rule is the explanation why such absolute commitment works:
100% commitment is easier than 98% commitment.
the short story is that i don't have to waste time thinking whether i can watch a youtube video this evening or not. it's simply not allowed, end of story. here are some links explaining the idea in more detail:
i highly recommend picking up this mental trick to deal with addictions. or a variant of it: in my case i don't fully reject the internet intake, i just limit it in a very predictable manner and that works for me. the point is to pre-make the decisions so i don't have to agonize about the same problem over and over throughout the day.
published on 2024-08-05
# featver: a semver compatible calendar based versioning scheme
i don't really like https://semver.org. it discourages code cleanups. even small changes such as removing an old unused function require a major version bump. i prefer the more natural piecewise evolution, time based guarantees and calendar based versioning schemes.
unfortunately the go tooling really insists on semver. the semver schema is enforced but its contract part isn't. so i came up with an alternative schema+guidance that looks like semver, is calendar based, and gives time based guarantees: https://ypsu.github.io/featver.
maybe i'm overthinking this. i'll experiment with it in my upcoming hobby projects and see.
published on 2024-09-02
# difftesting: review effect diffs instead of unittesting
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/difftesting.html.
four years ago i wrote @/goldentesting. my opinion hasn't really changed since. in fact lately i'm coding more so i had a stronger desire for the tool i outlined there. i thought a lot about how i could make it convenient in go+git and came up with these modules:
effdump is where i spent most of my efforts and i think it is the most interesting one. it's a bit hard to explain succinctly so here are two guided examples:
to make the package name unique and memorable i went with the term "code effects" instead of "output" or "golden output". so the library names are "efftesting" and "effdump". if i'm ever frustrated with tests, all i need to think of is "eff' testing!" and then i can remember my library.
# example usecase: my blog's markdown renderer
here's an example usecase i have for effdump. in this blog i have the source text of these posts in markdown and i have a hacky markdown renderer that converts these posts into html. the rendering happens in the server whenever a post is fetched (the result is cached though).
sometimes i change the markdown renderer, e.g. i add new features. whenever i do that, i want to ensure that i don't break the previous posts. so i'd like to see the rendered output of all my posts before and after my change.
effdump makes such comparisons easy. i just need to write a function that generates a postname->html map and effdump takes care of deduplicated diffing across commits. now i can be more confident about my changes. it makes programming less stressful and more of a joy again.
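a minimal sketch of that map-producing function (loadPosts and renderHTML are hypothetical stand-ins for my real helpers):

    // dump computes the postname->html map that effdump diffs across commits.
    func dump() map[string]string {
        htmls := map[string]string{}
        for name, markdown := range loadPosts() {
            htmls[name] = renderHTML(markdown)
        }
        return htmls
    }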
# example usecase: pkgtrim
here's another example where i used this: https://github.com/ypsu/pkgtrim. it's a tool to aid removing unnecessary packages from a linux distribution. in archlinux the list of installed packages is scattered across many files.
in order to test pkgtrim's behavior, i keep complete filesystems in textar files. i just grabbed my archlinux installation database, put it into a textar file, and made pkgtrim use that as a mocked filesystem. so my diff tests don't change even if i alter the real installation on my system. and i could add mock filesystems from my other machines too and see what pkgtrim does with them.
whenever i made a change i could immediately tell what its effect was across many inputs. i could immediately tell whether the diff was expected or not. if i liked it, i just accepted the diff. if i didn't like it, i continued hacking. but otherwise i didn't need to toil with manually updating the unit test expectations. developing pkgtrim was a breeze.
# caveats
but i'd like to add some caveats about output tests in general. they have a bad rap because they are hard to get right.
it's very easy to create an output that has spurious diffs after the slightest changes. e.g. outputting a go map will have random order. care must be taken that only the truly relevant bits are present in the outputs and any indeterminism is removed from the output. e.g. map keys must be sorted.
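for example, a sketch of rendering a go map deterministically so that every run produces byte-identical, diffable output (uses the go 1.23 maps/slices iterator helpers):

    // renderMap renders m with sorted keys so reruns produce identical text.
    func renderMap(m map[string]int) string {
        b := &strings.Builder{}
        for _, k := range slices.Sorted(maps.Keys(m)) {
            fmt.Fprintf(b, "%s: %d\n", k, m[k])
        }
        return b.String()
    }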
the outputs are easy to regenerate. this also means they are easy to skip reviewing and not fully understand. it's up to the change author to remember to review them. because of this, such tests are less useful for larger teams who might find them too cryptic. on the other hand in single person projects the single author might find them extremely useful since they probably know every nook and cranny of their code.
another effect of "easy to accept even wrong diffs" is that this approach might be less suitable for correctness tests. it's more suitable where the code's effects are rather arbitrary decisions, e.g. markdown renderers, template engines, formatters, parsers, compilers, etc. you could just have a large database of sample inputs, generate the sample outputs, and have these input/output pairs available during code review. then the reviewer could sample these diffs and see if the change's effect looks as expected. this could be a supplement to the correctness tests.
but also note that these days a lot of people write and update the unittests with artificial intelligence. people can make a code change and just ask the ai to plz update my tests. so the difference between the two testing approaches is getting less relevant anyway.
so output tests are brittle and easy to ignore. but they are not categorically wrong just because of that. there are cases where they are a very good fit and make testing a breeze. one needs a lot of experience with them to ensure these tests remain useful. unfortunately the necessary experience comes only after writing a lot of brittle and ignored tests. chances are that you will anger your colleagues if you do this type of testing.
caveats and disclaimers given, proceed with this approach at your own risk.
# diffing
diffing text is one of our fundamental tools in software engineering. distilling the effects of an application into human readable text and then diffing those texts can help a lot in understanding changes. it's the human way to make sense of the immense complexity of the world. there's a nice post about this here: https://exple.tive.org/blarg/2024/06/14/fifty-years-of-diff-and-merge/.
so go forth and distill effects into diffable texts and then learn through these diffs!
note from 2025-01-08: i've made a screencast about how to use these packages.
efftesting:
[non-text content snipped]
effdump:
[non-text content snipped]
published on 2024-09-09, last modified on 2025-01-08
# pkgtrim: a linux package trimmer tool
this post is about my little https://ypsu.github.io/pkgtrim project.
i tend to install too much crap on my machine and never uninstall any of it. this always bothered my minimalistic senses but i wasn't sure how to deal with this situation.
a friend showed me nixos and how you can have a config file and then drive the system installation from that config. i didn't really like nixos, it felt a bit too complex for my simple needs. but i really liked the config driven part.
the other thing he showed me was https://github.com/utdemir/nix-tree. this is a package explorer for nixos. it can also tell you the list and size of the "unique dependencies" of a package. these are the packages that have no reverse dependencies other than the given package. i really liked that because those are all the packages i could get rid of after uninstalling the given package.
my system is archlinux and after that meeting i was wondering how to have an intent driven installation and allow me to explore package relationships in a simple manner. i think i've managed to figure it out. this is what i came up with: https://ypsu.github.io/pkgtrim/.
the ~/.pkgtrim on my small rpi4 contains all packages i need along with a comment why i need them. while setting it up i've managed to delete some garbage from my system. now i could easily reinstall the whole machine, run `pkgtrim -install`, and end up with the same packages installed as i have now. and i can keep the .pkgtrim file in my dotfiles repo. i think i will sleep better now.
oh and i used my new @/difftesting approach to develop this. writing this tool was a breeze!
published on 2024-09-16
# starglob: simplified glob for simple needs
lately i had multiple cases where i wanted to give the user the ability to select multiple items with wildcards via a glob-like matcher:
furthermore there could be multiple matchers and an entry should be considered matching if it matches any of the matchers.
one way of describing a matcher is using regexes. so i'd use "linux-.*" and "outputs/.*" in the above examples. but i don't like this because regexes are verbose (i need the . before the *), are ambiguous about whether they need to match partially or fully, and are unnecessarily powerful for the above usecases.
interestingly i have a similar problem with globs. ordinary globs are non-trivial too: https://pkg.go.dev/path#Match. i don't need most of these features either.
so i ended up using a very small subset of globs: just the * is special and it can match an arbitrary number of characters. anything else is matched verbatim, including ? and [. these globs must fully match. example: "linux-*" would match "linux-v1.2.3/alpha" but not "somelinux-v123".
i'm not sure if this subset has a name but i went with "starglob" for simplicity. it's what i need in 90% of the cases so i might as well make my user interfaces use starglob by default.
another big advantage of this is that this is easy to implement, even to match with multiple matchers:
    // MakeRE makes a single regex from a set of starglobs.
    func MakeRE(globs ...string) *regexp.Regexp {
        expr := &strings.Builder{}
        expr.WriteString("^(")
        for i, glob := range globs {
            if i != 0 {
                expr.WriteByte('|')
            }
            parts := strings.Split(glob, "*")
            for i, part := range parts {
                parts[i] = regexp.QuoteMeta(part)
            }
            expr.WriteString(strings.Join(parts, ".*"))
        }
        expr.WriteString(")$")
        return regexp.MustCompile(expr.String())
    }
it just makes a single regexp that matches if any of the starglobs match. an empty set of globs matches only the empty string. to make the empty set match anything, i can add this to the beginning:
    if len(globs) == 0 {
        return regexp.MustCompile("")
    }
and that's it.
sidenote: in this implementation * matches path separators like / too. no need for a separate ** syntax for that. most of the time such a restriction is not needed so this is fine. it would be easy to add if needed though: first split on "**". then split the individual components on "*" and join those with "[^/]*". then join the "**" split with ".*". but again, this is rarely needed.
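a rough sketch of that extension, just to illustrate the splitting idea:

    // globToExpr converts a starglob with ** support into a regexp fragment:
    // * matches within a path component, ** matches across components too.
    func globToExpr(glob string) string {
        groups := strings.Split(glob, "**")
        for i, group := range groups {
            parts := strings.Split(group, "*")
            for j, part := range parts {
                parts[j] = regexp.QuoteMeta(part)
            }
            groups[i] = strings.Join(parts, "[^/]*")
        }
        return strings.Join(groups, ".*")
    }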
demo:
    func main() {
        glob := flag.String("glob", "", "List of comma separated starglobs to match.")
        flag.Parse()
        matcher := MakeRE(strings.Split(*glob, ",")...)
        allfiles, _ := filepath.Glob("*")
        for _, f := range allfiles {
            if matcher.MatchString(f) {
                fmt.Println(f)
            }
        }
    }
prints all matching files from the local directory. e.g. to print all source files:
go run starglobs.go -glob='*.go,*.c,*.cc'
easy peasy.
published on 2024-09-23
comment #starglob.1 on 2024-09-23
Implementing this via RE seems extraordinarily wasteful given the construction cost. Have you looked into this at all?
comment #starglob.1 response from iio.ie
agree that it is inefficient to construct. but i'd expect it to be rare that a user passes my application such a long list of complex globs that this starts to matter. matching should be ok in terms of performance.
i haven't looked into optimizing this much. if i wanted a faster or more featureful globbing (e.g. one that supports both alternatives and **) i'd probably go with a package. e.g. https://pkg.go.dev/github.com/gobwas/glob and https://pkg.go.dev/github.com/bmatcuk/doublestar both look nice.
this post is just a short snippet that is easy to copy paste into my future projects when the simple needs don't warrant adding a complex dependency.
# goref: express the non-nil pointer annotation in go with a generic alias
i don't write a lot of typescript but i occasionally dabble in it. i wrote this typescript code recently:
    // maybeVariableX and maybeVariableY type is string | null.
    // variableZ type is string.
    // then i had this code:
    if (maybeVariableX != null) {
        variableZ = maybeVariableY
    }
i got this error:
Type 'string | null' is not assignable to type 'string'.
i was... pleasantly surprised that this was caught. amazing! i wanted to have maybeVariableY in the condition, i just had a typo.
this thing is called "union types" in typescript. i don't really want that in go. but is it possible to have a similar nil-check in go?
i found a nice suggestion here to use & for non-nil pointers: https://getstream.io/blog/fixing-the-billion-dollar-mistake-in-go-by-borrowing-from-rust/. but that requires a language change, which is too big of a change.
based on https://go.dev/blog/alias-names now i could have a simple package like this to represent pointers that should not be nil:
    package ref

    type Ref[T any] = *T
it doesn't do anything in the go compiler, it doesn't create a new type. i can assign Ref[T] to *T and vice versa just fine. now i could write a code like this:
    func f(s ref.Ref[string]) {
        fmt.Println(*s)
    }

    func g(s *string) {
        f(s)
    }
this compiles just fine. it has a semantic problem though: g takes a potentially nil pointer and calls f which wants a non-nil pointer. but a sufficiently smart linter could give a warning here similarly to typescript above! and it wouldn't give the same warning for this code:
    func g(s *string) {
        if s != nil {
            f(s)
        }
    }
is this useful? would i use it? i don't know. but i wouldn't mind playing with it.
i'm not up for writing such a linter though. just wanted to express the desire that i'd like to try such a linter.
note 1: technically you could have achieved this previously without generic aliases too by writing `type RefT = *T`. the downside of that is that you need to do it explicitly for each type. or you could use some special `/*LINT:nonil*/` comment next to the var where you want non-nils. the downside of that is that it doesn't get included in the generated godoc so users might miss it. both of these lack the right ergonomics. i think the `type Ref[T any] = *T` might be just simple enough that it can catch on.
note 2: i can imagine using such aliases for other linter-only annotations too such as const, e.g. `type Const[T any] = T`. not that i want const annotations. i fear go is getting too complex.
published on 2024-09-30
# goerrors: annotate errors to save debugging time
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/goerrors.html.
instead of:
    if err := barpkg.Frobnicate(bazpkg.Twiddle(key)); err != nil {
        return err
    }
always write:
    package foopkg

    ...

    if err := barpkg.Frobnicate(bazpkg.Twiddle(key)); err != nil {
        return fmt.Errorf("foopkg.Frobnicate key=%q: %v", key, err)
    }
in other words: always %v (not %w) wrap errors with a detailed but succinct unique error identifier before propagating the error up. doing so gets you the best errors. that's it. thanks for coming to my ted talk.
[non-text content snipped]
[non-text content snipped]
nuance? yes, there's nuance to this.
for a long while i wasn't sure how to think about errors and always wondered what's the best way to go about them. after a good dose of stockholm syndrome i now love go's approach the best. there are a few concepts i had to understand before the big picture "clicked" for me.
# exceptions
go's errors are just ordinary return values. such system is often compared to exceptions. let's compare it to java. java has 2 types of exceptions:
which one to use? from https://docs.oracle.com/javase/tutorial/essential/exceptions/runtime.html:
If a client can reasonably be expected to recover from an exception, make it a checked exception. If a client cannot do anything to recover from the exception, make it an unchecked exception.
example for an unchecked exception: the code passes a null pointer to a function that accepts only non-null pointers. there's nothing the caller can do about this other than not calling it in the first place. so the fix here is a code change, not something that can be pre-coded.
another way to think about this: checked exceptions can be used for control flow. unchecked exceptions on the other hand can only be propagated up. they then end up in logs or presented to humans who can then do something about them.
# error domains
error values have 2 similar types (terminology from https://matttproud.com/blog/posts/go-errors-and-api-contracts.html):
this was a key realization for me that escaped me for years of programming.
if a function can return a domain error, that should be clearly indicated in its documentation. example from "go doc os.Open":
Open opens the named file for reading. If successful, methods on the returned file can be used for reading; the associated file descriptor has mode O_RDONLY. If there is an error, it will be of type *PathError.
anything else should be treated as an "opaque error". such errors should be propagated up or logged/presented when they can no longer be passed upwards. they should never be used for making control-flow decisions.
in general return opaque errors unless returning a domain error is explicitly needed. fmt.Errorf allows wrapping errors with both %v and %w: https://go.dev/blog/go1.13-errors#wrapping-errors-with-w. wrapping with %w keeps the error a domain error. therefore in most cases use only %v to ensure the returned error is opaque.
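a quick illustration of the difference (errNotFound here is a made-up sentinel error):

    var errNotFound = errors.New("userdb.NotFound")

    func demo() {
        err1 := fmt.Errorf("userpkg.LookupHash: %w", errNotFound) // domain error: wrapped
        err2 := fmt.Errorf("userpkg.LookupHash: %v", errNotFound) // opaque error: only the text is kept

        fmt.Println(errors.Is(err1, errNotFound)) // true: callers can match it and act on it
        fmt.Println(errors.Is(err2, errNotFound)) // false: callers can only log or propagate it
    }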
# annotating errors
the main difference between error values and exceptions is that such propagation has to be done manually after each function call that can return an error. but this becomes super handy! let's take this example:
    package userpkg

    ...

    func VerifyPassword(request *http.Request) error {
        ...
        hash, err := sqlpkg.LookupUserColumn(request.FormValue("username"), "hash")
        if err != nil {
            return fmt.Errorf("userpkg.LookupHash username=%q: %v", request.FormValue("username"), err)
        }
        ...
    }
with "return nil" type of error handling you might get this log entry:
/login failed: not found.
what was not found? with java-like exception handling you would get stacktraces too:
    /login failed: NotFoundException
        sqlpkg.LookupUserColumn
        userpkg.VerifyPassword
        handlerspkg.RequestHandler
still unclear what was not found. but maybe this time one could make reasonable guesses about the problem after a few hours of code reading. with the practice of adding handcrafted context at each level the log message could be this:
/login failed: handlerspkg.VerifyPassword request-id=12345: userpkg.LookupHash username="": userdb.NotFound
from this error message the problem is immediately apparent: the request's form params don't contain a valid username. probably a form validation missed this earlier.
note that the stacktrace is not needed at all. the stacktrace helps to locate where the error happened but it doesn't tell us what exactly the error was. it doesn't tell the "story" of how the code led to the error.
the stacktrace is also very verbose and visually jarring. the above one is simple but in reality the callchain is dozens of lines and contains a lot of useless fluff. each log entry is very long and that makes scanning the logs hard. the handcrafted message is quite to the point. it not only tells where the error is but also how the code ended up in that state. it takes a lot of mystery detective work out of debugging sessions.
in the above case each error message fragment has a unique prefix string. the uniqueness is ensured by the pkgname prefix, more on this later. the callchain can be easily reconstructed from this via simple grepping in the very rare cases when that's needed. and the callchain can be reconstructed even if there were some refactorings in the meantime. in the stacktrace case a refactoring would change the line numbers and then it would be very tricky to follow the code exactly.
there are a bunch of proposals and libraries for stacktraces, see https://www.dolthub.com/blog/2023-11-10-stack-traces-in-go/. don't use them. if you do annotations well then you won't need them and debugging errors will be a breeze. stacktraces might allow you to get lazy with the annotations and you might end up having a harder time debugging.
# unique error message fragments
it's super handy when you have an error message and from it you can jump straight to code.
one way to achieve this is using source code locations in the error messages. this is what happens when the error includes stacktraces. as explained before this is quite verbose and spammy. furthermore the message on its own contains very little information without the source code.
another approach: make the error messages unique. this contains more useful information to a human reading it than a source code location. but it also allows jumping to the source code location directly with a grep-like tool. and the jump works even if the code was slightly refactored in the meantime.
there are proposals to add source code location tracing to fmt.Errorf or a similar function: https://github.com/golang/go/issues/60873. this should not be needed if you can keep the error message unique.
how do you keep the message unique?
the established pattern is that fmt.Errorf() adds a message, then a colon follows, then the wrapped error. to make it easy to find where an error message fragment begins and ends, make sure the fragment doesn't contain a colon.
don't do this:
fmt.Errorf("verify password for request-id:%d: %v", id, err)
but do this instead:
fmt.Errorf("verify password for request-id=%d: %v", id, err)
this will make scanning the errors for the fragments much easier.
but "verify password" might not be unique on its own. read on.
# error message wording
how to phrase the error annotation? keep it short. avoid stop words such as failed, error, couldn't, etc. this is painful to read:
/login failed: failed verifying password for request-id 12345: failed looking up hash for "": not found
when wrapping errors, phrase the message in the imperative mood describing what the function tried to do, simply because the imperative mood is short. always start it with a verb. this style is similar to function names: they also start with a verb and use the imperative mood. but don't include the function name in the message, focus on the action the function was doing when it encountered the error. the function name often doesn't matter and would be just visual noise (especially if the function is just a helper). the caller can often provide more accurate context (sometimes it's the function name, sometimes it's something better).
leaf level errors usually describe a bad state. it's ok to use passive stance for those (i.e. when not wrapping). example: "not found" in the above snippet.
some people advise this:
    func RequestHandler(request *http.Request) (err error) {
        defer func() {
            if err != nil {
                err = fmt.Errorf("RequestHandler: %w", err)
            }
        }()
        ...
    }
no, don't do it. it will make the errors harder to use. first, it might lead to not describing the exact actions the function was doing and omitting the necessary details. second, it breaks the unique string benefits: a simple grep to find the code for an error will no longer work.
so don't name the error after the current function, name it after what the current function was doing when the error occurred. now concatenate the words, CamelCase them, prefix them with the package name and the result is a near unique string. instead of
/login failed: failed verifying password for request-id 12345: failed looking up hash for "": not found
the error is this:
/login failed: handlerspkg.VerifyPassword request-id=12345: userpkg.LookupUserHash user="": userdb.NotFound
more about this at @/errmsg.
# avoid redundancy in annotations
if you squint enough then all this annotation work is actually writing a story. each layer or function has a piece of the full story and it has to include that fragment in the story. but the story gets boring and hard to read if it contains redundant information. take this example:
    func readfile(filename string) (string, error) {
        buf, err := os.ReadFile(filename)
        if err != nil {
            return "", fmt.Errorf("read file %q: %v", filename, err)
        }
        return string(buf), nil
    }

    func f() {
        fmt.Println(readfile("foo.txt"))
    }
the error message from this would say this:
read file "foo.txt": open foo.txt: no such file or directory
this is redundant. in this particular case it is fine to simply "return err". don't take the "always annotate" rule too much to heart. annotation is often not needed when propagating errors from helper functions or small wrappers of other functions from the same package. this is how go errors avoid the java-like verbosity where each helper function is also included in the final stacktrace. if you do this then add a comment to be clear about it:
    buf, err := os.ReadFile(filename)
    if err != nil {
        // no error wrapping: os errors already contain the filename.
        return "", err
    }
unfortunately you might not know beforehand that io errors all contain the filename. so in that case it's fine to err on the side of redundancy. simply remove the redundancy once you see that some errors are hard to read because of it.
writing a good story needs good artistic skills. those skills come with experience. don't worry too much about it. just make sure the errors contain all the important bits, even if duplicated.
# control flow
there's one big problem with all this manual error annotation: it's super slow. the good news is that it only happens on the error path which should be the rarer codepath. that assumes that you don't use errors for ordinary code logic.
this example from above is actually bad:
    package sqlpkg
    ...
    func LookupUserColumn(username, column string) (string, error)
compare it to this:
    package sqlpkg
    ...
    func LookupUserColumn(username, column string) (value string, found bool, err error)
this latter form distinguishes found/not-found from an sql database error such as a bad sql query, a connection error, or database corruption. the not-found condition could be very frequent. and as such it would be frequently used to make code flow decisions. e.g. a not-found condition would lead to a user-friendly error message that the username doesn't exist while everything else would create an ops ticket to investigate.
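a sketch of how the caller's codeflow could look with the latter signature (showLoginError and verifyHash are hypothetical helpers):

    hash, found, err := sqlpkg.LookupUserColumn(username, "hash")
    if err != nil {
        // unexpected database problem: propagate as an opaque error.
        return fmt.Errorf("userpkg.LookupHash username=%q: %v", username, err)
    }
    if !found {
        // frequent, expected condition: plain control flow, no error value involved.
        return showLoginError("unknown username")
    }
    return verifyHash(hash, password)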
checking that bool could be orders of magnitude faster than trying to extract the not-found condition from an error fragment. https://www.dolthub.com/blog/2024-05-31-benchmarking-go-error-handling/ has specific numbers for this, i highly recommend checking it out.
i recommend returning a dedicated return value for describing specific conditions if those conditions are often used to alter the caller's codeflow. search for something like "exceptions code flow antipattern" or similar keywords to see more reasons why it's unhealthy to rely on a lot of logic in error handlers.
# preconditions
suppose "func f(v *int) error" doesn't accept nil pointers. one is tempted to add a "assert(v != nil)" like logic to it. don't do it. return it as an error: if v == nil { return fmt.Errorf("mypackage.CheckNil variable=v") }.
why? if the application crashes due to this then the developer gets just a stacktrace. if it returns an error instead then the rest of the callers build up a "story" of how the program ended up in the bad state. make sure to support this debugging experience.
though it makes no sense to add an error return value just to return errors for such bad invocations. it would be annoying if sqrt() returned (float64, error). only do this if the error return value is already there.
# metaphor
this type of error handling might feel like unnecessary busywork. medical surgeons also complained about how annoying it was to wash hands or disinfect the surgical tools. after all, no harm is done if they don't do it, right? it turns out the harm comes much later. once the medical profession learned this, they decided to accept the cost.
annotating errors is similar. their value is not immediately apparent. it becomes apparent when problems start arising. my hope is that the coding profession will also come to recommend always-annotated errors instead of exceptions-like error handling once it observes how good error messages make our lives much easier.
# references
this post was inspired by reading many other blog posts. i probably forgot to list all my sources but here are some of them i remember:
# takeaways
there's nothing wrong with error handling in go. all those error handling improvement proposals? not needed! it's good as it is.
the only problem with go's error handling is that it's verbose: needs 3 lines. i'll rant about this in my next post, stay tuned.
as a summary here are my key points from this post:
edits:
[non-text content snipped]
published on 2024-10-07, last modified on 2024-10-26
# errmsg: use identifiers as error strings to make error searching easier
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/errmsg.html.
i used to debug a lot with debuggers. i no longer do so. why? because i no longer need to. if something goes wrong then i have a pretty good idea about the problem thanks to the error message from my tool. i've explained in @/goerrors how to make error messages this useful.
in that post i mentioned that error messages are basically a story of how an application ended up in a bad state. but i was still unsure about the right format back then. i continued experimenting after that post and found that this format has the best tradeoffs (i already updated the post):
pkg1.CamelCaseAction var1=value1 var2=value2: pkg2.WrappedErrorAction var3=value3: pkg3.SomeBadState var4=value4 (some free form explanation in parens)
let me unpack that. go established the practice that each error has a string representation. this string representation always includes the text of the child error:
    if err != nil {
        return fmt.Errorf("<some string here>: %v", err)
    }
the question is: what should <some string here> be? the answer is "pkgname.CamelCasedErrorName context_variable1=value1 context_variable2=value2 ...".
previously i used normal sentence-like strings but i found the identifier form much easier to work with! so instead of
verify password for request-id 12345: lookup hash for "": not found
the error message would be this:
handlerspkg.VerifyPassword request-id=12345: userpkg.LookupPasswordHash username="" (user not logged in?): userdb.UserNotFound
this gives each error string a near unique token. that makes it super easy to search for, without needing quotes or knowing where an error fragment starts and ends. it takes a bit of practice to read and write them well but i am convinced the simplicity is worth it. also note how userpkg.LookupPasswordHash has a free-form hint on what the problem might be. most errors don't need such hints though.
the identifier names an action the function is trying to do when the error happened. similarly to functions it should usually start with a verb except for the leaf level errors.
i also allow nested tokens in complex functions. e.g. "handlerpkg.VerifyUser.LookupPasswordHash" would be acceptable in some but rare cases. keep it simple wherever possible though.
there are other things to keep in mind: avoid stop words, avoid redundancy, prefer opaque errors, etc. check out @/goerrors for more tips about error handling.
i started using this form even in log messages. works quite well there too. but that's a story for another day.
note from 2024-11-16: don't use plain "main." prefix in binaries, use the binary's name. the "main." would have too many clashes when searching otherwise. i am now using this style of error messages in more of my code and my life became so much easier! jumping to code right from the short error message is now super simple. i highly recommend doing this!
note from 2025-01-08: i've made a video version of this post:
[non-text content snipped]
published on 2024-10-28, last modified on 2025-01-08
comment #errmsg.1 on 2025-01-26
This is actually a good idea in my opinion too. Couldn't imagine myself so religiously doing this, but looks like a good idea - also probably should be utilized in my compiler if I want any kind of real error messages...
# cstatus: error code with a message is all i need for errors in c
as explained in @/goerrors and @/errmsg i'm quite fond of go's error handling. before go i coded in c. error handling always bothered me there. but now that i have some go experience i wondered: could i design something simple enough for c that i would be happy with? it turns out yes!
most error handling in c is just returning an error code. and my typical way to handle it is to put it into a CHECK macro. CHECK is like ASSERT but meant to be always enabled, even in release mode. here's how it used to look:
    int fd = open(...);
    if (fd == -1 && errno == ENOENT) {
        // handle this specific error.
        ...
    }
    CHECK(fd != -1);  // handle all other unexpected errors
    ...
    sz = read(fd, ...);
    CHECK(sz != -1);
    ...
the application just crashed when there was an unexpected error. as explained in @/goerrors, debugging such crashes wasn't always easy.
# requirements
so what do i need? i really like error code based error handling. that's all i need in 99% of the cases: "if error is A, do B. if error is C, do D. ...".
but i also need the context to make understanding the error easy. this can be represented via a simple string.
so that's it: i only need an error code and a string.
# error domains
there's one catch though. errors have domains. examples:
notice how all of these codes are just small numbers. so here's the idea: error codes are 64 bit unsigned numbers (8 bytes). 6 bytes represent the domain as an ascii string, 2 bytes (0..32767) represent the error code from that domain.
take ENOENT from the errno domain. ENOENT is 2, the domain's ID is just "errno". encode it as the following:
    0x006f6e7272650002
        o n r r e
the "errno" is reversed here because most machines are little endian, so the bytes are stored in reverse order. printing 5 letters from the 3rd byte of that uint64 data blob gets "errno". in @/abnames i write more about my admiration of short names.
so somewhere in a header i would have this:
    enum errnoCode {
        // ...
        errnoENOENT = 0x006f6e7272650002,
        // ...
    };
then i can do this in my error handling code:
    uint64_t errcode = somefunc();
    if (errcode == errnoENOENT) {
        // handle errnoENOENT
    } else if (errcode != 0) {
        // propagate all other errors as internal error.
        return canonicalInternal;
    }
but this on its own is not enough because it doesn't allow me to append context and nuance in the form of an error message.
# status
i really like grpc's status proto: https://google.aip.dev/193#http11json-representation. it's a bit overcomplicated for my taste so let me simplify it to my code+message needs in c:
    typedef struct {
        uint64_t code;
        int msglen;  // excluding the terminating 0 byte
        char msg[];  // has a terminating 0 byte.
    } status;
that's it. all it has is a code and a zero terminated string. it also uses the trick where the string is at the end of the struct rather than in a separate memory block. this way the string buffer doesn't have to be freed separately.
in order to use this, i also need 3 helper functions:
    // The returned status must be freed.
    // wrapped, if passed, is freed as part of the wrapping.
    status* statusNew(status* wrapped, const char* format, ...);
    status* statusNewDomain(status* wrapped, uint64_t code, const char* format, ...);
    status* statusAnnotate(status* wrapped, const char* format, ...);
there's a lot to unpack here so let me demonstrate it through an example. a hypothetical go inspired io module could have the following functions:
    typedef struct {
        void* data;
        int len;
        int cap;
    } ioBuffer;

    status* ioOpen(int* fd, const char* filename);
    status* ioClose(int* fd);
    status* ioReadFile(ioBuffer* buf, const char* filename);
notice how all functions return a status pointer. the rule is this: NULL status means no error. non-NULL status means error.
the ioOpen and ioClose functions could look like this:
    // ioOpen opens a file for read only.
    // On error returns an error from the errno domain.
    // The error message will contain the filename.
    status* ioOpen(int* fd, const char* filename) {
        *fd = open(filename, O_RDONLY);
        if (*fd == -1) {
            return statusNewDomain(NULL, errnoDomain + errno, "io.OpenForRead filename=%s", filename);
        }
        return NULL;
    }

    status* ioClose(int* fd) {
        if (*fd == -1) {
            return NULL;
        }
        if (close(*fd) != 0) {
            return statusNewDomain(NULL, errnoDomain + errno, "io.Close");
        }
        *fd = -1;
        return NULL;
    }
they return errors from the errno domain. ioClose takes an fd pointer so that it can be passed already closed fd descriptors and do nothing for them. this will come in handy if one uses the defer construct:
    // ioReadFile appends the contents of the file to buf.
    // On error returns an error from the errno domain.
    // Most errors will contain the filename.
    // Always free buf->data, even on error.
    status* ioReadFile(ioBuffer* buf, const char* filename) {
        int fd;
        status* st = ioOpen(&fd, filename);
        if (st != NULL) {
            return st;
        }
        defer { free(ioClose(&fd)); }
        constexpr int bufsize = 8192;
        char tmpbuf[bufsize];
        while (true) {
            int sz = read(fd, tmpbuf, bufsize);
            if (sz == 0) {
                break;
            }
            if (sz == -1) {
                return statusNewDomain(NULL, errnoDomain + errno, "io.ReadFromFile filename=%s", filename);
            }
            if (buf->cap - buf->len < sz) {
                int newcap = 2 * (buf->cap + 1);
                if (newcap - buf->len < sz) {
                    newcap = buf->len + sz;
                }
                buf->data = xrealloc(buf->data, newcap);
                buf->cap = newcap;
            }
            memcpy(buf->data + buf->len, tmpbuf, sz);
            buf->len += sz;
        }
        return ioClose(&fd);
    }
note that when there's no error, ioClose gets called twice. the second time it's called from defer. but that's fine because the second call is a no-op. this is a nice pattern from go to guarantee close() gets called and still properly handle its error on the error-free path.
so... umm... defer in c... yes, it's possible with a non-standard compiler extension. it's super awesome, much nicer than gotos. but i cannot go into all the tangents here so just check out the full source code at the end of the post if interested.
oh, you noticed the "constexpr" bit too? it's not a typo, i didn't accidentally write c++. this is c23. welcome to the modern age.
there's a lot more to unpack here... i won't do that for now, just marvel at the code until it makes sense.
# internal errors
in the above example the io functions returned errors from the errno domain. but most of the time the error is unexpected and doesn't fit into a clear domain. in that case return an opaque, internal error with statusNew(). opaque errors are not meant to be inspected or used in control flow decisions. they just need to be presented to a human through log messages or other forms of alerts.
let's study a hypothetical "printFile" function that prints a file:
    status* printFile(const char* fname) {
        ioBuffer buf = {};
        status* st = ioReadFile(&buf, fname);
        defer { free(buf.data); }
        if (st != NULL) {
            return statusAnnotate(st, "test.ReadFile");
        }
        size_t sz = fwrite(buf.data, 1, buf.len, stdout);
        if ((int)sz != buf.len) {
            return statusNew(NULL, "test.PartialWrite");
        }
        return NULL;
    }
statusAnnotate keeps the existing domain code of a status and just prepends a context message. so test.ReadFile in this case would remain an errno domain error. the caller could handle the errnoENOENT code (file not found) in a nice, user friendly manner.
test.PartialWrite is an opaque error because it was constructed via statusNew() which doesn't take a code. the caller shouldn't act on this error, just propagate it up. in this case it's triggered when fwrite() reports a partial write. this could happen if stdout is piped into a file and the disk is full. but there could be many other reasons. this function doesn't want to care about the various conditions so it just returns an internal error.
notice @/errmsg in action: because i use the identifier form for the various error conditions, it is much easier to reference and talk about them.
# wrapping errors
now suppose for some reason i'm writing a function that needs to return errors from the http domain. the errors can then be wrapped like this:
    status* run(int argc, char** argv) {
        if (argc != 2 || argv[1][0] == '-') {
            printf("usage: test [filename]\n");
            return statusNewDomain(NULL, httpBadRequest, "test.BadUsage argc=%d", argc);
        }
        status* st = printFile(argv[1]);
        if (st != NULL) {
            if (st->code == errnoENOENT) {
                return statusNewDomain(st, httpNotFound, "");
            }
            if (st->code == errnoEACCES) {
                return statusNewDomain(st, httpForbidden, "");
            }
            return statusNewDomain(st, httpInternalServerError, "");
        }
        return NULL;
    }

    int main(int argc, char** argv) {
        status* st = run(argc, argv);
        if (st != NULL) {
            printf("error: %s\n", st->msg);
            free(st);
            return 1;
        }
        return 0;
    }
then here's how the various error messages could look:
    $ ./test
    usage: test [filename]
    error: http.BadRequest: test.BadUsage argc=1

    $ ./test /nonexistent/
    error: http.NotFound: test.ReadFile: errno.ENOENT (no such file or directory): io.OpenForRead filename=/nonexistent/

    $ ./test /root/.bash_history
    error: http.Forbidden: test.ReadFile: errno.EACCES (permission denied): io.OpenForRead filename=/root/.bash_history

    $ ./test /root/
    error: http.InternalServerError: test.ReadFile: errno.EISDIR (is a directory): io.ReadFromFile filename=/root/
notice how simple the resource management is. main() consumes the status, it doesn't propagate it up. in order to free it, it only needs a single free() call. easy peasy!
# creating domains
ugh, this is where things get ugly. this needs lots of boilerplate but magical macros can help a lot.
before i jump into this: i'm following go's naming convention even in c. if i work on the "status" package then all symbols are prefixed with status and then CamelCase names follow.
let's start with something simple: converting an at most 6 byte long string to a uint64. this is needed for getting the domain part of the code. here's how it could look:
    #define statusMKDOMAINID(str) ( \
        (sizeof(str) > 0 ? (uint64_t)str[0] << 2 * 8 : 0) + \
        (sizeof(str) > 1 ? (uint64_t)str[1] << 3 * 8 : 0) + \
        (sizeof(str) > 2 ? (uint64_t)str[2] << 4 * 8 : 0) + \
        (sizeof(str) > 3 ? (uint64_t)str[3] << 5 * 8 : 0) + \
        (sizeof(str) > 4 ? (uint64_t)str[4] << 6 * 8 : 0) + \
        (sizeof(str) > 5 ? (uint64_t)str[5] << 7 * 8 : 0) + \
        0)
then statusMKDOMAINID("errno") would give 0x6f6e7272650000.
whenever a new domain is defined, there are several structures that need to be defined:
fortunately x macros can make this pretty simple (https://en.wikipedia.org/wiki/X_macro). here's how the http domain could be defined:
    constexpr uint64_t httpDomain = 0x707474680000;  // statusMKDOMAINID("http")

    #define httpCODES \
        X(http, OK, 200, OK) \
        X(http, BadRequest, 400, InvalidArgument) \
        X(http, Forbidden, 403, PermissionDenied) \
        X(http, NotFound, 404, NotFound) \
        X(http, InternalServerError, 500, Internal) \
        X(http, CodeCount, 600, Unknown)

    #define X statusENUMENTRY
    enum httpCode { httpCODES };
    #undef X

    extern const uint64_t httpStatusCode[statusCOUNT(http) + 1];
    extern const char* httpCodeName[statusCOUNT(http) + 1];
the two additional arrays could be defined like this:
    #define X statusSTATUSCODEENTRY
    const uint64_t httpStatusCode[statusCOUNT(http) + 1] = {httpCODES};
    #undef X

    #define X statusNAMEENTRY
    const char *httpCodeName[statusCOUNT(http) + 1] = {httpCODES};
    #undef X
the definitions of statusENUMENTRY, statusSTATUSCODEENTRY, and statusNAMEENTRY are ugly. i spare the reader from that. check the full source code at the end if curious.
# takeaways
aaanyway, there's a lot of fluff here, i know. and perhaps it looks a little bit overcomplicated. but i really enjoyed writing this c code. it's not much harder to write than go. and i can totally imagine happily using something like this if i ever program in c again.
a lot of this is a matter of tradeoff between complexity and ease of use. if the struct allowed incorporating custom objects (like how grpc does it) then it would require a much more complex api. that would be very awkward to use from c. 99% of the time i don't need that so i think the simpler interface is better and i won't hate coding and error handling because of it.
the full source code is at @/cstatus.textar. there are a lot of things i didn't mention. there are some things that could be done better. but hey, future me, i don't code much in c, so be glad i documented the main points at least, ha!
published on 2024-11-04, last modified on 2024-11-16
# flagstyle: keep flags before the positional arguments
there are many schools of thought about command line flags:
as with everything in go, i found the ordering rule for the flags weird at first. but over time i learned to appreciate it. now it's my favorite style.
over time i also developed a few more rules i personally adhere to when passing flags:
when it makes sense i sometimes add checks to my tools to enforce the second rule to eliminate potential ambiguity.
but why?
# subcommands
some tools do this:
toolname -globalflag1=value1 subcommand -subflag2=value2 arg1 arg2
in this case -subflag2 is a subcommand specific flag and must come after the subcommand. i personally don't like this. as a user i can't really remember which flag is global and which flag is subcommand specific. this also allows redefining the same flag (such as -help or -verbose) twice and then the confusion intensifies. the form should be this:
toolname -globalflag1=value1 -subflag2=value2 subcommand arg1 arg2
when the tool is initializing it should find the subcommand and register the subcommand's flags into the global flag namespace. the subcommand detection must happen before all the flags are defined because the flag definitions depend on the subcommand. but extracting the subcommand without knowing which flags are bools is only possible if all non-bool flags use the "-flagname=value" form. that's why i enforce that form in my tools.
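a sketch of that detection, assuming the -flag=value form is enforced (so no flag consumes the following argument as its value):

    // subcommand returns the first non-flag argument.
    // this only works because all flags use the -name=value form.
    func subcommand() string {
        for _, arg := range os.Args[1:] {
            if !strings.HasPrefix(arg, "-") {
                return arg
            }
        }
        return ""
    }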
as an example let's take a hypothetical "compressor" application with two subcommands, "compress" and "decompress". running it without any arguments or with just -help would print a generic help message:
    $ compressor --help
    usage of compressor:

      compressor [flags...] [subcommand]

    subcommands:
      compress: compress a file.
      decompress: decompress a file.

    use `compressor -help [subcommand]` to get more help.
running the help for a subcommand would print both the subcommand specific and global flags separately:
    $ compressor -help compress
    usage of the compress subcommand:

      compressor [flags...] compress

    compresses a file.

    subcommand flags:
      -input string
            input filename. (default "/dev/stdin")
      -level int
            compression level between 1 and 9, 9 the best but slowest. (default 5)
      -output string
            output filename. (default "/dev/stdout")

    global flags:
      -force
            auto-confirm all confirmation prompts. dangerous.
      -verbose
            print debug information.
and it would also detect incorrect usage:
    $ compressor -level 6 compress
    error: main.UnknownSubcommand subcommand=6
    exit status 1

    $ compressor compress -level=6
    error: main.BadFlagOrder arg=-level=6 (all flags must come before the subcommand and must have the -flag=value form)
    exit status 1
both global and subcommand flags must come before the subcommand:
    $ compressor -verbose -level=6 compress
    compressing /dev/stdin into /dev/stdout, level=6, verbose=true.
see @/flagstyle.go for one potential (not necessarily the nicest) way to implement this. it uses reflection to magically create flags from structs. notice how the subcommand detection happens before flag.Parse(). that's only possible if all flag values use the -name=value syntax, hence the check for it.
# command wrapping
the command wrapping usecase is my primary motivation for keeping all flags as far left as possible. take something like ssh:
    ssh [ssh_flags...] [machine-name] [command] [command-args...]

    # example:
    ssh -X myserver uname -a
now chain it through a jumphost under the two flag parsing styles:

    # go flag parsing:
    ssh -X jumphost ssh -X myserver uname -a

    # getopt flag parsing:
    ssh -X -- jumphost ssh -X myserver -- uname -a
you have to litter the commandline with --. some people like this sort of separation. but i have been using such commands extensively for years now and i prefer not having the -- markers. the former style becomes natural very fast.
it might seem like a rare usecase but at work i work with surprisingly many tools that have some sort of "pass/forward all subsequent args unchanged" need:
i rely on these tools so much that i had to learn to keep my flags on the left. then i might as well do so everywhere. i started doing that and realized my life is much easier.
# short options
some people love short options. e.g. they can write "ls -lh" instead of "ls --long --human-readable". i don't miss short options in my tools. if that's really needed then perhaps make the first arg a short option collection like the tar or ps unix commands do:
    # create tar, verbose output, output file is output.tar:
    tar cvf output.tar file1 file2 ...

    # show all processes, format nicely:
    ps auxw
ls's interface could have been similar:
    # show permissions, owner, and name:
    ls pon directory1 directory2 ...
or if sacrificing the first positional argument feels like too much then put all that into a single flag:
    $ ls --help
    ...
    flags:
      -show=flags: pick the fields to show for each entry.
    ...

    $ ls -show=pon directory1 directory2 ...
# takeaways
in summary my recommendation is to only allow -flag=value form of flags and all flags must be on the left before the positional arguments. it's awkward at first but one gets used to it quickly and it allows combining commands in a more natural manner. this in turn leads to a more pleasant command line experience with fewer gotchas. shells have already too many gotchas anyway.
published on 2024-11-11
# funcdriven: use function driven tests instead of table driven tests
i would like to give my wholehearted endorsement to this article: https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e.
it advocates replacing table driven tests like
    func TestStringsIndex(t *testing.T) {
        tests := []struct {
            name   string
            s      string
            substr string
            want   int
        }{
            {
                name:   "firstCharMatch",
                s:      "foobar",
                substr: "foo",
                want:   0,
            },
            {
                name:   "middleCharMatch",
                s:      "foobar",
                substr: "bar",
                want:   4,
            },
            {
                name:   "mismatch",
                s:      "foobar",
                substr: "baz",
                want:   -1,
            },
        }
        for _, tc := range tests {
            t.Run(tc.name, func(t *testing.T) {
                got := strings.Index(tc.s, tc.substr)
                if got != tc.want {
                    t.Fatalf("unexpected n; got %d; want %d", got, tc.want) // line 32
                }
            })
        }
    }
with function driven tests like
    func TestStringsIndex(t *testing.T) {
        f := func(s, substr string, nExpected int) {
            t.Helper()
            n := strings.Index(s, substr)
            if n != nExpected {
                t.Fatalf("unexpected n; got %d; want %d", n, nExpected)
            }
        }

        // first char match
        f("foobar", "foo", 0)

        // middle char match
        f("foobar", "bar", 4) // line 15

        // mismatch
        f("foobar", "baz", -1)
    }
in case of error this is what you see in the former case:
    > t.Fatalf("unexpected n; got %d; want %d", got, tc.want)
    funcdriven_test.go:32: unexpected n; got 3; want 4
in the latter case this is what you see in your editor:
    > // middle char match
    > f("foobar", "bar", 4)
    funcdriven_test.go:15: unexpected n; got 3; want 4
basically the error message points directly to the place where the erroneous data is. this makes working with tests super convenient.
i used table driven tests for a long time but i have now switched over to this. i can confirm from experience that i find these much easier and more natural to work with.
and when ready for an even bigger leap of faith then use https://pkg.go.dev/github.com/ypsu/efftesting to automate away the manual maintenance of the "want" argument.
i am starting to like writing tests, yay.
published on 2024-11-18
# gorun: run go code straight from web via go run
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/gorun.html.
i don't have moral problems with "curl https://example.com/sometool-install | bash". my biggest problem with it is that one should not program shell scripts in the 21st century. these shell scripts are not standardized: each script installs random crap into different places. and after i stop using the tool, the out of date trash remains around.
fortunately go has a much nicer alternative: "go run example.com/sometool@latest". or if the example.com isn't a git repo then: "go run github.com/example/sometool@latest". this will download, compile, and run the latest version of the tool. subsequent runs will use the cached binary. go needs to be installed on the user's machine but it's not a huge package, i think that's acceptable.
because it will compile everything on the user's machine, the tool needs to be compact: avoid huge code and sprawling dependencies. it's surprisingly easy to get a lot of things done in go with only the standard packages. embrace that, some small duplication here and there doesn't hurt.
one downside of using the @latest tag is that it triggers lots of redownloads and recompilations as the tool gets developed. avoid this issue by using a dev branch for development. merge the dev changes into the main branch only on a weekly basis. and if the tip is broken then the user can always specify a specific version: "go run github.com/example/sometool@v1.23.45".
a special "prev" branch could be maintained for the previous release too which lags behind the main branch with a week. then users can run a simple "go run github.com/example/sometool@prev" to run the previous stable version if the latest one is broken. it might take a few hours until go caches pick up any changes in branches though. therefore update the prev branch a day before updating the stable branch to ensure the user can go back as soon as the @latest tag gets updated. (there's a trick to invalidate @latest cache by force requesting a new version but i haven't found such a trick for @branch references.)
the user can also perma-install with "go install github.com...". this puts the binary into the default $GOBIN directory. the user can then run the tool without "go run" if that path is in $PATH. startup will be a bit faster but the tool won't auto-update.
users can set up aliases too:
alias sometool="go run example.com/sometool@latest"
i prefer shell wrappers to aliases because then i can use them from other tools such as vim:
    $ cat .bin/sometool
    #!/bin/sh
    go run github.com/example/sometool@latest "$@"
that's all i need to put into my dotfiles repo, i don't need to litter it with complex makefiles and install scripts. and it works out of the box on both archlinux and debian and is always up to date.
there are a couple tools i now use like this:
later i plan to use this method for setting up new personal machines with a single command. another usecase i have is running a static set of pre/post commit checks in my git repos without needing to write complex shell scripts.
example:
    $ go run github.com/ypsu/textar/bin/textar@latest -help
    Manipulate .textar files.

    Create a textar file:

      textar -c=archive.textar file1 file2 file3

    Extract a textar file:

      textar -x=archive.textar

    [...]
or in case you want to try it with a more official tool:
    $ go run golang.org/x/exp/cmd/txtar@latest --help
    Usage of /tmp/go-build1914875111/b001/exe/txtar:
      -extract
            if true, extract files from the archive instead of writing to it
      -list
            if true, list files from the archive instead of writing to it
      -unsafe
            allow extraction of files outside the current directory
      -x    short alias for --extract
very convenient. i wish companies would stop the curl|sh approach in favor of this. this has much better properties.
note from 2025-01-08: i've made a video version of this post:
[non-text content snipped]
published on 2024-11-25, last modified on 2025-01-08
# envchain: a conceptually simpler alternative to go contexts
ever used environment variables in unix? they are a mechanism to subtly pass configuration or other data down into child processes. each child can then crawl through all the environment variables and change its behavior based on them. it's not very clean to rely on envvars but they are quite practical.
go context is very similar but for functions. it's also just a random collection of key/values that functions can crawl through and make use of. but go's context interface is unnecessarily complex: it also includes functions for deadlines and cancellation. and the interface for storing arbitrary data in a context is hard to use.
there's another interesting way to think of contexts that i've heard: contexts are reverse errors. as of go 1.13, errors are chains of values passed upwards (https://go.dev/blog/go1.13-errors). errors can wrap or annotate other errors. this way the deepest function can communicate information to the topmost function. the upper functions can use https://pkg.go.dev/errors#As to extract specific keys from this linked chain of values.
go context is then the reverse: it's also a chain of values, but here the topmost function can communicate information down to the deepest function. in error's case functions willing to participate in such information up-passing must have an error return value. in context's case functions willing to participate in such information down-passing must have a context function parameter.
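for reference, here's the up-passing direction in today's go (a self-contained sketch; pathError is a made-up example type):

  package main

  import (
    "errors"
    "fmt"
  )

  // pathError is a made-up error type that carries extra data up the chain.
  type pathError struct {
    path string
    err  error
  }

  func (e *pathError) Error() string { return e.path + ": " + e.err.Error() }
  func (e *pathError) Unwrap() error { return e.err }

  func deep() error {
    return &pathError{path: "/etc/passwd", err: errors.New("permission denied")}
  }

  func main() {
    // each annotation extends the chain of values upwards.
    err := fmt.Errorf("loading config: %w", deep())
    var pe *pathError
    if errors.As(err, &pe) { // walk the chain for a specific type
      fmt.Println("failing path:", pe.path)
    }
  }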
# env
anyway, with those thoughts in my mind, here's a way to implement such value downpassing in a minimalistic manner:
  package envchain

  type Link struct {
    Parent *Link
    Value  any
  }
envchain.Link is a linked list of any values. the package would have a helper to extend the chain:
func (env *Link) Append(v any) *Link { return &Link{env, v} }
and similarly to errors.As, there would be an envchain.As:
  func As[T any](env *Link, target *T) bool {
    if target == nil {
      panic("envchain.EmptyAsTarget")
    }
    for env != nil {
      var ok bool
      if *target, ok = env.Value.(T); ok {
        return true
      }
      env = env.Parent
    }
    return false
  }
this works similarly to errors.As: extract any value up the chain.
and instead of something like
  package exec

  func CommandContext(ctx context.Context, name string, arg ...string) *Cmd
you would have this:
func CommandEnv(env *envchain.Link, name string, arg ...string) *Cmd
or just this if backwards compatibility isn't a problem:
func Command(env *envchain.Link, name string, arg ...string) *Cmd
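a hypothetical call site would then thread env down the same way ctx is threaded today:

  // hypothetical call site: env flows down just like ctx does today.
  func build(env *envchain.Link, dir string) error {
    cmd := exec.CommandEnv(env, "go", "build", "./...")
    cmd.Dir = dir
    return cmd.Run()
  }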
sidenote: in general avoid overloads. it doesn't make sense to have both a non-env taking and an env taking variant of a function. if it turns out a function needs an env or context then just add it. it's similar to its error counterpart: it doesn't make sense to have both a void and an error returning version of a function:
  func MyOperation()
  func MyOperationWithError() error
the latter only makes sense if MyOperation must be kept intact due to backwards compatibility. i recommend evolving the codebase and removing such redundancies to ensure the packages remain clean. major version bumps are annoying; @/featver is an alternative for people not taking go's semver rules too seriously.
# passing down values
you can pass down any value this way. e.g. to pass down and then later read out a username:
  package mypkg

  type username string

  // create a new chain from the parent chain with a username value in it:
  env = env.Append(username("myuser"))
  ...
  // to extract it:
  var u username
  if envchain.As(env, &u) {
    fmt.Printf("username is %s.\n", u)
  } else {
    fmt.Printf("username not found.\n")
  }
you can use this to pass down values without the intermediate functions needing to know about it. much easier to use than https://pkg.go.dev/context#Context.Value. a common example (and one that the context specializes in) is cancellation.
# cancellation
cancellation could be implemented as a standalone package apart from envchain. e.g. a structure like this:
  $ go doc abort.aborter
  type Aborter struct {
    // Has unexported fields.
  }

  func New(parent *envchain.Link) (*envchain.Link, *Aborter)
  func WithDeadline(parent *envchain.Link, d time.Time) (*envchain.Link, *Aborter)
  func WithTimeout(parent *envchain.Link, timeout time.Duration) (*envchain.Link, *Aborter)
  func (a *Aborter) Abort(cause string)
  func (a *Aborter) Deadline() time.Time
  func (a *Aborter) Done() <-chan struct{}
  func (a *Aborter) Err() error
it's similar to context's cancellation management. and can be used similarly:
  env, aborter := abort.New(env)
  defer aborter.Abort("function ended")
  ...
and it would be pretty easy to provide context compatibility too:
  func FromContext(ctx context.Context) *envchain.Link
  func ToContext(env *envchain.Link) context.Context
aborter would also honor the deadlines and cancellation from contexts up the chain.
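in fact FromContext could be as simple as putting the context itself into the chain, because context.Context already satisfies the interfaces the helpers below look for (a minimal sketch, not necessarily what @/envchain.textar does):

  // FromContext puts ctx at the head of a new chain. context.Context
  // already has Done() and Err() methods so the Abortable interface
  // below matches it directly while walking the chain.
  func FromContext(ctx context.Context) *envchain.Link {
    return &envchain.Link{Parent: nil, Value: ctx}
  }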
to make it easy to extract the current cancellation status from an env, abort would provide these helpers:
  func Deadline(env *envchain.Link) time.Time
  func Done(env *envchain.Link) <-chan struct{}
  func Err(env *envchain.Link) error
here's how the Done function could be implemented:
  type Abortable interface {
    Done() <-chan struct{}
    Err() error
  }

  func Done(env *envchain.Link) <-chan struct{} {
    var a Abortable
    if envchain.As(env, &a) {
      return a.Done()
    }
    return nil
  }
this can extract the Done() from both Aborters and Contexts. it also works if the chain doesn't contain any of them: it returns a nil channel which blocks forever when read from (i.e. the context is never done).
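for illustration, here's how a long-running function could poll for cancellation through the chain (a sketch; handle is a made-up helper):

  // process stops early if anything up the chain gets aborted.
  func process(env *envchain.Link, items []string) error {
    for _, item := range items {
      select {
      case <-abort.Done(env):
        return abort.Err(env)
      default:
      }
      handle(item) // made-up per-item work
    }
    return nil
  }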
a deadline function would be more complex since Aborter has a different (simpler) return value for Deadline:
  var InfiniteFuture = time.UnixMilli(1<<63 - 1)

  type Expirable interface {
    Deadline() time.Time
  }

  // For backward compatibility with context.
  type expirable2 interface {
    Deadline() (time.Time, bool)
  }

  func Deadline(env *envchain.Link) time.Time {
    for env != nil {
      if d, ok := env.Value.(Expirable); ok {
        return d.Deadline()
      }
      if d2, ok := env.Value.(expirable2); ok {
        d, ok := d2.Deadline()
        if !ok {
          return InfiniteFuture
        }
        return d
      }
      env = env.Parent
    }
    return InfiniteFuture
  }
this is an example where walking the chain explicitly is helpful. this is why envchain.Link members are exported. otherwise this function would need to walk the chain twice when trying to look for both contexts and aborters.
the full source is available at @/envchain.textar. the aborter package is a bit slower than context because it is unoptimized: it creates 2 goroutines per new aborter. this could be optimized to 0 with an "abortmanager" object that can manage many channels concurrently via https://pkg.go.dev/reflect#Select without needing a goroutine for each. the first aborter in the chain would create an abortmanager, the rest of the aborters would register into it. but all this is beside the point of envchain.
# my plans
changing context in go is futile at this point. that is set in stone. i'll stick to it in my projects.
but if i ever get a project where i would need a lot of indirect value passing then i might switch to envchains because they are easier to reason about and work with. it's compatible with context after all, see example.go in @/envchain.textar.
published on 2024-12-02
# mementomori: remember to die and look forward to it
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/mementomori.html.
death is unavoidable. i don't worry too much about it. in fact i look forward to it. i find it intriguing. i'm curious to see what being dead will be like.
# unlikely options
so what happens after we die? i know about these main theories:
void or nothingness: i won't experience anything anymore. i like this option: if there's no experiencing, there are no feelings, i can't feel anything, so i can't feel bad about this either. i wouldn't mind this but i find it very improbable. i just find it too weird that there's nothing for very long, then there's some experience, then there's nothing again. doesn't make too much sense from a math perspective.
reincarnation: you get to live another life randomly. most life is quite hard and challenging so chances are this comes with suffering. sometimes it's your current life that creates the suffering in your next life (imagine being one of those bad dictators that ruin many lives). i'm not too worried about this possibility either. i feel like a lucky person, i'm sure i would luck into a fun next life too. but again, i find this unlikely for the same reason as above: it had to start sometime and end sometime. but is there void before and after? doesn't make sense from a math perspective. though it could be cyclical: the universe dies but then is reborn in the same configuration and all experiences get replayed. but that means "something" exists from the infinite past until the infinite future and that sounds too complex to me logically.
"the egg" from andy weir: https://www.galactanet.com/oneoff/theegg_mod.html. i quite like this story. it's an interesting mix between reincarnation and simulation. i find it improbable but nevertheless i wanted to mention it because i find it very well written.
# simulation
and the final option i know of is "we live in a computer simulation like a video game". i find this the most palatable theory. it goes against occam's razor, i know. but it gives a simple satisfying answer about the world and i can sleep better without my brain trying to make sense of how the universe works. i'm not claiming that we live in a simulation, i just assume we live in a simulation because that is for me the simplest answer that keeps my brain calm. i can imagine the universe as a video game and that's it. when we die the words "game over" appear and we get escorted to the outer world for a debrief. then we'll have a chance to play again or do some other outerworldly things.
i think simulation is the most popular theory for others too. i count most religions such as christianity into this category. they also have the concept of a creator (god) and the concept of an outer world (heaven and hell). same thing but with mysticism: back in the middle ages we didn't have computers so the concept of a simulation was hard to imagine.
my crazy beliefs go even further: i also believe in intelligent design. evolution was slightly prodded in order to build humans as they are today. dinosaurs were eliminated because they were too messy and ugly or something like that. if you are going to create a world, surely you want to ensure it looks attractive to the gamers. so if we have creators, then i find it highly likely that they influence the world they created.
and being in a simulation also explains weird rules like light's speed limit. distributed computing is hard. changes propagate slowly in the system, otherwise it would be too expensive to simulate.
anyway, my point here is that if this is a game, then my own death is no big deal. it's just a game.
note that this is about making peace with my own mortality. as for others: we should try our best to prevent unwanted deaths and unwanted suffering so that the game remains fun for others. i have some additional thoughts on this in @/simulation.
# meaning
so why live at all? what's the meaning of life?
i think i've managed to find a specific meaning for myself. i have a "backlog" of tasks i want to finish (see the "stashed tasks" in @/task). most of them are just blog post ideas. i won't rest easy until i finish that backlog. if it is not empty on my deathbed then i will be annoyed.
i started writing my backlog of ideas into a file about 4 years ago. i started tracking the backlog's size last year. this is how the file's size changed over time:
[non-text content snipped]
it's still going up. but surely it will start going down sometime soon. there are only so many novel ideas and sentences that can occur to me, right?
i'm hoping that it will hit zero when i am around 50 years old (about 15 years from now). then i have a couple years to work on my steam games backlog. then i just go to sleep and never wake up. if there's an option to go for assisted suicide once my body breaks down (e.g. going blind, needing a wheelchair, alzheimer's, etc) then i'd happily take that.
phrased in other words: for me the meaning of my life comes from working towards whatever goals i set for myself. as long as i have goals, i have a will to live.
# dead body
also i would be happy to donate my dead body to a hospital so that new doctors and nurses can practice on me. then my family doesn't need to deal with my funeral. i definitely want to avoid a formal funeral; i hate such events. unfortunately donating a body to a hospital requires a lot of bureaucracy to set up and i haven't done it yet. this task is in my backlog so it's all good, i'm sure i'll get to it some day. i'd be happy to provide even my living body for science as long as they can guarantee to reliably shut down my brain before cutting me up. that's probably even harder to arrange. i don't really get what the big fuss about dead bodies is in our society, especially if the body's owner explicitly gives it away for free use for anything. i'd certainly give that permission because i wouldn't care at that point: i would be dead.
# memento
why write this post? it's my memento mori. i just wanted to remind myself that i will die and that i shouldn't worry about it too much.
my eyesight is continuously getting worse. should i worry about this? should i stop my current practice of sitting in front of my computer 16h per day? should i instead go outside now and then? nah, let me have my comforts now. i don't have any other life shrinking bad habits such as smoking, overeating, overstressing. i can probably last until 50 in this form and after that age i can stop caring about living longer. so i don't need to worry too much about deteriorating health. no need to go to regular health checkups either. i prefer a surprise early death to living too long artificially on various medicaments, worrying too much, and being a burden on the healthcare system and the people around me. humanity didn't have preventative healthcare 200 years ago so this is not much worse than what most people experienced in the past. again, this is my own preference, not saying others should have the same preference.
and when i finish my backlog, maybe i can waste another 10-20 years gaming if i'm truly lucky. for me that sounds like a comfortable way to await death. i'm sure it will all go according to this plan and i will be totally happy on my deathbed! this post will age well!
published on 2024-12-09, last modified on 2024-12-15
# imview: use the imperative mood in code reviews
in @/codereview i explored how to phrase review comments. i recommended the form of "describe problem. suggest solution with a question." about a year ago i switched to the imperative form of "request. reason. qualifiers." and i love it.
before i explain why, here's the internet explaining why imperative mood is not the best in reviews:
i agree with most of the advice from these articles. be nice, make the comments always about the code, try to avoid "you", perhaps avoid even the royal we. the only difference is that i make the first sentence of each comment an imperative request, akin to a title, like in git commits.
another quick note: by "code review" i mean the type of review that happens before the change or pull request is merged into the mainline. all review comment threads have to be closed or resolved before such a merge can happen. the author of the change cannot merge until the reviewer is happy with the change. but it's also fine for the reviewer to pre-approve a pull request and expect the author to make a few additional minor changes to address any open reviewer requests and then merge without further re-approval. this is fine in high-trust environments in exchange for team velocity.
# reviewer's perspective
one of the tips from the links above is to ask questions and that was my previous approach too. however forcing myself to make the first sentence an imperative request makes me think much harder about the comment i am about to make and thus improves its quality.
suppose there's a line in the change that i don't understand. if i'm lazy, i can just drop a "why is this needed?" comment and publish my review. job well done, right?
but forcing myself to phrase things in the form of a request makes me try harder to understand the line. and if i still don't understand it, i can make the generic comment: "add an explanatory comment about this line. it isn't clear to me from the context."
an imperative comment presents a step forward. it asks for an action to be taken. the author can still reject it but at least it doesn't feel like the code review is going in circles.
note that the imperative mood applies to the first sentence only. afterwards and in subsequent discussion i'm nice and try to follow the guidance the above websites recommend.
# author's perspective
consider the first sentence as the title of the comment's thread. imperative mood happens to be the shortest form in english. people shouldn't get offended by titles. the feeling of rudeness quickly goes away once this becomes a well established style, and it is further diminished if the comment has good description and qualifier parts.
often i don't want to hear the life story of the reviewer. i just want to hear what they want so that i can get the code merged and go home. them asking questions and being nice just comes across as passive aggressive and means more work on my side. so just start with the request; the life story can come afterwards. it's similar to the common writing guidance to start with the conclusion.
example from a different workplace: i'm pretty sure nurses won't get offended when during an operation the surgeon just barks "status?" instead of "could you please tell me the heartbeat rate? it will help me decide whether i can begin the operation", or "scalpel!" instead of "could you please hand me the scalpel? i would like to make an incision".
there are specific formal settings where it should be okay to omit pleasantries. for surgeons it is the operating table; for programmers it could be the code review thread titles (the first sentence of each code review thread). people can quickly get used to it.
# annoying questions
take a look at the examples of a "nice review" from https://archive.is/LL0h4 ("exactly what to say in code reviews" from "high growth engineer"). let me quote just the first 3 examples, the rest are in the same style:
i find such feedback annoying. it is very easy to make, takes 5 seconds to come up with, but might take the author hours to answer. these questions stop progress.
the same feedback in imperative style:
these comments are much harder for the reviewer to make. the reviewer actually has to evaluate the options and make a recommendation based on their research. then the author can either accept or reject the recommendation but doesn't need to go into full research mode over an off-hand comment.
forcing the reviewer to think hard is why the imperative style makes such comments much higher quality even if they can come off a bit rude-ish.
# good questions
questions that don't dump more work on the author are fine though. those are the ones where you try to confirm your understanding of the change.
a sole "why?" is a bad question because the author will need to type a lot and doesn't even know which part the reviewer doesn't understand. "is this needed because x?" is a simple yes/no question. here the reviewer demonstrates some understanding and the author can give a single word confirmation or give a very specific response to the misunderstanding.
these types of questions also require the reviewer to invest some time into understanding the code, so the question doesn't feel cheap.
but don't go overboard. one might be tempted to phrase a change request as a question when the reviewer is truly unsure about the request themselves: "should we add caching here?".
no. my rule says that such a thing must be added as an imperative request: "add caching here." that sounds weird to write when unsure, right? the imperative mood forces me to think hard, perhaps research whether it might make sense at all. and if still unsure then add an "i'm unsure about this though" qualifier at the end to mark the unsureness: "add caching here. i think 99% of the people just look at the frontpage. but i'm not sure about this, thoughts?".
# concerns
suppose the reviewer sees code that might be incorrect but isn't sure what the correct code should look like. there are creative ways to raise such concerns imperatively. e.g. "add a unittest for this piece of code. x returns y which sounds wrong." or "document this section of the code. it's a bit unclear how this works."
what if the reviewer is not sure what to suggest? the reviewer should always try to come up with an approach that addresses their concern even if it's not the best one. they should request that with a qualifier noting it might not be the best approach: "add caching here. 99% of people look at the frontpage, that should be a cheap request. not sure caching is the best approach though. thoughts?". the reviewer can suggest multiple options if they can come up with them: "add caching here to keep the frontpage requests cheap. or add a todo comment to handle this later one way or another. nevermind if you believe the load won't be a problem".
if the reviewer truly can't come up with a solution then they can omit the imperative request part and start with the concern but then explicitly acknowledge the missing request: "this makes the pageload time more expensive. i thought a bit about this but i don't see an easy way to address this. any ideas or a reason why we shouldn't be concerned about this?".
or if the reviewer is not sure whether the concern applies then just omit voicing it at all. the review will have less noise. don't block people unnecessarily.
even if the reviewer wants to reject the code change, they should explicitly explain their concern but still provide a way forward for the author: "could you write a short one page document about this feature first? i have several concerns that i believe would be easier to hash out in a document". i'm using the nicer "could you?" form here because this request is aimed not at the code but at the person.
# optionality
add justification for the request where it's not obvious. it makes it easier for the author to judge how important the request is and makes rejecting it easier: the author can explain why the reason or concern doesn't apply.
lean towards making requests optional, especially for stuff that's easy to change later such as implementation details. if a change makes the codebase better, even if it's not the highest quality, it should be accepted. err on the side of team velocity rather than perfectionism. there are cases where perfectionism makes sense such as in interfaces or in widely used libraries, but the majority of codebases aren't that.
learn to distinguish between one way and two way doors. jeff bezos seems to have popularized this metaphor. from a random article on the topic:
Some decisions are consequential and irreversible or nearly irreversible -- one-way doors -- and these decisions must be made methodically, carefully, slowly, with great deliberation and consultation. If you walk through and don't like what you see on the other side, you can't get back to where you were before. We can call these Type 1 decisions.
But most decisions aren't like that -- they are changeable, reversible -- they're two-way doors. If you've made a sub-optimal Type 2 decision, you don't have to live with the consequences for that long. You can reopen the door and go back through. Type 2 decisions can and should be made quickly by high judgment individuals or small groups.
As organizations get larger, there seems to be a tendency to use the heavyweight Type 1 decision-making process on most decisions, including many Type 2 decisions. The end result of this is slowness, unthoughtful risk aversion, failure to experiment sufficiently, and consequently diminished invention. We'll have to figure out how to fight that tendency.
most things in code are two way doors. even if you are absolutely sure about something, make the request optional. let people make mistakes. people learn more from mistakes.
this assumes the person will be around to fix their mistake, e.g. teammates. being more strict on external, one-off contributions makes sense though.
this applies even to stuff like style guide violations where the rules are very clear. it might be fine to let a few of them pass if the person is very opposed to some rule. maybe they are right about the particular rule so let them experiment. giving people freedom improves morale; they will be more productive over the long term in exchange.
also if the review tool allows pre-approving a change then do that even if there are many open nits. of course that doesn't apply if there are concerns about the change, another round of review is warranted, or based on prior experience the author doesn't respect the suggestions (e.g. ignores them without any response).
# qualifiers
mark the request with your expectations. this is super important for optional requests. giving a reason already implies a sort of conditionality but it's better to make it explicit.
for more complex requests i often put a "thoughts?" note at the end to signal that i'm open to discussion about the request. i often add "nevermind if that's not the case" to signal that my assumption might be wrong. i also use "fine either way though" to mark that i don't really care whether the request is applied or not. and many similar variants, all at the end.
there are other conventions too which put such qualifiers at the beginning:
i haven't used them yet but i think those are fine too.
# other title contexts
there are other places where the imperative mood is a good fit. one example is the first line of a git commit message. it can be seen as the title of the commit.
but this works great for bug and issue titles too! nowadays i would file "frobnicator: fix crash when x" instead of "frobnicator crashes when x". it was a bit awkward for some titles at first but i got better with experience and now my issues are much clearer just from the title. the "projectname:" prefix style is also super useful for grouping issues solely based on the title (also see @/titles).
i try using the imperative mood even for my blog post subtitles. it keeps things short and to the point.
# feedback in general
these are just guidelines in general. a better form might apply in some cases. e.g. simply quoting a rule in a code-style or readability review could be enough: "all top-level, exported names should have doc comments (https://go.dev/wiki/CodeReviewComments#doc-comments)". the imperative sentence could be omitted there.
some people might be overly sensitive and strongly prefer pleasantries (the opposite of https://www.lesswrong.com/tag/crockers-rules applies to them). well, just use whatever style they need to keep the review exchange efficient. this is not the hill to die on.
(sidenote: if your personality is still flexible then i highly recommend committing to https://www.lesswrong.com/tag/crockers-rules. life is so much easier when you don't stress about the exact words other people communicate with.)
these ideas go further than code review. all feedback should be imperative. "just asking questions" makes sense in exploratory or socratic discussions but not in feedback.
but in non-formal environments such as online discussions or just normal everyday conversations more tact is needed. "could you pass me the salt?" works well for simple requests. or "i think asking more questions in meetings would demonstrate more leadership" could be another way to phrase feedback in a semi-imperative way. both forms include the specific action being requested so they ensure that the requester gave it a thought and isn't "just asking questions".
(sidenote: i generally try to avoid using the word "please" in my communication. the "could you" is already kind enough, there's not much point making my sentences even longer. in fact adding it makes the sentence feel more passive aggressive to me.)
published on 2024-12-16
# capitalize: the posts will be properly capitalized from now on
I woke up one day, decided I hate capitalization (@/uppercase), and stopped bothering with capitalization in my private notes. This happened around the same time I started this blog so I wrote here in all lowercase too. Then for the last 8 years I kept going like that. This blog is sort of a cleaner extension of my private notes so I thought all-lowercase was fine here too.
But at work and in my bigger hobby projects I do use proper capitalization. Lately I'm writing more on this site (@/slackday boosted my productivity), sometimes stuff I'd like to share with others. In such posts I want to present my professional side (the capitalizing one) to avoid annoying others with my alternative writing style. I don't want the topic of capitalization to distract from the main points of such posts. The decision of when I allow myself to be all-lowercase vs capitalizing is getting harder. Sometimes the lowercase style needs more effort from me than the well trodden path of the uppercase, especially when I'm writing posts about the Go programming language where case matters a lot.
Anyway, to reduce decision fatigue I decided to start capitalizing my sentences from this year on. Or at least try it for a while and see how it feels. It's a bit weird that most old posts are in lowercase and the newer posts are in uppercase but so what. I'm not changing much else: I'll keep writing most posts mostly to myself so the writing remains a rambling one.
published on 2025-01-06
# screencasting: make narrated screencasts for demoing stuff
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/screencasting.html.
A picture is worth a thousand words. A narrated screencast is worth a thousand pictures.
As an experiment I've recorded some screencasts:
gorun:
[non-text content snipped]
efftesting:
[non-text content snipped]
effdump:
[non-text content snipped]
goerrors:
[non-text content snipped]
# Format
The video size is 720x1280 so that the videos are convenient to consume on phones. A lot of people consume media on phones so I thought my videos should be convenient for them too.
I really like how expressive this medium is. When I'm explaining something, I often need to set context, e.g. write some code that demonstrates something. In the video format I can present the code, highlight the important bits, and quickly jump over the unimportant bits. If the viewer is interested in the details then they can pause the video and study the frame. In the written format it's much harder to mark a section as non-important. I tend to err on including more context but that makes the posts too long and too rambling.
I made the videos relatively short. I had an upper limit of 3 minutes in mind because that's YouTube's upper limit for shorts. I sort of agree with that limit: I don't have a long attention span either. If I can't express an idea in 3 minutes, I probably need to split the video into separate parts anyway. This gives the listeners a chance to take a break, think through what I just said, and decide whether they want to continue. I often speed up the desktop action 10x so a lot happens quickly; I don't think it would be easy to sustain attention for longer periods with the above video style.
# Setup
Here's the process I used:
# Microphone
I got a fancy Rode PodMic USB. I can't record with it in my own room though. My room is empty so there's a lot of echo. I have to go out to the living room and record there.
But for some reason the recording's volume is too low on the machine I have in the living room. I've ensured that the relevant volume settings in ALSA are high (I use bare ALSA there) and the mic's volume is 100% in Rode's companion Android app. Probably the USB port I'm using doesn't give it enough power or something. It doesn't matter much because I found a workaround: I put my mouth right next to the microphone at a 90 degree angle and shout. Thanks to the 90 degree angle the wind from the B, P, T sounds doesn't get picked up even from this closeness.
The gorun video had my ordinary quiet voice. I had to add 24 dB of volume gain to the audio track in Shotcut. And I hate that voice. Too boring, no liveliness in it, it sounds like I'm talking from my deathbed.
I think the shouting voice I use in the other videos is much nicer so I'm sticking with that. I only add 12 dB of additional volume gain in those videos. I'm actually glad that the mic was too quiet because it forced me to find this voice.
# Hosting
Each video is about 10 MB. They are small enough to keep in git. But to keep my blog repo small I've created a separate git repo for them at https://github.com/ypsu/blogdata. Then I pointed data.iio.ie at that repo and host them from there. E.g. the goerrors video is at https://data.iio.ie/goerrors.mp4. It's pretty simple.
I thought about uploading them to youtube too. But I decided against that. I might start obsessing about views, subscribers, statistics, comments, etc. That shit is not healthy for my psyche. I think I'm better off keeping these videos just for myself on this secret blog without any statistics. I write most of the text posts for myself so the video posts shouldn't be different either.
# Takeaway
I'm quite happy with the result. I might create a few more screencasts later on. Making them forces me to think harder about what I actually want to say and so my own thoughts become clearer to me.
I highly recommend creating such screencasts!
published on 2025-01-13
# condjump: conditional jumps would make Go code less sawtooth-y
this post has non-textual or interactive elements that were snipped from this backup page. see the full content at @/condjump.html.
[non-text content snipped]
A new serious error handling proposal appeared for Go from the maintainers: https://github.com/golang/go/issues/71203. It proposes replacing
  r, err := SomeFunction()
  if err != nil {
    return fmt.Errorf("something failed: %v", err)
  }
with
  r := SomeFunction() ? {
    return fmt.Errorf("something failed: %v", err)
  }
And if you don't annotate the error (you have only "return err") then you can write this:
r := SomeFunction() ?
I don't like it because it discourages error annotation. The annotation is very important as I argued in @/errmsg and @/goerrors. Furthermore the issue title is "reduce error handling boilerplate using ?" but it doesn't reduce the clutter I find annoying. It reduces some horizontal clutter but it doesn't address the vertical clutter. In this post I'll explore this latter aspect.
(This time I decided to complain publicly on the issue tracker: https://github.com/golang/go/issues/71203#issuecomment-2593971693. I think this is the first time I posted there. It always feels weird to bark my weird opinions into online expert forums. I don't even mind if my comment gets ignored or hidden, at least I had the chance to give my perspective. I will sleep better. I'm super glad the Go team allows such openness even if this adds a huge amount of noise to their process.)
For me, code with lots of error handlers sprinkled throughout remains visually jarring and annoying to read top to bottom because of the sawtooth-y pattern. And the proposal is very specific to error handling.
Here's my semi-serious counterproposal to reduce the boilerplate experienced when reading code: introduce conditional jumps into the language. Make it possible to write "continue if cond", "break if cond", "return if cond, rv1, rv2, ...", "goto if cond". It results in denser code but I find it more straightforward to read.
Here's a hacky demo of what I mean, hit condense to switch back and forth between the two styles:
[non-text content snipped]
(The ... syntax is from https://github.com/golang/go/issues/21182.)
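Since the interactive demo is snipped from this backup, here's a tiny static sketch of the two styles (made-up code; the second form uses the hypothetical syntax from this counterproposal):

  // Today's three-line form:
  f, err := os.Open(name)
  if err != nil {
    return nil, fmt.Errorf("load config: %v", err)
  }

  // With the proposed conditional return (... stands for the zero value):
  f, err := os.Open(name)
  return if err != nil, ..., fmt.Errorf("load config: %v", err)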
It is dense but the code isn't visually jarring so it feels easier to read from top to bottom. Overall I argue the benefits outweigh the drawbacks. Furthermore:
A big con is that this changes how Go is written. But so would any other error handling proposal, and the above is more generic and would make Go more convenient to read in other, non-error contexts too.
Another big con is that this order is weird compared to, say, Python where you would write `return ..., fmt.Errorf(...) if err != nil`. But there the condition is on the right, and I think that hurts readability, so I think this proposal is easier on the eyes once one gets used to it.
Oh, and if we are making changes to the language then also make main and TestFunctions accept an optional error return value. Then error handling can be more idiomatic in those functions too, and together with this proposal those functions would become more straightforward. Both of these changes would be backwards compatible.
# The panic hack
Sidenote: with panic one can emulate a poor man's one-line error return:
  func CopyFile(src, dst string) (returnError error) {
    // Note that these helpers could be part of a helper package.
    onerr := func(err error, format string, args ...any) {
      if err != nil {
        panic(fmt.Errorf(format, args...))
      }
    }
    defer func() {
      if r := recover(); r != nil {
        if err, ok := r.(error); ok {
          returnError = err
        } else {
          panic(r)
        }
      }
    }()

    // Normal function logic follows.
    r, err := os.Open(src)
    onerr(err, "copy.Open src=%s dst=%s: %v", src, dst, err)
    defer r.Close()
    w, err := os.Create(dst)
    onerr(err, "copy.Create src=%s dst=%s: %v", src, dst, err)
    _, err = io.Copy(w, r)
    onerr(err, "copy.Copy src=%s dst=%s: %v", src, dst, err)
    err = w.Close()
    onerr(err, "copy.Close src=%s dst=%s: %v", src, dst, err)
    return nil
  }
I don't really like this because it's too much of a hack, but for a long chain of error returning functions it might be worth it.
# Go dev feedback
What do the Go maintainers think about a proposal like this?
It turns out many people have tried proposing variants of this and all of them were rejected. But most of these proposals were in the context of error handling. As explained above, Go error handling is fine as it is. There are others who also like it as it is, see https://github.com/golang/go/issues/32825.
The only problem with error handling is that it is verbose: returning an error needs 3 lines. But this is because conditional jumps need 3 lines. I'm describing a much more general issue here; error handling is just a specific instance of it. As such, the objections to the previous formatting proposals should be revisited from this perspective, making sure break/continue/goto are covered too. Though some of these proposals did include those as well.
Here are a couple of proposals I found:
Here is a sample of Go maintainer responses:
https://github.com/golang/go/issues/27135#issuecomment-422889166 (single line if): We decided long ago not to allow this kind of density. If the problem is specifically error handling, then we have other ideas for that (as noted). But the decision that there are no 1-line if statements is done.
https://github.com/golang/go/issues/27794#issuecomment-430404518 (trailing if): There's no obvious reason to only permit the trailing if on return statements; it is generally useful. But then, there is no obvious reason to permit the trailing if at all, since we already support the preceding if. In general we prefer to have fewer ways to express a certain kind of code. This proposal adds another way, and the only argument in favor is to remove a few lines. We need a better reason to add this kind of redundancy to the language. We aren't going to adopt this.
https://github.com/golang/go/issues/32860#issuecomment-509842241 (trailing if): In my opinion `return fmt.Errorf("my error: %v", err) if err != nil` is harder to read, because it buries the important part. When skimming through the code, it looks like a return statement, so you think "wait, the function returns now? What is all this other code after the return?" Then you realize that this is actually a backward if statement. [...] Making this orthogonal would mean that every statement can have an optional condition, which is a poor fit for the language as it exists today. As [...] said above, it is easy to bury important side-effects.
https://github.com/golang/go/issues/33113#issuecomment-511970012 (single line if): As Rob said in his proverbs talk, "Gofmt's style is no one's favorite, yet gofmt is everyone's favorite." It is far more important to have one format than to debate minor points. There is not a compelling reason to change this one. (The original rationale was to keep conditional code clearly separated from the surrounding code; that still seems like it applies, but the bigger point is that gofmt format is basically done.)
https://github.com/golang/go/issues/62434#issuecomment-1709172166 (on condition return err): The idea of making error checks a single line has been proposed several times, such as in #38151, and always declined. This proposal is slightly different in that it omits the curly braces, but it's not very different. This would also be the only place where a block is permitted but optional. Also the emoji voting is not in favor. Therefore, this is a likely decline. Leaving open for three weeks for final comments.
So yeah, the maintainers are not too keen on this. In fact the emoji voting in https://github.com/golang/go/issues/62434 suggests that most people don't like proposals like this, it's not just the maintainers.
Though if people keep opening issues like this then there's clearly demand for such a simple solution to the boilerplate. I wish the Go team would ask in a dev survey whether people's problem is the horizontal or the vertical boilerplate. Most error handling proposals seem to optimize the horizontal boilerplate but it might well be that people's problem is the vertical one.
# My actual opinion
To be fair, I'm a bit torn on the issue myself: I don't want Go to evolve further. And I don't like that it would allow expressing the same logic in two different ways; people would just argue about which one to use and when. While I would welcome this change, I think at this point it's too late to introduce it into the language. The main point of this post was the demo above, to serve as a comparison, not to actually propose such a change.
At most I might raise a point in the next dev survey that the Go team should perhaps consider this approach as a less visually jarring form of error handling, but I don't expect much from it. I also hope https://github.com/golang/go/issues/71203 doesn't pass. Fortunately the emoji feedback is quite negative at the time of writing.
[non-text content snipped]
published on 2025-01-15